i2som-imx-linux

Author	SHA1	Message	Date
Greg Kroah-Hartman	ea101a7026	Linux 4.9.122	2018-08-18 10:47:20 +02:00
Sean Christopherson	7e5cac813b	x86/speculation/l1tf: Exempt zeroed PTEs from inversion commit `f19f5c49bb` upstream. It turns out that we should not invert all not-present mappings, because the all zeroes case is obviously special. clear_page() does not undergo the XOR logic to invert the address bits, i.e. PTE, PMD and PUD entries that have not been individually written will have val=0 and so will trigger __pte_needs_invert(). As a result, {pte,pmd,pud}_pfn() will return the wrong PFN value, i.e. all ones (adjusted by the max PFN mask) instead of zero. A zeroed entry is ok because the page at physical address 0 is reserved early in boot specifically to mitigate L1TF, so explicitly exempt them from the inversion when reading the PFN. Manifested as an unexpected mprotect(..., PROT_NONE) failure when called on a VMA that has VM_PFNMAP and was mmap'd to as something other than PROT_NONE but never used. mprotect() sends the PROT_NONE request down prot_none_walk(), which walks the PTEs to check the PFNs. prot_none_pte_entry() gets the bogus PFN from pte_pfn() and returns -EACCES because it thinks mprotect() is trying to adjust a high MMIO address. [ This is a very modified version of Sean's original patch, but all credit goes to Sean for doing this and also pointing out that sometimes the __pte_needs_invert() function only gets the protection bits, not the full eventual pte. But zero remains special even in just protection bits, so that's ok. - Linus ] Fixes: `f22cc87f6c` ("x86/speculation/l1tf: Invert all not present mappings") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-18 10:47:20 +02:00
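Note on 7e5cac813b: the zero-entry exemption described above can be modelled in user space. The following is a minimal sketch, not the kernel code; the `model_*` names, the `MODEL_*` constants and the 52-bit physical address layout are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bit 0 = present, bits 12..51 = page frame number. */
#define MODEL_PRESENT    0x1ULL
#define MODEL_PFN_MASK   0x000ffffffffff000ULL
#define MODEL_PAGE_SHIFT 12

/* After the fix: an all-zero entry is never treated as inverted. */
static bool model_needs_invert(uint64_t val)
{
	return val && !(val & MODEL_PRESENT);
}

/* Read the PFN, undoing the inversion only when it was applied. */
static uint64_t model_pfn(uint64_t val)
{
	uint64_t mask = model_needs_invert(val) ? ~0ULL : 0;

	return ((val ^ mask) & MODEL_PFN_MASK) >> MODEL_PAGE_SHIFT;
}

/* Before the fix: every not-present entry, including zero, is inverted. */
static uint64_t model_pfn_before_fix(uint64_t val)
{
	uint64_t mask = !(val & MODEL_PRESENT) ? ~0ULL : 0;

	return ((val ^ mask) & MODEL_PFN_MASK) >> MODEL_PAGE_SHIFT;
}

/* Small self-check that exercises the three interesting cases. */
void model_pte_self_check(void)
{
	/* A not-present entry for PFN 0x1234 stores the inverted address bits. */
	uint64_t inverted = ~(0x1234ULL << MODEL_PAGE_SHIFT) & MODEL_PFN_MASK;

	assert(model_pfn(0) == 0);                       /* zeroed entry reads as PFN 0 */
	assert(model_pfn_before_fix(0) ==
	       (MODEL_PFN_MASK >> MODEL_PAGE_SHIFT));    /* old code: all ones */
	assert(model_pfn(inverted) == 0x1234);           /* inversion still undone */
}
```

In this model the pre-fix reader turns a never-written entry into the largest representable PFN, which is the "high MMIO address" that made mprotect() return -EACCES in the report above.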
Greg Kroah-Hartman	d0e3227f31	Linux 4.9.121	2018-08-17 20:59:30 +02:00
Toshi Kani	e853786d3c	x86/mm: Add TLB purge to free pmd/pte page interfaces commit `5e0fb5df2e` upstream. ioremap() calls pud_free_pmd_page() / pmd_free_pte_page() when it creates a pud / pmd map. The following preconditions are met at their entry. - All pte entries for a target pud/pmd address range have been cleared. - System-wide TLB purges have been performed for a target pud/pmd address range. The preconditions assure that there is no stale TLB entry for the range. Speculation may not cache TLB entries since it requires all levels of page entries, including ptes, to have P & A-bits set for an associated address. However, speculation may cache pud/pmd entries (paging-structure caches) when they have P-bit set. Add a system-wide TLB purge (INVLPG) to a single page after clearing pud/pmd entry's P-bit. SDM 4.10.4.1, Operations that Invalidate TLBs and Paging-Structure Caches, states that: INVLPG invalidates all paging-structure caches associated with the current PCID regardless of the linear addresses to which they correspond. Fixes: `28ee90fe60` ("x86/mm: implement free pmd/pte page interfaces") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: mhocko@suse.com Cc: akpm@linux-foundation.org Cc: hpa@zytor.com Cc: cpandya@codeaurora.org Cc: linux-mm@kvack.org Cc: linux-arm-kernel@lists.infradead.org Cc: Joerg Roedel <joro@8bytes.org> Cc: stable@vger.kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20180627141348.21777-4-toshi.kani@hpe.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
Chintan Pandya	6e6b637779	ioremap: Update pgtable free interfaces with addr commit `785a19f9d1` upstream. The following kernel panic was observed on ARM64 platform due to a stale TLB entry. 1. ioremap with 4K size, a valid pte page table is set. 2. iounmap it, its pte entry is set to 0. 3. ioremap the same address with 2M size, update its pmd entry with a new value. 4. CPU may hit an exception because the old pmd entry is still in TLB, which leads to a kernel panic. Commit `b6bdb7517c` ("mm/vmalloc: add interfaces to free unmapped page table") has addressed this panic by falling to pte mappings in the above case on ARM64. To support pmd mappings in all cases, TLB purge needs to be performed in this case on ARM64. Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page() so that TLB purge can be added later in separate patches. [toshi.kani@hpe.com: merge changes, rewrite patch description] Fixes: `28ee90fe60` ("x86/mm: implement free pmd/pte page interfaces") Signed-off-by: Chintan Pandya <cpandya@codeaurora.org> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: mhocko@suse.com Cc: akpm@linux-foundation.org Cc: hpa@zytor.com Cc: linux-mm@kvack.org Cc: linux-arm-kernel@lists.infradead.org Cc: Will Deacon <will.deacon@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: stable@vger.kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
Mark Salyzyn	7c7940ffba	Bluetooth: hidp: buffer overflow in hidp_process_report commit `7992c18810` upstream. CVE-2018-9363 The buffer length is unsigned at all layers, but gets cast to int and checked in hidp_process_report and can lead to a buffer overflow. Switch len parameter to unsigned int to resolve issue. This affects 3.18 and newer kernels. Signed-off-by: Mark Salyzyn <salyzyn@android.com> Fixes: `a4b1b5877b` ("HID: Bluetooth: hidp: make sure input buffers are big enough") Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Kees Cook <keescook@chromium.org> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Cc: linux-bluetooth@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: security@kernel.org Cc: kernel-team@android.com Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
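Note on 7c7940ffba: the signed/unsigned mismatch behind CVE-2018-9363 can be shown with a small user-space model. This is a sketch under assumptions, not the hidp code: the function names and `MODEL_BUF_SIZE` are hypothetical, and the clamp is only assumed to resemble the upper-bound check the commit message refers to.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_BUF_SIZE 16 /* hypothetical destination buffer size */

/*
 * Signed variant: a length that arrived as a large unsigned value is
 * negative here, so the upper-bound check passes and the value becomes
 * a huge size once it is converted back for the copy.
 */
static size_t model_copy_len_signed(int len)
{
	if (len > MODEL_BUF_SIZE)
		len = MODEL_BUF_SIZE;
	return (size_t)len;
}

/* Unsigned variant, as in the fix: the same check clamps every input. */
static size_t model_copy_len_unsigned(unsigned int len)
{
	if (len > MODEL_BUF_SIZE)
		len = MODEL_BUF_SIZE;
	return len;
}

/* Self-check contrasting the two variants. */
void model_len_self_check(void)
{
	assert(model_copy_len_unsigned(0xffffffffu) == MODEL_BUF_SIZE);
	assert(model_copy_len_signed(-1) == (size_t)-1); /* far beyond the buffer */
	assert(model_copy_len_signed(8) == 8);           /* ordinary lengths agree */
}
```

The two functions agree for every length that fits the buffer; they diverge only for values with the top bit set, which is why the bug is easy to miss in normal testing.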
Thierry Escande	5daf24711f	ASoC: Intel: cht_bsw_max98090_ti: Fix jack initialization commit `3bbda5a386` upstream. If the ts3a227e audio accessory detection hardware is present and its driver probed, the jack needs to be created before enabling jack detection in the ts3a227e driver. With this patch, the jack is instantiated in the max98090 headset init function if the ts3a227e is present. This fixes a null pointer dereference as the jack detection enabling function in the ts3a driver was called before the jack is created. [minor correction to keep error handling on jack creation the same as before by Pierre Bossart] Signed-off-by: Thierry Escande <thierry.escande@collabora.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-By: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
Eric Biggers	b7c2b69911	crypto: ablkcipher - fix crash flushing dcache in error path commit `318abdfbe7` upstream. Like the skcipher_walk and blkcipher_walk cases: scatterwalk_done() is only meant to be called after a nonzero number of bytes have been processed, since scatterwalk_pagedone() will flush the dcache of the previous page. But in the error case of ablkcipher_walk_done(), e.g. if the input wasn't an integer number of blocks, scatterwalk_done() was actually called after advancing 0 bytes. This caused a crash ("BUG: unable to handle kernel paging request") during '!PageSlab(page)' on architectures like arm and arm64 that define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was page-aligned as in that case walk->offset == 0. Fix it by reorganizing ablkcipher_walk_done() to skip the scatterwalk_advance() and scatterwalk_done() if an error has occurred. Reported-by: Liu Chao <liuchao741@huawei.com> Fixes: `bf06099db1` ("crypto: skcipher - Add ablkcipher_walk interfaces") Cc: <stable@vger.kernel.org> # v2.6.35+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
Eric Biggers	afd5c42dea	crypto: blkcipher - fix crash flushing dcache in error path commit `0868def3e4` upstream. Like the skcipher_walk case: scatterwalk_done() is only meant to be called after a nonzero number of bytes have been processed, since scatterwalk_pagedone() will flush the dcache of the previous page. But in the error case of blkcipher_walk_done(), e.g. if the input wasn't an integer number of blocks, scatterwalk_done() was actually called after advancing 0 bytes. This caused a crash ("BUG: unable to handle kernel paging request") during '!PageSlab(page)' on architectures like arm and arm64 that define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was page-aligned as in that case walk->offset == 0. Fix it by reorganizing blkcipher_walk_done() to skip the scatterwalk_advance() and scatterwalk_done() if an error has occurred. This bug was found by syzkaller fuzzing. Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE: #include <linux/if_alg.h> #include <sys/socket.h> #include <unistd.h> int main() { struct sockaddr_alg addr = { .salg_type = "skcipher", .salg_name = "ecb(aes-generic)", }; char buffer[4096] __attribute__((aligned(4096))) = { 0 }; int fd; fd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(fd, (void *)&addr, sizeof(addr)); setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16); fd = accept(fd, NULL, NULL); write(fd, buffer, 15); read(fd, buffer, 15); } Reported-by: Liu Chao <liuchao741@huawei.com> Fixes: `5cde0af2a9` ("[CRYPTO] cipher: Added block cipher type") Cc: <stable@vger.kernel.org> # v2.6.19+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
Eric Biggers	81ad8a8e86	crypto: vmac - separate tfm and request context commit `bb29648102` upstream. syzbot reported a crash in vmac_final() when multiple threads concurrently use the same "vmac(aes)" transform through AF_ALG. The bug is pretty fundamental: the VMAC template doesn't separate per-request state from per-tfm (per-key) state like the other hash algorithms do, but rather stores it all in the tfm context. That's wrong. Also, vmac_final() incorrectly zeroes most of the state including the derived keys and cached pseudorandom pad. Therefore, only the first VMAC invocation with a given key calculates the correct digest. Fix these bugs by splitting the per-tfm state from the per-request state and using the proper init/update/final sequencing for requests. Reproducer for the crash: #include <linux/if_alg.h> #include <sys/socket.h> #include <unistd.h> int main() { int fd; struct sockaddr_alg addr = { .salg_type = "hash", .salg_name = "vmac(aes)", }; char buf[256] = { 0 }; fd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(fd, (void *)&addr, sizeof(addr)); setsockopt(fd, SOL_ALG, ALG_SET_KEY, buf, 16); fork(); fd = accept(fd, NULL, NULL); for (;;) write(fd, buf, 256); } The immediate cause of the crash is that vmac_ctx_t.partial_size exceeds VMAC_NHBYTES, causing vmac_final() to memset() a negative length. Reported-by: syzbot+264bca3a6e8d645550d3@syzkaller.appspotmail.com Fixes: `f1939f7c56` ("crypto: vmac - New hash algorithm for intel_txt support") Cc: <stable@vger.kernel.org> # v2.6.32+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
Eric Biggers	371c35cb8c	crypto: vmac - require a block cipher with 128-bit block size commit `73bf20ef3d` upstream. The VMAC template assumes the block cipher has a 128-bit block size, but it failed to check for that. Thus it was possible to instantiate it using a 64-bit block size cipher, e.g. "vmac(cast5)", causing uninitialized memory to be used. Add the needed check when instantiating the template. Fixes: `f1939f7c56` ("crypto: vmac - New hash algorithm for intel_txt support") Cc: <stable@vger.kernel.org> # v2.6.32+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
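Note on 371c35cb8c: the missing instantiation check amounts to rejecting any underlying cipher whose block size is not 128 bits. A minimal sketch follows; `struct model_cipher`, `model_vmac_create()` and the choice of `-EINVAL` as the error code are assumptions for illustration, not the crypto API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_VMAC_BLOCK_BYTES 16 /* 128-bit block size the template assumes */

/* Hypothetical stand-in for the properties of an underlying block cipher. */
struct model_cipher {
	const char *name;
	size_t block_bytes;
};

/* Refuse to build the template on a cipher with the wrong block size. */
static int model_vmac_create(const struct model_cipher *cipher)
{
	if (cipher->block_bytes != MODEL_VMAC_BLOCK_BYTES)
		return -EINVAL;
	return 0;
}

/* Self-check: a 128-bit cipher is accepted, a 64-bit one is rejected. */
void model_vmac_self_check(void)
{
	const struct model_cipher aes = { "aes", 16 };
	const struct model_cipher cast5 = { "cast5", 8 };

	assert(model_vmac_create(&aes) == 0);
	assert(model_vmac_create(&cast5) == -EINVAL);
}
```

Checking once at instantiation keeps the per-request paths free of size checks, since every instance that exists is known to sit on a 16-byte block cipher.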
Eric Biggers	e87485a554	crypto: x86/sha256-mb - fix digest copy in sha256_mb_mgr_get_comp_job_avx2() commit `af839b4e54` upstream. There is a copy-paste error where sha256_mb_mgr_get_comp_job_avx2() copies the SHA-256 digest state from sha256_mb_mgr::args::digest to job_sha256::result_digest. Consequently, the sha256_mb algorithm sometimes calculates the wrong digest. Fix it. Reproducer using AF_ALG: #include <assert.h> #include <linux/if_alg.h> #include <stdio.h> #include <string.h> #include <sys/socket.h> #include <unistd.h> static const __u8 expected[32] = "\xad\x7f\xac\xb2\x58\x6f\xc6\xe9\x66\xc0\x04\xd7\xd1\xd1\x6b\x02" "\x4f\x58\x05\xff\x7c\xb4\x7c\x7a\x85\xda\xbd\x8b\x48\x89\x2c\xa7"; int main() { int fd; struct sockaddr_alg addr = { .salg_type = "hash", .salg_name = "sha256_mb", }; __u8 data[4096] = { 0 }; __u8 digest[32]; int ret; int i; fd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(fd, (void *)&addr, sizeof(addr)); fork(); fd = accept(fd, 0, 0); do { ret = write(fd, data, 4096); assert(ret == 4096); ret = read(fd, digest, 32); assert(ret == 32); } while (memcmp(digest, expected, 32) == 0); printf("wrong digest: "); for (i = 0; i < 32; i++) printf("%02x", digest[i]); printf("\n"); } Output was: wrong digest: ad7facb2000000000000000000000000ffffffef7cb47c7a85dabd8b48892ca7 Fixes: `172b1d6b5a` ("crypto: sha256-mb - fix ctx pointer and digest copy") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
Randy Dunlap	2d43ff0ffc	kbuild: verify that $DEPMOD is installed commit `934193a654` upstream. Verify that 'depmod' ($DEPMOD) is installed. This is a partial revert of commit `620c231c7a` ("kbuild: do not check for ancient modutils tools"). Also update Documentation/process/changes.rst to refer to kmod instead of module-init-tools. Fixes kernel bugzilla #198965: https://bugzilla.kernel.org/show_bug.cgi?id=198965 Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: Lucas De Marchi <lucas.de.marchi@gmail.com> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Jessica Yu <jeyu@kernel.org> Cc: Chih-Wei Huang <cwhuang@linux.org.tw> Cc: stable@vger.kernel.org # any kernel since 2012 Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
Liwei Song	52b9b51a5f	i2c: ismt: fix wrong device address when unmap the data buffer commit `17e83549e1` upstream. Fix the following kernel bug: kernel BUG at drivers/iommu/intel-iommu.c:3260! invalid opcode: 0000 [#5] PREEMPT SMP Hardware name: Intel Corp. Harcuvar/Server, BIOS HAVLCRB0.X64.0013.D39.1608311820 08/31/2016 task: ffff880175389950 ti: ffff880176bec000 task.ti: ffff880176bec000 RIP: 0010:[<ffffffff8150a83b>] [<ffffffff8150a83b>] intel_unmap+0x25b/0x260 RSP: 0018:ffff880176bef5e8 EFLAGS: 00010296 RAX: 0000000000000024 RBX: ffff8800773c7c88 RCX: 000000000000ce04 RDX: 0000000080000000 RSI: 0000000000000000 RDI: 0000000000000009 RBP: ffff880176bef638 R08: 0000000000000010 R09: 0000000000000004 R10: ffff880175389c78 R11: 0000000000000a4f R12: ffff8800773c7868 R13: 00000000ffffac88 R14: ffff8800773c7818 R15: 0000000000000001 FS: 00007fef21258700(0000) GS:ffff88017b5c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000066d6d8 CR3: 000000007118c000 CR4: 00000000003406e0 Stack: 00000000ffffac88 ffffffff8199867f ffff880176bef5f8 ffff880100000030 ffff880176bef668 ffff8800773c7c88 ffff880178288098 ffff8800772c0010 ffff8800773c7818 0000000000000001 ffff880176bef648 ffffffff8150a86e Call Trace: [<ffffffff8199867f>] ? printk+0x46/0x48 [<ffffffff8150a86e>] intel_unmap_page+0xe/0x10 [<ffffffffa039d99b>] ismt_access+0x27b/0x8fa [i2c_ismt] [<ffffffff81554420>] ? __pm_runtime_suspend+0xa0/0xa0 [<ffffffff815544a0>] ? pm_suspend_timer_fn+0x80/0x80 [<ffffffff81554420>] ? __pm_runtime_suspend+0xa0/0xa0 [<ffffffff815544a0>] ? pm_suspend_timer_fn+0x80/0x80 [<ffffffff8143dfd0>] ? pci_bus_read_dev_vendor_id+0xf0/0xf0 [<ffffffff8172b36c>] i2c_smbus_xfer+0xec/0x4b0 [<ffffffff810aa4d5>] ? vprintk_emit+0x345/0x530 [<ffffffffa038936b>] i2cdev_ioctl_smbus+0x12b/0x240 [i2c_dev] [<ffffffff810aa829>] ? vprintk_default+0x29/0x40 [<ffffffffa0389b33>] i2cdev_ioctl+0x63/0x1ec [i2c_dev] [<ffffffff811b04c8>] do_vfs_ioctl+0x328/0x5d0 [<ffffffff8119d8ec>] ? vfs_write+0x11c/0x190 [<ffffffff8109d449>] ? rt_up_read+0x19/0x20 [<ffffffff811b07f1>] SyS_ioctl+0x81/0xa0 [<ffffffff819a351b>] system_call_fastpath+0x16/0x6e This happens when running "i2cdetect -y 0" to detect the SMBus iSMT adapter. After an I2C block read/write finishes and the data buffer is unmapped, a wrong device address is passed to dma_unmap_single(). To fix this, give dma_unmap_single() the same "dev" parameter that dma_map_single() receives, so that the unmap can find the right device. Fixes: `13f35ac14c` ("i2c: Adding support for Intel iSMT SMBus 2.0 host controller") Signed-off-by: Liwei Song <liwei.song@windriver.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
Andrey Konovalov	76b6f30f94	kasan: don't emit builtin calls when sanitization is off commit `0e410e158e` upstream. With KASAN enabled the kernel has two different memset() functions, one with KASAN checks (memset) and one without (__memset). KASAN uses some macro tricks to use the proper version where required. For example memset() calls in mm/slub.c are without KASAN checks, since they operate on poisoned slab object metadata. The issue is that clang emits memset() calls even when there is no memset() in the source code. They get linked with improper memset() implementation and the kernel fails to boot due to a huge amount of KASAN reports during early boot stages. The solution is to add -fno-builtin flag for files with KASAN_SANITIZE := n marker. Link: http://lkml.kernel.org/r/8ffecfffe04088c52c42b92739c2bd8a0bcb3f5e.1516384594.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [ Sami: Backported to 4.9 avoiding `c5caf21ab0` and `e7c52b84fb` ] Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:29 +02:00
Toshi Kani	2130e543ff	x86/mm: Disable ioremap free page handling on x86-PAE commit `f967db0b9e` upstream. ioremap() supports pmd mappings on x86-PAE. However, kernel's pmd tables are not shared among processes on x86-PAE. Therefore, any update to sync'd pmd entries need re-syncing. Freeing a pte page also leads to a vmalloc fault and hits the BUG_ON in vmalloc_sync_one(). Disable free page handling on x86-PAE. pud_free_pmd_page() and pmd_free_pte_page() simply return 0 if a given pud/pmd entry is present. This assures that ioremap() does not update sync'd pmd entries at the cost of falling back to pte mappings. Fixes: `28ee90fe60` ("x86/mm: implement free pmd/pte page interfaces") Reported-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: mhocko@suse.com Cc: akpm@linux-foundation.org Cc: hpa@zytor.com Cc: cpandya@codeaurora.org Cc: linux-mm@kvack.org Cc: linux-arm-kernel@lists.infradead.org Cc: stable@vger.kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20180627141348.21777-2-toshi.kani@hpe.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:28 +02:00
Guenter Roeck	cc83ba490d	x86: i8259: Add missing include file commit `0a957467c5` upstream. i8259.h uses inb/outb and thus needs to include asm/io.h to avoid the following build error, as seen with x86_64:defconfig and CONFIG_SMP=n. In file included from drivers/rtc/rtc-cmos.c:45:0: arch/x86/include/asm/i8259.h: In function 'inb_pic': arch/x86/include/asm/i8259.h:32:24: error: implicit declaration of function 'inb' arch/x86/include/asm/i8259.h: In function 'outb_pic': arch/x86/include/asm/i8259.h:45:2: error: implicit declaration of function 'outb' Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com> Suggested-by: Sebastian Gottschall <s.gottschall@dd-wrt.com> Fixes: `447ae31667` ("x86: Don't include linux/irq.h from asm/hardirq.h") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:28 +02:00
Guenter Roeck	61341a364d	x86/l1tf: Fix build error seen if CONFIG_KVM_INTEL is disabled commit `1eb46908b3` upstream. allmodconfig+CONFIG_INTEL_KVM=n results in the following build error. ERROR: "l1tf_vmx_mitigation" [arch/x86/kvm/kvm.ko] undefined! Fixes: `5b76a3cff0` ("KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry") Reported-by: Meelis Roos <mroos@linux.ee> Cc: Meelis Roos <mroos@linux.ee> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-17 20:59:28 +02:00
Greg Kroah-Hartman	93e02ae420	Linux 4.9.120	2018-08-15 18:14:55 +02:00
Borislav Petkov	7f5d090ffe	x86/CPU/AMD: Have smp_num_siblings and cpu_llc_id always be present commit `f8b64d08dd` upstream. Move smp_num_siblings and cpu_llc_id to cpu/common.c so that they're always present as symbols and not only in the CONFIG_SMP case. Then, other code using them doesn't need ugly ifdeffery anymore. Get rid of some ifdeffery. Signed-off-by: Borislav Petkov <bpetkov@suse.de> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1524864877-111962-2-git-send-email-suravee.suthikulpanit@amd.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:55 +02:00
Suravee Suthikulpanit	4edf4ad2e7	x86/cpu/amd: Limit cpu_core_id fixup to families older than F17h commit `b89b41d0b8` upstream. Current cpu_core_id fixup causes downcored F17h configurations to be incorrect: NODE: 0 processor 0 core id : 0 processor 1 core id : 1 processor 2 core id : 2 processor 3 core id : 4 processor 4 core id : 5 processor 5 core id : 0 NODE: 1 processor 6 core id : 2 processor 7 core id : 3 processor 8 core id : 4 processor 9 core id : 0 processor 10 core id : 1 processor 11 core id : 2 Code that relies on the cpu_core_id, like match_smt(), for example, which builds the thread siblings masks used by the scheduler, is misled. So, limit the fixup to pre-F17h machines. The new value for cpu_core_id for F17h and later will represent the CPUID_Fn8000001E_EBX[CoreId], which is guaranteed to be unique for each core within a socket. This way we have: NODE: 0 processor 0 core id : 0 processor 1 core id : 1 processor 2 core id : 2 processor 3 core id : 4 processor 4 core id : 5 processor 5 core id : 6 NODE: 1 processor 6 core id : 8 processor 7 core id : 9 processor 8 core id : 10 processor 9 core id : 12 processor 10 core id : 13 processor 11 core id : 14 Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> [ Heavily massaged. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Link: http://lkml.kernel.org/r/20170731085159.9455-2-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:55 +02:00
Jiri Kosina	b4f17de89e	x86/speculation/l1tf: Unbreak !__HAVE_ARCH_PFN_MODIFY_ALLOWED architectures commit `6c26fcd2ab` upstream. pfn_modify_allowed() and arch_has_pfn_modify_check() are outside of the !__ASSEMBLY__ section in include/asm-generic/pgtable.h, which confuses assembler on archs that don't have __HAVE_ARCH_PFN_MODIFY_ALLOWED (e.g. ia64) and breaks build: include/asm-generic/pgtable.h: Assembler messages: include/asm-generic/pgtable.h:538: Error: Unknown opcode `static inline bool pfn_modify_allowed(unsigned long pfn,pgprot_t prot)' include/asm-generic/pgtable.h:540: Error: Unknown opcode `return true' include/asm-generic/pgtable.h:543: Error: Unknown opcode `static inline bool arch_has_pfn_modify_check(void)' include/asm-generic/pgtable.h:545: Error: Unknown opcode `return false' arch/ia64/kernel/entry.S:69: Error: `mov' does not fit into bundle Move those two static inlines into the !__ASSEMBLY__ section so that they don't confuse the asm build pass. Fixes: `42e4089c78` ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings") Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [groeck: Context changes] Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:55 +02:00
Vlastimil Babka	16848eb10e	x86/init: fix build with CONFIG_SWAP=n commit `792adb90fa` upstream. The introduction of generic_max_swapfile_size and arch-specific versions has broken linking on x86 with CONFIG_SWAP=n due to undefined reference to 'generic_max_swapfile_size'. Fix it by compiling the x86-specific max_swapfile_size() only with CONFIG_SWAP=y. Reported-by: Tomas Pruzina <pruzinat@gmail.com> Fixes: `377eeaa8e1` ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:55 +02:00
Abel Vesa	aee0861fbe	cpu/hotplug: Non-SMP machines do not make use of booted_once commit `269777aa53` upstream. Commit `0cc3cd2165` ("cpu/hotplug: Boot HT siblings at least once") breaks non-SMP builds. [ I suspect the 'bool' fields should just be made to be bitfields and be exposed regardless of configuration, but that's a separate cleanup that I'll leave to the owners of this file for later. - Linus ] Fixes: `0cc3cd2165` ("cpu/hotplug: Boot HT siblings at least once") Cc: Dave Hansen <dave.hansen@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Abel Vesa <abelvesa@linux.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:55 +02:00
Vlastimil Babka	59a6e1f276	x86/smp: fix non-SMP broken build due to redefinition of apic_id_is_primary_thread commit `d0055f351e` upstream. The function has an inline "return false;" definition with CONFIG_SMP=n but the "real" definition is also visible leading to "redefinition of ‘apic_id_is_primary_thread’" compiler error. Guard it with #ifdef CONFIG_SMP Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Fixes: `6a4d2657e0` ("x86/smp: Provide topology_is_primary_thread()") Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:55 +02:00
Josh Poimboeuf	da540c063b	x86/microcode: Allow late microcode loading with SMT disabled commit `07d981ad4c` upstream The kernel unnecessarily prevents late microcode loading when SMT is disabled. It should be safe to allow it if all the primary threads are online. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:55 +02:00
Ashok Raj	760f9488c1	x86/microcode: Do not upload microcode if CPUs are offline commit `30ec26da99` upstream. Avoid loading microcode if any of the CPUs are offline, and issue a warning. Having different microcode revisions on the system at any time is outright dangerous. [ Borislav: Massage changelog. ] Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: http://lkml.kernel.org/r/1519352533-15992-4-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/20180228102846.13447-5-bp@alien8.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:54 +02:00
David Woodhouse	d21c27185b	tools headers: Synchronise x86 cpufeatures.h for L1TF additions commit `e24f14b0ff` upstream Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:54 +02:00
Andi Kleen	e79d049743	x86/mm/kmmio: Make the tracer robust against L1TF commit `1063711b57` upstream The mmio tracer sets io mapping PTEs and PMDs to non present when enabled without inverting the address bits, which makes the PTE entry vulnerable to L1TF. Make it use the right low level macros to actually invert the address bits to protect against L1TF. In principle this could be avoided because MMIO tracing is not likely to be enabled on production machines, but the fix is straightforward and for consistency's sake it's better to get rid of the open coded PTE manipulation. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:54 +02:00
Andi Kleen	7e46437335	x86/mm/pat: Make set_memory_np() L1TF safe commit `958f79b9ee` upstream set_memory_np() is used to mark kernel mappings not present, but it has its own open coded mechanism which does not have the L1TF protection of inverting the address bits. Replace the open coded PTE manipulation with the L1TF protecting low level PTE routines. Passes the CPA self test. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ dwmw2: Pull in pud_mkhuge() from commit `a00cc7d9dd`, and pfn_pud() ] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:54 +02:00
Andi Kleen	5ebf3f8d5b	x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert commit `0768f91530` upstream Some cases in THP like: - MADV_FREE - mprotect - split mark the PMD non present temporarily to prevent races. The window for an L1TF attack in these contexts is very small, but it wants to be fixed for correctness' sake. Use the proper low level functions for pmd/pud_mknotpresent() to address this. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:54 +02:00
Andi Kleen	4656dfb6b5	x86/speculation/l1tf: Invert all not present mappings commit `f22cc87f6c` upstream For kernel mappings PAGE_PROTNONE is not necessarily set for a non present mapping, but the inversion logic explicitly checks for !PRESENT and PROT_NONE. Remove the PROT_NONE check and make the inversion unconditional for all not present mappings. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:53 +02:00
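The address-bit inversion that these L1TF commits rely on can be illustrated with a small user-space sketch. This is not the kernel's code: every `sketch_` name is hypothetical, and the layout (bit 0 as the present bit, bits 12..51 as the address field) is an assumption modelled on x86-64 conventions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for illustration: bit 0 = present, bits 12..51 = address. */
#define SKETCH_PRESENT    0x1ULL
#define SKETCH_PFN_MASK   0x000ffffffffff000ULL
#define SKETCH_PAGE_SHIFT 12

/* Unconditional rule from the commit above: invert every not-present entry. */
static bool sketch_needs_invert(uint64_t val)
{
    return !(val & SKETCH_PRESENT);
}

/* All ones when the entry must be inverted, zero otherwise. */
static uint64_t sketch_invert_mask(uint64_t val)
{
    return sketch_needs_invert(val) ? ~0ULL : 0ULL;
}

/* Build an entry: the address bits are XORed with the mask before storing. */
static uint64_t sketch_make_pte(uint64_t pfn, uint64_t flags)
{
    uint64_t addr = pfn << SKETCH_PAGE_SHIFT;

    addr ^= sketch_invert_mask(flags);
    addr &= SKETCH_PFN_MASK;
    return addr | flags;
}

/* Read the PFN back: the same XOR undoes the inversion. */
static uint64_t sketch_pte_pfn(uint64_t pte)
{
    uint64_t val = pte ^ sketch_invert_mask(pte);

    return (val & SKETCH_PFN_MASK) >> SKETCH_PAGE_SHIFT;
}
```

With this unconditional rule an all-zero entry decodes to an all-ones PFN, which is the case the later "Exempt zeroed PTEs from inversion" commit at the top of this log addresses.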
Thomas Gleixner	c504b9fce7	cpu/hotplug: Fix SMT supported evaluation commit `bc2d8d262c` upstream Josh reported that the late SMT evaluation in cpu_smt_state_init() sets cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied on the kernel command line as it cannot differentiate between SMT disabled by BIOS and SMT soft disable via 'nosmt'. That wrecks the state and makes the sysfs interface unusable. Rework this so that during bringup of the non boot CPUs the availability of SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a 'primary' thread then set the local cpu_smt_available marker and evaluate this explicitly right after the initial SMP bringup has finished. SMT evaluation on x86 is a trainwreck as the firmware has all the information _before_ booting the kernel, but there is no interface to query it. Fixes: `73d5e2b472` ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:53 +02:00
Paolo Bonzini	f56c8ee659	KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry commit `5b76a3cff0` upstream When nested virtualization is in use, VMENTER operations from the nested hypervisor into the nested guest will always be processed by the bare metal hypervisor, and KVM's "conditional cache flushes" mode in particular does a flush on nested vmentry. Therefore, include the "skip L1D flush on vmentry" bit in KVM's suggested ARCH_CAPABILITIES setting. Add the relevant Documentation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:53 +02:00
Paolo Bonzini	383f160027	x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry commit `8e0b2b9166` upstream Bit 3 of ARCH_CAPABILITIES tells a hypervisor that L1D flush on vmentry is not needed. Add a new value to enum vmx_l1d_flush_state, which is used either if there is no L1TF bug at all, or if bit 3 is set in ARCH_CAPABILITIES. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:53 +02:00
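The decision this commit describes can be sketched as a small pure function. The enum, its values and the function name are hypothetical stand-ins, not the kernel's identifiers; only the meaning of bit 3 is taken from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 3 of ARCH_CAPABILITIES, as stated in the commit message. */
#define SKETCH_ARCH_CAP_SKIP_VMENTRY_L1DFLUSH (1ULL << 3)

/* Hypothetical two-value stand-in for the flush state enum. */
enum sketch_flush_state {
    SKETCH_FLUSH_NEEDED,
    SKETCH_FLUSH_NOT_REQUIRED,
};

/*
 * No flush is required when the CPU has no L1TF bug at all, or when the
 * (possibly virtualized) ARCH_CAPABILITIES value has bit 3 set.
 */
static enum sketch_flush_state
sketch_pick_flush_state(bool cpu_has_l1tf_bug, uint64_t arch_capabilities)
{
    if (!cpu_has_l1tf_bug)
        return SKETCH_FLUSH_NOT_REQUIRED;
    if (arch_capabilities & SKETCH_ARCH_CAP_SKIP_VMENTRY_L1DFLUSH)
        return SKETCH_FLUSH_NOT_REQUIRED;
    return SKETCH_FLUSH_NEEDED;
}
```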
Paolo Bonzini	ee782edd87	x86/speculation: Simplify sysfs report of VMX L1TF vulnerability commit `ea156d192f` upstream Three changes to the content of the sysfs file: - If EPT is disabled, L1TF cannot be exploited even across threads on the same core, and SMT is irrelevant. - If mitigation is completely disabled, and SMT is enabled, print "vulnerable" instead of "vulnerable, SMT vulnerable" - Reorder the two parts so that the main vulnerability state comes first and the detail on SMT is second. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:53 +02:00
Paolo Bonzini	ce2c755166	KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR commit `cd28325249` upstream This lets userspace read the MSR_IA32_ARCH_CAPABILITIES and check that all requested features are available on the host. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:53 +02:00
Wanpeng Li	7a1eac80b5	KVM: X86: Allow userspace to define the microcode version commit `518e7b9481` upstream Linux (among the others) has checks to make sure that certain features aren't enabled on a certain family/model/stepping if the microcode version isn't greater than or equal to a known good version. By exposing the real microcode version, we're preventing buggy guests that don't check that they are running virtualized (i.e., they should trust the hypervisor) from disabling features that are effectively not buggy. Suggested-by: Filippo Sironi <sironi@amazon.de> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Liran Alon <liran.alon@oracle.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:53 +02:00
Wanpeng Li	8a01dd38e5	KVM: X86: Introduce kvm_get_msr_feature() commit `66421c1ec3` upstream Introduce kvm_get_msr_feature() to handle the msrs which are supported by different vendors and sharing the same emulation logic. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Liran Alon <liran.alon@oracle.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:53 +02:00
Tom Lendacky	1a155ef3c9	KVM: SVM: Add MSR-based feature support for serializing LFENCE commit `d1d93fa90f` upstream In order to determine if LFENCE is a serializing instruction on AMD processors, MSR 0xc0011029 (MSR_F10H_DECFG) must be read and the state of bit 1 checked. This patch will add support to allow a guest to properly make this determination. Add the MSR feature callback operation to svm.c and add MSR 0xc0011029 to the list of MSR-based features. If LFENCE is serializing, then the feature is supported, allowing the hypervisor to set the value of the MSR that guest will see. Support is also added to write (hypervisor only) and read the MSR value for the guest. A write by the guest will result in a #GP. A read by the guest will return the value as set by the host. In this way, the support to expose the feature to the guest is controlled by the hypervisor. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:52 +02:00
Tom Lendacky	62d88fc0fb	KVM: x86: Add a framework for supporting MSR-based features commit `801e459a6f` upstream Provide a new KVM capability that allows bits within MSRs to be recognized as features. Two new ioctls are added to the /dev/kvm ioctl routine to retrieve the list of these MSRs and then retrieve their values. A kvm_x86_ops callback is used to determine support for the listed MSR-based features. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [Tweaked documentation. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:52 +02:00
Thomas Gleixner	d9f378f64c	Documentation/l1tf: Remove Yonah processors from not vulnerable list commit `5833113613` upstream Dave reported, that it's not confirmed that Yonah processors are unaffected. Remove them from the list. Reported-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:52 +02:00
Nicolai Stange	77a83b3a62	x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr() commit `18b57ce2eb` upstream For VMEXITs caused by external interrupts, vmx_handle_external_intr() indirectly calls into the interrupt handlers through the host's IDT. It follows that these interrupts get accounted for in the kvm_cpu_l1tf_flush_l1d per-cpu flag. The subsequently executed vmx_l1d_flush() will thus be aware that some interrupts have happened and conduct a L1d flush anyway. Setting l1tf_flush_l1d from vmx_handle_external_intr() isn't needed anymore. Drop it. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:52 +02:00
Nicolai Stange	2c5a3a0547	x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d commit `ffcba43ff6` upstream The last missing piece to having vmx_l1d_flush() take interrupts after VMEXIT into account is to set the kvm_cpu_l1tf_flush_l1d per-cpu flag on irq entry. Issue calls to kvm_set_cpu_l1tf_flush_l1d() from entering_irq(), ipi_entering_ack_irq(), smp_reschedule_interrupt() and uv_bau_message_interrupt(). Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:52 +02:00
Nicolai Stange	8574df1a87	x86: Don't include linux/irq.h from asm/hardirq.h commit `447ae31667` upstream The next patch in this series will have to make the definition of irq_cpustat_t available to entering_irq(). Inclusion of asm/hardirq.h into asm/apic.h would cause circular header dependencies like asm/smp.h asm/apic.h asm/hardirq.h linux/irq.h linux/topology.h linux/smp.h asm/smp.h or linux/gfp.h linux/mmzone.h asm/mmzone.h asm/mmzone_64.h asm/smp.h asm/apic.h asm/hardirq.h linux/irq.h linux/irqdesc.h linux/kobject.h linux/sysfs.h linux/kernfs.h linux/idr.h linux/gfp.h and others. This causes compilation errors because of the header guards becoming effective in the second inclusion: symbols/macros that had been defined before wouldn't be available to intermediate headers in the #include chain anymore. A possible workaround would be to move the definition of irq_cpustat_t into its own header and include that from both, asm/hardirq.h and asm/apic.h. However, this wouldn't solve the real problem, namely asm/hardirq.h unnecessarily pulling in all the linux/irq.h cruft: nothing in asm/hardirq.h itself requires it. Also, note that there are some other archs, like e.g. arm64, which don't have that #include in their asm/hardirq.h. Remove the linux/irq.h #include from x86' asm/hardirq.h. Fix resulting compilation errors by adding appropriate #includes to .c files as needed. Note that some of these .c files could be cleaned up a bit wrt. to their set of #includes, but that should better be done from separate patches, if at all. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [dwmw2: More fixes for EFI and Xen in 4.9] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:52 +02:00
Nicolai Stange	e371c92e16	x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d commit `45b575c00d` upstream Part of the L1TF mitigation for vmx includes flushing the L1D cache upon VMENTRY. L1D flushes are costly and two modes of operations are provided to users: "always" and the more selective "conditional" mode. If operating in the latter, the cache would get flushed only if a host side code path considered unconfined had been traversed. "Unconfined" in this context means that it might have pulled in sensitive data like user data or kernel crypto keys. The need for L1D flushes is tracked by means of the per-vcpu flag l1tf_flush_l1d. KVM exit handlers considered unconfined set it. A vmx_l1d_flush() subsequently invoked before the next VMENTER will conduct a L1d flush based on its value and reset that flag again. Currently, interrupts delivered "normally" while in root operation between VMEXIT and VMENTER are not taken into account. Part of the reason is that these don't leave any traces and thus, the vmx code is unable to tell if any such has happened. As proposed by Paolo Bonzini, prepare for tracking all interrupts by introducing a new per-cpu flag, "kvm_cpu_l1tf_flush_l1d". It will be in strong analogy to the per-vcpu ->l1tf_flush_l1d. A later patch will make interrupt handlers set it. For the sake of cache locality, group kvm_cpu_l1tf_flush_l1d into x86' per-cpu irq_cpustat_t as suggested by Peter Zijlstra. Provide the helpers kvm_set_cpu_l1tf_flush_l1d(), kvm_clear_cpu_l1tf_flush_l1d() and kvm_get_cpu_l1tf_flush_l1d(). Make them trivial resp. non-existent for !CONFIG_KVM_INTEL as appropriate. Let vmx_l1d_flush() handle kvm_cpu_l1tf_flush_l1d in the same way as l1tf_flush_l1d. 
Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:52 +02:00
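The interplay of the per-vcpu and per-cpu flags described in this commit can be sketched in user space as follows. This is a simplified model under stated assumptions: a single global struct stands in for the per-cpu irq_cpustat_t, all `sketch_` names are hypothetical, and the function only reports whether a flush would happen instead of touching any cache.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for x86's per-cpu irq_cpustat_t; one instance models one CPU. */
struct sketch_irq_cpustat {
    uint16_t softirq_pending;        /* 16 bits leave room for the flag */
    uint8_t  kvm_cpu_l1tf_flush_l1d; /* would be set on interrupt entry */
};

struct sketch_vcpu {
    bool l1tf_flush_l1d;             /* set by unconfined exit handlers */
};

static struct sketch_irq_cpustat sketch_this_cpu;

static void sketch_set_cpu_flush(void)   { sketch_this_cpu.kvm_cpu_l1tf_flush_l1d = 1; }
static void sketch_clear_cpu_flush(void) { sketch_this_cpu.kvm_cpu_l1tf_flush_l1d = 0; }
static bool sketch_get_cpu_flush(void)   { return sketch_this_cpu.kvm_cpu_l1tf_flush_l1d != 0; }

/* Returns true when an L1D flush would be issued before the next VMENTER. */
static bool sketch_vmx_l1d_flush(struct sketch_vcpu *vcpu, bool cond_mode)
{
    if (cond_mode) {
        /* Consume both flags; either one being set forces a flush. */
        bool flush = vcpu->l1tf_flush_l1d;

        vcpu->l1tf_flush_l1d = false;
        flush |= sketch_get_cpu_flush();
        sketch_clear_cpu_flush();

        if (!flush)
            return false;
    }
    /* "always" mode, or a pending request in "conditional" mode. */
    return true;
}
```

In this model an interrupt handler would call `sketch_set_cpu_flush()` on entry, so an interrupt taken between VMEXIT and VMENTER is no longer invisible to the conditional mode.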
Nicolai Stange	5766dc1298	x86/irq: Demote irq_cpustat_t::__softirq_pending to u16 commit `9aee5f8a7e` upstream An upcoming patch will extend KVM's L1TF mitigation in conditional mode to also cover interrupts after VMEXITs. For tracking those, stores to a new per-cpu flag from interrupt handlers will become necessary. In order to improve cache locality, this new flag will be added to x86's irq_cpustat_t. Make some space available there by shrinking the ->softirq_pending bitfield from 32 to 16 bits: the number of bits actually used is only NR_SOFTIRQS, i.e. 10. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:52 +02:00
Nicolai Stange	90bc306b76	x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush() commit `5b6ccc6c3b` upstream Currently, vmx_vcpu_run() checks if l1tf_flush_l1d is set and invokes vmx_l1d_flush() if so. This test is unnecessary for the "always flush L1D" mode. Move the check to vmx_l1d_flush()'s conditional mode code path. Notes: - vmx_l1d_flush() is likely to get inlined anyway and thus, there's no extra function call. - This inverts the (static) branch prediction, but there hadn't been any explicit likely()/unlikely() annotations before and so it stays as is. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:51 +02:00
Nicolai Stange	936f566260	x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond' commit `427362a142` upstream The vmx_l1d_flush_always static key is only ever evaluated if vmx_l1d_should_flush is enabled. In that case however, there are only two L1d flushing modes possible: "always" and "conditional". The "conditional" mode's implementation tends to require more sophisticated logic than the "always" mode. Avoid inverted logic by replacing the 'vmx_l1d_flush_always' static key with a 'vmx_l1d_flush_cond' one. There is no change in functionality. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:51 +02:00
Nicolai Stange	698ac1bc17	x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush() commit `379fd0c7e6` upstream vmx_l1d_flush() gets invoked only if l1tf_flush_l1d is true. There's no point in setting l1tf_flush_l1d to true from there again. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:51 +02:00
Josh Poimboeuf	8b1969db55	cpu/hotplug: detect SMT disabled by BIOS commit `73d5e2b472` upstream If SMT is disabled in BIOS, the CPU code doesn't properly detect it. The /sys/devices/system/cpu/smt/control file shows 'on', and the 'l1tf' vulnerabilities file shows SMT as vulnerable. Fix it by forcing 'cpu_smt_control' to CPU_SMT_NOT_SUPPORTED in such a case. Unfortunately the detection can only be done after bringing all the CPUs online, so we have to overwrite any previous writes to the variable. Reported-by: Joe Mario <jmario@redhat.com> Tested-by: Jiri Kosina <jkosina@suse.cz> Fixes: `f048c399e0` ("x86/topology: Provide topology_smt_supported()") Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:51 +02:00
Tony Luck	03b3614d4d	Documentation/l1tf: Fix typos commit `1949f9f497` upstream Fix spelling and other typos Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:51 +02:00
Nicolai Stange	587d499c8b	x86/KVM/VMX: Initialize the vmx_l1d_flush_pages' content commit `288d152c23` upstream The slow path in vmx_l1d_flush() reads from vmx_l1d_flush_pages in order to evict the L1d cache. However, these pages are never cleared and, in theory, their data could be leaked. More importantly, KSM could merge a nested hypervisor's vmx_l1d_flush_pages to fewer than 1 << L1D_CACHE_ORDER host physical pages and this would break the L1d flushing algorithm: L1D on x86_64 is tagged by physical addresses. Fix this by initializing the individual vmx_l1d_flush_pages with a different pattern each. Rename the "empty_zp" asm constraint identifier in vmx_l1d_flush() to "flush_pages" to reflect this change. Fixes: `a47dd5f067` ("x86/KVM/VMX: Add L1D flush algorithm") Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:51 +02:00
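A minimal sketch of the fix described above: fill each flush page with its own byte pattern so that no two pages have identical content and a page-merging mechanism such as KSM cannot collapse them. The page size, the cache order of 4 (16 pages) and the `sketch_` names are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE       4096u
#define SKETCH_L1D_CACHE_ORDER 4u   /* assumed: 1 << 4 = 16 pages */

/*
 * Give every page a distinct, non-zero fill byte (1, 2, 3, ...).
 * 'pages' must point to (1 << SKETCH_L1D_CACHE_ORDER) contiguous pages.
 */
static void sketch_init_flush_pages(unsigned char *pages)
{
    for (unsigned int i = 0; i < (1u << SKETCH_L1D_CACHE_ORDER); ++i)
        memset(pages + (size_t)i * SKETCH_PAGE_SIZE, (int)(i + 1),
               SKETCH_PAGE_SIZE);
}
```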
Thomas Gleixner	93aed2469d	Documentation: Add section about CPU vulnerabilities commit `3ec8ce5d86` upstream Add documentation for the L1TF vulnerability and the mitigation mechanisms: - Explain the problem and risks - Document the mitigation mechanisms - Document the command line controls - Document the sysfs files Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20180713142323.287429944@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:51 +02:00
Jiri Kosina	2decbf5264	x86/bugs, kvm: Introduce boot-time control of L1TF mitigations commit `d90a7a0ec8` upstream Introduce the 'l1tf=' kernel command line option to allow for boot-time switching of mitigation that is used on processors affected by L1TF. The possible values are: full Provides all available mitigations for the L1TF vulnerability. Disables SMT and enables all mitigations in the hypervisors. SMT control via /sys/devices/system/cpu/smt/control is still possible after boot. Hypervisors will issue a warning when the first VM is started in a potentially insecure configuration, i.e. SMT enabled or L1D flush disabled. full,force Same as 'full', but disables SMT control. Implies the 'nosmt=force' command line option. sysfs control of SMT and the hypervisor flush control is disabled. flush Leaves SMT enabled and enables the conditional hypervisor mitigation. Hypervisors will issue a warning when the first VM is started in a potentially insecure configuration, i.e. SMT enabled or L1D flush disabled. flush,nosmt Disables SMT and enables the conditional hypervisor mitigation. SMT control via /sys/devices/system/cpu/smt/control is still possible after boot. If SMT is reenabled or flushing disabled at runtime hypervisors will issue a warning. flush,nowarn Same as 'flush', but hypervisors will not warn when a VM is started in a potentially insecure configuration. off Disables hypervisor mitigations and doesn't emit any warnings. Default is 'flush'. Let KVM adhere to these semantics, which means: - 'l1tf=full,force' : Perform L1D flushes. No runtime control possible. - 'l1tf=full' - 'l1tf=flush' - 'l1tf=flush,nosmt' : Perform L1D flushes and warn on VM start if SMT has been runtime enabled or L1D flushing has been run-time enabled - 'l1tf=flush,nowarn' : Perform L1D flushes and no warnings are emitted. - 'l1tf=off' : L1D flushes are not performed and no warnings are emitted. 
KVM can always override the L1D flushing behavior using its 'vmentry_l1d_flush' module parameter except when l1tf=full,force is set. This makes KVM's private 'nosmt' option redundant, and as it is a bit non-systematic anyway (this is something to control globally, not on hypervisor level), remove that option. Add the missing Documentation entry for the l1tf vulnerability sysfs file while at it. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142323.202758176@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:51 +02:00
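The option values listed in this commit message map naturally onto a small lookup table. The following is a user-space sketch only: the enum, table and function names are hypothetical, and falling back to 'flush' for an unknown or missing value is an assumption based on the stated default.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum sketch_l1tf_mode {
    SKETCH_L1TF_OFF,
    SKETCH_L1TF_FLUSH_NOWARN,
    SKETCH_L1TF_FLUSH,
    SKETCH_L1TF_FLUSH_NOSMT,
    SKETCH_L1TF_FULL,
    SKETCH_L1TF_FULL_FORCE,
};

struct sketch_l1tf_option {
    const char *name;
    enum sketch_l1tf_mode mode;
};

/* The six values accepted after 'l1tf=' according to the commit message. */
static const struct sketch_l1tf_option sketch_l1tf_options[] = {
    { "off",          SKETCH_L1TF_OFF },
    { "flush,nowarn", SKETCH_L1TF_FLUSH_NOWARN },
    { "flush",        SKETCH_L1TF_FLUSH },
    { "flush,nosmt",  SKETCH_L1TF_FLUSH_NOSMT },
    { "full",         SKETCH_L1TF_FULL },
    { "full,force",   SKETCH_L1TF_FULL_FORCE },
};

/* Exact-match lookup; anything unrecognized keeps the default 'flush'. */
static enum sketch_l1tf_mode sketch_parse_l1tf(const char *arg)
{
    size_t n = sizeof(sketch_l1tf_options) / sizeof(sketch_l1tf_options[0]);

    if (arg) {
        for (size_t i = 0; i < n; ++i)
            if (strcmp(arg, sketch_l1tf_options[i].name) == 0)
                return sketch_l1tf_options[i].mode;
    }
    return SKETCH_L1TF_FLUSH;
}

/* Modes that the commit message describes as disabling SMT at boot. */
static bool sketch_l1tf_disables_smt(enum sketch_l1tf_mode mode)
{
    return mode == SKETCH_L1TF_FLUSH_NOSMT ||
           mode == SKETCH_L1TF_FULL ||
           mode == SKETCH_L1TF_FULL_FORCE;
}
```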
Thomas Gleixner	929d3b2e9b	cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early commit `fee0aede6f` upstream The CPU_SMT_NOT_SUPPORTED state is set (if the processor does not support SMT) when the sysfs SMT control file is initialized. That was fine so far as this was only required to make the output of the control file correct and to prevent writes in that case. With the upcoming l1tf command line parameter, this needs to be set up before the L1TF mitigation selection and command line parsing happens. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142323.121795971@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:50 +02:00
Jiri Kosina	a69c5e0706	cpu/hotplug: Expose SMT control init function commit `8e1b706b6e` upstream The L1TF mitigation will gain a command line parameter which allows setting a combination of hypervisor mitigation and SMT control. Expose cpu_smt_disable() so the command line parser can tweak SMT settings. [ tglx: Split out of larger patch and made it preserve an already existing force off state ] Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142323.039715135@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:50 +02:00
Thomas Gleixner	4797c2f379	x86/kvm: Allow runtime control of L1D flush commit `895ae47f99` upstream All mitigation modes can be switched at run time with a static key now: - Use sysfs_streq() instead of strcmp() to handle the trailing new line from sysfs writes correctly. - Make the static key management handle multiple invocations properly. - Set the module parameter file to RW Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142322.954525119@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:50 +02:00
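Why a newline-tolerant comparison matters for sysfs writes can be shown with a user-space re-implementation. The function below is a sketch written for this log, not the kernel's sysfs_streq() itself; it treats two strings as equal when they match exactly or differ only by one trailing newline, which is what a shell `echo` into a sysfs file appends.

```c
#include <stdbool.h>

/*
 * Compare two strings, ignoring a single trailing '\n' on either side.
 * A plain strcmp() would report "cond\n" and "cond" as different.
 */
static bool sketch_sysfs_streq(const char *s1, const char *s2)
{
    /* Skip the common prefix. */
    while (*s1 && *s1 == *s2) {
        s1++;
        s2++;
    }

    if (*s1 == *s2)
        return true;                       /* both ended together */
    if (!*s1 && *s2 == '\n' && !s2[1])
        return true;                       /* s2 has one extra newline */
    if (*s1 == '\n' && !s1[1] && !*s2)
        return true;                       /* s1 has one extra newline */
    return false;
}
```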
Thomas Gleixner	6ccf633238	x86/kvm: Serialize L1D flush parameter setter commit `dd4bfa739a` upstream Writes to the parameter files are not serialized at the sysfs core level, so local serialization is required. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142322.873642605@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:50 +02:00
Thomas Gleixner	dff0982c57	x86/kvm: Add static key for flush always commit `4c6523ec59` upstream Avoid the conditional in the L1D flush control path. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142322.790914912@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:49 +02:00
Thomas Gleixner	641a211704	x86/kvm: Move l1tf setup function commit `7db92e165a` upstream In preparation of allowing run time control for L1D flushing, move the setup code to the module parameter handler. In case of pre module init parsing, just store the value and let vmx_init() do the actual setup after running kvm_init() so that enable_ept is having the correct state. During run-time invoke it directly from the parameter setter to prepare for run-time control. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142322.694063239@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:49 +02:00
Thomas Gleixner	4186ae8155	x86/l1tf: Handle EPT disabled state proper commit `a7b9020b06` upstream If Extended Page Tables (EPT) are disabled or not supported, no L1D flushing is required. The setup function can just avoid setting up the L1D flush for the EPT=n case. Invoke it after the hardware setup has been done and enable_ept has the correct state and expose the EPT disabled state in the mitigation status as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142322.612160168@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:49 +02:00
Thomas Gleixner	31282cf43b	x86/kvm: Drop L1TF MSR list approach commit `2f055947ae` upstream The VMX module parameter to control the L1D flush should become writeable. The MSR list is set up at VM init per guest VCPU, but the run time switching is based on a static key which is global. Toggling the MSR list at run time might be feasible, but for now drop this optimization and use the regular MSR write to make run-time switching possible. The default mitigation is the conditional flush anyway, so for extra paranoid setups this will add some small overhead, but the extra code executed is in the noise compared to the flush itself. Aside of that the EPT disabled case is not handled correctly at the moment and the MSR list magic is in the way for fixing that as well. If it's really providing a significant advantage, then this needs to be revisited after the code is correct and the control is writable. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142322.516940445@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:49 +02:00
Thomas Gleixner	80e55b5ea4	x86/litf: Introduce vmx status variable commit `72c6d2db64` upstream Store the effective mitigation of VMX in a status variable and use it to report the VMX state in the l1tf sysfs file. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142322.433098358@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:49 +02:00
Thomas Gleixner	e7cda2ffe1	cpu/hotplug: Online siblings when SMT control is turned on commit `215af5499d` upstream Writing 'off' to /sys/devices/system/cpu/smt/control offlines all SMT siblings. Writing 'on' merely enables the ability to online them, but does not online them automatically. Make 'on' more useful by onlining all offline siblings. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:49 +02:00
Konrad Rzeszutek Wilk	a8c14676a9	x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required commit `390d975e0c` upstream If the L1D flush module parameter is set to 'always' and the IA32_FLUSH_CMD MSR is available, optimize the VMENTER code with the MSR save list. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:49 +02:00
Konrad Rzeszutek Wilk	c45ff817e9	x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs commit `989e3992d2` upstream The IA32_FLUSH_CMD MSR needs only to be written on VMENTER. Extend add_atomic_switch_msr() with an entry_only parameter to allow storing the MSR only in the guest (ENTRY) MSR array. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:48 +02:00
Konrad Rzeszutek Wilk	5d3eaa2d39	x86/KVM/VMX: Separate the VMX AUTOLOAD guest/host number accounting commit `3190709335` upstream This allows to load a different number of MSRs depending on the context: VMEXIT or VMENTER. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:48 +02:00
Konrad Rzeszutek Wilk	1555f9e8ed	x86/KVM/VMX: Add find_msr() helper function commit `ca83b4a7f2` upstream .. to help find the MSR on either the guest or host MSR list. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:48 +02:00
Konrad Rzeszutek Wilk	57e3ada3e5	x86/KVM/VMX: Split the VMX MSR LOAD structures to have an host/guest numbers commit `33966dd6b2` upstream There is no semantic change but this change allows an unbalanced amount of MSRs to be loaded on VMEXIT and VMENTER, i.e. the number of MSRs to save or restore on VMEXIT or VMENTER may be different. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:48 +02:00
Jim Mattson	69c2525237	kvm: nVMX: Update MSR load counts on a VMCS switch Commit `83bafef1a1` upstream When L0 establishes (or removes) an MSR entry in the VM-entry or VM-exit MSR load lists, the change should affect the dormant VMCS as well as the current VMCS. Moreover, the vmcs02 MSR-load addresses should be initialized. [ dwmw2: Pulled in to 4.9 backports for L1TF ] Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:48 +02:00
Paolo Bonzini	b3dc63c4f4	x86/KVM/VMX: Add L1D flush logic commit `c595ceee45` upstream Add the logic for flushing L1D on VMENTER. The flush depends on the static key being enabled and the new l1tf_flush_l1d flag being set. The flag is set: - Always, if the flush module parameter is 'always' - Conditionally at: - Entry to vcpu_run(), i.e. after executing user space - From the sched_in notifier, i.e. when switching to a vCPU thread. - From vmexit handlers which are considered unsafe, i.e. where sensitive data can be brought into L1D: - The emulator, which could be a good target for other speculative execution-based threats, - The MMU, which can bring host page tables in the L1 cache. - External interrupts - Nested operations that require the MMU (see above). That is vmptrld, vmptrst, vmclear, vmwrite, vmread. - When handling invept, invvpid [ tglx: Split out from combo patch and reduced to a single flag ] [ dwmw2: Backported to 4.9, set l1tf_flush_l1d in svm/vmx code ] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:48 +02:00
Paolo Bonzini	acca8a70a5	x86/KVM/VMX: Add L1D MSR based flush commit `3fa045be4c` upstream 336996-Speculative-Execution-Side-Channel-Mitigations.pdf defines a new MSR (IA32_FLUSH_CMD aka 0x10B) which has similar write-only semantics to other MSRs defined in the document. The semantics of this MSR is to allow "finer granularity invalidation of caching structures than existing mechanisms like WBINVD. It will writeback and invalidate the L1 data cache, including all cachelines brought in by preceding instructions, without invalidating all caches (eg. L2 or LLC). Some processors may also invalidate the first level instruction cache on a L1D_FLUSH command. The L1 data and instruction caches may be shared across the logical processors of a core." Use it instead of the loop based L1 flush algorithm. A copy of this document is available at https://bugzilla.kernel.org/show_bug.cgi?id=199511 [ tglx: Avoid allocating pages when the MSR is available ] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:48 +02:00
Paolo Bonzini	b3d648aefa	x86/KVM/VMX: Add L1D flush algorithm commit `a47dd5f067` upstream To mitigate the L1 Terminal Fault vulnerability it's required to flush L1D on VMENTER to prevent rogue guests from snooping host memory. CPUs will have a new control MSR via a microcode update to flush L1D with a single MSR write, but in the absence of microcode a fallback to a software based flush algorithm is required. Add a software flush loop which is based on code from Intel. [ tglx: Split out from combo patch ] [ bpetkov: Polish the asm code ] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:48 +02:00
Konrad Rzeszutek Wilk	af6ce92977	x86/KVM/VMX: Add module argument for L1TF mitigation commit `a399477e52` upstream Add a mitigation mode parameter "vmentry_l1d_flush" for CVE-2018-3620, aka L1 terminal fault. The valid arguments are: - "always" L1D cache flush on every VMENTER. - "cond" Conditional L1D cache flush, explained below - "never" Disable the L1D cache flush mitigation "cond" is trying to avoid L1D cache flushes on VMENTER if the code executed between VMEXIT and VMENTER is considered safe, i.e. is not bringing any interesting information into L1D which might be exploited. [ tglx: Split out from a larger patch ] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:47 +02:00
Konrad Rzeszutek Wilk	a0695af340	x86/KVM: Warn user if KVM is loaded SMT and L1TF CPU bug being present commit `26acfb666a` upstream If the L1TF CPU bug is present we allow the KVM module to be loaded as the majority of users that use Linux and KVM have trusted guests and do not want a broken setup. Cloud vendors are the ones that are uncomfortable with CVE-2018-3620 and as such they are the ones that should set nosmt to one. Setting 'nosmt' means that the system administrator also needs to disable SMT (Hyper-threading) in the BIOS, or via the 'nosmt' command line parameter, or via the /sys/devices/system/cpu/smt/control. See commit `05736e4ac1` ("cpu/hotplug: Provide knobs to control SMT"). Other mitigations are to use task affinity, cpu sets, interrupt binding, etc - anything to make sure that _only_ the same guests vCPUs are running on sibling threads. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:47 +02:00
Thomas Gleixner	8438e49bca	cpu/hotplug: Boot HT siblings at least once commit `0cc3cd2165` upstream Due to the way Machine Check Exceptions work on X86 hyperthreads it's required to boot up _all_ logical cores at least once in order to set the CR4.MCE bit. So instead of ignoring the sibling threads right away, let them boot up once so they can configure themselves. After they came out of the initial boot stage check whether it's a "secondary" sibling and cancel the operation which puts the CPU back into offline state. [dwmw2: Backport to 4.9] Reported-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:47 +02:00
Thomas Gleixner	fe2a955476	Revert "x86/apic: Ignore secondary threads if nosmt=force" commit `506a66f374` upstream Dave Hansen reported, that it's outright dangerous to keep SMT siblings disabled completely so they are stuck in the BIOS and wait for SIPI. The reason is that Machine Check Exceptions are broadcasted to siblings and the soft disabled sibling has CR4.MCE = 0. If a MCE is delivered to a logical core with CR4.MCE = 0, it asserts IERR#, which shuts down or reboots the machine. The MCE chapter in the SDM contains the following blurb: Because the logical processors within a physical package are tightly coupled with respect to shared hardware resources, both logical processors are notified of machine check errors that occur within a given physical processor. If machine-check exceptions are enabled when a fatal error is reported, all the logical processors within a physical package are dispatched to the machine-check exception handler. If machine-check exceptions are disabled, the logical processors enter the shutdown state and assert the IERR# signal. When enabling machine-check exceptions, the MCE flag in control register CR4 should be set for each logical processor. Reverting the commit which ignores siblings at enumeration time solves only half of the problem. The core cpuhotplug logic needs to be adjusted as well. This thoughtful engineered mechanism also turns the boot process on all Intel HT enabled systems into a MCE lottery. MCE is enabled on the boot CPU before the secondary CPUs are brought up. Depending on the number of physical cores the window in which this situation can happen is smaller or larger. On a HSW-EX it's about 750ms: MCE is enabled on the boot CPU: [ 0.244017] mce: CPU supports 22 MCE banks The corresponding sibling #72 boots: [ 1.008005] .... node #0, CPUs: #72 That means if an MCE hits on physical core 0 (logical CPUs 0 and 72) between these two points the machine is going to shutdown. At least it's a known safe state. 
It's obvious that the early boot can be hit by an MCE as well and then runs into the same situation because MCEs are not yet enabled on the boot CPU. But after enabling them on the boot CPU, it does not make any sense to prevent the kernel from recovering. Adjust the nosmt kernel parameter documentation as well. Reverts: `2207def700` ("x86/apic: Ignore secondary threads if nosmt=force") Reported-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:47 +02:00
Michal Hocko	3f0eb66f65	x86/speculation/l1tf: Fix up pte->pfn conversion for PAE commit `e14d7dfb41` upstream Jan has noticed that pte_pfn and co. resp. pfn_pte are incorrect for CONFIG_PAE because phys_addr_t is wider than unsigned long and so the pte_val resp. shift left would get truncated. Fix this up by using proper types. [dwmw2: Backport to 4.9] Fixes: `6b28baca9b` ("x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation") Reported-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:47 +02:00
Vlastimil Babka	53527af79d	x86/speculation/l1tf: Protect PAE swap entries against L1TF commit `0d0f624905` upstream The PAE 3-level paging code currently doesn't mitigate L1TF by flipping the offset bits, and uses the high PTE word, thus bits 32-36 for type, 37-63 for offset. The lower word is zeroed, thus systems with less than 4GB memory are safe. With 4GB to 128GB the swap type selects the memory locations vulnerable to L1TF; with even more memory, also the swap offset influences the address. This might be a problem with 32bit PAE guests running on large 64bit hosts. By continuing to keep the whole swap entry in either high or low 32bit word of PTE we would limit the swap size too much. Thus this patch uses the whole PAE PTE with the same layout as the 64bit version does. The macros just become a bit tricky since they assume the arch-dependent swp_entry_t to be 32bit. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:47 +02:00
Borislav Petkov	250f0aebe2	x86/CPU/AMD: Move TOPOEXT reenablement before reading smp_num_siblings commit `7ce2f0393e` upstream The TOPOEXT reenablement is a workaround for broken BIOSen which didn't enable the CPUID bit. amd_get_topology_early(), however, relies on that bit being set so that it can read out the CPUID leaf and set smp_num_siblings properly. Move the reenablement up to early_init_amd(). While at it, simplify amd_get_topology_early(). [dwmw2: Backport to 4.9] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:47 +02:00
Konrad Rzeszutek Wilk	a8358624a3	x86/cpufeatures: Add detection of L1D cache flush support. commit `11e34e64e4` upstream 336996-Speculative-Execution-Side-Channel-Mitigations.pdf defines a new MSR (IA32_FLUSH_CMD) which is detected by CPUID.7.EDX[28]=1 bit being set. This new MSR "gives software a way to invalidate structures with finer granularity than other architectual methods like WBINVD." A copy of this document is available at https://bugzilla.kernel.org/show_bug.cgi?id=199511 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:47 +02:00
Vlastimil Babka	c4b998c88f	x86/speculation/l1tf: Extend 64bit swap file size limit commit `1a7ed1ba4b` upstream The previous patch has limited swap file size so that large offsets cannot clear bits above MAX_PA/2 in the pte and interfere with L1TF mitigation. It assumed that offsets are encoded starting with bit 12, same as pfn. But on x86_64, offsets are encoded starting with bit 9. Thus the limit can be raised by 3 bits. That means 16TB with 42bit MAX_PA and 256TB with 46bit MAX_PA. Fixes: `377eeaa8e1` ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:46 +02:00
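The limits quoted in this message and in the "Limit swap file size to MAX_PA/2" change it fixes follow from simple bit arithmetic. The following is a minimal Python sketch of that arithmetic, not kernel code; the function name `swap_limit_bytes` and the parameter `offset_shift_gain` are made up for illustration.

```python
TIB = 1 << 40  # one tebibyte, written as "TB" in the commit messages


def swap_limit_bytes(max_pa_bits: int, offset_shift_gain: int = 0) -> int:
    """Return the per-swap-file size limit in bytes.

    max_pa_bits: number of physical address bits (MAX_PA).
    offset_shift_gain: extra bits gained because the swap offset starts
        at a lower PTE bit than the PFN (3 on x86_64: bit 9 versus bit 12).
        This parameter is a hypothetical name used only in this sketch.
    """
    half_max_pa = 1 << (max_pa_bits - 1)  # MAX_PA/2
    return half_max_pa << offset_shift_gain


for bits in (42, 46):
    before = swap_limit_bytes(bits) // TIB
    after = swap_limit_bytes(bits, offset_shift_gain=12 - 9) // TIB
    print(f"MAX_PA={bits} bits: {before} TB before, {after} TB after")
```

With 42 and 46 address bits this reproduces the 2TB and 32TB figures of the original limit and the 16TB and 256TB figures of the raised one.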
Thomas Gleixner	4a818f2c35	x86/apic: Ignore secondary threads if nosmt=force commit `2207def700` upstream nosmt on the kernel command line merely prevents the onlining of the secondary SMT siblings. nosmt=force makes the APIC detection code ignore the secondary SMT siblings completely, so they even do not show up as possible CPUs. That reduces the amount of memory allocations for per cpu variables and saves other resources from being allocated too large. This is not fully equivalent to disabling SMT in the BIOS because the low level SMT enabling in the BIOS can result in partitioning of resources between the siblings, which is not undone by just ignoring them. Some CPUs can use the full resources when their sibling is not onlined, but this is depending on the CPU family and model and it's not well documented whether this applies to all partitioned resources. That means depending on the workload disabling SMT in the BIOS might result in better performance. Linus analysis of the Intel manual: The intel optimization manual is not very clear on what the partitioning rules are. I find: "In general, the buffers for staging instructions between major pipe stages are partitioned. These buffers include µop queues after the execution trace cache, the queues after the register rename stage, the reorder buffer which stages instructions for retirement, and the load and store buffers. In the case of load and store buffers, partitioning also provided an easier implementation to maintain memory ordering for each logical processor and detect memory ordering violations" but some of that partitioning may be relaxed if the HT thread is "not active": "In Intel microarchitecture code name Sandy Bridge, the micro-op queue is statically partitioned to provide 28 entries for each logical processor, irrespective of software executing in single thread or multiple threads. 
If one logical processor is not active in Intel microarchitecture code name Ivy Bridge, then a single thread executing on that processor core can use the 56 entries in the micro-op queue" but I do not know what "not active" means, and how dynamic it is. Some of that partitioning may be entirely static and depend on the early BIOS disabling of HT, and even if we park the cores, the resources will just be wasted. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:46 +02:00
Thomas Gleixner	ae76eb1198	x86/cpu/AMD: Evaluate smp_num_siblings early commit `1e1d7e25fd` upstream To support force disabling of SMT it's required to know the number of thread siblings early. amd_get_topology() cannot be called before the APIC driver is selected, so split out the part which initializes smp_num_siblings and invoke it from amd_early_init(). [dwmw2: Backport to 4.9] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:46 +02:00
Borislav Petkov	112d243045	x86/CPU/AMD: Do not check CPUID max ext level before parsing SMP info commit `119bff8a9c` upstream Old code used to check whether CPUID ext max level is >= 0x80000008 because that last leaf contains the number of cores of the physical CPU. The three functions called there now do not depend on that leaf anymore so the check can go. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:46 +02:00
Thomas Gleixner	0ee6f3b23c	x86/cpu/intel: Evaluate smp_num_siblings early commit `1910ad5624` upstream Make use of the new early detection function to initialize smp_num_siblings on the boot cpu before the MP-Table or ACPI/MADT scan happens. That's required for force disabling SMT. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:46 +02:00
Thomas Gleixner	3b4f20ad38	x86/cpu/topology: Provide detect_extended_topology_early() commit `95f3d39ccf` upstream To support force disabling of SMT it's required to know the number of thread siblings early. detect_extended_topology() cannot be called before the APIC driver is selected, so split out the part which initializes smp_num_siblings. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:46 +02:00
Thomas Gleixner	691997bff5	x86/cpu/common: Provide detect_ht_early() commit `545401f444` upstream To support force disabling of SMT it's required to know the number of thread siblings early. detect_ht() cannot be called before the APIC driver is selected, so split out the part which initializes smp_num_siblings. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:46 +02:00
Thomas Gleixner	a6d2fa5dd7	x86/cpu/AMD: Remove the pointless detect_ht() call commit `44ca36de56` upstream Real 32bit AMD CPUs do not have SMT and the only value of the call was to reach the magic printout which got removed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:46 +02:00
Thomas Gleixner	e0439285c6	x86/cpu: Remove the pointless CPU printout commit `55e6d279ab` upstream The value of this printout is dubious at best and there is no point in having it in two different places along with convoluted ways to reach it. Remove it completely. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:46 +02:00
Thomas Gleixner	f37486c0a1	cpu/hotplug: Provide knobs to control SMT commit `05736e4ac1` upstream Provide a command line and a sysfs knob to control SMT. The command line options are: 'nosmt': Enumerate secondary threads, but do not online them 'nosmt=force': Ignore secondary threads completely during enumeration via MP table and ACPI/MADT. The sysfs control file has the following states (read/write): 'on': SMT is enabled. Secondary threads can be freely onlined 'off': SMT is disabled. Secondary threads, even if enumerated cannot be onlined 'forceoff': SMT is permanently disabled. Writes to the control file are rejected. 'notsupported': SMT is not supported by the CPU The command line option 'nosmt' sets the sysfs control to 'off'. This can be changed to 'on' to reenable SMT during runtime. The command line option 'nosmt=force' sets the sysfs control to 'forceoff'. This cannot be changed during runtime. When SMT is 'on' and the control file is changed to 'off' then all online secondary threads are offlined and attempts to online a secondary thread later on are rejected. When SMT is 'off' and the control file is changed to 'on' then secondary threads can be onlined again. The 'off' -> 'on' transition does not automatically online the secondary threads. When the control file is set to 'forceoff', the behaviour is the same as setting it to 'off', but the operation is irreversible and later writes to the control file are rejected. When the control status is 'notsupported' then writes to the control file are rejected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:46 +02:00
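The state transitions described in this message can be pictured as a small state machine. The toy Python model below is only an illustration of the rules as stated in this commit; the class `SmtControl` and its methods are hypothetical and do not correspond to kernel interfaces. It reduces "the secondary threads" to a single boolean and models this commit's behaviour, in which the 'off' -> 'on' transition does not online anything by itself.

```python
class SmtControl:
    """Toy model of the sysfs SMT control file (hypothetical, for illustration)."""

    WRITABLE = ("on", "off", "forceoff")

    def __init__(self, state="on"):
        # state is one of 'on', 'off', 'forceoff', 'notsupported'
        self.state = state
        # Secondary threads are only assumed online when SMT starts as 'on'.
        self.secondary_online = state == "on"

    def write(self, value):
        """Handle a write to the control file; return True if accepted."""
        if self.state in ("forceoff", "notsupported"):
            return False  # writes are rejected in these states
        if value not in self.WRITABLE:
            return False
        if value in ("off", "forceoff"):
            self.secondary_online = False  # online siblings are offlined
        # 'off' -> 'on' only permits onlining again; nothing is onlined here.
        self.state = value
        return True

    def online_secondary(self):
        """Try to online a secondary thread; only permitted in state 'on'."""
        if self.state != "on":
            return False
        self.secondary_online = True
        return True


ctl = SmtControl()            # SMT starts enabled
ctl.write("off")              # offlines the secondary threads
print(ctl.online_secondary()) # rejected while the state is 'off'
ctl.write("on")               # onlining is permitted again
print(ctl.secondary_online)   # still offline until explicitly onlined
```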
Thomas Gleixner	373b8def45	cpu/hotplug: Split do_cpu_down() commit `cc1fe215e1` upstream Split out the inner workings of do_cpu_down() to allow reuse of that function for the upcoming SMT disabling mechanism. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:45 +02:00
Thomas Gleixner	9333575fc4	cpu/hotplug: Make bringup/teardown of smp threads symmetric commit `c4de65696d` upstream The asymmetry caused a warning to trigger if the bootup was stopped in state CPUHP_AP_ONLINE_IDLE. The warning no longer triggers as kthread_park() can now be invoked on already or still parked threads. But there is still no reason to have this be asymmetric. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:45 +02:00
Thomas Gleixner	16fd33cd35	x86/topology: Provide topology_smt_supported() commit `f048c399e0` upstream Provide information whether SMT is supported by the CPUs. Preparatory patch for SMT control mechanism. Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:45 +02:00
Thomas Gleixner	7b69a96e5a	x86/smp: Provide topology_is_primary_thread() commit `6a4d2657e0` upstream If the CPU is supporting SMT then the primary thread can be found by checking the lower APIC ID bits for zero. smp_num_siblings is used to build the mask for the APIC ID bits which need to be taken into account. This uses the MPTABLE or ACPI/MADT supplied APIC ID, which can be different than the initial APIC ID in CPUID. But according to AMD the lower bits have to be consistent. Intel gave a tentative confirmation as well. Preparatory patch to support disabling SMT at boot/runtime. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:45 +02:00
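The primary-thread check described in this message amounts to masking the low APIC ID bits. A minimal Python sketch of the idea follows; the function name `is_primary_thread` is hypothetical, and the mask construction assumes that `smp_num_siblings` is a power of two. It is an illustration, not the kernel implementation.

```python
def is_primary_thread(apic_id: int, smp_num_siblings: int) -> bool:
    """Return True if the APIC ID belongs to the first SMT thread of a core."""
    if smp_num_siblings <= 1:
        return True  # no SMT: every logical CPU is a primary thread
    # Mask covering the low APIC ID bits that number the SMT siblings.
    # Assumption of this sketch: smp_num_siblings is a power of two.
    mask = (1 << (smp_num_siblings.bit_length() - 1)) - 1
    return (apic_id & mask) == 0


# With two siblings per core, even APIC IDs are primary threads.
for apic_id in range(4):
    print(apic_id, is_primary_thread(apic_id, smp_num_siblings=2))
```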
Konrad Rzeszutek Wilk	1ac1dc1467	x86/bugs: Move the l1tf function and define pr_fmt properly commit `56563f53d3` upstream The pr_warn in l1tf_select_mitigation would have used the prior pr_fmt which was defined as "Spectre V2 : ". Move the function to be past SSBD and also define the pr_fmt. Fixes: `17dbca1193` ("x86/speculation/l1tf: Add sysfs reporting for l1tf") Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:45 +02:00
Andi Kleen	e3923475eb	x86/speculation/l1tf: Limit swap file size to MAX_PA/2 commit `377eeaa8e1` upstream For the L1TF workaround it's necessary to limit the swap file size to below MAX_PA/2, so that the higher bits of the swap offset inverted never point to valid memory. Add a mechanism for the architecture to override the swap file size check in swapfile.c and add a x86 specific max swapfile check function that enforces that limit. The check is only enabled if the CPU is vulnerable to L1TF. In VMs with 42bit MAX_PA the typical limit is 2TB now, on a native system with 46bit PA it is 32TB. The limit is only per individual swap file, so it's always possible to exceed these limits with multiple swap files or partitions. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:45 +02:00
Andi Kleen	7c5b42f82c	x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings commit `42e4089c78` upstream For L1TF PROT_NONE mappings are protected by inverting the PFN in the page table entry. This sets the high bits in the CPU's address space, thus making sure not to point an unmapped entry to valid cached memory. Some server system BIOSes put the MMIO mappings high up in the physical address space. If such a high mapping was mapped to unprivileged users they could attack low memory by setting such a mapping to PROT_NONE. This could happen through a special device driver which is not access protected. Normal /dev/mem is of course access protected. To avoid this forbid PROT_NONE mappings or mprotect for high MMIO mappings. Valid page mappings are allowed because the system is then unsafe anyways. It's not expected that users commonly use PROT_NONE on MMIO. But to minimize any impact this is only enforced if the mapping actually refers to a high MMIO address (defined as the MAX_PA-1 bit being set), and also skip the check for root. For mmaps this is straight forward and can be handled in vm_insert_pfn and in remap_pfn_range(). For mprotect it's a bit trickier. At the point where the actual PTEs are accessed a lot of state has been changed and it would be difficult to undo on an error. Since this is an uncommon case use a separate early page table walk pass for MMIO PROT_NONE mappings that checks for this condition early. For non MMIO and non PROT_NONE there are no changes. [dwmw2: Backport to 4.9] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:45 +02:00
Andi Kleen	432e99b340	x86/speculation/l1tf: Add sysfs reporting for l1tf commit `17dbca1193` upstream L1TF core kernel workarounds are cheap and normally always enabled, However they still should be reported in sysfs if the system is vulnerable or mitigated. Add the necessary CPU feature/bug bits. - Extend the existing checks for Meltdowns to determine if the system is vulnerable. All CPUs which are not vulnerable to Meltdown are also not vulnerable to L1TF - Check for 32bit non PAE and emit a warning as there is no practical way for mitigation due to the limited physical address bits - If the system has more than MAX_PA/2 physical memory the invert page workarounds don't protect the system against the L1TF attack anymore, because an inverted physical address will also point to valid memory. Print a warning in this case and report that the system is vulnerable. Add a function which returns the PFN limit for the L1TF mitigation, which will be used in follow up patches for sanity and range checks. [ tglx: Renamed the CPU feature bit to L1TF_PTEINV ] [ dwmw2: Backport to 4.9 (cpufeatures.h, E820) ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:45 +02:00
Andi Kleen	5b2ec92f70	x86/speculation/l1tf: Make sure the first page is always reserved commit `10a70416e1` upstream The L1TF workaround doesn't make any attempt to mitigate speculative accesses to the first physical page for zeroed PTEs. Normally it only contains some data from the early real mode BIOS. It's not entirely clear that the first page is reserved in all configurations, so add an extra reservation call to make sure it is really reserved. In most configurations (e.g. with the standard reservations) it's likely a nop. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:44 +02:00
Andi Kleen	33182fe97a	x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation commit `6b28baca9b` upstream When PTEs are set to PROT_NONE the kernel just clears the Present bit and preserves the PFN, which creates attack surface for L1TF speculation attacks. This is important inside guests, because L1TF speculation bypasses physical page remapping. While the host has its own mitigations preventing leaking data from other VMs into the guest, this would still risk leaking the wrong page inside the current guest. This uses the same technique as Linus' swap entry patch: while an entry is in PROTNONE state, invert the complete PFN part of it. This ensures that the highest bit will point to non existing memory. The invert is done by pte/pmd_modify and pfn/pmd/pud_pte for PROTNONE and pte/pmd/pud_pfn undo it. This assumes that no code path touches the PFN part of a PTE directly without using these primitives. This doesn't handle the case that MMIO is on the top of the CPU physical memory. If such an MMIO region was exposed by an unprivileged driver for mmap it would be possible to attack some real memory. However this situation is all rather unlikely. For 32bit non PAE the inversion is not done because there are really not enough bits to protect anything. Q: Why does the guest need to be protected when the hypervisor already has L1TF mitigations? A: Here's an example: Physical pages 1 2 get mapped into a guest as GPA 1 -> PA 2 GPA 2 -> PA 1 through EPT. The L1TF speculation ignores the EPT remapping. Now the guest kernel maps GPA 1 to process A and GPA 2 to process B, and they belong to different users and should be isolated. A sets the GPA 1 PA 2 PTE to PROT_NONE to bypass the EPT remapping and gets read access to the underlying physical page. Which in this case points to PA 2, so it can read process B's data, if it happened to be in L1, so isolation inside the guest is broken. There's nothing the hypervisor can do about this. 
This mitigation has to be done in the guest itself. [ tglx: Massaged changelog ] [ dwmw2: backported to 4.9 ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:44 +02:00
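The invert-and-undo idea described in the PROT_NONE commit above can be sketched as follows. This is a deliberately simplified model: the constants, the 52-bit address width, and both function names are assumptions, and every non-present entry is treated as inverted, which omits special cases that the real kernel code has to handle.

```python
PAGE_SHIFT = 12   # assumption: 4 KiB pages
PHYS_BITS = 52    # assumption: maximum x86-64 physical address width
PRESENT = 1 << 0  # Present bit of a PTE
# Bits of the PTE that hold the page frame number (bits 12..51 here).
PFN_MASK = ((1 << PHYS_BITS) - 1) & ~((1 << PAGE_SHIFT) - 1)


def make_prot_none(pte: int) -> int:
    """Clear Present and invert the whole PFN field (hypothetical helper)."""
    return (pte & ~PRESENT) ^ PFN_MASK


def pte_pfn(pte: int) -> int:
    """Return the PFN, undoing the inversion for non-present entries."""
    if not pte & PRESENT:
        pte ^= PFN_MASK
    return (pte & PFN_MASK) >> PAGE_SHIFT
```

In this model a PFN with a clear top bit ends up with the top bit set while the entry is non-present, so the stored value points above MAX_PA/2, and `pte_pfn()` still recovers the original frame number.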
Linus Torvalds	6071227488	x86/speculation/l1tf: Protect swap entries against L1TF commit `2f22b4cd45` upstream With L1 terminal fault the CPU speculates into unmapped PTEs, and resulting side effects make it possible to read the memory the PTE is pointing to, if its values are still in the L1 cache. For swapped out pages Linux uses unmapped PTEs and stores a swap entry into them. To protect against L1TF it must be ensured that the swap entry is not pointing to valid memory, which requires setting higher bits (between bit 36 and bit 45) that are inside the CPUs physical address space, but outside any real memory. To do this invert the offset to make sure the higher bits are always set, as long as the swap file is not too big. Note there is no workaround for 32bit !PAE, or on systems which have more than MAX_PA/2 worth of memory. The latter case is very unlikely to happen on real systems. [AK: updated description and minor tweaks. Split out from the original patch ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:44 +02:00
Linus Torvalds	2c9b57e447	x86/speculation/l1tf: Change order of offset/type in swap entry commit `bcd11afa7a` upstream If pages are swapped out, the swap entry is stored in the corresponding PTE, which has the Present bit cleared. CPUs vulnerable to L1TF speculate on PTE entries which do not have the Present bit set and would treat the swap entry as a physical address (PFN). To mitigate that the upper bits of the PTE must be set so the PTE points to non existent memory. The swap entry stores the type and the offset of a swapped out page in the PTE. type is stored in bit 9-13 and offset in bit 14-63. The hardware ignores the bits beyond the physical address space limit, so to make the mitigation effective it is required to start 'offset' at the lowest possible bit so that even large swap offsets do not reach into the physical address space limit bits. Move offset to bit 9-58 and type to bit 59-63 which are the bits that hardware generally doesn't care about. That, in turn, means that if you are on a desktop chip with only 40 bits of physical addressing, now that the offset starts at bit 9, there needs to be 30 bits of offset actually in use until bit 39 ends up being set, which means when inverted it will again point into existing memory. So that's 4 terabyte of swap space (because the offset is counted in pages, so 30 bits of offset is 42 bits of actual coverage). With bigger physical addressing, that obviously grows further, until the limit of the offset is hit (at 50 bits of offset - 62 bits of actual swap file coverage). This is a preparatory change for the actual swap entry inversion to protect against L1TF. [ AK: Updated description and minor tweaks. 
Split into two parts ] [ tglx: Massaged changelog ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:44 +02:00
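The swap-size arithmetic in the commit message above (40 physical address bits, offset starting at bit 9, 4 terabytes of swap) can be checked with a short calculation. The function name and the 4 KiB page size are assumptions for illustration.

```python
PAGE_SHIFT = 12           # assumption: 4 KiB pages
SWP_OFFSET_FIRST_BIT = 9  # offset starts at bit 9 after this change


def swap_bytes_until_top_bit(phys_bits: int) -> int:
    """Swap size at which the offset first reaches the top physical address bit."""
    top_bit = phys_bits - 1
    # Number of offset bits that fit below the top physical address bit.
    offset_bits = top_bit - SWP_OFFSET_FIRST_BIT
    # The offset counts pages, so multiply by the page size to get bytes.
    return (1 << offset_bits) << PAGE_SHIFT
```

For `phys_bits = 40` this gives 30 offset bits and 2^42 bytes, which matches the 4 terabyte figure quoted in the message.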
Naoya Horiguchi	1a4922e0f0	mm: x86: move _PAGE_SWP_SOFT_DIRTY from bit 7 to bit 1 commit `eee4818baa` upstream _PAGE_PSE is used to distinguish between a truly non-present (_PAGE_PRESENT=0) PMD, and a PMD which is undergoing a THP split and should be treated as present. But _PAGE_SWP_SOFT_DIRTY currently uses the _PAGE_PSE bit, which would cause confusion between one of those PMDs undergoing a THP split, and a soft-dirty PMD. Dropping _PAGE_PSE check in pmd_present() does not work well, because it can hurt optimization of tlb handling in thp split. Thus, we need to move the bit. In the current kernel, bits 1-4 are not used in non-present format since commit `00839ee3b2` ("x86/mm: Move swap offset/type up in PTE to work around erratum"). So let's move _PAGE_SWP_SOFT_DIRTY to bit 1. Bit 7 is used as reserved (always clear), so please don't use it for other purpose. [dwmw2: Pulled in to 4.9 backport to support L1TF changes] Link: http://lkml.kernel.org/r/20170717193955.20207-3-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:44 +02:00
Andi Kleen	bbd07cbb10	x86/speculation/l1tf: Increase 32bit PAE __PHYSICAL_PAGE_SHIFT commit `50896e180c` upstream L1 Terminal Fault (L1TF) is a speculation related vulnerability. The CPU speculates on PTE entries which do not have the PRESENT bit set, if the content of the resulting physical address is available in the L1D cache. The OS side mitigation makes sure that a !PRESENT PTE entry points to a physical address outside the actually existing and cachable memory space. This is achieved by inverting the upper bits of the PTE. Due to the address space limitations this only works for 64bit and 32bit PAE kernels, but not for 32bit non PAE. This mitigation applies to both host and guest kernels, but in case of a 64bit host (hypervisor) and a 32bit PAE guest, inverting the upper bits of the PAE address space (44bit) is not enough if the host has more than 43 bits of populated memory address space, because the speculation treats the PTE content as a physical host address bypassing EPT. The host (hypervisor) protects itself against the guest by flushing L1D as needed, but pages inside the guest are not protected against attacks from other processes inside the same guest. For the guest the inverted PTE mask has to match the host to provide the full protection for all pages the host could possibly map into the guest. The host's populated address space is not known to the guest, so the mask must cover the possible maximal host address space, i.e. 52 bit. On 32bit PAE the maximum PTE mask is currently set to 44 bit because that is the limit imposed by 32bit unsigned long PFNs in the VMs. This limits the mask to be below what the host could possibly use for physical pages. The L1TF PROT_NONE protection code uses the PTE masks to determine which bits to invert to make sure the higher bits are set for unmapped entries to prevent L1TF speculation attacks against EPT inside guests. 
In order to invert all bits that could be used by the host, increase __PHYSICAL_PAGE_SHIFT to 52 to match 64bit. The real limit for a 32bit PAE kernel is still 44 bits because all Linux PTEs are created from unsigned long PFNs, so they cannot be higher than 44 bits on a 32bit kernel. So these extra PFN bits should be never set. The only users of this macro are using it to look at PTEs, so it's safe. [ tglx: Massaged changelog ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:44 +02:00
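The 44-bit and 52-bit figures in the PAE commit above follow from simple arithmetic, sketched here. The helper names and the 4 KiB page size are assumptions for illustration only.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def phys_bits_from_pfn_bits(pfn_bits: int) -> int:
    """Physical address width reachable with PFNs of `pfn_bits` bits."""
    return pfn_bits + PAGE_SHIFT


def pfn_field_mask(phys_shift: int) -> int:
    """PTE bits treated as the PFN field for a given physical mask shift."""
    return ((1 << phys_shift) - 1) & ~((1 << PAGE_SHIFT) - 1)


# Bits a 52-bit mask inverts that a 44-bit mask would leave untouched.
extra_bits = pfn_field_mask(52) & ~pfn_field_mask(44)
```

A 32-bit `unsigned long` PFN gives 32 + 12 = 44 address bits, and `extra_bits` covers bits 44 to 51, which are the bits a 64-bit host could use but a 44-bit mask would not invert.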
Nick Desaulniers	329d815667	x86/irqflags: Provide a declaration for native_save_fl commit `208cbb3255` upstream. It was reported that the commit `d0a8d9378d` is causing users of gcc < 4.9 to observe -Werror=missing-prototypes errors. Indeed, it seems that: extern inline unsigned long native_save_fl(void) { return 0; } compiled with -Werror=missing-prototypes produces this warning in gcc < 4.9, but not gcc >= 4.9. Fixes: `d0a8d9378d` ("x86/paravirt: Make native_save_fl() extern inline"). Reported-by: David Laight <david.laight@aculab.com> Reported-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Cc: jgross@suse.com Cc: kstewart@linuxfoundation.org Cc: gregkh@linuxfoundation.org Cc: boris.ostrovsky@oracle.com Cc: astrachan@google.com Cc: mka@chromium.org Cc: arnd@arndb.de Cc: tstellar@redhat.com Cc: sedat.dilek@gmail.com Cc: David.Laight@aculab.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180803170550.164688-1-ndesaulniers@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:44 +02:00
Masami Hiramatsu	a92daabdfc	kprobes/x86: Fix %p uses in error messages commit `0ea063306e` upstream. Remove all %p uses in error messages in kprobes/x86. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Howells <dhowells@redhat.com> Cc: David S . Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jon Medhurst <tixy@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tobin C . Harding <me@tobin.cc> Cc: Will Deacon <will.deacon@arm.com> Cc: acme@kernel.org Cc: akpm@linux-foundation.org Cc: brueckner@linux.vnet.ibm.com Cc: linux-arch@vger.kernel.org Cc: rostedt@goodmis.org Cc: schwidefsky@de.ibm.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/lkml/152491902310.9916.13355297638917767319.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:44 +02:00
Jiri Kosina	6455f41db5	x86/speculation: Protect against userspace-userspace spectreRSB commit `fdf82a7856` upstream. The article "Spectre Returns! Speculation Attacks using the Return Stack Buffer" [1] describes two new (sub-)variants of spectrev2-like attacks, making use solely of the RSB contents even on CPUs that don't fallback to BTB on RSB underflow (Skylake+). Mitigate userspace-userspace attacks by always unconditionally filling RSB on context switch when the generic spectrev2 mitigation has been enabled. [1] https://arxiv.org/pdf/1807.07940.pdf Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1807261308190.997@cbobk.fhfr.pm Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:44 +02:00
Peter Zijlstra	640fe070d8	x86/paravirt: Fix spectre-v2 mitigations for paravirt guests commit `5800dc5c19` upstream. Nadav reported that on guests we're failing to rewrite the indirect calls to CALLEE_SAVE paravirt functions. In particular the pv_queued_spin_unlock() call is left unpatched and that is all over the place. This obviously wrecks Spectre-v2 mitigation (for paravirt guests) which relies on not actually having indirect calls around. The reason is an incorrect clobber test in paravirt_patch_call(); this function rewrites an indirect call with a direct call to the _SAME_ function, there is no possible way the clobbers can be different because of this. Therefore remove this clobber check. Also put WARNs on the other patch failure case (not enough room for the instruction) which I've not seen trigger in my (limited) testing. Three live kernel image disassemblies for lock_sock_nested (as a small function that illustrates the problem nicely). PRE is the current situation for guests, POST is with this patch applied and NATIVE is with or without the patch for !guests. 
PRE: (gdb) disassemble lock_sock_nested Dump of assembler code for function lock_sock_nested: 0xffffffff817be970 <+0>: push %rbp 0xffffffff817be971 <+1>: mov %rdi,%rbp 0xffffffff817be974 <+4>: push %rbx 0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx 0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched> 0xffffffff817be981 <+17>: mov %rbx,%rdi 0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh> 0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax 0xffffffff817be98f <+31>: test %eax,%eax 0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74> 0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp) 0xffffffff817be99d <+45>: mov %rbx,%rdi 0xffffffff817be9a0 <+48>: callq *0xffffffff822299e8 0xffffffff817be9a7 <+55>: pop %rbx 0xffffffff817be9a8 <+56>: pop %rbp 0xffffffff817be9a9 <+57>: mov $0x200,%esi 0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi 0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063ae0 <__local_bh_enable_ip> 0xffffffff817be9ba <+74>: mov %rbp,%rdi 0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock> 0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35> End of assembler dump. 
POST: (gdb) disassemble lock_sock_nested Dump of assembler code for function lock_sock_nested: 0xffffffff817be970 <+0>: push %rbp 0xffffffff817be971 <+1>: mov %rdi,%rbp 0xffffffff817be974 <+4>: push %rbx 0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx 0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched> 0xffffffff817be981 <+17>: mov %rbx,%rdi 0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh> 0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax 0xffffffff817be98f <+31>: test %eax,%eax 0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74> 0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp) 0xffffffff817be99d <+45>: mov %rbx,%rdi 0xffffffff817be9a0 <+48>: callq 0xffffffff810a0c20 <__raw_callee_save___pv_queued_spin_unlock> 0xffffffff817be9a5 <+53>: xchg %ax,%ax 0xffffffff817be9a7 <+55>: pop %rbx 0xffffffff817be9a8 <+56>: pop %rbp 0xffffffff817be9a9 <+57>: mov $0x200,%esi 0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi 0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063aa0 <__local_bh_enable_ip> 0xffffffff817be9ba <+74>: mov %rbp,%rdi 0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock> 0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35> End of assembler dump. 
NATIVE: (gdb) disassemble lock_sock_nested Dump of assembler code for function lock_sock_nested: 0xffffffff817be970 <+0>: push %rbp 0xffffffff817be971 <+1>: mov %rdi,%rbp 0xffffffff817be974 <+4>: push %rbx 0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx 0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched> 0xffffffff817be981 <+17>: mov %rbx,%rdi 0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh> 0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax 0xffffffff817be98f <+31>: test %eax,%eax 0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74> 0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp) 0xffffffff817be99d <+45>: mov %rbx,%rdi 0xffffffff817be9a0 <+48>: movb $0x0,(%rdi) 0xffffffff817be9a3 <+51>: nopl 0x0(%rax) 0xffffffff817be9a7 <+55>: pop %rbx 0xffffffff817be9a8 <+56>: pop %rbp 0xffffffff817be9a9 <+57>: mov $0x200,%esi 0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi 0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063ae0 <__local_bh_enable_ip> 0xffffffff817be9ba <+74>: mov %rbp,%rdi 0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock> 0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35> End of assembler dump. Fixes: `63f70270cc` ("[PATCH] i386: PARAVIRT: add common patching machinery") Fixes: `3010a0663f` ("x86/paravirt, objtool: Annotate indirect calls") Reported-by: Nadav Amit <namit@vmware.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:43 +02:00
Oleksij Rempel	16aeb3f175	ARM: dts: imx6sx: fix irq for pcie bridge commit `1bcfe05640` upstream. Use the correct IRQ line for the MSI controller in the PCIe host controller. Apparently a different IRQ line is used compared to other i.MX6 variants. Without this change MSI IRQs aren't properly propagated to the upstream interrupt controller. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Fixes: `b1d17f68e5` ("ARM: dts: imx: add initial imx6sx device tree source") Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:43 +02:00
Michael Mera	27250cf83d	IB/ocrdma: fix out of bounds access to local buffer commit `062d0f22a3` upstream. In write to debugfs file 'resource_stats' the local buffer 'tmp_str' is written at index 'count-1' where 'count' is the size of the write, so potentially 0. This patch filters odd values for the write size/position to avoid this type of problem. Signed-off-by: Michael Mera <dev@michaelmera.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:43 +02:00
Fabio Estevam	5ee45fc998	mtd: nand: qcom: Add a NULL check for devm_kasprintf() commit `069f05346d` upstream. devm_kasprintf() may fail, so we should better add a NULL check and propagate an error on failure. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:43 +02:00
Jack Morgenstein	e2ba7bf197	IB/mlx4: Mark user MR as writable if actual virtual memory is writable commit `d8f9cc328c` upstream. To allow rereg_user_mr to modify the MR from read-only to writable without using get_user_pages again, we needed to define the initial MR as writable. However, this was originally done unconditionally, without taking into account the writability of the underlying virtual memory. As a result, any attempt to register a read-only MR over read-only virtual memory failed. To fix this, do not add the writable flag bit when the user virtual memory is not writable (e.g. const memory). However, when the underlying memory is NOT writable (and we therefore do not define the initial MR as writable), the IB core adds a "force writable" flag to its user-pages request. If this succeeds, the reg_user_mr caller gets a writable copy of the original pages. If the user-space caller then does a rereg_user_mr operation to enable writability, this will succeed. This should not be allowed, since the original virtual memory was not writable. Cc: <stable@vger.kernel.org> Fixes: `9376932d0c` ("IB/mlx4_ib: Add support for user MR re-registration") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:43 +02:00
Jack Morgenstein	11410f9998	IB/core: Make testing MR flags for writability a static inline function commit `08bb558ac1` upstream. Make the MR writability flags check, which is performed in umem.c, a static inline function in file ib_verbs.h This allows the function to be used by low-level infiniband drivers. Cc: <stable@vger.kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:43 +02:00
Eric W. Biederman	a3a7b992b2	proc: Fix proc_sys_prune_dcache to hold a sb reference commit `2fd1d2c4ce` upstream. Andrei Vagin writes: FYI: This bug has been reproduced on 4.11.7 > BUG: Dentry ffff895a3dd01240{i=4e7c09a,n=lo} still in use (1) [unmount of proc proc] > ------------[ cut here ]------------ > WARNING: CPU: 1 PID: 13588 at fs/dcache.c:1445 umount_check+0x6e/0x80 > CPU: 1 PID: 13588 Comm: kworker/1:1 Not tainted 4.11.7-200.fc25.x86_64 #1 > Hardware name: CompuLab sbc-flt1/fitlet, BIOS SBCFLT_0.08.04 06/27/2015 > Workqueue: events proc_cleanup_work > Call Trace: > dump_stack+0x63/0x86 > __warn+0xcb/0xf0 > warn_slowpath_null+0x1d/0x20 > umount_check+0x6e/0x80 > d_walk+0xc6/0x270 > ? dentry_free+0x80/0x80 > do_one_tree+0x26/0x40 > shrink_dcache_for_umount+0x2d/0x90 > generic_shutdown_super+0x1f/0xf0 > kill_anon_super+0x12/0x20 > proc_kill_sb+0x40/0x50 > deactivate_locked_super+0x43/0x70 > deactivate_super+0x5a/0x60 > cleanup_mnt+0x3f/0x90 > mntput_no_expire+0x13b/0x190 > kern_unmount+0x3e/0x50 > pid_ns_release_proc+0x15/0x20 > proc_cleanup_work+0x15/0x20 > process_one_work+0x197/0x450 > worker_thread+0x4e/0x4a0 > kthread+0x109/0x140 > ? process_one_work+0x450/0x450 > ? kthread_park+0x90/0x90 > ret_from_fork+0x2c/0x40 > ---[ end trace e1c109611e5d0b41 ]--- > VFS: Busy inodes after unmount of proc. Self-destruct in 5 seconds. Have a nice day... > BUG: unable to handle kernel NULL pointer dereference at (null) > IP: _raw_spin_lock+0xc/0x30 > PGD 0 Fix this by taking a reference to the super block in proc_sys_prune_dcache. The superblock reference is the core of the fix; however, the sysctl_inodes list is converted to a hlist so that hlist_del_init_rcu may be used. This allows proc_sys_prune_dcache to remove inodes from the sysctl_inodes list, while not causing problems for proc_sys_evict_inode if it later chooses to remove the inode from the sysctl_inodes list. 
Removing inodes from the sysctl_inodes list allows proc_sys_prune_dcache to have a progress guarantee, while still being able to drop all locks. The fact that head->unregistering is set in start_unregistering ensures that no more inodes will be added to the sysctl_inodes list. Previously the code did a dance where it delayed calling iput until the next entry in the list was being considered to ensure the inode remained on the sysctl_inodes list until the next entry was walked to. The structure of the loop in this patch does not need that so is much easier to understand and maintain. Cc: stable@vger.kernel.org Reported-by: Andrei Vagin <avagin@gmail.com> Tested-by: Andrei Vagin <avagin@openvz.org> Fixes: `ace0c791e6` ("proc/sysctl: Don't grab i_lock under sysctl_lock.") Fixes: `d6cffbbe9a` ("proc/sysctl: prune stale dentries during unregistering") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:43 +02:00
Eric W. Biederman	631f93a6fe	proc/sysctl: Don't grab i_lock under sysctl_lock. commit `ace0c791e6` upstream. Konstantin Khlebnikov <khlebnikov@yandex-team.ru> writes: > This patch has locking problem. I've got lockdep splat under LTP. > > [ 6633.115456] ====================================================== > [ 6633.115502] [ INFO: possible circular locking dependency detected ] > [ 6633.115553] 4.9.10-debug+ #9 Tainted: G L > [ 6633.115584] ------------------------------------------------------- > [ 6633.115627] ksm02/284980 is trying to acquire lock: > [ 6633.115659] (&sb->s_type->i_lock_key#4){+.+...}, at: [<ffffffff816bc1ce>] igrab+0x1e/0x80 > [ 6633.115834] but task is already holding lock: > [ 6633.115882] (sysctl_lock){+.+...}, at: [<ffffffff817e379b>] unregister_sysctl_table+0x6b/0x110 > [ 6633.116026] which lock already depends on the new lock. > [ 6633.116026] > [ 6633.116080] > [ 6633.116080] the existing dependency chain (in reverse order) is: > [ 6633.116117] > -> #2 (sysctl_lock){+.+...}: > -> #1 (&(&dentry->d_lockref.lock)->rlock){+.+...}: > -> #0 (&sb->s_type->i_lock_key#4){+.+...}: > > d_lock nests inside i_lock > sysctl_lock nests inside d_lock in d_compare > > This patch adds i_lock nesting inside sysctl_lock. Al Viro <viro@ZenIV.linux.org.uk> replied: > Once ->unregistering is set, you can drop sysctl_lock just fine. So I'd > try something like this - use rcu_read_lock() in proc_sys_prune_dcache(), > drop sysctl_lock() before it and regain after. Make sure that no inodes > are added to the list ones ->unregistering has been set and use RCU list > primitives for modifying the inode list, with sysctl_lock still used to > serialize its modifications. > > Freeing struct inode is RCU-delayed (see proc_destroy_inode()), so doing > igrab() is safe there. Since we don't drop inode reference until after we'd > passed beyond it in the list, list_for_each_entry_rcu() should be fine. I agree with Al Viro's analysis of the situation. 
Fixes: `d6cffbbe9a` ("proc/sysctl: prune stale dentries during unregistering") Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Tested-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:43 +02:00
Konstantin Khlebnikov	b96e215e53	proc/sysctl: prune stale dentries during unregistering commit `d6cffbbe9a` upstream. Currently unregistering sysctl table does not prune its dentries. Stale dentries could slow down sysctl operations significantly. For example, command: # for i in {1..100000} ; do unshare -n -- sysctl -a &> /dev/null ; done creates millions of stale dentries around sysctls of loopback interface: # sysctl fs.dentry-state fs.dentry-state = 25812579 24724135 45 0 0 0 All of them have matching names, thus lookup has to scan through the whole hash chain and call d_compare (proc_sys_compare) which checks them under system-wide spinlock (sysctl_lock). # time sysctl -a > /dev/null real 1m12.806s user 0m0.016s sys 1m12.400s Currently only memory reclaimer could remove this garbage. But without significant memory pressure this never happens. This patch collects sysctl inodes into a list on sysctl table header and prunes all their dentries once that table unregisters. Konstantin Khlebnikov <khlebnikov@yandex-team.ru> writes: > On 10.02.2017 10:47, Al Viro wrote: >> how about >> the matching stats after that patch? > > dcache size doesn't grow endlessly, so stats are fine > > # sysctl fs.dentry-state > fs.dentry-state = 92712 58376 45 0 0 0 > > # time sysctl -a &>/dev/null > > real 0m0.013s > user 0m0.004s > sys 0m0.008s Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:43 +02:00
Al Viro	e31578c6fb	fix __legitimize_mnt()/mntput() race commit `119e1ef80e` upstream. __legitimize_mnt() has two problems - one is that in case of success the check of mount_lock is not ordered wrt preceding increment of refcount, making it possible to have successful __legitimize_mnt() on one CPU just before the otherwise final mntput() on another, with __legitimize_mnt() not seeing mntput() taking the lock and mntput() not seeing the increment done by __legitimize_mnt(). Solved by a pair of barriers. Another is that failure of __legitimize_mnt() on the second read_seqretry() leaves us with reference that'll need to be dropped by caller; however, if that races with final mntput() we can end up with caller dropping rcu_read_lock() and doing mntput() to release that reference - with the first mntput() having freed the damn thing just as rcu_read_lock() had been dropped. Solution: in "do mntput() yourself" failure case grab mount_lock, check if MNT_DOOMED has been set by racing final mntput() that has missed our increment and if it has - undo the increment and treat that as "failure, caller doesn't need to drop anything" case. It's not easy to hit - the final mntput() has to come right after the first read_seqretry() in __legitimize_mnt() and manage to miss the increment done by __legitimize_mnt() before the second read_seqretry() in there. The things that are almost impossible to hit on bare hardware are not impossible on SMP KVM, though... Reported-by: Oleg Nesterov <oleg@redhat.com> Fixes: `48a066e72d` ("RCU'd vfsmounts") Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:43 +02:00
Al Viro	87a2d84d2f	fix mntput/mntput race commit `9ea0a46ca2` upstream. mntput_no_expire() does the calculation of total refcount under mount_lock; unfortunately, the decrement (as well as all increments) are done outside of it, leading to false positives in the "are we dropping the last reference" test. Consider the following situation: * mnt is a lazy-umounted mount, kept alive by two opened files. One of those files gets closed. Total refcount of mnt is 2. On CPU 42 mntput(mnt) (called from __fput()) drops one reference, decrementing component * After it has looked at component #0, the process on CPU 0 does mntget(), incrementing component #0, gets preempted and gets to run again - on CPU 69. There it does mntput(), which drops the reference (component #69) and proceeds to spin on mount_lock. * On CPU 42 our first mntput() finishes counting. It observes the decrement of component #69, but not the increment of component #0. As the result, the total it gets is not 1 as it should've been - it's 0. At which point we decide that vfsmount needs to be killed and proceed to free it and shut the filesystem down. However, there's still another opened file on that filesystem, with reference to (now freed) vfsmount, etc. and we are screwed. It's not a wide race, but it can be reproduced with artificial slowdown of the mnt_get_count() loop, and it should be easier to hit on SMP KVM setups. Fix consists of moving the refcount decrement under mount_lock; the tricky part is that we want (and can) keep the fast case (i.e. mount that still has non-NULL ->mnt_ns) entirely out of mount_lock. All places that zero mnt->mnt_ns are dropping some reference to mnt and they call synchronize_rcu() before that mntput(). IOW, if mntput() observes (under rcu_read_lock()) a non-NULL ->mnt_ns, it is guaranteed that there is another reference yet to be dropped. 
Reported-by: Jann Horn <jannh@google.com> Tested-by: Jann Horn <jannh@google.com> Fixes: `48a066e72d` ("RCU'd vfsmounts") Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:42 +02:00
Al Viro	59199c04b7	make sure that __dentry_kill() always invalidates d_seq, unhashed or not commit `4c0d7cd5c8` upstream. RCU pathwalk relies upon the assumption that anything that changes ->d_inode of a dentry will invalidate its ->d_seq. That's almost true - the one exception is that the final dput() of already unhashed dentry does not touch ->d_seq at all. Unhashing does, though, so for anything we'd found by RCU dcache lookup we are fine. Unfortunately, we can start with an unhashed dentry or jump into it. We could try and be careful in the (few) places where that could happen. Or we could just make the final dput() invalidate the damn thing, unhashed or not. The latter is much simpler and easier to backport, so let's do it that way. Reported-by: "Dae R. Jeong" <threeearcat@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:42 +02:00
Al Viro	cfac7df7dc	root dentries need RCU-delayed freeing commit `90bad5e05b` upstream. Since mountpoint crossing can happen without leaving lazy mode, root dentries do need the same protection against having their memory freed without RCU delay as everything else in the tree. It's partially hidden by RCU delay between detaching from the mount tree and dropping the vfsmount reference, but the starting point of pathwalk can be on an already detached mount, in which case umount-caused RCU delay has already passed by the time the lazy pathwalk grabs rcu_read_lock(). If the starting point happens to be at the root of that vfsmount and that vfsmount covers the entire filesystem, we get trouble. Fixes: `48a066e72d` ("RCU'd vfsmounts") Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:42 +02:00
Linus Torvalds	6bb53ee170	init: rename and re-order boot_cpu_state_init() commit `b5b1404d08` upstream. This is purely a preparatory patch for upcoming changes during the 4.19 merge window. We have a function called "boot_cpu_state_init()" that isn't really about the bootup cpu state: that is done much earlier by the similarly named "boot_cpu_init()" (note lack of "state" in name). This function initializes some hotplug CPU state, and needs to run after the percpu data has been properly initialized. It even has a comment to that effect. Except it _doesn't_ actually run after the percpu data has been properly initialized. On x86 it happens to do that, but on at least arm and arm64, the percpu base pointers are initialized by the arch-specific 'smp_prepare_boot_cpu()' hook, which ran _after_ boot_cpu_state_init(). This had some unexpected results, and in particular we have a patch pending for the merge window that did the obvious cleanup of using 'this_cpu_write()' in the cpu hotplug init code: - per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE; + this_cpu_write(cpuhp_state.state, CPUHP_ONLINE); which is obviously the right thing to do. Except because of the ordering issue, it actually failed miserably and unexpectedly on arm64. So this just fixes the ordering, and changes the name of the function to be 'boot_cpu_hotplug_init()' to make it obvious that it's about cpu hotplug state, because the core CPU state was supposed to have already been done earlier. Marked for stable, since the (not yet merged) patch that will show this problem is marked for stable. 
Reported-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Mian Yousaf Kaukab <yousaf.kaukab@suse.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:42 +02:00
Bart Van Assche	bcf447f808	scsi: sr: Avoid that opening a CD-ROM hangs with runtime power management enabled commit `1214fd7b49` upstream. Surround scsi_execute() calls with scsi_autopm_get_device() and scsi_autopm_put_device(). Note: removing sr_mutex protection from the scsi_cd_get() and scsi_cd_put() calls is safe because the purpose of sr_mutex is to serialize cdrom_*() calls. This patch avoids that complaints similar to the following appear in the kernel log if runtime power management is enabled: INFO: task systemd-udevd:650 blocked for more than 120 seconds. Not tainted 4.18.0-rc7-dbg+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. systemd-udevd D28176 650 513 0x00000104 Call Trace: __schedule+0x444/0xfe0 schedule+0x4e/0xe0 schedule_preempt_disabled+0x18/0x30 __mutex_lock+0x41c/0xc70 mutex_lock_nested+0x1b/0x20 __blkdev_get+0x106/0x970 blkdev_get+0x22c/0x5a0 blkdev_open+0xe9/0x100 do_dentry_open.isra.19+0x33e/0x570 vfs_open+0x7c/0xd0 path_openat+0x6e3/0x1120 do_filp_open+0x11c/0x1c0 do_sys_open+0x208/0x2d0 __x64_sys_openat+0x59/0x70 do_syscall_64+0x77/0x230 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: <stable@vger.kernel.org> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:42 +02:00
Hans de Goede	51b3938e39	ACPI / LPSS: Add missing prv_offset setting for byt/cht PWM devices commit `fdcb613d49` upstream. The LPSS PWM device on Bay Trail and Cherry Trail devices has a set of private registers at offset 0x800, the current lpss_device_desc for them already sets the LPSS_SAVE_CTX flag to have these saved/restored over device-suspend, but the current lpss_device_desc was not setting the prv_offset field, leading to the regular device registers getting saved/restored instead. This is causing the PWM controller to no longer work, resulting in a black screen, after a suspend/resume on systems where the firmware clears the APB clock and reset bits at offset 0x804. This commit fixes this by properly setting prv_offset to 0x800 for the PWM devices. Cc: stable@vger.kernel.org Fixes: `e1c7481797` ("ACPI / LPSS: Add Intel BayTrail ACPI mode PWM") Fixes: `1bfbd8eb8a` ("ACPI / LPSS: Add ACPI IDs for Intel Braswell") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Rafael J . Wysocki <rjw@rjwysocki.net> Signed-off-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:42 +02:00
Juergen Gross	af3bd8d6a9	xen/netfront: don't cache skb_shinfo() commit `d472b3a6cf` upstream. skb_shinfo() can change when calling __pskb_pull_tail(): Don't cache its return value. Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:42 +02:00
Linus Torvalds	fbf12e19c9	Mark HI and TASKLET softirq synchronous commit `3c53776e29` upstream. Way back in 4.9, we committed `4cd13c21b2` ("softirq: Let ksoftirqd do its job"), and ever since we've had small nagging issues with it. For example, we've had: `1ff688209e` ("watchdog: core: make sure the watchdog_worker is not deferred") `8d5755b3f7` ("watchdog: softdog: fire watchdog even if softirqs do not get to run") `217f697436` ("net: busy-poll: allow preemption in sk_busy_loop()") all of which worked around some of the effects of that commit. The DVB people have also complained that the commit causes excessive USB URB latencies, which seems to be due to the USB code using tasklets to schedule USB traffic. This seems to be an issue mainly when already living on the edge, but waiting for ksoftirqd to handle it really does seem to cause excessive latencies. Now Hanna Hawa reports that this issue isn't just limited to USB URB and DVB, but also causes timeout problems for the Marvell SoC team: "I'm facing kernel panic issue while running raid 5 on sata disks connected to Macchiatobin (Marvell community board with Armada-8040 SoC with 4 ARMv8 cores of CA72) Raid 5 built with Marvell DMA engine and async_tx mechanism (ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2) uses a tasklet to clean the done descriptors from the queue" The latency problem causes a panic: mv_xor_v2 f0400000.xor: dma_sync_wait: timeout! Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction We've discussed simply just reverting the original commit entirely, and also much more involved solutions (with per-softirq threads etc). This patch is intentionally stupid and fairly limited, because the issue still remains, and the other solutions either got sidetracked or had other issues. 
We should probably also consider the timer softirqs to be synchronous and not be delayed to ksoftirqd (since they were the issue with the earlier watchdog problems), but that should be done as a separate patch. This does only the tasklet cases. Reported-and-tested-by: Hanna Hawa <hannah@marvell.com> Reported-and-tested-by: Josef Griebichler <griebichler.josef@gmx.at> Reported-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:42 +02:00
Andrey Konovalov	50bed434ad	kasan: add no_sanitize attribute for clang builds commit `12c8f25a01` upstream. KASAN uses the __no_sanitize_address macro to disable instrumentation of particular functions. Right now it's defined only for GCC build, which causes false positives when clang is used. This patch adds a definition for clang. Note, that clang's revision 329612 or higher is required. [andreyknvl@google.com: remove redundant #ifdef CONFIG_KASAN check] Link: http://lkml.kernel.org/r/c79aa31a2a2790f6131ed607c58b0dd45dd62a6c.1523967959.git.andreyknvl@google.com Link: http://lkml.kernel.org/r/4ad725cc903f8534f8c8a60f0daade5e3d674f8d.1523554166.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Paul Lawrence <paullawrence@google.com> Cc: Sandipan Das <sandipan@linux.vnet.ibm.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Sodagudi Prasad <psodagud@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:42 +02:00
John David Anglin	2106b21a8a	parisc: Define mb() and add memory barriers to assembler unlock sequences commit `fedb8da963` upstream. For years I thought all parisc machines executed loads and stores in order. However, Jeff Law recently indicated on gcc-patches that this is not correct. There are various degrees of out-of-order execution all the way back to the PA7xxx processor series (hit-under-miss). The PA8xxx series has full out-of-order execution for both integer operations, and loads and stores. This is described in the following article: http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml For this reason, we need to define mb() and to insert a memory barrier before the store unlocking spinlocks. This ensures that all memory accesses are complete prior to unlocking. The ldcw instruction performs the same function on entry. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:41 +02:00
Helge Deller	5f394c9ef6	parisc: Enable CONFIG_MLONGCALLS by default commit `66509a276c` upstream. Enable the -mlong-calls compiler option by default, because otherwise in most cases linking the vmlinux binary fails due to truncations of R_PARISC_PCREL22F relocations. This fixes building the 64-bit defconfig. Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:41 +02:00
Tadeusz Struk	1d4167a818	tpm: fix race condition in tpm_common_write() commit `3ab2011ea3` upstream. There is a race condition in the tpm_common_write() function allowing two threads on the same /dev/tpm<N>, or two different applications on the same /dev/tpmrm<N>, to overwrite each other's commands/responses. Fixed this by taking the priv->buffer_mutex early in the function. Also converted the priv->data_pending from atomic to a regular size_t type. There is no need for it to be atomic since it is only touched under the protection of the priv->buffer_mutex. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:41 +02:00
Theodore Ts'o	954e572ae2	ext4: fix check to prevent initializing reserved inodes commit `5012284700` upstream. Commit `8844618d8a`: "ext4: only look at the bg_flags field if it is valid" will complain if block group zero does not have the EXT4_BG_INODE_ZEROED flag set. Unfortunately, this is not correct, since a freshly created file system has this flag cleared. It gets set almost immediately after the file system is mounted read-write --- but the following somewhat unlikely sequence will end up triggering a false positive report of a corrupted file system: mkfs.ext4 /dev/vdc mount -o ro /dev/vdc /vdc mount -o remount,rw /dev/vdc Instead, when initializing the inode table for block group zero, test to make sure that itable_unused count is not too large, since that is the case that will result in some or all of the reserved inodes getting cleared. This fixes the failures reported by Eric Whitney when running generic/230 and generic/231 in the nojournal test case. Fixes: `8844618d8a` ("ext4: only look at the bg_flags field if it is valid") Reported-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-15 18:14:41 +02:00
Greg Kroah-Hartman	8f21ecb424	Linux 4.9.119	2018-08-09 12:18:00 +02:00
Shankara Pailoor	240d46556d	jfs: Fix inconsistency between memory allocation and ea_buf->max_size commit `92d3413419` upstream. The code is assuming the buffer is max_size length, but we weren't allocating enough space for it. Signed-off-by: Shankara Pailoor <shankarapailoor@gmail.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:18:00 +02:00
Michael J. Ruhl	34a5bbbb6d	IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values commit `b697d7d8c7` upstream. The __get_txreq() function can return a pointer, ERR_PTR(-EBUSY), or NULL. All of the relevant call sites look for IS_ERR, so the NULL return would lead to a NULL pointer exception. Do not use the ERR_PTR mechanism for this function. Update all call sites to handle the return value correctly. Clean up error paths to reflect return value. Fixes: `45842abbb2` ("staging/rdma/hfi1: move txreq header code") Cc: <stable@vger.kernel.org> # 4.9.x+ Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Kamenee Arumugam <kamenee.arumugam@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:18:00 +02:00
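The hazard described in the hfi1 entry above, a function that can return a valid pointer, an ERR_PTR() value, or NULL while callers only test IS_ERR(), can be shown with a minimal userspace sketch. The ERR_PTR() and IS_ERR() helpers below are simplified re-creations of the kernel's pointer-encoded error convention, and caller_accepts() and caller_accepts_null_only() are hypothetical names used only for illustration; none of this is the driver's actual code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

void *ERR_PTR(long error)
{
	return (void *)error;
}

int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* A caller that only checks IS_ERR(): NULL slips through as "usable". */
int caller_accepts(const void *ptr)
{
	return !IS_ERR(ptr);
}

/* Single convention (NULL means failure): one check covers every failure. */
int caller_accepts_null_only(const void *ptr)
{
	return ptr != NULL;
}
```

With mixed return values, caller_accepts(NULL) reports the pointer as usable, which is the path that would lead to a NULL dereference; settling on one failure convention removes that gap.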
Kees Cook	6a19e26f11	fork: unconditionally clear stack on fork commit `e01e80634e` upstream. One of the classes of kernel stack content leaks[1] is exposing the contents of prior heap or stack contents when a new process stack is allocated. Normally, those stacks are not zeroed, and the old contents remain in place. In the face of stack content exposure flaws, those contents can leak to userspace. Fixing this will make the kernel no longer vulnerable to these flaws, as the stack will be wiped each time a stack is assigned to a new process. There's not a meaningful change in runtime performance; it almost looks like it provides a benefit. Performing back-to-back kernel builds before: Run times: 157.86 157.09 158.90 160.94 160.80 Mean: 159.12 Std Dev: 1.54 and after: Run times: 159.31 157.34 156.71 158.15 160.81 Mean: 158.46 Std Dev: 1.46 Instead of making this a build or runtime config, Andy Lutomirski recommended this just be enabled by default. [1] A noisy search for many kinds of stack content leaks can be seen here: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux+kernel+stack+leak I did some more with perf and cycle counts on running 100,000 execs of /bin/true. before: Cycles: 218858861551 218853036130 214727610969 227656844122 224980542841 Mean: 221015379122.60 Std Dev: 4662486552.47 after: Cycles: 213868945060 213119275204 211820169456 224426673259 225489986348 Mean: 217745009865.40 Std Dev: 5935559279.99 It continues to look like it's faster, though the deviation is rather wide, but I'm not sure what I could do that would be less noisy. I'm open to ideas! 
Link: http://lkml.kernel.org/r/20180221021659.GA37073@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [ Srivatsa: Backported to 4.9.y ] Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu> Reviewed-by: Srinidhi Rao <srinidhir@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:18:00 +02:00
Konstantin Khlebnikov	885b49b4f3	kmemleak: clear stale pointers from task stacks commit `ca18255185` upstream. Kmemleak considers any pointers on task stacks as references. This patch clears newly allocated and reused vmap stacks. Link: http://lkml.kernel.org/r/150728990124.744199.8403409836394318684.stgit@buzz Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [ Srivatsa: Backported to 4.9.y ] Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:18:00 +02:00
Eric Dumazet	36ee106e84	tcp: add tcp_ooo_try_coalesce() helper commit `58152ecbbc` upstream. In case skb in out_of_order_queue is the result of multiple skbs coalescing, we would like to get a proper gso_segs counter tracking, so that future tcp_drop() can report an accurate number. I chose to not implement this tracking for skbs in receive queue, since they are not dropped, unless socket is disconnected. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:18:00 +02:00
Filipe Manana	b2486a81f6	Btrfs: fix file data corruption after cloning a range and fsync commit `bd3599a0e1` upstream. When we clone a range into a file we can end up dropping existing extent maps (or trimming them) and replacing them with new ones if the range to be cloned overlaps with a range in the destination inode. When that happens we add the new extent maps to the list of modified extents in the inode's extent map tree, so that a "fast" fsync (the flag BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps and log corresponding extent items. However, at the end of range cloning operation we do truncate all the pages in the affected range (in order to ensure future reads will not get stale data). Sometimes this truncation will release the corresponding extent maps besides the pages from the page cache. If this happens, then a "fast" fsync operation will miss logging some extent items, because it relies exclusively on the extent maps being present in the inode's extent tree, leading to data loss/corruption if the fsync ends up using the same transaction used by the clone operation (that transaction was not committed in the meanwhile). An extent map is released through the callback btrfs_invalidatepage(), which gets called by truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The latter ends up calling try_release_extent_mapping() which will release the extent map if some conditions are met, like the file size being greater than 16Mb, gfp flags allowing blocking, the range not being locked (which is the case during the clone operation) and the extent map not being flagged as pinned (also the case for cloning). 
The following example, turned into a test for fstests, reproduces the issue: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo $ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar $ xfs_io -c "fsync" /mnt/bar # reflink destination offset corresponds to the size of file bar, # 2728Kb minus 4Kb. $ xfs_io -c "reflink ${SCRATCH_MNT}/foo 0 2724K 15908K" /mnt/bar $ xfs_io -c "fsync" /mnt/bar $ md5sum /mnt/bar 95a95813a8c2abc9aa75a6c2914a077e /mnt/bar <power fail> $ mount /dev/sdb /mnt $ md5sum /mnt/bar 207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar # digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the # power failure In the above example, the destination offset of the clone operation corresponds to the size of the "bar" file minus 4Kb. So during the clone operation, the extent map covering the range from 2572Kb to 2728Kb gets trimmed so that it ends at offset 2724Kb, and a new extent map covering the range from 2724Kb to 11724Kb is created. So at the end of the clone operation when we ask to truncate the pages in the range from 2724Kb to 2724Kb + 15908Kb, the page invalidation callback ends up removing the new extent map (through try_release_extent_mapping()) when the page at offset 2724Kb is passed to that callback. Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent map is removed at try_release_extent_mapping(), forcing the next fsync to search for modified extents in the fs/subvolume tree instead of relying on the presence of extent maps in memory. This way we can continue doing a "fast" fsync if the destination range of a clone operation does not overlap with an existing range or if any of the criteria necessary to remove an extent map at try_release_extent_mapping() is not met (file size not bigger than 16Mb or gfp flags do not allow blocking). 
CC: stable@vger.kernel.org # 3.16+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:18:00 +02:00
Esben Haabendal	7f8d5ff5ea	i2c: imx: Fix reinit_completion() use commit `9f9e3e0d4d` upstream. Make sure to call reinit_completion() before dma is started to avoid race condition where reinit_completion() is called after complete() and before wait_for_completion_timeout(). Signed-off-by: Esben Haabendal <eha@deif.com> Fixes: `ce1a78840f` ("i2c: imx: add DMA support for freescale i2c driver") Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:18:00 +02:00
Masami Hiramatsu	a26030a63e	ring_buffer: tracing: Inherit the tracing setting to next ring buffer commit `73c8d89455` upstream. Maintain the tracing on/off setting of the ring_buffer when switching to the trace buffer snapshot. Taking a snapshot is done by swapping the backup ring buffer (max_tr_buffer). But since the tracing on/off setting is defined by the ring buffer, when swapping it, the tracing on/off setting can also be changed. This causes a strange result like below: /sys/kernel/debug/tracing # cat tracing_on 1 /sys/kernel/debug/tracing # echo 0 > tracing_on /sys/kernel/debug/tracing # cat tracing_on 0 /sys/kernel/debug/tracing # echo 1 > snapshot /sys/kernel/debug/tracing # cat tracing_on 1 /sys/kernel/debug/tracing # echo 1 > snapshot /sys/kernel/debug/tracing # cat tracing_on 0 We don't touch tracing_on, but snapshot changes tracing_on setting each time. This is an anomaly, because user doesn't know that each "ring_buffer" stores its own tracing-enable state and the snapshot is done by swapping ring buffers. Link: http://lkml.kernel.org/r/153149929558.11274.11730609978254724394.stgit@devbox Cc: Ingo Molnar <mingo@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Hiraku Toyooka <hiraku.toyooka@cybertrust.co.jp> Cc: stable@vger.kernel.org Fixes: `debdd57f51` ("tracing: Make a snapshot feature available from userspace") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> [ Updated commit log and comment in the code ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:17:59 +02:00
Vitaly Kuznetsov	b209a097ca	ACPI / PCI: Bail early in acpi_pci_add_bus() if there is no ACPI handle commit `a0040c0145` upstream. Hyper-V instances support PCI pass-through which is implemented through PV pci-hyperv driver. When a device is passed through, a new root PCI bus is created in the guest. The bus sits on top of VMBus and has no associated information in ACPI. acpi_pci_add_bus() in this case proceeds all the way to acpi_evaluate_dsm(), which reports ACPI: \: failed to evaluate _DSM (0x1001) While acpi_pci_slot_enumerate() and acpiphp_enumerate_slots() are protected against ACPI_HANDLE() being NULL and do nothing, acpi_evaluate_dsm() is not and gives us the error. It seems the correct fix is to not do anything in acpi_pci_add_bus() in such cases. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Sinan Kaya <okaya@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:17:59 +02:00
Theodore Ts'o	9bf8d5bf50	ext4: fix false negatives and false positives in ext4_check_descriptors() commit `44de022c43` upstream. ext4_check_descriptors() was getting called before s_gdb_count was initialized. So for file systems w/o the meta_bg feature, allocation bitmaps could overlap the block group descriptors and ext4 wouldn't notice. For file systems with the meta_bg feature enabled, there was a fencepost error which would cause ext4_check_descriptors() to incorrectly believe that the block allocation bitmap overlaps with the block group descriptor blocks, and it would reject the mount. Fix both of these problems. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Benjamin Gilbert <bgilbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:17:59 +02:00
Dmitry Safonov	c68c772262	netlink: Don't shift on 64 for ngroups commit `91874ecf32` upstream. It's legal to have 64 groups for netlink_sock. As user-supplied nladdr->nl_groups is __u32, it's possible to subscribe only to first 32 groups. The check for correctness of .bind() userspace supplied parameter is done by applying mask made from ngroups shift. Which broke Android as they have 64 groups and the shift for mask resulted in an overflow. Fixes: `61f4b23769` ("netlink: Don't shift with UB on nlk->ngroups") Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Reported-and-Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:17:59 +02:00
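The masking problem in the netlink entry above comes down to a C rule: shifting a value by a count equal to or larger than its width is undefined behaviour. The sketch below is a hedged illustration of the guarded form, not the kernel's code; clamp_groups() is a hypothetical helper, and the 32-bit mask type is chosen to match the `__u32` nl_groups field mentioned in the message.

```c
#include <stdint.h>

/*
 * Hypothetical helper: keep only the bits of a user-supplied 32-bit group
 * mask that refer to existing groups (ngroups of them).
 */
uint32_t clamp_groups(uint32_t groups, unsigned int ngroups)
{
	if (ngroups == 0)
		return 0;                       /* no groups exist at all */

	/* Only build a mask when the shift count is below the type width. */
	if (ngroups < 8 * sizeof(groups))
		groups &= (UINT32_C(1) << ngroups) - 1;

	/* ngroups >= 32 (e.g. 64): every bit of the mask is a valid group. */
	return groups;
}
```

For example, clamp_groups(0xFFFFFFFF, 4) keeps only the low four bits, while a socket with 64 groups leaves the mask untouched instead of shifting by 64.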
Dmitry Safonov	4d502572ea	netlink: Don't shift with UB on nlk->ngroups [ Upstream commit `61f4b23769` ] On i386 nlk->ngroups might be 32 or 0. Which leads to UB, resulting in hang during boot. Check for 0 ngroups and use (unsigned long long) as a type to shift. Fixes: `7acf9d4237` ("netlink: Do not subscribe to non-existent groups"). Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:17:59 +02:00
Dmitry Safonov	4f08437d6c	netlink: Do not subscribe to non-existent groups [ Upstream commit `7acf9d4237` ] Make ABI more strict about subscribing to group > ngroups. Code doesn't check for that and it looks bogus. (one can subscribe to non-existing group) Still, it's possible to bind() to all possible groups with (-1) Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: netdev@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:17:58 +02:00
Anna-Maria Gleixner	f4a9db57e7	nohz: Fix local_timer_softirq_pending() commit `80d20d35af` upstream. local_timer_softirq_pending() checks whether the timer softirq is pending with: local_softirq_pending() & TIMER_SOFTIRQ. This is wrong because TIMER_SOFTIRQ is the softirq number and not a bitmask. So the test checks for the wrong bit. Use BIT(TIMER_SOFTIRQ) instead. Fixes: `5d62c183f9` ("nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()") Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Cc: bigeasy@linutronix.de Cc: peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180731161358.29472-1-anna-maria@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:17:57 +02:00
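The difference between a softirq number and a softirq bitmask in the nohz fix can be shown with a small Python sketch. It assumes the kernel's enumeration order, where HI_SOFTIRQ is 0 and TIMER_SOFTIRQ is 1; the helper names are hypothetical and only model the two expressions quoted in the commit message.

```python
# Assumption: softirq numbers as the kernel enumerates them (HI first, then TIMER).
HI_SOFTIRQ = 0
TIMER_SOFTIRQ = 1


def bit(nr: int) -> int:
    """Model of the kernel's BIT() macro: a mask with only bit `nr` set."""
    return 1 << nr


def timer_pending_buggy(pending: int) -> bool:
    # Uses the softirq *number* as a mask, so it actually tests bit 0.
    return bool(pending & TIMER_SOFTIRQ)


def timer_pending_fixed(pending: int) -> bool:
    # Uses BIT(TIMER_SOFTIRQ), so it tests bit 1 as intended.
    return bool(pending & bit(TIMER_SOFTIRQ))
```

With only the HI softirq pending, the buggy check reports a pending timer softirq; with only the timer softirq pending, it reports none.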
Thomas Gleixner	eecd08afb0	genirq: Make force irq threading setup more robust commit `d1f0301b33` upstream. The support of force threading interrupts which are set up with both a primary and a threaded handler broke the setup of regular requested threaded interrupts (primary handler == NULL). The reason is that it does not check whether the primary handler is set to the default handler which wakes the handler thread. Instead it replaces the thread handler with the primary handler as it would do with force threaded interrupts which have been requested via request_irq(). So both the primary and the thread handler become the same, which then triggers the warning that the thread handler tries to wake up a not configured secondary thread. Fortunately this only happens when the driver omits the IRQF_ONESHOT flag when requesting the threaded interrupt, which is normally caught by the sanity checks when force irq threading is disabled. Fix it by skipping the force threading setup when a regular threaded interrupt is requested. As a consequence the interrupt request which lacks the IRQF_ONESHOT flag is rejected correctly instead of being silently broken. Fixes: `2a1d3ab898` ("genirq: Handle force threading of irqs with primary and thread handler") Reported-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:17:57 +02:00
Anil Gurumurthy	24b79a95b2	scsi: qla2xxx: Return error when TMF returns commit `b4146c4929` upstream. Propagate the task management completion status properly to avoid unnecessary waits for commands to complete. Fixes: `faef62d134` ("[SCSI] qla2xxx: Fix Task Management command asynchronous handling") Cc: <stable@vger.kernel.org> Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:17:57 +02:00
Quinn Tran	f71d13c397	scsi: qla2xxx: Fix ISP recovery on unload commit `b08abbd9f5` upstream. During the unload process, the chip can encounter a problem where a FW dump would be captured. For this case, the full reset sequence will be skipped to bring the chip back to full operational state. Fixes: `e315cd28b9` ("[SCSI] qla2xxx: Code changes for qla data structure refactoring") Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-09 12:17:57 +02:00
Greg Kroah-Hartman	e01202b36f	Linux 4.9.118	2018-08-06 16:23:04 +02:00
Tony Battersby	0ff94fb99e	scsi: sg: fix minor memory leak in error path commit `c170e5a8d2` upstream. Fix a minor memory leak when there is an error opening a /dev/sg device. Fixes: `cc833acbee` ("sg: O_EXCL and other lock handling") Cc: <stable@vger.kernel.org> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:04 +02:00
Boris Brezillon	e79a2db21e	drm/vc4: Reset ->{x, y}_scaling[1] when dealing with uniplanar formats commit `a6a00918d4` upstream. This is needed to ensure ->is_unity is correct when the plane was previously configured to output a multi-planar format with scaling enabled, and is then being reconfigured to output a uniplanar format. Fixes: `fc04023faf` ("drm/vc4: Add support for YUV planes.") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Reviewed-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20180724133601.32114-1-boris.brezillon@bootlin.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:04 +02:00
Herbert Xu	804f510bf2	crypto: padlock-aes - Fix Nano workaround data corruption commit `46d8c4b286` upstream. This was detected by the self-test thanks to Ard's chunking patch. I finally got around to testing this out on my ancient Via box. It turns out that the workaround got the assembly wrong and we end up doing count + initial cycles of the loop instead of just count. This obviously causes corruption, either by overwriting the source that is yet to be processed, or writing over the end of the buffer. On CPUs that don't require the workaround only ECB is affected. On Nano CPUs both ECB and CBC are affected. This patch fixes it by doing the subtraction prior to the assembly. Fixes: `a76c1c23d0` ("crypto: padlock-aes - work around Nano CPU...") Cc: <stable@vger.kernel.org> Reported-by: Jamie Heilman <jamie@audible.transient.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:04 +02:00
Roman Kagan	020a90f653	kvm: x86: vmx: fix vpid leak commit `63aff65573` upstream. VPID for the nested vcpu is allocated at vmx_create_vcpu whenever nested vmx is turned on with the module parameter. However, it's only freed if the L1 guest has executed VMXON which is not a given. As a result, on a system with nested==on every creation+deletion of an L1 vcpu without running an L2 guest results in leaking one vpid. Since the total number of vpids is limited to 64k, they can eventually get exhausted, preventing L2 from starting. Delay allocation of the L2 vpid until VMXON emulation, thus matching its freeing. Fixes: `5c614b3583` Cc: stable@vger.kernel.org Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:04 +02:00
Jiang Biao	1d43314459	virtio_balloon: fix another race between migration and ballooning commit `89da619bc1` upstream. The kernel panics under high memory pressure; the call trace looks like: PID: 21439 TASK: ffff881be3afedd0 CPU: 16 COMMAND: "java" #0 [ffff881ec7ed7630] machine_kexec at ffffffff81059beb #1 [ffff881ec7ed7690] __crash_kexec at ffffffff81105942 #2 [ffff881ec7ed7760] crash_kexec at ffffffff81105a30 #3 [ffff881ec7ed7778] oops_end at ffffffff816902c8 #4 [ffff881ec7ed77a0] no_context at ffffffff8167ff46 #5 [ffff881ec7ed77f0] __bad_area_nosemaphore at ffffffff8167ffdc #6 [ffff881ec7ed7838] __node_set at ffffffff81680300 #7 [ffff881ec7ed7860] __do_page_fault at ffffffff8169320f #8 [ffff881ec7ed78c0] do_page_fault at ffffffff816932b5 #9 [ffff881ec7ed78f0] page_fault at ffffffff8168f4c8 [exception RIP: _raw_spin_lock_irqsave+47] RIP: ffffffff8168edef RSP: ffff881ec7ed79a8 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffffea0019740d00 RCX: ffff881ec7ed7fd8 RDX: 0000000000020000 RSI: 0000000000000016 RDI: 0000000000000008 RBP: ffff881ec7ed79a8 R8: 0000000000000246 R9: 000000000001a098 R10: ffff88107ffda000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000008 R14: ffff881ec7ed7a80 R15: ffff881be3afedd0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 It happens in the pagefault handler and results in a double pagefault while compacting pages when memory allocation fails. Analysis of the vmcore shows that the page which leads to the second pagefault is corrupted with _mapcount=-256, but private=0. It's caused by the race between migration and ballooning, and a missing lock in virtballoon_migratepage() of the virtio_balloon driver. This patch fixes the bug. Fixes: `e22504296d` ("virtio_balloon: introduce migration primitives to balloon pages") Cc: stable@vger.kernel.org Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn> Signed-off-by: Huang Chong <huang.chong@zte.com.cn> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:04 +02:00
Jeremy Cline	9a492f8c71	net: socket: fix potential spectre v1 gadget in socketcall commit `c8e8cd579b` upstream. 'call' is a user-controlled value, so sanitize the array index after the bounds check to avoid speculating past the bounds of the 'nargs' array. Found with the help of Smatch: net/socket.c:2508 __do_sys_socketcall() warn: potential spectre issue 'nargs' [r] (local cap) Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Jeremy Cline <jcline@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:04 +02:00
Anton Vasilyev	18d971807d	can: ems_usb: Fix memory leak on ems_usb_disconnect() commit `72c05f32f4` upstream. ems_usb_probe() allocates memory for dev->tx_msg_buffer, but ems_usb_disconnect() never frees it. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:03 +02:00
Linus Torvalds	52cd8f3790	squashfs: more metadata hardenings commit `71755ee535` upstream. The squashfs fragment reading code doesn't actually verify that the fragment is inside the fragment table. The end result _is_ verified to be inside the image when actually reading the fragment data, but before that is done, we may end up taking a page fault because the fragment table itself might not even exist. Another report from Anatoly and his endless squashfs image fuzzing. Reported-by: Анатолий Тросиненко <anatoly.trosinenko@gmail.com> Acked-by: Phillip Lougher <phillip.lougher@gmail.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:03 +02:00
Linus Torvalds	3abef06039	squashfs: more metadata hardening commit `d512584780` upstream. Anatoly reports another squashfs fuzzing issue, where the decompression parameters themselves are in a compressed block. This causes squashfs_read_data() to be called in order to read the decompression options before the decompression stream having been set up, making squashfs go sideways. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Acked-by: Phillip Lougher <phillip.lougher@gmail.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:03 +02:00
Jose Abreu	c9bd4fd4b7	net: stmmac: Fix WoL for PCI-based setups [ Upstream commit `b7d0f08e91` ] WoL won't work in PCI-based setups because we are not saving the PCI EP state before entering suspend state and not allowing D3 wake. Fix this by using a wrapper around stmmac_{suspend/resume} which correctly sets the PCI EP state. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:03 +02:00
Jeremy Cline	67f0a2887b	netlink: Fix spectre v1 gadget in netlink_create() [ Upstream commit `bc5b6c0b62` ] 'protocol' is a user-controlled value, so sanitize it after the bounds check to avoid using it for speculative out-of-bounds access to arrays indexed by it. This addresses the following accesses detected with the help of smatch: * net/netlink/af_netlink.c:654 __netlink_create() warn: potential spectre issue 'nlk_cb_mutex_keys' [w] * net/netlink/af_netlink.c:654 __netlink_create() warn: potential spectre issue 'nlk_cb_mutex_key_strings' [w] * net/netlink/af_netlink.c:685 netlink_create() warn: potential spectre issue 'nl_table' [w] (local cap) Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Jeremy Cline <jcline@redhat.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:03 +02:00
Florian Fainelli	ab9a0f80bc	net: dsa: Do not suspend/resume closed slave_dev [ Upstream commit `a94c689e6c` ] If a DSA slave network device was previously disabled, there is no need to suspend or resume it. Fixes: `2446254915` ("net: dsa: allow switch drivers to implement suspend/resume hooks") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:03 +02:00
Eric Dumazet	d59dcdf13e	ipv4: frags: handle possible skb truesize change [ Upstream commit `4672694bd4` ] ip_frag_queue() might call pskb_pull() on one skb that is already in the fragment queue. We need to take care of possible truesize change, or we might have an imbalance of the netns frags memory usage. IPv6 is immune to this bug, because RFC5722, Section 4, amended by Errata ID 3089 states : When reassembling an IPv6 datagram, if one or more its constituent fragments is determined to be an overlapping fragment, the entire datagram (and any constituent fragments) MUST be silently discarded. Fixes: `158f323b98` ("net: adjust skb->truesize in pskb_expand_head()") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:03 +02:00
Eric Dumazet	c5282a032f	inet: frag: enforce memory limits earlier [ Upstream commit `56e2c94f05` ] We currently check current frags memory usage only when a new frag queue is created. This allows attackers to first consume the memory budget (default : 4 MB) creating thousands of frag queues, then sending tiny skbs to exceed high_thresh limit by 2 to 3 orders of magnitude. Note that before commit `648700f76b` ("inet: frags: use rhashtables for reassembly units"), work queue could be starved under DOS, getting no cpu cycles. After commit `648700f76b`, only the per frag queue timer can eventually remove an incomplete frag queue and its skbs. Fixes: `b13d3cbfb8` ("inet: frag: move eviction of queues to work queue") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Jann Horn <jannh@google.com> Cc: Florian Westphal <fw@strlen.de> Cc: Peter Oskolkov <posk@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:03 +02:00
Eric Dumazet	7142fdb6a9	bonding: avoid lockdep confusion in bond_get_stats() [ Upstream commit `7e2556e400` ] syzbot found that the following sequence produces a LOCKDEP splat [1] ip link add bond10 type bond ip link add bond11 type bond ip link set bond11 master bond10 To fix this, we can use the already provided nest_level. This patch also provides correct nesting for dev->addr_list_lock [1] WARNING: possible recursive locking detected 4.18.0-rc6+ #167 Not tainted -------------------------------------------- syz-executor751/4439 is trying to acquire lock: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline] (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426 but task is already holding lock: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline] (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&bond->stats_lock)->rlock); lock(&(&bond->stats_lock)->rlock); * DEADLOCK * May be due to missing lock nesting notation 3 locks held by syz-executor751/4439: #0: (____ptrval____) (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77 #1: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline] #1: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426 #2: (____ptrval____) (rcu_read_lock){....}, at: bond_get_stats+0x0/0x560 include/linux/compiler.h:215 stack backtrace: CPU: 0 PID: 4439 Comm: syz-executor751 Not tainted 4.18.0-rc6+ #167 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 print_deadlock_bug kernel/locking/lockdep.c:1765 [inline] check_deadlock kernel/locking/lockdep.c:1809 [inline] validate_chain kernel/locking/lockdep.c:2405 [inline] __lock_acquire.cold.64+0x1fb/0x486 kernel/locking/lockdep.c:3435 lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:144 spin_lock include/linux/spinlock.h:310 [inline] bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426 dev_get_stats+0x10f/0x470 net/core/dev.c:8316 bond_get_stats+0x232/0x560 drivers/net/bonding/bond_main.c:3432 dev_get_stats+0x10f/0x470 net/core/dev.c:8316 rtnl_fill_stats+0x4d/0xac0 net/core/rtnetlink.c:1169 rtnl_fill_ifinfo+0x1aa6/0x3fb0 net/core/rtnetlink.c:1611 rtmsg_ifinfo_build_skb+0xc8/0x190 net/core/rtnetlink.c:3268 rtmsg_ifinfo_event.part.30+0x45/0xe0 net/core/rtnetlink.c:3300 rtmsg_ifinfo_event net/core/rtnetlink.c:3297 [inline] rtnetlink_event+0x144/0x170 net/core/rtnetlink.c:4716 notifier_call_chain+0x180/0x390 kernel/notifier.c:93 __raw_notifier_call_chain kernel/notifier.c:394 [inline] raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401 call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1735 call_netdevice_notifiers net/core/dev.c:1753 [inline] netdev_features_change net/core/dev.c:1321 [inline] netdev_change_features+0xb3/0x110 net/core/dev.c:7759 bond_compute_features.isra.47+0x585/0xa50 drivers/net/bonding/bond_main.c:1120 bond_enslave+0x1b25/0x5da0 drivers/net/bonding/bond_main.c:1755 bond_do_ioctl+0x7cb/0xae0 drivers/net/bonding/bond_main.c:3528 dev_ifsioc+0x43c/0xb30 net/core/dev_ioctl.c:327 dev_ioctl+0x1b5/0xcc0 net/core/dev_ioctl.c:493 sock_do_ioctl+0x1d3/0x3e0 net/socket.c:992 sock_ioctl+0x30d/0x680 net/socket.c:1093 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1de/0x1720 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 __do_sys_ioctl fs/ioctl.c:708 [inline] __se_sys_ioctl fs/ioctl.c:706 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x440859 Code: e8 2c af 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffc51a92878 EFLAGS: 00000213 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440859 RDX: 0000000020000040 RSI: 0000000000008990 RDI: 0000000000000003 RBP: 0000000000000000 R08: 00000000004002c8 R09: 00000000004002c8 R10: 00000000022d5880 R11: 0000000000000213 R12: 0000000000007390 R13: 0000000000401db0 R14: 0000000000000000 R15: 0000000000000000 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:03 +02:00
Boqun Feng	047f9d6a56	sched/wait: Remove the lockless swait_active() check in swake_up() commit `35a2897c2a` upstream. Steven Rostedt reported a potential race in RCU core because of swake_up(): CPU0 CPU1 ---- ---- __call_rcu_core() { spin_lock(rnp_root) need_wake = __rcu_start_gp() { rcu_start_gp_advanced() { gp_flags = FLAG_INIT } } rcu_gp_kthread() { swait_event_interruptible(wq, gp_flags & FLAG_INIT) { spin_lock(q->lock) fetch wq->task_list here! * list_add(wq->task_list, q->task_list) spin_unlock(q->lock); fetch old value of gp_flags here spin_unlock(rnp_root) rcu_gp_kthread_wake() { swake_up(wq) { swait_active(wq) { list_empty(wq->task_list) } * return false * if (condition) * false * schedule(); In this case, a wakeup is missed, which could cause the rcu_gp_kthread to wait for a long time. The reason for this is that we do a lockless swait_active() check in swake_up(). To fix this, we can either 1) add a smp_mb() in swake_up() before swait_active() to provide the proper order or 2) simply remove the swait_active() in swake_up(). Solution 2 not only fixes this problem but also keeps the swait and wait API as close as possible, as wake_up() doesn't provide a full barrier and doesn't do a lockless check of the wait queue either. Moreover, there are users already using swait_active() to do their quick checks for the wait queues, so it makes less sense that swake_up() and swake_up_all() do this on their own. This patch then removes the lockless swait_active() check in swake_up() and swake_up_all(). Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Krister Johansen <kjlx@templeofstupid.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170615041828.zk3a3sfyudm5p6nl@tardis Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Chen <david.chen@nutanix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:02 +02:00
Andy Shevchenko	d4c9c7c1ee	pinctrl: intel: Read back TX buffer state commit `d68b42e30b` upstream. In the same way as it's done in pinctrl-cherryview.c we would provide a readback TX buffer state. Fixes: `17fab47369` ("pinctrl: intel: Set pin direction properly") Reported-by: "Bourque, Francis" <francis.bourque@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: "Bourque, Francis" <francis.bourque@intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Cc: Anthony de Boer <adb@adb.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:02 +02:00
Eric Dumazet	019ea5193f	tcp: add one more quick ack after after ECN events [ Upstream commit `15ecbe94a4` ] Larry Brakmo proposal ( https://patchwork.ozlabs.org/patch/935233/ tcp: force cwnd at least 2 in tcp_cwnd_reduction) made us rethink about our recent patch removing ~16 quick acks after ECN events. tcp_enter_quickack_mode(sk, 1) makes sure one immediate ack is sent, but in the case the sender cwnd was lowered to 1, we do not want to have a delayed ack for the next packet we will receive. Fixes: `522040ea5f` ("tcp: do not aggressively quick ack after ECN events") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Neal Cardwell <ncardwell@google.com> Cc: Lawrence Brakmo <brakmo@fb.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:02 +02:00
Yousuk Seung	095ab5f46c	tcp: refactor tcp_ecn_check_ce to remove sk type cast [ Upstream commit `f4c9f85f3b` ] Refactor tcp_ecn_check_ce and __tcp_ecn_check_ce to accept struct sock* instead of tcp_sock* to clean up type casts. This is a pure refactor patch. Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:02 +02:00
Eric Dumazet	65d986cb5e	tcp: do not aggressively quick ack after ECN events [ Upstream commit `522040ea5f` ] ECN signals currently forces TCP to enter quickack mode for up to 16 (TCP_MAX_QUICKACKS) following incoming packets. We believe this is not needed, and only sending one immediate ack for the current packet should be enough. This should reduce the extra load noticed in DCTCP environments, after congestion events. This is part 2 of our effort to reduce pure ACK packets. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:02 +02:00
Eric Dumazet	90cf17d665	tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode [ Upstream commit `9a9c9b51e5` ] We want to add finer control of the number of ACK packets sent after ECN events. This patch is not changing current behavior, it only enables following change. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:02 +02:00
Eric Dumazet	8ca41e4efc	tcp: do not force quickack when receiving out-of-order packets [ Upstream commit `a3893637e1` ] As explained in commit `9f9843a751` ("tcp: properly handle stretch acks in slow start"), TCP stacks have to consider how many packets are acknowledged in one single ACK, because of GRO, but also because of ACK compression or losses. We plan to add SACK compression in the following patch, we must therefore not call tcp_enter_quickack_mode() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:02 +02:00
Xiao Liang	b03ca669d5	xen-netfront: wait xenbus state change when load module manually [ Upstream commit `822fb18a82` ] When loading the module manually, after calling xenbus_switch_state() to initialize the state of the netfront device, the driver state does not change quickly enough, which may lead to no device being created on recent kernels. This patch adds a wait to make sure xenbus knows the driver is not in closed/unknown state. Current state: [vm]# ethtool eth0 Settings for eth0: Link detected: yes [vm]# modprobe -r xen_netfront [vm]# modprobe xen_netfront [vm]# ethtool eth0 Settings for eth0: Cannot get device settings: No such device Cannot get wake-on-lan settings: No such device Cannot get message level: No such device Cannot get link status: No such device No data available With the patch installed. [vm]# ethtool eth0 Settings for eth0: Link detected: yes [vm]# modprobe -r xen_netfront [vm]# modprobe xen_netfront [vm]# ethtool eth0 Settings for eth0: Link detected: yes Signed-off-by: Xiao Liang <xiliang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:02 +02:00
Neal Cardwell	b3e349fd55	tcp_bbr: fix bw probing to raise in-flight data for very small BDPs [ Upstream commit `383d470936` ] For some very small BDPs (with just a few packets) there was a quantization effect where the target number of packets in flight during the super-unity-gain (1.25x) phase of gain cycling was implicitly truncated to a number of packets no larger than the normal unity-gain (1.0x) phase of gain cycling. This meant that in multi-flow scenarios some flows could get stuck with a lower bandwidth, because they did not push enough packets inflight to discover that there was more bandwidth available. This was really only an issue in multi-flow LAN scenarios, where RTTs and BDPs are low enough for this to be an issue. This fix ensures that gain cycling can raise inflight for small BDPs by ensuring that in PROBE_BW mode target inflight values with a super-unity gain are always greater than inflight values with a gain <= 1. Importantly, this applies whether the inflight value is calculated for use as a cwnd value, or as a target inflight value for the end of the super-unity phase in bbr_is_next_cycle_phase() (both need to be bigger to ensure we can probe with more packets in flight reliably). This is a candidate fix for stable releases. Fixes: `0f8782ea14` ("tcp_bbr: add BBR congestion control") Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Priyaranjan Jha <priyarjha@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:01 +02:00
Eugeniy Paltsev	f6488f40a8	NET: stmmac: align DMA stuff to largest cache line length [ Upstream commit `9939a46d90` ] As of today the STMMAC_ALIGN macro (which is used to align DMA stuff) relies on the L1 line length (L1_CACHE_BYTES). This isn't correct on a system with several cache levels, where the L1 cache line may be shorter than the L2 line. This can lead to a DMA buffer sharing one cache line with other data, so we can lose that data when invalidating the DMA buffer before a DMA transaction. Fix that by using SMP_CACHE_BYTES instead of L1_CACHE_BYTES for aligning. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:01 +02:00
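The cache-line sharing problem described for STMMAC_ALIGN can be illustrated with a short Python sketch of power-of-two alignment. The line sizes (32-byte L1, 64-byte outer cache) and the buffer length are hypothetical values chosen for the example, the helper names are not kernel identifiers, and the buffer is assumed to start on a cache-line boundary.

```python
def align_up(size: int, alignment: int) -> int:
    """Round `size` up to a multiple of `alignment` (must be a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return (size + alignment - 1) & ~(alignment - 1)


def shares_line(buf_len: int, alignment: int, line: int) -> bool:
    """True if a buffer padded to `alignment` ends in the middle of a
    `line`-byte cache line, leaving room for unrelated data in that line."""
    return align_up(buf_len, alignment) % line != 0


# Hypothetical sizes: the L1 line is shorter than the largest cache line.
L1_LINE = 32
LARGEST_LINE = 64
```

With a 96-byte buffer, padding to the 32-byte L1 line leaves the buffer at 96 bytes, which ends halfway through a 64-byte line; padding to the largest line size yields 128 bytes and the buffer owns its lines outright.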
Anton Vasilyev	32363930df	net: mdio-mux: bcm-iproc: fix wrong getter and setter pair [ Upstream commit `b0753408aa` ] mdio_mux_iproc_probe() uses platform_set_drvdata() to store the md pointer in the device, whereas mdio_mux_iproc_remove() restores the md pointer with dev_get_platdata(&pdev->dev). This leads to the wrong resources being released. The patch replaces the getter with platform_get_drvdata(). Fixes: `98bc865a1e` ("net: mdio-mux: Add MDIO mux driver for iProc SoCs") Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:01 +02:00
Stefan Wahren	a9deaa1971	net: lan78xx: fix rx handling before first packet is send [ Upstream commit `136f55f660` ] As long as the bh tasklet isn't scheduled once, no packet from the rx path will be handled. Since the tx path also schedules the same tasklet, this situation only persists until the first packet transmission. So fix this issue by scheduling the tasklet after link reset. Link: https://github.com/raspberrypi/linux/issues/2617 Fixes: `55d7de9de6` ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet") Suggested-by: Floris Bos <bos@je-eigen-domein.nl> Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:01 +02:00
tangpengpeng	31a9d4dd85	net: fix amd-xgbe flow-control issue [ Upstream commit `7f3fc7ddf7` ] If we enable or disable xgbe flow-control by ethtool, it doesn't work. Because the parameter is not properly assigned, we need to adjust the assignment order of the parameters. Fixes: `c1ce2f7736` ("amd-xgbe: Fix flow control setting logic") Signed-off-by: tangpengpeng <tangpengpeng@higon.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:01 +02:00
Gal Pressman	6fff429df7	net: ena: Fix use of uninitialized DMA address bits field [ Upstream commit `101f0cd4f2` ] UBSAN triggers the following undefined behaviour warnings: [...] [ 13.236124] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:468:22 [ 13.240043] shift exponent 64 is too large for 64-bit type 'long long unsigned int' [...] [ 13.744769] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:373:4 [ 13.748694] shift exponent 64 is too large for 64-bit type 'long long unsigned int' [...] When splitting the address to high and low, GENMASK_ULL is used to generate a bitmask with dma_addr_bits field from io_sq (in ena_com_prepare_tx and ena_com_add_single_rx_desc). The problem is that dma_addr_bits is not initialized with a proper value (besides being cleared in ena_com_create_io_queue). Assign dma_addr_bits the correct value that is stored in ena_dev when initializing the SQ. Fixes: `1738cd3ed3` ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Gal Pressman <pressmangal@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:01 +02:00
Lorenzo Bianconi	e364f1a2cc	ipv4: remove BUG_ON() from fib_compute_spec_dst [ Upstream commit `9fc12023d6` ] Remove BUG_ON() from fib_compute_spec_dst routine and check in_dev pointer during flowi4 data structure initialization. fib_compute_spec_dst routine can be run concurrently with device removal where ip_ptr net_device pointer is set to NULL. This can happen if userspace enables pkt info on UDP rx socket and the device is removed while traffic is flowing Fixes: `35ebf65e85` ("ipv4: Create and use fib_compute_spec_dst() helper") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-06 16:23:01 +02:00
Greg Kroah-Hartman	ddd28fff50	Linux 4.9.117	2018-08-03 07:55:27 +02:00
Michal Vokáč	db890d30b9	net: dsa: qca8k: Allow overwriting CPU port setting commit `9bb2289f90` upstream. Implement adjust_link function that allows to overwrite default CPU port setting using fixed-link device tree subnode. Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:27 +02:00
Michal Vokáč	53a1a29a92	net: dsa: qca8k: Add QCA8334 binding documentation commit `218bbea11a` upstream. Add support for the four-port variant of the Qualcomm QCA833x switch. The CPU port default link settings can be reconfigured using a fixed-link sub-node. Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
Michal Vokáč	b429bf7de4	net: dsa: qca8k: Enable RXMAC when bringing up a port commit `eee1fe6476` upstream. When a port is brought up/down do not enable/disable only the TXMAC but the RXMAC as well. This is essential for the CPU port to work. Fixes: `6b93fb4648` ("net-next: dsa: add new driver for qca8xxx family") Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
Michal Vokáč	e59af2831d	net: dsa: qca8k: Force CPU port to its highest bandwidth commit `79a4ed4f0f` upstream. By default autonegotiation is enabled to configure MAC on all ports. For the CPU port autonegotiation can not be used so we need to set some sensible defaults manually. This patch forces the default setting of the CPU port to 1000Mbps/full duplex which is the chip maximum capability. Also correct size of the bit field used to configure link speed. Fixes: `6b93fb4648` ("net-next: dsa: add new driver for qca8xxx family") Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
Leon Romanovsky	40af3250e9	RDMA/uverbs: Protect from attempts to create flows on unsupported QP commit `940efcc888` upstream. Flows can be created on UD and RAW_PACKET QP types. Attempts to provide other QP types as an input causes to various unpredictable failures. The reason is that in order to support all various types (e.g. XRC), we are supposed to use real_qp handle and not qp handle and expect to driver/FW to fail such (XRC) flows. The simpler and safer variant is to ban all QP types except UD and RAW_PACKET, instead of relying on driver/FW. Cc: <stable@vger.kernel.org> # 3.11 Fixes: `436f2ad05a` ("IB/core: Export ib_create/destroy_flow through uverbs") Cc: syzkaller <syzkaller@googlegroups.com> Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
Theodore Ts'o	262a62cc50	ext4: check for allocation block validity with block group locked commit `8d5a803c6a` upstream. With commit `044e6e3d74`: "ext4: don't update checksum of new initialized bitmaps" the buffer valid bit will get set without actually setting up the checksum for the allocation bitmap, since the checksum will get calculated once we actually allocate an inode or block. If we are doing this, then we need to (re-)check the verified bit after we take the block group lock. Otherwise, we could race with another process reading and verifying the bitmap, which would then complain about the checksum being invalid. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1780137 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
Theodore Ts'o	5eed597ca6	ext4: fix inline data updates with checksums enabled commit `362eca70b5` upstream. The inline data code was updating the raw inode directly; this is problematic since if metadata checksums are enabled, ext4_mark_inode_dirty() must be called to update the inode's checksum. In addition, the jbd2 layer requires that get_write_access() be called before the metadata buffer is modified. Fix both of these problems. https://bugzilla.kernel.org/show_bug.cgi?id=200443 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
Linus Torvalds	1aecbe4326	squashfs: be more careful about metadata corruption commit `01cfb7937a` upstream. Anatoly Trosinenko reports that a corrupted squashfs image can cause a kernel oops. It turns out that squashfs can end up being confused about negative fragment lengths. The regular squashfs_read_data() does check for negative lengths, but squashfs_read_metadata() did not, and the fragment size code just blindly trusted the on-disk value. Fix both the fragment parsing and the metadata reading code. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
Theodore Ts'o	820f2bcacb	random: mix rdrand with entropy sent in from userspace commit `81e69df38e` upstream. Fedora has integrated the jitter entropy daemon to work around slow boot problems, especially on VM's that don't support virtio-rng: https://bugzilla.redhat.com/show_bug.cgi?id=1572944 It's understandable why they did this, but the Jitter entropy daemon works fundamentally on the principle: "the CPU microarchitecture is so complicated and we can't figure it out, so it must be random". Yes, it uses statistical tests to "prove" it is secure, but AES_ENCRYPT(NSA_KEY, COUNTER++) will also pass statistical tests with flying colors. So if RDRAND is available, mix it into entropy submitted from userspace. It can't hurt, and if you believe the NSA has backdoored RDRAND, then they probably have enough details about the Intel microarchitecture that they can reverse engineer how the Jitter entropy daemon affects the microarchitecture, and attack its output stream. And if RDRAND is in fact an honest DRNG, it will immeasurably improve on what the Jitter entropy daemon might produce. This also provides some protection against someone who is able to read or set the entropy seed file. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
José Roberto de Souza	f685597b13	drm: Add DP PSR2 sink enable bit [ Upstream commit `4f212e4046` ] To comply with eDP1.4a this bit should be set when enabling PSR2. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180328223046.16125-1-jose.souza@intel.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
Mauro Carvalho Chehab	401103613d	media: si470x: fix __be16 annotations [ Upstream commit `90db5c8296` ] The annotations there are wrong as warned: drivers/media/radio/si470x/radio-si470x-i2c.c:107:35: warning: cast to restricted __be16 drivers/media/radio/si470x/radio-si470x-i2c.c:107:35: warning: cast to restricted __be16 drivers/media/radio/si470x/radio-si470x-i2c.c:107:35: warning: cast to restricted __be16 drivers/media/radio/si470x/radio-si470x-i2c.c:107:35: warning: cast to restricted __be16 drivers/media/radio/si470x/radio-si470x-i2c.c:129:24: warning: incorrect type in assignment (different base types) drivers/media/radio/si470x/radio-si470x-i2c.c:129:24: expected unsigned short [unsigned] [short] <noident> drivers/media/radio/si470x/radio-si470x-i2c.c:129:24: got restricted __be16 [usertype] <noident> drivers/media/radio/si470x/radio-si470x-i2c.c:163:39: warning: cast to restricted __be16 drivers/media/radio/si470x/radio-si470x-i2c.c:163:39: warning: cast to restricted __be16 drivers/media/radio/si470x/radio-si470x-i2c.c:163:39: warning: cast to restricted __be16 drivers/media/radio/si470x/radio-si470x-i2c.c:163:39: warning: cast to restricted __be16 Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
Shivasharan S	6e8738c1c1	scsi: megaraid_sas: Increase timeout by 1 sec for non-RAID fastpath IOs [ Upstream commit `3239b8cd28` ] Hardware could time out Fastpath IOs one second earlier than the timeout provided by the host. For non-RAID devices, driver provides timeout value based on OS provided timeout value. Under certain scenarios, if the OS provides a timeout value of 1 second, due to above behavior hardware will timeout immediately. Increase timeout value for non-RAID fastpath IOs by 1 second. Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:26 +02:00
Xose Vazquez Perez	6337861a0f	scsi: scsi_dh: replace too broad "TP9" string with the exact models [ Upstream commit `37b37d2609` ] SGI/TP9100 is not an RDAC array: ^^^ https://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=blob;f=libmultipath/hwtable.c;h=88b4700beb1d8940008020fbe4c3cd97d62f4a56;hb=HEAD#l235 This partially reverts commit `35204772ea` ("[SCSI] scsi_dh_rdac : Consolidate rdac strings together") [mkp: fixed up the new entries to align with rest of struct] Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com> Cc: Hannes Reinecke <hare@suse.de> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: SCSI ML <linux-scsi@vger.kernel.org> Cc: DM ML <dm-devel@redhat.com> Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
Suman Anna	8fcb8b5ea0	media: omap3isp: fix unbalanced dma_iommu_mapping [ Upstream commit `b7e1e6859f` ] The OMAP3 ISP driver manages its MMU mappings through the IOMMU-aware ARM DMA backend. The current code creates a dma_iommu_mapping and attaches this to the ISP device, but never detaches the mapping in either the probe failure paths or the driver remove path resulting in an unbalanced mapping refcount and a memory leak. Fix this properly. Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Suman Anna <s-anna@ti.com> Tested-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
Tudor-Dan Ambarus	15aa793dad	crypto: authenc - don't leak pointers to authenc keys [ Upstream commit `ad2fdcdf75` ] In crypto_authenc_setkey we save pointers to the authenc keys in a local variable of type struct crypto_authenc_keys and we don't zeroize it after use. Fix this and don't leak pointers to the authenc keys. Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
Tudor-Dan Ambarus	6b4cdfa0ab	crypto: authencesn - don't leak pointers to authenc keys [ Upstream commit `31545df391` ] In crypto_authenc_esn_setkey we save pointers to the authenc keys in a local variable of type struct crypto_authenc_keys and we don't zeroize it after use. Fix this and don't leak pointers to the authenc keys. Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
Dominik Bozek	399e549fe5	usb: hub: Don't wait for connect state at resume for powered-off ports [ Upstream commit `5d111f5190` ] wait_for_connected() wait till a port change status to USB_PORT_STAT_CONNECTION, but this is not possible if the port is unpowered. The loop will only exit at timeout. Such case take place if an over-current incident happen while system is in S3. Then during resume wait_for_connected() will wait 2s, which may be noticeable by the user. Signed-off-by: Dominik Bozek <dominikx.bozek@intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
Michal Simek	eac904dd39	microblaze: Fix simpleImage format generation [ Upstream commit `ece97f3a5f` ] simpleImage generation was broken for some time. This patch fixes the steps by which the simpleImage.*.ub file is generated. The steps are objdump of vmlinux and creation of the .ub file. Also make sure that there is a stripped ELF version with the .strip suffix. Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
Douglas Anderson	1d1a409502	serial: core: Make sure compiler barfs for 16-byte earlycon names [ Upstream commit `c1c734cb1f` ] As part of bringup I ended up wanting to call an earlycon driver by a name that was exactly 16-bytes big, specifically "qcom_geni_serial". Unfortunately, when I tried this I found that things compiled just fine. They just didn't work. Specifically the compiler felt perfectly justified in initting the ".name" field of "struct earlycon_id" with the full 16-bytes and just skipping the '\0'. Needless to say, that behavior didn't seem ideal, but I guess someone must have allowed it for a reason. One way to fix this is to shorten the name field to 15 bytes and then add an extra byte after that nobody touches. This should always be initted to 0 and we're golden. There are, of course, other ways to fix this too. We could audit all the users of the "name" field and make them stop at both null termination or at 16 bytes. We could also just make the name field much bigger so that we're not likely to run into this. ...but both seem like we'll just hit the bug again. Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
NeilBrown	c18d68c7c2	staging: lustre: ldlm: free resource when ldlm_lock_create() fails. [ Upstream commit `d8caf662b4` ] ldlm_lock_create() gets a resource, but don't put it on all failure paths. It should. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
James Simmons	1c80292332	staging: lustre: llite: correct removexattr detection [ Upstream commit `1b60f6dfa3` ] In ll_xattr_set_common() detect the removexattr() case correctly by testing for a NULL value as well as XATTR_REPLACE. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-10787 Reviewed-on: https://review.whamcloud.com/ Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com> Reviewed-by: James Simmons <uja.ornl@yahoo.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
Ondrej Mosnáček	5f5e70d7ec	audit: allow not equal op for audit by executable [ Upstream commit `23bcc480da` ] Current implementation of auditing by executable name only implements the 'equal' operator. This patch extends it to also support the 'not equal' operator. See: https://github.com/linux-audit/audit-kernel/issues/53 Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
Siva Rebbagondla	3c90e828db	rsi: Fix 'invalid vdd' warning in mmc [ Upstream commit `78e450719c` ] While performing cleanup, driver is messing with card->ocr value by not masking rocr against ocr_avail. Below panic is observed with some of the SDIO host controllers due to this. Issue is resolved by reverting incorrect modifications to vdd. [ 927.423821] mmc1: Invalid vdd 0x1f [ 927.423925] Modules linked in: rsi_sdio(+) cmac bnep arc4 rsi_91x mac80211 cfg80211 btrsi rfcomm bluetooth ecdh_generic [ 927.424073] CPU: 0 PID: 1624 Comm: insmod Tainted: G W 4.15.0-1000-caracalla #1 [ 927.424075] Hardware name: Dell Inc. Edge Gateway 3003/ , BIOS 01.00.06 01/22/2018 [ 927.424082] RIP: 0010:sdhci_set_power_noreg+0xdd/0x190[sdhci] [ 927.424085] RSP: 0018:ffffac3fc064b930 EFLAGS: 00010282 [ 927.424107] Call Trace: [ 927.424118] sdhci_set_power+0x5a/0x60 [sdhci] [ 927.424125] sdhci_set_ios+0x360/0x3b0 [sdhci] [ 927.424133] mmc_set_initial_state+0x92/0x120 [ 927.424137] mmc_power_up.part.34+0x33/0x1d0 [ 927.424141] mmc_power_up+0x17/0x20 [ 927.424147] mmc_sdio_runtime_resume+0x2d/0x50 [ 927.424151] mmc_runtime_resume+0x17/0x20 [ 927.424156] __rpm_callback+0xc4/0x200 [ 927.424161] ? idr_alloc_cyclic+0x57/0xd0 [ 927.424165] ? mmc_runtime_suspend+0x20/0x20 [ 927.424169] rpm_callback+0x24/0x80 [ 927.424172] ? mmc_runtime_suspend+0x20/0x20 [ 927.424176] rpm_resume+0x4b3/0x6c0 [ 927.424181] __pm_runtime_resume+0x4e/0x80 [ 927.424188] driver_probe_device+0x41/0x490 [ 927.424192] __driver_attach+0xdf/0xf0 [ 927.424196] ? driver_probe_device+0x490/0x490 [ 927.424201] bus_for_each_dev+0x6c/0xc0 [ 927.424205] driver_attach+0x1e/0x20 [ 927.424209] bus_add_driver+0x1f4/0x270 [ 927.424217] ? rsi_sdio_ack_intr+0x50/0x50 [rsi_sdio] [ 927.424221] driver_register+0x60/0xe0 [ 927.424227] ? rsi_sdio_ack_intr+0x50/0x50 [rsi_sdio] [ 927.424231] sdio_register_driver+0x20/0x30 [ 927.424237] rsi_module_init+0x16/0x40 [rsi_sdio] Signed-off-by: Siva Rebbagondla <siva.rebbagondla@redpinesignals.com> Signed-off-by: Amitkumar Karwar <amit.karwar@redpinesignals.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:25 +02:00
Chris Novakovic	34447a69c9	ipconfig: Correctly initialise ic_nameservers [ Upstream commit `300eec7c0a` ] ic_nameservers, which stores the list of name servers discovered by ipconfig, is initialised (i.e. has all of its elements set to NONE, or 0xffffffff) by ic_nameservers_predef() in the following scenarios: - before the "ip=" and "nfsaddrs=" kernel command line parameters are parsed (in ip_auto_config_setup()); - before autoconfiguring via DHCP or BOOTP (in ic_bootp_init()), in order to clear any values that may have been set after parsing "ip=" or "nfsaddrs=" and are no longer needed. This means that ic_nameservers_predef() is not called when neither "ip=" nor "nfsaddrs=" is specified on the kernel command line. In this scenario, every element in ic_nameservers remains set to 0x00000000, which is indistinguishable from ANY and causes pnp_seq_show() to write the following (bogus) information to /proc/net/pnp: #MANUAL nameserver 0.0.0.0 nameserver 0.0.0.0 nameserver 0.0.0.0 This is potentially problematic for systems that blindly link /etc/resolv.conf to /proc/net/pnp. Ensure that ic_nameservers is also initialised when neither "ip=" nor "nfsaddrs=" are specified by calling ic_nameservers_predef() in ip_auto_config(), but only when ip_auto_config_setup() was not called earlier. This causes the following to be written to /proc/net/pnp, and is consistent with what gets written when ipconfig is configured manually but no name servers are specified on the kernel command line: #MANUAL Signed-off-by: Chris Novakovic <chris@chrisn.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:24 +02:00
Luc Van Oostenryck	917f481feb	drm/gma500: fix psb_intel_lvds_mode_valid()'s return type [ Upstream commit `2ea009095c` ] The method struct drm_connector_helper_funcs::mode_valid is defined as returning an 'enum drm_mode_status' but the driver implementation for this method, psb_intel_lvds_mode_valid(), uses an 'int' for it. Fix this by using 'enum drm_mode_status' for psb_intel_lvds_mode_valid(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180424131458.2060-1-luc.vanoostenryck@gmail.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:24 +02:00
Enric Balletbo i Serra	b713163129	arm64: defconfig: Enable Rockchip io-domain driver [ Upstream commit `7c8b77f815` ] Heiko Stübner justified pretty well the change in commit `e330eb86ba` ("ARM: multi_v7_defconfig: enable Rockchip io-domain driver"). This change is also needed for arm64 rockchip boards, so, do the same for arm64. The io-domain driver is necessary to notify the soc about voltages changes happening on supplying regulators. Probably the most important user right now is the mmc tuning code, where the soc needs to get notified when the voltage is dropped to the 1.8V point. As this option is necessary to successfully tune UHS cards etc, it should get built in. Otherwise, tuning will fail with, dwmmc_rockchip fe320000.dwmmc: All phases bad! mmc0: tuning execution failed: -5 Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:24 +02:00
Dmitry Osipenko	dc6afdde4b	memory: tegra: Apply interrupts mask per SoC [ Upstream commit `1c74d5c0de` ] Currently we are enabling handling of interrupts specific to Tegra124+ which happen to overlap with previous generations. Let's specify interrupts mask per SoC generation for consistency and in a preparation of squashing of Tegra20 driver into the common one that will enable handling of GART faults which may be undesirable by newer generations. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:24 +02:00
Dmitry Osipenko	1516a60194	memory: tegra: Do not handle spurious interrupts [ Upstream commit `bf3fbdfbec` ] The ISR reads interrupts-enable mask, but doesn't utilize it. Apply the mask to the interrupt status and don't handle interrupts that MC driver haven't asked for. Kernel would disable spurious MC IRQ and report the error. This would happen only in a case of a very severe bug. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:24 +02:00
Thomas Gleixner	7d044d940f	stop_machine: Use raw spinlocks [ Upstream commit `de5b55c1d4` ] Use raw-locks in stop_machine() to allow locking in irq-off and preempt-disabled regions on -RT. This also documents the possible locking context in general. [bigeasy: update patch description.] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lkml.kernel.org/r/20180423191635.6014-1-bigeasy@linutronix.de Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:24 +02:00
Yixun Lan	68f96e5410	dt-bindings: net: meson-dwmac: new compatible name for AXG SoC [ Upstream commit `7e5d05e18b` ] We need to introduce a new compatible name for the Meson-AXG SoC in order to support the RMII 100M ethernet PHY, since the PRG_ETH0 register of the dwmac glue layer is changed from previous old SoC. Signed-off-by: Yixun Lan <yixun.lan@amlogic.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:24 +02:00
Martin Blumenstingl	77620f3990	dt-bindings: pinctrl: meson: add support for the Meson8m2 SoC [ Upstream commit `03d9fbc397` ] The Meson8m2 SoC is a variant of Meson8 with some updates from Meson8b (such as the Gigabit capable DesignWare MAC). It is mostly pin compatible with Meson8, only 10 (existing) CBUS pins get an additional function (four of these are Ethernet RXD2, RXD3, TXD2 and TXD3 which are required when the board uses an RGMII PHY). The AOBUS pins seem to be identical on Meson8 and Meson8m2. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:24 +02:00
Tobin C. Harding	df157f60b9	mmc: pwrseq: Use kmalloc_array instead of stack VLA [ Upstream commit `486e666136` ] The use of stack Variable Length Arrays needs to be avoided, as they can be a vector for stack exhaustion, which can be both a runtime bug (kernel Oops) or a security flaw (overwriting memory beyond the stack). Also, in general, as code evolves it is easy to lose track of how big a VLA can get. Thus, we can end up having runtime failures that are hard to debug. As part of the directive[1] to remove all VLAs from the kernel, and build with -Wvla. Currently driver is using a VLA declared using the number of descriptors. This array is used to store integer values and is later used as an argument to `gpiod_set_array_value_cansleep()` This can be avoided by using `kmalloc_array()` to allocate memory for the array of integer values. Memory is free'd before return from function. From the code it appears that it is safe to sleep so we can use GFP_KERNEL (based on the _cansleep() suffix of function `gpiod_set_array_value_cansleep()`). It can be expected that this patch will result in a small increase in overhead due to the use of `kmalloc_array()` [1] https://lkml.org/lkml/2018/3/7/621 Signed-off-by: Tobin C. Harding <me@tobin.cc> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:24 +02:00
Shawn Lin	de3466cc15	mmc: dw_mmc: update actual clock for mmc debugfs [ Upstream commit `ff178981bd` ] Respect the actual clock for mmc debugfs to help better debug the hardware. mmc_host mmc0: Bus speed (slot 0) = 135475200Hz (slot req 150000000Hz, actual 135475200HZ div = 0) cat /sys/kernel/debug/mmc0/ios clock: 150000000 Hz actual clock: 135475200 Hz vdd: 21 (3.3 ~ 3.4 V) bus mode: 2 (push-pull) chip select: 0 (don't care) power mode: 2 (on) bus width: 3 (8 bits) timing spec: 9 (mmc HS200) signal voltage: 0 (1.80 V) driver type: 0 (driver type B) Cc: Xiao Yao <xiaoyao@rock-chips.com> Cc: Ziyuan <xzy.xu@rock-chips.com> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:24 +02:00
Takashi Sakamoto	575aa79d55	ALSA: hda/ca0132: fix build failure when a local macro is defined [ Upstream commit `8e142e9e62` ] DECLARE_TLV_DB_SCALE (alias of SNDRV_CTL_TLVD_DECLARE_DB_SCALE) is used but tlv.h is not included. This causes build failure when local macro is defined by comment-out. This commit fixes the bug. At the same time, the alias macro is replaced with a destination macro added at a commit `46e860f768` ("ALSA: rename TLV-related macros so that they're friendly to user applications") Reported-by: Connor McAdams <conmanx360@gmail.com> Fixes: `44f0c9782c` ('ALSA: hda/ca0132: Add tuning controls') Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
Satendra Singh Thakur	004256bb88	drm/atomic: Handling the case when setting old crtc for plane [ Upstream commit `fc2a69f390` ] In the func drm_atomic_set_crtc_for_plane, with the current code, if crtc of the plane_state and crtc passed as argument to the func are same, the entire func will be executed in vain. It will get state of crtc and clear and set the bits in plane_mask. All these steps are not required for same old crtc. Ideally, we should do nothing in this case; this patch handles that, and causes the program to return without doing anything in such a scenario. Signed-off-by: Satendra Singh Thakur <satendra.t@samsung.com> Cc: Madhur Verma <madhur.verma@samsung.com> Cc: Hemanshu Srivastava <hemanshu.s@samsung.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/1525326572-25854-1-git-send-email-satendra.t@samsung.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
Mauro Carvalho Chehab	f3382cb557	media: siano: get rid of __le32/__le16 cast warnings [ Upstream commit `e1b7f11b37` ] Those are all false-positives that appear with smatch when building for arm: drivers/media/common/siano/smsendian.c:38:36: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:38:36: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:38:36: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:38:36: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:38:36: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:38:36: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:47:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:47:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:47:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:47:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:47:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:47:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:67:35: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:67:35: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:67:35: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:67:35: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:84:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:84:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:84:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:84:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:84:44: warning: cast to restricted __le32 drivers/media/common/siano/smsendian.c:84:44: warning: cast to restricted __le32 
drivers/media/common/siano/smsendian.c:98:26: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:98:26: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:98:26: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:98:26: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:99:28: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:99:28: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:99:28: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:99:28: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:100:27: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:100:27: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:100:27: warning: cast to restricted __le16 drivers/media/common/siano/smsendian.c:100:27: warning: cast to restricted __le16 Get rid of them by adding explicit forced casts. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
Jakub Kicinski	e31a06ec82	bpf: fix references to free_bpf_prog_info() in comments [ Upstream commit `ab7f5bf092` ] Comments in the verifier refer to free_bpf_prog_info() which seems to have never existed in tree. Replace it with free_used_maps(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
Bartlomiej Zolnierkiewicz	3221a270e2	thermal: exynos: fix setting rising_threshold for Exynos5433 [ Upstream commit `8bfc218d0e` ] Add missing clearing of the previous value when setting rising temperature threshold. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
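The "clear the previous value before setting" fix in the commit above is the usual read-modify-write pattern for a register field. A minimal sketch follows; the 8-bit field width, the `THRESHOLD_MASK` name and the `set_threshold()` helper are assumptions for illustration and are not taken from the Exynos driver.

```c
#include <stdint.h>

/* Hypothetical layout: an 8-bit threshold code at a given bit offset. */
#define THRESHOLD_MASK 0xffu

/* Returns reg with the threshold field at shift replaced by code.
 * shift must be at most 24 so the field fits in the 32-bit register. */
uint32_t set_threshold(uint32_t reg, unsigned int shift, uint8_t code)
{
    reg &= ~((uint32_t)THRESHOLD_MASK << shift); /* clear the old field first */
    reg |= (uint32_t)code << shift;              /* then OR in the new value */
    return reg;
}
```

Without the clearing step, bits left over from the previous threshold would be OR-ed together with the new one and produce a value that was never requested.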
Doug Oucahrek	30f32e09af	staging: lustre: o2iblnd: fix race at kiblnd_connect_peer [ Upstream commit `cf04968efe` ] cmid will be destroyed at OFED if kiblnd_cm_callback return error. if error happen before the end of kiblnd_connect_peer, it will touch destroyed cmid and fail as (o2iblnd_cb.c:1315:kiblnd_connect_peer()) ASSERTION( cmid->device != ((void *)0) ) failed: Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-10015 Reviewed-by: Alexey Lyashkov <c17817@cray.com> Reviewed-by: Doug Oucharek <dougso@me.com> Reviewed-by: John L. Hammond <john.hammond@intel.com> Signed-off-by: Doug Oucharek <dougso@me.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
Dan Carpenter	749c6f0e3b	scsi: megaraid: silence a static checker bug [ Upstream commit `27e833daba` ] If we had more than 32 megaraid cards then it would cause memory corruption. That's not likely, of course, but it's handy to enforce it and make the static checker happy. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
Wenwen Wang	5a644f6822	scsi: 3w-xxxx: fix a missing-check bug [ Upstream commit `9899e4d352` ] In tw_chrdev_ioctl(), the length of the data buffer is firstly copied from the userspace pointer 'argp' and saved to the kernel object 'data_buffer_length'. Then a security check is performed on it to make sure that the length is not more than 'TW_MAX_IOCTL_SECTORS * 512'. Otherwise, an error code -EINVAL is returned. If the security check is passed, the entire ioctl command is copied again from the 'argp' pointer and saved to the kernel object 'tw_ioctl'. Then, various operations are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer resides in userspace, a malicious userspace process can race to change the buffer length between the two copies. This way, the user can bypass the security check and inject invalid data buffer length. This can cause potential security issues in the following execution. This patch checks for capable(CAP_SYS_ADMIN) in tw_chrdev_open() to avoid the above issues. Signed-off-by: Wenwen Wang <wang6495@umn.edu> Acked-by: Adam Radford <aradford@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
Wenwen Wang	80e75bdc0e	scsi: 3w-9xxx: fix a missing-check bug [ Upstream commit `c9318a3e02` ] In twa_chrdev_ioctl(), the ioctl driver command is firstly copied from the userspace pointer 'argp' and saved to the kernel object 'driver_command'. Then a security check is performed on the data buffer size indicated by 'driver_command', which is 'driver_command.buffer_length'. If the security check is passed, the entire ioctl command is copied again from the 'argp' pointer and saved to the kernel object 'tw_ioctl'. Then, various operations are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer resides in userspace, a malicious userspace process can race to change the buffer size between the two copies. This way, the user can bypass the security check and inject invalid data buffer size. This can cause potential security issues in the following execution. This patch checks for capable(CAP_SYS_ADMIN) in twa_chrdev_open() to avoid the above issues. Signed-off-by: Wenwen Wang <wang6495@umn.edu> Acked-by: Adam Radford <aradford@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
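The double-fetch problem described in the two 3w commits above can be illustrated with a small userspace sketch. Note that this is not the change those commits make (they restrict the device open path with capable(CAP_SYS_ADMIN)); it only shows the general fetch-once-then-validate idea. The struct layout, `MAX_BUFFER_LENGTH` and `fetch_and_check()` are hypothetical, and `memcpy()` stands in for the kernel's `copy_from_user()`.

```c
#include <string.h>

#define MAX_BUFFER_LENGTH 4096u /* hypothetical limit for this sketch */

/* Hypothetical request layout shared between "user" and "kernel" sides. */
struct ioctl_request {
    unsigned int buffer_length;
    unsigned char payload[16];
};

/* Copies the whole request exactly once, then validates the private copy.
 * Later changes to *user_req cannot affect the value that was checked.
 * Returns 0 if the length is acceptable, -1 otherwise. */
int fetch_and_check(const struct ioctl_request *user_req,
                    struct ioctl_request *kernel_req)
{
    memcpy(kernel_req, user_req, sizeof(*kernel_req)); /* single fetch */

    if (kernel_req->buffer_length > MAX_BUFFER_LENGTH)
        return -1;
    return 0;
}
```

The vulnerable pattern copies only the length, checks it, and then copies the full request a second time, so the checked length and the length actually used can differ.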
Michael Chan	a85b32ebaa	bnxt_en: Check unsupported speeds in bnxt_update_link() on PF only. [ Upstream commit `dac0490718` ] Only non-NPAR PFs need to actively check and manage unsupported link speeds. NPAR functions and VFs do not control the link speed and should skip the unsupported speed detection logic, to avoid warning messages from firmware rejecting the unsupported firmware calls. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
Thomas Richter	67d64e1cb1	perf: fix invalid bit in diagnostic entry [ Upstream commit `3c0a83b14e` ] The s390 CPU measurement facility sampling mode supports basic entries and diagnostic entries. Each entry has a valid bit to indicate the status of the entry as valid or invalid. This bit is bit 31 in the diagnostic entry, but the bit mask definition refers to bit 30. Fix this by making the reserved field one bit larger. Fixes: `7e75fc3ff4` ("s390/cpum_sf: Add raw data sampling to support the diagnostic-sampling function") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:23 +02:00
Thomas Richter	157674ac44	s390/cpum_sf: Add data entry sizes to sampling trailer entry [ Upstream commit `77715b7ddb` ] The CPU Measurement sampling facility creates a trailer entry for each Sample-Data-Block of stored samples. The trailer entry contains the sizes (in bytes) of the stored sampling types: - basic-sampling data entry size - diagnostic-sampling data entry size Both sizes are 2 bytes long. This patch changes the trailer entry definition to reflect this. Fixes: `fcc77f5073` ("s390/cpum_sf: Atomically reset trailer entry fields of sample-data-blocks") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:22 +02:00
Sean Lanigan	4139a62102	brcmfmac: Add support for bcm43364 wireless chipset [ Upstream commit `9c4a121e82` ] Add support for the BCM43364 chipset via an SDIO interface, as used in e.g. the Murata 1FX module. The BCM43364 uses the same firmware as the BCM43430 (which is already included), the only difference is the omission of Bluetooth. However, the SDIO_ID for the BCM43364 is 02D0:A9A4, giving it a MODALIAS of sdio:c00v02D0dA9A4, which doesn't get recognised and hence doesn't load the brcmfmac module. Adding the 'A9A4' ID in the appropriate place triggers the brcmfmac driver to load, and then correctly use the firmware file 'brcmfmac43430-sdio.bin'. Signed-off-by: Sean Lanigan <sean@lano.id.au> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:22 +02:00
Jane Wan	e70e69a8dc	mtd: rawnand: fsl_ifc: fix FSL NAND driver to read all ONFI parameter pages [ Upstream commit `a75bbe71a2` ] Per ONFI specification (Rev. 4.0), if the CRC of the first parameter page read is not valid, the host should read redundant parameter page copies. Fix FSL NAND driver to read the two redundant copies which are mandatory in the specification. Signed-off-by: Jane Wan <Jane.Wan@nokia.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:22 +02:00
Brad Love	523a9ce7d2	media: saa7164: Fix driver name in debug output [ Upstream commit `0cc4655cb5` ] This issue was reported by a user who downloaded a corrupt saa7164 firmware, then went looking for a valid xc5000 firmware to fix the error displayed...but the device in question has no xc5000, thus after much effort, the wild goose chase eventually led to a support call. The xc5000 has nothing to do with saa7164 (as far as I can tell), so replace the string with saa7164 as well as give a meaningful hint on the firmware mismatch. Signed-off-by: Brad Love <brad@nextdimension.cc> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:22 +02:00
Sami Tolvanen	f638764e9b	media: media-device: fix ioctl function types [ Upstream commit `daa36370b6` ] This change fixes function types for media device ioctls to avoid indirect call mismatches with Control-Flow Integrity checking. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:22 +02:00
Damien Le Moal	cbc0c24c9c	libata: Fix command retry decision [ Upstream commit `804689ad2d` ] For failed commands with valid sense data (e.g. NCQ commands), scsi_check_sense() is used in ata_analyze_tf() to determine if the command can be retried. In such case, rely on this decision and ignore the command error mask based decision done in ata_worth_retry(). This fixes useless retries of commands such as unaligned writes on zoned disks (TYPE_ZAC). Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:22 +02:00
Wei Yongjun	f3be42dc93	media: rcar_jpu: Add missing clk_disable_unprepare() on error in jpu_open() [ Upstream commit `43d0d3c527` ] Add the missing clk_disable_unprepare() before return from jpu_open() in the software reset error handling case. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Mikhail Ulyanov <mikhail.ulyanov@cogentembedded.com> Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> Signed-off-by: Hans Verkuil <hansverk@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:22 +02:00
Marc Zyngier	4fccb92b53	dma-iommu: Fix compilation when !CONFIG_IOMMU_DMA [ Upstream commit `8a22a3e1e7` ] Inclusion of include/dma-iommu.h when CONFIG_IOMMU_DMA is not selected results in the following splat: In file included from drivers/irqchip/irq-gic-v3-mbi.c:20:0: ./include/linux/dma-iommu.h:95:69: error: unknown type name ‘dma_addr_t’ static inline int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base) ^~~~~~~~~~ ./include/linux/dma-iommu.h:108:74: warning: ‘struct list_head’ declared inside parameter list will not be visible outside of this definition or declaration static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list) ^~~~~~~~~ scripts/Makefile.build:312: recipe for target 'drivers/irqchip/irq-gic-v3-mbi.o' failed Fix it by including linux/types.h. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <robh@kernel.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lkml.kernel.org/r/20180508121438.11301-5-marc.zyngier@arm.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:22 +02:00
DaeRyong Jeong	d83904cb2e	tty: Fix data race in tty_insert_flip_string_fixed_flag [ Upstream commit `b6da31b2c0` ] Unlike normal serials, in pty layer, there is no guarantee that multiple threads don't insert input characters at the same time. If it is happened, tty_insert_flip_string_fixed_flag can be executed concurrently. This can lead slab out-of-bounds write in tty_insert_flip_string_fixed_flag. Call sequences are as follows. CPU0 CPU1 n_tty_ioctl_helper n_tty_ioctl_helper __start_tty tty_send_xchar tty_wakeup pty_write n_hdlc_tty_wakeup tty_insert_flip_string n_hdlc_send_frames tty_insert_flip_string_fixed_flag pty_write tty_insert_flip_string tty_insert_flip_string_fixed_flag To fix the race, acquire port->lock in pty_write() before it inserts input characters to tty buffer. It prevents multiple threads from inserting input characters concurrently. The crash log is as follows: BUG: KASAN: slab-out-of-bounds in tty_insert_flip_string_fixed_flag+0xb5/ 0x130 drivers/tty/tty_buffer.c:316 at addr ffff880114fcc121 Write of size 1792 by task syz-executor0/30017 CPU: 1 PID: 30017 Comm: syz-executor0 Not tainted 4.8.0 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014 0000000000000000 ffff88011638f888 ffffffff81694cc3 ffff88007d802140 ffff880114fcb300 ffff880114fcc300 ffff880114fcb300 ffff88011638f8b0 ffffffff8130075c ffff88011638f940 ffff88007d802140 ffff880194fcc121 Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0xb3/0x110 lib/dump_stack.c:51 kasan_object_err+0x1c/0x70 mm/kasan/report.c:156 print_address_description mm/kasan/report.c:194 [inline] kasan_report_error+0x1f7/0x4e0 mm/kasan/report.c:283 kasan_report+0x36/0x40 mm/kasan/report.c:303 check_memory_region_inline mm/kasan/kasan.c:292 [inline] check_memory_region+0x13e/0x1a0 mm/kasan/kasan.c:299 memcpy+0x37/0x50 mm/kasan/kasan.c:335 tty_insert_flip_string_fixed_flag+0xb5/0x130 drivers/tty/tty_buffer.c:316 tty_insert_flip_string 
include/linux/tty_flip.h:35 [inline] pty_write+0x7f/0xc0 drivers/tty/pty.c:115 n_hdlc_send_frames+0x1d4/0x3b0 drivers/tty/n_hdlc.c:419 n_hdlc_tty_wakeup+0x73/0xa0 drivers/tty/n_hdlc.c:496 tty_wakeup+0x92/0xb0 drivers/tty/tty_io.c:601 __start_tty.part.26+0x66/0x70 drivers/tty/tty_io.c:1018 __start_tty+0x34/0x40 drivers/tty/tty_io.c:1013 n_tty_ioctl_helper+0x146/0x1e0 drivers/tty/tty_ioctl.c:1138 n_hdlc_tty_ioctl+0xb3/0x2b0 drivers/tty/n_hdlc.c:794 tty_ioctl+0xa85/0x16d0 drivers/tty/tty_io.c:2992 vfs_ioctl fs/ioctl.c:43 [inline] do_vfs_ioctl+0x13e/0xba0 fs/ioctl.c:679 SYSC_ioctl fs/ioctl.c:694 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685 entry_SYSCALL_64_fastpath+0x1f/0xbd Signed-off-by: DaeRyong Jeong <threeearcat@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:22 +02:00
Mathieu Malaterre	30ac755c76	nvmem: properly handle returned value nvmem_reg_read [ Upstream commit `50808bfcc1` ] Function nvmem_reg_read can return a non zero value indicating an error. This returned value must be read and error propagated to nvmem_cell_prepare_write_buffer. Silence the following gcc warning (W=1): drivers/nvmem/core.c:1093:9: warning: variable 'rc' set but not used [-Wunused-but-set-variable] Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:22 +02:00
Geert Uytterhoeven	202a0cf0c0	ARM: dts: sh73a0: Add missing interrupt-affinity to PMU node [ Upstream commit `57a66497e1` ] The PMU node references two interrupts, but lacks the interrupt-affinity property, which is required in that case: hw perfevents: no interrupt-affinity property for /pmu, guessing. Add the missing property to fix this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:21 +02:00
Geert Uytterhoeven	1af8796a8b	ARM: dts: emev2: Add missing interrupt-affinity to PMU node [ Upstream commit `7207b94754` ] The PMU node references two interrupts, but lacks the interrupt-affinity property, which is required in that case: hw perfevents: no interrupt-affinity property for /pmu, guessing. Add the missing property to fix this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:21 +02:00
Thor Thayer	b0d0e7162c	EDAC, altera: Fix ARM64 build warning [ Upstream commit `9ef20753e0` ] The kbuild test robot reported the following warning: drivers/edac/altera_edac.c: In function 'ocram_free_mem': drivers/edac/altera_edac.c:1410:42: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] gen_pool_free((struct gen_pool *)other, (u32)p, size); ^ After adding support for ARM64 architectures, the unsigned long parameter is 64 bits and causes a build warning on 64-bit configs. Fix by casting to the correct size (unsigned long) instead of u32. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: `c3eea1942a` ("EDAC, altera: Add Altera L2 cache and OCRAM support") Link: http://lkml.kernel.org/r/1526317441-4996-1-git-send-email-thor.thayer@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:21 +02:00
Dmitry Torokhov	3d06d3ca40	HID: i2c-hid: check if device is there before really probing [ Upstream commit `b3a81b6c4f` ] On many Chromebooks touch devices are multi-sourced; the components are electrically compatible and one can be freely swapped for another without changing the OS image or firmware. To avoid bunch of scary messages when device is not actually present in the system let's try testing basic communication with it and if there is no response terminate probe early with -ENXIO. Signed-off-by: Dmitry Torokhov <dtor@chromium.org> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:21 +02:00
Jonathan Neuschäfer	e7de1c6bbe	powerpc/embedded6xx/hlwd-pic: Prevent interrupts from being handled by Starlet [ Upstream commit `9dcb3df428` ] The interrupt controller inside the Wii's Hollywood chip is connected to two masters, the "Broadway" PowerPC and the "Starlet" ARM926, each with their own interrupt status and mask registers. When booting the Wii with mini[1], interrupts from the SD card controller (IRQ 7) are handled by the ARM, because mini provides SD access over IPC. Linux however can't currently use or disable this IPC service, so both sides try to handle IRQ 7 without coordination. Let's instead make sure that all interrupts that are unmasked on the PPC side are masked on the ARM side; this will also make sure that Linux can properly talk to the SD card controller (and potentially other devices). If access to a device through IPC is desired in the future, interrupts from that device should not be handled by Linux directly. [1]: https://github.com/lewurm/mini Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:21 +02:00
Luc Van Oostenryck	cab5ec8da3	drm/radeon: fix mode_valid's return type [ Upstream commit `7a47f20eb1` ] The method struct drm_connector_helper_funcs::mode_valid is defined as returning an 'enum drm_mode_status' but the driver implementation for this method uses an 'int' for it. Fix this by using 'enum drm_mode_status' in the driver too. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:21 +02:00
Terry Junge	c57798822f	HID: hid-plantronics: Re-resend Update to map button for PTT products [ Upstream commit `37e376df5f` ] Add a mapping for Push-To-Talk joystick trigger button. Tested on ChromeBox/ChromeBook with various Plantronics devices. Signed-off-by: Terry Junge <terry.junge@plantronics.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:21 +02:00
Will Deacon	fba1048559	arm64: cmpwait: Clear event register before arming exclusive monitor [ Upstream commit `1cfc63b5ae` ] When waiting for a cacheline to change state in cmpwait, we may immediately wake-up the first time around the outer loop if the event register was already set (for example, because of the event stream). Avoid these spurious wakeups by explicitly clearing the event register before loading the cacheline and setting the exclusive monitor. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:21 +02:00
Takashi Iwai	03df65a0bc	ALSA: usb-audio: Apply rate limit to warning messages in URB complete callback [ Upstream commit `377a879d98` ] retire_capture_urb() may print warning messages when the given URB doesn't align, and this may flood the system log easily. Put the rate limit to the message for avoiding it. Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1093485 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:21 +02:00
Grygorii Strashko	1fa620150c	net: ethernet: ti: cpsw-phy-sel: check bus_find_device() ret value [ Upstream commit `c6213eb1ae` ] This fixes klockworks warnings: Pointer 'dev' returned from call to function 'bus_find_device' at line 179 may be NULL and will be dereferenced at line 181. cpsw-phy-sel.c:179: 'dev' is assigned the return value from function 'bus_find_device'. bus.c:342: 'bus_find_device' explicitly returns a NULL value. cpsw-phy-sel.c:181: 'dev' is dereferenced by passing argument 1 to function 'dev_get_drvdata'. device.h:1024: 'dev' is passed to function 'dev_get_drvdata'. device.h:1026: 'dev' is explicitly dereferenced. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> [nsekhar@ti.com: add an error message, fix return path] Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:21 +02:00
Colin Ian King	77b6f72cef	media: smiapp: fix timeout checking in smiapp_read_nvm [ Upstream commit `7a2148dfda` ] The current code decrements the timeout counter i, and at the end of each loop iteration i is incremented, so the check for timeout will always be false and hence the timeout mechanism is just a dead code path. Potentially, if the RD_READY bit is not set, we could end up in an infinite loop. Fix this so the timeout starts from 1000 and decrements to zero; if at the end of the loop i is zero we have a timeout condition. Detected by CoverityScan, CID#1324008 ("Logically dead code") Fixes: `ccfc97bdb5` ("[media] smiapp: Add driver") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:20 +02:00
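The corrected timeout logic described in the commit above (count down from 1000, treat zero after the loop as a timeout) can be sketched as follows. The callback type and the `wait_ready()`, `always_ready()` and `never_ready()` names are hypothetical and exist only for this illustration; the real driver polls a status register instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical poll callback: returns true once the device reports ready. */
typedef bool (*ready_fn)(void *ctx);

/* Polls up to 1000 times. Returns 0 if the device became ready,
 * or -1 if the counter reached zero first (timeout). */
int wait_ready(ready_fn is_ready, void *ctx)
{
    int i;

    /* The counter moves in one direction only, so the loop is bounded. */
    for (i = 1000; i > 0; i--) {
        if (is_ready(ctx))
            break;
    }

    if (i == 0)
        return -1; /* loop ran out without seeing the ready condition */
    return 0;
}

/* Trivial callbacks used to exercise both outcomes. */
bool always_ready(void *ctx) { (void)ctx; return true; }
bool never_ready(void *ctx)  { (void)ctx; return false; }
```

Calling `wait_ready(never_ready, NULL)` terminates after 1000 polls, whereas a loop that both decrements and increments the counter on each pass would spin forever.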
Emil Tantilov	8d02fc16fa	ixgbevf: fix MAC address changes through ixgbevf_set_mac() [ Upstream commit `6e7d0ba1e5` ] Set hw->mac.perm_addr in ixgbevf_set_mac() in order to avoid losing the custom MAC on reset. This can happen in the following case: >ip link set $vf address $mac >ethtool -r $vf Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:20 +02:00
Yufen Yu	e51f4fcfad	md: fix NULL dereference of mddev->pers in remove_and_add_spares() [ Upstream commit `c42a0e2675` ] We met NULL pointer BUG as follow: [ 151.760358] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060 [ 151.761340] PGD 80000001011eb067 P4D 80000001011eb067 PUD 1011ea067 PMD 0 [ 151.762039] Oops: 0000 [#1] SMP PTI [ 151.762406] Modules linked in: [ 151.762723] CPU: 2 PID: 3561 Comm: mdadm-test Kdump: loaded Not tainted 4.17.0-rc1+ #238 [ 151.763542] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014 [ 151.764432] RIP: 0010:remove_and_add_spares.part.56+0x13c/0x3a0 [ 151.765061] RSP: 0018:ffffc90001d7fcd8 EFLAGS: 00010246 [ 151.765590] RAX: 0000000000000000 RBX: ffff88013601d600 RCX: 0000000000000000 [ 151.766306] RDX: 0000000000000000 RSI: ffff88013601d600 RDI: ffff880136187000 [ 151.767014] RBP: ffff880136187018 R08: 0000000000000003 R09: 0000000000000051 [ 151.767728] R10: ffffc90001d7fed8 R11: 0000000000000000 R12: ffff88013601d600 [ 151.768447] R13: ffff8801298b1300 R14: ffff880136187000 R15: 0000000000000000 [ 151.769160] FS: 00007f2624276700(0000) GS:ffff88013ae80000(0000) knlGS:0000000000000000 [ 151.769971] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 151.770554] CR2: 0000000000000060 CR3: 0000000111aac000 CR4: 00000000000006e0 [ 151.771272] Call Trace: [ 151.771542] md_ioctl+0x1df2/0x1e10 [ 151.771906] ? __switch_to+0x129/0x440 [ 151.772295] ? __schedule+0x244/0x850 [ 151.772672] blkdev_ioctl+0x4bd/0x970 [ 151.773048] block_ioctl+0x39/0x40 [ 151.773402] do_vfs_ioctl+0xa4/0x610 [ 151.773770] ? dput.part.23+0x87/0x100 [ 151.774151] ksys_ioctl+0x70/0x80 [ 151.774493] __x64_sys_ioctl+0x16/0x20 [ 151.774877] do_syscall_64+0x5b/0x180 [ 151.775258] entry_SYSCALL_64_after_hwframe+0x44/0xa9 For raid6, when two disk of the array are offline, two spare disks can be added into the array. 
Before recovery of the spare disks completes, the system reboots and mdadm thinks it is ok to restart the degraded array by md_ioctl(). Since the disks in raid6 are not only_parity(), raid5_run() will abort when there is no PPL feature or the 'start_dirty_degraded' parameter is not set. Therefore, mddev->pers is NULL. But mddev->raid_disks has been set and will not be cleared when raid5_run() aborts. md_ioctl() can then execute the 'HOT_REMOVE_DISK' command from mdadm to remove a disk, which finally causes a NULL pointer dereference in remove_and_add_spares(). Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:20 +02:00
Anson Huang	211c2bc42a	regulator: pfuze100: add .is_enable() for pfuze100_swb_regulator_ops [ Upstream commit `0b01fd3d40` ] If is_enabled() is not defined, regulator core will assume this regulator is already enabled, then it can NOT be really enabled after disabled. Based on Li Jun's patch from the NXP kernel tree. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:20 +02:00
Takashi Iwai	995cbcab6d	ALSA: emu10k1: Rate-limit error messages about page errors [ Upstream commit `11d42c8103` ] The error messages at sanity checks of memory pages tend to repeat too many times once when it hits, and without the rate limit, it may flood and become unreadable. Replace such messages with the *_ratelimited() variant. Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1093027 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:20 +02:00
Maya Erez	62413bacaf	scsi: ufs: fix exception event handling [ Upstream commit `2e3611e954` ] The device can set the exception event bit in one of the response UPIU, for example to notify the need for urgent BKOPs operation. In such a case, the host driver calls ufshcd_exception_event_handler to handle this notification. When trying to check the exception event status (for finding the cause for the exception event), the device may be busy with additional SCSI commands handling and may not respond within the 100ms timeout. To prevent that, we need to block SCSI commands during handling of exception events and allow retransmissions of the query requests, in case of timeout. Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Asutosh Das <asutoshd@codeaurora.org> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:20 +02:00
Eric Biggers	3ce14632e7	fscrypt: use unbound workqueue for decryption [ Upstream commit `36dd26e0c8` ] Improve fscrypt read performance by switching the decryption workqueue from bound to unbound. With the bound workqueue, when multiple bios completed on the same CPU, they were decrypted on that same CPU. But with the unbound queue, they are now decrypted in parallel on any CPU. Although fscrypt read performance can be tough to measure due to the many sources of variation, this change is most beneficial when decryption is slow, e.g. on CPUs without AES instructions. For example, I timed tarring up encrypted directories on f2fs. On x86 with AES-NI instructions disabled, the unbound workqueue improved performance by about 25-35%, using 1 to NUM_CPUs jobs with 4 or 8 CPUs available. But with AES-NI enabled, performance was unchanged to within ~2%. I also did the same test on a quad-core ARM CPU using xts-speck128-neon encryption. There performance was usually about 10% better with the unbound workqueue, bringing it closer to the unencrypted speed. The unbound workqueue may be worse in some cases due to worse locality, but I think it's still the better default. dm-crypt uses an unbound workqueue by default too, so this change makes fscrypt match. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:20 +02:00
Mark Rutland	e6d90b8c60	drivers/perf: arm-ccn: don't log to dmesg in event_init [ Upstream commit `1898eb61fb` ] The ARM CCN PMU driver uses dev_warn() to complain about parameters in the user-provided perf_event_attr. This means that under normal operation (e.g. a single invocation of the perf tool), a number of warning messages may be logged to dmesg. Tools may issue multiple syscalls to probe for feature support, and multiple applications (from multiple users) can attempt to open events simultaneously, so this is not very helpful, even if a user happens to have access to dmesg. Worse, this can push important information out of the dmesg ring buffer, and can significantly slow down syscall fuzzers, vastly increasing the time it takes to find critical bugs. Demote the dev_warn() instances to dev_dbg(), as is the case for all other PMU drivers under drivers/perf/. Users who wish to debug PMU event initialisation can enable dynamic debug to receive these messages. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:19 +02:00
Mimi Zohar	81be5529c8	ima: based on policy verify firmware signatures (pre-allocated buffer) [ Upstream commit `fd90bc559b` ] Don't differentiate, for now, between kernel_read_file_id READING_FIRMWARE and READING_FIRMWARE_PREALLOC_BUFFER enumerations. Fixes: `a098ecd` firmware: support loading into a pre-allocated buffer (since 4.8) Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: David Howells <dhowells@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:19 +02:00
Xinming Hu	db6872750d	mwifiex: correct histogram data with appropriate index [ Upstream commit `30bfce0b63` ] Correct snr/nr/rssi data index to avoid possible buffer underflow. Signed-off-by: Xinming Hu <huxm@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:19 +02:00
Michal Vokáč	f14629f347	net: dsa: qca8k: Add support for QCA8334 switch [ Upstream commit `64cf81675a` ] Add support for the four-port variant of the Qualcomm QCA833x switch. Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:19 +02:00
Mika Westerberg	15da894376	PCI: pciehp: Request control of native hotplug only if supported [ Upstream commit `408fec36a1` ] Currently we request control of native PCIe hotplug unconditionally. Native PCIe hotplug events are handled by the pciehp driver, and if it is not enabled those events will be lost. Request control of native PCIe hotplug only if the pciehp driver is enabled, so we will actually handle native PCIe hotplug events. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:19 +02:00
Sandipan Das	0416be409e	bpf: powerpc64: pad function address loads with NOPs [ Upstream commit `4ea69b2fd6` ] For multi-function programs, loading the address of a callee function to a register requires emitting instructions whose count varies from one to five depending on the nature of the address. Since we come to know of the callee's address only before the extra pass, the number of instructions required to load this address may vary from what was previously generated. This can make the JITed image grow or shrink. To avoid this, we should generate a constant five-instruction sequence when loading function addresses by padding the optimized load sequence with NOPs. Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:19 +02:00
Julia Lawall	23d25f9bda	pinctrl: at91-pio4: add missing of_node_put [ Upstream commit `2181636471` ] The device node iterators perform an of_node_get on each iteration, so a jump out of the loop requires an of_node_put. The semantic patch that fixes this problem is as follows (http://coccinelle.lip6.fr): // <smpl> @@ expression root,e; local idexpression child; iterator name for_each_child_of_node; @@ for_each_child_of_node(root, child) { ... when != of_node_put(child) when != e = child + of_node_put(child); ? break; ... } ... when != child // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:19 +02:00
Christophe Leroy	38d96f7888	powerpc/8xx: fix invalid register expression in head_8xx.S [ Upstream commit `e4ccb1dae6` ] New binutils generate the following warning AS arch/powerpc/kernel/head_8xx.o arch/powerpc/kernel/head_8xx.S: Assembler messages: arch/powerpc/kernel/head_8xx.S:916: Warning: invalid register expression This patch fixes it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:19 +02:00
Mathieu Malaterre	e0da21e7e7	powerpc/powermac: Mark variable x as unused [ Upstream commit `5a4b475cf8` ] Since the value of x is never intended to be read, declare it with gcc attribute as unused. Fix warning treated as error with W=1: arch/powerpc/platforms/powermac/bootx_init.c:471:21: error: variable ‘x’ set but not used [-Werror=unused-but-set-variable] Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:19 +02:00
Mathieu Malaterre	0cd9fd8406	powerpc/powermac: Add missing prototype for note_bootable_part() [ Upstream commit `f72cf3f1d4` ] Add a missing prototype for function `note_bootable_part` to silence a warning treated as error with W=1: arch/powerpc/platforms/powermac/setup.c:361:12: error: no previous prototype for ‘note_bootable_part’ [-Werror=missing-prototypes] Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:18 +02:00
Mathieu Malaterre	f851d8ac65	powerpc/chrp/time: Make some functions static, add missing header include [ Upstream commit `b87a358b4a` ] Add a missing include <platforms/chrp/chrp.h>. These functions can all be static, make it so. Fix warnings treated as errors with W=1: arch/powerpc/platforms/chrp/time.c:41:13: error: no previous prototype for ‘chrp_time_init’ [-Werror=missing-prototypes] arch/powerpc/platforms/chrp/time.c:66:5: error: no previous prototype for ‘chrp_cmos_clock_read’ [-Werror=missing-prototypes] arch/powerpc/platforms/chrp/time.c:74:6: error: no previous prototype for ‘chrp_cmos_clock_write’ [-Werror=missing-prototypes] arch/powerpc/platforms/chrp/time.c:86:5: error: no previous prototype for ‘chrp_set_rtc_time’ [-Werror=missing-prototypes] arch/powerpc/platforms/chrp/time.c:130:6: error: no previous prototype for ‘chrp_get_rtc_time’ [-Werror=missing-prototypes] Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:18 +02:00
Mathieu Malaterre	ecd04c80fa	powerpc/32: Add a missing include header [ Upstream commit `c89ca59322` ] The header file <linux/syscalls.h> was missing from the includes. Fix the following warning, treated as error with W=1: arch/powerpc/kernel/pci_32.c:286:6: error: no previous prototype for ‘sys_pciconfig_iobase’ [-Werror=missing-prototypes] Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:18 +02:00
Sven Eckelmann	cf619559ec	ath: Add regulatory mapping for Bahamas [ Upstream commit `699e2302c2` ] The country code is used by the ath to detect the ISO 3166-1 alpha-2 name and to select the correct conformance test limits (CTL) for a country. If the country isn't available and it is still programmed in the EEPROM then it will cause an error and stop the initialization with: Invalid EEPROM contents The current CTL mappings for this country are: * 2.4GHz: ETSI * 5GHz: FCC Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:18 +02:00
Sven Eckelmann	c7cc26414a	ath: Add regulatory mapping for Bermuda [ Upstream commit `9c790f2d23` ] The country code is used by the ath to detect the ISO 3166-1 alpha-2 name and to select the correct conformance test limits (CTL) for a country. If the country isn't available and it is still programmed in the EEPROM then it will cause an error and stop the initialization with: Invalid EEPROM contents The current CTL mappings for this country are: * 2.4GHz: FCC * 5GHz: FCC Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:18 +02:00
Sven Eckelmann	0d50a24c54	ath: Add regulatory mapping for Serbia [ Upstream commit `2a3169a54b` ] The country code is used by the ath to detect the ISO 3166-1 alpha-2 name and to select the correct conformance test limits (CTL) for a country. If the country isn't available and it is still programmed in the EEPROM then it will cause an error and stop the initialization with: Invalid EEPROM contents The current CTL mappings for this country are: * 2.4GHz: ETSI * 5GHz: ETSI Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:18 +02:00
Sven Eckelmann	9d04d93f4b	ath: Add regulatory mapping for Tanzania [ Upstream commit `667ddac574` ] The country code is used by the ath to detect the ISO 3166-1 alpha-2 name and to select the correct conformance test limits (CTL) for a country. If the country isn't available and it is still programmed in the EEPROM then it will cause an error and stop the initialization with: Invalid EEPROM contents The current CTL mappings for this country are: * 2.4GHz: ETSI * 5GHz: FCC Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:18 +02:00
Sven Eckelmann	410639a859	ath: Add regulatory mapping for Uganda [ Upstream commit `1ea3986ad2` ] The country code is used by the ath to detect the ISO 3166-1 alpha-2 name and to select the correct conformance test limits (CTL) for a country. If the country isn't available and it is still programmed in the EEPROM then it will cause an error and stop the initialization with: Invalid EEPROM contents The current CTL mappings for this country are: * 2.4GHz: ETSI * 5GHz: FCC Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:18 +02:00
Sven Eckelmann	3cfd18697d	ath: Add regulatory mapping for APL2_FCCA [ Upstream commit `4f183687e3` ] The regdomain code is used to select the correct conformance test limits (CTL) for a country. If the regdomain code isn't available and it is still programmed in the EEPROM then it will cause an error and stop the initialization with: Invalid EEPROM contents The current CTL mappings for this regdomain code are: * 2.4GHz: FCC * 5GHz: FCC Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:17 +02:00
Sven Eckelmann	31e1b250c0	ath: Add regulatory mapping for APL13_WORLD [ Upstream commit `9ba8df0c52` ] The regdomain code is used to select the correct conformance test limits (CTL) for a country. If the regdomain code isn't available and it is still programmed in the EEPROM then it will cause an error and stop the initialization with: Invalid EEPROM contents The current CTL mappings for this regdomain code are: * 2.4GHz: ETSI * 5GHz: ETSI Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:17 +02:00
Sven Eckelmann	e6cd75968d	ath: Add regulatory mapping for ETSI8_WORLD [ Upstream commit `45faf6e096` ] The regdomain code is used to select the correct conformance test limits (CTL) for a country. If the regdomain code isn't available and it is still programmed in the EEPROM then it will cause an error and stop the initialization with: Invalid EEPROM contents The current CTL mappings for this regdomain code are: * 2.4GHz: ETSI * 5GHz: ETSI Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:17 +02:00
Sven Eckelmann	1d4de3ff87	ath: Add regulatory mapping for FCC3_ETSIC [ Upstream commit `01fb2994a9` ] The regdomain code is used to select the correct conformance test limits (CTL) for a country. If the regdomain code isn't available and it is still programmed in the EEPROM then it will cause an error and stop the initialization with: Invalid EEPROM contents The current CTL mappings for this regdomain code are: * 2.4GHz: ETSI * 5GHz: FCC Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:17 +02:00
Christoph Hellwig	db16571fb7	PCI: Prevent sysfs disable of device while driver is attached [ Upstream commit `6f5cdfa802` ] Manipulating the enable_cnt behind the back of the driver will wreak complete havoc with the kernel state, so disallow it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:17 +02:00
Qu Wenruo	7e51effb7a	btrfs: qgroup: Finish rescan when hit the last leaf of extent tree [ Upstream commit `ff3d27a048` ] Under the following case, qgroup rescan can double account cowed tree blocks: In this case, extent tree only has one tree block. - \| transid=5 last committed=4 \| btrfs_qgroup_rescan_worker() \| \|- btrfs_start_transaction() \| \| transid = 5 \| \|- qgroup_rescan_leaf() \| \|- btrfs_search_slot_for_read() on extent tree \| Get the only extent tree block from commit root (transid = 4). \| Scan it, set qgroup_rescan_progress to the last \| EXTENT/META_ITEM + 1 \| now qgroup_rescan_progress = A + 1. \| \| fs tree get CoWed, new tree block is at A + 16K \| transid 5 get committed - \| transid=6 last committed=5 \| btrfs_qgroup_rescan_worker() \| btrfs_qgroup_rescan_worker() \| \|- btrfs_start_transaction() \| \| transid = 5 \| \|- qgroup_rescan_leaf() \| \|- btrfs_search_slot_for_read() on extent tree \| Get the only extent tree block from commit root (transid = 5). \| scan it using qgroup_rescan_progress (A + 1). \| found new tree block beyond A, and it's fs tree block, \| account it to increase qgroup numbers. - In the above case, tree block A and tree block A + 16K get accounted twice, while qgroup rescan should stop when it has already reached the last leaf, rather than continue using its qgroup_rescan_progress. Such a case can happen by just looping btrfs/017, which with some probability hits this double qgroup accounting problem. Fix it by checking the path to determine if we should finish qgroup rescan, rather than relying on the next loop to exit. Reported-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:17 +02:00
David Sterba	65cb469d02	btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups [ Upstream commit `3d3a2e610e` ] Currently the code assumes that there's an implied barrier by the sequence of code preceding the wakeup, namely the mutex unlock. As Nikolay pointed out: I think this is wrong (not your code) but the original assumption that the RELEASE semantics provided by mutex_unlock is sufficient. According to memory-barriers.txt: Section 'LOCK ACQUISITION FUNCTIONS' states: (2) RELEASE operation implication: Memory operations issued before the RELEASE will be completed before the RELEASE operation has completed. Memory operations issued after the RELEASE may be completed before the RELEASE operation has completed. (I've bolded the may portion) The example given there: As an example, consider the following: A = a; B = b; ACQUIRE C = c; D = d; RELEASE E = e; F = f; The following sequence of events is acceptable: ACQUIRE, {F,A}, E, {C,D}, B, RELEASE So if we assume that C is modifying the flag which the waitqueue is checking, and E is the actual wakeup, then those accesses can be re-ordered... IMHO this code should be considered broken... Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:17 +02:00
Hans Verkuil	9ac47200b5	media: videobuf2-core: don't call memop 'finish' when queueing [ Upstream commit `90b2da89a0` ] When a buffer is queued or requeued in vb2_buffer_done, then don't call the finish memop. In this case the buffer is only returned to vb2, not to userspace. Calling 'finish' here will cause an unbalance when the queue is canceled, since the core will call the same memop again. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:17 +02:00
Ezequiel Garcia	739feeba55	media: tw686x: Fix incorrect vb2_mem_ops GFP flags [ Upstream commit `636757ab6c` ] When the driver is configured in the "memcpy" dma-mode, it uses vb2_vmalloc_memops, which is backed by a SLAB allocator and so shouldn't be using GFP_DMA32. Fix it. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:17 +02:00
Eyal Reizer	a783c6d7a9	wlcore: sdio: check for valid platform device data before suspend [ Upstream commit `6e91d48371` ] The wl pointer can be NULL in case only wlcore_sdio is probed while no WiLink module is successfully probed, as in the case of mounting a wl12xx module while using a device tree file configured with wl18xx related settings. In this case the system was crashing in wl1271_suspend() as platform device data is not set. Make sure the wl pointer is valid before using it. Signed-off-by: Eyal Reizer <eyalr@ti.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:16 +02:00
Ganapathi Bhat	a7a336ed3d	mwifiex: handle race during mwifiex_usb_disconnect [ Upstream commit `b817047ae7` ] Race condition is observed during rmmod of mwifiex_usb: 1. The rmmod thread will call mwifiex_usb_disconnect(), download SHUTDOWN command and do wait_event_interruptible_timeout(), waiting for response. 2. The main thread will handle the response and will do a wake_up_interruptible(), unblocking rmmod thread. 3. On getting unblocked, rmmod thread will make rx_cmd.urb = NULL in mwifiex_usb_free(). 4. The main thread will try to resubmit rx_cmd.urb in mwifiex_usb_submit_rx_urb(), which is NULL. To fix, wait for main thread to complete before calling mwifiex_usb_free(). Signed-off-by: Ganapathi Bhat <gbhat@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:16 +02:00
Vincent Palatin	5e0b8c1732	mfd: cros_ec: Fail early if we cannot identify the EC [ Upstream commit `0dbbf25561` ] If we cannot communicate with the EC chip to detect the protocol version and its features, it's very likely useless to continue. Else we will commit all kind of uninformed mistakes (using the wrong protocol, the wrong buffer size, mixing the EC with other chips). Signed-off-by: Vincent Palatin <vpalatin@chromium.org> Acked-by: Benson Leung <bleung@chromium.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Reviewed-by: Gwendal Grignou <gwendal@chromium.org> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:16 +02:00
Kai Chieh Chuang	32b7d638a0	ASoC: dpcm: fix BE dai not hw_free and shutdown [ Upstream commit `9c0ac70ad2` ] In case one BE is used by two FEs, FE1/FE2: FE1--->BE--> \| FE2----] When FE1/FE2 call dpcm_be_dai_hw_free() together, the BE users count will be 2 (> 1), hence the BE cannot be hw_free'd and its state is left at, e.g., SND_SOC_DPCM_STATE_STOP. When FE1/FE2 later call dpcm_be_dai_shutdown(), the BE is skipped due to the wrong state, leaving the BE neither hw_free'd nor shut down. The BE dai will now be hw_free'd later when calling dpcm_be_dai_shutdown() if it is still in an invalid state. Signed-off-by: KaiChieh Chuang <kaichieh.chuang@mediatek.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:16 +02:00
Jian-Hong Pan	c70cc94075	Bluetooth: btusb: Add a new Realtek 8723DE ID 2ff8:b011 [ Upstream commit `66d9975c5a` ] Without this patch we cannot turn on the Bluetooth adapter on ASUS E406MA. T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=2ff8 ProdID=b011 Rev= 2.00 S: Manufacturer=Realtek S: Product=802.11n WLAN Adapter S: SerialNumber=00e04c000001 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:16 +02:00
Thierry Escande	922c668529	Bluetooth: hci_qca: Fix "Sleep inside atomic section" warning [ Upstream commit `9960521c44` ] This patch fixes the following warning during boot: do not call blocking ops when !TASK_RUNNING; state=1 set at [<(ptrval)>] qca_setup+0x194/0x750 [hci_uart] WARNING: CPU: 2 PID: 1878 at kernel/sched/core.c:6135 __might_sleep+0x7c/0x88 In qca_set_baudrate(), the current task state is set to TASK_UNINTERRUPTIBLE before going to sleep for 300ms. It was then restored to TASK_INTERRUPTIBLE. This patch sets the current task state back to TASK_RUNNING instead. Signed-off-by: Thierry Escande <thierry.escande@linaro.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:15 +02:00
Shaul Triebitz	2e1bfab64c	iwlwifi: pcie: fix race in Rx buffer allocator [ Upstream commit `0f22e40053` ] Make sure the rx_allocator worker is canceled before running the rx_init routine. rx_init frees and re-allocates all rxb's pages. The rx_allocator worker also allocates pages for the used rxb's. Running rx_init and rx_allocator simultaneously causes a kernel panic. Fix that by canceling the work in rx_init. Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:15 +02:00
Daniel Díaz	d4fd1bf83f	selftests/intel_pstate: Improve test, minor fixes [ Upstream commit `e9d33f149f` ] A few changes improve the overall usability of the test: * fix a hard-coded maximum frequency (3300), * don't adjust the CPU frequency if only evaluating results, * fix a comparison for multiple frequencies. A symptom of that last issue looked like this: ./run.sh: line 107: [: too many arguments ./run.sh: line 110: 3099 3099 3100-3100: syntax error in expression (error token is "3099 3100-3100") This happens because a check counts how many different frequencies there are among the CPUs of the system, and after they are tallied another read is performed, which might produce different results. Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org> Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:15 +02:00
Kan Liang	9f4dd60356	perf/x86/intel/uncore: Correct fixed counter index check for NHM [ Upstream commit `d71f11c076` ] For Nehalem and Westmere, there is only one fixed counter for W-Box. There is no index which is bigger than UNCORE_PMC_IDX_FIXED. It is not correct to use >= to check for the fixed counter. This code quality issue will cause problems when a new counter index is introduced. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: acme@kernel.org Cc: eranian@google.com Link: http://lkml.kernel.org/r/1525371913-10597-2-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:15 +02:00
Kan Liang	47fc151cbd	perf/x86/intel/uncore: Correct fixed counter index check in generic code [ Upstream commit `4749f81964` ] There is no index which is bigger than UNCORE_PMC_IDX_FIXED. The only exception is client IMC uncore, which has been specially handled. For generic code, it is not correct to use >= to check for the fixed counter. This code quality issue will cause problems when a new counter index is introduced. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: acme@kernel.org Cc: eranian@google.com Link: http://lkml.kernel.org/r/1525371913-10597-3-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:15 +02:00
Shuah Khan (Samsung OSG)	ce28cf5fb4	usbip: usbip_detach: Fix memory, udev context and udev leak [ Upstream commit `d179f99a65` ] detach_port() fails to call usbip_vhci_driver_close() from its error path after usbip_vhci_detach_device() returns failure, leaking memory allocated in usbip_vhci_driver_open() and holding udev_context and udev references. Fix it to call usbip_vhci_driver_close(). Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:15 +02:00
Chao Yu	9e222d7ca5	f2fs: fix race in between GC and atomic open [ Upstream commit `27319ba404` ] Thread GC thread - f2fs_ioc_start_atomic_write - get_dirty_pages - filemap_write_and_wait_range - f2fs_gc - do_garbage_collect - gc_data_segment - move_data_page - f2fs_is_atomic_file - set_page_dirty - set_inode_flag(, FI_ATOMIC_FILE) Dirty data page can still be generated by GC in race condition as above call stack. This patch adds fi->dio_rwsem[WRITE] in f2fs_ioc_start_atomic_write to avoid such race. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:15 +02:00
Sahitya Tummala	bce7f720f4	f2fs: Fix deadlock in shutdown ioctl [ Upstream commit `60b2b4ee2b` ] f2fs_ioc_shutdown() ioctl gets stuck in the below path when issued with F2FS_GOING_DOWN_FULLSYNC option. __switch_to+0x90/0xc4 percpu_down_write+0x8c/0xc0 freeze_super+0xec/0x1e4 freeze_bdev+0xc4/0xcc f2fs_ioctl+0xc0c/0x1ce0 f2fs_compat_ioctl+0x98/0x1f0 Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:15 +02:00
Chao Yu	570f12a8b6	f2fs: fix to wait page writeback during revoking atomic write [ Upstream commit `e5e5732d81` ] After revoking atomic write, related LBA can be reused by others, so we need to wait page writeback before reusing the LBA, in order to avoid interference between old atomic written in-flight IO and new IO. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:15 +02:00
Chao Yu	b7ea2b8616	f2fs: fix to don't trigger writeback during recovery [ Upstream commit `64c74a7ab5` ] - f2fs_fill_super - recover_fsync_data - recover_data - del_fsync_inode - iput - iput_final - write_inode_now - f2fs_write_inode - f2fs_balance_fs - f2fs_balance_fs_bg - sync_dirty_inodes With data_flush mount option, during recovery, in order to avoid entering above writeback flow, let's detect recovery status and do skip in f2fs_balance_fs_bg. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:14 +02:00
Chao Yu	4e6b7aad50	f2fs: fix error path of move_data_page [ Upstream commit `14a28559f4` ] This patch fixes error path of move_data_page: - clear cold data flag if it fails to write page. - redirty page for non-ENOMEM case. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:14 +02:00
Anatoly Pugachev	c9ab0cefc5	disable loading f2fs module on PAGE_SIZE > 4KB [ Upstream commit `4071e67cff` ] The following patch disables loading of the f2fs module on architectures which have PAGE_SIZE > 4096, since it is impossible to mount f2fs on such architectures. The log messages are: mount: /mnt: wrong fs type, bad option, bad superblock on /dev/vdiskb1, missing codepage or helper program, or other error. /dev/vdiskb1: F2FS filesystem, UUID=1d8b9ca4-2389-4910-af3b-10998969f09c, volume name "" May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid page_cache_size (8192), supports only 4KB May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Can't find valid F2FS filesystem in 1th superblock May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid page_cache_size (8192), supports only 4KB May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Can't find valid F2FS filesystem in 2th superblock May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid page_cache_size (8192), supports only 4KB which was introduced by git commit `5c9b469295` tested on git kernel 4.17.0-rc6-00309-gec30dcf7f425 with patch applied: modprobe: ERROR: could not insert 'f2fs': Invalid argument May 28 01:40:28 v215 kernel: F2FS not supported on PAGE_SIZE(8192) != 4096 Signed-off-by: Anatoly Pugachev <matorola@gmail.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:14 +02:00
Trond Myklebust	b05c460a0c	pnfs: Don't release the sequence slot until we've processed layoutget on open [ Upstream commit `ae55e59da0` ] If the server recalls the layout that was just handed out, we risk hitting a race as described in RFC5661 Section 2.10.6.3 unless we ensure that we release the sequence slot after processing the LAYOUTGET operation that was sent as part of the OPEN compound. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:14 +02:00
Alexey Kodanev	759fb7f94f	netfilter: nf_tables: check msg_type before nft_trans_set(trans) [ Upstream commit `9c7f96fd77` ] The patch moves the "trans->msg_type == NFT_MSG_NEWSET" check before using nft_trans_set(trans). Otherwise we can get out of bounds read. For example, KASAN reported the one when running 0001_cache_handling_0 nft test. In this case "trans->msg_type" was NFT_MSG_NEWTABLE: [75517.177808] BUG: KASAN: slab-out-of-bounds in nft_set_lookup_global+0x22f/0x270 [nf_tables] [75517.279094] Read of size 8 at addr ffff881bdb643fc8 by task nft/7356 ... [75517.375605] CPU: 26 PID: 7356 Comm: nft Tainted: G E 4.17.0-rc7.1.x86_64 #1 [75517.489587] Hardware name: Oracle Corporation SUN SERVER X4-2 [75517.618129] Call Trace: [75517.648821] dump_stack+0xd1/0x13b [75517.691040] ? show_regs_print_info+0x5/0x5 [75517.742519] ? kmsg_dump_rewind_nolock+0xf5/0xf5 [75517.799300] ? lock_acquire+0x143/0x310 [75517.846738] print_address_description+0x85/0x3a0 [75517.904547] kasan_report+0x18d/0x4b0 [75517.949892] ? nft_set_lookup_global+0x22f/0x270 [nf_tables] [75518.019153] ? nft_set_lookup_global+0x22f/0x270 [nf_tables] [75518.088420] ? nft_set_lookup_global+0x22f/0x270 [nf_tables] [75518.157689] nft_set_lookup_global+0x22f/0x270 [nf_tables] [75518.224869] nf_tables_newsetelem+0x1a5/0x5d0 [nf_tables] [75518.291024] ? nft_add_set_elem+0x2280/0x2280 [nf_tables] [75518.357154] ? nla_parse+0x1a5/0x300 [75518.401455] ? kasan_kmalloc+0xa6/0xd0 [75518.447842] nfnetlink_rcv+0xc43/0x1bdf [nfnetlink] [75518.507743] ? nfnetlink_rcv+0x7a5/0x1bdf [nfnetlink] [75518.569745] ? nfnl_err_reset+0x3c0/0x3c0 [nfnetlink] [75518.631711] ? lock_acquire+0x143/0x310 [75518.679133] ? netlink_deliver_tap+0x9b/0x1070 [75518.733840] ? kasan_unpoison_shadow+0x31/0x40 [75518.788542] netlink_unicast+0x45d/0x680 [75518.837111] ? __isolate_free_page+0x890/0x890 [75518.891913] ? netlink_attachskb+0x6b0/0x6b0 [75518.944542] netlink_sendmsg+0x6fa/0xd30 [75518.993107] ? 
netlink_unicast+0x680/0x680 [75519.043758] ? netlink_unicast+0x680/0x680 [75519.094402] sock_sendmsg+0xd9/0x160 [75519.138810] ___sys_sendmsg+0x64d/0x980 [75519.186234] ? copy_msghdr_from_user+0x350/0x350 [75519.243118] ? lock_downgrade+0x650/0x650 [75519.292738] ? do_raw_spin_unlock+0x5d/0x250 [75519.345456] ? _raw_spin_unlock+0x24/0x30 [75519.395065] ? __handle_mm_fault+0xbde/0x3410 [75519.448830] ? sock_setsockopt+0x3d2/0x1940 [75519.500516] ? __lock_acquire.isra.25+0xdc/0x19d0 [75519.558448] ? lock_downgrade+0x650/0x650 [75519.608057] ? __audit_syscall_entry+0x317/0x720 [75519.664960] ? __fget_light+0x58/0x250 [75519.711325] ? __sys_sendmsg+0xde/0x170 [75519.758850] __sys_sendmsg+0xde/0x170 [75519.804193] ? __ia32_sys_shutdown+0x90/0x90 [75519.856725] ? syscall_trace_enter+0x897/0x10e0 [75519.912354] ? trace_event_raw_event_sys_enter+0x920/0x920 [75519.979432] ? __audit_syscall_entry+0x720/0x720 [75520.036118] do_syscall_64+0xa3/0x3d0 [75520.081248] ? prepare_exit_to_usermode+0x47/0x1d0 [75520.139904] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [75520.201680] RIP: 0033:0x7fc153320ba0 [75520.245772] RSP: 002b:00007ffe294c3638 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [75520.337708] RAX: ffffffffffffffda RBX: 00007ffe294c4820 RCX: 00007fc153320ba0 [75520.424547] RDX: 0000000000000000 RSI: 00007ffe294c46b0 RDI: 0000000000000003 [75520.511386] RBP: 00007ffe294c47b0 R08: 0000000000000004 R09: 0000000002114090 [75520.598225] R10: 00007ffe294c30a0 R11: 0000000000000246 R12: 00007ffe294c3660 [75520.684961] R13: 0000000000000001 R14: 00007ffe294c3650 R15: 0000000000000001 [75520.790946] Allocated by task 7356: [75520.833994] kasan_kmalloc+0xa6/0xd0 [75520.878088] __kmalloc+0x189/0x450 [75520.920107] nft_trans_alloc_gfp+0x20/0x190 [nf_tables] [75520.983961] nf_tables_newtable+0xcd0/0x1bd0 [nf_tables] [75521.048857] nfnetlink_rcv+0xc43/0x1bdf [nfnetlink] [75521.108655] netlink_unicast+0x45d/0x680 [75521.157013] netlink_sendmsg+0x6fa/0xd30 [75521.205271] 
sock_sendmsg+0xd9/0x160 [75521.249365] ___sys_sendmsg+0x64d/0x980 [75521.296686] __sys_sendmsg+0xde/0x170 [75521.341822] do_syscall_64+0xa3/0x3d0 [75521.386957] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [75521.467867] Freed by task 23454: [75521.507804] __kasan_slab_free+0x132/0x180 [75521.558137] kfree+0x14d/0x4d0 [75521.596005] free_rt_sched_group+0x153/0x280 [75521.648410] sched_autogroup_create_attach+0x19a/0x520 [75521.711330] ksys_setsid+0x2ba/0x400 [75521.755529] __ia32_sys_setsid+0xa/0x10 [75521.802850] do_syscall_64+0xa3/0x3d0 [75521.848090] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [75521.929000] The buggy address belongs to the object at ffff881bdb643f80 which belongs to the cache kmalloc-96 of size 96 [75522.079797] The buggy address is located 72 bytes inside of 96-byte region [ffff881bdb643f80, ffff881bdb643fe0) [75522.221234] The buggy address belongs to the page: [75522.280100] page:ffffea006f6d90c0 count:1 mapcount:0 mapping:0000000000000000 index:0x0 [75522.377443] flags: 0x2fffff80000100(slab) [75522.426956] raw: 002fffff80000100 0000000000000000 0000000000000000 0000000180200020 [75522.521275] raw: ffffea006e6fafc0 0000000c0000000c ffff881bf180f400 0000000000000000 [75522.615601] page dumped because: kasan: bad access detected Fixes: `37a9cc5255` ("netfilter: nf_tables: add generation mask to sets") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:14 +02:00
Leon Romanovsky	efb4dd6ab9	RDMA/mad: Convert BUG_ONs to error flows [ Upstream commit `2468b82d69` ] Let's perform checks in-place instead of BUG_ONs. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:14 +02:00
Nicholas Piggin	ea8e4ff38f	powerpc/64s: Fix compiler store ordering to SLB shadow area [ Upstream commit `926bc2f100` ] The stores to update the SLB shadow area must be made as they appear in the C code, so that the hypervisor does not see an entry with mismatched vsid and esid. Use WRITE_ONCE for this. GCC has been observed to elide the first store to esid in the update, which means that if the hypervisor interrupts the guest after storing to vsid, it could see an entry with old esid and new vsid, which may possibly result in memory corruption. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:14 +02:00
Stewart Smith	c3e347251c	hvc_opal: don't set tb_ticks_per_usec in udbg_init_opal_common() [ Upstream commit `447808bf50` ] time_init() will set up tb_ticks_per_usec based on reality. time_init() is called after udbg_init_opal_common() during boot. from arch/powerpc/kernel/time.c: unsigned long tb_ticks_per_usec = 100; /* sane default */ Currently, all powernv systems have a timebase frequency of 512mhz (512000000/1000000 == 0x200) - although there's nothing written down anywhere that I can find saying that we couldn't make that different based on the requirements in the ISA. So, we've been (accidentally) thwacking the (currently) correct (for powernv at least) value for tb_ticks_per_usec earlier than we otherwise would have. The "sane default" seems to be adequate for our purposes between udbg_init_opal_common() and time_init() being called, and if it isn't, then we should probably be setting it somewhere that isn't hvc_opal.c! Signed-off-by: Stewart Smith <stewart@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:14 +02:00
Sam Bobroff	ee245de4b3	powerpc/eeh: Fix use-after-release of EEH driver [ Upstream commit `46d4be41b9` ] Correct two cases where eeh_pcid_get() is used to reference the driver's module but the reference is dropped before the driver pointer is used. In eeh_rmv_device() also refactor a little so that only two calls to eeh_pcid_put() are needed, rather than three and the reference isn't taken at all if it wasn't needed. Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:14 +02:00
Cong Wang	73298a828c	infiniband: fix a possible use-after-free bug [ Upstream commit `cb2595c139` ] ucma_process_join() will free the newly allocated "mc" struct if there is any error after that, especially in copy_to_user(). But in parallel, ucma_leave_multicast() could find this "mc" through idr_find() before ucma_process_join() frees it, since it is already published. So "mc" could be used in ucma_leave_multicast() after it has been allocated and freed in ucma_process_join(), since we don't refcnt it. Fix this by separating "publish" from ID allocation, so that we can get an ID first and publish it later after copy_to_user(). Fixes: `c8f6a362bf` ("RDMA/cma: Add multicast communication support") Reported-by: Noam Rathaus <noamr@beyondsecurity.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:13 +02:00
Jozsef Kadlecsik	6e02c062e9	netfilter: ipset: List timing out entries with "timeout 1" instead of zero [ Upstream commit `bd975e6914` ] When listing sets with timeout support, there's a probability that entries which are just timing out are listed/saved with a "0" timeout value. However, when restoring the saved list, the zero timeout value means permanent elements. The new behaviour is that timing out entries are listed with "timeout 1" instead of zero. Fixes netfilter bugzilla #1258. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:13 +02:00
Jiri Olsa	5629505121	perf tools: Fix pmu events parsing rule [ Upstream commit `ceac7b79df` ] Currently all event parsing failures end up in the event_pmu rule and display misleading help like: $ perf stat -e inst kill event syntax error: 'inst' \___ Cannot find PMU `inst'. Missing kernel support? ... The reason is that the event_pmu rule is too strong and also matches a single string. Change it to force the '/' separators to be part of the rule, so that the proper error is reported: $ perf stat -e inst kill event syntax error: 'inst' \___ parser error Run 'perf list' for a list of valid events ... Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180605121416.31645-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:13 +02:00
Alexandre Belloni	fda8caa9cb	rtc: ensure rtc_set_alarm fails when alarms are not supported [ Upstream commit `abfdff44bc` ] When using RTC_ALM_SET or RTC_WKALM_SET with rtc_wkalrm.enabled not set, rtc_timer_enqueue() is not called and rtc_set_alarm() may succeed but the subsequent RTC_AIE_ON ioctl will fail. RTC_ALM_READ would also fail in that case. Ensure rtc_set_alarm() fails when alarms are not supported to avoid letting programs think the alarms are working for a particular RTC when they are not. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:13 +02:00
Mathieu Malaterre	c99dbd9572	mm/slub.c: add __printf verification to slab_err() [ Upstream commit `a38965bf94` ] __printf is useful to verify format and arguments. Remove the following warning (with W=1): mm/slub.c:721:2: warning: function might be possible candidate for `gnu_printf' format attribute [-Wsuggest-attribute=format] Link: http://lkml.kernel.org/r/20180505200706.19986-1-malat@debian.org Signed-off-by: Mathieu Malaterre <malat@debian.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:13 +02:00
Chintan Pandya	e18d3280da	mm: vmalloc: avoid racy handling of debugobjects in vunmap [ Upstream commit `f3c01d2f3a` ] Currently, the __vunmap flow is: 1) Release the VM area 2) Free the debug objects corresponding to that vm area. This leaves a race window open: 1) Release the VM area 1.5) Some other client gets the same vm area 1.6) This client allocates new debug objects on the same vm area 2) Free the debug objects corresponding to this vm area. Here, we actually free the 'other' client's debug objects. Fix this by freeing the debug objects first and then releasing the VM area. Link: http://lkml.kernel.org/r/1523961828-9485-2-git-send-email-cpandya@codeaurora.org Signed-off-by: Chintan Pandya <cpandya@codeaurora.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yisheng Xie <xieyisheng1@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:13 +02:00
Geert Uytterhoeven	c6e8116307	vfio: platform: Fix reset module leak in error path [ Upstream commit `28a6838788` ] If the IOMMU group setup fails, the reset module is not released. Fixes: `b5add544d6` ("vfio, platform: make reset driver a requirement by default") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:13 +02:00
Scott Mayhew	8bccc6c902	nfsd: fix potential use-after-free in nfsd4_decode_getdeviceinfo [ Upstream commit `3171822fdc` ] When running a fuzz tester against a KASAN-enabled kernel, the following splat periodically occurs. The problem occurs when the test sends a GETDEVICEINFO request with a malformed xdr array (size but no data) for gdia_notify_types and the array size is > 0x3fffffff, which results in an overflow in the value of nbytes which is passed to read_buf(). If the array size is 0x40000000, 0x80000000, or 0xc0000000, then after the overflow occurs, the value of nbytes is 0, and when that happens the pointer returned by read_buf() points to the end of the xdr data (i.e. argp->end) when really it should be returning NULL. Fix this by returning NFS4ERR_BAD_XDR if the array size is > 1000 (this value is arbitrary, but it's the same threshold used by nfsd4_decode_bitmap()... it could really be any value >= 1 since it's expected to get at most a single bitmap in gdia_notify_types). [ 119.256854] ================================================================== [ 119.257611] BUG: KASAN: use-after-free in nfsd4_decode_getdeviceinfo+0x5a4/0x5b0 [nfsd] [ 119.258422] Read of size 4 at addr ffff880113ada000 by task nfsd/538 [ 119.259146] CPU: 0 PID: 538 Comm: nfsd Not tainted 4.17.0+ #1 [ 119.259662] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014 [ 119.261202] Call Trace: [ 119.262265] dump_stack+0x71/0xab [ 119.263371] print_address_description+0x6a/0x270 [ 119.264609] kasan_report+0x258/0x380 [ 119.265854] ? nfsd4_decode_getdeviceinfo+0x5a4/0x5b0 [nfsd] [ 119.267291] nfsd4_decode_getdeviceinfo+0x5a4/0x5b0 [nfsd] [ 119.268549] ? nfs4svc_decode_compoundargs+0xa5b/0x13c0 [nfsd] [ 119.269873] ? nfsd4_decode_sequence+0x490/0x490 [nfsd] [ 119.271095] nfs4svc_decode_compoundargs+0xa5b/0x13c0 [nfsd] [ 119.272393] ? 
nfsd4_release_compoundargs+0x1b0/0x1b0 [nfsd] [ 119.273658] nfsd_dispatch+0x183/0x850 [nfsd] [ 119.274918] svc_process+0x161c/0x31a0 [sunrpc] [ 119.276172] ? svc_printk+0x190/0x190 [sunrpc] [ 119.277386] ? svc_xprt_release+0x451/0x680 [sunrpc] [ 119.278622] nfsd+0x2b9/0x430 [nfsd] [ 119.279771] ? nfsd_destroy+0x1c0/0x1c0 [nfsd] [ 119.281157] kthread+0x2db/0x390 [ 119.282347] ? kthread_create_worker_on_cpu+0xc0/0xc0 [ 119.283756] ret_from_fork+0x35/0x40 [ 119.286041] Allocated by task 436: [ 119.287525] kasan_kmalloc+0xa0/0xd0 [ 119.288685] kmem_cache_alloc+0xe9/0x1f0 [ 119.289900] get_empty_filp+0x7b/0x410 [ 119.291037] path_openat+0xca/0x4220 [ 119.292242] do_filp_open+0x182/0x280 [ 119.293411] do_sys_open+0x216/0x360 [ 119.294555] do_syscall_64+0xa0/0x2f0 [ 119.295721] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 119.298068] Freed by task 436: [ 119.299271] __kasan_slab_free+0x130/0x180 [ 119.300557] kmem_cache_free+0x78/0x210 [ 119.301823] rcu_process_callbacks+0x35b/0xbd0 [ 119.303162] __do_softirq+0x192/0x5ea [ 119.305443] The buggy address belongs to the object at ffff880113ada000 which belongs to the cache filp of size 256 [ 119.308556] The buggy address is located 0 bytes inside of 256-byte region [ffff880113ada000, ffff880113ada100) [ 119.311376] The buggy address belongs to the page: [ 119.312728] page:ffffea00044eb680 count:1 mapcount:0 mapping:0000000000000000 index:0xffff880113ada780 [ 119.314428] flags: 0x17ffe000000100(slab) [ 119.315740] raw: 0017ffe000000100 0000000000000000 ffff880113ada780 00000001000c0001 [ 119.317379] raw: ffffea0004553c60 ffffea00045c11e0 ffff88011b167e00 0000000000000000 [ 119.319050] page dumped because: kasan: bad access detected [ 119.321652] Memory state around the buggy address: [ 119.322993] ffff880113ad9f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 119.324515] ffff880113ad9f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 119.326087] >ffff880113ada000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 
119.327547] ^ [ 119.328730] ffff880113ada080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 119.330218] ffff880113ada100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb [ 119.331740] ================================================================== Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:13 +02:00
Zhouyang Jia	ca08131ee7	ALSA: fm801: add error handling for snd_ctl_add [ Upstream commit `ef1ffbe788` ] When snd_ctl_add fails, the lack of error-handling code may cause unexpected results. This patch adds error-handling code after calling snd_ctl_add. Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:13 +02:00
Zhouyang Jia	9f9e506d8e	ALSA: emu10k1: add error handling for snd_ctl_add [ Upstream commit `6d531e7b97` ] When snd_ctl_add fails, the lack of error-handling code may cause unexpected results. This patch adds error-handling code after calling snd_ctl_add. Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:13 +02:00
Juergen Gross	acd9aba8e4	xen/netfront: raise max number of slots in xennet_get_responses() [ Upstream commit `57f230ab04` ] The max number of slots used in xennet_get_responses() is set to MAX_SKB_FRAGS + (rx->status <= RX_COPY_THRESHOLD). In old kernel-xen MAX_SKB_FRAGS was 18, while nowadays it is 17. This difference is resulting in frequent messages "too many slots" and a reduced network throughput for some workloads (factor 10 below that of a kernel-xen based guest). Replacing MAX_SKB_FRAGS by XEN_NETIF_NR_SLOTS_MIN for calculation of the max number of slots to use solves that problem (tests showed no more messages "too many slots" and throughput was as high as with the kernel-xen based guest system). Replace MAX_SKB_FRAGS-2 by XEN_NETIF_NR_SLOTS_MIN-1 in netfront_tx_slot_available() for making it clearer what is really being tested without actually modifying the tested value. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:12 +02:00
Mark Rutland	31ad104de6	kcov: ensure irq code sees a valid area [ Upstream commit `c9484b986e` ] Patch series "kcov: fix unexpected faults". These patches fix a few issues where KCOV code could trigger recursive faults, discovered while debugging a patch enabling KCOV for arch/arm: * On CONFIG_PREEMPT kernels, there's a small race window where __sanitizer_cov_trace_pc() can see a bogus kcov_area. * Lazy faulting of the vmalloc area can cause mutual recursion between fault handling code and __sanitizer_cov_trace_pc(). * During the context switch, switching the mm can cause the kcov_area to be transiently unmapped. These are prerequisites for enabling KCOV on arm, but the issues themselves are generic -- we just happen to avoid them by chance rather than design on x86-64 and arm64. This patch (of 3): For kernels built with CONFIG_PREEMPT, some C code may execute before or after the interrupt handler, while the hardirq count is zero. In these cases, in_task() can return true. A task can be interrupted in the middle of a KCOV_DISABLE ioctl while it resets the task's kcov data via kcov_task_init(). Instrumented code executed during this period will call __sanitizer_cov_trace_pc(), and as in_task() returns true, will inspect t->kcov_mode before trying to write to t->kcov_area. In kcov_task_init() we update t->kcov_{mode,area,size} with plain stores, which may be re-ordered, torn, etc. Thus __sanitizer_cov_trace_pc() may see bogus values for any of these fields, and may attempt to write to memory which is not mapped. Let's avoid this by using WRITE_ONCE() to set t->kcov_mode, with a barrier() to ensure this is ordered before we clear t->kcov_{area,size}. This ensures that any code executed while kcov_task_init() is preempted will either see valid values for t->kcov_{area,size}, or will see that t->kcov_mode is KCOV_MODE_DISABLED, and bail out without touching t->kcov_area. 
Link: http://lkml.kernel.org/r/20180504135535.53744-2-mark.rutland@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:12 +02:00
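The ordering described in the kcov commit above can be sketched in userspace C. This is only an illustration under stated assumptions: `WRITE_ONCE()` and `barrier()` are re-created here as simple stand-ins (GCC/Clang extensions), and `struct kcov_task_sketch`, the `*_SKETCH` constants and `kcov_task_init_sketch()` are hypothetical names, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel macros (assumptions for illustration). */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define barrier() __asm__ __volatile__("" ::: "memory")

enum { KCOV_MODE_DISABLED_SKETCH = 0, KCOV_MODE_TRACE_SKETCH = 1 };

/* Hypothetical subset of the per-task kcov state. */
struct kcov_task_sketch {
	int kcov_mode;
	unsigned int kcov_size;
	void *kcov_area;
};

static void kcov_task_init_sketch(struct kcov_task_sketch *t)
{
	assert(t != NULL);
	/* Publish "disabled" first, with a single untorn store. */
	WRITE_ONCE(t->kcov_mode, KCOV_MODE_DISABLED_SKETCH);
	/* Compiler barrier: the stores below may not be hoisted above it. */
	barrier();
	/* Only now tear down the area that instrumented code might touch. */
	t->kcov_size = 0;
	t->kcov_area = NULL;
}
```

Code that checks `kcov_mode` before touching `kcov_area` on the same CPU then sees either the old, still valid area or the disabled mode.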
Antti Seppälä	7ff1861f49	usb: dwc2: Fix DMA alignment to start at allocated boundary commit `56406e017a` upstream. The commit `3bc04e28a0` ("usb: dwc2: host: Get aligned DMA in a more supported way") introduced a common way to align DMA allocations. The code in the commit aligns the struct dma_aligned_buffer but the actual DMA address pointed by data[0] gets aligned to an offset from the allocated boundary by the kmalloc_ptr and the old_xfer_buffer pointers. This is against the recommendation in Documentation/DMA-API.txt which states: Therefore, it is recommended that driver writers who don't take special care to determine the cache line size at run time only map virtual regions that begin and end on page boundaries (which are guaranteed also to be cache line boundaries). The effect of this is that architectures with non-coherent DMA caches may run into memory corruption or kernel crashes with Unhandled kernel unaligned accesses exceptions. Fix the alignment by positioning the DMA area in front of the allocation and use memory at the end of the area for storing the original transfer_buffer pointer. This may have the added benefit of increased performance as the DMA area is now fully aligned on all architectures. Tested with Lantiq xRX200 (MIPS) and RPi Model B Rev 2 (ARM). Fixes: `3bc04e28a0` ("usb: dwc2: host: Get aligned DMA in a more supported way") Cc: <stable@vger.kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> [ Antti: backported to 4.9: edited difference in whitespace ] Signed-off-by: Antti Seppälä <a.seppala@gmail.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:12 +02:00
Johannes Weiner	e8d77bd71e	arm64: fix vmemmap BUILD_BUG_ON() triggering on !vmemmap setups commit `7b0eb6b41a` upstream. Arnd reports the following arm64 randconfig build error with the PSI patches that add another page flag: /git/arm-soc/arch/arm64/mm/init.c: In function 'mem_init': /git/arm-soc/include/linux/compiler.h:357:38: error: call to '__compiletime_assert_618' declared with attribute error: BUILD_BUG_ON failed: sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT) The additional page flag causes other information stored in page->flags to get bumped into their own struct page member: #if SECTIONS_WIDTH+ZONES_WIDTH+NODES_SHIFT+LAST_CPUPID_SHIFT <= BITS_PER_LONG - NR_PAGEFLAGS #define LAST_CPUPID_WIDTH LAST_CPUPID_SHIFT #else #define LAST_CPUPID_WIDTH 0 #endif #if defined(CONFIG_NUMA_BALANCING) && LAST_CPUPID_WIDTH == 0 #define LAST_CPUPID_NOT_IN_PAGE_FLAGS #endif which in turn causes the struct page size to exceed the size set in STRUCT_PAGE_MAX_SHIFT. This value is an estimate used to size the VMEMMAP page array according to address space and struct page size. However, the check is performed - and triggers here - on a !VMEMMAP config, which consumes an additional 22 page bits for the sparse section id. When VMEMMAP is enabled, those bits are returned, cpupid doesn't need its own member, and the page passes the VMEMMAP check. Restrict that check to the situation it was meant to check: that we are sizing the VMEMMAP page array correctly. Says Arnd: Further experiments show that the build error already existed before, but was only triggered with larger values of CONFIG_NR_CPU and/or CONFIG_NODES_SHIFT that might be used in actual configurations but not in randconfig builds. With longer CPU and node masks, I could recreate the problem with kernels as old as linux-4.7 when arm64 NUMA support got added. 
Reported-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Arnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org Fixes: `1a2db30034` ("arm64, numa: Add NUMA support for arm64 platforms.") Fixes: `3e1907d5bf` ("arm64: mm: move vmemmap region right below the linear region") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:12 +02:00
Steven Rostedt (VMware)	b985a7303d	tracing: Quiet gcc warning about maybe unused link variable commit `2519c1bbe3` upstream. Commit `57ea2a34ad` ("tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure") added an if statement that depends on another if statement that gcc doesn't see will initialize the "link" variable and gives the warning: "warning: 'link' may be used uninitialized in this function" It is really a false positive, but to quiet the warning, and also to make sure that it never actually is used uninitialized, initialize the "link" variable to NULL and add an if (!WARN_ON_ONCE(!link)) where the compiler thinks it could be used uninitialized. Cc: stable@vger.kernel.org Fixes: `57ea2a34ad` ("tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:12 +02:00
Artem Savkov	987e425ad3	tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure commit `57ea2a34ad` upstream. If enable_trace_kprobe fails to enable the probe in enable_k(ret)probe it returns an error, but does not unset the tp flags it set previously. This results in a probe being considered enabled and failures like being unable to remove the probe through kprobe_events file since probes_open() expects every probe to be disabled. Link: http://lkml.kernel.org/r/20180725102826.8300-1-asavkov@redhat.com Link: http://lkml.kernel.org/r/20180725142038.4765-1-asavkov@redhat.com Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: `41a7dd420c` ("tracing/kprobes: Support ftrace_event_file base multibuffer") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:12 +02:00
Snild Dolkow	b38f8292f0	kthread, tracing: Don't expose half-written comm when creating kthreads commit `3e536e222f` upstream. There is a window for racing when printing directly to task->comm, allowing other threads to see a non-terminated string. The vsnprintf function fills the buffer, counts the truncated chars, then finally writes the \0 at the end. creator other vsnprintf: fill (not terminated) count the rest trace_sched_waking(p): ... memcpy(comm, p->comm, TASK_COMM_LEN) write \0 The consequences depend on how 'other' uses the string. In our case, it was copied into the tracing system's saved cmdlines, a buffer of adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be): crash-arm64> x/1024s savedcmd->saved_cmdlines \| grep 'evenk' 0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12" ...and a strcpy out of there would cause stack corruption: [224761.522292] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffff9bf9783c78 crash-arm64> kbt \| grep 'comm\\|trace_print_context' #6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396) comm (char [16]) = "irq/497-pwr_even" crash-arm64> rd 0xffffffd4d0e17d14 8 ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_ ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16: ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`.. ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`.......... The workaround in `e09e28671` (use strlcpy in __trace_find_cmdline) was likely needed because of this same bug. Solved by vsnprintf:ing to a local buffer, then using set_task_comm(). This way, there won't be a window where comm is not terminated. 
Link: http://lkml.kernel.org/r/20180726071539.188015-1-snild@sony.com Cc: stable@vger.kernel.org Fixes: `bc0c38d139` ("ftrace: latency tracer infrastructure") Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Snild Dolkow <snild@sony.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:12 +02:00
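The fix described in the kthread commit above (format into a local buffer, then publish) can be sketched in userspace C. This is a minimal sketch, not the kernel code: `struct comm_task_sketch` and `set_comm_sketch()` are hypothetical names, and the locking that the kernel's set_task_comm() performs is left out.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16

/* Hypothetical stand-in for the part of the task struct that holds comm. */
struct comm_task_sketch {
	char comm[TASK_COMM_LEN];
};

static void set_comm_sketch(struct comm_task_sketch *t, const char *fmt, ...)
{
	char name[TASK_COMM_LEN];
	va_list args;

	assert(t != NULL && fmt != NULL);

	/* Format into a private buffer; truncation and the final NUL happen here. */
	va_start(args, fmt);
	vsnprintf(name, sizeof(name), fmt, args);
	va_end(args);

	/*
	 * Copy at most TASK_COMM_LEN bytes including the terminator. The last
	 * byte of comm only ever receives '\0', so a concurrent reader that
	 * copies TASK_COMM_LEN bytes still finds a terminator, provided comm
	 * was terminated to begin with.
	 */
	memcpy(t->comm, name, strlen(name) + 1);
}
```

Formatting directly into `t->comm` would instead let the buffer be completely filled for a moment before the trailing NUL is written, which is the window the commit closes.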
Steven Rostedt (VMware)	a9737bb91c	tracing: Fix possible double free in event_enable_trigger_func() commit `15cc78644d` upstream. There was a case that triggered a double free in event_trigger_callback() due to the called reg() function freeing the trigger_data and then it getting freed again by the error return by the caller. The solution there was to up the trigger_data ref count. Code inspection found that event_enable_trigger_func() has the same issue, but is not as easy to trigger (requires harder to trigger failures). It needs to be solved slightly differently as it needs more to clean up when the reg() function fails. Link: http://lkml.kernel.org/r/20180725124008.7008e586@gandalf.local.home Cc: stable@vger.kernel.org Fixes: `7862ad1846` ("tracing: Add 'enable_event' and 'disable_event' event trigger commands") Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:12 +02:00
Steven Rostedt (VMware)	2a0ce1ff08	tracing: Fix double free of event_trigger_data commit `1863c38725` upstream. Running the following: # cd /sys/kernel/debug/tracing # echo 500000 > buffer_size_kb [ Or some other number that takes up most of memory ] # echo snapshot > events/sched/sched_switch/trigger Triggers the following bug: ------------[ cut here ]------------ kernel BUG at mm/slub.c:296! invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI CPU: 6 PID: 6878 Comm: bash Not tainted 4.18.0-rc6-test+ #1066 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:kfree+0x16c/0x180 Code: 05 41 0f b6 72 51 5b 5d 41 5c 4c 89 d7 e9 ac b3 f8 ff 48 89 d9 48 89 da 41 b8 01 00 00 00 5b 5d 41 5c 4c 89 d6 e9 f4 f3 ff ff <0f> 0b 0f 0b 48 8b 3d d9 d8 f9 00 e9 c1 fe ff ff 0f 1f 40 00 0f 1f RSP: 0018:ffffb654436d3d88 EFLAGS: 00010246 RAX: ffff91a9d50f3d80 RBX: ffff91a9d50f3d80 RCX: ffff91a9d50f3d80 RDX: 00000000000006a4 RSI: ffff91a9de5a60e0 RDI: ffff91a9d9803500 RBP: ffffffff8d267c80 R08: 00000000000260e0 R09: ffffffff8c1a56be R10: fffff0d404543cc0 R11: 0000000000000389 R12: ffffffff8c1a56be R13: ffff91a9d9930e18 R14: ffff91a98c0c2890 R15: ffffffff8d267d00 FS: 00007f363ea64700(0000) GS:ffff91a9de580000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055c1cacc8e10 CR3: 00000000d9b46003 CR4: 00000000001606e0 Call Trace: event_trigger_callback+0xee/0x1d0 event_trigger_write+0xfc/0x1a0 __vfs_write+0x33/0x190 ? handle_mm_fault+0x115/0x230 ? 
_cond_resched+0x16/0x40 vfs_write+0xb0/0x190 ksys_write+0x52/0xc0 do_syscall_64+0x5a/0x160 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f363e16ab50 Code: 73 01 c3 48 8b 0d 38 83 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 79 db 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 1e e3 01 00 48 89 04 24 RSP: 002b:00007fff9a4c6378 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f363e16ab50 RDX: 0000000000000009 RSI: 000055c1cacc8e10 RDI: 0000000000000001 RBP: 000055c1cacc8e10 R08: 00007f363e435740 R09: 00007f363ea64700 R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000009 R13: 0000000000000001 R14: 00007f363e4345e0 R15: 00007f363e4303c0 Modules linked in: ip6table_filter ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device i915 snd_pcm snd_timer i2c_i801 snd soundcore i2c_algo_bit drm_kms_helper 86_pkg_temp_thermal video kvm_intel kvm irqbypass wmi e1000e ---[ end trace d301afa879ddfa25 ]--- The cause is because the register_snapshot_trigger() call failed to allocate the snapshot buffer, and then called unregister_trigger() which freed the data that was passed to it. Then on return to the function that called register_snapshot_trigger(), as it sees it failed to register, it frees the trigger_data again and causes a double free. By calling event_trigger_init() on the trigger_data (which only ups the reference counter for it), and then event_trigger_free() afterward, the trigger_data would not get freed by the registering trigger function as it would only up and lower the ref count for it. If the register trigger function fails, then the event_trigger_free() called after it will free the trigger data normally. 
Link: http://lkml.kernel.org/r/20180724191331.738eb819@gandalf.local.home Cc: stable@vger.kernel.org Fixes: `93e31ffbf4` ("tracing: Add 'snapshot' event trigger command") Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:12 +02:00
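The reference-count approach described in the double-free commit above can be modelled in a few lines of userspace C. This is a sketch under assumptions: all `*_sketch` names are hypothetical, `trigger_get_sketch()` and `trigger_put_sketch()` only play the roles that the commit message gives to event_trigger_init() and event_trigger_free(), and the registration function is made to fail unconditionally.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, minimal model of refcounted trigger data. */
struct trigger_data_sketch {
	int ref;
};

static int frees_sketch; /* counts how often the data was actually freed */

static void trigger_get_sketch(struct trigger_data_sketch *d)
{
	assert(d != NULL);
	d->ref++;
}

static void trigger_put_sketch(struct trigger_data_sketch *d)
{
	assert(d != NULL && d->ref > 0);
	if (--d->ref == 0) {
		free(d);
		frees_sketch++;
	}
}

/* Models a reg() that takes a reference, fails, and drops it again. */
static int failing_reg_sketch(struct trigger_data_sketch *d)
{
	trigger_get_sketch(d);
	trigger_put_sketch(d); /* without the caller's reference this would free d */
	return -1;
}

static int trigger_callback_sketch(void)
{
	struct trigger_data_sketch *d = calloc(1, sizeof(*d));
	int ret;

	if (!d)
		return -1;
	trigger_get_sketch(d);       /* caller holds its own reference */
	ret = failing_reg_sketch(d); /* reg() can no longer free d */
	trigger_put_sketch(d);       /* last reference: freed exactly once */
	return ret;
}
```

Because the caller holds a reference across the call, the failure path inside the registration function only lowers the count, and the single real free happens in the caller.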
Shakeel Butt	eb025250ae	kvm, mm: account shadow page tables to kmemcg commit `d97e5e6160` upstream. The size of kvm's shadow page tables corresponds to the size of the guest virtual machines on the system. Large VMs can spend a significant amount of memory as shadow page tables which can not be left as system memory overhead. So, account shadow page tables to the kmemcg. [shakeelb@google.com: replace (GFP_KERNEL\|__GFP_ACCOUNT) with GFP_KERNEL_ACCOUNT] Link: http://lkml.kernel.org/r/20180629140224.205849-1-shakeelb@google.com Link: http://lkml.kernel.org/r/20180627181349.149778-1-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Peter Feiner <pfeiner@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:12 +02:00
KT Liao	6ed569edd4	Input: elan_i2c - add another ACPI ID for Lenovo Ideapad 330-15AST commit `6f88a6439d` upstream. Add ELAN0622 to ACPI mapping table to support Elan touchpad found in Ideapad 330-15AST. Signed-off-by: KT Liao <kt.liao@emc.com.tw> Reported-by: Anant Shende <anantshende@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:11 +02:00
Chen-Yu Tsai	79f4095a16	Input: i8042 - add Lenovo LaVie Z to the i8042 reset list commit `384cf4285b` upstream. The Lenovo LaVie Z laptop requires i8042 to be reset in order to consistently detect its Elantech touchpad. The nomux and kbdreset quirks are not sufficient. It's possible the other LaVie Z models from NEC require this as well. Cc: stable@vger.kernel.org Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:11 +02:00
Donald Shanty III	19e28842d0	Input: elan_i2c - add ACPI ID for lenovo ideapad 330 commit `938f45008d` upstream. This allows Elan driver to bind to the touchpad found in Lenovo Ideapad 330 series laptops. Signed-off-by: Donald Shanty III <dshanty@protonmail.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-08-03 07:55:11 +02:00
Greg Kroah-Hartman	94c67449c7	Linux 4.9.116	2018-07-28 07:49:14 +02:00
Arnd Bergmann	b9dd13488a	exec: avoid gcc-8 warning for get_task_comm commit `3756f6401c` upstream. gcc-8 warns about using strncpy() with the source size as the limit: fs/exec.c:1223:32: error: argument to 'sizeof' in 'strncpy' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess] This is indeed slightly suspicious, as it protects us from source arguments without NUL-termination, but does not guarantee that the destination is terminated. This keeps the strncpy() to ensure we have properly padded target buffer, but ensures that we use the correct length, by passing the actual length of the destination buffer as well as adding a build-time check to ensure it is exactly TASK_COMM_LEN. There are only 23 callsites which I all reviewed to ensure this is currently the case. We could get away with doing only the check or passing the right length, but it doesn't hurt to do both. Link: http://lkml.kernel.org/r/20171205151724.1764896-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Serge Hallyn <serge@hallyn.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Aleksa Sarai <asarai@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:14 +02:00
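The combination described in the get_task_comm commit above (pass the destination length and check it at build time) might look roughly like this in portable C. It is a sketch, not the kernel's macro: `GET_COMM_SKETCH`, `copy_comm_sketch()` and `struct comm_src_sketch` are hypothetical names, and the negative-array-size trick stands in for the kernel's BUILD_BUG_ON().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TASK_COMM_LEN 16

/* Hypothetical stand-in for the task struct field being copied from. */
struct comm_src_sketch {
	char comm[TASK_COMM_LEN];
};

static char *copy_comm_sketch(char *buf, size_t buf_size,
			      const struct comm_src_sketch *tsk)
{
	assert(buf != NULL && tsk != NULL);
	/* strncpy() zero-pads the rest of buf; the bound is the destination size. */
	strncpy(buf, tsk->comm, buf_size);
	return buf;
}

/*
 * Build-time check: the array type has size -1 (a compile error) unless the
 * destination is exactly TASK_COMM_LEN bytes. buf must be a real array, not
 * a pointer, for sizeof(buf) to mean the buffer size.
 */
#define GET_COMM_SKETCH(buf, tsk)					\
	((void)sizeof(char[(sizeof(buf) == TASK_COMM_LEN) ? 1 : -1]),	\
	 copy_comm_sketch((buf), sizeof(buf), (tsk)))
```

Termination of the result still relies on the source being NUL-terminated within TASK_COMM_LEN bytes, which the sketch assumes.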
Arnd Bergmann	b1a1d9bdb1	turn off -Wattribute-alias Starting with gcc-8.1, we get a warning about all system call definitions, which use an alias between functions with incompatible prototypes, e.g.: In file included from ../mm/process_vm_access.c:19: ../include/linux/syscalls.h:211:18: warning: 'sys_process_vm_readv' alias between functions of incompatible types 'long int(pid_t, const struct iovec *, long unsigned int, const struct iovec *, long unsigned int, long unsigned int)' {aka 'long int(int, const struct iovec *, long unsigned int, const struct iovec *, long unsigned int, long unsigned int)'} and 'long int(long int, long int, long int, long int, long int, long int)' [-Wattribute-alias] asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \ ^~~ ../include/linux/syscalls.h:207:2: note: in expansion of macro '__SYSCALL_DEFINEx' __SYSCALL_DEFINEx(x, sname, __VA_ARGS__) ^~~~~~~~~~~~~~~~~ ../include/linux/syscalls.h:201:36: note: in expansion of macro 'SYSCALL_DEFINEx' #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__) ^~~~~~~~~~~~~~~ ../mm/process_vm_access.c:300:1: note: in expansion of macro 'SYSCALL_DEFINE6' SYSCALL_DEFINE6(process_vm_readv, pid_t, pid, const struct iovec __user *, lvec, ^~~~~~~~~~~~~~~ ../include/linux/syscalls.h:215:18: note: aliased declaration here asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \ ^~~ ../include/linux/syscalls.h:207:2: note: in expansion of macro '__SYSCALL_DEFINEx' __SYSCALL_DEFINEx(x, sname, __VA_ARGS__) ^~~~~~~~~~~~~~~~~ ../include/linux/syscalls.h:201:36: note: in expansion of macro 'SYSCALL_DEFINEx' #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__) ^~~~~~~~~~~~~~~ ../mm/process_vm_access.c:300:1: note: in expansion of macro 'SYSCALL_DEFINE6' SYSCALL_DEFINE6(process_vm_readv, pid_t, pid, const struct iovec __user *, lvec, This is really noisy and does not indicate a real problem. 
In the latest mainline kernel, this was addressed by commit `bee2003177` ("disable -Wattribute-alias warning for SYSCALL_DEFINEx()"), which seems too invasive to backport. This takes a much simpler approach and just disables the warning across the kernel. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
Anssi Hannula	b2019f0f70	can: xilinx_can: fix RX overflow interrupt not being enabled commit `8399799725` upstream. RX overflow interrupt (RXOFLW) is disabled even though xcan_interrupt() processes it. This means that an RX overflow interrupt will only be processed when another interrupt gets asserted (e.g. for RX/TX). Fix that by enabling the RXOFLW interrupt. Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Michal Simek <michal.simek@xilinx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
Anssi Hannula	9f7308434e	can: xilinx_can: fix incorrect clear of non-processed interrupts commit `2f4f0f338c` upstream. xcan_interrupt() clears ERROR\|RXOFLW\|BSOFF\|ARBLST interrupts if any of them is asserted. This does not take into account that some of them could have been asserted between interrupt status read and interrupt clear, therefore clearing them without handling them. Fix the code to only clear those interrupts that it knows are asserted and therefore going to be processed in xcan_err_interrupt(). Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Michal Simek <michal.simek@xilinx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
Anssi Hannula	bee7ff7eaa	can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting commit `620050d9c2` upstream. The xilinx_can driver assumes that the TXOK interrupt only clears after it has been acknowledged as many times as there have been successfully sent frames. However, the documentation does not mention such behavior, instead saying just that the interrupt is cleared when the clear bit is set. Similarly, testing seems to also suggest that it is immediately cleared regardless of the amount of frames having been sent. Performing some heavy TX load and then going back to idle has the tx_head drifting further away from tx_tail over time, steadily reducing the amount of frames the driver keeps in the TX FIFO (but not to zero, as the TXOK interrupt always frees up space for 1 frame from the driver's perspective, so frames continue to be sent) and delaying the local echo frames. The TX FIFO tracking is also otherwise buggy as it does not account for TX FIFO being cleared after software resets, causing BUG!, TX FIFO full when queue awake! messages to be output. There does not seem to be any way to accurately track the state of the TX FIFO for local echo support while using the full TX FIFO. The Zynq version of the HW (but not the soft-AXI version) has watermark programming support and with it an additional TX-FIFO-empty interrupt bit. Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used to detect whether 1 or 2 frames have been sent at interrupt processing time. Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode was also tested. An alternative way to solve this would be to drop local echo support but keep using the full TX FIFO. v2: Add FIFO space check before TX queue wake with locking to synchronize with queue stop. This avoids waking the queue when xmit() had just filled it. 
v3: Keep local echo support and reduce the amount of frames in FIFO instead as suggested by Marc Kleine-Budde. Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
Anssi Hannula	1fd9fa57c1	can: xilinx_can: fix device dropping off bus on RX overrun commit `2574fe5451` upstream. The xilinx_can driver performs a software reset when an RX overrun is detected. This causes the device to enter Configuration mode where no messages are received or transmitted. The documentation does not mention any need to perform a reset on an RX overrun, and testing by inducing an RX overflow also indicated that the device continues to work just fine without a reset. Remove the software reset. Tested with the integrated CAN on Zynq-7000 SoC. Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
Anssi Hannula	c98f577204	can: xilinx_can: fix recovery from error states not being propagated commit `877e0b7594` upstream. The xilinx_can driver contains no mechanism for propagating recovery from CAN_STATE_ERROR_WARNING and CAN_STATE_ERROR_PASSIVE. Add such a mechanism by factoring the handling of XCAN_STATE_ERROR_PASSIVE and XCAN_STATE_ERROR_WARNING out of xcan_err_interrupt and checking for recovery after RX and TX if the interface is in one of those states. Tested with the integrated CAN on Zynq-7000 SoC. Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
Anssi Hannula	1fadfbd9f5	can: xilinx_can: fix power management handling commit `8ebd83bdb0` upstream. There are several issues with the suspend/resume handling code of the driver: - The device is attached and detached in the runtime_suspend() and runtime_resume() callbacks if the interface is running. However, during xcan_chip_start() the interface is considered running, causing the resume handler to incorrectly call netif_start_queue() at the beginning of xcan_chip_start(), and on xcan_chip_start() error return the suspend handler detaches the device leaving the user unable to bring-up the device anymore. - The device is not brought properly up on system resume. A reset is done and the code tries to determine the bus state after that. However, after reset the device is always in Configuration mode (down), so the state checking code does not make sense and communication will also not work. - The suspend callback tries to set the device to sleep mode (low-power mode which monitors the bus and brings the device back to normal mode on activity), but then immediately disables the clocks (possibly before the device reaches the sleep mode), which does not make sense to me. If a clean shutdown is wanted before disabling clocks, we can just bring it down completely instead of only sleep mode. Reorganize the PM code so that only the clock logic remains in the runtime PM callbacks and the system PM callbacks contain the device bring-up/down logic. This makes calling the runtime PM callbacks during e.g. xcan_chip_start() safe. The system PM callbacks now simply call common code to start/stop the HW if the interface was running, replacing the broken code from before. xcan_chip_stop() is updated to use the common reset code so that it will wait for the reset to complete. Reset also disables all interrupts so do not do that separately. Also, the device_may_wakeup() checks are removed as the driver does not have wakeup support. Tested on Zynq-7000 integrated CAN. 
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Michal Simek <michal.simek@xilinx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
Anssi Hannula	de2219a86c	can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK commit `32852c561b` upstream. If the device gets into a state where RXNEMP (RX FIFO not empty) interrupt is asserted without RXOK (new frame received successfully) interrupt being asserted, xcan_rx_poll() will continue to try to clear RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is not empty, the interrupt will not be cleared and napi_schedule() will just be called again. This situation can occur when: (a) xcan_rx() returns without reading RX FIFO due to an error condition. The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear due to a frame still being in the FIFO. The frame will never be read from the FIFO as RXOK is no longer set. (b) A frame is received between xcan_rx_poll() reading interrupt status and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain set as the new message is still in the FIFO. I'm able to trigger case (b) by flooding the bus with frames under load. There does not seem to be any benefit in using both RXNEMP and RXOK in the way the driver does, and the polling example in the reference manual (UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either RXOK or RXNEMP can be used for detecting incoming messages. Fix the issue and simplify the RX processing by only using RXNEMP without RXOK. Tested with the integrated CAN on Zynq-7000 SoC. Fixes: `b1201e44f5` ("can: xilinx CAN controller support") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
Rafael J. Wysocki	bf0070e2f5	driver core: Partially revert "driver core: correct device's shutdown order" commit `722e5f2b1e` upstream. Commit `52cdbdd498` (driver core: correct device's shutdown order) introduced a regression by breaking device shutdown on some systems. Namely, the devices_kset_move_last() call in really_probe() added by that commit is a mistake as it may cause parents to follow children in the devices_kset list which then causes shutdown to fail. For example, if a device has children before really_probe() is called for it (which is not uncommon), that call will cause it to be reordered after the children in the devices_kset list and the ordering of that list will not reflect the correct device shutdown order any more. Also it causes the devices_kset list to be constantly reordered until all drivers have been probed which is totally pointless overhead in the majority of cases and it only covered an issue with system shutdown, while system-wide suspend/resume potentially had the same issue on the affected platforms (which was not covered). Moreover, the shutdown issue originally addressed by the change in really_probe() made by commit `52cdbdd498` is not present in 4.18-rc any more, since dra7 started to use the sdhci-omap driver which doesn't disable any regulators during shutdown, so the really_probe() part of commit `52cdbdd498` can be safely reverted. [The original issue was related to the omap_hsmmc driver used by dra7 previously.] For the above reasons, revert the really_probe() modifications made by commit `52cdbdd498`. The other code changes made by commit `52cdbdd498` are useful and they need not be reverted. 
Fixes: `52cdbdd498` (driver core: correct device's shutdown order) Link: https://lore.kernel.org/lkml/CAFgQCTt7VfqM=UyCnvNFxrSw8Z6cUtAi3HUwR4_xPAc03SgHjQ@mail.gmail.com/ Reported-by: Pingfan Liu <kernelfans@gmail.com> Tested-by: Pingfan Liu <kernelfans@gmail.com> Reviewed-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
Jerry Zhang	9e10043b6b	usb: gadget: f_fs: Only return delayed status when len is 0 commit `4d644abf25` upstream. Commit `1b9ba000` ("Allow function drivers to pause control transfers") states that USB_GADGET_DELAYED_STATUS is only supported if data phase is 0 bytes. It seems that when the length is not 0 bytes, there is no need to explicitly delay the data stage since the transfer is not completed until the user responds. However, when the length is 0, there is no data stage and the transfer is finished once setup() returns, hence there is a need to explicitly delay completion. This manifests as the following bugs: Prior to `946ef68ad4` ('Let setup() return USB_GADGET_DELAYED_STATUS'), when setup is 0 bytes, ffs would require user to queue a 0 byte request in order to clear setup state. However, that 0 byte request was actually not needed and would hang and cause errors in other setup requests. After the above commit, 0 byte setups work since the gadget now accepts empty queues to ep0 to clear the delay, but all other setups hang. Fixes: `946ef68ad4` ("Let setup() return USB_GADGET_DELAYED_STATUS") Signed-off-by: Jerry Zhang <zhangjerry@google.com> Cc: stable <stable@vger.kernel.org> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
Bin Liu	e2996cf59e	usb: core: handle hub C_PORT_OVER_CURRENT condition commit `249a32b7ee` upstream. Based on USB2.0 Spec Section 11.12.5, "If a hub has per-port power switching and per-port current limiting, an over-current on one port may still cause the power on another port to fall below specific minimums. In this case, the affected port is placed in the Power-Off state and C_PORT_OVER_CURRENT is set for the port, but PORT_OVER_CURRENT is not set." so let's check C_PORT_OVER_CURRENT too for over current condition. Fixes: `08d1dec6f4` ("usb:hub set hub->change_bits when over-current happens") Cc: <stable@vger.kernel.org> Tested-by: Alessandro Antenucci <antenucci@korg.it> Signed-off-by: Bin Liu <b-liu@ti.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:13 +02:00
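Note: the check described above can be sketched as follows. This is a minimal illustration, not the hub driver's code; the function name and the two local bit masks are assumptions (bit 3 of wPortStatus and of wPortChange, as laid out in the USB 2.0 hub port status tables).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative local masks (assumed): bit 3 in each 16-bit word. */
#define PORT_STAT_OVERCURRENT   0x0008u /* PORT_OVER_CURRENT in wPortStatus */
#define PORT_STAT_C_OVERCURRENT 0x0008u /* C_PORT_OVER_CURRENT in wPortChange */

/*
 * Hypothetical helper: treat the port as being in an over-current
 * condition if either the status bit or the change bit is set.
 */
bool port_over_current(uint16_t portstatus, uint16_t portchange)
{
    return (portstatus & PORT_STAT_OVERCURRENT) ||
           (portchange & PORT_STAT_C_OVERCURRENT);
}
```

With only the status bit checked, the second argument would be ignored and a port powered off because of a neighbouring port's over-current would go unnoticed.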
Lubomir Rintel	b0bd06a475	usb: cdc_acm: Add quirk for Castles VEGA3000 commit `1445cbe476` upstream. The device (a POS terminal) implements CDC ACM, but has no union descriptor. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Acked-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:12 +02:00
Eric Dumazet	94623c7463	tcp: call tcp_drop() from tcp_data_queue_ofo() [ Upstream commit `8541b21e78` ] In order to be able to give better diagnostics and detect malicious traffic, we need to have better sk->sk_drops tracking. Fixes: `9f5afeae51` ("tcp: use an RB tree for ooo receive queue") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:12 +02:00
Eric Dumazet	a878681484	tcp: detect malicious patterns in tcp_collapse_ofo_queue() [ Upstream commit `3d4bf93ac1` ] In case an attacker feeds tiny packets completely out of order, tcp_collapse_ofo_queue() might scan the whole rb-tree, performing expensive copies, but not changing socket memory usage at all. 1) Do not attempt to collapse tiny skbs. 2) Add logic to exit early when too many tiny skbs are detected. We prefer not doing aggressive collapsing (which copies packets) for pathological flows, and revert to tcp_prune_ofo_queue() which will be less expensive. In the future, we might add the possibility of terminating flows that are proven to be malicious. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:12 +02:00
Eric Dumazet	fdf258ed5d	tcp: avoid collapses in tcp_prune_queue() if possible [ Upstream commit `f4a3313d8e` ] Right after a TCP flow is created, receiving tiny out of order packets always hits the condition : if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf) tcp_clamp_window(sk); tcp_clamp_window() increases sk_rcvbuf to match sk_rmem_alloc (guarded by tcp_rmem[2]) Calling tcp_collapse_ofo_queue() in this case is not useful, and offers an O(N^2) attack surface to malicious peers. Better not attempt anything before full queue capacity is reached, forcing attacker to spend lots of resources and allow us to more easily detect the abuse. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:12 +02:00
Eric Dumazet	2d08921c8d	tcp: free batches of packets in tcp_prune_ofo_queue() [ Upstream commit `72cd43ba64` ] Juha-Matti Tilli reported that malicious peers could inject tiny packets in out_of_order_queue, forcing very expensive calls to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for every incoming packet. out_of_order_queue rb-tree can contain thousands of nodes, iterating over all of them is not nice. Before linux-4.9, we would have pruned all packets in ofo_queue in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skbs truesize, but is about 7000 packets with tcp_rmem[2] default of 6 MB. Since we plan to increase tcp_rmem[2] in the future to cope with modern BDP, can not revert to the old behavior, without great pain. Strategy taken in this patch is to purge ~12.5 % of the queue capacity. Fixes: `36a6503fed` ("tcp: refine tcp_prune_ofo_queue() to not drop all packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:12 +02:00
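Note: the batch-pruning strategy above can be sketched as follows. This is a simplified model, not the kernel function: the queue is a plain array instead of an rb-tree, the helper name is hypothetical, and taking "~12.5 % of the queue capacity" to mean one eighth of the receive buffer size is an assumption.

```c
#include <stddef.h>

/*
 * Hypothetical model: truesize[] holds the memory cost of each queued
 * packet, ordered so that the last element is the first to be dropped.
 * Packets are dropped from the tail until at least rcvbuf / 8 bytes
 * have been freed. Returns the number of packets left in the queue.
 */
size_t prune_ofo_tail(const unsigned int *truesize, size_t n,
                      unsigned int rcvbuf)
{
    unsigned int goal = rcvbuf >> 3; /* one eighth, i.e. 12.5 % */
    unsigned int freed = 0;

    while (n > 0 && freed < goal) {
        freed += truesize[n - 1];
        n--;
    }
    return n;
}
```

Freeing a batch per call means the expensive walk is not repeated for every incoming packet, while most of the queue is still kept.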
Yuchung Cheng	8736711f4e	tcp: do not delay ACK in DCTCP upon CE status change [ Upstream commit `a0496ef2c2` ] Per DCTCP RFC8257 (Section 3.2) the ACK reflecting the CE status change has to be sent immediately so the sender can respond quickly: """ When receiving packets, the CE codepoint MUST be processed as follows: 1. If the CE codepoint is set and DCTCP.CE is false, set DCTCP.CE to true and send an immediate ACK. 2. If the CE codepoint is not set and DCTCP.CE is true, set DCTCP.CE to false and send an immediate ACK. """ Previously DCTCP implementation may continue to delay the ACK. This patch fixes that to implement the RFC by forcing an immediate ACK. Tested with this packetdrill script provided by Larry Brakmo 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0 0.000 bind(3, ..., ...) = 0 0.000 listen(3, 1) = 0 0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> 0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8> 0.110 < [ect0] . 1:1(0) ack 1 win 257 0.200 accept(3, ..., ...) = 4 +0 setsockopt(4, SOL_SOCKET, SO_DEBUG, [1], 4) = 0 0.200 < [ect0] . 1:1001(1000) ack 1 win 257 0.200 > [ect01] . 1:1(0) ack 1001 0.200 write(4, ..., 1) = 1 0.200 > [ect01] P. 1:2(1) ack 1001 0.200 < [ect0] . 1001:2001(1000) ack 2 win 257 +0.005 < [ce] . 2001:3001(1000) ack 2 win 257 +0.000 > [ect01] . 2:2(0) ack 2001 // Previously the ACK below would be delayed by 40ms +0.000 > [ect01] E. 2:2(0) ack 3001 +0.500 < F. 9501:9501(0) ack 4 win 257 Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:12 +02:00
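Note: the two receiver rules quoted from RFC 8257 above reduce to "ACK immediately whenever the CE codepoint differs from the stored DCTCP.CE state". A minimal sketch, with a hypothetical struct and function name that are not the kernel's:

```c
#include <stdbool.h>

/* Hypothetical receiver state; `ce` mirrors DCTCP.CE from the RFC text. */
struct dctcp_rx {
    bool ce;
};

/*
 * Process the CE codepoint of one received packet.
 * Returns true when an immediate ACK must be sent, i.e. when the
 * codepoint differs from the stored state (rules 1 and 2 above).
 */
bool dctcp_rx_on_packet(struct dctcp_rx *rx, bool ce_codepoint_set)
{
    if (ce_codepoint_set != rx->ce) {
        rx->ce = ce_codepoint_set;
        return true;
    }
    return false; /* no change: normal delayed-ACK rules may apply */
}
```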
Yuchung Cheng	57ec8824b1	tcp: do not cancel delay-AcK on DCTCP special ACK [ Upstream commit `27cde44a25` ] Currently when a DCTCP receiver delays an ACK and receives a data packet with a different CE mark from the previous one's, it sends two immediate ACKs acking previous and latest sequences respectively (for ECN accounting). Previously sending the first ACK may mark off the delayed ACK timer (tcp_event_ack_sent). This may subsequently prevent sending the second ACK to acknowledge the latest sequence (tcp_ack_snd_check). The culprit is that tcp_send_ack() assumes it always acknowledges the latest sequence, which is not true for the first special ACK. The fix is to not make the assumption in tcp_send_ack and check the actual ack sequence before cancelling the delayed ACK. Further it's safer to pass the ack sequence number as a local variable into tcp_send_ack routine, instead of intercepting tp->rcv_nxt to avoid future bugs like this. Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:12 +02:00
Yuchung Cheng	1fcccc5786	tcp: helpers to send special DCTCP ack [ Upstream commit `2987babb69` ] Refactor and create helpers to send the special ACK in DCTCP. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:12 +02:00
Yuchung Cheng	8417780182	tcp: fix dctcp delayed ACK schedule [ Upstream commit `b0c05d0e99` ] Previously, when a data segment was sent an ACK was piggybacked on the data segment without generating a CA_EVENT_NON_DELAYED_ACK event to notify congestion control modules. So the DCTCP ca->delayed_ack_reserved flag could incorrectly stay set when in fact there were no delayed ACKs being reserved. This could result in sending a special ECN notification ACK that carries an older ACK sequence, when in fact there was no need for such an ACK. DCTCP keeps track of the delayed ACK status with its own separate state ca->delayed_ack_reserved. Previously it may accidentally cancel the delayed ACK without updating this field upon sending a special ACK that carries a older ACK sequence. This inconsistency would lead to DCTCP receiver never acknowledging the latest data until the sender times out and retry in some cases. Packetdrill script (provided by Larry Brakmo) 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0 0.000 bind(3, ..., ...) = 0 0.000 listen(3, 1) = 0 0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> 0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8> 0.110 < [ect0] . 1:1(0) ack 1 win 257 0.200 accept(3, ..., ...) = 4 0.200 < [ect0] . 1:1001(1000) ack 1 win 257 0.200 > [ect01] . 1:1(0) ack 1001 0.200 write(4, ..., 1) = 1 0.200 > [ect01] P. 1:2(1) ack 1001 0.200 < [ect0] . 1001:2001(1000) ack 2 win 257 0.200 write(4, ..., 1) = 1 0.200 > [ect01] P. 2:3(1) ack 2001 0.200 < [ect0] . 2001:3001(1000) ack 3 win 257 0.200 < [ect0] . 3001:4001(1000) ack 3 win 257 0.200 > [ect01] . 3:3(0) ack 4001 0.210 < [ce] P. 4001:4501(500) ack 3 win 257 +0.001 read(4, ..., 4500) = 4500 +0 write(4, ..., 1) = 1 +0 > [ect01] PE. 3:4(1) ack 4501 +0.010 < [ect0] W. 
4501:5501(1000) ack 4 win 257 // Previously the ACK sequence below would be 4501, causing a long RTO +0.040~+0.045 > [ect01] . 4:4(0) ack 5501 // delayed ack +0.311 < [ect0] . 5501:6501(1000) ack 4 win 257 // More data +0 > [ect01] . 4:4(0) ack 6501 // now acks everything +0.500 < F. 9501:9501(0) ack 4 win 257 Reported-by: Larry Brakmo <brakmo@fb.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:12 +02:00
Roopa Prabhu	19b7479915	rtnetlink: add rtnl_link_state check in rtnl_configure_link [ Upstream commit `5025f7f7d5` ] rtnl_configure_link sets dev->rtnl_link_state to RTNL_LINK_INITIALIZED and unconditionally calls __dev_notify_flags to notify user-space of dev flags. current call sequence for rtnl_configure_link rtnetlink_newlink rtnl_link_ops->newlink rtnl_configure_link (unconditionally notifies userspace of default and new dev flags) If a newlink handler wants to call rtnl_configure_link early, we will end up with duplicate notifications to user-space. This patch fixes rtnl_configure_link to check rtnl_link_state and call __dev_notify_flags with gchanges = 0 if already RTNL_LINK_INITIALIZED. Later in the series, this patch will help the following sequence where a driver implementing newlink can call rtnl_configure_link to initialize the link early. makes the following call sequence work: rtnetlink_newlink rtnl_link_ops->newlink (vxlan) -> rtnl_configure_link (initializes link and notifies user-space of default dev flags) rtnl_configure_link (updates dev flags if requested by user ifm and notifies user-space of new dev flags) Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:12 +02:00
Heiner Kallweit	c6ac36be72	net: phy: consider PHY_IGNORE_INTERRUPT in phy_start_aneg_priv [ Upstream commit `215d08a85b` ] The situation described in the comment can occur also with PHY_IGNORE_INTERRUPT, therefore change the condition to include it. Fixes: `f555f34fdc` ("net: phy: fix auto-negotiation stall due to unavailable interrupt") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
Hangbin Liu	cc403d5dc1	multicast: do not restore deleted record source filter mode to new one There are two scenarios that we will restore deleted records. The first is when device down and up(or unmap/remap). In this scenario the new filter mode is same with previous one. Because we get it from in_dev->mc_list and we do not touch it during device down and up. The other scenario is when a new socket join a group which was just delete and not finish sending status reports. In this scenario, we should use the current filter mode instead of restore old one. Here are 4 cases in total. old_socket new_socket before_fix after_fix IN(A) IN(A) ALLOW(A) ALLOW(A) IN(A) EX( ) TO_IN( ) TO_EX( ) EX( ) IN(A) TO_EX( ) ALLOW(A) EX( ) EX( ) TO_EX( ) TO_EX( ) Fixes: `24803f38a5` (igmp: do not remove igmp souce list info when set link down) Fixes: `1666d49e1d` (mld: do not remove mld souce list info when set link down) Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
Eran Ben Elisha	b7e37add79	net/mlx5e: Fix quota counting in aRFS expire flow [ Upstream commit `2630bae801` ] Quota should follow the amount of rules which do expire, and not the number of rules that were examined, fixed that. Fixes: `18c908e477` ("net/mlx5e: Add accelerated RFS support") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
Eran Ben Elisha	d9d5801216	net/mlx5e: Don't allow aRFS for encapsulated packets [ Upstream commit `d2e1c57bcf` ] Driver is yet to support aRFS for encapsulated packets, return early error in such case. Fixes: `18c908e477` ("net/mlx5e: Add accelerated RFS support") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
Ariel Levkovich	adcecd4ab1	net/mlx5: Adjust clock overflow work period [ Upstream commit `33180bee86` ] When driver converts HW timestamp to wall clock time it subtracts the last saved cycle counter from the HW timestamp and converts the difference to nanoseconds. The conversion is done by multiplying the cycles difference with the clock multiplier value as a first step and therefore the cycles difference should be small enough so that the multiplication product doesn't exceed 64bit. The overflow handling routine is in charge of updating the last saved cycle counter in driver and it is called periodically using kernel delayed workqueue. The delay period for this work is calculated using the max HW cycle counter value (a 41 bit mask) as a base which doesn't take the 64bit limit into account so the delay period may be incorrect and too long to prevent a large difference between the HW counter and the last saved counter in SW. This change adjusts the work period for the HW clock overflow work by taking the minimum between the previous value and the quotient of max u64 value and the clock multiplier value. Fixes: `ef9814deaf` ("net/mlx5e: Add HW timestamping (TS) support") Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
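Note: the bound described above can be sketched as follows. The function name is hypothetical and the conversion of the resulting cycle count into a work delay is left out; only the "minimum of the previous value and max u64 divided by the multiplier" step is shown.

```c
#include <stdint.h>

/*
 * Hypothetical helper: largest cycle difference that may accumulate
 * between two overflow-work runs.
 *  - counter_mask: maximum HW cycle counter value (the previous bound)
 *  - mult: clock multiplier used for the cycles-to-ns conversion,
 *          assumed to be non-zero
 * delta * mult must fit in 64 bits, so delta is capped at
 * UINT64_MAX / mult.
 */
uint64_t max_safe_cycle_delta(uint64_t counter_mask, uint32_t mult)
{
    uint64_t overflow_limit = UINT64_MAX / mult;

    return counter_mask < overflow_limit ? counter_mask : overflow_limit;
}
```

For a 41-bit mask the cap only takes effect once the multiplier exceeds 2^23, since (2^41 - 1) * mult stays below 2^64 for smaller multipliers.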
Eric Dumazet	e2ffdd646c	net: skb_segment() should not return NULL [ Upstream commit `ff907a11a0` ] syzbot caught a NULL deref [1], caused by skb_segment() skb_segment() has many "goto err;" that assume the @err variable contains -ENOMEM. A successful call to __skb_linearize() should not clear @err, otherwise a subsequent memory allocation error could return NULL. While we are at it, we might use -EINVAL instead of -ENOMEM when MAX_SKB_FRAGS limit is reached. [1] kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN CPU: 0 PID: 13285 Comm: syz-executor3 Not tainted 4.18.0-rc4+ #146 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tcp_gso_segment+0x3dc/0x1780 net/ipv4/tcp_offload.c:106 Code: f0 ff ff 0f 87 1c fd ff ff e8 00 88 0b fb 48 8b 75 d0 48 b9 00 00 00 00 00 fc ff df 48 8d be 90 00 00 00 48 89 f8 48 c1 e8 03 <0f> b6 14 08 48 8d 86 94 00 00 00 48 89 c6 83 e0 07 48 c1 ee 03 0f RSP: 0018:ffff88019b7fd060 EFLAGS: 00010206 RAX: 0000000000000012 RBX: 0000000000000020 RCX: dffffc0000000000 RDX: 0000000000040000 RSI: 0000000000000000 RDI: 0000000000000090 RBP: ffff88019b7fd0f0 R08: ffff88019510e0c0 R09: ffffed003b5c46d6 R10: ffffed003b5c46d6 R11: ffff8801dae236b3 R12: 0000000000000001 R13: ffff8801d6c581f4 R14: 0000000000000000 R15: ffff8801d6c58128 FS: 00007fcae64d6700(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004e8664 CR3: 00000001b669b000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tcp4_gso_segment+0x1c3/0x440 net/ipv4/tcp_offload.c:54 inet_gso_segment+0x64e/0x12d0 net/ipv4/af_inet.c:1342 inet_gso_segment+0x64e/0x12d0 net/ipv4/af_inet.c:1342 skb_mac_gso_segment+0x3b5/0x740 net/core/dev.c:2792 __skb_gso_segment+0x3c3/0x880 
net/core/dev.c:2865 skb_gso_segment include/linux/netdevice.h:4099 [inline] validate_xmit_skb+0x640/0xf30 net/core/dev.c:3104 __dev_queue_xmit+0xc14/0x3910 net/core/dev.c:3561 dev_queue_xmit+0x17/0x20 net/core/dev.c:3602 neigh_hh_output include/net/neighbour.h:473 [inline] neigh_output include/net/neighbour.h:481 [inline] ip_finish_output2+0x1063/0x1860 net/ipv4/ip_output.c:229 ip_finish_output+0x841/0xfa0 net/ipv4/ip_output.c:317 NF_HOOK_COND include/linux/netfilter.h:276 [inline] ip_output+0x223/0x880 net/ipv4/ip_output.c:405 dst_output include/net/dst.h:444 [inline] ip_local_out+0xc5/0x1b0 net/ipv4/ip_output.c:124 iptunnel_xmit+0x567/0x850 net/ipv4/ip_tunnel_core.c:91 ip_tunnel_xmit+0x1598/0x3af1 net/ipv4/ip_tunnel.c:778 ipip_tunnel_xmit+0x264/0x2c0 net/ipv4/ipip.c:308 __netdev_start_xmit include/linux/netdevice.h:4148 [inline] netdev_start_xmit include/linux/netdevice.h:4157 [inline] xmit_one net/core/dev.c:3034 [inline] dev_hard_start_xmit+0x26c/0xc30 net/core/dev.c:3050 __dev_queue_xmit+0x29ef/0x3910 net/core/dev.c:3569 dev_queue_xmit+0x17/0x20 net/core/dev.c:3602 neigh_direct_output+0x15/0x20 net/core/neighbour.c:1403 neigh_output include/net/neighbour.h:483 [inline] ip_finish_output2+0xa67/0x1860 net/ipv4/ip_output.c:229 ip_finish_output+0x841/0xfa0 net/ipv4/ip_output.c:317 NF_HOOK_COND include/linux/netfilter.h:276 [inline] ip_output+0x223/0x880 net/ipv4/ip_output.c:405 dst_output include/net/dst.h:444 [inline] ip_local_out+0xc5/0x1b0 net/ipv4/ip_output.c:124 ip_queue_xmit+0x9df/0x1f80 net/ipv4/ip_output.c:504 tcp_transmit_skb+0x1bf9/0x3f10 net/ipv4/tcp_output.c:1168 tcp_write_xmit+0x1641/0x5c20 net/ipv4/tcp_output.c:2363 __tcp_push_pending_frames+0xb2/0x290 net/ipv4/tcp_output.c:2536 tcp_push+0x638/0x8c0 net/ipv4/tcp.c:735 tcp_sendmsg_locked+0x2ec5/0x3f00 net/ipv4/tcp.c:1410 tcp_sendmsg+0x2f/0x50 net/ipv4/tcp.c:1447 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:641 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:651 
__sys_sendto+0x3d7/0x670 net/socket.c:1797 __do_sys_sendto net/socket.c:1809 [inline] __se_sys_sendto net/socket.c:1805 [inline] __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1805 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x455ab9 Code: 1d ba fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb b9 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fcae64d5c68 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007fcae64d66d4 RCX: 0000000000455ab9 RDX: 0000000000000001 RSI: 0000000020000200 RDI: 0000000000000013 RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000014 R13: 00000000004c1145 R14: 00000000004d1818 R15: 0000000000000006 Modules linked in: Dumping ftrace buffer: (ftrace buffer empty) Fixes: `ddff00d420` ("net: Move skb_has_shared_frag check out of GRE code and into segmentation") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
Jack Morgenstein	444987d535	net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper [ Upstream commit `958c696f5a` ] Function mlx4_RST2INIT_QP_wrapper saved the qp number passed in the qp context, rather than the one passed in the input modifier. However, the qp number in the qp context is not defined as a required parameter by the FW. Therefore, drivers may choose to not specify the qp number in the qp context for the reset-to-init transition. Thus, we must save the qp number passed in the command input modifier -- which is always present. (This saved qp number is used as the input modifier for command 2RST_QP when a slave's qp's are destroyed). Fixes: `c82e9aa0a8` ("mlx4_core: resource tracking for HCA resources used by guests") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
Willem de Bruijn	03fbf2b823	ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull [ Upstream commit `2efd4fca70` ] Syzbot reported a read beyond the end of the skb head when returning IPV6_ORIGDSTADDR: BUG: KMSAN: kernel-infoleak in put_cmsg+0x5ef/0x860 net/core/scm.c:242 CPU: 0 PID: 4501 Comm: syz-executor128 Not tainted 4.17.0+ #9 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:113 kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1125 kmsan_internal_check_memory+0x138/0x1f0 mm/kmsan/kmsan.c:1219 kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1261 copy_to_user include/linux/uaccess.h:184 [inline] put_cmsg+0x5ef/0x860 net/core/scm.c:242 ip6_datagram_recv_specific_ctl+0x1cf3/0x1eb0 net/ipv6/datagram.c:719 ip6_datagram_recv_ctl+0x41c/0x450 net/ipv6/datagram.c:733 rawv6_recvmsg+0x10fb/0x1460 net/ipv6/raw.c:521 [..] This logic and its ipv4 counterpart read the destination port from the packet at skb_transport_offset(skb) + 4. With MSG_MORE and a local SOCK_RAW sender, syzbot was able to cook a packet that stores headers exactly up to skb_transport_offset(skb) in the head and the remainder in a frag. Call pskb_may_pull before accessing the pointer to ensure that it lies in skb head. Link: http://lkml.kernel.org/r/CAF=yD-LEJwZj5a1-bAAj2Oy_hKmGygV6rsJ_WOrAYnv-fnayiQ@mail.gmail.com Reported-by: syzbot+9adb4b567003cac781f0@syzkaller.appspotmail.com Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
Paolo Abeni	93d94fec94	ip: hash fragments consistently [ Upstream commit `3dd1c9a127` ] The skb hash for locally generated ip[v6] fragments belonging to the same datagram can vary in several circumstances: * for connected UDP[v6] sockets, the first fragment gets its hash via set_owner_w()/skb_set_hash_from_sk() * for unconnected IPv6 UDPv6 sockets, the first fragment can get its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if auto_flowlabel is enabled For the following frags the hash is usually computed via skb_get_hash(). The above can cause OoO for unconnected IPv6 UDPv6 socket: in that scenario the egress tx queue can be selected on a per packet basis via the skb hash. It may also fool flow-oriented schedulers to place fragments belonging to the same datagram in different flows. Fix the issue by copying the skb hash from the head frag into the others at fragmentation time. Before this commit: perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8" netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n & perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1 perf script probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0 probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0 After this commit: probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0 probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0 Fixes: `b73c3d0e4f` ("net: Save TX flow hash in sock and set in skbuf on xmit") Fixes: `67800f9b1f` ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
Paul Burton	650321fe96	MIPS: Fix off-by-one in pci_resource_to_user() commit `38c0a74fe0` upstream. The MIPS implementation of pci_resource_to_user() introduced in v3.12 by commit `4c2924b725` ("MIPS: PCI: Use pci_resource_to_user to map pci memory space properly") incorrectly sets *end to the address of the byte after the resource, rather than the last byte of the resource. This results in userland seeing resources as a byte larger than they actually are, for example a 32 byte BAR will be reported by a tool such as lspci as being 33 bytes in size: Region 2: I/O ports at 1000 [disabled] [size=33] Correct this by subtracting one from the calculated end address, reporting the correct address to userland. Signed-off-by: Paul Burton <paul.burton@mips.com> Reported-by: Rui Wang <rui.wang@windriver.com> Fixes: `4c2924b725` ("MIPS: PCI: Use pci_resource_to_user to map pci memory space properly") Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Wolfgang Grandegger <wg@grandegger.com> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v3.12+ Patchwork: https://patchwork.linux-mips.org/patch/19829/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
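Note: the off-by-one above comes from mixing an exclusive end address with a consumer that expects an inclusive one. A minimal sketch with hypothetical helper names, not the MIPS code itself:

```c
#include <stdint.h>

/* Hypothetical helper: address of the last byte of a resource. */
uint64_t resource_last_byte(uint64_t start, uint64_t size)
{
    return start + size - 1; /* inclusive end, as userland expects */
}

/* Size as a tool would derive it from an inclusive [start, end] pair. */
uint64_t resource_size_from_range(uint64_t start, uint64_t end)
{
    return end - start + 1;
}
```

For the 32-byte BAR at 0x1000 from the message, an inclusive end of 0x101f gives a size of 32, whereas passing the address of the byte after the resource (0x1020) gives the 33 seen in the lspci output.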
Felix Fietkau	92f724130f	MIPS: ath79: fix register address in ath79_ddr_wb_flush() commit `bc88ad2efd` upstream. ath79_ddr_wb_flush_base has the type void __iomem *, so register offsets need to be a multiple of 4 in order to access the intended register. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: `24b0e3e84f` ("MIPS: ath79: Improve the DDR controller interface") Patchwork: https://patchwork.linux-mips.org/patch/19912/ Cc: Alban Bedel <albeu@free.fr> Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # 4.2+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-28 07:49:11 +02:00
Greg Kroah-Hartman	dbcdf42bab	Linux 4.9.115	2018-07-25 11:24:03 +02:00
Alan Jenkins	3118ceb456	block: do not use interruptible wait anywhere commit `1dc3039bc8` upstream. When blk_queue_enter() waits for a queue to unfreeze, or unset the PREEMPT_ONLY flag, do not allow it to be interrupted by a signal. The PREEMPT_ONLY flag was introduced later in commit `3a0a529971` ("block, scsi: Make SCSI quiesce and resume work reliably"). Note the SCSI device is resumed asynchronously, i.e. after un-freezing userspace tasks. So that commit exposed the bug as a regression in v4.15. A mysterious SIGBUS (or -EIO) sometimes happened during the time the device was being resumed. Most frequently, there was no kernel log message, and we saw Xorg or Xwayland killed by SIGBUS.[1] [1] E.g. https://bugzilla.redhat.com/show_bug.cgi?id=1553979 Without this fix, I get an IO error in this test: # dd if=/dev/sda of=/dev/null iflag=direct & \ while killall -SIGUSR1 dd; do sleep 0.1; done & \ echo mem > /sys/power/state ; \ sleep 5; killall dd # stop after 5 seconds The interruptible wait was added to blk_queue_enter in commit `3ef28e83ab` ("block: generic request_queue reference counting"). Before then, the interruptible wait was only in blk-mq, but I don't think it could ever have been correct. Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Alan Jenkins <alan.christopher.jenkins@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:03 +02:00
Chuck Lever	2ea8b93c03	xprtrdma: Return -ENOBUFS when no pages are available commit `a8f688ec43` upstream. The use of -EAGAIN in rpcrdma_convert_iovs() is a latent bug: the transport never calls xprt_write_space() when more pages become available. -ENOBUFS will trigger the correct "delay briefly and call again" logic. Fixes: `7a89f9c626` ("xprtrdma: Honor ->send_request API contract") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org # 4.8+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:02 +02:00
Mathias Nyman	33b2110bd9	xhci: Fix perceived dead host due to runtime suspend race with event handler commit `229bc19fd7` upstream. Don't rely on event interrupt (EINT) bit alone to detect pending port change in resume. If no change event is detected the host may be suspended again, otherwise roothubs are resumed. There is a lag in xHC setting EINT. If we don't notice the pending change in resume, and the controller is runtime suspended again, it causes the event handler to assume host is dead as it will fail to read xHC registers once PCI puts the controller to D3 state. [ 268.520969] xhci_hcd: xhci_resume: starting port polling. [ 268.520985] xhci_hcd: xhci_hub_status_data: stopping port polling. [ 268.521030] xhci_hcd: xhci_suspend: stopping port polling. [ 268.521040] xhci_hcd: // Setting command ring address to 0x349bd001 [ 268.521139] xhci_hcd: Port Status Change Event for port 3 [ 268.521149] xhci_hcd: resume root hub [ 268.521163] xhci_hcd: port resume event for port 3 [ 268.521168] xhci_hcd: xHC is not running. [ 268.521174] xhci_hcd: handle_port_status: starting port polling. [ 268.596322] xhci_hcd: xhci_hc_died: xHCI host controller not responding, assume dead The EINT lag is described in an additional note in xhci specs 4.19.2: "Due to internal xHC scheduling and system delays, there will be a lag between a change bit being set and the Port Status Change Event that it generated being written to the Event Ring. If SW reads the PORTSC and sees a change bit set, there is no guarantee that the corresponding Port Status Change Event has already been written into the Event Ring." Cc: <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:02 +02:00
Stefano Brivio	ad375eae79	skbuff: Unconditionally copy pfmemalloc in __skb_clone() [ Upstream commit `e78bfb0751` ] Commit `8b7008620b` ("net: Don't copy pfmemalloc flag in __copy_skb_header()") introduced a different handling for the pfmemalloc flag in copy and clone paths. In __skb_clone(), now, the flag is set only if it was set in the original skb, but not cleared if it wasn't. This is wrong and might lead to socket buffers being flagged with pfmemalloc even if the skb data wasn't allocated from pfmemalloc reserves. Copy the flag instead of ORing it. Reported-by: Sabrina Dubroca <sd@queasysnail.net> Fixes: `8b7008620b` ("net: Don't copy pfmemalloc flag in __copy_skb_header()") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Tested-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:02 +02:00
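The difference between ORing and copying the flag can be shown with a minimal userspace sketch. `struct fake_skb` and the two helper functions are hypothetical stand-ins invented for illustration; they are not the kernel's `sk_buff` or `__skb_clone()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the single sk_buff field of interest. */
struct fake_skb {
    bool pfmemalloc;
};

/* Behaviour described as wrong: the flag can be set but never cleared. */
void clone_flag_or(struct fake_skb *n, const struct fake_skb *skb)
{
    n->pfmemalloc |= skb->pfmemalloc;
}

/* Behaviour after the fix: the clone always mirrors the original. */
void clone_flag_copy(struct fake_skb *n, const struct fake_skb *skb)
{
    n->pfmemalloc = skb->pfmemalloc;
}

void pfmemalloc_demo(void)
{
    struct fake_skb orig = { .pfmemalloc = false };
    struct fake_skb n = { .pfmemalloc = true };  /* stale flag in the clone */

    clone_flag_or(&n, &orig);
    assert(n.pfmemalloc == true);    /* stale flag survives */

    clone_flag_copy(&n, &orig);
    assert(n.pfmemalloc == false);   /* clone now matches the original */
}
```

With the OR variant, a clone that starts out flagged stays flagged even when the original skb is not, which is the mis-flagging the commit message describes.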
Stefano Brivio	cad99229aa	net: Don't copy pfmemalloc flag in __copy_skb_header() [ Upstream commit `8b7008620b` ] The pfmemalloc flag indicates that the skb was allocated from the PFMEMALLOC reserves, and the flag is currently copied on skb copy and clone. However, an skb copied from an skb flagged with pfmemalloc wasn't necessarily allocated from PFMEMALLOC reserves, and on the other hand an skb allocated that way might be copied from an skb that wasn't. So we should not copy the flag on skb copy, and rather decide whether to allow an skb to be associated with sockets unrelated to page reclaim depending only on how it was allocated. Move the pfmemalloc flag before headers_start[0] using an existing 1-bit hole, so that __copy_skb_header() doesn't copy it. When cloning, we'll now take care of this flag explicitly, contravening to the warning comment of __skb_clone(). While at it, restore the newline usage introduced by commit `b193722731` ("net: reorganize sk_buff for faster __copy_skb_header()") to visually separate bytes used in bitfields after headers_start[0], that was gone after commit `a9e419dc7b` ("netfilter: merge ctinfo into nfct pointer storage area"), and describe the pfmemalloc flag in the kernel-doc structure comment. This doesn't change the size of sk_buff or cacheline boundaries, but consolidates the 15 bits hole before tc_index into a 2 bytes hole before csum, that could now be filled more easily. Reported-by: Patrick Talbert <ptalbert@redhat.com> Fixes: `c93bdd0e03` ("netvm: allow skb allocation to use PFMEMALLOC reserves") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:02 +02:00
Alexander Couzens	c439f62038	net: usb: asix: replace mii_nway_restart in resume path [ Upstream commit `5c968f4802` ] mii_nway_restart is not pm aware, which results in an rtnl deadlock. Implement mii_nway_restart manually by setting BMCR_ANRESTART if BMCR_ANENABLE is set. To reproduce: * plug an asix based usb network interface * wait until the device enters PM (~5 sec) * `ip link set eth1 up` will never return Fixes: `d9fe64e511` ("net: asix: Add in_pm parameter") Signed-off-by: Alexander Couzens <lynxis@fe80.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:02 +02:00
Sanjeev Bansal	dd08f4e691	tg3: Add higher cpu clock for 5762. [ Upstream commit `3a498606bb` ] This patch has fix for TX timeout while running bi-directional traffic with 100 Mbps using 5762. Signed-off-by: Sanjeev Bansal <sanjeevb.bansal@broadcom.com> Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:01 +02:00
Matevz Vucnik	323bbb17d4	qmi_wwan: add support for Quectel EG91 [ Upstream commit `38cd58ed9c` ] This adds the USB id of LTE modem Quectel EG91. It requires the same quirk as other Quectel modems to make it work. Signed-off-by: Matevz Vucnik <vucnikm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:01 +02:00
Gustavo A. R. Silva	5ac2bc675c	ptp: fix missing break in switch [ Upstream commit `9ba8376ce1` ] It seems that a break is missing in order to avoid falling through to the default case. Otherwise, checking chan makes no sense. Fixes: `72df7a7244` ("ptp: Allow reassigning calibration pin function") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:01 +02:00
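A minimal sketch of the fall-through problem, assuming a simplified two-value enum. The names `pin_func`, `check_pin_buggy` and `check_pin_fixed` are hypothetical and only mirror the shape of the switch the message describes.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, reduced set of pin functions. */
enum pin_func { PF_NONE, PF_PHYSYNC };

/* Without a break, a valid channel still reaches the default case. */
int check_pin_buggy(enum pin_func func, unsigned int chan)
{
    switch (func) {
    case PF_NONE:
        break;
    case PF_PHYSYNC:
        if (chan != 0)
            return -EINVAL;
        /* BUG: no break here */
        /* fall through */
    default:
        return -EINVAL;
    }
    return 0;
}

/* With the break, the chan check decides the outcome. */
int check_pin_fixed(enum pin_func func, unsigned int chan)
{
    switch (func) {
    case PF_NONE:
        break;
    case PF_PHYSYNC:
        if (chan != 0)
            return -EINVAL;
        break;
    default:
        return -EINVAL;
    }
    return 0;
}

void missing_break_demo(void)
{
    assert(check_pin_buggy(PF_PHYSYNC, 0) == -EINVAL);  /* check was pointless */
    assert(check_pin_fixed(PF_PHYSYNC, 0) == 0);
    assert(check_pin_fixed(PF_PHYSYNC, 1) == -EINVAL);
}
```

In the buggy variant both branches of the `chan` test return the same error, which is why the message notes that checking `chan` "makes no sense" without the break.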
Heiner Kallweit	77befb4bd1	net: phy: fix flag masking in __set_phy_supported [ Upstream commit `df8ed346d4` ] Currently also the pause flags are removed from phydev->supported because they're not included in PHY_DEFAULT_FEATURES. I don't think this is intended, especially when considering that this function can be called via phy_set_max_speed() anywhere in a driver. Change the masking to mask out only the values we're going to change. In addition remove the misleading comment, job of this small function is just to adjust the supported and advertised speeds. Fixes: `f3a6bd393c` ("phylib: Add phy_set_max_speed helper") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:01 +02:00
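The masking change can be sketched as follows. The bit values and the names `F_*`, `speeds_up_to`, `limit_speed_buggy` and `limit_speed_fixed` are assumptions made for illustration; the real `SUPPORTED_*` and `PHY_*_FEATURES` constants have different values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, not the kernel's values. */
#define F_AUTONEG (1u << 0)
#define F_PAUSE   (1u << 1)
#define F_10BT    (1u << 2)
#define F_100BT   (1u << 3)
#define F_1000BT  (1u << 4)
#define F_DEFAULT F_AUTONEG                      /* has no pause bit */
#define F_SPEEDS  (F_10BT | F_100BT | F_1000BT)

/* Speed bits allowed for a given maximum speed in Mbps. */
uint32_t speeds_up_to(unsigned int max_speed)
{
    uint32_t f = F_10BT;

    if (max_speed >= 100)
        f |= F_100BT;
    if (max_speed >= 1000)
        f |= F_1000BT;
    return f;
}

/* Old masking: everything outside the default set is lost. */
uint32_t limit_speed_buggy(uint32_t supported, unsigned int max_speed)
{
    return (supported & F_DEFAULT) | speeds_up_to(max_speed);
}

/* New masking: clear only the speed bits that are about to be rewritten. */
uint32_t limit_speed_fixed(uint32_t supported, unsigned int max_speed)
{
    return (supported & ~F_SPEEDS) | speeds_up_to(max_speed);
}

void phy_mask_demo(void)
{
    uint32_t before = F_AUTONEG | F_PAUSE | F_SPEEDS;

    assert((limit_speed_buggy(before, 100) & F_PAUSE) == 0);       /* pause dropped */
    assert((limit_speed_fixed(before, 100) & F_PAUSE) == F_PAUSE); /* pause kept */
    assert((limit_speed_fixed(before, 100) & F_1000BT) == 0);      /* speed capped */
}
```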
David Ahern	f08ca4c8b4	net/ipv4: Set oif in fib_compute_spec_dst [ Upstream commit `e7372197e1` ] Xin reported that icmp replies may not use the address on the device on which the echo request is received if the destination address is broadcast. Instead a route lookup is done without considering VRF context. Fix by setting oif in flow struct to the master device if it is enslaved. That directs the lookup to the VRF table. If the device is not enslaved, oif is still 0, so there is no effect. Fixes: `cd2fbe1b6b` ("net: Use VRF device index for lookups on RX") Reported-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:01 +02:00
Lorenzo Colitti	66a7cfa057	net: diag: Don't double-free TCP_NEW_SYN_RECV sockets in tcp_abort [ Upstream commit `acc2cf4e37` ] When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first frees it by calling inet_csk_reqsk_queue_drop_and_put in tcp_abort, and then frees it again by calling sock_gen_put. Since tcp_abort only has one caller, and all the other codepaths in tcp_abort don't free the socket, just remove the free in that function. Cc: David Ahern <dsa@cumulusnetworks.com> Tested: passes Android sock_diag_test.py, which exercises this codepath Fixes: `d7226c7a4d` ("net: diag: Fix refcnt leak in error path destroying socket") Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:00 +02:00
Davidlohr Bueso	09ae0085ce	lib/rhashtable: consider param->min_size when setting initial table size [ Upstream commit `107d01f5ba` ] rhashtable_init() currently does not take into account the user-passed min_size parameter unless param->nelem_hint is set as well. As such, the default size (number of buckets) will always be HASH_DEFAULT_SIZE even if the smallest allowed size is larger than that. Remediate this by unconditionally calling into rounded_hashtable_size() and handling things accordingly. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:00 +02:00
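A rough sketch of the sizing logic described above. The default of 64 buckets, the 4/3 scaling of the hint, and all function names here are assumptions made for illustration; this is not the rhashtable source.

```c
#include <assert.h>

#define DEFAULT_SIZE 64UL  /* assumed default bucket count */

/* Smallest power of two that is >= n (for n >= 1). */
unsigned long roundup_pow2(unsigned long n)
{
    unsigned long r = 1;

    while (r < n)
        r <<= 1;
    return r;
}

unsigned long max_ul(unsigned long a, unsigned long b)
{
    return a > b ? a : b;
}

/* Before: min_size is consulted only when a hint is given. */
unsigned long initial_size_buggy(unsigned long nelem_hint, unsigned long min_size)
{
    if (nelem_hint)
        return max_ul(roundup_pow2(nelem_hint * 4 / 3), min_size);
    return DEFAULT_SIZE;
}

/* After: min_size bounds the result on both paths. */
unsigned long initial_size_fixed(unsigned long nelem_hint, unsigned long min_size)
{
    if (nelem_hint)
        return max_ul(roundup_pow2(nelem_hint * 4 / 3), min_size);
    return max_ul(DEFAULT_SIZE, min_size);
}

void rhashtable_size_demo(void)
{
    assert(initial_size_buggy(0, 1024) == 64);    /* min_size ignored */
    assert(initial_size_fixed(0, 1024) == 1024);  /* min_size honoured */
    assert(initial_size_fixed(0, 4) == 64);       /* default still applies */
}
```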
Colin Ian King	8582bbfb86	ipv6: fix useless rol32 call on hash [ Upstream commit `169dc027fb` ] The rol32 call is currently rotating hash but the rol'd value is being discarded. I believe the current code is incorrect and hash should be assigned the rotated value returned from rol32. Thanks to David Lebrun for spotting this. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:00 +02:00
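The discarded-result bug is easy to reproduce outside the kernel. In this sketch `rol32` follows the usual rotate-left definition, while `mix_buggy` and `mix_fixed` are hypothetical wrappers invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by shift bits. */
static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
    return (word << (shift & 31)) | (word >> ((-shift) & 31));
}

/* The rotated value is computed and thrown away. */
uint32_t mix_buggy(uint32_t hash)
{
    rol32(hash, 16);
    return hash;
}

/* The rotated value is assigned back to hash. */
uint32_t mix_fixed(uint32_t hash)
{
    hash = rol32(hash, 16);
    return hash;
}

void rol32_demo(void)
{
    assert(mix_buggy(0x12345678u) == 0x12345678u);  /* unchanged */
    assert(mix_fixed(0x12345678u) == 0x56781234u);  /* halves swapped */
}
```

Because `rol32()` takes its argument by value, calling it without using the return value has no effect on `hash`.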
Tyler Hicks	79870c6c65	ipv4: Return EINVAL when ping_group_range sysctl doesn't map to user ns [ Upstream commit `70ba5b6db9` ] The low and high values of the net.ipv4.ping_group_range sysctl were being silently forced to the default disabled state when a write to the sysctl contained GIDs that didn't map to the associated user namespace. Confusingly, the sysctl's write operation would return success and then a subsequent read of the sysctl would indicate that the low and high values are the overflowgid. This patch changes the behavior by clearly returning an error when the sysctl write operation receives a GID range that doesn't map to the associated user namespace. In such a situation, the previous value of the sysctl is preserved and that range will be returned in a subsequent read of the sysctl. Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:00 +02:00
Toke Høiland-Jørgensen	ec6a6039d7	gen_stats: Fix netlink stats dumping in the presence of padding [ Upstream commit `d5a672ac9f` ] The gen_stats facility will add a header for the toplevel nlattr of type TCA_STATS2 that contains all stats added by qdisc callbacks. A reference to this header is stored in the gnet_dump struct, and when all the per-qdisc callbacks have finished adding their stats, the length of the containing header will be adjusted to the right value. However, on architectures that need padding (i.e., that don't set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS), the padding nlattr is added before the stats, which means that the stored pointer will point to the padding, and so when the header is fixed up, the result is just a very big padding nlattr. Because most qdiscs also supply the legacy TCA_STATS struct, this problem has been mostly invisible, but we exposed it with the netlink attribute-based statistics in CAKE. Fix the issue by fixing up the stored pointer if it points to a padding nlattr. Tested-by: Pete Heist <pete@heistp.net> Tested-by: Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:00 +02:00
Ville Syrjälä	1e02c4f403	drm/i915: Fix hotplug irq ack on i965/g4x commit `96a85cc517` upstream. Just like with PIPESTAT, the edge triggered IIR on i965/g4x also causes problems for hotplug interrupts. To make sure we don't get the IIR port interrupt bit stuck low with the ISR bit high we must force an edge in ISR. Unfortunately we can't borrow the PIPESTAT trick and toggle the enable bits in PORT_HOTPLUG_EN as that act itself generates hotplug interrupts. Instead we just have to loop until we've cleared PORT_HOTPLUG_STAT, or we just give up and WARN. v2: Don't frob with PORT_HOTPLUG_EN Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180614175625.1615-1-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com> (cherry picked from commit `0ba7c51a6f`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:24:00 +02:00
Gustavo A. R. Silva	40974672ae	vfio/pci: Fix potential Spectre v1 commit `0e714d2778` upstream. info.index can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/vfio/pci/vfio_pci.c:734 vfio_pci_ioctl() warn: potential spectre issue 'vdev->region' Fix this by sanitizing info.index before indirectly using it to index vdev->region Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:59 +02:00
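Kernel code usually sanitizes such indices with array_index_nospec() from <linux/nospec.h>. The sketch below only illustrates the branch-free masking arithmetic behind that idea in plain C. The function names are hypothetical, the code assumes an arithmetic right shift on signed values (as GCC and Clang provide), and a userspace compiler gives no speculation guarantees for it.

```c
#include <assert.h>
#include <limits.h>

/*
 * All ones when index < size, zero otherwise, computed without a branch.
 * Assumes size <= LONG_MAX and an arithmetic right shift of signed long.
 */
unsigned long index_mask(unsigned long index, unsigned long size)
{
    long v = ~(long)(index | (size - 1UL - index));

    return (unsigned long)(v >> (sizeof(long) * CHAR_BIT - 1));
}

/* Out-of-range indices are forced to 0 instead of being used as-is. */
unsigned long clamp_index(unsigned long index, unsigned long size)
{
    return index & index_mask(index, size);
}

/* Bounds check first, then clamp before the dependent load. */
int read_region(const int *regions, unsigned long nregions,
                unsigned long index, int *out)
{
    if (index >= nregions)
        return -1;
    index = clamp_index(index, nregions);
    *out = regions[index];
    return 0;
}

void nospec_demo(void)
{
    int regions[4] = { 10, 20, 30, 40 };
    int val = 0;

    assert(clamp_index(3, 4) == 3);
    assert(clamp_index(4, 4) == 0);
    assert(read_region(regions, 4, 2, &val) == 0 && val == 30);
    assert(read_region(regions, 4, 9, &val) == -1);
}
```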
Hugh Dickins	3472e37379	mm/huge_memory.c: fix data loss when splitting a file pmd commit `e1f1b1572e` upstream. __split_huge_pmd_locked() must check if the cleared huge pmd was dirty, and propagate that to PageDirty: otherwise, data may be lost when a huge tmpfs page is modified then split then reclaimed. How has this taken so long to be noticed? Because there was no problem when the huge page is written by a write system call (shmem_write_end() calls set_page_dirty()), nor when the page is allocated for a write fault (fault_dirty_shared_page() calls set_page_dirty()); but when allocated for a read fault (which MAP_POPULATE simulates), no set_page_dirty(). Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1807111741430.1106@eggly.anvils Fixes: `d21b9e57c7` ("thp: handle file pages in split_huge_pmd()") Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Ashwin Chaugule <ashwinch@google.com> Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:59 +02:00
Jing Xia	f46b054eca	mm: memcg: fix use after free in mem_cgroup_iter() commit `9f15bde671` upstream. It was reported that a kernel crash happened in mem_cgroup_iter(), which can be triggered if the legacy cgroup-v1 non-hierarchical mode is used. Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f ...... Call trace: mem_cgroup_iter+0x2e0/0x6d4 shrink_zone+0x8c/0x324 balance_pgdat+0x450/0x640 kswapd+0x130/0x4b8 kthread+0xe8/0xfc ret_from_fork+0x10/0x20 mem_cgroup_iter(): ...... if (css_tryget(css)) <-- crash here break; ...... The crashing reason is that mem_cgroup_iter() uses the memcg object whose pointer is stored in iter->position, which has been freed before and filled with POISON_FREE(0x6b). And the root cause of the use-after-free issue is that invalidate_reclaim_iterators() fails to reset the value of iter->position to NULL when the css of the memcg is released in non- hierarchical mode. Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.com Fixes: `6df38689e0` ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim") Signed-off-by: Jing Xia <jing.xia.mail@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <chunyan.zhang@unisoc.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:59 +02:00
Alexey Brodkin	1be686fe50	ARC: configs: Remove CONFIG_INITRAMFS_SOURCE from defconfigs commit `64234961c1` upstream. We used to have pre-set CONFIG_INITRAMFS_SOURCE with local path to initramfs in ARC defconfigs. This was quite convenient for in-house development but not that convenient for newcomers who obviously don't have folders like "arc_initramfs" next to the Linux source tree. Which leads to quite surprising failure of defconfig building: ------------------------------->8----------------------------- ../scripts/gen_initramfs_list.sh: Cannot open '../../arc_initramfs_hs/' ../usr/Makefile:57: recipe for target 'usr/initramfs_data.cpio.gz' failed make[2]: *** [usr/initramfs_data.cpio.gz] Error 1 ------------------------------->8----------------------------- So now when more and more people start to deal with our defconfigs let's make their life easier with removal of CONFIG_INITRAMFS_SOURCE. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Kevin Hilman <khilman@baylibre.com> Cc: stable@vger.kernel.org Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:59 +02:00
Vineet Gupta	2ee7d6f173	ARC: mm: allow mprotect to make stack mappings executable commit `93312b6da4` upstream. mprotect(EXEC) was failing for stack mappings as default vm flags was missing MAYEXEC. This was triggered by glibc test suite nptl/tst-execstack testcase What is surprising is that despite running LTP for years on, we didn't catch this issue as it lacks a directed test case. gcc dejagnu tests with nested functions also requiring exec stack work fine though because they rely on the GNU_STACK segment spit out by compiler and handled in kernel elf loader. This glibc case is different as the stack is non exec to begin with and a dlopen of shared lib with GNU_STACK segment triggers the exec stack proceedings using a mprotect(PROT_EXEC) which was broken. CC: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:59 +02:00
Alexey Brodkin	3a80fb0d77	ARC: Fix CONFIG_SWAP commit `6e3761145a` upstream. swap was broken on ARC due to silly copy-paste issue. We encode offset from swapcache page in __swp_entry() as (off << 13) but were not decoding back in __swp_offset() as (off >> 13) - it was still (off << 13). This finally fixes swap usage on ARC. \| # mkswap /dev/sda2 \| \| # swapon -a -e /dev/sda2 \| Adding 500728k swap on /dev/sda2. Priority:-2 extents:1 across:500728k \| \| # free \| total used free shared buffers cached \| Mem: 765104 13456 751648 4736 8 4736 \| -/+ buffers/cache: 8712 756392 \| Swap: 500728 0 500728 Cc: stable@vger.kernel.org Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:58 +02:00
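The encode/decode mismatch can be sketched in a few lines. Only the 13-bit shift comes from the commit message; the function names are hypothetical and the swap type bits are left out for brevity.

```c
#include <assert.h>

#define SWP_OFFSET_SHIFT 13  /* shift quoted in the commit message */

/* Encode: place the offset above the low 13 bits. */
unsigned long swp_encode(unsigned long off)
{
    return off << SWP_OFFSET_SHIFT;
}

/* Copy-paste bug: decoding shifts left again instead of right. */
unsigned long swp_offset_buggy(unsigned long val)
{
    return val << SWP_OFFSET_SHIFT;
}

/* Decode: undo the encoding with a right shift. */
unsigned long swp_offset_fixed(unsigned long val)
{
    return val >> SWP_OFFSET_SHIFT;
}

void swp_demo(void)
{
    unsigned long entry = swp_encode(42);

    assert(swp_offset_fixed(entry) == 42);   /* round trip works */
    assert(swp_offset_buggy(entry) != 42);   /* offset is corrupted */
}
```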
Takashi Iwai	c4f094deb3	ALSA: rawmidi: Change resized buffers atomically commit `39675f7a7c` upstream. The SNDRV_RAWMIDI_IOCTL_PARAMS ioctl may resize the buffers and the current code is racy. For example, the sequencer client may write to the buffer while it is being resized. As a simple workaround, let's switch to the resized buffer inside the stream runtime lock. Reported-by: syzbot+52f83f0ea8df16932f7f@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:58 +02:00
OGAWA Hirofumi	6fc87cc95b	fat: fix memory allocation failure handling of match_strdup() commit `35033ab988` upstream. In parse_options(), if match_strdup() failed, parse_options() leaves opts->iocharset in an unexpected state (i.e. still pointing to the freed string). And this can be the cause of a double free. To fix this, always initialize opts->iocharset when freeing. Link: http://lkml.kernel.org/r/8736wp9dzc.fsf@mail.parknet.co.jp Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reported-by: syzbot+90b8e10515ae88228a92@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:58 +02:00
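The pattern of resetting a pointer at the moment it is freed can be sketched in userspace. Everything here (`struct mount_opts`, `default_charset`, the injected `dup_fn` callback) is hypothetical and only models the failure path the message describes, not the fat driver itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

char default_charset[] = "default";  /* hypothetical static fallback */

struct mount_opts {
    char *iocharset;  /* either default_charset or heap memory */
};

/* Free a heap string and restore the default in the same step. */
void reset_iocharset(struct mount_opts *o)
{
    if (o->iocharset != default_charset) {
        free(o->iocharset);
        o->iocharset = default_charset;
    }
}

/* dup_fn lets a caller inject an allocation failure. */
int set_iocharset(struct mount_opts *o, const char *val,
                  char *(*dup_fn)(const char *))
{
    char *copy;

    reset_iocharset(o);
    copy = dup_fn(val);
    if (!copy)
        return -ENOMEM;  /* iocharset already points at the default */
    o->iocharset = copy;
    return 0;
}

char *working_dup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p)
        memcpy(p, s, n);
    return p;
}

char *failing_dup(const char *s)
{
    (void)s;
    return NULL;  /* simulated allocation failure */
}

void iocharset_demo(void)
{
    struct mount_opts o = { default_charset };

    assert(set_iocharset(&o, "koi8-r", working_dup) == 0);
    assert(set_iocharset(&o, "utf8", failing_dup) == -ENOMEM);
    assert(o.iocharset == default_charset);  /* no dangling pointer */
    reset_iocharset(&o);                     /* safe: nothing freed twice */
}
```

Because the pointer is reset inside the same helper that frees it, a later cleanup call cannot free the old string a second time.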
Dewet Thibaut	91b6b9d0bf	x86/MCE: Remove min interval polling limitation commit `fbdb328c6b` upstream. commit `b3b7c4795c` ("x86/MCE: Serialize sysfs changes") introduced a min interval limitation when setting the check interval for polled MCEs. However, the logic is that 0 disables polling for corrected MCEs, see Documentation/x86/x86_64/machinecheck. The limitation prevents disabling. Remove this limitation and allow the value 0 to disable polling again. Fixes: `b3b7c4795c` ("x86/MCE: Serialize sysfs changes") Signed-off-by: Dewet Thibaut <thibaut.dewet@nokia.com> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180716084927.24869-1-alexander.sverdlin@nokia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:58 +02:00
Ville Syrjälä	6ac85d2233	x86/apm: Don't access __preempt_count with zeroed fs commit `6f6060a5c9` upstream. APM_DO_POP_SEGS does not restore fs/gs which were zeroed by APM_DO_ZERO_SEGS. Trying to access __preempt_count with zeroed fs doesn't really work. Move the ibrs call outside the APM_DO_SAVE_SEGS/APM_DO_RESTORE_SEGS invocations so that fs is actually restored before calling preempt_enable(). Fixes the following sort of oopses: [ 0.313581] general protection fault: 0000 [#1] PREEMPT SMP [ 0.313803] Modules linked in: [ 0.314040] CPU: 0 PID: 268 Comm: kapmd Not tainted 4.16.0-rc1-triton-bisect-00090-gdd84441a7971 #19 [ 0.316161] EIP: __apm_bios_call_simple+0xc8/0x170 [ 0.316161] EFLAGS: 00210016 CPU: 0 [ 0.316161] EAX: 00000102 EBX: 00000000 ECX: 00000102 EDX: 00000000 [ 0.316161] ESI: 0000530e EDI: dea95f64 EBP: dea95f18 ESP: dea95ef0 [ 0.316161] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 [ 0.316161] CR0: 80050033 CR2: 00000000 CR3: 015d3000 CR4: 000006d0 [ 0.316161] Call Trace: [ 0.316161] ? cpumask_weight.constprop.15+0x20/0x20 [ 0.316161] on_cpu0+0x44/0x70 [ 0.316161] apm+0x54e/0x720 [ 0.316161] ? __switch_to_asm+0x26/0x40 [ 0.316161] ? __schedule+0x17d/0x590 [ 0.316161] kthread+0xc0/0xf0 [ 0.316161] ? proc_apm_show+0x150/0x150 [ 0.316161] ? kthread_create_worker_on_cpu+0x20/0x20 [ 0.316161] ret_from_fork+0x2e/0x38 [ 0.316161] Code: da 8e c2 8e e2 8e ea 57 55 2e ff 1d e0 bb 5d b1 0f 92 c3 5d 5f 07 1f 89 47 0c 90 8d b4 26 00 00 00 00 90 8d b4 26 00 00 00 00 90 <64> ff 0d 84 16 5c b1 74 7f 8b 45 dc 8e e0 8b 45 d8 8e e8 8b 45 [ 0.316161] EIP: __apm_bios_call_simple+0xc8/0x170 SS:ESP: 0068:dea95ef0 [ 0.316161] ---[ end trace 656253db2deaa12c ]--- Fixes: `dd84441a79` ("x86/speculation: Use IBRS if available before calling into firmware") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: "H. 
Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/20180709133534.5963-1-ville.syrjala@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:58 +02:00
Lan Tianyu	76267a8a19	KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel. commit `b5020a8e6b` upstream. Syzbot reports crashes in kvm_irqfd_assign(), caused by use-after-free when kvm_irqfd_assign() and kvm_irqfd_deassign() run in parallel for one specific eventfd. When the assign path hasn't finished but irqfd has been added to kvm->irqfds.items list, another thread may deassign the eventfd and free struct kvm_kernel_irqfd(). The assign path then uses the struct kvm_kernel_irqfd that has been freed by deassign path. To avoid such issue, keep irqfd under kvm->irq_srcu protection after the irqfd has been added to kvm->irqfds.items list, and call synchronize_srcu() in irq_shutdown() to make sure that irqfd has been fully initialized in the assign path. Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Tianyu Lan <tianyu.lan@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-25 11:23:57 +02:00
Greg Kroah-Hartman	19e5f4da12	Linux 4.9.114	2018-07-22 14:27:43 +02:00
Tejun Heo	5c067898fe	string: drop __must_check from strscpy() and restore strscpy() usages in cgroup commit `08a77676f9` upstream. `e7fd37ba12` ("cgroup: avoid copying strings longer than the buffers") converted possibly unsafe strncpy() usages in cgroup to strscpy(). However, although the callsites are completely fine with truncated copies, because strscpy() is marked __must_check, it led to the following warnings. kernel/cgroup/cgroup.c: In function ‘cgroup_file_name’: kernel/cgroup/cgroup.c:1400:10: warning: ignoring return value of ‘strscpy’, declared with attribute warn_unused_result [-Wunused-result] strscpy(buf, cft->name, CGROUP_FILE_NAME_MAX); ^ To avoid the warnings, `50034ed496` ("cgroup: use strlcpy() instead of strscpy() to avoid spurious warning") switched them to strlcpy(). strlcpy() is worse than strscpy() because it unconditionally runs strlen() on the source string, and the only reason we switched to strlcpy() here was because it was lacking __must_check, which doesn't reflect any material differences between the two functions. It's just that someone added __must_check to strscpy() and not to strlcpy(). These basic string copy operations are used in a variety of ways, and one of the not-so-uncommon use cases is safely handling truncated copies, where the caller naturally doesn't care about the return value. The __must_check doesn't match the actual use cases and forces users to opt for inferior variants which lack __must_check by happenstance or spread ugly (void) casts. Remove __must_check from strscpy() and restore strscpy() usages in cgroup. Signed-off-by: Tejun Heo <tj@kernel.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Chris Metcalf <cmetcalf@ezchip.com> [backport only the string.h portion to remove build warnings starting to show up - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:42 +02:00
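The difference between the two copy helpers can be approximated in userspace. `strscpy_sketch` and `strlcpy_sketch` are hypothetical reimplementations that follow the documented return-value semantics (negative E2BIG on truncation versus the full source length); they are not the kernel implementations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Reads at most size bytes of src; reports truncation as -E2BIG. */
long strscpy_sketch(char *dst, const char *src, size_t size)
{
    size_t i;

    if (size == 0)
        return -E2BIG;
    for (i = 0; i < size - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
    return src[i] == '\0' ? (long)i : -E2BIG;
}

/* Always walks the whole source string, even when truncating. */
size_t strlcpy_sketch(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);  /* unconditional full scan of src */

    if (size) {
        size_t n = len >= size ? size - 1 : len;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

void strscpy_demo(void)
{
    char buf[4];

    assert(strscpy_sketch(buf, "cgroup", sizeof(buf)) == -E2BIG);
    assert(strcmp(buf, "cgr") == 0);  /* truncated but terminated */
    assert(strlcpy_sketch(buf, "cgroup", sizeof(buf)) == 6);
    assert(strcmp(buf, "cgr") == 0);
}
```

A caller that is happy with a truncated copy can ignore either return value, which is the use case the message argues `__must_check` gets in the way of.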
Marc Zyngier	ba3fe91cba	arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_ID commit `5d81f7dc9b` upstream. Now that all our infrastructure is in place, let's expose the availability of ARCH_WORKAROUND_2 to guests. We take this opportunity to tidy up a couple of SMCCC constants. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:42 +02:00
Marc Zyngier	f99e406491	arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests commit `b4f18c063a` upstream. In order to forward the guest's ARCH_WORKAROUND_2 calls to EL3, add a small(-ish) sequence to handle it at EL2. Special care must be taken to track the state of the guest itself by updating the workaround flags. We also rely on patching to enable calls into the firmware. Note that since we need to execute branches, this always executes after the Spectre-v2 mitigation has been applied. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:42 +02:00
Marc Zyngier	68240e9bb1	arm64: KVM: Add ARCH_WORKAROUND_2 support for guests commit `55e3748e89` upstream. In order to offer ARCH_WORKAROUND_2 support to guests, we need a bit of infrastructure. Let's add a flag indicating whether or not the guest uses SSBD mitigation. Depending on the state of this flag, allow KVM to disable ARCH_WORKAROUND_2 before entering the guest, and enable it when exiting it. Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:42 +02:00
Marc Zyngier	7b62e8503f	arm64: KVM: Add HYP per-cpu accessors commit `85478bab40` upstream. As we're going to require to access per-cpu variables at EL2, let's craft the minimum set of accessors required to implement reading a per-cpu variable, relying on tpidr_el2 to contain the per-cpu offset. Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:42 +02:00
Marc Zyngier	9c06aab19b	arm64: ssbd: Add prctl interface for per-thread mitigation commit `9cdc0108ba` upstream. If running on a system that performs dynamic SSBD mitigation, allow userspace to request the mitigation for itself. This is implemented as a prctl call, allowing the mitigation to be enabled or disabled at will for this particular thread. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:42 +02:00
Marc Zyngier	cf14b896e7	arm64: ssbd: Introduce thread flag to control userspace mitigation commit `9dd9614f54` upstream. In order to allow userspace to be mitigated on demand, let's introduce a new thread flag that prevents the mitigation from being turned off when exiting to userspace, and doesn't turn it on on entry into the kernel (with the assumption that the mitigation is always enabled in the kernel itself). This will be used by a prctl interface introduced in a later patch. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:42 +02:00
Marc Zyngier	d8fbc84469	arm64: ssbd: Restore mitigation status on CPU resume commit `647d0519b5` upstream. On a system where firmware can dynamically change the state of the mitigation, the CPU will always come up with the mitigation enabled, including when coming back from suspend. If the user has requested "no mitigation" via a command line option, let's enforce it by calling into the firmware again to disable it. Similarly, for a resume from hibernate, the mitigation could have been disabled by the boot kernel. Let's ensure that it is set back on in that case. Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:42 +02:00
Marc Zyngier	42f967dede	arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation commit `986372c436` upstream. In order to avoid checking arm64_ssbd_callback_required on each kernel entry/exit even if no mitigation is required, let's add yet another alternative that by default jumps over the mitigation, and that gets nop'ed out if we're doing dynamic mitigation. Think of it as a poor man's static key... Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:41 +02:00
Marc Zyngier	242bff3816	arm64: ssbd: Add global mitigation state accessor commit `c32e1736ca` upstream. We're about to need the mitigation state in various parts of the kernel in order to do the right thing for userspace and guests. Let's expose an accessor that will let other subsystems know about the state. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:41 +02:00
Marc Zyngier	3a64e6a998	arm64: Add 'ssbd' command-line option commit `a43ae4dfe5` upstream. On a system where the firmware implements ARCH_WORKAROUND_2, it may be useful to either permanently enable or disable the workaround for cases where the user decides that they'd rather not get a trap overhead, and keep the mitigation permanently on or off instead of switching it on exception entry/exit. In any case, default to the mitigation being enabled. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:41 +02:00
Marc Zyngier	e7037bd9fc	arm64: Add ARCH_WORKAROUND_2 probing commit `a725e3dda1` upstream. As for Spectre variant-2, we rely on SMCCC 1.1 to provide the discovery mechanism for detecting the SSBD mitigation. A new capability is also allocated for that purpose, and a config option. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:41 +02:00
Marc Zyngier	d8174bd75c	arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2 commit `5cf9ce6e5e` upstream. In a heterogeneous system, we can end up with both affected and unaffected CPUs. Let's check their status before calling into the firmware. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:41 +02:00
Marc Zyngier	be33163090	arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1 commit `8e2906245f` upstream. In order for the kernel to protect itself, let's call the SSBD mitigation implemented by the higher exception level (either hypervisor or firmware) on each transition between userspace and kernel. We must take the PSCI conduit into account in order to target the right exception level, hence the introduction of a runtime patching callback. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:41 +02:00
Marc Zyngier	d1b5c19583	arm/arm64: smccc: Add SMCCC-specific return codes commit `eff0e9e107` upstream. We've so far used the PSCI return codes for SMCCC because they were extremely similar. But with the new ARM DEN 0070A specification, "NOT_REQUIRED" (-2) is clashing with PSCI's "PSCI_RET_INVALID_PARAMS". Let's bite the bullet and add SMCCC specific return codes. Users can be repainted as and when required. Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:41 +02:00
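The clash described in the row above can be seen by putting the two sets of values side by side. The numeric values below follow the SMCCC and PSCI definitions; the `ssbd_probe` enum and `classify_probe` helper are hypothetical names used only for this sketch.

```c
/* Return codes as defined by the SMCCC and PSCI specifications. */
#define SMCCC_RET_SUCCESS        0
#define SMCCC_RET_NOT_SUPPORTED  (-1)
#define SMCCC_RET_NOT_REQUIRED   (-2)
#define PSCI_RET_INVALID_PARAMS  (-2)  /* same value, different meaning */

/* Hypothetical classification of a workaround discovery call result. */
enum ssbd_probe {
    SSBD_PROBE_UNKNOWN,       /* firmware does not implement the call */
    SSBD_PROBE_NOT_REQUIRED,  /* this CPU does not need the workaround */
    SSBD_PROBE_AVAILABLE,     /* workaround is implemented and usable */
};

static enum ssbd_probe classify_probe(long ret)
{
    switch (ret) {
    case SMCCC_RET_SUCCESS:
        return SSBD_PROBE_AVAILABLE;
    case SMCCC_RET_NOT_REQUIRED:
        /* Read through the PSCI names, -2 would look like an
         * "invalid parameters" error instead of "not required". */
        return SSBD_PROBE_NOT_REQUIRED;
    case SMCCC_RET_NOT_SUPPORTED:
    default:
        return SSBD_PROBE_UNKNOWN;
    }
}
```

Keeping separate SMCCC names means a reader of the call site sees "not required" rather than a misleading PSCI error name for the same number.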
Christoffer Dall	cab367c1c9	KVM: arm64: Avoid storing the vcpu pointer on the stack Commit `4464e210de` upstream. We already have the percpu area for the host cpu state, which points to the VCPU, so there's no need to store the VCPU pointer on the stack on every context switch. We can be a little more clever and just use tpidr_el2 for the percpu offset and load the VCPU pointer from the host context. This has the benefit of being able to retrieve the host context even when our stack is corrupted, and it has a potential performance benefit because we trade a store plus a load for an mrs and a load on a round trip to the guest. This does require us to calculate the percpu offset without including the offset from the kernel mapping of the percpu array to the linear mapping of the array (which is what we store in tpidr_el1), because a PC-relative generated address in EL2 is already giving us the hyp alias of the linear mapping of a kernel address. We do this in __cpu_init_hyp_mode() by using kvm_ksym_ref(). The code that accesses ESR_EL2 was previously using an alternative to use the _EL1 accessor on VHE systems, but this was actually unnecessary as the _EL1 accessor aliases the ESR_EL2 register on VHE, and the _EL2 accessor does the same thing on both systems. Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:41 +02:00
Marc Zyngier	4276825938	KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state Commit `44a497abd6` upstream. kvm_vgic_global_state is part of the read-only section, and is usually accessed using a PC-relative address generation (adrp + add). It is thus useless to use kern_hyp_va() on it, and actively problematic if kern_hyp_va() becomes non-idempotent. On the other hand, there is no way that the compiler is going to guarantee that such access is always PC relative. So let's bite the bullet and provide our own accessor. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:41 +02:00
Marc Zyngier	3e75f25aad	arm64: alternatives: Add dynamic patching feature Commit `dea5e2a4c5` upstream. We've so far relied on a patching infrastructure that only gave us a single alternative, without any way to provide a range of potential replacement instructions. For a single feature, this is an all or nothing thing. It would be interesting to have a more flexible, finer-grained way of patching the kernel though, where we could dynamically tune the code that gets injected. In order to achieve this, let's introduce a new form of dynamic patching, associating a callback with a patching site. This callback gets source and target locations of the patching request, as well as the number of instructions to be patched. Dynamic patching is declared with the new ALTERNATIVE_CB and alternative_cb directives: asm volatile(ALTERNATIVE_CB("mov %0, #0\n", callback) : "r" (v)); or alternative_cb callback mov x0, #0 alternative_cb_end where callback is the C function computing the alternative. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:40 +02:00
James Morse	8bace8ac81	KVM: arm64: Stop save/restoring host tpidr_el1 on VHE Commit `1f742679c3` upstream. Now that a VHE host uses tpidr_el2 for the cpu offset we no longer need KVM to save/restore tpidr_el1. Move this from the 'common' code into the non-vhe code. While we're at it, on VHE we don't need to save the ELR or SPSR as kernel_entry in entry.S will have pushed these onto the kernel stack, and will restore them from there. Move these to the non-vhe code as we need them to get back to the host. Finally remove the always-copy-tpidr we hid in the stage2 setup code, cpufeature's enable callback will do this for VHE, we only need KVM to do it for non-vhe. Add the copy into kvm-init instead. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:40 +02:00
James Morse	eea59020a7	arm64: alternatives: use tpidr_el2 on VHE hosts Commit `6d99b68933` upstream. Now that KVM uses tpidr_el2 in the same way as Linux's cpu_offset in tpidr_el1, merge the two. This saves KVM from save/restoring tpidr_el1 on VHE hosts, and allows future code to blindly access per-cpu variables without triggering world-switch. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:40 +02:00
James Morse	fa043b975c	KVM: arm64: Change hyp_panic()s dependency on tpidr_el2 Commit `c97e166e54` upstream. Make tpidr_el2 a cpu-offset for per-cpu variables in the same way the host uses tpidr_el1. This lets tpidr_el{1,2} have the same value, and on VHE they can be the same register. KVM calls hyp_panic() when anything unexpected happens. This may occur while a guest owns the EL1 registers. KVM stashes the vcpu pointer in tpidr_el2, which it uses to find the host context in order to restore the host EL1 registers before parachuting into the host's panic(). The host context is a struct kvm_cpu_context allocated in the per-cpu area, and mapped to hyp. Given the per-cpu offset for this CPU, this is easy to find. Change hyp_panic() to take a pointer to the struct kvm_cpu_context. Wrap these calls with an asm function that retrieves the struct kvm_cpu_context from the host's per-cpu area. Copy the per-cpu offset from the host's tpidr_el1 into tpidr_el2 during kvm init. (Later patches will make this unnecessary for VHE hosts.) We print out the vcpu pointer as part of the panic message. Add a back reference to the 'running vcpu' in the host cpu context to preserve this. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:40 +02:00
James Morse	6a654e6939	KVM: arm/arm64: Convert kvm_host_cpu_state to a static per-cpu allocation Commit `36989e7fd3` upstream. kvm_host_cpu_state is a per-cpu allocation made from kvm_arch_init() used to store the host EL1 registers when KVM switches to a guest. Make it easier for ASM to generate pointers into this per-cpu memory by making it a static allocation. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:40 +02:00
James Morse	02891fdbfd	KVM: arm64: Store vcpu on the stack during __guest_enter() Commit `32b03d1059` upstream. KVM uses tpidr_el2 as its private vcpu register, which makes sense for non-vhe world switch as only KVM can access this register. This means vhe Linux has to use tpidr_el1, which KVM has to save/restore as part of the host context. If the SDEI handler code runs behind KVM's back, it mustn't access any per-cpu variables. To allow this on systems with vhe we need to make the host use tpidr_el2, saving KVM from save/restoring it. __guest_enter() stores the host_ctxt on the stack, do the same with the vcpu. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:40 +02:00
Mark Rutland	c488ae439d	arm64: assembler: introduce ldr_this_cpu Commit `1b7e2296a8` upstream. Shortly we will want to load a percpu variable in the return from userspace path. We can save an instruction by folding the addition of the percpu offset into the load instruction, and this patch adds a new helper to do so. At the same time, we clean up this_cpu_ptr for consistency. As with {adr,ldr,str}_l, we change the template to take the destination register first, and name this dst. Secondly, we rename the macro to adr_this_cpu, following the scheme of adr_l, and matching the newly added ldr_this_cpu. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:40 +02:00
Tetsuo Handa	d31a56d23e	net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL. commit `3bc53be9db` upstream. syzbot is reporting stalls at nfc_llcp_send_ui_frame() [1]. This is because nfc_llcp_send_ui_frame() is retrying the loop without any delay when nonblocking nfc_alloc_send_skb() returned NULL. Since there is no need to use MSG_DONTWAIT if we retry until sock_alloc_send_pskb() succeeds, let's use blocking call. Also, in case an unexpected error occurred, let's break the loop if blocking nfc_alloc_send_skb() failed. [1] https://syzkaller.appspot.com/bug?id=4a131cc571c3733e0eff6bc673f4e36ae48f19c6 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+d29d18215e477cfbfbdd@syzkaller.appspotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:40 +02:00
Santosh Shilimkar	863d5568b7	rds: avoid unenecessary cong_update in loop transport commit `f1693c63ab` upstream. For the loop transport, which is self loopback, a remote port congestion update isn't relevant. In fact the xmit path already ignores it. The receive path needs to do the same. Reported-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:39 +02:00
Florian Westphal	ad8b1ffc3e	netfilter: ipv6: nf_defrag: drop skb dst before queueing commit `84379c9afe` upstream. Eric Dumazet reports: Here is a reproducer of an annoying bug detected by syzkaller on our production kernel [..] ./b78305423 enable_conntrack Then : sleep 60 dmesg | tail -10 [ 171.599093] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 181.631024] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 191.687076] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 201.703037] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 211.711072] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 221.959070] unregister_netdevice: waiting for lo to become free. Usage count = 2 Reproducer sends ipv6 fragment that hits nfct defrag via LOCAL_OUT hook. skb gets queued until frag timer expiry -- 1 minute. Normally nf_conntrack_reasm gets called during prerouting, so skb has no dst yet which might explain why this wasn't spotted earlier. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:39 +02:00
Eric Biggers	3d0ce44daf	KEYS: DNS: fix parsing multiple options commit `c604cb7670` upstream. My recent fix for dns_resolver_preparse() printing very long strings was incomplete, as shown by syzbot which still managed to hit the WARN_ONCE() in set_precision() by adding a crafted "dns_resolver" key: precision 50001 too large WARNING: CPU: 7 PID: 864 at lib/vsprintf.c:2164 vsnprintf+0x48a/0x5a0 The bug this time isn't just a printing bug, but also a logical error when multiple options ("#"-separated strings) are given in the key payload. Specifically, when separating an option string into name and value, if there is no value then the name is incorrectly considered to end at the end of the key payload, rather than the end of the current option. This bypasses validation of the option length, and also means that specifying multiple options is broken -- which presumably has gone unnoticed as there is currently only one valid option anyway. A similar problem also applied to option values, as the kstrtoul() when parsing the "dnserror" option will read past the end of the current option and into the next option. Fix these bugs by correctly computing the length of the option name and by copying the option value, null-terminated, into a temporary buffer. Reproducer for the WARN_ONCE() that syzbot hit: perl -e 'print "#A#", "\0" x 50000' | keyctl padd dns_resolver desc @s Reproducer for "dnserror" option being parsed incorrectly (expected behavior is to fail when seeing the unknown option "foo", actual behavior was to read the dnserror value as "1#foo" and fail there): perl -e 'print "#dnserror=1#foo\0"' | keyctl padd dns_resolver desc @s Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: `4a2d789267` ("DNS: If the DNS server returns an error, allow that to be cached [ver #2]") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:39 +02:00
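The two fixes named in the row above (bounding the option name by the current option, and copying the value into a NUL-terminated temporary buffer before converting it) can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel code: `parse_dns_options` is a hypothetical helper that takes the text after the leading "#", and the 128-byte option limit and the 1..511 accepted range are assumptions of this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse "name=value#name=value...".
 * Returns the dnserror value, 0 if no option was given,
 * or -1 on a malformed or unknown option. */
static long parse_dns_options(const char *opts, size_t len)
{
    const char *end = opts + len;
    long dnserror = 0;

    while (opts < end) {
        /* The current option ends at the next '#' or at the payload end. */
        const char *next = memchr(opts, '#', (size_t)(end - opts));
        const char *opt_end = next ? next : end;
        size_t opt_len = (size_t)(opt_end - opts);
        const char *eq;
        size_t name_len;

        if (opt_len == 0 || opt_len > 128)
            return -1;

        /* Search for '=' only inside this option, so a name without a
         * value ends at opt_end rather than at the end of the payload. */
        eq = memchr(opts, '=', opt_len);
        name_len = eq ? (size_t)(eq - opts) : opt_len;

        if (eq && name_len == 8 && memcmp(opts, "dnserror", 8) == 0) {
            char buf[16];
            char *stop;
            unsigned long v;
            size_t val_len = (size_t)(opt_end - (eq + 1));

            if (val_len == 0 || val_len >= sizeof(buf))
                return -1;
            /* Copy the value so the conversion cannot run into the
             * next option. */
            memcpy(buf, eq + 1, val_len);
            buf[val_len] = '\0';
            v = strtoul(buf, &stop, 10);
            if (*stop != '\0' || v < 1 || v > 511)
                return -1;
            dnserror = (long)v;
        } else {
            return -1; /* unknown option */
        }
        opts = next ? next + 1 : end;
    }
    return dnserror;
}
```

With this structure, the payload `dnserror=1#foo` fails on the unknown option `foo` instead of on a value misread as `1#foo`.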
Eric Biggers	ec5e52a881	reiserfs: fix buffer overflow with long warning messages commit `fe10e398e8` upstream. ReiserFS prepares log messages into a 1024-byte buffer with no bounds checks. Long messages, such as the "unknown mount option" warning when userspace passes a crafted mount options string, overflow this buffer. This causes KASAN to report a global-out-of-bounds write. Fix it by truncating messages to the buffer size. Link: http://lkml.kernel.org/r/20180707203621.30922-1-ebiggers3@gmail.com Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: syzbot+b890b3335a4d8c608963@syzkaller.appspotmail.com Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:39 +02:00
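The truncation approach described in the row above amounts to using the bounded formatting call with the real buffer size. A minimal userspace sketch, assuming a 1024-byte static buffer as in the commit message; `error_buf` and `format_warning` are hypothetical names for this illustration, and the kernel uses its own vsnprintf implementation:

```c
#include <stdarg.h>
#include <stdio.h>

#define ERROR_BUF_SIZE 1024

static char error_buf[ERROR_BUF_SIZE];

/* Hypothetical helper: format a warning into the fixed-size buffer.
 * vsnprintf() never writes more than sizeof(error_buf) bytes and always
 * NUL-terminates, so an overlong message is truncated instead of
 * overflowing. Returns the length the full message would have had. */
static int format_warning(const char *fmt, ...)
{
    va_list args;
    int needed;

    va_start(args, fmt);
    needed = vsnprintf(error_buf, sizeof(error_buf), fmt, args);
    va_end(args);
    return needed;
}
```

A caller can compare the return value with `ERROR_BUF_SIZE` to detect that truncation happened.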
Florian Westphal	064d9e9744	netfilter: ebtables: reject non-bridge targets commit `11ff7288be` upstream. the ebtables evaluation loop expects targets to return positive values (jumps), or negative values (absolute verdicts). This is completely different from what xtables does. In xtables, targets are expected to return the standard netfilter verdicts, i.e. NF_DROP, NF_ACCEPT, etc. ebtables will consider these as jumps. Therefore reject any target found due to unspec fallback. v2: also reject watchers. ebtables ignores their return value, so a target that assumes skb ownership (and returns NF_STOLEN) causes use-after-free. The only watchers in the 'ebtables' front-end are log and nflog; both have AF_BRIDGE specific wrappers on kernel side. Reported-by: syzbot+2b43f681169a2a0d306a@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:39 +02:00
Stefan Wahren	f6ed63bc39	net: lan78xx: Fix race in tx pending skb size calculation commit `dea39aca1d` upstream. The skb size calculation in lan78xx_tx_bh is in race with the start_xmit, which could lead to rare kernel oopses. So protect the whole skb walk with a spin lock. As a benefit we can unlink the skb directly. This patch was tested on Raspberry Pi 3B+ Link: https://github.com/raspberrypi/linux/issues/2608 Fixes: `55d7de9de6` ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet") Cc: stable <stable@vger.kernel.org> Signed-off-by: Floris Bos <bos@je-eigen-domein.nl> Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:39 +02:00
Ping-Ke Shih	254f52df9b	rtlwifi: rtl8821ae: fix firmware is not ready to run commit `9a98302de1` upstream. Without this patch, firmware will not run properly on rtl8821ae, which causes a bad user experience: for example, poor connection performance at low rates, higher power consumption, and so on. rtl8821ae uses two kinds of firmware, for the normal and WoWLAN cases, and each has its own firmware data buffer and size. The original code always overwrites the size of the normal firmware, rtlpriv->rtlhal.fwsize, and this mismatch causes a firmware checksum error, so the firmware can't start. In this situation, the driver prints the message "Firmware is not ready to run!". Fixes: `fe89707f0a` ("rtlwifi: rtl8821ae: Simplify loading of WOWLAN firmware") Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Cc: Stable <stable@vger.kernel.org> # 4.0+ Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:39 +02:00
Gustavo A. R. Silva	53e795c755	net: cxgb3_main: fix potential Spectre v1 commit `676bcfece1` upstream. t.qset_idx can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:2286 cxgb_extension_ioctl() warn: potential spectre issue 'adapter->msix_info' Fix this by sanitizing t.qset_idx before using it to index adapter->msix_info Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:39 +02:00
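The index sanitizing mentioned in the row above can be modelled in portable C with a branch-free mask, in the spirit of the kernel's generic fallback for array_index_nospec(). This is a userspace sketch only: the function names are hypothetical, it assumes `size` fits in a signed long and that right-shifting a negative signed value is arithmetic (true for GCC and Clang), and real kernels may use architecture-specific barriers instead.

```c
/* All-ones when index < size, zero otherwise, computed without a branch
 * so the result does not depend on a mispredicted bounds check. */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (sizeof(long) * 8 - 1));
}

/* Clamp index to 0 when it is out of bounds, leave it unchanged otherwise. */
static unsigned long clamp_index_nospec(unsigned long index, unsigned long size)
{
    return index & index_mask_nospec(index, size);
}

/* Hypothetical lookup: the architectural bounds check still rejects bad
 * indices, and the clamp limits what a speculated load can reach. */
static int lookup_nospec(const int *table, unsigned long size,
                         unsigned long idx, int *out)
{
    if (idx >= size)
        return -1;
    idx = clamp_index_nospec(idx, size);
    *out = table[idx];
    return 0;
}
```

The clamp is applied right after the bounds check and before the first dependent load, which matches the stated policy of killing the speculation on the first load.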
Alex Vesker	224d2337c0	net/mlx5: Fix command interface race in polling mode [ Upstream commit `d412c31dae` ] The command interface can work in two modes: Events and Polling. In the general case, each time we invoke a command, a work is queued to handle it. When working in events, the interrupt handler completes the command execution. On the other hand, when working in polling mode, the work itself completes it. Due to a bug in the work handler, a command could have been completed by the interrupt handler, while the work handler hasn't finished yet, causing it to complete once again if the command interface mode was changed from Events to polling after the interrupt handler was called. mlx5_unload_one() mlx5_stop_eqs() // Destroy the EQ before cmd EQ ...cmd_work_handler() write_doorbell() --> EVENT_TYPE_CMD mlx5_cmd_comp_handler() // First free free_ent(cmd, ent->idx) complete(&ent->done) <-- mlx5_stop_eqs //cmd was complete // move to polling before destroying the last cmd EQ mlx5_cmd_use_polling() cmd->mode = POLL; --> cmd_work_handler (continues) if (cmd->mode == POLL) mlx5_cmd_comp_handler() // Double free The solution is to store the cmd->mode before writing the doorbell. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:39 +02:00
Eric Dumazet	9dc96f7205	net/packet: fix use-after-free [ Upstream commit `945d015ee0` ] We should put copy_skb in receive_queue only after a successful call to virtio_net_hdr_from_skb(). syzbot report : BUG: KASAN: use-after-free in __skb_unlink include/linux/skbuff.h:1843 [inline] BUG: KASAN: use-after-free in __skb_dequeue include/linux/skbuff.h:1863 [inline] BUG: KASAN: use-after-free in skb_dequeue+0x16a/0x180 net/core/skbuff.c:2815 Read of size 8 at addr ffff8801b044ecc0 by task syz-executor217/4553 CPU: 0 PID: 4553 Comm: syz-executor217 Not tainted 4.18.0-rc1+ #111 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 __skb_unlink include/linux/skbuff.h:1843 [inline] __skb_dequeue include/linux/skbuff.h:1863 [inline] skb_dequeue+0x16a/0x180 net/core/skbuff.c:2815 skb_queue_purge+0x26/0x40 net/core/skbuff.c:2852 packet_set_ring+0x675/0x1da0 net/packet/af_packet.c:4331 packet_release+0x630/0xd90 net/packet/af_packet.c:2991 __sock_release+0xd7/0x260 net/socket.c:603 sock_close+0x19/0x20 net/socket.c:1186 __fput+0x35b/0x8b0 fs/file_table.c:209 ____fput+0x15/0x20 fs/file_table.c:243 task_work_run+0x1ec/0x2a0 kernel/task_work.c:113 exit_task_work include/linux/task_work.h:22 [inline] do_exit+0x1b08/0x2750 kernel/exit.c:865 do_group_exit+0x177/0x440 kernel/exit.c:968 __do_sys_exit_group kernel/exit.c:979 [inline] __se_sys_exit_group kernel/exit.c:977 [inline] __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:977 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4448e9 Code: Bad RIP value. 
RSP: 002b:00007ffd5f777ca8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004448e9 RDX: 00000000004448e9 RSI: 000000000000fcfb RDI: 0000000000000001 RBP: 00000000006cf018 R08: 00007ffd0000a45b R09: 0000000000000000 R10: 00007ffd5f777e48 R11: 0000000000000202 R12: 00000000004021f0 R13: 0000000000402280 R14: 0000000000000000 R15: 0000000000000000 Allocated by task 4553: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554 skb_clone+0x1f5/0x500 net/core/skbuff.c:1282 tpacket_rcv+0x28f7/0x3200 net/packet/af_packet.c:2221 deliver_skb net/core/dev.c:1925 [inline] deliver_ptype_list_skb net/core/dev.c:1940 [inline] __netif_receive_skb_core+0x1bfb/0x3680 net/core/dev.c:4611 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4693 netif_receive_skb_internal+0x12e/0x7d0 net/core/dev.c:4767 netif_receive_skb+0xbf/0x420 net/core/dev.c:4791 tun_rx_batched.isra.55+0x4ba/0x8c0 drivers/net/tun.c:1571 tun_get_user+0x2af1/0x42f0 drivers/net/tun.c:1981 tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:2009 call_write_iter include/linux/fs.h:1795 [inline] new_sync_write fs/read_write.c:474 [inline] __vfs_write+0x6c6/0x9f0 fs/read_write.c:487 vfs_write+0x1f8/0x560 fs/read_write.c:549 ksys_write+0x101/0x260 fs/read_write.c:598 __do_sys_write fs/read_write.c:610 [inline] __se_sys_write fs/read_write.c:607 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:607 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 4553: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kmem_cache_free+0x86/0x2d0 mm/slab.c:3756 kfree_skbmem+0x154/0x230 net/core/skbuff.c:582 
__kfree_skb net/core/skbuff.c:642 [inline] kfree_skb+0x1a5/0x580 net/core/skbuff.c:659 tpacket_rcv+0x189e/0x3200 net/packet/af_packet.c:2385 deliver_skb net/core/dev.c:1925 [inline] deliver_ptype_list_skb net/core/dev.c:1940 [inline] __netif_receive_skb_core+0x1bfb/0x3680 net/core/dev.c:4611 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4693 netif_receive_skb_internal+0x12e/0x7d0 net/core/dev.c:4767 netif_receive_skb+0xbf/0x420 net/core/dev.c:4791 tun_rx_batched.isra.55+0x4ba/0x8c0 drivers/net/tun.c:1571 tun_get_user+0x2af1/0x42f0 drivers/net/tun.c:1981 tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:2009 call_write_iter include/linux/fs.h:1795 [inline] new_sync_write fs/read_write.c:474 [inline] __vfs_write+0x6c6/0x9f0 fs/read_write.c:487 vfs_write+0x1f8/0x560 fs/read_write.c:549 ksys_write+0x101/0x260 fs/read_write.c:598 __do_sys_write fs/read_write.c:610 [inline] __se_sys_write fs/read_write.c:607 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:607 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff8801b044ecc0 which belongs to the cache skbuff_head_cache of size 232 The buggy address is located 0 bytes inside of 232-byte region [ffff8801b044ecc0, ffff8801b044eda8) The buggy address belongs to the page: page:ffffea0006c11380 count:1 mapcount:0 mapping:ffff8801d9be96c0 index:0x0 flags: 0x2fffc0000000100(slab) raw: 02fffc0000000100 ffffea0006c17988 ffff8801d9bec248 ffff8801d9be96c0 raw: 0000000000000000 ffff8801b044e040 000000010000000c 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8801b044eb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8801b044ec00: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc >ffff8801b044ec80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ^ ffff8801b044ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801b044ed80: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc Fixes: 
`58d19b19cd` ("packet: vnet_hdr support for tpacket_rcv") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:38 +02:00
Jason Wang	e11eb6a3f9	vhost_net: validate sock before trying to put its fd [ Upstream commit `b8f1f65882` ] Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when we hit errors during ubuf allocation, the code does not check for NULL before calling sockfd_put(), which leads to a NULL pointer dereference. Fix this by checking the sock pointer first. Fixes: `bab632d69e` ("vhost: vhost TX zero-copy support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:38 +02:00
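The guard described in the vhost_net entry above can be sketched in userspace as follows. This is a minimal illustration, not the driver code: `struct sock_ref`, `sock_put_ref()` and `set_backend()` are hypothetical stand-ins for the kernel's socket type, `sockfd_put()` and `vhost_net_set_backend()`.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted socket. */
struct sock_ref {
    int refs;
};

/* Hypothetical stand-in for sockfd_put(): it dereferences its argument. */
static void sock_put_ref(struct sock_ref *sock)
{
    sock->refs--;
}

/*
 * Sketch of a set-backend error path. `sock` is NULL when the caller
 * passed fd == -1; `alloc_fails` simulates the ubuf allocation failing.
 * Returns 0 on success, -1 on error.
 */
static int set_backend(struct sock_ref *sock, int alloc_fails)
{
    if (sock)
        sock->refs++;          /* take a reference for this backend */

    if (alloc_fails)
        goto err_ubufs;

    return 0;

err_ubufs:
    if (sock)                  /* the NULL check the commit message describes */
        sock_put_ref(sock);
    return -1;
}
```

Without the `if (sock)` test in the error path, the call with a NULL `sock` and a failing allocation would dereference a NULL pointer.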
Ilpo Järvinen	65fb77c3ba	tcp: prevent bogus FRTO undos with non-SACK flows [ Upstream commit `1236f22fba` ] If SACK is not enabled and the first cumulative ACK after the RTO retransmission covers more than the retransmitted skb, a spurious FRTO undo will trigger (assuming FRTO is enabled for that RTO). The reason is that any non-retransmitted segment acknowledged will set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is no indication that it would have been delivered for real (the scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK case so the check for that bit won't help like it does with SACK). Having FLAG_ORIG_SACK_ACKED set results in the spurious FRTO undo in tcp_process_loss. We need to use a stricter condition for the non-SACK case and check that none of the cumulatively ACKed segments were retransmitted to prove that progress is due to original transmissions. Only then keep FLAG_ORIG_SACK_ACKED set, allowing FRTO undo to proceed in the non-SACK case. (FLAG_ORIG_SACK_ACKED is planned to be renamed to FLAG_ORIG_PROGRESS to better indicate its purpose but to keep this change minimal, it will be done in another patch). Besides burstiness and congestion control violations, this problem can result in an RTO loop: when the loss recovery is prematurely undone, only new data will be transmitted (if available) and the next retransmission can occur only after a new RTO, which in case of multiple losses (that are not for consecutive packets) requires one RTO per loss to recover. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:37 +02:00
Yuchung Cheng	63253726a5	tcp: fix Fast Open key endianness [ Upstream commit `c860e997e9` ] Fast Open key could be stored in different endian based on the CPU. Previously hosts in different endianness in a server farm using the same key config (sysctl value) would produce different cookies. This patch fixes it by always storing it as little endian to keep same API for LE hosts. Reported-by: Daniele Iamartino <danielei@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:37 +02:00
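The idea of storing the Fast Open key in one fixed byte order can be illustrated with a small userspace sketch. The function name `key_words_to_le_bytes()` is hypothetical; it shows only the general technique of serializing 32-bit words as little-endian bytes independent of the host CPU, not the kernel's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Serialize `n` 32-bit words into `out` (4 * n bytes) as little-endian,
 * regardless of the byte order of the host. Shifts operate on values,
 * not on memory layout, so the result is the same on LE and BE hosts.
 */
static void key_words_to_le_bytes(const uint32_t *words, size_t n, uint8_t *out)
{
    for (size_t i = 0; i < n; i++) {
        out[4 * i + 0] = (uint8_t)(words[i] & 0xff);
        out[4 * i + 1] = (uint8_t)((words[i] >> 8) & 0xff);
        out[4 * i + 2] = (uint8_t)((words[i] >> 16) & 0xff);
        out[4 * i + 3] = (uint8_t)((words[i] >> 24) & 0xff);
    }
}
```

Because every host produces the same byte sequence from the same configured words, hosts of different endianness sharing one key config would derive identical cookies.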
Jiri Slaby	3e05636990	r8152: napi hangup fix after disconnect [ Upstream commit `0ee1f47349` ] When unplugging an r8152 adapter while the interface is UP, the NIC becomes unusable. usb->disconnect (aka rtl8152_disconnect) deletes napi. Then, rtl8152_disconnect calls unregister_netdev and that invokes netdev->ndo_stop (aka rtl8152_close). rtl8152_close tries to napi_disable, but the napi is already deleted by disconnect above. So the first while loop in napi_disable never finishes. This results in a complete deadlock of the network layer as there is rtnl_mutex held by unregister_netdev. So avoid the call to napi_disable in rtl8152_close when the device is already gone. The other calls to usb_kill_urb, cancel_delayed_work_sync, netif_stop_queue etc. seem to be fine. The urb and netdev are not destroyed yet. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: linux-usb@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:37 +02:00
Aleksander Morgado	b0a508a5c8	qmi_wwan: add support for the Dell Wireless 5821e module [ Upstream commit `e7e197edd0` ] This module exposes two USB configurations: a QMI+AT capable setup on USB config #1 and a MBIM capable setup on USB config #2. By default the kernel will choose the MBIM capable configuration as long as the cdc_mbim driver is available. This patch adds support for the QMI port in the secondary configuration. Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:37 +02:00
Sudarsana Reddy Kalluru	0b79604960	qed: Limit msix vectors in kdump kernel to the minimum required count. [ Upstream commit `bb7858ba11` ] Memory size is limited in the kdump kernel environment. Allocating more MSI-X vectors (or queues) consumes a few tens of MB of memory, which might cause the kdump kernel to fail. This patch limits the number of MSI-X vectors in the kdump kernel to the minimum required value (i.e., 2 per engine). Fixes: `fe56b9e6a` ("qed: Add module with basic common support") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:37 +02:00
Sudarsana Reddy Kalluru	a648a4636d	qed: Fix use of incorrect size in memcpy call. [ Upstream commit `cc9b27cdf7` ] Use the correct size value while copying chassis/port id values. Fixes: `6ad8c632e` ("qed: Add support for query/config dcbx.") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:37 +02:00
Eric Dumazet	32490f4d76	net: sungem: fix rx checksum support [ Upstream commit `12b03558ce` ] After commit `88078d98d1` ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"), sungem owners reported the infamous "eth0: hw csum failure" message. CHECKSUM_COMPLETE has in fact never worked for this driver, but this was masked by the fact that upper stacks had to strip the FCS, and therefore skb->ip_summed was set back to CHECKSUM_NONE before my recent change. Driver configures a number of bytes to skip when the chip computes the checksum, and for some reason only half of the Ethernet header was skipped. Then a second problem is that we should strip the FCS by default, unless the driver is updated to eventually support NETIF_F_RXFCS in the future. Finally, a driver should check if NETIF_F_RXCSUM feature is enabled or not, so that the admin can turn off rx checksum if wanted. Many thanks to Andreas Schwab and Mathieu Malaterre for their help in debugging this issue. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Meelis Roos <mroos@linux.ee> Reported-by: Mathieu Malaterre <malat@debian.org> Reported-by: Andreas Schwab <schwab@linux-m68k.org> Tested-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:37 +02:00
Konstantin Khlebnikov	1f1fbe1692	net_sched: blackhole: tell upper qdisc about dropped packets [ Upstream commit `7e85dc8cb3` ] When blackhole is used on top of a classful qdisc like hfsc, it breaks the qlen and backlog counters because packets disappear without notice. In HFSC, a non-zero qlen while all classes are inactive triggers a warning: WARNING: ... at net/sched/sch_hfsc.c:1393 hfsc_dequeue+0xba4/0xe90 [sch_hfsc] and schedules watchdog work endlessly. This patch returns __NET_XMIT_BYPASS in addition to NET_XMIT_SUCCESS; this flag tells the upper layer that the packet is gone and isn't queued. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:36 +02:00
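The accounting problem in the blackhole entry above can be shown with a simplified userspace model. Everything here is hypothetical: the constants `XMIT_SUCCESS` and `XMIT_BYPASS` are illustrative values rather than copies of the kernel's definitions, and `parent_enqueue()` is a much reduced stand-in for what a classful qdisc does with its child's return code.

```c
/* Illustrative values only; not copied from kernel headers. */
#define XMIT_SUCCESS 0x00u
#define XMIT_BYPASS  0x20000u

/* Hypothetical parent qdisc that tracks how many packets its child holds. */
struct parent_q {
    unsigned int qlen;
};

/* A child that really queues the packet. */
static unsigned int fifo_enqueue(void)
{
    return XMIT_SUCCESS;
}

/* A blackhole child: reports success, but flags that nothing was queued. */
static unsigned int blackhole_enqueue(void)
{
    return XMIT_SUCCESS | XMIT_BYPASS;
}

/* The parent only counts packets that the child actually kept. */
static void parent_enqueue(struct parent_q *p, unsigned int (*child)(void))
{
    unsigned int ret = child();

    if (!(ret & XMIT_BYPASS))
        p->qlen++;
}
```

If the blackhole child returned plain success, the parent's `qlen` would grow with every dropped packet and never drain, which is the stale-counter condition the commit message describes.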
Shay Agroskin	14e9e6527a	net/mlx5: Fix wrong size allocation for QoS ETC TC register [ Upstream commit `d14fcb8d87` ] The driver allocates the wrong size (due to a wrong struct name) when issuing a query/set request to the NIC's register. Fixes: `d8880795da` ("net/mlx5e: Implement DCBNL IEEE max rate") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:36 +02:00
Alex Vesker	5b3cc7f9b3	net/mlx5: Fix incorrect raw command length parsing [ Upstream commit `603b7bcff8` ] The NULL character was not set correctly for the string containing the command length, this caused failures reading the output of the command due to a random length. The fix is to initialize the output length string. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:36 +02:00
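The class of bug in the mlx5 entry above (parsing a number from a buffer that was never NUL-terminated) can be sketched in plain C. The helper `parse_len()` and the limit `LEN_FIELD_MAX` are hypothetical names for illustration; the sketch assumes the fix is equivalent to zero-initializing the destination buffer, as the commit message suggests.

```c
#include <stdlib.h>
#include <string.h>

#define LEN_FIELD_MAX 8   /* hypothetical maximum number of digits */

/*
 * Parse a decimal length from `src`, which holds `count` bytes and is
 * NOT guaranteed to be NUL-terminated. Returns -1 if it does not fit.
 */
static long parse_len(const char *src, size_t count)
{
    /* Zero-initializing guarantees a terminator after the copied bytes. */
    char buf[LEN_FIELD_MAX + 1] = { 0 };

    if (count > LEN_FIELD_MAX)
        return -1;
    memcpy(buf, src, count);
    return strtol(buf, NULL, 10);
}
```

With an uninitialized `buf`, the parser would keep reading whatever bytes follow the copied digits, which matches the "random length" symptom reported above.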
Eric Dumazet	e555ae018b	net: dccp: switch rx_tstamp_last_feedback to monotonic clock [ Upstream commit `0ce4e70ff0` ] To compute delays, better not use time of the day which can be changed by admins or malicious programs. Also change ccid3_first_li() to use s64 type for delta variable to avoid potential overflows. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: dccp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:36 +02:00
Eric Dumazet	87cd5e4acd	net: dccp: avoid crash in ccid3_hc_rx_send_feedback() [ Upstream commit `74174fe563` ] On fast hosts or malicious bots, we trigger a DCCP_BUG() which seems excessive. syzbot reported : BUG: delta (-6195) <= 0 at net/dccp/ccids/ccid3.c:628/ccid3_hc_rx_send_feedback() CPU: 1 PID: 18 Comm: ksoftirqd/1 Not tainted 4.18.0-rc1+ #112 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 ccid3_hc_rx_send_feedback net/dccp/ccids/ccid3.c:628 [inline] ccid3_hc_rx_packet_recv.cold.16+0x38/0x71 net/dccp/ccids/ccid3.c:793 ccid_hc_rx_packet_recv net/dccp/ccid.h:185 [inline] dccp_deliver_input_to_ccids+0xf0/0x280 net/dccp/input.c:180 dccp_rcv_established+0x87/0xb0 net/dccp/input.c:378 dccp_v4_do_rcv+0x153/0x180 net/dccp/ipv4.c:654 sk_backlog_rcv include/net/sock.h:914 [inline] __sk_receive_skb+0x3ba/0xd80 net/core/sock.c:517 dccp_v4_rcv+0x10f9/0x1f58 net/dccp/ipv4.c:875 ip_local_deliver_finish+0x2eb/0xda0 net/ipv4/ip_input.c:215 NF_HOOK include/linux/netfilter.h:287 [inline] ip_local_deliver+0x1e9/0x750 net/ipv4/ip_input.c:256 dst_input include/net/dst.h:450 [inline] ip_rcv_finish+0x823/0x2220 net/ipv4/ip_input.c:396 NF_HOOK include/linux/netfilter.h:287 [inline] ip_rcv+0xa18/0x1284 net/ipv4/ip_input.c:492 __netif_receive_skb_core+0x2488/0x3680 net/core/dev.c:4628 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4693 process_backlog+0x219/0x760 net/core/dev.c:5373 napi_poll net/core/dev.c:5771 [inline] net_rx_action+0x7da/0x1980 net/core/dev.c:5837 __do_softirq+0x2e8/0xb17 kernel/softirq.c:284 run_ksoftirqd+0x86/0x100 kernel/softirq.c:645 smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164 kthread+0x345/0x410 kernel/kthread.c:240 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: 
dccp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:36 +02:00
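One way to tolerate a non-positive interval instead of treating it as a bug is to clamp it. The sketch below is an assumption about the general shape of such a fix, not a quotation of the patch; `feedback_interval_us()` is a hypothetical name.

```c
#include <stdint.h>

/*
 * Microseconds elapsed between two timestamps, clamped to at least 1 so
 * that a later division by the result is always well defined. A signed
 * 64-bit type keeps the subtraction from overflowing for realistic
 * timestamp values.
 */
static int64_t feedback_interval_us(int64_t now_us, int64_t last_us)
{
    int64_t delta = now_us - last_us;

    if (delta <= 0)
        delta = 1;
    return delta;
}
```

A reading such as the `delta (-6195)` in the report above would then be treated as the shortest measurable interval rather than triggering a BUG.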
Xin Long	d7adadbf09	ipvlan: fix IFLA_MTU ignored on NEWLINK [ Upstream commit `30877961b1` ] Commit `296d485680` ("ipvlan: inherit MTU from master device") adjusted the mtu from the master device when creating an ipvlan device, but it would also override the mtu value set in rtnl_create_link. This causes the IFLA_MTU param to have no effect. So this patch does not adjust the mtu if the IFLA_MTU param is set when creating an ipvlan device. Fixes: `296d485680` ("ipvlan: inherit MTU from master device") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:36 +02:00
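The rule in the ipvlan entry above (inherit the master's MTU only when the user did not supply one) reduces to a small decision. The sketch below uses hypothetical names (`struct link_request`, `new_link_mtu()`) and does not reproduce the netlink attribute handling of the real driver.

```c
#include <stdbool.h>

/* Hypothetical summary of what the user asked for when creating the link. */
struct link_request {
    bool has_mtu;        /* true when an explicit MTU attribute was given */
    unsigned int mtu;    /* only meaningful when has_mtu is true */
};

/* Pick the MTU for the new device: an explicit request wins over inheritance. */
static unsigned int new_link_mtu(const struct link_request *req,
                                 unsigned int master_mtu)
{
    if (req->has_mtu)
        return req->mtu;
    return master_mtu;
}
```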
Gustavo A. R. Silva	b76942ac26	atm: zatm: Fix potential Spectre v1 [ Upstream commit `ced9e19150` ] pool can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/atm/zatm.c:1491 zatm_ioctl() warn: potential spectre issue 'zatm_dev->pool_info' (local cap) Fix this by sanitizing pool before using it to index zatm_dev->pool_info Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:36 +02:00
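Sanitizing an index before it is used to address an array is typically done in the kernel with the `array_index_nospec()` helper. The portable C below only illustrates the branch-free masking idea behind such a helper; `index_mask()` and `clamp_index()` are hypothetical names, and real kernel builds may use architecture-specific code instead.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Returns all ones when index < size and 0 otherwise, without a branch.
 * Assumes size <= LONG_MAX and that >> on a negative long is an
 * arithmetic shift (true for GCC and Clang).
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

/* In-range indices pass through unchanged; out-of-range ones become 0. */
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
    return index & index_mask(index, size);
}
```

Because the clamp is a data dependency rather than a conditional branch, a mispredicted bounds check cannot steer a speculative load to an attacker-chosen offset.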
Christian Lamparter	e77e7d8f6b	crypto: crypto4xx - fix crypto4xx_build_pdr, crypto4xx_build_sdr leak commit `5d59ad6eea` upstream. If one of the later memory allocations in crypto4xx_build_pdr() fails, dev->pdr (and/or) dev->pdr_uinfo is not freed. crypto4xx_build_sdr() has the same issue with dev->sdr. Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:36 +02:00
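Leaks of this kind are commonly avoided with a goto-based unwind, where each failure label frees everything allocated before it. The userspace sketch below shows that pattern under stated assumptions: `struct ring`, `ring_build()` and `ring_destroy()` are hypothetical, and `malloc()` stands in for the driver's DMA and kernel allocations.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the device's descriptor ring state. */
struct ring {
    void *desc;
    void *uinfo;
    void *shadow;
};

/*
 * Allocate three buffers. If a later allocation fails, free the earlier
 * ones before returning. `fail_at` (1..3) simulates a failure; 0 = none.
 */
static int ring_build(struct ring *r, int fail_at)
{
    r->desc = r->uinfo = r->shadow = NULL;

    r->desc = (fail_at == 1) ? NULL : malloc(64);
    if (!r->desc)
        return -1;

    r->uinfo = (fail_at == 2) ? NULL : malloc(64);
    if (!r->uinfo)
        goto err_free_desc;

    r->shadow = (fail_at == 3) ? NULL : malloc(64);
    if (!r->shadow)
        goto err_free_uinfo;

    return 0;

err_free_uinfo:
    free(r->uinfo);
    r->uinfo = NULL;
err_free_desc:
    free(r->desc);
    r->desc = NULL;
    return -1;
}

/* Release everything a successful ring_build() allocated. */
static void ring_destroy(struct ring *r)
{
    free(r->shadow);
    free(r->uinfo);
    free(r->desc);
    r->desc = r->uinfo = r->shadow = NULL;
}
```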
Christian Lamparter	af4b765a78	crypto: crypto4xx - remove bad list_del commit `a728a196d2` upstream. alg entries are only added to the list, after the registration was successful. If the registration failed, it was never added to the list in the first place. Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:36 +02:00
Jonas Gorski	68bf812b35	bcm63xx_enet: do not write to random DMA channel on BCM6345 commit `d6213c1f2a` upstream. The DMA controller regs actually point to DMA channel 0, so the write to ENETDMA_CFG_REG will actually modify a random DMA channel. Since DMA controller registers do not exist on BCM6345, guard the write with the usual check for dma_has_sram. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:36 +02:00
Jonas Gorski	f5490a6ec5	bcm63xx_enet: correct clock usage commit `9c86b846ce` upstream. Check the return code of prepare_enable and change one last instance of enable only to prepare_enable. Also properly disable and release the clock in error paths and on remove for enetsw. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
Heiner Kallweit	f61de8ef5c	mtd: m25p80: consider max message size in m25p80_read commit `9e276de6a3` upstream. Consider a message size limit when calculating the maximum amount of data that can be read. The message size limit has been introduced with 4.9, so cc it to stable. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
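Taking a message size limit into account when sizing a read can be sketched as a clamp over three quantities. The function `read_chunk_len()` and its parameters are hypothetical; in particular, treating the command prefix as sharing the message budget with the data is an assumption made for illustration.

```c
#include <stddef.h>

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/*
 * Largest data payload for a single read.
 *   want         - bytes the caller asked for
 *   max_transfer - limit on one transfer
 *   max_message  - limit on the whole message (command prefix + data)
 *   cmd_len      - bytes used by the opcode/address/dummy prefix
 * Returns 0 when the prefix alone does not fit in a message.
 */
static size_t read_chunk_len(size_t want, size_t max_transfer,
                             size_t max_message, size_t cmd_len)
{
    if (cmd_len >= max_message)
        return 0;
    return min_size(want, min_size(max_transfer, max_message - cmd_len));
}
```

A caller would loop, issuing reads of `read_chunk_len(...)` bytes until the requested total has been transferred.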
alex chen	78a65505cd	ocfs2: ip_alloc_sem should be taken in ocfs2_get_block() commit `3e4c56d41e` upstream. ip_alloc_sem should be taken in ocfs2_get_block() when reading file in DIRECT mode to prevent concurrent access to extent tree with ocfs2_dio_end_io_write(), which may cause BUGON in the following situation: read file 'A' end_io of writing file 'A' vfs_read __vfs_read ocfs2_file_read_iter generic_file_read_iter ocfs2_direct_IO __blockdev_direct_IO do_blockdev_direct_IO do_direct_IO get_more_blocks ocfs2_get_block ocfs2_extent_map_get_blocks ocfs2_get_clusters ocfs2_get_clusters_nocache() ocfs2_search_extent_list return the index of record which contains the v_cluster, that is v_cluster > rec[i]->e_cpos. ocfs2_dio_end_io ocfs2_dio_end_io_write down_write(&oi->ip_alloc_sem); ocfs2_mark_extent_written ocfs2_change_extent_flag ocfs2_split_extent ... --> modify the rec[i]->e_cpos, resulting in v_cluster < rec[i]->e_cpos. BUG_ON(v_cluster < le32_to_cpu(rec->e_cpos)) [alex.chen@huawei.com: v3] Link: http://lkml.kernel.org/r/59EF3614.6050008@huawei.com Fixes: `c15471f795` ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Jun Piao <piaojun@huawei.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Reviewed-by: Gang He <ghe@suse.com> Acked-by: Changwei Ge <ge.changwei@h3c.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
alex chen	32a1733cf8	ocfs2: subsystem.su_mutex is required while accessing the item->ci_parent commit `853bc26a7e` upstream. The subsystem.su_mutex is required while accessing the item->ci_parent, otherwise, NULL pointer dereference to the item->ci_parent will be triggered in the following situation: add node delete node sys_write vfs_write configfs_write_file o2nm_node_store o2nm_node_local_write do_rmdir vfs_rmdir configfs_rmdir mutex_lock(&subsys->su_mutex); unlink_obj item->ci_group = NULL; item->ci_parent = NULL; to_o2nm_cluster_from_node node->nd_item.ci_parent->ci_parent BUG since of NULL pointer dereference to nd_item.ci_parent Moreover, the o2nm_cluster also should be protected by the subsystem.su_mutex. [alex.chen@huawei.com: v2] Link: http://lkml.kernel.org/r/59EEAA69.9080703@huawei.com Link: http://lkml.kernel.org/r/59E9B36A.10700@huawei.com Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Jun Piao <piaojun@huawei.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
Nick Desaulniers	1919f3fd55	x86/paravirt: Make native_save_fl() extern inline commit `d0a8d9378d` upstream. native_save_fl() is marked static inline, but by using it as a function pointer in arch/x86/kernel/paravirt.c, it MUST be outlined. paravirt's use of native_save_fl() also requires that no GPRs other than %rax are clobbered. Compilers have different heuristics which they use to emit stack guard code, the emittance of which can break paravirt's callee saved assumption by clobbering %rcx. Marking a function definition extern inline means that if this version cannot be inlined, then the out-of-line version will be preferred. By having the out-of-line version be implemented in assembly, it cannot be instrumented with a stack protector, which might violate custom calling conventions that code like paravirt rely on. The semantics of extern inline has changed since gnu89. This means that folks using GCC versions >= 5.1 may see symbol redefinition errors at link time for subdirs that override KBUILD_CFLAGS (making the C standard used implicit) regardless of this patch. This has been cleaned up earlier in the patch set, but is left as a note in the commit message for future travelers. Reports: https://lkml.org/lkml/2018/5/7/534 https://github.com/ClangBuiltLinux/linux/issues/16 Discussion: https://bugs.llvm.org/show_bug.cgi?id=37512 https://lkml.org/lkml/2018/5/24/1371 Thanks to the many folks that participated in the discussion. Debugged-by: Alistair Strachan <astrachan@google.com> Debugged-by: Matthias Kaehlcke <mka@chromium.org> Suggested-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: H. 
Peter Anvin <hpa@zytor.com> Suggested-by: Tom Stellar <tstellar@redhat.com> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@redhat.com Cc: akataria@vmware.com Cc: akpm@linux-foundation.org Cc: andrea.parri@amarulasolutions.com Cc: ard.biesheuvel@linaro.org Cc: aryabinin@virtuozzo.com Cc: astrachan@google.com Cc: boris.ostrovsky@oracle.com Cc: brijesh.singh@amd.com Cc: caoj.fnst@cn.fujitsu.com Cc: geert@linux-m68k.org Cc: ghackmann@google.com Cc: gregkh@linuxfoundation.org Cc: jan.kiszka@siemens.com Cc: jarkko.sakkinen@linux.intel.com Cc: joe@perches.com Cc: jpoimboe@redhat.com Cc: keescook@google.com Cc: kirill.shutemov@linux.intel.com Cc: kstewart@linuxfoundation.org Cc: linux-efi@vger.kernel.org Cc: linux-kbuild@vger.kernel.org Cc: manojgupta@google.com Cc: mawilcox@microsoft.com Cc: michal.lkml@markovi.net Cc: mjg59@google.com Cc: mka@chromium.org Cc: pombredanne@nexb.com Cc: rientjes@google.com Cc: rostedt@goodmis.org Cc: thomas.lendacky@amd.com Cc: tweek@google.com Cc: virtualization@lists.linux-foundation.org Cc: will.deacon@arm.com Cc: yamada.masahiro@socionext.com Link: http://lkml.kernel.org/r/20180621162324.36656-4-ndesaulniers@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
H. Peter Anvin	cb877e4763	x86/asm: Add _ASM_ARG* constants for argument registers to <asm/asm.h> commit `0e2e160033` upstream. i386 and x86-64 uses different registers for arguments; make them available so we don't have to #ifdef in the actual code. Native size and specified size (q, l, w, b) versions are provided. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Sedat Dilek <sedat.dilek@gmail.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@redhat.com Cc: akataria@vmware.com Cc: akpm@linux-foundation.org Cc: andrea.parri@amarulasolutions.com Cc: ard.biesheuvel@linaro.org Cc: arnd@arndb.de Cc: aryabinin@virtuozzo.com Cc: astrachan@google.com Cc: boris.ostrovsky@oracle.com Cc: brijesh.singh@amd.com Cc: caoj.fnst@cn.fujitsu.com Cc: geert@linux-m68k.org Cc: ghackmann@google.com Cc: gregkh@linuxfoundation.org Cc: jan.kiszka@siemens.com Cc: jarkko.sakkinen@linux.intel.com Cc: joe@perches.com Cc: jpoimboe@redhat.com Cc: keescook@google.com Cc: kirill.shutemov@linux.intel.com Cc: kstewart@linuxfoundation.org Cc: linux-efi@vger.kernel.org Cc: linux-kbuild@vger.kernel.org Cc: manojgupta@google.com Cc: mawilcox@microsoft.com Cc: michal.lkml@markovi.net Cc: mjg59@google.com Cc: mka@chromium.org Cc: pombredanne@nexb.com Cc: rientjes@google.com Cc: rostedt@goodmis.org Cc: thomas.lendacky@amd.com Cc: tstellar@redhat.com Cc: tweek@google.com Cc: virtualization@lists.linux-foundation.org Cc: will.deacon@arm.com Cc: yamada.masahiro@socionext.com Link: http://lkml.kernel.org/r/20180621162324.36656-3-ndesaulniers@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
Nick Desaulniers	02c89527b0	compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline declarations commit `d03db2bc26` upstream. Functions marked extern inline do not emit an externally visible function when the gnu89 C standard is used. Some KBUILD Makefiles overwrite KBUILD_CFLAGS. This is an issue for GCC 5.1+ users as without an explicit C standard specified, the default is gnu11. Since c99, the semantics of extern inline have changed such that an externally visible function is always emitted. This can lead to multiple definition errors of extern inline functions at link time of compilation units whose build files have removed an explicit C standard compiler flag for users of GCC 5.1+ or Clang. Suggested-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: H. Peter Anvin <hpa@zytor.com> Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@redhat.com Cc: akataria@vmware.com Cc: akpm@linux-foundation.org Cc: andrea.parri@amarulasolutions.com Cc: ard.biesheuvel@linaro.org Cc: aryabinin@virtuozzo.com Cc: astrachan@google.com Cc: boris.ostrovsky@oracle.com Cc: brijesh.singh@amd.com Cc: caoj.fnst@cn.fujitsu.com Cc: geert@linux-m68k.org Cc: ghackmann@google.com Cc: gregkh@linuxfoundation.org Cc: jan.kiszka@siemens.com Cc: jarkko.sakkinen@linux.intel.com Cc: jpoimboe@redhat.com Cc: keescook@google.com Cc: kirill.shutemov@linux.intel.com Cc: kstewart@linuxfoundation.org Cc: linux-efi@vger.kernel.org Cc: linux-kbuild@vger.kernel.org Cc: manojgupta@google.com Cc: mawilcox@microsoft.com Cc: michal.lkml@markovi.net Cc: mjg59@google.com Cc: mka@chromium.org Cc: pombredanne@nexb.com Cc: rientjes@google.com Cc: rostedt@goodmis.org Cc: sedat.dilek@gmail.com Cc: thomas.lendacky@amd.com Cc: tstellar@redhat.com Cc: tweek@google.com Cc: 
virtualization@lists.linux-foundation.org Cc: will.deacon@arm.com Cc: yamada.masahiro@socionext.com Link: http://lkml.kernel.org/r/20180621162324.36656-2-ndesaulniers@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
David Rientjes	29524a9d42	compiler, clang: always inline when CONFIG_OPTIMIZE_INLINING is disabled commit `9a04dbcfb3` upstream. The motivation for commit `abb2ea7dfd` ("compiler, clang: suppress warning for unused static inline functions") was to suppress clang's warnings about unused static inline functions. For configs without CONFIG_OPTIMIZE_INLINING enabled, such as any non-x86 architecture, `inline' in the kernel implies that __attribute__((always_inline)) is used. Some code depends on that behavior, see https://lkml.org/lkml/2017/6/13/918: net/built-in.o: In function `__xchg_mb': arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99' arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99 The full fix would be to identify these breakages and annotate the functions with __always_inline instead of `inline'. But since we are late in the 4.12-rc cycle, simply carry forward the forced inlining behavior and work toward moving arm64, and other architectures, toward CONFIG_OPTIMIZE_INLINING behavior. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706261552200.1075@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Sodagudi Prasad <psodagud@codeaurora.org> Tested-by: Sodagudi Prasad <psodagud@codeaurora.org> Tested-by: Matthias Kaehlcke <mka@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
Linus Torvalds	f276b50c3a	compiler, clang: properly override 'inline' for clang commit `6d53cefb18` upstream. Commit `abb2ea7dfd` ("compiler, clang: suppress warning for unused static inline functions") just caused more warnings due to re-defining the 'inline' macro. So undef it before re-defining it, and also add the 'notrace' attribute like the gcc version that this is overriding does. Maybe this makes clang happier. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
David Rientjes	94cc698fda	compiler, clang: suppress warning for unused static inline functions commit `abb2ea7dfd` upstream. GCC explicitly does not warn for unused static inline functions for -Wunused-function. The manual states: Warn whenever a static function is declared but not defined or a non-inline static function is unused. Clang does warn for static inline functions that are unused. It turns out that suppressing the warnings avoids potentially complex #ifdef directives, which also reduces LOC. Suppress the warning for clang. Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
Paul Burton	dc9e795b08	MIPS: Use async IPIs for arch_trigger_cpumask_backtrace() commit `b63e132b64` upstream. The current MIPS implementation of arch_trigger_cpumask_backtrace() is broken because it attempts to use synchronous IPIs despite the fact that it may be run with interrupts disabled. This means that when arch_trigger_cpumask_backtrace() is invoked, for example by the RCU CPU stall watchdog, we may: - Deadlock due to use of synchronous IPIs with interrupts disabled, causing the CPU that's attempting to generate the backtrace output to hang itself. - Not succeed in generating the desired output from remote CPUs. - Produce warnings about this from smp_call_function_many(), for example: [42760.526910] INFO: rcu_sched detected stalls on CPUs/tasks: [42760.535755] 0-...!: (1 GPs behind) idle=ade/140000000000000/0 softirq=526944/526945 fqs=0 [42760.547874] 1-...!: (0 ticks this GP) idle=e4a/140000000000000/0 softirq=547885/547885 fqs=0 [42760.559869] (detected by 2, t=2162 jiffies, g=266689, c=266688, q=33) [42760.568927] ------------[ cut here ]------------ [42760.576146] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:416 smp_call_function_many+0x88/0x20c [42760.587839] Modules linked in: [42760.593152] CPU: 2 PID: 1216 Comm: sh Not tainted 4.15.4-00373-gee058bb4d0c2 #2 [42760.603767] Stack : 8e09bd20 8e09bd20 8e09bd20 fffffff0 00000007 00000006 00000000 8e09bca8 [42760.616937] 95b2b379 95b2b379 807a0080 00000007 81944518 0000018a 00000032 00000000 [42760.630095] 00000000 00000030 80000000 00000000 806eca74 00000009 8017e2b8 000001a0 [42760.643169] 00000000 00000002 00000000 8e09baa4 00000008 808b8008 86d69080 8e09bca0 [42760.656282] 8e09ad50 805e20aa 00000000 00000000 00000000 8017e2b8 00000009 801070ca [42760.669424] ... 
[42760.673919] Call Trace: [42760.678672] [<27fde568>] show_stack+0x70/0xf0 [42760.685417] [<84751641>] dump_stack+0xaa/0xd0 [42760.692188] [<699d671c>] __warn+0x80/0x92 [42760.698549] [<68915d41>] warn_slowpath_null+0x28/0x36 [42760.705912] [<f7c76c1c>] smp_call_function_many+0x88/0x20c [42760.713696] [<6bbdfc2a>] arch_trigger_cpumask_backtrace+0x30/0x4a [42760.722216] [<f845bd33>] rcu_dump_cpu_stacks+0x6a/0x98 [42760.729580] [<796e7629>] rcu_check_callbacks+0x672/0x6ac [42760.737476] [<059b3b43>] update_process_times+0x18/0x34 [42760.744981] [<6eb94941>] tick_sched_handle.isra.5+0x26/0x38 [42760.752793] [<478d3d70>] tick_sched_timer+0x1c/0x50 [42760.759882] [<e56ea39f>] __hrtimer_run_queues+0xc6/0x226 [42760.767418] [<e88bbcae>] hrtimer_interrupt+0x88/0x19a [42760.775031] [<6765a19e>] gic_compare_interrupt+0x2e/0x3a [42760.782761] [<0558bf5f>] handle_percpu_devid_irq+0x78/0x168 [42760.790795] [<90c11ba2>] generic_handle_irq+0x1e/0x2c [42760.798117] [<1b6d462c>] gic_handle_local_int+0x38/0x86 [42760.805545] [<b2ada1c7>] gic_irq_dispatch+0xa/0x14 [42760.812534] [<90c11ba2>] generic_handle_irq+0x1e/0x2c [42760.820086] [<c7521934>] do_IRQ+0x16/0x20 [42760.826274] [<9aef3ce6>] plat_irq_dispatch+0x62/0x94 [42760.833458] [<6a94b53c>] except_vec_vi_end+0x70/0x78 [42760.840655] [<22284043>] smp_call_function_many+0x1ba/0x20c [42760.848501] [<54022b58>] smp_call_function+0x1e/0x2c [42760.855693] [<ab9fc705>] flush_tlb_mm+0x2a/0x98 [42760.862730] [<0844cdd0>] tlb_flush_mmu+0x1c/0x44 [42760.869628] [<cb259b74>] arch_tlb_finish_mmu+0x26/0x3e [42760.877021] [<1aeaaf74>] tlb_finish_mmu+0x18/0x66 [42760.883907] [<b3fce717>] exit_mmap+0x76/0xea [42760.890428] [<c4c8a2f6>] mmput+0x80/0x11a [42760.896632] [<a41a08f4>] do_exit+0x1f4/0x80c [42760.903158] [<ee01cef6>] do_group_exit+0x20/0x7e [42760.909990] [<13fa8d54>] __wake_up_parent+0x0/0x1e [42760.917045] [<46cf89d0>] smp_call_function_many+0x1a2/0x20c [42760.924893] [<8c21a93b>] syscall_common+0x14/0x1c [42760.931765] ---[ end 
trace 02aa09da9dc52a60 ]--- [42760.938342] ------------[ cut here ]------------ [42760.945311] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:291 smp_call_function_single+0xee/0xf8 ... This patch switches MIPS' arch_trigger_cpumask_backtrace() to use async IPIs & smp_call_function_single_async() in order to resolve this problem. We ensure use of the pre-allocated call_single_data_t structures is serialized by maintaining a cpumask indicating that they're busy, and refusing to attempt to send an IPI when a CPU's bit is set in this mask. This should only happen if a CPU hasn't responded to a previous backtrace IPI - ie. if it's hung - and we print a warning to the console in this case. I've marked this for stable branches as far back as v4.9, to which it applies cleanly. Strictly speaking the faulty MIPS implementation can be traced further back to commit `856839b768` ("MIPS: Add arch_trigger_all_cpu_backtrace() function") in v3.19, but kernel versions v3.19 through v4.8 will require further work to backport due to the rework performed in commit `9a01c3ed5c` ("nmi_backtrace: add more trigger__cpu_backtrace() methods"). Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/19597/ Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Huacai Chen <chenhc@lemote.com> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.9+ Fixes: `856839b768` ("MIPS: Add arch_trigger_all_cpu_backtrace() function") Fixes: `9a01c3ed5c` ("nmi_backtrace: add more trigger__cpu_backtrace() methods") [ Huacai: backported to 4.9: Replace "call_single_data_t" with "struct call_single_data" ] Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-22 14:27:35 +02:00
Greg Kroah-Hartman	f77982e691	Linux 4.9.113	2018-07-17 11:37:54 +02:00
Tetsuo Handa	b2660f35d3	loop: remember whether sysfs_create_group() was done commit `d3349b6b3c` upstream. syzbot is hitting WARN() triggered by memory allocation fault injection [1] because loop module is calling sysfs_remove_group() when sysfs_create_group() failed. Fix this by remembering whether sysfs_create_group() succeeded. [1] https://syzkaller.appspot.com/bug?id=3f86c0edf75c86d2633aeb9dd69eccc70bc7e90b Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+9f03168400f56df89dbc6f1751f4458fe739ff29@syzkaller.appspotmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Renamed sysfs_ready -> sysfs_inited. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-07-17 11:37:54 +02:00
Leon Romanovsky	684db31e74	RDMA/ucm: Mark UCM interface as BROKEN commit `7a8690ed6f` upstream. In commit 357d23c811a7 ("Remove the obsolete libibcm library") in rdma-core [1], we removed the obsolete library which used the /dev/infiniband/ucmX interface. Following multiple syzkaller reports about non-sanitized user input in the UCMA module, a short audit reveals the same issues in the UCM module too. It is better to disable this interface in the kernel before the syzkaller team invests time and energy to harden this unused interface. [1] https://github.com/linux-rdma/rdma-core/pull/279 Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:54 +02:00
Tetsuo Handa	34f841a3c3	PM / hibernate: Fix oops at snapshot_write() commit `fc14eebfc2` upstream. syzbot is reporting NULL pointer dereference at snapshot_write() [1]. This is because data->handle is zero-cleared by ioctl(SNAPSHOT_FREE). Fix this by checking data_of(data->handle) != NULL before using it. [1] https://syzkaller.appspot.com/bug?id=828a3c71bd344a6de8b6a31233d51a72099f27fd Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+ae590932da6e45d6564d@syzkaller.appspotmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:54 +02:00
Theodore Ts'o	e3cf1cc9ed	loop: add recursion validation to LOOP_CHANGE_FD commit `d2ac838e4c` upstream. Refactor the validation code used in LOOP_SET_FD so it is also used in LOOP_CHANGE_FD. Otherwise it is possible to construct a set of loop devices that all refer to each other. This can lead to an infinite loop starting with "while (is_loop_device(f)) .." in loop_set_fd(). Fix this by refactoring out the validation code and using it for LOOP_CHANGE_FD as well as LOOP_SET_FD. Reported-by: syzbot+4349872271ece473a7c91190b68b4bac7c5dbc87@syzkaller.appspotmail.com Reported-by: syzbot+40bd32c4d9a3cc12a339@syzkaller.appspotmail.com Reported-by: syzbot+769c54e66f994b041be7@syzkaller.appspotmail.com Reported-by: syzbot+0a89a9ce473936c57065@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:54 +02:00
Florian Westphal	40352e791c	netfilter: x_tables: initialise match/target check parameter struct commit `c568503ef0` upstream. syzbot reports following splat: BUG: KMSAN: uninit-value in ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162 ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162 xt_check_match+0x1438/0x1650 net/netfilter/x_tables.c:506 ebt_check_match net/bridge/netfilter/ebtables.c:372 [inline] ebt_check_entry net/bridge/netfilter/ebtables.c:702 [inline] The uninitialised access is xt_mtchk_param->nft_compat ... which should be set to 0. Fix it by zeroing the struct beforehand, same for tgchk. ip(6)tables targetinfo uses c99-style initialiser, so no change needed there. Reported-by: syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com Fixes: `55917a21d0` ("netfilter: x_tables: add context to know if extension runs from nft_compat") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:54 +02:00
Eric Dumazet	ac378e6ade	netfilter: nf_queue: augment nfqa_cfg_policy commit `ba062ebb2c` upstream. Three attributes are currently not verified, thus can trigger KMSAN warnings such as : BUG: KMSAN: uninit-value in __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline] BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline] BUG: KMSAN: uninit-value in nfqnl_recv_config+0x939/0x17d0 net/netfilter/nfnetlink_queue.c:1268 CPU: 1 PID: 4521 Comm: syz-executor120 Not tainted 4.17.0+ #5 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:113 kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1117 __msan_warning_32+0x70/0xc0 mm/kmsan/kmsan_instr.c:620 __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline] __fswab32 include/uapi/linux/swab.h:59 [inline] nfqnl_recv_config+0x939/0x17d0 net/netfilter/nfnetlink_queue.c:1268 nfnetlink_rcv_msg+0xb2e/0xc80 net/netfilter/nfnetlink.c:212 netlink_rcv_skb+0x37e/0x600 net/netlink/af_netlink.c:2448 nfnetlink_rcv+0x2fe/0x680 net/netfilter/nfnetlink.c:513 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x1680/0x1750 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x104f/0x1350 net/netlink/af_netlink.c:1901 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg net/socket.c:639 [inline] ___sys_sendmsg+0xec8/0x1320 net/socket.c:2117 __sys_sendmsg net/socket.c:2155 [inline] __do_sys_sendmsg net/socket.c:2164 [inline] __se_sys_sendmsg net/socket.c:2162 [inline] __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162 do_syscall_64+0x15b/0x230 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x43fd59 RSP: 002b:00007ffde0e30d28 EFLAGS: 00000213 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fd59 RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 00000000004002c8 
R09: 00000000004002c8 R10: 00000000004002c8 R11: 0000000000000213 R12: 0000000000401680 R13: 0000000000401710 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:189 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:315 kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan.c:322 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2753 [inline] __kmalloc_node_track_caller+0xb35/0x11b0 mm/slub.c:4395 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2cb/0x9e0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:988 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline] netlink_sendmsg+0x76e/0x1350 net/netlink/af_netlink.c:1876 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg net/socket.c:639 [inline] ___sys_sendmsg+0xec8/0x1320 net/socket.c:2117 __sys_sendmsg net/socket.c:2155 [inline] __do_sys_sendmsg net/socket.c:2164 [inline] __se_sys_sendmsg net/socket.c:2162 [inline] __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162 do_syscall_64+0x15b/0x230 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `fdb694a01f` ("netfilter: Add fail-open support") Fixes: `829e17a1a6` ("[NETFILTER]: nfnetlink_queue: allow changing queue length through netlink") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:54 +02:00
Oleg Nesterov	377fb3d894	uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn() commit `90718e32e1` upstream. insn_get_length() has the side-effect of processing the entire instruction but only if it was decoded successfully, otherwise insn_complete() can fail and in this case we need to just return an error without warning. Reported-by: syzbot+30d675e3ca03c1c351e7@syzkaller.appspotmail.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: syzkaller-bugs@googlegroups.com Link: https://lkml.kernel.org/lkml/20180518162739.GA5559@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:54 +02:00
Keith Busch	062c4965a1	nvme-pci: Remap CMB SQ entries on every controller reset commit `815c6704bf` upstream. The controller memory buffer is remapped into a kernel address on each reset, but the driver was setting the submission queue base address only on the very first queue creation. The remapped address is likely to change after a reset, so accessing the old address will hit a kernel bug. This patch fixes that by setting the queue's CMB base address each time the queue is created. Fixes: `f63572dff1` ("nvme: unmap CMB and remove sysfs file in reset path") Reported-by: Christian Black <christian.d.black@intel.com> Cc: Jon Derrick <jonathan.derrick@intel.com> Cc: <stable@vger.kernel.org> # 4.9+ Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Scott Bauer <scott.bauer@intel.com> Reviewed-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:54 +02:00
Steve Wise	70c89bccb5	iw_cxgb4: correctly enforce the max reg_mr depth commit `7b72717a20` upstream. The code was mistakenly using the length of the page array memory instead of the depth of the page array. This would cause MR creation to fail in some cases. Fixes: `8376b86de7` ("iw_cxgb4: Support the new memory registration API") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:53 +02:00
Jon Hunter	e78e3706bf	i2c: tegra: Fix NACK error handling commit `54836e2d03` upstream. On Tegra30 Cardhu the PCA9546 I2C mux is not ACK'ing I2C commands on resume from suspend (which is caused by the reset signal for the I2C mux not being configured correctly). However, this NACK is causing the Tegra30 to hang on resuming from suspend, which is not expected as we detect NACKs and handle them. The hang observed appears to occur when resetting the I2C controller to recover from the NACK. Commit `77821b4678` ("i2c: tegra: proper handling of error cases") added additional error handling for some error cases including NACK; however, it appears that this change conflicts with an earlier fix by commit `f70893d083` ("i2c: tegra: Add delay before resetting the controller after NACK"). After commit `77821b4678` was made we now disable 'packet mode' before the delay from commit `f70893d083` happens. Testing shows that moving the delay to before disabling 'packet mode' fixes the hang observed on Tegra30. The delay was added to give the I2C controller a chance to send a stop condition and so it makes sense to move this to before we disable packet mode. Please note that packet mode is always enabled for Tegra. Fixes: `77821b4678` ("i2c: tegra: proper handling of error cases") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:53 +02:00
Paul Menzel	36c038f0a9	tools build: fix # escaping in .cmd files for future Make commit `9feeb638cd` upstream. In 2016 GNU Make made a backwards incompatible change to the way '#' characters were handled in Makefiles when used inside functions or macros: http://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57 Due to this change, when attempting to run `make prepare' I get a spurious make syntax error: /home/earnest/linux/tools/objtool/.fixdep.o.cmd:1: *** missing separator. Stop. When inspecting `.fixdep.o.cmd' it includes two lines which use unescaped comment characters at the top: \# cannot find fixdep (/home/earnest/linux/tools/objtool//fixdep) \# using basic dep data This is because `tools/build/Build.include' prints these '\#' characters: printf '\# cannot find fixdep (%s)\n' $(fixdep) > $(dot-target).cmd; \ printf '\# using basic dep data\n\n' >> $(dot-target).cmd; \ This completes commit `9564a8cf42` ("Kbuild: fix # escaping in .cmd files for future Make"). Link: https://bugzilla.kernel.org/show_bug.cgi?id=197847 Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: stable@vger.kernel.org Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:53 +02:00
Oscar Salvador	db2858f193	fs, elf: make sure to page align bss in load_elf_library commit `24962af7e1` upstream. The current code does not make sure to page align bss before calling vm_brk(), and this can lead to a VM_BUG_ON() in __mm_populate() due to the requested length not being correctly aligned. Let us make sure to align it properly. Kees: only applicable to CONFIG_USELIB kernels: 32-bit and configured for libc5. Link: http://lkml.kernel.org/r/20180705145539.9627-1-osalvador@techadventures.net Signed-off-by: Oscar Salvador <osalvador@suse.de> Reported-by: syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Acked-by: Kees Cook <keescook@chromium.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:53 +02:00
Chris Wilson	bc193057d4	ALSA: hda - Handle pm failure during hotplug commit `aaa23f8600` upstream. Obtaining the runtime pm wakeref can fail, especially in a hotplug scenario where i915.ko has been unloaded. If we do not catch the failure, we end up with an unbalanced pm. v2 additions by tiwai: hdmi_present_sense() checks the return value, handles only a negative error case, and bails out only if it's really still suspended. Also, snd_hda_power_down() is called at the error path so that the refcount is balanced. Along with it, the spec->pcm_lock is taken outside hdmi_present_sense() on the caller side, so that it won't cause deadlock at reentrance via runtime resume. v3 fix by tiwai: Missing linux/pm_runtime.h is included. References: `222bde0388` ("ALSA: hda - Fix mutex deadlock at HDMI/DP hotplug") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:53 +02:00
Linus Torvalds	d2c7c52431	Fix up non-directory creation in SGID directories commit `0fa3ecd878` upstream. sgid directories have special semantics, making newly created files in the directory belong to the group of the directory, and newly created subdirectories will also become sgid. This is historically used for group-shared directories. But group directories writable by non-group members should not imply that such non-group members can magically join the group, so make sure to clear the sgid bit on non-directories for non-members (but remember that sgid without group execute means "mandatory locking", just to confuse things even more). Reported-by: Jann Horn <jannh@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:53 +02:00
Tomasz Kramkowski	16387eb51c	HID: usbhid: add quirk for innomedia INNEX GENESIS/ATARI adapter commit `9547837bdc` upstream. The (1292:4745) Innomedia INNEX GENESIS/ATARI adapter needs HID_QUIRK_MULTI_INPUT to split the device up into two controllers instead of inputs from both being merged into one. Signed-off-by: Tomasz Kramkowski <tk@the-tk.com> Acked-By: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:53 +02:00
Dan Carpenter	268476c9d3	xhci: xhci-mem: off by one in xhci_stream_id_to_ring() commit `313db3d648` upstream. The > should be >= here so that we don't read one element beyond the end of the ep->stream_info->stream_rings[] array. Fixes: `e9df17eb14` ("USB: xhci: Correct assumptions about number of rings per endpoint.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:53 +02:00
Nico Sneck	cac38ab7d4	usb: quirks: add delay quirks for Corsair Strafe commit `bba57eddad` upstream. Corsair Strafe appears to suffer from the same issues as the Corsair Strafe RGB. Apply the same quirks (control message delay and init delay) that the RGB version has to 1b1c:1b15. With these quirks in place the keyboard works correctly upon booting the system, and no longer requires reattaching the device. Signed-off-by: Nico Sneck <snecknico@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:52 +02:00
Johan Hovold	7675c7b78e	USB: serial: mos7840: fix status-register error handling commit `794744abff` upstream. Add missing transfer-length sanity check to the status-register completion handler to avoid leaking bits of uninitialised slab data to user space. Fixes: `3f5429746d` ("USB: Moschip 7840 USB-Serial Driver") Cc: stable <stable@vger.kernel.org> # 2.6.19 Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:52 +02:00
Jann Horn	0fdef3142f	USB: yurex: fix out-of-bounds uaccess in read handler commit `f1e255d60a` upstream. In general, accessing userspace memory beyond the length of the supplied buffer in VFS read/write handlers can lead to both kernel memory corruption (via kernel_read()/kernel_write(), which can e.g. be triggered via sys_splice()) and privilege escalation inside userspace. Fix it by using simple_read_from_buffer() instead of custom logic. Fixes: `6bc235a2e2` ("USB: add driver for Meywa-Denki & Kayac YUREX") Signed-off-by: Jann Horn <jannh@google.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:52 +02:00
Johan Hovold	7e7c86d275	USB: serial: keyspan_pda: fix modem-status error handling commit `01b3cdfca2` upstream. Fix broken modem-status error handling which could lead to bits of slab data leaking to user space. Fixes: `3b36a8fd67` ("usb: fix uninitialized variable warning in keyspan_pda") Cc: stable <stable@vger.kernel.org> # 2.6.27 Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:52 +02:00
Olli Salonen	4115045f95	USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick commit `367b160fe4` upstream. There are two versions of the Qivicon Zigbee stick in circulation. This adds the second USB ID to the cp210x driver. Signed-off-by: Olli Salonen <olli.salonen@iki.fi> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:52 +02:00
Dan Carpenter	4c73f193b3	USB: serial: ch341: fix type promotion bug in ch341_control_in() commit `e33eab9ded` upstream. The "r" variable is an int and "bufsize" is an unsigned int so the comparison is type promoted to unsigned. If usb_control_msg() returns a negative that is treated as a high positive value and the error handling doesn't work. Fixes: `2d5a9c72d0` ("USB: serial: ch341: fix control-message error handling") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:52 +02:00
Hans de Goede	f510cc3a2f	ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS commit `240630e618` upstream. There have been several reports of LPM related hard freezes about once a day on multiple Lenovo 50 series models. Strangely enough, these reports were not disk model specific as LPM issues usually are, and some users with the exact same disk + laptop were seeing them while other users were not seeing these issues. It turns out that enabling LPM triggers a firmware bug somewhere, which has been fixed in later BIOS versions. This commit adds a new ahci_broken_lpm() function and a new ATA_FLAG_NO_LPM for dealing with this. The ahci_broken_lpm() function contains DMI match info for the 4 models which are known to be affected by this and the DMI BIOS date field for known good BIOS versions. If the BIOS date is older than the one in the table, LPM will be disabled and a warning will be printed. Note the BIOS dates are for known good versions; some older versions may work too, but we don't know for sure. The table is using dates from BIOS versions for which users have confirmed that upgrading to that version makes the problem go away. Unfortunately I've been unable to get hold of the reporter who reported that BIOS version 2.35 fixed the problems on the W541 for him. I've been able to verify the DMI_SYS_VENDOR and DMI_PRODUCT_VERSION from an older dmidecode, but I don't know the exact BIOS date as reported in the DMI. Lenovo keeps a changelog with dates in their release notes, but the dates there are the release dates, not the build dates which are in DMI. So I've chosen to set the date to which we compare to one day past the release date of the 2.34 BIOS. I plan to fix this with a follow up commit once I've the necessary info. Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:52 +02:00
Nadav Amit	63c003e3ff	vmw_balloon: fix inflation with batching commit `90d72ce079` upstream. Embarrassingly, the recent fix introduced a worse problem than it solved, causing the balloon not to inflate. The VM informed the hypervisor that the pages for lock/unlock are sitting in the wrong address, as it used the uninitialized page variable. Fixes: `b23220fe05` ("vmw_balloon: fixing double free when batching mode is off") Cc: stable@vger.kernel.org Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:51 +02:00
Damien Le Moal	3f205d7a89	ata: Fix ZBC_OUT all bit handling commit `6edf1d4cb0` upstream. If the ALL bit is set in the ZBC_OUT command, the command zone ID field (block) should be ignored. Reported-by: David Butterfield <david.butterfield@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:51 +02:00
Damien Le Moal	51bacd848c	ata: Fix ZBC_OUT command block check commit `b320a0a9f2` upstream. The block (LBA) specified must not exceed the last addressable LBA, which is dev->n_sectors - 1. So the correct check is "if (block >= dev->n_sectors)" and not "if (block > dev->n_sectors)". Additionally, the asc/ascq to return for an LBA that is not a zone start LBA should be ILLEGAL REQUEST, regardless of whether the bad LBA is out of range. Reported-by: David Butterfield <david.butterfield@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:51 +02:00
Jann Horn	2823345cd4	ibmasm: don't write out of bounds in read handler commit `a0341fc198` upstream. This read handler had a lot of custom logic and wrote outside the bounds of the provided buffer. This could lead to kernel and userspace memory corruption. Just use simple_read_from_buffer() with a stack buffer. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:51 +02:00
x00270170	35479c22ff	mmc: dw_mmc: fix card threshold control configuration commit `7a6b9f4d60` upstream. Card write threshold control is supposed to be set since controller version 2.80a for data write in HS400 mode and data read in HS200/HS400/SDR104 mode. However, the current code returns without configuring it in the case of data writing in HS400 mode. The patch also fixes the current code jumping to 'disable' when reading data in HS400 mode. Fixes: `7e4bf1bc95` ("mmc: dw_mmc: add the card write threshold for HS400 mode") Signed-off-by: Qing Xia <xiaqing17@hisilicon.com> Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:51 +02:00
Paul Burton	92cb1184ae	MIPS: Fix ioremap() RAM check commit `523402fa91` upstream. We currently attempt to check whether a physical address range provided to __ioremap() may be in use by the page allocator by examining the value of PageReserved for each page in the region - lowmem pages not marked reserved are presumed to be in use by the page allocator, and requests to ioremap them fail. The way we check this has been broken since commit `92923ca3aa` ("mm: meminit: only set page reserved in the memblock region"), because memblock will typically not have any knowledge of non-RAM pages and therefore those pages will not have the PageReserved flag set. Thus when we attempt to ioremap a region outside of RAM we incorrectly fail believing that the region is RAM that may be in use. In most cases ioremap() on MIPS will take a fast-path to use the unmapped kseg1 or xkphys virtual address spaces and never hit this path, so the only way to hit it is for a MIPS32 system to attempt to ioremap() an address range in lowmem with flags other than _CACHE_UNCACHED. Perhaps the most straightforward way to do this is using ioremap_uncached_accelerated(), which is how the problem was discovered. Fix this by making use of walk_system_ram_range() to test the address range provided to __ioremap() against only RAM pages, rather than all lowmem pages. This means that if we have a lowmem I/O region, which is very common for MIPS systems, we're free to ioremap() address ranges within it. A nice bonus is that the test is no longer limited to lowmem. The approach here matches the way x86 performed the same test after commit `c81c8a1eee` ("x86, ioremap: Speed up check for RAM pages") until x86 moved towards a slightly more complicated check using walk_mem_res() for unrelated reasons with commit `0e4c12b45a` ("x86/mm, resource: Use PAGE_KERNEL protection for ioremap of memory pages"). 
Signed-off-by: Paul Burton <paul.burton@mips.com> Reported-by: Serge Semin <fancer.lancer@gmail.com> Tested-by: Serge Semin <fancer.lancer@gmail.com> Fixes: `92923ca3aa` ("mm: meminit: only set page reserved in the memblock region") Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.2+ Patchwork: https://patchwork.linux-mips.org/patch/19786/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:51 +02:00
Paul Burton	473b33dd61	MIPS: Call dump_stack() from show_regs() commit `5a267832c2` upstream. The generic nmi_cpu_backtrace() function calls show_regs() when a struct pt_regs is available, and dump_stack() otherwise. If we were to make use of the generic nmi_cpu_backtrace() with MIPS' current implementation of show_regs() this would mean that we see only register data with no accompanying stack information, in contrast with our current implementation which calls dump_stack() regardless of whether register state is available. In preparation for making use of the generic nmi_cpu_backtrace() to implement arch_trigger_cpumask_backtrace(), have our implementation of show_regs() call dump_stack() and drop the explicit dump_stack() call in arch_dump_stack() which is invoked by arch_trigger_cpumask_backtrace(). This will allow the output we produce to remain the same after a later patch switches to using nmi_cpu_backtrace(). It may mean that we produce extra stack output in other uses of show_regs(), but this: 1) Seems harmless. 2) Is good for consistency between arch_trigger_cpumask_backtrace() and other users of show_regs(). 3) Matches the behaviour of the ARM & PowerPC architectures. Marked for stable back to v4.9 as a prerequisite of the following patch "MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()". Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/19596/ Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Huacai Chen <chenhc@lemote.com> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:51 +02:00
Scott Bauer	93e54f40c8	nvme: validate admin queue before unquiesce commit `7dd1ab163c` upstream. With a misbehaving controller it's possible we'll never enter the live state and create an admin queue. When we fail out of reset work it's possible we failed out early enough without setting up the admin queue. We tear down queues after a failed reset, but needed to do some more sanitization. Fixes `443bd90f2c`: "nvme: host: unquiesce queue in nvme_kill_queues()" [ 189.650995] nvme nvme1: pci function 0000:0b:00.0 [ 317.680055] nvme nvme0: Device not ready; aborting reset [ 317.680183] nvme nvme0: Removing after probe failure status: -19 [ 317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 317.681397] general protection fault: 0000 [#1] SMP KASAN [ 317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5 [ 317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 [ 317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme] [ 317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000 [ 317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0 [ 317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282 [ 317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3 [ 317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000 [ 317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000 [ 317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0 [ 317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0 [ 317.684371] FS: 0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000 [ 317.684519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0 [ 317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 
[ 317.685018] Call Trace: [ 317.685084] nvme_kill_queues+0x4d/0x170 [nvme_core] [ 317.685185] nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme] [ 317.685289] process_one_work+0x771/0x1170 [ 317.685372] worker_thread+0xde/0x11e0 [ 317.685452] ? pci_mmcfg_check_reserved+0x110/0x110 [ 317.685550] kthread+0x2d3/0x3d0 [ 317.685617] ? process_one_work+0x1170/0x1170 [ 317.685704] ? kthread_create_on_node+0xc0/0xc0 [ 317.685785] ret_from_fork+0x25/0x30 [ 317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3 [ 317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40 [ 317.685908] ---[ end trace a3f8704150b1e8b4 ]--- Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [ adapted for 4.9: added check around blk_mq_start_hw_queues() call instead of upstream blk_mq_unquiesce_queue() ] Fixes: `4aae438816` ("nvme: fix hang in remove path") Signed-off-by: Simon Veith <sveith@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Amit Shah <aams@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-17 11:37:51 +02:00
Greg Kroah-Hartman	060744011e	Linux 4.9.112	2018-07-11 16:26:46 +02:00
Dan Carpenter	e31cd420e1	staging: comedi: quatech_daqp_cs: fix no-op loop daqp_ao_insn_write() commit `1376b0a216` upstream. There is a '>' vs '<' typo so this loop is a no-op. Fixes: `d35dcc89fc` ("staging: comedi: quatech_daqp_cs: fix daqp_ao_insn_write()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:46 +02:00
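The '>' vs '<' typo described above can be illustrated with a minimal sketch. This is not the driver code: `write_samples_buggy()` and `write_samples_fixed()` are hypothetical helpers that only show why a loop starting at zero with a `>` condition never executes its body.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the loop condition uses '>' by mistake.
 * i starts at 0 and 0 > n is false for every unsigned n, so the
 * body never runs and nothing is written. */
static size_t write_samples_buggy(const unsigned int *data, size_t n,
                                  unsigned int *out)
{
    size_t i, written = 0;

    assert(data != NULL && out != NULL);
    for (i = 0; i > n; i++) {
        out[i] = data[i];
        written++;
    }
    return written;
}

/* Same helper with the intended '<' comparison. */
static size_t write_samples_fixed(const unsigned int *data, size_t n,
                                  unsigned int *out)
{
    size_t i, written = 0;

    assert(data != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        out[i] = data[i];
        written++;
    }
    return written;
}
```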
Jann Horn	1712fae948	netfilter: nf_log: don't hold nf_log_mutex during user access commit `ce00bf07cc` upstream. The old code would indefinitely block other users of nf_log_mutex if a userspace access in proc_dostring() blocked e.g. due to a userfaultfd region. Fix it by moving proc_dostring() out of the locked region. This is a followup to commit `266d07cb1c` ("netfilter: nf_log: fix sleeping function called from invalid context"), which changed this code from using rcu_read_lock() to taking nf_log_mutex. Fixes: `266d07cb1c` ("netfilter: nf_log: fix sleeping function calle[...]") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:46 +02:00
Tokunori Ikegami	a0239d83e1	mtd: cfi_cmdset_0002: Change erase functions to check chip good only commit `79ca484b61` upstream. Currently the functions use to check both chip ready and good. But the chip ready is not enough to check the operation status. So change this to check the chip good instead of this. About the retry functions to make sure the error handling remain it. Signed-off-by: Tokunori Ikegami <ikegami@allied-telesis.co.jp> Reviewed-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Cc: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: Brian Norris <computersforpeace@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr> Cc: linux-mtd@lists.infradead.org Cc: stable@vger.kernel.org Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:46 +02:00
Tokunori Ikegami	ed1746148b	mtd: cfi_cmdset_0002: Change erase functions to retry for error commit `45f75b8a91` upstream. For the word write functions it is retried for error. But it is not implemented to retry for the erase functions. To make sure for the erase functions change to retry as same. This is needed to prevent the flash erase error caused only once. It was caused by the error case of chip_good() in the do_erase_oneblock(). Also it was confirmed on the MACRONIX flash device MX29GL512FHT2I-11G. But the error issue behavior is not able to reproduce at this moment. The flash controller is parallel Flash interface integrated on BCM53003. Signed-off-by: Tokunori Ikegami <ikegami@allied-telesis.co.jp> Reviewed-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Cc: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: Brian Norris <computersforpeace@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr> Cc: linux-mtd@lists.infradead.org Cc: stable@vger.kernel.org Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:46 +02:00
Tokunori Ikegami	c2f163e35f	mtd: cfi_cmdset_0002: Change definition naming to retry write operation commit `85a82e28b0` upstream. The definition can be used for other program and erase operations also. So change the naming to MAX_RETRIES from MAX_WORD_RETRIES. Signed-off-by: Tokunori Ikegami <ikegami@allied-telesis.co.jp> Reviewed-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Cc: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: Brian Norris <computersforpeace@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr> Cc: linux-mtd@lists.infradead.org Cc: stable@vger.kernel.org Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:46 +02:00
Mikulas Patocka	4779184af7	dm bufio: don't take the lock in dm_bufio_shrink_count commit `d12067f428` upstream. dm_bufio_shrink_count() is called from do_shrink_slab to find out how many freeable objects are there. The reported value doesn't have to be precise, so we don't need to take the dm-bufio lock. Suggested-by: David Rientjes <rientjes@google.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:46 +02:00
Martin Kaiser	9d1304f581	mtd: rawnand: mxc: set spare area size register explicitly commit `3f77f244d8` upstream. The v21 version of the NAND flash controller contains a Spare Area Size Register (SPAS) at offset 0x10. Its setting defaults to the maximum spare area size of 218 bytes. The size that is set in this register is used by the controller when it calculates the ECC bytes internally in hardware. Usually, this register is updated from settings in the IIM fuses when the system is booting from NAND flash. For other boot media, however, the SPAS register remains at the default setting, which may not work for the particular flash chip on the board. The same goes for flash chips whose configuration cannot be set in the IIM fuses (e.g. chips with 2k sector size and 128 bytes spare area size can't be configured in the IIM fuses on imx25 systems). Set the SPAS register explicitly during the preset operation. Derive the register value from mtd->oobsize that was detected during probe by decoding the flash chip's ID bytes. While at it, rename the define for the spare area register's offset to NFC_V21_RSLTSPARE_AREA. The register at offset 0x10 on v1 controllers is different from the register on v21 controllers. Fixes: `d484018` ("mtd: mxc_nand: set NFC registers after reset") Cc: stable@vger.kernel.org Signed-off-by: Martin Kaiser <martin@kaiser.cx> Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:46 +02:00
Mikulas Patocka	34d2fe724a	dm bufio: drop the lock when doing GFP_NOIO allocation commit `41c73a49df` upstream. If the first allocation attempt using GFP_NOWAIT fails, drop the lock and retry using GFP_NOIO allocation (lock is dropped because the allocation can take some time). Note that we won't do GFP_NOIO allocation when we loop for the second time, because the lock shouldn't be dropped between __wait_for_free_buffer and __get_unclaimed_buffer. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
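The allocation order described above (non-blocking attempt under the lock, then drop the lock for the attempt that may sleep) might be sketched in userspace as follows. Everything here is an assumption for illustration: `struct sketch_client`, `try_alloc()` and the `locked` flag are hypothetical stand-ins for the dm-bufio client, the page allocator and the dm-bufio lock, and only the first pass through the loop is modelled.

```c
#include <assert.h>
#include <stdlib.h>

enum alloc_mode { ALLOC_NOWAIT, ALLOC_MAY_SLEEP };

/* Hypothetical stand-in for a dm-bufio client. */
struct sketch_client {
    int locked;             /* 1 while the (fake) client lock is held */
    int nowait_budget;      /* how many non-blocking allocations succeed */
    int slept_while_locked; /* set if a sleeping allocation ran under the lock */
};

/* Fake allocator: the NOWAIT mode fails once the budget is used up,
 * the MAY_SLEEP mode always succeeds but records whether the lock was held. */
static void *try_alloc(struct sketch_client *c, enum alloc_mode mode)
{
    if (mode == ALLOC_NOWAIT) {
        if (c->nowait_budget <= 0)
            return NULL;
        c->nowait_budget--;
        return malloc(64);
    }
    if (c->locked)
        c->slept_while_locked = 1;
    return malloc(64);
}

/* Called with the lock held; returns with the lock held. */
static void *alloc_buffer_sketch(struct sketch_client *c)
{
    void *buf;

    assert(c->locked);
    buf = try_alloc(c, ALLOC_NOWAIT);    /* cheap attempt under the lock */
    if (buf)
        return buf;

    c->locked = 0;                       /* drop the lock before sleeping */
    buf = try_alloc(c, ALLOC_MAY_SLEEP);
    c->locked = 1;                       /* retake it for the caller */
    return buf;
}
```

The point of the ordering is that the slow allocation never runs while other users of the lock, such as a shrinker, are waiting on it.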
Douglas Anderson	0758c35b53	dm bufio: avoid sleeping while holding the dm_bufio lock commit `9ea61cac0b` upstream. We've seen in-field reports showing _lots_ (18 in one case, 41 in another) of tasks all sitting there blocked on: mutex_lock+0x4c/0x68 dm_bufio_shrink_count+0x38/0x78 shrink_slab.part.54.constprop.65+0x100/0x464 shrink_zone+0xa8/0x198 In the two cases analyzed, we see one task that looks like this: Workqueue: kverityd verity_prefetch_io __switch_to+0x9c/0xa8 __schedule+0x440/0x6d8 schedule+0x94/0xb4 schedule_timeout+0x204/0x27c schedule_timeout_uninterruptible+0x44/0x50 wait_iff_congested+0x9c/0x1f0 shrink_inactive_list+0x3a0/0x4cc shrink_lruvec+0x418/0x5cc shrink_zone+0x88/0x198 try_to_free_pages+0x51c/0x588 __alloc_pages_nodemask+0x648/0xa88 __get_free_pages+0x34/0x7c alloc_buffer+0xa4/0x144 __bufio_new+0x84/0x278 dm_bufio_prefetch+0x9c/0x154 verity_prefetch_io+0xe8/0x10c process_one_work+0x240/0x424 worker_thread+0x2fc/0x424 kthread+0x10c/0x114 ...and that looks to be the one holding the mutex. The problem has been reproduced fairly easily: 0. Be running Chrome OS w/ verity enabled on the root filesystem 1. Pick test patch: http://crosreview.com/412360 2. Install launchBalloons.sh and balloon.arm from http://crbug.com/468342 ...that's just a memory stress test app. 3. On a 4GB rk3399 machine, run nice ./launchBalloons.sh 4 900 100000 ...that tries to eat 4 * 900 MB of memory and keep accessing. 4. Login to the Chrome web browser and restore many tabs With that, I've seen printouts like: DOUG: long bufio 90758 ms ...and stack traces always show we're in dm_bufio_prefetch(). The problem is that we try to allocate memory with GFP_NOIO while we're holding the dm_bufio lock. Instead we should be using GFP_NOWAIT. Using GFP_NOIO can cause us to sleep while holding the lock and that causes the above problems. 
The current behavior explained by David Rientjes: It will still try reclaim initially because __GFP_WAIT (or __GFP_KSWAPD_RECLAIM) is set by GFP_NOIO. This is the cause of contention on dm_bufio_lock() that the thread holds. You want to pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a mutex that can be contended by a concurrent slab shrinker (if count_objects didn't use a trylock, this pattern would trivially deadlock). This change significantly increases responsiveness of the system while in this state. It makes a real difference because it unblocks kswapd. In the bug report analyzed, kswapd was hung: kswapd0 D ffffffc000204fd8 0 72 2 0x00000000 Call trace: [<ffffffc000204fd8>] __switch_to+0x9c/0xa8 [<ffffffc00090b794>] __schedule+0x440/0x6d8 [<ffffffc00090bac0>] schedule+0x94/0xb4 [<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44 [<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac [<ffffffc00090d9d8>] mutex_lock+0x4c/0x68 [<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78 [<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464 [<ffffffc00030dbd8>] shrink_zone+0xa8/0x198 [<ffffffc00030e578>] balance_pgdat+0x328/0x508 [<ffffffc00030eb7c>] kswapd+0x424/0x51c [<ffffffc00023f06c>] kthread+0x10c/0x114 [<ffffffc000203dd0>] ret_from_fork+0x10/0x40 By unblocking kswapd memory pressure should be reduced. Suggested-by: David Rientjes <rientjes@google.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
Vlastimil Babka	6cfbbdd2bc	mm, page_alloc: do not break __GFP_THISNODE by zonelist reset commit `7810e6781e` upstream. In __alloc_pages_slowpath() we reset zonelist and preferred_zoneref for allocations that can ignore memory policies. The zonelist is obtained from current CPU's node. This is a problem for __GFP_THISNODE allocations that want to allocate on a different node, e.g. because the allocating thread has been migrated to a different CPU. This has been observed to break SLAB in our 4.4-based kernel, because there it relies on __GFP_THISNODE working as intended. If a slab page is put on wrong node's list, then further list manipulations may corrupt the list because page_to_nid() is used to determine which node's list_lock should be locked and thus we may take a wrong lock and race. Current SLAB implementation seems to be immune by luck thanks to commit `511e3a0588` ("mm/slab: make cache_grow() handle the page allocated on arbitrary node") but there may be others assuming that __GFP_THISNODE works as promised. We can fix it by simply removing the zonelist reset completely. There is actually no reason to reset it, because memory policies and cpusets don't affect the zonelist choice in the first place. This was different when commit `183f6371aa` ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK") introduced the code, as mempolicies provided their own restricted zonelists. We might consider this for 4.17 although I don't know if there's anything currently broken. SLAB is currently not affected, but in kernels older than 4.7 that don't yet have `511e3a0588` ("mm/slab: make cache_grow() handle the page allocated on arbitrary node") it is. That's at least 4.4 LTS. Older ones I'll have to check. So stable backports should be more important, but will have to be reviewed carefully, as the code went through many changes. 
BTW I think that also the ac->preferred_zoneref reset is currently useless if we don't also reset ac->nodemask from a mempolicy to NULL first (which we probably should for the OOM victims etc?), but I would leave that for a separate patch. Link: http://lkml.kernel.org/r/20180525130853.13915-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Fixes: `183f6371aa` ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK") Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
Brad Love	d96a0d3cd5	media: cx25840: Use subdev host data for PLL override commit `3ee9bc1234` upstream. The cx25840 driver currently configures 885, 887, and 888 using default divisors for each chip. This checks to see whether the cx23885 driver has passed the cx25840 a non-default clock rate for a specific chip. If a cx23885 board has left clk_freq at 0, the clock default values will be used to configure the PLLs. This patch only has an effect on 888 boards that set clk_freq to 25M. Signed-off-by: Brad Love <brad@nextdimension.cc> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
Rasmus Villemoes	b5d7d7d919	Kbuild: fix # escaping in .cmd files for future Make commit `9564a8cf42` upstream. I tried building using a freshly built Make (4.2.1-69-g8a731d1), but already the objtool build broke with orc_dump.c: In function ‘orc_dump’: orc_dump.c:106:2: error: ‘elf_getshnum’ is deprecated [-Werror=deprecated-declarations] if (elf_getshdrnum(elf, &nr_sections)) { Turns out that with that new Make, the backslash was not removed, so cpp didn't see a #include directive, grep found nothing, and -DLIBELF_USE_DEPRECATED was wrongly put in CFLAGS. Now, that new Make behaviour is documented in their NEWS file: * WARNING: Backward-incompatibility! Number signs (#) appearing inside a macro reference or function invocation no longer introduce comments and should not be escaped with backslashes: thus a call such as: foo := $(shell echo '#') is legal. Previously the number sign needed to be escaped, for example: foo := $(shell echo '\#') Now this latter will resolve to "\#". If you want to write makefiles portable to both versions, assign the number sign to a variable: C := \# foo := $(shell echo '$C') This was claimed to be fixed in 3.81, but wasn't, for some reason. To detect this change search for 'nocomment' in the .FEATURES variable. This also fixes up the two make-cmd instances to replace # with $(pound) rather than with \#. There might very well be other places that need similar fixup in preparation for whatever future Make release contains the above change, but at least this builds an x86_64 defconfig with the new make. Link: https://bugzilla.kernel.org/show_bug.cgi?id=197847 Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
Waldemar Rymarkiewicz	6989d4079d	PM / OPP: Update voltage in case freq == old_freq commit `c5c2a97b3a` upstream. This commit fixes a rare but possible case when the clk rate is updated without update of the regulator voltage. At boot up, CPUfreq checks if the system is running at the right freq. This is a sanity check in case a bootloader set clk rate that is outside of freq table present with cpufreq core. In such cases system can be unstable so better to change it to a freq that is preset in freq-table. The CPUfreq takes next freq that is >= policy->cur and this is our target_freq that needs to be set now. dev_pm_opp_set_rate(dev, target_freq) checks the target_freq and the old_freq (a current rate). If these are equal it returns early. If not, it searches for OPP (old_opp) that fits best to old_freq (not listed in the table) and updates old_freq (!). Here, we can end up with old_freq = old_opp.rate = target_freq, which is not handled in _generic_set_opp_regulator(). It's supposed to update voltage only when freq > old_freq || freq < old_freq. if (freq > old_freq) { ret = _set_opp_voltage(dev, reg, new_supply); [...] if (freq < old_freq) { ret = _set_opp_voltage(dev, reg, new_supply); if (ret) The result is no voltage update while the clk rate is updated. Example: freq-table = { 1000MHz 1.15V 666MHz 1.10V 333MHz 1.05V } boot-up-freq = 800MHz # not listed in freq-table freq = target_freq = 1GHz old_freq = 800MHz old_opp = _find_freq_ceil(opp_table, &old_freq); #(old_freq is modified!) old_freq = 1GHz Fixes: `6a0712f6f1` ("PM / OPP: Add dev_pm_opp_set_rate()") Cc: 4.6+ <stable@vger.kernel.org> # v4.6+ Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
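The gap described above, where neither branch fires when the two frequencies are equal, can be reproduced with a small model. This is a sketch under stated assumptions: `struct fake_regulator` and both `set_rate_*()` functions are hypothetical, and treating the equal case like a scale-up (`>=`) is one plausible way to close the gap, not a quotation of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical regulator model: remembers the last voltage written. */
struct fake_regulator {
    unsigned long microvolt;
    int updates;
};

static void set_voltage(struct fake_regulator *reg, unsigned long uv)
{
    assert(reg != NULL);
    reg->microvolt = uv;
    reg->updates++;
}

/* Old logic: voltage is only touched when freq differs from old_freq. */
static void set_rate_old(struct fake_regulator *reg, unsigned long old_freq,
                         unsigned long freq, unsigned long new_uv)
{
    if (freq > old_freq)
        set_voltage(reg, new_uv);   /* scaling up: voltage first */
    /* ... the clock rate would be changed here ... */
    if (freq < old_freq)
        set_voltage(reg, new_uv);   /* scaling down: voltage last */
}

/* Assumed fix: the equal case is handled like a scale-up. */
static void set_rate_fixed(struct fake_regulator *reg, unsigned long old_freq,
                           unsigned long freq, unsigned long new_uv)
{
    if (freq >= old_freq)
        set_voltage(reg, new_uv);
    /* ... the clock rate would be changed here ... */
    if (freq < old_freq)
        set_voltage(reg, new_uv);
}
```

With the example table from the message, old_freq has already been rounded up to 1 GHz by the ceiling lookup, so the old logic leaves the regulator at its boot voltage while the clock moves from 800 MHz to 1 GHz.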
Daniel Rosenberg	4a30c12542	HID: debug: check length before copy_to_user() commit `717adfdaf1` upstream. If our length is greater than the size of the buffer, we overflow the buffer Cc: stable@vger.kernel.org Signed-off-by: Daniel Rosenberg <drosen@google.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
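The length check described above amounts to clamping the copy length to the destination size before copying. A minimal userspace sketch, with `memcpy()` standing in for copy_to_user() and a hypothetical helper name `copy_clamped()`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most dst_size bytes, even if the caller asks for more.
 * Returns the number of bytes actually copied. */
static size_t copy_clamped(char *dst, size_t dst_size,
                           const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (len > dst_size)
        len = dst_size;     /* never write past the destination */
    memcpy(dst, src, len);
    return len;
}
```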
Gustavo A. R. Silva	82e360cd6f	HID: hiddev: fix potential Spectre v1 commit `4f65245f2d` upstream. uref->field_index, uref->usage_index, finfo.field_index and cinfo.index can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/hid/usbhid/hiddev.c:473 hiddev_ioctl_usage() warn: potential spectre issue 'report->field' (local cap) drivers/hid/usbhid/hiddev.c:477 hiddev_ioctl_usage() warn: potential spectre issue 'field->usage' (local cap) drivers/hid/usbhid/hiddev.c:757 hiddev_ioctl() warn: potential spectre issue 'report->field' (local cap) drivers/hid/usbhid/hiddev.c:801 hiddev_ioctl() warn: potential spectre issue 'hid->collection' (local cap) Fix this by sanitizing such structure fields before using them to index report->field, field->usage and hid->collection Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
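The sanitizing step described above can be sketched as a branch-free mask applied after the bounds check. The helper below is modelled on the generic mask used by the kernel's array_index_nospec(), but it is a userspace illustration only: `index_mask()` and `lookup()` are hypothetical names, the right shift of a negative value is implementation-defined in C (it is arithmetic on the usual compilers), and nothing here stops a compiler from reintroducing a branch.

```c
#include <assert.h>
#include <stddef.h>

/* All ones when index < size, zero otherwise (for size <= LONG_MAX). */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    return ~(long)(index | (size - 1UL - index)) >> (sizeof(long) * 8 - 1);
}

/* Bounds-checked table lookup: even if the CPU speculates past the
 * check, the masked index cannot point outside the table. */
static int lookup(const int *table, unsigned long size,
                  unsigned long index, int *out)
{
    assert(table != NULL && out != NULL);
    if (index >= size)
        return -1;
    index &= index_mask(index, size);   /* clamp before the dependent load */
    *out = table[index];
    return 0;
}
```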
Jason Andryuk	814b4302fd	HID: i2c-hid: Fix "incomplete report" noise commit `ef6eaf2727` upstream. Commit `ac75a04104` ("HID: i2c-hid: fix size check and type usage") started writing messages when the ret_size is <= 2 from i2c_master_recv. However, my device i2c-DLL07D1 returns 2 for a short period of time (~0.5s) after I stop moving the pointing stick or touchpad. It varies, but you get ~50 messages each time which spams the log hard. [ 95.925055] i2c_hid i2c-DLL07D1:01: i2c_hid_get_input: incomplete report (83/2) This has also been observed with a i2c-ALP0017. [ 1781.266353] i2c_hid i2c-ALP0017:00: i2c_hid_get_input: incomplete report (30/2) Only print the message when ret_size is totally invalid and less than 2 to cut down on the log spam. Fixes: `ac75a04104` ("HID: i2c-hid: fix size check and type usage") Reported-by: John Smith <john-s-84@gmx.net> Cc: stable@vger.kernel.org Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
Ido Schimmel	2f1a56ef23	mlxsw: spectrum: Forbid linking of VLAN devices to devices that have uppers Jiri Slaby noticed that the backport of upstream commit `25cc72a338` ("mlxsw: spectrum: Forbid linking to devices that have uppers") to kernel 4.9.y introduced the same check twice in the same function instead of in two different places. Fix this by relocating one of the checks to its intended place, thus preventing unsupported configurations as described in the original commit. Fixes: `73ee5a73e7` ("mlxsw: spectrum: Forbid linking to devices that have uppers") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
Jon Derrick	917692c9cd	ext4: check superblock mapped prior to committing commit `a17712c8e4` upstream. This patch attempts to close a hole leading to a BUG seen with hot removals during writes [1]. A block device (NVME namespace in this test case) is formatted to EXT4 without partitions. It's mounted and write I/O is run to a file, then the device is hot removed from the slot. The superblock attempts to be written to the drive which is no longer present. The typical chain of events leading to the BUG: ext4_commit_super() __sync_dirty_buffer() submit_bh() submit_bh_wbc() BUG_ON(!buffer_mapped(bh)); This fix checks for the superblock's buffer head being mapped prior to syncing. [1] https://www.spinics.net/lists/linux-ext4/msg56527.html Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:45 +02:00
Theodore Ts'o	eb13a42605	ext4: add more mount time checks of the superblock commit `bfe0a5f47a` upstream. The kernel's ext4 mount-time checks were more permissive than e2fsprogs's libext2fs checks when opening a file system. If the superblock is considered too insane for debugfs or e2fsck to operate on it, the kernel has no business trying to mount it. This will make file system fuzzing tools work harder, but the failure cases that they find will be more useful and be easier to evaluate. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:44 +02:00
Theodore Ts'o	425dc465de	ext4: add more inode number paranoia checks commit `c37e9e0134` upstream. If there is a directory entry pointing to a system inode (such as a journal inode), complain and declare the file system to be corrupted. Also, if the superblock's first inode number field is too small, refuse to mount the file system. This addresses CVE-2018-10882. https://bugzilla.kernel.org/show_bug.cgi?id=200069 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:44 +02:00
Theodore Ts'o	a5e063d348	ext4: clear i_data in ext4_inode_info when removing inline data commit `6e8ab72a81` upstream. When converting an inode from storing the data in-line to a data block, ext4_destroy_inline_data_nolock() was only clearing the on-disk copy of the i_blocks[] array. It was not clearing the copy of the i_blocks[] in ext4_inode_info, in i_data[], which is the copy actually used by ext4_map_blocks(). This didn't matter much if we are using extents, since the extents header would be invalid and thus the extents code would re-initialize the extents tree. But if we are using indirect blocks, the previous contents of the i_blocks array will be treated as block numbers, with potentially catastrophic results to the file system integrity and/or user data. This gets worse if the file system is using a 1k block size and s_first_data is zero, but even without this, the file system can get quite badly corrupted. This addresses CVE-2018-10881. https://bugzilla.kernel.org/show_bug.cgi?id=200015 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:44 +02:00
Theodore Ts'o	2f135cc8c0	ext4: include the illegal physical block in the bad map ext4_error msg commit `bdbd6ce01a` upstream. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:44 +02:00
Theodore Ts'o	87dad44faa	ext4: verify the depth of extent tree in ext4_find_extent() commit `bc890a6024` upstream. If there is a corrupted file system where the claimed depth of the extent tree is -1, this can cause a massive buffer overrun leading to sadness. This addresses CVE-2018-10877. https://bugzilla.kernel.org/show_bug.cgi?id=199417 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:44 +02:00
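The depth check described above follows a common pattern: validate a value read from disk before it is used to size or index a path array. A minimal sketch, assuming a hypothetical limit `SKETCH_MAX_DEPTH` and helper `path_slots_for_depth()`; the real limit and return codes in ext4 may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed upper bound on tree depth, for illustration only. */
#define SKETCH_MAX_DEPTH 5

/* Returns the number of path slots needed for a tree of the given
 * depth, or -1 if the claimed depth is out of range (e.g. -1 read
 * from a corrupted header). */
static int path_slots_for_depth(int depth)
{
    if (depth < 0 || depth > SKETCH_MAX_DEPTH)
        return -1;          /* reject instead of walking past the array */
    return depth + 1;       /* one slot per level, root included */
}
```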
Theodore Ts'o	5ae5732958	ext4: only look at the bg_flags field if it is valid commit `8844618d8a` upstream. The bg_flags field in the block group descriptors is only valid if the uninit_bg or metadata_csum feature is enabled. We were not consistently looking at this field; fix this. Also block group #0 must never have uninitialized allocation bitmaps, or need to be zeroed, since that's where the root inode, and other special inodes are set up. Check for these conditions and mark the file system as corrupted if they are detected. This addresses CVE-2018-10876. https://bugzilla.kernel.org/show_bug.cgi?id=199403 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:44 +02:00
Theodore Ts'o	cdde876fce	ext4: always check block group bounds in ext4_init_block_bitmap() commit `819b23f1c5` upstream. Regardless of whether the flex_bg feature is set, we should always check to make sure the bits we are setting in the block bitmap are within the block group bounds. https://bugzilla.kernel.org/show_bug.cgi?id=199865 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:44 +02:00
Theodore Ts'o	9e4842f2aa	ext4: make sure bitmaps and the inode table don't overlap with bg descriptors commit `77260807d1` upstream. It's really bad when the allocation bitmaps and the inode table overlap with the block group descriptors, since it causes random corruption of the bg descriptors. So we really want to head those off at the pass. https://bugzilla.kernel.org/show_bug.cgi?id=199865 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:44 +02:00
Theodore Ts'o	8ef97ef67c	jbd2: don't mark block as modified if the handle is out of credits commit `e09463f220` upstream. Do not set the b_modified flag in the block's journal head until after we're sure that jbd2_journal_dirty_metadata() will not abort with an error due to there not being enough space reserved in the jbd2 handle. Otherwise, future attempts to modify the buffer may lead to a large number of spurious errors and warnings. This addresses CVE-2018-10883. https://bugzilla.kernel.org/show_bug.cgi?id=200071 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:44 +02:00
Mikulas Patocka	0f80447d03	drm/udl: fix display corruption of the last line commit `99ec9e7751` upstream. The displaylink hardware has such a peculiarity that it doesn't render a command until next command is received. This produces occasional corruption, such as when setting 22x11 font on the console, only the first line of the cursor will be blinking if the cursor is located at some specific columns. When we end up with a repeating pixel, the driver has a bug that it leaves one uninitialized byte after the command (and this byte is enough to flush the command and render it - thus it fixes the screen corruption), however when we end up with a non-repeating pixel, there is no byte appended and this results in temporary screen corruption. This patch fixes the screen corruption by always appending a byte 0xAF at the end of URB. It also removes the uninitialized byte. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:44 +02:00
Paulo Alcantara	2c4f6b710b	cifs: Fix infinite loop when using hard mount option commit `7ffbe65578` upstream. For every request we send, whether it is SMB1 or SMB2+, we attempt to reconnect tcon (cifs_reconnect_tcon or smb2_reconnect) before carrying out the request. So, while server->tcpStatus != CifsNeedReconnect, we wait for the reconnection to succeed on wait_event_interruptible_timeout(). If it returns, that means that either the condition was evaluated to true, or timeout elapsed, or it was interrupted by a signal. Since we're not handling the case where the process woke up due to a received signal (-ERESTARTSYS), the next call to wait_event_interruptible_timeout() will _always_ fail and we end up looping forever inside either cifs_reconnect_tcon() or smb2_reconnect(). Here's an example of how to trigger that: $ mount.cifs //foo/share /mnt/test -o username=foo,password=foo,vers=1.0,hard (break connection to server before executing below cmd) $ stat -f /mnt/test & sleep 140 [1] 2511 $ ps -aux -q 2511 USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 2511 0.0 0.0 12892 1008 pts/0 S 12:24 0:00 stat -f /mnt/test $ kill -9 2511 (wait for a while; process is stuck in the kernel) $ ps -aux -q 2511 USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 2511 83.2 0.0 12892 1008 pts/0 R 12:24 30:01 stat -f /mnt/test Using the 'hard' mount option means that cifs.ko will keep retrying indefinitely; however, we must allow the process to be killed, otherwise it would hang the system. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Cc: stable@vger.kernel.org Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:43 +02:00
Lars Ellenberg	f9b1cd6e74	drbd: fix access after free commit `64dafbc953` upstream. We have struct drbd_requests { ... struct bio *private_bio; ... } to hold a bio clone for local submission. On local IO completion, we put that bio, and in case we want to use the result later, we overload that member to hold the ERR_PTR() of the completion result, which, before v4.3, used to be the passed in "int error", so we could first bio_put(), then assign. v4.3-rc1~100^2~21 `4246a0b63b` block: add a bi_error field to struct bio changed that: bio_put(req->private_bio); - req->private_bio = ERR_PTR(error); + req->private_bio = ERR_PTR(bio->bi_error); Which introduces an access after free, because it was non-obvious that req->private_bio == bio. Impact of that was mostly unnoticeable, because we only use that value in a multiple-failure case, and even then map any "unexpected" error code to EIO, so worst case we could potentially mask a more specific error with EIO in a multiple failure case. Unless the pointed to memory region was unmapped, as is the case with CONFIG_DEBUG_PAGEALLOC, in which case this results in BUG: unable to handle kernel paging request v4.13-rc1~70^2~75 `4e4cbee93d` block: switch bios to blk_status_t changes it further to bio_put(req->private_bio); req->private_bio = ERR_PTR(blk_status_to_errno(bio->bi_status)); And blk_status_to_errno() now contains a WARN_ON_ONCE() for unexpected values, which catches this "sometimes", if the memory has been reused quickly enough for other things. Should also go into stable since 4.3, with the trivial change around 4.13. Cc: stable@vger.kernel.org Fixes: `4246a0b63b` block: add a bi_error field to struct bio Reported-by: Sarah Newman <srn@prgmr.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:43 +02:00
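The ordering problem in the drbd commit above can be illustrated with a small userspace sketch. This is not the drbd code: `struct fake_bio`, `fake_bio_alloc()`, `fake_bio_put()` and `complete_and_put()` are hypothetical stand-ins invented for illustration. The sketch only shows the general rule the fix relies on: copy the status out of a reference-counted object before dropping the reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bio: a refcount plus a completion status. */
struct fake_bio {
    int refcount;
    int error;
};

/* Hypothetical helper: allocate an object holding a single reference. */
static struct fake_bio *fake_bio_alloc(int error)
{
    struct fake_bio *bio = malloc(sizeof(*bio));

    if (!bio)
        return NULL;
    bio->refcount = 1;
    bio->error = error;
    return bio;
}

/* Hypothetical stand-in for bio_put(): drop a reference, free on the last one. */
static void fake_bio_put(struct fake_bio *bio)
{
    assert(bio->refcount > 0);
    if (--bio->refcount == 0)
        free(bio);
}

/*
 * Safe ordering: read the status while the object is still alive, then
 * drop the reference. Reading bio->error after fake_bio_put() would be
 * the access after free that the commit message describes.
 */
static int complete_and_put(struct fake_bio *bio)
{
    int error = bio->error; /* read first */

    fake_bio_put(bio);      /* bio may be freed here */
    return error;           /* only the saved copy is used afterwards */
}
```

Swapping the two statements in `complete_and_put()` reproduces the bug pattern; tools such as AddressSanitizer would typically flag it in userspace.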
Christian Borntraeger	0cab67a1ed	s390: Correct register corruption in critical section cleanup commit `891f6a726c` upstream. In the critical section cleanup we must not mess with r1. For march=z9 or older, larl + ex (instead of exrl) are used with r1 as a temporary register. This can clobber r1 in several interrupt handlers. Fix this by using r11 as a temp register. r11 is being saved by all callers of cleanup_critical. Fixes: `6dd85fbb87` ("s390: move expoline assembler macros to a header") Cc: stable@vger.kernel.org #v4.16 Reported-by: Oliver Kurz <okurz@suse.com> Reported-by: Petr Tesařík <ptesarik@suse.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:43 +02:00
Jann Horn	b6db8af7e3	scsi: sg: mitigate read/write abuse commit `26b5b874af` upstream. As Al Viro noted in commit `128394eff3` ("sg_write()/bsg_write() is not fit to be called under KERNEL_DS"), sg improperly accesses userspace memory outside the provided buffer, permitting kernel memory corruption via splice(). But it doesn't just do it on ->write(), also on ->read(). As a band-aid, make sure that the ->read() and ->write() handlers can not be called in weird contexts (kernel context or credentials different from file opener), like for ib_safe_file_access(). If someone needs to use these interfaces from different security contexts, a new interface should be written that goes through the ->ioctl() handler. I've mostly copypasted ib_safe_file_access() over as sg_safe_file_access() because I couldn't find a good common header - please tell me if you know a better way. [mkp: s/_safe_/_check_/] Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: <stable@vger.kernel.org> Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:43 +02:00
Changbin Du	07cd8167aa	tracing: Fix missing return symbol in function_graph output commit `1fe4293f4b` upstream. The function_graph tracer does not show the interrupt return marker for the leaf entry. On leaf entries, we see an unbalanced interrupt marker (the interrupt was entered, but never left). Before: 1) \| SyS_write() { 1) \| __fdget_pos() { 1) 0.061 us \| __fget_light(); 1) 0.289 us \| } 1) \| vfs_write() { 1) 0.049 us \| rw_verify_area(); 1) + 15.424 us \| __vfs_write(); 1) ==========> \| 1) 6.003 us \| smp_apic_timer_interrupt(); 1) 0.055 us \| __fsnotify_parent(); 1) 0.073 us \| fsnotify(); 1) + 23.665 us \| } 1) + 24.501 us \| } After: 0) \| SyS_write() { 0) \| __fdget_pos() { 0) 0.052 us \| __fget_light(); 0) 0.328 us \| } 0) \| vfs_write() { 0) 0.057 us \| rw_verify_area(); 0) \| __vfs_write() { 0) ==========> \| 0) 8.548 us \| smp_apic_timer_interrupt(); 0) <========== \| 0) + 36.507 us \| } /* __vfs_write */ 0) 0.049 us \| __fsnotify_parent(); 0) 0.066 us \| fsnotify(); 0) + 50.064 us \| } 0) + 50.952 us \| } Link: http://lkml.kernel.org/r/1517413729-20411-1-git-send-email-changbin.du@intel.com Cc: stable@vger.kernel.org Fixes: `f8b755ac8e` ("tracing/function-graph-tracer: Output arrows signal on hardirq call/return") Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:43 +02:00
Cannon Matthews	433c183fa2	mm: hugetlb: yield when prepping struct pages commit `520495fe96` upstream. When booting with very large numbers of gigantic (i.e. 1G) pages, the operations in the loop of gather_bootmem_prealloc, and specifically prep_compound_gigantic_page, takes a very long time, and can cause a softlockup if enough pages are requested at boot. For example booting with 3844 1G pages requires prepping (set_compound_head, init the count) over 1 billion 4K tail pages, which takes considerable time. Add a cond_resched() to the outer loop in gather_bootmem_prealloc() to prevent this lockup. Tested: Booted with softlockup_panic=1 hugepagesz=1G hugepages=3844 and no softlockup is reported, and the hugepages are reported as successfully setup. Link: http://lkml.kernel.org/r/20180627214447.260804-1-cannonmatthews@google.com Signed-off-by: Cannon Matthews <cannonmatthews@google.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Andres Lagar-Cavilla <andreslc@google.com> Cc: Peter Feiner <pfeiner@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:43 +02:00
Andy Lutomirski	1adc34adc3	x86/cpu: Re-apply forced caps every time CPU caps are re-read commit `60d3450167` upstream. Calling get_cpu_cap() will reset a bunch of CPU features. This will cause the system to lose track of force-set and force-cleared features in the words that are reset until the end of CPU initialization. This can cause X86_FEATURE_FPU, for example, to change back and forth during boot and potentially confuse CPU setup. To minimize the chance of confusion, re-apply forced caps every time get_cpu_cap() is called. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Whitehead <tedheadster@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Link: http://lkml.kernel.org/r/c817eb373d2c67c2c81413a70fc9b845fa34a37e.1484705016.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:43 +02:00
Juergen Gross	05a5d4baac	x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths commit `74899d92e6` upstream. Commit: `1f50ddb4f4` ("x86/speculation: Handle HT correctly on AMD") ... added speculative_store_bypass_ht_init() to the per-CPU initialization sequence. speculative_store_bypass_ht_init() needs to be called on each CPU for PV guests, too. Reported-by: Brian Woods <brian.woods@amd.com> Tested-by: Brian Woods <brian.woods@amd.com> Signed-off-by: Juergen Gross <jgross@suse.com> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: xen-devel@lists.xenproject.org Fixes: `1f50ddb4f4` ("x86/speculation: Handle HT correctly on AMD") Link: https://lore.kernel.org/lkml/20180621084331.21228-1-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:43 +02:00
Mike Marciniszyn	389a3fcb34	IB/hfi1: Fix user context tail allocation for DMA_RTAIL commit `1bc0299d97` upstream. The following code fails to allocate a buffer for the tail address that the hardware DMAs into when the user context DMA_RTAIL is set. if (HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL)) { rcd->rcvhdrtail_kvaddr = dma_zalloc_coherent( &dd->pcidev->dev, PAGE_SIZE, &dma_hdrqtail, gfp_flags); if (!rcd->rcvhdrtail_kvaddr) goto bail_free; rcd->rcvhdrqtailaddr_dma = dma_hdrqtail; } So the rcvhdrtail_kvaddr would then be NULL. The mmap logic fails to check for a NULL rcvhdrtail_kvaddr. The fix is to test for both user and kernel DMA_TAIL options during the allocation as well as testing for a NULL rcvhdrtail_kvaddr during the mmap processing. Additionally, all downstream testing of the capmask for DMA_RTAIL have been eliminated in favor of testing rcvhdrtail_kvaddr. Cc: <stable@vger.kernel.org> # 4.9.x Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:43 +02:00
Sean Nyekjaer	0e76f4db40	ARM: dts: imx6q: Use correct SDMA script for SPI5 core commit `df07101e1c` upstream. According to the reference manual the shp_2_mcu / mcu_2_shp scripts must be used for devices connected through the SPBA. This fixes an issue we saw with DMA transfers. Sometimes the SPI controller RX FIFO was not empty after a DMA transfer and the driver got stuck in the next PIO transfer when it read one word more than expected. commit `dd4b487b32` ("ARM: dts: imx6: Use correct SDMA script for SPI cores") is fixing the same issue but only for SPI1 - 4. Fixes: `677940258d` ("ARM: dts: imx6q: enable dma for ecspi5") Signed-off-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:43 +02:00
Grygorii Strashko	7dafda5bf2	net: phy: micrel: fix crash when statistic requested for KSZ9031 phy commit `bfe7244257` upstream. Now the command: ethtool --phy-statistics eth0 will cause a system crash with the message "Unable to handle kernel NULL pointer dereference at virtual address 00000010" from: (kszphy_get_stats) from [<c069f1d8>] (ethtool_get_phy_stats+0xd8/0x210) (ethtool_get_phy_stats) from [<c06a0738>] (dev_ethtool+0x5b8/0x228c) (dev_ethtool) from [<c06b5484>] (dev_ioctl+0x3fc/0x964) (dev_ioctl) from [<c0679f7c>] (sock_ioctl+0x170/0x2c0) (sock_ioctl) from [<c02419d4>] (do_vfs_ioctl+0xa8/0x95c) (do_vfs_ioctl) from [<c02422c4>] (SyS_ioctl+0x3c/0x64) (SyS_ioctl) from [<c0107d60>] (ret_fast_syscall+0x0/0x44) The reason: the phy_driver structure for the KSZ9031 phy has no .probe() callback defined. As a result, the struct phy_device *phydev->priv pointer will not be initialized (null). This issue will also affect the following phys: KSZ8795, KSZ886X, KSZ8873MLL, KSZ9031, KSZ9021, KSZ8061, KS8737 Fix it by: - adding .probe() = kszphy_probe() callback to KSZ9031, KSZ9021 phys. The kszphy_probe() can be re-used as it doesn't do any phy specific settings. - removing statistic callbacks from other phys (KSZ8795, KSZ886X, KSZ8873MLL, KSZ8061, KS8737) as they don't have corresponding statistic counters. Fixes: `2b2427d064` ("phy: micrel: Add ethtool statistics counters") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Dan Rue <dan.rue@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:42 +02:00
David S. Miller	5b8fcc0757	Revert "sit: reload iphdr in ipip6_rcv" commit `f4eb17e1ef` upstream. This reverts commit `b699d00358`. As per Eric Dumazet, the pskb_may_pull() is a NOP in this particular case, so the 'iph' reload is unnecessary. Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Luca Boccassi <luca.boccassi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:42 +02:00
Taehee Yoo	440bf5ac49	netfilter: nf_tables: use WARN_ON_ONCE instead of BUG_ON in nft_do_chain() commit `adc972c5b8` upstream. When depth of chain is bigger than NFT_JUMP_STACK_SIZE, the nft_do_chain crashes. But there is no need to crash hard here. Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:42 +02:00
Masami Hiramatsu	8391d38ca8	kprobes/x86: Do not modify singlestep buffer while resuming commit `804dec5bda` upstream. Do not modify the singlestep execution buffer (kprobe.ainsn.insn) while resuming from single-stepping. Instead, modify the buffer to add a jump-back instruction when preparing the buffer. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ye Xiaolong <xiaolong.ye@intel.com> Link: http://lkml.kernel.org/r/149076361560.22469.1610155860343077495.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Signed-off-by: Alexey Makhalov <amakhalov@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:42 +02:00
Ben Hutchings	58d7ac7d30	ipv4: Fix error return value in fib_convert_metrics() The validation code modified by commit `5b5e7a0de2` ("net: metrics: add proper netlink validation") is organised differently in older kernel versions. The fib_convert_metrics() function that is modified in the backports to 4.4 and 4.9 needs to return an error code, not a success flag. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:42 +02:00
Wolfram Sang	e581746bc7	i2c: rcar: fix resume by always initializing registers before transfer commit `ae481cc139` upstream. Resume failed because of uninitialized registers. Instead of adding a resume callback, we simply initialize registers before every transfer. This lightweight change is more robust and will keep us safe if we ever need support for power domains or dynamic frequency changes. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:42 +02:00
Alexander Potapenko	3bf351b891	vt: prevent leaking uninitialized data to userspace via /dev/vcs* commit `21eff69aaa` upstream. KMSAN reported an infoleak when reading from /dev/vcs*: BUG: KMSAN: kernel-infoleak in vcs_read+0x18ba/0x1cc0 Call Trace: ... kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1253 copy_to_user ./include/linux/uaccess.h:184 vcs_read+0x18ba/0x1cc0 drivers/tty/vt/vc_screen.c:352 __vfs_read+0x1b2/0x9d0 fs/read_write.c:416 vfs_read+0x36c/0x6b0 fs/read_write.c:452 ... Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:189 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:315 __kmalloc+0x13a/0x350 mm/slub.c:3818 kmalloc ./include/linux/slab.h:517 vc_allocate+0x438/0x800 drivers/tty/vt/vt.c:787 con_install+0x8c/0x640 drivers/tty/vt/vt.c:2880 tty_driver_install_tty drivers/tty/tty_io.c:1224 tty_init_dev+0x1b5/0x1020 drivers/tty/tty_io.c:1324 tty_open_by_driver drivers/tty/tty_io.c:1959 tty_open+0x17b4/0x2ed0 drivers/tty/tty_io.c:2007 chrdev_open+0xc25/0xd90 fs/char_dev.c:417 do_dentry_open+0xccc/0x1440 fs/open.c:794 vfs_open+0x1b6/0x2f0 fs/open.c:908 ... Bytes 0-79 of 240 are uninitialized Consistently allocating \|vc_screenbuf\| with kzalloc() fixes the problem Reported-by: syzbot+17a8efdf800000@syzkaller.appspotmail.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:42 +02:00
Laura Abbott	06bef9eebe	staging: android: ion: Return an ERR_PTR in ion_map_kernel commit `0a2bc00341` upstream. The expected return value from ion_map_kernel is an ERR_PTR. The error path for a vmalloc failure currently just returns NULL, triggering a warning in ion_buffer_kmap_get. Encode the vmalloc failure as an ERR_PTR. Reported-by: syzbot+55b1d9f811650de944c6@syzkaller.appspotmail.com Signed-off-by: Laura Abbott <labbott@redhat.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:42 +02:00
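The ERR_PTR convention mentioned in the ion commit above can be sketched in plain userspace C. The `sketch_*` helpers below are simplified re-implementations written for illustration, not the kernel's `include/linux/err.h`, and `sketch_map()` is a hypothetical function standing in for a mapping routine whose allocation can fail. The point is that a caller which tests for an encoded error never mistakes a bare NULL for success or failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed to mirror the kernel's convention: the top 4095 addresses encode errors. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode the errno value back out of an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the reserved error range. */
static inline int sketch_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/*
 * Hypothetical mapping function: on allocation failure it returns an
 * encoded -ENOMEM instead of NULL, so callers need only one check.
 */
static void *sketch_map(size_t size, int simulate_failure)
{
    void *vaddr = simulate_failure ? NULL : malloc(size);

    if (!vaddr)
        return sketch_err_ptr(-ENOMEM);
    return vaddr;
}
```

A caller would test the result with `sketch_is_err()` and, on failure, propagate `sketch_ptr_err()` rather than checking for NULL.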
Tetsuo Handa	9264e9864a	n_tty: Access echo_* variables carefully. commit `ebec3f8f52` upstream. syzbot is reporting stalls at __process_echoes() [1]. This is because ldata->echo_commit < ldata->echo_tail becomes true for some reason, so the discard loop acts as an almost infinite loop. This patch tries to avoid falling into the ldata->echo_commit < ldata->echo_tail situation by accessing echo_* variables more carefully. Since reset_buffer_flags() is called without output_lock held, it should not touch echo_* variables. And omit a call to reset_buffer_flags() from n_tty_open() by using vzalloc(). Since add_echo_byte() is called without output_lock held, it needs a memory barrier between storing into echo_buf[] and incrementing the echo_head counter. echo_buf() needs a corresponding memory barrier before reading echo_buf[]. Lack of handling the possibility of a not-yet-stored multi-byte operation might be the reason for falling into the ldata->echo_commit < ldata->echo_tail situation, for if I do WARN_ON(ldata->echo_commit == tail + 1) prior to echo_buf(ldata, tail + 1), the WARN_ON() fires. Also, explicitly mask with the buffer size for the former "while" loop, and use ldata->echo_commit > tail for the latter "while" loop. [1] https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+108696293d7a21ab688f@syzkaller.appspotmail.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:41 +02:00
Tetsuo Handa	947dead99e	n_tty: Fix stall at n_tty_receive_char_special(). commit `3d63b7e4ae` upstream. syzbot is reporting stalls at n_tty_receive_char_special() [1]. This is because comparison is not working as expected since ldata->read_head can change at any moment. Mitigate this by explicitly masking with buffer size when checking condition for "while" loops. [1] https://syzkaller.appspot.com/bug?id=3d7481a346958d9469bebbeb0537d5f056bdd6e8 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+18df353d7540aa6b5467@syzkaller.appspotmail.com> Fixes: `bc5a5e3f45` ("n_tty: Don't wrap input buffer indices at buffer size") Cc: stable <stable@vger.kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:41 +02:00
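The "mask with buffer size" idea in the n_tty commit above can be sketched with a small ring buffer whose indices are free-running (never wrapped at the buffer size). Everything here is hypothetical and written for illustration (`struct sketch_ring`, `SKETCH_BUF_SIZE`, `SKETCH_MASK`, `sketch_push()`, `sketch_count_pending()`); it is not the n_tty code. It shows why comparing masked positions bounds a loop: the walk can take at most `SKETCH_BUF_SIZE - 1` steps, whatever the raw index values happen to be.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BUF_SIZE 8u /* assumed to be a power of two */
#define SKETCH_MASK(i) ((i) & (SKETCH_BUF_SIZE - 1))

/* Hypothetical ring buffer with free-running indices. */
struct sketch_ring {
    unsigned char buf[SKETCH_BUF_SIZE];
    size_t head; /* write index, only ever incremented */
    size_t tail; /* read index, only ever incremented */
};

/* Store one byte; the raw index is masked only when used as a subscript. */
static void sketch_push(struct sketch_ring *r, unsigned char c)
{
    r->buf[SKETCH_MASK(r->head)] = c;
    r->head++;
}

/*
 * Walk from tail towards head comparing masked positions. Because both
 * sides are reduced modulo the buffer size, the loop terminates within
 * SKETCH_BUF_SIZE - 1 iterations even if the raw indices are inconsistent.
 * This sketch assumes fewer than SKETCH_BUF_SIZE bytes are pending.
 */
static size_t sketch_count_pending(const struct sketch_ring *r)
{
    size_t tail = r->tail;
    size_t n = 0;

    while (SKETCH_MASK(tail) != SKETCH_MASK(r->head)) {
        tail++;
        n++;
    }
    return n;
}
```

A loop written as `while (tail != head)` on the raw indices has no such bound if another context changes `head` so that the two never compare equal, which is the kind of stall the commit message describes.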
William Wu	42525f7a25	usb: dwc2: fix the incorrect bitmaps for the ports of multi_tt hub commit `8760675932` upstream. The dwc2_get_ls_map() uses ttport to reference into the bitmap if we're on a multi_tt hub. But the bitmaps index from 0 to (hub->maxchild - 1), while the ttport index from 1 to hub->maxchild. This will cause invalid memory access when the number of ttport is hub->maxchild. Without this patch, I can easily hit a kernel panic if I connect a low-speed USB mouse to the max port of an FE2.1 multi-tt hub (1a40:0201) on the rk3288 platform. Fixes: `9f9f09b048` ("usb: dwc2: host: Totally redo the microframe scheduler") Cc: <stable@vger.kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Acked-by: Minas Harutyunyan <hminas@synopsys.com> Signed-off-by: William Wu <william.wu@rock-chips.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:41 +02:00
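The off-by-one described in the dwc2 commit above comes down to converting a 1-based port number into a 0-based array index. A minimal sketch, assuming a hypothetical helper named `sketch_ttport_to_index()` (this is not the driver's code):

```c
#include <assert.h>

/*
 * ttport is 1-based (1..maxchild) while the per-port bitmap array is
 * 0-based (0..maxchild - 1), so the index is ttport - 1. Using ttport
 * directly would read one element past the end when ttport == maxchild.
 */
static unsigned int sketch_ttport_to_index(unsigned int ttport,
                                           unsigned int maxchild)
{
    assert(ttport >= 1 && ttport <= maxchild);
    return ttport - 1;
}
```

For a four-port hub, port 4 maps to index 3, the last valid slot.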
Karoly Pados	1b9f7d2705	USB: serial: cp210x: add Silicon Labs IDs for Windows Update commit `2f83982338` upstream. Silicon Labs defines alternative VID/PID pairs for some chips that when used will automatically install drivers for Windows users without manual intervention. Unfortunately, these IDs are not recognized by the Linux module, so using these IDs improves user experience on one platform but degrades it on Linux. This patch addresses this problem. Signed-off-by: Karoly Pados <pados@pados.hu> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:41 +02:00
Johan Hovold	b9a0ce3b84	USB: serial: cp210x: add CESINEL device ids commit `24160628a3` upstream. Add device ids for CESINEL products. Reported-by: Carlos Barcala Lara <cabl@cesinel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:41 +02:00
Houston Yaroschoff	e8aa3b401d	usb: cdc_acm: Add quirk for Uniden UBC125 scanner commit `4a762569a2` upstream. Uniden UBC125 radio scanner has USB interface which fails to work with cdc_acm driver: usb 1-1.5: new full-speed USB device number 4 using xhci_hcd cdc_acm 1-1.5:1.0: Zero length descriptor references cdc_acm: probe of 1-1.5:1.0 failed with error -22 Adding the NO_UNION_NORMAL quirk for the device fixes the issue: usb 1-4: new full-speed USB device number 15 using xhci_hcd usb 1-4: New USB device found, idVendor=1965, idProduct=0018 usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3 usb 1-4: Product: UBC125XLT usb 1-4: Manufacturer: Uniden Corp. usb 1-4: SerialNumber: 0001 cdc_acm 1-4:1.0: ttyACM0: USB ACM device `lsusb -v` of the device: Bus 001 Device 015: ID 1965:0018 Uniden Corporation Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 2 Communications bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 idVendor 0x1965 Uniden Corporation idProduct 0x0018 bcdDevice 0.01 iManufacturer 1 Uniden Corp. 
iProduct 2 UBC125XLT iSerial 3 0001 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 48 bNumInterfaces 2 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 500mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 2 Communications bInterfaceSubClass 2 Abstract (modem) bInterfaceProtocol 0 None iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x87 EP 7 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 10 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 10 CDC Data bInterfaceSubClass 0 Unused bInterfaceProtocol 0 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Device Status: 0x0000 (Bus Powered) Signed-off-by: Houston Yaroschoff <hstn@4ever3.net> Cc: stable <stable@vger.kernel.org> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-11 16:26:41 +02:00
Greg Kroah-Hartman	e692f66fab	Linux 4.9.111	2018-07-03 11:23:18 +02:00
Bjørn Mork	35fd10aeb2	cdc_ncm: avoid padding beyond end of skb commit `49c2c3f246` upstream. Commit `4a0e3e989d` ("cdc_ncm: Add support for moving NDP to end of NCM frame") added logic to reserve space for the NDP at the end of the NTB/skb. This reservation did not take the final alignment of the NDP into account, causing us to reserve too little space. Additionally the padding prior to NDP addition did not ensure there was enough space for the NDP. The NTB/skb with the NDP appended would then exceed the configured max size. This caused the final padding of the NTB to use a negative count, padding to almost INT_MAX, and resulting in: [60103.825970] BUG: unable to handle kernel paging request at ffff9641f2004000 [60103.825998] IP: __memset+0x24/0x30 [60103.826001] PGD a6a06067 P4D a6a06067 PUD 4f65a063 PMD 72003063 PTE 0 [60103.826013] Oops: 0002 [#1] SMP NOPTI [60103.826018] Modules linked in: (removed( [60103.826158] CPU: 0 PID: 5990 Comm: Chrome_DevTools Tainted: G O 4.14.0-3-amd64 #1 Debian 4.14.17-1 [60103.826162] Hardware name: LENOVO 20081 BIOS 41CN28WW(V2.04) 05/03/2012 [60103.826166] task: ffff964193484fc0 task.stack: ffffb2890137c000 [60103.826171] RIP: 0010:__memset+0x24/0x30 [60103.826174] RSP: 0000:ffff964316c03b68 EFLAGS: 00010216 [60103.826178] RAX: 0000000000000000 RBX: 00000000fffffffd RCX: 000000001ffa5000 [60103.826181] RDX: 0000000000000005 RSI: 0000000000000000 RDI: ffff9641f2003ffc [60103.826184] RBP: ffff964192f6c800 R08: 00000000304d434e R09: ffff9641f1d2c004 [60103.826187] R10: 0000000000000002 R11: 00000000000005ae R12: ffff9642e6957a80 [60103.826190] R13: ffff964282ff2ee8 R14: 000000000000000d R15: ffff9642e4843900 [60103.826194] FS: 00007f395aaf6700(0000) GS:ffff964316c00000(0000) knlGS:0000000000000000 [60103.826197] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [60103.826200] CR2: ffff9641f2004000 CR3: 0000000013b0c000 CR4: 00000000000006f0 [60103.826204] Call Trace: [60103.826212] <IRQ> [60103.826225] 
cdc_ncm_fill_tx_frame+0x5e3/0x740 [cdc_ncm] [60103.826236] cdc_ncm_tx_fixup+0x57/0x70 [cdc_ncm] [60103.826246] usbnet_start_xmit+0x5d/0x710 [usbnet] [60103.826254] ? netif_skb_features+0x119/0x250 [60103.826259] dev_hard_start_xmit+0xa1/0x200 [60103.826267] sch_direct_xmit+0xf2/0x1b0 [60103.826273] __dev_queue_xmit+0x5e3/0x7c0 [60103.826280] ? ip_finish_output2+0x263/0x3c0 [60103.826284] ip_finish_output2+0x263/0x3c0 [60103.826289] ? ip_output+0x6c/0xe0 [60103.826293] ip_output+0x6c/0xe0 [60103.826298] ? ip_forward_options+0x1a0/0x1a0 [60103.826303] tcp_transmit_skb+0x516/0x9b0 [60103.826309] tcp_write_xmit+0x1aa/0xee0 [60103.826313] ? sch_direct_xmit+0x71/0x1b0 [60103.826318] tcp_tasklet_func+0x177/0x180 [60103.826325] tasklet_action+0x5f/0x110 [60103.826332] __do_softirq+0xde/0x2b3 [60103.826337] irq_exit+0xae/0xb0 [60103.826342] do_IRQ+0x81/0xd0 [60103.826347] common_interrupt+0x98/0x98 [60103.826351] </IRQ> [60103.826355] RIP: 0033:0x7f397bdf2282 [60103.826358] RSP: 002b:00007f395aaf57d8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff6e [60103.826362] RAX: 0000000000000000 RBX: 00002f07bc6d0900 RCX: 00007f39752d7fe7 [60103.826365] RDX: 0000000000000022 RSI: 0000000000000147 RDI: 00002f07baea02c0 [60103.826368] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 [60103.826371] R10: 00000000ffffffff R11: 0000000000000000 R12: 00002f07baea02c0 [60103.826373] R13: 00002f07bba227a0 R14: 00002f07bc6d090c R15: 0000000000000000 [60103.826377] Code: 90 90 90 90 90 90 90 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 [60103.826442] RIP: __memset+0x24/0x30 RSP: ffff964316c03b68 [60103.826444] CR2: ffff9641f2004000 Commit `e1069bbfcf` ("net: cdc_ncm: Reduce memory use when kernel memory low") made this bug much more likely to trigger by reducing the NTB size under memory pressure. 
Link: https://bugs.debian.org/893393 Reported-by: Горбешко Богдан <bodqhrohro@gmail.com> Reported-and-tested-by: Dennis Wassenberg <dennis.wassenberg@secunet.com> Cc: Enrico Mioso <mrkiko.rs@gmail.com> Fixes: `4a0e3e989d` ("cdc_ncm: Add support for moving NDP to end of NCM frame") [ bmork: tx_curr_size => tx_max and context fixup for v4.12 and older ] Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:18 +02:00
Mike Snitzer	f2bc5d18d2	dm thin: handle running out of data space vs concurrent discard commit `a685557fbb` upstream. Discards issued to a DM thin device can complete to userspace (via fstrim) _before_ the metadata changes associated with the discards are reflected in the thinp superblock (e.g. free blocks). As such, if a user constructs a test that loops repeatedly over these steps, block allocation can fail due to discards not having completed yet: 1) fill thin device via filesystem file 2) remove file 3) fstrim From initial report, here: https://www.redhat.com/archives/dm-devel/2018-April/msg00022.html "The root cause of this issue is that dm-thin will first remove mapping and increase corresponding blocks' reference count to prevent them from being reused before DISCARD bios get processed by the underlying layers. However, increasing blocks' reference count could also increase the nr_allocated_this_transaction in struct sm_disk which makes smd->old_ll.nr_allocated + smd->nr_allocated_this_transaction bigger than smd->old_ll.nr_blocks. In this case, alloc_data_block() will never commit metadata to reset the begin pointer of struct sm_disk, because sm_disk_get_nr_free() always return an underflow value." While there is room for improvement to the space-map accounting that thinp is making use of: the reality is this test is inherently racy and will result in the previous iteration's fstrim's discard(s) completing vs concurrent block allocation, via dd, in the next iteration of the loop. No amount of space map accounting improvements will be able to allow users to use a block before a discard of that block has completed. So the best we can really do is allow DM thinp to gracefully handle such aggressive use of all the pool's data by degrading the pool into out-of-data-space (OODS) mode. We _should_ get that behaviour already (if space map accounting didn't falsely cause alloc_data_block() to believe free space was available).. 
but short of that we handle the current reality that dm_pool_alloc_data_block() can return -ENOSPC. Reported-by: Dennis Yang <dennisyang@qnap.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:18 +02:00
Keith Busch	17057c59bd	block: Fix transfer when chunk sectors exceeds max commit `15bfd21fbc` upstream. A device may have boundary restrictions where the number of sectors between boundaries exceeds its max transfer size. In this case, we need to cap the max size to the smaller of the two limits. Reported-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Tested-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Cc: <stable@vger.kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:18 +02:00
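The capping rule described in the entry above can be sketched as follows. This is only an illustrative model, not the kernel code: the function name `max_transfer_sectors` and its parameters are hypothetical, and it assumes that a `chunk_sectors` of 0 means the device has no boundary restriction.

```python
def max_transfer_sectors(max_sectors, chunk_sectors, offset):
    """Largest transfer (in sectors) allowed at `offset` (hypothetical helper).

    max_sectors   -- the device's maximum transfer size
    chunk_sectors -- distance between boundaries; 0 is assumed to mean "none"
    offset        -- starting sector of the transfer
    """
    if chunk_sectors == 0:
        return max_sectors
    # Sectors left before the transfer would cross the next boundary.
    to_boundary = chunk_sectors - (offset % chunk_sectors)
    # Honor both limits by taking the smaller of the two.
    return min(max_sectors, to_boundary)
```

With a boundary spacing larger than the maximum transfer size (say 1024 versus 256 sectors), the result never exceeds 256, which is the case the commit message describes.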
Takashi Iwai	afd82d0757	ALSA: hda/realtek - Add a quirk for FSC ESPRIMO U9210 commit `275ec0cb94` upstream. Fujitsu Siemens ESPRIMO Mobile U9210 requires the same fixup as H270 for the correct pin configs. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200107 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:17 +02:00
Takashi Iwai	6008de291a	ALSA: hda/realtek - Fix pop noise on Lenovo P50 & co commit `d5a6cabf02` upstream. Some Lenovo laptops, e.g. Lenovo P50, showed the pop noise at resume or runtime resume. It turned out to be reduced by applying alc_no_shutup() just like TPT440 quirk does. Since there are many Lenovo models showing the same behavior, put this workaround in ALC269_FIXUP_THINKPAD_ACPI entry so that it's applied commonly to all such Lenovo machines. Reported-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Benjamin Berg <bberg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:17 +02:00
???	58d8103113	Input: elantech - fix V4 report decoding for module with middle key commit `e0ae2519ca` upstream. Some touchpads have a middle key, which is indicated in bit 2 of packet[0]. We need to fix the V4 format's byte mask to prevent decoding errors. Signed-off-by: KT Liao <kt.liao@emc.com.tw> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:17 +02:00
Aaron Ma	465e965f64	Input: elantech - enable middle button of touchpads on ThinkPad P52 commit `24bb555e6e` upstream. PNPID is a better way to identify the type of touchpad. Enable middle button support on 2 types of touchpads on Lenovo P52. Cc: stable@vger.kernel.org Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:17 +02:00
Ben Hutchings	54ae564b35	Input: elan_i2c_smbus - fix more potential stack buffer overflows commit `50fc7b6195` upstream. Commit `40f7090bb1` ("Input: elan_i2c_smbus - fix corrupted stack") fixed most of the functions using i2c_smbus_read_block_data() to allocate a buffer with the maximum block size. However three functions were left unchanged: * In elan_smbus_initialize(), increase the buffer size in the same way. * In elan_smbus_calibrate_result(), the buffer is provided by the caller (calibrate_store()), so introduce a bounce buffer. Also name the result buffer size. * In elan_smbus_get_report(), the buffer is provided by the caller but happens to be the right length. Add a compile-time assertion to ensure this remains the case. Cc: <stable@vger.kernel.org> # 3.19+ Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:17 +02:00
Jan Kara	2a1b1234d0	udf: Detect incorrect directory size commit `fa65653e57` upstream. Detect when a directory entry is (possibly partially) beyond directory size and return EIO in that case since it means the filesystem is corrupted. Otherwise directory operations can further corrupt the directory and possibly also oops the kernel. CC: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> CC: stable@vger.kernel.org Reported-and-tested-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:17 +02:00
Boris Ostrovsky	3cac26f2a2	xen: Remove unnecessary BUG_ON from __unbind_from_irq() commit `eef04c7b37` upstream. Commit `910f8befdf` ("xen/pirq: fix error path cleanup when binding MSIs") fixed a couple of errors in error cleanup path of xen_bind_pirq_msi_to_irq(). This cleanup allowed a call to __unbind_from_irq() with an unbound irq, which would result in triggering the BUG_ON there. Since there is really no reason for the BUG_ON (xen_free_irq() can operate on unbound irqs) we can remove it. Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:16 +02:00
Dan Williams	6d28f2d64c	mm: fix devmem_is_allowed() for sub-page System RAM intersections commit `2bdce74412` upstream. Hussam reports: I was poking around and for no real reason, I did cat /dev/mem and strings /dev/mem. Then I saw the following warning in dmesg. I saved it and rebooted immediately. memremap attempted on mixed range 0x000000000009c000 size: 0x1000 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 11810 at kernel/memremap.c:98 memremap+0x104/0x170 [..] Call Trace: xlate_dev_mem_ptr+0x25/0x40 read_mem+0x89/0x1a0 __vfs_read+0x36/0x170 The memremap() implementation checks for attempts to remap System RAM with MEMREMAP_WB and instead redirects those mapping attempts to the linear map. However, that only works if the physical address range being remapped is page aligned. In low memory we have situations like the following: 00000000-00000fff : Reserved 00001000-0009fbff : System RAM 0009fc00-0009ffff : Reserved ...where System RAM intersects Reserved ranges at sub-page granularity. Given that devmem_is_allowed() special cases any attempt to map System RAM in the first 1MB of memory, replace page_is_ram() with the more precise region_intersects() to trap attempts to map disallowed ranges. Link: https://bugzilla.kernel.org/show_bug.cgi?id=199999 Link: http://lkml.kernel.org/r/152856436164.18127.2847888121707136898.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: `92281dee82` ("arch: introduce memremap()") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Hussam Al-Tayeb <me@hussam.eu.org> Tested-by: Hussam Al-Tayeb <me@hussam.eu.org> Cc: Christoph Hellwig <hch@lst.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:16 +02:00
Dongsheng Yang	1f00b1fc77	rbd: flush rbd_dev->watch_dwork after watch is unregistered commit `23edca8649` upstream. There is a problem if we are going to unmap a rbd device and the watch_dwork is going to queue delayed work for watch: unmap Thread watch Thread timer do_rbd_remove cancel_tasks_sync(rbd_dev) queue_delayed_work for watch destroy_workqueue(rbd_dev->task_wq) drain_workqueue(wq) destroy other resources in wq call_timer_fn __queue_work() Then the delayed work escapes the cancel_tasks_sync() and destroy_workqueue() and we will get a use-after-free call trace: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI Modules linked in: CPU: 7 PID: 0 Comm: swapper/7 Tainted: G OE 4.17.0-rc6+ #13 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__queue_work+0x6a/0x3b0 RSP: 0018:ffff9427df1c3e90 EFLAGS: 00010086 RAX: ffff9427deca8400 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff9427deca8400 RSI: ffff9427df1c3e50 RDI: 0000000000000000 RBP: ffff942783e39e00 R08: ffff9427deca8400 R09: ffff9427df1c3f00 R10: 0000000000000004 R11: 0000000000000005 R12: ffff9427cfb85970 R13: 0000000000002000 R14: 000000000001eca0 R15: 0000000000000007 FS: 0000000000000000(0000) GS:ffff9427df1c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000004c900a005 CR4: 00000000000206e0 Call Trace: <IRQ> ? __queue_work+0x3b0/0x3b0 call_timer_fn+0x2d/0x130 run_timer_softirq+0x16e/0x430 ? tick_sched_timer+0x37/0x70 __do_softirq+0xd2/0x280 irq_exit+0xd5/0xe0 smp_apic_timer_interrupt+0x6c/0x130 apic_timer_interrupt+0xf/0x20 [ Move rbd_dev->watch_dwork cancellation so that rbd_reregister_watch() either bails out early because the watch is UNREGISTERED at that point or just gets cancelled. ] Cc: stable@vger.kernel.org Fixes: `99d1694310` ("rbd: retry watch re-registration periodically") Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:16 +02:00
Hans de Goede	037aca0e2f	pwm: lpss: platform: Save/restore the ctrl register over a suspend/resume commit `1d375b58c1` upstream. On some devices the contents of the ctrl register get lost over a suspend/resume and the PWM comes back up disabled after the resume. This is seen on some Bay Trail devices with the PWM in ACPI enumerated mode, so it shows up as a platform device instead of a PCI device. If we still think it is enabled and then try to change the duty-cycle after this, we end up with a "PWM_SW_UPDATE was not cleared" error and the PWM is stuck in that state from then on. This commit adds suspend and resume pm callbacks to the pwm-lpss-platform code, which save/restore the ctrl register over a suspend/resume, fixing this. Note that: 1) There is no need to do this over a runtime suspend, since we only runtime suspend when disabled and then we properly set the enable bit and reprogram the timings when we re-enable the PWM. 2) This may be happening on more systems than we realize, but has been covered up so far by a bug in the acpi-lpss.c code which was save/restoring the regular device registers instead of the lpss private registers due to lpss_device_desc.prv_offset not being set. This is fixed by a later patch in this series. Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:16 +02:00
Alexandr Savca	c38bac75d1	Input: elan_i2c - add ELAN0618 (Lenovo v330 15IKB) ACPI ID commit `8938fc7b8f` upstream. Add ELAN0618 to the list of supported touchpads; this ID is used in Lenovo v330 15IKB devices. Signed-off-by: Alexandr Savca <alexandr.savca@saltedge.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:16 +02:00
Kees Cook	7673ca3c93	video: uvesafb: Fix integer overflow in allocation commit `9f645bcc56` upstream. cmap->len can get close to INT_MAX/2, allowing for an integer overflow in allocation. This uses kmalloc_array() instead to catch the condition. Reported-by: Dr Silvio Cesare of InfoSect <silvio.cesare@gmail.com> Fixes: `8bdb3a2d7d` ("uvesafb: the driver core") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:16 +02:00
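To illustrate why an overflow-checked array allocation helps in the entry above, here is a rough model in Python. It is not the driver code: the helper names are hypothetical, a 32-bit `size_t` and a 4-byte element are assumed purely for the example, and `None` stands in for the allocation failing.

```python
SIZE_MAX_32 = 2**32 - 1  # assumption: 32-bit size_t for the illustration


def unchecked_size(n, elem_size, size_max=SIZE_MAX_32):
    """Byte count as a plain multiplication that silently wraps."""
    return (n * elem_size) & size_max


def checked_array_size(n, elem_size, size_max=SIZE_MAX_32):
    """Byte count, or None if n * elem_size does not fit in size_max.

    Models the kind of check an overflow-aware array allocator performs
    before allocating (hypothetical helper, not a kernel API).
    """
    if elem_size != 0 and n > size_max // elem_size:
        return None
    return n * elem_size
```

For a count such as 0x40000001 with 4-byte elements, the unchecked product wraps to a tiny size while the checked version reports failure, so no undersized buffer is ever handed back.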
Trond Myklebust	cdc83c3669	NFSv4: Revert commit `5f83d86cf5` ("NFSv4.x: Fix wraparound issues..") commit `fc40724fc6` upstream. The correct behaviour for NFSv4 sequence IDs is to wrap around to the value 0 after 0xffffffff. See https://tools.ietf.org/html/rfc5661#section-2.10.6.1 Fixes: `5f83d86cf5` ("NFSv4.x: Fix wraparound issues when validing...") Cc: stable@vger.kernel.org # 4.6+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:16 +02:00
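The wraparound behaviour the entry above refers to (a 32-bit sequence ID going from 0xffffffff back to 0) can be sketched in one line. The function name is hypothetical and this is only an arithmetic illustration, not the NFS client code.

```python
def next_sequence_id(seqid):
    """Next 32-bit sequence ID; 0xffffffff wraps around to 0."""
    return (seqid + 1) & 0xFFFFFFFF
```

The mask keeps the value in the unsigned 32-bit range, so no sequence ID value is skipped or treated specially at the wrap point.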
Dave Wysochanski	5b7f582e80	NFSv4: Fix possible 1-byte stack overflow in nfs_idmap_read_and_verify_message commit `d68894800e` upstream. In nfs_idmap_read_and_verify_message there is an incorrect sprintf '%d' that converts the __u32 'im_id' from struct idmap_msg to 'id_str', which is a stack char array variable of length NFS_UINT_MAXLEN == 11. If a uid or gid value is > 2147483647 = 0x7fffffff, the conversion overflows into a negative value, for example: crash> p (unsigned) (0x80000000) $1 = 2147483648 crash> p (signed) (0x80000000) $2 = -2147483648 The '-' sign is written to the buffer and this causes a 1 byte overflow when the NULL byte is written, which corrupts kernel stack memory. If CONFIG_CC_STACKPROTECTOR_STRONG is set we see a stack-protector panic: [11558053.616565] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffa05b8a8c [11558053.639063] CPU: 6 PID: 9423 Comm: rpc.idmapd Tainted: G W ------------ T 3.10.0-514.el7.x86_64 #1 [11558053.641990] Hardware name: Red Hat OpenStack Compute, BIOS 1.10.2-3.el7_4.1 04/01/2014 [11558053.644462] ffffffff818c7bc0 00000000b1f3aec1 ffff880de0f9bd48 ffffffff81685eac [11558053.646430] ffff880de0f9bdc8 ffffffff8167f2b3 ffffffff00000010 ffff880de0f9bdd8 [11558053.648313] ffff880de0f9bd78 00000000b1f3aec1 ffffffff811dcb03 ffffffffa05b8a8c [11558053.650107] Call Trace: [11558053.651347] [<ffffffff81685eac>] dump_stack+0x19/0x1b [11558053.653013] [<ffffffff8167f2b3>] panic+0xe3/0x1f2 [11558053.666240] [<ffffffff811dcb03>] ? kfree+0x103/0x140 [11558053.682589] [<ffffffffa05b8a8c>] ? idmap_pipe_downcall+0x1cc/0x1e0 [nfsv4] [11558053.689710] [<ffffffff810855db>] __stack_chk_fail+0x1b/0x30 [11558053.691619] [<ffffffffa05b8a8c>] idmap_pipe_downcall+0x1cc/0x1e0 [nfsv4] [11558053.693867] [<ffffffffa00209d6>] rpc_pipe_write+0x56/0x70 [sunrpc] [11558053.695763] [<ffffffff811fe12d>] vfs_write+0xbd/0x1e0 [11558053.702236] [<ffffffff810acccc>] ? 
task_work_run+0xac/0xe0 [11558053.704215] [<ffffffff811fec4f>] SyS_write+0x7f/0xe0 [11558053.709674] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b Fix this by calling the internally defined nfs_map_numeric_to_string() function which properly uses '%u' to convert this __u32. For consistency, also replace the one other place where snprintf is called. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reported-by: Stephen Johnston <sjohnsto@redhat.com> Fixes: `cf4ab538f1` ("NFSv4: Fix the string length returned by the idmapper") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:16 +02:00
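The one-byte overflow described in the entry above comes down to string lengths, which can be checked with a small model. This is a sketch in Python rather than the kernel's C: the helper names are hypothetical, `NFS_UINT_MAXLEN == 11` is taken from the commit message, and the signed reinterpretation is done by hand to mimic what a C `'%d'` conversion of a `__u32` would print.

```python
NFS_UINT_MAXLEN = 11  # buffer length quoted in the commit message


def as_signed_32(value):
    """Reinterpret an unsigned 32-bit value as signed, like C '%d' would."""
    return value - 2**32 if value >= 2**31 else value


def bytes_needed(text):
    """Characters plus the terminating NUL byte of a C string."""
    return len(text) + 1


def format_signed(im_id):
    return "%d" % as_signed_32(im_id)


def format_unsigned(im_id):
    return "%d" % (im_id & 0xFFFFFFFF)  # same digits C '%u' would print
```

For an id of 0x80000000 the signed form is "-2147483648", which needs 12 bytes including the terminator, one more than the 11-byte buffer. The unsigned form of even the largest id, "4294967295", needs exactly 11 bytes and fits.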
Scott Mayhew	40d79a6195	nfsd: restrict rd_maxcount to svc_max_payload in nfsd_encode_readdir commit `9c2ece6ef6` upstream. nfsd4_readdir_rsize restricts rd_maxcount to svc_max_payload when estimating the size of the readdir reply, but nfsd_encode_readdir restricts it to INT_MAX when encoding the reply. This can result in log messages like "kernel: RPC request reserved 32896 but used 1049444". Restrict rd_dircount similarly (no reason it should be larger than svc_max_payload). Signed-off-by: Scott Mayhew <smayhew@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:16 +02:00
Mauro Carvalho Chehab	dc00f08645	media: dvb_frontend: fix locking issues at dvb_frontend_get_event() commit `76d81243a4` upstream. As warned by smatch: drivers/media/dvb-core/dvb_frontend.c:314 dvb_frontend_get_event() warn: inconsistent returns 'sem:&fepriv->sem'. Locked on: line 288 line 295 line 306 line 314 Unlocked on: line 303 The lock implementation for get event is wrong: if an interrupt occurs, down_interruptible() will fail, and the routine will call up() twice when userspace calls the ioctl again. The bad code has been there since Linux migrated to git in 2005. Cc: stable@vger.kernel.org Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:15 +02:00
Kai-Heng Feng	1a4726ba1d	media: cx231xx: Add support for AverMedia DVD EZMaker 7 commit `29e61d6ef0` upstream. User reports AverMedia DVD EZMaker 7 can be driven by VIDEO_GRABBER. Add the device to the id_table to make it work. BugLink: https://bugs.launchpad.net/bugs/1620762 Cc: stable@vger.kernel.org Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Hans Verkuil <hansverk@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:15 +02:00
Mauro Carvalho Chehab	1e6b50b6b6	media: v4l2-compat-ioctl32: prevent go past max size commit `ea72fbf588` upstream. As warned by smatch: drivers/media/v4l2-core/v4l2-compat-ioctl32.c:879 put_v4l2_ext_controls32() warn: check for integer overflow 'count' The access_ok() logic should check for too big arrays too. Cc: stable@vger.kernel.org Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:15 +02:00
Adrian Hunter	d6a267b4c5	perf intel-pt: Fix packet decoding of CYC packets commit `621a5a327c` upstream. Use a 64-bit type so that the cycle count is not limited to 32-bits. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1528371002-8862-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:15 +02:00
Adrian Hunter	d129ab791d	perf intel-pt: Fix "Unexpected indirect branch" error commit `9fb523363f` upstream. Some Atom CPUs can produce FUP packets that contain NLIP (next linear instruction pointer) instead of CLIP (current linear instruction pointer). That will result in "Unexpected indirect branch" errors. Fix by comparing IP to NLIP in that case. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1527762225-26024-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:15 +02:00
Adrian Hunter	4213d9b8cd	perf intel-pt: Fix MTC timing after overflow commit `dd27b87ab5` upstream. On some platforms, overflows will clear before MTC wraparound, and there is no following TSC/TMA packet. In that case the previous TMA is valid. Since there will be a valid TMA either way, stop setting 'have_tma' to false upon overflow. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1527762225-26024-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:15 +02:00
Adrian Hunter	282f1f66b5	perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP commit `bd2e49ec48` upstream. It is possible to have a CBR packet between a FUP packet and corresponding TIP packet. Stop treating it as an error. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1527762225-26024-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:15 +02:00
Adrian Hunter	31606f7f56	perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING commit `dbcb82b93f` upstream. sync_switch is a facility to synchronize decoding more closely with the point in the kernel when the context actually switched. In one case, INTEL_PT_SS_NOT_TRACING state was not correctly transitioning to INTEL_PT_SS_TRACING state due to a missing case clause. Add it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1527762225-26024-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:15 +02:00
Adrian Hunter	dfd2eff6f4	perf tools: Fix symbol and object code resolution for vdso32 and vdsox32 commit `aef4feace2` upstream. Fix __kmod_path__parse() so that perf tools does not treat vdso32 and vdsox32 as kernel modules and fail to find the object. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: stable@vger.kernel.org Fixes: `1f121b03d0` ("perf tools: Deal with kernel module names in '[]' correctly") Link: http://lkml.kernel.org/r/1528117014-30032-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:15 +02:00
Andy Shevchenko	49d98a8e1f	mfd: intel-lpss: Program REMAP register in PIO mode commit `d28b625208` upstream. According to documentation REMAP register has to be programmed in either DMA or PIO mode of the slice. Move the DMA capability check below to let REMAP register be programmed in PIO mode. Cc: stable@vger.kernel.org # 4.3+ Fixes: `4b45efe852` ("mfd: Add support for Intel Sunrisepoint LPSS devices") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:14 +02:00
Johan Hovold	099fae46d8	backlight: tps65217_bl: Fix Device Tree node lookup commit `2b12dfa124` upstream. Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at the parent rather than just matching on its children. This would only cause trouble if the child node is missing while there is an unrelated node named "backlight" elsewhere in the tree. Cc: stable <stable@vger.kernel.org> # 3.7 Fixes: `eebfdc17cc` ("backlight: Add TPS65217 WLED driver") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:14 +02:00
Johan Hovold	a89e596f12	backlight: max8925_bl: Fix Device Tree node lookup commit `d1cc0ec3da` upstream. Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at the parent rather than just matching on its children. To make things worse, the parent mfd node was also prematurely freed, while the child backlight node was leaked. Cc: stable <stable@vger.kernel.org> # 3.9 Fixes: `47ec340cb8` ("mfd: max8925: Support dt for backlight") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:14 +02:00
Johan Hovold	47f764c65c	backlight: as3711_bl: Fix Device Tree node lookup commit `4a9c8bb2ac` upstream. Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at the parent rather than just matching on its children. To make things worse, the parent mfd node was also prematurely freed. Cc: stable <stable@vger.kernel.org> # 3.10 Fixes: `59eb2b5e57` ("drivers/video/backlight/as3711_bl.c: add OF support") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:14 +02:00
Silvio Cesare	da05be5556	UBIFS: Fix potential integer overflow in allocation commit `353748a359` upstream. There is potential for the size and len fields in ubifs_data_node to be too large causing either a negative value for the length fields or an integer overflow leading to an incorrect memory allocation. Likewise, when the len field is small, an integer underflow may occur. Signed-off-by: Silvio Cesare <silvio.cesare@gmail.com> Fixes: `1e51764a3c` ("UBIFS: add new flash file system") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:14 +02:00
Richard Weinberger	df15c6eeab	ubi: fastmap: Correctly handle interrupted erasures in EBA commit `781932375f` upstream. Fastmap cannot track the LEB unmap operation, therefore it can happen that after an interrupted erasure the mapping still looks good from Fastmap's point of view, while reading from the PEB will cause an ECC error and confuses the upper layer. Instead of teaching users of UBI how to deal with that, we read back the VID header and check for errors. If the PEB is empty or shows ECC errors we fixup the mapping and schedule the PEB for erasure. Fixes: `dbb7d2a88d` ("UBI: Add fastmap core") Cc: <stable@vger.kernel.org> Reported-by: martin bayern <Martinbayern@outlook.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:14 +02:00
Richard Weinberger	9eb99e738b	ubi: fastmap: Cancel work upon detach commit `6e7d801610` upstream. Ben Hutchings pointed out that `29b7a6fa1e` ("ubi: fastmap: Don't flush fastmap work on detach") does not really fix the problem, it just reduces the risk to hit the race window where fastmap work races against free()'ing ubi->volumes[]. The correct approach is making sure that no more fastmap work is in progress before we free ubi data structures. So we cancel fastmap work right after the ubi background thread is stopped. By setting ubi->thread_enabled to zero we make sure that no further work tries to wake the thread. Fixes: `29b7a6fa1e` ("ubi: fastmap: Don't flush fastmap work on detach") Fixes: `74cdaf2400` ("UBI: Fastmap: Fix memory leaks while closing the WL sub-system") Cc: stable@vger.kernel.org Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Martin Townsend <mtownsend1973@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:14 +02:00
Srinivas Kandagatla	ec7ee4d60f	rpmsg: smd: do not use mananged resources for endpoints and channels commit `4a2e84c6ed` upstream. All the managed resources would be freed by the time release function is invoked. Handling such memory in qcom_smd_edge_release() would do bad things. Found this issue while testing Audio usecase where the dsp is started up and shutdown in a loop. This patch fixes this issue by using simple kzalloc for allocating channel->name and channel which is then freed in qcom_smd_edge_release(). Without this patch restarting a remoteproc would crash the system. Fixes: `53e2822e56` ("rpmsg: Introduce Qualcomm SMD backend") Cc: <stable@vger.kernel.org> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:13 +02:00
NeilBrown	486684887a	md: fix two problems with setting the "re-add" device state. commit `011abdc9df` upstream. If "re-add" is written to the "state" file for a device which is faulty, this has an effect similar to removing and re-adding the device. It should take up the same slot in the array that it previously had, and an accelerated (e.g. bitmap-based) rebuild should happen. The slot that "it previously had" is determined by rdev->saved_raid_disk. However this is not set when a device fails (only when a device is added), and it is cleared when resync completes. This means that "re-add" will normally work once, but may not work a second time. This patch includes two fixes. 1/ when a device fails, record the ->raid_disk value in ->saved_raid_disk before clearing ->raid_disk 2/ when "re-add" is written to a device for which ->saved_raid_disk is not set, fail. I think this is suitable for stable as it can cause re-adding a device to be forced to do a full resync which takes a lot longer and so puts data at more risk. Cc: <stable@vger.kernel.org> (v4.1) Fixes: `97f6cd39da` ("md-cluster: re-add capabilities") Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:13 +02:00
Marcin Ziemianowicz	c0eb205dfe	clk: at91: PLL recalc_rate() now using cached MUL and DIV values commit `a982e45dc1` upstream. When a USB device is connected to the USB host port on the SAM9N12 then you get "-62" error which seems to indicate USB replies from the device are timing out. Based on a logic sniffer, I saw the USB bus was running at half speed. The PLL code uses cached MUL and DIV values which get set in set_rate() and applied in prepare(), but the recalc_rate() function queries the hardware instead of using these cached values. Therefore, if recalc_rate() is called between a set_rate() and prepare(), the wrong frequency is calculated and later the USB clock divider for the SAM9N12 SOC will be configured for an incorrect clock. In my case, the PLL hardware was set to 96 MHz before the OHCI driver loads, and therefore the usb clock divider was being set to /2 even though the OHCI driver set the PLL to 48 MHz. As an alternative explanation, I noticed this was fixed in the past by `87e2ed338f` ("clk: at91: fix recalc_rate implementation of PLL driver") but the bug was later re-introduced by `1bdf02326b` ("clk: at91: make use of syscon/regmap internally"). Fixes: `1bdf02326b` ("clk: at91: make use of syscon/regmap internally") Cc: <stable@vger.kernel.org> Signed-off-by: Marcin Ziemianowicz <marcin@ziemianowicz.com> Acked-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:13 +02:00
Robert Elliott	f216d1e933	linvdimm, pmem: Preserve read-only setting for pmem devices commit `254a4cd50b` upstream. The pmem driver does not honor a forced read-only setting for very long: $ blockdev --setro /dev/pmem0 $ blockdev --getro /dev/pmem0 1 followed by various commands like these: $ blockdev --rereadpt /dev/pmem0 or $ mkfs.ext4 /dev/pmem0 results in this in the kernel serial log: nd_pmem namespace0.0: region0 read-write, marking pmem0 read-write with the read-only setting lost: $ blockdev --getro /dev/pmem0 0 That's from bus.c nvdimm_revalidate_disk(), which always applies the setting from nd_region (which is initially based on the ACPI NFIT NVDIMM state flags not_armed bit). In contrast, commit `20bd1d026a` ("scsi: sd: Keep disk read-only when re-reading partition") fixed this issue for SCSI devices to preserve the previous setting if it was set to read-only. This patch modifies bus.c to preserve any previous read-only setting. It also eliminates the kernel serial log print except for cases where read-write is changed to read-only, so it doesn't print read-only to read-only non-changes. Cc: <stable@vger.kernel.org> Fixes: `5813882094` ("libnvdimm, nfit: handle unarmed dimms, mark namespaces read-only") Signed-off-by: Robert Elliott <elliott@hpe.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:13 +02:00
Steffen Maier	c6751cb1e8	scsi: zfcp: fix missing REC trigger trace on enqueue without ERP thread commit `6a76550841` upstream. Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 ZFCP_DBF_REC_TRIG Tag : ....... LUN : 0x... WWPN : 0x... D_ID : 0x... Adapter status : 0x... Port status : 0x... LUN status : 0x... Ready count : 0x... Running count : 0x... ERP want : 0x0. ZFCP_ERP_ACTION_REOPEN_... ERP need : 0xc0 ZFCP_ERP_ACTION_NONE Signed-off-by: Steffen Maier <maier@linux.ibm.com> Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:12 +02:00
Steffen Maier	2df7e6f33c	scsi: zfcp: fix missing REC trigger trace for all objects in ERP_FAILED commit `8c3d20aada` upstream. That other commit introduced an inconsistency because it would trace on ERP_FAILED for all callers of port forced reopen triggers (not just terminate_rport_io), but it would not trace on ERP_FAILED for all callers of other ERP triggers such as adapter, port regular, LUN. Therefore, generalize that other commit. zfcp_erp_action_enqueue() already had two early outs which re-used the one zfcp_dbf_rec_trig() call. All ERP trigger functions finally run through zfcp_erp_action_enqueue(). So move the special handling for ZFCP_STATUS_COMMON_ERP_FAILED into zfcp_erp_action_enqueue() and add another early out with new trace marker for pseudo ERP need in this case. This removes all early returns from all ERP trigger functions so we always end up at zfcp_dbf_rec_trig(). Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 ZFCP_DBF_REC_TRIG Tag : ....... LUN : 0x... WWPN : 0x... D_ID : 0x... Adapter status : 0x... Port status : 0x... LUN status : 0x... Ready count : 0x... Running count : 0x... ERP want : 0x0. ZFCP_ERP_ACTION_REOPEN_... ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED Signed-off-by: Steffen Maier <maier@linux.ibm.com> Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:12 +02:00
Steffen Maier	21224f6f13	scsi: zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED commit `d70aab5592` upstream. For problem determination we always want to see when we were invoked on the terminate_rport_io callback whether we perform something or not. Temporal event sequence of interest with a long fast_io_fail_tmo of 27 sec: loose remote port t workqueue [s] zfcp_q_<dev> IRQ zfcperp<dev> === ================== =================== ============================ 0 recv RSCN q p.test_link_work block rport start fast_io_fail_tmo send ADISC ELS 4 recv ADISC fail block zfcp_port port forced reopen send open port 12 recv open port fail q p.gid_pn_work zfcp_erp_wakeup (zfcp_erp_wait would return) GID_PN fail Before this point, we got a SCSI trace with tag "sctrpi1" on fast_io_fail, e.g. with the typical 5 sec setting. port.status \|= ERP_FAILED If fast_io_fail_tmo triggers after this point, we missed a SCSI trace. workqueue fc_dl_<host> ================== 27 fc_timeout_fail_rport_io fc_terminate_rport_io zfcp_scsi_terminate_rport_io zfcp_erp_port_forced_reopen _zfcp_erp_port_forced_reopen if (port.status & ERP_FAILED) return; Therefore, write a trace before above early return. Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 ZFCP_DBF_REC_TRIG Tag : sctrpi1 SCSI terminate rport I/O LUN : 0xffffffffffffffff none (invalid) WWPN : 0x<wwpn> D_ID : 0x<n_port_id> Adapter status : 0x... Port status : 0x... LUN status : 0x00000000 none (invalid) Ready count : 0x... Running count : 0x... ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED Signed-off-by: Steffen Maier <maier@linux.ibm.com> Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:12 +02:00
Steffen Maier	48ae373c57	scsi: zfcp: fix missing REC trigger trace on terminate_rport_io early return commit `96d9270499` upstream. get_device() and its internally used kobject_get() only return NULL if they get passed NULL as argument. zfcp_get_port_by_wwpn() loops over adapter->port_list so the iteration variable port is always non-NULL. Struct device is embedded in struct zfcp_port so &port->dev is always non-NULL. This is the argument to get_device(). However, if we get an fc_rport in terminate_rport_io() for which we cannot find a match within zfcp_get_port_by_wwpn(), the latter can return NULL. v2.6.30 commit `70932935b6` ("[SCSI] zfcp: Fix oops when port disappears") introduced an early return without adding a trace record for this case. Even if we don't need recovery in this case, for debugging we should still see that our callback was invoked originally by scsi_transport_fc. Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : sctrpin SCSI terminate rport I/O, no zfcp port LUN : 0xffffffffffffffff none (invalid) WWPN : 0x<wwpn> WWPN D_ID : 0x<n_port_id> N_Port-ID Adapter status : 0x... Port status : 0xffffffff unknown (-1) LUN status : 0x00000000 none (invalid) Ready count : 0x... Running count : 0x... ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED ERP need : 0xc0 ZFCP_ERP_ACTION_NONE Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: `70932935b6` ("[SCSI] zfcp: Fix oops when port disappears") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:12 +02:00
Steffen Maier	b0c2fc11ce	scsi: zfcp: fix misleading REC trigger trace where erp_action setup failed commit `512857a795` upstream. If a SCSI device is deleted during scsi_eh host reset, we cannot get a reference to the SCSI device anymore since scsi_device_get returns !=0 by design. Assuming the recovery of adapter and port(s) was successful, zfcp_erp_strategy_followup_success() attempts to trigger a LUN reset for the half-gone SCSI device. Unfortunately, it causes the following confusing trace record which states that zfcp will do a LUN recovery as "ERP need" is ZFCP_ERP_ACTION_REOPEN_LUN == 1 and equals "ERP want". Old example trace record formatted with zfcpdbf from s390-tools: Tag: : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded LUN : 0x<FCP_LUN> WWPN : 0x<WWPN> D_ID : 0x<N_Port-ID> Adapter status : 0x5400050b Port status : 0x54000001 LUN status : 0x40000000 ZFCP_STATUS_COMMON_RUNNING but not ZFCP_STATUS_COMMON_UNBLOCKED as it was closed on close part of adapter reopen ERP want : 0x01 ERP need : 0x01 misleading However, zfcp_erp_setup_act() returns NULL as it cannot get the reference. Hence, zfcp_erp_action_enqueue() takes an early goto out and _NO_ recovery actually happens. We always do want the recovery trigger trace record even if no erp_action could be enqueued as in this case. For other cases where we did not enqueue an erp_action, 'need' has always been zero to indicate this. In order to indicate above goto out, introduce an eyecatcher "flag" to mark the "ERP need" as 'not needed' but still keep the information which erp_action type, that zfcp_erp_required_act() had decided upon, is needed. 0xc_ is chosen to be visibly different from 0x0_ in "ERP want". New example trace record formatted with zfcpdbf from s390-tools: Tag: : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded LUN : 0x<FCP_LUN> WWPN : 0x<WWPN> D_ID : 0x<N_Port-ID> Adapter status : 0x5400050b Port status : 0x54000001 LUN status : 0x40000000 ERP want : 0x01 ERP need : 0xc1 would need LUN ERP, but no action set up ^ Before v2.6.38 commit `ae0904f60f` ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.") we could detect this case because the "erp_action" field in the trace was NULL. The rework removed erp_action as argument and field from the trace. This patch here is for tracing. A fix to allow LUN recovery in the case at hand is a topic for a separate patch. See also commit `fdbd1c5e27` ("[SCSI] zfcp: Allow running unit/LUN shutdown without acquiring reference") for a similar case and background info. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: `ae0904f60f` ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:12 +02:00
Steffen Maier	97d3625bdd	scsi: zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF commit `81979ae63e` upstream. We already have a SCSI trace for the end of abort and scsi_eh TMF. Due to zfcp_erp_wait() and fc_block_scsi_eh() time can pass between the start of our eh callback and an actual send/recv of an abort / TMF request. In order to see the temporal sequence including any abort / TMF send retries, add a trace before the above two blocking functions. This supports problem determination with scsi_eh and parallel zfcp ERP. No need to explicitly trace the beginning of our eh callback, since we typically can send an abort / TMF and see its HBA response (in the worst case, it's a pseudo response on dismiss all of adapter recovery, e.g. due to an FSF request timeout [fsrth_1] of the abort / TMF). If we cannot send, we now get a trace record for the first "abrt_wt" or "[lt]r_wait" which denotes almost the beginning of the callback. No need to explicitly trace the wakeup after the above two blocking functions because the next retry loop causes another trace in any case and that is sufficient. Example trace records formatted with zfcpdbf from s390-tools: Timestamp : ... Area : SCSI Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : abrt_wt abort, before zfcp_erp_wait() Request ID : 0x0000000000000000 none (invalid) SCSI ID : 0x<scsi_id> SCSI LUN : 0x<scsi_lun> SCSI LUN high : 0x<scsi_lun_high> SCSI result : 0x<scsi_result_of_cmd_to_be_aborted> SCSI retries : 0x<retries_of_cmd_to_be_aborted> SCSI allowed : 0x<allowed_retries_of_cmd_to_be_aborted> SCSI scribble : 0x<req_id_of_cmd_to_be_aborted> SCSI opcode : <CDB_of_cmd_to_be_aborted> FCP rsp inf cod: 0x.. none (invalid) FCP rsp IU : ... none (invalid) Timestamp : ... Area : SCSI Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : lr_wait LUN reset, before zfcp_erp_wait() Request ID : 0x0000000000000000 none (invalid) SCSI ID : 0x<scsi_id> SCSI LUN : 0x<scsi_lun> SCSI LUN high : 0x<scsi_lun_high> SCSI result : 0x... unrelated SCSI retries : 0x.. unrelated SCSI allowed : 0x.. unrelated SCSI scribble : 0x... unrelated SCSI opcode : ... unrelated FCP rsp inf cod: 0x.. none (invalid) FCP rsp IU : ... none (invalid) Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: `63caf367e1` ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp") Fixes: `af4de36d91` ("[SCSI] zfcp: Block scsi_eh thread for rport state BLOCKED") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:11 +02:00
Steffen Maier	9779f499d8	scsi: zfcp: fix missing SCSI trace for result of eh_host_reset_handler commit `df30781699` upstream. For problem determination we need to see whether and why we were successful or not. This allows deduction of scsi_eh escalation. Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : SCSI Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 Tag : schrh_r SCSI host reset handler result Request ID : 0x0000000000000000 none (invalid) SCSI ID : 0xffffffff none (invalid) SCSI LUN : 0xffffffff none (invalid) SCSI LUN high : 0xffffffff none (invalid) SCSI result : 0x00002002 field re-used for midlayer value: SUCCESS or in other cases: 0x2009 == FAST_IO_FAIL SCSI retries : 0xff none (invalid) SCSI allowed : 0xff none (invalid) SCSI scribble : 0xffffffffffffffff none (invalid) SCSI opcode : ffffffff ffffffff ffffffff ffffffff none (invalid) FCP rsp inf cod: 0xff none (invalid) FCP rsp IU : 00000000 00000000 00000000 00000000 none (invalid) 00000000 00000000 v2.6.35 commit `a1dbfddd02` ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh") introduced the first return with something other than the previously hardcoded single SUCCESS return path. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: `a1dbfddd02` ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:11 +02:00
Himanshu Madhani	f0c543159a	scsi: qla2xxx: Fix setting lower transfer speed if GPSC fails commit `413c2f3348` upstream. This patch prevents the driver from setting a lower default speed of 1 GB/sec if the switch does not support the Get Port Speed Capabilities (GPSC) command. Setting this default speed results in much lower write performance for large sequential WRITEs. This patch modifies the driver to check the gpsc_supported flags and prevents it from issuing MBC_SET_PORT_PARAM (001Ah) to set the default speed of 1 GB/sec. If the driver does not send this mailbox command, the firmware assumes the maximum supported link speed and will operate at the max speed. Cc: stable@vger.kernel.org Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reported-by: Eda Zhou <ezhou@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Tested-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:11 +02:00
Martin Kelly	0400b066ea	iio:buffer: make length types match kfifo types commit `c043ec1ca5` upstream. Currently, we use int for buffer length and bytes_per_datum. However, kfifo uses unsigned int for length and size_t for element size. We need to make sure these match or we will have bugs related to overflow (in the range between INT_MAX and UINT_MAX for length, for example). In addition, set_bytes_per_datum uses size_t while bytes_per_datum is an int, which would cause bugs for large values of bytes_per_datum. Change buffer length to use unsigned int and bytes_per_datum to use size_t. Signed-off-by: Martin Kelly <mkelly@xevo.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [bwh: Backported to 4.9: - Drop change to iio_dma_buffer_set_length() - Adjust filename, context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:11 +02:00
Liu Bo	3fd6a73da1	Btrfs: fix unexpected cow in run_delalloc_nocow commit `5811375325` upstream. Fstests generic/475 provides a way to fail metadata reads while checking if checksum exists for the inode inside run_delalloc_nocow(), and csum_exist_in_range() interprets error (-EIO) as inode having checksum and makes its caller enter the cow path. In case of free space inode, this ends up with a warning in cow_file_range(). The same problem applies to btrfs_cross_ref_exist() since it may also read metadata in between. With this, run_delalloc_nocow() bails out when errors occur at the two places. cc: <stable@vger.kernel.org> v2.6.28+ Fixes: `17d217fe97` ("Btrfs: fix nodatasum handling in balancing code") Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:11 +02:00
Filipe Manana	77c82917d5	Btrfs: fix return value on rename exchange failure commit `c5b4a50b74` upstream. If we failed during a rename exchange operation after starting/joining a transaction, we would end up replacing the return value, stored in the local 'ret' variable, with the return value from btrfs_end_transaction(). So this could end up returning 0 (success) to user space despite the operation having failed and aborted the transaction, because if there are multiple tasks having a reference on the transaction at the time btrfs_end_transaction() is called by the rename exchange, that function returns 0 (otherwise it returns -EIO and not the original error value). So fix this by not overwriting the return value on error after getting a transaction handle. Fixes: `cdd1fedf82` ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT") CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:11 +02:00
Maciej S. Szmigiero	41b1d57a67	X.509: unpack RSA signatureValue field from BIT STRING commit `b65c32ec5a` upstream. The signatureValue field of an X.509 certificate is encoded as a BIT STRING. For RSA signatures this BIT STRING is of so-called primitive subtype, which contains a u8 prefix indicating a count of unused bits in the encoding. We have to strip this prefix from signature data, just as we already do for key data in x509_extract_key_data() function. This wasn't noticed earlier because this prefix byte is zero for RSA key sizes divisible by 8. Since BIT STRING is a big-endian encoding, adding zero prefixes has no bearing on its value. The signature length, however, was incorrect, which is a problem for RSA implementations that need it to be exactly correct (like AMD CCP). Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Fixes: `c26fd69fa0` ("X.509: Add a crypto key parser for binary (DER) X.509 certificates") Cc: stable@vger.kernel.org Signed-off-by: James Morris <james.morris@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:11 +02:00
Geert Uytterhoeven	8fd86587ea	time: Make sure jiffies_to_msecs() preserves non-zero time periods commit `abcbcb80cd` upstream. For the common cases where 1000 is a multiple of HZ, or HZ is a multiple of 1000, jiffies_to_msecs() never returns zero when passed a non-zero time period. However, if HZ > 1000 and not an integer multiple of 1000 (e.g. 1024 or 1200, as used on alpha and DECstation), jiffies_to_msecs() may return zero for small non-zero time periods. This may break code that relies on receiving back a non-zero value. jiffies_to_usecs() does not need such a fix: one jiffy can only be less than one µs if HZ > 1000000, and such large values of HZ are already rejected at build time, twice: - include/linux/jiffies.h does #error if HZ >= 12288, - kernel/time/time.c has BUILD_BUG_ON(HZ > USEC_PER_SEC). Broken since forever. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Stephen Boyd <sboyd@kernel.org> Cc: linux-alpha@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180622143357.7495-1-geert@linux-m68k.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:11 +02:00
Huacai Chen	344d6159fe	MIPS: io: Add barrier after register read in inX() commit `18f3e95b90` upstream. While a barrier is present in the outX() functions before the register write, a similar barrier is missing in the inX() functions after the register read. This could allow memory accesses following inX() to observe stale data. This patch is very similar to commit `a1cc7034e3` ("MIPS: io: Add barrier after register read in readX()"). Because war_io_reorder_wmb() is used by both writeX() and outX(), if readX() needs a barrier then so does inX(). Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhc@lemote.com> Patchwork: https://patchwork.linux-mips.org/patch/19516/ Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: James Hogan <james.hogan@mips.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: Huacai Chen <chenhuacai@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:11 +02:00
Tetsuo Handa	db2baeef79	printk: fix possible reuse of va_list variable commit `988a35f8da` upstream. I noticed that there is a possibility that printk_safe_log_store() causes kernel oops because "args" parameter is passed to vsnprintf() again when atomic_cmpxchg() detected that we raced. Fix this by using va_copy(). Link: http://lkml.kernel.org/r/201805112002.GIF21216.OFVHFOMLJtQFSO@I-love.SAKURA.ne.jp Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: dvyukov@google.com Cc: syzkaller@googlegroups.com Cc: fengguang.wu@intel.com Cc: linux-kernel@vger.kernel.org Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: `42a0bb3f71` ("printk/nmi: generic solution for safe printk in NMI") Cc: 4.7+ <stable@vger.kernel.org> # v4.7+ Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:10 +02:00
Mika Westerberg	ca558fb836	PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume commit `13c65840fe` upstream. After a suspend/resume cycle the Presence Detect or Data Link Layer Status Changed bits might be set. If we don't clear them those events will not fire anymore and nothing happens for instance when a device is now hot-unplugged. Fix this by clearing those bits in a newly introduced function pcie_reenable_notification(). This should be fine because immediately after, we check if the adapter is still present by reading directly from the status register. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:10 +02:00
Mika Westerberg	0d3d58337d	PCI: Add ACS quirk for Intel 300 series commit `f154a718e6` upstream. Intel 300 series chipset still has the same ACS issue as the previous generations so extend the ACS quirk to cover it as well. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:10 +02:00
Alex Williamson	5e1deade60	PCI: Add ACS quirk for Intel 7th & 8th Gen mobile commit `e8440f4bfe` upstream. The specification update indicates these have the same errata for implementing non-standard ACS capabilities. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:10 +02:00
Tokunori Ikegami	83f9549d65	MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum commit `2a027b47db` upstream. The erratum and workaround are described by BCM5300X-ES300-RDS.pdf as below. R10: PCIe Transactions Periodically Fail Description: The BCM5300X PCIe does not maintain transaction ordering. This may cause PCIe transaction failure. Fix Comment: Add a dummy PCIe configuration read after a PCIe configuration write to ensure PCIe configuration access ordering. Set ES bit of CP0 configu7 register to enable sync function so that the sync instruction is functional. Resolution: hndpci.c: extpci_write_config() hndmips.c: si_mips_init() mipsinc.h CONF7_ES This is fixed by the CFE MIPS bcmsi chipset driver also for BCM47XX. Also the dummy PCIe configuration read is already implemented in the Linux BCMA driver. Enable ExternalSync in Config7 when CONFIG_BCMA_DRIVER_PCI_HOSTMODE=y too so that the sync instruction is externalised. Signed-off-by: Tokunori Ikegami <ikegami@allied-telesis.co.jp> Reviewed-by: Paul Burton <paul.burton@mips.com> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Cc: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: Rafał Miłecki <zajec5@gmail.com> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/19461/ Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:10 +02:00
Joakim Tjernlund	5fdb3c468b	mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking. commit `f1ce87f608` upstream. cfi_ppb_unlock() walks all flash chips when unlocking sectors; avoid walking chips unaffected by the unlock operation. Fixes: `1648eaaa15` ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking") Cc: stable@vger.kernel.org Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:10 +02:00
Joakim Tjernlund	b4e24c2842	mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boundary commit `0cd8116f17` upstream. The "sector is in requested range" test used to determine whether sectors should be re-locked or not is done on a variable that is reset every time we cross a chip boundary, which can lead to some blocks being re-locked while the caller expects them to be unlocked. Fix the check to make sure this cannot happen. Fixes: `1648eaaa15` ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking") Cc: stable@vger.kernel.org Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:10 +02:00
Joakim Tjernlund	0bf4e48c20	mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips commit `5fdfc3dbad` upstream. cfi_ppb_unlock() tries to relock all sectors that were locked before unlocking the whole chip. This locking used the chip start address + the FULL offset from the first flash chip, thereby forming an illegal address. Fix that by using the chip offset(adr). Fixes: `1648eaaa15` ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking") Cc: stable@vger.kernel.org Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:10 +02:00
Joakim Tjernlund	552eacd58e	mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock() commit `f93aa8c4de` upstream. do_ppb_xxlock() fails to add chip->start when querying for lock status (and chip_ready test), which caused false status reports. Fix that by adding adr += chip->start and adjust call sites accordingly. Fixes: `1648eaaa15` ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking") Cc: stable@vger.kernel.org Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:09 +02:00
Tokunori Ikegami	e9dc5dce09	mtd: cfi_cmdset_0002: Change write buffer to check correct value commit `dfeae10735` upstream. For the word write, the code checks that the chip has the correct value. For the write buffer, however, it only checks that the chip is ready. Change the write buffer path to check the value as well. Checking only the last written word is enough, since the data sheets describe this as the way to check the operation status. Signed-off-by: Tokunori Ikegami <ikegami@allied-telesis.co.jp> Reviewed-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Cc: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: Brian Norris <computersforpeace@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr> Cc: linux-mtd@lists.infradead.org Cc: stable@vger.kernel.org Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:09 +02:00
Leon Romanovsky	afe249e3e3	RDMA/mlx4: Discard unknown SQP work requests commit `6b1ca7ece1` upstream. There is no need to crash the machine if an unknown work request was received in SQP MAD. Cc: <stable@vger.kernel.org> # 3.6 Fixes: `37bfc7c1e8` ("IB/mlx4: SR-IOV multiplex and demultiplex MADs") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:09 +02:00
Max Gurtovoy	52e167187b	IB/isert: fix T10-pi check mask setting commit `0e12af84cd` upstream. A copy/paste bug (probably) caused setting of an app_tag check mask in case where a ref_tag check was needed. Fixes: `38a2d0d429` ("IB/isert: convert to the generic RDMA READ/WRITE API") Fixes: `9e961ae73c` ("IB/isert: Support T10-PI protected transactions") Cc: stable@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:09 +02:00
Alex Estrin	a664281b85	IB/isert: Fix for lib/dma_debug check_sync warning commit `763b69654b` upstream. The following error message occurs on a target host in a debug build during session login: [ 3524.411874] WARNING: CPU: 5 PID: 12063 at lib/dma-debug.c:1207 check_sync+0x4ec/0x5b0 [ 3524.421057] infiniband hfi1_0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x0000000000000000] [size=76 bytes] ......snip ..... [ 3524.535846] CPU: 5 PID: 12063 Comm: iscsi_np Kdump: loaded Not tainted 3.10.0-862.el7.x86_64.debug #1 [ 3524.546764] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.2.6 06/08/2015 [ 3524.555740] Call Trace: [ 3524.559102] [<ffffffffa5fe915b>] dump_stack+0x19/0x1b [ 3524.565477] [<ffffffffa58a2f58>] __warn+0xd8/0x100 [ 3524.571557] [<ffffffffa58a2fdf>] warn_slowpath_fmt+0x5f/0x80 [ 3524.578610] [<ffffffffa5bf5b8c>] check_sync+0x4ec/0x5b0 [ 3524.585177] [<ffffffffa58efc3f>] ? set_cpus_allowed_ptr+0x5f/0x1c0 [ 3524.592812] [<ffffffffa5bf5cd0>] debug_dma_sync_single_for_cpu+0x80/0x90 [ 3524.601029] [<ffffffffa586add3>] ? x2apic_send_IPI_mask+0x13/0x20 [ 3524.608574] [<ffffffffa585ee1b>] ? native_smp_send_reschedule+0x5b/0x80 [ 3524.616699] [<ffffffffa58e9b76>] ? resched_curr+0xf6/0x140 [ 3524.623567] [<ffffffffc0879af0>] isert_create_send_desc.isra.26+0xe0/0x110 [ib_isert] [ 3524.633060] [<ffffffffc087af95>] isert_put_login_tx+0x55/0x8b0 [ib_isert] [ 3524.641383] [<ffffffffa58ef114>] ? try_to_wake_up+0x1a4/0x430 [ 3524.648561] [<ffffffffc098cfed>] iscsi_target_do_tx_login_io+0xdd/0x230 [iscsi_target_mod] [ 3524.658557] [<ffffffffc098d827>] iscsi_target_do_login+0x1a7/0x600 [iscsi_target_mod] [ 3524.668084] [<ffffffffa59f9bc9>] ? kstrdup+0x49/0x60 [ 3524.674420] [<ffffffffc098e976>] iscsi_target_start_negotiation+0x56/0xc0 [iscsi_target_mod] [ 3524.684656] [<ffffffffc098c2ee>] __iscsi_target_login_thread+0x90e/0x1070 [iscsi_target_mod] [ 3524.694901] [<ffffffffc098ca50>] ? __iscsi_target_login_thread+0x1070/0x1070 [iscsi_target_mod] [ 3524.705446] [<ffffffffc098ca50>] ? __iscsi_target_login_thread+0x1070/0x1070 [iscsi_target_mod] [ 3524.715976] [<ffffffffc098ca78>] iscsi_target_login_thread+0x28/0x60 [iscsi_target_mod] [ 3524.725739] [<ffffffffa58d60ff>] kthread+0xef/0x100 [ 3524.732007] [<ffffffffa58d6010>] ? insert_kthread_work+0x80/0x80 [ 3524.739540] [<ffffffffa5fff1b7>] ret_from_fork_nospec_begin+0x21/0x21 [ 3524.747558] [<ffffffffa58d6010>] ? insert_kthread_work+0x80/0x80 [ 3524.755088] ---[ end trace 23f8bf9238bd1ed8 ]--- [ 3595.510822] iSCSI/iqn.1994-05.com.redhat:537fa56299: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION. The code calls dma_sync on login_tx_desc->dma_addr prior to initializing it with dma-mapped address. login_tx_desc is a part of iser_conn structure and is used only once during login negotiation, so the issue is fixed by eliminating dma_sync call for this buffer using a special case routine. Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:09 +02:00
Erez Shitrit	e355402cf1	IB/mlx5: Fetch soft WQE's on fatal error state commit `7b74a83cf5` upstream. On fatal error the driver simulates CQE's for ULPs that rely on completion of all their posted work requests. For the GSI traffic, the mlx5 has its own mechanism that sends the completions via software CQE's directly to the relevant CQ. This should be kept in fatal error too, so the driver should simulate such CQE's with the specified error state in order to complete GSI QP work requests. Without the fix the following deadlock might appear: schedule_timeout+0x274/0x350 wait_for_common+0xec/0x240 mcast_remove_one+0xd0/0x120 [ib_core] ib_unregister_device+0x12c/0x230 [ib_core] mlx5_ib_remove+0xc4/0x270 [mlx5_ib] mlx5_detach_device+0x184/0x1a0 [mlx5_core] mlx5_unload_one+0x308/0x340 [mlx5_core] mlx5_pci_err_detected+0x74/0xe0 [mlx5_core] Cc: <stable@vger.kernel.org> # 4.7 Fixes: `89ea94a7b6` ("IB/mlx5: Reset flow support for IB kernel ULPs") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:09 +02:00
Alex Estrin	9cac0a08e4	IB/{hfi1, qib}: Add handling of kernel restart commit `8d3e71136a` upstream. A warm restart will fail to unload the driver, leaving link state potentially flapping up to the point the BIOS resets the adapter. Correct the issue by hooking the shutdown pci method, which will bring port down. Cc: <stable@vger.kernel.org> # 4.9.x Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:09 +02:00
Mike Marciniszyn	9321e83034	IB/qib: Fix DMA api warning with debug kernel commit `0252f73334` upstream. The following error occurs in a debug build when running MPI PSM: [ 307.415911] WARNING: CPU: 4 PID: 23867 at lib/dma-debug.c:1158 check_unmap+0x4ee/0xa20 [ 307.455661] ib_qib 0000:05:00.0: DMA-API: device driver failed to check map error[device address=0x00000000df82b000] [size=4096 bytes] [mapped as page] [ 307.517494] Modules linked in: [ 307.531584] ib_isert iscsi_target_mod ib_srpt target_core_mod rpcrdma sunrpc ib_srp scsi_transport_srp scsi_tgt ib_iser libiscsi ib_ipoib scsi_transport_iscsi rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_qib intel_powerclamp coretemp rdmavt intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel ipmi_ssif ib_core aesni_intel sg ipmi_si lrw gf128mul dca glue_helper ipmi_devintf iTCO_wdt gpio_ich hpwdt iTCO_vendor_support ablk_helper hpilo acpi_power_meter cryptd ipmi_msghandler ie31200_edac shpchp pcc_cpufreq lpc_ich pcspkr ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci crct10dif_pclmul crct10dif_common drm crc32c_intel libahci tg3 libata serio_raw ptp i2c_core [ 307.846113] pps_core dm_mirror dm_region_hash dm_log dm_mod [ 307.866505] CPU: 4 PID: 23867 Comm: mpitests-IMB-MP Kdump: loaded Not tainted 3.10.0-862.el7.x86_64.debug #1 [ 307.911178] Hardware name: HP ProLiant DL320e Gen8, BIOS J05 11/09/2013 [ 307.944206] Call Trace: [ 307.956973] [<ffffffffbd9e915b>] dump_stack+0x19/0x1b [ 307.982201] [<ffffffffbd2a2f58>] __warn+0xd8/0x100 [ 308.005999] [<ffffffffbd2a2fdf>] warn_slowpath_fmt+0x5f/0x80 [ 308.034260] [<ffffffffbd5f667e>] check_unmap+0x4ee/0xa20 [ 308.060801] [<ffffffffbd41acaa>] ? page_add_file_rmap+0x2a/0x1d0 [ 308.090689] [<ffffffffbd5f6c4d>] debug_dma_unmap_page+0x9d/0xb0 [ 308.120155] [<ffffffffbd4082e0>] ? 
might_fault+0xa0/0xb0 [ 308.146656] [<ffffffffc07761a5>] qib_tid_free.isra.14+0x215/0x2a0 [ib_qib] [ 308.180739] [<ffffffffc0776bf4>] qib_write+0x894/0x1280 [ib_qib] [ 308.210733] [<ffffffffbd540b00>] ? __inode_security_revalidate+0x70/0x80 [ 308.244837] [<ffffffffbd53c2b7>] ? security_file_permission+0x27/0xb0 [ 308.266025] qib_ib0.8006: multicast join failed for ff12:401b:8006:0000:0000:0000:ffff:ffff, status -22 [ 308.323421] [<ffffffffbd46f5d3>] vfs_write+0xc3/0x1f0 [ 308.347077] [<ffffffffbd492a5c>] ? fget_light+0xfc/0x510 [ 308.372533] [<ffffffffbd47045a>] SyS_write+0x8a/0x100 [ 308.396456] [<ffffffffbd9ff355>] system_call_fastpath+0x1c/0x21 The code calls a qib_map_page() which has never correctly tested for a mapping error. Fix by testing for pci_dma_mapping_error() in all cases and properly handling the failure in the caller. Additionally, streamline qib_map_page() arguments to satisfy just the single caller. Cc: <stable@vger.kernel.org> Reviewed-by: Alex Estrin <alex.estrin@intel.com> Tested-by: Don Dutile <ddutile@redhat.com> Reviewed-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:08 +02:00
Stefan M Schaeckeler	f92ec84c49	of: unittest: for strings, account for trailing \0 in property length field commit `3b9cf7905f` upstream. For strings, account for trailing \0 in property length field: This is consistent with how dtc builds string properties. Function __of_prop_dup() would misbehave on such properties as it duplicates properties based on the property length field creating new string values without trailing \0s. Signed-off-by: Stefan M Schaeckeler <sschaeck@cisco.com> Reviewed-by: Frank Rowand <frank.rowand@sony.com> Tested-by: Frank Rowand <frank.rowand@sony.com> Cc: <stable@vger.kernel.org> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:08 +02:00
Will Deacon	fb6786ce77	arm64: mm: Ensure writes to swapper are ordered wrt subsequent cache maintenance commit `71c8fc0c96` upstream. When rewriting swapper using nG mappings, we must perform cache maintenance around each page table access in order to avoid coherency problems with the host's cacheable alias under KVM. To ensure correct ordering of the maintenance with respect to Device memory accesses made with the Stage-1 MMU disabled, DMBs need to be added between the maintenance and the corresponding memory access. This patch adds a missing DMB between writing a new page table entry and performing a clean+invalidate on the same line. Fixes: `f992b4dfd5` ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings") Cc: <stable@vger.kernel.org> # 4.16.x- Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:08 +02:00
Will Deacon	12942d52f2	arm64: kpti: Use early_param for kpti= command-line option commit `b5b7dd647f` upstream. We inspect __kpti_forced early on as part of the cpufeature enable callback which remaps the swapper page table using non-global entries. Ensure that __kpti_forced has been updated to reflect the kpti= command-line option before we start using it. Fixes: `ea1e3de85e` ("arm64: entry: Add fake CPU feature for unmapping the kernel at EL0") Cc: <stable@vger.kernel.org> # 4.16.x- Reported-by: Wei Xu <xuwei5@hisilicon.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Wei Xu <xuwei5@hisilicon.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:08 +02:00
David Rivshin	8f27499338	ARM: 8764/1: kgdb: fix NUMREGBYTES so that gdb_regs[] is the correct size commit `76ed0b803a` upstream. NUMREGBYTES (which is used as the size for gdb_regs[]) is incorrectly based on DBG_MAX_REG_NUM instead of GDB_MAX_REGS. DBG_MAX_REG_NUM is the number of total registers, while GDB_MAX_REGS is the number of 'unsigned longs' it takes to serialize those registers. Since FP registers require 3 'unsigned longs' each, DBG_MAX_REG_NUM is smaller than GDB_MAX_REGS. This causes GDB 8.0 to give the following error on connect: "Truncated register 19 in remote 'g' packet" This also causes the register serialization/deserialization logic to overflow gdb_regs[], overwriting whatever follows. Fixes: `834b2964b7` ("kgdb,arm: fix register dump") Cc: <stable@vger.kernel.org> # 2.6.37+ Signed-off-by: David Rivshin <drivshin@allworx.com> Acked-by: Rabin Vincent <rabin@rab.in> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:08 +02:00
Mahesh Salgaonkar	81d6e715d1	powerpc/fadump: Unregister fadump on kexec down path. commit `722cde76d6` upstream. Unregister fadump on the kexec down path; otherwise the fadump registration in the new kexec-ed kernel complains that fadump is already registered. This makes the new kernel continue using the fadump registered by the previous kernel, which may lead to invalid vmcore generation. Hence this patch fixes this issue by unregistering fadump in fadump_cleanup(), which is called during the kexec path, so that the new kernel can register fadump with new valid values. Fixes: `b500afff11` ("fadump: Invalidate registration and release reserved memory for general use.") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:08 +02:00
Gautham R. Shenoy	443004a666	cpuidle: powernv: Fix promotion from snooze if next state disabled commit `0a4ec6aa03` upstream. The commit `78eaa10f02` ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state") introduced a timeout for the snooze idle state so that it could eventually be promoted to a deeper idle state. The snooze timeout value is static and set to the target residency of the next idle state, which would train the cpuidle governor to pick the next idle state eventually. The unfortunate side-effect of this is that if the next idle state(s) is disabled, the CPU will forever remain in snooze, despite the fact that the system is completely idle, and other deeper idle states are available. This patch fixes the issue by dynamically setting the snooze timeout to the target residency of the next enabled state on the device. Before Patch: POWER8 : Only nap disabled. $ cpupower monitor sleep 30 sleep took 30.01297 seconds and exited with status 0 \|Idle_Stats PKG \|CORE\|CPU \| snoo \| Nap \| Fast 0\| 8\| 0\| 96.41\| 0.00\| 0.00 0\| 8\| 1\| 96.43\| 0.00\| 0.00 0\| 8\| 2\| 96.47\| 0.00\| 0.00 0\| 8\| 3\| 96.35\| 0.00\| 0.00 0\| 8\| 4\| 96.37\| 0.00\| 0.00 0\| 8\| 5\| 96.37\| 0.00\| 0.00 0\| 8\| 6\| 96.47\| 0.00\| 0.00 0\| 8\| 7\| 96.47\| 0.00\| 0.00 POWER9: Shallow states (stop0lite, stop1lite, stop2lite, stop0, stop1, stop2) disabled: $ cpupower monitor sleep 30 sleep took 30.05033 seconds and exited with status 0 \|Idle_Stats PKG \|CORE\|CPU \| snoo \| stop \| stop \| stop \| stop \| stop \| stop \| stop \| stop 0\| 16\| 0\| 89.79\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00 0\| 16\| 1\| 90.12\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00 0\| 16\| 2\| 90.21\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00 0\| 16\| 3\| 90.29\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00 After Patch: POWER8 : Only nap disabled. 
$ cpupower monitor sleep 30 sleep took 30.01200 seconds and exited with status 0 \|Idle_Stats PKG \|CORE\|CPU \| snoo \| Nap \| Fast 0\| 8\| 0\| 16.58\| 0.00\| 77.21 0\| 8\| 1\| 18.42\| 0.00\| 75.38 0\| 8\| 2\| 4.70\| 0.00\| 94.09 0\| 8\| 3\| 17.06\| 0.00\| 81.73 0\| 8\| 4\| 3.06\| 0.00\| 95.73 0\| 8\| 5\| 7.00\| 0.00\| 96.80 0\| 8\| 6\| 1.00\| 0.00\| 98.79 0\| 8\| 7\| 5.62\| 0.00\| 94.17 POWER9: Shallow states (stop0lite, stop1lite, stop2lite, stop0, stop1, stop2) disabled: $ cpupower monitor sleep 30 sleep took 30.02110 seconds and exited with status 0 \|Idle_Stats PKG \|CORE\|CPU \| snoo \| stop \| stop \| stop \| stop \| stop \| stop \| stop \| stop 0\| 0\| 0\| 0.69\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 9.39\| 89.70 0\| 0\| 1\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.05\| 93.21 0\| 0\| 2\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 89.93 0\| 0\| 3\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 0.00\| 93.26 Fixes: `78eaa10f02` ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:08 +02:00
Alexey Kardashevskiy	f9b25660d6	powerpc/powernv/ioda2: Remove redundant free of TCE pages commit `98fd72fe82` upstream. When IODA2 creates a PE, it creates an IOMMU table with it_ops::free set to pnv_ioda2_table_free() which calls pnv_pci_ioda2_table_free_pages(). Since iommu_tce_table_put() calls it_ops::free when the last reference to the table is released, an explicit call to pnv_pci_ioda2_table_free_pages() is not needed so let's remove it. This should fix a double free in the case of PCI hotplug as pnv_pci_ioda2_table_free_pages() resets neither iommu_table::it_base nor ::it_size. This was not exposed by SRIOV as it uses a different code path via pnv_pcibios_sriov_disable(). IODA1 does not initialize it_ops::free so it does not have this issue. Fixes: `c5f7700bbd` ("powerpc/powernv: Dynamically release PE") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:08 +02:00
Michael Neuling	90f88f05d8	powerpc/ptrace: Fix enforcement of DAWR constraints commit `cd6ef7eebf` upstream. Back when we first introduced the DAWR, in commit `4ae7ebe952` ("powerpc: Change hardware breakpoint to allow longer ranges"), we screwed up the constraint making it a 1024 byte boundary rather than a 512. This makes the check overly permissive. Fortunately GDB is the only real user and it always did the right thing, so we never noticed. This fixes the constraint to 512 bytes. Fixes: `4ae7ebe952` ("powerpc: Change hardware breakpoint to allow longer ranges") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:08 +02:00
Michael Neuling	5ea3b9bddf	powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG commit `4f7c06e26e` upstream. In commit `e2a800beac` ("powerpc/hw_brk: Fix off by one error when validating DAWR region end") we fixed setting the DAWR end point to its max value via PPC_PTRACE_SETHWDEBUG. Unfortunately we broke PTRACE_SET_DEBUGREG when setting a 512 byte aligned breakpoint. PTRACE_SET_DEBUGREG currently sets the length of the breakpoint to zero (memset() in hw_breakpoint_init()). This worked with arch_validate_hwbkpt_settings() before the above patch was applied but is now broken if the breakpoint is 512byte aligned. This sets the length of the breakpoint to 8 bytes when using PTRACE_SET_DEBUGREG. Fixes: `e2a800beac` ("powerpc/hw_brk: Fix off by one error when validating DAWR region end") Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:07 +02:00
Aneesh Kumar K.V	10e46042f2	powerpc/mm/hash: Add missing isync prior to kernel stack SLB switch commit `91d0697188` upstream. Currently we do not have an isync, or any other context synchronizing instruction prior to the slbie/slbmte in _switch() that updates the SLB entry for the kernel stack. However that is not correct as outlined in the ISA. From Power ISA Version 3.0B, Book III, Chapter 11, page 1133: "Changing the contents of ... the contents of SLB entries ... can have the side effect of altering the context in which data addresses and instruction addresses are interpreted, and in which instructions are executed and data accesses are performed. ... These side effects need not occur in program order, and therefore may require explicit synchronization by software. ... The synchronizing instruction before the context-altering instruction ensures that all instructions up to and including that synchronizing instruction are fetched and executed in the context that existed before the alteration." And page 1136: "For data accesses, the context synchronizing instruction before the slbie, slbieg, slbia, slbmte, tlbie, or tlbiel instruction ensures that all preceding instructions that access data storage have completed to a point at which they have reported all exceptions they will cause." We're not aware of any bugs caused by this, but it should be fixed regardless. Add the missing isync when updating kernel stack SLB entry. Cc: stable@vger.kernel.org Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [mpe: Flesh out change log with more ISA text & explanation] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:07 +02:00
Miklos Szeredi	12715f3ef1	fuse: fix control dir setup and teardown commit `6becdb601b` upstream. syzbot is reporting NULL pointer dereference at fuse_ctl_remove_conn() [1]. Since fc->ctl_ndents is incremented by fuse_ctl_add_conn() when new_inode() failed, fuse_ctl_remove_conn() reaches an inode-less dentry and tries to clear d_inode(dentry)->i_private field. Fix by only adding the dentry to the array after being fully set up. When tearing down the control directory, do d_invalidate() on it to get rid of any mounts that might have been added. [1] https://syzkaller.appspot.com/bug?id=f396d863067238959c91c0b7cfc10b163638cac6 Reported-by: syzbot <syzbot+32c236387d66c4516827@syzkaller.appspotmail.com> Fixes: `bafa96541b` ("[PATCH] fuse: add control filesystem") Cc: <stable@vger.kernel.org> # v2.6.18 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:07 +02:00
Tetsuo Handa	a0fbcaf999	fuse: don't keep dead fuse_conn at fuse_fill_super(). commit `543b8f8662` upstream. syzbot is reporting use-after-free at fuse_kill_sb_blk() [1]. Since sb->s_fs_info field is not cleared after fc was released by fuse_conn_put() when initialization failed, fuse_kill_sb_blk() finds already released fc and tries to hold the lock. Fix this by clearing sb->s_fs_info field after calling fuse_conn_put(). [1] https://syzkaller.appspot.com/bug?id=a07a680ed0a9290585ca424546860464dd9658db Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+ec3986119086fe4eec97@syzkaller.appspotmail.com> Fixes: `3b463ae0c6` ("fuse: invalidation reverse calls") Cc: John Muir <john@jmuir.com> Cc: Csaba Henk <csaba@gluster.com> Cc: Anand Avati <avati@redhat.com> Cc: <stable@vger.kernel.org> # v2.6.31 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:07 +02:00
Miklos Szeredi	ebdc37febe	fuse: atomic_o_trunc should truncate pagecache commit `df0e91d488` upstream. Fuse has an "atomic_o_trunc" mode, where userspace filesystem uses the O_TRUNC flag in the OPEN request to truncate the file atomically with the open. In this mode there's no need to send a SETATTR request to userspace after the open, so fuse_do_setattr() checks this mode and returns. But this misses the important step of truncating the pagecache. Add the missing parts of truncation to the ATTR_OPEN branch. Reported-by: Chad Austin <chadaustin@fb.com> Fixes: `6ff958edbf` ("fuse: add atomic open+truncate support") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:07 +02:00
Amit Pundir	f1e9a633e6	Bluetooth: hci_qca: Avoid missing rampatch failure with userspace fw loader commit `7dc5fe0814` upstream. AOSP uses a userspace firmware loader to load firmware, which will return -EAGAIN in case qca/rampatch_00440302.bin is not found. Since there is no rampatch for the dragonboard820c QCA controller revision, just make it work as is. CC: Loic Poulain <loic.poulain@linaro.org> CC: Nicolas Dechesne <nicolas.dechesne@linaro.org> CC: Marcel Holtmann <marcel@holtmann.org> CC: Johan Hedberg <johan.hedberg@gmail.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:07 +02:00
Corey Minyard	d11ec041b2	ipmi:bt: Set the timeout before doing a capabilities check commit `fe50a7d039` upstream. There was one place where the timeout value for an operation was not being set, if a capabilities request was done from idle. Move the timeout value setting to before where that change might be requested. IMHO the cause here is the invisible returns in the macros. Maybe that's a job for later, though. Reported-by: Nordmark Claes <Claes.Nordmark@tieto.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:07 +02:00
Mikulas Patocka	3e4fab744b	branch-check: fix long->int truncation when profiling branches commit `2026d35741` upstream. The function __builtin_expect returns long type (see the gcc documentation), and so do macros likely and unlikely. Unfortunately, when CONFIG_PROFILE_ANNOTATED_BRANCHES is selected, the macros likely and unlikely expand to __branch_check__ and __branch_check__ truncates the long type to int. This unintended truncation may cause bugs in various kernel code (we found a bug in dm-writecache because of it), so it's better to fix __branch_check__ to return long. Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1805300818140.24812@file01.intranet.prod.int.rdu2.redhat.com Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: `1f0d69a9fc` ("tracing: profile likely and unlikely annotations") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:07 +02:00
Matthias Schiffer	95f8713422	mips: ftrace: fix static function graph tracing commit `6fb8656646` upstream. ftrace_graph_caller was never run after calling ftrace_trace_function, breaking the function graph tracer. Fix this, bringing it in line with the x86 implementation. While we're at it, also streamline the control flow of _mcount a bit to reduce the number of branches. This issue was reported before: https://www.linux-mips.org/archives/linux-mips/2014-11/msg00295.html Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Tested-by: Matt Redfearn <matt.redfearn@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/18929/ Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: stable@vger.kernel.org # v3.17+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:07 +02:00
Geert Uytterhoeven	ec7bea37c8	lib/vsprintf: Remove atomic-unsafe support for %pCr commit `666902e42f` upstream. "%pCr" formats the current rate of a clock, and calls clk_get_rate(). The latter obtains a mutex, hence it must not be called from atomic context. Remove support for this rarely-used format, as vsprintf() (and e.g. printk()) must be callable from any context. Any remaining out-of-tree users will start seeing the clock's name printed instead of its rate. Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com> Fixes: `900cca2944` ("lib/vsprintf: add %pC{,n,r} format specifiers for clocks") Link: http://lkml.kernel.org/r/1527845302-12159-5-git-send-email-geert+renesas@glider.be To: Jia-Ju Bai <baijiaju1990@gmail.com> To: Jonathan Corbet <corbet@lwn.net> To: Michael Turquette <mturquette@baylibre.com> To: Stephen Boyd <sboyd@kernel.org> To: Zhang Rui <rui.zhang@intel.com> To: Eduardo Valentin <edubezval@gmail.com> To: Eric Anholt <eric@anholt.net> To: Stefan Wahren <stefan.wahren@i2se.com> To: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-doc@vger.kernel.org Cc: linux-clk@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-serial@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-renesas-soc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: stable@vger.kernel.org # 4.1+ Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:06 +02:00
Geert Uytterhoeven	676b002f26	clk: renesas: cpg-mssr: Stop using printk format %pCr commit `ef4b0be626` upstream. Printk format "%pCr" will be removed soon, as clk_get_rate() must not be called in atomic context. Replace it by open-coding the operation. This is safe here, as the code runs in task context. Link: http://lkml.kernel.org/r/1527845302-12159-2-git-send-email-geert+renesas@glider.be To: Jia-Ju Bai <baijiaju1990@gmail.com> To: Jonathan Corbet <corbet@lwn.net> To: Michael Turquette <mturquette@baylibre.com> To: Stephen Boyd <sboyd@kernel.org> To: Zhang Rui <rui.zhang@intel.com> To: Eduardo Valentin <edubezval@gmail.com> To: Eric Anholt <eric@anholt.net> To: Stefan Wahren <stefan.wahren@i2se.com> To: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-doc@vger.kernel.org Cc: linux-clk@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-serial@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-renesas-soc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: stable@vger.kernel.org # 4.5+ Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:06 +02:00
Alexander Sverdlin	a879f6c232	ASoC: cirrus: i2s: Fix {TX\|RX}LinCtrlData setup commit `5d302ed3cc` upstream. According to "EP93xx User’s Guide", I2STXLinCtrlData and I2SRXLinCtrlData registers actually have different format. The only currently used bit (Left_Right_Justify) has different position. Fix this and simplify the whole setup taking into account the fact that both registers have zero default value. The practical effect of the above is repaired SND_SOC_DAIFMT_RIGHT_J support (currently unused). Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:06 +02:00
Alexander Sverdlin	d6aa7326e8	ASoC: cirrus: i2s: Fix LRCLK configuration commit `2d534113be` upstream. The bit responsible for LRCLK polarity is i2s_tlrs (0), not i2s_trel (2) (refer to "EP93xx User's Guide"). Previously card drivers which specified SND_SOC_DAIFMT_NB_IF actually got SND_SOC_DAIFMT_NB_NF, an adaptation is necessary to retain the old behavior. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:06 +02:00
Srinivas Kandagatla	1a1b2790f0	ASoC: dapm: delete dapm_kcontrol_data paths list before freeing it commit `ff2faf1289` upstream. dapm_kcontrol_data is freed as part of dapm_kcontrol_free(), leaving the paths pointer dangling in the list. This leads to system crash when we try to unload and reload sound card. I hit this bug during ADSP crash/reboot test case on Dragon board DB410c. Without this patch, on SLAB Poisoning enabled build, kernel crashes with "BUG kmalloc-128 (Tainted: G W ): Poison overwritten" Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:06 +02:00
Ingo Flaschberger	cf05568cb8	1wire: family module autoload fails because of upper/lower case mismatch. commit `065c09563c` upstream. 1wire family module autoload fails because of upper/lower case mismatch. Signed-off-by: Ingo Flaschberger <ingo.flaschberger@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:06 +02:00
Maxim Moseychuk	55365ad775	usb: do not reset if a low-speed or full-speed device timed out commit `6e01827ed9` upstream. Some low-speed and full-speed devices (for example, bluetooth) do not have time to initialize. For them, ETIMEDOUT is a valid error. We need to give them another try. Otherwise, they will never be initialized correctly and dmesg will show messages such as "Bluetooth: hci0 command 0x1002 tx timeout" or similar. Fixes: `264904ccc3` ("usb: retry reset if a device times out") Cc: stable <stable@vger.kernel.org> Signed-off-by: Maxim Moseychuk <franchesko.salias.hudro.pedros@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:06 +02:00
Eric W. Biederman	c82ccd7122	signal/xtensa: Consistenly use SIGBUS in do_unaligned_user commit `7de712ccc0` upstream. While working on changing this code to use force_sig_fault I discovered that do_unaligned_user sets si_signo to SIGBUS and passes SIGSEGV to force_sig_info. Which is just b0rked. The code is reporting a SIGBUS error so replace the SIGSEGV with SIGBUS. Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: linux-xtensa@linux-xtensa.org Cc: stable@vger.kernel.org Acked-by: Max Filippov <jcmvbkbc@gmail.com> Fixes: `5a0015d626` ("[PATCH] xtensa: Architecture support for Tensilica Xtensa Part 3") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:06 +02:00
Daniel Wagner	d9c202b269	serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version commit `8afb1d2c12` upstream. Commit `40f70c03e3` ("serial: sh-sci: add locking to console write function to avoid SMP lockup") copied the strategy to avoid locking problems in conjunction with the console from the UART8250 driver. Instead of using spin_{try}lock_irqsave() directly, local_irq_save() followed by spin_{try}lock() was used. While this is correct on mainline, for -rt it is a problem. spin_{try}lock() will check if it is running in a valid context. Since the local_irq_save() has already been executed, the context has changed and spin_{try}lock() will complain. The reason why spin_{try}lock() complains is that on -rt the spin locks are turned into mutexes and therefore can sleep. Sleeping with interrupts disabled is not valid. BUG: sleeping function called from invalid context at /home/wagi/work/rt/v4.4-cip-rt/kernel/locking/rtmutex.c:995 in_atomic(): 0, irqs_disabled(): 128, pid: 778, name: irq/76-eth0 CPU: 0 PID: 778 Comm: irq/76-eth0 Not tainted 4.4.126-test-cip22-rt14-00403-gcd03665c8318 #12 Hardware name: Generic RZ/G1 (Flattened Device Tree) Backtrace: [<c00140a0>] (dump_backtrace) from [<c001424c>] (show_stack+0x18/0x1c) r7:c06b01f0 r6:60010193 r5:00000000 r4:c06b01f0 [<c0014234>] (show_stack) from [<c01d3c94>] (dump_stack+0x78/0x94) [<c01d3c1c>] (dump_stack) from [<c004c134>] (___might_sleep+0x134/0x194) r7:60010113 r6:c06d3559 r5:00000000 r4:ffffe000 [<c004c000>] (___might_sleep) from [<c04ded60>] (rt_spin_lock+0x20/0x74) r5:c06f4d60 r4:c06f4d60 [<c04ded40>] (rt_spin_lock) from [<c02577e4>] (serial_console_write+0x100/0x118) r5:c06f4d60 r4:c06f4d60 [<c02576e4>] (serial_console_write) from [<c0061060>] (call_console_drivers.constprop.15+0x10c/0x124) r10:c06d2894 r9:c04e18b0 r8:00000028 r7:00000000 r6:c06d3559 r5:c06d2798 r4:c06b9914 r3:c02576e4 [<c0060f54>] (call_console_drivers.constprop.15) from [<c0062984>] (console_unlock+0x32c/0x430) r10:c06d30d8 
r9:00000028 r8:c06dd518 r7:00000005 r6:00000000 r5:c06d2798 r4:c06d2798 r3:00000028 [<c0062658>] (console_unlock) from [<c0062e1c>] (vprintk_emit+0x394/0x4f0) r10:c06d2798 r9:c06d30ee r8:00000006 r7:00000005 r6:c06a78fc r5:00000027 r4:00000003 [<c0062a88>] (vprintk_emit) from [<c0062fa0>] (vprintk+0x28/0x30) r10:c060bd46 r9:00001000 r8:c06b9a90 r7:c06b9a90 r6:c06b994c r5:c06b9a3c r4:c0062fa8 [<c0062f78>] (vprintk) from [<c0062fb8>] (vprintk_default+0x10/0x14) [<c0062fa8>] (vprintk_default) from [<c009cd30>] (printk+0x78/0x84) [<c009ccbc>] (printk) from [<c025afdc>] (credit_entropy_bits+0x17c/0x2cc) r3:00000001 r2:decade60 r1:c061a5ee r0:c061a523 r4:00000006 [<c025ae60>] (credit_entropy_bits) from [<c025bf74>] (add_interrupt_randomness+0x160/0x178) r10:466e7196 r9:1f536000 r8:fffeef74 r7:00000000 r6:c06b9a60 r5:c06b9a3c r4:dfbcf680 [<c025be14>] (add_interrupt_randomness) from [<c006536c>] (irq_thread+0x1e8/0x248) r10:c006537c r9:c06cdf21 r8:c0064fcc r7:df791c24 r6:df791c00 r5:ffffe000 r4:df525180 [<c0065184>] (irq_thread) from [<c003fba4>] (kthread+0x108/0x11c) r10:00000000 r9:00000000 r8:c0065184 r7:df791c00 r6:00000000 r5:df791d00 r4:decac000 [<c003fa9c>] (kthread) from [<c00101b8>] (ret_from_fork+0x14/0x3c) r8:00000000 r7:00000000 r6:00000000 r5:c003fa9c r4:df791d00 Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Daniel Wagner <daniel.wagner@siemens.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:06 +02:00
Michael Schmitz	5692dcf90e	m68k/mm: Adjust VM area to be unmapped by gap size for __iounmap() commit `3f90f9ef2d` upstream. If 020/030 support is enabled, get_io_area() leaves an IO_SIZE gap between mappings which is added to the vm_struct representing the mapping. __ioremap() uses the actual requested size (after alignment), while __iounmap() is passed the size from the vm_struct. On 020/030, early termination descriptors are used to set up mappings of extent 'size', which are validated on unmapping. The unmapped gap of size IO_SIZE defeats the sanity check of the pmd tables, causing __iounmap() to loop forever on 030. On 040/060, unmapping of page table entries does not check for a valid mapping, so the unmapping loop always completes there. Adjust size to be unmapped by the gap that had been added in the vm_struct prior. This fixes the hang in atari_platform_init() reported a long time ago, and a similar one reported by Finn recently (addressed by removing ioremap() use from the SWIM driver). Tested on my Falcon in 030 mode - untested but should work the same on 040/060 (the extra page tables cleared there would never have been set up anyway). Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> [geert: Minor commit description improvements] [geert: This was fixed in 2.4.23, but not in 2.5.x] Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:05 +02:00
Siarhei Liakh	7a68dcdc9d	x86: Call fixup_exception() before notify_die() in math_error() commit `3ae6295ccb` upstream. fpu__drop() has an explicit fwait which under some conditions can trigger a fixable FPU exception while in kernel. Thus, we should attempt to fixup the exception first, and only call notify_die() if the fixup failed just like in do_general_protection(). The original call sequence incorrectly triggers KDB entry on debug kernels under particular FPU-intensive workloads. Andy noted that this makes the whole conditional irq enable thing even more inconsistent, but fixing that is outside the scope of this. Signed-off-by: Siarhei Liakh <siarhei.liakh@concurrent-rt.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Borislav Petkov" <bpetkov@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/DM5PR11MB201156F1CAB2592B07C79A03B17D0@DM5PR11MB2011.namprd11.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:05 +02:00
Borislav Petkov	5a48f6084d	x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out() commit `1f74c8a647` upstream. mce_no_way_out() does a quick check during #MC to see whether some of the MCEs logged would require the kernel to panic immediately. And it passes a struct mce where MCi_STATUS gets written. However, after having saved a valid status value, the next iteration of the loop which goes over the MCA banks on the CPU, overwrites the valid status value because we're using struct mce as storage instead of a temporary variable. Which leads to MCE records with an empty status value: mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000 mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10} In order to prevent the loss of the status register value, return immediately when severity is a panic one so that we can panic immediately with the first fatal MCE logged. This is also the intention of this function and not to noodle over the banks while a fatal MCE is already logged. Tony: read the rest of the MCA bank to populate the struct mce fully. Suggested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:05 +02:00
Tony Luck	c267eaaceb	x86/mce: Fix incorrect "Machine check from unknown source" message commit `40c36e2741` upstream. Some injection testing resulted in the following console log: mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134 mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]} mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88 mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043 mce: [Hardware Error]: Run the above through 'mcelog --ascii' Kernel panic - not syncing: Machine check from unknown source This confused everybody because the first line quite clearly shows that we found a logged error in "Bank 1", while the last line says "unknown source". The problem is that the Linux code doesn't do the right thing for a local machine check that results in a fatal error. It turns out that we know very early in the handler whether the machine check is fatal. The call to mce_no_way_out() has checked all the banks for the CPU that took the local machine check. If it says we must crash, we can do so right away with the right messages. We do scan all the banks again. This means that we might initially not see a problem, but during the second scan find something fatal. If this happens we print a slightly different message (so I can see if it actually ever happens). [ bp: Remove unneeded severity assignment. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org # 4.2 Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:05 +02:00
Tony Luck	e7905a78ad	x86/mce: Check for alternate indication of machine check recovery on Skylake commit `4c5717da1d` upstream. Currently we just check the "CAPID0" register to see whether the CPU can recover from machine checks. But there are also some special SKUs which do not have all advanced RAS features, but do enable machine check recovery for use with NVDIMMs. Add a check for any of bits {8:5} in the "CAPID5" register (each reports some NVDIMM mode available, if any of them are set, then the system supports memory machine check recovery). Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: stable@vger.kernel.org # 4.9 Cc: Dan Williams <dan.j.williams@intel.com> Cc: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/03cbed6e99ddafb51c2eadf9a3b7c8d7a0cc204e.1527283897.git.tony.luck@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:05 +02:00
Tony Luck	b4eb80a751	x86/mce: Improve error message when kernel cannot recover commit `c7d606f560` upstream. Since we added support to add recovery from some errors inside the kernel in: commit `b2f9d678e2` ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries") we have done a less than stellar job at reporting the cause of recoverable machine checks that occur in other parts of the kernel. The user just gets the unhelpful message: mce: [Hardware Error]: Machine check: Action required: unknown MCACOD doubly unhelpful when they check the manual for the reported IA32_MSR_STATUS.MCACOD and see that it is listed as one of the standard recoverable values. Add an extra rule to the MCE severity table to catch this case and report it as: mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel Fixes: `b2f9d678e2` ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: stable@vger.kernel.org # 4.6+ Cc: Dan Williams <dan.j.williams@intel.com> Cc: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/4cc7c465150a9a48b8b9f45d0b840278e77eb9b5.1527283897.git.tony.luck@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:05 +02:00
Dan Williams	995cddcc33	x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec() commit `eab6870fee` upstream. Mark Rutland noticed that GCC optimization passes have the potential to elide necessary invocations of the array_index_mask_nospec() instruction sequence, so mark the asm() volatile. Mark explains: "The volatile will inhibit some cases where the compiler could lift the array_index_nospec() call out of a branch, e.g. where there are multiple invocations of array_index_nospec() with the same arguments: if (idx < foo) { idx1 = array_idx_nospec(idx, foo) do_something(idx1); } < some other code > if (idx < foo) { idx2 = array_idx_nospec(idx, foo); do_something_else(idx2); } ... since the compiler can determine that the two invocations yield the same result, and reuse the first result (likely the same register as idx was in originally) for the second branch, effectively re-writing the above as: if (idx < foo) { idx = array_idx_nospec(idx, foo); do_something(idx); } < some other code > if (idx < foo) { do_something_else(idx); } ... if we don't take the first branch, then speculatively take the second, we lose the nospec protection. There's more info on volatile asm in the GCC docs: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile " Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: `babdde2698` ("x86: Implement array_index_mask_nospec") Link: https://lkml.kernel.org/lkml/152838798950.14521.4893346294059739135.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-07-03 11:23:05 +02:00
Greg Kroah-Hartman	c806e08569	Linux 4.9.110	2018-06-26 08:08:09 +08:00
Thadeu Lima de Souza Cascardo	f3e7234932	fs/binfmt_misc.c: do not allow offset overflow commit `5cc41e0995` upstream. When registering a new binfmt_misc handler, it is possible to overflow the offset to get a negative value, which might crash the system, or possibly leak kernel data. Here is a crash log when 2500000000 was used as an offset: BUG: unable to handle kernel paging request at ffff989cfd6edca0 IP: load_misc_binary+0x22b/0x470 [binfmt_misc] PGD 1ef3e067 P4D 1ef3e067 PUD 0 Oops: 0000 [#1] SMP NOPTI Modules linked in: binfmt_misc kvm_intel ppdev kvm irqbypass joydev input_leds serio_raw mac_hid parport_pc qemu_fw_cfg parpy CPU: 0 PID: 2499 Comm: bash Not tainted 4.15.0-22-generic #24-Ubuntu Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 RIP: 0010:load_misc_binary+0x22b/0x470 [binfmt_misc] Call Trace: search_binary_handler+0x97/0x1d0 do_execveat_common.isra.34+0x667/0x810 SyS_execve+0x31/0x40 do_syscall_64+0x73/0x130 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Use kstrtoint instead of simple_strtoul. It will work as the code already set the delimiter byte to '\0' and we only do it when the field is not empty. Tested with offsets -1, 2500000000, UINT_MAX and INT_MAX. Also tested with examples documented at Documentation/admin-guide/binfmt-misc.rst and other registrations from packages on Ubuntu. Link: http://lkml.kernel.org/r/20180529135648.14254-1-cascardo@canonical.com Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:09 +08:00
Michael S. Tsirkin	9681c3bdb0	vhost: fix info leak due to uninitialized memory commit `670ae9caac` upstream. struct vhost_msg within struct vhost_msg_node is copied to userspace. Unfortunately it turns out on 64 bit systems vhost_msg has padding after type which gcc doesn't initialize, leaking 4 uninitialized bytes to userspace. This padding also unfortunately means 32 bit users of this interface are broken on a 64 bit kernel which will need to be fixed separately. Fixes: CVE-2018-1118 Cc: stable@vger.kernel.org Reported-by: Kevin Easton <kevin@guarana.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reported-by: syzbot+87cfa083e727a224754b@syzkaller.appspotmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Even Xu	a875bc1c9e	HID: intel_ish-hid: ipc: register more pm callbacks to support hibernation commit `ebeaa36754` upstream. Current ISH driver only registers suspend/resume PM callbacks which don't support hibernation (suspend to disk). Basically after hibernation, the ISH can't resume properly and user may not see sensor events (for example: screen rotation may not work). User will not see a crash or panic or anything except the following message in log: hid-sensor-hub 001F:8086:22D8.0001: timeout waiting for response from ISHTP device So this patch adds support for S4/hibernation to ISH by using the SIMPLE_DEV_PM_OPS() MACRO instead of struct dev_pm_ops directly. The suspend and resume functions will now be used for both suspend to RAM and hibernation. If power management is disabled, SIMPLE_DEV_PM_OPS will do nothing, the suspend and resume related functions won't be used, so mark them as __maybe_unused to clarify that this is the intended behavior, and remove #ifdefs for power management. Cc: stable@vger.kernel.org Signed-off-by: Even Xu <even.xu@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Martin Brandenburg	88f36d1b4f	orangefs: set i_size on new symlink commit `f6a4b4c9d0` upstream. As long as a symlink inode remains in-core, the destination (and therefore size) will not be re-fetched from the server, as it cannot change. The original implementation of the attribute cache assumed that setting the expiry time in the past was sufficient to cause a re-fetch of all attributes on the next getattr. That does not work in this case. The bug manifested itself as follows. When the command sequence touch foo; ln -s foo bar; ls -l bar is run, the output was lrwxrwxrwx. 1 fedora fedora 4906 Apr 24 19:10 bar -> foo However, after a re-mount, ls -l bar produces lrwxrwxrwx. 1 fedora fedora 3 Apr 24 19:10 bar -> foo After this commit, even before a re-mount, the output is lrwxrwxrwx. 1 fedora fedora 3 Apr 24 19:10 bar -> foo Reported-by: Becky Ligon <ligon@clemson.edu> Signed-off-by: Martin Brandenburg <martin@omnibond.com> Fixes: `71680c18c8` ("orangefs: Cache getattr results.") Cc: stable@vger.kernel.org Cc: hubcap@omnibond.com Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Stefan Potyra	aec3dd5ef1	w1: mxc_w1: Enable clock before calling clk_get_rate() on it commit `955bc61328` upstream. According to the API, you may only call clk_get_rate() after actually enabling it. Found by Linux Driver Verification project (linuxtesting.org). Fixes: `a5fd9139f7` ("w1: add 1-wire master driver for i.MX27 / i.MX31") Signed-off-by: Stefan Potyra <Stefan.Potyra@elektrobit.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Hans de Goede	139cd53baf	libata: Drop SanDisk SD7UB3QG1001 NOLPM quirk commit `2cfce3a86b` upstream. Commit `184add2ca2` ("libata: Apply NOLPM quirk for SanDisk SD7UB3QG1001 SSDs") disabled LPM for SanDisk SD7UB3Q*G1001 SSDs. This has led to several reports of users of that SSD where LPM was working fine and who now have a significantly increased idle power consumption on their laptops. Likely there is another problem on the T450s from the original reporter which gets exposed by the uncore reaching deeper sleep states (higher PC-states) due to LPM being enabled. The problem as reported, a hardfreeze about once a day, already did not sound like it would be caused by LPM and the reports of the SSD working fine confirm this. The original reporter is ok with dropping the quirk. A X250 user has reported the same hard freeze problem and for him the problem went away after unrelated updates, I suspect some GPU driver stack changes fixed things. TL;DR: The original reporter's problems were triggered by LPM but not an LPM issue, so drop the quirk for the SSD in question. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1583207 Cc: stable@vger.kernel.org Cc: Richard W.M. Jones <rjones@redhat.com> Cc: Lorenzo Dalrio <lorenzo.dalrio@gmail.com> Reported-by: Lorenzo Dalrio <lorenzo.dalrio@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: "Richard W.M. Jones" <rjones@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Dan Carpenter	0e9806ec73	libata: zpodd: small read overflow in eject_tray() commit `18c9a99bce` upstream. We read from the cdb[] buffer in ata_exec_internal_sg(). It has to be ATAPI_CDB_LEN (16) bytes long, but this buffer is only 12 bytes. Fixes: `213342053d` ("libata: handle power transition of ODD") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Colin Ian King	21e6919834	libata: zpodd: make arrays cdb static, reduces object code size commit `795ef78814` upstream. Don't populate the arrays cdb on the stack, instead make them static. Makes the object code smaller by 230 bytes: Before: text data bss dec hex filename 3797 240 0 4037 fc5 drivers/ata/libata-zpodd.o After: text data bss dec hex filename 3407 400 0 3807 edf drivers/ata/libata-zpodd.o Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Tao Wang	5930589d3f	cpufreq: Fix new policy initialization during limits updates via sysfs commit `c7d1f119c4` upstream. If the policy limits are updated via cpufreq_update_policy() and subsequently via sysfs, the limits stored in user_policy may be set incorrectly. For example, if both min and max are set via sysfs to the maximum available frequency, user_policy.min and user_policy.max will also be the maximum. If a policy notifier triggered by cpufreq_update_policy() lowers both the min and the max at this point, that change is not reflected by the user_policy limits, so if the max is updated again via sysfs to the same lower value, then user_policy.max will be lower than user_policy.min which shouldn't happen. In particular, if one of the policy CPUs is then taken offline and back online, cpufreq_set_policy() will fail for it due to a failing limits check. To prevent that from happening, initialize the min and max fields of the new_policy object to the ones stored in user_policy that were previously set via sysfs. Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject & changelog ] Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Steve French	a6c9a62e0f	smb3: on reconnect set PreviousSessionId field commit `b2adf22fdf` upstream. The server detects reconnect by the (non-zero) value in PreviousSessionId of SMB2/SMB3 SessionSetup request, but this behavior regressed due to commit `166cea4dc3` ("SMB2: Separate RawNTLMSSP authentication from SMB2_sess_setup") CC: Stable <stable@vger.kernel.org> CC: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Dennis Wassenberg	2c6707ce9a	ALSA: hda: add dock and led support for HP ProBook 640 G4 commit `7eef32c1ef` upstream. This patch adds missing initialisation for HP 2013 UltraSlim Dock Line-In/Out PINs and activates keyboard mute/micmute leds for HP ProBook 640 G4 Signed-off-by: Dennis Wassenberg <dennis.wassenberg@secunet.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Dennis Wassenberg	594790ef9f	ALSA: hda: add dock and led support for HP EliteBook 830 G5 commit `2861751f67` upstream. This patch adds missing initialisation for HP 2013 UltraSlim Dock Line-In/Out PINs and activates keyboard mute/micmute leds for HP EliteBook 830 G5 Signed-off-by: Dennis Wassenberg <dennis.wassenberg@secunet.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:08 +08:00
Bo Chen	5514389fb2	ALSA: hda - Handle kzalloc() failure in snd_hda_attach_pcm_stream() commit `a3aa60d511` upstream. When 'kzalloc()' fails in 'snd_hda_attach_pcm_stream()', a new pcm instance is created without setting its operators via 'snd_pcm_set_ops()'. Following operations on the new pcm instance can trigger kernel null pointer dereferences and cause kernel oops. This bug was found with my work on building a gray-box fault-injection tool for linux-kernel-module binaries. A kernel null pointer dereference was confirmed from line 'substream->ops->open()' in function 'snd_pcm_open_substream()' in file 'sound/core/pcm_native.c'. This patch fixes the bug by calling 'snd_device_free()' in the error handling path of 'kzalloc()', which removes the new pcm instance from the snd card before returns with an error code. Signed-off-by: Bo Chen <chenbo@pdx.edu> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:07 +08:00
Qu Wenruo	2102637c85	btrfs: scrub: Don't use inode pages for device replace commit `ac0b4145d6` upstream. [BUG] Btrfs can create compressed extent without checksum (even though it shouldn't), and if we then try to replace device containing such extent, the result device will contain all the uncompressed data instead of the compressed one. Test case already submitted to fstests: https://patchwork.kernel.org/patch/10442353/ [CAUSE] When handling compressed extent without checksum, device replace will go into copy_nocow_pages() function. In that function, btrfs will get all inodes referring to this data extent and then use find_or_create_page() to get pages directly from that inode. The problem here is, pages directly from inode are always uncompressed. And for compressed data extent, they mismatch with on-disk data. Thus this leads to corrupted compressed data extent written to replace device. [FIX] In this attempt, we could just remove the "optimization" branch, and let unified scrub_pages() to handle it. Although scrub_pages() won't bother reusing page cache, it will be a little slower, but it does the correct csum checking and won't cause such data corruption caused by "optimization". Note about the fix: this is the minimal fix that can be backported to older stable trees without conflicts. The whole callchain from copy_nocow_pages() can be deleted, and will be in followup patches. Fixes: `ff023aac31` ("Btrfs: add code to scrub to copy read data to another disk") CC: stable@vger.kernel.org # 4.4+ Reported-by: James Harvey <jamespharvey20@gmail.com> Reviewed-by: James Harvey <jamespharvey20@gmail.com> Signed-off-by: Qu Wenruo <wqu@suse.com> [ remove code removal, add note why ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:07 +08:00
Omar Sandoval	9bb94d8120	Btrfs: fix memory and mount leak in btrfs_ioctl_rm_dev_v2() commit `fd4e994bd1` upstream. If we have invalid flags set, when we error out we must drop our writer counter and free the buffer we allocated for the arguments. This bug is trivially reproduced with the following program on 4.7+: #include <fcntl.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/ioctl.h> #include <sys/stat.h> #include <sys/types.h> #include <linux/btrfs.h> #include <linux/btrfs_tree.h> int main(int argc, char **argv) { struct btrfs_ioctl_vol_args_v2 vol_args = { .flags = UINT64_MAX, }; int ret; int fd; if (argc != 2) { fprintf(stderr, "usage: %s PATH\n", argv[0]); return EXIT_FAILURE; } fd = open(argv[1], O_WRONLY); if (fd == -1) { perror("open"); return EXIT_FAILURE; } ret = ioctl(fd, BTRFS_IOC_RM_DEV_V2, &vol_args); if (ret == -1) perror("ioctl"); close(fd); return EXIT_SUCCESS; } When unmounting the filesystem, we'll hit the WARN_ON(mnt_get_writers(mnt)) in cleanup_mnt() and also may prevent the filesystem to be remounted read-only as the writer count will stay lifted. Fixes: `6b526ed70c` ("btrfs: introduce device delete by devid") CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:07 +08:00
Omar Sandoval	52ea25b2b8	Btrfs: fix clone vs chattr NODATASUM race commit `b5c40d598f` upstream. In btrfs_clone_files(), we must check the NODATASUM flag while the inodes are locked. Otherwise, it's possible that btrfs_ioctl_setflags() will change the flags after we check and we can end up with a partly checksummed file. The race window is only a few instructions in size, between the if and the locks which is: 3834 if (S_ISDIR(src->i_mode) || S_ISDIR(inode->i_mode)) 3835 return -EISDIR; where the setflags must be run and toggle the NODATASUM flag (provided the file size is 0). The clone will block on the inode lock, setflags takes the inode lock, changes flags, releases lock and clone continues. Not impossible but still needs a lot of bad luck to hit unintentionally. Fixes: `0e7b824c4e` ("Btrfs: don't make a file partly checksummed through file clone") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:07 +08:00
Tetsuo Handa	4f65ebcffa	driver core: Don't ignore class_dir_create_and_add() failure. commit `84d0c27d62` upstream. syzbot is hitting WARN() at kernfs_add_one() [1]. This is because kernfs_create_link() is confused by previous device_add() call which continued without setting dev->kobj.parent field when get_device_parent() failed by memory allocation fault injection. Fix this by propagating the error from class_dir_create_and_add() to the callers of get_device_parent(). [1] https://syzkaller.appspot.com/bug?id=fae0fb607989ea744526d1c082a5b8de6529116f Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+df47f81c226b31d89fb1@syzkaller.appspotmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:07 +08:00
Jan Kara	e45ab2d6a8	ext4: fix fencepost error in check for inode count overflow during resize commit `4f2f76f751` upstream. ext4_resize_fs() has an off-by-one bug when checking whether growing of a filesystem will not overflow inode count. As a result it allows a filesystem with 8192 inodes per group to grow to 64TB which overflows inode count to 0 and makes filesystem unusable. Fix it. Cc: stable@vger.kernel.org Fixes: `3f8a6411fb` Reported-by: Jaco Kroon <jaco@uls.co.za> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:07 +08:00
Lukas Czerner	ade6e140df	ext4: update mtime in ext4_punch_hole even if no blocks are released commit `eee597ac93` upstream. Currently in ext4_punch_hole we're going to skip the mtime update if there are no actual blocks to release. However we've actually modified the file by zeroing the partial block so the mtime should be updated. Moreover the sync and datasync handling is skipped as well, which is also wrong. Fix it. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Joe Habermann <joe.habermann@quantum.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:07 +08:00
Jan Kara	42cc42eaba	ext4: fix hole length detection in ext4_ind_map_blocks() commit `2ee3ee06a8` upstream. When ext4_ind_map_blocks() computes the length of a hole, it doesn't account for the fact that the mapped offset may be somewhere in the middle of a completely empty subtree. In such a case it returns too large a hole length, which then results in lseek(SEEK_DATA) returning an incorrect offset beyond the end of the hole. Fix the problem by correctly taking the offset within a subtree into account when computing the length of a hole. Fixes: `facab4d971` CC: stable@vger.kernel.org Reported-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:07 +08:00
Kailang Yang	2435d6b110	ALSA: hda/realtek - New codec support for ALC257 commit `f429e7e494` upstream. Add new support for ALC257 codec. [ It's supposed to be almost equivalent with other ALC25x variants, just adding another type and id -- tiwai ] Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Tested-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:07 +08:00
Frank van der Linden	6caca347e4	tcp: verify the checksum of the first data segment in a new connection [ Upstream commit `4fd44a98ff` ] commit `079096f103` ("tcp/dccp: install syn_recv requests into ehash table") introduced an optimization for the handling of child sockets created for a new TCP connection. But this optimization passes any data associated with the last ACK of the connection handshake up the stack without verifying its checksum, because it calls tcp_child_process(), which in turn calls tcp_rcv_state_process() directly. These lower-level processing functions do not do any checksum verification. Insert a tcp_checksum_complete call in the TCP_NEW_SYN_RECEIVE path to fix this. Fixes: `079096f103` ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:07 +08:00
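The tcp entry above is about verifying a checksum before data is passed up the stack. As a rough illustration of what such verification means, here is a Python sketch of the 16-bit ones' complement Internet checksum. It is a simplified model: the real TCP checksum also covers a pseudo-header, and none of these function names correspond to kernel functions.

```python
def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\0"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:         # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total

def checksum(data: bytes) -> int:
    """Value the sender stores in the checksum field."""
    return ~ones_complement_sum(data) & 0xFFFF

def verify(data_with_checksum: bytes) -> bool:
    """A valid segment sums to 0xFFFF once the checksum is included."""
    return ones_complement_sum(data_with_checksum) == 0xFFFF

payload = b"hello!"                                   # even length, for simplicity
segment = payload + checksum(payload).to_bytes(2, "big")
corrupted = bytes([segment[0] ^ 0x01]) + segment[1:]  # flip one bit

print(verify(segment), verify(corrupted))
```

A receiver that skips the `verify` step would accept the corrupted segment, which is the class of problem the commit message describes for the first data segment of a new connection.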
Davide Caratti	2d34743a2c	net/sched: act_simple: fix parsing of TCA_DEF_DATA [ Upstream commit `8d499533e0` ] use nla_strlcpy() to avoid copying data beyond the length of TCA_DEF_DATA netlink attribute, in case it is less than SIMP_MAX_DATA and it does not end with '\0' character. v2: fix errors in the commit message, thanks Hangbin Liu Fixes: `fa1b1cff3d` ("net_cls_act: Make act_simple use of netlink policy.") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:06 +08:00
Julian Anastasov	c669191257	ipv6: allow PMTU exceptions to local routes [ Upstream commit `0975764684` ] IPVS setups with local client and remote tunnel server need to create exception for the local virtual IP. What we do is to change PMTU from 64KB (on "lo") to 1460 in the common case. Suggested-by: Martin KaFai Lau <kafai@fb.com> Fixes: `45e4fd2668` ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception") Fixes: `7343ff31eb` ("ipv6: Don't create clones of host routes.") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:06 +08:00
Xiangning Yu	780617b249	bonding: re-evaluate force_primary when the primary slave name changes [ Upstream commit `eb55bbf865` ] There is a timing issue under active-standby mode, when bond_enslave() is called, bond->params.primary might not be initialized yet. Any time the primary slave string changes, bond->force_primary should be set to true to make sure the primary becomes the active slave. Signed-off-by: Xiangning Yu <yuxiangning@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:06 +08:00
Daniel Glöckner	c4f24a093f	usb: musb: fix remote wakeup racing with suspend [ Upstream commit `ebc3dd688c` ] It has been observed that writing 0xF2 to the power register while it reads as 0xF4 results in the register having the value 0xF0, i.e. clearing RESUME and setting SUSPENDM in one go does not work. It might also violate the USB spec to transition directly from resume to suspend, especially when not taking T_DRSMDN into account. But this is what happens when a remote wakeup occurs between SetPortFeature USB_PORT_FEAT_SUSPEND on the root hub and musb_bus_suspend being called. This commit returns -EBUSY when musb_bus_suspend is called while remote wakeup is signalled and thus avoids to reset the RESUME bit. Ignoring this error when musb_port_suspend is called from musb_hub_control is ok. Signed-off-by: Daniel Glöckner <dg@emlix.com> Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:06 +08:00
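The register values quoted in the musb entry above are easier to read once decoded into bits. The Python sketch below assumes SUSPENDM is bit 1 (0x02) and RESUME is bit 2 (0x04) of the power register. Those positions are an assumption chosen to be consistent with the values in the commit message, and `decode` is a hypothetical helper.

```python
# Assumed bit positions (consistent with 0xF4 / 0xF2 / 0xF0 in the message).
SUSPENDM = 0x02
RESUME = 0x04

def decode(power: int) -> dict:
    """Report which of the two bits of interest are set."""
    return {
        "SUSPENDM": bool(power & SUSPENDM),
        "RESUME": bool(power & RESUME),
    }

before = 0xF4      # value read back: RESUME set, SUSPENDM clear
written = 0xF2     # value written: clear RESUME and set SUSPENDM in one go
observed = 0xF0    # value the register ends up holding

for name, value in (("before", before), ("written", written), ("observed", observed)):
    print(name, hex(value), decode(value))
```

Decoded this way, the observed value has neither bit set, so the single write cleared RESUME without setting SUSPENDM.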
Liu Bo	42ff36e9cb	Btrfs: make raid6 rebuild retry more [ Upstream commit `8810f7517a` ] There is a scenario that can end up with rebuild process failing to return good content, i.e. suppose that all disks can be read without problems and if the content that was read out doesn't match its checksum, currently for raid6 btrfs at most retries twice, - the 1st retry is to rebuild with all other stripes, it'll eventually be a raid5 xor rebuild, - if the 1st fails, the 2nd retry will deliberately fail parity p so that it will do raid6 style rebuild, however, the chances are that another non-parity stripe content also has something corrupted, so that the above retries are not able to return correct content, and users will think of this as data loss. More seriously, if the loss happens on some important internal btree roots, it could refuse to mount. This extends btrfs to do more retries and each retry fails only one stripe. Since raid6 can tolerate 2 disk failures, if there is one more failure besides the failure on which we're recovering, this can always work. The worst case is to retry as many times as the number of raid6 disks, but given the fact that such a scenario is really rare in practice, it's still acceptable. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:06 +08:00
Eric Dumazet	4e43b6a8b4	tcp: do not overshoot window_clamp in tcp_rcv_space_adjust() commit `02db55718d` upstream. While rcvbuf is properly clamped by tcp_rmem[2], rcvwin is left to a potentially too big value. It has no serious effect, since : 1) tcp_grow_window() has very strict checks. 2) window_clamp can be mangled by user space to any value anyway. tcp_init_buffer_space() and companions use tcp_full_space(), we use tcp_win_from_space() to avoid reloading sk->sk_rcvbuf Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Wei Wang <weiwan@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Benjamin Gilbert <benjamin.gilbert@coreos.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:06 +08:00
Sasha Levin	1fab25ce8d	Revert "Btrfs: fix scrub to repair raid6 corruption" This reverts commit `186a6519dc`. This commit used an incorrect log message. Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:06 +08:00
Damien Thébault	60649dacb3	net: dsa: b53: Add BCM5389 support [ Upstream commit `a95691bc54` ] This patch adds support for the BCM5389 switch connected through MDIO. Signed-off-by: Damien Thébault <damien.thebault@vitec.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:05 +08:00
Finn Thain	1249ccd806	net/sonic: Use dma_mapping_error() [ Upstream commit `26de0b76d9` ] With CONFIG_DMA_API_DEBUG=y, calling sonic_open() produces the message, "DMA-API: device driver failed to check map error". Add the missing dma_mapping_error() call. Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:05 +08:00
João Paulo Rechi Vita	2fe56703ee	platform/x86: asus-wmi: Fix NULL pointer dereference [ Upstream commit `32ffd6e8d1` ] Do not perform the rfkill cleanup routine when (asus->driver->wlan_ctrl_by_user && ashs_present()) is true, since nothing is registered with the rfkill subsystem in that case. Doing so leads to the following kernel NULL pointer dereference: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff816c7348>] __mutex_lock_slowpath+0x98/0x120 PGD 1a3aa8067 PUD 1a3b3d067 PMD 0 Oops: 0002 [#1] PREEMPT SMP Modules linked in: bnep ccm binfmt_misc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core hid_a4tech videodev x86_pkg_temp_thermal intel_powerclamp coretemp ath3k btusb btrtl btintel bluetooth kvm_intel snd_hda_codec_hdmi kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass crc32c_intel arc4 i915 snd_hda_intel snd_hda_codec ath9k ath9k_common ath9k_hw ath i2c_algo_bit snd_hwdep mac80211 ghash_clmulni_intel snd_hda_core snd_pcm snd_timer cfg80211 ehci_pci xhci_pci drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm xhci_hcd ehci_hcd asus_nb_wmi(-) asus_wmi sparse_keymap r8169 rfkill mxm_wmi serio_raw snd mii mei_me lpc_ich i2c_i801 video soundcore mei i2c_smbus wmi i2c_core mfd_core CPU: 3 PID: 3275 Comm: modprobe Not tainted 4.9.34-gentoo #34 Hardware name: ASUSTeK COMPUTER INC. 
K56CM/K56CM, BIOS K56CM.206 08/21/2012 task: ffff8801a639ba00 task.stack: ffffc900014cc000 RIP: 0010:[<ffffffff816c7348>] [<ffffffff816c7348>] __mutex_lock_slowpath+0x98/0x120 RSP: 0018:ffffc900014cfce0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff8801a54315b0 RCX: 00000000c0000100 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8801a54315b4 RBP: ffffc900014cfd30 R08: 0000000000000000 R09: 0000000000000002 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801a54315b4 R13: ffff8801a639ba00 R14: 00000000ffffffff R15: ffff8801a54315b8 FS: 00007faa254fb700(0000) GS:ffff8801aef80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001a3b1b000 CR4: 00000000001406e0 Stack: ffff8801a54315b8 0000000000000000 ffffffff814733ae ffffc900014cfd28 ffffffff8146a28c ffff8801a54315b0 0000000000000000 ffff8801a54315b0 ffff8801a66f3820 0000000000000000 ffffc900014cfd48 ffffffff816c73e7 Call Trace: [<ffffffff814733ae>] ? acpi_ut_release_mutex+0x5d/0x61 [<ffffffff8146a28c>] ? acpi_ns_get_node+0x49/0x52 [<ffffffff816c73e7>] mutex_lock+0x17/0x30 [<ffffffffa00a3bb4>] asus_rfkill_hotplug+0x24/0x1a0 [asus_wmi] [<ffffffffa00a4421>] asus_wmi_rfkill_exit+0x61/0x150 [asus_wmi] [<ffffffffa00a49f1>] asus_wmi_remove+0x61/0xb0 [asus_wmi] [<ffffffff814a5128>] platform_drv_remove+0x28/0x40 [<ffffffff814a2901>] __device_release_driver+0xa1/0x160 [<ffffffff814a29e3>] device_release_driver+0x23/0x30 [<ffffffff814a1ffd>] bus_remove_device+0xfd/0x170 [<ffffffff8149e5a9>] device_del+0x139/0x270 [<ffffffff814a5028>] platform_device_del+0x28/0x90 [<ffffffff814a50a2>] platform_device_unregister+0x12/0x30 [<ffffffffa00a4209>] asus_wmi_unregister_driver+0x19/0x30 [asus_wmi] [<ffffffffa00da0ea>] asus_nb_wmi_exit+0x10/0xf26 [asus_nb_wmi] [<ffffffff8110c692>] SyS_delete_module+0x192/0x270 [<ffffffff810022b2>] ? 
exit_to_usermode_loop+0x92/0xa0 [<ffffffff816ca560>] entry_SYSCALL_64_fastpath+0x13/0x94 Code: e8 5e 30 00 00 8b 03 83 f8 01 0f 84 93 00 00 00 48 8b 43 10 4c 8d 7b 08 48 89 63 10 41 be ff ff ff ff 4c 89 3c 24 48 89 44 24 08 <48> 89 20 4c 89 6c 24 10 eb 1d 4c 89 e7 49 c7 45 08 02 00 00 00 RIP [<ffffffff816c7348>] __mutex_lock_slowpath+0x98/0x120 RSP <ffffc900014cfce0> CR2: 0000000000000000 ---[ end trace 8d484233fa7cb512 ]--- note: modprobe[3275] exited with preempt_count 2 https://bugzilla.kernel.org/show_bug.cgi?id=196467 Reported-by: red.f0xyz@gmail.com Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:05 +08:00
Josh Hill	baa3a68614	net: qmi_wwan: Add Netgear Aircard 779S [ Upstream commit `2415f3bd05` ] Add support for Netgear Aircard 779S Signed-off-by: Josh Hill <josh@joshuajhill.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:05 +08:00
Ivan Bornyakov	5dbffe4201	atm: zatm: fix memcmp casting [ Upstream commit `f9c6442a8f` ] memcmp() returns int, but eprom_try_esi() cast it to unsigned char. One can lose significant bits and get 0 from non-0 value returned by the memcmp(). Signed-off-by: Ivan Bornyakov <brnkv.i1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:05 +08:00
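The truncation described in the zatm entry above can be modelled in a few lines. This Python sketch mimics a C cast from int to unsigned char by keeping only the low 8 bits. The return values used are hypothetical: the C standard only specifies the sign of memcmp()'s result, not its magnitude.

```python
def as_unsigned_char(value: int) -> int:
    """Model a C cast from int to unsigned char (keep the low 8 bits)."""
    return value & 0xFF

# Hypothetical non-zero memcmp() results whose low byte happens to be zero.
for result in (0x100, -0x100, 0x1200):
    print(result, "->", as_unsigned_char(result))

# A small difference survives the cast, which is why the bug can stay hidden.
print(1, "->", as_unsigned_char(1))
```

Any non-zero result that is a multiple of 256 becomes 0 after the cast and would be read as "buffers are equal".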
Hao Wei Tee	46bada0a93	iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs [ Upstream commit `ab1068d686` ] When there are 16 or more logical CPUs, we request for `IWL_MAX_RX_HW_QUEUES` (16) IRQs only as we limit to that number of IRQs, but later on we compare the number of IRQs returned to nr_online_cpus+2 instead of max_irqs, the latter being what we actually asked for. This ends up setting num_rx_queues to 17 which causes lots of out-of-bounds array accesses later on. Compare to max_irqs instead, and also add an assertion in case num_rx_queues > IWL_MAX_RX_HW_QUEUES. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=199551 Fixes: `2e5d4a8f61` ("iwlwifi: pcie: Add new configuration to enable MSIX") Signed-off-by: Hao Wei Tee <angelsl@in04.sg> Tested-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:05 +08:00
Julian Anastasov	0063faaa86	ipvs: fix buffer overflow with sync daemon and service [ Upstream commit `52f9675790` ] syzkaller reports for buffer overflow for interface name when starting sync daemons [1] What we do is that we copy user structure into larger stack buffer but later we search NUL past the stack buffer. The same happens for sched_name when adding/editing virtual server. We are restricted by IP_VS_SCHEDNAME_MAXLEN and IP_VS_IFNAME_MAXLEN being used as size in include/uapi/linux/ip_vs.h, so they include the space for NUL. As using strlcpy is wrong for unsafe source, replace it with strscpy and add checks to return EINVAL if source string is not NUL-terminated. The incomplete strlcpy fix comes from 2.6.13. For the netlink interface reduce the len parameter for IPVS_DAEMON_ATTR_MCAST_IFN and IPVS_SVC_ATTR_SCHED_NAME, so that we get proper EINVAL. [1] kernel BUG at lib/string.c:1052! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 373 Comm: syz-executor936 Not tainted 4.17.0-rc4+ #45 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:fortify_panic+0x13/0x20 lib/string.c:1051 RSP: 0018:ffff8801c976f800 EFLAGS: 00010282 RAX: 0000000000000022 RBX: 0000000000000040 RCX: 0000000000000000 RDX: 0000000000000022 RSI: ffffffff8160f6f1 RDI: ffffed00392edef6 RBP: ffff8801c976f800 R08: ffff8801cf4c62c0 R09: ffffed003b5e4fb0 R10: ffffed003b5e4fb0 R11: ffff8801daf27d87 R12: ffff8801c976fa20 R13: ffff8801c976fae4 R14: ffff8801c976fae0 R15: 000000000000048b FS: 00007fd99f75e700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000200001c0 CR3: 00000001d6843000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: strlen include/linux/string.h:270 [inline] strlcpy include/linux/string.h:293 
[inline] do_ip_vs_set_ctl+0x31c/0x1d00 net/netfilter/ipvs/ip_vs_ctl.c:2388 nf_sockopt net/netfilter/nf_sockopt.c:106 [inline] nf_setsockopt+0x7d/0xd0 net/netfilter/nf_sockopt.c:115 ip_setsockopt+0xd8/0xf0 net/ipv4/ip_sockglue.c:1253 udp_setsockopt+0x62/0xa0 net/ipv4/udp.c:2487 ipv6_setsockopt+0x149/0x170 net/ipv6/ipv6_sockglue.c:917 tcp_setsockopt+0x93/0xe0 net/ipv4/tcp.c:3057 sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3046 __sys_setsockopt+0x1bd/0x390 net/socket.c:1903 __do_sys_setsockopt net/socket.c:1914 [inline] __se_sys_setsockopt net/socket.c:1911 [inline] __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x447369 RSP: 002b:00007fd99f75dda8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 00000000006e39e4 RCX: 0000000000447369 RDX: 000000000000048b RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000018 R09: 0000000000000000 R10: 00000000200001c0 R11: 0000000000000246 R12: 00000000006e39e0 R13: 75a1ff93f0896195 R14: 6f745f3168746576 R15: 0000000000000001 Code: 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b 48 89 df e8 d2 8f 48 fa eb de 55 48 89 fe 48 c7 c7 60 65 64 88 48 89 e5 e8 91 dd f3 f9 <0f> 0b 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41 56 RIP: fortify_panic+0x13/0x20 lib/string.c:1051 RSP: ffff8801c976f800 Reported-and-tested-by: syzbot+aac887f77319868646df@syzkaller.appspotmail.com Fixes: `e4ff675130` ("ipvs: add sync_maxlen parameter for the sync daemon") Fixes: `4da62fc70d` ("[IPVS]: Fix for overflows") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:05 +08:00
Paolo Abeni	8268afc568	netfilter: ebtables: handle string from userspace with care [ Upstream commit `94c752f999` ] strlcpy() can't be safely used on a user-space provided string, as it can try to read beyond the buffer's end, if the latter is not NULL terminated. Leveraging the above, syzbot has been able to trigger the following splat: BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300 [inline] BUG: KASAN: stack-out-of-bounds in compat_mtw_from_user net/bridge/netfilter/ebtables.c:1957 [inline] BUG: KASAN: stack-out-of-bounds in ebt_size_mwt net/bridge/netfilter/ebtables.c:2059 [inline] BUG: KASAN: stack-out-of-bounds in size_entry_mwt net/bridge/netfilter/ebtables.c:2155 [inline] BUG: KASAN: stack-out-of-bounds in compat_copy_entries+0x96c/0x14a0 net/bridge/netfilter/ebtables.c:2194 Write of size 33 at addr ffff8801b0abf888 by task syz-executor0/4504 CPU: 0 PID: 4504 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #40 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267 memcpy+0x37/0x50 mm/kasan/kasan.c:303 strlcpy include/linux/string.h:300 [inline] compat_mtw_from_user net/bridge/netfilter/ebtables.c:1957 [inline] ebt_size_mwt net/bridge/netfilter/ebtables.c:2059 [inline] size_entry_mwt net/bridge/netfilter/ebtables.c:2155 [inline] compat_copy_entries+0x96c/0x14a0 net/bridge/netfilter/ebtables.c:2194 compat_do_replace+0x483/0x900 net/bridge/netfilter/ebtables.c:2285 compat_do_ebt_set_ctl+0x2ac/0x324 net/bridge/netfilter/ebtables.c:2367 compat_nf_sockopt net/netfilter/nf_sockopt.c:144 [inline] compat_nf_setsockopt+0x9b/0x140 
net/netfilter/nf_sockopt.c:156 compat_ip_setsockopt+0xff/0x140 net/ipv4/ip_sockglue.c:1279 inet_csk_compat_setsockopt+0x97/0x120 net/ipv4/inet_connection_sock.c:1041 compat_tcp_setsockopt+0x49/0x80 net/ipv4/tcp.c:2901 compat_sock_common_setsockopt+0xb4/0x150 net/core/sock.c:3050 __compat_sys_setsockopt+0x1ab/0x7c0 net/compat.c:403 __do_compat_sys_setsockopt net/compat.c:416 [inline] __se_compat_sys_setsockopt net/compat.c:413 [inline] __ia32_compat_sys_setsockopt+0xbd/0x150 net/compat.c:413 do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline] do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7fb3cb9 RSP: 002b:00000000fff0c26c EFLAGS: 00000282 ORIG_RAX: 000000000000016e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000000 RDX: 0000000000000080 RSI: 0000000020000300 RDI: 00000000000005f4 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 The buggy address belongs to the page: page:ffffea0006c2afc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x2fffc0000000000() raw: 02fffc0000000000 0000000000000000 0000000000000000 00000000ffffffff raw: 0000000000000000 ffffea0006c20101 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Fix the issue replacing the unsafe function with strscpy() and taking care of possible errors. Fixes: `81e675c227` ("netfilter: ebtables: add CONFIG_COMPAT support") Reported-and-tested-by: syzbot+4e42a04e0bc33cb6c087@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:05 +08:00
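The ipvs and ebtables entries above both replace strlcpy() with strscpy() for strings that come from user space. The Python sketch below models why: strlcpy() starts by measuring the whole source string, so a source field without a terminating NUL makes it read past the field, while strscpy() never looks beyond the given size and reports truncation as an error. Both functions are simplified models written for this illustration, not the kernel implementations, and the buffer contents are made up.

```python
E2BIG = 7  # errno value used by the model for "source does not fit"

def strlcpy_scan_length(memory: bytes, start: int) -> int:
    """Bytes a strlen()-style scan touches before it finds a NUL."""
    return memory.index(b"\0", start) - start

def strscpy_model(src: bytes, size: int):
    """Copy at most size - 1 bytes, looking at no more than size bytes of src.

    Returns (copied_length, dest) or (-E2BIG, truncated_dest).
    """
    if size == 0:
        return -E2BIG, b""
    window = src[:size]                 # never look past the field
    nul = window.find(b"\0")
    if nul == -1:                       # no terminator inside the field
        return -E2BIG, window[:size - 1] + b"\0"
    return nul, window[:nul] + b"\0"

FIELD_SIZE = 8
# An 8-byte name field with no NUL, followed by unrelated adjacent data.
memory = b"A" * FIELD_SIZE + b"SECRET\0"

print(strlcpy_scan_length(memory, 0))        # scans beyond the 8-byte field
print(strscpy_model(memory, FIELD_SIZE))     # bounded, reports -E2BIG
print(strscpy_model(b"eth0\0xxx", FIELD_SIZE))
```

In the model, the caller can turn the negative return value into an EINVAL-style rejection, which matches the checks the commit messages describe.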
Eric Dumazet	c8197f96bc	xfrm6: avoid potential infinite loop in _decode_session6() [ Upstream commit `d9f92772e8` ] syzbot found a way to trigger an infinite loop by overflowing @offset variable that has been forced to use u16 for some very obscure reason in the past. We probably want to look at NEXTHDR_FRAGMENT handling which looks wrong, in a separate patch. In net-next, we shall try to use skb_header_pointer() instead of pskb_may_pull(). watchdog: BUG: soft lockup - CPU#1 stuck for 134s! [syz-executor738:4553] Modules linked in: irq event stamp: 13885653 hardirqs last enabled at (13885652): [<ffffffff878009d5>] restore_regs_and_return_to_kernel+0x0/0x2b hardirqs last disabled at (13885653): [<ffffffff87800905>] interrupt_entry+0xb5/0xf0 arch/x86/entry/entry_64.S:625 softirqs last enabled at (13614028): [<ffffffff84df0809>] tun_napi_alloc_frags drivers/net/tun.c:1478 [inline] softirqs last enabled at (13614028): [<ffffffff84df0809>] tun_get_user+0x1dd9/0x4290 drivers/net/tun.c:1825 softirqs last disabled at (13614032): [<ffffffff84df1b6f>] tun_get_user+0x313f/0x4290 drivers/net/tun.c:1942 CPU: 1 PID: 4553 Comm: syz-executor738 Not tainted 4.17.0-rc3+ #40 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:check_kcov_mode kernel/kcov.c:67 [inline] RIP: 0010:__sanitizer_cov_trace_pc+0x20/0x50 kernel/kcov.c:101 RSP: 0018:ffff8801d8cfe250 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffff8801d88a8080 RBX: ffff8801d7389e40 RCX: 0000000000000006 RDX: 0000000000000000 RSI: ffffffff868da4ad RDI: ffff8801c8a53277 RBP: ffff8801d8cfe250 R08: ffff8801d88a8080 R09: ffff8801d8cfe3e8 R10: ffffed003b19fc87 R11: ffff8801d8cfe43f R12: ffff8801c8a5327f R13: 0000000000000000 R14: ffff8801c8a4e5fe R15: ffff8801d8cfe3e8 FS: 0000000000d88940(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffff600400 CR3: 00000001acab3000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 
0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: _decode_session6+0xc1d/0x14f0 net/ipv6/xfrm6_policy.c:150 __xfrm_decode_session+0x71/0x140 net/xfrm/xfrm_policy.c:2368 xfrm_decode_session_reverse include/net/xfrm.h:1213 [inline] icmpv6_route_lookup+0x395/0x6e0 net/ipv6/icmp.c:372 icmp6_send+0x1982/0x2da0 net/ipv6/icmp.c:551 icmpv6_send+0x17a/0x300 net/ipv6/ip6_icmp.c:43 ip6_input_finish+0x14e1/0x1a30 net/ipv6/ip6_input.c:305 NF_HOOK include/linux/netfilter.h:288 [inline] ip6_input+0xe1/0x5e0 net/ipv6/ip6_input.c:327 dst_input include/net/dst.h:450 [inline] ip6_rcv_finish+0x29c/0xa10 net/ipv6/ip6_input.c:71 NF_HOOK include/linux/netfilter.h:288 [inline] ipv6_rcv+0xeb8/0x2040 net/ipv6/ip6_input.c:208 __netif_receive_skb_core+0x2468/0x3650 net/core/dev.c:4646 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4711 netif_receive_skb_internal+0x126/0x7b0 net/core/dev.c:4785 napi_frags_finish net/core/dev.c:5226 [inline] napi_gro_frags+0x631/0xc40 net/core/dev.c:5299 tun_get_user+0x3168/0x4290 drivers/net/tun.c:1951 tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:1996 call_write_iter include/linux/fs.h:1784 [inline] do_iter_readv_writev+0x859/0xa50 fs/read_write.c:680 do_iter_write+0x185/0x5f0 fs/read_write.c:959 vfs_writev+0x1c7/0x330 fs/read_write.c:1004 do_writev+0x112/0x2f0 fs/read_write.c:1039 __do_sys_writev fs/read_write.c:1112 [inline] __se_sys_writev fs/read_write.c:1109 [inline] __x64_sys_writev+0x75/0xb0 fs/read_write.c:1109 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reported-by: syzbot+0053c8...@syzkaller.appspotmail.com Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: 
Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:05 +08:00
Greg Kroah-Hartman	ccd19d3a38	objtool: update .gitignore file With the recent sync with objtool from 4.14.y, the objtool .gitignore file was forgotten. Fix that up now to properly handle the change in where the autogenerated files live. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-26 08:08:05 +08:00
Greg Kroah-Hartman	8e52b94e19	Linux 4.9.109	2018-06-16 09:52:35 +02:00
Greg Kroah-Hartman	f09a7b0eea	perf: sync up x86/.../cpufeatures.h The x86 copy of cpufeatures.h is now out of sync, so fix that. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:35 +02:00
Bin Liu	20f4d771b3	crypto: omap-sham - fix memleak commit `9dbc8a0328` upstream. Fixes: `8043bb1ae0` ("crypto: omap-sham - convert driver logic to use sgs for data xmit") The memory pages freed in omap_sham_finish_req() were less than those allocated in omap_sham_copy_sgs(). Cc: stable@vger.kernel.org Signed-off-by: Bin Liu <b-liu@ti.com> Acked-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:35 +02:00
Michael Ellerman	ef2aa9f3a7	crypto: vmx - Remove overly verbose printk from AES init routines commit `1411b5218a` upstream. In the vmx AES init routines we do a printk(KERN_INFO ...) to report the fallback implementation we're using. However with a slow console this can significantly affect the speed of crypto operations. Using 'cryptsetup benchmark' the removal of the printk() leads to a ~5x speedup for aes-cbc decryption. So remove them. Fixes: `8676590a15` ("crypto: vmx - Adding AES routines for VMX module") Fixes: `8c755ace35` ("crypto: vmx - Adding CBC routines for VMX module") Fixes: `4f7f60d312` ("crypto: vmx - Adding CTR routines for VMX module") Fixes: `cc333cd68d` ("crypto: vmx - Adding GHASH routines for VMX module") Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:35 +02:00
Johannes Wienke	05ca7fe5a7	Input: elan_i2c - add ELAN0612 (Lenovo v330 14IKB) ACPI ID commit `e6e7e9cd8e` upstream. Add ELAN0612 to the list of supported touchpads; this ID is used in Lenovo v330 14IKB devices. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199253 Signed-off-by: Johannes Wienke <languitar@semipol.de> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:34 +02:00
Ethan Lee	78e7bbf60c	Input: goodix - add new ACPI id for GPD Win 2 touch screen commit `5ca4d1ae9b` upstream. GPD Win 2 Website: http://www.gpd.hk/gpdwin2.asp Tested on a unit from the first production run sent to Indiegogo backers Signed-off-by: Ethan Lee <flibitijibibo@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:34 +02:00
Paolo Bonzini	13d1c5b17d	kvm: x86: use correct privilege level for sgdt/sidt/fxsave/fxrstor access commit `3c9fa24ca7` upstream. The functions that were used in the emulation of fxrstor, fxsave, sgdt and sidt were originally meant for task switching, and as such they did not check privilege levels. This is very bad when the same functions are used in the emulation of unprivileged instructions. This is CVE-2018-10853. The obvious fix is to add a new argument to ops->read_std and ops->write_std, which decides whether the access is a "system" access or should use the processor's CPL. Fixes: `129a72a0d3` ("KVM: x86: Introduce segmented_write_std", 2017-01-12) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:34 +02:00
Dave Martin	018e5191c6	tty: pl011: Avoid spuriously stuck-off interrupts commit `4a7e625ce5` upstream. Commit `9b96fbacda` ("serial: PL011: clear pending interrupts") clears the RX and receive timeout interrupts on pl011 startup, to avoid a screaming-interrupt scenario that can occur when the firmware or bootloader leaves these interrupts asserted. This has been noted as an issue when running Linux on qemu [1]. Unfortunately, the above fix seems to lead to potential misbehaviour if the RX FIFO interrupt is asserted _non_ spuriously on driver startup, if the RX FIFO is also already full to the trigger level. Clearing the RX FIFO interrupt does not change the FIFO fill level. In this scenario, because the interrupt is now clear and because the FIFO is already full to the trigger level, no new assertion of the RX FIFO interrupt can occur unless the FIFO is drained back below the trigger level. This never occurs because the pl011 driver is waiting for an RX FIFO interrupt to tell it that there is something to read, and does not read the FIFO at all until that interrupt occurs. Thus, simply clearing "spurious" interrupts on startup may be misguided, since there is no way to be sure that the interrupts are truly spurious, and things can go wrong if they are not. This patch instead clears the interrupt condition by draining the RX FIFO during UART startup, after clearing any potentially spurious interrupt. This should ensure that an interrupt will definitely be asserted if the RX FIFO subsequently becomes sufficiently full. The drain is done at the point of enabling interrupts only. This means that it will occur any time the UART is newly opened through the tty layer. It will not apply to polled-mode use of the UART by kgdboc: since that scenario cannot use interrupts by design, this should not matter. kgdboc will interact badly with "normal" use of the UART in any case: this patch makes no attempt to paper over such issues. 
This patch does not attempt to address the case where the RX FIFO fills faster than it can be drained: that is a pathological hardware design problem that is beyond the scope of the driver to work around. As a failsafe, the number of poll iterations for draining the FIFO is limited to twice the FIFO size. This will ensure that the kernel at least boots even if it is impossible to drain the FIFO for some reason. [1] [Qemu-devel] [Qemu-arm] [PATCH] pl011: do not put into fifo before enabled the interruption https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg06446.html Reported-by: Wei Xu <xuwei5@hisilicon.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Peter Maydell <peter.maydell@linaro.org> Fixes: `9b96fbacda` ("serial: PL011: clear pending interrupts") Signed-off-by: Dave Martin <Dave.Martin@arm.com> Cc: stable <stable@vger.kernel.org> Tested-by: Wei Xu <xuwei5@hisilicon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:34 +02:00
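The bounded drain described in the pl011 commit above can be modelled in a few lines. This is only a sketch of the idea (read until empty, but never more than twice the FIFO size), not the driver code: `FakeUart`, `drain_rx_fifo` and the FIFO size of 16 are hypothetical names and values chosen for illustration.

```python
from collections import deque


class FakeUart:
    """Hypothetical stand-in for a UART with a receive FIFO (not the pl011 API)."""

    def __init__(self, fifo_size, pending, refill=False):
        self.fifo_size = fifo_size
        self.rx = deque(pending)
        self.refill = refill  # model hardware that refills as fast as we read

    def rx_empty(self):
        return not self.rx

    def read_byte(self):
        byte = self.rx.popleft()
        if self.refill:
            self.rx.append(byte)
        return byte


def drain_rx_fifo(uart):
    """Read until the FIFO is empty, giving up after 2 * fifo_size reads.

    Returns the number of bytes discarded.
    """
    reads = 0
    limit = 2 * uart.fifo_size
    while reads < limit and not uart.rx_empty():
        uart.read_byte()
        reads += 1
    return reads


normal = FakeUart(fifo_size=16, pending=range(16))
stuck = FakeUart(fifo_size=16, pending=range(16), refill=True)
print(drain_rx_fifo(normal), normal.rx_empty())
print(drain_rx_fifo(stuck), stuck.rx_empty())
```

In the `stuck` case the loop stops at the iteration limit with data still pending, which mirrors the failsafe in the commit message: startup continues even if the FIFO cannot be drained.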
Gil Kupfer	d9bc59c44d	vmw_balloon: fixing double free when batching mode is off commit `b23220fe05` upstream. The balloon.page field is used for two different purposes if batching is on or off. If batching is on, the field point to the page which is used to communicate with with the hypervisor. If it is off, balloon.page points to the page that is about to be (un)locked. Unfortunately, this dual-purpose of the field introduced a bug: when the balloon is popped (e.g., when the machine is reset or the balloon driver is explicitly removed), the balloon driver frees, unconditionally, the page that is held in balloon.page. As a result, if batching is disabled, this leads to double freeing the last page that is sent to the hypervisor. The following error occurs during rmmod when kernel checkers are on, and the balloon is not empty: [ 42.307653] ------------[ cut here ]------------ [ 42.307657] Kernel BUG at ffffffffba1e4b28 [verbose debug info unavailable] [ 42.307720] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 42.312512] Modules linked in: vmw_vsock_vmci_transport vsock ppdev joydev vmw_balloon(-) input_leds serio_raw vmw_vmci parport_pc shpchp parport i2c_piix4 nfit mac_hid autofs4 vmwgfx drm_kms_helper hid_generic syscopyarea sysfillrect usbhid sysimgblt fb_sys_fops hid ttm mptspi scsi_transport_spi ahci mptscsih drm psmouse vmxnet3 libahci mptbase pata_acpi [ 42.312766] CPU: 10 PID: 1527 Comm: rmmod Not tainted 4.12.0+ #5 [ 42.312803] Hardware name: VMware, Inc. 
VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2016 [ 42.313042] task: ffff9bf9680f8000 task.stack: ffffbfefc1638000 [ 42.313290] RIP: 0010:__free_pages+0x38/0x40 [ 42.313510] RSP: 0018:ffffbfefc163be98 EFLAGS: 00010246 [ 42.313731] RAX: 000000000000003e RBX: ffffffffc02b9720 RCX: 0000000000000006 [ 42.313972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9bf97e08e0a0 [ 42.314201] RBP: ffffbfefc163be98 R08: 0000000000000000 R09: 0000000000000000 [ 42.314435] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc02b97e4 [ 42.314505] R13: ffffffffc02b9748 R14: ffffffffc02b9728 R15: 0000000000000200 [ 42.314550] FS: 00007f3af5fec700(0000) GS:ffff9bf97e080000(0000) knlGS:0000000000000000 [ 42.314599] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 42.314635] CR2: 00007f44f6f4ab24 CR3: 00000003a7d12000 CR4: 00000000000006e0 [ 42.314864] Call Trace: [ 42.315774] vmballoon_pop+0x102/0x130 [vmw_balloon] [ 42.315816] vmballoon_exit+0x42/0xd64 [vmw_balloon] [ 42.315853] SyS_delete_module+0x1e2/0x250 [ 42.315891] entry_SYSCALL_64_fastpath+0x23/0xc2 [ 42.315924] RIP: 0033:0x7f3af5b0e8e7 [ 42.315949] RSP: 002b:00007fffe6ce0148 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [ 42.315996] RAX: ffffffffffffffda RBX: 000055be676401e0 RCX: 00007f3af5b0e8e7 [ 42.316951] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055be67640248 [ 42.317887] RBP: 0000000000000003 R08: 0000000000000000 R09: 1999999999999999 [ 42.318845] R10: 0000000000000883 R11: 0000000000000206 R12: 00007fffe6cdf130 [ 42.319755] R13: 0000000000000000 R14: 0000000000000000 R15: 000055be676401e0 [ 42.320606] Code: c0 74 1c f0 ff 4f 1c 74 02 5d c3 85 f6 74 07 e8 0f d8 ff ff 5d c3 31 f6 e8 c6 fb ff ff 5d c3 48 c7 c6 c8 0f c5 ba e8 58 be 02 00 <0f> 0b 66 0f 1f 44 00 00 66 66 66 66 90 48 85 ff 75 01 c3 55 48 [ 42.323462] RIP: __free_pages+0x38/0x40 RSP: ffffbfefc163be98 [ 42.325735] ---[ end trace 872e008e33f81508 ]--- To solve the bug, we eliminate the dual purpose of 
balloon.page. Fixes: `f220a80f0c` ("VMware balloon: add batching to the vmw_balloon.") Cc: stable@vger.kernel.org Reported-by: Oleksandr Natalenko <onatalen@redhat.com> Signed-off-by: Gil Kupfer <gilkup@gmail.com> Signed-off-by: Nadav Amit <namit@vmware.com> Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Tested-by: Oleksandr Natalenko <oleksandr@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:34 +02:00
Tony Lindgren	f6e6f0c542	serial: 8250: omap: Fix idling of clocks for unused uarts commit `13dc04d0e5` upstream. I noticed that unused UARTs won't necessarily idle properly always unless at least one byte tx transfer is done first. After some debugging I narrowed down the problem to the scr register dma configuration bits that need to be set before softreset for the clocks to idle. Unless we do this, the module clkctrl idlest bits may be set to 1 instead of 3 meaning the clock will never idle and is blocking deeper idle states for the whole domain. This might be related to the configuration done by the bootloader or kexec booting where certain configurations cause the 8250 or the clkctrl clock to jam in a way where setting of the scr bits and reset is needed to clear it. I've tried diffing the 8250 registers for the various modes, but did not see anything specific. So far I've only seen this on omap4 but I'm suspecting this might also happen on the other clkctrl using SoCs considering they already have a quirk enabled for UART_ERRATA_CLOCK_DISABLE. Let's fix the issue by configuring scr before reset for basic dma even if we don't use it. The scr register will be reset when we do softreset few lines after, and we restore scr on resume. We should do this for all the SoCs with UART_ERRATA_CLOCK_DISABLE quirk flag set since the ones with UART_ERRATA_CLOCK_DISABLE are all based using clkctrl similar to omap4. Looks like both OMAP_UART_SCR_DMAMODE_1 \| OMAP_UART_SCR_DMAMODE_CTL bits are needed for the clkctrl to idle after a softreset. And we need to add omap4 to also use the UART_ERRATA_CLOCK_DISABLE for the related workaround to be enabled. This same compatible value will also be used for omap5. 
Fixes: `cdb929e445` ("serial: 8250_omap: workaround errata around idling UART after using DMA") Cc: Keerthy <j-keerthy@ti.com> Cc: Matthijs van Duin <matthijsvanduin@gmail.com> Cc: Sekhar Nori <nsekhar@ti.com> Cc: Tero Kristo <t-kristo@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:34 +02:00
Marek Szyprowski	5b91ae57b5	serial: samsung: fix maxburst parameter for DMA transactions commit `aa2f80e752` upstream. The best granularity of residue that DMA engine can report is in the BURST units, so the serial driver must use MAXBURST = 1 and DMA_SLAVE_BUSWIDTH_1_BYTE if it relies on exact number of bytes transferred by DMA engine. Fixes: `62c37eedb7` ("serial: samsung: add dma reqest/release functions") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:34 +02:00
Sebastian Andrzej Siewior	41bdf9702c	tty/serial: atmel: use port->name as name in request_irq() commit `9594b5be7e` upstream. I was puzzled while looking at /proc/interrupts and random things showed up between reboots. This occurred more often but I realised it later. The "correct" output should be: \|38: 11861 atmel-aic5 2 Level ttyS0 but I saw sometimes \|38: 6426 atmel-aic5 2 Level tty1 and accounted it wrongly as correct. This is use after free and the former example randomly got the "old" pointer which pointed to the same content. With SLAB_FREELIST_RANDOM and HARDENED I even got \|38: 7067 atmel-aic5 2 Level E=Started User Manager for UID 0 or other nonsense. As it turns out the tty, pointer that is accessed in atmel_startup(), is freed() before atmel_shutdown(). It seems to happen quite often that the tty for ttyS0 is allocated and freed while ->shutdown is not invoked. I don't do anything special - just a systemd boot :) Use dev_name(&pdev->dev) as the IRQ name for request_irq(). This exists as long as the driver is loaded so no use-after-free here. Cc: stable@vger.kernel.org Fixes: `761ed4a945` ("tty: serial_core: convert uart_close to use tty_port_close") Acked-by: Richard Genoud <richard.genoud@gmail.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:34 +02:00
Geert Uytterhoeven	70f0a59bbd	serial: sh-sci: Stop using printk format %pCr commit `d63c16f8e1` upstream. Printk format "%pCr" will be removed soon, as clk_get_rate() must not be called in atomic context. Replace it by open-coding the operation. This is safe here, as the code runs in task context. Link: http://lkml.kernel.org/r/1527845302-12159-4-git-send-email-geert+renesas@glider.be To: Jia-Ju Bai <baijiaju1990@gmail.com> To: Jonathan Corbet <corbet@lwn.net> To: Michael Turquette <mturquette@baylibre.com> To: Stephen Boyd <sboyd@kernel.org> To: Zhang Rui <rui.zhang@intel.com> To: Eduardo Valentin <edubezval@gmail.com> To: Eric Anholt <eric@anholt.net> To: Stefan Wahren <stefan.wahren@i2se.com> To: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-doc@vger.kernel.org Cc: linux-clk@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-serial@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-renesas-soc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: stable@vger.kernel.org # 4.5+ Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:33 +02:00
Yoshihiro Shimoda	bc62b33d5f	usb: gadget: udc: renesas_usb3: disable the controller's irqs for reconnecting commit `bd6bce004d` upstream. This patch fixes an issue where reconnection can fail because the controller's irqs trigger unexpected state handling. To fix the issue, the driver disables the controller's irqs when disconnected. Fixes: `746bfe63bb` ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Cc: <stable@vger.kernel.org> # v4.5+ Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:33 +02:00
Alexander Kappner	244eb27f96	usb-storage: Add compatibility quirk flags for G-Technologies G-Drive commit `ca7d9515d0` upstream. The "G-Drive" (sold by G-Technology) external USB 3.0 drive hangs on write access under UAS and usb-storage: [ 136.079121] sd 15:0:0:0: [sdi] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 136.079144] sd 15:0:0:0: [sdi] tag#0 Sense Key : Illegal Request [current] [ 136.079152] sd 15:0:0:0: [sdi] tag#0 Add. Sense: Invalid field in cdb [ 136.079176] sd 15:0:0:0: [sdi] tag#0 CDB: Write(16) 8a 08 00 00 00 00 00 00 00 00 00 00 00 08 00 00 [ 136.079180] print_req_error: critical target error, dev sdi, sector 0 [ 136.079183] Buffer I/O error on dev sdi, logical block 0, lost sync page write [ 136.173148] EXT4-fs (sdi): mounted filesystem with ordered data mode. Opts: (null) [ 140.583998] sd 15:0:0:0: [sdi] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [ 140.584010] sd 15:0:0:0: [sdi] tag#0 Sense Key : Illegal Request [current] [ 140.584016] sd 15:0:0:0: [sdi] tag#0 Add. Sense: Invalid field in cdb [ 140.584022] sd 15:0:0:0: [sdi] tag#0 CDB: Write(16) 8a 08 00 00 00 00 e8 c4 00 18 00 00 00 08 00 00 [ 140.584025] print_req_error: critical target error, dev sdi, sector 3905159192 [ 140.584044] print_req_error: critical target error, dev sdi, sector 3905159192 [ 140.584052] Aborting journal on device sdi-8. The proposed patch adds compatibility quirks. Because the drive requires two quirks (one to work with UAS, and another to work with usb-storage), adding this under unusual_devs.h and not just unusual_uas.h so kernels compiled without UAS receive the quirk. With the patch, the drive works reliably on UAS and usb- storage. (tested on NEC Corporation uPD720200 USB 3.0 host controller). Signed-off-by: Alexander Kappner <agk@godking.net> Acked-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:33 +02:00
Alexander Kappner	187941e505	usb-storage: Add support for FL_ALWAYS_SYNC flag in the UAS driver commit `8c4e97ddfe` upstream. The ALWAYS_SYNC flag is currently honored by the usb-storage driver but not UAS and is required to work around devices that become unstable upon being queried for cache. This code is taken straight from: drivers/usb/storage/scsiglue.c:284 Signed-off-by: Alexander Kappner <agk@godking.net> Acked-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@vger.kernel.org> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:33 +02:00
Gustavo A. R. Silva	14450abb38	usbip: vhci_sysfs: fix potential Spectre v1 commit `a0d6ec8809` upstream. pdev_nr and rhport can be controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/usb/usbip/vhci_sysfs.c:238 detach_store() warn: potential spectre issue 'vhcis' drivers/usb/usbip/vhci_sysfs.c:328 attach_store() warn: potential spectre issue 'vhcis' drivers/usb/usbip/vhci_sysfs.c:338 attach_store() warn: potential spectre issue 'vhci->vhci_hcd_ss->vdev' drivers/usb/usbip/vhci_sysfs.c:340 attach_store() warn: potential spectre issue 'vhci->vhci_hcd_hs->vdev' Fix this by sanitizing pdev_nr and rhport before using them to index vhcis and vhci->vhci_hcd_ss->vdev respectively. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Shuah Khan (Samsung OSG) <shuah@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:33 +02:00
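The index sanitizing mentioned in the usbip commit above relies on a mask that is all ones for an in-range index and zero otherwise. Python cannot demonstrate speculative execution, so the following is only a model of the arithmetic, assuming a 64-bit word and a size below 2**63; `index_mask_nospec` and `clamp_index` are hypothetical names, and the kernel's own mechanism is the `array_index_nospec()` helper.

```python
MASK64 = (1 << 64) - 1  # model a 64-bit unsigned long


def index_mask_nospec(index, size):
    """Return all ones if 0 <= index < size, else 0, using only arithmetic.

    The top bit of (index | (size - 1 - index)) is set exactly when the
    index is out of range, assuming size < 2**63.
    """
    index &= MASK64
    v = (index | ((size - 1 - index) & MASK64)) & MASK64
    top_clear = ((~v) & MASK64) >> 63  # 1 when in range, 0 otherwise
    return top_clear * MASK64


def clamp_index(index, size):
    """Force an out-of-range index to 0 so it cannot address past the array."""
    return index & index_mask_nospec(index, size)


print(clamp_index(3, 8), clamp_index(8, 8), clamp_index(-1, 8))
```

The point of the mask form is that the clamped value is computed from data rather than selected by a conditional branch, so a mispredicted bounds check cannot feed an out-of-range index to the dependent load.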
Laura Abbott	8da07ee9e4	staging: android: ion: Switch to pr_warn_once in ion_buffer_destroy commit `45ad559a29` upstream. Syzbot reported yet another warning with Ion: WARNING: CPU: 0 PID: 1467 at drivers/staging/android/ion/ion.c:122 ion_buffer_destroy+0xd4/0x190 drivers/staging/android/ion/ion.c:122 Kernel panic - not syncing: panic_on_warn set ... This is catching that a buffer was freed with an existing kernel mapping still present. This can easily be triggered from userspace by calling DMA_BUF_SYNC_START without calling DMA_BUF_SYNC_END. Switch to a single pr_warn_once to indicate the error without being disruptive. Reported-by: syzbot+cd8bcd40cb049efa2770@syzkaller.appspotmail.com Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Laura Abbott <labbott@redhat.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:33 +02:00
Paolo Bonzini	838b0e900a	KVM: x86: pass kvm_vcpu to kvm_read_guest_virt and kvm_write_guest_virt_system commit `ce14e868a5` upstream. In the next patch the emulator's .read_std and .write_std callbacks will grow another argument, which is not needed in kvm_read_guest_virt and kvm_write_guest_virt_system's callers. Since we have to make separate functions, let's give the currently existing names a nicer interface, too. Fixes: `129a72a0d3` ("KVM: x86: Introduce segmented_write_std", 2017-01-12) Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:33 +02:00
Paolo Bonzini	00b1391f95	KVM: x86: introduce linear_{read,write}_system commit `79367a6574` upstream. Wrap the common invocation of ctxt->ops->read_std and ctxt->ops->write_std, so as to have a smaller patch when the functions grow another argument. Fixes: `129a72a0d3` ("KVM: x86: Introduce segmented_write_std", 2017-01-12) Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:33 +02:00
Linus Walleij	be1f605bea	gpio: No NULL owner commit `7d18f0a14a` upstream. Sometimes a GPIO is fetched with NULL as parent device, and that is just fine. So under these circumstances, avoid using dev_name() to provide a name for the GPIO line. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Cc: Daniel Rosenberg <drosen@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:33 +02:00
Martin Wilck	1c4eb2a50e	nvmet: don't overwrite identify sn/fr with 0-bytes commit `42819eb7a0` upstream. The merged version of my patch "nvmet: don't report 0-bytes in serial number" fails to remove two lines which should have been replaced, so that the space-padded strings are overwritten again with 0-bytes. Fix it. Fixes: `42de82a8b5` nvmet: don't report 0-bytes in serial number Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:33 +02:00
Martin Wilck	f43d8e4c86	nvmet: don't report 0-bytes in serial number commit `42de82a8b5` upstream. The NVME standard mandates that the SN, MN, and FR fields of the Identify Controller Data Structure be "ASCII strings". That means that they may not contain 0-bytes, not even string terminators. Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.de> [hch: fixed for the move of the serial field, updated description] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:32 +02:00
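The rule in the nvmet commit above (identify strings are fixed-width ASCII with no 0-bytes, not even a terminator) amounts to space padding. A minimal sketch, assuming the usual Identify Controller field widths of 20 bytes for SN, 40 for MN and 8 for FR; `pad_ascii_field` is a hypothetical helper, not a kernel function.

```python
def pad_ascii_field(value, width):
    """Return `value` as exactly `width` ASCII bytes, space-padded, with no NUL bytes."""
    raw = value.encode("ascii")
    if b"\x00" in raw:
        raise ValueError("field must not contain 0-bytes")
    # Truncate over-long input, then pad on the right with spaces.
    return raw[:width].ljust(width, b" ")


sn = pad_ascii_field("1234abcd", 20)
print(len(sn), sn)
```

The follow-up fix listed just above this commit in the log addresses the case where such padded strings were overwritten with 0-bytes again.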
Johannes Thumshirn	1e38f8e986	nvmet: Move serial number from controller to subsystem commit `2e7f5d2af2` upstream. The NVMe specification defines the serial number as: "Serial Number (SN): Contains the serial number for the NVM subsystem that is assigned by the vendor as an ASCII string. Refer to section 7.10 for unique identifier requirements. Refer to section 1.5 for ASCII string requirements" So move it from the controller to the subsystem, where it belongs. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:32 +02:00
Andy Lutomirski	077c9e26bb	x86/crypto, x86/fpu: Remove X86_FEATURE_EAGER_FPU #ifdef from the crc32c code commit `02f39b2379` upstream. The crypto code was checking both use_eager_fpu() and defined(X86_FEATURE_EAGER_FPU). The latter was nonsensical, so remove it. This will avoid breakage when we remove X86_FEATURE_EAGER_FPU. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: pbonzini@redhat.com Link: http://lkml.kernel.org/r/1475627678-20788-2-git-send-email-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:32 +02:00
Kevin Easton	142b79aa0b	af_key: Always verify length of provided sadb_key commit `4b66af2d63` upstream. Key extensions (struct sadb_key) include a user-specified number of key bits. The kernel uses that number to determine how much key data to copy out of the message in pfkey_msg2xfrm_state(). The length of the sadb_key message must be verified to be long enough, even in the case of SADB_X_AALG_NULL. Furthermore, the sadb_key_len value must be long enough to include both the key data and the struct sadb_key itself. Introduce a helper function verify_key_len(), and call it from parse_exthdrs() where other exthdr types are similarly checked for correctness. Signed-off-by: Kevin Easton <kevin@guarana.org> Reported-by: syzbot+5022a34ca5a3d49b84223653fab632dfb7b4cf37@syzkaller.appspotmail.com Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Cc: Zubin Mithra <zsm@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:32 +02:00
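The length check described in the af_key commit above can be worked through with small numbers. This is a sketch under stated assumptions, not the kernel helper: PF_KEY extension lengths are counted in 64-bit words, and the `struct sadb_key` header is assumed to be 8 bytes (four 16-bit fields). The function names mirror the commit message for readability only.

```python
SADB_KEY_HEADER_BYTES = 8  # assumed size of struct sadb_key: four 16-bit fields


def div_round_up(n, d):
    return (n + d - 1) // d


def required_key_len(key_bits):
    """Minimum sadb_key_len (in 64-bit words) for a key of `key_bits` bits."""
    key_bytes = div_round_up(key_bits, 8)
    return div_round_up(SADB_KEY_HEADER_BYTES + key_bytes, 8)


def verify_key_len(sadb_key_len, key_bits):
    """True if the declared extension length covers the header plus the key data."""
    return sadb_key_len >= required_key_len(key_bits)


print(required_key_len(128), verify_key_len(1, 128))
```

A 128-bit key needs 16 bytes of key data plus the 8-byte header, i.e. three 64-bit words, so a message that declares a length of 1 while claiming 128 key bits must be rejected before any key data is copied.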
Keith Busch	b53761a18e	nvme-pci: initialize queue memory before interrupts commit `161b8be2bd` upstream. A spurious interrupt before the nvme driver has initialized the completion queue may inadvertently cause the driver to believe it has a completion to process. This may result in a NULL dereference since the nvmeq's tags are not set at this point. The patch initializes the host's CQ memory so that a spurious interrupt isn't mistaken for a real completion. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:32 +02:00
Andreas Born	ae0c8eeb66	bonding: require speed/duplex only for 802.3ad, alb and tlb commit `ad729bc9ac` upstream. The patch `c4adfc822b` ("bonding: make speed, duplex setting consistent with link state") puts the link state to down if bond_update_speed_duplex() cannot retrieve speed and duplex settings. Assumably the patch was written with 802.3ad mode in mind which relies on link speed/duplex settings. For other modes like active-backup these settings are not required. Thus, only for these other modes, this patch reintroduces support for slaves that do not support reporting speed or duplex such as wireless devices. This fixes the regression reported in bug 196547 (https://bugzilla.kernel.org/show_bug.cgi?id=196547). Fixes: `c4adfc822b` ("bonding: make speed, duplex setting consistent with link state") Signed-off-by: Andreas Born <futur.andy@googlemail.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Nate Clark <nate@neworld.us> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:32 +02:00
Mahesh Bandewar	bc5ad40583	bonding: fix active-backup transition commit `3f3c278c94` upstream. Earlier patch `c4adfc822b` ("bonding: make speed, duplex setting consistent with link state") made an attempt to keep slave state consistent with speed and duplex settings. Unfortunately link-state transition is used to change the active link especially when used in conjunction with mii-mon. The above mentioned patch broke that logic. Also when speed and duplex settings for a link are updated during a link-event, the link-status should not be changed to invoke correct transition logic. This patch fixes this issue by moving the link-state update outside of the bond_update_speed_duplex() fn and to the places where this fn is called and update link-state selectively. Fixes: `c4adfc822b` ("bonding: make speed, duplex setting consistent with link state") Signed-off-by: Mahesh Bandewar <maheshb@google.com> Reviewed-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Nate Clark <nate@neworld.us> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:32 +02:00
Mahesh Bandewar	c5b9d36f1e	bonding: correctly update link status during mii-commit phase commit `b5bf0f5b16` upstream. bond_miimon_commit() marks the link UP after attempting to get the speed and duplex settings for the link. There is a possibility that bond_update_speed_duplex() could fail. This is another place where it could result into an inconsistent bonding link state. With this patch the link will be marked UP only if the speed and duplex values retrieved have sane values and processed further. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Nate Clark <nate@neworld.us> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:32 +02:00
Andy Lutomirski	47a6aa5975	x86/fpu: Hard-disable lazy FPU mode commit `ca6938a1cd` upstream. Since commit: `58122bf1d8` ("x86/fpu: Default eagerfpu=on on all CPUs") ... in Linux 4.6, eager FPU mode has been the default on all x86 systems, and no one has reported any regressions. This patch removes the ability to enable lazy mode: use_eager_fpu() becomes "return true" and all of the FPU mode selection machinery is removed. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: pbonzini@redhat.com Link: http://lkml.kernel.org/r/1475627678-20788-3-git-send-email-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-06-16 09:52:32 +02:00

652 changed files with 7944 additions and 2654 deletions

24

Documentation/ABI/testing/sysfs-devices-system-cpu

 @ -356,6 +356,7 @@ What:		/sys/devices/system/cpu/vulnerabilities
 		/sys/devices/system/cpu/vulnerabilities/spectre_v1
 		/sys/devices/system/cpu/vulnerabilities/spectre_v2
 		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
 		/sys/devices/system/cpu/vulnerabilities/l1tf
 Date:		January 2018
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:	Information about CPU vulnerabilities
 @ -367,3 +368,26 @@ Description:	Information about CPU vulnerabilities
 		"Not affected"	  CPU is not affected by the vulnerability
 		"Vulnerable"	  CPU is affected and no mitigation in effect
 		"Mitigation: $M"  CPU is affected and mitigation $M is in effect
 		Details about the l1tf file can be found in
 		Documentation/admin-guide/l1tf.rst
 What:		/sys/devices/system/cpu/smt
 		/sys/devices/system/cpu/smt/active
 		/sys/devices/system/cpu/smt/control
 Date:		June 2018
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:	Control Symetric Multi Threading (SMT)
 		active:  Tells whether SMT is active (enabled and siblings online)
 		control: Read/write interface to control SMT. Possible
 			 values:
 			 "on"		SMT is enabled
 			 "off"		SMT is disabled
 			 "forceoff"	SMT is force disabled. Cannot be changed.
 			 "notsupported" SMT is not supported by the CPU
 			 If control status is "forceoff" or "notsupported" writes
 			 are rejected.
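The SMT control interface documented in the hunk above can be read from userspace. A minimal sketch, assuming the sysfs paths exactly as listed in the hunk; `smt_control_is_writable` and `read_smt_state` are hypothetical helper names, and the files are absent on kernels without this interface.

```python
from pathlib import Path

SMT_DIR = Path("/sys/devices/system/cpu/smt")
# Per the description above, writes are rejected in these two states.
READ_ONLY_STATES = {"forceoff", "notsupported"}


def smt_control_is_writable(state):
    """True if a write to smt/control would not be rejected outright."""
    return state.strip() not in READ_ONLY_STATES


def read_smt_state(base=SMT_DIR):
    """Return (control, active) strings, or None if the interface is absent."""
    try:
        control = (base / "control").read_text().strip()
        active = (base / "active").read_text().strip()
    except OSError:
        return None
    return control, active


state = read_smt_state()
if state is None:
    print("SMT control interface not available")
else:
    print(state, smt_control_is_writable(state[0]))
```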

19

Documentation/Changes

 @ -33,7 +33,7 @@ GNU C                  3.2              gcc --version
 GNU make               3.80             make --version
 binutils               2.12             ld -v
 util-linux             2.10o            fdformat --version
 module-init-tools      0.9.10           depmod -V
 kmod                   13               depmod -V
 e2fsprogs              1.41.4           e2fsck -V
 jfsutils               1.1.3            fsck.jfs -V
 reiserfsprogs          3.6.3            reiserfsck -V
 @ -143,12 +143,6 @@ is not build with ``CONFIG_KALLSYMS`` and you have no way to rebuild and
 reproduce the Oops with that option, then you can still decode that Oops
 with ksymoops.
 Module-Init-Tools
 -----------------
 A new module loader is now in the kernel that requires ``module-init-tools``
 to use.  It is backward compatible with the 2.4.x series kernels.
 Mkinitrd
 --------
 @ -363,16 +357,17 @@ Util-linux
 - <ftp://ftp.kernel.org/pub/linux/utils/util-linux/>
 Kmod
 ----
 - <https://www.kernel.org/pub/linux/utils/kernel/kmod/>
 - <https://git.kernel.org/pub/scm/utils/kernel/kmod/kmod.git>
 Ksymoops
 --------
 - <ftp://ftp.kernel.org/pub/linux/utils/kernel/ksymoops/v2.4/>
 Module-Init-Tools
 -----------------
 - <ftp://ftp.kernel.org/pub/linux/kernel/people/rusty/modules/>
 Mkinitrd
 --------

1

Documentation/devicetree/bindings/net/dsa/b53.txt

 @ -10,6 +10,7 @@ Required properties:
       "brcm,bcm53128"
       "brcm,bcm5365"
       "brcm,bcm5395"
       "brcm,bcm5389"
       "brcm,bcm5397"
       "brcm,bcm5398"

23

Documentation/devicetree/bindings/net/dsa/qca8k.txt

 @ -2,7 +2,10 @@
 Required properties:
 - compatible: should be "qca,qca8337"
 - compatible: should be one of:
     "qca,qca8334"
     "qca,qca8337"
 - #size-cells: must be 0
 - #address-cells: must be 1
 @ -14,6 +17,20 @@ port and PHY id, each subnode describing a port needs to have a valid phandle
 referencing the internal PHY connected to it. The CPU port of this switch is
 always port 0.
 A CPU port node has the following optional node:
 - fixed-link            : Fixed-link subnode describing a link to a non-MDIO
                           managed entity. See
                           Documentation/devicetree/bindings/net/fixed-link.txt
                           for details.
 For QCA8K the 'fixed-link' sub-node supports only the following properties:
 - 'speed' (integer, mandatory), to indicate the link speed. Accepted
   values are 10, 100 and 1000
 - 'full-duplex' (boolean, optional), to indicate that full duplex is
   used. When absent, half duplex is assumed.
 Example:
 @ -53,6 +70,10 @@ Example:
 					label = "cpu";
 					ethernet = <&gmac1>;
 					phy-mode = "rgmii";
 					fixed-link {
 						speed = 1000;
 						full-duplex;
 					};
 				};
 				port@1 {
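The `fixed-link` constraints stated in the qca8k binding above (mandatory `speed` of 10, 100 or 1000; optional boolean `full-duplex`, with half duplex assumed when absent) can be expressed as a small validator. This is a sketch that models the subnode as a plain dict; `parse_fixed_link` is a hypothetical name and does not correspond to any kernel or device tree tooling API.

```python
ALLOWED_SPEEDS = (10, 100, 1000)


def parse_fixed_link(node):
    """Validate a dict modelling a 'fixed-link' subnode and return (speed, duplex)."""
    if "speed" not in node:
        raise ValueError("'speed' is mandatory")
    speed = node["speed"]
    if speed not in ALLOWED_SPEEDS:
        raise ValueError(f"unsupported speed {speed}")
    # A boolean property is either present (true) or absent (false).
    duplex = "full" if node.get("full-duplex", False) else "half"
    return speed, duplex


print(parse_fixed_link({"speed": 1000, "full-duplex": True}))
print(parse_fixed_link({"speed": 100}))
```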

1

Documentation/devicetree/bindings/net/meson-dwmac.txt

 @ -10,6 +10,7 @@ Required properties on all platforms:
 			- "amlogic,meson6-dwmac"
 			- "amlogic,meson8b-dwmac"
 			- "amlogic,meson-gxbb-dwmac"
 			- "amlogic,meson-axg-dwmac"
 		Additionally "snps,dwmac" and any applicable more
 		detailed version number described in net/stmmac.txt
 		should be used.

2

Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt

 @ -3,8 +3,10 @@
 Required properties for the root node:
  - compatible: one of "amlogic,meson8-cbus-pinctrl"
 		      "amlogic,meson8b-cbus-pinctrl"
 		      "amlogic,meson8m2-cbus-pinctrl"
 		      "amlogic,meson8-aobus-pinctrl"
 		      "amlogic,meson8b-aobus-pinctrl"
 		      "amlogic,meson8m2-aobus-pinctrl"
 		      "amlogic,meson-gxbb-periphs-pinctrl"
 		      "amlogic,meson-gxbb-aobus-pinctrl"
  - reg: address and size of registers controlling irq functionality

									
										1

Documentation/index.rst
									
												
 @ -12,6 +12,7 @@ Contents:
    :maxdepth: 2

    kernel-documentation
    l1tf
    development-process/index
    dev-tools/tools
    driver-api/index

95

Documentation/kernel-parameters.txt


 @ -2010,10 +2010,84 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			(virtualized real and unpaged mode) on capable
 			Intel chips. Default is 1 (enabled)
 	kvm-intel.vmentry_l1d_flush=[KVM,Intel] Mitigation for L1 Terminal Fault
 			CVE-2018-3620.
 			Valid arguments: never, cond, always
 			always: L1D cache flush on every VMENTER.
 			cond:	Flush L1D on VMENTER only when the code between
 				VMEXIT and VMENTER can leak host memory.
 			never:	Disables the mitigation
 			Default is cond (do L1 cache flush in specific instances)
 	kvm-intel.vpid=	[KVM,Intel] Disable Virtual Processor Identification
 			feature (tagged TLBs) on capable Intel chips.
 			Default is 1 (enabled)
 	l1tf=           [X86] Control mitigation of the L1TF vulnerability on
 			      affected CPUs
 			The kernel PTE inversion protection is unconditionally
 			enabled and cannot be disabled.
 			full
 				Provides all available mitigations for the
 				L1TF vulnerability. Disables SMT and
 				enables all mitigations in the
 				hypervisors, i.e. unconditional L1D flush.
 				SMT control and L1D flush control via the
 				sysfs interface is still possible after
 				boot.  Hypervisors will issue a warning
 				when the first VM is started in a
 				potentially insecure configuration,
 				i.e. SMT enabled or L1D flush disabled.
 			full,force
 				Same as 'full', but disables SMT and L1D
 				flush runtime control. Implies the
 				'nosmt=force' command line option.
 				(i.e. sysfs control of SMT is disabled.)
 			flush
 				Leaves SMT enabled and enables the default
 				hypervisor mitigation, i.e. conditional
 				L1D flush.
 				SMT control and L1D flush control via the
 				sysfs interface is still possible after
 				boot.  Hypervisors will issue a warning
 				when the first VM is started in a
 				potentially insecure configuration,
 				i.e. SMT enabled or L1D flush disabled.
 			flush,nosmt
 				Disables SMT and enables the default
 				hypervisor mitigation.
 				SMT control and L1D flush control via the
 				sysfs interface is still possible after
 				boot.  Hypervisors will issue a warning
 				when the first VM is started in a
 				potentially insecure configuration,
 				i.e. SMT enabled or L1D flush disabled.
 			flush,nowarn
 				Same as 'flush', but hypervisors will not
 				warn when a VM is started in a potentially
 				insecure configuration.
 			off
 				Disables hypervisor mitigations and doesn't
 				emit any warnings.
 			Default is 'flush'.
 			For details see: Documentation/admin-guide/l1tf.rst
 	l2cr=		[PPC]
 	l3cr=		[PPC]
 @ -2694,6 +2768,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	nosmt		[KNL,S390] Disable symmetric multithreading (SMT).
 			Equivalent to smt=1.
 			[KNL,x86] Disable symmetric multithreading (SMT).
 			nosmt=force: Force disable SMT, cannot be undone
 				     via the sysfs control file.
 	nospectre_v2	[X86] Disable all mitigations for the Spectre variant 2
 			(indirect branch prediction) vulnerability. System may
 			allow data leaks with this option, which is equivalent
 @ -4023,6 +4101,23 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	spia_pedr=
 	spia_peddr=
 	ssbd=		[ARM64,HW]
 			Speculative Store Bypass Disable control
 			On CPUs that are vulnerable to the Speculative
 			Store Bypass vulnerability and offer a
 			firmware based mitigation, this parameter
 			indicates how the mitigation should be used:
 			force-on:  Unconditionally enable mitigation for
 				   both kernel and userspace
 			force-off: Unconditionally disable mitigation for
 				   both kernel and userspace
 			kernel:    Always enable mitigation in the
 				   kernel, and offer a prctl interface
 				   to allow userspace to register its
 				   interest in being mitigated too.
 	stack_guard_gap=	[MM]
 			override the default stack gap protection. The value
 			is in page units and it defines how many pages prior

									
										610

Documentation/l1tf.rst
									
										Normal file
									
												
				@ -0,0 +1,610 @@

L1TF - L1 Terminal Fault
========================

L1 Terminal Fault is a hardware vulnerability which allows unprivileged
speculative access to data which is available in the Level 1 Data Cache
when the page table entry controlling the virtual address, which is used
for the access, has the Present bit cleared or other reserved bits set.

Affected processors
-------------------

This vulnerability affects a wide range of Intel processors. The
vulnerability is not present on:

   - Processors from AMD, Centaur and other non Intel vendors

   - Older processor models, where the CPU family is < 6

   - A range of Intel ATOM processors (Cedarview, Cloverview, Lincroft,
     Penwell, Pineview, Silvermont, Airmont, Merrifield)

   - The Intel XEON PHI family

   - Intel processors which have the ARCH_CAP_RDCL_NO bit set in the
     IA32_ARCH_CAPABILITIES MSR. If the bit is set the CPU is not affected
     by the Meltdown vulnerability either. These CPUs should become
     available by end of 2018.

Whether a processor is affected or not can be read out from the L1TF
vulnerability file in sysfs. See :ref:`l1tf_sys_info`.

Related CVEs
------------

The following CVE entries are related to the L1TF vulnerability:

   =============  =================  ==============================
   CVE-2018-3615  L1 Terminal Fault  SGX related aspects
   CVE-2018-3620  L1 Terminal Fault  OS, SMM related aspects
   CVE-2018-3646  L1 Terminal Fault  Virtualization related aspects
   =============  =================  ==============================

Problem
-------

If an instruction accesses a virtual address for which the relevant page
table entry (PTE) has the Present bit cleared or other reserved bits set,
then speculative execution ignores the invalid PTE and loads the referenced
data if it is present in the Level 1 Data Cache, as if the page referenced
by the address bits in the PTE was still present and accessible.

While this is a purely speculative mechanism and the instruction will raise
a page fault when it is retired eventually, the pure act of loading the
data and making it available to other speculative instructions opens up the
opportunity for side channel attacks to unprivileged malicious code,
similar to the Meltdown attack.

While Meltdown breaks the user space to kernel space protection, L1TF
allows to attack any physical memory address in the system and the attack
works across all protection domains. It allows an attack of SGX and also
works from inside virtual machines because the speculation bypasses the
extended page table (EPT) protection mechanism.

Attack scenarios
----------------

1. Malicious user space
^^^^^^^^^^^^^^^^^^^^^^^

   Operating Systems store arbitrary information in the address bits of a
   PTE which is marked non present. This allows a malicious user space
   application to attack the physical memory to which these PTEs resolve.
   In some cases user-space can maliciously influence the information
   encoded in the address bits of the PTE, thus making attacks more
   deterministic and more practical.

   The Linux kernel contains a mitigation for this attack vector, PTE
   inversion, which is permanently enabled and has no performance
   impact. The kernel ensures that the address bits of PTEs, which are not
   marked present, never point to cacheable physical memory space.

   A system with an up to date kernel is protected against attacks from
   malicious user space applications.

2. Malicious guest in a virtual machine
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   The fact that L1TF breaks all domain protections allows malicious guest
   OSes, which can control the PTEs directly, and malicious guest user
   space applications, which run on an unprotected guest kernel lacking the
   PTE inversion mitigation for L1TF, to attack physical host memory.

   A special aspect of L1TF in the context of virtualization is symmetric
   multi threading (SMT). The Intel implementation of SMT is called
   HyperThreading. The fact that Hyperthreads on the affected processors
   share the L1 Data Cache (L1D) is important for this. As the flaw allows
   only to attack data which is present in L1D, a malicious guest running
   on one Hyperthread can attack the data which is brought into the L1D by
   the context which runs on the sibling Hyperthread of the same physical
   core. This context can be host OS, host user space or a different guest.

   If the processor does not support Extended Page Tables, the attack is
   only possible, when the hypervisor does not sanitize the content of the
   effective (shadow) page tables.

   While solutions exist to mitigate these attack vectors fully, these
   mitigations are not enabled by default in the Linux kernel because they
   can affect performance significantly. The kernel provides several
   mechanisms which can be utilized to address the problem depending on the
   deployment scenario. The mitigations, their protection scope and impact
   are described in the next sections.

   The default mitigations and the rationale for choosing them are explained
   at the end of this document. See :ref:`default_mitigations`.

.. _l1tf_sys_info:

L1TF system information
-----------------------

The Linux kernel provides a sysfs interface to enumerate the current L1TF
status of the system: whether the system is vulnerable, and which
mitigations are active. The relevant sysfs file is:

/sys/devices/system/cpu/vulnerabilities/l1tf

The possible values in this file are:

  ===========================   ===============================
  'Not affected'                The processor is not vulnerable
  'Mitigation: PTE Inversion'   The host protection is active
  ===========================   ===============================

If KVM/VMX is enabled and the processor is vulnerable then the following
information is appended to the 'Mitigation: PTE Inversion' part:

  - SMT status:

    =====================  ================
    'VMX: SMT vulnerable'  SMT is enabled
    'VMX: SMT disabled'    SMT is disabled
    =====================  ================

  - L1D Flush mode:

    ================================  ====================================
    'L1D vulnerable'                  L1D flushing is disabled
    'L1D conditional cache flushes'   L1D flush is conditionally enabled
    'L1D cache flushes'               L1D flush is unconditionally enabled
    ================================  ====================================

The resulting grade of protection is discussed in the following sections.
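As an illustration of how the status file could be consumed, the following minimal sketch looks for the fragments listed in the tables above. It assumes that those fragments appear verbatim in the file; the exact separators between them are not specified here, so the sketch only does substring matching. The function names are hypothetical.

```python
from pathlib import Path

# Path taken from the text above.
L1TF_FILE = Path("/sys/devices/system/cpu/vulnerabilities/l1tf")

# Fragments from the "L1D Flush mode" table mapped to a short label.
L1D_FRAGMENTS = (
    ("L1D vulnerable", "disabled"),
    ("L1D conditional cache flushes", "conditional"),
    ("L1D cache flushes", "unconditional"),
)


def summarize_l1tf(status):
    """Summarize an L1TF status string by substring matching (assumption)."""
    status = status.strip()
    summary = {
        "affected": status != "Not affected",
        "pte_inversion": "Mitigation: PTE Inversion" in status,
        "smt": None,        # stays None when no SMT fragment is present
        "l1d_flush": None,  # stays None when no L1D fragment is present
    }
    if "SMT vulnerable" in status:
        summary["smt"] = "enabled"
    elif "SMT disabled" in status:
        summary["smt"] = "disabled"
    for fragment, mode in L1D_FRAGMENTS:
        if fragment in status:
            summary["l1d_flush"] = mode
            break
    return summary


def read_l1tf_status(path=L1TF_FILE):
    """Return the raw file content, or None when the file does not exist."""
    return path.read_text() if path.exists() else None
```

On a kernel without this interface `read_l1tf_status()` returns None, so callers should check for that before calling `summarize_l1tf()`.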

Host mitigation mechanism
-------------------------

The kernel is unconditionally protected against L1TF attacks from malicious
user space running on the host.

Guest mitigation mechanisms
---------------------------

.. _l1d_flush:

1. L1D flush on VMENTER
^^^^^^^^^^^^^^^^^^^^^^^

   To make sure that a guest cannot attack data which is present in the L1D
   the hypervisor flushes the L1D before entering the guest.

   Flushing the L1D evicts not only the data which should not be accessed
   by a potentially malicious guest, it also flushes the guest
   data. Flushing the L1D has a performance impact as the processor has to
   bring the flushed guest data back into the L1D. Depending on the
   frequency of VMEXIT/VMENTER and the type of computations in the guest
   performance degradation in the range of 1% to 50% has been observed. For
   scenarios where guest VMEXIT/VMENTER are rare the performance impact is
   minimal. Virtio and mechanisms like posted interrupts are designed to
   confine the VMEXITs to a bare minimum, but specific configurations and
   application scenarios might still suffer from a high VMEXIT rate.

   The kernel provides two L1D flush modes:

    - conditional ('cond')
    - unconditional ('always')

   The conditional mode avoids L1D flushing after VMEXITs which execute
   only audited code paths before the corresponding VMENTER. These code
   paths have been verified that they cannot expose secrets or other
   interesting data to an attacker, but they can leak information about the
   address space layout of the hypervisor.

   Unconditional mode flushes L1D on all VMENTER invocations and provides
   maximum protection. It has a higher overhead than the conditional
   mode. The overhead cannot be quantified correctly as it depends on the
   workload scenario and the resulting number of VMEXITs.

   The general recommendation is to enable L1D flush on VMENTER. The kernel
   defaults to conditional mode on affected processors.

   **Note**, that L1D flush does not prevent the SMT problem because the
   sibling thread will also bring back its data into the L1D which makes it
   attackable again.

   L1D flush can be controlled by the administrator via the kernel command
   line and sysfs control files. See :ref:`mitigation_control_command_line`
   and :ref:`mitigation_control_kvm`.

.. _guest_confinement:

2. Guest VCPU confinement to dedicated physical cores
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   To address the SMT problem, it is possible to make a guest or a group of
   guests affine to one or more physical cores. The proper mechanism for
   that is to utilize exclusive cpusets to ensure that no other guest or
   host tasks can run on these cores.

   If only a single guest or related guests run on sibling SMT threads on
   the same physical core then they can only attack their own memory and
   restricted parts of the host memory.

   Host memory is attackable, when one of the sibling SMT threads runs in
   host OS (hypervisor) context and the other in guest context. The amount
   of valuable information from the host OS context depends on the context
   which the host OS executes, i.e. interrupts, soft interrupts and kernel
   threads. The amount of valuable data from these contexts cannot be
   declared as non-interesting for an attacker without deep inspection of
   the code.

   **Note**, that assigning guests to a fixed set of physical cores affects
   the ability of the scheduler to do load balancing and might have
   negative effects on CPU utilization depending on the hosting
   scenario. Disabling SMT might be a viable alternative for particular
   scenarios.

   For further information about confining guests to a single or to a group
   of cores consult the cpusets documentation:

   https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt

.. _interrupt_isolation:

3. Interrupt affinity
^^^^^^^^^^^^^^^^^^^^^

   Interrupts can be made affine to logical CPUs. This is not universally
   true because there are types of interrupts which are truly per CPU
   interrupts, e.g. the local timer interrupt. Aside of that multi queue
   devices affine their interrupts to single CPUs or groups of CPUs per
   queue without allowing the administrator to control the affinities.

   Moving the interrupts, which can be affinity controlled, away from CPUs
   which run untrusted guests, reduces the attack vector space.

   Whether the interrupts which are affine to CPUs, which run untrusted
   guests, provide interesting data for an attacker depends on the system
   configuration and the scenarios which run on the system. While for some
   of the interrupts it can be assumed that they won't expose interesting
   information beyond exposing hints about the host OS memory layout, there
   is no way to make general assumptions.

   Interrupt affinity can be controlled by the administrator via the
   /proc/irq/$NR/smp_affinity[_list] files. Limited documentation is
   available at:

   https://www.kernel.org/doc/Documentation/IRQ-affinity.txt
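The smp_affinity file takes a hexadecimal CPU bitmask, written as comma separated 32-bit groups on larger systems. A minimal sketch of building such a mask from a list of CPU numbers follows; the function name is hypothetical, and writing the result to /proc/irq/$NR/smp_affinity is left to the administrator.

```python
def cpus_to_affinity_mask(cpus):
    """Build a hex bitmask string with comma separated 32-bit groups."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must not be negative")
        mask |= 1 << cpu

    # Split the mask into 32-bit groups, least significant group first.
    groups = []
    while True:
        groups.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if mask == 0:
            break
    groups.reverse()

    # The most significant group is unpadded, the others are zero padded.
    parts = [format(groups[0], "x")]
    parts += [format(group, "08x") for group in groups[1:]]
    return ",".join(parts)
```

For example, keeping an interrupt on CPUs 0 and 1 only corresponds to the mask returned by `cpus_to_affinity_mask([0, 1])`. The smp_affinity_list variant takes a plain CPU list such as 0-1 instead.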

.. _smt_control:

4. SMT control
^^^^^^^^^^^^^^

   To prevent the SMT issues of L1TF it might be necessary to disable SMT
   completely. Disabling SMT can have a significant performance impact, but
   the impact depends on the hosting scenario and the type of workloads.
   The impact of disabling SMT also needs to be weighed against the impact
   of other mitigation solutions like confining guests to dedicated cores.

   The kernel provides a sysfs interface to retrieve the status of SMT and
   to control it. It also provides a kernel command line interface to
   control SMT.

   The kernel command line interface consists of the following options:

     =========== ==========================================================
     nosmt       Affects the bring up of the secondary CPUs during boot. The
                 kernel tries to bring all present CPUs online during the
                 boot process. "nosmt" makes sure that from each physical
                 core only one - the so called primary (hyper) thread is
                 activated. Due to a design flaw of Intel processors related
                 to Machine Check Exceptions the non primary siblings have
                 to be brought up at least partially and are then shut down
                 again.  "nosmt" can be undone via the sysfs interface.

     nosmt=force Has the same effect as "nosmt" but it does not allow to
                 undo the SMT disable via the sysfs interface.
     =========== ==========================================================

   The sysfs interface provides two files:

   - /sys/devices/system/cpu/smt/control
   - /sys/devices/system/cpu/smt/active

   /sys/devices/system/cpu/smt/control:

     This file allows to read out the SMT control state and provides the
     ability to disable or (re)enable SMT. The possible states are:

        ==============  ===================================================
        on              SMT is supported by the CPU and enabled. All
                        logical CPUs can be onlined and offlined without
                        restrictions.

        off             SMT is supported by the CPU and disabled. Only
                        the so called primary SMT threads can be onlined
                        and offlined without restrictions. An attempt to
                        online a non-primary sibling is rejected

        forceoff        Same as 'off' but the state cannot be controlled.
                        Attempts to write to the control file are rejected.

        notsupported    The processor does not support SMT. It's therefore
                        not affected by the SMT implications of L1TF.
                        Attempts to write to the control file are rejected.
        ==============  ===================================================

     The possible states which can be written into this file to control SMT
     state are:

     - on
     - off
     - forceoff

   /sys/devices/system/cpu/smt/active:

     This file reports whether SMT is enabled and active, i.e. if on any
     physical core two or more sibling threads are online.

   SMT control is also possible at boot time via the l1tf kernel command
   line parameter in combination with L1D flush control. See
   :ref:`mitigation_control_command_line`.
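The rules for the smt/control file described above can be sketched as a small pre-check that a management script might run before attempting a write. The function names are hypothetical; the kernel remains the final authority and may still reject a write.

```python
from pathlib import Path

# Path taken from the text above.
SMT_CONTROL = Path("/sys/devices/system/cpu/smt/control")

WRITABLE_STATES = ("on", "off", "forceoff")   # values accepted on write
LOCKED_STATES = ("forceoff", "notsupported")  # writes are rejected in these


def smt_write_allowed(current, requested):
    """Return True if writing 'requested' is expected to be accepted."""
    if requested not in WRITABLE_STATES:
        return False
    return current not in LOCKED_STATES


def read_smt_control(path=SMT_CONTROL):
    """Return the current control state, or None if the file is missing."""
    return path.read_text().strip() if path.exists() else None
```

A script would call `read_smt_control()` first and only write to the file when `smt_write_allowed()` returns True for the desired state.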

5. Disabling EPT
^^^^^^^^^^^^^^^^

  Disabling EPT for virtual machines provides full mitigation for L1TF even
  with SMT enabled, because the effective page tables for guests are
  managed and sanitized by the hypervisor. Though disabling EPT has a
  significant performance impact especially when the Meltdown mitigation
  KPTI is enabled.

  EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter.

There is ongoing research and development for new mitigation mechanisms to
address the performance impact of disabling SMT or EPT.

.. _mitigation_control_command_line:

Mitigation control on the kernel command line
---------------------------------------------

The kernel command line allows to control the L1TF mitigations at boot
time with the option "l1tf=". The valid arguments for this option are:

  ============  =============================================================
  full          Provides all available mitigations for the L1TF
                vulnerability. Disables SMT and enables all mitigations in
                the hypervisors, i.e. unconditional L1D flushing

                SMT control and L1D flush control via the sysfs interface
                is still possible after boot.  Hypervisors will issue a
                warning when the first VM is started in a potentially
                insecure configuration, i.e. SMT enabled or L1D flush
                disabled.

  full,force    Same as 'full', but disables SMT and L1D flush runtime
                control. Implies the 'nosmt=force' command line option.
                (i.e. sysfs control of SMT is disabled.)

  flush         Leaves SMT enabled and enables the default hypervisor
                mitigation, i.e. conditional L1D flushing

                SMT control and L1D flush control via the sysfs interface
                is still possible after boot.  Hypervisors will issue a
                warning when the first VM is started in a potentially
                insecure configuration, i.e. SMT enabled or L1D flush
                disabled.

  flush,nosmt   Disables SMT and enables the default hypervisor mitigation,
                i.e. conditional L1D flushing.

                SMT control and L1D flush control via the sysfs interface
                is still possible after boot.  Hypervisors will issue a
                warning when the first VM is started in a potentially
                insecure configuration, i.e. SMT enabled or L1D flush
                disabled.

  flush,nowarn  Same as 'flush', but hypervisors will not warn when a VM is
                started in a potentially insecure configuration.

  off           Disables hypervisor mitigations and doesn't emit any
                warnings.
  ============  =============================================================

The default is 'flush'. For details about L1D flushing see :ref:`l1d_flush`.
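The table above can be condensed into a lookup structure, which may help when auditing boot configurations. The sketch below is an illustration only: the names are hypothetical, and the 'runtime_control' value for 'off' is left as None because the table does not state it.

```python
from collections import namedtuple

# smt_disabled:    SMT is disabled at boot
# l1d_flush:       hypervisor L1D flush mode ('always', 'cond' or 'never')
# warn:            hypervisors warn about potentially insecure setups
# runtime_control: sysfs control stays possible (None: not stated above)
L1tfBoot = namedtuple("L1tfBoot", "smt_disabled l1d_flush warn runtime_control")

L1TF_OPTIONS = {
    "full":         L1tfBoot(True,  "always", True,  True),
    "full,force":   L1tfBoot(True,  "always", True,  False),
    "flush":        L1tfBoot(False, "cond",   True,  True),
    "flush,nosmt":  L1tfBoot(True,  "cond",   True,  True),
    "flush,nowarn": L1tfBoot(False, "cond",   False, True),
    "off":          L1tfBoot(False, "never",  False, None),
}

DEFAULT_OPTION = "flush"


def l1tf_option_from_cmdline(cmdline):
    """Return the l1tf= argument from a command line, or the default."""
    option = DEFAULT_OPTION
    for token in cmdline.split():
        if token.startswith("l1tf="):
            option = token[len("l1tf="):]  # the last occurrence wins here
    if option not in L1TF_OPTIONS:
        raise ValueError("unknown l1tf option: " + option)
    return option
```

For instance, `L1TF_OPTIONS[l1tf_option_from_cmdline("quiet l1tf=flush,nosmt")]` describes a boot with SMT disabled and conditional L1D flushing. Letting the last l1tf= token win is a choice of this sketch, not a statement about kernel parsing.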

.. _mitigation_control_kvm:

Mitigation control for KVM - module parameter
-------------------------------------------------------------

The KVM hypervisor mitigation mechanism, flushing the L1D cache when
entering a guest, can be controlled with a module parameter.

The option/parameter is "kvm-intel.vmentry_l1d_flush=". It takes the
following arguments:

  ============  ==============================================================
  always        L1D cache flush on every VMENTER.

  cond          Flush L1D on VMENTER only when the code between VMEXIT and
                VMENTER can leak host memory which is considered
                interesting for an attacker. This still can leak host memory
                which allows e.g. to determine the hosts address space layout.

  never         Disables the mitigation
  ============  ==============================================================

The parameter can be provided on the kernel command line, as a module
parameter when loading the modules and at runtime modified via the sysfs
file:

/sys/module/kvm_intel/parameters/vmentry_l1d_flush

The default is 'cond'. If 'l1tf=full,force' is given on the kernel command
line, then 'always' is enforced and the kvm-intel.vmentry_l1d_flush
module parameter is ignored and writes to the sysfs file are rejected.
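The interaction between 'l1tf=' and 'kvm-intel.vmentry_l1d_flush=' can be sketched as follows. Only the 'full,force' rule and the 'cond' default are stated above; that an explicitly given module parameter takes precedence over the mode implied by the other 'l1tf=' arguments is an assumption of this sketch, and the function name is hypothetical.

```python
FLUSH_MODES = ("always", "cond", "never")


def effective_l1d_flush(l1tf_option="flush", module_param=None):
    """Return the expected vmentry_l1d_flush mode for a boot configuration."""
    # Stated above: 'full,force' enforces 'always' and ignores the parameter.
    if l1tf_option == "full,force":
        return "always"
    # Assumption: an explicit module parameter overrides the l1tf= default.
    if module_param is not None:
        if module_param not in FLUSH_MODES:
            raise ValueError("unknown flush mode: " + module_param)
        return module_param
    # Modes implied by the l1tf= arguments; every 'flush' variant is 'cond'.
    return {"full": "always", "off": "never"}.get(l1tf_option, "cond")
```

The value actually in effect can be read back from /sys/module/kvm_intel/parameters/vmentry_l1d_flush.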

Mitigation selection guide
--------------------------

1. No virtualization in use
^^^^^^^^^^^^^^^^^^^^^^^^^^^

   The system is protected by the kernel unconditionally and no further
   action is required.

2. Virtualization with trusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   If the guest comes from a trusted source and the guest OS kernel is
   guaranteed to have the L1TF mitigations in place the system is fully
   protected against L1TF and no further action is required.

   To avoid the overhead of the default L1D flushing on VMENTER the
   administrator can disable the flushing via the kernel command line and
   sysfs control files. See :ref:`mitigation_control_command_line` and
   :ref:`mitigation_control_kvm`.

3. Virtualization with untrusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

3.1. SMT not supported or disabled
""""""""""""""""""""""""""""""""""

  If SMT is not supported by the processor or disabled in the BIOS or by
  the kernel, it's only required to enforce L1D flushing on VMENTER.

  Conditional L1D flushing is the default behaviour and can be tuned. See
  :ref:`mitigation_control_command_line` and :ref:`mitigation_control_kvm`.

3.2. EPT not supported or disabled
""""""""""""""""""""""""""""""""""

  If EPT is not supported by the processor or disabled in the hypervisor,
  the system is fully protected. SMT can stay enabled and L1D flushing on
  VMENTER is not required.

  EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter.

3.3. SMT and EPT supported and active
"""""""""""""""""""""""""""""""""""""

  If SMT and EPT are supported and active then various degrees of
  mitigations can be employed:

  - L1D flushing on VMENTER:

    L1D flushing on VMENTER is the minimal protection requirement, but it
    is only potent in combination with other mitigation methods.

    Conditional L1D flushing is the default behaviour and can be tuned. See
    :ref:`mitigation_control_command_line` and :ref:`mitigation_control_kvm`.

  - Guest confinement:

    Confinement of guests to a single or a group of physical cores which
    are not running any other processes, can reduce the attack surface
    significantly, but interrupts, soft interrupts and kernel threads can
    still expose valuable data to a potential attacker. See
    :ref:`guest_confinement`.

  - Interrupt isolation:

    Isolating the guest CPUs from interrupts can reduce the attack surface
    further, but still allows a malicious guest to explore a limited amount
    of host physical memory. This can at least be used to gain knowledge
    about the host address space layout. The interrupts which have a fixed
    affinity to the CPUs which run the untrusted guests can depending on
    the scenario still trigger soft interrupts and schedule kernel threads
    which might expose valuable information. See
    :ref:`interrupt_isolation`.

The above three mitigation methods combined can provide protection to a
certain degree, but the risk of the remaining attack surface has to be
carefully analyzed. For full protection the following methods are
available:

  - Disabling SMT:

    Disabling SMT and enforcing the L1D flushing provides the maximum
    amount of protection. This mitigation is not depending on any of the
    above mitigation methods.

    SMT control and L1D flushing can be tuned by the command line
    parameters 'nosmt', 'l1tf', 'kvm-intel.vmentry_l1d_flush' and at run
    time with the matching sysfs control files. See :ref:`smt_control`,
    :ref:`mitigation_control_command_line` and
    :ref:`mitigation_control_kvm`.

  - Disabling EPT:

    Disabling EPT provides the maximum amount of protection as well. It is
    not depending on any of the above mitigation methods. SMT can stay
    enabled and L1D flushing is not required, but the performance impact is
    significant.

    EPT can be disabled in the hypervisor via the 'kvm-intel.ept'
    parameter.
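The selection guide up to this point can be summarized as a small decision helper. This is a simplified sketch of the guidance above, not a replacement for a risk analysis; the function name, its parameters and the wording of the returned strings are hypothetical.

```python
def l1tf_guidance(virtualization, guests_trusted=False,
                  smt_active=True, ept_active=True):
    """Return the guidance from sections 1 to 3.3 as a list of hints."""
    if not virtualization:
        return ["no further action required"]
    if guests_trusted:
        # Assumes the guest kernels carry the L1TF mitigations themselves.
        return ["no further action required",
                "L1D flush on VMENTER may be disabled to avoid overhead"]
    if not ept_active:
        return ["fully protected: SMT can stay enabled, "
                "L1D flush on VMENTER is not required"]
    if not smt_active:
        return ["enforce L1D flush on VMENTER"]
    # SMT and EPT both active with untrusted guests.
    return [
        "partial: L1D flush on VMENTER plus guest confinement "
        "and interrupt isolation",
        "full: disable SMT and enforce L1D flush on VMENTER",
        "full: disable EPT",
    ]
```

The last case returns both the partial and the full options because the choice between them depends on the deployment scenario.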

3.4. Nested virtual machines
""""""""""""""""""""""""""""

When nested virtualization is in use, three operating systems are involved:
the bare metal hypervisor, the nested hypervisor and the nested virtual
machine.  VMENTER operations from the nested hypervisor into the nested
guest will always be processed by the bare metal hypervisor. If KVM is the
bare metal hypervisor it will:

 - Flush the L1D cache on every switch from the nested hypervisor to the
   nested virtual machine, so that the nested hypervisor's secrets are not
   exposed to the nested virtual machine;

 - Flush the L1D cache on every switch from the nested virtual machine to
   the nested hypervisor; this is a complex operation, and flushing the L1D
   cache avoids that the bare metal hypervisor's secrets are exposed to the
   nested virtual machine;

 - Instruct the nested hypervisor to not perform any L1D cache flush. This
   is an optimization to avoid double L1D flushing.

.. _default_mitigations:

Default mitigations
-------------------

  The kernel default mitigations for vulnerable processors are:

  - PTE inversion to protect against malicious user space. This is done
    unconditionally and cannot be controlled.

  - L1D conditional flushing on VMENTER when EPT is enabled for
    a guest.

  The kernel does not by default enforce the disabling of SMT, which leaves
  SMT systems vulnerable when running untrusted guests with EPT enabled.

  The rationale for this choice is:

  - Force disabling SMT can break existing setups, especially with
    unattended updates.

  - If regular users run untrusted guests on their machine, then L1TF is
    just an add on to other malware which might be embedded in an untrusted
    guest, e.g. spam-bots or attacks on the local network.

    There is no technical way to prevent a user from running untrusted code
    on their machines blindly.

  - It's technically extremely unlikely and from today's knowledge even
    impossible that L1TF can be exploited via the most popular attack
    mechanisms like JavaScript because these mechanisms have no way to
    control PTEs. If this were possible and no other mitigation were
    possible, then the default might be different.

  - The administrators of cloud and hosting setups have to carefully
    analyze the risk for their scenarios and make the appropriate
    mitigation choices, which might even vary across their deployed
    machines and also result in other changes of their overall setup.
    There is no way for the kernel to provide a sensible default for this
    kind of scenario.

3

Documentation/printk-formats.txt


 @ -279,11 +279,10 @@ struct clk:
 	%pC	pll1
 	%pCn	pll1
 	%pCr	1560000000
 	For printing struct clk structures. '%pC' and '%pCn' print the name
 	(Common Clock Framework) or address (legacy clock framework) of the
 	structure; '%pCr' prints the current clock rate.
 	structure.
 	Passed by reference.

40

Documentation/virtual/kvm/api.txt


 @ -122,14 +122,15 @@ KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as
 privileged user (CAP_SYS_ADMIN).
 4.3 KVM_GET_MSR_INDEX_LIST
 4.3 KVM_GET_MSR_INDEX_LIST, KVM_GET_MSR_FEATURE_INDEX_LIST
 Capability: basic
 Capability: basic, KVM_CAP_GET_MSR_FEATURES for KVM_GET_MSR_FEATURE_INDEX_LIST
 Architectures: x86
 Type: system
 Type: system ioctl
 Parameters: struct kvm_msr_list (in/out)
 Returns: 0 on success; -1 on error
 Errors:
   EFAULT:    the msr index list cannot be read from or written to
   E2BIG:     the msr index list is to be to fit in the array specified by
              the user.
 @ -138,16 +139,23 @@ struct kvm_msr_list {
 	__u32 indices[0];
 };
 This ioctl returns the guest msrs that are supported.  The list varies
 by kvm version and host processor, but does not change otherwise.  The
 user fills in the size of the indices array in nmsrs, and in return
 kvm adjusts nmsrs to reflect the actual number of msrs and fills in
 the indices array with their numbers.
 The user fills in the size of the indices array in nmsrs, and in return
 kvm adjusts nmsrs to reflect the actual number of msrs and fills in the
 indices array with their numbers.
 KVM_GET_MSR_INDEX_LIST returns the guest msrs that are supported.  The list
 varies by kvm version and host processor, but does not change otherwise.
 Note: if kvm indicates supports MCE (KVM_CAP_MCE), then the MCE bank MSRs are
 not returned in the MSR list, as different vcpus can have a different number
 of banks, as set via the KVM_X86_SETUP_MCE ioctl.
 KVM_GET_MSR_FEATURE_INDEX_LIST returns the list of MSRs that can be passed
 to the KVM_GET_MSRS system ioctl.  This lets userspace probe host capabilities
 and processor features that are exposed via MSRs (e.g., VMX capabilities).
 This list also varies by kvm version and host processor, but does not change
 otherwise.
 4.4 KVM_CHECK_EXTENSION
 @ -474,14 +482,22 @@ Support for this has been removed.  Use KVM_SET_GUEST_DEBUG instead.
 4.18 KVM_GET_MSRS
 Capability: basic
 Capability: basic (vcpu), KVM_CAP_GET_MSR_FEATURES (system)
 Architectures: x86
 Type: vcpu ioctl
 Type: system ioctl, vcpu ioctl
 Parameters: struct kvm_msrs (in/out)
 Returns: 0 on success, -1 on error
 Returns: number of msrs successfully returned;
         -1 on error
 When used as a system ioctl:
 Reads the values of MSR-based features that are available for the VM.  This
 is similar to KVM_GET_SUPPORTED_CPUID, but it returns MSR indices and values.
 The list of msr-based features can be obtained using KVM_GET_MSR_FEATURE_INDEX_LIST
 in a system ioctl.
 When used as a vcpu ioctl:
 Reads model-specific registers from the vcpu.  Supported msr indices can
 be obtained using KVM_GET_MSR_INDEX_LIST.
 be obtained using KVM_GET_MSR_INDEX_LIST in a system ioctl.
 struct kvm_msrs {
 	__u32 nmsrs; /* number of msrs in entries */

									
Makefile
												
				@ -1,6 +1,6 @@

				VERSION = 4

				PATCHLEVEL = 9

				SUBLEVEL = 108

				SUBLEVEL = 122

				EXTRAVERSION =

				NAME = Roaring Lionus

				@ -417,7 +417,8 @@ export MAKE AWK GENKSYMS INSTALLKERNEL PERL PYTHON UTS_MACHINE

				export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS

				export KBUILD_CPPFLAGS NOSTDINC_FLAGS LINUXINCLUDE OBJCOPYFLAGS LDFLAGS

				export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_KASAN CFLAGS_UBSAN

				export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE

				export CFLAGS_KASAN CFLAGS_KASAN_NOSANITIZE CFLAGS_UBSAN

				export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE

				export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE

				export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL

				@ -635,6 +636,7 @@ KBUILD_CFLAGS	+= $(call cc-disable-warning,frame-address,)

				KBUILD_CFLAGS	+= $(call cc-disable-warning, format-truncation)

				KBUILD_CFLAGS	+= $(call cc-disable-warning, format-overflow)

				KBUILD_CFLAGS	+= $(call cc-disable-warning, int-in-bool-context)

				KBUILD_CFLAGS	+= $(call cc-disable-warning, attribute-alias)

				ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION

				KBUILD_CFLAGS	+= $(call cc-option,-ffunction-sections,)

arch/Kconfig

 @ -5,6 +5,9 @@
 config KEXEC_CORE
 	bool
 config HOTPLUG_SMT
 	bool
 config OPROFILE
 	tristate "OProfile system profiling"
 	depends on PROFILING

arch/arc/configs/axs101_defconfig

 @ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
 # CONFIG_UTS_NS is not set
 # CONFIG_PID_NS is not set
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE="../arc_initramfs/"
 CONFIG_EMBEDDED=y
 CONFIG_PERF_EVENTS=y
 # CONFIG_VM_EVENT_COUNTERS is not set

arch/arc/configs/axs103_defconfig

 @ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
 # CONFIG_UTS_NS is not set
 # CONFIG_PID_NS is not set
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE="../../arc_initramfs_hs/"
 CONFIG_EMBEDDED=y
 CONFIG_PERF_EVENTS=y
 # CONFIG_VM_EVENT_COUNTERS is not set

arch/arc/configs/axs103_smp_defconfig

 @ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
 # CONFIG_UTS_NS is not set
 # CONFIG_PID_NS is not set
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE="../../arc_initramfs_hs/"
 CONFIG_EMBEDDED=y
 CONFIG_PERF_EVENTS=y
 # CONFIG_VM_EVENT_COUNTERS is not set

arch/arc/configs/nsim_700_defconfig

 @ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
 # CONFIG_UTS_NS is not set
 # CONFIG_PID_NS is not set
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE="../arc_initramfs/"
 CONFIG_KALLSYMS_ALL=y
 CONFIG_EMBEDDED=y
 CONFIG_PERF_EVENTS=y

arch/arc/configs/nsim_hs_defconfig

 @ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
 # CONFIG_UTS_NS is not set
 # CONFIG_PID_NS is not set
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE="../../arc_initramfs_hs/"
 CONFIG_KALLSYMS_ALL=y
 CONFIG_EMBEDDED=y
 CONFIG_PERF_EVENTS=y

arch/arc/configs/nsim_hs_smp_defconfig

 @ -9,7 +9,6 @@ CONFIG_NAMESPACES=y
 # CONFIG_UTS_NS is not set
 # CONFIG_PID_NS is not set
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE="../arc_initramfs_hs/"
 CONFIG_KALLSYMS_ALL=y
 CONFIG_EMBEDDED=y
 CONFIG_PERF_EVENTS=y

arch/arc/configs/nsimosci_defconfig

 @ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
 # CONFIG_UTS_NS is not set
 # CONFIG_PID_NS is not set
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE="../arc_initramfs/"
 CONFIG_KALLSYMS_ALL=y
 CONFIG_EMBEDDED=y
 CONFIG_PERF_EVENTS=y

arch/arc/configs/nsimosci_hs_defconfig

 @ -11,7 +11,6 @@ CONFIG_NAMESPACES=y
 # CONFIG_UTS_NS is not set
 # CONFIG_PID_NS is not set
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE="../arc_initramfs_hs/"
 CONFIG_KALLSYMS_ALL=y
 CONFIG_EMBEDDED=y
 CONFIG_PERF_EVENTS=y

arch/arc/configs/nsimosci_hs_smp_defconfig

 @ -9,7 +9,6 @@ CONFIG_IKCONFIG_PROC=y
 # CONFIG_UTS_NS is not set
 # CONFIG_PID_NS is not set
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_INITRAMFS_SOURCE="../arc_initramfs_hs/"
 CONFIG_PERF_EVENTS=y
 # CONFIG_COMPAT_BRK is not set
 CONFIG_KPROBES=y

									
arch/arc/include/asm/page.h
												
				@ -105,7 +105,7 @@ typedef pte_t * pgtable_t;

				#define virt_addr_valid(kaddr)  pfn_valid(virt_to_pfn(kaddr))

				/* Default Permissions for stack/heaps pages (Non Executable) */

				#define VM_DATA_DEFAULT_FLAGS   (VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE)

				#define VM_DATA_DEFAULT_FLAGS   (VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)

				#define WANT_PAGE_VIRTUAL   1

									
arch/arc/include/asm/pgtable.h
												
				@ -378,7 +378,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address,

				/* Decode a PTE containing swap "identifier "into constituents */

				#define __swp_type(pte_lookalike)	(((pte_lookalike).val) & 0x1f)

				#define __swp_offset(pte_lookalike)	((pte_lookalike).val << 13)

				#define __swp_offset(pte_lookalike)	((pte_lookalike).val >> 13)

				/* NOPs, to keep generic kernel happy */

				#define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) })
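
The `__swp_offset` change in this hunk replaces a left shift with a right shift. A minimal sketch of why that matters, assuming the swap entry is packed with a 5-bit type in the low bits and the offset stored from bit 13 upward. The `swp_pack` helper and the function names below are hypothetical stand-ins for illustration, not the kernel's macros:

```c
#include <stdint.h>

/* Hypothetical packing, assumed to mirror the layout implied by the
 * decode macros in the hunk: type in bits 0..4, offset from bit 13 up. */
static inline uint32_t swp_pack(uint32_t type, uint32_t offset)
{
	return (type & 0x1f) | (offset << 13);
}

/* Decode the type: low five bits, same mask as __swp_type in the hunk. */
static inline uint32_t swp_type(uint32_t val)
{
	return val & 0x1f;
}

/* Old decode: shifts left again, so the packed offset is not recovered. */
static inline uint32_t swp_offset_old(uint32_t val)
{
	return val << 13;
}

/* Fixed decode: undoes the left shift that was applied when packing. */
static inline uint32_t swp_offset_new(uint32_t val)
{
	return val >> 13;
}
```

Under that assumed layout, only the right-shifting decode round-trips an offset that went through `swp_pack`.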

arch/arm/boot/dts/emev2.dtsi

 @ -30,13 +30,13 @@
 		#address-cells = <1>;
 		#size-cells = <0>;
 		cpu@0 {
 		cpu0: cpu@0 {
 			device_type = "cpu";
 			compatible = "arm,cortex-a9";
 			reg = <0>;
 			clock-frequency = <533000000>;
 		};
 		cpu@1 {
 		cpu1: cpu@1 {
 			device_type = "cpu";
 			compatible = "arm,cortex-a9";
 			reg = <1>;
 @ -56,6 +56,7 @@
 		compatible = "arm,cortex-a9-pmu";
 		interrupts = <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_SPI 121 IRQ_TYPE_LEVEL_HIGH>;
 		interrupt-affinity = <&cpu0>, <&cpu1>;
 	};
 	clocks@e0110000 {

arch/arm/boot/dts/imx6q.dtsi

 @ -96,7 +96,7 @@
 					clocks = <&clks IMX6Q_CLK_ECSPI5>,
 						 <&clks IMX6Q_CLK_ECSPI5>;
 					clock-names = "ipg", "per";
 					dmas = <&sdma 11 7 1>, <&sdma 12 7 2>;
 					dmas = <&sdma 11 8 1>, <&sdma 12 8 2>;
 					dma-names = "rx", "tx";
 					status = "disabled";
 				};

arch/arm/boot/dts/imx6sx.dtsi

 @ -1280,7 +1280,7 @@
 				  /* non-prefetchable memory */
 				  0x82000000 0 0x08000000 0x08000000 0 0x00f00000>;
 			num-lanes = <1>;
 			interrupts = <GIC_SPI 123 IRQ_TYPE_LEVEL_HIGH>;
 			interrupts = <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>;
 			clocks = <&clks IMX6SX_CLK_PCIE_REF_125M>,
 				 <&clks IMX6SX_CLK_PCIE_AXI>,
 				 <&clks IMX6SX_CLK_LVDS1_OUT>,

arch/arm/boot/dts/sh73a0.dtsi

 @ -22,7 +22,7 @@
 		#address-cells = <1>;
 		#size-cells = <0>;
 		cpu@0 {
 		cpu0: cpu@0 {
 			device_type = "cpu";
 			compatible = "arm,cortex-a9";
 			reg = <0>;
 @ -30,7 +30,7 @@
 			power-domains = <&pd_a2sl>;
 			next-level-cache = <&L2>;
 		};
 		cpu@1 {
 		cpu1: cpu@1 {
 			device_type = "cpu";
 			compatible = "arm,cortex-a9";
 			reg = <1>;
 @ -89,6 +89,7 @@
 		compatible = "arm,cortex-a9-pmu";
 		interrupts = <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>,
 			     <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>;
 		interrupt-affinity = <&cpu0>, <&cpu1>;
 	};
 	cmt1: timer@e6138000 {

									
arch/arm/include/asm/kgdb.h
												
				@ -76,7 +76,7 @@ extern int kgdb_fault_expected;

				#define KGDB_MAX_NO_CPUS	1

				#define BUFMAX			400

				#define NUMREGBYTES		(DBG_MAX_REG_NUM << 2)

				#define NUMREGBYTES		(GDB_MAX_REGS << 2)

				#define NUMCRITREGBYTES		(32 << 2)

				#define _R0			0

									
arch/arm/include/asm/kvm_host.h
												
				@ -327,4 +327,16 @@ static inline bool kvm_arm_harden_branch_predictor(void)

					return false;

				}

				#define KVM_SSBD_UNKNOWN		-1

				#define KVM_SSBD_FORCE_DISABLE		0

				#define KVM_SSBD_KERNEL		1

				#define KVM_SSBD_FORCE_ENABLE		2

				#define KVM_SSBD_MITIGATED		3

				static inline int kvm_arm_have_ssbd(void)

				{

					/* No way to detect it yet, pretend it is not there. */

					return KVM_SSBD_UNKNOWN;

				}

				#endif /* __ARM_KVM_HOST_H__ */

									
arch/arm/include/asm/kvm_mmu.h
												
				@ -28,6 +28,13 @@

				 */

				#define kern_hyp_va(kva)	(kva)

				/* Contrary to arm64, there is no need to generate a PC-relative address */

				#define hyp_symbol_addr(s)						\

					({								\

						typeof(s) *addr = &(s);					\

						addr;							\

					})

				/*

				 * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation levels.

				 */

				@ -249,6 +256,11 @@ static inline int kvm_map_vectors(void)

					return 0;

				}

				static inline int hyp_map_aux_data(void)

				{

					return 0;

				}

				#endif	/* !__ASSEMBLY__ */

				#endif /* __ARM_KVM_MMU_H__ */

									
arch/arm/kvm/arm.c
												
				@ -51,8 +51,8 @@

				__asm__(".arch_extension	virt");

				#endif

				DEFINE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);

				static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);

				static kvm_cpu_context_t __percpu *kvm_host_cpu_state;

				static unsigned long hyp_default_vectors;

				/* Per-CPU variable containing the currently running vcpu. */

				@ -338,7 +338,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)

					}

					vcpu->cpu = cpu;

					vcpu->arch.host_cpu_context = this_cpu_ptr(kvm_host_cpu_state);

					vcpu->arch.host_cpu_context = this_cpu_ptr(&kvm_host_cpu_state);

					kvm_arm_set_running_vcpu(vcpu);

				}

				@ -1199,19 +1199,8 @@ static inline void hyp_cpu_pm_exit(void)

				}

				#endif

				static void teardown_common_resources(void)

				{

					free_percpu(kvm_host_cpu_state);

				}

				static int init_common_resources(void)

				{

					kvm_host_cpu_state = alloc_percpu(kvm_cpu_context_t);

					if (!kvm_host_cpu_state) {

						kvm_err("Cannot allocate host CPU state\n");

						return -ENOMEM;

					}

					/* set size of VMID supported by CPU */

					kvm_vmid_bits = kvm_get_vmid_bits();

					kvm_info("%d-bit VMID\n", kvm_vmid_bits);

				@ -1369,7 +1358,7 @@ static int init_hyp_mode(void)

					for_each_possible_cpu(cpu) {

						kvm_cpu_context_t *cpu_ctxt;

						cpu_ctxt = per_cpu_ptr(kvm_host_cpu_state, cpu);

						cpu_ctxt = per_cpu_ptr(&kvm_host_cpu_state, cpu);

						err = create_hyp_mappings(cpu_ctxt, cpu_ctxt + 1, PAGE_HYP);

						if (err) {

				@ -1378,6 +1367,12 @@ static int init_hyp_mode(void)

						}

					}

					err = hyp_map_aux_data();

					if (err) {

						kvm_err("Cannot map host auxilary data: %d\n", err);

						goto out_err;

					}

					kvm_info("Hyp mode initialized successfully\n");

					return 0;

				@ -1447,7 +1442,6 @@ int kvm_arch_init(void *opaque)

				out_hyp:

					teardown_hyp_mode();

				out_err:

					teardown_common_resources();

					return err;

				}

									
arch/arm/kvm/psci.c
												
				@ -403,7 +403,7 @@ static int kvm_psci_call(struct kvm_vcpu *vcpu)

				int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)

				{

					u32 func_id = smccc_get_function(vcpu);

					u32 val = PSCI_RET_NOT_SUPPORTED;

					u32 val = SMCCC_RET_NOT_SUPPORTED;

					u32 feature;

					switch (func_id) {

				@ -415,7 +415,21 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)

						switch(feature) {

						case ARM_SMCCC_ARCH_WORKAROUND_1:

							if (kvm_arm_harden_branch_predictor())

								val = 0;

								val = SMCCC_RET_SUCCESS;

							break;

						case ARM_SMCCC_ARCH_WORKAROUND_2:

							switch (kvm_arm_have_ssbd()) {

							case KVM_SSBD_FORCE_DISABLE:

							case KVM_SSBD_UNKNOWN:

								break;

							case KVM_SSBD_KERNEL:

								val = SMCCC_RET_SUCCESS;

								break;

							case KVM_SSBD_FORCE_ENABLE:

							case KVM_SSBD_MITIGATED:

								val = SMCCC_RET_NOT_REQUIRED;

								break;

							}

							break;

						}

						break;
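
The `ARM_SMCCC_ARCH_WORKAROUND_2` case in this hunk maps the host's SSBD state to the value reported to the guest. A minimal standalone sketch of that mapping follows. The state values are copied from the `KVM_SSBD_*` defines in the diff; the numeric return codes (0, -1, -2) and the `workaround_2_result` name are assumptions made for illustration:

```c
/* State values as they appear in the KVM_SSBD_* defines of this diff. */
enum ssbd_state {
	SSBD_UNKNOWN       = -1,
	SSBD_FORCE_DISABLE = 0,
	SSBD_KERNEL        = 1,
	SSBD_FORCE_ENABLE  = 2,
	SSBD_MITIGATED     = 3,
};

/* Return codes: numeric values are an assumption of this sketch. */
#define RET_SUCCESS        0
#define RET_NOT_SUPPORTED  (-1)
#define RET_NOT_REQUIRED   (-2)

/* Mirrors the nested switch: the result starts as "not supported" and is
 * only overridden for the states that have an explicit assignment. */
static int workaround_2_result(int state)
{
	switch (state) {
	case SSBD_KERNEL:
		/* Mitigation is toggled dynamically, so the call is usable. */
		return RET_SUCCESS;
	case SSBD_FORCE_ENABLE:
	case SSBD_MITIGATED:
		/* Always on or not needed: the guest has nothing to do. */
		return RET_NOT_REQUIRED;
	case SSBD_FORCE_DISABLE:
	case SSBD_UNKNOWN:
	default:
		return RET_NOT_SUPPORTED;
	}
}
```

With the 32-bit ARM stub of `kvm_arm_have_ssbd()` shown earlier in the diff, which always returns `KVM_SSBD_UNKNOWN`, this mapping would always yield the "not supported" result.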

arch/arm64/Kconfig

 @ -776,6 +776,15 @@ config HARDEN_BRANCH_PREDICTOR
 	  If unsure, say Y.
 config ARM64_SSBD
 	bool "Speculative Store Bypass Disable" if EXPERT
 	default y
 	help
 	  This enables mitigation of the bypassing of previous stores
 	  by speculative loads.
 	  If unsure, say Y.
 menuconfig ARMV8_DEPRECATED
 	bool "Emulate deprecated/obsolete ARMv8 instructions"
 	depends on COMPAT

arch/arm64/configs/defconfig

 @ -260,6 +260,8 @@ CONFIG_GPIO_XGENE=y
 CONFIG_GPIO_PCA953X=y
 CONFIG_GPIO_PCA953X_IRQ=y
 CONFIG_GPIO_MAX77620=y
 CONFIG_POWER_AVS=y
 CONFIG_ROCKCHIP_IODOMAIN=y
 CONFIG_POWER_RESET_MSM=y
 CONFIG_BATTERY_BQ27XXX=y
 CONFIG_POWER_RESET_XGENE=y

									
arch/arm64/include/asm/alternative.h
												
				@ -4,6 +4,8 @@

				#include <asm/cpucaps.h>

				#include <asm/insn.h>

				#define ARM64_CB_PATCH ARM64_NCAPS

				#ifndef __ASSEMBLY__

				#include <linux/init.h>

				@ -11,6 +13,8 @@

				#include <linux/stddef.h>

				#include <linux/stringify.h>

				extern int alternatives_applied;

				struct alt_instr {

					s32 orig_offset;	/* offset to original instruction */

					s32 alt_offset;		/* offset to replacement instruction */

				@ -19,12 +23,19 @@ struct alt_instr {

					u8  alt_len;		/* size of new instruction(s), <= orig_len */

				};

				typedef void (*alternative_cb_t)(struct alt_instr *alt,

								 __le32 *origptr, __le32 *updptr, int nr_inst);

				void __init apply_alternatives_all(void);

				void apply_alternatives(void *start, size_t length);

				#define ALTINSTR_ENTRY(feature)						      \

				#define ALTINSTR_ENTRY(feature,cb)					      \

					" .word 661b - .\n"				/* label           */ \

					" .if " __stringify(cb) " == 0\n"				      \

					" .word 663f - .\n"				/* new instruction */ \

					" .else\n"							      \

					" .word " __stringify(cb) "- .\n"		/* callback */	      \

					" .endif\n"							      \

					" .hword " __stringify(feature) "\n"		/* feature bit     */ \

					" .byte 662b-661b\n"				/* source len      */ \

					" .byte 664f-663f\n"				/* replacement len */

				@ -42,15 +53,18 @@ void apply_alternatives(void *start, size_t length);

				 * but most assemblers die if insn1 or insn2 have a .inst. This should

				 * be fixed in a binutils release posterior to 2.25.51.0.2 (anything

				 * containing commit 4e4d08cf7399b606 or c1baaddf8861).

				 *

				 * Alternatives with callbacks do not generate replacement instructions.

				 */

				#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled)	\

				#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled, cb)	\

					".if "__stringify(cfg_enabled)" == 1\n"				\

					"661:\n\t"							\

					oldinstr "\n"							\

					"662:\n"							\

					".pushsection .altinstructions,\"a\"\n"				\

					ALTINSTR_ENTRY(feature)						\

					ALTINSTR_ENTRY(feature,cb)					\

					".popsection\n"							\

					" .if " __stringify(cb) " == 0\n"				\

					".pushsection .altinstr_replacement, \"a\"\n"			\

					"663:\n\t"							\

					newinstr "\n"							\

				@ -58,11 +72,17 @@ void apply_alternatives(void *start, size_t length);

					".popsection\n\t"						\

					".org	. - (664b-663b) + (662b-661b)\n\t"			\

					".org	. - (662b-661b) + (664b-663b)\n"			\

					".else\n\t"							\

					"663:\n\t"							\

					"664:\n\t"							\

					".endif\n"							\

					".endif\n"

				#define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...)	\

					__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg))

					__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg), 0)

				#define ALTERNATIVE_CB(oldinstr, cb) \

					__ALTERNATIVE_CFG(oldinstr, "NOT_AN_INSTRUCTION", ARM64_CB_PATCH, 1, cb)

				#else

				#include <asm/assembler.h>

				@ -129,6 +149,14 @@ void apply_alternatives(void *start, size_t length);

				661:

				.endm

				.macro alternative_cb cb

					.set .Lasm_alt_mode, 0

					.pushsection .altinstructions, "a"

					altinstruction_entry 661f, \cb, ARM64_CB_PATCH, 662f-661f, 0

					.popsection

				661:

				.endm

				/*

				 * Provide the other half of the alternative code sequence.

				 */

				@ -154,6 +182,13 @@ void apply_alternatives(void *start, size_t length);

					.org	. - (662b-661b) + (664b-663b)

				.endm

				/*

				 * Callback-based alternative epilogue

				 */

				.macro alternative_cb_end

				662:

				.endm

				/*

				 * Provides a trivial alternative or default sequence consisting solely

				 * of NOPs. The number of NOPs is chosen automatically to match the

									
arch/arm64/include/asm/assembler.h
												
				@ -239,14 +239,33 @@ lr	.req	x30		// link register

					.endm

					/*

					 * @dst: Result of per_cpu(sym, smp_processor_id())

					 * @sym: The name of the per-cpu variable

					 * @reg: Result of per_cpu(sym, smp_processor_id())

					 * @tmp: scratch register

					 */

					.macro this_cpu_ptr, sym, reg, tmp

					adr_l	\reg, \sym

					.macro adr_this_cpu, dst, sym, tmp

					adr_l	\dst, \sym

				alternative_if_not ARM64_HAS_VIRT_HOST_EXTN

					mrs	\tmp, tpidr_el1

					add	\reg, \reg, \tmp

				alternative_else

					mrs	\tmp, tpidr_el2

				alternative_endif

					add	\dst, \dst, \tmp

					.endm

					/*

					 * @dst: Result of READ_ONCE(per_cpu(sym, smp_processor_id()))

					 * @sym: The name of the per-cpu variable

					 * @tmp: scratch register

					 */

					.macro ldr_this_cpu dst, sym, tmp

					adr_l	\dst, \sym

				alternative_if_not ARM64_HAS_VIRT_HOST_EXTN

					mrs	\tmp, tpidr_el1

				alternative_else

					mrs	\tmp, tpidr_el2

				alternative_endif

					ldr	\dst, [\dst, \tmp]

					.endm

				/*

									
arch/arm64/include/asm/cmpxchg.h
												
				@ -229,7 +229,9 @@ static inline void __cmpwait_case_##name(volatile void *ptr,		\

					unsigned long tmp;						\

													\

					asm volatile(							\

					"	ldxr" #sz "\t%" #w "[tmp], %[v]\n"		\

					"	sevl\n"							\

					"	wfe\n"							\

					"	ldxr" #sz "\t%" #w "[tmp], %[v]\n"			\

					"	eor	%" #w "[tmp], %" #w "[tmp], %" #w "[val]\n"	\

					"	cbnz	%" #w "[tmp], 1f\n"				\

					"	wfe\n"							\

									
arch/arm64/include/asm/cpucaps.h
												
				@ -36,7 +36,8 @@

				#define ARM64_MISMATCHED_CACHE_LINE_SIZE	15

				#define ARM64_UNMAP_KERNEL_AT_EL0		16

				#define ARM64_HARDEN_BRANCH_PREDICTOR		17

				#define ARM64_SSBD				18

				#define ARM64_NCAPS				18

				#define ARM64_NCAPS				19

				#endif /* __ASM_CPUCAPS_H */

									
arch/arm64/include/asm/cpufeature.h
												
				@ -221,6 +221,28 @@ static inline bool system_supports_mixed_endian_el0(void)

					return id_aa64mmfr0_mixed_endian_el0(read_system_reg(SYS_ID_AA64MMFR0_EL1));

				}

				#define ARM64_SSBD_UNKNOWN		-1

				#define ARM64_SSBD_FORCE_DISABLE	0

				#define ARM64_SSBD_KERNEL		1

				#define ARM64_SSBD_FORCE_ENABLE		2

				#define ARM64_SSBD_MITIGATED		3

				static inline int arm64_get_ssbd_state(void)

				{

				#ifdef CONFIG_ARM64_SSBD

					extern int ssbd_state;

					return ssbd_state;

				#else

					return ARM64_SSBD_UNKNOWN;

				#endif

				}

				#ifdef CONFIG_ARM64_SSBD

				void arm64_set_ssbd_mitigation(bool state);

				#else

				static inline void arm64_set_ssbd_mitigation(bool state) {}

				#endif

				#endif /* __ASSEMBLY__ */

				#endif

									
arch/arm64/include/asm/kvm_asm.h
												
				@ -33,6 +33,10 @@

				#define KVM_ARM64_DEBUG_DIRTY_SHIFT	0

				#define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)

				#define	VCPU_WORKAROUND_2_FLAG_SHIFT	0

				#define	VCPU_WORKAROUND_2_FLAG		(_AC(1, UL) << VCPU_WORKAROUND_2_FLAG_SHIFT)

				/* Translate a kernel address of @sym into its equivalent linear mapping */

				#define kvm_ksym_ref(sym)						\

					({								\

						void *val = &sym;					\

				@ -65,6 +69,43 @@ extern u32 __kvm_get_mdcr_el2(void);

				extern u32 __init_stage2_translation(void);

				/* Home-grown __this_cpu_{ptr,read} variants that always work at HYP */

				#define __hyp_this_cpu_ptr(sym)						\

					({								\

						void *__ptr = hyp_symbol_addr(sym);			\

						__ptr += read_sysreg(tpidr_el2);			\

						(typeof(&sym))__ptr;					\

					 })

				#define __hyp_this_cpu_read(sym)					\

					({								\

						*__hyp_this_cpu_ptr(sym);				\

					 })

				#else /* __ASSEMBLY__ */

				.macro hyp_adr_this_cpu reg, sym, tmp

					adr_l	\reg, \sym

					mrs	\tmp, tpidr_el2

					add	\reg, \reg, \tmp

				.endm

				.macro hyp_ldr_this_cpu reg, sym, tmp

					adr_l	\reg, \sym

					mrs	\tmp, tpidr_el2

					ldr	\reg,  [\reg, \tmp]

				.endm

				.macro get_host_ctxt reg, tmp

					hyp_adr_this_cpu \reg, kvm_host_cpu_state, \tmp

				.endm

				.macro get_vcpu_ptr vcpu, ctxt

					get_host_ctxt \ctxt, \vcpu

					ldr	\vcpu, [\ctxt, #HOST_CONTEXT_VCPU]

					kern_hyp_va	\vcpu

				.endm

				#endif

				#endif /* __ARM_KVM_ASM_H__ */

									
arch/arm64/include/asm/kvm_host.h
												
				@ -197,6 +197,8 @@ struct kvm_cpu_context {

						u64 sys_regs[NR_SYS_REGS];

						u32 copro[NR_COPRO_REGS];

					};

					struct kvm_vcpu *__hyp_running_vcpu;

				};

				typedef struct kvm_cpu_context kvm_cpu_context_t;

				@ -211,6 +213,9 @@ struct kvm_vcpu_arch {

					/* Exception Information */

					struct kvm_vcpu_fault_info fault;

					/* State of various workarounds, see kvm_asm.h for bit assignment */

					u64 workaround_flags;

					/* Guest debug state */

					u64 debug_flags;

				@ -354,10 +359,15 @@ int kvm_perf_teardown(void);

				struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);

				void __kvm_set_tpidr_el2(u64 tpidr_el2);

				DECLARE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);

				static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,

								       unsigned long hyp_stack_ptr,

								       unsigned long vector_ptr)

				{

					u64 tpidr_el2;

					/*

					 * Call initialization code, and switch to the full blown HYP code.

					 * If the cpucaps haven't been finalized yet, something has gone very

				@ -366,6 +376,16 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,

					 */

					BUG_ON(!static_branch_likely(&arm64_const_caps_ready));

					__kvm_call_hyp((void *)pgd_ptr, hyp_stack_ptr, vector_ptr);

					/*

					 * Calculate the raw per-cpu offset without a translation from the

					 * kernel's mapping to the linear mapping, and store it in tpidr_el2

					 * so that we can use adr_l to access per-cpu variables in EL2.

					 */

					tpidr_el2 = (u64)this_cpu_ptr(&kvm_host_cpu_state)

						- (u64)kvm_ksym_ref(kvm_host_cpu_state);

					kvm_call_hyp(__kvm_set_tpidr_el2, tpidr_el2);

				}

				void __kvm_hyp_teardown(void);

				@ -405,4 +425,27 @@ static inline bool kvm_arm_harden_branch_predictor(void)

					return cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR);

				}

				#define KVM_SSBD_UNKNOWN		-1

				#define KVM_SSBD_FORCE_DISABLE		0

				#define KVM_SSBD_KERNEL		1

				#define KVM_SSBD_FORCE_ENABLE		2

				#define KVM_SSBD_MITIGATED		3

				static inline int kvm_arm_have_ssbd(void)

				{

					switch (arm64_get_ssbd_state()) {

					case ARM64_SSBD_FORCE_DISABLE:

						return KVM_SSBD_FORCE_DISABLE;

					case ARM64_SSBD_KERNEL:

						return KVM_SSBD_KERNEL;

					case ARM64_SSBD_FORCE_ENABLE:

						return KVM_SSBD_FORCE_ENABLE;

					case ARM64_SSBD_MITIGATED:

						return KVM_SSBD_MITIGATED;

					case ARM64_SSBD_UNKNOWN:

					default:

						return KVM_SSBD_UNKNOWN;

					}

				}

				#endif /* __ARM64_KVM_HOST_H__ */

									
arch/arm64/include/asm/kvm_mmu.h
												
				@ -130,6 +130,26 @@ static inline unsigned long __kern_hyp_va(unsigned long v)

				#define kern_hyp_va(v) 	((typeof(v))(__kern_hyp_va((unsigned long)(v))))

				/*

				 * Obtain the PC-relative address of a kernel symbol

				 * s: symbol

				 *

				 * The goal of this macro is to return a symbol's address based on a

				 * PC-relative computation, as opposed to a loading the VA from a

				 * constant pool or something similar. This works well for HYP, as an

				 * absolute VA is guaranteed to be wrong. Only use this if trying to

				 * obtain the address of a symbol (i.e. not something you obtained by

				 * following a pointer).

				 */

				#define hyp_symbol_addr(s)						\

					({								\

						typeof(s) *addr;					\

						asm("adrp	%0, %1\n"				\

						    "add	%0, %0, :lo12:%1\n"			\

						    : "=r" (addr) : "S" (&s));				\

						addr;							\

					})

				/*

				 * We currently only support a 40bit IPA.

				 */

				@ -367,5 +387,29 @@ static inline int kvm_map_vectors(void)

				}

				#endif

				#ifdef CONFIG_ARM64_SSBD

				DECLARE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);

				static inline int hyp_map_aux_data(void)

				{

					int cpu, err;

					for_each_possible_cpu(cpu) {

						u64 *ptr;

						ptr = per_cpu_ptr(&arm64_ssbd_callback_required, cpu);

						err = create_hyp_mappings(ptr, ptr + 1, PAGE_HYP);

						if (err)

							return err;

					}

					return 0;

				}

				#else

				static inline int hyp_map_aux_data(void)

				{

					return 0;

				}

				#endif

				#endif /* __ASSEMBLY__ */

				#endif /* __ARM64_KVM_MMU_H__ */

									
arch/arm64/include/asm/percpu.h
												
				@ -16,9 +16,14 @@

				#ifndef __ASM_PERCPU_H

				#define __ASM_PERCPU_H

				#include <asm/alternative.h>

				static inline void set_my_cpu_offset(unsigned long off)

				{

					asm volatile("msr tpidr_el1, %0" :: "r" (off) : "memory");

					asm volatile(ALTERNATIVE("msr tpidr_el1, %0",

								 "msr tpidr_el2, %0",

								 ARM64_HAS_VIRT_HOST_EXTN)

							:: "r" (off) : "memory");

				}

				static inline unsigned long __my_cpu_offset(void)

				@ -29,7 +34,10 @@ static inline unsigned long __my_cpu_offset(void)

					 * We want to allow caching the value, so avoid using volatile and

					 * instead use a fake stack read to hazard against barrier().

					 */

					asm("mrs %0, tpidr_el1" : "=r" (off) :

					asm(ALTERNATIVE("mrs %0, tpidr_el1",

							"mrs %0, tpidr_el2",

							ARM64_HAS_VIRT_HOST_EXTN)

						: "=r" (off) :

						"Q" (*(const unsigned long *)current_stack_pointer));

					return off;

									
arch/arm64/include/asm/thread_info.h
												
				@ -122,6 +122,7 @@ static inline struct thread_info *current_thread_info(void)

				#define TIF_RESTORE_SIGMASK	20

				#define TIF_SINGLESTEP		21

				#define TIF_32BIT		22	/* 32bit process */

				#define TIF_SSBD		23	/* Wants SSB mitigation */

				#define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)

				#define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)

									
arch/arm64/kernel/Makefile
												
				@ -50,6 +50,7 @@ arm64-obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o

				arm64-obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o

				arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\

									   cpu-reset.o

				arm64-obj-$(CONFIG_ARM64_SSBD)		+= ssbd.o

				ifeq ($(CONFIG_KVM),y)

				arm64-obj-$(CONFIG_HARDEN_BRANCH_PREDICTOR)	+= bpi.o

									
arch/arm64/kernel/alternative.c
												
				@ -28,10 +28,12 @@

				#include <asm/sections.h>

				#include <linux/stop_machine.h>

				#define __ALT_PTR(a,f)		(u32 *)((void *)&(a)->f + (a)->f)

				#define __ALT_PTR(a,f)		((void *)&(a)->f + (a)->f)

				#define ALT_ORIG_PTR(a)		__ALT_PTR(a, orig_offset)

				#define ALT_REPL_PTR(a)		__ALT_PTR(a, alt_offset)

				int alternatives_applied;

				struct alt_region {

					struct alt_instr *begin;

					struct alt_instr *end;

				@ -105,31 +107,52 @@ static u32 get_alt_insn(struct alt_instr *alt, u32 *insnptr, u32 *altinsnptr)

					return insn;

				}

				static void patch_alternative(struct alt_instr *alt,

							      __le32 *origptr, __le32 *updptr, int nr_inst)

				{

					__le32 *replptr;

					int i;

					replptr = ALT_REPL_PTR(alt);

					for (i = 0; i < nr_inst; i++) {

						u32 insn;

						insn = get_alt_insn(alt, origptr + i, replptr + i);

						updptr[i] = cpu_to_le32(insn);

					}

				}

				static void __apply_alternatives(void *alt_region)

				{

					struct alt_instr *alt;

					struct alt_region *region = alt_region;

					u32 *origptr, *replptr;

					__le32 *origptr;

					alternative_cb_t alt_cb;

					for (alt = region->begin; alt < region->end; alt++) {

						u32 insn;

						int i, nr_inst;

						int nr_inst;

						if (!cpus_have_cap(alt->cpufeature))

						/* Use ARM64_CB_PATCH as an unconditional patch */

						if (alt->cpufeature < ARM64_CB_PATCH &&

						    !cpus_have_cap(alt->cpufeature))

							continue;

						BUG_ON(alt->alt_len != alt->orig_len);

						if (alt->cpufeature == ARM64_CB_PATCH)

							BUG_ON(alt->alt_len != 0);

						else

							BUG_ON(alt->alt_len != alt->orig_len);

						pr_info_once("patching kernel code\n");

						origptr = ALT_ORIG_PTR(alt);

						replptr = ALT_REPL_PTR(alt);

						nr_inst = alt->alt_len / sizeof(insn);

						nr_inst = alt->orig_len / AARCH64_INSN_SIZE;

						for (i = 0; i < nr_inst; i++) {

							insn = get_alt_insn(alt, origptr + i, replptr + i);

							*(origptr + i) = cpu_to_le32(insn);

						}

						if (alt->cpufeature < ARM64_CB_PATCH)

							alt_cb = patch_alternative;

						else

							alt_cb  = ALT_REPL_PTR(alt);

						alt_cb(alt, origptr, origptr, nr_inst);

						flush_icache_range((uintptr_t)origptr,

								   (uintptr_t)(origptr + nr_inst));

				@@ -142,7 +165,6 @@ static void __apply_alternatives(void *alt_region)

				 */

				static int __apply_alternatives_multi_stop(void *unused)

				{

					static int patched = 0;

					struct alt_region region = {

						.begin	= (struct alt_instr *)__alt_instructions,

						.end	= (struct alt_instr *)__alt_instructions_end,

				@@ -150,14 +172,14 @@ static int __apply_alternatives_multi_stop(void *unused)

					/* We always have a CPU 0 at this point (__init) */

					if (smp_processor_id()) {

						while (!READ_ONCE(patched))

						while (!READ_ONCE(alternatives_applied))

							cpu_relax();

						isb();

					} else {

						BUG_ON(patched);

						BUG_ON(alternatives_applied);

						__apply_alternatives(&region);

						/* Barriers provided by the cache flushing */

						WRITE_ONCE(patched, 1);

						WRITE_ONCE(alternatives_applied, 1);

					}

					return 0;
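
The `__ALT_PTR` macro in this file resolves a field that stores a signed offset measured from the field's own address. A minimal userspace sketch of that idea follows. The names `struct rel_entry`, `struct image`, `REL_PTR` and `rel_set` are hypothetical and exist only for illustration; they are not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: each field holds a signed byte offset measured
 * from the field's own address, in the spirit of struct alt_instr. */
struct rel_entry {
	int32_t orig_offset;
	int32_t alt_offset;
};

/* Hypothetical image: the entry and the words it refers to live in one
 * object, so the pointer arithmetic below stays inside that object. */
struct image {
	struct rel_entry entry;
	uint32_t orig[2];
	uint32_t repl[2];
};

/* Same shape as __ALT_PTR: address of the field plus the value it holds.
 * The result is an untyped pointer, so the caller picks the type. */
#define REL_PTR(e, f)	((void *)((char *)&(e)->f + (e)->f))

/* Store a self-relative reference to target in *field. */
static void rel_set(int32_t *field, const void *target)
{
	ptrdiff_t delta = (const char *)target - (const char *)field;

	assert(delta >= INT32_MIN && delta <= INT32_MAX);
	*field = (int32_t)delta;
}
```

Returning `void *` rather than `u32 *` appears to be what lets the patched `__apply_alternatives()` assign the same macro result to either an instruction pointer or the `alt_cb` callback pointer.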

									
										2

arch/arm64/kernel/asm-offsets.c
									
				@@ -127,11 +127,13 @@ int main(void)

				  BLANK();

				#ifdef CONFIG_KVM_ARM_HOST

				  DEFINE(VCPU_CONTEXT,		offsetof(struct kvm_vcpu, arch.ctxt));

				  DEFINE(VCPU_WORKAROUND_FLAGS,	offsetof(struct kvm_vcpu, arch.workaround_flags));

				  DEFINE(CPU_GP_REGS,		offsetof(struct kvm_cpu_context, gp_regs));

				  DEFINE(CPU_USER_PT_REGS,	offsetof(struct kvm_regs, regs));

				  DEFINE(CPU_FP_REGS,		offsetof(struct kvm_regs, fp_regs));

				  DEFINE(VCPU_FPEXC32_EL2,	offsetof(struct kvm_vcpu, arch.ctxt.sys_regs[FPEXC32_EL2]));

				  DEFINE(VCPU_HOST_CONTEXT,	offsetof(struct kvm_vcpu, arch.host_cpu_context));

				  DEFINE(HOST_CONTEXT_VCPU,	offsetof(struct kvm_cpu_context, __hyp_running_vcpu));

				#endif

				#ifdef CONFIG_CPU_PM

				  DEFINE(CPU_SUSPEND_SZ,	sizeof(struct cpu_suspend_ctx));

									
										180

arch/arm64/kernel/cpu_errata.c
									
				@@ -187,6 +187,178 @@ static int enable_smccc_arch_workaround_1(void *data)

				}

				#endif	/* CONFIG_HARDEN_BRANCH_PREDICTOR */

				#ifdef CONFIG_ARM64_SSBD

				DEFINE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);

				int ssbd_state __read_mostly = ARM64_SSBD_KERNEL;

				static const struct ssbd_options {

					const char	*str;

					int		state;

				} ssbd_options[] = {

					{ "force-on",	ARM64_SSBD_FORCE_ENABLE, },

					{ "force-off",	ARM64_SSBD_FORCE_DISABLE, },

					{ "kernel",	ARM64_SSBD_KERNEL, },

				};

				static int __init ssbd_cfg(char *buf)

				{

					int i;

					if (!buf || !buf[0])

						return -EINVAL;

					for (i = 0; i < ARRAY_SIZE(ssbd_options); i++) {

						int len = strlen(ssbd_options[i].str);

						if (strncmp(buf, ssbd_options[i].str, len))

							continue;

						ssbd_state = ssbd_options[i].state;

						return 0;

					}

					return -EINVAL;

				}

				early_param("ssbd", ssbd_cfg);

				void __init arm64_update_smccc_conduit(struct alt_instr *alt,

								       __le32 *origptr, __le32 *updptr,

								       int nr_inst)

				{

					u32 insn;

					BUG_ON(nr_inst != 1);

					switch (psci_ops.conduit) {

					case PSCI_CONDUIT_HVC:

						insn = aarch64_insn_get_hvc_value();

						break;

					case PSCI_CONDUIT_SMC:

						insn = aarch64_insn_get_smc_value();

						break;

					default:

						return;

					}

					*updptr = cpu_to_le32(insn);

				}

				void __init arm64_enable_wa2_handling(struct alt_instr *alt,

								      __le32 *origptr, __le32 *updptr,

								      int nr_inst)

				{

					BUG_ON(nr_inst != 1);

					/*

					 * Only allow mitigation on EL1 entry/exit and guest

					 * ARCH_WORKAROUND_2 handling if the SSBD state allows it to

					 * be flipped.

					 */

					if (arm64_get_ssbd_state() == ARM64_SSBD_KERNEL)

						*updptr = cpu_to_le32(aarch64_insn_gen_nop());

				}

				void arm64_set_ssbd_mitigation(bool state)

				{

					switch (psci_ops.conduit) {

					case PSCI_CONDUIT_HVC:

						arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_2, state, NULL);

						break;

					case PSCI_CONDUIT_SMC:

						arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, state, NULL);

						break;

					default:

						WARN_ON_ONCE(1);

						break;

					}

				}

				static bool has_ssbd_mitigation(const struct arm64_cpu_capabilities *entry,

								    int scope)

				{

					struct arm_smccc_res res;

					bool required = true;

					s32 val;

					WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());

					if (psci_ops.smccc_version == SMCCC_VERSION_1_0) {

						ssbd_state = ARM64_SSBD_UNKNOWN;

						return false;

					}

					switch (psci_ops.conduit) {

					case PSCI_CONDUIT_HVC:

						arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,

								  ARM_SMCCC_ARCH_WORKAROUND_2, &res);

						break;

					case PSCI_CONDUIT_SMC:

						arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,

								  ARM_SMCCC_ARCH_WORKAROUND_2, &res);

						break;

					default:

						ssbd_state = ARM64_SSBD_UNKNOWN;

						return false;

					}

					val = (s32)res.a0;

					switch (val) {

					case SMCCC_RET_NOT_SUPPORTED:

						ssbd_state = ARM64_SSBD_UNKNOWN;

						return false;

					case SMCCC_RET_NOT_REQUIRED:

						pr_info_once("%s mitigation not required\n", entry->desc);

						ssbd_state = ARM64_SSBD_MITIGATED;

						return false;

					case SMCCC_RET_SUCCESS:

						required = true;

						break;

					case 1:	/* Mitigation not required on this CPU */

						required = false;

						break;

					default:

						WARN_ON(1);

						return false;

					}

					switch (ssbd_state) {

					case ARM64_SSBD_FORCE_DISABLE:

						pr_info_once("%s disabled from command-line\n", entry->desc);

						arm64_set_ssbd_mitigation(false);

						required = false;

						break;

					case ARM64_SSBD_KERNEL:

						if (required) {

							__this_cpu_write(arm64_ssbd_callback_required, 1);

							arm64_set_ssbd_mitigation(true);

						}

						break;

					case ARM64_SSBD_FORCE_ENABLE:

						pr_info_once("%s forced from command-line\n", entry->desc);

						arm64_set_ssbd_mitigation(true);

						required = true;

						break;

					default:

						WARN_ON(1);

						break;

					}

					return required;

				}

				#endif	/* CONFIG_ARM64_SSBD */

				#define MIDR_RANGE(model, min, max) \

					.def_scope = SCOPE_LOCAL_CPU, \

					.matches = is_affected_midr_range, \

				@@ -309,6 +481,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {

						MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),

						.enable = enable_smccc_arch_workaround_1,

					},

				#endif

				#ifdef CONFIG_ARM64_SSBD

					{

						.desc = "Speculative Store Bypass Disable",

						.def_scope = SCOPE_LOCAL_CPU,

						.capability = ARM64_SSBD,

						.matches = has_ssbd_mitigation,

					},

				#endif

					{

					}
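
The `ssbd=` handler in this file walks a small table of strings and compares each one against the start of the argument. A userspace sketch of the same table lookup is given here, assuming a hypothetical `enum ssbd_mode` and a hypothetical `parse_ssbd_option()` helper; the enum values are made up and do not correspond to the kernel's `ARM64_SSBD_*` constants.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the ARM64_SSBD_* states. */
enum ssbd_mode {
	MODE_FORCE_DISABLE,
	MODE_KERNEL,
	MODE_FORCE_ENABLE,
};

static const struct {
	const char *str;
	int mode;
} options[] = {
	{ "force-on",	MODE_FORCE_ENABLE },
	{ "force-off",	MODE_FORCE_DISABLE },
	{ "kernel",	MODE_KERNEL },
};

/* Returns 0 and updates *mode on a match, -1 otherwise. */
static int parse_ssbd_option(const char *buf, int *mode)
{
	size_t i;

	assert(mode != NULL);
	if (!buf || !buf[0])
		return -1;

	for (i = 0; i < sizeof(options) / sizeof(options[0]); i++) {
		size_t len = strlen(options[i].str);

		/* Prefix comparison, like the strncmp() in ssbd_cfg() */
		if (strncmp(buf, options[i].str, len))
			continue;

		*mode = options[i].mode;
		return 0;
	}

	return -1;
}
```

Because only the first `len` characters are compared, an argument with trailing characters after a valid keyword still matches; the sketch keeps that behaviour rather than tightening it.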

									
										19

arch/arm64/kernel/cpufeature.c
									
				@@ -826,9 +826,25 @@ static int __init parse_kpti(char *str)

					__kpti_forced = enabled ? 1 : -1;

					return 0;

				}

				__setup("kpti=", parse_kpti);

				early_param("kpti", parse_kpti);

				#endif	/* CONFIG_UNMAP_KERNEL_AT_EL0 */

				static int cpu_copy_el2regs(void *__unused)

				{

					/*

					 * Copy register values that aren't redirected by hardware.

					 *

					 * Before code patching, we only set tpidr_el1, all CPUs need to copy

					 * this value to tpidr_el2 before we patch the code. Once we've done

					 * that, freshly-onlined CPUs will set tpidr_el2, so we don't need to

					 * do anything here.

					 */

					if (!alternatives_applied)

						write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);

					return 0;

				}

				static const struct arm64_cpu_capabilities arm64_features[] = {

					{

						.desc = "GIC system register CPU interface",

				@@ -895,6 +911,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

						.capability = ARM64_HAS_VIRT_HOST_EXTN,

						.def_scope = SCOPE_SYSTEM,

						.matches = runs_at_el2,

						.enable = cpu_copy_el2regs,

					},

					{

						.desc = "32-bit EL0 Support",

									
										32

arch/arm64/kernel/entry.S
									
				@@ -18,6 +18,7 @@

				 * along with this program.  If not, see <http://www.gnu.org/licenses/>.

				 */

				#include <linux/arm-smccc.h>

				#include <linux/init.h>

				#include <linux/linkage.h>

				@@ -95,6 +96,25 @@ alternative_else_nop_endif

					add	\dst, \dst, #(\sym - .entry.tramp.text)

					.endm

					// This macro corrupts x0-x3. It is the caller's duty

					// to save/restore them if required.

					.macro	apply_ssbd, state, targ, tmp1, tmp2

				#ifdef CONFIG_ARM64_SSBD

				alternative_cb	arm64_enable_wa2_handling

					b	\targ

				alternative_cb_end

					ldr_this_cpu	\tmp2, arm64_ssbd_callback_required, \tmp1

					cbz	\tmp2, \targ

					ldr	\tmp2, [tsk, #TI_FLAGS]

					tbnz	\tmp2, #TIF_SSBD, \targ

					mov	w0, #ARM_SMCCC_ARCH_WORKAROUND_2

					mov	w1, #\state

				alternative_cb	arm64_update_smccc_conduit

					nop					// Patched to SMC/HVC #0

				alternative_cb_end

				#endif

					.endm

					.macro	kernel_entry, el, regsize = 64

					.if	\regsize == 32

					mov	w0, w0				// zero upper 32 bits of x0

				@@ -122,6 +142,14 @@ alternative_else_nop_endif

					ldr	x19, [tsk, #TI_FLAGS]		// since we can unmask debug

					disable_step_tsk x19, x20		// exceptions when scheduling.

					apply_ssbd 1, 1f, x22, x23

				#ifdef CONFIG_ARM64_SSBD

					ldp	x0, x1, [sp, #16 * 0]

					ldp	x2, x3, [sp, #16 * 1]

				#endif

				1:

					mov	x29, xzr			// fp pointed to user-space

					.else

					add	x21, sp, #S_FRAME_SIZE

				@@ -190,6 +218,8 @@ alternative_if ARM64_WORKAROUND_845719

				alternative_else_nop_endif

				#endif

				3:

					apply_ssbd 0, 5f, x0, x1

				5:

					.endif

					msr	elr_el1, x21			// set up the return data

					msr	spsr_el1, x22

				@@ -243,7 +273,7 @@ alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0

					cmp	x25, tsk

					b.ne	9998f

					this_cpu_ptr irq_stack, x25, x26

					adr_this_cpu x25, irq_stack, x26

					mov	x26, #IRQ_STACK_START_SP

					add	x26, x25, x26

									
										11

arch/arm64/kernel/hibernate.c
									
				@@ -308,6 +308,17 @@ int swsusp_arch_suspend(void)

						sleep_cpu = -EINVAL;

						__cpu_suspend_exit();

						/*

						 * Just in case the boot kernel did turn the SSBD

						 * mitigation off behind our back, let's set the state

						 * to what we expect it to be.

						 */

						switch (arm64_get_ssbd_state()) {

						case ARM64_SSBD_FORCE_ENABLE:

						case ARM64_SSBD_KERNEL:

							arm64_set_ssbd_mitigation(true);

						}

					}

					local_dbg_restore(flags);

									
										108

arch/arm64/kernel/ssbd.c
									
										Normal file
									
				@@ -0,0 +1,108 @@

				// SPDX-License-Identifier: GPL-2.0

				/*

				 * Copyright (C) 2018 ARM Ltd, All Rights Reserved.

				 */

				#include <linux/errno.h>

				#include <linux/prctl.h>

				#include <linux/sched.h>

				#include <linux/thread_info.h>

				#include <asm/cpufeature.h>

				/*

				 * prctl interface for SSBD

				 */

				static int ssbd_prctl_set(struct task_struct *task, unsigned long ctrl)

				{

					int state = arm64_get_ssbd_state();

					/* Unsupported */

					if (state == ARM64_SSBD_UNKNOWN)

						return -EINVAL;

					/* Treat the unaffected/mitigated state separately */

					if (state == ARM64_SSBD_MITIGATED) {

						switch (ctrl) {

						case PR_SPEC_ENABLE:

							return -EPERM;

						case PR_SPEC_DISABLE:

						case PR_SPEC_FORCE_DISABLE:

							return 0;

						}

					}

					/*

					 * Things are a bit backward here: the arm64 internal API

					 * *enables the mitigation* when the userspace API *disables

					 * speculation*. So much fun.

					 */

					switch (ctrl) {

					case PR_SPEC_ENABLE:

						/* If speculation is force disabled, enable is not allowed */

						if (state == ARM64_SSBD_FORCE_ENABLE ||

						    task_spec_ssb_force_disable(task))

							return -EPERM;

						task_clear_spec_ssb_disable(task);

						clear_tsk_thread_flag(task, TIF_SSBD);

						break;

					case PR_SPEC_DISABLE:

						if (state == ARM64_SSBD_FORCE_DISABLE)

							return -EPERM;

						task_set_spec_ssb_disable(task);

						set_tsk_thread_flag(task, TIF_SSBD);

						break;

					case PR_SPEC_FORCE_DISABLE:

						if (state == ARM64_SSBD_FORCE_DISABLE)

							return -EPERM;

						task_set_spec_ssb_disable(task);

						task_set_spec_ssb_force_disable(task);

						set_tsk_thread_flag(task, TIF_SSBD);

						break;

					default:

						return -ERANGE;

					}

					return 0;

				}

				int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which,

							     unsigned long ctrl)

				{

					switch (which) {

					case PR_SPEC_STORE_BYPASS:

						return ssbd_prctl_set(task, ctrl);

					default:

						return -ENODEV;

					}

				}

				static int ssbd_prctl_get(struct task_struct *task)

				{

					switch (arm64_get_ssbd_state()) {

					case ARM64_SSBD_UNKNOWN:

						return -EINVAL;

					case ARM64_SSBD_FORCE_ENABLE:

						return PR_SPEC_DISABLE;

					case ARM64_SSBD_KERNEL:

						if (task_spec_ssb_force_disable(task))

							return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;

						if (task_spec_ssb_disable(task))

							return PR_SPEC_PRCTL | PR_SPEC_DISABLE;

						return PR_SPEC_PRCTL | PR_SPEC_ENABLE;

					case ARM64_SSBD_FORCE_DISABLE:

						return PR_SPEC_ENABLE;

					default:

						return PR_SPEC_NOT_AFFECTED;

					}

				}

				int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)

				{

					switch (which) {

					case PR_SPEC_STORE_BYPASS:

						return ssbd_prctl_get(task);

					default:

						return -ENODEV;

					}

				}
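
The file above implements the kernel side of the speculation-control prctl for Speculative Store Bypass. A hedged userspace sketch of querying it is shown here. The helper names `ssb_query`, `mitigation_active` and `ssb_status_str` are hypothetical; the fallback constants are only used when the installed headers predate the interface.

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/* Fallbacks for older userspace headers; values from the Linux UAPI. */
#ifndef PR_GET_SPECULATION_CTRL
#define PR_GET_SPECULATION_CTRL	52
#define PR_SET_SPECULATION_CTRL	53
#endif
#ifndef PR_SPEC_STORE_BYPASS
#define PR_SPEC_STORE_BYPASS	0
#define PR_SPEC_NOT_AFFECTED	0
#define PR_SPEC_PRCTL		(1UL << 0)
#define PR_SPEC_ENABLE		(1UL << 1)
#define PR_SPEC_DISABLE		(1UL << 2)
#define PR_SPEC_FORCE_DISABLE	(1UL << 3)
#endif

/* Query the calling task; returns the raw status bits or -errno. */
static long ssb_query(void)
{
	long ret = prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, 0, 0, 0);

	return ret < 0 ? -errno : ret;
}

/*
 * The naming is inverted between the two sides: "speculation disabled"
 * in the prctl result means the mitigation is switched on for the task.
 */
static int mitigation_active(long status)
{
	assert(status >= 0);	/* caller must filter out errors first */
	return (status & (PR_SPEC_DISABLE | PR_SPEC_FORCE_DISABLE)) != 0;
}

/* Short label for a value returned by ssb_query(). */
static const char *ssb_status_str(long status)
{
	if (status < 0)
		return "unavailable";
	if (status == PR_SPEC_NOT_AFFECTED)
		return "not affected";
	return mitigation_active(status) ? "mitigation on" : "mitigation off";
}
```

A caller would typically print `ssb_status_str(ssb_query())`; a negative result covers both kernels without the interface and the `ARM64_SSBD_UNKNOWN` case, which the code above reports as `-EINVAL`.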

									
										8

arch/arm64/kernel/suspend.c
									
				@@ -67,6 +67,14 @@ void notrace __cpu_suspend_exit(void)

					 */

					if (hw_breakpoint_restore)

						hw_breakpoint_restore(cpu);

					/*

					 * On resume, firmware implementing dynamic mitigation will

					 * have turned the mitigation on. If the user has forcefully

					 * disabled it, make sure their wishes are obeyed.

					 */

					if (arm64_get_ssbd_state() == ARM64_SSBD_FORCE_DISABLE)

						arm64_set_ssbd_mitigation(false);

				}

				/*

									
										4

arch/arm64/kvm/hyp-init.S
									
				@@ -118,6 +118,10 @@ CPU_BE(	orr	x4, x4, #SCTLR_ELx_EE)

					kern_hyp_va	x2

					msr	vbar_el2, x2

					/* copy tpidr_el1 into tpidr_el2 for use by HYP */

					mrs	x1, tpidr_el1

					msr	tpidr_el2, x1

					/* Hello, World! */

					eret

				ENDPROC(__kvm_hyp_init)

									
										12

arch/arm64/kvm/hyp/entry.S
									
				@@ -62,9 +62,6 @@ ENTRY(__guest_enter)

					// Store the host regs

					save_callee_saved_regs x1

					// Store the host_ctxt for use at exit time

					str	x1, [sp, #-16]!

					add	x18, x0, #VCPU_CONTEXT

					// Restore guest regs x0-x17

				@@ -118,8 +115,7 @@ ENTRY(__guest_exit)

					// Store the guest regs x19-x29, lr

					save_callee_saved_regs x1

					// Restore the host_ctxt from the stack

					ldr	x2, [sp], #16

					get_host_ctxt	x2, x3

					// Now restore the host regs

					restore_callee_saved_regs x2

				@@ -159,6 +155,10 @@ abort_guest_exit_end:

				ENDPROC(__guest_exit)

				ENTRY(__fpsimd_guest_restore)

					// x0: esr

					// x1: vcpu

					// x2-x29,lr: vcpu regs

					// vcpu x0-x1 on the stack

					stp	x2, x3, [sp, #-16]!

					stp	x4, lr, [sp, #-16]!

				@@ -173,7 +173,7 @@ alternative_else

				alternative_endif

					isb

					mrs	x3, tpidr_el2

					mov	x3, x1

					ldr	x0, [x3, #VCPU_HOST_CONTEXT]

					kern_hyp_va x0

									
										62

arch/arm64/kvm/hyp/hyp-entry.S
									
				@@ -72,13 +72,8 @@ ENDPROC(__kvm_hyp_teardown)

				el1_sync:				// Guest trapped into EL2

					stp	x0, x1, [sp, #-16]!

				alternative_if_not ARM64_HAS_VIRT_HOST_EXTN

					mrs	x1, esr_el2

				alternative_else

					mrs	x1, esr_el1

				alternative_endif

					lsr	x0, x1, #ESR_ELx_EC_SHIFT

					mrs	x0, esr_el2

					lsr	x0, x0, #ESR_ELx_EC_SHIFT

					cmp	x0, #ESR_ELx_EC_HVC64

					ccmp	x0, #ESR_ELx_EC_HVC32, #4, ne

					b.ne	el1_trap

				@@ -112,33 +107,73 @@ el1_hvc_guest:

					 */

					ldr	x1, [sp]				// Guest's x0

					eor	w1, w1, #ARM_SMCCC_ARCH_WORKAROUND_1

					cbz	w1, wa_epilogue

					/* ARM_SMCCC_ARCH_WORKAROUND_2 handling */

					eor	w1, w1, #(ARM_SMCCC_ARCH_WORKAROUND_1 ^ \

							  ARM_SMCCC_ARCH_WORKAROUND_2)

					cbnz	w1, el1_trap

					mov	x0, x1

				#ifdef CONFIG_ARM64_SSBD

				alternative_cb	arm64_enable_wa2_handling

					b	wa2_end

				alternative_cb_end

					get_vcpu_ptr	x2, x0

					ldr	x0, [x2, #VCPU_WORKAROUND_FLAGS]

					// Sanitize the argument and update the guest flags

					ldr	x1, [sp, #8]			// Guest's x1

					clz	w1, w1				// Murphy's device:

					lsr	w1, w1, #5			// w1 = !!w1 without using

					eor	w1, w1, #1			// the flags...

					bfi	x0, x1, #VCPU_WORKAROUND_2_FLAG_SHIFT, #1

					str	x0, [x2, #VCPU_WORKAROUND_FLAGS]

					/* Check that we actually need to perform the call */

					hyp_ldr_this_cpu x0, arm64_ssbd_callback_required, x2

					cbz	x0, wa2_end

					mov	w0, #ARM_SMCCC_ARCH_WORKAROUND_2

					smc	#0

					/* Don't leak data from the SMC call */

					mov	x3, xzr

				wa2_end:

					mov	x2, xzr

					mov	x1, xzr

				#endif

				wa_epilogue:

					mov	x0, xzr

					add	sp, sp, #16

					eret

				el1_trap:

					get_vcpu_ptr	x1, x0

					mrs		x0, esr_el2

					lsr		x0, x0, #ESR_ELx_EC_SHIFT

					/*

					 * x0: ESR_EC

					 * x1: vcpu pointer

					 */

					/* Guest accessed VFP/SIMD registers, save host, restore Guest */

					cmp	x0, #ESR_ELx_EC_FP_ASIMD

					b.eq	__fpsimd_guest_restore

					mrs	x1, tpidr_el2

					mov	x0, #ARM_EXCEPTION_TRAP

					b	__guest_exit

				el1_irq:

					stp     x0, x1, [sp, #-16]!

					mrs	x1, tpidr_el2

					get_vcpu_ptr	x1, x0

					mov	x0, #ARM_EXCEPTION_IRQ

					b	__guest_exit

				el1_error:

					stp     x0, x1, [sp, #-16]!

					mrs	x1, tpidr_el2

					get_vcpu_ptr	x1, x0

					mov	x0, #ARM_EXCEPTION_EL1_SERROR

					b	__guest_exit

				@@ -173,6 +208,11 @@ ENTRY(__hyp_do_panic)

					eret

				ENDPROC(__hyp_do_panic)

				ENTRY(__hyp_panic)

					get_host_ctxt x0, x1

					b	hyp_panic

				ENDPROC(__hyp_panic)

				.macro invalid_vector	label, target = __hyp_panic

					.align	2

				\label:
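
The "Murphy's device" sequence in the `ARM_SMCCC_ARCH_WORKAROUND_2` path of this file (`clz`, `lsr #5`, `eor #1`) turns the guest's argument into 0 or 1 without touching the condition flags, and `bfi` then stores that bit in the vcpu flags. The C sketch below mirrors those steps under stated assumptions: `clz32`, `bool_normalise` and `set_flag_bit` are hypothetical helper names, and `clz32` is written out by hand so that a zero input yields 32, as the AArch64 `CLZ` instruction does on a W register.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of a 32-bit value; returns 32 for zero. */
static uint32_t clz32(uint32_t x)
{
	uint32_t n = 0;

	if (x == 0)
		return 32;
	while (!(x & 0x80000000u)) {
		x <<= 1;
		n++;
	}
	return n;
}

/* w = !!w without a compare: clz, lsr #5, eor #1 */
static uint32_t bool_normalise(uint32_t w)
{
	w = clz32(w);	/* 32 only when the input was zero, else 0..31 */
	w >>= 5;	/* 1 when the count was 32, else 0 */
	w ^= 1;		/* invert: 1 for non-zero input, 0 for zero */
	return w;
}

/* Insert a single bit at position shift, like "bfi x0, x1, #shift, #1". */
static uint64_t set_flag_bit(uint64_t flags, uint32_t bit, unsigned int shift)
{
	assert(bit <= 1 && shift < 64);
	return (flags & ~(UINT64_C(1) << shift)) | ((uint64_t)bit << shift);
}
```

The trick relies on 32 being the only leading-zero count with bit 5 set, which is why a single shift is enough to separate zero from non-zero inputs.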

									
										64

arch/arm64/kvm/hyp/switch.c
									
				@@ -15,6 +15,7 @@

				 * along with this program.  If not, see <http://www.gnu.org/licenses/>.

				 */

				#include <linux/arm-smccc.h>

				#include <linux/types.h>

				#include <linux/jump_label.h>

				#include <uapi/linux/psci.h>

				@@ -267,6 +268,39 @@ static void __hyp_text __skip_instr(struct kvm_vcpu *vcpu)

					write_sysreg_el2(*vcpu_pc(vcpu), elr);

				}

				static inline bool __hyp_text __needs_ssbd_off(struct kvm_vcpu *vcpu)

				{

					if (!cpus_have_cap(ARM64_SSBD))

						return false;

					return !(vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG);

				}

				static void __hyp_text __set_guest_arch_workaround_state(struct kvm_vcpu *vcpu)

				{

				#ifdef CONFIG_ARM64_SSBD

					/*

					 * The host runs with the workaround always present. If the

					 * guest wants it disabled, so be it...

					 */

					if (__needs_ssbd_off(vcpu) &&

					    __hyp_this_cpu_read(arm64_ssbd_callback_required))

						arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, 0, NULL);

				#endif

				}

				static void __hyp_text __set_host_arch_workaround_state(struct kvm_vcpu *vcpu)

				{

				#ifdef CONFIG_ARM64_SSBD

					/*

					 * If the guest has disabled the workaround, bring it back on.

					 */

					if (__needs_ssbd_off(vcpu) &&

					    __hyp_this_cpu_read(arm64_ssbd_callback_required))

						arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, 1, NULL);

				#endif

				}

				int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)

				{

					struct kvm_cpu_context *host_ctxt;

				@@ -275,9 +309,9 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)

					u64 exit_code;

					vcpu = kern_hyp_va(vcpu);

					write_sysreg(vcpu, tpidr_el2);

					host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);

					host_ctxt->__hyp_running_vcpu = vcpu;

					guest_ctxt = &vcpu->arch.ctxt;

					__sysreg_save_host_state(host_ctxt);

				@@ -297,6 +331,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)

					__sysreg_restore_guest_state(guest_ctxt);

					__debug_restore_state(vcpu, kern_hyp_va(vcpu->arch.debug_ptr), guest_ctxt);

					__set_guest_arch_workaround_state(vcpu);

					/* Jump in the fire! */

				again:

					exit_code = __guest_enter(vcpu, host_ctxt);

				@@ -339,6 +375,8 @@ again:

						}

					}

					__set_host_arch_workaround_state(vcpu);

					fp_enabled = __fpsimd_enabled();

					__sysreg_save_guest_state(guest_ctxt);

				@@ -364,7 +402,8 @@ again:

				static const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n";

				static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par)

				static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par,

									     struct kvm_vcpu *vcpu)

				{

					unsigned long str_va;

				@@ -378,35 +417,32 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par)

					__hyp_do_panic(str_va,

						       spsr,  elr,

						       read_sysreg(esr_el2),   read_sysreg_el2(far),

						       read_sysreg(hpfar_el2), par,

						       (void *)read_sysreg(tpidr_el2));

						       read_sysreg(hpfar_el2), par, vcpu);

				}

				static void __hyp_text __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par)

				static void __hyp_text __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,

									    struct kvm_vcpu *vcpu)

				{

					panic(__hyp_panic_string,

					      spsr,  elr,

					      read_sysreg_el2(esr),   read_sysreg_el2(far),

					      read_sysreg(hpfar_el2), par,

					      (void *)read_sysreg(tpidr_el2));

					      read_sysreg(hpfar_el2), par, vcpu);

				}

				static hyp_alternate_select(__hyp_call_panic,

							    __hyp_call_panic_nvhe, __hyp_call_panic_vhe,

							    ARM64_HAS_VIRT_HOST_EXTN);

				void __hyp_text __noreturn __hyp_panic(void)

				void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *host_ctxt)

				{

					struct kvm_vcpu *vcpu = NULL;

					u64 spsr = read_sysreg_el2(spsr);

					u64 elr = read_sysreg_el2(elr);

					u64 par = read_sysreg(par_el1);

					if (read_sysreg(vttbr_el2)) {

						struct kvm_vcpu *vcpu;

						struct kvm_cpu_context *host_ctxt;

						vcpu = (struct kvm_vcpu *)read_sysreg(tpidr_el2);

						host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);

						vcpu = host_ctxt->__hyp_running_vcpu;

						__timer_save_state(vcpu);

						__deactivate_traps(vcpu);

						__deactivate_vm(vcpu);

				@@ -414,7 +450,7 @@ void __hyp_text __noreturn __hyp_panic(void)

					}

					/* Call panic for real */

					__hyp_call_panic()(spsr, elr, par);

					__hyp_call_panic()(spsr, elr, par, vcpu);

					unreachable();

				}

									
										21

arch/arm64/kvm/hyp/sysreg-sr.c
									
				@@ -27,8 +27,8 @@ static void __hyp_text __sysreg_do_nothing(struct kvm_cpu_context *ctxt) { }

				/*

				 * Non-VHE: Both host and guest must save everything.

				 *

				 * VHE: Host must save tpidr*_el[01], actlr_el1, mdscr_el1, sp0, pc,

				 * pstate, and guest must save everything.

				 * VHE: Host must save tpidr*_el0, actlr_el1, mdscr_el1, sp_el0,

				 * and guest must save everything.

				 */

				static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt)

				@@ -36,11 +36,8 @@ static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt)

					ctxt->sys_regs[ACTLR_EL1]	= read_sysreg(actlr_el1);

					ctxt->sys_regs[TPIDR_EL0]	= read_sysreg(tpidr_el0);

					ctxt->sys_regs[TPIDRRO_EL0]	= read_sysreg(tpidrro_el0);

					ctxt->sys_regs[TPIDR_EL1]	= read_sysreg(tpidr_el1);

					ctxt->sys_regs[MDSCR_EL1]	= read_sysreg(mdscr_el1);

					ctxt->gp_regs.regs.sp		= read_sysreg(sp_el0);

					ctxt->gp_regs.regs.pc		= read_sysreg_el2(elr);

					ctxt->gp_regs.regs.pstate	= read_sysreg_el2(spsr);

				}

				static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt)

				@@ -62,10 +59,13 @@ static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt)

					ctxt->sys_regs[AMAIR_EL1]	= read_sysreg_el1(amair);

					ctxt->sys_regs[CNTKCTL_EL1]	= read_sysreg_el1(cntkctl);

					ctxt->sys_regs[PAR_EL1]		= read_sysreg(par_el1);

					ctxt->sys_regs[TPIDR_EL1]	= read_sysreg(tpidr_el1);

					ctxt->gp_regs.sp_el1		= read_sysreg(sp_el1);

					ctxt->gp_regs.elr_el1		= read_sysreg_el1(elr);

					ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(spsr);

					ctxt->gp_regs.regs.pc		= read_sysreg_el2(elr);

					ctxt->gp_regs.regs.pstate	= read_sysreg_el2(spsr);

				}

				static hyp_alternate_select(__sysreg_call_save_host_state,

				@@ -89,11 +89,8 @@ static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context *ctx

					write_sysreg(ctxt->sys_regs[ACTLR_EL1],	  actlr_el1);

					write_sysreg(ctxt->sys_regs[TPIDR_EL0],	  tpidr_el0);

					write_sysreg(ctxt->sys_regs[TPIDRRO_EL0], tpidrro_el0);

					write_sysreg(ctxt->sys_regs[TPIDR_EL1],	  tpidr_el1);

					write_sysreg(ctxt->sys_regs[MDSCR_EL1],	  mdscr_el1);

					write_sysreg(ctxt->gp_regs.regs.sp,	  sp_el0);

					write_sysreg_el2(ctxt->gp_regs.regs.pc,	  elr);

					write_sysreg_el2(ctxt->gp_regs.regs.pstate, spsr);

				}

				static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt)

				@@ -115,10 +112,13 @@ static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt)

					write_sysreg_el1(ctxt->sys_regs[AMAIR_EL1],	amair);

					write_sysreg_el1(ctxt->sys_regs[CNTKCTL_EL1], 	cntkctl);

					write_sysreg(ctxt->sys_regs[PAR_EL1],		par_el1);

					write_sysreg(ctxt->sys_regs[TPIDR_EL1],		tpidr_el1);

					write_sysreg(ctxt->gp_regs.sp_el1,		sp_el1);

					write_sysreg_el1(ctxt->gp_regs.elr_el1,		elr);

					write_sysreg_el1(ctxt->gp_regs.spsr[KVM_SPSR_EL1],spsr);

					write_sysreg_el2(ctxt->gp_regs.regs.pc,		elr);

					write_sysreg_el2(ctxt->gp_regs.regs.pstate,	spsr);

				}

				static hyp_alternate_select(__sysreg_call_restore_host_state,

				@@ -183,3 +183,8 @@ void __hyp_text __sysreg32_restore_state(struct kvm_vcpu *vcpu)

					if (vcpu->arch.debug_flags & KVM_ARM64_DEBUG_DIRTY)

						write_sysreg(sysreg[DBGVCR32_EL2], dbgvcr32_el2);

				}

				void __hyp_text __kvm_set_tpidr_el2(u64 tpidr_el2)

				{

					asm("msr tpidr_el2, %0": : "r" (tpidr_el2));

				}

									
										4

arch/arm64/kvm/reset.c
									
				@@ -135,6 +135,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)

					/* Reset PMU */

					kvm_pmu_vcpu_reset(vcpu);

					/* Default workaround setup is enabled (if supported) */

					if (kvm_arm_have_ssbd() == KVM_SSBD_KERNEL)

						vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;

					/* Reset timer */

					return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);

				}

									
										4

arch/arm64/mm/init.c
									
				@@ -468,11 +468,13 @@ void __init mem_init(void)

					BUILD_BUG_ON(TASK_SIZE_32			> TASK_SIZE_64);

				#endif

				#ifdef CONFIG_SPARSEMEM_VMEMMAP

					/*

					 * Make sure we chose the upper bound of sizeof(struct page)

					 * correctly.

					 * correctly when sizing the VMEMMAP array.

					 */

					BUILD_BUG_ON(sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT));

				#endif

					if (PAGE_SIZE >= 16384 && get_num_physpages() <= 128) {

						extern int sysctl_overcommit_memory;

									
										4

arch/arm64/mm/mmu.c
									
				@@ -804,12 +804,12 @@ int pmd_clear_huge(pmd_t *pmd)

					return 1;

				}

				int pud_free_pmd_page(pud_t *pud)

				int pud_free_pmd_page(pud_t *pud, unsigned long addr)

				{

					return pud_none(*pud);

				}

				int pmd_free_pte_page(pmd_t *pmd)

				int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)

				{

					return pmd_none(*pmd);

				}

									
										5

arch/arm64/mm/proc.S
									
				@@ -186,8 +186,9 @@ ENDPROC(idmap_cpu_replace_ttbr1)

					.macro __idmap_kpti_put_pgtable_ent_ng, type

					orr	\type, \type, #PTE_NG		// Same bit for blocks and pages

					str	\type, [cur_\()\type\()p]	// Update the entry and ensure it

					dc	civac, cur_\()\type\()p		// is visible to all CPUs.

					str	\type, [cur_\()\type\()p]	// Update the entry and ensure

					dmb	sy				// that it is visible to all

					dc	civac, cur_\()\type\()p		// CPUs.

					.endm

				/*

									
										3

arch/m68k/mm/kmap.c
									
				@@ -88,7 +88,8 @@ static inline void free_io_area(void *addr)

					for (p = &iolist ; (tmp = *p) ; p = &tmp->next) {

						if (tmp->addr == addr) {

							*p = tmp->next;

							__iounmap(tmp->addr, tmp->size);

							/* remove gap added in get_io_area() */

							__iounmap(tmp->addr, tmp->size - IO_SIZE);

							kfree(tmp);

							return;

						}

									
										10

arch/microblaze/boot/Makefile
									
				@@ -21,17 +21,19 @@ $(obj)/linux.bin.gz: $(obj)/linux.bin FORCE

				quiet_cmd_cp = CP      $< $@$2

					cmd_cp = cat $< >$@$2 || (rm -f $@ && echo false)

				quiet_cmd_strip = STRIP   $@

				quiet_cmd_strip = STRIP   $< $@$2

					cmd_strip = $(STRIP) -K microblaze_start -K _end -K __log_buf \

								-K _fdt_start vmlinux -o $@

								-K _fdt_start $< -o $@$2

				UIMAGE_LOADADDR = $(CONFIG_KERNEL_BASE_ADDR)

				UIMAGE_IN = $@

				UIMAGE_OUT = $@.ub

				$(obj)/simpleImage.%: vmlinux FORCE

					$(call if_changed,cp,.unstrip)

					$(call if_changed,objcopy)

					$(call if_changed,uimage)

					$(call if_changed,strip)

					@echo 'Kernel: $@ is ready' ' (#'`cat .version`')'

					$(call if_changed,strip,.strip)

					@echo 'Kernel: $(UIMAGE_OUT) is ready' ' (#'`cat .version`')'

				clean-files += simpleImage.*.unstrip linux.bin.ub dts/*.dtb

									
										2

arch/mips/ath79/common.c
									
				@@ -58,7 +58,7 @@ EXPORT_SYMBOL_GPL(ath79_ddr_ctrl_init);

				void ath79_ddr_wb_flush(u32 reg)

				{

					void __iomem *flush_reg = ath79_ddr_wb_flush_base + reg;

					void __iomem *flush_reg = ath79_ddr_wb_flush_base + (reg * 4);

					/* Flush the DDR write buffer. */

					__raw_writel(0x1, flush_reg);
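
The change to `ath79_ddr_wb_flush()` scales `reg` by 4 before adding it to the base, which suggests `reg` is an index into a bank of 32-bit registers rather than a byte offset. A small userspace sketch of that index-to-offset conversion follows, using an ordinary array in place of MMIO; `REG_WIDTH`, `reg_addr` and `reg_write32` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define REG_WIDTH	4	/* bytes per 32-bit register */

/* Byte address of register number reg, counted from base. */
static void *reg_addr(void *base, uint32_t reg)
{
	assert(base != NULL);
	return (char *)base + (size_t)reg * REG_WIDTH;
}

/* Write a 32-bit value to register number reg. */
static void reg_write32(void *base, uint32_t reg, uint32_t val)
{
	memcpy(reg_addr(base, reg), &val, sizeof(val));
}
```

Without the multiplication, register index 2 would resolve to byte offset 2, which lands in the middle of the first register instead of at the start of the third.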

									
										6

arch/mips/bcm47xx/setup.c
									
				@@ -212,6 +212,12 @@ static int __init bcm47xx_cpu_fixes(void)

						 */

						if (bcm47xx_bus.bcma.bus.chipinfo.id == BCMA_CHIP_ID_BCM4706)

							cpu_wait = NULL;

						/*

						 * BCM47XX Erratum "R10: PCIe Transactions Periodically Fail"

						 * Enable ExternalSync for sync instruction to take effect

						 */

						set_c0_config7(MIPS_CONF7_ES);

						break;

				#endif

					}

									
										2

arch/mips/include/asm/io.h
									
				@@ -412,6 +412,8 @@ static inline type pfx##in##bwlq##p(unsigned long port)			\

					__val = *__addr;						\

					slow;								\

													\

					/* prevent prefetching of coherent DMA data prematurely */	\

					rmb();								\

					return pfx##ioswab##bwlq(__addr, __val);			\

				}

									
										3

arch/mips/include/asm/mipsregs.h

@@ -663,6 +663,8 @@
 #define MIPS_CONF7_WII		(_ULCAST_(1) << 31)
 #define MIPS_CONF7_RPS		(_ULCAST_(1) << 2)
+/* ExternalSync */
+#define MIPS_CONF7_ES		(_ULCAST_(1) << 8)
 #define MIPS_CONF7_IAR		(_ULCAST_(1) << 10)
 #define MIPS_CONF7_AR		(_ULCAST_(1) << 16)
@@ -2641,6 +2643,7 @@ __BUILD_SET_C0(status)
 __BUILD_SET_C0(cause)
 __BUILD_SET_C0(config)
 __BUILD_SET_C0(config5)
+__BUILD_SET_C0(config7)
 __BUILD_SET_C0(intcontrol)
 __BUILD_SET_C0(intctl)
 __BUILD_SET_C0(srsmap)

									
										27

arch/mips/kernel/mcount.S

@@ -116,10 +116,20 @@ ftrace_stub:
 NESTED(_mcount, PT_SIZE, ra)
 	PTR_LA	t1, ftrace_stub
 	PTR_L	t2, ftrace_trace_function /* Prepare t2 for (1) */
-	bne	t1, t2, static_trace
+	beq	t1, t2, fgraph_trace
 	 nop
+	MCOUNT_SAVE_REGS
+	move	a0, ra		/* arg1: self return address */
+	jalr	t2		/* (1) call *ftrace_trace_function */
+	 move	a1, AT		/* arg2: parent's return address */
+	MCOUNT_RESTORE_REGS
+fgraph_trace:
 #ifdef	CONFIG_FUNCTION_GRAPH_TRACER
+	PTR_LA	t1, ftrace_stub
 	PTR_L	t3, ftrace_graph_return
 	bne	t1, t3, ftrace_graph_caller
 	 nop
@@ -128,24 +138,11 @@ NESTED(_mcount, PT_SIZE, ra)
 	bne	t1, t3, ftrace_graph_caller
 	 nop
 #endif
-	b	ftrace_stub
-#ifdef CONFIG_32BIT
-	 addiu sp, sp, 8
-#else
-	 nop
-#endif
-static_trace:
-	MCOUNT_SAVE_REGS
-	move	a0, ra		/* arg1: self return address */
-	jalr	t2		/* (1) call *ftrace_trace_function */
-	 move	a1, AT		/* arg2: parent's return address */
-	MCOUNT_RESTORE_REGS
 #ifdef CONFIG_32BIT
 	addiu sp, sp, 8
 #endif
 	.globl ftrace_stub
 ftrace_stub:
 	RETURN_BACK

									
										43

arch/mips/kernel/process.c

@@ -26,6 +26,7 @@
 #include <linux/kallsyms.h>
 #include <linux/random.h>
 #include <linux/prctl.h>
+#include <linux/nmi.h>
 #include <asm/asm.h>
 #include <asm/bootinfo.h>
@@ -633,28 +634,42 @@ unsigned long arch_align_stack(unsigned long sp)
 	return sp & ALMASK;
 }
-static void arch_dump_stack(void *info)
+static DEFINE_PER_CPU(struct call_single_data, backtrace_csd);
+static struct cpumask backtrace_csd_busy;
+static void handle_backtrace(void *info)
 {
-	struct pt_regs *regs;
+	nmi_cpu_backtrace(get_irq_regs());
+	cpumask_clear_cpu(smp_processor_id(), &backtrace_csd_busy);
+}
-	regs = get_irq_regs();
+static void raise_backtrace(cpumask_t *mask)
+{
+	struct call_single_data *csd;
+	int cpu;
-	if (regs)
-		show_regs(regs);
+	for_each_cpu(cpu, mask) {
+		/*
+		 * If we previously sent an IPI to the target CPU & it hasn't
+		 * cleared its bit in the busy cpumask then it didn't handle
+		 * our previous IPI & it's not safe for us to reuse the
+		 * call_single_data_t.
+		 */
+		if (cpumask_test_and_set_cpu(cpu, &backtrace_csd_busy)) {
+			pr_warn("Unable to send backtrace IPI to CPU%u - perhaps it hung?\n",
+				cpu);
+			continue;
+		}
-	dump_stack();
+		csd = &per_cpu(backtrace_csd, cpu);
+		csd->func = handle_backtrace;
+		smp_call_function_single_async(cpu, csd);
+	}
 }
 void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
-	long this_cpu = get_cpu();
-	if (cpumask_test_cpu(this_cpu, mask) && !exclude_self)
-		dump_stack();
-	smp_call_function_many(mask, arch_dump_stack, NULL, 1);
-	put_cpu();
+	nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace);
 }
 int mips_get_process_fp_mode(struct task_struct *task)

									
										1

arch/mips/kernel/traps.c

@@ -351,6 +351,7 @@ static void __show_regs(const struct pt_regs *regs)
 void show_regs(struct pt_regs *regs)
 {
 	__show_regs((struct pt_regs *)regs);
+	dump_stack();
 }
 void show_registers(struct pt_regs *regs)

									
										37

arch/mips/mm/ioremap.c

@@ -9,6 +9,7 @@
 #include <linux/export.h>
 #include <asm/addrspace.h>
 #include <asm/byteorder.h>
+#include <linux/ioport.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
@@ -97,6 +98,20 @@ static int remap_area_pages(unsigned long address, phys_addr_t phys_addr,
 	return error;
 }
+static int __ioremap_check_ram(unsigned long start_pfn, unsigned long nr_pages,
+			       void *arg)
+{
+	unsigned long i;
+	for (i = 0; i < nr_pages; i++) {
+		if (pfn_valid(start_pfn + i) &&
+		    !PageReserved(pfn_to_page(start_pfn + i)))
+			return 1;
+	}
+	return 0;
+}
 /*
  * Generic mapping function (not visible outside):
  */
@@ -115,8 +130,8 @@ static int remap_area_pages(unsigned long address, phys_addr_t phys_addr,
 void __iomem * __ioremap(phys_addr_t phys_addr, phys_addr_t size, unsigned long flags)
 {
+	unsigned long offset, pfn, last_pfn;
 	struct vm_struct * area;
-	unsigned long offset;
 	phys_addr_t last_addr;
 	void * addr;
@@ -136,18 +151,16 @@ void __iomem * __ioremap(phys_addr_t phys_addr, phys_addr_t size, unsigned long
 		return (void __iomem *) CKSEG1ADDR(phys_addr);
 	/*
-	 * Don't allow anybody to remap normal RAM that we're using..
+	 * Don't allow anybody to remap RAM that may be allocated by the page
+	 * allocator, since that could lead to races & data clobbering.
 	 */
-	if (phys_addr < virt_to_phys(high_memory)) {
-		char *t_addr, *t_end;
-		struct page *page;
-		t_addr = __va(phys_addr);
-		t_end = t_addr + (size - 1);
-		for(page = virt_to_page(t_addr); page <= virt_to_page(t_end); page++)
-			if(!PageReserved(page))
-				return NULL;
+	pfn = PFN_DOWN(phys_addr);
+	last_pfn = PFN_DOWN(last_addr);
+	if (walk_system_ram_range(pfn, last_pfn - pfn + 1, NULL,
+				  __ioremap_check_ram) == 1) {
+		WARN_ONCE(1, "ioremap on RAM at %pa - %pa\n",
+			  &phys_addr, &last_addr);
+		return NULL;
 	}
 	/*

									
										2

arch/mips/pci/pci.c

@@ -55,7 +55,7 @@ void pci_resource_to_user(const struct pci_dev *dev, int bar,
 	phys_addr_t size = resource_size(rsrc);
 	*start = fixup_bigphys_addr(rsrc->start, size);
-	*end = rsrc->start + size;
+	*end = rsrc->start + size - 1;
 }
 int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
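
The `size - 1` fix above reflects the inclusive-end convention used for resource ranges: a region of `size` bytes starting at `start` ends at `start + size - 1`, and its size is recovered as `end - start + 1`. A small sketch of that round trip, using hypothetical helper names (`region_end_inclusive`, `region_size`) that are not kernel functions:

```c
#include <stdint.h>

/* Hypothetical helper: inclusive last address of a region (size must be > 0). */
static inline uint64_t region_end_inclusive(uint64_t start, uint64_t size)
{
	return start + size - 1;
}

/* Hypothetical helper: size of a region given inclusive bounds. */
static inline uint64_t region_size(uint64_t start, uint64_t end)
{
	return end - start + 1;
}
```

Computing the end as `start + size` would report one byte more than the region actually covers once the size is recomputed from the bounds.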

2

arch/parisc/Kconfig

@@ -184,7 +184,7 @@ config PREFETCH
 config MLONGCALLS
 	bool "Enable the -mlong-calls compiler option for big kernels"
-	def_bool y if (!MODULES)
+	default y
 	depends on PA8X00
 	help
 	  If you configure the kernel to include many drivers built-in instead

									
										32

arch/parisc/include/asm/barrier.h (new file)

@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_BARRIER_H
+#define __ASM_BARRIER_H
+#ifndef __ASSEMBLY__
+/* The synchronize caches instruction executes as a nop on systems in
+   which all memory references are performed in order. */
+#define synchronize_caches() __asm__ __volatile__ ("sync" : : : "memory")
+#if defined(CONFIG_SMP)
+#define mb()		do { synchronize_caches(); } while (0)
+#define rmb()		mb()
+#define wmb()		mb()
+#define dma_rmb()	mb()
+#define dma_wmb()	mb()
+#else
+#define mb()		barrier()
+#define rmb()		barrier()
+#define wmb()		barrier()
+#define dma_rmb()	barrier()
+#define dma_wmb()	barrier()
+#endif
+#define __smp_mb()	mb()
+#define __smp_rmb()	mb()
+#define __smp_wmb()	mb()
+#include <asm-generic/barrier.h>
+#endif /* !__ASSEMBLY__ */
+#endif /* __ASM_BARRIER_H */

									
										2

arch/parisc/kernel/entry.S

@@ -481,6 +481,8 @@
 	/* Release pa_tlb_lock lock without reloading lock address. */
 	.macro		tlb_unlock0	spc,tmp
 #ifdef CONFIG_SMP
 	or,COND(=)	%r0,\spc,%r0
+	sync
+	or,COND(=)	%r0,\spc,%r0
 	stw             \spc,0(\tmp)
 #endif

									
										1

arch/parisc/kernel/pacache.S

@@ -354,6 +354,7 @@ ENDPROC_CFI(flush_data_cache_local)
 	.macro	tlb_unlock	la,flags,tmp
 #ifdef CONFIG_SMP
 	ldi		1,\tmp
+	sync
 	stw		\tmp,0(\la)
 	mtsm		\flags
 #endif

									
										4

arch/parisc/kernel/syscall.S

@@ -633,6 +633,7 @@ cas_action:
 	sub,<>	%r28, %r25, %r0
 2:	stw,ma	%r24, 0(%r26)
 	/* Free lock */
+	sync
 	stw,ma	%r20, 0(%sr2,%r20)
 #if ENABLE_LWS_DEBUG
 	/* Clear thread register indicator */
@@ -647,6 +648,7 @@ cas_action:
 3:
 	/* Error occurred on load or store */
 	/* Free lock */
+	sync
 	stw	%r20, 0(%sr2,%r20)
 #if ENABLE_LWS_DEBUG
 	stw	%r0, 4(%sr2,%r20)
@@ -848,6 +850,7 @@ cas2_action:
 cas2_end:
 	/* Free lock */
+	sync
 	stw,ma	%r20, 0(%sr2,%r20)
 	/* Enable interrupts */
 	ssm	PSW_SM_I, %r0
@@ -858,6 +861,7 @@ cas2_end:
 22:
 	/* Error occurred on load or store */
 	/* Free lock */
+	sync
 	stw	%r20, 0(%sr2,%r20)
 	ssm	PSW_SM_I, %r0
 	ldo	1(%r0),%r28

									
										28

arch/powerpc/kernel/eeh_driver.c

@@ -450,9 +450,11 @@ static void *eeh_add_virt_device(void *data, void *userdata)
 	driver = eeh_pcid_get(dev);
 	if (driver) {
-		eeh_pcid_put(dev);
-		if (driver->err_handler)
+		if (driver->err_handler) {
+			eeh_pcid_put(dev);
 			return NULL;
+		}
+		eeh_pcid_put(dev);
 	}
 #ifdef CONFIG_PPC_POWERNV
@@ -489,17 +491,19 @@ static void *eeh_rmv_device(void *data, void *userdata)
 	if (eeh_dev_removed(edev))
 		return NULL;
-	driver = eeh_pcid_get(dev);
-	if (driver) {
-		eeh_pcid_put(dev);
-		if (removed &&
-		    eeh_pe_passed(edev->pe))
-			return NULL;
-		if (removed &&
-		    driver->err_handler &&
-		    driver->err_handler->error_detected &&
-		    driver->err_handler->slot_reset)
+	if (removed) {
+		if (eeh_pe_passed(edev->pe))
 			return NULL;
+		driver = eeh_pcid_get(dev);
+		if (driver) {
+			if (driver->err_handler &&
+			    driver->err_handler->error_detected &&
+			    driver->err_handler->slot_reset) {
+				eeh_pcid_put(dev);
+				return NULL;
+			}
+			eeh_pcid_put(dev);
+		}
 	}
 	/* Remove it from PCI subsystem */
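
Both hunks above move `eeh_pcid_put()` so that `driver` is only dereferenced while the reference obtained from `eeh_pcid_get()` is still held. The same ordering rule can be illustrated with a toy reference count; everything in this sketch (`toy_driver`, `toy_get`, `toy_put`, `toy_should_skip`) is hypothetical and only models the pattern, not the EEH code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a reference-counted driver object. */
struct toy_driver {
	int refs;
	bool has_err_handler;
};

static struct toy_driver *toy_get(struct toy_driver *d)
{
	if (d)
		d->refs++;
	return d;
}

static void toy_put(struct toy_driver *d)
{
	if (d)
		d->refs--;
}

/* Read every field that is needed first, then drop the reference. */
static bool toy_should_skip(struct toy_driver *d)
{
	struct toy_driver *drv = toy_get(d);
	bool skip = false;

	if (drv) {
		skip = drv->has_err_handler;	/* read while the reference is held */
		toy_put(drv);			/* release only after the last use */
	}
	return skip;
}
```

In a real refcounted object the release may free or unload the target, so any read placed after the put would be a use-after-release.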

									
										1

arch/powerpc/kernel/entry_64.S

@@ -586,6 +586,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_1T_SEGMENT)
 	 * actually hit this code path.
 	 */
+	isync
 	slbie	r6
 	slbie	r6		/* Workaround POWER5 < DD2.1 issue */
 	slbmte	r7,r0

									
										3

arch/powerpc/kernel/fadump.c

@@ -1033,6 +1033,9 @@ void fadump_cleanup(void)
 		init_fadump_mem_struct(&fdm,
 			be64_to_cpu(fdm_active->cpu_state_data.destination_address));
 		fadump_invalidate_dump(&fdm);
+	} else if (fw_dump.dump_registered) {
+		/* Un-register Firmware-assisted dump if it was registered. */
+		fadump_unregister_dump(&fdm);
 	}
 }

									
										2

arch/powerpc/kernel/head_8xx.S

@@ -769,7 +769,7 @@ start_here:
 	tovirt(r6,r6)
 	lis	r5, abatron_pteptrs@h
 	ori	r5, r5, abatron_pteptrs@l
-	stw	r5, 0xf0(r0)	/* Must match your Abatron config file */
+	stw	r5, 0xf0(0)	/* Must match your Abatron config file */
 	tophys(r5,r5)
 	stw	r6, 0(r5)

									
										4

arch/powerpc/kernel/hw_breakpoint.c

@@ -175,8 +175,8 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
 	if (cpu_has_feature(CPU_FTR_DAWR)) {
 		length_max = 512 ; /* 64 doublewords */
 		/* DAWR region can't cross 512 boundary */
-		if ((bp->attr.bp_addr >> 10) !=
-		    ((bp->attr.bp_addr + bp->attr.bp_len - 1) >> 10))
+		if ((bp->attr.bp_addr >> 9) !=
+		    ((bp->attr.bp_addr + bp->attr.bp_len - 1) >> 9))
 			return -EINVAL;
 	}
 	if (info->len >
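
The shift in the check above changes from 10 to 9 because a 512-byte boundary corresponds to 2^9, not 2^10. A minimal sketch of the same boundary test as a standalone function; `crosses_boundary` is a hypothetical name, and `len` is assumed to be non-zero:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the byte range [addr, addr + len - 1]
 * spans more than one (1 << shift)-byte aligned block. Assumes len > 0.
 */
static inline bool crosses_boundary(uint64_t addr, uint64_t len, unsigned int shift)
{
	return (addr >> shift) != ((addr + len - 1) >> shift);
}
```

For example, a 16-byte range starting at 0x1f8 ends at 0x207 and therefore crosses the 512-byte boundary at 0x200; with a shift of 10 the same range would be tested against a 1024-byte boundary and pass unnoticed.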

									
										1

arch/powerpc/kernel/pci_32.c

@@ -11,6 +11,7 @@
 #include <linux/sched.h>
 #include <linux/errno.h>
 #include <linux/bootmem.h>
+#include <linux/syscalls.h>
 #include <linux/irq.h>
 #include <linux/list.h>
 #include <linux/of.h>

									
										1

arch/powerpc/kernel/ptrace.c

@@ -2380,6 +2380,7 @@ static int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
 	/* Create a new breakpoint request if one doesn't exist already */
 	hw_breakpoint_init(&attr);
 	attr.bp_addr = hw_brk.address;
+	attr.bp_len = 8;
 	arch_bp_generic_fields(hw_brk.type,
 			       &attr.bp_type);

									
										8

arch/powerpc/mm/slb.c

@@ -68,14 +68,14 @@ static inline void slb_shadow_update(unsigned long ea, int ssize,
 	 * updating it.  No write barriers are needed here, provided
 	 * we only update the current CPU's SLB shadow buffer.
 	 */
-	p->save_area[index].esid = 0;
-	p->save_area[index].vsid = cpu_to_be64(mk_vsid_data(ea, ssize, flags));
-	p->save_area[index].esid = cpu_to_be64(mk_esid_data(ea, ssize, index));
+	WRITE_ONCE(p->save_area[index].esid, 0);
+	WRITE_ONCE(p->save_area[index].vsid, cpu_to_be64(mk_vsid_data(ea, ssize, flags)));
+	WRITE_ONCE(p->save_area[index].esid, cpu_to_be64(mk_esid_data(ea, ssize, index)));
 }
 static inline void slb_shadow_clear(enum slb_index index)
 {
-	get_slb_shadow()->save_area[index].esid = 0;
+	WRITE_ONCE(get_slb_shadow()->save_area[index].esid, 0);
 }
 static inline void create_shadowed_slbe(unsigned long ea, int ssize,

									
										34

arch/powerpc/net/bpf_jit_comp64.c

@@ -207,25 +207,37 @@ static void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx)
 static void bpf_jit_emit_func_call(u32 *image, struct codegen_context *ctx, u64 func)
 {
+	unsigned int i, ctx_idx = ctx->idx;
+	/* Load function address into r12 */
+	PPC_LI64(12, func);
+	/* For bpf-to-bpf function calls, the callee's address is unknown
+	 * until the last extra pass. As seen above, we use PPC_LI64() to
+	 * load the callee's address, but this may optimize the number of
+	 * instructions required based on the nature of the address.
+	 *
+	 * Since we don't want the number of instructions emitted to change,
+	 * we pad the optimized PPC_LI64() call with NOPs to guarantee that
+	 * we always have a five-instruction sequence, which is the maximum
+	 * that PPC_LI64() can emit.
+	 */
+	for (i = ctx->idx - ctx_idx; i < 5; i++)
+		PPC_NOP();
 #ifdef PPC64_ELF_ABI_v1
-	/* func points to the function descriptor */
-	PPC_LI64(b2p[TMP_REG_2], func);
-	/* Load actual entry point from function descriptor */
-	PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_2], 0);
-	/* ... and move it to LR */
-	PPC_MTLR(b2p[TMP_REG_1]);
 	/*
 	 * Load TOC from function descriptor at offset 8.
 	 * We can clobber r2 since we get called through a
 	 * function pointer (so caller will save/restore r2)
 	 * and since we don't use a TOC ourself.
 	 */
-	PPC_BPF_LL(2, b2p[TMP_REG_2], 8);
-#else
-	/* We can clobber r12 */
-	PPC_FUNC_ADDR(12, func);
-	PPC_MTLR(12);
+	PPC_BPF_LL(2, 12, 8);
+	/* Load actual entry point from function descriptor */
+	PPC_BPF_LL(12, 12, 0);
 #endif
+	PPC_MTLR(12);
 	PPC_BLRL();
 }

									
										6

arch/powerpc/platforms/chrp/time.c

@@ -27,6 +27,8 @@
 #include <asm/sections.h>
 #include <asm/time.h>
+#include <platforms/chrp/chrp.h>
 extern spinlock_t rtc_lock;
 #define NVRAM_AS0  0x74
@@ -62,7 +64,7 @@ long __init chrp_time_init(void)
 	return 0;
 }
-int chrp_cmos_clock_read(int addr)
+static int chrp_cmos_clock_read(int addr)
 {
 	if (nvram_as1 != 0)
 		outb(addr>>8, nvram_as1);
@@ -70,7 +72,7 @@ int chrp_cmos_clock_read(int addr)
 	return (inb(nvram_data));
 }
-void chrp_cmos_clock_write(unsigned long val, int addr)
+static void chrp_cmos_clock_write(unsigned long val, int addr)
 {
 	if (nvram_as1 != 0)
 		outb(addr>>8, nvram_as1);

									
										5

arch/powerpc/platforms/embedded6xx/hlwd-pic.c

@@ -35,6 +35,8 @@
  */
 #define HW_BROADWAY_ICR		0x00
 #define HW_BROADWAY_IMR		0x04
+#define HW_STARLET_ICR		0x08
+#define HW_STARLET_IMR		0x0c
 /*
@@ -74,6 +76,9 @@ static void hlwd_pic_unmask(struct irq_data *d)
 	void __iomem *io_base = irq_data_get_irq_chip_data(d);
 	setbits32(io_base + HW_BROADWAY_IMR, 1 << irq);
+	/* Make sure the ARM (aka. Starlet) doesn't handle this interrupt. */
+	clrbits32(io_base + HW_STARLET_IMR, 1 << irq);
 }

									
										4

arch/powerpc/platforms/powermac/bootx_init.c

@@ -468,7 +468,7 @@ void __init bootx_init(unsigned long r3, unsigned long r4)
 	boot_infos_t *bi = (boot_infos_t *) r4;
 	unsigned long hdr;
 	unsigned long space;
-	unsigned long ptr, x;
+	unsigned long ptr;
 	char *model;
 	unsigned long offset = reloc_offset();
@@ -562,6 +562,8 @@ void __init bootx_init(unsigned long r3, unsigned long r4)
 	 * MMU switched OFF, so this should not be useful anymore.
 	 */
 	if (bi->version < 4) {
+		unsigned long x __maybe_unused;
 		bootx_printf("Touching pages...\n");
 		/*

									
										1

arch/powerpc/platforms/powermac/setup.c

@@ -352,6 +352,7 @@ static int pmac_late_init(void)
 }
 machine_late_initcall(powermac, pmac_late_init);
+void note_bootable_part(dev_t dev, int part, int goodness);
 /*
  * This is __ref because we check for "initializing" before
  * touching any of the __init sensitive things and "initializing"

									
										1

arch/powerpc/platforms/powernv/pci-ioda.c

@@ -3424,7 +3424,6 @@ static void pnv_pci_ioda2_release_pe_dma(struct pnv_ioda_pe *pe)
 		WARN_ON(pe->table_group.group);
 	}
-	pnv_pci_ioda2_table_free_pages(tbl);
 	iommu_free_table(tbl, "pnv");
 }

									
										6

arch/s390/include/asm/cpu_mf.h

@@ -113,7 +113,7 @@ struct hws_basic_entry {
 struct hws_diag_entry {
 	unsigned int def:16;	    /* 0-15  Data Entry Format		 */
-	unsigned int R:14;	    /* 16-19 and 20-30 reserved		 */
+	unsigned int R:15;	    /* 16-19 and 20-30 reserved		 */
 	unsigned int I:1;	    /* 31 entry valid or invalid	 */
 	u8	     data[];	    /* Machine-dependent sample data	 */
 } __packed;
@@ -129,7 +129,9 @@ struct hws_trailer_entry {
 			unsigned int f:1;	/* 0 - Block Full Indicator   */
 			unsigned int a:1;	/* 1 - Alert request control  */
 			unsigned int t:1;	/* 2 - Timestamp format	      */
-			unsigned long long:61;	/* 3 - 63: Reserved	      */
+			unsigned int :29;	/* 3 - 31: Reserved	      */
+			unsigned int bsdes:16;	/* 32-47: size of basic SDE   */
+			unsigned int dsdes:16;	/* 48-63: size of diagnostic SDE */
 		};
 		unsigned long long flags;	/* 0 - 63: All indicators     */
 	};
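
The bit-field changes above only make sense if the widths add up to the word sizes named in the comments: 16 + 15 + 1 = 32 bits for the diagnostic entry header, and 1 + 1 + 1 + 29 + 16 + 16 = 64 bits for the trailer flags. A hedged sketch that checks this arithmetic at compile time; the enum and function names are hypothetical, and only the widths are taken from the hunks above:

```c
/* Field widths copied from the updated declarations; names are illustrative. */
enum {
	DIAG_DEF_BITS = 16,
	DIAG_RESERVED_BITS = 15,
	DIAG_VALID_BITS = 1,

	TRAILER_F_BITS = 1,
	TRAILER_A_BITS = 1,
	TRAILER_T_BITS = 1,
	TRAILER_RESERVED_BITS = 29,
	TRAILER_BSDES_BITS = 16,
	TRAILER_DSDES_BITS = 16,
};

static inline unsigned int diag_header_bits(void)
{
	return DIAG_DEF_BITS + DIAG_RESERVED_BITS + DIAG_VALID_BITS;
}

static inline unsigned int trailer_flag_bits(void)
{
	return TRAILER_F_BITS + TRAILER_A_BITS + TRAILER_T_BITS +
	       TRAILER_RESERVED_BITS + TRAILER_BSDES_BITS + TRAILER_DSDES_BITS;
}

/* Fail the build if a width is changed without rebalancing the others. */
_Static_assert(DIAG_DEF_BITS + DIAG_RESERVED_BITS + DIAG_VALID_BITS == 32,
	       "diagnostic entry header must be 32 bits");
_Static_assert(TRAILER_F_BITS + TRAILER_A_BITS + TRAILER_T_BITS +
	       TRAILER_RESERVED_BITS + TRAILER_BSDES_BITS +
	       TRAILER_DSDES_BITS == 64,
	       "trailer flags must be 64 bits");
```

With the previous 14-bit reserved field the header widths summed to 31, so the valid bit would not have been the last bit of the 32-bit word.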

									
										4

arch/s390/kernel/entry.S

@@ -1187,7 +1187,7 @@ cleanup_critical:
 	jl	0f
 	clg	%r9,BASED(.Lcleanup_table+104)	# .Lload_fpu_regs_end
 	jl	.Lcleanup_load_fpu_regs
-0:	BR_EX	%r14
+0:	BR_EX	%r14,%r11
 	.align	8
 .Lcleanup_table:
@@ -1217,7 +1217,7 @@ cleanup_critical:
 	ni	__SIE_PROG0C+3(%r9),0xfe	# no longer in SIE
 	lctlg	%c1,%c1,__LC_USER_ASCE		# load primary asce
 	larl	%r9,sie_exit			# skip forward to sie_exit
-	BR_EX	%r14
+	BR_EX	%r14,%r11
 #endif
 .Lcleanup_system_call:

1

arch/x86/Kconfig

@@ -147,6 +147,7 @@ config X86
 	select HAVE_UID16			if X86_32 || IA32_EMULATION
 	select HAVE_UNSTABLE_SCHED_CLOCK
 	select HAVE_USER_RETURN_NOTIFIER
+	select HOTPLUG_SMT			if SMP
 	select IRQ_FORCED_THREADING
 	select MODULES_USE_ELF_RELA		if X86_64
 	select MODULES_USE_ELF_REL		if X86_32

									
										5

arch/x86/crypto/crc32c-intel_glue.c

@@ -58,16 +58,11 @@
 asmlinkage unsigned int crc_pcl(const u8 *buffer, int len,
 				unsigned int crc_init);
 static int crc32c_pcl_breakeven = CRC32C_PCL_BREAKEVEN_EAGERFPU;
-#if defined(X86_FEATURE_EAGER_FPU)
 #define set_pcl_breakeven_point()					\
 do {									\
 	if (!use_eager_fpu())						\
 		crc32c_pcl_breakeven = CRC32C_PCL_BREAKEVEN_NOEAGERFPU;	\
 } while (0)
-#else
-#define set_pcl_breakeven_point()					\
-	(crc32c_pcl_breakeven = CRC32C_PCL_BREAKEVEN_NOEAGERFPU)
-#endif
 #endif /* CONFIG_X86_64 */
 static u32 crc32c_intel_le_hw_byte(u32 crc, unsigned char const *data, size_t length)

									
										2

arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S

@@ -265,7 +265,7 @@ ENTRY(sha256_mb_mgr_get_comp_job_avx2)
 	vpinsrd	$1, _args_digest+1*32(state, idx, 4), %xmm0, %xmm0
 	vpinsrd	$2, _args_digest+2*32(state, idx, 4), %xmm0, %xmm0
 	vpinsrd	$3, _args_digest+3*32(state, idx, 4), %xmm0, %xmm0
-	vmovd   _args_digest(state , idx, 4) , %xmm0
+	vmovd	_args_digest+4*32(state, idx, 4), %xmm1
 	vpinsrd	$1, _args_digest+5*32(state, idx, 4), %xmm1, %xmm1
 	vpinsrd	$2, _args_digest+6*32(state, idx, 4), %xmm1, %xmm1
 	vpinsrd	$3, _args_digest+7*32(state, idx, 4), %xmm1, %xmm1

									
										2

arch/x86/events/intel/uncore.c

@@ -212,7 +212,7 @@ void uncore_perf_event_update(struct intel_uncore_box *box, struct perf_event *e
 	u64 prev_count, new_count, delta;
 	int shift;
-	if (event->hw.idx >= UNCORE_PMC_IDX_FIXED)
+	if (event->hw.idx == UNCORE_PMC_IDX_FIXED)
 		shift = 64 - uncore_fixed_ctr_bits(box);
 	else
 		shift = 64 - uncore_perf_ctr_bits(box);

									
										2

arch/x86/events/intel/uncore_nhmex.c

@@ -245,7 +245,7 @@ static void nhmex_uncore_msr_enable_event(struct intel_uncore_box *box, struct p
 {
 	struct hw_perf_event *hwc = &event->hw;
-	if (hwc->idx >= UNCORE_PMC_IDX_FIXED)
+	if (hwc->idx == UNCORE_PMC_IDX_FIXED)
 		wrmsrl(hwc->config_base, NHMEX_PMON_CTL_EN_BIT0);
 	else if (box->pmu->type->event_mask & NHMEX_PMON_CTL_EN_BIT0)
 		wrmsrl(hwc->config_base, hwc->config | NHMEX_PMON_CTL_EN_BIT22);

Compare commits

711 Commits v4.9.108 ... v4.9.122

24 Documentation/ABI/testing/sysfs-devices-system-cpu
19 Documentation/Changes
1 Documentation/devicetree/bindings/net/dsa/b53.txt
23 Documentation/devicetree/bindings/net/dsa/qca8k.txt
1 Documentation/devicetree/bindings/net/meson-dwmac.txt
2 Documentation/devicetree/bindings/pinctrl/meson,pinctrl.txt
1 Documentation/index.rst
95 Documentation/kernel-parameters.txt
610 Documentation/l1tf.rst (Normal file)
3 Documentation/printk-formats.txt
40 Documentation/virtual/kvm/api.txt
6 Makefile
3 arch/Kconfig
1 arch/arc/configs/axs101_defconfig
1 arch/arc/configs/axs103_defconfig
1 arch/arc/configs/axs103_smp_defconfig
1 arch/arc/configs/nsim_700_defconfig
1 arch/arc/configs/nsim_hs_defconfig
1 arch/arc/configs/nsim_hs_smp_defconfig
1 arch/arc/configs/nsimosci_defconfig
1 arch/arc/configs/nsimosci_hs_defconfig
1 arch/arc/configs/nsimosci_hs_smp_defconfig
2 arch/arc/include/asm/page.h
2 arch/arc/include/asm/pgtable.h
5 arch/arm/boot/dts/emev2.dtsi
2 arch/arm/boot/dts/imx6q.dtsi
2 arch/arm/boot/dts/imx6sx.dtsi
5 arch/arm/boot/dts/sh73a0.dtsi
2 arch/arm/include/asm/kgdb.h
12 arch/arm/include/asm/kvm_host.h
12 arch/arm/include/asm/kvm_mmu.h
24 arch/arm/kvm/arm.c
18 arch/arm/kvm/psci.c
9 arch/arm64/Kconfig
2 arch/arm64/configs/defconfig
43 arch/arm64/include/asm/alternative.h
27 arch/arm64/include/asm/assembler.h
4 arch/arm64/include/asm/cmpxchg.h
3 arch/arm64/include/asm/cpucaps.h
22 arch/arm64/include/asm/cpufeature.h
41 arch/arm64/include/asm/kvm_asm.h
43 arch/arm64/include/asm/kvm_host.h
44 arch/arm64/include/asm/kvm_mmu.h
12 arch/arm64/include/asm/percpu.h
1 arch/arm64/include/asm/thread_info.h
1 arch/arm64/kernel/Makefile
54 arch/arm64/kernel/alternative.c
2 arch/arm64/kernel/asm-offsets.c
180 arch/arm64/kernel/cpu_errata.c
19 arch/arm64/kernel/cpufeature.c
32 arch/arm64/kernel/entry.S
11 arch/arm64/kernel/hibernate.c
108 arch/arm64/kernel/ssbd.c (Normal file)
8 arch/arm64/kernel/suspend.c
4 arch/arm64/kvm/hyp-init.S
12 arch/arm64/kvm/hyp/entry.S
62 arch/arm64/kvm/hyp/hyp-entry.S
64 arch/arm64/kvm/hyp/switch.c
21 arch/arm64/kvm/hyp/sysreg-sr.c
4 arch/arm64/kvm/reset.c
4 arch/arm64/mm/init.c
4 arch/arm64/mm/mmu.c
5 arch/arm64/mm/proc.S
3 arch/m68k/mm/kmap.c
10 arch/microblaze/boot/Makefile
2 arch/mips/ath79/common.c
6 arch/mips/bcm47xx/setup.c
2 arch/mips/include/asm/io.h
3 arch/mips/include/asm/mipsregs.h
27 arch/mips/kernel/mcount.S
43 arch/mips/kernel/process.c
1 arch/mips/kernel/traps.c
37 arch/mips/mm/ioremap.c
2 arch/mips/pci/pci.c
2 arch/parisc/Kconfig
32 arch/parisc/include/asm/barrier.h (Normal file)
2 arch/parisc/kernel/entry.S
1 arch/parisc/kernel/pacache.S
4 arch/parisc/kernel/syscall.S
