i2som-imx-linux

Author	SHA1	Message	Date
Greg Kroah-Hartman	3e598a7089	Linux 4.9.82	2018-02-17 13:21:21 +01:00
Steven Rostedt (VMware)	2de1085e8d	ftrace: Remove incorrect setting of glob search field commit `7b65865627` upstream. __unregister_ftrace_function_probe() will incorrectly parse the glob filter because it resets the search variable that was setup by filter_parse_regex(). Al Viro reported this: After that call of filter_parse_regex() we could have func_g.search not equal to glob only if glob started with '!' or ''. In the former case we would've buggered off with -EINVAL (not = 1). In the latter we would've set func_g.search equal to glob + 1, calculated the length of that thing in func_g.len and proceeded to reset func_g.search back to glob. Suppose the glob is e.g. foo. We end up with func_g.type = MATCH_MIDDLE_ONLY; func_g.len = 3; func_g.search = "foo"; Feeding that to ftrace_match_record() will not do anything sane - we will be looking for names containing "*foo" (->len is ignored for that one). Link: http://lkml.kernel.org/r/20180127031706.GE13338@ZenIV.linux.org.uk Fixes: `3ba0092971` ("ftrace: Introduce ftrace_glob structure") Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:20 +01:00
Eric W. Biederman	df113487f8	mn10300/misalignment: Use SIGSEGV SEGV_MAPERR to report a failed user copy commit `6ac1dc736b` upstream. Setting si_code to 0 is the same a setting si_code to SI_USER which is definitely not correct. With si_code set to SI_USER si_pid and si_uid will be copied to userspace instead of si_addr. Which is very wrong. So fix this by using a sensible si_code (SEGV_MAPERR) for this failure. Fixes: `b920de1b77` ("mn10300: add the MN10300/AM33 architecture to the kernel") Cc: David Howells <dhowells@redhat.com> Cc: Masakazu Urade <urade.masakazu@jp.panasonic.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:20 +01:00
Amir Goldstein	38e3bc59e0	ovl: fix failure to fsync lower dir commit `d796e77f1d` upstream. As a writable mount, it is not expected for overlayfs to return EINVAL/EROFS for fsync, even if dir/file is not changed. This commit fixes the case of fsync of directory, which is easier to address, because overlayfs already implements fsync file operation for directories. The problem reported by Raphael is that new PostgreSQL 10.0 with a database in overlayfs where lower layer in squashfs fails to start. The failure is due to fsync error, when PostgreSQL does fsync on all existing db directories on startup and a specific directory exists lower layer with no changes. Reported-by: Raphael Hertzog <raphael@ouaza.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Tested-by: Raphaël Hertzog <hertzog@debian.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:20 +01:00
Toshi Kani	a468a3749b	acpi, nfit: fix register dimm error handling commit `23fbd7c70a` upstream. A NULL pointer reference kernel bug was observed when acpi_nfit_add_dimm() called in acpi_nfit_register_dimms() failed. This error path does not set nfit_mem->nvdimm, but the 2nd list_for_each_entry() loop in the function assumes it's always set. Add a check to nfit_mem->nvdimm. Fixes: `ba9c8dd3c2` ("acpi, nfit: add dimm device notification support") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:20 +01:00
Greg Kroah-Hartman	623c28ee02	ACPI: sbshc: remove raw pointer from printk() message commit `43cdd1b716` upstream. There's no need to be printing a raw kernel pointer to the kernel log at every boot. So just remove it, and change the whole message to use the correct dev_info() call at the same time. Reported-by: Wang Qize <wang_qize@venustech.com.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:20 +01:00
Imre Deak	3169a7c06e	drm/i915: Avoid PPS HW/SW state mismatch due to rounding commit `5643205c63` upstream. We store a SW state of the t11_t12 timing in 100usec units but have to program it in 100msec as required by HW. The rounding used during programming means there will be a mismatch between the SW and HW states of this value triggering a "PPS state mismatch" error. Avoid this by storing the already rounded-up value in the SW state. Note that we still calculate panel_power_cycle_delay with the finer 100usec granularity to avoid any needless waits using that version of the delay. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103903 Cc: joks <joks@linux.pl> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171129175137.2889-1-imre.deak@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:20 +01:00
Nikolay Borisov	8fe7ceaf8a	btrfs: Handle btrfs_set_extent_delalloc failure in fixup worker commit `f3038ee3a3` upstream. This function was introduced by `247e743cbe` ("Btrfs: Use async helpers to deal with pages that have been improperly dirtied") and it didn't do any error handling then. This function might very well fail in ENOMEM situation, yet it's not handled, this could lead to inconsistent state. So let's handle the failure by setting the mapping error bit. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:20 +01:00
Andrey Ryabinin	3c83fe52b5	lib/ubsan: add type mismatch handler for new GCC/Clang commit `42440c1f99` upstream. UBSAN=y fails to build with new GCC/clang: arch/x86/kernel/head64.o: In function `sanitize_boot_params': arch/x86/include/asm/bootparam_utils.h:37: undefined reference to `__ubsan_handle_type_mismatch_v1' because Clang and GCC 8 slightly changed ABI for 'type mismatch' errors. Compiler now uses new __ubsan_handle_type_mismatch_v1() function with slightly modified 'struct type_mismatch_data'. Let's add new 'struct type_mismatch_data_common' which is independent from compiler's layout of 'struct type_mismatch_data'. And make __ubsan_handle_type_mismatch[_v1]() functions transform compiler-dependent type mismatch data to our internal representation. This way, we can support both old and new compilers with minimal amount of change. Link: http://lkml.kernel.org/r/20180119152853.16806-1-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: Sodagudi Prasad <psodagud@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:20 +01:00
Andrew Morton	3f8130127c	lib/ubsan.c: s/missaligned/misaligned/ commit `b8fe1120b4` upstream. A vist from the spelling fairy. Cc: David Laight <David.Laight@ACULAB.COM> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:20 +01:00
Daniel Lezcano	1bb09d05a4	clocksource/drivers/stm32: Fix kernel panic with multiple timers commit `e0aeca3d8c` upstream. The current code hides a couple of bugs: - The global variable 'clock_event_ddata' is overwritten each time the init function is invoked. This is fixed with a kmemdup() instead of assigning the global variable. That prevents a memory corruption when several timers are defined in the DT. - The clockevent's event_handler is NULL if the time framework does not select the clockevent when registering it, this is fine but the init code generates in any case an interrupt leading to dereference this NULL pointer. The stm32 timer works with shadow registers, a mechanism to cache the registers. When a change is done in one buffered register, we need to artificially generate an event to force the timer to copy the content of the register to the shadowed register. The auto-reload register (ARR) is one of the shadowed register as well as the prescaler register (PSC), so in order to force the copy, we issue an event which in turn leads to an interrupt and the NULL dereference. This is fixed by inverting two lines where we clear the status register before enabling the update event interrupt. As this kernel crash is resulting from the combination of these two bugs, the fixes are grouped into a single patch. Tested-by: Benjamin Gaignard <benjamin.gaignard@st.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Benjamin Gaignard <benjamin.gaignard@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1515418139-23276-11-git-send-email-daniel.lezcano@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:20 +01:00
Bart Van Assche	944723bf84	pktcdvd: Fix pkt_setup_dev() error path commit `5a0ec388ef` upstream. Commit `523e1d399c` ("block: make gendisk hold a reference to its queue") modified add_disk() and disk_release() but did not update any of the error paths that trigger a put_disk() call after disk->queue has been assigned. That introduced the following behavior in the pktcdvd driver if pkt_new_dev() fails: Kernel BUG at 00000000e98fd882 [verbose debug info unavailable] Since disk_release() calls blk_put_queue() anyway if disk->queue != NULL, fix this by removing the blk_cleanup_queue() call from the pkt_setup_dev() error path. Fixes: commit `523e1d399c` ("block: make gendisk hold a reference to its queue") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Tejun Heo <tj@kernel.org> Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
Mika Westerberg	86d408d10e	pinctrl: intel: Initialize GPIO properly when used through irqchip commit `f5a26acf01` upstream. When a GPIO is requested using gpiod_get_* APIs the intel pinctrl driver switches the pin to GPIO mode and makes sure interrupts are routed to the GPIO hardware instead of IOAPIC. However, if the GPIO is used directly through irqchip, as is the case with many I2C-HID devices where I2C core automatically configures interrupt for the device, the pin is not initialized as GPIO. Instead we rely that the BIOS configures the pin accordingly which seems not to be the case at least in Asus X540NA SKU3 with Focaltech touchpad. When the pin is not properly configured it might result weird behaviour like interrupts suddenly stop firing completely and the touchpad stops responding to user input. Fix this by properly initializing the pin to GPIO mode also when it is used directly through irqchip. Fixes: `7981c0015a` ("pinctrl: intel: Add Intel Sunrisepoint pin controller and GPIO support") Reported-by: Daniel Drake <drake@endlessm.com> Reported-and-tested-by: Chris Chiu <chiu@endlessm.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
James Hogan	10ddc77ffb	EDAC, octeon: Fix an uninitialized variable warning commit `544e92581a` upstream. Fix an uninitialized variable warning in the Octeon EDAC driver, as seen in MIPS cavium_octeon_defconfig builds since v4.14 with Codescape GNU Tools 2016.05-03: drivers/edac/octeon_edac-lmc.c In function ‘octeon_lmc_edac_poll_o2’: drivers/edac/octeon_edac-lmc.c:87:24: warning: ‘((long unsigned int*)&int_reg)[1]’ may \ be used uninitialized in this function [-Wmaybe-uninitialized] if (int_reg.s.sec_err \|\| int_reg.s.ded_err) { ^ Iinitialise the whole int_reg variable to zero before the conditional assignments in the error injection case. Signed-off-by: James Hogan <jhogan@kernel.org> Acked-by: David Daney <david.daney@cavium.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linux-mips@linux-mips.org Fixes: `1bc021e815` ("EDAC: Octeon: Add error injection support") Link: http://lkml.kernel.org/r/20171113161206.20990-1-james.hogan@mips.com Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
Max Filippov	2d4e295284	xtensa: fix futex_atomic_cmpxchg_inatomic commit `ca47480921` upstream. Return 0 if the operation was successful, not the userspace memory value. Check that userspace value equals passed oldval, not itself. Don't update uval if the value wasn't read from userspace memory. This fixes process hang due to infinite loop in futex_lock_pi. It also fixes a bunch of glibc tests nptl/tst-mutexpi. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
Mikulas Patocka	71611b37cc	alpha: fix formating of stack content commit `4b01abdb32` upstream. Since version 4.9, the kernel automatically breaks printk calls into multiple newlines unless pr_cont is used. Fix the alpha stacktrace code, so that it prints stack trace in four columns, as it was initially intended. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
Mikulas Patocka	7d22d92ca6	alpha: fix reboot on Avanti platform commit `55fc633c41` upstream. We need to define NEED_SRM_SAVE_RESTORE on the Avanti, otherwise we get machine check exception when attempting to reboot the machine. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
Mikulas Patocka	68d18e90ee	alpha: fix crash if pthread_create races with signal delivery commit `21ffceda1c` upstream. On alpha, a process will crash if it attempts to start a thread and a signal is delivered at the same time. The crash can be reproduced with this program: https://cygwin.com/ml/cygwin/2014-11/msg00473.html The reason for the crash is this: * we call the clone syscall * we go to the function copy_process * copy process calls copy_thread_tls, it is a wrapper around copy_thread * copy_thread sets the tls pointer: childti->pcb.unique = regs->r20 * copy_thread sets regs->r20 to zero * we go back to copy_process * copy process checks "if (signal_pending(current))" and returns -ERESTARTNOINTR * the clone syscall is restarted, but this time, regs->r20 is zero, so the new thread is created with zero tls pointer * the new thread crashes in start_thread when attempting to access tls The comment in the code says that setting the register r20 is some compatibility with OSF/1. But OSF/1 doesn't use the CLONE_SETTLS flag, so we don't have to zero r20 if CLONE_SETTLS is set. This patch fixes the bug by zeroing regs->r20 only if CLONE_SETTLS is not set. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
Eric W. Biederman	21f94109d0	signal/sh: Ensure si_signo is initialized in do_divide_error commit `0e88bb002a` upstream. Set si_signo. Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: linux-sh@vger.kernel.org Fixes: `0983b31849` ("sh: Wire up division and address error exceptions on SH-2A.") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
Eric W. Biederman	498b8b7453	signal/openrisc: Fix do_unaligned_access to send the proper signal commit `500d583005` upstream. While reviewing the signal sending on openrisc the do_unaligned_access function stood out because it is obviously wrong. A comment about an si_code set above when actually si_code is never set. Leading to a random si_code being sent to userspace in the event of an unaligned access. Looking further SIGBUS BUS_ADRALN is the proper pair of signal and si_code to send for an unaligned access. That is what other architectures do and what is required by posix. Given that do_unaligned_access is broken in a way that no one can be relying on it on openrisc fix the code to just do the right thing. Fixes: `769a8a9622` ("OpenRISC: Traps") Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: openrisc@lists.librecores.org Acked-by: Stafford Horne <shorne@gmail.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
Hans de Goede	5795b076bd	Bluetooth: btusb: Restore QCA Rome suspend/resume fix with a "rewritten" version commit `61f5acea87` upstream. Commit `7d06d5895c` ("Revert "Bluetooth: btusb: fix QCA...suspend/resume"") removed the setting of the BTUSB_RESET_RESUME quirk for QCA Rome devices, instead favoring adding USB_QUIRK_RESET_RESUME quirks in usb/core/quirks.c. This was done because the DIY BTUSB_RESET_RESUME reset-resume handling has several issues (see the original commit message). An added advantage of moving over to the USB-core reset-resume handling is that it also disables autosuspend for these devices, which is similarly broken on these. But there are 2 issues with this approach: 1) It leaves the broken DIY BTUSB_RESET_RESUME code in place for Realtek devices. 2) Sofar only 2 of the 10 QCA devices known to the btusb code have been added to usb/core/quirks.c and if we fix the Realtek case the same way we need to add an additional 14 entries. So in essence we need to duplicate a large part of the usb_device_id table in btusb.c in usb/core/quirks.c and manually keep them in sync. This commit instead restores setting a reset-resume quirk for QCA devices in the btusb.c code, avoiding the duplicate usb_device_id table problem. This commit avoids the problems with the original DIY BTUSB_RESET_RESUME code by simply setting the USB_QUIRK_RESET_RESUME quirk directly on the usb_device. This commit also moves the BTUSB_REALTEK case over to directly setting the USB_QUIRK_RESET_RESUME on the usb_device and removes the now unused BTUSB_RESET_RESUME code. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1514836 Fixes: `7d06d5895c` ("Revert "Bluetooth: btusb: fix QCA...suspend/resume"") Cc: Leif Liddy <leif.linux@gmail.com> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Brian Norris <briannorris@chromium.org> Cc: Daniel Drake <drake@endlessm.com> Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
Kai-Heng Feng	84bf682f53	Revert "Bluetooth: btusb: fix QCA Rome suspend/resume" commit `7d06d5895c` upstream. This reverts commit `fd865802c6`. This commit causes a regression on some QCA ROME chips. The USB device reset happens in btusb_open(), hence firmware loading gets interrupted. Furthermore, this commit stops working after commit ("a0085f2510e8976614ad8f766b209448b385492f Bluetooth: btusb: driver to enable the usb-wakeup feature"). Reset-resume quirk only gets enabled in btusb_suspend() when it's not a wakeup source. If we really want to reset the USB device, we need to do it before btusb_open(). Let's handle it in drivers/usb/core/quirks.c. Cc: Leif Liddy <leif.linux@gmail.com> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Brian Norris <briannorris@chromium.org> Cc: Daniel Drake <drake@endlessm.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Tested-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:19 +01:00
Hans de Goede	6913d1b190	Bluetooth: btsdio: Do not bind to non-removable BCM43341 commit `b4cdaba274` upstream. BCM43341 devices soldered onto the PCB (non-removable) always (AFAICT) use an UART connection for bluetooth. But they also advertise btsdio support on their 3th sdio function, this causes 2 problems: 1) A non functioning BT HCI getting registered 2) Since the btsdio driver does not have suspend/resume callbacks, mmc_sdio_pre_suspend will return -ENOSYS, causing mmc_pm_notify() to react as if the SDIO-card is removed and since the slot is marked as non-removable it will never get detected as inserted again. Which results in wifi no longer working after a suspend/resume. This commit fixes both by making btsdio ignore BCM43341 devices when connected to a slot which is marked non-removable. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:18 +01:00
Hans de Goede	df9658e806	HID: quirks: Fix keyboard + touchpad on Toshiba Click Mini not working commit `edfc3722cf` upstream. The Toshiba Click Mini uses an i2c attached keyboard/touchpad combo (single i2c_hid device for both) which has a vid:pid of 04F3:0401, which is also used by a bunch of Elan touchpads which are handled by the drivers/input/mouse/elan_i2c driver, but that driver deals with pure touchpads and does not work for a combo device such as the one on the Toshiba Click Mini. The combo on the Mini has an ACPI id of ELAN0800, which is not claimed by the elan_i2c driver, so check for that and if it is found do not ignore the device. This fixes the keyboard/touchpad combo on the Mini not working (although with the touchpad in mouse emulation mode). Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:18 +01:00
Eric Biggers	71baf27d8c	pipe: fix off-by-one error when checking buffer limits commit `9903a91c76` upstream. With pipe-user-pages-hard set to 'N', users were actually only allowed up to 'N - 1' buffers; and likewise for pipe-user-pages-soft. Fix this to allow up to 'N' buffers, as would be expected. Link: http://lkml.kernel.org/r/20180111052902.14409-5-ebiggers3@gmail.com Fixes: `b0b91d18e2` ("pipe: fix limit checking in pipe_set_size()") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Willy Tarreau <w@1wt.eu> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Luis R . Rodriguez" <mcgrof@kernel.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:18 +01:00
Eric Biggers	a705c24b5d	pipe: actually allow root to exceed the pipe buffer limits commit `85c2dd5473` upstream. pipe-user-pages-hard and pipe-user-pages-soft are only supposed to apply to unprivileged users, as documented in both Documentation/sysctl/fs.txt and the pipe(7) man page. However, the capabilities are actually only checked when increasing a pipe's size using F_SETPIPE_SZ, not when creating a new pipe. Therefore, if pipe-user-pages-hard has been set, the root user can run into it and be unable to create pipes. Similarly, if pipe-user-pages-soft has been set, the root user can run into it and have their pipes limited to 1 page each. Fix this by allowing the privileged override in both cases. Link: http://lkml.kernel.org/r/20180111052902.14409-4-ebiggers3@gmail.com Fixes: `759c01142a` ("pipe: limit the per-user amount of pages allocated in pipes") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Luis R . Rodriguez" <mcgrof@kernel.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:18 +01:00
Eric Biggers	91cebf98cd	kernel/relay.c: revert "kernel/relay.c: fix potential memory leak" commit `a1be1f3931` upstream. This reverts commit `ba62bafe94` ("kernel/relay.c: fix potential memory leak"). This commit introduced a double free bug, because 'chan' is already freed by the line: kref_put(&chan->kref, relay_destroy_channel); This bug was found by syzkaller, using the BLKTRACESETUP ioctl. Link: http://lkml.kernel.org/r/20180127004759.101823-1-ebiggers3@gmail.com Fixes: `ba62bafe94` ("kernel/relay.c: fix potential memory leak") Signed-off-by: Eric Biggers <ebiggers@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Zhouyi Zhou <yizhouzhou@ict.ac.cn> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:18 +01:00
Rasmus Villemoes	33a4459bde	kernel/async.c: revert "async: simplify lowest_in_progress()" commit `4f7e988e63` upstream. This reverts commit `92266d6ef6` ("async: simplify lowest_in_progress()") which was simply wrong: In the case where domain is NULL, we now use the wrong offsetof() in the list_first_entry macro, so we don't actually fetch the ->cookie value, but rather the eight bytes located sizeof(struct list_head) further into the struct async_entry. On 64 bit, that's the data member, while on 32 bit, that's a u64 built from func and data in some order. I think the bug happens to be harmless in practice: It obviously only affects callers which pass a NULL domain, and AFAICT the only such caller is async_synchronize_full() -> async_synchronize_full_domain(NULL) -> async_synchronize_cookie_domain(ASYNC_COOKIE_MAX, NULL) and the ASYNC_COOKIE_MAX means that in practice we end up waiting for the async_global_pending list to be empty - but it would break if somebody happened to pass (void*)-1 as the data element to async_schedule, and of course also if somebody ever does a async_synchronize_cookie_domain(, NULL) with a "finite" cookie value. Maybe the "harmless in practice" means this isn't -stable material. But I'm not completely confident my quick git grep'ing is enough, and there might be affected code in one of the earlier kernels that has since been removed, so I'll leave the decision to the stable guys. Link: http://lkml.kernel.org/r/20171128104938.3921-1-linux@rasmusvillemoes.dk Fixes: `92266d6ef6` "async: simplify lowest_in_progress()" Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Tejun Heo <tj@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Adam Wallis <awallis@codeaurora.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:18 +01:00
Heiko Carstens	da3b224658	fs/proc/kcore.c: use probe_kernel_read() instead of memcpy() commit `d0290bc20d` upstream. Commit `df04abfd18` ("fs/proc/kcore.c: Add bounce buffer for ktext data") added a bounce buffer to avoid hardened usercopy checks. Copying to the bounce buffer was implemented with a simple memcpy() assuming that it is always valid to read from kernel memory iff the kern_addr_valid() check passed. A simple, but pointless, test case like "dd if=/proc/kcore of=/dev/null" now can easily crash the kernel, since the former execption handling on invalid kernel addresses now doesn't work anymore. Also adding a kern_addr_valid() implementation wouldn't help here. Most architectures simply return 1 here, while a couple implemented a page table walk to figure out if something is mapped at the address in question. With DEBUG_PAGEALLOC active mappings are established and removed all the time, so that relying on the result of kern_addr_valid() before executing the memcpy() also doesn't work. Therefore simply use probe_kernel_read() to copy to the bounce buffer. This also allows to simplify read_kcore(). At least on s390 this fixes the observed crashes and doesn't introduce warnings that were removed with `df04abfd18` ("fs/proc/kcore.c: Add bounce buffer for ktext data"), even though the generic probe_kernel_read() implementation uses uaccess functions. While looking into this I'm also wondering if kern_addr_valid() could be completely removed...(?) Link: http://lkml.kernel.org/r/20171202132739.99971-1-heiko.carstens@de.ibm.com Fixes: `df04abfd18` ("fs/proc/kcore.c: Add bounce buffer for ktext data") Fixes: `f5509cc18d` ("mm: Hardened usercopy") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:18 +01:00
Mauro Carvalho Chehab	1666d38f4e	media: cxusb, dib0700: ignore XC2028_I2C_FLUSH commit `9893b905e7` upstream. The XC2028_I2C_FLUSH only needs to be implemented on a few devices. Others can safely ignore it. That prevents filling the dmesg with lots of messages like: dib0700: stk7700ph_xc3028_callback: unknown command 2, arg 0 Fixes: `4d37ece757` ("[media] tuner/xc2028: Add I2C flush callback") Reported-by: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:18 +01:00
Mauro Carvalho Chehab	b2e7c63cad	media: ts2020: avoid integer overflows on 32 bit machines commit `81742be14b` upstream. Before this patch, when compiled for arm32, the signal strength were reported as: Lock (0x1f) Signal= 4294908.66dBm C/N= 12.79dB Because of a 32 bit integer overflow. After it, it is properly reported as: Lock (0x1f) Signal= -58.64dBm C/N= 12.79dB Fixes: `0f91c9d6ba` ("[media] TS2020: Calculate tuner gain correctly") Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:18 +01:00
Arnd Bergmann	d1d85ae79d	media: dvb-frontends: fix i2c access helpers for KASAN commit `3cd890dbe2` upstream. A typical code fragment was copied across many dvb-frontend drivers and causes large stack frames when built with with CONFIG_KASAN on gcc-5/6/7: drivers/media/dvb-frontends/cxd2841er.c:3225:1: error: the frame size of 3992 bytes is larger than 3072 bytes [-Werror=frame-larger-than=] drivers/media/dvb-frontends/cxd2841er.c:3404:1: error: the frame size of 3136 bytes is larger than 3072 bytes [-Werror=frame-larger-than=] drivers/media/dvb-frontends/stv0367.c:3143:1: error: the frame size of 4016 bytes is larger than 3072 bytes [-Werror=frame-larger-than=] drivers/media/dvb-frontends/stv090x.c:3430:1: error: the frame size of 5312 bytes is larger than 3072 bytes [-Werror=frame-larger-than=] drivers/media/dvb-frontends/stv090x.c:4248:1: error: the frame size of 4872 bytes is larger than 3072 bytes [-Werror=frame-larger-than=] gcc-8 now solves this by consolidating the stack slots for the argument variables, but on older compilers we can get the same behavior by taking the pointer of a local variable rather than the inline function argument. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:18 +01:00
Martin Kaiser	b7f9df60f4	watchdog: imx2_wdt: restore previous timeout after suspend+resume commit `0be267255c` upstream. When the watchdog device is suspended, its timeout is set to the maximum value. During resume, the previously set timeout should be restored. This does not work at the moment. The suspend function calls imx2_wdt_set_timeout(wdog, IMX2_WDT_MAX_TIME); and resume reverts this by calling imx2_wdt_set_timeout(wdog, wdog->timeout); However, imx2_wdt_set_timeout() updates wdog->timeout. Therefore, wdog->timeout is set to IMX2_WDT_MAX_TIME when we enter the resume function. Fix this by adding a new function __imx2_wdt_set_timeout() which only updates the hardware settings. imx2_wdt_set_timeout() now calls __imx2_wdt_set_timeout() and then saves the new timeout to wdog->timeout. During suspend, we call __imx2_wdt_set_timeout() directly so that wdog->timeout won't be updated and we can restore the previous value during resume. This approach makes wdog->timeout different from the actual setting in the hardware which is usually not a good thing. However, the two differ only while we're suspended and no kernel code is running, so it should be ok in this case. Signed-off-by: Martin Kaiser <martin@kaiser.cx> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:17 +01:00
Takashi Iwai	eb10c5973e	ASoC: skl: Fix kernel warning due to zero NHTL entry commit `20a1ea2222` upstream. I got the following kernel warning when loading snd-soc-skl module on Dell Latitude 7270 laptop: memremap attempted on mixed range 0x0000000000000000 size: 0x0 WARNING: CPU: 0 PID: 484 at kernel/memremap.c:98 memremap+0x8a/0x180 Call Trace: skl_nhlt_init+0x82/0xf0 [snd_soc_skl] skl_probe+0x2ee/0x7c0 [snd_soc_skl] .... It seems that the machine doesn't support the SKL DSP gives the empty NHLT entry, and it triggers the warning. For avoiding it, let do the zero check before calling memremap(). Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:17 +01:00
John Keeping	76376783a4	ASoC: rockchip: i2s: fix playback after runtime resume commit `c66234cfed` upstream. When restoring registers during runtime resume, we must not write to I2S_TXDR which is the transmit FIFO as this queues up a sample to be output and pushes all of the output channels down by one. This can be demonstrated with the speaker-test utility: for i in a b c; do speaker-test -c 2 -s 1; done which should play a test through the left speaker three times but if the I2S hardware starts runtime suspended the first sample will be played through the right speaker. Fix this by marking I2S_TXDR as volatile (which also requires marking it as readble, even though it technically isn't). This seems to be the most robust fix, the alternative of giving I2S_TXDR a default value is more fragile since it does not prevent regcache writing to the register in all circumstances. While here, also fix the configuration of I2S_RXDR and I2S_FIFOLR; these are not writable so they do not suffer from the same problem as I2S_TXDR but reading from I2S_RXDR does suffer from a similar problem. Fixes: `f0447f6cbb` ("ASoC: rockchip: i2s: restore register during runtime_suspend/resume cycle", 2016-09-07) Signed-off-by: John Keeping <john@metanate.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:17 +01:00
James Morse	f6741799aa	KVM: arm/arm64: Handle CPU_PM_ENTER_FAILED commit `58d6b15e9d` upstream. cpu_pm_enter() calls the pm notifier chain with CPU_PM_ENTER, then if there is a failure: CPU_PM_ENTER_FAILED. When KVM receives CPU_PM_ENTER it calls cpu_hyp_reset() which will return us to the hyp-stub. If we subsequently get a CPU_PM_ENTER_FAILED, KVM does nothing, leaving the CPU running with the hyp-stub, at odds with kvm_arm_hardware_enabled. Add CPU_PM_ENTER_FAILED as a fallthrough for CPU_PM_EXIT, this reloads KVM based on kvm_arm_hardware_enabled. This is safe even if CPU_PM_ENTER never gets as far as KVM, as cpu_hyp_reinit() calls cpu_hyp_reset() to make sure the hyp-stub is loaded before reloading KVM. Fixes: `67f6919766` ("arm64: kvm: allows kvm cpu hotplug") CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:17 +01:00
Liran Alon	ba88289e7a	KVM: nVMX: Fix races when sending nested PI while dest enters/leaves L2 commit `6b6977117f` upstream. Consider the following scenario: 1. CPU A calls vmx_deliver_nested_posted_interrupt() to send an IPI to CPU B via virtual posted-interrupt mechanism. 2. CPU B is currently executing L2 guest. 3. vmx_deliver_nested_posted_interrupt() calls kvm_vcpu_trigger_posted_interrupt() which will note that vcpu->mode == IN_GUEST_MODE. 4. Assume that before CPU A sends the physical POSTED_INTR_NESTED_VECTOR IPI, CPU B exits from L2 to L0 during event-delivery (valid IDT-vectoring-info). 5. CPU A now sends the physical IPI. The IPI is received in host and it's handler (smp_kvm_posted_intr_nested_ipi()) does nothing. 6. Assume that before CPU A sets pi_pending=true and KVM_REQ_EVENT, CPU B continues to run in L0 and reach vcpu_enter_guest(). As KVM_REQ_EVENT is not set yet, vcpu_enter_guest() will continue and resume L2 guest. 7. At this point, CPU A sets pi_pending=true and KVM_REQ_EVENT but it's too late! CPU B already entered L2 and KVM_REQ_EVENT will only be consumed at next L2 entry! Another scenario to consider: 1. CPU A calls vmx_deliver_nested_posted_interrupt() to send an IPI to CPU B via virtual posted-interrupt mechanism. 2. Assume that before CPU A calls kvm_vcpu_trigger_posted_interrupt(), CPU B is at L0 and is about to resume into L2. Further assume that it is in vcpu_enter_guest() after check for KVM_REQ_EVENT. 3. At this point, CPU A calls kvm_vcpu_trigger_posted_interrupt() which will note that vcpu->mode != IN_GUEST_MODE. Therefore, do nothing and return false. Then, will set pi_pending=true and KVM_REQ_EVENT. 4. Now CPU B continue and resumes into L2 guest without processing the posted-interrupt until next L2 entry! To fix both issues, we just need to change vmx_deliver_nested_posted_interrupt() to set pi_pending=true and KVM_REQ_EVENT before calling kvm_vcpu_trigger_posted_interrupt(). It will fix the first scenario by chaging step (6) to note that KVM_REQ_EVENT and pi_pending=true and therefore process nested posted-interrupt. It will fix the second scenario by two possible ways: 1. If kvm_vcpu_trigger_posted_interrupt() is called while CPU B has changed vcpu->mode to IN_GUEST_MODE, physical IPI will be sent and will be received when CPU resumes into L2. 2. If kvm_vcpu_trigger_posted_interrupt() is called while CPU B hasn't yet changed vcpu->mode to IN_GUEST_MODE, then after CPU B will change vcpu->mode it will call kvm_request_pending() which will return true and therefore force another round of vcpu_enter_guest() which will note that KVM_REQ_EVENT and pi_pending=true and therefore process nested posted-interrupt. Fixes: `705699a139` ("KVM: nVMX: Enable nested posted interrupt processing") Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> [Add kvm_vcpu_kick to also handle the case where L1 doesn't intercept L2 HLT and L2 executes HLT instruction. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:17 +01:00
Marc Zyngier	51e22c571f	arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls commit `20e8175d24` upstream. KVM doesn't follow the SMCCC when it comes to unimplemented calls, and inject an UNDEF instead of returning an error. Since firmware calls are now used for security mitigation, they are becoming more common, and the undef is counter productive. Instead, let's follow the SMCCC which states that -1 must be returned to the caller when getting an unknown function number. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:17 +01:00
Eric Biggers	68f2013e1f	crypto: sha512-mb - initialize pending lengths correctly commit `eff84b3790` upstream. The SHA-512 multibuffer code keeps track of the number of blocks pending in each lane. The minimum of these values is used to identify the next lane that will be completed. Unused lanes are set to a large number (0xFFFFFFFF) so that they don't affect this calculation. However, it was forgotten to set the lengths to this value in the initial state, where all lanes are unused. As a result it was possible for sha512_mb_mgr_get_comp_job_avx2() to select an unused lane, causing a NULL pointer dereference. Specifically this could happen in the case where ->update() was passed fewer than SHA512_BLOCK_SIZE bytes of data, so it then called sha_complete_job() without having actually submitted any blocks to the multi-buffer code. This hit a NULL pointer dereference if another task happened to have submitted blocks concurrently to the same CPU and the flush timer had not yet expired. Fix this by initializing sha512_mb_mgr->lens correctly. As usual, this bug was found by syzkaller. Fixes: `45691e2d9b` ("crypto: sha512-mb - submit/flush routines for AVX2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:17 +01:00
Horia Geantă	a96e820790	crypto: caam - fix endless loop when DECO acquire fails commit `225ece3e7d` upstream. In case DECO0 cannot be acquired - i.e. run_descriptor_deco0() fails with -ENODEV, caam_probe() enters an endless loop: run_descriptor_deco0 ret -ENODEV -> instantiate_rng -ENODEV, overwritten by -EAGAIN ret -EAGAIN -> caam_probe -EAGAIN results in endless loop It turns out the error path in instantiate_rng() is incorrect, the checks are done in the wrong order. Fixes: `1005bccd7a` ("crypto: caam - enable instantiation of all RNG4 state handles") Reported-by: Bryan O'Donoghue <pure.logic@nexus-software.ie> Suggested-by: Auer Lukas <lukas.auer@aisec.fraunhofer.de> Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:17 +01:00
Daniel Mentz	f2d4bed9ea	media: v4l2-compat-ioctl32.c: refactor compat ioctl32 logic commit `a1dfb4c48c` upstream. The 32-bit compat v4l2 ioctl handling is implemented based on its 64-bit equivalent. It converts 32-bit data structures into its 64-bit equivalents and needs to provide the data to the 64-bit ioctl in user space memory which is commonly allocated using compat_alloc_user_space(). However, due to how that function is implemented, it can only be called a single time for every syscall invocation. Supposedly to avoid this limitation, the existing code uses a mix of memory from the kernel stack and memory allocated through compat_alloc_user_space(). Under normal circumstances, this would not work, because the 64-bit ioctl expects all pointers to point to user space memory. As a workaround, set_fs(KERNEL_DS) is called to temporarily disable this extra safety check and allow kernel pointers. However, this might introduce a security vulnerability: The result of the 32-bit to 64-bit conversion is writeable by user space because the output buffer has been allocated via compat_alloc_user_space(). A malicious user space process could then manipulate pointers inside this output buffer, and due to the previous set_fs(KERNEL_DS) call, functions like get_user() or put_user() no longer prevent kernel memory access. The new approach is to pre-calculate the total amount of user space memory that is needed, allocate it using compat_alloc_user_space() and then divide up the allocated memory to accommodate all data structures that need to be converted. An alternative approach would have been to retain the union type karg that they allocated on the kernel stack in do_video_ioctl(), copy all data from user space into karg and then back to user space. However, we decided against this approach because it does not align with other compat syscall implementations. Instead, we tried to replicate the get_user/put_user pairs as found in other places in the kernel: if (get_user(clipcount, &up->clipcount) \|\| put_user(clipcount, &kp->clipcount)) return -EFAULT; Notes from hans.verkuil@cisco.com: This patch was taken from: `97b733953c` Clearly nobody could be bothered to upstream this patch or at minimum tell us :-( We only heard about this a week ago. This patch was rebased and cleaned up. Compared to the original I also swapped the order of the convert_in_user arguments so that they matched copy_in_user. It was hard to review otherwise. I also replaced the ALLOC_USER_SPACE/ALLOC_AND_GET by a normal function. Fixes: `6b5a9492ca` ("v4l: introduce string control support.") Signed-off-by: Daniel Mentz <danielmentz@google.com> Co-developed-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:17 +01:00
Hans Verkuil	437c4ec62e	media: v4l2-compat-ioctl32.c: don't copy back the result for certain errors commit `d83a8243aa` upstream. Some ioctls need to copy back the result even if the ioctl returned an error. However, don't do this for the error code -ENOTTY. It makes no sense in that cases. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:17 +01:00
Hans Verkuil	30dcb0756b	media: v4l2-compat-ioctl32.c: drop pr_info for unknown buffer type commit `169f24ca68` upstream. There is nothing wrong with using an unknown buffer type. So stop spamming the kernel log whenever this happens. The kernel will just return -EINVAL to signal this. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:16 +01:00
Hans Verkuil	30ac343c42	media: v4l2-compat-ioctl32.c: copy clip list in put_v4l2_window32 commit `a751be5b14` upstream. put_v4l2_window32() didn't copy back the clip list to userspace. Drivers can update the clip rectangles, so this should be done. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:16 +01:00
Daniel Mentz	55e3f3e684	media: v4l2-compat-ioctl32: Copy v4l2_window->global_alpha commit `025a26fa14` upstream. Commit `b2787845fb` ("V4L/DVB (5289): Add support for video output overlays.") added the field global_alpha to struct v4l2_window but did not update the compat layer accordingly. This change adds global_alpha to struct v4l2_window32 and copies the value for global_alpha back and forth. Signed-off-by: Daniel Mentz <danielmentz@google.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:16 +01:00
Hans Verkuil	8465657a3b	media: v4l2-compat-ioctl32.c: make ctrl_is_pointer work for subdevs commit `273caa2600` upstream. If the device is of type VFL_TYPE_SUBDEV then vdev->ioctl_ops is NULL so the 'if (!ops->vidioc_query_ext_ctrl)' check would crash. Add a test for !ops to the condition. All sub-devices that have controls will use the control framework, so they do not have an equivalent to ops->vidioc_query_ext_ctrl. Returning false if ops is NULL is the correct thing to do here. Fixes: `b8c601e8af` ("v4l2-compat-ioctl32.c: fix ctrl_is_pointer") Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:16 +01:00
Hans Verkuil	9a7cd41be3	media: v4l2-compat-ioctl32.c: fix ctrl_is_pointer commit `b8c601e8af` upstream. ctrl_is_pointer just hardcoded two known string controls, but that caused problems when using e.g. custom controls that use a pointer for the payload. Reimplement this function: it now finds the v4l2_ctrl (if the driver uses the control framework) or it calls vidioc_query_ext_ctrl (if the driver implements that directly). In both cases it can now check if the control is a pointer control or not. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:16 +01:00
Hans Verkuil	eec955463d	media: v4l2-compat-ioctl32.c: copy m.userptr in put_v4l2_plane32 commit `8ed5a59dcb` upstream. The struct v4l2_plane32 should set m.userptr as well. The same happens in v4l2_buffer32 and v4l2-compliance tests for this. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:16 +01:00
Hans Verkuil	daff4d009f	media: v4l2-compat-ioctl32.c: avoid sizeof(type) commit `333b1e9f96` upstream. Instead of doing sizeof(struct foo) use sizeof(up). There even were cases where 4 sizeof(__u32) was used instead of sizeof(kp->reserved), which is very dangerous when the size of the reserved array changes. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:16 +01:00
Hans Verkuil	81e0acf070	media: v4l2-compat-ioctl32.c: move 'helper' functions to __get/put_v4l2_format32 commit `486c521510` upstream. These helper functions do not really help. Move the code to the __get/put_v4l2_format32 functions. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:16 +01:00
Hans Verkuil	02129c9bc2	media: v4l2-compat-ioctl32.c: fix the indentation commit `b7b957d429` upstream. The indentation of this source is all over the place. Fix this. This patch only changes whitespace. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:16 +01:00
Hans Verkuil	f294548da6	media: v4l2-compat-ioctl32.c: add missing VIDIOC_PREPARE_BUF commit `3ee6d04071` upstream. The result of the VIDIOC_PREPARE_BUF ioctl was never copied back to userspace since it was missing in the switch. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:16 +01:00
Hans Verkuil	e78d9fdf5e	media: v4l2-ioctl.c: don't copy back the result for -ENOTTY commit `181a4a2d5a` upstream. If the ioctl returned -ENOTTY, then don't bother copying back the result as there is no point. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:15 +01:00
Cong Wang	daaa81c484	nsfs: mark dentry with DCACHE_RCUACCESS commit `073c516ff7` upstream. Andrey reported a use-after-free in __ns_get_path(): spin_lock include/linux/spinlock.h:299 [inline] lockref_get_not_dead+0x19/0x80 lib/lockref.c:179 __ns_get_path+0x197/0x860 fs/nsfs.c:66 open_related_ns+0xda/0x200 fs/nsfs.c:143 sock_ioctl+0x39d/0x440 net/socket.c:1001 vfs_ioctl fs/ioctl.c:45 [inline] do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:685 SYSC_ioctl fs/ioctl.c:700 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 We are under rcu read lock protection at that point: rcu_read_lock(); d = atomic_long_read(&ns->stashed); if (!d) goto slow; dentry = (struct dentry )d; if (!lockref_get_not_dead(&dentry->d_lockref)) goto slow; rcu_read_unlock(); but don't use a proper RCU API on the free path, therefore a parallel __d_free() could free it at the same time. We need to mark the stashed dentry with DCACHE_RCUACCESS so that __d_free() will be called after all readers leave RCU. Fixes: `e149ed2b80` ("take the targets of /proc//ns/* symlinks to separate fs") Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:15 +01:00
Eric Biggers	b93728341f	crypto: poly1305 - remove ->setkey() method commit `a16e772e66` upstream. Since Poly1305 requires a nonce per invocation, the Linux kernel implementations of Poly1305 don't use the crypto API's keying mechanism and instead expect the key and nonce as the first 32 bytes of the data. But ->setkey() is still defined as a stub returning an error code. This prevents Poly1305 from being used through AF_ALG and will also break it completely once we start enforcing that all crypto API users (not just AF_ALG) call ->setkey() if present. Fix it by removing crypto_poly1305_setkey(), leaving ->setkey as NULL. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:15 +01:00
Eric Biggers	45f31106ba	crypto: mcryptd - pass through absence of ->setkey() commit `fa59b92d29` upstream. When the mcryptd template is used to wrap an unkeyed hash algorithm, don't install a ->setkey() method to the mcryptd instance. This change is necessary for mcryptd to keep working with unkeyed hash algorithms once we start enforcing that ->setkey() is called when present. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:15 +01:00
Eric Biggers	c1ebf9f835	crypto: cryptd - pass through absence of ->setkey() commit `841a3ff329` upstream. When the cryptd template is used to wrap an unkeyed hash algorithm, don't install a ->setkey() method to the cryptd instance. This change is necessary for cryptd to keep working with unkeyed hash algorithms once we start enforcing that ->setkey() is called when present. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:15 +01:00
Eric Biggers	d2b492bda5	crypto: hash - introduce crypto_hash_alg_has_setkey() commit `cd6ed77ad5` upstream. Templates that use an shash spawn can use crypto_shash_alg_has_setkey() to determine whether the underlying algorithm requires a key or not. But there was no corresponding function for ahash spawns. Add it. Note that the new function actually has to support both shash and ahash algorithms, since the ahash API can be used with either. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:15 +01:00
Mika Westerberg	016572d31d	ahci: Add Intel Cannon Lake PCH-H PCI ID commit `f919dde077` upstream. Add Intel Cannon Lake PCH-H PCI ID to the list of supported controllers. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:15 +01:00
Hans de Goede	72c0031a91	ahci: Add PCI ids for Intel Bay Trail, Cherry Trail and Apollo Lake AHCI commit `998008b779` upstream. Add PCI ids for Intel Bay Trail, Cherry Trail and Apollo Lake AHCI SATA controllers. This commit is a preparation patch for allowing a different default sata link powermanagement policy for mobile chipsets. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:15 +01:00
Hans de Goede	3332b6f327	ahci: Annotate PCI ids for mobile Intel chipsets as such commit `ca1b4974bd` upstream. Intel uses different SATA PCI ids for the Desktop and Mobile SKUs of their chipsets. For older models the comment describing which chipset the PCI id is for, aksi indicates when we're dealing with a mobile SKU. Extend the comments for recent chipsets to also indicate mobile SKUs. The information this commit adds comes from Intel's chipset datasheets. This commit is a preparation patch for allowing a different default sata link powermanagement policy for mobile chipsets. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:15 +01:00
Ivan Vecera	058d13f85d	kernfs: fix regression in kernfs_fop_write caused by wrong type commit `ba87977a49` upstream. Commit `b7ce40cff0` ("kernfs: cache atomic_write_len in kernfs_open_file") changes type of local variable 'len' from ssize_t to size_t. This change caused that the ppos value is updated also when the previous write callback failed. Mentioned snippet: ... len = ops->write(...); <- return value can be negative ... if (len > 0) <- true here in this case ppos += len; ... Fixes: `b7ce40cff0` ("kernfs: cache atomic_write_len in kernfs_open_file") Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:15 +01:00
Trond Myklebust	b79d8854ee	NFS: Fix a race between mmap() and O_DIRECT commit `e231c6879c` upstream. When locking the file in order to do O_DIRECT on it, we must unmap any mmapped ranges on the pagecache so that we can flush out the dirty data. Fixes: `a5864c999d` ("NFS: Do not serialise O_DIRECT reads and writes") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:14 +01:00
Eric Biggers	967f650f88	NFS: reject request for id_legacy key without auxdata commit `49686cbbb3` upstream. nfs_idmap_legacy_upcall() is supposed to be called with 'aux' pointing to a 'struct idmap', via the call to request_key_with_auxdata() in nfs_idmap_request_key(). However it can also be reached via the request_key() system call in which case 'aux' will be NULL, causing a NULL pointer dereference in nfs_idmap_prepare_pipe_upcall(), assuming that the key description is valid enough to get that far. Fix this by making nfs_idmap_legacy_upcall() negate the key if no auxdata is provided. As usual, this bug was found by syzkaller. A simple reproducer using the command-line keyctl program is: keyctl request2 id_legacy uid:0 '' @s Fixes: `57e62324e4` ("NFS: Store the legacy idmapper result in the keyring") Reported-by: syzbot+5dfdbcf7b3eb5912abbb@syzkaller.appspotmail.com Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Trond Myklebust <trondmy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:14 +01:00
J. Bruce Fields	ca2c316f7c	NFS: commit direct writes even if they fail partially commit `1b8d97b0a8` upstream. If some of the WRITE calls making up an O_DIRECT write syscall fail, we neglect to commit, even if some of the WRITEs succeed. We also depend on the commit code to free the reference count on the nfs_page taken in the "if (request_commit)" case at the end of nfs_direct_write_completion(). The problem was originally noticed because ENOSPC's encountered partway through a write would result in a closed file being sillyrenamed when it should have been unlinked. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:14 +01:00
Trond Myklebust	d1840343f9	NFS: Add a cond_resched() to nfs_commit_release_pages() commit `7f1bda447c` upstream. The commit list can get very large, and so we need a cond_resched() in nfs_commit_release_pages() in order to ensure we don't hog the CPU for excessive periods of time. Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:14 +01:00
Scott Mayhew	e1df8c682d	nfs/pnfs: fix nfs_direct_req ref leak when i/o falls back to the mds commit `ba4a76f703` upstream. Currently when falling back to doing I/O through the MDS (via pnfs_{read\|write}_through_mds), the client frees the nfs_pgio_header without releasing the reference taken on the dreq via pnfs_generic_pg_{read\|write}pages -> nfs_pgheader_init -> nfs_direct_pgio_init. It then takes another reference on the dreq via nfs_generic_pg_pgios -> nfs_pgheader_init -> nfs_direct_pgio_init and as a result the requester will become stuck in inode_dio_wait. Once that happens, other processes accessing the inode will become stuck as well. Ensure that pnfs_read_through_mds() and pnfs_write_through_mds() clean up correctly by calling hdr->completion_ops->completion() instead of calling hdr->release() directly. This can be reproduced (sometimes) by performing "storage failover takeover" commands on NetApp filer while doing direct I/O from a client. This can also be reproduced using SystemTap to simulate a failure while doing direct I/O from a client (from Dave Wysochanski <dwysocha@redhat.com>): stap -v -g -e 'probe module("nfs_layout_nfsv41_files").function("nfs4_fl_prepare_ds").return { $return=NULL; exit(); }' Suggested-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Scott Mayhew <smayhew@redhat.com> Fixes: `1ca018d28d` ("pNFS: Fix a memory leak when attempted pnfs fails") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:14 +01:00
Xiaolei Li	298dc6c669	ubifs: Massage assert in ubifs_xattr_set() wrt. init_xattrs commit `d8db5b1ca9` upstream. The inode is not locked in init_xattrs when creating a new inode. Without this patch, there will occurs assert when booting or creating a new file, if the kernel config CONFIG_SECURITY_SMACK is enabled. Log likes: UBIFS assert failed in ubifs_xattr_set at 298 (pid 1156) CPU: 1 PID: 1156 Comm: ldconfig Tainted: G S 4.12.0-rc1-207440-g1e70b02 #2 Hardware name: MediaTek MT2712 evaluation board (DT) Call trace: [<ffff000008088538>] dump_backtrace+0x0/0x238 [<ffff000008088834>] show_stack+0x14/0x20 [<ffff0000083d98d4>] dump_stack+0x9c/0xc0 [<ffff00000835d524>] ubifs_xattr_set+0x374/0x5e0 [<ffff00000835d7ec>] init_xattrs+0x5c/0xb8 [<ffff000008385788>] security_inode_init_security+0x110/0x190 [<ffff00000835e058>] ubifs_init_security+0x30/0x68 [<ffff00000833ada0>] ubifs_mkdir+0x100/0x200 [<ffff00000820669c>] vfs_mkdir+0x11c/0x1b8 [<ffff00000820b73c>] SyS_mkdirat+0x74/0xd0 [<ffff000008082f8c>] __sys_trace_return+0x0/0x4 Signed-off-by: Xiaolei Li <xiaolei.li@mediatek.com> Signed-off-by: Richard Weinberger <richard@nod.at> Cc: stable@vger.kernel.org (julia: massaged to apply to 4.9.y, which doesn't contain fscrypto support) Signed-off-by: Julia Cartwright <julia@ni.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:14 +01:00
Bradley Bolen	de14d0c124	ubi: block: Fix locking for idr_alloc/idr_remove commit `7f29ae9f97` upstream. This fixes a race with idr_alloc where gd->first_minor can be set to the same value for two simultaneous calls to ubiblock_create. Each instance calls device_add_disk with the same first_minor. device_add_disk calls bdi_register_owner which generates several warnings. WARNING: CPU: 1 PID: 179 at kernel-source/fs/sysfs/dir.c:31 sysfs_warn_dup+0x68/0x88 sysfs: cannot create duplicate filename '/devices/virtual/bdi/252:2' WARNING: CPU: 1 PID: 179 at kernel-source/lib/kobject.c:240 kobject_add_internal+0x1ec/0x2f8 kobject_add_internal failed for 252:2 with -EEXIST, don't try to register things with the same name in the same directory WARNING: CPU: 1 PID: 179 at kernel-source/fs/sysfs/dir.c:31 sysfs_warn_dup+0x68/0x88 sysfs: cannot create duplicate filename '/dev/block/252:2' However, device_add_disk does not error out when bdi_register_owner returns an error. Control continues until reaching blk_register_queue. It then BUGs. kernel BUG at kernel-source/fs/sysfs/group.c:113! [<c01e26cc>] (internal_create_group) from [<c01e2950>] (sysfs_create_group+0x20/0x24) [<c01e2950>] (sysfs_create_group) from [<c00e3d38>] (blk_trace_init_sysfs+0x18/0x20) [<c00e3d38>] (blk_trace_init_sysfs) from [<c02bdfbc>] (blk_register_queue+0xd8/0x154) [<c02bdfbc>] (blk_register_queue) from [<c02cec84>] (device_add_disk+0x194/0x44c) [<c02cec84>] (device_add_disk) from [<c0436ec8>] (ubiblock_create+0x284/0x2e0) [<c0436ec8>] (ubiblock_create) from [<c0427bb8>] (vol_cdev_ioctl+0x450/0x554) [<c0427bb8>] (vol_cdev_ioctl) from [<c0189110>] (vfs_ioctl+0x30/0x44) [<c0189110>] (vfs_ioctl) from [<c01892e0>] (do_vfs_ioctl+0xa0/0x790) [<c01892e0>] (do_vfs_ioctl) from [<c0189a14>] (SyS_ioctl+0x44/0x68) [<c0189a14>] (SyS_ioctl) from [<c0010640>] (ret_fast_syscall+0x0/0x34) Locking idr_alloc/idr_remove removes the race and keeps gd->first_minor unique. Fixes: `2bf50d42f3` ("UBI: block: Dynamically allocate minor numbers") Signed-off-by: Bradley Bolen <bradleybolen@gmail.com> Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:14 +01:00
Sascha Hauer	84f9d8536c	ubi: fastmap: Erase outdated anchor PEBs during attach commit `f78e5623f4` upstream. The fastmap update code might erase the current fastmap anchor PEB in case it doesn't find any new free PEB. When a power cut happens in this situation we must not have any outdated fastmap anchor PEB on the device, because that would be used to attach during next boot. The easiest way to make that sure is to erase all outdated fastmap anchor PEBs synchronously during attach. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Richard Weinberger <richard@nod.at> Fixes: `dbb7d2a88d` ("UBI: Add fastmap core") Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:14 +01:00
Miquel Raynal	44ebd641be	mtd: nand: sunxi: Fix ECC strength choice commit `f4c6cd1a7f` upstream. When the requested ECC strength does not exactly match the strengths supported by the ECC engine, the driver is selecting the closest strength meeting the 'selected_strength > requested_strength' constraint. Fix the fact that, in this particular case, ecc->strength value was not updated to match the 'selected_strength'. For instance, one can encounter this issue when no ECC requirement is filled in the device tree while the NAND chip minimum requirement is not a strength/step_size combo natively supported by the ECC engine. Fixes: `1fef62c142` ("mtd: nand: add sunxi NAND flash controller support") Suggested-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Miquel Raynal <miquel.raynal@free-electrons.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:14 +01:00
Miquel Raynal	d80cd3e936	mtd: nand: Fix nand_do_read_oob() return value commit `87e89ce8d0` upstream. Starting from commit `041e4575f0` ("mtd: nand: handle ECC errors in OOB"), nand_do_read_oob() (from the NAND core) did return 0 or a negative error, and the MTD layer expected it. However, the trend for the NAND layer is now to return an error or a positive number of bitflips. Deciding which status to return to the user belongs to the MTD layer. Commit `e47f68587b` ("mtd: check for max_bitflips in mtd_read_oob()") brought this logic to the mtd_read_oob() function while the return value coming from nand_do_read_oob() (called by the ->_read_oob() hook) was left unchanged. Fixes: `e47f68587b` ("mtd: check for max_bitflips in mtd_read_oob()") Signed-off-by: Miquel Raynal <miquel.raynal@free-electrons.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:14 +01:00
Kamal Dasu	d25d52ff10	mtd: nand: brcmnand: Disable prefetch by default commit `f953f0f896` upstream. Brcm nand controller prefetch feature needs to be disabled by default. Enabling affects performance on random reads as well as dma reads. Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com> Fixes: `27c5b17cd1` ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller") Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:13 +01:00
Arnd Bergmann	cbdabc7027	mtd: cfi: convert inline functions to macros commit `9e343e87d2` upstream. The map_word_() functions, dating back to linux-2.6.8, try to perform bitwise operations on a 'map_word' structure. This may have worked with compilers that were current then (gcc-3.4 or earlier), but end up being rather inefficient on any version I could try now (gcc-4.4 or higher). Specifically we hit a problem analyzed in gcc PR81715 where we fail to reuse the stack space for local variables. This can be seen immediately in the stack consumption for cfi_staa_erase_varsize() and other functions that (with CONFIG_KASAN) can be up to 2200 bytes. Changing the inline functions into macros brings this down to 1280 bytes. Without KASAN, the same problem exists, but the stack consumption is lower to start with, my patch shrinks it from 920 to 496 bytes on with arm-linux-gnueabi-gcc-5.4, and saves around 1KB in .text size for cfi_cmdset_0020.c, as it avoids copying map_word structures for each call to one of these helpers. With the latest gcc-8 snapshot, the problem is fixed in upstream gcc, but nobody uses that yet, so we should still work around it in mainline kernels and probably backport the workaround to stable kernels as well. We had a couple of other functions that suffered from the same gcc bug, and all of those had a simpler workaround involving dummy variables in the inline function. Unfortunately that did not work here, the macro hack was the best I could come up with. It would also be helpful to have someone to a little performance testing on the patch, to see how much it helps in terms of CPU utilitzation. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:13 +01:00
Arvind Yadav	198a7ddaf5	media: hdpvr: Fix an error handling path in hdpvr_probe() commit `c0f71bbb81` upstream. Here, hdpvr_register_videodev() is responsible for setup and register a video device. Also defining and initializing a worker. hdpvr_register_videodev() is calling by hdpvr_probe at last. So no need to flush any work here. Unregister v4l2, free buffers and memory. If hdpvr_probe() will fail. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:13 +01:00
Malcolm Priestley	f320dd2022	media: dvb-usb-v2: lmedm04: move ts2020 attach to dm04_lme2510_tuner commit `7bf7a7116e` upstream. When the tuner was split from m88rs2000 the attach function is in wrong place. Move to dm04_lme2510_tuner to trap errors on failure and removing a call to lme_coldreset. Prevents driver starting up without any tuner connected. Fixes to trap for ts2020 fail. LME2510(C): FE Found M88RS2000 ts2020: probe of 0-0060 failed with error -11 ... LME2510(C): TUN Found RS2000 tuner kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:13 +01:00
Malcolm Priestley	1ff1353a03	media: dvb-usb-v2: lmedm04: Improve logic checking of warm start commit `3d932ee27e` upstream. Warm start has no check as whether a genuine device has connected and proceeds to next execution path. Check device should read 0x47 at offset of 2 on USB descriptor read and it is the amount requested of 6 bytes. Fix for kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access as Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:13 +01:00
Mohamed Ghannam	7e2fb808d3	dccp: CVE-2017-8824: use-after-free in DCCP code commit `69c64866ce` upstream. Whenever the sock object is in DCCP_CLOSED state, dccp_disconnect() must free dccps_hc_tx_ccid and dccps_hc_rx_ccid and set to NULL. Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:13 +01:00
Steven Rostedt (VMware)	a384e5437f	sched/rt: Up the root domain ref count when passing it around via IPIs commit `364f566537` upstream. When issuing an IPI RT push, where an IPI is sent to each CPU that has more than one RT task scheduled on it, it references the root domain's rto_mask, that contains all the CPUs within the root domain that has more than one RT task in the runable state. The problem is, after the IPIs are initiated, the rq->lock is released. This means that the root domain that is associated to the run queue could be freed while the IPIs are going around. Add a sched_get_rd() and a sched_put_rd() that will increment and decrement the root domain's ref count respectively. This way when initiating the IPIs, the scheduler will up the root domain's ref count before releasing the rq->lock, ensuring that the root domain does not go away until the IPI round is complete. Reported-by: Pavan Kondeti <pkondeti@codeaurora.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `4bdced5c9a` ("sched/rt: Simplify the IPI based RT balancing logic") Link: http://lkml.kernel.org/r/CAEU1=PkiHO35Dzna8EQqNSKW1fr1y1zRQ5y66X117MG06sQtNA@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:13 +01:00
Steven Rostedt (VMware)	1c67998130	sched/rt: Use container_of() to get root domain in rto_push_irq_work_func() commit `ad0f1d9d65` upstream. When the rto_push_irq_work_func() is called, it looks at the RT overloaded bitmask in the root domain via the runqueue (rq->rd). The problem is that during CPU up and down, nothing here stops rq->rd from changing between taking the rq->rd->rto_lock and releasing it. That means the lock that is released is not the same lock that was taken. Instead of using this_rq()->rd to get the root domain, as the irq work is part of the root domain, we can simply get the root domain from the irq work that is passed to the routine: container_of(work, struct root_domain, rto_push_work) This keeps the root domain consistent. Reported-by: Pavan Kondeti <pkondeti@codeaurora.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `4bdced5c9a` ("sched/rt: Simplify the IPI based RT balancing logic") Link: http://lkml.kernel.org/r/CAEU1=PkiHO35Dzna8EQqNSKW1fr1y1zRQ5y66X117MG06sQtNA@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:13 +01:00
Petr Cvek	57ddb8eae5	usb: gadget: uvc: Missing files for configfs interface commit `c8cd751060` upstream. Commit `76e0da34c7` ("usb-gadget/uvc: use per-attribute show and store methods") caused a stringification of an undefined macro argument "aname", so three UVC parameters (streaming_interval, streaming_maxpacket and streaming_maxburst) were named "aname". Add the definition of "aname" to the main macro and name the filenames as originaly intended. Signed-off-by: Petr Cvek <petr.cvek@tul.cz> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:13 +01:00
Thomas Gleixner	0b376535ad	posix-timer: Properly check sigevent->sigev_notify commit `cef31d9af9` upstream. timer_create() specifies via sigevent->sigev_notify the signal delivery for the new timer. The valid modes are SIGEV_NONE, SIGEV_SIGNAL, SIGEV_THREAD and (SIGEV_SIGNAL \| SIGEV_THREAD_ID). The sanity check in good_sigevent() is only checking the valid combination for the SIGEV_THREAD_ID bit, i.e. SIGEV_SIGNAL, but if SIGEV_THREAD_ID is not set it accepts any random value. This has no real effects on the posix timer and signal delivery code, but it affects show_timer() which handles the output of /proc/$PID/timers. That function uses a string array to pretty print sigev_notify. The access to that array has no bound checks, so random sigev_notify cause access beyond the array bounds. Add proper checks for the valid notify modes and remove the SIGEV_THREAD_ID masking from various code pathes as SIGEV_NONE can never be set in combination with SIGEV_THREAD_ID. Reported-by: Eric Biggers <ebiggers3@gmail.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:13 +01:00
Hugh Dickins	83946c33b9	kaiser: fix compile error without vsyscall Tobias noticed a compile error on 4.4.115, and it's the same on 4.9.80: arch/x86/mm/kaiser.c: In function ‘kaiser_init’: arch/x86/mm/kaiser.c:348:8: error: ‘vsyscall_pgprot’ undeclared (first use in this function) It seems like his combination of kernel options doesn't work for KAISER. X86_VSYSCALL_EMULATION is not set on his system, while LEGACY_VSYSCALL is set to NONE (LEGACY_VSYSCALL_NONE=y). He managed to get things compiling again, by moving the 'extern unsigned long vsyscall_pgprot' outside of the preprocessor statement. This works because the optimizer removes that code (vsyscall_enabled() is always false) - and that's how it was done in some older backports. Reported-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:12 +01:00
Yang Shunyong	297c7cc4b5	dmaengine: dmatest: fix container_of member in dmatest_callback commit `66b3bd2356` upstream. The type of arg passed to dmatest_callback is struct dmatest_done. It refers to test_done in struct dmatest_thread, not done_wait. Fixes: `6f6a23a213` ("dmaengine: dmatest: move callback wait ...") Signed-off-by: Yang Shunyong <shunyong.yang@hxt-semitech.com> Acked-by: Adam Wallis <awallis@codeaurora.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:12 +01:00
Aurelien Aptel	7e68916c36	CIFS: zero sensitive data when freeing commit `97f4b7276b` upstream. also replaces memset()+kfree() by kzfree(). Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:12 +01:00
Daniel N Pettersson	f59eda1664	cifs: Fix autonegotiate security settings mismatch commit `9aca7e4544` upstream. Autonegotiation gives a security settings mismatch error if the SMB server selects an SMBv3 dialect that isn't SMB3.02. The exact error is "protocol revalidation - security settings mismatch". This can be tested using Samba v4.2 or by setting the global Samba setting max protocol = SMB3_00. The check that fails in smb3_validate_negotiate is the dialect verification of the negotiate info response. This is because it tries to verify against the protocol_id in the global smbdefault_values. The protocol_id in smbdefault_values is SMB3.02. In SMB2_negotiate the protocol_id in smbdefault_values isn't updated, it is global so it probably shouldn't be, but server->dialect is. This patch changes the check in smb3_validate_negotiate to use server->dialect instead of server->vals->protocol_id. The patch works with autonegotiate and when using a specific version in the vers mount option. Signed-off-by: Daniel N Pettersson <danielnp@axis.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:12 +01:00
Matthew Wilcox	ee6858f72a	cifs: Fix missing put_xid in cifs_file_strict_mmap commit `f04a703c3d` upstream. If cifs_zap_mapping() returned an error, we would return without putting the xid that we got earlier. Restructure cifs_file_strict_mmap() and cifs_file_mmap() to be more similar to each other and have a single point of return that always puts the xid. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:12 +01:00
Michal Suchanek	ba4f9c192d	powerpc/pseries: include linux/types.h in asm/hvcall.h commit `1b689a95ce` upstream. Commit `6e032b350c` ("powerpc/powernv: Check device-tree for RFI flush settings") uses u64 in asm/hvcall.h without including linux/types.h This breaks hvcall.h users that do not include the header themselves. Fixes: `6e032b350c` ("powerpc/powernv: Check device-tree for RFI flush settings") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-17 13:21:12 +01:00
Greg Kroah-Hartman	7f3bd8db99	Linux 4.9.81	2018-02-13 12:36:03 +01:00
Borislav Petkov	2760f452a7	x86/microcode: Do the family check first commit `1f161f67a2` upstream with adjustments. On CPUs like AMD's Geode, for example, we shouldn't even try to load microcode because they do not support the modern microcode loading interface. However, we do the family check after the other checks whether the loader has been disabled on the command line or whether we're running in a guest. So move the family checks first in order to exit early if we're being loaded on an unsupported family. Reported-and-tested-by: Sven Glodowski <glodi1@arcor.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.11.. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://bugzilla.suse.com/show_bug.cgi?id=1061396 Link: http://lkml.kernel.org/r/20171012112316.977-1-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:03 +01:00
Laurent Pinchart	230ca8fb95	drm: rcar-du: Fix race condition when disabling planes at CRTC stop commit `641307df71` upstream. When stopping the CRTC the driver must disable all planes and wait for the change to take effect at the next vblank. Merely calling drm_crtc_wait_one_vblank() is not enough, as the function doesn't include any mechanism to handle the race with vblank interrupts. Replace the drm_crtc_wait_one_vblank() call with a manual mechanism that handles the vblank interrupt race. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> Signed-off-by: thongsyho <thong.ho.px@rvc.renesas.com> Signed-off-by: Nhan Nguyen <nhan.nguyen.yb@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:03 +01:00
Laurent Pinchart	758e22acf4	drm: rcar-du: Use the VBK interrupt for vblank events commit `cbbb90b0c0` upstream. When implementing support for interlaced modes, the driver switched from reporting vblank events on the vertical blanking (VBK) interrupt to the frame end interrupt (FRM). This incorrectly divided the reported refresh rate by two. Fix it by moving back to the VBK interrupt. Fixes: `906eff7fca` ("drm: rcar-du: Implement support for interlaced modes") Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> Signed-off-by: thongsyho <thong.ho.px@rvc.renesas.com> Signed-off-by: Nhan Nguyen <nhan.nguyen.yb@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:03 +01:00
Kuninori Morimoto	1cb145c672	ASoC: rsnd: avoid duplicate free_irq() commit `e0936c3471` upstream. commit `1f8754d4da` ("ASoC: rsnd: don't call free_irq() on Parent SSI") fixed Parent SSI duplicate free_irq(). But on Renesas Sound, not only Parent SSI but also Multi SSI have same issue. This patch avoid duplicate free_irq() if it was not pure SSI. Fixes: `1f8754d4da` ("ASoC: rsnd: don't call free_irq() on Parent SSI") Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: thongsyho <thong.ho.px@rvc.renesas.com> Signed-off-by: Nhan Nguyen <nhan.nguyen.yb@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:03 +01:00
Kuninori Morimoto	24978c21f7	ASoC: rsnd: don't call free_irq() on Parent SSI commit `1f8754d4da` upstream. If SSI uses shared pin, some SSI will be used as parent SSI. Then, normal SSI's remove and Parent SSI's remove (these are same SSI) will be called when unbind or remove timing. In this case, free_irq() will be called twice. This patch solve this issue. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Tested-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com> Reported-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: thongsyho <thong.ho.px@rvc.renesas.com> Signed-off-by: Nhan Nguyen <nhan.nguyen.yb@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:03 +01:00
Julian Scheel	a7de0e9718	ASoC: simple-card: Fix misleading error message commit `7ac45d1635` upstream. In case cpu could not be found the error message would always refer to /codec/ not being found in DT. Fix this by catching the cpu node not found case explicitly. Signed-off-by: Julian Scheel <julian@jusst.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: thongsyho <thong.ho.px@rvc.renesas.com> Signed-off-by: Nhan Nguyen <nhan.nguyen.yb@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:03 +01:00
Robert Baronescu	7c17a1e585	crypto: tcrypt - fix S/G table for test_aead_speed() commit `5c6ac1d4f8` upstream. In case buffer length is a multiple of PAGE_SIZE, the S/G table is incorrectly generated. Fix this by handling buflen = k * PAGE_SIZE separately. Signed-off-by: Robert Baronescu <robert.baronescu@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:02 +01:00
KarimAllah Ahmed	fc00dde960	KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL (cherry picked from commit `b2ac58f905`) [ Based on a patch from Paolo Bonzini <pbonzini@redhat.com> ] ... basically doing exactly what we do for VMX: - Passthrough SPEC_CTRL to guests (if enabled in guest CPUID) - Save and restore SPEC_CTRL around VMExit and VMEntry only if the guest actually used it. Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517669783-20732-1-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:02 +01:00
KarimAllah Ahmed	e5a83419c9	KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL (cherry picked from commit `d28b387fb7`) [ Based on a patch from Ashok Raj <ashok.raj@intel.com> ] Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for guests that will only mitigate Spectre V2 through IBRS+IBPB and will not be using a retpoline+IBPB based approach. To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only start saving and restoring when a non-zero is written to it. No attempt is made to handle STIBP here, intentionally. Filtering STIBP may be added in a future patch, which may require trapping all writes if we don't want to pass it through directly to the guest. [dwmw2: Clean up CPUID bits, save/restore manually, handle reset] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:02 +01:00
KarimAllah Ahmed	755502f810	KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES (cherry picked from commit `28c1c9fabf`) Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the contents will come directly from the hardware, but user-space can still override it. [dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517522386-18410-4-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:02 +01:00
Ashok Raj	7013129a40	KVM/x86: Add IBPB support (cherry picked from commit `15d4507152`) The Indirect Branch Predictor Barrier (IBPB) is an indirect branch control mechanism. It keeps earlier branches from influencing later ones. Unlike IBRS and STIBP, IBPB does not define a new mode of operation. It's a command that ensures predicted branch targets aren't used after the barrier. Although IBRS and IBPB are enumerated by the same CPUID enumeration, IBPB is very different. IBPB helps mitigate against three potential attacks: * Mitigate guests from being attacked by other guests. - This is addressed by issing IBPB when we do a guest switch. * Mitigate attacks from guest/ring3->host/ring3. These would require a IBPB during context switch in host, or after VMEXIT. The host process has two ways to mitigate - Either it can be compiled with retpoline - If its going through context switch, and has set !dumpable then there is a IBPB in that path. (Tim's patch: https://patchwork.kernel.org/patch/10192871) - The case where after a VMEXIT you return back to Qemu might make Qemu attackable from guest when Qemu isn't compiled with retpoline. There are issues reported when doing IBPB on every VMEXIT that resulted in some tsc calibration woes in guest. * Mitigate guest/ring0->host/ring0 attacks. When host kernel is using retpoline it is safe against these attacks. If host kernel isn't using retpoline we might need to do a IBPB flush on every VMEXIT. Even when using retpoline for indirect calls, in certain conditions 'ret' can use the BTB on Skylake-era CPUs. There are other mitigations available like RSB stuffing/clearing. * IBPB is issued only for SVM during svm_free_vcpu(). VMX has a vmclear and SVM doesn't. Follow discussion here: https://lkml.org/lkml/2018/1/15/146 Please refer to the following spec for more details on the enumeration and control. Refer here to get documentation about mitigations. https://software.intel.com/en-us/side-channel-security-support [peterz: rebase and changelog rewrite] [karahmed: - rebase - vmx: expose PRED_CMD if guest has it in CPUID - svm: only pass through IBPB if guest has it in CPUID - vmx: support !cpu_has_vmx_msr_bitmap()] - vmx: support nested] [dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS) PRED_CMD is a write-only MSR] Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: kvm@vger.kernel.org Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:02 +01:00
Paolo Bonzini	6236b782eb	KVM: VMX: make MSR bitmaps per-VCPU (cherry picked from commit `904e14fb7c`) Place the MSR bitmap in struct loaded_vmcs, and update it in place every time the x2apic or APICv state can change. This is rare and the loop can handle 64 MSRs per iteration, in a similar fashion as nested_vmx_prepare_msr_bitmap. This prepares for choosing, on a per-VM basis, whether to intercept the SPEC_CTRL and PRED_CMD MSRs. Cc: stable@vger.kernel.org # prereq for Spectre mitigation Suggested-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:02 +01:00
Paolo Bonzini	ff546f9d83	KVM: VMX: introduce alloc_loaded_vmcs (cherry picked from commit `f21f165ef9`) Group together the calls to alloc_vmcs and loaded_vmcs_init. Soon we'll also allocate an MSR bitmap there. Cc: stable@vger.kernel.org # prereq for Spectre mitigation Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:02 +01:00
Jim Mattson	46e24dfc2d	KVM: nVMX: Eliminate vmcs02 pool (cherry picked from commit `de3a0021a6`) The potential performance advantages of a vmcs02 pool have never been realized. To simplify the code, eliminate the pool. Instead, a single vmcs02 is allocated per VCPU when the VCPU enters VMX operation. Cc: stable@vger.kernel.org # prereq for Spectre mitigation Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Mark Kanda <mark.kanda@oracle.com> Reviewed-by: Ameya More <ameya.more@oracle.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:02 +01:00
David Matlack	b7649e1776	KVM: nVMX: mark vmcs12 pages dirty on L2 exit (cherry picked from commit `c9f04407f2`) The host physical addresses of L1's Virtual APIC Page and Posted Interrupt descriptor are loaded into the VMCS02. The CPU may write to these pages via their host physical address while L2 is running, bypassing address-translation-based dirty tracking (e.g. EPT write protection). Mark them dirty on every exit from L2 to prevent them from getting out of sync with dirty tracking. Also mark the virtual APIC page and the posted interrupt descriptor dirty when KVM is virtualizing posted interrupt processing. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:02 +01:00
David Hildenbrand	1edccf20b9	KVM: nVMX: vmx_complete_nested_posted_interrupt() can't fail (cherry picked from commit `6342c50ad1`) vmx_complete_nested_posted_interrupt() can't fail, let's turn it into a void function. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:01 +01:00
David Hildenbrand	19b1d4bdfe	KVM: nVMX: kmap() can't fail commit `42cf014d38` upstream. kmap() can't fail, therefore it will always return a valid pointer. Let's just get rid of the unnecessary checks. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:01 +01:00
Darren Kenny	34900390e9	x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL (cherry picked from commit `af189c95a3`) Fixes: `117cc7a908` ("x86/retpoline: Fill return stack buffer on vmexit") Signed-off-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180202191220.blvgkgutojecxr3b@starbug-vm.ie.oracle.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:01 +01:00
Arnd Bergmann	4b234a253e	x86/pti: Mark constant arrays as __initconst (cherry picked from commit `4bf5d56d42`) I'm seeing build failures from the two newly introduced arrays that are marked 'const' and '__initdata', which are mutually exclusive: arch/x86/kernel/cpu/common.c:882:43: error: 'cpu_no_speculation' causes a section type conflict with 'e820_table_firmware_init' arch/x86/kernel/cpu/common.c:895:43: error: 'cpu_no_meltdown' causes a section type conflict with 'e820_table_firmware_init' The correct annotation is __initconst. Fixes: `fec9434a12` ("x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Thomas Garnier <thgarnie@google.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180202213959.611210-1-arnd@arndb.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:01 +01:00
KarimAllah Ahmed	961cb14c61	x86/spectre: Simplify spectre_v2 command line parsing (cherry picked from commit `9005c6834c`) [dwmw2: Use ARRAY_SIZE] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517484441-1420-3-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:01 +01:00
David Woodhouse	fe43338939	x86/retpoline: Avoid retpolines for built-in __init functions (cherry picked from commit `66f793099a`) There's no point in building init code with retpolines, since it runs before any potentially hostile userspace does. And before the retpoline is actually ALTERNATIVEd into place, for much of it. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: karahmed@amazon.de Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517484441-1420-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:01 +01:00
Dan Williams	eb99bd6341	x86/kvm: Update spectre-v1 mitigation (cherry picked from commit `085331dfc6`) Commit `75f139aaf8` "KVM: x86: Add memory barrier on vmcs field lookup" added a raw 'asm("lfence");' to prevent a bounds check bypass of 'vmcs_field_to_offset_table'. The lfence can be avoided in this path by using the array_index_nospec() helper designed for these types of fixes. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Andrew Honig <ahonig@google.com> Cc: kvm@vger.kernel.org Cc: Jim Mattson <jmattson@google.com> Link: https://lkml.kernel.org/r/151744959670.6342.3001723920950249067.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:01 +01:00
Josh Poimboeuf	7552556f65	x86/paravirt: Remove 'noreplace-paravirt' cmdline option (cherry picked from commit `12c69f1e94`) The 'noreplace-paravirt' option disables paravirt patching, leaving the original pv indirect calls in place. That's highly incompatible with retpolines, unless we want to uglify paravirt even further and convert the paravirt calls to retpolines. As far as I can tell, the option doesn't seem to be useful for much other than introducing surprising corner cases and making the kernel vulnerable to Spectre v2. It was probably a debug option from the early paravirt days. So just remove it. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jason Baron <jbaron@akamai.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Link: https://lkml.kernel.org/r/20180131041333.2x6blhxirc2kclrq@treble Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:01 +01:00
David Woodhouse	cda6b6074c	x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel (cherry picked from commit `7fcae1118f`) Despite the fact that all the other code there seems to be doing it, just using set_cpu_cap() in early_intel_init() doesn't actually work. For CPUs with PKU support, setup_pku() calls get_cpu_cap() after c->c_init() has set those feature bits. That resets those bits back to what was queried from the hardware. Turning the bits off for bad microcode is easy to fix. That can just use setup_clear_cpu_cap() to force them off for all CPUs. I was less keen on forcing the feature bits on that way, just in case of inconsistencies. I appreciate that the kernel is going to get this utterly wrong if CPU features are not consistent, because it has already applied alternatives by the time secondary CPUs are brought up. But at least if setup_force_cpu_cap() isn't being used, we might have a chance of detecting the lack of the corresponding bit and either panicking or refusing to bring the offending CPU online. So ensure that the appropriate feature bits are set within get_cpu_cap() regardless of how many extra times it's called. Fixes: `2961298e` ("x86/cpufeatures: Clean up Spectre v2 related CPUID flags") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: karahmed@amazon.de Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517322623-15261-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:01 +01:00
Colin Ian King	f67e05d150	x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable" (cherry picked from commit `e698dcdfcd`) Trivial fix to spelling mistake in pr_err error message text. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: kernel-janitors@vger.kernel.org Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180130193218.9271-1-colin.king@canonical.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:00 +01:00
Dan Williams	359fde6bd0	x86/spectre: Report get_user mitigation for spectre_v1 (cherry picked from commit `edfbae53da`) Reflect the presence of get_user(), __get_user(), and 'syscall' protections in sysfs. The expectation is that new and better tooling will allow the kernel to grow more usages of array_index_nospec(), for now, only claim mitigation for __user pointer de-references. Reported-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727420158.33451.11658324346540434635.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:00 +01:00
Dan Williams	0781a50a30	nl80211: Sanitize array index in parse_txq_params (cherry picked from commit `259d8c1e98`) Wireless drivers rely on parse_txq_params to validate that txq_params->ac is less than NL80211_NUM_ACS by the time the low-level driver's ->conf_tx() handler is called. Use a new helper, array_index_nospec(), to sanitize txq_params->ac with respect to speculation. I.e. ensure that any speculation into ->conf_tx() handlers is done with a value of txq_params->ac that is within the bounds of [0, NL80211_NUM_ACS). Reported-by: Christian Lamparter <chunkeey@gmail.com> Reported-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Johannes Berg <johannes@sipsolutions.net> Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: linux-wireless@vger.kernel.org Cc: torvalds@linux-foundation.org Cc: "David S. Miller" <davem@davemloft.net> Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727419584.33451.7700736761686184303.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:00 +01:00
Dan Williams	c26ceec695	vfs, fdtable: Prevent bounds-check bypass via speculative execution (cherry picked from commit `56c30ba7b3`) 'fd' is a user controlled value that is used as a data dependency to read from the 'fdt->fd' array. In order to avoid potential leaks of kernel memory values, block speculative execution of the instruction stream that could issue reads based on an invalid 'file *' returned from __fcheck_files. Co-developed-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727418500.33451.17392199002892248656.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:00 +01:00
Dan Williams	c3193fd49f	x86/syscall: Sanitize syscall table de-references under speculation (cherry picked from commit `2fbd7af5af`) The syscall table base is a user controlled function pointer in kernel space. Use array_index_nospec() to prevent any out of bounds speculation. While retpoline prevents speculating into a userspace directed target it does not stop the pointer de-reference, the concern is leaking memory relative to the syscall table base, by observing instruction cache behavior. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Andy Lutomirski <luto@kernel.org> Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727417984.33451.1216731042505722161.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:00 +01:00
Dan Williams	398a39311c	x86/get_user: Use pointer masking to limit speculation (cherry picked from commit `c7f631cb07`) Quoting Linus: I do think that it would be a good idea to very expressly document the fact that it's not that the user access itself is unsafe. I do agree that things like "get_user()" want to be protected, but not because of any direct bugs or problems with get_user() and friends, but simply because get_user() is an excellent source of a pointer that is obviously controlled from a potentially attacking user space. So it's a prime candidate for then finding _subsequent_ accesses that can then be used to perturb the cache. Unlike the __get_user() case get_user() includes the address limit check near the pointer de-reference. With that locality the speculation can be mitigated with pointer narrowing rather than a barrier, i.e. array_index_nospec(). Where the narrowing is performed by: cmp %limit, %ptr sbb %mask, %mask and %mask, %ptr With respect to speculation the value of %ptr is either less than %limit or NULL. Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Kees Cook <keescook@chromium.org> Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727417469.33451.11804043010080838495.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:00 +01:00
Dan Williams	065eae4be8	x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec (cherry picked from commit `304ec1b050`) Quoting Linus: I do think that it would be a good idea to very expressly document the fact that it's not that the user access itself is unsafe. I do agree that things like "get_user()" want to be protected, but not because of any direct bugs or problems with get_user() and friends, but simply because get_user() is an excellent source of a pointer that is obviously controlled from a potentially attacking user space. So it's a prime candidate for then finding _subsequent_ accesses that can then be used to perturb the cache. __uaccess_begin_nospec() covers __get_user() and copy_from_iter() where the limit check is far away from the user pointer de-reference. In those cases a barrier_nospec() prevents speculation with a potential pointer to privileged memory. uaccess_try_nospec covers get_user_try. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Kees Cook <keescook@chromium.org> Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727416953.33451.10508284228526170604.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:00 +01:00
Dan Williams	ae75f83e79	x86/usercopy: Replace open coded stac/clac with __uaccess_{begin, end} (cherry picked from commit `b5c4ae4f35`) In preparation for converting some __uaccess_begin() instances to __uacess_begin_nospec(), make sure all 'from user' uaccess paths are using the _begin(), _end() helpers rather than open-coded stac() and clac(). No functional changes. Suggested-by: Ingo Molnar <mingo@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Kees Cook <keescook@chromium.org> Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727416438.33451.17309465232057176966.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:00 +01:00
Dan Williams	e06d7bfb22	x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec (cherry picked from commit `b3bbfb3fb5`) For __get_user() paths, do not allow the kernel to speculate on the value of a user controlled pointer. In addition to the 'stac' instruction for Supervisor Mode Access Protection (SMAP), a barrier_nospec() causes the access_ok() result to resolve in the pipeline before the CPU might take any speculative action on the pointer value. Given the cost of 'stac' the speculation barrier is placed after 'stac' to hopefully overlap the cost of disabling SMAP with the cost of flushing the instruction pipeline. Since __get_user is a major kernel interface that deals with user controlled pointers, the __uaccess_begin_nospec() mechanism will prevent speculative execution past an access_ok() permission check. While speculative execution past access_ok() is not enough to lead to a kernel memory leak, it is a necessary precondition. To be clear, __uaccess_begin_nospec() is addressing a class of potential problems near __get_user() usages. Note, that while the barrier_nospec() in __uaccess_begin_nospec() is used to protect __get_user(), pointer masking similar to array_index_nospec() will be used for get_user() since it incorporates a bounds check near the usage. uaccess_try_nospec provides the same mechanism for get_user_try. No functional changes. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Andi Kleen <ak@linux.intel.com> Suggested-by: Ingo Molnar <mingo@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Kees Cook <keescook@chromium.org> Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727415922.33451.5796614273104346583.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:00 +01:00
Dan Williams	1f03d140e2	x86: Introduce barrier_nospec (cherry picked from commit `b3d7ad85b8`) Rename the open coded form of this instruction sequence from rdtsc_ordered() into a generic barrier primitive, barrier_nospec(). One of the mitigations for Spectre variant1 vulnerabilities is to fence speculative execution after successfully validating a bounds check. I.e. force the result of a bounds check to resolve in the instruction pipeline to ensure speculative execution honors that result before potentially operating on out-of-bounds data. No functional changes. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Andi Kleen <ak@linux.intel.com> Suggested-by: Ingo Molnar <mingo@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Kees Cook <keescook@chromium.org> Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727415361.33451.9049453007262764675.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:36:00 +01:00
Dan Williams	8c33e2d23a	x86: Implement array_index_mask_nospec (cherry picked from commit `babdde2698`) array_index_nospec() uses a mask to sanitize user controllable array indexes, i.e. generate a 0 mask if 'index' >= 'size', and a ~0 mask otherwise. While the default array_index_mask_nospec() handles the carry-bit from the (index - size) result in software. The x86 array_index_mask_nospec() does the same, but the carry-bit is handled in the processor CF flag without conditional instructions in the control flow. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: gregkh@linuxfoundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727414808.33451.1873237130672785331.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:59 +01:00
Dan Williams	579ef9ea20	array_index_nospec: Sanitize speculative array de-references (cherry picked from commit `f380420330`) array_index_nospec() is proposed as a generic mechanism to mitigate against Spectre-variant-1 attacks, i.e. an attack that bypasses boundary checks via speculative execution. The array_index_nospec() implementation is expected to be safe for current generation CPUs across multiple architectures (ARM, x86). Based on an original implementation by Linus Torvalds, tweaked to remove speculative flows by Alexei Starovoitov, and tweaked again by Linus to introduce an x86 assembly implementation for the mask generation. Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org> Co-developed-by: Alexei Starovoitov <ast@kernel.org> Suggested-by: Cyril Novikov <cnovikov@lynx.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727414229.33451.18411580953862676575.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:59 +01:00
Mark Rutland	899ab2cf91	Documentation: Document array_index_nospec (cherry picked from commit `f84a56f73d`) Document the rationale and usage of the new array_index_nospec() helper. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: linux-arch@vger.kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: gregkh@linuxfoundation.org Cc: kernel-hardening@lists.openwall.com Cc: torvalds@linux-foundation.org Cc: alan@linux.intel.com Link: https://lkml.kernel.org/r/151727413645.33451.15878817161436755393.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:59 +01:00
Andy Lutomirski	f03d00ba0b	x86/asm: Move 'status' from thread_struct to thread_info (cherry picked from commit `37a8f7c383`) The TS_COMPAT bit is very hot and is accessed from code paths that mostly also touch thread_info::flags. Move it into struct thread_info to improve cache locality. The only reason it was in thread_struct is that there was a brief period during which arch-specific fields were not allowed in struct thread_info. Linus suggested further changing: ti->status &= ~(TS_COMPAT\|TS_I386_REGS_POKED); to: if (unlikely(ti->status & (TS_COMPAT\|TS_I386_REGS_POKED))) ti->status &= ~(TS_COMPAT\|TS_I386_REGS_POKED); on the theory that frequently dirtying the cacheline even in pure 64-bit code that never needs to modify status hurts performance. That could be a reasonable followup patch, but I suspect it matters less on top of this patch. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Link: https://lkml.kernel.org/r/03148bcc1b217100e6e8ecf6a5468c45cf4304b6.1517164461.git.luto@kernel.org Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:59 +01:00
Andy Lutomirski	572e509178	x86/entry/64: Push extra regs right away (cherry picked from commit `d1f7732009`) With the fast path removed there is no point in splitting the push of the normal and the extra register set. Just push the extra regs right away. [ tglx: Split out from 'x86/entry/64: Remove the SYSCALL64 fast path' ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:59 +01:00
Andy Lutomirski	d7f8d17406	x86/entry/64: Remove the SYSCALL64 fast path (cherry picked from commit `21d375b6b3`) The SYCALLL64 fast path was a nice, if small, optimization back in the good old days when syscalls were actually reasonably fast. Now there is PTI to slow everything down, and indirect branches are verboten, making everything messier. The retpoline code in the fast path is particularly nasty. Just get rid of the fast path. The slow path is barely slower. [ tglx: Split out the 'push all extra regs' part ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:59 +01:00
Dou Liyang	9eedeb72c4	x86/spectre: Check CONFIG_RETPOLINE in command line parser (cherry picked from commit `9471eee918`) The spectre_v2 option 'auto' does not check whether CONFIG_RETPOLINE is enabled. As a consequence it fails to emit the appropriate warning and sets feature flags which have no effect at all. Add the missing IS_ENABLED() check. Fixes: `da28512156` ("x86/spectre: Add boot time option to select Spectre v2 mitigation") Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: Tomohiro <misono.tomohiro@jp.fujitsu.com> Cc: dave.hansen@intel.com Cc: bp@alien8.de Cc: arjan@linux.intel.com Cc: dwmw@amazon.co.uk Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/f5892721-7528-3647-08fb-f8d10e65ad87@cn.fujitsu.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:59 +01:00
Borislav Petkov	77d1424d2f	x86/retpoline: Simplify vmexit_fill_RSB() (cherry picked from commit `1dde7415e9`) Simplify it to call an asm-function instead of pasting 41 insn bytes at every call site. Also, add alignment to the macro as suggested here: https://support.google.com/faqs/answer/7625886 [dwmw2: Clean up comments, let it clobber %ebx and just tell the compiler] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517070274-12128-3-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:59 +01:00
David Woodhouse	77b3b3ee23	x86/cpufeatures: Clean up Spectre v2 related CPUID flags (cherry picked from commit `2961298efe`) We want to expose the hardware features simply in /proc/cpuinfo as "ibrs", "ibpb" and "stibp". Since AMD has separate CPUID bits for those, use them as the user-visible bits. When the Intel SPEC_CTRL bit is set which indicates both IBRS and IBPB capability, set those (AMD) bits accordingly. Likewise if the Intel STIBP bit is set, set the AMD STIBP that's used for the generic hardware capability. Hide the rest from /proc/cpuinfo by putting "" in the comments. Including RETPOLINE and RETPOLINE_AMD which shouldn't be visible there. There are patches to make the sysfs vulnerabilities information non-readable by non-root, and the same should apply to all information about which mitigations are actually in use. Those shouldn't appear in /proc/cpuinfo. The feature bit for whether IBPB is actually used, which is needed for ALTERNATIVEs, is renamed to X86_FEATURE_USE_IBPB. Originally-by: Borislav Petkov <bp@suse.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517070274-12128-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:59 +01:00
Thomas Gleixner	98911226d5	x86/cpu/bugs: Make retpoline module warning conditional (cherry picked from commit `e383095c7f`) If sysfs is disabled and RETPOLINE not defined: arch/x86/kernel/cpu/bugs.c:97:13: warning: ‘spectre_v2_bad_module’ defined but not used [-Wunused-variable] static bool spectre_v2_bad_module; Hide it. Fixes: `caf7501a1b` ("module/retpoline: Warn about missing retpoline in module") Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:59 +01:00
Borislav Petkov	557cbfa222	x86/bugs: Drop one "mitigation" from dmesg (cherry picked from commit `55fa19d3e5`) Make [ 0.031118] Spectre V2 mitigation: Mitigation: Full generic retpoline into [ 0.031118] Spectre V2: Mitigation: Full generic retpoline to reduce the mitigation mitigations strings. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: riel@redhat.com Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: David Woodhouse <dwmw2@infradead.org> Cc: jikos@kernel.org Cc: luto@amacapital.net Cc: dave.hansen@intel.com Cc: torvalds@linux-foundation.org Cc: keescook@google.com Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: tim.c.chen@linux.intel.com Cc: pjt@google.com Link: https://lkml.kernel.org/r/20180126121139.31959-5-bp@alien8.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:58 +01:00
Borislav Petkov	18bc71dff6	x86/nospec: Fix header guards names (cherry picked from commit `7a32fc51ca`) ... to adhere to the _ASM_X86_ naming scheme. No functional change. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: riel@redhat.com Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: David Woodhouse <dwmw2@infradead.org> Cc: jikos@kernel.org Cc: luto@amacapital.net Cc: dave.hansen@intel.com Cc: torvalds@linux-foundation.org Cc: keescook@google.com Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Cc: pjt@google.com Link: https://lkml.kernel.org/r/20180126121139.31959-3-bp@alien8.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:58 +01:00
David Woodhouse	31fd9eda7f	x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support (cherry picked from commit `20ffa1caec`) Expose indirect_branch_prediction_barrier() for use in subsequent patches. [ tglx: Add IBPB status to spectre_v2 sysfs file ] Co-developed-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-8-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:58 +01:00
David Woodhouse	6c5e49150a	x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes (cherry picked from commit `a5b2966364`) This doesn't refuse to load the affected microcodes; it just refuses to use the Spectre v2 mitigation features if they're detected, by clearing the appropriate feature bits. The AMD CPUID bits are handled here too, because hypervisors may have been exposing those bits even on Intel chips, for fine-grained control of what's available. It is non-trivial to use x86_match_cpu() for this table because that doesn't handle steppings. And the approach taken in commit `bd9240a18` almost made me lose my lunch. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-7-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:58 +01:00
David Woodhouse	a8799fd14d	x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown (cherry picked from commit `fec9434a12`) Also, for CPUs which don't speculate at all, don't report that they're vulnerable to the Spectre variants either. Leave the cpu_no_meltdown[] match table with just X86_VENDOR_AMD in it for now, even though that could be done with a simple comparison, on the assumption that we'll have more to add. Based on suggestions from Dave Hansen and Alan Cox. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-6-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:58 +01:00
David Woodhouse	af57d43c90	x86/msr: Add definitions for new speculation control MSRs (cherry picked from commit `1e340c60d0`) Add MSR and bit definitions for SPEC_CTRL, PRED_CMD and ARCH_CAPABILITIES. See Intel's 336996-Speculative-Execution-Side-Channel-Mitigations.pdf Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-5-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:58 +01:00
David Woodhouse	c26a6bea26	x86/cpufeatures: Add AMD feature bits for Speculation Control (cherry picked from commit `5d10cbc91d`) AMD exposes the PRED_CMD/SPEC_CTRL MSRs slightly differently to Intel. See http://lkml.kernel.org/r/2b3e25cc-286d-8bd0-aeaf-9ac4aae39de8@amd.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-4-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:58 +01:00
David Woodhouse	40532f65cc	x86/cpufeatures: Add Intel feature bits for Speculation Control (cherry picked from commit `fc67dd70ad`) Add three feature bits exposed by new microcode on Intel CPUs for speculation control. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-3-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:58 +01:00
David Woodhouse	d3eba77440	x86/cpufeatures: Add CPUID_7_EDX CPUID leaf (cherry picked from commit `95ca0ee863`) This is a pure feature bits leaf. There are two AVX512 feature bits in it already which were handled as scattered bits, and three more from this leaf are going to be added for speculation control features. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:58 +01:00
Andi Kleen	a1745ad92f	module/retpoline: Warn about missing retpoline in module (cherry picked from commit `caf7501a1b`) There's a risk that a kernel which has full retpoline mitigations becomes vulnerable when a module gets loaded that hasn't been compiled with the right compiler or the right option. To enable detection of that mismatch at module load time, add a module info string "retpoline" at build time when the module was compiled with retpoline support. This only covers compiled C source, but assembler source or prebuilt object files are not checked. If a retpoline enabled kernel detects a non retpoline protected module at load time, print a warning and report it in the sysfs vulnerability file. [ tglx: Massaged changelog ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: jeyu@kernel.org Cc: arjan@linux.intel.com Link: https://lkml.kernel.org/r/20180125235028.31211-1-andi@firstfloor.org Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:58 +01:00
Peter Zijlstra	ec86a1dad0	KVM: VMX: Make indirect call speculation safe (cherry picked from commit `c940a3fb1e`) Replace indirect call with CALL_NOSPEC. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: rga@amazon.de Cc: Dave Hansen <dave.hansen@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: https://lkml.kernel.org/r/20180125095843.645776917@infradead.org Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:57 +01:00
Peter Zijlstra	fea3c9a540	KVM: x86: Make indirect calls in emulator speculation safe (cherry picked from commit `1a29b5b7f3`) Replace the indirect calls with CALL_NOSPEC. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: rga@amazon.de Cc: Dave Hansen <dave.hansen@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: https://lkml.kernel.org/r/20180125095843.595615683@infradead.org [dwmw2: Use ASM_CALL_CONSTRAINT like upstream, now we have it] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:57 +01:00
Waiman Long	734e687d1d	x86/retpoline: Remove the esp/rsp thunk (cherry picked from commit `1df37383a8`) It doesn't make sense to have an indirect call thunk with esp/rsp as retpoline code won't work correctly with the stack pointer register. Removing it will help compiler writers to catch error in case such a thunk call is emitted incorrectly. Fixes: `76b043848f` ("x86/retpoline: Add initial retpoline support") Suggested-by: Jeff Law <law@redhat.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Kees Cook <keescook@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1516658974-27852-1-git-send-email-longman@redhat.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:57 +01:00
Eric Biggers	9692602ab8	KEYS: encrypted: fix buffer overread in valid_master_desc() commit `794b4bc292` upstream. With the 'encrypted' key type it was possible for userspace to provide a data blob ending with a master key description shorter than expected, e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a master key description, validate_master_desc() could read beyond the end of the buffer. Fix this by using strncmp() instead of memcmp(). [Also clean up the code to deduplicate some logic.] Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Jin Qian <jinqian@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:57 +01:00
Takashi Iwai	0a01ecbd23	b43: Add missing MODULE_FIRMWARE() commit `3c89a72ad8` upstream. Some firmware entries were forgotten to be added via MODULE_FIRMWARE(), which may result in the non-functional state when the driver is loaded in initrd. Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1037344 Fixes: `15be8e89cd` ("b43: add more bcma cores") Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:57 +01:00
Jesse Chan	113d22965c	media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE commit `5331aec1bf` upstream. This change resolves a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/soc_camera/soc_scale_crop.o see include/linux/module.h for more information This adds the license as "GPL", which matches the header of the file. MODULE_DESCRIPTION and MODULE_AUTHOR are also added. Signed-off-by: Jesse Chan <jc@linux.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:57 +01:00
Borislav Petkov	dd7b14c3e0	x86/microcode/AMD: Do not load when running on a hypervisor commit `a15a753539` upstream. Doing so is completely void of sense for multiple reasons so prevent it. Set dis_ucode_ldr to true and thus disable the microcode loader by default to address xen pv guests which execute the AP path but not the BSP path. By having it turned off by default, the APs won't run into the loader either. Also, check CPUID(1).ECX[31] which hypervisors set. Well almost, not the xen pv one. That one gets the aforementioned "fix". Also, improve the detection method by caching the final decision whether to continue loading in dis_ucode_ldr and do it once on the BSP. The APs then simply test that value. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Juergen Gross <jgross@suse.com> Link: http://lkml.kernel.org/r/20161218164414.9649-4-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rolf Neugebauer <rolf.neugebauer@docker.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:57 +01:00
Josh Poimboeuf	0a9b2dec6c	x86/asm: Fix inline asm call constraints for GCC 4.4 commit `520a13c530` upstream. The kernel test bot (run by Xiaolong Ye) reported that the following commit: `f5caf621ee` ("x86/asm: Fix inline asm call constraints for Clang") is causing double faults in a kernel compiled with GCC 4.4. Linus subsequently diagnosed the crash pattern and the buggy commit and found that the issue is with this code: register unsigned int __asm_call_sp asm("esp"); #define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp) Even on a 64-bit kernel, it's using ESP instead of RSP. That causes GCC to produce the following bogus code: ffffffff8147461d: 89 e0 mov %esp,%eax ffffffff8147461f: 4c 89 f7 mov %r14,%rdi ffffffff81474622: 4c 89 fe mov %r15,%rsi ffffffff81474625: ba 20 00 00 00 mov $0x20,%edx ffffffff8147462a: 89 c4 mov %eax,%esp ffffffff8147462c: e8 bf 52 05 00 callq ffffffff814c98f0 <copy_user_generic_unrolled> Despite the absurdity of it backing up and restoring the stack pointer for no reason, the bug is actually the fact that it's only backing up and restoring the lower 32 bits of the stack pointer. The upper 32 bits are getting cleared out, corrupting the stack pointer. So change the '__asm_call_sp' register variable to be associated with the actual full-size stack pointer. This also requires changing the __ASM_SEL() macro to be based on the actual compiled arch size, rather than the CONFIG value, because CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso). Otherwise Clang fails to build the kernel because it complains about the use of a 64-bit register (RSP) in a 32-bit file. Reported-and-Bisected-and-Tested-by: kernel test robot <xiaolong.ye@intel.com> Diagnosed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: LKP <lkp@01.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `f5caf621ee` ("x86/asm: Fix inline asm call constraints for Clang") Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:57 +01:00
Eric Dumazet	b671f40419	soreuseport: fix mem leak in reuseport_add_sock() [ Upstream commit `4db428a7c9` ] reuseport_add_sock() needs to deal with attaching a socket having its own sk_reuseport_cb, after a prior setsockopt(SO_ATTACH_REUSEPORT_?BPF) Without this fix, not only a WARN_ONCE() was issued, but we were also leaking memory. Thanks to sysbot and Eric Biggers for providing us nice C repros. ------------[ cut here ]------------ socket already in reuseport group WARNING: CPU: 0 PID: 3496 at net/core/sock_reuseport.c:119 reuseport_add_sock+0x742/0x9b0 net/core/sock_reuseport.c:117 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 3496 Comm: syzkaller869503 Not tainted 4.15.0-rc6+ #245 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1dc/0x200 kernel/panic.c:547 report_bug+0x211/0x2d0 lib/bug.c:184 fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 fixup_bug arch/x86/kernel/traps.c:247 [inline] do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079 Fixes: `ef456144da` ("soreuseport: define reuseport groups") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+c0ea2226f77a42936bf7@syzkaller.appspotmail.com Acked-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:57 +01:00
Martin KaFai Lau	5771415d24	ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only [ Upstream commit `7ece54a60e` ] If a sk_v6_rcv_saddr is !IPV6_ADDR_ANY and !IPV6_ADDR_MAPPED, it implicitly implies it is an ipv6only socket. However, in inet6_bind(), this addr_type checking and setting sk->sk_ipv6only to 1 are only done after sk->sk_prot->get_port(sk, snum) has been completed successfully. This inconsistency between sk_v6_rcv_saddr and sk_ipv6only confuses the 'get_port()'. In particular, when binding SO_REUSEPORT UDP sockets, udp_reuseport_add_sock(sk,...) is called. udp_reuseport_add_sock() checks "ipv6_only_sock(sk2) == ipv6_only_sock(sk)" before adding sk to sk2->sk_reuseport_cb. In this case, ipv6_only_sock(sk2) could be 1 while ipv6_only_sock(sk) is still 0 here. The end result is, reuseport_alloc(sk) is called instead of adding sk to the existing sk2->sk_reuseport_cb. It can be reproduced by binding two SO_REUSEPORT UDP sockets on an IPv6 address (!ANY and !MAPPED). Only one of the socket will receive packet. The fix is to set the implicit sk_ipv6only before calling get_port(). The original sk_ipv6only has to be saved such that it can be restored in case get_port() failed. The situation is similar to the inet_reset_saddr(sk) after get_port() has failed. Thanks to Calvin Owens <calvinowens@fb.com> who created an easy reproduction which leads to a fix. Fixes: `e32ea7e747` ("soreuseport: fast reuseport UDP socket selection") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:57 +01:00
Paolo Abeni	fa46d1437f	cls_u32: add missing RCU annotation. [ Upstream commit `058a6c0334` ] In a couple of points of the control path, n->ht_down is currently accessed without the required RCU annotation. The accesses are safe, but sparse complaints. Since we already held the rtnl lock, let use rtnl_dereference(). Fixes: `a1b7c5fd7f` ("net: sched: add cls_u32 offload hooks for netdevs") Fixes: `de5df63228` ("net: sched: cls_u32 changes to knode must appear atomic to readers") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:56 +01:00
Neal Cardwell	b980f718f5	tcp_bbr: fix pacing_gain to always be unity when using lt_bw [ Upstream commit `3aff3b4b98` ] This commit fixes the pacing_gain to remain at BBR_UNIT (1.0) when using lt_bw and returning from the PROBE_RTT state to PROBE_BW. Previously, when using lt_bw, upon exiting PROBE_RTT and entering PROBE_BW the bbr_reset_probe_bw_mode() code could sometimes randomly end up with a cycle_idx of 0 and hence have bbr_advance_cycle_phase() set a pacing gain above 1.0. In such cases this would result in a pacing rate that is 1.25x higher than intended, potentially resulting in a high loss rate for a little while until we stop using the lt_bw a bit later. This commit is a stable candidate for kernels back as far as 4.9. Fixes: `0f8782ea14` ("tcp_bbr: add BBR congestion control") Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Reported-by: Beyers Cronje <bcronje@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:56 +01:00
Jason Wang	73adb3b74e	vhost_net: stop device during reset owner [ Upstream commit `4cd879515d` ] We don't stop device before reset owner, this means we could try to serve any virtqueue kick before reset dev->worker. This will result a warn since the work was pending at llist during owner resetting. Fix this by stopping device during owner reset. Reported-by: syzbot+eb17c6162478cc50632c@syzkaller.appspotmail.com Fixes: `3a4d5c94e9` ("vhost_net: a kernel-level virtio server") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:56 +01:00
Li RongQing	ee46a86142	tcp: release sk_frag.page in tcp_disconnect [ Upstream commit `9b42d55a66` ] socket can be disconnected and gets transformed back to a listening socket, if sk_frag.page is not released, which will be cloned into a new socket by sk_clone_lock, but the reference count of this page is increased, lead to a use after free or double free issue Signed-off-by: Li RongQing <lirongqing@baidu.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:56 +01:00
Chunhao Lin	5db5cabbf0	r8169: fix RTL8168EP take too long to complete driver initialization. [ Upstream commit `086ca23d03` ] Driver check the wrong register bit in rtl_ocp_tx_cond() that keep driver waiting until timeout. Fix this by waiting for the right register bit. Signed-off-by: Chunhao Lin <hau@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:56 +01:00
Kristian Evensen	9f2f873d5a	qmi_wwan: Add support for Quectel EP06 [ Upstream commit `c0b91a56a2` ] The Quectel EP06 is a Cat. 6 LTE modem. It uses the same interface as the EC20/EC25 for QMI, and requires the same "set DTR"-quirk to work. Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:56 +01:00
Junxiao Bi	97fe899816	qlcnic: fix deadlock bug [ Upstream commit `233ac38916` ] The following soft lockup was caught. This is a deadlock caused by recusive locking. Process kworker/u40:1:28016 was holding spin lock "mbx->queue_lock" in qlcnic_83xx_mailbox_worker(), while a softirq came in and ask the same spin lock in qlcnic_83xx_enqueue_mbx_cmd(). This lock should be hold by disable bh.. [161846.962125] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/u40:1:28016] [161846.962367] Modules linked in: tun ocfs2 xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn xenfs xen_privcmd autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bnx2fc fcoe libfcoe libfc sunrpc 8021q mrp garp bridge stp llc bonding dm_round_robin dm_multipath iTCO_wdt iTCO_vendor_support pcspkr sb_edac edac_core i2c_i801 shpchp lpc_ich mfd_core ioatdma ipmi_devintf ipmi_si ipmi_msghandler sg ext4 jbd2 mbcache2 sr_mod cdrom sd_mod igb i2c_algo_bit i2c_core ahci libahci megaraid_sas ixgbe dca ptp pps_core vxlan udp_tunnel ip6_udp_tunnel qla2xxx scsi_transport_fc qlcnic crc32c_intel be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi ipv6 cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_mod [161846.962454] [161846.962460] CPU: 1 PID: 28016 Comm: kworker/u40:1 Not tainted 4.1.12-94.5.9.el6uek.x86_64 #2 [161846.962463] Hardware name: Oracle Corporation SUN SERVER X4-2L /ASSY,MB,X4-2L , BIOS 26050100 09/19/2017 [161846.962489] Workqueue: qlcnic_mailbox qlcnic_83xx_mailbox_worker [qlcnic] [161846.962493] task: ffff8801f2e34600 ti: ffff88004ca5c000 task.ti: ffff88004ca5c000 [161846.962496] RIP: e030:[<ffffffff810013aa>] [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20 [161846.962506] RSP: e02b:ffff880202e43388 EFLAGS: 00000206 [161846.962509] RAX: 0000000000000000 RBX: ffff8801f6996b70 RCX: ffffffff810013aa [161846.962511] RDX: ffff880202e433cc RSI: ffff880202e433b0 RDI: 0000000000000003 [161846.962513] RBP: ffff880202e433d0 R08: 0000000000000000 R09: ffff8801fe893200 [161846.962516] R10: ffff8801fe400538 R11: 0000000000000206 R12: ffff880202e4b000 [161846.962518] R13: 0000000000000050 R14: 0000000000000001 R15: 000000000000020d [161846.962528] FS: 0000000000000000(0000) GS:ffff880202e40000(0000) knlGS:ffff880202e40000 [161846.962531] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 [161846.962533] CR2: 0000000002612640 CR3: 00000001bb796000 CR4: 0000000000042660 [161846.962536] Stack: [161846.962538] ffff880202e43608 0000000000000000 ffffffff813f0442 ffff880202e433b0 [161846.962543] 0000000000000000 ffff880202e433cc ffffffff00000001 0000000000000000 [161846.962547] 00000009813f03d6 ffff880202e433e0 ffffffff813f0460 ffff880202e43440 [161846.962552] Call Trace: [161846.962555] <IRQ> [161846.962565] [<ffffffff813f0442>] ? xen_poll_irq_timeout+0x42/0x50 [161846.962570] [<ffffffff813f0460>] xen_poll_irq+0x10/0x20 [161846.962578] [<ffffffff81014222>] xen_lock_spinning+0xe2/0x110 [161846.962583] [<ffffffff81013f01>] __raw_callee_save_xen_lock_spinning+0x11/0x20 [161846.962592] [<ffffffff816e5c57>] ? _raw_spin_lock+0x57/0x80 [161846.962609] [<ffffffffa028acfc>] qlcnic_83xx_enqueue_mbx_cmd+0x7c/0xe0 [qlcnic] [161846.962623] [<ffffffffa028e008>] qlcnic_83xx_issue_cmd+0x58/0x210 [qlcnic] [161846.962636] [<ffffffffa028caf2>] qlcnic_83xx_sre_macaddr_change+0x162/0x1d0 [qlcnic] [161846.962649] [<ffffffffa028cb8b>] qlcnic_83xx_change_l2_filter+0x2b/0x30 [qlcnic] [161846.962657] [<ffffffff8160248b>] ? __skb_flow_dissect+0x18b/0x650 [161846.962670] [<ffffffffa02856e5>] qlcnic_send_filter+0x205/0x250 [qlcnic] [161846.962682] [<ffffffffa0285c77>] qlcnic_xmit_frame+0x547/0x7b0 [qlcnic] [161846.962691] [<ffffffff8160ac22>] xmit_one+0x82/0x1a0 [161846.962696] [<ffffffff8160ad90>] dev_hard_start_xmit+0x50/0xa0 [161846.962701] [<ffffffff81630112>] sch_direct_xmit+0x112/0x220 [161846.962706] [<ffffffff8160b80f>] __dev_queue_xmit+0x1df/0x5e0 [161846.962710] [<ffffffff8160bc33>] dev_queue_xmit_sk+0x13/0x20 [161846.962721] [<ffffffffa0575bd5>] bond_dev_queue_xmit+0x35/0x80 [bonding] [161846.962729] [<ffffffffa05769fb>] __bond_start_xmit+0x1cb/0x210 [bonding] [161846.962736] [<ffffffffa0576a71>] bond_start_xmit+0x31/0x60 [bonding] [161846.962740] [<ffffffff8160ac22>] xmit_one+0x82/0x1a0 [161846.962745] [<ffffffff8160ad90>] dev_hard_start_xmit+0x50/0xa0 [161846.962749] [<ffffffff8160bb1e>] __dev_queue_xmit+0x4ee/0x5e0 [161846.962754] [<ffffffff8160bc33>] dev_queue_xmit_sk+0x13/0x20 [161846.962760] [<ffffffffa05cfa72>] vlan_dev_hard_start_xmit+0xb2/0x150 [8021q] [161846.962764] [<ffffffff8160ac22>] xmit_one+0x82/0x1a0 [161846.962769] [<ffffffff8160ad90>] dev_hard_start_xmit+0x50/0xa0 [161846.962773] [<ffffffff8160bb1e>] __dev_queue_xmit+0x4ee/0x5e0 [161846.962777] [<ffffffff8160bc33>] dev_queue_xmit_sk+0x13/0x20 [161846.962789] [<ffffffffa05adf74>] br_dev_queue_push_xmit+0x54/0xa0 [bridge] [161846.962797] [<ffffffffa05ae4ff>] br_forward_finish+0x2f/0x90 [bridge] [161846.962807] [<ffffffff810b0dad>] ? ttwu_do_wakeup+0x1d/0x100 [161846.962811] [<ffffffff815f929b>] ? __alloc_skb+0x8b/0x1f0 [161846.962818] [<ffffffffa05ae04d>] __br_forward+0x8d/0x120 [bridge] [161846.962822] [<ffffffff815f613b>] ? __kmalloc_reserve+0x3b/0xa0 [161846.962829] [<ffffffff810be55e>] ? update_rq_runnable_avg+0xee/0x230 [161846.962836] [<ffffffffa05ae176>] br_forward+0x96/0xb0 [bridge] [161846.962845] [<ffffffffa05af85e>] br_handle_frame_finish+0x1ae/0x420 [bridge] [161846.962853] [<ffffffffa05afc4f>] br_handle_frame+0x17f/0x260 [bridge] [161846.962862] [<ffffffffa05afad0>] ? br_handle_frame_finish+0x420/0x420 [bridge] [161846.962867] [<ffffffff8160d057>] __netif_receive_skb_core+0x1f7/0x870 [161846.962872] [<ffffffff8160d6f2>] __netif_receive_skb+0x22/0x70 [161846.962877] [<ffffffff8160d913>] netif_receive_skb_internal+0x23/0x90 [161846.962884] [<ffffffffa07512ea>] ? xenvif_idx_release+0xea/0x100 [xen_netback] [161846.962889] [<ffffffff816e5a10>] ? _raw_spin_unlock_irqrestore+0x20/0x50 [161846.962893] [<ffffffff8160e624>] netif_receive_skb_sk+0x24/0x90 [161846.962899] [<ffffffffa075269a>] xenvif_tx_submit+0x2ca/0x3f0 [xen_netback] [161846.962906] [<ffffffffa0753f0c>] xenvif_tx_action+0x9c/0xd0 [xen_netback] [161846.962915] [<ffffffffa07567f5>] xenvif_poll+0x35/0x70 [xen_netback] [161846.962920] [<ffffffff8160e01b>] napi_poll+0xcb/0x1e0 [161846.962925] [<ffffffff8160e1c0>] net_rx_action+0x90/0x1c0 [161846.962931] [<ffffffff8108aaba>] __do_softirq+0x10a/0x350 [161846.962938] [<ffffffff8108ae75>] irq_exit+0x125/0x130 [161846.962943] [<ffffffff813f03a9>] xen_evtchn_do_upcall+0x39/0x50 [161846.962950] [<ffffffff816e7ffe>] xen_do_hypervisor_callback+0x1e/0x40 [161846.962952] <EOI> [161846.962959] [<ffffffff816e5c4a>] ? _raw_spin_lock+0x4a/0x80 [161846.962964] [<ffffffff816e5b1e>] ? _raw_spin_lock_irqsave+0x1e/0xa0 [161846.962978] [<ffffffffa028e279>] ? qlcnic_83xx_mailbox_worker+0xb9/0x2a0 [qlcnic] [161846.962991] [<ffffffff810a14e1>] ? process_one_work+0x151/0x4b0 [161846.962995] [<ffffffff8100c3f2>] ? check_events+0x12/0x20 [161846.963001] [<ffffffff810a1960>] ? worker_thread+0x120/0x480 [161846.963005] [<ffffffff816e187b>] ? __schedule+0x30b/0x890 [161846.963010] [<ffffffff810a1840>] ? process_one_work+0x4b0/0x4b0 [161846.963015] [<ffffffff810a1840>] ? process_one_work+0x4b0/0x4b0 [161846.963021] [<ffffffff810a6b3e>] ? kthread+0xce/0xf0 [161846.963025] [<ffffffff810a6a70>] ? kthread_freezable_should_stop+0x70/0x70 [161846.963031] [<ffffffff816e6522>] ? ret_from_fork+0x42/0x70 [161846.963035] [<ffffffff810a6a70>] ? kthread_freezable_should_stop+0x70/0x70 [161846.963037] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:56 +01:00
Eric Dumazet	ce43c07fce	net: igmp: add a missing rcu locking section [ Upstream commit `e7aadb27a5` ] Newly added igmpv3_get_srcaddr() needs to be called under rcu lock. Timer callbacks do not ensure this locking. ============================= WARNING: suspicious RCU usage 4.15.0+ #200 Not tainted ----------------------------- ./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by syzkaller616973/4074: #0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355 #1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline] #1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316 #2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline] #2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600 stack backtrace: CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592 __in_dev_get_rcu include/linux/inetdevice.h:216 [inline] igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline] igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389 add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432 add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565 igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605 igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722 igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831 call_timer_fn+0x228/0x820 kernel/time/timer.c:1326 expire_timers kernel/time/timer.c:1363 [inline] __run_timers+0x7ee/0xb70 kernel/time/timer.c:1666 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692 __do_softirq+0x2d7/0xb85 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1cc/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:541 [inline] smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938 Fixes: `a46182b002` ("net: igmp: Use correct source address on IGMPv3 reports") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:56 +01:00
Nikolay Aleksandrov	7d3d60ef22	ip6mr: fix stale iterator [ Upstream commit `4adfa79fc2` ] When we dump the ip6mr mfc entries via proc, we initialize an iterator with the table to dump but we don't clear the cache pointer which might be initialized from a prior read on the same descriptor that ended. This can result in lock imbalance (an unnecessary unlock) leading to other crashes and hangs. Clear the cache pointer like ipmr does to fix the issue. Thanks for the reliable reproducer. Here's syzbot's trace: WARNING: bad unlock balance detected! 4.15.0-rc3+ #128 Not tainted syzkaller971460/3195 is trying to release lock (mrt_lock) at: [<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553 but there are no more locks to release! other info that might help us debug this: 1 lock held by syzkaller971460/3195: #0: (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0 fs/seq_file.c:165 stack backtrace: CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561 __lock_release kernel/locking/lockdep.c:3775 [inline] lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023 __raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline] _raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255 ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553 traverse+0x3bc/0xa00 fs/seq_file.c:135 seq_read+0x96a/0x13d0 fs/seq_file.c:189 proc_reg_read+0xef/0x170 fs/proc/inode.c:217 do_loop_readv_writev fs/read_write.c:673 [inline] do_iter_read+0x3db/0x5b0 fs/read_write.c:897 compat_readv+0x1bf/0x270 fs/read_write.c:1140 do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189 C_SYSC_preadv fs/read_write.c:1209 [inline] compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203 do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline] do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389 entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125 RIP: 0023:0xf7f73c79 RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0 RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 BUG: sleeping function called from invalid context at lib/usercopy.c:25 in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460 INFO: lockdep is turned off. CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060 __might_sleep+0x95/0x190 kernel/sched/core.c:6013 __might_fault+0xab/0x1d0 mm/memory.c:4525 _copy_to_user+0x2c/0xc0 lib/usercopy.c:25 copy_to_user include/linux/uaccess.h:155 [inline] seq_read+0xcb4/0x13d0 fs/seq_file.c:279 proc_reg_read+0xef/0x170 fs/proc/inode.c:217 do_loop_readv_writev fs/read_write.c:673 [inline] do_iter_read+0x3db/0x5b0 fs/read_write.c:897 compat_readv+0x1bf/0x270 fs/read_write.c:1140 do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189 C_SYSC_preadv fs/read_write.c:1209 [inline] compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203 do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline] do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389 entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125 RIP: 0023:0xf7f73c79 RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0 RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0 lib/usercopy.c:26 Reported-by: syzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669@syzkaller.appspotmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:56 +01:00
Sebastian Andrzej Siewior	ffcf167d34	serial: core: mark port as initialized after successful IRQ change commit `44117a1d17` upstream. setserial changes the IRQ via uart_set_info(). It invokes uart_shutdown() which free the current used IRQ and clear TTY_PORT_INITIALIZED. It will then update the IRQ number and invoke uart_startup() before returning to the caller leaving TTY_PORT_INITIALIZED cleared. The next open will crash with \| list_add double add: new=ffffffff839fcc98, prev=ffffffff839fcc98, next=ffffffff839fcc98. since the close from the IOCTL won't free the IRQ (and clean the list) due to the TTY_PORT_INITIALIZED check in uart_shutdown(). There is same pattern in uart_do_autoconfig() and I think it also needs to set TTY_PORT_INITIALIZED there. Is there a reason why uart_startup() does not set the flag by itself after the IRQ has been acquired (since it is cleared in uart_shutdown)? Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:56 +01:00
Hugh Dickins	400d3c8b0c	kaiser: allocate pgd with order 0 when pti=off The 4.9.77 version of "x86/pti/efi: broken conversion from efi to kernel page table" looked nicer than the 4.4.112 version, but was suboptimal on machines booted with "pti=off" (or on AMD machines): it allocated pgd with an order 1 page whatever the setting of kaiser_enabled. Fix that by moving the definition of PGD_ALLOCATION_ORDER from asm/pgalloc.h to asm/pgtable.h, which already defines kaiser_enabled. Fixes: `1b92c48a2e` ("x86/pti/efi: broken conversion from efi to kernel page table") Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:55 +01:00
Dave Hansen	ae1fc8de51	x86/pti: Make unpoison of pgd for trusted boot work for real commit `445b69e3b7` upstream The inital fix for trusted boot and PTI potentially misses the pgd clearing if pud_alloc() sets a PGD. It probably works in practice because for two adjacent calls to map_tboot_page() that share a PGD entry, the first will clear NX, then allocate and set the PGD (without NX clear). The second call will not allocate but will clear the NX bit. Defer the NX clearing to a point after it is known that all top-level allocations have occurred. Add a comment to clarify why. [ tglx: Massaged changelog ] [ hughd notes: I have not tested tboot, but this looks to me as necessary and as safe in old-Kaiser backports as it is upstream; I'm not submitting the commit-to-be-fixed `262b6b3008`, since it was undone by `445b69e3b7`, and makes conflict trouble because of 5-level's p4d versus 4-level's pgd.] Fixes: `262b6b3008` ("x86/tboot: Unbreak tboot with PTI enabled") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Jon Masters <jcm@redhat.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: peterz@infradead.org Cc: ning.sun@intel.com Cc: tboot-devel@lists.sourceforge.net Cc: andi@firstfloor.org Cc: luto@kernel.org Cc: law@redhat.com Cc: pbonzini@redhat.com Cc: torvalds@linux-foundation.org Cc: gregkh@linux-foundation.org Cc: dwmw@amazon.co.uk Cc: nickc@redhat.com Link: https://lkml.kernel.org/r/20180110224939.2695CD47@viggo.jf.intel.com Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:55 +01:00
Hugh Dickins	0a61cd6cae	kaiser: fix intel_bts perf crashes Vince reported perf_fuzzer quickly locks up on 4.15-rc7 with PTI; Robert reported Bad RIP with KPTI and Intel BTS also on 4.15-rc7: honggfuzz -f /tmp/somedirectorywithatleastonefile \ --linux_perf_bts_edge -s -- /bin/true (honggfuzz from https://github.com/google/honggfuzz) crashed with BUG: unable to handle kernel paging request at ffff9d3215100000 (then narrowed it down to perf record --per-thread -e intel_bts//u -- /bin/ls). The intel_bts driver does not use the 'normal' BTS buffer which is exposed through kaiser_add_mapping(), but instead uses the memory allocated for the perf AUX buffer. This obviously comes apart when using PTI, because then the kernel mapping, which includes that AUX buffer memory, disappears while switched to user page tables. Easily fixed in old-Kaiser backports, by applying kaiser_add_mapping() to those pages; perhaps not so easy for upstream, where 4.15-rc8 commit `99a9dc98ba` ("x86,perf: Disable intel_bts when PTI") disables for now. Slightly reorganized surrounding code in bts_buffer_setup_aux(), so it can better match bts_buffer_free_aux(): free_aux with an #ifdef to avoid the loop when PTI is off, but setup_aux needs to loop anyway (and kaiser_add_mapping() is cheap when PTI config is off or "pti=off"). Reported-by: Vince Weaver <vincent.weaver@maine.edu> Reported-by: Robert Święcki <robert@swiecki.net> Analyzed-by: Peter Zijlstra <peterz@infradead.org> Analyzed-by: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Vince Weaver <vince@deater.net> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:55 +01:00
Jesse Chan	374c84de94	ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE commit `0cab20cec0` upstream. This change resolves a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in sound/soc/codecs/snd-soc-pcm512x-spi.o see include/linux/module.h for more information This adds the license as "GPL v2", which matches the header of the file. MODULE_DESCRIPTION and MODULE_AUTHOR are also added. Signed-off-by: Jesse Chan <jc@linux.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:55 +01:00
Jesse Chan	0ee4f5e7bb	pinctrl: pxa: pxa2xx: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE commit `0b9335cbd3` upstream. This change resolves a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in drivers/pinctrl/pxa/pinctrl-pxa2xx.o see include/linux/module.h for more information This adds the license as "GPL v2", which matches the header of the file. MODULE_DESCRIPTION and MODULE_AUTHOR are also added. Signed-off-by: Jesse Chan <jc@linux.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:55 +01:00
Jesse Chan	781a2d6831	auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE commit `09c479f7f1` upstream. This change resolves a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in drivers/auxdisplay/img-ascii-lcd.o see include/linux/module.h for more information This adds the license as "GPL", which matches the header of the file. MODULE_DESCRIPTION and MODULE_AUTHOR are also added. Signed-off-by: Jesse Chan <jc@linux.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:55 +01:00
Michael Ellerman	9fed3978c3	powerpc/64s: Allow control of RFI flush via debugfs commit `236003e6b5` upstream. Expose the state of the RFI flush (enabled/disabled) via debugfs, and allow it to be enabled/disabled at runtime. eg: $ cat /sys/kernel/debug/powerpc/rfi_flush 1 $ echo 0 > /sys/kernel/debug/powerpc/rfi_flush $ cat /sys/kernel/debug/powerpc/rfi_flush 0 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:55 +01:00
Michael Ellerman	1f0c936f43	powerpc/64s: Wire up cpu_show_meltdown() commit `fd6e440f20` upstream. The recent commit `87590ce6e3` ("sysfs/cpu: Add vulnerability folder") added a generic folder and set of files for reporting information on CPU vulnerabilities. One of those was for meltdown: /sys/devices/system/cpu/vulnerabilities/meltdown This commit wires up that file for 64-bit Book3S powerpc. For now we default to "Vulnerable" unless the RFI flush is enabled. That may not actually be true on all hardware, further patches will refine the reporting based on the CPU/platform etc. But for now we default to being pessimists. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:55 +01:00
Oliver O'Halloran	6aec12e186	powerpc/powernv: Check device-tree for RFI flush settings commit `6e032b350c` upstream. New device-tree properties are available which tell the hypervisor settings related to the RFI flush. Use them to determine the appropriate flush instruction to use, and whether the flush is required. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:55 +01:00
Michael Neuling	7db0fff62f	powerpc/pseries: Query hypervisor for RFI flush settings commit `8989d56878` upstream. A new hypervisor call is available which tells the guest settings related to the RFI flush. Use it to query the appropriate flush instruction(s), and whether the flush is required. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:55 +01:00
Michael Ellerman	0ef9f8289e	powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti commit `bc9c9304a4` upstream. Because there may be some performance overhead of the RFI flush, add kernel command line options to disable it. We add a sensibly named 'no_rfi_flush' option, but we also hijack the x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we see 'nopti' we can guess that the user is trying to avoid any overhead of Meltdown mitigations, and it means we don't have to educate every one about a different command line option. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:54 +01:00
Michael Ellerman	c3b82ebee6	powerpc/64s: Add support for RFI flush of L1-D cache commit `aa8a5e0062` upstream. On some CPUs we can prevent the Meltdown vulnerability by flushing the L1-D cache on exit from kernel to user mode, and from hypervisor to guest. This is known to be the case on at least Power7, Power8 and Power9. At this time we do not know the status of the vulnerability on other CPUs such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale CPUs. As more information comes to light we can enable this, or other mechanisms on those CPUs. The vulnerability occurs when the load of an architecturally inaccessible memory region (eg. userspace load of kernel memory) is speculatively executed to the point where its result can influence the address of a subsequent speculatively executed load. In order for that to happen, the first load must hit in the L1, because before the load is sent to the L2 the permission check is performed. Therefore if no kernel addresses hit in the L1 the vulnerability can not occur. We can ensure that is the case by flushing the L1 whenever we return to userspace. Similarly for hypervisor vs guest. In order to flush the L1-D cache on exit, we add a section of nops at each (h)rfi location that returns to a lower privileged context, and patch that with some sequence. Newer firmwares are able to advertise to us that there is a special nop instruction that flushes the L1-D. If we do not see that advertised, we fall back to doing a displacement flush in software. For guest kernels we support migration between some CPU versions, and different CPUs may use different flush instructions. So that we are prepared to migrate to a machine with a different flush instruction activated, we may have to patch more than one flush instruction at boot if the hypervisor tells us to. In the end this patch is mostly the work of Nicholas Piggin and Michael Ellerman. However a cast of thousands contributed to analysis of the issue, earlier versions of the patch, back ports testing etc. Many thanks to all of them. Tested-by: Jon Masters <jcm@redhat.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [Balbir - back ported to stable with changes] Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:54 +01:00
Nicholas Piggin	48cc95d4e4	powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL commit `c7305645eb` upstream. In the SLB miss handler we may be returning to user or kernel. We need to add a check early on and save the result in the cr4 register, and then we bifurcate the return path based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Backport to 4.4 based on patch from Balbir] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:54 +01:00
Nicholas Piggin	00e40620a5	powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL commit `b8e90cb7bc` upstream. In the syscall exit path we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:54 +01:00
Nicholas Piggin	9d914324d9	powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL commit `a08f828cf4` upstream. Similar to the syscall return path, in fast_exception_return we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:54 +01:00
Nicholas Piggin	8fd3f98d0f	powerpc/64: Add macros for annotating the destination of rfid/hrfid commit `50e51c13b3` upstream. The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is used for switching from the kernel to userspace, and from the hypervisor to the guest kernel. However it can and is also used for other transitions, eg. from real mode kernel code to virtual mode kernel code, and it's not always clear from the code what the destination context is. To make it clearer when reading the code, add macros which encode the expected destination context. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:54 +01:00
Michael Neuling	be6641a7e6	powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper commit `191eccb158` upstream. A new hypervisor call has been defined to communicate various characteristics of the CPU to guests. Add definitions for the hcall number, flags and a wrapper function. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [Balbir fixed conflicts in backport] Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-13 12:35:54 +01:00
Greg Kroah-Hartman	331b057d4f	Linux 4.9.80	2018-02-03 17:05:43 +01:00
Stefan Agner	1333c3e996	spi: imx: do not access registers while clocks disabled commit `d593574aff` upstream. Since clocks are disabled except during message transfer clocks are also disabled when spi_imx_remove gets called. Accessing registers leads to a freeeze at least on a i.MX 6ULL. Enable clocks before disabling accessing the MXC_CSPICTRL register. Fixes: `9e556dcc55` ("spi: spi-imx: only enable the clocks when we start to transfer a message") Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:43 +01:00
Fabio Estevam	5846849a1a	serial: imx: Only wakeup via RTSDEN bit if the system has RTS/CTS commit `38b1f0fb42` upstream. The wakeup mechanism via RTSDEN bit relies on the system using the RTS/CTS lines, so only allow such wakeup method when the system actually has RTS/CTS support. Fixes: `bc85734b12` ("serial: imx: allow waking up on RTSD") Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Reviewed-by: Martin Kaiser <martin@kaiser.cx> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:43 +01:00
Shuah Khan	9df847674e	usbip: vhci_hcd: clear just the USB_PORT_STAT_POWER bit Upstream commit `1c9de5bf42` ("usbip: vhci-hcd: Add USB3 SuperSpeed support") vhci_hcd clears all the bits port_status bits instead of clearing just the USB_PORT_STAT_POWER bit when it handles ClearPortFeature: USB_PORT_FEAT_POWER. This causes vhci_hcd attach to fail in a bad state, leaving device unusable by the client. The device is still attached and however client can't use it. The problem was fixed as part of larger change to add USB3 Super Speed support. This patch backports just the change to clear the USB_PORT_STAT_POWER. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:43 +01:00
Benjamin Herrenschmidt	57d4bb1bee	usb/gadget: Fix "high bandwidth" check in usb_gadget_ep_match_desc() commit `11fb379987` upstream. The current code tries to test for bits that are masked out by usb_endpoint_maxp(). Instead, use the proper accessor to access the new high bandwidth bits. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:42 +01:00
Oliver Neukum	92e64a1079	usb: uas: unconditionally bring back host after reset commit `cbeef22fd6` upstream. Quoting Hans: If we return 1 from our post_reset handler, then our disconnect handler will be called immediately afterwards. Since pre_reset blocks all scsi requests our disconnect handler will then hang in the scsi_remove_host call. This is esp. bad because our disconnect handler hanging for ever also stops the USB subsys from enumerating any new USB devices, causes commands like lsusb to hang, etc. In practice this happens when unplugging some uas devices because the hub code may see the device as needing a warm-reset and calls usb_reset_device before seeing the disconnect. In this case uas_configure_endpoints fails with -ENODEV. We do not want to print an error for this, so this commit also silences the shost_printk for -ENODEV. ENDQUOTE However, if we do that we better drop any unconditional execution and report to the SCSI subsystem that we have undergone a reset but we are not operational now. Signed-off-by: Oliver Neukum <oneukum@suse.com> Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:42 +01:00
Hemant Kumar	f24d171a81	usb: f_fs: Prevent gadget unbind if it is already unbound commit `ce5bf9a50d` upstream. Upon usb composition switch there is possibility of ep0 file release happening after gadget driver bind. In case of composition switch from adb to a non-adb composition gadget will never gets bound again resulting into failure of usb device enumeration. Fix this issue by checking FFS_FL_BOUND flag and avoid extra gadget driver unbind if it is already done as part of composition switch. This fixes adb reconnection error reported on Android running v4.4 and above kernel versions. Verified on Hikey running vanilla v4.15-rc7 + few out of tree Mali patches. Reviewed-at: https://android-review.googlesource.com/#/c/582632/ Cc: Felipe Balbi <balbi@kernel.org> Cc: Greg KH <gregkh@linux-foundation.org> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dmitry Shmidt <dimitrysh@google.com> Cc: Badhri <badhri@google.com> Cc: Android Kernel Team <kernel-team@android.com> Signed-off-by: Hemant Kumar <hemantk@codeaurora.org> [AmitP: Cherry-picked it from android-4.14 and updated the commit log] Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:42 +01:00
Johan Hovold	800de0fab1	USB: serial: simple: add Motorola Tetra driver commit `46fe895e22` upstream. Add new Motorola Tetra (simple) driver for Motorola Solutions TETRA PEI devices. D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0cad ProdID=9011 Rev=24.16 S: Manufacturer=Motorola Solutions Inc. S: Product=Motorola Solutions TETRA PEI interface C: #Ifs= 2 Cfg#= 1 Atr=80 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) Note that these devices do not support the CDC SET_CONTROL_LINE_STATE request (for any interface). Reported-by: Max Schulze <max.schulze@posteo.de> Tested-by: Max Schulze <max.schulze@posteo.de> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:42 +01:00
Shuah Khan	f80079536b	usbip: list: don't list devices attached to vhci_hcd commit `ef824501f5` upstream. usbip host lists devices attached to vhci_hcd on the same server when user does attach over localhost or specifies the server as the remote. usbip attach -r localhost -b busid or usbip attach -r servername (or server IP) Fix it to check and not list devices that are attached to vhci_hcd. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:42 +01:00
Shuah Khan	4c6fcc3425	usbip: prevent bind loops on devices attached to vhci_hcd commit `ef54cf0c60` upstream. usbip host binds to devices attached to vhci_hcd on the same server when user does attach over localhost or specifies the server as the remote. usbip attach -r localhost -b busid or usbip attach -r servername (or server IP) Unbind followed by bind works, however device is left in a bad state with accesses via the attached busid result in errors and system hangs during shutdown. Fix it to check and bail out if the device is already attached to vhci_hcd. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:42 +01:00
Jia-Ju Bai	ec719c52af	USB: serial: io_edgeport: fix possible sleep-in-atomic commit `c7b8f77872` upstream. According to drivers/usb/serial/io_edgeport.c, the driver may sleep under a spinlock. The function call path is: edge_bulk_in_callback (acquire the spinlock) process_rcvd_data process_rcvd_status change_port_settings send_iosp_ext_cmd write_cmd_usb usb_kill_urb --> may sleep To fix it, the redundant usb_kill_urb() is removed from the error path after usb_submit_urb() fails. This possible bug is found by my static analysis tool (DSAC) and checked by my code review. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:42 +01:00
Oliver Neukum	aa6a93fd0c	CDC-ACM: apply quirk for card reader commit `df1cc78a52` upstream. This devices drops random bytes from messages if you talk to it too fast. Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:42 +01:00
Hans de Goede	c3b1f31377	USB: cdc-acm: Do not log urb submission errors on disconnect commit `f0386c083c` upstream. When disconnected sometimes the cdc-acm driver logs errors like these: [20278.039417] cdc_acm 2-2:2.1: urb 9 failed submission with -19 [20278.042924] cdc_acm 2-2:2.1: urb 10 failed submission with -19 [20278.046449] cdc_acm 2-2:2.1: urb 11 failed submission with -19 [20278.049920] cdc_acm 2-2:2.1: urb 12 failed submission with -19 [20278.053442] cdc_acm 2-2:2.1: urb 13 failed submission with -19 [20278.056915] cdc_acm 2-2:2.1: urb 14 failed submission with -19 [20278.060418] cdc_acm 2-2:2.1: urb 15 failed submission with -19 Silence these by not logging errors when the result is -ENODEV. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:42 +01:00
Greg Kroah-Hartman	068cc4ad2b	USB: serial: pl2303: new device id for Chilitag commit `d08dd3f3dd` upstream. This adds a new device id for Chilitag devices to the pl2303 driver. Reported-by: "Chu.Mike [朱堅宜]" <Mike-Chu@prolific.com.tw> Acked-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:42 +01:00
OKAMOTO Yoshiaki	2ef0d2ad5c	usb: option: Add support for FS040U modem commit `69341bd150` upstream. FS040U modem is manufactured by omega, and sold by Fujisoft. This patch adds ID of the modem to use option1 driver. Interface 3 is used as qmi_wwan, so the interface is ignored. Signed-off-by: Yoshiaki Okamoto <yokamoto@allied-telesis.co.jp> Signed-off-by: Hiroyuki Yamamoto <hyamamo@allied-telesis.co.jp> Acked-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:41 +01:00
Gaurav Kohli	55eaecffe3	tty: fix data race between tty_init_dev and flush of buf commit `b027e2298b` upstream. There can be a race, if receive_buf call comes before tty initialization completes in n_tty_open and tty->disc_data may be NULL. CPU0 CPU1 ---- ---- 000\|n_tty_receive_buf_common() n_tty_open() -001\|n_tty_receive_buf2() tty_ldisc_open.isra.3() -002\|tty_ldisc_receive_buf(inline) tty_ldisc_setup() Using ldisc semaphore lock in tty_init_dev till disc_data initializes completely. Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org> Reviewed-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:41 +01:00
Dmitry Eremin	383e0620b7	staging: lustre: separate a connection destroy from free struct kib_conn commit `9b046013e5` upstream. The logic of the original commit `4d99b2581e` ("staging: lustre: avoid intensive reconnecting for ko2iblnd") was assumed conditional free of struct kib_conn if the second argument free_conn in function kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn) is true. But this hunk of code was dropped from original commit. As result the logic works wrong and current code use struct kib_conn after free. > drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c > 3317 kiblnd_destroy_conn(conn, !peer); > ^^^^ Freed always (but should be conditionally) > 3318 > 3319 spin_lock_irqsave(lock, flags); > 3320 if (!peer) > 3321 continue; > 3322 > 3323 conn->ibc_peer = peer; > ^^^^^^^^^^^^^^ Use after free > 3324 if (peer->ibp_reconnected < KIB_RECONN_HIGH_RACE) > 3325 list_add_tail(&conn->ibc_list, > ^^^^^^^^^^^^^^ Use after free > 3326 &kiblnd_data.kib_reconn_list); > 3327 else > 3328 list_add_tail(&conn->ibc_list, > ^^^^^^^^^^^^^^ Use after free > 3329 &kiblnd_data.kib_reconn_wait); To avoid confusion this fix moved the freeing a struct kib_conn outside of the function kiblnd_destroy_conn() and free as it was intended in original commit. Fixes: `4d99b2581e` ("staging: lustre: avoid intensive reconnecting for ko2iblnd") Signed-off-by: Dmitry Eremin <Dmitry.Eremin@intel.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:41 +01:00
Stefan Schake	f94b238fb8	drm/vc4: Move IRQ enable to PM path [ Upstream commit `ce9caf2f79` ] We were calling enable_irq on bind, where it was already enabled previously by the IRQ helper. Additionally, dev->irq is not set correctly until after postinstall and so was always zero here, triggering a warning in 4.15. Fix both by moving the enable to the power management resume path, where we know there was a previous disable invocation during suspend. Fixes: `253696ccd6` ("drm/vc4: Account for interrupts in flight") Signed-off-by: Stefan Schake <stschake@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/1514563543-32511-1-git-send-email-stschake@gmail.com Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:41 +01:00
Larry Finger	ace1911b76	staging: rtl8188eu: Fix incorrect response to SIOCGIWESSID [ Upstream commit `b77992d2df` ] When not associated with an AP, wifi device drivers should respond to the SIOCGIWESSID ioctl with a zero-length string for the SSID, which is the behavior expected by dhcpcd. Currently, this driver returns an error code (-1) from the ioctl call, which causes dhcpcd to assume that the device is not a wireless interface and therefore it fails to work correctly with it thereafter. This problem was reported and tested at https://github.com/lwfinger/rtl8188eu/issues/234. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:41 +01:00
Colin Ian King	0e216b0a0f	usb: gadget: don't dereference g until after it has been null checked [ Upstream commit `b2fc059fa5` ] Avoid dereferencing pointer g until after g has been sanity null checked; move the assignment of cdev much later when it is required into a more local scope. Detected by CoverityScan, CID#1222135 ("Dereference before null check") Fixes: `b785ea7ce6` ("usb: gadget: composite: fix ep->maxburst initialization") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:41 +01:00
Icenowy Zheng	b4bfc8ef59	media: usbtv: add a new usbid [ Upstream commit `04226916d2` ] A new usbid of UTV007 is found in a newly bought device. The usbid is 1f71:3301. The ID on the chip is: UTV007 A89029.1 1520L18K1 Both video and audio is tested with the modified usbtv driver. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Acked-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:41 +01:00
Florian Fainelli	c16c193e3a	ARM: dts: NSP: Fix PPI interrupt types [ Upstream commit `5f1aa51c7a` ] Booting a kernel results in the kernel warning us about the following PPI interrupts configuration: [ 0.105127] smp: Bringing up secondary CPUs ... [ 0.110545] GIC: PPI11 is secure or misconfigured [ 0.110551] GIC: PPI13 is secure or misconfigured Fix this by using the appropriate edge configuration for PPI11 and PPI13, this is similar to what was fixed for Northstar (BCM5301X) in commit `0e34079cd1` ("ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags"). Fixes: `7b2e987de2` ("ARM: NSP: add minimal Northstar Plus device tree") Fixes: `1a9d53caba` ("ARM: dts: NSP: Add TWD Support to DT") Acked-by: Jon Mason <jon.mason@broadcom.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:41 +01:00
Emmanuel Grumbach	9adb2a0f9a	iwlwifi: mvm: fix the TX queue hang timeout for MONITOR vif type [ Upstream commit `d1b275ffec` ] The MONITOR type is missing in the interface type switch. Add it. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:41 +01:00
Gustavo A. R. Silva	a248dc6a55	scsi: ufs: ufshcd: fix potential NULL pointer dereference in ufshcd_config_vreg [ Upstream commit `727535903b` ] _vreg_ is being dereferenced before it is null checked, hence there is a potential null pointer dereference. Fix this by moving the pointer dereference after _vreg_ has been null checked. This issue was detected with the help of Coccinelle. Fixes: `aa49761309` ("ufs: Add regulator enable support") Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:40 +01:00
Guilherme G. Piccoli	fa64914313	scsi: aacraid: Prevent crash in case of free interrupt during scsi EH path [ Upstream commit `e4717292dd` ] As part of the scsi EH path, aacraid performs a reinitialization of the adapter, which encompass freeing resources and IRQs, NULLifying lots of pointers, and then initialize it all over again. We've identified a problem during the free IRQ portion of this path if CONFIG_DEBUG_SHIRQ is enabled on kernel config file. Happens that, in case this flag was set, right after free_irq() effectively clears the interrupt, it checks if it was requested as IRQF_SHARED. In positive case, it performs another call to the IRQ handler on driver. Problem is: since aacraid currently free some resources before freeing the IRQ, once free_irq() path calls the handler again (due to CONFIG_DEBUG_SHIRQ), aacraid crashes due to NULL pointer dereference with the following trace: aac_src_intr_message+0xf8/0x740 [aacraid] __free_irq+0x33c/0x4a0 free_irq+0x78/0xb0 aac_free_irq+0x13c/0x150 [aacraid] aac_reset_adapter+0x2e8/0x970 [aacraid] aac_eh_reset+0x3a8/0x5d0 [aacraid] scsi_try_host_reset+0x74/0x180 scsi_eh_ready_devs+0xc70/0x1510 scsi_error_handler+0x624/0xa20 This patch prevents the crash by changing the order of the deinitialization in this path of aacraid: first we clear the IRQ, then we free other resources. No functional change intended. Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:40 +01:00
Carlos Maiolino	fde77c712a	xfs: Properly retry failed dquot items in case of error during buffer writeback [ Upstream commit `373b0589dc` ] Once the inode item writeback errors is already fixed, it's time to fix the same problem in dquot code. Although there were no reports of users hitting this bug in dquot code (at least none I've seen), the bug is there and I was already planning to fix it when the correct approach to fix the inodes part was decided. This patch aims to fix the same problem in dquot code, regarding failed buffers being unable to be resubmitted once they are flush locked. Tested with the recently test-case sent to fstests list by Hou Tao. Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:40 +01:00
Darrick J. Wong	d96024440e	xfs: ubsan fixes [ Upstream commit `22a6c83777` ] Fix some complaints from the UBSAN about signed integer addition overflows. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:40 +01:00
Christophe JAILLET	9301165c46	drm/omap: Fix error handling path in 'omap_dmm_probe()' [ Upstream commit `8677b1ac2d` ] If we don't find a matching device node, we must free the memory allocated in 'omap_dmm' a few lines above. Fixes: `7cb0d6c17b` ("drm/omap: fix TILER on OMAP5") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:40 +01:00
Andrey Gusakov	f7170eb80a	drm/bridge: tc358767: fix 1-lane behavior [ Upstream commit `4dbd6c03fb` ] Use drm_dp_channel_eq_ok helper Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Andrey Gusakov <andrey.gusakov@cogentembedded.com> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510073785-16108-7-git-send-email-andrey.gusakov@cogentembedded.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:40 +01:00
Andrey Gusakov	8ae615fece	drm/bridge: tc358767: fix AUXDATAn registers access [ Upstream commit `9217c1abbc` ] First four bytes should go to DP0_AUXWDATA0. Due to bug if len > 4 first four bytes was writen to DP0_AUXWDATA1 and all data get shifted by 4 bytes. Fix it. Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Andrey Gusakov <andrey.gusakov@cogentembedded.com> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510073785-16108-6-git-send-email-andrey.gusakov@cogentembedded.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:40 +01:00
Andrey Gusakov	1bdfc52c33	drm/bridge: tc358767: fix timing calculations [ Upstream commit `66d1c3b94d` ] Fields in HTIM01 and HTIM02 regs should be even. Recomended thresh_dly value is max_tu_symbol. Remove set of VPCTRL0.VSDELAY as it is related to DSI input interface. Currently driver supports only DPI. Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Andrey Gusakov <andrey.gusakov@cogentembedded.com> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510073785-16108-5-git-send-email-andrey.gusakov@cogentembedded.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:40 +01:00
Andrey Gusakov	c55908604e	drm/bridge: tc358767: fix DP0_MISC register set [ Upstream commit `f3b8adbe19` ] Remove shift from TU_SIZE_RECOMMENDED define as it used to calculate max_tu_symbols. Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Andrey Gusakov <andrey.gusakov@cogentembedded.com> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510073785-16108-4-git-send-email-andrey.gusakov@cogentembedded.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:40 +01:00
Andrey Gusakov	8d4bfe89aa	drm/bridge: tc358767: filter out too high modes [ Upstream commit `99fc8e963a` ] Pixel clock limitation for DPI is 154 MHz. Do not accept modes with higher pixel clock rate. Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Andrey Gusakov <andrey.gusakov@cogentembedded.com> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510073785-16108-3-git-send-email-andrey.gusakov@cogentembedded.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:39 +01:00
Andrey Gusakov	5f6a0441ca	drm/bridge: tc358767: do no fail on hi-res displays [ Upstream commit `cffd2b16c0` ] Do not fail data rates higher than 2.7 and more than 2 lanes. Try to fall back to 2.7Gbps and 2 lanes. Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Andrey Gusakov <andrey.gusakov@cogentembedded.com> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/1510073785-16108-2-git-send-email-andrey.gusakov@cogentembedded.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:39 +01:00
Yisheng Xie	7b8623841f	kmemleak: add scheduling point to kmemleak_scan() [ Upstream commit `bde5f6bc68` ] kmemleak_scan() will scan struct page for each node and it can be really large and resulting in a soft lockup. We have seen a soft lockup when do scan while compile kernel: watchdog: BUG: soft lockup - CPU#53 stuck for 22s! [bash:10287] [...] Call Trace: kmemleak_scan+0x21a/0x4c0 kmemleak_write+0x312/0x350 full_proxy_write+0x5a/0xa0 __vfs_write+0x33/0x150 vfs_write+0xad/0x1a0 SyS_write+0x52/0xc0 do_syscall_64+0x61/0x1a0 entry_SYSCALL64_slow_path+0x25/0x25 Fix this by adding cond_resched every MAX_SCAN_SIZE. Link: http://lkml.kernel.org/r/1511439788-20099-1-git-send-email-xieyisheng1@huawei.com Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:39 +01:00
Trond Myklebust	d2a67f7afc	SUNRPC: Allow connect to return EHOSTUNREACH [ Upstream commit `4ba161a793` ] Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:39 +01:00
Tetsuo Handa	c4ecc2f696	quota: Check for register_shrinker() failure. [ Upstream commit `88bc0ede8d` ] register_shrinker() might return -ENOMEM error since Linux 3.12. Call panic() as with other failure checks in this function if register_shrinker() failed. Fixes: `1d3d4437ea` ("vmscan: per-node deferred work") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jan Kara <jack@suse.com> Cc: Michal Hocko <mhocko@suse.com> Reviewed-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:39 +01:00
Geert Uytterhoeven	d47907bcac	net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit [ Upstream commit `15bfe05c8d` ] On 64-bit (e.g. powerpc64/allmodconfig): drivers/net/ethernet/xilinx/ll_temac_main.c: In function 'temac_start_xmit_done': drivers/net/ethernet/xilinx/ll_temac_main.c:633:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] dev_kfree_skb_irq((struct sk_buff *)cur_p->app4); ^ cdmac_bd.app4 is u32, so it is too small to hold a kernel pointer. Note that several other fields in struct cdmac_bd are also too small to hold physical addresses on 64-bit platforms. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:39 +01:00
Christian König	e11616d5e6	drm/amdgpu: don't try to move pinned BOs [ Upstream commit `6edc6910ba` ] Never try to move pinned BOs during CS. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:39 +01:00
Michal Hocko	54a1fdff1b	xfs: fortify xfs_alloc_buftarg error handling [ Upstream commit `d210a9874b` ] percpu_counter_init failure path doesn't clean up &btp->bt_lru list. Call list_lru_destroy in that error path. Similarly register_shrinker error path is not handled. While it is unlikely to trigger these error path, it is not impossible especially the later might fail with large NUMAs. Let's handle the failure to make the code more robust. Noticed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:39 +01:00
Christophe JAILLET	98ae1ca753	bnxt_en: Fix an error handling path in 'bnxt_get_module_eeprom()' [ Upstream commit `dea521a2b9` ] Error code returned by 'bnxt_read_sfp_module_eeprom_info()' is handled a few lines above when reading the A0 portion of the EEPROM. The same should be done when reading the A2 portion of the EEPROM. In order to correctly propagate an error, update 'rc' in this 2nd call as well, otherwise 0 (success) is returned. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:39 +01:00
Robert Lippert	d5a746cf47	hwmon: (pmbus) Use 64bit math for DIRECT format values [ Upstream commit `bd467e4eab` ] Power values in the 100s of watt range can easily blow past 32bit math limits when processing everything in microwatts. Use 64bit math instead to avoid these issues on common 32bit ARM BMC platforms. Fixes: `442aba7872` ("hwmon: PMBus device driver") Signed-off-by: Robert Lippert <rlippert@google.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:38 +01:00
Vasily Averin	3b7742374f	lockd: fix "list_add double add" caused by legacy signal interface [ Upstream commit `81833de1a4` ] restart_grace() uses hardcoded init_net. It can cause to "list_add double add" in following scenario: 1) nfsd and lockd was started in several net namespaces 2) nfsd in init_net was stopped (lockd was not stopped because it have users from another net namespaces) 3) lockd got signal, called restart_grace() -> set_grace_period() and enabled lock_manager in hardcoded init_net. 4) nfsd in init_net is started again, its lockd_up() calls set_grace_period() and tries to add lock_manager into init_net 2nd time. Jeff Layton suggest: "Make it safe to call locks_start_grace multiple times on the same lock_manager. If it's already on the global grace_list, then don't try to add it again. (But we don't intentionally add twice, so for now we WARN about that case.) With this change, we also need to ensure that the nfsd4 lock manager initializes the list before we call locks_start_grace. While we're at it, move the rest of the nfsd_net initialization into nfs4_state_create_net. I see no reason to have it spread over two functions like it is today." Suggested patch was updated to generate warning in described situation. Suggested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:38 +01:00
Andrew Elble	f25e222ccc	nfsd: check for use of the closed special stateid [ Upstream commit `ae254dac72` ] Prevent the use of the closed (invalid) special stateid by clients. Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:38 +01:00
Vasily Averin	f35ab8e2ee	grace: replace BUG_ON by WARN_ONCE in exit_net hook [ Upstream commit `b872285751` ] Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:38 +01:00
Trond Myklebust	2a7d4a723d	nfsd: Ensure we check stateid validity in the seqid operation checks [ Upstream commit `9271d7e509` ] After taking the stateid st_mutex, we want to know that the stateid still represents valid state before performing any non-idempotent actions. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:38 +01:00
Trond Myklebust	5cd3586ca8	nfsd: CLOSE SHOULD return the invalid special stateid for NFSv4.x (x>0) [ Upstream commit `fb500a7cfe` ] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:38 +01:00
Thomas Meyer	c57767b609	auxdisplay: img-ascii-lcd: Only build on archs that have IOMEM [ Upstream commit `141cbfba1d` ] This avoids the MODPOST error: ERROR: "devm_ioremap_resource" [drivers/auxdisplay/img-ascii-lcd.ko] undefined! Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:38 +01:00
Eduardo Otubo	c6a34556f5	xen-netfront: remove warning when unloading module [ Upstream commit `5b5971df3b` ] v2: * Replace busy wait with wait_event()/wake_up_all() * Cannot garantee that at the time xennet_remove is called, the xen_netback state will not be XenbusStateClosed, so added a condition for that * There's a small chance for the xen_netback state is XenbusStateUnknown by the time the xen_netfront switches to Closed, so added a condition for that. When unloading module xen_netfront from guest, dmesg would output warning messages like below: [ 105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use! [ 105.236839] deferring g.e. 0x903 (pfn 0x35805) This problem relies on netfront and netback being out of sync. By the time netfront revokes the g.e.'s netback didn't have enough time to free all of them, hence displaying the warnings on dmesg. The trick here is to make netfront to wait until netback frees all the g.e.'s and only then continue to cleanup for the module removal, and this is done by manipulating both device states. Signed-off-by: Eduardo Otubo <otubo@redhat.com> Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:38 +01:00
Darrick J. Wong	b5bfda0f8e	xfs: always free inline data before resetting inode fork during ifree [ Upstream commit `98c4f78dcd` ] In xfs_ifree, we reset the data/attr forks to extents format without bothering to free any inline data buffer that might still be around after all the blocks have been truncated off the file. Prior to commit `43518812d2` ("xfs: remove support for inlining data/extents into the inode fork") nobody noticed because the leftover inline data after truncation was small enough to fit inside the inline buffer inside the fork itself. However, now that we've removed the inline buffer, we /always/ have to free the inline data buffer or else we leak them like crazy. This test was found by turning on kmemleak for generic/001 or generic/388. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:38 +01:00
Wanpeng Li	5c0b19bd8c	KVM: VMX: Fix rflags cache during vCPU reset [ Upstream commit `c37c28730b` ] Reported by syzkaller: * Guest State * CR0: actual=0x0000000080010031, shadow=0x0000000060000010, gh_mask=fffffffffffffff7 CR4: actual=0x0000000000002061, shadow=0x0000000000000000, gh_mask=ffffffffffffe8f1 CR3 = 0x000000002081e000 RSP = 0x000000000000fffa RIP = 0x0000000000000000 RFLAGS=0x00023000 DR7 = 0x00000000000000 ^^^^^^^^^^ ------------[ cut here ]------------ WARNING: CPU: 6 PID: 24431 at /home/kernel/linux/arch/x86/kvm//x86.c:7302 kvm_arch_vcpu_ioctl_run+0x651/0x2ea0 [kvm] CPU: 6 PID: 24431 Comm: reprotest Tainted: G W OE 4.14.0+ #26 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x651/0x2ea0 [kvm] RSP: 0018:ffff880291d179e0 EFLAGS: 00010202 Call Trace: kvm_vcpu_ioctl+0x479/0x880 [kvm] do_vfs_ioctl+0x142/0x9a0 SyS_ioctl+0x74/0x80 entry_SYSCALL_64_fastpath+0x23/0x9a The failed vmentry is triggered by the following beautified testcase: #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[5]; int main() { struct kvm_debugregs dr = { 0 }; r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); struct kvm_guest_debug debug = { .control = 0xf0403, .arch = { .debugreg[6] = 0x2, .debugreg[7] = 0x2 } }; ioctl(r[4], KVM_SET_GUEST_DEBUG, &debug); ioctl(r[4], KVM_RUN, 0); } which testcase tries to setup the processor specific debug registers and configure vCPU for handling guest debug events through KVM_SET_GUEST_DEBUG. The KVM_SET_GUEST_DEBUG ioctl will get and set rflags in order to set TF bit if single step is needed. All regs' caches are reset to avail and GUEST_RFLAGS vmcs field is reset to 0x2 during vCPU reset. However, the cache of rflags is not reset during vCPU reset. The function vmx_get_rflags() returns an unreset rflags cache value since the cache is marked avail, it is 0 after boot. Vmentry fails if the rflags reserved bit 1 is 0. This patch fixes it by resetting both the GUEST_RFLAGS vmcs field and its cache to 0x2 during vCPU reset. Reported-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:38 +01:00
Wanpeng Li	b0fa04e842	KVM: X86: Fix softlockup when get the current kvmclock [ Upstream commit `e70b57a6ce` ] watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [qemu-system-x86:10185] CPU: 6 PID: 10185 Comm: qemu-system-x86 Tainted: G OE 4.14.0-rc4+ #4 RIP: 0010:kvm_get_time_scale+0x4e/0xa0 [kvm] Call Trace: get_time_ref_counter+0x5a/0x80 [kvm] kvm_hv_process_stimers+0x120/0x5f0 [kvm] kvm_arch_vcpu_ioctl_run+0x4b4/0x1690 [kvm] kvm_vcpu_ioctl+0x33a/0x620 [kvm] do_vfs_ioctl+0xa1/0x5d0 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x1e/0xa9 This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and cpu-hotplug stress simultaneously. __this_cpu_read(cpu_tsc_khz) returns 0 (set in kvmclock_cpu_down_prep()) when the pCPU is unhotplug which results in kvm_get_time_scale() gets into an infinite loop. This patch fixes it by treating the unhotplug pCPU as not using master clock. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:37 +01:00
Jeff Layton	90ef2c30eb	reiserfs: remove unneeded i_version bump [ Upstream commit `9f97df50c5` ] The i_version field in reiserfs is not initialized and is only ever updated here. Nothing ever views it, so just remove it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:37 +01:00
Josef Bacik	8cfb3965eb	btrfs: fix deadlock when writing out space cache [ Upstream commit `b77000ed55` ] If we fail to prepare our pages for whatever reason (out of memory in our case) we need to make sure to drop the block_group->data_rwsem, otherwise hilarity ensues. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add label and use existing unlocking code ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:37 +01:00
Chun-Yeow Yeoh	030d4676a2	mac80211: fix the update of path metric for RANN frame [ Upstream commit `fbbdad5edf` ] The previous path metric update from RANN frame has not considered the own link metric toward the transmitting mesh STA. Fix this. Reported-by: Michael65535 Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:37 +01:00
zhangliping	03899a46c2	openvswitch: fix the incorrect flow action alloc size [ Upstream commit `67c8d22a73` ] If we want to add a datapath flow, which has more than 500 vxlan outputs' action, we will get the following error reports: openvswitch: netlink: Flow action size 32832 bytes exceeds max openvswitch: netlink: Flow action size 32832 bytes exceeds max openvswitch: netlink: Actions may not be safe on all matching packets ... ... It seems that we can simply enlarge the MAX_ACTIONS_BUFSIZE to fix it, but this is not the root cause. For example, for a vxlan output action, we need about 60 bytes for the nlattr, but after it is converted to the flow action, it only occupies 24 bytes. This means that we can still support more than 1000 vxlan output actions for a single datapath flow under the the current 32k max limitation. So even if the nla_len(attr) is larger than MAX_ACTIONS_BUFSIZE, we shouldn't report EINVAL and keep it move on, as the judgement can be done by the reserve_sfa_size. Signed-off-by: zhangliping <zhangliping02@baidu.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:37 +01:00
Felix Kuehling	8275584082	drm/amdkfd: Fix SDMA oversubsription handling [ Upstream commit `8c946b8988` ] SDMA only supports a fixed number of queues. HWS cannot handle oversubscription. Signed-off-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:37 +01:00
shaoyunl	16980affa1	drm/amdkfd: Fix SDMA ring buffer size calculation [ Upstream commit `d12fb13f23` ] ffs function return the position of the first bit set on 1 based. (bit zero returns 1). Signed-off-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:37 +01:00
Felix Kuehling	8afdbb165a	drm/amdgpu: Fix SDMA load/unload sequence on HWS disabled mode [ Upstream commit `cf21654b40` ] Fix the SDMA load and unload sequence as suggested by HW document. Signed-off-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:37 +01:00
Michael Lyle	409982cbb5	bcache: check return value of register_shrinker [ Upstream commit `6c4ca1e36c` ] register_shrinker is now __must_check, so check it to kill a warning. Caller of bch_btree_cache_alloc in super.c appropriately checks return value so this is fully plumbed through. This V2 fixes checkpatch warnings and improves the commit description, as I was too hasty getting the previous version out. Signed-off-by: Michael Lyle <mlyle@lyle.org> Reviewed-by: Vojtech Pavlik <vojtech@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:37 +01:00
James Hogan	6436981ba6	cpufreq: Add Loongson machine dependencies [ Upstream commit `0d307935fe` ] The MIPS loongson cpufreq drivers don't build unless configured for the correct machine type, due to dependency on machine specific architecture headers and symbols in machine specific platform code. More specifically loongson1-cpufreq.c uses RST_CPU_EN and RST_CPU, neither of which is defined in asm/mach-loongson32/regs-clk.h unless CONFIG_LOONGSON1_LS1B=y, and loongson2_cpufreq.c references loongson2_clockmod_table[], which is only defined in arch/mips/loongson64/lemote-2f/clock.c, i.e. when CONFIG_LEMOTE_MACH2F=y. Add these dependencies to Kconfig to avoid randconfig / allyesconfig build failures (e.g. when based on BMIPS which also has a cpufreq driver). Signed-off-by: James Hogan <jhogan@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:36 +01:00
Nikita Leshenko	876b31fd98	KVM: x86: ioapic: Preserve read-only values in the redirection table [ Upstream commit `b200dded0a` ] According to 82093AA (IOAPIC) manual, Remote IRR and Delivery Status are read-only. QEMU implements the bits as RO in commit 479c2a1cb7fb ("ioapic: keep RO bits for IOAPIC entry"). Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com> Reviewed-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:36 +01:00
Hans de Goede	1d3ab3b296	ACPI / bus: Leave modalias empty for devices which are not present [ Upstream commit `10809bb976` ] Most Bay and Cherry Trail devices use a generic DSDT with all possible peripheral devices present in the DSDT, with their _STA returning 0x00 or 0x0f based on AML variables which describe what is actually present on the board. Since ACPI device objects with a 0x00 status (not present) still get an entry under /sys/bus/acpi/devices, and those entry had an acpi:PNPID modalias, userspace would end up loading modules for non present hardware. This commit fixes this by leaving the modalias empty for non present devices. This results in 10 modules less being loaded with a generic distro kernel config on my Cherry Trail test-device (a GPD pocket). Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:36 +01:00
Nikita Leshenko	a9f2c16936	KVM: x86: ioapic: Clear Remote IRR when entry is switched to edge-triggered [ Upstream commit `a8bfec2930` ] Some OSes (Linux, Xen) use this behavior to clear the Remote IRR bit for IOAPICs without an EOI register. They simulate the EOI message manually by changing the trigger mode to edge and then back to level, with the entry being masked during this. QEMU implements this feature in commit ed1263c363c9 ("ioapic: clear remote irr bit for edge-triggered interrupts") As a side effect, this commit removes an incorrect behavior where Remote IRR was cleared when the redirection table entry was rewritten. This is not consistent with the manual and also opens an opportunity for a strange behavior when a redirection table entry is modified from an interrupt handler that handles the same entry: The modification will clear the Remote IRR bit even though the interrupt handler is still running. Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com> Reviewed-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:36 +01:00
Nikita Leshenko	2f9e94ef49	KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race [ Upstream commit `0fc5a36dd6` ] KVM uses ioapic_handled_vectors to track vectors that need to notify the IOAPIC on EOI. The problem is that IOAPIC can be reconfigured while an interrupt with old configuration is pending or running and ioapic_handled_vectors only remembers the newest configuration; thus EOI from the old interrupt is not delievered to the IOAPIC. A previous commit `db2bdcbbbd` ("KVM: x86: fix edge EOI and IOAPIC reconfig race") addressed this issue by adding pending edge-triggered interrupts to ioapic_handled_vectors, fixing this race for edge-triggered interrupts. The commit explicitly ignored level-triggered interrupts, but this race applies to them as well: 1) IOAPIC sends a level triggered interrupt vector to VCPU0 2) VCPU0's handler deasserts the irq line and reconfigures the IOAPIC to route the vector to VCPU1. The reconfiguration rewrites only the upper 32 bits of the IOREDTBLn register. (Causes KVM to update ioapic_handled_vectors for VCPU0 and it no longer includes the vector.) 3) VCPU0 sends EOI for the vector, but it's not delievered to the IOAPIC because the ioapic_handled_vectors doesn't include the vector. 4) New interrupts are not delievered to VCPU1 because remote_irr bit is set forever. Therefore, the correct behavior is to add all pending and running interrupts to ioapic_handled_vectors. This commit introduces a slight performance hit similar to commit `db2bdcbbbd` ("KVM: x86: fix edge EOI and IOAPIC reconfig race") for the rare case that the vector is reused by a non-IOAPIC source on VCPU0. We prefer to keep solution simple and not handle this case just as the original commit does. Fixes: `db2bdcbbbd` ("KVM: x86: fix edge EOI and IOAPIC reconfig race") Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:36 +01:00
Wanpeng Li	ec73d16bc7	KVM: X86: Fix operand/address-size during instruction decoding [ Upstream commit `3853be2603` ] Pedro reported: During tests that we conducted on KVM, we noticed that executing a "PUSH %ES" instruction under KVM produces different results on both memory and the SP register depending on whether EPT support is enabled. With EPT the SP is reduced by 4 bytes (and the written value is 0-padded) but without EPT support it is only reduced by 2 bytes. The difference can be observed when the CS.DB field is 1 (32-bit) but not when it's 0 (16-bit). The internal segment descriptor cache exist even in real/vm8096 mode. The CS.D also should be respected instead of just default operand/address-size/66H prefix/67H prefix during instruction decoding. This patch fixes it by also adjusting operand/address-size according to CS.D. Reported-by: Pedro Fonseca <pfonseca@cs.washington.edu> Tested-by: Pedro Fonseca <pfonseca@cs.washington.edu> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Pedro Fonseca <pfonseca@cs.washington.edu> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:36 +01:00
Liran Alon	114de9bfef	KVM: x86: Don't re-execute instruction when not passing CR2 value [ Upstream commit `9b8ae63798` ] In case of instruction-decode failure or emulation failure, x86_emulate_instruction() will call reexecute_instruction() which will attempt to use the cr2 value passed to x86_emulate_instruction(). However, when x86_emulate_instruction() is called from emulate_instruction(), cr2 is not passed (passed as 0) and therefore it doesn't make sense to execute reexecute_instruction() logic at all. Fixes: `51d8b66199` ("KVM: cleanup emulate_instruction") Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:36 +01:00
Liran Alon	60d9b22b1f	KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure [ Upstream commit `1f4dcb3b21` ] On this case, handle_emulation_failure() fills kvm_run with internal-error information which it expects to be delivered to user-mode for further processing. However, the code reports a wrong return-value which makes KVM to never return to user-mode on this scenario. Fixes: `6d77dbfc88` ("KVM: inject #UD if instruction emulation fails and exit to userspace") Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:36 +01:00
Abhishek Goel	d8f75b4c7f	cpupower : Fix cpupower working when cpu0 is offline [ Upstream commit `dbdc468f35` ] cpuidle_monitor used to assume that cpu0 is always online which is not a valid assumption on POWER machines. This patch fixes this by getting the cpu on which the current thread is running, instead of always using cpu0 for monitoring which may not be online. Signed-off-by: Abhishek Goel <huntbag@linux.vnet.ibm.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:35 +01:00
Abhishek Goel	82e57cdce0	cpupowerutils: bench - Fix cpu online check [ Upstream commit `53d1cd6b12` ] cpupower_is_cpu_online was incorrectly checking for 0. This patch fixes this by checking for 1 when the cpu is online. Signed-off-by: Abhishek Goel <huntbag@linux.vnet.ibm.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:35 +01:00
Stefan Schake	036c227cdd	drm/vc4: Account for interrupts in flight [ Upstream commit `253696ccd6` ] Synchronously disable the IRQ to make the following cancel_work_sync invocation effective. An interrupt in flight could enqueue further overflow mem work. As we free the binner BO immediately following vc4_irq_uninstall this caused a NULL pointer dereference in the work callback vc4_overflow_mem_work. Link: https://github.com/anholt/linux/issues/114 Signed-off-by: Stefan Schake <stschake@gmail.com> Fixes: `d5b1a78a77` ("drm/vc4: Add support for drawing 3D frames.") Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/1510275907-993-2-git-send-email-stschake@gmail.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:35 +01:00
Lyude Paul	30942f91b5	igb: Free IRQs when device is hotplugged commit `888f229314` upstream. Recently I got a Caldigit TS3 Thunderbolt 3 dock, and noticed that upon hotplugging my kernel would immediately crash due to igb: [ 680.825801] kernel BUG at drivers/pci/msi.c:352! [ 680.828388] invalid opcode: 0000 [#1] SMP [ 680.829194] Modules linked in: igb(O) thunderbolt i2c_algo_bit joydev vfat fat btusb btrtl btbcm btintel bluetooth ecdh_generic hp_wmi sparse_keymap rfkill wmi_bmof iTCO_wdt intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul snd_pcm rtsx_pci_ms mei_me snd_timer memstick snd pcspkr mei soundcore i2c_i801 tpm_tis psmouse shpchp wmi tpm_tis_core tpm video hp_wireless acpi_pad rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci mfd_core xhci_pci xhci_hcd i2c_hid i2c_core [last unloaded: igb] [ 680.831085] CPU: 1 PID: 78 Comm: kworker/u16:1 Tainted: G O 4.15.0-rc3Lyude-Test+ #6 [ 680.831596] Hardware name: HP HP ZBook Studio G4/826B, BIOS P71 Ver. 01.03 06/09/2017 [ 680.832168] Workqueue: kacpi_hotplug acpi_hotplug_work_fn [ 680.832687] RIP: 0010:free_msi_irqs+0x180/0x1b0 [ 680.833271] RSP: 0018:ffffc9000030fbf0 EFLAGS: 00010286 [ 680.833761] RAX: ffff8803405f9c00 RBX: ffff88033e3d2e40 RCX: 000000000000002c [ 680.834278] RDX: 0000000000000000 RSI: 00000000000000ac RDI: ffff880340be2178 [ 680.834832] RBP: 0000000000000000 R08: ffff880340be1ff0 R09: ffff8803405f9c00 [ 680.835342] R10: 0000000000000000 R11: 0000000000000040 R12: ffff88033d63a298 [ 680.835822] R13: ffff88033d63a000 R14: 0000000000000060 R15: ffff880341959000 [ 680.836332] FS: 0000000000000000(0000) GS:ffff88034f440000(0000) knlGS:0000000000000000 [ 680.836817] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 680.837360] CR2: 000055e64044afdf CR3: 0000000001c09002 CR4: 00000000003606e0 [ 680.837954] Call Trace: [ 680.838853] pci_disable_msix+0xce/0xf0 [ 680.839616] igb_reset_interrupt_capability+0x5d/0x60 [igb] [ 680.840278] igb_remove+0x9d/0x110 [igb] [ 680.840764] pci_device_remove+0x36/0xb0 [ 680.841279] device_release_driver_internal+0x157/0x220 [ 680.841739] pci_stop_bus_device+0x7d/0xa0 [ 680.842255] pci_stop_bus_device+0x2b/0xa0 [ 680.842722] pci_stop_bus_device+0x3d/0xa0 [ 680.843189] pci_stop_and_remove_bus_device+0xe/0x20 [ 680.843627] trim_stale_devices+0xf3/0x140 [ 680.844086] trim_stale_devices+0x94/0x140 [ 680.844532] trim_stale_devices+0xa6/0x140 [ 680.845031] ? get_slot_status+0x90/0xc0 [ 680.845536] acpiphp_check_bridge.part.5+0xfe/0x140 [ 680.846021] acpiphp_hotplug_notify+0x175/0x200 [ 680.846581] ? free_bridge+0x100/0x100 [ 680.847113] acpi_device_hotplug+0x8a/0x490 [ 680.847535] acpi_hotplug_work_fn+0x1a/0x30 [ 680.848076] process_one_work+0x182/0x3a0 [ 680.848543] worker_thread+0x2e/0x380 [ 680.848963] ? process_one_work+0x3a0/0x3a0 [ 680.849373] kthread+0x111/0x130 [ 680.849776] ? kthread_create_worker_on_cpu+0x50/0x50 [ 680.850188] ret_from_fork+0x1f/0x30 [ 680.850601] Code: 43 14 85 c0 0f 84 d5 fe ff ff 31 ed eb 0f 83 c5 01 39 6b 14 0f 86 c5 fe ff ff 8b 7b 10 01 ef e8 b7 e4 d2 ff 48 83 78 70 00 74 e3 <0f> 0b 49 8d b5 a0 00 00 00 e8 62 6f d3 ff e9 c7 fe ff ff 48 8b [ 680.851497] RIP: free_msi_irqs+0x180/0x1b0 RSP: ffffc9000030fbf0 As it turns out, normally the freeing of IRQs that would fix this is called inside of the scope of __igb_close(). However, since the device is already gone by the point we try to unregister the netdevice from the driver due to a hotplug we end up seeing that the netif isn't present and thus, forget to free any of the device IRQs. So: make sure that if we're in the process of dismantling the netdev, we always allow __igb_close() to be called so that IRQs may be freed normally. Additionally, only allow igb_close() to be called from __igb_close() if it hasn't already been called for the given adapter. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: `9474933caf` ("igb: close/suspend race in netif_device_detach") Cc: Todd Fujinaka <todd.fujinaka@intel.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:35 +01:00
Jesse Chan	3a98d07539	mtd: nand: denali_pci: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE commit `d822401d1c` upstream. This change resolves a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in drivers/mtd/nand/denali_pci.o see include/linux/module.h for more information This adds the license as "GPL v2", which matches the header of the file. MODULE_DESCRIPTION and MODULE_AUTHOR are also added. Signed-off-by: Jesse Chan <jc@linux.com> Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:35 +01:00
Jesse Chan	e29997d552	gpio: ath79: add missing MODULE_DESCRIPTION/LICENSE commit `539340f37e` upstream. This change resolves a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in drivers/gpio/gpio-ath79.o see include/linux/module.h for more information This adds the license as "GPL v2", which matches the header of the file. MODULE_DESCRIPTION is also added. Signed-off-by: Jesse Chan <jc@linux.com> Acked-by: Alban Bedel <albeu@free.fr> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:35 +01:00
Jesse Chan	cb1a0b51d1	gpio: iop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE commit `97b03136e1` upstream. This change resolves a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in drivers/gpio/gpio-iop.o see include/linux/module.h for more information This adds the license as "GPL", which matches the header of the file. MODULE_DESCRIPTION and MODULE_AUTHOR are also added. Signed-off-by: Jesse Chan <jc@linux.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:35 +01:00
Jesse Chan	517931760e	power: reset: zx-reboot: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE commit `348c7cf5fc` upstream. This change resolves a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in drivers/power/reset/zx-reboot.o see include/linux/module.h for more information This adds the license as "GPL v2", which matches the header of the file. MODULE_DESCRIPTION and MODULE_AUTHOR are also added. Signed-off-by: Jesse Chan <jc@linux.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:35 +01:00
Aaron Armstrong Skomra	ddba3c67a5	HID: wacom: EKR: ensure devres groups at higher indexes are released commit `791ae27373` upstream. Background: ExpressKey Remotes communicate their events via usb dongle. Each dongle can hold up to 5 pairings at one time and one EKR (identified by its serial number) can unfortunately be paired with its dongle more than once. The pairing takes place in a round-robin fashion. Input devices are only created once per EKR, when a new serial number is seen in the list of pairings. However, if a device is created for a "higher" paring index and subsequently a second pairing occurs at a lower pairing index, unpairing the remote with that serial number from any pairing index will currently cause a driver crash. This occurs infrequently, as two remotes are necessary to trigger this bug and most users have only one remote. As an illustration, to trigger the bug you need to have two remotes, and pair them in this order: 1. slot 0 -> remote 1 (input device created for remote 1) 2. slot 1 -> remote 1 (duplicate pairing - no device created) 3. slot 2 -> remote 1 (duplicate pairing - no device created) 4. slot 3 -> remote 1 (duplicate pairing - no device created) 5. slot 4 -> remote 2 (input device created for remote 2) 6. slot 0 -> remote 2 (1 destroyed and recreated at slot 1) 7. slot 1 -> remote 2 (1 destroyed and recreated at slot 2) 8. slot 2 -> remote 2 (1 destroyed and recreated at slot 3) 9. slot 3 -> remote 2 (1 destroyed and not recreated) 10. slot 4 -> remote 2 (2 was already in this slot so no changes) 11. slot 0 -> remote 1 (The current code sees remote 2 was paired over in one of the dongle slots it occupied and attempts to remove all information about remote 2 [1]. It calls wacom_remote_destroy_one for remote 2, but the destroy function assumes the lowest index is where the remote's input device was created. The code "cleans up" the other remote 2 pairings including the one which the input device was based on, assuming they were were just duplicate pairings. However, the cleanup doesn't call the devres release function for the input device that was created in slot 4). This issue is fixed by this commit. [1] Remote 2 should subsequently be re-created on the next packet from the EKR at the lowest numbered slot that it occupies (here slot 1). Fixes: `f9036bd436` ("HID: wacom: EKR: use devres groups to manage resources") Signed-off-by: Aaron Armstrong Skomra <aaron.skomra@wacom.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:35 +01:00
Stephan Mueller	b7edc45f3a	crypto: af_alg - whitelist mask and type commit `bb30b8848c` upstream. The user space interface allows specifying the type and mask field used to allocate the cipher. Only a subset of the possible flags are intended for user space. Therefore, white-list the allowed flags. In case the user space caller uses at least one non-allowed flag, EINVAL is returned. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:34 +01:00
Ard Biesheuvel	1ce8e52f6f	crypto: sha3-generic - fixes for alignment and big endian operation commit `c013cee99d` upstream. Ensure that the input is byte swabbed before injecting it into the SHA3 transform. Use the get_unaligned() accessor for this so that we don't perform unaligned access inadvertently on architectures that do not support that. Fixes: `53964b9ee6` ("crypto: sha3 - Add SHA-3 hash algorithm") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:34 +01:00
Stephan Mueller	95259cb008	crypto: aesni - handle zero length dst buffer commit `9c674e1e2f` upstream. GCM can be invoked with a zero destination buffer. This is possible if the AAD and the ciphertext have zero lengths and only the tag exists in the source buffer (i.e. a source buffer cannot be zero). In this case, the GCM cipher only performs the authentication and no decryption operation. When the destination buffer has zero length, it is possible that no page is mapped to the SG pointing to the destination. In this case, sg_page(req->dst) is an invalid access. Therefore, page accesses should only be allowed if the req->dst->length is non-zero which is the indicator that a page must exist. This fixes a crash that can be triggered by user space via AF_ALG. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:34 +01:00
Hauke Mehrtens	f1803207b5	crypto: ecdh - fix typo in KPP dependency of CRYPTO_ECDH commit `b5b9007730` upstream. This fixes a typo in the CRYPTO_KPP dependency of CRYPTO_ECDH. Fixes: `3c4b23901a` ("crypto: ecdh - Add ECDH software support") Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:34 +01:00
Linus Walleij	cc1fa4a7b6	gpio: Fix kernel stack leak to userspace commit `24bd3efc9d` upstream. The GPIO event descriptor was leaking kernel stack to userspace because we don't zero the variable before use. Ooops. Fix this. Reported-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Bartosz Golaszewski <brgl@bgdev.pl> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:34 +01:00
Patrice Chotard	241c04f75e	gpio: stmpe: i2c transfer are forbiden in atomic context commit `b888fb6f2a` upstream. Move the workaround from stmpe_gpio_irq_unmask() which is executed in atomic context to stmpe_gpio_irq_sync_unlock() which is not. It fixes the following issue: [ 1.500000] BUG: scheduling while atomic: swapper/1/0x00000002 [ 1.500000] CPU: 0 PID: 1 Comm: swapper Not tainted 4.15.0-rc2-00020-gbd4301f-dirty #28 [ 1.520000] Hardware name: STM32 (Device Tree Support) [ 1.520000] [<0000bfc9>] (unwind_backtrace) from [<0000b347>] (show_stack+0xb/0xc) [ 1.530000] [<0000b347>] (show_stack) from [<0001fc49>] (__schedule_bug+0x39/0x58) [ 1.530000] [<0001fc49>] (__schedule_bug) from [<00168211>] (__schedule+0x23/0x2b2) [ 1.550000] [<00168211>] (__schedule) from [<001684f7>] (schedule+0x57/0x64) [ 1.550000] [<001684f7>] (schedule) from [<0016a513>] (schedule_timeout+0x137/0x164) [ 1.550000] [<0016a513>] (schedule_timeout) from [<00168b91>] (wait_for_common+0x8d/0xfc) [ 1.570000] [<00168b91>] (wait_for_common) from [<00139753>] (stm32f4_i2c_xfer+0xe9/0xfe) [ 1.580000] [<00139753>] (stm32f4_i2c_xfer) from [<00138545>] (__i2c_transfer+0x111/0x148) [ 1.590000] [<00138545>] (__i2c_transfer) from [<001385cf>] (i2c_transfer+0x53/0x70) [ 1.590000] [<001385cf>] (i2c_transfer) from [<001388a5>] (i2c_smbus_xfer+0x12f/0x36e) [ 1.600000] [<001388a5>] (i2c_smbus_xfer) from [<00138b49>] (i2c_smbus_read_byte_data+0x1f/0x2a) [ 1.610000] [<00138b49>] (i2c_smbus_read_byte_data) from [<00124fdd>] (__stmpe_reg_read+0xd/0x24) [ 1.620000] [<00124fdd>] (__stmpe_reg_read) from [<001252b3>] (stmpe_reg_read+0x19/0x24) [ 1.630000] [<001252b3>] (stmpe_reg_read) from [<0002c4d1>] (unmask_irq+0x17/0x22) [ 1.640000] [<0002c4d1>] (unmask_irq) from [<0002c57f>] (irq_startup+0x6f/0x78) [ 1.650000] [<0002c57f>] (irq_startup) from [<0002b7a1>] (__setup_irq+0x319/0x47c) [ 1.650000] [<0002b7a1>] (__setup_irq) from [<0002bad3>] (request_threaded_irq+0x6b/0xe8) [ 1.660000] [<0002bad3>] (request_threaded_irq) from [<0002d0b9>] (devm_request_threaded_irq+0x3b/0x6a) [ 1.670000] [<0002d0b9>] (devm_request_threaded_irq) from [<001446e7>] (mmc_gpiod_request_cd_irq+0x49/0x8a) [ 1.680000] [<001446e7>] (mmc_gpiod_request_cd_irq) from [<0013d45d>] (mmc_start_host+0x49/0x60) [ 1.690000] [<0013d45d>] (mmc_start_host) from [<0013e40b>] (mmc_add_host+0x3b/0x54) [ 1.700000] [<0013e40b>] (mmc_add_host) from [<00148119>] (mmci_probe+0x4d1/0x60c) [ 1.710000] [<00148119>] (mmci_probe) from [<000f903b>] (amba_probe+0x7b/0xbe) [ 1.720000] [<000f903b>] (amba_probe) from [<001170e5>] (driver_probe_device+0x169/0x1f8) [ 1.730000] [<001170e5>] (driver_probe_device) from [<001171b7>] (__driver_attach+0x43/0x5c) [ 1.740000] [<001171b7>] (__driver_attach) from [<0011618d>] (bus_for_each_dev+0x3d/0x46) [ 1.740000] [<0011618d>] (bus_for_each_dev) from [<001165cd>] (bus_add_driver+0xcd/0x124) [ 1.740000] [<001165cd>] (bus_add_driver) from [<00117713>] (driver_register+0x4d/0x7a) [ 1.760000] [<00117713>] (driver_register) from [<001fc765>] (do_one_initcall+0xbd/0xe8) [ 1.770000] [<001fc765>] (do_one_initcall) from [<001fc88b>] (kernel_init_freeable+0xfb/0x134) [ 1.780000] [<001fc88b>] (kernel_init_freeable) from [<00167ee3>] (kernel_init+0x7/0x9c) [ 1.790000] [<00167ee3>] (kernel_init) from [<00009b65>] (ret_from_fork+0x11/0x2c) Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:34 +01:00
Joel Stanley	efe3f94f83	tools/gpio: Fix build error with musl libc commit `1696784eb7` upstream. The GPIO tools build fails when using a buildroot toolchain that uses musl as it's C library: arm-broomstick-linux-musleabi-gcc -Wp,-MD,./.gpio-event-mon.o.d \ -Wp,-MT,gpio-event-mon.o -O2 -Wall -g -D_GNU_SOURCE \ -Iinclude -D"BUILD_STR(s)=#s" -c -o gpio-event-mon.o gpio-event-mon.c gpio-event-mon.c:30:6: error: unknown type name ‘u_int32_t’; did you mean ‘uint32_t’? u_int32_t handleflags, ^~~~~~~~~ uint32_t The glibc headers installed on my laptop include sys/types.h in unistd.h, but it appears that musl does not. Fixes: `97f69747d8` ("tools/gpio: add the gpio-event-mon tool") Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:34 +01:00
Max Gurtovoy	2a7076e715	RDMA/mlx5: set UMR wqe fence according to HCA cap commit `6e8484c5cf` upstream. Cache the needed umr_fence and set the wqe ctrl segmennt accordingly. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com> Cc: Marta Rybczynska <mrybczyn@kalray.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:34 +01:00
Max Gurtovoy	20e6f5bdf5	net/mlx5: Define interface bits for fencing UMR wqe commit `1410a90ae4` upstream. HW can implement UMR wqe re-transmission in various ways. Thus, add HCA cap to distinguish the needed fence for UMR to make sure that the wqe wouldn't fail on mkey checks. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com> Cc: Marta Rybczynska <mrybczyn@kalray.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:33 +01:00
Linus Torvalds	56bc086358	loop: fix concurrent lo_open/lo_release commit `ae6650163c` upstream. 范龙飞 reports that KASAN can report a use-after-free in __lock_acquire. The reason is due to insufficient serialization in lo_release(), which will continue to use the loop device even after it has decremented the lo_refcnt to zero. In the meantime, another process can come in, open the loop device again as it is being shut down. Confusion ensues. Reported-by: 范龙飞 <long7573@126.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-02-03 17:05:33 +01:00
Greg Kroah-Hartman	6c6f924f9c	Linux 4.9.79	2018-01-31 12:55:57 +01:00
Ben Hutchings	f12d060263	nfsd: auth: Fix gid sorting when rootsquash enabled commit `1995266727` upstream. Commit `bdcf0a423e` ("kernel: make groups_sort calling a responsibility group_info allocators") appears to break nfsd rootsquash in a pretty major way. It adds a call to groups_sort() inside the loop that copies/squashes gids, which means the valid gids are sorted along with the following garbage. The net result is that the highest numbered valid gids are replaced with any lower-valued garbage gids, possibly including 0. We should sort only once, after filling in all the gids. Fixes: `bdcf0a423e` ("kernel: make groups_sort calling a responsibility ...") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Wolfgang Walter <linux@stwm.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:57 +01:00
Daniel Borkmann	f531fbb06a	bpf: reject stores into ctx via st and xadd [ upstream commit `f37a8cb84c` ] Alexei found that verifier does not reject stores into context via BPF_ST instead of BPF_STX. And while looking at it, we also should not allow XADD variant of BPF_STX. The context rewriter is only assuming either BPF_LDX_MEM- or BPF_STX_MEM-type operations, thus reject anything other than that so that assumptions in the rewriter properly hold. Add test cases as well for BPF selftests. Fixes: `d691f9e8d4` ("bpf: allow programs to write to certain skb fields") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:57 +01:00
Alexei Starovoitov	265d7657c9	bpf: fix 32-bit divide by zero [ upstream commit `68fda450a7` ] due to some JITs doing if (src_reg == 0) check in 64-bit mode for div/mod operations mask upper 32-bits of src register before doing the check Fixes: `622582786c` ("net: filter: x86: internal BPF JIT") Fixes: `7a12b5031c` ("sparc64: Add eBPF JIT.") Reported-by: syzbot+48340bb518e88849e2e3@syzkaller.appspotmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:57 +01:00
Eric Dumazet	4606077802	bpf: fix divides by zero [ upstream commit `c366287ebd` ] Divides by zero are not nice, lets avoid them if possible. Also do_div() seems not needed when dealing with 32bit operands, but this seems a minor detail. Fixes: `bd4cf0ed33` ("net: filter: rework/optimize internal BPF interpreter's instruction set") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:57 +01:00
Daniel Borkmann	5cb917aa1f	bpf: avoid false sharing of map refcount with max_entries [ upstream commit `be95a845cc` ] In addition to commit `b2157399cc` ("bpf: prevent out-of-bounds speculation") also change the layout of struct bpf_map such that false sharing of fast-path members like max_entries is avoided when the maps reference counter is altered. Therefore enforce them to be placed into separate cachelines. pahole dump after change: struct bpf_map { const struct bpf_map_ops * ops; /* 0 8 / struct bpf_map inner_map_meta; /* 8 8 / void security; /* 16 8 / enum bpf_map_type map_type; / 24 4 / u32 key_size; / 28 4 / u32 value_size; / 32 4 / u32 max_entries; / 36 4 / u32 map_flags; / 40 4 / u32 pages; / 44 4 / u32 id; / 48 4 / int numa_node; / 52 4 / bool unpriv_array; / 56 1 / / XXX 7 bytes hole, try to pack / / --- cacheline 1 boundary (64 bytes) --- / struct user_struct user; /* 64 8 / atomic_t refcnt; / 72 4 / atomic_t usercnt; / 76 4 / struct work_struct work; / 80 32 / char name[16]; / 112 16 / / --- cacheline 2 boundary (128 bytes) --- / / size: 128, cachelines: 2, members: 17 / / sum members: 121, holes: 1, sum holes: 7 */ }; Now all entries in the first cacheline are read only throughout the life time of the map, set up once during map creation. Overall struct size and number of cachelines doesn't change from the reordering. struct bpf_map is usually first member and embedded in map structs in specific map implementations, so also avoid those members to sit at the end where it could potentially share the cacheline with first map values e.g. in the array since remote CPUs could trigger map updates just as well for those (easily dirtying members like max_entries intentionally as well) while having subsequent values in cache. Quoting from Google's Project Zero blog [1]: Additionally, at least on the Intel machine on which this was tested, bouncing modified cache lines between cores is slow, apparently because the MESI protocol is used for cache coherence [8]. Changing the reference counter of an eBPF array on one physical CPU core causes the cache line containing the reference counter to be bounced over to that CPU core, making reads of the reference counter on all other CPU cores slow until the changed reference counter has been written back to memory. Because the length and the reference counter of an eBPF array are stored in the same cache line, this also means that changing the reference counter on one physical CPU core causes reads of the eBPF array's length to be slow on other physical CPU cores (intentional false sharing). While this doesn't 'control' the out-of-bounds speculation through masking the index as in commit `b2157399cc`, triggering a manipulation of the map's reference counter is really trivial, so lets not allow to easily affect max_entries from it. Splitting to separate cachelines also generally makes sense from a performance perspective anyway in that fast-path won't have a cache miss if the map gets pinned, reused in other progs, etc out of control path, thus also avoids unintentional false sharing. [1] https://googleprojectzero.blogspot.ch/2018/01/reading-privileged-memory-with-side.html Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:57 +01:00
Daniel Borkmann	fcabc6d008	bpf: arsh is not supported in 32 bit alu thus reject it [ upstream commit `7891a87efc` ] The following snippet was throwing an 'unknown opcode cc' warning in BPF interpreter: 0: (18) r0 = 0x0 2: (7b) (u64 )(r10 -16) = r0 3: (cc) (u32) r0 s>>= (u32) r0 4: (95) exit Although a number of JITs do support BPF_ALU \| BPF_ARSH \| BPF_{K,X} generation, not all of them do and interpreter does neither. We can leave existing ones and implement it later in bpf-next for the remaining ones, but reject this properly in verifier for the time being. Fixes: `17a5267067` ("bpf: verifier (add verifier core)") Reported-by: syzbot+93c4904c5c70348a6890@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:57 +01:00
Alexei Starovoitov	a3d6dd6a66	bpf: introduce BPF_JIT_ALWAYS_ON config [ upstream commit `290af86629` ] The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715. A quote from goolge project zero blog: "At this point, it would normally be necessary to locate gadgets in the host kernel code that can be used to actually leak data by reading from an attacker-controlled location, shifting and masking the result appropriately and then using the result of that as offset to an attacker-controlled address for a load. But piecing gadgets together and figuring out which ones work in a speculation context seems annoying. So instead, we decided to use the eBPF interpreter, which is built into the host kernel - while there is no legitimate way to invoke it from inside a VM, the presence of the code in the host kernel's text section is sufficient to make it usable for the attack, just like with ordinary ROP gadgets." To make attacker job harder introduce BPF_JIT_ALWAYS_ON config option that removes interpreter from the kernel in favor of JIT-only mode. So far eBPF JIT is supported by: x64, arm64, arm32, sparc64, s390, powerpc64, mips64 The start of JITed program is randomized and code page is marked as read-only. In addition "constant blinding" can be turned on with net.core.bpf_jit_harden v2->v3: - move __bpf_prog_ret0 under ifdef (Daniel) v1->v2: - fix init order, test_bpf and cBPF (Daniel's feedback) - fix offloaded bpf (Jakub's feedback) - add 'return 0' dummy in case something can invoke prog->bpf_func - retarget bpf tree. For bpf-next the patch would need one extra hunk. It will be sent when the trees are merged back to net-next Considered doing: int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT; but it seems better to land the patch as-is and in bpf-next remove bpf_jit_enable global variable from all JITs, consolidate in one place and remove this jit_init() function. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:56 +01:00
Alexei Starovoitov	5226bb3b95	bpf: fix bpf_tail_call() x64 JIT [ upstream commit `90caccdd8c` ] - bpf prog_array just like all other types of bpf array accepts 32-bit index. Clarify that in the comment. - fix x64 JIT of bpf_tail_call which was incorrectly loading 8 instead of 4 bytes - tighten corresponding check in the interpreter to stay consistent The JIT bug can be triggered after introduction of BPF_F_NUMA_NODE flag in commit `96eabe7a40` in 4.14. Before that the map_flags would stay zero and though JIT code is wrong it will check bounds correctly. Hence two fixes tags. All other JITs don't have this problem. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Fixes: `96eabe7a40` ("bpf: Allow selecting numa node during map creation") Fixes: `b52f00e6a7` ("x86: bpf_jit: implement bpf_tail_call() helper") Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:56 +01:00
Eric Dumazet	c964ad34f6	x86: bpf_jit: small optimization in emit_bpf_tail_call() [ upstream commit `84ccac6e78` ] Saves 4 bytes replacing following instructions : lea rax, [rsi + rdx * 8 + offsetof(...)] mov rax, qword ptr [rax] cmp rax, 0 by : mov rax, [rsi + rdx * 8 + offsetof(...)] test rax, rax Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:56 +01:00
Thomas Gleixner	c98ff7299b	hrtimer: Reset hrtimer cpu base proper on CPU hotplug commit `d5421ea43d` upstream. The hrtimer interrupt code contains a hang detection and mitigation mechanism, which prevents that a long delayed hrtimer interrupt causes a continous retriggering of interrupts which prevent the system from making progress. If a hang is detected then the timer hardware is programmed with a certain delay into the future and a flag is set in the hrtimer cpu base which prevents newly enqueued timers from reprogramming the timer hardware prior to the chosen delay. The subsequent hrtimer interrupt after the delay clears the flag and resumes normal operation. If such a hang happens in the last hrtimer interrupt before a CPU is unplugged then the hang_detected flag is set and stays that way when the CPU is plugged in again. At that point the timer hardware is not armed and it cannot be armed because the hang_detected flag is still active, so nothing clears that flag. As a consequence the CPU does not receive hrtimer interrupts and no timers expire on that CPU which results in RCU stalls and other malfunctions. Clear the flag along with some other less critical members of the hrtimer cpu base to ensure starting from a clean state when a CPU is plugged in. Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the root cause of that hard to reproduce heisenbug. Once understood it's trivial and certainly justifies a brown paperbag. Fixes: `41d2e49493` ("hrtimer: Tune hrtimer_interrupt hang logic") Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Sewior <bigeasy@linutronix.de> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:56 +01:00
Jia Zhang	9f3a6cadf4	x86/microcode/intel: Extend BDW late-loading further with LLC size check commit `7e702d17ed` upstream. Commit `b94b737331` ("x86/microcode/intel: Extend BDW late-loading with a revision check") reduced the impact of erratum BDF90 for Broadwell model 79. The impact can be reduced further by checking the size of the last level cache portion per core. Tony: "The erratum says the problem only occurs on the large-cache SKUs. So we only need to avoid the update if we are on a big cache SKU that is also running old microcode." For more details, see erratum BDF90 in document #334165 (Intel Xeon Processor E7-8800/4800 v4 Product Family Specification Update) from September 2017. Fixes: `b94b737331` ("x86/microcode/intel: Extend BDW late-loading with a revision check") Signed-off-by: Jia Zhang <zhang.jia@linux.alibaba.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1516321542-31161-1-git-send-email-zhang.jia@linux.alibaba.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:56 +01:00
Xiao Liang	dc1932c698	perf/x86/amd/power: Do not load AMD power module on !AMD platforms commit `40d4071ce2` upstream. The AMD power module can be loaded on non AMD platforms, but unload fails with the following Oops: BUG: unable to handle kernel NULL pointer dereference at (null) IP: __list_del_entry_valid+0x29/0x90 Call Trace: perf_pmu_unregister+0x25/0xf0 amd_power_pmu_exit+0x1c/0xd23 [power] SyS_delete_module+0x1a8/0x2b0 ? exit_to_usermode_loop+0x8f/0xb0 entry_SYSCALL_64_fastpath+0x20/0x83 Return -ENODEV instead of 0 from the module init function if the CPU does not match. Fixes: `c7ab62bfbe` ("perf/x86/amd/power: Add AMD accumulated power reporting mechanism") Signed-off-by: Xiao Liang <xiliang@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20180122061252.6394-1-xiliang@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:56 +01:00
Eric Dumazet	eecfa2eeef	flow_dissector: properly cap thoff field [ Upstream commit `d0c081b491` ] syzbot reported yet another crash [1] that is caused by insufficient validation of DODGY packets. Two bugs are happening here to trigger the crash. 1) Flow dissection leaves with incorrect thoff field. 2) skb_probe_transport_header() sets transport header to this invalid thoff, even if pointing after skb valid data. 3) qdisc_pkt_len_init() reads out-of-bound data because it trusts tcp_hdrlen(skb) Possible fixes : - Full flow dissector validation before injecting bad DODGY packets in the stack. This approach was attempted here : https://patchwork.ozlabs.org/patch/ 861874/ - Have more robust functions in the core. This might be needed anyway for stable versions. This patch fixes the flow dissection issue. [1] CPU: 1 PID: 3144 Comm: syzkaller271204 Not tainted 4.15.0-rc4-mm1+ #49 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:355 [inline] kasan_report+0x23b/0x360 mm/kasan/report.c:413 __asan_report_load2_noabort+0x14/0x20 mm/kasan/report.c:432 __tcp_hdrlen include/linux/tcp.h:35 [inline] tcp_hdrlen include/linux/tcp.h:40 [inline] qdisc_pkt_len_init net/core/dev.c:3160 [inline] __dev_queue_xmit+0x20d3/0x2200 net/core/dev.c:3465 dev_queue_xmit+0x17/0x20 net/core/dev.c:3554 packet_snd net/packet/af_packet.c:2943 [inline] packet_sendmsg+0x3ad5/0x60a0 net/packet/af_packet.c:2968 sock_sendmsg_nosec net/socket.c:628 [inline] sock_sendmsg+0xca/0x110 net/socket.c:638 sock_write_iter+0x31a/0x5d0 net/socket.c:907 call_write_iter include/linux/fs.h:1776 [inline] new_sync_write fs/read_write.c:469 [inline] __vfs_write+0x684/0x970 fs/read_write.c:482 vfs_write+0x189/0x510 fs/read_write.c:544 SYSC_write fs/read_write.c:589 [inline] SyS_write+0xef/0x220 fs/read_write.c:581 entry_SYSCALL_64_fastpath+0x1f/0x96 Fixes: `34fad54c25` ("net: __skb_flow_dissect() must cap its return value") Fixes: `a6e544b0a8` ("flow_dissector: Jump to exit code in __skb_flow_dissect") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:56 +01:00
Cong Wang	18717ee28e	tun: fix a memory leak for tfile->tx_array [ Upstream commit `4df0bfc799` ] tfile->tun could be detached before we close the tun fd, via tun_detach_all(), so it should not be used to check for tfile->tx_array. As Jason suggested, we probably have to clean it up unconditionally both in __tun_deatch() and tun_detach_all(), but this requires to check if it is initialized or not. Currently skb_array_cleanup() doesn't have such a check, so I check it in the caller and introduce a helper function, it is a bit ugly but we can always improve it in net-next. Reported-by: Dmitry Vyukov <dvyukov@google.com> Fixes: `1576d98605` ("tun: switch to use skb array for tx") Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:56 +01:00
Yuval Mintz	1105145cb3	mlxsw: spectrum_router: Don't log an error on missing neighbor [ Upstream commit `1ecdaea02c` ] Driver periodically samples all neighbors configured in device in order to update the kernel regarding their state. When finding an entry configured in HW that doesn't show in neigh_lookup() driver logs an error message. This introduces a race when removing multiple neighbors - it's possible that a given entry would still be configured in HW as its removal is still being processed but is already removed from the kernel's neighbor tables. Simply remove the error message and gracefully accept such events. Fixes: `c723c735fa` ("mlxsw: spectrum_router: Periodically update the kernel's neigh table") Fixes: `60f040ca11` ("mlxsw: spectrum_router: Periodically dump active IPv6 neighbours") Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:55 +01:00
Willem de Bruijn	3110e2134c	gso: validate gso_type in GSO handlers [ Upstream commit `121d57af30` ] Validate gso_type during segmentation as SKB_GSO_DODGY sources may pass packets where the gso_type does not match the contents. Syzkaller was able to enter the SCTP gso handler with a packet of gso_type SKB_GSO_TCPV4. On entry of transport layer gso handlers, verify that the gso_type matches the transport protocol. Fixes: `90017accff` ("sctp: Add GSO support") Link: http://lkml.kernel.org/r/<001a1137452496ffc305617e5fe0@google.com> Reported-by: syzbot+fee64147a25aecd48055@syzkaller.appspotmail.com Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:55 +01:00
Alexey Kodanev	cc99c6d59a	ip6_gre: init dev->mtu and dev->hard_header_len correctly [ Upstream commit `128bb975dc` ] Commit `b05229f442` ("gre6: Cleanup GREv6 transmit path, call common GRE functions") moved dev->mtu initialization from ip6gre_tunnel_setup() to ip6gre_tunnel_init(), as a result, the previously set values, before ndo_init(), are reset in the following cases: * rtnl_create_link() can update dev->mtu from IFLA_MTU parameter. * ip6gre_tnl_link_config() is invoked before ndo_init() in netlink and ioctl setup, so ndo_init() can reset MTU adjustments with the lower device MTU as well, dev->mtu and dev->hard_header_len. Not applicable for ip6gretap because it has one more call to ip6gre_tnl_link_config(tunnel, 1) in ip6gre_tap_init(). Fix the first case by updating dev->mtu with 'tb[IFLA_MTU]' parameter if a user sets it manually on a device creation, and fix the second one by moving ip6gre_tnl_link_config() call after register_netdevice(). Fixes: `b05229f442` ("gre6: Cleanup GREv6 transmit path, call common GRE functions") Fixes: `db2ec95d1b` ("ip6_gre: Fix MTU setting") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:55 +01:00
Ivan Vecera	1711ba166e	be2net: restore properly promisc mode after queues reconfiguration [ Upstream commit `52acf06451` ] The commit `6221906694` ("be2net: Request RSS capability of Rx interface depending on number of Rx rings") modified be_update_queues() so the IFACE (HW representation of the netdevice) is destroyed and then re-created. This causes a regression because potential promiscuous mode is not restored properly during be_open() because the driver thinks that the HW has promiscuous mode already enabled. Note that Lancer is not affected by this bug because RX-filter flags are disabled during be_close() for this chipset. Cc: Sathya Perla <sathya.perla@broadcom.com> Cc: Ajit Khaparde <ajit.khaparde@broadcom.com> Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Cc: Somnath Kotur <somnath.kotur@broadcom.com> Fixes: `6221906694` ("be2net: Request RSS capability of Rx interface depending on number of Rx rings") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:55 +01:00
Guillaume Nault	00f9e47c6f	ppp: unlock all_ppp_mutex before registering device [ Upstream commit `0171c41835` ] ppp_dev_uninit(), which is the .ndo_uninit() handler of PPP devices, needs to lock pn->all_ppp_mutex. Therefore we mustn't call register_netdevice() with pn->all_ppp_mutex already locked, or we'd deadlock in case register_netdevice() fails and calls .ndo_uninit(). Fortunately, we can unlock pn->all_ppp_mutex before calling register_netdevice(). This lock protects pn->units_idr, which isn't used in the device registration process. However, keeping pn->all_ppp_mutex locked during device registration did ensure that no device in transient state would be published in pn->units_idr. In practice, unlocking it before calling register_netdevice() doesn't change this property: ppp_unit_register() is called with 'ppp_mutex' locked and all searches done in pn->units_idr hold this lock too. Fixes: `8cb775bc0a` ("ppp: fix device unregistration upon netns deletion") Reported-and-tested-by: syzbot+367889b9c9e279219175@syzkaller.appspotmail.com Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:55 +01:00
Jim Westfall	260eb694b5	ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY [ Upstream commit `cd9ff4de01` ] Map all lookup neigh keys to INADDR_ANY for loopback/point-to-point devices to avoid making an entry for every remote ip the device needs to talk to. This used the be the old behavior but became broken in `a263b30936` (ipv4: Make neigh lookups directly in output packet path) and later removed in `0bb4087cbe` (ipv4: Fix neigh lookup keying over loopback/point-to-point devices) because it was broken. Signed-off-by: Jim Westfall <jwestfall@surrealistic.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:55 +01:00
Jim Westfall	014510b117	net: Allow neigh contructor functions ability to modify the primary_key [ Upstream commit `096b9854c0` ] Use n->primary_key instead of pkey to account for the possibility that a neigh constructor function may have modified the primary_key value. Signed-off-by: Jim Westfall <jwestfall@surrealistic.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:55 +01:00
Neil Horman	66c16a22e3	vmxnet3: repair memory leak [ Upstream commit `848b159835` ] with the introduction of commit `b0eb57cb97`, it appears that rq->buf_info is improperly handled. While it is heap allocated when an rx queue is setup, and freed when torn down, an old line of code in vmxnet3_rq_destroy was not properly removed, leading to rq->buf_info[0] being set to NULL prior to its being freed, causing a memory leak, which eventually exhausts the system on repeated create/destroy operations (for example, when the mtu of a vmxnet3 interface is changed frequently. Fix is pretty straight forward, just move the NULL set to after the free. Tested by myself with successful results Applies to net, and should likely be queued for stable, please Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-By: boyang@redhat.com CC: boyang@redhat.com CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: David S. Miller <davem@davemloft.net> Acked-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:55 +01:00
Cong Wang	0e52703d07	tipc: fix a memory leak in tipc_nl_node_get_link() [ Upstream commit `59b36613e8` ] When tipc_node_find_by_name() fails, the nlmsg is not freed. While on it, switch to a goto label to properly free it. Fixes: be9c086715c ("tipc: narrow down exposure of struct tipc_node") Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:55 +01:00
Xin Long	2f056e7def	sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf [ Upstream commit `a0ff660058` ] After commit `cea0cc80a6` ("sctp: use the right sk after waking up from wait_buf sleep"), it may change to lock another sk if the asoc has been peeled off in sctp_wait_for_sndbuf. However, the asoc's new sk could be already closed elsewhere, as it's in the sendmsg context of the old sk that can't avoid the new sk's closing. If the sk's last one refcnt is held by this asoc, later on after putting this asoc, the new sk will be freed, while under it's own lock. This patch is to revert that commit, but fix the old issue by returning error under the old sk's lock. Fixes: `cea0cc80a6` ("sctp: use the right sk after waking up from wait_buf sleep") Reported-by: syzbot+ac6ea7baa4432811eb50@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:55 +01:00
Xin Long	8e3534ea65	sctp: do not allow the v4 socket to bind a v4mapped v6 address [ Upstream commit `c5006b8aa7` ] The check in sctp_sockaddr_af is not robust enough to forbid binding a v4mapped v6 addr on a v4 socket. The worse thing is that v4 socket's bind_verify would not convert this v4mapped v6 addr to a v4 addr. syzbot even reported a crash as the v4 socket bound a v6 addr. This patch is to fix it by doing the common sa.sa_family check first, then AF_INET check for v4mapped v6 addrs. Fixes: `7dab83de50` ("sctp: Support ipv6only AF_INET6 sockets.") Reported-by: syzbot+7b7b518b1228d2743963@syzkaller.appspotmail.com Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:54 +01:00
Francois Romieu	0f51492d1b	r8169: fix memory corruption on retrieval of hardware statistics. [ Upstream commit `a78e93661c` ] Hardware statistics retrieval hurts in tight invocation loops. Avoid extraneous write and enforce strict ordering of writes targeted to the tally counters dump area address registers. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Oliver Freyermuth <o.freyermuth@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:54 +01:00
Guillaume Nault	1bd21b158e	pppoe: take ->needed_headroom of lower device into account on xmit [ Upstream commit `02612bb05e` ] In pppoe_sendmsg(), reserving dev->hard_header_len bytes of headroom was probably fine before the introduction of ->needed_headroom in commit `f5184d267c` ("net: Allow netdevices to specify needed head/tailroom"). But now, virtual devices typically advertise the size of their overhead in dev->needed_headroom, so we must also take it into account in skb_reserve(). Allocation size of skb is also updated to take dev->needed_tailroom into account and replace the arbitrary 32 bytes with the real size of a PPPoE header. This issue was discovered by syzbot, who connected a pppoe socket to a gre device which had dev->header_ops->create == ipgre_header and dev->hard_header_len == 0. Therefore, PPPoE didn't reserve any headroom, and dev_hard_header() crashed when ipgre_header() tried to prepend its header to skb->data. skbuff: skb_under_panic: text:000000001d390b3a len:31 put:24 head:00000000d8ed776f data:000000008150e823 tail:0x7 end:0xc0 dev:gre0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:104! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 3670 Comm: syzkaller801466 Not tainted 4.15.0-rc7-next-20180115+ #97 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:skb_panic+0x162/0x1f0 net/core/skbuff.c:100 RSP: 0018:ffff8801d9bd7840 EFLAGS: 00010282 RAX: 0000000000000083 RBX: ffff8801d4f083c0 RCX: 0000000000000000 RDX: 0000000000000083 RSI: 1ffff1003b37ae92 RDI: ffffed003b37aefc RBP: ffff8801d9bd78a8 R08: 1ffff1003b37ae8a R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff86200de0 R13: ffffffff84a981ad R14: 0000000000000018 R15: ffff8801d2d34180 FS: 00000000019c4880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000208bc000 CR3: 00000001d9111001 CR4: 00000000001606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: skb_under_panic net/core/skbuff.c:114 [inline] skb_push+0xce/0xf0 net/core/skbuff.c:1714 ipgre_header+0x6d/0x4e0 net/ipv4/ip_gre.c:879 dev_hard_header include/linux/netdevice.h:2723 [inline] pppoe_sendmsg+0x58e/0x8b0 drivers/net/ppp/pppoe.c:890 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg+0xca/0x110 net/socket.c:640 sock_write_iter+0x31a/0x5d0 net/socket.c:909 call_write_iter include/linux/fs.h:1775 [inline] do_iter_readv_writev+0x525/0x7f0 fs/read_write.c:653 do_iter_write+0x154/0x540 fs/read_write.c:932 vfs_writev+0x18a/0x340 fs/read_write.c:977 do_writev+0xfc/0x2a0 fs/read_write.c:1012 SYSC_writev fs/read_write.c:1085 [inline] SyS_writev+0x27/0x30 fs/read_write.c:1082 entry_SYSCALL_64_fastpath+0x29/0xa0 Admittedly PPPoE shouldn't be allowed to run on non Ethernet-like interfaces, but reserving space for ->needed_headroom is a more fundamental issue that needs to be addressed first. Same problem exists for __pppoe_xmit(), which also needs to take dev->needed_headroom into account in skb_cow_head(). Fixes: `f5184d267c` ("net: Allow netdevices to specify needed head/tailroom") Reported-by: syzbot+ed0838d0fa4c4f2b528e20286e6dc63effc7c14d@syzkaller.appspotmail.com Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:54 +01:00
Dan Streetman	cf67be7a1a	net: tcp: close sock if net namespace is exiting [ Upstream commit `4ee806d511` ] When a tcp socket is closed, if it detects that its net namespace is exiting, close immediately and do not wait for FIN sequence. For normal sockets, a reference is taken to their net namespace, so it will never exit while the socket is open. However, kernel sockets do not take a reference to their net namespace, so it may begin exiting while the kernel socket is still open. In this case if the kernel socket is a tcp socket, it will stay open trying to complete its close sequence. The sock's dst(s) hold a reference to their interface, which are all transferred to the namespace's loopback interface when the real interfaces are taken down. When the namespace tries to take down its loopback interface, it hangs waiting for all references to the loopback interface to release, which results in messages like: unregister_netdevice: waiting for lo to become free. Usage count = 1 These messages continue until the socket finally times out and closes. Since the net namespace cleanup holds the net_mutex while calling its registered pernet callbacks, any new net namespace initialization is blocked until the current net namespace finishes exiting. After this change, the tcp socket notices the exiting net namespace, and closes immediately, releasing its dst(s) and their reference to the loopback interface, which lets the net namespace continue exiting. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811 Signed-off-by: Dan Streetman <ddstreet@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:54 +01:00
Eric Dumazet	a44d91150f	net: qdisc_pkt_len_init() should be more robust [ Upstream commit `7c68d1a6b4` ] Without proper validation of DODGY packets, we might very well feed qdisc_pkt_len_init() with invalid GSO packets. tcp_hdrlen() might access out-of-bound data, so let's use skb_header_pointer() and proper checks. Whole story is described in commit `d0c081b491` ("flow_dissector: properly cap thoff field") We have the goal of validating DODGY packets earlier in the stack, so we might very well revert this fix in the future. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Jason Wang <jasowang@redhat.com> Reported-by: syzbot+9da69ebac7dddd804552@syzkaller.appspotmail.com Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:54 +01:00
Felix Fietkau	0ae16964f2	net: igmp: fix source address check for IGMPv3 reports [ Upstream commit `ad23b75093` ] Commit "net: igmp: Use correct source address on IGMPv3 reports" introduced a check to validate the source address of locally generated IGMPv3 packets. Instead of checking the local interface address directly, it uses inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the local subnet (or equal to the point-to-point address if used). This breaks for point-to-point interfaces, so check against ifa->ifa_local directly. Cc: Kevin Cernekee <cernekee@chromium.org> Fixes: `a46182b002` ("net: igmp: Use correct source address on IGMPv3 reports") Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:54 +01:00
Yuiko Oshino	283498b4ca	lan78xx: Fix failure in USB Full Speed [ Upstream commit `a5b1379afb` ] Fix initialize the uninitialized tx_qlen to an appropriate value when USB Full Speed is used. Fixes: `55d7de9de6` ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:54 +01:00
Eric Dumazet	c2ceff11b4	ipv6: ip6_make_skb() needs to clear cork.base.dst [ Upstream commit `95ef498d97` ] In my last patch, I missed fact that cork.base.dst was not initialized in ip6_make_skb() : If ip6_setup_cork() returns an error, we might attempt a dst_release() on some random pointer. Fixes: `862c03ee1d` ("ipv6: fix possible mem leaks in ipv6_make_skb()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:54 +01:00
Mike Maloney	fb50d8c916	ipv6: fix udpv6 sendmsg crash caused by too small MTU [ Upstream commit `749439bfac` ] The logic in __ip6_append_data() assumes that the MTU is at least large enough for the headers. A device's MTU may be adjusted after being added while sendmsg() is processing data, resulting in __ip6_append_data() seeing any MTU. For an mtu smaller than the size of the fragmentation header, the math results in a negative 'maxfraglen', which causes problems when refragmenting any previous skb in the skb_write_queue, leaving it possibly malformed. Instead sendmsg returns EINVAL when the mtu is calculated to be less than IPV6_MIN_MTU. Found by syzkaller: kernel BUG at ./include/linux/skbuff.h:2064! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 14216 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 task: ffff8801d0b68580 task.stack: ffff8801ac6b8000 RIP: 0010:__skb_pull include/linux/skbuff.h:2064 [inline] RIP: 0010:__ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: 0018:ffff8801ac6bf570 EFLAGS: 00010216 RAX: 0000000000010000 RBX: 0000000000000028 RCX: ffffc90003cce000 RDX: 00000000000001b8 RSI: ffffffff839df06f RDI: ffff8801d9478ca0 RBP: ffff8801ac6bf780 R08: ffff8801cc3f1dbc R09: 0000000000000000 R10: ffff8801ac6bf7a0 R11: 43cb4b7b1948a9e7 R12: ffff8801cc3f1dc8 R13: ffff8801cc3f1d40 R14: 0000000000001036 R15: dffffc0000000000 FS: 00007f43d740c700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7834984000 CR3: 00000001d79b9000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ip6_finish_skb include/net/ipv6.h:911 [inline] udp_v6_push_pending_frames+0x255/0x390 net/ipv6/udp.c:1093 udpv6_sendmsg+0x280d/0x31a0 net/ipv6/udp.c:1363 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x352/0x5a0 net/socket.c:1750 SyS_sendto+0x40/0x50 net/socket.c:1718 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x4512e9 RSP: 002b:00007f43d740bc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00000000007180a8 RCX: 00000000004512e9 RDX: 000000000000002e RSI: 0000000020d08000 RDI: 0000000000000005 RBP: 0000000000000086 R08: 00000000209c1000 R09: 000000000000001c R10: 0000000000040800 R11: 0000000000000216 R12: 00000000004b9c69 R13: 00000000ffffffff R14: 0000000000000005 R15: 00000000202c2000 Code: 9e 01 fe e9 c5 e8 ff ff e8 7f 9e 01 fe e9 4a ea ff ff 48 89 f7 e8 52 9e 01 fe e9 aa eb ff ff e8 a8 b6 cf fd 0f 0b e8 a1 b6 cf fd <0f> 0b 49 8d 45 78 4d 8d 45 7c 48 89 85 78 fe ff ff 49 8d 85 ba RIP: __skb_pull include/linux/skbuff.h:2064 [inline] RSP: ffff8801ac6bf570 RIP: __ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: ffff8801ac6bf570 Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Mike Maloney <maloney@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:53 +01:00
Ben Hutchings	8b0d3e81cd	ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL [ Upstream commit `e9191ffb65` ] Commit `513674b5a2` ("net: reevalulate autoflowlabel setting after sysctl setting") removed the initialisation of ipv6_pinfo::autoflowlabel and added a second flag to indicate whether this field or the net namespace default should be used. The getsockopt() handling for this case was not updated, so it currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is not explicitly enabled. Fix it to return the effective value, whether that has been set at the socket or net namespace level. Fixes: `513674b5a2` ("net: reevalulate autoflowlabel setting after sysctl ...") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:53 +01:00
Alexey Kodanev	5bb5ae9718	dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state [ Upstream commit `dd5684ecae` ] ccid2_hc_tx_rto_expire() timer callback always restarts the timer again and can run indefinitely (unless it is stopped outside), and after commit `120e9dabaf` ("dccp: defer ccid_hc_tx_delete() at dismantle time"), which moved ccid_hc_tx_delete() (also includes sk_stop_timer()) from dccp_destroy_sock() to sk_destruct(), this started to happen quite often. The timer prevents releasing the socket, as a result, sk_destruct() won't be called. Found with LTP/dccp_ipsec tests running on the bonding device, which later couldn't be unloaded after the tests were completed: unregister_netdevice: waiting for bond0 to become free. Usage count = 148 Fixes: `2a91aa3967` ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:53 +01:00
Greg KH	5f6c581bcb	eventpoll.h: add missing epoll event masks commit `7e04072685` upstream. [resend due to me forgetting to cc: linux-api the first time around I posted these back on Feb 23] From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> For some reason these values are not in the uapi header file, so any libc has to define it themselves. To prevent them from needing to do this, just have the kernel provide the correct values. Reported-by: Elliott Hughes <enh@google.com> Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:53 +01:00
Ben Hutchings	9a0be5afbf	vsyscall: Fix permissions for emulate mode with KAISER/PTI The backport of KAISER to 4.4 turned vsyscall emulate mode into native mode. Add a vsyscall_pgprot variable to hold the correct page protections, like Borislav and Hugh did for 3.2 and 3.18. Cc: Borislav Petkov <bp@suse.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:53 +01:00
Thomas Meyer	1be7d46e77	um: link vmlinux with -no-pie commit `883354afbc` upstream. Debian's gcc defaults to pie. The global Makefile already defines the -fno-pie option. Link UML dynamic kernel image also with -no-pie to fix the build. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Richard Weinberger <richard@nod.at> Cc: Bernie Innocenti <codewiz@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:53 +01:00
Martin Brandenburg	d680db7225	orangefs: fix deadlock; do not write i_size in read_iter commit `6793f1c450` upstream. After do_readv_writev, the inode cache is invalidated anyway, so i_size will never be read. It will be fetched from the server which will also know about updates from other machines. Fixes deadlock on 32-bit SMP. See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2 Signed-off-by: Martin Brandenburg <martin@omnibond.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:53 +01:00
Aaron Ma	42f0aba58e	Input: trackpoint - force 3 buttons if 0 button is reported commit `f5d07b9e98` upstream. Lenovo introduced trackpoint compatible sticks with minimum PS/2 commands. They supposed to reply with 0x02, 0x03, or 0x04 in response to the "Read Extended ID" command, so we would know not to try certain extended commands. Unfortunately even some trackpoints reporting the original IBM version (0x01 firmware 0x0e) now respond with incorrect data to the "Get Extended Buttons" command: thinkpad_acpi: ThinkPad BIOS R0DET87W (1.87 ), EC unknown thinkpad_acpi: Lenovo ThinkPad E470, model 20H1004SGE psmouse serio2: trackpoint: IBM TrackPoint firmware: 0x0e, buttons: 0/0 Since there are no trackpoints without buttons, let's assume the trackpoint has 3 buttons when we get 0 response to the extended buttons query. Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196253 Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:53 +01:00
Johannes Weiner	19a7db1e2e	mm: fix 100% CPU kswapd busyloop on unreclaimable nodes commit `c73322d098` upstream. Patch series "mm: kswapd spinning on unreclaimable nodes - fixes and cleanups". Jia reported a scenario in which the kswapd of a node indefinitely spins at 100% CPU usage. We have seen similar cases at Facebook. The kernel's current method of judging its ability to reclaim a node (or whether to back off and sleep) is based on the amount of scanned pages in proportion to the amount of reclaimable pages. In Jia's and our scenarios, there are no reclaimable pages in the node, however, and the condition for backing off is never met. Kswapd busyloops in an attempt to restore the watermarks while having nothing to work with. This series reworks the definition of an unreclaimable node based not on scanning but on whether kswapd is able to actually reclaim pages in MAX_RECLAIM_RETRIES (16) consecutive runs. This is the same criteria the page allocator uses for giving up on direct reclaim and invoking the OOM killer. If it cannot free any pages, kswapd will go to sleep and leave further attempts to direct reclaim invocations, which will either make progress and re-enable kswapd, or invoke the OOM killer. Patch #1 fixes the immediate problem Jia reported, the remainder are smaller fixlets, cleanups, and overall phasing out of the old method. Patch #6 is the odd one out. It's a nice cleanup to get_scan_count(), and directly related to #5, but in itself not relevant to the series. If the whole series is too ambitious for 4.11, I would consider the first three patches fixes, the rest cleanups. This patch (of 9): Jia He reports a problem with kswapd spinning at 100% CPU when requesting more hugepages than memory available in the system: $ echo 4000 >/proc/sys/vm/nr_hugepages top - 13:42:59 up 3:37, 1 user, load average: 1.09, 1.03, 1.01 Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie %Cpu(s): 0.0 us, 12.5 sy, 0.0 ni, 85.5 id, 2.0 wa, 0.0 hi, 0.0 si, 0.0 st KiB Mem: 31371520 total, 30915136 used, 456384 free, 320 buffers KiB Swap: 6284224 total, 115712 used, 6168512 free. 48192 cached Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 76 root 20 0 0 0 0 R 100.0 0.000 217:17.29 kswapd3 At that time, there are no reclaimable pages left in the node, but as kswapd fails to restore the high watermarks it refuses to go to sleep. Kswapd needs to back away from nodes that fail to balance. Up until commit `1d82de618d` ("mm, vmscan: make kswapd reclaim in terms of nodes") kswapd had such a mechanism. It considered zones whose theoretically reclaimable pages it had reclaimed six times over as unreclaimable and backed away from them. This guard was erroneously removed as the patch changed the definition of a balanced node. However, simply restoring this code wouldn't help in the case reported here: there are no reclaimable pages that could be scanned until the threshold is met. Kswapd would stay awake anyway. Introduce a new and much simpler way of backing off. If kswapd runs through MAX_RECLAIM_RETRIES (16) cycles without reclaiming a single page, make it back off from the node. This is the same number of shots direct reclaim takes before declaring OOM. Kswapd will go to sleep on that node until a direct reclaimer manages to reclaim some pages, thus proving the node reclaimable again. [hannes@cmpxchg.org: check kswapd failure against the cumulative nr_reclaimed count] Link: http://lkml.kernel.org/r/20170306162410.GB2090@cmpxchg.org [shakeelb@google.com: fix condition for throttle_direct_reclaim] Link: http://lkml.kernel.org/r/20170314183228.20152-1-shakeelb@google.com Link: http://lkml.kernel.org/r/20170228214007.5621-2-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Shakeel Butt <shakeelb@google.com> Reported-by: Jia He <hejianet@gmail.com> Tested-by: Jia He <hejianet@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dmitry Shmidt <dimitrysh@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:53 +01:00
Greg Kroah-Hartman	e62b0c661f	Revert "module: Add retpoline tag to VERMAGIC" commit `5132ede0fe` upstream. This reverts commit `6cfb521ac0`. Turns out distros do not want to make retpoline as part of their "ABI", so this patch should not have been merged. Sorry Andi, this was my fault, I suggested it when your original patch was the "correct" way of doing this instead. Reported-by: Jiri Kosina <jikos@kernel.org> Fixes: `6cfb521ac0` ("module: Add retpoline tag to VERMAGIC") Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: rusty@rustcorp.com.au Cc: arjan.van.de.ven@intel.com Cc: jeyu@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:52 +01:00
Johannes Thumshirn	c41bb027ed	scsi: libiscsi: fix shifting of DID_REQUEUE host byte commit `eef9ffdf9c` upstream. The SCSI host byte should be shifted left by 16 in order to have scsi_decide_disposition() do the right thing (.i.e. requeue the command). Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: `661134ad37` ("[SCSI] libiscsi, bnx2i: make bound ep check common") Cc: Lee Duncan <lduncan@suse.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Bart Van Assche <Bart.VanAssche@sandisk.com> Cc: Chris Leech <cleech@redhat.com> Acked-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:52 +01:00
Jiri Slaby	7b50205cf8	fs/fcntl: f_setown, avoid undefined behaviour commit `fc3dc67471` upstream. fcntl(0, F_SETOWN, 0x80000000) triggers: UBSAN: Undefined behaviour in fs/fcntl.c:118:7 negation of -2147483648 cannot be represented in type 'int': CPU: 1 PID: 18261 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1 ... Call Trace: ... [<ffffffffad8f0868>] ? f_setown+0x1d8/0x200 [<ffffffffad8f19a9>] ? SyS_fcntl+0x999/0xf30 [<ffffffffaed1fb00>] ? entry_SYSCALL_64_fastpath+0x23/0xc1 Fix that by checking the arg parameter properly (against INT_MAX) before "who = -who". And return immediatelly with -EINVAL in case it is wrong. Note that according to POSIX we can return EINVAL: http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html [EINVAL] The cmd argument is F_SETOWN and the value of the argument is not valid as a process or process group identifier. [v2] returns an error, v1 used to fail silently [v3] implement proper check for the bad value INT_MIN Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:52 +01:00
Jeff Mahoney	0ccfbd4d6f	reiserfs: don't preallocate blocks for extended attributes commit `54930dfeb4` upstream. Most extended attributes will fit in a single block. More importantly, we drop the reference to the inode while holding the transaction open so the preallocated blocks aren't released. As a result, the inode may be evicted before it's removed from the transaction's prealloc list which can cause memory corruption. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:52 +01:00
Jeff Mahoney	b7d25282b7	reiserfs: fix race in prealloc discard commit `08db141b53` upstream. The main loop in __discard_prealloc is protected by the reiserfs write lock which is dropped across schedules like the BKL it replaced. The problem is that it checks the value, calls a routine that schedules, and then adjusts the state. As a result, two threads that are calling reiserfs_prealloc_discard at the same time can race when one calls reiserfs_free_prealloc_block, the lock is dropped, and the other calls reiserfs_free_prealloc_block with the same block number. In the right circumstances, it can cause the prealloc count to go negative. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:52 +01:00
Kevin Cernekee	898eeca02a	netfilter: xt_osf: Add missing permission checks commit `916a27901d` upstream. The capability check in nfnetlink_rcv() verifies that the caller has CAP_NET_ADMIN in the namespace that "owns" the netlink socket. However, xt_osf_fingers is shared by all net namespaces on the system. An unprivileged user can create user and net namespaces in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable() check: vpnns -- nfnl_osf -f /tmp/pf.os vpnns -- nfnl_osf -f /tmp/pf.os -d These non-root operations successfully modify the systemwide OS fingerprint list. Add new capable() checks so that they can't. Signed-off-by: Kevin Cernekee <cernekee@chromium.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:52 +01:00
Kevin Cernekee	2c3184ea80	netfilter: nfnetlink_cthelper: Add missing permission checks commit `4b380c42f7` upstream. The capability check in nfnetlink_rcv() verifies that the caller has CAP_NET_ADMIN in the namespace that "owns" the netlink socket. However, nfnl_cthelper_list is shared by all net namespaces on the system. An unprivileged user can create user and net namespaces in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable() check: $ nfct helper list nfct v1.4.4: netlink error: Operation not permitted $ vpnns -- nfct helper list { .name = ftp, .queuenum = 0, .l3protonum = 2, .l4protonum = 6, .priv_data_len = 24, .status = enabled, }; Add capable() checks in nfnetlink_cthelper, as this is cleaner than trying to generalize the solution. Signed-off-by: Kevin Cernekee <cernekee@chromium.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:52 +01:00
Seunghun Han	2915f16bdc	ACPICA: Namespace: fix operand cache leak commit `3b2d69114f` upstream. ACPICA commit a23325b2e583556eae88ed3f764e457786bf4df6 I found some ACPI operand cache leaks in ACPI early abort cases. Boot log of ACPI operand cache leak is as follows: >[ 0.174332] ACPI: Added _OSI(Module Device) >[ 0.175504] ACPI: Added _OSI(Processor Device) >[ 0.176010] ACPI: Added _OSI(3.0 _SCP Extensions) >[ 0.177032] ACPI: Added _OSI(Processor Aggregator Device) >[ 0.178284] ACPI: SCI (IRQ16705) allocation failed >[ 0.179352] ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20160930/evevent-131) >[ 0.180008] ACPI: Unable to start the ACPI Interpreter >[ 0.181125] ACPI Error: Could not remove SCI handler (20160930/evmisc-281) >[ 0.184068] kmem_cache_destroy Acpi-Operand: Slab cache still has objects >[ 0.185358] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3 #2 >[ 0.186820] Hardware name: innotek gmb_h virtual_box/virtual_box, BIOS virtual_box 12/01/2006 >[ 0.188000] Call Trace: >[ 0.188000] ? dump_stack+0x5c/0x7d >[ 0.188000] ? kmem_cache_destroy+0x224/0x230 >[ 0.188000] ? acpi_sleep_proc_init+0x22/0x22 >[ 0.188000] ? acpi_os_delete_cache+0xa/0xd >[ 0.188000] ? acpi_ut_delete_caches+0x3f/0x7b >[ 0.188000] ? acpi_terminate+0x5/0xf >[ 0.188000] ? acpi_init+0x288/0x32e >[ 0.188000] ? __class_create+0x4c/0x80 >[ 0.188000] ? video_setup+0x7a/0x7a >[ 0.188000] ? do_one_initcall+0x4e/0x1b0 >[ 0.188000] ? kernel_init_freeable+0x194/0x21a >[ 0.188000] ? rest_init+0x80/0x80 >[ 0.188000] ? kernel_init+0xa/0x100 >[ 0.188000] ? ret_from_fork+0x25/0x30 When early abort is occurred due to invalid ACPI information, Linux kernel terminates ACPI by calling acpi_terminate() function. The function calls acpi_ns_terminate() function to delete namespace data and ACPI operand cache (acpi_gbl_module_code_list). But the deletion code in acpi_ns_terminate() function is wrapped in ACPI_EXEC_APP definition, therefore the code is only executed when the definition exists. If the define doesn't exist, ACPI operand cache (acpi_gbl_module_code_list) is leaked, and stack dump is shown in kernel log. This causes a security threat because the old kernel (<= 4.9) shows memory locations of kernel functions in stack dump, therefore kernel ASLR can be neutralized. To fix ACPI operand leak for enhancing security, I made a patch which removes the ACPI_EXEC_APP define in acpi_ns_terminate() function for executing the deletion code unconditionally. Link: https://github.com/acpica/acpica/commit/a23325b2 Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Lee, Chun-Yi <jlee@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:52 +01:00
Rafael J. Wysocki	3a53accd9c	ACPI / scan: Prefer devices without _HID/_CID for _ADR matching commit `c2a6bbaf0c` upstream. The way acpi_find_child_device() works currently is that, if there are two (or more) devices with the same _ADR value in the same namespace scope (which is not specifically allowed by the spec and the OS behavior in that case is not defined), the first one of them found to be present (with the help of _STA) will be returned. This covers the majority of cases, but is not sufficient if some of the devices in question have a _HID (or _CID) returning some valid ACPI/PNP device IDs (which is disallowed by the spec) and the ASL writers' expectation appears to be that the OS will match devices without a valid ACPI/PNP device ID against a given bus address first. To cover this special case as well, modify find_child_checks() to prefer devices without ACPI/PNP device IDs over devices that have them. Suggested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:52 +01:00
Jiri Slaby	542cde0e3c	ipc: msg, make msgrcv work with LONG_MIN commit `999898355e` upstream. When LONG_MIN is passed to msgrcv, one would expect to recieve any message. But convert_mode does msgtyp = -msgtyp and -LONG_MIN is undefined. In particular, with my gcc -LONG_MIN produces -LONG_MIN again. So handle this case properly by assigning LONG_MAX to *msgtyp if LONG_MIN was specified as msgtyp to msgrcv. This code: long msg[] = { 100, 200 }; int m = msgget(IPC_PRIVATE, IPC_CREAT \| 0644); msgsnd(m, &msg, sizeof(msg), 0); msgrcv(m, &msg, sizeof(msg), LONG_MIN, 0); produces currently nothing: msgget(IPC_PRIVATE, IPC_CREAT\|0644) = 65538 msgsnd(65538, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0 msgrcv(65538, ... Except a UBSAN warning: UBSAN: Undefined behaviour in ipc/msg.c:745:13 negation of -9223372036854775808 cannot be represented in type 'long int': With the patch, I see what I expect: msgget(IPC_PRIVATE, IPC_CREAT\|0644) = 0 msgsnd(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0 msgrcv(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, -9223372036854775808, 0) = 16 Link: http://lkml.kernel.org/r/20161024082633.10148-1-jslaby@suse.cz Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:51 +01:00
Vlastimil Babka	685cce58f1	mm, page_alloc: fix potential false positive in __zone_watermark_ok commit `b050e3769c` upstream. Since commit `97a16fc82a` ("mm, page_alloc: only enforce watermarks for order-0 allocations"), __zone_watermark_ok() check for high-order allocations will shortcut per-migratetype free list checks for ALLOC_HARDER allocations, and return true as long as there's free page of any migratetype. The intention is that ALLOC_HARDER can allocate from MIGRATE_HIGHATOMIC free lists, while normal allocations can't. However, as a side effect, the watermark check will then also return true when there are pages only on the MIGRATE_ISOLATE list, or (prior to CMA conversion to ZONE_MOVABLE) on the MIGRATE_CMA list. Since the allocation cannot actually obtain isolated pages, and might not be able to obtain CMA pages, this can result in a false positive. The condition should be rare and perhaps the outcome is not a fatal one. Still, it's better if the watermark check is correct. There also shouldn't be a performance tradeoff here. Link: http://lkml.kernel.org/r/20171102125001.23708-1-vbabka@suse.cz Fixes: `97a16fc82a` ("mm, page_alloc: only enforce watermarks for order-0 allocations") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Rik van Riel <riel@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:51 +01:00
Doug Berger	714c19ef57	cma: fix calculation of aligned offset commit `e048cb32f6` upstream. The align_offset parameter is used by bitmap_find_next_zero_area_off() to represent the offset of map's base from the previous alignment boundary; the function ensures that the returned index, plus the align_offset, honors the specified align_mask. The logic introduced by commit `b5be83e308` ("mm: cma: align to physical address, not CMA region position") has the cma driver calculate the offset to the next alignment boundary. In most cases, the base alignment is greater than that specified when making allocations, resulting in a zero offset whether we align up or down. In the example given with the commit, the base alignment (8MB) was half the requested alignment (16MB) so the math also happened to work since the offset is 8MB in both directions. However, when requesting allocations with an alignment greater than twice that of the base, the returned index would not be correctly aligned. Also, the align_order arguments of cma_bitmap_aligned_mask() and cma_bitmap_aligned_offset() should not be negative so the argument type was made unsigned. Fixes: `b5be83e308` ("mm: cma: align to physical address, not CMA region position") Link: http://lkml.kernel.org/r/20170628170742.2895-1-opendmb@gmail.com Signed-off-by: Angus Clark <angus@angusclark.org> Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Gregory Fong <gregory.0xf0@gmail.com> Cc: Doug Berger <opendmb@gmail.com> Cc: Angus Clark <angus@angusclark.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Shiraz Hashim <shashim@codeaurora.org> Cc: Jaewon Kim <jaewon31.kim@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:51 +01:00
Michal Hocko	bc0e2174b0	hwpoison, memcg: forcibly uncharge LRU pages commit `18365225f0` upstream. Laurent Dufour has noticed that hwpoinsoned pages are kept charged. In his particular case he has hit a bad_page("page still charged to cgroup") when onlining a hwpoison page. While this looks like something that shouldn't happen in the first place because onlining hwpages and returning them to the page allocator makes only little sense it shows a real problem. hwpoison pages do not get freed usually so we do not uncharge them (at least not since commit `0a31bc97c8` ("mm: memcontrol: rewrite uncharge API")). Each charge pins memcg (since `e8ea14cc6e` ("mm: memcontrol: take a css reference for each charged page")) as well and so the mem_cgroup and the associated state will never go away. Fix this leak by forcibly uncharging a LRU hwpoisoned page in delete_from_lru_cache(). We also have to tweak uncharge_list because it cannot rely on zero ref count for these pages. [akpm@linux-foundation.org: coding-style fixes] Fixes: `0a31bc97c8` ("mm: memcontrol: rewrite uncharge API") Link: http://lkml.kernel.org/r/20170502185507.GB19165@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:51 +01:00
Michal Hocko	c57664bd12	mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack commit `561b5e0709` upstream. Commit `1be7107fbe` ("mm: larger stack guard gap, between vmas") has introduced a regression in some rust and Java environments which are trying to implement their own stack guard page. They are punching a new MAP_FIXED mapping inside the existing stack Vma. This will confuse expand_{downwards,upwards} into thinking that the stack expansion would in fact get us too close to an existing non-stack vma which is a correct behavior wrt safety. It is a real regression on the other hand. Let's work around the problem by considering PROT_NONE mapping as a part of the stack. This is a gros hack but overflowing to such a mapping would trap anyway an we only can hope that usespace knows what it is doing and handle it propely. Fixes: `1be7107fbe` ("mm: larger stack guard gap, between vmas") Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Debugged-by: Vlastimil Babka <vbabka@suse.cz> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Willy Tarreau <w@1wt.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:51 +01:00
Sudeep Holla	1d8c402e0c	drivers: base: cacheinfo: fix boot error message when acpi is enabled commit `55877ef45f` upstream. ARM64 enables both CONFIG_OF and CONFIG_ACPI and the firmware can pass both ACPI tables and the device tree. Based on the kernel parameter, one of the two will be chosen. If acpi is enabled, then device tree is not unflattened. Currently ARM64 platforms report: " Failed to find cpu0 device node Unable to detect cache hierarchy from DT for CPU 0 " which is incorrect when booting with ACPI. Also latest ACPI v6.1 has no support for cache properties/hierarchy. This patch adds check for unflattened device tree and also returns as "not supported" if ACPI is runtime enabled. It also removes the reference to DT from the error message as the cache hierarchy can be detected from the firmware(OF/DT/ACPI) Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:51 +01:00
Sudeep Holla	f5aaa5a283	drivers: base: cacheinfo: fix x86 with CONFIG_OF enabled commit `fac5148257` upstream. With CONFIG_OF enabled on x86, we get the following error on boot: " Failed to find cpu0 device node Unable to detect cache hierarchy from DT for CPU 0 " and the cacheinfo fails to get populated in the corresponding sysfs entries. This is because cache_setup_of_node looks for of_node for setting up the shared cpu_map without checking that it's already populated in the architecture specific callback. In order to indicate that the shared cpu_map is already populated, this patch introduces a boolean `cpu_map_populated` in struct cpu_cacheinfo that can be used by the generic code to skip cache_shared_cpu_map_setup. This patch also sets that boolean for x86. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:51 +01:00
Janakarajan Natarajan	318e17d09c	Prevent timer value 0 for MWAITX commit `88d879d29f` upstream. Newer hardware has uncovered a bug in the software implementation of using MWAITX for the delay function. A value of 0 for the timer is meant to indicate that a timeout will not be used to exit MWAITX. On newer hardware this can result in MWAITX never returning, resulting in NMI soft lockup messages being printed. On older hardware, some of the other conditions under which MWAITX can exit masked this issue. The AMD APM does not currently document this and will be updated. Please refer to http://marc.info/?l=kvm&m=148950623231140 for information regarding NMI soft lockup messages on an AMD Ryzen 1800X. This has been root-caused as a 0 passed to MWAITX causing it to wait indefinitely. This change has the added benefit of avoiding the unnecessary setup of MONITORX/MWAITX when the delay value is zero. Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Link: http://lkml.kernel.org/r/1493156643-29366-1-git-send-email-Janakarajan.Natarajan@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:51 +01:00
Punit Agrawal	45ee9d5e97	KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2 commit `c507babf10` upstream. KVM only supports PMD hugepages at stage 2 but doesn't actually check that the provided hugepage memory pagesize is PMD_SIZE before populating stage 2 entries. In cases where the backing hugepage size is smaller than PMD_SIZE (such as when using contiguous hugepages), KVM can end up creating stage 2 mappings that extend beyond the supplied memory. Fix this by checking for the pagesize of userspace vma before creating PMD hugepage at stage 2. Fixes: `66b3923a1a` ("arm64: hugetlb: add support for PTE contiguous bit") Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:51 +01:00
Marc Kleine-Budde	41e4aa17bc	can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once commit `d468984688` upstream. If an invalid CANFD frame is received, from a driver or from a tun interface, a Kernel warning is generated. This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a kernel, bootet with panic_on_warn, does not panic. A printk seems to be more appropriate here. Reported-by: syzbot+e3b775f40babeff6e68b@syzkaller.appspotmail.com Suggested-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:50 +01:00
Marc Kleine-Budde	40bf2c0c1c	can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once commit `8cb68751c1` upstream. If an invalid CAN frame is received, from a driver or from a tun interface, a Kernel warning is generated. This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a kernel, bootet with panic_on_warn, does not panic. A printk seems to be more appropriate here. Reported-by: syzbot+4386709c0c1284dca827@syzkaller.appspotmail.com Suggested-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:50 +01:00
Jonathan Dieter	69e78e7214	usbip: Fix potential format overflow in userspace tools commit `e5dfa3f902` upstream. The usbip userspace tools call sprintf()/snprintf() and don't check for the return value which can lead the paths to overflow, truncating the final file in the path. More urgently, GCC 7 now warns that these aren't checked with -Wformat-overflow, and with -Werror enabled in configure.ac, that makes these tools unbuildable. This patch fixes these problems by replacing sprintf() with snprintf() in one place and adding checks for the return value of snprintf(). Reviewed-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: Jonathan Dieter <jdieter@lesbg.com> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:50 +01:00
Jonathan Dieter	853c39f239	usbip: Fix implicit fallthrough warning commit `cfd6ed4537` upstream. GCC 7 now warns when switch statements fall through implicitly, and with -Werror enabled in configure.ac, that makes these tools unbuildable. We fix this by notifying the compiler that this particular case statement is meant to fall through. Reviewed-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: Jonathan Dieter <jdieter@lesbg.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:50 +01:00
Shuah Khan	ce601a07bc	usbip: prevent vhci_hcd driver from leaking a socket pointer address commit `2f2d0088eb` upstream. When a client has a USB device attached over IP, the vhci_hcd driver is locally leaking a socket pointer address via the /sys/devices/platform/vhci_hcd/status file (world-readable) and in debug output when "usbip --debug port" is run. Fix it to not leak. The socket pointer address is not used at the moment and it was made visible as a convenient way to find IP address from socket pointer address by looking up /proc/net/{tcp,tcp6}. As this opens a security hole, the fix replaces socket pointer address with sockfd. Reported-by: Secunia Research <vuln@secunia.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:50 +01:00
Martin Brandenburg	5c26ee198f	orangefs: initialize op on loop restart in orangefs_devreq_read commit `a0ec1ded22` upstream. In orangefs_devreq_read, there is a loop which picks an op off the list of pending ops. If the loop fails to find an op, there is nothing to read, and it returns EAGAIN. If the op has been given up on, the loop is restarted via a goto. The bug is that the variable which the found op is written to is not reinitialized, so if there are no more eligible ops on the list, the code runs again on the already handled op. This is triggered by interrupting a process while the op is being copied to the client-core. It's a fairly small window, but it's there. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:50 +01:00
Martin Brandenburg	fb39345e73	orangefs: use list_for_each_entry_safe in purge_waiting_ops commit `0afc0decf2` upstream. set_op_state_purged can delete the op. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:50 +01:00
Andy Lutomirski	c36c940cd4	x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels commit `1c52d859cb` upstream. We support various non-Intel CPUs that don't have the CPUID instruction, so the M486 test was wrong. For now, fix it with a big hammer: handle missing CPUID on all 32-bit CPUs. Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Matthew Whitehead <tedheadster@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: xen-devel <Xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/685bd083a7c036f7769510b6846315b17d6ba71f.1481307769.git.luto@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "Zhang, Ning A" <ning.a.zhang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-31 12:55:50 +01:00
Greg Kroah-Hartman	79584a4221	Linux 4.9.78	2018-01-23 19:57:10 +01:00
Jonas Gorski	60249fe905	MIPS: AR7: ensure the port type's FCR value is used commit `0a5191efe0` upstream. Since commit `aef9a7bd9b` ("serial/uart/8250: Add tunable RX interrupt trigger I/F of FIFO buffers"), the port's default FCR value isn't used in serial8250_do_set_termios anymore, but copied over once in serial8250_config_port and then modified as needed. Unfortunately, serial8250_config_port will never be called if the port is shared between kernel and userspace, and the port's flag doesn't have UPF_BOOT_AUTOCONF, which would trigger a serial8250_config_port as well. This causes garbled output from userspace: [ 5.220000] random: procd urandom read with 49 bits of entropy available ers [kee Fix this by forcing it to be configured on boot, resulting in the expected output: [ 5.250000] random: procd urandom read with 50 bits of entropy available Press the [f] key and hit [enter] to enter failsafe mode Press the [1], [2], [3] or [4] key and hit [enter] to select the debug level Fixes: `aef9a7bd9b` ("serial/uart/8250: Add tunable RX interrupt trigger I/F of FIFO buffers") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Nicolas Schichan <nschichan@freebox.fr> Cc: linux-mips@linux-mips.org Cc: linux-serial@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/17544/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:10 +01:00
Andi Kleen	06d7342d84	x86/retpoline: Optimize inline assembler for vmexit_fill_RSB commit `3f7d875566` upstream. The generated assembler for the C fill RSB inline asm operations has several issues: - The C code sets up the loop register, which is then immediately overwritten in __FILL_RETURN_BUFFER with the same value again. - The C code also passes in the iteration count in another register, which is not used at all. Remove these two unnecessary operations. Just rely on the single constant passed to the macro for the iterations. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: dave.hansen@intel.com Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: arjan@linux.intel.com Link: https://lkml.kernel.org/r/20180117225328.15414-1-andi@firstfloor.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:10 +01:00
zhenwei.pi	b9f8b59353	x86/pti: Document fix wrong index commit `98f0fceec7` upstream. In section <2. Runtime Cost>, fix wrong index. Signed-off-by: zhenwei.pi <zhenwei.pi@youruncloud.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: dave.hansen@linux.intel.com Link: https://lkml.kernel.org/r/1516237492-27739-1-git-send-email-zhenwei.pi@youruncloud.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:10 +01:00
Masami Hiramatsu	4b71be4966	kprobes/x86: Disable optimizing on the function jumps to indirect thunk commit `c86a32c09f` upstream. Since indirect jump instructions will be replaced by jump to __x86_indirect_thunk_, those jmp instruction must be treated as an indirect jump. Since optprobe prohibits to optimize probes in the function which uses an indirect jump, it also needs to find out the function which jump to __x86_indirect_thunk_ and disable optimization. Add a check that the jump target address is between the __indirect_thunk_start/end when optimizing kprobe. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Link: https://lkml.kernel.org/r/151629212062.10241.6991266100233002273.stgit@devbox Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:09 +01:00
Masami Hiramatsu	36ad6ba501	kprobes/x86: Blacklist indirect thunk functions for kprobes commit `c1804a2368` upstream. Mark __x86_indirect_thunk_* functions as blacklist for kprobes because those functions can be called from anywhere in the kernel including blacklist functions of kprobes. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Link: https://lkml.kernel.org/r/151629209111.10241.5444852823378068683.stgit@devbox Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:09 +01:00
Masami Hiramatsu	09402d8339	retpoline: Introduce start/end markers of indirect thunk commit `736e80a421` upstream. Introduce start/end markers of __x86_indirect_thunk_* functions. To make it easy, consolidate .text.__x86.indirect_thunk.* sections to one .text.__x86.indirect_thunk section and put it in the end of kernel text section and adds __indirect_thunk_start/end so that other subsystem (e.g. kprobes) can identify it. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Link: https://lkml.kernel.org/r/151629206178.10241.6828804696410044771.stgit@devbox Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:09 +01:00
Thomas Gleixner	c5aa687060	x86/mce: Make machine check speculation protected commit `6f41c34d69` upstream. The machine check idtentry uses an indirect branch directly from the low level code. This evades the speculation protection. Replace it by a direct call into C code and issue the indirect call there so the compiler can apply the proper speculation protection. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by:Borislav Petkov <bp@alien8.de> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Niced-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801181626290.1847@nanos Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:09 +01:00
Shuah Khan	87ac29717d	usbip: fix warning in vhci_hcd_probe/lockdep_init_map commit `918b8ac55b` upstream. vhci_hcd calls sysfs_create_group() with dynamically allocated sysfs attributes triggering the lock-class key not persistent warning. Call sysfs_attr_init() for dynamically allocated sysfs attributes to fix it. vhci_hcd vhci_hcd: USB/IP Virtual Host Controller vhci_hcd vhci_hcd: new USB bus registered, assigned bus number 2 BUG: key ffff88006a7e8d18 not in .data! ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3131 lockdep_init_map+0x60c/0x770 DEBUG_LOCKS_WARN_ON(1)[ 1.567044] Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc7+ #58 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffff88006bce6eb8 ffffffff81f96c8a ffffffff00000a02 1ffff1000d79cd6a ffffed000d79cd62 000000046bce6ed8 0000000041b58ab3 ffffffff8598af40 ffffffff81f969f8 0000000000000000 0000000041b58ab3 0000000000000200 Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [<ffffffff81f96c8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51 [<ffffffff812b808f>] __warn+0x19f/0x1e0 kernel/panic.c:550 [<ffffffff812b8195>] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565 [<ffffffff813f3efc>] lockdep_init_map+0x60c/0x770 kernel/locking/lockdep.c:3131 [<ffffffff819e43d4>] __kernfs_create_file+0x114/0x2a0 fs/kernfs/file.c:954 [<ffffffff819e68f5>] sysfs_add_file_mode_ns+0x225/0x520 fs/sysfs/file.c:305 [< inline >] create_files fs/sysfs/group.c:64 [<ffffffff819e8a89>] internal_create_group+0x239/0x8f0 fs/sysfs/group.c:134 [<ffffffff819e915f>] sysfs_create_group+0x1f/0x30 fs/sysfs/group.c:156 [<ffffffff8323de24>] vhci_start+0x5b4/0x7a0 drivers/usb/usbip/vhci_hcd.c:978 [<ffffffff82c907ca>] usb_add_hcd+0x8da/0x1c60 drivers/usb/core/hcd.c:2867 [<ffffffff8323bc57>] vhci_hcd_probe+0x97/0x130 drivers/usb/usbip/vhci_hcd.c:1103 --- --- ---[ end trace c33c7b202cf3aac8 ]--- Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:09 +01:00
Tom Lendacky	0d92cf7f29	x86/cpu, x86/pti: Do not enable PTI on AMD processors commit `694d99d409` upstream. AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault. Disable page table isolation by default on AMD processors by not setting the X86_BUG_CPU_INSECURE feature, which controls whether X86_FEATURE_PTI is set. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20171227054354.20369.94587.stgit@tlendack-t1.amdoffice.net Cc: Nick Lowe <nick.lowe@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:09 +01:00
Marc Zyngier	ddfaa7acd7	arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls commit `acfb3b883f` upstream. KVM doesn't follow the SMCCC when it comes to unimplemented calls, and inject an UNDEF instead of returning an error. Since firmware calls are now used for security mitigation, they are becoming more common, and the undef is counter productive. Instead, let's follow the SMCCC which states that -1 must be returned to the caller when getting an unknown function number. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:09 +01:00
Dennis Yang	2904adc5b1	dm thin metadata: THIN_MAX_CONCURRENT_LOCKS should be 6 commit `490ae017f5` upstream. For btree removal, there is a corner case that a single thread could takes 6 locks which is more than THIN_MAX_CONCURRENT_LOCKS(5) and leads to deadlock. A btree removal might eventually call rebalance_children()->rebalance3() to rebalance entries of three neighbor child nodes when shadow_spine has already acquired two write locks. In rebalance3(), it tries to shadow and acquire the write locks of all three child nodes. However, shadowing a child node requires acquiring a read lock of the original child node and a write lock of the new block. Although the read lock will be released after block shadowing, shadowing the third child node in rebalance3() could still take the sixth lock. (2 write locks for shadow_spine + 2 write locks for the first two child nodes's shadow + 1 write lock for the last child node's shadow + 1 read lock for the last child node) Signed-off-by: Dennis Yang <dennisyang@qnap.com> Acked-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:09 +01:00
Joe Thornber	cabf6294a6	dm btree: fix serious bug in btree_split_beneath() commit `bc68d0a435` upstream. When inserting a new key/value pair into a btree we walk down the spine of btree nodes performing the following 2 operations: i) space for a new entry ii) adjusting the first key entry if the new key is lower than any in the node. If the _root_ node is full, the function btree_split_beneath() allocates 2 new nodes, and redistibutes the root nodes entries between them. The root node is left with 2 entries corresponding to the 2 new nodes. btree_split_beneath() then adjusts the spine to point to one of the two new children. This means the first key is never adjusted if the new key was lower, ie. operation (ii) gets missed out. This can result in the new key being 'lost' for a period; until another low valued key is inserted that will uncover it. This is a serious bug, and quite hard to make trigger in normal use. A reproducing test case ("thin create devices-in-reverse-order") is available as part of the thin-provision-tools project: https://github.com/jthornber/thin-provisioning-tools/blob/master/functional-tests/device-mapper/dm-tests.scm#L593 Fix the issue by changing btree_split_beneath() so it no longer adjusts the spine. Instead it unlocks both the new nodes, and lets the main loop in btree_insert_raw() relock the appropriate one and make any neccessary adjustments. Reported-by: Monty Pavel <monty_pavel@sina.com> Signed-off-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:08 +01:00
Sergey Senozhatsky	ca2d736867	workqueue: avoid hard lockups in show_workqueue_state() commit `62635ea8c1` upstream. show_workqueue_state() can print out a lot of messages while being in atomic context, e.g. sysrq-t -> show_workqueue_state(). If the console device is slow it may end up triggering NMI hard lockup watchdog. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:08 +01:00
Xinyu Lin	d314f3bc7f	libata: apply MAX_SEC_1024 to all LITEON EP1 series devices commit `db5ff90979` upstream. LITEON EP1 has the same timeout issues as CX1 series devices. Revert max_sectors to the value of 1024. Fixes: `e0edc8c546` ("libata: apply MAX_SEC_1024 to all CX1-JB*-HP devices") Signed-off-by: Xinyu Lin <xinyu0123@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:08 +01:00
Alexey Dobriyan	8a3f4baaa4	proc: fix coredump vs read /proc//stat race commit `8bb2ee192e` upstream. do_task_stat() accesses IP and SP of a task without bumping reference count of a stack (which became an entity with independent lifetime at some point). Steps to reproduce: #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <sys/time.h> #include <sys/resource.h> #include <unistd.h> #include <sys/wait.h> int main(void) { setrlimit(RLIMIT_CORE, &(struct rlimit){}); while (1) { char buf[64]; char buf2[4096]; pid_t pid; int fd; pid = fork(); if (pid == 0) { (volatile int *)0 = 0; } snprintf(buf, sizeof(buf), "/proc/%u/stat", pid); fd = open(buf, O_RDONLY); read(fd, buf2, sizeof(buf2)); close(fd); waitpid(pid, NULL, 0); } return 0; } BUG: unable to handle kernel paging request at 0000000000003fd8 IP: do_task_stat+0x8b4/0xaf0 PGD 800000003d73e067 P4D 800000003d73e067 PUD 3d558067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 1417 Comm: a.out Not tainted 4.15.0-rc8-dirty #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014 RIP: 0010:do_task_stat+0x8b4/0xaf0 Call Trace: proc_single_show+0x43/0x70 seq_read+0xe6/0x3b0 __vfs_read+0x1e/0x120 vfs_read+0x84/0x110 SyS_read+0x3d/0xa0 entry_SYSCALL_64_fastpath+0x13/0x6c RIP: 0033:0x7f4d7928cba0 RSP: 002b:00007ffddb245158 EFLAGS: 00000246 Code: 03 b7 a0 01 00 00 4c 8b 4c 24 70 4c 8b 44 24 78 4c 89 74 24 18 e9 91 f9 ff ff f6 45 4d 02 0f 84 fd f7 ff ff 48 8b 45 40 48 89 ef <48> 8b 80 d8 3f 00 00 48 89 44 24 20 e8 9b 97 eb ff 48 89 44 24 RIP: do_task_stat+0x8b4/0xaf0 RSP: ffffc90000607cc8 CR2: 0000000000003fd8 John Ogness said: for my tests I added an else case to verify that the race is hit and correctly mitigated. Link: http://lkml.kernel.org/r/20180116175054.GA11513@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: "Kohli, Gaurav" <gkohli@codeaurora.org> Tested-by: John Ogness <john.ogness@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:08 +01:00
Xi Kangjie	43c3e093c2	scripts/gdb/linux/tasks.py: fix get_thread_info commit `883d50f56d` upstream. Since kernel 4.9, the thread_info has been moved into task_struct, no longer locates at the bottom of kernel stack. See commits `c65eacbe29` ("sched/core: Allow putting thread_info into task_struct") and `15f4eae70d` ("x86: Move thread_info into task_struct"). Before fix: (gdb) set $current = $lx_current() (gdb) p $lx_thread_info($current) $1 = {flags = 1470918301} (gdb) p $current.thread_info $2 = {flags = 2147483648} After fix: (gdb) p $lx_thread_info($current) $1 = {flags = 2147483648} (gdb) p $current.thread_info $2 = {flags = 2147483648} Link: http://lkml.kernel.org/r/20180118210159.17223-1-imxikangjie@gmail.com Fixes: `15f4eae70d` ("x86: Move thread_info into task_struct") Signed-off-by: Xi Kangjie <imxikangjie@gmail.com> Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Kieran Bingham <kbingham@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:08 +01:00
Stephane Grosjean	23d68eddd8	can: peak: fix potential bug in packet fragmentation commit `d8a243af1a` upstream. In some rare conditions when running one PEAK USB-FD interface over a non high-speed USB controller, one useless USB fragment might be sent. This patch fixes the way a USB command is fragmented when its length is greater than 64 bytes and when the underlying USB controller is not a high-speed one. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:08 +01:00
Thomas Petazzoni	19f47eafe1	ARM: dts: kirkwood: fix pin-muxing of MPP7 on OpenBlocks A7 commit `56aeb07c91` upstream. MPP7 is currently muxed as "gpio", but this function doesn't exist for MPP7, only "gpo" is available. This causes the following error: kirkwood-pinctrl f1010000.pin-controller: unsupported function gpio on pin mpp7 pinctrl core: failed to register map default (6): invalid type given kirkwood-pinctrl f1010000.pin-controller: error claiming hogs: -22 kirkwood-pinctrl f1010000.pin-controller: could not claim hogs: -22 kirkwood-pinctrl f1010000.pin-controller: unable to register pinctrl driver kirkwood-pinctrl: probe of f1010000.pin-controller failed with error -22 So the pinctrl driver is not probed, all device drivers (including the UART driver) do a -EPROBE_DEFER, and therefore the system doesn't really boot (well, it boots, but with no UART, and no devices that require pin-muxing). Back when the Device Tree file for this board was introduced, the definition was already wrong. The pinctrl driver also always described as "gpo" this function for MPP7. However, between Linux 4.10 and 4.11, a hog pin failing to be muxed was turned from a simple warning to a hard error that caused the entire pinctrl driver probe to bail out. This is probably the result of commit `6118714275` ("pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()"). This commit fixes the Device Tree to use the proper "gpo" function for MPP7, which fixes the boot of OpenBlocks A7, which was broken since Linux 4.11. Fixes: `f24b56cbcd` ("ARM: kirkwood: add support for OpenBlocks A7 platform") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:08 +01:00
Maxime Ripard	1f32f15ec7	ARM: sunxi_defconfig: Enable CMA commit `c13e7f313d` upstream. The DRM driver most notably, but also out of tree drivers (for now) like the VPU or GPU drivers, are quite big consumers of large, contiguous memory buffers. However, the sunxi_defconfig doesn't enable CMA in order to mitigate that, which makes them almost unusable. Enable it to make sure it somewhat works. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:07 +01:00
Arnd Bergmann	969e2145eb	phy: work around 'phys' references to usb-nop-xceiv devices commit `b7563e2796` upstream. Stefan Wahren reports a problem with a warning fix that was merged for v4.15: we had lots of device nodes with a 'phys' property pointing to a device node that is not compliant with the binding documented in Documentation/devicetree/bindings/phy/phy-bindings.txt This generally works because USB HCD drivers that support both the generic phy subsystem and the older usb-phy subsystem ignore most errors from phy_get() and related calls and then use the usb-phy driver instead. However, it turns out that making the usb-nop-xceiv device compatible with the generic-phy binding changes the phy_get() return code from -EINVAL to -EPROBE_DEFER, and the dwc2 usb controller driver for bcm2835 now returns -EPROBE_DEFER from its probe function rather than ignoring the failure, breaking all USB support on raspberry-pi when CONFIG_GENERIC_PHY is enabled. The same code is used in the dwc3 driver and the usb_add_hcd() function, so a reasonable assumption would be that many other platforms are affected as well. I have reviewed all the related patches and concluded that "usb-nop-xceiv" is the only USB phy that is affected by the change, and since it is by far the most commonly referenced phy, all the other USB phy drivers appear to be used in ways that are are either safe in DT (they don't use the 'phys' property), or in the driver (they already ignore -EPROBE_DEFER from generic-phy when usb-phy is available). To work around the problem, this adds a special case to _of_phy_get() so we ignore any PHY node that is compatible with "usb-nop-xceiv", as we know that this can never load no matter how much we defer. In the future, we might implement a generic-phy driver for "usb-nop-xceiv" and then remove this workaround. Since we generally want older kernels to also want to work with the fixed devicetree files, it would be good to backport the patch into stable kernels as well (3.13+ are possibly affected), even though they don't contain any of the patches that may have caused regressions. Fixes: `014d6da6cb` ARM: dts: bcm283x: Fix DTC warnings about missing phy-cells Fixes: `c5bbf358b7` arm: dts: nspire: Add missing #phy-cells to usb-nop-xceiv Fixes: `44e5dced2e` arm: dts: marvell: Add missing #phy-cells to usb-nop-xceiv Fixes: `f568f6f554` ARM: dts: omap: Add missing #phy-cells to usb-nop-xceiv Fixes: `d745d5f277` ARM: dts: imx51-zii-rdu1: Add missing #phy-cells to usb-nop-xceiv Fixes: `915fbe59cb` ARM: dts: imx: Add missing #phy-cells to usb-nop-xceiv Link: https://marc.info/?l=linux-usb&m=151518314314753&w=2 Link: https://patchwork.kernel.org/patch/10158145/ Cc: Felipe Balbi <balbi@kernel.org> Cc: Eric Anholt <eric@anholt.net> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Acked-by: Rob Herring <robh@kernel.org> Tested-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:07 +01:00
Steven Rostedt (VMware)	9a50ea0ce7	tracing: Fix converting enum's from the map in trace_event_eval_update() commit `1ebe1eaf2f` upstream. Since enums do not get converted by the TRACE_EVENT macro into their values, the event format displaces the enum name and not the value. This breaks tools like perf and trace-cmd that need to interpret the raw binary data. To solve this, an enum map was created to convert these enums into their actual numbers on boot up. This is done by TRACE_EVENTS() adding a TRACE_DEFINE_ENUM() macro. Some enums were not being converted. This was caused by an optization that had a bug in it. All calls get checked against this enum map to see if it should be converted or not, and it compares the call's system to the system that the enum map was created under. If they match, then they call is processed. To cut down on the number of iterations needed to find the maps with a matching system, since calls and maps are grouped by system, when a match is made, the index into the map array is saved, so that the next call, if it belongs to the same system as the previous call, could start right at that array index and not have to scan all the previous arrays. The problem was, the saved index was used as the variable to know if this is a call in a new system or not. If the index was zero, it was assumed that the call is in a new system and would keep incrementing the saved index until it found a matching system. The issue arises when the first matching system was at index zero. The next map, if it belonged to the same system, would then think it was the first match and increment the index to one. If the next call belong to the same system, it would begin its search of the maps off by one, and miss the first enum that should be converted. This left a single enum not converted properly. Also add a comment to describe exactly what that index was for. It took me a bit too long to figure out what I was thinking when debugging this issue. Link: http://lkml.kernel.org/r/717BE572-2070-4C1E-9902-9F2E0FEDA4F8@oracle.com Fixes: `0c564a538a` ("tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values") Reported-by: Chuck Lever <chuck.lever@oracle.com> Teste-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:07 +01:00
Johan Hovold	cb513d1414	Input: twl4030-vibra - fix sibling-node lookup commit `5b18920199` upstream. A helper purported to look up a child node based on its name was using the wrong of-helper and ended up prematurely freeing the parent of-node while searching the whole device tree depth-first starting at the parent node. Fixes: `64b9e4d803` ("input: twl4030-vibra: Support for DT booted kernel") Fixes: `e661d0a044` ("Input: twl4030-vibra - fix ERROR: Bad of_node_put() warning") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:07 +01:00
Johan Hovold	eaabab6468	Input: twl6040-vibra - fix child-node lookup commit `dcaf12a8b0` upstream. Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at parent rather than just matching on its children. Later sanity checks on node properties (which would likely be missing) should prevent this from causing much trouble however, especially as the original premature free of the parent node has already been fixed separately (but that "fix" was apparently never backported to stable). Fixes: `e7ec014a47` ("Input: twl6040-vibra - update for device tree support") Fixes: `c52c545ead` ("Input: twl6040-vibra - fix DT node memory management") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> (on Pyra OMAP5 hardware) Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:07 +01:00
Johan Hovold	9be13b3357	Input: 88pm860x-ts - fix child-node lookup commit `906bf7daa0` upstream. Fix child node-lookup during probe, which ended up searching the whole device tree depth-first starting at parent rather than just matching on its children. To make things worse, the parent node was prematurely freed, while the child node was leaked. Fixes: `2e57d56747` ("mfd: 88pm860x: Device tree support") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:07 +01:00
Nir Perry	607b86e173	Input: ALPS - fix multi-touch decoding on SS4 plus touchpads commit `4d94e776bd` upstream. The fix for handling two-finger scroll (i4a646580f793 - "Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad") introduced a minor "typo" that broke decoding of multi-touch events are decoded on some ALPS touchpads. For example, tapping with three-fingers can no longer be used to emulate middle-mouse-button (the kernel doesn't recognize this as the proper event, and doesn't report it correctly to userspace). This affects touchpads that use SS4 "plus" protocol variant, like those found on Dell E7270 & E7470 laptops (tested on E7270). First, probably the code in alps_decode_ss4_v2() for case SS4_PACKET_ID_MULTI used inconsistent indices to "f->mt[]". You can see 0 & 1 are used for the "if" part but 2 & 3 are used for the "else" part. Second, in the previous patch, new macros were introduced to decode X coordinates specific to the SS4 "plus" variant, but the macro to define the maximum X value wasn't changed accordingly. The macros to decode X values for "plus" variant are effectively shifted right by 1 bit, but the max wasn't shifted too. This causes the driver to incorrectly handle "no data" cases, which also interfered with how multi-touch was handled. Fixes: `4a646580f7` ("Input: ALPS - fix two-finger scroll breakage...") Signed-off-by: Nir Perry <nirperry@gmail.com> Reviewed-by: Masaki Ota <masaki.ota@jp.alps.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:07 +01:00
Jiada Wang	9792f9b483	perf tools: Fix build with ARCH=x86_64 commit `7a759cd8e8` upstream. With commit: `0a943cb10c` (tools build: Add HOSTARCH Makefile variable) when building for ARCH=x86_64, ARCH=x86_64 is passed to perf instead of ARCH=x86, so the perf build process searchs header files from tools/arch/x86_64/include, which doesn't exist. The following build failure is seen: In file included from util/event.c:2:0: tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory compilation terminated. Fix this issue by using SRCARCH instead of ARCH in perf, just like the main kernel Makefile and tools/objtool's. Signed-off-by: Jiada Wang <jiada_wang@mentor.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Eugeniu Rosca <erosca@de.adit-jv.com> Cc: Jan Stancek <jstancek@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Rui Teng <rui.teng@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: `0a943cb10c` ("tools build: Add HOSTARCH Makefile variable") Link: http://lkml.kernel.org/r/1491793357-14977-2-git-send-email-jiada_wang@mentor.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Tuomas Tynkkynen <tuomas@tuxera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:07 +01:00
Thomas Gleixner	c557481a94	x86/apic/vector: Fix off by one in error path commit `45d55e7bac` upstream. Keith reported the following warning: WARNING: CPU: 28 PID: 1420 at kernel/irq/matrix.c:222 irq_matrix_remove_managed+0x10f/0x120 x86_vector_free_irqs+0xa1/0x180 x86_vector_alloc_irqs+0x1e4/0x3a0 msi_domain_alloc+0x62/0x130 The reason for this is that if the vector allocation fails the error handling code tries to free the failed vector as well, which causes the above imbalance warning to trigger. Adjust the error path to handle this correctly. Fixes: `b5dc8e6c21` ("x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors") Reported-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Keith Busch <keith.busch@intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801161217300.1823@nanos Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:06 +01:00
Joe Lawrence	5b13f59356	pipe: avoid round_pipe_size() nr_pages overflow on 32-bit commit `d3f14c4858` upstream. round_pipe_size() contains a right-bit-shift expression which may overflow, which would cause undefined results in a subsequent roundup_pow_of_two() call. static inline unsigned int round_pipe_size(unsigned int size) { unsigned long nr_pages; nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT; return roundup_pow_of_two(nr_pages) << PAGE_SHIFT; } PAGE_SIZE is defined as (1UL << PAGE_SHIFT), so: - 4 bytes wide on 32-bit (0 to 0xffffffff) - 8 bytes wide on 64-bit (0 to 0xffffffffffffffff) That means that 32-bit round_pipe_size(), nr_pages may overflow to 0: size=0x00000000 nr_pages=0x0 size=0x00000001 nr_pages=0x1 size=0xfffff000 nr_pages=0xfffff size=0xfffff001 nr_pages=0x0 << ! size=0xffffffff nr_pages=0x0 << ! This is bad because roundup_pow_of_two(n) is undefined when n == 0! 64-bit is not a problem as the unsigned int size is 4 bytes wide (similar to 32-bit) and the larger, 8 byte wide unsigned long, is sufficient to handle the largest value of the bit shift expression: size=0xffffffff nr_pages=100000 Modify round_pipe_size() to return 0 if n == 0 and updates its callers to handle accordingly. Link: http://lkml.kernel.org/r/1507658689-11669-3-git-send-email-joe.lawrence@redhat.com Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dong Jinguang <dongjinguang@huawei.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:06 +01:00
Len Brown	02802dfc82	x86/tsc: Fix erroneous TSC rate on Skylake Xeon commit `b511203093` upstream. The INTEL_FAM6_SKYLAKE_X hardcoded crystal_khz value of 25MHZ is problematic: - SKX workstations (with same model # as server variants) use a 24 MHz crystal. This results in a -4.0% time drift rate on SKX workstations. - SKX servers subject the crystal to an EMI reduction circuit that reduces its actual frequency by (approximately) -0.25%. This results in -1 second per 10 minute time drift as compared to network time. This issue can also trigger a timer and power problem, on configurations that use the LAPIC timer (versus the TSC deadline timer). Clock ticks scheduled with the LAPIC timer arrive a few usec before the time they are expected (according to the slow TSC). This causes Linux to poll-idle, when it should be in an idle power saving state. The idle and clock code do not graciously recover from this error, sometimes resulting in significant polling and measurable power impact. Stop using native_calibrate_tsc() for INTEL_FAM6_SKYLAKE_X. native_calibrate_tsc() will return 0, boot will run with tsc_khz = cpu_khz, and the TSC refined calibration will update tsc_khz to correct for the difference. [ tglx: Sanitized change log ] Fixes: `6baf3d6182` ("x86/tsc: Add additional Intel CPU models to the crystal quirk list") Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Cc: Prarit Bhargava <prarit@redhat.com> Link: https://lkml.kernel.org/r/ff6dcea166e8ff8f2f6a03c17beab2cb436aa779.1513920414.git.len.brown@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:06 +01:00
Eric W. Biederman	5ab44e8f0f	x86/mm/pkeys: Fix fill_sig_info_pkey commit `beacd6f7ed` upstream. SEGV_PKUERR is a signal specific si_code which happens to have the same numeric value as several others: BUS_MCEERR_AR, ILL_ILLTRP, FPE_FLTOVF, TRAP_HWBKPT, CLD_TRAPPED, POLL_ERR, SEGV_THREAD_ID, as such it is not safe to just test the si_code the signal number must also be tested to prevent a false positive in fill_sig_info_pkey. This error was by inspection, and BUS_MCEERR_AR appears to be a real candidate for confusion. So pass in si_signo and check for SIG_SEGV to verify that it is actually a SEGV_PKUERR Fixes: `019132ff3d` ("x86/mm/pkeys: Fill in pkey field in siginfo") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: https://lkml.kernel.org/r/20180112203135.4669-2-ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:06 +01:00
Andi Kleen	eee0cba7b0	module: Add retpoline tag to VERMAGIC commit `6cfb521ac0` upstream. Add a marker for retpoline to the module VERMAGIC. This catches the case when a non RETPOLINE compiled module gets loaded into a retpoline kernel, making it insecure. It doesn't handle the case when retpoline has been runtime disabled. Even in this case the match of the retcompile status will be enforced. This implies that even with retpoline run time disabled all modules loaded need to be recompiled. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: rusty@rustcorp.com.au Cc: arjan.van.de.ven@intel.com Cc: jeyu@kernel.org Cc: torvalds@linux-foundation.org Link: https://lkml.kernel.org/r/20180116205228.4890-1-andi@firstfloor.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:06 +01:00
Paolo Bonzini	a96cf98dda	x86/cpufeature: Move processor tracing out of scattered features commit `4fdec2034b` upstream. Processor tracing is already enumerated in word 9 (CPUID[7,0].EBX), so do not duplicate it in the scattered features word. Besides being more tidy, this will be useful for KVM when it presents processor tracing to the guests. KVM selects host features that are supported by both the host kernel (depending on command line options, CPU errata, or whatever) and KVM. Whenever a full feature word exists, KVM's code is written in the expectation that the CPUID bit number matches the X86_FEATURE_* bit number, but this is not the case for X86_FEATURE_INTEL_PT. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luwei Kang <luwei.kang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Link: http://lkml.kernel.org/r/1516117345-34561-1-git-send-email-pbonzini@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:06 +01:00
Josh Poimboeuf	13ccac5de8	objtool: Improve error message for bad file argument commit `385d11b152` upstream. If a nonexistent file is supplied to objtool, it complains with a non-helpful error: open: No such file or directory Improve it to: objtool: Can't open 'foo': No such file or directory Reported-by: Markus <M4rkusXXL@web.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/406a3d00a21225eee2819844048e17f68523ccf6.1516025651.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:05 +01:00
Tom Lendacky	b73d68788f	x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros commit `28d437d550` upstream. The PAUSE instruction is currently used in the retpoline and RSB filling macros as a speculation trap. The use of PAUSE was originally suggested because it showed a very, very small difference in the amount of cycles/time used to execute the retpoline as compared to LFENCE. On AMD, the PAUSE instruction is not a serializing instruction, so the pause/jmp loop will use excess power as it is speculated over waiting for return to mispredict to the correct target. The RSB filling macro is applicable to AMD, and, if software is unable to verify that LFENCE is serializing on AMD (possible when running under a hypervisor), the generic retpoline support will be used and, so, is also applicable to AMD. Keep the current usage of PAUSE for Intel, but add an LFENCE instruction to the speculation trap for AMD. The same sequence has been adopted by GCC for the GCC generated retpolines. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@alien8.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Kees Cook <keescook@google.com> Link: https://lkml.kernel.org/r/20180113232730.31060.36287.stgit@tlendack-t1.amdoffice.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:05 +01:00
David Woodhouse	abf67b1e78	x86/retpoline: Fill RSB on context switch for affected CPUs commit `c995efd5a7` upstream. On context switch from a shallow call stack to a deeper one, as the CPU does 'ret' up the deeper side it may encounter RSB entries (predictions for where the 'ret' goes to) which were populated in userspace. This is problematic if neither SMEP nor KPTI (the latter of which marks userspace pages as NX for the kernel) are active, as malicious code in userspace may then be executed speculatively. Overwrite the CPU's return prediction stack with calls which are predicted to return to an infinite loop, to "capture" speculation if this happens. This is required both for retpoline, and also in conjunction with IBRS for !SMEP && !KPTI. On Skylake+ the problem is slightly different, and an underflow of the RSB may cause errant branch predictions to occur. So there it's not so much overwrite, as filling the RSB to attempt to prevent it getting empty. This is only a partial solution for Skylake+ since there are many other conditions which may result in the RSB becoming empty. The full solution on Skylake+ is to use IBRS, which will prevent the problem even when the RSB becomes empty. With IBRS, the RSB-stuffing will not be required on context switch. [ tglx: Added missing vendor check and slighty massaged comments and changelog ] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:05 +01:00
Xunlei Pang	1ad4f2872c	sched/deadline: Zero out positive runtime after throttling constrained tasks commit `ae83b56a56` upstream. When a contrained task is throttled by dl_check_constrained_dl(), it may carry the remaining positive runtime, as a result when dl_task_timer() fires and calls replenish_dl_entity(), it will not be replenished correctly due to the positive dl_se->runtime. This patch assigns its runtime to 0 if positive after throttling. Signed-off-by: Xunlei Pang <xlpang@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luca Abeni <luca.abeni@santannapisa.it> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `df8eac8caf` ("sched/deadline: Throttle a constrained deadline task activated after the deadline) Link: http://lkml.kernel.org/r/1494421417-27550-1-git-send-email-xlpang@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:05 +01:00
Tomas Henzl	997231f9fd	scsi: hpsa: fix volume offline state commit `eb94588dab` upstream. In a previous patch a hpsa_scsi_dev_t.volume_offline update line has been removed, so let us put it back.. Fixes: `85b29008d8` (hpsa: update check for logical volume status) Signed-off-by: Tomas Henzl <thenzl@redhat.com> Acked-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:05 +01:00
Sagi Grimberg	d303d0ca9a	iser-target: Fix possible use-after-free in connection establishment error commit `cd52cb26e7` upstream. In case we fail to establish the connection we must drain our pre-posted login recieve work request before continuing safely with connection teardown. Fixes: `a060b5629a` ("IB/core: generic RDMA READ/WRITE API") Reported-by: Amrani, Ram <Ram.Amrani@cavium.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:05 +01:00
Eric Biggers	0476e6d0b7	af_key: fix buffer overread in parse_exthdrs() commit `4e765b4972` upstream. If a message sent to a PF_KEY socket ended with an incomplete extension header (fewer than 4 bytes remaining), then parse_exthdrs() read past the end of the message, into uninitialized memory. Fix it by returning -EINVAL in this case. Reproducer: #include <linux/pfkeyv2.h> #include <sys/socket.h> #include <unistd.h> int main() { int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2); char buf[17] = { 0 }; struct sadb_msg msg = (void )buf; msg->sadb_msg_version = PF_KEY_V2; msg->sadb_msg_type = SADB_DELETE; msg->sadb_msg_len = 2; write(sock, buf, 17); } Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:05 +01:00
Eric Biggers	e4dc05ab8f	af_key: fix buffer overread in verify_address_len() commit `06b335cb51` upstream. If a message sent to a PF_KEY socket ended with one of the extensions that takes a 'struct sadb_address' but there were not enough bytes remaining in the message for the ->sa_family member of the 'struct sockaddr' which is supposed to follow, then verify_address_len() read past the end of the message, into uninitialized memory. Fix it by returning -EINVAL in this case. This bug was found using syzkaller with KMSAN. Reproducer: #include <linux/pfkeyv2.h> #include <sys/socket.h> #include <unistd.h> int main() { int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2); char buf[24] = { 0 }; struct sadb_msg msg = (void )buf; struct sadb_address addr = (void )(msg + 1); msg->sadb_msg_version = PF_KEY_V2; msg->sadb_msg_type = SADB_DELETE; msg->sadb_msg_len = 3; addr->sadb_address_len = 1; addr->sadb_address_exttype = SADB_EXT_ADDRESS_SRC; write(sock, buf, 24); } Reported-by: Alexander Potapenko <glider@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:04 +01:00
Thomas Gleixner	676109b28c	timers: Unconditionally check deferrable base commit `ed4bbf7910` upstream. When the timer base is checked for expired timers then the deferrable base must be checked as well. This was missed when making the deferrable base independent of base::nohz_active. Fixes: `ced6d5c11d` ("timers: Use deferrable base independent of base::nohz_active") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: rt@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:04 +01:00
Takashi Iwai	4b6e681f59	ALSA: hda - Apply the existing quirk to iMac 14,1 commit `031f335cda` upstream. iMac 14,1 requires the same quirk as iMac 12,2, using GPIO 2 and 3 for headphone and speaker output amps. Add the codec SSID quirk entry (106b:0600) accordingly. BugLink: http://lkml.kernel.org/r/CAEw6Zyteav09VGHRfD5QwsfuWv5a43r0tFBNbfcHXoNrxVz7ew@mail.gmail.com Reported-by: Freaky <freaky2000@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:04 +01:00
Takashi Iwai	fae704d5bd	ALSA: hda - Apply headphone noise quirk for another Dell XPS 13 variant commit `e4c9fd10eb` upstream. There is another Dell XPS 13 variant (SSID 1028:082a) that requires the existing fixup for reducing the headphone noise. This patch adds the quirk entry for that. BugLink: http://lkml.kernel.org/r/CAHXyb9ZCZJzVisuBARa+UORcjRERV8yokez=DP1_5O5isTz0ZA@mail.gmail.com Reported-and-tested-by: Francisco G. <frangio.1@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:04 +01:00
Takashi Iwai	b9e168a0c6	ALSA: pcm: Remove yet superfluous WARN_ON() commit `23b19b7b50` upstream. muldiv32() contains a snd_BUG_ON() (which is morphed as WARN_ON() with debug option) for checking the case of 0 / 0. This would be helpful if this happens only as a logical error; however, since the hw refine is performed with any data set provided by user, the inconsistent values that can trigger such a condition might be passed easily. Actually, syzbot caught this by passing some zero'ed old hw_params ioctl. So, having snd_BUG_ON() there is simply superfluous and rather harmful to give unnecessary confusions. Let's get rid of it. Reported-by: syzbot+7e6ee55011deeebce15d@syzkaller.appspotmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:04 +01:00
Takashi Iwai	e4ff9f2946	ALSA: seq: Make ioctls race-free commit `b3defb791b` upstream. The ALSA sequencer ioctls have no protection against racy calls while the concurrent operations may lead to interfere with each other. As reported recently, for example, the concurrent calls of setting client pool with a combination of write calls may lead to either the unkillable dead-lock or UAF. As a slightly big hammer solution, this patch introduces the mutex to make each ioctl exclusive. Although this may reduce performance via parallel ioctl calls, usually it's not demanded for sequencer usages, hence it should be negligible. Reported-by: Luo Quan <a4651386@163.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:04 +01:00
Li Jinyue	d8a3170db0	futex: Prevent overflow by strengthen input validation commit `fbe0e839d1` upstream. UBSAN reports signed integer overflow in kernel/futex.c: UBSAN: Undefined behaviour in kernel/futex.c:2041:18 signed integer overflow: 0 - -2147483648 cannot be represented in type 'int' Add a sanity check to catch negative values of nr_wake and nr_requeue. Signed-off-by: Li Jinyue <lijinyue@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Cc: dvhart@infradead.org Link: https://lkml.kernel.org/r/1513242294-31786-1-git-send-email-lijinyue@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:04 +01:00
Hannes Reinecke	bb7119eea2	scsi: sg: disable SET_FORCE_LOW_DMA commit `745dfa0d8e` upstream. The ioctl SET_FORCE_LOW_DMA has never worked since the initial git check-in, and the respective setting is nowadays handled correctly. So disable it entirely. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:04 +01:00
Vishal Verma	c9ca9d9d9b	libnvdimm, btt: Fix an incompatibility in the log layout commit `24e3a7fb60` upstream. Due to a spec misinterpretation, the Linux implementation of the BTT log area had different padding scheme from other implementations, such as UEFI and NVML. This fixes the padding scheme, and defaults to it for new BTT layouts. We attempt to detect the padding scheme in use when probing for an existing BTT. If we detect the older/incompatible scheme, we continue using it. Reported-by: Juston Li <juston.li@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Fixes: `5212e11fde` ("nd_btt: atomic sector updates") Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-23 19:57:03 +01:00
Greg Kroah-Hartman	b8cf9ff79d	Linux 4.9.77	2018-01-17 09:39:00 +01:00
Pavel Tatashin	1b92c48a2e	x86/pti/efi: broken conversion from efi to kernel page table The page table order must be increased for EFI table in order to avoid a bug where NMI tries to change the page table to kernel page table, while efi page table is active. For more disccussion about this bug, see this thread: http://lkml.iu.edu/hypermail/linux/kernel/1801.1/00951.html Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:39:00 +01:00
Josh Poimboeuf	92e8f20494	objtool: Fix retpoline support for pre-ORC objtool Objtool 1.0 (pre-ORC) produces the following warning when it encounters a retpoline: arch/x86/crypto/camellia-aesni-avx2-asm_64.o: warning: objtool: .altinstr_replacement+0xf: return instruction outside of a callable function That warning is meant to catch GCC bugs and missing ENTRY/ENDPROC annotations, neither of which are applicable to alternatives. Silence the warning for alternative instructions, just like objtool 2.0 already does. Reported-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:39:00 +01:00
Thomas Gleixner	44f1eae7fe	x86/retpoline: Remove compile time warning commit `b8b9ce4b5a` upstream. Remove the compile time warning when CONFIG_RETPOLINE=y and the compiler does not have retpoline support. Linus rationale for this is: It's wrong because it will just make people turn off RETPOLINE, and the asm updates - and return stack clearing - that are independent of the compiler are likely the most important parts because they are likely the ones easiest to target. And it's annoying because most people won't be able to do anything about it. The number of people building their own compiler? Very small. So if their distro hasn't got a compiler yet (and pretty much nobody does), the warning is just annoying crap. It is already properly reported as part of the sysfs interface. The compile-time warning only encourages bad things. Fixes: `76b043848f` ("x86/retpoline: Add initial retpoline support") Requested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Link: https://lkml.kernel.org/r/CA+55aFzWgquv4i6Mab6bASqYXg3ErV3XDFEYf=GEcCDQg5uAtw@mail.gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:39:00 +01:00
Andy Lutomirski	c05d544d53	selftests/x86: Add test_vsyscall commit `352909b49b` upstream. This tests that the vsyscall entries do what they're expected to do. It also confirms that attempts to read the vsyscall page behave as expected. If changes are made to the vsyscall code or its memory map handling, running this test in all three of vsyscall=none, vsyscall=emulate, and vsyscall=native are helpful. (Because it's easy, this also compares the vsyscall results to their vDSO equivalents.) Note to KAISER backporters: please test this under all three vsyscall modes. Also, in the emulate and native modes, make sure that test_vsyscall_64 agrees with the command line or config option as to which mode you're in. It's quite easy to mess up the kernel such that native mode accidentally emulates or vice versa. Greg, etc: please backport this to all your Meltdown-patched kernels. It'll help make sure the patches didn't regress vsyscalls. CSigned-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/2b9c5a174c1d60fd7774461d518aa75598b1d8fd.1515719552.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:39:00 +01:00
David Woodhouse	c1ddd99a02	x86/retpoline: Fill return stack buffer on vmexit commit `117cc7a908` upstream. In accordance with the Intel and AMD documentation, we need to overwrite all entries in the RSB on exiting a guest, to prevent malicious branch target predictions from affecting the host kernel. This is needed both for retpoline and for IBRS. [ak: numbers again for the RSB stuffing labels] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515755487-8524-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:39:00 +01:00
Andi Kleen	276e300447	x86/retpoline/irq32: Convert assembler indirect jumps commit `7614e913db` upstream. Convert all indirect jumps in 32bit irq inline asm code to use non speculative sequences. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-12-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:59 +01:00
David Woodhouse	a590960ae6	x86/retpoline/checksum32: Convert assembler indirect jumps commit `5096732f6f` upstream. Convert all indirect jumps in 32bit checksum assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-11-git-send-email-dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:59 +01:00
David Woodhouse	87a1fe3625	x86/retpoline/xen: Convert Xen hypercall indirect jumps commit `ea08816d5b` upstream. Convert indirect call in Xen hypercall to use non-speculative sequence, when CONFIG_RETPOLINE is enabled. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-10-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:59 +01:00
David Woodhouse	9e37da4c3d	x86/retpoline/hyperv: Convert assembler indirect jumps commit `e70e5892b2` upstream. Convert all indirect jumps in hyperv inline asm code to use non-speculative sequences when CONFIG_RETPOLINE is enabled. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-9-git-send-email-dwmw@amazon.co.uk [ backport to 4.9, hopefully correct, not tested... - gregkh ] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:59 +01:00
David Woodhouse	83d7658362	x86/retpoline/ftrace: Convert ftrace assembler indirect jumps commit `9351803bd8` upstream. Convert all indirect jumps in ftrace assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-8-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:59 +01:00
David Woodhouse	8b1bacc321	x86/retpoline/entry: Convert entry assembler indirect jumps commit `2641f08bb7` upstream. Convert indirect jumps in core 32/64bit entry assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled. Don't use CALL_NOSPEC in entry_SYSCALL_64_fastpath because the return address after the 'call' instruction must be precisely at the .Lentry_SYSCALL_64_after_fastpath label for stub_ptregs_64 to work, and the use of alternatives will mess that up unless we play horrid games to prepend with NOPs and make the variants the same length. It's not worth it; in the case where we ALTERNATIVE out the retpoline, the first instruction at __x86.indirect_thunk.rax is going to be a bare jmp *%rax anyway. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-7-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:59 +01:00
David Woodhouse	2adc2f7444	x86/retpoline/crypto: Convert crypto assembler indirect jumps commit `9697fa39ef` upstream. Convert all indirect jumps in crypto assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-6-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:59 +01:00
David Woodhouse	8f96937ee3	x86/spectre: Add boot time option to select Spectre v2 mitigation commit `da28512156` upstream. Add a spectre_v2= option to select the mitigation used for the indirect branch speculation vulnerability. Currently, the only option available is retpoline, in its various forms. This will be expanded to cover the new IBRS/IBPB microcode features. The RETPOLINE_AMD feature relies on a serializing LFENCE for speculation control. For AMD hardware, only set RETPOLINE_AMD if LFENCE is a serializing instruction, which is indicated by the LFENCE_RDTSC feature. [ tglx: Folded back the LFENCE/AMD fixes and reworked it so IBRS integration becomes simple ] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-5-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:59 +01:00
David Woodhouse	2bb5de42f2	x86/retpoline: Add initial retpoline support commit `76b043848f` upstream. Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide the corresponding thunks. Provide assembler macros for invoking the thunks in the same way that GCC does, from native and inline assembler. This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In some circumstances, IBRS microcode features may be used instead, and the retpoline can be disabled. On AMD CPUs if lfence is serialising, the retpoline can be dramatically simplified to a simple "lfence; jmp *\reg". A future patch, after it has been verified that lfence really is serialising in all circumstances, can enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition to X86_FEATURE_RETPOLINE. Do not align the retpoline in the altinstr section, because there is no guarantee that it stays aligned when it's copied over the oldinstr during alternative patching. [ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks] [ tglx: Put actual function CALL/JMP in front of the macros, convert to symbolic labels ] [ dwmw2: Convert back to numeric labels, merge objtool fixes ] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:59 +01:00
Andrey Ryabinin	4bf050da57	x86/asm: Use register variable to get stack pointer value commit `196bd485ee` upstream. Currently we use current_stack_pointer() function to get the value of the stack pointer register. Since commit: `f5caf621ee` ("x86/asm: Fix inline asm call constraints for Clang") ... we have a stack register variable declared. It can be used instead of current_stack_pointer() function which allows to optimize away some excessive "mov %rsp, %<dst>" instructions: -mov %rsp,%rdx -sub %rdx,%rax -cmp $0x3fff,%rax -ja ffffffff810722fd <ist_begin_non_atomic+0x2d> +sub %rsp,%rax +cmp $0x3fff,%rax +ja ffffffff810722fa <ist_begin_non_atomic+0x2a> Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer and use it instead of the removed function. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org> [dwmw2: We want ASM_CALL_CONSTRAINT for retpoline] Signed-off-by: David Woodhouse <dwmw@amazon.co.ku> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:59 +01:00
Josh Poimboeuf	4d8bd3e2f6	objtool: Allow alternatives to be ignored commit `258c76059c` upstream. Getting objtool to understand retpolines is going to be a bit of a challenge. For now, take advantage of the fact that retpolines are patched in with alternatives. Just read the original (sane) non-alternative instruction, and ignore the patched-in retpoline. This allows objtool to understand the control flow around the retpoline, even if it can't yet follow what's inside. This means the ORC unwinder will fail to unwind from inside a retpoline, but will work fine otherwise. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-3-git-send-email-dwmw@amazon.co.uk [dwmw2: Applies to tools/objtool/builtin-check.c not check.[ch]] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:58 +01:00
Josh Poimboeuf	3adb52ab29	objtool: Detect jumps to retpoline thunks commit `39b735332c` upstream. A direct jump to a retpoline thunk is really an indirect jump in disguise. Change the objtool instruction type accordingly. Objtool needs to know where indirect branches are so it can detect switch statement jump tables. This fixes a bunch of warnings with CONFIG_RETPOLINE like: arch/x86/events/intel/uncore_nhmex.o: warning: objtool: nhmex_rbox_msr_enable_event()+0x44: sibling call from callable instruction with modified stack frame kernel/signal.o: warning: objtool: copy_siginfo_to_user()+0x91: sibling call from callable instruction with modified stack frame ... Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-2-git-send-email-dwmw@amazon.co.uk [dwmw2: Applies to tools/objtool/builtin-check.c not check.c] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:58 +01:00
Josh Poimboeuf	35aee626fa	objtool, modules: Discard objtool annotation sections for modules commit `e390f9a968` upstream. The '__unreachable' and '__func_stack_frame_non_standard' sections are only used at compile time. They're discarded for vmlinux but they should also be discarded for modules. Since this is a recurring pattern, prefix the section names with ".discard.". It's a nice convention and vmlinux.lds.h already discards such sections. Also remove the 'a' (allocatable) flag from the __unreachable section since it doesn't make sense for a discarded section. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jessica Yu <jeyu@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `d1091c7fa3` ("objtool: Improve detection of BUG() and other dead ends") Link: http://lkml.kernel.org/r/20170301180444.lhd53c5tibc4ns77@treble Signed-off-by: Ingo Molnar <mingo@kernel.org> [dwmw2: Remove the unreachable part in backporting since it's not here yet] Signed-off-by: David Woodhouse <dwmw@amazon.co.ku> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:58 +01:00
Andy Lutomirski	00bcb5ada6	x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier commit `b8b7abaed7` upstream. Otherwise we might have the PCID feature bit set during cpu_init(). This is just for robustness. I haven't seen any actual bugs here. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `cba4671af7` ("x86/mm: Disable PCID on 32-bit kernels") Link: http://lkml.kernel.org/r/b16dae9d6b0db5d9801ddbebbfd83384097c61f3.1505663533.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:58 +01:00
David Woodhouse	91b7e5cdc8	x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm commit `b9e705ef7c` upstream. Where an ALTERNATIVE is used in the middle of an inline asm block, this would otherwise lead to the following instruction being appended directly to the trailing ".popsection", and a failed compile. Fixes: `9cebed423c` ("x86, alternative: Use .pushsection/.popsection") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: ak@linux.intel.com Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180104143710.8961-8-dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:58 +01:00
Borislav Petkov	194dc04770	x86/alternatives: Fix optimize_nops() checking commit `612e8e9350` upstream. The alternatives code checks only the first byte whether it is a NOP, but with NOPs in front of the payload and having actual instructions after it breaks the "optimized' test. Make sure to scan all bytes before deciding to optimize the NOPs in there. Reported-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Andrew Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/20180110112815.mgciyf5acwacphkq@pd.tnic Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:58 +01:00
David Woodhouse	5ddd318a47	sysfs/cpu: Fix typos in vulnerability documentation commit `9ecccfaa7c` upstream. Fixes: `87590ce6e` ("sysfs/cpu: Add vulnerability folder") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:58 +01:00
Tom Lendacky	9c5e750c8e	x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC commit `9c6a73c758` upstream. With LFENCE now a serializing instruction, use LFENCE_RDTSC in preference to MFENCE_RDTSC. However, since the kernel could be running under a hypervisor that does not support writing that MSR, read the MSR back and verify that the bit has been set successfully. If the MSR can be read and the bit is set, then set the LFENCE_RDTSC feature, otherwise set the MFENCE_RDTSC feature. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/20180108220932.12580.52458.stgit@tlendack-t1.amdoffice.net Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:58 +01:00
Tom Lendacky	abcc3e5f00	x86/cpu/AMD: Make LFENCE a serializing instruction commit `e4d0e84e49` upstream. To aid in speculation control, make LFENCE a serializing instruction since it has less overhead than MFENCE. This is done by setting bit 1 of MSR 0xc0011029 (DE_CFG). Some families that support LFENCE do not have this MSR. For these families, the LFENCE instruction is already serializing. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/20180108220921.12580.71694.stgit@tlendack-t1.amdoffice.net Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:58 +01:00
Thomas Gleixner	45a98824bd	x86/cpu: Implement CPU vulnerabilites sysfs functions commit `61dc0f555b` upstream. Implement the CPU vulnerabilty show functions for meltdown, spectre_v1 and spectre_v2. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180107214913.177414879@linutronix.de Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:57 +01:00
Thomas Gleixner	11ec2df9c0	sysfs/cpu: Add vulnerability folder commit `87590ce6e3` upstream. As the meltdown/spectre problem affects several CPU architectures, it makes sense to have common way to express whether a system is affected by a particular vulnerability or not. If affected the way to express the mitigation should be common as well. Create /sys/devices/system/cpu/vulnerabilities folder and files for meltdown, spectre_v1 and spectre_v2. Allow architectures to override the show function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180107214913.096657732@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:57 +01:00
Borislav Petkov	56eff367e0	x86/cpu: Merge bugs.c and bugs_64.c commit `62a67e123e` upstream. Should be easier when following boot paths. It probably is a left over from the x86 unification eons ago. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161024173844.23038-3-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:57 +01:00
David Woodhouse	26323fb4d7	x86/cpufeatures: Add X86_BUG_SPECTRE_V[12] commit `99c6fa2511` upstream. Add the bug bits for spectre v1/2 and force them unconditionally for all cpus. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1515239374-23361-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:57 +01:00
Thomas Gleixner	43fe95308d	x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN commit `de791821c2` upstream. Use the name associated with the particular attack which needs page table isolation for mitigation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Cc: Jiri Koshina <jikos@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Lutomirski <luto@amacapital.net> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Greg KH <gregkh@linux-foundation.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801051525300.1724@nanos Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:57 +01:00
Thomas Gleixner	d88f601b9a	x86/cpufeatures: Add X86_BUG_CPU_INSECURE commit `a89f040fa3` upstream. Many x86 CPUs leak information to user space due to missing isolation of user space and kernel space page tables. There are many well documented ways to exploit that. The upcoming software migitation of isolating the user and kernel space page tables needs a misfeature flag so code can be made runtime conditional. Add the BUG bits which indicates that the CPU is affected and add a feature bit which indicates that the software migitation is enabled. Assume for now that _ALL_ x86 CPUs are affected by this. Exceptions can be made later. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:57 +01:00
Thomas Gleixner	c2cacde516	x86/cpufeatures: Make CPU bugs sticky commit `6cbd2171e8` upstream. There is currently no way to force CPU bug bits like CPU feature bits. That makes it impossible to set a bug bit once at boot and have it stick for all upcoming CPUs. Extend the force set/clear arrays to handle bug bits as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.992156574@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:57 +01:00
Andy Lutomirski	ef46398101	x86/cpu: Factor out application of forced CPU caps commit `8bf1ebca21` upstream. There are multiple call sites that apply forced CPU caps. Factor them into a helper. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Whitehead <tedheadster@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Link: http://lkml.kernel.org/r/623ff7555488122143e4417de09b18be2085ad06.1484705016.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:57 +01:00
Dave Hansen	4e6c2af2ba	x86/Documentation: Add PTI description commit `01c9b17bf6` upstream. Add some details about how PTI works, what some of the downsides are, and how to debug it when things go wrong. Also document the kernel parameter: 'pti/nopti'. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Moritz Lipp <moritz.lipp@iaik.tugraz.at> Cc: Daniel Gruss <daniel.gruss@iaik.tugraz.at> Cc: Michael Schwarz <michael.schwarz@iaik.tugraz.at> Cc: Richard Fellner <richard.fellner@student.tugraz.at> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Andi Lutomirsky <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180105174436.1BC6FA2B@viggo.jf.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:57 +01:00
Benjamin Poirier	d598347989	e1000e: Fix e1000_check_for_copper_link_ich8lan return value. commit `4110e02eb4` upstream. e1000e_check_for_copper_link() and e1000_check_for_copper_link_ich8lan() are the two functions that may be assigned to mac.ops.check_for_link when phy.media_type == e1000_media_type_copper. Commit `19110cfbb3` ("e1000e: Separate signaling for link check/link up") changed the meaning of the return value of check_for_link for copper media but only adjusted the first function. This patch adjusts the second function likewise. Reported-by: Christian Hesse <list@eworm.de> Reported-by: Gabriel C <nix.or.die@gmail.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=198047 Fixes: `19110cfbb3` ("e1000e: Separate signaling for link check/link up") Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Tested-by: Christian Hesse <list@eworm.de> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:56 +01:00
Icenowy Zheng	3ba5d3a2cf	uas: ignore UAS for Norelsys NS1068(X) chips commit `928afc8527` upstream. The UAS mode of Norelsys NS1068(X) is reported to fail to work on several platforms with the following error message: xhci-hcd xhci-hcd.0.auto: ERROR Transfer event for unknown stream ring slot 1 ep 8 xhci-hcd xhci-hcd.0.auto: @00000000bf04a400 00000000 00000000 1b000000 01098001 And when trying to mount a partition on the disk the disk will disconnect from the USB controller, then after re-connecting the device will be offlined and not working at all. Falling back to USB mass storage can solve this problem, so ignore UAS function of this chip. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Acked-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:56 +01:00
Ben Seri	6aebc2670e	Bluetooth: Prevent stack info leak from the EFS element. commit `06e7e776ca` upstream. In the function l2cap_parse_conf_rsp and in the function l2cap_parse_conf_req the following variable is declared without initialization: struct l2cap_conf_efs efs; In addition, when parsing input configuration parameters in both of these functions, the switch case for handling EFS elements may skip the memcpy call that will write to the efs variable: ... case L2CAP_CONF_EFS: if (olen == sizeof(efs)) memcpy(&efs, (void *)val, olen); ... The olen in the above if is attacker controlled, and regardless of that if, in both of these functions the efs variable would eventually be added to the outgoing configuration request that is being built: l2cap_add_conf_opt(&ptr, L2CAP_CONF_EFS, sizeof(efs), (unsigned long) &efs); So by sending a configuration request, or response, that contains an L2CAP_CONF_EFS element, but with an element length that is not sizeof(efs) - the memcpy to the uninitialized efs variable can be avoided, and the uninitialized variable would be returned to the attacker (16 bytes). This issue has been assigned CVE-2017-1000410 Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Signed-off-by: Ben Seri <ben@armis.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:56 +01:00
Viktor Slavkovic	c51d23dffc	staging: android: ashmem: fix a race condition in ASHMEM_SET_SIZE ioctl commit `443064cb0b` upstream. A lock-unlock is missing in ASHMEM_SET_SIZE ioctl which can result in a race condition when mmap is called. After the !asma->file check, before setting asma->size, asma->file can be set in mmap. That would result in having different asma->size than the mapped memory size. Combined with ASHMEM_UNPIN ioctl and shrinker invocation, this can result in memory corruption. Signed-off-by: Viktor Slavkovic <viktors@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:56 +01:00
Shuah Khan	8ab8c6e660	usbip: vudc_tx: fix v_send_ret_submit() vulnerability to null xfer buffer commit `5fd77a3a0e` upstream. v_send_ret_submit() handles urb with a null transfer_buffer, when it replays a packet with potential malicious data that could contain a null buffer. Add a check for the condition when actual_length > 0 and transfer_buffer is null. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:56 +01:00
Shuah Khan	86c8d58fc7	usbip: fix vudc_rx: harden CMD_SUBMIT path to handle malicious input commit `b78d830f00` upstream. Harden CMD_SUBMIT path to handle malicious input that could trigger large memory allocations. Add checks to validate transfer_buffer_length and number_of_packets to protect against bad input requesting for unbounded memory allocations. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:56 +01:00
Shuah Khan	6851ec74bf	usbip: remove kernel addresses from usb device and urb debug msgs commit `e1346fd87c` upstream. usbip_dump_usb_device() and usbip_dump_urb() print kernel addresses. Remove kernel addresses from usb device and urb debug msgs and improve the message content. Instead of printing parent device and bus addresses, print parent device and bus names. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:55 +01:00
Pete Zaitcev	435db24bb9	USB: fix usbmon BUG trigger commit `46eb14a6e1` upstream. Automated tests triggered this by opening usbmon and accessing the mmap while simultaneously resizing the buffers. This bug was with us since 2006, because typically applications only size the buffers once and thus avoid racing. Reported by Kirill A. Shutemov. Reported-by: <syzbot+f9831b881b3e849829fc@syzkaller.appspotmail.com> Signed-off-by: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:55 +01:00
Stefan Agner	9f6ca0ea7a	usb: misc: usb3503: make sure reset is low for at least 100us commit `b8626f1dc2` upstream. When using a GPIO which is high by default, and initialize the driver in USB Hub mode, initialization fails with: [ 111.757794] usb3503 0-0008: SP_ILOCK failed (-5) The reason seems to be that the chip is not properly reset. Probe does initialize reset low, however some lines later the code already set it back high, which is not long enouth. Make sure reset is asserted for at least 100us by inserting a delay after initializing the reset pin during probe. Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:55 +01:00
Christian Holl	11632d079e	USB: serial: cp210x: add new device ID ELV ALC 8xxx commit `d14ac576d1` upstream. This adds the ELV ALC 8xxx Battery Charging device to the list of USB IDs of drivers/usb/serial/cp210x.c Signed-off-by: Christian Holl <cyborgx1@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:55 +01:00
Diego Elio Pettenò	4abe275c2d	USB: serial: cp210x: add IDs for LifeScan OneTouch Verio IQ commit `4307413256` upstream. Add IDs for the OneTouch Verio IQ that comes with an embedded USB-to-serial converter. Signed-off-by: Diego Elio Pettenò <flameeyes@flameeyes.eu> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:55 +01:00
Daniel Borkmann	820ef2a0e5	bpf, array: fix overflow in max_entries and undefined behavior in index_mask commit `bbeb6e4323` upstream. syzkaller tried to alloc a map with 0xfffffffd entries out of a userns, and thus unprivileged. With the recently added logic in `b2157399cc` ("bpf: prevent out-of-bounds speculation") we round this up to the next power of two value for max_entries for unprivileged such that we can apply proper masking into potentially zeroed out map slots. However, this will generate an index_mask of 0xffffffff, and therefore a + 1 will let this overflow into new max_entries of 0. This will pass allocation, etc, and later on map access we still enforce on the original attr->max_entries value which was 0xfffffffd, therefore triggering GPF all over the place. Thus bail out on overflow in such case. Moreover, on 32 bit archs roundup_pow_of_two() can also not be used, since fls_long(max_entries - 1) can result in 32 and 1UL << 32 in 32 bit space is undefined. Therefore, do this by hand in a 64 bit variable. This fixes all the issues triggered by syzkaller's reproducers. Fixes: `b2157399cc` ("bpf: prevent out-of-bounds speculation") Reported-by: syzbot+b0efb8e572d01bce1ae0@syzkaller.appspotmail.com Reported-by: syzbot+6c15e9744f75f2364773@syzkaller.appspotmail.com Reported-by: syzbot+d2f5524fb46fd3b312ee@syzkaller.appspotmail.com Reported-by: syzbot+61d23c95395cc90dbc2b@syzkaller.appspotmail.com Reported-by: syzbot+0d363c942452cca68c01@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:55 +01:00
Alexei Starovoitov	a9bfac14cd	bpf: prevent out-of-bounds speculation commit `b2157399cc` upstream. Under speculation, CPUs may mis-predict branches in bounds checks. Thus, memory accesses under a bounds check may be speculated even if the bounds check fails, providing a primitive for building a side channel. To avoid leaking kernel data round up array-based maps and mask the index after bounds check, so speculated load with out of bounds index will load either valid value from the array or zero from the padded area. Unconditionally mask index for all array types even when max_entries are not rounded to power of 2 for root user. When map is created by unpriv user generate a sequence of bpf insns that includes AND operation to make sure that JITed code includes the same 'index & index_mask' operation. If prog_array map is created by unpriv user replace bpf_tail_call(ctx, map, index); with if (index >= max_entries) { index &= map->index_mask; bpf_tail_call(ctx, map, index); } (along with roundup to power 2) to prevent out-of-bounds speculation. There is secondary redundant 'if (index >= max_entries)' in the interpreter and in all JITs, but they can be optimized later if necessary. Other array-like maps (cpumap, devmap, sockmap, perf_event_array, cgroup_array) cannot be used by unpriv, so no changes there. That fixes bpf side of "Variant 1: bounds check bypass (CVE-2017-5753)" on all architectures with and without JIT. v2->v3: Daniel noticed that attack potentially can be crafted via syscall commands without loading the program, so add masking to those paths as well. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Slaby <jslaby@suse.cz> [ Backported to 4.9 - gregkh ] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:55 +01:00
Alexei Starovoitov	f55093dccd	bpf: refactor fixup_bpf_calls() commit `79741b3bde` upstream. reduce indent and make it iterate over instructions similar to convert_ctx_accesses(). Also convert hard BUG_ON into soft verifier error. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Jiri Slaby <jslaby@suse.cz> [Backported to 4.9.y - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:55 +01:00
Alexei Starovoitov	28035366af	bpf: move fixup_bpf_calls() function commit `e245c5c6a5` upstream. no functional change. move fixup_bpf_calls() to verifier.c it's being refactored in the next patch Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Jiri Slaby <jslaby@suse.cz> [backported to 4.9 - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:55 +01:00
Nicholas Bellinger	60c7a9cd50	target: Avoid early CMD_T_PRE_EXECUTE failures during ABORT_TASK commit `1c21a48055` upstream. This patch fixes bug where early se_cmd exceptions that occur before backend execution can result in use-after-free if/when a subsequent ABORT_TASK occurs for the same tag. Since an early se_cmd exception will have had se_cmd added to se_session->sess_cmd_list via target_get_sess_cmd(), it will not have CMD_T_COMPLETE set by the usual target_complete_cmd() backend completion path. This causes a subsequent ABORT_TASK + __target_check_io_state() to signal ABORT_TASK should proceed. As core_tmr_abort_task() executes, it will bring the outstanding se_cmd->cmd_kref count down to zero releasing se_cmd, after se_cmd has already been queued with error status into fabric driver response path code. To address this bug, introduce a CMD_T_PRE_EXECUTE bit that is set at target_get_sess_cmd() time, and cleared immediately before backend driver dispatch in target_execute_cmd() once CMD_T_ACTIVE is set. Then, check CMD_T_PRE_EXECUTE within __target_check_io_state() to determine when an early exception has occured, and avoid aborting this se_cmd since it will have already been queued into fabric driver response path code. Reported-by: Donald White <dew@datera.io> Cc: Donald White <dew@datera.io> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:54 +01:00
Nicholas Bellinger	748e1b6281	iscsi-target: Make TASK_REASSIGN use proper se_cmd->cmd_kref commit `ae072726f6` upstream. Since commit `59b6986dbf` fixed a potential NULL pointer dereference by allocating a se_tmr_req for ISCSI_TM_FUNC_TASK_REASSIGN, the se_tmr_req is currently leaked by iscsit_free_cmd() because no iscsi_cmd->se_cmd.se_tfo was associated. To address this, treat ISCSI_TM_FUNC_TASK_REASSIGN like any other TMR and call transport_init_se_cmd() + target_get_sess_cmd() to setup iscsi_cmd->se_cmd.se_tfo with se_cmd->cmd_kref of 2. This will ensure normal release operation once se_cmd->cmd_kref reaches zero and target_release_cmd_kref() is invoked, se_tmr_req will be released via existing target_free_cmd_mem() and core_tmr_release_req() code. Reported-by: Donald White <dew@datera.io> Cc: Donald White <dew@datera.io> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:54 +01:00
Lepton Wu	ec61bafb2a	kaiser: Set _PAGE_NX only if supported This finally resolve crash if loaded under qemu + haxm. Haitao Shan pointed out that the reason of that crash is that NX bit get set for page tables. It seems we missed checking if _PAGE_NX is supported in kaiser_add_user_map Link: https://www.spinics.net/lists/kernel/msg2689835.html Reviewed-by: Guenter Roeck <groeck@chromium.org> Signed-off-by: Lepton Wu <ytht.net@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:54 +01:00
Dan Carpenter	08a7525811	drm/vmwgfx: Potential off by one in vmw_view_add() commit `0d9cac0ca0` upstream. The vmw_view_cmd_to_type() function returns vmw_view_max (3) on error. It's one element beyond the end of the vmw_view_cotables[] table. My read on this is that it's possible to hit this failure. header->id comes from vmw_cmd_check() and it's a user controlled number between 1040 and 1225 so we can hit that error. But I don't have the hardware to test this code. Fixes: `d80efd5cb3` ("drm/vmwgfx: Initial DX support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:54 +01:00
Andrew Honig	012df71d29	KVM: x86: Add memory barrier on vmcs field lookup commit `75f139aaf8` upstream. This adds a memory barrier when performing a lookup into the vmcs_field_to_offset_table. This is related to CVE-2017-5753. Signed-off-by: Andrew Honig <ahonig@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:54 +01:00
Jia Zhang	431fd501aa	x86/microcode/intel: Extend BDW late-loading with a revision check commit `b94b737331` upstream. Instead of blacklisting all model 79 CPUs when attempting a late microcode loading, limit that only to CPUs with microcode revisions < 0x0b000021 because only on those late loading may cause a system hang. For such processors either: a) a BIOS update which might contain a newer microcode revision or b) the early microcode loading method should be considered. Processors with revisions 0x0b000021 or higher will not experience such hangs. For more details, see erratum BDF90 in document #334165 (Intel Xeon Processor E7-8800/4800 v4 Product Family Specification Update) from September 2017. [ bp: Heavily massage commit message and pr_* statements. ] Fixes: `723f2828a9` ("x86/microcode/intel: Disable late loading on model 79") Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1514772287-92959-1-git-send-email-qianyue.zj@alibaba-inc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:54 +01:00
Ilya Dryomov	553a8b8c8d	rbd: set max_segments to USHRT_MAX commit `21acdf45f4` upstream. Commit `d3834fefcf` ("rbd: bump queue_max_segments") bumped max_segments (unsigned short) to max_hw_sectors (unsigned int). max_hw_sectors is set to the number of 512-byte sectors in an object and overflows unsigned short for 32M (largest possible) objects, making the block layer resort to handing us single segment (i.e. single page or even smaller) bios in that case. Fixes: `d3834fefcf` ("rbd: bump queue_max_segments") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:54 +01:00
Eric Biggers	3752d2fb9a	crypto: algapi - fix NULL dereference in crypto_remove_spawns() commit `9a00674213` upstream. syzkaller triggered a NULL pointer dereference in crypto_remove_spawns() via a program that repeatedly and concurrently requests AEADs "authenc(cmac(des3_ede-asm),pcbc-aes-aesni)" and hashes "cmac(des3_ede)" through AF_ALG, where the hashes are requested as "untested" (CRYPTO_ALG_TESTED is set in ->salg_mask but clear in ->salg_feat; this causes the template to be instantiated for every request). Although AF_ALG users really shouldn't be able to request an "untested" algorithm, the NULL pointer dereference is actually caused by a longstanding race condition where crypto_remove_spawns() can encounter an instance which has had spawn(s) "grabbed" but hasn't yet been registered, resulting in ->cra_users still being NULL. We probably should properly initialize ->cra_users earlier, but that would require updating many templates individually. For now just fix the bug in a simple way that can easily be backported: make crypto_remove_spawns() treat a NULL ->cra_users list as empty. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:54 +01:00
Roi Dayan	b28394cbb4	net/sched: Fix update of lastuse in act modules implementing stats_update [ Upstream commit `3bb23421a5` ] We need to update lastuse to to the most updated value between what is already set and the new value. If HW matching fails, i.e. because of an issue, the stats are not updated but it could be that software did match and updated lastuse. Fixes: `5712bf9c5c` ("net/sched: act_mirred: Use passed lastuse argument") Fixes: `9fea47d93b` ("net/sched: act_gact: Update statistics when offloaded to hardware") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:54 +01:00
Ido Schimmel	e2b825e8de	mlxsw: spectrum_router: Fix NULL pointer deref [ Upstream commit `8764a8267b` ] When we remove the neighbour associated with a nexthop we should always refuse to write the nexthop to the adjacency table. Regardless if it is already present in the table or not. Otherwise, we risk dereferencing the NULL pointer that was set instead of the neighbour. Fixes: `a7ff87acd9` ("mlxsw: spectrum_router: Implement next-hop routing") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:53 +01:00
Stephen Hemminger	16d5b481d0	ethtool: do not print warning for applications using legacy API [ Upstream commit `71891e2dab` ] In kernel log ths message appears on every boot: "warning: `NetworkChangeNo' uses legacy ethtool link settings API, link modes are only partially reported" When ethtool link settings API changed, it started complaining about usages of old API. Ironically, the original patch was from google but the application using the legacy API is chrome. Linux ABI is fixed as much as possible. The kernel must not break it and should not complain about applications using legacy API's. This patch just removes the warning since using legacy API's in Linux is perfectly acceptable. Fixes: `3f1ac7a700` ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:53 +01:00
Eric Dumazet	dde00c9224	ipv6: fix possible mem leaks in ipv6_make_skb() [ Upstream commit `862c03ee1d` ] ip6_setup_cork() might return an error, while memory allocations have been done and must be rolled back. Fixes: `6422398c2a` ("ipv6: introduce ipv6_make_skb") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Reported-by: Mike Maloney <maloney@google.com> Acked-by: Mike Maloney <maloney@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:53 +01:00
Jerome Brunet	6f237183c7	net: stmmac: enable EEE in MII, GMII or RGMII only [ Upstream commit `879626e3a5` ] Note in the databook - Section 4.4 - EEE : " The EEE feature is not supported when the MAC is configured to use the TBI, RTBI, SMII, RMII or SGMII single PHY interface. Even if the MAC supports multiple PHY interfaces, you should activate the EEE mode only when the MAC is operating with GMII, MII, or RGMII interface." Applying this restriction solves a stability issue observed on Amlogic gxl platforms operating with RMII interface and the internal PHY. Fixes: `83bf79b6bb` ("stmmac: disable at run-time the EEE if not supported") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Tested-by: Arnaud Patard <arnaud.patard@rtp-net.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:53 +01:00
Sergei Shtylyov	7f4226ffcb	sh_eth: fix SH7757 GEther initialization [ Upstream commit `5133550296` ] Renesas SH7757 has 2 Fast and 2 Gigabit Ether controllers, while the 'sh_eth' driver can only reset and initialize TSU of the first controller pair. Shimoda-san tried to solve that adding the 'needs_init' member to the 'struct sh_eth_plat_data', however the platform code still never sets this flag. I think that we can infer this information from the 'devno' variable (set to 'platform_device::id') and reset/init the Ether controller pair only for an even 'devno'; therefore 'sh_eth_plat_data::needs_init' can be removed... Fixes: `150647fb2c` ("net: sh_eth: change the condition of initialization") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:53 +01:00
Sergei Shtylyov	eb2f80e099	sh_eth: fix TSU resource handling [ Upstream commit `dfe8266b8d` ] When switching the driver to the managed device API, I managed to break the case of a dual Ether devices sharing a single TSU: the 2nd Ether port wouldn't probe. Iwamatsu-san has tried to fix this but his patch was buggy and he then dropped the ball... The solution is to limit calling devm_request_mem_region() to the first of the two ports sharing the same TSU, so devm_ioremap_resource() can't be used anymore for the TSU resource... Fixes: `d5e07e6921` ("sh_eth: use managed device API") Reported-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:53 +01:00
Mohamed Ghannam	ce31b6ac11	RDS: null pointer dereference in rds_atomic_free_op [ Upstream commit `7d11f77f84` ] set rm->atomic.op_active to 0 when rds_pin_pages() fails or the user supplied address is invalid, this prevents a NULL pointer usage in rds_atomic_free_op() Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:53 +01:00
Mohamed Ghannam	cebb382931	RDS: Heap OOB write in rds_message_alloc_sgs() [ Upstream commit `c095508770` ] When args->nr_local is 0, nr_pages gets also 0 due some size calculation via rds_rm_size(), which is later used to allocate pages for DMA, this bug produces a heap Out-Of-Bound write access to a specific memory region. Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:53 +01:00
Andrii Vladyka	61196a67ca	net: core: fix module type in sock_diag_bind [ Upstream commit `b8fd0823e0` ] Use AF_INET6 instead of AF_INET in IPv6-related code path Signed-off-by: Andrii Vladyka <tulup@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:53 +01:00
Eli Cooper	ca5681b723	ip6_tunnel: disable dst caching if tunnel is dual-stack [ Upstream commit `23263ec86a` ] When an ip6_tunnel is in mode 'any', where the transport layer protocol can be either 4 or 41, dst_cache must be disabled. This is because xfrm policies might apply to only one of the two protocols. Caching dst would cause xfrm policies for one protocol incorrectly used for the other. Signed-off-by: Eli Cooper <elicooper@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:52 +01:00
Cong Wang	fe71f34fbf	8021q: fix a memory leak for VLAN 0 device [ Upstream commit `78bbb15f22` ] A vlan device with vid 0 is allow to creat by not able to be fully cleaned up by unregister_vlan_dev() which checks for vlan_id!=0. Also, VLAN 0 is probably not a valid number and it is kinda "reserved" for HW accelerating devices, but it is probably too late to reject it from creation even if makes sense. Instead, just remove the check in unregister_vlan_dev(). Reported-by: Dmitry Vyukov <dvyukov@google.com> Fixes: `ad1afb0039` ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:52 +01:00
Ben Hutchings	66bb6c2c44	xhci: Fix ring leak in failure path of xhci_alloc_virt_device() This is a stable-only fix for the backport of commit `5d9b70f7d5` ("xhci: Don't add a virt_dev to the devs array before it's fully allocated"). In branches that predate commit `c5628a2af8` ("xhci: remove endpoint ring cache") there is an additional failure path in xhci_alloc_virt_device() where ring cache allocation fails, in which case we need to free the ring allocated for endpoint 0. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Mathias Nyman <mathias.nyman@intel.com>	2018-01-17 09:38:52 +01:00
Eric Dumazet	135f98084e	cx82310_eth: use skb_cow_head() to deal with cloned skbs commit `a9e840a208` upstream. We need to ensure there is enough headroom to push extra header, but we also need to check if we are allowed to change headers. skb_cow_head() is the proper helper to deal with this. Fixes: `cc28a20e77` ("introduce cx82310_eth: Conexant CX82310-based ADSL router USB ethernet driver") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Hughes <james.hughes@raspberrypi.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:52 +01:00
Eric Dumazet	7c5015409b	smsc75xx: use skb_cow_head() to deal with cloned skbs commit `b7c6d26758` upstream. We need to ensure there is enough headroom to push extra header, but we also need to check if we are allowed to change headers. skb_cow_head() is the proper helper to deal with this. Fixes: `d0cad87170` ("smsc75xx: SMSC LAN75xx USB gigabit ethernet adapter driver") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Hughes <james.hughes@raspberrypi.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:52 +01:00
Eric Dumazet	ab4fd7a2dd	sr9700: use skb_cow_head() to deal with cloned skbs commit `d532c1082f` upstream. We need to ensure there is enough headroom to push extra header, but we also need to check if we are allowed to change headers. skb_cow_head() is the proper helper to deal with this. Fixes: `c9b37458e9` ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Hughes <james.hughes@raspberrypi.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:52 +01:00
Eric Dumazet	542bcc5493	lan78xx: use skb_cow_head() to deal with cloned skbs commit `d4ca735919` upstream. We need to ensure there is enough headroom to push extra header, but we also need to check if we are allowed to change headers. skb_cow_head() is the proper helper to deal with this. Fixes: `55d7de9de6` ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Hughes <james.hughes@raspberrypi.org> Cc: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:52 +01:00
Dan Streetman	1ecdfc1ee9	zswap: don't param_set_charp while holding spinlock commit `fd5bb66cd9` upstream. Change the zpool/compressor param callback function to release the zswap_pools_lock spinlock before calling param_set_charp, since that function may sleep when it calls kmalloc with GFP_KERNEL. While this problem has existed for a while, I wasn't able to trigger it using a tight loop changing either/both the zpool and compressor params; I think it's very unlikely to be an issue on the stable kernels, especially since most zswap users will change the compressor and/or zpool from sysfs only one time each boot - or zero times, if they add the params to the kernel boot. Fixes: `c99b42c352` ("zswap: use charp for zswap param strings") Link: http://lkml.kernel.org/r/20170126155821.4545-1-ddstreet@ieee.org Signed-off-by: Dan Streetman <dan.streetman@canonical.com> Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:52 +01:00
Vikas C Sajjan	5c1b80f674	x86/acpi: Reduce code duplication in mp_override_legacy_irq() commit `4ee2ec1b12` upstream. The new function mp_register_ioapic_irq() is a subset of the code in mp_override_legacy_irq(). Replace the code duplication by invoking mp_register_ioapic_irq() from mp_override_legacy_irq(). Signed-off-by: Vikas C Sajjan <vikas.cha.sajjan@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: linux-pm@vger.kernel.org Cc: kkamagui@gmail.com Cc: linux-acpi@vger.kernel.org Link: https://lkml.kernel.org/r/1510848825-21965-3-git-send-email-vikas.cha.sajjan@hpe.com Cc: Jean Delvare <jdelvare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:52 +01:00
Takashi Iwai	43ff00f873	ALSA: aloop: Fix racy hw constraints adjustment commit `898dfe4687` upstream. The aloop driver tries to update the hw constraints of the connected target on the cable of the opened PCM substream. This is done by adding the extra hw constraints rules referring to the substream runtime->hw fields, while the other substream may update the runtime hw of another side on the fly. This is, however, racy and may result in the inconsistent values when both PCM streams perform the prepare concurrently. One of the reason is that it overwrites the other's runtime->hw field; which is not only racy but also broken when it's called before the open of another side finishes. And, since the reference to runtime->hw isn't protected, the concurrent write may give the partial value update and become inconsistent. This patch is an attempt to fix and clean up: - The prepare doesn't change the runtime->hw of other side any longer, but only update the cable->hw that is referred commonly. - The extra rules refer to the loopback_pcm object instead of the runtime->hw. The actual hw is deduced from cable->hw. - The extra rules take the cable_lock to protect against the race. Fixes: `b1c73fc8e6` ("ALSA: snd-aloop: Fix hw_params restrictions and checking") Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:51 +01:00
Takashi Iwai	5af666d0dd	ALSA: aloop: Fix inconsistent format due to incomplete rule commit `b088b53e20` upstream. The extra hw constraint rule for the formats the aloop driver introduced has a slight flaw, where it doesn't return a positive value when the mask got changed. It came from the fact that it's basically a copy&paste from snd_hw_constraint_mask64(). The original code is supposed to be a single-shot and it modifies the mask bits only once and never after, while what we need for aloop is the dynamic hw rule that limits the mask bits. This difference results in the inconsistent state, as the hw_refine doesn't apply the dependencies fully. The worse and surprisingly result is that it causes a crash in OSS emulation when multiple full-duplex reads/writes are performed concurrently (I leave why it triggers Oops to readers as a homework). For fixing this, replace a few open-codes with the standard snd_mask_*() macros. Reported-by: syzbot+3902b5220e8ca27889ca@syzkaller.appspotmail.com Fixes: `b1c73fc8e6` ("ALSA: snd-aloop: Fix hw_params restrictions and checking") Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:51 +01:00
Takashi Iwai	01046dd834	ALSA: aloop: Release cable upon open error path commit `9685347aa0` upstream. The aloop runtime object and its assignment in the cable are left even when opening a substream fails. This doesn't mean any memory leak, but it still keeps the invalid pointer that may be referred by the another side of the cable spontaneously, which is a potential Oops cause. Clean up the cable assignment and the empty cable upon the error path properly. Fixes: `597603d615` ("ALSA: introduce the snd-aloop module for the PCM loopback") Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:51 +01:00
Takashi Iwai	bee3f2d5c0	ALSA: pcm: Allow aborting mutex lock at OSS read/write loops commit `900498a34a` upstream. PCM OSS read/write loops keep taking the mutex lock for the whole read/write, and this might take very long when the exceptionally high amount of data is given. Also, since it invokes with mutex_lock(), the concurrent read/write becomes unbreakable. This patch tries to address these issues by replacing mutex_lock() with mutex_lock_interruptible(), and also splits / re-takes the lock at each read/write period chunk, so that it can switch the context more finely if requested. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:51 +01:00
Takashi Iwai	3a00564cb4	ALSA: pcm: Abort properly at pending signal in OSS read/write loops commit `29159a4ed7` upstream. The loops for read and write in PCM OSS emulation have no proper check of pending signals, and they keep processing even after user tries to break. This results in a very long delay, often seen as RCU stall when a huge unprocessed bytes remain queued. The bug could be easily triggered by syzkaller. As a simple workaround, this patch adds the proper check of pending signals and aborts the loop appropriately. Reported-by: syzbot+993cb4cfcbbff3947c21@syzkaller.appspotmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:51 +01:00
Takashi Iwai	8e81425e80	ALSA: pcm: Add missing error checks in OSS emulation plugin builder commit `6708913750` upstream. In the OSS emulation plugin builder where the frame size is parsed in the plugin chain, some places miss the possible errors returned from the plugin src_ or dst_frames callback. This patch papers over such places. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:51 +01:00
Takashi Iwai	83da0245ed	ALSA: pcm: Remove incorrect snd_BUG_ON() usages commit `fe08f34d06` upstream. syzkaller triggered kernel warnings through PCM OSS emulation at closing a stream: WARNING: CPU: 0 PID: 3502 at sound/core/pcm_lib.c:1635 snd_pcm_hw_param_first+0x289/0x690 sound/core/pcm_lib.c:1635 Call Trace: .... snd_pcm_hw_param_near.constprop.27+0x78d/0x9a0 sound/core/oss/pcm_oss.c:457 snd_pcm_oss_change_params+0x17d3/0x3720 sound/core/oss/pcm_oss.c:969 snd_pcm_oss_make_ready+0xaa/0x130 sound/core/oss/pcm_oss.c:1128 snd_pcm_oss_sync+0x257/0x830 sound/core/oss/pcm_oss.c:1638 snd_pcm_oss_release+0x20b/0x280 sound/core/oss/pcm_oss.c:2431 __fput+0x327/0x7e0 fs/file_table.c:210 .... This happens while it tries to open and set up the aloop device concurrently. The warning above (invoked from snd_BUG_ON() macro) is to detect the unexpected logical error where snd_pcm_hw_refine() call shouldn't fail. The theory is true for the case where the hw_params config rules are static. But for an aloop device, the hw_params rule condition does vary dynamically depending on the connected target; when another device is opened and changes the parameters, the device connected in another side is also affected, and it caused the error from snd_pcm_hw_refine(). That is, the simplest "solution" for this is to remove the incorrect assumption of static rules, and treat such an error as a normal error path. As there are a couple of other places using snd_BUG_ON() incorrectly, this patch removes these spurious snd_BUG_ON() calls. Reported-by: syzbot+6f11c7e2a1b91d466432@syzkaller.appspotmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:51 +01:00
Vikas C Sajjan	0199927a8e	x86/acpi: Handle SCI interrupts above legacy space gracefully commit `252714155f` upstream. Platforms which support only IOAPIC mode, pass the SCI information above the legacy space (0-15) via the FADT mechanism and not via MADT. In such cases mp_override_legacy_irq() which is invoked from acpi_sci_ioapic_setup() to register SCI interrupts fails for interrupts greater equal 16, since it is meant to handle only the legacy space and emits error "Invalid bus_irq %u for legacy override". Add a new function to handle SCI interrupts >= 16 and invoke it conditionally in acpi_sci_ioapic_setup(). The code duplication due to this new function will be cleaned up in a separate patch. Co-developed-by: Sunil V L <sunil.vl@hpe.com> Signed-off-by: Vikas C Sajjan <vikas.cha.sajjan@hpe.com> Signed-off-by: Sunil V L <sunil.vl@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Abdul Lateef Attar <abdul-lateef.attar@hpe.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: linux-pm@vger.kernel.org Cc: kkamagui@gmail.com Cc: linux-acpi@vger.kernel.org Link: https://lkml.kernel.org/r/1510848825-21965-2-git-send-email-vikas.cha.sajjan@hpe.com Cc: Jean Delvare <jdelvare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:51 +01:00
Rafael J. Wysocki	64ab063b71	platform/x86: wmi: Call acpi_wmi_init() later commit `98b8e4e5c1` upstream. Calling acpi_wmi_init() at the subsys_initcall() level causes ordering issues to appear on some systems and they are difficult to reproduce, because there is no guaranteed ordering between subsys_initcall() calls, so they may occur in different orders on different systems. In particular, commit `86d9f48534` (mm/slab: fix kmemcg cache creation delayed issue) exposed one of these issues where genl_init() and acpi_wmi_init() are both called at the same initcall level, but the former must run before the latter so as to avoid a NULL pointer dereference. For this reason, move the acpi_wmi_init() invocation to the initcall_sync level which should still be early enough for things to work correctly in the WMI land. Link: https://marc.info/?t=151274596700002&r=1&w=2 Reported-by: Jonathan McDowell <noodles@earth.li> Reported-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Tested-by: Jonathan McDowell <noodles@earth.li> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:51 +01:00
Jim Mattson	491c0ca3db	kvm: vmx: Scrub hardware GPRs at VM-exit commit `0cb5b30698` upstream. Guest GPR values are live in the hardware GPRs at VM-exit. Do not leave any guest values in hardware GPRs after the guest GPR values are saved to the vcpu_vmx structure. This is a partial mitigation for CVE 2017-5715 and CVE 2017-5753. Specifically, it defeats the Project Zero PoC for CVE 2017-5715. Suggested-by: Eric Northup <digitaleric@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Eric Northup <digitaleric@google.com> Reviewed-by: Benjamin Serebrin <serebrin@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> [Paolo: Add AMD bits, Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:51 +01:00
Maciej W. Rozycki	78c00f597b	MIPS: Disallow outsized PTRACE_SETREGSET NT_PRFPREG regset accesses commit `c8c5a3a24d` upstream. Complement commit `c23b3d1a53` ("MIPS: ptrace: Change GP regset to use correct core dump register layout") and also reject outsized PTRACE_SETREGSET requests to the NT_PRFPREG regset, like with the NT_PRSTATUS regset. Signed-off-by: Maciej W. Rozycki <macro@mips.com> Fixes: `c23b3d1a53` ("MIPS: ptrace: Change GP regset to use correct core dump register layout") Cc: James Hogan <james.hogan@mips.com> Cc: Paul Burton <Paul.Burton@mips.com> Cc: Alex Smith <alex@alex-smith.me.uk> Cc: Dave Martin <Dave.Martin@arm.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/17930/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:50 +01:00
Maciej W. Rozycki	1f4cff1c36	MIPS: Also verify sizeof `elf_fpreg_t' with PTRACE_SETREGSET commit `006501e039` upstream. Complement commit `d614fd58a2` ("mips/ptrace: Preserve previous registers for short regset write") and like with the PTRACE_GETREGSET ptrace(2) request also apply a BUILD_BUG_ON check for the size of the `elf_fpreg_t' type in the PTRACE_SETREGSET request handler. Signed-off-by: Maciej W. Rozycki <macro@mips.com> Fixes: `d614fd58a2` ("mips/ptrace: Preserve previous registers for short regset write") Cc: James Hogan <james.hogan@mips.com> Cc: Paul Burton <Paul.Burton@mips.com> Cc: Alex Smith <alex@alex-smith.me.uk> Cc: Dave Martin <Dave.Martin@arm.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/17929/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:50 +01:00
Maciej W. Rozycki	cfc5c63a38	MIPS: Fix an FCSR access API regression with NT_PRFPREG and MSA commit `be07a6a118` upstream. Fix a commit `72b22bbad1` ("MIPS: Don't assume 64-bit FP registers for FP regset") public API regression, then activated by commit `1db1af84d6` ("MIPS: Basic MSA context switching support"), that caused the FCSR register not to be read or written for CONFIG_CPU_HAS_MSA kernel configurations (regardless of actual presence or absence of the MSA feature in a given processor) with ptrace(2) PTRACE_GETREGSET and PTRACE_SETREGSET requests nor recorded in core dumps. This is because with !CONFIG_CPU_HAS_MSA configurations the whole of `elf_fpregset_t' array is bulk-copied as it is, which includes the FCSR in one half of the last, 33rd slot, whereas with CONFIG_CPU_HAS_MSA configurations array elements are copied individually, and then only the leading 32 FGR slots while the remaining slot is ignored. Correct the code then such that only FGR slots are copied in the respective !MSA and MSA helpers an then the FCSR slot is handled separately in common code. Use `ptrace_setfcr31' to update the FCSR too, so that the read-only mask is respected. Retrieving a correct value of FCSR is important in debugging not only for the human to be able to get the right interpretation of the situation, but for correct operation of GDB as well. This is because the condition code bits in FSCR are used by GDB to determine the location to place a breakpoint at when single-stepping through an FPU branch instruction. If such a breakpoint is placed incorrectly (i.e. with the condition reversed), then it will be missed, likely causing the debuggee to run away from the control of GDB and consequently breaking the process of investigation. Fortunately GDB continues using the older PTRACE_GETFPREGS ptrace(2) request which is unaffected, so the regression only really hits with post-mortem debug sessions using a core dump file, in which case execution, and consequently single-stepping through branches is not possible. Of course core files created by buggy kernels out there will have the value of FCSR recorded clobbered, but such core files cannot be corrected and the person using them simply will have to be aware that the value of FCSR retrieved is not reliable. Which also means we can likely get away without defining a replacement API which would ensure a correct value of FSCR to be retrieved, or none at all. This is based on previous work by Alex Smith, extensively rewritten. Signed-off-by: Alex Smith <alex@alex-smith.me.uk> Signed-off-by: James Hogan <james.hogan@mips.com> Signed-off-by: Maciej W. Rozycki <macro@mips.com> Fixes: `72b22bbad1` ("MIPS: Don't assume 64-bit FP registers for FP regset") Cc: Paul Burton <Paul.Burton@mips.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/17928/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:50 +01:00
Maciej W. Rozycki	f616180a87	MIPS: Consistently handle buffer counter with PTRACE_SETREGSET commit `80b3ffce01` upstream. Update commit `d614fd58a2` ("mips/ptrace: Preserve previous registers for short regset write") bug and consistently consume all data supplied to `fpr_set_msa' with the ptrace(2) PTRACE_SETREGSET request, such that a zero data buffer counter is returned where insufficient data has been given to fill a whole number of FP general registers. In reality this is not going to happen, as the caller is supposed to only supply data covering a whole number of registers and it is verified in `ptrace_regset' and again asserted in `fpr_set', however structuring code such that the presence of trailing partial FP general register data causes `fpr_set_msa' to return with a non-zero data buffer counter makes it appear that this trailing data will be used if there are subsequent writes made to FP registers, which is going to be the case with the FCSR once the missing write to that register has been fixed. Fixes: `d614fd58a2` ("mips/ptrace: Preserve previous registers for short regset write") Signed-off-by: Maciej W. Rozycki <macro@mips.com> Cc: James Hogan <james.hogan@mips.com> Cc: Paul Burton <Paul.Burton@mips.com> Cc: Alex Smith <alex@alex-smith.me.uk> Cc: Dave Martin <Dave.Martin@arm.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/17927/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:50 +01:00
Maciej W. Rozycki	5b593a81fd	MIPS: Guard against any partial write attempt with PTRACE_SETREGSET commit `dc24d0edf3` upstream. Complement commit `d614fd58a2` ("mips/ptrace: Preserve previous registers for short regset write") and ensure that no partial register write attempt is made with PTRACE_SETREGSET, as we do not preinitialize any temporaries used to hold incoming register data and consequently random data could be written. It is the responsibility of the caller, such as `ptrace_regset', to arrange for writes to span whole registers only, so here we only assert that it has indeed happened. Signed-off-by: Maciej W. Rozycki <macro@mips.com> Fixes: `72b22bbad1` ("MIPS: Don't assume 64-bit FP registers for FP regset") Cc: James Hogan <james.hogan@mips.com> Cc: Paul Burton <Paul.Burton@mips.com> Cc: Alex Smith <alex@alex-smith.me.uk> Cc: Dave Martin <Dave.Martin@arm.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/17926/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:50 +01:00
Maciej W. Rozycki	8eb5655aac	MIPS: Factor out NT_PRFPREG regset access helpers commit `a03fe72572` upstream. In preparation to fix a commit `72b22bbad1` ("MIPS: Don't assume 64-bit FP registers for FP regset") FCSR access regression factor out NT_PRFPREG regset access helpers for the non-MSA and the MSA variants respectively, to avoid having to deal with excessive indentation in the actual fix. No functional change, however use `target->thread.fpu.fpr[0]' rather than `target->thread.fpu.fpr[i]' for FGR holding type size determination as there's no `i' variable to refer to anymore, and for the factored out `i' variable declaration use `unsigned int' rather than `unsigned' as its type, following the common style. Signed-off-by: Maciej W. Rozycki <macro@mips.com> Fixes: `72b22bbad1` ("MIPS: Don't assume 64-bit FP registers for FP regset") Cc: James Hogan <james.hogan@mips.com> Cc: Paul Burton <Paul.Burton@mips.com> Cc: Alex Smith <alex@alex-smith.me.uk> Cc: Dave Martin <Dave.Martin@arm.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/17925/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:50 +01:00
Maciej W. Rozycki	14e1c579ac	MIPS: Validate PR_SET_FP_MODE prctl(2) requests against the ABI of the task commit `b67336eee3` upstream. Fix an API loophole introduced with commit `9791554b45` ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS"), where the caller of prctl(2) is incorrectly allowed to make a change to CP0.Status.FR or CP0.Config5.FRE register bits even if CONFIG_MIPS_O32_FP64_SUPPORT has not been enabled, despite that an executable requesting the mode requested via ELF file annotation would not be allowed to run in the first place, or for n64 and n64 ABI tasks which do not have non-default modes defined at all. Add suitable checks to `mips_set_process_fp_mode' and bail out if an invalid mode change has been requested for the ABI in effect, even if the FPU hardware or emulation would otherwise allow it. Always succeed however without taking any further action if the mode requested is the same as one already in effect, regardless of whether any mode change, should it be requested, would actually be allowed for the task concerned. Signed-off-by: Maciej W. Rozycki <macro@mips.com> Fixes: `9791554b45` ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS") Reviewed-by: Paul Burton <paul.burton@mips.com> Cc: James Hogan <james.hogan@mips.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/17800/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:50 +01:00
Bart Van Assche	3019171864	IB/srpt: Disable RDMA access by the initiator commit `bec40c2604` upstream. With the SRP protocol all RDMA operations are initiated by the target. Since no RDMA operations are initiated by the initiator, do not grant the initiator permission to submit RDMA reads or writes to the target. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:50 +01:00
Wolfgang Grandegger	02f201f78f	can: gs_usb: fix return value of the "set_bittiming" callback commit `d5b42e6607` upstream. The "set_bittiming" callback treats a positive return value as error! For that reason "can_changelink()" will quit silently after setting the bittiming values without processing ctrlmode, restart-ms, etc. Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:49 +01:00
Wanpeng Li	c781e3be97	KVM: Fix stack-out-of-bounds read in write_mmio commit `e39d200fa5` upstream. Reported by syzkaller: BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm] Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298 CPU: 6 PID: 32298 Comm: syz-executor Tainted: G OE 4.15.0-rc2+ #18 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016 Call Trace: dump_stack+0xab/0xe1 print_address_description+0x6b/0x290 kasan_report+0x28a/0x370 write_mmio+0x11e/0x270 [kvm] emulator_read_write_onepage+0x311/0x600 [kvm] emulator_read_write+0xef/0x240 [kvm] emulator_fix_hypercall+0x105/0x150 [kvm] em_hypercall+0x2b/0x80 [kvm] x86_emulate_insn+0x2b1/0x1640 [kvm] x86_emulate_instruction+0x39a/0xb90 [kvm] handle_exception+0x1b4/0x4d0 [kvm_intel] vcpu_enter_guest+0x15a0/0x2640 [kvm] kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm] kvm_vcpu_ioctl+0x479/0x880 [kvm] do_vfs_ioctl+0x142/0x9a0 SyS_ioctl+0x74/0x80 entry_SYSCALL_64_fastpath+0x23/0x9a The path of patched vmmcall will patch 3 bytes opcode 0F 01 C1(vmcall) to the guest memory, however, write_mmio tracepoint always prints 8 bytes through (u64 )val since kvm splits the mmio access into 8 bytes. This leaks 5 bytes from the kernel stack (CVE-2017-17741). This patch fixes it by just accessing the bytes which we operate on. Before patch: syz-executor-5567 [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f After patch: syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:49 +01:00
Vasanthakumar Thiagarajan	c5ab9ee144	ath10k: rebuild crypto header in rx data frames commit `7eccb738fc` upstream. Rx data frames notified through HTT_T2H_MSG_TYPE_RX_IND and HTT_T2H_MSG_TYPE_RX_FRAG_IND expect PN/TSC check to be done on host (mac80211) rather than firmware. Rebuild cipher header in every received data frames (that are notified through those HTT interfaces) from the rx_hdr_status tlv available in the rx descriptor of the first msdu. Skip setting RX_FLAG_IV_STRIPPED flag for the packets which requires mac80211 PN/TSC check support and set appropriate RX_FLAG for stripped crypto tail. Hw QCA988X, QCA9887, QCA99X0, QCA9984, QCA9888 and QCA4019 currently need the rebuilding of cipher header to perform PN/TSC check for replay attack. Please note that removing crypto tail for CCMP-256, GCMP and GCMP-256 ciphers in raw mode needs to be fixed. Since Rx with these ciphers in raw mode does not work in the current form even without this patch and removing crypto tail for these chipers needs clean up, raw mode related issues in CCMP-256, GCMP and GCMP-256 can be addressed in follow up patches. Tested-by: Manikanta Pubbisetty <mpubbise@qti.qualcomm.com> Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qti.qualcomm.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:49 +01:00
David Spinadel	234c8e6043	mac80211: Add RX flag to indicate ICV stripped commit `cef0acd4d7` upstream. Add a flag that indicates that the WEP ICV was stripped from an RX packet, allowing the device to not transfer that if it's already checked. Signed-off-by: David Spinadel <david.spinadel@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:49 +01:00
Suren Baghdasaryan	b58aa24edb	dm bufio: fix shrinker scans when (nr_to_scan < retain_target) commit `fbc7c07ec2` upstream. When system is under memory pressure it is observed that dm bufio shrinker often reclaims only one buffer per scan. This change fixes the following two issues in dm bufio shrinker that cause this behavior: 1. ((nr_to_scan - freed) <= retain_target) condition is used to terminate slab scan process. This assumes that nr_to_scan is equal to the LRU size, which might not be correct because do_shrink_slab() in vmscan.c calculates nr_to_scan using multiple inputs. As a result when nr_to_scan is less than retain_target (64) the scan will terminate after the first iteration, effectively reclaiming one buffer per scan and making scans very inefficient. This hurts vmscan performance especially because mutex is acquired/released every time dm_bufio_shrink_scan() is called. New implementation uses ((LRU size - freed) <= retain_target) condition for scan termination. LRU size can be safely determined inside __scan() because this function is called after dm_bufio_lock(). 2. do_shrink_slab() uses value returned by dm_bufio_shrink_count() to determine number of freeable objects in the slab. However dm_bufio always retains retain_target buffers in its LRU and will terminate a scan when this mark is reached. Therefore returning the entire LRU size from dm_bufio_shrink_count() is misleading because that does not represent the number of freeable objects that slab will reclaim during a scan. Returning (LRU size - retain_target) better represents the number of freeable objects in the slab. This way do_shrink_slab() returns 0 when (LRU size < retain_target) and vmscan will not try to scan this shrinker avoiding scans that will not reclaim any memory. Test: tested using Android device running <AOSP>/system/extras/alloc-stress that generates memory pressure and causes intensive shrinker scans Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-17 09:38:49 +01:00
Greg Kroah-Hartman	7bbc6ca488	Linux 4.9.76	2018-01-10 09:29:55 +01:00
Boris Brezillon	5e1f377fc8	mtd: nand: pxa3xx: Fix READOOB implementation commit `fee4380f36` upstream. In the current driver, OOB bytes are accessed in raw mode, and when a page access is done with NDCR_SPARE_EN set and NDCR_ECC_EN cleared, the driver must read the whole spare area (64 bytes in case of a 2k page, 16 bytes for a 512 page). The driver was only reading the free OOB bytes, which was leaving some unread data in the FIFO and was somehow leading to a timeout. We could patch the driver to read ->spare_size + ->ecc_size instead of just ->spare_size when READOOB is requested, but we'd better make in-band and OOB accesses consistent. Since the driver is always accessing in-band data in non-raw mode (with the ECC engine enabled), we should also access OOB data in this mode. That's particularly useful when using the BCH engine because in this mode the free OOB bytes are also ECC protected. Fixes: `43bcfd2bb2` ("mtd: nand: pxa3xx: Add driver-specific ECC BCH support") Reported-by: Sean Nyekjær <sean.nyekjaer@prevas.dk> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Acked-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Tested-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk> Acked-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:54 +01:00
Borislav Petkov	beca4e2d99	Map the vsyscall page with _PAGE_USER This needs to happen early in kaiser_pagetable_walk(), before the hierarchy is established so that _PAGE_USER permission can be really set. A proper fix would be to teach kaiser_pagetable_walk() to update those permissions but the vsyscall page is the only exception here so ... Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:54 +01:00
Thomas Gleixner	47f3cea393	x86/tlb: Drop the _GPL from the cpu_tlbstate export commit `1e5476815f` upstream. The recent changes for PTI touch cpu_tlbstate from various tlb_flush inlines. cpu_tlbstate is exported as GPL symbol, so this causes a regression when building out of tree drivers for certain graphics cards. Aside of that the export was wrong since it was introduced as it should have been EXPORT_PER_CPU_SYMBOL_GPL(). Use the correct PER_CPU export and drop the _GPL to restore the previous state which allows users to utilize the cards they payed for. As always I'm really thrilled to make this kind of change to support the #friends (or however the hot hashtag of today is spelled) from that closet sauce graphics corp. Fixes: `1e02ce4ccc` ("x86: Store a per-cpu shadow copy of CR4") Fixes: `6fd166aae7` ("x86/mm: Use/Fix PCID to optimize user/kernel switches") Reported-by: Kees Cook <keescook@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Backlund <tmb@mageia.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:54 +01:00
Helge Deller	91dfc41e75	parisc: qemu idle sleep support commit `310d82784f` upstream. Add qemu idle sleep support when running under qemu with SeaBIOS PDC firmware. Like the power architecture we use the "or" assembler instructions, which translate to nops on real hardware, to indicate that qemu shall idle sleep. Signed-off-by: Helge Deller <deller@gmx.de> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:54 +01:00
Helge Deller	14c06206b9	parisc: Fix alignment of pa_tlb_lock in assembly on 32-bit SMP kernel commit `88776c0e70` upstream. Qemu for PARISC reported on a 32bit SMP parisc kernel strange failures about "Not-handled unaligned insn 0x0e8011d6 and 0x0c2011c9." Those opcodes evaluate to the ldcw() assembly instruction which requires (on 32bit) an alignment of 16 bytes to ensure atomicity. As it turns out, qemu is correct and in our assembly code in entry.S and pacache.S we don't pay attention to the required alignment. This patch fixes the problem by aligning the lock offset in assembly code in the same manner as we do in our C-code. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:54 +01:00
Tom Lendacky	dd43c465ba	x86/microcode/AMD: Add support for fam17h microcode loading commit `f4e9b7af0c` upstream. The size for the Microcode Patch Block (MPB) for an AMD family 17h processor is 3200 bytes. Add a #define for fam17h so that it does not default to 2048 bytes and fail a microcode load/update. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20171130224640.15391.40247.stgit@tlendack-t1.amdoffice.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Alice Ferrazzi <alicef@gentoo.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:53 +01:00
Aaron Ma	2b009d33f4	Input: elantech - add new icbody type 15 commit `10d900303f` upstream. The touchpad of Lenovo Thinkpad L480 reports it's version as 15. Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:53 +01:00
Vineet Gupta	cc1349fa9c	ARC: uaccess: dont use "l" gcc inline asm constraint modifier commit `79435ac78d` upstream. This used to setup the LP_COUNT register automatically, but now has been removed. There was an earlier fix `3c7c7a2fc8` which fixed instance in delay.h but somehow missed this one as gcc change had not made its way into production toolchains and was not pedantic as it is now ! Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:53 +01:00
Robin Murphy	e6a897a684	iommu/arm-smmu-v3: Cope with duplicated Stream IDs commit `563b5cbe33` upstream. For PCI devices behind an aliasing PCIe-to-PCI/X bridge, the bridge alias to DevFn 0.0 on the subordinate bus may match the original RID of the device, resulting in the same SID being present in the device's fwspec twice. This causes trouble later in arm_smmu_write_strtab_ent() when we wind up visiting the STE a second time and find it already live. Avoid the issue by giving arm_smmu_install_ste_for_dev() the cleverness to skip over duplicates. It seems mildly counterintuitive compared to preventing the duplicates from existing in the first place, but since the DT and ACPI probe paths build their fwspecs differently, this is actually the cleanest and most self-contained way to deal with it. Fixes: `8f78515425` ("iommu/arm-smmu: Implement of_xlate() for SMMUv3") Reported-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Tested-by: Jayachandran C. <jnair@caviumnetworks.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:53 +01:00
Jean-Philippe Brucker	03975faee7	iommu/arm-smmu-v3: Don't free page table ops twice commit `57d72e159b` upstream. Kasan reports a double free when finalise_stage_fn fails: the io_pgtable ops are freed by arm_smmu_domain_finalise and then again by arm_smmu_domain_free. Prevent this by leaving pgtbl_ops empty on failure. Fixes: `48ec83bcbc` ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices") Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:53 +01:00
Oleg Nesterov	4d53eb4949	kernel/signal.c: remove the no longer needed SIGNAL_UNKILLABLE check in complete_signal() commit `426915796c` upstream. complete_signal() checks SIGNAL_UNKILLABLE before it starts to destroy the thread group, today this is wrong in many ways. If nothing else, fatal_signal_pending() should always imply that the whole thread group (except ->group_exit_task if it is not NULL) is killed, this check breaks the rule. After the previous changes we can rely on sig_task_ignored(); sig_fatal(sig) && SIGNAL_UNKILLABLE can only be true if we actually want to kill this task and sig == SIGKILL OR it is traced and debugger can intercept the signal. This should hopefully fix the problem reported by Dmitry. This test-case static int init(void arg) { for (;;) pause(); } int main(void) { char stack[16 1024]; for (;;) { int pid = clone(init, stack + sizeof(stack)/2, CLONE_NEWPID \| SIGCHLD, NULL); assert(pid > 0); assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0); assert(waitpid(-1, NULL, WSTOPPED) == pid); assert(ptrace(PTRACE_DETACH, pid, 0, SIGSTOP) == 0); assert(syscall(__NR_tkill, pid, SIGKILL) == 0); assert(pid == wait(NULL)); } } triggers the WARN_ON_ONCE(!(task->jobctl & JOBCTL_STOP_PENDING)) in task_participate_group_stop(). do_signal_stop()->signal_group_exit() checks SIGNAL_GROUP_EXIT and return false, but task_set_jobctl_pending() checks fatal_signal_pending() and does not set JOBCTL_STOP_PENDING. And his should fix the minor security problem reported by Kyle, SECCOMP_RET_TRACE can miss fatal_signal_pending() the same way if the task is the root of a pid namespace. Link: http://lkml.kernel.org/r/20171103184246.GD21036@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: Kyle Huey <me@kylehuey.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:53 +01:00
Oleg Nesterov	794ac8ef9b	kernel/signal.c: protect the SIGNAL_UNKILLABLE tasks from !sig_kernel_only() signals commit `ac25385089` upstream. Change sig_task_ignored() to drop the SIG_DFL && !sig_kernel_only() signals even if force == T. This simplifies the next change and this matches the same check in get_signal() which will drop these signals anyway. Link: http://lkml.kernel.org/r/20171103184227.GC21036@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:52 +01:00
Oleg Nesterov	1453b3ac6c	kernel/signal.c: protect the traced SIGNAL_UNKILLABLE tasks from SIGKILL commit `628c1bcba2` upstream. The comment in sig_ignored() says "Tracers may want to know about even ignored signals" but SIGKILL can not be reported to debugger and it is just wrong to return 0 in this case: SIGKILL should only kill the SIGNAL_UNKILLABLE task if it comes from the parent ns. Change sig_ignored() to ignore ->ptrace if sig == SIGKILL and rely on sig_task_ignored(). SISGTOP coming from within the namespace is not really right too but at least debugger can intercept it, and we can't drop it here because this will break "gdb -p 1": ptrace_attach() won't work. Perhaps we will add another ->ptrace check later, we will see. Link: http://lkml.kernel.org/r/20171103184206.GB21036@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:52 +01:00
Thiago Rafael Becker	79258d9834	kernel: make groups_sort calling a responsibility group_info allocators commit `bdcf0a423e` upstream. In testing, we found that nfsd threads may call set_groups in parallel for the same entry cached in auth.unix.gid, racing in the call of groups_sort, corrupting the groups for that entry and leading to permission denials for the client. This patch: - Make groups_sort globally visible. - Move the call to groups_sort to the modifiers of group_info - Remove the call to groups_sort from set_groups Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com Signed-off-by: Thiago Rafael Becker <thiago.becker@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: NeilBrown <neilb@suse.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:52 +01:00
Jens Axboe	3a381abc5b	nbd: fix use-after-free of rq/bio in the xmit path commit `429a787be6` upstream. For writes, we can get a completion in while we're still iterating the request and bio chain. If that happens, we're reading freed memory and we can crash. Break out after the last segment and avoid having the iterator read freed memory. Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:52 +01:00
David Howells	2b9b2002e0	fscache: Fix the default for fscache_maybe_release_page() commit `9880150655` upstream. Fix the default for fscache_maybe_release_page() for when the cookie isn't valid or the page isn't cached. It mustn't return false as that indicates the page cannot yet be freed. The problem with the default is that if, say, there's no cache, but a network filesystem's pages are using up almost all the available memory, a system can OOM because the filesystem ->releasepage() op will not allow them to be released as fscache_maybe_release_page() incorrectly prevents it. This can be tested by writing a sequence of 512MiB files to an AFS mount. It does not affect NFS or CIFS because both of those wrap the call in a check of PG_fscache and it shouldn't bother Ceph as that only has PG_private set whilst writeback is in progress. This might be an issue for 9P, however. Note that the pages aren't entirely stuck. Removing a file or unmounting will clear things because that uses ->invalidatepage() instead. Fixes: `201a15428b` ("FS-Cache: Handle pages pending storage that get evicted under OOM conditions") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:52 +01:00
Stefan Brüns	34fa2eede0	sunxi-rsb: Include OF based modalias in device uevent commit `e2bf801ecd` upstream. Include the OF-based modalias in the uevent sent when registering devices on the sunxi RSB bus, so that user space has a chance to autoload the kernel module for the device. Fixes a regression caused by commit `3f241bfa60` ("arm64: allwinner: a64: pine64: Use dcdc1 regulator for mmc0"). When the axp20x-rsb module for the AXP803 PMIC is built as a module, it is not loaded and the system ends up with an disfunctional MMC controller. Fixes: `d787dcdb9c` ("bus: sunxi-rsb: Add driver for Allwinner Reduced Serial Bus") Acked-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:52 +01:00
Eric Biggers	c195a4c023	crypto: pcrypt - fix freeing pcrypt instances commit `d76c68109f` upstream. pcrypt is using the old way of freeing instances, where the ->free() method specified in the 'struct crypto_template' is passed a pointer to the 'struct crypto_instance'. But the crypto_instance is being kfree()'d directly, which is incorrect because the memory was actually allocated as an aead_instance, which contains the crypto_instance at a nonzero offset. Thus, the wrong pointer was being kfree()'d. Fix it by switching to the new way to free aead_instance's where the ->free() method is specified in the aead_instance itself. Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: `0496f56065` ("crypto: pcrypt - Add support for new AEAD interface") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:51 +01:00
Eric Biggers	868f50b95d	crypto: chacha20poly1305 - validate the digest size commit `e57121d08c` upstream. If the rfc7539 template was instantiated with a hash algorithm with digest size larger than 16 bytes (POLY1305_DIGEST_SIZE), then the digest overran the 'tag' buffer in 'struct chachapoly_req_ctx', corrupting the subsequent memory, including 'cryptlen'. This caused a crash during crypto_skcipher_decrypt(). Fix it by, when instantiating the template, requiring that the underlying hash algorithm has the digest size expected for Poly1305. Reproducer: #include <linux/if_alg.h> #include <sys/socket.h> #include <unistd.h> int main() { int algfd, reqfd; struct sockaddr_alg addr = { .salg_type = "aead", .salg_name = "rfc7539(chacha20,sha256)", }; unsigned char buf[32] = { 0 }; algfd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(algfd, (void *)&addr, sizeof(addr)); setsockopt(algfd, SOL_ALG, ALG_SET_KEY, buf, sizeof(buf)); reqfd = accept(algfd, 0, 0); write(reqfd, buf, 16); read(reqfd, buf, 16); } Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: `71ebc4d1b2` ("crypto: chacha20poly1305 - Add a ChaCha20-Poly1305 AEAD construction, RFC7539") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:51 +01:00
Jan Engelhardt	f6db86f31b	crypto: n2 - cure use after free commit `203f45003a` upstream. queue_cache_init is first called for the Control Word Queue (n2_crypto_probe). At that time, queue_cache[0] is NULL and a new kmem_cache will be allocated. If the subsequent n2_register_algs call fails, the kmem_cache will be released in queue_cache_destroy, but queue_cache_init[0] is not set back to NULL. So when the Module Arithmetic Unit gets probed next (n2_mau_probe), queue_cache_init will not allocate a kmem_cache again, but leave it as its bogus value, causing a BUG() to trigger when queue_cache[0] is eventually passed to kmem_cache_zalloc: n2_crypto: Found N2CP at /virtual-devices@100/n2cp@7 n2_crypto: Registered NCS HVAPI version 2.0 called queue_cache_init n2_crypto: md5 alg registration failed n2cp f028687c: /virtual-devices@100/n2cp@7: Unable to register algorithms. called queue_cache_destroy n2cp: probe of f028687c failed with error -22 n2_crypto: Found NCP at /virtual-devices@100/ncp@6 n2_crypto: Registered NCS HVAPI version 2.0 called queue_cache_init kernel BUG at mm/slab.c:2993! Call Trace: [0000000000604488] kmem_cache_alloc+0x1a8/0x1e0 (inlined) kmem_cache_zalloc (inlined) new_queue (inlined) spu_queue_setup (inlined) handle_exec_unit [0000000010c61eb4] spu_mdesc_scan+0x1f4/0x460 [n2_crypto] [0000000010c62b80] n2_mau_probe+0x100/0x220 [n2_crypto] [000000000084b174] platform_drv_probe+0x34/0xc0 Signed-off-by: Jan Engelhardt <jengelh@inai.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:51 +01:00
Oleg Nesterov	790080ce0e	kernel/acct.c: fix the acct->needcheck check in check_free_space() commit `4d9570158b` upstream. As Tsukada explains, the time_is_before_jiffies(acct->needcheck) check is very wrong, we need time_is_after_jiffies() to make sys_acct() work. Ignoring the overflows, the code should "goto out" if needcheck > jiffies, while currently it checks "needcheck < jiffies" and thus in the likely case check_free_space() does nothing until jiffies overflow. In particular this means that sys_acct() is simply broken, acct_on() sets acct->needcheck = jiffies and expects that check_free_space() should set acct->active = 1 after the free-space check, but this won't happen if jiffies increments in between. This was broken by commit `32dc730860` ("get rid of timer in kern/acct.c") in 2011, then another (correct) commit `795a2f22a8` ("acct() should honour the limits from the very beginning") made the problem more visible. Link: http://lkml.kernel.org/r/20171213133940.GA6554@redhat.com Fixes: `32dc730860` ("get rid of timer in kern/acct.c") Reported-by: TSUKADA Koutaro <tsukada@ascade.co.jp> Suggested-by: TSUKADA Koutaro <tsukada@ascade.co.jp> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-10 09:29:51 +01:00
Greg Kroah-Hartman	9f74755895	Linux 4.9.75	2018-01-05 15:46:36 +01:00
Guenter Roeck	92fd81f772	kaiser: Set _PAGE_NX only if supported This resolves a crash if loaded under qemu + haxm under windows. See https://www.spinics.net/lists/kernel/msg2689835.html for details. Here is a boot log (the log is from chromeos-4.4, but Tao Wu says that the same log is also seen with vanilla v4.4.110-rc1). [ 0.712750] Freeing unused kernel memory: 552K [ 0.721821] init: Corrupted page table at address 57b029b332e0 [ 0.722761] PGD 80000000bb238067 PUD bc36a067 PMD bc369067 PTE 45d2067 [ 0.722761] Bad pagetable: 000b [#1] PREEMPT SMP [ 0.722761] Modules linked in: [ 0.722761] CPU: 1 PID: 1 Comm: init Not tainted 4.4.96 #31 [ 0.722761] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014 [ 0.722761] task: ffff8800bc290000 ti: ffff8800bc28c000 task.ti: ffff8800bc28c000 [ 0.722761] RIP: 0010:[<ffffffff83f4129e>] [<ffffffff83f4129e>] __clear_user+0x42/0x67 [ 0.722761] RSP: 0000:ffff8800bc28fcf8 EFLAGS: 00010202 [ 0.722761] RAX: 0000000000000000 RBX: 00000000000001a4 RCX: 00000000000001a4 [ 0.722761] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000057b029b332e0 [ 0.722761] RBP: ffff8800bc28fd08 R08: ffff8800bc290000 R09: ffff8800bb2f4000 [ 0.722761] R10: ffff8800bc290000 R11: ffff8800bb2f4000 R12: 000057b029b332e0 [ 0.722761] R13: 0000000000000000 R14: 000057b029b33340 R15: ffff8800bb1e2a00 [ 0.722761] FS: 0000000000000000(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000 [ 0.722761] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 0.722761] CR2: 000057b029b332e0 CR3: 00000000bb2f8000 CR4: 00000000000006e0 [ 0.722761] Stack: [ 0.722761] 000057b029b332e0 ffff8800bb95fa80 ffff8800bc28fd18 ffffffff83f4120c [ 0.722761] ffff8800bc28fe18 ffffffff83e9e7a1 ffff8800bc28fd68 0000000000000000 [ 0.722761] ffff8800bc290000 ffff8800bc290000 ffff8800bc290000 ffff8800bc290000 [ 0.722761] Call Trace: [ 0.722761] [<ffffffff83f4120c>] clear_user+0x2e/0x30 [ 0.722761] [<ffffffff83e9e7a1>] load_elf_binary+0xa7f/0x18f7 [ 0.722761] [<ffffffff83de2088>] search_binary_handler+0x86/0x19c [ 0.722761] [<ffffffff83de389e>] do_execveat_common.isra.26+0x909/0xf98 [ 0.722761] [<ffffffff844febe0>] ? rest_init+0x87/0x87 [ 0.722761] [<ffffffff83de40be>] do_execve+0x23/0x25 [ 0.722761] [<ffffffff83c002e3>] run_init_process+0x2b/0x2d [ 0.722761] [<ffffffff844fec4d>] kernel_init+0x6d/0xda [ 0.722761] [<ffffffff84505b2f>] ret_from_fork+0x3f/0x70 [ 0.722761] [<ffffffff844febe0>] ? rest_init+0x87/0x87 [ 0.722761] Code: 86 84 be 12 00 00 00 e8 87 0d e8 ff 66 66 90 48 89 d8 48 c1 eb 03 4c 89 e7 83 e0 07 48 89 d9 be 08 00 00 00 31 d2 48 85 c9 74 0a <48> 89 17 48 01 f7 ff c9 75 f6 48 89 c1 85 c9 74 09 88 17 48 ff [ 0.722761] RIP [<ffffffff83f4129e>] __clear_user+0x42/0x67 [ 0.722761] RSP <ffff8800bc28fcf8> [ 0.722761] ---[ end trace def703879b4ff090 ]--- [ 0.722761] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v4.4/kernel/locking/rwsem.c:21 [ 0.722761] in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: init [ 0.722761] CPU: 1 PID: 1 Comm: init Tainted: G D 4.4.96 #31 [ 0.722761] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014 [ 0.722761] 0000000000000086 dcb5d76098c89836 ffff8800bc28fa30 ffffffff83f34004 [ 0.722761] ffffffff84839dc2 0000000000000015 ffff8800bc28fa40 ffffffff83d57dc9 [ 0.722761] ffff8800bc28fa68 ffffffff83d57e6a ffffffff84a53640 0000000000000000 [ 0.722761] Call Trace: [ 0.722761] [<ffffffff83f34004>] dump_stack+0x4d/0x63 [ 0.722761] [<ffffffff83d57dc9>] ___might_sleep+0x13a/0x13c [ 0.722761] [<ffffffff83d57e6a>] __might_sleep+0x9f/0xa6 [ 0.722761] [<ffffffff84502788>] down_read+0x20/0x31 [ 0.722761] [<ffffffff83cc5d9b>] __blocking_notifier_call_chain+0x35/0x63 [ 0.722761] [<ffffffff83cc5ddd>] blocking_notifier_call_chain+0x14/0x16 [ 0.800374] usb 1-1: new full-speed USB device number 2 using uhci_hcd [ 0.722761] [<ffffffff83cefe97>] profile_task_exit+0x1a/0x1c [ 0.802309] [<ffffffff83cac84e>] do_exit+0x39/0xe7f [ 0.802309] [<ffffffff83ce5938>] ? vprintk_default+0x1d/0x1f [ 0.802309] [<ffffffff83d7bb95>] ? printk+0x57/0x73 [ 0.802309] [<ffffffff83c46e25>] oops_end+0x80/0x85 [ 0.802309] [<ffffffff83c7b747>] pgtable_bad+0x8a/0x95 [ 0.802309] [<ffffffff83ca7f4a>] __do_page_fault+0x8c/0x352 [ 0.802309] [<ffffffff83eefba5>] ? file_has_perm+0xc4/0xe5 [ 0.802309] [<ffffffff83ca821c>] do_page_fault+0xc/0xe [ 0.802309] [<ffffffff84507682>] page_fault+0x22/0x30 [ 0.802309] [<ffffffff83f4129e>] ? __clear_user+0x42/0x67 [ 0.802309] [<ffffffff83f4127f>] ? __clear_user+0x23/0x67 [ 0.802309] [<ffffffff83f4120c>] clear_user+0x2e/0x30 [ 0.802309] [<ffffffff83e9e7a1>] load_elf_binary+0xa7f/0x18f7 [ 0.802309] [<ffffffff83de2088>] search_binary_handler+0x86/0x19c [ 0.802309] [<ffffffff83de389e>] do_execveat_common.isra.26+0x909/0xf98 [ 0.802309] [<ffffffff844febe0>] ? rest_init+0x87/0x87 [ 0.802309] [<ffffffff83de40be>] do_execve+0x23/0x25 [ 0.802309] [<ffffffff83c002e3>] run_init_process+0x2b/0x2d [ 0.802309] [<ffffffff844fec4d>] kernel_init+0x6d/0xda [ 0.802309] [<ffffffff84505b2f>] ret_from_fork+0x3f/0x70 [ 0.802309] [<ffffffff844febe0>] ? rest_init+0x87/0x87 [ 0.830559] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 [ 0.830559] [ 0.831305] Kernel Offset: 0x2c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 0.831305] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 The crash part of this problem may be solved with the following patch (thanks to Hugh for the hint). There is still another problem, though - with this patch applied, the qemu session aborts with "VCPU Shutdown request", whatever that means. Cc: lepton <ytht.net@gmail.com> Signed-off-by: Guenter Roeck <groeck@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:36 +01:00
Kees Cook	ea6cd39d23	KPTI: Report when enabled Make sure dmesg reports when KPTI is enabled. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:36 +01:00
Kees Cook	e71fac0172	KPTI: Rename to PAGE_TABLE_ISOLATION This renames CONFIG_KAISER to CONFIG_PAGE_TABLE_ISOLATION. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:35 +01:00
Borislav Petkov	59094faf3f	x86/kaiser: Move feature detection up ... before the first use of kaiser_enabled as otherwise funky things happen: about to get started... (XEN) d0v0 Unhandled page fault fault/trap [#14, ec=0000] (XEN) Pagetable walk from ffff88022a449090: (XEN) L4[0x110] = 0000000229e0e067 0000000000001e0e (XEN) L3[0x008] = 0000000000000000 ffffffffffffffff (XEN) domain_crash_sync called from entry.S: fault at ffff82d08033fd08 entry.o#create_bounce_frame+0x135/0x14d (XEN) Domain 0 (vcpu#0) crashed on cpu#0: (XEN) ----[ Xen-4.9.1_02-3.21 x86_64 debug=n Not tainted ]---- (XEN) CPU: 0 (XEN) RIP: e033:[<ffffffff81007460>] (XEN) RFLAGS: 0000000000000286 EM: 1 CONTEXT: pv guest (d0v0) Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:35 +01:00
Jiri Kosina	402e63de94	kaiser: disabled on Xen PV Kaiser cannot be used on paravirtualized MMUs (namely reading and writing CR3). This does not work with KAISER as the CR3 switch from and to user space PGD would require to map the whole XEN_PV machinery into both. More importantly, enabling KAISER on Xen PV doesn't make too much sense, as PV guests use distinct %cr3 values for kernel and user already. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:35 +01:00
Borislav Petkov	2c2721754a	x86/kaiser: Reenable PARAVIRT Now that the required bits have been addressed, reenable PARAVIRT. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:35 +01:00
Thomas Gleixner	1817d2c2fa	x86/paravirt: Dont patch flush_tlb_single commit `a035795499` upstream native_flush_tlb_single() will be changed with the upcoming PAGE_TABLE_ISOLATION feature. This requires to have more code in there than INVLPG. Remove the paravirt patching for it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: linux-mm@kvack.org Cc: michael.schwarz@iaik.tugraz.at Cc: moritz.lipp@iaik.tugraz.at Cc: richard.fellner@student.tugraz.at Link: https://lkml.kernel.org/r/20171204150606.828111617@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:35 +01:00
Hugh Dickins	fe5cb75fd2	kaiser: kaiser_flush_tlb_on_return_to_user() check PCID Let kaiser_flush_tlb_on_return_to_user() do the X86_FEATURE_PCID check, instead of each caller doing it inline first: nobody needs to optimize for the noPCID case, it's clearer this way, and better suits later changes. Replace those no-op X86_CR3_PCID_KERN_FLUSH lines by a BUILD_BUG_ON() in load_new_mm_cr3(), in case something changes. Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:35 +01:00
Hugh Dickins	b72c26e911	kaiser: asm/tlbflush.h handle noPGE at lower level I found asm/tlbflush.h too twisty, and think it safer not to avoid __native_flush_tlb_global_irq_disabled() in the kaiser_enabled case, but instead let it handle kaiser_enabled along with cr3: it can just use __native_flush_tlb() for that, no harm in re-disabling preemption. (This is not the same change as Kirill and Dave have suggested for upstream, flipping PGE in cr4: that's neat, but needs a cpu_has_pge check; cr3 is enough for kaiser, and thought to be cheaper than cr4.) Also delete the X86_FEATURE_INVPCID invpcid_flush_all_nonglobals() preference from __native_flush_tlb(): unlike the invpcid_flush_all() preference in __native_flush_tlb_global(), it's not seen in upstream 4.14, and was recently reported to be surprisingly slow. Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:35 +01:00
Hugh Dickins	8c2f8a5cc1	kaiser: drop is_atomic arg to kaiser_pagetable_walk() I have not observed a might_sleep() warning from setup_fixmap_gdt()'s use of kaiser_add_mapping() in our tree (why not?), but like upstream we have not provided a way for that to pass is_atomic true down to kaiser_pagetable_walk(), and at startup it's far from a likely source of trouble: so just delete the walk's is_atomic arg and might_sleep(). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:35 +01:00
Hugh Dickins	169b369f99	kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush Now that we're playing the ALTERNATIVE game, use that more efficient method: instead of user-mapping an extra page, and reading an extra cacheline each time for x86_cr3_pcid_noflush. Neel has found that __stringify(bts $X86_CR3_PCID_NOFLUSH_BIT, %rax) is a working substitute for the "bts $63, %rax" in these ALTERNATIVEs; but the one line with $63 in looks clearer, so let's stick with that. Worried about what happens with an ALTERNATIVE between the jump and jump label in another ALTERNATIVE? I was, but have checked the combinations in SWITCH_KERNEL_CR3_NO_STACK at entry_SYSCALL_64, and it does a good job. Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:35 +01:00
Borislav Petkov	8018307a45	x86/kaiser: Check boottime cmdline params AMD (and possibly other vendors) are not affected by the leak KAISER is protecting against. Keep the "nopti" for traditional reasons and add pti=<on\|off\|auto> like upstream. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:35 +01:00
Borislav Petkov	50624dd12d	x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling Concentrate it in arch/x86/mm/kaiser.c and use the upstream string "nopti". Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:34 +01:00
Hugh Dickins	23e09439aa	kaiser: add "nokaiser" boot option, using ALTERNATIVE Added "nokaiser" boot option: an early param like "noinvpcid". Most places now check int kaiser_enabled (#defined 0 when not CONFIG_KAISER) instead of #ifdef CONFIG_KAISER; but entry_64.S and entry_64_compat.S are using the ALTERNATIVE technique, which patches in the preferred instructions at runtime. That technique is tied to x86 cpu features, so X86_FEATURE_KAISER is fabricated. Prior to "nokaiser", Kaiser #defined _PAGE_GLOBAL 0: revert that, but be careful with both _PAGE_GLOBAL and CR4.PGE: setting them when nokaiser like when !CONFIG_KAISER, but not setting either when kaiser - neither matters on its own, but it's hard to be sure that _PAGE_GLOBAL won't get set in some obscure corner, or something add PGE into CR4. By omitting _PAGE_GLOBAL from __supported_pte_mask when kaiser_enabled, all page table setup which uses pte_pfn() masks it out of the ptes. It's slightly shameful that the same declaration versus definition of kaiser_enabled appears in not one, not two, but in three header files (asm/kaiser.h, asm/pgtable.h, asm/tlbflush.h). I felt safer that way, than with #including any of those in any of the others; and did not feel it worth an asm/kaiser_enabled.h - kernel/cpu/common.c includes them all, so we shall hear about it if they get out of synch. Cleanups while in the area: removed the silly #ifdef CONFIG_KAISER from kaiser.c; removed the unused native_get_normal_pgd(); removed the spurious reg clutter from SWITCH_*_CR3 macro stubs; corrected some comments. But more interestingly, set CR4.PSE in secondary_startup_64: the manual is clear that it does not matter whether it's 0 or 1 when 4-level-pts are enabled, but I was distracted to find cr4 different on BSP and auxiliaries - BSP alone was adding PSE, in probe_page_size_mask(). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:34 +01:00
Hugh Dickins	cb7d8d7e67	kaiser: fix unlikely error in alloc_ldt_struct() An error from kaiser_add_mapping() here is not at all likely, but Eric Biggers rightly points out that __free_ldt_struct() relies on new_ldt->size being initialized: move that up. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:34 +01:00
Hugh Dickins	3df1461787	kaiser: kaiser_remove_mapping() move along the pgd When removing the bogus comment from kaiser_remove_mapping(), I really ought to have checked the extent of its bogosity: as Neel points out, there is nothing to stop unmap_pud_range_nofree() from continuing beyond the end of a pud (and starting in the wrong position on the next). Fix kaiser_remove_mapping() to constrain the extent and advance pgd pointer correctly: use pgd_addr_end() macro as used throughout base mm (but don't assume page-rounded start and size in this case). But this bug was very unlikely to trigger in this backport: since any buddy allocation is contained within a single pud extent, and we are not using vmapped stacks (and are only mapping one page of stack anyway): the only way to hit this bug here would be when freeing a large modified ldt. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:34 +01:00
Hugh Dickins	05ddad146d	kaiser: paranoid_entry pass cr3 need to paranoid_exit Neel Natu points out that paranoid_entry() was wrong to assume that an entry that did not need swapgs would not need SWITCH_KERNEL_CR3: paranoid_entry (used for debug breakpoint, int3, double fault or MCE; though I think it's only the MCE case that is cause for concern here) can break in at an awkward time, between cr3 switch and swapgs, but its handling always needs kernel gs and kernel cr3. Easy to fix in itself, but paranoid_entry() also needs to convey to paranoid_exit() (and my reading of macro idtentry says paranoid_entry and paranoid_exit are always paired) how to restore the prior state. The swapgs state is already conveyed by %ebx (0 or 1), so extend that also to convey when SWITCH_USER_CR3 will be needed (2 or 3). (Yes, I'd much prefer that 0 meant no swapgs, whereas it's the other way round: and a convention shared with error_entry() and error_exit(), which I don't want to touch. Perhaps I should have inverted the bit for switch cr3 too, but did not.) paranoid_exit() would be straightforward, except for TRACE_IRQS: it did TRACE_IRQS_IRETQ when doing swapgs, but TRACE_IRQS_IRETQ_DEBUG when not: which is it supposed to use when SWITCH_USER_CR3 is split apart from that? As best as I can determine, commit `5963e317b1` ("ftrace/x86: Do not change stacks in DEBUG when calling lockdep") missed the swapgs case, and should have used TRACE_IRQS_IRETQ_DEBUG there too (the discrepancy has nothing to do with the liberal use of _NO_STACK and _UNSAFE_STACK hereabouts: TRACE_IRQS_OFF_DEBUG has just been used in all cases); discrepancy lovingly preserved across several paranoid_exit() cleanups, but I'm now removing it. Neel further indicates that to use SWITCH_USER_CR3_NO_STACK there in paranoid_exit() is now not only unnecessary but unsafe: might corrupt syscall entry's unsafe_stack_register_backup of %rax. Just use SWITCH_USER_CR3: and delete SWITCH_USER_CR3_NO_STACK altogether, before we make the mistake of using it again. hughd adds: this commit fixes an issue in the Kaiser-without-PCIDs part of the series, and ought to be moved earlier, if you decided to make a release of Kaiser-without-PCIDs. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:34 +01:00
Hugh Dickins	d0142ceb79	kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user Mostly this commit is just unshouting X86_CR3_PCID_KERN_VAR and X86_CR3_PCID_USER_VAR: we usually name variables in lower-case. But why does x86_cr3_pcid_noflush need to be __aligned(PAGE_SIZE)? Ah, it's a leftover from when kaiser_add_user_map() once complained about mapping the same page twice. Make it __read_mostly instead. (I'm a little uneasy about all the unrelated data which shares its page getting user-mapped too, but that was so before, and not a big deal: though we call it user-mapped, it's not mapped with _PAGE_USER.) And there is a little change around the two calls to do_nmi(). Previously they set the NOFLUSH bit (if PCID supported) when forcing to kernel context before do_nmi(); now they also have the NOFLUSH bit set (if PCID supported) when restoring context after: nothing done in do_nmi() should require a TLB to be flushed here. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:34 +01:00
Hugh Dickins	6a2b411761	kaiser: PCID 0 for kernel and 128 for user Why was 4 chosen for kernel PCID and 6 for user PCID? No good reason in a backport where PCIDs are only used for Kaiser. If we continue with those, then we shall need to add Andy Lutomirski's 4.13 commit `6c690ee103` ("x86/mm: Split read_cr3() into read_cr3_pa() and __read_cr3()"), which deals with the problem of read_cr3() callers finding stray bits in the cr3 that they expected to be page-aligned; and for hibernation, his 4.14 commit `f34902c5c6` ("x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3"). But if 0 is used for kernel PCID, then there's no need to add in those commits - whenever the kernel looks, it sees 0 in the lower bits; and 0 for kernel seems an obvious choice. And I naughtily propose 128 for user PCID. Because there's a place in _SWITCH_TO_USER_CR3 where it takes note of the need for TLB FLUSH, but needs to reset that to NOFLUSH for the next occasion. Currently it does so with a "movb $(0x80)" into the high byte of the per-cpu quadword, but that will cause a machine without PCID support to crash. Now, if %al just happened to have 0x80 in it at that point, on a machine with PCID support, but 0 on a machine without PCID support... (That will go badly wrong once the pgd can be at a physical address above 2^56, but even with 5-level paging, physical goes up to 2^52.) Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:34 +01:00
Hugh Dickins	0b5ca9d995	kaiser: load_new_mm_cr3() let SWITCH_USER_CR3 flush user We have many machines (Westmere, Sandybridge, Ivybridge) supporting PCID but not INVPCID: on these load_new_mm_cr3() simply crashed. Flushing user context inside load_new_mm_cr3() without the use of invpcid is difficult: momentarily switch from kernel to user context and back to do so? I'm not sure whether that can be safely done at all, and would risk polluting user context with kernel internals, and kernel context with stale user externals. Instead, follow the hint in the comment that was there: change X86_CR3_PCID_USER_VAR to be a per-cpu variable, then load_new_mm_cr3() can leave a note in it, for SWITCH_USER_CR3 on return to userspace to flush user context TLB, instead of default X86_CR3_PCID_USER_NOFLUSH. Which works well enough that there's no need to do it this way only when invpcid is unsupported: it's a good alternative to invpcid here. But there's a couple of inlines in asm/tlbflush.h that need to do the same trick, so it's best to localize all this per-cpu business in mm/kaiser.c: moving that part of the initialization from setup_pcid() to kaiser_setup_pcid(); with kaiser_flush_tlb_on_return_to_user() the function for noting an X86_CR3_PCID_USER_FLUSH. And let's keep a KAISER_SHADOW_PGD_OFFSET in there, to avoid the extra OR on exit. I did try to make the feature tests in asm/tlbflush.h more consistent with each other: there seem to be far too many ways of performing such tests, and I don't have a good grasp of their differences. At first I converted them all to be static_cpu_has(): but that proved to be a mistake, as the comment in __native_flush_tlb_single() hints; so then I reversed and made them all this_cpu_has(). Probably all gratuitous change, but that's the way it's working at present. I am slightly bothered by the way non-per-cpu X86_CR3_PCID_KERN_VAR gets re-initialized by each cpu (before and after these changes): no problem when (as usual) all cpus on a machine have the same features, but in principle incorrect. However, my experiment to per-cpu-ify that one did not end well... Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:34 +01:00
Hugh Dickins	2684b12a16	kaiser: enhanced by kernel and user PCIDs Merged performance improvements to Kaiser, using distinct kernel and user Process Context Identifiers to minimize the TLB flushing. [This work actually all from Dave Hansen 2017-08-30: still omitting trackswitch mods, and KAISER_REAL_SWITCH deleted.] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:34 +01:00
Hugh Dickins	1972bb9d92	kaiser: vmstat show NR_KAISERTABLE as nr_overhead The kaiser update made an interesting choice, never to free any shadow page tables. Contention on global spinlock was worrying, particularly with it held across page table scans when freeing. Something had to be done: I was going to add refcounting; but simply never to free them is an appealing choice, minimizing contention without complicating the code (the more a page table is found already, the less the spinlock is used). But leaking pages in this way is also a worry: can we get away with it? At the very least, we need a count to show how bad it actually gets: in principle, one might end up wasting about 1/256 of memory that way (1/512 for when direct-mapped pages have to be user-mapped, plus 1/512 for when they are user-mapped from the vmalloc area on another occasion (but we don't have vmalloc'ed stacks, so only large ldts are vmalloc'ed). Add per-cpu stat NR_KAISERTABLE: including 256 at startup for the shared pgd entries, and 1 for each intermediate page table added thereafter for user-mapping - but leave out the 1 per mm, for its shadow pgd, because that distracts from the monotonic increase. Shown in /proc/vmstat as nr_overhead (0 if kaiser not enabled). In practice, it doesn't look so bad so far: more like 1/12000 after nine hours of gtests below; and movable pageblock segregation should tend to cluster the kaiser tables into a subset of the address space (if not, they will be bad for compaction too). But production may tell a different story: keep an eye on this number, and bring back lighter freeing if it gets out of control (maybe a shrinker). ["nr_overhead" should of course say "nr_kaisertable", if it needs to stay; but for the moment we are being coy, preferring that when Joe Blow notices a new line in his /proc/vmstat, he does not get too curious about what this "kaiser" stuff might be.] Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:33 +01:00
Hugh Dickins	1ce27de401	kaiser: delete KAISER_REAL_SWITCH option We fail to see what CONFIG_KAISER_REAL_SWITCH is for: it seems to be left over from early development, and now just obscures tricky parts of the code. Delete it before adding PCIDs, or nokaiser boot option. (Or if there is some good reason to keep the option, then it needs a help text - and a "depends on KAISER", so that all those without KAISER are not asked the question. But we'd much rather delete it.) Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:33 +01:00
Hugh Dickins	c27cdea56c	kaiser: name that 0x1000 KAISER_SHADOW_PGD_OFFSET There's a 0x1000 in various places, which looks better with a name. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:33 +01:00
Hugh Dickins	61b7a404fa	kaiser: cleanups while trying for gold link While trying to get our gold link to work, four cleanups: matched the gdt_page declaration to its definition; in fiddling unsuccessfully with PERCPU_INPUT(), lined up backslashes; lined up the backslashes according to convention in percpu-defs.h; deleted the unused irq_stack_pointer addition to irq_stack_union. Sad to report that aligning backslashes does not appear to help gold align to 8192: but while these did not help, they are worth keeping. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:33 +01:00
Hugh Dickins	604db49610	kaiser: align addition to x86/mm/Makefile Use tab not space so they line up properly, kaslr.o also. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:33 +01:00
Hugh Dickins	be6bf01f4c	kaiser: tidied up kaiser_add/remove_mapping slightly Yes, unmap_pud_range_nofree()'s declaration ought to be in a header file really, but I'm not sure we want to use it anyway: so for now just declare it inside kaiser_remove_mapping(). And there doesn't seem to be such a thing as unmap_p4d_range(), even in a 5-level paging tree. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:33 +01:00
Hugh Dickins	67fab0d4ac	kaiser: tidied up asm/kaiser.h somewhat Mainly deleting a surfeit of blank lines, and reflowing header comment. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:33 +01:00
Hugh Dickins	f43f386f0b	kaiser: ENOMEM if kaiser_pagetable_walk() NULL kaiser_add_user_map() took no notice when kaiser_pagetable_walk() failed. And avoid its might_sleep() when atomic (though atomic at present unused). Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:33 +01:00
Hugh Dickins	f881e62684	kaiser: fix perf crashes Avoid perf crashes: place debug_store in the user-mapped per-cpu area instead of allocating, and use page allocator plus kaiser_add_mapping() to keep the BTS and PEBS buffers user-mapped (that is, present in the user mapping, though visible only to kernel and hardware). The PEBS fixup buffer does not need this treatment. The need for a user-mapped struct debug_store showed up before doing any conscious perf testing: in a couple of kernel paging oopses on Westmere, implicating the debug_store offset of the per-cpu area. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:33 +01:00
Hugh Dickins	1937794431	kaiser: fix regs to do_nmi() ifndef CONFIG_KAISER pjt has observed that nmi's second (nmi_from_kernel) call to do_nmi() adjusted the %rdi regs arg, rightly when CONFIG_KAISER, but wrongly when not CONFIG_KAISER. Although the minimal change is to add an #ifdef CONFIG_KAISER around the addq line, that looks cluttered, and I prefer how the first call to do_nmi() handled it: prepare args in %rdi and %rsi before getting into the CONFIG_KAISER block, since it does not touch them at all. And while we're here, place the "#ifdef CONFIG_KAISER" that follows each, to enclose the "Unconditionally restore CR3" comment: matching how the "Unconditionally use kernel CR3" comment above is enclosed. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:32 +01:00
Hugh Dickins	639c005dae	kaiser: KAISER depends on SMP It is absurd that KAISER should depend on SMP, but apparently nobody has tried a UP build before: which breaks on implicit declaration of function 'per_cpu_offset' in arch/x86/mm/kaiser.c. Now, you would expect that to be trivially fixed up; but looking at the System.map when that block is #ifdef'ed out of kaiser_init(), I see that in a UP build __per_cpu_user_mapped_end is precisely at __per_cpu_user_mapped_start, and the items carefully gathered into that section for user-mapping on SMP, dispersed elsewhere on UP. So, some other kind of section assignment will be needed on UP, but implementing that is not a priority: just make KAISER depend on SMP for now. Also inserted a blank line before the option, tidied up the brief Kconfig help message, and added an "If unsure, Y". Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:32 +01:00
Hugh Dickins	7a92e20d15	kaiser: fix build and FIXME in alloc_ldt_struct() Include linux/kaiser.h instead of asm/kaiser.h to build ldt.c without CONFIG_KAISER. kaiser_add_mapping() does already return an error code, so fix the FIXME. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:32 +01:00
Hugh Dickins	0994a2cf8f	kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE Kaiser only needs to map one page of the stack; and kernel/fork.c did not build on powerpc (no __PAGE_KERNEL). It's all cleaner if linux/kaiser.h provides kaiser_map_thread_stack() and kaiser_unmap_thread_stack() wrappers around asm/kaiser.h's kaiser_add_mapping() and kaiser_remove_mapping(). And use linux/kaiser.h in init/main.c to avoid the #ifdefs there. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:32 +01:00
Hugh Dickins	ac2f1018ac	kaiser: do not set _PAGE_NX on pgd_none native_pgd_clear() uses native_set_pgd(), so native_set_pgd() must avoid setting the _PAGE_NX bit on an otherwise pgd_none() entry: usually that just generated a warning on exit, but sometimes more mysterious and damaging failures (our production machines could not complete booting). The original fix to this just avoided adding _PAGE_NX to an empty entry; but eventually more problems surfaced with kexec, and EFI mapping expected to be a problem too. So now instead change native_set_pgd() to update shadow only if _PAGE_USER: A few places (kernel/machine_kexec_64.c, platform/efi/efi_64.c for sure) use set_pgd() to set up a temporary internal virtual address space, with physical pages remapped at what Kaiser regards as userspace addresses: Kaiser then assumes a shadow pgd follows, which it will try to corrupt. This appears to be responsible for the recent kexec and kdump failures; though it's unclear how those did not manifest as a problem before. Ah, the shadow pgd will only be assumed to "follow" if the requested pgd is on an even-numbered page: so I suppose it was going wrong 50% of the time all along. What we need is a flag to set_pgd(), to tell it we're dealing with userspace. Er, isn't that what the pgd's _PAGE_USER bit is saying? Add a test for that. But we cannot do the same for pgd_clear() (which may be called to clear corrupted entries - set aside the question of "corrupt in which pgd?" until later), so there just rely on pgd_clear() not being called in the problematic cases - with a WARN_ON_ONCE() which should fire half the time if it is. But this is getting too big for an inline function: move it into arch/x86/mm/kaiser.c (which then demands a boot/compressed mod); and de-void and de-space native_get_shadow/normal_pgd() while here. Also make an unnecessary change to KASLR's init_trampoline(): it was using set_pgd() to assign a pgd-value to a global variable (not in a pg directory page), which was rather scary given Kaiser's previous set_pgd() implementation: not a problem now, but too scary to leave as was, it could easily blow up if we have to change set_pgd() again. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:32 +01:00
Dave Hansen	8f0baadf2b	kaiser: merged update Merged fixes and cleanups, rebased to 4.9.51 tree (no 5-level paging). Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:32 +01:00
Richard Fellner	13be4483bb	KAISER: Kernel Address Isolation This patch introduces our implementation of KAISER (Kernel Address Isolation to have Side-channels Efficiently Removed), a kernel isolation technique to close hardware side channels on kernel address information. More information about the patch can be found on: https://github.com/IAIK/KAISER From: Richard Fellner <richard.fellner@student.tugraz.at> From: Daniel Gruss <daniel.gruss@iaik.tugraz.at> Subject: [RFC, PATCH] x86_64: KAISER - do not map kernel in user mode Date: Thu, 4 May 2017 14:26:50 +0200 Link: http://marc.info/?l=linux-kernel&m=149390087310405&w=2 Kaiser-4.10-SHA1: c4b1831d44c6144d3762ccc72f0c4e71a0c713e5 To: <linux-kernel@vger.kernel.org> To: <kernel-hardening@lists.openwall.com> Cc: <clementine.maurice@iaik.tugraz.at> Cc: <moritz.lipp@iaik.tugraz.at> Cc: Michael Schwarz <michael.schwarz@iaik.tugraz.at> Cc: Richard Fellner <richard.fellner@student.tugraz.at> Cc: Ingo Molnar <mingo@kernel.org> Cc: <kirill.shutemov@linux.intel.com> Cc: <anders.fogh@gdata-adan.de> After several recent works [1,2,3] KASLR on x86_64 was basically considered dead by many researchers. We have been working on an efficient but effective fix for this problem and found that not mapping the kernel space when running in user mode is the solution to this problem [4] (the corresponding paper [5] will be presented at ESSoS17). With this RFC patch we allow anybody to configure their kernel with the flag CONFIG_KAISER to add our defense mechanism. If there are any questions we would love to answer them. We also appreciate any comments! Cheers, Daniel (+ the KAISER team from Graz University of Technology) [1] http://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf [2] https://www.blackhat.com/docs/us-16/materials/us-16-Fogh-Using-Undocumented-CPU-Behaviour-To-See-Into-Kernel-Mode-And-Break-KASLR-In-The-Process.pdf [3] https://www.blackhat.com/docs/us-16/materials/us-16-Jang-Breaking-Kernel-Address-Space-Layout-Randomization-KASLR-With-Intel-TSX.pdf [4] https://github.com/IAIK/KAISER [5] https://gruss.cc/files/kaiser.pdf [patch based also on https://raw.githubusercontent.com/IAIK/KAISER/master/KAISER/0001-KAISER-Kernel-Address-Isolation.patch] Signed-off-by: Richard Fellner <richard.fellner@student.tugraz.at> Signed-off-by: Moritz Lipp <moritz.lipp@iaik.tugraz.at> Signed-off-by: Daniel Gruss <daniel.gruss@iaik.tugraz.at> Signed-off-by: Michael Schwarz <michael.schwarz@iaik.tugraz.at> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:32 +01:00
Tom Lendacky	b5fd58e997	x86/boot: Add early cmdline parsing for options with arguments commit `e505371dd8` upstream. Add a cmdline_find_option() function to look for cmdline options that take arguments. The argument is returned in a supplied buffer and the argument length (regardless of whether it fits in the supplied buffer) is returned, with -1 indicating not found. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Toshimitsu Kani <toshi.kani@hpe.com> Cc: kasan-dev@googlegroups.com Cc: kvm@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-efi@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/36b5f97492a9745dce27682305f990fc20e5cf8a.1500319216.git.thomas.lendacky@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:32 +01:00
Neal Cardwell	8824b2d7ab	tcp_bbr: reset long-term bandwidth sampling on loss recovery undo commit `600647d467` upstream. Fix BBR so that upon notification of a loss recovery undo BBR resets long-term bandwidth sampling. Under high reordering, reordering events can be interpreted as loss. If the reordering and spurious loss estimates are high enough, this can cause BBR to spuriously estimate that we are seeing loss rates high enough to trigger long-term bandwidth estimation. To avoid that problem, this commit resets long-term bandwidth sampling on loss recovery undo events. Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:31 +01:00
Neal Cardwell	61c51da2b4	tcp_bbr: reset full pipe detection on loss recovery undo commit `2f6c498e4f` upstream. Fix BBR so that upon notification of a loss recovery undo BBR resets the full pipe detection (STARTUP exit) state machine. Under high reordering, reordering events can be interpreted as loss. If the reordering and spurious loss estimates are high enough, this could previously cause BBR to spuriously estimate that the pipe is full. Since spurious loss recovery means that our overall sending will have slowed down spuriously, this commit gives a flow more time to probe robustly for bandwidth and decide the pipe is really full. Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-05 15:46:31 +01:00
Greg Kroah-Hartman	07bcb2489b	Linux 4.9.74	2018-01-02 20:35:18 +01:00
Andy Lutomirski	181a832c2e	mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP commit `5dd0b16cda` upstream. This fixes CONFIG_SMP=n, CONFIG_DEBUG_TLBFLUSH=y without introducing further #ifdef soup. Caught by a Kbuild bot randconfig build. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `ce4a4e565f` ("x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code") Link: http://lkml.kernel.org/r/76da9a3cc4415996f2ad2c905b93414add322021.1496673616.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:18 +01:00
Johan Hovold	d76dabb5af	tty: fix tty_ldisc_receive_buf() documentation commit `e7e51dcf3b` upstream. The tty_ldisc_receive_buf() helper returns the number of bytes processed so drop the bogus "not" from the kernel doc comment. Fixes: `8d082cd300` ("tty: Unify receive_buf() code paths") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:17 +01:00
Linus Torvalds	00fc57ae06	n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD) commit `966031f340` upstream. We added support for EXTPROC back in 2010 in commit `26df6d1340` ("tty: Add EXTPROC support for LINEMODE") and the intent was to allow it to override some (all?) ICANON behavior. Quoting from that original commit message: There is a new bit in the termios local flag word, EXTPROC. When this bit is set, several aspects of the terminal driver are disabled. Input line editing, character echo, and mapping of signals are all disabled. This allows the telnetd to turn off these functions when in linemode, but still keep track of what state the user wants the terminal to be in. but the problem turns out that "several aspects of the terminal driver are disabled" is a bit ambiguous, and you can really confuse the n_tty layer by setting EXTPROC and then causing some of the ICANON invariants to no longer be maintained. This fixes at least one such case (TIOCINQ) becoming unhappy because of the confusion over whether ICANON really means ICANON when EXTPROC is set. This basically makes TIOCINQ match the case of read: if EXTPROC is set, we ignore ICANON. Also, make sure to reset the ICANON state ie EXTPROC changes, not just if ICANON changes. Fixes: `26df6d1340` ("tty: Add EXTPROC support for LINEMODE") Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Reported-by: syzkaller <syzkaller@googlegroups.com> Cc: Jiri Slaby <jslaby@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:17 +01:00
Thomas Gleixner	404ae546c7	x86/smpboot: Remove stale TLB flush invocations commit `322f8b8b34` upstream. smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector() invoke local_flush_tlb() for no obvious reason. Digging in history revealed that the original code in the 2.1 era added those because the code manipulated a swapper_pg_dir pagetable entry. The pagetable manipulation was removed long ago in the 2.3 timeframe, but the TLB flush invocations stayed around forever. Remove them along with the pointless pr_debug()s which come from the same 2.1 change. Reported-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171230211829.586548655@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:17 +01:00
Thomas Gleixner	e8119ac05d	nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() commit `5d62c183f9` upstream. The conditions in irq_exit() to invoke tick_nohz_irq_exit() which subsequently invokes tick_nohz_stop_sched_tick() are: if ((idle_cpu(cpu) && !need_resched()) \|\| tick_nohz_full_cpu(cpu)) If need_resched() is not set, but a timer softirq is pending then this is an indication that the softirq code punted and delegated the execution to softirqd. need_resched() is not true because the current interrupted task takes precedence over softirqd. Invoking tick_nohz_irq_exit() in this case can cause an endless loop of timer interrupts because the timer wheel contains an expired timer, but softirqs are not yet executed. So it returns an immediate expiry request, which causes the timer to fire immediately again. Lather, rinse and repeat.... Prevent that by adding a check for a pending timer soft interrupt to the conditions in tick_nohz_stop_sched_tick() which avoid calling get_next_timer_interrupt(). That keeps the tick sched timer on the tick and prevents a repetitive programming of an already expired timer. Reported-by: Sebastian Siewior <bigeasy@linutronix.d> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Sebastian Siewior <bigeasy@linutronix.de> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:17 +01:00
Thomas Gleixner	249d4a9b32	timers: Reinitialize per cpu bases on hotplug commit `26456f87ac` upstream. The timer wheel bases are not (re)initialized on CPU hotplug. That leaves them with a potentially stale clk and next_expiry valuem, which can cause trouble then the CPU is plugged. Add a prepare callback which forwards the clock, sets next_expiry to far in the future and reset the control flags to a known state. Set base->must_forward_clk so the first timer which is queued will try to forward the clock to current jiffies. Fixes: `500462a9de` ("timers: Switch to a non-cascading wheel") Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:17 +01:00
Thomas Gleixner	574e543ff9	timers: Invoke timer_start_debug() where it makes sense commit `fd45bb77ad` upstream. The timer start debug function is called before the proper timer base is set. As a consequence the trace data contains the stale CPU and flags values. Call the debug function after setting the new base and flags. Fixes: `500462a9de` ("timers: Switch to a non-cascading wheel") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: rt@linutronix.de Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Link: https://lkml.kernel.org/r/20171222145337.792907137@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:17 +01:00
Anna-Maria Gleixner	d840687aa8	timers: Use deferrable base independent of base::nohz_active commit `ced6d5c11d` upstream. During boot and before base::nohz_active is set in the timer bases, deferrable timers are enqueued into the standard timer base. This works correctly as long as base::nohz_active is false. Once it base::nohz_active is set and a timer which was enqueued before that is accessed the lock selector code choses the lock of the deferred base. This causes unlocked access to the standard base and in case the timer is removed it does not clear the pending flag in the standard base bitmap which causes get_next_timer_interrupt() to return bogus values. To prevent that, the deferrable timers must be enqueued in the deferrable base, even when base::nohz_active is not set. Those deferrable timers also need to be expired unconditional. Fixes: `500462a9de` ("timers: Switch to a non-cascading wheel") Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: rt@linutronix.de Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:17 +01:00
Daniel Thompson	09d3e69305	usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201 commit `da99706689` upstream. When plugging in a USB webcam I see the following message: xhci_hcd 0000:04:00.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk? handle_tx_event: 913 callbacks suppressed All is quiet again with this patch (and I've done a fair but of soak testing with the camera since). Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:16 +01:00
Mathias Nyman	ab1fbfecd3	USB: Fix off by one in type-specific length check of BOS SSP capability commit `07b9f12864` upstream. USB 3.1 devices are not detected as 3.1 capable since 4.15-rc3 due to a off by one in commit `81cf4a4536` ("USB: core: Add type-specific length check of BOS descriptors") It uses USB_DT_USB_SSP_CAP_SIZE() to get SSP capability size which takes the zero based SSAC as argument, not the actual count of sublink speed attributes. USB3 spec 9.6.2.5 says "The number of Sublink Speed Attributes = SSAC + 1." The type-specific length check patch was added to stable and needs to be fixed there as well Fixes: `81cf4a4536` ("USB: core: Add type-specific length check of BOS descriptors") CC: Masakazu Mokuno <masakazu.mokuno@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:16 +01:00
Oliver Neukum	425d2f1533	usb: add RESET_RESUME for ELSA MicroLink 56K commit `b9096d9f15` upstream. This modem needs this quirk to operate. It produces timeouts when resumed without reset. Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:16 +01:00
Dmitry Fleytman Dmitry Fleytman	0f2e9cbc23	usb: Add device quirk for Logitech HD Pro Webcam C925e commit `7f038d256c` upstream. Commit `e0429362ab` ("usb: Add device quirk for Logitech HD Pro Webcams C920 and C930e") introduced quirk to workaround an issue with some Logitech webcams. There is one more model that has the same issue - C925e, so applying the same quirk as well. See aforementioned commit message for detailed explanation of the problem. Signed-off-by: Dmitry Fleytman <dmitry.fleytman@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:16 +01:00
SZ Lin (林上智)	d98f4d4d02	USB: serial: option: adding support for YUGA CLM920-NC5 commit `3920bb7130` upstream. This patch adds support for YUGA CLM920-NC5 PID 0x9625 USB modem to option driver. Interface layout: 0: QCDM/DIAG 1: ADB 2: MODEM 3: AT 4: RMNET Signed-off-by: Taiyi Wu <taiyity.wu@moxa.com> Signed-off-by: SZ Lin (林上智) <sz.lin@moxa.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:16 +01:00
Daniele Palmas	192cdf5eca	USB: serial: option: add support for Telit ME910 PID 0x1101 commit `08933099e6` upstream. This patch adds support for PID 0x1101 of Telit ME910. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:16 +01:00
Reinhard Speyerer	6ab3d87ad7	USB: serial: qcserial: add Sierra Wireless EM7565 commit `92a18a657f` upstream. Sierra Wireless EM7565 devices use the QCSERIAL_SWI layout for their serial ports T: Bus=01 Lev=03 Prnt=29 Port=01 Cnt=02 Dev#= 31 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1199 ProdID=9091 Rev= 0.06 S: Manufacturer=Sierra Wireless, Incorporated S: Product=Sierra Wireless EM7565 Qualcomm Snapdragon X16 LTE-A S: SerialNumber=xxxxxxxx C:* #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=qcserial E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=qcserial E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=qcserial E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 8 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=86(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms but need sendsetup = true for the NMEA port to make it work properly. Simplify the patch compared to v1 as suggested by Bjørn Mork by taking advantage of the fact that existing devices work with sendsetup = true too. Use sendsetup = true for the NMEA interface of QCSERIAL_SWI and add DEVICE_SWI entries for the EM7565 PID 0x9091 and the EM7565 QDL PID 0x9090. Tests with several MC73xx/MC74xx/MC77xx devices have been performed in order to verify backward compatibility. Signed-off-by: Reinhard Speyerer <rspmn@arcor.de> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:16 +01:00
Max Schulze	0af1aebb6a	USB: serial: ftdi_sio: add id for Airbus DS P8GR commit `c6a36ad383` upstream. Add AIRBUS_DS_P8GR device IDs to ftdi_sio driver. Signed-off-by: Max Schulze <max.schulze@posteo.de> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:15 +01:00
Shuah Khan	03dce0573d	usbip: vhci: stop printing kernel pointer addresses in messages commit `8272d099d0` upstream. Remove and/or change debug, info. and error messages to not print kernel pointer addresses. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:15 +01:00
Shuah Khan	9e9f4255c0	usbip: stub: stop printing kernel pointer addresses in messages commit `248a220443` upstream. Remove and/or change debug, info. and error messages to not print kernel pointer addresses. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:15 +01:00
Shuah Khan	1ef5c433b3	usbip: prevent leaking socket pointer address in messages commit `90120d15f4` upstream. usbip driver is leaking socket pointer address in messages. Remove the messages that aren't useful and print sockfd in the ones that are useful for debugging. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:15 +01:00
Juan Zea	3c579d0b4f	usbip: fix usbip bind writing random string after command in match_busid commit `544c4605ac` upstream. usbip bind writes commands followed by random string when writing to match_busid attribute in sysfs, caused by using full variable size instead of string length. Signed-off-by: Juan Zea <juan.zea@qindel.com> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:15 +01:00
Julian Wiedmann	67b539cab4	s390/qeth: update takeover IPs after configuration change [ Upstream commit `02f510f326` ] Any modification to the takeover IP-ranges requires that we re-evaluate which IP addresses are takeover-eligible. Otherwise we might do takeover for some addresses when we no longer should, or vice-versa. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:15 +01:00
Julian Wiedmann	476d7d6932	s390/qeth: lock IP table while applying takeover changes [ Upstream commit `8a03a3692b` ] Modifying the flags of an IP addr object needs to be protected against eg. concurrent removal of the same object from the IP table. Fixes: `5f78e29cee` ("qeth: optimize IP handling in rx_mode callback") Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:15 +01:00
Julian Wiedmann	475018c797	s390/qeth: don't apply takeover changes to RXIP [ Upstream commit `b22d73d668` ] When takeover is switched off, current code clears the 'TAKEOVER' flag on all IPs. But the flag is also used for RXIP addresses, and those should not be affected by the takeover mode. Fix the behaviour by consistenly applying takover logic to NORMAL addresses only. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:14 +01:00
Julian Wiedmann	6ed7c48e93	s390/qeth: apply takeover changes when mode is toggled [ Upstream commit `7fbd9493f0` ] Just as for an explicit enable/disable, toggling the takeover mode also requires that the IP addresses get updated. Otherwise all IPs that were added to the table before the mode-toggle, get registered with the old settings. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:14 +01:00
Moni Shoua	7493d98ea8	net/mlx5: Fix error flow in CREATE_QP command [ Upstream commit `dbff26e44d` ] In error flow, when DESTROY_QP command should be executed, the wrong mailbox was set with data, not the one that is written to hardware, Fix that. Fixes: `09a7d9eca1` '{net,IB}/mlx5: QP/XRCD commands via mlx5 ifc' Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:14 +01:00
Gal Pressman	c844a45894	net/mlx5e: Prevent possible races in VXLAN control flow [ Upstream commit `0c1cc8b221` ] When calling add/remove VXLAN port, a lock must be held in order to prevent race scenarios when more than one add/remove happens at the same time. Fix by holding our state_lock (mutex) as done by all other parts of the driver. Note that the spinlock protecting the radix-tree is still needed in order to synchronize radix-tree access from softirq context. Fixes: `b3f63c3d5e` ("net/mlx5e: Add netdev support for VXLAN tunneling") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:14 +01:00
Gal Pressman	604225824f	net/mlx5e: Add refcount to VXLAN structure [ Upstream commit `23f4cc2cd9` ] A refcount mechanism must be implemented in order to prevent unwanted scenarios such as: - Open an IPv4 VXLAN interface - Open an IPv6 VXLAN interface (different socket) - Remove one of the interfaces With current implementation, the UDP port will be removed from our VXLAN database and turn off the offloads for the other interface, which is still active. The reference count mechanism will only allow UDP port removals once all consumers are gone. Fixes: `b3f63c3d5e` ("net/mlx5e: Add netdev support for VXLAN tunneling") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:14 +01:00
Gal Pressman	d1614fd9cd	net/mlx5e: Fix possible deadlock of VXLAN lock [ Upstream commit `6323514116` ] mlx5e_vxlan_lookup_port is called both from mlx5e_add_vxlan_port (user context) and mlx5e_features_check (softirq), but the lock acquired does not disable bottom half and might result in deadlock. Fix it by simply replacing spin_lock() with spin_lock_bh(). While at it, replace all unnecessary spin_lock_irq() to spin_lock_bh(). lockdep's WARNING: inconsistent lock state [ 654.028136] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 654.028229] swapper/5/0 [HC0[0]:SC1[9]:HE1:SE0] takes: [ 654.028321] (&(&vxlan_db->lock)->rlock){+.?.}, at: [<ffffffffa06e7f0e>] mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core] [ 654.028528] {SOFTIRQ-ON-W} state was registered at: [ 654.028607] _raw_spin_lock+0x3c/0x70 [ 654.028689] mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core] [ 654.028794] mlx5e_vxlan_add_port+0x2e/0x120 [mlx5_core] [ 654.028878] process_one_work+0x1e9/0x640 [ 654.028942] worker_thread+0x4a/0x3f0 [ 654.029002] kthread+0x141/0x180 [ 654.029056] ret_from_fork+0x24/0x30 [ 654.029114] irq event stamp: 579088 [ 654.029174] hardirqs last enabled at (579088): [<ffffffff818f475a>] ip6_finish_output2+0x49a/0x8c0 [ 654.029309] hardirqs last disabled at (579087): [<ffffffff818f470e>] ip6_finish_output2+0x44e/0x8c0 [ 654.029446] softirqs last enabled at (579030): [<ffffffff810b3b3d>] irq_enter+0x6d/0x80 [ 654.029567] softirqs last disabled at (579031): [<ffffffff810b3c05>] irq_exit+0xb5/0xc0 [ 654.029684] other info that might help us debug this: [ 654.029781] Possible unsafe locking scenario: [ 654.029868] CPU0 [ 654.029908] ---- [ 654.029947] lock(&(&vxlan_db->lock)->rlock); [ 654.030045] <Interrupt> [ 654.030090] lock(&(&vxlan_db->lock)->rlock); [ 654.030162] * DEADLOCK * Fixes: `b3f63c3d5e` ("net/mlx5e: Add netdev support for VXLAN tunneling") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:14 +01:00
Gal Pressman	9424a79ec1	net/mlx5e: Fix features check of IPv6 traffic [ Upstream commit `2989ad1ec0` ] The assumption that the next header field contains the transport protocol is wrong for IPv6 packets with extension headers. Instead, we should look the inner-most next header field in the buffer. This will fix TSO offload for tunnels over IPv6 with extension headers. Performance testing: 19.25x improvement, cool! Measuring bandwidth of 16 threads TCP traffic over IPv6 GRE tap. CPU: Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] TSO: Enabled Before: 4,926.24 Mbps Now : 94,827.91 Mbps Fixes: `b3f63c3d5e` ("net/mlx5e: Add netdev support for VXLAN tunneling") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:13 +01:00
Eran Ben Elisha	1387239123	net/mlx5: Fix rate limit packet pacing naming and struct [ Upstream commit `37e92a9d4f` ] In mlx5_ifc, struct size was not complete, and thus driver was sending garbage after the last defined field. Fixed it by adding reserved field to complete the struct size. In addition, rename all set_rate_limit to set_pp_rate_limit to be compliant with the Firmware <-> Driver definition. Fixes: `7486216b3a` ("{net,IB}/mlx5: mlx5_ifc updates") Fixes: `1466cc5b23` ("net/mlx5: Rate limit tables support") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:13 +01:00
Yousuk Seung	e74fe7268e	tcp: invalidate rate samples during SACK reneging [ Upstream commit `d4761754b4` ] Mark tcp_sock during a SACK reneging event and invalidate rate samples while marked. Such rate samples may overestimate bw by including packets that were SACKed before reneging. < ack 6001 win 10000 sack 7001:38001 < ack 7001 win 0 sack 8001:38001 // Reneg detected > seq 7001:8001 // RTO, SACK cleared. < ack 38001 win 10000 In above example the rate sample taken after the last ack will count 7001-38001 as delivered while the actual delivery rate likely could be much lower i.e. 7001-8001. This patch adds a new field tcp_sock.sack_reneg and marks it when we declare SACK reneging and entering TCP_CA_Loss, and unmarks it after the last rate sample was taken before moving back to TCP_CA_Open. This patch also invalidates rate samples taken while tcp_sock.is_sack_reneg is set. Fixes: `b9f64820fb` ("tcp: track data delivery rate for a TCP connection") Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:13 +01:00
Willem de Bruijn	58f6ebbd34	sock: free skb in skb_complete_tx_timestamp on error [ Upstream commit `35b99dffc3` ] skb_complete_tx_timestamp must ingest the skb it is passed. Call kfree_skb if the skb cannot be enqueued. Fixes: `b245be1f4d` ("net-timestamp: no-payload only sysctl") Fixes: `9ac25fc063` ("net: fix socket refcounting in skb_complete_tx_timestamp()") Reported-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:13 +01:00
Grygorii Strashko	a746fadd5e	net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround [ Upstream commit `c1a8d0a3ac` ] Under some circumstances driver will perform PHY reset in ksz9031_read_status() to fix autoneg failure case (idle error count = 0xFF). When this happens ksz9031 will not detect link status change any more when connecting to Netgear 1G switch (link can be recovered sometimes by restarting netdevice "ifconfig down up"). Reproduced with TI am572x board equipped with ksz9031 PHY while connecting to Netgear 1G switch. Fix the issue by reconfiguring autonegotiation after PHY reset in ksz9031_read_status(). Fixes: `d2fd719bcb` ("net/phy: micrel: Add workaround for bad autoneg") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:13 +01:00
Eric W. Biederman	03c93293a8	net: Fix double free and memory corruption in get_net_ns_by_id() [ Upstream commit `21b5944350` ] (I can trivially verify that that idr_remove in cleanup_net happens after the network namespace count has dropped to zero --EWB) Function get_net_ns_by_id() does not check for net::count after it has found a peer in netns_ids idr. It may dereference a peer, after its count has already been finaly decremented. This leads to double free and memory corruption: put_net(peer) rtnl_lock() atomic_dec_and_test(&peer->count) [count=0] ... __put_net(peer) get_net_ns_by_id(net, id) spin_lock(&cleanup_list_lock) list_add(&net->cleanup_list, &cleanup_list) spin_unlock(&cleanup_list_lock) queue_work() peer = idr_find(&net->netns_ids, id) \| get_net(peer) [count=1] \| ... \| (use after final put) v ... cleanup_net() ... spin_lock(&cleanup_list_lock) ... list_replace_init(&cleanup_list, ..) ... spin_unlock(&cleanup_list_lock) ... ... ... ... put_net(peer) ... atomic_dec_and_test(&peer->count) [count=0] ... spin_lock(&cleanup_list_lock) ... list_add(&net->cleanup_list, &cleanup_list) ... spin_unlock(&cleanup_list_lock) ... queue_work() ... rtnl_unlock() rtnl_lock() ... for_each_net(tmp) { ... id = __peernet2id(tmp, peer) ... spin_lock_irq(&tmp->nsid_lock) ... idr_remove(&tmp->netns_ids, id) ... ... ... net_drop_ns() ... net_free(peer) ... } ... \| v cleanup_net() ... (Second free of peer) Also, put_net() on the right cpu may reorder with left's cpu list_replace_init(&cleanup_list, ..), and then cleanup_list will be corrupted. Since cleanup_net() is executed in worker thread, while put_net(peer) can happen everywhere, there should be enough time for concurrent get_net_ns_by_id() to pick the peer up, and the race does not seem to be unlikely. The patch fixes the problem in standard way. (Also, there is possible problem in peernet2id_alloc(), which requires check for net::count under nsid_lock and maybe_get_net(peer), but in current stable kernel it's used under rtnl_lock() and it has to be safe. Openswitch begun to use peernet2id_alloc(), and possibly it should be fixed too. While this is not in stable kernel yet, so I'll send a separate message to netdev@ later). Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Fixes: `0c7aecd4bd` "netns: add rtnl cmd to add and get peer netns ids" Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:13 +01:00
Andrew Lunn	8c38f3190f	net: fec: Allow reception of frames bigger than 1522 bytes [ Upstream commit `fbbeefdd21` ] The FEC Receive Control Register has a 14 bit field indicating the longest frame that may be received. It is being set to 1522. Frames longer than this are discarded, but counted as being in error. When using DSA, frames from the switch has an additional header, either 4 or 8 bytes if a Marvell switch is used. Thus a full MTU frame of 1522 bytes received by the switch on a port becomes 1530 bytes when passed to the host via the FEC interface. Change the maximum receive size to 2048 - 64, where 64 is the maximum rx_alignment applied on the receive buffer for AVB capable FEC cores. Use this value also for the maximum receive buffer size. The driver is already allocating a receive SKB of 2048 bytes, so this change should not have any significant effects. Tested on imx51, imx6, vf610. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:13 +01:00
Nikolay Aleksandrov	243adaa4ea	net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks [ Upstream commit `84aeb437ab` ] The early call to br_stp_change_bridge_id in bridge's newlink can cause a memory leak if an error occurs during the newlink because the fdb entries are not cleaned up if a different lladdr was specified, also another minor issue is that it generates fdb notifications with ifindex = 0. Another unrelated memory leak is the bridge sysfs entries which get added on NETDEV_REGISTER event, but are not cleaned up in the newlink error path. To remove this special case the call to br_stp_change_bridge_id is done after netdev register and we cleanup the bridge on changelink error via br_dev_delete to plug all leaks. This patch makes netlink bridge destruction on newlink error the same as dellink and ioctl del which is necessary since at that point we have a fully initialized bridge device. To reproduce the issue: $ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1 RTNETLINK answers: Invalid argument $ rmmod bridge [ 1822.142525] ============================================================================= [ 1822.143640] BUG bridge_fdb_cache (Tainted: G O ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown() [ 1822.144821] ----------------------------------------------------------------------------- [ 1822.145990] Disabling lock debugging due to kernel taint [ 1822.146732] INFO: Slab 0x0000000092a844b2 objects=32 used=2 fp=0x00000000fef011b0 flags=0x1ffff8000000100 [ 1822.147700] CPU: 2 PID: 13584 Comm: rmmod Tainted: G B O 4.15.0-rc2+ #87 [ 1822.148578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 1822.150008] Call Trace: [ 1822.150510] dump_stack+0x78/0xa9 [ 1822.151156] slab_err+0xb1/0xd3 [ 1822.151834] ? __kmalloc+0x1bb/0x1ce [ 1822.152546] __kmem_cache_shutdown+0x151/0x28b [ 1822.153395] shutdown_cache+0x13/0x144 [ 1822.154126] kmem_cache_destroy+0x1c0/0x1fb [ 1822.154669] SyS_delete_module+0x194/0x244 [ 1822.155199] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 1822.155773] entry_SYSCALL_64_fastpath+0x23/0x9a [ 1822.156343] RIP: 0033:0x7f929bd38b17 [ 1822.156859] RSP: 002b:00007ffd160e9a98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0 [ 1822.157728] RAX: ffffffffffffffda RBX: 00005578316ba090 RCX: 00007f929bd38b17 [ 1822.158422] RDX: 00007f929bd9ec60 RSI: 0000000000000800 RDI: 00005578316ba0f0 [ 1822.159114] RBP: 0000000000000003 R08: 00007f929bff5f20 R09: 00007ffd160e8a11 [ 1822.159808] R10: 00007ffd160e9860 R11: 0000000000000202 R12: 00007ffd160e8a80 [ 1822.160513] R13: 0000000000000000 R14: 0000000000000000 R15: 00005578316ba090 [ 1822.161278] INFO: Object 0x000000007645de29 @offset=0 [ 1822.161666] INFO: Object 0x00000000d5df2ab5 @offset=128 Fixes: `30313a3d57` ("bridge: Handle IFLA_ADDRESS correctly when creating bridge device") Fixes: `5b8d5429da` ("bridge: netlink: register netdevice before executing changelink") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:12 +01:00
Ido Schimmel	e4f6698027	ipv4: Fix use-after-free when flushing FIB tables [ Upstream commit `b4681c2829` ] Since commit `0ddcf43d5d` ("ipv4: FIB Local/MAIN table collapse") the local table uses the same trie allocated for the main table when custom rules are not in use. When a net namespace is dismantled, the main table is flushed and freed (via an RCU callback) before the local table. In case the callback is invoked before the local table is iterated, a use-after-free can occur. Fix this by iterating over the FIB tables in reverse order, so that the main table is always freed after the local table. v3: Reworded comment according to Alex's suggestion. v2: Add a comment to make the fix more explicit per Dave's and Alex's feedback. Fixes: `0ddcf43d5d` ("ipv4: FIB Local/MAIN table collapse") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:12 +01:00
Nikita V. Shirokov	e51abae845	adding missing rcu_read_unlock in ipxip6_rcv [ Upstream commit `74c4b656c3` ] commit `8d79266bc4` ("ip6_tunnel: add collect_md mode to IPv6 tunnels") introduced new exit point in ipxip6_rcv. however rcu_read_unlock is missing there. this diff is fixing this v1->v2: instead of doing rcu_read_unlock in place, we are going to "drop" section (to prevent skb leakage) Fixes: `8d79266bc4` ("ip6_tunnel: add collect_md mode to IPv6 tunnels") Signed-off-by: Nikita V. Shirokov <tehnerd@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:12 +01:00
Tonghao Zhang	ae67e5486b	sctp: Replace use of sockets_allocated with specified macro. [ Upstream commit `8cb38a6024` ] The patch(`180d8cd942`) replaces all uses of struct sock fields' memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem to accessor macros. But the sockets_allocated field of sctp sock is not replaced at all. Then replace it now for unifying the code. Fixes: `180d8cd942` ("foundations of per-cgroup memory pressure controlling.") Cc: Glauber Costa <glommer@parallels.com> Signed-off-by: Tonghao Zhang <zhangtonghao@didichuxing.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:12 +01:00
Tobias Jordan	99cf2daf0d	net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case [ Upstream commit `589bf32f09` ] add appropriate calls to clk_disable_unprepare() by jumping to out_mdio in case orion_mdio_probe() returns -EPROBE_DEFER. Found by Linux Driver Verification project (linuxtesting.org). Fixes: `3d604da1e9` ("net: mvmdio: get and enable optional clock") Signed-off-by: Tobias Jordan <Tobias.Jordan@elektrobit.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:12 +01:00
Mohamed Ghannam	f75f910ffa	net: ipv4: fix for a race condition in raw_sendmsg [ Upstream commit `8f659a03a0` ] inet->hdrincl is racy, and could lead to uninitialized stack pointer usage, so its value should be read only once. Fixes: `c008ba5bdc` ("ipv4: Avoid reading user iov twice after raw_probe_proto_opt") Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:12 +01:00
Brian King	484369ff97	tg3: Fix rx hang on MTU change with 5717/5719 [ Upstream commit `748a240c58` ] This fixes a hang issue seen when changing the MTU size from 1500 MTU to 9000 MTU on both 5717 and 5719 chips. In discussion with Broadcom, they've indicated that these chipsets have the same phy as the 57766 chipset, so the same workarounds apply. This has been tested by IBM on both Power 8 and Power 9 systems as well as by Broadcom on x86 hardware and has been confirmed to resolve the hang issue. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:12 +01:00
Christoph Paasch	7887a700ce	tcp md5sig: Use skb's saddr when replying to an incoming segment [ Upstream commit `30791ac419` ] The MD5-key that belongs to a connection is identified by the peer's IP-address. When we are in tcp_v4(6)_reqsk_send_ack(), we are replying to an incoming segment from tcp_check_req() that failed the seq-number checks. Thus, to find the correct key, we need to use the skb's saddr and not the daddr. This bug seems to have been there since quite a while, but probably got unnoticed because the consequences are not catastrophic. We will call tcp_v4_reqsk_send_ack only to send a challenge-ACK back to the peer, thus the connection doesn't really fail. Fixes: `9501f97229` ("tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash().") Signed-off-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:11 +01:00
Neal Cardwell	a4bf8efd2b	tcp_bbr: record "full bw reached" decision in new full_bw_reached bit [ Upstream commit `c589e69b50` ] This commit records the "full bw reached" decision in a new full_bw_reached bit. This is a pure refactor that does not change the current behavior, but enables subsequent fixes and improvements. In particular, this enables simple and clean fixes because the full_bw and full_bw_cnt can be unconditionally zeroed without worrying about forgetting that we estimated we filled the pipe in Startup. And it enables future improvements because multiple code paths can be used for estimating that we filled the pipe in Startup; any new code paths only need to set this bit when they think the pipe is full. Note that this fix intentionally reduces the width of the full_bw_cnt counter, since we have never used the most significant bit. Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:11 +01:00
Avinash Repaka	53288d8218	RDS: Check cmsg_len before dereferencing CMSG_DATA [ Upstream commit `14e138a86f` ] RDS currently doesn't check if the length of the control message is large enough to hold the required data, before dereferencing the control message data. This results in following crash: BUG: KASAN: stack-out-of-bounds in rds_rdma_bytes net/rds/send.c:1013 [inline] BUG: KASAN: stack-out-of-bounds in rds_sendmsg+0x1f02/0x1f90 net/rds/send.c:1066 Read of size 8 at addr ffff8801c928fb70 by task syzkaller455006/3157 CPU: 0 PID: 3157 Comm: syzkaller455006 Not tainted 4.15.0-rc3+ #161 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x25b/0x340 mm/kasan/report.c:409 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430 rds_rdma_bytes net/rds/send.c:1013 [inline] rds_sendmsg+0x1f02/0x1f90 net/rds/send.c:1066 sock_sendmsg_nosec net/socket.c:628 [inline] sock_sendmsg+0xca/0x110 net/socket.c:638 ___sys_sendmsg+0x320/0x8b0 net/socket.c:2018 __sys_sendmmsg+0x1ee/0x620 net/socket.c:2108 SYSC_sendmmsg net/socket.c:2139 [inline] SyS_sendmmsg+0x35/0x60 net/socket.c:2134 entry_SYSCALL_64_fastpath+0x1f/0x96 RIP: 0033:0x43fe49 RSP: 002b:00007fffbe244ad8 EFLAGS: 00000217 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fe49 RDX: 0000000000000001 RSI: 000000002020c000 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004017b0 R13: 0000000000401840 R14: 0000000000000000 R15: 0000000000000000 To fix this, we verify that the cmsg_len is large enough to hold the data to be read, before proceeding further. Reported-by: syzbot <syzkaller-bugs@googlegroups.com> Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:11 +01:00
Michael S. Tsirkin	8b032bde28	ptr_ring: add barriers [ Upstream commit `a8ceb5dbfd` ] Users of ptr_ring expect that it's safe to give the data structure a pointer and have it be available to consumers, but that actually requires an smb_wmb or a stronger barrier. In absence of such barriers and on architectures that reorder writes, consumer might read an un=initialized value from an skb pointer stored in the skb array. This was observed causing crashes. To fix, add memory barriers. The barrier we use is a wmb, the assumption being that producers do not need to read the value so we do not need to order these reads. Reported-by: George Cherian <george.cherian@cavium.com> Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:11 +01:00
Shaohua Li	b3b56038ba	net: reevalulate autoflowlabel setting after sysctl setting [ Upstream commit `513674b5a2` ] sysctl.ip6.auto_flowlabels is default 1. In our hosts, we set it to 2. If sockopt doesn't set autoflowlabel, outcome packets from the hosts are supposed to not include flowlabel. This is true for normal packet, but not for reset packet. The reason is ipv6_pinfo.autoflowlabel is set in sock creation. Later if we change sysctl.ip6.auto_flowlabels, the ipv6_pinfo.autoflowlabel isn't changed, so the sock will keep the old behavior in terms of auto flowlabel. Reset packet is suffering from this problem, because reset packet is sent from a special control socket, which is created at boot time. Since sysctl.ipv6.auto_flowlabels is 1 by default, the control socket will always have its ipv6_pinfo.autoflowlabel set, even after user set sysctl.ipv6.auto_flowlabels to 1, so reset packset will always have flowlabel. Normal sock created before sysctl setting suffers from the same issue. We can't even turn off autoflowlabel unless we kill all socks in the hosts. To fix this, if IPV6_AUTOFLOWLABEL sockopt is used, we use the autoflowlabel setting from user, otherwise we always call ip6_default_np_autolabel() which has the new settings of sysctl. Note, this changes behavior a little bit. Before commit `42240901f7` (ipv6: Implement different admin modes for automatic flow labels), the autoflowlabel behavior of a sock isn't sticky, eg, if sysctl changes, existing connection will change autoflowlabel behavior. After that commit, autoflowlabel behavior is sticky in the whole life of the sock. With this patch, the behavior isn't sticky again. Cc: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Tom Herbert <tom@quantonium.net> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:11 +01:00
Sebastian Sjoholm	8baa58c5d5	net: qmi_wwan: add Sierra EM7565 1199:9091 [ Upstream commit `aceef61ee5` ] Sierra Wireless EM7565 is an Qualcomm MDM9x50 based M.2 modem. The USB id is added to qmi_wwan.c to allow QMI communication with the EM7565. Signed-off-by: Sebastian Sjoholm <ssjoholm@mac.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:11 +01:00
Kevin Cernekee	0b18782288	netlink: Add netns check on taps [ Upstream commit `93c647643b` ] Currently, a nlmon link inside a child namespace can observe systemwide netlink activity. Filter the traffic so that nlmon can only sniff netlink messages from its own netns. Test case: vpnns -- bash -c "ip link add nlmon0 type nlmon; \ ip link set nlmon0 up; \ tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" & sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \ spi 0x1 mode transport \ auth sha1 0x6162633132330000000000000000000000000000 \ enc aes 0x00000000000000000000000000000000 grep --binary abc123 /tmp/nlmon.pcap Signed-off-by: Kevin Cernekee <cernekee@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:10 +01:00
Kevin Cernekee	2c1a0b2e2b	net: igmp: Use correct source address on IGMPv3 reports [ Upstream commit `a46182b002` ] Closing a multicast socket after the final IPv4 address is deleted from an interface can generate a membership report that uses the source IP from a different interface. The following test script, run from an isolated netns, reproduces the issue: #!/bin/bash ip link add dummy0 type dummy ip link add dummy1 type dummy ip link set dummy0 up ip link set dummy1 up ip addr add 10.1.1.1/24 dev dummy0 ip addr add 192.168.99.99/24 dev dummy1 tcpdump -U -i dummy0 & socat EXEC:"sleep 2" \ UDP4-DATAGRAM:239.101.1.68:8889,ip-add-membership=239.0.1.68:10.1.1.1 & sleep 1 ip addr del 10.1.1.1/24 dev dummy0 sleep 5 kill %tcpdump RFC 3376 specifies that the report must be sent with a valid IP source address from the destination subnet, or from address 0.0.0.0. Add an extra check to make sure this is the case. Signed-off-by: Kevin Cernekee <cernekee@chromium.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:10 +01:00
Fugang Duan	930882f8b8	net: fec: unmap the xmit buffer that are not transferred by DMA [ Upstream commit `178e5f57a8` ] The enet IP only support 32 bit, it will use swiotlb buffer to do dma mapping when xmit buffer DMA memory address is bigger than 4G in i.MX platform. After stress suspend/resume test, it will print out: log: [12826.352864] fec 5b040000.ethernet: swiotlb buffer is full (sz: 191 bytes) [12826.359676] DMA: Out of SW-IOMMU space for 191 bytes at device 5b040000.ethernet [12826.367110] fec 5b040000.ethernet eth0: Tx DMA memory map failed The issue is that the ready xmit buffers that are dma mapped but DMA still don't copy them into fifo, once MAC restart, these DMA buffers are not unmapped. So it should check the dma mapping buffer and unmap them. Signed-off-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:10 +01:00
Eric Dumazet	f6d7cdbb02	ipv6: mcast: better catch silly mtu values [ Upstream commit `b9b312a7a4` ] syzkaller reported crashes in IPv6 stack [1] Xin Long found that lo MTU was set to silly values. IPv6 stack reacts to changes to small MTU, by disabling itself under RTNL. But there is a window where threads not using RTNL can see a wrong device mtu. This can lead to surprises, in mld code where it is assumed the mtu is suitable. Fix this by reading device mtu once and checking IPv6 minimal MTU. [1] skbuff: skb_over_panic: text:0000000010b86b8d len:196 put:20 head:000000003b477e60 data:000000000e85441e tail:0xd4 end:0xc0 dev:lo ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:104! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.15.0-rc2-mm1+ #39 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:skb_panic+0x15c/0x1f0 net/core/skbuff.c:100 RSP: 0018:ffff8801db307508 EFLAGS: 00010286 RAX: 0000000000000082 RBX: ffff8801c517e840 RCX: 0000000000000000 RDX: 0000000000000082 RSI: 1ffff1003b660e61 RDI: ffffed003b660e95 RBP: ffff8801db307570 R08: 1ffff1003b660e23 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff85bd4020 R13: ffffffff84754ed2 R14: 0000000000000014 R15: ffff8801c4e26540 FS: 0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000463610 CR3: 00000001c6698000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> skb_over_panic net/core/skbuff.c:109 [inline] skb_put+0x181/0x1c0 net/core/skbuff.c:1694 add_grhead.isra.24+0x42/0x3b0 net/ipv6/mcast.c:1695 add_grec+0xa55/0x1060 net/ipv6/mcast.c:1817 mld_send_cr net/ipv6/mcast.c:1903 [inline] mld_ifc_timer_expire+0x4d2/0x770 net/ipv6/mcast.c:2448 call_timer_fn+0x23b/0x840 kernel/time/timer.c:1320 expire_timers kernel/time/timer.c:1357 [inline] __run_timers+0x7e1/0xb60 kernel/time/timer.c:1660 run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686 __do_softirq+0x29d/0xbb2 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1d3/0x210 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:540 [inline] smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:920 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:10 +01:00
Eric Dumazet	c2f78bf8ca	ipv4: igmp: guard against silly MTU values [ Upstream commit `b5476022bb` ] IPv4 stack reacts to changes to small MTU, by disabling itself under RTNL. But there is a window where threads not using RTNL can see a wrong device mtu. This can lead to surprises, in igmp code where it is assumed the mtu is suitable. Fix this by reading device mtu once and checking IPv4 minimal MTU. This patch adds missing IPV4_MIN_MTU define, to not abuse ETH_MIN_MTU anymore. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:10 +01:00
Linus Torvalds	b929ccccbc	kbuild: add '-fno-stack-check' to kernel build options commit `3ce120b16c` upstream. It appears that hardened gentoo enables "-fstack-check" by default for gcc. That doesn't work _at_all_ for the kernel, because the kernel stack doesn't act like a user stack at all: it's much smaller, and it doesn't auto-expand on use. So the extra "probe one page below the stack" code generated by -fstack-check just breaks the kernel in horrible ways, causing infinite double faults etc. [ I have to say, that the particular code gcc generates looks very stupid even for user space where it works, but that's a separate issue. ] Reported-and-tested-by: Alexander Tsoy <alexander@tsoy.me> Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:10 +01:00
Andy Lutomirski	04bdf71d9f	x86/mm/64: Fix reboot interaction with CR4.PCIDE commit `924c6b900c` upstream. Trying to reboot via real mode fails with PCID on: long mode cannot be exited while CR4.PCIDE is set. (No, I have no idea why, but the SDM and actual CPUs are in agreement here.) The result is a GPF and a hang instead of a reboot. I didn't catch this in testing because neither my computer nor my VM reboots this way. I can trigger it with reboot=bios, though. Fixes: `660da7c922` ("x86/mm: Enable CR4.PCIDE on supported systems") Reported-and-tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/f1e7d965998018450a7a70c2823873686a8b21c0.1507524746.git.luto@kernel.org Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:10 +01:00
Andy Lutomirski	b52f937ecc	x86/mm: Enable CR4.PCIDE on supported systems commit `660da7c922` upstream. We can use PCID if the CPU has PCID and PGE and we're not on Xen. By itself, this has no effect. A followup patch will start using PCID. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Nadav Amit <nadav.amit@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/6327ecd907b32f79d5aa0d466f04503bbec5df88.1498751203.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:09 +01:00
Andy Lutomirski	e6a29320de	x86/mm: Add the 'nopcid' boot option to turn off PCID commit `0790c9aad8` upstream. The parameter is only present on x86_64 systems to save a few bytes, as PCID is always disabled on x86_32. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Nadav Amit <nadav.amit@gmail.com> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/8bbb2e65bcd249a5f18bfb8128b4689f08ac2b60.1498751203.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:09 +01:00
Andy Lutomirski	1e7f3d8875	x86/mm: Disable PCID on 32-bit kernels commit `cba4671af7` upstream. 32-bit kernels on new hardware will see PCID in CPUID, but PCID can only be used in 64-bit mode. Rather than making all PCID code conditional, just disable the feature on 32-bit builds. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Nadav Amit <nadav.amit@gmail.com> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/2e391769192a4d31b808410c383c6bf0734bc6ea.1498751203.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:09 +01:00
Andy Lutomirski	3e5daacf65	x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code commit `ce4a4e565f` upstream. The UP asm/tlbflush.h generates somewhat nicer code than the SMP version. Aside from that, it's fallen quite a bit behind the SMP code: - flush_tlb_mm_range() didn't flush individual pages if the range was small. - The lazy TLB code was much weaker. This usually wouldn't matter, but, if a kernel thread flushed its lazy "active_mm" more than once (due to reclaim or similar), it wouldn't be unlazied and would instead pointlessly flush repeatedly. - Tracepoints were missing. Aside from that, simply having the UP code around was a maintanence burden, since it means that any change to the TLB flush code had to make sure not to break it. Simplify everything by deleting the UP code. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:09 +01:00
Andy Lutomirski	a94af05008	x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range() commit `ca6c99c079` upstream. flush_tlb_page() was very similar to flush_tlb_mm_range() except that it had a couple of issues: - It was missing an smp_mb() in the case where current->active_mm != mm. (This is a longstanding bug reported by Nadav Amit) - It was missing tracepoints and vm counter updates. The only reason that I can see for keeping it at as a separate function is that it could avoid a few branches that flush_tlb_mm_range() needs to decide to flush just one page. This hardly seems worthwhile. If we decide we want to get rid of those branches again, a better way would be to introduce an __flush_tlb_mm_range() helper and make both flush_tlb_page() and flush_tlb_mm_range() use it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/3cc3847cf888d8907577569b8bac3f01992ef8f9.1495492063.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:09 +01:00
Andy Lutomirski	113980c002	x86/mm: Make flush_tlb_mm_range() more predictable commit `ce27374fab` upstream. I'm about to rewrite the function almost completely, but first I want to get a functional change out of the way. Currently, if flush_tlb_mm_range() does not flush the local TLB at all, it will never do individual page flushes on remote CPUs. This seems to be an accident, and preserving it will be awkward. Let's change it first so that any regressions in the rewrite will be easier to bisect and so that the rewrite can attempt to change no visible behavior at all. The fix is simple: we can simply avoid short-circuiting the calculation of base_pages_to_flush. As a side effect, this also eliminates a potential corner case: if tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range() could have ended up flushing the entire address space one page at a time. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.1492844372.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:09 +01:00
Andy Lutomirski	219acedb06	x86/mm: Remove flush_tlb() and flush_tlb_current_task() commit `29961b59a5` upstream. I was trying to figure out what how flush_tlb_current_task() would possibly work correctly if current->mm != current->active_mm, but I realized I could spare myself the effort: it has no callers except the unused flush_tlb() macro. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.1492844372.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:08 +01:00
Andy Lutomirski	72b812d5b8	x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly() commit `9ccee2373f` upstream. mark_screen_rdonly() is the last remaining caller of flush_tlb(). flush_tlb_mm_range() is potentially faster and isn't obsolete. Compile-tested only because I don't know whether software that uses this mechanism even exists. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.1492844372.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:08 +01:00
Hui Wang	65ca46e5fe	ALSA: hda - fix headset mic detection issue on a Dell machine commit `285d5ddcff` upstream. It has the codec alc256, and add its pin definition to pin quirk table to let it apply ALC255_FIXUP_DELL1_MIC_NO_PRESENCE. Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:08 +01:00
Takashi Iwai	a1dbcd823a	ALSA: hda: Drop useless WARN_ON() commit `a36c263838` upstream. Since the commit `97cc2ed27e` ("ALSA: hda - Fix yet another i915 pointer leftover in error path") cleared hdac_acomp pointer, the WARN_ON() non-NULL check in snd_hdac_i915_register_notifier() may give a false-positive warning, as the function gets called no matter whether the component is registered or not. For fixing it, let's get rid of the spurious WARN_ON(). Fixes: `97cc2ed27e` ("ALSA: hda - Fix yet another i915 pointer leftover in error path") Reported-by: Kouta Okamoto <kouta.okamoto@toshiba.co.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:08 +01:00
Andrew F. Davis	d30d1761bc	ASoC: tlv320aic31xx: Fix GPIO1 register definition commit `737e0b7b67` upstream. GPIO1 control register is number 51, fix this here. Fixes: `bafcbfe429` ("ASoC: tlv320aic31xx: Make the register values human readable") Signed-off-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:08 +01:00
Johan Hovold	b04640a450	ASoC: twl4030: fix child-node lookup commit `15f8c5f241` upstream. Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at the parent rather than just matching on its children. To make things worse, the parent codec node was also prematurely freed, while the child node was leaked. Fixes: `2d6d649a2e` ("ASoC: twl4030: Support for DT booted kernel") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:08 +01:00
Maciej S. Szmigiero	00add00ed2	ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure commit `695b78b548` upstream. AC'97 ops (register read / write) need SSI regmap and clock, so they have to be set after them. We also need to set these ops back to NULL if we fail the probe. Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Acked-by: Nicolin Chen <nicoleotsuka@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:08 +01:00
Johan Hovold	35f87d45cb	ASoC: da7218: fix fix child-node lookup commit `bc6476d6c1` upstream. Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at the parent rather than just matching on its children. To make things worse, the parent codec node was also prematurely freed. Fixes: `4d50934abd` ("ASoC: da7218: Add da7218 codec driver") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:07 +01:00
Ben Hutchings	125e81b5af	ASoC: wm_adsp: Fix validation of firmware and coeff lengths commit `50dd2ea8ef` upstream. The checks for whether another region/block header could be present are subtracting the size from the current offset. Obviously we should instead subtract the offset from the size. The checks for whether the region/block data fit in the file are adding the data size to the current offset and header size, without checking for integer overflow. Rearrange these so that overflow is impossible. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com> Tested-by: Charles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:07 +01:00
Steve Wise	72d5e020c0	iw_cxgb4: Only validate the MSN for successful completions commit `f55688c454` upstream. If the RECV CQE is in error, ignore the MSN check. This was causing recvs that were flushed into the sw cq to be completed with the wrong status (BAD_MSN instead of FLUSHED). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:07 +01:00
Steven Rostedt (VMware)	2e0d458c31	ring-buffer: Mask out the info bits when returning buffer page length commit `45d8b80c2a` upstream. Two info bits were added to the "commit" part of the ring buffer data page when returned to be consumed. This was to inform the user space readers that events have been missed, and that the count may be stored at the end of the page. What wasn't handled, was the splice code that actually called a function to return the length of the data in order to zero out the rest of the page before sending it up to user space. These data bits were returned with the length making the value negative, and that negative value was not checked. It was compared to PAGE_SIZE, and only used if the size was less than PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an unsigned compare, meaning the negative size value did not end up causing a large portion of memory to be randomly zeroed out. Fixes: `66a8cb95ed` ("ring-buffer: Add place holder recording of dropped events") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:07 +01:00
Jing Xia	81e155e7b0	tracing: Fix crash when it fails to alloc ring buffer commit `24f2aaf952` upstream. Double free of the ring buffer happens when it fails to alloc new ring buffer instance for max_buffer if TRACER_MAX_TRACE is configured. The root cause is that the pointer is not set to NULL after the buffer is freed in allocate_trace_buffers(), and the freeing of the ring buffer is invoked again later if the pointer is not equal to Null, as: instance_mkdir() \|-allocate_trace_buffers() \|-allocate_trace_buffer(tr, &tr->trace_buffer...) \|-allocate_trace_buffer(tr, &tr->max_buffer...) // allocate fail(-ENOMEM),first free // and the buffer pointer is not set to null \|-ring_buffer_free(tr->trace_buffer.buffer) // out_free_tr \|-free_trace_buffers() \|-free_trace_buffer(&tr->trace_buffer); //if trace_buffer is not null, free again \|-ring_buffer_free(buf->buffer) \|-rb_free_cpu_buffer(buffer->buffers[cpu]) // ring_buffer_per_cpu is null, and // crash in ring_buffer_per_cpu->pages Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Fixes: `737223fbca` ("tracing: Consolidate buffer allocation code") Signed-off-by: Jing Xia <jing.xia@spreadtrum.com> Signed-off-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:07 +01:00
Steven Rostedt (VMware)	5dc4cd2688	tracing: Fix possible double free on failure of allocating trace buffer commit `4397f04575` upstream. Jing Xia and Chunyan Zhang reported that on failing to allocate part of the tracing buffer, memory is freed, but the pointers that point to them are not initialized back to NULL, and later paths may try to free the freed memory again. Jing and Chunyan fixed one of the locations that does this, but missed a spot. Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com Fixes: `737223fbca` ("tracing: Consolidate buffer allocation code") Reported-by: Jing Xia <jing.xia@spreadtrum.com> Reported-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:07 +01:00
Steven Rostedt (VMware)	6edea15d12	tracing: Remove extra zeroing out of the ring buffer page commit `6b7e633fe9` upstream. The ring_buffer_read_page() takes care of zeroing out any extra data in the page that it returns. There's no need to zero it out again from the consumer. It was removed from one consumer of this function, but read_buffers_splice_read() did not remove it, and worse, it contained a nasty bug because of it. Fixes: `2711ca237a` ("ring-buffer: Move zeroing out excess in page to ring buffer code") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:06 +01:00
Greg Kroah-Hartman	3d16a1315a	sync objtool's copy of x86-opcode-map.txt When building objtool, we get the warning: warning: objtool: x86 instruction decoder differs from kernel That's due to commit `2816c0455c` which was commit `12a78d43de` upstream that modified arch/x86/lib/x86-opcode-map.txt without also updating the objtool copy. The objtool copy was updated in a much larger patch upstream, but we don't need all of that here, so just update the single file. If this gets too annoying, I'll just end up doing what we did for 4.14 and backport the whole series to keep this from happening again, but as this seems to be rare in the 4.9-stable series, this single patch should be fine. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-01-02 20:35:06 +01:00
Greg Kroah-Hartman	b3e88217e2	Linux 4.9.73	2017-12-29 17:43:00 +01:00
Ben Hutchings	37435f7e80	bpf/verifier: Fix states_equal() comparison of pointer and UNKNOWN An UNKNOWN_VALUE is not supposed to be derived from a pointer, unless pointer leaks are allowed. Therefore, states_equal() must not treat a state with a pointer in a register as "equal" to a state with an UNKNOWN_VALUE in that register. This was fixed differently upstream, but the code around here was largely rewritten in 4.14 by commit `f1174f77b5` "bpf/verifier: rework value tracking". The bug can be detected by the bpf/verifier sub-test "pointer/scalar confusion in state equality check (way 1)". Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Edward Cree <ecree@solarflare.com> Cc: Jann Horn <jannh@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net>	2017-12-29 17:43:00 +01:00
Yelena Krivosheev	69cf72b287	net: mvneta: eliminate wrong call to handle rx descriptor error commit `2eecb2e04a` upstream. There are few reasons in mvneta_rx_swbm() function when received packet is dropped. mvneta_rx_error() should be called only if error bit [16] is set in rx descriptor. [gregory.clement@free-electrons.com: add fixes tag] Fixes: `dc35a10f68` ("net: mvneta: bm: add support for hardware buffer management") Signed-off-by: Yelena Krivosheev <yelena@marvell.com> Tested-by: Dmitri Epshtein <dima@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:43:00 +01:00
Yelena Krivosheev	a57f99f484	net: mvneta: use proper rxq_number in loop on rx queues commit `ca5902a654` upstream. When adding the RX queue association with each CPU, a typo was made in the mvneta_cleanup_rxqs() function. This patch fixes it. [gregory.clement@free-electrons.com: add commit log and fixes tag] Fixes: `2dcf75e279` ("net: mvneta: Associate RX queues with each CPU") Signed-off-by: Yelena Krivosheev <yelena@marvell.com> Tested-by: Dmitri Epshtein <dima@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:43:00 +01:00
Yelena Krivosheev	405f3d7946	net: mvneta: clear interface link status on port disable commit `4423c18e46` upstream. When port connect to PHY in polling mode (with poll interval 1 sec), port and phy link status must be synchronize in order don't loss link change event. [gregory.clement@free-electrons.com: add fixes tag] Fixes: `c5aff18204` ("net: mvneta: driver for Marvell Armada 370/XP network unit") Signed-off-by: Yelena Krivosheev <yelena@marvell.com> Tested-by: Dmitri Epshtein <dima@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:43:00 +01:00
Dan Williams	423716cf28	libnvdimm, pfn: fix start_pad handling for aligned namespaces commit `19deaa217b` upstream. The alignment checks at pfn driver startup fail to properly account for the 'start_pad' in the case where the namespace is misaligned relative to its internal alignment. This is typically triggered in 1G aligned namespace, but could theoretically trigger with small namespace alignments. When this triggers the kernel reports messages of the form: dax2.1: bad offset: 0x3c000000 dax disabled align: 0x40000000 Fixes: `1ee6667cd8` ("libnvdimm, pfn, dax: fix initialization vs autodetect...") Reported-by: Jane Chu <jane.chu@oracle.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:43:00 +01:00
Ravi Bangoria	77b318a4e5	powerpc/perf: Dereference BHRB entries safely commit `f41d84dddc` upstream. It's theoretically possible that branch instructions recorded in BHRB (Branch History Rolling Buffer) entries have already been unmapped before they are processed by the kernel. Hence, trying to dereference such memory location will result in a crash. eg: Unable to handle kernel paging request for data at address 0xd000000019c41764 Faulting instruction address: 0xc000000000084a14 NIP [c000000000084a14] branch_target+0x4/0x70 LR [c0000000000eb828] record_and_restart+0x568/0x5c0 Call Trace: [c0000000000eb3b4] record_and_restart+0xf4/0x5c0 (unreliable) [c0000000000ec378] perf_event_interrupt+0x298/0x460 [c000000000027964] performance_monitor_exception+0x54/0x70 [c000000000009ba4] performance_monitor_common+0x114/0x120 Fix it by deferefencing the addresses safely. Fixes: `691231846c` ("powerpc/perf: Fix setting of "to" addresses for BHRB") Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Use probe_kernel_read() which is clearer, tweak change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:43:00 +01:00
Chen-Yu Tsai	2635a64d0e	clk: sunxi: sun9i-mmc: Implement reset callback for reset controls commit `61d2f2a057` upstream. Our MMC host driver now issues a reset, instead of just deasserting the reset control, since commit `c34eda69ad` ("mmc: sunxi: Reset the device at probe time"). The sun9i-mmc clock driver does not support this, and will fail, which results in MMC not probing. This patch implements the reset callback by asserting the reset control, then deasserting it after a small delay. Fixes: `7a6fca879f` ("clk: sunxi: Add driver for A80 MMC config clocks/resets") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Michael Turquette <mturquette@baylibre.com> Link: lkml.kernel.org/r/20171218035751.20661-1-wens@csie.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:43:00 +01:00
Paolo Bonzini	18276e9bcd	kvm: x86: fix RSM when PCID is non-zero commit `fae1a3e775` upstream. rsm_load_state_64() and rsm_enter_protected_mode() load CR3, then CR4 & ~PCIDE, then CR0, then CR4. However, setting CR4.PCIDE fails if CR3[11:0] != 0. It's probably easier in the long run to replace rsm_enter_protected_mode() with an emulator callback that sets all the special registers (like KVM_SET_SREGS would do). For now, set the PCID field of CR3 only after CR4.PCIDE is 1. Reported-by: Laszlo Ersek <lersek@redhat.com> Tested-by: Laszlo Ersek <lersek@redhat.com> Fixes: `660a5d517a` Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:43:00 +01:00
Wanpeng Li	e5c73b3b60	KVM: X86: Fix load RFLAGS w/o the fixed bit commit `d73235d17b` upstream. * Guest State * CR0: actual=0x0000000000000030, shadow=0x0000000060000010, gh_mask=fffffffffffffff7 CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffe871 CR3 = 0x00000000fffbc000 RSP = 0x0000000000000000 RIP = 0x0000000000000000 RFLAGS=0x00000000 DR7 = 0x0000000000000400 ^^^^^^^^^^ The failed vmentry is triggered by the following testcase when ept=Y: #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[5]; int main() { r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); struct kvm_regs regs = { .rflags = 0, }; ioctl(r[4], KVM_SET_REGS, &regs); ioctl(r[4], KVM_RUN, 0); } X86 RFLAGS bit 1 is fixed set, userspace can simply clearing bit 1 of RFLAGS with KVM_SET_REGS ioctl which results in vmentry fails. This patch fixes it by oring X86_EFLAGS_FIXED during ioctl. Suggested-by: Jim Mattson <jmattson@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Quan Xu <quan.xu0@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:59 +01:00
Mika Westerberg	418dfce4fa	pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems commit `d2b3c35359` upstream. Guenter Roeck reported an interrupt storm on a prototype system which is based on Cyan Chromebook. The root cause turned out to be a incorrectly configured pin that triggers spurious interrupts. This will be fixed in coreboot but currently we need to prevent the interrupt storm from happening by masking all interrupts (but not GPEs) on those systems. Link: https://bugzilla.kernel.org/show_bug.cgi?id=197953 Fixes: `bcb48cca23` ("pinctrl: cherryview: Do not mask all interrupts in probe") Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Reported-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:59 +01:00
Ricardo Ribalda Delgado	cb8b2fd190	spi: xilinx: Detect stall with Unknown commands commit `5a1314fa69` upstream. When the core is configured in C_SPI_MODE > 0, it integrates a lookup table that automatically configures the core in dual or quad mode based on the command (first byte on the tx fifo). Unfortunately, that list mode_?_memoy_*.mif does not contain all the supported commands by the flash. Since 4.14 spi-nor automatically tries to probe the flash using SFDP (command 0x5a), and that command is not part of the list_mode table. Whit the right combination of C_SPI_MODE and C_SPI_MEMORY this leads into a stall that can only be recovered with a soft rest. This patch detects this kind of stall and returns -EIO to the caller on those commands. spi-nor can handle this error properly: m25p80 spi0.0: Detected stall. Check C_SPI_MODE and C_SPI_MEMORY. 0x21 0x2404 m25p80 spi0.0: SPI transfer failed: -5 spi_master spi0: failed to transfer one message from queue m25p80 spi0.0: s25sl064p (8192 Kbytes) Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:59 +01:00
Helge Deller	373386ec3f	parisc: Hide Diva-built-in serial aux and graphics card commit `bcf3f1752a` upstream. Diva GSP card has built-in serial AUX port and ATI graphic card which simply don't work and which both don't have external connectors. User Guides even mention that those devices shouldn't be used. So, prevent that Linux drivers try to enable those devices. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:59 +01:00
Rafael J. Wysocki	10b4a621f3	PCI / PM: Force devices to D0 in pci_pm_thaw_noirq() commit `5839ee7389` upstream. It is incorrect to call pci_restore_state() for devices in low-power states (D1-D3), as that involves the restoration of MSI setup which requires MMIO to be operational and that is only the case in D0. However, pci_pm_thaw_noirq() may do that if the driver's "freeze" callbacks put the device into a low-power state, so fix it by making it force devices into D0 via pci_set_power_state() instead of trying to "update" their power state which is pointless. Fixes: `e60514bd44` (PCI/PM: Restore the status of PCI devices across hibernation) Reported-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Maarten Lankhorst <dev@mblankhorst.nl> Tested-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Maarten Lankhorst <dev@mblankhorst.nl> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:59 +01:00
Takashi Iwai	3176065495	ALSA: usb-audio: Fix the missing ctl name suffix at parsing SU commit `5a15f289ee` upstream. The commit `89b89d121f` ("ALSA: usb-audio: Add check return value for usb_string()") added the check of the return value from snd_usb_copy_string_desc(), which is correct per se, but it introduced a regression. In the original code, either the "Clock Source", "Playback Source" or "Capture Source" suffix is added after the terminal string, while the commit changed it to add the suffix only when get_term_name() is failing. It ended up with an incorrect ctl name like "PCM" instead of "PCM Capture Source". Also, even the original code has a similar bug: when the ctl name is generated from snd_usb_copy_string_desc() for the given iSelector, it also doesn't put the suffix. This patch addresses these issues: the suffix is added always when no static mapping is found. Also the patch tries to put more comments and cleans up the if/else block for better readability in order to avoid the same pitfall again. Fixes: `89b89d121f` ("ALSA: usb-audio: Add check return value for usb_string()") Reported-and-tested-by: Mauro Santos <registo.mailling@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:59 +01:00
Jussi Laako	beab14a3ee	ALSA: usb-audio: Add native DSD support for Esoteric D-05X commit `866f7ed7d6` upstream. Adds VID:PID of Esoteric D-05X to the TEAC device id's. Renames the is_teac_50X_dac() function to is_teac_dsd_dac() to cover broader device family from the same corporation sharing the same USB audio implementation. Signed-off-by: Jussi Laako <jussi@sonarnerd.net> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:59 +01:00
Takashi Iwai	cec92448c5	ALSA: rawmidi: Avoid racy info ioctl via ctl device commit `c1cfd9025c` upstream. The rawmidi also allows to obtaining the information via ioctl of ctl API. It means that user can issue an ioctl to the rawmidi device even when it's being removed as long as the control device is present. Although the code has some protection via the global register_mutex, its range is limited to the search of the corresponding rawmidi object, and the mutex is already unlocked at accessing the rawmidi object. This may lead to a use-after-free. For avoiding it, this patch widens the application of register_mutex to the whole snd_rawmidi_info_select() function. We have another mutex per rawmidi object, but this operation isn't very hot path, so it shouldn't matter from the performance POV. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:59 +01:00
Johan Hovold	becf7d87cd	mfd: twl6040: Fix child-node lookup commit `85e9b13cbb` upstream. Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at the parent rather than just matching on its children. To make things worse, the parent node was prematurely freed, while the child node was leaked. Note that the CONFIG_OF compile guard can be removed as of_get_child_by_name() provides a !CONFIG_OF implementation which always fails. Fixes: `37e13cecaa` ("mfd: Add support for Device Tree to twl6040") Fixes: `ca2cad6ae3` ("mfd: Fix twl6040 build failure") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:58 +01:00
Johan Hovold	f4c0796fdc	mfd: twl4030-audio: Fix sibling-node lookup commit `0a423772de` upstream. A helper purported to look up a child node based on its name was using the wrong of-helper and ended up prematurely freeing the parent of-node while leaking any matching node. To make things worse, any matching node would not even necessarily be a child node as the whole device tree was searched depth-first starting at the parent. Fixes: `019a7e6b7b` ("mfd: twl4030-audio: Add DT support") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:58 +01:00
Jon Hunter	2db85cb211	mfd: cros ec: spi: Don't send first message too soon commit `15d8374874` upstream. On the Tegra124 Nyan-Big chromebook the very first SPI message sent to the EC is failing. The Tegra SPI driver configures the SPI chip-selects to be active-high by default (and always has for many years). The EC SPI requires an active-low chip-select and so the Tegra chip-select is reconfigured to be active-low when the EC SPI driver calls spi_setup(). The problem is that if the first SPI message to the EC is sent too soon after reconfiguring the SPI chip-select, it fails. The EC SPI driver prevents back-to-back SPI messages being sent too soon by keeping track of the time the last transfer was sent via the variable 'last_transfer_ns'. To prevent the very first transfer being sent too soon, initialise the 'last_transfer_ns' variable after calling spi_setup() and before sending the first SPI message. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Acked-by: Benson Leung <bleung@chromium.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:58 +01:00
Sebastian Andrzej Siewior	e81cff1ced	crypto: mcryptd - protect the per-CPU queue with a lock commit `9abffc6f2e` upstream. mcryptd_enqueue_request() grabs the per-CPU queue struct and protects access to it with disabled preemption. Then it schedules a worker on the same CPU. The worker in mcryptd_queue_worker() guards access to the same per-CPU variable with disabled preemption. If we take CPU-hotplug into account then it is possible that between queue_work_on() and the actual invocation of the worker the CPU goes down and the worker will be scheduled on _another_ CPU. And here the preempt_disable() protection does not work anymore. The easiest thing is to add a spin_lock() to guard access to the list. Another detail: mcryptd_queue_worker() is not processing more than MCRYPTD_BATCH invocation in a row. If there are still items left, then it will invoke queue_work() to proceed with more later. I would suggest to simply drop that check because it does not use a system workqueue and the workqueue is already marked as "CPU_INTENSIVE". And if preemption is required then the scheduler should do it. However if queue_work() is used then the work item is marked as CPU unbound. That means it will try to run on the local CPU but it may run on another CPU as well. Especially with CONFIG_DEBUG_WQ_FORCE_RR_CPU=y. Again, the preempt_disable() won't work here but lock which was introduced will help. In order to keep work-item on the local CPU (and avoid RR) I changed it to queue_work_on(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:58 +01:00
Dan Williams	d31a207aaf	acpi, nfit: fix health event notification commit `adf6895754` upstream. Integration testing with a BIOS that generates injected health event notifications fails to communicate those events to userspace. The nfit driver neglects to link the ACPI DIMM device with the necessary driver data so acpi_nvdimm_notify() fails this lookup: nfit_mem = dev_get_drvdata(dev); if (nfit_mem && nfit_mem->flags_attr) sysfs_notify_dirent(nfit_mem->flags_attr); Add the necessary linkage when installing the notification handler and clean it up when the nfit driver instance is torn down. Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Fixes: `ba9c8dd3c2` ("acpi, nfit: add dimm device notification support") Reported-by: Daniel Osawa <daniel.k.osawa@intel.com> Tested-by: Daniel Osawa <daniel.k.osawa@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:58 +01:00
Takashi Iwai	54c74d3881	ACPI: APEI / ERST: Fix missing error handling in erst_reader() commit `bb82e0b4a7` upstream. The commit `f6f8285132` ("pstore: pass allocated memory region back to caller") changed the check of the return value from erst_read() in erst_reader() in the following way: if (len == -ENOENT) goto skip; - else if (len < 0) { - rc = -1; + else if (len < sizeof(*rcd)) { + rc = -EIO; goto out; This introduced another bug: since the comparison with sizeof() is cast to unsigned, a negative len value doesn't hit any longer. As a result, when an error is returned from erst_read(), the code falls through, and it may eventually lead to some weird thing like memory corruption. This patch adds the negative error value check more explicitly for addressing the issue. Fixes: `f6f8285132` (pstore: pass allocated memory region back to caller) Tested-by: Jerry Tang <jtang@suse.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-29 17:42:57 +01:00
Greg Kroah-Hartman	2df3979310	Linux 4.9.72	2017-12-25 14:23:47 +01:00
Guenter Roeck	6430e166ae	sparc32: Export vac_cache_size to fix build error commit `9d262d9511` upstream. sparc32:allmodconfig fails to build with the following error. ERROR: "vac_cache_size" [drivers/infiniband/sw/rxe/rdma_rxe.ko] undefined! Fixes: `cb88645596` ("infiniband: Fix alignment of mmap cookies ...") Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Doug Ledford <dledford@redhat.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:47 +01:00
Daniel Borkmann	3695b3b185	bpf: fix incorrect sign extension in check_alu_op() From: Jann Horn <jannh@google.com> [ Upstream commit `95a762e2c8` ] Distinguish between BPF_ALU64\|BPF_MOV\|BPF_K (load 32-bit immediate, sign-extended to 64-bit) and BPF_ALU\|BPF_MOV\|BPF_K (load 32-bit immediate, zero-padded to 64-bit); only perform sign extension in the first case. Starting with v4.14, this is exploitable by unprivileged users as long as the unprivileged_bpf_disabled sysctl isn't set. Debian assigned CVE-2017-16995 for this issue. v3: - add CVE number (Ben Hutchings) Fixes: `484611357c` ("bpf: allow access into map value arrays") Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:47 +01:00
Daniel Borkmann	d75d3ee237	bpf: reject out-of-bounds stack pointer calculation From: Jann Horn <jannh@google.com> Reject programs that compute wildly out-of-bounds stack pointers. Otherwise, pointers can be computed with an offset that doesn't fit into an `int`, causing security issues in the stack memory access check (as well as signed integer overflow during offset addition). This is a fix specifically for the v4.9 stable tree because the mainline code looks very different at this point. Fixes: `7bca0a9702` ("bpf: enhance verifier to understand stack pointer arithmetic") Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:47 +01:00
Daniel Borkmann	7b5b73ea87	bpf: fix branch pruning logic From: Alexei Starovoitov <ast@fb.com> [ Upstream commit `c131187db2` ] when the verifier detects that register contains a runtime constant and it's compared with another constant it will prune exploration of the branch that is guaranteed not to be taken at runtime. This is all correct, but malicious program may be constructed in such a way that it always has a constant comparison and the other branch is never taken under any conditions. In this case such path through the program will not be explored by the verifier. It won't be taken at run-time either, but since all instructions are JITed the malicious program may cause JITs to complain about using reserved fields, etc. To fix the issue we have to track the instructions explored by the verifier and sanitize instructions that are dead at run time with NOPs. We cannot reject such dead code, since llvm generates it for valid C code, since it doesn't do as much data flow analysis as the verifier does. Fixes: `17a5267067` ("bpf: verifier (add verifier core)") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:47 +01:00
Daniel Borkmann	565f012f5a	bpf: adjust insn_aux_data when patching insns From: Alexei Starovoitov <ast@fb.com> [ Upstream commit `8041902dae` ] convert_ctx_accesses() replaces single bpf instruction with a set of instructions. Adjust corresponding insn_aux_data while patching. It's needed to make sure subsequent 'for(all insn)' loops have matching insn and insn_aux_data. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:47 +01:00
Greg Kroah-Hartman	3b6c84bc64	Revert "Bluetooth: btusb: driver to enable the usb-wakeup feature" This reverts commit `7336f5481f` which is commit `a0085f2510` upstream. It causes problems with working systems, as noted by a number of the ChromeOS developers. Cc: Sukumar Ghorai <sukumar.ghorai@intel.com> Cc: Amit K Bag <amit.k.bag@intel.com> Cc: Oliver Neukum <oneukum@suse.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Matthias Kaehlcke <mka@chromium.org> Reported-by: Guenter Roeck <linux@roeck-us.net> Reported-by: Brian Norris <briannorris@chromium.org> Acked-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:46 +01:00
Peter Hutterer	dbeb719e24	platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes commit `bff5bf9db1` upstream. Sending the switch state change twice within the same frame is invalid evdev protocol and only works if the client handles keys immediately as well. Processing events immediately is incorrect, it forces a fake order of events that does not exist on the device. Recent versions of libinput changed to only process the device state and SYN_REPORT time, so now the key event is lost. https://bugs.freedesktop.org/show_bug.cgi?id=104041 Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:46 +01:00
Aleksandar Markovic	00ecb4b1a5	MIPS: math-emu: Fix final emulation phase for certain instructions commit `409fcace99` upstream. Fix final phase of <CLASS\|MADDF\|MSUBF\|MAX\|MIN\|MAXA\|MINA>.<D\|S> emulation. Provide proper generation of SIGFPE signal and updating debugfs FP exception stats in cases of any exception flags set in preceding phases of emulation. CLASS.<D\|S> instruction may generate "Unimplemented Operation" FP exception. <MADDF\|MSUBF>.<D\|S> instructions may generate "Inexact", "Unimplemented Operation", "Invalid Operation", "Overflow", and "Underflow" FP exceptions. <MAX\|MIN\|MAXA\|MINA>.<D\|S> instructions can generate "Unimplemented Operation" and "Invalid Operation" FP exceptions. The proper final processing of the cases when any FP exception flag is set is achieved by replacing "break" statement with "goto copcsr" statement. With such solution, this patch brings the final phase of emulation of the above instructions consistent with the one corresponding to the previously implemented emulation of other related FPU instructions (ADD, SUB, etc.). Fixes: `38db37ba06` ("MIPS: math-emu: Add support for the MIPS R6 CLASS FPU instruction") Fixes: `e24c3bec3e` ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction") Fixes: `83d43305a1` ("MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction") Fixes: `a79f5f9ba5` ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction") Fixes: `4e9561b20e` ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction") Signed-off-by: Aleksandar Markovic <aleksandar.markovic@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Douglas Leung <douglas.leung@mips.com> Cc: Goran Ferenc <goran.ferenc@mips.com> Cc: "Maciej W. Rozycki" <macro@imgtec.com> Cc: Miodrag Dinic <miodrag.dinic@mips.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Petar Jovanovic <petar.jovanovic@mips.com> Cc: Raghu Gandham <raghu.gandham@mips.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17581/ Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:46 +01:00
Daniel Lezcano	3cff90788e	thermal/drivers/hisi: Fix multiple alarm interrupts firing commit `db2b033260` upstream. The DT specifies a threshold of 65000, we setup the register with a value in the temperature resolution for the controller, 64656. When we reach 64656, the interrupt fires, the interrupt is disabled. Then the irq thread runs and calls thermal_zone_device_update() which will call in turn hisi_thermal_get_temp(). The function will look if the temperature decreased, assuming it was more than 65000, but that is not the case because the current temperature is 64656 (because of the rounding when setting the threshold). This condition being true, we re-enable the interrupt which fires immediately after exiting the irq thread. That happens again and again until the temperature goes to more than 65000. Potentially, there is here an interrupt storm if the temperature stabilizes at this temperature. A very unlikely case but possible. In any case, it does not make sense to handle dozens of alarm interrupt for nothing. Fix this by rounding the threshold value to the controller resolution so the check against the threshold is consistent with the one set in the controller. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Leo Yan <leo.yan@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:46 +01:00
Daniel Lezcano	1b2c46a6be	thermal/drivers/hisi: Simplify the temperature/step computation commit `48880b979c` upstream. The step and the base temperature are fixed values, we can simplify the computation by converting the base temperature to milli celsius and use a pre-computed step value. That saves us a lot of mult + div for nothing at runtime. Take also the opportunity to change the function names to be consistent with the rest of the code. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Leo Yan <leo.yan@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:46 +01:00
Daniel Lezcano	2dac559df9	thermal/drivers/hisi: Fix kernel panic on alarm interrupt commit `2cb4de785c` upstream. The threaded interrupt for the alarm interrupt is requested before the temperature controller is setup. This one can fire an interrupt immediately leading to a kernel panic as the sensor data is not initialized. In order to prevent that, move the threaded irq after the Tsensor is setup. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Leo Yan <leo.yan@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:46 +01:00
Daniel Lezcano	b679b8d7ba	thermal/drivers/hisi: Fix missing interrupt enablement commit `c176b10b02` upstream. The interrupt for the temperature threshold is not enabled at the end of the probe function, enable it after the setup is complete. On the other side, the irq_enabled is not correctly set as we are checking if the interrupt is masked where 'yes' means irq_enabled=false. irq_get_irqchip_state(data->irq, IRQCHIP_STATE_MASKED, &data->irq_enabled); As we are always enabling the interrupt, it is pointless to check if the interrupt is masked or not, just set irq_enabled to 'true'. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Leo Yan <leo.yan@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:46 +01:00
Arvind Yadav	82bf76afa8	thermal: hisilicon: Handle return value of clk_prepare_enable commit `919054fdfc` upstream. clk_prepare_enable() can fail here and we must check its return value. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:46 +01:00
Nicholas Piggin	b86c7b8c5d	cpuidle: fix broadcast control when broadcast can not be entered [ Upstream commit `f187851b9b` ] When failing to enter broadcast timer mode for an idle state that requires it, a new state is selected that does not require broadcast, but the broadcast variable remains set. This causes tick_broadcast_exit to be called despite not having entered broadcast mode. This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some cases. It does not appear to cause problems for code today, but seems to violate the interface so should be fixed. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:46 +01:00
Alexandre Belloni	15319d2a49	rtc: set the alarm to the next expiring timer [ Upstream commit `74717b28cb` ] If there is any non expired timer in the queue, the RTC alarm is never set. This is an issue when adding a timer that expires before the next non expired timer. Ensure the RTC alarm is set in that case. Fixes: `2b2f5ff00f` ("rtc: interface: ignore expired timers when enqueuing new timers") Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:46 +01:00
Hoang Tran	acc96729e1	tcp: fix under-evaluated ssthresh in TCP Vegas [ Upstream commit `cf5d74b85e` ] With the commit `76174004a0` (tcp: do not slow start when cwnd equals ssthresh), the comparison to the reduced cwnd in tcp_vegas_ssthresh() would under-evaluate the ssthresh. Signed-off-by: Hoang Tran <hoang.tran@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:45 +01:00
Chen-Yu Tsai	5859027994	clk: sunxi-ng: sun6i: Rename HDMI DDC clock to avoid name collision [ Upstream commit `7f3ed79188` ] The HDMI DDC clock found in the CCU is the parent of the actual DDC clock within the HDMI controller. That clock is also named "hdmi-ddc". Rename the one in the CCU to "ddc". This makes more sense than renaming the one in the HDMI controller to something else. Fixes: `c6e6c96d8f` ("clk: sunxi-ng: Add A31/A31s clocks") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:45 +01:00
Arvind Yadav	bb0618ac23	staging: greybus: light: Release memory obtained by kasprintf [ Upstream commit `04820da210` ] Free memory region, if gb_lights_channel_config is not successful. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Rui Miguel Silva <rmfrfs@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:45 +01:00
Mike Manning	4bf42a2ec1	net: ipv6: send NS for DAD when link operationally up [ Upstream commit `1f372c7bfb` ] The NS for DAD are sent on admin up as long as a valid qdisc is found. A race condition exists by which these packets will not egress the interface if the operational state of the lower device is not yet up. The solution is to delay DAD until the link is operationally up according to RFC2863. Rather than only doing this, follow the existing code checks by deferring IPv6 device initialization altogether. The fix allows DAD on devices like tunnels that are controlled by userspace control plane. The fix has no impact on regular deployments, but means that there is no IPv6 connectivity until the port has been opened in the case of port-based network access control, which should be desirable. Signed-off-by: Mike Manning <mmanning@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:45 +01:00
Jacob Keller	52d0a601ae	fm10k: ensure we process SM mbx when processing VF mbx [ Upstream commit `17a9180994` ] When we process VF mailboxes, the driver is likely going to also queue up messages to the switch manager. This process merely queues up the FIFO, but doesn't actually begin the transmission process. Because we hold the mailbox lock during this VF processing, the PF<->SM mailbox is not getting processed at this time. Ensure that we actually process the PF<->SM mailbox in between each PF<->VF mailbox. This should ensure prompt transmission of the messages queued up after each VF message is received and handled. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:45 +01:00
Alex Williamson	76d83bfc11	vfio/pci: Virtualize Maximum Payload Size [ Upstream commit `523184972b` ] With virtual PCI-Express chipsets, we now see userspace/guest drivers trying to match the physical MPS setting to a virtual downstream port. Of course a lone physical device surrounded by virtual interconnects cannot make a correct decision for a proper MPS setting. Instead, let's virtualize the MPS control register so that writes through to hardware are disallowed. Userspace drivers like QEMU assume they can write anything to the device and we'll filter out anything dangerous. Since mismatched MPS can lead to AER and other faults, let's add it to the kernel side rather than relying on userspace virtualization to handle it. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:45 +01:00
Dick Kennedy	de5a4c816d	scsi: lpfc: PLOGI failures during NPIV testing [ Upstream commit `e8bcf0ae4c` ] Local Reject/Invalid RPI errors seen during discovery. Temporary RPI cleanup was occurring regardless of SLI rev. It's only necessary on SLI-4. Adjust the test for whether cleanup is necessary. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:45 +01:00
Dick Kennedy	b438d2f7e2	scsi: lpfc: Fix secure firmware updates [ Upstream commit `184fc2b9a8` ] Firmware update fails with: status x17 add_status x56 on the final write If multiple DMA buffers are used for the download, some firmware revs have difficulty with signatures and crcs split across the dma buffer boundaries. Resolve by making all writes be a single 4k page in length. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:45 +01:00
Jacob Keller	fc9d6386a9	fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw [ Upstream commit `3e256ac5b1` ] We've had support for setting both a minimum and maximum bandwidth via .ndo_set_vf_bw since commit `883a9ccbae` ("fm10k: Add support for SR-IOV to driver", 2014-09-20). Likely because we do not support minimum rates, the declaration mis-ordered the "unused" parameter, which causes warnings when analyzed with cppcheck. Fix this warning by properly declaring the min_rate and max_rate variables in the declaration and definition (rather than using "unused"). Also rename "rate" to max_rate so as to clarify that we only support setting the maximum rate. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:45 +01:00
Ed Blake	bd0feaac15	ASoC: img-parallel-out: Add pm_runtime_get/put to set_fmt callback [ Upstream commit `c70458890f` ] Add pm_runtime_get_sync and pm_runtime_put calls to set_fmt callback function. This fixes a bus error during boot when CONFIG_SUSPEND is defined when this function gets called while the device is runtime disabled and device registers are accessed while the clock is disabled. Signed-off-by: Ed Blake <ed.blake@sondrel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:45 +01:00
Tom Zanussi	6af9b18a2e	tracing: Exclude 'generic fields' from histograms [ Upstream commit `a15f7fc203` ] There are a small number of 'generic fields' (comm/COMM/cpu/CPU) that are found by trace_find_event_field() but are only meant for filtering. Specifically, they unlike normal fields, they have a size of 0 and thus wreak havoc when used as a histogram key. Exclude these (return -EINVAL) when used as histogram keys. Link: http://lkml.kernel.org/r/956154cbc3e8a4f0633d619b886c97f0f0edf7b4.1506105045.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:44 +01:00
Gabriele Paoloni	fbb2d72a54	PCI/AER: Report non-fatal errors only to the affected endpoint [ Upstream commit `86acc79071` ] Previously, if an non-fatal error was reported by an endpoint, we called report_error_detected() for the endpoint, every sibling on the bus, and their descendents. If any of them did not implement the .error_detected() method, do_recovery() failed, leaving all these devices unrecovered. For example, the system described in the bugzilla below has two devices: 0000:74:02.0 [19e5:a230] SAS controller, driver has .error_detected() 0000:74:03.0 [19e5:a235] SATA controller, driver lacks .error_detected() When a device such as 74:02.0 reported a non-fatal error, do_recovery() failed because 74:03.0 lacked an .error_detected() method. But per PCIe r3.1, sec 6.2.2.2.2, such an error does not compromise the Link and does not affect 74:03.0: Non-fatal errors are uncorrectable errors which cause a particular transaction to be unreliable but the Link is otherwise fully functional. Isolating Non-fatal from Fatal errors provides Requester/Receiver logic in a device or system management software the opportunity to recover from the error without resetting the components on the Link and disturbing other transactions in progress. Devices not associated with the transaction in error are not impacted by the error. Report non-fatal errors only to the endpoint that reported them. We really want to check for AER_NONFATAL here, but the current code structure doesn't allow that. Looking for pci_channel_io_normal is the best we can do now. Link: https://bugzilla.kernel.org/show_bug.cgi?id=197055 Fixes: `6c2b374d74` ("PCI-Express AER implemetation: AER core and aerdriver") Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:44 +01:00
Colin Ian King	1d4b32bee9	IB/rxe: check for allocation failure on elem [ Upstream commit `4831ca9e4a` ] The allocation for elem may fail (especially because we're using GFP_ATOMIC) so best to check for a null return. This fixes a potential null pointer dereference when assigning elem->pool. Detected by CoverityScan CID#1357507 ("Dereference null return value") Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:44 +01:00
Emil Tantilov	2141182852	ixgbe: fix use of uninitialized padding [ Upstream commit `dcfd6b839c` ] This patch is resolving Coverity hits where padding in a structure could be used uninitialized. - Initialize fwd_cmd.pad/2 before ixgbe_calculate_checksum() - Initialize buffer.pad2/3 before ixgbe_hic_unlocked() Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:44 +01:00
Christophe JAILLET	700053c873	igb: check memory allocation failure [ Upstream commit `18eb86362a` ] Check memory allocation failures and return -ENOMEM in such cases, as already done for other memory allocations in this function. This avoids NULL pointers dereference. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Acked-by: PJ Waskiewicz <peter.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:44 +01:00
Fabio Estevam	c236525bae	PM / OPP: Move error message to debug level [ Upstream commit `035ed07208` ] On some i.MX6 platforms which do not have speed grading check, opp table will not be created in platform code, so cpufreq driver prints the following error message: cpu cpu0: dev_pm_opp_get_opp_count: OPP table not found (-19) However, this is not really an error in this case because the imx6q-cpufreq driver first calls dev_pm_opp_get_opp_count() and if it fails, it means that platform code does not provide OPP and then dev_pm_opp_of_add_table() will be called. In order to avoid such confusing error message, move it to debug level. It is up to the caller of dev_pm_opp_get_opp_count() to check its return value and decide if it will print an error or not. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:44 +01:00
Stuart Hayes	164a941c03	PCI: Create SR-IOV virtfn/physfn links before attaching driver [ Upstream commit `27d6162944` ] When creating virtual functions, create the "virtfn%u" and "physfn" links in sysfs before attaching the driver instead of after. When we attach the driver to the new virtual network interface first, there is a race when the driver attaches to the new sends out an "add" udev event, and the network interface naming software (biosdevname or systemd, for example) tries to look at these links. Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:44 +01:00
Sreekanth Reddy	b40eeea31a	scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive [ Upstream commit `2ce9a36452` ] Whenever an I/O for a RAID volume fails with IOCStatus MPI2_IOCSTATUS_SCSI_IOC_TERMINATED and SCSIStatus equal to (MPI2_SCSI_STATE_TERMINATED \| MPI2_SCSI_STATE_NO_SCSI_STATUS) then return the I/O to SCSI midlayer with "DID_RESET" (i.e. retry the IO infinite times) set in the host byte. Previously, the driver was completing the I/O with "DID_SOFT_ERROR" which causes the I/O to be quickly retried. However, firmware needed more time and hence I/Os were failing. Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:44 +01:00
Varun Prakash	fd1d9dccc0	scsi: cxgb4i: fix Tx skb leak [ Upstream commit `9b3a081fb6` ] In case of connection reset Tx skb queue can have some skbs which are not transmitted so purge Tx skb queue in release_offload_resources() to avoid skb leak. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:44 +01:00
David Daney	241833a3a9	PCI: Avoid bus reset if bridge itself is broken [ Upstream commit `357027786f` ] When checking to see if a PCI bus can safely be reset, we previously checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET flag set. Children marked with that flag are known not to behave well after a bus reset. Some PCIe root port bridges also do not behave well after a bus reset, sometimes causing the devices behind the bridge to become unusable. Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device to allow these bridges to be flagged, and prevent their secondary buses from being reset. Signed-off-by: David Daney <david.daney@cavium.com> [jglauber@cavium.com: fixed typo] Signed-off-by: Jan Glauber <jglauber@cavium.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:44 +01:00
Dan Murphy	d3469e6166	net: phy: at803x: Change error to EINVAL for invalid MAC [ Upstream commit `fc7556877d` ] Change the return error code to EINVAL if the MAC address is not valid in the set_wol function. Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:43 +01:00
Shakeel Butt	206e1621ba	kvm, mm: account kvm related kmem slabs to kmemcg [ Upstream commit `46bea48ac2` ] The kvm slabs can consume a significant amount of system memory and indeed in our production environment we have observed that a lot of machines are spending significant amount of memory that can not be left as system memory overhead. Also the allocations from these slabs can be triggered directly by user space applications which has access to kvm and thus a buggy application can leak such memory. So, these caches should be accounted to kmemcg. Signed-off-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:43 +01:00
Russell King	bdb33bb5e2	rtc: pl031: make interrupt optional [ Upstream commit `5b64a2965d` ] On some platforms, the interrupt for the PL031 is optional. Avoid trying to claim the interrupt if it's not specified. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:43 +01:00
Christian Lamparter	1525e330d6	crypto: crypto4xx - increase context and scatter ring buffer elements [ Upstream commit `778f81d6cd` ] If crypto4xx is used in conjunction with dm-crypt, the available ring buffer elements are not enough to handle the load properly. On an aes-cbc-essiv:sha256 encrypted swap partition the read performance is abyssal: (tested with hdparm -t) /dev/mapper/swap_crypt: Timing buffered disk reads: 14 MB in 3.68 seconds = 3.81 MB/sec The patch increases both PPC4XX_NUM_SD and PPC4XX_NUM_PD to 256. This improves the performance considerably: /dev/mapper/swap_crypt: Timing buffered disk reads: 104 MB in 3.03 seconds = 34.31 MB/sec Furthermore, PPC4XX_LAST_SD, PPC4XX_LAST_GD and PPC4XX_LAST_PD can be easily calculated from their respective PPC4XX_NUM_* constant. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:43 +01:00
Derek Basehore	291c7e488f	backlight: pwm_bl: Fix overflow condition [ Upstream commit `5d0c49aceb` ] This fixes an overflow condition that can happen with high max brightness and period values in compute_duty_cycle. This fixes it by using a 64 bit variable for computing the duty cycle. Signed-off-by: Derek Basehore <dbasehore@chromium.org> Acked-by: Thierry Reding <thierry.reding@gmail.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:43 +01:00
Sankar Patchineelam	d14718c9f4	bnxt_en: Fix NULL pointer dereference in reopen failure path [ Upstream commit `2247925f09` ] Net device reset can fail when the h/w or f/w is in a bad state. Subsequent netdevice open fails in bnxt_hwrm_stat_ctx_alloc(). The cleanup invokes bnxt_hwrm_resource_free() which inturn calls bnxt_disable_int(). In this routine, the code segment if (ring->fw_ring_id != INVALID_HW_RING_ID) BNXT_CP_DB(cpr->cp_doorbell, cpr->cp_raw_cons); results in NULL pointer dereference as cpr->cp_doorbell is not yet initialized, and fw_ring_id is zero. The fix is to initialize cpr fw_ring_id to INVALID_HW_RING_ID before bnxt_init_chip() is invoked. Signed-off-by: Sankar Patchineelam <sankar.patchineelam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:43 +01:00
Vaidyanathan Srinivasan	9e1771368a	cpuidle: powernv: Pass correct drv->cpumask for registration [ Upstream commit `293d264f13` ] drv->cpumask defaults to cpu_possible_mask in __cpuidle_driver_init(). On PowerNV platform cpu_present could be less than cpu_possible in cases where firmware detects the cpu, but it is not available to the OS. When CONFIG_HOTPLUG_CPU=n, such cpus are not hotplugable at runtime and hence we skip creating cpu_device. This breaks cpuidle on powernv where register_cpu() is not called for cpus in cpu_possible_mask that cannot be hot-added at runtime. Trying cpuidle_register_device() on cpu without cpu_device will cause crash like this: cpu 0xf: Vector: 380 (Data SLB Access) at [c000000ff1503490] pc: c00000000022c8bc: string+0x34/0x60 lr: c00000000022ed78: vsnprintf+0x284/0x42c sp: c000000ff1503710 msr: 9000000000009033 dar: 6000000060000000 current = 0xc000000ff1480000 paca = 0xc00000000fe82d00 softe: 0 irq_happened: 0x01 pid = 1, comm = swapper/8 Linux version 4.11.0-rc2 (sv@sagarika) (gcc version 4.9.4 (Buildroot 2017.02-00004-gc28573e) ) #15 SMP Fri Mar 17 19:32:02 IST 2017 enter ? for help [link register ] c00000000022ed78 vsnprintf+0x284/0x42c [c000000ff1503710] c00000000022ebb8 vsnprintf+0xc4/0x42c (unreliable) [c000000ff1503800] c00000000022ef40 vscnprintf+0x20/0x44 [c000000ff1503830] c0000000000ab61c vprintk_emit+0x94/0x2cc [c000000ff15038a0] c0000000000acc9c vprintk_func+0x60/0x74 [c000000ff15038c0] c000000000619694 printk+0x38/0x4c [c000000ff15038e0] c000000000224950 kobject_get+0x40/0x60 [c000000ff1503950] c00000000022507c kobject_add_internal+0x60/0x2c4 [c000000ff15039e0] c000000000225350 kobject_init_and_add+0x70/0x78 [c000000ff1503a60] c00000000053c288 cpuidle_add_sysfs+0x9c/0xe0 [c000000ff1503ae0] c00000000053aeac cpuidle_register_device+0xd4/0x12c [c000000ff1503b30] c00000000053b108 cpuidle_register+0x98/0xcc [c000000ff1503bc0] c00000000085eaf0 powernv_processor_idle_init+0x140/0x1e0 [c000000ff1503c60] c00000000000cd60 do_one_initcall+0xc0/0x15c [c000000ff1503d20] c000000000833e84 kernel_init_freeable+0x1a0/0x25c [c000000ff1503dc0] c00000000000d478 kernel_init+0x24/0x12c [c000000ff1503e30] c00000000000b564 ret_from_kernel_thread+0x5c/0x78 This patch fixes the bug by passing correct cpumask from powernv-cpuidle driver. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [ rjw: Comment massage ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:43 +01:00
Russell King	5460e4672b	ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory [ Upstream commit `916a008b4b` ] dma_get_sgtable() tries to create a scatterlist table containing valid struct page pointers for the coherent memory allocation passed in to it. However, memory can be declared via dma_declare_coherent_memory(), or via other reservation schemes which means that coherent memory is not guaranteed to be backed by struct pages. In such cases, the resulting scatterlist table contains pointers to invalid pages, which causes kernel oops later. This patch adds detection of such memory, and refuses to create a scatterlist table for such memory. Reported-by: Shuah Khan <shuahkhan@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:43 +01:00
Dan Carpenter	9c1433b5dd	Btrfs: fix an integer overflow check [ Upstream commit `457ae7268b` ] This isn't super serious because you need CAP_ADMIN to run this code. I added this integer overflow check last year but apparently I am rubbish at writing integer overflow checks... There are two issues. First, access_ok() works on unsigned long type and not u64 so on 32 bit systems the access_ok() could be checking a truncated size. The other issue is that we should be using a stricter limit so we don't overflow the kzalloc() setting ctx->clone_roots later in the function after the access_ok(): alloc_size = sizeof(struct clone_root) * (arg->clone_sources_count + 1); sctx->clone_roots = kzalloc(alloc_size, GFP_KERNEL \| __GFP_NOWARN); Fixes: `f5ecec3ce2` ("btrfs: send: silence an integer overflow warning") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ added comment ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:43 +01:00
Liping Zhang	0708a47681	netfilter: nfnetlink_queue: fix secctx memory leak [ Upstream commit `77c1c03c5b` ] We must call security_release_secctx to free the memory returned by security_secid_to_secctx, otherwise memory may be leaked forever. Fixes: `ef493bd930` ("netfilter: nfnetlink_queue: add security context information") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:43 +01:00
Adam Wallis	54420c1ac4	xhci: plat: Register shutdown for xhci_plat [ Upstream commit `b07c12517f` ] Shutdown should be called for xhci_plat devices especially for situations where kexec might be used by stopping DMA transactions. Signed-off-by: Adam Wallis <awallis@codeaurora.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:42 +01:00
Jonas Jensen	55b6a5d080	net: moxa: fix TX overrun memory leak [ Upstream commit `c2b341a620` ] moxart_mac_start_xmit() doesn't care where tx_tail is, tx_head can catch and pass tx_tail, which is bad because moxart_tx_finished() isn't guaranteed to catch up on freeing resources from tx_tail. Add a check in moxart_mac_start_xmit() stopping the queue at the end of the circular buffer. Also add a check in moxart_tx_finished() waking the queue if the buffer has TX_WAKE_THRESHOLD or more free descriptors. While we're at it, move spin_lock_irq() to happen before our descriptor pointer is assigned in moxart_mac_start_xmit(). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=99451 Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:42 +01:00
Arnd Bergmann	ce19146a0d	isdn: kcapi: avoid uninitialized data [ Upstream commit `af109a2cf6` ] gcc-7 points out that the AVMB1_ADDCARD ioctl results in an unintialized value ending up in the cardnr parameter: drivers/isdn/capi/kcapi.c: In function 'old_capi_manufacturer': drivers/isdn/capi/kcapi.c:1042:24: error: 'cdef.cardnr' may be used uninitialized in this function [-Werror=maybe-uninitialized] cparams.cardnr = cdef.cardnr; This has been broken since before the start of the git history, so either the value is not used for anything important, or the ioctl command doesn't get called in practice. Setting the cardnr to zero avoids the warning and makes sure we have consistent behavior. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:42 +01:00
Arnd Bergmann	bb011a4513	virtio_balloon: prevent uninitialized variable use [ Upstream commit `f0bb2d50df` ] The latest gcc-7.0.1 snapshot reports a new warning: virtio/virtio_balloon.c: In function 'update_balloon_stats': virtio/virtio_balloon.c:258:26: error: 'events[2]' is used uninitialized in this function [-Werror=uninitialized] virtio/virtio_balloon.c:260:26: error: 'events[3]' is used uninitialized in this function [-Werror=uninitialized] virtio/virtio_balloon.c:261:56: error: 'events[18]' is used uninitialized in this function [-Werror=uninitialized] virtio/virtio_balloon.c:262:56: error: 'events[17]' is used uninitialized in this function [-Werror=uninitialized] This seems absolutely right, so we should add an extra check to prevent copying uninitialized stack data into the statistics. >From all I can tell, this has been broken since the statistics code was originally added in 2.6.34. Fixes: `9564e138b1` ("virtio: Add memory statistics reporting to the balloon driver (V4)") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:42 +01:00
Ladi Prosek	c6f9090929	virtio-balloon: use actual number of stats for stats queue buffers [ Upstream commit `9646b26e85` ] The virtio balloon driver contained a not-so-obvious invariant that update_balloon_stats has to update exactly VIRTIO_BALLOON_S_NR counters in order to send valid stats to the host. This commit fixes it by having update_balloon_stats return the actual number of counters, and its callers use it when pushing buffers to the stats virtqueue. Note that it is still out of spec to change the number of counters at run-time. "Driver MUST supply the same subset of statistics in all buffers submitted to the statsq." Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:42 +01:00
Herongguang (Stephen)	808ed3bd9d	KVM: pci-assign: do not map smm memory slot pages in vt-d page tables [ Upstream commit `0292e169b2` ] or VM memory are not put thus leaked in kvm_iommu_unmap_memslots() when destroy VM. This is consistent with current vfio implementation. Signed-off-by: herongguang <herongguang.he@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:42 +01:00
Mark Rutland	29c4f517ff	net: ipconfig: fix ic_close_devs() use-after-free [ Upstream commit `ffefb6f4d6` ] Our chosen ic_dev may be anywhere in our list of ic_devs, and we may free it before attempting to close others. When we compare d->dev and ic_dev->dev, we're potentially dereferencing memory returned to the allocator. This causes KASAN to scream for each subsequent ic_dev we check. As there's a 1-1 mapping between ic_devs and netdevs, we can instead compare d and ic_dev directly, which implicitly handles the !ic_dev case, and avoids the use-after-free. The ic_dev pointer may be stale, but we will not dereference it. Original splat: [ 6.487446] ================================================================== [ 6.494693] BUG: KASAN: use-after-free in ic_close_devs+0xc4/0x154 at addr ffff800367efa708 [ 6.503013] Read of size 8 by task swapper/0/1 [ 6.507452] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc3-00002-gda42158 #8 [ 6.514993] Hardware name: AppliedMicro Mustang/Mustang, BIOS 3.05.05-beta_rc Jan 27 2016 [ 6.523138] Call trace: [ 6.525590] [<ffff200008094778>] dump_backtrace+0x0/0x570 [ 6.530976] [<ffff200008094d08>] show_stack+0x20/0x30 [ 6.536017] [<ffff200008bee928>] dump_stack+0x120/0x188 [ 6.541231] [<ffff20000856d5e4>] kasan_object_err+0x24/0xa0 [ 6.546790] [<ffff20000856d924>] kasan_report_error+0x244/0x738 [ 6.552695] [<ffff20000856dfec>] __asan_report_load8_noabort+0x54/0x80 [ 6.559204] [<ffff20000aae86ac>] ic_close_devs+0xc4/0x154 [ 6.564590] [<ffff20000aaedbac>] ip_auto_config+0x2ed4/0x2f1c [ 6.570321] [<ffff200008084b04>] do_one_initcall+0xcc/0x370 [ 6.575882] [<ffff20000aa31de8>] kernel_init_freeable+0x5f8/0x6c4 [ 6.581959] [<ffff20000a16df00>] kernel_init+0x18/0x190 [ 6.587171] [<ffff200008084710>] ret_from_fork+0x10/0x40 [ 6.592468] Object at ffff800367efa700, in cache kmalloc-128 size: 128 [ 6.598969] Allocated: [ 6.601324] PID = 1 [ 6.603427] save_stack_trace_tsk+0x0/0x418 [ 6.607603] save_stack_trace+0x20/0x30 [ 6.611430] kasan_kmalloc+0xd8/0x188 [ 6.615087] ip_auto_config+0x8c4/0x2f1c [ 6.619002] do_one_initcall+0xcc/0x370 [ 6.622832] kernel_init_freeable+0x5f8/0x6c4 [ 6.627178] kernel_init+0x18/0x190 [ 6.630660] ret_from_fork+0x10/0x40 [ 6.634223] Freed: [ 6.636233] PID = 1 [ 6.638334] save_stack_trace_tsk+0x0/0x418 [ 6.642510] save_stack_trace+0x20/0x30 [ 6.646337] kasan_slab_free+0x88/0x178 [ 6.650167] kfree+0xb8/0x478 [ 6.653131] ic_close_devs+0x130/0x154 [ 6.656875] ip_auto_config+0x2ed4/0x2f1c [ 6.660875] do_one_initcall+0xcc/0x370 [ 6.664705] kernel_init_freeable+0x5f8/0x6c4 [ 6.669051] kernel_init+0x18/0x190 [ 6.672534] ret_from_fork+0x10/0x40 [ 6.676098] Memory state around the buggy address: [ 6.680880] ffff800367efa600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 6.688078] ffff800367efa680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 6.695276] >ffff800367efa700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 6.702469] ^ [ 6.705952] ffff800367efa780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 6.713149] ffff800367efa800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 6.720343] ================================================================== [ 6.727536] Disabling lock debugging due to kernel taint Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: David S. Miller <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: James Morris <jmorris@namei.org> Cc: Patrick McHardy <kaber@trash.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:42 +01:00
Rafael J. Wysocki	e9a1ba292f	cpufreq: Fix creation of symbolic links to policy directories [ Upstream commit `2f0ba790df` ] The cpufreq core only tries to create symbolic links from CPU directories in sysfs to policy directories in cpufreq_add_dev(), either when a given CPU is registered or when the cpufreq driver is registered, whichever happens first. That is not sufficient, however, because cpufreq_add_dev() may be called for an offline CPU whose policy object has not been created yet and, quite obviously, the symbolic cannot be added in that case. Fix that by making cpufreq_online() attempt to add symbolic links to policy objects for the CPUs in the related_cpus mask of every new policy object created by it. The cpufreq_driver_lock locking around the for_each_cpu() loop in cpufreq_online() is dropped, because it is not necessary and the code is somewhat simpler without it. Moreover, failures to create a symbolic link will not be regarded as hard errors any more and the CPUs without those links will not be taken offline automatically, but that should not be problematic in practice. Reported-and-tested-by: Prashanth Prakash <pprakash@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:42 +01:00
Reizer, Eyal	e0d1315305	ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend [ Upstream commit `9bcf53f34a` ] mmc2 used for wl12xx was missing the keep-power-in suspend parameter. As a result the board couldn't reach suspend state. Signed-off-by: Eyal Reizer <eyalr@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:42 +01:00
Gao Feng	b5ed572a1b	netfilter: nf_nat_snmp: Fix panic when snmp_trap_helper fails to register [ Upstream commit `75c689dca9` ] In the commit `93557f53e1` ("netfilter: nf_conntrack: nf_conntrack snmp helper"), the snmp_helper is replaced by nf_nat_snmp_hook. So the snmp_helper is never registered. But it still tries to unregister the snmp_helper, it could cause the panic. Now remove the useless snmp_helper and the unregister call in the error handler. Fixes: `93557f53e1` ("netfilter: nf_conntrack: nf_conntrack snmp helper") Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:42 +01:00
Liping Zhang	01060acf6a	netfilter: nfnl_cthelper: fix a race when walk the nf_ct_helper_hash table [ Upstream commit `83d90219a5` ] The nf_ct_helper_hash table is protected by nf_ct_helper_mutex, while nfct_helper operation is protected by nfnl_lock(NFNL_SUBSYS_CTHELPER). So it's possible that one CPU is walking the nf_ct_helper_hash for cthelper add/get/del, another cpu is doing nf_conntrack_helpers_unregister at the same time. This is dangrous, and may cause use after free error. Note, delete operation will flush all cthelpers added via nfnetlink, so using rcu to do protect is not easy. Now introduce a dummy list to record all the cthelpers added via nfnetlink, then we can walk the dummy list instead of walking the nf_ct_helper_hash. Also, keep nfnl_cthelper_dump_table unchanged, it may be invoked without nfnl_lock(NFNL_SUBSYS_CTHELPER) held. Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:41 +01:00
Alexey Khoroshilov	9e6398184a	irda: vlsi_ir: fix check for DMA mapping errors [ Upstream commit `6ac3b77a6f` ] vlsi_alloc_ring() checks for DMA mapping errors by comparing returned address with zero, while pci_dma_mapping_error() should be used. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:41 +01:00
Sagi Grimberg	37f41dac70	RDMA/iser: Fix possible mr leak on device removal event [ Upstream commit `ea174c9573` ] When the rdma device is removed, we must cleanup all the rdma resources within the DEVICE_REMOVAL event handler to let the device teardown gracefully. When this happens with live I/O, some memory regions are occupied. Thus, track them too and dereg all the mr's. We are safe with mr access by iscsi_iser_cleanup_task. Reported-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:41 +01:00
Alexander Duyck	661f534869	i40e: Do not enable NAPI on q_vectors that have no rings [ Upstream commit `13a8cd191a` ] When testing the epoll w/ busy poll code I found that I could get into a state where the i40e driver had q_vectors w/ active NAPI that had no rings. This was resulting in a divide by zero error. To correct it I am updating the driver code so that we only support NAPI on q_vectors that have 1 or more rings allocated to them. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:41 +01:00
David Marchand	2eb783a705	IB/rxe: increment msn only when completing a request [ Upstream commit `9fcd67d177` ] According to C9-147, MSN should only be incremented when the last packet of a multi packet request has been received. "Logically, the requester associates a sequential Send Sequence Number (SSN) with each WQE posted to the send queue. The SSN bears a one- to-one relationship to the MSN returned by the responder in each re- sponse packet. Therefore, when the requester receives a response, it in- terprets the MSN as representing the SSN of the most recent request completed by the responder to determine which send WQE(s) can be completed." Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: David Marchand <david.marchand@6wind.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:41 +01:00
Dan Carpenter	2f0e39f2e3	IB/rxe: double free on error [ Upstream commit `ded2602353` ] "goto err;" has it's own kfree_skb() call so it's a double free. We only need to free on the "goto exit;" path. Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:41 +01:00
Alexander Duyck	7f077afe94	net: Do not allow negative values for busy_read and busy_poll sysctl interfaces [ Upstream commit `95f2552113` ] This change basically codifies what I think was already the limitations on the busy_poll and busy_read sysctl interfaces. We weren't checking the lower bounds and as such could input negative values. The behavior when that was used was dependent on the architecture. In order to prevent any issues with that I am just disabling support for values less than 0 since this way we don't have to worry about any odd behaviors. By limiting the sysctl values this way it also makes it consistent with how we handle the SO_BUSY_POLL socket option since the value appears to be reported as a signed integer value and negative values are rejected. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:41 +01:00
Josef Bacik	521a7e3dad	nbd: set queue timeout properly [ Upstream commit `f858685503` ] We can't just set the timeout on the tagset, we have to set it on the queue as it would have been setup already at this point. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:41 +01:00
Jason Gunthorpe	f4fcc56632	infiniband: Fix alignment of mmap cookies to support VIPT caching [ Upstream commit `cb88645596` ] When vmalloc_user is used to create memory that is supposed to be mmap'd to user space, it is necessary for the mmap cookie (eg the offset) to be aligned to SHMLBA. This creates a situation where all virtual mappings of the same physical page share the same virtual cache index and guarantees VIPT coherence. Otherwise the cache is non-coherent and the kernel will not see writes by userspace when reading the shared page (or vice-versa). Reported-by: Josh Beavers <josh.beavers@gmail.com> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:41 +01:00
Sagi Grimberg	cd083d5bca	IB/core: Protect against self-requeue of a cq work item [ Upstream commit `86f46aba8d` ] We need to make sure that the cq work item does not run when we are destroying the cq. Unlike flush_work, cancel_work_sync protects against self-requeue of the work item (which we can do in ib_cq_poll_work). Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:41 +01:00
Shiraz Saleem	26452a5033	i40iw: Receive netdev events post INET_NOTIFIER state [ Upstream commit `871a8623d3` ] Netdev notification events are de-registered only when all client iwdev instances are removed. If a single client is closed and re-opened, netdev events could arrive even before the Control Queue-Pair (CQP) is created, causing a NULL pointer dereference crash in i40iw_get_cqp_request. Fix this by allowing netdev event notification only after we have reached the INET_NOTIFIER state with respect to device initialization. Reported-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:40 +01:00
Arnd Bergmann	102a8a1634	bna: avoid writing uninitialized data into hw registers [ Upstream commit `a5af839253` ] The latest gcc-7 snapshot warns about bfa_ioc_send_enable/bfa_ioc_send_disable writing undefined values into the hardware registers: drivers/net/ethernet/brocade/bna/bfa_ioc.c: In function 'bfa_iocpf_sm_disabling_entry': arch/arm/include/asm/io.h:109:22: error: '((void )&disable_req+4)' is used uninitialized in this function [-Werror=uninitialized] arch/arm/include/asm/io.h:109:22: error: '((void )&disable_req+8)' is used uninitialized in this function [-Werror=uninitialized] The two functions look like they should do the same thing, but only one of them initializes the time stamp and clscode field. The fact that we only get a warning for one of the two functions seems to be arbitrary, based on the inlining decisions in the compiler. To address this, I'm making both functions do the same thing: - set the clscode from the ioc structure in both - set the time stamp from ktime_get_real_seconds (which also avoids the signed-integer overflow in 2038 and extends the well-defined behavior until 2106). - zero-fill the reserved field Fixes: `8b230ed8ec` ("bna: Brocade 10Gb Ethernet device driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:40 +01:00
Julian Wiedmann	51533c4bf1	s390/qeth: no ETH header for outbound AF_IUCV [ Upstream commit `acd9776b5c` ] With AF_IUCV traffic, the skb passed to hard_start_xmit() has a 14 byte slot at skb->data, intended for an ETH header. qeth_l3_fill_af_iucv_hdr() fills this ETH header... and then immediately moves it to the skb's headroom, where it disappears and is never seen again. But it's still possible for us to return NETDEV_TX_BUSY after the skb has been modified. Since we didn't get a private copy of the skb, the next time the skb is delivered to hard_start_xmit() it no longer has the expected layout (we moved the ETH header to the headroom, so skb->data now starts at the IUCV_TRANS header). So when qeth_l3_fill_af_iucv_hdr() does another round of rebuilding, the resulting qeth header ends up all wrong. On transmission, the buffer is then rejected by the HiperSockets device with SBALF15 = x'04'. When this error is passed back to af_iucv as TX_NOTIFY_UNREACHABLE, it tears down the offending socket. As the ETH header for AF_IUCV serves no purpose, just align the code to what we do for IP traffic on L3 HiperSockets: keep the ETH header at skb->data, and pass down data_offset = ETH_HLEN to qeth_fill_buffer(). When mapping the payload into the SBAL elements, the ETH header is then stripped off. This avoids the skb manipulations in qeth_l3_fill_af_iucv_hdr(), and any buffer re-entering hard_start_xmit() after NETDEV_TX_BUSY is now processed properly. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:40 +01:00
Julian Wiedmann	118b0404d6	s390/qeth: size calculation outbound buffers [ Upstream commit `7d969d2e88` ] Depending on the device type, hard_start_xmit() builds different output buffer formats. For instance with HiperSockets, on both L2 and L3 we strip the ETH header from the skb - L3 doesn't need it, and L2 carries it in the buffer's header element. For this, we pass data_offset = ETH_HLEN all the way down to __qeth_fill_buffer(), where skb->data is then adjusted accordingly. But the initial size calculation still considers the full skb length (including the ETH header). So qeth_get_elements_no() can erroneously reject a skb as too big, even though it would actually fit into an output buffer once the ETH header has been trimmed off later. Fix this by passing an additional offset to qeth_get_elements_no(), that indicates where in the skb the on-wire data actually begins. Since the current code uses data_offset=-1 for some special handling on OSA, we need to clamp data_offset to 0... On HiperSockets this helps when sending ~MTU-size skbs with weird page alignment. No change for OSA or AF_IUCV. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:40 +01:00
hayeswang	60d5982304	r8152: prevent the driver from transmitting packets with carrier off [ Upstream commit `2f25abe6ba` ] The linking status may be changed when autosuspend. And, after autoresume, the driver may try to transmit packets when the device is carrier off, because the interrupt transfer doesn't update the linking status, yet. And, if the device is in ALDPS mode, the device would stop working. The another similar case is 1. unplug the cable. 2. interrupt transfer queue a work_queue for linking change. 3. device enters the ALDPS mode. 4. a tx occurs before the work_queue is called. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:40 +01:00
Arnaud Pouliquen	b89e229112	ASoC: STI: Fix reader substream pointer set [ Upstream commit `3c9d3f1bc2` ] reader->substream is used in IRQ handler for error case but is never set. Set value to pcm substream on DAI startup and clean it on dai shutdown. Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@st.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:40 +01:00
Peter Stein	347848e0bb	HID: xinmo: fix for out of range for THT 2P arcade controller. [ Upstream commit `9257821c5a` ] There is a new clone of the XIN MO arcade controller which has same issue with out of range like the original. This fix will solve the issue where 2 directions on the joystick are not recognized by the new THT 2P arcade controller with device ID 0x75e1. In details the new device ID is added the hid-id list and the hid-xinmo source code. Signed-off-by: Peter Stein <peter@stuntstein.dk> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:40 +01:00
Arnd Bergmann	afa055f2a1	hwmon: (asus_atk0110) fix uninitialized data access [ Upstream commit `a2125d0244` ] The latest gcc-7 snapshot adds a warning to point out that when atk_read_value_old or atk_read_value_new fails, we copy uninitialized data into sensor->cached_value: drivers/hwmon/asus_atk0110.c: In function 'atk_input_show': drivers/hwmon/asus_atk0110.c:651:26: error: 'value' may be used uninitialized in this function [-Werror=maybe-uninitialized] Adding an error check avoids this. All versions of the driver are affected. Fixes: `2c03d07ad5` ("hwmon: Add Asus ATK0110 support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Luca Tettamanti <kronos.it@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:40 +01:00
Rob Herring	5700ffc4ac	ARM: dts: ti: fix PCI bus dtc warnings [ Upstream commit `7d79f6098d` ] dtc recently added PCI bus checks. Fix these warnings. Signed-off-by: Rob Herring <robh@kernel.org> Cc: "Benoît Cousson" <bcousson@baylibre.com> Cc: Tony Lindgren <tony@atomide.com> Cc: linux-omap@vger.kernel.org Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:40 +01:00
Wanpeng Li	2df19698db	KVM: VMX: Fix enable VPID conditions [ Upstream commit `08d839c4b1` ] This can be reproduced by running L2 on L1, and disable VPID on L0 if w/o commit "KVM: nVMX: Fix nested VPID vmx exec control", the L2 crash as below: KVM: entry failed, hardware error 0x7 EAX=00000000 EBX=00000000 ECX=00000000 EDX=000306c3 ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000 EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0000 00000000 0000ffff 00009300 CS =f000 ffff0000 0000ffff 00009b00 SS =0000 00000000 0000ffff 00009300 DS =0000 00000000 0000ffff 00009300 FS =0000 00000000 0000ffff 00009300 GS =0000 00000000 0000ffff 00009300 LDT=0000 00000000 0000ffff 00008200 TR =0000 00000000 0000ffff 00008b00 GDT= 00000000 0000ffff IDT= 00000000 0000ffff CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400 EFER=0000000000000000 Reference SDM 30.3 INVVPID: Protected Mode Exceptions - #UD - If not in VMX operation. - If the logical processor does not support VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=0). - If the logical processor supports VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=1) but does not support the INVVPID instruction (IA32_VMX_EPT_VPID_CAP[32]=0). So we should check both VPID enable bit in vmx exec control and INVVPID support bit in vmx capability MSRs to enable VPID. This patch adds the guarantee to not enable VPID if either INVVPID or single-context/all-context invalidation is not exposed in vmx capability MSRs. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:39 +01:00
Wanpeng Li	e0249c0234	KVM: x86: correct async page present tracepoint [ Upstream commit `24dccf83a1` ] After async pf setup successfully, there is a broadcast wakeup w/ special token 0xffffffff which tells vCPU that it should wake up all processes waiting for APFs though there is no real process waiting at the moment. The async page present tracepoint print prematurely and fails to catch the special token setup. This patch fixes it by moving the async page present tracepoint after the special token setup. Before patch: qemu-system-x86-8499 [006] ...1 5973.473292: kvm_async_pf_ready: token 0x0 gva 0x0 After patch: qemu-system-x86-8499 [006] ...1 5973.473292: kvm_async_pf_ready: token 0xffffffff gva 0x0 Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:39 +01:00
Jim Mattson	8386ff5203	kvm: vmx: Flush TLB when the APIC-access address changes [ Upstream commit `fb6c819843` ] Quoting from the Intel SDM, volume 3, section 28.3.3.4: Guidelines for Use of the INVEPT Instruction: If EPT was in use on a logical processor at one time with EPTP X, it is recommended that software use the INVEPT instruction with the "single-context" INVEPT type and with EPTP X in the INVEPT descriptor before a VM entry on the same logical processor that enables EPT with EPTP X and either (a) the "virtualize APIC accesses" VM-execution control was changed from 0 to 1; or (b) the value of the APIC-access address was changed. In the nested case, the burden falls on L1, unless L0 enables EPT in vmcs02 when L1 doesn't enable EPT in vmcs12. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:39 +01:00
Dick Kennedy	3bd2017b6a	scsi: lpfc: Fix PT2PT PRLI reject [ Upstream commit `a71e3cdcfc` ] lpfc cannot establish connection with targets that send PRLI in P2P configurations. If lpfc rejects a PRLI that is sent from a target the target will not resend and will reject the PRLI send from the initiator. [mkp: applied by hand] Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:39 +01:00
Patrice Chotard	0f4aa1f0f5	pinctrl: st: add irq_request/release_resources callbacks [ Upstream commit `e855fa9a65` ] When using GPIO as IRQ source, the GPIO must be configured in INPUT. Callbacks dedicated for this was missing in pinctrl-st driver. This fix the following kernel error when trying to lock a gpio as IRQ: [ 7.521095] gpio gpiochip7: (PIO11): gpiochip_lock_as_irq: tried to flag a GPIO set as output for IRQ [ 7.526018] gpio gpiochip7: (PIO11): unable to lock HW IRQ 6 for IRQ [ 7.529405] genirq: Failed to request resources for 0-0053 (irq 81) on irqchip GPIO Signed-off-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:39 +01:00
Eric Dumazet	7656871eff	inet: frag: release spinlock before calling icmp_send() [ Upstream commit `ec4fbd6475` ] Dmitry reported a lockdep splat [1] (false positive) that we can fix by releasing the spinlock before calling icmp_send() from ip_expire() This is a false positive because sending an ICMP message can not possibly re-enter the IP frag engine. [1] [ INFO: possible circular locking dependency detected ] 4.10.0+ #29 Not tainted ------------------------------------------------------- modprobe/12392 is trying to acquire lock: (_xmit_ETHER#2){+.-...}, at: [<ffffffff837a8182>] spin_lock include/linux/spinlock.h:299 [inline] (_xmit_ETHER#2){+.-...}, at: [<ffffffff837a8182>] __netif_tx_lock include/linux/netdevice.h:3486 [inline] (_xmit_ETHER#2){+.-...}, at: [<ffffffff837a8182>] sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180 but task is already holding lock: (&(&q->lock)->rlock){+.-...}, at: [<ffffffff8389a4d1>] spin_lock include/linux/spinlock.h:299 [inline] (&(&q->lock)->rlock){+.-...}, at: [<ffffffff8389a4d1>] ip_expire+0x51/0x6c0 net/ipv4/ip_fragment.c:201 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&(&q->lock)->rlock){+.-...}: validate_chain kernel/locking/lockdep.c:2267 [inline] __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:299 [inline] ip_defrag+0x3a2/0x4130 net/ipv4/ip_fragment.c:669 ip_check_defrag+0x4e3/0x8b0 net/ipv4/ip_fragment.c:713 packet_rcv_fanout+0x282/0x800 net/packet/af_packet.c:1459 deliver_skb net/core/dev.c:1834 [inline] dev_queue_xmit_nit+0x294/0xa90 net/core/dev.c:1890 xmit_one net/core/dev.c:2903 [inline] dev_hard_start_xmit+0x16b/0xab0 net/core/dev.c:2923 sch_direct_xmit+0x31f/0x6d0 net/sched/sch_generic.c:182 __dev_xmit_skb net/core/dev.c:3092 [inline] __dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358 dev_queue_xmit+0x17/0x20 net/core/dev.c:3423 neigh_resolve_output+0x6b9/0xb10 net/core/neighbour.c:1308 neigh_output include/net/neighbour.h:478 [inline] ip_finish_output2+0x8b8/0x15a0 net/ipv4/ip_output.c:228 ip_do_fragment+0x1d93/0x2720 net/ipv4/ip_output.c:672 ip_fragment.constprop.54+0x145/0x200 net/ipv4/ip_output.c:545 ip_finish_output+0x82d/0xe10 net/ipv4/ip_output.c:314 NF_HOOK_COND include/linux/netfilter.h:246 [inline] ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404 dst_output include/net/dst.h:486 [inline] ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124 ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492 ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512 raw_sendmsg+0x26de/0x3a00 net/ipv4/raw.c:655 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 ___sys_sendmsg+0x4a3/0x9f0 net/socket.c:1985 __sys_sendmmsg+0x25c/0x750 net/socket.c:2075 SYSC_sendmmsg net/socket.c:2106 [inline] SyS_sendmmsg+0x35/0x60 net/socket.c:2101 do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281 return_from_SYSCALL_64+0x0/0x7a -> #0 (_xmit_ETHER#2){+.-...}: check_prev_add kernel/locking/lockdep.c:1830 [inline] check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1940 validate_chain kernel/locking/lockdep.c:2267 [inline] __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:299 [inline] __netif_tx_lock include/linux/netdevice.h:3486 [inline] sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180 __dev_xmit_skb net/core/dev.c:3092 [inline] __dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358 dev_queue_xmit+0x17/0x20 net/core/dev.c:3423 neigh_hh_output include/net/neighbour.h:468 [inline] neigh_output include/net/neighbour.h:476 [inline] ip_finish_output2+0xf6c/0x15a0 net/ipv4/ip_output.c:228 ip_finish_output+0xa29/0xe10 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:246 [inline] ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404 dst_output include/net/dst.h:486 [inline] ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124 ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492 ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512 icmp_push_reply+0x372/0x4d0 net/ipv4/icmp.c:394 icmp_send+0x156c/0x1c80 net/ipv4/icmp.c:754 ip_expire+0x40e/0x6c0 net/ipv4/ip_fragment.c:239 call_timer_fn+0x241/0x820 kernel/time/timer.c:1268 expire_timers kernel/time/timer.c:1307 [inline] __run_timers+0x960/0xcf0 kernel/time/timer.c:1601 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614 __do_softirq+0x31f/0xbe7 kernel/softirq.c:284 invoke_softirq kernel/softirq.c:364 [inline] irq_exit+0x1cc/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:657 [inline] smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:707 __read_once_size include/linux/compiler.h:254 [inline] atomic_read arch/x86/include/asm/atomic.h:26 [inline] rcu_dynticks_curr_cpu_in_eqs kernel/rcu/tree.c:350 [inline] __rcu_is_watching kernel/rcu/tree.c:1133 [inline] rcu_is_watching+0x83/0x110 kernel/rcu/tree.c:1147 rcu_read_lock_held+0x87/0xc0 kernel/rcu/update.c:293 radix_tree_deref_slot include/linux/radix-tree.h:238 [inline] filemap_map_pages+0x6d4/0x1570 mm/filemap.c:2335 do_fault_around mm/memory.c:3231 [inline] do_read_fault mm/memory.c:3265 [inline] do_fault+0xbd5/0x2080 mm/memory.c:3370 handle_pte_fault mm/memory.c:3600 [inline] __handle_mm_fault+0x1062/0x2cb0 mm/memory.c:3714 handle_mm_fault+0x1e2/0x480 mm/memory.c:3751 __do_page_fault+0x4f6/0xb60 arch/x86/mm/fault.c:1397 do_page_fault+0x54/0x70 arch/x86/mm/fault.c:1460 page_fault+0x28/0x30 arch/x86/entry/entry_64.S:1011 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&q->lock)->rlock); lock(_xmit_ETHER#2); lock(&(&q->lock)->rlock); lock(_xmit_ETHER#2); * DEADLOCK * 10 locks held by modprobe/12392: #0: (&mm->mmap_sem){++++++}, at: [<ffffffff81329758>] __do_page_fault+0x2b8/0xb60 arch/x86/mm/fault.c:1336 #1: (rcu_read_lock){......}, at: [<ffffffff8188cab6>] filemap_map_pages+0x1e6/0x1570 mm/filemap.c:2324 #2: (&(ptlock_ptr(page))->rlock#2){+.+...}, at: [<ffffffff81984a78>] spin_lock include/linux/spinlock.h:299 [inline] #2: (&(ptlock_ptr(page))->rlock#2){+.+...}, at: [<ffffffff81984a78>] pte_alloc_one_map mm/memory.c:2944 [inline] #2: (&(ptlock_ptr(page))->rlock#2){+.+...}, at: [<ffffffff81984a78>] alloc_set_pte+0x13b8/0x1b90 mm/memory.c:3072 #3: (((&q->timer))){+.-...}, at: [<ffffffff81627e72>] lockdep_copy_map include/linux/lockdep.h:175 [inline] #3: (((&q->timer))){+.-...}, at: [<ffffffff81627e72>] call_timer_fn+0x1c2/0x820 kernel/time/timer.c:1258 #4: (&(&q->lock)->rlock){+.-...}, at: [<ffffffff8389a4d1>] spin_lock include/linux/spinlock.h:299 [inline] #4: (&(&q->lock)->rlock){+.-...}, at: [<ffffffff8389a4d1>] ip_expire+0x51/0x6c0 net/ipv4/ip_fragment.c:201 #5: (rcu_read_lock){......}, at: [<ffffffff8389a633>] ip_expire+0x1b3/0x6c0 net/ipv4/ip_fragment.c:216 #6: (slock-AF_INET){+.-...}, at: [<ffffffff839b3313>] spin_trylock include/linux/spinlock.h:309 [inline] #6: (slock-AF_INET){+.-...}, at: [<ffffffff839b3313>] icmp_xmit_lock net/ipv4/icmp.c:219 [inline] #6: (slock-AF_INET){+.-...}, at: [<ffffffff839b3313>] icmp_send+0x803/0x1c80 net/ipv4/icmp.c:681 #7: (rcu_read_lock_bh){......}, at: [<ffffffff838ab9a1>] ip_finish_output2+0x2c1/0x15a0 net/ipv4/ip_output.c:198 #8: (rcu_read_lock_bh){......}, at: [<ffffffff836d1dee>] __dev_queue_xmit+0x23e/0x1e60 net/core/dev.c:3324 #9: (dev->qdisc_running_key ?: &qdisc_running_key){+.....}, at: [<ffffffff836d3a27>] dev_queue_xmit+0x17/0x20 net/core/dev.c:3423 stack backtrace: CPU: 0 PID: 12392 Comm: modprobe Not tainted 4.10.0+ #29 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:52 print_circular_bug+0x307/0x3b0 kernel/locking/lockdep.c:1204 check_prev_add kernel/locking/lockdep.c:1830 [inline] check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1940 validate_chain kernel/locking/lockdep.c:2267 [inline] __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:299 [inline] __netif_tx_lock include/linux/netdevice.h:3486 [inline] sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180 __dev_xmit_skb net/core/dev.c:3092 [inline] __dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358 dev_queue_xmit+0x17/0x20 net/core/dev.c:3423 neigh_hh_output include/net/neighbour.h:468 [inline] neigh_output include/net/neighbour.h:476 [inline] ip_finish_output2+0xf6c/0x15a0 net/ipv4/ip_output.c:228 ip_finish_output+0xa29/0xe10 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:246 [inline] ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404 dst_output include/net/dst.h:486 [inline] ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124 ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492 ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512 icmp_push_reply+0x372/0x4d0 net/ipv4/icmp.c:394 icmp_send+0x156c/0x1c80 net/ipv4/icmp.c:754 ip_expire+0x40e/0x6c0 net/ipv4/ip_fragment.c:239 call_timer_fn+0x241/0x820 kernel/time/timer.c:1268 expire_timers kernel/time/timer.c:1307 [inline] __run_timers+0x960/0xcf0 kernel/time/timer.c:1601 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614 __do_softirq+0x31f/0xbe7 kernel/softirq.c:284 invoke_softirq kernel/softirq.c:364 [inline] irq_exit+0x1cc/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:657 [inline] smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:707 RIP: 0010:__read_once_size include/linux/compiler.h:254 [inline] RIP: 0010:atomic_read arch/x86/include/asm/atomic.h:26 [inline] RIP: 0010:rcu_dynticks_curr_cpu_in_eqs kernel/rcu/tree.c:350 [inline] RIP: 0010:__rcu_is_watching kernel/rcu/tree.c:1133 [inline] RIP: 0010:rcu_is_watching+0x83/0x110 kernel/rcu/tree.c:1147 RSP: 0000:ffff8801c391f120 EFLAGS: 00000a03 ORIG_RAX: ffffffffffffff10 RAX: dffffc0000000000 RBX: ffff8801c391f148 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 000055edd4374000 RDI: ffff8801dbe1ae0c RBP: ffff8801c391f1a0 R08: 0000000000000002 R09: 0000000000000000 R10: dffffc0000000000 R11: 0000000000000002 R12: 1ffff10038723e25 R13: ffff8801dbe1ae00 R14: ffff8801c391f680 R15: dffffc0000000000 </IRQ> rcu_read_lock_held+0x87/0xc0 kernel/rcu/update.c:293 radix_tree_deref_slot include/linux/radix-tree.h:238 [inline] filemap_map_pages+0x6d4/0x1570 mm/filemap.c:2335 do_fault_around mm/memory.c:3231 [inline] do_read_fault mm/memory.c:3265 [inline] do_fault+0xbd5/0x2080 mm/memory.c:3370 handle_pte_fault mm/memory.c:3600 [inline] __handle_mm_fault+0x1062/0x2cb0 mm/memory.c:3714 handle_mm_fault+0x1e2/0x480 mm/memory.c:3751 __do_page_fault+0x4f6/0xb60 arch/x86/mm/fault.c:1397 do_page_fault+0x54/0x70 arch/x86/mm/fault.c:1460 page_fault+0x28/0x30 arch/x86/entry/entry_64.S:1011 RIP: 0033:0x7f83172f2786 RSP: 002b:00007fffe859ae80 EFLAGS: 00010293 RAX: 000055edd4373040 RBX: 00007f83175111c8 RCX: 000055edd4373238 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007f8317510970 RBP: 00007fffe859afd0 R08: 0000000000000009 R09: 0000000000000000 R10: 0000000000000064 R11: 0000000000000000 R12: 000055edd4373040 R13: 0000000000000000 R14: 00007fffe859afe8 R15: 0000000000000000 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:39 +01:00
Ying Xue	e6e8067ec3	tipc: fix nametbl deadlock at tipc_nametbl_unsubscribe [ Upstream commit `557d054c01` ] Until now, tipc_nametbl_unsubscribe() is called at subscriptions reference count cleanup. Usually the subscriptions cleanup is called at subscription timeout or at subscription cancel or at subscriber delete. We have ignored the possibility of this being called from other locations, which causes deadlock as we try to grab the tn->nametbl_lock while holding it already. CPU1: CPU2: ---------- ---------------- tipc_nametbl_publish spin_lock_bh(&tn->nametbl_lock) tipc_nametbl_insert_publ tipc_nameseq_insert_publ tipc_subscrp_report_overlap tipc_subscrp_get tipc_subscrp_send_event tipc_close_conn tipc_subscrb_release_cb tipc_subscrb_delete tipc_subscrp_put tipc_subscrp_put tipc_subscrp_kref_release tipc_nametbl_unsubscribe spin_lock_bh(&tn->nametbl_lock) <<grab nametbl_lock again>> CPU1: CPU2: ---------- ---------------- tipc_nametbl_stop spin_lock_bh(&tn->nametbl_lock) tipc_purge_publications tipc_nameseq_remove_publ tipc_subscrp_report_overlap tipc_subscrp_get tipc_subscrp_send_event tipc_close_conn tipc_subscrb_release_cb tipc_subscrb_delete tipc_subscrp_put tipc_subscrp_put tipc_subscrp_kref_release tipc_nametbl_unsubscribe spin_lock_bh(&tn->nametbl_lock) <<grab nametbl_lock again>> In this commit, we advance the calling of tipc_nametbl_unsubscribe() from the refcount cleanup to the intended callers. Fixes: `d094c4d5f5` ("tipc: add subscription refcount to avoid invalid delete") Reported-by: John Thompson <thompa.atl@gmail.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:39 +01:00
hayeswang	bfb38fbd86	r8152: fix the rx early size of RTL8153 [ Upstream commit `b20cb60e2b` ] revert commit `a59e6d8152` ("r8152: correct the rx early size") and fix the rx early size as (rx buffer size - rx packet size - rx desc size - alignment) / 4 Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:39 +01:00
Marek Szyprowski	7171aa2680	iommu/exynos: Workaround FLPD cache flush issues for SYSMMU v5 [ Upstream commit `cd37a296a9` ] For some unknown reasons, in some cases, FLPD cache invalidation doesn't work properly with SYSMMU v5 controllers found in Exynos5433 SoCs. This can be observed by a firmware crash during initialization phase of MFC video decoder available in the mentioned SoCs when IOMMU support is enabled. To workaround this issue perform a full TLB/FLPD invalidation in case of replacing any first level page descriptors in case of SYSMMU v5. Fixes: `740a01eee9` ("iommu/exynos: Add support for v5 SYSMMU") CC: stable@vger.kernel.org # v4.10+ Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:39 +01:00
Jeffy Chen	0f0ac21805	netfilter: nfnl_cthelper: Fix memory leak [ Upstream commit `f83bf8da11` ] We have memory leaks of nf_conntrack_helper & expect_policy. Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:38 +01:00
Pablo Neira Ayuso	ec38fb443a	netfilter: nfnl_cthelper: fix runtime expectation policy updates [ Upstream commit `2c42225755` ] We only allow runtime updates of expectation policies for timeout and maximum number of expectations, otherwise reject the update. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:38 +01:00
Gustavo A. R. Silva	02197d86c5	usb: gadget: udc: remove pointer dereference after free [ Upstream commit `1f459262b0` ] Remove pointer dereference after free. Addresses-Coverity-ID: 1091173 Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:38 +01:00
Roger Quadros	2b943bed33	usb: gadget: f_uvc: Sanity check wMaxPacketSize for SuperSpeed [ Upstream commit `16bb05d98c` ] As per USB3.0 Specification "Table 9-20. Standard Endpoint Descriptor", for interrupt and isochronous endpoints, wMaxPacketSize must be set to 1024 if the endpoint defines bMaxBurst to be greater than zero. Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:38 +01:00
Alex Hemme	2101ccbc2a	hwmon: (max31790) Set correct PWM value [ Upstream commit `dd7406dd33` ] Traced fans not spinning to incorrect PWM value being written. The passed in value was written instead of the calulated value. Fixes: `54187ff9d7` ("hwmon: (max31790) Convert to use new hwmon registration API") Signed-off-by: Alex Hemme <ahemme@cisco.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:38 +01:00
Tony Lindgren	4ee082a727	net: qmi_wwan: Add USB IDs for MDM6600 modem on Motorola Droid 4 [ Upstream commit `4071898bf0` ] This gets qmicli working with the MDM6600 modem. Cc: Bjørn Mork <bjorn@mork.no> Reviewed-by: Sebastian Reichel <sre@kernel.org> Tested-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:38 +01:00
Xin Long	9ed8f0faba	sctp: out_qlen should be updated when pruning unsent queue [ Upstream commit `23bb09cfbe` ] This patch is to fix the issue that sctp_prsctp_prune_sent forgot to update q->out_qlen when removing a chunk from unsent queue. Fixes: `8dbdf1f5b0` ("sctp: implement prsctp PRIO policy") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:38 +01:00
Dan Carpenter	b4cf187a1b	bna: integer overflow bug in debugfs [ Upstream commit `13e2d5187f` ] We could allocate less memory than intended because we do: bnad->regdata = kzalloc(len << 2, GFP_KERNEL); The shift can overflow leading to a crash. This is debugfs code so the impact is very small. Fixes: `7afc5dbde0` ("bna: Add debugfs interface.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Rasesh Mody <rasesh.mody@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:38 +01:00
Eric Dumazet	b3f662ccd3	sch_dsmark: fix invalid skb_cow() usage [ Upstream commit `aea92fb2e0` ] skb_cow(skb, sizeof(ip header)) is not very helpful in this context. First we need to use pskb_may_pull() to make sure the ip header is in skb linear part, then use skb_try_make_writable() to address clones issues. Fixes: `4c30719f4f` ("[PKT_SCHED] dsmark: handle cloned and non-linear skb's") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:38 +01:00
Peng Tao	98d20e5902	vsock: cancel packets when failing to connect [ Upstream commit `380feae0de` ] Otherwise we'll leave the packets queued until releasing vsock device. E.g., if guest is slow to start up, resulting ETIMEDOUT on connect, guest will get the connect requests from failed host sockets. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:38 +01:00
Peng Tao	482b3f92ae	vhost-vsock: add pkt cancel capability [ Upstream commit `16320f363a` ] To allow canceling all packets of a connection. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:37 +01:00
Peng Tao	6f1848e778	vsock: track pkt owner vsock [ Upstream commit `36d277bac8` ] So that we can cancel a queued pkt later if necessary. Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:37 +01:00
Herbert Xu	7ff28d3307	crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex [ Upstream commit `8a0f5ccfb3` ] On Tue, Mar 14, 2017 at 10:44:10AM +0100, Dmitry Vyukov wrote: > > Yes, please. > Disregarding some reports is not a good way long term. Please try this patch. ---8<--- Subject: netlink: Annotate nlk cb_mutex by protocol Currently all occurences of nlk->cb_mutex are annotated by lockdep as a single class. This causes a false lcokdep cycle involving genl and crypto_user. This patch fixes it by dividing cb_mutex into individual classes based on the netlink protocol. As genl and crypto_user do not use the same netlink protocol this breaks the false dependency loop. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:37 +01:00
hayeswang	ddfc9f7599	r8152: fix the list rx_done may be used without initialization [ Upstream commit `98d068ab52` ] The list rx_done would be initialized when the linking on occurs. Therefore, if a napi is scheduled without any linking on before, the following kernel panic would happen. BUG: unable to handle kernel NULL pointer dereference at 000000000000008 IP: [<ffffffffc085efde>] r8152_poll+0xe1e/0x1210 [r8152] PGD 0 Oops: 0002 [#1] SMP Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:37 +01:00
Vaidyanathan Srinivasan	9712b2b73d	cpuidle: Validate cpu_dev in cpuidle_add_sysfs() [ Upstream commit `ad0a45fd9c` ] If a given cpu is not in cpu_present and cpu hotplug is disabled, arch can skip setting up the cpu_dev. Arch cpuidle driver should pass correct cpu mask for registration, but failing to do so by the driver causes error to propagate and crash like this: [ 30.076045] Unable to handle kernel paging request for data at address 0x00000048 [ 30.076100] Faulting instruction address: 0xc0000000007b2f30 cpu 0x4d: Vector: 300 (Data Access) at [c000003feb18b670] pc: c0000000007b2f30: kobject_get+0x20/0x70 lr: c0000000007b3c94: kobject_add_internal+0x54/0x3f0 sp: c000003feb18b8f0 msr: 9000000000009033 dar: 48 dsisr: 40000000 current = 0xc000003fd2ed8300 paca = 0xc00000000fbab500 softe: 0 irq_happened: 0x01 pid = 1, comm = swapper/0 Linux version 4.11.0-rc2-svaidy+ (sv@sagarika) (gcc version 6.2.0 20161005 (Ubuntu 6.2.0-5ubuntu12) ) #10 SMP Sun Mar 19 00:08:09 IST 2017 enter ? for help [c000003feb18b960] c0000000007b3c94 kobject_add_internal+0x54/0x3f0 [c000003feb18b9f0] c0000000007b43a4 kobject_init_and_add+0x64/0xa0 [c000003feb18ba70] c000000000e284f4 cpuidle_add_sysfs+0xb4/0x130 [c000003feb18baf0] c000000000e26038 cpuidle_register_device+0x118/0x1c0 [c000003feb18bb30] c000000000e26c48 cpuidle_register+0x78/0x120 [c000003feb18bbc0] c00000000168fd9c powernv_processor_idle_init+0x110/0x1c4 [c000003feb18bc40] c00000000000cff8 do_one_initcall+0x68/0x1d0 [c000003feb18bd00] c0000000016242f4 kernel_init_freeable+0x280/0x360 [c000003feb18bdc0] c00000000000d864 kernel_init+0x24/0x160 [c000003feb18be30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74 Validating cpu_dev fixes the crash and reports correct error message like: [ 30.163506] Failed to register cpuidle device for cpu136 [ 30.173329] Registration of powernv driver failed. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> [ rjw: Comment massage ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:37 +01:00
Sagi Grimberg	8f21b63c9d	nvme-loop: handle cpu unplug when re-establishing the controller [ Upstream commit `945dd5bacc` ] If a cpu unplug event has occured, we need to take the minimum of the provided nr_io_queues and the number of online cpus, otherwise we won't be able to connect them as blk-mq mapping won't dispatch to those queues. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:37 +01:00
Jon Medhurst	c9bbd2727d	arm: kprobes: Align stack to 8-bytes in test code [ Upstream commit `974310d047` ] kprobes test cases need to have a stack that is aligned to an 8-byte boundary because they call other functions (and the ARM ABI mandates that alignment) and because test cases include 64-bit accesses to the stack. Unfortunately, GCC doesn't ensure this alignment for inline assembler and for the code in question seems to always misalign it by pushing just the LR register onto the stack. We therefore need to explicitly perform stack alignment at the start of each test case. Without this fix, some test cases will generate alignment faults on systems where alignment is enforced. Even if the kernel is configured to handle these faults in software, triggering them is ugly. It also exposes limitations in the fault handling code which doesn't cope with writes to the stack. E.g. when handling this instruction strd r6, [sp, #-64]! the fault handling code will write to a stack location below the SP value at the point the fault occurred, which coincides with where the exception handler has pushed the saved register context. This results in corruption of those registers. Signed-off-by: Jon Medhurst <tixy@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:37 +01:00
Masami Hiramatsu	d0ee8d5b86	arm: kprobes: Fix the return address of multiple kretprobes [ Upstream commit `06553175f5` ] This is arm port of commit `737480a0d5` ("kprobes/x86: Fix the return address of multiple kretprobes"). Fix the return address of subsequent kretprobes when multiple kretprobes are set on the same function. For example: # cd /sys/kernel/debug/tracing # echo "r:event1 sys_symlink" > kprobe_events # echo "r:event2 sys_symlink" >> kprobe_events # echo 1 > events/kprobes/enable # ln -s /tmp/foo /tmp/bar (without this patch) # cat trace \| grep -v ^# ln-82 [000] dn.2 68.446525: event1: (kretprobe_trampoline+0x0/0x18 <- SyS_symlink) ln-82 [000] dn.2 68.447831: event2: (ret_fast_syscall+0x0/0x1c <- SyS_symlink) (with this patch) # cat trace \| grep -v ^# ln-81 [000] dn.1 39.463469: event1: (ret_fast_syscall+0x0/0x1c <- SyS_symlink) ln-81 [000] dn.1 39.464701: event2: (ret_fast_syscall+0x0/0x1c <- SyS_symlink) Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: KUMANO Syuhei <kumano.prog@gmail.com> Signed-off-by: Jon Medhurst <tixy@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:37 +01:00
Oscar Campos	6e2a6941fa	HID: corsair: Add driver Scimitar Pro RGB gaming mouse 1b1c:1b3e support to hid-corsair [ Upstream commit `01adc47e88` ] This mouse sold by Corsair as Scimitar PRO RGB defines two consecutive Logical Minimum items in its Application (Consumer.0001) report making it non parseable. This patch fixes the report descriptor overriding byte 77 in rdesc from 0x16 (Logical Minimum with 16 bits value) to 0x26 (Logical Maximum with 16 bits value). Signed-off-by: Oscar Campos <oscar.campos@member.fsf.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:37 +01:00
Oscar Campos	e93ea3a50c	HID: corsair: support for K65-K70 Rapidfire and Scimitar Pro RGB [ Upstream commit `deaba63699` ] Add quirks for several corsair gaming devices to avoid long delays on report initialization Supported devices: - Corsair K65RGB Rapidfire Gaming Keyboard - Corsair K70RGB Rapidfire Gaming Keyboard - Corsair Scimitar Pro RGB Gaming Mouse Signed-off-by: Oscar Campos <oscar.campos@member.fsf.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:36 +01:00
Dmitry Vyukov	2a7eee3d72	kvm: fix usage of uninit spinlock in avic_vm_destroy() [ Upstream commit `3863dff0c3` ] If avic is not enabled, avic_vm_init() does nothing and returns early. However, avic_vm_destroy() still tries to destroy what hasn't been created. The only bad consequence of this now is that avic_vm_destroy() uses svm_vm_data_hash_lock that hasn't been initialized (and is not meant to be used at all if avic is not enabled). Return early from avic_vm_destroy() if avic is not enabled. It has nothing to destroy. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: kvm@vger.kernel.org Cc: syzkaller@googlegroups.com Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:36 +01:00
Jaroslav Kysela	2d9a34c064	ALSA: hda - add support for docking station for HP 840 G3 [ Upstream commit `cc3a47a248` ] This tested patch adds missing initialization for Line-In/Out PINs for the docking station for HP 840 G3. Signed-off-by: Jaroslav Kysela <perex@perex.cz> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:36 +01:00
Jaroslav Kysela	52c3323e41	ALSA: hda - add support for docking station for HP 820 G2 [ Upstream commit `04d5466a97` ] This tested patch adds missing initialization for Line-In/Out PINs for the docking station for HP 820 G2. Signed-off-by: Jaroslav Kysela <perex@perex.cz> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:36 +01:00
Steve Capper	bb95f1caee	arm64: Initialise high_memory global variable earlier commit `f24e5834a2` upstream. The high_memory global variable is used by cma_declare_contiguous(.) before it is defined. We don't notice this as we compute __pa(high_memory - 1), and it looks like we're processing a VA from the direct linear map. This problem becomes apparent when we flip the kernel virtual address space and the linear map is moved to the bottom of the kernel VA space. This patch moves the initialisation of high_memory before it used. Fixes: `f7426b983a` ("mm: cma: adjust address limit to avoid hitting low/high memory boundary") Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:36 +01:00
Vaibhav Jain	76fcdc8cbb	cxl: Check if vphb exists before iterating over AFU devices commit `12841f87b7` upstream. During an eeh a kernel-oops is reported if no vPHB is allocated to the AFU. This happens as during AFU init, an error in creation of vPHB is a non-fatal error. Hence afu->phb should always be checked for NULL before iterating over it for the virtual AFU pci devices. This patch fixes the kenel-oops by adding a NULL pointer check for afu->phb before it is dereferenced. Fixes: `9e8df8a219` ("cxl: EEH support") Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-25 14:23:36 +01:00
Greg Kroah-Hartman	b632d71014	Linux 4.9.71	2017-12-20 10:07:34 +01:00
Miaoqing Pan	ed70a22125	ath9k: fix tx99 potential info leak [ Upstream commit `ee0a47186e` ] When the user sets count to zero the string buffer would remain completely uninitialized which causes the kernel to parse its own stack data, potentially leading to an info leak. In addition to that, the string might be not terminated properly when the user data does not contain a 0-terminator. Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org> Reviewed-by: Christoph Böhmwalder <christoph@boehmwalder.at> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:34 +01:00
Matteo Croce	8f23eb16af	icmp: don't fail on fragment reassembly time exceeded [ Upstream commit `258bbb1b0e` ] The ICMP implementation currently replies to an ICMP time exceeded message (type 11) with an ICMP host unreachable message (type 3, code 1). However, time exceeded messages can either represent "time to live exceeded in transit" (code 0) or "fragment reassembly time exceeded" (code 1). Unconditionally replying to "fragment reassembly time exceeded" with host unreachable messages might cause unjustified connection resets which are now easily triggered as UFO has been removed, because, in turn, sending large buffers triggers IP fragmentation. The issue can be easily reproduced by running a lot of UDP streams which is likely to trigger IP fragmentation: # start netserver in the test namespace ip netns add test ip netns exec test netserver # create a VETH pair ip link add name veth0 type veth peer name veth0 netns test ip link set veth0 up ip -n test link set veth0 up for i in $(seq 20 29); do # assign addresses to both ends ip addr add dev veth0 192.168.$i.1/24 ip -n test addr add dev veth0 192.168.$i.2/24 # start the traffic netperf -L 192.168.$i.1 -H 192.168.$i.2 -t UDP_STREAM -l 0 & done # wait send_data: data send error: No route to host (errno 113) netperf: send_omni: send_data failed: No route to host We need to differentiate instead: if fragment reassembly time exceeded is reported, we need to silently drop the packet, if time to live exceeded is reported, maintain the current behaviour. In both cases increment the related error count "icmpInTimeExcds". While at it, fix a typo in a comment, and convert the if statement into a switch to mate it more readable. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
Alex Vesker	2eb165b9fb	IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop [ Upstream commit `b4b678b06f` ] When ndo_open and ndo_stop are called RTNL lock should be held. In this specific case ipoib_ib_dev_open calls the offloaded ndo_open which re-sets the number of TX queue assuming RTNL lock is held. Since RTNL lock is not held, RTNL assert will fail. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
Bart Van Assche	0c70b35bf1	RDMA/cma: Avoid triggering undefined behavior [ Upstream commit `c0b64f58e8` ] According to the C standard the behavior of computations with integer operands is as follows: * A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type. * The behavior for signed integer underflow and overflow is undefined. Hence only use unsigned integers when checking for integer overflow. This patch is what I came up with after having analyzed the following smatch warnings: drivers/infiniband/core/cma.c:3448: cma_resolve_ib_udp() warn: signed overflow undefined. 'offset + conn_param->private_data_len < conn_param->private_data_len' drivers/infiniband/core/cma.c:3505: cma_connect_ib() warn: signed overflow undefined. 'offset + conn_param->private_data_len < conn_param->private_data_len' Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
Alexander Duyck	31eb4108e1	macvlan: Only deliver one copy of the frame to the macvlan interface [ Upstream commit `dd6b9c2c33` ] This patch intoduces a slight adjustment for macvlan to address the fact that in source mode I was seeing two copies of any packet addressed to the macvlan interface being delivered where there should have been only one. The issue appears to be that one copy was delivered based on the source MAC address and then the second copy was being delivered based on the destination MAC address. To fix it I am just treating a unicast address match as though it is not a match since source based macvlan isn't supposed to be matching based on the destination MAC anyway. Fixes: `79cf79abce` ("macvlan: add source mode") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
Jan Kara	b64ab3ca9d	udf: Avoid overflow when session starts at large offset [ Upstream commit `abdc0eb069` ] When session starts beyond offset 2^31 the arithmetics in udf_check_vsd() would overflow. Make sure the computation is done in large enough type. Reported-by: Cezary Sliwa <sliwa@ifpan.edu.pl> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
Dan Carpenter	91e0cf85ca	scsi: bfa: integer overflow in debugfs [ Upstream commit `3e35127565` ] We could allocate less memory than intended because we do: bfad->regdata = kzalloc(len << 2, GFP_KERNEL); The shift can overflow leading to a crash. This is debugfs code so the impact is very small. I fixed the network version of this in March with commit `13e2d5187f` ("bna: integer overflow bug in debugfs"). Fixes: `ab2a9ba189` ("[SCSI] bfa: add debugfs support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
weiping zhang	64da4e8d00	scsi: sd: change allow_restart to bool in sysfs interface [ Upstream commit `658e9a6dc1` ] /sys/class/scsi_disk/0:2:0:0/allow_restart can be changed to 0 unexpectedly by writing an invalid string such as the following: echo asdf > /sys/class/scsi_disk/0:2:0:0/allow_restart Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
weiping zhang	1cafdac891	scsi: sd: change manage_start_stop to bool in sysfs interface [ Upstream commit `623401ee33` ] /sys/class/scsi_disk/0:2:0:0/manage_start_stop can be changed to 0 unexpectly by writing an invalid string. Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
Jia-Ju Bai	8315bcf841	rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_disassoc_cmd [ Upstream commit `08880f8e08` ] The driver may sleep under a spinlock, and the function call path is: rtw_set_802_11_bssid(acquire the spinlock) rtw_disassoc_cmd kzalloc(GFP_KERNEL) --> may sleep To fix it, GFP_KERNEL is replaced with GFP_ATOMIC. This bug is found by my static analysis tool and my code review. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
Jia-Ju Bai	6641d3e307	rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_createbss_cmd [ Upstream commit `2bf9806d42` ] The driver may sleep under a spinlock, and the function call path is: rtw_surveydone_event_callback(acquire the spinlock) rtw_createbss_cmd kzalloc(GFP_KERNEL) --> may sleep To fix it, GFP_KERNEL is replaced with GFP_ATOMIC. This bug is found by my static analysis tool and my code review. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
Jia-Ju Bai	28e006e14f	vt6655: Fix a possible sleep-in-atomic bug in vt6655_suspend [ Upstream commit `42c8eb3f6e` ] The driver may sleep under a spinlock, and the function call path is: vt6655_suspend (acquire the spinlock) pci_set_power_state __pci_start_power_transition (drivers/pci/pci.c) msleep --> may sleep To fix it, pci_set_power_state is called without having a spinlock. This bug is found by my static analysis tool and my code review. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:33 +01:00
Parav Pandit	04d5a2d5d2	IB/core: Fix calculation of maximum RoCE MTU [ Upstream commit `99260132fd` ] The original code only took into consideration the largest header possible after the IB_BTH_BYTES. This was incorrect, as the largest possible header size is the largest possible combination of headers we might run into. The new code accounts for all possible headers in the largest possible combination and subtracts that from the MTU to make sure that all packets will fit on the wire. Link: https://www.spinics.net/lists/linux-rdma/msg54558.html Fixes: `3c86aa70bf` ("RDMA/cm: Add RDMA CM support for IBoE devices") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reported-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:32 +01:00
Kurt Garloff	c744ecec01	scsi: scsi_devinfo: Add REPORTLUN2 to EMC SYMMETRIX blacklist entry [ Upstream commit `909cf3e16a` ] All EMC SYMMETRIX support REPORT_LUNS, even if configured to report SCSI-2 for whatever reason. Signed-off-by: Kurt Garloff <garloff@suse.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:32 +01:00
NeilBrown	f39486bd37	raid5: Set R5_Expanded on parity devices as well as data. [ Upstream commit `235b6003fb` ] When reshaping a fully degraded raid5/raid6 to a larger nubmer of devices, the new device(s) are not in-sync and so that can make the newly grown stripe appear to be "failed". To avoid this, we set the R5_Expanded flag to say "Even though this device is not fully in-sync, this block is safe so don't treat the device as failed for this stripe". This flag is set for data devices, not not for parity devices. Consequently, if you have a RAID6 with two devices that are partly recovered and a spare, and start a reshape to include the spare, then when the reshape gets past the point where the recovery was up to, it will think the stripes are failed and will get into an infinite loop, failing to make progress. So when contructing parity on an EXPAND_READY stripe, set R5_Expanded. Reported-by: Curt <lightspd@gmail.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:32 +01:00
Linus Walleij	4fdb10391b	pinctrl: adi2: Fix Kconfig build problem [ Upstream commit `1c363531dd` ] The build robot is complaining on Blackfin: drivers/pinctrl/pinctrl-adi2.c: In function 'port_setup': >> drivers/pinctrl/pinctrl-adi2.c:221:21: error: dereferencing pointer to incomplete type 'struct gpio_port_t' writew(readw(&regs->port_fer) & ~BIT(offset), ^~ drivers/pinctrl/pinctrl-adi2.c: In function 'adi_gpio_ack_irq': >> drivers/pinctrl/pinctrl-adi2.c:266:18: error: dereferencing pointer to incomplete type 'struct bfin_pint_regs' if (readl(&regs->invert_set) & pintbit) ^~ It seems the driver need to include <asm/gpio.h> and <asm/irq.h> to compile. The Blackfin architecture was re-defining the Kconfig PINCTRL symbol which is not OK, so replaced this with PINCTRL_BLACKFIN_ADI2 which selects PINCTRL and PINCTRL_ADI2 just like most arches do. Further, the old GPIO driver symbol GPIO_ADI was possible to select at the same time as selecting PINCTRL. This was not working because the arch-local <asm/gpio.h> header contains an explicit #ifndef PINCTRL clause making compilation break if you combine them. The same is true for DEBUG_MMRS. Make sure the ADI2 pinctrl driver is not selected at the same time as the old GPIO implementation. (This should be converted to use gpiolib or pincontrol and move to drivers/...) Also make sure the old GPIO_ADI driver or DEBUG_MMRS is not selected at the same time as the new PINCTRL implementation, and only make PINCTRL_ADI2 selectable for the Blackfin families that actually have it. This way it is still possible to add e.g. I2C-based pin control expanders on the Blackfin. Cc: Steven Miao <realmz6@gmail.com> Cc: Huanhuan Feng <huanhuan.feng@analog.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:32 +01:00
Bin Liu	afeeff4d61	usb: musb: da8xx: fix babble condition handling commit `bd3486ded7` upstream. When babble condition happens, the musb controller might automatically turns off VBUS. On DA8xx platform, the controller generates drvvbus interrupt for turning off VBUS along with the babble interrupt. In this case, we should handle the babble interrupt first and recover from the babble condition. This change ignores the drvvbus interrupt if babble interrupt is also generated at the same time, so the babble recovery routine works properly. Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:32 +01:00
nixiaoming	92ad6c13e1	tty fix oops when rmmod 8250 [ Upstream commit `c79dde629d` ] After rmmod 8250.ko tty_kref_put starts kwork (release_one_tty) to release proc interface oops when accessing driver->driver_name in proc_tty_unregister_driver Use jprobe, found driver->driver_name point to 8250.ko static static struct uart_driver serial8250_reg .driver_name= serial, Use name in proc_dir_entry instead of driver->driver_name to fix oops test on linux 4.1.12: BUG: unable to handle kernel paging request at ffffffffa01979de IP: [<ffffffff81310f40>] strchr+0x0/0x30 PGD `1a0d067` PUD 1a0e063 PMD 851c1f067 PTE 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: ... ... [last unloaded: 8250] CPU: 7 PID: 116 Comm: kworker/7:1 Tainted: G O 4.1.12 #1 Hardware name: Insyde RiverForest/Type2 - Board Product Name1, BIOS NE5KV904 12/21/2015 Workqueue: events release_one_tty task: ffff88085b684960 ti: ffff880852884000 task.ti: ffff880852884000 RIP: 0010:[<ffffffff81310f40>] [<ffffffff81310f40>] strchr+0x0/0x30 RSP: 0018:ffff880852887c90 EFLAGS: 00010282 RAX: ffffffff81a5eca0 RBX: ffffffffa01979de RCX: 0000000000000004 RDX: ffff880852887d10 RSI: 000000000000002f RDI: ffffffffa01979de RBP: ffff880852887cd8 R08: 0000000000000000 R09: ffff88085f5d94d0 R10: 0000000000000195 R11: 0000000000000000 R12: ffffffffa01979de R13: ffff880852887d00 R14: ffffffffa01979de R15: ffff88085f02e840 FS: 0000000000000000(0000) GS:ffff88085f5c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa01979de CR3: 0000000001a0c000 CR4: 00000000001406e0 Stack: ffffffff812349b1 ffff880852887cb8 ffff880852887d10 ffff88085f5cd6c2 ffff880852800a80 ffffffffa01979de ffff880852800a84 0000000000000010 ffff88085bb28bd8 ffff880852887d38 ffffffff812354f0 ffff880852887d08 Call Trace: [<ffffffff812349b1>] ? __xlate_proc_name+0x71/0xd0 [<ffffffff812354f0>] remove_proc_entry+0x40/0x180 [<ffffffff815f6811>] ? _raw_spin_lock_irqsave+0x41/0x60 [<ffffffff813be520>] ? destruct_tty_driver+0x60/0xe0 [<ffffffff81237c68>] proc_tty_unregister_driver+0x28/0x40 [<ffffffff813be548>] destruct_tty_driver+0x88/0xe0 [<ffffffff813be5bd>] tty_driver_kref_put+0x1d/0x20 [<ffffffff813becca>] release_one_tty+0x5a/0xd0 [<ffffffff81074159>] process_one_work+0x139/0x420 [<ffffffff810745a1>] worker_thread+0x121/0x450 [<ffffffff81074480>] ? process_scheduled_works+0x40/0x40 [<ffffffff8107a16c>] kthread+0xec/0x110 [<ffffffff81080000>] ? tg_rt_schedulable+0x210/0x220 [<ffffffff8107a080>] ? kthread_freezable_should_stop+0x80/0x80 [<ffffffff815f7292>] ret_from_fork+0x42/0x70 [<ffffffff8107a080>] ? kthread_freezable_should_stop+0x80/0x80 Signed-off-by: nixiaoming <nixiaoming@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:32 +01:00
Matthias Brugger	421910e924	soc: mediatek: pwrap: fix compiler errors [ Upstream commit `fb2c1934f3` ] When compiling using sparse, we got the following error: drivers/soc/mediatek/mtk-pmic-wrap.c:686:25: error: dubious one-bit signed bitfield Changing the data type to unsigned fixes this. Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:32 +01:00
Michael Ellerman	7745382fe8	powerpc/perf/hv-24x7: Fix incorrect comparison in memord [ Upstream commit `05c14c0313` ] In the hv-24x7 code there is a function memord() which tries to implement a sort function return -1, 0, 1. However one of the conditions is incorrect, such that it can never be true, because we will have already returned. I don't believe there is a bug in practice though, because the comparisons are an optimisation prior to calling memcmp(). Fix it by swapping the second comparision, so it can be true. Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:32 +01:00
Martin Wilck	ab9d257114	scsi: hpsa: destroy sas transport properties before scsi_host [ Upstream commit `dfb2e6f46b` ] This patch cleans up a lot of warnings when unloading the driver. A current example of the stack trace starts with: [ 142.570715] sysfs group 'power' not found for kobject 'port-5:0' There can be hundreds of these messages during a driver unload. I am resubmitting this patch on behalf of Martin Wilck with his permission. His original patch can be found here: https://www.spinics.net/lists/linux-scsi/msg102085.html This patch did not help until Hannes's commit 9441284fbc39 ("scsi-fixup-kernel-warning-during-rmmod") was applied to the kernel. --------------------------- Original patch description: --------------------------- Unloading the hpsa driver causes warnings [ 1063.793652] WARNING: CPU: 1 PID: 4850 at ../fs/sysfs/group.c:237 device_del+0x54/0x240() [ 1063.793659] sysfs group ffffffff81cf21a0 not found for kobject 'port-2:0' with two different stacks: 1) [ 1063.793774] [<ffffffff81448af4>] device_del+0x54/0x240 [ 1063.793780] [<ffffffff8145178a>] transport_remove_classdev+0x4a/0x60 [ 1063.793784] [<ffffffff81451216>] attribute_container_device_trigger+0xa6/0xb0 [ 1063.793802] [<ffffffffa0105d46>] sas_port_delete+0x126/0x160 [scsi_transport_sas] [ 1063.793819] [<ffffffffa036ebcc>] hpsa_free_sas_port+0x3c/0x70 [hpsa] 2) [ 1063.797103] [<ffffffff81448af4>] device_del+0x54/0x240 [ 1063.797118] [<ffffffffa0105d4e>] sas_port_delete+0x12e/0x160 [scsi_transport_sas] [ 1063.797134] [<ffffffffa036ebcc>] hpsa_free_sas_port+0x3c/0x70 [hpsa] This is caused by the fact that host device hostX is deleted before the SAS transport devices hostX/port-a:b. This patch fixes this by reverting the order of device deletions. Tested-by: Don Brace <don.brace@microsemi.com> Reviewed-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin Wilck <mwilck@suse.de> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:32 +01:00
Martin Wilck	1723d6668d	scsi: hpsa: cleanup sas_phy structures in sysfs when unloading [ Upstream commit `55ca38b425` ] I am resubmitting this patch on behalf of Martin Wilck with his permission. The original patch can be found here: https://www.spinics.net/lists/linux-scsi/msg102083.html This patch did not help until Hannes's commit 9441284fbc39 ("scsi-fixup-kernel-warning-during-rmmod") was applied to the kernel. -------------------------------------- Original patch description from Martin: -------------------------------------- When the hpsa module is unloaded using rmmod, dangling symlinks remain under /sys/class/sas_phy. Fix this by calling sas_phy_delete() rather than sas_phy_free (which, according to comments, should not be called for PHYs that have been set up successfully, anyway). Tested-by: Don Brace <don.brace@microsemi.com> Reviewed-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin Wilck <mwilck@suse.de> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
Alex Williamson	237e053346	PCI: Detach driver before procfs & sysfs teardown on device remove [ Upstream commit `16b6c8bb68` ] When removing a device, for example a VF being removed due to SR-IOV teardown, a "soft" hot-unplug via 'echo 1 > remove' in sysfs, or an actual hot-unplug, we first remove the procfs and sysfs attributes for the device before attempting to release the device from any driver bound to it. Unbinding the driver from the device can take time. The device might need to write out data or it might be actively in use. If it's in use by userspace through a vfio driver, the unbind might block until the user releases the device. This leads to a potentially non-trivial amount of time where the device exists, but we've torn down the interfaces that userspace uses to examine devices, for instance lspci might generate this sort of error: pcilib: Cannot open /sys/bus/pci/devices/0000:01:0a.3/config lspci: Unable to read the standard configuration space header of device 0000:01:0a.3 We don't seem to have any dependence on this teardown ordering in the kernel, so let's unbind the driver first, which is also more symmetric with the instantiation of the device in pci_bus_add_device(). Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
Leon Romanovsky	8f84f861f9	RDMA/cxgb4: Declare stag as __be32 [ Upstream commit `35fb2a88ed` ] The scqe.stag is actually __b32, fix it. drivers/infiniband/hw/cxgb4/cq.c:754:52: warning: cast to restricted __be32 Cc: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
Christoph Hellwig	769bca9339	xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real [ Upstream commit `5e422f5e4f` ] There was one spot in xfs_bmap_add_extent_unwritten_real that didn't use the passed in new extent state but always converted to normal, leading to wrong behavior when converting from normal to unwritten. Only found by code inspection, it seems like this code path to move partial extent from written to unwritten while merging it with the next extent is rarely exercised. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
Brian Foster	c82209949b	xfs: fix log block underflow during recovery cycle verification [ Upstream commit `9f2a450580` ] It is possible for mkfs to format very small filesystems with too small of an internal log with respect to the various minimum size and block count requirements. If this occurs when the log happens to be smaller than the scan window used for cycle verification and the scan wraps the end of the log, the start_blk calculation in xlog_find_head() underflows and leads to an attempt to scan an invalid range of log blocks. This results in log recovery failure and a failed mount. Since there may be filesystems out in the wild with this kind of geometry, we cannot simply refuse to mount. Instead, cap the scan window for cycle verification to the size of the physical log. This ensures that the cycle verification proceeds as expected when the scan wraps the end of the log. Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
Jiri Slaby	fc4177eacf	l2tp: cleanup l2tp_tunnel_delete calls [ Upstream commit `4dc12ffeae` ] l2tp_tunnel_delete does not return anything since commit `62b982eeb4` ("l2tp: fix race condition in l2tp_tunnel_delete"). But call sites of l2tp_tunnel_delete still do casts to void to avoid unused return value warnings. Kill these now useless casts. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: Guillaume Nault <g.nault@alphalink.fr> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
Christoph Hellwig	6a559523ee	nvme: use kref_get_unless_zero in nvme_find_get_ns [ Upstream commit `2dd4122854` ] For kref_get_unless_zero to protect against lookup vs free races we need to use it in all places where we aren't guaranteed to already hold a reference. There is no such guarantee in nvme_find_get_ns, so switch to kref_get_unless_zero in this function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
Osama Khan	e2fce5a557	platform/x86: hp_accel: Add quirk for HP ProBook 440 G4 [ Upstream commit `163ca80013` ] Added support for HP ProBook 440 G4 laptops by including the accelerometer orientation quirk for that device. Testing was performed based on the axis orientation guidelines here: https://www.kernel.org/doc/Documentation/misc-devices/lis3lv02d which states "If the left side is elevated, X increases (becomes positive)". When tested, on lifting the left edge, x values became increasingly negative thus indicating an inverted x-axis on the installed lis3lv02d chip. This was compensated by adding an entry for this device in hp_accel.c specifying the quirk as x_inverted. The patch was tested on a ProBook 440 G4 device and x-axis as well as y and z-axis values are now generated as per spec. Signed-off-by: Osama Khan <osama.khan@ericsson.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
Christophe JAILLET	7fab68e1f7	btrfs: tests: Fix a memory leak in error handling path in 'run_test()' [ Upstream commit `9ca2e97fa3` ] If 'btrfs_alloc_path()' fails, we must free the resources already allocated, as done in the other error handling paths in this function. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
Nick Desaulniers	b7ada2c0ea	arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27 [ Upstream commit `fd9dde6abc` ] Upon upgrading to binutils 2.27, we found that our lz4 and gzip compressed kernel images were significantly larger, resulting is 10ms boot time regressions. As noted by Rahul: "aarch64 binaries uses RELA relocations, where each relocation entry includes an addend value. This is similar to x86_64. On x86_64, the addend values are also stored at the relocation offset for relative relocations. This is an optimization: in the case where code does not need to be relocated, the loader can simply skip processing relative relocations. In binutils-2.25, both bfd and gold linkers did this for x86_64, but only the gold linker did this for aarch64. The kernel build here is using the bfd linker, which stored zeroes at the relocation offsets for relative relocations. Since a set of zeroes compresses better than a set of non-zero addend values, this behavior was resulting in much better lz4 compression. The bfd linker in binutils-2.27 is now storing the actual addend values at the relocation offsets. The behavior is now consistent with what it does for x86_64 and what gold linker does for both architectures. The change happened in this upstream commit: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1f56df9d0d5ad89806c24e71f296576d82344613 Since a bunch of zeroes got replaced by non-zero addend values, we see the side effect of lz4 compressed image being a bit bigger. To get the old behavior from the bfd linker, "--no-apply-dynamic-relocs" flag can be used: $ LDFLAGS="--no-apply-dynamic-relocs" make With this flag, the compressed image size is back to what it was with binutils-2.25. If the kernel is using ASLR, there aren't additional runtime costs to --no-apply-dynamic-relocs, as the relocations will need to be applied again anyway after the kernel is relocated to a random address. If the kernel is not using ASLR, then presumably the current default behavior of the linker is better. Since the static linker performed the dynamic relocs, and the kernel is not moved to a different address at load time, it can skip applying the relocations all over again." Some measurements: $ ld -v GNU ld (binutils-2.25-f3d35cf6) 2.25.51.20141117 ^ $ ls -l vmlinux -rwxr-x--- 1 ndesaulniers eng 300652760 Oct 26 11:57 vmlinux $ ls -l Image.lz4-dtb -rw-r----- 1 ndesaulniers eng 16932627 Oct 26 11:57 Image.lz4-dtb $ ld -v GNU ld (binutils-2.27-53dd00a1) 2.27.0.20170315 ^ pre patch: $ ls -l vmlinux -rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 11:43 vmlinux $ ls -l Image.lz4-dtb -rw-r----- 1 ndesaulniers eng 18159474 Oct 26 11:43 Image.lz4-dtb post patch: $ ls -l vmlinux -rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 12:06 vmlinux $ ls -l Image.lz4-dtb -rw-r----- 1 ndesaulniers eng 16932466 Oct 26 12:06 Image.lz4-dtb By Siqi's measurement w/ gzip: binutils 2.27 with this patch (with --no-apply-dynamic-relocs): Image 41535488 Image.gz 13404067 binutils 2.27 without this patch (without --no-apply-dynamic-relocs): Image 41535488 Image.gz 14125516 Any compression scheme should be able to get better results from the longer runs of zeros, not just GZIP and LZ4. 10ms boot time savings isn't anything to get excited about, but users of arm64+compression+bfd-2.27 should not have to pay a penalty for no runtime improvement. Reported-by: Gopinath Elanchezhian <gelanchezhian@google.com> Reported-by: Sindhuri Pentyala <spentyala@google.com> Reported-by: Wei Wang <wvw@google.com> Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Suggested-by: Rahul Chaudhry <rahulchaudhry@google.com> Suggested-by: Siqi Lin <siqilin@google.com> Suggested-by: Stephen Hines <srhines@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: added comment to Makefile] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
Patel Jay P	52aaa748a9	Ib/hfi1: Return actual operational VLs in port info query [ Upstream commit `00f9203119` ] __subn_get_opa_portinfo stores value returned by hfi1_get_ib_cfg() as operational vls. hfi1_get_ib_cfg() returns vls_operational field in hfi1_pportdata. The problem with this is that the value is always equal to vls_supported field in hfi1_pportdata. The logic to calculate operational_vls is to set value passed by FM (in __subn_set_opa_portinfo routine). If no value is passed then default value is stored in operational_vls. Field actual_vls_operational is calculated on the basis of buffer control table. Hence, modifying hfi1_get_ib_cfg() to return actual_operational_vls when used with HFI1_IB_CFG_OP_VLS parameter Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Patel Jay P <jay.p.patel@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:31 +01:00
tang.junhui	9102ed6a5f	bcache: fix wrong cache_misses statistics [ Upstream commit `c157313791` ] Currently, Cache missed IOs are identified by s->cache_miss, but actually, there are many situations that missed IOs are not assigned a value for s->cache_miss in cached_dev_cache_miss(), for example, a bypassed IO (s->iop.bypass = 1), or the cache_bio allocate failed. In these situations, it will go to out_put or out_submit, and s->cache_miss is null, which leads bch_mark_cache_accounting() to treat this IO as a hit IO. [ML: applied by 3-way merge] Signed-off-by: tang.junhui <tang.junhui@zte.com.cn> Reviewed-by: Michael Lyle <mlyle@lyle.org> Reviewed-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:30 +01:00
Liang Chen	c2a0531f59	bcache: explicitly destroy mutex while exiting [ Upstream commit `330a4db89d` ] mutex_destroy does nothing most of time, but it's better to call it to make the code future proof and it also has some meaning for like mutex debug. As Coly pointed out in a previous review, bcache_exit() may not be able to handle all the references properly if userspace registers cache and backing devices right before bch_debug_init runs and bch_debug_init failes later. So not exposing userspace interface until everything is ready to avoid that issue. Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Reviewed-by: Michael Lyle <mlyle@lyle.org> Reviewed-by: Coly Li <colyli@suse.de> Reviewed-by: Eric Wheeler <bcache@linux.ewheeler.net> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:30 +01:00
Bob Peterson	75f66eeae6	GFS2: Take inode off order_write list when setting jdata flag [ Upstream commit `cc555b09d8` ] This patch fixes a deadlock caused when the jdata flag is set for inodes that are already on the ordered write list. Since it is on the ordered write list, log_flush calls gfs2_ordered_write which calls filemap_fdatawrite. But since the inode had the jdata flag set, that calls gfs2_jdata_writepages, which tries to start a new transaction. A new transaction cannot be started because it tries to acquire the log_flush rwsem which is already locked by the log flush operation. The bottom line is: We cannot switch an inode from ordered to jdata until we eliminate any ordered data pages (via log flush) or any log_flush operation afterward will create the circular dependency above. So we need to flush the log before setting the diskflags to switch the file mode, then we need to remove the inode from the ordered writes list. Before this patch, the log flush was done for jdata->ordered, but that's wrong. If we're going from jdata to ordered, we don't need to call gfs2_log_flush because the call to filemap_fdatawrite will do it for us: filemap_fdatawrite() -> __filemap_fdatawrite_range() __filemap_fdatawrite_range() -> do_writepages() do_writepages() -> gfs2_jdata_writepages() gfs2_jdata_writepages() -> gfs2_log_flush() This patch modifies function do_gfs2_set_flags so that if a file has its jdata flag set, and it's already on the ordered write list, the log will be flushed and it will be removed from the list before setting the flag. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Abhijith Das <adas@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:30 +01:00
Douglas Gilbert	026ffaf658	scsi: scsi_debug: write_same: fix error report [ Upstream commit `e33d7c5645` ] The scsi_debug driver incorrectly suggests there is an error with the SCSI WRITE SAME command when the number_of_logical_blocks is greater than 1. It will also suggest there is an error when NDOB (no data-out buffer) is set and the number_of_logical_blocks is greater than 0. Both are valid, fix. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:30 +01:00
Daniel Lezcano	d8914530f2	thermal/drivers/step_wise: Fix temperature regulation misbehavior [ Upstream commit `07209fcf33` ] There is a particular situation when the cooling device is cpufreq and the heat dissipation is not efficient enough where the temperature increases little by little until reaching the critical threshold and leading to a SoC reset. The behavior is reproducible on a hikey6220 with bad heat dissipation (eg. stacked with other boards). Running a simple C program doing while(1); for each CPU of the SoC makes the temperature to reach the passive regulation trip point and ends up to the maximum allowed temperature followed by a reset. This issue has been also reported by running the libhugetlbfs test suite. What is observed is a ping pong between two cpu frequencies, 1.2GHz and 900MHz while the temperature continues to grow. It appears the step wise governor calls get_target_state() the first time with the throttle set to true and the trend to 'raising'. The code selects logically the next state, so the cpu frequency decreases from 1.2GHz to 900MHz, so far so good. The temperature decreases immediately but still stays greater than the trip point, then get_target_state() is called again, this time with the throttle set to true and the trend to 'dropping'. From there the algorithm assumes we have to step down the state and the cpu frequency jumps back to 1.2GHz. But the temperature is still higher than the trip point, so get_target_state() is called with throttle=1 and trend='raising' again, we jump to 900MHz, then get_target_state() is called with throttle=1 and trend='dropping', we jump to 1.2GHz, etc ... but the temperature does not stabilizes and continues to increase. [ 237.922654] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 237.922678] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 237.922690] thermal cooling_device0: cur_state=0 [ 237.922701] thermal cooling_device0: old_target=0, target=1 [ 238.026656] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 238.026680] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1 [ 238.026694] thermal cooling_device0: cur_state=1 [ 238.026707] thermal cooling_device0: old_target=1, target=0 [ 238.134647] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 238.134667] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 238.134679] thermal cooling_device0: cur_state=0 [ 238.134690] thermal cooling_device0: old_target=0, target=1 In this situation the temperature continues to increase while the trend is oscillating between 'dropping' and 'raising'. We need to keep the current state untouched if the throttle is set, so the temperature can decrease or a higher state could be selected, thus preventing this oscillation. Keeping the next_target untouched when 'throttle' is true at 'dropping' time fixes the issue. The following traces show the governor does not change the next state if trend==2 (dropping) and throttle==1. [ 2306.127987] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 2306.128009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 2306.128021] thermal cooling_device0: cur_state=0 [ 2306.128031] thermal cooling_device0: old_target=0, target=1 [ 2306.231991] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 2306.232016] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1 [ 2306.232030] thermal cooling_device0: cur_state=1 [ 2306.232042] thermal cooling_device0: old_target=1, target=1 [ 2306.335982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1 [ 2306.336006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=1 [ 2306.336021] thermal cooling_device0: cur_state=1 [ 2306.336034] thermal cooling_device0: old_target=1, target=1 [ 2306.439984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 2306.440008] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0 [ 2306.440022] thermal cooling_device0: cur_state=1 [ 2306.440034] thermal cooling_device0: old_target=1, target=0 [ ... ] After a while, if the temperature continues to increase, the next state becomes 2 which is 720MHz on the hikey. That results in the temperature stabilizing around the trip point. [ 2455.831982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 2455.832006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=0 [ 2455.832019] thermal cooling_device0: cur_state=1 [ 2455.832032] thermal cooling_device0: old_target=1, target=1 [ 2455.935985] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1 [ 2455.936013] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0 [ 2455.936027] thermal cooling_device0: cur_state=1 [ 2455.936040] thermal cooling_device0: old_target=1, target=1 [ 2456.043984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1 [ 2456.044009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0 [ 2456.044023] thermal cooling_device0: cur_state=1 [ 2456.044036] thermal cooling_device0: old_target=1, target=1 [ 2456.148001] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 2456.148028] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 2456.148042] thermal cooling_device0: cur_state=1 [ 2456.148055] thermal cooling_device0: old_target=1, target=2 [ 2456.252009] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 2456.252041] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0 [ 2456.252058] thermal cooling_device0: cur_state=2 [ 2456.252075] thermal cooling_device0: old_target=2, target=1 IOW, this change is needed to keep the state for a cooling device if the temperature trend is oscillating while the temperature increases slightly. Without this change, the situation above leads to a catastrophic crash by a hardware reset on hikey. This issue has been reported to happen on an OMAP dra7xx also. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Keerthy <j-keerthy@ti.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Tested-by: Keerthy <j-keerthy@ti.com> Reviewed-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:30 +01:00
Kuninori Morimoto	019433db87	ASoC: rsnd: rsnd_ssi_run_mods() needs to care ssi_parent_mod [ Upstream commit `21781e8788` ] SSI parent mod might be NULL. ssi_parent_mod() needs to care about it. Otherwise, it uses negative shift. This patch fixes it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:30 +01:00
Gao Feng	cf16dac8bd	ppp: Destroy the mutex when cleanup [ Upstream commit `f02b2320b2` ] The mutex_destroy only makes sense when enable DEBUG_MUTEX. For the good readbility, it's better to invoke it in exit func when the init func invokes mutex_init. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:30 +01:00
Michał Mirosław	27f5597c98	clk: tegra: Fix cclk_lp divisor register [ Upstream commit `54eff2264d` ] According to comments in code and common sense, cclk_lp uses its own divisor, not cclk_g's. Fixes: `b08e8c0ecc` ("clk: tegra: add clock support for Tegra30") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Acked-By: Peter De Schrijver <pdeschrijver@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:30 +01:00
Leo Yan	54809e38a6	clk: hi6220: mark clock cs_atb_syspll as critical [ Upstream commit `d2a3671ebe` ] Clock cs_atb_syspll is pll used for coresight trace bus; when clock cs_atb_syspll is disabled and operates its child clock node cs_atb results in system hang. So mark clock cs_atb_syspll as critical to keep it enabled. Cc: Guodong Xu <guodong.xu@linaro.org> Cc: Zhangfei Gao <zhangfei.gao@linaro.org> Cc: Haojian Zhuang <haojian.zhuang@linaro.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Michael Turquette <mturquette@baylibre.com> Link: lkml.kernel.org/r/1504226835-2115-2-git-send-email-leo.yan@linaro.org Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:30 +01:00
Sébastien Szymanski	47b63ea40e	clk: imx6: refine hdmi_isfr's parent to make HDMI work on i.MX6 SoCs w/o VPU [ Upstream commit `c68ee58d9e` ] On i.MX6 SoCs without VPU (in my case MCIMX6D4AVT10AC), the hdmi driver fails to probe: [ 2.540030] dwhdmi-imx 120000.hdmi: Unsupported HDMI controller (0000:00:00) [ 2.548199] imx-drm display-subsystem: failed to bind 120000.hdmi (ops dw_hdmi_imx_ops): -19 [ 2.557403] imx-drm display-subsystem: master bind failed: -19 That's because hdmi_isfr's parent, video_27m, is not correctly ungated. As explained in commit `5ccc248cc5` ("ARM: imx6q: clk: Add support for mipi_core_cfg clock as a shared clock gate"), video_27m is gated by CCM_CCGR3[CG8]. On i.MX6 SoCs with VPU, the hdmi is working thanks to the CCM_CMEOR[mod_en_ov_vpu] bit which makes the video_27m ungated whatever is in CCM_CCGR3[CG8]. The issue can be reproduced by setting CCMEOR[mod_en_ov_vpu] to 0. Make the HDMI work in every case by setting hdmi_isfr's parent to mipi_core_cfg. Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:29 +01:00
Chen Zhong	d6b6302c36	clk: mediatek: add the option for determining PLL source clock [ Upstream commit `c955bf3998` ] Since the previous setup always sets the PLL using crystal 26MHz, this doesn't always happen in every MediaTek platform. So the patch added flexibility for assigning extra member for determining the PLL source clock. Signed-off-by: Chen Zhong <chen.zhong@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:29 +01:00
Jan Kara	2850c3ec0d	mm: Handle 0 flags in _calc_vm_trans() macro [ Upstream commit `592e254502` ] _calc_vm_trans() does not handle the situation when some of the passed flags are 0 (which can happen if these VM flags do not make sense for the architecture). Improve the _calc_vm_trans() macro to return 0 in such situation. Since all passed flags are constant, this does not add any runtime overhead. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:29 +01:00
Robert Baronescu	18498f1c70	crypto: tcrypt - fix buffer lengths in test_aead_speed() [ Upstream commit `7aacbfcb33` ] Fix the way the length of the buffers used for encryption / decryption are computed. For e.g. in case of encryption, input buffer does not contain an authentication tag. Signed-off-by: Robert Baronescu <robert.baronescu@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:29 +01:00
Suzuki K Poulose	2ed46cbf23	arm-ccn: perf: Prevent module unload while PMU is in use [ Upstream commit `c7f5828bf7` ] When the PMU driver is built as a module, the perf expects the pmu->module to be valid, so that the driver is prevented from being unloaded while it is in use. Fix the CCN pmu driver to fill in this field. Fixes: `a33b0daab7` ("bus: ARM CCN PMU driver") Cc: Pawel Moll <pawel.moll@arm.com> Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:29 +01:00
Eryu Guan	c843e9f8f9	xfs: truncate pagecache before writeback in xfs_setattr_size() [ Upstream commit `350976ae21` ] On truncate down, if new size is not block size aligned, we zero the rest of block to avoid exposing stale data to user, and iomap_truncate_page() skips zeroing if the range is already in unwritten state or a hole. Then we writeback from on-disk i_size to the new size if this range hasn't been written to disk yet, and truncate page cache beyond new EOF and set in-core i_size. The problem is that we could write data between di_size and newsize before removing the page cache beyond newsize, as the extents may still be in unwritten state right after a buffer write. As such, the page of data that newsize lies in has not been zeroed by page cache invalidation before it is written, and xfs_do_writepage() hasn't triggered it's "zero data beyond EOF" case because we haven't updated in-core i_size yet. Then a subsequent mmap read could see non-zeros past EOF. I occasionally see this in fsx runs in fstests generic/112, a simplified fsx operation sequence is like (assuming 4k block size xfs): fallocate 0x0 0x1000 0x0 keep_size write 0x0 0x1000 0x0 truncate 0x0 0x800 0x1000 punch_hole 0x0 0x800 0x800 mapread 0x0 0x800 0x800 where fallocate allocates unwritten extent but doesn't update i_size, buffer write populates the page cache and extent is still unwritten, truncate skips zeroing page past new EOF and writes the page to disk, punch_hole invalidates the page cache, at last mapread reads the block back and sees non-zero beyond EOF. Fix it by moving truncate_setsize() to before writeback so the page cache invalidation zeros the partial page at the new EOF. This also triggers "zero data beyond EOF" in xfs_do_writepage() at writeback time, because newsize has been set and page straddles the newsize. Also fixed the wrong 'end' param of filemap_write_and_wait_range() call while we're at it, the 'end' is inclusive and should be 'newsize - 1'. Suggested-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eryu Guan <eguan@redhat.com> Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:29 +01:00
Gary R Hook	03bfadfb0d	iommu/amd: Limit the IOVA page range to the specified addresses [ Upstream commit `b92b4fb5c1` ] The extent of pages specified when applying a reserved region should include up to the last page of the range, but not the page following the range. Signed-off-by: Gary R Hook <gary.hook@amd.com> Fixes: `8d54d6c8b8` ('iommu/amd: Implement apply_dm_region call-back') Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:29 +01:00
Liu Bo	cb0acb3701	badblocks: fix wrong return value in badblocks_set if badblocks are disabled [ Upstream commit `39b4954c0a` ] MD's rdev_set_badblocks() expects that badblocks_set() returns 1 if badblocks are disabled, otherwise, rdev_set_badblocks() will record superblock changes and return success in that case and md will fail to report an IO error which it should. This bug has existed since badblocks were introduced in commit `9e0e252a04` ("badblocks: Add core badblock management code"). Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Acked-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:29 +01:00
Jiang Yi	dcdca12381	target/file: Do not return error for UNMAP if length is zero [ Upstream commit `594e25e734` ] The function fd_execute_unmap() in target_core_file.c calles ret = file->f_op->fallocate(file, mode, pos, len); Some filesystems implement fallocate() to return error if length is zero (e.g. btrfs) but according to SCSI Block Commands spec UNMAP should return success for zero length. Signed-off-by: Jiang Yi <jiangyilism@gmail.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:29 +01:00
tangwenji	998201fdc5	target:fix condition return in core_pr_dump_initiator_port() [ Upstream commit `24528f089d` ] When is pr_reg->isid_present_at_reg is false,this function should return. This fixes a regression originally introduced by: commit `d2843c173e` Author: Andy Grover <agrover@redhat.com> Date: Thu May 16 10:40:55 2013 -0700 target: Alter core_pr_dump_initiator_port for ease of use Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:29 +01:00
tangwenji	a4f54ec403	iscsi-target: fix memory leak in lio_target_tiqn_addtpg() [ Upstream commit `12d5a43b2d` ] tpg must free when call core_tpg_register() return fail Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:28 +01:00
Bart Van Assche	e086a82a92	target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd() [ Upstream commit `cfe2b621bb` ] Avoid that cmd->se_cmd.se_tfo is read after a command has already been freed. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Christie <mchristi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:28 +01:00
Kuppuswamy Sathyanarayanan	abc4b4420a	platform/x86: intel_punit_ipc: Fix resource ioremap warning [ Upstream commit `6cc8cbbc88` ] For PUNIT device, ISPDRIVER_IPC and GTDDRIVER_IPC resources are not mandatory. So when PMC IPC driver creates a PUNIT device, if these resources are not available then it creates dummy resource entries for these missing resources. But during PUNIT device probe, doing ioremap on these dummy resources generates following warning messages. intel_punit_ipc: can't request region for resource [mem 0x00000000] intel_punit_ipc: can't request region for resource [mem 0x00000000] intel_punit_ipc: can't request region for resource [mem 0x00000000] intel_punit_ipc: can't request region for resource [mem 0x00000000] This patch fixes this issue by adding extra check for resource size before performing ioremap operation. Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:28 +01:00
Christophe Leroy	6e5a846d51	powerpc/ipic: Fix status get and status clear [ Upstream commit `6b148a7ce7` ] IPIC Status is provided by register IPIC_SERSR and not by IPIC_SERMR which is the mask register. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:28 +01:00
William A. Kennington III	d7e7c431d6	powerpc/opal: Fix EBUSY bug in acquiring tokens [ Upstream commit `71e24d7731` ] The current code checks the completion map to look for the first token that is complete. In some cases, a completion can come in but the token can still be on lease to the caller processing the completion. If this completed but unreleased token is the first token found in the bitmap by another tasks trying to acquire a token, then the __test_and_set_bit call will fail since the token will still be on lease. The acquisition will then fail with an EBUSY. This patch reorganizes the acquisition code to look at the opal_async_token_map for an unleased token. If the token has no lease it must have no outstanding completions so we should never see an EBUSY, unless we have leased out too many tokens. Since opal_async_get_token_inrerruptible is protected by a semaphore, we will practically never see EBUSY anymore. Fixes: `8d72482322` ("powerpc/powernv: Infrastructure to support OPAL async completion") Signed-off-by: William A. Kennington III <wak@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:28 +01:00
KUWAZAWA Takuya	a463f9c5df	netfilter: ipvs: Fix inappropriate output of procfs [ Upstream commit `c5504f724c` ] Information about ipvs in different network namespace can be seen via procfs. How to reproduce: # ip netns add ns01 # ip netns add ns02 # ip netns exec ns01 ip a add dev lo 127.0.0.1/8 # ip netns exec ns02 ip a add dev lo 127.0.0.1/8 # ip netns exec ns01 ipvsadm -A -t 10.1.1.1:80 # ip netns exec ns02 ipvsadm -A -t 10.1.1.2:80 The ipvsadm displays information about its own network namespace only. # ip netns exec ns01 ipvsadm -Ln IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 10.1.1.1:80 wlc # ip netns exec ns02 ipvsadm -Ln IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 10.1.1.2:80 wlc But I can see information about other network namespace via procfs. # ip netns exec ns01 cat /proc/net/ip_vs IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 0A010101:0050 wlc TCP 0A010102:0050 wlc # ip netns exec ns02 cat /proc/net/ip_vs IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 0A010102:0050 wlc Signed-off-by: KUWAZAWA Takuya <albatross0@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:28 +01:00
Matthias Brugger	b3b6d1eea0	iommu/mediatek: Fix driver name [ Upstream commit `395df08d2e` ] There exist two Mediatek iommu drivers for the two different generations of the device. But both drivers have the same name "mtk-iommu". This breaks the registration of the second driver: Error: Driver 'mtk-iommu' is already registered, aborting... Fix this by changing the name for first generation to "mtk-iommu-v1". Fixes: `b17336c55d` ("iommu/mediatek: add support for mtk iommu generation one HW") Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:28 +01:00
Mika Westerberg	9a4bf05126	PCI: Do not allocate more buses than available in parent [ Upstream commit `a20c7f36bd` ] One can ask more buses to be reserved for hotplug bridges by passing pci=hpbussize=N in the kernel command line. If the parent bus does not have enough bus space available we incorrectly create child bus with the requested number of subordinate buses. In the example below hpbussize is set to one more than we have available buses in the root port: pci 0000:07:00.0: [8086:1578] type 01 class 0x060400 pci 0000:07:00.0: scanning [bus 00-00] behind bridge, pass 0 pci 0000:07:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring pci 0000:07:00.0: scanning [bus 00-00] behind bridge, pass 1 pci_bus 0000:08: busn_res: can not insert [bus 08-ff] under [bus 07-3f] (conflicts with (null) [bus 07-3f]) pci_bus 0000:08: scanning bus ... pci_bus 0000:0a: bus scan returning with max=40 pci_bus 0000:0a: busn_res: [bus 0a-ff] end is updated to 40 pci_bus 0000:0a: [bus 0a-40] partially hidden behind bridge 0000:07 [bus 07-3f] pci_bus 0000:08: bus scan returning with max=40 pci_bus 0000:08: busn_res: [bus 08-ff] end is updated to 40 Instead of allowing this, limit the subordinate number to be less than or equal the maximum subordinate number allocated for the parent bus (if it has any). Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> [bhelgaas: remove irrelevant dmesg messages] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:28 +01:00
Shriya	29a404be7b	powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo [ Upstream commit `cd77b5ce20` ] The call to /proc/cpuinfo in turn calls cpufreq_quick_get() which returns the last frequency requested by the kernel, but may not reflect the actual frequency the processor is running at. This patch makes a call to cpufreq_get() instead which returns the current frequency reported by the hardware. Fixes: `fb5153d05a` ("powerpc: powernv: Implement ppc_md.get_proc_freq()") Signed-off-by: Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:27 +01:00
Qiang	f44d28e034	PCI/PME: Handle invalid data when reading Root Status [ Upstream commit `3ad3f8ce50` ] PCIe PME and native hotplug share the same interrupt number, so hotplug interrupts are also processed by PME. In some cases, e.g., a Link Down interrupt, a device may be present but unreachable, so when we try to read its Root Status register, the read fails and we get all ones data (0xffffffff). Previously, we interpreted that data as PCI_EXP_RTSTA_PME being set, i.e., "some device has asserted PME," so we scheduled pcie_pme_work_fn(). This caused an infinite loop because pcie_pme_work_fn() tried to handle PME requests until PCI_EXP_RTSTA_PME is cleared, but with the link down, PCI_EXP_RTSTA_PME can't be cleared. Check for the invalid 0xffffffff data everywhere we read the Root Status register. `1469d17dd3` ("PCI: pciehp: Handle invalid data when reading from non-existent devices") added similar checks in the hotplug driver. Signed-off-by: Qiang Zheng <zhengqiang10@huawei.com> [bhelgaas: changelog, also check in pcie_pme_work_fn(), use "~0" to follow other similar checks] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:27 +01:00
Peter Ujfalusi	5a7192bc38	dmaengine: ti-dma-crossbar: Correct am335x/am43xx mux value type [ Upstream commit `288e7560e4` ] The used 0x1f mask is only valid for am335x family of SoC, different family using this type of crossbar might have different number of electable events. In case of am43xx family 0x3f mask should have been used for example. Instead of trying to handle each family's mask, just use u8 type to store the mux value since the event offsets are aligned to byte offset. Fixes: `42dbdcc6bf` ("dmaengine: ti-dma-crossbar: Add support for crossbar on AM33xx/AM43xx") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:27 +01:00
Pankaj Bharadiya	03a48dc965	ASoC: Intel: Skylake: Fix uuid_module memory leak in failure case [ Upstream commit `f8e0665211` ] In the loop that adds the uuid_module to the uuid_list list, allocated memory is not properly freed in the error path free uuid_list whenever any of the memory allocation in the loop fails to avoid memory leak. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Guneshwor Singh <guneshwor.o.singh@intel.com> Acked-By: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:27 +01:00
Philipp Zabel	9146b10f8c	rtc: pcf8563: fix output clock rate [ Upstream commit `a3350f9c57` ] The pcf8563_clkout_recalc_rate function erroneously ignores the frequency index read from the CLKO register and always returns 32768 Hz. Fixes: `a39a6405d5` ("rtc: pcf8563: add CLKOUT to common clock framework") Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:27 +01:00
Christophe JAILLET	cf53526f33	video: fbdev: au1200fb: Return an error code if a memory allocation fails [ Upstream commit `8cae353e6b` ] 'ret' is known to be 0 at this point. In case of memory allocation error in 'framebuffer_alloc()', return -ENOMEM instead. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:27 +01:00
Christophe JAILLET	90e2591f6f	video: fbdev: au1200fb: Release some resources if a memory allocation fails [ Upstream commit `451f130602` ] We should go through the error handling code instead of returning -ENOMEM directly. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:27 +01:00
Ladislav Michl	92c3c7db83	video: udlfb: Fix read EDID timeout [ Upstream commit `c987694755` ] While usb_control_msg function expects timeout in miliseconds, a value of HZ is used. Replace it with USB_CTRL_GET_TIMEOUT and also fix error message which looks like: udlfb: Read EDID byte 78 failed err ffffff92 as error is either negative errno or number of bytes transferred use %d format specifier. Returned EDID is in second byte, so return error when less than two bytes are received. Fixes: `18dffdf891` ("staging: udlfb: enhance EDID and mode handling support") Signed-off-by: Ladislav Michl <ladis@linux-mips.org> Cc: Bernie Thompson <bernie@plugable.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:27 +01:00
Geert Uytterhoeven	aecce5fc04	fbdev: controlfb: Add missing modes to fix out of bounds access [ Upstream commit `ac831a379d` ] Dan's static analysis says: drivers/video/fbdev/controlfb.c:560 control_setup() error: buffer overflow 'control_mac_modes' 20 <= 21 Indeed, control_mac_modes[] has only 20 elements, while VMODE_MAX is 22, which may lead to an out of bounds read when parsing vmode commandline options. The bug was introduced in v2.4.5.6, when 2 new modes were added to macmodes.h, but control_mac_modes[] wasn't updated: https://kernel.opensuse.org/cgit/kernel/diff/include/video/macmodes.h?h=v2.5.2&id=29f279c764808560eaceb88fef36cbc35c529aad Augment control_mac_modes[] with the two new video modes to fix this. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:27 +01:00
Robert Stonehouse	0312ab0f0e	sfc: don't warn on successful change of MAC [ Upstream commit `cbad52e92a` ] Fixes: `535a61777f` ("sfc: suppress handled MCDI failures when changing the MAC address") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:27 +01:00
Sébastien Szymanski	da73389e8a	HID: cp2112: fix broken gpio_direction_input callback [ Upstream commit `7da85fbf1c` ] When everything goes smoothly, ret is set to 0 which makes the function to return EIO error. Fixes: `8e9faa1546` ("HID: cp2112: fix gpio-callback error handling") Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:26 +01:00
Dou Liyang	e15628b293	Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting" [ Upstream commit `c962cff17d` ] Revert: `dc6db24d24` ("x86/acpi: Set persistent cpuid <-> nodeid mapping when booting") The mapping of "cpuid <-> nodeid" is established at boot time via ACPI tables to keep associations of workqueues and other node related items consistent across cpu hotplug. But, ACPI tables are unreliable and failures with that boot time mapping have been reported on machines where the ACPI table and the physical information which is retrieved at actual hotplug is inconsistent. Revert the mapping implementation so it can be replaced with a less error prone approach. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Tested-by: Xiaolong Ye <xiaolong.ye@intel.com> Cc: rjw@rjwysocki.net Cc: linux-acpi@vger.kernel.org Cc: guzheng1@huawei.com Cc: izumi.taku@jp.fujitsu.com Cc: lenb@kernel.org Link: http://lkml.kernel.org/r/1488528147-2279-2-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:26 +01:00
Mike Christie	25b0b3f237	target: fix race during implicit transition work flushes [ Upstream commit `760bf578ed` ] This fixes the following races: 1. core_alua_do_transition_tg_pt could have read tg_pt_gp_alua_access_state and gone into this if chunk: if (!explicit && atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) == ALUA_ACCESS_STATE_TRANSITION) { and then core_alua_do_transition_tg_pt_work could update the state. core_alua_do_transition_tg_pt would then only set tg_pt_gp_alua_pending_state and the tg_pt_gp_alua_access_state would not get updated with the second calls state. 2. core_alua_do_transition_tg_pt could be setting tg_pt_gp_transition_complete while the tg_pt_gp_transition_work is already completing. core_alua_do_transition_tg_pt then waits on the completion that will never be called. To handle these issues, we just call flush_work which will return when core_alua_do_transition_tg_pt_work has completed so there is no need to do the complete/wait. And, if core_alua_do_transition_tg_pt_work was running, instead of trying to sneak in the state change, we just schedule up another core_alua_do_transition_tg_pt_work call. Note that this does not handle a possible race where there are multiple threads call core_alua_do_transition_tg_pt at the same time. I think we need a mutex in target_tg_pt_gp_alua_access_state_store. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:26 +01:00
Mike Christie	892e4f9bc2	target: fix ALUA transition timeout handling [ Upstream commit `d7175373f2` ] The implicit transition time tells initiators the min time to wait before timing out a transition. We currently schedule the transition to occur in tg_pt_gp_implicit_trans_secs seconds so there is no room for delays. If core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata needs to write out info to a remote file, then the initiator can easily time out the operation. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:26 +01:00
Mike Christie	0d34f4770e	target: Use system workqueue for ALUA transitions [ Upstream commit `207ee84133` ] If tcmu-runner is processing a STPG and needs to change the kernel's ALUA state then we cannot use the same work queue for task management requests and ALUA transitions, because we could deadlock. The problem occurs when a STPG times out before tcmu-runner is able to call into target_tg_pt_gp_alua_access_state_store-> core_alua_do_port_transition -> core_alua_do_transition_tg_pt -> queue_work. In this case, the tmr is on the work queue waiting for the STPG to complete, but the STPG transition is now queued behind the waiting tmr. Note: This bug will also be fixed by this patch: http://www.spinics.net/lists/target-devel/msg14560.html which switches the tmr code to use the system workqueues. For both, I am not sure if we need a dedicated workqueue since it is not a performance path and I do not think we need WQ_MEM_RECLAIM to make forward progress to free up memory like the block layer does. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:26 +01:00
Zygo Blaxell	8f60ef9447	btrfs: add missing memset while reading compressed inline extents [ Upstream commit `e1699d2d7b` ] This is a story about 4 distinct (and very old) btrfs bugs. Commit `c8b978188c` ("Btrfs: Add zlib compression support") added three data corruption bugs for inline extents (bugs #1-3). Commit `93c82d5750` ("Btrfs: zero page past end of inline file items") fixed bug #1: uncompressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read. The fix was to add a memset in btrfs_get_extent to zero out the hole. Commit `166ae5a418` ("btrfs: fix inline compressed read err corruption") fixed bug #2: compressed inline extents which contained non-zero bytes might be replaced with zero bytes in some cases. This patch removed an unhelpful memset from uncompress_inline, but the case where memset is required was missed. There is also a memset in the decompression code, but this only covers decompressed data that is shorter than the ram_bytes from the extent ref record. This memset doesn't cover the region between the end of the decompressed data and the end of the page. It has also moved around a few times over the years, so there's no single patch to refer to. This patch fixes bug #3: compressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read (i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/). The fix is the same: zero out the hole in the compressed case too, by putting a memset back in uncompress_inline, but this time with correct parameters. The last and oldest bug, bug #0, is the cause of the offending inline extent/hole/extent pattern. Bug #0 is a subtle and mostly-harmless quirk of behavior somewhere in the btrfs write code. In a few special cases, an inline extent and hole are allowed to persist where they normally would be combined with later extents in the file. A fast reproducer for bug #0 is presented below. A few offending extents are also created in the wild during large rsync transfers with the -S flag. A Linux kernel build (git checkout; make allyesconfig; make -j8) will produce a handful of offending files as well. Once an offending file is created, it can present different content to userspace each time it is read. Bug #0 is at least 4 and possibly 8 years old. I verified every vX.Y kernel back to v3.5 has this behavior. There are fossil records of this bug's effects in commits all the way back to v2.6.32. I have no reason to believe bug #0 wasn't present at the beginning of btrfs compression support in v2.6.29, but I can't easily test kernels that old to be sure. It is not clear whether bug #0 is worth fixing. A fix would likely require injecting extra reads into currently write-only paths, and most of the exceptional cases caused by bug #0 are already handled now. Whether we like them or not, bug #0's inline extents followed by holes are part of the btrfs de-facto disk format now, and we need to be able to read them without data corruption or an infoleak. So enough about bug #0, let's get back to bug #3 (this patch). An example of on-disk structure leading to data corruption found in the wild: item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160 inode generation 50 transid 50 size 47424 nbytes 49141 block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0 flags 0x0(none) item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20 inode ref index 3 namelen 10 name: DB_File.so item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362 inline extent data size 1341 ram 4085 compress(zlib) item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53 extent data disk byte 5367308288 nr 20480 extent data offset 0 nr 45056 ram 45056 extent compression(zlib) Different data appears in userspace during each read of the 11 bytes between 4085 and 4096. The extent in item 63 is not long enough to fill the first page of the file, so a memset is required to fill the space between item 63 (ending at 4085) and item 64 (beginning at 4096) with zero. Here is a reproducer from Liu Bo, which demonstrates another method of creating the same inline extent and hole pattern: Using 'page_poison=on' kernel command line (or enable CONFIG_PAGE_POISONING) run the following: # touch foo # chattr +c foo # xfs_io -f -c "pwrite -W 0 1000" foo # xfs_io -f -c "falloc 4 8188" foo # od -x foo # echo 3 >/proc/sys/vm/drop_caches # od -x foo This produce the following on my box: Correct output: file contains 1000 data bytes followed by zeros: `0000000` cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000 `0001760` 0000 0000 0000 0000 0000 0000 0000 0000 * 0020000 Actual output: the data after the first 1000 bytes will be different each run: `0000000` cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d `0001760` 5f74 6f43 7400 435f 0053 5f74 7363 7400 0002000 435f 0056 5f74 6164 7400 645f 0062 5f74 (...) Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Chris Mason <clm@fb.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:26 +01:00
Olga Kornievskaia	5d460d359a	NFSv4.1 respect server's max size in CREATE_SESSION [ Upstream commit `033853325f` ] Currently client doesn't respect max sizes server returns in CREATE_SESSION. nfs4_session_set_rwsize() gets called and server->rsize, server->wsize are 0 so they never get set to the sizes returned by the server. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:26 +01:00
Daniel Drake	88af4e3477	efi/esrt: Cleanup bad memory map log messages [ Upstream commit `822f5845f7` ] The Intel Compute Stick STCK1A8LFC and Weibu F3C platforms both log 2 error messages during boot: efi: requested map not found. esrt: ESRT header is not in the memory map. Searching the web, this seems to affect many other platforms too. Since these messages are logged as errors, they appear on-screen during the boot process even when using the "quiet" boot parameter used by distros. Demote the ESRT error to a warning so that it does not appear on-screen, and delete the error logging from efi_mem_desc_lookup; both callsites of that function log more specific messages upon failure. Out of curiosity I looked closer at the Weibu F3C. There is no entry in the UEFI-provided memory map which corresponds to the ESRT pointer, but hacking the code to map it anyway, the ESRT does appear to be valid with 2 entries. Signed-off-by: Daniel Drake <drake@endlessm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Acked-by: Peter Jones <pjones@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:26 +01:00
Daniel Borkmann	e30b840d46	perf symbols: Fix symbols__fixup_end heuristic for corner cases [ Upstream commit `e7ede72a6d` ] The current symbols__fixup_end() heuristic for the last entry in the rb tree is suboptimal as it leads to not being able to recognize the symbol in the call graph in a couple of corner cases, for example: i) If the symbol has a start address (f.e. exposed via kallsyms) that is at a page boundary, then the roundup(curr->start, 4096) for the last entry will result in curr->start == curr->end with a symbol length of zero. ii) If the symbol has a start address that is shortly before a page boundary, then also here, curr->end - curr->start will just be very few bytes, where it's unrealistic that we could perform a match against. Instead, change the heuristic to roundup(curr->start, 4096) + 4096, so that we can catch such corner cases and have a better chance to find that specific symbol. It's still just best effort as the real end of the symbol is unknown to us (and could even be at a larger offset than the current range), but better than the current situation. Alexei reported that he recently run into case i) with a JITed eBPF program (these are all page aligned) as the last symbol which wasn't properly shown in the call graph (while other eBPF program symbols in the rb tree were displayed correctly). Since this is a generic issue, lets try to improve the heuristic a bit. Reported-and-Tested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Fixes: `2e538c4a18` ("perf tools: Improve kernel/modules symbol lookup") Link: http://lkml.kernel.org/r/bb5c80d27743be6f12afc68405f1956a330e1bc9.1489614365.git.daniel@iogearbox.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:26 +01:00
Dmitry Vyukov	2a47e7de08	tty: fix data race in tty_ldisc_ref_wait() [ Upstream commit `a4a3e06114` ] tty_ldisc_ref_wait() checks tty->ldisc under tty->ldisc_sem. But if ldisc==NULL it releases them sem and reloads tty->ldisc without holding the sem. This is wrong and can lead to returning non-NULL ldisc without protection. Don't reload tty->ldisc second time. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Cc: syzkaller@googlegroups.com Cc: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:26 +01:00
Dmitry Vyukov	70f450fc86	tty: don't panic on OOM in tty_set_ldisc() [ Upstream commit `5362544beb` ] If tty_ldisc_open() fails in tty_set_ldisc(), it tries to go back to the old discipline or N_TTY. But that can fail as well, in such case it panics. This is not a graceful way to handle OOM. Leave ldisc==NULL if all attempts fail instead. Also use existing tty_ldisc_reinit() helper function instead of tty_ldisc_restore(). Also don't WARN/BUG in tty_ldisc_reinit() if N_TTY fails, which would have the same net effect of bringing kernel down on OOM. Instead print a single line message about what has happened. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Cc: syzkaller@googlegroups.com Cc: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:25 +01:00
David Howells	3d57ec51d2	rxrpc: Ignore BUSY packets on old calls [ Upstream commit `4d4a6ac73e` ] If we receive a BUSY packet for a call we think we've just completed, the packet is handed off to the connection processor to deal with - but the connection processor doesn't expect a BUSY packet and so flags a protocol error. Fix this by simply ignoring the BUSY packet for the moment. The symptom of this may appear as a system call failing with EPROTO. This may be triggered by pressing ctrl-C under some circumstances. This comes about we abort calls due to interruption by a signal (which we shouldn't do, but that's going to be a large fix and mostly in fs/afs/). What happens is that we abort the call and may also abort follow up calls too (this needs offloading somehoe). So we see a transmission of something like the following sequence of packets: DATA for call N ABORT call N DATA for call N+1 ABORT call N+1 in very quick succession on the same channel. However, the peer may have deferred the processing of the ABORT from the call N to a background thread and thus sees the DATA message from the call N+1 coming in before it has cleared the channel. Thus it sends a BUSY packet[]. [] Note that some implementations (OpenAFS, for example) mark the BUSY packet with one plus the callNumber of the call prior to call N. Ordinarily, this would be call N, but there's no requirement for the calls on a channel to be numbered strictly sequentially (the number is required to increase). This is wrong and means that the callNumber in the BUSY packet should be ignored (it really ought to be N+1 since that's what it's in response to). Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:25 +01:00
David Ahern	42b6d6e824	net: mpls: Fix nexthop alive tracking on down events [ Upstream commit `61733c91c4` ] Alive tracking of nexthops can account for a link twice if the carrier goes down followed by an admin down of the same link rendering multipath routes useless. This is similar to `79099aab38` for UNREGISTER events and DOWN events. Fix by tracking number of alive nexthops in mpls_ifdown similar to the logic in mpls_ifup. Checking the flags per nexthop once after all events have been processed is simpler than trying to maintian a running count through all event combinations. Also, WRITE_ONCE is used instead of ACCESS_ONCE to set rt_nhn_alive per a comment from checkpatch: WARNING: Prefer WRITE_ONCE(<FOO>, <BAR>) over ACCESS_ONCE(<FOO>) = <BAR> Fixes: `c89359a42e` ("mpls: support for dead routes") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:25 +01:00
Jack Morgenstein	fd27dbcae9	net/mlx4_core: Avoid delays during VF driver device shutdown [ Upstream commit `4cbe4dac82` ] Some Hypervisors detach VFs from VMs by instantly causing an FLR event to be generated for a VF. In the mlx4 case, this will cause that VF's comm channel to be disabled before the VM has an opportunity to invoke the VF device's "shutdown" method. For such Hypervisors, there is a race condition between the VF's shutdown method and its internal-error detection/reset thread. The internal-error detection/reset thread (which runs every 5 seconds) also detects a disabled comm channel. If the internal-error detection/reset flow wins the race, we still get delays (while that flow tries repeatedly to detect comm-channel recovery). The cited commit fixed the command timeout problem when the internal-error detection/reset flow loses the race. This commit avoids the unneeded delays when the internal-error detection/reset flow wins. Fixes: `d585df1c5c` ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reported-by: Simon Xiao <sixiao@microsoft.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:25 +01:00
Sagi Grimberg	65bfe003dc	nvmet-rdma: Fix a possible uninitialized variable dereference [ Upstream commit `b25634e2a0` ] When handling a new recv command, we grab a new rsp resource and check for the queue state being live. In case the queue is not in live state, we simply restore the rsp back to the free list. However in this flow we didn't set rsp->queue yet, so we cannot dereference it. Instead, make sure to initialize rsp->queue (and other rsp members) as soon as possible so we won't reference uninitialized variables. Reported-by: Yi Zhang <yizhan@redhat.com> Reported-by: Raju Rangoju <rajur@chelsio.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:25 +01:00
Sagi Grimberg	571e47760d	nvmet: confirm sq percpu has scheduled and switched to atomic [ Upstream commit `d11ea004a4` ] percpu_ref_kill is not enough to prevent subsequent percpu_ref_tryget_live from failing. Hence call perfcpu_ref_kill_confirm to make it safe. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:25 +01:00
Sagi Grimberg	af0cee086b	nvme-loop: fix a possible use-after-free when destroying the admin queue [ Upstream commit `e4c5d3762e` ] we need to destroy the nvmet sq and let it finish gracefully before continue to cleanup the queue. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:25 +01:00
David Howells	a8939aac82	afs: Fix abort on signal while waiting for call completion [ Upstream commit `954cd6dc02` ] Fix the way in which a call that's in progress and being waited for is aborted in the case that EINTR is detected. We should be sending RX_USER_ABORT rather than RX_CALL_DEAD as the abort code. Note that since the only two ways out of the loop are if the call completes or if a signal happens, the kill-the-call clause after the loop has finished can only happen in the case of EINTR. This means that we only have one abort case to deal with, not two, and the "KWC" case can never happen and so can be deleted. Note further that simply aborting the call isn't necessarily the best thing here since at this point: the request has been entirely sent and it's likely the server will do the operation anyway - whether we abort it or not. In future, we should punt the handling of the remainder of the call off to a background thread. Reported-by: Marc Dionne <marc.c.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:25 +01:00
David Howells	d43dda0725	afs: Fix afs_kill_pages() [ Upstream commit `7286a35e89` ] Fix afs_kill_pages() in two ways: (1) If a writeback has been partially flushed, then if we try and kill the pages it contains, some of them may no longer be undergoing writeback and end_page_writeback() will assert. Fix this by checking to see whether the page in question is actually undergoing writeback before ending that writeback. (2) The loop that scans for pages to kill doesn't increase the first page index, and so the loop may not terminate, but it will try to process the same pages over and over again. Fix this by increasing the first page index to one after the last page we processed. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:25 +01:00
David Howells	856bb4b609	afs: Fix page leak in afs_write_begin() [ Upstream commit `6d06b0d252` ] afs_write_begin() leaks a ref and a lock on a page if afs_fill_page() fails. Fix the leak by unlocking and releasing the page in the error path. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:25 +01:00
Marc Dionne	833acb3e09	afs: Populate and use client modification time [ Upstream commit `ab94f5d0dd` ] The inode timestamps should be set from the client time in the status received from the server, rather than the server time which is meant for internal server use. Set AFS_SET_MTIME and populate the mtime for operations that take an input status, such as file/dir creation and StoreData. If an input time is not provided the server will set the vnode times based on the current server time. In a situation where the server has some skew with the client, this could lead to the client seeing a timestamp in the future for a file that it just created or wrote. Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:24 +01:00
David Howells	a3e7a29abf	afs: Better abort and net error handling [ Upstream commit `70af0e3bd6` ] If we receive a network error, a remote abort or a protocol error whilst we're still transmitting data, make sure we return an appropriate error to the caller rather than ESHUTDOWN or ECONNABORTED. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:24 +01:00
David Howells	ab23906116	afs: Invalid op ID should abort with RXGEN_OPCODE [ Upstream commit `1157f153f3` ] When we are given an invalid operation ID, we should abort that with RXGEN_OPCODE rather than RX_INVALID_OPERATION. Also map RXGEN_OPCODE to -ENOTSUPP. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:24 +01:00
David Howells	972e7b7cbf	afs: Fix the maths in afs_fs_store_data() [ Upstream commit `146a119278` ] afs_fs_store_data() works out of the size of the write it's going to make, but it uses 32-bit unsigned subtraction in one place that gets automatically cast to loff_t. However, if to < offset, then the number goes negative, but as the result isn't signed, this doesn't get sign-extended to 64-bits when placed in a loff_t. Fix by casting the operands to loff_t. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:24 +01:00
Tina Ruchandani	9329ae4cb1	afs: Prevent callback expiry timer overflow [ Upstream commit `56e714312e` ] get_seconds() returns real wall-clock seconds. On 32-bit systems this value will overflow in year 2038 and beyond. This patch changes afs_vnode record to use ktime_get_real_seconds() instead, for the fields cb_expires and cb_expires_at. Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:24 +01:00
Tina Ruchandani	7da1b85a75	afs: Migrate vlocation fields to 64-bit [ Upstream commit `8a79790bf0` ] get_seconds() returns real wall-clock seconds. On 32-bit systems this value will overflow in year 2038 and beyond. This patch changes afs's vlocation record to use ktime_get_real_seconds() instead, for the fields time_of_death and update_at. Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:24 +01:00
David Howells	7286fad157	afs: Flush outstanding writes when an fd is closed [ Upstream commit `58fed94dfb` ] Flush outstanding writes in afs when an fd is closed. This is what NFS and CIFS do. Reported-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:24 +01:00
Marc Dionne	eaaad7646d	afs: Deal with an empty callback array [ Upstream commit `bcd89270d9` ] Servers may send a callback array that is the same size as the FID array, or an empty array. If the callback count is 0, the code would attempt to read (fid_count * 12) bytes of data, which would fail and result in an unmarshalling error. This would lead to stale data for remotely modified files or directories. Store the callback array size in the internal afs_call structure and use that to determine the amount of data to read. Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:24 +01:00
Marc Dionne	900048089c	afs: Adjust mode bits processing [ Upstream commit `627f46943f` ] Mode bits for an afs file should not be enforced in the usual way. For files, the absence of user bits can restrict file access with respect to what is granted by the server. These bits apply regardless of the owner or the current uid; the rest of the mode bits (group, other) are ignored. Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:24 +01:00
Marc Dionne	ba47c15974	afs: Populate group ID from vnode status [ Upstream commit `6186f0788b` ] The group was hard coded to GLOBAL_ROOT_GID; use the group ID that was received from the server. Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:24 +01:00
David Howells	c250fae9ad	afs: Fix missing put_page() [ Upstream commit `29c8bbbd6e` ] In afs_writepages_region(), inside the loop where we find dirty pages to deal with, one of the if-statements is missing a put_page(). Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:23 +01:00
Alex Deucher	b29c7b7c62	drm/radeon: reinstate oland workaround for sclk [ Upstream commit `66822d815a` ] Higher sclks seem to be unstable on some boards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=100222 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:23 +01:00
yong mao	2a84fce9b0	mmc: mediatek: Fixed bug where clock frequency could be set wrong [ Upstream commit `40ceda09c8` ] This patch can fix two issues: Issue 1: In previous code, div may be overflow when setting clock frequency as f_min. We can use DIV_ROUND_UP to fix this boundary related issue. Issue 2: In previous code, we can not set the correct clock frequency when div equals 0xff. Signed-off-by: Yong Mao <yong.mao@mediatek.com> Signed-off-by: Chaotian Jing <chaotian.jing@mediatek.com> Reviewed-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:23 +01:00
Steven Rostedt (VMware)	28714e962a	sched/deadline: Use deadline instead of period when calculating overflow [ Upstream commit `2317d5f1c3` ] I was testing Daniel's changes with his test case, and tweaked it a little. Instead of having the runtime equal to the deadline, I increased the deadline ten fold. Daniel's test case had: attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms / attr.sched_deadline = 2 1000 * 1000; /* 2 ms / attr.sched_period = 2 1000 * 1000 * 1000; /* 2 s / To make it more interesting, I changed it to: attr.sched_runtime = 2 1000 * 1000; /* 2 ms / attr.sched_deadline = 20 1000 * 1000; /* 20 ms / attr.sched_period = 2 1000 * 1000 * 1000; /* 2 s */ The results were rather surprising. The behavior that Daniel's patch was fixing came back. The task started using much more than .1% of the CPU. More like 20%. Looking into this I found that it was due to the dl_entity_overflow() constantly returning true. That's because it uses the relative period against relative runtime vs the absolute deadline against absolute runtime. runtime / (deadline - t) > dl_runtime / dl_period There's even a comment mentioning this, and saying that when relative deadline equals relative period, that the equation is the same as using deadline instead of period. That comment is backwards! What we really want is: runtime / (deadline - t) > dl_runtime / dl_deadline We care about if the runtime can make its deadline, not its period. And then we can say "when the deadline equals the period, the equation is the same as using dl_period instead of dl_deadline". After correcting this, now when the task gets enqueued, it can throttle correctly, and Daniel's fix to the throttling of sleeping deadline tasks works even when the runtime and deadline are not the same. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luca Abeni <luca.abeni@santannapisa.it> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Romulo Silva de Oliveira <romulo.deoliveira@ufsc.br> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it> Link: http://lkml.kernel.org/r/02135a27f1ae3fe5fd032568a5a2f370e190e8d7.1488392936.git.bristot@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:23 +01:00
Daniel Bristot de Oliveira	a2e29113f1	sched/deadline: Throttle a constrained deadline task activated after the deadline [ Upstream commit `df8eac8caf` ] During the activation, CBS checks if it can reuse the current task's runtime and period. If the deadline of the task is in the past, CBS cannot use the runtime, and so it replenishes the task. This rule works fine for implicit deadline tasks (deadline == period), and the CBS was designed for implicit deadline tasks. However, a task with constrained deadline (deadine < period) might be awakened after the deadline, but before the next period. In this case, replenishing the task would allow it to run for runtime / deadline. As in this case deadline < period, CBS enables a task to run for more than the runtime / period. In a very loaded system, this can cause a domino effect, making other tasks miss their deadlines. To avoid this problem, in the activation of a constrained deadline task after the deadline but before the next period, throttle the task and set the replenishing timer to the begin of the next period, unless it is boosted. Reproducer: --------------- %< --------------- int main (int argc, char *argv) { int ret; int flags = 0; unsigned long l = 0; struct timespec ts; struct sched_attr attr; memset(&attr, 0, sizeof(attr)); attr.size = sizeof(attr); attr.sched_policy = SCHED_DEADLINE; attr.sched_runtime = 2 1000 * 1000; /* 2 ms / attr.sched_deadline = 2 1000 * 1000; /* 2 ms / attr.sched_period = 2 1000 * 1000 * 1000; /* 2 s / ts.tv_sec = 0; ts.tv_nsec = 2000 1000; /* 2 ms / ret = sched_setattr(0, &attr, flags); if (ret < 0) { perror("sched_setattr"); exit(-1); } for(;;) { / XXX: you may need to adjust the loop / for (l = 0; l < 150000; l++); / * The ideia is to go to sleep right before the deadline * and then wake up before the next period to receive * a new replenishment. */ nanosleep(&ts, NULL); } exit(0); } --------------- >% --------------- On my box, this reproducer uses almost 50% of the CPU time, which is obviously wrong for a task with 2/2000 reservation. Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luca Abeni <luca.abeni@santannapisa.it> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Romulo Silva de Oliveira <romulo.deoliveira@ufsc.br> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it> Link: http://lkml.kernel.org/r/edf58354e01db46bf42df8d2dd32418833f68c89.1488392936.git.bristot@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:23 +01:00
Daniel Bristot de Oliveira	9cc56a00ea	sched/deadline: Make sure the replenishment timer fires in the next period [ Upstream commit `5ac69d3778` ] Currently, the replenishment timer is set to fire at the deadline of a task. Although that works for implicit deadline tasks because the deadline is equals to the begin of the next period, that is not correct for constrained deadline tasks (deadline < period). For instance: f.c: --------------- %< --------------- int main (void) { for(;;); } --------------- >% --------------- # gcc -o f f.c # trace-cmd record -e sched:sched_switch \ -e syscalls:sys_exit_sched_setattr \ chrt -d --sched-runtime 490000000 \ --sched-deadline 500000000 \ --sched-period 1000000000 0 ./f # trace-cmd report \| grep "{pid of ./f}" After setting parameters, the task is replenished and continue running until being throttled: f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0 The task is throttled after running 492318 ms, as expected: f-11295 [003] 13322.606094: sched_switch: f:11295 [-1] R ==> watchdog/3:32 [0] But then, the task is replenished 500719 ms after the first replenishment: <idle>-0 [003] 13322.614495: sched_switch: swapper/3:0 [120] R ==> f:11295 [-1] Running for 490277 ms: f-11295 [003] 13323.104772: sched_switch: f:11295 [-1] R ==> swapper/3:0 [120] Hence, in the first period, the task runs 2 * runtime, and that is a bug. During the first replenishment, the next deadline is set one period away. So the runtime / period starts to be respected. However, as the second replenishment took place in the wrong instant, the next replenishment will also be held in a wrong instant of time. Rather than occurring in the nth period away from the first activation, it is taking place in the (nth period - relative deadline). Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Luca Abeni <luca.abeni@santannapisa.it> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Juri Lelli <juri.lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Romulo Silva de Oliveira <romulo.deoliveira@ufsc.br> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it> Link: http://lkml.kernel.org/r/ac50d89887c25285b47465638354b63362f8adff.1488392936.git.bristot@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:23 +01:00
Wanpeng Li	0a4d4dac5e	sched/deadline: Add missing update_rq_clock() in dl_task_timer() [ Upstream commit `dcc3b5ffe1` ] The following warning can be triggered by hot-unplugging the CPU on which an active SCHED_DEADLINE task is running on: ------------[ cut here ]------------ WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40 rq->clock_update_flags < RQCF_ACT_SKIP CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016 Call Trace: <IRQ> dump_stack+0x85/0xc4 __warn+0x172/0x1b0 warn_slowpath_fmt+0xb4/0xf0 ? __warn+0x1b0/0x1b0 ? debug_check_no_locks_freed+0x2c0/0x2c0 ? cpudl_set+0x3d/0x2b0 replenish_dl_entity+0x71e/0xc40 enqueue_task_dl+0x2ea/0x12e0 ? dl_task_timer+0x777/0x990 ? __hrtimer_run_queues+0x270/0xa50 dl_task_timer+0x316/0x990 ? enqueue_task_dl+0x12e0/0x12e0 ? enqueue_task_dl+0x12e0/0x12e0 __hrtimer_run_queues+0x270/0xa50 ? hrtimer_cancel+0x20/0x20 ? hrtimer_interrupt+0x119/0x600 hrtimer_interrupt+0x19c/0x600 ? trace_hardirqs_off+0xd/0x10 local_apic_timer_interrupt+0x74/0xe0 smp_apic_timer_interrupt+0x76/0xa0 apic_timer_interrupt+0x93/0xa0 The DL task will be migrated to a suitable later deadline rq once the DL timer fires and currnet rq is offline. The rq clock of the new rq should be updated. This patch fixes it by updating the rq clock after holding the new rq's rq lock. Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1488865888-15894-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:23 +01:00
Sara Sharon	8d3a318194	iwlwifi: mvm: cleanup pending frames in DQA mode [ Upstream commit `9a3fcf912e` ] When a station is asleep, the fw will set it as "asleep". All queues that are used only by one station will be stopped by the fw. In pre-DQA mode this was relevant for aggregation queues. However, in DQA mode a queue is owned by one station only, so all queues will be stopped. As a result, we don't expect to get filtered frames back to mac80211 and don't have to maintain the entire pending_frames state logic, the same way as we do in aggregations. The correct behavior is to align DQA behavior with the aggregation queue behaviour pre-DQA: - Don't count pending frames. - Let mac80211 know we have frames in these queues so that it can properly handle trigger frames. When a trigger frame is received, mac80211 tells the driver to send frames from the queues using release_buffered_frames. The driver will tell the fw to let frames out even if the station is asleep. This is done by iwl_mvm_sta_modify_sleep_tx_count. Reported-and-tested-by: Jens Axboe <axboe@kernel.dk> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:23 +01:00
Vitaly Kuznetsov	a524bb57dd	Drivers: hv: util: move waiting for release to hv_utils_transport itself [ Upstream commit `e9c18ae6eb` ] Waiting for release_event in all three drivers introduced issues on release as on_reset() hook is not always called. E.g. if the device was never opened we will never get the completion. Move the waiting code to hvutil_transport_destroy() and make sure it is only called when the device is open. hvt->lock serialization should guarantee the absence of races. Fixes: `5a66fecbf6` ("Drivers: hv: util: kvp: Fix a rescind processing issue") Fixes: `20951c7535` ("Drivers: hv: util: Fcopy: Fix a rescind processing issue") Fixes: `d77044d142` ("Drivers: hv: util: Backup: Fix a rescind processing issue") Reported-by: Dexuan Cui <decui@microsoft.com> Tested-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:23 +01:00
Alex Deucher	da626b13ce	drm/radeon/si: add dpm quirk for Oland [ Upstream commit `0f424de1fd` ] OLAND 0x1002:0x6604 0x1028:0x066F 0x00 seems to have problems with higher sclks. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:22 +01:00
Taku Izumi	1867eb8050	fjes: Fix wrong netdevice feature flags [ Upstream commit `fe8daf5fa7` ] This patch fixes netdev->features for Extended Socket network device. Currently Extended Socket network device's netdev->feature claims NETIF_F_HW_CSUM, however this is completely wrong. There's no feature of checksum offloading. That causes invalid TCP/UDP checksum and packet rejection when IP forwarding from Extended Socket network device to other network device. NETIF_F_HW_CSUM should be omitted. Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:22 +01:00
Don Brace	91510a623b	scsi: hpsa: do not timeout reset operations [ Upstream commit `2ef2884980` ] Resets can take longer than DEFAULT_TIMEOUT. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:22 +01:00
Don Brace	0f07e76111	scsi: hpsa: limit outstanding rescans [ Upstream commit `87b9e6aa87` ] Avoid rescan storms. No need to queue another if one is pending. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:22 +01:00
Don Brace	c81410a435	scsi: hpsa: update check for logical volume status [ Upstream commit `85b29008d8` ] - Add in a new case for volume offline. Resolves internal testing bug for multilun array management. - Return correct status for failed TURs. Reviewed-by: Scott Benesh <scott.benesh@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:22 +01:00
Kuninori Morimoto	8652baa5a3	ASoC: rcar: clear DE bit only in PDMACHCR when it stops [ Upstream commit `62a10498af` ] R-Car datasheet indicates "Clear DE in PDMACHCR" for transfer stop, but current code clears all bits in PDMACHCR. Because of this, DE bit might never been cleared, and it causes CMD overflow. This patch fixes this issue. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Tested-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:22 +01:00
Stafford Horne	fd2530a4ee	openrisc: fix issue handling 8 byte get_user calls [ Upstream commit `154e67cd8e` ] Was getting the following error with allmodconfig: ERROR: "__get_user_bad" [lib/test_user_copy.ko] undefined! This was simply a missing break statement, causing an unwanted fall through. Signed-off-by: Stafford Horne <shorne@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:22 +01:00
Alexander Shishkin	18b39b61b2	intel_th: pci: Add Gemini Lake support [ Upstream commit `340837f985` ] This adds Intel(R) Trace Hub PCI ID for Gemini Lake SOC. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:22 +01:00
Arnd Bergmann	3544f57578	drm: amd: remove broken include path [ Upstream commit `655d9ca9ac` ] The AMD ACP driver adds "-I../acp -I../acp/include" to the gcc command line, which makes no sense, since these are evaluated relative to the build directory. When we build with "make W=1", they instead cause a warning: cc1: error: ../acp/: No such file or directory [-Werror=missing-include-dirs] cc1: error: ../acp/include: No such file or directory [-Werror=missing-include-dirs] cc1: all warnings being treated as errors ../scripts/Makefile.build:289: recipe for target 'drivers/gpu/drm/amd/amdgpu/amdgpu_drv.o' failed ../scripts/Makefile.build:289: recipe for target 'drivers/gpu/drm/amd/amdgpu/amdgpu_device.o' failed ../scripts/Makefile.build:289: recipe for target 'drivers/gpu/drm/amd/amdgpu/amdgpu_kms.o' failed This removes the subdir-ccflags variable that evidently did not serve any purpose here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:21 +01:00
Ram Amrani	4c9c097127	qed: Fix interrupt flags on Rx LL2 [ Upstream commit `1df2adedcc` ] Before iterating over the the LL2 Rx ring, the ring's spinlock is taken via spin_lock_irqsave(). The actual processing of the packet [including handling by the protocol driver] is done without said lock, so qed releases the spinlock and re-claims it afterwards. Problem is that the final spin_lock_irqrestore() at the end of the iteration uses the original flags saved from the initial irqsave() instead of the flags from the most recent irqsave(). So it's possible that the interrupt status would be incorrect at the end of the processing. Fixes: `0a7fb11c23` ("qed: Add Light L2 support"); CC: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:21 +01:00
Mintz, Yuval	ac04ab9624	qed: Fix mapping leak on LL2 rx flow [ Upstream commit `752ecb2da1` ] When receiving an Rx LL2 packet, qed fails to unmap the previous buffer. Fixes: `0a7fb11c23` ("qed: Add Light L2 support"); Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:21 +01:00
Ram Amrani	8de6d7b28d	qed: Align CIDs according to DORQ requirement [ Upstream commit `f3e48119b9` ] The Doorbell HW block can be configured at a granularity of 16 x CIDs, so we need to make sure that the actual number of CIDs configured would be a multiplication of 16. Today, when RoCE is enabled - given that the number is unaligned, doorbelling the higher CIDs would fail to reach the firmware and would eventually timeout. Fixes: `dbb799c397` ("qed: Initialize hardware for new protocols") Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:21 +01:00
Jiri Pirko	fddc3df764	mlxsw: reg: Fix SPVMLR max record count [ Upstream commit `e9093b1183` ] The num_rec field is 8 bit, so the maximal count number is 255. This fixes vlans learning not being enabled for wider ranges than 255. Fixes: `a4feea74cd` ("mlxsw: reg: Add Switch Port VLAN MAC Learning register definition") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:21 +01:00
Jiri Pirko	4c8b4e60b5	mlxsw: reg: Fix SPVM max record count [ Upstream commit `f004ec065b` ] The num_rec field is 8 bit, so the maximal count number is 255. This fixes vlans not being enabled for wider ranges than 255. Fixes: `b2e345f9a4` ("mlxsw: reg: Add Switch Port VID and Switch Port VLAN Membership registers definitions") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:21 +01:00
Vlad Yasevich	6c548e90a0	net: Resend IGMP memberships upon peer notification. [ Upstream commit `37c343b4f4` ] When we notify peers of potential changes, it's also good to update IGMP memberships. For example, during VM migration, updating IGMP memberships will redirect existing multicast streams to the VM at the new location. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:21 +01:00
Arnd Bergmann	889163d75f	irqchip/mvebu-odmi: Select GENERIC_MSI_IRQ_DOMAIN [ Upstream commit `fa23b9d1b8` ] This driver uses the MSI domain but has no strict dependency on PCI_MSI, so we may run into a build failure when CONFIG_GENERIC_MSI_IRQ_DOMAIN is disabled: drivers/irqchip/irq-mvebu-odmi.c:152:15: error: variable 'odmi_msi_ops' has initializer but incomplete type static struct msi_domain_ops odmi_msi_ops = { ^~~~~~~~~~~~~~ drivers/irqchip/irq-mvebu-odmi.c:155:15: error: variable 'odmi_msi_domain_info' has initializer but incomplete type static struct msi_domain_info odmi_msi_domain_info = { ^~~~~~~~~~~~~~~ drivers/irqchip/irq-mvebu-odmi.c:156:3: error: 'struct msi_domain_info' has no member named 'flags' .flags = (MSI_FLAG_USE_DEF_DOM_OPS \| MSI_FLAG_USE_DEF_CHIP_OPS), ^~~~~ drivers/irqchip/irq-mvebu-odmi.c:156:12: error: 'MSI_FLAG_USE_DEF_DOM_OPS' undeclared here (not in a function) .flags = (MSI_FLAG_USE_DEF_DOM_OPS \| MSI_FLAG_USE_DEF_CHIP_OPS), ^~~~~~~~~~~~~~~~~~~~~~~~ drivers/irqchip/irq-mvebu-odmi.c:156:39: error: 'MSI_FLAG_USE_DEF_CHIP_OPS' undeclared here (not in a function); did you mean 'MSI_FLAG_USE_DEF_DOM_OPS'? Selecting the option from this driver seems to solve this nicely, though I could not find any other instance of this in irqchip drivers. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:21 +01:00
Matthias Kaehlcke	e30ccb5f1c	dmaengine: Fix array index out of bounds warning in __get_unmap_pool() [ Upstream commit `23f963e91f` ] This fixes the following warning when building with clang and CONFIG_DMA_ENGINE_RAID=n : drivers/dma/dmaengine.c:1102:11: error: array index 2 is past the end of the array (which contains 1 element) [-Werror,-Warray-bounds] return &unmap_pool[2]; ^ ~ drivers/dma/dmaengine.c:1083:1: note: array 'unmap_pool' declared here static struct dmaengine_unmap_pool unmap_pool[] = { ^ drivers/dma/dmaengine.c:1104:11: error: array index 3 is past the end of the array (which contains 1 element) [-Werror,-Warray-bounds] return &unmap_pool[3]; ^ ~ drivers/dma/dmaengine.c:1083:1: note: array 'unmap_pool' declared here static struct dmaengine_unmap_pool unmap_pool[] = { Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:21 +01:00
Johan Hovold	46cbe3f51c	net: wimax/i2400m: fix NULL-deref at probe [ Upstream commit `6e526fdff7` ] Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. The endpoints are specifically dereferenced in the i2400m_bootrom_init path during probe (e.g. in i2400mu_tx_bulk_out). Fixes: `f398e4240f` ("i2400m/USB: probe/disconnect, dev init/shutdown and reset backends") Cc: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:20 +01:00
Tahsin Erdogan	2e70c4d5de	writeback: fix memory leak in wb_queue_work() [ Upstream commit `4a3a485b1e` ] When WB_registered flag is not set, wb_queue_work() skips queuing the work, but does not perform the necessary clean up. In particular, if work->auto_free is true, it should free the memory. The leak condition can be reprouced by following these steps: mount /dev/sdb /mnt/sdb /* In qemu console: device_del sdb */ umount /dev/sdb Above will result in a wb_queue_work() call on an unregistered wb and thus leak memory. Reported-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Tahsin Erdogan <tahsin@google.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:20 +01:00
Sagi Grimberg	d28046fb8c	blk-mq: Fix tagset reinit in the presence of cpu hot-unplug [ Upstream commit `0067d4b020` ] In case cpu was unplugged, we need to make sure not to assume that the tags for that cpu are still allocated. so check for null tags when reinitializing a tagset. Reported-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:20 +01:00
Hiroyuki Yokoyama	143d13d1e6	ASoC: rsnd: fix sound route path when using SRC6/SRC9 [ Upstream commit `a1c2ff5372` ] This patch fixes the problem that the missing value of the route path setting table and incorrect values are set in the CMD_ROUTE_SELECT register. Signed-off-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com> [Kuninori: shared data on MIX and non-MIX case] Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:20 +01:00
Florian Westphal	97b75dad9d	netfilter: bridge: honor frag_max_size when refragmenting [ Upstream commit `4ca60d08cb` ] consider a bridge with mtu 9000, but end host sending smaller packets to another host with mtu < 9000. In this case, after reassembly, bridge+defrag would refragment, and then attempt to send the reassembled packet as long as it was below 9k. Instead we have to cap by the largest fragment size seen. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:20 +01:00
Tomi Valkeinen	38780b9ae4	drm/omap: fix dmabuf mmap for dma_alloc'ed buffers [ Upstream commit `9fa1d75372` ] omap_gem_dmabuf_mmap() returns an error (with a WARN) when called for a buffer which is allocated with dma_alloc_*(). This prevents dmabuf mmap from working on SoCs without DMM, e.g. AM4 and OMAP3. I could not find any reason for omap_gem_dmabuf_mmap() rejecting such buffers, and just removing the if() fixes the limitation. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:20 +01:00
Dmitry Torokhov	8fb782bbd2	Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list [ Upstream commit `a4c2a13129` ] TUXEDO BU1406 does not implement active multiplexing mode properly, and takes around 550 ms in i8042_set_mux_mode(). Given that the device does not have external AUX port, there is no downside in disabling the MUX mode. Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Suggested-by: Vojtech Pavlik <vojtech@suse.cz> Reviewed-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:20 +01:00
NeilBrown	817f60ccf7	NFSD: fix nfsd_reset_versions for NFSv4. [ Upstream commit `800a938f0b` ] If you write "-2 -3 -4" to the "versions" file, it will notice that no versions are enabled, and nfsd_reset_versions() is called. This enables all major versions, not no minor versions. So we lose the invariant that NFSv4 is only advertised when at least one minor is enabled. Fix the code to explicitly enable minor versions for v4, change it to use nfsd_vers() to test and set, and simplify the code. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:20 +01:00
NeilBrown	0154269f9c	NFSD: fix nfsd_minorversion(.., NFSD_AVAIL) [ Upstream commit `928c6fb3a9` ] Current code will return 1 if the version is supported, and -1 if it isn't. This is confusing and inconsistent with the one place where this is used. So change to return 1 if it is supported, and zero if not. i.e. an error is never returned. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:20 +01:00
Dave Airlie	063c753ef7	drm/amdgpu: fix parser init error path to avoid crash in parser fini [ Upstream commit `607523d19c` ] If we don't reset the chunk info in the error path, the subsequent fini path will double free. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:19 +01:00
Oleksandr Tyshchenko	3d40364d33	iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it [ Upstream commit `a03849e721` ] Do a check for already installed leaf entry at the current level before dereferencing it in order to avoid walking the page table down with wrong pointer to the next level. Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> CC: Will Deacon <will.deacon@arm.com> CC: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:19 +01:00
Daniel Jurgens	721c136ac2	net/mlx5: Don't save PCI state when PCI error is detected [ Upstream commit `5d47f6c89d` ] When a PCI error is detected the PCI state could be corrupt, don't save it in that flow. Save the state after initialization. After restoring the PCI state during slot reset save it again, restoring the state destroys the previously saved state info. Fixes: `05ac2c0b74` ('net/mlx5: Fix race between PCI error handlers and health work') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:19 +01:00
Paul Blakey	248cbd97be	net/mlx5: Fix create autogroup prev initializer [ Upstream commit `af36370569` ] The autogroups list is a list of non overlapping group boundaries sorted by their start index. If the autogroups list wasn't empty and an empty group slot was found at the start of the list, the new group was added to the end of the list instead of the beginning, as the prev initializer was incorrect. When this was repeated, it caused multiple groups to have overlapping boundaries. Fixed that by correctly initializing the prev pointer to the start of the list. Fixes: `eccec8da3b` ('net/mlx5: Keep autogroups list ordered') Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:19 +01:00
David Howells	515d78dc0a	rxrpc: Wake up the transmitter if Rx window size increases on the peer [ Upstream commit `702f2ac87a` ] The RxRPC ACK packet may contain an extension that includes the peer's current Rx window size for this call. We adjust the local Tx window size to match. However, the transmitter can stall if the receive window is reduced to 0 by the peer and then reopened. This is because the normal way that the transmitter is re-energised is by dropping something out of our Tx queue and thus making space. When a single gap is made, the transmitter is woken up. However, because there's nothing in the Tx queue at this point, this doesn't happen. To fix this, perform a wake_up() any time we see the peer's Rx window size increasing. The observable symptom is that calls start failing on ETIMEDOUT and the following: kAFS: SERVER DEAD state=-62 appears in dmesg. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:19 +01:00
Doug Berger	e85b9bc29b	net: bcmgenet: Power up the internal PHY before probing the MII [ Upstream commit `6be371b053` ] When using the internal PHY it must be powered up when the MII is probed or the PHY will not be detected. Since the PHY is powered up at reset this has not been a problem. However, when the kernel is restarted with kexec the PHY will likely be powered down when the kernel starts so it will not be detected and the Ethernet link will not be established. This commit explicitly powers up the internal PHY when the GENET driver is probed to correct this behavior. Fixes: `1c1008c793` ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:19 +01:00
Doug Berger	f9ac24794f	net: bcmgenet: synchronize irq0 status between the isr and task [ Upstream commit `07c52d6a0b` ] Add a spinlock to ensure that irq0_stat is not unintentionally altered as the result of preemption. Also removed unserviced irq0 interrupts and removed irq1_stat since there is no bottom half service for those interrupts. Fixes: `1c1008c793` ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:19 +01:00
Doug Berger	4c3727f6ad	net: bcmgenet: power down internal phy if open or resume fails [ Upstream commit `7627409cc4` ] Since the internal PHY is powered up during the open and resume functions it should be powered back down if the functions fail. Fixes: `1c1008c793` ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:19 +01:00
Doug Berger	66e522ab02	net: bcmgenet: reserved phy revisions must be checked first [ Upstream commit `eca4bad734` ] The reserved gphy_rev value of 0x01ff must be tested before the old or new scheme for GPHY major versioning are tested, otherwise it will be treated as 0xff00 according to the old scheme. Fixes: `b04a2f5b9f` ("net: bcmgenet: add support for new GENET PHY revision scheme") Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:19 +01:00
Doug Berger	dc8d63c43a	net: bcmgenet: correct MIB access of UniMAC RUNT counters [ Upstream commit `1ad3d225e5` ] The gap between the Tx status counters and the Rx RUNT counters is now being added to allow correct reporting of the registers. Fixes: `1c1008c793` ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:18 +01:00
Doug Berger	bb5c42a5b1	net: bcmgenet: correct the RBUF_OVFL_CNT and RBUF_ERR_CNT MIB values [ Upstream commit `ffff71328a` ] The location of the RBUF overflow and error counters has moved between different version of the GENET MAC. This commit corrects the driver to read from the correct locations depending on the version of the GENET MAC. Fixes: `1c1008c793` ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:18 +01:00
Michael Chan	72cd0c3f66	bnxt_en: Ignore 0 value in autoneg supported speed from firmware. [ Upstream commit `520ad89a54` ] In some situations, the firmware will return 0 for autoneg supported speed. This may happen if the firmware detects no SFP module, for example. The driver should ignore this so that we don't end up with an invalid autoneg setting with nothing advertised. When SFP module is inserted, we'll get the updated settings from firmware at that time. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:18 +01:00
Alexander Potapenko	ae0ebdba96	net: initialize msg.msg_flags in recvfrom [ Upstream commit `9f138fa609` ] KMSAN reports a use of uninitialized memory in put_cmsg() because msg.msg_flags in recvfrom haven't been initialized properly. The flag values don't affect the result on this path, but it's still a good idea to initialize them explicitly. Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:18 +01:00
Andrea Arcangeli	6783015096	userfaultfd: selftest: vm: allow to build in vm/ directory [ Upstream commit `46aa6a302b` ] linux/tools/testing/selftests/vm $ make gcc -Wall -I ../../../../usr/include compaction_test.c -lrt -o /compaction_test /usr/lib/gcc/x86_64-pc-linux-gnu/4.9.4/../../../../x86_64-pc-linux-gnu/bin/ld: cannot open output file /compaction_test: Permission denied collect2: error: ld returned 1 exit status make: *** [../lib.mk:54: /compaction_test] Error 1 Since commit `a8ba798bc8` ("selftests: enable O and KBUILD_OUTPUT") selftests/vm build fails if run from the "selftests/vm" directory, but it works in the selftests/ directory. It's quicker to be able to do a local vm-only build after a tree wipe and this patch allows for it again. Link: http://lkml.kernel.org/r/20170302173738.18994-4-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:18 +01:00
Andrea Arcangeli	275314e90c	userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE [ Upstream commit `6bbc4a4144` ] __do_fault assumes vmf->page has been initialized and is valid if VM_FAULT_NOPAGE is not returned by vma->vm_ops->fault(vma, vmf). handle_userfault() in turn should return VM_FAULT_NOPAGE if it doesn't return VM_FAULT_SIGBUS or VM_FAULT_RETRY (the other two possibilities). This VM_FAULT_NOPAGE case is only invoked when signal are pending and it didn't matter for anonymous memory before. It only started to matter since shmem was introduced. hugetlbfs also takes a different path and doesn't exercise __do_fault. Link: http://lkml.kernel.org/r/20170228154201.GH5816@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:18 +01:00
Guoqing Jiang	9bcd15bdfb	md-cluster: free md_cluster_info if node leave cluster [ Upstream commit `9c8043f337` ] To avoid memory leak, we need to free the cinfo which is allocated when node join cluster. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:18 +01:00
Chunfeng Yun	9841d7b08f	usb: xhci-mtk: check hcc_params after adding primary hcd [ Upstream commit `94a631d91a` ] hcc_params is set in xhci_gen_setup() called from usb_add_hcd(), so checks the Maximum Primary Stream Array Size in the hcc_params register after adding primary hcd. Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:18 +01:00
Radim Krčmář	215df1f355	KVM: nVMX: do not warn when MSR bitmap address is not backed [ Upstream commit `05d8d34611` ] Before trying to do nested_get_page() in nested_vmx_merge_msr_bitmap(), we have already checked that the MSR bitmap address is valid (4k aligned and within physical limits). SDM doesn't specify what happens if the there is no memory mapped at the valid address, but Intel CPUs treat the situation as if the bitmap was configured to trap all MSRs. KVM already does that by returning false and a correct handling doesn't need the guest-trigerrable warning that was reported by syzkaller: (The warning was originally there to catch some possible bugs in nVMX.) ------------[ cut here ]------------ WARNING: CPU: 0 PID: 7832 at arch/x86/kvm/vmx.c:9709 nested_vmx_merge_msr_bitmap arch/x86/kvm/vmx.c:9709 [inline] WARNING: CPU: 0 PID: 7832 at arch/x86/kvm/vmx.c:9709 nested_get_vmcs12_pages+0xfb6/0x15c0 arch/x86/kvm/vmx.c:9640 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 7832 Comm: syz-executor1 Not tainted 4.10.0+ #229 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 panic+0x1fb/0x412 kernel/panic.c:179 __warn+0x1c4/0x1e0 kernel/panic.c:540 warn_slowpath_null+0x2c/0x40 kernel/panic.c:583 nested_vmx_merge_msr_bitmap arch/x86/kvm/vmx.c:9709 [inline] nested_get_vmcs12_pages+0xfb6/0x15c0 arch/x86/kvm/vmx.c:9640 enter_vmx_non_root_mode arch/x86/kvm/vmx.c:10471 [inline] nested_vmx_run+0x6186/0xaab0 arch/x86/kvm/vmx.c:10561 handle_vmlaunch+0x1a/0x20 arch/x86/kvm/vmx.c:7312 vmx_handle_exit+0xfc0/0x3f00 arch/x86/kvm/vmx.c:8526 vcpu_enter_guest arch/x86/kvm/x86.c:6982 [inline] vcpu_run arch/x86/kvm/x86.c:7044 [inline] kvm_arch_vcpu_ioctl_run+0x1418/0x4840 arch/x86/kvm/x86.c:7205 kvm_vcpu_ioctl+0x673/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2570 Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> [Jim Mattson explained the bare metal behavior: "I believe this behavior would be documented in the chipset data sheet rather than the SDM, since the chipset returns all 1s for an unclaimed read."] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:17 +01:00
Javier Martinez Canillas	50fc2d4152	usb: phy: isp1301: Add OF device ID table [ Upstream commit `fd567653bd` ] The driver doesn't have a struct of_device_id table but supported devices are registered via Device Trees. This is working on the assumption that a I2C device registered via OF will always match a legacy I2C device ID and that the MODALIAS reported will always be of the form i2c:<device>. But this could change in the future so the correct approach is to have an OF device ID table if the devices are registered via OF. Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:17 +01:00
Ilan peer	bf864220a5	mac80211: Fix addition of mesh configuration element commit `57629915d5` upstream. The code was setting the capabilities byte to zero, after it was already properly set previously. Fix it. The bug was found while debugging hwsim mesh tests failures that happened since the commit mentioned below. Fixes: `76f43b4c0a` ("mac80211: Remove invalid flag operations in mesh TSF synchronization") Signed-off-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Masashi Honma <masashi.honma@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Cc: Richard Schütz <rschuetz@uni-koblenz.de> Cc: Mathias Kretschmer <mathias.kretschmer@fit.fraunhofer.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:17 +01:00
Chandan Rajendra	32e2ae0328	ext4: fix crash when a directory's i_size is too small commit `9d5afec6b8` upstream. On a ppc64 machine, when mounting a fuzzed ext2 image (generated by fsfuzzer) the following call trace is seen, VFS: brelse: Trying to free free buffer WARNING: CPU: 1 PID: 6913 at /root/repos/linux/fs/buffer.c:1165 .__brelse.part.6+0x24/0x40 .__brelse.part.6+0x20/0x40 (unreliable) .ext4_find_entry+0x384/0x4f0 .ext4_lookup+0x84/0x250 .lookup_slow+0xdc/0x230 .walk_component+0x268/0x400 .path_lookupat+0xec/0x2d0 .filename_lookup+0x9c/0x1d0 .vfs_statx+0x98/0x140 .SyS_newfstatat+0x48/0x80 system_call+0x58/0x6c This happens because the directory that ext4_find_entry() looks up has inode->i_size that is less than the block size of the filesystem. This causes 'nblocks' to have a value of zero. ext4_bread_batch() ends up not reading any of the directory file's blocks. This renders the entries in bh_use[] array to continue to have garbage data. buffer_uptodate() on bh_use[0] can then return a zero value upon which brelse() function is invoked. This commit fixes the bug by returning -ENOENT when the directory file has no associated blocks. Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:17 +01:00
Eryu Guan	6a851bb99e	ext4: fix fdatasync(2) after fallocate(2) operation commit `c894aa9757` upstream. Currently, fallocate(2) with KEEP_SIZE followed by a fdatasync(2) then crash, we'll see wrong allocated block number (stat -c %b), the blocks allocated beyond EOF are all lost. fstests generic/468 exposes this bug. Commit `67a7d5f561` ("ext4: fix fdatasync(2) after extent manipulation operations") fixed all the other extent manipulation operation paths such as hole punch, zero range, collapse range etc., but forgot the fallocate case. So similarly, fix it by recording the correct journal tid in ext4 inode in fallocate(2) path, so that ext4_sync_file() will wait for the right tid to be committed on fdatasync(2). This addresses the test failure in xfstests test generic/468. Signed-off-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:17 +01:00
Adam Wallis	679dbeac0b	dmaengine: dmatest: move callback wait queue to thread context commit `6f6a23a213` upstream. Commit `adfa543e73` ("dmatest: don't use set_freezable_with_signal()") introduced a bug (that is in fact documented by the patch commit text) that leaves behind a dangling pointer. Since the done_wait structure is allocated on the stack, future invocations to the DMATEST can produce undesirable results (e.g., corrupted spinlocks). Commit `a9df21e34b` ("dmaengine: dmatest: warn user when dma test times out") attempted to WARN the user that the stack was likely corrupted but did not fix the actual issue. This patch fixes the issue by pushing the wait queue and callback structs into the the thread structure. If a failure occurs due to time, dmaengine_terminate_all will force the callback to safely call wake_up_all() without possibility of using a freed pointer. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=197605 Fixes: `adfa543e73` ("dmatest: don't use set_freezable_with_signal()") Reviewed-by: Sinan Kaya <okaya@codeaurora.org> Suggested-by: Shunyong Yang <shunyong.yang@hxt-semitech.com> Signed-off-by: Adam Wallis <awallis@codeaurora.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:17 +01:00
David Lechner	744cb5ab33	eeprom: at24: change nvmem stride to 1 commit `7f6d2ecd3d` upstream. Trying to read the MAC address from an eeprom that has an offset that is not a multiple of 4 causes an error currently. Fix it by changing the nvmem stride to 1. Signed-off-by: David Lechner <david@lechnology.com> [Bartosz: tweaked the commit message] Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:17 +01:00
Steven Rostedt	d266817f50	sched/rt: Do not pull from current CPU if only one CPU to pull commit `f73c52a5bc` upstream. Daniel Wagner reported a crash on the BeagleBone Black SoC. This is a single CPU architecture, and does not have a functional arch_send_call_function_single_ipi() implementation which can crash the kernel if that is called. As it only has one CPU, it shouldn't be called, but if the kernel is compiled for SMP, the push/pull RT scheduling logic now calls it for irq_work if the one CPU is overloaded, it can use that function to call itself and crash the kernel. Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system only has a single CPU. But SCHED_FEAT is a constant if sched debugging is turned off. Another fix can also be used, and this should also help with normal SMP machines. That is, do not initiate the pull code if there's only one RT overloaded CPU, and that CPU happens to be the current CPU that is scheduling in a lower priority task. Even on a system with many CPUs, if there's many RT tasks waiting to run on a single CPU, and that CPU schedules in another RT task of lower priority, it will initiate the PULL logic in case there's a higher priority RT task on another CPU that is waiting to run. But if there is no other CPU with waiting RT tasks, it will initiate the RT pull logic on itself (as it still has RT tasks waiting to run). This is a wasted effort. Not only does this help with SMP code where the current CPU is the only one with RT overloaded tasks, it should also solve the issue that Daniel encountered, because it will prevent the PULL logic from executing, as there's only one CPU on the system, and the check added here will cause it to exit the RT pull code. Reported-by: Daniel Wagner <wagi@monom.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-rt-users <linux-rt-users@vger.kernel.org> Fixes: `4bdced5c9` ("sched/rt: Simplify the IPI based RT balancing logic") Link: http://lkml.kernel.org/r/20171202130454.4cbbfe8d@vmware.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:17 +01:00
Scott Mayhew	9c537f06d6	nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests commit `dc4fd9ab01` upstream. If there were no commit requests, then nfs_commit_inode() should not wait on the commit or mark the inode dirty, otherwise the following BUG_ON can be triggered: [ 1917.130762] kernel BUG at fs/inode.c:578! [ 1917.130766] Oops: Exception in kernel mode, sig: 5 [#1] [ 1917.130768] SMP NR_CPUS=2048 NUMA pSeries [ 1917.130772] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc sg nx_crypto pseries_rng ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [ 1917.130805] CPU: 2 PID: 14923 Comm: umount.nfs4 Tainted: G ------------ T 3.10.0-768.el7.ppc64 #1 [ 1917.130810] task: c0000005ecd88040 ti: c00000004cea0000 task.ti: c00000004cea0000 [ 1917.130813] NIP: c000000000354178 LR: c000000000354160 CTR: c00000000012db80 [ 1917.130816] REGS: c00000004cea3720 TRAP: 0700 Tainted: G ------------ T (3.10.0-768.el7.ppc64) [ 1917.130820] MSR: 8000000100029032 <SF,EE,ME,IR,DR,RI> CR: 22002822 XER: 20000000 [ 1917.130828] CFAR: c00000000011f594 SOFTE: 1 GPR00: c000000000354160 c00000004cea39a0 c0000000014c4700 c0000000018cc750 GPR04: 000000000000c750 80c0000000000000 0600000000000000 04eeb76bea749a03 GPR08: 0000000000000034 c0000000018cc758 0000000000000001 d000000005e619e8 GPR12: c00000000012db80 c000000007b31200 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 c000000000dfc3ec 0000000000000000 c0000005eefc02c0 GPR28: d0000000079dbd50 c0000005b94a02c0 c0000005b94a0250 c0000005b94a01c8 [ 1917.130867] NIP [c000000000354178] .evict+0x1c8/0x350 [ 1917.130871] LR [c000000000354160] .evict+0x1b0/0x350 [ 1917.130873] Call Trace: [ 1917.130876] [c00000004cea39a0] [c000000000354160] .evict+0x1b0/0x350 (unreliable) [ 1917.130880] [c00000004cea3a30] [c0000000003558cc] .evict_inodes+0x13c/0x270 [ 1917.130884] [c00000004cea3af0] [c000000000327d20] .kill_anon_super+0x70/0x1e0 [ 1917.130896] [c00000004cea3b80] [d000000005e43e30] .nfs_kill_super+0x20/0x60 [nfs] [ 1917.130900] [c00000004cea3c00] [c000000000328a20] .deactivate_locked_super+0xa0/0x1b0 [ 1917.130903] [c00000004cea3c80] [c00000000035ba54] .cleanup_mnt+0xd4/0x180 [ 1917.130907] [c00000004cea3d10] [c000000000119034] .task_work_run+0x114/0x150 [ 1917.130912] [c00000004cea3db0] [c00000000001ba6c] .do_notify_resume+0xcc/0x100 [ 1917.130916] [c00000004cea3e30] [c00000000000a7b0] .ret_from_except_lite+0x5c/0x60 [ 1917.130919] Instruction dump: [ 1917.130921] 7fc3f378 486734b5 60000000 387f00a0 38800003 4bdcb365 60000000 e95f00a0 [ 1917.130927] 694a0060 7d4a0074 794ad182 694a0001 <0b0a0000> 892d02a4 2f890000 40de0134 Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:17 +01:00
Mathias Nyman	3bdb508d68	xhci: Don't add a virt_dev to the devs array before it's fully allocated commit `5d9b70f7d5` upstream. Avoid null pointer dereference if some function is walking through the devs array accessing members of a new virt_dev that is mid allocation. Add the virt_dev to xhci->devs[i] _after_ the virt_device and all its members are properly allocated. issue found by KASAN: null-ptr-deref in xhci_find_slot_id_by_port "Quick analysis suggests that xhci_alloc_virt_device() is not mutex protected. If so, there is a time frame where xhci->devs[slot_id] is set but not fully initialized. Specifically, xhci->devs[i]->udev can be NULL." Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:17 +01:00
Sukumar Ghorai	7336f5481f	Bluetooth: btusb: driver to enable the usb-wakeup feature commit `a0085f2510` upstream. BT-Controller connected as platform non-root-hub device and usb-driver initialize such device with wakeup disabled, Ref. usb_new_device(). At present wakeup-capability get enabled by hid-input device from usb function driver(e.g. BT HID device) at runtime. Again some functional driver does not set usb-wakeup capability(e.g LE HID device implement as HID-over-GATT), and can't wakeup the host on USB. Most of the device operation (such as mass storage) initiated from host (except HID) and USB wakeup aligned with host resume procedure. For BT device, usb-wakeup capability need to enable form btusc driver as a generic solution for multiple profile use case and required for USB remote wakeup (in-bus wakeup) while host is suspended. Also usb-wakeup feature need to enable/disable with HCI interface up and down. Signed-off-by: Sukumar Ghorai <sukumar.ghorai@intel.com> Signed-off-by: Amit K Bag <amit.k.bag@intel.com> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:16 +01:00
Chunfeng Yun	cdfe4c0091	usb: xhci: fix TDS for MTK xHCI1.1 commit `72b663a99c` upstream. For MTK's xHCI 1.0 or latter, TD size is the number of max packet sized packets remaining in the TD, not including this TRB (following spec). For MTK's xHCI 0.96 and older, TD size is the number of max packet sized packets remaining in the TD, including this TRB (not following spec). Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:16 +01:00
Yan, Zheng	e081bd0d70	ceph: drop negative child dentries before try pruning inode's alias commit `040d786032` upstream. Negative child dentry holds reference on inode's alias, it makes d_prune_aliases() do nothing. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:16 +01:00
Shuah Khan	14513e49c4	usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer commit `be6123df1e` upstream. stub_send_ret_submit() handles urb with a potential null transfer_buffer, when it replays a packet with potential malicious data that could contain a null buffer. Add a check for the condition when actual_length > 0 and transfer_buffer is null. Reported-by: Secunia Research <vuln@secunia.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:16 +01:00
Shuah Khan	f3e957266a	usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input commit `c6688ef9f2` upstream. Harden CMD_SUBMIT path to handle malicious input that could trigger large memory allocations. Add checks to validate transfer_buffer_length and number_of_packets to protect against bad input requesting for unbounded memory allocations. Validate early in get_pipe() and return failure. Reported-by: Secunia Research <vuln@secunia.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:16 +01:00
Felipe Balbi	b6dbace92e	usb: add helper to extract bits 12:11 of wMaxPacketSize commit `541b6fe630` upstream. According to USB Specification 2.0 table 9-4, wMaxPacketSize is a bitfield. Endpoint's maxpacket is laid out in bits 10:0. For high-speed, high-bandwidth isochronous endpoints, bits 12:11 contain a multiplier to tell us how many transactions we want to try per uframe. This means that if we want an isochronous endpoint to issue 3 transfers of 1024 bytes per uframe, wMaxPacketSize should contain the value: 1024 \| (2 << 11) or 5120 (0x1400). In order to make Host and Peripheral controller drivers' life easier, we're adding a helper which returns bits 12:11. Note that no care is made WRT to checking endpoint type and gadget's speed. That's left for drivers to handle. Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:16 +01:00
Shuah Khan	20e825cdf7	usbip: fix stub_rx: get_pipe() to validate endpoint number commit `635f545a7e` upstream. get_pipe() routine doesn't validate the input endpoint number and uses to reference ep_in and ep_out arrays. Invalid endpoint number can trigger BUG(). Range check the epnum and returning error instead of calling BUG(). Change caller stub_recv_cmd_submit() to handle the get_pipe() error return. Reported-by: Secunia Research <vuln@secunia.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:16 +01:00
Alan Stern	99542e468b	USB: core: prevent malicious bNumInterfaces overflow commit `48a4ff1c7b` upstream. A malicious USB device with crafted descriptors can cause the kernel to access unallocated memory by setting the bNumInterfaces value too high in a configuration descriptor. Although the value is adjusted during parsing, this adjustment is skipped in one of the error return paths. This patch prevents the problem by setting bNumInterfaces to 0 initially. The existing code already sets it to the proper value after parsing is complete. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:16 +01:00
David Kozub	0d29ae4f50	USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID commit `6235445462` upstream. There is another JMS567-based USB3 UAS enclosure (152d:0578) that fails with the following error: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [sda] tag#0 Sense Key : Illegal Request [current] [sda] tag#0 Add. Sense: Invalid field in cdb The issue occurs both with UAS (occasionally) and mass storage (immediately after mounting a FS on a disk in the enclosure). Enabling US_FL_BROKEN_FUA quirk solves this issue. This patch adds an UNUSUAL_DEV with US_FL_BROKEN_FUA for the enclosure for both UAS and mass storage. Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:15 +01:00
Changbin Du	d760f90341	tracing: Allocate mask_str buffer dynamically commit `90e406f96f` upstream. The default NR_CPUS can be very large, but actual possible nr_cpu_ids usually is very small. For my x86 distribution, the NR_CPUS is 8192 and nr_cpu_ids is 4. About 2 pages are wasted. Most machines don't have so many CPUs, so define a array with NR_CPUS just wastes memory. So let's allocate the buffer dynamically when need. With this change, the mutext tracing_cpumask_update_lock also can be removed now, which was used to protect mask_str. Link: http://lkml.kernel.org/r/1512013183-19107-1-git-send-email-changbin.du@intel.com Fixes: `36dfe9252b` ("ftrace: make use of tracing_cpumask") Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:15 +01:00
NeilBrown	d1175423ce	autofs: fix careless error in recent commit commit `302ec300ef` upstream. Commit `ecc0c469f2` ("autofs: don't fail mount for transient error") was meant to replace an 'if' with a 'switch', but instead added the 'switch' leaving the case in place. Link: http://lkml.kernel.org/r/87zi6wstmw.fsf@notabene.neil.brown.name Fixes: `ecc0c469f2` ("autofs: don't fail mount for transient error") Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: NeilBrown <neilb@suse.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:15 +01:00
Eric Biggers	c32e053a11	crypto: salsa20 - fix blkcipher_walk API usage commit `ecaaab5649` upstream. When asked to encrypt or decrypt 0 bytes, both the generic and x86 implementations of Salsa20 crash in blkcipher_walk_done(), either when doing 'kfree(walk->buffer)' or 'free_page((unsigned long)walk->page)', because walk->buffer and walk->page have not been initialized. The bug is that Salsa20 is calling blkcipher_walk_done() even when nothing is in 'walk.nbytes'. But blkcipher_walk_done() is only meant to be called when a nonzero number of bytes have been provided. The broken code is part of an optimization that tries to make only one call to salsa20_encrypt_bytes() to process inputs that are not evenly divisible by 64 bytes. To fix the bug, just remove this "optimization" and use the blkcipher_walk API the same way all the other users do. Reproducer: #include <linux/if_alg.h> #include <sys/socket.h> #include <unistd.h> int main() { int algfd, reqfd; struct sockaddr_alg addr = { .salg_type = "skcipher", .salg_name = "salsa20", }; char key[16] = { 0 }; algfd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(algfd, (void *)&addr, sizeof(addr)); reqfd = accept(algfd, 0, 0); setsockopt(algfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key)); read(reqfd, key, sizeof(key)); } Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: `eb6f13eb9f` ("[CRYPTO] salsa20_generic: Fix multi-page processing") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:15 +01:00
Eric Biggers	43259d07fc	crypto: hmac - require that the underlying hash algorithm is unkeyed commit `af3ff8045b` upstream. Because the HMAC template didn't check that its underlying hash algorithm is unkeyed, trying to use "hmac(hmac(sha3-512-generic))" through AF_ALG or through KEYCTL_DH_COMPUTE resulted in the inner HMAC being used without having been keyed, resulting in sha3_update() being called without sha3_init(), causing a stack buffer overflow. This is a very old bug, but it seems to have only started causing real problems when SHA-3 support was added (requires CONFIG_CRYPTO_SHA3) because the innermost hash's state is ->import()ed from a zeroed buffer, and it just so happens that other hash algorithms are fine with that, but SHA-3 is not. However, there could be arch or hardware-dependent hash algorithms also affected; I couldn't test everything. Fix the bug by introducing a function crypto_shash_alg_has_setkey() which tests whether a shash algorithm is keyed. Then update the HMAC template to require that its underlying hash algorithm is unkeyed. Here is a reproducer: #include <linux/if_alg.h> #include <sys/socket.h> int main() { int algfd; struct sockaddr_alg addr = { .salg_type = "hash", .salg_name = "hmac(hmac(sha3-512-generic))", }; char key[4096] = { 0 }; algfd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(algfd, (const struct sockaddr *)&addr, sizeof(addr)); setsockopt(algfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key)); } Here was the KASAN report from syzbot: BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:341 [inline] BUG: KASAN: stack-out-of-bounds in sha3_update+0xdf/0x2e0 crypto/sha3_generic.c:161 Write of size 4096 at addr ffff8801cca07c40 by task syzkaller076574/3044 CPU: 1 PID: 3044 Comm: syzkaller076574 Not tainted 4.14.0-mm1+ #25 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x25b/0x340 mm/kasan/report.c:409 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x137/0x190 mm/kasan/kasan.c:267 memcpy+0x37/0x50 mm/kasan/kasan.c:303 memcpy include/linux/string.h:341 [inline] sha3_update+0xdf/0x2e0 crypto/sha3_generic.c:161 crypto_shash_update+0xcb/0x220 crypto/shash.c:109 shash_finup_unaligned+0x2a/0x60 crypto/shash.c:151 crypto_shash_finup+0xc4/0x120 crypto/shash.c:165 hmac_finup+0x182/0x330 crypto/hmac.c:152 crypto_shash_finup+0xc4/0x120 crypto/shash.c:165 shash_digest_unaligned+0x9e/0xd0 crypto/shash.c:172 crypto_shash_digest+0xc4/0x120 crypto/shash.c:186 hmac_setkey+0x36a/0x690 crypto/hmac.c:66 crypto_shash_setkey+0xad/0x190 crypto/shash.c:64 shash_async_setkey+0x47/0x60 crypto/shash.c:207 crypto_ahash_setkey+0xaf/0x180 crypto/ahash.c:200 hash_setkey+0x40/0x90 crypto/algif_hash.c:446 alg_setkey crypto/af_alg.c:221 [inline] alg_setsockopt+0x2a1/0x350 crypto/af_alg.c:254 SYSC_setsockopt net/socket.c:1851 [inline] SyS_setsockopt+0x189/0x360 net/socket.c:1830 entry_SYSCALL_64_fastpath+0x1f/0x96 Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:15 +01:00
Eric Biggers	cd9b59861f	crypto: rsa - fix buffer overread when stripping leading zeroes commit `d2890c3778` upstream. In rsa_get_n(), if the buffer contained all 0's and "FIPS mode" is enabled, we would read one byte past the end of the buffer while scanning the leading zeroes. Fix it by checking 'n_sz' before '!*ptr'. This bug was reachable by adding a specially crafted key of type "asymmetric" (requires CONFIG_RSA and CONFIG_X509_CERTIFICATE_PARSER). KASAN report: BUG: KASAN: slab-out-of-bounds in rsa_get_n+0x19e/0x1d0 crypto/rsa_helper.c:33 Read of size 1 at addr ffff88003501a708 by task keyctl/196 CPU: 1 PID: 196 Comm: keyctl Not tainted 4.14.0-09238-g1d3b78bbc6e9 #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 Call Trace: rsa_get_n+0x19e/0x1d0 crypto/rsa_helper.c:33 asn1_ber_decoder+0x82a/0x1fd0 lib/asn1_decoder.c:328 rsa_set_pub_key+0xd3/0x320 crypto/rsa.c:278 crypto_akcipher_set_pub_key ./include/crypto/akcipher.h:364 [inline] pkcs1pad_set_pub_key+0xae/0x200 crypto/rsa-pkcs1pad.c:117 crypto_akcipher_set_pub_key ./include/crypto/akcipher.h:364 [inline] public_key_verify_signature+0x270/0x9d0 crypto/asymmetric_keys/public_key.c:106 x509_check_for_self_signed+0x2ea/0x480 crypto/asymmetric_keys/x509_public_key.c:141 x509_cert_parse+0x46a/0x620 crypto/asymmetric_keys/x509_cert_parser.c:129 x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174 asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388 key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850 SYSC_add_key security/keys/keyctl.c:122 [inline] SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62 entry_SYSCALL_64_fastpath+0x1f/0x96 Allocated by task 196: __do_kmalloc mm/slab.c:3711 [inline] __kmalloc_track_caller+0x118/0x2e0 mm/slab.c:3726 kmemdup+0x17/0x40 mm/util.c:118 kmemdup ./include/linux/string.h:414 [inline] x509_cert_parse+0x2cb/0x620 crypto/asymmetric_keys/x509_cert_parser.c:106 x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174 asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388 key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850 SYSC_add_key security/keys/keyctl.c:122 [inline] SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62 entry_SYSCALL_64_fastpath+0x1f/0x96 Fixes: `5a7de97309` ("crypto: rsa - return raw integers for the ASN.1 parser") Cc: Tudor Ambarus <tudor-dan.ambarus@nxp.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:15 +01:00
Martin Kaiser	1fb73eae96	mfd: fsl-imx25: Clean up irq settings during removal commit `18f7739379` upstream. When fsl-imx25-tsadc is compiled as a module, loading, unloading and reloading the module will lead to a crash. Unable to handle kernel paging request at virtual address bf005430 [<c004df6c>] (irq_find_matching_fwspec) from [<c028d5ec>] (of_irq_get+0x58/0x74) [<c028d594>] (of_irq_get) from [<c01ff970>] (platform_get_irq+0x48/0xc8) [<c01ff928>] (platform_get_irq) from [<bf00e33c>] (mx25_tsadc_probe+0x220/0x2f4 [fsl_imx25_tsadc]) irq_find_matching_fwspec() loops over all registered irq domains. The irq domain is still registered from last time the module was loaded but the pointer to its operations is invalid after the module was unloaded. Add a removal function which clears the irq handler and removes the irq domain. With this cleanup in place, it's possible to unload and reload the module. Signed-off-by: Martin Kaiser <martin@kaiser.cx> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-20 10:07:15 +01:00

849 changed files with 11444 additions and 4182 deletions

16

Documentation/ABI/testing/sysfs-devices-system-cpu

View File

 @ -350,3 +350,19 @@ Contact:	Linux ARM Kernel Mailing list <linux-arm-kernel@lists.infradead.org>
 Description:	AArch64 CPU registers
 		'identification' directory exposes the CPU ID registers for
 		 identifying model and revision of the CPU.
 What:		/sys/devices/system/cpu/vulnerabilities
 		/sys/devices/system/cpu/vulnerabilities/meltdown
 		/sys/devices/system/cpu/vulnerabilities/spectre_v1
 		/sys/devices/system/cpu/vulnerabilities/spectre_v2
 Date:		January 2018
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:	Information about CPU vulnerabilities
 		The files are named after the code names of CPU
 		vulnerabilities. The output of those files reflects the
 		state of the CPUs in the system. Possible output values:
 		"Not affected"	  CPU is not affected by the vulnerability
 		"Vulnerable"	  CPU is affected and no mitigation in effect
 		"Mitigation: $M"  CPU is affected and mitigation $M is in effect

47

Documentation/kernel-parameters.txt

View File

 @ -2691,6 +2691,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	nosmt		[KNL,S390] Disable symmetric multithreading (SMT).
 			Equivalent to smt=1.
 	nospectre_v2	[X86] Disable all mitigations for the Spectre variant 2
 			(indirect branch prediction) vulnerability. System may
 			allow data leaks with this option, which is equivalent
 			to spectre_v2=off.
 	noxsave		[BUGS=X86] Disables x86 extended register state save
 			and restore using xsave. The kernel will fallback to
 			enabling legacy floating-point and sse state.
 @ -2795,11 +2800,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	nopat		[X86] Disable PAT (page attribute table extension of
 			pagetables) support.
 	nopcid		[X86-64] Disable the PCID cpu feature.
 	norandmaps	Don't use address space randomization.  Equivalent to
 			echo 0 > /proc/sys/kernel/randomize_va_space
 	noreplace-paravirt	[X86,IA-64,PV_OPS] Don't patch paravirt_ops
 	noreplace-smp	[X86-32,SMP] Don't replace SMP instructions
 			with UP alternatives
 @ -3323,6 +3328,21 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	pt.		[PARIDE]
 			See Documentation/blockdev/paride.txt.
 	pti=		[X86_64] Control Page Table Isolation of user and
 			kernel address spaces.  Disabling this feature
 			removes hardening, but improves performance of
 			system calls and interrupts.
 			on   - unconditionally enable
 			off  - unconditionally disable
 			auto - kernel detects whether your CPU model is
 			       vulnerable to issues that PTI mitigates
 			Not specifying this option is equivalent to pti=auto.
 	nopti		[X86_64]
 			Equivalent to pti=off
 	pty.legacy_count=
 			[KNL] Number of legacy pty's. Overwrites compiled-in
 			default number.
 @ -3927,6 +3947,29 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	sonypi.*=	[HW] Sony Programmable I/O Control Device driver
 			See Documentation/laptops/sonypi.txt
 	spectre_v2=	[X86] Control mitigation of Spectre variant 2
 			(indirect branch speculation) vulnerability.
 			on   - unconditionally enable
 			off  - unconditionally disable
 			auto - kernel detects whether your CPU model is
 			       vulnerable
 			Selecting 'on' will, and 'auto' may, choose a
 			mitigation method at run time according to the
 			CPU, the available microcode, the setting of the
 			CONFIG_RETPOLINE configuration option, and the
 			compiler with which the kernel was built.
 			Specific mitigations can also be selected manually:
 			retpoline	  - replace indirect branches
 			retpoline,generic - google's original retpoline
 			retpoline,amd     - AMD-specific minimal thunk
 			Not specifying this option is equivalent to
 			spectre_v2=auto.
 	spia_io_base=	[HW,MTD]
 	spia_fio_base=
 	spia_pedr=

90

Documentation/speculation.txt Normal file

View File

 @ -0,0 +1,90 @@
 This document explains potential effects of speculation, and how undesirable
 effects can be mitigated portably using common APIs.
 ===========
 Speculation
 ===========
 To improve performance and minimize average latencies, many contemporary CPUs
 employ speculative execution techniques such as branch prediction, performing
 work which may be discarded at a later stage.
 Typically speculative execution cannot be observed from architectural state,
 such as the contents of registers. However, in some cases it is possible to
 observe its impact on microarchitectural state, such as the presence or
 absence of data in caches. Such state may form side-channels which can be
 observed to extract secret information.
 For example, in the presence of branch prediction, it is possible for bounds
 checks to be ignored by code which is speculatively executed. Consider the
 following code:
 	int load_array(int *array, unsigned int index)
 	{
 		if (index >= MAX_ARRAY_ELEMS)
 			return 0;
 		else
 			return array[index];
 	}
 Which, on arm64, may be compiled to an assembly sequence such as:
 	CMP	<index>, #MAX_ARRAY_ELEMS
 	B.LT	less
 	MOV	<returnval>, #0
 	RET
   less:
 	LDR	<returnval>, [<array>, <index>]
 	RET
 It is possible that a CPU mis-predicts the conditional branch, and
 speculatively loads array[index], even if index >= MAX_ARRAY_ELEMS. This
 value will subsequently be discarded, but the speculated load may affect
 microarchitectural state which can be subsequently measured.
 More complex sequences involving multiple dependent memory accesses may
 result in sensitive information being leaked. Consider the following
 code, building on the prior example:
 	int load_dependent_arrays(int *arr1, int *arr2, int index)
 	{
 		int val1, val2,
 		val1 = load_array(arr1, index);
 		val2 = load_array(arr2, val1);
 		return val2;
 	}
 Under speculation, the first call to load_array() may return the value
 of an out-of-bounds address, while the second call will influence
 microarchitectural state dependent on this value. This may provide an
 arbitrary read primitive.
 ====================================
 Mitigating speculation side-channels
 ====================================
 The kernel provides a generic API to ensure that bounds checks are
 respected even under speculation. Architectures which are affected by
 speculation-based side-channels are expected to implement these
 primitives.
 The array_index_nospec() helper in <linux/nospec.h> can be used to
 prevent information from being leaked via side-channels.
 A call to array_index_nospec(index, size) returns a sanitized index
 value that is bounded to [0, size) even under cpu speculation
 conditions.
 This can be used to protect the earlier load_array() example:
 	int load_array(int *array, unsigned int index)
 	{
 		if (index >= MAX_ARRAY_ELEMS)
 			return 0;
 		else {
 			index = array_index_nospec(index, MAX_ARRAY_ELEMS);
 			return array[index];
 		}
 	}

186

Documentation/x86/pti.txt Normal file

View File

 @ -0,0 +1,186 @@
 Overview
 ========
 Page Table Isolation (pti, previously known as KAISER[1]) is a
 countermeasure against attacks on the shared user/kernel address
 space such as the "Meltdown" approach[2].
 To mitigate this class of attacks, we create an independent set of
 page tables for use only when running userspace applications.  When
 the kernel is entered via syscalls, interrupts or exceptions, the
 page tables are switched to the full "kernel" copy.  When the system
 switches back to user mode, the user copy is used again.
 The userspace page tables contain only a minimal amount of kernel
 data: only what is needed to enter/exit the kernel such as the
 entry/exit functions themselves and the interrupt descriptor table
 (IDT).  There are a few strictly unnecessary things that get mapped
 such as the first C function when entering an interrupt (see
 comments in pti.c).
 This approach helps to ensure that side-channel attacks leveraging
 the paging structures do not function when PTI is enabled.  It can be
 enabled by setting CONFIG_PAGE_TABLE_ISOLATION=y at compile time.
 Once enabled at compile-time, it can be disabled at boot with the
 'nopti' or 'pti=' kernel parameters (see kernel-parameters.txt).
 Page Table Management
 =====================
 When PTI is enabled, the kernel manages two sets of page tables.
 The first set is very similar to the single set which is present in
 kernels without PTI.  This includes a complete mapping of userspace
 that the kernel can use for things like copy_to_user().
 Although _complete_, the user portion of the kernel page tables is
 crippled by setting the NX bit in the top level.  This ensures
 that any missed kernel->user CR3 switch will immediately crash
 userspace upon executing its first instruction.
 The userspace page tables map only the kernel data needed to enter
 and exit the kernel.  This data is entirely contained in the 'struct
 cpu_entry_area' structure which is placed in the fixmap which gives
 each CPU's copy of the area a compile-time-fixed virtual address.
 For new userspace mappings, the kernel makes the entries in its
 page tables like normal.  The only difference is when the kernel
 makes entries in the top (PGD) level.  In addition to setting the
 entry in the main kernel PGD, a copy of the entry is made in the
 userspace page tables' PGD.
 This sharing at the PGD level also inherently shares all the lower
 layers of the page tables.  This leaves a single, shared set of
 userspace page tables to manage.  One PTE to lock, one set of
 accessed bits, dirty bits, etc...
 Overhead
 ========
 Protection against side-channel attacks is important.  But,
 this protection comes at a cost:
 . Increased Memory Use
   a. Each process now needs an order-1 PGD instead of order-0.
      (Consumes an additional 4k per process).
   b. The 'cpu_entry_area' structure must be 2MB in size and 2MB
      aligned so that it can be mapped by setting a single PMD
      entry.  This consumes nearly 2MB of RAM once the kernel
      is decompressed, but no space in the kernel image itself.
 . Runtime Cost
   a. CR3 manipulation to switch between the page table copies
      must be done at interrupt, syscall, and exception entry
      and exit (it can be skipped when the kernel is interrupted,
      though.)  Moves to CR3 are on the order of a hundred
      cycles, and are required at every entry and exit.
   b. A "trampoline" must be used for SYSCALL entry.  This
      trampoline depends on a smaller set of resources than the
      non-PTI SYSCALL entry code, so requires mapping fewer
      things into the userspace page tables.  The downside is
      that stacks must be switched at entry time.
   c. Global pages are disabled for all kernel structures not
      mapped into both kernel and userspace page tables.  This
      feature of the MMU allows different processes to share TLB
      entries mapping the kernel.  Losing the feature means more
      TLB misses after a context switch.  The actual loss of
      performance is very small, however, never exceeding 1%.
   d. Process Context IDentifiers (PCID) is a CPU feature that
      allows us to skip flushing the entire TLB when switching page
      tables by setting a special bit in CR3 when the page tables
      are changed.  This makes switching the page tables (at context
      switch, or kernel entry/exit) cheaper.  But, on systems with
      PCID support, the context switch code must flush both the user
      and kernel entries out of the TLB.  The user PCID TLB flush is
      deferred until the exit to userspace, minimizing the cost.
      See intel.com/sdm for the gory PCID/INVPCID details.
   e. The userspace page tables must be populated for each new
      process.  Even without PTI, the shared kernel mappings
      are created by copying top-level (PGD) entries into each
      new process.  But, with PTI, there are now *two* kernel
      mappings: one in the kernel page tables that maps everything
      and one for the entry/exit structures.  At fork(), we need to
      copy both.
   f. In addition to the fork()-time copying, there must also
      be an update to the userspace PGD any time a set_pgd() is done
      on a PGD used to map userspace.  This ensures that the kernel
      and userspace copies always map the same userspace
      memory.
   g. On systems without PCID support, each CR3 write flushes
      the entire TLB.  That means that each syscall, interrupt
      or exception flushes the TLB.
   h. INVPCID is a TLB-flushing instruction which allows flushing
      of TLB entries for non-current PCIDs.  Some systems support
      PCIDs, but do not support INVPCID.  On these systems, addresses
      can only be flushed from the TLB for the current PCID.  When
      flushing a kernel address, we need to flush all PCIDs, so a
      single kernel address flush will require a TLB-flushing CR3
      write upon the next use of every PCID.
 Possible Future Work
 ====================
 . We can be more careful about not actually writing to CR3
    unless its value is actually changed.
 . Allow PTI to be enabled/disabled at runtime in addition to the
    boot-time switching.
 Testing
 ========
 To test stability of PTI, the following test procedure is recommended,
 ideally doing all of these in parallel:
 . Set CONFIG_DEBUG_ENTRY=y
 . Run several copies of all of the tools/testing/selftests/x86/ tests
    (excluding MPX and protection_keys) in a loop on multiple CPUs for
    several minutes.  These tests frequently uncover corner cases in the
    kernel entry code.  In general, old kernels might cause these tests
    themselves to crash, but they should never crash the kernel.
 . Run the 'perf' tool in a mode (top or record) that generates many
    frequent performance monitoring non-maskable interrupts (see "NMI"
    in /proc/interrupts).  This exercises the NMI entry/exit code which
    is known to trigger bugs in code paths that did not expect to be
    interrupted, including nested NMIs.  Using "-c" boosts the rate of
    NMIs, and using two -c with separate counters encourages nested NMIs
    and less deterministic behavior.
 	while true; do perf record -c 10000 -e instructions,cycles -a sleep 10; done
 . Launch a KVM virtual machine.
 . Run 32-bit binaries on systems supporting the SYSCALL instruction.
    This has been a lightly-tested code path and needs extra scrutiny.
 Debugging
 =========
 Bugs in PTI cause a few different signatures of crashes
 that are worth noting here.
  * Failures of the selftests/x86 code.  Usually a bug in one of the
    more obscure corners of entry_64.S
  * Crashes in early boot, especially around CPU bringup.  Bugs
    in the trampoline code or mappings cause these.
  * Crashes at the first interrupt.  Caused by bugs in entry_64.S,
    like screwing up a page table switch.  Also caused by
    incorrectly mapping the IRQ handler entry code.
  * Crashes at the first NMI.  The NMI code is separate from main
    interrupt handlers and can have bugs that do not affect
    normal interrupts.  Also caused by incorrectly mapping NMI
    code.  NMIs that interrupt the entry code must be very
    careful and can be the cause of crashes that show up when
    running perf.
  * Kernel crashes at the first exit to userspace.  entry_64.S
    bugs, or failing to map some of the exit code.
  * Crashes at first interrupt that interrupts userspace. The paths
    in entry_64.S that return to userspace are sometimes separate
    from the ones that return to the kernel.
  * Double faults: overflowing the kernel stack because of page
    faults upon page faults.  Caused by touching non-pti-mapped
    data in the entry code, or forgetting to switch to kernel
    CR3 before calling into C functions which are not pti-mapped.
  * Userspace segfaults early in boot, sometimes manifesting
    as mount(8) failing to mount the rootfs.  These have
    tended to be TLB invalidation issues.  Usually invalidating
    the wrong PCID, or otherwise missing an invalidation.
 . https://gruss.cc/files/kaiser.pdf
 . https://meltdownattack.com/meltdown.pdf

									
										5

Makefile
									
												View File
												
				@ -1,6 +1,6 @@

				VERSION = 4

				PATCHLEVEL = 9

				SUBLEVEL = 70

				SUBLEVEL = 82

				EXTRAVERSION =

				NAME = Roaring Lionus

				@ -788,6 +788,9 @@ KBUILD_CFLAGS += $(call cc-disable-warning, pointer-sign)

				# disable invalid "can't wrap" optimizations for signed / pointers

				KBUILD_CFLAGS	+= $(call cc-option,-fno-strict-overflow)

				# Make sure -fstack-check isn't enabled (like gentoo apparently did)

				KBUILD_CFLAGS  += $(call cc-option,-fno-stack-check,)

				# conserve stack if available

				KBUILD_CFLAGS   += $(call cc-option,-fconserve-stack)

									
										3

arch/alpha/kernel/pci_impl.h
									
												View File
												
				@ -143,7 +143,8 @@ struct pci_iommu_arena

				};

				#if defined(CONFIG_ALPHA_SRM) && \

				    (defined(CONFIG_ALPHA_CIA) || defined(CONFIG_ALPHA_LCA))

				    (defined(CONFIG_ALPHA_CIA) || defined(CONFIG_ALPHA_LCA) || \

				     defined(CONFIG_ALPHA_AVANTI))

				# define NEED_SRM_SAVE_RESTORE

				#else

				# undef NEED_SRM_SAVE_RESTORE

									
										3

arch/alpha/kernel/process.c
									
												View File
												
				@ -265,12 +265,13 @@ copy_thread(unsigned long clone_flags, unsigned long usp,

					   application calling fork.  */

					if (clone_flags & CLONE_SETTLS)

						childti->pcb.unique = regs->r20;

					else

						regs->r20 = 0;	/* OSF/1 has some strange fork() semantics.  */

					childti->pcb.usp = usp ?: rdusp();

					*childregs = *regs;

					childregs->r0 = 0;

					childregs->r19 = 0;

					childregs->r20 = 1;	/* OSF/1 has some strange fork() semantics.  */

					regs->r20 = 0;

					stack = ((struct switch_stack *) regs) - 1;

					*childstack = *stack;

					childstack->r26 = (unsigned long) ret_from_fork;

									
										13

arch/alpha/kernel/traps.c
									
												View File
												
				@ -158,11 +158,16 @@ void show_stack(struct task_struct *task, unsigned long *sp)

					for(i=0; i < kstack_depth_to_print; i++) {

						if (((long) stack & (THREAD_SIZE-1)) == 0)

							break;

						if (i && ((i % 4) == 0))

							printk("\n       ");

						printk("%016lx ", *stack++);

						if ((i % 4) == 0) {

							if (i)

								pr_cont("\n");

							printk("       ");

						} else {

							pr_cont(" ");

						}

						pr_cont("%016lx", *stack++);

					}

					printk("\n");

					pr_cont("\n");

					dik_show_trace(sp);

				}

									
										5

arch/arc/include/asm/uaccess.h
									
												View File
												
				@ -673,6 +673,7 @@ __arc_strncpy_from_user(char *dst, const char __user *src, long count)

						return 0;

					__asm__ __volatile__(

					"	mov	lp_count, %5		\n"

					"	lp	3f			\n"

					"1:	ldb.ab  %3, [%2, 1]		\n"

					"	breq.d	%3, 0, 3f               \n"

				@ -689,8 +690,8 @@ __arc_strncpy_from_user(char *dst, const char __user *src, long count)

					"	.word   1b, 4b			\n"

					"	.previous			\n"

					: "+r"(res), "+r"(dst), "+r"(src), "=r"(val)

					: "g"(-EFAULT), "l"(count)

					: "memory");

					: "g"(-EFAULT), "r"(count)

					: "lp_count", "lp_start", "lp_end", "memory");

					return res;

				}

1

arch/arm/boot/dts/am335x-evmsk.dts

View File

 @ -668,6 +668,7 @@
 	ti,non-removable;
 	bus-width = <4>;
 	cap-power-off-card;
 	keep-power-in-suspend;
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc2_pins>;

4

arch/arm/boot/dts/bcm-nsp.dtsi

View File

 @ -85,7 +85,7 @@
 		timer@20200 {
 			compatible = "arm,cortex-a9-global-timer";
 			reg = <0x20200 0x100>;
 			interrupts = <GIC_PPI 11 IRQ_TYPE_LEVEL_HIGH>;
 			interrupts = <GIC_PPI 11 IRQ_TYPE_EDGE_RISING>;
 			clocks = <&periph_clk>;
 		};
 @ -93,7 +93,7 @@
 			compatible = "arm,cortex-a9-twd-timer";
 			reg = <0x20600 0x20>;
 			interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(2) |
 						  IRQ_TYPE_LEVEL_HIGH)>;
 						  IRQ_TYPE_EDGE_RISING)>;
 			clocks = <&periph_clk>;
 		};

2

arch/arm/boot/dts/dra7.dtsi

View File

 @ -282,6 +282,7 @@
 				device_type = "pci";
 				ranges = <0x81000000 0 0          0x03000 0 0x00010000
 x82000000 0 0x20013000 0x13000 0 0xffed000>;
 				bus-range = <0x00 0xff>;
 				#interrupt-cells = <1>;
 				num-lanes = <1>;
 				linux,pci-domain = <0>;
 @ -318,6 +319,7 @@
 				device_type = "pci";
 				ranges = <0x81000000 0 0          0x03000 0 0x00010000
 x82000000 0 0x30013000 0x13000 0 0xffed000>;
 				bus-range = <0x00 0xff>;
 				#interrupt-cells = <1>;
 				num-lanes = <1>;
 				linux,pci-domain = <1>;

10

arch/arm/boot/dts/kirkwood-openblocks_a7.dts

View File

 @ -53,7 +53,8 @@
 		};
 		pinctrl: pin-controller@10000 {
 			pinctrl-0 = <&pmx_dip_switches &pmx_gpio_header>;
 			pinctrl-0 = <&pmx_dip_switches &pmx_gpio_header
 				     &pmx_gpio_header_gpo>;
 			pinctrl-names = "default";
 			pmx_uart0: pmx-uart0 {
 @ -85,11 +86,16 @@
 			 * ground.
 			 */
 			pmx_gpio_header: pmx-gpio-header {
 				marvell,pins = "mpp17", "mpp7", "mpp29", "mpp28",
 				marvell,pins = "mpp17", "mpp29", "mpp28",
 					       "mpp35", "mpp34", "mpp40";
 				marvell,function = "gpio";
 			};
 			pmx_gpio_header_gpo: pxm-gpio-header-gpo {
 				marvell,pins = "mpp7";
 				marvell,function = "gpo";
 			};
 			pmx_gpio_init: pmx-init {
 				marvell,pins = "mpp38";
 				marvell,function = "gpio";

2

arch/arm/configs/sunxi_defconfig

View File

 @ -11,6 +11,7 @@ CONFIG_SMP=y
 CONFIG_NR_CPUS=8
 CONFIG_AEABI=y
 CONFIG_HIGHMEM=y
 CONFIG_CMA=y
 CONFIG_ARM_APPENDED_DTB=y
 CONFIG_ARM_ATAG_DTB_COMPAT=y
 CONFIG_CPU_FREQ=y
 @ -35,6 +36,7 @@ CONFIG_CAN_SUN4I=y
 # CONFIG_WIRELESS is not set
 CONFIG_DEVTMPFS=y
 CONFIG_DEVTMPFS_MOUNT=y
 CONFIG_DMA_CMA=y
 CONFIG_BLK_DEV_SD=y
 CONFIG_ATA=y
 CONFIG_AHCI_SUNXI=y

									
										1

arch/arm/kvm/arm.c
									
												View File
												
				@ -1165,6 +1165,7 @@ static int hyp_init_cpu_pm_notifier(struct notifier_block *self,

							cpu_hyp_reset();

						return NOTIFY_OK;

					case CPU_PM_ENTER_FAILED:

					case CPU_PM_EXIT:

						if (__this_cpu_read(kvm_arm_hardware_enabled))

							/* The hardware was enabled before suspend. */

									
										13

arch/arm/kvm/handle_exit.c
									
												View File
												
				@ -38,7 +38,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)

					ret = kvm_psci_call(vcpu);

					if (ret < 0) {

						kvm_inject_undefined(vcpu);

						vcpu_set_reg(vcpu, 0, ~0UL);

						return 1;

					}

				@ -47,7 +47,16 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)

				static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)

				{

					kvm_inject_undefined(vcpu);

					/*

					 * "If an SMC instruction executed at Non-secure EL1 is

					 * trapped to EL2 because HCR_EL2.TSC is 1, the exception is a

					 * Trap exception, not a Secure Monitor Call exception [...]"

					 *

					 * We need to advance the PC after the trap, as it would

					 * otherwise return to the same address...

					 */

					vcpu_set_reg(vcpu, 0, ~0UL);

					kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));

					return 1;

				}

									
										6

arch/arm/kvm/mmio.c
									
												View File
												
				@ -112,7 +112,7 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run)

						}

						trace_kvm_mmio(KVM_TRACE_MMIO_READ, len, run->mmio.phys_addr,

							       data);

							       &data);

						data = vcpu_data_host_to_guest(vcpu, data, len);

						vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);

					}

				@ -182,14 +182,14 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,

						data = vcpu_data_guest_to_host(vcpu, vcpu_get_reg(vcpu, rt),

									       len);

						trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, len, fault_ipa, data);

						trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, len, fault_ipa, &data);

						kvm_mmio_write_buf(data_buf, len, data);

						ret = kvm_io_bus_write(vcpu, KVM_MMIO_BUS, fault_ipa, len,

								       data_buf);

					} else {

						trace_kvm_mmio(KVM_TRACE_MMIO_READ_UNSATISFIED, len,

							       fault_ipa, 0);

							       fault_ipa, NULL);

						ret = kvm_io_bus_read(vcpu, KVM_MMIO_BUS, fault_ipa, len,

								      data_buf);

									
										2

arch/arm/kvm/mmu.c
									
												View File
												
				@ -1284,7 +1284,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,

						return -EFAULT;

					}

					if (is_vm_hugetlb_page(vma) && !logging_active) {

					if (vma_kernel_pagesize(vma) && !logging_active) {

						hugetlb = true;

						gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;

					} else {

									
										20

arch/arm/mm/dma-mapping.c
									
												View File
												
				@ -930,13 +930,31 @@ static void arm_coherent_dma_free(struct device *dev, size_t size, void *cpu_add

					__arm_dma_free(dev, size, cpu_addr, handle, attrs, true);

				}

				/*

				 * The whole dma_get_sgtable() idea is fundamentally unsafe - it seems

				 * that the intention is to allow exporting memory allocated via the

				 * coherent DMA APIs through the dma_buf API, which only accepts a

				 * scattertable.  This presents a couple of problems:

				 * 1. Not all memory allocated via the coherent DMA APIs is backed by

				 *    a struct page

				 * 2. Passing coherent DMA memory into the streaming APIs is not allowed

				 *    as we will try to flush the memory through a different alias to that

				 *    actually being used (and the flushes are redundant.)

				 */

				int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt,

						 void *cpu_addr, dma_addr_t handle, size_t size,

						 unsigned long attrs)

				{

					struct page *page = pfn_to_page(dma_to_pfn(dev, handle));

					unsigned long pfn = dma_to_pfn(dev, handle);

					struct page *page;

					int ret;

					/* If the PFN is not valid, we do not have a struct page */

					if (!pfn_valid(pfn))

						return -ENXIO;

					page = pfn_to_page(pfn);

					ret = sg_alloc_table(sgt, 1, GFP_KERNEL);

					if (unlikely(ret))

						return ret;

									
										36

arch/arm/probes/kprobes/core.c
									
												View File
												
				@ -433,6 +433,7 @@ static __used __kprobes void *trampoline_handler(struct pt_regs *regs)

					struct hlist_node *tmp;

					unsigned long flags, orig_ret_address = 0;

					unsigned long trampoline_address = (unsigned long)&kretprobe_trampoline;

					kprobe_opcode_t *correct_ret_addr = NULL;

					INIT_HLIST_HEAD(&empty_rp);

					kretprobe_hash_lock(current, &head, &flags);

				@ -455,15 +456,7 @@ static __used __kprobes void *trampoline_handler(struct pt_regs *regs)

							/* another task is sharing our hash bucket */

							continue;

						if (ri->rp && ri->rp->handler) {

							__this_cpu_write(current_kprobe, &ri->rp->kp);

							get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;

							ri->rp->handler(ri, regs);

							__this_cpu_write(current_kprobe, NULL);

						}

						orig_ret_address = (unsigned long)ri->ret_addr;

						recycle_rp_inst(ri, &empty_rp);

						if (orig_ret_address != trampoline_address)

							/*

				@ -475,6 +468,33 @@ static __used __kprobes void *trampoline_handler(struct pt_regs *regs)

					}

					kretprobe_assert(ri, orig_ret_address, trampoline_address);

					correct_ret_addr = ri->ret_addr;

					hlist_for_each_entry_safe(ri, tmp, head, hlist) {

						if (ri->task != current)

							/* another task is sharing our hash bucket */

							continue;

						orig_ret_address = (unsigned long)ri->ret_addr;

						if (ri->rp && ri->rp->handler) {

							__this_cpu_write(current_kprobe, &ri->rp->kp);

							get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;

							ri->ret_addr = correct_ret_addr;

							ri->rp->handler(ri, regs);

							__this_cpu_write(current_kprobe, NULL);

						}

						recycle_rp_inst(ri, &empty_rp);

						if (orig_ret_address != trampoline_address)

							/*

							 * This is the real return address. Any other

							 * instances associated with this task are for

							 * other calls deeper on the call stack

							 */

							break;

					}

					kretprobe_hash_unlock(current, &flags);

					hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {

									
										11

arch/arm/probes/kprobes/test-core.c
									
												View File
												
				@ -976,7 +976,10 @@ static void coverage_end(void)

				void __naked __kprobes_test_case_start(void)

				{

					__asm__ __volatile__ (

						"stmdb	sp!, {r4-r11}				\n\t"

						"mov	r2, sp					\n\t"

						"bic	r3, r2, #7				\n\t"

						"mov	sp, r3					\n\t"

						"stmdb	sp!, {r2-r11}				\n\t"

						"sub	sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t"

						"bic	r0, lr, #1  @ r0 = inline data		\n\t"

						"mov	r1, sp					\n\t"

				@ -996,7 +999,8 @@ void __naked __kprobes_test_case_end_32(void)

						"movne	pc, r0					\n\t"

						"mov	r0, r4					\n\t"

						"add	sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t"

						"ldmia	sp!, {r4-r11}				\n\t"

						"ldmia	sp!, {r2-r11}				\n\t"

						"mov	sp, r2					\n\t"

						"mov	pc, r0					\n\t"

					);

				}

				@ -1012,7 +1016,8 @@ void __naked __kprobes_test_case_end_16(void)

						"bxne	r0					\n\t"

						"mov	r0, r4					\n\t"

						"add	sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t"

						"ldmia	sp!, {r4-r11}				\n\t"

						"ldmia	sp!, {r2-r11}				\n\t"

						"mov	sp, r2					\n\t"

						"bx	r0					\n\t"

					);

				}

									
										8

arch/arm64/Makefile
									
												View File
												
				@ -14,8 +14,12 @@ LDFLAGS_vmlinux	:=-p --no-undefined -X

				CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)

				GZFLAGS		:=-9

				ifneq ($(CONFIG_RELOCATABLE),)

				LDFLAGS_vmlinux		+= -pie -shared -Bsymbolic

				ifeq ($(CONFIG_RELOCATABLE), y)

				# Pass --no-apply-dynamic-relocs to restore pre-binutils-2.27 behaviour

				# for relative relocs, since this leads to better Image compression

				# with the relocation offsets always being zero.

				LDFLAGS_vmlinux		+= -pie -shared -Bsymbolic \

							$(call ld-option, --no-apply-dynamic-relocs)

				endif

				ifeq ($(CONFIG_ARM64_ERRATUM_843419),y)

									
										4

arch/arm64/kvm/handle_exit.c
									
												View File
												
				@ -44,7 +44,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)

					ret = kvm_psci_call(vcpu);

					if (ret < 0) {

						kvm_inject_undefined(vcpu);

						vcpu_set_reg(vcpu, 0, ~0UL);

						return 1;

					}

				@ -53,7 +53,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)

				static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)

				{

					kvm_inject_undefined(vcpu);

					vcpu_set_reg(vcpu, 0, ~0UL);

					return 1;

				}

									
										2

arch/arm64/mm/init.c
									
												View File
												
				@ -296,6 +296,7 @@ void __init arm64_memblock_init(void)

						arm64_dma_phys_limit = max_zone_dma_phys();

					else

						arm64_dma_phys_limit = PHYS_MASK + 1;

					high_memory = __va(memblock_end_of_DRAM() - 1) + 1;

					dma_contiguous_reserve(arm64_dma_phys_limit);

					memblock_allow_resize();

				@ -322,7 +323,6 @@ void __init bootmem_init(void)

					sparse_init();

					zone_sizes_init(min, max);

					high_memory = __va((max << PAGE_SHIFT) - 1) + 1;

					memblock_dump_all();

				}

7

arch/blackfin/Kconfig

View File

 @ -319,11 +319,14 @@ config BF53x
 config GPIO_ADI
 	def_bool y
 	depends on !PINCTRL
 	depends on (BF51x || BF52x || BF53x || BF538 || BF539 || BF561)
 config PINCTRL
 config PINCTRL_BLACKFIN_ADI2
 	def_bool y
 	depends on BF54x || BF60x
 	depends on (BF54x || BF60x)
 	select PINCTRL
 	select PINCTRL_ADI2
 config MEM_MT48LC64M4A2FB_7E
 	bool

1

arch/blackfin/Kconfig.debug

View File

 @ -17,6 +17,7 @@ config DEBUG_VERBOSE
 config DEBUG_MMRS
 	tristate "Generate Blackfin MMR tree"
 	depends on !PINCTRL
 	select DEBUG_FS
 	help
 	  Create a tree of Blackfin MMRs via the debugfs tree.  If

									
										2

arch/mips/ar7/platform.c
									
												View File
												
				@ -576,7 +576,7 @@ static int __init ar7_register_uarts(void)

					uart_port.type		= PORT_AR7;

					uart_port.uartclk	= clk_get_rate(bus_clk) / 2;

					uart_port.iotype	= UPIO_MEM32;

					uart_port.flags		= UPF_FIXED_TYPE;

					uart_port.flags		= UPF_FIXED_TYPE | UPF_BOOT_AUTOCONF;

					uart_port.regshift	= 2;

					uart_port.line		= 0;

									
										12

arch/mips/kernel/process.c
									
												View File
												
				@ -683,6 +683,18 @@ int mips_set_process_fp_mode(struct task_struct *task, unsigned int value)

					struct task_struct *t;

					int max_users;

					/* If nothing to change, return right away, successfully.  */

					if (value == mips_get_process_fp_mode(task))

						return 0;

					/* Only accept a mode change if 64-bit FP enabled for o32.  */

					if (!IS_ENABLED(CONFIG_MIPS_O32_FP64_SUPPORT))

						return -EOPNOTSUPP;

					/* And only for o32 tasks.  */

					if (IS_ENABLED(CONFIG_64BIT) && !test_thread_flag(TIF_32BIT_REGS))

						return -EOPNOTSUPP;

					/* Check the value is valid */

					if (value & ~known_bits)

						return -EOPNOTSUPP;

									
										153

arch/mips/kernel/ptrace.c
									
												View File
												
				@ -439,25 +439,38 @@ static int gpr64_set(struct task_struct *target,

				#endif /* CONFIG_64BIT */

				static int fpr_get(struct task_struct *target,

						   const struct user_regset *regset,

						   unsigned int pos, unsigned int count,

						   void *kbuf, void __user *ubuf)

				/*

				 * Copy the floating-point context to the supplied NT_PRFPREG buffer,

				 * !CONFIG_CPU_HAS_MSA variant.  FP context's general register slots

				 * correspond 1:1 to buffer slots.  Only general registers are copied.

				 */

				static int fpr_get_fpa(struct task_struct *target,

						       unsigned int *pos, unsigned int *count,

						       void **kbuf, void __user **ubuf)

				{

					unsigned i;

					int err;

					return user_regset_copyout(pos, count, kbuf, ubuf,

								   &target->thread.fpu,

								   0, NUM_FPU_REGS * sizeof(elf_fpreg_t));

				}

				/*

				 * Copy the floating-point context to the supplied NT_PRFPREG buffer,

				 * CONFIG_CPU_HAS_MSA variant.  Only lower 64 bits of FP context's

				 * general register slots are copied to buffer slots.  Only general

				 * registers are copied.

				 */

				static int fpr_get_msa(struct task_struct *target,

						       unsigned int *pos, unsigned int *count,

						       void **kbuf, void __user **ubuf)

				{

					unsigned int i;

					u64 fpr_val;

					int err;

					/* XXX fcr31  */

					if (sizeof(target->thread.fpu.fpr[i]) == sizeof(elf_fpreg_t))

						return user_regset_copyout(&pos, &count, &kbuf, &ubuf,

									   &target->thread.fpu,

									   0, sizeof(elf_fpregset_t));

					BUILD_BUG_ON(sizeof(fpr_val) != sizeof(elf_fpreg_t));

					for (i = 0; i < NUM_FPU_REGS; i++) {

						fpr_val = get_fpr64(&target->thread.fpu.fpr[i], 0);

						err = user_regset_copyout(&pos, &count, &kbuf, &ubuf,

						err = user_regset_copyout(pos, count, kbuf, ubuf,

									  &fpr_val, i * sizeof(elf_fpreg_t),

									  (i + 1) * sizeof(elf_fpreg_t));

						if (err)

				@ -467,27 +480,64 @@ static int fpr_get(struct task_struct *target,

					return 0;

				}

				static int fpr_set(struct task_struct *target,

				/*

				 * Copy the floating-point context to the supplied NT_PRFPREG buffer.

				 * Choose the appropriate helper for general registers, and then copy

				 * the FCSR register separately.

				 */

				static int fpr_get(struct task_struct *target,

						   const struct user_regset *regset,

						   unsigned int pos, unsigned int count,

						   const void *kbuf, const void __user *ubuf)

						   void *kbuf, void __user *ubuf)

				{

					unsigned i;

					const int fcr31_pos = NUM_FPU_REGS * sizeof(elf_fpreg_t);

					int err;

					if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t))

						err = fpr_get_fpa(target, &pos, &count, &kbuf, &ubuf);

					else

						err = fpr_get_msa(target, &pos, &count, &kbuf, &ubuf);

					if (err)

						return err;

					err = user_regset_copyout(&pos, &count, &kbuf, &ubuf,

								  &target->thread.fpu.fcr31,

								  fcr31_pos, fcr31_pos + sizeof(u32));

					return err;

				}

				/*

				 * Copy the supplied NT_PRFPREG buffer to the floating-point context,

				 * !CONFIG_CPU_HAS_MSA variant.   Buffer slots correspond 1:1 to FP

				 * context's general register slots.  Only general registers are copied.

				 */

				static int fpr_set_fpa(struct task_struct *target,

						       unsigned int *pos, unsigned int *count,

						       const void **kbuf, const void __user **ubuf)

				{

					return user_regset_copyin(pos, count, kbuf, ubuf,

								  &target->thread.fpu,

								  0, NUM_FPU_REGS * sizeof(elf_fpreg_t));

				}

				/*

				 * Copy the supplied NT_PRFPREG buffer to the floating-point context,

				 * CONFIG_CPU_HAS_MSA variant.  Buffer slots are copied to lower 64

				 * bits only of FP context's general register slots.  Only general

				 * registers are copied.

				 */

				static int fpr_set_msa(struct task_struct *target,

						       unsigned int *pos, unsigned int *count,

						       const void **kbuf, const void __user **ubuf)

				{

					unsigned int i;

					u64 fpr_val;

					/* XXX fcr31  */

					init_fp_ctx(target);

					if (sizeof(target->thread.fpu.fpr[i]) == sizeof(elf_fpreg_t))

						return user_regset_copyin(&pos, &count, &kbuf, &ubuf,

									  &target->thread.fpu,

									  0, sizeof(elf_fpregset_t));

					int err;

					BUILD_BUG_ON(sizeof(fpr_val) != sizeof(elf_fpreg_t));

					for (i = 0; i < NUM_FPU_REGS && count >= sizeof(elf_fpreg_t); i++) {

						err = user_regset_copyin(&pos, &count, &kbuf, &ubuf,

					for (i = 0; i < NUM_FPU_REGS && *count > 0; i++) {

						err = user_regset_copyin(pos, count, kbuf, ubuf,

									 &fpr_val, i * sizeof(elf_fpreg_t),

									 (i + 1) * sizeof(elf_fpreg_t));

						if (err)

				@ -498,6 +548,53 @@ static int fpr_set(struct task_struct *target,

					return 0;

				}

				/*

				 * Copy the supplied NT_PRFPREG buffer to the floating-point context.

				 * Choose the appropriate helper for general registers, and then copy

				 * the FCSR register separately.

				 *

				 * We optimize for the case where `count % sizeof(elf_fpreg_t) == 0',

				 * which is supposed to have been guaranteed by the kernel before

				 * calling us, e.g. in `ptrace_regset'.  We enforce that requirement,

				 * so that we can safely avoid preinitializing temporaries for

				 * partial register writes.

				 */

				static int fpr_set(struct task_struct *target,

						   const struct user_regset *regset,

						   unsigned int pos, unsigned int count,

						   const void *kbuf, const void __user *ubuf)

				{

					const int fcr31_pos = NUM_FPU_REGS * sizeof(elf_fpreg_t);

					u32 fcr31;

					int err;

					BUG_ON(count % sizeof(elf_fpreg_t));

					if (pos + count > sizeof(elf_fpregset_t))

						return -EIO;

					init_fp_ctx(target);

					if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t))

						err = fpr_set_fpa(target, &pos, &count, &kbuf, &ubuf);

					else

						err = fpr_set_msa(target, &pos, &count, &kbuf, &ubuf);

					if (err)

						return err;

					if (count > 0) {

						err = user_regset_copyin(&pos, &count, &kbuf, &ubuf,

									 &fcr31,

									 fcr31_pos, fcr31_pos + sizeof(u32));

						if (err)

							return err;

						ptrace_setfcr31(target, fcr31);

					}

					return err;

				}

				enum mips_regset {

					REGSET_GPR,

					REGSET_FPR,

									
										28

arch/mips/math-emu/cp1emu.c
									
												View File
												
				@ -1781,7 +1781,7 @@ static int fpu_emu(struct pt_regs *xcp, struct mips_fpu_struct *ctx,

							SPFROMREG(fs, MIPSInst_FS(ir));

							SPFROMREG(fd, MIPSInst_FD(ir));

							rv.s = ieee754sp_maddf(fd, fs, ft);

							break;

							goto copcsr;

						}

						case fmsubf_op: {

				@ -1794,7 +1794,7 @@ static int fpu_emu(struct pt_regs *xcp, struct mips_fpu_struct *ctx,

							SPFROMREG(fs, MIPSInst_FS(ir));

							SPFROMREG(fd, MIPSInst_FD(ir));

							rv.s = ieee754sp_msubf(fd, fs, ft);

							break;

							goto copcsr;

						}

						case frint_op: {

				@ -1818,7 +1818,7 @@ static int fpu_emu(struct pt_regs *xcp, struct mips_fpu_struct *ctx,

							SPFROMREG(fs, MIPSInst_FS(ir));

							rv.w = ieee754sp_2008class(fs);

							rfmt = w_fmt;

							break;

							goto copcsr;

						}

						case fmin_op: {

				@ -1830,7 +1830,7 @@ static int fpu_emu(struct pt_regs *xcp, struct mips_fpu_struct *ctx,

							SPFROMREG(ft, MIPSInst_FT(ir));

							SPFROMREG(fs, MIPSInst_FS(ir));

							rv.s = ieee754sp_fmin(fs, ft);

							break;

							goto copcsr;

						}

						case fmina_op: {

				@ -1842,7 +1842,7 @@ static int fpu_emu(struct pt_regs *xcp, struct mips_fpu_struct *ctx,

							SPFROMREG(ft, MIPSInst_FT(ir));

							SPFROMREG(fs, MIPSInst_FS(ir));

							rv.s = ieee754sp_fmina(fs, ft);

							break;

							goto copcsr;

						}

						case fmax_op: {

				@ -1854,7 +1854,7 @@ static int fpu_emu(struct pt_regs *xcp, struct mips_fpu_struct *ctx,

							SPFROMREG(ft, MIPSInst_FT(ir));

							SPFROMREG(fs, MIPSInst_FS(ir));

							rv.s = ieee754sp_fmax(fs, ft);

							break;

							goto copcsr;

						}

						case fmaxa_op: {

				@ -1866,7 +1866,7 @@ static int fpu_emu(struct pt_regs *xcp, struct mips_fpu_struct *ctx,

							SPFROMREG(ft, MIPSInst_FT(ir));

							SPFROMREG(fs, MIPSInst_FS(ir));

							rv.s = ieee754sp_fmaxa(fs, ft);

							break;

							goto copcsr;

						}

						case fabs_op:

				@ -2110,7 +2110,7 @@ copcsr:

							DPFROMREG(fs, MIPSInst_FS(ir));

							DPFROMREG(fd, MIPSInst_FD(ir));

							rv.d = ieee754dp_maddf(fd, fs, ft);

							break;

							goto copcsr;

						}

						case fmsubf_op: {

				@ -2123,7 +2123,7 @@ copcsr:

							DPFROMREG(fs, MIPSInst_FS(ir));

							DPFROMREG(fd, MIPSInst_FD(ir));

							rv.d = ieee754dp_msubf(fd, fs, ft);

							break;

							goto copcsr;

						}

						case frint_op: {

				@ -2147,7 +2147,7 @@ copcsr:

							DPFROMREG(fs, MIPSInst_FS(ir));

							rv.w = ieee754dp_2008class(fs);

							rfmt = w_fmt;

							break;

							goto copcsr;

						}

						case fmin_op: {

				@ -2159,7 +2159,7 @@ copcsr:

							DPFROMREG(ft, MIPSInst_FT(ir));

							DPFROMREG(fs, MIPSInst_FS(ir));

							rv.d = ieee754dp_fmin(fs, ft);

							break;

							goto copcsr;

						}

						case fmina_op: {

				@ -2171,7 +2171,7 @@ copcsr:

							DPFROMREG(ft, MIPSInst_FT(ir));

							DPFROMREG(fs, MIPSInst_FS(ir));

							rv.d = ieee754dp_fmina(fs, ft);

							break;

							goto copcsr;

						}

						case fmax_op: {

				@ -2183,7 +2183,7 @@ copcsr:

							DPFROMREG(ft, MIPSInst_FT(ir));

							DPFROMREG(fs, MIPSInst_FS(ir));

							rv.d = ieee754dp_fmax(fs, ft);

							break;

							goto copcsr;

						}

						case fmaxa_op: {

				@ -2195,7 +2195,7 @@ copcsr:

							DPFROMREG(ft, MIPSInst_FT(ir));

							DPFROMREG(fs, MIPSInst_FS(ir));

							rv.d = ieee754dp_fmaxa(fs, ft);

							break;

							goto copcsr;

						}

						case fabs_op:

									
										2

arch/mn10300/mm/misalignment.c
									
												View File
												
				@ -437,7 +437,7 @@ transfer_failed:

					info.si_signo	= SIGSEGV;

					info.si_errno	= 0;

					info.si_code	= 0;

					info.si_code	= SEGV_MAPERR;

					info.si_addr	= (void *) regs->pc;

					force_sig_info(SIGSEGV, &info, current);

					return;

									
										2

arch/openrisc/include/asm/uaccess.h
									
												View File
												
				@ -211,7 +211,7 @@ do {									\

					case 1: __get_user_asm(x, ptr, retval, "l.lbz"); break;		\

					case 2: __get_user_asm(x, ptr, retval, "l.lhz"); break;		\

					case 4: __get_user_asm(x, ptr, retval, "l.lwz"); break;		\

					case 8: __get_user_asm2(x, ptr, retval);			\

					case 8: __get_user_asm2(x, ptr, retval); break;			\

					default: (x) = __get_user_bad();				\

					}								\

				} while (0)

									
										10

arch/openrisc/kernel/traps.c
									
												View File
												
				@ -302,12 +302,12 @@ asmlinkage void do_unaligned_access(struct pt_regs *regs, unsigned long address)

					siginfo_t info;

					if (user_mode(regs)) {

						/* Send a SIGSEGV */

						info.si_signo = SIGSEGV;

						/* Send a SIGBUS */

						info.si_signo = SIGBUS;

						info.si_errno = 0;

						/* info.si_code has been set above */

						info.si_addr = (void *)address;

						force_sig_info(SIGSEGV, &info, current);

						info.si_code = BUS_ADRALN;

						info.si_addr = (void __user *)address;

						force_sig_info(SIGBUS, &info, current);

					} else {

						printk("KERNEL: Unaligned Access 0x%.8lx\n", address);

						show_registers(regs);

									
										2

arch/parisc/include/asm/ldcw.h
									
												View File
												
				@ -11,6 +11,7 @@

				   for the semaphore.  */

				#define __PA_LDCW_ALIGNMENT	16

				#define __PA_LDCW_ALIGN_ORDER	4

				#define __ldcw_align(a) ({					\

					unsigned long __ret = (unsigned long) &(a)->lock[0];	\

					__ret = (__ret + __PA_LDCW_ALIGNMENT - 1)		\

				@ -28,6 +29,7 @@

				   ldcd). */

				#define __PA_LDCW_ALIGNMENT	4

				#define __PA_LDCW_ALIGN_ORDER	2

				#define __ldcw_align(a) (&(a)->slock)

				#define __LDCW	"ldcw,co"

									
										13

arch/parisc/kernel/entry.S
									
												View File
												
				@ -35,6 +35,7 @@

				#include <asm/pgtable.h>

				#include <asm/signal.h>

				#include <asm/unistd.h>

				#include <asm/ldcw.h>

				#include <asm/thread_info.h>

				#include <linux/linkage.h>

				@ -46,6 +47,14 @@

				#endif

					.import		pa_tlb_lock,data

					.macro  load_pa_tlb_lock reg

				#if __PA_LDCW_ALIGNMENT > 4

					load32	PA(pa_tlb_lock) + __PA_LDCW_ALIGNMENT-1, \reg

					depi	0,31,__PA_LDCW_ALIGN_ORDER, \reg

				#else

					load32	PA(pa_tlb_lock), \reg

				#endif

					.endm

					/* space_to_prot macro creates a prot id from a space id */

				@ -457,7 +466,7 @@

					.macro		tlb_lock	spc,ptp,pte,tmp,tmp1,fault

				#ifdef CONFIG_SMP

					cmpib,COND(=),n	0,\spc,2f

					load32		PA(pa_tlb_lock),\tmp

					load_pa_tlb_lock \tmp

				1:	LDCW		0(\tmp),\tmp1

					cmpib,COND(=)	0,\tmp1,1b

					nop

				@ -480,7 +489,7 @@

					/* Release pa_tlb_lock lock. */

					.macro		tlb_unlock1	spc,tmp

				#ifdef CONFIG_SMP

					load32		PA(pa_tlb_lock),\tmp

					load_pa_tlb_lock \tmp

					tlb_unlock0	\spc,\tmp

				#endif

					.endm

									
										9

arch/parisc/kernel/pacache.S
									
												View File
												
				@ -36,6 +36,7 @@

				#include <asm/assembly.h>

				#include <asm/pgtable.h>

				#include <asm/cache.h>

				#include <asm/ldcw.h>

				#include <linux/linkage.h>

					.text

				@ -333,8 +334,12 @@ ENDPROC_CFI(flush_data_cache_local)

					.macro	tlb_lock	la,flags,tmp

				#ifdef CONFIG_SMP

					ldil		L%pa_tlb_lock,%r1

					ldo		R%pa_tlb_lock(%r1),\la

				#if __PA_LDCW_ALIGNMENT > 4

					load32		pa_tlb_lock + __PA_LDCW_ALIGNMENT-1, \la

					depi		0,31,__PA_LDCW_ALIGN_ORDER, \la

				#else

					load32		pa_tlb_lock, \la

				#endif

					rsm		PSW_SM_I,\flags

				1:	LDCW		0(\la),\tmp

					cmpib,<>,n	0,\tmp,3f

									
										39

arch/parisc/kernel/process.c
									
												View File
												
				@ -39,6 +39,7 @@

				#include <linux/kernel.h>

				#include <linux/mm.h>

				#include <linux/fs.h>

				#include <linux/cpu.h>

				#include <linux/module.h>

				#include <linux/personality.h>

				#include <linux/ptrace.h>

				@ -180,6 +181,44 @@ int dump_task_fpu (struct task_struct *tsk, elf_fpregset_t *r)

					return 1;

				}

				/*

				 * Idle thread support

				 *

				 * Detect when running on QEMU with SeaBIOS PDC Firmware and let

				 * QEMU idle the host too.

				 */

				int running_on_qemu __read_mostly;

				void __cpuidle arch_cpu_idle_dead(void)

				{

					/* nop on real hardware, qemu will offline CPU. */

					asm volatile("or %%r31,%%r31,%%r31\n":::);

				}

				void __cpuidle arch_cpu_idle(void)

				{

					local_irq_enable();

					/* nop on real hardware, qemu will idle sleep. */

					asm volatile("or %%r10,%%r10,%%r10\n":::);

				}

				static int __init parisc_idle_init(void)

				{

					const char *marker;

					/* check QEMU/SeaBIOS marker in PAGE0 */

					marker = (char *) &PAGE0->pad0;

					running_on_qemu = (memcmp(marker, "SeaBIOS", 8) == 0);

					if (!running_on_qemu)

						cpu_idle_poll_ctrl(1);

					return 0;

				}

				arch_initcall(parisc_idle_init);

				/*

				 * Copy architecture-specific thread state

				 */

1

arch/powerpc/Kconfig

View File

 @ -128,6 +128,7 @@ config PPC
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_VULNERABILITIES	if PPC_BOOK3S_64
 	select GENERIC_TIME_VSYSCALL_OLD
 	select GENERIC_CLOCKEVENTS
 	select GENERIC_CLOCKEVENTS_BROADCAST if SMP

									
										6

arch/powerpc/include/asm/exception-64e.h
									
												View File
												
				@ -209,5 +209,11 @@ exc_##label##_book3e:

					ori	r3,r3,vector_offset@l;		\

					mtspr	SPRN_IVOR##vector_number,r3;

				#define RFI_TO_KERNEL							\

					rfi

				#define RFI_TO_USER							\

					rfi

				#endif /* _ASM_POWERPC_EXCEPTION_64E_H */

									
										53

arch/powerpc/include/asm/exception-64s.h
									
												View File
												
				@ -51,6 +51,59 @@

				#define EX_PPR		88	/* SMT thread status register (priority) */

				#define EX_CTR		96

				/*

				 * Macros for annotating the expected destination of (h)rfid

				 *

				 * The nop instructions allow us to insert one or more instructions to flush the

				 * L1-D cache when returning to userspace or a guest.

				 */

				#define RFI_FLUSH_SLOT							\

					RFI_FLUSH_FIXUP_SECTION;					\

					nop;								\

					nop;								\

					nop

				#define RFI_TO_KERNEL							\

					rfid

				#define RFI_TO_USER							\

					RFI_FLUSH_SLOT;							\

					rfid;								\

					b	rfi_flush_fallback

				#define RFI_TO_USER_OR_KERNEL						\

					RFI_FLUSH_SLOT;							\

					rfid;								\

					b	rfi_flush_fallback

				#define RFI_TO_GUEST							\

					RFI_FLUSH_SLOT;							\

					rfid;								\

					b	rfi_flush_fallback

				#define HRFI_TO_KERNEL							\

					hrfid

				#define HRFI_TO_USER							\

					RFI_FLUSH_SLOT;							\

					hrfid;								\

					b	hrfi_flush_fallback

				#define HRFI_TO_USER_OR_KERNEL						\

					RFI_FLUSH_SLOT;							\

					hrfid;								\

					b	hrfi_flush_fallback

				#define HRFI_TO_GUEST							\

					RFI_FLUSH_SLOT;							\

					hrfid;								\

					b	hrfi_flush_fallback

				#define HRFI_TO_UNKNOWN							\

					RFI_FLUSH_SLOT;							\

					hrfid;								\

					b	hrfi_flush_fallback

				#ifdef CONFIG_RELOCATABLE

				#define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h)			\

					mfspr	r11,SPRN_##h##SRR0;	/* save SRR0 */			\

									
										15

arch/powerpc/include/asm/feature-fixups.h
									
												View File
												
				@ -189,4 +189,19 @@ void apply_feature_fixups(void);

				void setup_feature_keys(void);

				#endif

				#define RFI_FLUSH_FIXUP_SECTION				\

				951:							\

					.pushsection __rfi_flush_fixup,"a";		\

					.align 2;					\

				952:							\

					FTR_ENTRY_OFFSET 951b-952b;			\

					.popsection;

				#ifndef __ASSEMBLY__

				extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;

				#endif

				#endif /* __ASM_POWERPC_FEATURE_FIXUPS_H */

									
										18

arch/powerpc/include/asm/hvcall.h
									
												View File
												
				@ -240,6 +240,7 @@

				#define H_GET_HCA_INFO          0x1B8

				#define H_GET_PERF_COUNT        0x1BC

				#define H_MANAGE_TRACE          0x1C0

				#define H_GET_CPU_CHARACTERISTICS 0x1C8

				#define H_FREE_LOGICAL_LAN_BUFFER 0x1D4

				#define H_QUERY_INT_STATE       0x1E4

				#define H_POLL_PENDING		0x1D8

				@ -306,7 +307,19 @@

				#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE	3

				#define H_SET_MODE_RESOURCE_LE			4

				/* H_GET_CPU_CHARACTERISTICS return values */

				#define H_CPU_CHAR_SPEC_BAR_ORI31	(1ull << 63) // IBM bit 0

				#define H_CPU_CHAR_BCCTRL_SERIALISED	(1ull << 62) // IBM bit 1

				#define H_CPU_CHAR_L1D_FLUSH_ORI30	(1ull << 61) // IBM bit 2

				#define H_CPU_CHAR_L1D_FLUSH_TRIG2	(1ull << 60) // IBM bit 3

				#define H_CPU_CHAR_L1D_THREAD_PRIV	(1ull << 59) // IBM bit 4

				#define H_CPU_BEHAV_FAVOUR_SECURITY	(1ull << 63) // IBM bit 0

				#define H_CPU_BEHAV_L1D_FLUSH_PR	(1ull << 62) // IBM bit 1

				#define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR	(1ull << 61) // IBM bit 2

				#ifndef __ASSEMBLY__

				#include <linux/types.h>

				/**

				 * plpar_hcall_norets: - Make a pseries hypervisor call with no return arguments

				@ -433,6 +446,11 @@ static inline unsigned long cmo_get_page_size(void)

				}

				#endif /* CONFIG_PPC_PSERIES */

				struct h_cpu_char_result {

					u64 character;

					u64 behaviour;

				};

				#endif /* __ASSEMBLY__ */

				#endif /* __KERNEL__ */

				#endif /* _ASM_POWERPC_HVCALL_H */

									
										10

arch/powerpc/include/asm/paca.h
									
												View File
												
				@ -205,6 +205,16 @@ struct paca_struct {

					struct sibling_subcore_state *sibling_subcore_state;

				#endif

				#endif

				#ifdef CONFIG_PPC_BOOK3S_64

					/*

					 * rfi fallback flush must be in its own cacheline to prevent

					 * other paca data leaking into the L1d

					 */

					u64 exrfi[13] __aligned(0x80);

					void *rfi_flush_fallback_area;

					u64 l1d_flush_congruence;

					u64 l1d_flush_sets;

				#endif

				};

				#ifdef CONFIG_PPC_BOOK3S

									
										14

arch/powerpc/include/asm/plpar_wrappers.h
									
												View File
												
				@ -340,4 +340,18 @@ static inline long plapr_set_watchpoint0(unsigned long dawr0, unsigned long dawr

					return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR, dawr0, dawrx0);

				}

				static inline long plpar_get_cpu_characteristics(struct h_cpu_char_result *p)

				{

					unsigned long retbuf[PLPAR_HCALL_BUFSIZE];

					long rc;

					rc = plpar_hcall(H_GET_CPU_CHARACTERISTICS, retbuf);

					if (rc == H_SUCCESS) {

						p->character = retbuf[0];

						p->behaviour = retbuf[1];

					}

					return rc;

				}

				#endif /* _ASM_POWERPC_PLPAR_WRAPPERS_H */

									
										13

arch/powerpc/include/asm/setup.h
									
												View File
												
				@ -38,6 +38,19 @@ static inline void pseries_big_endian_exceptions(void) {}

				static inline void pseries_little_endian_exceptions(void) {}

				#endif /* CONFIG_PPC_PSERIES */

				void rfi_flush_enable(bool enable);

				/* These are bit flags */

				enum l1d_flush_type {

					L1D_FLUSH_NONE		= 0x1,

					L1D_FLUSH_FALLBACK	= 0x2,

					L1D_FLUSH_ORI		= 0x4,

					L1D_FLUSH_MTTRIG	= 0x8,

				};

				void __init setup_rfi_flush(enum l1d_flush_type, bool enable);

				void do_rfi_flush_fixups(enum l1d_flush_type types);

				#endif /* !__ASSEMBLY__ */

				#endif	/* _ASM_POWERPC_SETUP_H */

									
										4

arch/powerpc/kernel/asm-offsets.c
									
												View File
												
				@ -240,6 +240,10 @@ int main(void)

				#ifdef CONFIG_PPC_BOOK3S_64

					DEFINE(PACAMCEMERGSP, offsetof(struct paca_struct, mc_emergency_sp));

					DEFINE(PACA_IN_MCE, offsetof(struct paca_struct, in_mce));

					DEFINE(PACA_RFI_FLUSH_FALLBACK_AREA, offsetof(struct paca_struct, rfi_flush_fallback_area));

					DEFINE(PACA_EXRFI, offsetof(struct paca_struct, exrfi));

					DEFINE(PACA_L1D_FLUSH_CONGRUENCE, offsetof(struct paca_struct, l1d_flush_congruence));

					DEFINE(PACA_L1D_FLUSH_SETS, offsetof(struct paca_struct, l1d_flush_sets));

				#endif

					DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id));

					DEFINE(PACAKEXECSTATE, offsetof(struct paca_struct, kexec_state));

									
										30

arch/powerpc/kernel/entry_64.S
									
												View File
												
				@ -251,13 +251,23 @@ BEGIN_FTR_SECTION

				END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)

					ld	r13,GPR13(r1)	/* only restore r13 if returning to usermode */

					ld	r2,GPR2(r1)

					ld	r1,GPR1(r1)

					mtlr	r4

					mtcr	r5

					mtspr	SPRN_SRR0,r7

					mtspr	SPRN_SRR1,r8

					RFI_TO_USER

					b	.	/* prevent speculative execution */

					/* exit to kernel */

				1:	ld	r2,GPR2(r1)

					ld	r1,GPR1(r1)

					mtlr	r4

					mtcr	r5

					mtspr	SPRN_SRR0,r7

					mtspr	SPRN_SRR1,r8

					RFI

					RFI_TO_KERNEL

					b	.	/* prevent speculative execution */

				syscall_error:	

				@ -859,7 +869,7 @@ BEGIN_FTR_SECTION

				END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)

					ACCOUNT_CPU_USER_EXIT(r13, r2, r4)

					REST_GPR(13, r1)

				1:

					mtspr	SPRN_SRR1,r3

					ld	r2,_CCR(r1)

				@ -872,8 +882,22 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)

					ld	r3,GPR3(r1)

					ld	r4,GPR4(r1)

					ld	r1,GPR1(r1)

					RFI_TO_USER

					b	.	/* prevent speculative execution */

					rfid

				1:	mtspr	SPRN_SRR1,r3

					ld	r2,_CCR(r1)

					mtcrf	0xFF,r2

					ld	r2,_NIP(r1)

					mtspr	SPRN_SRR0,r2

					ld	r0,GPR0(r1)

					ld	r2,GPR2(r1)

					ld	r3,GPR3(r1)

					ld	r4,GPR4(r1)

					ld	r1,GPR1(r1)

					RFI_TO_KERNEL

					b	.	/* prevent speculative execution */

				#endif /* CONFIG_PPC_BOOK3E */

									
										108

arch/powerpc/kernel/exceptions-64s.S
									
												View File
												
				@ -655,6 +655,8 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)

					andi.	r10,r12,MSR_RI	/* check for unrecoverable exception */

					beq-	2f

					andi.	r10,r12,MSR_PR	/* check for user mode (PR != 0) */

					bne	1f

					/* All done -- return from exception. */

				@ -671,7 +673,23 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)

					ld	r11,PACA_EXSLB+EX_R11(r13)

					ld	r12,PACA_EXSLB+EX_R12(r13)

					ld	r13,PACA_EXSLB+EX_R13(r13)

					rfid

					RFI_TO_KERNEL

					b	.	/* prevent speculative execution */

				1:

				.machine	push

				.machine	"power4"

					mtcrf	0x80,r9

					mtcrf	0x01,r9		/* slb_allocate uses cr0 and cr7 */

				.machine	pop

					RESTORE_PPR_PACA(PACA_EXSLB, r9)

					ld	r9,PACA_EXSLB+EX_R9(r13)

					ld	r10,PACA_EXSLB+EX_R10(r13)

					ld	r11,PACA_EXSLB+EX_R11(r13)

					ld	r12,PACA_EXSLB+EX_R12(r13)

					ld	r13,PACA_EXSLB+EX_R13(r13)

					RFI_TO_USER

					b	.	/* prevent speculative execution */

				2:	mfspr	r11,SPRN_SRR0

				@ -679,7 +697,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)

					mtspr	SPRN_SRR0,r10

					ld	r10,PACAKMSR(r13)

					mtspr	SPRN_SRR1,r10

					rfid

					RFI_TO_KERNEL

					b	.

				8:	mfspr	r11,SPRN_SRR0

				@ -1576,6 +1594,92 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)

					bl	kernel_bad_stack

					b	1b

					.globl rfi_flush_fallback

				rfi_flush_fallback:

					SET_SCRATCH0(r13);

					GET_PACA(r13);

					std	r9,PACA_EXRFI+EX_R9(r13)

					std	r10,PACA_EXRFI+EX_R10(r13)

					std	r11,PACA_EXRFI+EX_R11(r13)

					std	r12,PACA_EXRFI+EX_R12(r13)

					std	r8,PACA_EXRFI+EX_R13(r13)

					mfctr	r9

					ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)

					ld	r11,PACA_L1D_FLUSH_SETS(r13)

					ld	r12,PACA_L1D_FLUSH_CONGRUENCE(r13)

					/*

					 * The load adresses are at staggered offsets within cachelines,

					 * which suits some pipelines better (on others it should not

					 * hurt).

					 */

					addi	r12,r12,8

					mtctr	r11

					DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */

					/* order ld/st prior to dcbt stop all streams with flushing */

					sync

				1:	li	r8,0

					.rept	8 /* 8-way set associative */

					ldx	r11,r10,r8

					add	r8,r8,r12

					xor	r11,r11,r11	// Ensure r11 is 0 even if fallback area is not

					add	r8,r8,r11	// Add 0, this creates a dependency on the ldx

					.endr

					addi	r10,r10,128 /* 128 byte cache line */

					bdnz	1b

					mtctr	r9

					ld	r9,PACA_EXRFI+EX_R9(r13)

					ld	r10,PACA_EXRFI+EX_R10(r13)

					ld	r11,PACA_EXRFI+EX_R11(r13)

					ld	r12,PACA_EXRFI+EX_R12(r13)

					ld	r8,PACA_EXRFI+EX_R13(r13)

					GET_SCRATCH0(r13);

					rfid

					.globl hrfi_flush_fallback

				hrfi_flush_fallback:

					SET_SCRATCH0(r13);

					GET_PACA(r13);

					std	r9,PACA_EXRFI+EX_R9(r13)

					std	r10,PACA_EXRFI+EX_R10(r13)

					std	r11,PACA_EXRFI+EX_R11(r13)

					std	r12,PACA_EXRFI+EX_R12(r13)

					std	r8,PACA_EXRFI+EX_R13(r13)

					mfctr	r9

					ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)

					ld	r11,PACA_L1D_FLUSH_SETS(r13)

					ld	r12,PACA_L1D_FLUSH_CONGRUENCE(r13)

					/*

					 * The load adresses are at staggered offsets within cachelines,

					 * which suits some pipelines better (on others it should not

					 * hurt).

					 */

					addi	r12,r12,8

					mtctr	r11

					DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */

					/* order ld/st prior to dcbt stop all streams with flushing */

					sync

				1:	li	r8,0

					.rept	8 /* 8-way set associative */

					ldx	r11,r10,r8

					add	r8,r8,r12

					xor	r11,r11,r11	// Ensure r11 is 0 even if fallback area is not

					add	r8,r8,r11	// Add 0, this creates a dependency on the ldx

					.endr

					addi	r10,r10,128 /* 128 byte cache line */

					bdnz	1b

					mtctr	r9

					ld	r9,PACA_EXRFI+EX_R9(r13)

					ld	r10,PACA_EXRFI+EX_R10(r13)

					ld	r11,PACA_EXRFI+EX_R11(r13)

					ld	r12,PACA_EXRFI+EX_R12(r13)

					ld	r8,PACA_EXRFI+EX_R13(r13)

					GET_SCRATCH0(r13);

					hrfid

				/*

				 * Called from arch_local_irq_enable when an interrupt needs

				 * to be resent. r3 contains 0x500, 0x900, 0xa00 or 0xe80 to indicate

									
										139

arch/powerpc/kernel/setup_64.c
									
												View File
												
				@ -37,6 +37,7 @@

				#include <linux/memblock.h>

				#include <linux/memory.h>

				#include <linux/nmi.h>

				#include <linux/debugfs.h>

				#include <asm/io.h>

				#include <asm/kdump.h>

				@ -678,4 +679,142 @@ static int __init disable_hardlockup_detector(void)

					return 0;

				}

				early_initcall(disable_hardlockup_detector);

				#ifdef CONFIG_PPC_BOOK3S_64

				static enum l1d_flush_type enabled_flush_types;

				static void *l1d_flush_fallback_area;

				static bool no_rfi_flush;

				bool rfi_flush;

				static int __init handle_no_rfi_flush(char *p)

				{

					pr_info("rfi-flush: disabled on command line.");

					no_rfi_flush = true;

					return 0;

				}

				early_param("no_rfi_flush", handle_no_rfi_flush);

				/*

				 * The RFI flush is not KPTI, but because users will see doco that says to use

				 * nopti we hijack that option here to also disable the RFI flush.

				 */

				static int __init handle_no_pti(char *p)

				{

					pr_info("rfi-flush: disabling due to 'nopti' on command line.\n");

					handle_no_rfi_flush(NULL);

					return 0;

				}

				early_param("nopti", handle_no_pti);

				static void do_nothing(void *unused)

				{

					/*

					 * We don't need to do the flush explicitly, just enter+exit kernel is

					 * sufficient, the RFI exit handlers will do the right thing.

					 */

				}

				void rfi_flush_enable(bool enable)

				{

					if (rfi_flush == enable)

						return;

					if (enable) {

						do_rfi_flush_fixups(enabled_flush_types);

						on_each_cpu(do_nothing, NULL, 1);

					} else

						do_rfi_flush_fixups(L1D_FLUSH_NONE);

					rfi_flush = enable;

				}

				static void init_fallback_flush(void)

				{

					u64 l1d_size, limit;

					int cpu;

					l1d_size = ppc64_caches.dsize;

					limit = min(safe_stack_limit(), ppc64_rma_size);

					/*

					 * Align to L1d size, and size it at 2x L1d size, to catch possible

					 * hardware prefetch runoff. We don't have a recipe for load patterns to

					 * reliably avoid the prefetcher.

					 */

					l1d_flush_fallback_area = __va(memblock_alloc_base(l1d_size * 2, l1d_size, limit));

					memset(l1d_flush_fallback_area, 0, l1d_size * 2);

					for_each_possible_cpu(cpu) {

						/*

						 * The fallback flush is currently coded for 8-way

						 * associativity. Different associativity is possible, but it

						 * will be treated as 8-way and may not evict the lines as

						 * effectively.

						 *

						 * 128 byte lines are mandatory.

						 */

						u64 c = l1d_size / 8;

						paca[cpu].rfi_flush_fallback_area = l1d_flush_fallback_area;

						paca[cpu].l1d_flush_congruence = c;

						paca[cpu].l1d_flush_sets = c / 128;

					}

				}

				void __init setup_rfi_flush(enum l1d_flush_type types, bool enable)

				{

					if (types & L1D_FLUSH_FALLBACK) {

						pr_info("rfi-flush: Using fallback displacement flush\n");

						init_fallback_flush();

					}

					if (types & L1D_FLUSH_ORI)

						pr_info("rfi-flush: Using ori type flush\n");

					if (types & L1D_FLUSH_MTTRIG)

						pr_info("rfi-flush: Using mttrig type flush\n");

					enabled_flush_types = types;

					if (!no_rfi_flush)

						rfi_flush_enable(enable);

				}

				#ifdef CONFIG_DEBUG_FS

				static int rfi_flush_set(void *data, u64 val)

				{

					if (val == 1)

						rfi_flush_enable(true);

					else if (val == 0)

						rfi_flush_enable(false);

					else

						return -EINVAL;

					return 0;

				}

				static int rfi_flush_get(void *data, u64 *val)

				{

					*val = rfi_flush ? 1 : 0;

					return 0;

				}

				DEFINE_SIMPLE_ATTRIBUTE(fops_rfi_flush, rfi_flush_get, rfi_flush_set, "%llu\n");

				static __init int rfi_flush_debugfs_init(void)

				{

					debugfs_create_file("rfi_flush", 0600, powerpc_debugfs_root, NULL, &fops_rfi_flush);

					return 0;

				}

				device_initcall(rfi_flush_debugfs_init);

				#endif

				ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)

				{

					if (rfi_flush)

						return sprintf(buf, "Mitigation: RFI Flush\n");

					return sprintf(buf, "Vulnerable\n");

				}

				#endif /* CONFIG_PPC_BOOK3S_64 */

				#endif

									
										9

arch/powerpc/kernel/vmlinux.lds.S
									
												View File
												
				@ -132,6 +132,15 @@ SECTIONS

					/* Read-only data */

					RODATA

				#ifdef CONFIG_PPC64

					. = ALIGN(8);

					__rfi_flush_fixup : AT(ADDR(__rfi_flush_fixup) - LOAD_OFFSET) {

						__start___rfi_flush_fixup = .;

						*(__rfi_flush_fixup)

						__stop___rfi_flush_fixup = .;

					}

				#endif

					EXCEPTION_TABLE(0)

					NOTES :kernel :notes

									
										42

arch/powerpc/lib/feature-fixups.c
									
												View File
												
				@ -23,6 +23,7 @@

				#include <asm/sections.h>

				#include <asm/setup.h>

				#include <asm/firmware.h>

				#include <asm/setup.h>

				struct fixup_entry {

					unsigned long	mask;

				@ -115,6 +116,47 @@ void do_feature_fixups(unsigned long value, void *fixup_start, void *fixup_end)

					}

				}

				#ifdef CONFIG_PPC_BOOK3S_64

				void do_rfi_flush_fixups(enum l1d_flush_type types)

				{

					unsigned int instrs[3], *dest;

					long *start, *end;

					int i;

					start = PTRRELOC(&__start___rfi_flush_fixup),

					end = PTRRELOC(&__stop___rfi_flush_fixup);

					instrs[0] = 0x60000000; /* nop */

					instrs[1] = 0x60000000; /* nop */

					instrs[2] = 0x60000000; /* nop */

					if (types & L1D_FLUSH_FALLBACK)

						/* b .+16 to fallback flush */

						instrs[0] = 0x48000010;

					i = 0;

					if (types & L1D_FLUSH_ORI) {

						instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */

						instrs[i++] = 0x63de0000; /* ori 30,30,0 L1d flush*/

					}

					if (types & L1D_FLUSH_MTTRIG)

						instrs[i++] = 0x7c12dba6; /* mtspr TRIG2,r0 (SPR #882) */

					for (i = 0; start < end; start++, i++) {

						dest = (void *)start + *start;

						pr_devel("patching dest %lx\n", (unsigned long)dest);

						patch_instruction(dest, instrs[0]);

						patch_instruction(dest + 1, instrs[1]);

						patch_instruction(dest + 2, instrs[2]);

					}

					printk(KERN_DEBUG "rfi-flush: patched %d locations\n", i);

				}

				#endif /* CONFIG_PPC_BOOK3S_64 */

				void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)

				{

					long *start, *end;

									
										8

arch/powerpc/perf/core-book3s.c
									
												View File
												
				@ -401,8 +401,12 @@ static __u64 power_pmu_bhrb_to(u64 addr)

					int ret;

					__u64 target;

					if (is_kernel_addr(addr))

						return branch_target((unsigned int *)addr);

					if (is_kernel_addr(addr)) {

						if (probe_kernel_read(&instr, (void *)addr, sizeof(instr)))

							return 0;

						return branch_target(&instr);

					}

					/* Userspace: need copy instruction here then translate it */

					pagefault_disable();

									
										2

arch/powerpc/perf/hv-24x7.c
									
												View File
												
				@ -516,7 +516,7 @@ static int memord(const void *d1, size_t s1, const void *d2, size_t s2)

				{

					if (s1 < s2)

						return 1;

					if (s2 > s1)

					if (s1 > s2)

						return -1;

					return memcmp(d1, d2, s1);

									
										6

arch/powerpc/platforms/powernv/opal-async.c
									
												View File
												
				@ -39,18 +39,18 @@ int __opal_async_get_token(void)

					int token;

					spin_lock_irqsave(&opal_async_comp_lock, flags);

					token = find_first_bit(opal_async_complete_map, opal_max_async_tokens);

					token = find_first_zero_bit(opal_async_token_map, opal_max_async_tokens);

					if (token >= opal_max_async_tokens) {

						token = -EBUSY;

						goto out;

					}

					if (__test_and_set_bit(token, opal_async_token_map)) {

					if (!__test_and_clear_bit(token, opal_async_complete_map)) {

						token = -EBUSY;

						goto out;

					}

					__clear_bit(token, opal_async_complete_map);

					__set_bit(token, opal_async_token_map);

				out:

					spin_unlock_irqrestore(&opal_async_comp_lock, flags);

									
										52

arch/powerpc/platforms/powernv/setup.c
									
												View File
												
				@ -35,13 +35,63 @@

				#include <asm/opal.h>

				#include <asm/kexec.h>

				#include <asm/smp.h>

				#include <asm/tm.h>

				#include <asm/setup.h>

				#include "powernv.h"

				static void pnv_setup_rfi_flush(void)

				{

					struct device_node *np, *fw_features;

					enum l1d_flush_type type;

					int enable;

					/* Default to fallback in case fw-features are not available */

					type = L1D_FLUSH_FALLBACK;

					enable = 1;

					np = of_find_node_by_name(NULL, "ibm,opal");

					fw_features = of_get_child_by_name(np, "fw-features");

					of_node_put(np);

					if (fw_features) {

						np = of_get_child_by_name(fw_features, "inst-l1d-flush-trig2");

						if (np && of_property_read_bool(np, "enabled"))

							type = L1D_FLUSH_MTTRIG;

						of_node_put(np);

						np = of_get_child_by_name(fw_features, "inst-l1d-flush-ori30,30,0");

						if (np && of_property_read_bool(np, "enabled"))

							type = L1D_FLUSH_ORI;

						of_node_put(np);

						/* Enable unless firmware says NOT to */

						enable = 2;

						np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-hv-1-to-0");

						if (np && of_property_read_bool(np, "disabled"))

							enable--;

						of_node_put(np);

						np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-pr-0-to-1");

						if (np && of_property_read_bool(np, "disabled"))

							enable--;

						of_node_put(np);

						of_node_put(fw_features);

					}

					setup_rfi_flush(type, enable > 0);

				}

				static void __init pnv_setup_arch(void)

				{

					set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);

					pnv_setup_rfi_flush();

					/* Initialize SMP */

					pnv_smp_init();

				@ -289,7 +339,7 @@ static unsigned long pnv_get_proc_freq(unsigned int cpu)

				{

					unsigned long ret_freq;

					ret_freq = cpufreq_quick_get(cpu) * 1000ul;

					ret_freq = cpufreq_get(cpu) * 1000ul;

					/*

					 * If the backend cpufreq driver does not exist,

									
										35

arch/powerpc/platforms/pseries/setup.c
									
												View File
												
				@ -450,6 +450,39 @@ static void __init find_and_init_phbs(void)

					of_pci_check_probe_only();

				}

				static void pseries_setup_rfi_flush(void)

				{

					struct h_cpu_char_result result;

					enum l1d_flush_type types;

					bool enable;

					long rc;

					/* Enable by default */

					enable = true;

					rc = plpar_get_cpu_characteristics(&result);

					if (rc == H_SUCCESS) {

						types = L1D_FLUSH_NONE;

						if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)

							types |= L1D_FLUSH_MTTRIG;

						if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)

							types |= L1D_FLUSH_ORI;

						/* Use fallback if nothing set in hcall */

						if (types == L1D_FLUSH_NONE)

							types = L1D_FLUSH_FALLBACK;

						if (!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))

							enable = false;

					} else {

						/* Default to fallback if case hcall is not available */

						types = L1D_FLUSH_FALLBACK;

					}

					setup_rfi_flush(types, enable);

				}

				static void __init pSeries_setup_arch(void)

				{

					set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);

				@ -467,6 +500,8 @@ static void __init pSeries_setup_arch(void)

					fwnmi_init();

					pseries_setup_rfi_flush();

					/* By default, only probe PCI (can be overridden by rtas_pci) */

					pci_add_flags(PCI_PROBE_ONLY);

									
										4

arch/powerpc/sysdev/ipic.c
									
												View File
												
				@ -845,12 +845,12 @@ void ipic_disable_mcp(enum ipic_mcp_irq mcp_irq)

				u32 ipic_get_mcp_status(void)

				{

					return ipic_read(primary_ipic->regs, IPIC_SERMR);

					return ipic_read(primary_ipic->regs, IPIC_SERSR);

				}

				void ipic_clear_mcp_status(u32 mask)

				{

					ipic_write(primary_ipic->regs, IPIC_SERMR, mask);

					ipic_write(primary_ipic->regs, IPIC_SERSR, mask);

				}

				/* Return an interrupt vector or 0 if no interrupt is pending. */

									
										1

arch/s390/kernel/compat_linux.c
									
												View File
												
				@ -263,6 +263,7 @@ COMPAT_SYSCALL_DEFINE2(s390_setgroups16, int, gidsetsize, u16 __user *, grouplis

						return retval;

					}

					groups_sort(group_info);

					retval = set_current_groups(group_info);

					put_group_info(group_info);

									
										3

arch/sh/kernel/traps_32.c
									
												View File
												
				@ -607,7 +607,8 @@ asmlinkage void do_divide_error(unsigned long r4)

						break;

					}

					force_sig_info(SIGFPE, &info, current);

					info.si_signo = SIGFPE;

					force_sig_info(info.si_signo, &info, current);

				}

				#endif

									
										1

arch/sparc/mm/srmmu.c
									
												View File
												
				@ -54,6 +54,7 @@

				enum mbus_module srmmu_modtype;

				static unsigned int hwbug_bitmask;

				int vac_cache_size;

				EXPORT_SYMBOL(vac_cache_size);

				int vac_line_size;

				extern struct resource sparc_iomap;

									
										2

arch/um/Makefile
									
												View File
												
				@ -117,7 +117,7 @@ archheaders:

				archprepare: include/generated/user_constants.h

				LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static

				LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib

				LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)

				CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \

					$(call cc-option, -fno-stack-protector,) \

16

arch/x86/Kconfig

View File

 @ -45,7 +45,7 @@ config X86
 	select ARCH_USE_CMPXCHG_LOCKREF		if X86_64
 	select ARCH_USE_QUEUED_RWLOCKS
 	select ARCH_USE_QUEUED_SPINLOCKS
 	select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH if SMP
 	select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
 	select ARCH_WANTS_DYNAMIC_TASK_STRUCT
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_IPC_PARSE_VERSION	if X86_32
 @ -64,6 +64,7 @@ config X86
 	select GENERIC_CLOCKEVENTS_MIN_ADJUST
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
 	select GENERIC_CPU_VULNERABILITIES
 	select GENERIC_EARLY_IOREMAP
 	select GENERIC_FIND_FIRST_BIT
 	select GENERIC_IOMAP
 @ -407,6 +408,19 @@ config GOLDFISH
        def_bool y
        depends on X86_GOLDFISH
 config RETPOLINE
 	bool "Avoid speculative indirect branches in kernel"
 	default y
 	---help---
 	  Compile kernel with the retpoline compiler options to guard against
 	  kernel-to-user data leaks by avoiding speculative indirect
 	  branches. Requires a compiler with -mindirect-branch=thunk-extern
 	  support for full protection. The kernel may run slower.
 	  Without compiler support, at least indirect branches in assembler
 	  code are eliminated. Since this includes the syscall entry path,
 	  it is not entirely pointless.
 if X86_32
 config X86_EXTENDED_PLATFORM
 	bool "Support for extended (non-PC) x86 platforms"

									
										8

arch/x86/Makefile
									
												View File
												
				@ -182,6 +182,14 @@ KBUILD_CFLAGS += -fno-asynchronous-unwind-tables

				KBUILD_CFLAGS += $(mflags-y)

				KBUILD_AFLAGS += $(mflags-y)

				# Avoid indirect branches in kernel to deal with Spectre

				ifdef CONFIG_RETPOLINE

				    RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)

				    ifneq ($(RETPOLINE_CFLAGS),)

				        KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE

				    endif

				endif

				archscripts: scripts_basic

					$(Q)$(MAKE) $(build)=arch/x86/tools relocs

									
										1

arch/x86/boot/compressed/misc.h
									
												View File
												
				@ -9,6 +9,7 @@

				 */

				#undef CONFIG_PARAVIRT

				#undef CONFIG_PARAVIRT_SPINLOCKS

				#undef CONFIG_PAGE_TABLE_ISOLATION

				#undef CONFIG_KASAN

				#include <linux/linkage.h>

									
										5

arch/x86/crypto/aesni-intel_asm.S
									
												View File
												
				@ -32,6 +32,7 @@

				#include <linux/linkage.h>

				#include <asm/inst.h>

				#include <asm/frame.h>

				#include <asm/nospec-branch.h>

				/*

				 * The following macros are used to move an (un)aligned 16 byte value to/from

				@ -2734,7 +2735,7 @@ ENTRY(aesni_xts_crypt8)

					pxor INC, STATE4

					movdqu IV, 0x30(OUTP)

					call *%r11

					CALL_NOSPEC %r11

					movdqu 0x00(OUTP), INC

					pxor INC, STATE1

				@ -2779,7 +2780,7 @@ ENTRY(aesni_xts_crypt8)

					_aesni_gf128mul_x_ble()

					movups IV, (IVP)

					call *%r11

					CALL_NOSPEC %r11

					movdqu 0x40(OUTP), INC

					pxor INC, STATE1

									
										2

arch/x86/crypto/aesni-intel_glue.c
									
												View File
												
				@ -906,7 +906,7 @@ static int helper_rfc4106_encrypt(struct aead_request *req)

					if (sg_is_last(req->src) &&

					    req->src->offset + req->src->length <= PAGE_SIZE &&

					    sg_is_last(req->dst) &&

				+	    sg_is_last(req->dst) && req->dst->length &&

					    req->dst->offset + req->dst->length <= PAGE_SIZE) {

						one_entry_in_sg = 1;

						scatterwalk_start(&src_sg_walk, req->src);

									
										3

arch/x86/crypto/camellia-aesni-avx-asm_64.S
									
												View File
												
				@ -17,6 +17,7 @@

				#include <linux/linkage.h>

				#include <asm/frame.h>

				#include <asm/nospec-branch.h>

				#define CAMELLIA_TABLE_BYTE_LEN 272

				@ -1224,7 +1225,7 @@ camellia_xts_crypt_16way:

					vpxor 14 * 16(%rax), %xmm15, %xmm14;

					vpxor 15 * 16(%rax), %xmm15, %xmm15;

					call *%r9;

					CALL_NOSPEC %r9;

					addq $(16 * 16), %rsp;

									
										3

arch/x86/crypto/camellia-aesni-avx2-asm_64.S
									
												View File
												
				@ -12,6 +12,7 @@

				#include <linux/linkage.h>

				#include <asm/frame.h>

				#include <asm/nospec-branch.h>

				#define CAMELLIA_TABLE_BYTE_LEN 272

				@ -1337,7 +1338,7 @@ camellia_xts_crypt_32way:

					vpxor 14 * 32(%rax), %ymm15, %ymm14;

					vpxor 15 * 32(%rax), %ymm15, %ymm15;

					call *%r9;

					CALL_NOSPEC %r9;

					addq $(16 * 32), %rsp;

									
										3

arch/x86/crypto/crc32c-pcl-intel-asm_64.S
									
												View File
												
				@ -45,6 +45,7 @@

				#include <asm/inst.h>

				#include <linux/linkage.h>

				#include <asm/nospec-branch.h>

				## ISCSI CRC 32 Implementation with crc32 and pclmulqdq Instruction

				@ -172,7 +173,7 @@ continue_block:

					movzxw  (bufp, %rax, 2), len

					lea	crc_array(%rip), bufp

					lea     (bufp, len, 1), bufp

					jmp     *bufp

					JMP_NOSPEC bufp

					################################################################

					## 2a) PROCESS FULL BLOCKS:

									
										1

arch/x86/crypto/poly1305_glue.c
									
												View File
												
				@ -164,7 +164,6 @@ static struct shash_alg alg = {

					.init		= poly1305_simd_init,

					.update		= poly1305_simd_update,

					.final		= crypto_poly1305_final,

					.setkey		= crypto_poly1305_setkey,

					.descsize	= sizeof(struct poly1305_simd_desc_ctx),

					.base		= {

						.cra_name		= "poly1305",

									
										7

arch/x86/crypto/salsa20_glue.c
									
												View File
												
				@ -59,13 +59,6 @@ static int encrypt(struct blkcipher_desc *desc,

					salsa20_ivsetup(ctx, walk.iv);

					if (likely(walk.nbytes == nbytes))

					{

						salsa20_encrypt_bytes(ctx, walk.src.virt.addr,

								      walk.dst.virt.addr, nbytes);

						return blkcipher_walk_done(desc, &walk, 0);

					}

					while (walk.nbytes >= 64) {

						salsa20_encrypt_bytes(ctx, walk.src.virt.addr,

								      walk.dst.virt.addr,

									
										10

arch/x86/crypto/sha512-mb/sha512_mb_mgr_init_avx2.c
									
												View File
												
				@ -57,10 +57,12 @@ void sha512_mb_mgr_init_avx2(struct sha512_mb_mgr *state)

				{

					unsigned int j;

					state->lens[0] = 0;

					state->lens[1] = 1;

					state->lens[2] = 2;

					state->lens[3] = 3;

					/* initially all lanes are unused */

					state->lens[0] = 0xFFFFFFFF00000000;

					state->lens[1] = 0xFFFFFFFF00000001;

					state->lens[2] = 0xFFFFFFFF00000002;

					state->lens[3] = 0xFFFFFFFF00000003;

					state->unused_lanes = 0xFF03020100;

					for (j = 0; j < 4; j++)

						state->ldata[j].job_in_lane = NULL;

									
										9

arch/x86/entry/common.c
									
												View File
												
				@ -20,6 +20,7 @@

				#include <linux/export.h>

				#include <linux/context_tracking.h>

				#include <linux/user-return-notifier.h>

				#include <linux/nospec.h>

				#include <linux/uprobes.h>

				#include <asm/desc.h>

				@ -201,7 +202,7 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)

					 * special case only applies after poking regs and before the

					 * very next return to user mode.

					 */

					current->thread.status &= ~(TS_COMPAT|TS_I386_REGS_POKED);

					ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);

				#endif

					user_enter_irqoff();

				@ -277,7 +278,8 @@ __visible void do_syscall_64(struct pt_regs *regs)

					 * regs->orig_ax, which changes the behavior of some syscalls.

					 */

					if (likely((nr & __SYSCALL_MASK) < NR_syscalls)) {

						regs->ax = sys_call_table[nr & __SYSCALL_MASK](

						nr = array_index_nospec(nr & __SYSCALL_MASK, NR_syscalls);

						regs->ax = sys_call_table[nr](

							regs->di, regs->si, regs->dx,

							regs->r10, regs->r8, regs->r9);

					}

				@ -299,7 +301,7 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)

					unsigned int nr = (unsigned int)regs->orig_ax;

				#ifdef CONFIG_IA32_EMULATION

					current->thread.status |= TS_COMPAT;

					ti->status |= TS_COMPAT;

				#endif

					if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY) {

				@ -313,6 +315,7 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)

					}

					if (likely(nr < IA32_NR_syscalls)) {

						nr = array_index_nospec(nr, IA32_NR_syscalls);

						/*

						 * It's possible that a 32-bit syscall implementation

						 * takes a 64-bit parameter but nonetheless assumes that

									
										22

arch/x86/entry/entry_32.S
									
												View File
												
				@ -45,6 +45,7 @@

				#include <asm/asm.h>

				#include <asm/smap.h>

				#include <asm/export.h>

				#include <asm/nospec-branch.h>

					.section .entry.text, "ax"

				@ -228,6 +229,18 @@ ENTRY(__switch_to_asm)

					movl	%ebx, PER_CPU_VAR(stack_canary)+stack_canary_offset

				#endif

				#ifdef CONFIG_RETPOLINE

					/*

					 * When switching from a shallower to a deeper call stack

					 * the RSB may either underflow or use entries populated

					 * with userspace addresses. On CPUs where those concerns

					 * exist, overwrite the RSB with entries which capture

					 * speculative execution to prevent attack.

					 */

					/* Clobbers %ebx */

					FILL_RETURN_BUFFER RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW

				#endif

					/* restore callee-saved registers */

					popl	%esi

					popl	%edi

				@ -260,7 +273,7 @@ ENTRY(ret_from_fork)

					/* kernel thread */

				1:	movl	%edi, %eax

					call	*%ebx

					CALL_NOSPEC %ebx

					/*

					 * A kernel thread is allowed to return here after successfully

					 * calling do_execve().  Exit to userspace to complete the execve()

				@ -984,7 +997,8 @@ trace:

					movl	0x4(%ebp), %edx

					subl	$MCOUNT_INSN_SIZE, %eax

					call	*ftrace_trace_function

					movl    ftrace_trace_function, %ecx

					CALL_NOSPEC %ecx

					popl	%edx

					popl	%ecx

				@ -1020,7 +1034,7 @@ return_to_handler:

					movl	%eax, %ecx

					popl	%edx

					popl	%eax

					jmp	*%ecx

					JMP_NOSPEC %ecx

				#endif

				#ifdef CONFIG_TRACING

				@ -1062,7 +1076,7 @@ error_code:

					movl	%ecx, %es

					TRACE_IRQS_OFF

					movl	%esp, %eax			# pt_regs pointer

					call	*%edi

					CALL_NOSPEC %edi

					jmp	ret_from_exception

				END(page_fault)

									
										294

arch/x86/entry/entry_64.S
									
												View File
												
				@ -36,6 +36,8 @@

				#include <asm/smap.h>

				#include <asm/pgtable_types.h>

				#include <asm/export.h>

				#include <asm/kaiser.h>

				#include <asm/nospec-branch.h>

				#include <linux/err.h>

				/* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this.  */

				@ -146,6 +148,7 @@ ENTRY(entry_SYSCALL_64)

					 * it is too small to ever cause noticeable irq latency.

					 */

					SWAPGS_UNSAFE_STACK

					SWITCH_KERNEL_CR3_NO_STACK

					/*

					 * A hypervisor implementation might want to use a label

					 * after the swapgs, so that it can do the swapgs

				@ -174,83 +177,17 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)

					pushq	%r9				/* pt_regs->r9 */

					pushq	%r10				/* pt_regs->r10 */

					pushq	%r11				/* pt_regs->r11 */

					sub	$(6*8), %rsp			/* pt_regs->bp, bx, r12-15 not saved */

					pushq	%rbx				/* pt_regs->rbx */

					pushq	%rbp				/* pt_regs->rbp */

					pushq	%r12				/* pt_regs->r12 */

					pushq	%r13				/* pt_regs->r13 */

					pushq	%r14				/* pt_regs->r14 */

					pushq	%r15				/* pt_regs->r15 */

					/*

					 * If we need to do entry work or if we guess we'll need to do

					 * exit work, go straight to the slow path.

					 */

					movq	PER_CPU_VAR(current_task), %r11

					testl	$_TIF_WORK_SYSCALL_ENTRY|_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)

					jnz	entry_SYSCALL64_slow_path

				entry_SYSCALL_64_fastpath:

					/*

					 * Easy case: enable interrupts and issue the syscall.  If the syscall

					 * needs pt_regs, we'll call a stub that disables interrupts again

					 * and jumps to the slow path.

					 */

					TRACE_IRQS_ON

					ENABLE_INTERRUPTS(CLBR_NONE)

				#if __SYSCALL_MASK == ~0

					cmpq	$__NR_syscall_max, %rax

				#else

					andl	$__SYSCALL_MASK, %eax

					cmpl	$__NR_syscall_max, %eax

				#endif

					ja	1f				/* return -ENOSYS (already in pt_regs->ax) */

					movq	%r10, %rcx

					/*

					 * This call instruction is handled specially in stub_ptregs_64.

					 * It might end up jumping to the slow path.  If it jumps, RAX

					 * and all argument registers are clobbered.

					 */

					call	*sys_call_table(, %rax, 8)

				.Lentry_SYSCALL_64_after_fastpath_call:

					movq	%rax, RAX(%rsp)

				1:

					/*

					 * If we get here, then we know that pt_regs is clean for SYSRET64.

					 * If we see that no exit work is required (which we are required

					 * to check with IRQs off), then we can go straight to SYSRET64.

					 */

					DISABLE_INTERRUPTS(CLBR_NONE)

					TRACE_IRQS_OFF

					movq	PER_CPU_VAR(current_task), %r11

					testl	$_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)

					jnz	1f

					LOCKDEP_SYS_EXIT

					TRACE_IRQS_ON		/* user mode is traced as IRQs on */

					movq	RIP(%rsp), %rcx

					movq	EFLAGS(%rsp), %r11

					RESTORE_C_REGS_EXCEPT_RCX_R11

					movq	RSP(%rsp), %rsp

					USERGS_SYSRET64

				1:

					/*

					 * The fast path looked good when we started, but something changed

					 * along the way and we need to switch to the slow path.  Calling

					 * raise(3) will trigger this, for example.  IRQs are off.

					 */

					TRACE_IRQS_ON

					ENABLE_INTERRUPTS(CLBR_NONE)

					SAVE_EXTRA_REGS

					movq	%rsp, %rdi

					call	syscall_return_slowpath	/* returns with IRQs disabled */

					jmp	return_from_SYSCALL_64

				entry_SYSCALL64_slow_path:

					/* IRQs are off. */

					SAVE_EXTRA_REGS

					movq	%rsp, %rdi

					call	do_syscall_64		/* returns with IRQs disabled */

				return_from_SYSCALL_64:

					RESTORE_EXTRA_REGS

					TRACE_IRQS_IRETQ		/* we're about to change IF */

				@ -323,53 +260,31 @@ return_from_SYSCALL_64:

				syscall_return_via_sysret:

					/* rcx and r11 are already restored (see code above) */

					RESTORE_C_REGS_EXCEPT_RCX_R11

					/*

					 * This opens a window where we have a user CR3, but are

					 * running in the kernel.  This makes using the CS

					 * register useless for telling whether or not we need to

					 * switch CR3 in NMIs.  Normal interrupts are OK because

					 * they are off here.

					 */

					SWITCH_USER_CR3

					movq	RSP(%rsp), %rsp

					USERGS_SYSRET64

				opportunistic_sysret_failed:

					/*

					 * This opens a window where we have a user CR3, but are

					 * running in the kernel.  This makes using the CS

					 * register useless for telling whether or not we need to

					 * switch CR3 in NMIs.  Normal interrupts are OK because

					 * they are off here.

					 */

					SWITCH_USER_CR3

					SWAPGS

					jmp	restore_c_regs_and_iret

				END(entry_SYSCALL_64)

				ENTRY(stub_ptregs_64)

					/*

					 * Syscalls marked as needing ptregs land here.

					 * If we are on the fast path, we need to save the extra regs,

					 * which we achieve by trying again on the slow path.  If we are on

					 * the slow path, the extra regs are already saved.

					 *

					 * RAX stores a pointer to the C function implementing the syscall.

					 * IRQs are on.

					 */

					cmpq	$.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)

					jne	1f

					/*

					 * Called from fast path -- disable IRQs again, pop return address

					 * and jump to slow path

					 */

					DISABLE_INTERRUPTS(CLBR_NONE)

					TRACE_IRQS_OFF

					popq	%rax

					jmp	entry_SYSCALL64_slow_path

				1:

					jmp	*%rax				/* Called from C */

				END(stub_ptregs_64)

				.macro ptregs_stub func

				ENTRY(ptregs_\func)

					leaq	\func(%rip), %rax

					jmp	stub_ptregs_64

				END(ptregs_\func)

				.endm

				/* Instantiate ptregs_stub for each ptregs-using syscall */

				#define __SYSCALL_64_QUAL_(sym)

				#define __SYSCALL_64_QUAL_ptregs(sym) ptregs_stub sym

				#define __SYSCALL_64(nr, sym, qual) __SYSCALL_64_QUAL_##qual(sym)

				#include <asm/syscalls_64.h>

				/*

				 * %rdi: prev task

				 * %rsi: next task

				@ -395,6 +310,18 @@ ENTRY(__switch_to_asm)

					movq	%rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset

				#endif

				#ifdef CONFIG_RETPOLINE

					/*

					 * When switching from a shallower to a deeper call stack

					 * the RSB may either underflow or use entries populated

					 * with userspace addresses. On CPUs where those concerns

					 * exist, overwrite the RSB with entries which capture

					 * speculative execution to prevent attack.

					 */

					/* Clobbers %rbx */

					FILL_RETURN_BUFFER RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW

				#endif

					/* restore callee-saved registers */

					popq	%r15

					popq	%r14

				@ -424,13 +351,14 @@ ENTRY(ret_from_fork)

					movq	%rsp, %rdi

					call	syscall_return_slowpath	/* returns with IRQs disabled */

					TRACE_IRQS_ON			/* user mode is traced as IRQS on */

					SWITCH_USER_CR3

					SWAPGS

					jmp	restore_regs_and_iret

				1:

					/* kernel thread */

					movq	%r12, %rdi

					call	*%rbx

					CALL_NOSPEC %rbx

					/*

					 * A kernel thread is allowed to return here after successfully

					 * calling do_execve().  Exit to userspace to complete the execve()

				@ -478,6 +406,7 @@ END(irq_entries_start)

					 * tracking that we're in kernel mode.

					 */

					SWAPGS

					SWITCH_KERNEL_CR3

					/*

					 * We need to tell lockdep that IRQs are off.  We can't do this until

				@ -535,6 +464,7 @@ GLOBAL(retint_user)

					mov	%rsp,%rdi

					call	prepare_exit_to_usermode

					TRACE_IRQS_IRETQ

					SWITCH_USER_CR3

					SWAPGS

					jmp	restore_regs_and_iret

				@ -612,6 +542,7 @@ native_irq_return_ldt:

					pushq	%rdi				/* Stash user RDI */

					SWAPGS

					SWITCH_KERNEL_CR3

					movq	PER_CPU_VAR(espfix_waddr), %rdi

					movq	%rax, (0*8)(%rdi)		/* user RAX */

					movq	(1*8)(%rsp), %rax		/* user RIP */

				@ -638,6 +569,7 @@ native_irq_return_ldt:

					 * still points to an RO alias of the ESPFIX stack.

					 */

					orq	PER_CPU_VAR(espfix_stack), %rax

					SWITCH_USER_CR3

					SWAPGS

					movq	%rax, %rsp

				@ -1016,13 +948,17 @@ idtentry async_page_fault	do_async_page_fault	has_error_code=1

				#endif

				#ifdef CONFIG_X86_MCE

				idtentry machine_check					has_error_code=0	paranoid=1 do_sym=*machine_check_vector(%rip)

				idtentry machine_check		do_mce			has_error_code=0	paranoid=1

				#endif

				/*

				 * Save all registers in pt_regs, and switch gs if needed.

				 * Use slow, but surefire "are we in kernel?" check.

				 * Return: ebx=0: need swapgs on exit, ebx=1: otherwise

				 *

				 * Return: ebx=0: needs swapgs but not SWITCH_USER_CR3 in paranoid_exit

				 *         ebx=1: needs neither swapgs nor SWITCH_USER_CR3 in paranoid_exit

				 *         ebx=2: needs both swapgs and SWITCH_USER_CR3 in paranoid_exit

				 *         ebx=3: needs SWITCH_USER_CR3 but not swapgs in paranoid_exit

				 */

				ENTRY(paranoid_entry)

					cld

				@ -1035,7 +971,26 @@ ENTRY(paranoid_entry)

					js	1f				/* negative -> in kernel */

					SWAPGS

					xorl	%ebx, %ebx

				1:	ret

				1:

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

					/*

					 * We might have come in between a swapgs and a SWITCH_KERNEL_CR3

					 * on entry, or between a SWITCH_USER_CR3 and a swapgs on exit.

					 * Do a conditional SWITCH_KERNEL_CR3: this could safely be done

					 * unconditionally, but we need to find out whether the reverse

					 * should be done on return (conveyed to paranoid_exit in %ebx).

					 */

					ALTERNATIVE "jmp 2f", "movq %cr3, %rax", X86_FEATURE_KAISER

					testl	$KAISER_SHADOW_PGD_OFFSET, %eax

					jz	2f

					orl	$2, %ebx

					andq	$(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax

					/* If PCID enabled, set X86_CR3_PCID_NOFLUSH_BIT */

					ALTERNATIVE "", "bts $63, %rax", X86_FEATURE_PCID

					movq	%rax, %cr3

				2:

				#endif

					ret

				END(paranoid_entry)

				/*

				@ -1048,19 +1003,26 @@ END(paranoid_entry)

				 * be complicated.  Fortunately, we there's no good reason

				 * to try to handle preemption here.

				 *

				 * On entry, ebx is "no swapgs" flag (1: don't need swapgs, 0: need it)

				 * On entry: ebx=0: needs swapgs but not SWITCH_USER_CR3

				 *           ebx=1: needs neither swapgs nor SWITCH_USER_CR3

				 *           ebx=2: needs both swapgs and SWITCH_USER_CR3

				 *           ebx=3: needs SWITCH_USER_CR3 but not swapgs

				 */

				ENTRY(paranoid_exit)

					DISABLE_INTERRUPTS(CLBR_NONE)

					TRACE_IRQS_OFF_DEBUG

					testl	%ebx, %ebx			/* swapgs needed? */

					jnz	paranoid_exit_no_swapgs

					TRACE_IRQS_IRETQ

					SWAPGS_UNSAFE_STACK

					jmp	paranoid_exit_restore

				paranoid_exit_no_swapgs:

					TRACE_IRQS_IRETQ_DEBUG

				paranoid_exit_restore:

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

					/* No ALTERNATIVE for X86_FEATURE_KAISER: paranoid_entry sets %ebx */

					testl	$2, %ebx			/* SWITCH_USER_CR3 needed? */

					jz	paranoid_exit_no_switch

					SWITCH_USER_CR3

				paranoid_exit_no_switch:

				#endif

					testl	$1, %ebx			/* swapgs needed? */

					jnz	paranoid_exit_no_swapgs

					SWAPGS_UNSAFE_STACK

				paranoid_exit_no_swapgs:

					RESTORE_EXTRA_REGS

					RESTORE_C_REGS

					REMOVE_PT_GPREGS_FROM_STACK 8

				@ -1075,6 +1037,13 @@ ENTRY(error_entry)

					cld

					SAVE_C_REGS 8

					SAVE_EXTRA_REGS 8

					/*

					 * error_entry() always returns with a kernel gsbase and

					 * CR3.  We must also have a kernel CR3/gsbase before

					 * calling TRACE_IRQS_*.  Just unconditionally switch to

					 * the kernel CR3 here.

					 */

					SWITCH_KERNEL_CR3

					xorl	%ebx, %ebx

					testb	$3, CS+8(%rsp)

					jz	.Lerror_kernelspace

				@ -1235,6 +1204,10 @@ ENTRY(nmi)

					 */

					SWAPGS_UNSAFE_STACK

					/*

					 * percpu variables are mapped with user CR3, so no need

					 * to switch CR3 here.

					 */

					cld

					movq	%rsp, %rdx

					movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp

				@ -1268,12 +1241,34 @@ ENTRY(nmi)

					movq	%rsp, %rdi

					movq	$-1, %rsi

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

					/* Unconditionally use kernel CR3 for do_nmi() */

					/* %rax is saved above, so OK to clobber here */

					ALTERNATIVE "jmp 2f", "movq %cr3, %rax", X86_FEATURE_KAISER

					/* If PCID enabled, NOFLUSH now and NOFLUSH on return */

					ALTERNATIVE "", "bts $63, %rax", X86_FEATURE_PCID

					pushq	%rax

					/* mask off "user" bit of pgd address and 12 PCID bits: */

					andq	$(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax

					movq	%rax, %cr3

				2:

				#endif

					call	do_nmi

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

					/*

					 * Unconditionally restore CR3.  I know we return to

					 * kernel code that needs user CR3, but do we ever return

					 * to "user mode" where we need the kernel CR3?

					 */

					ALTERNATIVE "", "popq %rax; movq %rax, %cr3", X86_FEATURE_KAISER

				#endif

					/*

					 * Return back to user mode.  We must *not* do the normal exit

					 * work, because we don't want to enable interrupts.  Fortunately,

					 * do_nmi doesn't modify pt_regs.

					 * work, because we don't want to enable interrupts.  Do not

					 * switch to user CR3: we might be going back to kernel code

					 * that had a user CR3 set.

					 */

					SWAPGS

					jmp	restore_c_regs_and_iret

				@ -1470,22 +1465,55 @@ end_repeat_nmi:

					ALLOC_PT_GPREGS_ON_STACK

					/*

					 * Use paranoid_entry to handle SWAPGS, but no need to use paranoid_exit

					 * as we should not be calling schedule in NMI context.

					 * Even with normal interrupts enabled. An NMI should not be

					 * setting NEED_RESCHED or anything that normal interrupts and

					 * exceptions might do.

					 * Use the same approach as paranoid_entry to handle SWAPGS, but

					 * without CR3 handling since we do that differently in NMIs.  No

					 * need to use paranoid_exit as we should not be calling schedule

					 * in NMI context.  Even with normal interrupts enabled. An NMI

					 * should not be setting NEED_RESCHED or anything that normal

					 * interrupts and exceptions might do.

					 */

					call	paranoid_entry

					/* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */

					cld

					SAVE_C_REGS

					SAVE_EXTRA_REGS

					movl	$1, %ebx

					movl	$MSR_GS_BASE, %ecx

					rdmsr

					testl	%edx, %edx

					js	1f				/* negative -> in kernel */

					SWAPGS

					xorl	%ebx, %ebx

				1:

					movq	%rsp, %rdi

					movq	$-1, %rsi

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

					/* Unconditionally use kernel CR3 for do_nmi() */

					/* %rax is saved above, so OK to clobber here */

					ALTERNATIVE "jmp 2f", "movq %cr3, %rax", X86_FEATURE_KAISER

					/* If PCID enabled, NOFLUSH now and NOFLUSH on return */

					ALTERNATIVE "", "bts $63, %rax", X86_FEATURE_PCID

					pushq	%rax

					/* mask off "user" bit of pgd address and 12 PCID bits: */

					andq	$(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax

					movq	%rax, %cr3

				2:

				#endif

					/* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */

					call	do_nmi

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

					/*

					 * Unconditionally restore CR3.  We might be returning to

					 * kernel code that needs user CR3, like just just before

					 * a sysret.

					 */

					ALTERNATIVE "", "popq %rax; movq %rax, %cr3", X86_FEATURE_KAISER

				#endif

					testl	%ebx, %ebx			/* swapgs needed? */

					jnz	nmi_restore

				nmi_swapgs:

					/* We fixed up CR3 above, so no need to switch it here */

					SWAPGS_UNSAFE_STACK

				nmi_restore:

					RESTORE_EXTRA_REGS

									
										8

arch/x86/entry/entry_64_compat.S
									
												View File
												
				@ -13,6 +13,8 @@

				#include <asm/irqflags.h>

				#include <asm/asm.h>

				#include <asm/smap.h>

				#include <asm/pgtable_types.h>

				#include <asm/kaiser.h>

				#include <linux/linkage.h>

				#include <linux/err.h>

				@ -48,6 +50,7 @@

				ENTRY(entry_SYSENTER_compat)

					/* Interrupts are off on entry. */

					SWAPGS_UNSAFE_STACK

					SWITCH_KERNEL_CR3_NO_STACK

					movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp

					/*

				@ -184,6 +187,7 @@ ENDPROC(entry_SYSENTER_compat)

				ENTRY(entry_SYSCALL_compat)

					/* Interrupts are off on entry. */

					SWAPGS_UNSAFE_STACK

					SWITCH_KERNEL_CR3_NO_STACK

					/* Stash user ESP and switch to the kernel stack. */

					movl	%esp, %r8d

				@ -259,6 +263,7 @@ sysret32_from_system_call:

					xorq	%r8, %r8

					xorq	%r9, %r9

					xorq	%r10, %r10

					SWITCH_USER_CR3

					movq	RSP-ORIG_RAX(%rsp), %rsp

					swapgs

					sysretl

				@ -297,7 +302,7 @@ ENTRY(entry_INT80_compat)

					PARAVIRT_ADJUST_EXCEPTION_FRAME

					ASM_CLAC			/* Do this early to minimize exposure */

					SWAPGS

					SWITCH_KERNEL_CR3_NO_STACK

					/*

					 * User tracing code (ptrace or signal handlers) might assume that

					 * the saved RAX contains a 32-bit number when we're invoking a 32-bit

				@ -338,6 +343,7 @@ ENTRY(entry_INT80_compat)

					/* Go back to user mode. */

					TRACE_IRQS_ON

					SWITCH_USER_CR3

					SWAPGS

					jmp	restore_regs_and_iret

				END(entry_INT80_compat)

									
										7

arch/x86/entry/syscall_64.c
									
												View File
												
				@ -6,14 +6,11 @@

				#include <asm/asm-offsets.h>

				#include <asm/syscall.h>

				#define __SYSCALL_64_QUAL_(sym) sym

				#define __SYSCALL_64_QUAL_ptregs(sym) ptregs_##sym

				#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long __SYSCALL_64_QUAL_##qual(sym)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);

				#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);

				#include <asm/syscalls_64.h>

				#undef __SYSCALL_64

				#define __SYSCALL_64(nr, sym, qual) [nr] = __SYSCALL_64_QUAL_##qual(sym),

				#define __SYSCALL_64(nr, sym, qual) [nr] = sym,

				extern long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);

									
										12

arch/x86/entry/vsyscall/vsyscall_64.c
									
												View File
												
				@ -46,6 +46,7 @@ static enum { EMULATE, NATIVE, NONE } vsyscall_mode =

				#else

					EMULATE;

				#endif

				unsigned long vsyscall_pgprot = __PAGE_KERNEL_VSYSCALL;

				static int __init vsyscall_setup(char *str)

				{

				@ -66,6 +67,11 @@ static int __init vsyscall_setup(char *str)

				}

				early_param("vsyscall", vsyscall_setup);

				bool vsyscall_enabled(void)

				{

					return vsyscall_mode != NONE;

				}

				static void warn_bad_vsyscall(const char *level, struct pt_regs *regs,

							      const char *message)

				{

				@ -331,11 +337,11 @@ void __init map_vsyscall(void)

					extern char __vsyscall_page;

					unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);

					if (vsyscall_mode != NATIVE)

						vsyscall_pgprot = __PAGE_KERNEL_VVAR;

					if (vsyscall_mode != NONE)

						__set_fixmap(VSYSCALL_PAGE, physaddr_vsyscall,

							     vsyscall_mode == NATIVE

							     ? PAGE_KERNEL_VSYSCALL

							     : PAGE_KERNEL_VVAR);

							     __pgprot(vsyscall_pgprot));

					BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) !=

						     (unsigned long)VSYSCALL_ADDR);

									
										2

arch/x86/events/amd/power.c
									
												View File
												
				@ -277,7 +277,7 @@ static int __init amd_power_pmu_init(void)

					int ret;

					if (!x86_match_cpu(cpu_match))

						return 0;

						return -ENODEV;

					if (!boot_cpu_has(X86_FEATURE_ACC_POWER))

						return -ENODEV;

									
										44

arch/x86/events/intel/bts.c
									
												View File
												
				@ -22,6 +22,7 @@

				#include <linux/debugfs.h>

				#include <linux/device.h>

				#include <linux/coredump.h>

				#include <linux/kaiser.h>

				#include <asm-generic/sizes.h>

				#include <asm/perf_event.h>

				@ -77,6 +78,23 @@ static size_t buf_size(struct page *page)

					return 1 << (PAGE_SHIFT + page_private(page));

				}

				static void bts_buffer_free_aux(void *data)

				{

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

					struct bts_buffer *buf = data;

					int nbuf;

					for (nbuf = 0; nbuf < buf->nr_bufs; nbuf++) {

						struct page *page = buf->buf[nbuf].page;

						void *kaddr = page_address(page);

						size_t page_size = buf_size(page);

						kaiser_remove_mapping((unsigned long)kaddr, page_size);

					}

				#endif

					kfree(data);

				}

				static void *

				bts_buffer_setup_aux(int cpu, void **pages, int nr_pages, bool overwrite)

				{

				@ -113,29 +131,33 @@ bts_buffer_setup_aux(int cpu, void **pages, int nr_pages, bool overwrite)

					buf->real_size = size - size % BTS_RECORD_SIZE;

					for (pg = 0, nbuf = 0, offset = 0, pad = 0; nbuf < buf->nr_bufs; nbuf++) {

						unsigned int __nr_pages;

						void *kaddr = pages[pg];

						size_t page_size;

						page = virt_to_page(kaddr);

						page_size = buf_size(page);

						if (kaiser_add_mapping((unsigned long)kaddr,

									page_size, __PAGE_KERNEL) < 0) {

							buf->nr_bufs = nbuf;

							bts_buffer_free_aux(buf);

							return NULL;

						}

						page = virt_to_page(pages[pg]);

						__nr_pages = PagePrivate(page) ? 1 << page_private(page) : 1;

						buf->buf[nbuf].page = page;

						buf->buf[nbuf].offset = offset;

						buf->buf[nbuf].displacement = (pad ? BTS_RECORD_SIZE - pad : 0);

						buf->buf[nbuf].size = buf_size(page) - buf->buf[nbuf].displacement;

						buf->buf[nbuf].size = page_size - buf->buf[nbuf].displacement;

						pad = buf->buf[nbuf].size % BTS_RECORD_SIZE;

						buf->buf[nbuf].size -= pad;

						pg += __nr_pages;

						offset += __nr_pages << PAGE_SHIFT;

						pg += page_size >> PAGE_SHIFT;

						offset += page_size;

					}

					return buf;

				}

				static void bts_buffer_free_aux(void *data)

				{

					kfree(data);

				}

				static unsigned long bts_buffer_offset(struct bts_buffer *buf, unsigned int idx)

				{

					return buf->buf[idx].offset + buf->buf[idx].displacement;

									
										57

arch/x86/events/intel/ds.c
									
												View File
												
				@ -2,11 +2,15 @@

				#include <linux/types.h>

				#include <linux/slab.h>

				#include <asm/kaiser.h>

				#include <asm/perf_event.h>

				#include <asm/insn.h>

				#include "../perf_event.h"

				static

				DEFINE_PER_CPU_SHARED_ALIGNED_USER_MAPPED(struct debug_store, cpu_debug_store);

				/* The size of a BTS record in bytes: */

				#define BTS_RECORD_SIZE		24

				@ -268,6 +272,39 @@ void fini_debug_store_on_cpu(int cpu)

				static DEFINE_PER_CPU(void *, insn_buffer);

				static void *dsalloc(size_t size, gfp_t flags, int node)

				{

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

					unsigned int order = get_order(size);

					struct page *page;

					unsigned long addr;

					page = __alloc_pages_node(node, flags | __GFP_ZERO, order);

					if (!page)

						return NULL;

					addr = (unsigned long)page_address(page);

					if (kaiser_add_mapping(addr, size, __PAGE_KERNEL) < 0) {

						__free_pages(page, order);

						addr = 0;

					}

					return (void *)addr;

				#else

					return kmalloc_node(size, flags | __GFP_ZERO, node);

				#endif

				}

				static void dsfree(const void *buffer, size_t size)

				{

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

					if (!buffer)

						return;

					kaiser_remove_mapping((unsigned long)buffer, size);

					free_pages((unsigned long)buffer, get_order(size));

				#else

					kfree(buffer);

				#endif

				}

				static int alloc_pebs_buffer(int cpu)

				{

					struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;

				@ -278,7 +315,7 @@ static int alloc_pebs_buffer(int cpu)

					if (!x86_pmu.pebs)

						return 0;

					buffer = kzalloc_node(x86_pmu.pebs_buffer_size, GFP_KERNEL, node);

					buffer = dsalloc(x86_pmu.pebs_buffer_size, GFP_KERNEL, node);

					if (unlikely(!buffer))

						return -ENOMEM;

				@ -289,7 +326,7 @@ static int alloc_pebs_buffer(int cpu)

					if (x86_pmu.intel_cap.pebs_format < 2) {

						ibuffer = kzalloc_node(PEBS_FIXUP_SIZE, GFP_KERNEL, node);

						if (!ibuffer) {

							kfree(buffer);

							dsfree(buffer, x86_pmu.pebs_buffer_size);

							return -ENOMEM;

						}

						per_cpu(insn_buffer, cpu) = ibuffer;

				@ -315,7 +352,8 @@ static void release_pebs_buffer(int cpu)

					kfree(per_cpu(insn_buffer, cpu));

					per_cpu(insn_buffer, cpu) = NULL;

					kfree((void *)(unsigned long)ds->pebs_buffer_base);

					dsfree((void *)(unsigned long)ds->pebs_buffer_base,

							x86_pmu.pebs_buffer_size);

					ds->pebs_buffer_base = 0;

				}

				@ -329,7 +367,7 @@ static int alloc_bts_buffer(int cpu)

					if (!x86_pmu.bts)

						return 0;

					buffer = kzalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL | __GFP_NOWARN, node);

					buffer = dsalloc(BTS_BUFFER_SIZE, GFP_KERNEL | __GFP_NOWARN, node);

					if (unlikely(!buffer)) {

						WARN_ONCE(1, "%s: BTS buffer allocation failure\n", __func__);

						return -ENOMEM;

				@ -355,19 +393,15 @@ static void release_bts_buffer(int cpu)

					if (!ds || !x86_pmu.bts)

						return;

					kfree((void *)(unsigned long)ds->bts_buffer_base);

					dsfree((void *)(unsigned long)ds->bts_buffer_base, BTS_BUFFER_SIZE);

					ds->bts_buffer_base = 0;

				}

				static int alloc_ds_buffer(int cpu)

				{

					int node = cpu_to_node(cpu);

					struct debug_store *ds;

					ds = kzalloc_node(sizeof(*ds), GFP_KERNEL, node);

					if (unlikely(!ds))

						return -ENOMEM;

					struct debug_store *ds = per_cpu_ptr(&cpu_debug_store, cpu);

					memset(ds, 0, sizeof(*ds));

					per_cpu(cpu_hw_events, cpu).ds = ds;

					return 0;

				@ -381,7 +415,6 @@ static void release_ds_buffer(int cpu)

						return;

					per_cpu(cpu_hw_events, cpu).ds = NULL;

					kfree(ds);

				}

				void release_ds_buffers(void)

									
										4

arch/x86/include/asm/alternative.h
									
												View File
												
				@ -139,7 +139,7 @@ static inline int alternatives_text_reserved(void *start, void *end)

					".popsection\n"							\

					".pushsection .altinstr_replacement, \"ax\"\n"			\

					ALTINSTR_REPLACEMENT(newinstr, feature, 1)			\

					".popsection"

					".popsection\n"

				#define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2)\

					OLDINSTR_2(oldinstr, 1, 2)					\

				@ -150,7 +150,7 @@ static inline int alternatives_text_reserved(void *start, void *end)

					".pushsection .altinstr_replacement, \"ax\"\n"			\

					ALTINSTR_REPLACEMENT(newinstr1, feature1, 1)			\

					ALTINSTR_REPLACEMENT(newinstr2, feature2, 2)			\

					".popsection"

					".popsection\n"

				/*

				 * Alternative instructions for different CPU types or capabilities.

									
										27

arch/x86/include/asm/asm-prototypes.h
									
												View File
												
				@ -10,7 +10,34 @@

				#include <asm/pgtable.h>

				#include <asm/special_insns.h>

				#include <asm/preempt.h>

				#include <asm/asm.h>

				#ifndef CONFIG_X86_CMPXCHG64

				extern void cmpxchg8b_emu(void);

				#endif

				#ifdef CONFIG_RETPOLINE

				#ifdef CONFIG_X86_32

				#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void);

				#else

				#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void);

				INDIRECT_THUNK(8)

				INDIRECT_THUNK(9)

				INDIRECT_THUNK(10)

				INDIRECT_THUNK(11)

				INDIRECT_THUNK(12)

				INDIRECT_THUNK(13)

				INDIRECT_THUNK(14)

				INDIRECT_THUNK(15)

				#endif

				INDIRECT_THUNK(ax)

				INDIRECT_THUNK(bx)

				INDIRECT_THUNK(cx)

				INDIRECT_THUNK(dx)

				INDIRECT_THUNK(si)

				INDIRECT_THUNK(di)

				INDIRECT_THUNK(bp)

				asmlinkage void __fill_rsb(void);

				asmlinkage void __clear_rsb(void);

				#endif /* CONFIG_RETPOLINE */

									
										15

arch/x86/include/asm/asm.h
									
												View File
												
				@ -11,10 +11,12 @@

				# define __ASM_FORM_COMMA(x) " " #x ","

				#endif

				#ifdef CONFIG_X86_32

				#ifndef __x86_64__

				/* 32 bit */

				# define __ASM_SEL(a,b) __ASM_FORM(a)

				# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(a)

				#else

				/* 64 bit */

				# define __ASM_SEL(a,b) __ASM_FORM(b)

				# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(b)

				#endif

				@ -125,4 +127,15 @@

				/* For C file, we already have NOKPROBE_SYMBOL macro */

				#endif

				#ifndef __ASSEMBLY__

				/*

				 * This output constraint should be used for any inline asm which has a "call"

				 * instruction.  Otherwise the asm may be inserted before the frame pointer

				 * gets set up by the containing function.  If you forget to do this, objtool

				 * may print a "call without frame pointer save/setup" warning.

				 */

				register unsigned long current_stack_pointer asm(_ASM_SP);

				#define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)

				#endif

				#endif /* _ASM_X86_ASM_H */

									
										28

arch/x86/include/asm/barrier.h
									
												View File
												
				@ -23,6 +23,34 @@

				#define wmb()	asm volatile("sfence" ::: "memory")

				#endif

				/**

				 * array_index_mask_nospec() - generate a mask that is ~0UL when the

				 * 	bounds check succeeds and 0 otherwise

				 * @index: array element index

				 * @size: number of elements in array

				 *

				 * Returns:

				 *     0 - (index < size)

				 */

				static inline unsigned long array_index_mask_nospec(unsigned long index,

						unsigned long size)

				{

					unsigned long mask;

					asm ("cmp %1,%2; sbb %0,%0;"

							:"=r" (mask)

							:"r"(size),"r" (index)

							:"cc");

					return mask;

				}

				/* Override the default implementation from linux/nospec.h. */

				#define array_index_mask_nospec array_index_mask_nospec

				/* Prevent speculative execution past this barrier. */

				#define barrier_nospec() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \

									   "lfence", X86_FEATURE_LFENCE_RDTSC)

				#ifdef CONFIG_X86_PPRO_FENCE

				#define dma_rmb()	rmb()

				#else

									
										2

arch/x86/include/asm/cmdline.h
									
												View File
												
				@ -2,5 +2,7 @@

				#define _ASM_X86_CMDLINE_H

				int cmdline_find_option_bool(const char *cmdline_ptr, const char *option);

				int cmdline_find_option(const char *cmdline_ptr, const char *option,

							char *buffer, int bufsize);

				#endif /* _ASM_X86_CMDLINE_H */

									
										9

arch/x86/include/asm/cpufeature.h
									
												View File
												
				@ -28,6 +28,7 @@ enum cpuid_leafs

					CPUID_8000_000A_EDX,

					CPUID_7_ECX,

					CPUID_8000_0007_EBX,

					CPUID_7_EDX,

				};

				#ifdef CONFIG_X86_FEATURE_NAMES

				@ -78,8 +79,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32];

					   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) ||	\

					   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) ||	\

					   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) ||	\

					   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) ||	\

					   REQUIRED_MASK_CHECK					  ||	\

					   BUILD_BUG_ON_ZERO(NCAPINTS != 18))

					   BUILD_BUG_ON_ZERO(NCAPINTS != 19))

				#define DISABLED_MASK_BIT_SET(feature_bit)				\

					 ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK,  0, feature_bit) ||	\

				@ -100,8 +102,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32];

					   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) ||	\

					   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) ||	\

					   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) ||	\

					   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) ||	\

					   DISABLED_MASK_CHECK					  ||	\

					   BUILD_BUG_ON_ZERO(NCAPINTS != 18))

					   BUILD_BUG_ON_ZERO(NCAPINTS != 19))

				#define cpu_has(c, bit)							\

					(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\

				@ -135,6 +138,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];

					set_bit(bit, (unsigned long *)cpu_caps_set);	\

				} while (0)

				#define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit)

				#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_X86_FAST_FEATURE_TESTS)

				/*

				 * Static testing of CPU features.  Used the same as boot_cpu_has().

									
										29

arch/x86/include/asm/cpufeatures.h
									
												View File
												
				@ -12,7 +12,7 @@

				/*

				 * Defines x86 CPU feature bits

				 */

				#define NCAPINTS	18	/* N 32-bit words worth of info */

				#define NCAPINTS	19	/* N 32-bit words worth of info */

				#define NBUGINTS	1	/* N 32-bit bug flags */

				/*

				@ -189,13 +189,20 @@

				#define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */

				#define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */

				#define X86_FEATURE_INVPCID_SINGLE ( 7*32+ 4) /* Effectively INVPCID && CR4.PCIDE=1 */

				#define X86_FEATURE_HW_PSTATE	( 7*32+ 8) /* AMD HW-PState */

				#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */

				#define X86_FEATURE_INTEL_PT	( 7*32+15) /* Intel Processor Trace */

				#define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */

				#define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */

				#define X86_FEATURE_RETPOLINE	( 7*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */

				#define X86_FEATURE_RETPOLINE_AMD ( 7*32+13) /* "" AMD Retpoline mitigation for Spectre variant 2 */

				#define X86_FEATURE_RSB_CTXSW	( 7*32+19) /* "" Fill RSB on context switches */

				/* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */

				#define X86_FEATURE_KAISER	( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */

				#define X86_FEATURE_USE_IBPB	( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */

				/* Virtualization flags: Linux defined, word 8 */

				#define X86_FEATURE_TPR_SHADOW  ( 8*32+ 0) /* Intel TPR Shadow */

				@ -228,6 +235,7 @@

				#define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */

				#define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */

				#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */

				#define X86_FEATURE_INTEL_PT	( 9*32+25) /* Intel Processor Trace */

				#define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */

				#define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */

				#define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */

				@ -252,6 +260,9 @@

				/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */

				#define X86_FEATURE_CLZERO	(13*32+0) /* CLZERO instruction */

				#define X86_FEATURE_IRPERF	(13*32+1) /* Instructions Retired Count */

				#define X86_FEATURE_IBPB	(13*32+12) /* Indirect Branch Prediction Barrier */

				#define X86_FEATURE_IBRS	(13*32+14) /* Indirect Branch Restricted Speculation */

				#define X86_FEATURE_STIBP	(13*32+15) /* Single Thread Indirect Branch Predictors */

				/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */

				#define X86_FEATURE_DTHERM	(14*32+ 0) /* Digital Thermal Sensor */

				@ -287,6 +298,13 @@

				#define X86_FEATURE_SUCCOR	(17*32+1) /* Uncorrectable error containment and recovery */

				#define X86_FEATURE_SMCA	(17*32+3) /* Scalable MCA */

				/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */

				#define X86_FEATURE_AVX512_4VNNIW	(18*32+ 2) /* AVX-512 Neural Network Instructions */

				#define X86_FEATURE_AVX512_4FMAPS	(18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */

				#define X86_FEATURE_SPEC_CTRL		(18*32+26) /* "" Speculation Control (IBRS + IBPB) */

				#define X86_FEATURE_INTEL_STIBP		(18*32+27) /* "" Single Thread Indirect Branch Predictors */

				#define X86_FEATURE_ARCH_CAPABILITIES	(18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */

				/*

				 * BUG word(s)

				 */

				@ -312,5 +330,8 @@

				#define X86_BUG_SWAPGS_FENCE	X86_BUG(11) /* SWAPGS without input dep on GS */

				#define X86_BUG_MONITOR		X86_BUG(12) /* IPI required to wake up remote CPU */

				#define X86_BUG_AMD_E400	X86_BUG(13) /* CPU is among the affected by Erratum 400 */

				#define X86_BUG_CPU_MELTDOWN	X86_BUG(14) /* CPU is affected by meltdown attack and needs kernel page table isolation */

				#define X86_BUG_SPECTRE_V1	X86_BUG(15) /* CPU is affected by Spectre variant 1 attack with conditional branches */

				#define X86_BUG_SPECTRE_V2	X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */

				#endif /* _ASM_X86_CPUFEATURES_H */

									
										2

arch/x86/include/asm/desc.h
									
												View File
												
				@ -43,7 +43,7 @@ struct gdt_page {

					struct desc_struct gdt[GDT_ENTRIES];

				} __attribute__((aligned(PAGE_SIZE)));

				DECLARE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page);

				DECLARE_PER_CPU_PAGE_ALIGNED_USER_MAPPED(struct gdt_page, gdt_page);

				static inline struct desc_struct *get_cpu_gdt_table(unsigned int cpu)

				{

									
										7

arch/x86/include/asm/disabled-features.h
									
												View File
												
				@ -21,11 +21,13 @@

				# define DISABLE_K6_MTRR	(1<<(X86_FEATURE_K6_MTRR & 31))

				# define DISABLE_CYRIX_ARR	(1<<(X86_FEATURE_CYRIX_ARR & 31))

				# define DISABLE_CENTAUR_MCR	(1<<(X86_FEATURE_CENTAUR_MCR & 31))

				# define DISABLE_PCID		0

				#else

				# define DISABLE_VME		0

				# define DISABLE_K6_MTRR	0

				# define DISABLE_CYRIX_ARR	0

				# define DISABLE_CENTAUR_MCR	0

				# define DISABLE_PCID		(1<<(X86_FEATURE_PCID & 31))

				#endif /* CONFIG_X86_64 */

				#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS

				@ -43,7 +45,7 @@

				#define DISABLED_MASK1	0

				#define DISABLED_MASK2	0

				#define DISABLED_MASK3	(DISABLE_CYRIX_ARR|DISABLE_CENTAUR_MCR|DISABLE_K6_MTRR)

				#define DISABLED_MASK4	0

				#define DISABLED_MASK4	(DISABLE_PCID)

				#define DISABLED_MASK5	0

				#define DISABLED_MASK6	0

				#define DISABLED_MASK7	0

				@ -57,6 +59,7 @@

				#define DISABLED_MASK15	0

				#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE)

				#define DISABLED_MASK17	0

				#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)

				#define DISABLED_MASK18	0

				#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)

				#endif /* _ASM_X86_DISABLED_FEATURES_H */

									
										2

arch/x86/include/asm/hardirq.h
									
												View File
												
				@ -22,8 +22,8 @@ typedef struct {

				#ifdef CONFIG_SMP

					unsigned int irq_resched_count;

					unsigned int irq_call_count;

					unsigned int irq_tlb_count;

				#endif

					unsigned int irq_tlb_count;

				#ifdef CONFIG_X86_THERMAL_VECTOR

					unsigned int irq_thermal_count;

				#endif

									
										2

arch/x86/include/asm/hw_irq.h
									
												View File
												
				@ -178,7 +178,7 @@ extern char irq_entries_start[];

				#define VECTOR_RETRIGGERED	((void *)~0UL)

				typedef struct irq_desc* vector_irq_t[NR_VECTORS];

				DECLARE_PER_CPU(vector_irq_t, vector_irq);

				DECLARE_PER_CPU_USER_MAPPED(vector_irq_t, vector_irq);

				#endif /* !ASSEMBLY_ */

									
										7

arch/x86/include/asm/intel-family.h
									
												View File
												
				@ -12,6 +12,7 @@

				 */

				#define INTEL_FAM6_CORE_YONAH		0x0E

				#define INTEL_FAM6_CORE2_MEROM		0x0F

				#define INTEL_FAM6_CORE2_MEROM_L	0x16

				#define INTEL_FAM6_CORE2_PENRYN		0x17

				@ -21,6 +22,7 @@

				#define INTEL_FAM6_NEHALEM_G		0x1F /* Auburndale / Havendale */

				#define INTEL_FAM6_NEHALEM_EP		0x1A

				#define INTEL_FAM6_NEHALEM_EX		0x2E

				#define INTEL_FAM6_WESTMERE		0x25

				#define INTEL_FAM6_WESTMERE_EP		0x2C

				#define INTEL_FAM6_WESTMERE_EX		0x2F

				@ -36,9 +38,9 @@

				#define INTEL_FAM6_HASWELL_GT3E		0x46

				#define INTEL_FAM6_BROADWELL_CORE	0x3D

				#define INTEL_FAM6_BROADWELL_XEON_D	0x56

				#define INTEL_FAM6_BROADWELL_GT3E	0x47

				#define INTEL_FAM6_BROADWELL_X		0x4F

				#define INTEL_FAM6_BROADWELL_XEON_D	0x56

				#define INTEL_FAM6_SKYLAKE_MOBILE	0x4E

				#define INTEL_FAM6_SKYLAKE_DESKTOP	0x5E

				@ -57,9 +59,10 @@

				#define INTEL_FAM6_ATOM_SILVERMONT2	0x4D /* Avaton/Rangely */

				#define INTEL_FAM6_ATOM_AIRMONT		0x4C /* CherryTrail / Braswell */

				#define INTEL_FAM6_ATOM_MERRIFIELD	0x4A /* Tangier */

				#define INTEL_FAM6_ATOM_MOOREFIELD	0x5A /* Annidale */

				#define INTEL_FAM6_ATOM_MOOREFIELD	0x5A /* Anniedale */

				#define INTEL_FAM6_ATOM_GOLDMONT	0x5C

				#define INTEL_FAM6_ATOM_DENVERTON	0x5F /* Goldmont Microserver */

				#define INTEL_FAM6_ATOM_GEMINI_LAKE	0x7A

				/* Xeon Phi */

									
										141

arch/x86/include/asm/kaiser.h
									
										Normal file
									
												View File
												
				@ -0,0 +1,141 @@

				#ifndef _ASM_X86_KAISER_H

				#define _ASM_X86_KAISER_H

				#include <uapi/asm/processor-flags.h> /* For PCID constants */

				/*

				 * This file includes the definitions for the KAISER feature.

				 * KAISER is a counter measure against x86_64 side channel attacks on

				 * the kernel virtual memory.  It has a shadow pgd for every process: the

				 * shadow pgd has a minimalistic kernel-set mapped, but includes the whole

				 * user memory. Within a kernel context switch, or when an interrupt is handled,

				 * the pgd is switched to the normal one. When the system switches to user mode,

				 * the shadow pgd is enabled. By this, the virtual memory caches are freed,

				 * and the user may not attack the whole kernel memory.

				 *

				 * A minimalistic kernel mapping holds the parts needed to be mapped in user

				 * mode, such as the entry/exit functions of the user space, or the stacks.

				 */

				#define KAISER_SHADOW_PGD_OFFSET 0x1000

				#ifdef __ASSEMBLY__

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

				.macro _SWITCH_TO_KERNEL_CR3 reg

				movq %cr3, \reg

				andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), \reg

				/* If PCID enabled, set X86_CR3_PCID_NOFLUSH_BIT */

				ALTERNATIVE "", "bts $63, \reg", X86_FEATURE_PCID

				movq \reg, %cr3

				.endm

				.macro _SWITCH_TO_USER_CR3 reg regb

				/*

				 * regb must be the low byte portion of reg: because we have arranged

				 * for the low byte of the user PCID to serve as the high byte of NOFLUSH

				 * (0x80 for each when PCID is enabled, or 0x00 when PCID and NOFLUSH are

				 * not enabled): so that the one register can update both memory and cr3.

				 */

				movq %cr3, \reg

				orq  PER_CPU_VAR(x86_cr3_pcid_user), \reg

				js   9f

				/* If PCID enabled, FLUSH this time, reset to NOFLUSH for next time */

				movb \regb, PER_CPU_VAR(x86_cr3_pcid_user+7)

				9:

				movq \reg, %cr3

				.endm

				.macro SWITCH_KERNEL_CR3

				ALTERNATIVE "jmp 8f", "pushq %rax", X86_FEATURE_KAISER

				_SWITCH_TO_KERNEL_CR3 %rax

				popq %rax

				8:

				.endm

				.macro SWITCH_USER_CR3

				ALTERNATIVE "jmp 8f", "pushq %rax", X86_FEATURE_KAISER

				_SWITCH_TO_USER_CR3 %rax %al

				popq %rax

				8:

				.endm

				.macro SWITCH_KERNEL_CR3_NO_STACK

				ALTERNATIVE "jmp 8f", \

					__stringify(movq %rax, PER_CPU_VAR(unsafe_stack_register_backup)), \

					X86_FEATURE_KAISER

				_SWITCH_TO_KERNEL_CR3 %rax

				movq PER_CPU_VAR(unsafe_stack_register_backup), %rax

				8:

				.endm

				#else /* CONFIG_PAGE_TABLE_ISOLATION */

				.macro SWITCH_KERNEL_CR3

				.endm

				.macro SWITCH_USER_CR3

				.endm

				.macro SWITCH_KERNEL_CR3_NO_STACK

				.endm

				#endif /* CONFIG_PAGE_TABLE_ISOLATION */

				#else /* __ASSEMBLY__ */

				#ifdef CONFIG_PAGE_TABLE_ISOLATION

				/*

				 * Upon kernel/user mode switch, it may happen that the address

				 * space has to be switched before the registers have been

				 * stored.  To change the address space, another register is

				 * needed.  A register therefore has to be stored/restored.

				*/

				DECLARE_PER_CPU_USER_MAPPED(unsigned long, unsafe_stack_register_backup);

				DECLARE_PER_CPU(unsigned long, x86_cr3_pcid_user);

				extern char __per_cpu_user_mapped_start[], __per_cpu_user_mapped_end[];

				extern int kaiser_enabled;

				extern void __init kaiser_check_boottime_disable(void);

				#else

				#define kaiser_enabled	0

				static inline void __init kaiser_check_boottime_disable(void) {}

				#endif /* CONFIG_PAGE_TABLE_ISOLATION */

				/*

				 * Kaiser function prototypes are needed even when CONFIG_PAGE_TABLE_ISOLATION is not set,

				 * so as to build with tests on kaiser_enabled instead of #ifdefs.

				 */

				/**

				 *  kaiser_add_mapping - map a virtual memory part to the shadow (user) mapping

				 *  @addr: the start address of the range

				 *  @size: the size of the range

				 *  @flags: The mapping flags of the pages

				 *

				 *  The mapping is done on a global scope, so no bigger

				 *  synchronization has to be done.  the pages have to be

				 *  manually unmapped again when they are not needed any longer.

				 */

				extern int kaiser_add_mapping(unsigned long addr, unsigned long size, unsigned long flags);

				/**

				 *  kaiser_remove_mapping - unmap a virtual memory part of the shadow mapping

				 *  @addr: the start address of the range

				 *  @size: the size of the range

				 */

				extern void kaiser_remove_mapping(unsigned long start, unsigned long size);

				/**

				 *  kaiser_init - Initialize the shadow mapping

				 *

				 *  Most parts of the shadow mapping can be mapped upon boot

				 *  time.  Only per-process things like the thread stacks

				 *  or a new LDT have to be mapped at runtime.  These boot-

				 *  time mappings are permanent and never unmapped.

				 */

				extern void kaiser_init(void);

				#endif /* __ASSEMBLY */

				#endif /* _ASM_X86_KAISER_H */

									
										3

arch/x86/include/asm/kvm_host.h
									
												View File
												
				@ -1113,7 +1113,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,

				static inline int emulate_instruction(struct kvm_vcpu *vcpu,

							int emulation_type)

				{

					return x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0);

					return x86_emulate_instruction(vcpu, 0,

							emulation_type | EMULTYPE_NO_REEXECUTE, NULL, 0);

				}

				void kvm_enable_efer_bits(u64);

									
										6

arch/x86/include/asm/mmu.h
									
												View File
												
				@ -33,12 +33,6 @@ typedef struct {

				#endif

				} mm_context_t;

				#ifdef CONFIG_SMP

				void leave_mm(int cpu);

				#else

				static inline void leave_mm(int cpu)

				{

				}

				#endif

				#endif /* _ASM_X86_MMU_H */

									
										2

arch/x86/include/asm/mmu_context.h
									
												View File
												
				@ -99,10 +99,8 @@ static inline void load_mm_ldt(struct mm_struct *mm)

				static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)

				{

				#ifdef CONFIG_SMP

					if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK)

						this_cpu_write(cpu_tlbstate.state, TLBSTATE_LAZY);

				#endif

				}

				static inline int init_new_context(struct task_struct *tsk,

									
										15

arch/x86/include/asm/msr-index.h
									
												View File
												
				@ -37,6 +37,13 @@

				#define EFER_FFXSR		(1<<_EFER_FFXSR)

				/* Intel MSRs. Some also available on other CPUs */

				#define MSR_IA32_SPEC_CTRL		0x00000048 /* Speculation Control */

				#define SPEC_CTRL_IBRS			(1 << 0)   /* Indirect Branch Restricted Speculation */

				#define SPEC_CTRL_STIBP			(1 << 1)   /* Single Thread Indirect Branch Predictors */

				#define MSR_IA32_PRED_CMD		0x00000049 /* Prediction Command */

				#define PRED_CMD_IBPB			(1 << 0)   /* Indirect Branch Prediction Barrier */

				#define MSR_IA32_PERFCTR0		0x000000c1

				#define MSR_IA32_PERFCTR1		0x000000c2

				#define MSR_FSB_FREQ			0x000000cd

				@ -50,6 +57,11 @@

				#define SNB_C3_AUTO_UNDEMOTE		(1UL << 28)

				#define MSR_MTRRcap			0x000000fe

				#define MSR_IA32_ARCH_CAPABILITIES	0x0000010a

				#define ARCH_CAP_RDCL_NO		(1 << 0)   /* Not susceptible to Meltdown */

				#define ARCH_CAP_IBRS_ALL		(1 << 1)   /* Enhanced IBRS support */

				#define MSR_IA32_BBL_CR_CTL		0x00000119

				#define MSR_IA32_BBL_CR_CTL3		0x0000011e

				@ -330,6 +342,9 @@

				#define FAM10H_MMIO_CONF_BASE_MASK	0xfffffffULL

				#define FAM10H_MMIO_CONF_BASE_SHIFT	20

				#define MSR_FAM10H_NODE_ID		0xc001100c

				#define MSR_F10H_DECFG			0xc0011029

				#define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT	1

				#define MSR_F10H_DECFG_LFENCE_SERIALIZE		BIT_ULL(MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT)

				/* K8 MSRs */

				#define MSR_K8_TOP_MEM1			0xc001001a

									
										3

arch/x86/include/asm/msr.h
									
												View File
												
				@ -188,8 +188,7 @@ static __always_inline unsigned long long rdtsc_ordered(void)

					 * that some other imaginary CPU is updating continuously with a

					 * time stamp.

					 */

					alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,

							  "lfence", X86_FEATURE_LFENCE_RDTSC);

					barrier_nospec();

					return rdtsc();

				}

									
										179

arch/x86/include/asm/nospec-branch.h
									
										Normal file
									
												View File
												
				@ -0,0 +1,179 @@

				/* SPDX-License-Identifier: GPL-2.0 */

				#ifndef _ASM_X86_NOSPEC_BRANCH_H_

				#define _ASM_X86_NOSPEC_BRANCH_H_

				#include <asm/alternative.h>

				#include <asm/alternative-asm.h>

				#include <asm/cpufeatures.h>

				#ifdef __ASSEMBLY__

				/*

				 * This should be used immediately before a retpoline alternative.  It tells

				 * objtool where the retpolines are so that it can make sense of the control

				 * flow by just reading the original instruction(s) and ignoring the

				 * alternatives.

				 */

				.macro ANNOTATE_NOSPEC_ALTERNATIVE

					.Lannotate_\@:

					.pushsection .discard.nospec

					.long .Lannotate_\@ - .

					.popsection

				.endm

				/*

				 * These are the bare retpoline primitives for indirect jmp and call.

				 * Do not use these directly; they only exist to make the ALTERNATIVE

				 * invocation below less ugly.

				 */

				.macro RETPOLINE_JMP reg:req

					call	.Ldo_rop_\@

				.Lspec_trap_\@:

					pause

					lfence

					jmp	.Lspec_trap_\@

				.Ldo_rop_\@:

					mov	\reg, (%_ASM_SP)

					ret

				.endm

				/*

				 * This is a wrapper around RETPOLINE_JMP so the called function in reg

				 * returns to the instruction after the macro.

				 */

				.macro RETPOLINE_CALL reg:req

					jmp	.Ldo_call_\@

				.Ldo_retpoline_jmp_\@:

					RETPOLINE_JMP \reg

				.Ldo_call_\@:

					call	.Ldo_retpoline_jmp_\@

				.endm

				/*

				 * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple

				 * indirect jmp/call which may be susceptible to the Spectre variant 2

				 * attack.

				 */

				.macro JMP_NOSPEC reg:req

				#ifdef CONFIG_RETPOLINE

					ANNOTATE_NOSPEC_ALTERNATIVE

					ALTERNATIVE_2 __stringify(jmp *\reg),				\

						__stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE,	\

						__stringify(lfence; jmp *\reg), X86_FEATURE_RETPOLINE_AMD

				#else

					jmp	*\reg

				#endif

				.endm

				.macro CALL_NOSPEC reg:req

				#ifdef CONFIG_RETPOLINE

					ANNOTATE_NOSPEC_ALTERNATIVE

					ALTERNATIVE_2 __stringify(call *\reg),				\

						__stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\

						__stringify(lfence; call *\reg), X86_FEATURE_RETPOLINE_AMD

				#else

					call	*\reg

				#endif

				.endm

				/* This clobbers the BX register */

				.macro FILL_RETURN_BUFFER nr:req ftr:req

				#ifdef CONFIG_RETPOLINE

					ALTERNATIVE "", "call __clear_rsb", \ftr

				#endif

				.endm

				#else /* __ASSEMBLY__ */

				#define ANNOTATE_NOSPEC_ALTERNATIVE				\

					"999:\n\t"						\

					".pushsection .discard.nospec\n\t"			\

					".long 999b - .\n\t"					\

					".popsection\n\t"

				#if defined(CONFIG_X86_64) && defined(RETPOLINE)

				/*

				 * Since the inline asm uses the %V modifier which is only in newer GCC,

				 * the 64-bit one is dependent on RETPOLINE not CONFIG_RETPOLINE.

				 */

				# define CALL_NOSPEC						\

					ANNOTATE_NOSPEC_ALTERNATIVE				\

					ALTERNATIVE(						\

					"call *%[thunk_target]\n",				\

					"call __x86_indirect_thunk_%V[thunk_target]\n",		\

					X86_FEATURE_RETPOLINE)

				# define THUNK_TARGET(addr) [thunk_target] "r" (addr)

				#elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE)

				/*

				 * For i386 we use the original ret-equivalent retpoline, because

				 * otherwise we'll run out of registers. We don't care about CET

				 * here, anyway.

				 */

				# define CALL_NOSPEC ALTERNATIVE("call *%[thunk_target]\n",	\

					"       jmp    904f;\n"					\

					"       .align 16\n"					\

					"901:	call   903f;\n"					\

					"902:	pause;\n"					\

					"    	lfence;\n"					\

					"       jmp    902b;\n"					\

					"       .align 16\n"					\

					"903:	addl   $4, %%esp;\n"				\

					"       pushl  %[thunk_target];\n"			\

					"       ret;\n"						\

					"       .align 16\n"					\

					"904:	call   901b;\n",				\

					X86_FEATURE_RETPOLINE)

				# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)

				#else /* No retpoline for C / inline asm */

				# define CALL_NOSPEC "call *%[thunk_target]\n"

				# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)

				#endif

				/* The Spectre V2 mitigation variants */

				enum spectre_v2_mitigation {

					SPECTRE_V2_NONE,

					SPECTRE_V2_RETPOLINE_MINIMAL,

					SPECTRE_V2_RETPOLINE_MINIMAL_AMD,

					SPECTRE_V2_RETPOLINE_GENERIC,

					SPECTRE_V2_RETPOLINE_AMD,

					SPECTRE_V2_IBRS,

				};

				extern char __indirect_thunk_start[];

				extern char __indirect_thunk_end[];

				/*

				 * On VMEXIT we must ensure that no RSB predictions learned in the guest

				 * can be followed in the host, by overwriting the RSB completely. Both

				 * retpoline and IBRS mitigations for Spectre v2 need this; only on future

				 * CPUs with IBRS_ALL *might* it be avoided.

				 */

				static inline void vmexit_fill_RSB(void)

				{

				#ifdef CONFIG_RETPOLINE

					alternative_input("",

							  "call __fill_rsb",

							  X86_FEATURE_RETPOLINE,

							  ASM_NO_INPUT_CLOBBER(_ASM_BX, "memory"));

				#endif

				}

				static inline void indirect_branch_prediction_barrier(void)

				{

					asm volatile(ALTERNATIVE("",

								 "movl %[msr], %%ecx\n\t"

								 "movl %[val], %%eax\n\t"

								 "movl $0, %%edx\n\t"

								 "wrmsr",

								 X86_FEATURE_USE_IBPB)

						     : : [msr] "i" (MSR_IA32_PRED_CMD),

							 [val] "i" (PRED_CMD_IBPB)

						     : "eax", "ecx", "edx", "memory");

				}

				#endif /* __ASSEMBLY__ */

				#endif /* _ASM_X86_NOSPEC_BRANCH_H_ */

Compare commits

932 Commits v4.9.70 ... v4.9.82

16 Documentation/ABI/testing/sysfs-devices-system-cpu Unescape Escape View File

47 Documentation/kernel-parameters.txt Unescape Escape View File

90 Documentation/speculation.txt Normal file Unescape Escape View File

186 Documentation/x86/pti.txt Normal file Unescape Escape View File

5 Makefile Unescape Escape View File

3 arch/alpha/kernel/pci_impl.h Unescape Escape View File

3 arch/alpha/kernel/process.c Unescape Escape View File

13 arch/alpha/kernel/traps.c Unescape Escape View File

5 arch/arc/include/asm/uaccess.h Unescape Escape View File

1 arch/arm/boot/dts/am335x-evmsk.dts Unescape Escape View File

4 arch/arm/boot/dts/bcm-nsp.dtsi Unescape Escape View File

2 arch/arm/boot/dts/dra7.dtsi Unescape Escape View File

10 arch/arm/boot/dts/kirkwood-openblocks_a7.dts Unescape Escape View File

2 arch/arm/configs/sunxi_defconfig Unescape Escape View File

1 arch/arm/kvm/arm.c Unescape Escape View File

13 arch/arm/kvm/handle_exit.c Unescape Escape View File

6 arch/arm/kvm/mmio.c Unescape Escape View File

2 arch/arm/kvm/mmu.c Unescape Escape View File

20 arch/arm/mm/dma-mapping.c Unescape Escape View File

36 arch/arm/probes/kprobes/core.c Unescape Escape View File

11 arch/arm/probes/kprobes/test-core.c Unescape Escape View File

8 arch/arm64/Makefile Unescape Escape View File

4 arch/arm64/kvm/handle_exit.c Unescape Escape View File

2 arch/arm64/mm/init.c Unescape Escape View File

7 arch/blackfin/Kconfig Unescape Escape View File

1 arch/blackfin/Kconfig.debug Unescape Escape View File

2 arch/mips/ar7/platform.c Unescape Escape View File

12 arch/mips/kernel/process.c Unescape Escape View File

153 arch/mips/kernel/ptrace.c Unescape Escape View File

28 arch/mips/math-emu/cp1emu.c Unescape Escape View File

2 arch/mn10300/mm/misalignment.c Unescape Escape View File

2 arch/openrisc/include/asm/uaccess.h Unescape Escape View File

10 arch/openrisc/kernel/traps.c Unescape Escape View File

2 arch/parisc/include/asm/ldcw.h Unescape Escape View File

13 arch/parisc/kernel/entry.S Unescape Escape View File

9 arch/parisc/kernel/pacache.S Unescape Escape View File

39 arch/parisc/kernel/process.c Unescape Escape View File

1 arch/powerpc/Kconfig Unescape Escape View File

6 arch/powerpc/include/asm/exception-64e.h Unescape Escape View File

53 arch/powerpc/include/asm/exception-64s.h Unescape Escape View File

15 arch/powerpc/include/asm/feature-fixups.h Unescape Escape View File

18 arch/powerpc/include/asm/hvcall.h Unescape Escape View File

10 arch/powerpc/include/asm/paca.h Unescape Escape View File

14 arch/powerpc/include/asm/plpar_wrappers.h Unescape Escape View File

13 arch/powerpc/include/asm/setup.h Unescape Escape View File

4 arch/powerpc/kernel/asm-offsets.c Unescape Escape View File

30 arch/powerpc/kernel/entry_64.S Unescape Escape View File

108 arch/powerpc/kernel/exceptions-64s.S Unescape Escape View File

139 arch/powerpc/kernel/setup_64.c Unescape Escape View File

9 arch/powerpc/kernel/vmlinux.lds.S Unescape Escape View File

42 arch/powerpc/lib/feature-fixups.c Unescape Escape View File

8 arch/powerpc/perf/core-book3s.c Unescape Escape View File

2 arch/powerpc/perf/hv-24x7.c Unescape Escape View File

6 arch/powerpc/platforms/powernv/opal-async.c Unescape Escape View File

52 arch/powerpc/platforms/powernv/setup.c Unescape Escape View File

35 arch/powerpc/platforms/pseries/setup.c Unescape Escape View File

4 arch/powerpc/sysdev/ipic.c Unescape Escape View File

1 arch/s390/kernel/compat_linux.c Unescape Escape View File

3 arch/sh/kernel/traps_32.c Unescape Escape View File

1 arch/sparc/mm/srmmu.c Unescape Escape View File

2 arch/um/Makefile Unescape Escape View File

16 arch/x86/Kconfig Unescape Escape View File

8 arch/x86/Makefile Unescape Escape View File

1 arch/x86/boot/compressed/misc.h Unescape Escape View File

5 arch/x86/crypto/aesni-intel_asm.S Unescape Escape View File

2 arch/x86/crypto/aesni-intel_glue.c Unescape Escape View File

3 arch/x86/crypto/camellia-aesni-avx-asm_64.S Unescape Escape View File

3 arch/x86/crypto/camellia-aesni-avx2-asm_64.S Unescape Escape View File

3 arch/x86/crypto/crc32c-pcl-intel-asm_64.S Unescape Escape View File

1 arch/x86/crypto/poly1305_glue.c Unescape Escape View File

7 arch/x86/crypto/salsa20_glue.c Unescape Escape View File

10 arch/x86/crypto/sha512-mb/sha512_mb_mgr_init_avx2.c Unescape Escape View File

9 arch/x86/entry/common.c Unescape Escape View File

22 arch/x86/entry/entry_32.S Unescape Escape View File

294 arch/x86/entry/entry_64.S Unescape Escape View File

8 arch/x86/entry/entry_64_compat.S Unescape Escape View File

7 arch/x86/entry/syscall_64.c Unescape Escape View File

12 arch/x86/entry/vsyscall/vsyscall_64.c Unescape Escape View File

932 Commits

v4.9.70 ... v4.9.82

16

Documentation/ABI/testing/sysfs-devices-system-cpu

View File

47

Documentation/kernel-parameters.txt

View File

90

Documentation/speculation.txt Normal file

View File

186

Documentation/x86/pti.txt Normal file

View File

5

Makefile

View File

3

arch/alpha/kernel/pci_impl.h

View File

3

arch/alpha/kernel/process.c

View File

13

arch/alpha/kernel/traps.c

View File

5

arch/arc/include/asm/uaccess.h

View File

1

arch/arm/boot/dts/am335x-evmsk.dts

View File

4

arch/arm/boot/dts/bcm-nsp.dtsi

View File

2

arch/arm/boot/dts/dra7.dtsi

View File

10

arch/arm/boot/dts/kirkwood-openblocks_a7.dts

View File

2

arch/arm/configs/sunxi_defconfig

View File

1

arch/arm/kvm/arm.c

View File

13

arch/arm/kvm/handle_exit.c

View File

6

arch/arm/kvm/mmio.c

View File

2

arch/arm/kvm/mmu.c

View File

20

arch/arm/mm/dma-mapping.c

View File

36

arch/arm/probes/kprobes/core.c

View File

11

arch/arm/probes/kprobes/test-core.c

View File

8

arch/arm64/Makefile

View File

4

arch/arm64/kvm/handle_exit.c

View File

2

arch/arm64/mm/init.c

View File

7

arch/blackfin/Kconfig

View File

1

arch/blackfin/Kconfig.debug

View File

2

arch/mips/ar7/platform.c

View File

12

arch/mips/kernel/process.c

View File

153

arch/mips/kernel/ptrace.c

View File

28

arch/mips/math-emu/cp1emu.c

View File

2

arch/mn10300/mm/misalignment.c

View File

2

arch/openrisc/include/asm/uaccess.h

View File

10

arch/openrisc/kernel/traps.c

View File

2

arch/parisc/include/asm/ldcw.h

View File

13

arch/parisc/kernel/entry.S

View File

9

arch/parisc/kernel/pacache.S

View File

39

arch/parisc/kernel/process.c

View File

1

arch/powerpc/Kconfig

View File

6

arch/powerpc/include/asm/exception-64e.h

View File

53

arch/powerpc/include/asm/exception-64s.h

View File

15

arch/powerpc/include/asm/feature-fixups.h

View File

18

arch/powerpc/include/asm/hvcall.h

View File

10

arch/powerpc/include/asm/paca.h

View File

14

arch/powerpc/include/asm/plpar_wrappers.h

View File

13

arch/powerpc/include/asm/setup.h

View File

4

arch/powerpc/kernel/asm-offsets.c

View File

30

arch/powerpc/kernel/entry_64.S

View File

108

arch/powerpc/kernel/exceptions-64s.S

View File

139

arch/powerpc/kernel/setup_64.c

View File

9

arch/powerpc/kernel/vmlinux.lds.S

View File

42

arch/powerpc/lib/feature-fixups.c

View File

8

arch/powerpc/perf/core-book3s.c

View File

2

arch/powerpc/perf/hv-24x7.c

View File

6

arch/powerpc/platforms/powernv/opal-async.c

View File

52

arch/powerpc/platforms/powernv/setup.c

View File

35

arch/powerpc/platforms/pseries/setup.c

View File

4

arch/powerpc/sysdev/ipic.c

View File

1

arch/s390/kernel/compat_linux.c

View File

3

arch/sh/kernel/traps_32.c

View File

1

arch/sparc/mm/srmmu.c

View File

2

arch/um/Makefile

View File

16

arch/x86/Kconfig

View File

8

arch/x86/Makefile

View File

1

arch/x86/boot/compressed/misc.h

View File

5

arch/x86/crypto/aesni-intel_asm.S

View File

2

arch/x86/crypto/aesni-intel_glue.c

View File

3

arch/x86/crypto/camellia-aesni-avx-asm_64.S

View File

3

arch/x86/crypto/camellia-aesni-avx2-asm_64.S

View File

3

arch/x86/crypto/crc32c-pcl-intel-asm_64.S

View File

1

arch/x86/crypto/poly1305_glue.c

View File

7

arch/x86/crypto/salsa20_glue.c

View File

10

arch/x86/crypto/sha512-mb/sha512_mb_mgr_init_avx2.c

View File

9

arch/x86/entry/common.c

View File

22

arch/x86/entry/entry_32.S

View File

294

arch/x86/entry/entry_64.S

View File

8

arch/x86/entry/entry_64_compat.S

View File

7

arch/x86/entry/syscall_64.c

View File

12

arch/x86/entry/vsyscall/vsyscall_64.c

View File

2

arch/x86/events/amd/power.c

View File