i2som-imx-linux

Author	SHA1	Message	Date
Greg Kroah-Hartman	abb04342fc	Linux 4.11.6	2017-06-17 06:47:27 +02:00
Kai Chen	6a9a78ba17	drm/i915: Disable decoupled MMIO commit `4c4c565513` upstream. The decoupled MMIO feature doesn't work as intended by HW team. Enabling it with forcewake will only make debugging efforts more difficult, so let's disable it. Fixes: `85ee17ebee` ("drm/i915/bxt: Broxton decoupled MMIO") Cc: Zhe Wang <zhe1.wang@intel.com> Cc: Praveen Paneri <praveen.paneri@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: intel-gfx@lists.freedesktop.org Signed-off-by: Kai Chen <kai.chen@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170523215812.18328-2-kai.chen@intel.com (cherry picked from commit `0051c10aca`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Maarten Lankhorst	b094fd12b9	drm/i915: Always recompute watermarks when distrust_bios_wm is set, v2. commit `4e3aed8445` upstream. On some systems there can be a race condition in which no crtc state is added to the first atomic commit. This results in all crtc's having a null DDB allocation, causing a FIFO underrun on any update until the first modeset. Changes since v1: - Do not take the connection_mutex, this is already done below. Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Inspired-by: Mahesh Kumar <mahesh1.kumar@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: `98d39494d3` ("drm/i915/gen9: Compute DDB allocation at atomic check time (v4)") Cc: Mahesh Kumar <mahesh1.kumar@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170531154236.27180-1-maarten.lankhorst@linux.intel.com Reviewed-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit `367d73d280`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-06-17 06:44:18 +02:00
Chris Wilson	dcdac1c29f	drm/i915: Guard against i915_ggtt_disable_guc() being invoked unconditionally commit `d90c98905a` upstream. Commit `7c3f86b6dc` ("drm/i915: Invalidate the guc ggtt TLB upon insertion") added the restoration of the invalidation routine after the GuC was disabled, but missed that the GuC was unconditionally disabled when not used. This then overwrites the invalidate routine for the older chipsets, causing havoc and breaking resume as the most obvious victim. We place the guard inside i915_ggtt_disable_guc() to be backport friendly (the bug was introduced into v4.11) but it would be preferred to be in more control over when this was guard (i.e. do not try and teardown the data structures before we have enabled them). That should be true with the reorganisation of the guc loaders. Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Fixes: `7c3f86b6dc` ("drm/i915: Invalidate the guc ggtt TLB upon insertion") Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170531190514.3691-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com> (cherry picked from commit `cb60606d83`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Ville Syrjälä	4d2c473f9f	drm/i915: Workaround VLV/CHV DSI scanline counter hardware fail commit `8f4d38099b` upstream. The scanline counter is bonkers on VLV/CHV DSI. The scanline counter increment is not lined up with the start of vblank like it is on every other platform and output type. This causes problems for both the vblank timestamping and atomic update vblank evasion. On my FFRD8 machine at least, the scanline counter increment happens about 1/3 of a scanline ahead of the start of vblank (which is where all register latching happens still). That means we can't trust the scanline counter to tell us whether we're in vblank or not while we're on that particular line. In order to keep vblank timestamping in working condition when called from the vblank irq, we'll leave scanline_offset at one, which means that the entire line containing the start of vblank is considered to be inside the vblank. For the vblank evasion we'll need to consider that entire line to be bad, since we can't tell whether the registers already got latched or not. And we can't actually use the start of vblank interrupt to get us past that line as the interrupt would fire too soon, and then we'd up waiting for the next start of vblank instead. One way around that would using the frame start interrupt instead since that wouldn't fire until the next scanline, but that would require some bigger changes in the interrupt code. So for simplicity we'll just poll until we get past the bad line. v2: Adjust the comments a bit Cc: Jonas Aaberg <cja@gmx.net> Tested-by: Jonas Aaberg <cja@gmx.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99086 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161215174734.28779-1-ville.syrjala@linux.intel.com Tested-by: Mika Kahola <mika.kahola@intel.com> Reviewed-by: Mika Kahola <mika.kahola@intel.com> (cherry picked from commit `ec1b4ee283`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Ville Syrjälä	3b981b2388	drm/i915: Fix 90/270 rotated coordinates for FBC commit `1065467ed8` upstream. The clipped src coordinates have already been rotated by 270 degrees for when the plane rotation is 90/270 degrees, hence the FBC code should no longer swap the width and height. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Fixes: `b63a16f6cd` ("drm/i915: Compute display surface offset in the plane check hook for SKL+") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170331180056.14086-4-ville.syrjala@linux.intel.com Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Tested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit `73714c05df`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Daniel Vetter	5b31ae00ac	Revert "drm/i915: Restore lost "Initialized i915" welcome message" commit `d38162e4b5` upstream. This reverts commit `bc5ca47c0a`. Gabriel put this back into generic code with commit `75f6dfe3e6` Author: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Date: Wed Dec 28 12:32:11 2016 -0200 drm: Deduplicate driver initialization message but somehow he missed Chris' patch to add the message meanwhile. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101025 Fixes: `75f6dfe3e6` ("drm: Deduplicate driver initialization message") Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517131557.7836-1-daniel.vetter@ffwll.ch (cherry picked from commit `6bdba81979`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Christian Borntraeger	c094b10c3c	s390/kvm: do not rely on the ILC on kvm host protection fauls commit `c0e7bb38c0` upstream. For most cases a protection exception in the host (e.g. copy on write or dirty tracking) on the sie instruction will indicate an instruction length of 4. Turns out that there are some corner cases (e.g. runtime instrumentation) where this is not necessarily true and the ILC is unpredictable. Let's replace our 4 byte rewind_pad with 3 byte nops to prepare for all possible ILCs. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Max Filippov	2273302258	xtensa: don't use linux IRQ #0 commit `e5c86679d5` upstream. Linux IRQ #0 is reserved for error reporting and may not be used. Increase NR_IRQS for one additional slot and increase irq_domain_add_legacy parameter first_irq value to 1, so that linux IRQ #0 is not associated with hardware IRQ #0 in legacy IRQ domains. Introduce macro XTENSA_PIC_LINUX_IRQ for static translation of xtensa PIC hardware IRQ # to linux IRQ #. Use this macro in XTFPGA platform data definitions. This fixes inability to use hardware IRQ #0 in configurations that don't use device tree and allows for non-identity mapping between linux IRQ # and hardware IRQ #. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Dave Young	2f3271eb10	efi: Fix boot panic because of invalid BGRT image address commit `792ef14df5` upstream. Maniaxx reported a kernel boot crash in the EFI code, which I emulated by using same invalid phys addr in code: BUG: unable to handle kernel paging request at ffffffffff280001 IP: efi_bgrt_init+0xfb/0x153 ... Call Trace: ? bgrt_init+0xbc/0xbc acpi_parse_bgrt+0xe/0x12 acpi_table_parse+0x89/0xb8 acpi_boot_init+0x445/0x4e2 ? acpi_parse_x2apic+0x79/0x79 ? dmi_ignore_irq0_timer_override+0x33/0x33 setup_arch+0xb63/0xc82 ? early_idt_handler_array+0x120/0x120 start_kernel+0xb7/0x443 ? early_idt_handler_array+0x120/0x120 x86_64_start_reservations+0x29/0x2b x86_64_start_kernel+0x154/0x177 secondary_startup_64+0x9f/0x9f There is also a similar bug filed in bugzilla.kernel.org: https://bugzilla.kernel.org/show_bug.cgi?id=195633 The crash is caused by this commit: `7b0a911478` efi/x86: Move the EFI BGRT init code to early init code The root cause is the firmware on those machines provides invalid BGRT image addresses. In a kernel before above commit BGRT initializes late and uses ioremap() to map the image address. Ioremap validates the address, if it is not a valid physical address ioremap() just fails and returns. However in current kernel EFI BGRT initializes early and uses early_memremap() which does not validate the image address, and kernel panic happens. According to ACPI spec the BGRT image address should fall into EFI_BOOT_SERVICES_DATA, see the section 5.2.22.4 of below document: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf Fix this issue by validating the image address in efi_bgrt_init(). If the image address does not fall into any EFI_BOOT_SERVICES_DATA areas we just bail out with a warning message. Reported-by: Maniaxx <tripleshiftone@gmail.com> Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Fixes: `7b0a911478` ("efi/x86: Move the EFI BGRT init code to early init code") Link: http://lkml.kernel.org/r/20170609084558.26766-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Richard	041d665eb1	partitions/msdos: FreeBSD UFS2 file systems are not recognized commit `223220356d` upstream. The code in block/partitions/msdos.c recognizes FreeBSD, OpenBSD and NetBSD partitions and does a reasonable job picking out OpenBSD and NetBSD UFS subpartitions. But for FreeBSD the subpartitions are always "bad". Kernel: <bsd:bad subpartition - ignored Though all 3 of these BSD systems use UFS as a file system, only FreeBSD uses relative start addresses in the subpartition declarations. The following patch fixes this for FreeBSD partitions and leaves the code for OpenBSD and NetBSD intact: Signed-off-by: Richard Narron <comet.berkeley@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:17 +02:00
Imre Deak	99f5ba009e	drm/i915: Prevent the system suspend complete optimization commit `6ab92afc95` upstream. Since commit `bac2a909a0` Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Date: Wed Jan 21 02:17:42 2015 +0100 PCI / PM: Avoid resuming PCI devices during system suspend PCI devices will default to allowing the system suspend complete optimization where devices are not woken up during system suspend if they were already runtime suspended. This however breaks the i915/HDA drivers for two reasons: - The i915 driver has system suspend specific steps that it needs to run, that bring the device to a different state than its runtime suspended state. - The HDA driver's suspend handler requires power that it will request from the i915 driver's power domain handler. This in turn requires the i915 driver to runtime resume itself, but this won't be possible if the suspend complete optimization is in effect: in this case the i915 runtime PM is disabled and trying to get an RPM reference returns -EACCESS. Solve this by requiring the PCI/PM core to resume the device during system suspend which in effect disables the suspend complete optimization. Regardless of the above commit the optimization stayed disabled for DRM devices until commit `d14d2a8453` Author: Lukas Wunner <lukas@wunner.de> Date: Wed Jun 8 12:49:29 2016 +0200 drm: Remove dev_pm_ops from drm_class so this patch is in practice a fix for this commit. Another reason for the bug staying hidden for so long is that the optimization for a device is disabled if it's disabled for any of its children devices. i915 may have a backlight device as its child which doesn't support runtime PM and so doesn't allow the optimization either. So if this backlight device got registered the bug stayed hidden. Credits to Marta, Tomi and David who enabled pstore logging, that caught one instance of this issue across a suspend/ resume-to-ram and Ville who rememberd that the optimization was enabled for some devices at one point. The first WARN triggered by the problem: [ 6250.746445] WARNING: CPU: 2 PID: 17384 at drivers/gpu/drm/i915/intel_runtime_pm.c:2846 intel_runtime_pm_get+0x6b/0xd0 [i915] [ 6250.746448] pm_runtime_get_sync() failed: -13 [ 6250.746451] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core ptp mei_me pps_core snd_pcm lpc_ich mei prime_ numbers i2c_hid i2c_designware_platform i2c_designware_core [last unloaded: i915] [ 6250.746512] CPU: 2 PID: 17384 Comm: kworker/u8:0 Tainted: G U W 4.11.0-rc5-CI-CI_DRM_334+ #1 [ 6250.746515] Hardware name: /NUC5i5RYB, BIOS RYBDWi35.86A.0362.2017.0118.0940 01/18/2017 [ 6250.746521] Workqueue: events_unbound async_run_entry_fn [ 6250.746525] Call Trace: [ 6250.746530] dump_stack+0x67/0x92 [ 6250.746536] __warn+0xc6/0xe0 [ 6250.746542] ? pci_restore_standard_config+0x40/0x40 [ 6250.746546] warn_slowpath_fmt+0x46/0x50 [ 6250.746553] ? __pm_runtime_resume+0x56/0x80 [ 6250.746584] intel_runtime_pm_get+0x6b/0xd0 [i915] [ 6250.746610] intel_display_power_get+0x1b/0x40 [i915] [ 6250.746646] i915_audio_component_get_power+0x15/0x20 [i915] [ 6250.746654] snd_hdac_display_power+0xc8/0x110 [snd_hda_core] [ 6250.746661] azx_runtime_resume+0x218/0x280 [snd_hda_intel] [ 6250.746667] pci_pm_runtime_resume+0x76/0xa0 [ 6250.746672] __rpm_callback+0xb4/0x1f0 [ 6250.746677] ? pci_restore_standard_config+0x40/0x40 [ 6250.746682] rpm_callback+0x1f/0x80 [ 6250.746686] ? pci_restore_standard_config+0x40/0x40 [ 6250.746690] rpm_resume+0x4ba/0x740 [ 6250.746698] __pm_runtime_resume+0x49/0x80 [ 6250.746703] pci_pm_suspend+0x57/0x140 [ 6250.746709] dpm_run_callback+0x6f/0x330 [ 6250.746713] ? pci_pm_freeze+0xe0/0xe0 [ 6250.746718] __device_suspend+0xf9/0x370 [ 6250.746724] ? dpm_watchdog_set+0x60/0x60 [ 6250.746730] async_suspend+0x1a/0x90 [ 6250.746735] async_run_entry_fn+0x34/0x160 [ 6250.746741] process_one_work+0x1f2/0x6d0 [ 6250.746749] worker_thread+0x49/0x4a0 [ 6250.746755] kthread+0x107/0x140 [ 6250.746759] ? process_one_work+0x6d0/0x6d0 [ 6250.746763] ? kthread_create_on_node+0x40/0x40 [ 6250.746768] ret_from_fork+0x2e/0x40 [ 6250.746778] ---[ end trace 102a62fd2160f5e6 ]--- v2: - Use the new pci_dev->needs_resume flag, to avoid any overhead during the ->pm_prepare hook. (Rafael) v3: - Update commit message to reference the actual regressing commit. (Lukas) v4: - Rebase on v4 of patch 1/2. Fixes: `d14d2a8453` ("drm: Remove dev_pm_ops from drm_class") References: https://bugs.freedesktop.org/show_bug.cgi?id=100378 References: https://bugs.freedesktop.org/show_bug.cgi?id=100770 Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Cc: David Weinehall <david.weinehall@linux.intel.com> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Takashi Iwai <tiwai@suse.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Lukas Wunner <lukas@wunner.de> Cc: linux-pci@vger.kernel.org Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-and-tested-by: Marta Lofstedt <marta.lofstedt@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1493726649-32094-2-git-send-email-imre.deak@intel.com (cherry picked from commit `adfdf85d79`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:17 +02:00
Imre Deak	d7c0a50b21	PCI/PM: Add needs_resume flag to avoid suspend complete optimization commit `4d071c3238` upstream. Some drivers - like i915 - may not support the system suspend direct complete optimization due to differences in their runtime and system suspend sequence. Add a flag that when set resumes the device before calling the driver's system suspend handlers which effectively disables the optimization. Needed by a future patch fixing suspend/resume on i915. Suggested by Rafael. Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable@vger.kernel.org (rebased on v4.8, added kernel version to commit message stable tag) Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:17 +02:00
Chris Wilson	92220696d5	drm/i915: Do not drop pagetables when empty This is the minimal backport for stable of the upstream commit: commit `dd19674bac` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Feb 15 08:43:46 2017 +0000 drm/i915: Remove bitmap tracking for used-ptes Due to a race with the shrinker, when we try to allocate a pagetable, we may end up shrinking it instead. This comes as a nasty surprise as we try to dereference it to fill in the pagetable entries for the object. In linus/master this is fixed by pinning the pagetables prior to allocation, but that backport is roughly drivers/gpu/drm/i915/i915_gem_gtt.c \| 10 ---------- 1 file changed, 10 deletions(-) i.e. unsuitable for stable. Instead we neuter the code that tried to free the pagetables. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99295 Fixes: `2ce5179fe8` ("drm/i915/gtt: Free unused lower-level page tables") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v4.10+ Tested-by: Maël Lavault <mael.lavault@protonmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:17 +02:00
Greg Kroah-Hartman	be8335f6b5	Linux 4.11.5	2017-06-14 15:08:04 +02:00
Vegard Nossum	2c2c503792	kthread: fix boot hang (regression) on MIPS/OpenRISC commit `b0f5a8f32e` upstream. This fixes a regression in commit `4d6501dce0` where I didn't notice that MIPS and OpenRISC were reinitialising p->{set,clear}_child_tid to NULL after our initialisation in copy_process(). We can simply get rid of the arch-specific initialisation here since it is now always done in copy_process() before hitting copy_thread{,_tls}(). Review notes: - As far as I can tell, copy_process() is the only user of copy_thread_tls(), which is the only caller of copy_thread() for architectures that don't implement copy_thread_tls(). - After this patch, there is no arch-specific code touching p->set_child_tid or p->clear_child_tid whatsoever. - It may look like MIPS/OpenRISC wanted to always have these fields be NULL, but that's not true, as copy_process() would unconditionally set them again _after_ calling copy_thread_tls() before commit `4d6501dce0`. Fixes: `4d6501dce0` ("kthread: Fix use-after-free if kthread fork fails") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> # MIPS only Acked-by: Stafford Horne <shorne@gmail.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: openrisc@lists.librecores.org Cc: Jamie Iles <jamie.iles@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Pablo Neira Ayuso	1a422e547a	netfilter: nft_set_rbtree: handle element re-addition after deletion commit `d2df92e98a` upstream. The existing code selects no next branch to be inspected when re-inserting an inactive element into the rb-tree, looping endlessly. This patch restricts the check for active elements to the EEXIST case only. Fixes: `e701001e7c` ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates") Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Jani Nikula	6b183d1b84	drm/i915/vbt: split out defaults that are set when there is no VBT commit `bb1d132935` upstream. The main thing are the DDI ports. If there's a VBT that says there are no outputs, we should trust that, and not have semi-random defaults. Unfortunately, the defaults have resulted in some Chromebooks without VBT to rely on this behaviour, so we split out the defaults for the missing VBT case. Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/95c26079ff640d43f53b944f17e9fc356b36daec.1489152288.git.jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Jani Nikula	07f0aa40fc	drm/i915/vbt: don't propagate errors from intel_bios_init() commit `665788572c` upstream. We don't use the error return for anything other than reporting and logging that there is no VBT. We can pull the logging in the function, and remove the error status return. Moreover, if we needed the information for something later on, we'd probably be better off storing the bit in dev_priv, and using it where it's needed, instead of using the error return. While at it, improve the comments. Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/438ebbb0d5f0d321c625065b9cc78532a1dab24f.1489152288.git.jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Paul Moore	1df30f8264	audit: fix the RCU locking for the auditd_connection structure commit `48d0e023af` upstream. Cong Wang correctly pointed out that the RCU read locking of the auditd_connection struct was wrong, this patch correct this by adopting a more traditional, and correct RCU locking model. This patch is heavily based on an earlier prototype by Cong Wang. Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Thomas Gleixner	f17b15c896	hwmon: (coretemp) Handle frozen hotplug state correctly commit `90b4f30b6d` upstream. The recent conversion to the hotplug state machine missed that the original hotplug notifiers did not execute in the frozen state, which is used on suspend on resume. This does not matter on single socket machines, but on multi socket systems this breaks when the device for a non-boot socket is removed when the last CPU of that socket is brought offline. The device removal locks up the machine hard w/o any debug output. Prevent executing the hotplug callbacks when cpuhp_tasks_frozen is true. Thanks to Tommi for providing debug information patiently while I failed to spot the obvious. Fixes: `e00ca5df37` ("hwmon: (coretemp) Convert to hotplug state machine") Reported-by: Tommi Rantala <tt.rantala@gmail.com> Tested-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Cc: "Chen, Yu C" <yu.c.chen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Chandan Rajendra	dcaa3c1ec9	iomap_dio_rw: Prevent reading file data beyond iomap_dio->i_size commit `a008c31c7e` upstream. On a ppc64 machine executing overlayfs/019 with xfs as the lower and upper filesystem causes the following call trace, WARNING: CPU: 2 PID: 8034 at /root/repos/linux/fs/iomap.c:765 .iomap_dio_actor+0xcc/0x420 Modules linked in: CPU: 2 PID: 8034 Comm: fsstress Tainted: G L 4.11.0-rc5-next-20170405 #100 task: c000000631314880 task.stack: c0000003915d4000 NIP: c00000000035a72c LR: c00000000035a6f4 CTR: c00000000035a660 REGS: c0000003915d7570 TRAP: 0700 Tainted: G L (4.11.0-rc5-next-20170405) MSR: 800000000282b032 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI> CR: 24004284 XER: 00000000 CFAR: c0000000006f7190 SOFTE: 1 GPR00: c00000000035a6f4 c0000003915d77f0 c0000000015a3f00 000000007c22f600 GPR04: 000000000022d000 0000000000002600 c0000003b2d56360 c0000003915d7960 GPR08: c0000003915d7cd0 0000000000000002 0000000000002600 c000000000521cc0 GPR12: 0000000024004284 c00000000fd80a00 000000004b04ae64 ffffffffffffffff GPR16: 000000001000ca70 0000000000000000 c0000003b2d56380 c00000000153d2b8 GPR20: 0000000000000010 c0000003bc87bac8 0000000000223000 000000000022f5ff GPR24: c0000003b2d56360 000000000000000c 0000000000002600 000000000022d000 GPR28: 0000000000000000 c0000003915d7960 c0000003b2d56360 00000000000001ff NIP [c00000000035a72c] .iomap_dio_actor+0xcc/0x420 LR [c00000000035a6f4] .iomap_dio_actor+0x94/0x420 Call Trace: [c0000003915d77f0] [c00000000035a6f4] .iomap_dio_actor+0x94/0x420 (unreliable) [c0000003915d78f0] [c00000000035b9f4] .iomap_apply+0xf4/0x1f0 [c0000003915d79d0] [c00000000035c320] .iomap_dio_rw+0x230/0x420 [c0000003915d7ae0] [c000000000512a14] .xfs_file_dio_aio_read+0x84/0x160 [c0000003915d7b80] [c000000000512d24] .xfs_file_read_iter+0x104/0x130 [c0000003915d7c10] [c0000000002d6234] .__vfs_read+0x114/0x1a0 [c0000003915d7cf0] [c0000000002d7a8c] .vfs_read+0xac/0x1a0 [c0000003915d7d90] [c0000000002d96b8] .SyS_read+0x58/0x100 [c0000003915d7e30] [c00000000000b8e0] system_call+0x38/0xfc Instruction dump: 78630020 7f831b78 7ffc07b4 7c7ce039 40820360 a13d0018 2f890003 419e0288 2f890004 419e00a0 2f890001 419e02a8 <0fe00000> 3b80fffb 38210100 7f83e378 The above problem can also be recreated on a regular xfs filesystem using the command, $ fsstress -d /mnt -l 1000 -n 1000 -p 1000 The reason for the call trace is, 1. When 'reserving' blocks for delayed allocation , XFS reserves more blocks (i.e. past file's current EOF) than required. This is done because XFS assumes that userspace might write more data and hence 'reserving' more blocks might lead to the file's new data being stored contiguously on disk. 2. The in-memory 'struct xfs_bmbt_irec' mapping the file's last extent would then cover the prealloc-ed EOF blocks in addition to the regular blocks. 3. When flushing the dirty blocks to disk, we only flush data till the file's EOF. But before writing out the dirty data, we allocate blocks on the disk for holding the file's new data. This allocation includes the blocks that are part of the 'prealloc EOF blocks'. 4. Later, when the last reference to the inode is being closed, XFS frees the unused 'prealloc EOF blocks' in xfs_inactive(). In step 3 above, When allocating space on disk for the delayed allocation range, the space allocator might sometimes allocate less blocks than required. If such an allocation ends right at the current EOF of the file, We will not be able to clear the "delayed allocation" flag for the 'prealloc EOF blocks', since we won't have dirty buffer heads associated with that range of the file. In such a situation if a Direct I/O read operation is performed on file range [X, Y] (where X < EOF and Y > EOF), we flush dirty data in the range [X, Y] and invalidate page cache for that range (Refer to iomap_dio_rw()). Later for performing the Direct I/O read, XFS obtains the extent items (which are still cached in memory) for the file range. When doing so we are not supposed to get an extent item with IOMAP_DELALLOC flag set, since the previous "flush" operation should have converted any delayed allocation data in the range [X, Y]. Hence we end up hitting a WARN_ON_ONCE(1) statement in iomap_dio_actor(). This commit fixes the bug by preventing the read operation from going beyond iomap_dio->i_size. Reported-by: Santhosh G <santhog4@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Tejun Heo	695b2bd64b	cgroup: mark cgroup_get() with __maybe_unused commit `310b4816a5` upstream. `a590b90d47` ("cgroup: fix spurious warnings on cgroup_is_dead() from cgroup_sk_alloc()") converted most cgroup_get() usages to cgroup_get_live() leaving cgroup_sk_alloc() the sole user of cgroup_get(). When !CONFIG_SOCK_CGROUP_DATA, this ends up triggering unused warning for cgroup_get(). Silence the warning by adding __maybe_unused to cgroup_get(). Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lkml.kernel.org/r/20170501145340.17e8ef86@canb.auug.org.au Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Wei Yongjun	ba301861f6	pinctrl: cherryview: Add terminate entry for dmi_system_id tables commit `a9de080bbc` upstream. Make sure dmi_system_id tables are NULL terminated. Fixes: `7036502783` ("pinctrl: cherryview: Add a quirk to make Acer Chromebook keyboard work again") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Cc: Jean Delvare <jdelvare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Takatoshi Akiyama	944601cf16	serial: sh-sci: Fix panic when serial console and DMA are enabled commit `3c9101766b` upstream. This patch fixes an issue that kernel panic happens when DMA is enabled and we press enter key while the kernel booting on the serial console. * An interrupt may occur after sci_request_irq(). * DMA transfer area is initialized by setup_timer() in sci_request_dma() and used in interrupt. If an interrupt occurred between sci_request_irq() and setup_timer() in sci_request_dma(), DMA transfer area has not been initialized yet. So, this patch changes the order of sci_request_irq() and sci_request_dma(). Fixes: `73a19e4c03` ("serial: sh-sci: Add DMA support.") Signed-off-by: Takatoshi Akiyama <takatoshi.akiyama.kj@ps.hitachi-solutions.com> [Shimoda changes the commit log] Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Michał Winiarski	4e979ac972	drm/i915/skl: Add missing SKL ID commit `ca7a45ba6f` upstream. Used by production device: Intel(R) Iris(TM) Graphics P555 Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227112256.20060-1-michal.winiarski@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Ville Syrjälä	fa8a1ce792	drm/i915: Fix runtime PM for LPE audio commit `668e3b014a` upstream. Not calling pm_runtime_enable() means that runtime PM can't be enabled at all via sysfs. So we definitely need to call it from somewhere. Calling it from the driver seems like a bad idea because it would have to be paired with a pm_runtime_disable() at driver unload time, otherwise the core gets upset. Also if there's no LPE audio driver loaded then we couldn't runtime suspend i915 either. So it looks like a better plan is to call it from i915 when we register the platform device. That seems to match how pci generally does things. I cargo culted the pm_runtime_forbid() and pm_runtime_set_active() calls from pci as well. The exposed runtime PM API is massive an thorougly misleading, so I don't actually know if this is how you're supposed to use the API or not. But it seems to work. I can now runtime suspend i915 again with or without the LPE audio driver loaded, and reloading the LPE audio driver also seems to work. Note that powertop won't auto-tune runtime PM for platform devices, which is a little annoying. So I'm not sure that leaving runtime PM in "on" mode by default is the best choice here. But I've left it like that for now at least. Also remove the comment about there not being much benefit from LPE audio runtime PM. Not allowing runtime PM blocks i915 runtime PM, which will also block s0ix, and that could have a measurable impact on power consumption. Cc: Takashi Iwai <tiwai@suse.de> Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Fixes: `0b6b524f39` ("ALSA: x86: Don't enable runtime PM as default") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170427160231.13337-2-ville.syrjala@linux.intel.com Reviewed-by: Takashi Iwai <tiwai@suse.de> (cherry picked from commit `183c00350c`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Julius Werner	fd6c3d0a3b	drivers: char: mem: Fix wraparound check to allow mappings up to the end commit `32829da54d` upstream. A recent fix to /dev/mem prevents mappings from wrapping around the end of physical address space. However, the check was written in a way that also prevents a mapping reaching just up to the end of physical address space, which may be a valid use case (especially on 32-bit systems). This patch fixes it by checking the last mapped address (instead of the first address behind that) for overflow. Fixes: `b299cde245` ("drivers: char: mem: Check for address space wraparound with mmap()") Reported-by: Nico Huber <nico.h@gmx.de> Signed-off-by: Julius Werner <jwerner@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Sebastian Andrzej Siewior	fc337d0e00	cpu/hotplug: Drop the device lock on error commit `40da1b11f0` upstream. If a custom CPU target is specified and that one is not available _or_ can't be interrupted then the code returns to userland without dropping a lock as notices by lockdep: \|echo 133 > /sys/devices/system/cpu/cpu7/hotplug/target \| ================================================ \| [ BUG: lock held when returning to user space! ] \| ------------------------------------------------ \| bash/503 is leaving the kernel with locks still held! \| 1 lock held by bash/503: \| #0: (device_hotplug_lock){+.+...}, at: [<ffffffff815b5650>] lock_device_hotplug_sysfs+0x10/0x40 So release the lock then. Fixes: `757c989b99` ("cpu/hotplug: Make target state writeable") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170602142714.3ogo25f2wbq6fjpj@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Takashi Iwai	4fa867a37a	ASoC: Fix use-after-free at card unregistration commit `4efda5f213` upstream. soc_cleanup_card_resources() call snd_card_free() at the last of its procedure. This turned out to lead to a use-after-free. PCM runtimes have been already removed via soc_remove_pcm_runtimes(), while it's dereferenced later in soc_pcm_free() called via snd_card_free(). The fix is simple: just move the snd_card_free() call to the beginning of the whole procedure. This also gives another benefit: it guarantees that all operations have been shut down before actually releasing the resources, which was racy until now. Reported-and-tested-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Takashi Iwai	6d4bee6600	ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT commit `ba3021b2c7` upstream. snd_timer_user_tselect() reallocates the queue buffer dynamically, but it forgot to reset its indices. Since the read may happen concurrently with ioctl and snd_timer_user_tselect() allocates the buffer via kmalloc(), this may lead to the leak of uninitialized kernel-space data, as spotted via KMSAN: BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10 CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x143/0x1b0 lib/dump_stack.c:52 kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007 kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086 copy_to_user ./arch/x86/include/asm/uaccess.h:725 snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004 do_loop_readv_writev fs/read_write.c:716 __do_readv_writev+0x94c/0x1380 fs/read_write.c:864 do_readv_writev fs/read_write.c:894 vfs_readv fs/read_write.c:908 do_readv+0x52a/0x5d0 fs/read_write.c:934 SYSC_readv+0xb6/0xd0 fs/read_write.c:1021 SyS_readv+0x87/0xb0 fs/read_write.c:1018 This patch adds the missing reset of queue indices. Together with the previous fix for the ioctl/read race, we cover the whole problem. Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Takashi Iwai	9018818b24	ALSA: timer: Fix race between read and ioctl commit `d11662f4f7` upstream. The read from ALSA timer device, the function snd_timer_user_tread(), may access to an uninitialized struct snd_timer_user fields when the read is concurrently performed while the ioctl like snd_timer_user_tselect() is invoked. We have already fixed the races among ioctls via a mutex, but we seem to have forgotten the race between read vs ioctl. This patch simply applies (more exactly extends the already applied range of) tu->ioctl_lock in snd_timer_user_tread() for closing the race window. Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Ben Skeggs	a8282f018b	drm/nouveau/tmr: fully separate alarm execution/pending lists commit `b4e382ca75` upstream. Reusing the list_head for both is a bad idea. Callback execution is done with the lock dropped so that alarms can be rescheduled from the callback, which means that with some unfortunate timing, lists can get corrupted. The execution list should not require its own locking, the single function that uses it can only be called from a single context. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Dominik Brodowski	a7e7ade1b4	x86/microcode/intel: Clear patch pointer before jettisoning the initrd commit `5b0bc9ac2c` upstream. During early boot, load_ucode_intel_ap() uses __load_ucode_intel() to obtain a pointer to the relevant microcode patch (embedded in the initrd), and stores this value in 'intel_ucode_patch' to speed up the microcode patch application for subsequent CPUs. On resuming from suspend-to-RAM, however, load_ucode_ap() calls load_ucode_intel_ap() for each non-boot-CPU. By then the initramfs is long gone so the pointer stored in 'intel_ucode_patch' no longer points to a valid microcode patch. Clear that pointer so that we effectively fall back to the CPU hotplug notifier callbacks to update the microcode. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> [ Edit and massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170607095819.9754-1-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Sinclair Yeh	3bc7a4a564	drm/vmwgfx: Make sure backup_handle is always valid commit `07678eca2c` upstream. When vmw_gb_surface_define_ioctl() is called with an existing buffer, we end up returning an uninitialized variable in the backup_handle. The fix is to first initialize backup_handle to 0 just to be sure, and second, when a user-provided buffer is found, we will use the req->buffer_handle as the backup_handle. Reported-by: Murray McAllister <murray.mcallister@insomniasec.com> Signed-off-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Vladis Dronov	6a6a485719	drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl() commit `ee9c4e681e` upstream. The 'req->mip_levels' parameter in vmw_gb_surface_define_ioctl() is a user-controlled 'uint32_t' value which is used as a loop count limit. This can lead to a kernel lockup and DoS. Add check for 'req->mip_levels'. References: https://bugzilla.redhat.com/show_bug.cgi?id=1437431 Signed-off-by: Vladis Dronov <vdronov@redhat.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Dan Carpenter	8c704276e2	drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve() commit `f0c62e9878` upstream. If vmalloc() fails then we need to a bit of cleanup before returning. Fixes: `fb1d9738ca` ("drm/vmwgfx: Add DRM driver for VMware Virtual GPU") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Timur Tabi	3e857aaa8f	net: qcom/emac: do not use hardware mdio automatic polling commit `246096690b` upstream. Use software polling (PHY_POLL) to check for link state changes instead of relying on the EMAC's hardware polling feature. Some PHY drivers are unable to get a functioning link because the HW polling is not robust enough. The EMAC is able to poll the PHY on the MDIO bus looking for link state changes (via the Link Status bit in the Status Register at address 0x1). When the link state changes, the EMAC triggers an interrupt and tells the driver what the new state is. The feature eliminates the need for software to poll the MDIO bus. Unfortunately, this feature is incompatible with phylib, because it ignores everything that the PHY core and PHY drivers are trying to do. In particular: 1. It assumes a compatible register set, so PHYs with different registers may not work. 2. It doesn't allow for hardware errata that have work-arounds implemented in the PHY driver. 3. It doesn't support multiple register pages. If the PHY core switches the register set to another page, the EMAC won't know the page has changed and will still attempt to read the same PHY register. 4. It only checks the copper side of the link, not the SGMII side. Some PHY drivers (e.g. at803x) may also check the SGMII side, and report the link as not ready during autonegotiation if the SGMII link is still down. Phylib then waits for another interrupt to query the PHY again, but the EMAC won't send another interrupt because it thinks the link is up. Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Paolo Bonzini	cadb844501	srcu: Allow use of Classic SRCU from both process and interrupt context commit `1123a60416` upstream. Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting down a guest running iperf on a VFIO assigned device. This happens because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt context, while a worker thread does the same inside kvm_set_irq(). If the interrupt happens while the worker thread is executing __srcu_read_lock(), updates to the Classic SRCU ->lock_count[] field or the Tree SRCU ->srcu_lock_count[] field can be lost. The docs say you are not supposed to call srcu_read_lock() and srcu_read_unlock() from irq context, but KVM interrupt injection happens from (host) interrupt context and it would be nice if SRCU supported the use case. KVM is using SRCU here not really for the "sleepable" part, but rather due to its IPI-free fast detection of grace periods. It is therefore not desirable to switch back to RCU, which would effectively revert commit `719d93cd5f` ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING", 2014-01-16). However, the docs are overly conservative. You can have an SRCU instance only has users in irq context, and you can mix process and irq context as long as process context users disable interrupts. In addition, __srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and Classic SRCU. For those two implementations, only srcu_read_lock() is unsafe. When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(), in commit `5a41344a3d` ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments. Therefore it kept __this_cpu_inc(), with preempt_disable/enable in the caller. Tree SRCU however only does one increment, so on most architectures it is more efficient for __srcu_read_lock() to use this_cpu_inc(), and any performance differences appear to be down in the noise. Fixes: `719d93cd5f` ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING") Reported-by: Linu Cherian <linuc.decode@gmail.com> Suggested-by: Linu Cherian <linuc.decode@gmail.com> Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Jin Yao	574f76fc58	perf/core: Drop kernel samples even though :u is specified commit `cc1582c231` upstream. When doing sampling, for example: perf record -e cycles:u ... On workloads that do a lot of kernel entry/exits we see kernel samples, even though :u is specified. This is due to skid existing. This might be a security issue because it can leak kernel addresses even though kernel sampling support is disabled. The patch drops the kernel samples if exclude_kernel is specified. For example, test on Haswell desktop: perf record -e cycles:u <mgen> perf report --stdio Before patch applied: 99.77% mgen mgen [.] buf_read 0.20% mgen mgen [.] rand_buf_init 0.01% mgen [kernel.vmlinux] [k] apic_timer_interrupt 0.00% mgen mgen [.] last_free_elem 0.00% mgen libc-2.23.so [.] __random_r 0.00% mgen libc-2.23.so [.] _int_malloc 0.00% mgen mgen [.] rand_array_init 0.00% mgen [kernel.vmlinux] [k] page_fault 0.00% mgen libc-2.23.so [.] __random 0.00% mgen libc-2.23.so [.] __strcasestr 0.00% mgen ld-2.23.so [.] strcmp 0.00% mgen ld-2.23.so [.] _dl_start 0.00% mgen libc-2.23.so [.] sched_setaffinity@@GLIBC_2.3.4 0.00% mgen ld-2.23.so [.] _start We can see kernel symbols apic_timer_interrupt and page_fault. After patch applied: 99.79% mgen mgen [.] buf_read 0.19% mgen mgen [.] rand_buf_init 0.00% mgen libc-2.23.so [.] __random_r 0.00% mgen mgen [.] rand_array_init 0.00% mgen mgen [.] last_free_elem 0.00% mgen libc-2.23.so [.] vfprintf 0.00% mgen libc-2.23.so [.] rand 0.00% mgen libc-2.23.so [.] __random 0.00% mgen libc-2.23.so [.] _int_malloc 0.00% mgen libc-2.23.so [.] _IO_doallocbuf 0.00% mgen ld-2.23.so [.] do_lookup_x 0.00% mgen ld-2.23.so [.] open_verify.constprop.7 0.00% mgen ld-2.23.so [.] _dl_important_hwcaps 0.00% mgen libc-2.23.so [.] sched_setaffinity@@GLIBC_2.3.4 0.00% mgen ld-2.23.so [.] _start There are only userspace symbols. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Cc: kan.liang@intel.com Cc: mark.rutland@arm.com Cc: will.deacon@arm.com Cc: yao.jin@intel.com Link: http://lkml.kernel.org/r/1495706947-3744-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Andrew Lunn	52782a1e16	Revert "ata: sata_mv: Convert to devm_ioremap_resource()" commit `3e4240da0e` upstream. This reverts commit `368e5fbdfc`. devm_ioremap_resource() enforces that there are no overlapping resources, where as devm_ioremap() does not. The sata phy driver needs a subset of the sata IO address space, so maps some of the sata address space. As a result, sata_mv now fails to probe, reporting it cannot get its resources, and so we don't have any SATA disks. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Breno Leitao	93dfb4e6a7	powerpc/kernel: Initialize load_tm on task creation commit `7f22ced437` upstream. Currently tsk->thread.load_tm is not initialized in the task creation and can contain garbage on a new task. This is an undesired behaviour, since it affects the timing to enable and disable the transactional memory laziness (disabling and enabling the MSR TM bit, which affects TM reclaim and recheckpoint in the scheduling process). Fixes: `5d176f751e` ("powerpc: tm: Enable transactional memory (TM) lazily for userspace") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Breno Leitao	154b657008	powerpc/kernel: Fix FP and vector register restoration commit `1195892c09` upstream. Currently tsk->thread->load_vec and load_fp are not initialized during task creation, which can lead to garbage values in these variables (non-zero values). These variables will be checked later in restore_math() to validate if the FP and vector registers are being utilized. Since these values might be non-zero, the restore_math() will continue to save the FP and vectors even if they were never utilized by the userspace application. load_fp and load_vec counters will then overflow (they wrap at 255) and the FP and Altivec will be finally disabled, but before that condition is reached (counter overflow) several context switches will have restored FP and vector registers without need, causing a performance degradation. Fixes: `70fe3d980f` ("powerpc: Restore FPU/VEC/VSX if previously used") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Gustavo Romero <gusbromero@gmail.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Michael Bringmann	c3e2874653	powerpc/hotplug-mem: Fix missing endian conversion of aa_index commit `dc421b200f` upstream. When adding or removing memory, the aa_index (affinity value) for the memblock must also be converted to match the endianness of the rest of the 'ibm,dynamic-memory' property. Otherwise, subsequent retrieval of the attribute will likely lead to non-existent nodes, followed by using the default node in the code inappropriately. Fixes: `5f97b2a0d1` ("powerpc/pseries: Implement memory hotplug add in the kernel") Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Michael Ellerman	840407e3da	powerpc/numa: Fix percpu allocations to be NUMA aware commit `ba4a648f12` upstream. In commit `8c27226119` ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we switched to the generic implementation of cpu_to_node(), which uses a percpu variable to hold the NUMA node for each CPU. Unfortunately we neglected to notice that we use cpu_to_node() in the allocation of our percpu areas, leading to a chicken and egg problem. In practice what happens is when we are setting up the percpu areas, cpu_to_node() reports that all CPUs are on node 0, so we allocate all percpu areas on node 0. This is visible in the dmesg output, as all pcpu allocs being in group 0: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 To fix it we need an early_cpu_to_node() which can run prior to percpu being setup. We already have the numa_cpu_lookup_table we can use, so just plumb it in. With the patch dmesg output shows two groups, 0 and 1: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31 pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39 pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47 We can also check the data_offset in the paca of various CPUs, with the fix we see: CPU 0: data_offset = 0x0ffe8b0000 CPU 24: data_offset = 0x1ffe5b0000 And we can see from dmesg that CPU 24 has an allocation on node 1: node 0: [mem 0x0000000000000000-0x0000000fffffffff] node 1: [mem 0x0000001000000000-0x0000001fffffffff] Fixes: `8c27226119` ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Christophe Leroy	2bcd25cbc6	powerpc/sysdev/simple_gpio: Fix oops in gpio save_regs function commit `6f553912ee` upstream. of_mm_gpiochip_add_data() generates an oops for NULL pointer dereference. of_mm_gpiochip_add_data() calls mm_gc->save_regs() before setting the data, therefore ->save_regs() cannot use gpiochip_get_data() Fixes: `937daafca7` ("powerpc: simple-gpio: use gpiochip data pointer") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Joe Carnuccio	0ce1692273	scsi: qla2xxx: Fix mailbox pointer error in fwdump capture commit `74939a0bc7` upstream. Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Joe Carnuccio	223b1549f0	scsi: qla2xxx: Set bit 15 for DIAG_ECHO_TEST MBC commit `1d63496516` upstream. Set bit (BIT_15) to send right ECHO payload information for Diagnostic Echo Test command. Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Joe Carnuccio	67034eaa0d	scsi: qla2xxx: Modify T262 FW dump template to specify same start/end to debug customer issues commit `ce6c668b14` upstream. Firmware dump allows for debugging customer issues. This patch fixes start/end pointer calculation to capture T262 template entry for dump tool. Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Quinn Tran	21fffaa1b7	scsi: qla2xxx: Fix NULL pointer access due to redundant fc_host_port_name call commit `0ea88662b5` upstream. Remove redundant fc_host_port_name calls to prevent early access of scsi_host->shost_data buffer. This prevent null pointer access. Following stack trace is seen: BUG: unable to handle kernel NULL pointer dereference at 00000000000008 IP: qla24xx_report_id_acquisition+0x22d/0x3a0 [qla2xxx] Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Sawan Chandak	96e0d45552	scsi: qla2xxx: Fix crash due to mismatch mumber of Q-pair creation for Multi queue commit `b95b9452aa` upstream. when driver is loaded with Multi Queue enabled, it was noticed that there was one less queue pair created. Following message would indicate this: "No resources to create additional q pair." The result of one less queue pair means that system can crash, if the block mq layer thinks there is an extra hardware queue available, and the driver will use a NULL ptr qpair in that instance. Following stack trace is seen in one of the crash: irq_create_affinity_masks+0x98/0x530 irq_create_affinity_masks+0x98/0x530 __pci_enable_msix+0x321/0x4e0 mutex_lock+0x12/0x40 pci_alloc_irq_vectors_affinity+0xb5/0x140 qla24xx_enable_msix+0x79/0x530 [qla2xxx] qla2x00_request_irqs+0x61/0x2d0 [qla2xxx] qla2x00_probe_one+0xc73/0x2390 [qla2xxx] ida_simple_get+0x98/0x100 kernfs_next_descendant_post+0x40/0x50 local_pci_probe+0x45/0xa0 pci_device_probe+0xfc/0x140 driver_probe_device+0x2c5/0x470 __driver_attach+0xdd/0xe0 driver_probe_device+0x470/0x470 bus_for_each_dev+0x6c/0xc0 driver_attach+0x1e/0x20 bus_add_driver+0x45/0x270 driver_register+0x60/0xe0 __pci_register_driver+0x4c/0x50 qla2x00_module_init+0x1ce/0x21e [qla2xxx] Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
himanshu.madhani@cavium.com	083dca440b	scsi: qla2xxx: Fix recursive loop during target mode configuration for ISP25XX leaving system unresponsive commit `cb590700e0` upstream. Following messages are seen into system logs qla2xxx [0000:09:00.0]-00af:9: Performing ISP error recovery - ha=ffff98315ee30000. qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware. qla2xxx [0000:09:00.0]-d009:9: Firmware has been previously dumped (ffffba488c001000) -- ignoring request. qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware. See Bugzilla for details https://bugzilla.kernel.org/show_bug.cgi?id=195285 Fixes: `d74595278f` ("scsi: qla2xxx: Add multiple queue pair functionality.") Reported-by: Laurence Oberman <loberman@redhat.com> Reported-by: Anthony Bloodoff <anthony.bloodoff@gmail.com> Tested-by: Laurence Oberman <loberman@redhat.com> Tested-by: Anthony Bloodoff <anthony.bloodoff@gmail.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Johannes Thumshirn	daa15f6f84	scsi: qla2xxx: don't disable a not previously enabled PCI device commit `ddff7ed45e` upstream. When pci_enable_device() or pci_enable_device_mem() fail in qla2x00_probe_one() we bail out but do a call to pci_disable_device(). This causes the dev_WARN_ON() in pci_disable_device() to trigger, as the device wasn't enabled previously. So instead of taking the 'probe_out' error path we can directly return iff one of the pci_enable_device() calls fails. Additionally rename the 'probe_out' goto label's name to the more descriptive 'disable_device'. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: `e315cd28b9` ("[SCSI] qla2xxx: Code changes for qla data structure refactoring") Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Marc Zyngier	2bae71aa85	KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages commit `d6dbdd3c85` upstream. Under memory pressure, we start ageing pages, which amounts to parsing the page tables. Since we don't want to allocate any extra level, we pass NULL for our private allocation cache. Which means that stage2_get_pud() is allowed to fail. This results in the following splat: [ 1520.409577] Unable to handle kernel NULL pointer dereference at virtual address 00000008 [ 1520.417741] pgd = ffff810f52fef000 [ 1520.421201] [00000008] pgd=0000010f636c5003, pud=0000010f56f48003, *pmd=0000000000000000 [ 1520.429546] Internal error: Oops: 96000006 [#1] PREEMPT SMP [ 1520.435156] Modules linked in: [ 1520.438246] CPU: 15 PID: 53550 Comm: qemu-system-aar Tainted: G W 4.12.0-rc4-00027-g1885c397eaec #7205 [ 1520.448705] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB12A 10/26/2016 [ 1520.463726] task: ffff800ac5fb4e00 task.stack: ffff800ce04e0000 [ 1520.469666] PC is at stage2_get_pmd+0x34/0x110 [ 1520.474119] LR is at kvm_age_hva_handler+0x44/0xf0 [ 1520.478917] pc : [<ffff0000080b137c>] lr : [<ffff0000080b149c>] pstate: 40000145 [ 1520.486325] sp : ffff800ce04e33d0 [ 1520.489644] x29: ffff800ce04e33d0 x28: 0000000ffff40064 [ 1520.494967] x27: 0000ffff27e00000 x26: 0000000000000000 [ 1520.500289] x25: ffff81051ba65008 x24: 0000ffff40065000 [ 1520.505618] x23: 0000ffff40064000 x22: 0000000000000000 [ 1520.510947] x21: ffff810f52b20000 x20: 0000000000000000 [ 1520.516274] x19: 0000000058264000 x18: 0000000000000000 [ 1520.521603] x17: 0000ffffa6fe7438 x16: ffff000008278b70 [ 1520.526940] x15: 000028ccd8000000 x14: 0000000000000008 [ 1520.532264] x13: ffff7e0018298000 x12: 0000000000000002 [ 1520.537582] x11: ffff000009241b93 x10: 0000000000000940 [ 1520.542908] x9 : ffff0000092ef800 x8 : 0000000000000200 [ 1520.548229] x7 : ffff800ce04e36a8 x6 : 0000000000000000 [ 1520.553552] x5 : 0000000000000001 x4 : 0000000000000000 [ 1520.558873] x3 : 0000000000000000 x2 : 0000000000000008 [ 1520.571696] x1 : ffff000008fd5000 x0 : ffff0000080b149c [ 1520.577039] Process qemu-system-aar (pid: 53550, stack limit = 0xffff800ce04e0000) [...] [ 1521.510735] [<ffff0000080b137c>] stage2_get_pmd+0x34/0x110 [ 1521.516221] [<ffff0000080b149c>] kvm_age_hva_handler+0x44/0xf0 [ 1521.522054] [<ffff0000080b0610>] handle_hva_to_gpa+0xb8/0xe8 [ 1521.527716] [<ffff0000080b3434>] kvm_age_hva+0x44/0xf0 [ 1521.532854] [<ffff0000080a58b0>] kvm_mmu_notifier_clear_flush_young+0x70/0xc0 [ 1521.539992] [<ffff000008238378>] __mmu_notifier_clear_flush_young+0x88/0xd0 [ 1521.546958] [<ffff00000821eca0>] page_referenced_one+0xf0/0x188 [ 1521.552881] [<ffff00000821f36c>] rmap_walk_anon+0xec/0x250 [ 1521.558370] [<ffff000008220f78>] rmap_walk+0x78/0xa0 [ 1521.563337] [<ffff000008221104>] page_referenced+0x164/0x180 [ 1521.569002] [<ffff0000081f1af0>] shrink_active_list+0x178/0x3b8 [ 1521.574922] [<ffff0000081f2058>] shrink_node_memcg+0x328/0x600 [ 1521.580758] [<ffff0000081f23f4>] shrink_node+0xc4/0x328 [ 1521.585986] [<ffff0000081f2718>] do_try_to_free_pages+0xc0/0x340 [ 1521.592000] [<ffff0000081f2a64>] try_to_free_pages+0xcc/0x240 [...] The trivial fix is to handle this NULL pud value early, rather than dereferencing it blindly. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Omar Sandoval	7751d94da9	Btrfs: fix delalloc accounting leak caused by u32 overflow commit `70e7af244f` upstream. btrfs_calc_trans_metadata_size() does an unsigned 32-bit multiplication, which can overflow if num_items >= 4 GB / (nodesize * BTRFS_MAX_LEVEL * 2). For a nodesize of 16kB, this overflow happens at 16k items. Usually, num_items is a small constant passed to btrfs_start_transaction(), but we also use btrfs_calc_trans_metadata_size() for metadata reservations for extent items in btrfs_delalloc_{reserve,release}_metadata(). In drop_outstanding_extents(), num_items is calculated as inode->reserved_extents - inode->outstanding_extents. The difference between these two counters is usually small, but if many delalloc extents are reserved and then the outstanding extents are merged in btrfs_merge_extent_hook(), the difference can become large enough to overflow in btrfs_calc_trans_metadata_size(). The overflow manifests itself as a leak of a multiple of 4 GB in delalloc_block_rsv and the metadata bytes_may_use counter. This in turn can cause early ENOSPC errors. Additionally, these WARN_ONs in extent-tree.c will be hit when unmounting: WARN_ON(fs_info->delalloc_block_rsv.size > 0); WARN_ON(fs_info->delalloc_block_rsv.reserved > 0); WARN_ON(space_info->bytes_pinned > 0 \|\| space_info->bytes_reserved > 0 \|\| space_info->bytes_may_use > 0); Fix it by casting nodesize to a u64 so that btrfs_calc_trans_metadata_size() does a full 64-bit multiplication. While we're here, do the same in btrfs_calc_trunc_metadata_size(); this can't overflow with any existing uses, but it's better to be safe here than have another hard-to-debug problem later on. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Jeff Mahoney	acccdbef2c	btrfs: fix race with relocation recovery and fs_root setup commit `a9b3311ef3` upstream. If we have to recover relocation during mount, we'll ultimately have to evict the orphan inode. That goes through the reservation dance, where priority_reclaim_metadata_space and flush_space expect fs_info->fs_root to be valid. That's the next thing to be set up during mount, so we crash, almost always in flush_space trying to join the transaction but priority_reclaim_metadata_space is possible as well. This call path has been problematic in the past WRT whether ->fs_root is valid yet. Commit `957780eb27` (Btrfs: introduce ticketed enospc infrastructure) added new users that are called in the direct path instead of the async path that had already been worked around. The thing is that we don't actually need the fs_root, specifically, for anything. We either use it to determine whether the root is the chunk_root for use in choosing an allocation profile or as a root to pass btrfs_join_transaction before immediately committing it. Anything that isn't the chunk root works in the former case and any root works in the latter. A simple fix is to use a root we know will always be there: the extent_root. Fixes: `957780eb27` (Btrfs: introduce ticketed enospc infrastructure) Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Jeff Mahoney	5ca9daf722	btrfs: fix memory leak in update_space_info failure path commit `896533a7da` upstream. If we fail to add the space_info kobject, we'll leak the memory for the percpu counter. Fixes: `6ab0a2029c` (btrfs: publish allocation data in sysfs) Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
David Sterba	6bc3d6a633	btrfs: use correct types for page indices in btrfs_page_exists_in_range commit `cc2b702c52` upstream. Variables start_idx and end_idx are supposed to hold a page index derived from the file offsets. The int type is not the right one though, offsets larger than 1 << 44 will get silently trimmed off the high bits. (1 << 44 is 16TiB) What can go wrong, if start is below the boundary and end gets trimmed: - if there's a page after start, we'll find it (radix_tree_gang_lookup_slot) - the final check "if (page->index <= end_idx)" will unexpectedly fail The function will return false, ie. "there's no page in the range", although there is at least one. btrfs_page_exists_in_range is used to prevent races in: * in hole punching, where we make sure there are not pages in the truncated range, otherwise we'll wait for them to finish and redo truncation, but we're going to replace the pages with holes anyway so the only problem is the intermediate state * lock_extent_direct: we want to make sure there are no pages before we lock and start DIO, to prevent stale data reads For practical occurence of the bug, there are several constaints. The file must be quite large, the affected range must cross the 16TiB boundary and the internal state of the file pages and pending operations must match. Also, we must not have started any ordered data in the range, otherwise we don't even reach the buggy function check. DIO locking tries hard in several places to avoid deadlocks with buffered IO and avoids waiting for ranges. The worst consequence seems to be stale data read. CC: Liu Bo <bo.li.liu@oracle.com> Fixes: `fc4adbff82` ("btrfs: Drop EXTENT_UPTODATE check in hole punching and direct locking") Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Vaibhav Jain	574ab1b8fb	cxl: Avoid double free_irq() for psl,slice interrupts commit `b3aa20ba2b` upstream. During an eeh call to cxl_remove can result in double free_irq of psl,slice interrupts. This can happen if perst_reloads_same_image == 1 and call to cxl_configure_adapter() fails during slot_reset callback. In such a case we see a kernel oops with following back-trace: Oops: Kernel access of bad area, sig: 11 [#1] Call Trace: free_irq+0x88/0xd0 (unreliable) cxl_unmap_irq+0x20/0x40 [cxl] cxl_native_release_psl_irq+0x78/0xd8 [cxl] pci_deconfigure_afu+0xac/0x110 [cxl] cxl_remove+0x104/0x210 [cxl] pci_device_remove+0x6c/0x110 device_release_driver_internal+0x204/0x2e0 pci_stop_bus_device+0xa0/0xd0 pci_stop_and_remove_bus_device+0x28/0x40 pci_hp_remove_devices+0xb0/0x150 pci_hp_remove_devices+0x68/0x150 eeh_handle_normal_event+0x140/0x580 eeh_handle_event+0x174/0x360 eeh_event_handler+0x1e8/0x1f0 This patch fixes the issue of double free_irq by checking that variables that hold the virqs (err_hwirq, serr_hwirq, psl_virq) are not '0' before un-mapping and resetting these variables to '0' when they are un-mapped. Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Frederic Barrat	0c94348b23	cxl: Fix error path on bad ioctl commit `cec422c11c` upstream. Fix error path if we can't copy user structure on CXL_IOCTL_START_WORK ioctl. We shouldn't unlock the context status mutex as it was not locked (yet). Fixes: `0712dc7e73` ("cxl: Fix issues when unmapping contexts") Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Al Viro	4907e3bb67	excessive checks in ufs_write_failed() and ufs_evict_inode() commit `babef37dcc` upstream. As it is, short copy in write() to append-only file will fail to truncate the excessive allocated blocks. As the matter of fact, all checks in ufs_truncate_blocks() are either redundant or wrong for that caller. As for the only other caller (ufs_evict_inode()), we only need the file type checks there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Al Viro	6af5db5d39	ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path commit `006351ac8e` upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Al Viro	c12c0c4ff5	ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments() commit `940ef1a0ed` upstream. ... and it really needs splitting into "new" and "extend" cases, but that's for later Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Al Viro	728154e963	ufs: set correct ->s_maxsize commit `6b0d144fa7` upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Al Viro	d426b9575f	ufs: restore maintaining ->i_blocks commit `eb315d2ae6` upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Al Viro	386e884c85	fix ufs_isblockset() commit `414cf7186d` upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Al Viro	823c065a40	ufs: restore proper tail allocation commit `8785d84d00` upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Tejun Heo	9be0c9d62d	cpuset: consider dying css as offline commit `41c25707d2` upstream. In most cases, a cgroup controller don't care about the liftimes of cgroups. For the controller, a css becomes online when ->css_online() is called on it and offline when ->css_offline() is called. However, cpuset is special in that the user interface it exposes cares whether certain cgroups exist or not. Combined with the RCU delay between cgroup removal and css offlining, this can lead to user visible behavior oddities where operations which should succeed after cgroup removals fail for some time period. The effects of cgroup removals are delayed when seen from userland. This patch adds css_is_dying() which tests whether offline is pending and updates is_cpuset_online() so that the function returns false also while offline is pending. This gets rid of the userland visible delays. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Daniel Jordan <daniel.m.jordan@oracle.com> Link: http://lkml.kernel.org/r/327ca1f5-7957-fbb9-9e5f-9ba149d40ba2@oracle.com Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Ulrik De Bie	48b2c7c865	Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled commit `47eb0c8b4d` upstream. The Lifebook E546 and E557 touchpad were also not functioning and worked after running: echo "1" > /sys/devices/platform/i8042/serio2/crc_enabled Add them to the list of machines that need this workaround. Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org> Reviewed-by: Arjan Opmeer <arjan@opmeer.net> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Waiman Long	f3c1dfa84d	cgroup: Prevent kill_css() from being called more than once commit `33c35aa481` upstream. The kill_css() function may be called more than once under the condition that the css was killed but not physically removed yet followed by the removal of the cgroup that is hosting the css. This patch prevents any harmm from being done when that happens. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Sean Young	d31fff8cbc	rc-core: race condition during ir_raw_event_register() commit `963761a0b2` upstream. A rc device can call ir_raw_event_handle() after rc_allocate_device(), but before rc_register_device() has completed. This is racey because rcdev->raw is set before rcdev->raw->thread has a valid value. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Sui Chen	d9f48c4661	ahci: Acer SA5-271 SSD Not Detected Fix commit `8bfd174312` upstream. (Correction in this resend: fixed function name acer_sa5_271_workaround; fixed the always-true condition in the function; fixed description.) On the Acer Switch Alpha 12 (model number: SA5-271), the internal SSD may not get detected because the port_map and CAP.nr_ports combination causes the driver to skip the port that is actually connected to the SSD. More specifically, either all SATA ports are identified as DUMMY, or all ports get ``link down'' and never get up again. This problem occurs occasionally. When this problem occurs, CAP may hold a value of 0xC734FF00 or 0xC734FF01 and port_map may hold a value of 0x00 or 0x01. When this problem does not occur, CAP holds a value of 0xC734FF02 and port_map may hold a value of 0x07. Overriding the CAP value to 0xC734FF02 and port_map to 0x7 significantly reduces the occurrence of this problem. Link: https://bugzilla.kernel.org/attachment.cgi?id=253091 Signed-off-by: Sui Chen <suichen6@gmail.com> Tested-by: Damian Ivanov <damianatorrpm@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Rob Clark	39d584db44	drm/msm/mdp5: use __drm_atomic_helper_plane_duplicate_state() commit `786813c343` upstream. Somehow the helper was never retrofitted for mdp5. Which meant when plane_state->fence was added, it could get copied into new state in mdp5_plane_duplicate_state(). If an update to disable the plane (for example on rmfb) managed to sneak in after an nonblock update had swapped state, but before it was committed, we'd get a splat: WARNING: CPU: 1 PID: 69 at ../drivers/gpu/drm/drm_atomic_helper.c:1061 drm_atomic_helper_wait_for_fences+0xe0/0xf8 Modules linked in: CPU: 1 PID: 69 Comm: kworker/1:1 Tainted: G W 4.11.0-rc8+ #1187 Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) Workqueue: events drm_mode_rmfb_work_fn task: ffffffc036560d00 task.stack: ffffffc036550000 PC is at drm_atomic_helper_wait_for_fences+0xe0/0xf8 LR is at complete_commit.isra.1+0x44/0x1c0 pc : [<ffffff80084f6040>] lr : [<ffffff800854176c>] pstate: 20000145 sp : ffffffc036553b60 x29: ffffffc036553b60 x28: ffffffc0264e6a00 x27: ffffffc035659000 x26: 0000000000000000 x25: ffffffc0240e8000 x24: 0000000000000038 x23: 0000000000000000 x22: ffffff800858f200 x21: ffffffc0240e8000 x20: ffffffc02f56a800 x19: 0000000000000000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: ffffffc00a192700 x13: 0000000000000004 x12: 0000000000000000 x11: ffffff80089a1690 x10: 00000000000008f0 x9 : ffffffc036553b20 x8 : ffffffc036561650 x7 : ffffffc03fe6cb40 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000002 x3 : ffffffc035659000 x2 : ffffffc0240e8c80 x1 : 0000000000000000 x0 : ffffffc02adbe588 ---[ end trace 13aeec77c3fb55e2 ]--- Call trace: Exception stack(0xffffffc036553990 to 0xffffffc036553ac0) 3980: 0000000000000000 0000008000000000 39a0: ffffffc036553b60 ffffff80084f6040 0000000000004ff0 0000000000000038 39c0: ffffffc0365539d0 ffffff800857e098 ffffffc036553a00 ffffff800857e1b0 39e0: ffffffc036553a10 ffffff800857c554 ffffffc0365e8400 ffffffc0365e8400 3a00: ffffffc036553a20 ffffff8008103358 000000000001aad7 ffffff800851b72c 3a20: ffffffc036553a50 ffffff80080e9228 ffffffc02adbe588 0000000000000000 3a40: ffffffc0240e8c80 ffffffc035659000 0000000000000002 0000000000000001 3a60: 0000000000000000 ffffffc03fe6cb40 ffffffc036561650 ffffffc036553b20 3a80: 00000000000008f0 ffffff80089a1690 0000000000000000 0000000000000004 3aa0: ffffffc00a192700 0000000000000000 0000000000000000 0000000000000000 [<ffffff80084f6040>] drm_atomic_helper_wait_for_fences+0xe0/0xf8 [<ffffff800854176c>] complete_commit.isra.1+0x44/0x1c0 [<ffffff8008541c64>] msm_atomic_commit+0x32c/0x350 [<ffffff8008516230>] drm_atomic_commit+0x50/0x60 [<ffffff8008517548>] drm_atomic_remove_fb+0x158/0x250 [<ffffff80085186d0>] drm_framebuffer_remove+0x50/0x158 [<ffffff8008518818>] drm_mode_rmfb_work_fn+0x40/0x58 [<ffffff80080d5668>] process_one_work+0x1d0/0x378 [<ffffff80080d5a54>] worker_thread+0x244/0x488 [<ffffff80080db7fc>] kthread+0xfc/0x128 [<ffffff8008082ec0>] ret_from_fork+0x10/0x50 Fixes: `9626014` ("drm/fence: add in-fences support") Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reported-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Eric Anholt	a5ab52b38f	drm/msm: Expose our reservation object when exporting a dmabuf. commit `43523eba79` upstream. Without this, polling on the dma-buf (and presumably other devices synchronizing against our rendering) would return immediately, even while the BO was busy. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rob Clark <robdclark@gmail.com> Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Nicholas Bellinger	0354d1d64f	target: Re-add check to reject control WRITEs with overflow data commit `4ff83daa02` upstream. During v4.3 when the overflow/underflow check was relaxed by commit `c72c525022`: commit `c72c525022` Author: Roland Dreier <roland@purestorage.com> Date: Wed Jul 22 15:08:18 2015 -0700 target: allow underflow/overflow for PR OUT etc. commands to allow underflow/overflow for Windows compliance + FCP, a consequence was to allow control CDBs to process overflow data for iscsi-target with immediate data as well. As per Roland's original change, continue to allow underflow cases for control CDBs to make Windows compliance + FCP happy, but until overflow for control CDBs is supported tree-wide, explicitly reject all control WRITEs with overflow following pre v4.3.y logic. Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
David Arcari	0eedb7832e	cpufreq: cpufreq_register_driver() should return -ENODEV if init fails commit `6c77003677` upstream. For a driver that does not set the CPUFREQ_STICKY flag, if all of the ->init() calls fail, cpufreq_register_driver() should return an error. This will prevent the driver from loading. Fixes: `ce1bcfe94d` (cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs) Signed-off-by: David Arcari <darcari@redhat.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Jason A. Donenfeld	86f95e53ed	random: invalidate batched entropy after crng init commit `b169c13de4` upstream. It's possible that get_random_{u32,u64} is used before the crng has initialized, in which case, its output might not be cryptographically secure. For this problem, directly, this patch set is introducing the *_wait variety of functions, but even with that, there's a subtle issue: what happens to our batched entropy that was generated before initialization. Prior to this commit, it'd stick around, supplying bad numbers. After this commit, we force the entropy to be re-extracted after each phase of the crng has initialized. In order to avoid a race condition with the position counter, we introduce a simple rwlock for this invalidation. Since it's only during this awkward transition period, after things are all set up, we stop using it, so that it doesn't have an impact on performance. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Pratyush Anand	0524867ee2	mei: make sysfs modalias format similar as uevent modalias commit `6f9193ec04` upstream. modprobe is not able to resolve sysfs modalias for mei devices. # cat /sys/class/watchdog/watchdog0/device/watchdog/watchdog0/device/modalias mei::05b79a6f-4628-4d7f-899d-a91514cb32ab: # modprobe --set-version 4.9.6-200.fc25.x86_64 -R mei::05b79a6f-4628-4d7f-899d-a91514cb32ab: modprobe: FATAL: Module mei::05b79a6f-4628-4d7f-899d-a91514cb32ab: not found in directory /lib/modules/4.9.6-200.fc25.x86_64 # cat /lib/modules/4.9.6-200.fc25.x86_64/modules.alias \| grep 05b79a6f-4628-4d7f-899d-a91514cb32ab alias mei::05b79a6f-4628-4d7f-899d-a91514cb32ab::* mei_wdt commit `b26864cad1` ("mei: bus: add client protocol version to the device alias"), however sysfs modalias is still in formmat mei:S:uuid:*. This patch equates format of uevent and sysfs modalias so that modprobe is able to resolve the aliases. Fixes: commit `b26864cad1` ("mei: bus: add client protocol version to the device alias") Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Bart Van Assche	6712554818	block: Avoid that blk_exit_rl() triggers a use-after-free commit `b425e50492` upstream. Since the introduction of .init_rq_fn() and .exit_rq_fn() it is essential that the memory allocated for struct request_queue stays around until all blk_exit_rl() calls have finished. Hence make blk_init_rl() take a reference on struct request_queue. This patch fixes the following crash: general protection fault: 0000 [#2] SMP CPU: 3 PID: 28 Comm: ksoftirqd/3 Tainted: G D 4.12.0-rc2-dbg+ #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 task: ffff88013a108040 task.stack: ffffc9000071c000 RIP: 0010:free_request_size+0x1a/0x30 RSP: 0018:ffffc9000071fd38 EFLAGS: 00010202 RAX: 6b6b6b6b6b6b6b6b RBX: ffff880067362a88 RCX: 0000000000000003 RDX: ffff880067464178 RSI: ffff880067362a88 RDI: ffff880135ea4418 RBP: ffffc9000071fd40 R08: 0000000000000000 R09: 0000000100180009 R10: ffffc9000071fd38 R11: ffffffff81110800 R12: ffff88006752d3d8 R13: ffff88006752d3d8 R14: ffff88013a108040 R15: 000000000000000a FS: 0000000000000000(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa8ec1edb00 CR3: 0000000138ee8000 CR4: 00000000001406e0 Call Trace: mempool_destroy.part.10+0x21/0x40 mempool_destroy+0xe/0x10 blk_exit_rl+0x12/0x20 blkg_free+0x4d/0xa0 __blkg_release_rcu+0x59/0x170 rcu_process_callbacks+0x260/0x4e0 __do_softirq+0x116/0x250 smpboot_thread_fn+0x123/0x1e0 kthread+0x109/0x140 ret_from_fork+0x31/0x40 Fixes: commit `e9c787e65c` ("scsi: allocate scsi_cmnd structures as part of struct request") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Matt Ranostay	698aa72067	iio: proximity: as3935: fix iio_trigger_poll issue commit `9122b54f26` upstream. Using iio_trigger_poll() can oops when multiple interrupts happen before the first is handled. Use iio_trigger_poll_chained() instead and use the timestamp when processed, since it will be in theory be 2 ms max latency. Fixes: `24ddb0e4bb` ("iio: Add AS3935 lightning sensor support") Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Matt Ranostay	71c0950cd7	iio: proximity: as3935: fix AS3935_INT mask commit `275292d3a3` upstream. AS3935 interrupt mask has been incorrect so valid lightning events would never trigger an buffer event. Also noise interrupt should be BIT(0). Fixes: `24ddb0e4bb` ("iio: Add AS3935 lightning sensor support") Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Marcin Niestroj	7b5d3c1a14	iio: trigger: fix NULL pointer dereference in iio_trigger_write_current() commit `4eecbe8188` upstream. In case oldtrig == trig == NULL (which happens when we set none trigger, when there is already none set) there is a NULL pointer dereference during iio_trigger_put(trig). Below is kernel output when this occurs: [ 26.741790] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 26.750179] pgd = cacc0000 [ 26.752936] [00000000] pgd=8adc6835, pte=00000000, *ppte=00000000 [ 26.759531] Internal error: Oops: 17 [#1] SMP ARM [ 26.764261] Modules linked in: usb_f_ncm u_ether usb_f_acm u_serial usb_f_fs libcomposite configfs evbug [ 26.773844] CPU: 0 PID: 152 Comm: synchro Not tainted 4.12.0-rc1 #2 [ 26.780128] Hardware name: Freescale i.MX6 Ultralite (Device Tree) [ 26.786329] task: cb1de200 task.stack: cac92000 [ 26.790892] PC is at iio_trigger_write_current+0x188/0x1f4 [ 26.796403] LR is at lock_release+0xf8/0x20c [ 26.800696] pc : [<c0736f34>] lr : [<c016efb0>] psr: 600d0013 [ 26.800696] sp : cac93e30 ip : cac93db0 fp : cac93e5c [ 26.812193] r10: c0e64fe8 r9 : 00000000 r8 : 00000001 [ 26.817436] r7 : cb190810 r6 : 00000010 r5 : 00000001 r4 : 00000000 [ 26.823982] r3 : 00000000 r2 : 00000000 r1 : cb1de200 r0 : 00000000 [ 26.830528] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 26.837683] Control: 10c5387d Table: 8acc006a DAC: 00000051 [ 26.843448] Process synchro (pid: 152, stack limit = 0xcac92210) [ 26.849475] Stack: (0xcac93e30 to 0xcac94000) [ 26.853857] 3e20: 00000001 c0736dac c054033c cae6b680 [ 26.862060] 3e40: cae6b680 00000000 00000001 cb3f8610 cac93e74 cac93e60 c054035c c0736db8 [ 26.870264] 3e60: 00000001 c054033c cac93e94 cac93e78 c029bf34 c0540348 00000000 00000000 [ 26.878469] 3e80: cb3f8600 cae6b680 cac93ed4 cac93e98 c029b320 c029bef0 00000000 00000000 [ 26.886672] 3ea0: 00000000 cac93f78 cb2d41fc caed3280 c029b214 cac93f78 00000001 000e20f8 [ 26.894874] 3ec0: 00000001 00000000 cac93f44 cac93ed8 c0221dcc c029b220 c0e1ca39 cb2d41fc [ 26.903079] 3ee0: cac93f04 cac93ef0 c0183ef0 c0183ab0 cb2d41fc 00000000 cac93f44 cac93f08 [ 26.911282] 3f00: c0225eec c0183ebc 00000001 00000000 c0223728 00000000 c0245454 00000001 [ 26.919485] 3f20: 00000001 caed3280 000e20f8 cac93f78 000e20f8 00000001 cac93f74 cac93f48 [ 26.927690] 3f40: c0223680 c0221da4 c0246520 c0245460 caed3283 caed3280 00000000 00000000 [ 26.935893] 3f60: 000e20f8 00000001 cac93fa4 cac93f78 c0224520 c02235e4 00000000 00000000 [ 26.944096] 3f80: 00000001 000e20f8 00000001 00000004 c0107f84 cac92000 00000000 cac93fa8 [ 26.952299] 3fa0: c0107de0 c02244e8 00000001 000e20f8 `0000000e` 000e20f8 00000001 fbad2484 [ 26.960502] 3fc0: 00000001 000e20f8 00000001 00000004 beb6b698 00064260 0006421c beb6b4b4 [ 26.968705] 3fe0: 00000000 beb6b450 b6f219a0 b6e2f268 800d0010 `0000000e` cac93ff4 cac93ffc [ 26.976896] Backtrace: [ 26.979388] [<c0736dac>] (iio_trigger_write_current) from [<c054035c>] (dev_attr_store+0x20/0x2c) [ 26.988289] r10:cb3f8610 r9:00000001 r8:00000000 r7:cae6b680 r6:cae6b680 r5:c054033c [ 26.996138] r4:c0736dac r3:00000001 [ 26.999747] [<c054033c>] (dev_attr_store) from [<c029bf34>] (sysfs_kf_write+0x50/0x54) [ 27.007686] r5:c054033c r4:00000001 [ 27.011290] [<c029bee4>] (sysfs_kf_write) from [<c029b320>] (kernfs_fop_write+0x10c/0x224) [ 27.019579] r7:cae6b680 r6:cb3f8600 r5:00000000 r4:00000000 [ 27.025271] [<c029b214>] (kernfs_fop_write) from [<c0221dcc>] (__vfs_write+0x34/0x120) [ 27.033214] r10:00000000 r9:00000001 r8:000e20f8 r7:00000001 r6:cac93f78 r5:c029b214 [ 27.041059] r4:caed3280 [ 27.043622] [<c0221d98>] (__vfs_write) from [<c0223680>] (vfs_write+0xa8/0x170) [ 27.050959] r9:00000001 r8:000e20f8 r7:cac93f78 r6:000e20f8 r5:caed3280 r4:00000001 [ 27.058731] [<c02235d8>] (vfs_write) from [<c0224520>] (SyS_write+0x44/0x98) [ 27.065806] r9:00000001 r8:000e20f8 r7:00000000 r6:00000000 r5:caed3280 r4:caed3283 [ 27.073582] [<c02244dc>] (SyS_write) from [<c0107de0>] (ret_fast_syscall+0x0/0x1c) [ 27.081179] r9:cac92000 r8:c0107f84 r7:00000004 r6:00000001 r5:000e20f8 r4:00000001 [ 27.088947] Code: 1a000009 e1a04009 e3a06010 e1a05008 (e5943000) [ 27.095244] ---[ end trace 06d1dab86d6e6bab ]--- To fix that problem call iio_trigger_put(trig) only when trig is not NULL. Fixes: `d5d24bcc0a` ("iio: trigger: close race condition in acquiring trigger reference") Signed-off-by: Marcin Niestroj <m.niestroj@grinn-global.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Franziska Naepelt	80c8ac6b9b	iio: light: ltr501 Fix interchanged als/ps register field commit `7cc3bff4ef` upstream. The register mapping for the IIO driver for the Liteon Light and Proximity sensor LTR501 interrupt mode is interchanged (ALS/PS). There is a register called INTERRUPT register (address 0x8F) Bit 0 represents PS measurement trigger. Bit 1 represents ALS measurement trigger. This two bit fields are interchanged within the driver. see datasheet page 24: http://optoelectronics.liteon.com/upload/download/DS86-2012-0006/S_110_LTR-501ALS-01_PrelimDS_ver1%5B1%5D.pdf Signed-off-by: Franziska Naepelt <franziska.naepelt@idt.com> Fixes: `7ac702b314` ("iio: ltr501: Add interrupt support") Acked-by: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Raveendra Padasalagi	1cb7bbe7b5	iio: adc: bcm_iproc_adc: swap primary and secondary isr handler's commit `f7d86ecf83` upstream. The third argument of devm_request_threaded_irq() is the primary handler. It is called in hardirq context and checks whether the interrupt is relevant to the device. If the primary handler returns IRQ_WAKE_THREAD, the secondary handler (a.k.a. handler thread) is scheduled to run in process context. bcm_iproc_adc.c uses the secondary handler as the primary one and the other way around. So this patch fixes the same, along with re-naming the secondary handler and primary handler names properly. Tested on the BCM9583XX iProc SoC based boards. Fixes: `4324c97ece` ("iio: Add driver for Broadcom iproc-static-adc") Reported-by: Pavel Roskin <plroskin@gmail.com> Signed-off-by: Raveendra Padasalagi <raveendra.padasalagi@broadcom.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Oleg Drokin	d9b0c95585	staging/lustre/lov: remove set_fs() call from lov_getstripe() commit `0a33252e06` upstream. lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct lov_user_md pointer from user- or kernel-space. This changes the behavior of copy_from_user() on SPARC and may result in a misaligned access exception which in turn oopses the kernel. In fact the relevant argument to lov_getstripe() is never called with a kernel-space pointer and so changing the address limits is unnecessary and so we remove the calls to save, set, and restore the address limits. Signed-off-by: John L. Hammond <john.hammond@intel.com> Reviewed-on: http://review.whamcloud.com/6150 Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Li Wei <wei.g.li@intel.com> Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Michael Thalmeier	dd8980bb03	usb: chipidea: debug: check before accessing ci_role commit `0340ff83cd` upstream. ci_role BUGs when the role is >= CI_ROLE_END. Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Jisheng Zhang	f2967b72b6	usb: chipidea: udc: fix NULL pointer dereference if udc_start failed commit `aa1f058d7d` upstream. Fix below NULL pointer dereference. we set ci->roles[CI_ROLE_GADGET] too early in ci_hdrc_gadget_init(), if udc_start() fails due to some reason, the ci->roles[CI_ROLE_GADGET] check in ci_hdrc_gadget_destroy can't protect us. We fix this issue by only setting ci->roles[CI_ROLE_GADGET] if udc_start() succeed. [ 1.398550] Unable to handle kernel NULL pointer dereference at virtual address 00000000 ... [ 1.448600] PC is at dma_pool_free+0xb8/0xf0 [ 1.453012] LR is at dma_pool_free+0x28/0xf0 [ 2.113369] [<ffffff80081817d8>] dma_pool_free+0xb8/0xf0 [ 2.118857] [<ffffff800841209c>] destroy_eps+0x4c/0x68 [ 2.124165] [<ffffff8008413770>] ci_hdrc_gadget_destroy+0x28/0x50 [ 2.130461] [<ffffff800840fa30>] ci_hdrc_probe+0x588/0x7e8 [ 2.136129] [<ffffff8008380fb8>] platform_drv_probe+0x50/0xb8 [ 2.142066] [<ffffff800837f494>] driver_probe_device+0x1fc/0x2a8 [ 2.148270] [<ffffff800837f68c>] __device_attach_driver+0x9c/0xf8 [ 2.154563] [<ffffff800837d570>] bus_for_each_drv+0x58/0x98 [ 2.160317] [<ffffff800837f174>] __device_attach+0xc4/0x138 [ 2.166072] [<ffffff800837f738>] device_initial_probe+0x10/0x18 [ 2.172185] [<ffffff800837e58c>] bus_probe_device+0x94/0xa0 [ 2.177940] [<ffffff800837c560>] device_add+0x3f0/0x560 [ 2.183337] [<ffffff8008380d20>] platform_device_add+0x180/0x240 [ 2.189541] [<ffffff800840f0e8>] ci_hdrc_add_device+0x440/0x4f8 [ 2.195654] [<ffffff8008414194>] ci_hdrc_usb2_probe+0x13c/0x2d8 [ 2.201769] [<ffffff8008380fb8>] platform_drv_probe+0x50/0xb8 [ 2.207705] [<ffffff800837f494>] driver_probe_device+0x1fc/0x2a8 [ 2.213910] [<ffffff800837f5ec>] __driver_attach+0xac/0xb0 [ 2.219575] [<ffffff800837d4b0>] bus_for_each_dev+0x60/0xa0 [ 2.225329] [<ffffff800837ec80>] driver_attach+0x20/0x28 [ 2.230816] [<ffffff800837e880>] bus_add_driver+0x1d0/0x238 [ 2.236571] [<ffffff800837fdb0>] driver_register+0x60/0xf8 [ 2.242237] [<ffffff8008380ef4>] __platform_driver_register+0x44/0x50 [ 2.248891] [<ffffff80086fd440>] ci_hdrc_usb2_driver_init+0x18/0x20 [ 2.255365] [<ffffff8008082950>] do_one_initcall+0x38/0x128 [ 2.261121] [<ffffff80086e0d00>] kernel_init_freeable+0x1ac/0x250 [ 2.267414] [<ffffff800852f0b8>] kernel_init+0x10/0x100 [ 2.272810] [<ffffff8008082680>] ret_from_fork+0x10/0x50 Fixes: `3f124d233e` ("usb: chipidea: add role init and destroy APIs") Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Andrey Smirnov	f26ac1fc9f	usb: chipidea: imx: Do not access CLKONOFF on i.MX51 commit `62b97d502b` upstream. Unlike i.MX53, i.MX51's USBOH3 register file does not implemenent registers past offset 0x018, which includes MX53_USB_CLKONOFF_CTRL_OFFSET and trying to access that register on said platform results in external abort. Fix it by enabling CLKONOFF accessing codepath only for i.MX53. Fixes `3be3251db0` ("usb: chipidea: imx: Disable internal 60Mhz clock with ULPI PHY") Cc: cphealy@gmail.com Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-usb@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Bin Liu	b89de0406a	usb: musb: dsps: keep VBUS on for host-only mode commit `b3addcf0d1` upstream. Currently VBUS is turned off while a usb device is detached, and turned on again by the polling routine. This short period VBUS loss prevents usb modem to switch mode. VBUS should be constantly on for host-only mode, so this changes the driver to not turn off VBUS for host-only mode. Fixes: `2f3fd2c5bd` ("usb: musb: Prepare dsps glue layer for PM runtime support") Reported-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Thinh Nguyen	c33b087c83	usb: gadget: f_mass_storage: Serialize wake and sleep execution commit `dc9217b69d` upstream. f_mass_storage has a memorry barrier issue with the sleep and wake functions that can cause a deadlock. This results in intermittent hangs during MSC file transfer. The host will reset the device after receiving no response to resume the transfer. This issue is seen when dwc3 is processing 2 transfer-in-progress events at the same time, invoking completion handlers for CSW and CBW. Also this issue occurs depending on the system timing and latency. To increase the chance to hit this issue, you can force dwc3 driver to wait and process those 2 events at once by adding a small delay (~100us) in dwc3_check_event_buf() whenever the request is for CSW and read the event count again. Avoid debugging with printk and ftrace as extra delays and memory barrier will mask this issue. Scenario which can lead to failure: ----------------------------------- 1) The main thread sleeps and waits for the next command in get_next_command(). 2) bulk_in_complete() wakes up main thread for CSW. 3) bulk_out_complete() tries to wake up the running main thread for CBW. 4) thread_wakeup_needed is not loaded with correct value in sleep_thread(). 5) Main thread goes to sleep again. The pattern is shown below. Note the 2 critical variables. * common->thread_wakeup_needed * bh->state CPU 0 (sleep_thread) CPU 1 (wakeup_thread) ============================== =============================== bh->state = BH_STATE_FULL; smp_wmb(); thread_wakeup_needed = 0; thread_wakeup_needed = 1; smp_rmb(); if (bh->state != BH_STATE_FULL) sleep again ... As pointed out by Alan Stern, this is an R-pattern issue. The issue can be seen when there are two wakeups in quick succession. The thread_wakeup_needed can be overwritten in sleep_thread, and the read of the bh->state maybe reordered before the write to thread_wakeup_needed. This patch applies full memory barrier smp_mb() in both sleep_thread() and wakeup_thread() to ensure the order which the thread_wakeup_needed and bh->state are written and loaded. However, a better solution in the future would be to use wait_queue method that takes care of managing memory barrier between waker and waiter. Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Hans de Goede	3d7ba52b5f	drm: Fix oops + Xserver hang when unplugging USB drm devices commit `75fb636324` upstream. commit `a39be606f9` ("drm: Do a full device unregister when unplugging") causes backtraces like this one when unplugging an usb drm device while it is in use: usb 2-3: USB disconnect, device number 25 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 242 at drivers/gpu/drm/drm_mode_config.c:424 drm_mode_config_cleanup+0x220/0x280 [drm] ... RIP: 0010:drm_mode_config_cleanup+0x220/0x280 [drm] ... Call Trace: gm12u320_modeset_cleanup+0xe/0x10 [gm12u320] gm12u320_driver_unload+0x35/0x70 [gm12u320] drm_dev_unregister+0x3c/0xe0 [drm] drm_unplug_dev+0x12/0x60 [drm] gm12u320_usb_disconnect+0x36/0x40 [gm12u320] usb_unbind_interface+0x72/0x280 device_release_driver_internal+0x158/0x210 device_release_driver+0x12/0x20 bus_remove_device+0x104/0x180 device_del+0x1d2/0x350 usb_disable_device+0x9f/0x270 usb_disconnect+0xc6/0x260 ... [drm:drm_mode_config_cleanup [drm]] ERROR connector Unknown-1 leaked! ------------[ cut here ]------------ WARNING: CPU: 0 PID: 242 at drivers/gpu/drm/drm_mode_config.c:458 drm_mode_config_cleanup+0x268/0x280 [drm] ... <same Call Trace> ---[ end trace 80df975dae439ed6 ]--- general protection fault: 0000 [#1] SMP ... Call Trace: ? __switch_to+0x225/0x450 drm_mode_rmfb_work_fn+0x55/0x70 [drm] process_one_work+0x193/0x3c0 worker_thread+0x4a/0x3a0 ... RIP: drm_framebuffer_remove+0x62/0x3f0 [drm] RSP: ffffb776c39dfd98 ---[ end trace 80df975dae439ed7 ]--- After which the system is unusable this is caused by drm_dev_unregister getting called immediately on unplug, which calls the drivers unload function which calls drm_mode_config_cleanup which removes the framebuffer object while userspace is still holding a reference to it. Reverting commit `a39be606f9` ("drm: Do a full device unregister when unplugging") leads to the following oops on unplug instead, when userspace closes the last fd referencing the drm_dev: sysfs group 'power' not found for kobject 'card1-Unknown-1' ------------[ cut here ]------------ WARNING: CPU: 0 PID: 2459 at fs/sysfs/group.c:237 sysfs_remove_group+0x80/0x90 ... RIP: 0010:sysfs_remove_group+0x80/0x90 ... Call Trace: dpm_sysfs_remove+0x57/0x60 device_del+0xfd/0x350 device_unregister+0x1a/0x60 drm_sysfs_connector_remove+0x39/0x50 [drm] drm_connector_unregister+0x5a/0x70 [drm] drm_connector_unregister_all+0x45/0xa0 [drm] drm_modeset_unregister_all+0x12/0x30 [drm] drm_dev_unregister+0xca/0xe0 [drm] drm_put_dev+0x32/0x60 [drm] drm_release+0x2f3/0x380 [drm] __fput+0xdf/0x1e0 ... ---[ end trace ecfb91ac85688bbe ]--- BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 IP: down_write+0x1f/0x40 ... Call Trace: debugfs_remove_recursive+0x55/0x1b0 drm_debugfs_connector_remove+0x21/0x40 [drm] drm_connector_unregister+0x62/0x70 [drm] drm_connector_unregister_all+0x45/0xa0 [drm] drm_modeset_unregister_all+0x12/0x30 [drm] drm_dev_unregister+0xca/0xe0 [drm] drm_put_dev+0x32/0x60 [drm] drm_release+0x2f3/0x380 [drm] __fput+0xdf/0x1e0 ... ---[ end trace ecfb91ac85688bbf ]--- This is caused by the revert moving back to drm_unplug_dev calling drm_minor_unregister which does: device_del(minor->kdev); dev_set_drvdata(minor->kdev, NULL); /* safety belt */ drm_debugfs_cleanup(minor); Causing the sysfs entries to already be removed even though we still have references to them in e.g. drm_connector. Note we must call drm_minor_unregister to notify userspace of the unplug of the device, so calling drm_dev_unregister is not completely wrong the problem is that drm_dev_unregister does too much. This commit fixes drm_unplug_dev by not only reverting commit `a39be606f9` ("drm: Do a full device unregister when unplugging") but by also adding a call to drm_modeset_unregister_all before the drm_minor_unregister calls to make sure all sysfs entries are removed before calling device_del(minor->kdev) thereby also fixing the second set of oopses caused by just reverting the commit. Fixes: `a39be606f9` ("drm: Do a full device unregister when unplugging") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jeffy <jeffy.chen@rock-chips.com> Cc: Marco Diego Aurélio Mesquita <marcodiegomesquita@gmail.com> Reported-by: Marco Diego Aurélio Mesquita <marcodiegomesquita@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170601115430.4113-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Jan Kara	de8f4aeaa1	ext4: fix fdatasync(2) after extent manipulation operations commit `67a7d5f561` upstream. Currently, extent manipulation operations such as hole punch, range zeroing, or extent shifting do not record the fact that file data has changed and thus fdatasync(2) has a work to do. As a result if we crash e.g. after a punch hole and fdatasync, user can still possibly see the punched out data after journal replay. Test generic/392 fails due to these problems. Fix the problem by properly marking that file data has changed in these operations. Fixes: `a4bb6b64e3` Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Jan Kara	875d084e97	ext4: fix data corruption with EXT4_GET_BLOCKS_ZERO commit `4f8caa60a5` upstream. When ext4_map_blocks() is called with EXT4_GET_BLOCKS_ZERO to zero-out allocated blocks and these blocks are actually converted from unwritten extent the following race can happen: CPU0 CPU1 page fault page fault ... ... ext4_map_blocks() ext4_ext_map_blocks() ext4_ext_handle_unwritten_extents() ext4_ext_convert_to_initialized() - zero out converted extent ext4_zeroout_es() - inserts extent as initialized in status tree ext4_map_blocks() ext4_es_lookup_extent() - finds initialized extent write data ext4_issue_zeroout() - zeroes out new extent overwriting data This problem can be reproduced by generic/340 for the fallocated case for the last block in the file. Fix the problem by avoiding zeroing out the area we are mapping with ext4_map_blocks() in ext4_ext_convert_to_initialized(). It is pointless to zero out this area in the first place as the caller asked us to convert the area to initialized because he is just going to write data there before the transaction finishes. To achieve this we delete the special case of zeroing out full extent as that will be handled by the cases below zeroing only the part of the extent that needs it. We also instruct ext4_split_extent() that the middle of extent being split contains data so that ext4_split_extent_at() cannot zero out full extent in case of ENOSPC. Fixes: `12735f8819` Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Konstantin Khlebnikov	22fb074c67	ext4: keep existing extra fields when inode expands commit `887a973061` upstream. ext4_expand_extra_isize() should clear only space between old and new size. Fixes: `6dd4ee7cab` # v2.6.23 Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Jan Kara	699dc1080d	ext4: fix SEEK_HOLE commit `7d95eddf31` upstream. Currently, SEEK_HOLE implementation in ext4 may both return that there's a hole at some offset although that offset already has data and skip some holes during a search for the next hole. The first problem is demostrated by: xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "seek -h 0" file wrote 57344/57344 bytes at offset 0 56 KiB, 14 ops; 0.0000 sec (2.054 GiB/sec and 538461.5385 ops/sec) Whence Result HOLE 0 Where we can see that SEEK_HOLE wrongly returned offset 0 as containing a hole although we have written data there. The second problem can be demonstrated by: xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k" -c "seek -h 0" file wrote 57344/57344 bytes at offset 0 56 KiB, 14 ops; 0.0000 sec (1.978 GiB/sec and 518518.5185 ops/sec) wrote 8192/8192 bytes at offset 131072 8 KiB, 2 ops; 0.0000 sec (2 GiB/sec and 500000.0000 ops/sec) Whence Result HOLE 139264 Where we can see that hole at offsets 56k..128k has been ignored by the SEEK_HOLE call. The underlying problem is in the ext4_find_unwritten_pgoff() which is just buggy. In some cases it fails to update returned offset when it finds a hole (when no pages are found or when the first found page has higher index than expected), in some cases conditions for detecting hole are just missing (we fail to detect a situation where indices of returned pages are not contiguous). Fix ext4_find_unwritten_pgoff() to properly detect non-contiguous page indices and also handle all cases where we got less pages then expected in one place and handle it properly there. Fixes: `c8c0df241c` CC: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Julien Grall	628ee50458	xen/privcmd: Support correctly 64KB page granularity when mapping memory commit `753c09b565` upstream. Commit `5995a68` "xen/privcmd: Add support for Linux 64KB page granularity" did not go far enough to support 64KB in mmap_batch_fn. The variable 'nr' is the number of 4KB chunk to map. However, when Linux is using 64KB page granularity the array of pages (vma->vm_private_data) contain one page per 64KB. Fix it by incrementing st->index correctly. Furthermore, st->va is not correctly incremented as PAGE_SIZE != XEN_PAGE_SIZE. Fixes: `5995a68` ("xen/privcmd: Add support for Linux 64KB page granularity") Reported-by: Feng Kan <fkan@apm.com> Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Marc Gonzalez	b9d013c005	mtd: nand: tango: Update ecc_stats.corrected commit `60cf0ce14b` upstream. According to Boris, some user-space tools expect MTD drivers to update ecc_stats.corrected, and it's better to provide a lower bound than to provide no information at all. Fixes: `6956e2385a` ("mtd: nand: add tango NAND flash controller support") Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Andres Galacho	6d916021a1	mtd: nand: tango: Export OF device ID table as module aliases commit `2761b4f12b` upstream. The device table is required to load modules based on modaliases. After adding MODULE_DEVICE_TABLE, below entries for example will be added to module.alias: alias: of:NTCsigma,smp8758-nandC* alias: of:NTCsigma,smp8758-nand Fixes: `6956e2385a` ("mtd: nand: add tango NAND flash controller support") Signed-off-by: Andres Galacho <andresgalacho@gmail.com> Acked-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Jan Kara	f0aa7a0415	reiserfs: Make flush bios explicitely sync commit `d8747d642e` upstream. Commit `b685d3d65a` "block: treat REQ_FUA and REQ_PREFLUSH as synchronous" removed REQ_SYNC flag from WRITE_{FUA\|PREFLUSH\|...} definitions. generic_make_request_checks() however strips REQ_FUA and REQ_PREFLUSH flags from a bio when the storage doesn't report volatile write cache and thus write effectively becomes asynchronous which can lead to performance regressions Fix the problem by making sure all bios which are synchronous are properly marked with REQ_SYNC. Fixes: `b685d3d65a` CC: reiserfs-devel@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Hou Tao	f2fee0c4cd	cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode commit `5be6b75610` upstream. When adding a cfq_group into the cfq service tree, we use CFQ_IDLE_DELAY as the delay of cfq_group's vdisktime if there have been other cfq_groups already. When cfq is under iops mode, commit `9a7f38c42c` ("cfq-iosched: Convert from jiffies to nanoseconds") could result in a large iops delay and lead to an abnormal io schedule delay for the added cfq_group. To fix it, we just need to revert to the old CFQ_IDLE_DELAY value: HZ / 5 when iops mode is enabled. Despite having the same value, the delay of a cfq_queue in idle class and the delay of cfq_group are different things, so I define two new macros for the delay of a cfq_group under time-slice mode and iops mode. Fixes: `9a7f38c42c` ("cfq-iosched: Convert from jiffies to nanoseconds") Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Thomas Petazzoni	067b131e20	dmaengine: mv_xor_v2: set DMA mask to 40 bits commit `b2d3c270f9` upstream. The XORv2 engine on Armada 7K/8K can only access the first 40 bits of the physical address space, so the DMA mask must be set accordingly. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Thomas Petazzoni	6d41a85b29	dmaengine: mv_xor_v2: remove interrupt coalescing commit `9dd4f319ba` upstream. The current implementation of interrupt coalescing doesn't work, because it doesn't configure the coalescing timer, which is needed to make sure we get an interrupt at some point. As a fix for stable, we simply remove the interrupt coalescing functionality. It will be re-introduced properly in a future commit. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Thomas Petazzoni	6f4c739de9	dmaengine: mv_xor_v2: fix tx_submit() implementation commit `44d5887a8b` upstream. The mv_xor_v2_tx_submit() gets the next available HW descriptor by calling mv_xor_v2_get_desq_write_ptr(), which reads a HW register telling the next available HW descriptor. This was working fine when HW descriptors were issued for processing directly in tx_submit(). However, as part of the review process of the driver, a change was requested to move the actual kick-off of HW descriptors processing to ->issue_pending(). Due to this, reading the HW register to know the next available HW descriptor no longer works. So instead of using this HW register, we implemented a software index pointing to the next available HW descriptor. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Hanna Hawa	a4634dfb7c	dmaengine: mv_xor_v2: enable XOR engine after its configuration commit `ab2c5f0a77` upstream. The engine was enabled prior to its configuration, which isn't correct. This patch relocates the activation of the XOR engine, to be after the configuration of the XOR engine. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Hanna Hawa <hannah@marvell.com> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Thomas Petazzoni	cba2cd6df4	dmaengine: mv_xor_v2: do not use descriptors not acked by async_tx commit `bc473da1ed` upstream. Descriptors that have not been acknowledged by the async_tx layer should not be re-used, so this commit adjusts the implementation of mv_xor_v2_prep_sw_desc() to skip descriptors for which async_tx_test_ack() is false. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Thomas Petazzoni	2384df2bfe	dmaengine: mv_xor_v2: properly handle wrapping in the array of HW descriptors commit `2aab4e1815` upstream. mv_xor_v2_tasklet() is looping over completed HW descriptors. Before the loop, it initializes 'next_pending_hw_desc' to the first HW descriptor to handle, and then the loop simply increments this point, without taking care of wrapping when we reach the last HW descriptor. The 'pending_ptr' index was being wrapped back to 0 at the end, but it wasn't used in each iteration of the loop to calculate next_pending_hw_desc. This commit fixes that, and makes next_pending_hw_desc a variable local to the loop itself. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Thomas Petazzoni	fdcadb5f70	dmaengine: mv_xor_v2: handle mv_xor_v2_prep_sw_desc() error properly commit `eb8df543e4` upstream. The mv_xor_v2_prep_sw_desc() is called from a few different places in the driver, but we never take into account the fact that it might return NULL. This commit fixes that, ensuring that we don't panic if there are no more descriptors available. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Alexander Sverdlin	5c039176bc	dmaengine: ep93xx: Don't drain the transfers in terminate_all() commit `98f9de366f` upstream. Draining the transfers in terminate_all callback happens with IRQs disabled, therefore induces huge latency: irqsoff latency trace v1.1.5 on 4.11.0 -------------------------------------------------------------------- latency: 39770 us, #57/57, CPU#0 \| (M:preempt VP:0, KP:0, SP:0 HP:0) ----------------- \| task: process-129 (uid:0 nice:0 policy:2 rt_prio:50) ----------------- => started at: _snd_pcm_stream_lock_irqsave => ended at: snd_pcm_stream_unlock_irqrestore _------=> CPU# / _-----=> irqs-off \| / _----=> need-resched \|\| / _---=> hardirq/softirq \|\|\| / _--=> preempt-depth \|\|\|\| / delay cmd pid \|\|\|\|\| time \| caller \ / \|\|\|\|\| \ \| / process-129 0d.s. 3us : _snd_pcm_stream_lock_irqsave process-129 0d.s1 9us : snd_pcm_stream_lock <-_snd_pcm_stream_lock_irqsave process-129 0d.s1 15us : preempt_count_add <-snd_pcm_stream_lock process-129 0d.s2 22us : preempt_count_add <-snd_pcm_stream_lock process-129 0d.s3 32us : snd_pcm_update_hw_ptr0 <-snd_pcm_period_elapsed process-129 0d.s3 41us : soc_pcm_pointer <-snd_pcm_update_hw_ptr0 process-129 0d.s3 50us : dmaengine_pcm_pointer <-soc_pcm_pointer process-129 0d.s3 58us+: snd_dmaengine_pcm_pointer_no_residue <-dmaengine_pcm_pointer process-129 0d.s3 96us : update_audio_tstamp <-snd_pcm_update_hw_ptr0 process-129 0d.s3 103us : snd_pcm_update_state <-snd_pcm_update_hw_ptr0 process-129 0d.s3 112us : xrun <-snd_pcm_update_state process-129 0d.s3 119us : snd_pcm_stop <-xrun process-129 0d.s3 126us : snd_pcm_action <-snd_pcm_stop process-129 0d.s3 134us : snd_pcm_action_single <-snd_pcm_action process-129 0d.s3 141us : snd_pcm_pre_stop <-snd_pcm_action_single process-129 0d.s3 150us : snd_pcm_do_stop <-snd_pcm_action_single process-129 0d.s3 157us : soc_pcm_trigger <-snd_pcm_do_stop process-129 0d.s3 166us : snd_dmaengine_pcm_trigger <-soc_pcm_trigger process-129 0d.s3 175us : ep93xx_dma_terminate_all <-snd_dmaengine_pcm_trigger process-129 0d.s3 182us : preempt_count_add <-ep93xx_dma_terminate_all process-129 0d.s4 189us*: m2p_hw_shutdown <-ep93xx_dma_terminate_all process-129 0d.s4 39472us : m2p_hw_setup <-ep93xx_dma_terminate_all ... rest skipped... process-129 0d.s. 40080us : <stack trace> => ep93xx_dma_tasklet => tasklet_action => __do_softirq => irq_exit => __handle_domain_irq => vic_handle_irq => __irq_usr => 0xb66c6668 Just abort the transfers and warn if the HW state is not what we expect. Move draining into device_synchronize callback. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Alexander Sverdlin	81a38a595d	dmaengine: ep93xx: Always start from BASE0 commit `0037ae4781` upstream. The current buffer is being reset to zero on device_free_chan_resources() but not on device_terminate_all(). It could happen that HW is restarted and expects BASE0 to be used, but the driver is not synchronized and will start from BASE1. One solution is to reset the buffer explicitly in m2p_hw_setup(). Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Hiroyuki Yokoyama	9812dc2abf	dmaengine: usb-dmac: Fix DMAOR AE bit definition commit `9a445bbb16` upstream. This patch fixes the register definition of AE (Address Error flag) bit. Fixes: `0c1c8ff32f` ("dmaengine: usb-dmac: Add Renesas USB DMA Controller (USB-DMAC) driver") Signed-off-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com> [Shimoda: add Fixes and Cc tags in the commit log] Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Wanpeng Li	3e456059a4	KVM: async_pf: avoid async pf injection when in guest mode commit `9bc1f09f6f` upstream. INFO: task gnome-terminal-:1734 blocked for more than 120 seconds. Not tainted 4.12.0-rc4+ #8 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. gnome-terminal- D 0 1734 1015 0x00000000 Call Trace: __schedule+0x3cd/0xb30 schedule+0x40/0x90 kvm_async_pf_task_wait+0x1cc/0x270 ? __vfs_read+0x37/0x150 ? prepare_to_swait+0x22/0x70 do_async_page_fault+0x77/0xb0 ? do_async_page_fault+0x77/0xb0 async_page_fault+0x28/0x30 This is triggered by running both win7 and win2016 on L1 KVM simultaneously, and then gives stress to memory on L1, I can observed this hang on L1 when at least ~70% swap area is occupied on L0. This is due to async pf was injected to L2 which should be injected to L1, L2 guest starts receiving pagefault w/ bogus %cr2(apf token from the host actually), and L1 guest starts accumulating tasks stuck in D state in kvm_async_pf_task_wait() since missing PAGE_READY async_pfs. This patch fixes the hang by doing async pf when executing L1 guest. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Marc Zyngier	37b1521501	arm: KVM: Allow unaligned accesses at HYP commit `33b5c38852` upstream. We currently have the HSCTLR.A bit set, trapping unaligned accesses at HYP, but we're not really prepared to deal with it. Since the rest of the kernel is pretty happy about that, let's follow its example and set HSCTLR.A to zero. Modern CPUs don't really care. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Marc Zyngier	d06ca326dd	arm64: KVM: Allow unaligned accesses at EL2 commit `78fd6dcf11` upstream. We currently have the SCTLR_EL2.A bit set, trapping unaligned accesses at EL2, but we're not really prepared to deal with it. So far, this has been unnoticed, until GCC 7 started emitting those (in particular 64bit writes on a 32bit boundary). Since the rest of the kernel is pretty happy about that, let's follow its example and set SCTLR_EL2.A to zero. Modern CPUs don't really care. Reported-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Marc Zyngier	ff384c499b	arm64: KVM: Preserve RES1 bits in SCTLR_EL2 commit `d68c1f7fd1` upstream. __do_hyp_init has the rather bad habit of ignoring RES1 bits and writing them back as zero. On a v8.0-8.2 CPU, this doesn't do anything bad, but may end-up being pretty nasty on future revisions of the architecture. Let's preserve those bits so that we don't have to fix this later on. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Wanpeng Li	0d2c539ead	KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation commit `a3641631d1` upstream. If "i" is the last element in the vcpu->arch.cpuid_entries[] array, it potentially can be exploited the vulnerability. this will out-of-bounds read and write. Luckily, the effect is small: /* when no next entry is found, the current entry[i] is reselected / for (j = i + 1; ; j = (j + 1) % nent) { struct kvm_cpuid_entry2 ej = &vcpu->arch.cpuid_entries[j]; if (ej->function == e->function) { It reads ej->maxphyaddr, which is user controlled. However... ej->flags \|= KVM_CPUID_FLAG_STATE_READ_NEXT; After cpuid_entries there is int maxphyaddr; struct x86_emulate_ctxt emulate_ctxt; /* 16-byte aligned */ So we have: - cpuid_entries at offset 1B50 (6992) - maxphyaddr at offset 27D0 (6992 + 3200 = 10192) - padding at 27D4...27DF - emulate_ctxt at 27E0 And it writes in the padding. Pfew, writing the ops field of emulate_ctxt would have been much worse. This patch fixes it by modding the index to avoid the out-of-bounds access. Worst case, i == j and ej->function == e->function, the loop can bail out. Reported-by: Moguofang <moguofang@huawei.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Guofang Mo <moguofang@huawei.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Paolo Bonzini	406aa22b29	kvm: async_pf: fix rcu_irq_enter() with irqs enabled commit `bbaf0e2b1c` upstream. native_safe_halt enables interrupts, and you just shouldn't call rcu_irq_enter() with interrupts enabled. Reorder the call with the following local_irq_disable() to respect the invariant. Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Dave Young	f9ecf4222a	efi/bgrt: Skip efi_bgrt_init() in case of non-EFI boot commit `7425826f4f` upstream. Sabrina Dubroca reported an early panic: BUG: unable to handle kernel paging request at ffffffffff240001 IP: efi_bgrt_init+0xdc/0x134 [...] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ... which was introduced by: `7b0a911478` ("efi/x86: Move the EFI BGRT init code to early init code") The cause is that on this machine the firmware provides the EFI ACPI BGRT table even on legacy non-EFI bootups - which table should be EFI only. The garbage BGRT data causes the efi_bgrt_init() panic. Add a check to skip efi_bgrt_init() in case non-EFI bootup to work around this firmware bug. Tested-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Fixes: `7b0a911478` ("efi/x86: Move the EFI BGRT init code to early init code") Link: http://lkml.kernel.org/r/20170526113652.21339-6-matt@codeblueprint.co.uk [ Rewrote the changelog to be more readable. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Juergen Gross	c76a4a0771	efi: Don't issue error message when booted under Xen commit `1ea34adb87` upstream. When booted as Xen dom0 there won't be an EFI memmap allocated. Avoid issuing an error message in this case: [ 0.144079] efi: Failed to allocate new EFI memmap Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170526113652.21339-2-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Jan Kara	b8745dbb65	gfs2: Make flush bios explicitely sync commit `0f0b9b63e1` upstream. Commit `b685d3d65a` "block: treat REQ_FUA and REQ_PREFLUSH as synchronous" removed REQ_SYNC flag from WRITE_{FUA\|PREFLUSH\|...} definitions. generic_make_request_checks() however strips REQ_FUA and REQ_PREFLUSH flags from a bio when the storage doesn't report volatile write cache and thus write effectively becomes asynchronous which can lead to performance regressions Fix the problem by making sure all bios which are synchronous are properly marked with REQ_SYNC. Fixes: `b685d3d65a` CC: Steven Whitehouse <swhiteho@redhat.com> CC: cluster-devel@redhat.com Acked-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
J. Bruce Fields	836fb216da	nfsd4: fix null dereference on replay commit `9a307403d3` upstream. if we receive a compound such that: - the sessionid, slot, and sequence number in the SEQUENCE op match a cached succesful reply with N ops, and - the Nth operation of the compound is a PUTFH, PUTPUBFH, PUTROOTFH, or RESTOREFH, then nfsd4_sequence will return 0 and set cstate->status to nfserr_replay_cache. The current filehandle will not be set. This will cause us to call check_nfsd_access with first argument NULL. To nfsd4_compound it looks like we just succesfully executed an operation that set a filehandle, but the current filehandle is not set. Fix this by moving the nfserr_replay_cache earlier. There was never any reason to have it after the encode_op label, since the only case where he hit that is when opdesc->op_func sets it. Note that there are two ways we could hit this case: - a client is resending a previously sent compound that ended with one of the four PUTFH-like operations, or - a client is sending a new compound that (incorrectly) shares sessionid, slot, and sequence number with a previously sent compound, and the length of the previously sent compound happens to match the position of a PUTFH-like operation in the new compound. The second is obviously incorrect client behavior. The first is also very strange--the only purpose of a PUTFH-like operation is to set the current filehandle to be used by the following operation, so there's no point in having it as the last in a compound. So it's likely this requires a buggy or malicious client to reproduce. Reported-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Alex Deucher	8015cc81a6	drm/amdgpu/ci: disable mclk switching for high refresh rates (v2) commit `0a646f331d` upstream. Even if the vblank period would allow it, it still seems to be problematic on some cards. v2: fix logic inversion (Nils) bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Vegard Nossum	0f7ff02f7d	kthread: Fix use-after-free if kthread fork fails commit `4d6501dce0` upstream. If a kthread forks (e.g. usermodehelper since commit `1da5c46fa9`) but fails in copy_process() between calling dup_task_struct() and setting p->set_child_tid, then the value of p->set_child_tid will be inherited from the parent and get prematurely freed by free_kthread_struct(). kthread() - worker_thread() - process_one_work() \| - call_usermodehelper_exec_work() \| - kernel_thread() \| - _do_fork() \| - copy_process() \| - dup_task_struct() \| - arch_dup_task_struct() \| - tsk->set_child_tid = current->set_child_tid // implied \| - ... \| - goto bad_fork_* \| - ... \| - free_task(tsk) \| - free_kthread_struct(tsk) \| - kfree(tsk->set_child_tid) - ... - schedule() - __schedule() - wq_worker_sleeping() - kthread_data(task)->flags // UAF The problem started showing up with commit `1da5c46fa9` since it reused ->set_child_tid for the kthread worker data. A better long-term solution might be to get rid of the ->set_child_tid abuse. The comment in set_kthread_struct() also looks slightly wrong. Debugged-by: Jamie Iles <jamie.iles@oracle.com> Fixes: `1da5c46fa9` ("kthread: Make struct kthread kmalloc'ed") Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jamie Iles <jamie.iles@oracle.com> Link: http://lkml.kernel.org/r/20170509073959.17858-1-vegard.nossum@oracle.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Amir Goldstein	a5505a656f	ovl: fix creds leak in copy up error path commit `8137ae26d2` upstream. Fixes: `42f269b925` ("ovl: rearrange code in ovl_copy_up_locked()") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Gilad Ben-Yossef	f59fdb278e	crypto: gcm - wait for crypto op not signal safe commit `f3ad587070` upstream. crypto_gcm_setkey() was using wait_for_completion_interruptible() to wait for completion of async crypto op but if a signal occurs it may return before DMA ops of HW crypto provider finish, thus corrupting the data buffer that is kfree'ed in this case. Resolve this by using wait_for_completion() instead. Reported-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Gilad Ben-Yossef	1286652e80	crypto: drbg - wait for crypto op not signal safe commit `a5dfefb1c3` upstream. drbg_kcapi_sym_ctr() was using wait_for_completion_interruptible() to wait for completion of async crypto op but if a signal occurs it may return before DMA ops of HW crypto provider finish, thus corrupting the output buffer. Resolve this by using wait_for_completion() instead. Reported-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Eric Biggers	7e4c3a6da3	KEYS: encrypted: avoid encrypting/decrypting stack buffers commit `e9ff56ac35` upstream. Since v4.9, the crypto API cannot (normally) be used to encrypt/decrypt stack buffers because the stack may be virtually mapped. Fix this for the padding buffers in encrypted-keys by using ZERO_PAGE for the encryption padding and by allocating a temporary heap buffer for the decryption padding. Tested with CONFIG_DEBUG_SG=y: keyctl new_session keyctl add user master "abcdefghijklmnop" @s keyid=$(keyctl add encrypted desc "new user:master 25" @s) datablob="$(keyctl pipe $keyid)" keyctl unlink $keyid keyid=$(keyctl add encrypted desc "load $datablob" @s) datablob2="$(keyctl pipe $keyid)" [ "$datablob" = "$datablob2" ] && echo "Success!" Cc: Andy Lutomirski <luto@kernel.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Eric Biggers	8a28f221a0	KEYS: fix freeing uninitialized memory in key_update() commit `63a0b0509e` upstream. key_update() freed the key_preparsed_payload even if it was not initialized first. This would cause a crash if userspace called keyctl_update() on a key with type like "asymmetric" that has a ->preparse() method but not an ->update() method. Possibly it could even be triggered for other key types by racing with keyctl_setperm() to make the KEY_NEED_WRITE check fail (the permission was already checked, so normally it wouldn't fail there). Reproducer with key type "asymmetric", given a valid cert.der: keyctl new_session keyid=$(keyctl padd asymmetric desc @s < cert.der) keyctl setperm $keyid 0x3f000000 keyctl update $keyid data [ 150.686666] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001 [ 150.687601] IP: asymmetric_key_free_kids+0x12/0x30 [ 150.688139] PGD 38a3d067 [ 150.688141] PUD 3b3de067 [ 150.688447] PMD 0 [ 150.688745] [ 150.689160] Oops: 0000 [#1] SMP [ 150.689455] Modules linked in: [ 150.689769] CPU: 1 PID: 2478 Comm: keyctl Not tainted 4.11.0-rc4-xfstests-00187-ga9f6b6b8cd2f #742 [ 150.690916] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014 [ 150.692199] task: ffff88003b30c480 task.stack: ffffc90000350000 [ 150.692952] RIP: 0010:asymmetric_key_free_kids+0x12/0x30 [ 150.693556] RSP: 0018:ffffc90000353e58 EFLAGS: 00010202 [ 150.694142] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000004 [ 150.694845] RDX: ffffffff81ee3920 RSI: ffff88003d4b0700 RDI: 0000000000000001 [ 150.697569] RBP: ffffc90000353e60 R08: ffff88003d5d2140 R09: 0000000000000000 [ 150.702483] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [ 150.707393] R13: 0000000000000004 R14: ffff880038a4d2d8 R15: 000000000040411f [ 150.709720] FS: 00007fcbcee35700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000 [ 150.711504] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 150.712733] CR2: 0000000000000001 CR3: 0000000039eab000 CR4: 00000000003406e0 [ 150.714487] Call Trace: [ 150.714975] asymmetric_key_free_preparse+0x2f/0x40 [ 150.715907] key_update+0xf7/0x140 [ 150.716560] ? key_default_cmp+0x20/0x20 [ 150.717319] keyctl_update_key+0xb0/0xe0 [ 150.718066] SyS_keyctl+0x109/0x130 [ 150.718663] entry_SYSCALL_64_fastpath+0x1f/0xc2 [ 150.719440] RIP: 0033:0x7fcbce75ff19 [ 150.719926] RSP: 002b:00007ffd5d167088 EFLAGS: 00000206 ORIG_RAX: 00000000000000fa [ 150.720918] RAX: ffffffffffffffda RBX: 0000000000404d80 RCX: 00007fcbce75ff19 [ 150.721874] RDX: 00007ffd5d16785e RSI: 000000002866cd36 RDI: 0000000000000002 [ 150.722827] RBP: 0000000000000006 R08: 000000002866cd36 R09: 00007ffd5d16785e [ 150.723781] R10: 0000000000000004 R11: 0000000000000206 R12: 0000000000404d80 [ 150.724650] R13: 00007ffd5d16784d R14: 00007ffd5d167238 R15: 000000000040411f [ 150.725447] Code: 83 c4 08 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 85 ff 74 23 55 48 89 e5 53 48 89 fb <48> 8b 3f e8 06 21 c5 ff 48 8b 7b 08 e8 fd 20 c5 ff 48 89 df e8 [ 150.727489] RIP: asymmetric_key_free_kids+0x12/0x30 RSP: ffffc90000353e58 [ 150.728117] CR2: 0000000000000001 [ 150.728430] ---[ end trace f7f8fe1da2d5ae8d ]--- Fixes: `4d8c0250b8` ("KEYS: Call ->free_preparse() even after ->preparse() returns an error") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Eric Biggers	5def69023a	KEYS: fix dereferencing NULL payload with nonzero length commit `5649645d72` upstream. sys_add_key() and the KEYCTL_UPDATE operation of sys_keyctl() allowed a NULL payload with nonzero length to be passed to the key type's ->preparse(), ->instantiate(), and/or ->update() methods. Various key types including asymmetric, cifs.idmap, cifs.spnego, and pkcs7_test did not handle this case, allowing an unprivileged user to trivially cause a NULL pointer dereference (kernel oops) if one of these key types was present. Fix it by doing the copy_from_user() when 'plen' is nonzero rather than when '_payload' is non-NULL, causing the syscall to fail with EFAULT as expected when an invalid buffer is specified. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Gilad Ben-Yossef	e423898fd8	crypto: asymmetric_keys - handle EBUSY due to backlog correctly commit `e68368aed5` upstream. public_key_verify_signature() was passing the CRYPTO_TFM_REQ_MAY_BACKLOG flag to akcipher_request_set_callback() but was not handling correctly the case where a -EBUSY error could be returned from the call to crypto_akcipher_verify() if backlog was used, possibly casuing data corruption due to use-after-free of buffers. Resolve this by handling -EBUSY correctly. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Murali Karicheri	c10acee89d	ARM: dts: keystone-k2l: fix broken Ethernet due to disabled OSR commit `791229f1d5` upstream. Ethernet networking on K2L has been broken since v4.11-rc1. This was caused by commit `32a34441a9` ("ARM: keystone: dts: fix netcp clocks and add names"). This commit inadvertently moves on-chip static RAM clock to the end of list of clocks provided for netcp. Since keystone PM domain support does not have a list of recognized con_ids, only the first clock in the list comes under runtime PM management. This means the OSR (On-chip Static RAM) clock remains disabled and that broke networking on K2L. The OSR is used by QMSS on K2L as an external linking RAM. However this is a standalone RAM that can be used for non-QMSS usage (as well as from DSP side). So add a SRAM device node for the same and add the OSR clock to the node. Remove the now redundant OSR clock node from netcp. To manage all clocks defined for netCP's use by runtime PM needs keystone generic power domain (genpd) driver support which is under works. Meanwhile, this patch restores K2L networking and is correct irrespective of any future genpd work since OSR is an independent module and not part of NetCP anyway. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Tero Kristo <t-kristo@ti.com> [nsekhar@ti.com: commit message updates, port to latest mainline] Signed-off-by: Sekhar Nori <nsekhar@ti.com> Acked-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Eric W. Biederman	ff6c1649b4	ptrace: Properly initialize ptracer_cred on fork commit `c70d9d809f` upstream. When I introduced ptracer_cred I failed to consider the weirdness of fork where the task_struct copies the old value by default. This winds up leaving ptracer_cred set even when a process forks and the child process does not wind up being ptraced. Because ptracer_cred is not set on non-ptraced processes whose parents were ptraced this has broken the ability of the enlightenment window manager to start setuid children. Fix this by properly initializing ptracer_cred in ptrace_init_task This must be done with a little bit of care to preserve the current value of ptracer_cred when ptrace carries through fork. Re-reading the ptracer_cred from the ptracing process at this point is inconsistent with how PT_PTRACE_CAP has been maintained all of these years. Tested-by: Takashi Iwai <tiwai@suse.de> Fixes: `64b875f7ac` ("ptrace: Capture the ptracer's creds not PT_PTRACE_CAP") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Ralph Sennhauser <ralph.sennhauser@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Lucas Stach	1c8faeebd5	serial: core: fix crash in uart_suspend_port commit `88e2582e90` upstream. With serdev we might end up with serial ports that have no cdev exported to userspace, as they are used as the bus interface to other devices. In that case serial_match_port() won't be able to find a matching tty_dev. Skip the irq wakeup enabling in that case, as serdev will make sure to keep the port active, as long as there are devices depending on it. Fixes: `8ee3fde047` (tty_port: register tty ports with serdev bus) Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Johan Hovold	70876e7c02	serial: ifx6x60: fix use-after-free on module unload commit `1e948479b3` upstream. Make sure to deregister the SPI driver before releasing the tty driver to avoid use-after-free in the SPI remove callback where the tty devices are deregistered. Fixes: `72d4724ea5` ("serial: ifx6x60: Add modem power off function in the platform reboot process") Cc: Jun Chen <jun.d.chen@intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Jan Kiszka	7faf3f5f89	serial: exar: Fix stuck MSIs commit `2c0ac5b48a` upstream. After migrating 8250_exar to MSI in `172c33cb61`, we can get stuck without further interrupts because of the special wake-up event these chips send. They are only cleared by reading INT0. As we fail to do so during startup and shutdown, we can leave the interrupt line asserted, which is fatal with edge-triggered MSIs. Add the required reading of INT0 to startup and shutdown. Also account for the fact that a pending wake-up interrupt means we have to return 1 from exar_handle_irq. Drop the unneeded reading of INT1..3 along with this - those never reset anything. An alternative approach would have been disabling the wake-up interrupt. Unfortunately, this feature (REGB[17] = 1) is not available on the XR17D15X. Fixes: `172c33cb61` ("serial: exar: Enable MSI support") Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Luis Henriques	bc0734aac3	ftrace: Fix memory leak in ftrace_graph_release() commit `f9797c2f20` upstream. ftrace_hash is being kfree'ed in ftrace_graph_release(), however the ->buckets field is not. This results in a memory leak that is easily captured by kmemleak: unreferenced object 0xffff880038afe000 (size 8192): comm "trace-cmd", pid 238, jiffies 4294916898 (age 9.736s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff815f561e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff8113964d>] __kmalloc+0x12d/0x1a0 [<ffffffff810bf6d1>] alloc_ftrace_hash+0x51/0x80 [<ffffffff810c0523>] __ftrace_graph_open.isra.39.constprop.46+0xa3/0x100 [<ffffffff810c05e8>] ftrace_graph_open+0x68/0xa0 [<ffffffff8114003d>] do_dentry_open.isra.1+0x1bd/0x2d0 [<ffffffff81140df7>] vfs_open+0x47/0x60 [<ffffffff81150f95>] path_openat+0x2a5/0x1020 [<ffffffff81152d6a>] do_filp_open+0x8a/0xf0 [<ffffffff811411df>] do_sys_open+0x12f/0x200 [<ffffffff811412ce>] SyS_open+0x1e/0x20 [<ffffffff815fa6e0>] entry_SYSCALL_64_fastpath+0x13/0x94 [<ffffffffffffffff>] 0xffffffffffffffff Link: http://lkml.kernel.org/r/20170525152038.7661-1-lhenriques@suse.com Fixes: `b9b0c831be` ("ftrace: Convert graph filter to use hash tables") Signed-off-by: Luis Henriques <lhenriques@suse.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Jane Chu	1a223727ae	arch/sparc: support NR_CPUS = 4096 [ Upstream commit `c79a13734d` ] Linux SPARC64 limits NR_CPUS to 4064 because init_cpu_send_mondo_info() only allocates a single page for NR_CPUS mondo entries. Thus we cannot use all 4096 CPUs on some SPARC platforms. To fix, allocate (2^order) pages where order is set according to the size of cpu_list for possible cpus. Since cpu_list_pa and cpu_mondo_block_pa are not used in asm code, there are no imm13 offsets from the base PA that will break because they can only reach one page. Orabug: 25505750 Signed-off-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Atish Patra <atish.patra@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	10e3a2945e	sparc64: delete old wrap code [ Upstream commit `0197e41ce7` ] The old method that is using xcall and softint to get new context id is deleted, as it is replaced by a method of using per_cpu_secondary_mm without xcall to perform the context wrap. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	b60d9051cc	sparc64: new context wrap [ Upstream commit `a0582f26ec` ] The current wrap implementation has a race issue: it is called outside of the ctx_alloc_lock, and also does not wait for all CPUs to complete the wrap. This means that a thread can get a new context with a new version and another thread might still be running with the same context. The problem is especially severe on CPUs with shared TLBs, like sun4v. I used the following test to very quickly reproduce the problem: - start over 8K processes (must be more than context IDs) - write and read values at a memory location in every process. Very quickly memory corruptions start happening, and what we read back does not equal what we wrote. Several approaches were explored before settling on this one: Approach 1: Move smp_new_mmu_context_version() inside ctx_alloc_lock, and wait for every process to complete the wrap. (Note: every CPU must WAIT before leaving smp_new_mmu_context_version_client() until every one arrives). This approach ends up with deadlocks, as some threads own locks which other threads are waiting for, and they never receive softint until these threads exit smp_new_mmu_context_version_client(). Since we do not allow the exit, deadlock happens. Approach 2: Handle wrap right during mondo interrupt. Use etrap/rtrap to enter into into C code, and issue new versions to every CPU. This approach adds some overhead to runtime: in switch_mm() we must add some checks to make sure that versions have not changed due to wrap while we were loading the new secondary context. (could be protected by PSTATE_IE but that degrades performance as on M7 and older CPUs as it takes 50 cycles for each access). Also, we still need a global per-cpu array of MMs to know where we need to load new contexts, otherwise we can change context to a thread that is going way (if we received mondo between switch_mm() and switch_to() time). Finally, there are some issues with window registers in rtrap() when context IDs are changed during CPU mondo time. The approach in this patch is the simplest and has almost no impact on runtime. We use the array with mm's where last secondary contexts were loaded onto CPUs and bump their versions to the new generation without changing context IDs. If a new process comes in to get a context ID, it will go through get_new_mmu_context() because of version mismatch. But the running processes do not need to be interrupted. And wrap is quicker as we do not need to xcall and wait for everyone to receive and complete wrap. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	be5e2c7026	sparc64: add per-cpu mm of secondary contexts [ Upstream commit `7a5b4bbf49` ] The new wrap is going to use information from this array to figure out mm's that currently have valid secondary contexts setup. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	d5fb553c51	sparc64: redefine first version [ Upstream commit `c4415235b2` ] CTX_FIRST_VERSION defines the first context version, but also it defines first context. This patch redefines it to only include the first context version. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	557145f44c	sparc64: combine activate_mm and switch_mm [ Upstream commit `14d0334c67` ] The only difference between these two functions is that in activate_mm we unconditionally flush context. However, there is no need to keep this difference after fixing a bug where cpumask was not reset on a wrap. So, in this patch we combine these. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	aa6349e030	sparc64: reset mm cpumask after wrap [ Upstream commit `5889748573` ] After a wrap (getting a new context version) a process must get a new context id, which means that we would need to flush the context id from the TLB before running for the first time with this ID on every CPU. But, we use mm_cpumask to determine if this process has been running on this CPU before, and this mask is not reset after a wrap. So, there are two possible fixes for this issue: 1. Clear mm cpumask whenever mm gets a new context id 2. Unconditionally flush context every time process is running on a CPU This patch implements the first solution Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Liam R. Howlett	c8c7bb2f5b	sparc/mm/hugepages: Fix setup_hugepagesz for invalid values. [ Upstream commit `f322980b74` ] hugetlb_bad_size needs to be called on invalid values. Also change the pr_warn to a pr_err to better align with other platforms. Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
James Clarke	a091e625ed	sparc: Machine description indices can vary [ Upstream commit `c982aa9c30` ] VIO devices were being looked up by their index in the machine description node block, but this often varies over time as devices are added and removed. Instead, store the ID and look up using the type, config handle and ID. Signed-off-by: James Clarke <jrtc27@jrtc27.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112541 Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Mike Kravetz	376452d48e	sparc64: mm: fix copy_tsb to correctly copy huge page TSBs [ Upstream commit `654f480762` ] When a TSB grows beyond its current capacity, a new TSB is allocated and copy_tsb is called to copy entries from the old TSB to the new. A hash shift based on page size is used to calculate the index of an entry in the TSB. copy_tsb has hard coded PAGE_SHIFT in these calculations. However, for huge page TSBs the value REAL_HPAGE_SHIFT should be used. As a result, when copy_tsb is called for a huge page TSB the entries are placed at the incorrect index in the newly allocated TSB. When doing hardware table walk, the MMU does not match these entries and we end up in the TSB miss handling code. This code will then create and write an entry to the correct index in the TSB. We take a performance hit for the table walk miss and recreation of these entries. Pass a new parameter to copy_tsb that is the page size shift to be used when copying the TSB. Suggested-by: Anthony Yznaga <anthony.yznaga@oracle.com> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
David S. Miller	35b284c739	sparc64: Add __multi3 for gcc 7.x and later. [ Upstream commit `1b4af13ff2` ] Reported-by: Waldemar Brodkorb <wbx@openadk.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Niklas Cassel	6bf499d388	net: stmmac: fix completely hung TX when using TSO [ Upstream commit `426849e661` ] stmmac_tso_allocator can fail to set the Last Descriptor bit on a descriptor that actually was the last descriptor. This happens when the buffer of the last descriptor ends up having a size of exactly TSO_MAX_BUFF_SIZE. When the IP eventually reaches the next last descriptor, which actually has the bit set, the DMA will hang. When the DMA hangs, we get a tx timeout, however, since stmmac does not do a complete reset of the IP in stmmac_tx_timeout, we end up in a state with completely hung TX. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Max Filippov	d44184237a	net: ethoc: enable NAPI before poll may be scheduled [ Upstream commit `d220b942a4` ] ethoc_reset enables device interrupts, ethoc_interrupt may schedule a NAPI poll before NAPI is enabled in the ethoc_open, which results in device being unable to send or receive anything until it's closed and reopened. In case the device is flooded with ingress packets it may be unable to recover at all. Move napi_enable above ethoc_reset in the ethoc_open to fix that. Fixes: `a170285772` ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Tobias Klauser <tklauser@distanz.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Nikolay Aleksandrov	9719e31a96	net: bridge: fix a null pointer dereference in br_afspec [ Upstream commit `1020ce3108` ] We might call br_afspec() with p == NULL which is a valid use case if the action is on the bridge device itself, but the bridge tunnel code dereferences the p pointer without checking, so check if p is null first. Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Fixes: `efa5356b0d` ("bridge: per vlan dst_metadata netlink support") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Eugeniu Rosca	d673c7641c	ravb: Fix use-after-free on `ifconfig eth0 down` [ Upstream commit `79514ef670` ] Commit `a47b70ea86` ("ravb: unmap descriptors when freeing rings") has introduced the issue seen in [1] reproduced on H3ULCB board. Fix this by relocating the RX skb ringbuffer free operation, so that swiotlb page unmapping can be done first. Freeing of aligned TX buffers is not relevant to the issue seen in [1]. Still, reposition TX free calls as well, to have all kfree() operations performed consistently _after_ dma_unmap_()/dma_free_(). [1] Console screenshot with the problem reproduced: salvator-x login: root root@salvator-x:~# ifconfig eth0 up Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: \ attached PHY driver [Micrel KSZ9031 Gigabit PHY] \ (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=235) IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready root@salvator-x:~# root@salvator-x:~# ifconfig eth0 down ================================================================== BUG: KASAN: use-after-free in swiotlb_tbl_unmap_single+0xc4/0x35c Write of size 1538 at addr ffff8006d884f780 by task ifconfig/1649 CPU: 0 PID: 1649 Comm: ifconfig Not tainted 4.12.0-rc4-00004-g112eb07287d1 #32 Hardware name: Renesas H3ULCB board based on r8a7795 (DT) Call trace: [<ffff20000808f11c>] dump_backtrace+0x0/0x3a4 [<ffff20000808f4d4>] show_stack+0x14/0x1c [<ffff20000865970c>] dump_stack+0xf8/0x150 [<ffff20000831f8b0>] print_address_description+0x7c/0x330 [<ffff200008320010>] kasan_report+0x2e0/0x2f4 [<ffff20000831eac0>] check_memory_region+0x20/0x14c [<ffff20000831f054>] memcpy+0x48/0x68 [<ffff20000869ed50>] swiotlb_tbl_unmap_single+0xc4/0x35c [<ffff20000869fcf4>] unmap_single+0x90/0xa4 [<ffff20000869fd14>] swiotlb_unmap_page+0xc/0x14 [<ffff2000080a2974>] __swiotlb_unmap_page+0xcc/0xe4 [<ffff2000088acdb8>] ravb_ring_free+0x514/0x870 [<ffff2000088b25dc>] ravb_close+0x288/0x36c [<ffff200008aaf8c4>] __dev_close_many+0x14c/0x174 [<ffff200008aaf9b4>] __dev_close+0xc8/0x144 [<ffff200008ac2100>] __dev_change_flags+0xd8/0x194 [<ffff200008ac221c>] dev_change_flags+0x60/0xb0 [<ffff200008ba2dec>] devinet_ioctl+0x484/0x9d4 [<ffff200008ba7b78>] inet_ioctl+0x190/0x194 [<ffff200008a78c44>] sock_do_ioctl+0x78/0xa8 [<ffff200008a7a128>] sock_ioctl+0x110/0x3c4 [<ffff200008365a70>] vfs_ioctl+0x90/0xa0 [<ffff200008365dbc>] do_vfs_ioctl+0x148/0xc38 [<ffff2000083668f0>] SyS_ioctl+0x44/0x74 [<ffff200008083770>] el0_svc_naked+0x24/0x28 The buggy address belongs to the page: page:ffff7e001b6213c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x4000000000000000() raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff raw: 0000000000000000 ffff7e001b6213e0 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8006d884f680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff8006d884f700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff8006d884f780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff8006d884f800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff8006d884f880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Disabling lock debugging due to kernel taint root@salvator-x:~# Fixes: `a47b70ea86` ("ravb: unmap descriptors when freeing rings") Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Richard Haines	4990610df0	net/ipv6: Fix CALIPSO causing GPF with datagram support [ Upstream commit `e3ebdb20fd` ] When using CALIPSO with IPPROTO_UDP it is possible to trigger a GPF as the IP header may have moved. Also update the payload length after adding the CALIPSO option. Signed-off-by: Richard Haines <richard_c_haines@btinternet.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Huw Davies <huw@codeweavers.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Eric Dumazet	76d36dd1b1	net: ping: do not abuse udp_poll() [ Upstream commit `77d4b1d369` ] Alexander reported various KASAN messages triggered in recent kernels The problem is that ping sockets should not use udp_poll() in the first place, and recent changes in UDP stack finally exposed this old bug. Fixes: `c319b4d76b` ("net: ipv4: add IPPROTO_ICMP socket kind") Fixes: `6d0bfe2261` ("net: ipv6: Add IPv6 support to the ping socket.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sasha Levin <alexander.levin@verizon.com> Cc: Solar Designer <solar@openwall.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Lorenzo Colitti <lorenzo@google.com> Acked-By: Lorenzo Colitti <lorenzo@google.com> Tested-By: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Florian Fainelli	223313c9be	net: dsa: Fix stale cpu_switch reference after unbind then bind [ Upstream commit `b07ac98946` ] Commit `9520ed8fb8` ("net: dsa: use cpu_switch instead of ds[0]") replaced the use of dst->ds[0] with dst->cpu_switch since that is functionally equivalent, however, we can now run into an use after free scenario after unbinding then rebinding the switch driver. The use after free happens because we do correctly initialize dst->cpu_switch the first time we probe in dsa_cpu_parse(), then we unbind the driver: dsa_dst_unapply() is called, and we rebind again. dst->cpu_switch now points to a freed "ds" structure, and so when we finally dereference it in dsa_cpu_port_ethtool_setup(), we oops. To fix this, simply set dst->cpu_switch to NULL in dsa_dst_unapply() which guarantees that we always correctly re-assign dst->cpu_switch in dsa_cpu_parse(). Fixes: `9520ed8fb8` ("net: dsa: use cpu_switch instead of ds[0]") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
David S. Miller	a32f7198b2	ipv6: Fix leak in ipv6_gso_segment(). [ Upstream commit `e3e86b5119` ] If ip6_find_1stfragopt() fails and we return an error we have to free up 'segs' because nobody else is going to. Fixes: `2423496af3` ("ipv6: Prevent overrun when parsing v6 header options") Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Eric Garver	f7f87871c4	geneve: fix needed_headroom and max_mtu for collect_metadata [ Upstream commit `9a1c44d989` ] Since commit `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") when using COLLECT_METADATA geneve devices are created with too small of a needed_headroom and too large of a max_mtu. This is because ip_tunnel_info_af() is not valid with the device level info when using COLLECT_METADATA and we mistakenly fall into the IPv4 case. For COLLECT_METADATA, always use the worst case of ipv6 since both sockets are created. Fixes: `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") Signed-off-by: Eric Garver <e@erig.me> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Soheil Hassas Yeganeh	19456d4526	sock: reset sk_err when the error queue is empty [ Upstream commit `38b257938a` ] Prior to `f5f99309fa` (sock: do not set sk_err in sock_dequeue_err_skb), sk_err was reset to the error of the skb on the head of the error queue. Applications, most notably ping, are relying on this behavior to reset sk_err for ICMP packets. Set sk_err to the ICMP error when there is an ICMP packet at the head of the error queue. Fixes: `f5f99309fa` (sock: do not set sk_err in sock_dequeue_err_skb) Reported-by: Cyril Hrubis <chrubis@suse.cz> Tested-by: Cyril Hrubis <chrubis@suse.cz> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Liam McBirnie	3ab7563eed	ip6_tunnel: fix traffic class routing for tunnels [ Upstream commit `5f733ee68f` ] ip6_route_output() requires that the flowlabel contains the traffic class for policy routing. Commit `0e9a709560` ("ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets") removed the code which previously added the traffic class to the flowlabel. The traffic class is added here because only route lookup needs the flowlabel to contain the traffic class. Fixes: `0e9a709560` ("ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets") Signed-off-by: Liam McBirnie <liam.mcbirnie@boeing.com> Acked-by: Peter Dawson <peter.a.dawson@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Mark Bloch	35e08f6e2b	vxlan: fix use-after-free on deletion [ Upstream commit `a53cb29b0a` ] Adding a vxlan interface to a socket isn't symmetrical, while adding is done in vxlan_open() the deletion is done in vxlan_dellink(). This can cause a use-after-free error when we close the vxlan interface before deleting it. We add vxlan_vs_del_dev() to match vxlan_vs_add_dev() and call it from vxlan_stop() to match the call from vxlan_open(). Fixes: `56ef9c909b` ("vxlan: Move socket initialization to within rtnl scope") Acked-by: Jiri Benc <jbenc@redhat.com> Tested-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Yuchung Cheng	905ae1b1e6	tcp: disallow cwnd undo when switching congestion control [ Upstream commit `44abafc4cc` ] When the sender switches its congestion control during loss recovery, if the recovery is spurious then it may incorrectly revert cwnd and ssthresh to the older values set by a previous congestion control. Consider a congestion control (like BBR) that does not use ssthresh and keeps it infinite: the connection may incorrectly revert cwnd to an infinite value when switching from BBR to another congestion control. This patch fixes it by disallowing such cwnd undo operation upon switching congestion control. Note that undo_marker is not reset s.t. the packets that were incorrectly marked lost would be corrected. We only avoid undoing the cwnd in tcp_undo_cwnd_reduction(). Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Ganesh Goudar	fb62fe105b	cxgb4: avoid enabling napi twice to the same queue [ Upstream commit `e7519f9926` ] Take uld mutex to avoid race between cxgb_up() and cxgb4_register_uld() to enable napi for the same uld queue. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Ben Hutchings	d935063ac4	ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt() [ Upstream commit `6e80ac5cc9` ] xfrm6_find_1stfragopt() may now return an error code and we must not treat it as a length. Fixes: `2423496af3` ("ipv6: Prevent overrun when parsing v6 header options") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Florian Fainelli	ce45e4f995	net: systemport: Fix missing Wake-on-LAN interrupt for SYSTEMPORT Lite [ Upstream commit `d31353cd75` ] On SYSTEMPORT Lite, since we have the main interrupt source in the first cell, the second cell is the Wake-on-LAN interrupt, yet the code was not properly updated to fetch the second cell, and instead looked at the third and non-existing cell for Wake-on-LAN. Fixes: `44a4524c54` ("net: systemport: Add support for SYSTEMPORT Lite") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Lance Richardson	1f9f0e2822	vxlan: eliminate cached dst leak [ Upstream commit `35cf284556` ] After commit `0c1d70af92` ("net: use dst_cache for vxlan device"), cached dst entries could be leaked when more than one remote was present for a given vxlan_fdb entry, causing subsequent netns operations to block indefinitely and "unregister_netdevice: waiting for lo to become free." messages to appear in the kernel log. Fix by properly releasing cached dst and freeing resources in this case. Fixes: `0c1d70af92` ("net: use dst_cache for vxlan device") Signed-off-by: Lance Richardson <lrichard@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Nikolay Aleksandrov	8b8fd1831b	net: bridge: start hello timer only if device is up [ Upstream commit `aeb073241f` ] When the transition of NO_STP -> KERNEL_STP was fixed by always calling mod_timer in br_stp_start, it introduced a new regression which causes the timer to be armed even when the bridge is down, and since we stop the timers in its ndo_stop() function, they never get disabled if the device is destroyed before it's upped. To reproduce: $ while :; do ip l add br0 type bridge hello_time 100; brctl stp br0 on; ip l del br0; done; CC: Xin Long <lucien.xin@gmail.com> CC: Ivan Vecera <cera@cera.cz> CC: Sebastian Ott <sebott@linux.vnet.ibm.com> Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Fixes: `6d18c732b9` ("bridge: start hello_timer when enabling KERNEL_STP in br_stp_start") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:32 +02:00
Mintz, Yuval	f5f67e441e	bnx2x: Fix Multi-Cos [ Upstream commit `3968d38917` ] Apparently multi-cos isn't working for bnx2x quite some time - driver implements ndo_select_queue() to allow queue-selection for FCoE, but the regular L2 flow would cause it to modulo the fallback's result by the number of queues. The fallback would return a queue matching the needed tc [via __skb_tx_hash()], but since the modulo is by the number of TSS queues where number of TCs is not accounted, transmission would always be done by a queue configured into using TC0. Fixes: `ada7c19e6d` ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:32 +02:00
Greg Kroah-Hartman	553c942bef	Linux 4.11.4	2017-06-07 12:10:31 +02:00
Jan Kara	b5ff97c774	xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff() commit `d7fd24257a` upstream. There is an off-by-one error in loop termination conditions in xfs_find_get_desired_pgoff() since 'end' may index a page beyond end of desired range if 'endoff' is page aligned. It doesn't have any visible effects but still it is good to fix it. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:17 +02:00
Eric Sandeen	d514c634a4	xfs: fix unaligned access in xfs_btree_visit_blocks commit `a4d768e702` upstream. This structure copy was throwing unaligned access warnings on sparc64: Kernel unaligned access at TPC[1043c088] xfs_btree_visit_blocks+0x88/0xe0 [xfs] xfs_btree_copy_ptrs does a memcpy, which avoids it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:17 +02:00
Darrick J. Wong	fff5729ce2	xfs: avoid mount-time deadlock in CoW extent recovery commit `3ecb3ac7b9` upstream. If a malicious user corrupts the refcount btree to cause a cycle between different levels of the tree, the next mount attempt will deadlock in the CoW recovery routine while grabbing buffer locks. We can use the ability to re-grab a buffer that was previous locked to a transaction to avoid deadlocks, so do that here. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:16 +02:00
Christoph Hellwig	ecb4261526	xfs: xfs_trans_alloc_empty This is a partial cherry-pick of commit `e89c041338` ("xfs: implement the GETFSMAP ioctl"), which also adds this helper, and a great example of why feature patches should be properly split into their parts. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> [hch: split from the larger patch for -stable] Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-07 12:10:16 +02:00
Zorro Lang	2e08bd63dc	xfs: bad assertion for delalloc an extent that start at i_size commit `892d2a5f70` upstream. By run fsstress long enough time enough in RHEL-7, I find an assertion failure (harder to reproduce on linux-4.11, but problem is still there): XFS: Assertion failed: (iflags & BMV_IF_DELALLOC) != 0, file: fs/xfs/xfs_bmap_util.c The assertion is in xfs_getbmap() funciton: if (map[i].br_startblock == DELAYSTARTBLOCK && --> map[i].br_startoff <= XFS_B_TO_FSB(mp, XFS_ISIZE(ip))) ASSERT((iflags & BMV_IF_DELALLOC) != 0); When map[i].br_startoff == XFS_B_TO_FSB(mp, XFS_ISIZE(ip)), the startoff is just at EOF. But we only need to make sure delalloc extents that are within EOF, not include EOF. Signed-off-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:16 +02:00
Darrick J. Wong	2dc6e27120	xfs: BMAPX shouldn't barf on inline-format directories commit `6eadbf4c8b` upstream. When we're fulfilling a BMAPX request, jump out early if the data fork is in local format. This prevents us from hitting a debugging check in bmapi_read and barfing errors back to userspace. The on-disk extent count check later isn't sufficient for IF_DELALLOC mode because da extents are in memory and not on disk. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:16 +02:00
Brian Foster	1ae26380c8	xfs: fix indlen accounting error on partial delalloc conversion commit `0daaecacb8` upstream. The delalloc -> real block conversion path uses an incorrect calculation in the case where the middle part of a delalloc extent is being converted. This is documented as a rare situation because XFS generally attempts to maximize contiguity by converting as much of a delalloc extent as possible. If this situation does occur, the indlen reservation for the two new delalloc extents left behind by the conversion of the middle range is calculated and compared with the original reservation. If more blocks are required, the delta is allocated from the global block pool. This delta value can be characterized as the difference between the new total requirement (temp + temp2) and the currently available reservation minus those blocks that have already been allocated (startblockval(PREV.br_startblock) - allocated). The problem is that the current code does not account for previously allocated blocks correctly. It subtracts the current allocation count from the (new - old) delta rather than the old indlen reservation. This means that more indlen blocks than have been allocated end up stashed in the remaining extents and free space accounting is broken as a result. Fix up the calculation to subtract the allocated block count from the original extent indlen and thus correctly allocate the reservation delta based on the difference between the new total requirement and the unused blocks from the original reservation. Also remove a bogus assert that contradicts the fact that the new indlen reservation can be larger than the original indlen reservation. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:16 +02:00
Eryu Guan	d1a8ae21c3	xfs: fix use-after-free in xfs_finish_page_writeback commit `161f55efba` upstream. Commit `28b783e47a` ("xfs: bufferhead chains are invalid after end_page_writeback") fixed one use-after-free issue by pre-calculating the loop conditionals before calling bh->b_end_io() in the end_io processing loop, but it assigned 'next' pointer before checking end offset boundary & breaking the loop, at which point the bh might be freed already, and caused use-after-free. This is caught by KASAN when running fstests generic/127 on sub-page block size XFS. [ 2517.244502] run fstests generic/127 at 2017-04-27 07:30:50 [ 2747.868840] ================================================================== [ 2747.876949] BUG: KASAN: use-after-free in xfs_destroy_ioend+0x3d3/0x4e0 [xfs] at addr ffff8801395ae698 ... [ 2747.918245] Call Trace: [ 2747.920975] dump_stack+0x63/0x84 [ 2747.924673] kasan_object_err+0x21/0x70 [ 2747.928950] kasan_report+0x271/0x530 [ 2747.933064] ? xfs_destroy_ioend+0x3d3/0x4e0 [xfs] [ 2747.938409] ? end_page_writeback+0xce/0x110 [ 2747.943171] __asan_report_load8_noabort+0x19/0x20 [ 2747.948545] xfs_destroy_ioend+0x3d3/0x4e0 [xfs] [ 2747.953724] xfs_end_io+0x1af/0x2b0 [xfs] [ 2747.958197] process_one_work+0x5ff/0x1000 [ 2747.962766] worker_thread+0xe4/0x10e0 [ 2747.966946] kthread+0x2d3/0x3d0 [ 2747.970546] ? process_one_work+0x1000/0x1000 [ 2747.975405] ? kthread_create_on_node+0xc0/0xc0 [ 2747.980457] ? syscall_return_slowpath+0xe6/0x140 [ 2747.985706] ? do_page_fault+0x30/0x80 [ 2747.989887] ret_from_fork+0x2c/0x40 [ 2747.993874] Object at ffff8801395ae690, in cache buffer_head size: 104 [ 2748.001155] Allocated: [ 2748.003782] PID = 8327 [ 2748.006411] save_stack_trace+0x1b/0x20 [ 2748.010688] save_stack+0x46/0xd0 [ 2748.014383] kasan_kmalloc+0xad/0xe0 [ 2748.018370] kasan_slab_alloc+0x12/0x20 [ 2748.022648] kmem_cache_alloc+0xb8/0x1b0 [ 2748.027024] alloc_buffer_head+0x22/0xc0 [ 2748.031399] alloc_page_buffers+0xd1/0x250 [ 2748.035968] create_empty_buffers+0x30/0x410 [ 2748.040730] create_page_buffers+0x120/0x1b0 [ 2748.045493] __block_write_begin_int+0x17a/0x1800 [ 2748.050740] iomap_write_begin+0x100/0x2f0 [ 2748.055308] iomap_zero_range_actor+0x253/0x5c0 [ 2748.060362] iomap_apply+0x157/0x270 [ 2748.064347] iomap_zero_range+0x5a/0x80 [ 2748.068624] iomap_truncate_page+0x6b/0xa0 [ 2748.073227] xfs_setattr_size+0x1f7/0xa10 [xfs] [ 2748.078312] xfs_vn_setattr_size+0x68/0x140 [xfs] [ 2748.083589] xfs_file_fallocate+0x4ac/0x820 [xfs] [ 2748.088838] vfs_fallocate+0x2cf/0x780 [ 2748.093021] SyS_fallocate+0x48/0x80 [ 2748.097006] do_syscall_64+0x18a/0x430 [ 2748.101186] return_from_SYSCALL_64+0x0/0x6a [ 2748.105948] Freed: [ 2748.108189] PID = 8327 [ 2748.110816] save_stack_trace+0x1b/0x20 [ 2748.115093] save_stack+0x46/0xd0 [ 2748.118788] kasan_slab_free+0x73/0xc0 [ 2748.122969] kmem_cache_free+0x7a/0x200 [ 2748.127247] free_buffer_head+0x41/0x80 [ 2748.131524] try_to_free_buffers+0x178/0x250 [ 2748.136316] xfs_vm_releasepage+0x2e9/0x3d0 [xfs] [ 2748.141563] try_to_release_page+0x100/0x180 [ 2748.146325] invalidate_inode_pages2_range+0x7da/0xcf0 [ 2748.152087] xfs_shift_file_space+0x37d/0x6e0 [xfs] [ 2748.157557] xfs_collapse_file_space+0x49/0x120 [xfs] [ 2748.163223] xfs_file_fallocate+0x2a7/0x820 [xfs] [ 2748.168462] vfs_fallocate+0x2cf/0x780 [ 2748.172642] SyS_fallocate+0x48/0x80 [ 2748.176629] do_syscall_64+0x18a/0x430 [ 2748.180810] return_from_SYSCALL_64+0x0/0x6a Fixed it by checking on offset against end & breaking out first, dereference bh only if there're still bufferheads to process. Signed-off-by: Eryu Guan <eguan@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Darrick J. Wong	365625a670	xfs: reserve enough blocks to handle btree splits when remapping commit `fe0be23e68` upstream. In xfs_reflink_end_cow, we erroneously reserve only enough blocks to handle adding 1 extent. This is problematic if we fragment free space, have to do CoW, and then have to perform multiple bmap btree expansions. Furthermore, the BUI recovery routine doesn't reserve /any/ blocks to handle btree splits, so log recovery fails after our first error causes the filesystem to go down. Therefore, refactor the transaction block reservation macros until we have a macro that works for our deferred (re)mapping activities, and fix both problems by using that macro. With 1k blocks we can hit this fairly often in g/187 if the scratch fs is big enough. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Brian Foster	67e439ccfe	xfs: wait on new inodes during quotaoff dquot release commit `e20c8a517f` upstream. The quotaoff operation has a race with inode allocation that results in a livelock. An inode allocation that occurs before the quota status flags are updated acquires the appropriate dquots for the inode via xfs_qm_vop_dqalloc(). It then inserts the XFS_INEW inode into the perag radix tree, sometime later attaches the dquots to the inode and finally clears the XFS_INEW flag. Quotaoff expects to release the dquots from all inodes in the filesystem via xfs_qm_dqrele_all_inodes(). This invokes the AG inode iterator, which skips inodes in the XFS_INEW state because they are not fully constructed. If the scan occurs after dquots have been attached to an inode, but before XFS_INEW is cleared, the newly allocated inode will continue to hold a reference to the applicable dquots. When quotaoff invokes xfs_qm_dqpurge_all(), the reference count of those dquot(s) remain elevated and the dqpurge scan spins indefinitely. To address this problem, update the xfs_qm_dqrele_all_inodes() scan to wait on inodes marked on the XFS_INEW state. We wait on the inodes explicitly rather than skip and retry to avoid continuous retry loops due to a parallel inode allocation workload. Since quotaoff updates the quota state flags and uses a synchronous transaction before the dqrele scan, and dquots are attached to inodes after radix tree insertion iff quota is enabled, one INEW waiting pass through the AG guarantees that the scan has processed all inodes that could possibly hold dquot references. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Brian Foster	f8c68633bb	xfs: update ag iterator to support wait on new inodes commit `ae2c4ac2dd` upstream. The AG inode iterator currently skips new inodes as such inodes are inserted into the inode radix tree before they are fully constructed. Certain contexts require the ability to wait on the construction of new inodes, however. The fs-wide dquot release from the quotaoff sequence is an example of this. Update the AG inode iterator to support the ability to wait on inodes flagged with XFS_INEW upon request. Create a new xfs_inode_ag_iterator_flags() interface and support a set of iteration flags to modify the iteration behavior. When the XFS_AGITER_INEW_WAIT flag is set, include XFS_INEW flags in the radix tree inode lookup and wait on them before the callback is executed. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Brian Foster	56aab3095b	xfs: support ability to wait on new inodes commit `756baca27f` upstream. Inodes that are inserted into the perag tree but still under construction are flagged with the XFS_INEW bit. Most contexts either skip such inodes when they are encountered or have the ability to handle them. The runtime quotaoff sequence introduces a context that must wait for construction of such inodes to correctly ensure that all dquots in the fs are released. In anticipation of this, support the ability to wait on new inodes. Wake the appropriate bit when XFS_INEW is cleared. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Brian Foster	bf16242614	xfs: fix up quotacheck buffer list error handling commit `20e8a06378` upstream. The quotacheck error handling of the delwri buffer list assumes the resident buffers are locked and doesn't clear the _XBF_DELWRI_Q flag on the buffers that are dequeued. This can lead to assert failures on buffer release and possibly other locking problems. Move this code to a delwri queue cancel helper function to encapsulate the logic required to properly release buffers from a delwri queue. Update the helper to clear the delwri queue flag and call it from quotacheck. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Brian Foster	89aab40e12	xfs: prevent multi-fsb dir readahead from reading random blocks commit `cb52ee334a` upstream. Directory block readahead uses a complex iteration mechanism to map between high-level directory blocks and underlying physical extents. This mechanism attempts to traverse the higher-level dir blocks in a manner that handles multi-fsb directory blocks and simultaneously maintains a reference to the corresponding physical blocks. This logic doesn't handle certain (discontiguous) physical extent layouts correctly with multi-fsb directory blocks. For example, consider the case of a 4k FSB filesystem with a 2 FSB (8k) directory block size and a directory with the following extent layout: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL 0: [0..7]: 88..95 0 (88..95) 8 1: [8..15]: 80..87 0 (80..87) 8 2: [16..39]: 168..191 0 (168..191) 24 3: [40..63]: 5242952..5242975 1 (72..95) 24 Directory block 0 spans physical extents 0 and 1, dirblk 1 lies entirely within extent 2 and dirblk 2 spans extents 2 and 3. Because extent 2 is larger than the directory block size, the readahead code erroneously assumes the block is contiguous and issues a readahead based on the physical mapping of the first fsb of the dirblk. This results in read verifier failure and a spurious corruption or crc failure, depending on the filesystem format. Further, the subsequent readahead code responsible for walking through the physical table doesn't correctly advance the physical block reference for dirblk 2. Instead of advancing two physical filesystem blocks, the first iteration of the loop advances 1 block (correctly), but the subsequent iteration advances 2 more physical blocks because the next physical extent (extent 3, above) happens to cover more than dirblk 2. At this point, the higher-level directory block walking is completely off the rails of the actual physical layout of the directory for the respective mapping table. Update the contiguous dirblock logic to consider the current offset in the physical extent to avoid issuing directory readahead to unrelated blocks. Also, update the mapping table advancing code to consider the current offset within the current dirblock to avoid advancing the mapping reference too far beyond the dirblock. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Eric Sandeen	91bb4f7da5	xfs: handle array index overrun in xfs_dir2_leaf_readbuf() commit `023cc840b4` upstream. Carlos had a case where "find" seemed to start spinning forever and never return. This was on a filesystem with non-default multi-fsb (8k) directory blocks, and a fragmented directory with extents like this: 0:[0,133646,2,0] 1:[2,195888,1,0] 2:[3,195890,1,0] 3:[4,195892,1,0] 4:[5,195894,1,0] 5:[6,195896,1,0] 6:[7,195898,1,0] 7:[8,195900,1,0] 8:[9,195902,1,0] 9:[10,195908,1,0] 10:[11,195910,1,0] 11:[12,195912,1,0] 12:[13,195914,1,0] ... i.e. the first extent is a contiguous 2-fsb dir block, but after that it is fragmented into 1 block extents. At the top of the readdir path, we allocate a mapping array which (for this filesystem geometry) can hold 10 extents; see the assignment to map_info->map_size. During readdir, we are therefore able to map extents 0 through 9 above into the array for readahead purposes. If we count by 2, we see that the last mapped index (9) is the first block of a 2-fsb directory block. At the end of xfs_dir2_leaf_readbuf() we have 2 loops to fill more readahead; the outer loop assumes one full dir block is processed each loop iteration, and an inner loop that ensures that this is so by advancing to the next extent until a full directory block is mapped. The problem is that this inner loop may step past the last extent in the mapping array as it tries to reach the end of the directory block. This will read garbage for the extent length, and as a result the loop control variable 'j' may become corrupted and never fail the loop conditional. The number of valid mappings we have in our array is stored in map->map_valid, so stop this inner loop based on that limit. There is an ASSERT at the top of the outer loop for this same condition, but we never made it out of the inner loop, so the ASSERT never fired. Huge appreciation for Carlos for debugging and isolating the problem. Debugged-and-analyzed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Tested-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Christoph Hellwig	23da04dcc3	xfs: fix integer truncation in xfs_bmap_remap_alloc commit `52813fb13f` upstream. bno should be a xfs_fsblock_t, which is 64-bit wides instead of a xfs_aglock_t, which truncates the value to 32 bits. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Brian Foster	eff615a670	xfs: drop iolock from reclaim context to appease lockdep commit `3b4683c294` upstream. Lockdep complains about use of the iolock in inode reclaim context because it doesn't understand that reclaim has the last reference to the inode, and thus an iolock->reclaim->iolock deadlock is not possible. The iolock is technically not necessary in xfs_inactive() and was only added to appease an assert in xfs_free_eofblocks(), which can be called from other non-reclaim contexts. Therefore, just kill the assert and drop the use of the iolock from reclaim context to quiet lockdep. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Darrick J. Wong	8a1f785887	xfs: actually report xattr extents via iomap commit `84358536dc` upstream. Apparently FIEMAP for xattrs has been broken since we switched to the iomap backend because of an incorrect check for xattr presence. Also fix the broken locking. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Darrick J. Wong	1c862f5a3e	xfs: fix over-copying of getbmap parameters from userspace commit `be6324c00c` upstream. In xfs_ioc_getbmap, we should only copy the fields of struct getbmap from userspace, or else we end up copying random stack contents into the kernel. struct getbmap is a strict subset of getbmapx, so a partial structure copy should work fine. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Brian Foster	2cd6bd867a	xfs: use dedicated log worker wq to avoid deadlock with cil wq commit `696a562072` upstream. The log covering background task used to be part of the xfssyncd workqueue. That workqueue was removed as of commit `5889608df` ("xfs: syncd workqueue is no more") and the associated work item scheduled to the xfs-log wq. The latter is used for log buffer I/O completion. Since xfs_log_worker() can invoke a log flush, a deadlock is possible between the xfs-log and xfs-cil workqueues. Consider the following codepath from xfs_log_worker(): xfs_log_worker() xfs_log_force() _xfs_log_force() xlog_cil_force() xlog_cil_force_lsn() xlog_cil_push_now() flush_work() The above is in xfs-log wq context and blocked waiting on the completion of an xfs-cil work item. Concurrently, the cil push in progress can end up blocked here: xlog_cil_push_work() xlog_cil_push() xlog_write() xlog_state_get_iclog_space() xlog_wait(&log->l_flush_wait, ...) The above is in xfs-cil context waiting on log buffer I/O completion, which executes in xfs-log wq context. In this scenario both workqueues are deadlocked waiting on eachother. Add a new workqueue specifically for the high level log covering and ail pushing worker, as was the case prior to commit `5889608df`. Diagnosed-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Eryu Guan	a19348b729	xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff() commit `8affebe16d` upstream. xfs_find_get_desired_pgoff() is used to search for offset of hole or data in page range [index, end] (both inclusive), and the max number of pages to search should be at least one, if end == index. Otherwise the only page is missed and no hole or data is found, which is not correct. When block size is smaller than page size, this can be demonstrated by preallocating a file with size smaller than page size and writing data to the last block. E.g. run this xfs_io command on a 1k block size XFS on x86_64 host. # xfs_io -fc "falloc 0 3k" -c "pwrite 2k 1k" \ -c "seek -d 0" /mnt/xfs/testfile wrote 1024/1024 bytes at offset 2048 1 KiB, 1 ops; 0.0000 sec (33.675 MiB/sec and 34482.7586 ops/sec) Whence Result DATA EOF Data at offset 2k was missed, and lseek(2) returned ENXIO. This is uncovered by generic/285 subtest 07 and 08 on ppc64 host, where pagesize is 64k. Because a recent change to generic/285 reduced the preallocated file size to smaller than 64k. Signed-off-by: Eryu Guan <eguan@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
Brian Foster	0364c225a5	xfs: use ->b_state to fix buffer I/O accounting release race commit `63db7c815b` upstream. We've had user reports of unmount hangs in xfs_wait_buftarg() that analysis shows is due to btp->bt_io_count == -1. bt_io_count represents the count of in-flight asynchronous buffers and thus should always be >= 0. xfs_wait_buftarg() waits for this value to stabilize to zero in order to ensure that all untracked (with respect to the lru) buffers have completed I/O processing before unmount proceeds to tear down in-core data structures. The value of -1 implies an I/O accounting decrement race. Indeed, the fact that xfs_buf_ioacct_dec() is called from xfs_buf_rele() (where the buffer lock is no longer held) means that bp->b_flags can be updated from an unsafe context. While a user-level reproducer is currently not available, some intrusive hacks to run racing buffer lookups/ioacct/releases from multiple threads was used to successfully manufacture this problem. Existing callers do not expect to acquire the buffer lock from xfs_buf_rele(). Therefore, we can not safely update ->b_flags from this context. It turns out that we already have separate buffer state bits and associated serialization for dealing with buffer LRU state in the form of ->b_state and ->b_lock. Therefore, replace the _XBF_IN_FLIGHT flag with a ->b_state variant, update the I/O accounting wrappers appropriately and make sure they are used with the correct locking. This ensures that buffer in-flight state can be modified at buffer release time without racing with modifications from a buffer lock holder. Fixes: `9c7504aa72` ("xfs: track and serialize in-flight async buffers against unmount") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Tested-by: Libor Pechacek <lpechacek@suse.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
Jan Kara	34836549fb	xfs: Fix missed holes in SEEK_HOLE implementation commit `5375023ae1` upstream. XFS SEEK_HOLE implementation could miss a hole in an unwritten extent as can be seen by the following command: xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k" -c "seek -h 0" file wrote 57344/57344 bytes at offset 0 56 KiB, 14 ops; 0.0000 sec (49.312 MiB/sec and 12623.9856 ops/sec) wrote 8192/8192 bytes at offset 131072 8 KiB, 2 ops; 0.0000 sec (70.383 MiB/sec and 18018.0180 ops/sec) Whence Result HOLE 139264 Where we can see that hole at offset 56k was just ignored by SEEK_HOLE implementation. The bug is in xfs_find_get_desired_pgoff() which does not properly detect the case when pages are not contiguous. Fix the problem by properly detecting when found page has larger offset than expected. Fixes: `d126d43f63` Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
Patrik Jakobsson	4ec50822a5	drm/gma500/psb: Actually use VBT mode when it is found commit `82bc9a42cf` upstream. With LVDS we were incorrectly picking the pre-programmed mode instead of the prefered mode provided by VBT. Make sure we pick the VBT mode if one is provided. It is likely that the mode read-out code is still wrong but this patch fixes the immediate problem on most machines. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78562 Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170418114332.12183-1-patrik.r.jakobsson@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
Thomas Gleixner	74b4db844f	slub/memcg: cure the brainless abuse of sysfs attributes commit `478fe3037b` upstream. memcg_propagate_slab_attrs() abuses the sysfs attribute file functions to propagate settings from the root kmem_cache to a newly created kmem_cache. It does that with: attr->show(root, buf); attr->store(new, buf, strlen(bug); Aside of being a lazy and absurd hackery this is broken because it does not check the return value of the show() function. Some of the show() functions return 0 w/o touching the buffer. That means in such a case the store function is called with the stale content of the previous show(). That causes nonsense like invoking kmem_cache_shrink() on a newly created kmem_cache. In the worst case it would cause handing in an uninitialized buffer. This should be rewritten proper by adding a propagate() callback to those slub_attributes which must be propagated and avoid that insane conversion to and from ASCII, but that's too large for a hot fix. Check at least the return value of the show() function, so calling store() with stale content is prevented. Steven said: "It can cause a deadlock with get_online_cpus() that has been uncovered by recent cpu hotplug and lockdep changes that Thomas and Peter have been doing. Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(cpu_hotplug.lock); lock(slab_mutex); lock(cpu_hotplug.lock); lock(slab_mutex); * DEADLOCK *" Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705201244540.2255@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
Andrea Arcangeli	6caa2db34d	ksm: prevent crash after write_protect_page fails commit `a7306c3436` upstream. "err" needs to be left set to -EFAULT if split_huge_page succeeds. Otherwise if "err" gets clobbered with zero and write_protect_page fails, try_to_merge_one_page() will succeed instead of returning -EFAULT and then try_to_merge_with_ksm_page() will continue thinking kpage is a PageKsm when in fact it's still an anonymous page. Eventually it'll crash in page_add_anon_rmap. This has been reproduced on Fedora25 kernel but I can reproduce with upstream too. The bug was introduced in commit `f765f54059` ("ksm: prepare to new THP semantics") introduced in v4.5. page:fffff67546ce1cc0 count:4 mapcount:2 mapping:ffffa094551e36e1 index:0x7f0f46673 flags: 0x2ffffc0004007c(referenced\|uptodate\|dirty\|lru\|active\|swapbacked) page dumped because: VM_BUG_ON_PAGE(!PageLocked(page)) page->mem_cgroup:ffffa09674bf0000 ------------[ cut here ]------------ kernel BUG at mm/rmap.c:1222! CPU: 1 PID: 76 Comm: ksmd Not tainted 4.9.3-200.fc25.x86_64 #1 RIP: do_page_add_anon_rmap+0x1c4/0x240 Call Trace: page_add_anon_rmap+0x18/0x20 try_to_merge_with_ksm_page+0x50b/0x780 ksm_scan_thread+0x1211/0x1410 ? prepare_to_wait_event+0x100/0x100 ? try_to_merge_with_ksm_page+0x780/0x780 kthread+0xd9/0xf0 ? kthread_park+0x60/0x60 ret_from_fork+0x25/0x30 Fixes: `f765f54059` ("ksm: prepare to new THP semantics") Link: http://lkml.kernel.org/r/20170513131040.21732-1-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Federico Simoncelli <fsimonce@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
Rob Landley	d6eaf7a4d6	x86/boot: Use CROSS_COMPILE prefix for readelf commit `3780578761` upstream. The boot code Makefile contains a straight 'readelf' invocation. This causes build warnings in cross compile environments, when there is no unprefixed readelf accessible via $PATH. Add the missing $(CROSS_COMPILE) prefix. [ tglx: Rewrote changelog ] Fixes: `98f7852537` ("x86/boot: Refuse to build with data relocations") Signed-off-by: Rob Landley <rob@landley.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Paul Bolle <pebolle@tiscali.nl> Cc: "H.J. Lu" <hjl.tools@gmail.com> Link: http://lkml.kernel.org/r/ced18878-693a-9576-a024-113ef39a22c0@landley.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
Mike Marciniszyn	f7e82ab3b6	RDMA/qib,hfi1: Fix MR reference count leak on write with immediate commit `1feb40067c` upstream. The handling of IB_RDMA_WRITE_ONLY_WITH_IMMEDIATE will leak a memory reference when a buffer cannot be allocated for returning the immediate data. The issue is that the rkey validation has already occurred and the RNR nak fails to release the reference that was fruitlessly gotten. The the peer will send the identical single packet request when its RNR timer pops. The fix is to release the held reference prior to the rnr nak exit. This is the only sequence the requires both rkey validation and the buffer allocation on the same packet. Tested-by: Tadeusz Struk <tadeusz.struk@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
Israel Rukshin	68c98967e7	RDMA/srp: Fix NULL deref at srp_destroy_qp() commit `95c2ef50c7` upstream. If srp_init_qp() fails at srp_create_ch_ib() then ch->send_cq may be NULL. Calling directly to ib_destroy_qp() is sufficient because no work requests were posted on the created qp. Fixes: `9294000d6d` ("IB/srp: Drain the send queue before destroying a QP") Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>-- Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
Michal Hocko	dbab023245	mm: consider memblock reservations for deferred memory initialization sizing commit `864b9a393d` upstream. We have seen an early OOM killer invocation on ppc64 systems with crashkernel=4096M: kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL\|__GFP_COMP\|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0 kthreadd cpuset=/ mems_allowed=7 CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1 Call Trace: dump_stack+0xb0/0xf0 (unreliable) dump_header+0xb0/0x258 out_of_memory+0x5f0/0x640 __alloc_pages_nodemask+0xa8c/0xc80 kmem_getpages+0x84/0x1a0 fallback_alloc+0x2a4/0x320 kmem_cache_alloc_node+0xc0/0x2e0 copy_process.isra.25+0x260/0x1b30 _do_fork+0x94/0x470 kernel_thread+0x48/0x60 kthreadd+0x264/0x330 ret_from_kernel_thread+0x5c/0xa4 Mem-Info: active_anon:0 inactive_anon:0 isolated_anon:0 active_file:0 inactive_file:0 isolated_file:0 unevictable:0 dirty:0 writeback:0 unstable:0 slab_reclaimable:5 slab_unreclaimable:73 mapped:0 shmem:0 pagetables:0 bounce:0 free:0 free_pcp:0 free_cma:0 Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes lowmem_reserve[]: 0 0 0 0 Node 7 DMA: 064kB 0128kB 0256kB 0512kB 01024kB 02048kB 04096kB 08192kB 0*16384kB = 0kB 0 total pagecache pages 0 pages in swap cache Swap cache stats: add 0, delete 0, find 0/0 Free swap = 0kB Total swap = 0kB 819200 pages RAM 0 pages HighMem/MovableOnly 817481 pages reserved 0 pages cma reserved 0 pages hwpoisoned the reason is that the managed memory is too low (only 110MB) while the rest of the the 50GB is still waiting for the deferred intialization to be done. update_defer_init estimates the initial memoty to initialize to 2GB at least but it doesn't consider any memory allocated in that range. In this particular case we've had Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB) so the low 2GB is mostly depleted. Fix this by considering memblock allocations in the initial static initialization estimation. Move the max_initialise to reset_deferred_meminit and implement a simple memblock_reserved_memory helper which iterates all reserved blocks and sums the size of all that start below the given address. The cumulative size is than added on top of the initial estimation. This is still not ideal because reset_deferred_meminit doesn't consider holes and so reservation might be above the initial estimation whihch we ignore but let's make the logic simpler until we really need to handle more complicated cases. Fixes: `3a80a7fa79` ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@suse.de> Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
James Morse	1cc8926344	mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified commit `9a291a7c94` upstream. KVM uses get_user_pages() to resolve its stage2 faults. KVM sets the FOLL_HWPOISON flag causing faultin_page() to return -EHWPOISON when it finds a VM_FAULT_HWPOISON. KVM handles these hwpoison pages as a special case. (check_user_page_hwpoison()) When huge pages are involved, this doesn't work so well. get_user_pages() calls follow_hugetlb_page(), which stops early if it receives VM_FAULT_HWPOISON from hugetlb_fault(), eventually returning -EFAULT to the caller. The step to map this to -EHWPOISON based on the FOLL_ flags is missing. The hwpoison special case is skipped, and -EFAULT is returned to user-space, causing Qemu or kvmtool to exit. Instead, move this VM_FAULT_ to errno mapping code into a header file and use it from faultin_page() and follow_hugetlb_page(). With this, KVM works as expected. This isn't a problem for arm64 today as we haven't enabled MEMORY_FAILURE, but I can't see any reason this doesn't happen on x86 too, so I think this should be a fix. This doesn't apply earlier than stable's v4.11.1 due to all sorts of cleanup. [james.morse@arm.com: add vm_fault_to_errno() call to faultin_page()] suggested. Link: http://lkml.kernel.org/r/20170525171035.16359-1-james.morse@arm.com [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170524160900.28786-1-james.morse@arm.com Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
Yisheng Xie	f814bf4655	mlock: fix mlock count can not decrease in race condition commit `70feee0e1e` upstream. Kefeng reported that when running the follow test, the mlock count in meminfo will increase permanently: [1] testcase linux:~ # cat test_mlockal grep Mlocked /proc/meminfo for j in `seq 0 10` do for i in `seq 4 15` do ./p_mlockall >> log & done sleep 0.2 done # wait some time to let mlock counter decrease and 5s may not enough sleep 5 grep Mlocked /proc/meminfo linux:~ # cat p_mlockall.c #include <sys/mman.h> #include <stdlib.h> #include <stdio.h> #define SPACE_LEN 4096 int main(int argc, char ** argv) { int ret; void *adr = malloc(SPACE_LEN); if (!adr) return -1; ret = mlockall(MCL_CURRENT \| MCL_FUTURE); printf("mlcokall ret = %d\n", ret); ret = munlockall(); printf("munlcokall ret = %d\n", ret); free(adr); return 0; } In __munlock_pagevec() we should decrement NR_MLOCK for each page where we clear the PageMlocked flag. Commit `1ebb7cc6a5` ("mm: munlock: batch NR_MLOCK zone state updates") has introduced a bug where we don't decrement NR_MLOCK for pages where we clear the flag, but fail to isolate them from the lru list (e.g. when the pages are on some other cpu's percpu pagevec). Since PageMlocked stays cleared, the NR_MLOCK accounting gets permanently disrupted by this. Fix it by counting the number of page whose PageMlock flag is cleared. Fixes: `1ebb7cc6a5` (" mm: munlock: batch NR_MLOCK zone state updates") Link: http://lkml.kernel.org/r/1495678405-54569-1-git-send-email-xieyisheng1@huawei.com Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Joern Engel <joern@logfs.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michel Lespinasse <walken@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: zhongjiang <zhongjiang@huawei.com> Cc: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
Punit Agrawal	a0189db30d	mm/migrate: fix refcount handling when !hugepage_migration_supported() commit `30809f559a` upstream. On failing to migrate a page, soft_offline_huge_page() performs the necessary update to the hugepage ref-count. But when !hugepage_migration_supported() , unmap_and_move_hugepage() also decrements the page ref-count for the hugepage. The combined behaviour leaves the ref-count in an inconsistent state. This leads to soft lockups when running the overcommitted hugepage test from mce-tests suite. Soft offlining pfn 0x83ed600 at process virtual address 0x400000000000 soft offline: 0x83ed600: migration failed 1, type 1fffc00000008008 (uptodate\|head) INFO: rcu_preempt detected stalls on CPUs/tasks: Tasks blocked on level-0 rcu_node (CPUs 0-7): P2715 (detected by 7, t=5254 jiffies, g=963, c=962, q=321) thugetlb_overco R running task 0 2715 2685 0x00000008 Call trace: dump_backtrace+0x0/0x268 show_stack+0x24/0x30 sched_show_task+0x134/0x180 rcu_print_detail_task_stall_rnp+0x54/0x7c rcu_check_callbacks+0xa74/0xb08 update_process_times+0x34/0x60 tick_sched_handle.isra.7+0x38/0x70 tick_sched_timer+0x4c/0x98 __hrtimer_run_queues+0xc0/0x300 hrtimer_interrupt+0xac/0x228 arch_timer_handler_phys+0x3c/0x50 handle_percpu_devid_irq+0x8c/0x290 generic_handle_irq+0x34/0x50 __handle_domain_irq+0x68/0xc0 gic_handle_irq+0x5c/0xb0 Address this by changing the putback_active_hugepage() in soft_offline_huge_page() to putback_movable_pages(). This only triggers on systems that enable memory failure handling (ARCH_SUPPORTS_MEMORY_FAILURE) but not hugepage migration (!ARCH_ENABLE_HUGEPAGE_MIGRATION). I imagine this wasn't triggered as there aren't many systems running this configuration. [akpm@linux-foundation.org: remove dead comment, per Naoya] Link: http://lkml.kernel.org/r/20170525135146.32011-1-punit.agrawal@arm.com Reported-by: Manoj Iyer <manoj.iyer@canonical.com> Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Suggested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Ross Zwisler	bdaac1fe71	dax: fix race between colliding PMD & PTE entries commit `e2093926a0` upstream. We currently have two related PMD vs PTE races in the DAX code. These can both be easily triggered by having two threads reading and writing simultaneously to the same private mapping, with the key being that private mapping reads can be handled with PMDs but private mapping writes are always handled with PTEs so that we can COW. Here is the first race: CPU 0 CPU 1 (private mapping write) __handle_mm_fault() create_huge_pmd() - FALLBACK handle_pte_fault() passes check for pmd_devmap() (private mapping read) __handle_mm_fault() create_huge_pmd() dax_iomap_pmd_fault() inserts PMD dax_iomap_pte_fault() does a PTE fault, but we already have a DAX PMD installed in our page tables at this spot. Here's the second race: CPU 0 CPU 1 (private mapping read) __handle_mm_fault() passes check for pmd_none() create_huge_pmd() dax_iomap_pmd_fault() inserts PMD (private mapping write) __handle_mm_fault() create_huge_pmd() - FALLBACK (private mapping read) __handle_mm_fault() passes check for pmd_none() create_huge_pmd() handle_pte_fault() dax_iomap_pte_fault() inserts PTE dax_iomap_pmd_fault() inserts PMD, but we already have a PTE at this spot. The core of the issue is that while there is isolation between faults to the same range in the DAX fault handlers via our DAX entry locking, there is no isolation between faults in the code in mm/memory.c. This means for instance that this code in __handle_mm_fault() can run: if (pmd_none(vmf.pmd) && transparent_hugepage_enabled(vma)) { ret = create_huge_pmd(&vmf); But by the time we actually get to run the fault handler called by create_huge_pmd(), the PMD is no longer pmd_none() because a racing PTE fault has installed a normal PMD here as a parent. This is the cause of the 2nd race. The first race is similar - there is the following check in handle_pte_fault(): } else { / See comment in pte_alloc_one_map() / if (pmd_devmap(vmf->pmd) \|\| pmd_trans_unstable(vmf->pmd)) return 0; So if a pmd_devmap() PMD (a DAX PMD) has been installed at vmf->pmd, we will bail and retry the fault. This is correct, but there is nothing preventing the PMD from being installed after this check but before we actually get to the DAX PTE fault handlers. In my testing these races result in the following types of errors: BUG: Bad rss-counter state mm:ffff8800a817d280 idx:1 val:1 BUG: non-zero nr_ptes on freeing mm: 15 Fix this issue by having the DAX fault handlers verify that it is safe to continue their fault after they have taken an entry lock to block other racing faults. [ross.zwisler@linux.intel.com: improve fix for colliding PMD & PTE entries] Link: http://lkml.kernel.org/r/20170526195932.32178-1-ross.zwisler@linux.intel.com Link: http://lkml.kernel.org/r/20170522215749.23516-2-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: Pawel Lebioda <pawel.lebioda@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Pawel Lebioda <pawel.lebioda@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Xiong Zhou <xzhou@redhat.com> Cc: Eryu Guan <eguan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Ross Zwisler	b16e6ab5ad	mm: avoid spurious 'bad pmd' warning messages commit `d0f0931de9` upstream. When the pmd_devmap() checks were added by `5c7fb56e5e` ("mm, dax: dax-pmd vs thp-pmd vs hugetlbfs-pmd") to add better support for DAX huge pages, they were all added to the end of if() statements after existing pmd_trans_huge() checks. So, things like: - if (pmd_trans_huge(pmd)) + if (pmd_trans_huge(pmd) \|\| pmd_devmap(pmd)) When further checks were added after pmd_trans_unstable() checks by commit `7267ec008b` ("mm: postpone page table allocation until we have page to map") they were also added at the end of the conditional: + if (pmd_trans_unstable(fe->pmd) \|\| pmd_devmap(fe->pmd)) This ordering is fine for pmd_trans_huge(), but doesn't work for pmd_trans_unstable(). This is because DAX huge pages trip the bad_pmd() check inside of pmd_none_or_trans_huge_or_clear_bad() (called by pmd_trans_unstable()), which prints out a warning and returns 1. So, we do end up doing the right thing, but only after spamming dmesg with suspicious looking messages: mm/pgtable-generic.c:39: bad pmd ffff8808daa49b88(84000001006000a5) Reorder these checks in a helper so that pmd_devmap() is checked first, avoiding the error messages, and add a comment explaining why the ordering is important. Fixes: commit `7267ec008b` ("mm: postpone page table allocation until we have page to map") Link: http://lkml.kernel.org/r/20170522215749.23516-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Pawel Lebioda <pawel.lebioda@intel.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Xiong Zhou <xzhou@redhat.com> Cc: Eryu Guan <eguan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Tetsuo Handa	de12c73fa2	mm/page_alloc.c: make sure OOM victim can try allocations with no watermarks once commit `c288983ddd` upstream. Roman Gushchin has reported that the OOM killer can trivially selects next OOM victim when a thread doing memory allocation from page fault path was selected as first OOM victim. allocate invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE\|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0 allocate cpuset=/ mems_allowed=0 CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: oom_kill_process+0x219/0x3e0 out_of_memory+0x11d/0x480 __alloc_pages_slowpath+0xc84/0xd40 __alloc_pages_nodemask+0x245/0x260 alloc_pages_vma+0xa2/0x270 __handle_mm_fault+0xca9/0x10c0 handle_mm_fault+0xf3/0x210 __do_page_fault+0x240/0x4e0 trace_do_page_fault+0x37/0xe0 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 ... Out of memory: Kill process 492 (allocate) score 899 or sacrifice child Killed process 492 (allocate) total-vm:2052368kB, anon-rss:1894576kB, file-rss:4kB, shmem-rss:0kB allocate: page allocation failure: order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE\|__GFP_ZERO), nodemask=(null) allocate cpuset=/ mems_allowed=0 CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: __alloc_pages_slowpath+0xd32/0xd40 __alloc_pages_nodemask+0x245/0x260 alloc_pages_vma+0xa2/0x270 __handle_mm_fault+0xca9/0x10c0 handle_mm_fault+0xf3/0x210 __do_page_fault+0x240/0x4e0 trace_do_page_fault+0x37/0xe0 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 ... oom_reaper: reaped process 492 (allocate), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB ... allocate invoked oom-killer: gfp_mask=0x0(), nodemask=(null), order=0, oom_score_adj=0 allocate cpuset=/ mems_allowed=0 CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: oom_kill_process+0x219/0x3e0 out_of_memory+0x11d/0x480 pagefault_out_of_memory+0x68/0x80 mm_fault_error+0x8f/0x190 ? handle_mm_fault+0xf3/0x210 __do_page_fault+0x4b2/0x4e0 trace_do_page_fault+0x37/0xe0 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 ... Out of memory: Kill process 233 (firewalld) score 10 or sacrifice child Killed process 233 (firewalld) total-vm:246076kB, anon-rss:20956kB, file-rss:0kB, shmem-rss:0kB There is a race window that the OOM reaper completes reclaiming the first victim's memory while nothing but mutex_trylock() prevents the first victim from calling out_of_memory() from pagefault_out_of_memory() after memory allocation for page fault path failed due to being selected as an OOM victim. This is a side effect of commit `9a67f6488e` ("mm: consolidate GFP_NOFAIL checks in the allocator slowpath") because that commit silently changed the behavior from /* Avoid allocations with no watermarks from looping endlessly / to / * Give up allocations without trying memory reserves if selected * as an OOM victim */ in __alloc_pages_slowpath() by moving the location to check TIF_MEMDIE flag. I have noticed this change but I didn't post a patch because I thought it is an acceptable change other than noise by warn_alloc() because !__GFP_NOFAIL allocations are allowed to fail. But we overlooked that failing memory allocation from page fault path makes difference due to the race window explained above. While it might be possible to add a check to pagefault_out_of_memory() that prevents the first victim from calling out_of_memory() or remove out_of_memory() from pagefault_out_of_memory(), changing pagefault_out_of_memory() does not suppress noise by warn_alloc() when allocating thread was selected as an OOM victim. There is little point with printing similar backtraces and memory information from both out_of_memory() and warn_alloc(). Instead, if we guarantee that current thread can try allocations with no watermarks once when current thread looping inside __alloc_pages_slowpath() was selected as an OOM victim, we can follow "who can use memory reserves" rules and suppress noise by warn_alloc() and prevent memory allocations from page fault path from calling pagefault_out_of_memory(). If we take the comment literally, this patch would do - if (test_thread_flag(TIF_MEMDIE)) - goto nopage; + if (alloc_flags == ALLOC_NO_WATERMARKS \|\| (gfp_mask & __GFP_NOMEMALLOC)) + goto nopage; because gfp_pfmemalloc_allowed() returns false if __GFP_NOMEMALLOC is given. But if I recall correctly (I couldn't find the message), the condition is meant to apply to only OOM victims despite the comment. Therefore, this patch preserves TIF_MEMDIE check. Fixes: `9a67f6488e` ("mm: consolidate GFP_NOFAIL checks in the allocator slowpath") Link: http://lkml.kernel.org/r/201705192112.IAF69238.OQOHSJLFOFFMtV@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Roman Gushchin <guro@fb.com> Tested-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Takashi Iwai	af03bb0cab	ALSA: usb: Fix a typo in Tascam US-16x08 mixer element commit `617163fc25` upstream. A mixer element created in a quirk for Tascam US-16x08 contains a typo: it should be "EQ MidLow Q" instead of "EQ MidQLow Q". Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195875 Fixes: `d2bb390a20` ("ALSA: usb-audio: Tascam US-16x08 DSP mixer quirk") Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Takashi Iwai	0be9a9a422	Revert "ALSA: usb-audio: purge needless variable length array" commit `64188cfbe5` upstream. This reverts commit `89b593c30e` ("ALSA: usb-audio: purge needless variable length array"). The patch turned out to cause a severe regression, triggering an Oops at snd_usb_ctl_msg(). It was overseen that snd_usb_ctl_msg() writes back the response to the given buffer, while the patch changed it to a read-only const buffer. (One should always double-check when an extra pointer cast is present...) As a simple fix, just revert the affected commit. It was merely a cleanup. Although it brings VLA again, it's clearer as a fix. We'll address the VLA later in another patch. Fixes: `89b593c30e` ("ALSA: usb-audio: purge needless variable length array") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195875 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Alexander Tsoy	ffb97b001b	ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430 commit `1fc2e41f7a` upstream. This model is actually called 92XXM2-8 in Windows driver. But since pin configs for M22 and M28 are identical, just reuse M22 quirk. Fixes external microphone (tested) and probably docking station ports (not tested). Signed-off-by: Alexander Tsoy <alexander@tsoy.me> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Takashi Iwai	0c4afdc6d8	ALSA: hda - No loopback on ALC299 codec commit `fa16b69f12` upstream. ALC299 has no loopback mixer, but the driver still tries to add a beep control over the mixer NID which leads to the error at accessing it. This patch fixes it by properly declaring mixer_nid=0 for this codec. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195775 Fixes: `28f1f9b26c` ("ALSA: hda/realtek - Add new codec ID ALC299") Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Nicolas Iooss	d5bc54d0a3	pcmcia: remove left-over %Z format commit `ff5a20169b` upstream. Commit `5b5e0928f7` ("lib/vsprintf.c: remove %Z support") removed some usages of format %Z but forgot "%.2Zx". This makes clang 4.0 reports a -Wformat-extra-args warning because it does not know about %Z. Replace %Z with %z. Link: http://lkml.kernel.org/r/20170520090946.22562-1-nicolas.iooss_linux@m4x.org Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Harald Welte <laforge@gnumonks.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Lyude	2609770993	drm/radeon: Unbreak HPD handling for r600+ commit `3d18e33735` upstream. We end up reading the interrupt register for HPD5, and then writing it to HPD6 which on systems without anything using HPD5 results in permanently disabling hotplug on one of the display outputs after the first time we acknowledge a hotplug interrupt from the GPU. This code is really bad. But for now, let's just fix this. I will hopefully have a large patch series to refactor all of this soon. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lyude <lyude@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Alex Deucher	6b693bbf9c	drm/radeon/ci: disable mclk switching for high refresh rates (v2) commit `58d7e3e427` upstream. Even if the vblank period would allow it, it still seems to be problematic on some cards. v2: fix logic inversion (Nils) bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Alex Deucher	d6ba1a4407	drm/amd/powerplay/smu7: disable mclk switching for high refresh rates commit `2275a3a2fe` upstream. Even if the vblank period would allow it, it still seems to be problematic on some cards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Alex Deucher	662dbfcc66	drm/amd/powerplay/smu7: add vblank check for mclk switching (v2) commit `09be4a5219` upstream. Check to make sure the vblank period is long enough to support mclk switching. v2: drop needless initial assignment (Nils) bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868 Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Ming Lei	a8aa8a0c10	nvme: avoid to use blk_mq_abort_requeue_list() commit `986f75c876` upstream. NVMe may add request into requeue list simply and not kick off the requeue if hw queues are stopped. Then blk_mq_abort_requeue_list() is called in both nvme_kill_queues() and nvme_ns_remove() for dealing with this issue. Unfortunately blk_mq_abort_requeue_list() is absolutely a race maker, for example, one request may be requeued during the aborting. So this patch just calls blk_mq_kick_requeue_list() in nvme_kill_queues() to handle this issue like what nvme_start_queues() does. Now all requests in requeue list when queues are stopped will be handled by blk_mq_kick_requeue_list() when queues are restarted, either in nvme_start_queues() or in nvme_kill_queues(). Reported-by: Zhang Yi <yizhan@redhat.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Ming Lei	20c03f455c	nvme: use blk_mq_start_hw_queues() in nvme_kill_queues() commit `806f026f9b` upstream. Inside nvme_kill_queues(), we have to start hw queues for draining requests in sw queues, .dispatch list and requeue list, so use blk_mq_start_hw_queues() instead of blk_mq_start_stopped_hw_queues() which only run queues if queues are stopped, but the queues may have been started already, for example nvme_start_queues() is called in reset work function. blk_mq_start_hw_queues() run hw queues in current context, instead of running asynchronously like before. Given nvme_kill_queues() is run from either remove context or reset worker context, both are fine to run hw queue directly. And the mutex of namespaces_mutex isn't a problem too becasue nvme_start_freeze() runs hw queue in this way already. Reported-by: Zhang Yi <yizhan@redhat.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Marta Rybczynska	0fe9c55195	nvme-rdma: support devices with queue size < 32 commit `0544f5494a` upstream. In the case of small NVMe-oF queue size (<32) we may enter a deadlock caused by the fact that the IB completions aren't sent waiting for 32 and the send queue will fill up. The error is seen as (using mlx5): [ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273): [ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12 This patch changes the way the signaling is done so that it depends on the queue depth now. The magic define has been removed completely. Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu> Signed-off-by: Samuel Jones <sjones@kalray.eu> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Jason Gerecke	f88d3d6e4f	HID: wacom: Have wacom_tpc_irq guard against possible NULL dereference commit `2ac97f0f66` upstream. The following Smatch complaint was generated in response to commit `2a6cdbd` ("HID: wacom: Introduce new 'touch_input' device"): drivers/hid/wacom_wac.c:1586 wacom_tpc_irq() error: we previously assumed 'wacom->touch_input' could be null (see line 1577) The 'touch_input' and 'pen_input' variables point to the 'struct input_dev' used for relaying touch and pen events to userspace, respectively. If a device does not have a touch interface or pen interface, the associated input variable is NULL. The 'wacom_tpc_irq()' function is responsible for forwarding input reports to a more-specific IRQ handler function. An unknown report could theoretically be mistaken as e.g. a touch report on a device which does not have a touch interface. This can be prevented by only calling the pen/touch functions are called when the pen/touch pointers are valid. Fixes: `2a6cdbd` ("HID: wacom: Introduce new 'touch_input' device") Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> Reviewed-by: Ping Cheng <ping.cheng@wacom.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Bryant G. Ly	8d975ebd0a	ibmvscsis: Fix the incorrect req_lim_delta commit `75dbf2d36f` upstream. The current code is not correctly calculating the req_lim_delta. We want to make sure vscsi->credit is always incremented when we do not send a response for the scsi op. Thus for the case where there is a successfully aborted task we need to make sure the vscsi->credit is incremented. v2 - Moves the original location of the vscsi->credit increment to a better spot. Since if we increment credit, the next command we send back will have increased req_lim_delta. But we probably shouldn't be doing that until the aborted cmd is actually released. Otherwise the client will think that it can send a new command, and we could find ourselves short of command elements. Not likely, but could happen. This patch depends on both: commit `25e7853126` ("ibmvscsis: Do not send aborted task response") commit `98883f1b54` ("ibmvscsis: Clear left-over abort_cmd pointers") Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-by: Michael Cyr <mikecyr@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Bryant G. Ly	e920be8367	ibmvscsis: Clear left-over abort_cmd pointers commit `98883f1b54` upstream. With the addition of ibmvscsis->abort_cmd pointer within commit `25e7853126` ("ibmvscsis: Do not send aborted task response"), make sure to explicitly NULL these pointers when clearing DELAY_SEND flag. Do this for two cases, when getting the new new ibmvscsis descriptor in ibmvscsis_get_free_cmd() and before posting the response completion in ibmvscsis_send_messages(). Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-by: Michael Cyr <mikecyr@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Artem Savkov	1fb66c6aad	scsi: scsi_dh_rdac: Use ctlr directly in rdac_failover_get() commit `0648a07c9b` upstream. rdac_failover_get references struct rdac_controller as ctlr->ms_sdev->handler_data->ctlr for no apparent reason. Besides being inefficient this also introduces a null-pointer dereference as send_mode_select() sets ctlr->ms_sdev to NULL before calling rdac_failover_get(): [ 18.432550] device-mapper: multipath service-time: version 0.3.0 loaded [ 18.436124] BUG: unable to handle kernel NULL pointer dereference at 0000000000000790 [ 18.436129] IP: send_mode_select+0xca/0x560 [ 18.436129] PGD 0 [ 18.436130] P4D 0 [ 18.436130] [ 18.436132] Oops: 0000 [#1] SMP [ 18.436133] Modules linked in: dm_service_time sd_mod dm_multipath amdkfd amd_iommu_v2 radeon(+) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qla2xxx drm serio_raw scsi_transport_fc bnx2 i2c_core dm_mirror dm_region_hash dm_log dm_mod [ 18.436143] CPU: 4 PID: 443 Comm: kworker/u16:2 Not tainted 4.12.0-rc1.1.el7.test.x86_64 #1 [ 18.436144] Hardware name: IBM BladeCenter LS22 -[79013SG]-/Server Blade, BIOS -[L8E164AUS-1.07]- 05/25/2011 [ 18.436145] Workqueue: kmpath_rdacd send_mode_select [ 18.436146] task: ffff880225116a40 task.stack: ffffc90002bd8000 [ 18.436148] RIP: 0010:send_mode_select+0xca/0x560 [ 18.436148] RSP: 0018:ffffc90002bdbda8 EFLAGS: 00010246 [ 18.436149] RAX: 0000000000000000 RBX: ffffc90002bdbe08 RCX: ffff88017ef04a80 [ 18.436150] RDX: ffffc90002bdbe08 RSI: ffff88017ef04a80 RDI: ffff8802248e4388 [ 18.436151] RBP: ffffc90002bdbe48 R08: 0000000000000000 R09: ffffffff81c104c0 [ 18.436151] R10: 00000000000001ff R11: 000000000000035a R12: ffffc90002bdbdd8 [ 18.436152] R13: ffff8802248e4390 R14: ffff880225152800 R15: ffff8802248e4400 [ 18.436153] FS: 0000000000000000(0000) GS:ffff880227d00000(0000) knlGS:0000000000000000 [ 18.436154] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 18.436154] CR2: 0000000000000790 CR3: 000000042535b000 CR4: 00000000000006e0 [ 18.436155] Call Trace: [ 18.436159] ? rdac_activate+0x14e/0x150 [ 18.436161] ? refcount_dec_and_test+0x11/0x20 [ 18.436162] ? kobject_put+0x1c/0x50 [ 18.436165] ? scsi_dh_activate+0x6f/0xd0 [ 18.436168] process_one_work+0x149/0x360 [ 18.436170] worker_thread+0x4d/0x3c0 [ 18.436172] kthread+0x109/0x140 [ 18.436173] ? rescuer_thread+0x380/0x380 [ 18.436174] ? kthread_park+0x60/0x60 [ 18.436176] ret_from_fork+0x2c/0x40 [ 18.436177] Code: 49 c7 46 20 00 00 00 00 4c 89 ef c6 07 00 0f 1f 40 00 45 31 ed c7 45 b0 05 00 00 00 44 89 6d b4 4d 89 f5 4c 8b 75 a8 49 8b 45 20 <48> 8b b0 90 07 00 00 48 8b 56 10 8b 42 10 48 8d 7a 28 85 c0 0f [ 18.436192] RIP: send_mode_select+0xca/0x560 RSP: ffffc90002bdbda8 [ 18.436192] CR2: 0000000000000790 [ 18.436198] ---[ end trace 40f3e4dca1ffabdd ]--- [ 18.436199] Kernel panic - not syncing: Fatal exception [ 18.436222] Kernel Offset: disabled [-- MARK -- Thu May 18 11:45:00 2017] Fixes: `3278255741` scsi_dh_rdac: switch to scsi_execute_req_flags() Signed-off-by: Artem Savkov <asavkov@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Nicholas Bellinger	14ba78937e	iscsi-target: Fix initial login PDU asynchronous socket close OOPs commit `25cdda95fd` upstream. This patch fixes a OOPs originally introduced by: commit `bb048357da` Author: Nicholas Bellinger <nab@linux-iscsi.org> Date: Thu Sep 5 14:54:04 2013 -0700 iscsi-target: Add sk->sk_state_change to cleanup after TCP failure which would trigger a NULL pointer dereference when a TCP connection was closed asynchronously via iscsi_target_sk_state_change(), but only when the initial PDU processing in iscsi_target_do_login() from iscsi_np process context was blocked waiting for backend I/O to complete. To address this issue, this patch makes the following changes. First, it introduces some common helper functions used for checking socket closing state, checking login_flags, and atomically checking socket closing state + setting login_flags. Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP connection has dropped via iscsi_target_sk_state_change(), but the initial PDU processing within iscsi_target_do_login() in iscsi_np context is still running. For this case, it sets LOGIN_FLAGS_CLOSED, but doesn't invoke schedule_delayed_work(). The original NULL pointer dereference case reported by MNC is now handled by iscsi_target_do_login() doing a iscsi_target_sk_check_close() before transitioning to FFP to determine when the socket has already closed, or iscsi_target_start_negotiation() if the login needs to exchange more PDUs (eg: iscsi_target_do_login returned 0) but the socket has closed. For both of these cases, the cleanup up of remaining connection resources will occur in iscsi_target_start_negotiation() from iscsi_np process context once the failure is detected. Finally, to handle to case where iscsi_target_sk_state_change() is called after the initial PDU procesing is complete, it now invokes conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once existing iscsi_target_sk_check_close() checks detect connection failure. For this case, the cleanup of remaining connection resources will occur in iscsi_target_do_login_rx() from delayed workqueue process context once the failure is detected. Reported-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Tested-by: Mike Christie <mchristi@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Reported-by: Hannes Reinecke <hare@suse.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Varun Prakash <varun@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Jiang Yi	c732f30887	iscsi-target: Always wait for kthread_should_stop() before kthread exit commit `5e0cf5e6c4` upstream. There are three timing problems in the kthread usages of iscsi_target_mod: - np_thread of struct iscsi_np - rx_thread and tx_thread of struct iscsi_conn In iscsit_close_connection(), it calls send_sig(SIGINT, conn->tx_thread, 1); kthread_stop(conn->tx_thread); In conn->tx_thread, which is iscsi_target_tx_thread(), when it receive SIGINT the kthread will exit without checking the return value of kthread_should_stop(). So if iscsi_target_tx_thread() exit right between send_sig(SIGINT...) and kthread_stop(...), the kthread_stop() will try to stop an already stopped kthread. This is invalid according to the documentation of kthread_stop(). (Fix -ECONNRESET logout handling in iscsi_target_tx_thread and early iscsi_target_rx_thread failure case - nab) Signed-off-by: Jiang Yi <jiangyilism@gmail.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:08 +02:00
Long Li	a168ac5b24	scsi: zero per-cmd private driver data for each MQ I/O commit `1bad6c4a57` upstream. In lower layer driver's (LLD) scsi_host_template, the driver may optionally ask SCSI to allocate its private driver memory for each command, by specifying cmd_size. This memory is allocated at the end of scsi_cmnd by SCSI. Later when SCSI queues a command, the LLD can use scsi_cmd_priv to get to its private data. Some LLD, e.g. hv_storvsc, doesn't clear its private data before use. In this case, the LLD may get to stale or uninitialized data in its private driver memory. This may result in unexpected driver and hardware behavior. Fix this problem by also zeroing the private driver memory before passing them to LLD. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Reviewed-by: KY Srinivasan <kys@microsoft.com> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:08 +02:00
Srinath Mannam	21f8aa4cfc	mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read commit `f5f968f237` upstream. The stingray SDHCI hardware supports ACMD12 and automatically issues after multi block transfer completed. If ACMD12 in SDHCI is disabled, spurious tx done interrupts are seen on multi block read command with below error message: Got data interrupt 0x00000002 even though no data operation was in progress. This patch uses SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 to enable ACM12 support in SDHCI hardware and suppress spurious interrupt. Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com> Reviewed-by: Ray Jui <ray.jui@broadcom.com> Reviewed-by: Scott Branden <scott.branden@broadcom.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: `b580c52d58` ("mmc: sdhci-iproc: add IPROC SDHCI driver") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:08 +02:00
Benjamin Tissoires	4c5681afdf	Revert "ACPI / button: Change default behavior to lid_init_state=open" commit `878d8db039` upstream. Revert commit `77e9a4aa9d` (ACPI / button: Change default behavior to lid_init_state=open) which changed the kernel's behavior on laptops that boot with closed lids and expect the lid switch state to be reported accurately by the kernel. If you boot or resume your laptop with the lid closed on a docking station while using an external monitor connected to it, both internal and external displays will light on, while only the external should. There is a design choice in gdm to only provide the greeter on the internal display when lit on, so users only see a gray area on the external monitor. Also, the cursor will not show up as it's by default on the internal display too. To "fix" that, users have to open the laptop once and close it once again to sync the state of the switch with the hardware state. Even if the "method" operation mode implementation can be buggy on some platforms, the "open" choice is worse. It breaks docking stations basically and there is no way to have a user-space hwdb to fix that. On the contrary, it's rather easy in user-space to have a hwdb with the problematic platforms. Then, libinput (1.7.0+) can fix the state of the lid switch for us: you need to set the udev property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'. When libinput detects internal keyboard events, it will overwrite the state of the switch to open, making it reliable again. Given that logind only checks the lid switch value after a timeout, we can assume the user will use the internal keyboard before this timeout expires. For example, such a hwdb entry is: libinput:name:Lid Switch:dmi:svnMicrosoftCorporation:pnSurface3: LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open Link: https://bugzilla.gnome.org/show_bug.cgi?id=782380 Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:08 +02:00
Lv Zheng	5d0e4205ea	ACPICA: Tables: Fix regression introduced by a too early mechanism enabling commit `2ea65321b8` upstream. In the Linux kernel, acpi_get_table() "clones" haven't been fully balanced by acpi_put_table() invocations. In upstream ACPICA, due to the design change, there are also unbalanced acpi_get_table_by_index() invocations requiring special care. acpi_get_table() reference counting mismatches may occor due to that and printing error messages related to them is not useful at this point. The strict balanced validation count check should only be enabled after confirming that all invocations are safe and aligned with their designed purposes. Thus this patch removes the error value returned by acpi_tb_get_table() in that case along with the accompanying error message to fix the issue. Fixes: `174cc7187e` (ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel) Reported-by: Anush Seetharaman <anush.seetharaman@intel.com> Reported-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:08 +02:00
Dan Williams	34211cbf94	ACPI / sysfs: fix acpi_get_table() leak / acpi-sysfs denial of service commit `0de0e198bc` upstream. Reading an ACPI table through the /sys/firmware/acpi/tables interface more than 65,536 times leads to the following log message: ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment (20170119/tbutils-423) ...and the table being unavailable until the next reboot. Add the missing acpi_put_table() so the table ->validation_count is decremented after each read. Reported-by: Anush Seetharaman <anush.seetharaman@intel.com> Fixes: `174cc7187e` "ACPICA: Tables: Back port acpi_get_table_with_size() ..." Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Vishal Verma	93da4e6c45	acpi, nfit: Fix the memory error check in nfit_handle_mce() commit `fc08a4703a` upstream. The check for an MCE being a memory error in the NFIT mce handler was bogus. Use the new mce_is_memory_error() helper to detect the error properly. Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20170519093915.15413-3-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Borislav Petkov	9183980a9e	x86/MCE: Export memory_error() commit `2d1f406139` upstream. Export the function which checks whether an MCE is a memory error to other users so that we can reuse the logic. Drop the boot_cpu_data use, while at it, as mce.cpuvendor already has the CPU vendor in there. Integrate a piece from a patch from Vishal Verma <vishal.l.verma@intel.com> to export it for modules (nfit). The main reason we're exporting it is that the nfit handler nfit_handle_mce() needs to detect a memory error properly before doing its recovery actions. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Link: http://lkml.kernel.org/r/20170519093915.15413-2-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Lv Zheng	8f8dca3c86	Revert "ACPI / button: Remove lid_init_state=method mode" commit `f369fdf4f6` upstream. This reverts commit `ecb10b694b`. The only expected ACPI control method lid device's usage model is 1. Listen to the lid notification, 2. Evaluate _LID after being notified by BIOS, 3. Suspend the system (if users configure to do so) after seeing "close". It's not ensured that BIOS will notify OS after boot/resume, and it's not ensured that BIOS will always generate "open" event upon opening the lid. But there are 2 wrong usage models: 1. When the lid device is responsible for suspend/resume the system, userspace requires to see "open" event to be paired with "close" after the system is resumed, or it will suspend the system again. 2. When an external monitor connects to the laptop attached docks, userspace requires to see "close" event after the system is resumed so that it can determine whether the internal display should remain dark and the external display should be lit on. After we made default kernel behavior to be suitable for usage model 1, users of usage model 2 start to report regressions for such behavior change. Reversion of button.lid_init_state=method doesn't actually reverts to old default behavior as doing so can enter a regression loop, but facilitates users to work the reported regressions around with button.lid_init_state=method. Fixes: `ecb10b694b` (ACPI / button: Remove lid_init_state=method mode) Link: https://bugzilla.kernel.org/show_bug.cgi?id=195455 Link: https://bugzilla.redhat.com/show_bug.cgi?id=1430259 Tested-by: Steffen Weber <steffen.weber@gmail.com> Tested-by: Julian Wiedmann <julian.wiedmann@jwi.name> Reported-by: Joachim Frieben <jfrieben@hotmail.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Herbert Xu	f5eef8d245	crypto: skcipher - Add missing API setkey checks commit `9933e113c2` upstream. The API setkey checks for key sizes and alignment went AWOL during the skcipher conversion. This patch restores them. Fixes: `4e6c3df4d7` ("crypto: skcipher - Add low-level skcipher...") Reported-by: Baozeng <sploving1@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Sebastian Reichel	2da7518890	i2c: i2c-tiny-usb: fix buffer not being DMA capable commit `5165da5923` upstream. Since v4.9 i2c-tiny-usb generates the below call trace and longer works, since it can't communicate with the USB device. The reason is, that since v4.9 the USB stack checks, that the buffer it should transfer is DMA capable. This was a requirement since v2.2 days, but it usually worked nevertheless. [ 17.504959] ------------[ cut here ]------------ [ 17.505488] WARNING: CPU: 0 PID: 93 at drivers/usb/core/hcd.c:1587 usb_hcd_map_urb_for_dma+0x37c/0x570 [ 17.506545] transfer buffer not dma capable [ 17.507022] Modules linked in: [ 17.507370] CPU: 0 PID: 93 Comm: i2cdetect Not tainted 4.11.0-rc8+ #10 [ 17.508103] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 17.509039] Call Trace: [ 17.509320] ? dump_stack+0x5c/0x78 [ 17.509714] ? __warn+0xbe/0xe0 [ 17.510073] ? warn_slowpath_fmt+0x5a/0x80 [ 17.510532] ? nommu_map_sg+0xb0/0xb0 [ 17.510949] ? usb_hcd_map_urb_for_dma+0x37c/0x570 [ 17.511482] ? usb_hcd_submit_urb+0x336/0xab0 [ 17.511976] ? wait_for_completion_timeout+0x12f/0x1a0 [ 17.512549] ? wait_for_completion_timeout+0x65/0x1a0 [ 17.513125] ? usb_start_wait_urb+0x65/0x160 [ 17.513604] ? usb_control_msg+0xdc/0x130 [ 17.514061] ? usb_xfer+0xa4/0x2a0 [ 17.514445] ? __i2c_transfer+0x108/0x3c0 [ 17.514899] ? i2c_transfer+0x57/0xb0 [ 17.515310] ? i2c_smbus_xfer_emulated+0x12f/0x590 [ 17.515851] ? _raw_spin_unlock_irqrestore+0x11/0x20 [ 17.516408] ? i2c_smbus_xfer+0x125/0x330 [ 17.516876] ? i2c_smbus_xfer+0x125/0x330 [ 17.517329] ? i2cdev_ioctl_smbus+0x1c1/0x2b0 [ 17.517824] ? i2cdev_ioctl+0x75/0x1c0 [ 17.518248] ? do_vfs_ioctl+0x9f/0x600 [ 17.518671] ? vfs_write+0x144/0x190 [ 17.519078] ? SyS_ioctl+0x74/0x80 [ 17.519463] ? entry_SYSCALL_64_fastpath+0x1e/0xad [ 17.519959] ---[ end trace d047c04982f5ac50 ]--- Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Till Harbaum <till@harbaum.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Ard Biesheuvel	e4bab31cf9	drivers/tty: 8250: only call fintek_8250_probe when doing port I/O commit `4c4fc90964` upstream. Commit `fa01e2ca9f` ("serial: 8250: Integrate Fintek into 8250_base") modified the probing logic for PNP0501 devices, to remove a collision between the generic 16550A driver and the Fintek driver, which reused the same ACPI _HID. The Fintek device probe is now incorporated into the common 8250 probe path, and gets called for all discovered 16550A compatible devices, including ones that are MMIO mapped rather than IO mapped. However, the Fintek driver assumes the port base is a I/O address, and proceeds to probe some arbitrary offsets above it. This is generally a wrong thing to do, but on ARM systems (having no native port I/O), this may result in faulting accesses of completely unrelated MMIO regions in the PCI I/O space. Given that this is at serial probe time, this results in hard to diagnose crashes at boot. So let's restrict the Fintek probe to devices that we know are using port I/O in the first place. Fixes: `fa01e2ca9f` ("serial: 8250: Integrate Fintek into 8250_base") Suggested-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ricardo Ribalda <ricardo.ribalda@gmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Johan Hovold	84ac7693f4	serdev: fix tty-port client deregistration commit `aee5da7838` upstream. The port client data must be set when registering the serdev controller or client deregistration will fail (and the serdev devices are left registered and allocated) if the port was never opened in between. Make sure to clear the port client data on any probe errors to avoid a use-after-free when the client is later deregistered unconditionally (e.g. in a tty-port deregistration helper). Also move port client operation initialisation to registration. Note that the client ops must be restored on failed probe. Fixes: `bed35c6dfa` ("serdev: add a tty port controller driver") Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Johan Hovold	427fa8e391	Revert "tty_port: register tty ports with serdev bus" commit `d3ba126a22` upstream. This reverts commit `8ee3fde047`. The new serdev bus hooked into the tty layer in tty_port_register_device() by registering a serdev controller instead of a tty device whenever a serdev client is present, and by deregistering the controller in the tty-port destructor. This is broken in several ways: Firstly, it leads to a NULL-pointer dereference whenever a tty driver later deregisters its devices as no corresponding character device will exist. Secondly, far from every tty driver uses tty-port refcounting (e.g. serial core) so the serdev devices might never be deregistered or deallocated. Thirdly, deregistering at tty-port destruction is too late as the underlying device and structures may be long gone by then. A port is not released before an open tty device is closed, something which a registered serdev client can prevent from ever happening. A driver callback while the device is gone typically also leads to crashes. Many tty drivers even keep their ports around until the driver is unloaded (e.g. serial core), something which even if a late callback never happens, leads to leaks if a device is unbound from its driver and is later rebound. The right solution here is to add a new tty_port_unregister_device() helper and to never call tty_device_unregister() whenever the port has been claimed by serdev, but since this requires modifying just about every tty driver (and multiple subsystems) it will need to be done incrementally. Reverting the offending patch is the first step in fixing the broken lifetime assumptions. A follow-up patch will add a new pair of tty-device registration helpers, which a vetted tty driver can use to support serdev (initially serial core). When every tty driver uses the serdev helpers (at least for deregistration), we can add serdev registration to tty_port_register_device() again. Note that this also fixes another issue with serdev, which currently allocates and registers a serdev controller for every tty device registered using tty_port_device_register() only to immediately deregister and deallocate it when the corresponding OF node or serdev child node is missing. This should be addressed before enabling serdev for hot-pluggable buses. Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Jeremy Kerr	baa4d4112e	powerpc/spufs: Fix hash faults for kernel regions commit `d75e4919cc` upstream. Commit `ac29c64089` ("powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED") swapped _PAGE_USER for _PAGE_PRIVILEGED, and introduced check_pte_access() which denied kernel access to non-_PAGE_PRIVILEGED pages. However, it didn't add _PAGE_PRIVILEGED to the hash fault handler for spufs' kernel accesses, so the DMAs required to establish SPE memory no longer work. This change adds _PAGE_PRIVILEGED to the hash fault handler for kernel accesses. Fixes: `ac29c64089` ("powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED") Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Reported-by: Sombat Tragolgosol <sombat3960@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Michael Neuling	919c7173e0	powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N commit `d957fb4d17` upstream. Currently if you disable CONFIG_PPC_RADIX_MMU you'll crash on boot on a P9. This is because we still set MMU_FTR_TYPE_RADIX via ibm,pa-features and MMU_FTR_TYPE_RADIX is what's used for code patching in much of the asm code (ie. slb_miss_realmode) This patch fixes the problem by stopping MMU_FTR_TYPE_RADIX from being set from ibm.pa-features. We may eventually end up removing the CONFIG_PPC_RADIX_MMU option completely but until then this fixes the issue. Fixes: `17a3dd2f5f` ("powerpc/mm/radix: Use firmware feature to enable Radix MMU") Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Richard Narron	72351ac5cd	fs/ufs: Set UFS default maximum bytes per file commit `239e250e4a` upstream. This fixes a problem with reading files larger than 2GB from a UFS-2 file system: https://bugzilla.kernel.org/show_bug.cgi?id=195721 The incorrect UFS s_maxsize limit became a problem as of commit `c2a9737f45` ("vfs,mm: fix a dead loop in truncate_inode_pages_range()") which started using s_maxbytes to avoid a page index overflow in do_generic_file_read(). That caused files to be truncated on UFS-2 file systems because the default maximum file size is 2GB (MAX_NON_LFS) and UFS didn't update it. Here I simply increase the default to a common value used by other file systems. Signed-off-by: Richard Narron <comet.berkeley@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Will B <will.brokenbourgh2877@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Liam R. Howlett	f351b1226d	sparc/ftrace: Fix ftrace graph time measurement [ Upstream commit `48078d2dac` ] The ftrace function_graph time measurements of a given function is not accurate according to those recorded by ftrace using the function filters. This change pulls the x86_64 fix from 'commit `722b3c7469` ("ftrace/graph: Trace function entry before updating index")' into the sparc specific prepare_ftrace_return which stops ftrace from counting interrupted tasks in the time measurement. Example measurements for select_task_rq_fair running "hackbench 100 process 1000": \| tracing/trace_stat/function0 \| function_graph Before patch \| 2.802 us \| 4.255 us After patch \| 2.749 us \| 3.094 us Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Orlando Arias	76037bf9e1	sparc: Fix -Wstringop-overflow warning [ Upstream commit `deba804c90` ] Greetings, GCC 7 introduced the -Wstringop-overflow flag to detect buffer overflows in calls to string handling functions [1][2]. Due to the way ``empty_zero_page'' is declared in arch/sparc/include/setup.h, this causes a warning to trigger at compile time in the function mem_init(), which is subsequently converted to an error. The ensuing patch fixes this issue and aligns the declaration of empty_zero_page to that of other architectures. Thank you. Cheers, Orlando. [1] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02308.html [2] https://gcc.gnu.org/gcc-7/changes.html Signed-off-by: Orlando Arias <oarias@knights.ucf.edu> -------------------------------------------------------------------------------- Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Nitin Gupta	e346489fac	sparc64: Fix mapping of 64k pages with MAP_FIXED [ Upstream commit `b6c41cb050` ] An incorrect huge page alignment check caused mmap failure for 64K pages when MAP_FIXED is used with address not aligned to HPAGE_SIZE. Orabug: 25885991 Fixes: `dcd1912d21` ("sparc64: Add 64K page size support") Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Daniel Borkmann	21dccb0f7c	bpf: adjust verifier heuristics [ Upstream commit `3c2ce60bdd` ] Current limits with regards to processing program paths do not really reflect today's needs anymore due to programs becoming more complex and verifier smarter, keeping track of more data such as const ALU operations, alignment tracking, spilling of PTR_TO_MAP_VALUE_ADJ registers, and other features allowing for smarter matching of what LLVM generates. This also comes with the side-effect that we result in fewer opportunities to prune search states and thus often need to do more work to prove safety than in the past due to different register states and stack layout where we mismatch. Generally, it's quite hard to determine what caused a sudden increase in complexity, it could be caused by something as trivial as a single branch somewhere at the beginning of the program where LLVM assigned a stack slot that is marked differently throughout other branches and thus causing a mismatch, where verifier then needs to prove safety for the whole rest of the program. Subsequently, programs with even less than half the insn size limit can get rejected. We noticed that while some programs load fine under pre 4.11, they get rejected due to hitting limits on more recent kernels. We saw that in the vast majority of cases (90+%) pruning failed due to register mismatches. In case of stack mismatches, majority of cases failed due to different stack slot types (invalid, spill, misc) rather than differences in spilled registers. This patch makes pruning more aggressive by also adding markers that sit at conditional jumps as well. Currently, we only mark jump targets for pruning. For example in direct packet access, these are usually error paths where we bail out. We found that adding these markers, it can reduce number of processed insns by up to 30%. Another option is to ignore reg->id in probing PTR_TO_MAP_VALUE_OR_NULL registers, which can help pruning slightly as well by up to 7% observed complexity reduction as stand-alone. Meaning, if a previous path with register type PTR_TO_MAP_VALUE_OR_NULL for map X was found to be safe, then in the current state a PTR_TO_MAP_VALUE_OR_NULL register for the same map X must be safe as well. Last but not least the patch also adds a scheduling point and bumps the current limit for instructions to be processed to a more adequate value. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Daniel Borkmann	87cebd0f19	bpf: fix wrong exposure of map_flags into fdinfo for lpm [ Upstream commit `a316338cb7` ] trie_alloc() always needs to have BPF_F_NO_PREALLOC passed in via attr->map_flags, since it does not support preallocation yet. We check the flag, but we never copy the flag into trie->map.map_flags, which is later on exposed into fdinfo and used by loaders such as iproute2. Latter uses this in bpf_map_selfcheck_pinned() to test whether a pinned map has the same spec as the one from the BPF obj file and if not, bails out, which is currently the case for lpm since it exposes always 0 as flags. Also copy over flags in array_map_alloc() and stack_map_alloc(). They always have to be 0 right now, but we should make sure to not miss to copy them over at a later point in time when we add actual flags for them to use. Fixes: `b95a5c4db0` ("bpf: add a longest prefix match trie map implementation") Reported-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Daniel Borkmann	d6d2860eee	bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data [ Upstream commit `41703a7310` ] The bpf_clone_redirect() still needs to be listed in bpf_helper_changes_pkt_data() since we call into bpf_try_make_head_writable() from there, thus we need to invalidate prior pkt regs as well. Fixes: `36bbef52c7` ("bpf: direct packet write and access for helpers for clsact progs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Eric Dumazet	3b69d6516e	ipv4: add reference counting to metrics [ Upstream commit `3fb07daff8` ] Andrey Konovalov reported crashes in ipv4_mtu() I could reproduce the issue with KASAN kernels, between 10.246.7.151 and 10.246.7.152 : 1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 & 2) At the same time run following loop : while : do ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 done Cong Wang attempted to add back rt->fi in commit `82486aa6f1` ("ipv4: restore rt->fi for reference counting") but this proved to add some issues that were complex to solve. Instead, I suggested to add a refcount to the metrics themselves, being a standalone object (in particular, no reference to other objects) I tried to make this patch as small as possible to ease its backport, instead of being super clean. Note that we believe that only ipv4 dst need to take care of the metric refcount. But if this is wrong, this patch adds the basic infrastructure to extend this to other families. Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang for his efforts on this problem. Fixes: `2860583fe8` ("ipv4: Kill rt->fi") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Peter Dawson	d3edf403e2	ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets [ Upstream commit `0e9a709560` ] This fix addresses two problems in the way the DSCP field is formulated on the encapsulating header of IPv6 tunnels. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195661 1) The IPv6 tunneling code was manipulating the DSCP field of the encapsulating packet using the 32b flowlabel. Since the flowlabel is only the lower 20b it was incorrect to assume that the upper 12b containing the DSCP and ECN fields would remain intact when formulating the encapsulating header. This fix handles the 'inherit' and 'fixed-value' DSCP cases explicitly using the extant dsfield u8 variable. 2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was incorrect and resulted in the DSCP value always being set to 0. Commit `90427ef5d2` ("ipv6: fix flow labels when the traffic class is non-0") caused the regression by masking out the flowlabel which exposed the incorrect handling of the DSCP portion of the flowlabel in ip6_tunnel and ip6_gre. Fixes: `90427ef5d2` ("ipv6: fix flow labels when the traffic class is non-0") Signed-off-by: Peter Dawson <peter.a.dawson@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Davide Caratti	90e7c3322d	sctp: fix ICMP processing if skb is non-linear [ Upstream commit `804ec7ebe8` ] sometimes ICMP replies to INIT chunks are ignored by the client, even if the encapsulated SCTP headers match an open socket. This happens when the ICMP packet is carried by a paged skb: use skb_header_pointer() to read packet contents beyond the SCTP header, so that chunk header and initiate tag are validated correctly. v2: - don't use skb_header_pointer() to read the transport header, since icmp_socket_deliver() already puts these 8 bytes in the linear area. - change commit message to make specific reference to INIT chunks. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Wei Wang	0236d8c44e	tcp: avoid fastopen API to be used on AF_UNSPEC [ Upstream commit `ba615f6752` ] Fastopen API should be used to perform fastopen operations on the TCP socket. It does not make sense to use fastopen API to perform disconnect by calling it with AF_UNSPEC. The fastopen data path is also prone to race conditions and bugs when using with AF_UNSPEC. One issue reported and analyzed by Vegard Nossum is as follows: +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Thread A: Thread B: ------------------------------------------------------------------------ sendto() - tcp_sendmsg() - sk_stream_memory_free() = 0 - goto wait_for_sndbuf - sk_stream_wait_memory() - sk_wait_event() // sleep \| sendto(flags=MSG_FASTOPEN, dest_addr=AF_UNSPEC) \| - tcp_sendmsg() \| - tcp_sendmsg_fastopen() \| - __inet_stream_connect() \| - tcp_disconnect() //because of AF_UNSPEC \| - tcp_transmit_skb()// send RST \| - return 0; // no reconnect! \| - sk_stream_wait_connect() \| - sock_error() \| - xchg(&sk->sk_err, 0) \| - return -ECONNRESET - ... // wake up, see sk->sk_err == 0 - skb_entail() on TCP_CLOSE socket If the connection is reopened then we will send a brand new SYN packet after thread A has already queued a buffer. At this point I think the socket internal state (sequence numbers etc.) becomes messed up. When the new connection is closed, the FIN-ACK is rejected because the sequence number is outside the window. The other side tries to retransmit, but __tcp_retransmit_skb() calls tcp_trim_head() on an empty skb which corrupts the skb data length and hits a BUG() in copy_and_csum_bits(). +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Hence, this patch adds a check for AF_UNSPEC in the fastopen data path and return EOPNOTSUPP to user if such case happens. Fixes: `cf60af03ca` ("tcp: Fast Open client - sendmsg(MSG_FASTOPEN)") Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Eric Garver	1642394fff	geneve: fix fill_info when using collect_metadata [ Upstream commit `11387fe4a9` ] Since `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") fill_info does not return UDP_ZERO_CSUM6_RX when using COLLECT_METADATA. This is because it uses ip_tunnel_info_af() with the device level info, which is not valid for COLLECT_METADATA. Fix by checking for the presence of the actual sockets. Fixes: `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") Signed-off-by: Eric Garver <e@erig.me> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Vlad Yasevich	4dbbbaad64	virtio-net: enable TSO/checksum offloads for Q-in-Q vlans [ Upstream commit `2836b4f224` ] Since virtio does not provide it's own ndo_features_check handler, TSO, and now checksum offload, are disabled for stacked vlans. Re-enable the support and let the host take care of it. This restores/improves Guest-to-Guest performance over Q-in-Q vlans. Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Vlad Yasevich	acc866e9b5	be2net: Fix offload features for Q-in-Q packets [ Upstream commit `cc6e9de62a` ] At least some of the be2net cards do not seem to be capabled of performing checksum offload computions on Q-in-Q packets. In these case, the recevied checksum on the remote is invalid and TCP syn packets are dropped. This patch adds a call to check disbled acceleration features on Q-in-Q tagged traffic. CC: Sathya Perla <sathya.perla@broadcom.com> CC: Ajit Khaparde <ajit.khaparde@broadcom.com> CC: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> CC: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:03 +02:00
Vlad Yasevich	423c1b4324	vlan: Fix tcp checksum offloads in Q-in-Q vlans [ Upstream commit `35d2f80b07` ] It appears that TCP checksum offloading has been broken for Q-in-Q vlans. The behavior was execerbated by the series commit `afb0bc972b` ("Merge branch 'stacked_vlan_tso'") that that enabled accleleration features on stacked vlans. However, event without that series, it is possible to trigger this issue. It just requires a lot more specialized configuration. The root cause is the interaction between how netdev_intersect_features() works, the features actually set on the vlan devices and HW having the ability to run checksum with longer headers. The issue starts when netdev_interesect_features() replaces NETIF_F_HW_CSUM with a combination of NETIF_F_IP_CSUM \| NETIF_F_IPV6_CSUM, if the HW advertises IP\|IPV6 specific checksums. This happens for tagged and multi-tagged packets. However, HW that enables IP\|IPV6 checksum offloading doesn't gurantee that packets with arbitrarily long headers can be checksummed. This patch disables IP\|IPV6 checksums on the packet for multi-tagged packets. CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> CC: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:03 +02:00
Andrew Lunn	f1cd4c6331	net: phy: marvell: Limit errata to 88m1101 [ Upstream commit `f289978835` ] The 88m1101 has an errata when configuring autoneg. However, it was being applied to many other Marvell PHYs as well. Limit its scope to just the 88m1101. Fixes: `76884679c6` ("phylib: Add support for Marvell 88e1111S and 88e1145") Reported-by: Daniel Walker <danielwa@cisco.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Harini Katakam <harinik@xilinx.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:03 +02:00
Mohamad Haj Yahia	bea278025c	net/mlx5: Avoid using pending command interface slots [ Upstream commit `73dd3a4839` ] Currently when firmware command gets stuck or it takes long time to complete, the driver command will get timeout and the command slot is freed and can be used for new commands, and if the firmware receive new command on the old busy slot its behavior is unexpected and this could be harmful. To fix this when the driver command gets timeout we return failure, but we don't free the command slot and we wait for the firmware to explicitly respond to that command. Once all the entries are busy we will stop processing new firmware commands. Fixes: `9cba4ebcf3` ('net/mlx5: Fix potential deadlock in command mode change') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:03 +02:00
Jarod Wilson	1cdcbe7c70	bonding: fix accounting of active ports in 3ad [ Upstream commit `751da2a69b` ] As of `7bb11dc9f5` and `0622cab034`, bond slaves in a 3ad bond are not removed from the aggregator when they are down, and the active slave count is NOT equal to number of ports in the aggregator, but rather the number of ports in the aggregator that are still enabled. The sysfs spew for bonding_show_ad_num_ports() has a comment that says "Show number of active 802.3ad ports.", but it's currently showing total number of ports, both active and inactive. Remedy it by using the same logic introduced in `0622cab034` in __bond_3ad_get_active_agg_info(), so sysfs, procfs and netlink all report the number of active ports. Note that this means that IFLA_BOND_AD_INFO_NUM_PORTS really means NUM_ACTIVE_PORTS instead of NUM_PORTS, and thus perhaps should be renamed for clarity. Lightly tested on a dual i40e lacp bond, simulating link downs with an ip link set dev <slave2> down, was able to produce the state where I could see both in the same aggregator, but a number of ports count of 1. MII Status: up Active Aggregator Info: Aggregator ID: 1 Number of ports: 2 <--- Slave Interface: ens10 MII Status: up <--- Aggregator ID: 1 Slave Interface: ens11 MII Status: up Aggregator ID: 1 MII Status: up Active Aggregator Info: Aggregator ID: 1 Number of ports: 1 <--- Slave Interface: ens10 MII Status: down <--- Aggregator ID: 1 Slave Interface: ens11 MII Status: up Aggregator ID: 1 CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: netdev@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:03 +02:00
Eric Dumazet	827624c3d1	ipv6: fix out of bound writes in __ip6_append_data() [ Upstream commit `232cd35d08` ] Andrey Konovalov and idaifish@gmail.com reported crashes caused by one skb shared_info being overwritten from __ip6_append_data() Andrey program lead to following state : copy -4200 datalen 2000 fraglen 2040 maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200 The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen, fraggap, 0); is overwriting skb->head and skb_shared_info Since we apparently detect this rare condition too late, move the code earlier to even avoid allocating skb and risking crashes. Once again, many thanks to Andrey and syzkaller team. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: <idaifish@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
Xin Long	99c971a56d	bridge: start hello_timer when enabling KERNEL_STP in br_stp_start [ Upstream commit `6d18c732b9` ] Since commit `76b91c32dd` ("bridge: stp: when using userspace stp stop kernel hello and hold timers"), bridge would not start hello_timer if stp_enabled is not KERNEL_STP when br_dev_open. The problem is even if users set stp_enabled with KERNEL_STP later, the timer will still not be started. It causes that KERNEL_STP can not really work. Users have to re-ifup the bridge to avoid this. This patch is to fix it by starting br->hello_timer when enabling KERNEL_STP in br_stp_start. As an improvement, it's also to start hello_timer again only when br->stp_enabled is KERNEL_STP in br_hello_timer_expired, there is no reason to start the timer again when it's NO_STP. Fixes: `76b91c32dd` ("bridge: stp: when using userspace stp stop kernel hello and hold timers") Reported-by: Haidong Li <haili@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ivan Vecera <cera@cera.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
Bjørn Mork	bf97c6bf24	qmi_wwan: add another Lenovo EM74xx device ID [ Upstream commit `486181bcb3` ] In their infinite wisdom, and never ending quest for end user frustration, Lenovo has decided to use a new USB device ID for the wwan modules in their 2017 laptops. The actual hardware is still the Sierra Wireless EM7455 or EM7430, depending on region. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
Tobias Jungel	7b570175b3	bridge: netlink: check vlan_default_pvid range [ Upstream commit `a285860211` ] Currently it is allowed to set the default pvid of a bridge to a value above VLAN_VID_MASK (0xfff). This patch adds a check to br_validate and returns -EINVAL in case the pvid is out of bounds. Reproduce by calling: [root@test ~]# ip l a type bridge [root@test ~]# ip l a type dummy [root@test ~]# ip l s bridge0 type bridge vlan_filtering 1 [root@test ~]# ip l s bridge0 type bridge vlan_default_pvid 9999 [root@test ~]# ip l s dummy0 master bridge0 [root@test ~]# bridge vlan port vlan ids bridge0 9999 PVID Egress Untagged dummy0 9999 PVID Egress Untagged Fixes: `0f963b7592` ("bridge: netlink: add support for default_pvid") Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Tobias Jungel <tobias.jungel@bisdn.de> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
David S. Miller	4b3a6fa35e	ipv6: Check ip6_find_1stfragopt() return value properly. [ Upstream commit `7dd7eb9513` ] Do not use unsigned variables to see if it returns a negative error or not. Fixes: `2423496af3` ("ipv6: Prevent overrun when parsing v6 header options") Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
Craig Gallek	9909e4e4ff	ipv6: Prevent overrun when parsing v6 header options [ Upstream commit `2423496af3` ] The KASAN warning repoted below was discovered with a syzkaller program. The reproducer is basically: int s = socket(AF_INET6, SOCK_RAW, NEXTHDR_HOP); send(s, &one_byte_of_data, 1, MSG_MORE); send(s, &more_than_mtu_bytes_data, 2000, 0); The socket() call sets the nexthdr field of the v6 header to NEXTHDR_HOP, the first send call primes the payload with a non zero byte of data, and the second send call triggers the fragmentation path. The fragmentation code tries to parse the header options in order to figure out where to insert the fragment option. Since nexthdr points to an invalid option, the calculation of the size of the network header can made to be much larger than the linear section of the skb and data is read outside of it. This fix makes ip6_find_1stfrag return an error if it detects running out-of-bounds. [ 42.361487] ================================================================== [ 42.364412] BUG: KASAN: slab-out-of-bounds in ip6_fragment+0x11c8/0x3730 [ 42.365471] Read of size 840 at addr ffff88000969e798 by task ip6_fragment-oo/3789 [ 42.366469] [ 42.366696] CPU: 1 PID: 3789 Comm: ip6_fragment-oo Not tainted 4.11.0+ #41 [ 42.367628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 [ 42.368824] Call Trace: [ 42.369183] dump_stack+0xb3/0x10b [ 42.369664] print_address_description+0x73/0x290 [ 42.370325] kasan_report+0x252/0x370 [ 42.370839] ? ip6_fragment+0x11c8/0x3730 [ 42.371396] check_memory_region+0x13c/0x1a0 [ 42.371978] memcpy+0x23/0x50 [ 42.372395] ip6_fragment+0x11c8/0x3730 [ 42.372920] ? nf_ct_expect_unregister_notifier+0x110/0x110 [ 42.373681] ? ip6_copy_metadata+0x7f0/0x7f0 [ 42.374263] ? ip6_forward+0x2e30/0x2e30 [ 42.374803] ip6_finish_output+0x584/0x990 [ 42.375350] ip6_output+0x1b7/0x690 [ 42.375836] ? ip6_finish_output+0x990/0x990 [ 42.376411] ? ip6_fragment+0x3730/0x3730 [ 42.376968] ip6_local_out+0x95/0x160 [ 42.377471] ip6_send_skb+0xa1/0x330 [ 42.377969] ip6_push_pending_frames+0xb3/0xe0 [ 42.378589] rawv6_sendmsg+0x2051/0x2db0 [ 42.379129] ? rawv6_bind+0x8b0/0x8b0 [ 42.379633] ? _copy_from_user+0x84/0xe0 [ 42.380193] ? debug_check_no_locks_freed+0x290/0x290 [ 42.380878] ? ___sys_sendmsg+0x162/0x930 [ 42.381427] ? rcu_read_lock_sched_held+0xa3/0x120 [ 42.382074] ? sock_has_perm+0x1f6/0x290 [ 42.382614] ? ___sys_sendmsg+0x167/0x930 [ 42.383173] ? lock_downgrade+0x660/0x660 [ 42.383727] inet_sendmsg+0x123/0x500 [ 42.384226] ? inet_sendmsg+0x123/0x500 [ 42.384748] ? inet_recvmsg+0x540/0x540 [ 42.385263] sock_sendmsg+0xca/0x110 [ 42.385758] SYSC_sendto+0x217/0x380 [ 42.386249] ? SYSC_connect+0x310/0x310 [ 42.386783] ? __might_fault+0x110/0x1d0 [ 42.387324] ? lock_downgrade+0x660/0x660 [ 42.387880] ? __fget_light+0xa1/0x1f0 [ 42.388403] ? __fdget+0x18/0x20 [ 42.388851] ? sock_common_setsockopt+0x95/0xd0 [ 42.389472] ? SyS_setsockopt+0x17f/0x260 [ 42.390021] ? entry_SYSCALL_64_fastpath+0x5/0xbe [ 42.390650] SyS_sendto+0x40/0x50 [ 42.391103] entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.391731] RIP: 0033:0x7fbbb711e383 [ 42.392217] RSP: 002b:00007ffff4d34f28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 42.393235] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb711e383 [ 42.394195] RDX: 0000000000001000 RSI: 00007ffff4d34f60 RDI: 0000000000000003 [ 42.395145] RBP: 0000000000000046 R08: 00007ffff4d34f40 R09: 0000000000000018 [ 42.396056] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400aad [ 42.396598] R13: 0000000000000066 R14: 00007ffff4d34ee0 R15: 00007fbbb717af00 [ 42.397257] [ 42.397411] Allocated by task 3789: [ 42.397702] save_stack_trace+0x16/0x20 [ 42.398005] save_stack+0x46/0xd0 [ 42.398267] kasan_kmalloc+0xad/0xe0 [ 42.398548] kasan_slab_alloc+0x12/0x20 [ 42.398848] __kmalloc_node_track_caller+0xcb/0x380 [ 42.399224] __kmalloc_reserve.isra.32+0x41/0xe0 [ 42.399654] __alloc_skb+0xf8/0x580 [ 42.400003] sock_wmalloc+0xab/0xf0 [ 42.400346] __ip6_append_data.isra.41+0x2472/0x33d0 [ 42.400813] ip6_append_data+0x1a8/0x2f0 [ 42.401122] rawv6_sendmsg+0x11ee/0x2db0 [ 42.401505] inet_sendmsg+0x123/0x500 [ 42.401860] sock_sendmsg+0xca/0x110 [ 42.402209] ___sys_sendmsg+0x7cb/0x930 [ 42.402582] __sys_sendmsg+0xd9/0x190 [ 42.402941] SyS_sendmsg+0x2d/0x50 [ 42.403273] entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.403718] [ 42.403871] Freed by task 1794: [ 42.404146] save_stack_trace+0x16/0x20 [ 42.404515] save_stack+0x46/0xd0 [ 42.404827] kasan_slab_free+0x72/0xc0 [ 42.405167] kfree+0xe8/0x2b0 [ 42.405462] skb_free_head+0x74/0xb0 [ 42.405806] skb_release_data+0x30e/0x3a0 [ 42.406198] skb_release_all+0x4a/0x60 [ 42.406563] consume_skb+0x113/0x2e0 [ 42.406910] skb_free_datagram+0x1a/0xe0 [ 42.407288] netlink_recvmsg+0x60d/0xe40 [ 42.407667] sock_recvmsg+0xd7/0x110 [ 42.408022] ___sys_recvmsg+0x25c/0x580 [ 42.408395] __sys_recvmsg+0xd6/0x190 [ 42.408753] SyS_recvmsg+0x2d/0x50 [ 42.409086] entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.409513] [ 42.409665] The buggy address belongs to the object at ffff88000969e780 [ 42.409665] which belongs to the cache kmalloc-512 of size 512 [ 42.410846] The buggy address is located 24 bytes inside of [ 42.410846] 512-byte region [ffff88000969e780, ffff88000969e980) [ 42.411941] The buggy address belongs to the page: [ 42.412405] page:ffffea000025a780 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 [ 42.413298] flags: 0x100000000008100(slab\|head) [ 42.413729] raw: 0100000000008100 0000000000000000 0000000000000000 00000001800c000c [ 42.414387] raw: ffffea00002a9500 0000000900000007 ffff88000c401280 0000000000000000 [ 42.415074] page dumped because: kasan: bad access detected [ 42.415604] [ 42.415757] Memory state around the buggy address: [ 42.416222] ffff88000969e880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 42.416904] ffff88000969e900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 42.417591] >ffff88000969e980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 42.418273] ^ [ 42.418588] ffff88000969ea00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 42.419273] ffff88000969ea80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 42.419882] ================================================================== Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
David Ahern	df6342be40	net: Improve handling of failures on link and route dumps [ Upstream commit `f6c5775ff0` ] In general, rtnetlink dumps do not anticipate failure to dump a single object (e.g., link or route) on a single pass. As both route and link objects have grown via more attributes, that is no longer a given. netlink dumps can handle a failure if the dump function returns an error; specifically, netlink_dump adds the return code to the response if it is <= 0 so userspace is notified of the failure. The missing piece is the rtnetlink dump functions returning the error. Fix route and link dump functions to return the errors if no object is added to an skb (detected by skb->len != 0). IPv6 route dumps (rt6_dump_route) already return the error; this patch updates IPv4 and link dumps. Other dump functions may need to be ajusted as well. Reported-by: Jan Moskyto Matejka <mq@ucw.cz> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
Christoph Hellwig	a19f55f9bb	net/smc: Add warning about remote memory exposure [ Upstream commit `19a0f7e37c` ] The driver explicitly bypasses APIs to register all memory once a connection is made, and thus allows remote access to memory. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Ursula Braun	516b3ed999	smc: switch to usage of IB_PD_UNSAFE_GLOBAL_RKEY [ Upstream commit `263eec9b2a` ] Currently, SMC enables remote access to physical memory when a user has successfully configured and established an SMC-connection until ten minutes after the last SMC connection is closed. Because this is considered a security risk, drivers are supposed to use IB_PD_UNSAFE_GLOBAL_RKEY in such a case. This patch changes the current SMC code to use IB_PD_UNSAFE_GLOBAL_RKEY. This improves user awareness, but does not remove the security risk itself. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Soheil Hassas Yeganeh	d6cb41cf30	tcp: eliminate negative reordering in tcp_clean_rtx_queue [ Upstream commit `bafbb9c732` ] tcp_ack() can call tcp_fragment() which may dededuct the value tp->fackets_out when MSS changes. When prior_fackets is larger than tp->fackets_out, tcp_clean_rtx_queue() can invoke tcp_update_reordering() with negative values. This results in absurd tp->reodering values higher than sysctl_tcp_max_reordering. Note that tcp_update_reordering indeeds sets tp->reordering to min(sysctl_tcp_max_reordering, metric), but because the comparison is signed, a negative metric always wins. Fixes: `c7caf8d3ed` ("[TCP]: Fix reord detection due to snd_una covered holes") Reported-by: Rebecca Isaacs <risaacs@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Gal Pressman	cc1d2a620d	net/mlx5e: Fix ethtool pause support and advertise reporting [ Upstream commit `e3c1950371` ] Pause bit should set when RX pause is on, not TX pause. Also, setting Asym_Pause is incorrect, and should be turned off. Fixes: `665bc53969` ("net/mlx5e: Use new ethtool get/set link ksettings API") Signed-off-by: Gal Pressman <galp@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Gal Pressman	c8a52aa79a	net/mlx5e: Use the correct pause values for ethtool advertising [ Upstream commit `b383b544f2` ] Query the operational pause from firmware (PFCC register) instead of always passing zeros. Fixes: `665bc53969` ("net/mlx5e: Use new ethtool get/set link ksettings API") Signed-off-by: Gal Pressman <galp@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Douglas Caetano dos Santos	03c10a87f1	net/packet: fix missing net_device reference release [ Upstream commit `d19b183cdc` ] When using a TX ring buffer, if an error occurs processing a control message (e.g. invalid message), the net_device reference is not released. Fixes `c14ac9451c` ("sock: enable timestamping using control messages") Signed-off-by: Douglas Caetano dos Santos <douglascs@taghos.com.br> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Eric Dumazet	703a208274	sctp: do not inherit ipv6_{mc\|ac\|fl}_list from parent [ Upstream commit `fdcee2cbb8` ] SCTP needs fixes similar to `83eaddab43` ("ipv6/dccp: do not inherit ipv6_mc_list from parent"), otherwise bad things can happen. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:00 +02:00
Xin Long	dc9bf5513e	sctp: fix src address selection if using secondary addresses for ipv6 [ Upstream commit `dbc2b5e9a0` ] Commit `0ca50d12fe` ("sctp: fix src address selection if using secondary addresses") has fixed a src address selection issue when using secondary addresses for ipv4. Now sctp ipv6 also has the similar issue. When using a secondary address, sctp_v6_get_dst tries to choose the saddr which has the most same bits with the daddr by sctp_v6_addr_match_len. It may make some cases not work as expected. hostA: [1] fd21:356b:459a:cf10::11 (eth1) [2] fd21:356b:459a:cf20::11 (eth2) hostB: [a] fd21:356b:459a:cf30::2 (eth1) [b] fd21:356b:459a:cf40::2 (eth2) route from hostA to hostB: fd21:356b:459a:cf30::/64 dev eth1 metric 1024 mtu 1500 The expected path should be: fd21:356b:459a:cf10::11 <-> fd21:356b:459a:cf30::2 But addr[2] matches addr[a] more bits than addr[1] does, according to sctp_v6_addr_match_len. It causes the path to be: fd21:356b:459a:cf20::11 <-> fd21:356b:459a:cf30::2 This patch is to fix it with the same way as Marcelo's fix for sctp ipv4. As no ip_dev_find for ipv6, this patch is to use ipv6_chk_addr to check if the saddr is in a dev instead. Note that for backwards compatibility, it will still do the addr_match_len check here when no optimal is found. Reported-by: Patrick Talbert <ptalbert@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:00 +02:00
Jon Paul Maloy	d9cb26d6a3	tipc: make macro tipc_wait_for_cond() smp safe [ Upstream commit `844cf763fb` ] The macro tipc_wait_for_cond() is embedding the macro sk_wait_event() to fulfil its task. The latter, in turn, is evaluating the stated condition outside the socket lock context. This is problematic if the condition is accessing non-trivial data structures which may be altered by incoming interrupts, as is the case with the cong_links() linked list, used by socket to keep track of the current set of congested links. We sometimes see crashes when this list is accessed by a condition function at the same time as a SOCK_WAKEUP interrupt is removing an element from the list. We fix this by expanding selected parts of sk_wait_event() into the outer macro, while ensuring that all evaluations of a given condition are performed under socket lock protection. Fixes: commit `365ad353c2` ("tipc: reduce risk of user starvation during link congestion") Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:00 +02:00
Yuchung Cheng	c91cd74b32	tcp: avoid fragmenting peculiar skbs in SACK [ Upstream commit `b451e5d24b` ] This patch fixes a bug in splitting an SKB during SACK processing. Specifically if an skb contains multiple packets and is only partially sacked in the higher sequences, tcp_match_sack_to_skb() splits the skb and marks the second fragment as SACKed. The current code further attempts rounding up the first fragment to MSS boundaries. But it misses a boundary condition when the rounded-up fragment size (pkt_len) is exactly skb size. Spliting such an skb is pointless and causses a kernel warning and aborts the SACK processing. This patch universally checks such over-split before calling tcp_fragment to prevent these unnecessary warnings. Fixes: `adb92db857` ("tcp: Make SACK code to split only at mss boundaries") Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:00 +02:00
Eric Dumazet	20d699e0ca	net: fix compile error in skb_orphan_partial() [ Upstream commit `9142e9007f` ] If CONFIG_INET is not set, net/core/sock.c can not compile : net/core/sock.c: In function ‘skb_orphan_partial’: net/core/sock.c:1810:2: error: implicit declaration of function ‘skb_is_tcp_pure_ack’ [-Werror=implicit-function-declaration] if (skb_is_tcp_pure_ack(skb)) ^ Fix this by always including <net/tcp.h> Fixes: `f6ba8d33cf` ("netem: fix skb_orphan_partial()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:00 +02:00
Eric Dumazet	e13cb6c25b	netem: fix skb_orphan_partial() [ Upstream commit `f6ba8d33cf` ] I should have known that lowering skb->truesize was dangerous :/ In case packets are not leaving the host via a standard Ethernet device, but looped back to local sockets, bad things can happen, as reported by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 ) So instead of tweaking skb->truesize, lets change skb->destructor and keep a reference on the owner socket via its sk_refcnt. Fixes: `f2f872f927` ("netem: Introduce skb_orphan_partial() helper") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Michael Madsen <mkm@nabto.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
Daniel Borkmann	3bfb04d102	bpf, arm64: fix faulty emission of map access in tail calls [ Upstream commit `d8b54110ee` ] Shubham was recently asking on netdev why in arm64 JIT we don't multiply the index for accessing the tail call map by 8. That led me into testing out arm64 JIT wrt tail calls and it turned out I got a NULL pointer dereference on the tail call. The buggy access is at: prog = array->ptrs[index]; if (prog == NULL) goto out; [...] 00000060: d2800e0a mov x10, #0x70 // #112 00000064: f86a682a ldr x10, [x1,x10] 00000068: f862694b ldr x11, [x10,x2] 0000006c: b40000ab cbz x11, 0x00000080 [...] The code triggering the crash is f862694b. x1 at the time contains the address of the bpf array, x10 offsetof(struct bpf_array, ptrs). Meaning, above we load the pointer to the program at map slot 0 into x10. x10 can then be NULL if the slot is not occupied, which we later on try to access with a user given offset in x2 that is the map index. Fix this by emitting the following instead: [...] 00000060: d2800e0a mov x10, #0x70 // #112 00000064: 8b0a002a add x10, x1, x10 00000068: d37df04b lsl x11, x2, #3 0000006c: f86b694b ldr x11, [x10,x11] 00000070: b40000ab cbz x11, 0x00000084 [...] This basically adds the offset to ptrs to the base address of the bpf array we got and we later on access the map with an index * 8 offset relative to that. The tail call map itself is basically one large area with meta data at the head followed by the array of prog pointers. This makes tail calls working again, tested on Cavium ThunderX ARMv8. Fixes: `ddb55992b0` ("arm64: bpf: implement bpf_tail_call() helper") Reported-by: Shubham Bansal <illusionist.neo@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
Ursula Braun	617461cac6	s390/qeth: add missing hash table initializations [ Upstream commit `ebccc7397e` ] commit `5f78e29cee` ("qeth: optimize IP handling in rx_mode callback") added new hash tables, but missed to initialize them. Fixes: `5f78e29cee` ("qeth: optimize IP handling in rx_mode callback") Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
Julian Wiedmann	b8a9a79c89	s390/qeth: avoid null pointer dereference on OSN [ Upstream commit `25e2c341e7` ] Access card->dev only after checking whether's its valid. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
Julian Wiedmann	9b1042aa59	s390/qeth: unbreak OSM and OSN support [ Upstream commit `2d2ebb3ed0` ] commit `b4d72c08b3` ("qeth: bridgeport support - basic control") broke the support for OSM and OSN devices as follows: As OSM and OSN are L2 only, qeth_core_probe_device() does an early setup by loading the l2 discipline and calling qeth_l2_probe_device(). In this context, adding the l2-specific bridgeport sysfs attributes via qeth_l2_create_device_attributes() hits a BUG_ON in fs/sysfs/group.c, since the basic sysfs infrastructure for the device hasn't been established yet. Note that OSN actually has its own unique sysfs attributes (qeth_osn_devtype), so the additional attributes shouldn't be created at all. For OSM, add a new qeth_l2_devtype that contains all the common and l2-specific sysfs attributes. When qeth_core_probe_device() does early setup for OSM or OSN, assign the corresponding devtype so that the ccwgroup probe code creates the full set of sysfs attributes. This allows us to skip qeth_l2_create_device_attributes() in case of an early setup. Any device that can't do early setup will initially have only the generic sysfs attributes, and when it's probed later qeth_l2_probe_device() adds the l2-specific attributes. If an early-setup device is removed (by calling ccwgroup_ungroup()), device_unregister() will - using the devtype - delete the l2-specific attributes before qeth_l2_remove_device() is called. So make sure to not remove them twice. What complicates the issue is that qeth_l2_probe_device() and qeth_l2_remove_device() is also called on a device when its layer2 attribute changes (ie. its layer mode is switched). For early-setup devices this wouldn't work properly - we wouldn't remove the l2-specific attributes when switching to L3. But switching the layer mode doesn't actually make any sense; we already decided that the device can only operate in L2! So just refuse to switch the layer mode on such devices. Note that OSN doesn't have a layer2 attribute, so we only need to special-case OSM. Based on an initial patch by Ursula Braun. Fixes: `b4d72c08b3` ("qeth: bridgeport support - basic control") Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
Ursula Braun	41b81ff056	s390/qeth: handle sysfs error during initialization [ Upstream commit `9111e7880c` ] When setting up the device from within the layer discipline's probe routine, creating the layer-specific sysfs attributes can fail. Report this error back to the caller, and handle it by releasing the layer discipline. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> [jwi: updated commit msg, moved an OSN change to a subsequent patch] Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
WANG Cong	8e929937f8	ipv6/dccp: do not inherit ipv6_mc_list from parent [ Upstream commit `83eaddab43` ] Like commit `657831ffc3` ("dccp/tcp: do not inherit mc_list from parent") we should clear ipv6_mc_list etc. for IPv6 sockets too. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:58 +02:00
Gao Feng	03a275f5aa	driver: vrf: Fix one possible use-after-free issue [ Upstream commit `1a4a5bf52a` ] The current codes only deal with the case that the skb is dropped, it may meet one use-after-free issue when NF_HOOK returns 0 that means the skb is stolen by one netfilter rule or hook. When one netfilter rule or hook stoles the skb and return NF_STOLEN, it means the skb is taken by the rule, and other modules should not touch this skb ever. Maybe the skb is queued or freed directly by the rule. Now uses the nf_hook instead of NF_HOOK to get the result of netfilter, and check the return value of nf_hook. Only when its value equals 1, it means the skb could go ahead. Or reset the skb as NULL. BTW, because vrf_rcv_finish is empty function, so needn't invoke it even though nf_hook returns 1. But we need to modify vrf_rcv_finish to deal with the NF_STOLEN case. There are two cases when skb is stolen. 1. The skb is stolen and freed directly. There is nothing we need to do, and vrf_rcv_finish isn't invoked. 2. The skb is queued and reinjected again. The vrf_rcv_finish would be invoked as okfn, so need to free the skb in it. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:58 +02:00
Eric Dumazet	db8ebc6da8	dccp/tcp: do not inherit mc_list from parent [ Upstream commit `657831ffc3` ] syzkaller found a way to trigger double frees from ip_mc_drop_socket() It turns out that leave a copy of parent mc_list at accept() time, which is very bad. Very similar to commit `8b485ce698` ("tcp: do not inherit fastopen_req from parent") Initial report from Pray3r, completed by Andrey one. Thanks a lot to them ! Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Pray3r <pray3r.z@gmail.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:58 +02:00
Greg Kroah-Hartman	b43ae25b70	Linux 4.11.3	2017-05-25 15:46:45 +02:00
Tadeusz Struk	8d6d97abb8	IB/hfi1: Protect the global dev_cntr_names and port_cntr_names commit `62eed66e98` upstream. Protect the global dev_cntr_names and port_cntr_names with the global mutex as they are allocated and freed in a function called per device. Otherwise there is a danger of double free and memory leaks. Fixes: Commit `b7481944b0` ("IB/hfi1: Show statistics counters under IB stats interface") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
Chris Wilson	b770585918	drm/i915/gvt: Disable access to stolen memory as a guest commit `04a68a35ce` upstream. Explicitly disable stolen memory when running as a guest in a virtual machine, since the memory is not mediated between clients and reserved entirely for the host. The actual size should be reported as zero, but like every other quirk we want to tell the user what is happening. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99028 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161109103905.17860-1-chris@chris-wilson.co.uk Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
Julius Werner	846d6e12d0	drivers: char: mem: Check for address space wraparound with mmap() commit `b299cde245` upstream. /dev/mem currently allows mmap() mappings that wrap around the end of the physical address space, which should probably be illegal. It circumvents the existing STRICT_DEVMEM permission check because the loop immediately terminates (as the start address is already higher than the end address). On the x86_64 architecture it will then cause a panic (from the BUG(start >= end) in arch/x86/mm/pat.c:reserve_memtype()). This patch adds an explicit check to make sure offset + size will not wrap around in the physical address type. Signed-off-by: Julius Werner <jwerner@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
Trond Myklebust	326842c019	nfsd: Fix up the "supattr_exclcreat" attributes commit `b26b78cb72` upstream. If an NFSv4 client asks us for the supattr_exclcreat, then we must not return attributes that are unsupported by this minor version. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Fixes: `75976de655` ("NFSD: Return word2 bitmask if setting security..,") Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
J. Bruce Fields	9a4723626e	nfsd: encoders mustn't use unitialized values in error cases commit `f961e3f2ac` upstream. In error cases, lgp->lg_layout_type may be out of bounds; so we shouldn't be using it until after the check of nfserr. This was seen to crash nfsd threads when the server receives a LAYOUTGET request with a large layout type. GETDEVICEINFO has the same problem. Reported-by: Ari Kauppi <Ari.Kauppi@synopsys.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
Ari Kauppi	06cc61e8f9	nfsd: fix undefined behavior in nfsd4_layout_verify commit `b550a32e60` upstream. UBSAN: Undefined behaviour in fs/nfsd/nfs4proc.c:1262:34 shift exponent 128 is too large for 32-bit type 'int' Depending on compiler+architecture, this may cause the check for layout_type to succeed for overly large values (which seems to be the case with amd64). The large value will be later used in de-referencing nfsd4_layout_ops for function pointers. Reported-by: Jani Tuovila <tuovila@synopsys.com> Signed-off-by: Ari Kauppi <ari@synopsys.com> [colin.king@canonical.com: use LAYOUT_TYPE_MAX instead of 32] Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
Trond Myklebust	af069f63c5	NFSv4: Fix an rcu lock leak commit `2e84611b3f` upstream. The intention in the original patch was to release the lock when we put the inode, however something got screwed up. Reported-by: Jason Yan <yanaijie@huawei.com> Fixes: `7b410d9ce4` ("pNFS: Delay getting the layout header in..") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
Trond Myklebust	89207ffd22	pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect commit `260f32adb8` upstream. The check in nfs4_ff_layout_prepare_ds() seems to be missing. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Fixes: `a33e4b036d` ("pNFS: return status from nfs4_pnfs_ds_connect") Cc: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Benjamin Coddington	bc38735f21	NFS: Use GFP_NOIO for two allocations in writeback commit `ae97aa524e` upstream. Prevent a deadlock that can occur if we wait on allocations that try to write back our pages. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: `00bfa30abe` ("NFS: Create a common pgio_alloc and pgio_release...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Fred Isaman	2aa6b5f2ac	NFS: Fix use after free in write error path commit `1f84ccdf37` upstream. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Fixes: `0bcbf039f6` ("nfs: handle request add failure properly") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Trond Myklebust	e7ec98b137	NFSv4: Fix a hang in OPEN related to server reboot commit `56e0d71ef1` upstream. If the server fails to return the attributes as part of an OPEN reply, and then reboots, we can end up hanging. The reason is that the client attempts to send a GETATTR in order to pick up the missing OPEN call, but fails to release the slot first, causing reboot recovery to deadlock. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Fixes: `2e80dbe7ac` ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Mario Kleiner	acef1b8125	drm/edid: Add 10 bpc quirk for LGD 764 panel in HP zBook 17 G2 commit `e345da82bd` upstream. The builtin eDP panel in the HP zBook 17 G2 supports 10 bpc, as advertised by the Laptops product specs and verified via injecting a fixed edid + photometer measurements, but edid reports unknown depth, so drivers fall back to 6 bpc. Add a quirk to get the full 10 bpc. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1492787108-23959-1-git-send-email-mario.kleiner.de@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Alexander Couzens	9c92bffbd8	mtd: nand: add ooblayout for old hamming layout commit `6a623e0769` upstream. The old 1-bit hamming layout requires ECC data to be placed at a fixed offset, and not necessarily at the end of the OOB area. Add this old layout back in order to fix legacy setups. Fixes: `41b207a70d` ("mtd: nand: implement the default mtd_ooblayout_ops") Signed-off-by: Alexander Couzens <lynxis@fe80.eu> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Roger Quadros	2d3e850373	mtd: nand: omap2: Fix partition creation via cmdline mtdparts commit `2d283ede59` upstream. commit `c9711ec525` ("mtd: nand: omap: Clean up device tree support") caused the parent device name to be changed from "omap2-nand.0" to "<base address>.nand" (e.g. 30000000.nand on omap3 platforms). This caused mtd->name to be changed as well. This breaks partition creation via mtdparts passed by u-boot as it uses "omap2-nand.0" for the mtd-id. Fix this by explicitly setting the mtd->name to "omap2-nand.<CS number>" if it isn't already set by nand_set_flash_node(). CS number is the NAND controller instance ID. Fixes: `c9711ec525` ("mtd: nand: omap: Clean up device tree support") Reported-by: Leto Enrico <enrico.leto@siemens.com> Reported-by: Adam Ford <aford173@gmail.com> Suggested-by: Boris Brezillon <boris.brezillon@free-electrons.com> Tested-by: Adam Ford <aford173@gmail.com> Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Simon Baatz	750d21357e	mtd: nand: orion: fix clk handling commit `675b11d94c` upstream. The clk handling in orion_nand.c had two problems: - In the probe function, clk_put() was called for an enabled clock, which violates the API (see documentation for clk_put() in include/linux/clk.h) - In the error path of the probe function, clk_put() could be called twice for the same clock. In order to clean this up, use the managed function devm_clk_get() and store the pointer to the clk in the driver data. Fixes: `baffab28b1` ('ARM: Orion: fix driver probe error handling with respect to clk') Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Lukas Wunner	6003ac5842	PCI: Freeze PME scan before suspending devices commit `ea00353f36` upstream. Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790) crashes during suspend tests. Geert Uytterhoeven managed to reproduce the issue on an M2-W Koelsch board (r8a7791): It occurs when the PME scan runs, once per second. During PME scan, the PCI host bridge (rcar-pci) registers are accessed while its module clock has already been disabled, leading to the crash. One reproducer is to configure s2ram to use "s2idle" instead of "deep" suspend: # echo 0 > /sys/module/printk/parameters/console_suspend # echo s2idle > /sys/power/mem_sleep # echo mem > /sys/power/state Another reproducer is to write either "platform" or "processors" to /sys/power/pm_test. It does not (or is less likely) to happen during full system suspend ("core" or "none") because system suspend also disables timers, and thus the workqueue handling PME scans no longer runs. Geert believes the issue may still happen in the small window between disabling module clocks and disabling timers: # echo 0 > /sys/module/printk/parameters/console_suspend # echo platform > /sys/power/pm_test # Or "processors" # echo mem > /sys/power/state (Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.) Rafael Wysocki agrees that PME scans should be suspended before the host bridge registers become inaccessible. To that end, queue the task on a workqueue that gets frozen before devices suspend. Rafael notes however that as a result, some wakeup events may be missed if they are delivered via PME from a device without working IRQ (which hence must be polled) and occur after the workqueue has been frozen. If that turns out to be an issue in practice, it may be possible to solve it by calling pci_pme_list_scan() once directly from one of the host bridge's pm_ops callbacks. Stacktrace for posterity: PM: Syncing filesystems ... [ 38.566237] done. PM: Preparing system for sleep (mem) Freezing user space processes ... [ 38.579813] (elapsed 0.001 seconds) done. Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. PM: Suspending system (mem) PM: suspend of devices complete after 152.456 msecs PM: late suspend of devices complete after 2.809 msecs PM: noirq suspend of devices complete after 29.863 msecs suspend debug: Waiting for 5 second(s). Unhandled fault: asynchronous external abort (0x1211) at 0x00000000 pgd = c0003000 [00000000] pgd=80000040004003, pmd=00000000 Internal error: : 1211 [#1] SMP ARM Modules linked in: CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted 4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383 Hardware name: Generic R8A7791 (Flattened Device Tree) Workqueue: events pci_pme_list_scan task: eb56e140 task.stack: eb58e000 PC is at pci_generic_config_read+0x64/0x6c LR is at rcar_pci_cfg_base+0x64/0x84 pc : [<c041d7b4>] lr : [<c04309a0>] psr: 600d0093 sp : eb58fe98 ip : c041d750 fp : 00000008 r10: c0e2283c r9 : 00000000 r8 : 600d0013 r7 : 00000008 r6 : eb58fed6 r5 : 00000002 r4 : eb58feb4 r3 : 00000000 r2 : 00000044 r1 : 00000008 r0 : 00000000 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 30c5387d Table: 6a9f6c80 DAC: 55555555 Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210) Stack: (0xeb58fe98 to 0xeb590000) fe80: 00000002 00000044 fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000 fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830 fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100 ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000 ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380 ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000 ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0 ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000 ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd [<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>] (pci_bus_read_config_word+0x58/0x80) [<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>] (pci_check_pme_status+0x34/0x78) [<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54) [<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4) [<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>] (process_one_work+0x1bc/0x308) [<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0) [<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc) [<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c) Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000) ---[ end trace 667d43ba3aa9e589 ]--- Fixes: `df17e62e5b` ("PCI: Add support for polling PME state on suspended legacy PCI devices") Reported-and-tested-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Cc: Simon Horman <horms+renesas@verge.net.au> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
David Woodhouse	f21143a53f	PCI: Only allow WC mmap on prefetchable resources commit `cef4d02305` upstream. The /proc/bus/pci mmap interface allows the user to specify whether they want WC or not. Don't let them do so on non-prefetchable BARs. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
David Woodhouse	bee38f0f37	PCI: Fix another sanity check bug in /proc/pci mmap commit `17caf56731` upstream. Don't match MMIO maps with I/O BARs and vice versa. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
David Woodhouse	b87e133e3d	PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platforms commit `6bccc7f426` upstream. In the PCI_MMAP_PROCFS case when the address being passed by the user is a 'user visible' resource address based on the bus window, and not the actual contents of the resource, that's what we need to be checking it against. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
K. Y. Srinivasan	3537380a05	PCI: hv: Specify CPU_AFFINITY_ALL for MSI affinity when >= 32 CPUs commit `433fcf6b7b` upstream. When we have 32 or more CPUs in the affinity mask, we should use a special constant to specify that to the host. Fix this issue. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
K. Y. Srinivasan	9dc0babcd2	PCI: hv: Allocate interrupt descriptors with GFP_ATOMIC commit `59c58ceeea` upstream. The memory allocation here needs to be non-blocking. Fix the issue. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
Tomasz Nowicki	569ed828de	PCI/ACPI: Add ThunderX pass2.x 2nd node MCFG quirk commit `cd18374048` upstream. Currently SoCs pass2.x do not emulate EA headers for ACPI boot method at all. However, for pass2.x some devices (like EDAC) advertise incorrect base addresses in their BARs which results in driver probe failure during resource request. Since all problematic blocks are on 2nd NUMA node under domain 10 add necessary quirk entry to obtain BAR addresses correction using EA header emulation. Fixes: `44f22bd91e` ("PCI: Add MCFG quirks for Cavium ThunderX pass2.x host controller") Signed-off-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
Bjorn Helgaas	b4447dbb5c	PCI/ACPI: Tidy up MCFG quirk whitespace commit `ced414a14f` upstream. With no blank lines, it's not obvious where the macro definitions end and the uses begin. Add some blank lines and reorder the ThunderX definitions. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
Dawei Chien	d76f918859	thermal: mt8173: minor mtk_thermal.c cleanups commit `05d7839aa2` upstream. If thermal bank with 4 sensors, thermal driver should read TEMP_MSR3. However, currently thermal driver would not read TEMP_MSR3 since mt8173 thermal driver only use 3 sensors on each thermal bank at the same time, so this patch would not effect temperature. Only if mt mt8173 thermal driver use 4 sensors on any thermal bank, would read third sensor two times, and lose fourth sensor of vale. Fixes: `b7cf005373` ("thermal: Add Mediatek thermal driver for mt2701.") Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Dawei Chien <dawei.chien@mediatek.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
Thomas Gleixner	63ab09e39d	tracing/kprobes: Enforce kprobes teardown after testing commit `30e7d894c1` upstream. Enabling the tracer selftest triggers occasionally the warning in text_poke(), which warns when the to be modified page is not marked reserved. The reason is that the tracer selftest installs kprobes on functions marked __init for testing. These probes are removed after the tests, but that removal schedules the delayed kprobes_optimizer work, which will do the actual text poke. If the work is executed after the init text is freed, then the warning triggers. The bug can be reproduced reliably when the work delay is increased. Flush the optimizer work and wait for the optimizing/unoptimizing lists to become empty before returning from the kprobes tracer selftest. That ensures that all operations which were queued due to the probes removal have completed. Link: http://lkml.kernel.org/r/20170516094802.76a468bb@gandalf.local.home Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: `6274de498` ("kprobes: Support delayed unoptimizing") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
Arnd Bergmann	18b4b0ab65	firmware: ti_sci: fix strncat length check commit `76cefef8e8` upstream. gcc-7 notices that the length we pass to strncat is wrong: drivers/firmware/ti_sci.c: In function 'ti_sci_probe': drivers/firmware/ti_sci.c:204:32: error: specified bound 50 equals the size of the destination [-Werror=stringop-overflow=] Instead of the total length, we must pass the length of the remaining space here. Fixes: `aa276781a6` ("firmware: Add basic support for TI System Control Interface (TI-SCI) protocol") Acked-by: Nishanth Menon <nm@ti.com> Acked-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Masami Hiramatsu	e57f049b09	um: Fix to call read_initrd after init_bootmem commit `5b4236e17c` upstream. Since read_initrd() invokes alloc_bootmem() for allocating memory to load initrd image, it must be called after init_bootmem. This makes read_initrd() called directly from setup_arch() after init_bootmem() and mem_total_pages(). Fixes: `b63236972e` ("um: Setup physical memory in setup_arch()") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Lars Ellenberg	ece6ff2261	drbd: fix request leak introduced by locking/atomic, kref: Kill kref_sub() commit `a00ebd1cf1` upstream. When killing kref_sub(), the unconditional additional kref_get() was not properly paired with the necessary kref_put(), causing a leak of struct drbd_requests (~ 224 Bytes) per submitted bio, and breaking DRBD in general, as the destructor of those "drbd_requests" does more than just the mempoll_free(). Fixes: `bdfafc4ffd` ("locking/atomic, kref: Kill kref_sub()") Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Al Viro	7811108277	osf_wait4(): fix infoleak commit `a8c39544a6` upstream. failing sys_wait4() won't fill struct rusage... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Suzuki K Poulose	90c676ffab	kvm: arm/arm64: Force reading uncached stage2 PGD commit `2952a6070e` upstream. Make sure we don't use a cached value of the KVM stage2 PGD while resetting the PGD. Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Suzuki K Poulose	59e1faddb3	kvm: arm/arm64: Fix use after free of stage2 page table commit `0c428a6a92` upstream. We yield the kvm->mmu_lock occassionaly while performing an operation (e.g, unmap or permission changes) on a large area of stage2 mappings. However this could possibly cause another thread to clear and free up the stage2 page tables while we were waiting for regaining the lock and thus the original thread could end up in accessing memory that was freed. This patch fixes the problem by making sure that the stage2 pagetable is still valid after we regain the lock. The fact that mmu_notifer->release() could be called twice (via __mmu_notifier_release and mmu_notifier_unregsister) enhances the possibility of hitting this race where there are two threads trying to unmap the entire guest shadow pages. While at it, cleanup the redudant checks around cond_resched_lock in stage2_wp_range(), as cond_resched_lock already does the same checks. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: andreyknvl@google.com Cc: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Suzuki K Poulose	3bbe9f26a0	kvm: arm/arm64: Fix race in resetting stage2 PGD commit `6c0d706b56` upstream. In kvm_free_stage2_pgd() we check the stage2 PGD before holding the lock and proceed to take the lock if it is valid. And we unmap the page tables, followed by releasing the lock. We reset the PGD only after dropping this lock, which could cause a race condition where another thread waiting on or even holding the lock, could potentially see that the PGD is still valid and proceed to perform a stage2 operation and later encounter a NULL PGD. [223090.242280] Unable to handle kernel NULL pointer dereference at virtual address 00000040 [223090.262330] PC is at unmap_stage2_range+0x8c/0x428 [223090.262332] LR is at kvm_unmap_hva_handler+0x2c/0x3c [223090.262531] Call trace: [223090.262533] [<ffff0000080adb78>] unmap_stage2_range+0x8c/0x428 [223090.262535] [<ffff0000080adf40>] kvm_unmap_hva_handler+0x2c/0x3c [223090.262537] [<ffff0000080ace2c>] handle_hva_to_gpa+0xb0/0x104 [223090.262539] [<ffff0000080af988>] kvm_unmap_hva+0x5c/0xbc [223090.262543] [<ffff0000080a2478>] kvm_mmu_notifier_invalidate_page+0x50/0x8c [223090.262547] [<ffff0000082274f8>] __mmu_notifier_invalidate_page+0x5c/0x84 [223090.262551] [<ffff00000820b700>] try_to_unmap_one+0x1d0/0x4a0 [223090.262553] [<ffff00000820c5c8>] rmap_walk+0x1cc/0x2e0 [223090.262555] [<ffff00000820c90c>] try_to_unmap+0x74/0xa4 [223090.262557] [<ffff000008230ce4>] migrate_pages+0x31c/0x5ac [223090.262561] [<ffff0000081f869c>] compact_zone+0x3fc/0x7ac [223090.262563] [<ffff0000081f8ae0>] compact_zone_order+0x94/0xb0 [223090.262564] [<ffff0000081f91c0>] try_to_compact_pages+0x108/0x290 [223090.262569] [<ffff0000081d5108>] __alloc_pages_direct_compact+0x70/0x1ac [223090.262571] [<ffff0000081d64a0>] __alloc_pages_nodemask+0x434/0x9f4 [223090.262572] [<ffff0000082256f0>] alloc_pages_vma+0x230/0x254 [223090.262574] [<ffff000008235e5c>] do_huge_pmd_anonymous_page+0x114/0x538 [223090.262576] [<ffff000008201bec>] handle_mm_fault+0xd40/0x17a4 [223090.262577] [<ffff0000081fb324>] __get_user_pages+0x12c/0x36c [223090.262578] [<ffff0000081fb804>] get_user_pages_unlocked+0xa4/0x1b8 [223090.262579] [<ffff0000080a3ce8>] __gfn_to_pfn_memslot+0x280/0x31c [223090.262580] [<ffff0000080a3dd0>] gfn_to_pfn_prot+0x4c/0x5c [223090.262582] [<ffff0000080af3f8>] kvm_handle_guest_abort+0x240/0x774 [223090.262584] [<ffff0000080b2bac>] handle_exit+0x11c/0x1ac [223090.262586] [<ffff0000080ab99c>] kvm_arch_vcpu_ioctl_run+0x31c/0x648 [223090.262587] [<ffff0000080a1d78>] kvm_vcpu_ioctl+0x378/0x768 [223090.262590] [<ffff00000825df5c>] do_vfs_ioctl+0x324/0x5a4 [223090.262591] [<ffff00000825e26c>] SyS_ioctl+0x90/0xa4 [223090.262595] [<ffff000008085d84>] el0_svc_naked+0x38/0x3c This patch moves the stage2 PGD manipulation under the lock. Reported-by: Alexander Graf <agraf@suse.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Huacai Chen	a24a206192	MIPS: Loongson-3: Select MIPS_L1_CACHE_SHIFT_6 commit `17c99d9421` upstream. Some newer Loongson-3 have 64 bytes cache lines, so select MIPS_L1_CACHE_SHIFT_6. Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: John Crispin <john@phrozen.org> Cc: Steven J . Hill <Steven.Hill@caviumnetworks.com> Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15755/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Jon Derrick	7ce12d8b97	nvme: unmap CMB and remove sysfs file in reset path commit `f63572dff1` upstream. CMB doesn't get unmapped until removal while getting remapped on every reset. Add the unmapping and sysfs file removal to the reset path in nvme_pci_disable to match the mapping path in nvme_pci_enable. Fixes: `202021c1a` ("nvme : Add sysfs entry for NVMe CMBs when appropriate") Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Acked-by: Keith Busch <keith.busch@intel.com> Reviewed-By: Stephen Bates <sbates@raithlin.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Thomas Gleixner	ef1f1648fe	genirq: Fix chained interrupt data ordering commit `2c4569ca26` upstream. irq_set_chained_handler_and_data() sets up the chained interrupt and then stores the handler data. That's racy against an immediate interrupt which gets handled before the store of the handler data happened. The handler will dereference a NULL pointer and crash. Cure it by storing handler data before installing the chained handler. Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Johan Hovold	3a97bedaa3	uwb: fix device quirk on big-endian hosts commit `41318a2b82` upstream. Add missing endianness conversion when using the USB device-descriptor idProduct field to apply a hardware quirk. Fixes: `1ba47da527` ("uwb: add the i1480 DFU driver") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Daniel Micay	4c44438b56	stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms commit `5ea30e4e58` upstream. The stack canary is an 'unsigned long' and should be fully initialized to random data rather than only 32 bits of random data. Signed-off-by: Daniel Micay <danielmicay@gmail.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Arjan van Ven <arjan@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170504133209.3053-1-danielmicay@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
James Hogan	88f7b253a9	metag/uaccess: Check access_ok in strncpy_from_user commit `3a158a62da` upstream. The metag implementation of strncpy_from_user() doesn't validate the src pointer, which could allow reading of arbitrary kernel memory. Add a short access_ok() check to prevent that. Its still possible for it to read across the user/kernel boundary, but it will invariably reach a NUL character after only 9 bytes, leaking only a static kernel address being loaded into D0Re0 at the beginning of __start, which is acceptable for the immediate fix. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-metag@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
James Hogan	160f7df55b	metag/uaccess: Fix access_ok() commit `8a8b56638b` upstream. The __user_bad() macro used by access_ok() has a few corner cases noticed by Al Viro where it doesn't behave correctly: - The kernel range check has off by 1 errors which permit access to the first and last byte of the kernel mapped range. - The kernel range check ends at LINCORE_BASE rather than META_MEMORY_LIMIT, which is ineffective when the kernel is in global space (an extremely uncommon configuration). There are a couple of other shortcomings here too: - Access to the whole of the other address space is permitted (i.e. the global half of the address space when the kernel is in local space). This isn't ideal as it could theoretically still contain privileged mappings set up by the bootloader. - The size argument is unused, permitting user copies which start on valid pages at the end of the user address range and cross the boundary into the kernel address space (e.g. addr = 0x3ffffff0, size > 0x10). It isn't very convenient to add size checks when disallowing certain regions, and it seems far safer to be sure and explicit about what userland is able to access, so invert the logic to allow certain regions instead, and fix the off by 1 errors and missing size checks. This also allows the get_fs() == KERNEL_DS check to be more easily optimised into the user address range case. We now have 3 such allowed regions: - The user address range (incorporating the get_fs() == KERNEL_DS check). - NULL (some kernel code expects this to work, and we'll always catch the fault anyway). - The core code memory region. Fixes: `373cd784d0` ("metag: Memory handling") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-metag@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Li, Fei	4b70137931	cpuidle: check dev before usage in cpuidle_use_deepest_state() commit `41dc750ea6` upstream. In case of there is no cpuidle devices registered, dev will be null, and panic will be triggered like below; In this patch, add checking of dev before usage, like that done in cpuidle_idle_call. Panic without fix: [ 184.961328] BUG: unable to handle kernel NULL pointer dereference at (null) [ 184.961328] IP: cpuidle_use_deepest_state+0x30/0x60 ... [ 184.961328] play_idle+0x8d/0x210 [ 184.961328] ? __schedule+0x359/0x8e0 [ 184.961328] ? _raw_spin_unlock_irqrestore+0x28/0x50 [ 184.961328] ? kthread_queue_delayed_work+0x41/0x80 [ 184.961328] clamp_idle_injection_func+0x64/0x1e0 Fixes: `bb8313b603` (cpuidle: Allow enforcing deepest idle state selection) Signed-off-by: Li, Fei <fei.li@intel.com> Tested-by: Shi, Feng <fengx.shi@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
KarimAllah Ahmed	4fd7fd47a0	iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappings commit `f73a7eee90` upstream. Ever since commit `091d42e43d` ("iommu/vt-d: Copy translation tables from old kernel") the kdump kernel copies the IOMMU context tables from the previous kernel. Each device mappings will be destroyed once the driver for the respective device takes over. This unfortunately breaks the workflow of mapping and unmapping a new context to the IOMMU. The mapping function assumes that either: 1) Unmapping did the proper IOMMU flushing and it only ever flush if the IOMMU unit supports caching invalid entries. 2) The system just booted and the initialization code took care of flushing all IOMMU caches. This assumption is not true for the kdump kernel since the context tables have been copied from the previous kernel and translations could have been cached ever since. So make sure to flush the IOTLB as well when we destroy these old copied mappings. Cc: Joerg Roedel <joro@8bytes.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Fixes: `091d42e43d` ("iommu/vt-d: Copy translation tables from old kernel") Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Malcolm Priestley	1cfc2a16c5	staging: rtl8192e: GetTs Fix invalid TID 7 warning. commit `95d93e271d` upstream. TID 7 is a valid value for QoS IEEE 802.11e. The switch statement that follows states 7 is valid. Remove function IsACValid and use the default case to filter invalid TIDs. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Malcolm Priestley	7369a3b5ec	staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD. commit `90be652c9f` upstream. EPROM_CMD is 2 byte aligned on PCI map so calling with rtl92e_readl will return invalid data so use rtl92e_readw. The device is unable to select the right eeprom type. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Malcolm Priestley	3e57290a19	staging: rtl8192e: fix 2 byte alignment of register BSSIDR. commit `867510bde1` upstream. BSSIDR has two byte alignment on PCI ioremap correct the write by swapping to 16 bits first. This fixes a problem that the device associates fail because the filter is not set correctly. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Malcolm Priestley	d92507d9e5	staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory. commit `baabd567f8` upstream. The driver attempts to alter memory that is mapped to PCI device. This is because tx_fwinfo_8190pci points to skb->data Move the pci_map_single to when completed buffer is ready to be mapped with psdec is empty to drop on mapping error. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Phil Elwell	e478ae4e63	staging: vc04_services: Fix bulk cache maintenance commit `ff92b9e3c9` upstream. vchiq_arm supports transfers less than one page and at arbitrary alignment, using the dma-mapping API to perform its cache maintenance (even though the VPU drives the DMA hardware). Read (DMA_FROM_DEVICE) operations use cache invalidation for speed, falling back to clean+invalidate on partial cache lines, with writes (DMA_TO_DEVICE) using flushes. If a read transfer has ends which aren't page-aligned, performing cache maintenance as if they were whole pages can lead to memory corruption since the partial cache lines at the ends (and any cache lines before or after the transfer area) will be invalidated. This bug was masked until the disabling of the cache flush in flush_dcache_page(). Honouring the requested transfer start- and end-points prevents the corruption. Fixes: `cf9caf1929` ("staging: vc04_services: Replace dmac_map_area with dmac_map_sg") Signed-off-by: Phil Elwell <phil@raspberrypi.org> Reported-by: Stefan Wahren <stefan.wahren@i2se.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Kristina Martsenko	e3b8dd546d	arm64: documentation: document tagged pointer stack constraints commit `f0e421b1bf` upstream. Some kernel features don't currently work if a task puts a non-zero address tag in its stack pointer, frame pointer, or frame record entries (FP, LR). For example, with a tagged stack pointer, the kernel can't deliver signals to the process, and the task is killed instead. As another example, with a tagged frame pointer or frame records, perf fails to generate call graphs or resolve symbols. For now, just document these limitations, instead of finding and fixing everything that doesn't work, as it's not known if anyone needs to use tags in these places anyway. In addition, as requested by Dave Martin, generalize the limitations into a general kernel address tag policy, and refactor tagged-pointers.txt to include it. Fixes: `d50240a5f6` ("arm64: mm: permit use of tagged pointers at EL0") Reviewed-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Kristina Martsenko	45b1f35406	arm64: entry: improve data abort handling of tagged pointers commit `276e93279a` upstream. When handling a data abort from EL0, we currently zero the top byte of the faulting address, as we assume the address is a TTBR0 address, which may contain a non-zero address tag. However, the address may be a TTBR1 address, in which case we should not zero the top byte. This patch fixes that. The effect is that the full TTBR1 address is passed to the task's signal handler (or printed out in the kernel log). When handling a data abort from EL1, we leave the faulting address intact, as we assume it's either a TTBR1 address or a TTBR0 address with tag 0x00. This is true as far as I'm aware, we don't seem to access a tagged TTBR0 address anywhere in the kernel. Regardless, it's easy to forget about address tags, and code added in the future may not always remember to remove tags from addresses before accessing them. So add tag handling to the EL1 data abort handler as well. This also makes it consistent with the EL0 data abort handler. Fixes: `d50240a5f6` ("arm64: mm: permit use of tagged pointers at EL0") Reviewed-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Kristina Martsenko	684f9dee46	arm64: hw_breakpoint: fix watchpoint matching for tagged pointers commit `7dcd9dd8ce` upstream. When we take a watchpoint exception, the address that triggered the watchpoint is found in FAR_EL1. We compare it to the address of each configured watchpoint to see which one was hit. The configured watchpoint addresses are untagged, while the address in FAR_EL1 will have an address tag if the data access was done using a tagged address. The tag needs to be removed to compare the address to the watchpoints. Currently we don't remove it, and as a result can report the wrong watchpoint as being hit (specifically, always either the highest TTBR0 watchpoint or lowest TTBR1 watchpoint). This patch removes the tag. Fixes: `d50240a5f6` ("arm64: mm: permit use of tagged pointers at EL0") Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Kristina Martsenko	5de6ca05b6	arm64: traps: fix userspace cache maintenance emulation on a tagged pointer commit `81cddd65b5` upstream. When we emulate userspace cache maintenance in the kernel, we can currently send the task a SIGSEGV even though the maintenance was done on a valid address. This happens if the address has a non-zero address tag, and happens to not be mapped in. When we get the address from a user register, we don't currently remove the address tag before performing cache maintenance on it. If the maintenance faults, we end up in either __do_page_fault, where find_vma can't find the VMA if the address has a tag, or in do_translation_fault, where the tagged address will appear to be above TASK_SIZE. In both cases, the address is not mapped in, and the task is sent a SIGSEGV. This patch removes the tag from the address before using it. With this patch, the fault is handled correctly, the address gets mapped in, and the cache maintenance succeeds. As a second bug, if cache maintenance (correctly) fails on an invalid tagged address, the address gets passed into arm64_notify_segfault, where find_vma fails to find the VMA due to the tag, and the wrong si_code may be sent as part of the siginfo_t of the segfault. With this patch, the correct si_code is sent. Fixes: `7dd01aef05` ("arm64: trap userspace "dc cvau" cache operation on errata-affected core") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Mark Rutland	87e2b8c096	arm64: uaccess: ensure extension of access_ok() addr commit `a06040d7a7` upstream. Our access_ok() simply hands its arguments over to __range_ok(), which implicitly assummes that the addr parameter is 64 bits wide. This isn't necessarily true for compat code, which might pass down a 32-bit address parameter. In these cases, we don't have a guarantee that the address has been zero extended to 64 bits, and the upper bits of the register may contain unknown values, potentially resulting in a suprious failure. Avoid this by explicitly casting the addr parameter to an unsigned long (as is done on other architectures), ensuring that the parameter is widened appropriately. Fixes: `0aea86a217` ("arm64: User access library functions") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Mark Rutland	1f13dc4c22	arm64: armv8_deprecated: ensure extension of addr commit `55de49f9aa` upstream. Our compat swp emulation holds the compat user address in an unsigned int, which it passes to __user_swpX_asm(). When a 32-bit value is passed in a register, the upper 32 bits of the register are unknown, and we must extend the value to 64 bits before we can use it as a base address. This patch casts the address to unsigned long to ensure it has been suitably extended, avoiding the potential issue, and silencing a related warning from clang. Fixes: `bd35a4adc4` ("arm64: Port SWP/SWPB emulation support from arm") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Mark Rutland	2339f63ef9	arm64: ensure extension of smp_store_release value commit `994870bead` upstream. When an inline assembly operand's type is narrower than the register it is allocated to, the least significant bits of the register (up to the operand type's width) are valid, and any other bits are permitted to contain any arbitrary value. This aligns with the AAPCS64 parameter passing rules. Our __smp_store_release() implementation does not account for this, and implicitly assumes that operands have been zero-extended to the width of the type being stored to. Thus, we may store unknown values to memory when the value type is narrower than the pointer type (e.g. when storing a char to a long). This patch fixes the issue by casting the value operand to the same width as the pointer operand in all cases, which ensures that the value is zero-extended as we expect. We use the same union trickery as __smp_load_acquire and {READ,WRITE}_ONCE() to avoid GCC complaining that pointers are potentially cast to narrower width integers in unreachable paths. A whitespace issue at the top of __smp_store_release() is also corrected. No changes are necessary for __smp_load_acquire(). Load instructions implicitly clear any upper bits of the register, and the compiler will only consider the least significant bits of the register as valid regardless. Fixes: `47933ad41a` ("arch: Introduce smp_load_acquire(), smp_store_release()") Fixes: `878a84d5a8` ("arm64: add missing data types in smp_load_acquire/smp_store_release") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Mark Rutland	b55563999f	arm64: xchg: hazard against entire exchange variable commit `fee960bed5` upstream. The inline assembly in __XCHG_CASE() uses a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a u8 pointer, and thus the hazard only applies to the first byte of the location. GCC can take advantage of this, assuming that other portions of the location are unchanged, as demonstrated with the following test case: union u { unsigned long l; unsigned int i[2]; }; unsigned long update_char_hazard(union u u) { unsigned int a, b; a = u->i[1]; asm ("str %1, %0" : "+Q" ((char )&u->l) : "r" (0UL)); b = u->i[1]; return a ^ b; } unsigned long update_long_hazard(union u u) { unsigned int a, b; a = u->i[1]; asm ("str %1, %0" : "+Q" ((long )&u->l) : "r" (0UL)); b = u->i[1]; return a ^ b; } The linaro 15.08 GCC 5.1.1 toolchain compiles the above as follows when using -O2 or above: 0000000000000000 <update_char_hazard>: 0: d2800001 mov x1, #0x0 // #0 4: f9000001 str x1, [x0] 8: d2800000 mov x0, #0x0 // #0 c: d65f03c0 ret 0000000000000010 <update_long_hazard>: 10: b9400401 ldr w1, [x0,#4] 14: d2800002 mov x2, #0x0 // #0 18: f9000002 str x2, [x0] 1c: b9400400 ldr w0, [x0,#4] 20: 4a000020 eor w0, w1, w0 24: d65f03c0 ret This patch fixes the issue by passing an unsigned long pointer into the +Q constraint, as we do for our cmpxchg code. This may hazard against more than is necessary, but this is better than missing a necessary hazard. Fixes: `305d454aaa` ("arm64: atomics: implement native {relaxed, acquire, release} atomics") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Daniel Lezcano	ee6dfbff78	arm64: dts: hi6220: Reset the mmc hosts commit `0fbdf9953b` upstream. The MMC hosts could be left in an unconsistent or uninitialized state from the firmware. Instead of assuming, the firmware did the right things, let's reset the host controllers. This change fixes a bug when the mmc2/sdio is initialized leading to a hung task: [ 242.704294] INFO: task kworker/7:1:675 blocked for more than 120 seconds. [ 242.711129] Not tainted 4.9.0-rc8-00017-gcf0251f #3 [ 242.716571] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 242.724435] kworker/7:1 D 0 675 2 0x00000000 [ 242.729973] Workqueue: events_freezable mmc_rescan [ 242.734796] Call trace: [ 242.737269] [<ffff00000808611c>] __switch_to+0xa8/0xb4 [ 242.742437] [<ffff000008d07c04>] __schedule+0x1c0/0x67c [ 242.747689] [<ffff000008d08254>] schedule+0x40/0xa0 [ 242.752594] [<ffff000008d0b284>] schedule_timeout+0x1c4/0x35c [ 242.758366] [<ffff000008d08e38>] wait_for_common+0xd0/0x15c [ 242.763964] [<ffff000008d09008>] wait_for_completion+0x28/0x34 [ 242.769825] [<ffff000008a1a9f4>] mmc_wait_for_req_done+0x40/0x124 [ 242.775949] [<ffff000008a1ab98>] mmc_wait_for_req+0xc0/0xf8 [ 242.781549] [<ffff000008a1ac3c>] mmc_wait_for_cmd+0x6c/0x84 [ 242.787149] [<ffff000008a26610>] mmc_io_rw_direct_host+0x9c/0x114 [ 242.793270] [<ffff000008a26aa0>] sdio_reset+0x34/0x7c [ 242.798347] [<ffff000008a1d46c>] mmc_rescan+0x2fc/0x360 [ ... ] Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Wei Xu <xuwei5@hisilicon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Leonard Crestez	1116274777	ARM: dts: imx6sx-sdb: Remove OPP override commit `d8581c7c8b` upstream. The board file for imx6sx-sdb overrides cpufreq operating points to use higher voltages. This is done because the board has a shared rail for VDD_ARM_IN and VDD_SOC_IN and when using LDO bypass the shared voltage needs to be a value suitable for both ARM and SOC. This only applies to LDO bypass mode, a feature not present in upstream. When LDOs are enabled the effect is to use higher voltages than necessary for no good reason. Setting these higher voltages can make some boards fail to boot with ugly semi-random crashes reminiscent of memory corruption. These failures only happen on board rev. C, rev. B is reported to still work. Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Fixes: `54183bd7f7` ("ARM: imx6sx-sdb: add revb board and make it default") Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Ludovic Desroches	8539c6b56a	ARM: dts: at91: sama5d3_xplained: not all ADC channels are available commit `d3df1ec063` upstream. Remove ADC channels that are not available by default on the sama5d3_xplained board (resistor not populated) in order to not create confusion. Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Ludovic Desroches	87cb676e83	ARM: dts: at91: sama5d3_xplained: fix ADC vref commit `9cdd31e591` upstream. The voltage reference for the ADC is not 3V but 3.3V since it is connected to VDDANA. Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Vladimir Murzin	6b9e12331a	ARM: 8670/1: V7M: Do not corrupt vector table around v7m_invalidate_l1 call commit `6d80594936` upstream. We save/restore registers around v7m_invalidate_l1 to address pointed by r12, which is vector table, so the first eight entries are overwritten with a garbage. We already have stack setup at that stage, so use it to save/restore register. Fixes: `6a8146f420` ("ARM: 8609/1: V7M: Add support for the Cortex-M7 processor") Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Jon Medhurst	3e9570206d	ARM: 8667/3: Fix memory attribute inconsistencies when using fixmap commit `b089c31c51` upstream. To cope with the variety in ARM architectures and configurations, the pagetable attributes for kernel memory are generated at runtime to match the system the kernel finds itself on. This calculated value is stored in pgprot_kernel. However, when early fixmap support was added for ARM (commit `a5f4c561b3`) the attributes used for mappings were hard coded because pgprot_kernel is not set up early enough. Unfortunately, when fixmap is used after early boot this means the memory being mapped can have different attributes to existing mappings, potentially leading to unpredictable behaviour. A specific problem also exists due to the hard coded values not include the 'shareable' attribute which means on systems where this matters (e.g. those with multiple CPU clusters) the cache contents for a memory location can become inconsistent between CPUs. To resolve these issues we change fixmap to use the same memory attributes (from pgprot_kernel) that the rest of the kernel uses. To enable this we need to refactor the initialisation code so build_mem_type_table() is called early enough. Note, that relies on early param parsing for memory type overrides passed via the kernel command line, so we need to make sure this call is still after parse_early_params(). [ardb: keep early_fixmap_init() before param parsing, for earlycon] Fixes: `a5f4c561b3` ("ARM: 8415/1: early fixmap support for earlycon") Tested-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Jon Medhurst <tixy@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Ard Biesheuvel	7e2aa72f11	ARM: 8662/1: module: split core and init PLT sections commit `b7ede5a1f5` upstream. Since commit `35fa91eed8` ("ARM: kernel: merge core and init PLTs"), the ARM module PLT code allocates all PLT entries in a single core section, since the overhead of having a separate init PLT section is not justified by the small number of PLT entries usually required for init code. However, the core and init module regions are allocated independently, and there is a corner case where the core region may be allocated from the VMALLOC region if the dedicated module region is exhausted, but the init region, being much smaller, can still be allocated from the module region. This puts the PLT entries out of reach of the relocated branch instructions, defeating the whole purpose of PLTs. So split the core and init PLT regions, and name the latter ".init.plt" so it gets allocated along with (and sufficiently close to) the .init sections that it serves. Also, given that init PLT entries may need to be emitted for branches that target the core module, modify the logic that disregards defined symbols to only disregard symbols that are defined in the same section. Fixes: `35fa91eed8` ("ARM: kernel: merge core and init PLTs") Reported-by: Angus Clark <angus@angusclark.org> Tested-by: Angus Clark <angus@angusclark.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Zhichao Huang	ae9bd04a13	KVM: arm: plug potential guest hardware debug leakage commit `661e6b02b5` upstream. Hardware debugging in guests is not intercepted currently, it means that a malicious guest can bring down the entire machine by writing to the debug registers. This patch enable trapping of all debug registers, preventing the guests to access the debug registers. This includes access to the debug mode(DBGDSCR) in the guest world all the time which could otherwise mess with the host state. Reads return 0 and writes are ignored (RAZ_WI). The result is the guest cannot detect any working hardware based debug support. As debug exceptions are still routed to the guest normal debug using software based breakpoints still works. To support debugging using hardware registers we need to implement a debug register aware world switch as well as special trapping for registers that may affect the host state. Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Marc Zyngier	c870815609	KVM: arm/arm64: vgic-v3: Do not use Active+Pending state for a HW interrupt commit `3d6e77ad14` upstream. When an interrupt is injected with the HW bit set (indicating that deactivation should be propagated to the physical distributor), special care must be taken so that we never mark the corresponding LR with the Active+Pending state (as the pending state is kept in the physycal distributor). Fixes: `59529f69f5` ("KVM: arm/arm64: vgic-new: Add GICv3 world switch backend") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Marc Zyngier	181339b540	KVM: arm/arm64: vgic-v2: Do not use Active+Pending state for a HW interrupt commit `ddf42d068f` upstream. When an interrupt is injected with the HW bit set (indicating that deactivation should be propagated to the physical distributor), special care must be taken so that we never mark the corresponding LR with the Active+Pending state (as the pending state is kept in the physycal distributor). Fixes: `140b086dd1` ("KVM: arm/arm64: vgic-new: Add GICv2 world switch backend") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Marc Zyngier	50f665c39d	arm: KVM: Do not use stack-protector to compile HYP code commit `501ad27c67` upstream. We like living dangerously. Nothing explicitely forbids stack-protector to be used in the HYP code, while distributions routinely compile their kernel with it. We're just lucky that no code actually triggers the instrumentation. Let's not try our luck for much longer, and disable stack-protector for code living at HYP. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Marc Zyngier	c735d4e3c2	arm64: KVM: Do not use stack-protector to compile EL2 code commit `cde13b5dad` upstream. We like living dangerously. Nothing explicitely forbids stack-protector to be used in the EL2 code, while distributions routinely compile their kernel with it. We're just lucky that no code actually triggers the instrumentation. Let's not try our luck for much longer, and disable stack-protector for code living at EL2. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Michael Neuling	75295195f7	powerpc/tm: Fix FP and VMX register corruption commit `f48e91e87e` upstream. In commit `dc3106690b` ("powerpc: tm: Always use fp_state and vr_state to store live registers"), a section of code was removed that copied the current state to checkpointed state. That code should not have been removed. When an FP (Floating Point) unavailable is taken inside a transaction, we need to abort the transaction. This is because at the time of the tbegin, the FP state is bogus so the state stored in the checkpointed registers is incorrect. To fix this, we treclaim (to get the checkpointed GPRs) and then copy the thread_struct FP live state into the checkpointed state. We then trecheckpoint so that the FP state is correctly restored into the CPU. The copying of the FP registers from live to checkpointed is what was missing. This simplifies the logic slightly from the original patch. tm_reclaim_thread() will now always write the checkpointed FP state. Either the checkpointed FP state will be written as part of the actual treclaim (in tm.S), or it'll be a copy of the live state. Which one we use is based on MSR[FP] from userspace. Similarly for VMX. Fixes: `dc3106690b` ("powerpc: tm: Always use fp_state and vr_state to store live registers") Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: cyrilbur@gmail.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Michael Ellerman	f9fde78726	powerpc/mm: Fix crash in page table dump with huge pages commit `bfb9956ab4` upstream. The page table dump code doesn't know about huge pages, so currently it crashes (or walks random memory, usually leading to a crash), if it finds a huge page. On Book3S we only see huge pages in the Linux page tables when we're using the P9 Radix MMU. Teaching the code to properly handle huge pages is a bit more involved, so for now just prevent the crash. Fixes: `8eb07b1870` ("powerpc/mm: Dump linux pagetables") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
LiuHailong	1d19e3f139	powerpc/64e: Fix hang when debugging programs with relocated kernel commit `fd615f69a1` upstream. Debug interrupts can be taken during interrupt entry, since interrupt entry does not automatically turn them off. The kernel will check whether the faulting instruction is between [interrupt_base_book3e, __end_interrupts], and if so clear MSR[DE] and return. However, when the kernel is built with CONFIG_RELOCATABLE, it can't use LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e) and LOAD_REG_IMMEDIATE(r15,__end_interrupts), as they ignore relocation. Thus, if the kernel is actually running at a different address than it was built at, the address comparison will fail, and the exception entry code will hang at kernel_dbg_exc. r2(toc) is also not usable here, as r2 still holds data from the interrupted context, so LOAD_REG_ADDR() doesn't work either. So we use the name@got to get the EV of two labels directly. Test programs test.c shows as follows: int main(int argc, char *argv[]) { if (access("/proc/sys/kernel/perf_event_paranoid", F_OK) == -1) printf("Kernel doesn't have perf_event support\n"); } Steps to reproduce the bug, for example: 1) ./gdb ./test 2) (gdb) b access 3) (gdb) r 4) (gdb) s Signed-off-by: Liu Hailong <liu.hailong6@zte.com.cn> Signed-off-by: Jiang Xuexin <jiang.xuexin@zte.com.cn> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Reviewed-by: Liu Song <liu.song11@zte.com.cn> Reviewed-by: Huang Jian <huang.jian@zte.com.cn> [scottwood: cleaned up commit message, and specified bad behavior as a hang rather than an oops to correspond to mainline kernel behavior] Fixes: `1cb6e06492` ("powerpc/book3e: support CONFIG_RELOCATABLE") Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Alistair Popple	31f96d860e	powerpc/powernv: Fix TCE kill on NVLink2 commit `6b3d12a948` upstream. Commit `616badd2fb` ("powerpc/powernv: Use OPAL call for TCE kill on NVLink2") forced all TCE kills to go via the OPAL call for NVLink2. However the PHB3 implementation of TCE kill was still being called directly from some functions which in some circumstances caused a machine check. This patch adds an equivalent IODA2 version of the function which uses the correct invalidation method depending on PHB model and changes all external callers to use it instead. Fixes: `616badd2fb` ("powerpc/powernv: Use OPAL call for TCE kill on NVLink2") Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Alexey Kardashevskiy	8781143e13	powerpc/iommu: Do not call PageTransHuge() on tail pages commit `e889e96e98` upstream. The CMA pages migration code does not support compound pages at the moment so it performs few tests before proceeding to actual page migration. One of the tests - PageTransHuge() - has VM_BUG_ON_PAGE(PageTail()) as it is designed to be called on head pages only. Since we also test for PageCompound(), and it contains PageTail() and PageHead(), we can simplify the check by leaving just PageCompound() and therefore avoid possible VM_BUG_ON_PAGE. Fixes: `2e5bbb5461` ("KVM: PPC: Book3S HV: Migrate pinned pages out of CMA") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Tyrel Datwyler	eb0807b288	powerpc/sysfs: Fix reference leak of cpu device_nodes present at boot commit `e76ca27790` upstream. For CPUs present at boot each logical CPU acquires a reference to the associated device node of the core. This happens in register_cpu() which is called by topology_init(). The result of this is that we end up with a reference held by each thread of the core. However, these references are never freed if the CPU core is DLPAR removed. This patch fixes the reference leaks by acquiring and releasing the references in the CPU hotplug callbacks un/register_cpu_online(). With this patch symmetric reference counting is observed with both CPUs present at boot, and those DLPAR added after boot. Fixes: `f86e4718f2` ("driver/core: cpu: initialize of_node in cpu's device struture") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Tyrel Datwyler	d731227327	powerpc/pseries: Fix of_node_put() underflow during DLPAR remove commit `68baf692c4` upstream. Historically struct device_node references were tracked using a kref embedded as a struct field. Commit `75b57ecf9d` ("of: Make device nodes kobjects so they show up in sysfs") (Mar 2014) refactored device_nodes to be kobjects such that the device tree could by more simply exposed to userspace using sysfs. Commit `0829f6d1f6` ("of: device_node kobject lifecycle fixes") (Mar 2014) followed up these changes to better control the kobject lifecycle and in particular the referecne counting via of_node_get(), of_node_put(), and of_node_init(). A result of this second commit was that it introduced an of_node_put() call when a dynamic node is detached, in of_node_remove(), that removes the initial kobj reference created by of_node_init(). Traditionally as the original dynamic device node user the pseries code had assumed responsibilty for releasing this final reference in its platform specific DLPAR detach code. This patch fixes a refcount underflow introduced by commit `0829f6d1f6`, and recently exposed by the upstreaming of the recount API. Messages like the following are no longer seen in the kernel log with this patch following DLPAR remove operations of cpus and pci devices. rpadlpar_io: slot PHB 72 removed refcount_t: underflow; use-after-free. ------------[ cut here ]------------ WARNING: CPU: 5 PID: 3335 at lib/refcount.c:128 refcount_sub_and_test+0xf4/0x110 Fixes: `0829f6d1f6` ("of: device_node kobject lifecycle fixes") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> [mpe: Make change log commit references more verbose] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Mahesh Salgaonkar	11e382b3b8	powerpc/book3s/mce: Move add_taint() later in virtual mode commit `d93b0ac01a` upstream. machine_check_early() gets called in real mode. The very first time when add_taint() is called, it prints a warning which ends up calling opal call (that uses OPAL_CALL wrapper) for writing it to console. If we get a very first machine check while we are in opal we are doomed. OPAL_CALL overwrites the PACASAVEDMSR in r13 and in this case when we are done with MCE handling the original opal call will use this new MSR on it's way back to opal_return. This usually leads to unexpected behaviour or the kernel to panic. Instead move the add_taint() call later in the virtual mode where it is safe to call. This is broken with current FW level. We got lucky so far for not getting very first MCE hit while in OPAL. But easily reproducible on Mambo. Fixes: `27ea2c420c` ("powerpc: Set the correct kernel taint on machine check errors.") Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Russell Currey	e548d552c0	powerpc/eeh: Avoid use after free in eeh_handle_special_event() commit `daeba2956f` upstream. eeh_handle_special_event() is called when an EEH event is detected but can't be narrowed down to a specific PE. This function looks through every PE to find one in an erroneous state, then calls the regular event handler eeh_handle_normal_event() once it knows which PE has an error. However, if eeh_handle_normal_event() found that the PE cannot possibly be recovered, it will free it, rendering the passed PE stale. This leads to a use after free in eeh_handle_special_event() as it attempts to clear the "recovering" state on the PE after eeh_handle_normal_event() returns. Thus, make sure the PE is valid when attempting to clear state in eeh_handle_special_event(). Fixes: `8a6b1bc70d` ("powerpc/eeh: EEH core to handle special event") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
David Gibson	63f44c113b	powerpc/mm: Ensure IRQs are off in switch_mm() commit `9765ad134a` upstream. powerpc expects IRQs to already be (soft) disabled when switch_mm() is called, as made clear in the commit message of `9c1e105238` ("powerpc: Allow perf_counters to access user memory at interrupt time"). Aside from any race conditions that might exist between switch_mm() and an IRQ, there is also an unconditional hard_irq_disable() in switch_slb(). If that isn't followed at some point by an IRQ enable then interrupts will remain disabled until we return to userspace. It is true that when switch_mm() is called from the scheduler IRQs are off, but not when it's called by use_mm(). Looking closer we see that last year in commit `f98db6013c` ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler") this was made more explicit by the addition of switch_mm_irqs_off() which is now called by the scheduler, vs switch_mm() which is used by use_mm(). Arguably it is a bug in use_mm() to call switch_mm() in a different context than it expects, but fixing that will take time. This was discovered recently when vhost started throwing warnings such as: BUG: sleeping function called from invalid context at kernel/mutex.c:578 in_atomic(): 0, irqs_disabled(): 1, pid: 10768, name: vhost-10760 no locks held by vhost-10760/10768. irq event stamp: 10 hardirqs last enabled at (9): _raw_spin_unlock_irq+0x40/0x80 hardirqs last disabled at (10): switch_slb+0x2e4/0x490 softirqs last enabled at (0): copy_process+0x5e8/0x1260 softirqs last disabled at (0): (null) Call Trace: show_stack+0x88/0x390 (unreliable) dump_stack+0x30/0x44 __might_sleep+0x1c4/0x2d0 mutex_lock_nested+0x74/0x5c0 cgroup_attach_task_all+0x5c/0x180 vhost_attach_cgroups_work+0x58/0x80 [vhost] vhost_worker+0x24c/0x3d0 [vhost] kthread+0xec/0x100 ret_from_kernel_thread+0x5c/0xd4 Prior to commit `04b96e5528` ("vhost: lockless enqueuing") (Aug 2016) the vhost_worker() would do a spin_unlock_irq() not long after calling use_mm(), which had the effect of reenabling IRQs. Since that commit removed the locking in vhost_worker() the body of the vhost_worker() loop now runs with interrupts off causing the warnings. This patch addresses the problem by making the powerpc code mirror the x86 code, ie. we disable interrupts in switch_mm(), and optimise the scheduler case by defining switch_mm_irqs_off(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [mpe: Flesh out/rewrite change log, add stable] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Johan Hovold	5617775ed1	cx231xx-cards: fix NULL-deref at probe commit `0cd273bb5e` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. Fixes: `e0d3bafd02` ("V4L/DVB (10954): Add cx231xx USB driver") Cc: Sri Deevi <Srinivasa.Deevi@conexant.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Johan Hovold	6f623a3490	cx231xx-audio: fix NULL-deref at probe commit `65f921647f` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. Fixes: `e0d3bafd02` ("V4L/DVB (10954): Add cx231xx USB driver") Cc: Sri Deevi <Srinivasa.Deevi@conexant.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Johan Hovold	8bb9fa63d0	cx231xx-audio: fix init error path commit `fff1abc4d5` upstream. Make sure to release the snd_card also on a late allocation error. Fixes: `e0d3bafd02` ("V4L/DVB (10954): Add cx231xx USB driver") Cc: Sri Deevi <Srinivasa.Deevi@conexant.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Alyssa Milburn	a28e1f13a6	dw2102: limit messages to buffer size commit `950e252cb4` upstream. Otherwise the i2c transfer functions can read or write beyond the end of stack or heap buffers. Signed-off-by: Alyssa Milburn <amilburn@zall.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Alyssa Milburn	f953d6b03a	digitv: limit messages to buffer size commit `821117dc21` upstream. Return an error rather than memcpy()ing beyond the end of the buffer. Internal callers use appropriate sizes, but digitv_i2c_xfer may not. Signed-off-by: Alyssa Milburn <amilburn@zall.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Daniel Scheller	9e074c588d	dvb-frontends/cxd2841er: define symbol_rate_min/max in T/C fe-ops commit `158f0328af` upstream. Fixes "w_scan -f c" complaining with This dvb driver is buggy: the symbol rate limits are undefined - please report to linuxtv.org) Signed-off-by: Daniel Scheller <d.scheller@gmx.net> Acked-by: Abylay Ospan <aospan@netup.ru> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Alyssa Milburn	ac6dedddba	zr364xx: enforce minimum size when reading header commit `ee0fe833d9` upstream. This code copies actual_length-128 bytes from the header, which will underflow if the received buffer is too small. Signed-off-by: Alyssa Milburn <amilburn@zall.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Johan Hovold	20c98053a2	dib0700: fix NULL-deref at probe commit `d5823511c0` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer should a malicious device lack endpoints. Fixes: `c4018fa2e4` ("[media] dib0700: fix RC support on Hauppauge Nova-TD") Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Marek Szyprowski	6af3917b8d	s5p-mfc: Fix unbalanced call to clock management commit `a5cb00eb42` upstream. Clock should be turned off after calling s5p_mfc_init_hw() from the watchdog worker, like it is already done in the s5p_mfc_open() which also calls this function. Fixes: `af93574678` ("[media] MFC: Add MFC 5.1 V4L2 driver") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Johan Hovold	4f220307a8	gspca: konica: add missing endpoint sanity check commit `aa58fedb8c` upstream. Make sure to check the number of endpoints to avoid accessing memory beyond the endpoint array should a device lack the expected endpoints. Note that, as far as I can tell, the gspca framework has already made sure there is at least one endpoint in the current alternate setting so there should be no risk for a NULL-pointer dereference here. Fixes: `b517af7228` ("V4L/DVB: gspca_konica: New gspca subdriver for konica chipset using cams") Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hansverk@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Marek Szyprowski	d926ceaa72	s5p-mfc: Fix race between interrupt routine and device functions commit `0c32b8ec02` upstream. Interrupt routine must wake process waiting for given interrupt AFTER updating driver's internal structures and contexts. Doing it in-between is a serious bug. This patch moves all calls to the wake() function to the end of the interrupt processing block to avoid potential and real races, especially on multi-core platforms. This also fixes following issue reported from clock core (clocks were disabled in interrupt after being unprepared from the other place in the driver, the stack trace however points to the different place than s5p_mfc driver because of the race): WARNING: CPU: 1 PID: 18 at drivers/clk/clk.c:544 clk_core_unprepare+0xc8/0x108 Modules linked in: CPU: 1 PID: 18 Comm: kworker/1:0 Not tainted 4.10.0-next-20170223-00070-g04e18bc99ab9-dirty #2154 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) Workqueue: pm pm_runtime_work [<c010d8b0>] (unwind_backtrace) from [<c010a534>] (show_stack+0x10/0x14) [<c010a534>] (show_stack) from [<c033292c>] (dump_stack+0x74/0x94) [<c033292c>] (dump_stack) from [<c011cef4>] (__warn+0xd4/0x100) [<c011cef4>] (__warn) from [<c011cf40>] (warn_slowpath_null+0x20/0x28) [<c011cf40>] (warn_slowpath_null) from [<c0387a84>] (clk_core_unprepare+0xc8/0x108) [<c0387a84>] (clk_core_unprepare) from [<c0389d84>] (clk_unprepare+0x24/0x2c) [<c0389d84>] (clk_unprepare) from [<c03d4660>] (exynos_sysmmu_suspend+0x48/0x60) [<c03d4660>] (exynos_sysmmu_suspend) from [<c042b9b0>] (pm_generic_runtime_suspend+0x2c/0x38) [<c042b9b0>] (pm_generic_runtime_suspend) from [<c0437580>] (genpd_runtime_suspend+0x94/0x220) [<c0437580>] (genpd_runtime_suspend) from [<c042e240>] (__rpm_callback+0x134/0x208) [<c042e240>] (__rpm_callback) from [<c042e334>] (rpm_callback+0x20/0x80) [<c042e334>] (rpm_callback) from [<c042d3b8>] (rpm_suspend+0xdc/0x458) [<c042d3b8>] (rpm_suspend) from [<c042ea24>] (pm_runtime_work+0x80/0x90) [<c042ea24>] (pm_runtime_work) from [<c01322c4>] (process_one_work+0x120/0x318) [<c01322c4>] (process_one_work) from [<c0132520>] (worker_thread+0x2c/0x4ac) [<c0132520>] (worker_thread) from [<c0137ab0>] (kthread+0xfc/0x134) [<c0137ab0>] (kthread) from [<c0107978>] (ret_from_fork+0x14/0x3c) ---[ end trace 1ead49a7bb83f0d8 ]--- Fixes: `af93574678` ("[media] MFC: Add MFC 5.1 V4L2 driver") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Lee Jones	776a591f75	cec: Fix runtime BUG when (CONFIG_RC_CORE && !CEC_CAP_RC) commit `43c0c03961` upstream. Currently when the RC Core is enabled (reachable) core code located in cec_register_adapter() attempts to populate the RC structure with a pointer to the 'parent' passed in by the caller. Unfortunately if the caller did not specify RC capability when calling cec_allocate_adapter(), then there will be no RC structure to populate. This causes a "NULL pointer dereference" error. Fixes: `f51e80804f` ("[media] cec: pass parent device in register(), not allocate()") Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Srinivas Pandruvada	0d1015c0e9	iio: hid-sensor: Store restore poll and hysteresis on S3 commit `5d9854eaea` upstream. This change undo the change done by 'commit `3bec247474` ("iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3")' as this breaks some USB/i2c sensor hubs. Instead of relying on HW for restoring poll and hysteresis, driver stores and restores on resume (S3). In this way user space modified settings are not lost for any kind of sensor hub behavior. In this change, whenever user space modifies sampling frequency or hysteresis driver will get the feature value from the hub and store in the per device hid_sensor_common data structure. On resume callback from S3, system will set the feature to sensor hub, if user space ever modified the feature value. Fixes: `3bec247474` ("iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3") Reported-by: Ritesh Raj Sarraf <rrs@researchut.com> Tested-by: Ritesh Raj Sarraf <rrs@researchut.com> Tested-by: Song, Hongyan <hongyan.song@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Matt Ranostay	03652e4884	iio: proximity: as3935: fix as3935_write commit `84ca8e364a` upstream. AS3935_WRITE_DATA macro bit is incorrect and the actual write sequence is two leading zeros. Cc: George McCollister <george.mccollister@gmail.com> Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Dan Carpenter	b13b3f3985	ipx: call ipxitf_put() in ioctl error path commit `ee0d8d8482` upstream. We should call ipxitf_put() if the copy_to_user() fails. Reported-by: 李强 <liqiang6-s@360.cn> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Johan Hovold	1e2a08069d	USB: hub: fix non-SS hub-descriptor handling commit `bec444cd1c` upstream. Add missing sanity check on the non-SuperSpeed hub-descriptor length in order to avoid parsing and leaking two bytes of uninitialised slab data through sysfs removable-attributes (or a compound-device debug statement). Note that we only make sure that the DeviceRemovable field is always present (and specifically ignore the unused PortPwrCtrlMask field) in order to continue support any hubs with non-compliant descriptors. As a further safeguard, the descriptor buffer is also cleared. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Johan Hovold	345477ed11	USB: hub: fix SS hub-descriptor handling commit `2c25a2c818` upstream. A SuperSpeed hub descriptor does not have any variable-length fields so bail out when reading a short descriptor. This avoids parsing and leaking two bytes of uninitialised slab data through sysfs removable-attributes. Fixes: `dbe79bbe9d` ("USB 3.0 Hub Changes") Cc: John Youn <John.Youn@synopsys.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Johan Hovold	3a82455292	USB: serial: io_ti: fix div-by-zero in set_termios commit `6aeb75e6ad` upstream. Fix a division-by-zero in set_termios when debugging is enabled and a high-enough speed has been requested so that the divisor value becomes zero. Instead of just fixing the offending debug statement, cap the baud rate at the base as a zero divisor value also appears to crash the firmware. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Johan Hovold	e4d3ae36fc	USB: serial: mct_u232: fix big-endian baud-rate handling commit `26cede3436` upstream. Drop erroneous cpu_to_le32 when setting the baud rate, something which corrupted the divisor on big-endian hosts. Found using sparse: warning: incorrect type in argument 1 (different base types) expected unsigned int [unsigned] [usertype] val got restricted __le32 [usertype] <noident> Fixes: `af2ac1a091` ("USB: serial mct_usb232: move DMA buffers to heap") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-By: Pete Zaitcev <zaitcev@yahoo.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Bjørn Mork	07cf1f9d01	USB: serial: qcserial: add more Lenovo EM74xx device IDs commit `8d7a10dd32` upstream. In their infinite wisdom, and never ending quest for end user frustration, Lenovo has decided to use new USB device IDs for the wwan modules in their 2017 laptops. The actual hardware is still the Sierra Wireless EM7455 or EM7430, depending on region. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Daniele Palmas	d986525b7d	usb: serial: option: add Telit ME910 support commit `40dd46048c` upstream. This patch adds support for Telit ME910 PID 0x1100. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Johan Hovold	b58d7087ce	USB: iowarrior: fix info ioctl on big-endian hosts commit `dd5ca753fa` upstream. Drop erroneous le16_to_cpu when returning the USB device speed which is already in host byte order. Found using sparse: warning: cast to restricted __le16 Fixes: `946b960d13` ("USB: add driver for iowarrior devices.") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Tony Lindgren	d69737cb64	usb: musb: Fix trying to suspend while active for OTG configurations commit `3c50ffef25` upstream. Commit `d8e5f0eca1` ("usb: musb: Fix hardirq-safe hardirq-unsafe lock order error") caused a regression where musb keeps trying to enable host mode with no cable connected. This seems to be caused by the fact that now phy is enabled earlier, and we are wrongly trying to force USB host mode on an OTG port. The errors we are getting are "trying to suspend as a_idle while active". For ports configured as OTG, we should not need to do anything to try to force USB host mode on it's OTG port. Trying to force host mode in this case just seems to completely confuse the musb state machine. Let's fix the issue by making musb_host_setup() attempt to force the mode only if port_mode is configured for host mode. Fixes: `d8e5f0eca1` ("usb: musb: Fix hardirq-safe hardirq-unsafe lock order error") Cc: Johan Hovold <johan@kernel.org> Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reported-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Peter Ujfalusi	925f7f9244	usb: musb: tusb6010_omap: Do not reset the other direction's packet size commit `6df2b42f7c` upstream. We have one register for each EP to set the maximum packet size for both TX and RX. If for example an RX programming would happen before the previous TX transfer finishes we would reset the TX packet side. To fix this issue, only modify the TX or RX part of the register. Fixes: `550a7375fe` ("USB: Add MUSB and TUSB support") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Thinh Nguyen	33c19ef5e2	usb: dwc3: gadget: Prevent losing events in event cache commit `d325a1de49` upstream. The dwc3 driver can overwite its previous events if its top-half IRQ handler (TH) gets invoked again before processing the events in the cache. We see this as a hang in the file transfer and the host will attempt to reset the device. TH gets the event count and deasserts the interrupt line by writing DWC3_GEVNTSIZ_INTMASK to DWC3_GEVNTSIZ. If there's a new event coming between reading the event count and interrupt deassertion, dwc3 will lose previous pending events. More generally, we will see 0 event count, which should not affect anything. This shouldn't be possible in the current dwc3 implementation. However, through testing and reading the PCIe trace, the TH occasionally still gets invoked one more time after HW interrupt deassertion. (With PCIe legacy interrupts, TH is called repeatedly as long as the interrupt line is asserted). We suspect that there is a small detection delay in the SW. To avoid this issue, Check DWC3_EVENT_PENDING flag to determine if the events are processed in the bottom-half IRQ handler. If not, return IRQ_HANDLED and don't process new event. Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Ben Hutchings	79312c5d61	dvb-usb-dibusb-mc-common: Add MODULE_LICENSE commit `bf05b65a9f` upstream. dvb-usb-dibusb-mc-common is licensed under GPLv2, and if we don't say so then it won't even load since it needs a GPL-only symbol. Fixes: `e91455a149` ("[media] dvb-usb: split out common parts of dibusb") Reported-by: Dominique Dumont <dod@debian.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Alyssa Milburn	8d730619ae	ttusb2: limit messages to buffer size commit `a12b8ab8c5` upstream. Otherwise ttusb2_i2c_xfer can read or write beyond the end of static and heap buffers. Signed-off-by: Alyssa Milburn <amilburn@zall.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Johan Hovold	f2b30b2f19	mceusb: fix NULL-deref at probe commit `03eb2a557e` upstream. Make sure to check for the required out endpoint to avoid dereferencing a NULL-pointer in mce_request_packet should a malicious device lack such an endpoint. Note that this path is hit during probe. Fixes: `66e89522af` ("V4L/DVB: IR: add mceusb IR receiver driver") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Johan Hovold	03a7e78a25	usbvision: fix NULL-deref at probe commit `eacb975b48` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. Fixes: `2a9f8b5d25` ("V4L/DVB (5206): Usbvision: set alternate interface modification") Cc: Thierry MERLE <thierry.merle@free.fr> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Johan Hovold	3aade67b76	net: irda: irda-usb: fix firmware name on big-endian hosts commit `75cf067953` upstream. Add missing endianness conversion when using the USB device-descriptor bcdDevice field to construct a firmware file name. Fixes: `8ef80aef11` ("[IRDA]: irda-usb.c: STIR421x cleanups") Cc: Nick Fedchik <nfedchik@atlantic-link.com.ua> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Peter Chen	0cc9aa4c4b	usb: host: xhci-mem: allocate zeroed Scratchpad Buffer commit `7480d912d5` upstream. According to xHCI ch4.20 Scratchpad Buffers, the Scratchpad Buffer needs to be zeroed. ... The following operations take place to allocate Scratchpad Buffers to the xHC: ... b. Software clears the Scratchpad Buffer to '0' Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Mathias Nyman	966e88d540	xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton commit `a0c16630d3` upstream. Intel Denverton microserver is Atom based and need the PME and CAS quirks as well. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Alan Stern	5077513b4e	USB: xhci: fix lock-inversion problem commit `63aea0dbab` upstream. With threaded interrupts, bottom-half handlers are called with interrupts enabled. Therefore they can't safely use spin_lock(); they have to use spin_lock_irqsave(). Lockdep warns about a violation occurring in xhci_irq(): ========================================================= [ INFO: possible irq lock inversion dependency detected ] 4.11.0-rc8-dbg+ #1 Not tainted --------------------------------------------------------- swapper/7/0 just changed the state of lock: (&(&ehci->lock)->rlock){-.-...}, at: [<ffffffffa0130a69>] ehci_hrtimer_func+0x29/0xc0 [ehci_hcd] but this lock took another, HARDIRQ-unsafe lock in the past: (hcd_urb_list_lock){+.....} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(hcd_urb_list_lock); local_irq_disable(); lock(&(&ehci->lock)->rlock); lock(hcd_urb_list_lock); <Interrupt> lock(&(&ehci->lock)->rlock); * DEADLOCK * no locks held by swapper/7/0. the shortest dependencies between 2nd lock and 1st lock: -> (hcd_urb_list_lock){+.....} ops: 252 { HARDIRQ-ON-W at: __lock_acquire+0x602/0x1280 lock_acquire+0xd5/0x1c0 _raw_spin_lock+0x2f/0x40 usb_hcd_unlink_urb_from_ep+0x1b/0x60 [usbcore] xhci_giveback_urb_in_irq.isra.45+0x70/0x1b0 [xhci_hcd] finish_td.constprop.60+0x1d8/0x2e0 [xhci_hcd] xhci_irq+0xdd6/0x1fa0 [xhci_hcd] usb_hcd_irq+0x26/0x40 [usbcore] irq_forced_thread_fn+0x2f/0x70 irq_thread+0x149/0x1d0 kthread+0x113/0x150 ret_from_fork+0x2e/0x40 This patch fixes the problem. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Thomas Petazzoni	354c85b7ad	usb: host: xhci-plat: propagate return value of platform_get_irq() commit `4b148d5144` upstream. platform_get_irq() returns an error code, but the xhci-plat driver ignores it and always returns -ENODEV. This is not correct, and prevents -EPROBE_DEFER from being propagated properly. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Matthias Lange	c3dc5f5ae5	xhci: remove GFP_DMA flag from allocation commit `5db851cf20` upstream. There is no reason to restrict allocations to the first 16MB ISA DMA addresses. It is causing problems in a virtualization setup with enabled IOMMU (x86_64). The result is that USB is not working in the VM. Signed-off-by: Matthias Lange <matthias.lange@kernkonzept.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Mathias Nyman	d0c8639797	xhci: Fix command ring stop regression in 4.11 commit `604d02a2a6` upstream. In 4.11 TRB completion codes were renamed to match spec. Completion codes for command ring stopped and endpoint stopped were mixed, leading to failures while handling a stopped command ring. Use the correct completion code for command ring stopped events. Fixes: `0b7c105a04` ("usb: host: xhci: rename completion codes to match spec") Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Yazen Ghannam	1b8de0a2bb	EDAC, amd64: Fix reporting of Chip Select sizes on Fam17h commit `eb77e6b80f` upstream. The wrong index into the csbases/csmasks arrays was being passed to the function to compute the chip select sizes, which resulted in the wrong size being computed. Address that so that the correct values are computed and printed. Also, redo how we calculate the number of pages in a CS row. Reported-by: Benjamin Bennett <benbennett@gmail.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1493313114-11260-1-git-send-email-Yazen.Ghannam@amd.com [ Remove unneeded integer math comment, minor cleanups. ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Jan Kara	0a4115da61	dax: fix data corruption when fault races with write commit `13e451fdc1` upstream. Currently DAX read fault can race with write(2) in the following way: CPU1 - write(2) CPU2 - read fault dax_iomap_pte_fault() ->iomap_begin() - sees hole dax_iomap_rw() iomap_apply() ->iomap_begin - allocates blocks dax_iomap_actor() invalidate_inode_pages2_range() - there's nothing to invalidate grab_mapping_entry() - we add zero page in the radix tree and map it to page tables The result is that hole page is mapped into page tables (and thus zeros are seen in mmap) while file has data written in that place. Fix the problem by locking exception entry before mapping blocks for the fault. That way we are sure invalidate_inode_pages2_range() call for racing write will either block on entry lock waiting for the fault to finish (and unmap stale page tables after that) or read fault will see already allocated blocks by write(2). Fixes: `9f141d6ef6` Link: http://lkml.kernel.org/r/20170510085419.27601-5-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Toshi Kani	3ef206af54	libnvdimm: fix clear length of nvdimm_forget_poison() commit `8d13c02906` upstream. ND_CMD_CLEAR_ERROR command returns 'clear_err.cleared', the length of error actually cleared, which may be smaller than its requested 'len'. Change nvdimm_clear_poison() to call nvdimm_forget_poison() with 'clear_err.cleared' when this value is valid. Fixes: `e046114af5` ("libnvdimm: clear the internal poison_list when clearing badblocks") Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
David Howells	7868d9ff2e	Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx() commit `deccf497d8` upstream. stat/lstat/fstatat need to pass AT_NO_AUTOMOUNT to vfs_statx() as the pre-statx code didn't set LOOKUP_AUTOMOUNT, even though fstatat() accepted the AT_NO_AUTOMOUNT flag. Fixes: `a528d35e8b` ("statx: Add a system call to make enhanced file info available") Reported-by: Ian Kent <raven@themaw.net> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Johan Hovold	d08975541f	USB: chaoskey: fix Alea quirk on big-endian hosts commit `63afd5cc78` upstream. Add missing endianness conversion when applying the Alea timeout quirk. Found using sparse: warning: restricted __le16 degrades to integer Fixes: `e4a886e811` ("hwrng: chaoskey - Fix URB warning due to timeout on Alea") Cc: Bob Ham <bob.ham@collabora.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Andrey Korolyov	7ed35d6ca7	USB: serial: ftdi_sio: add Olimex ARM-USB-TINY(H) PIDs commit `5f63424ab7` upstream. This patch adds support for recognition of ARM-USB-TINY(H) devices which are almost identical to ARM-USB-OCD(H) but lacking separate barrel jack and serial console. By suggestion from Johan Hovold it is possible to replace ftdi_jtag_quirk with a bit more generic construction. Since all Olimex-ARM debuggers has exactly two ports, we could safely always use only second port within the debugger family. Signed-off-by: Andrey Korolyov <andrey@xdel.ru> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Anthony Mallet	14bd189bef	USB: serial: ftdi_sio: fix setting latency for unprivileged users commit `bb246681b3` upstream. Commit `557aaa7ffa` ("ft232: support the ASYNC_LOW_LATENCY flag") enables unprivileged users to set the FTDI latency timer, but there was a logic flaw that skipped sending the corresponding USB control message to the device. Specifically, the device latency timer would not be updated until next open, something which was later also inadvertently broken by commit `c19db4c9e4` ("USB: ftdi_sio: set device latency timeout at port probe"). A recent commit `c6dce26266` ("USB: serial: ftdi_sio: fix extreme low-latency setting") disabled the low-latency mode by default so we now need this fix to allow unprivileged users to again enable it. Signed-off-by: Anthony Mallet <anthony.mallet@laas.fr> [johan: amend commit message] Fixes: `557aaa7ffa` ("ft232: support the ASYNC_LOW_LATENCY flag") Fixes: `c19db4c9e4` ("USB: ftdi_sio: set device latency timeout at port probe"). Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Kirill Tkhai	15e4ed2a46	pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes() commit `3fd3722621` upstream. Imagine we have a pid namespace and a task from its parent's pid_ns, which made setns() to the pid namespace. The task is doing fork(), while the pid namespace's child reaper is dying. We have the race between them: Task from parent pid_ns Child reaper copy_process() .. alloc_pid() .. .. zap_pid_ns_processes() .. disable_pid_allocation() .. read_lock(&tasklist_lock) .. iterate over pids in pid_ns .. kill tasks linked to pids .. read_unlock(&tasklist_lock) write_lock_irq(&tasklist_lock); .. attach_pid(p, PIDTYPE_PID); .. .. .. So, just created task p won't receive SIGKILL signal, and the pid namespace will be in contradictory state. Only manual kill will help there, but does the userspace care about this? I suppose, the most users just inject a task into a pid namespace and wait a SIGCHLD from it. The patch fixes the problem. It simply checks for (pid_ns->nr_hashed & PIDNS_HASH_ADDING) in copy_process(). We do it under the tasklist_lock, and can't skip PIDNS_HASH_ADDING as noted by Oleg: "zap_pid_ns_processes() does disable_pid_allocation() and then takes tasklist_lock to kill the whole namespace. Given that copy_process() checks PIDNS_HASH_ADDING under write_lock(tasklist) they can't race; if copy_process() takes this lock first, the new child will be killed, otherwise copy_process() can't miss the change in ->nr_hashed." If allocation is disabled, we just return -ENOMEM like it's made for such cases in alloc_pid(). v2: Do not move disable_pid_allocation(), do not introduce a new variable in copy_process() and simplify the patch as suggested by Oleg Nesterov. Account the problem with double irq enabling found by Eric W. Biederman. Fixes: `c876ad7682` ("pidns: Stop pid allocation when init dies") Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Ingo Molnar <mingo@kernel.org> CC: Peter Zijlstra <peterz@infradead.org> CC: Oleg Nesterov <oleg@redhat.com> CC: Mike Rapoport <rppt@linux.vnet.ibm.com> CC: Michal Hocko <mhocko@suse.com> CC: Andy Lutomirski <luto@kernel.org> CC: "Eric W. Biederman" <ebiederm@xmission.com> CC: Andrei Vagin <avagin@openvz.org> CC: Cyrill Gorcunov <gorcunov@openvz.org> CC: Serge Hallyn <serge@hallyn.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Eric W. Biederman	9fbdfb7af4	pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes commit `b9a985db98` upstream. The code can potentially sleep for an indefinite amount of time in zap_pid_ns_processes triggering the hung task timeout, and increasing the system average. This is undesirable. Sleep with a task state of TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these undesirable side effects. Apparently under heavy load this has been allowing Chrome to trigger the hung time task timeout error and cause ChromeOS to reboot. Reported-by: Vovo Yang <vovoy@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Fixes: `6347e90091` ("pidns: guarantee that the pidns init will be the last pidns process reaped") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Michael J. Ruhl	4cbd5018c4	IB/hfi1: Fix a subcontext memory leak commit `224d71f910` upstream. The only context that frees user_exp_rcv data structures is the last context closed (from a sub-context set). This leaks the allocations from the other sub-contexts. Separate the common frees from the specific frees and call them at the appropriate time. Using KEDR to check for memory leaks we get: Before test: [leak_check] Possible leaks: 25 After test: [leak_check] Possible leaks: 31 (6 leaked data structures) After patch applied (before and after test have the same value) [leak_check] Possible leaks: 25 Each leak is 192 + 13440 + 6720 = 20352 bytes per sub-context. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Michael J. Ruhl	d971ab21c9	IB/hfi1: Return an error on memory allocation failure commit `94679061dc` upstream. If the eager buffer allocation fails, it is necessary to return an error code. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Fabrice Gasnier	c07927502b	iio: stm32 trigger: fix sampling_frequency read commit `77a9febfd8` upstream. When prescaler (PSC) is 0, it means div factor is 1: counter clock frequency is equal to input clk / (PSC + 1). When reload value is 8 for example, counter counts 9 cycles, from 0 to 8. This is handled in frequency write routine, by writing respectively: - prescaler - 1 to PSC - reload value - 1 to ARR This fix does the opposite when reading the frequency from PSC and ARR: - prescaler is PSC + 1 - reload value is ARR + 1 Thus, PSC may be 0, depending on requested sampling frequency (div 1). In this case, reading freq wrongly reports 0, instead of computing and reporting correct value. Remove test on !psc and !arr. Small test on stm32f4 (example on tim1_trgo), before this fix: $ cd /sys/bus/iio/devices/triggerX $ echo 10000 > sampling_frequency $ cat sampling_frequency 0 After this fix: $ echo 10000 > sampling_frequency $ cat sampling_frequency 10000 Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Andreas Klinger	705fa4038f	IIO: bmp280-core.c: fix error in humidity calculation commit `ed3730c435` upstream. While calculating the compensation of the humidity there are negative values interpreted as unsigned because of unsigned variables used. These values as well as the constants need to be casted to signed as indicated by the documentation of the sensor. Signed-off-by: Andreas Klinger <ak@it-klinger.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Pavel Roskin	327f5da04e	iio: dac: ad7303: fix channel description commit `ce420fd425` upstream. realbits, storagebits and shift should be numbers, not ASCII characters. Signed-off-by: Pavel Roskin <plroskin@gmail.com> Reviewed-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
James Smart	6567967aad	scsi: lpfc: Fix panic on BFS configuration commit `4492b739c9` upstream. To select the appropriate shost template, the driver is issuing a mailbox command to retrieve the wwn. Turns out the sending of the command precedes the reset of the function. On SLI-4 adapters, this is inconsequential as the mailbox command location is specified by dma via the BMBX register. However, on SLI-3 adapters, the location of the mailbox command submission area changes. When the function is first powered on or reset, the cmd is submitted via PCI bar memory. Later the driver changes the function config to use host memory and DMA. The request to start a mailbox command is the same, a simple doorbell write, regardless of submission area. So.. if there has not been a boot driver run against the adapter, the mailbox command works as defaults are ok. But, if the boot driver has configured the card and, and if no platform pci function/slot reset occurs as the os starts, the mailbox command will fail. The SLI-3 device will use the stale boot driver dma location. This can cause PCI eeh errors. Fix is to reset the sli-3 function before sending the mailbox command, thus synchronizing the function/driver on mailbox location. Note: The fix uses routines that are typically invoked later in the call flow to reset the sli-3 device. The issue in using those routines is that the normal (non-fix) flow does additional initialization, namely the allocation of the pport structure. So, rather than significantly reworking the initialization flow so that the pport is alloc'd first, pointer checks are added to work around it. Checks are limited to the routines invoked by a sli-3 adapter (s3 routines) as this fix/early call is only invoked on a sli3 adapter. Nothing changes post the fix. Subsequent initialization, and another adapter reset, still occur - both on sli-3 and sli-4 adapters. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Fixes: `96418b5e2c` ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.") Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Bryant G. Ly	5d6d5f07f8	ibmvscsis: Do not send aborted task response commit `25e7853126` upstream. The driver is sending a response to the actual scsi op that was aborted by an abort task TM, while LIO is sending a response to the abort task TM. ibmvscsis_tgt does not send the response to the client until release_cmd time. The reason for this was because if we did it at queue_status time, then the client would be free to reuse the tag for that command, but we're still using the tag until the command is released at release_cmd time, so we chose to delay sending the response until then. That then caused this issue, because release_cmd is always called, even if queue_status is not. SCSI spec says that the initiator that sends the abort task TM NEVER gets a response to the aborted op and with the current code it will send a response. Thus this fix will remove that response if the CMD_T_ABORTED && !CMD_T_TAS. Another case with a small timing window is the case where if LIO sends a TMR_DOES_NOT_EXIST, and the release_cmd callback is called for the TMR Abort cmd before the release_cmd for the (attemped) aborted cmd, then we need to ensure that we send the response for the (attempted) abort cmd to the client before we send the response for the TMR Abort cmd. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: Michael Cyr <mikecyr@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Johan Hovold	c3f0eff47d	of: fdt: add missing allocation-failure check commit `49e67dd176` upstream. The memory allocator passed to __unflatten_device_tree() (e.g. a wrapped kzalloc) can fail so add the missing sanity check to avoid dereferencing a NULL pointer. Fixes: `fe14042358` ("of/flattree: Refactor unflatten_device_tree and add fdt_unflatten_tree") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Tyrel Datwyler	5bc2cfba89	of: fix "/cpus" reference leak in of_numa_parse_cpu_nodes() commit `b8475cbee5` upstream. The call to of_find_node_by_path("/cpus") returns the cpus device_node with its reference count incremented. There is no matching of_node_put() call in of_numa_parse_cpu_nodes() which results in a leaked reference to the "/cpus" node. This patch adds an of_node_put() to release the reference. fixes: `298535c00a` ("of, numa: Add NUMA of binding implementation.") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Rob Herring	854d0231ef	of: fix sparse warning in of_pci_range_parser_one commit `eb31003657` upstream. sparse gives the following warning for 'pci_space': ../drivers/of/address.c:266:26: warning: incorrect type in assignment (different base types) ../drivers/of/address.c:266:26: expected unsigned int [unsigned] [usertype] pci_space ../drivers/of/address.c:266:26: got restricted __be32 const [usertype] <noident> It appears that pci_space is only ever accessed on powerpc, so the endian swap is often not needed. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Takashi Iwai	d5331e619a	proc: Fix unbalanced hard link numbers commit `d66bb1607e` upstream. proc_create_mount_point() forgot to increase the parent's nlink, and it resulted in unbalanced hard link numbers, e.g. /proc/fs shows one less than expected. Fixes: `eb6d38d542` ("proc: Allow creating permanently empty directories...") Reported-by: Tristan Ye <tristan.ye@suse.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Vaibhav Jain	525d5ef5ae	cxl: Route eeh events to all drivers in cxl_pci_error_detected() commit `4f58f0bf15` upstream. Fix a boundary condition where in some cases an eeh event that results in card reset isn't passed on to a driver attached to the virtual PCI device associated with a slice. This will happen in case when a slice attached device driver returns a value other than PCI_ERS_RESULT_NEED_RESET from the eeh error_detected() callback. This would result in an early return from cxl_pci_error_detected() and other drivers attached to other AFUs on the card wont be notified. The patch fixes this by making sure that all slice attached device-drivers are notified and the return values from error_detected() callback are aggregated in a scheme where request for 'disconnect' trumps all and 'none' trumps 'need_reset'. Fixes: `9e8df8a219` ("cxl: EEH support") Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Vaibhav Jain	4a0e8757ad	cxl: Force context lock during EEH flow commit `ea9a26d117` upstream. During an eeh event when the cxl card is fenced and card sysfs attr perst_reloads_same_image is set following warning message is seen in the kernel logs: Adapter context unlocked with 0 active contexts ------------[ cut here ]------------ WARNING: CPU: 12 PID: 627 at ../drivers/misc/cxl/main.c:325 cxl_adapter_context_unlock+0x60/0x80 [cxl] Even though this warning is harmless, it clutters the kernel log during an eeh event. This warning is triggered as the EEH callback cxl_pci_error_detected doesn't obtain a context-lock before forcibly detaching all active context and when context-lock is released during call to cxl_configure_adapter from cxl_pci_slot_reset, a warning in cxl_adapter_context_unlock is triggered. To fix this warning, we acquire the adapter context-lock via cxl_adapter_context_lock() in the eeh callback cxl_pci_error_detected() once all the virtual AFU PHBs are notified and their contexts detached. The context-lock is released in cxl_pci_slot_reset() after the adapter is successfully reconfigured and before the we call the slot_reset callback on slice attached device-drivers. Fixes: `70b565bbdb` ("cxl: Prevent adapter reset if an active context exists") Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Tested-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Gerd Hoffmann	6c58e144d5	ohci-pci: add qemu quirk commit `21a60f6e65` upstream. On a loaded virtualization host (dozen guests booting at the same time) it may happen that the ohci controller emulation doesn't manage to do timely frame processing, with the result that the io watchdog fires and considers the controller being dead, even though it's only the emulation being unusual slow due to the load peak. So, add a quirk for qemu and don't use the watchdog in case we figure we are running on emulated ohci. The virtual ohci controller masquerades as apple ohci controller, but we can identify it by subsystem id. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Tobias Herzog	a83f3099d0	cdc-acm: fix possible invalid access when processing notification commit `1bb9914e17` upstream. Notifications may only be 8 bytes long. Accessing the 9th and 10th byte of unimplemented/unknown notifications may be insecure. Also check the length of known notifications before accessing anything behind the 8th byte. Signed-off-by: Tobias Herzog <t-herzog@gmx.de> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
David Rivshin	9541621a12	gpio: omap: return error if requested debounce time is not possible commit `8397744393` upstream. omap_gpio_debounce() does not validate that the requested debounce is within a range it can handle. Instead it lets the register value wrap silently, and always returns success. This can lead to all sorts of unexpected behavior, such as gpio_keys asking for a too-long debounce, but getting a very short debounce in practice. Fix this by returning -EINVAL if the requested value does not fit into the register field. If there is no debounce clock available at all, return -ENOTSUPP. Fixes: `e85ec6c304` ("gpio: omap: fix omap2_set_gpio_debounce") Signed-off-by: David Rivshin <drivshin@allworx.com> Acked-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	33f379db47	drm/nouveau/tmr: handle races with hw when updating the next alarm time commit `1b0f84380b` upstream. If the time to the next alarm is short enough, we could race with HW and end up with an ~4 second delay until it triggers. Fix this by checking again after we update HW. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	53b8e32038	drm/nouveau/tmr: avoid processing completed alarms when adding a new one commit `330bdf62fe` upstream. The idea here was to avoid having to "manually" program the HW if there's a new earliest alarm. This was lazy and bad, as it leads to loads of fun races between inter-related callers (ie. therm). Turns out, it's not so difficult after all. Go figure ;) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	952030b868	drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm commit `9fc64667ee` upstream. At least therm/fantog "attempts" to work around this issue, which could lead to corruption of the pending alarm list. Fix it properly by not updating the timestamp without the lock held, or trying to add an already pending alarm to the pending alarm list.... Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	4c168033dd	drm/nouveau/tmr: ack interrupt before processing alarms commit `3733bd8b40` upstream. Fixes a race where we can miss an alarm that triggers while we're already processing previous alarms. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	a5e41c9e5a	drm/nouveau/kms/nv50: skip core channel cursor update on position-only changes commit `e6db95799b` upstream. The DRM core used to only call prepare_fb/cleanup_fb() when a plane's framebuffer changed, which achieved the desired effect. It's apparently now up to the driver to decide on its own. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	4aecf3a73b	drm/nouveau/kms/nv50: fix source-rect-only plane updates commit `36601c2b36` upstream. This "optimisation" (which was originally meant to skip updating cursor settings in the core channel on position-only updates) turned out to be pointless in the final design of the code before it was merged. Remove it completely, as it breaks other cases. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	2138faf017	drm/nouveau/therm: remove ineffective workarounds for alarm bugs commit `e4311ee51d` upstream. These were ineffective due to touching the list without the alarm lock, but should no longer be required. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Mario Kleiner	21d3947bf7	drm/amdgpu: Add missing lb_vblank_lead_lines setup to DCE-6 path. commit `effaf848b9` upstream. This apparently got lost when implementing the new DCE-6 support and would cause failures in pageflip scheduling and timestamping. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Mario Kleiner	3f0db81c7d	drm/amdgpu: Avoid overflows/divide-by-zero in latency_watermark calculations. commit `e190ed1ea7` upstream. At dot clocks > approx. 250 Mhz, some of these calcs will overflow and cause miscalculation of latency watermarks, and for some overflows also divide-by-zero driver crash ("divide error: 0000 [#1] PREEMPT SMP" in "dce_v10_0_latency_watermark+0x12d/0x190"). This zero-divide happened, e.g., on AMD Tonga Pro under DCE-10, on a Displayport panel when trying to set a video mode of 2560x1440 at 165 Hz vrefresh with a dot clock of 635.540 Mhz. Refine calculations to avoid the overflows. Tested for DCE-10 with R9 380 Tonga + ASUS ROG PG279 panel. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Mario Kleiner	de2fa45aa9	drm/amdgpu: Make display watermark calculations more accurate commit `d63c277dc6` upstream. Avoid big roundoff errors in scanline/hactive durations for high pixel clocks, especially for >= 500 Mhz, and thereby program more accurate display fifo watermarks. Implemented here for DCE 6,8,10,11. Successfully tested on DCE 10 with AMD R9 380 Tonga. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Johan Hovold	fab4402325	ath9k_htc: fix NULL-deref at probe commit `ebeb36670e` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. Fixes: `36bcce4306` ("ath9k_htc: Handle storage devices") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Dmitry Tunin	43068b4585	ath9k_htc: Add support of AirTies 1eda:2315 AR9271 device commit `16ff1fb0e3` upstream. T: Bus=01 Lev=02 Prnt=02 Port=02 Cnt=01 Dev#= 7 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ff(vend.) Sub=ff Prot=ff MxPS=64 #Cfgs= 1 P: Vendor=1eda ProdID=2315 Rev=01.08 S: Manufacturer=ATHEROS S: Product=USB2.0 WLAN S: SerialNumber=12345 C: #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 6 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Martin Schwidefsky	82e31de018	s390/cputime: fix incorrect system time commit `07a63cbe8b` upstream. git commit `c5328901aa` "[S390] entry[64].S improvements" removed the update of the exit_timer lowcore field from the critical section cleanup of the .Lsysc_restore/.Lsysc_done and .Lio_restore/.Lio_done blocks. If the PSW is updated by the critical section cleanup to point to user space again, the interrupt entry code will do a vtime calculation after the cleanup completed with an exit_timer value which has not been updated. Due to this incorrect system time deltas are calculated. If an interrupt occured with an old PSW between .Lsysc_restore/.Lsysc_done or .Lio_restore/.Lio_done update __LC_EXIT_TIMER with the system entry time of the interrupt. Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Michael Holzheu	5d1adbaf36	s390/kdump: Add final note commit `dcc00b79fc` upstream. Since linux v3.14 with commit `38dfac843c` ("vmcore: prevent PT_NOTE p_memsz overflow during header update") on s390 we get the following message in the kdump kernel: Warning: Exceeded p_memsz, dropping PT_NOTE entry n_namesz=0x6b6b6b6b, n_descsz=0x6b6b6b6b The reason for this is that we don't create a final zero note in the ELF header which the proc/vmcore code uses to find out the end of the notes section (see also kernel/kexec_core.c:final_note()). It still worked on s390 by chance because we (most of the time?) have the byte pattern 0x6b6b6b6b after the notes section which also makes the notes parsing code stop in update_note_header_size_elf64() because 0x6b6b6b6b is interpreded as note size: if ((real_sz + sz) > max_sz) { pr_warn("Warning: Exceeded p_memsz, dropping P ...); break; } So fix this and add the missing final note to the ELF header. We don't have to adjust the memory size for ELF header ("alloc_size") because the new ELF note still fits into the 0x1000 base memory. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Richard Cochran	d2cc3a8cb9	regulator: tps65023: Fix inverted core enable logic. commit `c90722b54a` upstream. Commit `43530b69d7` ("regulator: Use regmap_read/write(), regmap_update_bits functions directly") intended to replace working inline helper functions with standard regmap calls. However, it also inverted the set/clear logic of the "CORE ADJ Allowed" bit. That patch was clearly never tested, since without that bit cleared, the core VDCDC1 voltage output does not react to I2C configuration changes. This patch fixes the issue by clearing the bit as in the original, correct implementation. Note for stable back porting that, due to subsequent driver churn, this patch will not apply on every kernel version. Fixes: `43530b69d7` ("regulator: Use regmap_read/write(), regmap_update_bits functions directly") Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Wadim Egorov	fde4f79074	regulator: rk808: Fix RK818 LDO2 commit `75f8811539` upstream. Set the correct voltage select register for LDO2. Signed-off-by: Wadim Egorov <w.egorov@phytec.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Linus Torvalds	efb209c87a	x86: fix 32-bit case of __get_user_asm_u64() commit `33c9e97290` upstream. The code to fetch a 64-bit value from user space was entirely buggered, and has been since the code was merged in early 2016 in commit `b2f680380d` ("x86/mm/32: Add support for 64-bit __get_user() on 32-bit kernels"). Happily the buggered routine is almost certainly entirely unused, since the normal way to access user space memory is just with the non-inlined "get_user()", and the inlined version didn't even historically exist. The normal "get_user()" case is handled by external hand-written asm in arch/x86/lib/getuser.S that doesn't have either of these issues. There were two independent bugs in __get_user_asm_u64(): - it still did the STAC/CLAC user space access marking, even though that is now done by the wrapper macros, see commit `11f1a4b975` ("x86: reorganize SMAP handling in user space accesses"). This didn't result in a semantic error, it just means that the inlined optimized version was hugely less efficient than the allegedly slower standard version, since the CLAC/STAC overhead is quite high on modern Intel CPU's. - the double register %eax/%edx was marked as an output, but the %eax part of it was touched early in the asm, and could thus clobber other inputs to the asm that gcc didn't expect it to touch. In particular, that meant that the generated code could look like this: mov (%eax),%eax mov 0x4(%eax),%edx where the load of %edx obviously was _supposed_ to be from the 32-bit word that followed the source of %eax, but because %eax was overwritten by the first instruction, the source of %edx was basically random garbage. The fixes are trivial: remove the extraneous STAC/CLAC entries, and mark the 64-bit output as early-clobber to let gcc know that no inputs should alias with the output register. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Wanpeng Li	5bbaf88d79	KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation commit `cbfc6c9184` upstream. Huawei folks reported a read out-of-bounds vulnerability in kvm pio emulation. - "inb" instruction to access PIT Mod/Command register (ioport 0x43, write only, a read should be ignored) in guest can get a random number. - "rep insb" instruction to access PIT register port 0x43 can control memcpy() in emulator_pio_in_emulated() to copy max 0x400 bytes but only read 1 bytes, which will disclose the unimportant kernel memory in host but no crash. The similar test program below can reproduce the read out-of-bounds vulnerability: void hexdump(void mem, unsigned int len) { unsigned int i, j; for(i = 0; i < len + ((len % HEXDUMP_COLS) ? (HEXDUMP_COLS - len % HEXDUMP_COLS) : 0); i++) { / print offset / if(i % HEXDUMP_COLS == 0) { printf("0x%06x: ", i); } / print hex data / if(i < len) { printf("%02x ", 0xFF & ((char)mem)[i]); } else /* end of block, just aligning for ASCII dump / { printf(" "); } / print ASCII dump / if(i % HEXDUMP_COLS == (HEXDUMP_COLS - 1)) { for(j = i - (HEXDUMP_COLS - 1); j <= i; j++) { if(j >= len) / end of block, not really printing / { putchar(' '); } else if(isprint(((char)mem)[j])) /* printable char / { putchar(0xFF & ((char)mem)[j]); } else /* other char / { putchar('.'); } } putchar('\n'); } } } int main(void) { int i; if (iopl(3)) { err(1, "set iopl unsuccessfully\n"); return -1; } static char buf[0x40]; / test ioport 0x40,0x41,0x42,0x43,0x44,0x45 / memset(buf, 0xab, sizeof(buf)); asm volatile("push %rdi;"); asm volatile("mov %0, %%rdi;"::"q"(buf)); asm volatile ("mov $0x40, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x41, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x42, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x43, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x44, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x45, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("pop %rdi;"); hexdump(buf, 0x40); printf("\n"); / ins port 0x40 / memset(buf, 0xab, sizeof(buf)); asm volatile("push %rdi;"); asm volatile("mov %0, %%rdi;"::"q"(buf)); asm volatile ("mov $0x20, %rcx;"); asm volatile ("mov $0x40, %rdx;"); asm volatile ("rep insb;"); asm volatile ("pop %rdi;"); hexdump(buf, 0x40); printf("\n"); / ins port 0x43 / memset(buf, 0xab, sizeof(buf)); asm volatile("push %rdi;"); asm volatile("mov %0, %%rdi;"::"q"(buf)); asm volatile ("mov $0x20, %rcx;"); asm volatile ("mov $0x43, %rdx;"); asm volatile ("rep insb;"); asm volatile ("pop %rdi;"); hexdump(buf, 0x40); printf("\n"); return 0; } The vcpu->arch.pio_data buffer is used by both in/out instrutions emulation w/o clear after using which results in some random datas are left over in the buffer. Guest reads port 0x43 will be ignored since it is write only, however, the function kernel_pio() can't distigush this ignore from successfully reads data from device's ioport. There is no new data fill the buffer from port 0x43, however, emulator_pio_in_emulated() will copy the stale data in the buffer to the guest unconditionally. This patch fixes it by clearing the buffer before in instruction emulation to avoid to grant guest the stale data in the buffer. In addition, string I/O is not supported for in kernel device. So there is no iteration to read ioport %RCX times for string I/O. The function kernel_pio() just reads one round, and then copy the io size %RCX to the guest unconditionally, actually it copies the one round ioport data w/ other random datas which are left over in the vcpu->arch.pio_data buffer to the guest. This patch fixes it by introducing the string I/O support for in kernel device in order to grant the right ioport datas to the guest. Before the patch: 0x000000: fe 38 93 93 ff ff ab ab .8...... 0x000008: ab ab ab ab ab ab ab ab ........ 0x000010: ab ab ab ab ab ab ab ab ........ 0x000018: ab ab ab ab ab ab ab ab ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: f6 00 00 00 00 00 00 00 ........ 0x000008: 00 00 00 00 00 00 00 00 ........ 0x000010: 00 00 00 00 4d 51 30 30 ....MQ00 0x000018: 30 30 20 33 20 20 20 20 00 3 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: f6 00 00 00 00 00 00 00 ........ 0x000008: 00 00 00 00 00 00 00 00 ........ 0x000010: 00 00 00 00 4d 51 30 30 ....MQ00 0x000018: 30 30 20 33 20 20 20 20 00 3 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ After the patch: 0x000000: 1e 02 f8 00 ff ff ab ab ........ 0x000008: ab ab ab ab ab ab ab ab ........ 0x000010: ab ab ab ab ab ab ab ab ........ 0x000018: ab ab ab ab ab ab ab ab ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: d2 e2 d2 df d2 db d2 d7 ........ 0x000008: d2 d3 d2 cf d2 cb d2 c7 ........ 0x000010: d2 c4 d2 c0 d2 bc d2 b8 ........ 0x000018: d2 b4 d2 b0 d2 ac d2 a8 ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: 00 00 00 00 00 00 00 00 ........ 0x000008: 00 00 00 00 00 00 00 00 ........ 0x000010: 00 00 00 00 00 00 00 00 ........ 0x000018: 00 00 00 00 00 00 00 00 ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ Reported-by: Moguofang <moguofang@huawei.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Moguofang <moguofang@huawei.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Wanpeng Li	b67dcf8cf8	KVM: x86: Fix potential preemption when get the current kvmclock timestamp commit `e2c2206a18` upstream. BUG: using __this_cpu_read() in preemptible [00000000] code: qemu-system-x86/2809 caller is __this_cpu_preempt_check+0x13/0x20 CPU: 2 PID: 2809 Comm: qemu-system-x86 Not tainted 4.11.0+ #13 Call Trace: dump_stack+0x99/0xce check_preemption_disabled+0xf5/0x100 __this_cpu_preempt_check+0x13/0x20 get_kvmclock_ns+0x6f/0x110 [kvm] get_time_ref_counter+0x5d/0x80 [kvm] kvm_hv_process_stimers+0x2a1/0x8a0 [kvm] ? kvm_hv_process_stimers+0x2a1/0x8a0 [kvm] ? kvm_arch_vcpu_ioctl_run+0xac9/0x1ce0 [kvm] kvm_arch_vcpu_ioctl_run+0x5bf/0x1ce0 [kvm] kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? __fget+0xf3/0x210 do_vfs_ioctl+0xa4/0x700 ? __fget+0x114/0x210 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x23/0xc2 RIP: 0033:0x7f9d164ed357 ? __this_cpu_preempt_check+0x13/0x20 This can be reproduced by run kvm-unit-tests/hyperv_stimer.flat w/ CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT enabled. Safe access to per-CPU data requires a couple of constraints, though: the thread working with the data cannot be preempted and it cannot be migrated while it manipulates per-CPU variables. If the thread is preempted, the thread that replaces it could try to work with the same variables; migration to another CPU could also cause confusion. However there is no preemption disable when reads host per-CPU tsc rate to calculate the current kvmclock timestamp. This patch fixes it by utilizing get_cpu/put_cpu pair to guarantee both __this_cpu_read() and rdtsc() are not preempted. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Wanpeng Li	8223e8f994	KVM: x86: Fix load damaged SSEx MXCSR register commit `a575813bfe` upstream. Reported by syzkaller: BUG: unable to handle kernel paging request at ffffffffc07f6a2e IP: report_bug+0x94/0x120 PGD 348e12067 P4D 348e12067 PUD 348e14067 PMD 3cbd84067 PTE 80000003f7e87161 Oops: 0003 [#1] SMP CPU: 2 PID: 7091 Comm: kvm_load_guest_ Tainted: G OE 4.11.0+ #8 task: ffff92fdfb525400 task.stack: ffffbda6c3d04000 RIP: 0010:report_bug+0x94/0x120 RSP: 0018:ffffbda6c3d07b20 EFLAGS: 00010202 do_trap+0x156/0x170 do_error_trap+0xa3/0x170 ? kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm] ? mark_held_locks+0x79/0xa0 ? retint_kernel+0x10/0x10 ? trace_hardirqs_off_thunk+0x1a/0x1c do_invalid_op+0x20/0x30 invalid_op+0x1e/0x30 RIP: 0010:kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm] ? kvm_load_guest_fpu.part.175+0x1c/0x170 [kvm] kvm_arch_vcpu_ioctl_run+0xed6/0x1b70 [kvm] kvm_vcpu_ioctl+0x384/0x780 [kvm] ? kvm_vcpu_ioctl+0x384/0x780 [kvm] ? sched_clock+0x13/0x20 ? __do_page_fault+0x2a0/0x550 do_vfs_ioctl+0xa4/0x700 ? up_read+0x1f/0x40 ? __do_page_fault+0x2a0/0x550 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x23/0xc2 SDM mentioned that "The MXCSR has several reserved bits, and attempting to write a 1 to any of these bits will cause a general-protection exception(#GP) to be generated". The syzkaller forks' testcase overrides xsave area w/ random values and steps on the reserved bits of MXCSR register. The damaged MXCSR register values of guest will be restored to SSEx MXCSR register before vmentry. This patch fixes it by catching userspace override MXCSR register reserved bits w/ random values and bails out immediately. Reported-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Daniel Glöckner	efa8cd1e2f	ima: accept previously set IMA_NEW_FILE commit `1ac202e978` upstream. Modifying the attributes of a file makes ima_inode_post_setattr reset the IMA cache flags. So if the file, which has just been created, is opened a second time before the first file descriptor is closed, verification fails since the security.ima xattr has not been written yet. We therefore have to look at the IMA_NEW_FILE even if the file already existed. With this patch there should no longer be an error when cat tries to open testfile: $ rm -f testfile $ ( echo test >&3 ; touch testfile ; cat testfile ) 3>testfile A file being new is no reason to accept that it is missing a digital signature demanded by the policy. Signed-off-by: Daniel Glöckner <dg@emlix.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Brian Norris	d5811285b5	mwifiex: pcie: fix cmd_buf use-after-free in remove/reset commit `3c8cb9ad03` upstream. Command buffers (skb's) are allocated by the main driver, and freed upon the last use. That last use is often in mwifiex_free_cmd_buffer(). In the meantime, if the command buffer gets used by the PCI driver, we map it as DMA-able, and store the mapping information in the 'cb' memory. However, if a command was in-flight when resetting the device (and therefore was still mapped), we don't get a chance to unmap this memory until after the core has cleaned up its command handling. Let's keep a refcount within the PCI driver, so we ensure the memory only gets freed after we've finished unmapping it. Noticed by KASAN when forcing a reset via: echo 1 > /sys/bus/pci/.../reset The same code path can presumably be exercised in remove() and shutdown(). [ 205.390377] mwifiex_pcie 0000:01:00.0: info: shutdown mwifiex... [ 205.400393] ================================================================== [ 205.407719] BUG: KASAN: use-after-free in mwifiex_unmap_pci_memory.isra.14+0x4c/0x100 [mwifiex_pcie] at addr ffffffc0ad471b28 [ 205.419040] Read of size 16 by task bash/1913 [ 205.423421] ============================================================================= [ 205.431625] BUG skbuff_head_cache (Tainted: G B ): kasan: bad access detected [ 205.439815] ----------------------------------------------------------------------------- [ 205.439815] [ 205.449534] INFO: Allocated in __build_skb+0x48/0x114 age=1311 cpu=4 pid=1913 [ 205.456709] alloc_debug_processing+0x124/0x178 [ 205.461282] ___slab_alloc.constprop.58+0x528/0x608 [ 205.466196] __slab_alloc.isra.54.constprop.57+0x44/0x54 [ 205.471542] kmem_cache_alloc+0xcc/0x278 [ 205.475497] __build_skb+0x48/0x114 [ 205.479019] __netdev_alloc_skb+0xe0/0x170 [ 205.483244] mwifiex_alloc_cmd_buffer+0x68/0xdc [mwifiex] [ 205.488759] mwifiex_init_fw+0x40/0x6cc [mwifiex] [ 205.493584] _mwifiex_fw_dpc+0x158/0x520 [mwifiex] [ 205.498491] mwifiex_reinit_sw+0x2c4/0x398 [mwifiex] [ 205.503510] mwifiex_pcie_reset_notify+0x114/0x15c [mwifiex_pcie] [ 205.509643] pci_reset_notify+0x5c/0x6c [ 205.513519] pci_reset_function+0x6c/0x7c [ 205.517567] reset_store+0x68/0x98 [ 205.521003] dev_attr_store+0x54/0x60 [ 205.524705] sysfs_kf_write+0x9c/0xb0 [ 205.528413] INFO: Freed in __kfree_skb+0xb0/0xbc age=131 cpu=4 pid=1913 [ 205.535064] free_debug_processing+0x264/0x370 [ 205.539550] __slab_free+0x84/0x40c [ 205.543075] kmem_cache_free+0x1c8/0x2a0 [ 205.547030] __kfree_skb+0xb0/0xbc [ 205.550465] consume_skb+0x164/0x178 [ 205.554079] __dev_kfree_skb_any+0x58/0x64 [ 205.558304] mwifiex_free_cmd_buffer+0xa0/0x158 [mwifiex] [ 205.563817] mwifiex_shutdown_drv+0x578/0x5c4 [mwifiex] [ 205.569164] mwifiex_shutdown_sw+0x178/0x310 [mwifiex] [ 205.574353] mwifiex_pcie_reset_notify+0xd4/0x15c [mwifiex_pcie] [ 205.580398] pci_reset_notify+0x5c/0x6c [ 205.584274] pci_dev_save_and_disable+0x24/0x6c [ 205.588837] pci_reset_function+0x30/0x7c [ 205.592885] reset_store+0x68/0x98 [ 205.596324] dev_attr_store+0x54/0x60 [ 205.600017] sysfs_kf_write+0x9c/0xb0 ... [ 205.800488] Call trace: [ 205.802980] [<ffffffc00020a69c>] dump_backtrace+0x0/0x190 [ 205.808415] [<ffffffc00020a96c>] show_stack+0x20/0x28 [ 205.813506] [<ffffffc0005d020c>] dump_stack+0xa4/0xcc [ 205.818598] [<ffffffc0003be44c>] print_trailer+0x158/0x168 [ 205.824120] [<ffffffc0003be5f0>] object_err+0x4c/0x5c [ 205.829210] [<ffffffc0003c45bc>] kasan_report+0x334/0x500 [ 205.834641] [<ffffffc0003c3994>] check_memory_region+0x20/0x14c [ 205.840593] [<ffffffc0003c3b14>] __asan_loadN+0x14/0x1c [ 205.845879] [<ffffffbffc46171c>] mwifiex_unmap_pci_memory.isra.14+0x4c/0x100 [mwifiex_pcie] [ 205.854282] [<ffffffbffc461864>] mwifiex_pcie_delete_cmdrsp_buf+0x94/0xa8 [mwifiex_pcie] [ 205.862421] [<ffffffbffc462028>] mwifiex_pcie_free_buffers+0x11c/0x158 [mwifiex_pcie] [ 205.870302] [<ffffffbffc4620d4>] mwifiex_pcie_down_dev+0x70/0x80 [mwifiex_pcie] [ 205.877736] [<ffffffbffc1397a8>] mwifiex_shutdown_sw+0x190/0x310 [mwifiex] [ 205.884658] [<ffffffbffc4606b4>] mwifiex_pcie_reset_notify+0xd4/0x15c [mwifiex_pcie] [ 205.892446] [<ffffffc000635f54>] pci_reset_notify+0x5c/0x6c [ 205.898048] [<ffffffc00063a044>] pci_dev_save_and_disable+0x24/0x6c [ 205.904350] [<ffffffc00063cf0c>] pci_reset_function+0x30/0x7c [ 205.910134] [<ffffffc000641118>] reset_store+0x68/0x98 [ 205.915312] [<ffffffc000771588>] dev_attr_store+0x54/0x60 [ 205.920750] [<ffffffc00046f53c>] sysfs_kf_write+0x9c/0xb0 [ 205.926182] [<ffffffc00046dfb0>] kernfs_fop_write+0x184/0x1f8 [ 205.931963] [<ffffffc0003d64f4>] __vfs_write+0x6c/0x17c [ 205.937221] [<ffffffc0003d7164>] vfs_write+0xf0/0x1c4 [ 205.942310] [<ffffffc0003d7da0>] SyS_write+0x78/0xd8 [ 205.947312] [<ffffffc000204634>] el0_svc_naked+0x24/0x28 ... [ 205.998268] ================================================================== This bug has been around in different forms for a while. It was sort of noticed in commit `955ab095c5` ("mwifiex: Do not kfree cmd buf while unregistering PCIe"), but it just fixed the double-free, without acknowledging the potential for use-after-free. Fixes: `fc33146090` ("mwifiex: use pci_alloc/free_consistent APIs for PCIe") Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Brian Norris	a32a0c8285	mwifiex: MAC randomization should not be persistent commit `7e2f18f064` upstream. nl80211 provides the NL80211_SCAN_FLAG_RANDOM_ADDR for every scan request that should be randomized; the absence of such a flag means we should not randomize. However, mwifiex was stashing the latest randomization request and always using it for future scans, even those that didn't set the flag. Let's zero out the randomization info whenever we get a scan request without NL80211_SCAN_FLAG_RANDOM_ADDR. I'd prefer to remove priv->random_mac entirely (and plumb the randomization MAC properly through the call sequence), but the spaghetti is a little difficult to unravel here for me. Fixes: `c2a8f0ff9c` ("mwifiex: support random MAC address for scanning") Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Larry Finger	cf6df2bcd4	rtlwifi: rtl8821ae: setup 8812ae RFE according to device type commit `46cfa2148e` upstream. Current channel switch implementation sets 8812ae RFE reg value assuming that device always has type 2. Extend possible RFE types set and write corresponding reg values. Source for new code is http://dlcdnet.asus.com/pub/ASUS/wireless/PCE-AC51/DR_PCE_AC51_20232801152016.zip Signed-off-by: Maxim Samoylov <max7255@gmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Yan-Hsuan Chuang <yhchuang@realtek.com> Cc: Pkshih <pkshih@realtek.com> Cc: Birming Chiu <birming@realtek.com> Cc: Shaofu <shaofu@realtek.com> Cc: Steven Ting <steventing@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
NeilBrown	699ae09634	md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop commit `065e519e71` upstream. if called md_set_readonly and set MD_CLOSING bit, the mddev cannot be opened any more due to the MD_CLOING bit wasn't cleared. Thus it needs to be cleared in md_ioctl after any call to md_set_readonly() or do_md_stop(). Signed-off-by: NeilBrown <neilb@suse.com> Fixes: `af8d8e6f03` ("md: changes for MD_STILL_CLOSED flag") Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Dennis Yang	1800fde309	md: update slab_cache before releasing new stripes when stripes resizing commit `583da48e38` upstream. When growing raid5 device on machine with small memory, there is chance that mdadm will be killed and the following bug report can be observed. The same bug could also be reproduced in linux-4.10.6. [57600.075774] BUG: unable to handle kernel NULL pointer dereference at (null) [57600.083796] IP: [<ffffffff81a6aa87>] _raw_spin_lock+0x7/0x20 [57600.110378] PGD 421cf067 PUD 4442d067 PMD 0 [57600.114678] Oops: 0002 [#1] SMP [57600.180799] CPU: 1 PID: 25990 Comm: mdadm Tainted: P O 4.2.8 #1 [57600.187849] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS QV05AR66 03/06/2013 [57600.197490] task: ffff880044e47240 ti: ffff880043070000 task.ti: ffff880043070000 [57600.204963] RIP: 0010:[<ffffffff81a6aa87>] [<ffffffff81a6aa87>] _raw_spin_lock+0x7/0x20 [57600.213057] RSP: 0018:ffff880043073810 EFLAGS: 00010046 [57600.218359] RAX: 0000000000000000 RBX: 000000000000000c RCX: ffff88011e296dd0 [57600.225486] RDX: 0000000000000001 RSI: ffffe8ffffcb46c0 RDI: 0000000000000000 [57600.232613] RBP: ffff880043073878 R08: ffff88011e5f8170 R09: 0000000000000282 [57600.239739] R10: 0000000000000005 R11: 28f5c28f5c28f5c3 R12: ffff880043073838 [57600.246872] R13: ffffe8ffffcb46c0 R14: 0000000000000000 R15: ffff8800b9706a00 [57600.253999] FS: 00007f576106c700(0000) GS:ffff88011e280000(0000) knlGS:0000000000000000 [57600.262078] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [57600.267817] CR2: 0000000000000000 CR3: 00000000428fe000 CR4: 00000000001406e0 [57600.274942] Stack: [57600.276949] ffffffff8114ee35 ffff880043073868 0000000000000282 000000000000eb3f [57600.284383] ffffffff81119043 ffff880043073838 ffff880043073838 ffff88003e197b98 [57600.291820] ffffe8ffffcb46c0 ffff88003e197360 0000000000000286 ffff880043073968 [57600.299254] Call Trace: [57600.301698] [<ffffffff8114ee35>] ? cache_flusharray+0x35/0xe0 [57600.307523] [<ffffffff81119043>] ? __page_cache_release+0x23/0x110 [57600.313779] [<ffffffff8114eb53>] kmem_cache_free+0x63/0xc0 [57600.319344] [<ffffffff81579942>] drop_one_stripe+0x62/0x90 [57600.324915] [<ffffffff81579b5b>] raid5_cache_scan+0x8b/0xb0 [57600.330563] [<ffffffff8111b98a>] shrink_slab.part.36+0x19a/0x250 [57600.336650] [<ffffffff8111e38c>] shrink_zone+0x23c/0x250 [57600.342039] [<ffffffff8111e4f3>] do_try_to_free_pages+0x153/0x420 [57600.348210] [<ffffffff8111e851>] try_to_free_pages+0x91/0xa0 [57600.353959] [<ffffffff811145b1>] __alloc_pages_nodemask+0x4d1/0x8b0 [57600.360303] [<ffffffff8157a30b>] check_reshape+0x62b/0x770 [57600.365866] [<ffffffff8157a4a5>] raid5_check_reshape+0x55/0xa0 [57600.371778] [<ffffffff81583df7>] update_raid_disks+0xc7/0x110 [57600.377604] [<ffffffff81592b73>] md_ioctl+0xd83/0x1b10 [57600.382827] [<ffffffff81385380>] blkdev_ioctl+0x170/0x690 [57600.388307] [<ffffffff81195238>] block_ioctl+0x38/0x40 [57600.393525] [<ffffffff811731c5>] do_vfs_ioctl+0x2b5/0x480 [57600.399010] [<ffffffff8115e07b>] ? vfs_write+0x14b/0x1f0 [57600.404400] [<ffffffff811733cc>] SyS_ioctl+0x3c/0x70 [57600.409447] [<ffffffff81a6ad97>] entry_SYSCALL_64_fastpath+0x12/0x6a [57600.415875] Code: 00 00 00 00 55 48 89 e5 8b 07 85 c0 74 04 31 c0 5d c3 ba 01 00 00 00 f0 0f b1 17 85 c0 75 ef b0 01 5d c3 90 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 01 c3 55 89 c6 48 89 e5 e8 85 d1 63 ff 5d [57600.435460] RIP [<ffffffff81a6aa87>] _raw_spin_lock+0x7/0x20 [57600.441208] RSP <ffff880043073810> [57600.444690] CR2: 0000000000000000 [57600.448000] ---[ end trace cbc6b5cc4bf9831d ]--- The problem is that resize_stripes() releases new stripe_heads before assigning new slab cache to conf->slab_cache. If the shrinker function raid5_cache_scan() gets called after resize_stripes() starting releasing new stripes but right before new slab cache being assigned, it is possible that these new stripe_heads will be freed with the old slab_cache which was already been destoryed and that triggers this bug. Signed-off-by: Dennis Yang <dennisyang@qnap.com> Fixes: `edbe83ab4c` ("md/raid5: allow the stripe_cache to grow and shrink.") Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Joe Thornber	d7252903e2	dm space map disk: fix some book keeping in the disk space map commit `0377a07c7a` upstream. When decrementing the reference count for a block, the free count wasn't being updated if the reference count went to zero. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Joe Thornber	f568a10289	dm thin metadata: call precommit before saving the roots commit `91bcdb92d3` upstream. These calls were the wrong way round in __write_initial_superblock. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Mikulas Patocka	17cedf3ba1	dm bufio: make the parameter "retain_bytes" unsigned long commit `13840d3801` upstream. Change the type of the parameter "retain_bytes" from unsigned to unsigned long, so that on 64-bit machines the user can set more than 4GiB of data to be retained. Also, change the type of the variable "count" in the function "__evict_old_buffers" to unsigned long. The assignment "count = c->n_buffers[LIST_CLEAN] + c->n_buffers[LIST_DIRTY];" could result in unsigned long to unsigned overflow and that could result in buffers not being freed when they should. While at it, avoid division in get_retain_buffers(). Division is slow, we can change it to shift because we have precalculated the log2 of block size. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Mike Snitzer	2e80638cee	dm cache metadata: fail operations if fail_io mode has been established commit `10add84e27` upstream. Otherwise it is possible to trigger crashes due to the metadata being inaccessible yet these methods don't safely account for that possibility without these checks. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Bart Van Assche	a5b9afd690	dm mpath: delay requeuing while path initialization is in progress commit `c1d7ecf7ca` upstream. Requeuing a request immediately while path initialization is ongoing causes high CPU usage, something that is undesired. Hence delay requeuing while path initialization is in progress. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Bart Van Assche	5087ded2bb	dm mpath: avoid that path removal can trigger an infinite loop commit `7083abbbfc` upstream. If blk_get_request() fails, check whether the failure is due to a path being removed. If that is the case, fail the path by triggering a call to fail_path(). This avoids that the following scenario can be encountered while removing paths: * CPU usage of a kworker thread jumps to 100%. * Removing the DM device becomes impossible. Delay requeueing if blk_get_request() returns -EBUSY or -EWOULDBLOCK, and the queue is not dying, because in these cases immediate requeuing is inappropriate. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Bart Van Assche	5a09414335	dm mpath: split and rename activate_path() to prepare for its expanded use commit `89bfce763e` upstream. activate_path() is renamed to activate_path_work() which now calls activate_or_offline_path(). activate_or_offline_path() will be used by the next commit. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Bart Van Assche	4ff00db718	dm mpath: requeue after a small delay if blk_get_request() fails commit `06eb061f48` upstream. If blk_get_request() returns ENODEV then multipath_clone_and_map() causes a request to be requeued immediately. This can cause a kworker thread to spend 100% of the CPU time of a single core in __blk_mq_run_hw_queue() and also can cause device removal to never finish. Avoid this by only requeuing after a delay if blk_get_request() fails. Additionally, reduce the requeue delay. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Mikulas Patocka	7d67cd026d	dm bufio: check new buffer allocation watermark every 30 seconds commit `390020ad2a` upstream. dm-bufio checks a watermark when it allocates a new buffer in __bufio_new(). However, it doesn't check the watermark when the user changes /sys/module/dm_bufio/parameters/max_cache_size_bytes. This may result in a problem - if the watermark is high enough so that all possible buffers are allocated and if the user lowers the value of "max_cache_size_bytes", the watermark will never be checked against the new value because no new buffer would be allocated. To fix this, change __evict_old_buffers() so that it checks the watermark. __evict_old_buffers() is called every 30 seconds, so if the user reduces "max_cache_size_bytes", dm-bufio will react to this change within 30 seconds and decrease memory consumption. Depends-on: `1b0fb5a5b2` ("dm bufio: avoid a possible ABBA deadlock") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Mikulas Patocka	2aac21e001	dm bufio: avoid a possible ABBA deadlock commit `1b0fb5a5b2` upstream. __get_memory_limit() tests if dm_bufio_cache_size changed and calls __cache_size_refresh() if it did. It takes dm_bufio_clients_lock while it already holds the client lock. However, lock ordering is violated because in cleanup_old_buffers() dm_bufio_clients_lock is taken before the client lock. This results in a possible deadlock and lockdep engine warning. Fix this deadlock by changing mutex_lock() to mutex_trylock(). If the lock can't be taken, it will be re-checked next time when a new buffer is allocated. Also add "unlikely" to the if condition, so that the optimizer assumes that the condition is false. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Mikulas Patocka	539ee78649	dm raid: select the Kconfig option CONFIG_MD_RAID0 commit `7b81ef8b14` upstream. Since the commit `0cf4503174` ("dm raid: add support for the MD RAID0 personality"), the dm-raid subsystem can activate a RAID-0 array. Therefore, add MD_RAID0 to the dependencies of DM_RAID, so that MD_RAID0 will be selected when DM_RAID is selected. Fixes: `0cf4503174` ("dm raid: add support for the MD RAID0 personality") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Vinothkumar Raja	11dfdb1a46	dm btree: fix for dm_btree_find_lowest_key() commit `7d1fedb6e9` upstream. dm_btree_find_lowest_key() is giving incorrect results. find_key() traverses the btree correctly for finding the highest key, but there is an error in the way it traverses the btree for retrieving the lowest key. dm_btree_find_lowest_key() fetches the first key of the rightmost block of the btree instead of fetching the first key from the leftmost block. Fix this by conditionally passing the correct parameter to value64() based on the @find_highest flag. Signed-off-by: Erez Zadok <ezk@fsl.cs.sunysb.edu> Signed-off-by: Vinothkumar Raja <vinraja@cs.stonybrook.edu> Signed-off-by: Nidhi Panpalia <npanpalia@cs.stonybrook.edu> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Paolo Abeni	cd21f4af21	infiniband: call ipv6 route lookup via the stub interface commit `eea40b8f62` upstream. The infiniband address handle can be triggered to resolve an ipv6 address in response to MAD packets, regardless of the ipv6 module being disabled via the kernel command line argument. That will cause a call into the ipv6 routing code, which is not initialized, and a conseguent oops. This commit addresses the above issue replacing the direct lookup call with an indirect one via the ipv6 stub, which is properly initialized according to the ipv6 status (e.g. if ipv6 is disabled, the routing lookup fails gracefully) Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Sagi Grimberg	3992c93b74	mlx5: Fix mlx5_ib_map_mr_sg mr length commit `0a49f2c31c` upstream. In case we got an initial sg_offset, we need to account for it in the mr length. Fixes: `ff2ba99365` ("IB/core: Add passing an offset into the SG to ib_map_mr_sg") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Alexander Sverdlin	cca9c04ad4	ASoC: cs4271: configure reset GPIO as output commit `49b2e27ab9` upstream. During reset "refactoring" the output configuration was lost. This commit repairs sound on EDB93XX boards. Fixes: `9a397f4` ("ASoC: cs4271: add regulator consumer support") Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Petr Vandrovec	a7caccf886	tpm: fix handling of the TPM 2.0 event logs commit `fd5c78694f` upstream. When TPM2 log has entries with more than 3 digests, or with digests not listed in the log header, log gets misparsed, eventually leading to kernel complaint that code tried to vmalloc 512MB of memory (I have no idea what would happen on bigger system). So code should not parse only first 3 digests: both event header and event itself are already in memory, so we can parse any number of digests, as long as we do not try to parse whole memory when given count of 0xFFFFFFFF. So this change: * Rejects event entry with more digests than log header describes. Digest types should be unique, and all should be described in log header, so there cannot be more digests in the event than in the header. * Reject event entry with digest that is not described in the log header. In theory code could hardcode information about digest IDs already assigned by TCG, but if firmware authors cannot get event log format right, why should anyone believe that they got event log content right. Fixes: `4d23cc323c` ("tpm: add securityfs support for TPM 2.0 firmware event log") Signed-off-by: Petr Vandrovec <petr@vmware.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
$Hon Ching \(Vicky) Lo$ Hon Ching \(Vicky) Lo	455edd3ceb	vTPM: Fix missing NULL check commit `31574d321c` upstream. The current code passes the address of tpm_chip as the argument to dev_get_drvdata() without prior NULL check in tpm_ibmvtpm_get_desired_dma. This resulted an oops during kernel boot when vTPM is enabled in Power partition configured in active memory sharing mode. The vio_driver's get_desired_dma() is called before the probe(), which for vtpm is tpm_ibmvtpm_probe, and it's this latter function that initializes the driver and set data. Attempting to get data before the probe() caused the problem. This patch adds a NULL check to the tpm_ibmvtpm_get_desired_dma. fixes: `9e0d39d8a6` ("tpm: Remove useless priv field in struct tpm_vendor_specific") Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com> Reviewed-by: Jarkko Sakkine <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Jerry Snitselaar	b06ad9f0bf	tpm_crb: check for bad response size commit `8569defde8` upstream. Make sure size of response buffer is at least 6 bytes, or we will underflow and pass large size_t to memcpy_fromio(). This was encountered while testing earlier version of locality patchset. Fixes: `30fc8d138e` ("tpm: TPM 2.0 CRB Interface") Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Nayna Jain	7cb54bfbd5	tpm: add sleep only for retry in i2c_nuvoton_write_status() commit `0afb7118ae` upstream. Currently, there is an unnecessary 1 msec delay added in i2c_nuvoton_write_status() for the successful case. This function is called multiple times during send() and recv(), which implies adding multiple extra delays for every TPM operation. This patch calls usleep_range() only if retry is to be done. Signed-off-by: Nayna Jain <nayna@linux.vnet.ibm.com> Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Nayna Jain	6705f253bc	tpm: msleep() delays - replace with usleep_range() in i2c nuvoton driver commit `a233a0289c` upstream. Commit `500462a9de` "timers: Switch to a non-cascading wheel" replaced the 'classic' timer wheel, which aimed for near 'exact' expiry of the timers. Their analysis was that the vast majority of timeout timers are used as safeguards, not as real timers, and are cancelled or rearmed before expiration. The only exception noted to this were networking timers with a small expiry time. Not included in the analysis was the TPM polling timer, which resulted in a longer normal delay and, every so often, a very long delay. The non-cascading wheel delay is based on CONFIG_HZ. For a description of the different rings and their delays, refer to the comments in kernel/time/timer.c. Below are the delays given for rings 0 - 2, which explains the longer "normal" delays and the very, long delays as seen on systems with CONFIG_HZ 250. * HZ 1000 steps * Level Offset Granularity Range * 0 0 1 ms 0 ms - 63 ms * 1 64 8 ms 64 ms - 511 ms * 2 128 64 ms 512 ms - 4095 ms (512ms - ~4s) * HZ 250 * Level Offset Granularity Range * 0 0 4 ms 0 ms - 255 ms * 1 64 32 ms 256 ms - 2047 ms (256ms - ~2s) * 2 128 256 ms 2048 ms - 16383 ms (~2s - ~16s) Below is a comparison of extending the TPM with 1000 measurements, using msleep() vs. usleep_delay() when configured for 1000 hz vs. 250 hz, before and after commit `500462a9de`. linux-4.7 \| msleep() usleep_range() 1000 hz: 0m44.628s \| 1m34.497s 29.243s 250 hz: 1m28.510s \| 4m49.269s 32.386s linux-4.7 \| min-max (msleep) min-max (usleep_range) 1000 hz: 0:017 - 2:760s \| 0:015 - 3:967s 0:014 - 0:418s 250 hz: 0:028 - 1:954s \| 0:040 - 4:096s 0:016 - 0:816s This patch replaces the msleep() with usleep_range() calls in the i2c nuvoton driver with a consistent max range value. Signed-of-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Nayna Jain <nayna@linux.vnet.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Peter Huewe	84eef50e82	tpm_tis_spi: Add small delay after last transfer commit `5cc0101d1f` upstream. Testing the implementation with a Raspberry Pi 2 showed that under some circumstances its SPI master erroneously releases the CS line before the transfer is complete, i.e. before the end of the last clock. In this case the TPM ignores the transfer and misses for example the GO command. The driver is unable to detect this communication problem and will wait for a command response that is never going to arrive, timing out eventually. As a workaround, the small delay ensures that the CS line is held long enough, even with a faulty SPI master. Other SPI masters are not affected, except for a negligible performance penalty. Fixes: `0edbfea537` ("tpm/tpm_tis_spi: Add support for spi phy") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Benoit Houyere <benoit.houyere@st.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Peter Huewe	32a6b61947	tpm_tis_spi: Remove limitation of transfers to MAX_SPI_FRAMESIZE bytes commit `591e48c26c` upstream. Limiting transfers to MAX_SPI_FRAMESIZE was not expected by the upper layers, as tpm_tis has no such limitation. Add a loop to hide that limitation. v2: Moved scope of spi_message to the top as requested by Jarkko Fixes: `0edbfea537` ("tpm/tpm_tis_spi: Add support for spi phy") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Benoit Houyere <benoit.houyere@st.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Peter Huewe	e2fbfd47e3	tpm_tis_spi: Check correct byte for wait state indicator commit `e110cc69dc` upstream. Wait states are signaled in the last byte received from the TPM in response to the header, not the first byte. Check rx_buf[3] instead of rx_buf[0]. Fixes: `0edbfea537` ("tpm/tpm_tis_spi: Add support for spi phy") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Benoit Houyere <benoit.houyere@st.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Peter Huewe	6b2a246ad0	tpm_tis_spi: Abort transfer when too many wait states are signaled commit `975094ddc3` upstream. Abort the transfer with ETIMEDOUT when the TPM signals more than TPM_RETRY wait states. Continuing with the transfer in this state will only lead to arbitrary failures in other parts of the code. Fixes: `0edbfea537` ("tpm/tpm_tis_spi: Add support for spi phy") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Benoit Houyere <benoit.houyere@st.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Peter Huewe	b0e2a93c28	tpm_tis_spi: Use single function to transfer data commit `f848f2143a` upstream. The algorithm for sending data to the TPM is mostly identical to the algorithm for receiving data from the TPM, so a single function is sufficient to handle both cases. This is a prequisite for all the other fixes, so we don't have to fix everything twice (send/receive) v2: u16 instead of u8 for the length. Fixes: `0edbfea537` ("tpm/tpm_tis_spi: Add support for spi phy") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Benoit Houyere <benoit.houyere@st.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Amir Goldstein	580c656a95	fanotify: don't expose EOPENSTALE to userspace commit `4ff33aafd3` upstream. When delivering an event to userspace for a file on an NFS share, if the file is deleted on server side before user reads the event, user will not get the event. If the event queue contained several events, the stale event is quietly dropped and read() returns to user with events read so far in the buffer. If the event queue contains a single stale event or if the stale event is a permission event, read() returns to user with the kernel internal error code 518 (EOPENSTALE), which is not a POSIX error code. Check the internal return value -EOPENSTALE in fanotify_read(), just the same as it is checked in path_openat() and drop the event in the cases that it is not already dropped. This is a reproducer from Marko Rauhamaa: Just take the example program listed under "man fanotify" ("fantest") and follow these steps: ============================================================== NFS Server NFS Client(1) NFS Client(2) ============================================================== # echo foo >/nfsshare/bar.txt # cat /nfsshare/bar.txt foo # ./fantest /nfsshare Press enter key to terminate. Listening for events. # rm -f /nfsshare/bar.txt # cat /nfsshare/bar.txt read: Unknown error 518 cat: /nfsshare/bar.txt: Operation not permitted ============================================================== where NFS Client (1) and (2) are two terminal sessions on a single NFS Client machine. Reported-by: Marko Rauhamaa <marko.rauhamaa@f-secure.com> Tested-by: Marko Rauhamaa <marko.rauhamaa@f-secure.com> Cc: <linux-api@vger.kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Jeeja KP	826ca313fd	ALSA: hda: Fix cpu lockup when stopping the cmd dmas commit `960013762d` upstream. Using jiffies in hdac_wait_for_cmd_dmas() to determine when to time out when interrupts are off (snd_hdac_bus_stop_cmd_io()/spin_lock_irq()) causes hard lockup so unlock while waiting using jiffies. ---<-snip->--- <0>[ 1211.603046] NMI watchdog: Watchdog detected hard LOCKUP on cpu 3 <4>[ 1211.603047] Modules linked in: snd_hda_intel i915 vgem <4>[ 1211.603053] irq event stamp: 13366 <4>[ 1211.603053] hardirqs last enabled at (13365): ... <4>[ 1211.603059] Call Trace: <4>[ 1211.603059] ? delay_tsc+0x3d/0xc0 <4>[ 1211.603059] __delay+0xa/0x10 <4>[ 1211.603060] __const_udelay+0x31/0x40 <4>[ 1211.603060] snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core] <4>[ 1211.603060] ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel] <4>[ 1211.603061] snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core] <4>[ 1211.603061] azx_stop_chip+0x9/0x10 [snd_hda_codec] <4>[ 1211.603061] azx_suspend+0x72/0x220 [snd_hda_intel] <4>[ 1211.603061] pci_pm_suspend+0x71/0x140 <4>[ 1211.603062] dpm_run_callback+0x6f/0x330 <4>[ 1211.603062] ? pci_pm_freeze+0xe0/0xe0 <4>[ 1211.603062] __device_suspend+0xf9/0x370 <4>[ 1211.603062] ? dpm_watchdog_set+0x60/0x60 <4>[ 1211.603063] async_suspend+0x1a/0x90 <4>[ 1211.603063] async_run_entry_fn+0x34/0x160 <4>[ 1211.603063] process_one_work+0x1f4/0x6d0 <4>[ 1211.603063] ? process_one_work+0x16e/0x6d0 <4>[ 1211.603064] worker_thread+0x49/0x4a0 <4>[ 1211.603064] kthread+0x107/0x140 <4>[ 1211.603064] ? process_one_work+0x6d0/0x6d0 <4>[ 1211.603065] ? kthread_create_on_node+0x40/0x40 <4>[ 1211.603065] ret_from_fork+0x2e/0x40 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419 Fixes: `38b19ed7f8` ("ALSA: hda: fix to wait for RIRB & CORB DMA to set") Reported-by: Marta Lofstedt <marta.lofstedt@intel.com> Suggested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jeeja KP <jeeja.kp@intel.com> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Alexander Steffen	d2b8a861a6	tpm_tis_core: Choose appropriate timeout for reading burstcount commit `302a6ad7fc` upstream. TIS v1.3 for TPM 1.2 and PTP for TPM 2.0 disagree about which timeout value applies to reading a valid burstcount. It is TIMEOUT_D according to TIS, but TIMEOUT_A according to PTP, so choose the appropriate value depending on whether we deal with a TPM 1.2 or a TPM 2.0. This is important since according to the PTP TIMEOUT_D is much smaller than TIMEOUT_A. So the previous implementation could run into timeouts with a TPM 2.0, even though the TPM was behaving perfectly fine. During tpm2_probe TIMEOUT_D will be used even with a TPM 2.0, because TPM_CHIP_FLAG_TPM2 is not yet set. This is fine, since the timeout values will only be changed afterwards by tpm_get_timeouts. Until then TIS_TIMEOUT_D_MAX applies, which is large enough. Fixes: `aec04cbdf7` ("tpm: TPM 2.0 FIFO Interface") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Vamsi Krishna Samavedam	b3bf0bbc1e	USB: core: replace %p with %pK commit `2f964780c0` upstream. Format specifier %p can leak kernel addresses while not valuing the kptr_restrict system settings. When kptr_restrict is set to (1), kernel pointers printed using the %pK format specifier will be replaced with Zeros. Debugging Note : &pK prints only Zeros as address. If you need actual address information, write 0 to kptr_restrict. echo 0 > /proc/sys/kernel/kptr_restrict [Found by poking around in a random vendor kernel tree, it would be nice if someone would actually send these types of patches upstream - gkh] Signed-off-by: Vamsi Krishna Samavedam <vskrishn@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Willy Tarreau	28c7411cdb	char: lp: fix possible integer overflow in lp_setup() commit `3e21f4af17` upstream. The lp_setup() code doesn't apply any bounds checking when passing "lp=none", and only in this case, resulting in an overflow of the parport_nr[] array. All versions in Git history are affected. Reported-By: Roee Hay <roee.hay@hcl.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Johan Hovold	37f84f8b99	watchdog: pcwd_usb: fix NULL-deref at probe commit `46c319b848` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer should a malicious device lack endpoints. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Alan Stern	a646b2974a	USB: ene_usb6250: fix DMA to the stack commit `628c2893d4` upstream. The ene_usb6250 sub-driver in usb-storage does USB I/O to buffers on the stack, which doesn't work with vmapped stacks. This patch fixes the problem by allocating a separate 512-byte buffer at probe time and using it for all of the offending I/O operations. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Andreas Hartmann <andihartmann@01019freenet.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:08 +02:00
Maksim Salau	dff0ad3b9c	usb: misc: legousbtower: Fix memory leak commit `0bd193d62b` upstream. get_version_reply is not freed if function returns with success. Fixes: `942a48730f` ("usb: misc: legousbtower: Fix buffers on stack") Reported-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Maksim Salau <maksim.salau@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:08 +02:00
Maksim Salau	cfd3c22c36	usb: misc: legousbtower: Fix buffers on stack commit `942a48730f` upstream. Allocate buffers on HEAP instead of STACK for local structures that are to be received using usb_control_msg(). Signed-off-by: Maksim Salau <maksim.salau@gmail.com> Tested-by: Alfredo Rafael Vicente Boix <alviboi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:08 +02:00
Greg Kroah-Hartman	02d8683735	Linux 4.11.2	2017-05-20 14:50:04 +02:00
Kees Cook	bbc105f387	pstore: Shut down worker when unregistering commit `6330d55347` upstream. When built as a module and running with update_ms >= 0, pstore will Oops during module unload since the work timer is still running. This makes sure the worker is stopped before unloading. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Kees Cook	ed8834ea4f	pstore: Use dynamic spinlock initializer commit `e9a330c428` upstream. The per-prz spinlock should be using the dynamic initializer so that lockdep can correctly track it. Without this, under lockdep, we get a warning at boot that the lock is in non-static memory. Fixes: `109704492e` ("pstore: Make spinlock per zone instead of global") Fixes: `76d5692a58` ("pstore: Correctly initialize spinlock and flags") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Ankit Kumar	f25c78c879	pstore: Fix flags to enable dumps on powerpc commit `041939c1ec` upstream. After commit `c950fd6f20` kernel registers pstore write based on flag set. Pstore write for powerpc is broken as flags(PSTORE_FLAGS_DMESG) is not set for powerpc architecture. On panic, kernel doesn't write message to /fs/pstore/dmesg*(Entry doesn't gets created at all). This patch enables pstore write for powerpc architecture by setting PSTORE_FLAGS_DMESG flag. Fixes: `c950fd6f20` ("pstore: Split pstore fragile flags") Signed-off-by: Ankit Kumar <ankit@linux.vnet.ibm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Dan Williams	af0eb80e8e	libnvdimm, pfn: fix 'npfns' vs section alignment commit `d5483feda8` upstream. Fix failures to create namespaces due to the vmem_altmap not advertising enough free space to store the memmap. WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0 [..] Call Trace: dump_stack+0x63/0x83 __warn+0xcb/0xf0 warn_slowpath_null+0x1d/0x20 arch_add_memory+0xde/0xf0 devm_memremap_pages+0x244/0x440 pmem_attach_disk+0x37e/0x490 [nd_pmem] nd_pmem_probe+0x7e/0xa0 [nd_pmem] nvdimm_bus_probe+0x71/0x120 [libnvdimm] driver_probe_device+0x2bb/0x460 bind_store+0x114/0x160 drv_attr_store+0x25/0x30 In commit `658922e57b` "libnvdimm, pfn: fix memmap reservation sizing" we arranged for the capacity to be allocated, but failed to also update the 'npfns' parameter. This leads to cases where there is enough capacity reserved to hold all the allocated sections, but vmemmap_populate_hugepages() still encounters -ENOMEM from altmap_alloc_block_buf(). This fix is a stop-gap until we can teach the core memory hotplug implementation to permit sub-section hotplug. Fixes: `658922e57b` ("libnvdimm, pfn: fix memmap reservation sizing") Reported-by: Anisha Allada <anisha.allada@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Dan Williams	a3ff3ebdf3	libnvdimm: fix nvdimm_bus_lock() vs device_lock() ordering commit `452bae0aed` upstream. A debug patch to turn the standard device_lock() into something that lockdep can analyze yielded the following: ====================================================== [ INFO: possible circular locking dependency detected ] 4.11.0-rc4+ #106 Tainted: G O ------------------------------------------------------- lt-libndctl/1898 is trying to acquire lock: (&dev->nvdimm_mutex/3){+.+.+.}, at: [<ffffffffc023c948>] nd_attach_ndns+0x178/0x1b0 [libnvdimm] but task is already holding lock: (&nvdimm_bus->reconfig_mutex){+.+.+.}, at: [<ffffffffc022e0b1>] nvdimm_bus_lock+0x21/0x30 [libnvdimm] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&nvdimm_bus->reconfig_mutex){+.+.+.}: lock_acquire+0xf6/0x1f0 __mutex_lock+0x88/0x980 mutex_lock_nested+0x1b/0x20 nvdimm_bus_lock+0x21/0x30 [libnvdimm] nvdimm_namespace_capacity+0x1b/0x40 [libnvdimm] nvdimm_namespace_common_probe+0x230/0x510 [libnvdimm] nd_pmem_probe+0x14/0x180 [nd_pmem] nvdimm_bus_probe+0xa9/0x260 [libnvdimm] -> #0 (&dev->nvdimm_mutex/3){+.+.+.}: __lock_acquire+0x1107/0x1280 lock_acquire+0xf6/0x1f0 __mutex_lock+0x88/0x980 mutex_lock_nested+0x1b/0x20 nd_attach_ndns+0x178/0x1b0 [libnvdimm] nd_namespace_store+0x308/0x3c0 [libnvdimm] namespace_store+0x87/0x220 [libnvdimm] In this case '&dev->nvdimm_mutex/3' mirrors '&dev->mutex'. Fix this by replacing the use of device_lock() with nvdimm_bus_lock() to protect nd_{attach,detach}_ndns() operations. Fixes: `8c2f7e8658` ("libnvdimm: infrastructure for btt devices") Reported-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Toshi Kani	de21b800a6	libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify commit `b2518c78ce` upstream. The following BUG was observed when nd_pmem_notify() was called for a BTT device. The use of a pmem_device pointer is not valid with BTT. BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 IP: nd_pmem_notify+0x30/0xf0 [nd_pmem] Call Trace: nd_device_notify+0x40/0x50 child_notify+0x10/0x20 device_for_each_child+0x50/0x90 nd_region_notify+0x20/0x30 nd_device_notify+0x40/0x50 nvdimm_region_notify+0x27/0x30 acpi_nfit_scrub+0x341/0x590 [nfit] process_one_work+0x197/0x450 worker_thread+0x4e/0x4a0 kthread+0x109/0x140 Fix nd_pmem_notify() by setting nd_region and badblocks pointers properly for BTT. Cc: Vishal Verma <vishal.l.verma@intel.com> Fixes: `719994660c` ("libnvdimm: async notification support") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Dan Williams	d2572f5b01	libnvdimm, region: fix flush hint detection crash commit `bc042fdfbb` upstream. In the case where a dimm does not have any associated flush hints the ndrd->flush_wpq array may be uninitialized leading to crashes with the following signature: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: region_visible+0x10f/0x160 [libnvdimm] Call Trace: internal_create_group+0xbe/0x2f0 sysfs_create_groups+0x40/0x80 device_add+0x2d8/0x650 nd_async_device_register+0x12/0x40 [libnvdimm] async_run_entry_fn+0x39/0x170 process_one_work+0x212/0x6c0 ? process_one_work+0x197/0x6c0 worker_thread+0x4e/0x4a0 kthread+0x10c/0x140 ? process_one_work+0x6c0/0x6c0 ? kthread_create_on_node+0x60/0x60 ret_from_fork+0x31/0x40 Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Fixes: `f284a4f237` ("libnvdimm: introduce nvdimm_flush() and nvdimm_has_flush()") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Joeseph Chang	d517be5133	ipmi: Fix kernel panic at ipmi_ssif_thread() commit `6de65fcfdb` upstream. msg_written_handler() may set ssif_info->multi_data to NULL when using ipmitool to write fru. Before setting ssif_info->multi_data to NULL, add new local pointer "data_to_send" and store correct i2c data pointer to it to fix NULL pointer kernel panic and incorrect ssif_info->multi_pos. Signed-off-by: Joeseph Chang <joechang@codeaurora.org> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Christoph Hellwig	8f0fde5bc9	libata: reject passthrough WRITE SAME requests commit `c6ade20f5e` upstream. The WRITE SAME to TRIM translation rewrites the DATA OUT buffer. While the SCSI code accomodates for this by passing a read-writable buffer userspace applications don't cater for this behavior. In fact it can be used to rewrite e.g. a readonly file through mmap and should be considered as a security fix. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Tejun Heo	a5938c093a	cgroup: fix spurious warnings on cgroup_is_dead() from cgroup_sk_alloc() commit `a590b90d47` upstream. cgroup_get() expected to be called only on live cgroups and triggers warning on a dead cgroup; however, cgroup_sk_alloc() may be called while cloning a socket which is left in an empty and removed cgroup and thus may legitimately duplicate its reference on a dead cgroup. This currently triggers the following warning spuriously. WARNING: CPU: 14 PID: 0 at kernel/cgroup.c:490 cgroup_get+0x55/0x60 ... [<ffffffff8107e123>] __warn+0xd3/0xf0 [<ffffffff8107e20e>] warn_slowpath_null+0x1e/0x20 [<ffffffff810ff465>] cgroup_get+0x55/0x60 [<ffffffff81106061>] cgroup_sk_alloc+0x51/0xe0 [<ffffffff81761beb>] sk_clone_lock+0x2db/0x390 [<ffffffff817cce06>] inet_csk_clone_lock+0x16/0xc0 [<ffffffff817e8173>] tcp_create_openreq_child+0x23/0x4b0 [<ffffffff818601a1>] tcp_v6_syn_recv_sock+0x91/0x670 [<ffffffff817e8b16>] tcp_check_req+0x3a6/0x4e0 [<ffffffff81861ba3>] tcp_v6_rcv+0x693/0xa00 [<ffffffff81837429>] ip6_input_finish+0x59/0x3e0 [<ffffffff81837cb2>] ip6_input+0x32/0xb0 [<ffffffff81837387>] ip6_rcv_finish+0x57/0xa0 [<ffffffff81837ac8>] ipv6_rcv+0x318/0x4d0 [<ffffffff817778c7>] __netif_receive_skb_core+0x2d7/0x9a0 [<ffffffff81777fa6>] __netif_receive_skb+0x16/0x70 [<ffffffff81778023>] netif_receive_skb_internal+0x23/0x80 [<ffffffff817787d8>] napi_gro_frags+0x208/0x270 [<ffffffff8168a9ec>] mlx4_en_process_rx_cq+0x74c/0xf40 [<ffffffff8168b270>] mlx4_en_poll_rx_cq+0x30/0x90 [<ffffffff81778b30>] net_rx_action+0x210/0x350 [<ffffffff8188c426>] __do_softirq+0x106/0x2c7 [<ffffffff81082bad>] irq_exit+0x9d/0xa0 [<ffffffff8188c0e4>] do_IRQ+0x54/0xd0 [<ffffffff8188a63f>] common_interrupt+0x7f/0x7f <EOI> [<ffffffff8173d7e7>] cpuidle_enter+0x17/0x20 [<ffffffff810bdfd9>] cpu_startup_entry+0x2a9/0x2f0 [<ffffffff8103edd1>] start_secondary+0xf1/0x100 This patch renames the existing cgroup_get() with the dead cgroup warning to cgroup_get_live() after cgroup_kn_lock_live() and introduces the new cgroup_get() which doesn't check whether the cgroup is live or dead. All existing cgroup_get() users except for cgroup_sk_alloc() are converted to use cgroup_get_live(). Fixes: `d979a39d72` ("cgroup: duplicate cgroup reference when cloning sockets") Cc: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Chris Mason <clm@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Johan Hovold	740f485dae	Bluetooth: hci_intel: add missing tty-device sanity check commit `dcb9cfaa5e` upstream. Make sure to check the tty-device pointer before looking up the sibling platform device to avoid dereferencing a NULL-pointer when the tty is one end of a Unix98 pty. Fixes: `74cdad37cd` ("Bluetooth: hci_intel: Add runtime PM support") Fixes: `1ab1f239bf` ("Bluetooth: hci_intel: Add support for platform driver") Cc: Loic Poulain <loic.poulain@intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Johan Hovold	a86af7983d	Bluetooth: hci_bcm: add missing tty-device sanity check commit `95065a61e9` upstream. Make sure to check the tty-device pointer before looking up the sibling platform device to avoid dereferencing a NULL-pointer when the tty is one end of a Unix98 pty. Fixes: `0395ffc1ee` ("Bluetooth: hci_bcm: Add PM for BCM devices") Cc: Frederic Danis <frederic.danis@linux.intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Szymon Janc	ded39dd9f0	Bluetooth: Fix user channel for 32bit userspace on 64bit kernel commit `ab89f0bdd6` upstream. Running 32bit userspace on 64bit kernel results in MSG_CMSG_COMPAT being defined as 0x80000000. This results in sendmsg failure if used from 32bit userspace running on 64bit kernel. Fix this by accounting for MSG_CMSG_COMPAT in flags check in hci_sock_sendmsg. Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl> Signed-off-by: Marko Kiiskila <marko@runtime.io> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Timur Tabi	01f9853326	tty: pl011: use "qdf2400_e44" as the earlycon name for QDF2400 E44 commit `5a0722b898` upstream. Define a new early console name for Qualcomm Datacenter Technologies QDF2400 SOCs affected by erratum 44, instead of piggy-backing on "pl011". Previously, to enable traditional (non-SPCR) earlycon, the documentation said to specify "earlycon=pl011,<address>,qdf2400_e44", but the code was broken and this didn't actually work. So instead, the method for specifying the E44 work-around with traditional earlycon is "earlycon=qdf2400_e44,<address>". Both methods of earlycon are now enabled with the same function. Fixes: `e53e597fd4` ("tty: pl011: fix earlycon work-around for QDF2400 erratum 44") Signed-off-by: Timur Tabi <timur@codeaurora.org> Tested-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Wang YanQing	34e01f9207	tty: pty: Fix ldisc flush after userspace become aware of the data already commit `77dae61344` upstream. While using emacs, cat or others' commands in konsole with recent kernels, I have met many times that CTRL-C freeze konsole. After konsole freeze I can't type anything, then I have to open a new one, it is very annoying. See bug report: https://bugs.kde.org/show_bug.cgi?id=175283 The platform in that bug report is Solaris, but now the pty in linux has the same problem or the same behavior as Solaris :) It has high possibility to trigger the problem follow steps below: Note: In my test, BigFile is a text file whose size is bigger than 1G 1:open konsole 1:cat BigFile 2:CTRL-C After some digging, I find out the reason is that commit `1d1d14da12` ("pty: Fix buffer flush deadlock") changes the behavior of pty_flush_buffer. Thread A Thread B -------- -------- 1:n_tty_poll return POLLIN 2:CTRL-C trigger pty_flush_buffer tty_buffer_flush n_tty_flush_buffer 3:attempt to check count of chars: ioctl(fd, TIOCINQ, &available) available is equal to 0 4:read(fd, buffer, avaiable) return 0 5:konsole close fd Yes, I know we could use the same patch included in the BUG report as a workaround for linux platform too. But I think the data in ldisc is belong to application of another side, we shouldn't clear it when we want to flush write buffer of this side in pty_flush_buffer. So I think it is better to disable ldisc flush in pty_flush_buffer, because its new hehavior bring no benefit except that it mess up the behavior between POLLIN, and TIOCINQ or FIONREAD. Also I find no flush_buffer function in others' tty driver has the same behavior as current pty_flush_buffer. Fixes: `1d1d14da12` ("pty: Fix buffer flush deadlock") Signed-off-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Johan Hovold	aad32798dc	serial: omap: suspend device on probe errors commit `77e6fe7fd2` upstream. Make sure to actually suspend the device before returning after a failed (or deferred) probe. Note that autosuspend must be disabled before runtime pm is disabled in order to balance the usage count due to a negative autosuspend delay as well as to make the final put suspend the device synchronously. Fixes: `388bc26226` ("omap-serial: Fix the error handling in the omap_serial probe") Cc: Shubhrajyoti D <shubhrajyoti@ti.com> Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Johan Hovold	9e85b5c73a	serial: omap: fix runtime-pm handling on unbind commit `099bd73dc1` upstream. An unbalanced and misplaced synchronous put was used to suspend the device on driver unbind, something which with a likewise misplaced pm_runtime_disable leads to external aborts when an open port is being removed. Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa024010 ... [<c046e760>] (serial_omap_set_mctrl) from [<c046a064>] (uart_update_mctrl+0x50/0x60) [<c046a064>] (uart_update_mctrl) from [<c046a400>] (uart_shutdown+0xbc/0x138) [<c046a400>] (uart_shutdown) from [<c046bd2c>] (uart_hangup+0x94/0x190) [<c046bd2c>] (uart_hangup) from [<c045b760>] (__tty_hangup+0x404/0x41c) [<c045b760>] (__tty_hangup) from [<c045b794>] (tty_vhangup+0x1c/0x20) [<c045b794>] (tty_vhangup) from [<c046ccc8>] (uart_remove_one_port+0xec/0x260) [<c046ccc8>] (uart_remove_one_port) from [<c046ef4c>] (serial_omap_remove+0x40/0x60) [<c046ef4c>] (serial_omap_remove) from [<c04845e8>] (platform_drv_remove+0x34/0x4c) Fix this up by resuming the device before deregistering the port and by suspending and disabling runtime pm only after the port has been removed. Also make sure to disable autosuspend before disabling runtime pm so that the usage count is balanced and device actually suspended before returning. Note that due to a negative autosuspend delay being set in probe, the unbalanced put would actually suspend the device on first driver unbind, while rebinding and again unbinding would result in a negative power.usage_count. Fixes: `7e9c8e7dbf` ("serial: omap: make sure to suspend device before remove") Cc: Felipe Balbi <balbi@kernel.org> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Marek Szyprowski	cd823aa0bc	serial: samsung: Add missing checks for dma_map_single failure commit `500fcc08a3` upstream. This patch adds missing checks for dma_map_single() failure and proper error reporting. Although this issue was harmless on ARM architecture, it is always good to use the DMA mapping API in a proper way. This patch fixes the following DMA API debug warning: WARNING: CPU: 1 PID: 3785 at lib/dma-debug.c:1171 check_unmap+0x8a0/0xf28 dma-pl330 121a0000.pdma: DMA-API: device driver failed to check map error[device address=0x000000006e0f9000] [size=4096 bytes] [mapped as single] Modules linked in: CPU: 1 PID: 3785 Comm: (agetty) Tainted: G W 4.11.0-rc1-00137-g07ca963-dirty #59 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c011aaa4>] (unwind_backtrace) from [<c01127c0>] (show_stack+0x20/0x24) [<c01127c0>] (show_stack) from [<c06ba5d8>] (dump_stack+0x84/0xa0) [<c06ba5d8>] (dump_stack) from [<c0139528>] (__warn+0x14c/0x180) [<c0139528>] (__warn) from [<c01395a4>] (warn_slowpath_fmt+0x48/0x50) [<c01395a4>] (warn_slowpath_fmt) from [<c072a114>] (check_unmap+0x8a0/0xf28) [<c072a114>] (check_unmap) from [<c072a834>] (debug_dma_unmap_page+0x98/0xc8) [<c072a834>] (debug_dma_unmap_page) from [<c0803874>] (s3c24xx_serial_shutdown+0x314/0x52c) [<c0803874>] (s3c24xx_serial_shutdown) from [<c07f5124>] (uart_port_shutdown+0x54/0x88) [<c07f5124>] (uart_port_shutdown) from [<c07f522c>] (uart_shutdown+0xd4/0x110) [<c07f522c>] (uart_shutdown) from [<c07f6a8c>] (uart_hangup+0x9c/0x208) [<c07f6a8c>] (uart_hangup) from [<c07c426c>] (__tty_hangup+0x49c/0x634) [<c07c426c>] (__tty_hangup) from [<c07c78ac>] (tty_ioctl+0xc88/0x16e4) [<c07c78ac>] (tty_ioctl) from [<c03b5f2c>] (do_vfs_ioctl+0xc4/0xd10) [<c03b5f2c>] (do_vfs_ioctl) from [<c03b6bf4>] (SyS_ioctl+0x7c/0x8c) [<c03b6bf4>] (SyS_ioctl) from [<c010b4a0>] (ret_fast_syscall+0x0/0x3c) Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com> Fixes: `62c37eedb7` ("serial: samsung: add dma reqest/release functions") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Marek Szyprowski	43a1c1ff12	serial: samsung: Use right device for DMA-mapping calls commit `768d64f491` upstream. Driver should provide its own struct device for all DMA-mapping calls instead of extracting device pointer from DMA engine channel. Although this is harmless from the driver operation perspective on ARM architecture, it is always good to use the DMA mapping API in a proper way. This patch fixes following DMA API debug warning: WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:1241 check_sync+0x520/0x9f4 samsung-uart 12c20000.serial: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000006df0f580] [size=64 bytes] Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-00137-g07ca963 #51 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c011aaa4>] (unwind_backtrace) from [<c01127c0>] (show_stack+0x20/0x24) [<c01127c0>] (show_stack) from [<c06ba5d8>] (dump_stack+0x84/0xa0) [<c06ba5d8>] (dump_stack) from [<c0139528>] (__warn+0x14c/0x180) [<c0139528>] (__warn) from [<c01395a4>] (warn_slowpath_fmt+0x48/0x50) [<c01395a4>] (warn_slowpath_fmt) from [<c0729058>] (check_sync+0x520/0x9f4) [<c0729058>] (check_sync) from [<c072967c>] (debug_dma_sync_single_for_device+0x88/0xc8) [<c072967c>] (debug_dma_sync_single_for_device) from [<c0803c10>] (s3c24xx_serial_start_tx_dma+0x100/0x2f8) [<c0803c10>] (s3c24xx_serial_start_tx_dma) from [<c0804338>] (s3c24xx_serial_tx_chars+0x198/0x33c) Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com> Fixes: `62c37eedb7` ("serial: samsung: add dma reqest/release functions") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Eric Biggers	ab62f4118b	fscrypt: avoid collisions when presenting long encrypted filenames commit `6b06cdee81` upstream. When accessing an encrypted directory without the key, userspace must operate on filenames derived from the ciphertext names, which contain arbitrary bytes. Since we must support filenames as long as NAME_MAX, we can't always just base64-encode the ciphertext, since that may make it too long. Currently, this is solved by presenting long names in an abbreviated form containing any needed filesystem-specific hashes (e.g. to identify a directory block), then the last 16 bytes of ciphertext. This needs to be sufficient to identify the actual name on lookup. However, there is a bug. It seems to have been assumed that due to the use of a CBC (ciphertext block chaining)-based encryption mode, the last 16 bytes (i.e. the AES block size) of ciphertext would depend on the full plaintext, preventing collisions. However, we actually use CBC with ciphertext stealing (CTS), which handles the last two blocks specially, causing them to appear "flipped". Thus, it's actually the second-to-last block which depends on the full plaintext. This caused long filenames that differ only near the end of their plaintexts to, when observed without the key, point to the wrong inode and be undeletable. For example, with ext4: # echo pass \| e4crypt add_key -p 16 edir/ # seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 \| xargs touch # find edir/ -type f \| xargs stat -c %i \| sort \| uniq \| wc -l 100000 # sync # echo 3 > /proc/sys/vm/drop_caches # keyctl new_session # find edir/ -type f \| xargs stat -c %i \| sort \| uniq \| wc -l 2004 # rm -rf edir/ rm: cannot remove 'edir/_A7nNFi3rhkEQlJ6P,hdzluhODKOeWx5V': Structure needs cleaning ... To fix this, when presenting long encrypted filenames, encode the second-to-last block of ciphertext rather than the last 16 bytes. Although it would be nice to solve this without depending on a specific encryption mode, that would mean doing a cryptographic hash like SHA-256 which would be much less efficient. This way is sufficient for now, and it's still compatible with encryption modes like HEH which are strong pseudorandom permutations. Also, changing the presented names is still allowed at any time because they are only provided to allow applications to do things like delete encrypted directories. They're not designed to be used to persistently identify files --- which would be hard to do anyway, given that they're encrypted after all. For ease of backports, this patch only makes the minimal fix to both ext4 and f2fs. It leaves ubifs as-is, since ubifs doesn't compare the ciphertext block yet. Follow-on patches will clean things up properly and make the filesystems use a shared helper function. Fixes: `5de0b4d0cd` ("ext4 crypto: simplify and speed up filename encryption") Reported-by: Gwendal Grignou <gwendal@chromium.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Eric Biggers	eb86b4c68b	fscrypt: fix context consistency check when key(s) unavailable commit `272f98f684` upstream. To mitigate some types of offline attacks, filesystem encryption is designed to enforce that all files in an encrypted directory tree use the same encryption policy (i.e. the same encryption context excluding the nonce). However, the fscrypt_has_permitted_context() function which enforces this relies on comparing struct fscrypt_info's, which are only available when we have the encryption keys. This can cause two incorrect behaviors: 1. If we have the parent directory's key but not the child's key, or vice versa, then fscrypt_has_permitted_context() returned false, causing applications to see EPERM or ENOKEY. This is incorrect if the encryption contexts are in fact consistent. Although we'd normally have either both keys or neither key in that case since the master_key_descriptors would be the same, this is not guaranteed because keys can be added or removed from keyrings at any time. 2. If we have neither the parent's key nor the child's key, then fscrypt_has_permitted_context() returned true, causing applications to see no error (or else an error for some other reason). This is incorrect if the encryption contexts are in fact inconsistent, since in that case we should deny access. To fix this, retrieve and compare the fscrypt_contexts if we are unable to set up both fscrypt_infos. While this slightly hurts performance when accessing an encrypted directory tree without the key, this isn't a case we really need to be optimizing for; access with the key is much more important. Furthermore, the performance hit is barely noticeable given that we are already retrieving the fscrypt_context and doing two keyring searches in fscrypt_get_encryption_info(). If we ever actually wanted to optimize this case we might start by caching the fscrypt_contexts. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Linus Torvalds	602f8911c7	initramfs: avoid "label at end of compound statement" error commit `394e4f5d58` upstream. Commit `17a9be3174` ("initramfs: Always do fput() and load modules after rootfs populate") introduced an error for the CONFIG_BLK_DEV_RAM=y case, because even though the code looks fine, the compiler really wants a statement after a label, or you'll get complaints: init/initramfs.c: In function 'populate_rootfs': init/initramfs.c:644:2: error: label at end of compound statement That commit moved the subsequent statements to outside the compound statement, leaving the label without any associated statements. Reported-by: Jörg Otte <jrg.otte@gmail.com> Fixes: `17a9be3174` ("initramfs: Always do fput() and load modules after rootfs populate") Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Stafford Horne	e54af78e1a	initramfs: Always do fput() and load modules after rootfs populate commit `17a9be3174` upstream. In OpenRISC we do not have a bootloader passed initrd, but the built in initramfs does contain the /init and other binaries, including modules. The previous commit `0886551480` ("initramfs: finish fput() before accessing any binary from initramfs") made a change to only call fput() if the bootloader initrd was available, this caused intermittent crashes for OpenRISC. This patch changes the fput() to happen unconditionally if any rootfs is loaded. Also, I added some comments to make it a bit more clear why we call unpack_to_rootfs() multiple times. Fixes: `0886551480` ("initramfs: finish fput() before accessing any binary from initramfs") Cc: Lokesh Vutla <lokeshvutla@ti.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Stafford Horne <shorne@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Jan Kara	9b3be66cb4	f2fs: Make flush bios explicitely sync commit `3adc5fcb7e` upstream. Commit `b685d3d65a` "block: treat REQ_FUA and REQ_PREFLUSH as synchronous" removed REQ_SYNC flag from WRITE_{FUA\|PREFLUSH\|...} definitions. generic_make_request_checks() however strips REQ_FUA and REQ_PREFLUSH flags from a bio when the storage doesn't report volatile write cache and thus write effectively becomes asynchronous which can lead to performance regressions. Fix the problem by making sure all bios which are synchronous are properly marked with REQ_SYNC. Fixes: `b685d3d65a` CC: Jaegeuk Kim <jaegeuk@kernel.org> CC: linux-f2fs-devel@lists.sourceforge.net Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Jaegeuk Kim	bd3dfe5049	f2fs: check entire encrypted bigname when finding a dentry commit `6332cd32c8` upstream. If user has no key under an encrypted dir, fscrypt gives digested dentries. Previously, when looking up a dentry, f2fs only checks its hash value with first 4 bytes of the digested dentry, which didn't handle hash collisions fully. This patch enhances to check entire dentry bytes likewise ext4. Eric reported how to reproduce this issue by: # seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 \| xargs touch # find edir -type f \| xargs stat -c %i \| sort \| uniq \| wc -l 100000 # sync # echo 3 > /proc/sys/vm/drop_caches # keyctl new_session # find edir -type f \| xargs stat -c %i \| sort \| uniq \| wc -l 99999 Reported-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> (fixed f2fs_dentry_hash() to work even when the hash is 0) Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Sheng Yong	3a625468bd	f2fs: fix multiple f2fs_add_link() having same name for inline dentry commit `d3bb910c15` upstream. Commit `88c5c13a50` (f2fs: fix multiple f2fs_add_link() calls having same name) does not cover the scenario where inline dentry is enabled. In that case, F2FS_I(dir)->task will be NULL, and __f2fs_add_link will lookup dentries one more time. This patch fixes it by moving the assigment of current task to a upper level to cover both normal and inline dentry. Fixes: `88c5c13a50` (f2fs: fix multiple f2fs_add_link() calls having same name) Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Jaegeuk Kim	e71f099677	f2fs: fix fs corruption due to zero inode page commit `9bb02c3627` upstream. This patch fixes the following scenario. - f2fs_create/f2fs_mkdir - write_checkpoint - f2fs_mark_inode_dirty_sync - block_operations - f2fs_lock_all - f2fs_sync_inode_meta - f2fs_unlock_all - sync_inode_metadata - f2fs_lock_op - f2fs_write_inode - update_inode_page - get_node_page return -ENOENT - new_inode_page - fill_node_footer - f2fs_mark_inode_dirty_sync - ... - f2fs_unlock_op - f2fs_inode_synced - f2fs_lock_all - do_checkpoint In this checkpoint, we can get an inode page which contains zeros having valid node footer only. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Jaegeuk Kim	ed4d26a1e4	Revert "f2fs: put allocate_segment after refresh_sit_entry" commit `c6f82fe90d` upstream. This reverts commit `3436c4bdb3`. This makes a leak to register dirty segments. I reproduced the issue by modified postmark which injects a lot of file create/delete/update and finally triggers huge number of SSR allocations. [Jaegeuk Kim: Change missing incorrect comment] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Jaegeuk Kim	33cbcc2556	f2fs: fix wrong max cost initialization commit `c541a51b8c` upstream. This patch fixes missing increased max cost caused by a patch that we increased cose of data segments in greedy algorithm. Fixes: `b9cd20619` "f2fs: node segment is prior to data segment selected victim" Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Ross Zwisler	03606c8ecc	dax: fix PMD data corruption when fault races with write commit `876f29460c` upstream. This is based on a patch from Jan Kara that fixed the equivalent race in the DAX PTE fault path. Currently DAX PMD read fault can race with write(2) in the following way: CPU1 - write(2) CPU2 - read fault dax_iomap_pmd_fault() ->iomap_begin() - sees hole dax_iomap_rw() iomap_apply() ->iomap_begin - allocates blocks dax_iomap_actor() invalidate_inode_pages2_range() - there's nothing to invalidate grab_mapping_entry() - we add huge zero page to the radix tree and map it to page tables The result is that hole page is mapped into page tables (and thus zeros are seen in mmap) while file has data written in that place. Fix the problem by locking exception entry before mapping blocks for the fault. That way we are sure invalidate_inode_pages2_range() call for racing write will either block on entry lock waiting for the fault to finish (and unmap stale page tables after that) or read fault will see already allocated blocks by write(2). Fixes: `9f141d6ef6` ("dax: Call ->iomap_begin without entry lock during dax fault") Link: http://lkml.kernel.org/r/20170510172700.18991-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Jan Kara	5a3651b4a9	ext4: return to starting transaction in ext4_dax_huge_fault() commit `fb26a1cbed` upstream. DAX will return to locking exceptional entry before mapping blocks for a page fault to fix possible races with concurrent writes. To avoid lock inversion between exceptional entry lock and transaction start, start the transaction already in ext4_dax_huge_fault(). Fixes: `9f141d6ef6` Link: http://lkml.kernel.org/r/20170510085419.27601-4-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Jan Kara	85f353dbd0	mm: fix data corruption due to stale mmap reads commit `cd656375f9` upstream. Currently, we didn't invalidate page tables during invalidate_inode_pages2() for DAX. That could result in e.g. 2MiB zero page being mapped into page tables while there were already underlying blocks allocated and thus data seen through mmap were different from data seen by read(2). The following sequence reproduces the problem: - open an mmap over a 2MiB hole - read from a 2MiB hole, faulting in a 2MiB zero page - write to the hole with write(3p). The write succeeds but we incorrectly leave the 2MiB zero page mapping intact. - via the mmap, read the data that was just written. Since the zero page mapping is still intact we read back zeroes instead of the new data. Fix the problem by unconditionally calling invalidate_inode_pages2_range() in dax_iomap_actor() for new block allocations and by properly invalidating page tables in invalidate_inode_pages2_range() for DAX mappings. Fixes: `c6dcf52c23` Link: http://lkml.kernel.org/r/20170510085419.27601-3-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Ross Zwisler	06305f51ef	dax: prevent invalidation of mapped DAX entries commit `4636e70bb0` upstream. Patch series "mm,dax: Fix data corruption due to mmap inconsistency", v4. This series fixes data corruption that can happen for DAX mounts when page faults race with write(2) and as a result page tables get out of sync with block mappings in the filesystem and thus data seen through mmap is different from data seen through read(2). The series passes testing with t_mmap_stale test program from Ross and also other mmap related tests on DAX filesystem. This patch (of 4): dax_invalidate_mapping_entry() currently removes DAX exceptional entries only if they are clean and unlocked. This is done via: invalidate_mapping_pages() invalidate_exceptional_entry() dax_invalidate_mapping_entry() However, for page cache pages removed in invalidate_mapping_pages() there is an additional criteria which is that the page must not be mapped. This is noted in the comments above invalidate_mapping_pages() and is checked in invalidate_inode_page(). For DAX entries this means that we can can end up in a situation where a DAX exceptional entry, either a huge zero page or a regular DAX entry, could end up mapped but without an associated radix tree entry. This is inconsistent with the rest of the DAX code and with what happens in the page cache case. We aren't able to unmap the DAX exceptional entry because according to its comments invalidate_mapping_pages() isn't allowed to block, and unmap_mapping_range() takes a write lock on the mapping->i_mmap_rwsem. Since we essentially never have unmapped DAX entries to evict from the radix tree, just remove dax_invalidate_mapping_entry(). Fixes: `c6dcf52c23` ("mm: Invalidate DAX radix tree entries only if appropriate") Link: http://lkml.kernel.org/r/20170510085419.27601-2-jack@suse.cz Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Dan Williams	55dbfd7dcd	device-dax: fix sysfs attribute deadlock commit `565851c972` upstream. Usage of device_lock() for dax_region attributes is unnecessary and deadlock prone. It's unnecessary because the order of registration / un-registration guarantees that drvdata is always valid. It's deadlock prone because it sets up this situation: ndctl D 0 2170 2082 0x00000000 Call Trace: __schedule+0x31f/0x980 schedule+0x3d/0x90 schedule_preempt_disabled+0x15/0x20 __mutex_lock+0x402/0x980 ? __mutex_lock+0x158/0x980 ? align_show+0x2b/0x80 [dax] ? kernfs_seq_start+0x2f/0x90 mutex_lock_nested+0x1b/0x20 align_show+0x2b/0x80 [dax] dev_attr_show+0x20/0x50 ndctl D 0 2186 2079 0x00000000 Call Trace: __schedule+0x31f/0x980 schedule+0x3d/0x90 __kernfs_remove+0x1f6/0x340 ? kernfs_remove_by_name_ns+0x45/0xa0 ? remove_wait_queue+0x70/0x70 kernfs_remove_by_name_ns+0x45/0xa0 remove_files.isra.1+0x35/0x70 sysfs_remove_group+0x44/0x90 sysfs_remove_groups+0x2e/0x50 dax_region_unregister+0x25/0x40 [dax] devm_action_release+0xf/0x20 release_nodes+0x16d/0x2b0 devres_release_all+0x3c/0x60 device_release_driver_internal+0x17d/0x220 device_release_driver+0x12/0x20 unbind_store+0x112/0x160 ndctl/2170 is trying to acquire the device_lock() to read an attribute, and ndctl/2186 is holding the device_lock() while trying to drain all active attribute readers. Thanks to Yi Zhang for the reproduction script. Fixes: `d7fe1a67f6` ("dax: add region 'id', 'size', and 'align' attributes") Reported-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Dan Williams	8931477790	device-dax: fix cdev leak commit `ed01e50acd` upstream. If device_add() fails, cleanup the cdev. Otherwise, we leak a kobj_map() with a stale device number. As Jason points out, there is a small possibility that userspace has opened and mapped the device in the time between cdev_add() and the device_add() failure. We need a new kill_dax_dev() helper to invalidate any established mappings. Fixes: `ba09c01d2f` ("dax: convert to the cdev api") Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
NeilBrown	9bcb9cc96d	md/raid1: avoid reusing a resync bio after error handling. commit `0c9d5b127f` upstream. fix_sync_read_error() modifies a bio on a newly faulty device by setting bi_end_io to end_sync_write. This ensure that put_buf() will still call rdev_dec_pending() as required, but makes sure that subsequent code in fix_sync_read_error() doesn't try to read from the device. Unfortunately this interacts badly with sync_request_write() which assumes that any bio with bi_end_io set to non-NULL other than end_sync_read is safe to write to. As the device is now faulty it doesn't make sense to write. As the bio was recently used for a read, it is "dirty" and not suitable for immediate submission. In particular, ->bi_next might be non-NULL, which will cause generic_make_request() to complain. Break this interaction by refusing to write to devices which are marked as Faulty. Reported-and-tested-by: Michael Wang <yun.wang@profitbricks.com> Fixes: `2e52d449bc` ("md/raid1: add failfast handling for reads.") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Jason A. Donenfeld	7d70808547	padata: free correct variable commit `07a77929ba` upstream. The author meant to free the variable that was just allocated, instead of the one that failed to be allocated, but made a simple typo. This patch rectifies that. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Amir Goldstein	f33f17b28d	ovl: do not set overlay.opaque on non-dir create commit `4a99f3c83d` upstream. The optimization for opaque dir create was wrongly being applied also to non-dir create. Fixes: `97c684cc91` ("ovl: create directories inside merged parent opaque") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Björn Jacke	6a1baa3a16	CIFS: add misssing SFM mapping for doublequote commit `85435d7a15` upstream. SFM is mapping doublequote to 0xF020 Without this patch creating files with doublequote fails to Windows/Mac Signed-off-by: Bjoern Jacke <bjacke@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
David Disseldorp	c4f35cc8dc	cifs: fix CIFS_IOC_GET_MNT_INFO oops commit `d8a6e505d6` upstream. An open directory may have a NULL private_data pointer prior to readdir. Fixes: `0de1f4c6f6` ("Add way to query server fs info for smb3") Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Rabin Vincent	33fa011644	CIFS: fix oplock break deadlocks commit `3998e6b87d` upstream. When the final cifsFileInfo_put() is called from cifsiod and an oplock break work is queued, lockdep complains loudly: ============================================= [ INFO: possible recursive locking detected ] 4.11.0+ #21 Not tainted --------------------------------------------- kworker/0:2/78 is trying to acquire lock: ("cifsiod"){++++.+}, at: flush_work+0x215/0x350 but task is already holding lock: ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock("cifsiod"); lock("cifsiod"); * DEADLOCK * May be due to missing lock nesting notation 2 locks held by kworker/0:2/78: #0: ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0 #1: ((&wdata->work)){+.+...}, at: process_one_work+0x255/0x8e0 stack backtrace: CPU: 0 PID: 78 Comm: kworker/0:2 Not tainted 4.11.0+ #21 Workqueue: cifsiod cifs_writev_complete Call Trace: dump_stack+0x85/0xc2 __lock_acquire+0x17dd/0x2260 ? match_held_lock+0x20/0x2b0 ? trace_hardirqs_off_caller+0x86/0x130 ? mark_lock+0xa6/0x920 lock_acquire+0xcc/0x260 ? lock_acquire+0xcc/0x260 ? flush_work+0x215/0x350 flush_work+0x236/0x350 ? flush_work+0x215/0x350 ? destroy_worker+0x170/0x170 __cancel_work_timer+0x17d/0x210 ? ___preempt_schedule+0x16/0x18 cancel_work_sync+0x10/0x20 cifsFileInfo_put+0x338/0x7f0 cifs_writedata_release+0x2a/0x40 ? cifs_writedata_release+0x2a/0x40 cifs_writev_complete+0x29d/0x850 ? preempt_count_sub+0x18/0xd0 process_one_work+0x304/0x8e0 worker_thread+0x9b/0x6a0 kthread+0x1b2/0x200 ? process_one_work+0x8e0/0x8e0 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x31/0x40 This is a real warning. Since the oplock is queued on the same workqueue this can deadlock if there is only one worker thread active for the workqueue (which will be the case during memory pressure when the rescuer thread is handling it). Furthermore, there is at least one other kind of hang possible due to the oplock break handling if there is only worker. (This can be reproduced without introducing memory pressure by having passing 1 for the max_active parameter of cifsiod.) cifs_oplock_break() can wait indefintely in the filemap_fdatawait() while the cifs_writev_complete() work is blocked: sysrq: SysRq : Show Blocked State task PC stack pid father kworker/0:1 D 0 16 2 0x00000000 Workqueue: cifsiod cifs_oplock_break Call Trace: __schedule+0x562/0xf40 ? mark_held_locks+0x4a/0xb0 schedule+0x57/0xe0 io_schedule+0x21/0x50 wait_on_page_bit+0x143/0x190 ? add_to_page_cache_lru+0x150/0x150 __filemap_fdatawait_range+0x134/0x190 ? do_writepages+0x51/0x70 filemap_fdatawait_range+0x14/0x30 filemap_fdatawait+0x3b/0x40 cifs_oplock_break+0x651/0x710 ? preempt_count_sub+0x18/0xd0 process_one_work+0x304/0x8e0 worker_thread+0x9b/0x6a0 kthread+0x1b2/0x200 ? process_one_work+0x8e0/0x8e0 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x31/0x40 dd D 0 683 171 0x00000000 Call Trace: __schedule+0x562/0xf40 ? mark_held_locks+0x29/0xb0 schedule+0x57/0xe0 io_schedule+0x21/0x50 wait_on_page_bit+0x143/0x190 ? add_to_page_cache_lru+0x150/0x150 __filemap_fdatawait_range+0x134/0x190 ? do_writepages+0x51/0x70 filemap_fdatawait_range+0x14/0x30 filemap_fdatawait+0x3b/0x40 filemap_write_and_wait+0x4e/0x70 cifs_flush+0x6a/0xb0 filp_close+0x52/0xa0 __close_fd+0xdc/0x150 SyS_close+0x33/0x60 entry_SYSCALL_64_fastpath+0x1f/0xbe Showing all locks held in the system: 2 locks held by kworker/0:1/16: #0: ("cifsiod"){.+.+.+}, at: process_one_work+0x255/0x8e0 #1: ((&cfile->oplock_break)){+.+.+.}, at: process_one_work+0x255/0x8e0 Showing busy workqueues and worker pools: workqueue cifsiod: flags=0xc pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1 in-flight: 16:cifs_oplock_break delayed: cifs_writev_complete, cifs_echo_request pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 750 3 Fix these problems by creating a a new workqueue (with a rescuer) for the oplock break work. Signed-off-by: Rabin Vincent <rabinv@axis.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
David Disseldorp	076c404c12	cifs: fix CIFS_ENUMERATE_SNAPSHOTS oops commit `6026685de3` upstream. As with `618763958b`, an open directory may have a NULL private_data pointer prior to readdir. CIFS_ENUMERATE_SNAPSHOTS must check for this before dereference. Fixes: `834170c859` ("Enable previous version support") Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
David Disseldorp	63d86cd970	cifs: fix leak in FSCTL_ENUM_SNAPS response handling commit `0e5c795592` upstream. The server may respond with success, and an output buffer less than sizeof(struct smb_snapshot_array) in length. Do not leak the output buffer in this case. Fixes: `834170c859` ("Enable previous version support") Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Björn Jacke	160239dfbf	CIFS: fix mapping of SFM_SPACE and SFM_PERIOD commit `b704e70b7c` upstream. - trailing space maps to 0xF028 - trailing period maps to 0xF029 This fix corrects the mapping of file names which have a trailing character that would otherwise be illegal (period or space) but is allowed by POSIX. Signed-off-by: Bjoern Jacke <bjacke@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Steve French	6f95f1925a	SMB3: Work around mount failure when using SMB3 dialect to Macs commit `7db0a6efdc` upstream. Macs send the maximum buffer size in response on ioctl to validate negotiate security information, which causes us to fail the mount as the response buffer is larger than the expected response. Changed ioctl response processing to allow for padding of validate negotiate ioctl response and limit the maximum response size to maximum buffer size. Signed-off-by: Steve French <steve.french@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Steve French	49f70d15b5	Set unicode flag on cifs echo request to avoid Mac error commit `26c9cb668c` upstream. Mac requires the unicode flag to be set for cifs, even for the smb echo request (which doesn't have strings). Without this Mac rejects the periodic echo requests (when mounting with cifs) that we use to check if server is down Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Sachin Prabhu	fdf150ee55	Do not return number of bytes written for ioctl CIFS_IOC_COPYCHUNK_FILE commit `7d0c234fd2` upstream. commit `620d8745b3` ("Introduce cifs_copy_file_range()") changes the behaviour of the cifs ioctl call CIFS_IOC_COPYCHUNK_FILE. In case of successful writes, it now returns the number of bytes written. This return value is treated as an error by the xfstest cifs/001. Depending on the errno set at that time, this may or may not result in the test failing. The patch fixes this by setting the return value to 0 in case of successful writes. Fixes: commit `620d8745b3` ("Introduce cifs_copy_file_range()") Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Sachin Prabhu	ecf9bb0fd7	Fix match_prepath() commit `cd8c42968e` upstream. Incorrect return value for shares not using the prefix path means that we will never match superblocks for these shares. Fixes: commit `c1d8b24d18` ("Compare prepaths when comparing superblocks") Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Vlastimil Babka	d322fd67ca	mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC commit `62be1511b1` upstream. Patch series "more robust PF_MEMALLOC handling" This series aims to unify the setting and clearing of PF_MEMALLOC, which prevents recursive reclaim. There are some places that clear the flag unconditionally from current->flags, which may result in clearing a pre-existing flag. This already resulted in a bug report that Patch 1 fixes (without the new helpers, to make backporting easier). Patch 2 introduces the new helpers, modelled after existing memalloc_noio_* and memalloc_nofs_* helpers, and converts mm core to use them. Patches 3 and 4 convert non-mm code. This patch (of 4): __alloc_pages_direct_compact() sets PF_MEMALLOC to prevent deadlock during page migration by lock_page() (see the comment in __unmap_and_move()). Then it unconditionally clears the flag, which can clear a pre-existing PF_MEMALLOC flag and result in recursive reclaim. This was not a problem until commit `a8161d1ed6` ("mm, page_alloc: restructure direct compaction handling in slowpath"), because direct compation was called only after direct reclaim, which was skipped when PF_MEMALLOC flag was set. Even now it's only a theoretical issue, as the new callsite of __alloc_pages_direct_compact() is reached only for costly orders and when gfp_pfmemalloc_allowed() is true, which means either __GFP_NOMEMALLOC is in gfp_flags or in_interrupt() is true. There is no such known context, but let's play it safe and make __alloc_pages_direct_compact() robust for cases where PF_MEMALLOC is already set. Fixes: `a8161d1ed6` ("mm, page_alloc: restructure direct compaction handling in slowpath") Link: http://lkml.kernel.org/r/20170405074700.29871-2-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Chris Leech <cleech@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Johannes Weiner	41a68ab851	mm: vmscan: fix IO/refault regression in cache workingset transition commit `2a2e48854d` upstream. Since commit `59dc76b0d4` ("mm: vmscan: reduce size of inactive file list") we noticed bigger IO spikes during changes in cache access patterns. The patch in question shrunk the inactive list size to leave more room for the current workingset in the presence of streaming IO. However, workingset transitions that previously happened on the inactive list are now pushed out of memory and incur more refaults to complete. This patch disables active list protection when refaults are being observed. This accelerates workingset transitions, and allows more of the new set to establish itself from memory, without eating into the ability to protect the established workingset during stable periods. The workloads that were measurably affected for us were hit pretty bad by it, with refault/majfault rates doubling and tripling during cache transitions, and the machines sustaining half-hour periods of 100% IO utilization, where they'd previously have sub-minute peaks at 60-90%. Stateful services that handle user data tend to be more conservative with kernel upgrades. As a result we hit most page cache issues with some delay, as was the case here. The severity seemed to warrant a stable tag. Fixes: `59dc76b0d4` ("mm: vmscan: reduce size of inactive file list") Link: http://lkml.kernel.org/r/20170404220052.27593-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Andrey Ryabinin	f7bccf3f16	fs/block_dev: always invalidate cleancache in invalidate_bdev() commit `a5f6a6a9c7` upstream. invalidate_bdev() calls cleancache_invalidate_inode() iff ->nrpages != 0 which doen't make any sense. Make sure that invalidate_bdev() always calls cleancache_invalidate_inode() regardless of mapping->nrpages value. Fixes: `c515e1fd36` ("mm/fs: add hooks to support cleancache") Link: http://lkml.kernel.org/r/20170424164135.22350-3-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexey Kuznetsov <kuznet@virtuozzo.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Andrey Ryabinin	601462864d	fs: fix data invalidation in the cleancache during direct IO commit `55635ba76e` upstream. Patch series "Properly invalidate data in the cleancache", v2. We've noticed that after direct IO write, buffered read sometimes gets stale data which is coming from the cleancache. The reason for this is that some direct write hooks call call invalidate_inode_pages2[_range]() conditionally iff mapping->nrpages is not zero, so we may not invalidate data in the cleancache. Another odd thing is that we check only for ->nrpages and don't check for ->nrexceptional, but invalidate_inode_pages2[_range] also invalidates exceptional entries as well. So we invalidate exceptional entries only if ->nrpages != 0? This doesn't feel right. - Patch 1 fixes direct IO writes by removing ->nrpages check. - Patch 2 fixes similar case in invalidate_bdev(). Note: I only fixed conditional cleancache_invalidate_inode() here. Do we also need to add ->nrexceptional check in into invalidate_bdev()? - Patches 3-4: some optimizations. This patch (of 4): Some direct IO write fs hooks call invalidate_inode_pages2[_range]() conditionally iff mapping->nrpages is not zero. This can't be right, because invalidate_inode_pages2[_range]() also invalidate data in the cleancache via cleancache_invalidate_inode() call. So if page cache is empty but there is some data in the cleancache, buffered read after direct IO write would get stale data from the cleancache. Also it doesn't feel right to check only for ->nrpages because invalidate_inode_pages2[_range] invalidates exceptional entries as well. Fix this by calling invalidate_inode_pages2[_range]() regardless of nrpages state. Note: nfs,cifs,9p doesn't need similar fix because the never call cleancache_get_page() (nor directly, nor via mpage_readpage[s]()), so they are not affected by this bug. Fixes: `c515e1fd36` ("mm/fs: add hooks to support cleancache") Link: http://lkml.kernel.org/r/20170424164135.22350-2-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexey Kuznetsov <kuznet@virtuozzo.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Luis Henriques	a4c34c20fb	ceph: fix memory leak in __ceph_setxattr() commit `eeca958dce` upstream. The ceph_inode_xattr needs to be released when removing an xattr. Easily reproducible running the 'generic/020' test from xfstests or simply by doing: attr -s attr0 -V 0 /mnt/test && attr -r attr0 /mnt/test While there, also fix the error path. Here's the kmemleak splat: unreferenced object 0xffff88001f86fbc0 (size 64): comm "attr", pid 244, jiffies 4294904246 (age 98.464s) hex dump (first 32 bytes): 40 fa 86 1f 00 88 ff ff 80 32 38 1f 00 88 ff ff @........28..... 00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de ................ backtrace: [<ffffffff81560199>] kmemleak_alloc+0x49/0xa0 [<ffffffff810f3e5b>] kmem_cache_alloc+0x9b/0xf0 [<ffffffff812b157e>] __ceph_setxattr+0x17e/0x820 [<ffffffff812b1c57>] ceph_set_xattr_handler+0x37/0x40 [<ffffffff8111fb4b>] __vfs_removexattr+0x4b/0x60 [<ffffffff8111fd37>] vfs_removexattr+0x77/0xd0 [<ffffffff8111fdd1>] removexattr+0x41/0x60 [<ffffffff8111fe65>] path_removexattr+0x75/0xa0 [<ffffffff81120aeb>] SyS_lremovexattr+0xb/0x10 [<ffffffff81564b20>] entry_SYSCALL_64_fastpath+0x13/0x94 [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Michal Hocko	2cfc744069	fs/xattr.c: zero out memory copied to userspace in getxattr commit `81be3dee96` upstream. getxattr uses vmalloc to allocate memory if kzalloc fails. This is filled by vfs_getxattr and then copied to the userspace. vmalloc, however, doesn't zero out the memory so if the specific implementation of the xattr handler is sloppy we can theoretically expose a kernel memory. There is no real sign this is really the case but let's make sure this will not happen and use vzalloc instead. Fixes: `779302e678` ("fs/xattr.c:getxattr(): improve handling of allocation failures") Link: http://lkml.kernel.org/r/20170306103327.2766-1-mhocko@kernel.org Acked-by: Kees Cook <keescook@chromium.org> Reported-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Martin Brandenburg	8bd8322308	orangefs: do not check possibly stale size on truncate commit `53950ef541` upstream. Let the server figure this out because our size might be out of date or not present. The bug was that xfs_io -f -t -c "pread -v 0 100" /mnt/foo echo "Test" > /mnt/foo xfs_io -f -t -c "pread -v 0 100" /mnt/foo fails because the second truncate did not happen if nothing had requested the size after the write in echo. Thus i_size was zero (not present) and the orangefs_setattr though i_size was zero and there was nothing to do. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Martin Brandenburg	66f1885bea	orangefs: do not set getattr_time on orangefs_lookup commit `17930b252c` upstream. Since orangefs_lookup calls orangefs_iget which calls orangefs_inode_getattr, getattr_time will get set. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Martin Brandenburg	a6d3c33255	orangefs: clean up oversize xattr validation commit `e675c5ec51` upstream. Also don't check flags as this has been validated by the VFS already. Fix an off-by-one error in the max size checking. Stop logging just because userspace wants to write attributes which do not fit. This and the previous commit fix xfstests generic/020. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Martin Brandenburg	7adb15036a	orangefs: fix bounds check for listxattr commit `a956af337b` upstream. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Eric Biggers	1567c17063	ext4: evict inline data when writing to memory map commit `7b4cc9787f` upstream. Currently the case of writing via mmap to a file with inline data is not handled. This is maybe a rare case since it requires a writable memory map of a very small file, but it is trivial to trigger with on inline_data filesystem, and it causes the 'BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA));' in ext4_writepages() to be hit: mkfs.ext4 -O inline_data /dev/vdb mount /dev/vdb /mnt xfs_io -f /mnt/file \ -c 'pwrite 0 1' \ -c 'mmap -w 0 1m' \ -c 'mwrite 0 1' \ -c 'fsync' kernel BUG at fs/ext4/inode.c:2723! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 2532 Comm: xfs_io Not tainted 4.11.0-rc1-xfstests-00301-g071d9acf3d1f #633 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014 task: ffff88003d3a8040 task.stack: ffffc90000300000 RIP: 0010:ext4_writepages+0xc89/0xf8a RSP: 0018:ffffc90000303ca0 EFLAGS: 00010283 RAX: 0000028410000000 RBX: ffff8800383fa3b0 RCX: ffffffff812afcdc RDX: 00000a9d00000246 RSI: ffffffff81e660e0 RDI: 0000000000000246 RBP: ffffc90000303dc0 R08: 0000000000000002 R09: 869618e8f99b4fa5 R10: 00000000852287a2 R11: 00000000a03b49f4 R12: ffff88003808e698 R13: 0000000000000000 R14: 7fffffffffffffff R15: 7fffffffffffffff FS: 00007fd3e53094c0(0000) GS:ffff88003e400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd3e4c51000 CR3: 000000003d554000 CR4: 00000000003406e0 Call Trace: ? _raw_spin_unlock+0x27/0x2a ? kvm_clock_read+0x1e/0x20 do_writepages+0x23/0x2c ? do_writepages+0x23/0x2c __filemap_fdatawrite_range+0x80/0x87 filemap_write_and_wait_range+0x67/0x8c ext4_sync_file+0x20e/0x472 vfs_fsync_range+0x8e/0x9f ? syscall_trace_enter+0x25b/0x2d0 vfs_fsync+0x1c/0x1e do_fsync+0x31/0x4a SyS_fsync+0x10/0x14 do_syscall_64+0x69/0x131 entry_SYSCALL64_slow_path+0x25/0x25 We could try to be smart and keep the inline data in this case, or at least support delayed allocation when allocating the block, but these solutions would be more complicated and don't seem worthwhile given how rare this case seems to be. So just fix the bug by calling ext4_convert_inline_data() when we're asked to make a page writable, so that any inline data gets evicted, with the block allocated immediately. Reported-by: Nick Alcock <nick.alcock@oracle.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Jan Kara	7621cb7966	jbd2: fix dbench4 performance regression for 'nobarrier' mounts commit `5052b069ac` upstream. Commit `b685d3d65a` "block: treat REQ_FUA and REQ_PREFLUSH as synchronous" removed REQ_SYNC flag from WRITE_FUA implementation. Since JBD2 strips REQ_FUA and REQ_FLUSH flags from submitted IO when the filesystem is mounted with nobarrier mount option, journal superblock writes ended up being async writes after this patch and that caused heavy performance regression for dbench4 benchmark with high number of processes. In my test setup with HP RAID array with non-volatile write cache and 32 GB ram, dbench4 runs with 8 processes regressed by ~25%. Fix the problem by making sure journal superblock writes are always treated as synchronous since they generally block progress of the journalling machinery and thus the whole filesystem. Fixes: `b685d3d65a` Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Christian Borntraeger	7a17cbb4b8	perf annotate s390: Implement jump types for perf annotate commit `d9f8dfa9ba` upstream. Implement simple detection for all kind of jumps and branches. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Andreas Krebbel <krebbel@linux.vnet.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-s390 <linux-s390@vger.kernel.org> Link: http://lkml.kernel.org/r/1491465112-45819-3-git-send-email-borntraeger@de.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Christian Borntraeger	b09eace76e	perf annotate s390: Fix perf annotate error -95 (4.10 regression) commit `e77852b32d` upstream. since 4.10 perf annotate exits on s390 with an "unknown error -95". Turns out that commit `786c1b5184` ("perf annotate: Start supporting cross arch annotation") added a hard requirement for architecture support when objdump is used but only provided x86 and arm support. Meanwhile power was added so lets add s390 as well. While at it make sure to implement the branch and jump types. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Andreas Krebbel <krebbel@linux.vnet.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-s390 <linux-s390@vger.kernel.org> Fixes: `786c1b5184` "perf annotate: Start supporting cross arch annotation" Link: http://lkml.kernel.org/r/1491465112-45819-2-git-send-email-borntraeger@de.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Adrian Hunter	08b0b5bd54	perf auxtrace: Fix no_size logic in addr_filter__resolve_kernel_syms() commit `c3a0bbc7ad` upstream. Address filtering with kernel symbols incorrectly resulted in the error "Cannot determine size of symbol" because the no_size logic was the wrong way around. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1490357752-27942-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Mike Marciniszyn	2d58452241	IB/hfi1: Prevent kernel QP post send hard lockups commit `b6eac931b9` upstream. The driver progress routines can call cond_resched() when a timeslice is exhausted and irqs are enabled. If the ULP had been holding a spin lock without disabling irqs and the post send directly called the progress routine, the cond_resched() could yield allowing another thread from the same ULP to deadlock on that same lock. Correct by replacing the current hfi1_do_send() calldown with a unique one for post send and adding an argument to hfi1_do_send() to indicate that the send engine is running in a thread. If the routine is not running in a thread, avoid calling cond_resched(). Fixes: Commit `831464ce4b` ("IB/hfi1: Don't call cond_resched in atomic mode when sending packets") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Jack Morgenstein	7a48c89a3c	IB/mlx4: Reduce SRIOV multicast cleanup warning message to debug level commit `fb7a91746a` upstream. A warning message during SRIOV multicast cleanup should have actually been a debug level message. The condition generating the warning does no harm and can fill the message log. In some cases, during testing, some tests were so intense as to swamp the message log with these warning messages, causing a stall in the console message log output task. This stall caused an NMI to be sent to all CPUs (so that they all dumped their stacks into the message log). Aside from the message flood causing an NMI, the tests all passed. Once the message flood which caused the NMI is removed (by reducing the warning message to debug level), the NMI no longer occurs. Sample message log (console log) output illustrating the flood and resultant NMI (snippets with comments and modified with ... instead of hex digits, to satisfy checkpatch.pl): <mlx4_ib> _mlx4_ib_mcg_port_cleanup: ... WARNING: group refcount 1!!!... * About 4000 almost identical lines in less than one second * <mlx4_ib> _mlx4_ib_mcg_port_cleanup: ... WARNING: group refcount 1!!!... INFO: rcu_sched detected stalls on CPUs/tasks: { 17} (...) * { 17} above indicates that CPU 17 was the one that stalled * sending NMI to all CPUs: ... NMI backtrace for cpu 17 CPU: 17 PID: 45909 Comm: kworker/17:2 Hardware name: HP ProLiant DL360p Gen8, BIOS P71 09/08/2013 Workqueue: events fb_flashcursor task: ffff880478...... ti: ffff88064e...... task.ti: ffff88064e...... RIP: 0010:[ffffffff81......] [ffffffff81......] io_serial_in+0x15/0x20 RSP: 0018:ffff88064e257cb0 EFLAGS: 00000002 RAX: 0000000000...... RBX: ffffffff81...... RCX: 0000000000...... RDX: 0000000000...... RSI: 0000000000...... RDI: ffffffff81...... RBP: ffff88064e...... R08: ffffffff81...... R09: 0000000000...... R10: 0000000000...... R11: ffff88064e...... R12: 0000000000...... R13: 0000000000...... R14: ffffffff81...... R15: 0000000000...... FS: 0000000000......(0000) GS:ffff8804af......(0000) knlGS:000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080...... CR2: 00007f2a2f...... CR3: 0000000001...... CR4: 0000000000...... DR0: 0000000000...... DR1: 0000000000...... DR2: 0000000000...... DR3: 0000000000...... DR6: 00000000ff...... DR7: 0000000000...... Stack: ffff88064e...... ffffffff81...... ffffffff81...... 0000000000...... ffffffff81...... ffff88064e...... ffffffff81...... ffffffff81...... ffffffff81...... ffff88064e...... ffffffff81...... 0000000000...... Call Trace: [<ffffffff813d099b>] wait_for_xmitr+0x3b/0xa0 [<ffffffff813d0b5c>] serial8250_console_putchar+0x1c/0x30 [<ffffffff813d0b40>] ? serial8250_console_write+0x140/0x140 [<ffffffff813cb5fa>] uart_console_write+0x3a/0x80 [<ffffffff813d0aae>] serial8250_console_write+0xae/0x140 [<ffffffff8107c4d1>] call_console_drivers.constprop.15+0x91/0xf0 [<ffffffff8107d6cf>] console_unlock+0x3bf/0x400 [<ffffffff813503cd>] fb_flashcursor+0x5d/0x140 [<ffffffff81355c30>] ? bit_clear+0x120/0x120 [<ffffffff8109d5fb>] process_one_work+0x17b/0x470 [<ffffffff8109e3cb>] worker_thread+0x11b/0x400 [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400 [<ffffffff810a5aef>] kthread+0xcf/0xe0 [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140 [<ffffffff81645858>] ret_from_fork+0x58/0x90 [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140 Code: 48 89 e5 d3 e6 48 63 f6 48 03 77 10 8b 06 5d c3 66 0f 1f 44 00 00 66 66 66 6 As indicated in the stack trace above, the console output task got swamped. Fixes: `b9c5d6a643` ("IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Jack Morgenstein	295a7aab35	IB/mlx4: Fix ib device initialization error flow commit `99e68909d5` upstream. In mlx4_ib_add, procedure mlx4_ib_alloc_eqs is called to allocate EQs. However, in the mlx4_ib_add error flow, procedure mlx4_ib_free_eqs is not called to free the allocated EQs. Fixes: `e605b743f3` ("IB/mlx4: Increase the number of vectors (EQs) available for ULPs") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Shamir Rabinovitch	f98af463f8	IB/IPoIB: ibX: failed to create mcg debug file commit `771a525840` upstream. When udev renames the netdev devices, ipoib debugfs entries does not get renamed. As a result, if subsequent probe of ipoib device reuse the name then creating a debugfs entry for the new device would fail. Also, moved ipoib_create_debug_files and ipoib_delete_debug_files as part of ipoib event handling in order to avoid any race condition between these. Fixes: `1732b0ef3b` ([IPoIB] add path record information in debugfs) Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Michael J. Ruhl	88ded1f18b	IB/core: For multicast functions, verify that LIDs are multicast LIDs commit `8561eae60f` upstream. The Infiniband spec defines "A multicast address is defined by a MGID and a MLID" (section 10.5). Currently the MLID value is not validated. Add check to verify that the MLID value is in the correct address range. Fixes: `0c33aeedb2` ("[IB] Add checks to multicast attach and detach") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Parav Pandit	82f0feccee	IB/core: Fix kernel crash during fail to initialize device commit `4be3a4fa51` upstream. This patch fixes the kernel crash that occurs during ib_dealloc_device() called due to provider driver fails with an error after ib_alloc_device() and before it can register using ib_register_device(). This crashed seen in tha lab as below which can occur with any IB device which fails to perform its device initialization before invoking ib_register_device(). This patch avoids touching cache and port immutable structures if device is not yet initialized. It also releases related memory when cache and port immutable data structure initialization fails during register_device() state. [81416.561946] BUG: unable to handle kernel NULL pointer dereference at (null) [81416.570340] IP: ib_cache_release_one+0x29/0x80 [ib_core] [81416.576222] PGD 78da66067 [81416.576223] PUD 7f2d7c067 [81416.579484] PMD 0 [81416.582720] [81416.587242] Oops: 0000 [#1] SMP [81416.722395] task: ffff8807887515c0 task.stack: ffffc900062c0000 [81416.729148] RIP: 0010:ib_cache_release_one+0x29/0x80 [ib_core] [81416.735793] RSP: 0018:ffffc900062c3a90 EFLAGS: 00010202 [81416.741823] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [81416.749785] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffff880859fec000 [81416.757757] RBP: ffffc900062c3aa0 R08: ffff8808536e5ac0 R09: ffff880859fec5b0 [81416.765708] R10: 00000000536e5c01 R11: ffff8808536e5ac0 R12: ffff880859fec000 [81416.773672] R13: 0000000000000000 R14: ffff8808536e5ac0 R15: ffff88084ebc0060 [81416.781621] FS: 00007fd879fab740(0000) GS:ffff88085fac0000(0000) knlGS:0000000000000000 [81416.790522] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [81416.797094] CR2: 0000000000000000 CR3: 00000007eb215000 CR4: 00000000003406e0 [81416.805051] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [81416.812997] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [81416.820950] Call Trace: [81416.824226] ib_device_release+0x1e/0x40 [ib_core] [81416.829858] device_release+0x32/0xa0 [81416.834370] kobject_cleanup+0x63/0x170 [81416.839058] kobject_put+0x25/0x50 [81416.843319] ib_dealloc_device+0x25/0x40 [ib_core] [81416.848986] mlx5_ib_add+0x163/0x1990 [mlx5_ib] [81416.854414] mlx5_add_device+0x5a/0x160 [mlx5_core] [81416.860191] mlx5_register_interface+0x8d/0xc0 [mlx5_core] [81416.866587] ? 0xffffffffa09e9000 [81416.870816] mlx5_ib_init+0x15/0x17 [mlx5_ib] [81416.876094] do_one_initcall+0x51/0x1b0 [81416.880861] ? __vunmap+0x85/0xd0 [81416.885113] ? kmem_cache_alloc_trace+0x14b/0x1b0 [81416.890768] ? vfree+0x2e/0x70 [81416.894762] do_init_module+0x60/0x1fa [81416.899441] load_module+0x15f6/0x1af0 [81416.904114] ? __symbol_put+0x60/0x60 [81416.908709] ? ima_post_read_file+0x3d/0x80 [81416.913828] ? security_kernel_post_read_file+0x6b/0x80 [81416.920006] SYSC_finit_module+0xa6/0xf0 [81416.924888] SyS_finit_module+0xe/0x10 [81416.929568] entry_SYSCALL_64_fastpath+0x1a/0xa9 [81416.935089] RIP: 0033:0x7fd879494949 [81416.939543] RSP: 002b:00007ffdbc1b4e58 EFLAGS: 00000202 ORIG_RAX: 0000000000000139 [81416.947982] RAX: ffffffffffffffda RBX: 0000000001b66f00 RCX: 00007fd879494949 [81416.955965] RDX: 0000000000000000 RSI: 000000000041a13c RDI: 0000000000000003 [81416.963926] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000001b652a0 [81416.971861] R10: 0000000000000003 R11: 0000000000000202 R12: 00007ffdbc1b3e70 [81416.979763] R13: 00007ffdbc1b3e50 R14: 0000000000000005 R15: 0000000000000000 [81417.008005] RIP: ib_cache_release_one+0x29/0x80 [ib_core] RSP: ffffc900062c3a90 [81417.016045] CR2: 0000000000000000 Fixes: `55aeed0654` ("IB/core: Make ib_alloc_device init the kobject") Fixes: `7738613e7c` ("IB/core: Add per port immutable struct to ib_device") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Jack Morgenstein	5c7f1dfa7b	IB/core: Fix sysfs registration error flow commit `b312be3d87` upstream. The kernel commit cited below restructured ib device management so that the device kobject is initialized in ib_alloc_device. As part of the restructuring, the kobject is now initialized in procedure ib_alloc_device, and is later added to the device hierarchy in the ib_register_device call stack, in procedure ib_device_register_sysfs (which calls device_add). However, in the ib_device_register_sysfs error flow, if an error occurs following the call to device_add, the cleanup procedure device_unregister is called. This call results in the device object being deleted -- which results in various use-after-free crashes. The correct cleanup call is device_del -- which undoes device_add without deleting the device object. The device object will then (correctly) be deleted in the ib_register_device caller's error cleanup flow, when the caller invokes ib_dealloc_device. Fixes: `55aeed0654` ("IB/core: Make ib_alloc_device init the kobject") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Ding Tianhong	09944c2660	iov_iter: don't revert iov buffer if csum error commit `a6a5993243` upstream. The patch `3278682123` (make skb_copy_datagram_msg() et.al. preserve ->msg_iter on error) will revert the iov buffer if copy to iter failed, but it didn't copy any datagram if the skb_checksum_complete error, so no need to revert any data at this place. v2: Sabrina notice that return -EFAULT when checksum error is not correct here, it would confuse the caller about the return value, so fix it. Fixes: `3278682123` ("make skb_copy_datagram_msg() et.al. preserve->msg_iter on error") Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Alex Williamson	1e0b0a9ef0	vfio/type1: Remove locked page accounting workqueue commit `0cfef2b741` upstream. If the mmap_sem is contented then the vfio type1 IOMMU backend will defer locked page accounting updates to a workqueue task. This has a few problems and depending on which side the user tries to play, they might be over-penalized for unmaps that haven't yet been accounted or race the workqueue to enter more mappings than they're allowed. The original intent of this workqueue mechanism seems to be focused on reducing latency through the ioctl, but we cannot do so at the cost of correctness. Remove this workqueue mechanism and update the callers to allow for failure. We can also now recheck the limit under write lock to make sure we don't exceed it. vfio_pin_pages_remote() also now necessarily includes an unwind path which we can jump to directly if the consecutive page pinning finds that we're exceeding the user's memory limits. This avoids the current lazy approach which does accounting and mapping up to the fault, only to return an error on the next iteration to unwind the entire vfio_dma. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Dennis Yang	34224e0e1c	dm thin: fix a memory leak when passing discard bio down commit `948f581a53` upstream. dm-thin does not free the discard_parent bio after all chained sub bios finished. The following kmemleak report could be observed after pool with discard_passdown option processes discard bios in linux v4.11-rc7. To fix this, we drop the discard_parent bio reference when its endio (passdown_endio) called. unreferenced object 0xffff8803d6b29700 (size 256): comm "kworker/u8:0", pid 30349, jiffies 4379504020 (age 143002.776s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 f0 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81a5efd9>] kmemleak_alloc+0x49/0xa0 [<ffffffff8114ec34>] kmem_cache_alloc+0xb4/0x100 [<ffffffff8110eec0>] mempool_alloc_slab+0x10/0x20 [<ffffffff8110efa5>] mempool_alloc+0x55/0x150 [<ffffffff81374939>] bio_alloc_bioset+0xb9/0x260 [<ffffffffa018fd20>] process_prepared_discard_passdown_pt1+0x40/0x1c0 [dm_thin_pool] [<ffffffffa018b409>] break_up_discard_bio+0x1a9/0x200 [dm_thin_pool] [<ffffffffa018b484>] process_discard_cell_passdown+0x24/0x40 [dm_thin_pool] [<ffffffffa018b24d>] process_discard_bio+0xdd/0xf0 [dm_thin_pool] [<ffffffffa018ecf6>] do_worker+0xa76/0xd50 [dm_thin_pool] [<ffffffff81086239>] process_one_work+0x139/0x370 [<ffffffff810867b1>] worker_thread+0x61/0x450 [<ffffffff8108b316>] kthread+0xd6/0xf0 [<ffffffff81a6cd1f>] ret_from_fork+0x3f/0x70 [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Dennis Yang <dennisyang@qnap.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Bart Van Assche	899750e7d9	dm rq: check blk_mq_register_dev() return value in dm_mq_init_request_queue() commit `23a6012489` upstream. Otherwise the request-based DM blk-mq request_queue will be put into service without being properly exported via sysfs. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Somasundaram Krishnasamy	ee68aa113f	dm era: save spacemap metadata root after the pre-commit commit `117aceb030` upstream. When committing era metadata to disk, it doesn't always save the latest spacemap metadata root in superblock. Due to this, metadata is getting corrupted sometimes when reopening the device. The correct order of update should be, pre-commit (shadows spacemap root), save the spacemap root (newly shadowed block) to in-core superblock and then the final commit. Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Ondrej Kozina	f38faf569e	dm crypt: rewrite (wipe) key in crypto layer using random data commit `c82feeec9a` upstream. The message "key wipe" used to wipe real key stored in crypto layer by rewriting it with zeroes. Since commit `28856a9` ("crypto: xts - consolidate sanity check for keys") this no longer works in FIPS mode for XTS. While running in FIPS mode the crypto key part has to differ from the tweak key. Fixes: `28856a9` ("crypto: xts - consolidate sanity check for keys") Signed-off-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Gary R Hook	6ab60d2b26	crypto: ccp - Change ISR handler method for a v5 CCP commit `6263b51eb3` upstream. The CCP has the ability to perform several operations simultaneously, but only one interrupt. When implemented as a PCI device and using MSI-X/MSI interrupts, use a tasklet model to service interrupts. By disabling and enabling interrupts from the CCP, coupled with the queuing that tasklets provide, we can ensure that all events (occurring on the device) are recognized and serviced. This change fixes a problem wherein 2 or more busy queues can cause notification bits to change state while a (CCP) interrupt is being serviced, but after the queue state has been evaluated. This results in the event being 'lost' and the queue hanging, waiting to be serviced. Since the status bits are never fully de-asserted, the CCP never generates another interrupt (all bits zero -> one or more bits one), and no further CCP operations will be executed. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Gary R Hook	747cbf4753	crypto: ccp - Change ISR handler method for a v3 CCP commit `7b537b24e7` upstream. The CCP has the ability to perform several operations simultaneously, but only one interrupt. When implemented as a PCI device and using MSI-X/MSI interrupts, use a tasklet model to service interrupts. By disabling and enabling interrupts from the CCP, coupled with the queuing that tasklets provide, we can ensure that all events (occurring on the device) are recognized and serviced. This change fixes a problem wherein 2 or more busy queues can cause notification bits to change state while a (CCP) interrupt is being serviced, but after the queue state has been evaluated. This results in the event being 'lost' and the queue hanging, waiting to be serviced. Since the status bits are never fully de-asserted, the CCP never generates another interrupt (all bits zero -> one or more bits one), and no further CCP operations will be executed. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Gary R Hook	511b79fb94	crypto: ccp - Disable interrupts early on unload commit `116591fe3e` upstream. Ensure that we disable interrupts first when shutting down the driver. Signed-off-by: Gary R Hook <ghook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Gary R Hook	c133daf731	crypto: ccp - Use only the relevant interrupt bits commit `56467cb11c` upstream. Each CCP queue can product interrupts for 4 conditions: operation complete, queue empty, error, and queue stopped. This driver only works with completion and error events. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Stephan Mueller	b576fed7c8	crypto: algif_aead - Require setkey before accept(2) commit `2a2a251f11` upstream. Some cipher implementations will crash if you try to use them without calling setkey first. This patch adds a check so that the accept(2) call will fail with -ENOKEY if setkey hasn't been done on the socket yet. Fixes: `400c40cf78` ("crypto: algif - add AEAD support") Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Krzysztof Kozlowski	72a03cf8e3	crypto: s5p-sss - Close possible race for completed requests commit `42d5c176b7` upstream. Driver is capable of handling only one request at a time and it stores it in its state container struct s5p_aes_dev. This stored request must be protected between concurrent invocations (e.g. completing current request and scheduling new one). Combination of lock and "busy" field is used for that purpose. When "busy" field is true, the driver will not accept new request thus it will not overwrite currently handled data. However commit `28b62b1458` ("crypto: s5p-sss - Fix spinlock recursion on LRW(AES)") moved some of the write to "busy" field out of a lock protected critical section. This might lead to potential race between completing current request and scheduling a new one. Effectively the request completion might try to operate on new crypto request. Fixes: `28b62b1458` ("crypto: s5p-sss - Fix spinlock recursion on LRW(AES)") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Mike Snitzer	d9ae27b661	block: fix blk_integrity_register to use template's interval_exp if not 0 commit `2859323e35` upstream. When registering an integrity profile: if the template's interval_exp is not 0 use it, otherwise use the ilog2() of logical block size of the provided gendisk. This fixes a long-standing DM linear target bug where it cannot pass integrity data to the underlying device if its logical block size conflicts with the underlying device's logical block size. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Marc Zyngier	d978303275	arm64: KVM: Fix decoding of Rt/Rt2 when trapping AArch32 CP accesses commit `c667186f1c` upstream. Our 32bit CP14/15 handling inherited some of the ARMv7 code for handling the trapped system registers, completely missing the fact that the fields for Rt and Rt2 are now 5 bit wide, and not 4... Let's fix it, and provide an accessor for the most common Rt case. Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Andrew Jones	95121cc98c	KVM: arm/arm64: fix races in kvm_psci_vcpu_on commit `6c7a5dce22` upstream. Fix potential races in kvm_psci_vcpu_on() by taking the kvm->lock mutex. In general, it's a bad idea to allow more than one PSCI_CPU_ON to process the same target VCPU at the same time. One such problem that may arise is that one PSCI_CPU_ON could be resetting the target vcpu, which fills the entire sys_regs array with a temporary value including the MPIDR register, while another looks up the VCPU based on the MPIDR value, resulting in no target VCPU found. Resolves both races found with the kvm-unit-tests/arm/psci unit test. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Reported-by: Levente Kurusa <lkurusa@redhat.com> Suggested-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Paolo Bonzini	d1daa545bb	Revert "KVM: Support vCPU-based gfn->hva cache" commit `4e335d9e7d` upstream. This reverts commit `bbd6411513`. I've been sitting on this revert for too long and it unfortunately missed 4.11. It's also the reason why I haven't merged ring-based dirty tracking for 4.12. Using kvm_vcpu_memslots in kvm_gfn_to_hva_cache_init and kvm_vcpu_write_guest_offset_cached means that the MSR value can now be used to access SMRAM, simply by making it point to an SMRAM physical address. This is problematic because it lets the guest OS overwrite memory that it shouldn't be able to touch. Fixes: `bbd6411513` Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
David Hildenbrand	2d271c9508	KVM: x86: fix user triggerable warning in kvm_apic_accept_events() commit `28bf288879` upstream. If we already entered/are about to enter SMM, don't allow switching to INIT/SIPI_RECEIVED, otherwise the next call to kvm_apic_accept_events() will report a warning. Same applies if we are already in MP state INIT_RECEIVED and SMM is requested to be turned on. Refuse to set the VCPU events in this case. Fixes: `cd7764fe9f` ("KVM: x86: latch INITs while in system management mode") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Vince Weaver	d85d4c871a	perf/x86: Fix Broadwell-EP DRAM RAPL events commit `33b88e708e` upstream. It appears as though the Broadwell-EP DRAM units share the special units quirk with Haswell-EP/KNL. Without this patch, you get really high results (a single DRAM using 20W of power). The powercap driver in drivers/powercap/intel_rapl.c already has this change. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Richard Weinberger	9e74819678	um: Fix PTRACE_POKEUSER on x86_64 commit `9abc74a22d` upstream. This is broken since ever but sadly nobody noticed. Recent versions of GDB set DR_CONTROL unconditionally and UML dies due to a heap corruption. It turns out that the PTRACE_POKEUSER was copy&pasted from i386 and assumes that addresses are 4 bytes long. Fix that by using 8 as address size in the calculation. Reported-by: jie cao <cj3054@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Ben Hutchings	f25c69bd5e	x86, pmem: Fix cache flushing for iovec write < 8 bytes commit `8376efd31d` upstream. Commit `11e63f6d92` added cache flushing for unaligned writes from an iovec, covering the first and last cache line of a >= 8 byte write and the first cache line of a < 8 byte write. But an unaligned write of 2-7 bytes can still cover two cache lines, so make sure we flush both in that case. Fixes: `11e63f6d92` ("x86, pmem: fix broken __copy_user_nocache ...") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Andy Lutomirski	20fd61dbb1	selftests/x86/ldt_gdt_32: Work around a glibc sigaction() bug commit `65973dd3fd` upstream. i386 glibc is buggy and calls the sigaction syscall incorrectly. This is asymptomatic for normal programs, but it blows up on programs that do evil things with segmentation. The ldt_gdt self-test is an example of such an evil program. This doesn't appear to be a regression -- I think I just got lucky with the uninitialized memory that glibc threw at the kernel when I wrote the test. This hackish fix manually issues sigaction(2) syscalls to undo the damage. Without the fix, ldt_gdt_32 segfaults; with the fix, it passes for me. See: https://sourceware.org/bugzilla/show_bug.cgi?id=21269 Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/aaab0f9f93c9af25396f01232608c163a760a668.1490218061.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Ashish Kalra	0dc0e26fee	x86/boot: Fix BSS corruption/overwrite bug in early x86 kernel startup commit `d594aa0277` upstream. The minimum size for a new stack (512 bytes) setup for arch/x86/boot components when the bootloader does not setup/provide a stack for the early boot components is not "enough". The setup code executing as part of early kernel startup code, uses the stack beyond 512 bytes and accidentally overwrites and corrupts part of the BSS section. This is exposed mostly in the early video setup code, where it was corrupting BSS variables like force_x, force_y, which in-turn affected kernel parameters such as screen_info (screen_info.orig_video_cols) and later caused an exception/panic in console_init(). Most recent boot loaders setup the stack for early boot components, so this stack overwriting into BSS section issue has not been exposed. Signed-off-by: Ashish Kalra <ashish@bluestacks.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170419152015.10011-1-ashishkalra@Ashishs-MacBook-Pro.local Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Guenter Roeck	12001f3a45	usb: hub: Do not attempt to autosuspend disconnected devices commit `f5cccf4942` upstream. While running a bind/unbind stress test with the dwc3 usb driver on rk3399, the following crash was observed. Unable to handle kernel NULL pointer dereference at virtual address 00000218 pgd = ffffffc00165f000 [00000218] pgd=000000000174f003, pud=000000000174f003, pmd=0000000001750003, pte=00e8000001751713 Internal error: Oops: 96000005 [#1] PREEMPT SMP Modules linked in: uinput uvcvideo videobuf2_vmalloc cmac ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat rfcomm xt_mark fuse bridge stp llc zram btusb btrtl btbcm btintel bluetooth ip6table_filter mwifiex_pcie mwifiex cfg80211 cdc_ether usbnet r8152 mii joydev snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device ppp_async ppp_generic slhc tun CPU: 1 PID: 29814 Comm: kworker/1:1 Not tainted 4.4.52 #507 Hardware name: Google Kevin (DT) Workqueue: pm pm_runtime_work task: ffffffc0ac540000 ti: ffffffc0af4d4000 task.ti: ffffffc0af4d4000 PC is at autosuspend_check+0x74/0x174 LR is at autosuspend_check+0x70/0x174 ... Call trace: [<ffffffc00080dcc0>] autosuspend_check+0x74/0x174 [<ffffffc000810500>] usb_runtime_idle+0x20/0x40 [<ffffffc000785ae0>] __rpm_callback+0x48/0x7c [<ffffffc000786af0>] rpm_idle+0x1e8/0x498 [<ffffffc000787cdc>] pm_runtime_work+0x88/0xcc [<ffffffc000249bb8>] process_one_work+0x390/0x6b8 [<ffffffc00024abcc>] worker_thread+0x480/0x610 [<ffffffc000251a80>] kthread+0x164/0x178 [<ffffffc0002045d0>] ret_from_fork+0x10/0x40 Source: (gdb) l 0xffffffc00080dcc0 0xffffffc00080dcc0 is in autosuspend_check (drivers/usb/core/driver.c:1778). 1773 / We don't need to check interfaces that are 1774 * disabled for runtime PM. Either they are unbound 1775 * or else their drivers don't support autosuspend 1776 * and so they are permanently active. 1777 */ 1778 if (intf->dev.power.disable_depth) 1779 continue; 1780 if (atomic_read(&intf->dev.power.usage_count) > 0) 1781 return -EBUSY; 1782 w \|= intf->needs_remote_wakeup; Code analysis shows that intf is set to NULL in usb_disable_device() prior to setting actconfig to NULL. At the same time, usb_runtime_idle() does not lock the usb device, and neither does any of the functions in the traceback. This means that there is no protection against a race condition where usb_disable_device() is removing dev->actconfig->interface[] pointers while those are being accessed from autosuspend_check(). To solve the problem, synchronize and validate device state between autosuspend_check() and usb_disconnect(). Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Guenter Roeck	90d4e2a659	usb: hub: Fix error loop seen after hub communication errors commit `245b2eecee` upstream. While stress testing a usb controller using a bind/unbind looop, the following error loop was observed. usb 7-1.2: new low-speed USB device number 3 using xhci-hcd usb 7-1.2: hub failed to enable device, error -108 usb 7-1-port2: cannot disable (err = -22) usb 7-1-port2: couldn't allocate usb_device usb 7-1-port2: cannot disable (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) 57 printk messages dropped hub 7-1:1.0: activate --> -22 82 printk messages dropped hub 7-1:1.0: hub_ext_port_status failed (err = -22) This continues forever. After adding tracebacks into the code, the call sequence leading to this is found to be as follows. [<ffffffc0007fc8e0>] hub_activate+0x368/0x7b8 [<ffffffc0007fceb4>] hub_resume+0x2c/0x3c [<ffffffc00080b3b8>] usb_resume_interface.isra.6+0x128/0x158 [<ffffffc00080b5d0>] usb_suspend_both+0x1e8/0x288 [<ffffffc00080c9c4>] usb_runtime_suspend+0x3c/0x98 [<ffffffc0007820a0>] __rpm_callback+0x48/0x7c [<ffffffc00078217c>] rpm_callback+0xa8/0xd4 [<ffffffc000786234>] rpm_suspend+0x84/0x758 [<ffffffc000786ca4>] rpm_idle+0x2c8/0x498 [<ffffffc000786ed4>] __pm_runtime_idle+0x60/0xac [<ffffffc00080eba8>] usb_autopm_put_interface+0x6c/0x7c [<ffffffc000803798>] hub_event+0x10ac/0x12ac [<ffffffc000249bb8>] process_one_work+0x390/0x6b8 [<ffffffc00024abcc>] worker_thread+0x480/0x610 [<ffffffc000251a80>] kthread+0x164/0x178 [<ffffffc0002045d0>] ret_from_fork+0x10/0x40 kick_hub_wq() is called from hub_activate() even after failures to communicate with the hub. This results in an endless sequence of hub event -> hub activate -> wq trigger -> hub event -> ... Provide two solutions for the problem. - Only trigger the hub event queue if communication with the hub is successful. - After a suspend failure, only resume already suspended interfaces if the communication with the device is still possible. Each of the changes fixes the observed problem. Use both to improve robustness. Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Alexey Brodkin	8e3b0f66f4	usb: Make sure usb/phy/of gets built-in commit `3d6159640d` upstream. DWC3 driver uses of_usb_get_phy_mode() which is implemented in drivers/usb/phy/of.c and in bare minimal configuration it might not be pulled in kernel binary. In case of ARC or ARM this could be easily reproduced with "allnodefconfig" +CONFIG_USB=m +CONFIG_USB_DWC3=m. On building all ends-up with: ---------------------->8------------------ Kernel: arch/arm/boot/Image is ready Kernel: arch/arm/boot/zImage is ready Building modules, stage 2. MODPOST 5 modules ERROR: "of_usb_get_phy_mode" [drivers/usb/dwc3/dwc3.ko] undefined! make[1]: * [__modpost] Error 1 make: * [modules] Error 2 ---------------------->8------------------ Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Felipe Balbi <balbi@kernel.org> Cc: Felix Fietkau <nbd@nbd.name> Cc: Jeremy Kerr <jk@ozlabs.org> Cc: linux-snps-arc@lists.infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Romain Izard	6a9fc06acf	usb: gadget: legacy gadgets are optional commit `6e253d0fbc` upstream. With commit `bc49d1d17d` ("usb: gadget: don't couple configfs to legacy gadgets"),it is possible to build a modular kernel with both built-in configfs support and modular legacy gadget drivers. But when building a kernel without modules, it is also necessary to be able to build with configfs but without any legacy gadget driver. This was a possible configuration when the USB_CONFIGFS was a part of the choice options, but not anymore. Mark the choice for legacy gadget drivers as optional restores this. Fixes: `bc49d1d17d` ("usb: gadget: don't couple configfs to legacy gadgets") Signed-off-by: Romain Izard <romain.izard.pro@gmail.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Gustavo A. R. Silva	84759debb0	usb: misc: add missing continue in switch commit `2c930e3d0a` upstream. Add missing continue in switch. Addresses-Coverity-ID: 1248733 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Ian Abbott	69f8945b4f	staging: comedi: jr3_pci: cope with jiffies wraparound commit `8ec04a4918` upstream. The timer expiry routine `jr3_pci_poll_dev()` checks for expiry by checking whether the absolute value of `jiffies` (stored in local variable `now`) is greater than the expected expiry time in jiffy units. This will fail when `jiffies` wraps around. Also, it seems to make sense to handle the expiry one jiffy earlier than the current test. Use `time_after_eq()` to check for expiry. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Ian Abbott	6e4c973ac6	staging: comedi: jr3_pci: fix possible null pointer dereference commit `45292be0b3` upstream. For some reason, the driver does not consider allocation of the subdevice private data to be a fatal error when attaching the COMEDI device. It tests the subdevice private data pointer for validity at certain points, but omits some crucial tests. In particular, `jr3_pci_auto_attach()` calls `jr3_pci_alloc_spriv()` to allocate and initialize the subdevice private data, but the same function subsequently dereferences the pointer to access the `next_time_min` and `next_time_max` members without checking it first. The other missing test is in the timer expiry routine `jr3_pci_poll_dev()`, but it will crash before it gets that far. Fix the bug by returning `-ENOMEM` from `jr3_pci_auto_attach()` as soon as one of the calls to `jr3_pci_alloc_spriv()` returns `NULL`. The COMEDI core will subsequently call `jr3_pci_detach()` to clean up. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Sean Young	9918523578	staging: sir: fill in missing fields and fix probe commit `cf9ed9aa5b` upstream. Some fields are left blank. Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Aditya Shankar	89cb8fccea	staging: wilc1000: Fix problem with wrong vif index commit `0e490657c7` upstream. The vif->idx value is always 0 for two interfaces. wl->vif_num = 0; loop { ... vif->idx = wl->vif_num; ... wl->vif_num = i; .... i++; ... } At present, vif->idx is assigned the value of wl->vif_num at the beginning of this block and device is initialized based on this index value. In the next iteration, wl->vif_num is still 0 as it is only updated later but gets assigned to vif->idx in the beginning. This causes problems later when we try to reference a particular interface and also while configuring the firmware. This patch moves the assignment to vif->idx from the beginning of the block to after wl->vif_num is updated with latest value of i. Fixes: commit `735bb39ca3` ("staging: wilc1000: simplify vif[i]->ndev accesses") Signed-off-by: Aditya Shankar <aditya.shankar@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Johan Hovold	22d8767df9	staging: gdm724x: gdm_mux: fix use-after-free on module unload commit `b58f45c8fc` upstream. Make sure to deregister the USB driver before releasing the tty driver to avoid use-after-free in the USB disconnect callback where the tty devices are deregistered. Fixes: `61e1210476` ("staging: gdm7240: adding LTE USB driver") Cc: Won Kang <wkang77@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Malcolm Priestley	af685eefa2	staging: vt6656: use off stack for out buffer USB transfers. commit `12ecd24ef9` upstream. Since 4.9 mandated USB buffers be heap allocated this causes the driver to fail. Since there is a wide range of buffer sizes use kmemdup to create allocated buffer. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Malcolm Priestley	5ba4fd334f	staging: vt6656: use off stack for in buffer USB transfers. commit `05c0cf88be` upstream. Since 4.9 mandated USB buffers to be heap allocated. This causes the driver to fail. Create buffer for USB transfers. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Bjørn Mork	f49d36b7b3	USB: Revert "cdc-wdm: fix "out-of-sync" due to missing notifications" commit `1944581699` upstream. This reverts commit `833415a3e7` ("cdc-wdm: fix "out-of-sync" due to missing notifications") There have been several reports of wdm_read returning unexpected EIO errors with QMI devices using the qmi_wwan driver. The reporters confirm that reverting prevents these errors. I have been unable to reproduce the bug myself, and have no explanation to offer either. But reverting is the safe choice here, given that the commit was an attempt to work around a firmware problem. Living with a firmware problem is still better than adding driver bugs. Reported-by: Kasper Holtze <kasper@holtze.dk> Reported-by: Aleksander Morgado <aleksander@aleksander.es> Reported-by: Daniele Palmas <dnlplm@gmail.com> Fixes: `833415a3e7` ("cdc-wdm: fix "out-of-sync" due to missing notifications") Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Ajay Kaher	2c33341c96	USB: Proper handling of Race Condition when two USB class drivers try to call init_usb_class simultaneously commit `2f86a96be0` upstream. There is race condition when two USB class drivers try to call init_usb_class at the same time and leads to crash. code path: probe->usb_register_dev->init_usb_class To solve this, mutex locking has been added in init_usb_class() and destroy_usb_class(). As pointed by Alan, removed "if (usb_class)" test from destroy_usb_class() because usb_class can never be NULL there. Signed-off-by: Ajay Kaher <ajay.kaher@samsung.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Marek Vasut	2436236ee3	USB: serial: ftdi_sio: add device ID for Microsemi/Arrow SF2PLUS Dev Kit commit `31c5d1922b` upstream. This development kit has an FT4232 on it with a custom USB VID/PID. The FT4232 provides four UARTs, but only two are used. The UART 0 is used by the FlashPro5 programmer and UART 2 is connected to the SmartFusion2 CortexM3 SoC UART port. Note that the USB VID is registered to Actel according to Linux USB VID database, but that was acquired by Microsemi. Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Peter Chen	f34b103cd5	usb: host: xhci: print correct command ring address commit `6fc091fb04` upstream. Print correct command ring address using 'val_64'. Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Roger Quadros	c5379cf67f	usb: xhci: bInterval quirk for TI TUSB73x0 commit `69307ccb9a` upstream. As per [1] issue #4, "The periodic EP scheduler always tries to schedule the EPs that have large intervals (interval equal to or greater than 128 microframes) into different microframes. So it maintains an internal counter and increments for each large interval EP added. When the counter is greater than 128, the scheduler rejects the new EP. So when the hub re-enumerated 128 times, it triggers this condition." This results in Bandwidth error when devices with periodic endpoints (ISO/INT) having bInterval > 7 are plugged and unplugged several times on a TUSB73x0 XHCI host. Workaround this issue by limiting the bInterval to 7 (i.e. interval to 6) for High-speed or faster periodic endpoints. [1] - http://www.ti.com/lit/er/sllz076/sllz076.pdf Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Nicholas Bellinger	f9a45058a3	iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement commit `197b806ae5` upstream. While testing modification of per se_node_acl queue_depth forcing session reinstatement via lio_target_nacl_cmdsn_depth_store() -> core_tpg_set_initiator_node_queue_depth(), a hung task bug triggered when changing cmdsn_depth invoked session reinstatement while an iscsi login was already waiting for session reinstatement to complete. This can happen when an outstanding se_cmd descriptor is taking a long time to complete, and session reinstatement from iscsi login or cmdsn_depth change occurs concurrently. To address this bug, explicitly set session_fall_back_to_erl0 = 1 when forcing session reinstatement, so session reinstatement is not attempted if an active session is already being shutdown. This patch has been tested with two scenarios. The first when iscsi login is blocked waiting for iscsi session reinstatement to complete followed by queue_depth change via configfs, and second when queue_depth change via configfs us blocked followed by a iscsi login driven session reinstatement. Note this patch depends on commit `d36ad77f70` to handle multiple sessions per se_node_acl when changing cmdsn_depth, and for pre v4.5 kernels will need to be included for stable as well. Reported-by: Gary Guo <ghg@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Gary Guo <ghg@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Bart Van Assche	14a890020f	target/fileio: Fix zero-length READ and WRITE handling commit `59ac9c0781` upstream. This patch fixes zero-length READ and WRITE handling in target/FILEIO, which was broken a long time back by: Since: commit `d81cb44726` Author: Paolo Bonzini <pbonzini@redhat.com> Date: Mon Sep 17 16:36:11 2012 -0700 target: go through normal processing for all zero-length commands which moved zero-length READ and WRITE completion out of target-core, to doing submission into backend driver code. To address this, go ahead and invoke target_complete_cmd() for any non negative return value in fd_do_rw(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:39 +02:00
Nicholas Bellinger	25ed85889d	target: Fix compare_and_write_callback handling for non GOOD status commit `a71a5dc7f8` upstream. Following the bugfix for handling non SAM_STAT_GOOD COMPARE_AND_WRITE status during COMMIT phase in commit `9b2792c3da`, the same bug exists for the READ phase as well. This would manifest first as a lost SCSI response, and eventual hung task during fabric driver logout or re-login, as existing shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref to reach zero. To address this bug, compare_and_write_callback() has been changed to set post_ret = 1 and return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE as necessary to signal failure status. Reported-by: Bill Borsari <wgb@datera.io> Cc: Bill Borsari <wgb@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Gary Guo <ghg@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:39 +02:00
Juergen Gross	1ec6f0815a	xen: adjust early dom0 p2m handling to xen hypervisor behavior commit `69861e0a52` upstream. When booted as pv-guest the p2m list presented by the Xen is already mapped to virtual addresses. In dom0 case the hypervisor might make use of 2M- or 1G-pages for this mapping. Unfortunately while being properly aligned in virtual and machine address space, those pages might not be aligned properly in guest physical address space. So when trying to obtain the guest physical address of such a page pud_pfn() and pmd_pfn() must be avoided as those will mask away guest physical address bits not being zero in this special case. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:39 +02:00
Greg Kroah-Hartman	4c71e91a04	Linux 4.11.1	2017-05-14 14:06:16 +02:00
Ilya Dryomov	0873ab0224	block: get rid of blk_integrity_revalidate() commit `19b7ccf865` upstream. Commit `25520d55cd` ("block: Inline blk_integrity in struct gendisk") introduced blk_integrity_revalidate(), which seems to assume ownership of the stable pages flag and unilaterally clears it if no blk_integrity profile is registered: if (bi->profile) disk->queue->backing_dev_info->capabilities \|= BDI_CAP_STABLE_WRITES; else disk->queue->backing_dev_info->capabilities &= ~BDI_CAP_STABLE_WRITES; It's called from revalidate_disk() and rescan_partitions(), making it impossible to enable stable pages for drivers that support partitions and don't use blk_integrity: while the call in revalidate_disk() can be trivially worked around (see zram, which doesn't support partitions and hence gets away with zram_revalidate_disk()), rescan_partitions() can be triggered from userspace at any time. This breaks rbd, where the ceph messenger is responsible for generating/verifying CRCs. Since blk_integrity_{un,}register() "must" be used for (un)registering the integrity profile with the block layer, move BDI_CAP_STABLE_WRITES setting there. This way drivers that call blk_integrity_register() and use integrity infrastructure won't interfere with drivers that don't but still want stable pages. Fixes: `25520d55cd` ("block: Inline blk_integrity in struct gendisk") Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Snitzer <snitzer@redhat.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:03 +02:00
Boris Ostrovsky	3ed024e274	xen: Revert commits `da72ff5bfc` and `72a9b18629` commit `84d582d236` upstream. Recent discussion (http://marc.info/?l=xen-devel&m=149192184523741) established that commit `72a9b18629` ("xen: Remove event channel notification through Xen PCI platform device") (and thus commit `da72ff5bfc` ("partially revert "xen: Remove event channel notification through Xen PCI platform device"")) are unnecessary and, in fact, prevent HVM guests from booting on Xen releases prior to 4.0 Therefore we revert both of those commits. The summary of that discussion is below: Here is the brief summary of the current situation: Before the offending commit (`72a9b18629`): 1) INTx does not work because of the reset_watches path. 2) The reset_watches path is only taken if you have Xen > 4.0 3) The Linux Kernel by default will use vector inject if the hypervisor support. So even INTx does not work no body running the kernel with Xen > 4.0 would notice. Unless he explicitly disabled this feature either in the kernel or in Xen (and this can only be disabled by modifying the code, not user-supported way to do it). After the offending commit (+ partial revert): 1) INTx is no longer support for HVM (only for PV guests). 2) Any HVM guest The kernel will not boot on Xen < 4.0 which does not have vector injection support. Since the only other mode supported is INTx which. So based on this summary, I think before commit (`72a9b18629`) we were in much better position from a user point of view. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Julien Grall <julien.grall@arm.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: xen-devel@lists.xenproject.org Cc: linux-kernel@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: Anthony Liguori <aliguori@amazon.com> Cc: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:03 +02:00
Stefano Stabellini	5e25479ea9	xen/arm,arm64: fix xen_dma_ops after `815dd18` "Consolidate get_dma_ops..." commit `e058632670` upstream. The following commit: commit `815dd18788` Author: Bart Van Assche <bart.vanassche@sandisk.com> Date: Fri Jan 20 13:04:04 2017 -0800 treewide: Consolidate get_dma_ops() implementations rearranges get_dma_ops in a way that xen_dma_ops are not returned when running on Xen anymore, dev->dma_ops is returned instead (see arch/arm/include/asm/dma-mapping.h:get_arch_dma_ops and include/linux/dma-mapping.h:get_dma_ops). Fix the problem by storing dev->dma_ops in dev_archdata, and setting dev->dma_ops to xen_dma_ops. This way, xen_dma_ops is returned naturally by get_dma_ops. The Xen code can retrieve the original dev->dma_ops from dev_archdata when needed. It also allows us to remove __generic_dma_ops from common headers. Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> Tested-by: Julien Grall <julien.grall@arm.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> CC: linux@armlinux.org.uk CC: catalin.marinas@arm.com CC: will.deacon@arm.com CC: boris.ostrovsky@oracle.com CC: jgross@suse.com CC: Julien Grall <julien.grall@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Jin Qian	c7f765b5d6	f2fs: sanity check segment count commit `b9dd46188e` upstream. F2FS uses 4 bytes to represent block address. As a result, supported size of disk is 16 TB and it equals to 16 * 1024 * 1024 / 2 segments. Signed-off-by: Jin Qian <jinqian@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Jon Mason	275e2edbb8	net: mdio-mux: bcm-iproc: call mdiobus_free() in error path [ Upstream commit `922c60e89d` ] If an error is encountered in mdio_mux_init(), the error path will call mdiobus_free(). Since mdiobus_register() has been called prior to mdio_mux_init(), the bus->state will not be MDIOBUS_UNREGISTERED. This causes a BUG_ON() in mdiobus_free(). To correct this issue, add an error path for mdio_mux_init() which calls mdiobus_unregister() prior to mdiobus_free(). Signed-off-by: Jon Mason <jon.mason@broadcom.com> Fixes: `98bc865a1e` ("net: mdio-mux: Add MDIO mux driver for iProc SoCs") Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Daniel Borkmann	ced12308e5	bpf: don't let ldimm64 leak map addresses on unprivileged [ Upstream commit `0d0e57697f` ] The patch fixes two things at once: 1) It checks the env->allow_ptr_leaks and only prints the map address to the log if we have the privileges to do so, otherwise it just dumps 0 as we would when kptr_restrict is enabled on %pK. Given the latter is off by default and not every distro sets it, I don't want to rely on this, hence the 0 by default for unprivileged. 2) Printing of ldimm64 in the verifier log is currently broken in that we don't print the full immediate, but only the 32 bit part of the first insn part for ldimm64. Thus, fix this up as well; it's okay to access, since we verified all ldimm64 earlier already (including just constants) through replace_map_fd_with_map_ptr(). Fixes: `1be7f75d16` ("bpf: enable non-root eBPF programs") Fixes: `cbd3570086` ("bpf: verifier (add ability to receive verification log)") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Dan Carpenter	7a0a483ec5	bnxt_en: allocate enough space for ->ntp_fltr_bmap [ Upstream commit `ac45bd93a5` ] We have the number of longs, but we need to calculate the number of bytes required. Fixes: `c0c050c58d` ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Eric Dumazet	f8e3892f9f	tcp: randomize timestamps on syncookies [ Upstream commit `84b114b984` ] Whole point of randomization was to hide server uptime, but an attacker can simply start a syn flood and TCP generates 'old style' timestamps, directly revealing server jiffies value. Also, TSval sent by the server to a particular remote address vary depending on syncookies being sent or not, potentially triggering PAWS drops for innocent clients. Lets implement proper randomization, including for SYNcookies. Also we do not need to export sysctl_tcp_timestamps, since it is not used from a module. In v2, I added Florian feedback and contribution, adding tsoff to tcp_get_cookie_sock(). v3 removed one unused variable in tcp_v4_connect() as Florian spotted. Fixes: `95a22caee3` ("tcp: randomize tcp timestamp offsets for each connection") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Florian Westphal <fw@strlen.de> Tested-by: Florian Westphal <fw@strlen.de> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
WANG Cong	3960afa2e0	ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf [ Upstream commit `242d3a49a2` ] For each netns (except init_net), we initialize its null entry in 3 places: 1) The template itself, as we use kmemdup() 2) Code around dst_init_metrics() in ip6_route_net_init() 3) ip6_route_dev_notify(), which is supposed to initialize it after loopback registers Unfortunately the last one still happens in a wrong order because we expect to initialize net->ipv6.ip6_null_entry->rt6i_idev to net->loopback_dev's idev, thus we have to do that after we add idev to loopback. However, this notifier has priority == 0 same as ipv6_dev_notf, and ipv6_dev_notf is registered after ip6_route_dev_notifier so it is called actually after ip6_route_dev_notifier. This is similar to commit `2f460933f5` ("ipv6: initialize route null entry in addrconf_init()") which fixes init_net. Fix it by picking a smaller priority for ip6_route_dev_notifier. Also, we have to release the refcnt accordingly when unregistering loopback_dev because device exit functions are called before subsys exit functions. Acked-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
WANG Cong	5ee83127ac	ipv6: initialize route null entry in addrconf_init() [ Upstream commit `2f460933f5` ] Andrey reported a crash on init_net.ipv6.ip6_null_entry->rt6i_idev since it is always NULL. This is clearly wrong, we have code to initialize it to loopback_dev, unfortunately the order is still not correct. loopback_dev is registered very early during boot, we lose a chance to re-initialize it in notifier. addrconf_init() is called after ip6_route_init(), which means we have no chance to correct it. Fix it by moving this initialization explicitly after ipv6_add_dev(init_net.loopback_dev) in addrconf_init(). Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Michal Schmidt	0913e57331	rtnetlink: NUL-terminate IFLA_PHYS_PORT_NAME string [ Upstream commit `77ef033b68` ] IFLA_PHYS_PORT_NAME is a string attribute, so terminate it with \0. Otherwise libnl3 fails to validate netlink messages with this attribute. "ip -detail a" assumes too that the attribute is NUL-terminated when printing it. It often was, due to padding. I noticed this as libvirtd failing to start on a system with sfc driver after upgrading it to Linux 4.11, i.e. when sfc added support for phys_port_name. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Alexander Potapenko	165a007e51	ipv4, ipv6: ensure raw socket message is big enough to hold an IP header [ Upstream commit `86f4c90a1c` ] raw_send_hdrinc() and rawv6_send_hdrinc() expect that the buffer copied from the userspace contains the IPv4/IPv6 header, so if too few bytes are copied, parts of the header may remain uninitialized. This bug has been detected with KMSAN. For the record, the KMSAN report: ================================================================== BUG: KMSAN: use of unitialized memory in nf_ct_frag6_gather+0xf5a/0x44a0 inter: 0 CPU: 0 PID: 1036 Comm: probe Not tainted 4.11.0-rc5+ #2455 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x143/0x1b0 lib/dump_stack.c:52 kmsan_report+0x16b/0x1e0 mm/kmsan/kmsan.c:1078 __kmsan_warning_32+0x5c/0xa0 mm/kmsan/kmsan_instr.c:510 nf_ct_frag6_gather+0xf5a/0x44a0 net/ipv6/netfilter/nf_conntrack_reasm.c:577 ipv6_defrag+0x1d9/0x280 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 nf_hook_entry_hookfn ./include/linux/netfilter.h:102 nf_hook_slow+0x13f/0x3c0 net/netfilter/core.c:310 nf_hook ./include/linux/netfilter.h:212 NF_HOOK ./include/linux/netfilter.h:255 rawv6_send_hdrinc net/ipv6/raw.c:673 rawv6_sendmsg+0x2fcb/0x41a0 net/ipv6/raw.c:919 inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696 SyS_sendto+0xbc/0xe0 net/socket.c:1664 do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285 entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246 RIP: 0033:0x436e03 RSP: 002b:00007ffce48baf38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000436e03 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007ffce48baf90 R08: 00007ffce48baf50 R09: 000000000000001c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000401790 R14: 0000000000401820 R15: 0000000000000000 origin: 00000000d9400053 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:362 kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:257 kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:270 slab_alloc_node mm/slub.c:2735 __kmalloc_node_track_caller+0x1f4/0x390 mm/slub.c:4341 __kmalloc_reserve net/core/skbuff.c:138 __alloc_skb+0x2cd/0x740 net/core/skbuff.c:231 alloc_skb ./include/linux/skbuff.h:933 alloc_skb_with_frags+0x209/0xbc0 net/core/skbuff.c:4678 sock_alloc_send_pskb+0x9ff/0xe00 net/core/sock.c:1903 sock_alloc_send_skb+0xe4/0x100 net/core/sock.c:1920 rawv6_send_hdrinc net/ipv6/raw.c:638 rawv6_sendmsg+0x2918/0x41a0 net/ipv6/raw.c:919 inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696 SyS_sendto+0xbc/0xe0 net/socket.c:1664 do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285 return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246 ================================================================== , triggered by the following syscalls: socket(PF_INET6, SOCK_RAW, IPPROTO_RAW) = 3 sendto(3, NULL, 0, 0, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "ff00::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EPERM A similar report is triggered in net/ipv4/raw.c if we use a PF_INET socket instead of a PF_INET6 one. Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Eric Dumazet	86a9a0884d	tcp: do not inherit fastopen_req from parent [ Upstream commit `8b485ce698` ] Under fuzzer stress, it is possible that a child gets a non NULL fastopen_req pointer from its parent at accept() time, when/if parent morphs from listener to active session. We need to make sure this can not happen, by clearing the field after socket cloning. BUG: Double free or freeing an invalid pointer Unexpected shadow byte: 0xFB CPU: 3 PID: 20933 Comm: syz-executor3 Not tainted 4.11.0+ #306 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x292/0x395 lib/dump_stack.c:52 kasan_object_err+0x1c/0x70 mm/kasan/report.c:164 kasan_report_double_free+0x5c/0x70 mm/kasan/report.c:185 kasan_slab_free+0x9d/0xc0 mm/kasan/kasan.c:580 slab_free_hook mm/slub.c:1357 [inline] slab_free_freelist_hook mm/slub.c:1379 [inline] slab_free mm/slub.c:2961 [inline] kfree+0xe8/0x2b0 mm/slub.c:3882 tcp_free_fastopen_req net/ipv4/tcp.c:1077 [inline] tcp_disconnect+0xc15/0x13e0 net/ipv4/tcp.c:2328 inet_child_forget+0xb8/0x600 net/ipv4/inet_connection_sock.c:898 inet_csk_reqsk_queue_add+0x1e7/0x250 net/ipv4/inet_connection_sock.c:928 tcp_get_cookie_sock+0x21a/0x510 net/ipv4/syncookies.c:217 cookie_v4_check+0x1a19/0x28b0 net/ipv4/syncookies.c:384 tcp_v4_cookie_check net/ipv4/tcp_ipv4.c:1384 [inline] tcp_v4_do_rcv+0x731/0x940 net/ipv4/tcp_ipv4.c:1421 tcp_v4_rcv+0x2dc0/0x31c0 net/ipv4/tcp_ipv4.c:1715 ip_local_deliver_finish+0x4cc/0xc20 net/ipv4/ip_input.c:216 NF_HOOK include/linux/netfilter.h:257 [inline] ip_local_deliver+0x1ce/0x700 net/ipv4/ip_input.c:257 dst_input include/net/dst.h:492 [inline] ip_rcv_finish+0xb1d/0x20b0 net/ipv4/ip_input.c:396 NF_HOOK include/linux/netfilter.h:257 [inline] ip_rcv+0xd8c/0x19c0 net/ipv4/ip_input.c:487 __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4210 __netif_receive_skb+0x2a/0x1a0 net/core/dev.c:4248 process_backlog+0xe5/0x6c0 net/core/dev.c:4868 napi_poll net/core/dev.c:5270 [inline] net_rx_action+0xe70/0x18e0 net/core/dev.c:5335 __do_softirq+0x2fb/0xb99 kernel/softirq.c:284 do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:899 </IRQ> do_softirq.part.17+0x1e8/0x230 kernel/softirq.c:328 do_softirq kernel/softirq.c:176 [inline] __local_bh_enable_ip+0x1cf/0x1e0 kernel/softirq.c:181 local_bh_enable include/linux/bottom_half.h:31 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:931 [inline] ip_finish_output2+0x9ab/0x15e0 net/ipv4/ip_output.c:230 ip_finish_output+0xa35/0xdf0 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:246 [inline] ip_output+0x1f6/0x7b0 net/ipv4/ip_output.c:404 dst_output include/net/dst.h:486 [inline] ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124 ip_queue_xmit+0x9a8/0x1a10 net/ipv4/ip_output.c:503 tcp_transmit_skb+0x1ade/0x3470 net/ipv4/tcp_output.c:1057 tcp_write_xmit+0x79e/0x55b0 net/ipv4/tcp_output.c:2265 __tcp_push_pending_frames+0xfa/0x3a0 net/ipv4/tcp_output.c:2450 tcp_push+0x4ee/0x780 net/ipv4/tcp.c:683 tcp_sendmsg+0x128d/0x39b0 net/ipv4/tcp.c:1342 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x660/0x810 net/socket.c:1696 SyS_sendto+0x40/0x50 net/socket.c:1664 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x446059 RSP: 002b:00007faa6761fb58 EFLAGS: 00000282 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 0000000000446059 RDX: 0000000000000001 RSI: 0000000020ba3fcd RDI: 0000000000000017 RBP: 00000000006e40a0 R08: 0000000020ba4ff0 R09: 0000000000000010 R10: 0000000020000000 R11: 0000000000000282 R12: 0000000000708150 R13: 0000000000000000 R14: 00007faa676209c0 R15: 00007faa67620700 Object at ffff88003b5bbcb8, in cache kmalloc-64 size: 64 Allocated: PID = 20909 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:513 set_track mm/kasan/kasan.c:525 [inline] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:616 kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745 kmalloc include/linux/slab.h:490 [inline] kzalloc include/linux/slab.h:663 [inline] tcp_sendmsg_fastopen net/ipv4/tcp.c:1094 [inline] tcp_sendmsg+0x221a/0x39b0 net/ipv4/tcp.c:1139 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x660/0x810 net/socket.c:1696 SyS_sendto+0x40/0x50 net/socket.c:1664 entry_SYSCALL_64_fastpath+0x1f/0xbe Freed: PID = 20909 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:513 set_track mm/kasan/kasan.c:525 [inline] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:589 slab_free_hook mm/slub.c:1357 [inline] slab_free_freelist_hook mm/slub.c:1379 [inline] slab_free mm/slub.c:2961 [inline] kfree+0xe8/0x2b0 mm/slub.c:3882 tcp_free_fastopen_req net/ipv4/tcp.c:1077 [inline] tcp_disconnect+0xc15/0x13e0 net/ipv4/tcp.c:2328 __inet_stream_connect+0x20c/0xf90 net/ipv4/af_inet.c:593 tcp_sendmsg_fastopen net/ipv4/tcp.c:1111 [inline] tcp_sendmsg+0x23a8/0x39b0 net/ipv4/tcp.c:1139 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x660/0x810 net/socket.c:1696 SyS_sendto+0x40/0x50 net/socket.c:1664 entry_SYSCALL_64_fastpath+0x1f/0xbe Fixes: `e994b2f0fb` ("tcp: do not lock listener to process SYN packets") Fixes: `7db92362d2` ("tcp: fix potential double free issue for fastopen_req") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Daniele Palmas	0b1d889cb0	net: usb: qmi_wwan: add Telit ME910 support [ Upstream commit `4c54dc0277` ] This patch adds support for Telit ME910 PID 0x1100. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
David Ahern	23e761f37b	net: ipv6: Do not duplicate DAD on link up [ Upstream commit `6d717134a1` ] Andrey reported a warning triggered by the rcu code: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 5911 at lib/debugobjects.c:289 debug_print_object+0x175/0x210 ODEBUG: activate active (active state 1) object type: rcu_head hint: (null) Modules linked in: CPU: 1 PID: 5911 Comm: a.out Not tainted 4.11.0-rc8+ #271 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x192/0x22d lib/dump_stack.c:52 __warn+0x19f/0x1e0 kernel/panic.c:549 warn_slowpath_fmt+0xe0/0x120 kernel/panic.c:564 debug_print_object+0x175/0x210 lib/debugobjects.c:286 debug_object_activate+0x574/0x7e0 lib/debugobjects.c:442 debug_rcu_head_queue kernel/rcu/rcu.h:75 __call_rcu.constprop.76+0xff/0x9c0 kernel/rcu/tree.c:3229 call_rcu_sched+0x12/0x20 kernel/rcu/tree.c:3288 rt6_rcu_free net/ipv6/ip6_fib.c:158 rt6_release+0x1ea/0x290 net/ipv6/ip6_fib.c:188 fib6_del_route net/ipv6/ip6_fib.c:1461 fib6_del+0xa42/0xdc0 net/ipv6/ip6_fib.c:1500 __ip6_del_rt+0x100/0x160 net/ipv6/route.c:2174 ip6_del_rt+0x140/0x1b0 net/ipv6/route.c:2187 __ipv6_ifa_notify+0x269/0x780 net/ipv6/addrconf.c:5520 addrconf_ifdown+0xe60/0x1a20 net/ipv6/addrconf.c:3672 ... Andrey's reproducer program runs in a very tight loop, calling 'unshare -n' and then spawning 2 sets of 14 threads running random ioctl calls. The relevant networking sequence: 1. New network namespace created via unshare -n - ip6tnl0 device is created in down state 2. address added to ip6tnl0 - equivalent to ip -6 addr add dev ip6tnl0 fd00::bb/1 - DAD is started on the address and when it completes the host route is inserted into the FIB 3. ip6tnl0 is brought up - the new fixup_permanent_addr function restarts DAD on the address 4. exit namespace - teardown / cleanup sequence starts - once in a blue moon, lo teardown appears to happen BEFORE teardown of ip6tunl0 + down on 'lo' removes the host route from the FIB since the dst->dev for the route is loobback + host route added to rcu callback list * rcu callback has not run yet, so rt is NOT on the gc list so it has NOT been marked obsolete 5. in parallel to 4. worker_thread runs addrconf_dad_completed - DAD on the address on ip6tnl0 completes - calls ipv6_ifa_notify which inserts the host route All of that happens very quickly. The result is that a host route that has been deleted from the IPv6 FIB and added to the RCU list is re-inserted into the FIB. The exit namespace eventually gets to cleaning up ip6tnl0 which removes the host route from the FIB again, calls the rcu function for cleanup -- and triggers the double rcu trace. The root cause is duplicate DAD on the address -- steps 2 and 3. Arguably, DAD should not be started in step 2. The interface is in the down state, so it can not really send out requests for the address which makes starting DAD pointless. Since the second DAD was introduced by a recent change, seems appropriate to use it for the Fixes tag and have the fixup function only start DAD for addresses in the PREDAD state which occurs in addrconf_ifdown if the address is retained. Big thanks to Andrey for isolating a reliable reproducer for this problem. Fixes: `f1705ec197` ("net: ipv6: Make address flushing on ifdown optional") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David Ahern <dsahern@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Eric Dumazet	a1baca92af	tcp: fix wraparound issue in tcp_lp [ Upstream commit `a9f11f963a` ] Be careful when comparing tcp_time_stamp to some u32 quantity, otherwise result can be surprising. Fixes: `7c106d7e78` ("[TCP]: TCP Low Priority congestion control") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Daniel Borkmann	f94cc32ccb	bpf, arm64: fix jit branch offset related to ldimm64 [ Upstream commit `ddc665a4bb` ] When the instruction right before the branch destination is a 64 bit load immediate, we currently calculate the wrong jump offset in the ctx->offset[] array as we only account one instruction slot for the 64 bit load immediate although it uses two BPF instructions. Fix it up by setting the offset into the right slot after we incremented the index. Before (ldimm64 test 1): [...] 00000020: 52800007 mov w7, #0x0 // #0 00000024: d2800060 mov x0, #0x3 // #3 00000028: d2800041 mov x1, #0x2 // #2 0000002c: eb01001f cmp x0, x1 00000030: 54ffff82 b.cs 0x00000020 00000034: d29fffe7 mov x7, #0xffff // #65535 00000038: f2bfffe7 movk x7, #0xffff, lsl #16 0000003c: f2dfffe7 movk x7, #0xffff, lsl #32 00000040: f2ffffe7 movk x7, #0xffff, lsl #48 00000044: d29dddc7 mov x7, #0xeeee // #61166 00000048: f2bdddc7 movk x7, #0xeeee, lsl #16 0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32 00000050: f2fdddc7 movk x7, #0xeeee, lsl #48 [...] After (ldimm64 test 1): [...] 00000020: 52800007 mov w7, #0x0 // #0 00000024: d2800060 mov x0, #0x3 // #3 00000028: d2800041 mov x1, #0x2 // #2 0000002c: eb01001f cmp x0, x1 00000030: 540000a2 b.cs 0x00000044 00000034: d29fffe7 mov x7, #0xffff // #65535 00000038: f2bfffe7 movk x7, #0xffff, lsl #16 0000003c: f2dfffe7 movk x7, #0xffff, lsl #32 00000040: f2ffffe7 movk x7, #0xffff, lsl #48 00000044: d29dddc7 mov x7, #0xeeee // #61166 00000048: f2bdddc7 movk x7, #0xeeee, lsl #16 0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32 00000050: f2fdddc7 movk x7, #0xeeee, lsl #48 [...] Also, add a couple of test cases to make sure JITs pass this test. Tested on Cavium ThunderX ARMv8. The added test cases all pass after the fix. Fixes: `8eee539dde` ("arm64: bpf: fix out-of-bounds read in bpf2a64_offset()") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Yonghong Song	34acfbf08c	bpf: enhance verifier to understand stack pointer arithmetic [ Upstream commit `332270fdc8` ] llvm 4.0 and above generates the code like below: .... 440: (b7) r1 = 15 441: (05) goto pc+73 515: (79) r6 = (u64 )(r10 -152) 516: (bf) r7 = r10 517: (07) r7 += -112 518: (bf) r2 = r7 519: (0f) r2 += r1 520: (71) r1 = (u8 )(r8 +0) 521: (73) (u8 )(r2 +45) = r1 .... and the verifier complains "R2 invalid mem access 'inv'" for insn #521. This is because verifier marks register r2 as unknown value after #519 where r2 is a stack pointer and r1 holds a constant value. Teach verifier to recognize "stack_ptr + imm" and "stack_ptr + reg with const val" as valid stack_ptr with new offset. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Girish Moodalbail	5a5b34164f	geneve: fix incorrect setting of UDP checksum flag [ Upstream commit `5e0740c445` ] Creating a geneve link with 'udpcsum' set results in a creation of link for which UDP checksum will NOT be computed on outbound packets, as can be seen below. 11: gen0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN link/ether c2:85:27:b6:b4:15 brd ff:ff:ff:ff:ff:ff promiscuity 0 geneve id 200 remote 192.168.13.1 dstport 6081 noudpcsum Similarly, creating a link with 'noudpcsum' set results in a creation of link for which UDP checksum will be computed on outbound packets. Fixes: `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Davide Caratti	011fbfb296	tcp: fix access to sk->sk_state in tcp_poll() [ Upstream commit `d68be71ea1` ] avoid direct access to sk->sk_state when tcp_poll() is called on a socket using active TCP fastopen with deferred connect. Use local variable 'state', which stores the result of sk_state_load(), like it was done in commit `00fd38d938` ("tcp: ensure proper barriers in lockless contexts"). Fixes: `19f6d3f3c8` ("net/tcp-fastopen: Add new API support") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Alexandre Belloni	c0e5b847c7	net: macb: fix phy interrupt parsing [ Upstream commit `ae3696c167` ] Since `83a77e9ec4`, the phydev irq is explicitly set to PHY_POLL when there is no pdata. It doesn't work on DT enabled platforms because the phydev irq is already set by libphy before. Fixes: `83a77e9ec4` ("net: macb: Added PCI wrapper for Platform Driver.") Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Greg Kroah-Hartman	acc7c954b0	refcount: change EXPORT_SYMBOL markings commit `d557d1b58b` upstream. Now that kref is using the refcount apis, the _GPL markings are getting exported to places that it previously wasn't. Now kref.h is GPLv2 licensed, so any non-GPL code using it better be talking to some lawyers, but changing api markings isn't considered "nice", so let's fix this up. Cc: Philip Müller <philm@manjaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-05-14 14:06:01 +02:00
Dave Aldridge	e2b8851f1f	sparc64: fix fault handling in NGbzero.S and GENbzero.S commit `3c7f622120` upstream. When any of the functions contained in NGbzero.S and GENbzero.S vector through *bzero_from_clear_user, we may end up taking a fault when executing one of the store alternate address space instructions. If this happens, the exception handler does not restore the %asi register. This commit fixes the issue by introducing a new exception handler that ensures the %asi register is restored when a fault is handled. Orabug: 25577560 Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Reviewed-by: Rob Gardner <rob.gardner@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
James Hughes	c98f47c1f6	brcmfmac: Make skb header writable before use commit `9cc4b7cb86` upstream. The driver was making changes to the skb_header without ensuring it was writable (i.e. uncloned). This patch also removes some boiler plate header size checking/adjustment code as that is also handled by the skb_cow_header function used to make header writable. Signed-off-by: James Hughes <james.hughes@raspberrypi.org> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:00 +02:00
James Hughes	7f073b7e4d	brcmfmac: Ensure pointer correctly set if skb data location changes commit `455a1eb465` upstream. The incoming skb header may be resized if header space is insufficient, which might change the data adddress in the skb. Ensure that a cached pointer to that data is correctly set by moving assignment to after any possible changes. Signed-off-by: James Hughes <james.hughes@raspberrypi.org> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:00 +02:00
Giedrius Statkevičius	b998524b6c	power: supply: lp8788: prevent out of bounds array access commit `bdd9968d35` upstream. val might become 7 in which case stime[7] (array of length 7) would be accessed during the scnprintf call later and that will cause issues. Obviously, string concatenation is not intended here so just a comma needs to be added to fix the issue. Fixes: `98a2766493` ("power_supply: Add new lp8788 charger driver") Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@gmail.com> Acked-by: Milo Kim <milo.kim@ti.com> Signed-off-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:00 +02:00
Vincent Abriou	648ada88cb	drm/sti: fix GDP size to support up to UHD resolution commit `2f410f88c0` upstream. On stih407-410 chip family the GDP layers are able to support up to UHD resolution (3840 x 2160). Signed-off-by: Vincent Abriou <vincent.abriou@st.com> Acked-by: Lee Jones <lee.jones@linaro.org> Tested-by: Lee Jones <lee.jones@linaro.org> Link: http://patchwork.freedesktop.org/patch/msgid/1490280292-30466-1-git-send-email-vincent.abriou@st.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:00 +02:00
Adrian Salido	40a1937317	dm ioctl: prevent stack leak in dm ioctl call commit `4617f564c0` upstream. When calling a dm ioctl that doesn't process any data (IOCTL_FLAGS_NO_PARAMS), the contents of the data field in struct dm_ioctl are left initialized. Current code is incorrectly extending the size of data copied back to user, causing the contents of kernel stack to be leaked to user. Fix by only copying contents before data and allow the functions processing the ioctl to override. Signed-off-by: Adrian Salido <salidoa@google.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:00 +02:00

657 changed files with 6341 additions and 3232 deletions

16

Documentation/acpi/acpi-lid.txt

View File

 @ -59,20 +59,28 @@ button driver uses the following 3 modes in order not to trigger issues.
 If the userspace hasn't been prepared to ignore the unreliable "opened"
 events and the unreliable initial state notification, Linux users can use
 the following kernel parameters to handle the possible issues:
 A. button.lid_init_state=open:
 A. button.lid_init_state=method:
    When this option is specified, the ACPI button driver reports the
    initial lid state using the returning value of the _LID control method
    and whether the "opened"/"closed" events are paired fully relies on the
    firmware implementation.
    This option can be used to fix some platforms where the returning value
    of the _LID control method is reliable but the initial lid state
    notification is missing.
    This option is the default behavior during the period the userspace
    isn't ready to handle the buggy AML tables.
 B. button.lid_init_state=open:
    When this option is specified, the ACPI button driver always reports the
    initial lid state as "opened" and whether the "opened"/"closed" events
    are paired fully relies on the firmware implementation.
    This may fix some platforms where the returning value of the _LID
    control method is not reliable and the initial lid state notification is
    missing.
    This option is the default behavior during the period the userspace
    isn't ready to handle the buggy AML tables.
 If the userspace has been prepared to ignore the unreliable "opened" events
 and the unreliable initial state notification, Linux users should always
 use the following kernel parameter:
 B. button.lid_init_state=ignore:
 C. button.lid_init_state=ignore:
    When this option is specified, the ACPI button driver never reports the
    initial lid state and there is a compensation mechanism implemented to
    ensure that the reliable "closed" notifications can always be delievered

62

Documentation/arm64/tagged-pointers.txt

View File

 @ -11,24 +11,56 @@ in AArch64 Linux.
 The kernel configures the translation tables so that translations made
 via TTBR0 (i.e. userspace mappings) have the top byte (bits 63:56) of
 the virtual address ignored by the translation hardware. This frees up
 this byte for application use, with the following caveats:
 this byte for application use.
 	(1) The kernel requires that all user addresses passed to EL1
 	    are tagged with tag 0x00. This means that any syscall
 	    parameters containing user virtual addresses *must* have
 	    their top byte cleared before trapping to the kernel.
 	(2) Non-zero tags are not preserved when delivering signals.
 	    This means that signal handlers in applications making use
 	    of tags cannot rely on the tag information for user virtual
 	    addresses being maintained for fields inside siginfo_t.
 	    One exception to this rule is for signals raised in response
 	    to watchpoint debug exceptions, where the tag information
 	    will be preserved.
 Passing tagged addresses to the kernel
 --------------------------------------
 	(3) Special care should be taken when using tagged pointers,
 	    since it is likely that C compilers will not hazard two
 	    virtual addresses differing only in the upper byte.
 All interpretation of userspace memory addresses by the kernel assumes
 an address tag of 0x00.
 This includes, but is not limited to, addresses found in:
  - pointer arguments to system calls, including pointers in structures
    passed to system calls,
  - the stack pointer (sp), e.g. when interpreting it to deliver a
    signal,
  - the frame pointer (x29) and frame records, e.g. when interpreting
    them to generate a backtrace or call graph.
 Using non-zero address tags in any of these locations may result in an
 error code being returned, a (fatal) signal being raised, or other modes
 of failure.
 For these reasons, passing non-zero address tags to the kernel via
 system calls is forbidden, and using a non-zero address tag for sp is
 strongly discouraged.
 Programs maintaining a frame pointer and frame records that use non-zero
 address tags may suffer impaired or inaccurate debug and profiling
 visibility.
 Preserving tags
 ---------------
 Non-zero tags are not preserved when delivering signals. This means that
 signal handlers in applications making use of tags cannot rely on the
 tag information for user virtual addresses being maintained for fields
 inside siginfo_t. One exception to this rule is for signals raised in
 response to watchpoint debug exceptions, where the tag information will
 be preserved.
 The architecture prevents the use of a tagged PC, so the upper byte will
 be set to a sign-extension of bit 55 on exception return.
 Other considerations
 --------------------
 Special care should be taken when using tagged pointers, since it is
 likely that C compilers will not hazard two virtual addresses differing
 only in the upper byte.

									
										2

Makefile
									
												View File
												
				@ -1,6 +1,6 @@

				VERSION = 4

				PATCHLEVEL = 11

				SUBLEVEL = 0

				SUBLEVEL = 6

				EXTRAVERSION =

				NAME = Fearless Coyote

									
										6

arch/alpha/kernel/osf_sys.c
									
												View File
												
				@ -1199,8 +1199,10 @@ SYSCALL_DEFINE4(osf_wait4, pid_t, pid, int __user *, ustatus, int, options,

					if (!access_ok(VERIFY_WRITE, ur, sizeof(*ur)))

						return -EFAULT;

					err = 0;

					err |= put_user(status, ustatus);

					err = put_user(status, ustatus);

					if (ret < 0)

						return err ? err : ret;

					err |= __put_user(r.ru_utime.tv_sec, &ur->ru_utime.tv_sec);

					err |= __put_user(r.ru_utime.tv_usec, &ur->ru_utime.tv_usec);

					err |= __put_user(r.ru_stime.tv_sec, &ur->ru_stime.tv_sec);

5

arch/arm/boot/dts/at91-sama5d3_xplained.dts

View File

 @ -162,9 +162,10 @@
 			};
 			adc0: adc@f8018000 {
 				atmel,adc-vref = <3300>;
 				atmel,adc-channels-used = <0xfe>;
 				pinctrl-0 = <
 					&pinctrl_adc0_adtrg
 					&pinctrl_adc0_ad0
 					&pinctrl_adc0_ad1
 					&pinctrl_adc0_ad2
 					&pinctrl_adc0_ad3
 @ -172,8 +173,6 @@
 					&pinctrl_adc0_ad5
 					&pinctrl_adc0_ad6
 					&pinctrl_adc0_ad7
 					&pinctrl_adc0_ad8
 					&pinctrl_adc0_ad9
 					>;
 				status = "okay";
 			};

17

arch/arm/boot/dts/imx6sx-sdb.dts

View File

 @ -12,23 +12,6 @@
 	model = "Freescale i.MX6 SoloX SDB RevB Board";
 };
 &cpu0 {
 	operating-points = <
 		/* kHz    uV */
 1250000
 1175000
 1175000
 1175000
 		>;
 	fsl,soc-operating-points = <
 		/* ARM kHz      SOC uV */
 	1250000
 	1175000
 	1175000
 1175000
 	>;
 };
 &i2c1 {
 	clock-frequency = <100000>;
 	pinctrl-names = "default";

4

arch/arm/boot/dts/keystone-k2l-netcp.dtsi

View File

 @ -137,8 +137,8 @@ netcp: netcp@26000000 {
 	/* NetCP address range */
 	ranges = <0 0x26000000 0x1000000>;
 	clocks = <&clkpa>, <&clkcpgmac>, <&chipclk12>, <&clkosr>;
 	clock-names = "pa_clk", "ethss_clk", "cpts", "osr_clk";
 	clocks = <&clkpa>, <&clkcpgmac>, <&chipclk12>;
 	clock-names = "pa_clk", "ethss_clk", "cpts";
 	dma-coherent;
 	ti,navigator-dmas = <&dma_gbe 0>,

8

arch/arm/boot/dts/keystone-k2l.dtsi

View File

 @ -232,6 +232,14 @@
 			};
 		};
 		osr: sram@70000000 {
 			compatible = "mmio-sram";
 			reg = <0x70000000 0x10000>;
 			#address-cells = <1>;
 			#size-cells = <1>;
 			clocks = <&clkosr>;
 		};
 		dspgpio0: keystone_dsp_gpio@02620240 {
 			compatible = "ti,keystone-dsp-gpio";
 			gpio-controller;

									
										3

arch/arm/include/asm/device.h
									
												View File
												
				@ -15,6 +15,9 @@ struct dev_archdata {

				#endif

				#ifdef CONFIG_ARM_DMA_USE_IOMMU

					struct dma_iommu_mapping	*mapping;

				#endif

				#ifdef CONFIG_XEN

					const struct dma_map_ops *dev_dma_ops;

				#endif

					bool dma_coherent;

				};

									
										12

arch/arm/include/asm/dma-mapping.h
									
												View File
												
				@ -16,19 +16,9 @@

				extern const struct dma_map_ops arm_dma_ops;

				extern const struct dma_map_ops arm_coherent_dma_ops;

				static inline const struct dma_map_ops *__generic_dma_ops(struct device *dev)

				{

					if (dev && dev->dma_ops)

						return dev->dma_ops;

					return &arm_dma_ops;

				}

				static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)

				{

					if (xen_initial_domain())

						return xen_dma_ops;

					else

						return __generic_dma_ops(NULL);

					return &arm_dma_ops;

				}

				#define HAVE_ARCH_DMA_SUPPORTED 1

									
										2

arch/arm/include/asm/fixmap.h
									
												View File
												
				@ -41,7 +41,7 @@ static const enum fixed_addresses __end_of_fixed_addresses =

				#define FIXMAP_PAGE_COMMON	(L_PTE_YOUNG | L_PTE_PRESENT | L_PTE_XN | L_PTE_DIRTY)

				#define FIXMAP_PAGE_NORMAL	(FIXMAP_PAGE_COMMON | L_PTE_MT_WRITEBACK)

				#define FIXMAP_PAGE_NORMAL	(pgprot_kernel | L_PTE_XN)

				#define FIXMAP_PAGE_RO		(FIXMAP_PAGE_NORMAL | L_PTE_RDONLY)

				/* Used by set_fixmap_(io|nocache), both meant for mapping a device */

									
										3

arch/arm/include/asm/kvm_coproc.h
									
												View File
												
				@ -31,7 +31,8 @@ void kvm_register_target_coproc_table(struct kvm_coproc_target_table *table);

				int kvm_handle_cp10_id(struct kvm_vcpu *vcpu, struct kvm_run *run);

				int kvm_handle_cp_0_13_access(struct kvm_vcpu *vcpu, struct kvm_run *run);

				int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run);

				int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run);

				int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run);

				int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run);

				int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run);

				int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run);

									
										9

arch/arm/include/asm/module.h
									
												View File
												
				@ -18,13 +18,18 @@ enum {

				};

				#endif

				struct mod_plt_sec {

					struct elf32_shdr	*plt;

					int			plt_count;

				};

				struct mod_arch_specific {

				#ifdef CONFIG_ARM_UNWIND

					struct unwind_table *unwind[ARM_SEC_MAX];

				#endif

				#ifdef CONFIG_ARM_MODULE_PLTS

					struct elf32_shdr   *plt;

					int		    plt_count;

					struct mod_plt_sec	core;

					struct mod_plt_sec	init;

				#endif

				};

									
										85

arch/arm/kernel/module-plts.c
									
												View File
												
				@ -1,5 +1,5 @@

				/*

				 * Copyright (C) 2014 Linaro Ltd. <ard.biesheuvel@linaro.org>

				 * Copyright (C) 2014-2017 Linaro Ltd. <ard.biesheuvel@linaro.org>

				 *

				 * This program is free software; you can redistribute it and/or modify

				 * it under the terms of the GNU General Public License version 2 as

				@ -31,9 +31,17 @@ struct plt_entries {

					u32	lit[PLT_ENT_COUNT];

				};

				static bool in_init(const struct module *mod, unsigned long loc)

				{

					return loc - (u32)mod->init_layout.base < mod->init_layout.size;

				}

				u32 get_module_plt(struct module *mod, unsigned long loc, Elf32_Addr val)

				{

					struct plt_entries *plt = (struct plt_entries *)mod->arch.plt->sh_addr;

					struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :

											  &mod->arch.init;

					struct plt_entries *plt = (struct plt_entries *)pltsec->plt->sh_addr;

					int idx = 0;

					/*

				@ -41,9 +49,9 @@ u32 get_module_plt(struct module *mod, unsigned long loc, Elf32_Addr val)

					 * relocations are sorted, this will be the last entry we allocated.

					 * (if one exists).

					 */

					if (mod->arch.plt_count > 0) {

						plt += (mod->arch.plt_count - 1) / PLT_ENT_COUNT;

						idx = (mod->arch.plt_count - 1) % PLT_ENT_COUNT;

					if (pltsec->plt_count > 0) {

						plt += (pltsec->plt_count - 1) / PLT_ENT_COUNT;

						idx = (pltsec->plt_count - 1) % PLT_ENT_COUNT;

						if (plt->lit[idx] == val)

							return (u32)&plt->ldr[idx];

				@ -53,8 +61,8 @@ u32 get_module_plt(struct module *mod, unsigned long loc, Elf32_Addr val)

							plt++;

					}

					mod->arch.plt_count++;

					BUG_ON(mod->arch.plt_count * PLT_ENT_SIZE > mod->arch.plt->sh_size);

					pltsec->plt_count++;

					BUG_ON(pltsec->plt_count * PLT_ENT_SIZE > pltsec->plt->sh_size);

					if (!idx)

						/* Populate a new set of entries */

				@ -129,7 +137,7 @@ static bool duplicate_rel(Elf32_Addr base, const Elf32_Rel *rel, int num)

				/* Count how many PLT entries we may need */

				static unsigned int count_plts(const Elf32_Sym *syms, Elf32_Addr base,

							       const Elf32_Rel *rel, int num)

							       const Elf32_Rel *rel, int num, Elf32_Word dstidx)

				{

					unsigned int ret = 0;

					const Elf32_Sym *s;

				@ -144,13 +152,17 @@ static unsigned int count_plts(const Elf32_Sym *syms, Elf32_Addr base,

						case R_ARM_THM_JUMP24:

							/*

							 * We only have to consider branch targets that resolve

							 * to undefined symbols. This is not simply a heuristic,

							 * it is a fundamental limitation, since the PLT itself

							 * is part of the module, and needs to be within range

							 * as well, so modules can never grow beyond that limit.

							 * to symbols that are defined in a different section.

							 * This is not simply a heuristic, it is a fundamental

							 * limitation, since there is no guaranteed way to emit

							 * PLT entries sufficiently close to the branch if the

							 * section size exceeds the range of a branch

							 * instruction. So ignore relocations against defined

							 * symbols if they live in the same section as the

							 * relocation target.

							 */

							s = syms + ELF32_R_SYM(rel[i].r_info);

							if (s->st_shndx != SHN_UNDEF)

							if (s->st_shndx == dstidx)

								break;

							/*

				@ -161,7 +173,12 @@ static unsigned int count_plts(const Elf32_Sym *syms, Elf32_Addr base,

							 * So we need to support them, but there is no need to

							 * take them into consideration when trying to optimize

							 * this code. So let's only check for duplicates when

							 * the addend is zero.

							 * the addend is zero. (Note that calls into the core

							 * module via init PLT entries could involve section

							 * relative symbol references with non-zero addends, for

							 * which we may end up emitting duplicates, but the init

							 * PLT is released along with the rest of the .init

							 * region as soon as module loading completes.)

							 */

							if (!is_zero_addend_relocation(base, rel + i) ||

							    !duplicate_rel(base, rel, i))

				@ -174,7 +191,8 @@ static unsigned int count_plts(const Elf32_Sym *syms, Elf32_Addr base,

				int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,

							      char *secstrings, struct module *mod)

				{

					unsigned long plts = 0;

					unsigned long core_plts = 0;

					unsigned long init_plts = 0;

					Elf32_Shdr *s, *sechdrs_end = sechdrs + ehdr->e_shnum;

					Elf32_Sym *syms = NULL;

				@ -184,13 +202,15 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,

					 */

					for (s = sechdrs; s < sechdrs_end; ++s) {

						if (strcmp(".plt", secstrings + s->sh_name) == 0)

							mod->arch.plt = s;

							mod->arch.core.plt = s;

						else if (strcmp(".init.plt", secstrings + s->sh_name) == 0)

							mod->arch.init.plt = s;

						else if (s->sh_type == SHT_SYMTAB)

							syms = (Elf32_Sym *)s->sh_addr;

					}

					if (!mod->arch.plt) {

						pr_err("%s: module PLT section missing\n", mod->name);

					if (!mod->arch.core.plt || !mod->arch.init.plt) {

						pr_err("%s: module PLT section(s) missing\n", mod->name);

						return -ENOEXEC;

					}

					if (!syms) {

				@ -213,16 +233,29 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,

						/* sort by type and symbol index */

						sort(rels, numrels, sizeof(Elf32_Rel), cmp_rel, NULL);

						plts += count_plts(syms, dstsec->sh_addr, rels, numrels);

						if (strncmp(secstrings + dstsec->sh_name, ".init", 5) != 0)

							core_plts += count_plts(syms, dstsec->sh_addr, rels,

										numrels, s->sh_info);

						else

							init_plts += count_plts(syms, dstsec->sh_addr, rels,

										numrels, s->sh_info);

					}

					mod->arch.plt->sh_type = SHT_NOBITS;

					mod->arch.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;

					mod->arch.plt->sh_addralign = L1_CACHE_BYTES;

					mod->arch.plt->sh_size = round_up(plts * PLT_ENT_SIZE,

									  sizeof(struct plt_entries));

					mod->arch.plt_count = 0;

					mod->arch.core.plt->sh_type = SHT_NOBITS;

					mod->arch.core.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;

					mod->arch.core.plt->sh_addralign = L1_CACHE_BYTES;

					mod->arch.core.plt->sh_size = round_up(core_plts * PLT_ENT_SIZE,

									       sizeof(struct plt_entries));

					mod->arch.core.plt_count = 0;

					pr_debug("%s: plt=%x\n", __func__, mod->arch.plt->sh_size);

					mod->arch.init.plt->sh_type = SHT_NOBITS;

					mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;

					mod->arch.init.plt->sh_addralign = L1_CACHE_BYTES;

					mod->arch.init.plt->sh_size = round_up(init_plts * PLT_ENT_SIZE,

									       sizeof(struct plt_entries));

					mod->arch.init.plt_count = 0;

					pr_debug("%s: plt=%x, init.plt=%x\n", __func__,

						 mod->arch.core.plt->sh_size, mod->arch.init.plt->sh_size);

					return 0;

				}

1

arch/arm/kernel/module.lds

View File

 @ -1,3 +1,4 @@
 SECTIONS {
 	.plt : { BYTE(0) }
 	.init.plt : { BYTE(0) }
 }

									
										4

arch/arm/kernel/setup.c
									
												View File
												
				@ -80,7 +80,7 @@ __setup("fpe=", fpe_setup);

				extern void init_default_cache_policy(unsigned long);

				extern void paging_init(const struct machine_desc *desc);

				extern void early_paging_init(const struct machine_desc *);

				extern void early_mm_init(const struct machine_desc *);

				extern void adjust_lowmem_bounds(void);

				extern enum reboot_mode reboot_mode;

				extern void setup_dma_zone(const struct machine_desc *desc);

				@ -1088,7 +1088,7 @@ void __init setup_arch(char **cmdline_p)

					parse_early_param();

				#ifdef CONFIG_MMU

					early_paging_init(mdesc);

					early_mm_init(mdesc);

				#endif

					setup_dma_zone(mdesc);

					xen_early_init();

									
										77

arch/arm/kvm/coproc.c
									
												View File
												
				@ -93,12 +93,6 @@ int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu, struct kvm_run *run)

					return 1;

				}

				int kvm_handle_cp14_access(struct kvm_vcpu *vcpu, struct kvm_run *run)

				{

					kvm_inject_undefined(vcpu);

					return 1;

				}

				static void reset_mpidr(struct kvm_vcpu *vcpu, const struct coproc_reg *r)

				{

					/*

				@ -514,12 +508,7 @@ static int emulate_cp15(struct kvm_vcpu *vcpu,

					return 1;

				}

				/**

				 * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15 access

				 * @vcpu: The VCPU pointer

				 * @run:  The kvm_run struct

				 */

				int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)

				static struct coproc_params decode_64bit_hsr(struct kvm_vcpu *vcpu)

				{

					struct coproc_params params;

				@ -533,9 +522,38 @@ int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)

					params.Rt2 = (kvm_vcpu_get_hsr(vcpu) >> 10) & 0xf;

					params.CRm = 0;

					return params;

				}

				/**

				 * kvm_handle_cp15_64 -- handles a mrrc/mcrr trap on a guest CP15 access

				 * @vcpu: The VCPU pointer

				 * @run:  The kvm_run struct

				 */

				int kvm_handle_cp15_64(struct kvm_vcpu *vcpu, struct kvm_run *run)

				{

					struct coproc_params params = decode_64bit_hsr(vcpu);

					return emulate_cp15(vcpu, &params);

				}

				/**

				 * kvm_handle_cp14_64 -- handles a mrrc/mcrr trap on a guest CP14 access

				 * @vcpu: The VCPU pointer

				 * @run:  The kvm_run struct

				 */

				int kvm_handle_cp14_64(struct kvm_vcpu *vcpu, struct kvm_run *run)

				{

					struct coproc_params params = decode_64bit_hsr(vcpu);

					/* raz_wi cp14 */

					pm_fake(vcpu, &params, NULL);

					/* handled */

					kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));

					return 1;

				}

				static void reset_coproc_regs(struct kvm_vcpu *vcpu,

							      const struct coproc_reg *table, size_t num)

				{

				@ -546,12 +564,7 @@ static void reset_coproc_regs(struct kvm_vcpu *vcpu,

							table[i].reset(vcpu, &table[i]);

				}

				/**

				 * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15 access

				 * @vcpu: The VCPU pointer

				 * @run:  The kvm_run struct

				 */

				int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)

				static struct coproc_params decode_32bit_hsr(struct kvm_vcpu *vcpu)

				{

					struct coproc_params params;

				@ -565,9 +578,37 @@ int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)

					params.Op2 = (kvm_vcpu_get_hsr(vcpu) >> 17) & 0x7;

					params.Rt2 = 0;

					return params;

				}

				/**

				 * kvm_handle_cp15_32 -- handles a mrc/mcr trap on a guest CP15 access

				 * @vcpu: The VCPU pointer

				 * @run:  The kvm_run struct

				 */

				int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)

				{

					struct coproc_params params = decode_32bit_hsr(vcpu);

					return emulate_cp15(vcpu, &params);

				}

				/**

				 * kvm_handle_cp14_32 -- handles a mrc/mcr trap on a guest CP14 access

				 * @vcpu: The VCPU pointer

				 * @run:  The kvm_run struct

				 */

				int kvm_handle_cp14_32(struct kvm_vcpu *vcpu, struct kvm_run *run)

				{

					struct coproc_params params = decode_32bit_hsr(vcpu);

					/* raz_wi cp14 */

					pm_fake(vcpu, &params, NULL);

					/* handled */

					kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));

					return 1;

				}

				/******************************************************************************

				 * Userspace API

				 *****************************************************************************/

									
										4

arch/arm/kvm/handle_exit.c
									
												View File
												
				@ -95,9 +95,9 @@ static exit_handle_fn arm_exit_handlers[] = {

					[HSR_EC_WFI]		= kvm_handle_wfx,

					[HSR_EC_CP15_32]	= kvm_handle_cp15_32,

					[HSR_EC_CP15_64]	= kvm_handle_cp15_64,

					[HSR_EC_CP14_MR]	= kvm_handle_cp14_access,

					[HSR_EC_CP14_MR]	= kvm_handle_cp14_32,

					[HSR_EC_CP14_LS]	= kvm_handle_cp14_load_store,

					[HSR_EC_CP14_64]	= kvm_handle_cp14_access,

					[HSR_EC_CP14_64]	= kvm_handle_cp14_64,

					[HSR_EC_CP_0_13]	= kvm_handle_cp_0_13_access,

					[HSR_EC_CP10_ID]	= kvm_handle_cp10_id,

					[HSR_EC_HVC]		= handle_hvc,

									
										2

arch/arm/kvm/hyp/Makefile
									
												View File
												
				@ -2,6 +2,8 @@

				# Makefile for Kernel-based Virtual Machine module, HYP part

				#

				ccflags-y += -fno-stack-protector

				KVM=../../../../virt/kvm

				obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o

									
										4

arch/arm/kvm/hyp/switch.c
									
												View File
												
				@ -48,7 +48,9 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu, u32 *fpexc_host)

					write_sysreg(HSTR_T(15), HSTR);

					write_sysreg(HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11), HCPTR);

					val = read_sysreg(HDCR);

					write_sysreg(val | HDCR_TPM | HDCR_TPMCR, HDCR);

					val |= HDCR_TPM | HDCR_TPMCR; /* trap performance monitors */

					val |= HDCR_TDRA | HDCR_TDOSA | HDCR_TDA; /* trap debug regs */

					write_sysreg(val, HDCR);

				}

				static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu)

									
										5

arch/arm/kvm/init.S
									
												View File
												
				@ -95,7 +95,6 @@ __do_hyp_init:

					@  - Write permission implies XN: disabled

					@  - Instruction cache: enabled

					@  - Data/Unified cache: enabled

					@  - Memory alignment checks: enabled

					@  - MMU: enabled (this code must be run from an identity mapping)

					mrc	p15, 4, r0, c1, c0, 0	@ HSCR

					ldr	r2, =HSCTLR_MASK

				@ -103,8 +102,8 @@ __do_hyp_init:

					mrc	p15, 0, r1, c1, c0, 0	@ SCTLR

					ldr	r2, =(HSCTLR_EE | HSCTLR_FI | HSCTLR_I | HSCTLR_C)

					and	r1, r1, r2

				 ARM(	ldr	r2, =(HSCTLR_M | HSCTLR_A)			)

				 THUMB(	ldr	r2, =(HSCTLR_M | HSCTLR_A | HSCTLR_TE)		)

				 ARM(	ldr	r2, =(HSCTLR_M)					)

				 THUMB(	ldr	r2, =(HSCTLR_M | HSCTLR_TE)			)

					orr	r1, r1, r2

					orr	r0, r0, r1

					mcr	p15, 4, r0, c1, c0, 0	@ HSCR

									
										36

arch/arm/kvm/mmu.c
									
												View File
												
				@ -295,6 +295,13 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)

					assert_spin_locked(&kvm->mmu_lock);

					pgd = kvm->arch.pgd + stage2_pgd_index(addr);

					do {

						/*

						 * Make sure the page table is still active, as another thread

						 * could have possibly freed the page table, while we released

						 * the lock.

						 */

						if (!READ_ONCE(kvm->arch.pgd))

							break;

						next = stage2_pgd_addr_end(addr, end);

						if (!stage2_pgd_none(*pgd))

							unmap_stage2_puds(kvm, pgd, addr, next);

				@ -829,22 +836,22 @@ void stage2_unmap_vm(struct kvm *kvm)

				 * Walks the level-1 page table pointed to by kvm->arch.pgd and frees all

				 * underlying level-2 and level-3 tables before freeing the actual level-1 table

				 * and setting the struct pointer to NULL.

				 *

				 * Note we don't need locking here as this is only called when the VM is

				 * destroyed, which can only be done once.

				 */

				void kvm_free_stage2_pgd(struct kvm *kvm)

				{

					if (kvm->arch.pgd == NULL)

						return;

					void *pgd = NULL;

					spin_lock(&kvm->mmu_lock);

					unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);

					if (kvm->arch.pgd) {

						unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);

						pgd = READ_ONCE(kvm->arch.pgd);

						kvm->arch.pgd = NULL;

					}

					spin_unlock(&kvm->mmu_lock);

					/* Free the HW pgd, one page at a time */

					free_pages_exact(kvm->arch.pgd, S2_PGD_SIZE);

					kvm->arch.pgd = NULL;

					if (pgd)

						free_pages_exact(pgd, S2_PGD_SIZE);

				}

				static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,

				@ -872,6 +879,9 @@ static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache *cache

					pmd_t *pmd;

					pud = stage2_get_pud(kvm, cache, addr);

					if (!pud)

						return NULL;

					if (stage2_pud_none(*pud)) {

						if (!cache)

							return NULL;

				@ -1170,11 +1180,13 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)

						 * large. Otherwise, we may see kernel panics with

						 * CONFIG_DETECT_HUNG_TASK, CONFIG_LOCKUP_DETECTOR,

						 * CONFIG_LOCKDEP. Additionally, holding the lock too long

						 * will also starve other vCPUs.

						 * will also starve other vCPUs. We have to also make sure

						 * that the page tables are not freed while we released

						 * the lock.

						 */

						if (need_resched() || spin_needbreak(&kvm->mmu_lock))

							cond_resched_lock(&kvm->mmu_lock);

						cond_resched_lock(&kvm->mmu_lock);

						if (!READ_ONCE(kvm->arch.pgd))

							break;

						next = stage2_pgd_addr_end(addr, end);

						if (stage2_pgd_present(*pgd))

							stage2_wp_puds(pgd, addr, next);

									
										8

arch/arm/kvm/psci.c
									
												View File
												
				@ -208,9 +208,10 @@ int kvm_psci_version(struct kvm_vcpu *vcpu)

				static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)

				{

					int ret = 1;

					struct kvm *kvm = vcpu->kvm;

					unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);

					unsigned long val;

					int ret = 1;

					switch (psci_fn) {

					case PSCI_0_2_FN_PSCI_VERSION:

				@ -230,7 +231,9 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)

						break;

					case PSCI_0_2_FN_CPU_ON:

					case PSCI_0_2_FN64_CPU_ON:

						mutex_lock(&kvm->lock);

						val = kvm_psci_vcpu_on(vcpu);

						mutex_unlock(&kvm->lock);

						break;

					case PSCI_0_2_FN_AFFINITY_INFO:

					case PSCI_0_2_FN64_AFFINITY_INFO:

				@ -279,6 +282,7 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)

				static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)

				{

					struct kvm *kvm = vcpu->kvm;

					unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);

					unsigned long val;

				@ -288,7 +292,9 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)

						val = PSCI_RET_SUCCESS;

						break;

					case KVM_PSCI_FN_CPU_ON:

						mutex_lock(&kvm->lock);

						val = kvm_psci_vcpu_on(vcpu);

						mutex_unlock(&kvm->lock);

						break;

					default:

						val = PSCI_RET_NOT_SUPPORTED;

									
										7

arch/arm/mm/dma-mapping.c
									
												View File
												
				@ -2414,6 +2414,13 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,

						dma_ops = arm_get_dma_map_ops(coherent);

					set_dma_ops(dev, dma_ops);

				#ifdef CONFIG_XEN

					if (xen_initial_domain()) {

						dev->archdata.dev_dma_ops = dev->dma_ops;

						dev->dma_ops = xen_dma_ops;

					}

				#endif

				}

				void arch_teardown_dma_ops(struct device *dev)

									
										16

arch/arm/mm/mmu.c
									
												View File
												
				@ -414,6 +414,11 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot)

						     FIXADDR_END);

					BUG_ON(idx >= __end_of_fixed_addresses);

					/* we only support device mappings until pgprot_kernel has been set */

					if (WARN_ON(pgprot_val(prot) != pgprot_val(FIXMAP_PAGE_IO) &&

						    pgprot_val(pgprot_kernel) == 0))

						return;

					if (pgprot_val(prot))

						set_pte_at(NULL, vaddr, pte,

							pfn_pte(phys >> PAGE_SHIFT, prot));

				@ -1492,7 +1497,7 @@ pgtables_remap lpae_pgtables_remap_asm;

				 * early_paging_init() recreates boot time page table setup, allowing machines

				 * to switch over to a high (>4G) address space on LPAE systems

				 */

				void __init early_paging_init(const struct machine_desc *mdesc)

				static void __init early_paging_init(const struct machine_desc *mdesc)

				{

					pgtables_remap *lpae_pgtables_remap;

					unsigned long pa_pgd;

				@ -1560,7 +1565,7 @@ void __init early_paging_init(const struct machine_desc *mdesc)

				#else

				void __init early_paging_init(const struct machine_desc *mdesc)

				static void __init early_paging_init(const struct machine_desc *mdesc)

				{

					long long offset;

				@ -1616,7 +1621,6 @@ void __init paging_init(const struct machine_desc *mdesc)

				{

					void *zero_page;

					build_mem_type_table();

					prepare_page_table();

					map_lowmem();

					memblock_set_current_limit(arm_lowmem_limit);

				@ -1636,3 +1640,9 @@ void __init paging_init(const struct machine_desc *mdesc)

					empty_zero_page = virt_to_page(zero_page);

					__flush_dcache_page(NULL, empty_zero_page);

				}

				void __init early_mm_init(const struct machine_desc *mdesc)

				{

					build_mem_type_table();

					early_paging_init(mdesc);

				}

									
										4

arch/arm/mm/proc-v7m.S
									
												View File
												
				@ -147,10 +147,10 @@ __v7m_setup_cont:

					@ Configure caches (if implemented)

					teq     r8, #0

					stmneia	r12, {r0-r6, lr}	@ v7m_invalidate_l1 touches r0-r6

					stmneia	sp, {r0-r6, lr}		@ v7m_invalidate_l1 touches r0-r6

					blne	v7m_invalidate_l1

					teq     r8, #0			@ re-evalutae condition

					ldmneia	r12, {r0-r6, lr}

					ldmneia	sp, {r0-r6, lr}

					@ Configure the System Control Register to ensure 8-byte stack alignment

					@ Note the STKALIGN bit is either RW or RAO.

3

arch/arm64/boot/dts/hisilicon/hi6220.dtsi

View File

 @ -774,6 +774,7 @@
 			clocks = <&sys_ctrl 2>, <&sys_ctrl 1>;
 			clock-names = "ciu", "biu";
 			resets = <&sys_ctrl PERIPH_RSTDIS0_MMC0>;
 			reset-names = "reset";
 			bus-width = <0x8>;
 			vmmc-supply = <&ldo19>;
 			pinctrl-names = "default";
 @ -797,6 +798,7 @@
 			clocks = <&sys_ctrl 4>, <&sys_ctrl 3>;
 			clock-names = "ciu", "biu";
 			resets = <&sys_ctrl PERIPH_RSTDIS0_MMC1>;
 			reset-names = "reset";
 			vqmmc-supply = <&ldo7>;
 			vmmc-supply = <&ldo10>;
 			bus-width = <0x4>;
 @ -815,6 +817,7 @@
 			clocks = <&sys_ctrl HI6220_MMC2_CIUCLK>, <&sys_ctrl HI6220_MMC2_CLK>;
 			clock-names = "ciu", "biu";
 			resets = <&sys_ctrl PERIPH_RSTDIS0_MMC2>;
 			reset-names = "reset";
 			bus-width = <0x4>;
 			broken-cd;
 			pinctrl-names = "default", "idle";

									
										9

arch/arm64/include/asm/asm-uaccess.h
									
												View File
												
				@ -62,4 +62,13 @@ alternative_if ARM64_ALT_PAN_NOT_UAO

				alternative_else_nop_endif

					.endm

				/*

				 * Remove the address tag from a virtual address, if present.

				 */

					.macro	clear_address_tag, dst, addr

					tst	\addr, #(1 << 55)

					bic	\dst, \addr, #(0xff << 56)

					csel	\dst, \dst, \addr, eq

					.endm

				#endif

									
										20

arch/arm64/include/asm/barrier.h
									
												View File
												
				@ -42,25 +42,35 @@

				#define __smp_rmb()	dmb(ishld)

				#define __smp_wmb()	dmb(ishst)

				#define __smp_store_release(p, v)						\

				#define __smp_store_release(p, v)					\

				do {									\

					union { typeof(*p) __val; char __c[1]; } __u =			\

						{ .__val = (__force typeof(*p)) (v) }; 			\

					compiletime_assert_atomic_type(*p);				\

					switch (sizeof(*p)) {						\

					case 1:								\

						asm volatile ("stlrb %w1, %0"				\

								: "=Q" (*p) : "r" (v) : "memory");	\

								: "=Q" (*p)				\

								: "r" (*(__u8 *)__u.__c)		\

								: "memory");				\

						break;							\

					case 2:								\

						asm volatile ("stlrh %w1, %0"				\

								: "=Q" (*p) : "r" (v) : "memory");	\

								: "=Q" (*p)				\

								: "r" (*(__u16 *)__u.__c)		\

								: "memory");				\

						break;							\

					case 4:								\

						asm volatile ("stlr %w1, %0"				\

								: "=Q" (*p) : "r" (v) : "memory");	\

								: "=Q" (*p)				\

								: "r" (*(__u32 *)__u.__c)		\

								: "memory");				\

						break;							\

					case 8:								\

						asm volatile ("stlr %1, %0"				\

								: "=Q" (*p) : "r" (v) : "memory");	\

								: "=Q" (*p)				\

								: "r" (*(__u64 *)__u.__c)		\

								: "memory");				\

						break;							\

					}								\

				} while (0)

									
										2

arch/arm64/include/asm/cmpxchg.h
									
												View File
												
				@ -46,7 +46,7 @@ static inline unsigned long __xchg_case_##name(unsigned long x,		\

					"	swp" #acq_lse #rel #sz "\t%" #w "3, %" #w "0, %2\n"	\

						__nops(3)						\

					"	" #nop_lse)						\

					: "=&r" (ret), "=&r" (tmp), "+Q" (*(u8 *)ptr)			\

					: "=&r" (ret), "=&r" (tmp), "+Q" (*(unsigned long *)ptr)	\

					: "r" (x)							\

					: cl);								\

													\

									
										3

arch/arm64/include/asm/device.h
									
												View File
												
				@ -19,6 +19,9 @@

				struct dev_archdata {

				#ifdef CONFIG_IOMMU_API

					void *iommu;			/* private IOMMU data */

				#endif

				#ifdef CONFIG_XEN

					const struct dma_map_ops *dev_dma_ops;

				#endif

					bool dma_coherent;

				};

									
										13

arch/arm64/include/asm/dma-mapping.h
									
												View File
												
				@ -27,11 +27,8 @@

				#define DMA_ERROR_CODE	(~(dma_addr_t)0)

				extern const struct dma_map_ops dummy_dma_ops;

				static inline const struct dma_map_ops *__generic_dma_ops(struct device *dev)

				static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)

				{

					if (dev && dev->dma_ops)

						return dev->dma_ops;

					/*

					 * We expect no ISA devices, and all other DMA masters are expected to

					 * have someone call arch_setup_dma_ops at device creation time.

				@ -39,14 +36,6 @@ static inline const struct dma_map_ops *__generic_dma_ops(struct device *dev)

					return &dummy_dma_ops;

				}

				static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)

				{

					if (xen_initial_domain())

						return xen_dma_ops;

					else

						return __generic_dma_ops(NULL);

				}

				void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,

							const struct iommu_ops *iommu, bool coherent);

				#define arch_setup_dma_ops	arch_setup_dma_ops

									
										6

arch/arm64/include/asm/kvm_emulate.h
									
												View File
												
				@ -240,6 +240,12 @@ static inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vcpu)

					return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_FSC_TYPE;

				}

				static inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)

				{

					u32 esr = kvm_vcpu_get_hsr(vcpu);

					return (esr & ESR_ELx_SYS64_ISS_RT_MASK) >> ESR_ELx_SYS64_ISS_RT_SHIFT;

				}

				static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)

				{

					return vcpu_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;

									
										4

arch/arm64/include/asm/sysreg.h
									
												View File
												
				@ -138,6 +138,10 @@

				#define SCTLR_ELx_A	(1 << 1)

				#define SCTLR_ELx_M	1

				#define SCTLR_EL2_RES1	((1 << 4)  | (1 << 5)  | (1 << 11) | (1 << 16) | \

							 (1 << 16) | (1 << 18) | (1 << 22) | (1 << 23) | \

							 (1 << 28) | (1 << 29))

				#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C | \

							 SCTLR_ELx_SA | SCTLR_ELx_I)

									
										9

arch/arm64/include/asm/uaccess.h
									
												View File
												
				@ -95,20 +95,21 @@ static inline void set_fs(mm_segment_t fs)

				 */

				#define __range_ok(addr, size)						\

				({									\

					unsigned long __addr = (unsigned long __force)(addr);		\

					unsigned long flag, roksum;					\

					__chk_user_ptr(addr);						\

					asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, ls"		\

						: "=&r" (flag), "=&r" (roksum)				\

						: "1" (addr), "Ir" (size),				\

						: "1" (__addr), "Ir" (size),				\

						  "r" (current_thread_info()->addr_limit)		\

						: "cc");						\

					flag;								\

				})

				/*

				 * When dealing with data aborts or instruction traps we may end up with

				 * a tagged userland pointer. Clear the tag to get a sane pointer to pass

				 * on to access_ok(), for instance.

				 * When dealing with data aborts, watchpoints, or instruction traps we may end

				 * up with a tagged userland pointer. Clear the tag to get a sane pointer to

				 * pass on to access_ok(), for instance.

				 */

				#define untagged_addr(addr)		sign_extend64(addr, 55)

									
										3

arch/arm64/kernel/armv8_deprecated.c
									
												View File
												
				@ -306,7 +306,8 @@ do {								\

					_ASM_EXTABLE(0b, 4b)					\

					_ASM_EXTABLE(1b, 4b)					\

					: "=&r" (res), "+r" (data), "=&r" (temp), "=&r" (temp2)	\

					: "r" (addr), "i" (-EAGAIN), "i" (-EFAULT),		\

					: "r" ((unsigned long)addr), "i" (-EAGAIN),		\

					  "i" (-EFAULT),					\

					  "i" (__SWP_LL_SC_LOOPS)				\

					: "memory");						\

					uaccess_disable();					\

									
										5

arch/arm64/kernel/entry.S
									
												View File
												
				@ -428,12 +428,13 @@ el1_da:

					/*

					 * Data abort handling

					 */

					mrs	x0, far_el1

					mrs	x3, far_el1

					enable_dbg

					// re-enable interrupts if they were enabled in the aborted context

					tbnz	x23, #7, 1f			// PSR_I_BIT

					enable_irq

				1:

					clear_address_tag x0, x3

					mov	x2, sp				// struct pt_regs

					bl	do_mem_abort

				@ -594,7 +595,7 @@ el0_da:

					// enable interrupts before calling the main handler

					enable_dbg_and_irq

					ct_user_exit

					bic	x0, x26, #(0xff << 56)

					clear_address_tag x0, x26

					mov	x1, x25

					mov	x2, sp

					bl	do_mem_abort

									
										3

arch/arm64/kernel/hw_breakpoint.c
									
												View File
												
				@ -36,6 +36,7 @@

				#include <asm/traps.h>

				#include <asm/cputype.h>

				#include <asm/system_misc.h>

				#include <asm/uaccess.h>

				/* Breakpoint currently in use for each BRP. */

				static DEFINE_PER_CPU(struct perf_event *, bp_on_reg[ARM_MAX_BRP]);

				@ -721,6 +722,8 @@ static u64 get_distance_from_watchpoint(unsigned long addr, u64 val,

					u64 wp_low, wp_high;

					u32 lens, lene;

					addr = untagged_addr(addr);

					lens = __ffs(ctrl->len);

					lene = __fls(ctrl->len);

									
										4

arch/arm64/kernel/traps.c
									
												View File
												
				@ -443,7 +443,7 @@ int cpu_enable_cache_maint_trap(void *__unused)

				}

				#define __user_cache_maint(insn, address, res)			\

					if (untagged_addr(address) >= user_addr_max()) {	\

					if (address >= user_addr_max()) {			\

						res = -EFAULT;					\

					} else {						\

						uaccess_ttbr0_enable();				\

				@ -469,7 +469,7 @@ static void user_cache_maint_handler(unsigned int esr, struct pt_regs *regs)

					int crm = (esr & ESR_ELx_SYS64_ISS_CRM_MASK) >> ESR_ELx_SYS64_ISS_CRM_SHIFT;

					int ret = 0;

					address = pt_regs_read_reg(regs, rt);

					address = untagged_addr(pt_regs_read_reg(regs, rt));

					switch (crm) {

					case ESR_ELx_SYS64_ISS_CRM_DC_CVAU:	/* DC CVAU, gets promoted */

									
										11

arch/arm64/kvm/hyp-init.S
									
												View File
												
				@ -102,10 +102,13 @@ __do_hyp_init:

					tlbi	alle2

					dsb	sy

					mrs	x4, sctlr_el2

					and	x4, x4, #SCTLR_ELx_EE	// preserve endianness of EL2

					ldr	x5, =SCTLR_ELx_FLAGS

					orr	x4, x4, x5

					/*

					 * Preserve all the RES1 bits while setting the default flags,

					 * as well as the EE bit on BE. Drop the A flag since the compiler

					 * is allowed to generate unaligned accesses.

					 */

					ldr	x4, =(SCTLR_EL2_RES1 | (SCTLR_ELx_FLAGS & ~SCTLR_ELx_A))

				CPU_BE(	orr	x4, x4, #SCTLR_ELx_EE)

					msr	sctlr_el2, x4

					isb

									
										2

arch/arm64/kvm/hyp/Makefile
									
												View File
												
				@ -2,6 +2,8 @@

				# Makefile for Kernel-based Virtual Machine module, HYP part

				#

				ccflags-y += -fno-stack-protector

				KVM=../../../../virt/kvm

				obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o

									
										8

arch/arm64/kvm/sys_regs.c
									
												View File
												
				@ -1638,8 +1638,8 @@ static int kvm_handle_cp_64(struct kvm_vcpu *vcpu,

				{

					struct sys_reg_params params;

					u32 hsr = kvm_vcpu_get_hsr(vcpu);

					int Rt = (hsr >> 5) & 0xf;

					int Rt2 = (hsr >> 10) & 0xf;

					int Rt = kvm_vcpu_sys_get_rt(vcpu);

					int Rt2 = (hsr >> 10) & 0x1f;

					params.is_aarch32 = true;

					params.is_32bit = false;

				@ -1690,7 +1690,7 @@ static int kvm_handle_cp_32(struct kvm_vcpu *vcpu,

				{

					struct sys_reg_params params;

					u32 hsr = kvm_vcpu_get_hsr(vcpu);

					int Rt  = (hsr >> 5) & 0xf;

					int Rt  = kvm_vcpu_sys_get_rt(vcpu);

					params.is_aarch32 = true;

					params.is_32bit = true;

				@ -1805,7 +1805,7 @@ int kvm_handle_sys_reg(struct kvm_vcpu *vcpu, struct kvm_run *run)

				{

					struct sys_reg_params params;

					unsigned long esr = kvm_vcpu_get_hsr(vcpu);

					int Rt = (esr >> 5) & 0x1f;

					int Rt = kvm_vcpu_sys_get_rt(vcpu);

					int ret;

					trace_kvm_handle_sys_reg(esr);

									
										7

arch/arm64/mm/dma-mapping.c
									
												View File
												
				@ -977,4 +977,11 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,

					dev->archdata.dma_coherent = coherent;

					__iommu_setup_dma_ops(dev, dma_base, size, iommu);

				#ifdef CONFIG_XEN

					if (xen_initial_domain()) {

						dev->archdata.dev_dma_ops = dev->dma_ops;

						dev->dma_ops = xen_dma_ops;

					}

				#endif

				}

									
										13

arch/arm64/net/bpf_jit_comp.c
									
												View File
												
				@ -252,8 +252,9 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)

					 */

					off = offsetof(struct bpf_array, ptrs);

					emit_a64_mov_i64(tmp, off, ctx);

					emit(A64_LDR64(tmp, r2, tmp), ctx);

					emit(A64_LDR64(prg, tmp, r3), ctx);

					emit(A64_ADD(1, tmp, r2, tmp), ctx);

					emit(A64_LSL(1, prg, r3, 3), ctx);

					emit(A64_LDR64(prg, tmp, prg), ctx);

					emit(A64_CBZ(1, prg, jmp_offset), ctx);

					/* goto *(prog->bpf_func + prologue_size); */

				@ -779,14 +780,14 @@ static int build_body(struct jit_ctx *ctx)

						int ret;

						ret = build_insn(insn, ctx);

						if (ctx->image == NULL)

							ctx->offset[i] = ctx->idx;

						if (ret > 0) {

							i++;

							if (ctx->image == NULL)

								ctx->offset[i] = ctx->idx;

							continue;

						}

						if (ctx->image == NULL)

							ctx->offset[i] = ctx->idx;

						if (ret)

							return ret;

					}

									
										49

arch/metag/include/asm/uaccess.h
									
												View File
												
				@ -28,24 +28,32 @@

				#define segment_eq(a, b)	((a).seg == (b).seg)

				#define __kernel_ok (segment_eq(get_fs(), KERNEL_DS))

				/*

				 * Explicitly allow NULL pointers here. Parts of the kernel such

				 * as readv/writev use access_ok to validate pointers, but want

				 * to allow NULL pointers for various reasons. NULL pointers are

				 * safe to allow through because the first page is not mappable on

				 * Meta.

				 *

				 * We also wish to avoid letting user code access the system area

				 * and the kernel half of the address space.

				 */

				#define __user_bad(addr, size) (((addr) > 0 && (addr) < META_MEMORY_BASE) || \

								((addr) > PAGE_OFFSET &&		\

								 (addr) < LINCORE_BASE))

				static inline int __access_ok(unsigned long addr, unsigned long size)

				{

					return __kernel_ok || !__user_bad(addr, size);

					/*

					 * Allow access to the user mapped memory area, but not the system area

					 * before it. The check extends to the top of the address space when

					 * kernel access is allowed (there's no real reason to user copy to the

					 * system area in any case).

					 */

					if (likely(addr >= META_MEMORY_BASE && addr < get_fs().seg &&

						   size <= get_fs().seg - addr))

						return true;

					/*

					 * Explicitly allow NULL pointers here. Parts of the kernel such

					 * as readv/writev use access_ok to validate pointers, but want

					 * to allow NULL pointers for various reasons. NULL pointers are

					 * safe to allow through because the first page is not mappable on

					 * Meta.

					 */

					if (!addr)

						return true;

					/* Allow access to core code memory area... */

					if (addr >= LINCORE_CODE_BASE && addr <= LINCORE_CODE_LIMIT &&

					    size <= LINCORE_CODE_LIMIT + 1 - addr)

						return true;

					/* ... but no other areas. */

					return false;

				}

				#define access_ok(type, addr, size) __access_ok((unsigned long)(addr),	\

				@ -186,8 +194,13 @@ do {                                                            \

				extern long __must_check __strncpy_from_user(char *dst, const char __user *src,

									     long count);

				#define strncpy_from_user(dst, src, count) __strncpy_from_user(dst, src, count)

				static inline long

				strncpy_from_user(char *dst, const char __user *src, long count)

				{

					if (!access_ok(VERIFY_READ, src, 1))

						return -EFAULT;

					return __strncpy_from_user(dst, src, count);

				}

				/*

				 * Return the size of a string (including the ending 0)

				 *

1

arch/mips/Kconfig

View File

 @ -1373,6 +1373,7 @@ config CPU_LOONGSON3
 	select WEAK_ORDERING
 	select WEAK_REORDERING_BEYOND_LLSC
 	select MIPS_PGD_C0_CONTEXT
 	select MIPS_L1_CACHE_SHIFT_6
 	select GPIOLIB
 	help
 		The Loongson 3 processor implements the MIPS64R2 instruction

									
										1

arch/mips/kernel/process.c
									
												View File
												
				@ -120,7 +120,6 @@ int copy_thread(unsigned long clone_flags, unsigned long usp,

					struct thread_info *ti = task_thread_info(p);

					struct pt_regs *childregs, *regs = current_pt_regs();

					unsigned long childksp;

					p->set_child_tid = p->clear_child_tid = NULL;

					childksp = (unsigned long)task_stack_page(p) + THREAD_SIZE - 32;

									
										2

arch/openrisc/kernel/process.c
									
												View File
												
				@ -167,8 +167,6 @@ copy_thread(unsigned long clone_flags, unsigned long usp,

					top_of_kernel_stack = sp;

					p->set_child_tid = p->clear_child_tid = NULL;

					/* Locate userspace context on stack... */

					sp -= STACK_FRAME_OVERHEAD;	/* redzone */

					sp -= sizeof(struct pt_regs);

									
										17

arch/powerpc/include/asm/mmu_context.h
									
												View File
												
				@ -70,8 +70,9 @@ extern void drop_cop(unsigned long acop, struct mm_struct *mm);

				 * switch_mm is the entry point called from the architecture independent

				 * code in kernel/sched/core.c

				 */

				static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,

							     struct task_struct *tsk)

				static inline void switch_mm_irqs_off(struct mm_struct *prev,

								      struct mm_struct *next,

								      struct task_struct *tsk)

				{

					/* Mark this context has been used on the new CPU */

					if (!cpumask_test_cpu(smp_processor_id(), mm_cpumask(next)))

				@ -110,6 +111,18 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,

					switch_mmu_context(prev, next, tsk);

				}

				static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,

							     struct task_struct *tsk)

				{

					unsigned long flags;

					local_irq_save(flags);

					switch_mm_irqs_off(prev, next, tsk);

					local_irq_restore(flags);

				}

				#define switch_mm_irqs_off switch_mm_irqs_off

				#define deactivate_mm(tsk,mm)	do { } while (0)

				/*

									
										14

arch/powerpc/include/asm/topology.h
									
												View File
												
				@ -44,8 +44,22 @@ extern void __init dump_numa_cpu_topology(void);

				extern int sysfs_add_device_to_node(struct device *dev, int nid);

				extern void sysfs_remove_device_from_node(struct device *dev, int nid);

				static inline int early_cpu_to_node(int cpu)

				{

					int nid;

					nid = numa_cpu_lookup_table[cpu];

					/*

					 * Fall back to node 0 if nid is unset (it should be, except bugs).

					 * This allows callers to safely do NODE_DATA(early_cpu_to_node(cpu)).

					 */

					return (nid < 0) ? 0 : nid;

				}

				#else

				static inline int early_cpu_to_node(int cpu) { return 0; }

				static inline void dump_numa_cpu_topology(void) {}

				static inline int sysfs_add_device_to_node(struct device *dev, int nid)

									
										19

arch/powerpc/kernel/eeh_driver.c
									
												View File
												
				@ -724,7 +724,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,

				 */

				#define MAX_WAIT_FOR_RECOVERY 300

				static void eeh_handle_normal_event(struct eeh_pe *pe)

				static bool eeh_handle_normal_event(struct eeh_pe *pe)

				{

					struct pci_bus *frozen_bus;

					struct eeh_dev *edev, *tmp;

				@ -736,7 +736,7 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)

					if (!frozen_bus) {

						pr_err("%s: Cannot find PCI bus for PHB#%x-PE#%x\n",

							__func__, pe->phb->global_number, pe->addr);

						return;

						return false;

					}

					eeh_pe_update_time_stamp(pe);

				@ -870,7 +870,7 @@ static void eeh_handle_normal_event(struct eeh_pe *pe)

					pr_info("EEH: Notify device driver to resume\n");

					eeh_pe_dev_traverse(pe, eeh_report_resume, NULL);

					return;

					return false;

				excess_failures:

					/*

				@ -915,8 +915,12 @@ perm_error:

							pci_lock_rescan_remove();

							pci_hp_remove_devices(frozen_bus);

							pci_unlock_rescan_remove();

							/* The passed PE should no longer be used */

							return true;

						}

					}

					return false;

				}

				static void eeh_handle_special_event(void)

				@ -982,7 +986,14 @@ static void eeh_handle_special_event(void)

						 */

						if (rc == EEH_NEXT_ERR_FROZEN_PE ||

						    rc == EEH_NEXT_ERR_FENCED_PHB) {

							eeh_handle_normal_event(pe);

							/*

							 * eeh_handle_normal_event() can make the PE stale if it

							 * determines that the PE cannot possibly be recovered.

							 * Don't modify the PE state if that's the case.

							 */

							if (eeh_handle_normal_event(pe))

								continue;

							eeh_pe_state_clear(pe, EEH_PE_RECOVERING);

						} else {

							pci_lock_rescan_remove();

									
										12

arch/powerpc/kernel/exceptions-64e.S
									
												View File
												
				@ -735,8 +735,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)

					andis.	r15,r14,(DBSR_IC|DBSR_BT)@h

					beq+	1f

				#ifdef CONFIG_RELOCATABLE

					ld	r15,PACATOC(r13)

					ld	r14,interrupt_base_book3e@got(r15)

					ld	r15,__end_interrupts@got(r15)

				#else

					LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)

					LOAD_REG_IMMEDIATE(r15,__end_interrupts)

				#endif

					cmpld	cr0,r10,r14

					cmpld	cr1,r10,r15

					blt+	cr0,1f

				@ -799,8 +805,14 @@ kernel_dbg_exc:

					andis.	r15,r14,(DBSR_IC|DBSR_BT)@h

					beq+	1f

				#ifdef CONFIG_RELOCATABLE

					ld	r15,PACATOC(r13)

					ld	r14,interrupt_base_book3e@got(r15)

					ld	r15,__end_interrupts@got(r15)

				#else

					LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)

					LOAD_REG_IMMEDIATE(r15,__end_interrupts)

				#endif

					cmpld	cr0,r10,r14

					cmpld	cr1,r10,r15

					blt+	cr0,1f

									
										2

arch/powerpc/kernel/mce.c
									
												View File
												
				@ -221,6 +221,8 @@ static void machine_check_process_queued_event(struct irq_work *work)

				{

					int index;

					add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);

					/*

					 * For now just print it to console.

					 * TODO: log this error event to FSP or nvram.

									
										1

arch/powerpc/kernel/nvram_64.c
									
												View File
												
				@ -561,6 +561,7 @@ static ssize_t nvram_pstore_read(u64 *id, enum pstore_type_id *type,

				static struct pstore_info nvram_pstore_info = {

					.owner = THIS_MODULE,

					.name = "nvram",

					.flags = PSTORE_FLAGS_DMESG,

					.open = nvram_pstore_open,

					.read = nvram_pstore_read,

					.write = nvram_pstore_write,

									
										22

arch/powerpc/kernel/process.c
									
												View File
												
				@ -864,6 +864,25 @@ static void tm_reclaim_thread(struct thread_struct *thr,

					if (!MSR_TM_SUSPENDED(mfmsr()))

						return;

					/*

					 * If we are in a transaction and FP is off then we can't have

					 * used FP inside that transaction. Hence the checkpointed

					 * state is the same as the live state. We need to copy the

					 * live state to the checkpointed state so that when the

					 * transaction is restored, the checkpointed state is correct

					 * and the aborted transaction sees the correct state. We use

					 * ckpt_regs.msr here as that's what tm_reclaim will use to

					 * determine if it's going to write the checkpointed state or

					 * not. So either this will write the checkpointed registers,

					 * or reclaim will. Similarly for VMX.

					 */

					if ((thr->ckpt_regs.msr & MSR_FP) == 0)

						memcpy(&thr->ckfp_state, &thr->fp_state,

						       sizeof(struct thread_fp_state));

					if ((thr->ckpt_regs.msr & MSR_VEC) == 0)

						memcpy(&thr->ckvr_state, &thr->vr_state,

						       sizeof(struct thread_vr_state));

					giveup_all(container_of(thr, struct task_struct, thread));

					tm_reclaim(thr, thr->ckpt_regs.msr, cause);

				@ -1647,6 +1666,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)

				#ifdef CONFIG_VSX

					current->thread.used_vsr = 0;

				#endif

					current->thread.load_fp = 0;

					memset(&current->thread.fp_state, 0, sizeof(current->thread.fp_state));

					current->thread.fp_save_area = NULL;

				#ifdef CONFIG_ALTIVEC

				@ -1655,6 +1675,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)

					current->thread.vr_save_area = NULL;

					current->thread.vrsave = 0;

					current->thread.used_vr = 0;

					current->thread.load_vec = 0;

				#endif /* CONFIG_ALTIVEC */

				#ifdef CONFIG_SPE

					memset(current->thread.evr, 0, sizeof(current->thread.evr));

				@ -1666,6 +1687,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)

					current->thread.tm_tfhar = 0;

					current->thread.tm_texasr = 0;

					current->thread.tm_tfiar = 0;

					current->thread.load_tm = 0;

				#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */

				}

				EXPORT_SYMBOL(start_thread);

									
										2

arch/powerpc/kernel/prom.c
									
												View File
												
				@ -161,7 +161,9 @@ static struct ibm_pa_feature {

					{ .pabyte = 0,  .pabit = 3, .cpu_features  = CPU_FTR_CTRL },

					{ .pabyte = 0,  .pabit = 6, .cpu_features  = CPU_FTR_NOEXECUTE },

					{ .pabyte = 1,  .pabit = 2, .mmu_features  = MMU_FTR_CI_LARGE_PAGE },

				#ifdef CONFIG_PPC_RADIX_MMU

					{ .pabyte = 40, .pabit = 0, .mmu_features  = MMU_FTR_TYPE_RADIX },

				#endif

					{ .pabyte = 1,  .pabit = 1, .invert = 1, .cpu_features = CPU_FTR_NODSISRALIGN },

					{ .pabyte = 5,  .pabit = 0, .cpu_features  = CPU_FTR_REAL_LE,

								    .cpu_user_ftrs = PPC_FEATURE_TRUE_LE },

									
										4

arch/powerpc/kernel/setup_64.c
									
												View File
												
				@ -650,7 +650,7 @@ void __init emergency_stack_init(void)

				static void * __init pcpu_fc_alloc(unsigned int cpu, size_t size, size_t align)

				{

					return __alloc_bootmem_node(NODE_DATA(cpu_to_node(cpu)), size, align,

					return __alloc_bootmem_node(NODE_DATA(early_cpu_to_node(cpu)), size, align,

								    __pa(MAX_DMA_ADDRESS));

				}

				@ -661,7 +661,7 @@ static void __init pcpu_fc_free(void *ptr, size_t size)

				static int pcpu_cpu_distance(unsigned int from, unsigned int to)

				{

					if (cpu_to_node(from) == cpu_to_node(to))

					if (early_cpu_to_node(from) == early_cpu_to_node(to))

						return LOCAL_DISTANCE;

					else

						return REMOTE_DISTANCE;

									
										6

arch/powerpc/kernel/sysfs.c
									
												View File
												
				@ -710,6 +710,10 @@ static int register_cpu_online(unsigned int cpu)

					struct device_attribute *attrs, *pmc_attrs;

					int i, nattrs;

					/* For cpus present at boot a reference was already grabbed in register_cpu() */

					if (!s->of_node)

						s->of_node = of_get_cpu_node(cpu, NULL);

				#ifdef CONFIG_PPC64

					if (cpu_has_feature(CPU_FTR_SMT))

						device_create_file(s, &dev_attr_smt_snooze_delay);

				@ -864,6 +868,8 @@ static int unregister_cpu_online(unsigned int cpu)

					}

				#endif

					cacheinfo_cpu_offline(cpu);

					of_node_put(s->of_node);

					s->of_node = NULL;

				#endif /* CONFIG_HOTPLUG_CPU */

					return 0;

				}

									
										4

arch/powerpc/kernel/traps.c
									
												View File
												
				@ -306,8 +306,6 @@ long machine_check_early(struct pt_regs *regs)

					__this_cpu_inc(irq_stat.mce_exceptions);

					add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);

					if (cur_cpu_spec && cur_cpu_spec->machine_check_early)

						handled = cur_cpu_spec->machine_check_early(regs);

					return handled;

				@ -741,6 +739,8 @@ void machine_check_exception(struct pt_regs *regs)

					__this_cpu_inc(irq_stat.mce_exceptions);

					add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);

					/* See if any machine dependent calls. In theory, we would want

					 * to call the CPU first, and call the ppc_md. one if the CPU

					 * one returns a positive number. However there is existing code

									
										7

arch/powerpc/mm/dump_linuxpagetables.c
									
												View File
												
				@ -16,6 +16,7 @@

				 */

				#include <linux/debugfs.h>

				#include <linux/fs.h>

				#include <linux/hugetlb.h>

				#include <linux/io.h>

				#include <linux/mm.h>

				#include <linux/sched.h>

				@ -331,7 +332,7 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)

					for (i = 0; i < PTRS_PER_PMD; i++, pmd++) {

						addr = start + i * PMD_SIZE;

						if (!pmd_none(*pmd))

						if (!pmd_none(*pmd) && !pmd_huge(*pmd))

							/* pmd exists */

							walk_pte(st, pmd, addr);

						else

				@ -347,7 +348,7 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)

					for (i = 0; i < PTRS_PER_PUD; i++, pud++) {

						addr = start + i * PUD_SIZE;

						if (!pud_none(*pud))

						if (!pud_none(*pud) && !pud_huge(*pud))

							/* pud exists */

							walk_pmd(st, pud, addr);

						else

				@ -367,7 +368,7 @@ static void walk_pagetables(struct pg_state *st)

					 */

					for (i = 0; i < PTRS_PER_PGD; i++, pgd++) {

						addr = KERN_VIRT_START + i * PGDIR_SIZE;

						if (!pgd_none(*pgd))

						if (!pgd_none(*pgd) && !pgd_huge(*pgd))

							/* pgd exists */

							walk_pud(st, pgd, addr);

						else

									
										4

arch/powerpc/mm/mmu_context_iommu.c
									
												View File
												
				@ -81,7 +81,7 @@ struct page *new_iommu_non_cma_page(struct page *page, unsigned long private,

					gfp_t gfp_mask = GFP_USER;

					struct page *new_page;

					if (PageHuge(page) || PageTransHuge(page) || PageCompound(page))

					if (PageCompound(page))

						return NULL;

					if (PageHighMem(page))

				@ -100,7 +100,7 @@ static int mm_iommu_move_page_from_cma(struct page *page)

					LIST_HEAD(cma_migrate_pages);

					/* Ignore huge pages for now */

					if (PageHuge(page) || PageTransHuge(page) || PageCompound(page))

					if (PageCompound(page))

						return -EBUSY;

					lru_add_drain();

									
										4

arch/powerpc/platforms/cell/spu_base.c
									
												View File
												
				@ -197,7 +197,9 @@ static int __spu_trap_data_map(struct spu *spu, unsigned long ea, u64 dsisr)

					    (REGION_ID(ea) != USER_REGION_ID)) {

						spin_unlock(&spu->register_lock);

						ret = hash_page(ea, _PAGE_PRESENT | _PAGE_READ, 0x300, dsisr);

						ret = hash_page(ea,

								_PAGE_PRESENT | _PAGE_READ | _PAGE_PRIVILEGED,

								0x300, dsisr);

						spin_lock(&spu->register_lock);

						if (!ret) {

									
										8

arch/powerpc/platforms/powernv/npu-dma.c
									
												View File
												
				@ -180,7 +180,7 @@ long pnv_npu_set_window(struct pnv_ioda_pe *npe, int num,

						pe_err(npe, "Failed to configure TCE table, err %lld\n", rc);

						return rc;

					}

					pnv_pci_phb3_tce_invalidate_entire(phb, false);

					pnv_pci_ioda2_tce_invalidate_entire(phb, false);

					/* Add the table to the list so its TCE cache will get invalidated */

					pnv_pci_link_table_and_group(phb->hose->node, num,

				@ -204,7 +204,7 @@ long pnv_npu_unset_window(struct pnv_ioda_pe *npe, int num)

						pe_err(npe, "Unmapping failed, ret = %lld\n", rc);

						return rc;

					}

					pnv_pci_phb3_tce_invalidate_entire(phb, false);

					pnv_pci_ioda2_tce_invalidate_entire(phb, false);

					pnv_pci_unlink_table_and_group(npe->table_group.tables[num],

							&npe->table_group);

				@ -270,7 +270,7 @@ static int pnv_npu_dma_set_bypass(struct pnv_ioda_pe *npe)

							0 /* bypass base */, top);

					if (rc == OPAL_SUCCESS)

						pnv_pci_phb3_tce_invalidate_entire(phb, false);

						pnv_pci_ioda2_tce_invalidate_entire(phb, false);

					return rc;

				}

				@ -334,7 +334,7 @@ void pnv_npu_take_ownership(struct pnv_ioda_pe *npe)

						pe_err(npe, "Failed to disable bypass, err %lld\n", rc);

						return;

					}

					pnv_pci_phb3_tce_invalidate_entire(npe->phb, false);

					pnv_pci_ioda2_tce_invalidate_entire(npe->phb, false);

				}

				struct pnv_ioda_pe *pnv_pci_npu_setup_iommu(struct pnv_ioda_pe *npe)

									
										10

arch/powerpc/platforms/powernv/pci-ioda.c
									
												View File
												
				@ -1883,7 +1883,7 @@ static struct iommu_table_ops pnv_ioda1_iommu_ops = {

				#define PHB3_TCE_KILL_INVAL_PE		PPC_BIT(1)

				#define PHB3_TCE_KILL_INVAL_ONE		PPC_BIT(2)

				void pnv_pci_phb3_tce_invalidate_entire(struct pnv_phb *phb, bool rm)

				static void pnv_pci_phb3_tce_invalidate_entire(struct pnv_phb *phb, bool rm)

				{

					__be64 __iomem *invalidate = pnv_ioda_get_inval_reg(phb, rm);

					const unsigned long val = PHB3_TCE_KILL_INVAL_ALL;

				@ -1979,6 +1979,14 @@ static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,

					}

				}

				void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_phb *phb, bool rm)

				{

					if (phb->model == PNV_PHB_MODEL_NPU || phb->model == PNV_PHB_MODEL_PHB3)

						pnv_pci_phb3_tce_invalidate_entire(phb, rm);

					else

						opal_pci_tce_kill(phb->opal_id, OPAL_PCI_TCE_KILL, 0, 0, 0, 0);

				}

				static int pnv_ioda2_tce_build(struct iommu_table *tbl, long index,

						long npages, unsigned long uaddr,

						enum dma_data_direction direction,

									
										2

arch/powerpc/platforms/powernv/pci.h
									
												View File
												
				@ -229,7 +229,7 @@ extern void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,

				/* Nvlink functions */

				extern void pnv_npu_try_dma_set_bypass(struct pci_dev *gpdev, bool bypass);

				extern void pnv_pci_phb3_tce_invalidate_entire(struct pnv_phb *phb, bool rm);

				extern void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_phb *phb, bool rm);

				extern struct pnv_ioda_pe *pnv_pci_npu_setup_iommu(struct pnv_ioda_pe *npe);

				extern long pnv_npu_set_window(struct pnv_ioda_pe *npe, int num,

						struct iommu_table *tbl);

									
										1

arch/powerpc/platforms/pseries/dlpar.c
									
												View File
												
				@ -288,7 +288,6 @@ int dlpar_detach_node(struct device_node *dn)

					if (rc)

						return rc;

					of_node_put(dn); /* Must decrement the refcount */

					return 0;

				}

									
										2

arch/powerpc/platforms/pseries/hotplug-memory.c
									
												View File
												
				@ -124,6 +124,7 @@ static struct property *dlpar_clone_drconf_property(struct device_node *dn)

					for (i = 0; i < num_lmbs; i++) {

						lmbs[i].base_addr = be64_to_cpu(lmbs[i].base_addr);

						lmbs[i].drc_index = be32_to_cpu(lmbs[i].drc_index);

						lmbs[i].aa_index = be32_to_cpu(lmbs[i].aa_index);

						lmbs[i].flags = be32_to_cpu(lmbs[i].flags);

					}

				@ -147,6 +148,7 @@ static void dlpar_update_drconf_property(struct device_node *dn,

					for (i = 0; i < num_lmbs; i++) {

						lmbs[i].base_addr = cpu_to_be64(lmbs[i].base_addr);

						lmbs[i].drc_index = cpu_to_be32(lmbs[i].drc_index);

						lmbs[i].aa_index = cpu_to_be32(lmbs[i].aa_index);

						lmbs[i].flags = cpu_to_be32(lmbs[i].flags);

					}

									
										3

arch/powerpc/sysdev/simple_gpio.c
									
												View File
												
				@ -75,7 +75,8 @@ static int u8_gpio_dir_out(struct gpio_chip *gc, unsigned int gpio, int val)

				static void u8_gpio_save_regs(struct of_mm_gpio_chip *mm_gc)

				{

					struct u8_gpio_chip *u8_gc = gpiochip_get_data(&mm_gc->gc);

					struct u8_gpio_chip *u8_gc =

						container_of(mm_gc, struct u8_gpio_chip, mm_gc);

					u8_gc->data = in_8(mm_gc->regs);

				}

									
										15

arch/s390/kernel/crash_dump.c
									
												View File
												
				@ -428,6 +428,20 @@ static void *nt_vmcoreinfo(void *ptr)

					return nt_init_name(ptr, 0, vmcoreinfo, size, "VMCOREINFO");

				}

				/*

				 * Initialize final note (needed for /proc/vmcore code)

				 */

				static void *nt_final(void *ptr)

				{

					Elf64_Nhdr *note;

					note = (Elf64_Nhdr *) ptr;

					note->n_namesz = 0;

					note->n_descsz = 0;

					note->n_type = 0;

					return PTR_ADD(ptr, sizeof(Elf64_Nhdr));

				}

				/*

				 * Initialize ELF header (new kernel)

				 */

				@ -515,6 +529,7 @@ static void *notes_init(Elf64_Phdr *phdr, void *ptr, u64 notes_offset)

						if (sa->prefix != 0)

							ptr = fill_cpu_elf_notes(ptr, cpu++, sa);

					ptr = nt_vmcoreinfo(ptr);

					ptr = nt_final(ptr);

					memset(phdr, 0, sizeof(*phdr));

					phdr->p_type = PT_NOTE;

					phdr->p_offset = notes_offset;

									
										40

arch/s390/kernel/entry.S
									
												View File
												
				@ -233,12 +233,17 @@ ENTRY(sie64a)

					lctlg	%c1,%c1,__LC_USER_ASCE		# load primary asce

				.Lsie_done:

				# some program checks are suppressing. C code (e.g. do_protection_exception)

				# will rewind the PSW by the ILC, which is 4 bytes in case of SIE. Other

				# instructions between sie64a and .Lsie_done should not cause program

				# interrupts. So lets use a nop (47 00 00 00) as a landing pad.

				# will rewind the PSW by the ILC, which is often 4 bytes in case of SIE. There

				# are some corner cases (e.g. runtime instrumentation) where ILC is unpredictable.

				# Other instructions between sie64a and .Lsie_done should not cause program

				# interrupts. So lets use 3 nops as a landing pad for all possible rewinds.

				# See also .Lcleanup_sie

				.Lrewind_pad:

					nop	0

				.Lrewind_pad6:

					nopr	7

				.Lrewind_pad4:

					nopr	7

				.Lrewind_pad2:

					nopr	7

					.globl sie_exit

				sie_exit:

					lg	%r14,__SF_EMPTY+8(%r15)		# load guest register save area

				@ -251,7 +256,9 @@ sie_exit:

					stg	%r14,__SF_EMPTY+16(%r15)	# set exit reason code

					j	sie_exit

					EX_TABLE(.Lrewind_pad,.Lsie_fault)

					EX_TABLE(.Lrewind_pad6,.Lsie_fault)

					EX_TABLE(.Lrewind_pad4,.Lsie_fault)

					EX_TABLE(.Lrewind_pad2,.Lsie_fault)

					EX_TABLE(sie_exit,.Lsie_fault)

				EXPORT_SYMBOL(sie64a)

				EXPORT_SYMBOL(sie_exit)

				@ -314,6 +321,7 @@ ENTRY(system_call)

					lg	%r14,__LC_VDSO_PER_CPU

					lmg	%r0,%r10,__PT_R0(%r11)

					mvc	__LC_RETURN_PSW(16),__PT_PSW(%r11)

				.Lsysc_exit_timer:

					stpt	__LC_EXIT_TIMER

					mvc	__VDSO_ECTG_BASE(16,%r14),__LC_EXIT_TIMER

					lmg	%r11,%r15,__PT_R11(%r11)

				@ -601,6 +609,7 @@ ENTRY(io_int_handler)

					lg	%r14,__LC_VDSO_PER_CPU

					lmg	%r0,%r10,__PT_R0(%r11)

					mvc	__LC_RETURN_PSW(16),__PT_PSW(%r11)

				.Lio_exit_timer:

					stpt	__LC_EXIT_TIMER

					mvc	__VDSO_ECTG_BASE(16,%r14),__LC_EXIT_TIMER

					lmg	%r11,%r15,__PT_R11(%r11)

				@ -1124,15 +1133,23 @@ cleanup_critical:

					br	%r14

				.Lcleanup_sysc_restore:

					# check if stpt has been executed

					clg	%r9,BASED(.Lcleanup_sysc_restore_insn)

					jh	0f

					mvc	__LC_EXIT_TIMER(8),__LC_ASYNC_ENTER_TIMER

					cghi	%r11,__LC_SAVE_AREA_ASYNC

					je	0f

					mvc	__LC_EXIT_TIMER(8),__LC_MCCK_ENTER_TIMER

				0:	clg	%r9,BASED(.Lcleanup_sysc_restore_insn+8)

					je	1f

					lg	%r9,24(%r11)		# get saved pointer to pt_regs

					mvc	__LC_RETURN_PSW(16),__PT_PSW(%r9)

					mvc	0(64,%r11),__PT_R8(%r9)

					lmg	%r0,%r7,__PT_R0(%r9)

				0:	lmg	%r8,%r9,__LC_RETURN_PSW

				1:	lmg	%r8,%r9,__LC_RETURN_PSW

					br	%r14

				.Lcleanup_sysc_restore_insn:

					.quad	.Lsysc_exit_timer

					.quad	.Lsysc_done - 4

				.Lcleanup_io_tif:

				@ -1140,15 +1157,20 @@ cleanup_critical:

					br	%r14

				.Lcleanup_io_restore:

					# check if stpt has been executed

					clg	%r9,BASED(.Lcleanup_io_restore_insn)

					je	0f

					jh	0f

					mvc	__LC_EXIT_TIMER(8),__LC_MCCK_ENTER_TIMER

				0:	clg	%r9,BASED(.Lcleanup_io_restore_insn+8)

					je	1f

					lg	%r9,24(%r11)		# get saved r11 pointer to pt_regs

					mvc	__LC_RETURN_PSW(16),__PT_PSW(%r9)

					mvc	0(64,%r11),__PT_R8(%r9)

					lmg	%r0,%r7,__PT_R0(%r9)

				0:	lmg	%r8,%r9,__LC_RETURN_PSW

				1:	lmg	%r8,%r9,__LC_RETURN_PSW

					br	%r14

				.Lcleanup_io_restore_insn:

					.quad	.Lio_exit_timer

					.quad	.Lio_done - 4

				.Lcleanup_idle:

4

arch/sparc/Kconfig

View File

 @ -192,9 +192,9 @@ config NR_CPUS
 	int "Maximum number of CPUs"
 	depends on SMP
 	range 2 32 if SPARC32
 	range 2 1024 if SPARC64
 	range 2 4096 if SPARC64
 	default 32 if SPARC32
 	default 64 if SPARC64
 	default 4096 if SPARC64
 source kernel/Kconfig.hz

									
										6

arch/sparc/include/asm/hugetlb.h
									
												View File
												
				@ -24,9 +24,11 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,

				static inline int prepare_hugepage_range(struct file *file,

							unsigned long addr, unsigned long len)

				{

					if (len & ~HPAGE_MASK)

					struct hstate *h = hstate_file(file);

					if (len & ~huge_page_mask(h))

						return -EINVAL;

					if (addr & ~HPAGE_MASK)

					if (addr & ~huge_page_mask(h))

						return -EINVAL;

					return 0;

				}

									
										2

arch/sparc/include/asm/mmu_64.h
									
												View File
												
				@ -52,7 +52,7 @@

				#define CTX_NR_MASK		TAG_CONTEXT_BITS

				#define CTX_HW_MASK		(CTX_NR_MASK | CTX_PGSZ_MASK)

				#define CTX_FIRST_VERSION	((_AC(1,UL) << CTX_VERSION_SHIFT) + _AC(1,UL))

				#define CTX_FIRST_VERSION	BIT(CTX_VERSION_SHIFT)

				#define CTX_VALID(__ctx)	\

					 (!(((__ctx.sparc64_ctx_val) ^ tlb_context_cache) & CTX_VERSION_MASK))

				#define CTX_HWBITS(__ctx)	((__ctx.sparc64_ctx_val) & CTX_HW_MASK)

									
										32

arch/sparc/include/asm/mmu_context_64.h
									
												View File
												
				@ -19,13 +19,8 @@ extern spinlock_t ctx_alloc_lock;

				extern unsigned long tlb_context_cache;

				extern unsigned long mmu_context_bmap[];

				DECLARE_PER_CPU(struct mm_struct *, per_cpu_secondary_mm);

				void get_new_mmu_context(struct mm_struct *mm);

				#ifdef CONFIG_SMP

				void smp_new_mmu_context_version(void);

				#else

				#define smp_new_mmu_context_version() do { } while (0)

				#endif

				int init_new_context(struct task_struct *tsk, struct mm_struct *mm);

				void destroy_context(struct mm_struct *mm);

				@ -76,8 +71,9 @@ void __flush_tlb_mm(unsigned long, unsigned long);

				static inline void switch_mm(struct mm_struct *old_mm, struct mm_struct *mm, struct task_struct *tsk)

				{

					unsigned long ctx_valid, flags;

					int cpu;

					int cpu = smp_processor_id();

					per_cpu(per_cpu_secondary_mm, cpu) = mm;

					if (unlikely(mm == &init_mm))

						return;

				@ -123,7 +119,6 @@ static inline void switch_mm(struct mm_struct *old_mm, struct mm_struct *mm, str

					 * for the first time, we must flush that context out of the

					 * local TLB.

					 */

					cpu = smp_processor_id();

					if (!ctx_valid || !cpumask_test_cpu(cpu, mm_cpumask(mm))) {

						cpumask_set_cpu(cpu, mm_cpumask(mm));

						__flush_tlb_mm(CTX_HWBITS(mm->context),

				@ -133,26 +128,7 @@ static inline void switch_mm(struct mm_struct *old_mm, struct mm_struct *mm, str

				}

				#define deactivate_mm(tsk,mm)	do { } while (0)

				/* Activate a new MM instance for the current task. */

				static inline void activate_mm(struct mm_struct *active_mm, struct mm_struct *mm)

				{

					unsigned long flags;

					int cpu;

					spin_lock_irqsave(&mm->context.lock, flags);

					if (!CTX_VALID(mm->context))

						get_new_mmu_context(mm);

					cpu = smp_processor_id();

					if (!cpumask_test_cpu(cpu, mm_cpumask(mm)))

						cpumask_set_cpu(cpu, mm_cpumask(mm));

					load_secondary_context(mm);

					__flush_tlb_mm(CTX_HWBITS(mm->context), SECONDARY_CONTEXT);

					tsb_context_switch(mm);

					spin_unlock_irqrestore(&mm->context.lock, flags);

				}

				#define activate_mm(active_mm, mm) switch_mm(active_mm, mm, NULL)

				#endif /* !(__ASSEMBLY__) */

				#endif /* !(__SPARC64_MMU_CONTEXT_H) */

									
										4

arch/sparc/include/asm/pgtable_32.h
									
												View File
												
				@ -91,9 +91,9 @@ extern unsigned long pfn_base;

				 * ZERO_PAGE is a global shared page that is always zero: used

				 * for zero-mapped memory areas etc..

				 */

				extern unsigned long empty_zero_page;

				extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];

				#define ZERO_PAGE(vaddr) (virt_to_page(&empty_zero_page))

				#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))

				/*

				 * In general all page table modifications should use the V8 atomic

									
										1

arch/sparc/include/asm/pil.h
									
												View File
												
				@ -20,7 +20,6 @@

				#define PIL_SMP_CALL_FUNC	1

				#define PIL_SMP_RECEIVE_SIGNAL	2

				#define PIL_SMP_CAPTURE		3

				#define PIL_SMP_CTX_NEW_VERSION	4

				#define PIL_DEVICE_IRQ		5

				#define PIL_SMP_CALL_FUNC_SNGL	6

				#define PIL_DEFERRED_PCR_WORK	7

									
										2

arch/sparc/include/asm/setup.h
									
												View File
												
				@ -16,7 +16,7 @@ extern char reboot_command[];

				 */

				extern unsigned char boot_cpu_id;

				extern unsigned long empty_zero_page;

				extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];

				extern int serial_console;

				static inline int con_is_present(void)

									
										1

arch/sparc/include/asm/vio.h
									
												View File
												
				@ -327,6 +327,7 @@ struct vio_dev {

					int			compat_len;

					u64			dev_no;

					u64			id;

					unsigned long		channel_id;

									
										15

arch/sparc/kernel/ftrace.c
									
												View File
												
				@ -130,18 +130,17 @@ unsigned long prepare_ftrace_return(unsigned long parent,

					if (unlikely(atomic_read(&current->tracing_graph_pause)))

						return parent + 8UL;

					trace.func = self_addr;

					trace.depth = current->curr_ret_stack + 1;

					/* Only trace if the calling function expects to */

					if (!ftrace_graph_entry(&trace))

						return parent + 8UL;

					if (ftrace_push_return_trace(parent, self_addr, &trace.depth,

								     frame_pointer, NULL) == -EBUSY)

						return parent + 8UL;

					trace.func = self_addr;

					/* Only trace if the calling function expects to */

					if (!ftrace_graph_entry(&trace)) {

						current->curr_ret_stack--;

						return parent + 8UL;

					}

					return return_hooker;

				}

				#endif /* CONFIG_FUNCTION_GRAPH_TRACER */

									
										6

arch/sparc/kernel/head_64.S
									
												View File
												
				@ -939,3 +939,9 @@ ENTRY(__retl_o1)

					retl

					 mov	%o1, %o0

				ENDPROC(__retl_o1)

				ENTRY(__retl_o1_asi)

					wr      %o5, 0x0, %asi

					retl

					 mov    %o1, %o0

				ENDPROC(__retl_o1_asi)

									
										17

arch/sparc/kernel/irq_64.c
									
												View File
												
				@ -1034,17 +1034,26 @@ static void __init init_cpu_send_mondo_info(struct trap_per_cpu *tb)

				{

				#ifdef CONFIG_SMP

					unsigned long page;

					void *mondo, *p;

					BUILD_BUG_ON((NR_CPUS * sizeof(u16)) > (PAGE_SIZE - 64));

					BUILD_BUG_ON((NR_CPUS * sizeof(u16)) > PAGE_SIZE);

					/* Make sure mondo block is 64byte aligned */

					p = kzalloc(127, GFP_KERNEL);

					if (!p) {

						prom_printf("SUN4V: Error, cannot allocate mondo block.\n");

						prom_halt();

					}

					mondo = (void *)(((unsigned long)p + 63) & ~0x3f);

					tb->cpu_mondo_block_pa = __pa(mondo);

					page = get_zeroed_page(GFP_KERNEL);

					if (!page) {

						prom_printf("SUN4V: Error, cannot allocate cpu mondo page.\n");

						prom_printf("SUN4V: Error, cannot allocate cpu list page.\n");

						prom_halt();

					}

					tb->cpu_mondo_block_pa = __pa(page);

					tb->cpu_list_pa = __pa(page + 64);

					tb->cpu_list_pa = __pa(page);

				#endif

				}

									
										1

arch/sparc/kernel/kernel.h
									
												View File
												
				@ -37,7 +37,6 @@ void handle_stdfmna(struct pt_regs *regs, unsigned long sfar, unsigned long sfsr

				/* smp_64.c */

				void __irq_entry smp_call_function_client(int irq, struct pt_regs *regs);

				void __irq_entry smp_call_function_single_client(int irq, struct pt_regs *regs);

				void __irq_entry smp_new_mmu_context_version_client(int irq, struct pt_regs *regs);

				void __irq_entry smp_penguin_jailcell(int irq, struct pt_regs *regs);

				void __irq_entry smp_receive_signal_client(int irq, struct pt_regs *regs);

									
										31

arch/sparc/kernel/smp_64.c
									
												View File
												
				@ -964,37 +964,6 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page)

					preempt_enable();

				}

				void __irq_entry smp_new_mmu_context_version_client(int irq, struct pt_regs *regs)

				{

					struct mm_struct *mm;

					unsigned long flags;

					clear_softint(1 << irq);

					/* See if we need to allocate a new TLB context because

					 * the version of the one we are using is now out of date.

					 */

					mm = current->active_mm;

					if (unlikely(!mm || (mm == &init_mm)))

						return;

					spin_lock_irqsave(&mm->context.lock, flags);

					if (unlikely(!CTX_VALID(mm->context)))

						get_new_mmu_context(mm);

					spin_unlock_irqrestore(&mm->context.lock, flags);

					load_secondary_context(mm);

					__flush_tlb_mm(CTX_HWBITS(mm->context),

						       SECONDARY_CONTEXT);

				}

				void smp_new_mmu_context_version(void)

				{

					smp_cross_call(&xcall_new_mmu_context_version, 0, 0, 0);

				}

				#ifdef CONFIG_KGDB

				void kgdb_roundup_cpus(unsigned long flags)

				{

									
										11

arch/sparc/kernel/tsb.S
									
												View File
												
				@ -455,13 +455,16 @@ __tsb_context_switch:

					.type	copy_tsb,#function

				copy_tsb:		/* %o0=old_tsb_base, %o1=old_tsb_size

							 * %o2=new_tsb_base, %o3=new_tsb_size

							 * %o4=page_size_shift

							 */

					sethi		%uhi(TSB_PASS_BITS), %g7

					srlx		%o3, 4, %o3

					add		%o0, %o1, %g1	/* end of old tsb */

					add		%o0, %o1, %o1	/* end of old tsb */

					sllx		%g7, 32, %g7

					sub		%o3, 1, %o3	/* %o3 == new tsb hash mask */

					mov		%o4, %g1	/* page_size_shift */

				661:	prefetcha	[%o0] ASI_N, #one_read

					.section	.tsb_phys_patch, "ax"

					.word		661b

				@ -486,9 +489,9 @@ copy_tsb:		/* %o0=old_tsb_base, %o1=old_tsb_size

					/* This can definitely be computed faster... */

					srlx		%o0, 4, %o5	/* Build index */

					and		%o5, 511, %o5	/* Mask index */

					sllx		%o5, PAGE_SHIFT, %o5 /* Put into vaddr position */

					sllx		%o5, %g1, %o5	/* Put into vaddr position */

					or		%o4, %o5, %o4	/* Full VADDR. */

					srlx		%o4, PAGE_SHIFT, %o4 /* Shift down to create index */

					srlx		%o4, %g1, %o4	/* Shift down to create index */

					and		%o4, %o3, %o4	/* Mask with new_tsb_nents-1 */

					sllx		%o4, 4, %o4	/* Shift back up into tsb ent offset */

					TSB_STORE(%o2 + %o4, %g2)	/* Store TAG */

				@ -496,7 +499,7 @@ copy_tsb:		/* %o0=old_tsb_base, %o1=old_tsb_size

					TSB_STORE(%o2 + %o4, %g3)	/* Store TTE */

				80:	add		%o0, 16, %o0

					cmp		%o0, %g1

					cmp		%o0, %o1

					bne,pt		%xcc, 90b

					 nop

									
										2

arch/sparc/kernel/ttable_64.S
									
												View File
												
				@ -50,7 +50,7 @@ tl0_resv03e:	BTRAP(0x3e) BTRAP(0x3f) BTRAP(0x40)

				tl0_irq1:	TRAP_IRQ(smp_call_function_client, 1)

				tl0_irq2:	TRAP_IRQ(smp_receive_signal_client, 2)

				tl0_irq3:	TRAP_IRQ(smp_penguin_jailcell, 3)

				tl0_irq4:	TRAP_IRQ(smp_new_mmu_context_version_client, 4)

				tl0_irq4:       BTRAP(0x44)

				#else

				tl0_irq1:	BTRAP(0x41)

				tl0_irq2:	BTRAP(0x42)

									
										68

arch/sparc/kernel/vio.c
									
												View File
												
				@ -302,13 +302,16 @@ static struct vio_dev *vio_create_one(struct mdesc_handle *hp, u64 mp,

					if (!id) {

						dev_set_name(&vdev->dev, "%s", bus_id_name);

						vdev->dev_no = ~(u64)0;

						vdev->id = ~(u64)0;

					} else if (!cfg_handle) {

						dev_set_name(&vdev->dev, "%s-%llu", bus_id_name, *id);

						vdev->dev_no = *id;

						vdev->id = ~(u64)0;

					} else {

						dev_set_name(&vdev->dev, "%s-%llu-%llu", bus_id_name,

							     *cfg_handle, *id);

						vdev->dev_no = *cfg_handle;

						vdev->id = *id;

					}

					vdev->dev.parent = parent;

				@ -351,27 +354,84 @@ static void vio_add(struct mdesc_handle *hp, u64 node)

					(void) vio_create_one(hp, node, &root_vdev->dev);

				}

				struct vio_md_node_query {

					const char *type;

					u64 dev_no;

					u64 id;

				};

				static int vio_md_node_match(struct device *dev, void *arg)

				{

					struct vio_md_node_query *query = (struct vio_md_node_query *) arg;

					struct vio_dev *vdev = to_vio_dev(dev);

					if (vdev->mp == (u64) arg)

						return 1;

					if (vdev->dev_no != query->dev_no)

						return 0;

					if (vdev->id != query->id)

						return 0;

					if (strcmp(vdev->type, query->type))

						return 0;

					return 0;

					return 1;

				}

				static void vio_remove(struct mdesc_handle *hp, u64 node)

				{

					const char *type;

					const u64 *id, *cfg_handle;

					u64 a;

					struct vio_md_node_query query;

					struct device *dev;

					dev = device_find_child(&root_vdev->dev, (void *) node,

					type = mdesc_get_property(hp, node, "device-type", NULL);

					if (!type) {

						type = mdesc_get_property(hp, node, "name", NULL);

						if (!type)

							type = mdesc_node_name(hp, node);

					}

					query.type = type;

					id = mdesc_get_property(hp, node, "id", NULL);

					cfg_handle = NULL;

					mdesc_for_each_arc(a, hp, node, MDESC_ARC_TYPE_BACK) {

						u64 target;

						target = mdesc_arc_target(hp, a);

						cfg_handle = mdesc_get_property(hp, target,

										"cfg-handle", NULL);

						if (cfg_handle)

							break;

					}

					if (!id) {

						query.dev_no = ~(u64)0;

						query.id = ~(u64)0;

					} else if (!cfg_handle) {

						query.dev_no = *id;

						query.id = ~(u64)0;

					} else {

						query.dev_no = *cfg_handle;

						query.id = *id;

					}

					dev = device_find_child(&root_vdev->dev, &query,

								vio_md_node_match);

					if (dev) {

						printk(KERN_INFO "VIO: Removing device %s\n", dev_name(dev));

						device_unregister(dev);

						put_device(dev);

					} else {

						if (!id)

							printk(KERN_ERR "VIO: Removed unknown %s node.\n",

							       type);

						else if (!cfg_handle)

							printk(KERN_ERR "VIO: Removed unknown %s node %llu.\n",

							       type, *id);

						else

							printk(KERN_ERR "VIO: Removed unknown %s node %llu-%llu.\n",

							       type, *cfg_handle, *id);

					}

				}

									
										2

arch/sparc/lib/GENbzero.S
									
												View File
												
				@ -8,7 +8,7 @@

				98:	x,y;			\

					.section __ex_table,"a";\

					.align 4;		\

					.word 98b, __retl_o1;	\

					.word 98b, __retl_o1_asi;\

					.text;			\

					.align 4;

									
										1

arch/sparc/lib/Makefile
									
												View File
												
				@ -15,6 +15,7 @@ lib-$(CONFIG_SPARC32) += copy_user.o locks.o

				lib-$(CONFIG_SPARC64) += atomic_64.o

				lib-$(CONFIG_SPARC32) += lshrdi3.o ashldi3.o

				lib-$(CONFIG_SPARC32) += muldi3.o bitext.o cmpdi2.o

				lib-$(CONFIG_SPARC64) += multi3.o

				lib-$(CONFIG_SPARC64) += copy_page.o clear_page.o bzero.o

				lib-$(CONFIG_SPARC64) += csum_copy.o csum_copy_from_user.o csum_copy_to_user.o

									
										2

arch/sparc/lib/NGbzero.S
									
												View File
												
				@ -8,7 +8,7 @@

				98:	x,y;			\

					.section __ex_table,"a";\

					.align 4;		\

					.word 98b, __retl_o1;	\

					.word 98b, __retl_o1_asi;\

					.text;			\

					.align 4;

									
										35

arch/sparc/lib/multi3.S
									
										Normal file
									
												View File
												
				@ -0,0 +1,35 @@

				#include <linux/linkage.h>

				#include <asm/export.h>

					.text

					.align	4

				ENTRY(__multi3) /* %o0 = u, %o1 = v */

					mov	%o1, %g1

					srl	%o3, 0, %g4

					mulx	%g4, %g1, %o1

					srlx	%g1, 0x20, %g3

					mulx	%g3, %g4, %g5

					sllx	%g5, 0x20, %o5

					srl	%g1, 0, %g4

					sub	%o1, %o5, %o5

					srlx	%o5, 0x20, %o5

					addcc	%g5, %o5, %g5

					srlx	%o3, 0x20, %o5

					mulx	%g4, %o5, %g4

					mulx	%g3, %o5, %o5

					sethi	%hi(0x80000000), %g3

					addcc	%g5, %g4, %g5

					srlx	%g5, 0x20, %g5

					add	%g3, %g3, %g3

					movcc	%xcc, %g0, %g3

					addcc	%o5, %g5, %o5

					sllx	%g4, 0x20, %g4

					add	%o1, %g4, %o1

					add	%o5, %g3, %g2

					mulx	%g1, %o2, %g1

					add	%g1, %g2, %g1

					mulx	%o0, %o3, %o0

					retl

					 add	%g1, %o0, %o0

				ENDPROC(__multi3)

				EXPORT_SYMBOL(__multi3)

									
										2

arch/sparc/mm/init_32.c
									
												View File
												
				@ -290,7 +290,7 @@ void __init mem_init(void)

					/* Saves us work later. */

					memset((void *)&empty_zero_page, 0, PAGE_SIZE);

					memset((void *)empty_zero_page, 0, PAGE_SIZE);

					i = last_valid_pfn >> ((20 - PAGE_SHIFT) + 5);

					i += 1;

									
										89

arch/sparc/mm/init_64.c
									
												View File
												
				@ -358,7 +358,8 @@ static int __init setup_hugepagesz(char *string)

					}

					if ((hv_pgsz_mask & cpu_pgsz_mask) == 0U) {

						pr_warn("hugepagesz=%llu not supported by MMU.\n",

						hugetlb_bad_size();

						pr_err("hugepagesz=%llu not supported by MMU.\n",

							hugepage_size);

						goto out;

					}

				@ -706,10 +707,58 @@ EXPORT_SYMBOL(__flush_dcache_range);

				/* get_new_mmu_context() uses "cache + 1".  */

				DEFINE_SPINLOCK(ctx_alloc_lock);

				unsigned long tlb_context_cache = CTX_FIRST_VERSION - 1;

				unsigned long tlb_context_cache = CTX_FIRST_VERSION;

				#define MAX_CTX_NR	(1UL << CTX_NR_BITS)

				#define CTX_BMAP_SLOTS	BITS_TO_LONGS(MAX_CTX_NR)

				DECLARE_BITMAP(mmu_context_bmap, MAX_CTX_NR);

				DEFINE_PER_CPU(struct mm_struct *, per_cpu_secondary_mm) = {0};

				static void mmu_context_wrap(void)

				{

					unsigned long old_ver = tlb_context_cache & CTX_VERSION_MASK;

					unsigned long new_ver, new_ctx, old_ctx;

					struct mm_struct *mm;

					int cpu;

					bitmap_zero(mmu_context_bmap, 1 << CTX_NR_BITS);

					/* Reserve kernel context */

					set_bit(0, mmu_context_bmap);

					new_ver = (tlb_context_cache & CTX_VERSION_MASK) + CTX_FIRST_VERSION;

					if (unlikely(new_ver == 0))

						new_ver = CTX_FIRST_VERSION;

					tlb_context_cache = new_ver;

					/*

					 * Make sure that any new mm that are added into per_cpu_secondary_mm,

					 * are going to go through get_new_mmu_context() path.

					 */

					mb();

					/*

					 * Updated versions to current on those CPUs that had valid secondary

					 * contexts

					 */

					for_each_online_cpu(cpu) {

						/*

						 * If a new mm is stored after we took this mm from the array,

						 * it will go into get_new_mmu_context() path, because we

						 * already bumped the version in tlb_context_cache.

						 */

						mm = per_cpu(per_cpu_secondary_mm, cpu);

						if (unlikely(!mm || mm == &init_mm))

							continue;

						old_ctx = mm->context.sparc64_ctx_val;

						if (likely((old_ctx & CTX_VERSION_MASK) == old_ver)) {

							new_ctx = (old_ctx & ~CTX_VERSION_MASK) | new_ver;

							set_bit(new_ctx & CTX_NR_MASK, mmu_context_bmap);

							mm->context.sparc64_ctx_val = new_ctx;

						}

					}

				}

				/* Caller does TLB context flushing on local CPU if necessary.

				 * The caller also ensures that CTX_VALID(mm->context) is false.

				@ -725,48 +774,30 @@ void get_new_mmu_context(struct mm_struct *mm)

				{

					unsigned long ctx, new_ctx;

					unsigned long orig_pgsz_bits;

					int new_version;

					spin_lock(&ctx_alloc_lock);

				retry:

					/* wrap might have happened, test again if our context became valid */

					if (unlikely(CTX_VALID(mm->context)))

						goto out;

					orig_pgsz_bits = (mm->context.sparc64_ctx_val & CTX_PGSZ_MASK);

					ctx = (tlb_context_cache + 1) & CTX_NR_MASK;

					new_ctx = find_next_zero_bit(mmu_context_bmap, 1 << CTX_NR_BITS, ctx);

					new_version = 0;

					if (new_ctx >= (1 << CTX_NR_BITS)) {

						new_ctx = find_next_zero_bit(mmu_context_bmap, ctx, 1);

						if (new_ctx >= ctx) {

							int i;

							new_ctx = (tlb_context_cache & CTX_VERSION_MASK) +

								CTX_FIRST_VERSION;

							if (new_ctx == 1)

								new_ctx = CTX_FIRST_VERSION;

							/* Don't call memset, for 16 entries that's just

							 * plain silly...

							 */

							mmu_context_bmap[0] = 3;

							mmu_context_bmap[1] = 0;

							mmu_context_bmap[2] = 0;

							mmu_context_bmap[3] = 0;

							for (i = 4; i < CTX_BMAP_SLOTS; i += 4) {

								mmu_context_bmap[i + 0] = 0;

								mmu_context_bmap[i + 1] = 0;

								mmu_context_bmap[i + 2] = 0;

								mmu_context_bmap[i + 3] = 0;

							}

							new_version = 1;

							goto out;

							mmu_context_wrap();

							goto retry;

						}

					}

					if (mm->context.sparc64_ctx_val)

						cpumask_clear(mm_cpumask(mm));

					mmu_context_bmap[new_ctx>>6] |= (1UL << (new_ctx & 63));

					new_ctx |= (tlb_context_cache & CTX_VERSION_MASK);

				out:

					tlb_context_cache = new_ctx;

					mm->context.sparc64_ctx_val = new_ctx | orig_pgsz_bits;

				out:

					spin_unlock(&ctx_alloc_lock);

					if (unlikely(new_version))

						smp_new_mmu_context_version();

				}

				static int numa_enabled = 1;

									
										7

arch/sparc/mm/tsb.c
									
												View File
												
				@ -496,7 +496,8 @@ retry_tsb_alloc:

						extern void copy_tsb(unsigned long old_tsb_base,

								     unsigned long old_tsb_size,

								     unsigned long new_tsb_base,

								     unsigned long new_tsb_size);

								     unsigned long new_tsb_size,

								     unsigned long page_size_shift);

						unsigned long old_tsb_base = (unsigned long) old_tsb;

						unsigned long new_tsb_base = (unsigned long) new_tsb;

				@ -504,7 +505,9 @@ retry_tsb_alloc:

							old_tsb_base = __pa(old_tsb_base);

							new_tsb_base = __pa(new_tsb_base);

						}

						copy_tsb(old_tsb_base, old_size, new_tsb_base, new_size);

						copy_tsb(old_tsb_base, old_size, new_tsb_base, new_size,

							tsb_index == MM_TSB_BASE ?

							PAGE_SHIFT : REAL_HPAGE_SHIFT);

					}

					mm->context.tsb_block[tsb_index].tsb = new_tsb;

									
										5

arch/sparc/mm/ultra.S
									
												View File
												
				@ -971,11 +971,6 @@ xcall_capture:

					wr		%g0, (1 << PIL_SMP_CAPTURE), %set_softint

					retry

					.globl		xcall_new_mmu_context_version

				xcall_new_mmu_context_version:

					wr		%g0, (1 << PIL_SMP_CTX_NEW_VERSION), %set_softint

					retry

				#ifdef CONFIG_KGDB

					.globl		xcall_kgdb_capture

				xcall_kgdb_capture:

									
										4

arch/um/kernel/initrd.c
									
												View File
												
				@ -14,7 +14,7 @@

				static char *initrd __initdata = NULL;

				static int load_initrd(char *filename, void *buf, int size);

				static int __init read_initrd(void)

				int __init read_initrd(void)

				{

					void *area;

					long long size;

				@ -46,8 +46,6 @@ static int __init read_initrd(void)

					return 0;

				}

				__uml_postsetup(read_initrd);

				static int __init uml_initrd_setup(char *line, int *add)

				{

					initrd = line;

									
										6

arch/um/kernel/um_arch.c
									
												View File
												
				@ -338,11 +338,17 @@ int __init linux_main(int argc, char **argv)

					return start_uml();

				}

				int __init __weak read_initrd(void)

				{

					return 0;

				}

				void __init setup_arch(char **cmdline_p)

				{

					stack_protections((unsigned long) &init_thread_info);

					setup_physmem(uml_physmem, uml_reserved, physmem_size, highmem);

					mem_total_pages(physmem_size, iomem_size, highmem);

					read_initrd();

					paging_init();

					strlcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);

									
										2

arch/x86/boot/boot.h
									
												View File
												
				@ -16,7 +16,7 @@

				#ifndef BOOT_BOOT_H

				#define BOOT_BOOT_H

				#define STACK_SIZE	512	/* Minimum number of bytes for stack */

				#define STACK_SIZE	1024	/* Minimum number of bytes for stack */

				#ifndef __ASSEMBLY__

									
										2

arch/x86/boot/compressed/Makefile
									
												View File
												
				@ -94,7 +94,7 @@ vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o

				quiet_cmd_check_data_rel = DATAREL $@

				define cmd_check_data_rel

					for obj in $(filter %.o,$^); do \

						readelf -S $$obj | grep -qF .rel.local && { \

						${CROSS_COMPILE}readelf -S $$obj | grep -qF .rel.local && { \

							echo "error: $$obj has data relocations!" >&2; \

							exit 1; \

						} || true; \

									
										2

arch/x86/events/intel/rapl.c
									
												View File
												
				@ -761,7 +761,7 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {

					X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_CORE,   hsw_rapl_init),

					X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_GT3E,   hsw_rapl_init),

					X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_X,	  hsw_rapl_init),

					X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_X,	  hsx_rapl_init),

					X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_XEON_D, hsw_rapl_init),

					X86_RAPL_MODEL_MATCH(INTEL_FAM6_XEON_PHI_KNL, knl_rapl_init),

									
										1

arch/x86/include/asm/mce.h
									
												View File
												
				@ -264,6 +264,7 @@ static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *s

				#endif

				int mce_available(struct cpuinfo_x86 *c);

				bool mce_is_memory_error(struct mce *m);

				DECLARE_PER_CPU(unsigned, mce_exception_count);

				DECLARE_PER_CPU(unsigned, mce_poll_count);

Compare commits

621 Commits v4.14.183 ... v4.11.6

16 Documentation/acpi/acpi-lid.txt Unescape Escape View File

62 Documentation/arm64/tagged-pointers.txt Unescape Escape View File

2 Makefile Unescape Escape View File

6 arch/alpha/kernel/osf_sys.c Unescape Escape View File

5 arch/arm/boot/dts/at91-sama5d3_xplained.dts Unescape Escape View File

17 arch/arm/boot/dts/imx6sx-sdb.dts Unescape Escape View File

4 arch/arm/boot/dts/keystone-k2l-netcp.dtsi Unescape Escape View File

8 arch/arm/boot/dts/keystone-k2l.dtsi Unescape Escape View File

3 arch/arm/include/asm/device.h Unescape Escape View File

12 arch/arm/include/asm/dma-mapping.h Unescape Escape View File

2 arch/arm/include/asm/fixmap.h Unescape Escape View File

3 arch/arm/include/asm/kvm_coproc.h Unescape Escape View File

9 arch/arm/include/asm/module.h Unescape Escape View File

85 arch/arm/kernel/module-plts.c Unescape Escape View File

1 arch/arm/kernel/module.lds Unescape Escape View File

4 arch/arm/kernel/setup.c Unescape Escape View File

77 arch/arm/kvm/coproc.c Unescape Escape View File

4 arch/arm/kvm/handle_exit.c Unescape Escape View File

2 arch/arm/kvm/hyp/Makefile Unescape Escape View File

4 arch/arm/kvm/hyp/switch.c Unescape Escape View File

5 arch/arm/kvm/init.S Unescape Escape View File

36 arch/arm/kvm/mmu.c Unescape Escape View File

8 arch/arm/kvm/psci.c Unescape Escape View File

7 arch/arm/mm/dma-mapping.c Unescape Escape View File

16 arch/arm/mm/mmu.c Unescape Escape View File

4 arch/arm/mm/proc-v7m.S Unescape Escape View File

3 arch/arm64/boot/dts/hisilicon/hi6220.dtsi Unescape Escape View File

9 arch/arm64/include/asm/asm-uaccess.h Unescape Escape View File

20 arch/arm64/include/asm/barrier.h Unescape Escape View File

2 arch/arm64/include/asm/cmpxchg.h Unescape Escape View File

3 arch/arm64/include/asm/device.h Unescape Escape View File

13 arch/arm64/include/asm/dma-mapping.h Unescape Escape View File

6 arch/arm64/include/asm/kvm_emulate.h Unescape Escape View File

4 arch/arm64/include/asm/sysreg.h Unescape Escape View File

9 arch/arm64/include/asm/uaccess.h Unescape Escape View File

3 arch/arm64/kernel/armv8_deprecated.c Unescape Escape View File

5 arch/arm64/kernel/entry.S Unescape Escape View File

3 arch/arm64/kernel/hw_breakpoint.c Unescape Escape View File

4 arch/arm64/kernel/traps.c Unescape Escape View File

11 arch/arm64/kvm/hyp-init.S Unescape Escape View File

2 arch/arm64/kvm/hyp/Makefile Unescape Escape View File

8 arch/arm64/kvm/sys_regs.c Unescape Escape View File

7 arch/arm64/mm/dma-mapping.c Unescape Escape View File

13 arch/arm64/net/bpf_jit_comp.c Unescape Escape View File

49 arch/metag/include/asm/uaccess.h Unescape Escape View File

1 arch/mips/Kconfig Unescape Escape View File

1 arch/mips/kernel/process.c Unescape Escape View File

2 arch/openrisc/kernel/process.c Unescape Escape View File

17 arch/powerpc/include/asm/mmu_context.h Unescape Escape View File

14 arch/powerpc/include/asm/topology.h Unescape Escape View File

19 arch/powerpc/kernel/eeh_driver.c Unescape Escape View File

12 arch/powerpc/kernel/exceptions-64e.S Unescape Escape View File

2 arch/powerpc/kernel/mce.c Unescape Escape View File

1 arch/powerpc/kernel/nvram_64.c Unescape Escape View File

22 arch/powerpc/kernel/process.c Unescape Escape View File

2 arch/powerpc/kernel/prom.c Unescape Escape View File

4 arch/powerpc/kernel/setup_64.c Unescape Escape View File

6 arch/powerpc/kernel/sysfs.c Unescape Escape View File

4 arch/powerpc/kernel/traps.c Unescape Escape View File

7 arch/powerpc/mm/dump_linuxpagetables.c Unescape Escape View File

4 arch/powerpc/mm/mmu_context_iommu.c Unescape Escape View File

4 arch/powerpc/platforms/cell/spu_base.c Unescape Escape View File

8 arch/powerpc/platforms/powernv/npu-dma.c Unescape Escape View File

10 arch/powerpc/platforms/powernv/pci-ioda.c Unescape Escape View File

2 arch/powerpc/platforms/powernv/pci.h Unescape Escape View File

1 arch/powerpc/platforms/pseries/dlpar.c Unescape Escape View File

2 arch/powerpc/platforms/pseries/hotplug-memory.c Unescape Escape View File

3 arch/powerpc/sysdev/simple_gpio.c Unescape Escape View File

15 arch/s390/kernel/crash_dump.c Unescape Escape View File

40 arch/s390/kernel/entry.S Unescape Escape View File

4 arch/sparc/Kconfig Unescape Escape View File

6 arch/sparc/include/asm/hugetlb.h Unescape Escape View File

2 arch/sparc/include/asm/mmu_64.h Unescape Escape View File

32 arch/sparc/include/asm/mmu_context_64.h Unescape Escape View File

4 arch/sparc/include/asm/pgtable_32.h Unescape Escape View File

1 arch/sparc/include/asm/pil.h Unescape Escape View File

2 arch/sparc/include/asm/setup.h Unescape Escape View File

1 arch/sparc/include/asm/vio.h Unescape Escape View File

621 Commits

v4.14.183 ... v4.11.6

16

Documentation/acpi/acpi-lid.txt

View File

62

Documentation/arm64/tagged-pointers.txt

View File

2

Makefile

View File

6

arch/alpha/kernel/osf_sys.c

View File

5

arch/arm/boot/dts/at91-sama5d3_xplained.dts

View File

17

arch/arm/boot/dts/imx6sx-sdb.dts

View File

4

arch/arm/boot/dts/keystone-k2l-netcp.dtsi

View File

8

arch/arm/boot/dts/keystone-k2l.dtsi

View File

3

arch/arm/include/asm/device.h

View File

12

arch/arm/include/asm/dma-mapping.h

View File

2

arch/arm/include/asm/fixmap.h

View File

3

arch/arm/include/asm/kvm_coproc.h

View File

9

arch/arm/include/asm/module.h

View File

85

arch/arm/kernel/module-plts.c

View File

1

arch/arm/kernel/module.lds

View File

4

arch/arm/kernel/setup.c

View File

77

arch/arm/kvm/coproc.c

View File

4

arch/arm/kvm/handle_exit.c

View File

2

arch/arm/kvm/hyp/Makefile

View File

4

arch/arm/kvm/hyp/switch.c

View File

5

arch/arm/kvm/init.S

View File

36

arch/arm/kvm/mmu.c

View File

8

arch/arm/kvm/psci.c

View File

7

arch/arm/mm/dma-mapping.c

View File

16

arch/arm/mm/mmu.c

View File

4

arch/arm/mm/proc-v7m.S

View File

3

arch/arm64/boot/dts/hisilicon/hi6220.dtsi

View File

9

arch/arm64/include/asm/asm-uaccess.h

View File

20

arch/arm64/include/asm/barrier.h

View File

2

arch/arm64/include/asm/cmpxchg.h

View File

3

arch/arm64/include/asm/device.h

View File

13

arch/arm64/include/asm/dma-mapping.h

View File

6

arch/arm64/include/asm/kvm_emulate.h

View File

4

arch/arm64/include/asm/sysreg.h

View File

9

arch/arm64/include/asm/uaccess.h

View File

3

arch/arm64/kernel/armv8_deprecated.c

View File

5

arch/arm64/kernel/entry.S

View File

3

arch/arm64/kernel/hw_breakpoint.c

View File

4

arch/arm64/kernel/traps.c

View File

11

arch/arm64/kvm/hyp-init.S

View File

2

arch/arm64/kvm/hyp/Makefile

View File

8

arch/arm64/kvm/sys_regs.c

View File

7

arch/arm64/mm/dma-mapping.c

View File

13

arch/arm64/net/bpf_jit_comp.c

View File

49

arch/metag/include/asm/uaccess.h

View File

1

arch/mips/Kconfig

View File

1

arch/mips/kernel/process.c

View File

2

arch/openrisc/kernel/process.c

View File

17

arch/powerpc/include/asm/mmu_context.h

View File

14

arch/powerpc/include/asm/topology.h

View File

19

arch/powerpc/kernel/eeh_driver.c

View File

12

arch/powerpc/kernel/exceptions-64e.S

View File

2

arch/powerpc/kernel/mce.c

View File

1

arch/powerpc/kernel/nvram_64.c

View File

22

arch/powerpc/kernel/process.c

View File

2

arch/powerpc/kernel/prom.c

View File

4

arch/powerpc/kernel/setup_64.c

View File

6

arch/powerpc/kernel/sysfs.c

View File

4

arch/powerpc/kernel/traps.c

View File

7

arch/powerpc/mm/dump_linuxpagetables.c

View File

4

arch/powerpc/mm/mmu_context_iommu.c

View File

4

arch/powerpc/platforms/cell/spu_base.c

View File

8

arch/powerpc/platforms/powernv/npu-dma.c

View File

10

arch/powerpc/platforms/powernv/pci-ioda.c

View File

2

arch/powerpc/platforms/powernv/pci.h

View File

1

arch/powerpc/platforms/pseries/dlpar.c

View File

2

arch/powerpc/platforms/pseries/hotplug-memory.c

View File

3

arch/powerpc/sysdev/simple_gpio.c

View File

15

arch/s390/kernel/crash_dump.c

View File

40

arch/s390/kernel/entry.S

View File

4

arch/sparc/Kconfig

View File

6

arch/sparc/include/asm/hugetlb.h

View File

2

arch/sparc/include/asm/mmu_64.h

View File

32

arch/sparc/include/asm/mmu_context_64.h

View File

4

arch/sparc/include/asm/pgtable_32.h

View File

1

arch/sparc/include/asm/pil.h

View File

2

arch/sparc/include/asm/setup.h

View File

1

arch/sparc/include/asm/vio.h

View File

15

arch/sparc/kernel/ftrace.c

View File