commit 514d748f69 upstream.
Commit e4502c63f5 (ufs: deal with nfsd/iget races) made ufs
create inodes with I_NEW flag set. However ufs_mkdir() never cleared
this flag. Thus if someone ever tried to lookup the directory by inode
number, he would deadlock waiting for I_NEW to be cleared. Luckily this
mostly happens only if the filesystem is exported over NFS since
otherwise we have the inode attached to dentry and don't look it up by
inode number. In rare cases dentry can get freed without inode being
freed and then we'd hit the deadlock even without NFS export.
Fix the problem by clearing I_NEW before instantiating new directory
inode.
Fixes: e4502c63f5
Reported-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12ecbb4b1d upstream.
Commit e4502c63f5 (ufs: deal with nfsd/iget races) introduced
unlock_new_inode() call into ufs_add_nondir(). However that function
gets called also from ufs_link() which hands it already initialized
inode and thus unlock_new_inode() complains. The problem is harmless but
annoying.
Fix the problem by opencoding necessary stuff in ufs_link()
Fixes: e4502c63f5
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ceeb0e5d39 upstream.
Limit the mounts fs_fully_visible considers to locked mounts.
Unlocked can always be unmounted so considering them adds hassle
but no security benefit.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93e3bce628 upstream.
The warning message in prepend_path is unclear and outdated. It was
added as a warning that the mechanism for generating names of pseudo
files had been removed from prepend_path and d_dname should be used
instead. Unfortunately the warning reads like a general warning,
making it unclear what to do with it.
Remove the warning. The transition it was added to warn about is long
over, and I added code several years ago which in rare cases causes
the warning to fire on legitimate code, and the warning is now firing
and scaring people for no good reason.
Reported-by: Ivan Delalande <colona@arista.com>
Reported-by: Omar Sandoval <osandov@osandov.com>
Fixes: f48cfddc67 ("vfs: In d_path don't call d_dname on a mount point")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2426f39100 upstream.
file_remove_suid() could mistakenly set S_NOSEC inode bit when root was
modifying the file. As a result following writes to the file by ordinary
user would avoid clearing suid or sgid bits.
Fix the bug by checking actual mode bits before setting S_NOSEC.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42720138b0 upstream.
Writes were a bit racy, but hard to turn into a bug at the same time.
(Particularly because modern Linux doesn't use this feature anymore.)
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
[Actually the next patch makes it much, much easier to trigger the race
so I'm including this one for stable@ as well. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 431dae778a upstream.
Eric noticed problems with vhost-scsi and virtio-ccw: vhost-scsi
complained about overwriting values in the config space, which
was triggered by a broken implementation of virtio-ccw's config
get/set routines. It was probably sheer luck that we did not hit
this before.
When writing a value to the config space, the WRITE_CONF ccw will
always write from the beginning of the config space up to and
including the value to be set. If the config space up to the value
has not yet been retrieved from the device, however, we'll end up
overwriting values. Keep track of the known config space and update
if needed to avoid this.
Moreover, READ_CONF will only read the number of bytes it has been
instructed to retrieve, so we must not copy more than that to the
buffer, or we might overwrite trailing values.
Reported-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
Tested-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c8e5105e7 upstream.
The REGSET_VX_LOW ELF notes should contain the lower 64 bit halfes of the
first sixteen 128 bit vector registers. Unfortunately currently we copy
the upper halfes.
Fix this and correctly copy the lower halfes.
Fixes: a62bc07392 ("s390/kdump: add support for vector extension")
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b938eacea0 upstream.
Commit ea5f496925 ("KVM: s390: only one external call may be pending
at a time") introduced a bug on machines that don't have SIGP
interpretation facility installed.
The injection of an external call will now always fail with -EBUSY
(if none is already pending).
This leads to the following symptoms:
- An external call will be injected but with the wrong "src cpu id",
as this id will not be remembered.
- The target vcpu will not be woken up, therefore the guest will hang if
it cannot deal with unexpected failures of the SIGP EXTERNAL CALL
instruction.
- If an external call is already pending, -EBUSY will not be reported.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e748c8d09 upstream.
KVM guest kernels for trap & emulate run in user mode, with a modified
set of kernel memory segments. However the fixmap address is still in
the normal KSeg3 region at 0xfffe0000 regardless, causing problems when
cache alias handling makes use of them when handling copy on write.
Therefore define FIXADDR_TOP as 0x7ffe0000 in the guest kernel mapped
region when CONFIG_KVM_GUEST is defined.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9887/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69a1220060 upstream.
The argument to KVM_GET_DIRTY_LOG is a memslot id; it may not match the
position in the memslots array, which is sorted by gfn.
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1dace0116d upstream.
The Foxconn K8M890-8237A has two PCI host bridges, and we can't assign
resources correctly without the information from _CRS that tells us which
address ranges are claimed by which bridge. In the bugs mentioned below,
we incorrectly assign a sound card address (this example is from 1033299):
bus: 00 index 2 [mem 0x80000000-0xfcffffffff]
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f])
pci_root PNP0A08:00: host bridge window [mem 0x80000000-0xbfefffff] (ignored)
pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xdfffffff] (ignored)
pci_root PNP0A08:00: host bridge window [mem 0xf0000000-0xfebfffff] (ignored)
ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-ff])
pci_root PNP0A08:01: host bridge window [mem 0xbff00000-0xbfffffff] (ignored)
pci 0000:80:01.0: [1106:3288] type 0 class 0x000403
pci 0000:80:01.0: reg 10: [mem 0xbfffc000-0xbfffffff 64bit]
pci 0000:80:01.0: address space collision: [mem 0xbfffc000-0xbfffffff 64bit] conflicts with PCI Bus #00 [mem 0x80000000-0xfcffffffff]
pci 0000:80:01.0: BAR 0: assigned [mem 0xfd00000000-0xfd00003fff 64bit]
BUG: unable to handle kernel paging request at ffffc90000378000
IP: [<ffffffffa0345f63>] azx_create+0x37c/0x822 [snd_hda_intel]
We assigned 0xfd_0000_0000, but that is not in any of the host bridge
windows, and the sound card doesn't work.
Turn on pci=use_crs automatically for this system.
Link: https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/931368
Link: https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1033299
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d9fecf6bf upstream.
We enable _CRS on all systems from 2008 and later. On older systems, we
ignore _CRS and assume the whole physical address space (excluding RAM and
other devices) is available for PCI devices, but on systems that support
physical address spaces larger than 4GB, it's doubtful that the area above
4GB is really available for PCI.
After d56dbf5bab ("PCI: Allocate 64-bit BARs above 4G when possible"), we
try to use that space above 4GB *first*, so we're more likely to put a
device there.
On Juan's Toshiba Satellite Pro U200, BIOS left the graphics, sound, 1394,
and card reader devices unassigned (but only after Windows had been
booted). Only the sound device had a 64-bit BAR, so it was the only device
placed above 4GB, and hence the only device that didn't work.
Keep _CRS enabled even on pre-2008 systems if they support physical address
space larger than 4GB.
Fixes: d56dbf5bab ("PCI: Allocate 64-bit BARs above 4G when possible")
Reported-and-tested-by: Juan Dayer <jdayer@outlook.com>
Reported-and-tested-by: Alan Horsfield <alan@hazelgarth.co.uk>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99221
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=907092
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72e349f112 upstream.
When we take a PMU exception or a software event we call
perf_read_regs(). This overloads regs->result with a boolean that
describes if we should use the sampled instruction address register
(SIAR) or the regs.
If the exception is in kernel, we start with the kernel regs and
backtrace through the kernel stack. At this point we switch to the
userspace regs and backtrace the user stack with perf_callchain_user().
Unfortunately these regs have not got the perf_read_regs() treatment,
so regs->result could be anything. If it is non zero,
perf_instruction_pointer() decides to use the SIAR, and we get issues
like this:
0.11% qemu-system-ppc [kernel.kallsyms] [k] _raw_spin_lock_irqsave
|
---_raw_spin_lock_irqsave
|
|--52.35%-- 0
| |
| |--46.39%-- __hrtimer_start_range_ns
| | kvmppc_run_core
| | kvmppc_vcpu_run_hv
| | kvmppc_vcpu_run
| | kvm_arch_vcpu_ioctl_run
| | kvm_vcpu_ioctl
| | do_vfs_ioctl
| | sys_ioctl
| | system_call
| | |
| | |--67.08%-- _raw_spin_lock_irqsave <--- hi mum
| | | |
| | | --100.00%-- 0x7e714
| | | 0x7e714
Notice the bogus _raw_spin_irqsave when we transition from kernel
(system_call) to userspace (0x7e714). We inserted what was in the SIAR.
Add a check in regs_use_siar() to check that the regs in question
are from a PMU exception. With this fix the backtrace makes sense:
0.47% qemu-system-ppc [kernel.vmlinux] [k] _raw_spin_lock_irqsave
|
---_raw_spin_lock_irqsave
|
|--53.83%-- 0
| |
| |--44.73%-- hrtimer_try_to_cancel
| | kvmppc_start_thread
| | kvmppc_run_core
| | kvmppc_vcpu_run_hv
| | kvmppc_vcpu_run
| | kvm_arch_vcpu_ioctl_run
| | kvm_vcpu_ioctl
| | do_vfs_ioctl
| | sys_ioctl
| | system_call
| | __ioctl
| | 0x7e714
| | 0x7e714
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cc5a2f7b8f upstream.
On some archs, the local clockevent device stops in deep cpuidle states.
The broadcast framework is used to wakeup cpus in these idle states, in
which either an external clockevent device is used to send wakeup ipis
or the hrtimer broadcast framework kicks in in the absence of such a
device. One cpu is nominated as the broadcast cpu and this cpu sends
wakeup ipis to sleeping cpus at the appropriate time. This is the
implementation in the oneshot mode of broadcast.
In periodic mode of broadcast however, the presence of such cpuidle
states results in the cpuidle driver calling tick_broadcast_enable()
which shuts down the local clockevent devices of all the cpus and
appoints the tick broadcast device as the clockevent device for each of
them. This works on those archs where the tick broadcast device is a
real clockevent device. But on archs which depend on the hrtimer mode
of broadcast, the tick broadcast device hapens to be a pseudo device.
The consequence is that the local clockevent devices of all cpus are
shutdown and the kernel hangs at boot time in periodic mode.
Let us thus not register the cpuidle states which have
CPUIDLE_FLAG_TIMER_STOP flag set, on archs which depend on the hrtimer
mode of broadcast in periodic mode. This patch takes care of doing this
on powerpc. The cpus would not have entered into such deep cpuidle
states in periodic mode on powerpc anyway. So there is no loss here.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f5bc307be upstream.
The current Armada XP suspend to RAM implementation, as added in
commit 27432825ae ("ARM: mvebu: Armada XP GP specific
suspend/resume code") does not handle big-endian configurations
properly: the small bit of assembly code putting the DRAM in
self-refresh and toggling the GPIOs to turn off power forgets to
convert the values to little-endian.
This commit fixes that by making sure the two values we will write to
the DRAM controller register and GPIO register are already in
little-endian before entering the critical assembly code.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Fixes: 27432825ae ("ARM: mvebu: Armada XP GP specific suspend/resume code")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d48edb3c3 upstream.
Commit 7232398abc ("ARM: tegra: Convert PMC to a driver") changed tegra_resume()
location storing from late to early and, as a result, broke suspend on Tegra20.
PMC scratch register 41 is used by tegra LP1 resume code for retrieving stored
physical memory address of common resume function and in the same time used by
tegra20_cpu_shutdown() (shared by Tegra20 cpuidle driver and platform SMP code),
which is storing CPU1 "resettable" status. It implies strict order of scratch
register usage, otherwise resume function address is lost on Tegra20 after
disabling non-boot CPU's on suspend. Fix it by storing "resettable" status in
IRAM instead of PMC scratch register.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Fixes: 7232398abc (ARM: tegra: Convert PMC to a driver)
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e2d997366d upstream.
According to the PSCI specification and the SMC/HVC calling
convention, PSCI function_ids that are not implemented must
return NOT_SUPPORTED as return value.
Current KVM implementation takes an unhandled PSCI function_id
as an error and injects an undefined instruction into the guest
if PSCI implementation is called with a function_id that is not
handled by the resident PSCI version (ie it is not implemented),
which is not the behaviour expected by a guest when calling a
PSCI function_id that is not implemented.
This patch fixes this issue by returning NOT_SUPPORTED whenever
the kvm PSCI call is executed for a function_id that is not
implemented by the PSCI kvm layer.
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85e84ba310 upstream.
On VM entry, we disable access to the VFP registers in order to
perform a lazy save/restore of these registers.
On VM exit, we restore access, test if we did enable them before,
and save/restore the guest/host registers if necessary. In this
sequence, the FPEXC register is always accessed, irrespective
of the trapping configuration.
If the guest didn't touch the VFP registers, then the HCPTR access
has now enabled such access, but we're missing a barrier to ensure
architectural execution of the new HCPTR configuration. If the HCPTR
access has been delayed/reordered, the subsequent access to FPEXC
will cause a trap, which we aren't prepared to handle at all.
The same condition exists when trapping to enable VFP for the guest.
The fix is to introduce a barrier after enabling VFP access. In the
vmexit case, it can be relaxed to only takes place if the guest hasn't
accessed its view of the VFP registers, making the access to FPEXC safe.
The set_hcptr macro is modified to deal with both vmenter/vmexit and
vmtrap operations, and now takes an optional label that is branched to
when the guest hasn't touched the VFP registers.
Reported-by: Vikram Sethi <vikrams@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9fc2b4b436 upstream.
Before calling into the filesystem, vfs_setxattr calls
security_inode_setxattr, which ends up calling selinux_inode_setxattr in
our case. That returns -EOPNOTSUPP whenever SBLABEL_MNT is not set.
SBLABEL_MNT was supposed to be set by sb_finish_set_opts, which sets it
only if selinux_is_sblabel_mnt returns true.
The selinux_is_sblabel_mnt logic was broken by eadcabc697 "SELinux: do
all flags twiddling in one place", which didn't take into the account
the SECURITY_FS_USE_NATIVE behavior that had been introduced for nfs
with eb9ae68650 "SELinux: Add new labeling type native labels".
This caused setxattr's of security labels over NFSv4.2 to fail.
Cc: Eric Paris <eparis@redhat.com>
Cc: David Quigley <dpquigl@davequigley.com>
Reported-by: Richard Chan <rc556677@outlook.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: added the stable dependency]
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0dd23f9425 upstream.
Commit 007bea098b (intel_pstate: Add setting voltage value for
baytrail P states.) introduced byt_set_pstate() with the assumption that
it would always be run by the CPU whose MSR is to be written by it. It
turns out, however, that is not always the case in practice, so modify
byt_set_pstate() to enforce the MSR write done by it to always happen on
the right CPU.
Fixes: 007bea098b (intel_pstate: Add setting voltage value for baytrail P states.)
Signed-off-by: Joe Konno <joe.konno@intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 62a7f368ff upstream.
When dma mapping (dma_map_sg) fails in sdhci_pre_dma_transfer, -EINVAL
is returned. There are 3 callers of sdhci_pre_dma_transfer:
* sdhci_pre_req and sdhci_adma_table_pre: handle negative return
* sdhci_prepare_data: handles 0 (error) and "else" (good) only
sdhci_prepare_data is therefore broken. When it receives -EINVAL from
sdhci_pre_dma_transfer, it assumes 1 sg mapping was mapped. Later,
this non-existent mapping with address 0 is kmap'ped and written to:
Corrupted low memory at ffff880000001000 (1000 phys) = 22b7d67df2f6d1cf
Corrupted low memory at ffff880000001008 (1008 phys) = 63848a5216b7dd95
Corrupted low memory at ffff880000001010 (1010 phys) = 330eb7ddef39e427
Corrupted low memory at ffff880000001018 (1018 phys) = 8017ac7295039bda
Corrupted low memory at ffff880000001020 (1020 phys) = 8ce039eac119074f
...
So teach sdhci_prepare_data to understand negative return values from
sdhci_pre_dma_transfer and disable DMA in that case, as well as for
zero.
It was introduced in 348487cb28 (mmc:
sdhci: use pipeline mmc requests to improve performance). The commit
seems to be suspicious also by assigning host->sg_count both in
sdhci_pre_dma_transfer and sdhci_adma_table_pre.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Fixes: 348487cb28
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Haibo Chen <haibo.chen@freescale.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b3fff54bc upstream.
Make sure that we are skipping over large PTEs while walking
the page-table tree.
Fixes: 5c34c403b7 ("iommu/amd: Fix memory leak in free_pagetable")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d38f0ff9ab upstream.
Commit 83a60ed8f0 ("iommu/arm-smmu: fix ARM_SMMU_FEAT_TRANS_OPS
condition") accidentally negated the ID0_ATOSNS predicate in the ATOS
feature check, causing the driver to attempt ATOS requests on SMMUv2
hardware without the ATOS feature implemented.
This patch restores the predicate to the correct value.
Reported-by: Varun Sethi <varun.sethi@freescale.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f14e9ad17f upstream.
ffs_closed can race with configfs_rmdir which will call config_item_release, so
add an extra check to avoid calling the unregister_gadget_item with an null
gadget item.
Signed-off-by: Rui Miguel Silva <rui.silva@linaro.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 342f39a6c8 upstream.
when copying to iter the size can be different then the iov count,
the check for full iov is wrong and make any read on request which
is not the exactly size of iov to return -EFAULT.
So, just check the success of the copy.
Signed-off-by: Rui Miguel Silva <rui.silva@linaro.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b65657fc24 ]
The Ethernet controller found in the Armada 370, 380 and 385 SoCs don't
support TCP/IP checksumming with frame sizes larger than 1600 bytes.
This patch fixes the issue by disabling the features NETIF_F_IP_CSUM and
NETIF_F_TSO for the Armada 370 and compatibles SoCs when the MTU is set
to a value greater than 1600 bytes.
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Fixes: c5aff18204 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Cc: <stable@vger.kernel.org> # v3.8+
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f522a975a8 ]
The mvneta driver supports the Ethernet IP found in the Armada 370, XP,
380 and 385 SoCs. Since at least one more hardware feature is available
for the Armada XP SoCs then a way to identify them is needed.
This patch introduces a new compatible string "marvell,armada-xp-neta".
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Fixes: c5aff18204 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Cc: <stable@vger.kernel.org> # v3.8+
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 472cfe7127 ]
When allocating Rx related buffers, alloc_pages is called using an order
number that is decreased until successful. A system under stress can
experience failures during this allocation process resulting in a warning
being issued. This message can be of concern to end users even though the
failure is not fatal. Since the failure is not fatal and can occur
multiple times, the driver should include the __GFP_NOWARN flag to
suppress the warning message from being issued.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eb686231fc ]
When limiting phy link speed using "max-speed" to 100mbps or less on a
giga bit phy, phy never completes auto negotiation and phy state
machine is held in PHY_AN. Fixing this issue by comparing the giga
bit advertise though phydev->supported doesn't have it but phy has
BMSR_ESTATEN set. So that auto negotiation is restarted as old and
new advertise are different and link comes up fine.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7254acffee ]
When in HA mode, the driver exposes an IB (RoCE) device instance with only
one port. Under SRIOV, the existing implementation doesn't go well with
the PF RoCE driver's role of Special QPs Para-Virtualization, etc.
As such, disable HA for the mlx4 PF RoCE device in SRIOV mode.
Fixes: a575009030 ('IB/mlx4: Add port aggregation support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79a258526c ]
The check_csum() function relied on hwtstamp_rx_filter to know if rxvlan
offload is disabled. This is wrong since rxvlan offload can be switched
on/off regardless of hwtstamp_rx_filter.
Also moved check_csum to query CQE information to identify VLAN packets
and removed the check of IP packets, since it has been validated before.
Fixes: f8c6455bb0 ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 488a9b48e3 ]
Indication of a single completed packet, marked by txbbs_skipped
being bigger then zero, in not enough in order to wake up a
stopped TX queue. The completed packet may contain a single TXBB,
while next packet to be sent (after the wake up) may have multiple
TXBBs (LSO/TSO packets for example), causing overflow in queue followed
by WQE corruption and TX queue timeout.
Instead, wake the stopped queue only when there's enough room for the
worst case (maximum sized WQE) packet that we should need to handle after
the queue is opened again.
Also created an helper routine - mlx4_en_is_tx_ring_full, which checks
if the current TX ring is full or not. It provides better code readability
and removes code duplication.
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0eb08514fd ]
TX ring QP wasn't released at mlx4_en_destroy_tx_ring. Instead, the code
used the deprecated base_tx_qpn field. Move TX QP release to
mlx4_en_destroy_tx_ring and remove the base_tx_qpn field.
Fixes: ddae0349fd ('net/mlx4: Change QP allocation scheme')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 34b99df4e6 ]
ICMP messages can trigger ICMP and local errors. In this case
serr->port is 0 and starting from Linux 4.0 we do not return
the original target address to the error queue readers.
Add function to define which errors provide addr_offset.
With this fix my ping command is not silent anymore.
Fixes: c247f0534c ("ip: fix error queue empty skb handling")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2c51a97f76 ]
The lockless lookups can return entry that is unlinked.
Sometimes they get reference before last neigh_cleanup_and_release,
sometimes they do not need reference. Later, any
modification attempts may result in the following problems:
1. entry is not destroyed immediately because neigh_update
can start the timer for dead entry, eg. on change to NUD_REACHABLE
state. As result, entry lives for some time but is invisible
and out of control.
2. __neigh_event_send can run in parallel with neigh_destroy
while refcnt=0 but if timer is started and expired refcnt can
reach 0 for second time leading to second neigh_destroy and
possible crash.
Thanks to Eric Dumazet and Ying Xue for their work and analyze
on the __neigh_event_send change.
Fixes: 767e97e1e0 ("neigh: RCU conversion of struct neighbour")
Fixes: a263b30936 ("ipv4: Make neigh lookups directly in output packet path.")
Fixes: 6fd6ce2056 ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 468479e604 ]
PACKET_FANOUT_LB computes f->rr_cur such that it is modulo
f->num_members. It returns the old value unconditionally, but
f->num_members may have changed since the last store. Ensure
that the return value is always < num.
When modifying the logic, simplify it further by replacing the loop
with an unconditional atomic increment.
Fixes: dc99f60069 ("packet: Add fanout support.")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f98f4514d0 ]
We need to tell compiler it must not read f->num_members multiple
times. Otherwise testing if num is not zero is flaky, and we could
attempt an invalid divide by 0 in fanout_demux_cpu()
Note bug was present in packet_rcv_fanout_hash() and
packet_rcv_fanout_lb() but final 3.1 had a simple location
after commit 95ec3eb417 ("packet: Add 'cpu' fanout policy.")
Fixes: dc99f60069 ("packet: Add fanout support.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2dab80a8b4 ]
After the ->set() spinlocks were removed br_stp_set_bridge_priority
was left running without any protection when used via sysfs. It can
race with port add/del and could result in use-after-free cases and
corrupted lists. Tested by running port add/del in a loop with stp
enabled while setting priority in a loop, crashes are easily
reproducible.
The spinlocks around sysfs ->set() were removed in commit:
14f98f258f ("bridge: range check STP parameters")
There's also a race condition in the netlink priority support that is
fixed by this change, but it was introduced recently and the fixes tag
covers it, just in case it's needed the commit is:
af615762e9 ("bridge: add ageing_time, stp_state, priority over netlink")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Fixes: 14f98f258f ("bridge: range check STP parameters")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2d45a02d01 ]
->auto_asconf_splist is per namespace and mangled by functions like
sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization.
Also, the call to inet_sk_copy_descendant() was backuping
->auto_asconf_list through the copy but was not honoring
->do_auto_asconf, which could lead to list corruption if it was
different between both sockets.
This commit thus fixes the list handling by using ->addr_wq_lock
spinlock to protect the list. A special handling is done upon socket
creation and destruction for that. Error handlig on sctp_init_sock()
will never return an error after having initialized asconf, so
sctp_destroy_sock() can be called without addrq_wq_lock. The lock now
will be take on sctp_close_sock(), before locking the socket, so we
don't do it in inverse order compared to sctp_addr_wq_timeout_handler().
Instead of taking the lock on sctp_sock_migrate() for copying and
restoring the list values, it's preferred to avoid rewritting it by
implementing sctp_copy_descendant().
Issue was found with a test application that kept flipping sysctl
default_auto_asconf on and off, but one could trigger it by issuing
simultaneous setsockopt() calls on multiple sockets or by
creating/destroying sockets fast enough. This is only triggerable
locally.
Fixes: 9f7d653b67 ("sctp: Add Auto-ASCONF support (core).")
Reported-by: Ji Jianwen <jiji@redhat.com>
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fb05e7a89f ]
We saw excessive direct memory compaction triggered by skb_page_frag_refill.
This causes performance issues and add latency. Commit 5640f76858
introduces the order-3 allocation. According to the changelog, the order-3
allocation isn't a must-have but to improve performance. But direct memory
compaction has high overhead. The benefit of order-3 allocation can't
compensate the overhead of direct memory compaction.
This patch makes the order-3 page allocation atomic. If there is no memory
pressure and memory isn't fragmented, the alloction will still success, so we
don't sacrifice the order-3 benefit here. If the atomic allocation fails,
direct memory compaction will not be triggered, skb_page_frag_refill will
fallback to order-0 immediately, hence the direct memory compaction overhead is
avoided. In the allocation failure case, kswapd is waken up and doing
compaction, so chances are allocation could success next time.
alloc_skb_with_frags is the same.
The mellanox driver does similar thing, if this is accepted, we must fix
the driver too.
V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer
Cc: Eric Dumazet <edumazet@google.com>
Cc: Chris Mason <clm@fb.com>
Cc: Debabrata Banerjee <dbavatar@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 58c98be137 ]
When programming the start of a periodic output, the code wrongly places
the seconds value into the "low" register and the nanoseconds into the
"high" register. Even though this is backwards, it slipped through my
testing, because the re-arming code in the interrupt service routine is
correct, and the signal does appear starting with the second edge.
This patch fixes the issue by programming the registers correctly.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1a040eaca1 ]
Since the addition of sysfs multicast router support if one set
multicast_router to "2" more than once, then the port would be added to
the hlist every time and could end up linking to itself and thus causing an
endless loop for rlist walkers.
So to reproduce just do:
echo 2 > multicast_router; echo 2 > multicast_router;
in a bridge port and let some igmp traffic flow, for me it hangs up
in br_multicast_flood().
Fix this by adding a check in br_multicast_add_router() if the port is
already linked.
The reason this didn't happen before the addition of multicast_router
sysfs entries is because there's a !hlist_unhashed check that prevents
it.
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Fixes: 0909e11758 ("bridge: Add multicast_router sysfs entries")
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Upstream commit 671d773297
Since it is possible for vnet_event_napi to end up doing
vnet_control_pkt_engine -> ... -> vnet_send_attr ->
vnet_port_alloc_tx_ring -> ldc_alloc_exp_dring -> kzalloc()
(i.e., in softirq context), kzalloc() should be called with
GFP_ATOMIC from ldc_alloc_exp_dring.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f104765b4f upstream.
If hardware doesn't support DecodeAssist - a feature that provides
more information about the intercept in the VMCB, KVM decodes the
instruction and then updates the next_rip vmcb control field.
However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS].
Since skip_emulated_instruction() doesn't verify nrip support
before accepting control.next_rip as valid, avoid writing this
field if support isn't present.
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 16c45eda96 upstream.
Fix a race condition and unnecessary locking:
* the root rb_node must only be accessed under the lock in nft_rbtree_lookup()
* the lock is not needed in lookup functions in netlink context
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0aab374709 upstream.
Patches 7cba160ad "powernv/cpuidle: Redesign idle states management"
and 77b54e9f2 "powernv/powerpc: Add winkle support for offline cpus"
use non-volatile condition registers (cr2, cr3 and cr4) early in the system
reset interrupt handler (system_reset_pSeries()) before it has been determined
if state loss has occurred. If state loss has not occurred, control returns via
the power7_wakeup_noloss() path which does not restore those condition
registers, leaving them corrupted.
Fix this by restoring the condition registers in the power7_wakeup_noloss()
case.
This is apparent when running a KVM guest on hardware that does not
support winkle or sleep and the guest makes use of secondary threads. In
practice this means Power7 machines, though some early unreleased Power8
machines may also be susceptible.
The secondary CPUs are taken off line before the guest is started and
they call pnv_smp_cpu_kill_self(). This checks support for sleep
states (in this case there is no support) and power7_nap() is called.
When the CPU is woken, power7_nap() returns and because the CPU is
still off line, the main while loop executes again. The sleep states
support test is executed again, but because the tested values cannot
have changed, the compiler has optimized the test away and instead we
rely on the result of the first test, which has been left in cr3
and/or cr4. With the result overwritten, the wrong branch is taken and
power7_winkle() is called on a CPU that does not support it, leading
to it stalling.
Fixes: 7cba160ad7 ("powernv/cpuidle: Redesign idle states management")
Fixes: 77b54e9f21 ("powernv/powerpc: Add winkle support for offline cpus")
[mpe: Massage change log a bit more]
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 364aece01a upstream.
This patch fixes a timing issue that causes a GPU hang when the system
comes out of power saving.
During pm_resume, We are submitting batchbuffers before enabling
Interrupts this is causing us to miss the context switch interrupt,
and in consequence intel_execlists_handle_ctx_events is not triggered.
This patch is based on a patch from Deepak S <deepak.s@intel.com>
from another platform.
The patch fixes an issue introduced by:
commit e7778be1ea
drm/i915: Fix startup failure in LRC mode after recent init changes
The above patch added a call to init_context() to fix an issue introduced
by a previous patch. But, it then opened up a small timing window for the
batches being added by the init_context (basically setting up the context)
to complete before the interrupts have been turned on, thus hanging the
GPU.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89600
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Peter Antoine <peter.antoine@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[Jani: fixed typo in subject, massaged the comments a bit]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a1407559a upstream.
When stacking request-based DM on blk_mq device, request cloning and
remapping are done in a single call to target's clone_and_map_rq().
The clone is allocated and valid only if clone_and_map_rq() returns
DM_MAPIO_REMAPPED.
The "IS_ERR(clone)" check in map_request() does not cover all the
!DM_MAPIO_REMAPPED cases that are possible (E.g. if underlying devices
are not ready or unavailable, clone_and_map_rq() may return
DM_MAPIO_REQUEUE without ever having established an ERR_PTR). Fix this
by explicitly checking for a return that is not DM_MAPIO_REMAPPED in
map_request().
Without this fix, DM core may call setup_clone() for a NULL clone
and oops like this:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
IP: [<ffffffff81227525>] blk_rq_prep_clone+0x7d/0x137
...
CPU: 2 PID: 5793 Comm: kdmwork-253:3 Not tainted 4.0.0-nm #1
...
Call Trace:
[<ffffffffa01d1c09>] map_tio_request+0xa9/0x258 [dm_mod]
[<ffffffff81071de9>] kthread_worker_fn+0xfd/0x150
[<ffffffff81071cec>] ? kthread_parkme+0x24/0x24
[<ffffffff81071cec>] ? kthread_parkme+0x24/0x24
[<ffffffff81071fdd>] kthread+0xe6/0xee
[<ffffffff81093a59>] ? put_lock_stats+0xe/0x20
[<ffffffff81071ef7>] ? __init_kthread_worker+0x5b/0x5b
[<ffffffff814c2d98>] ret_from_fork+0x58/0x90
[<ffffffff81071ef7>] ? __init_kthread_worker+0x5b/0x5b
Fixes: e5863d9ad ("dm: allocate requests in target when stacking on blk-mq devices")
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f024978e7 upstream.
On Exynos4412 boards (Trats2, Odroid U3) after enabling L2 cache in
56b60b8bce ("ARM: 8265/1: dts: exynos4: Add nodes for L2 cache
controller") the second suspend to RAM failed. First suspend worked fine
but the next one hang just after powering down of secondary CPUs (system
consumed energy as it would be running but was not responsive).
The issue was caused by enabling delayed reset assertion for CPU0 just
after issuing power down of cores. This was introduced for Exynos4 in
13cfa6c4f7 ("ARM: EXYNOS: Fix CPU idle clock down after CPU off").
The whole behavior is not well documented but after checking with vendor
code this should be done like this (on Exynos4):
1. Enable delayed reset assertion when system is running (for all CPUs).
2. Disable delayed reset assertion before suspending the system.
This can be done after powering off secondary CPUs.
3. Re-enable the delayed reset assertion when system is resumed.
Fixes: 13cfa6c4f7 ("ARM: EXYNOS: Fix CPU idle clock down after CPU off")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Tested-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15bf722e6f upstream.
ATOL FPrint fiscal printers require usb_clear_halt to be executed
to work properly. Add quirk to fix the issue.
Signed-off-by: Alexey Sokolov <sokolov@7pikes.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90f91b1298 upstream.
It seems Broadcom released two devices with conflicting device id. There
are for sure 14e4:4321 PCI devices with BCM4321 (N-PHY) chipset, they
can be found in routers, e.g. Netgear WNR834Bv2. However, according to
Broadcom public sources 0x4321 is also used for 5 GHz BCM4306 (G-PHY).
It's unsure if they meant PCI device id, or "virtual" id (from SPROM).
To distinguish these devices lets check PHY type (G vs. N).
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Cc: <stable@vger.kernel.org> # 3.16+
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 016a65a391 upstream.
With the introduction of multiple views of an obj in the same vm, each
vma was taught to cache its copy of the pages (so that different views
could have different page arrangements). However, this missed decoupling
those vma->ggtt_view.pages when the vma released its reference on the
obj->pages. As we don't always free the vma, this leads to a possible
scenario (e.g. execbuffer interrupted by the shrinker) where the vma
points to a stale obj->pages, and explodes.
Fixes regression from commit fe14d5f4e5
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date: Wed Dec 10 17:27:58 2014 +0000
drm/i915: Infrastructure for supporting different GGTT views per object
Tvrtko says, if someone else will be confused how this can happen, key
is the reservation execbuffer path. That puts the VMA on the exec_list
which prevents i915_vma_unbind and i915_gem_vma_destroy from fully
destroying the VMA. So the VMA is left existing as an empty object in
the list - unbound and disassociated with the backing store. Kind of a
cached memory object. And then re-using it needs to clear the cached
pages pointer which is fixed above.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1227892
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[Jani: Added Tvrtko's explanation to commit message.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 524630d582 upstream.
iser connection termination process happens in 2 stages:
- isert_wait_conn:
- resumes rdma disconnect
- wait for session commands
- wait for flush completions (post a marked wr to signal we are done)
- wait for logout completion
- queue work for connection cleanup (depends on disconnected/timewait
events)
- isert_free_conn
- last reference put on the connection
In case we are terminating during IOs, we might be posting send/recv
requests after we posted the last work request which might lead
to a use-after-free condition in isert_handle_wc.
After we posted the last wr in isert_wait_conn we are guaranteed that
no successful completions will follow (meaning no new work request posts
may happen) but other flush errors might still come. So before we
put the last reference on the connection, we repeat the process of
posting a marked work request (isert_wait4flush) in order to make sure all
pending completions were flushed.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jenny Falkovich <jennyf@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9253e667ab upstream.
Since commit "2426bd456a6 target: Report correct response ..."
we might get a command with data_size that does not fit to
the number of allocated data sg elements. Given that we rely on
cmd t_data_nents which might be different than the data_size,
we sometimes receive local length error completion. The correct
approach would be to take the command data_size into account when
constructing the ib sg_list.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jenny Falkovich <jennyf@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cf30dc180 upstream.
When the following filter is used it causes a warning to trigger:
# cd /sys/kernel/debug/tracing
# echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
# cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: No error
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1223 at kernel/trace/trace_events_filter.c:1640 replace_preds+0x3c5/0x990()
Modules linked in: bnep lockd grace bluetooth ...
CPU: 3 PID: 1223 Comm: bash Tainted: G W 4.1.0-rc3-test+ #450
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
0000000000000668 ffff8800c106bc98 ffffffff816ed4f9 ffff88011ead0cf0
0000000000000000 ffff8800c106bcd8 ffffffff8107fb07 ffffffff8136b46c
ffff8800c7d81d48 ffff8800d4c2bc00 ffff8800d4d4f920 00000000ffffffea
Call Trace:
[<ffffffff816ed4f9>] dump_stack+0x4c/0x6e
[<ffffffff8107fb07>] warn_slowpath_common+0x97/0xe0
[<ffffffff8136b46c>] ? _kstrtoull+0x2c/0x80
[<ffffffff8107fb6a>] warn_slowpath_null+0x1a/0x20
[<ffffffff81159065>] replace_preds+0x3c5/0x990
[<ffffffff811596b2>] create_filter+0x82/0xb0
[<ffffffff81159944>] apply_event_filter+0xd4/0x180
[<ffffffff81152bbf>] event_filter_write+0x8f/0x120
[<ffffffff811db2a8>] __vfs_write+0x28/0xe0
[<ffffffff811dda43>] ? __sb_start_write+0x53/0xf0
[<ffffffff812e51e0>] ? security_file_permission+0x30/0xc0
[<ffffffff811dc408>] vfs_write+0xb8/0x1b0
[<ffffffff811dc72f>] SyS_write+0x4f/0xb0
[<ffffffff816f5217>] system_call_fastpath+0x12/0x6a
---[ end trace e11028bd95818dcd ]---
Worse yet, reading the error message (the filter again) it says that
there was no error, when there clearly was. The issue is that the
code that checks the input does not check for balanced ops. That is,
having an op between a closed parenthesis and the next token.
This would only cause a warning, and fail out before doing any real
harm, but it should still not caues a warning, and the error reported
should work:
# cd /sys/kernel/debug/tracing
# echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
# cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: Meaningless filter expression
And give no kernel warning.
Link: http://lkml.kernel.org/r/20150615175025.7e809215@gandalf.local.home
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ab42ff448 upstream.
On a HP Envy TouchSmart laptop, there are 2 speakers (main speaker
and subwoofer speaker), 1 headphone and 2 DACs, without this fixup,
the headphone will be assigned to a DAC and the 2 speakers will be
assigned to another DAC, this assignment makes the surround-2.1
channels invalid.
To fix it, here using a DAC/pin preference map to bind the main
speaker to 1 DAC and the subwoofer speaker will be assigned to another
DAC.
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6c7b03e1ae upstream.
The PLL impose a certain input range to work correctly, but it appears that
this input range does not apply on the input clock (or parent clock) but
on the input clock after it has passed the PLL divisor.
Fix the implementation accordingly.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Jonas Andersson <jonas@microbit.se>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6ac069532 upstream.
lapic.timer_mode was not properly initialized after migration, which
broke few useful things, like login, by making every sleep eternal.
Fix this by calling apic_update_lvtt in kvm_apic_post_state_restore.
There are other slowpaths that update lvtt, so this patch makes sure
something similar doesn't happen again by calling apic_update_lvtt
after every modification.
Fixes: f30ebc312c ("KVM: x86: optimize some accesses to LVTT and SPIV")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 412c98c1be upstream.
The hwrng output buffers (2) are cast inside of a a struct (caam_rng_ctx)
allocated in one DMA-tagged region. While the kernel's heap allocator
should place the overall struct on a cacheline aligned boundary, the 2
buffers contained within may not necessarily align. Consenquently, the ends
of unaligned buffers may not fully flush, and if so, stale data will be left
behind, resulting in small repeating patterns.
This fix aligns the buffers inside the struct.
Note that not all of the data inside caam_rng_ctx necessarily needs to be
DMA-tagged, only the buffers themselves require this. However, a fix would
incur the expense of error-handling bloat in the case of allocation failure.
Signed-off-by: Steve Cornelius <steve.cornelius@freescale.com>
Signed-off-by: Victoria Milhoan <vicki.milhoan@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 153c35b6cc upstream.
Commit 2f0810880f changed
btrfs_set_block_group_ro to avoid trying to allocate new chunks with the
new raid profile during conversion. This fixed failures when there was
no space on the drive to allocate a new chunk, but the metadata
reserves were sufficient to continue the conversion.
But this ended up causing a regression when the drive had plenty of
space to allocate new chunks, mostly because reduce_alloc_profile isn't
using the new raid profile.
Fixing btrfs_reduce_alloc_profile is a bigger patch. For now, do a
partial revert of 2f0810880, and don't error out if we hit ENOSPC.
Signed-off-by: Chris Mason <clm@fb.com>
Tested-by: Dave Sterba <dsterba@suse.cz>
Reported-by: Holger Hoffstaette <holger.hoffstaette@googlemail.com>
[adapted for stable kernel branch, v4.0.5]
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de249e66a7 upstream.
Commit 0d97a64e0 creates a new variable but doesn't always set it up.
This puts it back to the original method (key.offset + 1) for the cases
not covered by Filipe's new logic.
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df858e7672 upstream.
While searching for extents to clone we might find one where we only use
a part of it coming from its tail. If our destination inode is the same
the source inode, we end up removing the tail part of the extent item and
insert after a new one that point to the same extent with an adjusted
key file offset and data offset. After this we search for the next extent
item in the fs/subvol tree with a key that has an offset incremented by
one. But this second search leaves us at the new extent item we inserted
previously, and since that extent item has a non-zero data offset, it
it can make us call btrfs_drop_extents with an empty range (start == end)
which causes the following warning:
[23978.537119] WARNING: CPU: 6 PID: 16251 at fs/btrfs/file.c:550 btrfs_drop_extent_cache+0x43/0x385 [btrfs]()
(...)
[23978.557266] Call Trace:
[23978.557978] [<ffffffff81425fd9>] dump_stack+0x4c/0x65
[23978.559191] [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
[23978.560699] [<ffffffffa047f0ea>] ? btrfs_drop_extent_cache+0x43/0x385 [btrfs]
[23978.562389] [<ffffffff8104544d>] warn_slowpath_null+0x1a/0x1c
[23978.563613] [<ffffffffa047f0ea>] btrfs_drop_extent_cache+0x43/0x385 [btrfs]
[23978.565103] [<ffffffff810e3a18>] ? time_hardirqs_off+0x15/0x28
[23978.566294] [<ffffffff81079ff8>] ? trace_hardirqs_off+0xd/0xf
[23978.567438] [<ffffffffa047f73d>] __btrfs_drop_extents+0x6b/0x9e1 [btrfs]
[23978.568702] [<ffffffff8107c03f>] ? trace_hardirqs_on+0xd/0xf
[23978.569763] [<ffffffff811441c0>] ? ____cache_alloc+0x69/0x2eb
[23978.570817] [<ffffffff81142269>] ? virt_to_head_page+0x9/0x36
[23978.571872] [<ffffffff81143c15>] ? cache_alloc_debugcheck_after.isra.42+0x16c/0x1cb
[23978.573466] [<ffffffff811420d5>] ? kmemleak_alloc_recursive.constprop.52+0x16/0x18
[23978.574962] [<ffffffffa0480d07>] btrfs_drop_extents+0x66/0x7f [btrfs]
[23978.576179] [<ffffffffa049aa35>] btrfs_clone+0x516/0xaf5 [btrfs]
[23978.577311] [<ffffffffa04983dc>] ? lock_extent_range+0x7b/0xcd [btrfs]
[23978.578520] [<ffffffffa049b2a2>] btrfs_ioctl_clone+0x28e/0x39f [btrfs]
[23978.580282] [<ffffffffa049d9ae>] btrfs_ioctl+0xb51/0x219a [btrfs]
(...)
[23978.591887] ---[ end trace 988ec2a653d03ed3 ]---
Then we attempt to insert a new extent item with a key that already
exists, which makes btrfs_insert_empty_item return -EEXIST resulting in
abortion of the current transaction:
[23978.594355] WARNING: CPU: 6 PID: 16251 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x52/0x114 [btrfs]()
(...)
[23978.622589] Call Trace:
[23978.623181] [<ffffffff81425fd9>] dump_stack+0x4c/0x65
[23978.624359] [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
[23978.625573] [<ffffffffa044ab6c>] ? __btrfs_abort_transaction+0x52/0x114 [btrfs]
[23978.626971] [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
[23978.628003] [<ffffffff8108a6c8>] ? vprintk_default+0x1d/0x1f
[23978.629138] [<ffffffffa044ab6c>] __btrfs_abort_transaction+0x52/0x114 [btrfs]
[23978.630528] [<ffffffffa049ad1b>] btrfs_clone+0x7fc/0xaf5 [btrfs]
[23978.631635] [<ffffffffa04983dc>] ? lock_extent_range+0x7b/0xcd [btrfs]
[23978.632886] [<ffffffffa049b2a2>] btrfs_ioctl_clone+0x28e/0x39f [btrfs]
[23978.634119] [<ffffffffa049d9ae>] btrfs_ioctl+0xb51/0x219a [btrfs]
(...)
[23978.647714] ---[ end trace 988ec2a653d03ed4 ]---
This is wrong because we should not process the extent item that we just
inserted previously, and instead process the extent item that follows it
in the tree
For example for the test case I wrote for fstests:
bs=$((64 * 1024))
mkfs.btrfs -f -l $bs -O ^no-holes /dev/sdc
mount /dev/sdc /mnt
xfs_io -f -c "pwrite -S 0xaa $(($bs * 2)) $(($bs * 2))" /mnt/foo
$CLONER_PROG -s $((3 * $bs)) -d $((267 * $bs)) -l 0 /mnt/foo /mnt/foo
$CLONER_PROG -s $((217 * $bs)) -d $((95 * $bs)) -l 0 /mnt/foo /mnt/foo
The second clone call fails with -EEXIST, because when we process the
first extent item (offset 262144), we drop part of it (counting from the
end) and then insert a new extent item with a key greater then the key we
found. The next time we search the tree we search for a key with offset
262144 + 1, which leaves us at the new extent item we have just inserted
but we think it refers to an extent that we need to clone.
Fix this by ensuring the next search key uses an offset corresponding to
the offset of the key we found previously plus the data length of the
corresponding extent item. This ensures we skip new extent items that we
inserted and works for the case of implicit holes too (NO_HOLES feature).
A test case for fstests follows soon.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 727b9784b6 upstream.
Orphans in the fs tree are cleaned up via open_ctree and subvolume
orphans are cleaned via btrfs_lookup_dentry -- except when a default
subvolume is in use. The name for the default subvolume uses a manual
lookup that doesn't trigger orphan cleanup and needs to trigger it
manually as well. This doesn't apply to the remount case since the
subvolumes are cleaned up by walking the root radix tree.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26e726afe0 upstream.
fiemap_fill_next_extent returns 0 on success, -errno on error, 1 if this was
the last extent that will fit in user array. If 1 is returned, the return
value may eventually returned to user space, which should not happen, according
to manpage of ioctl.
Signed-off-by: Chengyu Song <csong84@gatech.edu>
Reviewed-by: David Sterba <dsterba@suse.cz>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f1f465ae6 upstream.
If the clone root was not readonly or the dead flag was set on it, we were
leaving without decrementing the root's send_progress counter (and before
we just incremented it). If a concurrent snapshot deletion was in progress
and ended up being aborted, it would be impossible to later attempt to
delete again the snapshot, since the root's send_in_progress counter could
never go back to 0.
We were also setting clone_sources_to_rollback to i + 1 too early - if we
bailed out because the clone root we got is not readonly or flagged as dead
we ended up later derreferencing a null pointer because we didn't assign
the clone root to sctx->clone_roots[i].root:
for (i = 0; sctx && i < clone_sources_to_rollback; i++)
btrfs_root_dec_send_in_progress(
sctx->clone_roots[i].root);
So just don't increment the send_in_progress counter if the root is readonly
or flagged as dead.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5cc2b17e80 upstream.
After we locked the root's root item, a concurrent snapshot deletion
call might have set the dead flag on it. So check if the dead flag
is set and abort if it is, just like we do for the parent root.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c5a18a31b upstream.
Until recently, mac80211 overwrote all the statistics it could
provide when getting called, but it now relies on the struct
having been zeroed by the caller. This was always the case in
nl80211, but wext used a static struct which could even cause
values from one device leak to another.
Using a static struct is OK (as even documented in a comment)
since the whole usage of this function and its return value is
always locked under RTNL. Not clearing the struct for calling
the driver has always been wrong though, since drivers were
free to only fill values they could report, so calling this
for one device and then for another would always have leaked
values from one to the other.
Fix this by initializing the structure in question before the
driver method call.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=99691
Reported-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Reported-by: Alexander Kaltsas <alexkaltsas@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3b4afca70 upstream.
Now blk_cleanup_queue() can be called before calling
del_gendisk()[1], inside which hctx->ctxs is touched
from blk_mq_unregister_hctx(), but the variable has
been freed by blk_cleanup_queue() at that time.
So this patch moves freeing of hctx->ctxs into queue's
release handler for fixing the oops reported by Stefan.
[1], 6cd18e711d (block: destroy bdi before blockdev is
unregistered)
Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e76d4eecf upstream.
Jovi Zhangwei reported the following problem
Below kernel vm bug can be triggered by tcpdump which mmaped a lot of pages
with GFP_COMP flag.
[Mon May 25 05:29:33 2015] page:ffffea0015414000 count:66 mapcount:1 mapping: (null) index:0x0
[Mon May 25 05:29:33 2015] flags: 0x20047580004000(head)
[Mon May 25 05:29:33 2015] page dumped because: VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page))
[Mon May 25 05:29:33 2015] ------------[ cut here ]------------
[Mon May 25 05:29:33 2015] kernel BUG at mm/migrate.c:1661!
[Mon May 25 05:29:33 2015] invalid opcode: 0000 [#1] SMP
In this case it was triggered by running tcpdump but it's not necessary
reproducible on all systems.
sudo tcpdump -i bond0.100 'tcp port 4242' -c 100000000000 -w 4242.pcap
Compound pages cannot be migrated and it was not expected that such pages
be marked for NUMA balancing. This did not take into account that drivers
such as net/packet/af_packet.c may insert compound pages into userspace
with vm_insert_page. This patch tells the NUMA balancing protection
scanner to skip all VM_MIXEDMAP mappings which avoids the possibility that
compound pages are marked for migration.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Jovi Zhangwei <jovi@cloudflare.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c008f1d356 upstream.
Returning zero from a 'store' function is bad.
The return value should be either len length of the string
or an error.
So use 'len' if 'err' is zero.
Fixes: 6791875e2e ("md: make reconfig_mutex optional for writes to md sysfs files.")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e8e2518fc upstream.
Checking ->sync_thread without holding the mddev_lock()
isn't really safe, even after flushing the workqueue which
ensures md_start_sync() has been run.
While this code is waiting for the lock, md_check_recovery could reap
the thread itself, and then start another thread (e.g. recovery might
finish, then reshape starts). When this thread gets the lock
md_start_sync() hasn't run so it doesn't get reaped, but
MD_RECOVERY_RUNNING gets cleared. This allows two threads to start
which leads to confusion.
So don't both if MD_RECOVERY_RUNNING isn't set, but if it is do
the flush and the test and the reap all under the mddev_lock to
avoid any race with md_check_recovery.
Signed-off-by: NeilBrown <neilb@suse.de>
Fixes: 6791875e2e ("md: make reconfig_mutex optional for writes to md sysfs files.")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85bd839983 upstream.
Izumi found the following oops when hot re-adding a node:
BUG: unable to handle kernel paging request at ffffc90008963690
IP: __wake_up_bit+0x20/0x70
Oops: 0000 [#1] SMP
CPU: 68 PID: 1237 Comm: rs:main Q:Reg Not tainted 4.1.0-rc5 #80
Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 1.87 04/28/2015
task: ffff880838df8000 ti: ffff880017b94000 task.ti: ffff880017b94000
RIP: 0010:[<ffffffff810dff80>] [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
RSP: 0018:ffff880017b97be8 EFLAGS: 00010246
RAX: ffffc90008963690 RBX: 00000000003c0000 RCX: 000000000000a4c9
RDX: 0000000000000000 RSI: ffffea101bffd500 RDI: ffffc90008963648
RBP: ffff880017b97c08 R08: 0000000002000020 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a0797c73800
R13: ffffea101bffd500 R14: 0000000000000001 R15: 00000000003c0000
FS: 00007fcc7ffff700(0000) GS:ffff880874800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc90008963690 CR3: 0000000836761000 CR4: 00000000001407e0
Call Trace:
unlock_page+0x6d/0x70
generic_write_end+0x53/0xb0
xfs_vm_write_end+0x29/0x80 [xfs]
generic_perform_write+0x10a/0x1e0
xfs_file_buffered_aio_write+0x14d/0x3e0 [xfs]
xfs_file_write_iter+0x79/0x120 [xfs]
__vfs_write+0xd4/0x110
vfs_write+0xac/0x1c0
SyS_write+0x58/0xd0
system_call_fastpath+0x12/0x76
Code: 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8d 47 48 <48> 39 47 48 48 c7 45 e8 00 00 00 00 48 c7 45 f0 00 00 00 00 48
RIP [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
RSP <ffff880017b97be8>
CR2: ffffc90008963690
Reproduce method (re-add a node)::
Hot-add nodeA --> remove nodeA --> hot-add nodeA (panic)
This seems an use-after-free problem, and the root cause is
zone->wait_table was not set to *NULL* after free it in
try_offline_node.
When hot re-add a node, we will reuse the pgdat of it, so does the zone
struct, and when add pages to the target zone, it will init the zone
first (including the wait_table) if the zone is not initialized. The
judgement of zone initialized is based on zone->wait_table:
static inline bool zone_is_initialized(struct zone *zone)
{
return !!zone->wait_table;
}
so if we do not set the zone->wait_table to *NULL* after free it, the
memory hotplug routine will skip the init of new zone when hot re-add
the node, and the wait_table still points to the freed memory, then we
will access the invalid address when trying to wake up the waiting
people after the i/o operation with the page is done, such as mentioned
above.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Reviewed by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 692ef3ee36 upstream.
Model name in mt8173-evb.dts doesn't follow dts convention (it should
be human readable model name). Fix it.
Fixes: b3a3724841 ("arm64: dts: Add mediatek MT8173 SoC and evaluation board dts and Makefile")
Signed-off-by: Yingjoe Chen <yingjoe.chen@mediatek.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 885dbd154b upstream.
This reverts commit 1737cac693 ("bus: mvebu-mbus: make sure SDRAM CS
for DMA don't overlap the MBus bridge window"), because it breaks DMA
on platforms having more than 2 GB of RAM.
This commit changed the information reported to DMA masters device
drivers through the mv_mbus_dram_info() function so that the returned
DRAM ranges do not overlap with I/O windows.
This was necessary as a preparation to support the new CESA Crypto
Engine driver, which will use DMA for cryptographic operations. But
since it does DMA with the SRAM which is mapped as an I/O window,
having DRAM ranges overlapping with I/O windows was problematic.
To solve this, the above mentioned commit changed the mvebu-mbus to
adjust the DRAM ranges so that they don't overlap with the I/O
windows. However, by doing this, we re-adjust the DRAM ranges in a way
that makes them have a size that is no longer a power of two. While
this is perfectly fine for the Crypto Engine, which supports DRAM
ranges with a granularity of 64 KB, it breaks basically all other DMA
masters, which expect power of two sizes for the DRAM ranges.
Due to this, if the installed system memory is 4 GB, in two
chip-selects of 2 GB, the second DRAM range will be reduced from 2 GB
to a little bit less than 2 GB to not overlap with the I/O windows, in
a way that results in a DRAM range that doesn't have a power of two
size. This means that whenever you do a DMA transfer with an address
located in the [ 2 GB ; 4 GB ] area, it will freeze the system. Any
serious DMA activity like simply running:
for i in $(seq 1 64) ; do dd if=/dev/urandom of=file$i bs=1M count=16 ; done
in an ext3 partition mounted over a SATA drive will freeze the system.
Since the new CESA crypto driver that uses DMA has not been merged
yet, the easiest fix is to simply revert this commit. A follow-up
commit will introduce a different solution for the CESA crypto driver.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Fixes: 1737cac693 ("bus: mvebu-mbus: make sure SDRAM CS for DMA don't overlap the MBus bridge window")
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c9e06e647 upstream.
Commit a0b5cd4ac2 ("bus: mvebu-mbus: use automatic I/O
synchronization barriers") enabled the usage of automatic I/O
synchronization barriers by enabling bit WIN_CTRL_SYNCBARRIER in the
control registers of MBus windows, but on non io-coherent platforms
(orion5x, kirkwood and dove) the WIN_CTRL_SYNCBARRIER bit in
the window control register is either reserved (all windows except 6
and 7) or enables read-only protection (windows 6 and 7).
Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Fixes: a0b5cd4ac2 ("bus: mvebu-mbus: use automatic I/O synchronization barriers")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e96998fc20 upstream.
According to the Armada 38x datasheet, the window base address
registers value is set in bits [31:4] of the register and corresponds
to the transaction address bits [47:20].
Therefore, the 32bit base address value should be shifted right by
20bits and left by 4bits, resulting in 16 bit shift right.
The bug as not been noticed yet because if the memory available on
the platform is less than 2GB, then the base address is zero.
[gregory.clement@free-electrons.com: add extra-explanation]
Fixes: a3464ed2f1 (ata: ahci_mvebu: new driver for Marvell Armada 380
AHCI interfaces)
Signed-off-by: Nadav Haklai <nadavh@marvell.com>
Reviewed-by: Omri Itach <omrii@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 210d150e1f upstream.
The cpumask vp_dev->msix_affinity_masks[info->msix_vector] may contain
staled information when vp_set_vq_affinity() gets called, so clear it
before setting the new cpu bit mask.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f76502aa91 upstream.
"IS_ENABLED(PPC_PSERIES)" always evaluates to false, as IS_ENABLED() is
supposed to be used with the full Kconfig symbol name, including the
"CONFIG_" prefix.
Add the missing "CONFIG_" prefix to fix this.
Fixes: a25095d451 ("of: Move dynamic node fixups out of powerpc and into common code")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 392bceedb1 upstream.
The driver configures the IDLE condition to interrupt the SDMA engine.
Since the SDMA UART ROM script doesn't clear the IDLE bit itself, this
caused repeated 1-byte DMA transfers, regardless of available data in the
RX FIFO. Also, when returning due to the IDLE condition, the UART ROM
script already increased its counter, causing residue to be off by one.
This patch clears the IDLE condition to avoid repeated 1-byte DMA transfers
and decreases count by when the DMA transfer was aborted due to the IDLE
condition, fixing serial transfers using DMA on i.MX6Q.
Reported-by: Peter Seiderer <ps.report@gmx.net>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6dfd197283 upstream.
Laptop with Turks/Thames GPU will freeze if dpm is enabled. It seems
the SMC engine is relying on some state inside the CP engine. CP needs
to chew at least one packet for it to get in good state for dynamic
power management.
This patch simply disabled and re-enable DPM after the ring test which
is enough to avoid the freeze.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f5f1554ee upstream.
Passive DP->DVI/HDMI dongles on DP++ ports show up to the system as HDMI
devices, as they do not have a sink device in them to respond to any AUX
traffic. When probing these dongles over the DDC, sometimes they will
NAK the first attempt even though the transaction is valid and they
support the DDC protocol. The retry loop inside of
drm_do_probe_ddc_edid() would normally catch this case and try the
transaction again, resulting in success.
That, however, was thwarted by the fix for [1]:
commit 9292f37e1f
Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
Date: Thu Jan 5 09:34:28 2012 -0200
drm: give up on edid retries when i2c bus is not responding
This added code to exit immediately if the return code from the
i2c_transfer function was -ENXIO in order to reduce the amount of time
spent in waiting for unresponsive or disconnected devices. That was
possible because the underlying i2c bit banging algorithm had retries of
its own (which, of course, were part of the reason for the bug the
commit fixes).
Since its introduction in
commit f899fc64cd
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Jul 20 15:44:45 2010 -0700
drm/i915: use GMBUS to manage i2c links
we've been flipping back and forth enabling the GMBUS transfers, but
we've settled since then. The GMBUS implementation does not do any
retries, however, bailing out of the drm_do_probe_ddc_edid() retry loop
on first encounter of -ENXIO. This, combined with Eugeni's commit, broke
the retry on -ENXIO.
Retry GMBUS once on -ENXIO on first message to mitigate the issues with
passive adapters.
This patch is based on the work, and commit message, by Todd Previte
<tprevite@gmail.com>.
[1] https://bugs.freedesktop.org/show_bug.cgi?id=41059
v2: Don't retry if using bit banging.
v3: Move retry within gmbux_xfer, retry only on first message.
v4: Initialize GMBUS0 on retry (Ville).
v5: Take index reads into account (Ville).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85924
Cc: Todd Previte <tprevite@gmail.com>
Tested-by: Oliver Grafe <oliver.grafe@ge.com> (v2)
Tested-by: Jim Bride <jim.bride@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0aedb16265 upstream.
Apparently we can have requests even if though the active list is empty,
so do the request retirement regardless of whether there's anything
on the active list.
The way it happened here is that during suspend intel_ring_idle()
notices the olr hanging around and then proceeds to get rid of it by
adding a request. However since there was nothing on the active lists
i915_gem_retire_requests() didn't clean those up, and so the idle work
never runs, and we leave the GPU "busy" during suspend resulting in a
WARN later.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e058c945e0 upstream.
According to the HSW b-spec we need to try clock divisors of 63
and 72, each 3 or more times, when attempting DP AUX channel
communication on a server chipset. This actually wasn't happening
due to a short-circuit that only checked the DP_AUX_CH_CTL_DONE bit
in status rather than checking that the operation was done and
that DP_AUX_CH_CTL_TIME_OUT_ERROR was not set.
[v2] Implemented alternate solution suggested by Jani Nikula.
Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a6cb0abe1 upstream.
Avoid entering "RTC-only mode" at poweroff. It is unsupported by most
versions of BeagleBone, and risks hardware damage.
The damaging configuration is having system-power-controller
without ti,pmic-shutdown-controller.
Reported-by: Matthijs van Duin <matthijsvanduin@gmail.com>
Tested-by: Matthijs van Duin <matthijsvanduin@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Johan Hovold <johan@kernel.org>
[Matthijs van Duin: added explanatory comments]
Signed-off-by: Matthijs van Duin <matthijsvanduin@gmail.com>
Fixes: http://bugs.elinux.org/issues/143
[tony@atomide.com: updated comments with the hardware breaking info]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 425be5679f upstream.
The early_idt_handlers asm code generates an array of entry
points spaced nine bytes apart. It's not really clear from that
code or from the places that reference it what's going on, and
the code only works in the first place because GAS never
generates two-byte JMP instructions when jumping to global
labels.
Clean up the code to generate the correct array stride (member size)
explicitly. This should be considerably more robust against
screw-ups, as GAS will warn if a .fill directive has a negative
count. Using '. =' to advance would have been even more robust
(it would generate an actual error if it tried to move
backwards), but it would pad with nulls, confusing anyone who
tries to disassemble the code. The new scheme should be much
clearer to future readers.
While we're at it, improve the comments and rename the array and
common code.
Binutils may start relaxing jumps to non-weak labels. If so,
this change will fix our build, and we may need to backport this
change.
Before, on x86_64:
0000000000000000 <early_idt_handlers>:
0: 6a 00 pushq $0x0
2: 6a 00 pushq $0x0
4: e9 00 00 00 00 jmpq 9 <early_idt_handlers+0x9>
5: R_X86_64_PC32 early_idt_handler-0x4
...
48: 66 90 xchg %ax,%ax
4a: 6a 08 pushq $0x8
4c: e9 00 00 00 00 jmpq 51 <early_idt_handlers+0x51>
4d: R_X86_64_PC32 early_idt_handler-0x4
...
117: 6a 00 pushq $0x0
119: 6a 1f pushq $0x1f
11b: e9 00 00 00 00 jmpq 120 <early_idt_handler>
11c: R_X86_64_PC32 early_idt_handler-0x4
After:
0000000000000000 <early_idt_handler_array>:
0: 6a 00 pushq $0x0
2: 6a 00 pushq $0x0
4: e9 14 01 00 00 jmpq 11d <early_idt_handler_common>
...
48: 6a 08 pushq $0x8
4a: e9 d1 00 00 00 jmpq 120 <early_idt_handler_common>
4f: cc int3
50: cc int3
...
117: 6a 00 pushq $0x0
119: 6a 1f pushq $0x1f
11b: eb 03 jmp 120 <early_idt_handler_common>
11d: cc int3
11e: cc int3
11f: cc int3
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Binutils <binutils@sourceware.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ac027962af343b0c599cbfcf50b945ad2ef3d7a8.1432336324.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b04c846cea upstream.
Fixed regression. After commit 29e409f0f7 ("xhci: Allow xHCI drivers to
be built as separate modules") the module xhci_hcd became non-removable.
That behaviour is not expected and there're no notes about it in commit
message. The module should be removable as it blocks PM suspend/resume
functions (Debian Bug#666406).
Signed-off-by: Arthur Demchenkov <spinal.by@gmail.com>
Reviewed-by: Andrew Bresticker <abrestic@chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a00918d052 upstream.
Regression in commit 638139eb95 ("usb: hub: allow to process more usb
hub events in parallel")
The regression resulted in intermittent failure to initialise a 10-port
hub (with three internal VL812 4-port hub controllers) on boot, with a
failure rate of around 8%, due to multiple race conditions when
accessing addr_dev and slot_id in struct xhci_hcd.
This regression also exposed a problem with xhci_setup_device, which
"should be protected by the usb_address0_mutex" but no longer is due to
commit 6fecd4f2a5 ("USB: separate usb_address0 mutexes for each bus")
With separate buses (and locks) it is no longer the case that a single
lock will protect xhci_setup_device from accesses by two parallel
threads processing events on the two buses.
Fix this by adding a mutex to protect addr_dev and slot_id in struct
xhci_hcd, and by making the assignment of slot_id atomic.
Fixes multiple boot errors:
[ 0.583008] xhci_hcd 0000:00:14.0: Bad Slot ID 2
[ 0.583009] xhci_hcd 0000:00:14.0: Could not allocate xHCI USB device data structures
[ 0.583012] usb usb1-port3: couldn't allocate usb_device
And:
[ 0.637409] xhci_hcd 0000:00:14.0: Error while assigning device slot ID
[ 0.637417] xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 32.
[ 0.637421] usb usb1-port1: couldn't allocate usb_device
And:
[ 0.753372] xhci_hcd 0000:00:14.0: ERROR: unexpected setup context command completion code 0x0.
[ 0.753373] usb 1-3: hub failed to enable device, error -22
[ 0.753400] xhci_hcd 0000:00:14.0: Error while assigning device slot ID
[ 0.753402] xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 32.
[ 0.753403] usb usb1-port3: couldn't allocate usb_device
And:
[ 11.018386] usb 1-3: device descriptor read/all, error -110
And:
[ 5.753838] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
Tested with 200 reboots, resulting in no USB hub init related errors.
Fixes: 638139eb95 ("usb: hub: allow to process more usb hub events in parallel")
Link: https://lkml.kernel.org/g/CAP-bSRb=A0iEYobdGCLpwynS7pkxpt_9ZnwyZTPVAoy0Y=Zo3Q@mail.gmail.com
Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
[changed git commit description style for checkpatch -Mathias]
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aad653a0bc upstream.
bdi_unregister() now contains very little functionality.
It contains a "WARN_ON" if bdi->dev is NULL. This warning is of no
real consequence as bdi->dev isn't needed by anything else in the function,
and it triggers if
blk_cleanup_queue() -> bdi_destroy()
is called before bdi_unregister, which happens since
Commit: 6cd18e711d ("block: destroy bdi before blockdev is unregistered.")
So this isn't wanted.
It also calls bdi_set_min_ratio(). This needs to be called after
writes through the bdi have all been flushed, and before the bdi is destroyed.
Calling it early is better than calling it late as it frees up a global
resource.
Calling it immediately after bdi_wb_shutdown() in bdi_destroy()
perfectly fits these requirements.
So bdi_unregister() can be discarded with the important content moved to
bdi_destroy(), as can the
writeback_bdi_unregister
event which is already not used.
Reported-by: Mike Snitzer <snitzer@redhat.com>
Fixes: c4db59d31e ("fs: don't reassign dirty inodes to default_backing_dev_info")
Fixes: 6cd18e711d ("block: destroy bdi before blockdev is unregistered.")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d66e5e9b6 upstream.
=================================
[ INFO: inconsistent lock state ]
4.1.0-rc7+ #217 Tainted: G O
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
(ext_devt_lock){+.?...}, at: [<ffffffff8143a60c>] blk_free_devt+0x3c/0x70
{SOFTIRQ-ON-W} state was registered at:
[<ffffffff810bf6b1>] __lock_acquire+0x461/0x1e70
[<ffffffff810c1947>] lock_acquire+0xb7/0x290
[<ffffffff818ac3a8>] _raw_spin_lock+0x38/0x50
[<ffffffff8143a07d>] blk_alloc_devt+0x6d/0xd0 <-- take the lock in process context
[..]
[<ffffffff810bf64e>] __lock_acquire+0x3fe/0x1e70
[<ffffffff810c00ad>] ? __lock_acquire+0xe5d/0x1e70
[<ffffffff810c1947>] lock_acquire+0xb7/0x290
[<ffffffff8143a60c>] ? blk_free_devt+0x3c/0x70
[<ffffffff818ac3a8>] _raw_spin_lock+0x38/0x50
[<ffffffff8143a60c>] ? blk_free_devt+0x3c/0x70
[<ffffffff8143a60c>] blk_free_devt+0x3c/0x70 <-- take the lock in softirq
[<ffffffff8143bfec>] part_release+0x1c/0x50
[<ffffffff8158edf6>] device_release+0x36/0xb0
[<ffffffff8145ac2b>] kobject_cleanup+0x7b/0x1a0
[<ffffffff8145aad0>] kobject_put+0x30/0x70
[<ffffffff8158f147>] put_device+0x17/0x20
[<ffffffff8143c29c>] delete_partition_rcu_cb+0x16c/0x180
[<ffffffff8143c130>] ? read_dev_sector+0xa0/0xa0
[<ffffffff810e0e0f>] rcu_process_callbacks+0x2ff/0xa90
[<ffffffff810e0dcf>] ? rcu_process_callbacks+0x2bf/0xa90
[<ffffffff81067e2e>] __do_softirq+0xde/0x600
Neil sees this in his tests and it also triggers on pmem driver unbind
for the libnvdimm tests. This fix is on top of an initial fix by Keith
for incorrect usage of mutex_lock() in this path: 2da78092dd "block:
Fix dev_t minor allocation lifetime". Both this and 2da78092dd are
candidates for -stable.
Fixes: 2da78092dd ("block: Fix dev_t minor allocation lifetime")
Cc: Keith Busch <keith.busch@intel.com>
Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f0ee9d17a upstream.
Make the check to skip the rate check more lax, so that it applies
to all hw_version 4 models.
This fixes the touchpad not being detected properly on Asus PU551LA
laptops.
Reported-and-tested-by: David Zafra Gómez <dezeta@klo.es>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 088df2ccef upstream.
On some v7 devices (e.g. Lenovo-E550) the deltas reported are typically
only in the 0-1 range dividing this by 2 results in a range of 0-0.
And even for v7 devices where this does not lead to making the trackstick
entirely unusable, it makes it twice as slow as before we added v7 support
and were using the ps/2 mouse emulation of the dual point setup.
If some kind of generic slowdown is actually necessary for some devices,
then that belongs in userspace, not in the kernel.
Reported-and-tested-by: Rico Moorman <rico.moorman@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d487a43c3 upstream.
Initialize sysreg by default, otherwise driver will crash in suspend
callback when not using DT.
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Fixes: a7750c3ef0 ("i2c: s3c2410: Handle i2c sys_cfg register in i2c driver")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c374fc7ce upstream.
Using _bh variant for spin locks causes this kind of warning:
Starting logging: ------------[ cut here ]------------
WARNING: CPU: 0 PID: 3 at /ssd_drive/linux/kernel/softirq.c:151
__local_bh_enable_ip+0xe8/0xf4()
Modules linked in:
CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.1.0-rc2+ #94
Hardware name: Atmel SAMA5
[<c0013c04>] (unwind_backtrace) from [<c00118a4>] (show_stack+0x10/0x14)
[<c00118a4>] (show_stack) from [<c001bbcc>]
(warn_slowpath_common+0x80/0xac)
[<c001bbcc>] (warn_slowpath_common) from [<c001bc14>]
(warn_slowpath_null+0x1c/0x24)
[<c001bc14>] (warn_slowpath_null) from [<c001e28c>]
(__local_bh_enable_ip+0xe8/0xf4)
[<c001e28c>] (__local_bh_enable_ip) from [<c01fdbd0>]
(at_xdmac_device_terminate_all+0xf4/0x100)
[<c01fdbd0>] (at_xdmac_device_terminate_all) from [<c02221a4>]
(atmel_complete_tx_dma+0x34/0xf4)
[<c02221a4>] (atmel_complete_tx_dma) from [<c01fe4ac>]
(at_xdmac_tasklet+0x14c/0x1ac)
[<c01fe4ac>] (at_xdmac_tasklet) from [<c001de58>]
(tasklet_action+0x68/0xb4)
[<c001de58>] (tasklet_action) from [<c001dfdc>]
(__do_softirq+0xfc/0x238)
[<c001dfdc>] (__do_softirq) from [<c001e140>] (run_ksoftirqd+0x28/0x34)
[<c001e140>] (run_ksoftirqd) from [<c0033a3c>]
(smpboot_thread_fn+0x138/0x18c)
[<c0033a3c>] (smpboot_thread_fn) from [<c0030e7c>] (kthread+0xdc/0xf0)
[<c0030e7c>] (kthread) from [<c000f480>] (ret_from_fork+0x14/0x34)
---[ end trace b57b14a99c1d8812 ]---
It comes from the fact that devices can called some code from the DMA
controller with irq disabled. _bh variant is not intended to be used in
this case since it can enable irqs. Switch to irqsave/irqrestore variant to
avoid this situation.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 765c37d876 upstream.
Rework slave configuration part in order to more report wrong errors
about the configuration.
Only maxburst and addr width values are checked when doing the slave
configuration. The validity of the channel configuration is done at
prepare time.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 88d04643c6 upstream.
Some drivers implement only pause operation (no resuming). Example is
pl330 where pause is needed for getting residuum. pl330 does not support
resume operation, transfer must be stopped after pause.
However for slaves this is exposed always as "pause and resume" which
introduces subtle errors on Odroid U3 board (Exynos4412 with pl330).
After adding pause function to pl330 driver the audio playback
(utilizing DMA) gets choppy after some time (approximately 24 hours).
Fix this by exposing "cmd_pause" if and only if pause and resume are
implemented.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reported-by: gabriel@unseen.is
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 88987d2c75 ("dmaengine: pl330: add DMA_PAUSE feature")
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 81cc6edc08 upstream.
The pl330 device could hang infinitely on certain boards when DMA
channels are terminated.
It was caused by lack of runtime resume when executing
pl330_terminate_all() which calls the _stop() function. _stop() accesses
device register and can loop infinitely while checking for device state.
The hang was confirmed by Dinh Nguyen on Altera SOCFPGA Cyclone V
board during boot. It can be also triggered with:
$ echo 1 > /sys/module/dmatest/parameters/iterations
$ echo dma1chan0 > /sys/module/dmatest/parameters/channel
$ echo 1 > /sys/module/dmatest/parameters/run
$ sleep 1
$ cat /sys/module/dmatest/parameters/run
Reported-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: ae43b32891 ("ARM: 8202/1: dmaengine: pl330: Add runtime Power Management support v12")
Tested-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea114fc27d upstream.
The driver worked around an error in the MAYA44 USB(+)'s mixer unit
descriptor by aborting before parsing the missing field. However,
aborting parsing too early prevented parsing of the other units
connected to this unit, so the capture mixer controls would be missing.
Fix this by moving the check for this descriptor error after the parsing
of the unit's input pins.
Reported-by: nightmixes <nightmixes@gmail.com>
Tested-by: nightmixes <nightmixes@gmail.com>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f80b2958a upstream.
This quirk allows us to avoid the noisy:
current rate 0 is different from the runtime rate
message every time playback starts. While USB DAC in the RR2150
supports reading the sample rate, it never returns a sample rate
other than zero in my observation with common sample rates.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Joe Turner <joe@oampo.co.uk>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1ef9f05835 upstream.
Fix this from the logs:
usb 7-1: New USB device found, idVendor=046d, idProduct=08ca
...
usb 7-1: Warning! Unlikely big volume range (=3072), cval->res is probably wrong.
usb 7-1: [5] FU [Mic Capture Volume] ch = 1, val = 4608/7680/1
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4ed6a540fa upstream.
When we use 'intel_iommu=igfx_off' to disable translation for the
graphics, and when we discover that the BIOS has misconfigured the DMAR
setup for I/OAT, we use a special DUMMY_DEVICE_DOMAIN_INFO value in
dev->archdata.iommu to indicate that translation is disabled.
With passthrough mode, we were attempting to dereference that as a
normal pointer to a struct device_domain_info when setting up an
identity mapping for the affected device.
This fixes the problem by making device_to_iommu() explicitly check for
the special value and indicate that no IOMMU was found to handle the
devices in question.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 18436afdc1 upstream.
Commit c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API
domains") prevents certain options for devices with RMRRs. This even
prevents those devices from getting a 1:1 mapping with 'iommu=pt',
because we don't have the code to handle *preserving* the RMRR regions
when moving the device between domains.
There's already an exclusion for USB devices, because we know the only
reason for RMRRs there is a misguided desire to keep legacy
keyboard/mouse emulation running in some theoretical OS which doesn't
have support for USB in its own right... but which *does* enable the
IOMMU.
Add an exclusion for graphics devices too, so that 'iommu=pt' works
there. We should be able to successfully assign graphics devices to
guests too, as long as the initial handling of stolen memory is
reconfigured appropriately. This has certainly worked in the past.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72586c6061 upstream.
Commit 32f13521ca
("n_tty: Line copy to user buffer in canonical mode")
changed cannonical mode copying to use copy_to_user
but missed adding the call to the audit framework.
Add in the appropriate functions to get audit support.
Fixes: 32f13521ca ("n_tty: Line copy to user buffer in canonical mode")
Reported-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3370e13aa4 upstream.
On some simulators like GEM5, caches may not be simulated. In those
cases, the cache levels and leaves will be zero and will result in
following exception:
Unable to handle kernel NULL pointer dereference at virtual address 0040
pgd = ffffffc0008fa000
[00000040] *pgd=00000009f6807003, *pud=00000009f6807003,
*pmd=00000009f6808003, *pte=006000002c010707
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.1.0-rc5 #198
task: ffffffc9768a0000 ti: ffffffc9768a8000 task.ti: ffffffc9768a8000
PC is at detect_cache_attributes+0x98/0x2c8
LR is at detect_cache_attributes+0x88/0x2c8
kcalloc(0) returns a special value ZERO_SIZE_PTR which is non-NULL value
but results in fault only on any attempt to dereferencing it. So
checking for the non-NULL pointer will not suffice.
This patch checks for non-zero cache leaf nodes and returns error if
there are no cache leaves in detect_cache_attributes.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: William Wang <william.wang@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d046ba268a upstream.
The adis16448, unlike the other chips in this family, in addition to the
hardware channels also sends out the DIAG_STAT register in burst mode
before them. Handle that case by skipping over the first 2 bytes before we
pass the received data to the buffer.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fixes: 76ada52f7f ("iio:adis16400: Add support for the adis16448")
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9df560350c upstream.
There are a few issues with the burst mode support. For one we don't setup
the rx buffer, so the buffer will never be filled and all samples will read
as the zero. Furthermore the tx buffer has the wrong type, which means the
driver sends the wrong command and not the right data is returned.
The final issue is that in burst mode all channels are transferred. Hence
the length of the transfer length should be the number of hardware
channels * 2 bytes. Currently the driver uses indio_dev->scan_bytes for
this. But if the timestamp channel is enabled the scan_bytes will be larger
than the burst length. Fix this by just calculating the burst length based
on the number of hardware channels.
Signed-off-by: Paul Cercueil <paul.cercueil@analog.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fixes: 5eda3550a3 ("staging:iio:adis16400: Preallocate transfer message")
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c2a8b623a0 upstream.
We unfortunately can't use ~0UL for the scan mask to indicate that the
only valid scan mask is all channels selected. The IIO core needs the exact
mask to work correctly and not a super-set of it. So calculate the masked
based on the channels that are available for a particular device.
Signed-off-by: Paul Cercueil <paul.cercueil@analog.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fixes: 5eda3550a3 ("staging:iio:adis16400: Preallocate transfer message")
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7323d59862 upstream.
Previously, the two voltage channels had the same ID, which didn't cause
conflicts in sysfs only because one channel is named and the other isn't;
this is still violating the spec though, two indexed channels should never
have the same index.
Signed-off-by: Paul Cercueil <paul.cercueil@analog.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69ca2d771e upstream.
Add the scale for the pressure channel, which is currently missing.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fixes: 76ada52f7f ("iio:adis16400: Add support for the adis16448")
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit: Not applicable ]
The current rhashtable rehash code is buggy and can't deal with
parallel insertions/removals without corrupting the hash table.
This patch disables it by partially reverting
c5adde9468 ("netlink: eliminate
nl_sk_hash_lock").
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c4c832f89d ]
br_fdb_update() can be called in process context in the following way:
br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set)
so we need to disable softirqs because there are softirq users of the
hash_lock. One easy way to reproduce this is to modify the bridge utility
to set NTF_USE, enable stp and then set maxageing to a low value so
br_fdb_cleanup() is called frequently and then just add new entries in
a loop. This happens because br_fdb_cleanup() is called from timer/softirq
context. The spin locks in br_fdb_update were _bh before commit f8ae737dee
("[BRIDGE]: forwarding remove unneeded preempt and bh diasables")
and at the time that commit was correct because br_fdb_update() couldn't be
called from process context, but that changed after commit:
292d139898 ("bridge: add NTF_USE support")
Using local_bh_disable/enable around br_fdb_update() allows us to keep
using the spin_lock/unlock in br_fdb_update for the fast-path.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 292d139898 ("bridge: add NTF_USE support")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e51000db4c ]
There are several places in the driver (all in control paths) where
coherent dma memory is being allocated using either dma_alloc_coherent()
or the deprecated pci_alloc_consistent(). All these calls should be
changed to use dma_zalloc_coherent() to avoid uninitialized fields in
data structures backed by this memory.
Reported-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6e54030932 ]
421b3885bf "udp: ipv4: Add udp early
demux" introduced a regression that allowed sockets bound to INADDR_ANY
to receive packets from multicast groups that the socket had not joined.
For example a socket that had joined 224.168.2.9 could also receive
packets from 225.168.2.9 despite not having joined that group if
ip_early_demux is enabled.
Fix this by calling ip_check_mc_rcu() in udp_v4_early_demux() to verify
that the multicast packet is indeed ours.
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 31a418986a ]
When we come to tear things down in netback_remove() and generate the
uevent it is possible that the xenstore directory has already been
removed (details below).
In such cases netback_uevent() won't be able to read the hotplug
script and will write a xenstore error node.
A recent change to the hypervisor exposed this race such that we now
sometimes lose it (where apparently we didn't ever before).
Instead read the hotplug script configuration during setup and use it
for the lifetime of the backend device.
The apparently more obvious fix of moving the transition to
state=Closed in netback_remove() to after the uevent does not work
because it is possible that we are already in state=Closed (in
reaction to the guest having disconnected as it shutdown). Being
already in Closed means the toolstack is at liberty to start tearing
down the xenstore directories. In principal it might be possible to
arrange to unregister the device sooner (e.g on transition to Closing)
such that xenstore would still be there but this state machine is
fragile and prone to anger...
A modern Xen system only relies on the hotplug uevent for driver
domains, when the backend is in the same domain as the toolstack it
will run the necessary setup/teardown directly in the correct sequence
wrt xenstore changes.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9f950415e4 ]
Linux 3.17 and earlier are explicitly engineered so that if the app
doesn't specifically request a CC module on a listener before the SYN
arrives, then the child gets the system default CC when the connection
is established. See tcp_init_congestion_control() in 3.17 or earlier,
which says "if no choice made yet assign the current value set as
default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
created") altered these semantics, so that children got their parent
listener's congestion control even if the system default had changed
after the listener was created.
This commit returns to those original semantics from 3.17 and earlier,
since they are the original semantics from 2007 in 4d4d3d1e8 ("[TCP]:
Congestion control initialization."), and some Linux congestion
control workflows depend on that.
In summary, if a listener socket specifically sets TCP_CONGESTION to
"x", or the route locks the CC module to "x", then the child gets
"x". Otherwise the child gets current system default from
net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
earlier, and this commit restores that.
Fixes: 55d8694fa8 ("net: tcp: assign tcp cong_ops when tcp sk is created")
Cc: Florian Westphal <fw@strlen.de>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit beb39db59d ]
We have two problems in UDP stack related to bogus checksums :
1) We return -EAGAIN to application even if receive queue is not empty.
This breaks applications using edge trigger epoll()
2) Under UDP flood, we can loop forever without yielding to other
processes, potentially hanging the host, especially on non SMP.
This patch is an attempt to make things better.
We might in the future add extra support for rt applications
wanting to better control time spent doing a recv() in a hostile
environment. For example we could validate checksums before queuing
packets in socket receive queue.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71d9f6149c ]
br_multicast_query_expired() querier argument is a pointer to
a struct bridge_mcast_querier :
struct bridge_mcast_querier {
struct br_ip addr;
struct net_bridge_port __rcu *port;
};
Intent of the code was to clear port field, not the pointer to querier.
Fixes: 2cd4143192 ("bridge: memorize and export selected IGMP/MLD querier port")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Cc: Linus Lüssing <linus.luessing@web.de>
Cc: Steinar H. Gunderson <sesse@samfundet.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9302d7bb0c ]
sctp_v4_map_v6 was subtly writing and reading from members
of a union in a way the clobbered data it needed to read before
it read it.
Zeroing the v6 flowinfo overwrites the v4 sin_addr with 0, meaning
that every place that calls sctp_v4_map_v6 gets ::ffff:0.0.0.0 as the
result.
Reorder things to guarantee correct behaviour no matter what the
union layout is.
This impacts user space clients that open an IPv6 SCTP socket and
receive IPv4 connections. Prior to 299ee user space would see a
sockaddr with AF_INET and a correct address, after 299ee the sockaddr
is AF_INET6, but the address is wrong.
Fixes: 299ee123e1 (sctp: Fixup v4mapped behaviour to comply with Sock API)
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 86e363dc3b ]
For mq qdisc, we add per tx queue qdisc to root qdisc
for display purpose, however, that happens too early,
before the new dev->qdisc is finally set, this causes
q->list points to an old root qdisc which is going to be
freed right before assigning with a new one.
Fix this by moving ->attach() after setting dev->qdisc.
For the record, this fixes the following crash:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 975 at lib/list_debug.c:59 __list_del_entry+0x5a/0x98()
list_del corruption. prev->next should be ffff8800d1998ae8, but was 6b6b6b6b6b6b6b6b
CPU: 1 PID: 975 Comm: tc Not tainted 4.1.0-rc4+ #1019
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
0000000000000009 ffff8800d73fb928 ffffffff81a44e7f 0000000047574756
ffff8800d73fb978 ffff8800d73fb968 ffffffff810790da ffff8800cfc4cd20
ffffffff814e725b ffff8800d1998ae8 ffffffff82381250 0000000000000000
Call Trace:
[<ffffffff81a44e7f>] dump_stack+0x4c/0x65
[<ffffffff810790da>] warn_slowpath_common+0x9c/0xb6
[<ffffffff814e725b>] ? __list_del_entry+0x5a/0x98
[<ffffffff81079162>] warn_slowpath_fmt+0x46/0x48
[<ffffffff81820eb0>] ? dev_graft_qdisc+0x5e/0x6a
[<ffffffff814e725b>] __list_del_entry+0x5a/0x98
[<ffffffff814e72a7>] list_del+0xe/0x2d
[<ffffffff81822f05>] qdisc_list_del+0x1e/0x20
[<ffffffff81820cd1>] qdisc_destroy+0x30/0xd6
[<ffffffff81822676>] qdisc_graft+0x11d/0x243
[<ffffffff818233c1>] tc_get_qdisc+0x1a6/0x1d4
[<ffffffff810b5eaf>] ? mark_lock+0x2e/0x226
[<ffffffff817ff8f5>] rtnetlink_rcv_msg+0x181/0x194
[<ffffffff817ff72e>] ? rtnl_lock+0x17/0x19
[<ffffffff817ff72e>] ? rtnl_lock+0x17/0x19
[<ffffffff817ff774>] ? __rtnl_unlock+0x17/0x17
[<ffffffff81855dc6>] netlink_rcv_skb+0x4d/0x93
[<ffffffff817ff756>] rtnetlink_rcv+0x26/0x2d
[<ffffffff818544b2>] netlink_unicast+0xcb/0x150
[<ffffffff81161db9>] ? might_fault+0x59/0xa9
[<ffffffff81854f78>] netlink_sendmsg+0x4fa/0x51c
[<ffffffff817d6e09>] sock_sendmsg_nosec+0x12/0x1d
[<ffffffff817d8967>] sock_sendmsg+0x29/0x2e
[<ffffffff817d8cf3>] ___sys_sendmsg+0x1b4/0x23a
[<ffffffff8100a1b8>] ? native_sched_clock+0x35/0x37
[<ffffffff810a1d83>] ? sched_clock_local+0x12/0x72
[<ffffffff810a1fd4>] ? sched_clock_cpu+0x9e/0xb7
[<ffffffff810def2a>] ? current_kernel_time+0xe/0x32
[<ffffffff810b4bc5>] ? lock_release_holdtime.part.29+0x71/0x7f
[<ffffffff810ddebf>] ? read_seqcount_begin.constprop.27+0x5f/0x76
[<ffffffff810b6292>] ? trace_hardirqs_on_caller+0x17d/0x199
[<ffffffff811b14d5>] ? __fget_light+0x50/0x78
[<ffffffff817d9808>] __sys_sendmsg+0x42/0x60
[<ffffffff817d9838>] SyS_sendmsg+0x12/0x1c
[<ffffffff81a50e97>] system_call_fastpath+0x12/0x6f
---[ end trace ef29d3fb28e97ae7 ]---
For long term, we probably need to clean up the qdisc_graft() code
in case it hides other bugs like this.
Fixes: 95dc19299f ("pkt_sched: give visibility to mq slave qdiscs")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ce0e5c522d ]
Commit e9ce7cb6b1 ("xen-netback: Factor queue-specific data into queue
struct") introduced a regression when moving queue-specific data into
the queue struct by failing to set the credit_bytes field. This
prevented bandwidth limiting from working. Initialize the field as it
was done before multiqueue support was added.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b48732e4a4 ]
got a rare NULL pointer dereference in clear_bit
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
----
v2: switch to sock_flag(sk, SOCK_DEAD) and added net/caif/caif_socket.c
v3: return -ECONNRESET in upstream caller of wait function for SOCK_DEAD
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit adbe088f6f ]
A pair of nested spin locks was introduced in commit 63502b8d0
"dp83640: Fix receive timestamp race condition".
Unfortunately the 'flags' parameter was reused for the inner lock,
clobbering the originally saved IRQ state. This patch fixes the issue
by changing the inner lock to plain spin_lock without irqsave.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a935865c82 ]
Callers of the ext_write function are supposed to hold a mutex that
protects the state of the dialed page, but one caller was missing the
lock from the very start, and over time the code has been changed
without following the rule. This patch cleans up the call sites in
violation of the rule.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 397a253af5 ]
Currently, the calibration function that corrects the initial offsets
among multiple devices only works the first time. If the function is
called more than once, the calibration fails and bogus offsets will be
programmed into the devices.
In a well hidden spot, the device documentation tells that trigger indexes
0 and 1 are special in allowing the TRIG_IF_LATE flag to actually work.
This patch fixes the issue by using one of the special triggers during the
recalibration method.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 47cc84ce0c ]
When more than a multicast address is present in a MLDv2 report, all but
the first address is ignored, because the code breaks out of the loop if
there has not been an error adding that address.
This has caused failures when two guests connected through the bridge
tried to communicate using IPv6. Neighbor discoveries would not be
transmitted to the other guest when both used a link-local address and a
static address.
This only happens when there is a MLDv2 querier in the network.
The fix will only break out of the loop when there is a failure adding a
multicast address.
The mdb before the patch:
dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
dev ovirtmgmt port bond0.86 grp ff02::2 temp
After the patch:
dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
dev ovirtmgmt port bond0.86 grp ff02::fb temp
dev ovirtmgmt port bond0.86 grp ff02::2 temp
dev ovirtmgmt port bond0.86 grp ff02::d temp
dev ovirtmgmt port vnet0 grp ff02::1:ff00:76 temp
dev ovirtmgmt port bond0.86 grp ff02::16 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff00:77 temp
dev ovirtmgmt port bond0.86 grp ff02::1:ff00:def temp
dev ovirtmgmt port bond0.86 grp ff02::1:ffa1:40bf temp
Fixes: 08b202b672 ("bridge br_multicast: IPv6 MLD support.")
Reported-by: Rik Theys <Rik.Theys@esat.kuleuven.be>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Tested-by: Rik Theys <Rik.Theys@esat.kuleuven.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 44f6731d8b ]
The tx_curr_frame_payload field is u32. When we try to calculate a
small negative delta based on it, we end up with a positive integer
close to 2^32 instead. So the tx_bytes pointer increases by about
2^32 for every transmitted frame.
Fix by calculating the delta as a signed long.
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Reported-by: Florian Bruhin <me@the-compiler.org>
Fixes: 7a1e890e21 ("usbnet: Fix tx_bytes statistic running backward in cdc_ncm")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 381c759d99 ]
ip_error does not check if in_dev is NULL before dereferencing it.
IThe following sequence of calls is possible:
CPU A CPU B
ip_rcv_finish
ip_route_input_noref()
ip_route_input_slow()
inetdev_destroy()
dst_input()
With the result that a network device can be destroyed while processing
an input packet.
A crash was triggered with only unicast packets in flight, and
forwarding enabled on the only network device. The error condition
was created by the removal of the network device.
As such it is likely the that error code was -EHOSTUNREACH, and the
action taken by ip_error (if in_dev had been accessible) would have
been to not increment any counters and to have tried and likely failed
to send an icmp error as the network device is going away.
Therefore handle this weird case by just dropping the packet if
!in_dev. It will result in dropping the packet sooner, and will not
result in an actual change of behavior.
Fixes: 251da41301 ("ipv4: Cache ip_error() routes even when not forwarding.")
Reported-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
Tested-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
Signed-off-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c78e1746d3 ]
Vijay reported that a loop as simple as ...
while true; do
tc qdisc add dev foo root handle 1: prio
tc filter add dev foo parent 1: u32 match u32 0 0 flowid 1
tc qdisc del dev foo root
rmmod cls_u32
done
... will panic the kernel. Moreover, he bisected the change
apparently introducing it to 78fd1d0ab0 ("netlink: Re-add
locking to netlink_lookup() and seq walker").
The removal of synchronize_net() from the netlink socket
triggering the qdisc to be removed, seems to have uncovered
an RCU resp. module reference count race from the tc API.
Given that RCU conversion was done after e341694e3e ("netlink:
Convert netlink_lookup() to use RCU protected hash table")
which added the synchronize_net() originally, occasion of
hitting the bug was less likely (not impossible though):
When qdiscs that i) support attaching classifiers and,
ii) have at least one of them attached, get deleted, they
invoke tcf_destroy_chain(), and thus call into ->destroy()
handler from a classifier module.
After RCU conversion, all classifier that have an internal
prio list, unlink them and initiate freeing via call_rcu()
deferral.
Meanhile, tcf_destroy() releases already reference to the
tp->ops->owner module before the queued RCU callback handler
has been invoked.
Subsequent rmmod on the classifier module is then not prevented
since all module references are already dropped.
By the time, the kernel invokes the RCU callback handler from
the module, that function address is then invalid.
One way to fix it would be to add an rcu_barrier() to
unregister_tcf_proto_ops() to wait for all pending call_rcu()s
to complete.
synchronize_rcu() is not appropriate as under heavy RCU
callback load, registered call_rcu()s could be deferred
longer than a grace period. In case we don't have any pending
call_rcu()s, the barrier is allowed to return immediately.
Since we came here via unregister_tcf_proto_ops(), there
are no users of a given classifier anymore. Further nested
call_rcu()s pointing into the module space are not being
done anywhere.
Only cls_bpf_delete_prog() may schedule a work item, to
unlock pages eventually, but that is not in the range/context
of cls_bpf anymore.
Fixes: 25d8c0d55f ("net: rcu-ify tcf_proto")
Fixes: 9888faefe1 ("net: sched: cls_basic use RCU")
Reported-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 33b4b015e1 ]
Commit <5cf3d46192fc> ("udp: Simplify__udp*_lib_mcast_deliver")
simplified the filter for incoming IPv6 multicast but removed
the check of the local socket address and the UDP destination
address.
This patch restores the filter to prevent sockets bound to a IPv6
multicast IP to receive other UDP traffic link unicast.
Signed-off-by: Henning Rogge <hrogge@gmail.com>
Fixes: 5cf3d46192 ("udp: Simplify__udp*_lib_mcast_deliver")
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 21858cd02d ]
commit 1d13a96c74 ("ipv6: tcp: fix flowlabel value in ACK messages
send from TIME_WAIT") added the flow label in the last TCP packets.
Unfortunately, it was not casted properly.
This patch replace the buggy shift with be32_to_cpu/cpu_to_be32.
Fixes: 1d13a96c74 ("ipv6: tcp: fix flowlabel value in ACK messages")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ed2a80ab7b ]
Before the patch, the command 'ip link add bond2 type bond mode 802.3ad'
causes the kernel to send a rtnl message for the bond2 interface, with an
ifindex 0.
'ip monitor' shows:
0: bond2: <BROADCAST,MULTICAST,MASTER> mtu 1500 state DOWN group default
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
9: bond2@NONE: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN group default
link/ether ea:3e:1f:53:92:7b brd ff:ff:ff:ff:ff:ff
[snip]
The patch fixes the spotted bug by checking in bond driver if the interface
is registered before calling the notifier chain.
It also adds a check in rtmsg_ifinfo() to prevent this kind of bug in the
future.
Fixes: d4261e5650 ("bonding: create netlink event when bonding option is changed")
CC: Jiri Pirko <jiri@resnulli.us>
Reported-by: Julien Meunier <julien.meunier@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c0bb07df7d ]
The commit c5adde9468 ("netlink:
eliminate nl_sk_hash_lock") breaks the autobind retry mechanism
because it doesn't reset portid after a failed netlink_insert.
This means that should autobind fail the first time around, then
the socket will be stuck in limbo as it can never be bound again
since it already has a non-zero portid.
Fixes: c5adde9468 ("netlink: eliminate nl_sk_hash_lock")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7e14069651 ]
RGMII interfaces come in multiple flavors: RGMII with transmit or
receive internal delay, no delays at all, or delays in both direction.
This change extends the initial check for PHY_INTERFACE_MODE_RGMII to
cover all of these variants since EEE should be allowed for any of these
modes, since it is a property of the RGMII, hence Gigabit PHY capability
more than the RGMII electrical interface and its delays.
Fixes: a59a4d1921 ("phy: add the EEE support and the way to access to the MMD registers")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3f7352bf21 ]
x86 has variable length encoding. x86 JIT compiler is trying
to pick the shortest encoding for given bpf instruction.
While doing so the jump targets are changing, so JIT is doing
multiple passes over the program. Typical program needs 3 passes.
Some very short programs converge with 2 passes. Large programs
may need 4 or 5. But specially crafted bpf programs may hit the
pass limit and if the program converges on the last iteration
the JIT compiler will be producing an image full of 'int 3' insns.
Fix this corner case by doing final iteration over bpf program.
Fixes: 0a14842f5a ("net: filter: Just In Time compiler for x86-64")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 343f845b37 ]
FROM_BE16:
'ror %reg, 8' doesn't clear upper bits of the register,
so use additional 'movzwl' insn to zero extend 16 bits into 64
FROM_LE16:
should zero extend lower 16 bits into 64 bit
FROM_LE32:
should zero extend lower 32 bits into 64 bit
Fixes: 89aa075832 ("net: sock: allow eBPF programs to be attached to sockets")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d66bf7dd27 ]
The code in __netdev_upper_dev_link() has an over-stringent
loop detection logic that actually prevents valid configurations
from working correctly.
In particular, the logic returns an error if an upper device
is already in the list of all upper devices for a given dev.
This particular check seems to be a overzealous as it disallows
perfectly valid configurations. For example:
# ip l a link eth0 name eth0.10 type vlan id 10
# ip l a dev br0 typ bridge
# ip l s eth0.10 master br0
# ip l s eth0 master br0 <--- Will fail
If you switch the last two commands (add eth0 first), then both
will succeed. If after that, you remove eth0 and try to re-add
it, it will fail!
It appears to be enough to simply check adj_list to keeps things
safe.
I've tried stacking multiple devices multiple times in all different
combinations, and either rx_handler registration prevented the stacking
of the device linking cought the error.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Veaceslav Falico <vfalico@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc48e56d76 upstream.
exit_aio() currently serializes killing io contexts. Each context
killing ends up having to do percpu_ref_kill(), which in turns has
to wait for an RCU grace period. This can take a long time, depending
on the number of contexts. And there's no point in doing them serially,
when we could be waiting for all of them in one fell swoop.
This patches makes my fio thread offload test case exit 0.2s instead
of almost 6s.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45002267e8 upstream.
Crush temporary buffers are allocated as per replica size configured
by the user. When there are more final osds (to be selected as per
rule) than the replicas, buffer overlaps and it causes crash. Now, it
ensures that at most num-rep osds are selected even if more number of
osds are allowed by the rule.
Reflects ceph.git commits 6b4d1aa99718e3b367496326c1e64551330fabc0,
234b066ba04976783d15ff2abc3e81b6cc06fb10.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56ccc1125b upstream.
A recent change removed the need for locking around writing
to "sync_action" (and various other places), but introduced a
subtle race.
When e.g. setting 'reshape' on a 'frozen' array, the 'frozen'
flag is cleared before 'reshape' is set, so the md thread can
get in and start trying recovery - which isn't wanted.
So instead of clearing MD_RECOVERY_FROZEN for any command
except 'frozen', only clear it when each specific command
is parsed. This allows the handling of 'reshape' to clear
the bit while a lock is held.
Also remove some places where we set MD_RECOVERY_NEEDED,
as it is always set on non-error exit of the function.
Signed-off-by: NeilBrown <neilb@suse.de>
Fixes: 6791875e2e ("md: make reconfig_mutex optional for writes to md sysfs files.")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e70897d0e upstream.
The PWM hardware on Pistachio platform has a maximum timebase steps
value to 255. To fix it, let's introduce a compatible-specific
data structure to contain the SoC-specific details and use it to
specify a maximum timebase.
Also, let's limit the minimum timebase to 16 steps, to allow a sane
range of duty cycle steps.
Fixes: 277bb6a29e ("pwm: Imagination Technologies PWM DAC driver")
Signed-off-by: Naidu Tellapati <naidu.tellapati@imgtec.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98fb1ffd81 upstream.
Block drivers are responsible for calling flush_dcache_page() on each
BIO request. This operation keeps the I$ coherent with the D$ on
architectures that don't have hardware coherency support. Without this
flush, random crashes are seen when executing user programs from an ext4
filesystem backed by a ubiblock device.
This patch is based on the change implemented in commit 2d4dc890b5
("block: add helpers to run flush_dcache_page() against a bio and a
request's pages").
Fixes: 9d54c8a33e ("UBI: R/O block driver on top of UBI volumes")
Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 932df43005 upstream.
In case of error, the function devm_ioremap() returns NULL
not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.
Fixes: ecfe64d8c5 ("power: reset: Add AT91 reset driver")
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 161f873b89 upstream.
We used to read file_handle twice. Once to get the amount of extra
bytes, and once to fetch the entire structure.
This may be problematic since we do size verifications only after the
first read, so if the number of extra bytes changes in userspace between
the first and second calls, we'll have an incoherent view of
file_handle.
Instead, read the constant size once, and copy that over to the final
structure without having to re-read it again.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42e08c7836 upstream.
This patch sets the local memory size that is reported to userspace to 0.
This is done to make sure that userspace won't try to allocate local memory
for HSA.
As long as amdkfd doesn't support allocating local memory for HSA,
we need this patch.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15b94a6904 upstream.
dm-multipath accepts 0 path mapping.
# echo '0 2097152 multipath 0 0 0 0' | dmsetup create newdev
Such a mapping can be used to release underlying devices while still
holding requests in its queue until working paths come back.
However, once the multipath device is created over blk-mq devices,
it rejects reloading of 0 path mapping:
# echo '0 2097152 multipath 0 0 1 1 queue-length 0 1 1 /dev/sda 1' \
| dmsetup create mpath1
# echo '0 2097152 multipath 0 0 0 0' | dmsetup load mpath1
device-mapper: reload ioctl on mpath1 failed: Invalid argument
Command failed
With following kernel message:
device-mapper: ioctl: can't change device type after initial table load.
DM tries to inherit the current table type using dm_table_set_type()
but it doesn't work as expected because of unnecessary check about
whether the target type is hybrid or not.
Hybrid type is for targets that work as either request-based or bio-based
and not required for blk-mq or non blk-mq checking.
Fixes: 65803c2059 ("dm table: train hybrid target type detection to select blk-mq if appropriate")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c220c69ce upstream.
dm_merge_bvec() was originally added in f6fccb ("dm: introduce
merge_bvec_fn"). In that commit a value in sectors is converted to
bytes using << 9, and then assigned to an int. This code made
assumptions about the value of BIO_MAX_SECTORS.
A later commit 148e51 ("dm: improve documentation and code clarity in
dm_merge_bvec") was meant to have no functional change but it removed
the use of BIO_MAX_SECTORS in favor of using queue_max_sectors(). At
this point the cast from sector_t to int resulted in a zero value. The
fallout being dm_merge_bvec() would only allow a single page to be added
to a bio.
This interim fix is minimal for the benefit of stable@ because the more
comprehensive cleanup of passing a sector_t to all DM targets' merge
function will impact quite a few DM targets.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c0411d2fa upstream.
We have that bug for years and some users report side effects when fixing it on older hardware.
So revert it for VM_CONTEXT0_PAGE_TABLE_END_ADDR, but keep it for VM 1-15.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e7f43c41c upstream.
In
commit f02ad907cd
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Jan 22 16:36:23 2015 +0100
drm/atomic-helpers: Recover full cursor plane behaviour
we've added a hack to atomic helpers to never to vblank waits for
cursor updates through the legacy apis since that's what X expects.
Unfortunately we've (again) forgotten to adjust the transitional
helpers. Do this now.
This fixes regressions for drivers only partially converted over to
atomic (like i915).
Reported-by: Pekka Paalanen <ppaalanen@gmail.com>
Cc: Pekka Paalanen <ppaalanen@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 553452e5ff upstream.
In the case of a DMA mapping error on the last iteration of
the loop of the allocation of memory of the FW monitor we
indeed free the pages, but don't NULL out the page variable
thus allowing for the possibility of setting the FW monitor
variables with invalid data to use.
Fixes: c2d202017d ("iwlwifi: pcie: add firmware monitor capabilities")
Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2fc863a514 upstream.
fw_status is the only pointer pointing to a block of memory
allocated above and should be freed after use.
Note: this come from Klockwork static analyzer.
Fixes: 2021a89d7b ("iwlwifi: mvm: treat netdetect wake up separately")
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b9a5e5e18f upstream.
Since acpi_reserve_resources() is defined as a device_initcall(),
there's no guarantee that it will be executed in the right order
with respect to the rest of the ACPI initialization code. On some
systems this leads to breakage if, for example, the address range
that should be reserved for the ACPI fixed registers is given to
the PCI host bridge instead if the race is won by the wrong code
path.
Fix this by turning acpi_reserve_resources() into a void function
and calling it directly from within the ACPI initialization sequence.
Reported-and-tested-by: George McCollister <george.mccollister@gmail.com>
Link: http://marc.info/?t=143092384600002&r=1&w=2
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 74856fbf44 upstream.
256 bytes per sector support has been broken since 2.6.X,
and no-one stepped up to fix this.
So disable support for it.
Signed-off-by: Mark Hounschell <dmarkh@cfl.rr.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc45708ca9 upstream.
Set the SRB flags correctly when there is no data transfer. Without this
change some IHV drivers will fail valid commands such as TEST_UNIT_READY.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c0213d17a upstream.
When the v3 hardware sees more than one finger, it uses the semi-mt
protocol to report the touches. However, it currently works when
num_fingers is 0, 1 or 2, but when it is 3 and above, it sends only 1
finger as if num_fingers was 1.
This confuses userspace which knows how to deal with extra fingers
when all the slots are used, but not when some are missing.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=90101
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 620b155034 upstream.
Commit 46490b5725 ("MIPS: kernel: elf: Improve the overall ABI and FPU
mode checks") reworked the ELF FP ABI mode selection logic, but when
CONFIG_MIPS_O32_FP64_SUPPORT is enabled it breaks the use of binaries
which have no PT_MIPS_ABIFLAGS program header & associated
.MIPS.abiflags section.
A default mode is selected based upon whether the ELF contains MIPS32 or
MIPS64 code, but that selection is made in arch_elf_pt_proc.
arch_elf_pt_proc only executes when a PT_MIPS_ABIFLAGS program header is
found. If one is not found then arch_elf_pt_proc is never called, and no
default overall_fp_mode value is selected. When arch_check_elf is
called, both abi0 & abi1 are MIPS_ABI_FP_UNKNOWN which leads to both
prog_req & interp_req being set to none_req. none_req matches none of
the conditions for mode selection at the end of arch_check_elf, so
overall_fp_mode is left untouched. Finally once mips_set_personality_fp
is called the BUG() in the default case is then hit & the kernel likely
panics.
Fix this by moving the selection of a default overall mode to the start
of arch_check_elf, which runs once per ELF executed regardless of
whether it has a PT_MIPS_ABIFLAGS program header.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Matthew Fortune <matthew.fortune@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/9978/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5006c1052a upstream.
This reverts commit 3a61e97563.
The Logitech TK820 seems to be affected by a firmware bug which
delays the sending of the keys (pressed, or released, which triggers
a key-repeat) while holding fingers on the touch sensor.
This behavior can be observed while using the mouse emulation mode
if the user moves the finger while typing (highly improbable though).
Holding the finger still while in the mouse emulation mode does
not trigger the key repeat problem.
So better keep things in their previous state to not have to
explain users that the new key-repeat bug they see is a "feature".
Furthermore, I noticed that I disabled the media keys whith
this patch. Sorry, my bad.
I think it is best to revert the patch, in all the current
versions it has been shipped.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a81157768a upstream.
The variable "sector" in "raid0_make_request()" was improperly updated
by a call to "sector_div()" which modifies its first argument in place.
Commit 47d68979cc restored this variable
after the call for later re-use. Unfortunetly the restore was done after
the referenced variable "bio" was advanced. This lead to the original
value and the restored value being different. Here we move this line to
the proper place.
One observed side effect of this bug was discarding a file though
unlinking would cause an unrelated file's contents to be discarded.
Signed-off-by: NeilBrown <neilb@suse.de>
Fixes: 47d68979cc ("md/raid0: fix bug with chunksize not a power of 2.")
URL: https://bugzilla.kernel.org/show_bug.cgi?id=98501
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e9eac2dce upstream.
If any memory allocation in resize_stripes fails we will return
-ENOMEM, but in some cases we update conf->pool_size anyway.
This means that if we try again, the allocations will be assumed
to be larger than they are, and badness results.
So only update pool_size if there is no error.
This bug was introduced in 2.6.17 and the patch is suitable for
-stable.
Fixes: ad01c9e375 ("[PATCH] md: Allow stripes to be expanded in preparation for expanding an array")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c1ac56b51 upstream.
In function dmi_present(), dmi_walk_early() calls dmi_table(), which
calls dmi_decode(), which ultimately calls dmi_save_uuid(). This last
function makes a decision based on the value of global variable
dmi_ver. The problem is that this variable is set right _after_
dmi_walk_early() returns. So dmi_save_uuid() always sees dmi_ver == 0
regardless of the actual version implemented.
This causes /sys/class/dmi/id/product_uuid to always use the old
ordering even on systems implementing DMI/SMBIOS 2.6 or later, which
should use the new ordering.
This is broken since kernel v3.8 for legacy DMI implementations and
since kernel v3.10 for SMBIOS 2 implementations. SMBIOS 3
implementations with the 64-bit entry point are not affected.
The first breakage does not matter much as in practice legacy DMI
implementations are always for versions older than 2.6, which is when
the UUID ordering changed. The second breakage is more problematic as
it affects the vast majority of x86 systems manufactured since 2009.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 9f9c9cbb60 ("drivers/firmware/dmi_scan.c: fetch dmi version from SMBIOS if it exists")
Fixes: 79bae42d51 ("dmi_scan: refactor dmi_scan_machine(), {smbios,dmi}_present()")
Acked-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Artem Savkov <artem.savkov@gmail.com>
Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9507271d96 upstream.
In an environment where the KDC is running Active Directory, the
exported composite name field returned in the context could be large
enough to span a page boundary. Attaching a scratch buffer to the
decoding xdr_stream helps deal with those cases.
The case where we saw this was actually due to behavior that's been
fixed in newer gss-proxy versions, but we're fixing it here too.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebe9cb3bb1 upstream.
If we find a non-confirmed openowner we jump to exit the function, but do
not set an error value. Fix this by factoring out a helper to do the
check and properly set the error from nfsd4_validate_stateid.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40cdc7a530 upstream.
Commit df52699e4f ("NFSv4.1: Don't cache deviceids that have no
notifications") causes the Linux NFS client to stop caching deviceid's
unless a server pretends to support deviceid notifications. While this
behavior is stupid and the language around this area in rfc5661 is a
mess carified by an errata that I submittted, Trond insists on this
behavior. Not caching deviceids degrades block layout performance
massively as a GETDEVICEINFO is fairly expensive.
So add this hack to make the Linux client happy again.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b0dc2b9bb4 upstream.
NUMA balancing is meant to be disabled by default on UMA machines but
the check is using nr_node_ids (highest node) instead of
num_online_nodes (online nodes).
The consequences are that a UMA machine with a node ID of 1 or higher
will enable NUMA balancing. This will incur useless overhead due to
minor faults with the impact depending on the workload. These are the
impact on the stats when running a kernel build on a single node machine
whose node ID happened to be 1:
vanilla patched
NUMA base PTE updates 5113158 0
NUMA huge PMD updates 643 0
NUMA page range updates 5442374 0
NUMA hint faults 2109622 0
NUMA hint local faults 2109622 0
NUMA hint local percent 100 100
NUMA pages migrated 0 0
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 499611ed45 upstream.
root->ino_ida is used for kernfs inode number allocations. Since IDA has
a layered structure, different IDs can reside on the same layer, which
is currently accounted to some memory cgroup. The problem is that each
kmem cache of a memory cgroup has its own directory on sysfs (under
/sys/fs/kernel/<cache-name>/cgroup). If the inode number of such a
directory or any file in it gets allocated from a layer accounted to the
cgroup which the cache is created for, the cgroup will get pinned for
good, because one has to free all kmem allocations accounted to a cgroup
in order to release it and destroy all its kmem caches. That said we
must not account layers of ino_ida to any memory cgroup.
Since per net init operations may create new sysfs entries directly
(e.g. lo device) or indirectly (nf_conntrack creates a new kmem cache
per each namespace, which, in turn, creates new sysfs entries), an easy
way to reproduce this issue is by creating network namespace(s) from
inside a kmem-active memory cgroup.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f4fc071b1 upstream.
Not all kmem allocations should be accounted to memcg. The following
patch gives an example when accounting of a certain type of allocations to
memcg can effectively result in a memory leak. This patch adds the
__GFP_NOACCOUNT flag which if passed to kmalloc and friends will force the
allocation to go through the root cgroup. It will be used by the next
patch.
Note, since in case of kmemleak enabled each kmalloc implies yet another
allocation from the kmemleak_object cache, we add __GFP_NOACCOUNT to
gfp_kmemleak_mask.
Alternatively, we could introduce a per kmem cache flag disabling
accounting for all allocations of a particular kind, but (a) we would not
be able to bypass accounting for kmalloc then and (b) a kmem cache with
this flag set could not be merged with a kmem cache without this flag,
which would increase the number of global caches and therefore
fragmentation even if the memory cgroup controller is not used.
Despite its generic name, currently __GFP_NOACCOUNT disables accounting
only for kmem allocations while user page allocations are always charged.
To catch abusing of this flag, a warning is issued on an attempt of
passing it to mem_cgroup_try_charge.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d045c77c1a upstream.
On architectures where the stack grows upwards (CONFIG_STACK_GROWSUP=y,
currently parisc and metag only) stack randomization sometimes leads to crashes
when the stack ulimit is set to lower values than STACK_RND_MASK (which is 8 MB
by default if not defined in arch-specific headers).
The problem is, that when the stack vm_area_struct is set up in fs/exec.c, the
additional space needed for the stack randomization (as defined by the value of
STACK_RND_MASK) was not taken into account yet and as such, when the stack
randomization code added a random offset to the stack start, the stack
effectively got smaller than what the user defined via rlimit_max(RLIMIT_STACK)
which then sometimes leads to out-of-stack situations and crashes.
This patch fixes it by adding the maximum possible amount of memory (based on
STACK_RND_MASK) which theoretically could be added by the stack randomization
code to the initial stack size. That way, the user-defined stack size is always
guaranteed to be at minimum what is defined via rlimit_max(RLIMIT_STACK).
This bug is currently not visible on the metag architecture, because on metag
STACK_RND_MASK is defined to 0 which effectively disables stack randomization.
The changes to fs/exec.c are inside an "#ifdef CONFIG_STACK_GROWSUP"
section, so it does not affect other platformws beside those where the
stack grows upwards (parisc and metag).
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: linux-parisc@vger.kernel.org
Cc: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b97937246 upstream.
Josh Stone reports:
I've discovered a case where both arm and arm64 will miss a ptrace
syscall-exit that they should report. If the syscall is entered
without TIF_SYSCALL_TRACE set, then it goes on the fast path. It's
then possible to have TIF_SYSCALL_TRACE added in the middle of the
syscall, but ret_fast_syscall doesn't check this flag again.
Fix this by always checking for a syscall trace in the fast exit path.
Reported-by: Josh Stone <jistone@redhat.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 242ddf0429 upstream.
This patch sets display clock correctly. If Display clock isn't set
correctly then you would find below messages and Display controller
doesn't work correctly.
exynos-drm: No connectors reported connected with modes
[drm] Cannot find any crtc or sizes - going 1024x768
Fixes: abc0b1447d ("drm: Perform basic sanity checks on probed modes")
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e46b5a6470 upstream.
The i.MX27 dtb build should be controlled by CONFIG_SOC_IMX27 rather
than CONFIG_SOC_IMX31.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Fixes: cb612390e5 ("ARM: dts: Only build dtb if associated Arch and/or SoC is enabled")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a29ef819f3 upstream.
According to the imx27 documentation, fec has a 4 Kbyte
memory space map. Moreover, the actual 16 Kbyte mapping
overlaps the SCC (Security Controller) memory register
space. So, we reduce the memory register space to 4 Kbyte.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: 9f0749e3eb ("ARM i.MX27: Add devicetree support")
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b7dc0ff95 upstream.
ERR_PTR was dereferenced during sub domain parsing, if parent domain
could not be obtained (because of invalid phandle or deferred
registration of parent domain).
The Exynos power domain code checked whether
of_genpd_get_from_provider() returned NULL and in that case it skipped
that power domain node. However this function returns ERR_PTR or valid
pointer, not NULL.
Fixes: 0f7807518f ("ARM: EXYNOS: add support for sub-power domains")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 965278dcb8 upstream.
At boot time we round the memblock limit down to section size in an
attempt to ensure that we will have mapped this RAM with section
mappings prior to allocating from it. When mapping RAM we iterate over
PMD-sized chunks, creating these section mappings.
Section mappings are only created when the end of a chunk is aligned to
section size. Unfortunately, with classic page tables (where PMD_SIZE is
2 * SECTION_SIZE) this means that if a chunk is between 1M and 2M in
size the first 1M will not be mapped despite having been accounted for
in the memblock limit. This has been observed to result in page tables
being allocated from unmapped memory, causing boot-time hangs.
This patch modifies the memblock limit rounding to always round down to
PMD_SIZE instead of SECTION_SIZE. For classic MMU this means that we
will round the memblock limit down to a 2M boundary, matching the limits
on section mappings, and preventing allocations from unmapped memory.
For LPAE there should be no change as PMD_SIZE == SECTION_SIZE.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Stefan Agner <stefan@agner.ch>
Tested-by: Stefan Agner <stefan@agner.ch>
Acked-by: Laura Abbott <labbott@redhat.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0782e63bc6 upstream.
Ronny reported that the following scenario is not handled correctly:
T1 (prio = 10)
lock(rtmutex);
T2 (prio = 20)
lock(rtmutex)
boost T1
T1 (prio = 20)
sys_set_scheduler(prio = 30)
T1 prio = 30
....
sys_set_scheduler(prio = 10)
T1 prio = 30
The last step is wrong as T1 should now be back at prio 20.
Commit c365c292d0 ("sched: Consider pi boosting in setscheduler()")
only handles the case where a boosted tasks tries to lower its
priority.
Fix it by taking the new effective priority into account for the
decision whether a change of the priority is required.
Reported-by: Ronny Meeus <ronny.meeus@gmail.com>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Fixes: c365c292d0 ("sched: Consider pi boosting in setscheduler()")
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1505051806060.4225@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7cded342c0 upstream.
Git commit 152125b7a8
"s390/mm: implement dirty bits for large segment table entries"
broke the pmd_pfn function, it changed the return value from
'unsigned long' to 'int'. This breaks all machine configurations
with memory above the 8TB line.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22d3a3c829 upstream.
No matter how the driver manages its NAPI context, there's no way
sending frames to it from a timer can be correct, since it would
corrupt the internal GRO lists.
To avoid that, always use the non-NAPI path when releasing frames
from the timer.
Reported-by: Jean Trivelly <jean.trivelly@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 47b4e1fc49 upstream.
Remove checking tailroom when adding IV as it uses only
headroom, and move the check to the ICV generation that
actually needs the tailroom.
In other case I hit such warning and datapath don't work,
when testing:
- IBSS + WEP
- ath9k with hw crypt enabled
- IPv6 data (ping6)
WARNING: CPU: 3 PID: 13301 at net/mac80211/wep.c:102 ieee80211_wep_add_iv+0x129/0x190 [mac80211]()
[...]
Call Trace:
[<ffffffff817bf491>] dump_stack+0x45/0x57
[<ffffffff8107746a>] warn_slowpath_common+0x8a/0xc0
[<ffffffff8107755a>] warn_slowpath_null+0x1a/0x20
[<ffffffffc09ae109>] ieee80211_wep_add_iv+0x129/0x190 [mac80211]
[<ffffffffc09ae7ab>] ieee80211_crypto_wep_encrypt+0x6b/0xd0 [mac80211]
[<ffffffffc09d3fb1>] invoke_tx_handlers+0xc51/0xf30 [mac80211]
[...]
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a1cae34e23 upstream.
Multitheaded tests showed that the icv buffer in the current ghash
implementation is not handled correctly. A move of this working ghash
buffer value to the descriptor context fixed this. Code is tested and
verified with an multithreaded application via af_alg interface.
Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Gerald Schaefer <geraldsc@linux.vnet.ibm.com>
Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f230e8ffc0 upstream.
This patch fixes an inverted return value of the gpio get_direction
function.
The wrong value causes the direction sysfs entry and GPIO debugfs file
to indicate incorrect GPIO direction settings. In some cases it also
prevents setting GPIO output values.
The problem is also present in all other stable kernel versions since
linux-3.12.
Reported-by: Jochen Henneberg <jh@henneberg-systemdesign.com>
Signed-off-by: Michael Brunner <michael.brunner@kontron.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12833bacf5 upstream.
This code calls cpu_resume() using a straight branch (b), so
now that we have moved cpu_resume() back to .text, this should
be moved there as well. Any direct references to symbols that will
remain in the .data section are replaced with explicit PC-relative
references.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e4df6b720 upstream.
Consider "(u64)insn1.imm << 32 | imm" in the arm64 JIT. Since imm is
signed 32-bit, it is sign-extended to 64-bit, losing the high 32 bits.
The fix is to convert imm to u32 first, which will be zero-extended to
u64 implicitly.
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 30d3d94cc3 ("arm64: bpf: add 'load 64-bit immediate' instruction")
Signed-off-by: Xi Wang <xi.wang@gmail.com>
[will: removed non-arm64 bits and redundant casting]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a9324d396 upstream.
The queued TRIM problems appear to be generic to Samsung's firmware and
not tied to a particular model. A recent update to the 840 EVO firmware
introduced the same issue as we saw on 850 Pro.
Blacklist queued TRIM on all 800-series drives while we work this issue
with Samsung.
Reported-by: Günter Waller <g.wal@web.de>
Reported-by: Sven Köhler <sven.koehler@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09c5b4803a upstream.
When the LPM policy is set to ATA_LPM_MAX_POWER, the device might
generate a spurious PHY event that cuases errors on the link.
Ignore this event if it occured within 10s after the policy change.
The timeout was chosen observing that on a Dell XPS13 9333 these
spurious events can occur up to roughly 6s after the policy change.
Link: http://lkml.kernel.org/g/3352987.ugV1Ipy7Z5@xps13
Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8393b811f3 upstream.
This is a preparation commit that will allow to add other criteria
according to which PHY events should be dropped.
Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dbfe8ef559 upstream.
Avoton AHCI occasionally sees drive probe timeouts at driver load time.
When this happens SCR_STATUS indicates device detected, but no D2H FIS
reception. Reset the internal link state machines by bouncing
port-enable in the PCS register when this occurs.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e531d0bceb upstream.
The journal revoke block recovery code does not check r_count for
sanity, which means that an evil value of r_count could result in
the kernel reading off the end of the revoke table and into whatever
garbage lies beyond. This could crash the kernel, so fix that.
However, in testing this fix, I discovered that the code to write
out the revoke tables also was not correctly checking to see if the
block was full -- the current offset check is fine so long as the
revoke table space size is a multiple of the record size, but this
is not true when either journal_csum_v[23] are set.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f974865ff upstream.
The following commit introduced a bug when checking for zero length extent
5946d08 ext4: check for overlapping extents in ext4_valid_extent_entries()
Zero length extent could pass the check if lblock is zero.
Adding the explicit check for zero length back.
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d50659406 upstream.
Currently when journal restart fails, we'll have the h_transaction of
the handle set to NULL to indicate that the handle has been effectively
aborted. We handle this situation quietly in the jbd2_journal_stop() and just
free the handle and exit because everything else has been done before we
attempted (and failed) to restart the journal.
Unfortunately there are a number of problems with that approach
introduced with commit
41a5b91319 "jbd2: invalidate handle if jbd2_journal_restart()
fails"
First of all in ext4 jbd2_journal_stop() will be called through
__ext4_journal_stop() where we would try to get a hold of the superblock
by dereferencing h_transaction which in this case would lead to NULL
pointer dereference and crash.
In addition we're going to free the handle regardless of the refcount
which is bad as well, because others up the call chain will still
reference the handle so we might potentially reference already freed
memory.
Moreover it's expected that we'll get aborted handle as well as detached
handle in some of the journalling function as the error propagates up
the stack, so it's unnecessary to call WARN_ON every time we get
detached handle.
And finally we might leak some memory by forgetting to free reserved
handle in jbd2_journal_stop() in the case where handle was detached from
the transaction (h_transaction is NULL).
Fix the NULL pointer dereference in __ext4_journal_stop() by just
calling jbd2_journal_stop() quietly as suggested by Jan Kara. Also fix
the potential memory leak in jbd2_journal_stop() and use proper
handle refcounting before we attempt to free it to avoid use-after-free
issues.
And finally remove all WARN_ON(!transaction) from the code so that we do
not get random traces when something goes wrong because when journal
restart fails we will get to some of those functions.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f4d855839 upstream.
We had a fencepost error in the lazytime optimization which means that
timestamp would get written to the wrong inode.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1a48632ffe upstream.
A read() from a pty master may mistakenly indicate EOF (errno == -EIO)
after the pty slave has closed, even though input data remains to be read.
For example,
pty slave | input worker | pty master
| |
| | n_tty_read()
pty_write() | | input avail? no
add data | | sleep
schedule worker --->| | .
|---> flush_to_ldisc() | .
pty_close() | fill read buffer | .
wait for worker | wakeup reader --->| .
| read buffer full? |---> input avail ? yes
|<--- yes - exit worker | copy 4096 bytes to user
TTY_OTHER_CLOSED <---| |<--- kick worker
| |
**** New read() before worker starts ****
| | n_tty_read()
| | input avail? no
| | TTY_OTHER_CLOSED? yes
| | return -EIO
Several conditions are required to trigger this race:
1. the ldisc read buffer must become full so the input worker exits
2. the read() count parameter must be >= 4096 so the ldisc read buffer
is empty
3. the subsequent read() occurs before the kicked worker has processed
more input
However, the underlying cause of the race is that data is pipelined, while
tty state is not; ie., data already written by the pty slave end is not
yet visible to the pty master end, but state changes by the pty slave end
are visible to the pty master end immediately.
Pipeline the TTY_OTHER_CLOSED state through input worker to the reader.
1. Introduce TTY_OTHER_DONE which is set by the input worker when
TTY_OTHER_CLOSED is set and either the input buffers are flushed or
input processing has completed. Readers/polls are woken when
TTY_OTHER_DONE is set.
2. Reader/poll checks TTY_OTHER_DONE instead of TTY_OTHER_CLOSED.
3. A new input worker is started from pty_close() after setting
TTY_OTHER_CLOSED, which ensures the TTY_OTHER_DONE state will be
set if the last input worker is already finished (or just about to
exit).
Remove tty_flush_to_ldisc(); no in-tree callers.
Fixes: 52bce7f8d4 ("pty, n_tty: Simplify input processing on final close")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96311
BugLink: http://bugs.launchpad.net/bugs/1429756
Reported-by: Andy Whitcroft <apw@canonical.com>
Reported-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f9cfeed3e upstream.
when gsmtty_remove put dlci, it will cause memory leak if dlci->port's refcount is zero.
So we do the cleanup work in .cleanup callback instead.
dlci will be last put in two call chains.
1) gsmld_close -> gsm_cleanup_mux -> gsm_dlci_release -> dlci_put
2) gsmld_remove -> dlci_put
so there is a race. the memory leak depends on the race.
In call chain 2. we hit the memory leak. below comment tells.
release_tty -> tty_driver_remove_tty -> gsmtty_remove -> dlci_put -> tty_port_destructor (WARN_ON(port->itty) and return directly)
|
tty->port->itty = NULL;
|
tty_kref_put ---> release_one_tty -> gsmtty_cleanup (added by our patch)
So our patch fix the memory leak by doing the cleanup work after tty core did.
Signed-off-by: Pan Xinhui <xinhuix.pan@intel.com>
Fixes: dfabf7ffa3
Acked-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60c8f783a1 upstream.
clkdiv is declared as an u32 but it can be set to a negative value
causing a huge divisor value. Change its type to int to avoid this case.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e95235ccd upstream.
Recent toolchains force the TOC to be 256 byte aligned. We need
to enforce this alignment in our linker script, otherwise pointers
to our TOC variables (__toc_start, __prom_init_toc_start) could
be incorrect.
If they are bad, we die a few hundred instructions into boot.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ffb2d78eca upstream.
Before 69111bac42 ("powerpc: Replace __get_cpu_var uses"), in
save_mce_event, index got the value of mce_nest_count, and
mce_nest_count was incremented *after* index was set.
However, that patch changed the behaviour so that mce_nest count was
incremented *before* setting index.
This causes an off-by-one error, as get_mce_event sets index as
mce_nest_count - 1 before reading mce_event. Thus get_mce_event reads
bogus data, causing warnings like
"Machine Check Exception, Unknown event version 0 !"
and breaking MCEs handling.
Restore the old behaviour and unbreak MCE handling by subtracting one
from the newly incremented value.
The same broken change occured in machine_check_queue_event (which set
a queue read by machine_check_process_queued_event). Fix that too,
unbreaking printing of MCE information.
Fixes: 69111bac42 ("powerpc: Replace __get_cpu_var uses")
CC: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
CC: Christoph Lameter <cl@linux.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c735ed74d8 upstream.
Added the USB serial console device ID for KCF Technologies PRN device
which has a USB port for its serial console.
Signed-off-by: Mark Edwards <sonofaforester@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 82ee3aeb92 upstream.
Samsung has just released a portable USB3 SSD, coming in a very small
and nice form factor. It's USB ID is 04e8:8001, which unfortunately is
already used by the Palm Visor driver for the Samsung I330 phone cradle.
Having pl2303 or visor pick up this device ID results in conflicts with
the usb-storage driver, which handles the newly released portable USB3
SSD.
To work around this conflict, I've dug up a mailing list post [1] from a
long time ago, in which a user posts the full USB descriptor
information. The most specific value in this appears to be the interface
class, which has value 255 (0xff). Since usb-storage requires an
interface class of 0x8, I believe it's correct to disambiguate the two
devices by matching on 0xff inside visor.
[1] http://permalink.gmane.org/gmane.linux.usb.user/4264
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 948fa13504 upstream.
If the xHCI host controller has died (ie, device removed) or suffered
other serious fatal error (STS_FATAL), then xhci_irq should handle this
condition with IRQ_HANDLED instead of -ESHUTDOWN.
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 18cc2f4cbb upstream.
Our event ring consists of only one segment, and we risk filling
the event ring in case we get isoc transfers with short intervals
such as webcams that fill a TD every microframe (125us)
With 64 TRB segment size one usb camera could fill the event ring in 8ms.
A setup with several cameras and other devices can fill up the
event ring as it is shared between all devices.
This has occurred when uvcvideo queues 5 * 32TD URBs which then
get cancelled when the video mode changes. The cancelled URBs are returned
in the xhci interrupt context and blocks the interrupt handler from
handling the new events.
A full event ring will block xhci from scheduling traffic and affect all
devices conneted to the xhci, will see errors such as Missed Service
Intervals for isoc devices, and and Split transaction errors for LS/FS
interrupt devices.
Increasing the TRB_PER_SEGMENT will also increase the default endpoint ring
size, which is welcome as for most isoc transfer we had to dynamically
expand the endpoint ring anyway to be able to queue the 5 * 32TDs uvcvideo
queues.
The default size used to be 64 TRBs per segment
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d104d0152a upstream.
Isoc TDs usually consist of one TRB, sometimes two. When all goes well we
receive only one success event for a TD, and move the dequeue pointer to
the next TD.
This fails if the TD consists of two TRBs and we get a transfer error
on the first TRB, we will then see two events for that TD.
Fix this by making sure the event we get is for the last TRB in that TD
before moving the dequeue pointer to the next TD. This will resolve some
of the uvc and dvb issues with the
"ERROR Transfer event TRB DMA ptr not part of current TD" error message
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 664a5c1d1e upstream.
This function selects page 1 and cause intermittent problems on
interrupt handler.
lock call with spin_lock_irqsave.
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d65d2b25d2 upstream.
The state of m_td0TD0.f1Owner should change after the buff_addr
has been filled otherwise the device grabs the buffer too early.
m_td0TD0.f1Owner is protected by memory barriers on both sides
of change.
iTDUsed is best incremented after MACvTransmit.
It appears that f1Owner actually polls to do the memory transfer.
A back port patch will be needed for v3.19
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad3fee9b17 upstream.
Currently only TD_FLAGS_NETIF_SKB are reported back to mac80211.
Move vnt_int_report_rate to report all frame types.
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3fa0917beb upstream.
TD_FLAGS_NETIF_SKB is only for data.
Fixes issue of ack frames not being reported.
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b23f14302e upstream.
Information for packet type is in ieee80211_tx_info
band IEEE80211_BAND_5GHZ for PK_TYPE_11A.
IEEE80211_TX_RC_USE_CTS_PROTECT via tx_rate flags selects PK_TYPE_11GB
This ensures that the packet is always the right type.
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 892c89d5d7 upstream.
Fix regression introduced by commit <29ef8a53542a>. After it writing
AT commands to /dev/GCT-ATM0 is unsuccessful (no echo, no response)
and dmesg show "gdmtty: invalid payload : 1 16 f011".
Before that commit value of dummy_cnt was only a padding size. After using
ALIGN() this value is increased by its first argument. So the following
usage of this variable needs correction.
Signed-off-by: Sławomir Demeszko <s.demeszko@wireless-instruments.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ec04847c0c upstream.
The string iwpm_ulib_name is recorded in a nlmsg as a netlink attribute.
Without this fix parsing of the nlmsg by the userspace port mapper service fails
because of unknown attribute length, causing the port mapper service not to
register the client, which has sent the nlmsg.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fdb6eb0a12 upstream.
When there is prefix specified, currently we will add this prefix in
widget->name, but not in widget->sname.
it causes failure at snd_soc_dapm_link_dai_widgets:
if (!w->sname || !strstr(w->sname, dai_w->name))
because dai_w->name has prefix added, but w->sname does not.
We should also add prefix for stream name
Signed-off-by: Koro Chen <koro.chen@mediatek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 810e4425c2 upstream.
set_dai_fmt_both() callback is called from snd_soc_runtime_set_dai_fmt()
which is called from snd_soc_register_card(), but at this time codec
is not powered on yet. Replace direct i2c write with regcache write.
Fixes: 5f0acedddf (ASoC: rx1950_uda1380: Use static DAI format setup)
Fixes: 5cc10b9b77 (ASoC: h1940_uda1380: Use static DAI format setup)
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 545774bd6e upstream.
mc13xxx_reg_rmw() won't change any bit if passing 0 to the mask field.
Pass AUDIO_SSI_SEL instead of 0 for the mask field to set AUDIO_SSI_SEL
bit.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3530febb5c upstream.
This reverts commit 7290006d8c.
Through the regression report, it was revealed that the
tpacpi_led_set() call to thinkpad_acpi helper doesn't only toggle the
mute LED but actually mutes the sound. This is contradiction to the
expectation, and rather confuses user.
According to Henrique, it's not trivial to judge which TP model
behaves "LED-only" and which model does whatever more intrusive, as
Lenovo's implementations vary model by model. So, from the safety
reason, we should revert the patch for now.
Reported-by: Martin Steigerwald <martin@lichtvoll.de>
Cc: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09ea997677 upstream.
The Lenovo ThinkPad L450 requires the ALC292_FIXUP_TPT440_DOCK fix in
order to get sound output on the docking stations audio port.
This patch was tested using a ThinkPad L450 (20DSS00B00) using kernel
4.0.3 and a ThinkPad Pro Dock.
Signed-off-by: Ansgar Hegerfeld <linux@hegerfeld.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37815bf866 upstream.
The module notifier call chain for MODULE_STATE_COMING was moved up before
the parsing of args, into the complete_formation() call. But if the module failed
to load after that, the notifier call chain for MODULE_STATE_GOING was
never called and that prevented the users of those call chains from
cleaning up anything that was allocated.
Link: http://lkml.kernel.org/r/554C52B9.9060700@gmail.com
Reported-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Fixes: 4982223e51 "module: set nx before marking module MODULE_STATE_COMING"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2159184ea0 upstream.
when we find that a child has died while we'd been trying to ascend,
we should go into the first live sibling itself, rather than its sibling.
Off-by-one in question had been introduced in "deal with deadlock in
d_walk()" and the fix needs to be backported to all branches this one
has been backported to.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f18c34e483 upstream.
If the specified maximum length of the string is a multiple of unsigned
long, we would load one long behind the specified maximum. If that
happens to be in a next page, we can hit a page fault although we were
not expected to.
Fix the off-by-one bug in the test whether we are at the end of the
specified range.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2a5d46b16 upstream.
Before commit 035a61c314 ("clk: Make clk API return per-user
struct clk instances") we acquired the enable_lock in
__clk_set_parent_{before,after}() by means of calling
clk_enable(). After commit 035a61c314 we use clk_core_enable()
in place of the clk_enable(), and clk_core_enable() doesn't
acquire the enable_lock. This opens up a race condition between
clk_set_parent() and clk_enable(). Fix it.
Fixes: 035a61c314 ("clk: Make clk API return per-user struct clk instances")
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Dong Aisheng <aisheng.dong@freescale.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 97372e5a99 upstream.
Commit ae43b32891 ("ARM: 8202/1: dmaengine: pl330: Add runtime Power
Management support v12") added pm support for the pl330 dma driver but
it makes the clock for the Exynos5420 MDMA0 DMA controller to be gated
during suspend and this in turn makes its parent clock aclk266_g2d to
be gated. But the clock needs to be ungated prior suspend to allow the
system to be suspend and resumed correctly.
Add GATE_BUS_TOP register to the list of registers to be restored when
the system enters into a suspend state so aclk266_g2d will be ungated.
Thanks to Abhilash Kesavan for figuring out that this was the issue.
Fixes: ae43b32 ("ARM: 8202/1: dmaengine: pl330: Add runtime Power Management support v12")
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Tested-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Abhilash Kesavan <a.kesavan@samsung.com>
Acked-by: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7bd6dc320 upstream.
The following error message is seen when loading the nct6683 driver
with DEBUG_LOCK_ALLOC enabled.
BUG: key ffff88040b2f0030 not in .data!
------------[ cut here ]------------
WARNING: CPU: 0 PID: 186 at kernel/locking/lockdep.c:2988
lockdep_init_map+0x469/0x630()
DEBUG_LOCKS_WARN_ON(1)
Caused by a missing call to sysfs_attr_init() when initializing
sysfs attributes.
Reported-by: Alexey Orishko <alexey.orishko@gmail.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b63bf6172 upstream.
The following error message is seen when loading the nct6775 driver
with DEBUG_LOCK_ALLOC enabled.
BUG: key ffff88040b2f0030 not in .data!
------------[ cut here ]------------
WARNING: CPU: 0 PID: 186 at kernel/locking/lockdep.c:2988
lockdep_init_map+0x469/0x630()
DEBUG_LOCKS_WARN_ON(1)
Caused by a missing call to sysfs_attr_init() when initializing
sysfs attributes.
Reported-by: Alexey Orishko <alexey.orishko@gmail.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9aecac04d8 upstream.
I2C address 0x37 may be used by EEPROMs, which can result in false
positives. Do not attempt to detect a chip at this address.
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 97ffae1d30 upstream.
The VREFN channel is bipolar, not unipolar. Small negative values do
occur (e.g., -1mV), and unsigned conversion maps them incorrectly to
large positive values (about +1V), so fix this.
Signed-off-by: Thomas Betker <thomas.betker@rohde-schwarz.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f29b212edb upstream.
According to hardware team there should be some delay after
setting channel number, start mode and before setting START.
Add a one microsecond delay for this purpose.
Fixes: 1664f6a5b0 ("iio: adc: Cosmic Circuits 10001 ADC driver")
Signed-off-by: Naidu Tellapati <naidu.tellapati@imgtec.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 713276ea88 upstream.
At present we are incorrectly setting the register to 0x1 to power up
the ADC. Since it is an active high power down register, we need to set
the register to 0x0 to actually power up. Conversely, writing 0x1 to the
register powers it down.
This commit adds a couple of helpers to make the code clearer and then
use them to do the power-up/power-down properly.
Fixes: 1664f6a5b0 ("iio: adc: Cosmic Circuits 10001 ADC driver")
Signed-off-by: Naidu Tellapati <naidu.tellapati@imgtec.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13415a998a upstream.
When some of the ADC channels are reserved for remote CPUs,
the scan index and the corresponding channel number doesn't
match. This leads to convesion on the incorrect channel during
triggered capture.
Fix this by using a scan index to channel mapping encoded
in the iio_chan_spec for this purpose while starting conversion
on a particular ADC channel in trigger handler.
Also, the channel_map is not really used anywhere but in probe(), so
no need to keep track of it. Remove it from device structure.
While here, add 1 to number of channels to register timestamp channel
with the IIO core.
Fixes: 1664f6a5b0 ("iio: adc: Cosmic Circuits 10001 ADC driver")
Signed-off-by: Naidu Tellapati <naidu.tellapati@imgtec.com>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 937125aca0 upstream.
With 'dx' equal to 0.625V and 15 bit ADC, calculations overflow
when difference against GND is ~20% of the ADC range. Fix this.
Signed-off-by: Ivan T. Ivanov <ivan.ivanov@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c2aab3d58b upstream.
Currently in_proximity_(null)_raw is getting presented as raw sysfs
attribute. Same with the scan_elements.
The modifier doesn't apply to this channel.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e71c04f86 upstream.
In SPI mode the transfer buffer is locked with a mutex. However this
mutex is only initilized after the probe, but some transfer needs to
be done in the probe.
To fix this bug we move the mutex initialization at the beginning of
the device probe.
Signed-off-by: Alban Bedel <alban.bedel@avionic-design.de>
Acked-by: Denis Ciocca <denis.ciocca@st.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0716b0ea4 upstream.
Commit 65de7654d3 ("iio: iio: Fix iio_channel_read return if
channel havn't info") added a check for valid info masks.
This patch adds missing channel info masks for all ADC channels.
Otherwise, iio_read_channel_raw() would return -EINVAL when called
by consumer drivers.
Note that the change of _processed to _raw actually fixes an ABI abuse
in the original driver where it was used to avoid some special handling
rather than because it was correct.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit adba657533 upstream.
When configured via device tree, the associated iio device needs to be
measuring voltage for the conversion to resistance to be correct.
Return -EINVAL if that is not the case.
Signed-off-by: Chris Lesiak <chris.lesiak@licor.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 77bb3dfdc0 upstream.
A non-percpu VIRQ (e.g., VIRQ_CONSOLE) may be freed on a different
VCPU than it is bound to. This can result in a race between
handle_percpu_irq() and removing the action in __free_irq() because
handle_percpu_irq() does not take desc->lock. The interrupt handler
sees a NULL action and oopses.
Only use the percpu chip/handler for per-CPU VIRQs (like VIRQ_TIMER).
# cat /proc/interrupts | grep virq
40: 87246 0 xen-percpu-virq timer0
44: 0 0 xen-percpu-virq debug0
47: 0 20995 xen-percpu-virq timer1
51: 0 0 xen-percpu-virq debug1
69: 0 0 xen-dyn-virq xen-pcpu
74: 0 0 xen-dyn-virq mce
75: 29 0 xen-dyn-virq hvc_console
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a96295965b upstream.
If while setting a block group read-only we end up allocating a system
chunk, through check_system_chunk(), we were not doing it while holding
the chunk mutex which is a problem if a concurrent chunk allocation is
happening, through do_chunk_alloc(), as it means both block groups can
end up using the same logical addresses and physical regions in the
device(s). So make sure we hold the chunk mutex.
Fixes: 2f0810880f ("btrfs: delete chunk allocation attemp when
setting block group ro")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 521a04d06a upstream.
This reverts commit ba9d114ec5.
.. which introduced a regression that prevented all lingering requests
requeued in kick_requests() from ever being sent to the OSDs, resulting
in a lot of missed notifies. In retrospect it's pretty obvious that
r_req_lru_item item in the case of lingering requests can be used not
only for notarget, but also for unsent linkage due to how tightly
actual map and enqueue operations are coupled in __map_request().
The assertion that was being silenced is taken care of in the previous
("libceph: request a new osdmap if lingering request maps to no osd")
commit: by always kicking homeless lingering requests we ensure that
none of them ends up on the notarget list outside of the critical
section guarded by request_mutex.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b049453221 upstream.
This commit does two things. First, if there are any homeless
lingering requests, we now request a new osdmap even if the osdmap that
is being processed brought no changes, i.e. if a given lingering
request turned homeless in one of the previous epochs and remained
homeless in the current epoch. Not doing so leaves us with a stale
osdmap and as a result we may miss our window for reestablishing the
watch and lose notifies.
MON=1 OSD=1:
# cat linger-needmap.sh
#!/bin/bash
rbd create --size 1 test
DEV=$(rbd map test)
ceph osd out 0
rbd map dne/dne # obtain a new osdmap as a side effect (!)
sleep 1
ceph osd in 0
rbd resize --size 2 test
# rbd info test | grep size -> 2M
# blockdev --getsize $DEV -> 1M
N.B.: Not obtaining a new osdmap in between "osd out" and "osd in"
above is enough to make it miss that resize notify, but that is a
bug^Wlimitation of ceph watch/notify v1.
Second, homeless lingering requests are now kicked just like those
lingering requests whose mapping has changed. This is mainly to
recognize that a homeless lingering request makes no sense and to
preserve the invariant that a registered lingering request is not
sitting on any of r_req_lru_item lists. This spares us a WARN_ON,
which commit ba9d114ec5 ("libceph: clear r_req_lru_item in
__unregister_linger_request()") tried to fix the _wrong_ way.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e0c21530fa upstream.
Fix broken probe of da9052 regulators, which since commit b3f6c73db7
("mfd: da9052-core: Fix platform-device id collision") use a
non-deterministic platform-device id to retrieve static regulator
information. Fortunately, adequate error handling was in place so probe
would simply fail with an error message.
Update the mfd-cell ids to be zero-based and use those to identify the
cells when probing the regulator devices.
Fixes: b3f6c73db7 ("mfd: da9052-core: Fix platform-device id collision")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cc6f67bcaf upstream.
OpenWRT folks reported that overlayfs fails to mount if upper fs is full,
because workdir can't be created. Wordir creation can fail for various
other reasons too.
There's no reason that the mount itself should fail, overlayfs can work
fine without a workdir, as long as the overlay isn't modified.
So mount it read-only and don't allow remounting read-write.
Add a couple of WARN_ON()s for the impossible case of workdir being used
despite being read-only.
Reported-by: Bastian Bittorf <bittorf@bluebottle.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d377c5eb54 upstream.
When removing an opaque directory we can't just call rmdir() to check for
emptiness, because the directory will need to be replaced with a whiteout.
The replacement is done with RENAME_EXCHANGE, which doesn't check
emptiness.
Solution is just to check emptiness by reading the directory. In the
future we could add a new rename flag to check for emptiness even for
RENAME_EXCHANGE to optimize this case.
Reported-by: Vincent Batts <vbatts@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Jordi Pujol Palomer <jordipujolp@gmail.com>
Fixes: 263b4a0fee ("ovl: dont replace opaque dir")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 83a35114d0 upstream.
This bug has been there since day 1; addresses in the top guest physical
page weren't considered valid. You could map that page (the check in
check_gpte() is correct), but if a guest tried to put a pagetable there
we'd check that address manually when walking it, and kill the guest.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cddc116228 upstream.
It was missed when we converted everything in XFs to use negative error
numbers, so fix it now. Bug introduced in 3.17 by commit 2451337 ("xfs: global
error sign conversion"), and should go back to stable kernels.
Thanks to Brian Foster for noticing it.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6dfe5a049f upstream.
xfs_attr_inactive() is supposed to clean up the attribute fork when
the inode is being freed. While it removes attribute fork extents,
it completely ignores attributes in local format, which means that
there can still be active attributes on the inode after
xfs_attr_inactive() has run.
This leads to problems with concurrent inode writeback - the in-core
inode attribute fork is removed without locking on the assumption
that nothing will be attempting to access the attribute fork after a
call to xfs_attr_inactive() because it isn't supposed to exist on
disk any more.
To fix this, make xfs_attr_inactive() completely remove all traces
of the attribute fork from the inode, regardless of it's state.
Further, also remove the in-core attribute fork structure safely so
that there is nothing further that needs to be done by callers to
clean up the attribute fork. This means we can remove the in-core
and on-disk attribute forks atomically.
Also, on error simply remove the in-memory attribute fork. There's
nothing that can be done with it once we have failed to remove the
on-disk attribute fork, so we may as well just blow it away here
anyway.
Reported-by: Waiman Long <waiman.long@hp.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0345ee57d upstream.
The count variable is used to iterate down to (below) zero from the size
of the bitmap and handle the one-filling the remainder of the last
partial bitmap block. The loop conditional expects count to be signed
in order to detect when the final block is processed, after which count
goes negative.
Unfortunately, a recent change made this unsigned along with some other
related fields. The result of is this is that during mount,
omfs_get_imap will overrun the bitmap array and corrupt memory unless
number of blocks happens to be a multiple of 8 * blocksize.
Fix by changing count back to signed: it is guaranteed to fit in an s32
without overflow due to an enforced limit on the number of blocks in the
filesystem.
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dcbff39da3 upstream.
match_token() expects a NULL terminator at the end of the token list so
that it would know where to stop. Not having one causes it to overrun
to invalid memory.
In practice, passing a mount option that omfs didn't recognize would
sometimes panic the system.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f7bcb70eba upstream.
It was noted that the 32bit implementation of ktime_divns()
was doing unsigned division and didn't properly handle
negative values.
And when a ktime helper was changed to utilize
ktime_divns, it caused a regression on some IR blasters.
See the following bugzilla for details:
https://bugzilla.redhat.com/show_bug.cgi?id=1200353
This patch fixes the problem in ktime_divns by checking
and preserving the sign bit, and then reapplying it if
appropriate after the division, it also changes the return
type to a s64 to make it more obvious this is expected.
Nicolas also pointed out that negative dividers would
cause infinite loops on 32bit systems, negative dividers
is unlikely for users of this function, but out of caution
this patch adds checks for negative dividers for both
32-bit (BUG_ON) and 64-bit(WARN_ON) versions to make sure
no such use cases creep in.
[ tglx: Hand an u64 to do_div() to avoid the compiler warning ]
Fixes: 166afb6451 'ktime: Sanitize ktime_to_us/ms conversion'
Reported-and-tested-by: Trevor Cordes <trevor@tecnopolis.ca>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Link: http://lkml.kernel.org/r/1431118043-23452-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c447e76b4c upstream.
The MPX feature requires eager KVM FPU restore support. We have verified
that MPX cannot work correctly with the current lazy KVM FPU restore
mechanism. Eager KVM FPU restore should be enabled if the MPX feature is
exposed to VM.
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Liang Li <liang.z.li@intel.com>
[Also activate the FPU on AMD processors. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0be0226f07 upstream.
KVM may turn a user page to a kernel page when kernel writes a readonly
user page if CR0.WP = 1. This shadow page entry will be reused after
SMAP is enabled so that kernel is allowed to access this user page
Fix it by setting SMAP && !CR0.WP into shadow page's role and reset mmu
once CR4.SMAP is updated
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7cbeed9bce upstream.
Current permission check assumes that RSVD bit in PFEC is always zero,
however, it is not true since MMIO #PF will use it to quickly identify
MMIO access
Fix it by clearing the bit if walking guest page table is needed
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 898761158b upstream.
smep_andnot_wp is initialized in kvm_init_shadow_mmu and shadow pages
should not be reused for different values of it. Thus, it has to be
added to the mask in kvm_mmu_pte_write.
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e88221c50c upstream.
The kernel's handling of 'compacted' xsave state layout is buggy:
http://marc.info/?l=linux-kernel&m=142967852317199
I don't have such a system, and the description there is vague, but
from extrapolation I guess that there were two kinds of bugs
observed:
- boot crashes, due to size calculations being wrong and the dynamic
allocation allocating a too small xstate area. (This is now fixed
in the new FPU code - but still present in stable kernels.)
- FPU state corruption and ABI breakage: if signal handlers try to
change the FPU state in standard format, which then the kernel
tries to restore in the compacted format.
These breakages are scary, but they only occur on a small number of
systems that have XSAVES* CPU support. Yet we have had XSAVES support
in the upstream kernel for a large number of stable kernel releases,
and the fixes are involved and unproven.
So do the safe resolution first: disable XSAVES* support and only
use the standard xstate format. This makes the code work and is
easy to backport.
On top of this we can work on enabling (and testing!) proper
compacted format support, without backporting pressure, on top of the
new, cleaned up FPU code.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 17fea54bf0 upstream.
Derek noticed that a critical MCE gets reported with the wrong
error type description:
[Hardware Error]: CPU 34: Machine Check Exception: 5 Bank 9: f200003f000100b0
[Hardware Error]: RIP !INEXACT! 10:<ffffffff812e14c1> {intel_idle+0xb1/0x170}
[Hardware Error]: TSC 49587b8e321cb
[Hardware Error]: PROCESSOR 0:306e4 TIME 1431561296 SOCKET 1 APIC 29
[Hardware Error]: Some CPUs didn't answer in synchronization
[Hardware Error]: Machine check: Invalid
^^^^^^^
The last line with 'Invalid' should have printed the high level
MCE error type description we get from mce_severity, i.e.
something like:
[Hardware Error]: Machine check: Action required: data load error in a user process
this happens due to the fact that mce_no_way_out() iterates over
all MCA banks and possibly overwrites the @msg argument which is
used in the panic printing later.
Change behavior to take the message of only and the (last)
critical MCE it detects.
Reported-by: Derek <denc716@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1431936437-25286-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5dc5616ee8 upstream.
Stage 1 translation is controlled by two sets of page tables (TTBR0 and
TTBR1) which grow up and down from zero respectively in the ARMv8
translation regime. For the SMMU, we only care about TTBR0 and, in the
case of a 48-bit virtual space, we expect to map virtual addresses 0x0
through to 0xffff_ffff_ffff.
Given that some masters may be incapable of emitting virtual addresses
targetting TTBR1 (e.g. because they sit on a 48-bit bus), the SMMU
architecture allows bit 47 to be sign-extended, halving the virtual
range of TTBR0 but allowing TTBR1 to be used. This is controlled by the
SEP field in TTBCR2.
The SMMU driver incorrectly enables this sign-extension feature, which
causes problems when userspace addresses are programmed into a master
device with the SMMU expecting to map the incoming transactions via
TTBR0; if the top bit of address is set, we will instead get a
translation fault since TTBR1 walks are disabled in the TTBCR.
This patch fixes the issue by disabling sign-extension of a fixed
virtual address bit and instead basing the behaviour on the upstream bus
size: the incoming address is zero extended unless the upstream bus is
only 49 bits wide, in which case bit 48 is used as the sign bit and is
replicated to the upper bits.
Reported-by: Varun Sethi <varun.sethi@freescale.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1bf1b431d9 upstream.
This patch fixes a bug in put_pasid_state_wait that appeared in kernel 4.0
The bug is that pasid_state->count wasn't decremented before entering the
wait_event. Thus, the condition in wait_event will never be true.
The fix is to decrement (atomically) the pasid_state->count before the
wait_event.
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 820f9f147d upstream.
This is needed to support lazily umounting locked mounts. Because the
entire unmounted subtree needs to stay together until there are no
users with references to any part of the subtree.
To support this guarantee that the fs_pin m_list and s_list nodes
are initialized by initializing them in init_fs_pin allowing
for the possibility that pin_insert_group does not touch them.
Further use hlist_del_init in pin_remove so that there is
a hlist_unhashed test before the list we attempt to update
the previous list item.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd4a40174b upstream.
The only users of collect_mounts are in audit_tree.c
In audit_trim_trees and audit_add_tree_rule the path passed into
collect_mounts is generated from kern_path passed an audit_tree
pathname which is guaranteed to be an absolute path. In those cases
collect_mounts is obviously intended to work on mounted paths and
if a race results in paths that are unmounted when collect_mounts
it is reasonable to fail early.
The paths passed into audit_tag_tree don't have the absolute path
check. But are used to play with fsnotify and otherwise interact with
the audit_trees, so again operating only on mounted paths appears
reasonable.
Avoid having to worry about what happens when we try and audit
unmounted filesystems by restricting collect_mounts to mounts
that appear in the mount tree.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d0a0b2f6d upstream.
ACPICA commit b60612373a4ef63b64a57c124576d7ddb6d8efb6
For physical addresses, since the address may exceed 32-bit address range
after calculation, we should use 0x%8.8X%8.8X instead of ACPI_PRINTF_UINT
and ACPI_FORMAT_UINT64() instead of
ACPI_FORMAT_NATIVE_UINT()/ACPI_FORMAT_TO_UINT().
This patch also removes above replaced macros as there are no users.
This is a preparation to switch acpi_physical_address to 64-bit on 32-bit
kernel builds.
Link: https://github.com/acpica/acpica/commit/b6061237
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dirk Behme <dirk.behme@gmail.com>
Signed-off-by: George G. Davis <george_davis@mentor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6d3fd3cc33 upstream.
ACPICA commit 154f6d074dd38d6ebc0467ad454454e6c5c9ecdf
There are code pieces converting pointers using "(acpi_physical_address) x"
or "ACPI_CAST_PTR (t, x)" formats, this patch cleans up them.
Known issues:
1. Cleanup of "(ACPI_PHYSICAL_ADDRRESS) x" for a table field
For the conversions around the table fields, it is better to fix it with
alignment also fixed. So this patch doesn't modify such code. There
should be no functional problem by leaving them unchanged.
Link: https://github.com/acpica/acpica/commit/154f6d07
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dirk Behme <dirk.behme@gmail.com>
Signed-off-by: George G. Davis <george_davis@mentor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f254e3c57b upstream.
ACPICA commit 7d9fd64397d7c38899d3dc497525f6e6b044e0e3
OSPMs like Linux expect an acpi_physical_address returning value from
acpi_find_root_pointer(). This triggers warnings if sizeof (acpi_size) doesn't
equal to sizeof (acpi_physical_address):
drivers/acpi/osl.c:275:3: warning: passing argument 1 of 'acpi_find_root_pointer' from incompatible pointer type [enabled by default]
In file included from include/acpi/acpi.h:64:0,
from include/linux/acpi.h:36,
from drivers/acpi/osl.c:41:
include/acpi/acpixf.h:433:1: note: expected 'acpi_size *' but argument is of type 'acpi_physical_address *'
This patch corrects acpi_find_root_pointer().
Link: https://github.com/acpica/acpica/commit/7d9fd643
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dirk Behme <dirk.behme@gmail.com>
Signed-off-by: George G. Davis <george_davis@mentor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc26d4d06e upstream.
A deadlock can be initiated by userspace via ioctl(SNDCTL_SEQ_OUTOFBAND)
on /dev/sequencer with TMR_ECHO midi event.
In this case the control flow is:
sound_ioctl()
-> case SND_DEV_SEQ:
case SND_DEV_SEQ2:
sequencer_ioctl()
-> case SNDCTL_SEQ_OUTOFBAND:
spin_lock_irqsave(&lock,flags);
play_event();
-> case EV_TIMING:
seq_timing_event()
-> case TMR_ECHO:
seq_copy_to_input()
-> spin_lock_irqsave(&lock,flags);
It seems that spin_lock_irqsave() around play_event() is not necessary,
because the only other call location in seq_startplay() makes the call
without acquiring spinlock.
So, the patch just removes spinlocks around play_event().
By the way, it removes unreachable code in seq_timing_event(),
since (seq_mode == SEQ_2) case is handled in the beginning.
Compile tested only.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c097877319 upstream.
arm64 builds with GCC 5 have caused the __asmeq assertions in the PSCI
calling code to fire, so move the ARM PSCI calls out of line into their
own assembly file for consistency and to safeguard against the same
issue occuring with the 32-bit toolchain.
[will: brought into line with arm64 implementation]
Reported-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bad4371d87 upstream.
f9fd54f22e ("mmc: sh_mmcif: Use msecs_to_jiffies() for host->timeout")
changed the timeout value from 1000 jiffies to 1s. In the case where
HZ is 1000 the values are the same. However, for smaller HZ values the
timeout is now smaller, 1s instead of 10s in the case of HZ=100.
Since the timeout occurs in spite of a normal data transfer a timeout of
10s seems more appropriate. This restores the previous timeout in the
case where HZ=100 and results in an increase over the previous timeout
for larger values of HZ.
Fixes: f9fd54f22e ("mmc: sh_mmcif: Use msecs_to_jiffies() for host->timeout")
Signed-off-by: Takeshi Kihara <takeshi.kihara.df@renesas.com>
[horms: rewrote changelog to refer to HZ]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 184af16b09 upstream.
The PM_RESTORE_PREPARE is not handled now in mmc_pm_notify(),
as result mmc_rescan() could be scheduled and executed at
late hibernation restore stages when MMC device is suspended
already - which, in turn, will lead to system crash on TI dra7-evm board:
WARNING: CPU: 0 PID: 3188 at drivers/bus/omap_l3_noc.c:148 l3_interrupt_handler+0x258/0x374()
44000000.ocp:L3 Custom Error: MASTER MPU TARGET L4_PER1_P3 (Idle): Data Access in User mode during Functional access
Hence, add missed PM_RESTORE_PREPARE PM event in mmc_pm_notify().
Fixes: 4c2ef25fe0 (mmc: fix all hangs related to mmc/sd card...)
Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e93b9a6ab upstream.
During kernel boot, it will try to read some logical sectors
of each block device node for the possible partition table.
But since RPMB partition is special and can not be accessed
by normal eMMC read / write CMDs, it will cause below error
messages during kernel boot:
...
mmc0: Got data interrupt 0x00000002 even though no data operation was in progress.
mmcblk0rpmb: error -110 transferring data, sector 0, nr 32, cmd response 0x900, card status 0xb00
mmcblk0rpmb: retrying using single block read
mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
mmcblk0rpmb: timed out sending r/w cmd command, card status 0x400900
end_request: I/O error, dev mmcblk0rpmb, sector 0
Buffer I/O error on device mmcblk0rpmb, logical block 0
end_request: I/O error, dev mmcblk0rpmb, sector 8
Buffer I/O error on device mmcblk0rpmb, logical block 1
end_request: I/O error, dev mmcblk0rpmb, sector 16
Buffer I/O error on device mmcblk0rpmb, logical block 2
end_request: I/O error, dev mmcblk0rpmb, sector 24
Buffer I/O error on device mmcblk0rpmb, logical block 3
...
This patch will discard the access request in eMMC queue if
it is RPMB partition access request. By this way, it avoids
trigger above error messages.
Fixes: 090d25fe22 ("mmc: core: Expose access to RPMB partition")
Signed-off-by: Yunpeng Gao <yunpeng.gao@intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
Tested-by: Michael Shigorin <mike@altlinux.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c5272a2856 upstream.
Way back, when the world was a simpler place and there was no war, no
evil, and no kernel bugs, there was just a single pinctrl lock. That
was how the world was when (57291ce pinctrl: core device tree mapping
table parsing support) was written. In that case, there were
instances where the pinctrl mutex was already held when
pinctrl_register_map() was called, hence a "locked" parameter was
passed to the function to indicate that the mutex was already locked
(so we shouldn't lock it again).
A few years ago in (42fed7b pinctrl: move subsystem mutex to
pinctrl_dev struct), we switched to a separate pinctrl_maps_mutex.
...but (oops) we forgot to re-think about the whole "locked" parameter
for pinctrl_register_map(). Basically the "locked" parameter appears
to still refer to whether the bigger pinctrl_dev mutex is locked, but
we're using it to skip locks of our (now separate) pinctrl_maps_mutex.
That's kind of a bad thing(TM). Probably nobody noticed because most
of the calls to pinctrl_register_map happen at boot time and we've got
synchronous device probing. ...and even cases where we're
asynchronous don't end up actually hitting the race too often. ...but
after banging my head against the wall for a bug that reproduced 1 out
of 1000 reboots and lots of looking through kgdb, I finally noticed
this.
Anyway, we can now safely remove the "locked" parameter and go back to
a war-free, evil-free, and kernel-bug-free world.
Fixes: 42fed7ba44 ("pinctrl: move subsystem mutex to pinctrl_dev struct")
Signed-off-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79b066bd76 upstream.
This patch fixes a bug where sdma vm wasn't initialized when
an sdma queue was created in HWS mode.
This caused GPUVM faults to appear on dmesg and it is one of the
causes that SDMA queues are not working.
Signed-off-by: Xihan Zhang <xihan.zhang@amd.com>
Reviewed-by: Ben Goz <ben.goz@amd.comt>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e5ec956a0 upstream.
Sometimes we might unregister process that have queues, because we couldn't
preempt the queues. Until now we blocked it with BUG_ON but instead just
print it as debug.
Reviewed-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9fcb1704d1 upstream.
The eDP port A register on PCH split platforms has a slightly different
register layout from the other ports, with bit 6 being either alternate
scrambler reset or reserved, depending on the generation. Our
misinterpretation of the bit as audio has lead to warning.
Fix this by not enabling audio on port A, since none of our platforms
support audio on port A anyway.
v2: DDI doesn't have audio on port A either (Sivakumar Thulasimani)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89958
Reported-and-tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3916e3fd81 upstream.
Single channel LVDS maxes out at 112 MHz. The 15" pre-retina models
shipped with 1440x900 (106 MHz) by default or 1680x1050 (119 MHz)
as a BTO option, both versions used dual channel LVDS even though
the smaller one would have fit into a single channel.
Notes:
Bug report showing that the MacBookPro8,2 with 1440x900 uses dual
channel LVDS (this lead to it being hardcoded in intel_lvds.c by
Daniel Vetter with commit 618563e394):
https://bugzilla.kernel.org/show_bug.cgi?id=42842
If i915.lvds_channel_mode=2 is missing even though the machine needs
it, every other vertical line is white and consequently, only the left
half of the screen is visible (verified by myself on a MacBookPro9,1).
Forum posting concerning a MacBookPro6,2 with 1440x900, author is
using i915.lvds_channel_mode=2 on the kernel command line, proving
that the machine uses dual channels:
https://bbs.archlinux.org/viewtopic.php?id=185770
Chi Mei N154C6-L04 with 1440x900 is a replacement panel for all
MacBook Pro "A1286" models, and that model number encompasses the
MacBookPro6,2 / 8,2 / 9,1. Page 17 of the panel's datasheet shows it's
driven with dual channel LVDS:
http://www.ebay.com/itm/-/400690878560http://www.everymac.com/ultimate-mac-lookup/?search_keywords=A1286http://www.taopanel.com/chimei/datasheet/N154C6-L04.pdf
Those three 15" models, MacBookPro6,2 / 8,2 / 9,1, are the only ones
with i915 graphics and dual channel LVDS, so that list should be
complete. And the 8,2 is already in intel_lvds.c.
Possible motivation to use dual channel LVDS even on the 1440x900
models: Reduce the number of different parts, i.e. use identical logic
boards and display cabling on both versions and the only differing
component is the panel.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Jani Nikula <jani.nikula@intel.com>
[Jani: included notes in the commit message for posterity]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f317cfe42 upstream.
Single channel LVDS maxes out at 112 MHz, anything above must be dual
channel. This avoids the need to specify i915.lvds_channel_mode=2 on
all 17" MacBook Pro models with i915 graphics since they had 1920x1200
(193 MHz), plus those 15" pre-retina models which had a resolution
of 1680x1050 (119 MHz) as a BTO option.
Source for 112 MHz limit of single channel LVDS is section 2.3 of:
https://01.org/linuxgraphics/sites/default/files/documentation/ivb_ihd_os_vol3_part4.pdf
v2: Avoid hardcoding 17" models by assuming dual channel LVDS if the
resolution necessitates it, suggested by Jani Nikula.
v3: Fix typo, thanks Joonas Lahtinen.
v4: Split commit in two, suggested by Ville Syrjälä.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[Jani: included spec reference into the commit message]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fdb68e09bb upstream.
Since commit 844b03f277 we make
sure that after vblank irq off, we return the last valid
(vblank count, vblank timestamp) pair to clients, e.g., during
modesets, which is good.
An overlooked side effect of that commit for kms drivers without
support for precise vblank timestamping is that at vblank irq
enable, when we update the vblank counter from the hw counter, we
can't update the corresponding vblank timestamp, so now we have a
totally mismatched timestamp for the new count to confuse clients.
Restore old client visible behaviour from before Linux 3.17, but
zero out the timestamp at vblank counter update (instead of disable
as in original implementation) if we can't generate a meaningful
timestamp immediately for the new vblank counter. This will fix
this regression, so callers know they need to retry again later
if they need a valid timestamp, but at the same time preserves
the improvements made in the commit mentioned above.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: <stable@vger.kernel.org> #v3.17+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
commit 53d2669844 upstream.
The GPIO regulator for the SD-card isn't a ux500 SOC configuration, but
instead it's specific to the board. Move the definition of it, into the
board DTSs.
Fixes: c94a4ab7af ("ARM: ux500: Disable the MMCI gpio-regulator by default")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 19fc99d0c6 upstream.
In that case, emit_udiv() will be called with rn == ARM_R0 (r_scratch)
and loading rm first into ARM_R0 will result in jit_udiv() function
being called the same dividend and divisor. Fix that by loading rn
first into ARM_R1 and then rm into ARM_R0.
Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Fixes: aee636c480 (bpf: do not use reciprocal divide)
Acked-by: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 102bcb6ed2 upstream.
If we use a combination of VMODE and I2C4 for retention modes,
eventually the off idle power consumption will creep up by about
23mW, even during off mode with I2C4 always staying enabled.
Turns out this is because of erratum i531 "Extra Power Consumed
When Repeated Start Operation Mode Is Enabled on I2C Interface
Dedicated for Smart Reflex (I2C4)" as pointed out by Nishanth
Menon <nm@ti.com>.
Let's fix the issue by adding i2c_cfg_clear_mask for the bits
to clear when initializing the I2C4 adapter so we can clear
SREN bit that drives the I2C4 lines low otherwise when there
is no traffic.
Fixes: 3b8c4ebb76 ("ARM: OMAP3: Fix idle mode signaling for
sys_clkreq and sys_off_mode")
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Tero Kristo <t-kristo@ti.com>
Reviewed-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 750e30d407 upstream.
There is no crystal connected to the internal RTC on the Open Block
AX3. So let's disable it in order to prevent the kernel probing the
driver uselessly. Eventually this patches removes the following
warning message from the boot log:
"rtc-mv d0010300.rtc: internal RTC not ticking"
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0fdebe1a2f upstream.
The dr_mode of usb0 on imx233-olinuxino is left to default "otg".
Since the green LED (GPIO2_1) on imx233-olinuxino is connected to the
same pin as USB_OTG_ID it's possible to disable USB host by LED toggling:
echo 0 > /sys/class/leds/green/brightness
[ 1068.890000] ci_hdrc ci_hdrc.0: remove, state 1
[ 1068.890000] usb usb1: USB disconnect, device number 1
[ 1068.920000] usb 1-1: USB disconnect, device number 2
[ 1068.920000] usb 1-1.1: USB disconnect, device number 3
[ 1069.070000] usb 1-1.2: USB disconnect, device number 4
[ 1069.450000] ci_hdrc ci_hdrc.0: USB bus 1 deregistered
[ 1074.460000] ci_hdrc ci_hdrc.0: timeout waiting for 00000800 in 11
This patch fixes the issue by setting dr_mode to "host" in the dts file.
Reported-by: Harald Geyer <harald@ccbib.org>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Acked-by: Peter Chen <peter.chen@freescale.com>
Fixes: b493129482 ("ARM: dts: imx23-olinuxino: Add USB host support")
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4ada77e37a upstream.
Fix a typo in the TX DMA interrupt name for AUART4.
This patch makes AUART4 operational again.
Signed-off-by: Marek Vasut <marex@denx.de>
Fixes: f30fb03d4d ("ARM: dts: add generic DMA device tree binding for mxs-dma")
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1819e3034e upstream.
N900 audio recording needs that codec provides bias voltage for integrated
digital microphone and headset microphone depending which one is used.
Digital microphone uses 2 V bias and it comes from the codec A part. Codec
B part drives the headset microphone bias and that is set to 2.5 V.
Signed-off-by: Pavel Machek <pavel@ucw.cz>
[Jarkko: Headset mic bias changed to 2 (2.5 V) as it was before commit
e2e8bfdf61 ("ASoC: tlv320aic3x: Convert mic bias to a supply widget")]
Signed-off-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a34c0872a upstream.
hctx->tags has to be set as NULL in case that it is to be unmapped
no matter if set->tags[hctx->queue_num] is NULL or not in blk_mq_map_swqueue()
because shared tags can be freed already from another request queue.
The same situation has to be considered during handling CPU online too.
Unmapped hw queue can be remapped after CPU topo is changed, so we need
to allocate tags for the hw queue in blk_mq_map_swqueue(). Then tags
allocation for hw queue can be removed in hctx cpu online notifier, and it
is reasonable to do that after mapping is updated.
Reported-by: Dongsu Park <dongsu.park@profitbricks.com>
Tested-by: Dongsu Park <dongsu.park@profitbricks.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f054b56c95 upstream.
Firstly during CPU hotplug, even queue is freezed, timeout
handler still may come and access hctx->tags, which may cause
use after free, so this patch deactivates timeout handler
inside CPU hotplug notifier.
Secondly, tags can be shared by more than one queues, so we
have to check if the hctx has been unmapped, otherwise
still use-after-free on tags can be triggered.
Reported-by: Dongsu Park <dongsu.park@profitbricks.com>
Tested-by: Dongsu Park <dongsu.park@profitbricks.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6cd18e711d upstream.
Because of the peculiar way that md devices are created (automatically
when the device node is opened), a new device can be created and
registered immediately after the
blk_unregister_region(disk_devt(disk), disk->minors);
call in del_gendisk().
Therefore it is important that all visible artifacts of the previous
device are removed before this call. In particular, the 'bdi'.
Since:
commit c4db59d31e
Author: Christoph Hellwig <hch@lst.de>
fs: don't reassign dirty inodes to default_backing_dev_info
moved the
device_unregister(bdi->dev);
call from bdi_unregister() to bdi_destroy() it has been quite easy to
lose a race and have a new (e.g.) "md127" be created after the
blk_unregister_region() call and before bdi_destroy() is ultimately
called by the final 'put_disk', which must come after del_gendisk().
The new device finds that the bdi name is already registered in sysfs
and complains
> [ 9627.630029] WARNING: CPU: 18 PID: 3330 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x5a/0x70()
> [ 9627.630032] sysfs: cannot create duplicate filename '/devices/virtual/bdi/9:127'
We can fix this by moving the bdi_destroy() call out of
blk_release_queue() (which can happen very late when a refcount
reaches zero) and into blk_cleanup_queue() - which happens exactly when the md
device driver calls it.
Then it is only necessary for md to call blk_cleanup_queue() before
del_gendisk(). As loop.c devices are also created on demand by
opening the device node, we make the same change there.
Fixes: c4db59d31e
Reported-by: Azat Khuzhin <a3at.mail@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0403ec0bb upstream.
This reverts Linux 4.1-rc1 commit 0618764cb2.
The problem which that commit attempts to fix actually lies in the
Freescale CAAM crypto driver not dm-crypt.
dm-crypt uses CRYPTO_TFM_REQ_MAY_BACKLOG. This means the the crypto
driver should internally backlog requests which arrive when the queue is
full and process them later. Until the crypto hw's queue becomes full,
the driver returns -EINPROGRESS. When the crypto hw's queue if full,
the driver returns -EBUSY, and if CRYPTO_TFM_REQ_MAY_BACKLOG is set, is
expected to backlog the request and process it when the hardware has
queue space. At the point when the driver takes the request from the
backlog and starts processing it, it calls the completion function with
a status of -EINPROGRESS. The completion function is called (for a
second time, in the case of backlogged requests) with a status/err of 0
when a request is done.
Crypto drivers for hardware without hardware queueing use the helpers,
crypto_init_queue(), crypto_enqueue_request(), crypto_dequeue_request()
and crypto_get_backlog() helpers to implement this behaviour correctly,
while others implement this behaviour without these helpers (ccp, for
example).
dm-crypt (before the patch that needs reverting) uses this API
correctly. It queues up as many requests as the hw queues will allow
(i.e. as long as it gets back -EINPROGRESS from the request function).
Then, when it sees at least one backlogged request (gets -EBUSY), it
waits till that backlogged request is handled (completion gets called
with -EINPROGRESS), and then continues. The references to
af_alg_wait_for_completion() and af_alg_complete() in that commit's
commit message are irrelevant because those functions only handle one
request at a time, unlink dm-crypt.
The problem is that the Freescale CAAM driver, which that commit
describes as having being tested with, fails to implement the
backlogging behaviour correctly. In cam_jr_enqueue(), if the hardware
queue is full, it simply returns -EBUSY without backlogging the request.
What the observed deadlock was is not described in the commit message
but it is obviously the wait_for_completion() in crypto_convert() where
dm-crypto would wait for the completion being called with -EINPROGRESS
in the case of backlogged requests. This completion will never be
completed due to the bug in the CAAM driver.
Commit 0618764cb2 incorrectly made dm-crypt wait for every request,
even when the driver/hardware queues are not full, which means that
dm-crypt will never see -EBUSY. This means that that commit will cause
a performance regression on all crypto drivers which implement the API
correctly.
Revert it. Correct backlog handling should be implemented in the CAAM
driver instead.
Cc'ing stable purely because commit 0618764cb2 did. If for some reason
a stable@ kernel did pick up commit 0618764cb2 it should get reverted.
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Reviewed-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8014bcc86e upstream.
The variable for the 'permissive' module parameter used to be static
but was recently changed to be extern. This puts it in the kernel
global namespace if the driver is built-in, so its name should begin
with a prefix identifying the driver.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: af6fc858a3 ("xen-pciback: limit guest control of command register")
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5cec988349 upstream.
When a guest is resumed, the hypervisor may change event channel
assignments. If this happens and the guest uses 2-level events it
is possible for the interrupt to be claimed by wrong VCPU since
cpu_evtchn_mask bits may be stale. This can happen even though
evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
is passed in is not necessarily the original one (from pre-migration
times) but instead is freshly allocated during resume and so any
information about which CPU the channel was bound to is lost.
Thus we should clear the mask during resume.
We also need to make sure that bits for xenstore and console channels
are set when these two subsystems are resumed. While rebind_evtchn_irq()
(which is invoked for both of them on a resume) calls irq_set_affinity(),
the latter will in fact postpone setting affinity until handling the
interrupt. But because cpu_evtchn_mask will have bits for these two
cleared we won't be able to take the interrupt.
With that in mind, we need to bind those two channels explicitly in
rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
pass through generic irq affinity code later, in case something needs
to be updated there as well.
(Also replace cpumask_of(0) with cpumask_of(info->cpu) in
rebind_evtchn_irq(): it should be set to zero in preceding
xen_irq_info_evtchn_setup().)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Annie Li <annie.li@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db7d4d7f40 upstream.
Commit 13060b64b8 ("vfio: Add and use device request op for vfio
bus drivers") incorrectly makes use of an interruptible timeout.
When interrupted, the signal remains pending resulting in subsequent
timeouts occurring instantly. This makes the loop spin at a much
higher rate than intended.
Instead of making this completely non-interruptible, we can change
this into a sort of interruptible-once behavior and use the "once"
to log debug information. The driver API doesn't allow us to abort
and return an error code.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: 13060b64b8
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a700d8edf upstream.
Various formats had their byte ordering implemented incorrectly, and
the V4L2_PIX_FMT_UYVY is actually impossible to create, instead you
get V4L2_PIX_FMT_YVYU.
This was working before commit ad6ac45222
("add new formats support for marvell-ccic driver"). That commit broke
the original format support and the OLPC XO-1 laptop showed wrong
colors ever since (if you are crazy enough to attempt to run the latest
kernel on it, like I did).
The email addresses of the authors of that patch are no longer valid,
so without a way to reach them and ask them about their test setup
I am going with what I can test on the OLPC laptop.
If this breaks something for someone on their non-OLPC setup, then
contact the linux-media mailinglist. My suspicion however is that
that commit went in untested.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 602498f9aa upstream.
If multiple soft offline events hit one free page/hugepage concurrently,
soft_offline_page() can handle the free page/hugepage multiple times,
which makes num_poisoned_pages counter increased more than once. This
patch fixes this wrong counting by checking TestSetPageHWPoison for normal
papes and by checking the return value of dequeue_hwpoisoned_huge_page()
for hugepages.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Dean Nelson <dnelson@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 464d1387ac upstream.
mm/page-writeback.c has several places where 1 is added to the divisor
to prevent division by zero exceptions; however, if the original
divisor is equivalent to -1, adding 1 leads to division by zero.
There are three places where +1 is used for this purpose - one in
pos_ratio_polynom() and two in bdi_position_ratio(). The second one
in bdi_position_ratio() actually triggered div-by-zero oops on a
machine running a 3.10 kernel. The divisor is
x_intercept - bdi_setpoint + 1 == span + 1
span is confirmed to be (u32)-1. It isn't clear how it ended up that
but it could be from write bandwidth calculation underflow fixed by
c72efb658f ("writeback: fix possible underflow in write bandwidth
calculation").
At any rate, +1 isn't a proper protection against div-by-zero. This
patch converts all +1 protections to |1. Note that
bdi_update_dirty_ratelimit() was already using |1 before this patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f15133df08 upstream.
path_openat() jumps to the wrong place after do_tmpfile() - it has
already done path_cleanup() (as part of path_lookupat() called by
do_tmpfile()), so doing that again can lead to double fput().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09789e5de1 upstream.
Currently memory_failure() calls shake_page() to sweep pages out from
pcplists only when the victim page is 4kB LRU page or thp head page.
But we should do this for a thp tail page too.
Consider that a memory error hits a thp tail page whose head page is on
a pcplist when memory_failure() runs. Then, the current kernel skips
shake_pages() part, so hwpoison_user_mappings() returns without calling
split_huge_page() nor try_to_unmap() because PageLRU of the thp head is
still cleared due to the skip of shake_page().
As a result, me_huge_page() runs for the thp, which is broken behavior.
One effect is a leak of the thp. And another is to fail to isolate the
memory error, so later access to the error address causes another MCE,
which kills the processes which used the thp.
This patch fixes this problem by calling shake_page() for thp tail case.
Fixes: 385de35722 ("thp: allow a hwpoisoned head page to be put back to LRU")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Dean Nelson <dnelson@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7e96c1b0e0 upstream.
This fixes a dumb bug in fs_fully_visible that allows proc or sys to
be mounted if there is a bind mount of part of /proc/ or /sys/ visible.
Reported-by: Eric Windisch <ewindisch@docker.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 483d821108 upstream.
Unregister GPIOs requested through sysfs at chip remove to avoid leaking
the associated memory and sysfs entries.
The stale sysfs entries prevented the gpio numbers from being exported
when the gpio range was later reused (e.g. at device reconnect).
This also fixes the related module-reference leak.
Note that kernfs makes sure that any on-going sysfs operations finish
before the class devices are unregistered and that further accesses
fail.
The chip exported flag is used to prevent gpiod exports during removal.
This also makes it harder to trigger, but does not fix, the related race
between gpiochip_remove and export_store, which is really a race with
gpiod_request that needs to be addressed separately.
Also note that this would prevent the crashes (e.g. NULL-dereferences)
at reconnect that affects pre-3.18 kernels, as well as use-after-free on
operations on open attribute files on pre-3.14 kernels (prior to
kernfs).
Fixes: d8f388d8dc ("gpio: sysfs interface")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 285214409a upstream.
When accepting a new IPv4 connect to an IPv6 socket, the CMA tries to
canonize the address family to IPv4, but does not properly process
the listening sockaddr to get the listening port, and does not properly
set the address family of the canonized sockaddr.
Fixes: e51060f08a ("IB: IP address based RDMA connection manager")
Reported-By: Yotam Kenneth <yotamke@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d8fd150fe3 upstream.
The range check for b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though the level is limited to values in the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).
Since the level parameter is read from storage device and used to index
nilfs_btree_path array whose element count is NILFS_BTREE_LEVEL_MAX, it
can cause memory overrun during btree operations if the boundary value
is set to the level parameter on device.
This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 489405fe5e upstream.
While setting the time, the RTC TIME register should not be accessed.
However due to hardware constraints, setting the RTC time involves
sleeping during 100ms. This sleep was done outside the critical section
protected by the spinlock, so it was possible to read the RTC TIME
register and get an incorrect value. This patch introduces a mutex for
protecting the RTC TIME access, unlike the spinlock it is allowed to
sleep in a critical section protected by a mutex.
The RTC STATUS register can still be used from the interrupt handler but
it has no effect on setting the time.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b1432a2a35 upstream.
There is a race window in dlm_get_lock_resource(), which may return a
lock resource which has been purged. This will cause the process to
hang forever in dlmlock() as the ast msg can't be handled due to its
lock resource not existing.
dlm_get_lock_resource {
...
spin_lock(&dlm->spinlock);
tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
if (tmpres) {
spin_unlock(&dlm->spinlock);
>>>>>>>> race window, dlm_run_purge_list() may run and purge
the lock resource
spin_lock(&tmpres->spinlock);
...
spin_unlock(&tmpres->spinlock);
}
}
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 622532bb2f upstream.
Commit eec15edbb0 (ACPI / PNP: use device ID list for PNPACPI device
enumeration) changed the way how ACPI devices are enumerated and when
they are added to the PNP bus.
However, it broke the sound card support on (at least) a vintage
IBM ThinkPad 600E: with said commit applied, two of the necessary
"CSC01xx" devices are not added to the PNP bus and hence can not be
found during the initialization of the "snd-cs4236" module. As a
consequence, loading "snd-cs4236" causes null pointer exceptions.
The attached patch fixes the problem end re-enables sound on the
IBM ThinkPad 600E.
Fixes: eec15edbb0 (ACPI / PNP: use device ID list for PNPACPI device enumeration)
Signed-off-by: Witold Szczeponik <Witold.Szczeponik@gmx.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c62e8492e upstream.
An IO port or MMIO resource assigned to a PCI host bridge may be
consumed by the host bridge itself or available to its child
bus/devices. The ACPI specification defines a bit (Producer/Consumer)
to tell whether the resource is consumed by the host bridge itself,
but firmware hasn't used that bit consistently, so we can't rely on it.
Before commit 593669c2ac ("x86/PCI/ACPI: Use common ACPI resource
interfaces to simplify implementation"), arch/x86/pci/acpi.c ignored
all IO port resources defined by acpi_resource_io and
acpi_resource_fixed_io to filter out IO ports consumed by the host
bridge itself.
Commit 593669c2ac ("x86/PCI/ACPI: Use common ACPI resource interfaces
to simplify implementation") started accepting all IO port and MMIO
resources, which caused a regression that IO port resources consumed
by the host bridge itself became available to its child devices.
Then commit 63f1789ec7 ("x86/PCI/ACPI: Ignore resources consumed by
host bridge itself") ignored resources consumed by the host bridge
itself by checking the IORESOURCE_WINDOW flag, which accidently removed
MMIO resources defined by acpi_resource_memory24, acpi_resource_memory32
and acpi_resource_fixed_memory32.
On x86 and IA64 platforms, all IO port and MMIO resources are assumed
to be available to child bus/devices except one special case:
IO port [0xCF8-0xCFF] is consumed by the host bridge itself
to access PCI configuration space.
So explicitly filter out PCI CFG IO ports[0xCF8-0xCFF]. This solution
will also ease the way to consolidate ACPI PCI host bridge common code
from x86, ia64 and ARM64.
Related ACPI table are archived at:
https://bugzilla.kernel.org/show_bug.cgi?id=94221
Related discussions at:
http://patchwork.ozlabs.org/patch/461633/https://lkml.org/lkml/2015/3/29/304
Fixes: 63f1789ec7 (Ignore resources consumed by host bridge itself)
Reported-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3349fb64b2 upstream.
Commit 7bc5a2bad0 'ACPI: Support _OSI("Darwin") correctly' caused
the MacBook firmware to expose the SBS, resulting in intermittent
hangs of several minutes on boot, and failure to detect or report
the battery. Fix this by adding a 5 us delay to the start of each
SMBUS transaction. This timing is the result of experimentation -
hangs were observed with 3 us but never with 5 us.
Fixes: 7bc5a2bad0 'ACPI: Support _OSI("Darwin") correctly'
Link: https://bugzilla.kernel.org/show_bug.cgi?id=94651
Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Cc: 3.18+ <stable@vger.kernel.org> # 3.18+
[ rjw: Subject and changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db579e76f0 upstream.
On Mac OS X, HFS+ extended attributes are not namespaced. Since we want
to be compatible with OS X filesystems and yet still support the Linux
namespacing system, the hfsplus driver implements a special "osx"
namespace that is reported for any attribute that is not namespaced
on-disk. However, the current code for getting and setting these
unprefixed attributes is broken.
hfsplus_osx_setattr() and hfsplus_osx_getattr() are passed names that have
already had their "osx." prefixes stripped by the generic functions. The
functions first, quite correctly, check those names to make sure that they
aren't prefixed with a known namespace, which would allow namespace access
restrictions to be bypassed. However, the functions then prepend "osx."
to the name they're given before passing it on to hfsplus_getattr() and
hfsplus_setattr(). Not only does this cause the "osx." prefix to be
stored on-disk, defeating its purpose, it also breaks the check for the
special "com.apple.FinderInfo" attribute, which is reported for all files,
and as a consequence makes some userspace applications (e.g. GNU patch)
fail even when extended attributes are not otherwise in use.
There are five commits which have touched this particular code:
127e5f5ae5 ("hfsplus: rework functionality of getting, setting and deleting of extended attributes")
b168fff721 ("hfsplus: use xattr handlers for removexattr")
bf29e886b2 ("hfsplus: correct usage of HFSPLUS_ATTR_MAX_STRLEN for non-English attributes")
fcacbd95e121 ("fs/hfsplus: move xattr_name allocation in hfsplus_getxattr()")
ec1bbd346f18 ("fs/hfsplus: move xattr_name allocation in hfsplus_setxattr()")
The first commit creates the functions to begin with. The namespace is
prepended by the original code, which I believe was correct at the time,
since hfsplus_?etattr() stripped the prefix if found. The second commit
removes this behavior from hfsplus_?etattr() and appears to have been
intended to also remove the prefixing from hfsplus_osx_?etattr().
However, what it actually does is remove a necessary strncpy() call
completely, breaking the osx namespace entirely. The third commit re-adds
the strncpy() call as it was originally, but doesn't mention it in its
commit message. The final two commits refactor the code and don't affect
its functionality.
This commit does what b168fff attempted to do (prevent the prefix from
being added), but does it properly, instead of passing in an empty buffer
(which is what b168fff actually did).
Fixes: b168fff721 ("hfsplus: use xattr handlers for removexattr")
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Sergei Antonov <saproj@gmail.com>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26d4d129b6 upstream.
If we unmap BOs before releasing them them the intervall tree locks
up because we try to remove an entry not inside the tree.
Based on a patch from Michel Dänzer.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f55db36d4 upstream.
Otherwise the driver may try and send audio which may confuse the
monitor.
v2: set pin to NULL if no audio
v3: avoid crash with analog encoders
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12428327bb upstream.
It's mostly duplicated with evergreen_dp_enable. This
is a prerequisite for fix implemented in another patch.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 579d69bc1f upstream.
The 3w-sas driver needs to tear down the dma mappings before returning
the command to the midlayer, as there is no guarantee the sglist and
count are valid after that point. Also remove the dma mapping helpers
which have another inherent race due to the request_id index.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Torsten Luettgert <ml-lkml@enda.eu>
Tested-by: Bernd Kardatzki <Bernd.Kardatzki@med.uni-tuebingen.de>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 118c855b56 upstream.
The 3w-9xxx driver needs to tear down the dma mappings before returning
the command to the midlayer, as there is no guarantee the sglist and
count are valid after that point. Also remove the dma mapping helpers
which have another inherent race due to the request_id index.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cd9554615 upstream.
The 3w-xxxx driver needs to tear down the dma mappings before returning
the command to the midlayer, as there is no guarantee the sglist and
count are valid after that point. Also remove the dma mapping helpers
which have another inherent race due to the request_id index.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 35e9a9f939 upstream.
This works around a issue with qnap iscsi targets not handling large IOs
very well.
The target returns:
VPD INQUIRY: Block limits page (SBC)
Maximum compare and write length: 1 blocks
Optimal transfer length granularity: 1 blocks
Maximum transfer length: 4294967295 blocks
Optimal transfer length: 4294967295 blocks
Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
Maximum unmap LBA count: 8388607
Maximum unmap block descriptor count: 1
Optimal unmap granularity: 16383
Unmap granularity alignment valid: 0
Unmap granularity alignment: 0
Maximum write same length: 0xffffffff blocks
Maximum atomic transfer length: 0
Atomic alignment: 0
Atomic transfer length granularity: 0
and it is *sometimes* able to handle at least one IO of size up to 8 MB. We
have seen in traces where it will sometimes work, but other times it
looks like it fails and it looks like it returns failures if we send
multiple large IOs sometimes. Also it looks like it can return 2 different
errors. It will sometimes send iscsi reject errors indicating out of
resources or it will send invalid cdb illegal requests check conditions.
And then when it sends iscsi rejects it does not seem to handle retries
when there are command sequence holes, so I could not just add code to
try and gracefully handle that error code.
The problem is that we do not have a good contact for the company,
so we are not able to determine under what conditions it returns
which error and why it sometimes works.
So, this patch just adds a new black list flag to set targets like this to
the old max safe sectors of 1024. The max_hw_sectors changes added in 3.19
caused this regression, so I also ccing stable.
Reported-by: Christian Hesse <list@eworm.de>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 280227a75b upstream.
fallocate() checks that the file is extent-based and returns
EOPNOTSUPP in case is not. Other tasks can convert from and to
indirect and extent so it's safe to check only after grabbing
the inode mutex.
Signed-off-by: Davide Italiano <dccitaliano@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2dc317d56 upstream.
Currently it is possible to lose whole file system block worth of data
when we hit the specific interaction with unwritten and delayed extents
in status extent tree.
The problem is that when we insert delayed extent into extent status
tree the only way to get rid of it is when we write out delayed buffer.
However there is a limitation in the extent status tree implementation
so that when inserting unwritten extent should there be even a single
delayed block the whole unwritten extent would be marked as delayed.
At this point, there is no way to get rid of the delayed extents,
because there are no delayed buffers to write out. So when a we write
into said unwritten extent we will convert it to written, but it still
remains delayed.
When we try to write into that block later ext4_da_map_blocks() will set
the buffer new and delayed and map it to invalid block which causes
the rest of the block to be zeroed loosing already written data.
For now we can fix this by simply not allowing to set delayed status on
written extent in the extent status tree. Also add WARN_ON() to make
sure that we notice if this happens in the future.
This problem can be easily reproduced by running the following xfs_io.
xfs_io -f -c "pwrite -S 0xaa 4096 2048" \
-c "falloc 0 131072" \
-c "pwrite -S 0xbb 65536 2048" \
-c "fsync" /mnt/test/fff
echo 3 > /proc/sys/vm/drop_caches
xfs_io -c "pwrite -S 0xdd 67584 2048" /mnt/test/fff
This can be theoretically also reproduced by at random by running fsx,
but it's not very reliable, though on machines with bigger page size
(like ppc) this can be seen more often (especially xfstest generic/127)
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee136af4a0 upstream.
The usb-storage driver sets max_sectors = 240 in its scsi-host template,
for uas we do not want to do that for all devices, but testing has shown
that some devices need it.
This commit adds a US_FL_MAX_SECTORS_240 flag for such devices, and
implements support for it in uas.c, while at it it also adds support
for US_FL_MAX_SECTORS_64 to uas.c.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5011d44f0 upstream.
uas_use_uas_driver may set some US_FL_foo flags during detection, currently
these are stored in a local variable and then throw away, but these may be
of interest to the caller, so add an extra parameter to (optionally) return
the detected flags, and use this in the uas driver.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 082a75dad8 upstream.
When we end I/O struct request with error, we need to pass
obj_request->length as @nr_bytes so that the entire obj_request worth
of bytes is completed. Otherwise block layer ends up confused and we
trip on
rbd_assert(more ^ (which == img_request->obj_request_count));
in rbd_img_obj_callback() due to more being true no matter what. We
already do it in most cases but we are missing some, in particular
those where we don't even get a chance to submit any obj_requests, due
to an early -ENOMEM for example.
A number of obj_request->xferred assignments seem to be redundant but
I haven't touched any of obj_request->xferred stuff to keep this small
and isolated.
Cc: Alex Elder <elder@linaro.org>
Reported-by: Shawn Edwards <lesser.evil@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8d4e01637 upstream.
Maxburst was not set when doing the dma slave configuration. This value
is checked by the recently introduced xdmac. It causes an error when
doing the slave configuration and so prevents from using dma.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 61f8ff6939 upstream.
Commit 9faf6136ff (ACPI / SBS: Disable smart battery manager on
Apple) introduced a regression disabling the SBS battery manager.
The battery manager should be marked as present when
acpi_manager_get_info() returns 0.
Fixes: 9faf6136ff (ACPI / SBS: Disable smart battery manager on Apple)
Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 909e26dce3 upstream.
Whenever the check for a send in progress introduced in commit
521e0546c9 (btrfs: protect snapshots from deleting during send) is
hit, we return without unlocking inode->i_mutex. This is easy to see
with lockdep enabled:
[ +0.000059] ================================================
[ +0.000028] [ BUG: lock held when returning to user space! ]
[ +0.000029] 4.0.0-rc5-00096-g3c435c1 #93 Not tainted
[ +0.000026] ------------------------------------------------
[ +0.000029] btrfs/211 is leaving the kernel with locks still held!
[ +0.000029] 1 lock held by btrfs/211:
[ +0.000023] #0: (&type->i_mutex_dir_key){+.+.+.}, at: [<ffffffff8135b8df>] btrfs_ioctl_snap_destroy+0x2df/0x7a0
Make sure we unlock it in the error path.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60a8d62b84 upstream.
DMIC clock source is not from codec system clock directly. it is
generated from the division of system clock. And it should be 256 *
sample rate of AIF1.
Signed-off-by: Bard Liao <bardliao@realtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c479163a1b upstream.
In case of error, the function devm_ioremap_resource() returns
ERR_PTR() and never returns NULL. The NULL test in the return
value check should be replaced with IS_ERR().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski.k@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5a356cee8 upstream.
Wrongly release mutex lock during otg_statemachine may result in re-enter
otg_statemachine, which is not allowed, we should do next state transtition
after previous one completed.
Fixes: 826cfe751f ("usb: chipidea: add OTG fsm operation functions implementation")
Signed-off-by: Li Jun <jun.li@freescale.com>
Signed-off-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cff98b99c upstream.
__dma_alloc() does a PAGE_ALIGN() on the passed in size argument before
doing anything else. __dma_free() does not. And because it doesn't, it is
possible to leak memory should size not be an integer multiple of PAGE_SIZE.
The solution is to add a PAGE_ALIGN() to __dma_free() like is done in
__dma_alloc().
Additionally, this patch removes a redundant PAGE_ALIGN() from
__dma_alloc_coherent(), since __dma_alloc_coherent() can only be called
from __dma_alloc(), which already does a PAGE_ALIGN() before the call.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6829e274a6 upstream.
Buffers allocated by dma_alloc_coherent() are always zeroed on Alpha,
ARM (32bit), MIPS, PowerPC, x86/x86_64 and probably other architectures.
It turned out that some drivers rely on this 'feature'. Allocated buffer
might be also exposed to userspace with dma_mmap() call, so clearing it
is desired from security point of view to avoid exposing random memory
to userspace. This patch unifies dma_alloc_coherent() behavior on ARM64
architecture with other implementations by unconditionally zeroing
allocated buffer.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c90c07b98 upstream.
For systems with CONFIG_SERIAL_OF_PLATFORM=y and device_type =
"serial"; property in DT of_serial.c driver maps and unmaps IRQ (because
driver probe fails). Then a driver is called but irq mapping is not
created that's why driver is failing again in again on request_irq().
Based on this use platform_get_irq() instead of platform_get_resource()
which is doing irq_desc allocation and driver itself can request IRQ.
Fix both xilinx serial drivers in the tree.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6befa9d883 upstream.
Do not probe all serial drivers by of_serial.c which are using
device_type = "serial"; property. Only drivers which have valid
compatible strings listed in the driver should be probed.
When PORT_UNKNOWN is setup probe will fail anyway.
Arnd quotation about driver historical background:
"when I wrote that driver initially, the idea was that it would
get used as a stub to hook up all other serial drivers but after
that, the common code learned to create platform devices from DT"
This patch fix the problem with on the system with xilinx_uartps and
16550a where of_serial failed to register for xilinx_uartps and because
of irq_dispose_mapping() removed irq_desc. Then when xilinx_uartps was asking
for irq with request_irq() EINVAL is returned.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d3bba0287 upstream.
Phil and I found out a problem with commit:
7e860a6e7a ("cdc-acm: add sanity checks")
It added some sanity checks to ignore potential garbage in CDC headers but
also introduced a potential infinite loop. This can happen at the first
loop iteration (elength = 0 in that case) if the description isn't a
DT_CS_INTERFACE or later if 'buffer[0]' is zero.
It should also be noted that the wrong length was being added to 'buffer'
in case 'buffer[1]' was not a DT_CS_INTERFACE descriptor, since elength was
assigned after that check in the loop.
A specially crafted USB device could be used to trigger this infinite loop.
Fixes: 7e860a6e7a ("cdc-acm: add sanity checks")
Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
CC: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
CC: Oliver Neukum <oneukum@suse.de>
CC: Adam Lee <adam8157@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7290006d8c upstream.
This patch adds the missing flag to enable "Mute-LED Mode" mixer enum
ctl for Thinkpads that have also the software mute-LED control.
Reported-and-tested-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee52e56e7b upstream.
The mute-LED mode control has the fixed on/off states that are
supposed to remain on/off regardless of the master switch. However,
this doesn't work actually because the vmaster hook is called in the
vmaster code itself.
This patch fixes it by calling the hook indirectly after checking the
mute LED mode.
Reported-and-tested-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7241ea558c upstream.
Looks like audigy emu10k2 (probably emu10k1 - sb live too) support two
modes for DMA. Second mode is useful for 64 bit os with more then 2 GB
of ram (fixes problems with big soundfont loading)
1) 32MB from 2 GB address space using 8192 pages (used now as default)
2) 16MB from 4 GB address space using 4096 pages
Mode is set using HCFG_EXPANDED_MEM flag in HCFG register.
Also format of emu10k2 page table is then different.
Signed-off-by: Peter Zubaj <pzubaj@marticonet.sk>
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d02260824e upstream.
Some models provide too long string for the shortname that has 32bytes
including the terminator, and it results in a non-terminated string
exposed to the user-space. This isn't too critical, though, as the
string is stopped at the succeeding longname string.
This patch fixes such entries by dropping "SB" prefix (it's enough to
fit within 32 bytes, so far). Meanwhile, it also changes strcpy()
with strlcpy() to make sure that this kind of problem won't happen in
future, too.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c94e65c66 upstream.
The OSS emulation in synth-emux helper has a potential AB/BA deadlock
at the simultaneous closing and opening:
close ->
snd_seq_release() ->
sne_seq_free_client() ->
snd_seq_delete_all_ports(): takes client->ports_mutex ->
port_delete() ->
snd_emux_unuse(): takes emux->register_mutex
open ->
snd_seq_oss_open() ->
snd_emux_open_seq_oss(): takes emux->register_mutex ->
snd_seq_event_port_attach() ->
snd_seq_create_port(): takes client->ports_mutex
This patch addresses the deadlock by reducing the rance taking
emux->register_mutex in snd_emux_open_seq_oss(). The lock is needed
for the refcount handling, so move it locally. The calls in
emux_seq.c are already with the mutex, thus they are replaced with the
version without mutex lock/unlock.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07b0e5d49d upstream.
The emux-synth driver has a possible AB/BA mutex deadlock at unloading
the emu10k1 driver:
snd_emux_free() ->
snd_emux_detach_seq(): mutex_lock(&emu->register_mutex) ->
snd_seq_delete_kernel_client() ->
snd_seq_free_client(): mutex_lock(®ister_mutex)
snd_seq_release() ->
snd_seq_free_client(): mutex_lock(®ister_mutex) ->
snd_seq_delete_all_ports() ->
snd_emux_unuse(): mutex_lock(&emu->register_mutex)
Basically snd_emux_detach_seq() doesn't need a protection of
emu->register_mutex as it's already being unregistered. So, we can
get rid of this for avoiding the deadlock.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 5306a54508 upstream.
Commit 32098ec7bc ("MIPS: Makefile: Move the ASEs checks after
setting the core's CFLAGS") re-arranged the MIPS ASE detection code
and also added the current cflags to the detection logic. However,
this introduced a few bugs. First of all, the mips-cflags should not
be quoted since that ends up being passed as a string to subsequent
commands leading to broken detection from the cc-option-* tools.
Moreover, in order to avoid duplicating the cflags-y because of how
cc-option works, we rework the logic so we pass only those cflags which
are needed by the selected ASE. Finally, fix some typos resulting in MSA
not being detected correctly.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Fixes: Commit 32098ec7bc ("MIPS: Makefile: Move the ASEs checks after setting the core's CFLAGS")
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9661/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 9cdf30bd3b upstream.
Returns a non-zero value if the current processor implementation requires
an IHB instruction to deal with an instruction hazard as per MIPS R2
architecture specification, zero otherwise.
For a discussion, see http://patchwork.linux-mips.org/patch/9539/.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit aebac99384 upstream.
Commit 6ebb496ffc7e("MIPS: kernel: entry.S: Add MIPS R6 related
definitions") added the MIPSR6 definition but it did not update the
ISA level of the actual assembly code so a pre-MIPSR6 jr.hb instruction
was generated instead. Fix this by using the MISP_ISA_LEVEL_RAW macro.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Fixes: 6ebb496ffc7e("MIPS: kernel: entry.S: Add MIPS R6 related definitions")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9386/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit f6b39ae6f4 upstream.
Commit 934c79231c1b("MIPS: asm: r4kcache: Add MIPS R6 cache unroll
functions") added support for MIPS R6 cache flushes but it used the
wrong base address register to perform the flushes so the same lines
were flushed over and over. Moreover, replace the "addiu" instructions
with LONG_ADDIU so the correct base address is calculated for 64-bit
cores.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Fixes: 934c79231c1b("MIPS: asm: r4kcache: Add MIPS R6 cache unroll functions")
Cc: linux-mips@linux-mips.org
Reviewed-by: Maciej W. Rozycki <macro@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/9384/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 4d46a67a3e upstream.
The lazy cache flushing implemented in the MIPS kernel suffers from a
race condition that is exposed by do_set_pte() in mm/memory.c.
A pre-condition is a file-system that writes to the page from the CPU
in its readpage method and then calls flush_dcache_page(). One example
is ubifs. Another pre-condition is that the dcache flush is postponed
in __flush_dcache_page().
Upon a page fault for an executable mapping not existing in the
page-cache, the following will happen:
1. Write to the page
2. flush_dcache_page
3. flush_icache_page
4. set_pte_at
5. update_mmu_cache (commits the flush of a dcache-dirty page)
Between steps 4 and 5 another thread can hit the same page and it will
encounter a valid pte. Because the data still is in the L1 dcache the CPU
will fetch stale data from L2 into the icache and execute garbage.
This fix moves the commit of the cache flush to step 3 to close the
race window. It also reduces the amount of flushes on non-executable
mappings because we never enter __flush_dcache_page() for non-aliasing
CPUs.
Regressions can occur in drivers that mistakenly relies on the
flush_dcache_page() in get_user_pages() for DMA operations.
[ralf@linux-mips.org: Folded in patch 9346 to fix highmem issue.]
Signed-off-by: Lars Persson <larper@axis.com>
Cc: linux-mips@linux-mips.org
Cc: paul.burton@imgtec.com
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/9346/
Patchwork: https://patchwork.linux-mips.org/patch/9738/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 07841f9d94 ]
When system is out of memory, refilling of RX buffers fails while
the driver continue to pass the received packets to the kernel stack.
At some point, when all RX buffers deplete, driver may fall into a
sleep, and not recover when memory for new RX buffers is once again
availible. This is because hardware does not have valid descriptors,
so no interrupt will be generated for the driver to return to work
in napi context. Fix it by schedule the napi poll function from
stats_task delayed workqueue, as long as the allocations fail.
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7f0b8a56c9 ]
Commit 6559a7e829 ("cxgb4: Cleanup macros so they follow the same
style and look consistent") introduced a regression where reading MC1
memory in adapters where MC0 isn't present or MC0 size is not equal to MC1
size caused the adapter to crash due to incorrect computation of memoffset.
Fix is to read the size of MC0 instead of MC1 for offset calculation
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 42eab005a5 ]
By default, the number of tx queues is limited by the number of online cpus
in mlx4_en_get_profile(). However, this limit no longer holds after the
ethtool .set_channels method has been called. In that situation, the driver
may access invalid bits of certain cpumask variables when queue_index >=
nr_cpu_ids.
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Acked-by: Ido Shamay <idos@mellanox.com>
Fixes: d03a68f ("net/mlx4_en: Configure the XPS queue mapping on driver load")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cb6ccf09d6 ]
The commit 3cdaa5be9e ("ipv4: Don't
increase PMTU with Datagram Too Big message") broke PMTU in cases
where the rt_pmtu value has expired but is smaller than the new
PMTU value.
This obsolete rt_pmtu then prevents the new PMTU value from being
installed.
Fixes: 3cdaa5be9e ("ipv4: Don't increase PMTU with Datagram Too Big message")
Reported-by: Gerd v. Egidy <gerd.von.egidy@intra2net.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 876a7ae65b ]
ALU64_DIV instruction should be dividing 64-bit by 64-bit,
whereas do_div() does 64-bit by 32-bit divide.
x64 and arm64 JITs correctly implement 64 by 64 unsigned divide.
llvm BPF backend emits code assuming that ALU64_DIV does 64 by 64.
Fixes: 89aa075832 ("net: sock: allow eBPF programs to be attached to sockets")
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b67c43ce3 upstream.
We also need to save/store in forward, else br_parse_ip_options call
will zero frag_max_size as well.
Fixes: 93fdd47e5 ('bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING')
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c34203a14 upstream.
It is not necessary to call device_remove_groups() when device_add_groups()
fails.
The group added by device_add_groups() should be removed if sysfs_create_link()
fails.
Fixes: fa6fdb33b4 ("driver core: bus_type: add dev_groups")
Signed-off-by: Junjie Mao <junjie_mao@yeah.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7085a7401b upstream.
This fixes a regression from the net subsystem:
After commit d52fdbb735
"smc91x: retrieve IRQ and trigger flags in a modern way"
a regression would appear on some legacy platforms such
as the ARM PXA Zylonite that specify IRQ resources like
this:
static struct resource r = {
.start = X,
.end = X,
.flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHEDGE,
};
The previous code would retrieve the resource and parse
the high edge setting in the SMC91x driver, a use pattern
that means every driver specifying an IRQ flag from a
static resource need to parse resource flags and apply
them at runtime.
As we switched the code to use IRQ descriptors to retrieve
the the trigger type like this:
irqd_get_trigger_type(irq_get_irq_data(...));
the code would work for new platforms using e.g. device
tree as the backing irq descriptor would have its flags
properly set, whereas this kind of oldstyle static
resources at no point assign the trigger flags to the
corresponding IRQ descriptor.
To make the behaviour identical on modern device tree
and legacy static platform data platforms, modify
platform_get_irq() to assign the trigger flags to the
irq descriptor when a client looks up an IRQ from static
resources.
Fixes: d52fdbb735 ("smc91x: retrieve IRQ and trigger flags in a modern way")
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13f6b191aa upstream.
Using the indenting we can see the curly braces were obviously intended.
This is a static checker fix, but my guess is that we don't read enough
bytes, because we don't calculate "t_len" correctly.
Fixes: f1d8269802 ('memstick: use fully asynchronous request processing')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f4831605f2 upstream.
time_init invokes timer64_init (which is __init annotation)
since all of these are invoked at init time, lets maintain
consistency by ensuring time_init is marked appropriately
as well.
This fixes the following warning with CONFIG_DEBUG_SECTION_MISMATCH=y
WARNING: vmlinux.o(.text+0x3bfc): Section mismatch in reference from the function time_init() to the function .init.text:timer64_init()
The function time_init() references
the function __init timer64_init().
This is often because time_init lacks a __init
annotation or the annotation of timer64_init is wrong.
Fixes: 546a39546c ("C6X: time management")
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6d7e7e02a0 upstream.
For cases where total length of an input SGs is not same as
length of the input data for encryption, omap-aes driver
crashes. This happens in the case when IPsec is trying to use
omap-aes driver.
To avoid this, we copy all the pages from the input SG list
into a contiguous buffer and prepare a single element SG list
for this buffer with length as the total bytes to crypt, which is
similar thing that is done in case of unaligned lengths.
Fixes: 6242332ff2 ("crypto: omap-aes - Add support for cases of unaligned lengths")
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3fa71c40f upstream.
In struct wl18xx_acx_rx_rate_stat, rx_frames_per_rates field is an
array, not a number. This means WL18XX_DEBUGFS_FWSTATS_FILE can't be
used to display this field in debugfs (it would display a pointer, not
the actual data). Use WL18XX_DEBUGFS_FWSTATS_FILE_ARRAY instead.
This bug has been found by adding a __printf attribute to
wl1271_format_buffer. gcc complained about "format '%u' expects
argument of type 'unsigned int', but argument 5 has type 'u32 *'".
Fixes: c5d94169e8 ("wl18xx: use new fw stats structures")
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b053c9518 upstream.
OPTIMIZER_HIDE_VAR(), as defined when using gcc, is insufficient to
ensure protection from dead store optimization.
For the random driver and crypto drivers, calls are emitted ...
$ gdb vmlinux
(gdb) disassemble memzero_explicit
Dump of assembler code for function memzero_explicit:
0xffffffff813a18b0 <+0>: push %rbp
0xffffffff813a18b1 <+1>: mov %rsi,%rdx
0xffffffff813a18b4 <+4>: xor %esi,%esi
0xffffffff813a18b6 <+6>: mov %rsp,%rbp
0xffffffff813a18b9 <+9>: callq 0xffffffff813a7120 <memset>
0xffffffff813a18be <+14>: pop %rbp
0xffffffff813a18bf <+15>: retq
End of assembler dump.
(gdb) disassemble extract_entropy
[...]
0xffffffff814a5009 <+313>: mov %r12,%rdi
0xffffffff814a500c <+316>: mov $0xa,%esi
0xffffffff814a5011 <+321>: callq 0xffffffff813a18b0 <memzero_explicit>
0xffffffff814a5016 <+326>: mov -0x48(%rbp),%rax
[...]
... but in case in future we might use facilities such as LTO, then
OPTIMIZER_HIDE_VAR() is not sufficient to protect gcc from a possible
eviction of the memset(). We have to use a compiler barrier instead.
Minimal test example when we assume memzero_explicit() would *not* be
a call, but would have been *inlined* instead:
static inline void memzero_explicit(void *s, size_t count)
{
memset(s, 0, count);
<foo>
}
int main(void)
{
char buff[20];
snprintf(buff, sizeof(buff) - 1, "test");
printf("%s", buff);
memzero_explicit(buff, sizeof(buff));
return 0;
}
With <foo> := OPTIMIZER_HIDE_VAR():
(gdb) disassemble main
Dump of assembler code for function main:
[...]
0x0000000000400464 <+36>: callq 0x400410 <printf@plt>
0x0000000000400469 <+41>: xor %eax,%eax
0x000000000040046b <+43>: add $0x28,%rsp
0x000000000040046f <+47>: retq
End of assembler dump.
With <foo> := barrier():
(gdb) disassemble main
Dump of assembler code for function main:
[...]
0x0000000000400464 <+36>: callq 0x400410 <printf@plt>
0x0000000000400469 <+41>: movq $0x0,(%rsp)
0x0000000000400471 <+49>: movq $0x0,0x8(%rsp)
0x000000000040047a <+58>: movl $0x0,0x10(%rsp)
0x0000000000400482 <+66>: xor %eax,%eax
0x0000000000400484 <+68>: add $0x28,%rsp
0x0000000000400488 <+72>: retq
End of assembler dump.
As can be seen, movq, movq, movl are being emitted inlined
via memset().
Reference: http://thread.gmane.org/gmane.linux.kernel.cryptoapi/13764/
Fixes: d4c5efdb97 ("random: add and use memzero_explicit() for clearing data")
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: mancha security <mancha1@zoho.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5265047ac3 upstream.
Commit 077fcf116c ("mm/thp: allocate transparent hugepages on local
node") restructured alloc_hugepage_vma() with the intent of only
allocating transparent hugepages locally when there was not an effective
interleave mempolicy.
alloc_pages_exact_node() does not limit the allocation to the single node,
however, but rather prefers it. This is because __GFP_THISNODE is not set
which would cause the node-local nodemask to be passed. Without it, only
a nodemask that prefers the local node is passed.
Fix this by passing __GFP_THISNODE and falling back to small pages when
the allocation fails.
Commit 9f1b868a13 ("mm: thp: khugepaged: add policy for finding target
node") suffers from a similar problem for khugepaged, which is also fixed.
Fixes: 077fcf116c ("mm/thp: allocate transparent hugepages on local node")
Fixes: 9f1b868a13 ("mm: thp: khugepaged: add policy for finding target node")
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Jarno Rajahalme <jrajahalme@nicira.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80f1d68ccb upstream.
I noticed that a helper function with argument type ARG_ANYTHING does
not need to have an initialized value (register).
This can worst case lead to unintented stack memory leakage in future
helper functions if they are not carefully designed, or unintended
application behaviour in case the application developer was not careful
enough to match a correct helper function signature in the API.
The underlying issue is that ARG_ANYTHING should actually be split
into two different semantics:
1) ARG_DONTCARE for function arguments that the helper function
does not care about (in other words: the default for unused
function arguments), and
2) ARG_ANYTHING that is an argument actually being used by a
helper function and *guaranteed* to be an initialized register.
The current risk is low: ARG_ANYTHING is only used for the 'flags'
argument (r4) in bpf_map_update_elem() that internally does strict
checking.
Fixes: 17a5267067 ("bpf: verifier (add verifier core)")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a73f8e21f3 upstream.
Louis reported that a static checker was complaining that
the 'dst' variable was set (multiple times) but not used.
This is due to a previous commit having removed the usage
(apparently erroneously), so add it back.
Fixes: a344d6778a ("mac80211: allow drivers to support NL80211_SCAN_FLAG_RANDOM_ADDR")
Reported-by: Louis Langholtz <lou_langholtz@me.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08e8331654 upstream.
There is a race condition between e1000_change_mtu's cleanups and
netpoll, when we change the MTU across jumbo size:
Changing MTU frees all the rx buffers:
e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings ->
e1000_clean_rx_ring
Then, close to the end of e1000_change_mtu:
pr_info -> ... -> netpoll_poll_dev -> e1000_clean ->
e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag
And when we come back to do the rest of the MTU change:
e1000_up -> e1000_configure -> e1000_configure_rx ->
e1000_alloc_jumbo_rx_buffers
alloc_jumbo finds the buffers already != NULL, since data (shared with
page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage,
or at least not what is expected when in jumbo state.
This results in an unusable adapter (packets don't get through), and a
NULL pointer dereference on the next call to e1000_clean_rx_ring
(other mtu change, link down, shutdown):
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330
[...]
Call Trace:
[<ffffffff81195445>] put_page+0x55/0x60
[<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200
[<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60
[<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0
[<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840
[<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170
[<ffffffff81647050>] dev_set_mtu+0xa0/0x140
[<ffffffff81664218>] do_setlink+0x218/0xac0
[<ffffffff814459e9>] ? nla_parse+0xb9/0x120
[<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890
[<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40
[<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100
[<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260
By setting the allocator to a dummy version, netpoll can't mess up our
rx buffers. The allocator is set back to a sane value in
e1000_configure_rx.
Fixes: edbbb3ca10 ("e1000: implement jumbo receive with partial descriptors")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c61f0d389 upstream.
d4b18c3e (pnfs: remove GETDEVICELIST implementation) removed the
GETDEVICELIST operation from the NFS client, but left a "hole" in the
nfs4_procedures array. This caused /proc/self/mountstats to report an
operation named "51" where GETDEVICELIST used to be. This patch adds a
stub to fix mountstats.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Fixes: d4b18c3e (pnfs: remove GETDEVICELIST implementation)
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1ccbad9f9f upstream.
For direct read that has IO size larger than rsize, we'll split
it into several READ requests and nfs_direct_good_bytes() would
count completed bytes incorrectly by eating last zero count reply.
Fix it by handling mirror and non-mirror cases differently such that
we only count mirrored writes differently.
This fixes 5fadeb47("nfs: count DIO good bytes correctly with mirroring").
Reported-by: Jean Spector <jean@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d05e54af3 upstream.
Chuck pointed out a problem that crept in with commit 6ffa30d3f7 (nfs:
don't call blocking operations while !TASK_RUNNING). Linux counts tasks
in uninterruptible sleep against the load average, so this caused the
system's load average to be pinned at at least 1 when there was a
NFSv4.1+ mount active.
Not a huge problem, but it's probably worth fixing before we get too
many complaints about it. This patch converts the code back to use
TASK_INTERRUPTIBLE sleep, simply has it flush any signals on each loop
iteration. In practice no one should really be signalling this thread at
all, so I think this is reasonably safe.
With this change, there's also no need to game the hung task watchdog so
we can also convert the schedule_timeout call back to a normal schedule.
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Fixes: commit 6ffa30d3f7 (“nfs: don't call blocking . . .”)
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb7ffbf29e upstream.
nfsd triggered a BUG_ON in net_generic(...) when rpc_pipefs_event(...)
in fs/nfsd/nfs4recover.c was called before assigning ntfsd_net_id.
The following was observed on a MIPS 32-core processor:
kernel: Call Trace:
kernel: [<ffffffffc00bc5e4>] rpc_pipefs_event+0x7c/0x158 [nfsd]
kernel: [<ffffffff8017a2a0>] notifier_call_chain+0x70/0xb8
kernel: [<ffffffff8017a4e4>] __blocking_notifier_call_chain+0x4c/0x70
kernel: [<ffffffff8053aff8>] rpc_fill_super+0xf8/0x1a0
kernel: [<ffffffff8022204c>] mount_ns+0xb4/0xf0
kernel: [<ffffffff80222b48>] mount_fs+0x50/0x1f8
kernel: [<ffffffff8023dc00>] vfs_kern_mount+0x58/0xf0
kernel: [<ffffffff802404ac>] do_mount+0x27c/0xa28
kernel: [<ffffffff80240cf0>] SyS_mount+0x98/0xe8
kernel: [<ffffffff80135d24>] handle_sys64+0x44/0x68
kernel:
kernel:
Code: 0040f809 00000000 2e020001 <00020336> 3c12c00d
3c02801a de100000 6442eb98 0040f809
kernel: ---[ end trace 7471374335809536 ]---
Fixed this behaviour by calling register_pernet_subsys(&nfsd_net_ops) before
registering rpc_pipefs_event(...) with the notifier chain.
Signed-off-by: Giuseppe Cantavenera <giuseppe.cantavenera.ext@nokia.com>
Signed-off-by: Lorenzo Restelli <lorenzo.restelli.ext@nokia.com>
Reviewed-by: Kinlong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 135dd002c2 upstream.
Commit f895b252d4 ("sunrpc: eliminate RPC_DEBUG") introduced
use of IS_ENABLED() in a uapi header which leads to a build
failure for userspace apps trying to use <linux/nfsd/debug.h>:
linux/nfsd/debug.h:18:15: error: missing binary operator before token "("
#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
^
Since this was only used to define NFSD_DEBUG if CONFIG_SUNRPC_DEBUG
is enabled, replace instances of NFSD_DEBUG with CONFIG_SUNRPC_DEBUG.
Fixes: f895b252d4 "sunrpc: eliminate RPC_DEBUG"
Signed-off-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e4891dc28 upstream.
In the case we already have a struct file (derived from a stateid), we
still need to do permission-checking; otherwise an unauthorized user
could gain access to a file by sniffing or guessing somebody else's
stateid.
Fixes: dc97618ddd "nfsd4: separate splice and readv cases"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5ba4a25ab7 upstream.
vfs_fallocate will hit a NULL dereference if the client tries an
ALLOCATE or DEALLOCATE with a special stateid. Fix that. (We also
depend on the open to have broken any conflicting leases or delegations
for us.)
(If it turns out we need to allow special stateid's then we could do a
temporary open here in the special-stateid case, as we do for read and
write. For now I'm assuming it's not necessary.)
Fixes: 95d871f03c "nfsd: Add ALLOCATE support"
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3708f842e1 upstream.
This reverts commit 5a254d08b0.
Since commit 5a254d08b0 ("nfs: replace nfs_add_stats with
nfs_inc_stats when add one"), nfs_readpage and nfs_do_writepage use
nfs_inc_stats to increment NFSIOS_READPAGES and NFSIOS_WRITEPAGES
instead of nfs_add_stats.
However nfs_inc_stats does not do the same thing as nfs_add_stats with
value 1 because these functions work on distinct stats:
nfs_inc_stats increments stats from "enum nfs_stat_eventcounters" (in
server->io_stats->events) and nfs_add_stats those from "enum
nfs_stat_bytecounters" (in server->io_stats->bytes).
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Fixes: 5a254d08b0 ("nfs: replace nfs_add_stats with nfs_inc_stats...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3cab989afd upstream.
Calling unlazy_walk() in walk_component() and do_last() when we find
a symlink that needs to be followed doesn't acquire a reference to vfsmount.
That's fine when the symlink is on the same vfsmount as the parent directory
(which is almost always the case), but it's not always true - one _can_
manage to bind a symlink on top of something. And in such cases we end up
with excessive mntput().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9535c4757b upstream.
The hardware, according to the specs, is limited to 256 byte transfers,
and current driver has no protections in case users attempt to do larger
transfers. The code will just stomp over status register and mayhem
ensues.
Let's split larger transfers into digestable chunks. Doing this allows
Atmel MXT driver on Pixel 1 function properly (it hasn't since commit
9d8dc3e529 "Input: atmel_mxt_ts -
implement T44 message handling" which tries to consume multiple
touchscreen/touchpad reports in a single transaction).
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5f1c97f94 upstream.
Due this typo we don't save/restore the GFX_MAX_REQ_COUNT register across
suspend/resume, so fix this.
This was introduced in
commit ddeea5b0c3
Author: Imre Deak <imre.deak@intel.com>
Date: Mon May 5 15:19:56 2014 +0300
drm/i915: vlv: add runtime PM support
I noticed this only by reading the code. To my knowledge it shouldn't
cause any real problems at the moment, since the power well backing this
register remains on across a runtime s/r. This may change once
system-wide s0ix functionality is enabled in the kernel.
v2:
- resend after a missing git add -u :/
Signed-off-by: Imre Deak <imre.deak@intel.com>
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5241289c4 upstream.
The EDID read code waits for the read completion interrupt to occur
using wait_event_interruptible(). The condition passed to the macro
reads I2C registers. This results in sleeping with the task state set
to TASK_INTERRUPTIBLE, triggering a WARN_ON() introduced in commit
8eb23b9f35 ("sched: Debug nested sleeps").
Fix this by reworking the EDID read code. Instead of checking whether
the read is complete through I2C reads, handle the interrupt registers
in the interrupt handler and update a new edid_read flag accordingly. As
a side effect both the IRQ and polling code paths now process the
interrupt sources through the same code path, simplifying the code.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e96206c4f upstream.
The DDC error interrupt bit is located in REG_INT1, not REG_INT0. Update
both the interrupt wait code and the interrupt sources reset code
accordingly.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c363c7ccc upstream.
After adding display power domain for Exynos5250 in commit
2d2c9a8d0a ("ARM: dts: add display power domain for exynos5250") the
display on Chromebook Snow and others stopped working after boot.
The reason for this suggested Andrzej Hajda: the DP clock was disabled.
This clock is required by Display Port and is enabled by bootloader.
However when FIMD driver probing was deferred, the display power domain
was turned off. This effectively reset the value of DP clock enable
register.
When exynos-dp is later probed, the clock is not enabled and display is
not properly configured:
exynos-dp 145b0000.dp-controller: Timeout of video streamclk ok
exynos-dp 145b0000.dp-controller: unable to config video
Fixes: 2d2c9a8d0a ("ARM: dts: add display power domain for exynos5250")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reported-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Tested-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Tested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c1c21f4e60 upstream.
Current -next fails to link an ARM allmodconfig because drivers that use
the core recovery functions can be built as modules but those functions
are not exported:
ERROR: "i2c_generic_gpio_recovery" [drivers/i2c/busses/i2c-davinci.ko] undefined!
ERROR: "i2c_generic_scl_recovery" [drivers/i2c/busses/i2c-davinci.ko] undefined!
ERROR: "i2c_recover_bus" [drivers/i2c/busses/i2c-davinci.ko] undefined!
Add exports to fix this.
Fixes: 5f9296ba21 (i2c: Add bus recovery infrastructure)
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ada5c1e1b upstream.
Commit 523c5b8964 ("i2c: Remove support for legacy PM") removed the PM
ops from the bus type, which causes the pm operations on the s3c2410
adapter device to fail (-ENOSUPP in rpm_callback). The adapter device
doesn't get bound to a driver and as such can't have its own pm_runtime
callbacks. Previously this was fine as the bus callbacks would have been
used, but now this can cause devices which use PM runtime and are
attached over I2C to fail to resume.
This commit fixes this issue by marking all adapter devices with
pm_runtime_no_callbacks, since they can't have any.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Acked-by: Beata Michalska <b.michalska@samsung.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Fixes: 523c5b8964
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c6cbfb91b8 upstream.
master_xfer() method should return number of i2c messages transferred,
but on Rockchip we were usually returning just 1, which caused trouble
with users that actually check number of transferred messages vs.
checking for negative error codes.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 133778482e upstream.
Those symlinks are created for the mux_dev, so we need to remove it from
there. Currently, it breaks for muxes where the mux_dev is not the device
of the parent adapter like this:
[ 78.234644] WARNING: CPU: 0 PID: 365 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x5c/0x78()
[ 78.242438] sysfs: cannot create duplicate filename '/devices/platform/i2cbus@8/channel-0'
Remove confusing comments while we are here.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Fixes: c9449affad
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 84fce9db4d upstream.
There is a problem that trace events are not properly enabled with
boot cmdline. The problem is that if we pass "trace_event=kmem:mm_page_alloc"
to the boot cmdline, it enables all kmem trace events, and not just
the page_alloc event.
This is caused by the parsing mechanism. When we parse the cmdline, the buffer
contents is modified due to tokenization. And, if we use this buffer
again, we will get the wrong result.
Unfortunately, this buffer is be accessed three times to set trace events
properly at boot time. So, we need to handle this situation.
There is already code handling ",", but we need another for ":".
This patch adds it.
Link: http://lkml.kernel.org/r/1429159484-22977-1-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
[ added missing return ret; ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a065fe6aa2 upstream.
This length miss-calculation may cause a silent data corruption
in the DIX case and cause the device to reference unmapped area.
Fixes: d77e65350f ('libiscsi, iser: Adjust data_length to include protection information')
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ca9b590caa upstream.
The current code decreases from the mss size (which is the gso_size
from the kernel skb) the size of the packet headers.
It shouldn't do that because the mss that comes from the stack
(e.g IPoIB) includes only the tcp payload without the headers.
The result is indication to the HW that each packet that the HW sends
is smaller than what it could be, and too many packets will be sent
for big messages.
An easy way to demonstrate one more aspect of the problem is by
configuring the ipoib mtu to be less than 2*hlen (2*56) and then
run app sending big TCP messages. This will tell the HW to send packets
with giant (negative value which under unsigned arithmetics becomes
a huge positive one) length and the QP moves to SQE state.
Fixes: b832be1e40 ('IB/mlx4: Add IPoIB LSO support')
Reported-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 66578b0b2f upstream.
In a call to ib_umem_get(), if address is 0x0 and size is
already page aligned, check added in commit 8494057ab5
("IB/uverbs: Prevent integer overflow in ib_umem_get address
arithmetic") will refuse to register a memory region that
could otherwise be valid (provided vm.mmap_min_addr sysctl
and mmap_low_allowed SELinux knobs allow userspace to map
something at address 0x0).
This patch allows back such registration: ib_umem_get()
should probably don't care of the base address provided it
can be pinned with get_user_pages().
There's two possible overflows, in (addr + size) and in
PAGE_ALIGN(addr + size), this patch keep ensuring none
of them happen while allowing to pin memory at address
0x0. Anyway, the case of size equal 0 is no more (partially)
handled as 0-length memory region are disallowed by an
earlier check.
Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com
Cc: Shachar Raindel <raindel@mellanox.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aeff092767 upstream.
The available (i.e. not used) buffers are returned by stk1160_clear_queue(),
on the stop_streaming() path. However, this is insufficient and the current
buffer must be released as well. Fix it.
Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80ccf4ad06 upstream.
img_ir_remove() passes a pointer to the ISR function as the 2nd
parameter to irq_free() instead of a pointer to the device data
structure.
This issue causes unloading img-ir module to fail with the below
warning after building and loading img-ir as a module.
WARNING: CPU: 2 PID: 155 at ../kernel/irq/manage.c:1278
__free_irq+0xb4/0x214() Trying to free already-free IRQ 58
Modules linked in: img_ir(-)
CPU: 2 PID: 155 Comm: rmmod Not tainted 3.14.0 #55 ...
Call Trace:
...
[<8048d420>] __free_irq+0xb4/0x214
[<8048d6b4>] free_irq+0xac/0xf4
[<c009b130>] img_ir_remove+0x54/0xd4 [img_ir] [<8073ded0>]
platform_drv_remove+0x30/0x54 ...
Fixes: 160a8f8aec ("[media] rc: img-ir: add base driver")
Signed-off-by: Sifan Naeem <sifan.naeem@imgtec.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56cbd0ccc1 upstream.
mvsas is giving a General protection fault when it encounters an expander
attached ATA device. Analysis of mvs_task_prep_ata() shows that the driver is
assuming all ATA devices are locally attached and obtaining the phy mask by
indexing the local phy table (in the HBA structure) with the phy id. Since
expanders have many more phys than the HBA, this is causing the index into the
HBA phy table to overflow and returning rubbish as the pointer.
mvs_task_prep_ssp() instead does the phy mask using the port properties.
Mirror this in mvs_task_prep_ata() to fix the panic.
Reported-by: Adam Talbot <ajtalbot1@gmail.com>
Tested-by: Adam Talbot <ajtalbot1@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40384e4bbe upstream.
Correctly rollback state if the failure occurs after we have handed over
the ownership of the buffer to the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f01a0bd892 upstream.
Au1x00/RT2800+ doesn't implement the 8250 scratch register (and
this may be true of other h/w currently supported by the 8250 driver);
read back the canary value written to the scratch register to enable
the console h/w restart after resume from system suspend.
Fixes: 4516d50aab ("serial: 8250: Use canary to restart console ...")
Reported-by: Mason <slash.tmp@free.fr>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91555ce901 upstream.
The writeable bits in the USR2 register are all "write 1 to
clear" so only write the bits that actually should be cleared.
Fixes: f1f836e420 ("serial: imx: Add Rx Fifo overrun error message")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0618764cb2 upstream.
I suspect this doesn't show up for most anyone because software
algorithms typically don't have a sense of being too busy. However,
when working with the Freescale CAAM driver it will return -EBUSY on
occasion under heavy -- which resulted in dm-crypt deadlock.
After checking the logic in some other drivers, the scheme for
crypt_convert() and it's callback, kcryptd_async_done(), were not
correctly laid out to properly handle -EBUSY or -EINPROGRESS.
Fix this by using the completion for both -EBUSY and -EINPROGRESS. Now
crypt_convert()'s use of completion is comparable to
af_alg_wait_for_completion(). Similarly, kcryptd_async_done() follows
the pattern used in af_alg_complete().
Before this fix dm-crypt would lockup within 1-2 minutes running with
the CAAM driver. Fix was regression tested against software algorithms
on PPC32 and x86_64, and things seem perfectly happy there as well.
Signed-off-by: Ben Collins <ben.c@servergy.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3261d768b upstream.
These frequency tables list the wrong rates. Either they don't
have the correct frequency at all, or they're specified in kHz
instead of Hz. Fix it.
Fixes: c99e515a92 "clk: qcom: Add IPQ806X LPASS clock controller (LCC) driver"
Tested-by: Kenneth Westfield <kwestfie@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b21503dbb upstream.
Currently, a RCG's M/N counter (used for fraction division) is
set to either 'bypass' (counter disabled) or 'dual edge' (counter
enabled) based on whether the corresponding rcg struct has a mnd
field specified and a non-zero N.
In the case where M and N are the same value, the M/N counter is
still enabled by code even though no division takes place.
Leaving the RCG in such a state can result in improper behavior.
This was observed with the DSI pixel clock RCG when M and N were
both set to 1.
Add an additional check (M != N) to enable the M/N counter only
when it's needed for fraction division.
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Fixes: bcd61c0f53 (clk: qcom: Add support for root clock
generators (RCGs))
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d3745d44a upstream.
The ahbix clock can never be turned off in practice. To change the
rates we need to switch the mux off the M/N counter to an always on
source (XO), reprogram the M/N counter to get the rate we want and
finally switch back to the M/N counter. Add a new ops structure
for this type of clock so that we can set the rate properly.
Fixes: c99e515a92 "clk: qcom: Add IPQ806X LPASS clock controller (LCC) driver"
Tested-by: Kenneth Westfield <kwestfie@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c1d676cec5 upstream.
The current parent, plld_out0, does not exist. The proper name is
pll_d_out0. While at it, rename the plld_dsi clock to pll_d_dsi_out to
be more consistent with other clock names.
Fixes: b270491eb9 ("clk: tegra: Define PLLD_DSI and remove dsia(b)_mux")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e43e25917 upstream.
The number of resets controls is 32 times the number of peripheral
register banks rather than 32 times the number of clocks. This reduces
(drastically) the number of reset controls registered from 10080 (315
clocks * 32) to 224 (6 peripheral register banks * 32).
This also fixes a potential crash because trying to use any of the
excess reset controls (224-10079) would have caused accesses beyond
the array bounds of the peripheral register banks definition array.
Cc: Peter De Schrijver <pdeschrijver@nvidia.com>
Cc: Prashant Gaikwad <pgaikwad@nvidia.com>
Fixes: 6d5b988e7d ("clk: tegra: implement a reset driver")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a9e9cb65b upstream.
Commit 42773b28e7 ("clk: samsung: exynos4: Enable ARMCLK
down feature") enabled ARMCLK down feature on all Exynos4
SoCs. Unfortunately on Exynos4210 SoC ARMCLK down feature
causes a lockup when ondemand cpufreq governor is used.
Fix it by limiting ARMCLK down feature to Exynos4x12 SoCs.
This patch was tested on:
- Exynos4210 SoC based Trats board
- Exynos4210 SoC based Origen board
- Exynos4412 SoC based Trats2 board
- Exynos4412 SoC based Odroid-U3 board
Cc: Daniel Drake <drake@endlessm.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Kukjin Kim <kgene@kernel.org>
Fixes: 42773b28e7 ("clk: samsung: exynos4: Enable ARMCLK down feature")
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Michael Turquette <mturquette@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 61819549f5 upstream.
Level IRQ handlers and edge IRQ handler are managed by tow different
sets of registers. But currently the driver uses the same mask for the
both registers. It lead to issues with the following scenario:
First, an IRQ is requested on a GPIO to be triggered on front. After,
this an other IRQ is requested for a GPIO of the same bank but
triggered on level. Then the first one will be also setup to be
triggered on level. It leads to an interrupt storm.
The different kind of handler are already associated with two
different irq chip type. With this patch the driver uses a private
mask for each one which solves this issue.
It has been tested on an Armada XP based board and on an Armada 375
board. For the both boards, with this patch is applied, there is no
such interrupt storm when running the previous scenario.
This bug was already fixed but in a different way in the legacy
version of this driver by Evgeniy Dushistov:
9ece8839b1 "ARM: orion: Fix for certain
sequence of request_irq can cause irq storm". The fact the new version
of the gpio drive could be affected had been discussed there:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/344670/focus=364012
Reported-by: Evgeniy A. Dushistov <dushistov@mail.ru>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 24e94454c8 upstream.
- don't lock lp->lock in the iss_net_timer for the call of iss_net_poll,
it will lock it itself;
- invert order of lp->lock and opened_lock acquisition in the
iss_net_open to make it consistent with iss_net_poll;
- replace spin_lock with spin_lock_bh when acquiring locks used in
iss_net_timer from non-atomic context;
- replace spin_lock_irqsave with spin_lock_bh in the iss_net_start_xmit
as the driver doesn't use lp->lock in the hard IRQ context;
- replace __SPIN_LOCK_UNLOCKED(lp.lock) with spin_lock_init, otherwise
lockdep is unhappy about using non-static key.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 01e84c70fe upstream.
xtensa actually uses sync_file_range2 implementation, so it should
define __NR_sync_file_range2 as other architectures that use that
function. That fixes userspace interface (that apparently never worked)
and avoids special-casing xtensa in libc implementations.
See the thread ending at
http://lists.busybox.net/pipermail/uclibc/2015-February/048833.html
for more details.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4949009eb8 upstream.
LCD driver is always built for the XTFPGA platform, but its base address
is not configurable, and is wrong for ML605/KC705. Its initialization
locks up KC705 board hardware.
Make the whole driver optional, and its base address and bus width
configurable. Implement 4-bit bus access method.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c533c801d upstream.
acpi_scan_is_offline() may be called under the physical_node_lock
lock of the given device object's parent, so prevent lockdep from
complaining about that by annotating that instance with
SINGLE_DEPTH_NESTING.
Fixes: caa73ea158 (ACPI / hotplug / driver core: Handle containers in a special way)
Reported-and-tested-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ee0d34985 upstream.
It is reported that ACPI interrupts do not work any more on
Dell Latitude D600 after commit c50f13c672 (ACPICA: Save
current masks of enabled GPEs after enable register writes).
The problem turns out to be related to the fact that the
enable_mask and enable_for_run GPE bit masks are not in
sync (in the absence of any system suspend/resume events)
for at least one GPE register on that machine.
Address this problem by writing the enable_for_run mask into
enable_mask as soon as enable_for_run is updated instead of
doing that only after the subsequent register write has
succeeded. For consistency, update acpi_hw_gpe_enable_write()
to store the bit mask to be written into the GPE register
in enable_mask unconditionally before the write.
Since the ACPI_GPE_SAVE_MASK flag is not necessary any more after
that, drop it along with the symbols depending on it.
Reported-and-tested-by: Jim Bos <jim876@xs4all.nl>
Fixes: c50f13c672 (ACPICA: Save current masks of enabled GPEs after enable register writes)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 77ddc2fe08 upstream.
ACPICA commit c70434d4da13e65b6163c79a5aa16b40193631c7
ACPI_MTX_TABLES is acquired and released by the callers of
acpi_tb_install_standard_table() so releasing it in the function itself is
causing the following error in Linux kernel if the table is reloaded:
ACPI Error: Mutex [0x2] is not acquired, cannot release (20141107/utmutex-321)
Call Trace:
[<ffffffff81b0bd48>] dump_stack+0x4f/0x7b
[<ffffffff81546bf5>] acpi_ut_release_mutex+0x47/0x67
[<ffffffff81544357>] acpi_load_table+0x73/0xcb
Link: https://github.com/acpica/acpica/commit/c70434d4
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b8760100e upstream.
ACPICA commit aacf863cfffd46338e268b7415f7435cae93b451
It is reported that on a physically 64-bit addressed machine, 32-bit kernel
can trigger crashes in accessing the memory regions that are beyond the
32-bit boundary. The region field's start address should still be 32-bit
compliant, but after a calculation (adding some offsets), it may exceed the
32-bit boundary. This case is rare and buggy, but there are real BIOSes
leaked with such issues (see References below).
This patch fixes this gap by always defining IO addresses as 64-bit, and
allows OSPMs to optimize it for a real 32-bit machine to reduce the size of
the internal objects.
Internal acpi_physical_address usages in the structures that can be fixed
by this change include:
1. struct acpi_object_region:
acpi_physical_address address;
2. struct acpi_address_range:
acpi_physical_address start_address;
acpi_physical_address end_address;
3. struct acpi_mem_space_context;
acpi_physical_address address;
4. struct acpi_table_desc
acpi_physical_address address;
See known issues 1 for other usages.
Note that acpi_io_address which is used for ACPI_PROCESSOR may also suffer
from same problem, so this patch changes it accordingly.
For iasl, it will enforce acpi_physical_address as 32-bit to generate
32-bit OSPM compatible tables on 32-bit platforms, we need to define
ACPI_32BIT_PHYSICAL_ADDRESS for it in acenv.h.
Known issues:
1. Cleanup of mapped virtual address
In struct acpi_mem_space_context, acpi_physical_address is used as a virtual
address:
acpi_physical_address mapped_physical_address;
It is better to introduce acpi_virtual_address or use acpi_size instead.
This patch doesn't make such a change. Because this should be done along
with a change to acpi_os_map_memory()/acpi_os_unmap_memory().
There should be no functional problem to leave this unchanged except
that only this structure is enlarged unexpectedly.
Link: https://github.com/acpica/acpica/commit/aacf863c
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=87971
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=79501
Reported-and-tested-by: Paul Menzel <paulepanter@users.sourceforge.net>
Reported-and-tested-by: Sial Nije <sialnije@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f073faa736 upstream.
If den=1 and pllin_rate>20MHz then den and num are adjusted to 0
causing a divide by zero error a few lines further on. Therefore
this patch correctly scales num and den such that
pllin_rate/den < 20MHz as required in the device data sheet.
Signed-off-by: Howard Mitchell <hm@hmbedded.co.uk>
Signed-off-by: Mark Brown <broonie@sirena.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d9b13c7cc upstream.
This is to ensure that 'alsactl restore' does not apply default
initialisation as the chip reset defaults are preferred.
Signed-off-by: Howard Mitchell <hm@hmbedded.co.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a57069e33f upstream.
As davinci card gets registered using 'devm_' api
there is no need to unregister the card in 'remove'
function.
Hence drop the 'remove' function.
Fixes: ee2f615d6e (ASoC: davinci-evm: Add device tree binding)
Signed-off-by: Manish Badarkhe <manishvb@ti.com>
Signed-off-by: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8787041d9b upstream.
The WM8741 DAC supports the following typical audio sampling rates:
44.1kHz, 88.2kHz, 176.4kHz (eg: with a master clock of 22.5792MHz)
32kHz, 48kHz, 96kHz, 192kHz (eg: with a master clock of 24.576MHz)
For the rates lists, we should use 82000 instead of 88235, 176400
instead of 1764000 and 192000 instead of 19200 (seems to be a typo).
Signed-off-by: Sergej Sawazki <ce3a@gmx.de>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 74ff960222 upstream.
The delay time after a reset in the codec probe callback was too short,
and did not work on certain hw because the codec needs more time to
power on. This increases the delay time from 1us to 1ms.
Signed-off-by: Pascal Huerst <pascal.huerst@gmail.com>
Acked-by: Brian Austin <brian.austin@cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7261b956b2 upstream.
The patch to add it_page_shift incorrectly changed the increment of
uaddr to use it_page_shift, rather then (1 << it_page_shift).
This broke booting on at least some Cell blades, as the iommu was
basically non-functional.
Fixes: 3a553170d3 ("powerpc/iommu: Add it_page_shift field to determine iommu page size")
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b0dd00addc upstream.
The conversion from __get_cpu_var() to this_cpu_ptr() in iic_setup_cpu()
is wrong. It causes an oops at boot.
We need the per-cpu address of struct cpu_iic, not cpu_iic.regs->prio.
Sparse noticed this, because we pass a non-iomem pointer to out_be64(),
but we obviously don't check the sparse results often enough.
Fixes: 69111bac42 ("powerpc: Replace __get_cpu_var uses")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f7e9e35836 upstream.
This problem appears to have been introduced in 2.6.29 by commit
93197a36a9 "Rewrite sysfs processor cache info code".
This caused lscpu to error out on at least e500v2 devices, eg:
error: cannot open /sys/devices/system/cpu/cpu0/cache/index2/size: No such file or directory
Some embedded powerpc systems use cache-size in DTS for the unified L2
cache size, not d-cache-size, so we need to allow for both DTS names.
Added a new CACHE_TYPE_UNIFIED_D cache_type_info structure to handle
this.
Fixes: 93197a36a9 ("powerpc: Rewrite sysfs processor cache info code")
Signed-off-by: Dave Olson <olson@cumulusnetworks.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 027fa02f84 upstream.
If M64 has been supported, the prefetchable 64-bits memory resources
shouldn't be mapped to the corresponding PE# via M32DT. Unfortunately,
we're doing that in pnv_ioda_setup_pe_seg() wrongly. The issue was
introduced by commit 262af55 ("powerpc/powernv: Enable M64 aperatus
for PHB3"). The patch fixes the issue by simply skipping M64 resources
when updating to M32DT.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 905e8c5dca upstream.
When running a compat (AArch32) userspace on Cortex-A53, a load at EL0
from a virtual address that matches the bottom 32 bits of the virtual
address used by a recent load at (AArch64) EL1 might return incorrect
data.
This patch works around the issue by writing to the contextidr_el1
register on the exception return path when returning to a 32-bit task.
This workaround is patched in at runtime based on the MIDR value of the
processor.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 137650aad9 upstream.
Currently we only perform alternative patching for kernels built with
CONFIG_SMP, as we call apply_alternatives_all() in smp.c, which is only
built for CONFIG_SMP. Thus !SMP kernels may not have necessary
alternatives patched in.
This patch ensures that we call apply_alternatives_all() once all CPUs
are booted, even for !SMP kernels, by having the smp_init_cpus() stub
call this for !SMP kernels via up_late_init. A new wrapper,
do_post_cpus_up_work, is added so we can hook other calls here later
(e.g. boot mode logging).
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Fixes: e039ee4ee3 ("arm64: add alternative runtime patching")
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad08fd494b upstream.
Commit f4f75ad5 ("efi: efistub: Convert into static library")
introduced a static library for EFI stub, libstub.
The EFI libstub directory is referenced by the kernel build system via
a obj subdirectory rule in:
drivers/firmware/efi/Makefile
Unfortunately, arm64 also references the EFI libstub via:
libs-$(CONFIG_EFI_STUB) += drivers/firmware/efi/libstub/
If we're unlucky, the kernel build system can enter libstub via two
simultaneous threads resulting in build failures such as:
fixdep: error opening depfile: drivers/firmware/efi/libstub/.efi-stub-helper.o.d: No such file or directory
scripts/Makefile.build:257: recipe for target 'drivers/firmware/efi/libstub/efi-stub-helper.o' failed
make[1]: *** [drivers/firmware/efi/libstub/efi-stub-helper.o] Error 2
Makefile:939: recipe for target 'drivers/firmware/efi/libstub' failed
make: *** [drivers/firmware/efi/libstub] Error 2
make: *** Waiting for unfinished jobs....
This patch adjusts the arm64 Makefile to reference the compiled library
explicitly (as is currently done in x86), rather than the directory.
Fixes: f4f75ad5 efi: efistub: Convert into static library
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91d57155dc upstream.
After writing the page tables, we use __inval_cache_range to invalidate
any stale cache entries. Strongly Ordered memory accesses are not
ordered w.r.t. cache maintenance instructions, and hence explicit memory
barriers are required to provide this ordering. However,
__inval_cache_range was written to be used on Normal Cacheable memory
once the MMU and caches are on, and does not have any barriers prior to
the DC instructions.
This patch adds a DMB between the page tables being written and the
corresponding cachelines being invalidated, ensuring that the
invalidation makes the new data visible to subsequent cacheable
accesses. A barrier is not required before the prior invalidate as we do
not access the page table memory area prior to this, and earlier
barriers in preserve_boot_args and set_cpu_boot_mode_flag ensures
ordering w.r.t. any stores performed prior to entering Linux.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: c218bca74e ("arm64: Relax the kernel cache requirements for boot")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6d1966dfd6 upstream.
Register MIDR_EL1 is masked to get variant and revision fields, then
compared against midr_range_min and midr_range_max when checking
whether CPU is affected by any particular erratum. However, variant
and revision fields in MIDR_EL1 are separated by 16 bits, so the min
and max of midr range should be constructed accordingly, otherwise
the patch will not be applied when variant field is non-0.
Acked-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Bo Yan <byan@nvidia.com>
[will: use MIDR_VARIANT_SHIFT to construct upper bound]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a579da258 upstream.
Before we reach to connection established we may get an
error event. In this case the core won't teardown this
connection (never established it), so we take care of freeing
it ourselves.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 364189f0ad upstream.
This hang was a result of a missing command put when
a DIF error occurred during a rdma read (and we sent
an CHECK_CONDITION error without passing it to the
backend).
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c836777830 upstream.
In fd_do_prot_rw(), it allocates prot_buf which is used to copy from
se_cmd->t_prot_sg by sbc_dif_copy_prot(). The SG table for prot_buf
is also initialized by allocating 'se_cmd->t_prot_nents' entries of
scatterlist and setting the data length of each entry to PAGE_SIZE
at most.
However if se_cmd->t_prot_sg contains a clustered entry (i.e.
sg->length > PAGE_SIZE), the SG table for prot_buf can't be
initialized correctly and sbc_dif_copy_prot() can't copy to prot_buf.
(This actually happened with TCM loopback fabric module)
As prot_buf is allocated by kzalloc() and it's physically contiguous,
we only need a single scatterlist entry.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64d240b721 upstream.
When UNMAP command is issued with DIF protection support enabled,
the protection info for the unmapped region is remain unchanged.
So READ command for the region causes data integrity failure.
This fixes it by invalidating protection info for the unmapped region
by filling with 0xff pattern. This change also adds helper function
fd_do_prot_fill() in order to reduce code duplication with existing
fd_format_prot().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 38da0f49e8 upstream.
When CONFIG_DEBUG_SG=y and DIF protection support enabled, kernel
BUG()s are triggered due to the following two issues:
1) prot_sg is not initialized by sg_init_table().
When CONFIG_DEBUG_SG=y, scatterlist helpers check sg entry has a
correct magic value.
2) vmalloc'ed buffer is passed to sg_set_buf().
sg_set_buf() uses virt_to_page() to convert virtual address to struct
page, but it doesn't work with vmalloc address. vmalloc_to_page()
should be used instead. As prot_buf isn't usually too large, so
fix it by allocating prot_buf by kmalloc instead of vmalloc.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8e639852a upstream.
This patch fixes a bug for COMPARE_AND_WRITE handling with
fabrics using SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC.
It adds the missing allocation for cmd->t_bidi_data_sg within
transport_generic_new_cmd() that is used by COMPARE_AND_WRITE
for the initial READ payload, even if the fabric is already
providing a pre-allocated buffer for cmd->t_data_sg.
Also, fix zero-length COMPARE_AND_WRITE handling within the
compare_and_write_callback() and target_complete_ok_work()
to queue the response, skipping the initial READ.
This fixes COMPARE_AND_WRITE emulation with loopback, vhost,
and xen-backend fabric drivers using SG_TO_MEM_NOALLOC.
Reported-by: Christoph Hellwig <hch@lst.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 88dcd2dab5 upstream.
This patch converts iscsi-target code to use modern kthread.h API
callers for creating RX/TX threads for each new iscsi_conn descriptor,
and releasing associated RX/TX threads during connection shutdown.
This is done using iscsit_start_kthreads() -> kthread_run() to start
new kthreads from within iscsi_post_login_handler(), and invoking
kthread_stop() from existing iscsit_close_connection() code.
Also, convert iscsit_logout_post_handler_closesession() code to use
cmpxchg when determing when iscsit_cause_connection_reinstatement()
needs to sleep waiting for completion.
Reported-by: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Slava Shwartsman <valyushash@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 299d0c5b27 upstream.
The comparison from the previous line seems to have been erroneously
(partially) copied-and-pasted onto the next. The second line should be
checking req.bytes, not req.lnum.
Coverity CID #139400
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
[rw: Fixed comparison]
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f16db8071c upstream.
In some of the 'out_not_moved' error paths, lnum may be used
uninitialized. Don't ignore the warning; let's fix it.
This uninitialized variable doesn't have much visible effect in the end,
since we just schedule the PEB for erasure, and its LEB number doesn't
really matter (it just gets printed in debug messages). But let's get it
straight anyway.
Coverity CID #113449
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8eef7d70f7 upstream.
We are completely discarding the earlier value of 'bitflips', which
could reflect a bitflip found in ubi_io_read_vid_hdr(). Let's use the
bitwise OR of header and data 'bitflip' statuses instead.
Coverity CID #1226856
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f82263c698 upstream.
Since commit ee0778a301
("tools/power: turbostat: make Makefile a bit more capable")
turbostat's Makefile is using
[...]
BUILD_OUTPUT := $(PWD)
[...]
which obviously causes trouble when building "turbostat" with
make -C /usr/src/linux/tools/power/x86/turbostat ARCH=x86 turbostat
because GNU make does not update nor guarantee that $PWD is set.
This patch changes the Makefile to use $CURDIR instead, which GNU make
guarantees to set and update (i.e. when using "make -C ...") and also
adds support for the O= option (see "make help" in your root of your
kernel source tree for more details).
Link: https://bugs.gentoo.org/show_bug.cgi?id=533918
Fixes: ee0778a301 ("tools/power: turbostat: make Makefile a bit more capable")
Signed-off-by: Thomas D. <whissi@whissi.de>
Cc: Mark Asselstine <mark.asselstine@windriver.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a5cbce421 upstream.
We cap 32bit userspace backtraces to PERF_MAX_STACK_DEPTH
(currently 127), but we forgot to do the same for 64bit backtraces.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 34d47b6322 upstream.
I started to work with PPI interface so that it would be available
under character device sysfs directory and realized that chip
registeration was still too messy.
In TPM 1.x in some rare scenarios (errors that almost never occur)
wrong order in deinitialization steps was taken in teardown. I
reproduced these scenarios by manually inserting error codes in the
place of the corresponding function calls.
The key problem is that the teardown is messy with two separate code
paths (this was inherited when moving code from tpm-interface.c).
Moved TPM 1.x specific register/unregister functionality to own helper
functions and added single code path for teardown in tpm_chip_register().
Now the code paths have been fixed and it should be easier to review
later on this part of the code.
Fixes: 7a1d7e6dd7 ("tpm: TPM 2.0 baseline support")
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Scot Doyle <lkml14@scotdoyle.com>
Reviewed-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e0c9c0afd2 upstream.
Now that it is possible to lazily unmount an entire mount tree and
leave the individual mounts connected to each other add a new flag
UMOUNT_CONNECTED to umount_tree to force this behavior and use
this flag in detach_mounts.
This closes a bug where the deletion of a file or directory could
trigger an unmount and reveal data under a mount point.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f53e579751 upstream.
lookup_mountpoint can return either NULL or an error value.
Update the test in __detach_mounts to test for an error value
to avoid pathological cases causing a NULL pointer dereferences.
The callers of __detach_mounts should prevent it from ever being
called on an unlinked dentry but don't take any chances.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce07d891a0 upstream.
Modify umount(MNT_DETACH) to keep mounts in the hash table that are
locked to their parent mounts, when the parent is lazily unmounted.
In mntput_no_expire detach the children from the hash table, depending
on mnt_pin_kill in cleanup_mnt to decrement the mnt_count of the children.
In __detach_mounts if there are any mounts that have been unmounted
but still are on the list of mounts of a mountpoint, remove their
children from the mount hash table and those children to the unmounted
list so they won't linger potentially indefinitely waiting for their
final mntput, now that the mounts serve no purpose.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a46c5735c upstream.
For future use factor out a function umount_mnt from umount_tree.
This function unhashes a mount and remembers where the mount
was mounted so that eventually when the code makes it to a
sleeping context the mountpoint can be dput.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7bdb11de8e upstream.
Create a function unhash_mnt that contains the common code between
detach_mnt and umount_tree, and use unhash_mnt in place of the common
code. This add a unncessary list_del_init(mnt->mnt_child) into
umount_tree but given that mnt_child is already empty this extra
line is a noop.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c56fe3142 upstream.
If the first mount in shared subtree is locked don't unmount the
shared subtree.
This is ensured by walking through the mounts parents before children
and marking a mount as unmountable if it is not locked or it is locked
but it's parent is marked.
This allows recursive mount detach to propagate through a set of
mounts when unmounting them would not reveal what is under any locked
mount.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d88457eb5 upstream.
A prerequisite of calling umount_tree is that the point where the tree
is mounted at is valid to unmount.
If we are propagating the effect of the unmount clear MNT_LOCKED in
every instance where the same filesystem is mounted on the same
mountpoint in the mount tree, as we know (by virtue of the fact
that umount_tree was called) that it is safe to reveal what
is at that mountpoint.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 411a938b5a upstream.
- Modify __lookup_mnt_hash_last to ignore mounts that have MNT_UMOUNTED set.
- Don't remove mounts from the mount hash table in propogate_umount
- Don't remove mounts from the mount hash table in umount_tree before
the entire list of mounts to be umounted is selected.
- Remove mounts from the mount hash table as the last thing that
happens in the case where a mount has a parent in umount_tree.
Mounts without parents are not hashed (by definition).
This paves the way for delaying removal from the mount hash table even
farther and fixing the MNT_LOCKED vs MNT_DETACH issue.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 590ce4bcbf upstream.
In some instances it is necessary to know if the the unmounting
process has begun on a mount. Add MNT_UMOUNT to make that reliably
testable.
This fix gets used in fixing locked mounts in MNT_DETACH
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c003b26ff9 upstream.
umount_tree builds a list of mounts that need to be unmounted.
Utilize mnt_list for this purpose instead of mnt_hash. This begins to
allow keeping a mount on the mnt_hash after it is unmounted, which is
necessary for a properly functioning MNT_LOCKED implementation.
The fact that mnt_list is an ordinary list makding available list_move
is nice bonus.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8318e667f1 upstream.
Invoking mount propagation from __detach_mounts is inefficient and
wrong.
It is inefficient because __detach_mounts already walks the list of
mounts that where something needs to be done, and mount propagation
walks some subset of those mounts again.
It is actively wrong because if the dentry that is passed to
__detach_mounts is not part of the path to a mount that mount should
not be affected.
change_mnt_propagation(p,MS_PRIVATE) modifies the mount propagation
tree of a master mount so it's slaves are connected to another master
if possible. Which means even removing a mount from the middle of a
mount tree with __detach_mounts will not deprive any mount propagated
mount events.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e819f15210 upstream.
- Remove the unneeded declaration from pnode.h
- Mark umount_tree static as it has no callers outside of namespace.c
- Define an enumeration of umount_tree's flags.
- Pass umount_tree's flags in by name
This removes the magic numbers 0, 1 and 2 making the code a little
clearer and makes it possible for there to be lazy unmounts that don't
propagate. Which is what __detach_mounts actually wants for example.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e12fb97222 upstream.
Previously commit 14ece1028b added a
support for for syncing parent directory of newly created inodes to
make sure that the inode is not lost after a power failure in
no-journal mode.
However this does not work in majority of cases, namely:
- if the directory has inline data
- if the directory is already indexed
- if the directory already has at least one block and:
- the new entry fits into it
- or we've successfully converted it to indexed
So in those cases we might lose the inode entirely even after fsync in
the no-journal mode. This also includes ext2 default mode obviously.
I've noticed this while running xfstest generic/321 and even though the
test should fail (we need to run fsck after a crash in no-journal mode)
I could not find a newly created entries even when if it was fsynced
before.
Fix this by adjusting the ext4_add_entry() successful exit paths to set
the inode EXT4_STATE_NEWENTRY so that fsync has the chance to fsync the
parent directory as well.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Frank Mayhar <fmayhar@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d4a41d10b2 upstream.
i2c_master_send may return many negative values different than
-EREMOTEIO.
In case an i2c transaction is NACK'ed, on raspberry pi B+
kernel 3.18, -EIO is generated instead.
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 361918970b upstream.
We currently need two checks of the peripheral version in MACB_MID register.
One of them got out of sync after modification by 8a013a9c71 (net: macb:
Include multi queue support for xilinx ZynqMP ethernet version).
Fix this in macb_configure_caps() so that xilinx ZynqMP will be considered
as a GEM flavor.
Fixes: 8a013a9c71 ("net: macb: Include multi queue support for xilinx ZynqMP
ethernet version")
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Punnaiah Choudary Kalluri <punnaia@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d7ef767173 upstream.
On some Silvermont-Core/Baytrail-SOC systems,
C1E latency is higher than original specifications.
Although C1E is still enumerated in CPUID.MWAIT.EDX,
we delete the state from intel_idle to avoid latency impact.
Under some conditions, the latency of the C6N-BYT and C6S-BYT states
may exceed the specified values of 40 and 140 usec, respectively.
Increase those values to 300 and 500 usec; to assure
that the hardware does not violate constraints that may be set
by the Linux PM_QOS sub-system.
Also increase the C7-BYT target residency to 4.0 ms from 1.5 ms.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Kumar P Mahesh <mahesh.kumar.p@intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b72c186999 upstream.
ptrace_resume() is called when the tracee is still __TASK_TRACED. We set
tracee->exit_code and then wake_up_state() changes tracee->state. If the
tracer's sub-thread does wait() in between, task_stopped_code(ptrace => T)
wrongly looks like another report from tracee.
This confuses debugger, and since wait_task_stopped() clears ->exit_code
the tracee can miss a signal.
Test-case:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <pthread.h>
#include <assert.h>
int pid;
void *waiter(void *arg)
{
int stat;
for (;;) {
assert(pid == wait(&stat));
assert(WIFSTOPPED(stat));
if (WSTOPSIG(stat) == SIGHUP)
continue;
assert(WSTOPSIG(stat) == SIGCONT);
printf("ERR! extra/wrong report:%x\n", stat);
}
}
int main(void)
{
pthread_t thread;
pid = fork();
if (!pid) {
assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
for (;;)
kill(getpid(), SIGHUP);
}
assert(pthread_create(&thread, NULL, waiter, NULL) == 0);
for (;;)
ptrace(PTRACE_CONT, pid, 0, SIGCONT);
return 0;
}
Note for stable: the bug is very old, but without 9899d11f65 "ptrace:
ensure arch_ptrace/ptrace_request can never race with SIGKILL" the fix
should use lock_task_sighand(child).
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Pavel Labath <labath@google.com>
Tested-by: Pavel Labath <labath@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a87938b2e2 upstream.
With CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE enabled, and a normal top-down
address allocation strategy, load_elf_binary() will attempt to map a PIE
binary into an address range immediately below mm->mmap_base.
Unfortunately, load_elf_ binary() does not take account of the need to
allocate sufficient space for the entire binary which means that, while
the first PT_LOAD segment is mapped below mm->mmap_base, the subsequent
PT_LOAD segment(s) end up being mapped above mm->mmap_base into the are
that is supposed to be the "gap" between the stack and the binary.
Since the size of the "gap" on x86_64 is only guaranteed to be 128MB this
means that binaries with large data segments > 128MB can end up mapping
part of their data segment over their stack resulting in corruption of the
stack (and the data segment once the binary starts to run).
Any PIE binary with a data segment > 128MB is vulnerable to this although
address randomization means that the actual gap between the stack and the
end of the binary is normally greater than 128MB. The larger the data
segment of the binary the higher the probability of failure.
Fix this by calculating the total size of the binary in the same way as
load_elf_interp().
Signed-off-by: Michael Davidson <md@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a77c50b44c upstream.
Since commit 6e3f62f079 ("mfd: core: Fix platform-device id
generation") we honour PLATFORM_DEVID_AUTO and PLATFORM_DEVID_NONE when
registering mfd-devices.
Unfortunately, some mfd-drivers rely on the old behaviour of generating
platform-device ids by adding the cell id also to the special value of
PLATFORM_DEVID_NONE. The resulting platform ids are not only used to
generate device-unique names, but are also used instead of the cell id
to identify cells when probing subdevices.
These drivers should be updated to use PLATFORM_DEVID_AUTO, which would
also allow more than one device to be registered without resorting to
hacks (see for example wm831x), but lets fix the regression first by
partially reverting the above mentioned commit with respect to
PLATFORM_DEVID_NONE.
Fixes: 6e3f62f079 ("mfd: core: Fix platform-device id generation")
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6bcca19f5d upstream.
When the left touchpad button gets pressed, and then the trackpoint is
moved, and then the button is released, the following happens:
1) touchpad packet is received, touchpad evdev node reports BTN_LEFT 1
2) pointing stick packet is received, the hw will report a BTN_LEFT 1 in
this packet because when the trackstick is active it communicates the
combined touchpad + pointing stick buttons in the trackstick packet,
since alps_report_bare_ps2_packet passes NULL (*) for the dev2 parameter
to alps_report_buttons the combining is not detected and the
pointing stick evdev node will also report BTN_LEFT 1
3) on release of the button a pointing stick packet with BTN_LEFT 0 is
received and the pointing stick evdev node will report BTN_LEFT 0
Note how because of the passing as NULL for dev2 the touchpad evdev node
will never send BTN_LEFT 0 in this scenario leading to a stuck mouse button.
This is a regression in 4.0 introduced by commit 04aae283ba
("Input: ALPS - do not mix trackstick and external PS/2 mouse data")
This commit fixes this by passing in the touchpad evdev as dev2 parameter
when calling alps_report_buttons for the pointingstick on alps v2 devices,
so that alps_report_buttons correctly detect that we're already reporting
the button as pressed via the touchpad evdev node, and will also send the
release event there.
Reported-by: Hans de Bruin <jmdebruin@xmsnet.nl>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd884149ac upstream.
On ASUS TP500LN and X750JN, the touchpad absolute mode is reset each
time set_rate is done.
In order to fix this, we will verify the firmware version, and if it
matches the one in those laptops, the set_rate function is overloaded
with a function elantech_set_rate_restore_reg_07 that performs the
set_rate with the original function, followed by a restore of reg_07
(the register that sets the absolute mode on elantech v4 hardware).
Also the ASUS TP500LN and X750JN firmware version, capabilities, and
button constellation is added to elantech.c
Reported-and-tested-by: George Moutsopoulos <gmoutso@yahoo.co.uk>
Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d1b6e2932 upstream.
The ALC256 does not have a mixer nid at 0x0b, and there's no
loopback path (the output pins are directly connected to the DACs).
This commit fixes an "num_steps = 0 for NID=0xb (ctl = Beep Playback Volume)"
error (and as a result, problems with amixer/alsamixer).
If there's pcbeep functionality, it certainly isn't controlled by setting an
amp on 0x0b, so disable beep functionality (at least for now).
BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1446517
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f2aa111041 upstream.
The Lenovo Thinkpad T450 requires the ALC292_FIXUP_TPT440_DOCK as well in
order to get working sound output on the docking stations headphone jack.
Patch tested on a Thinkpad T450 (20BVCTO1WW) using kernel 4.0-rc7 in
conjunction with a ThinkPad Ultradock.
Signed-off-by: Jo-Philipp Wich <jow@openwrt.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91bf0c2dcb upstream.
The functions snd_emu10k1_proc_spdif_read and snd_emu1010_fpga_read
acquire the emu_lock before accessing the FPGA. The function used
to access the FPGA (snd_emu1010_fpga_read) also tries to take
the emu_lock which causes a deadlock.
Remove the outer locking in the proc-functions (guarding only the
already safe fpga read) to prevent this deadlock.
[removed superfluous flags variables too -- tiwai]
Signed-off-by: Michael Gernoth <michael@gernoth.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eef0342cf3 upstream.
Adds Microsoft LifeCam Cinema USB ID to the snd_usb_get_sample_rate_quirk list as the Lifecam Cinema does not appear to support getting the sample rate.
Fixes the issue where the LifeCam Cinema would wait for USB timeout and log the message "cannot get freq at ep 0x82" when accessed.
Addresses bug report https://bugzilla.kernel.org/show_bug.cgi?id=95961.
Signed-off-by: Adam Honse <calcprogrammer1@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4591243102 upstream.
The at91sam9n12 and at91sam9x5 usb clocks do not propagate rate
modification requests to their parents.
This causes a bug when the PLLB is left uninitialized by the bootloader
(PLL multiplier set to 0, or in other words, PLL rate = 0 Hz).
Implement the determinate_rate method and propagate the change rate
request to the parent clk.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Bo Shen <voice.shen@atmel.com>
Tested-by: Bo Shen <voice.shen@atmel.com>
Signed-off-by: Michael Turquette <mturquette@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bbc78c07a5 upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 59c9904cce upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 74bd7b6980 upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08debfb13b upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea16328f80 upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c0ae6574c upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a606ac297 upstream.
While this driver was already using a 50ms resume
timeout, let's make sure everybody uses the same
macro so it's easy to fix later should anything
go wrong.
It also gives a more "stable" expectation to Linux
users.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7e136bb71a upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b8fb6f79f7 upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 595227db1f upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 84c0d178eb upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 309be23936 upstream.
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Based on original work by Bin Liu <Bin Liu <b-liu@ti.com>>
Cc: Bin Liu <b-liu@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 62f0342de1 upstream.
Every USB Host controller should use this new
macro to define for how long resume signalling
should be driven on the bus.
Currently, almost every single USB controller
is using a 20ms timeout for resume signalling.
That's problematic for two reasons:
a) sometimes that 20ms timer expires a little
before 20ms, which makes us fail certification
b) some (many) devices actually need more than
20ms resume signalling.
Sure, in case of (b) we can state that the device
is against the USB spec, but the fact is that
we have no control over which device the certification
lab will use. We also have no control over which host
they will use. Most likely they'll be using a Windows
PC which, again, we have no control over how that
USB stack is written and how long resume signalling
they are using.
At the end of the day, we must make sure Linux passes
electrical compliance when working as Host or as Device
and currently we don't pass compliance as host because
we're driving resume signallig for exactly 20ms and
that confuses certification test setup resulting in
Certification failure.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 869aee0f31 upstream.
The res parameter passed to devm_usb_phy_match() is the location where the
pointer to the usb_phy is stored, hence it needs to be dereferenced before
comparing to the match data in order to find the correct match.
Fixes: 410219dcd2 ("usb: otg: utils: devres: Add API's to associate a device with the phy")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e3c93e1a3f upstream.
As per Mentor Graphics' documentation, we should
always handle TX endpoints before RX endpoints.
This patch fixes that error while also updating
some hard-to-read comments which were scattered
around musb_interrupt().
This patch should be backported as far back as
possible since this error has been in the driver
since it's conception.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7e9e20b1fa upstream.
Resolve a merge conflict with mmc refactoring aaa25a5a33 ("ARM: dts:
unuse the slot-node and deprecate the supports-highspeed for dw-mmc in
exynos") by dropping the slot@0 nodes, moving its bus-width property to
the mmc node and replacing supports-highspeed with cap-{mmc,sd}-highspeed,
matching exynos5250-snow.
Cc: Jaehoon Chung <jh80.chung@samsung.com>
Fixes: 53dd4138bb ("ARM: dts: Add exynos5250-spring device tree")
Signed-off-by: Andreas Faerber <afaerber@suse.de>
Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98b80987c9 upstream.
After 57a38effa5 (net: phy: micrel: disable broadcast for KSZ8081/KSZ8091)
the macb1 interface refuses to work properly because it tries
to cling to address 0 which isn't able to communicate in broadcast with
the mac anymore. The micrel phy on the board is actually configured
to show up at address 1.
Adding the phy node and its real address fixes the issue.
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Johan Hovold <johan@kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e330ae4ab upstream.
There are two PMICs on Cragganmore, currently one dynamically assign
its IRQ base and the other uses a fixed base. It is possible for the
statically assigned PMIC to fail if its IRQ is taken by the dynamically
assigned one. Fix this by statically assigning both the IRQ bases.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 548ae94c1c upstream.
On Armada 38x SoCs, under heavy I/O load, the system hangs when CPU
Idle is enabled. Waiting for a solution to this issue, this patch
disables the CPU Idle support for this SoC.
As CPU Hot plug support also uses some of the CPU Idle functions it is
also affected by the same issue. This patch disables it also for the
Armada 38x SoCs.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8defb3367f upstream.
Usually ELF_ET_DYN_BASE is 2/3 of TASK_SIZE. With 3G/1G user/kernel
split this is not so, because 2*TASK_SIZE overflows 32 bits,
so the actual value of ELF_ET_DYN_BASE is:
(2 * TASK_SIZE / 3) = 0x2a000000
When ASLR is disabled PIE binaries will load at ELF_ET_DYN_BASE address.
On 32bit platforms AddressSanitzer uses addresses [0x20000000 - 0x40000000]
for shadow memory [1]. So ASan doesn't work for PIE binaries when ASLR disabled
as it fails to map shadow memory.
Also after Kees's 'split ET_DYN ASLR from mmap ASLR' patchset PIE binaries
has a high chance of loading somewhere in between [0x2a000000 - 0x40000000]
even if ASLR enabled. This makes ASan with PIE absolutely incompatible.
Fix overflow by dividing TASK_SIZE prior to multiplying.
After this patch ELF_ET_DYN_BASE equals to (for CONFIG_VMSPLIT_3G=y):
(TASK_SIZE / 3 * 2) = 0x7f555554
[1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm#Mapping
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reported-by: Maria Guseva <m.guseva@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 767bf7e7a1 upstream.
Normally, when a CPU wants to clear a cache line to zero in the external
L2 cache, it would generate bus cycles to write each word as it would do
with any other data access.
However, a Cortex A9 connected to a L2C-310 has a specific feature where
the CPU can detect this operation, and signal that it wants to zero an
entire cache line. This feature, known as Full Line of Zeros (FLZ),
involves a non-standard AXI signalling mechanism which only the L2C-310
can properly interpret.
There are separate enable bits in both the L2C-310 and the Cortex A9 -
the L2C-310 needs to be enabled and have the FLZ enable bit set in the
auxiliary control register before the Cortex A9 has this feature
enabled.
Unfortunately, the suspend code was not respecting this - it's not
obvious from the code:
swsusp_arch_suspend()
cpu_suspend() /* saves the Cortex A9 auxiliary control register */
arch_save_image()
soft_restart() /* turns off FLZ in Cortex A9, and disables L2C */
cpu_resume() /* restores the Cortex A9 registers, inc auxcr */
At this point, we end up with the L2C disabled, but the Cortex A9 with
FLZ enabled - which means any memset() or zeroing of a full cache line
will fail to take effect.
A similar issue exists in the resume path, but it's slightly more
complex:
swsusp_arch_suspend()
cpu_suspend() /* saves the Cortex A9 auxiliary control register */
arch_save_image() /* image with A9 auxcr saved */
...
swsusp_arch_resume()
call_with_stack()
arch_restore_image() /* restores image with A9 auxcr saved above */
soft_restart() /* turns off FLZ in Cortex A9, and disables L2C */
cpu_resume() /* restores the Cortex A9 registers, inc auxcr */
Again, here we end up with the L2C disabled, but Cortex A9 FLZ enabled.
There's no need to turn off the L2C in either of these two paths; there
are benefits from not doing so - for example, the page copies will be
faster with the L2C enabled.
Hence, fix this by providing a variant of soft_restart() which can be
used without turning the L2 cache controller off, and use it in both
of these paths to keep the L2C enabled across the respective resume
transitions.
Fixes: 8ef418c717 ("ARM: l2c: trial at enabling some Cortex-A9 optimisations")
Reported-by: Sean Cross <xobs@kosagi.com>
Tested-by: Sean Cross <xobs@kosagi.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c1b8940b42 upstream.
We have observed a BUG() crash in fs/attr.c:notify_change(). The crash
occurs during an rsync into a filesystem that is exported via NFS.
1.) fs/attr.c:notify_change() modifies the caller's version of attr.
2.) 6de0ec00ba ("VFS: make notify_change pass ATTR_KILL_S*ID to
setattr operations") introduced a BUG() restriction such that "no
function will ever call notify_change() with both ATTR_MODE and
ATTR_KILL_S*ID set". Under some circumstances though, it will have
assisted in setting the caller's version of attr to this very
combination.
3.) 27ac0ffeac ("locks: break delegations on any attribute
modification") introduced code to handle breaking
delegations. This can result in notify_change() being re-called. attr
_must_ be explicitly reset to avoid triggering the BUG() established
in #2.
4.) The path that that triggers this is via fs/open.c:chmod_common().
The combination of attr flags set here and in the first call to
notify_change() along with a later failed break_deleg_wait()
results in notify_change() being called again via retry_deleg
without resetting attr.
Solution is to move retry_deleg in chmod_common() a bit further up to
ensure attr is completely reset.
There are other places where this seemingly could occur, such as
fs/utimes.c:utimes_common(), but the attr flags are not initially
set in such a way to trigger this.
Fixes: 27ac0ffeac ("locks: break delegations on any attribute modification")
Reported-by: Eric Meddaugh <etmsys@rit.edu>
Tested-by: Eric Meddaugh <etmsys@rit.edu>
Signed-off-by: Andrew Elble <aweits@rit.edu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b70b825802 upstream.
This mouse is also known under other IDs. It needs the quirk or will disconnect
in runlevel 1 or 3.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a2c1d53185 upstream.
The return values of create_singlethread_workqueue() and
power_supply_register() calls were not checked and even on error probe()
function returned 0.
1. If allocation of workqueue failed (returning NULL) then further
accesses could lead to NULL pointer dereference. The
queue_delayed_work() expects workqueue to be non-NULL.
2. If registration of power supply failed then during unbind the driver
tried to unregister power supply which was not actually registered.
This could lead to memory corruption because
power_supply_unregister() unconditionally cleans up given power
supply.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: 00a588f9d2 ("power: add driver for battery reading on iPaq h3xxx")
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f852ec461e upstream.
Driver allocates singlethread workqueue in probe but it is not destroyed
during removal.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: 00a588f9d2 ("power: add driver for battery reading on iPaq h3xxx")
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7117f81e8 upstream.
Driver forgot to unregister charger power supply if registering of
battery supply failed in probe(). In such case the memory associated
with power supply leaked.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: 98a2766493 ("power_supply: Add new lp8788 charger driver")
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68c3ed6fa7 upstream.
The return value of power_supply_register() call was not checked and
even on error probe() function returned 0. If registering failed then
during unbind the driver tried to unregister power supply which was not
actually registered.
This could lead to memory corruption because power_supply_unregister()
unconditionally cleans up given power supply.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: da0a00ebc2 ("power: Add twl4030_madc battery driver.")
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80a9b64e2c upstream.
It has come to my attention that this_cpu_read/write are horrible on
architectures other than x86. Worse yet, they actually disable
preemption or interrupts! This caused some unexpected tracing results
on ARM.
101.356868: preempt_count_add <-ring_buffer_lock_reserve
101.356870: preempt_count_sub <-ring_buffer_lock_reserve
The ring_buffer_lock_reserve has recursion protection that requires
accessing a per cpu variable. But since preempt_disable() is traced, it
too got traced while accessing the variable that is suppose to prevent
recursion like this.
The generic version of this_cpu_read() and write() are:
#define this_cpu_generic_read(pcp) \
({ typeof(pcp) ret__; \
preempt_disable(); \
ret__ = *this_cpu_ptr(&(pcp)); \
preempt_enable(); \
ret__; \
})
#define this_cpu_generic_to_op(pcp, val, op) \
do { \
unsigned long flags; \
raw_local_irq_save(flags); \
*__this_cpu_ptr(&(pcp)) op val; \
raw_local_irq_restore(flags); \
} while (0)
Which is unacceptable for locations that know they are within preempt
disabled or interrupt disabled locations.
Paul McKenney stated that __this_cpu_() versions produce much better code on
other architectures than this_cpu_() does, if we know that the call is done in
a preempt disabled location.
I also changed the recursive_unlock() to use two local variables instead
of accessing the per_cpu variable twice.
Link: http://lkml.kernel.org/r/20150317114411.GE3589@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20150317104038.312e73d1@gandalf.local.home
Acked-by: Christoph Lameter <cl@linux.com>
Reported-by: Uwe Kleine-Koenig <u.kleine-koenig@pengutronix.de>
Tested-by: Uwe Kleine-Koenig <u.kleine-koenig@pengutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1915a718b1 upstream.
The return value of power_supply_register() call was not checked and
even on error probe() function returned 0. If registering failed then
during unbind the driver tried to unregister power supply which was not
actually registered.
This could lead to memory corruption because power_supply_unregister()
unconditionally cleans up given power supply.
Fix this by checking return status of power_supply_register() call. In
case of failure, clean up sysfs entries and fail the probe.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: 9be0fcb5ed ("compal-laptop: add JHL90, battery & hwmon interface")
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad774702f1 upstream.
The commit c2be45f09b ("compal-laptop: Use
devm_hwmon_device_register_with_groups") wanted to change the
registering of hwmon device to resource-managed version. It mostly did
it except the main thing - it forgot to use devm-like function so the
hwmon device leaked after device removal or probe failure.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: c2be45f09b ("compal-laptop: Use devm_hwmon_device_register_with_groups")
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f20fbaad76 upstream.
`spidev_message()` sums the lengths of the individual SPI transfers to
determine the overall SPI message length. It restricts the total
length, returning an error if too long, but it does not check for
arithmetic overflow. For example, if the SPI message consisted of two
transfers and the first has a length of 10 and the second has a length
of (__u32)(-1), the total length would be seen as 9, even though the
second transfer is actually very long. If the second transfer specifies
a null `rx_buf` and a non-null `tx_buf`, the `copy_from_user()` could
overrun the spidev's pre-allocated tx buffer before it reaches an
invalid user memory address. Fix it by checking that neither the total
nor the individual transfer lengths exceed the maximum allowed value.
Thanks to Dan Carpenter for reporting the potential integer overflow.
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f511ab09df upstream.
They are used to decide if the controller can do DMA on a buffer
of a specific length and thus are needed before any transfer is attempted.
This fixes a memory leak where the SPI core uses the drivers can_dma()
callback to determine if a buffer needs to be mapped. As the watermark
levels aren't correct at that point the driver falsely claims to be able to
DMA the buffer when it fact it isn't.
After the transfer has been done the core uses the same callback to
determine if it needs to unmap the buffers. As the driver now correctly
claims to not being able to DMA the buffer the core doesn't attempt to
unmap the buffer which leaves the SGT leaking.
Fixes: f62caccd12 (spi: spi-imx: add DMA support)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e71c589e4 upstream.
The reset control for the sunxi mmc controller is optional. Some
newer platforms (sun6i, sun8i, sun9i) have it, while older ones
(sun4i, sun5i, sun7i) don't.
Use the properly stubbed _optional version so the driver does not
fail to compile when RESET_CONTROLLER=n.
This patch also adds a check for deferred probing on the reset
control.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: David Lanzendörfer <david.lanzendoerfer@o2s.ch>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 323ece54e0 upstream.
Values directly from descriptors given in debug statements
must be converted to native endianness.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a7d95f95c upstream.
On architectures that depend on DT for obtaining cache hierarcy, we need
to validate the device node for all the cache indices, failing to do so
might result in wrong information being exposed to the userspace.
This is quite possible on initial/incomplete versions of the device
trees. In such cases, it's better to bail out if all the required device
nodes are not present.
This patch adds checks for the validation of device node for all the
caches and doesn't initialise the cacheinfo if there's any error.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 16b8528d20 upstream.
We only want to steer the I/O completion towards a queue, but don't
actually access any per-CPU data, so the raw_ version is fine to use
and avoids the warnings when using smp_processor_id().
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Andy Lutomirski <luto@kernel.org>
Tested-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Sumit Saxena <sumit.saxena@avagotech.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 47d68979cc upstream.
Since commit 20d0189b10
in v3.14-rc1 RAID0 has performed incorrect calculations
when the chunksize is not a power of 2.
This happens because "sector_div()" modifies its first argument, but
this wasn't taken into account in the patch.
So restore that first arg before re-using the variable.
Reported-by: Joe Landman <joe.landman@gmail.com>
Reported-by: Dave Chinner <david@fromorbit.com>
Fixes: 20d0189b10
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a6388e6832 upstream.
Information for packet type is in ieee80211_tx_info
band IEEE80211_BAND_5GHZ for PK_TYPE_11A.
IEEE80211_TX_RC_USE_CTS_PROTECT via tx_rate flags selects PK_TYPE_11GB
This ensures that the packet is always the right type.
Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e43c9c75f upstream.
The android_fence_release() function checks for active sync points
by calling list_empty() on the list head embedded on the sync
point. However, it is only valid to use list_empty() on nodes that
have been initialized with INIT_LIST_HEAD() or list_del_init().
Because the list entry has likely been removed from the active list
by sync_timeline_signal(), there is a good chance that this
WARN_ON_ONCE() will be hit due to dangling pointers pointing at
freed memory (even though the sync drivers did nothing wrong)
and memory corruption will ensue as the list entry is removed for
a second time, corrupting the active list.
This problem can be reproduced quite easily with CONFIG_DEBUG_LIST=y
and fences with more than one sync point.
Signed-off-by: Alistair Strachan <alistair.strachan@imgtec.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Colin Cross <ccross@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c20d92dad upstream.
the lcd type as defined in the Kconfig is not matching in the code.
as a result the rs, rw and en pins were getting interchanged.
Kconfig defines the value of PANEL_LCD to be 1 if we select custom
configuration but in the code LCD_TYPE_CUSTOM is defined as 5.
my hardware is LCD_TYPE_CUSTOM, but the pins were assigned to it
as pins of LCD_TYPE_OLD, and it was not working.
Now values are corrected with referenece to the values defined in
Kconfig and it is working.
checked on JHD204A lcd with LCD_TYPE_CUSTOM configuration.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6eae35485b upstream.
When emulating a regular lh/lw/lhu/sh/sw we need to use the appropriate
instruction if we are in EVA mode. This is necessary for userspace
applications which trigger alignment exceptions. In such case, the
userspace load/store instruction needs to be emulated with the correct
eva/non-eva instruction by the kernel emulator.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Fixes: c1771216ab ("MIPS: kernel: unaligned: Handle unaligned accesses for EVA")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9503/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eeb5389503 upstream.
Commit c1771216ab ("MIPS: kernel: unaligned: Handle unaligned
accesses for EVA") allowed unaligned accesses to be emulated for
EVA. However, when emulating regular load/store unaligned accesses,
we need to use the appropriate "address space" instructions for that.
Previously, an unaligned load/store instruction in kernel space would
have used the corresponding EVA instructions to emulate it which led to
segmentation faults because of the address translation that happens
with EVA instructions. This is now fixed by using the EVA instruction
only when emulating EVA unaligned accesses.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Fixes: c1771216ab ("MIPS: kernel: unaligned: Handle unaligned accesses for EVA")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9501/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f7f8aea4b9 upstream.
memsize denotes the amount of RAM we can access from kseg{0,1} and
that should be up to 256M. In case the bootloader reports a value
higher than that (perhaps reporting all the available RAM) it's best
if we fix it ourselves and just warn the user about that. This is
usually a problem with the bootloader and/or its environment.
[ralf@linux-mips.org: Remove useless parens as suggested bei Sergei.
Reformat long pr_warn statement to fit into 80 column limit.]
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9362/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit acaf6a97d6 upstream.
The lose_fpu() function only disables the FPU in CP0_Status.CU1 if the
FPU is in use and MSA isn't enabled.
This isn't necessarily a problem because KSTK_STATUS(current), the
version of CP0_Status stored on the kernel stack on entry from user
mode, does always get updated and gets restored when returning to user
mode, but I don't think it was intended, and it is inconsistent with the
case of only the FPU being in use. Sometimes leaving the FPU enabled may
also mask kernel bugs where FPU operations are executed when the FPU
might not be enabled.
So lets disable the FPU in the MSA case too.
Fixes: 33c771ba5c ("MIPS: save/disable MSA in lose_fpu")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9323/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98119ad533 upstream.
Guest user mode can generate a guest MSA Disabled exception on an MSA
capable core by simply trying to execute an MSA instruction. Since this
exception is unknown to KVM it will be passed on to the guest kernel.
However guest Linux kernels prior to v3.15 do not set up an exception
handler for the MSA Disabled exception as they don't support any MSA
capable cores. This results in a guest OS panic.
Since an older processor ID may be being emulated, and MSA support is
not advertised to the guest, the correct behaviour is to generate a
Reserved Instruction exception in the guest kernel so it can send the
guest process an illegal instruction signal (SIGILL), as would happen
with a non-MSA-capable core.
Fix this as minimally as reasonably possible by preventing
kvm_mips_check_privilege() from relaying MSA Disabled exceptions from
guest user mode to the guest kernel, and handling the MSA Disabled
exception by emulating a Reserved Instruction exception in the guest,
via a new handle_msa_disabled() KVM callback.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 085e68eeaf upstream.
The host's decision to enable machine check exceptions should remain
in force during non-root mode. KVM was writing 0 to cr4 on VCPU reset
and passed a slightly-modified 0 to the vmcs.guest_cr4 value.
Tested: Built.
On earlier version, tested by injecting machine check
while a guest is spinning.
Before the change, if guest CR4.MCE==0, then the machine check is
escalated to Catastrophic Error (CATERR) and the machine dies.
If guest CR4.MCE==1, then the machine check causes VMEXIT and is
handled normally by host Linux. After the change, injecting a machine
check causes normal Linux machine check handling.
Signed-off-by: Ben Serebrin <serebrin@google.com>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd1d0ddf2a upstream.
When userland injects a SPI via the KVM_IRQ_LINE ioctl we currently
only check it against a fixed limit, which historically is set
to 127. With the new dynamic IRQ allocation the effective limit may
actually be smaller (64).
So when now a malicious or buggy userland injects a SPI in that
range, we spill over on our VGIC bitmaps and bytemaps memory.
I could trigger a host kernel NULL pointer dereference with current
mainline by injecting some bogus IRQ number from a hacked kvmtool:
-----------------
....
DEBUG: kvm_vgic_inject_irq(kvm, cpu=0, irq=114, level=1)
DEBUG: vgic_update_irq_pending(kvm, cpu=0, irq=114, level=1)
DEBUG: IRQ #114 still in the game, writing to bytemap now...
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ffffffc07652e000
[00000000] *pgd=00000000f658b003, *pud=00000000f658b003, *pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 1053 Comm: lkvm-msi-irqinj Not tainted 4.0.0-rc7+ #3027
Hardware name: FVP Base (DT)
task: ffffffc0774e9680 ti: ffffffc0765a8000 task.ti: ffffffc0765a8000
PC is at kvm_vgic_inject_irq+0x234/0x310
LR is at kvm_vgic_inject_irq+0x30c/0x310
pc : [<ffffffc0000ae0a8>] lr : [<ffffffc0000ae180>] pstate: 80000145
.....
So this patch fixes this by checking the SPI number against the
actual limit. Also we remove the former legacy hard limit of
127 in the ioctl code.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
[maz: wrap KVM_ARM_IRQ_GIC_MAX with #ifndef __KERNEL__,
as suggested by Christopher Covington]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ca3f087472 upstream.
kvm_write_guest_cached() does not mark all written pages as dirty and
code comments in kvm_gfn_to_hva_cache_init() talk about NULL memslot
with cross page accesses. Fix all the easy way.
The check is '<= 1' to have the same result for 'len = 0' cache anywhere
in the page. (nr_pages_needed is 0 on page boundary.)
Fixes: 8f964525a1 ("KVM: Allow cross page reads and writes from cached translations.")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Message-Id: <20150408121648.GA3519@potion.brq.redhat.com>
Reviewed-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d744194956 upstream.
Sebastian reported a crash caused by a jump label mismatch after resume.
This happens because we do not save the kernel text section during suspend
and therefore also do not restore it during resume, but use the kernel image
that restores the old system.
This means that after a suspend/resume cycle we lost all modifications done
to the kernel text section.
The reason for this is the pfn_is_nosave() function, which incorrectly
returns that read-only pages don't need to be saved. This is incorrect since
we mark the kernel text section read-only.
We still need to make sure to not save and restore pages contained within
NSS and DCSS segment.
To fix this add an extra case for the kernel text section and only save
those pages if they are not contained within an NSS segment.
Fixes the following crash (and the above bugs as well):
Jump label code mismatch at netif_receive_skb_internal+0x28/0xd0
Found: c0 04 00 00 00 00
Expected: c0 f4 00 00 00 11
New: c0 04 00 00 00 00
Kernel panic - not syncing: Corrupted kernel text
CPU: 0 PID: 9 Comm: migration/0 Not tainted 3.19.0-01975-gb1b096e70f23 #4
Call Trace:
[<0000000000113972>] show_stack+0x72/0xf0
[<000000000081f15e>] dump_stack+0x6e/0x90
[<000000000081c4e8>] panic+0x108/0x2b0
[<000000000081be64>] jump_label_bug.isra.2+0x104/0x108
[<0000000000112176>] __jump_label_transform+0x9e/0xd0
[<00000000001121e6>] __sm_arch_jump_label_transform+0x3e/0x50
[<00000000001d1136>] multi_cpu_stop+0x12e/0x170
[<00000000001d1472>] cpu_stopper_thread+0xb2/0x168
[<000000000015d2ac>] smpboot_thread_fn+0x134/0x1b0
[<0000000000158baa>] kthread+0x10a/0x110
[<0000000000824a86>] kernel_thread_starter+0x6/0xc
Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 94aa033efc upstream.
This fixes a bug introduced with commit c05c4186bb ("KVM: s390:
add floating irq controller").
get_all_floating_irqs() does copy_to_user() while holding
a spin lock. Let's fix this by filling a temporary buffer
first and copy it to userspace after giving up the lock.
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15462e37ca upstream.
The reinjection of an I/O interrupt can fail if the list is at the limit
and between the dequeue and the reinjection, another I/O interrupt is
injected (e.g. if user space floods kvm with I/O interrupts).
This patch avoids this memory leak and returns -EFAULT in this special
case. This error is not recoverable, so let's fail hard. This can later
be avoided by not dequeuing the interrupt but working directly on the
locked list.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 261520dcfc upstream.
If the I/O interrupt could not be written to the guest provided
area (e.g. access exception), a program exception was injected into the
guest but "inti" wasn't freed, therefore resulting in a memory leak.
In addition, the I/O interrupt wasn't reinjected. Therefore the dequeued
interrupt is lost.
This patch fixes the problem while cleaning up the function and making the
cc and rc logic easier to handle.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb132ccbde upstream.
Function-specific setup requests should be handled in such a way, that
apart from filling in the data buffer, the requests are also actually
enqueued: if function-specific setup is called from composte_setup(),
the "usb_ep_queue()" block of code in composite_setup() is skipped.
The printer function lacks this part and it results in e.g. get device id
requests failing: the host expects some response, the device prepares it
but does not equeue it for sending to the host, so the host finally asserts
timeout.
This patch adds enqueueing the prepared responses.
Fixes: 2e87edf492: "usb: gadget: make g_printer use composite"
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 50c6a665b3 upstream.
Commit dc6c9a35b6 ("mm: account pmd page tables to the process")
added a counter that is incremented whenever a PMD is allocated and
decremented whenever a PMD is freed. For hugepages on PPC, common code
is used to allocated PMDs, but arch-specific code is used to free PMDs.
This results in kernel output such as "BUG: non-zero nr_pmds on freeing
mm: 1" when using hugepages.
Update the PPC hugepage PMD freeing code to decrement the count, just
as the above commit did for free_pmd_range().
Fixes: dc6c9a35b6 ("mm: account pmd page tables to the process")
Signed-off-by: Scott Wood <scottwood@freescale.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 113e828386 upstream.
If we pass a length of 0 to the extent_same ioctl, we end up locking an
extent range with a start offset greater then its end offset (if the
destination file's offset is greater than zero). This results in a warning
from extent_io.c:insert_state through the following call chain:
btrfs_extent_same()
btrfs_double_lock()
lock_extent_range()
lock_extent(inode->io_tree, offset, offset + len - 1)
lock_extent_bits()
__set_extent_bit()
insert_state()
--> WARN_ON(end < start)
This leads to an infinite loop when evicting the inode. This is the same
problem that my previous patch titled
"Btrfs: fix inode eviction infinite loop after cloning into it" addressed
but for the extent_same ioctl instead of the clone ioctl.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ccccf3d672 upstream.
If we attempt to clone a 0 length region into a file we can end up
inserting a range in the inode's extent_io tree with a start offset
that is greater then the end offset, which triggers immediately the
following warning:
[ 3914.619057] WARNING: CPU: 17 PID: 4199 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
[ 3914.620886] BTRFS: end < start 4095 4096
(...)
[ 3914.638093] Call Trace:
[ 3914.638636] [<ffffffff81425fd9>] dump_stack+0x4c/0x65
[ 3914.639620] [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
[ 3914.640789] [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
[ 3914.642041] [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
[ 3914.643236] [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
[ 3914.644441] [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
[ 3914.645711] [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
[ 3914.646914] [<ffffffff8142b2fb>] ? _raw_spin_unlock+0x28/0x33
[ 3914.648058] [<ffffffffa03cbac4>] ? test_range_bit+0xcc/0xde [btrfs]
[ 3914.650105] [<ffffffffa03cb3c3>] lock_extent+0x13/0x15 [btrfs]
[ 3914.651361] [<ffffffffa03db39e>] lock_extent_range+0x3d/0xcd [btrfs]
[ 3914.652761] [<ffffffffa03de1fe>] btrfs_ioctl_clone+0x278/0x388 [btrfs]
[ 3914.654128] [<ffffffff811226dd>] ? might_fault+0x58/0xb5
[ 3914.655320] [<ffffffffa03e0909>] btrfs_ioctl+0xb51/0x2195 [btrfs]
(...)
[ 3914.669271] ---[ end trace 14843d3e2e622fc1 ]---
This later makes the inode eviction handler enter an infinite loop that
keeps dumping the following warning over and over:
[ 3915.117629] WARNING: CPU: 22 PID: 4228 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
[ 3915.119913] BTRFS: end < start 4095 4096
(...)
[ 3915.137394] Call Trace:
[ 3915.137913] [<ffffffff81425fd9>] dump_stack+0x4c/0x65
[ 3915.139154] [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
[ 3915.140316] [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
[ 3915.141505] [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
[ 3915.142709] [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
[ 3915.143849] [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
[ 3915.145120] [<ffffffffa038c1e3>] ? btrfs_kill_super+0x17/0x23 [btrfs]
[ 3915.146352] [<ffffffff811548f6>] ? deactivate_locked_super+0x3b/0x50
[ 3915.147565] [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
[ 3915.148785] [<ffffffff8142b7e2>] ? _raw_write_unlock+0x28/0x33
[ 3915.149931] [<ffffffffa03bc325>] btrfs_evict_inode+0x196/0x482 [btrfs]
[ 3915.151154] [<ffffffff81168904>] evict+0xa0/0x148
[ 3915.152094] [<ffffffff811689e5>] dispose_list+0x39/0x43
[ 3915.153081] [<ffffffff81169564>] evict_inodes+0xdc/0xeb
[ 3915.154062] [<ffffffff81154418>] generic_shutdown_super+0x49/0xef
[ 3915.155193] [<ffffffff811546d1>] kill_anon_super+0x13/0x1e
[ 3915.156274] [<ffffffffa038c1e3>] btrfs_kill_super+0x17/0x23 [btrfs]
(...)
[ 3915.167404] ---[ end trace 14843d3e2e622fc2 ]---
So just bail out of the clone ioctl if the length of the region to clone
is zero, without locking any extent range, in order to prevent this issue
(same behaviour as a pwrite with a 0 length for example).
This is trivial to reproduce. For example, the steps for the test I just
made for fstests:
mkfs.btrfs -f SCRATCH_DEV
mount SCRATCH_DEV $SCRATCH_MNT
touch $SCRATCH_MNT/foo
touch $SCRATCH_MNT/bar
$CLONER_PROG -s 0 -d 4096 -l 0 $SCRATCH_MNT/foo $SCRATCH_MNT/bar
umount $SCRATCH_MNT
A test case for fstests follows soon.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dcc82f4783 upstream.
While committing a transaction we free the log roots before we write the
new super block. Freeing the log roots implies marking the disk location
of every node/leaf (metadata extent) as pinned before the new super block
is written. This is to prevent the disk location of log metadata extents
from being reused before the new super block is written, otherwise we
would have a corrupted log tree if before the new super block is written
a crash/reboot happens and the location of any log tree metadata extent
ended up being reused and rewritten.
Even though we pinned the log tree's metadata extents, we were issuing a
discard against them if the fs was mounted with the -o discard option,
resulting in corruption of the log tree if a crash/reboot happened before
writing the new super block - the next time the fs was mounted, during
the log replay process we would find nodes/leafs of the log btree with
a content full of zeroes, causing the process to fail and require the
use of the tool btrfs-zero-log to wipeout the log tree (and all data
previously fsynced becoming lost forever).
Fix this by not doing a discard when pinning an extent. The discard will
be done later when it's safe (after the new super block is committed) at
extent-tree.c:btrfs_finish_extent_commit().
Fixes: e688b7252f (Btrfs: fix extent pinning bugs in the tree log)
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d869844bd0 upstream.
Commit cae2a173fe ("x86: clean up/fix 'copy_in_user()' tail zeroing")
fixed the failure case tail zeroing of one special case of the x86-64
generic user-copy routine, namely when used for the user-to-user case
("copy_in_user()").
But in the process it broke an even more unusual case: using the user
copy routine for kernel-to-kernel copying.
Now, normally kernel-kernel copies are obviously done using memcpy(),
but we have a couple of special cases when we use the user-copy
functions. One is when we pass a kernel buffer to a regular user-buffer
routine, using set_fs(KERNEL_DS). That's a "normal" case, and continued
to work fine, because it never takes any faults (with the possible
exception of a silent and successful vmalloc fault).
But Jan Beulich pointed out another, very unusual, special case: when we
use the user-copy routines not because it's a path that expects a user
pointer, but for a couple of ftrace/kgdb cases that want to do a kernel
copy, but do so using "unsafe" buffers, and use the user-copy routine to
gracefully handle faults. IOW, for probe_kernel_write().
And that broke for the case of a faulting kernel destination, because we
saw the kernel destination and wanted to try to clear the tail of the
buffer. Which doesn't work, since that's what faults.
This only triggers for things like kgdb and ftrace users (eg trying
setting a breakpoint on read-only memory), but it's definitely a bug.
The fix is to not compare against the kernel address start (TASK_SIZE),
but instead use the same limits "access_ok()" uses.
Reported-and-tested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 517e6341fa upstream.
Ingo reported that cycles:pp didn't work for him on some machines.
It turns out that in this commit:
af4bdcf675 perf/x86/intel: Disallow flags for most Core2/Atom/Nehalem/Westmere events
Andi forgot to explicitly allow that event when he
disabled event flags for PEBS on those uarchs.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: af4bdcf675 ("perf/x86/intel: Disallow flags for most Core2/Atom/Nehalem/Westmere events")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b253149b84 upstream.
In Linux-3.9 we removed the mwait_idle() loop:
69fb3676df ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param")
The reasoning was that modern machines should be sufficiently
happy during the boot process using the default_idle() HALT
loop, until cpuidle loads and either acpi_idle or intel_idle
invoke the newer MWAIT-with-hints idle loop.
But two machines reported problems:
1. Certain Core2-era machines support MWAIT-C1 and HALT only.
MWAIT-C1 is preferred for optimal power and performance.
But if they support just C1, cpuidle never loads and
so they use the boot-time default idle loop forever.
2. Some laptops will boot-hang if HALT is used,
but will boot successfully if MWAIT is used.
This appears to be a hidden assumption in BIOS SMI,
that is presumably valid on the proprietary OS
where the BIOS was validated.
https://bugzilla.kernel.org/show_bug.cgi?id=60770
So here we effectively revert the patch above, restoring
the mwait_idle() loop. However, we don't bother restoring
the idle=mwait cmdline parameter, since it appears to add
no value.
Maintainer notes:
For 3.9, simply revert 69fb3676df
for 3.10, patch -F3 applies, fuzz needed due to __cpuinit use in
context For 3.11, 3.12, 3.13, this patch applies cleanly
Tested-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Mike Galbraith <bitbucket@online.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Malone <ibmalone@gmail.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/345254a551eb5a6a866e048d7ab570fd2193aca4.1389763084.git.len.brown@intel.com
[ Ported to recent kernels. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80f7fdb1c7 upstream.
If we were migrated right after __getcpu, but before reading the
migration_count, we wouldn't notice that we read TSC of a different
VCPU, nor that KVM's bug made pvti invalid, as only migration_count
on source VCPU is increased.
Change vdso instead of updating migration_count on destination.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Fixes: 0a4e6be9ca ("x86: kvm: Revert "remove sched notifier for cross-cpu migrations"")
Message-Id: <1428000263-11892-1-git-send-email-rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a4e6be9ca upstream.
The following point:
2. per-CPU pvclock time info is updated if the
underlying CPU changes.
Is not true anymore since "KVM: x86: update pvclock area conditionally,
on cpu migration".
Add task migration notification back.
Problem noticed by Andy Lutomirski.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91e5ed49fc upstream.
x86 instructions cannot exceed 15 bytes, and the instruction
decoder should enforce that. Prior to 6ba48ff46f, the
instruction length limit was implicitly set to 16, which was an
approximation of 15, but there is currently no limit at all.
Fix MAX_INSN_SIZE (it should be 15, not 16), and fix the decoder
to reject instructions that exceed MAX_INSN_SIZE.
Other than potentially confusing some of the decoder sanity
checks, I'm not aware of any actual problems that omitting this
check would cause, nor am I aware of any practical problems
caused by the MAX_INSN_SIZE error.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Fixes: 6ba48ff46f ("x86: Remove arbitrary instruction size limit ...
Link: http://lkml.kernel.org/r/f8f0bc9b8c58cfd6830f7d88400bf1396cbdcd0f.1422403511.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 74672d069b upstream.
Simon reported the md io stats accounting issue:
"
I'm seeing "iostat -x -k 1" print this after a RAID1 rebuild on 4.0-rc5.
It's not abnormal other than it's 3-disk, with one being SSD (sdc) and
the other two being write-mostly:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 345.00 0.00 0.00 0.00 0.00 100.00
md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 58779.00 0.00 0.00 0.00 0.00 100.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 12.00 0.00 0.00 0.00 0.00 100.00
"
The cause is commit "18c0b223cf9901727ef3b02da6711ac930b4e5d4" uses the
generic_start_io_acct to account the disk stats rather than the open code,
but it also introduced the increase to .in_flight[rw] which is needless to
md. So we re-use the open code here to fix it.
Reported-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b37069090b ]
mlx4_en_check_rxfh_func() was checking for hardware support before
setting a known RSS hash function, but didn't do any check before
setting unknown RSS hash function. Need to make it fail on such values.
In this occasion, moved the actual setting of the new value from the
check function into mlx4_en_set_rxfh().
Fixes: 947cbb0 ("net/mlx4_en: Support for configurable RSS hash function")
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a31196b07f ]
Commit 567e4b7973 ("net: rfs: add hash collision detection") had one
mistake :
RPS_NO_CPU is no longer the marker for invalid cpu in set_rps_cpu()
and get_rps_cpu(), as @next_cpu was the result of an AND with
rps_cpu_mask
This bug showed up on a host with 72 cpus :
next_cpu was 0x7f, and the code was trying to access percpu data of an
non existent cpu.
In a follow up patch, we might get rid of compares against nr_cpu_ids,
if we init the tables with 0. This is silly to test for a very unlikely
condition that exists only shortly after table initialization, as
we got rid of rps_reset_sock_flow() and similar functions that were
writing this RPS_NO_CPU magic value at flow dismantle : When table is
old enough, it never contains this value anymore.
Fixes: 567e4b7973 ("net: rfs: add hash collision detection")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0e03fd3e33 ]
Commit 43d3ddf87a ("net: pxa168_eth: add device tree support") starts
to use managed resources by adding devm_clk_get() and
devm_ioremap_resource(), but it leaves explicit iounmap() and clock_put()
in pxa168_eth_remove() and in failure handling code of pxa168_eth_probe().
As a result double free can happen.
The patch removes explicit resource deallocation. Also it converts
clk_disable() to clk_disable_unprepare() to make it symmetrical with
clk_prepare_enable().
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79930f5892 ]
build_skb() should look at the page pfmemalloc status.
If set, this means page allocator allocated this page in the
expectation it would help to free other pages. Networking
stack can do that only if skb->pfmemalloc is also set.
Also, we must refrain using high order pages from the pfmemalloc
reserve, so __page_frag_refill() must also use __GFP_NOMEMALLOC for
them. Under memory pressure, using order-0 pages is probably the best
strategy.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 845704a535 ]
Presence of an unbound loop in tcp_send_fin() had always been hard
to explain when analyzing crash dumps involving gigantic dying processes
with millions of sockets.
Lets try a different strategy :
In case of memory pressure, try to add the FIN flag to last packet
in write queue, even if packet was already sent. TCP stack will
be able to deliver this FIN after a timeout event. Note that this
FIN being delivered by a retransmit, it also carries a Push flag
given our current implementation.
By checking sk_under_memory_pressure(), we anticipate that cooking
many FIN packets might deplete tcp memory.
In the case we could not allocate a packet, even with __GFP_WAIT
allocation, then not sending a FIN seems quite reasonable if it allows
to get rid of this socket, free memory, and not block the process from
eventually doing other useful work.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d83769a580 ]
Using sk_stream_alloc_skb() in tcp_send_fin() is dangerous in
case a huge process is killed by OOM, and tcp_mem[2] is hit.
To be able to free memory we need to make progress, so this
patch allows FIN packets to not care about tcp_mem[2], if
skb allocation succeeded.
In a follow-up patch, we might abort tcp_send_fin() infinite loop
in case TIF_MEMDIE is set on this thread, as memory allocator
did its best getting extra memory already.
This patch reverts d22e153718 ("tcp: fix tcp fin memory accounting")
Fixes: d22e153718 ("tcp: fix tcp fin memory accounting")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3dfb05340e ]
Call checksum_complete_unset in PPP receive to discard checksum-complete
value. PPP does not pull checksum for headers and also modifies packet
as in VJ compression.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4e18b9adf2 ]
This function changes ip_summed to CHECKSUM_NONE if CHECKSUM_COMPLETE
is set. This is called to discard checksum-complete when packet
is being modified and checksum is not pulled for headers in a layer.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2ab957492d ]
Initial discussion was:
[FYI] xfrm: Don't lookup sk_policy for timewait sockets
Forwarded frames should not have a socket attached. Especially
tw sockets will lead to panics later-on in the stack.
This was observed with TPROXY assigning a tw socket and broken
policy routing (misconfigured). As a result frame enters
forwarding path instead of input. We cannot solve this in
TPROXY as it cannot know that policy routing is broken.
v2:
Remove useless comment
Signed-off-by: Sebastian Poehn <sebastian.poehn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b01fc86b9 upstream.
This prevents a race between chown() and execve(), where chowning a
setuid-user binary to root would momentarily make the binary setuid
root.
This patch was mostly written by Linus Torvalds.
Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 213dd74aee ]
On Wed, Apr 15, 2015 at 05:41:26PM +0200, Nicolas Dichtel wrote:
> Le 15/04/2015 15:57, Herbert Xu a écrit :
> >On Wed, Apr 15, 2015 at 06:22:29PM +0800, Herbert Xu wrote:
> [snip]
> >Subject: skbuff: Do not scrub skb mark within the same name space
> >
> >The commit ea23192e8e ("tunnels:
> Maybe add a Fixes tag?
> Fixes: ea23192e8e ("tunnels: harmonize cleanup done on skb on rx path")
>
> >harmonize cleanup done on skb on rx path") broke anyone trying to
> >use netfilter marking across IPv4 tunnels. While most of the
> >fields that are cleared by skb_scrub_packet don't matter, the
> >netfilter mark must be preserved.
> >
> >This patch rearranges skb_scurb_packet to preserve the mark field.
> nit: s/scurb/scrub
>
> Else it's fine for me.
Sure.
PS I used the wrong email for James the first time around. So
let me repeat the question here. Should secmark be preserved
or cleared across tunnels within the same name space? In fact,
do our security models even support name spaces?
---8<---
The commit ea23192e8e ("tunnels:
harmonize cleanup done on skb on rx path") broke anyone trying to
use netfilter marking across IPv4 tunnels. While most of the
fields that are cleared by skb_scrub_packet don't matter, the
netfilter mark must be preserved.
This patch rearranges skb_scrub_packet to preserve the mark field.
Fixes: ea23192e8e ("tunnels: harmonize cleanup done on skb on rx path")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4c0ee414e8 ]
This patch reverts commit b8fb4e0648
because the secmark must be preserved even when a packet crosses
namespace boundaries. The reason is that security labels apply to
the system as a whole and is not per-namespace.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c3de6317d7 ]
Due to missing bounds check the DAG pass of the BPF verifier can corrupt
the memory which can cause random crashes during program loading:
[8.449451] BUG: unable to handle kernel paging request at ffffffffffffffff
[8.451293] IP: [<ffffffff811de33d>] kmem_cache_alloc_trace+0x8d/0x2f0
[8.452329] Oops: 0000 [#1] SMP
[8.452329] Call Trace:
[8.452329] [<ffffffff8116cc82>] bpf_check+0x852/0x2000
[8.452329] [<ffffffff8116b7e4>] bpf_prog_load+0x1e4/0x310
[8.452329] [<ffffffff811b190f>] ? might_fault+0x5f/0xb0
[8.452329] [<ffffffff8116c206>] SyS_bpf+0x806/0xa30
Fixes: f1bca824da ("bpf: add search pruning optimization to verifier")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 074975d037 ]
Commit 9a2620c877 ("bnx2x: prevent WARN during driver unload")
switched the napi/busy_lock locking mechanism from spin_lock() into
spin_lock_bh(), breaking inter-operability with netconsole, as netpoll
disables interrupts prior to calling our napi mechanism.
This switches the driver into using atomic assignments instead of the
spinlock mechanisms previously employed.
Based on initial patch from Yuval Mintz & Ariel Elior
I basically added softirq starvation avoidance, and mixture
of atomic operations, plain writes and barriers.
Note this slightly reduces the overhead for this driver when no
busy_poll sockets are in use.
Fixes: 9a2620c877 ("bnx2x: prevent WARN during driver unload")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b50edd7812 ]
I noticed tcpdump was giving funky timestamps for locally
generated SYNACK messages on loopback interface.
11:42:46.938990 IP 127.0.0.1.48245 > 127.0.0.2.23850: S
945476042:945476042(0) win 43690 <mss 65495,nop,nop,sackOK,nop,wscale 7>
20:28:58.502209 IP 127.0.0.2.23850 > 127.0.0.1.48245: S
3160535375:3160535375(0) ack 945476043 win 43690 <mss
65495,nop,nop,sackOK,nop,wscale 7>
This is because we need to clear skb->tstamp before
entering lower stack, otherwise net_timestamp_check()
does not set skb->tstamp.
Fixes: 7faee5c0d5 ("tcp: remove TCP_SKB_CB(skb)->when")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b736a623bd ]
handle_offloads() calls skb_reset_inner_headers() to store
the layer pointers to the encapsulated packet. However, we
currently push the vlag tag (if there is one) onto the packet
afterwards. This changes the MAC header for the encapsulated
packet but it is not reflected in skb->inner_mac_header, which
breaks GSO and drivers which attempt to use this for encapsulation
offloads.
Fixes: 1eaa8178 ("vxlan: Add tx-vlan offload support.")
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-29 10:22:17 +02:00
700 changed files with 6112 additions and 3277 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.