* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/accounting, proc: Fix /proc/stat interrupts sum
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tracepoints/module: Fix disabling tracepoints with taint CRAP or OOT
x86/kprobes: Add arch/x86/tools/insn_sanity to .gitignore
x86/kprobes: Fix typo transferred from Intel manual
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, syscall: Need __ARCH_WANT_SYS_IPC for 32 bits
x86, tsc: Fix SMI induced variation in quick_pit_calibrate()
x86, opcode: ANDN and Group 17 in x86-opcode-map.txt
x86/kconfig: Move the ZONE_DMA entry under a menu
x86/UV2: Add accounting for BAU strong nacks
x86/UV2: Ack BAU interrupt earlier
x86/UV2: Remove stale no-resources test for UV2 BAU
x86/UV2: Work around BAU bug
x86/UV2: Fix BAU destination timeout initialization
x86/UV2: Fix new UV2 hardware by using native UV2 broadcast mode
x86: Get rid of dubious one-bit signed bitfield
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
qnx4: don't leak ->BitMap on late failure exits
qnx4: reduce the insane nesting in qnx4_checkroot()
qnx4: di_fname is an array, for crying out loud...
vfs: remove printk from set_nlink()
wake up s_wait_unfrozen when ->freeze_fs fails
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
KEYS: Permit key_serial() to be called with a const key pointer
keys: fix user_defined key sparse messages
ima: fix cred sparse warning
MPILIB: Add a missing ENOMEM check
Permit key_serial() to be called with a const key pointer.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Replace the rcu_assign_pointer() calls with rcu_assign_keypointer().
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Fix ima_policy.c sparse "warning: dereference of noderef expression"
message, by accessing cred->uid using current_cred().
Changelog v1:
- Change __cred to just cred (based on David Howell's comment)
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Randy Dunlap reports that we get
arch/x86/um/shared/sysdep/ptrace.h:7:20: error: redefinition of 'regs_return_value'
arch/x86/um/shared/sysdep/ptrace.h:7:20: note: previous definition of 'regs_return_value' was here
when compiling UML for x86-64.
Stephen Rothwell root-caused it and says:
"Caused by commit d7e7528bcd ("Audit: push audit success and retcode
into arch ptrace.h") (another patch that was never in linux-next :-().
This file now needs protection against double inclusion."
so let's do as the man says.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Analyzed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (26 commits)
target: Set additional sense length field in sense data
target: Remove legacy device status check from transport_execute_tasks
target: Remove __transport_execute_tasks() for each processing context
target: Remove extra se_device->execute_task_lock access in fast path
target: Drop se_device TCQ queue_depth usage from I/O path
target: Fix possible NULL pointer with __transport_execute_tasks
target: Remove TFO->check_release_cmd() fabric API caller
tcm_fc: Convert ft_send_work to use target_submit_cmd
target: Add target_submit_cmd() for process context fabric submission
target: Make target_put_sess_cmd use target_release_cmd_kref
target: Set response format in INQUIRY response
target: tcm_mod_builder: small fixups
Documentation/target: Fix tcm_mod_builder.py build breakage
target: remove overagressive ____cacheline_aligned annoations
tcm_loop: bump max_sectors
target/configs: remove trailing newline from udev_path and alias
iscsi-target: fix chap identifier simple_strtoul usage
target: remove useless casts
target: simplify target_check_cdb_and_preempt
target: Move core_scsi3_check_cdb_abort_and_preempt
...
This includes initial support for the recently published ACPI 5.0 spec.
In particular, support for the "hardware-reduced" bit that eliminates
the dependency on legacy hardware.
APEI has patches resulting from testing on real hardware.
Plus other random fixes.
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (52 commits)
acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
intel_idle: Split up and provide per CPU initialization func
ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2
ACPI processor: Remove unneeded cpuidle_unregister_driver call
intel idle: Make idle driver more robust
intel_idle: Fix a cast to pointer from integer of different size warning in intel_idle
ACPI: kernel-parameters.txt : Add intel_idle.max_cstate
intel_idle: remove redundant local_irq_disable() call
ACPI processor: Fix error path, also remove sysdev link
ACPI: processor: fix acpi_get_cpuid for UP processor
intel_idle: fix API misuse
ACPI APEI: Convert atomicio routines
ACPI: Export interfaces for ioremapping/iounmapping ACPI registers
ACPI: Fix possible alignment issues with GAS 'address' references
ACPI, ia64: Use SRAT table rev to use 8bit or 16/32bit PXM fields (ia64)
ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64)
ACPI: Store SRAT table revision
ACPI, APEI, Resolve false conflict between ACPI NVS and APEI
ACPI, Record ACPI NVS regions
ACPI, APEI, EINJ, Refine the fix of resource conflict
...
This patch fixes an (ACPI S3) suspend regression introduced in commit
68d6e6713f ("tpm: Introduce function to poll for result of self test")
and occurring with an Infineon TPM and tpm_tis and tpm_infineon drivers
active.
The suspend problem occurred if the TPM was disabled and/or deactivated
and therefore the TPM_PCRRead checking the result of the (asynchronous)
self test returned an error code which then caused the tpm_tis driver to
become inactive and this then seemed to have negatively influenced the
suspend support by the tpm_infineon driver... Besides that the tpm_tis
drive may stay active even if the TPM is disabled and/or deactivated.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The type of 'make_request_fn' changed in 5a7bbad27a ("block: remove
support for bio remapping from ->make_request"), but the merge of the
nvme driver didn't take that into account, and as a result the driver
would compile with a warning:
drivers/block/nvme.c: In function 'nvme_alloc_ns':
drivers/block/nvme.c:1336:2: warning: passing argument 2 of 'blk_queue_make_request' from incompatible pointer type [enabled by default]
include/linux/blkdev.h:830:13: note: expected 'void (*)(struct request_queue *, struct bio *)' but argument is of type 'int (*)(struct request_queue *, struct bio *)'
It's benign, but the warning is annoying.
Reported-by: Stephen Rothwell <sfr@canb.auug.org>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix these warnings:
drivers/xen/biomerge.c:14:1: warning: data definition has no type or storage class [enabled by default]
drivers/xen/biomerge.c:14:1: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' [-Wimplicit-int]
drivers/xen/biomerge.c:14:1: warning: parameter names (without types) in function declaration [enabled by default]
And this build error:
ERROR: "xen_biovec_phys_mergeable" [drivers/block/nvme.ko] undefined!
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus/i2c-33' of git://git.fluff.org/bjdooks/linux:
i2c-eg20t: Change-company-name-OKI-SEMICONDUCTOR to LAPIS Semiconductor
i2c-eg20t: Support new device LAPIS Semiconductor ML7831 IOH
i2c-eg20t: modified the setting of transfer rate.
i2c-eg20t: use i2c_add_numbered_adapter to get a fixed bus number
i2c: OMAP: Add DT support for i2c controller
I2C: OMAP: NACK without STP
I2C: OMAP: correct SYSC register offset for OMAP4
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (57 commits)
[media] as3645a: Fix compilation by including slab.h
[media] s5p-fimc: Remove linux/version.h include from fimc-mdevice.c
[media] s5p-mfc: Remove linux/version.h include from s5p_mfc.c
[media] ds3000: using logical && instead of bitwise &
[media] v4l2-ctrls: make control names consistent
[media] DVB: dib0700, add support for Nova-TD LEDs
[media] DVB: dib0700, add corrected Nova-TD frontend_attach
[media] DVB: dib0700, separate stk7070pd initialization
[media] DVB: dib0700, move Nova-TD Stick to a separate set
[media] : add MODULE_FIRMWARE to dib0700
[media] DVB-CORE: remove superfluous DTV_CMDs
[media] s5p-jpeg: adapt to recent videobuf2 changes
[media] s5p-g2d: fixed a bug in controls setting function
[media] s5p-mfc: Fix volatile controls setup
[media] drivers/media/video/s5p-mfc/s5p_mfc.c: adjust double test
[media] drivers/media/video/s5p-fimc/fimc-capture.c: adjust double test
[media] s5p-fimc: Fix incorrect control ID assignment
[media] dvb_frontend: Don't call get_frontend() if idle
[media] DocBook/dvbproperty.xml: Remove DTV_MODULATION from ISDB-T
[media] DocBook/dvbproperty.xml: Fix ISDB-T delivery system parameters
...
Using the correct gpio offset for setting the initial value
of gpio when setting output direction.
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
SCSI updates on 20120118
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (49 commits)
[SCSI] libfc: remove redundant timer init for fcp
[SCSI] fcoe: Move fcoe_debug_logging from fcoe.h to fcoe.c
[SCSI] libfc: Declare local functions static
[SCSI] fcoe: fix regression on offload em matching function for initiator/target
[SCSI] qla4xxx: Update driver version to 5.02.00-k12
[SCSI] qla4xxx: Cleanup modinfo display
[SCSI] qla4xxx: Update license
[SCSI] qla4xxx: Clear the RISC interrupt bit during FW init
[SCSI] qla4xxx: Added error logging for firmware abort
[SCSI] qla4xxx: Disable generating pause frames in case of FW hung
[SCSI] qla4xxx: Temperature monitoring for ISP82XX core.
[SCSI] megaraid: fix sparse warnings
[SCSI] sg: convert to kstrtoul_from_user()
[SCSI] don't change sdev starvation list order without request dispatched
[SCSI] isci: fix, prevent port from getting stuck in the 'configuring' state
[SCSI] isci: fix start OOB
[SCSI] isci: fix io failures while wide port links are coming up
[SCSI] isci: allow more time for wide port targets
[SCSI] isci: enable wide port targets
[SCSI] isci: Fix IO fails when pull cable from phy in x4 wideport in MPC mode.
...
* git://git.infradead.org/users/willy/linux-nvme: (105 commits)
NVMe: Set number of queues correctly
NVMe: Version 0.8
NVMe: Set queue flags correctly
NVMe: Simplify nvme_unmap_user_pages
NVMe: Mark the end of the sg list
NVMe: Fix DMA mapping for admin commands
NVMe: Rename IO_TIMEOUT to NVME_IO_TIMEOUT
NVMe: Merge the nvme_bio and nvme_prp data structures
NVMe: Change nvme_completion_fn to take a dev
NVMe: Change get_nvmeq to take a dev instead of a namespace
NVMe: Simplify completion handling
NVMe: Update Identify Controller data structure
NVMe: Implement doorbell stride capability
NVMe: Version 0.7
NVMe: Don't probe namespace 0
Fix calculation of number of pages in a PRP List
NVMe: Create nvme_identify and nvme_get_features functions
NVMe: Fix memory leak in nvme_dev_add()
NVMe: Fix calls to dma_unmap_sg
NVMe: Correct sg list setup in nvme_map_user_pages
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
tg3: Fix single-vector MSI-X code
openvswitch: Fix multipart datapath dumps.
ipv6: fix per device IP snmp counters
inetpeer: initialize ->redirect_genid in inet_getpeer()
net: fix NULL-deref in WARN() in skb_gso_segment()
net: WARN if skb_checksum_help() is called on skb requiring segmentation
caif: Remove bad WARN_ON in caif_dev
caif: Fix typo in Vendor/Product-ID for CAIF modems
bnx2x: Disable AN KR work-around for BCM57810
bnx2x: Remove AutoGrEEEn for BCM84833
bnx2x: Remove 100Mb force speed for BCM84833
bnx2x: Fix PFC setting on BCM57840
bnx2x: Fix Super-Isolate mode for BCM84833
net: fix some sparse errors
net: kill duplicate included header
net: sh-eth: Fix build error by the value which is not defined
net: Use device model to get driver name in skb_gso_segment()
bridge: BH already disabled in br_fdb_cleanup()
net: move sock_update_memcg outside of CONFIG_INET
mwl8k: Fixing Sparse ENDIAN CHECK warning
...
ACPI 5.0 provides extensions to the EINJ mechanism to specify the
target for the error injection - by APICID for cpu related errors,
by address for memory related errors, and by segment/bus/device/function
for PCIe related errors. Also extensions for vendor specific error
injections.
Tested-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
V2: Fix typo: pr->handle -> pr, here: acpi_processor_hotadd_init(pr)
This is a very small part taken from patches which afaik
are coming from Yunhong Jiang (for a Xen not a Linus repo?).
Cleanup only: no functional change.
Advantage (beside cleanup) is that other data of the pr (acpi_processor) struct
in the acpi_processor_hotadd_init() is needed later, for example a newly
introduced flag:
pr->flags.need_hotplug_init
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Jiang, Yunhong <yunhong.jiang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Kdump kernels leave MSI-X interrupts (as setup by the crashed kernel)
enabled. However, kdump only enables one CPU in the new environment,
thus causing tg3 to abort MSI-X setup. When the driver attempts to
enable INTA or MSI interrupt modes on a kdump kernel, interrupt
delivery fails.
This patch attempts to workaround the problem by forcing the driver
to enable a single MSI-X interrupt. In such a configuration, the
device's multivector interrupt mode must be disabled.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The logic to split up the list of datapaths into multiple Netlink messages
was simply wrong, causing the list to be terminated after the first part.
Only about the first 50 datapaths would be dumped. This fixes the
problem.
Reported-by: Paul Ingram <paul@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 4ce3c183fc (snmp: 64bit ipstats_mib for all arches), I forgot
to change the /proc/net/dev_snmp6/xxx output for IP counters.
percpu array is 64bit per counter but the folding still used the 'long'
variant, and output garbage on 32bit arches.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ARM: fixes for ARM platforms
Some fallout from the 3.3. merge window as well as a couple bug fixes
for older preexisting bugs that seem valid to include at this time:
* sched_clock changes broke picoxcell, fix included
* BSYM bugs causing issues with thumb2-built kernels on SMP
* Missing module.h include on msm.
* A collection of bugfixes for samsung platforms that didn't make it into
the first pull requests.
* tag 'arm-soc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: make BSYM macro assembly only
ARM: highbank: remove incorrect BSYM usage
ARM: imx: remove incorrect BSYM usage
ARM: exynos: remove incorrect BSYM usage
ARM: ux500: add missing ENDPROC to headsmp.S
ARM: msm: Add missing ENDPROC to headsmp.S
ARM: versatile: Add missing ENDPROC to headsmp.S
ARM: EXYNOS: Invert VCLK polarity for framebuffer on ORIGEN
ARM: S3C64XX: Fix interrupt configuration for PCA935x on Cragganmore
ARM: S3C64XX: Fix the memory mapped GPIOs on Cragganmore
ARM: S3C64XX: Remove hsmmc1 from Cragganmore
ARM: S3C64XX: Remove unconditional power domain disables
ARM: SAMSUNG: Declare struct platform_device in plat/s3c64xx-spi.h
ARM: SAMSUNG: dma-ops.h needs mach/dma.h
ARM: SAMSUNG: Guard against multiple inclusion of plat/dma.h
ARM: picoxcell: fix sched_clock() cleanup fallout
ARM: msm: vreg is a module and so needs module.h
* 'upstream-linus' of git://github.com/jgarzik/libata-dev:
[libata] ata_piix: Add Toshiba Satellite Pro A120 to the quirks list due to broken suspend functionality.
[libata] add DVRTD08A and DVR-215 to NOSETXFER device quirk list
[libata] pata_bf54x: Support sg list in bmdma transfer.
[libata] sata_fsl: fix the controller operating mode
[libata] enable ata port async suspend
JONGMAN HEO reports:
With current linus git (commit a25a2b84), I got following build error,
arch/x86/kernel/vm86_32.c: In function 'do_sys_vm86':
arch/x86/kernel/vm86_32.c:340: error: implicit declaration of function '__audit_syscall_exit'
make[3]: *** [arch/x86/kernel/vm86_32.o] Error 1
OK, I can reproduce it (32bit allmodconfig with AUDIT=y, AUDITSYSCALL=n)
It's due to commit d7e7528bcd: "Audit: push audit success and retcode
into arch ptrace.h".
Reported-by: JONGMAN HEO <jongman.heo@samsung.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
BF54x on-chip ATAPI controller allows maximum 0x1fffe bytes to be transfered
in one ATAPI transfer. So, set the max sg_tablesize to 4.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The intent here was to test if the FE_HAS_LOCK was set. The current
test is equivalent to "if (status) { ..."
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Several control names used inconsistent capitalization or were inconsistent
in other ways. I also corrected a spelling mistake and fixed four strings
that were too long (>31 characters). Harmless, but the string is cut off when
it is returned with QUERYCTRL.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add an override of read_status to intercept lock status. This allows
us to switch LEDs appropriately on and off with signal un/locked.
The second phase is to override sleep to properly turn off both.
This is a hackish way to achieve that.
Thanks to Mike Krufky for his help.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Patrick Boettcher <pboettcher@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This means cut & paste from the former f. attach. But while at it write
to the right GPIO to turn on the right LED. Also turn the other two
off jsut for sure.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Patrick Boettcher <pboettcher@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The start is common for both stk7070pd and novatd specific routine.
This is just a preparation for the next patch.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Patrick Boettcher <pboettcher@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
To properly support the three LEDs which are on the stick, we need
a special handling in the ->frontend_attach function. Thus let's have
a separate ->frontend_attach instead of ifs in the common one.
The hadnling itself will be added in further patches.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Patrick Boettcher <pboettcher@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The dib0700 needs a binary firmware file. This patch added the MODULE_FIRMWARE-macro.
Signed-off-by: Christoph Anton Mitterer <calestyo@scientia.net>
Signed-off-by: Patrick Boettcher <pboettcher@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This small patch removes superfluous DTV_CMDs from dvb_frontend.c which were added in the initially when ISBD-T support was added.
They were there unnoticed even though compilers should have warning about those duplicates. Finally they did and now we can remove them.
Thanks to Dan Carpenter <dan.carpenter@oracle.com> for pointing that out.
Signed-off-by: Patrick Boettcher <pboettcher@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
queue_setup callback has been extended with struct v4l2_format *fmt
parameter in 2d86401c2c commit. This patch adds this parameter to
s5p-jpeg driver.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This driver does an unconditional read of io space during module init which
causes a bad dereference on ARM. It looks to me like this is an x86 only
drivers, so restrict it to only compile on x86.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Denis Turischev <denis@compulab.co.il>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit: (29 commits)
audit: no leading space in audit_log_d_path prefix
audit: treat s_id as an untrusted string
audit: fix signedness bug in audit_log_execve_info()
audit: comparison on interprocess fields
audit: implement all object interfield comparisons
audit: allow interfield comparison between gid and ogid
audit: complex interfield comparison helper
audit: allow interfield comparison in audit rules
Kernel: Audit Support For The ARM Platform
audit: do not call audit_getname on error
audit: only allow tasks to set their loginuid if it is -1
audit: remove task argument to audit_set_loginuid
audit: allow audit matching on inode gid
audit: allow matching on obj_uid
audit: remove audit_finish_fork as it can't be called
audit: reject entry,always rules
audit: inline audit_free to simplify the look of generic code
audit: drop audit_set_macxattr as it doesn't do anything
audit: inline checks for not needing to collect aux records
audit: drop some potentially inadvisable likely notations
...
Use evil merge to fix up grammar mistakes in Kconfig file.
Bad speling and horrible grammar (and copious swearing) is to be
expected, but let's keep it to commit messages and comments, rather than
expose it to users in config help texts or printouts.
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: cleanup xfs_file_aio_write
xfs: always return with the iolock held from xfs_file_aio_write_checks
xfs: remove the i_new_size field in struct xfs_inode
xfs: remove the i_size field in struct xfs_inode
xfs: replace i_pin_wait with a bit waitqueue
xfs: replace i_flock with a sleeping bitlock
xfs: make i_flags an unsigned long
xfs: remove the if_ext_max field in struct xfs_ifork
xfs: remove the unused dm_attrs structure
xfs: cleanup xfs_iomap_eof_align_last_fsb
xfs: remove xfs_itruncate_data
* 'btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
btrfs: take allocation of ->tree_root into open_ctree()
btrfs: let ->s_fs_info point to fs_info, not root...
btrfs: consolidate failure exits in btrfs_mount() a bit
btrfs: make free_fs_info() call ->kill_sb() unconditional
btrfs: merge free_fs_info() calls on fill_super failures
btrfs: kill pointless reassignment of ->s_fs_info in btrfs_fill_super()
btrfs: make open_ctree() return int
btrfs: sanitizing ->fs_info, part 5
btrfs: sanitizing ->fs_info, part 4
btrfs: sanitizing ->fs_info, part 3
btrfs: sanitizing ->fs_info, part 2
btrfs: sanitizing ->fs_info, part 1
btrfs: fix a deadlock in btrfs_scan_one_device()
btrfs: fix mount/umount race
btrfs: get ->kill_sb() of its own
btrfs: preparation to fixing mount/umount race
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (62 commits)
Btrfs: use larger system chunks
Btrfs: add a delalloc mutex to inodes for delalloc reservations
Btrfs: space leak tracepoints
Btrfs: protect orphan block rsv with spin_lock
Btrfs: add allocator tracepoints
Btrfs: don't call btrfs_throttle in file write
Btrfs: release space on error in page_mkwrite
Btrfs: fix btrfsck error 400 when truncating a compressed
Btrfs: do not use btrfs_end_transaction_throttle everywhere
Btrfs: add balance progress reporting
Btrfs: allow for resuming restriper after it was paused
Btrfs: allow for canceling restriper
Btrfs: allow for pausing restriper
Btrfs: add skip_balance mount option
Btrfs: recover balance on mount
Btrfs: save balance parameters to disk
Btrfs: soft profile changing mode (aka soft convert)
Btrfs: implement online profile changing
Btrfs: do not reduce profile in do_chunk_alloc()
Btrfs: virtual address space subset filter
...
Fix up trivial conflict in fs/btrfs/ioctl.c due to the use of the new
mnt_drop_write_file() helper.
pit_expect_msb() returns success wrongly in the below SMI scenario:
a. pit_verify_msb() has not yet seen the MSB transition.
b. we are close to the MSB transition though and got a SMI immediately after
returning from pit_verify_msb() which didn't see the MSB transition. PIT MSB
transition has happened somewhere during SMI execution.
c. returned from SMI and we noted down the 'tsc', saw the pit MSB change now and
exited the loop to calculate 'deltatsc'. Instead of noting the TSC at the MSB
transition, we are way off because of the SMI. And as the SMI happened
between the pit_verify_msb() and before the 'tsc' is recorded in the
for loop, 'delattsc' (d1/d2 in quick_pit_calibrate()) will be small and
quick_pit_calibrate() will not notice this error.
Depending on whether SMI disturbance happens while computing d1 or d2, we will
see the TSC calibrated value smaller or bigger than the expected value. As a
result, in a cluster we were seeing a variation of approximately +/- 20MHz in
the calibrated values, resulting in NTP failures.
[ As far as the SMI source is concerned, this is a periodic SMI that gets
disabled after ACPI is enabled by the OS. But the TSC calibration happens
before the ACPI is enabled. ]
To address this, change pit_expect_msb() so that
- the 'tsc' is the TSC in between the two reads that read the MSB
change from the PIT (same as before)
- the 'delta' is the difference in TSC from *before* the MSB changed
to *after* the MSB changed.
Now the delta is twice as big as before (it covers four PIT accesses,
roughly 4us) and quick_pit_calibrate() will loop a bit longer to get
the calibrated value with in the 500ppm precision. As the delta (d1/d2)
covers four PIT accesses, actual calibrated result might be closer to
250ppm precision.
As the loop now takes longer to stabilize, double MAX_QUICK_PIT_MS to 50.
SMI disturbance will showup as much larger delta's and the loop will take
longer than usual for the result to be with in the accepted precision. Or will
fallback to slow PIT calibration if it takes more than 50msec.
Also while we are at this, remove the calibration correction that aims to
get the result to the middle of the error bars. We really don't know which
direction to correct into, so remove it.
Reported-and-tested-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1326843337.5291.4.camel@sbsiddha-mobl2
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Similar to SIGNATURE, rename INTEGRITY_DIGSIG to INTEGRITY_SIGNATURE.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: James Morris <jmorris@namei.org>
It was reported that description of the MPILIB_EXTRA is confusing.
Indeed it was copy-paste typo. It is fixed here.
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: James Morris <jmorris@namei.org>
It was reported that DIGSIG is confusing name for digital signature
module. It was suggested to rename DIGSIG to SIGNATURE.
Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: James Morris <jmorris@namei.org>
Enabling CONFIG_PROVE_RCU and CONFIG_SPARSE_RCU_POINTER resulted in
"suspicious rcu_dereference_check() usage!" and "incompatible types
in comparison expression (different address spaces)" messages.
Access the masterkey directly when holding the rwsem.
Changelog v1:
- Use either rcu_read_lock()/rcu_derefence_key()/rcu_read_unlock()
or remove the unnecessary rcu_derefence() - David Howells
Reported-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Define rcu_assign_keypointer(), which uses the key payload.rcudata instead
of payload.data, to resolve the CONFIG_SPARSE_RCU_POINTER message:
"incompatible types in comparison expression (different address spaces)"
Replace the rcu_assign_pointer() calls in encrypted/trusted keys with
rcu_assign_keypointer().
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add missing smp_rmb() primitives to the keyring search code.
When keyring payloads are appended to without replacement (thus using up spare
slots in the key pointer array), an smp_wmb() is issued between the pointer
assignment and the increment of the key count (nkeys).
There should be corresponding read barriers between the read of nkeys and
dereferences of keys[n] when n is dependent on the value of nkeys.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
TOMOYO 2.5 in Linux 3.2 and later handles Unix domain socket's address.
Thus, tomoyo_correct_word2() needs to accept \000 as a valid character, or
TOMOYO 2.5 cannot handle Unix domain's abstract socket address.
Reported-by: Steven Allen <steven@stebalien.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
CC: stable@vger.kernel.org [3.2+]
Signed-off-by: James Morris <jmorris@namei.org>
This set of build failures just started appearing on parisc:
In file included from drivers/input/serio/serio_raw.c:12:
include/linux/kref.h: In function 'kref_get':
include/linux/kref.h:40: error: 'TAINT_WARN' undeclared (first use in this function)
include/linux/kref.h:40: error: (Each undeclared identifier is reported only once
include/linux/kref.h:40: error: for each function it appears in.)
include/linux/kref.h: In function 'kref_sub':
include/linux/kref.h:65: error: 'TAINT_WARN' undeclared (first use in this function)
It happens because TAINT_WARN is defined in kernel.h and this particular
compile doesn't seem to include it (no idea why it's just manifesting ..
probably some #include file untangling exposed it).
Fix by adding
#include <linux/kernel.h>
to linux/kref.h
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update MAINTAINERS file with new git repo:
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git
Signed-off-by: James Morris <jmorris@namei.org>
On October 1 in 2011,
OKI SEMICONDUCTOR Co., Ltd. changed the company name in to LAPIS Semiconductor Co., Ltd.
Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.lapis-semi.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
This patch modified the setting value of
I2C Bus Transfer Rate Setting Counter regisrer.
Signed-off-by: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
For EG20T and ML7213 IOH, the i2c controller numbers are fixed, using
fixed bus number will make it much easier for platform code to use
i2c_register_board_info() to register i2c devices.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Jüri Aedla reported that the /proc/<pid>/mem handling really isn't very
robust, and it also doesn't match the permission checking of any of the
other related files.
This changes it to do the permission checks at open time, and instead of
tracking the process, it tracks the VM at the time of the open. That
simplifies the code a lot, but does mean that if you hold the file
descriptor open over an execve(), you'll continue to read from the _old_
VM.
That is different from our previous behavior, but much simpler. If
somebody actually finds a load where this matters, we'll need to revert
this commit.
I suspect that nobody will ever notice - because the process mapping
addresses will also have changed as part of the execve. So you cannot
actually usefully access the fd across a VM change simply because all
the offsets for IO would have changed too.
Reported-by: Jüri Aedla <asd@ut.ee>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add initial DT support to retrieve the frequency using a
DT attribute instead of the pdata pointer if of_node exist.
Add documentation for omap i2c controller binding.
Based on original patches from Manju and Grant.
Signed-off-by: Benoit Cousson <b-cousson@ti.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Reviewed-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Kevin Hilman <khilman@ti.com>
On OMAP4 OMAP_I2C_STAT_NACK is causing a timeout on the next access.
The isr cleans all flags in OMAP_I2C_CON_REG by setting OMAP_I2C_CON_STP
OMAP_I2C_CON_STP is also set in omap_i2c_xfer_msg on the last message.
According to the TI TSR the sequence for OMAP_I2C_STAT_NACK and
OMAP_I2C_STAT_AL are nearly the same.
Removing the OMAP_I2C_CON_STP part in the isr fix the problem.
Tested on OMAP4430 and OMAP3530 (here NACK was not a problem)
Fixes also booting on 2430sdp.
Signed-off-by: Jan Weitzel <j.weitzel@phytec.de>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
Correct OMAP_I2C_SYSC_REG offset in omap4 register map.
Offset 0x20 is reserved and OMAP_I2C_SYSC_REG has 0x10 as offset.
Signed-off-by: Alexander Aring <a.aring@phytec.de>
[khilman@ti.com: minor changelog edits]
Cc: stable@vger.kernel.org
Signed-off-by: Kevin Hilman <khilman@ti.com>
dd slept infinitely when fsfeeze failed because of EIO.
To fix this problem, if ->freeze_fs fails, freeze_super() wakes up
the tasks waiting for the filesystem to become unfrozen.
When s_frozen isn't SB_UNFROZEN in __generic_file_aio_write(),
the function sleeps until FITHAW ioctl wakes up s_wait_unfrozen.
However, if ->freeze_fs fails, s_frozen is set to SB_UNFROZEN and then
freeze_super() returns an error number. In this case, FITHAW ioctl returns
EINVAL because s_frozen is already SB_UNFROZEN. There is no way to wake up
s_wait_unfrozen, so __generic_file_aio_write() sleeps infinitely.
Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
audit_log_d_path() injects an additional space before the prefix,
which serves no purpose and doesn't mix well with other audit_log*()
functions that do not sneak extra characters into the log.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
The use of s_id should go through the untrusted string path, just to be
extra careful.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
In the loop, a size_t "len" is used to hold the return value of
audit_log_single_execve_arg(), which returns -1 on error. In that
case the error handling (len <= 0) will be bypassed since "len" is
unsigned, and the loop continues with (p += len) being wrapped.
Change the type of "len" to signed int to fix the error handling.
size_t len;
...
for (...) {
len = audit_log_single_execve_arg(...);
if (len <= 0)
break;
p += len;
}
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
This allows audit to specify rules in which we compare two fields of a
process. Such as is the running process uid != to the running process
euid?
Signed-off-by: Peter Moody <pmoody@google.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
This completes the matrix of interfield comparisons between uid/gid
information for the current task and the uid/gid information for inodes.
aka I can audit based on differences between the euid of the process and
the uid of fs objects.
Signed-off-by: Peter Moody <pmoody@google.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Rather than code the same loop over and over implement a helper function which
uses some pointer magic to make it generic enough to be used numerous places
as we implement more audit interfield comparisons
Signed-off-by: Eric Paris <eparis@redhat.com>
We wish to be able to audit when a uid=500 task accesses a file which is
uid=0. Or vice versa. This patch introduces a new audit filter type
AUDIT_FIELD_COMPARE which takes as an 'enum' which indicates which fields
should be compared. At this point we only define the task->uid vs
inode->uid, but other comparisons can be added.
Signed-off-by: Eric Paris <eparis@redhat.com>
This patch provides functionality to audit system call events on the
ARM platform. The implementation was based off the structure of the
MIPS platform and information in this
(http://lists.fedoraproject.org/pipermail/arm/2009-October/000382.html)
mailing list thread. The required audit_syscall_exit and
audit_syscall_entry checks were added to ptrace using the standard
registers for system call values (r0 through r3). A thread information
flag was added for auditing (TIF_SYSCALL_AUDIT) and a meta-flag was
added (_TIF_SYSCALL_WORK) to simplify modifications to the syscall
entry/exit. Now, if either the TRACE flag is set or the AUDIT flag is
set, the syscall_trace function will be executed. The prober changes
were made to Kconfig to allow CONFIG_AUDITSYSCALL to be enabled.
Due to platform availability limitations, this patch was only tested
on the Android platform running the modified "android-goldfish-2.6.29"
kernel. A test compile was performed using Code Sourcery's
cross-compilation toolset and the current linux-3.0 stable kernel. The
changes compile without error. I'm hoping, due to the simple modifications,
the patch is "obviously correct".
Signed-off-by: Nathaniel Husted <nhusted@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Just a code cleanup really. We don't need to make a function call just for
it to return on error. This also makes the VFS function even easier to follow
and removes a conditional on a hot path.
Signed-off-by: Eric Paris <eparis@redhat.com>
At the moment we allow tasks to set their loginuid if they have
CAP_AUDIT_CONTROL. In reality we want tasks to set the loginuid when they
log in and it be impossible to ever reset. We had to make it mutable even
after it was once set (with the CAP) because on update and admin might have
to restart sshd. Now sshd would get his loginuid and the next user which
logged in using ssh would not be able to set his loginuid.
Systemd has changed how userspace works and allowed us to make the kernel
work the way it should. With systemd users (even admins) are not supposed
to restart services directly. The system will restart the service for
them. Thus since systemd is going to loginuid==-1, sshd would get -1, and
sshd would be allowed to set a new loginuid without special permissions.
If an admin in this system were to manually start an sshd he is inserting
himself into the system chain of trust and thus, logically, it's his
loginuid that should be used! Since we have old systems I make this a
Kconfig option.
Signed-off-by: Eric Paris <eparis@redhat.com>
The function always deals with current. Don't expose an option
pretending one can use it for something. You can't.
Signed-off-by: Eric Paris <eparis@redhat.com>
Much like the ability to filter audit on the uid of an inode collected, we
should be able to filter on the gid of the inode.
Signed-off-by: Eric Paris <eparis@redhat.com>
Allow syscall exit filter matching based on the uid of the owner of an
inode used in a syscall. aka:
auditctl -a always,exit -S open -F obj_uid=0 -F perm=wa
Signed-off-by: Eric Paris <eparis@redhat.com>
Audit entry,always rules are not allowed and are automatically changed in
exit,always rules in userspace. The kernel refuses to load such rules.
Thus a task in the middle of a syscall (and thus in audit_finish_fork())
can only be in one of two states: AUDIT_BUILD_CONTEXT or AUDIT_DISABLED.
Since the current task cannot be in AUDIT_RECORD_CONTEXT we aren't every
going to actually use the code in audit_finish_fork() since it will
return without doing anything. Thus drop the code.
Signed-off-by: Eric Paris <eparis@redhat.com>
A number of audit hooks make function calls before they determine that
auxilary records do not need to be collected. Do those checks as static
inlines since the most common case is going to be that records are not
needed and we can skip the function call overhead.
Signed-off-by: Eric Paris <eparis@redhat.com>
The audit code makes heavy use of likely() and unlikely() macros, but they
don't always make sense. Drop any that seem questionable and let the
computer do it's thing.
Signed-off-by: Eric Paris <eparis@redhat.com>
Audit contexts have 3 states. Disabled, which doesn't collect anything,
build, which collects info but might not emit it, and record, which
collects and emits. There is a 4th state, setup, which isn't used. Get
rid of it.
Signed-off-by: Eric Paris <eparis@redhat.com>
Every arch calls:
if (unlikely(current->audit_context))
audit_syscall_entry()
which requires knowledge about audit (the existance of audit_context) in
the arch code. Just do it all in static inline in audit.h so that arch's
can remain blissfully ignorant.
Signed-off-by: Eric Paris <eparis@redhat.com>
In the ia32entry syscall exit audit fastpath we have assembly code which calls
__audit_syscall_exit directly. This code was, however, zeroes the upper 32
bits of the return code. It then proceeded to call code which expects longs
to be 64bits long. In order to handle code which expects longs to be 64bit we
sign extend the return code if that code is an error. Thus the
__audit_syscall_exit function can correctly handle using the values in
snprintf("%ld"). This fixes the regression introduced in 5cbf1565f2.
Old record:
type=SYSCALL msg=audit(1306197182.256:281): arch=40000003 syscall=192 success=no exit=4294967283
New record:
type=SYSCALL msg=audit(1306197182.256:281): arch=40000003 syscall=192 success=no exit=-13
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
The audit system previously expected arches calling to audit_syscall_exit to
supply as arguments if the syscall was a success and what the return code was.
Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things
by converting from negative retcodes to an audit internal magic value stating
success or failure. This helper was wrong and could indicate that a valid
pointer returned to userspace was a failed syscall. The fix is to fix the
layering foolishness. We now pass audit_syscall_exit a struct pt_reg and it
in turns calls back into arch code to collect the return value and to
determine if the syscall was a success or failure. We also define a generic
is_syscall_success() macro which determines success/failure based on if the
value is < -MAX_ERRNO. This works for arches like x86 which do not use a
separate mechanism to indicate syscall failure.
We make both the is_syscall_success() and regs_return_value() static inlines
instead of macros. The reason is because the audit function must take a void*
for the regs. (uml calls theirs struct uml_pt_regs instead of just struct
pt_regs so audit_syscall_exit can't take a struct pt_regs). Since the audit
function takes a void* we need to use static inlines to cast it back to the
arch correct structure to dereference it.
The other major change is that on some arches, like ia64, MIPS and ppc, we
change regs_return_value() to give us the negative value on syscall failure.
THE only other user of this macro, kretprobe_example.c, won't notice and it
makes the value signed consistently for the audit functions across all archs.
In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
audit code as the return value. But the ptrace_64.h code defined the macro
regs_return_value() as regs[3]. I have no idea which one is correct, but this
patch now uses the regs_return_value() function, so it now uses regs[3].
For powerpc we previously used regs->result but now use the
regs_return_value() function which uses regs->gprs[3]. regs->gprs[3] is
always positive so the regs_return_value(), much like ia64 makes it negative
before calling the audit code when appropriate.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
Acked-by: Richard Weinberger <richard@nod.at> [for uml]
Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
The audit system likes to collect information about processes that end
abnormally (SIGSEGV) as this may me useful intrusion detection information.
This patch adds audit support to collect information when seccomp forces a
task to exit because of misbehavior in a similar way.
Signed-off-by: Eric Paris <eparis@redhat.com>
The audit system has the ability to filter on the major and minor number of
the device containing the inode being operated upon. Lets say that
/dev/sda1 has major,minor 8,1 and that we mount /dev/sda1 on /boot. Now lets
say we add a watch with a filter on 8,1. If we proceed to open an inode
inside /boot, such as /vboot/vmlinuz, we will match the major,minor filter.
Lets instead assume that one were to use a tool like debugfs and were to
open /dev/sda1 directly and to modify it's contents. We might hope that
this would also be logged, but it isn't. The rules will check the
major,minor of the device containing /dev/sda1. In other words the rule
would match on the major/minor of the tmpfs mounted at /dev.
I believe these rules should trigger on either device. The man page is
devoid of useful information about the intended semantics. It only seems
logical that if you want to know everything that happened on a major,minor
that would include things that happened to the device itself...
Signed-off-by: Eric Paris <eparis@redhat.com>
userspace audit messages look like so:
type=USER msg=audit(1271170549.415:24710): user pid=14722 uid=0 auid=500 ses=1 subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 msg=''
That third field just says 'user'. That's useless and doesn't follow the
key=value pair we are trying to enforce. We already know it came from the
user based on the record type. Kill that word. Die.
Signed-off-by: Eric Paris <eparis@redhat.com>
This patch does 2 things. First it reduces the number of audit_names
allocated in every audit context from 20 to 5. 5 should be enough for all
'normal' syscalls (rename being the worst). Some syscalls can still touch
more the 5 inodes such as mount. When rpc filesystem is mounted it will
create inodes and those can exceed 5. To handle that problem this patch will
dynamically allocate audit_names if it needs more than 5. This should
decrease the typicall memory usage while still supporting all the possible
kernel operations.
Signed-off-by: Eric Paris <eparis@redhat.com>
Every other filter that matches part of the inodes list collected by audit
will match against any of the inodes on that list. The filetype matching
however had a strange way of doing things. It allowed userspace to
indicated if it should match on the first of the second name collected by
the kernel. Name collection ordering seems like a kernel internal and
making userspace rules get that right just seems like a bad idea. As it
turns out the userspace audit writers had no idea it was doing this and
thus never overloaded the value field. The kernel always checked the first
name collected which for the tested rules was always correct.
This patch just makes the filetype matching like the major, minor, inode,
and LSM rules in that it will match against any of the names collected. It
also changes the rule validation to reject the old unused rule types.
Noone knew it was there. Noone used it. Why keep around the extra code?
Signed-off-by: Eric Paris <eparis@redhat.com>
With all the size field updates out of the way xfs_file_aio_write can
be further simplified by pushing all iolock handling into
xfs_file_dio_aio_write and xfs_file_buffered_aio_write and using
the generic generic_write_sync helper for synchronous writes.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
While xfs_iunlock is fine with 0 lockflags the calling conventions are much
cleaner if xfs_file_aio_write_checks never returns without the iolock held.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Now that we use the VFS i_size field throughout XFS there is no need for the
i_new_size field any more given that the VFS i_size field gets updated
in ->write_end before unlocking the page, and thus is always uptodate when
writeback could see a page. Removing i_new_size also has the advantage that
we will never have to trim back di_size during a failed buffered write,
given that it never gets updated past i_size.
Note that currently the generic direct I/O code only updates i_size after
calling our end_io handler, which requires a small workaround to make
sure di_size actually makes it to disk. I hope to fix this properly in
the generic code.
A downside is that we lose the support for parallel non-overlapping O_DIRECT
appending writes that recently was added. I don't think keeping the complex
and fragile i_new_size infrastructure for this is a good tradeoff - if we
really care about parallel appending writers we should investigate turning
the iolock into a range lock, which would also allow for parallel
non-overlapping buffered writers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
There is no fundamental need to keep an in-memory inode size copy in the XFS
inode. We already have the on-disk value in the dinode, and the separate
in-memory copy that we need for regular files only in the XFS inode.
Remove the xfs_inode i_size field and change the XFS_ISIZE macro to use the
VFS inode i_size field for regular files. Switch code that was directly
accessing the i_size field in the xfs_inode to XFS_ISIZE, or in cases where
we are limited to regular files direct access of the VFS inode i_size field.
This also allows dropping some fairly complicated code in the write path
which dealt with keeping the xfs_inode i_size uptodate with the VFS i_size
that is getting updated inside ->write_end.
Note that we do not bother resetting the VFS i_size when truncating a file
that gets freed to zero as there is no point in doing so because the VFS inode
is no longer in use at this point. Just relax the assert in xfs_ifree to
only check the on-disk size instead.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Replace i_pin_wait, which is only used during synchronous inode flushing
with a bit waitqueue. This trades off a much smaller inode against
slightly slower wakeup performance, and saves 12 (32-bit) or 20 (64-bit)
bytes in the XFS inode.
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
We almost never block on i_flock, the exception is synchronous inode
flushing. Instead of bloating the inode with a 16/24-byte completion
that we abuse as a semaphore just implement it as a bitlock that uses
a bit waitqueue for the rare sleeping path. This primarily is a
tradeoff between a much smaller inode and a faster non-blocking
path vs faster wakeups, and we are much better off with the former.
A small downside is that we will lose lockdep checking for i_flock, but
given that it's always taken inside the ilock that should be acceptable.
Note that for example the inode writeback locking is implemented in a
very similar way.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
To be used for bit wakeup i_flags needs to be an unsigned long or we'll
run into trouble on big endian systems. Because of the 1-byte i_update
field right after it this actually causes a fairly large size increase
on its own (4 or 8 bytes), but that increase will be more than offset
by the next two patches.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
We spent a lot of effort to maintain this field, but it always equals to the
fork size divided by the constant size of an extent. The prime use of it is
to assert that the two stay in sync. Just divide the fork size by the extent
size in the few places that we actually use it and remove the overhead
of maintaining it. Also introduce a few helpers to consolidate the places
where we actually care about the value.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
kmemcheck complains that ->redirect_genid doesn't get initialized.
Presumably it should be set to zero.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_checksum_help() has never done anything useful with skbs that
require segmentation. Setting skb->ip_summed = CHECKSUM_NONE makes
them invalid and provokes a later WARNing in skb_gso_segment().
Passing such an skb to skb_checksum_help() indicates a bug, so we
should warn about it immediately. Move the warning from
skb_gso_segment() into a shared function, and add gso_type and
gso_size to it.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'stable/for-linus-fixes-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/balloon: Move the registration from device to subsystem.
Since commit 46bcfad7a8 registering
and unregistering cpuidle is done in processor_idle.c.
Unregistering via:
acpi_bus_unregister_driver(&acpi_processor_driver)
-> acpi_processor_remove()
-> acpi_processor_power_exit()
Remove not needed cpuidle_unregister_driver() call from
acpi_processor_exit
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
kvm -cpu host passes the original cpuid info to the guest.
Latest kvm version seem to return true for mwait_leaf cpuid
function on recent Intel CPUs. But it does not return mwait
C-states (mwait_substates), instead zero is returned.
While real CPUs seem to always return non-zero values, the intel
idle driver should not get active in kvm (mwait_substates == 0)
case and bail out.
Otherwise a Null pointer exception will happen later when the
cpuidle subsystem tries to get active:
[0.984807] BUG: unable to handle kernel NULL pointer dereference at (null)
[0.984807] IP: [<(null)>] (null)
...
[0.984807][<ffffffff8143cf34>] ? cpuidle_idle_call+0xb4/0x340
[0.984807][<ffffffff8159e7bc>] ? __atomic_notifier_call_chain+0x4c/0x70
[0.984807][<ffffffff81001198>] ? cpu_idle+0x78/0xd0
Reference:
https://bugzilla.novell.com/show_bug.cgi?id=726296
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Bruno Friedmann <bruno@ioda-net.ch>
Signed-off-by: Len Brown <len.brown@intel.com>
Fix the following warning:
drivers/idle/intel_idle.c: In function 'intel_idle_cpuidle_devices_init':
drivers/idle/intel_idle.c:518:5: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
By making get_driver_data() return a long instead of an int.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
irq disabling happens earlier in process_32.c:cpu_idle. Basically,
cpuidle_state->enter is called, cpu irq is disabled. cpuidle_state->enter
would turn on irq when exiting.
intel_idle doesn't follow this assumption. Although it doesn't cause real
issue, it misleads developers. Remove the call to local_irq_disable() at
entry.
[akpm@linux-foundation.org: add comment]
Signed-off-by: Mingming Zhang <mingmingx.zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: virtuoso: Xonar DS: fix polarity of front output
ALSA: Au88x0 - Reduce the number of playback subdevices of au8830 from 32 to 16
ALSA: Au88x0 - Support 4 channels playback when AC97 codecs has SDAC bit
ALSA: HDA: Fix internal microphone on Dell Studio 16 XPS 1645
ALSA: Don't prompt for CONFIG_SND_COMPRESS_OFFLOAD
ALSA: HDA: Use LPIB position fix for Macbook Pro 7,1
If the frontend is in idle state, don't call get_frontend.
Calling get_frontend() when the device is not tuned may
result in wrong parameters to be returned to the
userspace.
I was tempted to not call get_frontend() at all, except
inside the dvb frontend thread, but this won't work for
all cases. The ISDB-T specs (ABNT NBR 15601 and ARIB
STD-B31) allow the broadcaster to dynamically change the
channel specs at runtime. That means that an ISDB-T optimized
application may want/need to monitor the TMCC tables, decoded
at the frontends via get_frontend call.
So, let's do the simpler change here.
Eventually, the logic could be changed to work only if
the device is tuned and has lock, but, even so, the
lock is also standard-dependent. For ISDB-T, the right
lock to wait is that the demod has TMCC lock. So, drivers
may need to implement some logic to detect if the get_frontend
info was retrieved or not.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This reverts commit d2a7009f0b.
J. R. Okajima explains:
"After this commit, I am afraid access(2) on NFS may not work
correctly. The scenario based upon my guess.
- access(2) overrides the credentials.
- calls inode_permission() -- ... -- generic_permission() --
ns_capable().
- while the old ns_capable() calls security_capable(current_cred()),
the new ns_capable() calls has_ns_capability(current) --
security_capable(__task_cred(t)).
current_cred() returns current->cred which is effective (overridden)
credentials, but __task_cred(current) returns current->real_cred (the
NFSD's credential). And the overridden credentials by access(2) lost."
Requested-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove WARN_ON and bad handling of SKB without destructor callback
in caif_flow_cb. SKB without destructor cannot be handled as an
error case.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix typo for the Vendor/Product Id for ST-Ericsson CAIF modems.
Discovery is based on fixed USB vendor 0x04cc (ST-Ericsson),
product-id 0x230f (NCM).
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disable the work-around for the autoneg KR of the BCM57810 in case the Warpcore version is 0xD108 and above, which fixes this problem.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unsupported speed of 100Mb force for BCM84833 due to hardware
limitation.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch handles the second port of a path in a 4-port device of
BCM57840.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Super-Isolate mode comes to isolate the BCM84833 PHY from the
outside world. Not doing it correctly, made link partner see the link
before the driver was loaded.
This patch also involves SPIROM version fixes since it is used to
determine whether the common init of the PHY was already executed, and
the common init of this PHY is partially responsible for setting the
Super-Isolate mode.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
make C=2 CF="-D__CHECK_ENDIAN__" M=net
And fix flowi4_init_output() prototype for sport
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----
drivers/net/ethernet/renesas/sh_eth.c:1706: error: 'pdid' undeclared (first use in this function)
drivers/net/ethernet/renesas/sh_eth.c:1706: error: (Each undeclared identifier is reported only once
drivers/net/ethernet/renesas/sh_eth.c:1706: error: for each function it appears in.)
make[5]: *** [drivers/net/ethernet/renesas/sh_eth.o] Error 1
-----
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
CC: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ethtool operations generally require the caller to hold RTNL and are
not safe to call in atomic context. The device model provides this
information for most devices; we'll only lose it for some old ISA
drivers.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Although only used currently for tcp sockets, this function
is now used in common sock code (for sock_clone())
Commit 475f1b5264 moved the
declaration of sock_update_clone() to inside sock.c, but
this only fixes the problem when CONFIG_CGROUP_MEM_RES_CTLR_KMEM
is also not defined.
This patch here is verified to fix both problems, although
reverting the previous one is not necessary.
Signed-off-by: Glauber Costa <glommer@parallels.com>
CC: David S. Miller <davem@davemloft.net>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixing following sparse warning
>drivers/net/wireless/mwl8k.c:2780:15: warning: incorrect type in assignment (different base types)
>drivers/net/wireless/mwl8k.c:2780:15: expected restricted unsigned short [usertype] channel
>drivers/net/wireless/mwl8k.c:2780:15: got unsigned short [unsigned] [usertype] hw_value
Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
All other code paths in sta_unblock synchronize with the network
softirq by using local_bh_disable/enable. Do the same around
ieee80211_sta_ps_deliver_wakeup.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The intent here was to check whether key->cipher was WEP40 or WEP104.
We do a similar check correctly in several other places in this file.
The current condition is always true.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Alexandre Oliva <oliva@lsd.ic.unicamp.br> says:
"It's an issue brought about by GCC 4.7's partial-inlining, that ends up
splitting the udelay function just at the wrong spot, in such a way that
some sanity checks for constants fails, and we end up calling
bad_udelay.
This patch fixes the problem. Feel free to push it upstream if it makes
sense to you."
Signed-off-by: John W. Linville <linville@tuxdriver.com>
remove version.h includes in net/openswitch/ as reported by make versioncheck.
Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no store() method for inflight attribute in the
tx-<n>/byte_queue_limits sysfs directory.
So remove S_IWUSR bit.
Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some functions and variables in ehea are only used in their own file, so
they should be static. One particular function had a very generic name,
print_error_data.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The brcmsmac driver isn't a PCI driver any more, it's a bcma one. The
PCI device has been resumed by the PCI driver (the generic PCI layer,
really), we should be resuming just our own driver state.
Also add pr_debug() calls to show that we now actually get the
suspend/resume events.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Now the low-level driver actually gets informed that it is getting suspended and resumed.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
.. and connect it up with the pci host bcma driver.
Now, the next step is to connect those bcma bus-level suspend/resume
functions to the actual bcma device suspend resume functions.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
On ISDB-T, each layer can have its own independent modulation,
applied to the carriers that belong to the segments associated
with them. So, there's no sense to define a global modulation
parameter.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The ISDB-T differs on its way to implement the hierarchical
transmissions: instead of using a low-priority/high-priority
FEC codes, it does that by using different layers, each layer
with their groups of segments. So, those parameters don't make sense
for ISDB-T.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The units for DTV_FREQUENCY are kHz for satellital delivery systems
(DVB-S/DVB-S2/DVB-TURBO/ISDB-S). Fix it at the API spec.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
For UP processor, it is likely that no _MAT method or MADT table defined.
So currently acpi_get_cpuid(...) always return -1 for UP processor.
This is wrong. It should return valid value for CPU0.
In the other hand, BIOS may define multiple CPU handles even for UP
processor, for example
Scope (_PR)
{
Processor (CPU0, 0x00, 0x00000410, 0x06) {}
Processor (CPU1, 0x01, 0x00000410, 0x06) {}
Processor (CPU2, 0x02, 0x00000410, 0x06) {}
Processor (CPU3, 0x03, 0x00000410, 0x06) {}
}
We should only return valid value for CPU0's acpi handle.
And return invalid value for others.
http://marc.info/?t=132329819900003&r=1&w=2
Cc: stable@vger.kernel.org
Reported-and-tested-by: wallak@free.fr
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
smp_call_function() only lets all other CPUs execute a specific function,
while we expect all CPUs do in intel_idle. Without the fix, we could have
one cpu which has auto_demotion enabled or has no broadcast timer setup.
Usually we don't see impact because auto demotion just harms power and the
intel_idle init is called in CPU 0, where boradcast timer delivers
interrupt, but this still could be a problem.
Cc: stable@vger.kernel.org
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
If there was a dumping error in the middle, the set-specific variable was
not zeroed out and thus the 'done' function of the dumping wrongly tried
to release the already released reference of the set. The already released
reference was caught by __ip_set_put and triggered a kernel BUG message.
Reported by Jean-Philippe Menil.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jan Engelhardt noticed when userspace requests a set type unknown
to the kernel, it can lead to a loop due to the unsafe type module
loading. The issue is fixed in this patch.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Move the ZONE_DMA kconfig symbol under a menu item instead
of having it listed before everything else in
"make {xconfig | gconfig | nconfig | menuconfig}".
This drops the first line of the top-level kernel config menu
(in 3.2) below and moves it under "Processor type and features".
[*] DMA memory allocation support
General setup --->
[*] Enable loadable module support --->
[*] Enable the block layer --->
Processor type and features --->
Power management and ACPI options --->
Bus options (PCI etc.) --->
Executable file formats / Emulations --->
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-mm@kvack.org <linux-mm@kvack.org>
Link: http://lkml.kernel.org/r/4F14811E.6090107@xenotime.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>
APEI needs memory access in interrupt context. The obvious choice is
acpi_read(), but originally it couldn't be used in interrupt context
because it makes temporary mappings with ioremap(). Therefore, we added
drivers/acpi/atomicio.c, which provides:
acpi_pre_map_gar() -- ioremap in process context
acpi_atomic_read() -- memory access in interrupt context
acpi_post_unmap_gar() -- iounmap
Later we added acpi_os_map_generic_address() (2971852) and enhanced
acpi_read() so it works in interrupt context as long as the address has
been previously mapped (620242a). Now this sequence:
acpi_os_map_generic_address() -- ioremap in process context
acpi_read()/apei_read() -- now OK in interrupt context
acpi_os_unmap_generic_address()
is equivalent to what atomicio.c provides.
This patch introduces apei_read() and apei_write(), which currently are
functional equivalents of acpi_read() and acpi_write(). This is mainly
proactive, to prevent APEI breakages if acpi_read() and acpi_write()
are ever augmented to support the 'bit_offset' field of GAS, as APEI's
__apei_exec_write_register() precludes splitting up functionality
related to 'bit_offset' and APEI's 'mask' (see its
APEI_EXEC_PRESERVE_REGISTER block).
With apei_read() and apei_write() in place, usages of atomicio routines
are converted to apei_read()/apei_write() and existing calls within
osl.c and the CA, based on the re-factoring that was done in an earlier
patch series - http://marc.info/?l=linux-acpi&m=128769263327206&w=2:
acpi_pre_map_gar() --> acpi_os_map_generic_address()
acpi_post_unmap_gar() --> acpi_os_unmap_generic_address()
acpi_atomic_read() --> apei_read()
acpi_atomic_write() --> apei_write()
Note that acpi_read() and acpi_write() currently use 'bit_width'
for accessing GARs which seems incorrect. 'bit_width' is the size of
the register, while 'access_width' is the size of the access the
processor must generate on the bus. The 'access_width' may be larger,
for example, if the hardware only supports 32-bit or 64-bit reads. I
wanted to minimize any possible impacts with this patch series so I
did *not* change this behavior.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Export remapping and unmapping interfaces - acpi_os_map_generic_address()
and acpi_os_unmap_generic_address() - for ACPI generic registers that are
backed by memory mapped I/O (MMIO).
The acpi_os_map_generic_address() and acpi_os_unmap_generic_address()
declarations may more properly belong in include/acpi/acpiosxf.h next to
acpi_os_read_memory() but I believe that would require the ACPI CA making
them an official part of the ACPI CA - OS interface.
ACPI Generic Address Structure (GAS) reference (ACPI's fixed/generic
hardware registers use the GAS format):
ACPI Specification, Revision 4.0, Section 5.2.3.1, "Generic Address
Structure"
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Generic Address Structures (GAS) may reside within ACPI tables which
are byte aligned. This patch copies GAS 'address' references to a local
variable, which will be naturally aligned, to be used going forward.
ACPI Generic Address Structure (GAS) reference:
ACPI Specification, Revision 4.0, Section 5.2.3.1, "Generic Address
Structure"
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
In SRAT v1, we had 8bit proximity domain (PXM) fields; SRAT v2 provides
32bits for these. The new fields were reserved before.
According to the ACPI spec, the OS must disregrard reserved fields.
ia64 did handle the PXM fields almost consistently, but depending on
sgi's sn2 platform. This patch leaves the sn2 logic in, but does also
use 16/32 bits for PXM if the SRAT has rev 2 or higher.
The patch also adds __init to the two pxm accessor functions, as they
access __initdata now and are called from an __init function only anyway.
Note that the code only uses 16 bits for the PXM field in the processor
proximity field; the patch does not address this as 16 bits are more than
enough.
Signed-off-by: Kurt Garloff <kurt@garloff.de>
Signed-off-by: Len Brown <len.brown@intel.com>
In SRAT v1, we had 8bit proximity domain (PXM) fields; SRAT v2 provides
32bits for these. The new fields were reserved before.
According to the ACPI spec, the OS must disregrard reserved fields.
x86/x86-64 was rather inconsistent prior to this patch; it used 8 bits
for the pxm field in cpu_affinity, but 32 bits in mem_affinity.
This patch makes it consistent: Either use 8 bits consistently (SRAT
rev 1 or lower) or 32 bits (SRAT rev 2 or higher).
cc: x86@kernel.org
Signed-off-by: Kurt Garloff <kurt@garloff.de>
Signed-off-by: Len Brown <len.brown@intel.com>
In SRAT v1, we had 8bit proximity domain (PXM) fields; SRAT v2 provides
32bits for these. The new fields were reserved before.
According to the ACPI spec, the OS must disregrard reserved fields.
In order to know whether or not, we must know what version the SRAT
table has.
This patch stores the SRAT table revision for later consumption
by arch specific __init functions.
Signed-off-by: Kurt Garloff <kurt@garloff.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Some firmware will access memory in ACPI NVS region via APEI. That
is, instructions in APEI ERST/EINJ table will read/write ACPI NVS
region. The original resource conflict checking in APEI code will
check memory/ioport accessed by APEI via general resource management
mech. But ACPI NVS region is marked as busy already, so that the
false resource conflict will prevent APEI ERST/EINJ to work.
To fix this, this patch excludes ACPI NVS regions when APEI components
request resources. So that they will not conflict with ACPI NVS
regions.
Reported-and-tested-by: Pavel Ivanov <paivanof@gmail.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Some firmware will access memory in ACPI NVS region via APEI. That
is, instructions in APEI ERST/EINJ table will read/write ACPI NVS
region. The original resource conflict checking in APEI code will
check memory/ioport accessed by APEI via general resource management
mechanism. But ACPI NVS region is marked as busy already, so that the
false resource conflict will prevent APEI ERST/EINJ to work.
To fix this, this patch record ACPI NVS regions, so that we can avoid
request resources for memory region inside it.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Current fix for resource conflict is to remove the address region <param1 &
param2, ~param2+1> from trigger resource, which is highly relies on valid user
input. This patch is trying to avoid such potential issues by fetching the
exact address region from trigger action table entry.
Signed-off-by: Xiao, Hui <hui.xiao@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Some APEI firmware implementation will access injected address
specified in param1 to trigger the error when injecting memory error.
This will cause resource conflict with RAM.
On one of our testing machine, if injecting at memory address
0x10000000, the following error will be reported in dmesg:
APEI: Can not request iomem region <0000000010000000-0000000010000008> for GARs.
This patch removes the injecting memory address range from trigger
table resources to avoid conflict.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
On one of our testing machine, the following EINJ command lines:
# echo 0x10000000 > param1
# echo 0xfffffffffffff000 > param2
# echo 0x8 > error_type
# echo 1 > error_inject
Will get:
echo: write error: Input/output error
The EIO comes from:
rc = apei_exec_pre_map_gars(&trigger_ctx);
The root cause is as follow. Normally, ACPI atomic IO support is used
to access IO memory. But in EINJ of that machine, it is used to
access RAM to trigger the injected error. And the ioremap() called by
apei_exec_pre_map_gars() can not map the RAM.
This patch add RAM mapping support to ACPI atomic IO support to
satisfy EINJ requirement.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Because printk is not safe inside NMI handler, the recoverable error
records received in NMI handler will be queued to be printked in a
delayed IRQ context via irq_work. If a fatal error occurs after the
recoverable error and before the irq_work processed, we lost a error
report.
To solve the issue, the queued error records are printked in NMI
handler if system will go panic.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
In most cases, printk only guarantees messages from different printk
calling will not be interleaved between each other. But, one APEI
GHES hardware error report will involve multiple printk calling,
normally each for one line. So it is possible that the hardware error
report comes from different generic hardware error source will be
interleaved.
In this patch, a sequence number is prefixed to each line of error
report. So that, even if they are interleaved, they still can be
distinguished by the prefixed sequence number.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
aer_recover_queue() is called when recoverable PCIe AER errors are
notified by firmware to do the recovery work.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
There is no 64bit read/write support in ACPI atomicio because
readq/writeq is used to implement 64bit read/write, but readq/writeq
is not available on i386. This patch implement 64bit read/write
support in atomicio via two readl/writel.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Make the various files in alphabetical order to simplify
addition of new files.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
If HW-reduced flag is set in the FADT, do not attempt to access
or initialize any ACPI hardware, including SCI and global lock.
No FACS will be present.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The call to acpi_os_validate_address in acpi_ds_get_region_arguments was
removed by mistake in commit 9ad19ac(ACPICA: Split large dsopcode and
dsload.c files).
Put it back.
Cc: stable@vger.kernel.org # 2.6.38+
Reported-and-bisected-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This patch moves the ack of the BAU interrupt to the beginning
of the interrupt handler so that there is less possibility of a
lost interrupt and slower response to a shootdown message.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20120116212146.GE5767@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch removes an unnecessary test for a
no-destination-resources-available condition that looks like a
destination timeout in UV1, but is separately distinguishable in
UV2.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20120116212050.GD5767@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch implements a workaround for a UV2 hardware bug.
The bug is a non-atomic update of a memory-mapped register. When
hardware message delivery and software message acknowledge occur
simultaneously the pending message acknowledge for the arriving
message may be lost. This causes the sender's message status to
stay busy.
Part of the workaround is to not acknowledge a completed message
until it is verified that no other message is actually using the
resource that is mistakenly recorded in the completed message.
Part of the workaround is to test for long elapsed time in such
a busy condition, then handle it by using a spare sending
descriptor. The stay-busy condition is eventually timed out by
hardware, and then the original sending descriptor can be
re-used. Most of that logic change is in keeping track of the
current descriptor and the state of the spares.
The occurrences of the workaround are added to the BAU
statistics.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20120116211947.GC5767@sgi.com
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move the call to enable_timeouts() forward so that
BAU_MISC_CONTROL is initialized before using it in
calculate_destination_timeout().
Fix the calculation of a BAU destination timeout
for UV2 (in calculate_destination_timeout()).
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20120116211848.GB5767@sgi.com
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Update the use of the Broadcast Assist Unit on SGI Altix UV2 to
the use of native UV2 mode on new hardware (not the legacy mode).
UV2 native mode has a different format for a broadcast message.
We also need quick differentiaton between UV1 and UV2.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20120116211750.GA5767@sgi.com
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'samsung-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: EXYNOS: Invert VCLK polarity for framebuffer on ORIGEN
ARM: S3C64XX: Fix interrupt configuration for PCA935x on Cragganmore
ARM: S3C64XX: Fix the memory mapped GPIOs on Cragganmore
ARM: S3C64XX: Remove hsmmc1 from Cragganmore
ARM: S3C64XX: Remove unconditional power domain disables
ARM: SAMSUNG: Declare struct platform_device in plat/s3c64xx-spi.h
ARM: SAMSUNG: dma-ops.h needs mach/dma.h
ARM: SAMSUNG: Guard against multiple inclusion of plat/dma.h
system chunks by default are very small. This makes them slightly
larger and also fixes the conditional checks to make sure we don't
allocate a billion of them at once.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
I was using i_mutex for this, but we're getting bogus lockdep warnings by doing
that and theres no real way to get rid of those, so just stop using i_mutex to
protect delalloc metadata reservations and use a delalloc mutex instead. This
shouldn't be contended often at all, only if you are writing and mmap writing to
the file at the same time. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
This in addition to a script in my btrfs-tracing tree will help track down space
leaks when we're getting space left over in block groups on umount. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
We've been seeing warnings coming out of the orphan commit stuff forever from
ceph. Turns out it's because we're racing with checking if the orphan block
reserve is set, because we clear it outside of the spin_lock. So leave the
normal fastpath checks where they are, but take the spin_lock and _recheck_ to
make sure we haven't had an orphan block rsv added in the meantime. Then clear
the root's orphan block rsv and release the lock. With this patch a user said
the warnings went away and they usually showed up pretty soon after he started
ceph. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
I used these tracepoints when figuring out what the cluster stuff was doing, so
add them to mainline in case we need to profile this stuff again. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Btrfs_throttle will make us wait if there is a currently committing transaction
until we can open new transactions, which is ridiculous since we don't actually
start any transactions within the file write path anyway, so all this does is
introduce big latencies if we have a sync/fsync heavy workload going on while
somebody else is trying to do work. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
If updating the inode gave us an ENOSPC we were just returning in page_mkwrite,
which is a problem since we make our reservation right before trying to update
the inode, so fix the out label so that we actually free our reservation.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Reproduce steps:
# mkfs.btrfs /dev/sdb5
# mount /dev/sdb5 -o compress=lzo /mnt
# dd if=/dev/zero of=/mnt/tmpfile bs=128K count=1
# sync
# truncate -s 64K /mnt/tmpfile
root 5 inode 257 errors 400
This is because of the wrong if condition, which is used to check if we should
subtract the bytes of the dropped range from i_blocks/i_bytes of i-node or not.
When we truncate a compressed extent, btrfs substracts the bytes of the whole
extent, it's wrong. We should substract the real size that we truncate, no
matter it is a compressed extent or not. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
A user reported a problem where things like open with O_CREAT would take up to
30 seconds when he had nfs activity on the same mount. This is because all of
our quick metadata operations, like create, symlink etc all do
btrfs_end_transaction_throttle, which if the transaction is blocked will wait
for the commit to complete before it returns. This adds a ridiculous amount of
latency and isn't really needed. The normal btrfs_end_transaction will mark the
transaction as blocked and wake the transaction kthread up if it thinks the
transaction needs to end (this being in the running out of global reserve space
scenario), and this is all that is really needed since we've already done
everything we're going to do, we just need to return. This should help people
with the latency they were seeing when using synchronous heavy workloads.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Recognize BTRFS_BALANCE_RESUME flag passed from userspace. We use the
same heuristics used when recovering balance after a crash to try to
start where we left off last time.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Implement an ioctl for canceling restriper. Currently we wait until
relocation of the current block group is finished, in future this can be
done by triggering a commit. Balance item is deleted and no memory
about the interrupted balance is kept.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Implement an ioctl for pausing restriper. This pauses the relocation,
but balance is still considered to be "in progress": balance item is
not deleted, other volume operations cannot be started, etc. If paused
in the middle of profile changing operation we will continue making
allocations with the target profile.
Add a hook to close_ctree() to pause restriper and free its data
structures on unmount. (It's safe to unmount when restriper is in
"paused" state, we will resume with the same parameters on the next
mount)
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Since restriper kthread starts involuntarily on mount and can suck cpu
and memory bandwidth add a mount option to forcefully skip it. The
restriper in that case hangs around in paused state and can be resumed
from userspace when it's convenient.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
On mount, if balance item is found, resume balance in a separate
kernel thread.
Try to be smart to continue roughly where previous balance (or convert)
was interrupted. For chunk types that were being converted to some
profile we turn on soft convert, in case of a simple balance we turn on
usage filter and relocate only less-than-90%-full chunks of that type.
These are just heuristics but they help quite a bit, and can be improved
in future.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Introduce a new btree objectid for storing balance item. The reason is
to be able to resume restriper after a crash with the same parameters.
Balance item has a very high objectid and goes into tree of tree roots.
The key for the new item is as follows:
[ BTRFS_BALANCE_OBJECTID ; BTRFS_BALANCE_ITEM_KEY ; 0 ]
Older kernels simply ignore it so it's safe to mount with an older
kernel and then go back to the newer one.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
When doing convert from one profile to another if soft mode is on
restriper won't touch chunks that already have the profile we are
converting to. This is useful if e.g. half of the FS was converted
earlier.
The soft mode switch is (like every other filter) per-type. This means
that we can convert for example meta chunks the "hard" way while
converting data chunks selectively with soft switch.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Profile changing is done by launching a balance with
BTRFS_BALANCE_CONVERT bits set and target fields of respective
btrfs_balance_args structs initialized. Profile reducing code in this
case will pick restriper's target profile if it's available instead of
doing a blind reduce. If target profile is not yet available it goes
back to a plain reduce.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Every caller of do_chunk_alloc() feeds it the reduced allocation
profile, so stop trying to reduce it one more time. Instead check the
validity of the passed profile.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Select chunks which have at least one byte located inside a given
[vstart, vend) virtual address space range.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Select chunks which have at least one byte of at least one stripe
located on a device with devid X in a given [pstart,pend) physical
address range.
This filter only works when devid filter is turned on.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
This allows to have a separate set of filters for each chunk type
(data,meta,sys). The code however is generic and switch on chunk type
is only done once.
This commit also adds a type filter: it allows to balance for example
meta and system chunks w/o touching data ones.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Add basic restriper infrastructure: extended balancing ioctl and all
related ioctl data structures, add data structure for tracking
restriper's state to fs_info, etc. The semantics of the old balancing
ioctl are fully preserved.
Explicitly disallow any volume operations when balance is in progress.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Currently when new chunks are created respective avail_alloc_bits field
is updated to reflect profiles of all chunks present in the system.
However when chunks are removed profile bits are never cleared.
This patch clears profile bit of respective avail_alloc_bits field when
the last chunk with that profile is removed. Restriper needs this to
properly operate when "downgrading".
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Right now on-disk BTRFS_BLOCK_GROUP_* profile bits are used for
avail_{data,metadata,system}_alloc_bits fields, which gather info about
available allocation profiles in the FS. When chunk is created or read
from disk, its profile is OR'ed with the corresponding avail_alloc_bits
field. Since SINGLE is denoted by 0 in the on-disk format, currently
there is no way to tell when such chunks become avaialble. Restriper
needs that information, so add a separate bit for SINGLE profile.
This bit is going to be in-memory only, it should never be written out
to disk, so it's not a disk format change. However to avoid remappings
in future, reserve corresponding on-disk bit.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Chunk's type and profile are encoded in u64 flags field. Introduce
masks to easily access them. Also fix the type of BTRFS_BLOCK_GROUP_*
constants, it should be ULL.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
In function ieee80211_tx_h_encrypt the var info was
initialized from tx->skb, since the fucntion
is called after the function ieee80211_tx_h_fragment
tx->skb is not valid anymore.
Signed-off-by: Yoni Divinsky <yoni.divinsky@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fix the following build warning:
drivers/net/wireless/iwlwifi/iwl-scan.c: In function ‘iwlagn_request_scan’:
drivers/net/wireless/iwlwifi/iwl-scan.c:572: warning: ‘cmd_len’ may be used uninitialized in this function
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We may leak the 'fwd_skb' we skb_copy() in ieee80211_rx_h_mesh_fwding() if
we take the 'else' branch in the 'if' statement just below. If we take
that branch we'll end up returning from the function and since we've not
assigned 'fwd_skb' to anything at that point, we leak it when the variable
goes out of scope.
The simple fix seems to be to just kfree_skb(fwd_skb); just before we
return. That is what this patch does.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Documentation states that the KeyMiss flag is only valid if RxFrameOK is
unset, however empirical evidence has shown that this is false.
When KeyMiss is set (and RxFrameOK is 1), the hardware passes a valid frame
which has not been decrypted. The driver then falsely marks the frame
as decrypted, and when using CCMP this corrupts the rx CCMP PN, leading
to connection hangs.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This clears the currently mapped core when suspending, to force
re-mapping after resume. Without that we were touching default core
registers believing some other core is mapped. Such a behaviour
resulted in lockups on some machines.
Cc: stable@vger.kernel.org
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Tracepoints are disabled for tainted modules, which is usually because the
module is either proprietary or was forced, and we don't want either of them
using kernel tracepoints.
But, a module can also be tainted by being in the staging directory or
compiled out of tree. Either is fine for use with tracepoints, no need
to punish them. I found this out when I noticed that my sample trace event
module, when done out of tree, stopped working.
Cc: stable@vger.kernel.org # 3.2
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Dave Jones <davej@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The __iomem annotation is to be used together with pointers used
in iowrite32() but not for pointers returned by kzalloc().
For more details see [1] and [2].
This patch will remove the following sparse warning (i.e. when
copiling with "make C=1"):
* warning: incorrect type in assignment (different address spaces)
References:
[1] A new I/O memory access mechanism (Sep 15, 2004)
http://lwn.net/Articles/102232/
[2] Being more anal about iospace accesses (Sep 15, 2004)
http://lwn.net/Articles/102240/
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch will remove the following sparse warning ("make C=1"):
* warning: Using plain integer as NULL pointer
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The __iomem annotation is to be used together with pointers used
as iowrite32() parameter. For more details see [1] and [2].
This patch will remove the following sparse warnings ("make C=1"):
* warning: incorrect type in assignment (different address spaces)
* warning: incorrect type in argument 1 (different address spaces)
* warning: incorrect type in argument 2 (different address spaces)
References:
[1] A new I/O memory access mechanism (Sep 15, 2004)
http://lwn.net/Articles/102232/
[2] Being more anal about iospace accesses (Sep 15, 2004)
http://lwn.net/Articles/102240/
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch will remove the following sparse warning ("make C=1"):
* warning: Using plain integer as NULL pointer
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Fix platform removal by freeing the platform DAPM resources and remove
it from the DAPM list.
Signed-off-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Fixes a NULL pointer dereference in dapm_power_widgets() if the dapm context
has no codec.
Signed-off-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
BSYM macro is only needed for assembly files and its usage in c files is
wrong, so only define it for assembly.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Dave Martin <dave.martin@linaro.org>
BSYM macro is only needed for assembly files and its usage in c files is
wrong, so remove it. The linker will correctly set bit 0 for Thumb2
kernels.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Dave Martin <dave.martin@linaro.org>
This fixes bug introduced by multi-frontend to single-frontend change.
Finally HAS_LOCK is got back!
We are not allowed to access hardware in sleep mode...
Chip did not like when .get_frontend() reads some registers while
chip was sleeping and due to that HAS_LOCK bit was never gained.
TODO: We should add logic for dvb-core to drop out illegal calls like that.
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Fix yet another bug introduced be recent cxd2820r multi-frontend to
single-frontend change.
Finally, we have at least almost working picture for DVB-C too.
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
As pointed out by 'make versioncheck', there's no need for
drivers/media/dvb/frontends/tda18271c2dd.c to
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Fix another bug introduced by recent multi-frontend to single-frontend
change.
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
cxd2820r implements only one frontend currently which
handles all the standards.
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The caller doesn't check the return value of check_firmware() but static
checkers complain. It currently returns negative error codes, or zero
or greater on success but since the return type is boolean the values
are truncated to one or zero. I've changed it to return an int,
negative on error and zero on success.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
BSYM macro is only needed for assembly files and its usage in c files is
wrong, so remove it. The linker will correctly set bit 0 for Thumb2
kernels.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Dave Martin <dave.martin@linaro.org>
Cc: Kukjin Kim <kgene.kim@samsung.com>
These were initialized twice by mistake. They were defined the same way
both times so this doesn't change how the code works.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Once the ENDPROC is in place, BSYM() in not longer necessary
to get correct pointer to versatile_secondary_startup().
Tested-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Dave Martin <dave.martin@linaro.org>
In xc4000 chipsets real signal and noise level is stored in register
0x0A and 0x0B,so we can use those registers to monitor signal strength.
I tested this patch on 2 different cards Leadtek DVR3200 and DTV2000H
Plus, both with same results, I used special antenna hubs (toner 4x, 6x,
8x and 12x) with mesured signal lost, both registers are in dB value,
first represent signal with limit value -113.5dB (should be -114dB) and
exactly match with test results. Second represents noise level also in
dB and there is no maximum value, but from tests we can drop everything
above 32dB which tuner realy can't use, signal was usable till 20dB
noise level.
In digital mode we can take signal strength but sadly noise level is not
relevant and real value is stored in demodulator for now just zl10353,
also digital mode is just for testing, because it needs changing other
parts of code which reads data only from demodulator.
In analog mode I was able to test only FM radio, signal level is not
important, it says something about cable and hub losts, but nothing
about real quality of reception, so even if we have signal level at
minimum 113dB we can still here radio, because of that it is displaied
only in debug mode, but for real signal level is used noise register
which is again very accurate, radio noise level was betwen 6-20dB for
good signal, 20-25dB for medium signal, and above 25dB signal is
unusable.
For now real benefit of this patch is only for FM radio mode.
Signed-off-by: Miroslav Slugen <thunder.mmm@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
All radio tuners in cx88 driver using same address for radio and tuner,
so there is no need to probe it twice for same tuner and we can use
radio_type UNSET, this also fix broken radio since kernel 2.6.39-rc1
for those tuners.
Cc: stable@kernel.org
Signed-off-by: Miroslav Slugen <thunder.mmm@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
All radio tuners in cx23885 driver using same address for radio and
tuner, so there is no need to probe it twice for same tuner and we can
use radio_type UNSET.
Be aware radio support in cx23885 is not yet committed, so this is only
minor fix for future support.
Signed-off-by: Miroslav Slugen <thunder.mmm@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
"dev" is NULL here so we should use "nr" instead of "dev->devno".
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This reduces our module init to a simple usb_register() call, so
that we can make use of the new upcoming macro's for this.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The code for this is rather crufty, and being able to tie a device
to a specific minor is not really something we want to support in
a modern udev based world.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The pwc driver used to:
1. kmalloc a buffer
2. memcpy data to send over usb there
3. do the usb_control_msg call (which does not work with data on the stack)
4. free the buffer
For every usb command send. This patch changes the code to instead malloc
a buffer for this purpose once and use it everywhere.
[mchehab@redhat.com: Fix a compilation breakage with allyesconfig:
drivers/media/video/pwc/pwc-ctrl.c: In function ‘pwc_get_cmos_sensor’:
drivers/media/video/pwc/pwc-ctrl.c:546:3: warning: passing argument 4 of ‘recv_control_msg’ makes integer from pointer without a cast [en$
drivers/media/video/pwc/pwc-ctrl.c:107:12: note: expected ‘int’ but argument is of type ‘unsigned char *’
drivers/media/video/pwc/pwc-ctrl.c:546:3: error: too many arguments to function ‘recv_control_msg’
drivers/media/video/pwc/pwc-ctrl.c:107:12: note: declared here]
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Since we always do a set_video_mode on stream start, there is no need
to actually send the mode info to the device on a s_fmt / s_parm ioctl.
Not doing this saves us doing (slow) usb io.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Before this patch various code in the mode setting patch checked
pdev->pixfmt, but that was not set until the mode setting succeeded, so
it was looking at the old pixfmt! This patch fixes this by making the
pixfmt a parameter to set_video_mode, and setting it from set_video_mode
on success.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This patch partially reverts:
3d058d7 netfilter: rework user-space expectation helper support
that was applied during the 3.2 development cycle.
After this patch, the tree remains just like before patch bc01bef,
that initially added the preliminary infrastructure.
I decided to partially revert this patch because the approach
that I proposed to resolve this problem is broken in NAT setups.
Moreover, a new infrastructure will be submitted for the 3.3.x
development cycle that resolve the existing issues while
providing a neat solution.
Since nobody has been seriously using this infrastructure in
user-space, the removal of this feature should affect any know
FOSS project (to my knowledge).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes this warning when CONFIG_IP6_NF_IPTABLES is not enabled:
net/netfilter/xt_hashlimit.c: In function ‘hashlimit_init_dst’:
net/netfilter/xt_hashlimit.c:448:9: warning: unused variable ‘frag_off’ [-Wunused-variable]
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Instead of messing around with id's it's much easier to just compare
against a filehandle pointer.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Andy Walls <awalls@md.metrocast.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
For some reason the cx18 driver could open the radio device only once.
Remove this limitation.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Andy Walls <awalls@md.metrocast.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
For some reason the /dev/radio device was implemented as an exclusive open:
you could open it only once and not a second time.
Remove this limitation.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Andy Walls <awalls@md.metrocast.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
As per the feature removal document, make the tuner type check more strict
so that it is no longer possible to set the radio frequency through a video
node or the TV frequency through a radio node.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
As Rupert pointed out, the phrase "It is good practice" should be replaced
with "You must".
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Rupert Eibauer <Rupert.Eibauer@ces.ch>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Update the spec to the behavior implemented by the control framework.
This should have been documented long ago but for some reason it was
never done.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The omap_vout driver has an output overlay, but never advertised that
capability.
The driver should also set the V4L2_FBUF_FLAG_OVERLAY flag.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
CC: Archit Taneja <archit@ti.com>
CC: Vaibhav Hiremath <hvaibhav@ti.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The zoran driver does not support this flag, so don't set it.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The error handling in the original code wasn't complete so static
checkers complained about a potential NULL deference.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The two DACs for the front output and the surround/center/LFE/back
outputs are wired up out of phase, so when channels are duplicated,
their sound can cancel out each other and result in a weaker bass
response. To fix this, reverse the polarity of the neutron flow to
the front output.
Reported-any-tested-by: Daniel Hill <daniel@enemyplanet.geek.nz>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Cc: 2.6.34+ <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
- The maximum number of playback streams depend on the number of sample
rate conveters (16) and the number of DMA channels (32).
Signed-off-by: Raymond Yau <superquad.vortex2@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
- Check SDAC bit of AC97 codec for supporting 4 channels playback.
Signed-off-by: Raymond Yau <superquad.vortex2@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Move the definition of the global variable fcoe_debug_logging
from fcoe.h to fcoe.c. Avoid that sparse complains about missing
declarations for local functions or variables by declaring these
static.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Avoid that sparse complains about missing declarations for local
functions by declaring these static or by adding an #include directive.
Add the __percpu annotation where it is missing.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This is a regression introduced by commit
1ff9918b62 The else statement here is breaking
the initiator logic of allocating xid from the offloaded em xid pool for READ
I/O only to use DDP, as shown by the snippet of trace below, where the WRITE
is using xid 0x5 from the offloaded em xid pool:
Protocol VID Len S_ID D_ID OX_ID RX_ID Summary
..
*FCP 228 96 0b.08.01 -> 01.0f.00 0x0005 0xffff SCSI: Write(10) LUN: 0x00
FCP 228 76 01.0f.00 -> 0b.08.01 0x0005 0x828d XFER_RDY
...
The bug is in the else statement, for both initiator and target, the
new command will have FC frame header bit 23 (FC_FC_EX_CTX) cleared as it was
originated from the initiator. Also, this is assuming the frame header is
already filled up, which is only true for target since for initiator, this is a
new frame and oem_match gets called when em tries get xid for this i/o before
it is filled up and sent out.
The fix is to check if there is a fc_fcp_pkt associated w/ this frame from
fr_fsp(fp), since fr_fsp(fp) is NULL for tcm_fc target and non-I/O frame in
initiator. This should also return true for target only if it is an
FC_RCTL_DD_UNSOL_CMD and rx_id is not allocated.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This very noisy sparse warning appears on almost every file in
the kernel:
CHECK init/main.c
arch/x86/include/asm/thread_info.h:43:55: error: dubious one-bit
signed bitfield arch/x86/include/asm/thread_info.h:44:46: error:
dubious one-bit signed bitfield
Sparse is right and this patch changes sig_on_uaccess_error and
uaccess_err flags to unsigned type and thus fixes the warning.
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
Acked-by: Andy Lutomirski <luto@mit.edu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Dan Carpenter <error27@gmail.com>
Link: http://lkml.kernel.org/r/20120111011146.GA30428@oksana.dev.rtsoft.ru
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In case of FW hung ISP82xx generates continuous pause frames
which causes switch to disable port.
Added fix to disable generating pause frames in case of
FW hung
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
There's a zero day mistake in the megaraid driver in that the code that
obtains the version number does a >> 8 on a char quantity. This >>8 causes a
sparse warning because it always produces zero. Al Viro suggested these
shifts should be >> 4 thus treating the firmware version as a BCD quantity.
However, in the interests of safety we've elected to replace the >> 8
quantities with an explicit zero, thus quieting the sparse warning while
preserving the same (albeit incorrect) version number as had previously been
seen.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The sdev is deleted from starved list and then try to dispatch from this
device. It's quite possible the sdev can't eventually dispatch a request,
then the sdev will be in starved list tail. This isn't fair.
There are two cases here:
1. unplug path. scsi_request_fn() calls to scsi_target_queue_ready(), then
the dev is removed from starved list, but quite possible host queue isn't
ready, the dev is moved to starved list without dispatching any request.
2. scsi_run_queue path. It deletes the dev from starved list first (both
global and local starved lists), then handles the dev. Then we could have
the same process like case 1.
This patch fixes the first case. Case 2 isn't fixed, because there is a
rare case scsi_run_queue finds host isn't busy but scsi_request_fn finds
host is busy (other CPU is faster to get host queue depth). Not deleting
the dev from starved list in scsi_run_queue will keep scsi_run_queue
looping (though this is very rare case, because host will become busy).
Fortunately fixing case 1 already gives big improvement for starvation in
my test. In a 12 disk JBOD setup, running file creation under EXT4, this
gives 12% more throughput.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When expander connected in x2 or x4 mode and with IO runnning, if
a cable from wideport is plugged out from the phy, IO's start failing
on all the targets.
Observed that when cable is pulled with IO running, cominit is
happening on all the links and IO's start dropping to 0 and eventually
the whole IO fails. Second observation, target is trying to open and
SCU is responding with "Open reject no destination".
A cause of the problem is when the port went from the "ready
configuring substate" back to "ready configuring substate" as a result
of phy being pulled off, scic suspended the port task scheduler
register. As a result no IO was allowed and in the "substate
configuring enter" routine the IO never goes back to 0. As a result
the port never comes out of "ready substate configuring".
The patch adds a mechanism of activate and deactivate phy when a port
link up, which fixes the problem.
Signed-off-by: Bartek Nowakowski <bartek.nowakowski@intel.com>
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Marcin Tomczak <marcin.tomczak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When the first phy of a wide port comes up, don't report the port ready
yet, always wait for 250 miliseconds then config the port with all phys
added to the port. So that we can avoid reporting wide port device too
early to kernel, which caused the first IOs (report luns, inquirys)
failed due to not all the phys are configured into its port. Changes
also made that the phys in a wide port don't need to go through half
second wait time for consuming power.
Signed-off-by: Marcin Tomczak <marcin.tomczak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When hot insert the wide port device through the mini-sas port,
the first IOs (Report Luns or Inquiry) may fail due to the device
trying to open to a SCU phy that is still in suspended state. This
IO failure causes the wide port device stuck in UPDATING_PORT_WIDTH
state.
Signed-off-by: Marcin Tomczak <marcin.tomczak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Failure seen pulling a cable from a x4 port configured in manual port
configuration (MPC) mode (MPC mode is set by the the OEM paramaters
provided by the platform or isci_firmware.bin). While IO running to
devices behind and expander, plugging out the cable from phy is causing
IO failures and IO drops on disks and never recover.
It happens because during link up/down the phy were being taken out of
the port.
Fix: during link down the phy is kept in the same logical port.
Signed-off-by: Marcin Tomczak <marcin.tomczak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Bump the version now that the driver has atapi support and the initial
round of hotplug fixes. The EXPERIMENTAL tag should have been removed a
while back. While we're here also kill the "select SCSI_SAS_HOST_SMP"
as the build error was separately fixed by commit d962480e "[SCSI]
libsas: fix try_test_sas_gpio_gp_bit() build error".
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
As the field was never set, isci_print_tmf() using 'isci_tmf->device'
sometimes causes a kernel crash if the dev_dbg() statement is enabled.
Remove the unused field both from isci_tmf struct definition and from
isci_print_tmf()
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
"No task timeout timer reduced from 20 to 2 This timer controls how
long the SCU hardware will hold open the TX side of the connection
before sending a DONE. The timer allows the hardware to attempt to
optimize the DONE/CLOSE behavior to allow for new COMMAND IU to be
posted. In practice closing the connection quicker is better."
Signed-off-by: Marcin Tomczak <marcin.tomczak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
v1.3 allows the attenuation of the attached cables to be specified to
the driver in terms of 'short', 'medium', and 'long' (see probe_roms.h).
These settings (per phy) are retrieved from the platform oem-parameters
(BIOS rom) or via a module parameter override.
Reviewed-by: Jiangbi Liu <jiangbi.liu@intel.com>
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
v1.1 allows finer grained tuning of the SSC (spread-spectrum-clocking)
settings for SAS and SATA. See notes in probe_roms.h
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
C1 silicon requires updates to the phy tuning recipe and also support
for user provided cable selects (per-phy) for short, medium, and long
cables. Default to 'short' awaiting support for selecting the cable via
oem parameters.
Reviewed-by: Jiangbi Liu <jiangbi.liu@intel.com>
Signed-off-by: Marcin Tomczak <marcin.tomczak@intel.com>
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Before updating the code to support the latest platform updates and
silicon revision cleanup some of the long deref chains.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This code has never worked correctly, doesn't disable interrupts when
set as a module parameter, doesn't disable interrupts when set after
driver load time in sysfs node, etc.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Mask off flags in the ioctl path to prevent memory scribble with older MegaCLI
versions.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The following patch for megaraid_sas fixes the reglockFlags field for
degraded raid5/6 for MR9360/9380, which will result in a performance
improvement.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The arch/x86/lib/x86-opcode-map.txt file [used by the
kprobes instruction decoder] contains the line:
af: SCAS/W/D/Q rAX,Xv
This is what the Intel manuals show, but it's not correct.
The 'X' stands for:
Memory addressed by the DS:rSI register pair (for example, MOVS, CMPS, OUTS, or LODS).
On the other hand 'Y' means (also see the ae byte entry for
SCASB):
Memory addressed by the ES:rDI register pair (for example, MOVS, CMPS, INS, STOS, or SCAS).
Signed-off-by: Ulrich Drepper <drepper@gmail.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/CAOPLpQfytPyDEBF1Hbkpo7ovUerEsstVGxBr%3DEpDL-BKEMaqLA@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Don't call kthread_stop with a spin lock held and interrupts
disabled because kthread_stop will sleep waiting for the thread
to stop.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch fixes a bug where devloss is not called on fc_host teardown.
The issue is seen if the LLDD uses rport_rolechg to add the target role
to an rport.
When an rport goes away, the LLDD will call fc_remote_port_delete, which
will start the devloss timer. If the timer expires, the transport will
call the devloss callback and set the FC_RPORT_DEVLOSS_CALLBK_DONE flag.
However, the rport structure is not deleted, it is retained to store the
SCSI id mappings for the rport in case it comes back. In the scenario
where it does come back, and the driver calls fc_remote_port_add, but does
not indicate the "target" role for the rport - the create will clear the
structure, but forgets to clear FC_RPORT_DEVLOSS_CALLBK_DONE flag (which
is cleared if it's added with the target role). The secondary call, of
fc_remote_port_rolechg to add the target role also does not clear the flag.
Thus, the next time the rport goes away, the resulting devloss timer
expiration will not call the driver callback as the flag is still set.
This patch adds the FC_RPORT_DEVLOSS_CALLBK_DONE flags to the list of
those that are cleared upon reuse of the rport structure.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The changeset 240ab508aa is incomplete, as the first thing that
happens at cache clear is to do a memset with 0 to the cache.
So, the delivery system needs to be explicitly preserved there.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
As reported by Lawrence[1], MythTV 0.24.1 does the wrong thing
with a DVBv5 call: it fills the delivery system with
SYS_UNDEFINED, expecting that the DVB core would work with that.
This used to work by accident, as the DVB core were missing the
check for the supported delivery systems. Yet, fixing it
is easy, so let's add a logic to handle this case, to
provide backward compatibility.
[1] http://patchwork.linuxtv.org/patch/8314/
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
When userspace attempts to authorize a station
that is already authorized, nothing happens as
you'd expect. Similarly, when it unauthorizes
a station that is associated, nothing happens.
However, when it unauthorizes a station that
isn't even associated yet, we erroneously try
to move the station to associated. This seems
to happen occasionally as a result of a race
when wpa_supplicant attempts to unauthorize
the port in managed mode. Particularly with my
new patches to keep stations, it can then move
a station into ASSOCIATED state before we have
really associated, which is really confusing.
I introduced this bug in
"mac80211: refactor station state transitions"
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Same devices can generate interrupt without properly setting bit in
INT_SOURCE_CSR register (spurious interrupt), what will cause IRQ line
will be disabled by interrupts controller driver.
We discovered that clearing INT_MASK_CSR stops such behaviour. We
previously first read that register, and then clear all know interrupt
sources bits and do not touch reserved bits. After this patch, we write
to all register content (I believe writing to reserved bits on that
register will not cause any problems, I tested that on my rt2800pci
device).
This fix very bad performance problem, practically making device
unusable (since worked without interrupts), reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=658451
We previously tried to workaround that issue in commit
4ba7d99978 "rt2800pci: handle spurious
interrupts", but it was reverted in commit
82e5fc2a34
as thing, that will prevent to detect real spurious interrupts.
Reported-and-tested-by: Amir Hedayaty <hedayaty@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This is basically just a cleanup. Large positive numbers get counted as
negative but then get implicitly cast to positive again for the checks
that matter.
This does make a small difference in ipw_handle_promiscuous_rx() when we
test "if (unlikely((len + IPW_RX_FRAME_SIZE) > skb_tailroom(rxb->skb)))"
It should return there, but we don't return until a couple lines later
when we test "if (len > IPW_RX_BUF_SIZE - sizeof(struct ipw_rt_hdr)) {".
The difference is that in the second test the sizeof() means that there
is an implied cast to unsigned.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
An Oops has once been observed, when the SDIO card had been ejected during
IO. The PC value shows, that the dev pointer in b43_op_stop() was NULL.
(I moved the NULL check before the lock, based upon a suggestion from
Julian Calaby <julian.calaby@gmail.com>. -- JWL)
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
By adding some module aliases, programs (or users) won't have to explicitly
call modprobe. Vhost-net will always be available if built into the kernel.
It does require assigning a permanent minor number for depmod to work.
Also:
- use C99 style initialization.
- add missing entry in documentation for loop-control
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-By: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
.. and the just as dead bhv_desc forward declaration while we're at it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Replace the nasty if, else if, elseif condition with more natural C flow
that expressed the logic we want here better.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
This wrapper isn't overly useful, not to say rather confusing.
Around the call to xfs_itruncate_extents it does:
- add tracing
- add a few asserts in debug builds
- conditionally update the inode size in two places
- log the inode
Both the tracing and the inode logging can be moved to xfs_itruncate_extents
as they are useful for the attribute fork as well - in fact the attr code
already does an equivalent xfs_trans_log_inode call just after calling
xfs_itruncate_extents. The conditional size updates are a mess, and there
was no reason to do them in two places anyway, as the first one was
conditional on the inode having extents - but without extents we
xfs_itruncate_extents would be a no-op and the placement wouldn't matter
anyway. Instead move the size assignments and the asserts that make sense
to the callers that want it.
As a side effect of this clean up xfs_setattr_size by introducing variables
for the old and new inode size, and moving the size updates into a common
place.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
CONFIG_SND_COMPRESS_OFFLOAD is an item to be selected by the dirver
just like CONFIG_SND_PCM, and no need to prompt for explicit
selection.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Several users have reported "choppy" audio under the 3.2 kernel,
and that changing position_fix to 1 has resolved their problem.
The chip is an nVidia Corporation MCP89 High Definition Audio,
[10de:0d94] (rev a2).
Cc: stable@kernel.org (v3.2+)
BugLink: https://bugs.launchpad.net/bugs/909419
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
With git commit 0706802183
"xen-balloon: convert sysdev_class to a regular subsystem" we would
end up with the attributes being put in:
/sys/devices/xen_memory0/target_kb
instead of
/sys/devices/system/xen_memory/xen_memory0/target_kb
Making the tools inable to deflate the kernel to make more space
for launching another guest and printing:
Error: Failed to query current memory allocation of dom0
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Suggested-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* commit '070680218379e15c1901f4bf21b98e3cbf12b527': (50 commits)
xen-balloon: convert sysdev_class to a regular subsystem
clocksource: convert sysdev_class to a regular subsystem
ibm_rtl: convert sysdev_class to a regular subsystem
edac: convert sysdev_class to a regular subsystem
rtmutex-tester: convert sysdev_class to a regular subsystem
driver-core: implement 'sysdev' functionality for regular devices and buses
kref: fix up the kfree build problems
kref: Remove the memory barriers
kref: Implement kref_put in terms of kref_sub
kref: Inline all functions
Drivers: hv: Get rid of an unnecessary check in hv.c
Drivers: hv: Make the vmbus driver unloadable
Drivers: hv: Fix a memory leak
Documentation: Update stable address
MAINTAINERS: stable: Update address
w1: add fast search for single slave bus
driver-core: skip uevent generation when nobody is listening
drivers: hv: Don't OOPS when you cannot init vmbus
firmware: google: fix gsmi.c build warning
drivers_base: make argument to platform_device_register_full const
...
Framebuffer driver needs to fetch the video data during the rising
edge of the VCLK. Otherwise, there are some glitches in the LCD
display.
Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
0 is a valid IRQ but the interrupt line for the pca935x certainly isn't
connected there which causes it to fail to probe.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Rather than letting them get allocated dynamically where we don't know
where they are, and also name the data line resource as gpio-generic
requires that. Without these changes the GPIOs are useless.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
The second MMC slot was only present on revision 1 of Cragganmore so
remove it. Only MMC0 and MMC2 are there on production systems.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
The function declarations in plat/s3c64xx-spi.h rely on struct
platform_device but it's not declared by the header causing compiler
warnings if it is included prior to another header which does that.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
The number of submission & completion queues should be set by calling
Set Features, not Get Features.
Reported-by: Kwok Kong <Kwok.Kong@idt.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Init topology subsystem by cpu registration.
Microblaze Linux kernel is fauling by
"Oops: kernel access of bad area, sig: 11"
because cpu is not initialized.
Signed-off-by: Michal Simek <monstr@monstr.eu>
The correct lock order is uuid_mutex -> volume_mutex -> chunk_mutex,
but when we mount a filesystem which has backing seed devices, we have
this lock chain:
open_ctree()
lock(chunk_mutex);
read_chunk_tree();
read_one_dev();
open_seed_devices();
lock(uuid_mutex);
and then we hit a lockdep splat.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
A bug was triggered while using seed device:
# mkfs.btrfs /dev/loop1
# btrfstune -S 1 /dev/loop1
# mount -o /dev/loop1 /mnt
# btrfs dev add /dev/loop2 /mnt
btrfs: block rsv returned -28
------------[ cut here ]------------
WARNING: at fs/btrfs/extent-tree.c:5969 btrfs_alloc_free_block+0x166/0x396 [btrfs]()
...
Call Trace:
...
[<f7b7c31c>] btrfs_cow_block+0x101/0x147 [btrfs]
[<f7b7eaa6>] btrfs_search_slot+0x1b8/0x55f [btrfs]
[<f7b7f844>] btrfs_insert_empty_items+0x42/0x7f [btrfs]
[<f7b7f8c1>] btrfs_insert_item+0x40/0x7e [btrfs]
[<f7b8ac02>] btrfs_make_block_group+0x243/0x2aa [btrfs]
[<f7bb3f53>] __btrfs_alloc_chunk+0x672/0x70e [btrfs]
[<f7bb41ff>] init_first_rw_device+0x77/0x13c [btrfs]
[<f7bb5a62>] btrfs_init_new_device+0x664/0x9fd [btrfs]
[<f7bbb65a>] btrfs_ioctl+0x694/0xdbe [btrfs]
[<c04f55f7>] do_vfs_ioctl+0x496/0x4cc
[<c04f5660>] sys_ioctl+0x33/0x4f
[<c07b9edf>] sysenter_do_call+0x12/0x38
---[ end trace 906adac595facc7d ]---
Since seed device is readonly, there's no usable space in the filesystem.
Afterwards we add a sprout device to it, and the kernel creates a METADATA
block group and a SYSTEM block group where comes free space we can reserve,
but we still get revervation failure because the global block_rsv hasn't
been updated accordingly.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
There are various bugs in block group trimming:
- It may trim from offset smaller than user-specified offset.
- It may trim beyond user-specified range.
- It may leak free space for extents smaller than specified minlen.
- It may truncate the last trimmed extent thus leak free space.
- With mixed extents+bitmaps, some extents may not be trimmed.
- With mixed extents+bitmaps, some bitmaps may not be trimmed (even
none will be trimmed). Even for those trimmed, not all the free space
in the bitmaps will be trimmed.
I rewrite btrfs_trim_block_group() and break it into two functions.
One is to trim extents only, and the other is to trim bitmaps only.
Before patching:
# fstrim -v /mnt/
/mnt/: 1496465408 bytes were trimmed
After patching:
# fstrim -v /mnt/
/mnt/: 2193768448 bytes were trimmed
And this matches the total free space:
# btrfs fi df /mnt
Data: total=3.58GB, used=1.79GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=205.12MB, used=97.14MB
Metadata: total=8.00MB, used=0.00
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
For btrfs raid, while discarding a range of space, we'll need to know
the start offset and length to discard for each device, and it's done
in btrfs_map_block().
However the calculation is a bit complex for raid0 and raid10, so I
reimplement it based on a fact that:
dev1 dev2 dev3 (raid0)
-----------------------------------
s0 s3 s6 s1 s4 s7 s2 s5
Each device has (total_stripes / nr_dev) stripes, or plus one.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
We pre-allocate a btrfs bio with fixed size, and then may re-allocate
memory if we find stripes are bigger than the fixed size. But this
pre-allocation is not necessary.
Also we don't have to calcuate the stripe number twice.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
If we run into some failure path in io_ctl_prepare_pages(),
io_ctl->pages[] array may have some NULL pointers.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
I got this while running xfstests:
[24256.836098] block group 317849600 has an wrong amount of free space
[24256.836100] btrfs: failed to load free space cache for block group 317849600
We should clamp the extent returned by find_first_extent_bit(),
so the start of the extent won't smaller than the start of the
block group.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
This patch adds a check-condition in scsi_dh_alua handler for a retry.
Sometimes, I have seen attach failing due to this check-condition with
following error messages on NetApp E series storage.
Dec 7 15:31:01 nilgiris kernel: [102979.696673] scsi 3:0:2:9: alua: port group 00 rel port 01
Dec 7 15:31:01 nilgiris kernel: [102979.697082] scsi 3:0:2:9: alua: rtpg failed with 8000002
Dec 7 15:31:01 nilgiris kernel: [102979.697086] scsi 3:0:2:9: alua: rtpg sense code 06/2a/01
Dec 7 15:31:01 nilgiris kernel: [102979.697088] scsi 3:0:2:9: alua: not attached
Signed-off-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch re-implements LUN Masking feature using SCSI Slave Callouts. With
the new design in the slave_alloc entry point; for each new LUN discovered we
check with our internal LUN Masking config whether to expose or to mask this
particular LUN. We return -ENXIO (No such device or address) from slave_alloc
for the LUNs we don't want to be exposed. We also notify the SCSI mid-layer
to do a sequential LUN scan rather than REPORT_LUNS based scan if LUN masking
is enabled on our HBA port, since a -ENXIO from any LUN in REPORT_LUNS based
scan translates to a scan abort. This patch also handles the dynamic lun
masking config change from enable to disable or vice-versa by resetting
sdev_bflags of LUN 0 appropriately.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch reverts the current LUN Masking Implementation. We re-implemented
this feature using the SCSI Slave Callout's as per the review comments.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Patch fixes the possible NULL pointer dereference when we try to add or delete
a rpwwn to the lunmask config which is not zoned to this port. Check if the
FCS rport is not NULL before de-referencing it.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
skb_linearize already has a check for skb_is_nonlinear,
there is no need to duplicate the check in fcoe.c. This
patch simply removes the unnecessary check and calls
skb_linearize unconditionally.
Reported-by: patrick kelle <patrick.kelle81@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Acked-by: patrick kelle <patrick.kelle81@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If ql4xdisablesysfsboot = 0 and sendtargets entry as boot index then
driver does export sendtarget entries in sysfs but iscsistart does not
do discovery. So in this case let driver do the discovery and
login to the targets.
Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
A wrong default timeout value programmed in the adapter causes driver
to wait for that much time while waiting for target discoveries to complete.
This could add huge delays during the driver load time. To avoid this,
limit the default timeout value to 12 seconds if the default timeout value
set in adapter is less than 12 seconds and greater than 120 seconds.
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
In alloc_pdu, libcxgbi tries to allocate a skb with GFP_ATOMIC, which
may potentially fail. When it happens, the current code prints a warning
message.
When the system is under IO stress, this failure may happen lots of
times and it usually scares users.
Instead of printing the warning message, the code now increases the
tx_dropped statistics for the ethernet interface wich is doing the iscsi
task.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Karen Xie <kxie@chelsio.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
sym53c8xx_slave_destroy unconditionally assumes that sym53c8xx_slave_alloc has
succesesfully allocated a sym_lcb. This can lead to a NULL pointer dereference
(exposed by commit 4e6c82b).
Signed-off-by: Stratos Psomadakis <psomas@gentoo.org>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
QUEUE_FLAG_* are flags (other than QUEUE_FLAG_DEFAULT), so they cannot
be ORed together. Set the queue flags using queue_flag_set_unlocked().
Reported-by: Donald Wood <donald.e.wood@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
By using the iod->nents field (the same way other I/O paths do), we can
avoid recalculating the number of sg entries at unmap time, and make
nvme_unmap_user_pages() easier to call.
Also, use the 'write' parameter instead of assuming DMA_FROM_DEVICE.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
We were always mapping as DMA_FROM_DEVICE then unmapping with
DMA_TO_DEVICE which was clearly not correct. Follow the same pattern as
nvme_submit_io() and key off the bottom bit of the opcode to determine
whether this is a read or a write.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
IO_TIMEOUT is a little too generic and might be used by other parts of
the kernel in the future.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The new merged data structure is called nvme_iod. This improves performance
for mid-sized I/Os (in the 16k range) since we save a memory allocation.
It is also a slightly simpler interface to use.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The queue is only needed for some rare occasions, and it's more consistent
to pass the device around.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Upcoming patches require calling get_nvmeq when we don't have a namespace.
Some callers already have the device in a local variable anyway.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Instead of encoding the handler type in the bottom two bits of the
per-completion context pointer, store the handler function as well
as the context pointer. This gives us more flexibility and the code
is clearer. It comes at the cost of an extra 8k of memory per queue,
but this feels like a reasonable price to pay.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Commit 2f0778afa (ARM: 7205/2: sched_clock: allow sched_clock to be
selected at runtime) replaced the picoxcell specific sched_clock() with
a generic runtime selectable version but replaced "unsigned long long
notrace sched_clock(void)" with "unsigned u32 notrace
picoxcell_read_sched_clock(void)" fix this up to return a u32 as
expected.
Cc: Marc Zyngier <Marc.Zyngier@arm.com>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
the latter can be obtained from the former (by looking as ->tree_root)
just as cheaply as we currently are doing the other way round.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
... and don't bother with it after btrfs_fill_super() failure -
->kill_sb() (unlike ->put_super()) will be called even if we
have not got non-NULL ->s_root.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We do not (fortunately) modify ->s_fs_info of superblock on the fly in
btrfs_fill_super(); apparent assignment is a no-op.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
It returns either ERR_PTR(-ve) or sb->s_fs_info. The latter can
be found by caller just as well, TYVM, no need to return it. Just
return -ve or 0...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
close_ctree() uses a weird mix of accesses to root->fs_info and
its value at the beginning of function stored in local variable.
Since ->fs_info *never* changes, let's just use the local variable
to avoid confusion.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
A new helper: btrfs_alloc_root(fs_info); allocates btrfs_root
and sets ->fs_info. All places allocating the suckers converted
to it. At that point we *never* reassign ->fs_info of btrfs_root;
it's set before anyone sees the address of newly allocated
struct btrfs_root and never assigned anywhere else.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
move assignments to ->fs_info in open_ctree() up, to the place
just after the original allocations. Assignment for tree_root
becomes a no-op - we'd obtained fs_info from tree_root->fs_info
in the first place.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
pathname resolution under a global mutex, taken on some paths in ->mount()
is a Bad Idea(tm) - think what happens if said pathname resolution triggers
automount of some btrfs instance and walks into attempt to grab the same
mutex. Deadlock - we are waiting for daemon to finish walking the path,
daemon is waiting for us to release the mutex...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Do *NOT* skip doomed superblocks in btrfs_test_super(); we want
sget() to wait for their shutdown to complete. Since we don't
mutilate ->s_fs_info in ->put_super() anymore (or free what it
used to point to until the superblock is past being findable
by sget()), we can just DTRT there and report a match.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
... and move free_fs_info() to that, out of ->put_super(). Do NOT
set ->s_fs_info to NULL in the latter; we need it for sget() to
be able to see and wait for fs in the middle of umount if we get a
mount/umount race.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We need fs_info and root to live until the moment when the victim
superblock leaves the list, so we need to postpone free_fs_info()
until after ->put_super(). The call is buried in close_ctree(),
though, so we need to lift it into the callers (including
btrfs_put_super()) first.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Parameterize clusters on minimum total size, minimum chunk size and
minimum contiguous size for at least one chunk, without limits on
cluster, window or gap sizes. Don't tolerate any fragmentation for
SSD_SPREAD; accept it for metadata, but try to keep data dense.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
We store the allocation start and length twice in ins, once right
after the other, but with intervening calls that may prevent the
duplicate from being optimized out by the compiler. Remove one of the
assignments.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Since the clustered allocation may be taking extents from a different
block group, there's no point in spin-locking and testing the current
block group free space before attempting to allocate space from a
cluster, even more so when we might refrain from even trying the
cluster in the current block group because, after the cluster was set
up, not enough free space remained. Furthermore, cluster creation
attempts fail fast when the block group doesn't have enough free
space, so the test was completely superfluous.
I've move the free space test past the cluster allocation attempt,
where it is more useful, and arranged for a cluster in the current
block group to be released before trying an unclustered allocation,
when we reach the LOOP_NO_EMPTY_SIZE stage, so that the free space in
the cluster stands a chance of being combined with additional free
space in the block group so as to succeed in the allocation attempt.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The chunk allocation code has tried to keep a pretty tight lid on creating new
metadata chunks. This is partially because in the past the reservation
code didn't give us an accurate idea of how much space was being used.
The new code is much more accurate, so we're able to get rid of some of these
checks.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Btrfs tries to batch extent allocation tree changes to improve performance
and reduce metadata trashing. But it doesn't allocate new metadata chunks
while it is doing allocations for the extent allocation tree.
This commit changes the delayed refence code to do chunk allocations if we're
getting low on room. It prevents crashes and improves performance.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
On platforms, supporting power domains, if the domain, containing a DMAC
instance is powered down, the driver fails to resume correctly. On those
platforms DMAC channels have an additional CHCLR register for clearing
channel buffers. Using this register during runtime resume fixes the
problem.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
There's code in btrfs_get_extent that should never be used. This patch turns
a WARN_ON(1) into a BUG(), hoping we can remove the transaction code from
btrfs_get_extent soon.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
The old backref iteration code could only safely be used on commit roots.
Besides this limitation, it had bugs in finding the roots for these
references. This commit replaces large parts of it by btrfs_find_all_roots()
which a) really finds all roots and the correct roots, b) works correctly
under heavy file system load, c) considers delayed refs.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
This function gets a byte number (a data extent), collects all the leafs
pointing to it and walks up the trees to find all fs roots pointing to those
leafs. It also returns the list of all leafs pointing to that extent.
It does proper locking for the involved trees, can be used on busy file
systems and honors delayed refs.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Now that we may be holding back delayed refs for a limited period, we
might end up having no runnable delayed refs. Without this commit, we'd
do busy waiting in that thread until another (runnable) ref arives.
Instead, we're detecting this situation and use a waitqueue, such that
we only try to run more refs after
a) another runnable ref was added or
b) delayed refs are no longer held back
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
When processing a delayed ref, first check if there are still old refs in
the process of being added. If so, put this ref back to the tree. To avoid
looping on this ref, choose a newer one in the next loop.
btrfs_find_ref_cluster has to take care of that.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Sequence numbers are needed to reconstruct the backrefs of a given extent to
a certain point in time. The total set of backrefs consist of the set of
backrefs recorded on disk plus the enqueued delayed refs for it that existed
at that moment.
This patch also adds a list that records all delayed refs which are
currently in the process of being added.
When walking all refs of an extent in btrfs_find_all_roots(), we freeze the
current state of delayed refs, honor anythinh up to this point and prevent
processing newer delayed refs to assert consistency.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
This patch adds the possibilty to read-lock an extent even if it is already
write-locked from the same thread. btrfs_find_all_roots() needs this
capability.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
The commit 49920bc (dmaengine: add new enum dma_transfer_direction)
changes the type of parameter 'direction' of device_prep_dma_cyclic
from dma_data_direction to dma_transfer_direction.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Two issues are fixed:
1. DMA descriptors are reused so when freeing lli structures
that are linked to them, the pointer must be nulled.
2. midc_scan_descriptors() must be called with the
channel lock held.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
The driver gpmi-nand should compile at least. This patch adds the
missing gpmi-nand.h to fix the compile error below.
CC drivers/mtd/nand/gpmi-nand/gpmi-nand.o
CC drivers/mtd/nand/gpmi-nand/gpmi-lib.o
drivers/mtd/nand/gpmi-nand/gpmi-nand.c:25:33: fatal error: linux/mtd/gpmi-nand.h: No such file or directory
drivers/mtd/nand/gpmi-nand/gpmi-lib.c:21:33: fatal error: linux/mtd/gpmi-nand.h: No such file or directory
This header is grabbed from patch below, which has not been postponed
for merging.
[PATCH v8 1/4] ARM: mxs: add GPMI-NAND support for imx23/imx28
http://permalink.gmane.org/gmane.linux.drivers.mtd/37338
Signed-off-by: Huang Shijie <b32955@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Before dma_transfer_direction was introduced to replace
dma_data_direction, some dmaengine device uses DMA_NONE of
dma_data_direction for some talk with its client drivers.
The mxs-dma and its clients mxs-mmc and gpmi-nand are such case.
This patch adds DMA_TRANS_NONE to dma_transfer_direction and
migrate the DMA_NONE use in mxs-dma to it.
It also fixes the compile warning below.
CC drivers/dma/mxs-dma.o
drivers/dma/mxs-dma.c: In function ‘mxs_dma_prep_slave_sg’:
drivers/dma/mxs-dma.c:420:16: warning: comparison between ‘enum dma_transfer_direction’ and ‘enum dma_data_direction’
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
This is how the original Freescale code (unintentionally) worked,
because the code path which would have asserted the CLKGATE bit was
never actually reached in their code.
This fixes the nefarious "DMA timout" bug when multiple DMA channels
(e.g. GPMI NAND and MMC) are used at the same time.
If a better fix for this problem should be found, the clkgate handling
could be reinstated.
See http://lists.infradead.org/pipermail/linux-arm-kernel/2011-September/065228.html
Also reverse the order of mxs_dma_disable_chan() and
mxs_dma_reset_chan() in mxs_dma_control() because mxs_dma_reset_chan()
can only work when the DMA channel is enabled.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Using a static variable for counting the number of CCWs attached to
a DMA channel when appending a new descriptor is not multi user safe.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
There is no need to have the clock enabled all the time the driver is
loaded.
It will be enabled anyway in mxs_dma_alloc_chan_resources() when a
channel is actually going to be used.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
For consistent backref walking and (later) qgroup calculation the
information to which root a delayed ref belongs is useful even for shared
refs.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Add a for_cow parameter to add_delayed_*_ref and pass the appropriate value
from every call site. The for_cow parameter will later on be used to
determine if a ref will change anything with respect to qgroups.
Delayed refs coming from relocation are always counted as for_cow, as they
don't change subvol quota.
Also pass in the fs_info for later use.
btrfs_find_all_roots() will use this as an optimization, as changes that are
for_cow will not change anything with respect to which root points to a
certain leaf. Thus, we don't need to add the current sequence number to
those delayed refs.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
btrfs_next_item() makes the btrfs path point to the next item, crossing leaf
boundaries if needed.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
ulist is a generic data structures to hold a collection of unique u64
values. The only operations it supports is adding to the list and
enumerating it.
It is possible to store an auxiliary value along with the key. The
implementation is preliminary and can probably be sped up significantly.
It is used by btrfs_find_all_roots() quota to translate recursions into
iterative loops.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
This is the last part of the patch series. It modifies the btrfs
code to use the integrity check module if configured to do so
with the define BTRFS_FS_CHECK_INTEGRITY. If this define is not set,
the only effective change is that code is added that handles the
mount option to activate the integrity check. If the mount option is
set and the define BTRFS_FS_CHECK_INTEGRITY is not set, that code
complains in the log and the mount fails with EINVAL.
Add the mount option to activate the usage of the integrity check
code.
Add invocation of btrfs integrity check code init and cleanup
function on mount and umount, respectively.
Add hook to call btrfs integrity check code version of
submit_bh/submit_bio.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
If the btrfs integrity check is enabled, the files required to
implement the checks are included in the build.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
The two files added in this patch contain all the code that is
required to implement the integrity checks.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
This patch adds the kernel module ib_srpt SCSI RDMA Protocol (SRP) target
implementation conforming to the SRP r16a specification for the mainline
drivers/target infrastructure.
This driver was originally developed by Vu Pham and has been optimized by
Bart Van Assche and merged into upstream LIO based on his srpt-lio-4.1
branch here:
https://github.com/bvanassche/srpt-lio/commits/srpt-lio-4.1/
This updated patch also contains the following two changes from
lio-core-2.6.git/master. One is to fix a bug with 1 >= task->task_sg[]
chained mappings in ib_srpt, and the other to convert the configfs control
plane to reference IB Port GUID and struct srpt_port directly following
mainline v4.x target_core_fabric_configfs.c convertion for ib_srpt
to work with rtslib/rtsadmin v2 code.
These seperate patches can be found here:
ib_srpt: Fix bug with chainged SGLs in srpt_map_sg_to_ib_sge
http://www.risingtidesystems.com/git/?p=lio-core-2.6.git;a=commitdiff;h=ea485147563b6555a97dbf811825fbb586519252
ib_srpt: Convert se_wwn endpoint reference to struct srpt_port->port_wwn
http://www.risingtidesystems.com/git/?p=lio-core-2.6.git;a=commitdiff;h=4e544a210acb227df1bb4ca5086e65bdf4e648ea
This also includes the following recent v1 -> v2 review changes:
ib_srpt: Fix potential out-of-bounds array access
ib_srpt: Avoid failed multipart RDMA transfers
ib_srpt: Fix srpt_alloc_fabric_acl failure case return value
ib_srpt: Update comments to reference $driver/$port layout
ib_srpt: Fix sport->port_guid formatting code
ib_srpt: Remove legacy use_port_guid_in_session_name module parameter
ib_srpt: Convert srp_max_rdma_size into per port configfs attribute
ib_srpt: Convert srp_max_rsp_size into per port configfs attribute
ib_srpt: Convert srpt_sq_size into per port configfs attribute
and v2 -> v3 review changes:
ib_srpt: Fix possible race with srp_sq_size in srpt_create_ch_ib
ib_srpt: Fix possible race with srp_max_rsp_size in srpt_release_channel_work
ib_srpt: Fix up MAX_SRPT_RDMA_SIZE define
ib_srpt: Make srpt_map_sg_to_ib_sge() failure case return -EAGAIN
ib_srpt: Convert port_guid to use subnet_prefix + interface_id formatting
ib_srpt: Make srpt_check_stop_free return kref_put status
ib_srpt: Make compilation with BUG=n proceed`
ib_srpt: Use new target_core_fabric.h include
ib_srpt: Check hex2bin() return code to silence build warning
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Vu Pham <vu@mellanox.com>
Cc: David Dillow <dillowda@ornl.gov>
Signed-off-by: Nicholas A. Bellinger <nab@risingtidesystems.com>
The target code was not setting the additional sense length field in the
sense data it returned, which meant that at least the Linux stack
ignored the ASC/ASCQ fields. For example, without this patch, on a
tcm_loop device:
# sg_raw -v /dev/sda 2 0 0 0 0 0
gives
cdb to send: 02 00 00 00 00 00
SCSI Status: Check Condition
Sense Information:
Fixed format, current; Sense key: Illegal Request
Raw sense data (in hex):
70 00 05 00 00 00 00 00
while after the patch we correctly get the following (which matches what
a regular disk returns):
cdb to send: 02 00 00 00 00 00
SCSI Status: Check Condition
Sense Information:
Fixed format, current; Sense key: Illegal Request
Additional sense: Invalid command operation code
Raw sense data (in hex):
70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00
00 00
Signed-off-by: Roland Dreier <roland@purestorage.com>
Cc: stable@kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch removes a legacy se_dev_check_online() check from within
transport_execute_tasks() that should no longer be necessary as
transport_lookup_cmd_lun() is already making this call.
Using transport_cmd_check_stop() from transport_execute_tasks() should
already be checking per se_cmd context for each descriptor upon active
I/O shutdown, so no need to acquire dev->dev_status_lock again while
executing se_task submission.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch removes the original usage of __transport_execute_tasks() ahead
of every transport_get_cmd_from_queue() call in transport_processing_thread().
This helps reduce se_device->execute_task_lock contention between qla2xxx wq
with target_submit_cmd() for READs and transport_processing_thread()
context servicing WRITEs with full payloads for I/O submission.
It also adds a __transport_execute_tasks() to kick the task queue again
without a *se_cmd descriptor with existing queue full logic, but this may
end up not being necessary.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch makes __transport_execute_tasks() perform the addition of
tasks to dev->execute_task_list via __transport_add_tasks_from_cmd()
while holding dev->execute_task_lock during normal I/O fast path
submission.
It effectively removes the unnecessary re-acquire of dev->execute_task_lock
during transport_execute_tasks() -> transport_add_tasks_from_cmd() ahead
of calling __transport_execute_tasks() to queue tasks for the passed
*se_cmd descriptor.
(v2: Re-add goto check_depth usage for multi-task submission for now..)
Cc: Christoph Hellwig <hch@lst.de>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Historically, pSCSI devices have been the ones that required target-core
to enforce a per se_device->depth_left. This patch changes target-core
to no longer (by default) enforce a per se_device->depth_left or sleep in
transport_tcq_window_closed() when we out of queue slots for all backend
export cases.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch makes __transport_execute_tasks() use a local *se_dev
reference to prevent direct se_cmd->se_dev access after
transport_cmd_check_stop() -> transport_add_tasks_from_cmd()
has been called, as in the current implementation we can expect
__transport_execute_tasks() may be called from another context
that may have already completed the I/O.
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Remove the now unused target_core_fabric_ops->check_release_cmd() as
target_core handles this directly for se_cmd->cmd_kref objects now.
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch converts the main ft_send_work() I/O path to use
target_submit_cmd() with a single se_cmd->cmd_kref reference
that is released via the existing ft_check_stop_free() response
path callback.
It also makes ft_send_tm() use transport_init_se_cmd() and
target_get_sess_cmd() to also use single se_cmd->cmd_kref
reference.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a target_submit_cmd() caller that can be used by fabrics
to submit an uninitialized se_cmd descriptor to an struct se_session +
unpacked_lun from workqueue process context. This call will invoke the
following steps:
- transport_init_se_cmd() to setup se_cmd specific pointers
- Obtain se_cmd->cmd_kref references with target_get_sess_cmd()
- set se_cmd->t_tasks_bidi
- transport_lookup_cmd_lun() to setup struct se_cmd->se_lun from
the passed unpacked_lun
- transport_generic_allocate_tasks() to setup the passed *cdb, and
- transport_handle_cdb_direct() handle READ dispatch or WRITE
ready-to-transfer callback to fabric
v2 changes from hch feedback:
*) Add target_sc_flags_table for target_submit_cmd flags
*) Rename bidi parameter to flags, add TARGET_SCF_BIDI_OP
*) Convert checks to BUG_ON
*) Add out_check_cond for transport_send_check_condition_and_sense
usage
v3 changes:
*) Add TARGET_SCF_ACK_KREF for target_submit_cmd into
target_get_sess_cmd to determine when the fabric caller is expecting
a second kref_put() from fabric packet acknowledgement.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch moves target_put_sess_cmd() to use a se_cmd->cmd_kref
callback target_release_cmd_kref when performing driver release of
fabric->se_cmd descriptor memory. It sets the default cmd_kref
count value to '2' within target_get_sess_cmd() setup, and
currently assumes TFO->check_stop_free() usage.
It drops se_tfo->check_release_cmd() usage in the main
transport_release_cmd codepath.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Current SCSI specs say that the "response format" field in the standard
INQUIRY response should be set to 2, and all the real SCSI devices I
have do put 2 here. So let's do that too.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Cc: stable@kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This includes:
- remove on _ in "__NAMELEN" in $fabric _make_tport
- target_fabric_configfs_init() returns an error pointer and not NULL
anymore. Consider that.
- replace (!(var_name)) with (!var_name). The extra () are not required
- remove #ifdef MODULE. If the code is builtin it needs an init function
or the code is useless
- put exit/clean functions into __exit
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes TFO->release_cmd() and removes legacy pack_lun() usage
and new_cmd_failure when generating new TCM fabric skeleton from the
tcm_mod_builder.py script.
Reported-by: Stefan Bergstrand <stefan.bergstrand@sdsab.se>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
If we want dynamically allocated objects to be cacheline aligned we need
to tell that to the slab allocator by using the proper flags and not
by liberally sprinkling annotations onto all structures.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
There is not reason to artifically limit max_sectors in tcm_loop, set
it to UINT_MAX to allow stressing the large I/O handling in the target
core using the loopback driver. Also remove various superflous defines
hiding the values set in the host template.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch strips the trailing newline from backend device udev_path and
alias attributes.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch makes chap_server_compute_md5() use proper unsigned long
usage for the CHAP_I (identifier) and check for values beyond 255 as
per RFC-1994.
Reported-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
A reader should spend an extra moment whenever noticing a cast,
because either something special is going on that deserves extra
attention or, as is all too often the case, the code is wrong.
These casts, afaics, have all been useless. They cast a foo* to a
foo*, cast a void* to the assigned type, cast a foo* to void*, before
assigning it to a void* variable, etc.
In a few cases I also removed an additional &...[0], which is equally
useless.
Lastly I added three FIXMEs where, to the best of my judgement, the
code appears to have a bug. It would be good if someone could check
these.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
- rename to target_check_cdb_and_preempt
- use non-safe list_for_each_entry
- move common check into callee (simplifying callers)
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The command
| echo rd_pages=32768 > ramdisk/control
Does not work because it writes "rd_pages=32768\n" and the parser which
matches for "rd_pages=%d" does not recognize it due to the \n. One way
of fixing this would be using "echo -n" instead.
This patch adds \n to the list of separators so we don't have to use the
-n argument which I find is more convinient.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
There is no need to make task_state_active an atomic_t given that it is
always set under the execute_task_lock so we can make it a simple bool.
Also rename it to t_state_active to be closer to the list it guards,
and make sure all checks before the list addion/removal actually happen
under execute_task_lock.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We only reach transport_complete_task once per task, so the test and set on
task_error_status is never going to have an effect.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This reorganized the headers under include/target into:
- target_core_base.h stays as is with all target-wide data stuctures and defines
- target_core_backend.h contains the whole interface to I/O backends
- target_core_fabric.h contains the whole interface to fabric modules
Except for those only the various configfs macro headers stay around.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Create a new headers, drivers/target/target_core_internal.h that is supposed
to hold all target_core-internal prototypes. Move all non-exported includes
from include/target to it, and merge the smaller prototype-only includes
inside drivers/target into it as well. Mark functions that were found to
not be called outside their implementation file static.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
amba_probe() now calls pm_runtime_get_noresume() and pm_runtime_enable()
for the devices before the device probe is called. Hence we don't need
to call pm_runtime_get_xxx and pm_runtime_enable() in device probe again.
In the same way, since amba_remove() calls the respective pm_runtime
functions, those functions need not be called from device remove.
This patch fixes following run time error with pl330 driver.
dma-pl330 dma-pl330.0: Unbalanced pm_runtime_enable!
dma-pl330 dma-pl330.0: failed to get runtime pm
Signed-off-by: Giridhar Maruthy <giridhar.maruthy@linaro.org>
Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
The IPU internally works on 32bit colors. It can arbitrarily map
between pixel formats and internal representation and also between
internal representation and the physical connection to the display.
The driver used to change the mapping between internal representation
and display connection depending on the user selected bpp which is
wrong. Instead, the mapping is specified by the hardware, so an
additional field in platform data is added to describe the connection
between i.MX and the display. The default for this field is RGB666
which seems to be the only configuration which works without this
patch, so I assumed that all in Kernel boards are connected this
way.
This patch has been tested on a RGB666 connected display and a
RGB888 connected display in both 16bpp and 32bpp modes.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
The burstsize (npb in struct chan_param_mem) is set in
ipu_ch_param_set_size() once. The number of allowed
pixels in a burst depend on the pixel format and the
rotation mode. For 16bit formats 16 pixels are allowed
whereas for 32bit formats only 8 pixels are allowed.
Set these values correctly in ipu_ch_param_set_size()
and do not overwrite them afterwards.
We do not support rotation right now, so ignore this
case.
This patch fixes the wrong burstsize setting of 16 pixels
for 32bpp.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Allow logical channels to specify the physical channel they want to use.
This is needed to avoid two peripherals operating on the same physical
channel during some special use-cases. (like mmc and usb during a
usb mass storage case).
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Narayanan G <narayanan.gopalakrishnan@stericsson.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Currently, if plchan->phychan is true, we return immediately from
prep_phy_chan(). We must configure txd->ccfg and increment phychan_hold before
returning. Otherwise, request line number wouldn't be configured in this txd.
Reported-by: Rajeev Kumar <rajeev-dlh.kumar@st.com>
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
When we use the SDMA in the UART driver(such as imx6q), we will
meet one situation:
Assume we set 64 bytes for the RX DMA buffer.
The RX DMA buffer has received some data, but not full.
An Aging DMA request will be received by the SDMA controller if we enable the
IDDMAEN(UCR4[6]) in this case.
So the UART driver needs to know the count of the real received bytes,
and push them to upper layer.
Add two new fields to sdmac, and update the `residue` in sdma_tx_status().
Signed-off-by: Huang Shijie <b32955@freescale.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Setting the flags in case of IDE didn't have any effect since the control
register is overwritten few lines below. So move these to be after the control
register is initialized.
Signed-off-by: Rafal Prylowski <prylowski@metasoft.pl>
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Using a configuration structure simplify the finding of SoC
dependent parameters. Both platform data and device tree ids are
using these structures.
This will separate data from code and remove the need for an enum.
Idea from Grant Likely.
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
This patch provides an option of having the lcla (link address)
in ESRAM instead of allocating it. The bool value (use_esram_lcla)
in the stedma40_platform_data if set to true, then the lcla
address would be taken from platform resources. Also, the
corresponding esram regulator is managed in the
suspend/resume functions.
Signed-off-by: Narayanan G <narayanan.gopalakrishnan@stericsson.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Sparse warns:
drivers/dma/timb_dma.c:168:17: warning: mixing different enum types
drivers/dma/timb_dma.c:168:17: int enum dma_transfer_direction versus
drivers/dma/timb_dma.c:168:17: int enum dma_data_direction
drivers/dma/timb_dma.c:172:32: warning: mixing different enum types
drivers/dma/timb_dma.c:172:32: int enum dma_transfer_direction versus
drivers/dma/timb_dma.c:172:32: int enum dma_data_direction
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
In S2R all DMA registers are reset by hardware and thus they are required to be
reprogrammed. The channels which aren't reprogrammed are channel configuration
and interrupt enable registers, which are currently programmed at chan_alloc
time.
This patch creates another routine to initialize a channel. It will try to
initialize channel on every dwc_dostart() call. If channel is already
initialised then it simply returns, otherwise it configures registers.
This routine will also initialize registers on wakeup from S2R, as we mark
channels as uninitialized on suspend.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Set the right DMA direction in the sdma_control(), else
we will get the wrong log when enable the DYNAMIC_DEBUG.
Signed-off-by: Huang Shijie <b32955@freescale.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
This patch adds power management support to the dma40
driver. The DMA registers are backed up and restored,
during suspend/resume. Also flags to track the dma usage
have been introduced to facilitate this. Patch also includes
few other minor changes, related to formatting, comments.
Signed-off-by: Narayanan G <narayanan.gopalakrishnan@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Define a new api that could be used for doing fancy data transfers
like interleaved to contiguous copy and vice-versa.
Traditional SG_list based transfers tend to be very inefficient in
such cases as where the interleave and chunk are only a few bytes,
which call for a very condensed api to convey pattern of the transfer.
This api supports all 4 variants of scatter-gather and contiguous transfer.
Of course, neither can this api help transfers that don't lend to DMA by
nature, i.e, scattered tiny read/writes with no periodic pattern.
Also since now we support SLAVE channels that might not provide
device_prep_slave_sg callback but device_prep_interleaved_dma,
remove the BUG_ON check.
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Acked-by: Barry Song <Baohua.Song@csr.com>
[renamed dmaxfer_template to dma_interleaved_template
did fixup after the enum dma_transfer_merge]
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
On October 1 in 2011,
OKI SEMICONDUCTOR Co., Ltd. changed the company name in to LAPIS Semiconductor
Co., Ltd.
Signed-off-by: Tomoya MORINAGA <tomoya.rohm@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
The individual SoC dependency in Kconfig hardly scales anymore.
Instead of having such a fine grained dependency just depend
on ARCH_MXC and risk that the uninformed user has to look in
the help text to figure out which driver is the correct one.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
This patch adds to fix the build warning as following.
drivers/dma/pl330.c: In function 'pl330_probe':
drivers/dma/pl330.c:859: warning: comparison of distinct pointer types lacks a cast
Signed-off-by: Boojin Kim <boojin.kim@samsung.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
We remove the use of platform data from DMA controller driver.
We now use of .id_table to distinguish between compatible
types. The two implementations allow to determine the
number of channels and the capabilities of the controller.
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
The driver was still using an old definition of Identify Controller
which only came to light once we started using the 'number of namespaces'
field properly.
Reported-by: Nisheeth Bhat <nisheeth.bhat@intel.com>
Reported-by: Khosrow Panah <Khosrow.Panah@idt.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The doorbell stride allows devices to spread out their doorbells instead
of packing them tightly. This feature was added as part of ECN 003.
This patch also enables support for more than 512 queues :-)
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
ECN 001 documented that namespace 0 is not valid. Sending an Identify
with CNS of 0 and Namespace of 0 is an undefined command.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The existing calculation underestimated the number of pages required
as it did not take into account the pointer at the end of each page.
The replacement calculation may overestimate the number of pages required
if the last page in the PRP List is entirely full. By using ->npages
as a counter as we fill in the pages, we ensure that we don't try to
free a page that was never allocated.
Signed-off-by: Nisheeth Bhat <nisheeth.bhat@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Instead of open-coding calls to nvme_submit_admin_cmd, these
small wrappers are simpler to use (the patch removes 14 lines from
nvme_dev_add() for example).
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
dma_unmap_sg() must be called with the same 'nents' passed to
dma_map_sg(), not the number returned from dma_map_sg().
Signed-off-by: Nisheeth Bhat <nisheeth.bhat@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Our SG list was constructed to always fill the entire first page, even
if that was more than the length of the I/O. This is probably harmless,
but some IOMMUs might do something bad.
Correcting the first call to sg_set_page() made it look a lot closer to
the sg_set_page() in the loop, so fold the first call to sg_set_page()
into the loop.
Reported-by: Nisheeth Bhat <nisheeth.bhat@intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Remove the special-purpose IDENTIFY, GET_RANGE_TYPE, DOWNLOAD_FIRMWARE
and ACTIVATE_FIRMWARE commands. Replace them with a generic ADMIN_CMD
ioctl that can submit any admin command.
Add a new ID ioctl that returns the namespace ID of the queried device.
It corresponds to the SCSI Idlun ioctl.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
If the I/O was not completed by a single NVMe command, we add the
bio to the congestion list and wake up the kthread to resubmit it.
But the kthread calls remove_wait_queue() unconditionally, which
will oops if it's not on the wait queue. So add the kthread to
the wait queue before waking it up.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
nvme_setup_io_queues() was assuming that a NULL return from
nvme_create_queue() was an out-of-memory error. That's not necessarily
true; the adapter might return -EIO, for example. Change the calling
convention to return an ERR_PTR on failure instead of NULL.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
For the benefit of reviewers, add comments to a few functions describing
their calling context
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
If any of the memory allocations in nvme_setup_prps fail, handle it by
modifying the passed-in data length to reflect the number of bytes we are
actually able to send. Also allow the caller to specify the GFP flags
they need; for user-initiated commands, we can use GFP_KERNEL allocations.
The various callers are updated to handle this possibility; the main
I/O path is already prepared for this possibility (as it may happen
due to nvme_map_bio being unable to map all the segments of the I/O).
The other callers return -ENOMEM instead of doing partial I/Os.
Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The current approach of using the namespace ID as the minor number
doesn't work when there are multiple adapters in the machine. Rather
than statically partitioning the number of namespaces between adapters,
dynamically allocate minor numbers to namespaces as they are detected.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The trailing '_data' on the end was annoying and inconsistent. Also, make
it actually return the data since this is needed for timing out commands.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
When an I/O completed with an error, we would call bio_endio twice
(once with -EIO and once with 0). Found by inspection.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
THe device reports (in its capability register) how long it will take
to initialise. If that time elapses before the ready bit becomes set,
conclude the device is broken and refuse to initialise it. Log a nice
error message so the user knows why we did nothing.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The arbitration field was extended by one bit, shifting the shutdown
notification bits by one. Also, the SQ/CQ entry size was made
configurable for future extensions.
Reported-by: Paul Luse <paul.e.luse@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The read and write commands don't define a 'result', so there's no need
to copy it back to userspace.
Remove the ability of the ioctl to submit commands to a different
namespace; it's just asking for trouble, and the use case I have in mind
will be addressed througha different ioctl in the future. That removes
the need for both the block_shift and nsid arguments.
Check that the opcode is one of 'read' or 'write'. Future opcodes may
be added in the future, but we will need a different structure definition
for them.
The nblocks field is redefined to be 0-based. This allows the user to
request the full 65536 blocks.
Don't byteswap the reftag, apptag and appmask. Martin Petersen tells
me these are calculated in big-endian and are transmitted to the device
in big-endian.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
NVME_IOCTL_SUBMIT_IO has a struct nvme_user_io, not a struct nvme_rw_command
as a parameter, and NVME_IOCTL_DOWNLOAD_FW is a Write, not a Read.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Make ioctls work for 32-bit applications on 64-bit kernels. The structures
are defined to be the same for both 32- and 64-bit applications, so
we can use the same handler for both.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Fill in all the num_possible_cpus() entries with duplicate pointers.
This reduces the complexity of the frequently-called get_nvmeq(), as
well as avoiding a bug in it when there are fewer queues than CPUs.
Reported-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Once there are no more bios on the congestion list, we can stop waking
up the nvme kthread every time a completion happens.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
If the last element in the PRP list fits on the end of the page, there's
no need to allocate an extra page to put that single element in. It can
fit on the end of the page.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The spec says this is a 0s based value. We don't need to handle the
maximal value because it's reserved to mean "every namespace".
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The head can never overrun the tail since we won't allocate enough command
IDs to let that happen. The status codes are in sync with the spec.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The spec says we're not allowed to completely fill the submission queue.
Solve this by reducing the number of allocatable cmdids by 1.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
When we submit subsequent portions of the I/O, we need to access the
updated block, not start reading again from the original position.
This was showing up as miscompares in the XFS randholes testcase.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
NVMe scatterlists must be virtually contiguous, like almost all I/Os.
However, when the filesystem lays out files with a hole, it can be that
adjacent LBAs map to non-adjacent virtual addresses. Handle this by
submitting one NVMe command at a time for each virtually discontiguous
range.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Linux implements Flush as a bit in the bio. That means there may also be
data associated with the flush; if so the flush should be sent before the
data. To avoid completing the bio twice, I add CMD_CTX_FLUSH to indicate
the completion routine should do nothing.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The value written to the doorbell needs to be the first free index in
the queue, not the most recently used index in the queue.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
If interrupts are misconfigured, the kthread will be needed to process
admin queue completions.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
I got confused about whether this included the admin queue or not, and
had to resort to reading the spec. It doesn't include the admin queue,
so make that clear in the name.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Instead of trying to resubmit I/Os in the I/O completion path (in
interrupt context), wake up a kthread which will resubmit I/O from
user context. This allows mke2fs to run to completion.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Return -EBUSY if the queue is full or -ENOMEM if we failed to allocate
memory (or map a scatterlist). Also use GFP_ATOMIC to allocate the
nvme_bio and move the locking to the callers of nvme_submit_bio_queue().
In nvme_make_request(), don't permit an I/O to jump the queue -- if the
congestion list already has an entry, just add to the tail, rather than
trying to submit.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Add two reserved registers in the middle of the BAR to match the 1.0
spec plus ECN 0002.
Also rename IMC and ISC to INTMC and INTSC to conform with the spec.
We still don't need to use them :-)
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
In order to not overrun the sg array, we have to merge physically
contiguous pages into a single sg entry.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
We were passing the nvme_queue to access the q_dmadev for the
dma_alloc_coherent calls, but since we moved to the dma pool API,
we really only need the nvme_dev.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Add a second memory pool for smaller I/Os. We can pack 16 of these on a
single page instead of using an entire page for each one.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Calling dma_free_coherent from interrupt context causes warnings.
Using the DMA pools delays freeing until pool destruction, so avoids
the problem.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
There are too many things called 'info' in this driver. This data
structure is auxiliary information for a struct bio, so call it nvme_bio,
or nbio when used as a variable.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Add a pointer to the nvme_req_info to hold a new data structure
(nvme_prps) which contains a list of the pages allocated to this
particular request for holding PRP list entries. nvme_setup_prps()
now returns this pointer.
To allocate and free the memory used for PRP lists, we need a struct
device, so we need to pass the nvme_queue pointer to many functions
which didn't use to need it.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
For multipage BIOs, we were always using sg[0] instead of advancing
through the list. Oops :-)
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
If POISON_POINTER_DELTA isn't defined, ensure they're in page 0 which
should never be mapped.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
In the bio completion handler, check for bios on the congestion list
for this NVM queue. Also, lock the congestion list in the make_request
function as the queue may end up being shared between multiple CPUs.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
In addition to recording the completion data for each command, record
the anticipated completion time. Choose a timeout of 5 seconds for
normal I/Os and 60 seconds for admin I/Os.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
If we're sharing a queue between multiple CPUs and we cancel a sync I/O,
we must have the queue locked to avoid corrupting the stack of the thread
that submitted the I/O. It turns out this is the same locking that's needed
for the threaded irq handler, so share that code.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
If the adapter completes a command ID that is outside the bounds of
the array, return CMD_CTX_INVALID instead of random data, and print a
message in the sync_completion handler (which is rapidly becoming the
misc completion handler :-)
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Set the context value to CMD_CTX_COMPLETED, and print a message in the
sync_completion handler if we see it.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
I have plans for other special values in sync_completion. Plus, this
is more self-documenting, and lets us detect bogus usages.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
We're currently calling bio_endio from hard interrupt context. This is
not a good idea for preemptible kernels as it will cause longer latencies.
Using a threaded interrupt will run the entire queue processing mechanism
(including bio_endio) in a thread, which can be preempted. Unfortuantely,
it also adds about 7us of latency to the single-I/O case, so make it a
module parameter for the moment.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
We can't have preemption disabled when we call schedule(). Accept the
possibility that we'll get preempted, and it'll cost us some cacheline
bounces.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
If the user sends a fatal signal, sleeping in the TASK_KILLABLE state
permits the task to be aborted. The only wrinkle is making sure that
if/when the command completes later that it doesn't upset anything.
Handle this by setting the data pointer to 0, and checking the value
isn't NULL in the sync completion path. Eventually, bios can be cancelled
through this path too. Note that the cmdid isn't freed to prevent reuse.
We should also abort the command in the future, but this is a good start.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Because I wasn't setting driverfs_dev, the devices were showing up under
/sys/devices/virtual/block. Now they appear underneath the PCI device
which they belong to.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
In case the card has been left in a partially-configured state,
write 0 to the Enable bit.
Signed-off-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Call pci_enable_device_mem() at initialisation and pci_disable_device
at exit.
Signed-off-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Factor out most of nvme_identify() into a new nvme_submit_user_admin_command()
function. Change nvme_get_range_type() to call it and change nvme_ioctl to
realise that it's getting back all 64 ranges.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Generalise the code from nvme_identify() that sets PRP1 & PRP2 so that
it's usable for commands sent by nvme_submit_bio_queue().
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Add prp1, prp2 and the metadata prp to the common command, since the
fields are generally used this way.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
The admin IRQ is supposed to use the pin-based (or single message MSI)
interrupt. Accomplish this by filling in entry[0]'s vector with the
INTx irq number.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Two callers with an almost identical long string of arguments, and
introducing a third soon. Time to factor out the commonalities.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
When Xen is enabled, using BIOVEC_PHYS_MERGEABLE in a module
causes xen_biovec_phys_mergeable to be referenced, so it needs
to be exported.
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
fixup usage of dma direction by introducing dma_transfer_direction,
this patch moves carma drivers to use new enum
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
CC Ira W. Snyder <iws@ovro.caltech.edu>
This new enum removes usage of dma_data_direction for dma direction. The new
enum cleans tells the DMA direction and mode
This further paves way for merging the dmaengine _prep operations and also for
interleaved dma
Suggested-by: Jassi Brar <jaswinder.singh@linaro.org>
Reviewed-by: Barry Song <Baohua.Song@csr.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
2011-10-27 20:53:06 +05:30
670 changed files with 25918 additions and 7500 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.