Compare commits

...

37 Commits

Author SHA1 Message Date
ef7cede81d Linux 2.6.23.3 2007-11-16 08:24:58 -08:00
edc0636c31 revert "x86_64: allocate sparsemem memmap above 4G"
Reverted upstream by commit 6a22c57b8d

Revert this commit:

	commit 2e1c49db4c
	Author: Zou Nan hai <nanhai.zou@intel.com>
	Date:   Fri Jun 1 00:46:28 2007 -0700
	
	x86_64: allocate sparsemem memmap above 4G

This reverts commit 2e1c49db4c.

First off, testing in Fedora has shown it to cause boot failures,
bisected down by Martin Ebourne, and reported by Dave Jobes.  So the
commit will likely be reverted in the 2.6.23 stable kernels.

Secondly, in the 2.6.24 model, x86-64 has now grown support for
SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the
bug is not visible any more, it's become invisible due to the code just
being irrelevant and no longer enabled on the only architecture that
this ever affected.

Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Martin Ebourne <fedora@ebourne.me.uk>
Cc: Zou Nan hai <nanhai.zou@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:59 -08:00
963bbb3b5f x86: fix TSC clock source calibration error
patch edaf420fdc in mainline.

I ran into this problem on a system that was unable to obtain NTP sync
because the clock was running very slow (over 10000ppm slow). ntpd had
declared all of its peers 'reject' with 'peer_dist' reason.

On investigation, the tsc_khz variable was significantly incorrect
causing xtime to run slow.  After a reboot tsc_khz was correct so I
did a reboot test to see how often the problem occurred:

Test was done on a 2000 Mhz Xeon system.  Of 689 reboots, 8 of them
had unacceptable tsc_khz values (>500ppm):

 range of tsc_khz  # of boots  % of boots
 ----------------  ----------  ----------
        < 1999750           0      0.000%
1999750 - 1999800          21      3.048%
1999800 - 1999850         166     24.128%
1999850 - 1999900         241     35.029%
1999900 - 1999950         211     30.669%
1999950 - 2000000          42      6.105%
2000000 - 2000000           0      0.000%
2000050 - 2000100           0      0.000%
                   [...]
2000100 - 2015000           1      0.145%  << BAD
2015000 - 2030000           6      0.872%  << BAD
2030000 - 2045000           1      0.145%  << BAD
2045000 <                   0      0.000%

The worst boot was 2032.577 Mhz, over 1.5% off!

It appears that on rare occasions, mach_countup() is taking longer to
complete than necessary.

I suspect that this is caused by the CPU taking a periodic SMI
interrupt right at the end of the 30ms calibration loop.  This would
cause the loop to delay while the SMI BIOS hander runs. The resulting
TSC value is beyond what it actually should be resulting in a higher
tsc_khz.

The below patch makes native_calculate_cpu_khz() take the best
(shortest duration, lowest khz) run of it's 3 calibration loops.  If a
SMI goes off causing a bad result (long duration, higher khz) it will
be discarded.

With the patch applied, 300 boots of the same system produce good
results:

 range of tsc_khz  # of boots  % of boots
 ----------------  ----------  ----------
        < 1999750           0      0.000%
1999750 - 1999800          30     10.000%
1999800 - 1999850         166     55.333%
1999850 - 1999900          89     29.667%
1999900 - 1999950          15      5.000%
1999950 <                   0      0.000%

Problem was found and tested against 2.6.18.  Patch is against 2.6.22.

Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:58 -08:00
2d49e88857 x86 setup: sizeof() is unsigned, unbreak comparisons
patch e6e1ace990 in mainline.


We use signed values for limit checking since the values can go
negative under certain circumstances.  However, sizeof() is unsigned
and forces the comparison to be unsigned, so move the comparison into
the heap_free() macros so we can ensure it is a signed comparison.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:58 -08:00
430bb2ee8d x86 setup: handle boot loaders which set up the stack incorrectly
patch 6b6815c6d5 in mainline.

Apparently some specific versions of LILO enter the kernel with a
stack pointer that doesn't match the rest of the segments.  Make our
best attempt at untangling the resulting mess.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:58 -08:00
4b69ffe374 x86: fix global_flush_tlb() bug
patch 9a24d04a3c upstream

While we were reviewing pageattr_32/64.c for unification,
Thomas Gleixner noticed the following serious SMP bug in
global_flush_tlb():

	down_read(&init_mm.mmap_sem);
	list_replace_init(&deferred_pages, &l);
	up_read(&init_mm.mmap_sem);

this is SMP-unsafe because list_replace_init() done on two CPUs in
parallel can corrupt the list.

This bug has been introduced about a year ago in the 64-bit tree:

       commit ea7322decb
       Author: Andi Kleen <ak@suse.de>
       Date:   Thu Dec 7 02:14:05 2006 +0100

       [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr

                down_read(&init_mm.mmap_sem);
        -       dpage = xchg(&deferred_pages, NULL);
        +       list_replace_init(&deferred_pages, &l);
                up_read(&init_mm.mmap_sem);

the xchg() based version was SMP-safe, but list_replace_init() is not.
So this "cleanup" introduced a nasty bug.

why this bug never become prominent is a mystery - it can probably be
explained with the (still) relative obscurity of the x86_64 architecture.

the safe fix for now is to write-lock init_mm.mmap_sem.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:58 -08:00
ba4312eb97 xfs: eagerly remove vmap mappings to avoid upsetting Xen
patch ace2e92e19 in mainline.

XFS leaves stray mappings around when it vmaps memory to make it
virtually contigious.  This upsets Xen if one of those pages is being
recycled into a pagetable, since it finds an extra writable mapping of
the page.

This patch solves the problem in a brute force way, by making XFS
always eagerly unmap its mappings.

[ Stable: This works around a bug in 2.6.23.  We may come up with a
better solution for mainline, but this seems like a low-impact fix for
the stable kernel. ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: XFS masters <xfs-masters@oss.sgi.com>
Cc: Morten =?utf-8?q?B=C3=B8geskov?= <xen-users@morten.bogeskov.dk>
Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:58 -08:00
418db1544b xen: fix incorrect vcpu_register_vcpu_info hypercall argument
patch e3d2697669 in mainline.

The kernel's copy of struct vcpu_register_vcpu_info was out of date,
at best causing the hypercall to fail and the guest kernel to fall
back to the old mechanism, or worse, causing random memory corruption.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Morten =?utf-8?q?B=C3=B8geskov?= <xen-users@morten.bogeskov.dk>
Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:58 -08:00
edf06ad73a xen: deal with stale cr3 values when unpinning pagetables
patch 9f79991d41 in mainline.

When a pagetable is no longer in use, it must be unpinned so that its
pages can be freed.  However, this is only possible if there are no
stray uses of the pagetable.  The code currently deals with all the
usual cases, but there's a rare case where a vcpu is changing cr3, but
is doing so lazily, and the change hasn't actually happened by the time
the pagetable is unpinned, even though it appears to have been completed.

This change adds a second per-cpu cr3 variable - xen_current_cr3 -
which tracks the actual state of the vcpu cr3.  It is only updated once
the actual hypercall to set cr3 has been completed.  Other processors
wishing to unpin a pagetable can check other vcpu's xen_current_cr3
values to see if any cross-cpu IPIs are needed to clean things up.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:57 -08:00
4fc04833aa xen: add batch completion callbacks
patch 91e0c5f3da in mainline.

This adds a mechanism to register a callback function to be called once
a batch of hypercalls has been issued.  This is typically used to unlock
things which must remain locked until the hypercall has taken place.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:57 -08:00
6f8a6ffc2b UML - kill subprocesses on exit
commit a24864a1d5

uml: definitively kill subprocesses on panic

In a stock 2.6.22.6 kernel, poweroff a user mode linux guest (2.6.22.6 running
in skas0 mode) will halt the host linux.  I think the reason is the kernel
thread abort because of a bug.  Then the sys_reboot in process of user mode
linux guest is not trapped by the user mode linux kernel and is executed by
host.  I think it is better to make sure all of our children process to quit
when user mode linux kernel abort.

[ jdike - the kernel process needs to ignore SIGTERM, plus the waitpid/kill
loop is needed to make sure that all of our children are dead before the
kernel exits ]

Signed-off-by: Lepton Wu <ytht.net@gmail.com>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:57 -08:00
9e6707f395 UML - stop using libc asm/user.h
commit 189872f968 in mainline.

uml: don't use glibc asm/user.h

Stop including asm/user.h from libc - it seems to be disappearing from
distros.  It's replaced with sys/user.h which defines user_fpregs_struct and
user_fpxregs_struct instead of user_i387_struct and struct user_fxsr_struct on
i386.

As a bonus, on x86_64, I get to dump some stupid typedefs which were needed in
order to get asm/user.h to compile.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:57 -08:00
63140953f5 UML - Fix kernel vs libc symbols clash
commit 818f6ef407 in mainline.

uml: fix an IPV6 libc vs kernel symbol clash

On some systems, with IPV6 configured, there is a clash between the kernel's
in6addr_any and the one in libc.

This is handled in the usual (gross) way of defining the kernel symbol out of
the way on the gcc command line.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:57 -08:00
90118e7540 UML - Stop using libc asm/page.h
commit 71f926f2ea in mainline.

uml: stop using libc asm/page.h

Remove includes of asm/page.h from libc code.  This header seems to be
disappearing, and UML doesn't make much use of it anyway.

The one use, PAGE_SHIFT in stub.h, is handled by copying the constant from the
kernel side of the house in common_offsets.h.

[ jdike - added arch/um/kernel/skas/clone.c for -stable ]

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:57 -08:00
5361fb20cc POWERPC: Make sure to of_node_get() the result of pci_device_to_OF_node()
patch db220b234d in mainline.

pci_device_to_OF_node() returns the device node attached to a PCI device,
but doesn't actually grab a reference - we need to do it ourselves.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:57 -08:00
1d841b4fa6 POWERPC: Fix handling of stfiwx math emulation
patch ba02946a90 in mainline

Its legal for the stfiwx instruction to have RA = 0 as part of its
effective address calculation.  This is illegal for all other XE
form instructions.

Add code to compute the proper effective address for stfiwx if
RA = 0 rather than treating it as illegal.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:56 -08:00
2f51141bab MIPS: R1: Fix hazard barriers to make kernels work on R2 also.
patch 572afc248c in mainline.

Tested with Malta; inflates malta_defconfig by 3932 bytes.  Ideally there
should be additional configuration to allow getting rid of this overhead
but that would be too much complexity at this stage of the release cycle.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:56 -08:00
e989c61af4 MIPS: MT: Fix bug in multithreaded kernels.
patch a76ab5c10d in mainline.

When GDB writes a breakpoint into address area of inferior process the
kernel needs to invalidate the modified memory in the inferior which
is done by calling flush_cache_page which in turns calls
r4k_flush_cache_page and local_r4k_flush_cache_page for VSMP or SMTC
kernel via r4k_on_each_cpu().

As the VSMP and SMTC SMP kernels for 34K are running on a single shared
caches it is possible to get away without interprocessor function calls.
This optimization is implemented in r4k_on_each_cpu, so
local_r4k_flush_cache_page is only ever called on the local CPU.

This is where the following code in local_r4k_flush_cache_page() strikes:

        /*
         * If ownes no valid ASID yet, cannot possibly have gotten
         * this page into the cache.
         */
        if (cpu_context(smp_processor_id(), mm) == 0)
                return;

On VSMP and SMTC had a function of cpu_context() for each CPU(TC).

So in case another CPU than the CPU executing local_r4k_cache_flush_page
has not accessed the mm but one of the other CPUs has there may be data
to be flushed in the cache yet local_r4k_cache_flush_page will falsely
return leaving the I-cache inconsistent for the breakpoint.

While the issue was discovered with GDB it also exists in
local_r4k_flush_cache_range() and local_r4k_flush_cache().

Fixed by introducing a new function has_valid_asid which on MT kernels
returns true if a mm is active on any processor in the system.

This is relativly expensive since for memory acccesses in that loop
cache misses have to be assumed but it seems the most viable solution
for 2.6.23 and older -stable kernels.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:56 -08:00
efd233a45e Fix sparc64 MAP_FIXED handling of framebuffer mmaps
patch d58aa8c7b1 in mainline.

From: Chris Wright <chrisw@sous-sol.org>
Date: Tue, 23 Oct 2007 20:36:14 -0700
Subject: [PATCH] [SPARC64]: pass correct addr in get_fb_unmapped_area(MAP_FIXED)

Looks like the MAP_FIXED case is using the wrong address hint.  I'd
expect the comment "don't mess with it" means pass the request
straight on through, not change the address requested to -ENOMEM.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:56 -08:00
71ec6448bb Fix sparc64 niagara optimized RAID xor asm
patch d060db63fd in mainline.

[SPARC64]: Fix register usage in xor_raid_4().

Some typos led to using %i6/%i7 instead of %l6/%l7 in loads which is
really really bad because those are the frame pointer and return PC.

Based upon a raid5 crash report by Bertrand Joel.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:22:56 -08:00
553e6a1aec Linux 2.6.23.2 2007-11-16 08:19:12 -08:00
0520fb1646 BLOCK: Fix bad sharing of tag busy list on queues with shared tag maps
patch 6eca9004df in mainline.

For the locking to work, only the tag map and tag bit map may be shared
(incidentally, I was just explaining this to Nick yesterday, but I
apparently didn't review the code well enough myself). But we also share
the busy list!  The busy_list must be queue private, or we need a
block_queue_tag covering lock as well.

So we have to move the busy_list to the queue. This'll work fine, and
it'll actually also fix a problem with blk_queue_invalidate_tags() which
will invalidate tags across all shared queues. This is a bit confusing,
the low level driver should call it for each queue seperately since
otherwise you cannot kill tags on just a single queue for eg a hard
drive that stops responding. Since the function has no callers
currently, it's not an issue.

This is fixed with commit 6eca9004df in
Linus' tree.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:44 -08:00
bba9d994eb fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE
patch 487e9bf25c in mainline.

It's possible to provoke unionfs (not yet in mainline, though in mm and
some distros) to hit shmem_writepage's BUG_ON(page_mapped(page)).  I expect
it's possible to provoke the 2.6.23 ecryptfs in the same way (but the
2.6.24 ecryptfs no longer calls lower level's ->writepage).

This came to light with the recent find that AOP_WRITEPAGE_ACTIVATE could
leak from tmpfs via write_cache_pages and unionfs to userspace.  There's
already a fix (e423003028 - writeback: don't
propagate AOP_WRITEPAGE_ACTIVATE) in the tree for that, and it's okay so
far as it goes; but insufficient because it doesn't address the underlying
issue, that shmem_writepage expects to be called only by vmscan (relying on
backing_dev_info capabilities to prevent the normal writeback path from
ever approaching it).

That's an increasingly fragile assumption, and ramdisk_writepage (the other
source of AOP_WRITEPAGE_ACTIVATEs) is already careful to check
wbc->for_reclaim before returning it.  Make the same check in
shmem_writepage, thereby sidestepping the page_mapped BUG also.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Erez Zadok <ezk@cs.sunysb.edu>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:44 -08:00
59ddd46073 Fix compat futex hangs.
[FUTEX]: Fix address computation in compat code.

[ Upstream commit: 3c5fd9c77d ]

compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:

	(void __user *)entry + futex_offset

'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').

Things explode if the 32-bit sign bit is set in futex_offset.

Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.

This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().

Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zero'ing out the top 32-bits of the fault address too.

Since the kernel properly uses the compat_uptr interfaces, kernel side
accesses to compat userspace work too since they will only use
addresses with the top 32-bit clear.

Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():

1) load from address '0xfffffffff7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
   for address '0xf7f16bd8' which succeeds
3) goto #1

I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:44 -08:00
e823c33c6f sched: keep utime/stime monotonic
sched: keep utime/stime monotonic

cpustats use utime/stime as a ratio against sum_exec_runtime, as a
consequence it can happen - when the ratio changes faster than time
accumulates - that either can be appear to go backwards.

Combined backport for 2.6.23 of the following patches from mainline:
commit 73a2bcb0ed
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
  sched: keep utime/stime monotonic

commit 9301899be7
Author: Balbir Singh <balbir@linux.vnet.ibm.com>
  sched: fix /proc/<PID>/stat stime/utime monotonicity, part 2

Signed-off-by: Frans Pop <elendil@planet.nl>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:44 -08:00
436e61d936 fix the softlockup watchdog to actually work
patch a115d5caca in mainline.

this Xen related commit:

   commit 966812dc98
   Author: Jeremy Fitzhardinge <jeremy@goop.org>
   Date:   Tue May 8 00:28:02 2007 -0700

       Ignore stolen time in the softlockup watchdog

broke the softlockup watchdog to never report any lockups. (!)

print_timestamp defaults to 0, this makes the following condition
always true:

	if (print_timestamp < (touch_timestamp + 1) ||

and we'll in essence never report soft lockups.

apparently the functionality of the soft lockup watchdog was never
actually tested with that patch applied ...

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:43 -08:00
4d03fda881 splice: fix double kunmap() in vmsplice copy path
patch 6866bef40d in mainline.

The out label should not include the unmap, the only way to jump
there already has unmapped the source.

00002000
       f7c21a00 00000000 00000000 c0489036 00018e32 00000002 00000000
00001000
Call Trace:
 [<c0487dd9>] pipe_to_user+0xca/0xd3
 [<c0488233>] __splice_from_pipe+0x53/0x1bd
 [<c0454947>] ------------[ cut here ]------------
filemap_fault+0x221/0x380
 [<c0487d0f>] pipe_to_user+0x0/0xd3
 [<c0489036>] sys_vmsplice+0x3b7/0x422
 [<c045ec3f>] kernel BUG at mm/highmem.c:206!
handle_mm_fault+0x4d5/0x8eb
 [<c041ed5b>] kmap_atomic+0x1c/0x20
 [<c045d33d>] unmap_vmas+0x3d1/0x584
 [<c045f717>] free_pgtables+0x90/0xa0
 [<c041d84b>] pgd_dtor+0x0/0x1
 [<c044d665>] audit_syscall_exit+0x2aa/0x2c6
 [<c0407817>] do_syscall_trace+0x124/0x169
 [<c0404df2>] syscall_call+0x7/0xb
 =======================
Code: 2d 00 d0 5b 00 25 00 00 e0 ff 29 invalid opcode: 0000 [#1]
c2 89 d0 c1 e8 0c 8b 14 85 a0 6c 7c c0 4a 85 d2 89 14 85 a0 6c 7c c0 74 07
31 c9 4a 75 15 eb 04 <0f> 0b eb fe 31 c9 81 3d 78 38 6d c0 78 38 6d c0 0f
95 c1 b0 01
EIP: [<c045bbc3>] kunmap_high+0x51/0x8e SS:ESP 0068:f5960df0
SMP
Modules linked in: netconsole autofs4 hidp nfs lockd nfs_acl rfcomm l2cap
bluetooth sunrpc ipv6 ib_iser rdma_cm ib_cm iw_cmib_sa ib_mad ib_core
ib_addr iscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_multipath
dm_mod video output sbs batteryac parport_pc lp parport sg i2c_piix4
i2c_core floppy cfi_probe gen_probe scb2_flash mtd chipreg tg3 e1000 button
ide_cd serio_raw cdrom aic7xxx scsi_transport_spi sd_mod scsi_mod ext3 jbd
ehci_hcd ohci_hcd uhci_hcd
CPU:    3
EIP:    0060:[<c045bbc3>]    Not tainted VLI
EFLAGS: 00010246   (2.6.23 #1)
EIP is at kunmap_high+0x51/0x8e

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:43 -08:00
2e25e43319 writeback: don't propagate AOP_WRITEPAGE_ACTIVATE
patch e423003028 in mainline.

This is a writeback-internal marker but we're propagating it all the way back
to userspace!.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:43 -08:00
f8b98ff93b SLUB: Fix memory leak by not reusing cpu_slab
patch 05aa345034 in mainline.

SLUB: Fix memory leak by not reusing cpu_slab

Fix the memory leak that may occur when we attempt to reuse a cpu_slab
that was allocated while we reenabled interrupts in order to be able to
grow a slab cache. The per cpu freelist may contain objects and in that
situation we may overwrite the per cpu freelist pointer loosing objects.
This only occurs if we find that the concurrently allocated slab fits
our allocation needs.

If we simply always deactivate the slab then the freelist will be properly
reintegrated and the memory leak will go away.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:43 -08:00
0ebc8ca802 HOWTO: update ja_JP/HOWTO with latest changes
patch 3b6662f192 upstream.

Here is another sync patch of Documentation/ja_JP/HOWTO

Japanese developer sent me some cosmetic changes and also follow
changes of HOWTO
    Cross reference URL (sosdg.org/qiyong/lxr)
    known_regression explanations on kernel dev. process

Signed-off-by: Tsugikazu Shibata <tshibata@ab.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:43 -08:00
e8af293bb2 fix param_sysfs_builtin name length check
patch 22800a2830 in mainline.

Commit faf8c714f4 caused a regression:
parameter names longer than MAX_KBUILD_MODNAME will now be rejected,
although we just need to keep the module name part that short.  This patch
restores the old behaviour while still avoiding that memchr is called with
its length parameter larger than the total string length.

Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:43 -08:00
2b5ee2866a param_sysfs_builtin memchr argument fix
patch faf8c714f4 in mainline.

If memchr argument is longer than strlen(kp->name), there will be some
weird result.

It will casuse duplicate filenames in sysfs for the "nousb".  kernel
warning messages are as bellow:

sysfs: duplicate filename 'usbcore' can not be created
WARNING: at fs/sysfs/dir.c:416 sysfs_add_one()
 [<c01c4750>] sysfs_add_one+0xa0/0xe0
 [<c01c4ab8>] create_dir+0x48/0xb0
 [<c01c4b69>] sysfs_create_dir+0x29/0x50
 [<c024e0fb>] create_dir+0x1b/0x50
 [<c024e3b6>] kobject_add+0x46/0x150
 [<c024e2da>] kobject_init+0x3a/0x80
 [<c053b880>] kernel_param_sysfs_setup+0x50/0xb0
 [<c053b9ce>] param_sysfs_builtin+0xee/0x130
 [<c053ba33>] param_sysfs_init+0x23/0x60
 [<c024d062>] __next_cpu+0x12/0x20
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052a856>] do_initcalls+0x46/0x1e0
 [<c01bdb12>] create_proc_entry+0x52/0x90
 [<c0158d4c>] register_irq_proc+0x9c/0xc0
 [<c01bda94>] proc_mkdir_mode+0x34/0x50
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052aa92>] kernel_init+0x62/0xb0
 [<c0104f83>] kernel_thread_helper+0x7/0x14
 =======================
kobject_add failed for usbcore with -EEXIST, don't try to register things with the same name in the same directory.
 [<c024e466>] kobject_add+0xf6/0x150
 [<c053b880>] kernel_param_sysfs_setup+0x50/0xb0
 [<c053b9ce>] param_sysfs_builtin+0xee/0x130
 [<c053ba33>] param_sysfs_init+0x23/0x60
 [<c024d062>] __next_cpu+0x12/0x20
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052a856>] do_initcalls+0x46/0x1e0
 [<c01bdb12>] create_proc_entry+0x52/0x90
 [<c0158d4c>] register_irq_proc+0x9c/0xc0
 [<c01bda94>] proc_mkdir_mode+0x34/0x50
 [<c052aa30>] kernel_init+0x0/0xb0
 [<c052aa92>] kernel_init+0x62/0xb0
 [<c0104f83>] kernel_thread_helper+0x7/0x14
 =======================
Module 'usbcore' failed to be added to sysfs, error number -17
The system will be unstable now.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:42 -08:00
aead196be3 Remove broken ptrace() special-case code from file mapping
The kernel has for random historical reasons allowed ptrace() accesses
to access (and insert) pages into the page cache above the size of the
file.

However, Nick broke that by mistake when doing the new fault handling in
commit 54cb8821de ("mm: merge populate and
nopage into fault (fixes nonlinear)".  The breakage caused a hang with
gdb when trying to access the invalid page.

The ptrace "feature" really isn't worth resurrecting, since it really is
wrong both from a portability _and_ from an internal page cache validity
standpoint.  So this removes those old broken remnants, and fixes the
ptrace() hang in the process.

Noticed and bisected by Duane Griffin, who also supplied a test-case
(quoth Nick: "Well that's probably the best bug report I've ever had,
thanks Duane!").

Cc: Duane Griffin <duaneg@dghda.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:42 -08:00
f153577e80 locks: fix possible infinite loop in posix deadlock detection
patch 97855b49b6 in mainline.

It's currently possible to send posix_locks_deadlock() into an infinite
loop (under the BKL).

For now, fix this just by bailing out after a few iterations.  We may
want to fix this in a way that better clarifies the semantics of
deadlock detection.  But that will take more time, and this minimal fix
is probably adequate for any realistic scenario, and is simple enough to
be appropriate for applying to stable kernels now.

Thanks to George Davis for reporting the problem.

Cc: "George G. Davis" <gdavis@mvista.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:42 -08:00
e354b801da lockdep: fix mismatched lockdep_depth/curr_chain_hash
patch 3aa416b07f in mainline.


 It is possible for the current->curr_chain_key to become inconsistent with the
 current index if the chain fails to validate.  The end result is that future
 lock_acquire() operations may inadvertently fail to find a hit in the cache
 resulting in a new node being added to the graph for every acquire.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-16 08:12:42 -08:00
4367388f04 Linux 2.6.23.1 2007-10-12 09:43:44 -07:00
2570089cd7 libata: sata_mv: more S/G fixes
commit 6c08772e49 in mainline.

* corruption fix: we only want the lower 16 bits of length (0 == 64kb)

* ditto: the upper layer sets max-phys-segments to LIBATA_MAX_PRD,
  so we must reset it to own hw-specific length.

* delete unused mv_fill_sg() return value

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 09:23:18 -07:00
53 changed files with 450 additions and 220 deletions

View File

@ -1,4 +1,4 @@
NOTE:
NOTE:
This is a version of Documentation/HOWTO translated into Japanese.
This document is maintained by Tsugikazu Shibata <tshibata@ab.jp.nec.com>
and the JF Project team <www.linux.or.jp/JF>.
@ -11,14 +11,14 @@ for non English (read: Japanese) speakers and is not intended as a
fork. So if you have any comments or updates for this file, please try
to update the original English file first.
Last Updated: 2007/07/18
Last Updated: 2007/09/23
==================================
これは、
linux-2.6.22/Documentation/HOWTO
linux-2.6.23/Documentation/HOWTO
の和訳です。
翻訳団体: JF プロジェクト < http://www.linux.or.jp/JF/ >
翻訳日: 2007/07/16
翻訳日: 2007/09/19
翻訳者: Tsugikazu Shibata <tshibata at ab dot jp dot nec dot com>
校正者: 松倉さん <nbh--mats at nifty dot com>
小林 雅典さん (Masanori Kobayasi) <zap03216 at nifty dot ne dot jp>
@ -27,6 +27,7 @@ linux-2.6.22/Documentation/HOWTO
野口さん (Kenji Noguchi) <tokyo246 at gmail dot com>
河内さん (Takayoshi Kochi) <t-kochi at bq dot jp dot nec dot com>
岩本さん (iwamoto) <iwamoto.kn at ncos dot nec dot co dot jp>
内田さん (Satoshi Uchida) <s-uchida at ap dot jp dot nec dot com>
==================================
Linux カーネル開発のやり方
@ -40,7 +41,7 @@ Linux カーネル開発コミュニティと共に活動するやり方を学
手助けになります。
もし、このドキュメントのどこかが古くなっていた場合には、このドキュメン
トの最後にリストしたメンテナにパッチを送ってください。
トの最後にリストしたメンテナにパッチを送ってください。
はじめに
---------
@ -59,7 +60,7 @@ Linux カーネル開発コミュニティと共に活動するやり方を学
ネル開発者には必要です。アーキテクチャ向けの低レベル部分の開発をするの
でなければ、(どんなアーキテクチャでも)アセンブリ(訳注: 言語)は必要あり
ません。以下の本は、C 言語の十分な知識や何年もの経験に取って代わるもの
ではありませんが、少なくともリファレンスとしてはい本です。
ではありませんが、少なくともリファレンスとしてはい本です。
- "The C Programming Language" by Kernighan and Ritchie [Prentice Hall]
-『プログラミング言語第2版』(B.W. カーニハン/D.M. リッチー著 石田晴久訳) [共立出版]
- "Practical C Programming" by Steve Oualline [O'Reilly]
@ -76,7 +77,7 @@ Linux カーネル開発コミュニティと共に活動するやり方を学
ときどき、カーネルがツールチェインや C 言語拡張に置いている前提がどう
なっているのかわかりにくいことがあり、また、残念なことに決定的なリファ
レンスは存在しません。情報を得るには、gcc の info ページ( info gcc )を
てください。
てください。
あなたは既存の開発コミュニティと一緒に作業する方法を学ぼうとしているこ
とに留意してください。そのコミュニティは、コーディング、スタイル、
@ -92,7 +93,7 @@ Linux カーネル開発コミュニティと共に活動するやり方を学
Linux カーネルのソースコードは GPL ライセンスの下でリリースされていま
す。ライセンスの詳細については、ソースツリーのメインディレクトリに存在
する、COPYING のファイルをてください。もしライセンスについてさらに質
する、COPYING のファイルをてください。もしライセンスについてさらに質
問があれば、Linux Kernel メーリングリストに質問するのではなく、どうぞ
法律家に相談してください。メーリングリストの人達は法律家ではなく、法的
問題については彼らの声明はあてにするべきではありません。
@ -109,7 +110,8 @@ Linux カーネルソースツリーは幅広い範囲のドキュメントを
新しいドキュメントファイルも追加することを勧めます。
カーネルの変更が、カーネルがユーザ空間に公開しているインターフェイスの
変更を引き起こす場合、その変更を説明するマニュアルページのパッチや情報
をマニュアルページのメンテナ mtk-manpages@gmx.net に送ることを勧めます。
をマニュアルページのメンテナ mtk-manpages@gmx.net に送ることを勧めま
す。
以下はカーネルソースツリーに含まれている読んでおくべきファイルの一覧で
す-
@ -117,7 +119,7 @@ Linux カーネルソースツリーは幅広い範囲のドキュメントを
README
このファイルは Linuxカーネルの簡単な背景とカーネルを設定(訳注
configure )し、生成(訳注 build )するために必要なことは何かが書かれ
ています。カーネルに関して初めての人はここからスタートするといで
ています。カーネルに関して初めての人はここからスタートするといで
しょう。
Documentation/Changes
@ -128,7 +130,7 @@ Linux カーネルソースツリーは幅広い範囲のドキュメントを
Documentation/CodingStyle
これは Linux カーネルのコーディングスタイルと背景にある理由を記述
しています。全ての新しいコードはこのドキュメントにあるガイドライン
に従っていることを期待されています。大部分のメンテナはこれらのルー
に従っていることを期待されています。大部分のメンテナはこれらのルー
ルに従っているものだけを受け付け、多くの人は正しいスタイルのコード
だけをレビューします。
@ -168,16 +170,16 @@ Linux カーネルソースツリーは幅広い範囲のドキュメントを
支援してください。
Documentation/ManagementStyle
このドキュメントは Linux カーネルのメンテナ達がどう行動するか、
このドキュメントは Linux カーネルのメンテナ達がどう行動するか、
彼らの手法の背景にある共有されている精神について記述しています。こ
れはカーネル開発の初心者なら(もしくは、単に興味があるだけの人でも)
重要です。なぜならこのドキュメントは、カーネルメンテナ達の独特な
重要です。なぜならこのドキュメントは、カーネルメンテナ達の独特な
行動についての多くの誤解や混乱を解消するからです。
Documentation/stable_kernel_rules.txt
このファイルはどのように stable カーネルのリリースが行われるかのルー
ルが記述されています。そしてこれらのリリースの中のどこかで変更を取
り入れてもらいたい場合に何をすればいかが示されています。
り入れてもらいたい場合に何をすればいかが示されています。
Documentation/kernel-docs.txt
  カーネル開発に付随する外部ドキュメントのリストです。もしあなたが
@ -218,9 +220,9 @@ web サイトには、コードの構成、サブシステム、現在存在す
ここには、また、カーネルのコンパイルのやり方やパッチの当て方などの間接
的な基本情報も記述されています。
あなたがどこからスタートしていかわからないが、Linux カーネル開発コミュ
あなたがどこからスタートしていかわからないが、Linux カーネル開発コミュ
ニティに参加して何かすることをさがしている場合には、Linux kernel
Janitor's プロジェクトにいけばいでしょう -
Janitor's プロジェクトにいけばいでしょう -
http://janitor.kernelnewbies.org/
ここはそのようなスタートをするのにうってつけの場所です。ここには、
Linux カーネルソースツリーの中に含まれる、きれいにし、修正しなければな
@ -243,7 +245,7 @@ Linux カーネルソースツリーの中に含まれる、きれいにし、
自己参照方式で、索引がついた web 形式で、ソースコードを参照することが
できます。この最新の素晴しいカーネルコードのリポジトリは以下で見つかり
ます-
http://sosdg.org/~coywolf/lxr/
http://sosdg.org/~qiyong/lxr/
開発プロセス
-----------------------
@ -265,9 +267,9 @@ Linux カーネルの開発プロセスは現在幾つかの異なるメイン
以下のとおり-
- 新しいカーネルがリリースされた直後に、2週間の特別期間が設けられ、
この期間中に、メンテナ達は Linus に大きな差分を送ることができま
す。このような差分は通常 -mm カーネルに数週間含まれてきたパッチで
す。 大きな変更は git(カーネルのソース管理ツール、詳細は
この期間中に、メンテナ達は Linus に大きな差分を送ることができます。
このような差分は通常 -mm カーネルに数週間含まれてきたパッチです。
大きな変更は git(カーネルのソース管理ツール、詳細は
http://git.or.cz/ 参照) を使って送るのが好ましいやり方ですが、パッ
チファイルの形式のまま送るのでも十分です。
@ -285,6 +287,10 @@ Linux カーネルの開発プロセスは現在幾つかの異なるメイン
に安定した状態にあると判断したときにリリースされます。目標は毎週新
しい -rc カーネルをリリースすることです。
- 以下の URL で各 -rc リリースに存在する既知の後戻り問題のリスト
が追跡されます-
http://kernelnewbies.org/known_regressions
- このプロセスはカーネルが 「準備ができた」と考えられるまで継続しま
す。このプロセスはだいたい 6週間継続します。
@ -331,8 +337,8 @@ Andrew は個別のサブシステムカーネルツリーとパッチを全て
linux-kernel メーリングリストで収集された多数のパッチと同時に一つにま
とめます。
このツリーは新機能とパッチが検証される場となります。ある期間の間パッチ
が -mm に入って価値を証明されたら、Andrew やサブシステムメンテナが、
インラインへ入れるように Linus にプッシュします。
が -mm に入って価値を証明されたら、Andrew やサブシステムメンテナが、
インラインへ入れるように Linus にプッシュします。
メインカーネルツリーに含めるために Linus に送る前に、すべての新しいパッ
チが -mm ツリーでテストされることが強く推奨されます。
@ -460,7 +466,7 @@ MAINTAINERS ファイルにリストがありますので参照してくださ
せん-
彼らはあなたのパッチの行毎にコメントを入れたいので、そのためにはそうす
るしかありません。あなたのメールプログラムが空白やタブを圧縮しないよう
に確認した方がいです。最初の良いテストとしては、自分にメールを送って
に確認した方がいです。最初の良いテストとしては、自分にメールを送って
みて、そのパッチを自分で当ててみることです。もしそれがうまく行かないな
ら、あなたのメールプログラムを直してもらうか、正しく動くように変えるべ
きです。
@ -507,14 +513,14 @@ MAINTAINERS ファイルにリストがありますので参照してくださ
とも普通のことです。これはあなたのパッチが受け入れられないということで
は *ありません*、そしてあなた自身に反対することを意味するのでも *ありま
せん*。単に自分のパッチに対して指摘された問題を全て修正して再送すれば
いのです。
いのです。
カーネルコミュニティと企業組織のちがい
-----------------------------------------------------------------
カーネルコミュニティは大部分の伝統的な会社の開発環境とは異ったやり方で
動いています。以下は問題を避けるためにできるといことのリストです-
動いています。以下は問題を避けるためにできるといことのリストです-
あなたの提案する変更について言うときのうまい言い方:
@ -525,7 +531,7 @@ MAINTAINERS ファイルにリストがありますので参照してくださ
- "以下は一連の小さなパッチ群ですが..."
- "これは典型的なマシンでの性能を向上させます.."
やめた方がい悪い言い方:
やめた方がい悪い言い方:
- このやり方で AIX/ptx/Solaris ではできたので、できるはずだ
- 私はこれを20年もの間やってきた、だから
@ -575,10 +581,10 @@ Linux カーネルコミュニティは、一度に大量のコードの塊を
1) 小さいパッチはあなたのパッチが適用される見込みを大きくします、カー
ネルの人達はパッチが正しいかどうかを確認する時間や労力をかけないか
らです。5行のパッチはメンテナがたった1秒見るだけで適用できます。
かし、500行のパッチは、正しいことをレビューするのに数時間かかるか
しれません(時間はパッチのサイズなどにより指数関数に比例してかかり
す)
らです。5行のパッチはメンテナがたった1秒見るだけで適用できます。
かし、500行のパッチは、正しいことをレビューするのに数時間かかるか
しれません(時間はパッチのサイズなどにより指数関数に比例してかかり
す)
小さいパッチは何かあったときにデバッグもとても簡単になります。パッ
チを1個1個取り除くのは、とても大きなパッチを当てた後に(かつ、何かお
@ -587,23 +593,23 @@ Linux カーネルコミュニティは、一度に大量のコードの塊を
2) 小さいパッチを送るだけでなく、送るまえに、書き直して、シンプルにす
る(もしくは、単に順番を変えるだけでも)ことも、とても重要です。
以下はカーネル開発者の Al Viro のたとえ話です:
以下はカーネル開発者の Al Viro のたとえ話です:
"生徒の数学の宿題を採点する先生のことを考えてみてください、先
生は生徒が解に到達するまでの試行錯誤をたいとは思わないでしょ
う。先生は簡潔な最高の解をたいのです。良い生徒はこれを知って
生は生徒が解に到達するまでの試行錯誤をたいとは思わないでしょ
う。先生は簡潔な最高の解をたいのです。良い生徒はこれを知って
おり、そして最終解の前の中間作業を提出することは決してないので
す"
カーネル開発でもこれは同じです。メンテナ達とレビューア達は、
問題を解決する解の背後になる思考プロセスをたいとは思いません。
彼らは単純であざやかな解決方法をたいのです。
カーネル開発でもこれは同じです。メンテナ達とレビューア達は、
問題を解決する解の背後になる思考プロセスをたいとは思いません。
彼らは単純であざやかな解決方法をたいのです。
あざやかな解を説明するのと、コミュニティと共に仕事をし、未解決の仕事を
議論することのバランスをキープするのは難しいかもしれません。
ですから、開発プロセスの早期段階で改善のためのフィードバックをもらうよ
うにするのもいですが、変更点を小さい部分に分割して全体ではまだ完成し
ていない仕事を(部分的に)取り込んでもらえるようにすることもいことです。
うにするのもいですが、変更点を小さい部分に分割して全体ではまだ完成し
ていない仕事を(部分的に)取り込んでもらえるようにすることもいことです。
また、でき上がっていないものや、"将来直す" ようなパッチを、本流に含め
てもらうように送っても、それは受け付けられないことを理解してください。
@ -629,7 +635,7 @@ Linux カーネルコミュニティは、一度に大量のコードの塊を
- テスト結果
これについて全てがどのようにあるべきかについての詳細は、以下のドキュメ
ントの ChangeLog セクションをてください-
ントの ChangeLog セクションをてください-
"The Perfect Patch"
http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt

View File

@ -1,7 +1,7 @@
VERSION = 2
PATCHLEVEL = 6
SUBLEVEL = 23
EXTRAVERSION =
EXTRAVERSION = .3
NAME = Arr Matey! A Hairy Bilge Rat!
# *DOCUMENTATION*

View File

@ -17,6 +17,8 @@
#ifndef BOOT_BOOT_H
#define BOOT_BOOT_H
#define STACK_SIZE 512 /* Minimum number of bytes for stack */
#ifndef __ASSEMBLY__
#include <stdarg.h>
@ -198,8 +200,6 @@ static inline int isdigit(int ch)
}
/* Heap -- available for dynamic lists. */
#define STACK_SIZE 512 /* Minimum number of bytes for stack */
extern char _end[];
extern char *HEAP;
extern char *heap_end;
@ -216,9 +216,9 @@ static inline char *__get_heap(size_t s, size_t a, size_t n)
#define GET_HEAP(type, n) \
((type *)__get_heap(sizeof(type),__alignof__(type),(n)))
static inline int heap_free(void)
static inline bool heap_free(size_t n)
{
return heap_end-HEAP;
return (int)(heap_end-HEAP) >= (int)n;
}
/* copy.S */

View File

@ -173,7 +173,8 @@ ramdisk_size: .long 0 # its size in bytes
bootsect_kludge:
.long 0 # obsolete
heap_end_ptr: .word _end+1024 # (Header version 0x0201 or later)
heap_end_ptr: .word _end+STACK_SIZE-512
# (Header version 0x0201 or later)
# space from here (exclusive) down to
# end of setup code can be used by setup
# for local heap purposes.
@ -225,28 +226,53 @@ start_of_setup:
int $0x13
#endif
# We will have entered with %cs = %ds+0x20, normalize %cs so
# it is on par with the other segments.
pushw %ds
pushw $setup2
lretw
setup2:
# Force %es = %ds
movw %ds, %ax
movw %ax, %es
cld
# Stack paranoia: align the stack and make sure it is good
# for both 16- and 32-bit references. In particular, if we
# were meant to have been using the full 16-bit segment, the
# caller might have set %sp to zero, which breaks %esp-based
# references.
andw $~3, %sp # dword align (might as well...)
jnz 1f
movw $0xfffc, %sp # Make sure we're not zero
1: movzwl %sp, %esp # Clear upper half of %esp
sti
# Apparently some ancient versions of LILO invoked the kernel
# with %ss != %ds, which happened to work by accident for the
# old code. If the CAN_USE_HEAP flag is set in loadflags, or
# %ss != %ds, then adjust the stack pointer.
# Smallest possible stack we can tolerate
movw $(_end+STACK_SIZE), %cx
movw heap_end_ptr, %dx
addw $512, %dx
jnc 1f
xorw %dx, %dx # Wraparound - whole segment available
1: testb $CAN_USE_HEAP, loadflags
jnz 2f
# No CAN_USE_HEAP
movw %ss, %dx
cmpw %ax, %dx # %ds == %ss?
movw %sp, %dx
# If so, assume %sp is reasonably set, otherwise use
# the smallest possible stack.
jne 4f # -> Smallest possible stack...
# Make sure the stack is at least minimum size. Take a value
# of zero to mean "full segment."
2:
andw $~3, %dx # dword align (might as well...)
jnz 3f
movw $0xfffc, %dx # Make sure we're not zero
3: cmpw %cx, %dx
jnb 5f
4: movw %cx, %dx # Minimum value we can possibly use
5: movw %ax, %ss
movzwl %dx, %esp # Clear upper half of %esp
sti # Now we should have a working stack
# We will have entered with %cs = %ds+0x20, normalize %cs so
# it is on par with the other segments.
pushw %ds
pushw $6f
lretw
6:
# Check signature at end of setup
cmpl $0x5a5aaa55, setup_sig

View File

@ -79,7 +79,7 @@ static int bios_probe(void)
video_bios.modes = GET_HEAP(struct mode_info, 0);
for (mode = 0x14; mode <= 0x7f; mode++) {
if (heap_free() < sizeof(struct mode_info))
if (!heap_free(sizeof(struct mode_info)))
break;
if (mode_defined(VIDEO_FIRST_BIOS+mode))

View File

@ -57,7 +57,7 @@ static int vesa_probe(void)
while ((mode = rdfs16(mode_ptr)) != 0xffff) {
mode_ptr += 2;
if (heap_free() < sizeof(struct mode_info))
if (!heap_free(sizeof(struct mode_info)))
break; /* Heap full, can't save mode info */
if (mode & ~0x1ff)

View File

@ -371,7 +371,7 @@ static void save_screen(void)
saved.curx = boot_params.screen_info.orig_x;
saved.cury = boot_params.screen_info.orig_y;
if (heap_free() < saved.x*saved.y*sizeof(u16)+512)
if (!heap_free(saved.x*saved.y*sizeof(u16)+512))
return; /* Not enough heap to save the screen */
saved.data = GET_HEAP(u16, saved.x*saved.y);

View File

@ -137,7 +137,7 @@ unsigned long native_calculate_cpu_khz(void)
{
unsigned long long start, end;
unsigned long count;
u64 delta64;
u64 delta64 = (u64)ULLONG_MAX;
int i;
unsigned long flags;
@ -149,6 +149,7 @@ unsigned long native_calculate_cpu_khz(void)
rdtscll(start);
mach_countup(&count);
rdtscll(end);
delta64 = min(delta64, (end - start));
}
/*
* Error: ECTCNEVERSET
@ -159,8 +160,6 @@ unsigned long native_calculate_cpu_khz(void)
if (count <= 1)
goto err;
delta64 = end - start;
/* cpu freq too fast: */
if (delta64 > (1ULL<<32))
goto err;

View File

@ -56,7 +56,23 @@ DEFINE_PER_CPU(enum paravirt_lazy_mode, xen_lazy_mode);
DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
DEFINE_PER_CPU(unsigned long, xen_cr3);
/*
* Note about cr3 (pagetable base) values:
*
* xen_cr3 contains the current logical cr3 value; it contains the
* last set cr3. This may not be the current effective cr3, because
* its update may be being lazily deferred. However, a vcpu looking
* at its own cr3 can use this value knowing that it everything will
* be self-consistent.
*
* xen_current_cr3 contains the actual vcpu cr3; it is set once the
* hypercall to set the vcpu cr3 is complete (so it may be a little
* out of date, but it will never be set early). If one vcpu is
* looking at another vcpu's cr3 value, it should use this variable.
*/
DEFINE_PER_CPU(unsigned long, xen_cr3); /* cr3 stored as physaddr */
DEFINE_PER_CPU(unsigned long, xen_current_cr3); /* actual vcpu cr3 */
struct start_info *xen_start_info;
EXPORT_SYMBOL_GPL(xen_start_info);
@ -100,7 +116,7 @@ static void __init xen_vcpu_setup(int cpu)
info.mfn = virt_to_mfn(vcpup);
info.offset = offset_in_page(vcpup);
printk(KERN_DEBUG "trying to map vcpu_info %d at %p, mfn %x, offset %d\n",
printk(KERN_DEBUG "trying to map vcpu_info %d at %p, mfn %llx, offset %d\n",
cpu, vcpup, info.mfn, info.offset);
/* Check to see if the hypervisor will put the vcpu_info
@ -632,32 +648,36 @@ static unsigned long xen_read_cr3(void)
return x86_read_percpu(xen_cr3);
}
static void set_current_cr3(void *v)
{
x86_write_percpu(xen_current_cr3, (unsigned long)v);
}
static void xen_write_cr3(unsigned long cr3)
{
struct mmuext_op *op;
struct multicall_space mcs;
unsigned long mfn = pfn_to_mfn(PFN_DOWN(cr3));
BUG_ON(preemptible());
if (cr3 == x86_read_percpu(xen_cr3)) {
/* just a simple tlb flush */
xen_flush_tlb();
return;
}
mcs = xen_mc_entry(sizeof(*op)); /* disables interrupts */
/* Update while interrupts are disabled, so its atomic with
respect to ipis */
x86_write_percpu(xen_cr3, cr3);
op = mcs.args;
op->cmd = MMUEXT_NEW_BASEPTR;
op->arg1.mfn = mfn;
{
struct mmuext_op *op;
struct multicall_space mcs = xen_mc_entry(sizeof(*op));
unsigned long mfn = pfn_to_mfn(PFN_DOWN(cr3));
MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
op = mcs.args;
op->cmd = MMUEXT_NEW_BASEPTR;
op->arg1.mfn = mfn;
/* Update xen_update_cr3 once the batch has actually
been submitted. */
xen_mc_callback(set_current_cr3, (void *)cr3);
MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
xen_mc_issue(PARAVIRT_LAZY_CPU);
}
xen_mc_issue(PARAVIRT_LAZY_CPU); /* interrupts restored */
}
/* Early in boot, while setting up the initial pagetable, assume
@ -1113,6 +1133,7 @@ asmlinkage void __init xen_start_kernel(void)
/* keep using Xen gdt for now; no urgent need to change it */
x86_write_percpu(xen_cr3, __pa(pgd));
x86_write_percpu(xen_current_cr3, __pa(pgd));
#ifdef CONFIG_SMP
/* Don't do the full vcpu_info placement stuff until we have a

View File

@ -515,20 +515,43 @@ static void drop_other_mm_ref(void *info)
if (__get_cpu_var(cpu_tlbstate).active_mm == mm)
leave_mm(smp_processor_id());
/* If this cpu still has a stale cr3 reference, then make sure
it has been flushed. */
if (x86_read_percpu(xen_current_cr3) == __pa(mm->pgd)) {
load_cr3(swapper_pg_dir);
arch_flush_lazy_cpu_mode();
}
}
static void drop_mm_ref(struct mm_struct *mm)
{
cpumask_t mask;
unsigned cpu;
if (current->active_mm == mm) {
if (current->mm == mm)
load_cr3(swapper_pg_dir);
else
leave_mm(smp_processor_id());
arch_flush_lazy_cpu_mode();
}
if (!cpus_empty(mm->cpu_vm_mask))
xen_smp_call_function_mask(mm->cpu_vm_mask, drop_other_mm_ref,
mm, 1);
/* Get the "official" set of cpus referring to our pagetable. */
mask = mm->cpu_vm_mask;
/* It's possible that a vcpu may have a stale reference to our
cr3, because its in lazy mode, and it hasn't yet flushed
its set of pending hypercalls yet. In this case, we can
look at its actual current cr3 value, and force it to flush
if needed. */
for_each_online_cpu(cpu) {
if (per_cpu(xen_current_cr3, cpu) == __pa(mm->pgd))
cpu_set(cpu, mask);
}
if (!cpus_empty(mask))
xen_smp_call_function_mask(mask, drop_other_mm_ref, mm, 1);
}
#else
static void drop_mm_ref(struct mm_struct *mm)

View File

@ -32,7 +32,11 @@
struct mc_buffer {
struct multicall_entry entries[MC_BATCH];
u64 args[MC_ARGS];
unsigned mcidx, argidx;
struct callback {
void (*fn)(void *);
void *data;
} callbacks[MC_BATCH];
unsigned mcidx, argidx, cbidx;
};
static DEFINE_PER_CPU(struct mc_buffer, mc_buffer);
@ -43,6 +47,7 @@ void xen_mc_flush(void)
struct mc_buffer *b = &__get_cpu_var(mc_buffer);
int ret = 0;
unsigned long flags;
int i;
BUG_ON(preemptible());
@ -51,8 +56,6 @@ void xen_mc_flush(void)
local_irq_save(flags);
if (b->mcidx) {
int i;
if (HYPERVISOR_multicall(b->entries, b->mcidx) != 0)
BUG();
for (i = 0; i < b->mcidx; i++)
@ -65,6 +68,13 @@ void xen_mc_flush(void)
local_irq_restore(flags);
for(i = 0; i < b->cbidx; i++) {
struct callback *cb = &b->callbacks[i];
(*cb->fn)(cb->data);
}
b->cbidx = 0;
BUG_ON(ret);
}
@ -88,3 +98,16 @@ struct multicall_space __xen_mc_entry(size_t args)
return ret;
}
void xen_mc_callback(void (*fn)(void *), void *data)
{
struct mc_buffer *b = &__get_cpu_var(mc_buffer);
struct callback *cb;
if (b->cbidx == MC_BATCH)
xen_mc_flush();
cb = &b->callbacks[b->cbidx++];
cb->fn = fn;
cb->data = data;
}

View File

@ -42,4 +42,7 @@ static inline void xen_mc_issue(unsigned mode)
local_irq_restore(x86_read_percpu(xen_mc_irq_flags));
}
/* Set up a callback to be called when the current batch is flushed */
void xen_mc_callback(void (*fn)(void *), void *data);
#endif /* _XEN_MULTICALLS_H */

View File

@ -11,6 +11,7 @@ void xen_copy_trap_info(struct trap_info *traps);
DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
DECLARE_PER_CPU(unsigned long, xen_cr3);
DECLARE_PER_CPU(unsigned long, xen_current_cr3);
extern struct start_info *xen_start_info;
extern struct shared_info *HYPERVISOR_shared_info;

View File

@ -360,11 +360,26 @@ static void r4k___flush_cache_all(void)
r4k_on_each_cpu(local_r4k___flush_cache_all, NULL, 1, 1);
}
static inline int has_valid_asid(const struct mm_struct *mm)
{
#if defined(CONFIG_MIPS_MT_SMP) || defined(CONFIG_MIPS_MT_SMTC)
int i;
for_each_online_cpu(i)
if (cpu_context(i, mm))
return 1;
return 0;
#else
return cpu_context(smp_processor_id(), mm);
#endif
}
static inline void local_r4k_flush_cache_range(void * args)
{
struct vm_area_struct *vma = args;
if (!(cpu_context(smp_processor_id(), vma->vm_mm)))
if (!(has_valid_asid(vma->vm_mm)))
return;
r4k_blast_dcache();
@ -383,7 +398,7 @@ static inline void local_r4k_flush_cache_mm(void * args)
{
struct mm_struct *mm = args;
if (!cpu_context(smp_processor_id(), mm))
if (!has_valid_asid(mm))
return;
/*
@ -434,7 +449,7 @@ static inline void local_r4k_flush_cache_page(void *args)
* If ownes no valid ASID yet, cannot possibly have gotten
* this page into the cache.
*/
if (cpu_context(smp_processor_id(), mm) == 0)
if (!has_valid_asid(mm))
return;
addr &= PAGE_MASK;

View File

@ -407,11 +407,16 @@ do_mathemu(struct pt_regs *regs)
case XE:
idx = (insn >> 16) & 0x1f;
if (!idx)
goto illegal;
op0 = (void *)&current->thread.fpr[(insn >> 21) & 0x1f];
op1 = (void *)(regs->gpr[idx] + regs->gpr[(insn >> 11) & 0x1f]);
if (!idx) {
if (((insn >> 1) & 0x3ff) == STFIWX)
op1 = (void *)(regs->gpr[(insn >> 11) & 0x1f]);
else
goto illegal;
} else {
op1 = (void *)(regs->gpr[idx] + regs->gpr[(insn >> 11) & 0x1f]);
}
break;
case XEU:

View File

@ -126,7 +126,7 @@ static struct axon_msic *find_msi_translator(struct pci_dev *dev)
const phandle *ph;
struct axon_msic *msic = NULL;
dn = pci_device_to_OF_node(dev);
dn = of_node_get(pci_device_to_OF_node(dev));
if (!dn) {
dev_dbg(&dev->dev, "axon_msi: no pci_dn found\n");
return NULL;
@ -183,7 +183,7 @@ static int setup_msi_msg_address(struct pci_dev *dev, struct msi_msg *msg)
int len;
const u32 *prop;
dn = pci_device_to_OF_node(dev);
dn = of_node_get(pci_device_to_OF_node(dev));
if (!dn) {
dev_dbg(&dev->dev, "axon_msi: no pci_dn found\n");
return -ENODEV;

View File

@ -319,7 +319,7 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u
if (flags & MAP_FIXED) {
/* Ok, don't mess with it. */
return get_unmapped_area(NULL, addr, len, pgoff, flags);
return get_unmapped_area(NULL, orig_addr, len, pgoff, flags);
}
flags &= ~MAP_SHARED;

View File

@ -491,12 +491,12 @@ xor_niagara_4: /* %o0=bytes, %o1=dest, %o2=src1, %o3=src2, %o4=src3 */
ldda [%i1 + 0x10] %asi, %i2 /* %i2/%i3 = src1 + 0x10 */
xor %g2, %i4, %g2
xor %g3, %i5, %g3
ldda [%i7 + 0x10] %asi, %i4 /* %i4/%i5 = src2 + 0x10 */
ldda [%l7 + 0x10] %asi, %i4 /* %i4/%i5 = src2 + 0x10 */
xor %l0, %g2, %l0
xor %l1, %g3, %l1
stxa %l0, [%i0 + 0x00] %asi
stxa %l1, [%i0 + 0x08] %asi
ldda [%i6 + 0x10] %asi, %g2 /* %g2/%g3 = src3 + 0x10 */
ldda [%l6 + 0x10] %asi, %g2 /* %g2/%g3 = src3 + 0x10 */
ldda [%i0 + 0x10] %asi, %l0 /* %l0/%l1 = dest + 0x10 */
xor %i4, %i2, %i4
@ -504,12 +504,12 @@ xor_niagara_4: /* %o0=bytes, %o1=dest, %o2=src1, %o3=src2, %o4=src3 */
ldda [%i1 + 0x20] %asi, %i2 /* %i2/%i3 = src1 + 0x20 */
xor %g2, %i4, %g2
xor %g3, %i5, %g3
ldda [%i7 + 0x20] %asi, %i4 /* %i4/%i5 = src2 + 0x20 */
ldda [%l7 + 0x20] %asi, %i4 /* %i4/%i5 = src2 + 0x20 */
xor %l0, %g2, %l0
xor %l1, %g3, %l1
stxa %l0, [%i0 + 0x10] %asi
stxa %l1, [%i0 + 0x18] %asi
ldda [%i6 + 0x20] %asi, %g2 /* %g2/%g3 = src3 + 0x20 */
ldda [%l6 + 0x20] %asi, %g2 /* %g2/%g3 = src3 + 0x20 */
ldda [%i0 + 0x20] %asi, %l0 /* %l0/%l1 = dest + 0x20 */
xor %i4, %i2, %i4
@ -517,12 +517,12 @@ xor_niagara_4: /* %o0=bytes, %o1=dest, %o2=src1, %o3=src2, %o4=src3 */
ldda [%i1 + 0x30] %asi, %i2 /* %i2/%i3 = src1 + 0x30 */
xor %g2, %i4, %g2
xor %g3, %i5, %g3
ldda [%i7 + 0x30] %asi, %i4 /* %i4/%i5 = src2 + 0x30 */
ldda [%l7 + 0x30] %asi, %i4 /* %i4/%i5 = src2 + 0x30 */
xor %l0, %g2, %l0
xor %l1, %g3, %l1
stxa %l0, [%i0 + 0x20] %asi
stxa %l1, [%i0 + 0x28] %asi
ldda [%i6 + 0x30] %asi, %g2 /* %g2/%g3 = src3 + 0x30 */
ldda [%l6 + 0x30] %asi, %g2 /* %g2/%g3 = src3 + 0x30 */
ldda [%i0 + 0x30] %asi, %l0 /* %l0/%l1 = dest + 0x30 */
prefetch [%i1 + 0x40], #one_read

View File

@ -60,7 +60,8 @@ SYS_DIR := $(ARCH_DIR)/include/sysdep-$(SUBARCH)
CFLAGS += $(CFLAGS-y) -D__arch_um__ -DSUBARCH=\"$(SUBARCH)\" \
$(ARCH_INCLUDE) $(MODE_INCLUDE) -Dvmap=kernel_vmap \
-Din6addr_loopback=kernel_in6addr_loopback
-Din6addr_loopback=kernel_in6addr_loopback \
-Din6addr_any=kernel_in6addr_any
AFLAGS += $(ARCH_INCLUDE)

View File

@ -10,6 +10,7 @@ OFFSET(HOST_TASK_PID, task_struct, pid);
DEFINE(UM_KERN_PAGE_SIZE, PAGE_SIZE);
DEFINE(UM_KERN_PAGE_MASK, PAGE_MASK);
DEFINE(UM_KERN_PAGE_SHIFT, PAGE_SHIFT);
DEFINE(UM_NSEC_PER_SEC, NSEC_PER_SEC);
DEFINE_STR(UM_KERN_EMERG, KERN_EMERG);

View File

@ -9,7 +9,6 @@
#include <sys/mman.h>
#include <asm/ptrace.h>
#include <asm/unistd.h>
#include <asm/page.h>
#include "stub-data.h"
#include "kern_constants.h"
#include "uml-config.h"
@ -19,7 +18,7 @@ extern void stub_clone_handler(void);
#define STUB_SYSCALL_RET EAX
#define STUB_MMAP_NR __NR_mmap2
#define MMAP_OFFSET(o) ((o) >> PAGE_SHIFT)
#define MMAP_OFFSET(o) ((o) >> UM_KERN_PAGE_SHIFT)
static inline long stub_syscall0(long syscall)
{

View File

@ -3,7 +3,6 @@
#include <sys/mman.h>
#include <sys/time.h>
#include <asm/unistd.h>
#include <asm/page.h>
#include "ptrace_user.h"
#include "skas.h"
#include "stub-data.h"

View File

@ -12,7 +12,6 @@
#include <sys/resource.h>
#include <sys/mman.h>
#include <sys/user.h>
#include <asm/page.h>
#include "kern_util.h"
#include "as-layout.h"
#include "mem_user.h"

View File

@ -9,7 +9,6 @@
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <asm/page.h>
#include <asm/unistd.h>
#include "mem_user.h"
#include "mem.h"

View File

@ -182,7 +182,7 @@ static int userspace_tramp(void *stack)
ptrace(PTRACE_TRACEME, 0, 0, 0);
init_new_thread_signals();
signal(SIGTERM, SIG_DFL);
err = set_interval(1);
if(err)
panic("userspace_tramp - setting timer failed, errno = %d\n",

View File

@ -19,7 +19,6 @@
#include <sys/mman.h>
#include <sys/resource.h>
#include <asm/unistd.h>
#include <asm/page.h>
#include <sys/types.h>
#include "kern_util.h"
#include "user.h"

View File

@ -17,7 +17,6 @@
#include <sys/mman.h>
#include <asm/ptrace.h>
#include <asm/unistd.h>
#include <asm/page.h>
#include "kern_util.h"
#include "user.h"
#include "signal_kern.h"

View File

@ -105,6 +105,44 @@ int setjmp_wrapper(void (*proc)(void *, void *), ...)
void os_dump_core(void)
{
int pid;
signal(SIGSEGV, SIG_DFL);
/*
* We are about to SIGTERM this entire process group to ensure that
* nothing is around to run after the kernel exits. The
* kernel wants to abort, not die through SIGTERM, so we
* ignore it here.
*/
signal(SIGTERM, SIG_IGN);
kill(0, SIGTERM);
/*
* Most of the other processes associated with this UML are
* likely sTopped, so give them a SIGCONT so they see the
* SIGTERM.
*/
kill(0, SIGCONT);
/*
* Now, having sent signals to everyone but us, make sure they
* die by ptrace. Processes can survive what's been done to
* them so far - the mechanism I understand is receiving a
* SIGSEGV and segfaulting immediately upon return. There is
* always a SIGSEGV pending, and (I'm guessing) signals are
* processed in numeric order so the SIGTERM (signal 15 vs
* SIGSEGV being signal 11) is never handled.
*
* Run a waitpid loop until we get some kind of error.
* Hopefully, it's ECHILD, but there's not a lot we can do if
* it's something else. Tell os_kill_ptraced_process not to
* wait for the child to report its death because there's
* nothing reasonable to do if that fails.
*/
while ((pid = waitpid(-1, NULL, WNOHANG)) > 0)
os_kill_ptraced_process(pid, 0);
abort();
}

View File

@ -2,9 +2,9 @@
#include <stddef.h>
#include <signal.h>
#include <sys/poll.h>
#include <sys/user.h>
#include <sys/mman.h>
#include <asm/ptrace.h>
#include <asm/user.h>
#define DEFINE(sym, val) \
asm volatile("\n->" #sym " %0 " #val : : "i" (val))
@ -48,8 +48,8 @@ void foo(void)
OFFSET(HOST_SC_FP_ST, _fpstate, _st);
OFFSET(HOST_SC_FXSR_ENV, _fpstate, _fxsr_env);
DEFINE_LONGS(HOST_FP_SIZE, sizeof(struct user_i387_struct));
DEFINE_LONGS(HOST_XFP_SIZE, sizeof(struct user_fxsr_struct));
DEFINE_LONGS(HOST_FP_SIZE, sizeof(struct user_fpregs_struct));
DEFINE_LONGS(HOST_XFP_SIZE, sizeof(struct user_fpxregs_struct));
DEFINE(HOST_IP, EIP);
DEFINE(HOST_SP, UESP);

View File

@ -3,17 +3,10 @@
#include <signal.h>
#include <sys/poll.h>
#include <sys/mman.h>
#include <sys/user.h>
#define __FRAME_OFFSETS
#include <asm/ptrace.h>
#include <asm/types.h>
/* For some reason, x86_64 defines u64 and u32 only in <pci/types.h>, which I
* refuse to include here, even though they're used throughout the headers.
* These are used in asm/user.h, and that include can't be avoided because of
* the sizeof(struct user_regs_struct) below.
*/
typedef __u64 u64;
typedef __u32 u32;
#include <asm/user.h>
#define DEFINE(sym, val) \
asm volatile("\n->" #sym " %0 " #val : : "i" (val))

View File

@ -734,12 +734,6 @@ int in_gate_area_no_task(unsigned long addr)
return (addr >= VSYSCALL_START) && (addr < VSYSCALL_END);
}
void * __init alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
{
return __alloc_bootmem_core(pgdat->bdata, size,
SMP_CACHE_BYTES, (4UL*1024*1024*1024), 0);
}
const char *arch_vma_name(struct vm_area_struct *vma)
{
if (vma->vm_mm && vma->vm_start == (long)vma->vm_mm->context.vdso)

View File

@ -229,9 +229,14 @@ void global_flush_tlb(void)
struct page *pg, *next;
struct list_head l;
down_read(&init_mm.mmap_sem);
/*
* Write-protect the semaphore, to exclude two contexts
* doing a list_replace_init() call in parallel and to
* exclude new additions to the deferred_pages list:
*/
down_write(&init_mm.mmap_sem);
list_replace_init(&deferred_pages, &l);
up_read(&init_mm.mmap_sem);
up_write(&init_mm.mmap_sem);
flush_map(&l);

View File

@ -819,7 +819,6 @@ static int __blk_free_tags(struct blk_queue_tag *bqt)
retval = atomic_dec_and_test(&bqt->refcnt);
if (retval) {
BUG_ON(bqt->busy);
BUG_ON(!list_empty(&bqt->busy_list));
kfree(bqt->tag_index);
bqt->tag_index = NULL;
@ -931,7 +930,6 @@ static struct blk_queue_tag *__blk_queue_init_tags(struct request_queue *q,
if (init_tag_map(q, tags, depth))
goto fail;
INIT_LIST_HEAD(&tags->busy_list);
tags->busy = 0;
atomic_set(&tags->refcnt, 1);
return tags;
@ -982,6 +980,7 @@ int blk_queue_init_tags(struct request_queue *q, int depth,
*/
q->queue_tags = tags;
q->queue_flags |= (1 << QUEUE_FLAG_QUEUED);
INIT_LIST_HEAD(&q->tag_busy_list);
return 0;
fail:
kfree(tags);
@ -1152,7 +1151,7 @@ int blk_queue_start_tag(struct request_queue *q, struct request *rq)
rq->tag = tag;
bqt->tag_index[tag] = rq;
blkdev_dequeue_request(rq);
list_add(&rq->queuelist, &bqt->busy_list);
list_add(&rq->queuelist, &q->tag_busy_list);
bqt->busy++;
return 0;
}
@ -1173,11 +1172,10 @@ EXPORT_SYMBOL(blk_queue_start_tag);
**/
void blk_queue_invalidate_tags(struct request_queue *q)
{
struct blk_queue_tag *bqt = q->queue_tags;
struct list_head *tmp, *n;
struct request *rq;
list_for_each_safe(tmp, n, &bqt->busy_list) {
list_for_each_safe(tmp, n, &q->tag_busy_list) {
rq = list_entry_rq(tmp);
if (rq->tag == -1) {

View File

@ -69,10 +69,11 @@
#include <linux/device.h>
#include <scsi/scsi_host.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_device.h>
#include <linux/libata.h>
#define DRV_NAME "sata_mv"
#define DRV_VERSION "1.0"
#define DRV_VERSION "1.01"
enum {
/* BAR's are enumerated in terms of pci_resource_start() terms */
@ -420,6 +421,7 @@ static void mv_error_handler(struct ata_port *ap);
static void mv_post_int_cmd(struct ata_queued_cmd *qc);
static void mv_eh_freeze(struct ata_port *ap);
static void mv_eh_thaw(struct ata_port *ap);
static int mv_slave_config(struct scsi_device *sdev);
static int mv_init_one(struct pci_dev *pdev, const struct pci_device_id *ent);
static void mv5_phy_errata(struct mv_host_priv *hpriv, void __iomem *mmio,
@ -457,7 +459,7 @@ static struct scsi_host_template mv5_sht = {
.use_clustering = 1,
.proc_name = DRV_NAME,
.dma_boundary = MV_DMA_BOUNDARY,
.slave_configure = ata_scsi_slave_config,
.slave_configure = mv_slave_config,
.slave_destroy = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
};
@ -475,7 +477,7 @@ static struct scsi_host_template mv6_sht = {
.use_clustering = 1,
.proc_name = DRV_NAME,
.dma_boundary = MV_DMA_BOUNDARY,
.slave_configure = ata_scsi_slave_config,
.slave_configure = mv_slave_config,
.slave_destroy = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
};
@ -763,6 +765,17 @@ static void mv_irq_clear(struct ata_port *ap)
{
}
static int mv_slave_config(struct scsi_device *sdev)
{
int rc = ata_scsi_slave_config(sdev);
if (rc)
return rc;
blk_queue_max_phys_segments(sdev->request_queue, MV_MAX_SG_CT / 2);
return 0; /* scsi layer doesn't check return value, sigh */
}
static void mv_set_edma_ptrs(void __iomem *port_mmio,
struct mv_host_priv *hpriv,
struct mv_port_priv *pp)
@ -1130,10 +1143,9 @@ static void mv_port_stop(struct ata_port *ap)
* LOCKING:
* Inherited from caller.
*/
static unsigned int mv_fill_sg(struct ata_queued_cmd *qc)
static void mv_fill_sg(struct ata_queued_cmd *qc)
{
struct mv_port_priv *pp = qc->ap->private_data;
unsigned int n_sg = 0;
struct scatterlist *sg;
struct mv_sg *mv_sg;
@ -1151,7 +1163,7 @@ static unsigned int mv_fill_sg(struct ata_queued_cmd *qc)
mv_sg->addr = cpu_to_le32(addr & 0xffffffff);
mv_sg->addr_hi = cpu_to_le32((addr >> 16) >> 16);
mv_sg->flags_size = cpu_to_le32(len);
mv_sg->flags_size = cpu_to_le32(len & 0xffff);
sg_len -= len;
addr += len;
@ -1160,12 +1172,9 @@ static unsigned int mv_fill_sg(struct ata_queued_cmd *qc)
mv_sg->flags_size |= cpu_to_le32(EPRD_FLAG_END_OF_TBL);
mv_sg++;
n_sg++;
}
}
return n_sg;
}
static inline void mv_crqb_pack_cmd(__le16 *cmdw, u8 data, u8 addr, unsigned last)

View File

@ -694,11 +694,20 @@ EXPORT_SYMBOL(posix_test_lock);
* Note: the above assumption may not be true when handling lock requests
* from a broken NFS client. But broken NFS clients have a lot more to
* worry about than proper deadlock detection anyway... --okir
*
* However, the failure of this assumption (also possible in the case of
* multiple tasks sharing the same open file table) also means there's no
* guarantee that the loop below will terminate. As a hack, we give up
* after a few iterations.
*/
#define MAX_DEADLK_ITERATIONS 10
static int posix_locks_deadlock(struct file_lock *caller_fl,
struct file_lock *block_fl)
{
struct list_head *tmp;
int i = 0;
next_task:
if (posix_same_owner(caller_fl, block_fl))
@ -706,6 +715,8 @@ next_task:
list_for_each(tmp, &blocked_list) {
struct file_lock *fl = list_entry(tmp, struct file_lock, fl_link);
if (posix_same_owner(fl, block_fl)) {
if (i++ > MAX_DEADLK_ITERATIONS)
return 0;
fl = fl->fl_next;
block_fl = fl;
goto next_task;

View File

@ -351,7 +351,8 @@ static cputime_t task_utime(struct task_struct *p)
}
utime = (clock_t)temp;
return clock_t_to_cputime(utime);
p->prev_utime = max(p->prev_utime, clock_t_to_cputime(utime));
return p->prev_utime;
}
static cputime_t task_stime(struct task_struct *p)
@ -366,7 +367,8 @@ static cputime_t task_stime(struct task_struct *p)
stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
cputime_to_clock_t(task_utime(p));
return clock_t_to_cputime(stime);
p->prev_stime = max(p->prev_stime, clock_t_to_cputime(stime));
return p->prev_stime;
}
#endif

View File

@ -1390,10 +1390,10 @@ static int pipe_to_user(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
if (copy_to_user(sd->u.userptr, src + buf->offset, sd->len))
ret = -EFAULT;
buf->ops->unmap(pipe, buf, src);
out:
if (ret > 0)
sd->u.userptr += ret;
buf->ops->unmap(pipe, buf, src);
return ret;
}

View File

@ -187,6 +187,19 @@ free_address(
{
a_list_t *aentry;
#ifdef CONFIG_XEN
/*
* Xen needs to be able to make sure it can get an exclusive
* RO mapping of pages it wants to turn into a pagetable. If
* a newly allocated page is also still being vmap()ed by xfs,
* it will cause pagetable construction to fail. This is a
* quick workaround to always eagerly unmap pages so that Xen
* is happy.
*/
vunmap(addr);
return;
#endif
aentry = kmalloc(sizeof(a_list_t), GFP_NOWAIT);
if (likely(aentry)) {
spin_lock(&as_lock);

View File

@ -10,11 +10,12 @@
#ifndef _ASM_HAZARDS_H
#define _ASM_HAZARDS_H
#ifdef __ASSEMBLY__
#define ASMMACRO(name, code...) .macro name; code; .endm
#else
#include <asm/cpu-features.h>
#define ASMMACRO(name, code...) \
__asm__(".macro " #name "; " #code "; .endm"); \
\
@ -86,6 +87,57 @@ do { \
: "=r" (tmp)); \
} while (0)
#elif defined(CONFIG_CPU_MIPSR1)
/*
* These are slightly complicated by the fact that we guarantee R1 kernels to
* run fine on R2 processors.
*/
ASMMACRO(mtc0_tlbw_hazard,
_ssnop; _ssnop; _ehb
)
ASMMACRO(tlbw_use_hazard,
_ssnop; _ssnop; _ssnop; _ehb
)
ASMMACRO(tlb_probe_hazard,
_ssnop; _ssnop; _ssnop; _ehb
)
ASMMACRO(irq_enable_hazard,
_ssnop; _ssnop; _ssnop; _ehb
)
ASMMACRO(irq_disable_hazard,
_ssnop; _ssnop; _ssnop; _ehb
)
ASMMACRO(back_to_back_c0_hazard,
_ssnop; _ssnop; _ssnop; _ehb
)
/*
* gcc has a tradition of misscompiling the previous construct using the
* address of a label as argument to inline assembler. Gas otoh has the
* annoying difference between la and dla which are only usable for 32-bit
* rsp. 64-bit code, so can't be used without conditional compilation.
* The alterantive is switching the assembler to 64-bit code which happens
* to work right even for 32-bit code ...
*/
#define __instruction_hazard() \
do { \
unsigned long tmp; \
\
__asm__ __volatile__( \
" .set mips64r2 \n" \
" dla %0, 1f \n" \
" jr.hb %0 \n" \
" .set mips0 \n" \
"1: \n" \
: "=r" (tmp)); \
} while (0)
#define instruction_hazard() \
do { \
if (cpu_has_mips_r2) \
__instruction_hazard(); \
} while (0)
#elif defined(CONFIG_CPU_R10000)
/*

View File

@ -356,7 +356,6 @@ enum blk_queue_state {
struct blk_queue_tag {
struct request **tag_index; /* map of busy tags */
unsigned long *tag_map; /* bit map of free/busy tags */
struct list_head busy_list; /* fifo list of busy tags */
int busy; /* current depth */
int max_depth; /* what we will send to device */
int real_max_depth; /* what the array can hold */
@ -451,6 +450,7 @@ struct request_queue
unsigned int dma_alignment;
struct blk_queue_tag *queue_tags;
struct list_head tag_busy_list;
unsigned int nr_sorted;
unsigned int in_flight;

View File

@ -59,7 +59,6 @@ extern void *__alloc_bootmem_core(struct bootmem_data *bdata,
unsigned long align,
unsigned long goal,
unsigned long limit);
extern void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size);
#ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
extern void reserve_bootmem(unsigned long addr, unsigned long size);

View File

@ -1022,6 +1022,7 @@ struct task_struct {
unsigned int rt_priority;
cputime_t utime, stime;
cputime_t prev_utime, prev_stime;
unsigned long nvcsw, nivcsw; /* context switch counts */
struct timespec start_time; /* monotonic time */
struct timespec real_start_time; /* boot based time */

View File

@ -160,8 +160,9 @@ struct vcpu_set_singleshot_timer {
*/
#define VCPUOP_register_vcpu_info 10 /* arg == struct vcpu_info */
struct vcpu_register_vcpu_info {
uint32_t mfn; /* mfn of page to place vcpu_info */
uint32_t offset; /* offset within page */
uint64_t mfn; /* mfn of page to place vcpu_info */
uint32_t offset; /* offset within page */
uint32_t rsvd; /* unused */
};
#endif /* __XEN_PUBLIC_VCPU_H__ */

View File

@ -1045,6 +1045,8 @@ static struct task_struct *copy_process(unsigned long clone_flags,
p->utime = cputime_zero;
p->stime = cputime_zero;
p->prev_utime = cputime_zero;
p->prev_stime = cputime_zero;
#ifdef CONFIG_TASK_XACCT
p->rchar = 0; /* I/O counter: bytes read */

View File

@ -29,6 +29,15 @@ fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **entry,
return 0;
}
static void __user *futex_uaddr(struct robust_list *entry,
compat_long_t futex_offset)
{
compat_uptr_t base = ptr_to_compat(entry);
void __user *uaddr = compat_ptr(base + futex_offset);
return uaddr;
}
/*
* Walk curr->robust_list (very carefully, it's a userspace list!)
* and mark any locks found there dead, and notify any waiters.
@ -75,11 +84,13 @@ void compat_exit_robust_list(struct task_struct *curr)
* A pending lock might already be on the list, so
* dont process it twice:
*/
if (entry != pending)
if (handle_futex_death((void __user *)entry + futex_offset,
curr, pi))
return;
if (entry != pending) {
void __user *uaddr = futex_uaddr(entry,
futex_offset);
if (handle_futex_death(uaddr, curr, pi))
return;
}
if (rc)
return;
uentry = next_uentry;
@ -93,9 +104,11 @@ void compat_exit_robust_list(struct task_struct *curr)
cond_resched();
}
if (pending)
handle_futex_death((void __user *)pending + futex_offset,
curr, pip);
if (pending) {
void __user *uaddr = futex_uaddr(pending, futex_offset);
handle_futex_death(uaddr, curr, pip);
}
}
asmlinkage long

View File

@ -1521,7 +1521,7 @@ cache_hit:
}
static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
struct held_lock *hlock, int chain_head)
struct held_lock *hlock, int chain_head, u64 chain_key)
{
/*
* Trylock needs to maintain the stack of held locks, but it
@ -1534,7 +1534,7 @@ static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
* graph_lock for us)
*/
if (!hlock->trylock && (hlock->check == 2) &&
lookup_chain_cache(curr->curr_chain_key, hlock->class)) {
lookup_chain_cache(chain_key, hlock->class)) {
/*
* Check whether last held lock:
*
@ -1576,7 +1576,7 @@ static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
#else
static inline int validate_chain(struct task_struct *curr,
struct lockdep_map *lock, struct held_lock *hlock,
int chain_head)
int chain_head, u64 chain_key)
{
return 1;
}
@ -2450,11 +2450,11 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
chain_head = 1;
}
chain_key = iterate_chain_key(chain_key, id);
curr->curr_chain_key = chain_key;
if (!validate_chain(curr, lock, hlock, chain_head))
if (!validate_chain(curr, lock, hlock, chain_head, chain_key))
return 0;
curr->curr_chain_key = chain_key;
curr->lockdep_depth++;
check_chain_key(curr);
#ifdef CONFIG_DEBUG_LOCKDEP

View File

@ -595,13 +595,16 @@ static void __init param_sysfs_builtin(void)
for (i=0; i < __stop___param - __start___param; i++) {
char *dot;
size_t max_name_len;
kp = &__start___param[i];
max_name_len =
min_t(size_t, MAX_KBUILD_MODNAME, strlen(kp->name));
/* We do not handle args without periods. */
dot = memchr(kp->name, '.', MAX_KBUILD_MODNAME);
dot = memchr(kp->name, '.', max_name_len);
if (!dot) {
DEBUGP("couldn't find period in %s\n", kp->name);
DEBUGP("couldn't find period in first %d characters "
"of %s\n", MAX_KBUILD_MODNAME, kp->name);
continue;
}
name_len = dot - kp->name;

View File

@ -80,10 +80,11 @@ void softlockup_tick(void)
print_timestamp = per_cpu(print_timestamp, this_cpu);
/* report at most once a second */
if (print_timestamp < (touch_timestamp + 1) ||
did_panic ||
!per_cpu(watchdog_task, this_cpu))
if ((print_timestamp >= touch_timestamp &&
print_timestamp < (touch_timestamp + 1)) ||
did_panic || !per_cpu(watchdog_task, this_cpu)) {
return;
}
/* do not print during early bootup: */
if (unlikely(system_state != SYSTEM_RUNNING)) {

View File

@ -1312,7 +1312,7 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
if (vmf->pgoff >= size)
goto outside_data_content;
return VM_FAULT_SIGBUS;
/* If we don't want any read-ahead, don't bother */
if (VM_RandomReadHint(vma))
@ -1389,7 +1389,7 @@ retry_find:
if (unlikely(vmf->pgoff >= size)) {
unlock_page(page);
page_cache_release(page);
goto outside_data_content;
return VM_FAULT_SIGBUS;
}
/*
@ -1400,15 +1400,6 @@ retry_find:
vmf->page = page;
return ret | VM_FAULT_LOCKED;
outside_data_content:
/*
* An external ptracer can access pages that normally aren't
* accessible..
*/
if (vma->vm_mm == current->mm)
return VM_FAULT_SIGBUS;
/* Fall through to the non-read-ahead case */
no_cached_page:
/*
* We're only likely to ever get here if MADV_RANDOM is in

View File

@ -672,8 +672,10 @@ retry:
ret = (*writepage)(page, wbc, data);
if (unlikely(ret == AOP_WRITEPAGE_ACTIVATE))
if (unlikely(ret == AOP_WRITEPAGE_ACTIVATE)) {
unlock_page(page);
ret = 0;
}
if (ret || (--(wbc->nr_to_write) <= 0))
done = 1;
if (wbc->nonblocking && bdi_write_congested(bdi)) {

View File

@ -916,6 +916,21 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
struct inode *inode;
BUG_ON(!PageLocked(page));
/*
* shmem_backing_dev_info's capabilities prevent regular writeback or
* sync from ever calling shmem_writepage; but a stacking filesystem
* may use the ->writepage of its underlying filesystem, in which case
* we want to do nothing when that underlying filesystem is tmpfs
* (writing out to swap is useful as a response to memory pressure, but
* of no use to stabilize the data) - just redirty the page, unlock it
* and claim success in this case. AOP_WRITEPAGE_ACTIVATE, and the
* page_mapped check below, must be avoided unless we're in reclaim.
*/
if (!wbc->for_reclaim) {
set_page_dirty(page);
unlock_page(page);
return 0;
}
BUG_ON(page_mapped(page));
mapping = page->mapping;

View File

@ -1501,28 +1501,8 @@ new_slab:
page = new_slab(s, gfpflags, node);
if (page) {
cpu = smp_processor_id();
if (s->cpu_slab[cpu]) {
/*
* Someone else populated the cpu_slab while we
* enabled interrupts, or we have gotten scheduled
* on another cpu. The page may not be on the
* requested node even if __GFP_THISNODE was
* specified. So we need to recheck.
*/
if (node == -1 ||
page_to_nid(s->cpu_slab[cpu]) == node) {
/*
* Current cpuslab is acceptable and we
* want the current one since its cache hot
*/
discard_slab(s, page);
page = s->cpu_slab[cpu];
slab_lock(page);
goto load_freelist;
}
/* New slab does not fit our expectations */
if (s->cpu_slab[cpu])
flush_slab(s, s->cpu_slab[cpu], cpu);
}
slab_lock(page);
SetSlabFrozen(page);
s->cpu_slab[cpu] = page;

View File

@ -215,12 +215,6 @@ static int __meminit sparse_init_one_section(struct mem_section *ms,
return 1;
}
__attribute__((weak)) __init
void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
{
return NULL;
}
static struct page __init *sparse_early_mem_map_alloc(unsigned long pnum)
{
struct page *map;
@ -231,11 +225,6 @@ static struct page __init *sparse_early_mem_map_alloc(unsigned long pnum)
if (map)
return map;
map = alloc_bootmem_high_node(NODE_DATA(nid),
sizeof(struct page) * PAGES_PER_SECTION);
if (map)
return map;
map = alloc_bootmem_node(NODE_DATA(nid),
sizeof(struct page) * PAGES_PER_SECTION);
if (map)