Found these two bugs while browsing through the code. The first one is
a cut-n-paste bug, instead of disabling the clock when request_irq()
fails, it enabled it once more. The second one fixes a debug printout,
AT91_SSC_IER is write only, AT91_SSC_IMR is readable (the printed string
actually says imr).
Frank Mandarino was busy so he asked me to send these to this list.
/Patrik
Signed-off-by: Patrik Sevallius <patrik.sevallius@enea.com> Acked-by: Frank Mandarino <fmandarino@endrelia.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Alexey Dobriyan [Thu, 8 May 2008 09:58:39 +0000 (10:58 +0100)]
[ARM] lubbock: fix compilation
arch/arm/mach-pxa/lubbock.c:399: error: expected '}' before ';' token
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Oliver Hartkopp [Thu, 8 May 2008 09:49:55 +0000 (02:49 -0700)]
can: Fix can_send() handling on dev_queue_xmit() failures
The tx packet counting and the local loopback of CAN frames should
only happen in the case that the CAN frame has been enqueued to the
netdevice tx queue successfully.
Thanks to Andre Naujoks <nautsch@gmail.com> for reporting this issue.
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: Urs Thuermann <urs@isnogud.escape.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Thu, 8 May 2008 08:24:25 +0000 (01:24 -0700)]
netns: Fix arbitrary net_device-s corruptions on net_ns stop.
When a net namespace is destroyed, some devices (those, not killed
on ns stop explicitly) are moved back to init_net.
The problem, is that this net_ns change has one point of failure -
the __dev_alloc_name() may be called if a name collision occurs (and
this is easy to trigger). This allocator performs a likely-to-fail
GFP_ATOMIC allocation to find a suitable number. Other possible
conditions that may cause error (for device being ns local or not
registered) are always false in this case.
So, when this call fails, the device is unregistered. But this is
*not* the right thing to do, since after this the device may be
released (and kfree-ed) improperly. E. g. bridges require more
actions (sysfs update, timer disarming, etc.), some other devices
want to remove their private areas from lists, etc.
I. e. arbitrary use-after-free cases may occur.
The proposed fix is the following: since the only reason for the
dev_change_net_namespace to fail is the name generation, we may
give it a unique fall-back name w/o %d-s in it - the dev<ifindex>
one, since ifindexes are still unique.
So make this change, raise the failure-case printk loglevel to
EMERG and replace the unregister_netdevice call with BUG().
[ Use snprintf() -DaveM ]
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 8 May 2008 08:15:21 +0000 (01:15 -0700)]
netfilter: nf_conntrack_sip: restrict RTP expect flushing on error to last request
Some Inovaphone PBXs exhibit very stange behaviour: when dialing for
example "123", the device sends INVITE requests for "1", "12" and
"123" back to back. The first requests will elicit error responses
from the receiver, causing the SIP helper to flush the RTP
expectations even though we might still see a positive response.
Note the sequence number of the last INVITE request that contained a
media description and only flush the expectations when receiving a
negative response for that sequence number.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 8 May 2008 08:13:31 +0000 (01:13 -0700)]
macvlan: Fix memleak on device removal/crash on module removal
As noticed by Ben Greear, macvlan crashes the kernel when unloading the
module. The reason is that it tries to clean up the macvlan_port pointer
on the macvlan device itself instead of the underlying device. A non-NULL
pointer is taken as indication that the macvlan_handle_frame_hook is
valid, when receiving the next packet on the underlying device it tries
to call the NULL hook and crashes.
Clean up the macvlan_port on the correct device to fix this.
Signed-off-by; Patrick McHardy <kaber@trash.net> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: J.H.M. Dassen (Ray) <jdassen@debian.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Thu, 8 May 2008 08:09:11 +0000 (01:09 -0700)]
tcp FRTO: SACK variant is errorneously used with NewReno
Note: there's actually another bug in FRTO's SACK variant, which
is the causing failure in NewReno case because of the error
that's fixed here. I'll fix the SACK case separately (it's
a separate bug really, though related, but in order to fix that
I need to audit tp->snd_nxt usage a bit).
There were two places where SACK variant of FRTO is getting
incorrectly used even if SACK wasn't negotiated by the TCP flow.
This leads to incorrect setting of frto_highmark with NewReno
if a previous recovery was interrupted by another RTO.
An eventual fallback to conventional recovery then incorrectly
considers one or couple of segments as forward transmissions
though they weren't, which then are not LOST marked during
fallback making them "non-retransmittable" until the next RTO.
In a bad case, those segments are really lost and are the only
one left in the window. Thus TCP needs another RTO to continue.
The next FRTO, however, could again repeat the same events
making the progress of the TCP flow extremely slow.
In order for these events to occur at all, FRTO must occur
again in FRTOs step 3 while the key segments must be lost as
well, which is not too likely in practice. It seems to most
frequently with some small devices such as network printers
that *seem* to accept TCP segments only in-order. In cases
were key segments weren't lost, things get automatically
resolved because those wrongly marked segments don't need to be
retransmitted in order to continue.
I found a reproducer after digging up relevant reports (few
reports in total, none at netdev or lkml I know of), some
cases seemed to indicate middlebox issues which seems now
to be a false assumption some people had made. Bugzilla
#10063 _might_ be related. Damon L. Chesser <damon@damtek.com>
had a reproducable case and was kind enough to tcpdump it
for me. With the tcpdump log it was quite trivial to figure
out.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
[POWERPC] spufs: don't requeue victim contex in find_victim if it's not in spu_run
We should not requeue the victim context in find_victim if the owner is
not in spu_run. It's first not needed because leaving the context on
the spu is an optimization and second is harmful because it means the
owner could re-enter spu_run when the context is on the runqueue and
trip the BUG_ON in __spu_update_sched_info.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Magnus Damm [Thu, 24 Apr 2008 12:47:15 +0000 (21:47 +0900)]
sh: no high level trigger on some sh3 cpus
The processor models sh7706, sh7707 and sh7709 don't support high
level trigger sense configuration. And the intc code looks like
crap these days so what's the difference.
Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Magnus Damm [Thu, 24 Apr 2008 12:41:12 +0000 (21:41 +0900)]
sh: clean up sh7710 and sh7720 intc tables
Clean up the intc tables by removing unneeded #ifdefs. The vector
list is what selects which interrupt sources that should be added,
having unsupported bitfields listed is ok as long as the vector
is excluded from the list.
Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Magnus Damm [Thu, 24 Apr 2008 12:36:34 +0000 (21:36 +0900)]
sh: add interrupt ack code to sh3
This patch adds interrupt acknowledge code for external interrupt
sources on sh3 processors. Only really required for edge triggered
interrupts, but we ack regardless of sense configuration.
Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Magnus Damm [Thu, 24 Apr 2008 12:30:09 +0000 (21:30 +0900)]
sh: unify external irq pin code for sh3
This patch unifies the sh3 external irq pin code. It buys us some
savings with reduced code redundancy, but the main feature with
this change is irq sense selection support for all sh3 processors.
Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
David S. Miller [Thu, 8 May 2008 01:54:05 +0000 (18:54 -0700)]
sparc: Fix SA_ONSTACK signal handling.
We need to be more liberal about the alignment of the buffer given to
us by sigaltstack(). The user should not need to be mindful of all of
the alignment constraints we have for the stack frame.
This mirrors how we handle this situation in clone() as well.
Also, we align the stack even in non-SA_ONSTACK cases so that signals
due to bad stack alignment can be delivered properly. This makes such
errors easier to debug and recover from.
Finally, add the sanity check x86 has to make sure we won't overflow
the signal stack.
This fixes glibc testcases nptl/tst-cancel20.c and
nptl/tst-cancelx20.c
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Jarzmik [Wed, 7 May 2008 19:39:06 +0000 (20:39 +0100)]
[ARM] 5032/1: Added cpufreq support for pxa27x CPU
PXA cpus maximum frequency depends on the cpu (624 for
pxa270, 520 for pxa272, 416 for pxa271). It should be
provided on kernel or module start (cpu-pxa
pxa27x_maxfreq parameter).
Make use of cpufreq_frequency_table_cpuinfo (patch by Bill
Reese provided by Philipp Zabel).
Some additionnal fixes from Philipp Zabel include :
* rename PXA cpufreq driver to reflect added PXA27x support
* remove unused variable ramstart from PXA cpufreq driver
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Robert Jarzmik <rjarzmik@free.fr> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 84f43c308b73a6a12128288721a1007ba4f1a8da "pxafb: introduce register
independent LCD connection type for pxafb" implements compatibility mode
for old style pxafb_mach_info initialization data wrongly, causing the
system to Oops repeatedly - first during probe, then when drawing. Fix it
and make pxafb_decode_mach_info void.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Acked-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
John Gregor [Wed, 7 May 2008 18:01:10 +0000 (11:01 -0700)]
IB/ipath: Fix SDMA error recovery in absence of link status change
What's fixed:
in ipath_cancel_sends()
We need to unconditionally set ABORTING. So, swap the tests
so the set_bit() isn't shadowed by the &&.
If we've disarmed the piobufs, then we need to unconditionally
set DISARMED. So, move it out from the overly protective if
at the bottom.
in sdma_abort_task()
Abort_task was written knowing that the SDMA engine would always
be reset (and restarted) on error. A recent change broke that
fundamental assumption by taking the restart portion and making
it conditional on a link status change. But, SDMA can go boom
without a link status change in some conditions.
Signed-off-by: John Gregor <john.gregor@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Dave Olson [Wed, 7 May 2008 18:00:15 +0000 (11:00 -0700)]
IB/ipath: Need to always request and handle PIO avail interrupts
Now that we always use PIO for vl15 on 7220, we could get stuck forever
if we happened to run out of PIO buffers from the verbs code, because
the setup code wouldn't run; the interrupt was also ignored if SDMA was
supported. We also have to reduce the pio update threshold if we have
fewer kernel buffers than the existing threshold.
Clean up the initialization a bit to get ordering safer and more
sensible, and use the existing ipath_chg_kernavail call to do init,
rather than doing it separately.
Drop unnecessary clearing of pio buffer on pio parity error.
Drop incorrect updating of pioavailshadow when exitting freeze mode
(software state may not match chip state if buffer has been allocated
and not yet written).
If we couldn't get a kernel buffer for a while, make sure we are
in sync with hardware, mainly to handle the exitting freeze case.
Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Michael Albaugh [Wed, 7 May 2008 17:59:23 +0000 (10:59 -0700)]
IB/ipath: Fix count of packets received by kernel
The loop in ipath_kreceive() that processes packets increments the
loop-index 'i' once too often, because the exit condition does not
depend on it, and is checked after the increment. By adding a check for
!last to the iterator in the for loop, we correct that in a way that is
not so likely to be re-broken by changes in the loop body.
Signed-off-by: Michael Albaugh <micheal.albaugh@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Dave Olson [Wed, 7 May 2008 17:57:48 +0000 (10:57 -0700)]
IB/ipath: Fix bug that can leave sends disabled after freeze recovery
The semantics of cancel_sends changed, but the code using it was missed.
Don't leave sends and pioavail updates disabled, and add a comment as to
why the force update is needed.
Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Michael Albaugh [Wed, 7 May 2008 17:56:47 +0000 (10:56 -0700)]
IB/ipath: Only warn about prototype chip during init
We warn about prototype chips, but the function that checks for
support is also called as a result of a get_portinfo request, which
can clutter the logs.
Restrict warning to only appear during initialization.
Signed-off-by: Michael Albaugh <michael.albaugh@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Herbert Xu [Tue, 6 May 2008 12:46:49 +0000 (20:46 +0800)]
[CRYPTO] hmac: Avoid calling virt_to_page on key
When HMAC gets a key longer than the block size of the hash, it needs
to feed it as input to the hash to reduce it to a fixed length. As
it is HMAC converts the key to a scatter and gather list. However,
this doesn't work on certain platforms if the key is not allocated
via kmalloc. For example, the keys from tcrypt are stored in the
rodata section and this causes it to fail with HMAC on x86-64.
This patch fixes this by copying the key to memory obtained via
kmalloc before hashing it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Jens Axboe [Wed, 7 May 2008 08:15:46 +0000 (10:15 +0200)]
block: avoid duplicate calls to get_part() in disk stat code
get_part() is fairly expensive, as it O(N) loops over partitions
to find the right one. In lots of normal IO paths we end up looking
up the partition twice, to make matters even worse. Change the
stat add code to accept a passed in partition instead.
Jens Axboe [Wed, 7 May 2008 07:51:23 +0000 (09:51 +0200)]
cfq-iosched: make io priorities inherit CPU scheduling class as well as nice
We currently set all processes to the best-effort scheduling class,
regardless of what CPU scheduling class they belong to. Improve that
so that we correctly track idle and rt scheduling classes as well.
Jan Kara [Tue, 6 May 2008 16:26:17 +0000 (18:26 +0200)]
udf: Fix memory corruption when fs mounted with noadinicb option
When UDF filesystem is mounted with noadinicb mount option, it
happens that we extend an empty directory with a block. A code in
udf_add_entry() didn't count with this possibility and used
uninitialized data leading to memory and filesystem corruption.
Add a check whether file already has some extents before operating
on them.
Jens Axboe [Wed, 7 May 2008 07:48:17 +0000 (09:48 +0200)]
block: optimize generic_unplug_device()
Original patch from Mikulas Patocka <mpatocka@redhat.com>
Mike Anderson was doing an OLTP benchmark on a computer with 48 physical
disks mapped to one logical device via device mapper.
He found that there was a slowdown on request_queue->lock in function
generic_unplug_device. The slowdown is caused by the fact that when some
code calls unplug on the device mapper, device mapper calls unplug on all
physical disks. These unplug calls take the lock, find that the queue is
already unplugged, release the lock and exit.
With the below patch, performance of the benchmark was increased by 18%
(the whole OLTP application, not just block layer microbenchmarks).
So I'm submitting this patch for upstream. I think the patch is correct,
because when more threads call simultaneously plug and unplug, it is
unspecified, if the queue is or isn't plugged (so the patch can't make
this worse). And the caller that plugged the queue should unplug it
anyway. (if it doesn't, there's 3ms timeout).
Miklos Szeredi [Wed, 7 May 2008 07:22:39 +0000 (09:22 +0200)]
vfs: splice remove_suid() cleanup
generic_file_splice_write() duplicates remove_suid() just because it
doesn't hold i_mutex. But it grabs i_mutex inside splice_from_pipe()
anyway, so this is rather pointless.
Move locking to generic_file_splice_write() and call remove_suid() and
__splice_from_pipe() instead.
Jens Axboe [Wed, 7 May 2008 07:17:12 +0000 (09:17 +0200)]
cfq-iosched: fix RCU race in the cfq io_context destructor handling
put_io_context() drops the RCU read lock before calling into cfq_dtor(),
however we need to hold off freeing there before grabbing and
dereferencing the first object on the list.
So extend the rcu_read_lock() scope to cover the calling of cfq_dtor(),
and optimize cfq_free_io_context() to use a new variant for
call_for_each_cic() that assumes the RCU read lock is already held.
Hit in the wild by Alexey Dobriyan <adobriyan@gmail.com>
Jens Axboe [Wed, 7 May 2008 07:27:43 +0000 (09:27 +0200)]
block: adjust tagging function queue bit locking
For most initialization purposes, calling blk_queue_init_tags() without
the queue lock held is OK. Only if called for resizing an existing map
must the lock be held. Ditto for tag cleanup, the maps are reference
counted.
So switch the general queue flag setting to the unlocked variant, but
retain the locked variant for resizing.
Roland McGrath [Wed, 7 May 2008 07:22:57 +0000 (09:22 +0200)]
[S390] compat ptrace cleanup
This removes redundant arch code for generic ptrace requests
already handled by ptrace_request and compat_ptrace_request.
It simplifies things to just have the standard entry points,
and use the generic compat_sys_ptrace.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Ernst [Wed, 7 May 2008 07:22:54 +0000 (09:22 +0200)]
[S390] cio: Remove cio_msg kernel parameter.
The only sporadically used CIO_DEBUG messages are replaced by ordinary
CIO_MSG_EVENT messages. The CIO_MSG_EVENT messages debug levels are
consolidated.
Signed-off-by: Michael Ernst <mernst@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[S390] s390-kvm: leave sie context on work. Removes preemption requirement
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch fixes a bug with cpu bound guest on kvm-s390. Sometimes it
was impossible to deliver a signal to a spinning guest. We used
preemption as a circumvention. The preemption notifiers called
vcpu_load, which checked for pending signals and triggered a host
intercept. But even with preemption, a sigkill was not delivered
immediately.
This patch changes the low level host interrupt handler to check for the
SIE instruction, if TIF_WORK is set. In that case we change the
instruction pointer of the return PSW to rerun the vcpu_run loop. The kvm
code sees an intercept reason 0 if that happens. This patch adds accounting
for these types of intercept as well.
The advantages:
- works with and without preemption
- signals are delivered immediately
- much better host latencies without preemption
Acked-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
On return from syscall or interrupt, we have to check if we return to
userspace (likely) and if there is work todo (less likely) to decide
if we handle the work. We can optimize this check: we first check for
the less likely work case and then check for userspace.
This patch is also a preparation for an additional patch, that fixes a bug
in KVM dealing with cpu bound guests.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>