Jan Beulich [Wed, 14 Jan 2009 12:27:35 +0000 (12:27 +0000)]
x86: fully honor "nolapic"
Impact: widen the effect of the 'nolapic' boot parameter
"nolapic" should not only suppress SMP and use of the LAPIC, but it
also ought to have the effect of disabling all IO-APIC related activity
as well as PCI MSI and HT-IRQs.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 14 Jan 2009 11:39:19 +0000 (12:39 +0100)]
sched: prefer wakers
Prefer tasks that wake other tasks to preempt quickly. This improves
performance because more work is available sooner.
The workload that prompted this patch was a kernel build over NFS4 (for some
curious and not understood reason we had to revert commit: 18de9735300756e3ca9c361ef58409d8561dfe0d to make any progress at all)
Without this patch a make -j8 bzImage (of x86-64 defconfig) would take
3m30-ish, with this patch we're down to 2m50-ish.
psql-sysbench/mysql-sysbench show a slight improvement in peak performance as
well, tbench and vmark seemed to not care.
It is possible to improve upon the build time (to 2m20-ish) but that seriously
destroys other benchmarks (just shows that there's more room for tinkering).
Much thanks to Mike who put in a lot of effort to benchmark things and proved
a worthy opponent with a competing patch.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 14 Jan 2009 11:39:18 +0000 (12:39 +0100)]
sched: introduce avg_wakeup
Introduce a new avg_wakeup statistic.
avg_wakeup is a measure of how frequently a task wakes up other tasks, it
represents the average time between wakeups, with a limit of avg_runtime
for when it doesn't wake up anybody.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Wed, 14 Jan 2009 23:43:54 +0000 (15:43 -0800)]
irq: update all arches for new irq_desc, fix
Impact: fix build errors
Since the SPARSE IRQS changes redefined how the kstat irqs are
organized, arch's must use the new accessor function:
kstat_incr_irqs_this_cpu(irq, DESC);
If CONFIG_SPARSE_IRQS is set, then DESC is a pointer to the
irq_desc which has a pointer to the kstat_irqs. If not, then
the .irqs field of struct kernel_stat is used instead.
Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Clemens Ladisch [Thu, 15 Jan 2009 09:21:23 +0000 (10:21 +0100)]
sound: virtuoso: do not overwrite EEPROM on Xonar D2/D2X
On the Asus Xonar D2 and D2X models, the SPI chip select signal for the
fourth DAC shares its pin with the serial clock for the EEPROM that
contains the PCI subdevice ID values. It appears that when DAC
registers are written and some other unknown conditions occur (probably
noise on the EEPROM's chip select line), the EEPROM gets overwritten
with garbage, which makes it impossible to properly detect the card
later.
Therefore, we better avoid DAC register writes and make sure that the
driver works with the DAC's registers' default values. Consequently,
the sample format is now I2S instead of left-justified (no user-visible
change), and the DAC's volume/mute registers cannot be used anymore
(volume changes are now done by the software volume plugin).
Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Cc: <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
tracing/function-graph-tracer: fix a regression while suspend to disk
Impact: fix a crash while kernel image restore
When the function graph tracer is running and while suspend to disk, some racy
and dangerous things happen against this tracer.
The current task will save its registers including the stack pointer which
contains the return address hooked by the tracer. But the current task will
continue to enter other functions after that to save the memory, and then
it will store other return addresses, and finally loose the old depth which
matches the return address saved in the old stack (during the registers saving).
So on image restore, the code will return to wrong addresses.
And there are other things: on restore, the task will have it's "current"
pointer overwritten during registers restoring....switching from one task to
another... That would be insane to try to trace function graphs at these
stages.
This patch makes the function graph tracer listening on power events, making
it's tracing disabled for the current task (the one that performs the
hibernation work) while suspend/resume to disk, making the tracing safe
during hibernation.
Steven Rostedt [Wed, 14 Jan 2009 19:50:19 +0000 (14:50 -0500)]
trace: stop all recording to ring buffer on ftrace_dump
Impact: limit ftrace dump output
Currently ftrace_dump only calls ftrace_kill that is a fast way
to prevent the function tracer functions from being called (just sets
a flag and clears the function to call, nothing else). It is better
to also turn off any recording to the ring buffers as well.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
when "if (RB_WARN_ON(cpu_buffer, next_page == reader_page))", ring_buffer
is disabled, but some reserved buffers may haven't been committed.
we need reset struct buffer_page.write.
when "if (unlikely(next_page == cpu_buffer->commit_page))", ring_buffer
is still available, we should not corrupt it.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 14 Jan 2009 17:24:42 +0000 (12:24 -0500)]
trace: print ftrace_dump at KERN_EMERG log level
Impact: fix to print out ftrace_dump when expected
I was debugging a hard race condition to only find out that
after I hit the race, my log level was not at level to show
KERN_INFO. The time it took to trigger the race was wasted because
I did not capture the trace.
Since ftrace_dump is only called from kernel oops (and only when
it is set in the kernel command line to do so), or when a
developer adds it to their own local tree, the log level of
the print should be at KERN_EMERG to make sure the print appears.
ftrace_dump is not called by a normal user setup, and will not
add extra unwanted print out to the console. There is no reason
it should be at KERN_INFO.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Roland Dreier [Thu, 15 Jan 2009 05:44:39 +0000 (21:44 -0800)]
IPoIB: Fix deadlock between ipoib_open() and child interface create
Fix a deadlock between child interface creation/deletion and ipoib
start/stop. The former takes vlan_mutex, and then might take RTNL via
register_netdev()/unregister_netdev(). The latter is executed with
RTNL held, and tries to take vlan_mutex, which can lead to an AB-BA
deadlock.
Fix this by having the child interface creation/deletion code take the
RTNL first so vlan_mutex always nests inside RTNL. We can use
register_netdevice() for child interfaces because we form the
interface name from the parent interface and hence don't need the '%'
expansion of register_netdev().
Reported-by: Yossi Etigin <yosefe@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Peter Korsgaard [Thu, 15 Jan 2009 05:32:58 +0000 (22:32 -0700)]
fsldma: check for NO_IRQ in fsl_dma_chan_remove()
There's no per-channel IRQ on mpc83xx, so only call free_irq if we have one.
Acked-by: Timur Tabi <timur@freescale.com> Acked-by: Li Yang <leoli@freescale.com> Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Oliver Hartkopp [Thu, 15 Jan 2009 05:06:55 +0000 (21:06 -0800)]
can: fix slowpath issue in hrtimer callback function
Due to the loopback functionality in can_send() we can not invoke it
from hardirq context which was done inside the
bcm_tx_timeout_handler() hrtimer callback:
Magnus Damm [Thu, 15 Jan 2009 05:05:55 +0000 (21:05 -0800)]
ax88796: start_xmit fix using net_device_ops
This patch hooks up the start_xmit/tx_timeout/get_stats callbacks
in the ax88796 driver since they no longer are installed by the
lib8390 code. Without this patch the function dev_hard_start_xmit()
crashes due to a start_xmit callback with the value NULL.
While at it, update the ax88796 driver to make use of use of struct
net_device_ops.
Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Add init_dummy_netdev() and fix EMAC driver using it
This adds an init_dummy_netdev() function that gets a network device
structure (allocation and lifetime entirely under caller's control) and
initialize the minimum amount of fields so it can be used to schedule
NAPI polls without registering a full blown interface. This is to be
used by drivers that need to tie several hardware interfaces to a single
NAPI poll scheduler due to HW limitations.
It also updates the ibm_newemac driver to use that, this fixing the
oops on 2.6.29 due to passing NULL as "dev" to netif_napi_add()
Symbol is exported GPL only a I don't think we want binary drivers doing
that sort of acrobatics (if we want them at all).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Taken from http://bugzilla.kernel.org/show_bug.cgi?id=12397
We're doing an sprintf of an 11-char string into an 11-char buffer.
Whoops. It breaks firmware uploading.
Reported-by: Jos-Vicente Gilabert <josevteg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Thu, 15 Jan 2009 04:49:43 +0000 (20:49 -0800)]
netxen: hold tx lock while sending firmware commands
Some firmware commands like mac address addition/deletion are sent
on the transmit ring. So need to hold the tx lock before touching
tx producer/consumer indices.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Thu, 15 Jan 2009 04:48:11 +0000 (20:48 -0800)]
netxen: fix ipv6 offload and tx cleanup
o fix the ip/tcp hdr offset in tx descriptors for ipv6.
o cleanup xmit function, move the tso checks into separate function,
this reduces unnecessary endian conversions back and forth.
o optimize macros to initialize tx descriptors.
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniele Venzano [Thu, 15 Jan 2009 04:46:24 +0000 (20:46 -0800)]
sis900: generate fake MAC address if the hardware doesn't have one
The attached patch modifies the sis900 driver when the MAC address
read from the hardware is invalid. As suggested, the patch now
generates a random address so that the user can go on and use
the hardware. In any case a message is also shown to warn on the
unexpected condition.
This seems to happen with newer HW implementation of the sis900
chipset, since this never came up before.
Patch is against vanilla 2.6.28 (but the driver doesn't change so often,
so it will probably apply to older/newer versions too).
See bugzilla ID 10201 and 11649 and ignore the previous patch.
Signed-off-by: Daniele Venzano <venza@brownhat.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 15 Jan 2009 04:41:12 +0000 (20:41 -0800)]
gso: Ensure that the packet is long enough
When we get a GSO packet from an untrusted source, we need to
ensure that it is sufficiently long so that we don't end up
crashing.
Based on discovery and patch by Ian Campbell.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 15 Jan 2009 04:40:03 +0000 (20:40 -0800)]
gro: Fix page ref count for skbs freed normally
When an skb with page frags is merged into an existing one, we
cannibalise its reference count. This is OK when the skb is
reused because we set nr_frags to zero in that case. However,
for the case where the skb is freed through kfree_skb, we didn't
clear nr_frags which causes the page to be freed prematurely.
This is fixed by moving the skb resetting into skb_gro_receive.
Reported-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Tue, 13 Jan 2009 00:26:18 +0000 (11:26 +1100)]
crypto: authenc - Fix zero-length IV crash
As it is if an algorithm with a zero-length IV is used (e.g.,
NULL encryption) with authenc, authenc may generate an SG entry
of length zero, which will trigger a BUG check in the hash layer.
This patch fixes it by skipping the IV SG generation if the IV
size is zero.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Linus Torvalds [Thu, 15 Jan 2009 04:00:28 +0000 (20:00 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (29 commits)
powerpc/83xx: Move mcu_mpc8349emitx driver out of drivers/i2c/chips/
powerpc/83xx: Make serial ports work on MPC8315E-RDB w/ FSL U-Boots
powerpc/e500mc: Doorbells need to be taken w/exceptions disabled
powerpc: Enable PS3 options and QPACE in ppc64_defconfig
powerpc/powermac: Fix occasional SMP boot failure
powerpc/cacheinfo: Rename cache_dir per-cpu variable
hvc_console: Use kzalloc() instead of kmalloc() + memset()
hvc_console: Do not set low_latency when using interrupts
hvc_console: Call free_irq() only if request_irq() was successful
hvc_console: Change an mb() to smp_mb() and add some comments
powerpc: Cleanup from l64 to ll64 change: drivers/net
powerpc: Cleanup from l64 to ll64 change: drivers/char
powerpc: Cleanup from l64 to ll64 change: arch code
powerpc: Change u64/s64 to a long long integer type
powerpc/kexec: Check crash_base for relocatable kernel
powerpc: Make dummy section a valid note header
Xilinx: SPI: updated driver for device tree
drivers/of: Add the of_find_i2c_device_by_node function.
powerpc/xsysace: add compatible string for non-ipcore instance
powerpc/mpc52xx: remove dead code from GPIO driver
...
Linus Torvalds [Thu, 15 Jan 2009 03:58:40 +0000 (19:58 -0800)]
Merge branch 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (44 commits)
[CVE-2009-0029] s390 specific system call wrappers
[CVE-2009-0029] System call wrappers part 33
[CVE-2009-0029] System call wrappers part 32
[CVE-2009-0029] System call wrappers part 31
[CVE-2009-0029] System call wrappers part 30
[CVE-2009-0029] System call wrappers part 29
[CVE-2009-0029] System call wrappers part 28
[CVE-2009-0029] System call wrappers part 27
[CVE-2009-0029] System call wrappers part 26
[CVE-2009-0029] System call wrappers part 25
[CVE-2009-0029] System call wrappers part 24
[CVE-2009-0029] System call wrappers part 23
[CVE-2009-0029] System call wrappers part 22
[CVE-2009-0029] System call wrappers part 21
[CVE-2009-0029] System call wrappers part 20
[CVE-2009-0029] System call wrappers part 19
[CVE-2009-0029] System call wrappers part 18
[CVE-2009-0029] System call wrappers part 17
[CVE-2009-0029] System call wrappers part 16
[CVE-2009-0029] System call wrappers part 15
...
Linus Torvalds [Thu, 15 Jan 2009 03:55:25 +0000 (19:55 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
[XFS] Update maintainers
[XFS] use scalable vmap API
[XFS] remove old vmap cache
[XFS] make xfs_ino_t an unsigned long long
[XFS] truncate readdir offsets to signed 32 bit values
[XFS] fix compile of xfs_btree_readahead_lblock on m68k
[XFS] Remove macro-to-function indirections in the mask code
[XFS] Remove macro-to-function indirections in attr code
[XFS] Remove several unused typedefs.
[XFS] pass XFS_IGET_BULKSTAT to xfs_iget for handle operations
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
IDE: fix sparse signed-ness errors with host->host_busy
ide: fix suspend regression
tx4938ide: Fix build error due to read_sff_dma_status moving
ide: remove unused CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ
sl82c105: remove dead code
via82cxxx: fix cable warning message
ide: can't use SSD/non-rotational queue flag for all CFA devices
it821x.c: use dev->revision instead of pci_read_config_byte
it821x: Add ultra_mask quirk for Vortex86SX
ide: fix accidental LOCKDEP breakage caused by local_irq_set() removal
Roland Dreier [Wed, 14 Jan 2009 22:55:41 +0000 (14:55 -0800)]
IPoIB: Fix hang in napi_disable() if P_Key is never found
After commit fe25c561 ("IPoIB: Don't enable NAPI when it's already
enabled"), if an interface is brought up but the corresponding P_Key
never appears, then ipoib_stop() will hang in napi_disable(), because
ipoib_open() returns before it does napi_enable().
Fix this by changing ipoib_open() to call napi_enable() even if the
P_Key isn't present.
Reported-by: Yossi Etigin <yosefe@Voltaire.COM> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Paul Bolle [Wed, 14 Jan 2009 22:41:00 +0000 (14:41 -0800)]
i4l: do not print a warning when shutting down an i4l ppp interface
When an i4l ppp interface is shut down (e.g. with /sbin/ifdown ippp0) a
scary warning is logged:
isdn_free_channel: called with invalid drv(-1) or channel(-1)
This warning is caused by isdn_net_unbind_channel(), which always calls
isdn_free_channel() even if isdn_net_local->isdn_device and
isdn_net_local->isdn_channel are (still) in a perfectly acceptable
default state, so let's not do that.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
PHYID returns 0xffff and not 0xffffffff when not found and in some
case(at91sam9263) 0x0. Maybe this patch could be useful.
phy_device.c treats PHY ID == 0x0 as bogus IDs, and that results in
gianfar driver failure to see the TBI PHYs. This code snippet triggers:
if (!priv->tbiphy) {
printk(KERN_WARNING "SGMII mode requires that the device "
"tree specify a tbi-handle\n");
return;
}
Although tbi-handle is specified in the device tree.
Btw, technically PHY ID == 0x0 is a valid ID (if we ever see a PHY
manufactured by Xerox :-).
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 14 Jan 2009 22:36:12 +0000 (14:36 -0800)]
gro: Check for GSO packets and packets with frag_list
As GRO cannot be applied to packets with frag_list we need to
make sure that we reject such packets if they are fed to us,
e.g., through a tunnel device.
Also there is no point in applying GRO on GSO packets so they
too should be rejected. This allows GRO to be used in virtio-net
which may produce GSO packets directly but may still benefit
from GRO if the other end of it doesn't support GSO.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Cyrill Gorcunov [Wed, 14 Jan 2009 20:37:47 +0000 (23:37 +0300)]
x86: headers cleanup - prctl.h
Impact: cleanup (internal kernel function exported)
'make headers_check' warn us about leaking of kernel private
(mostly compile time vars) data to userspace in headers. Fix it.
sys_arch_prctl is completely removed from
header since frankly I don't even understand why we
describe it here. It is described like
__SYSCALL(__NR_arch_prctl, sys_arch_prctl) in unistd_64.h
and implemented in process_64.c. User-mode linux involved?
So this one in fact is suspicious.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
These stripping patches has caused a set of issues:
1) People have reported compatibility issues with binutils due to
lack of support for `--strip-unneeded-symbols' with objcopy 2.15.92.0.2
Reported by: Wenji
2) ccache and distcc no longer works as expeced
Reported by: Ted, Roland, + others
3) The installed modules increased a lot in size
Reported by: Ted, Davej + others
Reported-by: Wenji Huang <wenji.huang@oracle.com> Reported-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Dave Jones <davej@redhat.com> Reported-by: Roland McGrath <roland@redhat.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
H. Peter Anvin [Wed, 14 Jan 2009 19:28:35 +0000 (11:28 -0800)]
init: make initrd/initramfs decompression failure a KERN_EMERG event
Impact: More consistent behaviour, avoid policy in the kernel
Upgrade/downgrade initrd/initramfs decompression failure from
inconsistently a panic or a KERN_ALERT message to a KERN_EMERG event.
It is, however, possible do design a system which can recover from
this (using the kernel builtin code and/or the internal initramfs),
which means this is policy, not a technical necessity.
A good way to handle this would be to have a panic-level=X option, to
force a panic on a printk above a certain level. That is a separate
patch, however.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Suresh Siddha [Fri, 9 Jan 2009 22:35:20 +0000 (14:35 -0800)]
x86, pat: fix reserve_memtype() for legacy 1MB range
Thierry Vignaud reported:
> http://bugzilla.kernel.org/show_bug.cgi?id=12372
>
> On P4 with an SiS motherboard (video card is a SiS 651)
> X server fails to start with error:
> xf86MapVidMem: Could not mmap framebuffer (0x00000000,0x2000) (Invalid
> argument)
Here X is trying to map first 8KB of memory using /dev/mem. Existing
code treats first 0-4KB of memory as non-RAM and 4KB-8KB as RAM. Recent
code changes don't allow to map memory with different attributes
at the same time.
Fix this by treating the first 1MB legacy region as special and always
track the attribute requests with in this region using linear linked
list (and don't bother if the range is RAM or non-RAM or mixed)
Ben Dooks [Wed, 14 Jan 2009 18:19:04 +0000 (19:19 +0100)]
IDE: fix sparse signed-ness errors with host->host_busy
The host_busy field in struct ide_host defaults to a
signed-long, where most arch's test_and_set_bit_*
macros use an unsigned long.
Change to using an unsigned long, which on ARM removes
the following sparse errors:
drivers/ide/ide-io.c:681:8: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:681:8: expected unsigned long volatile *p
drivers/ide/ide-io.c:681:8: got long volatile *<noident>
drivers/ide/ide-io.c:681:8: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:681:8: expected unsigned long volatile *p
drivers/ide/ide-io.c:681:8: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
drivers/ide/ide-io.c:695:3: warning: incorrect type in argument 2 (different signedness)
drivers/ide/ide-io.c:695:3: expected unsigned long volatile *p
drivers/ide/ide-io.c:695:3: got long volatile *<noident>
Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
On Monday 12 January 2009, Simon Holm Thøgersen wrote:
> commit 295f000 ("ide: don't execute the next queued command from the
> hard-IRQ context (v2)") breaks suspend to disk for me. On
> 'echo disk > /sys/power/state' the systems hangs, letting me switch
> virtual consoles, but not responding to Alt+SysRq
Restart the request queue early for REQ_TYPE_PM_RESUME requests
(though there is only one resume request for the whole resume
sequence it stays in the queue until is fully completed and now
depends on kblockd for processing consequential resume states).
Reported-and-bisected-by: Simon Holm Thøgersen <odie@cs.aau.dk> Tested-by: Simon Holm Thøgersen <odie@cs.aau.dk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
ide: fix accidental LOCKDEP breakage caused by local_irq_set() removal
commit 54cc1428cfa619e16d75baae8cb041a2eff015f0 ("ide: remove
local_irq_set() macro") accidentally replaced local_save_flags()
by local_irq_set() in ide_probe_port() and __ide_wait_stat()
which resulted in LOCKDEP breakage.
Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Chris Mason [Wed, 14 Jan 2009 16:29:31 +0000 (17:29 +0100)]
mutex: adaptive spinnning, performance tweaks
Spin more agressively. This is less fair but also markedly faster.
The numbers:
* dbench 50 (higher is better):
spin 1282MB/s
v10 548MB/s
v10 no wait 1868MB/s
* 4k creates (numbers in files/second higher is better):
spin avg 200.60 median 193.20 std 19.71 high 305.93 low 186.82
v10 avg 180.94 median 175.28 std 13.91 high 229.31 low 168.73
v10 no wait avg 232.18 median 222.38 std 22.91 high 314.66 low 209.12
* File stats (numbers in seconds, lower is better):
spin 2.27s
v10 5.1s
v10 no wait 1.6s
( The source changes are smaller than they look, I just moved the
need_resched checks in __mutex_lock_common after the cmpxchg. )
Signed-off-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Andrew Morton [Wed, 14 Jan 2009 17:35:44 +0000 (09:35 -0800)]
kernel/up.c: omit it if SMP=y, USE_GENERIC_SMP_HELPERS=n
Fix the sparc build - we were including `up.o' on SMP builds, when
CONFIG_USE_GENERIC_SMP_HELPERS=n.
Tested-by: Robert Reif <reif@earthlink.net> Fixed-by: Robert Reif <reif@earthlink.net> Cc: David Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Mon, 12 Jan 2009 13:01:47 +0000 (14:01 +0100)]
mutex: implement adaptive spinning
Change mutex contention behaviour such that it will sometimes busy wait on
acquisition - moving its behaviour closer to that of spinlocks.
This concept got ported to mainline from the -rt tree, where it was originally
implemented for rtmutexes by Steven Rostedt, based on work by Gregory Haskins.
Testing with Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50)
gave a 345% boost for VFS scalability on my testbox:
The key criteria for the busy wait is that the lock owner has to be running on
a (different) cpu. The idea is that as long as the owner is running, there is a
fair chance it'll release the lock soon, and thus we'll be better off spinning
instead of blocking/scheduling.
Since regular mutexes (as opposed to rtmutexes) do not atomically track the
owner, we add the owner in a non-atomic fashion and deal with the races in
the slowpath.
Furthermore, to ease the testing of the performance impact of this new code,
there is means to disable this behaviour runtime (without having to reboot
the system), when scheduler debugging is enabled (CONFIG_SCHED_DEBUG=y),
by issuing the following command:
# echo NO_OWNER_SPIN > /debug/sched_features
This command re-enables spinning again (this is also the default):
# echo OWNER_SPIN > /debug/sched_features
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 14 Jan 2009 14:36:26 +0000 (15:36 +0100)]
mutex: preemption fixes
The problem is that dropping the spinlock right before schedule is a voluntary
preemption point and can cause a schedule, right after which we schedule again.
Fix this inefficiency by keeping preemption disabled until we schedule, do this
by explicity disabling preemption and providing a schedule() variant that
assumes preemption is already disabled.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Nick Piggin [Wed, 14 Jan 2009 06:28:16 +0000 (07:28 +0100)]
mm: fix assertion
This assertion is incorrect for lockless pagecache. By definition if we
have an unpinned page that we are trying to take a speculative reference
to, it may become the tail of a compound page at any time (if it is
freed, then reallocated as a compound page).
It was still a valid assertion for the vmscan.c LRU isolation case, but
it doesn't seem incredibly helpful... if somebody wants it, they can
put it back directly where it applies in the vmscan code.
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pekka Enberg [Wed, 14 Jan 2009 10:22:25 +0000 (12:22 +0200)]
SLUB: Use ->objsize from struct kmem_cache_cpu in slab_free()
There's no reason to use ->objsize from struct kmem_cache in slab_free() for
the SLAB_DEBUG_OBJECTS case. All it does is generate extra cache pressure as we
try very hard not to touch struct kmem_cache in the fast-path.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Gregory Haskins [Wed, 14 Jan 2009 14:10:04 +0000 (09:10 -0500)]
sched: fix build error in kernel/sched_rt.c when RT_GROUP_SCHED && !SMP
Ingo found a build error in the scheduler when RT_GROUP_SCHED was
enabled, but SMP was not. This patch rearranges the code such
that it is a little more streamlined and compiles under all permutations
of SMP, UP and RT_GROUP_SCHED. It was boot tested on my 4-way x86_64
and it still passes preempt-test.