Ingo Molnar [Wed, 26 Nov 2008 13:13:42 +0000 (14:13 +0100)]
sched: convert struct root_domain to cpumask_var_t, fix
Mathieu Desnoyers reported this build failure on powerpc:
kernel/sched.c: In function 'sd_init_NODE':
kernel/sched.c:7319: error: non-static initialization of a flexible array member
kernel/sched.c:7319: error: (near initialization for '(anonymous)')
This happens because .span changed to cpumask_var_t, so
the static CPU_MASK_NONE initializers in the SD_*_INIT
templates are no longer type-correct.
Takashi Iwai [Wed, 26 Nov 2008 13:13:03 +0000 (14:13 +0100)]
ALSA: pcsp - Fix starting the stream with HRTIMER_CB_IRQSAFE_UNLOCK
With the callback mode HRTIMER_CB_IRQSAFE_UNLOCK, the start of the
stream with zero delay doesn't work. Since IRQSAFE mode is removed,
we have to change the pcsp start-up code.
This patch splits the callback function into two parts: the first
triggers the port and calculates the expire time, and the second
updates the ALSA PCM core. The first part is called from both the
trigger-start path and the hrtimer callback, while the second is
handled only in the hrtimer callback.
Ingo Molnar [Wed, 26 Nov 2008 10:59:56 +0000 (11:59 +0100)]
blktrace: port to tracepoints, update
Port to the new tracepoints API: split DEFINE_TRACE() and DECLARE_TRACE()
sites. Spread them out to the usage sites, as suggested by
Mathieu Desnoyers.
Patrick McHardy [Wed, 26 Nov 2008 11:57:44 +0000 (03:57 -0800)]
netfilter: ctnetlink: fix GFP_KERNEL allocation under spinlock
The previous fix for the conntrack creation race (netfilter: ctnetlink:
fix conntrack creation race) missed a GFP_KERNEL allocation that is
now performed while holding a spinlock. Switch to GFP_ATOMIC.
Reported-and-tested-by: Zoltan Borbely <bozo@andrews.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
This was a forward port of work done by Mathieu Desnoyers. I changed it to
encode the 'what' parameter in the tracepoint name, so that one can register
interest in specific events rather than in classes of events and then check
the 'what' parameter.
Tejun Heo [Wed, 26 Nov 2008 11:03:55 +0000 (12:03 +0100)]
fuse: add fuse_ prefix to several functions
Add fuse_ prefix to request_send*() and get_root_inode() as some of
those functions will be exported for CUSE. With or without CUSE
export, having the function names scoped is a good idea for
debuggability.
Tejun Heo [Wed, 26 Nov 2008 11:03:55 +0000 (12:03 +0100)]
fuse: implement poll support
Implement poll support. Polled files are indexed by kh in an RB
tree rooted at fuse_conn->polled_files.
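
A minimal userspace sketch of the indexing idea, assuming a plain
unbalanced binary search tree in place of the kernel's RB tree. The names
polled_file, polled_insert and polled_lookup are hypothetical and are not
taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a polled file; only the key matters here. */
struct polled_file {
    uint64_t kh;                       /* unique kernel handle, the tree key */
    struct polled_file *left, *right;
};

/* Insert node under *root. Returns 0, or -1 if the kh is already present. */
int polled_insert(struct polled_file **root, struct polled_file *node)
{
    struct polled_file **link = root;

    while (*link) {
        if (node->kh < (*link)->kh)
            link = &(*link)->left;
        else if (node->kh > (*link)->kh)
            link = &(*link)->right;
        else
            return -1;
    }
    node->left = node->right = NULL;
    *link = node;
    return 0;
}

/* Find the file for a given kh, or NULL if no such file is being polled. */
struct polled_file *polled_lookup(struct polled_file *root, uint64_t kh)
{
    while (root) {
        if (kh < root->kh)
            root = root->left;
        else if (kh > root->kh)
            root = root->right;
        else
            return root;
    }
    return NULL;
}
```

A real RB tree additionally rebalances on insert, which keeps the lookup
logarithmic; the sketch omits that for brevity.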
Client should send FUSE_NOTIFY_POLL notification once after processing
FUSE_POLL which has FUSE_POLL_SCHEDULE_NOTIFY set. Sending the
notification unconditionally after the latest poll, or every time file
content might have changed, is inefficient but won't cause malfunction.
fuse_file_poll() can sleep and requires patches from the following
thread which allows f_op->poll() to sleep.
Tejun Heo [Wed, 26 Nov 2008 11:03:55 +0000 (12:03 +0100)]
fuse: implement unsolicited notification
Clients always used to write only in response to read requests. To
implement poll efficiently, clients should be able to issue
unsolicited notifications. This patch implements basic notification
support.
Zero fuse_out_header.unique is now accepted and considered unsolicited
notification and the error field contains notification code. This
patch doesn't implement any actual notification.
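
The rule above can be sketched as follows. This is an illustration only:
struct out_header is a simplified, hypothetical mirror of the reply header,
and classify() is not a function from the patch.

```c
#include <stdint.h>

/* Simplified, hypothetical mirror of the reply header described above. */
struct out_header {
    uint32_t len;
    int32_t  error;    /* error code for replies, notification code otherwise */
    uint64_t unique;   /* request id; zero marks an unsolicited notification */
};

enum msg_kind { MSG_REPLY, MSG_NOTIFY };

/* Decide how to treat a message and hand back the value of the error field. */
enum msg_kind classify(const struct out_header *h, int *code)
{
    *code = h->error;
    return h->unique == 0 ? MSG_NOTIFY : MSG_REPLY;
}
```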
Tejun Heo [Wed, 26 Nov 2008 11:03:55 +0000 (12:03 +0100)]
fuse: add file kernel handle
The file handle, fuse_file->fh, is an opaque value supplied by the
userland FUSE server, and its uniqueness is not guaranteed. Add a file
kernel handle, fuse_file->kh, which is allocated by the kernel on file
allocation and guaranteed to be unique.
This will be used by poll to match notifications to the respective file
but can be used for other purposes where a unique file handle is
necessary.
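
One simple way to get such a handle is a counter that only ever moves
forward. The sketch below assumes that approach; struct conn and alloc_kh()
are hypothetical names, and real code would need locking around the
increment.

```c
#include <stdint.h>

/* Hypothetical per-connection state holding the last handle handed out. */
struct conn {
    uint64_t last_kh;
};

/*
 * Return a handle that has not been handed out before on this connection.
 * Not thread-safe in this sketch; zero is never returned.
 */
uint64_t alloc_kh(struct conn *c)
{
    return ++c->last_kh;
}
```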
Tejun Heo [Wed, 26 Nov 2008 11:03:55 +0000 (12:03 +0100)]
fuse: implement ioctl support
Generic ioctl support is tricky to implement because only the ioctl
implementation itself knows which memory regions need to be read
and/or written. To support this, the fuse client can request a retry
of the ioctl, specifying the memory regions to read and write. Deep
copying (nested pointers) can be implemented by retrying multiple
times, resolving one depth of dereference at a time.
For security and cleanliness considerations, the ioctl implementation
has a restricted mode where the kernel determines data transfer
directions and sizes using the _IOC_*() macros on the ioctl command.
In this mode, retry is not allowed.
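
A minimal userspace sketch of how direction and size can be read from a
command number with the _IOC_*() macros, assuming Linux headers are
available. struct xfer and restricted_xfer() are hypothetical names used
for illustration; this is not the kernel's implementation.

```c
#include <stddef.h>
#include <linux/ioctl.h>

/* Hypothetical result type: how many bytes flow in each direction. */
struct xfer {
    size_t in_size;    /* bytes copied from the caller to the server */
    size_t out_size;   /* bytes copied from the server back to the caller */
};

/* Derive the transfer sizes purely from the encoding of the command. */
struct xfer restricted_xfer(unsigned int cmd)
{
    struct xfer x = { 0, 0 };
    size_t size = _IOC_SIZE(cmd);

    if (_IOC_DIR(cmd) & _IOC_WRITE)    /* caller writes: data goes in */
        x.in_size = size;
    if (_IOC_DIR(cmd) & _IOC_READ)     /* caller reads: data comes out */
        x.out_size = size;
    return x;
}
```

For example, a command built with _IOWR('x', 2, long) yields sizeof(long)
in both directions, while one built with _IO('x', 3) transfers nothing.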
For all FUSE servers, restricted mode is enforced. Unrestricted ioctl
will be used by CUSE.
Please read the comment on top of fs/fuse/file.c::fuse_file_do_ioctl()
for more information.
Tejun Heo [Wed, 26 Nov 2008 11:03:54 +0000 (12:03 +0100)]
fuse: don't let fuse_req->end() put the base reference
fuse_req->end() was supposed to put the base reference but there's
no reason why it should. It only makes things more complex. Move the
put out of ->end() and make it the responsibility of request_end().
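
The resulting ownership rule can be sketched like this. All names here
(struct req, put_request, mark_ended, request_end_sketch) are hypothetical
and simplified; the real structures carry much more state.

```c
#include <stddef.h>

struct req {
    int count;                      /* reference count */
    void (*end)(struct req *req);   /* optional completion callback */
    int ended;                      /* set by the callback, for illustration */
};

/* Drop one reference. A real implementation would free at zero. */
void put_request(struct req *req)
{
    req->count--;
}

/* Example callback: does its own work and does not touch the refcount. */
void mark_ended(struct req *req)
{
    req->ended = 1;
}

/* The caller of ->end() drops the base reference, not the callback. */
void request_end_sketch(struct req *req)
{
    if (req->end)
        req->end(req);
    put_request(req);
}
```

With this shape, a request with no callback and a request with a callback
release the base reference along the same path.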
Marcelo Tosatti [Tue, 25 Nov 2008 14:33:10 +0000 (15:33 +0100)]
KVM: MMU: avoid creation of unreachable pages in the shadow
It is possible for a shadow page to have a parent link
pointing to a freed page. When zapping a high level table,
kvm_mmu_page_unlink_children fails to remove the parent_pte link.
For that to happen, the child must be unreachable via the shadow
tree, which can happen in shadow_walk_entry if the guest pte was
modified in between walk() and fetch(). Remove the parent pte
reference in such case.
This fixes broken terminology added in the "m25p80.c erase enhance" patch,
which added a chip erase command but called it "block erase". There are
already two block erase commands; blocks are 4KiB or 32KiB. There's also
a sector erase (usually 64 KiB). Chip erase typically covers Megabytes.
[dbrownell@users.sourceforge.net: update sector erase comments too ]
Signed-off-by: Chen Gong <clumsycg@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Mike Frysinger [Wed, 26 Nov 2008 10:23:35 +0000 (10:23 +0000)]
[MTD] m25p80: fix detection of m25p16 flashes
Commit d0e8c47c58575b9131e786edb488fd029eba443e ("m25p80.c extended jedec
support") added support for extended ids but seems to break on flashes
which don't have an extended id defined. If the table does not have an
extid defined, then we should ignore it.
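
The matching rule can be sketched as below, assuming that a zero ext_id in
the table means "no extended id defined". struct flash_entry and
entry_matches() are hypothetical names, not the driver's.

```c
#include <stdint.h>

/* Hypothetical table entry; ext_id == 0 means no extended id is defined. */
struct flash_entry {
    const char *name;
    uint32_t jedec_id;
    uint16_t ext_id;
};

/* Return 1 if the ids read from the chip match this table entry. */
int entry_matches(const struct flash_entry *e, uint32_t jedec, uint16_t ext)
{
    if (e->jedec_id != jedec)
        return 0;
    /* Only compare extended ids when the table actually defines one. */
    if (e->ext_id != 0 && e->ext_id != ext)
        return 0;
    return 1;
}
```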
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Michael Hennerich <Michael.Hennerich@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Mike Frysinger [Wed, 26 Nov 2008 10:23:25 +0000 (10:23 +0000)]
[MTD] m25p80: fix detection of SPI parts
Commit d0e8c47c58575b9131e786edb488fd029eba443e ("m25p80.c extended jedec
support") added support for extended ids but in the process managed to
break detection of all flashes.
The ext jedec id check was inserted into an if statement that lacked
braces, and it did not add the required braces. As such, the detection
routine always returns the first entry in the SPI flash list.
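
An illustration of the class of bug, not the driver's actual code. The
comment shows a brace-less shape that fails in the way described, and the
function shows the braced form. struct spi_part and find_part() are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct spi_part {            /* hypothetical table entry */
    const char *name;
    uint32_t jedec_id;
    uint16_t ext_id;         /* 0: entry has no extended id */
};

/*
 * Buggy shape (do not use): without braces only the nested `if` belongs
 * to the outer `if`, so the `return` is reached on the first iteration
 * regardless of whether the jedec id matched:
 *
 *     if (p->jedec_id == jedec)
 *         if (p->ext_id != 0 && p->ext_id != ext)
 *             continue;
 *         return p;
 */

/* Braced form: return only an entry whose ids actually match. */
const struct spi_part *find_part(const struct spi_part *tbl, size_t n,
                                 uint32_t jedec, uint16_t ext)
{
    for (size_t i = 0; i < n; i++) {
        const struct spi_part *p = &tbl[i];

        if (p->jedec_id == jedec) {
            if (p->ext_id != 0 && p->ext_id != ext)
                continue;
            return p;
        }
    }
    return NULL;
}
```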
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Eric Dumazet [Wed, 26 Nov 2008 09:08:18 +0000 (01:08 -0800)]
net: release skb->dst in sock_queue_rcv_skb()
When queuing a skb to sk->sk_receive_queue, we can release its dst,
which is no longer needed. Since the current cpu did the dst_hold(),
the refcount is probably still hot in this cpu's caches.
This saves readers from having to access the original dst to decrement
its refcount, possibly a long time after packet reception. This should
speed up the UDP and RAW receive paths.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Eder [Sun, 23 Nov 2008 19:49:52 +0000 (20:49 +0100)]
x86: microcode: fix sparse warnings
Impact: make global variables and a function static
Fix following sparse warnings:
arch/x86/kernel/microcode_core.c:102:22: warning: symbol
'microcode_ops' was not declared. Should it be static?
arch/x86/kernel/microcode_core.c:206:24: warning: symbol
'microcode_pdev' was not declared. Should it be static?
arch/x86/kernel/microcode_core.c:322:6: warning: symbol
'microcode_update_cpu' was not declared. Should it be static?
arch/x86/kernel/microcode_intel.c:468:22: warning: symbol
'microcode_intel_ops' was not declared. Should it be static?
Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arjan van de Ven [Mon, 24 Nov 2008 00:49:58 +0000 (16:49 -0800)]
tracing: add "power-tracer": C/P state tracer to help power optimization
Impact: new "power-tracer" ftrace plugin
This patch adds a C/P-state ftrace plugin that will generate
detailed statistics about the C/P-states that are being used,
so that we can look at detailed decisions that the C/P-state
code is making, rather than the too high level "average"
that we have today.
Rusty Russell [Mon, 24 Nov 2008 23:29:20 +0000 (09:59 +1030)]
sched: avoid stack var in move_task_off_dead_cpu, fix
Impact: locking fix
We can't call cpuset_cpus_allowed_locked() with the rq lock held.
However, the rq lock merely protects us from (1) cpu_online_mask changing
and (2) someone else changing p->cpus_allowed.
The first can't happen because we're being called from a cpu hotplug
notifier. The second doesn't really matter: we are forcing the task off
a CPU it was affine to, so we're not doing very well anyway.
So we remove the rq lock from this path, and all is good.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 26 Nov 2008 05:16:25 +0000 (00:16 -0500)]
ftrace: let function tracing and function return run together
Impact: feature
This patch enables function tracing and function return to run together.
I've tested this by enabling the stack tracer and return tracer, where
both the function entry and function return are used together with
dynamic ftrace.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 26 Nov 2008 05:16:23 +0000 (00:16 -0500)]
ftrace: add function tracing to single thread
Impact: feature to function trace a single thread
This patch adds the ability to function trace a single thread.
The file:
/debugfs/tracing/set_ftrace_pid
contains the pid to trace. Valid pids are any positive integer.
Writing any negative number to this file will disable the pid
tracing and the function tracer will go back to tracing all
threads.
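
A small userspace helper for writing a pid to such a control file might
look like this. write_trace_pid() is a hypothetical name; the caller
supplies the path (for instance the set_ftrace_pid file named above), and
writing there normally requires root.

```c
#include <stdio.h>

/*
 * Write `pid` as decimal text to `path`.
 * Returns 0 on success and -1 on any error.
 * Per the commit text, a negative value disables pid filtering.
 */
int write_trace_pid(const char *path, long pid)
{
    FILE *f = fopen(path, "w");

    if (!f)
        return -1;
    if (fprintf(f, "%ld\n", pid) < 0) {
        fclose(f);
        return -1;
    }
    /* fclose() flushes, so a failed write may only surface here. */
    return fclose(f) == 0 ? 0 : -1;
}
```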
This feature works with both static and dynamic function tracing.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peng Li [Tue, 18 Nov 2008 04:39:02 +0000 (12:39 +0800)]
drm/i915: Save/restore HWS_PGA on suspend/resume
This fixes a suspend/resume failure on the xf86-video-intel dri2
branch. As the dri2 branch no longer calls I830DRIResume() to restore
the hardware status page, we need to preserve
this register across suspend/resume.
Signed-off-by: Peng Li <peng.li@intel.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Dumazet [Wed, 26 Nov 2008 05:16:35 +0000 (21:16 -0800)]
net: Use a percpu_counter for sockets_allocated
Instead of using one atomic_t per protocol, use a percpu_counter
for "sockets_allocated", to reduce cache line contention on
heavy duty network servers.
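
The idea behind a per-cpu counter can be sketched in userspace as one slot
per CPU that is summed on read. This is a simplified model, not the
kernel's percpu_counter, which also batches updates into a shared total.
NR_SLOTS, the 64-byte line size and all names below are assumptions.

```c
#include <stddef.h>

#define NR_SLOTS 4   /* assumption: stands in for the number of CPUs */

/* Counts sit 64 bytes apart so two of them never share a 64-byte line. */
struct slot {
    long count;
    char pad[64 - sizeof(long)];
};

struct sharded_counter {
    struct slot slots[NR_SLOTS];
};

/* Each CPU touches only its own slot, so writers do not contend. */
void counter_add(struct sharded_counter *c, unsigned int cpu, long delta)
{
    c->slots[cpu % NR_SLOTS].count += delta;
}

/* Reading is the slow path: walk every slot and add them up. */
long counter_sum(const struct sharded_counter *c)
{
    long sum = 0;

    for (size_t i = 0; i < NR_SLOTS; i++)
        sum += c->slots[i].count;
    return sum;
}
```

The trade-off is that updates become cheap and local while an exact read
costs a pass over all slots.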
Note : We revert commit (248969ae31e1b3276fc4399d67ce29a5d81e6fd9
net: af_unix can make unix_nr_socks visbile in /proc),
since it is no longer used after the addition of sock_prot_inuse_add().
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
While trying average rate policing, I found that it was possible to
request average rate policing without a rate estimator. This results
in no policing, which is harmless but incorrect.
Since policing can be set up in two steps, the check needs to be done
in the kernel.
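
The check amounts to rejecting one inconsistent combination. A minimal
sketch, assuming a simplified configuration; struct police_cfg,
police_validate() and the choice of -EINVAL are hypothetical.

```c
#include <errno.h>

/* Hypothetical, simplified view of a policer configuration. */
struct police_cfg {
    unsigned int avrate;   /* requested average rate; 0 means not requested */
    int has_estimator;     /* non-zero if a rate estimator is attached */
};

/* Reject average rate policing when no estimator can supply the rate. */
int police_validate(const struct police_cfg *cfg)
{
    if (cfg->avrate != 0 && !cfg->has_estimator)
        return -EINVAL;
    return 0;
}
```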
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arjan van de Ven [Wed, 26 Nov 2008 05:08:13 +0000 (21:08 -0800)]
net: make skb_truesize_bug() call WARN()
The truesize message check is important enough to make it print "BUG"
to the user console... let's also make it important enough to spit a
backtrace/module list etc so that kerneloops.org can track them.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Liming Wang [Wed, 26 Nov 2008 02:29:26 +0000 (10:29 +0800)]
ftrace: adding other non-leaving .text sections
Impact: widen the scope of recordmcount.pl
Besides the .text section, there are three other .text sections that
won't be freed after kernel booting: .sched.text, .spinlock.text
and .kprobes.text. They contain functions we can trace. The last
one, ".kprobes.text", is special: it has been marked as "notrace",
so we ignore it. Thus we add the other two sections.
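
The resulting selection rule, written out in C for illustration only
(recordmcount.pl itself is a Perl script). section_is_traceable() is a
hypothetical name.

```c
#include <string.h>

/* Return 1 if functions in `section` should be recorded, 0 otherwise. */
int section_is_traceable(const char *section)
{
    static const char *const traced[] = {
        ".text", ".sched.text", ".spinlock.text",
    };

    for (size_t i = 0; i < sizeof(traced) / sizeof(traced[0]); i++)
        if (strcmp(section, traced[i]) == 0)
            return 1;
    /* Everything else is skipped, including .kprobes.text (notrace). */
    return 0;
}
```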
Signed-off-by: Liming Wang <liming.wang@windriver.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix the defactoring of ei_XXX functions in 8390 and 8390p.
Remove the tx_timeout hack since no driver including the 3c503
overrides tx_timeout at this time, looks like a legacy thing.
Also, since several drivers all have same hooks, provide common
netdev_ops.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Wed, 26 Nov 2008 02:00:48 +0000 (18:00 -0800)]
netns xfrm: per-netns sysctls
Make
net.core.xfrm_aevent_etime
net.core.xfrm_acq_expires
net.core.xfrm_aevent_rseqth
net.core.xfrm_larval_drop
sysctls per-netns.
For that, make net_core_path[] global, register it to prevent two
/proc/net/core entries and change initcall position -- xfrm_init() is called
from fs_initcall, so this one should be fs_initcall at least.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Wed, 26 Nov 2008 01:58:31 +0000 (17:58 -0800)]
netns PF_KEY: part 2
* interaction with userspace -- take netns from userspace socket.
* in ->notify hook take netns either from SA or explicitly passed --
we don't know if SA/SPD flush is coming.
* stub policy migration with init_net for now.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>