Brian King [Wed, 29 Oct 2008 13:46:33 +0000 (08:46 -0500)]
[SCSI] ibmvfc: Fix log level filtering
The ibmvfc log level filtering logic was reversed. The log_level scsi
host parameter should result in more verbose logs when log_level is
larger, not smaller.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Arjan van de Ven [Sun, 28 Sep 2008 23:18:02 +0000 (16:18 -0700)]
[SCSI] advansys, arcmsr, ipr, nsp32, qla1280, stex: use pci_ioremap_bar()
Use the newly introduced pci_ioremap_bar() function in drivers/scsi.
pci_ioremap_bar() just takes a pci device and a bar number, with the goal
of making it really hard to get wrong, while also having a central place
to stick sanity checks.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Matthew Wilcox <willy@linux.intel.com> Cc: Brian King <brking@us.ibm.com> Cc: Ed Lin <ed.lin@promise.com> Cc: Nick Cheng <nick.cheng@areca.com.tw> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
x86, pci: move arch/x86/pci/pci.h to arch/x86/include/asm/pci_x86.h
Impact: cleanup
Now that arch/x86/pci/pci.h is used in a number of other places as well,
move the lowlevel x86 pci definitions into the architecture include files.
(not to be confused with the existing arch/x86/include/asm/pci.h file,
which provides public details about x86 PCI)
Tony Lindgren [Mon, 29 Dec 2008 16:15:47 +0000 (18:15 +0200)]
hsmmc: Fix bus voltage reset, clean-up for checkpatch
The bus voltage was being reset to 3V even if card was not
removed. Move the reset code to happen after slot has been
powered down. Also clean-up for checkpatch.
Gregory Haskins [Mon, 29 Dec 2008 14:39:53 +0000 (09:39 -0500)]
RT: fix push_rt_task() to handle dequeue_pushable properly
A panic was discovered by Chirag Jog where a BUG_ON sanity check
in the new "pushable_task" logic would trigger a panic under
certain circumstances:
http://lkml.org/lkml/2008/9/25/189
Gilles Carry discovered that the root cause was attributed to the
pushable_tasks list getting corrupted in the push_rt_task logic.
This was the result of a dropped rq lock in double_lock_balance
allowing a task in the process of being pushed to potentially migrate
away, and thus corrupt the pushable_tasks() list.
I traced back the problem as introduced by the pushable_tasks patch
that went in recently. There is a "retry" path in push_rt_task()
that actually had a compound conditional to decide whether to
retry or exit. I missed the meaning behind the rationale for the
virtual "if(!task) goto out;" portion of the compound statement and
thus did not handle it properly. The new pushable_tasks logic
actually creates three distinct conditions:
1) an untouched and unpushable task should be dequeued
2) a migrated task where more pushable tasks remain should be retried
3) a migrated task where no more pushable tasks exist should exit
The original logic mushed (1) and (3) together, resulting in the
system dequeuing a migrated task (against an unlocked foreign run-queue
nonetheless).
To fix this, we get rid of the notion of "paranoid" and we support the
three unique conditions properly. The paranoid feature is no longer
relevant with the new pushable logic (since pushable naturally limits
the loop) anyway, so lets just remove it.
Gregory Haskins [Mon, 29 Dec 2008 14:39:53 +0000 (09:39 -0500)]
sched: create "pushable_tasks" list to limit pushing to one attempt
The RT scheduler employs a "push/pull" design to actively balance tasks
within the system (on a per disjoint cpuset basis). When a task is
awoken, it is immediately determined if there are any lower priority
cpus which should be preempted. This is opposed to the way normal
SCHED_OTHER tasks behave, which will wait for a periodic rebalancing
operation to occur before spreading out load.
When a particular RQ has more than 1 active RT task, it is said to
be in an "overloaded" state. Once this occurs, the system enters
the active balancing mode, where it will try to push the task away,
or persuade a different cpu to pull it over. The system will stay
in this state until the system falls back below the <= 1 queued RT
task per RQ.
However, the current implementation suffers from a limitation in the
push logic. Once overloaded, all tasks (other than current) on the
RQ are analyzed on every push operation, even if it was previously
unpushable (due to affinity, etc). Whats more, the operation stops
at the first task that is unpushable and will not look at items
lower in the queue. This causes two problems:
1) We can have the same tasks analyzed over and over again during each
push, which extends out the fast path in the scheduler for no
gain. Consider a RQ that has dozens of tasks that are bound to a
core. Each one of those tasks will be encountered and skipped
for each push operation while they are queued.
2) There may be lower-priority tasks under the unpushable task that
could have been successfully pushed, but will never be considered
until either the unpushable task is cleared, or a pull operation
succeeds. The net result is a potential latency source for mid
priority tasks.
This patch aims to rectify these two conditions by introducing a new
priority sorted list: "pushable_tasks". A task is added to the list
each time a task is activated or preempted. It is removed from the
list any time it is deactivated, made current, or fails to push.
This works because a task only needs to be attempted to push once.
After an initial failure to push, the other cpus will eventually try to
pull the task when the conditions are proper. This also solves the
problem that we don't completely analyze all tasks due to encountering
an unpushable tasks. Now every task will have a push attempted (when
appropriate).
This reduces latency both by shorting the critical section of the
rq->lock for certain workloads, and by making sure the algorithm
considers all eligible tasks in the system.
[ rostedt: added a couple more BUG_ONs ]
Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Steven Rostedt <srostedt@redhat.com>
Gregory Haskins [Mon, 29 Dec 2008 14:39:53 +0000 (09:39 -0500)]
plist: fix PLIST_NODE_INIT to work with debug enabled
It seems that PLIST_NODE_INIT breaks if used and DEBUG_PI_LIST is defined.
Since there are no current users of PLIST_NODE_INIT, this has gone
undetected. This patch fixes the build issue that enables the
DEBUG_PI_LIST later in the series when we use it in init_task.h
Gregory Haskins [Mon, 29 Dec 2008 14:39:52 +0000 (09:39 -0500)]
sched: add sched_class->needs_post_schedule() member
We currently run class->post_schedule() outside of the rq->lock, which
means that we need to test for the need to post_schedule outside of
the lock to avoid a forced reacquistion. This is currently not a problem
as we only look at rq->rt.overloaded. However, we want to enhance this
going forward to look at more state to reduce the need to post_schedule to
a bare minimum set. Therefore, we introduce a new member-func called
needs_post_schedule() which tests for the post_schedule condtion without
actually performing the work. Therefore it is safe to call this
function before the rq->lock is released, because we are guaranteed not
to drop the lock at an intermediate point (such as what post_schedule()
may do).
Gregory Haskins [Mon, 29 Dec 2008 14:39:51 +0000 (09:39 -0500)]
sched: make double-lock-balance fair
double_lock balance() currently favors logically lower cpus since they
often do not have to release their own lock to acquire a second lock.
The result is that logically higher cpus can get starved when there is
a lot of pressure on the RQs. This can result in higher latencies on
higher cpu-ids.
This patch makes the algorithm more fair by forcing all paths to have
to release both locks before acquiring them again. Since callsites to
double_lock_balance already consider it a potential preemption/reschedule
point, they have the proper logic to recheck for atomicity violations.
Gregory Haskins [Mon, 29 Dec 2008 14:39:50 +0000 (09:39 -0500)]
sched: pull only one task during NEWIDLE balancing to limit critical section
git-id c4acb2c0669c5c5c9b28e9d02a34b5c67edf7092 attempted to limit
newidle critical section length by stopping after at least one task
was moved. Further investigation has shown that there are other
paths nested further inside the algorithm which still remain that allow
long latencies to occur with newidle balancing. This patch applies
the same technique inside balance_tasks() to limit the duration of
this optional balancing operation.
Signed-off-by: Gregory Haskins <ghaskins@novell.com> CC: Nick Piggin <npiggin@suse.de>
Gregory Haskins [Mon, 29 Dec 2008 14:39:50 +0000 (09:39 -0500)]
sched: only try to push a task on wakeup if it is migratable
There is no sense in wasting time trying to push a task away that
cannot move anywhere else. We gain no benefit from trying to push
other tasks at this point, so if the task being woken up is non
migratable, just skip the whole operation. This reduces overhead
in the wakeup path for certain tasks.
Gregory Haskins [Mon, 29 Dec 2008 14:39:50 +0000 (09:39 -0500)]
sched: use highest_prio.next to optimize pull operations
We currently take the rq->lock for every cpu in an overload state during
pull_rt_tasks(). However, we now have enough information via the
highest_prio.[curr|next] fields to determine if there is any tasks of
interest to warrant the overhead of the rq->lock, before we actually take
it. So we use this information to reduce lock contention during the
pull for the case where the source-rq doesnt have tasks that preempt
the current task.
Gregory Haskins [Mon, 29 Dec 2008 14:39:49 +0000 (09:39 -0500)]
sched: use highest_prio.curr for pull threshold
highest_prio.curr is actually a more accurate way to keep track of
the pull_rt_task() threshold since it is always up to date, even
if the "next" task migrates during double_lock. Therefore, stop
looking at the "next" task object and simply use the highest_prio.curr.
Tony Lindgren [Mon, 29 Dec 2008 14:24:25 +0000 (16:24 +0200)]
hsmmc: Fix setting the voltage back to 3V
We should not set it in mmc_omap_detect, as that does not necessarily
tell that card got removed. Reset the voltage back to 3V after powering
off the slot instead.
Robert Richter [Mon, 15 Dec 2008 14:09:50 +0000 (15:09 +0100)]
x86/oprofile: fix pci_dev use count for AMD northbridge devices
This patch fixes the PCI device use count for AMD northbridge
devices. In case of an IBS LVT initialization failure, the PCI device
is released now by calling pci_dev_put().
If there are no initialization errors, the devices are released in
pci_get_device() while iterating.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Vegard Nossum [Thu, 6 Nov 2008 16:42:18 +0000 (17:42 +0100)]
kmemtrace: add missing newline
This was causing artifacts in my dmesg.
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Pekka Enberg [Fri, 10 Oct 2008 08:02:59 +0000 (11:02 +0300)]
kmemtrace: remove config option for enabling tracing at boot
Users can pass kmemtrace.enabled=yes as a kernel parameter to enable kmemtrace
at boot so remove the useless CONFIG_KMEMTRACE_DEFAULT_ENABLED config option.
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Pekka Enberg [Fri, 10 Oct 2008 07:57:44 +0000 (10:57 +0300)]
kmemtrace: allow kmemtrace to be enabled after boot
The kmemtrace_init() function returns early if kmemtrace is disabled at boot
causing kmemtrace_setup_late() to also bail out on NULL channel. This has the
unfortunate side effect that none of the debugfs files needed to enable
kmemtrace after boot are created.
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
kmemtrace: SLUB hooks for caller-tracking functions.
This patch adds kmemtrace hooks for __kmalloc_track_caller() and
__kmalloc_node_track_caller(). Currently, they set the call site pointer
to the value recieved as a parameter. (This could change if we implement
stack trace exporting in kmemtrace.)
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
kmemtrace: Better alternative to "kmemtrace: fix printk format warnings".
Fix the problem "kmemtrace: fix printk format warnings" attempted to fix,
but resulted in marker-probe format mismatch warnings. Instead of carrying
size_t into probes, we get rid of it by casting to unsigned long, just as
we did with gfp_t.
This way, we don't need to change marker format strings and we don't have
to rely on other format specifiers like "%zu", making for consistent use
of more generic data types (since there are no format specifiers for
gfp_t, for example).
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
This adds hooks for the SLOB allocator, to allow tracing with kmemtrace.
We also convert some inline functions to __always_inline to make sure
_RET_IP_, which expands to __builtin_return_address(0), always works
as expected.
Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
This adds hooks for the SLAB allocator, to allow tracing with kmemtrace.
We also convert some inline functions to __always_inline to make sure
_RET_IP_, which expands to __builtin_return_address(0), always works
as expected.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Documented kmemtrace's ABI, purpose and design. Also includes a short
usage guide, FAQ, as well as a link to the userspace application's Git
repository, which is currently hosted at repo.or.cz.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
kmemtrace provides tracing for slab allocator functions, such as kmalloc,
kfree, kmem_cache_alloc, kmem_cache_free etc.. Collected data is then fed
to the userspace application in order to analyse allocation hotspots,
internal fragmentation and so on, making it possible to see how well an
allocator performs, as well as debug and profile kernel code.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
SLUB: Replace __builtin_return_address(0) with _RET_IP_.
This patch replaces __builtin_return_address(0) with _RET_IP_, since a
previous patch moved _RET_IP_ and _THIS_IP_ to include/linux/kernel.h and
they're widely available now. This makes for shorter and easier to read
code.
[penberg@cs.helsinki.fi: remove _RET_IP_ casts to void pointer] Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Robert Hancock [Thu, 25 Dec 2008 01:06:06 +0000 (19:06 -0600)]
sata_sil: add Large Block Transfer support
This implements support for the Large Block Transfer feature found in Silicon
Image 311x controllers. This allows transferring bigger contiguous chunks of
data from system memory and avoids the 64KB boundary restriction of standard
SFF controllers.
This is based on a patch from Jeff Garzik (from the sii-lbt branch of
libata-dev) but includes a few bug fixes: Since the bmdma2 register does not
implement the status bits, the original bmdma register must be used except
where the bmdma2 register is required. As well the DMA boundary should be
31-bit instead of 32-bit since the top bit of the length field is still
required for the PRD end-of-table flag.
Signed-off-by: Robert Hancock <hancockr@shaw.ca> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Jiri Slaby [Wed, 10 Dec 2008 13:07:22 +0000 (14:07 +0100)]
[libata] ata_piix: cleanup dmi strings checking
Commit
ATA: piix, fix pointer deref on suspend
fixed a possible oops in an ugly manner. Use newly introduced dmi_match()
to make the code pretty again.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alexandru Romanescu <a_romanescu@yahoo.co.uk> Cc: Tejun Heo <tj@kernel.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Lubomir Bulej [Mon, 22 Dec 2008 10:35:22 +0000 (11:35 +0100)]
libata: blacklist NCQ on OCZ CORE 2 SSD (resend)
The patchlet below blacklists NCQ on OCZ CORE v2 SSD drive(s). Even
though the drive advertises NCQ support with queue depth 1, it responds
with all-zeroes FIS to NCQ commands which triggers ata error handling
several times before the kernel decides to disable NCQ on the drive.
Signed-off-by: Lubomir Bulej <lubomir.bulej@dsrg.mff.cuni.cz> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Impact: cleanup, avoid 44 sparse warnings, new file asm/sys_ia32.h
Fixes following sparse warnings:
CHECK arch/x86/ia32/sys_ia32.c
arch/x86/ia32/sys_ia32.c:53:17: warning: symbol 'sys32_truncate64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:60:17: warning: symbol 'sys32_ftruncate64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:98:17: warning: symbol 'sys32_stat64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:109:17: warning: symbol 'sys32_lstat64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:119:17: warning: symbol 'sys32_fstat64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:128:17: warning: symbol 'sys32_fstatat' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:164:17: warning: symbol 'sys32_mmap' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:195:17: warning: symbol 'sys32_mprotect' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:201:17: warning: symbol 'sys32_pipe' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:215:17: warning: symbol 'sys32_rt_sigaction' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:291:17: warning: symbol 'sys32_sigaction' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:330:17: warning: symbol 'sys32_rt_sigprocmask' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:370:17: warning: symbol 'sys32_alarm' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:383:17: warning: symbol 'sys32_old_select' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:393:17: warning: symbol 'sys32_waitpid' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:401:17: warning: symbol 'sys32_sysfs' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:406:17: warning: symbol 'sys32_sched_rr_get_interval' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:421:17: warning: symbol 'sys32_rt_sigpending' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:445:17: warning: symbol 'sys32_rt_sigqueueinfo' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:472:17: warning: symbol 'sys32_sysctl' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:517:17: warning: symbol 'sys32_pread' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:524:17: warning: symbol 'sys32_pwrite' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:532:17: warning: symbol 'sys32_personality' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:545:17: warning: symbol 'sys32_sendfile' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:565:17: warning: symbol 'sys32_mmap2' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:589:17: warning: symbol 'sys32_olduname' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:626:6: warning: symbol 'sys32_uname' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:641:6: warning: symbol 'sys32_ustat' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:663:17: warning: symbol 'sys32_execve' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:678:17: warning: symbol 'sys32_clone' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:693:6: warning: symbol 'sys32_lseek' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:698:6: warning: symbol 'sys32_kill' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:703:6: warning: symbol 'sys32_fadvise64_64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:712:6: warning: symbol 'sys32_vm86_warning' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:726:6: warning: symbol 'sys32_lookup_dcookie' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:732:20: warning: symbol 'sys32_readahead' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:738:17: warning: symbol 'sys32_sync_file_range' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:746:17: warning: symbol 'sys32_fadvise64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:753:17: warning: symbol 'sys32_fallocate' was not declared. Should it be static?
CHECK arch/x86/ia32/ia32_signal.c
arch/x86/ia32/ia32_signal.c:126:17: warning: symbol 'sys32_sigsuspend' was not declared. Should it be static?
arch/x86/ia32/ia32_signal.c:141:17: warning: symbol 'sys32_sigaltstack' was not declared. Should it be static?
arch/x86/ia32/ia32_signal.c:249:17: warning: symbol 'sys32_sigreturn' was not declared. Should it be static?
arch/x86/ia32/ia32_signal.c:279:17: warning: symbol 'sys32_rt_sigreturn' was not declared. Should it be static?
CHECK arch/x86/ia32/ipc32.c
arch/x86/ia32/ipc32.c:12:17: warning: symbol 'sys32_ipc' was not declared. Should it be static?
Sergio Luis [Sun, 28 Dec 2008 07:12:26 +0000 (04:12 -0300)]
x86: mark get_cpu_leaves() with __cpuinit annotation
Impact: fix section mismatch warning
Commit b2bb85549134c005e997e5a7ed303bda6a1ae738 ("x86: Remove cpumask games
in x86/kernel/cpu/intel_cacheinfo.c") introduced get_cpu_leaves(), which
references __cpuinit cpuid4_cache_lookup().
Mark get_cpu_leaves() with a __cpuinit annotation.
Signed-off-by: Sergio Luis <sergio@larces.uece.br> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 29 Dec 2008 12:02:17 +0000 (13:02 +0100)]
tracing/ftrace: make trace_find_cmdline() generally available
Impact: build fix
On !CONFIG_CONTEXT_SWITCH_TRACER trace_find_cmdline() is not defined:
kernel/trace/trace_output.c: In function 'trace_ctxwake_print':
kernel/trace/trace_output.c:499: error: implicit declaration of function 'trace_find_cmdline'
kernel/trace/trace_output.c:499: warning: assignment makes pointer from integer without a cast
tracing/branch-tracer: adapt to the stat tracing API
Impact: refactor the branch tracer
This patch adapts the branch tracer to the tracing API.
This is a proof of concept because the branch tracer implements two
"stat tracing" that were split in two files.
So I added an option to the branch tracer: stat_all_branch.
If it is set, then trace_stat will output all of the branches
entries stats. Otherwise, it will print the annotated branches.
Its is a kind of quick trick, waiting for a better solution.
By default, the annotated branches stat are sorted by incorrect branch
prediction percentage.
It enables a developer to quickly address the source of incorrect branch
predictions. Note that this sorting would be better with a second sort on
the number of incorrect predictions.
tracing/ftrace: provide the base infrastructure for histogram tracing
Impact: extend the tracing API
The goal of this patch is to normalize and make more easy the
implementation of statistical (histogram) tracing.
It implements a trace_stat file into the /debugfs/tracing directory where
one can print a one-shot output of statistics/histogram entries.
A tracer has to provide two basic iterator callbacks:
stat_start() => the first entry
stat_next(prev, idx) => the next one.
Note that it is adapted for arrays or hash tables or lists.... since it
provides a pointer to the previous entry and the current index of the
iterator.
These two callbacks are called to get a snapshot of the statistics at each
opening of the trace_stat file because. The values are so updated between
two "cat trace_stat". And the tracer is free to lock its datas during the
iteration to keep consistent values.
Since it is almost always interesting to sort statisticals values to
address the problems by priority, this infrastructure provides a "sorting"
of the stat entries too if desired. A tracer has just to provide a
stat_cmp callback to compare two entries and the stat tracing
infrastructure will build a sorted list of the given entries.
A last callback, called stat_headers, can be implemented by a tracer to
output headers on its trace.
If one of these callbacks is changed on runtime, it just have to signal it
to the stat tracing API by calling the init_tracer_stat() helper.
Changes in V2:
- Fix a memory leak if the user opens multiple times the trace_stat file
without closing it. Now we always free our list before rebuilding it.
Steven Rostedt [Wed, 24 Dec 2008 04:24:13 +0000 (23:24 -0500)]
ftrace: change trace.c to use registered events
Impact: rework trace.c to use new event register API
Almost every ftrace event has to implement its output display in
trace.c through a different function. Some events did not handle
all the formats (trace, latency-trace, raw, hex, binary), and
this method does not scale well.
This patch converts the format functions to use the event API to
find the event and and print its format. Currently, we have
a print function for trace, latency_trace, raw, hex and binary.
A trace_nop_print is available if the event wants to avoid output
on a particular format.
Perhaps other tracers could use this in the future (like mmiotrace and
function_graph).
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 24 Dec 2008 04:24:12 +0000 (23:24 -0500)]
ftrace: set up trace event hash infrastructure
Impact: simplify/generalize/refactor trace.c
The trace.c file is becoming more difficult to maintain due to the
growing number of events. There is several formats that an event may
be printed. This patch sets up the infrastructure of an event hash to
allow for events to register how they should be printed.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now that the ring buffer used by ftrace allows for variable length
entries, we do not need the 'cont' feature of the buffer. This code
makes other parts of ftrace more complex and by removing this it
simplifies the ftrace code.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Wang Chen [Mon, 29 Dec 2008 05:35:11 +0000 (13:35 +0800)]
genirq: check chip->ack before calling
Impact: fix theoretical NULL dereference
The generic irq layer doesn't know whether irq_chip has ack routine on some
architectures or not. Upon that, before calling chip->ack, we should check
that it's not NULL.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Mon, 29 Dec 2008 00:01:13 +0000 (16:01 -0800)]
sparseirq: move __weak symbols into separate compilation unit
GCC has a bug with __weak alias functions: if the functions are in
the same compilation unit as their call site, GCC can decide to
inline them - and thus rob the linker of the opportunity to override
the weak alias with the real thing.
So move all the IRQ handling related __weak symbols to kernel/irq/chip.c.
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Eric Miao [Tue, 23 Dec 2008 09:49:43 +0000 (17:49 +0800)]
[ARM] pxafb: add support for overlay1 and overlay2 as framebuffer devices
PXA27x and later processors support overlay1 and overlay2 on-top of the
base framebuffer (although under-neath the base is also possible). They
support palette and no-palette RGB formats, as well as YUV formats (only
available on overlay2). These overlays have dedicated DMA channels and
behave in a similar way as a framebuffer.
This heavily simplified and re-structured work is based on the original
pxafb_overlay.c (which is pending for mainline merge for a long time).
The major problems with this pxafb_overlay.c are (if you are interested
in the history):
1. heavily redundant (the control logics for overlay1 and overlay2 are
actually identical except for some small operations, which are now
abstracted into a 'pxafb_layer_ops' structure)
2. a lot of useless and un-tested code (two workarounds which are now
fixed on mature silicons)
3. cursorfb is actually useless, hardware cursor should not be used
this way, and the code was actually un-tested for a long time.
The code in this patch should be self-explanatory, I tried to add minimum
comments. As said, this is basically simplified, there are several things
still on the pending list:
1. palette mode is un-supported and un-tested (although re-using the
palette code of the base framebuffer is actually very easy now with
previous clean-up patches)
2. fb_pan_display for overlay(s) is un-supported
3. the base framebuffer can actually be abstracted by 'pxafb_layer' as
well, which will help further re-use of the code and keep a better
and consistent structure. (This is the reason I named it 'pxafb_layer'
instead of 'pxafb_overlay' or something alike)
See Documentation/fb/pxafb.txt for additional usage information.
Signed-off-by: Eric Miao <eric.miao@marvell.com> Cc: Rodolfo Giometti <giometti@linux.it> Signed-off-by: Eric Miao <ycmiao@ycmiao-hp520.(none)>
Eric Miao [Thu, 18 Dec 2008 14:36:26 +0000 (22:36 +0800)]
[ARM] pxafb: cleanup of the color format manipulation code
1. introduce var_to_depth() to calculate the color depth including the
transparency bit
2. the conversion from 'fb_var_screeninfo' to LCCR3 BPP bits can be re-
used by overlays (in OVLxC1), thus an individual pxafb_var_to_bpp()
has been separated out.
3. pxafb_setmode() should really set the color bitfields correctly at
begining, introduce a pxafb_set_pixfmt() for this
4. allow user apps to specify color formats within fb_var_screeninfo,
and checking of this in pxafb_check_var() has been simplified as
below:
a) pxafb_var_to_bpp() should pass - which means a basically correct
bits_per_pixel and color depth setting
b) the RGBT bitfields are then forced into supported values by
pxafb_set_pixfmt()
Signed-off-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Eric Miao <ycmiao@ycmiao-hp520.(none)>
Eric Miao [Wed, 17 Dec 2008 06:56:54 +0000 (14:56 +0800)]
[ARM] pxafb: allow pxafb_set_par() to start from arbitrary yoffset
Note the var->yres_virtual is only re-calculated from the fix.smem_len
when text mode acceleration is enabled (which is default), this is due
to the issue as Russell suggested below:
Previous experience of doing this with the X server and acornfb is that
it causes all sorts of problems - it seems to force the X server into
assuming that the framebuffer should be panned no matter what settings
you ask it for.
The recommended workaround (implemented in acornfb) is to only do these
kinds of adjustments if text mode acceleration is enabled. IIRC, the X
server should be disabling text mode acceleration when it maps the
framebuffer. I seem to remember that there are X servers which forget
to do that though.
Signed-off-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Eric Miao <ycmiao@ycmiao-hp520.(none)>
Eric Miao [Tue, 16 Dec 2008 03:54:34 +0000 (11:54 +0800)]
[ARM] pxafb: allow video memory size to be configurable
The amount of video memory size is decided according to the following
order:
1. <xres> x <yres> x <bits_per_pixel> by default, which is the backward
compatible way
2. size specified in platform data
3. size specified in module parameter 'options' string or specified in
kernel boot command line (see updated Documentation/fb/pxafb.txt)
And now since the memory is allocated from system memory, the pxafb_mmap
can be removed and the default fb_mmap() should be working all right.
Also, since we now have introduced the 'struct pxafb_dma_buff' for DMA
descriptors and palettes, the allocation can be separated cleanly.
NOTE: the LCD DMA actually supports chained transfer (i.e. page-based
transfers), to simplify the logic and keep the performance (with less
TLB misses when accessing from memory mapped user space), the memory
is allocated by alloc_pages_*() to ensures it's physical contiguous.
Signed-off-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Eric Miao <ycmiao@ycmiao-hp520.(none)>
Eric Miao [Thu, 18 Dec 2008 03:10:32 +0000 (11:10 +0800)]
[ARM] rtc-sa1100: don't assume CLOCK_TICK_RATE to be a constant
As Nicolas and Russell pointed out, CLOCK_TICK_RATE is no more
a constant on PXA when multiple processors and platforms are
selected, change TIMER_FREQ in rtc-sa1100.c into a variable.
Since the code to decide the clock tick rate is re-used from
timer.c, introduce a common get_clock_tick_rate() for this.
Signed-off-by: Eric Miao <eric.miao@marvell.com> Acked-by: Nicolas Pitre <nico@marvell.com>
David Rientjes [Thu, 18 Dec 2008 06:09:46 +0000 (22:09 -0800)]
slub: avoid leaking caches or refcounts on sysfs error
If a slab cache is mergeable and the sysfs alias cannot be added, the
target cache shall have its refcount decremented. kmem_cache_create()
will return NULL, so if kmem_cache_destroy() is ever called on the target
cache, it will never be freed if the refcount has been leaked.
Likewise, if a slab cache is not mergeable and the sysfs link cannot be
added, the new cache shall be removed from the slab_caches list.
kmem_cache_create() will return NULL, so it will be impossible to call
kmem_cache_destroy() on it.
Both of these operations require slub_lock since refcount of all slab
caches and slab_caches are protected by the lock.
In the mergeable case, it would be better to restore objsize and offset
back to their original values, but this could race with another merge
since slub_lock was dropped.
Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Pekka Enberg [Wed, 26 Nov 2008 08:01:31 +0000 (10:01 +0200)]
slab: remove GFP_THISNODE clearing from alloc_slabmgmt()
Commit 6cb062296f73e74768cca2f3eaf90deac54de02d ("Categorize GFP flags")
left one call-site in alloc_slabmgmt() to clear GFP_THISNODE instead of
GFP_CONSTRAINT_MASK. Unfortunately, that ends up clearing __GFP_NOWARN
and __GFP_NORETRY as well which is not what we want. As the only caller
of alloc_slabmgmt() already clears GFP_CONSTRAINT_MASK before passing
local_flags to it, we can just remove the clearing of GFP_THISNODE.
This patch should fix spurious page allocation failure warnings on the
mempool_alloc() path. See the following URL for the original discussion
of the bug:
http://lkml.org/lkml/2008/10/27/100
Acked-by: Christoph Lameter <cl@linux-foundation.org> Reported-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Akinobu Mita [Tue, 23 Dec 2008 10:37:01 +0000 (19:37 +0900)]
SLUB: failslab support
Currently fault-injection capability for SLAB allocator is only
available to SLAB. This patch makes it available to SLUB, too.
[penberg@cs.helsinki.fi: unify slab and slub implementations] Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Eric Anholt [Fri, 19 Dec 2008 22:47:48 +0000 (14:47 -0800)]
drm/i915: Don't print to dmesg when taking signal during object_pin.
This showed up in logs where people had a hung chip, so pinning was blocked
on the chip unpinning other buffers, and the X Server took its scheduler
signal during that time.
Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>
Hannes Eder [Thu, 18 Dec 2008 14:09:00 +0000 (15:09 +0100)]
drm/i915: fix sparse warnings: declare one-bit bitfield as unsigned
Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Eric Anholt <eric@anholt.net> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Dave Airlie <airlied@linux.ie>
Eric Anholt [Wed, 10 Dec 2008 18:09:41 +0000 (10:09 -0800)]
drm/i915: Don't double-unpin buffers if we take a signal in evict_everything().
We haven't seen this in practice, but it was visible when looking at a bug
report from when i915_gem_evict_everything() was broken and would always
return error.
Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>
drm: drop DRM_IOCTL_MODE_REPLACEFB, add+remove works just as well.
The replace fb ioctl replaces the backing buffer object for a modesetting
framebuffer object. This can be acheived by just creating a new
framebuffer backed by the new buffer object, setting that for the crtcs
in question and then removing the old framebuffer object.
Signed-off-by: Kristian Hogsberg <krh@redhat.com> Acked-by: Jakob Bornecrantz <jakob@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>