Denis V. Lunev [Tue, 7 Oct 2008 21:47:12 +0000 (14:47 -0700)]
ipv6: separate seq_ops for global & per/device ipv6 statistics
idev has been stored on seq->private. NULL has been stored for global
statistics.
The situation is changed with net namespace. We need to store pointer to
struct net and the only place is seq->private. So, we'll have for
/proc/net/dev_snmp6/* and for /proc/net/snmp6 pointers of two different
types stored in the same field.
This effectively requires to separate seq_ops of these files.
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Tue, 7 Oct 2008 21:46:47 +0000 (14:46 -0700)]
ipv6: consolidate ipv6 sock_stat code at the beginning of net/ipv6/proc.c
Simple, comsolidate sockstat6 staff in one place, at the beginning of
the file. Right now sockstat6_seq_open/sockstat6_seq_fops looks like an
intrusion in the middle of snmp6 code.
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 7 Oct 2008 21:43:31 +0000 (14:43 -0700)]
tcp: cleanup messy initializer
I'm quite sure that if I give this function in its old format
for you to inspect, you start to wonder what is the type of
demanded or if it's a global variable.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 7 Oct 2008 21:43:06 +0000 (14:43 -0700)]
tcp: kill pointless urg_mode
It all started from me noticing that this urgent check in
tcp_clean_rtx_queue is unnecessarily inside the loop. Then
I took a longer look to it and found out that the users of
urg_mode can trivially do without, well almost, there was
one gotcha.
Bonus: those funny people who use urg with >= 2^31 write_seq -
snd_una could now rejoice too (that's the only purpose for the
between being there, otherwise a simple compare would have done
the thing). Not that I assume that the rest of the tcp code
happily lives with such mind-boggling numbers :-). Alas, it
turned out to be impossible to set wmem to such numbers anyway,
yes I really tried a big sendfile after setting some wmem but
nothing happened :-). ...Tcp_wmem is int and so is sk_sndbuf...
So I hacked a bit variable to long and found out that it seems
to work... :-)
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
Trond Myklebust [Sun, 5 Oct 2008 17:31:21 +0000 (13:31 -0400)]
NFS: Don't clear nfsi->cache_validity in nfs_check_inode_attributes()
If we're merely checking the inode attributes because we suspect that the
'updated' attributes returned by the RPC call are stale, then we shouldn't
be doing weak cache consistency updates or clearing the cache_validity
flags.
NFS: Convert __nfs_revalidate_inode() to use nfs_refresh_inode()
In the case where there are parallel RPC calls to the same inode, we may
receive stale metadata due to the lack of ordering, hence the sanity
checking of metadata in nfs_refresh_inode().
Currently, __nfs_revalidate_inode() is calling nfs_update_inode() directly,
without any further sanity checks, and hence may end up setting the inode
up with stale metadata.
Fix is to use nfs_refresh_inode() instead of nfs_update_inode().
Trond Myklebust [Sun, 5 Oct 2008 16:27:55 +0000 (12:27 -0400)]
NFS: Fix nfs_post_op_update_inode_force_wcc()
If we believe that the attributes are old (see nfs_refresh_inode()), then
we shouldn't force an update.
Also ensure that we hold the inode->i_lock across attribute checks and the
call to nfs_refresh_inode_locked() to ensure that we don't race with other
attribute updates.
Suresh Siddha [Tue, 7 Oct 2008 21:04:28 +0000 (14:04 -0700)]
x86: xsave: set FP, SSE bits in the xsave header in the user sigcontext
If a processor implementation discern that a processor state component is in
its initialized state, it may modify the corresponding bit in the
xsave header.xstate_bv as '0'. State in the memory layout setup by 'xsave'
will be consistent with the bit values in the header.
During signal handling, legacy applications may change the FP/SSE bits
in the sigcontext memory layout without touching the FP/SSE header bits
in the xsave header. So always set FP/SSE bits in the xsave header
while saving the sigcontext state to the user space. During signal return,
this will enable the kernel to capture any changes to the FP/SSE bits by the
legacy applications which don't touch xsave headers.
xsave aware apps can change the xstate_bv in the xsave header aswell
as change any contents in the memory layout. xrestor as part of sigreturn
will capture all the changes.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Currently nfs_refresh_inode() will only update the inode metadata if it
sees that the RPC call that returned the nfs_fattr was started
after the last update of the inode. This means that if we have parallel
RPC calls to the same inode (when sending WRITE calls, for instance), we
may often miss updates.
This patch attempts to recover those missed updates by also accepting
them if the ctime in the nfs_fattr is more recent than the inode's
cached ctime.
It also recovers the case where the file size has increased, but the
ctime has not been updated due to limited ctime resolution.
Trond Myklebust [Sun, 5 Oct 2008 16:07:23 +0000 (12:07 -0400)]
NFS: Clean up nfs_refresh_inode() and nfs_post_op_update_inode()
Try to avoid taking and dropping the inode->i_lock more than once. Do so by
moving the code in nfs_refresh_inode() that needs to be done under the
spinlock into a function nfs_refresh_inode_locked(), and then having both
nfs_refresh_inode() and nfs_post_op_update_inode() call it directly.
Ben Dooks [Tue, 7 Oct 2008 21:26:09 +0000 (22:26 +0100)]
[ARM] S3C24XX: Move files out of include/asm-arm/plat-s3c*
First move of items out of include/asm-arm/plat-s3c* to their
new homes under arch/arm/plat-s3c/include/plat and
arch/arm/plat-s3c24xx/include/plat directories.
Note, we have to create a dummy arch/arm/plat-s3c/Makefile to
allow us to add arch/arm/plat-s3c/include/plat to the path.
Trond Myklebust [Fri, 15 Aug 2008 20:59:14 +0000 (16:59 -0400)]
NFS: Don't apply NFS_MOUNT_FLAGMASK to text-based mounts
The point of introducing text-based mounts was to allow us to add
functionality without having to worry about legacy binary mount formats.
The mask should be there in order to ensure that binary formats don't start
enabling features that they cannot support. There is no justification for
applying it to the text mount path.
NFS: Add options for finer control of the lookup cache
Add the flag NFS_MOUNT_LOOKUP_CACHE_NONEG to turn off the caching of
negative dentries. In reality what we do is to force
nfs_lookup_revalidate() to always discard negative dentries.
Add the flag NFS_MOUNT_LOOKUP_CACHE_NONE for enforcing stricter
revalidation of dentries. It forces the revalidate code to always do a
lookup instead of just checking the cached mtime of the parent directory.
Peter Zijlstra [Tue, 7 Oct 2008 21:15:00 +0000 (14:15 -0700)]
ipv6: initialize ip6_route sysctl vars in ip6_route_net_init()
This makes that ip6_route_net_init() does all of the route init code.
There used to be a race between ip6_route_net_init() and ip6_net_init()
and someone relying on the combined result was left out cold.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
Steve French [Tue, 7 Oct 2008 20:03:33 +0000 (20:03 +0000)]
[CIFS] make sure we have the right resume info before calling CIFSFindNext
When we do a seekdir() or equivalent, we usually end up doing a
FindFirst call and then call FindNext until we get to the offset that we
want. The problem is that when we call FindNext, the code usually
doesn't have the proper info (mostly, the filename of the entry from the
last search) to resume the search.
Add a "last_entry" field to the cifs_search_info that points to the last
entry in the search. We calculate this pointer by using the
LastNameOffset field from the search parms that are returned. We then
use that info to do a cifs_save_resume_key before we call CIFSFindNext.
This patch allows CIFS to reliably pass the "telldir" connectathon test.
Signed-off-by: Jeff Layton <jlayton@redhat.com> CC: Stable <stable@kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
inet: Don't lookup the socket if there's a socket attached to the skb
Use the socket cached in the skb if it's present.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
To be able to use the cached socket reference in the skb during input
processing we add a new set of lookup functions that receive the skb on
their argument list.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Steve French [Tue, 7 Oct 2008 18:42:52 +0000 (18:42 +0000)]
[CIFS] clean up error handling in cifs_unlink
Currently, if a standard delete fails and we end up getting -EACCES
we try to clear ATTR_READONLY and try the delete again. If that
then fails with -ETXTBSY then we try a rename_pending_delete. We
aren't handling other errors appropriately though.
Another client could have deleted the file in the meantime and
we get back -ENOENT, for instance. In that case we wouldn't do a
d_drop. Instead of retrying in a separate call, simply goto the
original call and use the error handling from that.
Also, we weren't properly undoing any attribute changes that
were done before returning an error back to the caller.
CC: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
To be able to use the cached socket reference in the skb during input
processing we add a new set of lookup functions that receive the skb on
their argument list.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Mackall [Tue, 7 Oct 2008 16:37:35 +0000 (11:37 -0500)]
SLOB: fix bogus ksize calculation
SLOB's ksize calculation was braindamaged and generally harmlessly
underreported the allocation size. But for very small buffers, it could
in fact overreport them, leading code depending on krealloc to overrun
the allocation and trample other data.
Signed-off-by: Matt Mackall <mpm@selenic.com> Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Miao [Sat, 4 Oct 2008 04:45:39 +0000 (12:45 +0800)]
[ARM] ohci-pxa27x: introduce pxa27x_clear_otgph()
Direct access to pxa27x specific register PSSR in a generic ohci driver
is no good, introduce pxa27x_clear_otgph() and move the implementation
into processor specific code.
Signed-off-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Eric Miao [Fri, 3 Oct 2008 14:37:21 +0000 (22:37 +0800)]
[ARM] ohci-pxa27x: use platform_get_{irq,resource} for the resource
Depending on the order of how resource is defined in the platform
device is not good, use platform_get_{irq,resource} for the IRQ
and memory resources.
Signed-off-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Eric Miao [Sat, 27 Sep 2008 07:49:57 +0000 (15:49 +0800)]
[ARM] ohci-pxa27x: introduce flags to avoid direct access to OHCI registers
Direct access to USB host controller registers is considered to be not
portable, and is usually a bad sign for poorly abstracted interface.
Introduce .flags and .power_on_delay to "struct pxaohci_platform_data"
so that most platforms don't bother to write their own .init/.exit()
sequences.
Signed-off-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Eric Miao [Mon, 8 Sep 2008 07:37:50 +0000 (15:37 +0800)]
[ARM] pxa: move I2S register and bit definitions into pxa2xx-i2s.c
Signed-off-by: Eric Miao <eric.miao@marvell.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
rangetimer: fix x86 build failure for the !HRTIMERS case
the timer peek function was on the wrong side of an ifdef,
breaking for the !HRTIMERs case. Just provide an empty inline
for that case since it doesn't make sense in that scenario.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Mike Rapoport [Sun, 5 Oct 2008 09:30:42 +0000 (10:30 +0100)]
[ARM] 5285/1: pxa: update xm_x2xx_defconfig
Signed-off-by: Russ Dill <russ.dill@gmail.com> Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Mike Rapoport [Sun, 5 Oct 2008 09:29:59 +0000 (10:29 +0100)]
[ARM] 5284/1: pxa: cm-x255: add NOR and NAND flash support
This patch adds support for NOR and NAND flashes on CM-X255.
The NAND flash support uses not yet merged GPIO NAND driver.
Signed-off-by: Russ Dill <russ.dill@gmail.com> Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Mike Rapoport [Sun, 5 Oct 2008 09:27:22 +0000 (10:27 +0100)]
[ARM] 5283/1: pxa: add CM-X255 pcmcia support
Signed-off-by: Russ Dill <russ.dill@gmail.com> Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Mike Rapoport [Sun, 5 Oct 2008 09:26:55 +0000 (10:26 +0100)]
[ARM] 5282/1: pxa: add CM-X255 support
Signed-off-by: Russ Dill <russ.dill@gmail.com> Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Mike Rapoport [Sun, 5 Oct 2008 09:26:10 +0000 (10:26 +0100)]
[ARM] 5281/1: pxa: split cm-x2xx.c to cm-x2xx.c and cm-x270.c
Signed-off-by: Russ Dill <russ.dill@gmail.com> Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Mike Rapoport [Sun, 5 Oct 2008 09:25:44 +0000 (10:25 +0100)]
[ARM] 5280/1: pxa: prepare cm-x2xx.c and cm-x2xx-pci.[ch] for addition of CM-X255
- Change CM-X255 and CM-X270 common function prefix from cmx270 to cmx2xx
- Split cmx2xx_init to common and CM-X270-specific parts
- Use dynamic assignement for DM9000 resources and led GPIOs.
Signed-off-by: Russ Dill <russ.dill@gmail.com> Signed-off-by: Mike Rapoport <mike@compulab.co.il> Acked-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Mike Rapoport [Tue, 7 Oct 2008 10:58:25 +0000 (11:58 +0100)]
[ARM] 5286/2: pxa: rename cm-x270* to cm-x2xx* to allow addition of cm-x255 support
Signed-off-by: Russ Dill <russ.dill@gmail.com> Signed-off-by: Mike Rapoport <mike@compulab.co.il> Acked-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Hartley Sweeten [Mon, 6 Oct 2008 21:43:02 +0000 (22:43 +0100)]
[ARM] 5293/1: ep93xx: add defines for external chipselects
This patch adds defines for the external chipselect physical base
addresses available with the EP93xx. These are meant to be used
in the platform init code.
In addition, documentation about the synchronous/asynchronous boot
modes for the EP93xx and a reference to errata about issues with
synchronous booting has been added.
Signed-off-by: <H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
David Brownell [Mon, 6 Oct 2008 07:41:32 +0000 (00:41 -0700)]
twl4030-gpio irq simplification
Simplify and correct TWL4030 GPIO IRQ handling:
- support mask() not just unmask()
- use genirq handle_edge_irq() handler not custom hacks
- let that handle (correct) accounting of chained IRQ counts
- use the more efficient clear-on-read mechanism
- don't misuse IRQ_LEVEL
- remove some superfluous locking
- locking fix: all irq_chip data needs spinlock protection
Cleanups:
- give the helper thread a more accurate name
- don't name the NOP ack() method misleadingly
- use generic_handle_irq(), not a manually unrolled version thereof
- comment fixes
Note that the previous IRQ dispatch code was somewhat confused.
It seemed not to know that it was working with edge triggered
interrupts, much less ones which could be transparently acked.
(Also note that COR=1 doesn't enable Clear-On-Read for all modules;
some are documented as using COR=0 for that. GPIO uses COR=1.)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Tony Lindgren <tony@atomide.com>
David Brownell [Tue, 7 Oct 2008 03:43:35 +0000 (20:43 -0700)]
twl4030-core irq simplification
Simplify twl4030 IRQ handling by removing a needless custom flow
handler. The top level IRQs, from the PIH, are well suited for
handle_simple_irq() ... they can't be acked or masked.
Switching resolves some issues with how IRQs were dispatched.
Notably, abuse of desc->status, IRQ accounting, and handling
of various faults.
In short, use standard genirq code.
Drivers that request_irq() to the PIH will need to pay more
attention to things like setting IRQF_DISABLED (since it's
no longer ignored), and making I2C calls from handlers (you'll
need a lockdep workaround).
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Tony Lindgren <tony@atomide.com>
David Brownell [Tue, 7 Oct 2008 03:37:59 +0000 (20:37 -0700)]
twl4030 "subdriver" irq tweaks
Bugfixes to TWL subdriver irq handler setup ... lockdep
workarounds, remove IRQF_DISABLED. NOPs with current code.
These changes are specific to the drivers which register
directly with the PIH irq_chip (in twl4030-core), and are
prerequsites to a cleanup patch for that PIH infrastructure.
(Unless you don't use these drivers.)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Tony Lindgren <tony@atomide.com>
David Brownell [Mon, 6 Oct 2008 20:17:35 +0000 (13:17 -0700)]
beagle: two more GPIOs
Request two more GPIOs on Beagle: MMC write protect, DVI enable.
Also define the first OMAP3 GPIO mux config symbols, with a new
naming convention. Examples:
- GPIO42 is bidirectional, and uses external pull up/down (if any);
- GPIO42_UP is bidirectional, with an internal pullup;
- GPIO42_DOWN is bidirectional, with an internal pulldown.
- GPIO42_OUT is output-only, and needs no pullup or pulldown.
All of those would be fully functional through the standard GPIO
interface as well as the legacy OMAP-only one, except GPIO42_OUT
which won't let you read the actual pin value after setting it.
There's no special off-mode support for these particular pins;
on Beagle they have external pullups or pulldowns.
(OMAP2 can use this naming convention too, except that since it
has no input-enable the "_OUT" convention can't be used there.)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Tony Lindgren <tony@atomide.com>
x86: SB450: skip IRQ0 override if it is not routed to INT2 of IOAPIC
On some HP nx6... laptops (e.g. nx6325) BIOS reports an IRQ0 override
but the SB450 chipset is configured such that timer interrupts goe to
INT0 of IOAPIC.
Check IRQ0 routing and if it is routed to INT0 of IOAPIC skip the
timer override.
[ This more generic PCI ID based quirk should alleviate the need for
dmi_ignore_irq0_timer_override DMI quirks. ]
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: "Maciej W. Rozycki" <macro@linux-mips.org> Tested-by: Dmitry Torokhov <dtor@mail.ru> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
powerpc: Fix sysfs pci mmap on 32-bit machines with 64-bit PCI
When manipulating 64-bit PCI addresses, the code would lose the
top 32-bit in a couple of places when shifting a pfn due to missing
type casting from the 32-bit pfn to a 64-bit resource before the
shift.
This breaks using newer X servers for example on 440 machines
with the PCI bus above 32-bit.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc: Honor O_NONBLOCK flag when reading RTAS log
rtas_log_read() doesn't check file flags for O_NONBLOCK and blocks
non-blocking readers of /proc/ppc64/rtas/error_log when there is
no data available. This fixes it.
Also rtas_log_read() returns now with ENODATA to prevent suspending of
process in wait_event_interruptible() when logging facility was
switched off and log is already empty.
Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Johannes Berg [Wed, 24 Sep 2008 04:29:08 +0000 (04:29 +0000)]
powerpc: Enforce sane MAX_ORDER
powerpc uses CONFIG_FORCE_MAX_ZONEORDER, and some things depend on it
being at least 10 when 64k pages are not configured (notably the dart
iommu code with CONFIG_PM). The defaults are fine, but when going from a
64K pages config to one without 64K pages, MAX_ORDER stays at 9 which is
too low for 4K pages.
This patch makes the Kconfig enforce at least the defaults.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
commit 14cf11af6cf608eb8c23e989ddb17a715ddce109 ("powerpc: Merge enough to
start building in arch/powerpc.") unwired /proc/ppc_htab, and commit 917f0af9e5a9ceecf9e72537fabb501254ba321d ("powerpc: Remove arch/ppc and
include/asm-ppc") removed the rest of the /proc/ppc_htab support, but there are
still a few references left. Kill them for good.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Roland Dreier [Mon, 15 Sep 2008 10:43:35 +0000 (10:43 +0000)]
powerpc: Avoid integer overflow in page_is_ram()
Commit 8b150478 ("ppc: make phys_mem_access_prot() work with pfns
instead of addresses") fixed page_is_ram() in arch/ppc to avoid overflow
for addresses above 4G on 32-bit kernels. However arch/powerpc's
page_is_ram() is missing the same fix -- it computes a physical address
by doing pfn << PAGE_SHIFT, which overflows if pfn corresponds to a page
above 4G.
In particular this causes pages above 4G to be mapped with the wrong
caching attribute; for example many ppc440-based SoCs have PCI space
above 4G, and mmap()ing MMIO space may end up with a mapping that has
caching enabled.
Fix this by working with the pfn and avoiding the conversion to
physical address that causes the overflow. This patch compares the
pfn to max_pfn, which is a semantic change from the old code -- that
code compared the physical address to high_memory, which corresponds
to max_low_pfn. However, I think that was is another bug, since
highmem pages are still RAM.
Reported-by: vb <vb@vsbe.com> Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Andi Kleen [Tue, 7 Oct 2008 01:37:44 +0000 (21:37 -0400)]
ext4: Avoid double dirtying of super block in ext4_put_super()
While reading code I noticed that ext4_put_super() dirties the
superblock bh twice. It is always done in ext4_commit_super()
too. Remove the redundant dirty operation.
Should be a nop semantically.
Theodore Ts'o [Tue, 7 Oct 2008 00:58:09 +0000 (20:58 -0400)]
Update ext4 MAINTAINERS file
The ext4 entry was copied from ext3 and was never correct. Update it
so that Theodore Ts'o is listed as the maintainer, and point the
website to http://ext4.wiki.kernel.org.
Instead of causing umount requests to block on server->active_wq while the
asynchronous sillyrename deletes are executing, we can use the sb->s_active
counter to obtain a reference to the super_block, and then release that
reference in nfs_async_unlink_release().
After the BKL removal patches were applied to the rest of the NFS code, the
BKL protection in nfs_file_llseek() is no longer sufficient to ensure that
inode->i_size is read safely in generic_file_llseek_unlocked().
In order to fix the situation, we either have to replace the naked read of
inode->i_size in generic_file_llseek_unlocked() with i_size_read(), or the
whole thing needs to be executed under the inode->i_lock;
In order to avoid disrupting other filesystems, avoid touching
generic_file_llseek_unlocked() for now...
mac80211: avoid "Wireless Event too big" message for assoc response
The association response IEs are sent to userland with an IWEVCUSTOM
event, which unfortunately is limited to a little more than 100 bytes
of IE information with the encoding used. Many APs send so much
IE information that this message overflows. When the IWEVCUSTOM
event is too large, the kernel doesn't send it to userland anyway --
better just not to send it.
An attempt was made by Jouni Malinen to correct this issue by
converting to use IWEVASSOCREQIE and IWEVASSOCRESPIE messages instead
("mac80211: Use IWEVASSOCREQIE instead of IWEVCUSTOM"). Unfortunately,
that caused a problem due to 32-/64-bit interactions on some systems and
was reverted after the 'userland ABI' rule was invoked. That leaves
us with this option instead of a proper fix, at least until we move
to a cfg80211-based solution.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* Theodore Ts'o (tytso@mit.edu) wrote:
>
> I've been playing with adding some markers into ext4 to see if they
> could be useful in solving some problems along with Systemtap. It
> appears, though, that as of 2.6.27-rc8, markers defined in code which is
> compiled directly into the kernel (i.e., not as modules) don't show up
> in Module.markers:
>
> kvm_trace_entryexit arch/x86/kvm/kvm-intel %u %p %u %u %u %u %u %u
> kvm_trace_handler arch/x86/kvm/kvm-intel %u %p %u %u %u %u %u %u
> kvm_trace_entryexit arch/x86/kvm/kvm-amd %u %p %u %u %u %u %u %u
> kvm_trace_handler arch/x86/kvm/kvm-amd %u %p %u %u %u %u %u %u
>
> (Note the lack of any of the kernel_sched_* markers, and the markers I
> added for ext4_* and jbd2_* are missing as wel.)
>
> Systemtap apparently depends on in-kernel trace_mark being recorded in
> Module.markers, and apparently it's been claimed that it used to be
> there. Is this a bug in systemtap, or in how Module.markers is getting
> built? And is there a file that contains the equivalent information
> for markers located in non-modules code?