* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (864 commits)
Btrfs: explicitly mark the tree log root for writeback
Btrfs: Drop the hardware crc32c asm code
Btrfs: Add Documentation/filesystem/btrfs.txt, remove old COPYING
Btrfs: kmap_atomic(KM_USER0) is safe for btrfs_readpage_end_io_hook
Btrfs: Don't use kmap_atomic(..., KM_IRQ0) during checksum verifies
Btrfs: tree logging checksum fixes
Btrfs: don't change file extent's ram_bytes in btrfs_drop_extents
Btrfs: Use btrfs_join_transaction to avoid deadlocks during snapshot creation
Btrfs: drop remaining LINUX_KERNEL_VERSION checks and compat code
Btrfs: drop EXPORT symbols from extent_io.c
Btrfs: Fix checkpatch.pl warnings
Btrfs: Fix free block discard calls down to the block layer
Btrfs: avoid orphan inode caused by log replay
Btrfs: avoid potential super block corruption
Btrfs: do not call kfree if kmalloc failed in btrfs_sysfs_add_super
Btrfs: fix a memory leak in btrfs_get_sb
Btrfs: Fix typo in clear_state_cb
Btrfs: Fix memset length in btrfs_file_write
Btrfs: update directory's size when creating subvol/snapshot
Btrfs: add permission checks to the ioctls
...
Andi Kleen [Fri, 9 Jan 2009 20:17:39 +0000 (12:17 -0800)]
x86: only scan the root bus in early PCI quirks
We found a situation on Linus' machine where the Nvidia timer quirk hit on
an Intel chipset system. The problem is that the system has a fancy Nvidia
card with its own PCI bridge, and the early-quirks code looking for any
NVidia bridge triggered on it incorrectly. By luck this didn't lead to a boot
failure, only to the timer routing code selecting the wrong timer
first and some ugly messages. It might lead to real problems on other
systems.
I checked all the devices which are currently checked for by early_quirks
and it turns out they are all located in the root bus zero.
So change the early-quirks loop to only scan bus 0. This incidentally also
saves quite some unnecessary scanning work, because early_quirks no longer
goes through all the non-root busses.
The graphics card is not on bus 0, so it is not matched anymore.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
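The effect of restricting the scan to bus 0 can be modelled in a few lines. This is only a userspace sketch, not the kernel's early-quirks code: the PciDevice record, the quirk_candidates() helper and the bus number 3 chosen for the graphics card are assumptions made for illustration.

```python
from typing import NamedTuple


class PciDevice(NamedTuple):
    """Hypothetical record for one PCI function."""
    bus: int
    slot: int
    func: int
    vendor: int
    class_code: int


VENDOR_NVIDIA = 0x10DE
VENDOR_INTEL = 0x8086
CLASS_BRIDGE_HOST = 0x0600
CLASS_BRIDGE_PCI = 0x0604


def quirk_candidates(devices, root_bus_only=True):
    """Return the NVIDIA bridges that a quirk scan would match."""
    return [
        d for d in devices
        if d.vendor == VENDOR_NVIDIA
        and d.class_code == CLASS_BRIDGE_PCI
        and (not root_bus_only or d.bus == 0)
    ]


devices = [
    # Intel host bridge on the root bus.
    PciDevice(0, 0, 0, VENDOR_INTEL, CLASS_BRIDGE_HOST),
    # Bridge that lives on the graphics card (bus number is made up).
    PciDevice(3, 0, 0, VENDOR_NVIDIA, CLASS_BRIDGE_PCI),
]

print(quirk_candidates(devices, root_bus_only=False))  # matches the card's bridge
print(quirk_candidates(devices))                       # matches nothing
```

With the bus filter in place the bridge on the graphics card is never considered, which is the behaviour the commit message describes.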
block: add one-hit cache for disk partition lookup
and has the effect of killing my machine whenever I try to assemble
an md array :-(
One of the devices in the array has partitions, and mdadm always
deletes partitions before putting a whole-device in an array (as it
can cause confusion). The next IO to that device locks the machine.
I don't really understand exactly why it locks up, but it happens in
disk_map_sector_rcu(). This patch fixes it.
Which is due to a missing clear of the (now) stale partition lookup
data. So clear that when we delete a partition.
Linus Torvalds [Fri, 9 Jan 2009 20:43:06 +0000 (12:43 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile: (31 commits)
powerpc/oprofile: fix whitespaces in op_model_cell.c
powerpc/oprofile: IBM CELL: add SPU event profiling support
powerpc/oprofile: fix cell/pr_util.h
powerpc/oprofile: IBM CELL: cleanup and restructuring
oprofile: make new cpu buffer functions part of the api
oprofile: remove #ifdef CONFIG_OPROFILE_IBS in non-ibs code
ring_buffer: fix ring_buffer_event_length()
oprofile: use new data sample format for ibs
oprofile: add op_cpu_buffer_get_data()
oprofile: add op_cpu_buffer_add_data()
oprofile: rework implementation of cpu buffer events
oprofile: modify op_cpu_buffer_read_entry()
oprofile: add op_cpu_buffer_write_reserve()
oprofile: rename variables in add_ibs_begin()
oprofile: rename add_sample() in cpu_buffer.c
oprofile: rename variable ibs_allowed to has_ibs in op_model_amd.c
oprofile: making add_sample_entry() inline
oprofile: remove backtrace code for ibs
oprofile: remove unused ibs macro
oprofile: remove unused components in struct oprofile_cpu_buffer
...
Linus Torvalds [Fri, 9 Jan 2009 19:55:14 +0000 (11:55 -0800)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (94 commits)
ACPICA: hide private headers
ACPICA: create acpica/ directory
ACPI: fix build warning
ACPI : Use RSDT instead of XSDT by adding boot option of "acpi=rsdt"
ACPI: Avoid array address overflow when _CST MWAIT hint bits are set
fujitsu-laptop: Simplify SBLL/SBL2 backlight handling
fujitsu-laptop: Add BL power, LED control and radio state information
ACPICA: delete utcache.c
ACPICA: delete acdisasm.h
ACPICA: Update version to 20081204.
ACPICA: FADT: Update error msgs for consistency
ACPICA: FADT: set acpi_gbl_use_default_register_widths to TRUE by default
ACPICA: FADT parsing changes and fixes
ACPICA: Add ACPI_MUTEX_TYPE configuration option
ACPICA: Fixes for various ACPI data tables
ACPICA: Restructure includes into public/private
ACPI: remove private acpica headers from driver files
ACPI: reboot.c: use new acpi_reset interface
ACPICA: New: acpi_reset interface - write to reset register
ACPICA: Move all public H/W interfaces to new hwxface
...
Tejun Heo [Fri, 9 Jan 2009 10:19:14 +0000 (19:19 +0900)]
libata: use WARN_ON_ONCE on hot paths
Convert WARN_ON() on command issue/completion paths to WARN_ON_ONCE()
so that libata doesn't spam the machine even when one of those
conditions triggers repeatedly.
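The difference between warning every time and warning once can be sketched as follows. This is a userspace analogy, not the kernel macro: the warn_on_once() helper, the site string and the emitted list are hypothetical names used only for this example.

```python
emitted = []          # messages that were actually reported
_warned_sites = set() # call sites that have already warned


def warn_on_once(site: str, condition: bool) -> bool:
    """Report `site` the first time `condition` is true; always return condition."""
    if condition and site not in _warned_sites:
        _warned_sites.add(site)
        emitted.append(site)
    return condition


# A condition that triggers on every command still produces one report.
for _ in range(1000):
    warn_on_once("command issue path", True)

print(len(emitted))
```

The caller still sees the condition on every call, so error handling is unchanged; only the reporting is rate-limited to one message per site.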
David Howells [Fri, 9 Jan 2009 16:13:46 +0000 (16:13 +0000)]
CRED: Must initialise the new creds in prepare_kernel_cred()
The newly allocated creds in prepare_kernel_cred() must be initialised
before get_uid() and get_group_info() can access them. They should be
copied from the old credentials.
Reported-by: Steve Dickson <steved@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
parisc: export length of os_hpmc vector
parisc: fix kernel crash (protection id trap) when compiling ruby1.9
parisc: Use DEFINE_SPINLOCK
parisc: add uevent helper for parisc bus
parisc: fix ipv6 checksum
parisc: quiet palo not-found message from "which"
parisc: Replace NR_CPUS in parisc code
parisc: trivial fixes
parisc: fix braino in commit adding __space_to_prot
parisc: factor out sid to protid conversion
parisc: use leX_to_cpu in place of __fswabX
parisc: fix GFP_KERNEL use while atomic in unwinder
parisc: remove dead BIO_VMERGE_BOUNDARY and BIO_VMERGE_MAX_SIZE definitions
parisc: set_time() catch errors
parisc: use the new byteorder headers
parisc: drivers/parisc/: make code static
parisc: lib/: make code static
Linus Torvalds [Fri, 9 Jan 2009 19:52:14 +0000 (11:52 -0800)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (22 commits)
ioat: fix self test for multi-channel case
dmaengine: bump initcall level to arch_initcall
dmaengine: advertise all channels on a device to dma_filter_fn
dmaengine: use idr for registering dma device numbers
dmaengine: add a release for dma class devices and dependent infrastructure
ioat: do not perform removal actions at shutdown
iop-adma: enable module removal
iop-adma: kill debug BUG_ON
iop-adma: let devm do its job, don't duplicate free
dmaengine: kill enum dma_state_client
dmaengine: remove 'bigref' infrastructure
dmaengine: kill struct dma_client and supporting infrastructure
dmaengine: replace dma_async_client_register with dmaengine_get
atmel-mci: convert to dma_request_channel and down-level dma_slave
dmatest: convert to dma_request_channel
dmaengine: introduce dma_request_channel and private channels
net_dma: convert to dma_find_channel
dmaengine: provide a common 'issue_pending_all' implementation
dmaengine: centralize channel allocation, introduce dma_find_channel
dmaengine: up-level reference counting to the module level
...
Chris Mason [Fri, 9 Jan 2009 18:14:17 +0000 (13:14 -0500)]
Btrfs: explicitly mark the tree log root for writeback
Each subvolume has an extent_state_tree used to mark metadata
that needs to be sent to disk while syncing the tree. This is
used in addition to the dirty bits on the pages themselves so that
a single subvolume can be sent to disk efficiently in disk order.
Normally this marking happens in btrfs_alloc_free_block, which also does
special recording of dirty tree blocks for the tree log roots.
Yan Zheng noticed that when the root of the log tree is allocated, it is added
to the wrong writeback list. The fix used here is to explicitly set
it dirty as part of tree log creation.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Philip Rakity [Wed, 8 Oct 2008 23:08:20 +0000 (16:08 -0700)]
[MTD] [NAND] add cmdline parsing (mtdparts=) support to cafe_nand
[dwmw2: updated and made to still register whole device first]
Signed-off-by: Philip Rakity <pakity@yahoo.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Zhao Yakui [Wed, 17 Dec 2008 08:55:18 +0000 (16:55 +0800)]
ACPI : Use RSDT instead of XSDT by adding boot option of "acpi=rsdt"
On some boxes both an RSDT and an XSDT table exist. Unfortunately,
the following errors sometimes occur when the XSDT table is used:
a. 32/64X address mismatch
b. The 32/64X FACS address mismatch
In such cases the boot option "acpi=rsdt" is provided so that the
RSDT is tried instead of the XSDT when the system does not work well.
http://bugzilla.kernel.org/show_bug.cgi?id=8246
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhao Yakui [Sun, 4 Jan 2009 04:04:21 +0000 (12:04 +0800)]
ACPI: Avoid array address overflow when _CST MWAIT hint bits are set
The Cx Register address obtained from the _CST object is used as the MWAIT
hints if the register type is FFixedHW. And it is used to check whether
the Cx type is supported or not.
On some boxes the following Cx state package is obtained from _CST object:
    {
        ResourceTemplate ()
        {
            Register (FFixedHW,
                0x01,               // Bit Width
                0x02,               // Bit Offset
                0x0000000000889759, // Address
                0x03,               // Access Size
                )
        },
        0x03,
        0xF5,
        0x015E
    }
In such a case we should use bits [7:4] of the Cx address to check whether
the Cx type is supported or not.
Mask the MWAIT hint to avoid array address overflow.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Acked-by: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
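The arithmetic behind the overflow can be checked with the address from the package above. This is a minimal sketch, not the kernel code: the constant names and the cstate_index() helper are assumptions chosen for illustration.

```python
MWAIT_SUBSTATE_SIZE = 4  # assumed width in bits of the sub-state field
MWAIT_CSTATE_MASK = 0xF  # keeps only bits [7:4] once they are shifted down


def cstate_index(address: int, masked: bool = True) -> int:
    """Derive a C-state index from the MWAIT hint carried in `address`."""
    index = address >> MWAIT_SUBSTATE_SIZE
    if masked:
        index &= MWAIT_CSTATE_MASK
    return index


address = 0x0000000000889759  # value from the _CST package quoted above

print(hex(cstate_index(address, masked=False)))  # far larger than any small table
print(hex(cstate_index(address)))                # confined to the range 0x0..0xf
```

Without the mask the shifted address is used whole, so any extra bits above bit 7 turn into a huge index; with the mask the index can never exceed 15.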
GBLS and GBLL only differ in the clearing of the GHKS flag, so there is no need
to have two backlight level readouts. Also, per Peter Gruber, the need for the
BLNF check has disappeared.
As a result, cleanups can be made in the code. This has been tested on both
the S6410 and the S6420 platforms and causes no functionality regressions, on
the console without X or within X. One module parameter to disable the hotkeys
is dropped, as we only ever took one codepath anyway.
Signed-off-by: Tony Vroon <tony@linx.net>
Tested-by: Peter Gruber <nokos@gmx.net>
Tested-By: Stephen Gildea <stepheng+linux@gildea.com>
Acked-by: Jonathan Woithe <jwoithe@physics.adelaide.edu.au>
Signed-off-by: Len Brown <len.brown@intel.com>
Tony Vroon [Wed, 31 Dec 2008 18:19:59 +0000 (18:19 +0000)]
fujitsu-laptop: Add BL power, LED control and radio state information
The FUNC interface in the Fujitsu-Siemens DSDT was unused until now. It exposes
state information that is now reported in additional platform files (whether the
radios are killed by the hardware switch or operational, whether the machine is
docked and whether the lid is open).
Support for the backlight class is now extended with the ability to power the
backlight on & off. Optional support for the LED class allows the keyboard
headlamps found on the U810 netbook and the Fujitsu logo illumination on the
P8010 notebook to be turned on & off.
This was fed through checkpatch.pl and tested on the S6420, P8010 & U810 platforms.
Signed-off-by: Stephen Gildea <stepheng+linux@gildea.com>
Tested-by: Stephen Gildea <stepheng+linux@gildea.com>
Tested-by: Julian Brown <jules@panic.cs-bristol.org.uk>
Signed-off-by: Tony Vroon <tony@linx.net>
Tested-by: Peter Gruber <nokos@gmx.net>
Acked-by: Jonathan Woithe <jwoithe@physics.adelaide.edu.au>
Signed-off-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Fri, 9 Jan 2009 01:14:59 +0000 (17:14 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (57 commits)
jbd2: Fix oops in jbd2_journal_init_inode() on corrupted fs
ext4: Remove "extents" mount option
block: Add Kconfig help which notes that ext4 needs CONFIG_LBD
ext4: Make printk's consistently prefixed with "EXT4-fs: "
ext4: Add sanity checks for the superblock before mounting the filesystem
ext4: Add mount option to set kjournald's I/O priority
jbd2: Submit writes to the journal using WRITE_SYNC
jbd2: Add pid and journal device name to the "kjournald2 starting" message
ext4: Add markers for better debuggability
ext4: Remove code to create the journal inode
ext4: provide function to release metadata pages under memory pressure
ext3: provide function to release metadata pages under memory pressure
add releasepage hooks to block devices which can be used by file systems
ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
ext4: Init the complete page while building buddy cache
ext4: Don't allow new groups to be added during block allocation
ext4: mark the blocks/inode bitmap beyond end of group as used
ext4: Use new buffer_head flag to check uninit group bitmaps initialization
ext4: Fix the race between read_inode_bitmap() and ext4_new_inode()
ext4: code cleanup
...
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (45 commits)
[SCSI] qla2xxx: Update version number to 8.03.00-k1.
[SCSI] qla2xxx: Add ISP81XX support.
[SCSI] qla2xxx: Use proper request/response queues with MQ instantiations.
[SCSI] qla2xxx: Correct MQ-chain information retrieval during a firmware dump.
[SCSI] qla2xxx: Collapse EFT/FCE copy procedures during a firmware dump.
[SCSI] qla2xxx: Don't pollute kernel logs with ZIO/RIO status messages.
[SCSI] qla2xxx: Don't fallback to interrupt-polling during re-initialization with MSI-X enabled.
[SCSI] qla2xxx: Remove support for reading/writing HW-event-log.
[SCSI] cxgb3i: add missing include
[SCSI] scsi_lib: fix DID_RESET status problems
[SCSI] fc transport: restore missing dev_loss_tmo callback to LLDD
[SCSI] aha152x_cs: Fix regression that keeps driver from using shared interrupts
[SCSI] sd: Correctly handle 6-byte commands with DIX
[SCSI] sd: DIF: Fix tagging on platforms with signed char
[SCSI] sd: DIF: Show app tag on error
[SCSI] Fix error handling for DIF/DIX
[SCSI] scsi_lib: don't decrement busy counters when inserting commands
[SCSI] libsas: fix test for negative unsigned and typos
[SCSI] a2091, gvp11: kill warn_unused_result warnings
[SCSI] fusion: Move a dereference below a NULL test
...
Fixed up trivial conflict due to moving the async part of sd_probe
around in the async probes vs using dev_set_name() in naming.
Linus Torvalds [Thu, 8 Jan 2009 23:52:13 +0000 (15:52 -0800)]
Merge branch 'docs-next' of git://git.lwn.net/linux-2.6
* 'docs-next' of git://git.lwn.net/linux-2.6:
Fix a typo in the development process document.
Document handling of bad memory
Document RCU and unloadable modules
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (84 commits)
wimax: fix kernel-doc for debufs_dentry member of struct wimax_dev
net: convert pegasus driver to net_device_ops
bnx2x: Prevent eeprom set when driver is down
net: switch kaweth driver to netdevops
pcnet32: round off carrier watch timer
i2400m/usb: wrap USB power saving in #ifdef CONFIG_PM
wimax: testing for rfkill support should also test for CONFIG_RFKILL_MODULE
wimax: fix kconfig interactions with rfkill and input layers
wimax: fix '#ifndef CONFIG_BUG' layout to avoid warning
r6040: bump release number to 0.20
r6040: warn about MAC address being unset
r6040: check PHY status when bringing interface up
r6040: make printks consistent with DRV_NAME
gianfar: Fixup use of BUS_ID_SIZE
mlx4_en: Returning real Max in get_ringparam
mlx4_en: Consider inline packets on completion
netdev: bfin_mac: enable bfin_mac net dev driver for BF51x
qeth: convert to net_device_ops
vlan: add neigh_setup
dm9601: warn on invalid mac address
...
Linus Torvalds [Thu, 8 Jan 2009 22:03:34 +0000 (14:03 -0800)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md: don't retry recovery of raid1 that fails due to error on source drive.
md: Allow md devices to be created by name.
md: make devices disappear when they are no longer needed.
md: centralise all freeing of an 'mddev' in 'md_free'
md: move allocation of ->queue from mddev_find to md_probe
md: need another print_sb for mdp_superblock_1
md: use list_for_each_entry macro directly
md: raid0: make hash_spacing and preshift sector-based.
md: raid0: Represent the size of strip zones in sectors.
md: raid0 create_strip_zones(): Add KERN_INFO/KERN_ERR to printk's.
md: raid0 create_strip_zones(): Make two local variables sector-based.
md: raid0: Represent zone->zone_offset in sectors.
md: raid0: Represent device offset in sectors.
md: raid0_make_request(): Replace local variable block by sector.
md: raid0_make_request(): Remove local variable chunk_size.
md: raid0_make_request(): Replace chunksize_bits by chunksect_bits.
md: use sysfs_notify_dirent to notify changes to md/sync_action.
md: fix bitmap-on-external-file bug.
Linus Torvalds [Thu, 8 Jan 2009 22:01:36 +0000 (14:01 -0800)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: clean up the SFF code for coding style
libata: Add 32bit PIO support
[libata] ahci: Withdraw IGN_SERR_INTERNAL for SB800 SATA
pata_hpt366: reimplement mode programming
[libata] pata_hpt3x3: correct _freeze() function declaration
libata: Add special ata_pio_need_iordy() handling for Compact Flash.
pata_platform: __pata_platform_remove() shouldn't be in discard section
sata_sil24: remove unused sil24_port_multiplier
[libata] ahci: Add SATA GEN3 related messages
ata_piix: save, use saved and restore IOCFG
pata_ali: Fix and workaround for FIFO DMA bug
pata_ali: force initialise a few bits
pata_hpt3x3: Workarounds for chipset
Alan Cox [Mon, 5 Jan 2009 14:16:39 +0000 (14:16 +0000)]
libata: Add 32bit PIO support
This matters for some controllers and in one or two cases almost doubles
PIO performance. Add a bmdma32 operations set we can inherit, and activate
it for some controllers.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Shane Huang [Tue, 30 Dec 2008 02:53:41 +0000 (10:53 +0800)]
[libata] ahci: Withdraw IGN_SERR_INTERNAL for SB800 SATA
There is an issue in ATI SB600/SB700 SATA where PxSERR.E should not be
set under some conditions, which leads to many SATA ODD error messages.
Commit 55a61604cd1354e1783364e1c901034f2f474b7d is the workaround.
Since SB800 fixed this HW issue, IGN_SERR_INTERNAL should be withdrawn
for SB800.
Signed-off-by: Shane Huang <shane.huang@amd.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
NeilBrown [Thu, 8 Jan 2009 21:31:11 +0000 (08:31 +1100)]
md: don't retry recovery of raid1 that fails due to error on source drive.
If a raid1 has only one working drive and it has a sector which
gives an error on read, then an attempt to recover onto a spare will
fail, but as the single remaining drive is not removed from the
array, the recovery will be immediately re-attempted, resulting
in an infinite recovery loop.
So detect this situation and don't retry recovery once an error
on the lone remaining drive is detected.
Allow recovery to be retried once every time a spare is added
in case the problem wasn't actually a media error.
NeilBrown [Thu, 8 Jan 2009 21:31:10 +0000 (08:31 +1100)]
md: Allow md devices to be created by name.
Using sequential numbers to identify md devices is somewhat artificial.
Using names can be a lot more user-friendly.
Also, creating md devices by opening the device special file is a bit
awkward.
So this patch provides a new option for creating and naming devices.
Writing a name such as "md_home" to
/sys/module/md_mod/parameters/new_array
will cause an array with that name to be created. It will appear in
/sys/block/, /proc/partitions and /proc/mdstat as 'md_home'.
It will have an arbitrary minor number allocated.
md devices that are created by an open are destroyed on the last
close when the device is inactive.
For named md devices, they will not be destroyed until the array
is explicitly stopped, either with the STOP_ARRAY ioctl or by
writing 'clear' to /sys/block/md_XXXX/md/array_state.
The name of the array must start with 'md_' to avoid conflict with
other devices.
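From userspace, the create and stop steps described above amount to two sysfs writes. The following is a hedged sketch, not part of the patch: the helper names are hypothetical, and the parameter path assumes the usual sysfs layout in which module parameters appear under /sys/module/<module>/parameters/. Both helpers take the path as an argument so they can be tried against an ordinary file.

```python
NEW_ARRAY_PARAM = "/sys/module/md_mod/parameters/new_array"


def create_named_array(name: str, param_path: str = NEW_ARRAY_PARAM) -> None:
    """Ask md to create an array called `name` (must start with 'md_')."""
    if not name.startswith("md_"):
        raise ValueError("array name must start with 'md_'")
    with open(param_path, "w") as param:
        param.write(name)


def stop_named_array(name: str, sysfs_block: str = "/sys/block") -> None:
    """Stop a named array by writing 'clear' to its array_state attribute."""
    with open(f"{sysfs_block}/{name}/md/array_state", "w") as state:
        state.write("clear")
```

On a real system both calls need root privileges; create_named_array("md_home") would correspond to the "md_home" example in the text.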
NeilBrown [Thu, 8 Jan 2009 21:31:10 +0000 (08:31 +1100)]
md: make devices disappear when they are no longer needed.
Currently md devices, once created, never disappear until the module
is unloaded. This is essentially because the gendisk holds a
reference to the mddev, and the mddev holds a reference to the
gendisk; this is a circular reference.
If we drop the reference from mddev to gendisk, then we need to ensure
that the mddev is destroyed when the gendisk is destroyed. However it
is not possible to hook into the gendisk destruction process to enable
this.
So we drop the reference from the gendisk to the mddev and destroy the
gendisk when the mddev gets destroyed. However this has a
complication.
Between the call
__blkdev_get->get_gendisk->kobj_lookup->md_probe
and the call
__blkdev_get->md_open
there is no obvious way to hold a reference on the mddev any more, so
unless something is done, it will disappear and gendisk will be
destroyed prematurely.
Also, once we decide to destroy the mddev, there will be an unlockable
moment before the gendisk is unlinked (blk_unregister_region) during
which a new reference to the gendisk can be created. We need to
ensure that this reference can not be used. i.e. the ->open must
fail.
So:
1/ in md_probe we set a flag in the mddev (hold_active) which
indicates that the array should be treated as active, even
though there are no references, and no appearance of activity.
This is cleared by md_release when the device is closed if it
is no longer needed.
This ensures that the gendisk will survive between md_probe and
md_open.
2/ In md_open we check if the mddev we expect to open matches
the gendisk that we did open.
If there is a mismatch we return -ERESTARTSYS and modify
__blkdev_get to retry from the top in that case.
In the -ERESTARTSYS case we make sure to wait until
the old gendisk (that we succeeded in opening) is really gone so
we loop at most once.
Some udev configurations will always open an md device when it first
appears. If we allow an md device that was just created by an open
to disappear on an immediate close, then this can race with such udev
configurations and result in an infinite loop of the device being opened
and closed, then re-opened due to the 'ADD' event from the first open,
and then closed and so on.
So we make sure an md device, once created by an open, remains active
at least until some md 'ioctl' has been made on it. This means that
all normal usage of md devices will allow them to disappear promptly
when not needed, but the worst that an incorrect usage will do is
cause an inactive md device to be left in existence (it can easily be
removed).
As an array can be stopped by writing to a sysfs attribute
echo clear > /sys/block/mdXXX/md/array_state
we need to use scheduled work for deleting the gendisk and other
kobjects. This allows us to wait for any pending gendisk deletion to
complete by simply calling flush_scheduled_work().
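The retry described in step 2/ can be modelled outside the kernel. This is a simplified sketch, not the real __blkdev_get: the blkdev_get() function, its callback arguments and the max_tries bound are hypothetical, and the string values stand in for gendisk objects.

```python
ERESTARTSYS = 512  # kernel-internal error number


def blkdev_get(lookup_disk, md_open, max_tries=10):
    """Look up a disk and open it, restarting from the top on -ERESTARTSYS."""
    for _ in range(max_tries):
        disk = lookup_disk()
        err = md_open(disk)
        if err != -ERESTARTSYS:
            return disk, err
    raise RuntimeError("too many restarts")


# The first lookup returns a gendisk that is being torn down, the second
# returns the replacement; md_open only accepts the current one.
lookups = iter(["stale gendisk", "fresh gendisk"])
current = "fresh gendisk"

result = blkdev_get(
    lambda: next(lookups),
    lambda disk: 0 if disk == current else -ERESTARTSYS,
)
print(result)
```

In the patch the wait for the old gendisk to disappear is what bounds the loop to one retry; the max_tries argument here is only a safety net for the model.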
NeilBrown [Thu, 8 Jan 2009 21:31:09 +0000 (08:31 +1100)]
md: centralise all freeing of an 'mddev' in 'md_free'
md_free is the .release handler for the md kobj_type.
So it makes sense to release all the objects referenced by
the mddev in there, rather than just prior to calling kobject_put
for what we think is the last time.
NeilBrown [Thu, 8 Jan 2009 21:31:08 +0000 (08:31 +1100)]
md: move allocation of ->queue from mddev_find to md_probe
It is more balanced to just do simple initialisation in mddev_find,
which allocates and links a new md device, and leave all the
more sophisticated allocation to md_probe (which calls mddev_find).
md_probe already allocated the gendisk. It should allocate the
queue too.
Cheng Renquan [Thu, 8 Jan 2009 21:31:08 +0000 (08:31 +1100)]
md: need another print_sb for mdp_superblock_1
md_print_devices is called in two code paths: MD_BUG(...), and md_ioctl
with PRINT_RAID_DEBUG. It dumps information about all md devices
in use.
However, it wrongly processes two types of superblock as one:
The header file <linux/raid/md_p.h> defines two types of superblock:
struct mdp_superblock_s (typedefed as mdp_super_t) for md with
metadata 0.90, and struct mdp_superblock_1 for md with metadata
1.0 and later.
These two types of superblock are very different.
The md_print_devices code processed them both as mdp_super_t, which would
lead to a wrong information dump like:
this md0 (metadata 1.2) information dumping is exactly according to struct
mdp_superblock_1.
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Dan Williams <dan.j.williams@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Cheng Renquan [Thu, 8 Jan 2009 21:31:08 +0000 (08:31 +1100)]
md: use list_for_each_entry macro directly
The rdev_for_each macro defined in <linux/raid/md_k.h> is identical to
list_for_each_entry_safe from <linux/list.h>; it should be defined to
use list_for_each_entry_safe instead of reinventing the wheel.
But some calls to list_for_each_entry_safe don't really need a safe version;
a direct list_for_each_entry is enough, which saves a temp
variable (tmp) in every function that used rdev_for_each.
In this patch, most rdev_for_each loops are replaced by list_for_each_entry,
saving many tmp vars in total; the safe version is used only in the
remaining cases that call list_del to delete an entry.
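Why deletion needs the safe variant, and why read-only loops do not, can be shown with a tiny linked list. This is an analogy in Python, not the kernel macros: Node, build(), for_each() and for_each_safe() are hypothetical names, and setting node.next to None stands in for list_del() invalidating the removed entry's links.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


def build(values):
    """Build a singly linked list and return its head."""
    head = None
    for value in reversed(values):
        node = Node(value)
        node.next = head
        head = node
    return head


def for_each(head):
    """Plain traversal: reads node.next after the loop body has run."""
    node = head
    while node is not None:
        yield node
        node = node.next


def for_each_safe(head):
    """Safe traversal: saves the successor (the 'tmp') before the body runs."""
    node = head
    while node is not None:
        tmp = node.next
        yield node
        node = tmp


def visit_and_unlink(iterator):
    """Loop body that invalidates each visited node, like a list_del()."""
    seen = []
    for node in iterator:
        seen.append(node.value)
        node.next = None
    return seen


print([n.value for n in for_each(build([1, 2, 3]))])  # read-only loop is fine
print(visit_and_unlink(for_each(build([1, 2, 3]))))   # loses the rest of the list
print(visit_and_unlink(for_each_safe(build([1, 2, 3]))))
```

The extra tmp variable is the only cost of the safe form, which is why dropping it is worthwhile wherever the loop body never deletes an entry.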
Andre Noll [Thu, 8 Jan 2009 21:31:08 +0000 (08:31 +1100)]
md: raid0: make hash_spacing and preshift sector-based.
This patch renames the hash_spacing and preshift members of struct
raid0_private_data to spacing and sector_shift respectively and
changes the semantics as follows:
We always have spacing = 2 * hash_spacing. In case
sizeof(sector_t) > sizeof(u32) we also have sector_shift = preshift + 1
while sector_shift = preshift = 0 otherwise.
Note that the values of nb_zone and zone are unaffected by these changes
because in the sector_div() preceding the assignment of these two
variables both arguments double.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
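The claim that the results are unaffected follows from integer arithmetic: doubling both the dividend and the divisor leaves the quotient unchanged, and shifting a doubled value right by one extra bit gives the original result. A small check, with made-up example values and hypothetical helper names (only the quotient of sector_div() is modelled, not its in-place update or remainder):

```python
def kb_to_sectors(kb: int) -> int:
    """Convert a count of 1K blocks to 512-byte sectors."""
    return kb * 2


def quotient(dividend: int, divisor: int) -> int:
    """Integer quotient, the part of the division used for the zone index."""
    return dividend // divisor


# Made-up example values in 1K units.
size_kb, hash_spacing, preshift = 1_000_003, 4096, 3

# The sector-based equivalents described in the commit message.
spacing = 2 * hash_spacing
sector_shift = preshift + 1

print(quotient(size_kb, hash_spacing), quotient(kb_to_sectors(size_kb), spacing))
print(size_kb >> preshift, kb_to_sectors(size_kb) >> sector_shift)
```

Each printed pair shows the same number twice, once computed in 1K blocks and once in sectors.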
Andre Noll [Thu, 8 Jan 2009 21:31:07 +0000 (08:31 +1100)]
md: raid0 create_strip_zones(): Make two local variables sector-based.
current_offset and curr_zone_offset stored the corresponding offsets
as 1K quantities. Rename them to current_start and curr_zone_start
to match the naming of struct strip_zone and store the offsets as
sector counts.
Also, add KERN_INFO to the printk() affected by this change to make
checkpatch happy.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Tejun Heo [Thu, 8 Jan 2009 21:29:20 +0000 (16:29 -0500)]
pata_hpt366: reimplement mode programming
Reimplement mode programming logic of pata_hpt366 such that it's
identical to that of IDE hpt366 driver. The differences were...
* pata_hpt366 used 0xCFFF8FFFF to mask PIO modes and 0x3FFFFFFF for DMA
modes. IDE hpt366 uses 0xC1F8FFFF for PIO, 0x303800FF for MWDMA and
0x30070000 for UDMA.
* pata_hpt366 doesn't set 0x08000000 for PIO unless it's already set
and always turns it on for MWDMA/UDMA. IDE hpt366 doesn't bother
with the bit. It always uses what was there.
* IDE hpt366 always clears 0xC0000000. pata_hpt366 doesn't.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
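The IDE-style behaviour listed above can be expressed as a masked register update. This is a sketch of the three bullet points under stated assumptions, not the driver code: the program_mode() helper and the register and timing values in the example are made up, while the masks are the ones quoted in the message.

```python
U32 = 0xFFFFFFFF

PIO_MASK = 0xC1F8FFFF
MWDMA_MASK = 0x303800FF
UDMA_MASK = 0x30070000
ALWAYS_CLEAR = 0xC0000000


def program_mode(reg: int, timing: int, mask: int) -> int:
    """Replace only the bits selected by `mask`, then clear 0xC0000000."""
    reg = (reg & ~mask & U32) | (timing & mask)
    return reg & ~ALWAYS_CLEAR & U32


# Bit 0x08000000 is outside all three masks, so whatever was there survives.
old = 0x08000000
print(hex(program_mode(old, 0x00000000, MWDMA_MASK)))
print(hex(program_mode(0xFFFFFFFF, 0x00000000, UDMA_MASK)))
```

Because none of the masks covers 0x08000000, the update never touches that bit, which matches the "always uses what was there" description.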
Sonic Zhang [Wed, 7 Jan 2009 16:37:12 +0000 (00:37 +0800)]
pata_platform: __pata_platform_remove() shouldn't be in discard section
--
UPD include/linux/compile.h
`___pata_platform_remove' referenced in section `__ksymtab_gpl' of
drivers/built-in.o: defined in discarded section `.devexit.text' of
drivers/built-in.o
make: *** [.tmp_vmlinux1] Error 1
--
__pata_platform_remove() should not be in a discarded section.
__pata_platform_remove(struct device *dev) is invoked in both
pata_platform.c and pata_of_platform.c by the remove function defined in
the discarded section ".devexit.text". An exported function should not be
put into a discarded section.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Fri, 2 Jan 2009 03:04:48 +0000 (12:04 +0900)]
ata_piix: save, use saved and restore IOCFG
Certain ACPI implementations mess up IOCFG on _STM making libata
detect cable type incorrectly after a suspend/resume cycle. This
patch makes ata_piix save IOCFG on attach, use the saved value for
things which aren't dynamic and restore it on detach so that the next
driver also gets the BIOS initialized value.
This patch contains the following changes.
* makes ich_pata_cable_detect() use saved_iocfg.
* make piix_iocfg_bit18_quirk() take @host and use saved_iocfg.
* hpriv allocation moved upwards to save iocfg before doing anything
else.
This fixes bz#11879. Andreas Mohr reported and diagnosed the problem.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andreas Mohr <andi@lisas.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Alan Cox [Mon, 5 Jan 2009 14:12:51 +0000 (14:12 +0000)]
pata_hpt3x3: Workarounds for chipset
Correct the DMA bit flags (UDMA and MWDMA were swapped)
Add workarounds so that we clear ERR and INTR bits before issuing a DMA
Add workarounds so that we stop a live DMA before touching the CTL register
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
async: make async_synchronize_full() more serializing
It turns out that there are real problems with allowing async
tasks that are scheduled from async tasks to run after
async_synchronize_full() returns.
This patch makes the _full variant stricter, a complete
synchronization. Later I might need to add back a lighter
form of synchronization for other uses, but not right now.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Thu, 8 Jan 2009 19:50:23 +0000 (11:50 -0800)]
regulator: fix kernel-doc warnings
Fix kernel-doc warnings in regulator/driver.h:
Warning(linux-next-20090108//include/linux/regulator/driver.h:95): Excess struct/union/enum/typedef member 'set_current' description in 'regulator_ops'
Warning(linux-next-20090108//include/linux/regulator/driver.h:95): Excess struct/union/enum/typedef member 'get_current' description in 'regulator_ops'
Warning(linux-next-20090108//include/linux/regulator/driver.h:124): No description found for parameter 'irq'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
cc: Liam Girdwood <lrg@slimlogic.co.uk>
cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
David Brownell [Wed, 31 Dec 2008 12:54:19 +0000 (12:54 +0000)]
regulator: catch some registration errors
Prevent registration of duplicate "struct regulator" names.
They'd be unavailable, and clearly indicate something wrong.
[Edited to remove check for NULL consumer device until we have a
solution for things like cpufreq -- broonie]
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Mark Brown [Wed, 31 Dec 2008 12:52:44 +0000 (12:52 +0000)]
regulator: Add basic DocBook manual
Add a basic DocBook manual for the regulator API. This is much more
skeletal than the existing text documentation; the main benefit is to
provide a skeleton for automatic generation of a manual based on the
kerneldoc for the API.
Since large portions of the text are lifted from the existing text format
documentation written by Liam Girdwood, much of the credit belongs to
him.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Mark Brown [Wed, 31 Dec 2008 12:52:43 +0000 (12:52 +0000)]
regulator: Fix some kerneldoc rendering issues
There are some minor textual changes in here as well, mostly to enable()
and disable() but the primary goal of these changes is to fix
misrenderings of the kerneldoc documentation for the regulator API.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Some of the internal structures have no kerneldoc but still have the ** at
the start of the comment marking them for documentation. Remove the
annotation until some is added.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
David Brownell [Tue, 2 Dec 2008 05:50:13 +0000 (21:50 -0800)]
regulator: init/link earlier
Move regulator earlier in link sequence.
The regulator core currently initializes as a core_initcall() to be
available early ... but then it links way late, throwing away that
benefit, so regulators available at e.g. subsys_initcall() are not
available to subsystems which need to use them.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
David Brownell [Wed, 12 Nov 2008 01:39:02 +0000 (17:39 -0800)]
regulator: sysfs attribute reduction (v2)
Clean up the sysfs interface to regulators by only exposing the
attributes that can be properly displayed. For example: when a
particular regulator method is needed to display the value, only
create that attribute when that method exists.
This cleaned-up interface is much more comprehensible. Most
regulators only support a subset of the possible methods, so
often more than half the attributes would be meaningless. Many
"not defined" values are no longer necessary. (But handling
of out-of-range values still looks a bit iffy.)
Documentation is updated to reflect that few of the attributes
are *always* present, and to briefly explain why a regulator may
not have a given attribute.
This adds object code, about a dozen bytes more than was removed
by the preceding patch, but saves a bunch of per-regulator data
associated with the now-removed attributes. So there's a net
reduction in memory footprint.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
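The rule "only create that attribute when that method exists" can be sketched as a lookup from attribute to the method needed to display it. This is an illustration, not the regulator core: the visible_attributes() helper, the attribute-to-method table and the FixedRegulator class are assumptions made for the example.

```python
# Attributes assumed to be shown for every regulator in this sketch.
ALWAYS_PRESENT = ["name", "num_users", "type"]

# Illustrative mapping: attribute name -> method required to display it.
CONDITIONAL = {
    "microvolts": "get_voltage",
    "microamps": "get_current_limit",
    "opmode": "get_mode",
    "state": "is_enabled",
}


def visible_attributes(ops) -> list:
    """List the attributes to create for a regulator with operations `ops`."""
    optional = [
        attr for attr, method in CONDITIONAL.items()
        if callable(getattr(ops, method, None))
    ]
    return ALWAYS_PRESENT + optional


class FixedRegulator:
    """Hypothetical regulator that can only report voltage and enable state."""

    def get_voltage(self):
        return 3_300_000

    def is_enabled(self):
        return True


print(visible_attributes(FixedRegulator()))
```

A regulator that implements few methods ends up with few attributes, so no placeholder "not defined" values are needed for the ones it cannot report.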