Jan Kiszka [Tue, 30 Sep 2008 08:41:06 +0000 (10:41 +0200)]
KVM: x86: Silence various LAPIC-related host kernel messages
KVM-x86 dumps a lot of debug messages that have no meaning for normal
operation:
- INIT de-assertion is ignored
- SIPIs are sent and received
- APIC writes are unaligned or < 4 byte long
(Windows Server 2003 triggers this on SMP)
Degrade them to true debug messages, keeping the host kernel log clean
for real problems.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
The PIC code makes little effort to avoid kvm_vcpu_kick(), resulting in
unnecessary guest exits in some conditions.
For example, if the timer interrupt is routed through the IOAPIC, IRR
for IRQ 0 will get set but not cleared, since the APIC is handling the
acks.
This means that everytime an interrupt < 16 is triggered, the priority
logic will find IRQ0 pending and send an IPI to vcpu0 (in case IRQ0 is
not masked, which is Linux's case).
Introduce a new variable isr_ack to represent the IRQ's for which the
guest has been signalled / cleared the ISR. Use it to avoid more than
one IPI per trigger-ack cycle, in addition to the avoidance when ISR is
set in get_priority().
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Allow guest pagetables to go out of sync. Instead of emulating write
accesses to guest pagetables, or unshadowing them, we un-write-protect
the page table and allow the guest to modify it at will. We rely on
invlpg executions to synchronize individual ptes, and will synchronize
the entire pagetable on tlb flushes.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
There is not much point in write protecting large mappings. This
can only happen when a page is shadowed during the window between
is_largepage_backed and mmu_lock acquision. Zap the entry instead, so
the next pagefault will find a shadowed page via is_largepage_backed and
fallback to 4k translations.
Simplifies out of sync shadow.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Harvey Harrison [Tue, 23 Sep 2008 18:01:45 +0000 (11:01 -0700)]
x86: pvclock: fix shadowed variable warning
arch/x86/kernel/pvclock.c:102:6: warning: symbol 'tsc_khz' shadows an earlier one
include/asm/tsc.h:18:21: originally declared here
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@redhat.com>
In Tukwila processor, VT-i has been enhanced in its
implementation, it is often called VT-i2 techonology.
With VTi-2 support, virtulization performance should be
improved. In this patch, we added the related stuff to
support kvm/ia64 for Tukwila processors.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
An uniform entry kvm_vps_entry is added for
vps_sync_write/read, vps_resume_handler/guest,
and branches to differnt PAL service according to the offset.
Singed-off-by: Anthony Xu <anthony.xu@intel.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Glauber Costa [Thu, 18 Sep 2008 02:16:59 +0000 (23:16 -0300)]
KVM: Don't destroy vcpu in case vcpu_setup fails
One of vcpu_setup responsibilities is to do mmu initialization.
However, in case we fail in kvm_arch_vcpu_reset, before we get the
chance to init mmu. OTOH, vcpu_destroy will attempt to destroy mmu,
triggering a bug. Keeping track of whether or not mmu is initialized
would unnecessarily complicate things. Rather, we just make return,
making sure any needed uninitialization is done before we return, in
case we fail.
Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
KVM: don't enter guest after SIPI was received by a CPU
The vcpu should process pending SIPI message before entering guest mode again.
kvm_arch_vcpu_runnable() returns true if the vcpu is in SIPI state, so
we can't call it here.
Signed-off-by: Gleb Natapov <gleb@qumranet.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Convert gfn_to_pfn to use get_user_pages_fast, which can do lockless
pagetable lookups on x86. Kernel compilation on 4-way guest is 3.7%
faster on VMX.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
kvm_vm_fault is invoked with mmap_sem held in read mode. Since gfn_to_page
will be converted to get_user_pages_fast, which requires this lock NOT
to be held, switch to opencoded get_user_pages.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Amit Shah [Tue, 16 Sep 2008 15:04:28 +0000 (18:04 +0300)]
KVM: Device Assignment: Free device structures if IRQ allocation fails
When an IRQ allocation fails, we free up the device structures and
disable the device so that we can unregister the device in the
userspace and not expose it to the guest at all.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Based on a patch by: Kay, Allen M <allen.m.kay@intel.com>
This patch enables PCI device assignment based on VT-d support.
When a device is assigned to the guest, the guest memory is pinned and
the mapping is updated in the VT-d IOMMU.
[Amit: Expose KVM_CAP_IOMMU so we can check if an IOMMU is present
and also control enable/disable from userspace]
Signed-off-by: Kay, Allen M <allen.m.kay@intel.com> Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Amit Shah <amit.shah@qumranet.com> Acked-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
Kay, Allen M [Tue, 9 Sep 2008 15:37:29 +0000 (18:37 +0300)]
VT-d: Changes to support KVM
This patch extends the VT-d driver to support KVM
[Ben: fixed memory pinning]
[avi: move dma_remapping.h as well]
Signed-off-by: Kay, Allen M <allen.m.kay@intel.com> Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Amit Shah <amit.shah@qumranet.com> Acked-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
Marc Zyngier [Wed, 15 Oct 2008 11:54:05 +0000 (12:54 +0100)]
[ARM] 5308/1: Fix Viper ISA IRQ handling
The ISA IRQ renumbering broke the Viper ISA code in interesting ways.
It originally assumed that ISA interrupt were numbered in the order that
is defined by the CPLD registers. Unfortunately, this is no longer the
case.
Furthermore, the viper_irq_handler() function being a chained IRQ
handler, it must ACK the interrupt by itself, or the handler will be
immediately reentered, with the expected damages.
This fix was made possible thanks to the help of David Raeman, who
provided debug information and tested each version of this patch.
Tested-by: David Raeman <david.raeman@gmail.com> Signed-off-by: Marc Zyngier <maz@misterjones.org> Acked-by: Eric Miao <eric.miao@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Mike Rapoport [Wed, 15 Oct 2008 10:07:21 +0000 (11:07 +0100)]
[ARM] 5306/1: pxa: fix build error on CM-X270
Fix build for CM-X2XX with CONFIG_RTC_DRV_V3020 unset:
CC arch/arm/mach-pxa/cm-x270.o
/mnt/sdb1/git/linux-2.6-arm/arch/arm/mach-pxa/cm-x270.c: In function 'cmx270_init':
/mnt/sdb1/git/linux-2.6-arm/arch/arm/mach-pxa/cm-x270.c:338: error: implicit declaration of function 'cmx270_init_rtc'
make[2]: *** [arch/arm/mach-pxa/cm-x270.o] Error 1
Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Stephen Rothwell [Tue, 14 Oct 2008 22:09:21 +0000 (09:09 +1100)]
md: build failure due to missing delay.h
Today's linux-next build (powerpc ppc64_defconfig) failed like this:
drivers/md/raid1.c: In function 'sync_request':
drivers/md/raid1.c:1759: error: implicit declaration of function 'msleep_interruptible'
make[3]: *** [drivers/md/raid1.o] Error 1
make[3]: *** Waiting for unfinished jobs....
drivers/md/raid10.c: In function 'sync_request':
drivers/md/raid10.c:1749: error: implicit declaration of function 'msleep_interruptible'
make[3]: *** [drivers/md/raid10.o] Error 1
drivers/md/md.c: In function 'md_do_sync':
drivers/md/md.c:5915: error: implicit declaration of function 'msleep'
Caused by commit 6caa3b0bbdb474647f6bdd8a958ffc46f78d8d58 ("md: Remove
unnecessary #includes, #defines, and function declarations"). I added
the following patch.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NeilBrown <neilb@suse.de>
Wim Van Sebroeck [Wed, 15 Oct 2008 08:53:06 +0000 (08:53 +0000)]
[WATCHDOG] ib700wdt.c - fix buffer_underflow bug
This fixes Bug 11399:
if ibwdt_set_heartbeat(int t) is called with value 30 then
the check "if ((t < 0) || (t > 30))" in ibwdt_set_heartbeat
is not going to fail because t == 30, but in the loop, the
check wd_times[i] > t is never going to be true because
none of the wd_times are greater than the value of t (i.e. 30).
So we are exiting the loop with i == -1 and therefore setting
wd_margin to -1 which is wrong.
Reported-by: Zvonimir Rakamaric <zrakamar@cs.ubc.ca> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Add DstAcc operand type. That means that there are 4 bits now for
DstMask.
"In the good old days cpus would have only one register that was able to
fully participate in arithmetic operations, typically called A for
Accumulator. The x86 retains this tradition by having special, shorter
encodings for the A register (like the cmp opcode), and even some
instructions that only operate on A (like mul).
SrcAcc and DstAcc would accommodate these instructions by decoding A
into the corresponding 'struct operand'."
-- Avi Kivity
Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
Avi Kivity [Thu, 11 Sep 2008 16:47:13 +0000 (19:47 +0300)]
KVM: x86 emulator: fix jmp r/m64 instruction
jmp r/m64 doesn't require the rex.w prefix to indicate the operand size
is 64 bits. Set the Stack attribute (even though it doesn't involve the
stack, really) to indicate this.
Offline or uninitialized vcpu's can be executed if requested to perform
userspace work.
Follow Avi's suggestion to handle halted vcpu's in the main loop,
simplifying kvm_emulate_halt(). Introduce a new vcpu->requests bit to
indicate events that promote state from halted to running.
Also standardize vcpu wake sites.
Signed-off-by: Marcelo Tosatti <mtosatti <at> redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
Sheng Yang [Mon, 1 Sep 2008 11:41:20 +0000 (19:41 +0800)]
KVM: MMU: Modify kvm_shadow_walk.entry to accept u64 addr
EPT is 4 level by default in 32pae(48 bits), but the addr parameter
of kvm_shadow_walk->entry() only accept unsigned long as virtual
address, which is 32bit in 32pae. This result in SHADOW_PT_INDEX()
overflow when try to fetch level 4 index.
Fix it by extend kvm_shadow_walk->entry() to accept 64bit addr in
parameter.
Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
Heiko Carstens pointed out, that its safer to activate working facilities
instead of disabling problematic facilities. The new code uses the host
facility bits and masks it with known good ones.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
Joerg Roedel [Fri, 29 Aug 2008 09:52:07 +0000 (11:52 +0200)]
KVM: add MC5_MISC msr read support
Currently KVM implements MC0-MC4_MISC read support. When booting Linux this
results in KVM warnings in the kernel log when the guest tries to read
MC5_MISC. Fix this warnings with this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
Avi Kivity [Wed, 27 Aug 2008 17:01:04 +0000 (20:01 +0300)]
KVM: MMU: Fix setting the accessed bit on non-speculative sptes
The accessed bit was accidentally turned on in a random flag word, rather
than, the spte itself, which was lucky, since it used the non-EPT compatible
PT_ACCESSED_MASK.
Fix by turning the bit on in the spte and changing it to use the portable
accessed mask.
Avi Kivity [Tue, 26 Aug 2008 14:31:31 +0000 (17:31 +0300)]
KVM: Don't call get_user_pages(.force = 1)
This is esoteric and only needed to break COW on MAP_SHARED mappings. Since
KVM no longer does these sorts of mappings, breaking COW on them is no longer
necessary.