From: Evgeny Yakovlev <insoreiges@gmail.com>
To: Laszlo Ersek <lersek@redhat.com>
Cc: edk2-devel@ml01.01.org, eyakovlev@virtuozzo.com,
den@virtuozzo.com, Jeff Fan <jeff.fan@intel.com>
Subject: Re: OvmfPkg: VM crashed trying to write to RO memory from CommonInterruptEntry
Date: Tue, 22 Nov 2016 16:58:12 +0300
Message-ID: <CAM0BJjQDdwPULhik-F2d77jpWKUX=oyHnvS9FMU+ECusT4SeGQ@mail.gmail.com>
In-Reply-To: <2340021c-4bcb-2622-07a8-6e6173f94d81@redhat.com>
Wow, that is more than I expected :)
> I wonder if you started to see this issue very recently.
Very recently. However, we use a fairly old OVMF build, from around 2015.
> OVMF debug log
Sorry, we did not have it enabled when the VM crashed, and these crashes are
very rare. We will try to capture it when it happens again.
> - your host CPU model,
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
stepping : 7
> - the host kernel (KVM) version,
Our kernel is roughly based on RHEL7.2 (kernel version 3.10.0-327.36.1). We
also have some upstream KVM patches backported.
> - the guest CPU model,
-cpu SandyBridge,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+osxsave,-arat,-xsaveopt,-xgetbv1,-vmx,-xsavec,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vpindex,hv_runtime,hv_synic,hv_stimer,hv_reset,hv_crash
> - the guest CPU topology.
8 sockets, 1 core per socket, 1 thread per core
Hope that helps!
2016-11-22 16:41 GMT+03:00 Laszlo Ersek <lersek@redhat.com>:
> Hello Evgeny,
>
> On 11/22/16 13:57, Evgeny Yakovlev wrote:
> > We are running windows UEFI-based VMs on QEMU/KVM with OvmfPkg.
> >
> > Very rarely we are experiencing a crash when VM tries to write to RO
> > memory very early during UEFI boot process.
> >
> > Crash happens when VM tries to execute this code in interrupt handler:
> > https://github.com/tianocore/edk2/blob/master/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.asm#L244-L246
> >
> >
> > fxsave [rdi], where RDI = 0xffe60
> >
> > Which is bad - it points to ISA BIOS F-segment area.
> >
> > This memory was mapped by qemu for read only access, which is reflected
> > in KVM EPT:
> > 00000000000e0000-00000000000fffff (prio 1, R-): isa-bios
> >
> > This is a very early IRQ0 interrupt, presumably during early
> > initialization phase (Sec or Pei).
> >
> > Looks like CommonInterruptHandler does not switch to a separate stack and
> > works on interrupted context's stack, which was fairly close to 1MB
> > boundary when IRQ0 fired (RSP around 1002c0). When CommonInterruptEntry
> > reached highlighted code it subtracted 512 bytes from current RSP which
> > dropped to 0xffe60, below 1MB and into QEMU RO region.
> >
> > We were figuring out how to best fix this. Possible solutions are to
> > switch to a separate stack in CommonInterruptEntry, relocate early
> > OvmfPkg stack to somewhere farther away from 1MB, to run with interrupts
> > disabled until we reach a later phase or maybe something else.
> >
> > Any comments would be very appreciated!
>
> I wonder if you started to see this issue very recently.
>
> I suspect (hope!) that the symptoms you are experiencing are a
> consequence of a bug in UefiCpuPkg that I've debugged and fixed just
> today. (I hope to post the patches today.)
>
> While testing those patches on your end will of course tell us if your
> issue has the same root cause, you could gather a few more symptoms even
> before I get around posting the patches. The bug that I'm working on has
> extremely varied crash symptoms (basically the APs wander off into the
> weeds), and some of those symptoms have involved CpuExceptionHandlerLib.
> The point is, by the time we get into CpuExceptionHandlerLib, all is
> lost -- it is executing on an AP whose state is corrupt anyway. The
> fxsave symptom is a red herring, most likely.
>
> CpuExceptionHandlerLib works fine otherwise, especially when invoked
> from the BSP -- we've used the output dumped by CpuExceptionHandlerLib
> to the serial port several times to track down issues.
>
> So, my request is that you please capture the OVMF debug log (please see
> the "OvmfPkg/README" file for how). I'm curious if it crashes where and
> how I suspect it crashes.
>
> Also, it would help if you provided
> - your host CPU model,
> - the host kernel (KVM) version,
> - the guest CPU model,
> - the guest CPU topology.
>
> Thanks!
> Laszlo
>
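For reference, the addresses in the quoted report can be sanity-checked with a short Python sketch. This is only an illustration, not code from OVMF or QEMU: the function names are hypothetical, the read-only range is copied from the quoted "isa-bios" memory map line, and 512 bytes is the size of the FXSAVE save area. The starting RSP is only known approximately ("around 1002c0"), so the stack-usage figure is approximate as well.

```python
# Hypothetical helpers for checking the addresses from the crash report.
# The range below is taken from the quoted line:
#   00000000000e0000-00000000000fffff (prio 1, R-): isa-bios
ISA_BIOS_START = 0xE0000
ISA_BIOS_END = 0xFFFFF          # inclusive end of the read-only region
ONE_MB = 0x100000
FXSAVE_AREA_SIZE = 512          # bytes written by a single FXSAVE


def in_isa_bios(addr: int) -> bool:
    """Return True if addr lies inside the read-only isa-bios range."""
    return ISA_BIOS_START <= addr <= ISA_BIOS_END


def fxsave_ro_overlap(rdi: int) -> int:
    """Return how many bytes of the FXSAVE area at rdi fall in isa-bios."""
    lo = max(rdi, ISA_BIOS_START)
    hi = min(rdi + FXSAVE_AREA_SIZE, ISA_BIOS_END + 1)
    return max(0, hi - lo)


if __name__ == "__main__":
    rdi = 0xFFE60               # FXSAVE destination from the report
    rsp_at_irq = 0x1002C0       # approximate RSP when IRQ0 fired

    print("destination in isa-bios:", in_isa_bios(rdi))
    print("16-byte aligned:", rdi % 16 == 0)
    print("bytes landing in RO range:", fxsave_ro_overlap(rdi))
    print("save area end:", hex(rdi + FXSAVE_AREA_SIZE))
    print("approx. stack used before fxsave:", rsp_at_irq - rdi)
```

With the reported values, the save area starts 416 bytes below the 1MB boundary and ends at 0x100060, so the first 416 of its 512 bytes fall in the read-only range. The gap between the approximate RSP and the FXSAVE destination is about 1120 bytes, so the 512-byte save area is only part of what the handler placed on the interrupted stack.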