From: "Laszlo Ersek" <lersek@redhat.com>
To: Andrew Fish <afish@apple.com>, devel@edk2.groups.io
Cc: wuchenye1995 <wuchenye1995@gmail.com>,
zhoujianjay <zhoujianjay@gmail.com>,
"Alex Bennée" <alex.bennee@linaro.org>,
berrange@redhat.com,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
qemu-devel@nongnu.org, discuss <discuss@edk2.groups.io>
Subject: Re: [edk2-devel] A problem with live migration of UEFI virtual machines
Date: Wed, 26 Feb 2020 10:42:05 +0100
Message-ID: <6666a886-720d-1ead-8f7e-13e65dcaaeb4@redhat.com>
In-Reply-To: <8F42F6F1-A65D-490D-9F2F-E12746870B29@apple.com>
Hi Andrew,
On 02/25/20 22:35, Andrew Fish wrote:
> Laszlo,
>
> The FLASH offsets changing breaking things makes sense.
>
> I now realize this is like updating the EFI ROM without rebooting the
> system. Thus changes in how the new EFI code works is not the issue.
>
> Is this migration event visible to the firmware? Traditionally the
> NVRAM is a region in the FD so if you update the FD you have to skip
> NVRAM region or save and restore it. Is that activity happening in
> this case? Even if the ROM layout does not change how do you not lose
> the contents of the NVRAM store when the live migration happens? Sorry
> if this is a remedial question but I'm trying to learn how this
> migration works.
With live migration, the running guest doesn't notice anything. This is
a general requirement for live migration (regardless of UEFI or flash).
You are very correct to ask about "skipping" the NVRAM region. With the
approach that OvmfPkg originally supported, live migration would simply
be unfeasible. The "build" utility would produce a single (unified)
OVMF.fd file, which would contain both NVRAM and executable regions, and
the guest's variable updates would modify the one file that would exist.
This is inappropriate even without considering live migration, because
OVMF binary upgrades (package updates) on the virtualization host would
force guests to lose their private variable stores (NVRAMs).
Therefore, the "build" utility produces "split" files too, in addition
to the unified OVMF.fd file. Namely, OVMF_CODE.fd and OVMF_VARS.fd.
OVMF.fd is simply the concatenation of the two, with the varstore first:
$ cat OVMF_VARS.fd OVMF_CODE.fd | cmp - OVMF.fd
[prints nothing]
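Going the other way (splitting a unified image) is a matter of cutting at the varstore size. A minimal sketch follows, using random stand-in files rather than real firmware; the sizes are made up for the demo, since the real ones depend on the build.

```sh
# Sketch: split a unified image back into its varstore and code parts.
# The sizes are hypothetical; real builds define their own.
workdir=$(mktemp -d)
vars_size=1024      # hypothetical varstore size in bytes
code_size=4096      # hypothetical code size in bytes

# Stand-ins for the build outputs (random bytes, not firmware).
head -c "$vars_size" /dev/urandom > "$workdir/OVMF_VARS.fd"
head -c "$code_size" /dev/urandom > "$workdir/OVMF_CODE.fd"
cat "$workdir/OVMF_VARS.fd" "$workdir/OVMF_CODE.fd" > "$workdir/OVMF.fd"

# The varstore occupies the first $vars_size bytes, the code the rest.
head -c "$vars_size" "$workdir/OVMF.fd" > "$workdir/split_VARS.fd"
tail -c "+$((vars_size + 1))" "$workdir/OVMF.fd" > "$workdir/split_CODE.fd"

cmp "$workdir/split_VARS.fd" "$workdir/OVMF_VARS.fd" &&
    cmp "$workdir/split_CODE.fd" "$workdir/OVMF_CODE.fd" &&
    echo "split matches"
```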
When you define a new domain (VM) on a virtualization host, the domain
definition saves a reference (pathname) to the OVMF_CODE.fd file.
However, the OVMF_VARS.fd file (the variable store *template*) is not
directly referenced; instead, it is *copied* into a separate (private)
file for the domain.
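The copy step could look roughly like the sketch below. The function name and all paths are hypothetical, and this is not how any particular management tool implements it; the point is only that the template is copied once and never overwrites a varstore the domain already has.

```sh
# instantiate_varstore TEMPLATE DEST   (hypothetical helper)
# Copy the varstore template to the domain's private file, but only if
# the domain does not have one yet, so existing variables survive.
instantiate_varstore() {
    if [ -e "$2" ]; then
        echo "keeping existing varstore: $2"
    else
        cp -- "$1" "$2"
        echo "created varstore: $2"
    fi
}

# Demo with stand-in files in a scratch directory.
demo=$(mktemp -d)
printf 'pristine-template' > "$demo/OVMF_VARS.fd"
instantiate_varstore "$demo/OVMF_VARS.fd" "$demo/guest1_VARS.fd"

# Simulate the guest writing variables, then "define" the domain again.
printf 'guest-wrote-variables' > "$demo/guest1_VARS.fd"
instantiate_varstore "$demo/OVMF_VARS.fd" "$demo/guest1_VARS.fd"
```

The second call leaves the guest's file alone, which is why a later package upgrade of the template does not touch existing domains.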
Furthermore, once booted, the guest has two pflash chips: one that maps
the firmware executable OVMF_CODE.fd read-only, and another that maps
its private varstore file read-write.
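On a raw QEMU command line, the two chips are commonly expressed as two "-drive if=pflash" options. The sketch below only prints them; the helper name and the paths are hypothetical, QEMU is not started, and pathnames containing commas would need extra escaping that is not handled here.

```sh
# pflash_args CODE VARSTORE   (hypothetical helper)
# Print the two -drive options, one per line: unit 0 maps the shared
# firmware executable read-only, unit 1 maps the domain's private
# varstore read-write.
pflash_args() {
    printf '%s\n' \
        "-drive if=pflash,format=raw,unit=0,readonly=on,file=$1" \
        "-drive if=pflash,format=raw,unit=1,file=$2"
}

# Hypothetical paths; the options are printed only.
pflash_cmdline=$(pflash_args /usr/share/OVMF/OVMF_CODE.fd \
                             /var/lib/vms/guest1_VARS.fd)
printf '%s\n' "$pflash_cmdline"
```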
This makes it possible to upgrade OVMF_CODE.fd and OVMF_VARS.fd (via
package upgrades on the virt host) without messing with varstores that
were earlier instantiated from OVMF_VARS.fd. What's important here is
that the various constants in the new (upgraded) OVMF_CODE.fd file
remain compatible with the *old* OVMF_VARS.fd structure, across package
upgrades.
If that's not possible for introducing e.g. a new feature, then the
package upgrade must not overwrite the OVMF_CODE.fd file in place, but
must provide an additional firmware binary. This firmware binary can
then only be used by freshly defined domains (old domains cannot be
switched over automatically). Old domains can be switched over manually -- and only if
the sysadmin decides it is OK to lose the current variable store
contents. Then the old varstore file for the domain is deleted
(manually), the domain definition is updated, and then a new (logically
empty, pristine) varstore can be created from the *new* OVMF_2_VARS.fd
that matches the *new* OVMF_2_CODE.fd.
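The varstore part of that manual switch-over can be sketched as follows. The helper name and the paths are hypothetical, and the domain definition update (pointing the domain at OVMF_2_CODE.fd) is not shown. Note that this is destructive by design: the old variable contents are gone afterwards.

```sh
# switch_varstore VARSTORE NEW_TEMPLATE   (hypothetical helper)
# Discards the domain's current variables: remove the old varstore, then
# create a pristine one from the template matching the new firmware.
switch_varstore() {
    rm -f -- "$1"
    cp -- "$2" "$1"
}

# Demo with stand-in files in a scratch directory.
scratch=$(mktemp -d)
printf 'old-layout-with-guest-variables' > "$scratch/guest1_VARS.fd"
printf 'new-layout-pristine' > "$scratch/OVMF_2_VARS.fd"

switch_varstore "$scratch/guest1_VARS.fd" "$scratch/OVMF_2_VARS.fd"
cmp "$scratch/guest1_VARS.fd" "$scratch/OVMF_2_VARS.fd" &&
    echo "varstore reset from new template"
```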
During live migration, the "RAM-like" contents of both pflash chips are
migrated (the guest-side view of both chips remains the same, including
the case when the writeable chip happens to be in "programming mode",
i.e., during a UEFI variable write through the Fault Tolerant Write and
Firmware Volume Block(2) protocols).
Once live migration completes, QEMU dumps the full contents of the
writeable chip to the backing file (on the destination host). Going
forward, flash writes from within the guest are reflected to said
host-side file on-line, just like it happened on the source host before
live migration. If the file backing the r/w pflash chip is on NFS
(shared by both src and dst hosts), then this one-time dumping when the
migration completes is superfluous, but it's also harmless.
The interesting question is, what happens when you power down the VM on
the destination host (= post migration), and launch it again there, from
zero. In that case, the firmware executable file comes from the
*destination host* (it was never persistently migrated from the source
host, i.e. never written out on the dst). It simply comes from the OVMF
package that had been installed on the destination host, by the
sysadmin. However, the varstore pflash does reflect the permanent result
of the previous migration. So this is where things can fall apart, if
both firmware binaries (on the src host and on the dst host) don't agree
about the internal structure of the varstore pflash.
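As a crude pre-flight check before relaunching on the destination, one could compare the size of the migrated varstore against the destination's varstore template. This is only a heuristic and my own assumption, not something QEMU or OVMF does: a size mismatch hints that the two firmware builds use different layouts, while matching sizes prove nothing about the internal structure. The helper name and file names are hypothetical.

```sh
# same_size FILE_A FILE_B   (hypothetical helper)
# Succeeds if both files have the same byte count. Heuristic only:
# equal sizes do not prove that the varstore layouts are compatible.
same_size() {
    size_a=$(($(wc -c < "$1")))
    size_b=$(($(wc -c < "$2")))
    [ "$size_a" -eq "$size_b" ]
}

# Demo with zero-filled stand-ins of made-up sizes.
check=$(mktemp -d)
head -c 1024 /dev/zero > "$check/migrated_VARS.fd"   # came from the src host
head -c 2048 /dev/zero > "$check/dst_OVMF_VARS.fd"   # dst host's template

if same_size "$check/migrated_VARS.fd" "$check/dst_OVMF_VARS.fd"; then
    echo "sizes match (layout may still differ)"
else
    echo "size mismatch: dst firmware likely expects a different varstore"
fi
```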
Thanks
Laszlo