public inbox for devel@edk2.groups.io
From: "Andrew Fish" <afish@apple.com>
To: devel@edk2.groups.io, lersek@redhat.com
Cc: wuchenye1995 <wuchenye1995@gmail.com>,
	zhoujianjay <zhoujianjay@gmail.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	berrange@redhat.com,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	qemu-devel@nongnu.org, discuss <discuss@edk2.groups.io>
Subject: Re: [edk2-devel] A problem with live migration of UEFI virtual machines
Date: Thu, 27 Feb 2020 20:04:00 -0800
Message-ID: <284BFC25-8534-4147-8616-DE7C410DB681@apple.com>
In-Reply-To: <6666a886-720d-1ead-8f7e-13e65dcaaeb4@redhat.com>

> On Feb 26, 2020, at 1:42 AM, Laszlo Ersek <lersek@redhat.com> wrote:
> 
> Hi Andrew,
> 
> On 02/25/20 22:35, Andrew Fish wrote:
> 
>> Laszlo,
>> 
> >> It makes sense that changing the FLASH offsets breaks things.
>> 
> >> I now realize this is like updating the EFI ROM without rebooting the
> >> system.  Thus changes in how the new EFI code works are not the issue.
>> 
>> Is this migration event visible to the firmware? Traditionally the
>> NVRAM is a region in the FD so if you update the FD you have to skip
>> NVRAM region or save and restore it. Is that activity happening in
>> this case? Even if the ROM layout does not change how do you not lose
>> the contents of the NVRAM store when the live migration happens? Sorry
>> if this is a remedial question but I'm trying to learn how this
>> migration works.
> 
> With live migration, the running guest doesn't notice anything. This is
> a general requirement for live migration (regardless of UEFI or flash).
> 
> You are very correct to ask about "skipping" the NVRAM region. With the
> approach that OvmfPkg originally supported, live migration would simply
> be unfeasible. The "build" utility would produce a single (unified)
> OVMF.fd file, which would contain both NVRAM and executable regions, and
> the guest's variable updates would modify the one file that would exist.
> This is inappropriate even without considering live migration, because
> OVMF binary upgrades (package updates) on the virtualization host would
> force guests to lose their private variable stores (NVRAMs).
> 
> Therefore, the "build" utility produces "split" files too, in addition
> to the unified OVMF.fd file. Namely, OVMF_CODE.fd and OVMF_VARS.fd.
> OVMF.fd is simply the concatenation of the latter two.
> 
> $ cat OVMF_VARS.fd OVMF_CODE.fd | cmp - OVMF.fd
> [prints nothing]


Laszlo,

Thanks for the detailed explanation. 

Maybe I was overcomplicating this. Given your explanation, I think the part I was missing is that, in the split model, OVMF implies the FLASH layout from the sizes of OVMF_CODE.fd and OVMF_VARS.fd. So if OVMF_CODE.fd gets bigger, the variable store address changes from QEMU's point of view. In other words, it is the QEMU API that makes assumptions about the relative layout of the FD in the split model, and those assumptions are what keep a migration to a larger ROM from working. As currently defined, the -pflash API does not support changing the size of the ROM without moving the NVRAM.
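
To check my arithmetic, here is a minimal sketch of the layout as I understand it. It assumes QEMU maps the pflash chips back to back, downward from 4 GiB, with OVMF_CODE.fd on top and the varstore directly below it. The function name and the sizes are made up for illustration and are not taken from any real build.

```python
FOUR_GIB = 1 << 32  # assumed top of the flash mapping


def varstore_base(code_size: int, vars_size: int) -> int:
    """Guest-physical base of the varstore chip under the assumed layout.

    Assumption: the code chip ends at 4 GiB and the varstore chip ends
    exactly where the code chip begins.
    """
    code_base = FOUR_GIB - code_size
    return code_base - vars_size


# Illustrative sizes only (hypothetical, not from a real OVMF build).
VARS_SIZE = 0x20000        # 128 KiB varstore
OLD_CODE_SIZE = 0x1E0000   # 1920 KiB firmware image
NEW_CODE_SIZE = 0x3E0000   # a larger replacement image

old_base = varstore_base(OLD_CODE_SIZE, VARS_SIZE)
new_base = varstore_base(NEW_CODE_SIZE, VARS_SIZE)

# The varstore moves as soon as the code image grows.
print(hex(old_base), hex(new_base))
```

Under these assumptions the varstore base is a function of the code image size alone, which is the coupling I was missing.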

Given the above, it seems the two options are:
1) Pad OVMF_CODE.fd to be very large so there is room to grow.
2) Add some feature to QEMU that allows the variable store address to not be based on the size of OVMF_CODE.fd.
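
A rough sketch of what I mean by option 1, with several assumptions: the code chip is mapped ending at 4 GiB with the varstore directly below it, so the padding has to go in front of the image to leave the existing bytes at their addresses. The helper names, the 8 MiB slot, and the image sizes are all hypothetical, and 0xFF is used as the fill only because that is the erased state of NOR flash.

```python
FOUR_GIB = 1 << 32  # assumed top of the flash mapping


def pad_code_image(code: bytes, slot_size: int, fill: int = 0xFF) -> bytes:
    """Prepend fill bytes so the image occupies exactly slot_size bytes.

    Padding is placed in front (assumption: the chip is mapped ending at
    4 GiB, so the original bytes keep their guest-physical addresses).
    """
    if len(code) > slot_size:
        raise ValueError("code image is larger than the reserved slot")
    return bytes([fill]) * (slot_size - len(code)) + code


def varstore_base(code_size: int, vars_size: int) -> int:
    """Varstore base when the varstore sits directly below the code chip."""
    return FOUR_GIB - code_size - vars_size


SLOT = 0x800000              # hypothetical 8 MiB reserved for the code image
VARS_SIZE = 0x20000          # hypothetical 128 KiB varstore
old_code = bytes(0x1E0000)   # stand-ins for two firmware builds
new_code = bytes(0x3E0000)

old_padded = pad_code_image(old_code, SLOT)
new_padded = pad_code_image(new_code, SLOT)

# Both padded images have the same size, so the varstore no longer moves.
same_base = (varstore_base(len(old_padded), VARS_SIZE)
             == varstore_base(len(new_padded), VARS_SIZE))
print(same_base)
```

The cost is that the slot size becomes the new fixed constant, so it only postpones the problem until the code outgrows the slot.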

I did see this [1], and combined with your email, either I now understand or I'm still confused. :)

I'm not saying we need to change anything; I'm just trying to make sure I understand how OVMF and QEMU are tied together.

[1] https://www.redhat.com/archives/libvir-list/2019-January/msg01031.html

Thanks,

Andrew Fish




> 
> When you define a new domain (VM) on a virtualization host, the domain
> definition saves a reference (pathname) to the OVMF_CODE.fd file.
> However, the OVMF_VARS.fd file (the variable store *template*) is not
> directly referenced; instead, it is *copied* into a separate (private)
> file for the domain.
> 
> Furthermore, once booted, the guest has two pflash chips, one that maps the
> firmware executable OVMF_CODE.fd read-only, and another pflash chip that
> maps its private varstore file read-write.
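
If I'm reading this right, the per-domain setup could be sketched roughly as below: copy the template once into a private file, then hand QEMU a read-only chip for the code and a read-write chip for the private copy. The helper names and all file paths are hypothetical, and the unit numbering (code as unit 0, varstore as unit 1) is my assumption about how the two chips are ordered.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def instantiate_varstore(template: Path, private: Path) -> Path:
    """Copy the varstore template once; never overwrite an existing copy."""
    if not private.exists():
        shutil.copyfile(template, private)
    return private


def pflash_args(code: Path, varstore: Path) -> List[str]:
    """QEMU arguments for a read-only code chip and a read-write varstore."""
    return [
        "-drive", f"if=pflash,format=raw,unit=0,readonly=on,file={code}",
        "-drive", f"if=pflash,format=raw,unit=1,file={varstore}",
    ]


with tempfile.TemporaryDirectory() as tmp:
    template = Path(tmp) / "OVMF_VARS.fd"
    template.write_bytes(b"\xff" * 16)         # stand-in for the real template
    private = Path(tmp) / "guest1_VARS.fd"     # hypothetical per-domain file

    instantiate_varstore(template, private)
    private.write_bytes(b"guest variables!")   # the guest updates its own copy
    instantiate_varstore(template, private)    # a second call must not clobber it
    preserved = private.read_bytes()

    # The code path is a placeholder; it is only formatted, never opened.
    args = pflash_args(Path("/usr/share/OVMF/OVMF_CODE.fd"), private)

print(preserved, args)
```
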
> 
> This makes it possible to upgrade OVMF_CODE.fd and OVMF_VARS.fd (via
> package upgrades on the virt host) without messing with varstores that
> were earlier instantiated from OVMF_VARS.fd. What's important here is
> that the various constants in the new (upgraded) OVMF_CODE.fd file
> remain compatible with the *old* OVMF_VARS.fd structure, across package
> upgrades.
> 
> If that's not possible for introducing e.g. a new feature, then the
> package upgrade must not overwrite the OVMF_CODE.fd file in place, but
> must provide an additional firmware binary. This firmware binary can
> then only be used by freshly defined domains (old domains cannot be
> switched over automatically). Old domains can be switched over
> manually -- and only if
> the sysadmin decides it is OK to lose the current variable store
> contents. Then the old varstore file for the domain is deleted
> (manually), the domain definition is updated, and then a new (logically
> empty, pristine) varstore can be created from the *new* OVMF_2_VARS.fd
> that matches the *new* OVMF_2_CODE.fd.
> 
> 
> During live migration, the "RAM-like" contents of both pflash chips are
> migrated (the guest-side view of both chips remains the same, including
> the case when the writeable chip happens to be in "programming mode",
> i.e., during a UEFI variable write through the Fault Tolerant Write and
> Firmware Volume Block(2) protocols).
> 
> Once live migration completes, QEMU dumps the full contents of the
> writeable chip to the backing file (on the destination host). Going
> forward, flash writes from within the guest are reflected to said
> host-side file on-line, just like it happened on the source host before
> live migration. If the file backing the r/w pflash chip is on NFS
> (shared by both src and dst hosts), then this one-time dumping when the
> migration completes is superfluous, but it's also harmless.
> 
> The interesting question is, what happens when you power down the VM on
> the destination host (= post migration), and launch it again there, from
> zero. In that case, the firmware executable file comes from the
> *destination host* (it was never persistently migrated from the source
> host, i.e. never written out on the dst). It simply comes from the OVMF
> package that had been installed on the destination host, by the
> sysadmin. However, the varstore pflash does reflect the permanent result
> of the previous migration. So this is where things can fall apart, if
> both firmware binaries (on the src host and on the dst host) don't agree
> about the internal structure of the varstore pflash.
> 
> Thanks
> Laszlo
> 
> 
> 
> 

