public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Tobin Feldman-Fitzthum" <tobin@linux.ibm.com>
To: Ashish Kalra <ashish.kalra@amd.com>
Cc: Laszlo Ersek <lersek@redhat.com>,
	devel@edk2.groups.io, Dov Murik <dovmurik@linux.vnet.ibm.com>,
	Tobin Feldman-Fitzthum <tobin@ibm.com>,
	James Bottomley <jejb@linux.ibm.com>,
	Hubertus Franke <frankeh@us.ibm.com>,
	Brijesh Singh <brijesh.singh@amd.com>,
	Jon Grimm <jon.grimm@amd.com>,
	Tom Lendacky <thomas.lendacky@amd.com>
Subject: Re: [edk2-devel] [RFC PATCH 00/14] Firmware Support for Fast Live Migration for AMD SEV
Date: Fri, 5 Mar 2021 16:22:25 -0500	[thread overview]
Message-ID: <5dfa8dba-d627-d30b-e42e-31ab34f721f8@linux.ibm.com> (raw)
In-Reply-To: <20210305161053.GA2148@ashkalra_ubuntu_server>


> On Fri, Mar 05, 2021 at 10:44:23AM +0000, Ashish Kalra wrote:
>> On Wed, Mar 03, 2021 at 01:25:40PM -0500, Tobin Feldman-Fitzthum wrote:
>>>> Hi Tobin,
>>>>
>>>> On 03/02/21 21:48, Tobin Feldman-Fitzthum wrote:
>>>>> This is a demonstration of fast migration for encrypted virtual machines
>>>>> using a Migration Handler that lives in OVMF. This demo uses AMD SEV,
>>>>> but the ideas may generalize to other confidential computing platforms.
>>>>> With AMD SEV, guest memory is encrypted and the hypervisor cannot access
>>>>> or move it. This makes migration tricky. In this demo, we show how the
>>>>> HV can ask a Migration Handler (MH) in the firmware for an encrypted
>>>>> page. The MH encrypts the page with a transport key prior to releasing
>>>>> it to the HV. The target machine also runs an MH that decrypts the page
>>>>> once it is passed in by the target HV. These patches are not ready for
>>>>> production, but the are a full end-to-end solution that facilitates a
>>>>> fast live migration between two SEV VMs.
>>>>>
>>>>> Corresponding patches for QEMU have been posted my colleague Dov Murik
>>>>> on qemu-devel. Our approach needs little kernel support, requiring only
>>>>> one hypercall that the guest can use to mark a page as encrypted or
>>>>> shared. This series includes updated patches from Ashish Kalra and
>>>>> Brijesh Singh that allow OVMF to use this hypercall.
>>>>>
>>>>> The MH runs continuously in the guest, waiting for communication from
>>>>> the HV. The HV starts an additional vCPU for the MH but does not expose
>>>>> it to the guest OS via ACPI. We use the MpService to start the MH. The
>>>>> MpService is only available at runtime and processes that are started by
>>>>> it are usually cleaned up on ExitBootServices. Since we need the MH to
>>>>> run continuously, we had to make some modifications. Ideally a feature
>>>>> could be added to the MpService to allow for the starting of
>>>>> long-running processes. Besides migration, this could support other
>>>>> background processes that need to operate within the encryption
>>>>> boundary. For now, we have included a handful of patches that modify the
>>>>> MpService to allow the MH to keep running after ExitBootServices. These
>>>>> are temporary.
>>>> I plan to do a lightweight review for this series. (My understanding is
>>>> that it's an RFC and not actually being proposed for merging.)
>>>>
>>>> Regarding the MH's availability at runtime -- does that necessarily
>>>> require the isolation of an AP? Because in the current approach,
>>>> allowing the MP Services to survive into OS runtime (in some form or
>>>> another) seems critical, and I don't think it's going to fly.
>>>>
>>>> I agree that the UefiCpuPkg patches have been well separated from the
>>>> rest of the series, but I'm somewhat doubtful the "firmware-initiated
>>>> background process" idea will be accepted. Have you investigated
>>>> exposing a new "runtime service" (a function pointer) via the UEFI
>>>> Configuration table, and calling that (perhaps periodically?) from the
>>>> guest kernel? It would be a form of polling I guess. Or maybe, poll the
>>>> mailbox directly in the kernel, and call the new firmware runtime
>>>> service when there's an actual command to process.
>>> Continuous runtime availability for the MH is almost certainly the most
>>> controversial part of this proposal, which is why I put it in the cover
>>> letter and why it's good to discuss.
>>>> (You do spell out "little kernel support", and I'm not sure if that's a
>>>> technical benefit, or a political / community benefit.)
>>> As you allude to, minimal kernel support is really one of the main things
>>> that shapes our approach. This is partly a political and practical benefit,
>>> but there are also technical benefits. Having the MH in firmware likely
>>> leads to higher availability. It can be accessed when the OS is unreachable,
>>> perhaps during boot or when the OS is hung. There are also potential
>>> portability advantages although we do currently require support for one
>>> hypercall. The cost of implementing this hypercall is low.
>>>
>>> Generally speaking, our task is to find a home for functionality that was
>>> traditionally provided by the hypervisor, but that needs to be inside the
>>> trust domain, but that isn't really part of a guest. A meta-goal of this
>>> project is to figure out the best way to do this.
>>>
>>>> I'm quite uncomfortable with an attempt to hide a CPU from the OS via
>>>> ACPI. The OS has other ways to learn (for example, a boot loader could
>>>> use the MP services itself, stash the information, and hand it to the OS
>>>> kernel -- this would minimally allow for detecting an inconsistency in
>>>> the OS). What about "all-but-self" IPIs too -- the kernel might think
>>>> all the processors it's poking like that were under its control.
>>> This might be the second most controversial piece. Here's a question: if we
>>> could successfully hide the MH vCPU from the OS, would it still make you
>>> uncomfortable? In other words, is the worry that there might be some
>>> inconsistency or more generally that there is something hidden from the OS?
>>> One thing to think about is that the guest owner should generally be aware
>>> that there is a migration handler running. The way I see it, a guest owner
>>> of an SEV VM would need to opt-in to migration and should then expect that
>>> there is an MH running even if they aren't able to see it. Of course we need
>>> to be certain that the MH isn't going to break the OS.
>>>
>>>> Also, as far as I can tell from patch #7, the AP seems to be
>>>> busy-looping (with a CpuPause() added in), for the entire lifetime of
>>>> the OS. Do I understand right? If so -- is it a temporary trait as well?
>>> In our approach the MH continuously checks for commands from the hypervisor.
>>> There are potentially ways to optimize this, such as having the hypervisor
>>> de-schedule the MH vCPU while not migrating. You could potentially shut down
>>> down the MH on the target after receiving the MH_RESET command (when the
>>> migration finishes), but what if you want to migrate that VM somewhere else?
>>>
>> I think another approach can be considered here, why not implement MH
>> vCPU(s) as hot-plugged vCPU(s), basically hot-plug a new vCPU when migration
>> is started and hot unplug the vCPU when migration is completed, then we
>> won't need a vCPU running (and potentially consuming cycles) forever and
>> busy-looping with CpuPause().
>>
> After internal discussions, realized that this approach will not work as
> vCPU hotplug will not work for SEV-ES, SNP. As the VMSA has to be
> encrypted as part of the LAUNCH command, therefore we can't create/add a
> new vCPU after LAUNCH has completed.
>
> Thanks,
> Ashish

Hm yeah we talked about hotplug a bit. It was never clear how it would 
square with OVMF.

-Tobin


  reply	other threads:[~2021-03-05 21:22 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-02 20:48 [RFC PATCH 00/14] Firmware Support for Fast Live Migration for AMD SEV Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 01/14] OvmfPkg/BaseMemEncryptLib: Support to issue unencrypted hypercall Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 02/14] OvmfPkg/PlatformPei: Mark SEC GHCB page in the page encrpytion bitmap Tobin Feldman-Fitzthum
2021-03-03  0:16   ` Ashish Kalra
2021-03-03 14:56     ` [edk2-devel] " Tobin Feldman-Fitzthum
2021-03-03 15:01       ` Ashish Kalra
2021-03-02 20:48 ` [RFC PATCH 03/14] OvmfPkg/PlatformDxe: Add support for SEV live migration Tobin Feldman-Fitzthum
2021-03-03 16:41   ` Ashish Kalra
2021-03-03 16:47     ` Tobin Feldman-Fitzthum
2021-03-03 16:57       ` Ashish Kalra
2021-03-02 20:48 ` [RFC PATCH 04/14] OvmfPkg/AmdSev: Base for Confidential Migration Handler Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 05/14] OvmfPkg/PlatfomPei: Set Confidential Migration PCD Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 06/14] OvmfPkg/AmdSev: Setup Migration Handler Mailbox Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 07/14] OvmfPkg/AmdSev: MH support for mailbox protocol Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 08/14] UefiCpuPkg/MpInitLib: temp removal of MpLib cleanup Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 09/14] UefiCpuPkg/MpInitLib: Allocate MP buffer as runtime memory Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 10/14] UefiCpuPkg/CpuExceptionHandlerLib: Exception handling " Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 11/14] OvmfPkg/AmdSev: Build page table for migration handler Tobin Feldman-Fitzthum
2021-03-03 16:32   ` Ashish Kalra
2021-03-03 18:58     ` Dov Murik
2021-03-02 20:48 ` [RFC PATCH 12/14] OvmfPkg/AmdSev: Don't overwrite mailbox or pagetables Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 13/14] OvmfPkg/AmdSev: Don't overwrite MH stack Tobin Feldman-Fitzthum
2021-03-02 20:48 ` [RFC PATCH 14/14] OvmfPkg/AmdSev: MH page encryption POC Tobin Feldman-Fitzthum
2021-03-03 16:14 ` [edk2-devel] [RFC PATCH 00/14] Firmware Support for Fast Live Migration for AMD SEV Laszlo Ersek
2021-03-03 18:25   ` Tobin Feldman-Fitzthum
2021-03-04 17:35     ` Laszlo Ersek
2021-03-05 10:44     ` Ashish Kalra
2021-03-05 16:10       ` Ashish Kalra
2021-03-05 21:22         ` Tobin Feldman-Fitzthum [this message]
2021-03-04  1:49 ` Yao, Jiewen
2021-03-04  9:21 ` Paolo Bonzini
2021-03-04 20:45   ` Laszlo Ersek
2021-03-04 21:18     ` Laszlo Ersek
2021-03-05  8:59     ` Paolo Bonzini
     [not found] ` <166900903D364B89.9163@groups.io>
2021-03-13  2:32   ` Yao, Jiewen
2021-03-16 17:05     ` Singh, Brijesh
2021-03-16 17:47     ` Tobin Feldman-Fitzthum
2021-03-17 15:30       ` Yao, Jiewen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5dfa8dba-d627-d30b-e42e-31ab34f721f8@linux.ibm.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox