Subject: Re: [edk2-devel] [RFC PATCH 00/14] Firmware Support for Fast Live Migration for AMD SEV
From: "Tobin Feldman-Fitzthum" <tobin@linux.ibm.com>
To: Laszlo Ersek, devel@edk2.groups.io
Cc: Dov Murik, Tobin Feldman-Fitzthum, James Bottomley, Hubertus Franke,
 Brijesh Singh, Ashish Kalra, Jon Grimm, Tom Lendacky
Date: Wed, 3 Mar 2021 13:25:40 -0500
Message-ID: <4869b086-f329-b79b-39ee-e21bfc5f95b5@linux.ibm.com>
In-Reply-To: <9d7de545-7902-4d38-ba49-f084a750ee2a@redhat.com>
References: <20210302204839.82042-1-tobin@linux.ibm.com> <9d7de545-7902-4d38-ba49-f084a750ee2a@redhat.com>

> Hi Tobin,
>
> On 03/02/21 21:48, Tobin Feldman-Fitzthum wrote:
>> This is a demonstration of fast migration for encrypted virtual
>> machines using a Migration Handler that lives in OVMF. This demo uses
>> AMD SEV, but the ideas may generalize to other confidential computing
>> platforms. With AMD SEV, guest memory is encrypted and the hypervisor
>> cannot access or move it. This makes migration tricky. In this demo,
>> we show how the HV can ask a Migration Handler (MH) in the firmware
>> for an encrypted page. The MH encrypts the page with a transport key
>> prior to releasing it to the HV. The target machine also runs an MH
>> that decrypts the page once it is passed in by the target HV. These
>> patches are not ready for production, but they are a full end-to-end
>> solution that facilitates a fast live migration between two SEV VMs.
>>
>> Corresponding patches for QEMU have been posted by my colleague Dov
>> Murik on qemu-devel. Our approach needs little kernel support,
>> requiring only one hypercall that the guest can use to mark a page as
>> encrypted or shared. This series includes updated patches from Ashish
>> Kalra and Brijesh Singh that allow OVMF to use this hypercall.
>>
>> The MH runs continuously in the guest, waiting for communication from
>> the HV. The HV starts an additional vCPU for the MH but does not
>> expose it to the guest OS via ACPI. We use the MpService to start the
>> MH. The MpService is only available at runtime and processes that are
>> started by it are usually cleaned up on ExitBootServices. Since we
>> need the MH to run continuously, we had to make some modifications.
>> Ideally a feature could be added to the MpService to allow for the
>> starting of long-running processes. Besides migration, this could
>> support other background processes that need to operate within the
>> encryption boundary. For now, we have included a handful of patches
>> that modify the MpService to allow the MH to keep running after
>> ExitBootServices. These are temporary.
> I plan to do a lightweight review for this series. (My understanding is
> that it's an RFC and not actually being proposed for merging.)
>
> Regarding the MH's availability at runtime -- does that necessarily
> require the isolation of an AP? Because in the current approach,
> allowing the MP Services to survive into OS runtime (in some form or
> another) seems critical, and I don't think it's going to fly.
>
> I agree that the UefiCpuPkg patches have been well separated from the
> rest of the series, but I'm somewhat doubtful the "firmware-initiated
> background process" idea will be accepted. Have you investigated
> exposing a new "runtime service" (a function pointer) via the UEFI
> Configuration table, and calling that (perhaps periodically?) from the
> guest kernel? It would be a form of polling I guess. Or maybe, poll
> the mailbox directly in the kernel, and call the new firmware runtime
> service when there's an actual command to process.

Continuous runtime availability for the MH is almost certainly the most
controversial part of this proposal, which is why I put it in the cover
letter and why it's good to discuss.
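
If I understand the suggestion, it would look roughly like the sketch
below: a DXE driver publishes a small table containing a function
pointer through the UEFI Configuration Table, and the guest kernel
locates the table by GUID and calls that pointer whenever the mailbox
has a command to process. To be clear, everything in this sketch (the
GUID, the type names, MigrationHandlerProcessCommand) is made up for
illustration; it is not something the current series implements.

  #include <Uefi.h>
  #include <Library/UefiBootServicesTableLib.h>

  //
  // Hypothetical "runtime service" published via the UEFI Configuration
  // Table. The guest kernel would look this table up by GUID and call
  // ProcessCommand when the migration mailbox is non-empty.
  //
  typedef EFI_STATUS (EFIAPI *MH_PROCESS_COMMAND)(VOID);

  typedef struct {
    UINT32              Version;
    MH_PROCESS_COMMAND  ProcessCommand;
  } MH_RUNTIME_TABLE;

  //
  // Placeholder GUID; a real driver would define its own vendor GUID.
  //
  STATIC EFI_GUID  mMhRuntimeTableGuid = {
    0x00000000, 0x0000, 0x0000, { 0, 0, 0, 0, 0, 0, 0, 0 }
  };

  //
  // Hypothetical: drains one command from the migration mailbox.
  //
  EFI_STATUS
  EFIAPI
  MigrationHandlerProcessCommand (
    VOID
    )
  {
    // ... look at the mailbox and handle one command here ...
    return EFI_SUCCESS;
  }

  //
  // Note: the table and the code behind ProcessCommand would have to
  // live in runtime memory to remain usable after ExitBootServices.
  //
  STATIC MH_RUNTIME_TABLE  mMhRuntimeTable;

  EFI_STATUS
  PublishMigrationHandlerTable (
    VOID
    )
  {
    mMhRuntimeTable.Version        = 1;
    mMhRuntimeTable.ProcessCommand = MigrationHandlerProcessCommand;
    return gBS->InstallConfigurationTable (
                  &mMhRuntimeTableGuid,
                  &mMhRuntimeTable
                  );
  }

As you say, that amounts to polling: the MH would only run when the
kernel decides to call in, rather than being continuously available.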

> (You do spell out "little kernel support", and I'm not sure if that's a
> technical benefit, or a political / community benefit.)

As you allude to, minimal kernel support is really one of the main
things that shapes our approach. It is partly a political and practical
benefit, but there are technical benefits as well. Having the MH in
firmware likely leads to higher availability: it can be reached when
the OS is unreachable, for instance during boot or when the OS is hung.
There are also potential portability advantages, although we do
currently require support for one hypercall. The cost of implementing
this hypercall is low. Generally speaking, our task is to find a home
for functionality that was traditionally provided by the hypervisor but
that now has to live inside the trust domain, even though it isn't
really part of the guest. A meta-goal of this project is to figure out
the best way to do that.

> I'm quite uncomfortable with an attempt to hide a CPU from the OS via
> ACPI. The OS has other ways to learn (for example, a boot loader could
> use the MP services itself, stash the information, and hand it to the
> OS kernel -- this would minimally allow for detecting an inconsistency
> in the OS). What about "all-but-self" IPIs too -- the kernel might
> think all the processors it's poking like that were under its control.

This might be the second most controversial piece. Here's a question:
if we could successfully hide the MH vCPU from the OS, would it still
make you uncomfortable? In other words, is the worry that there might
be some inconsistency, or more generally that there is something hidden
from the OS at all? One thing to keep in mind is that the guest owner
should generally be aware that there is a migration handler running.
The way I see it, a guest owner of an SEV VM would need to opt in to
migration and should then expect that there is an MH running even if
they aren't able to see it. Of course, we need to be certain that the
MH isn't going to break the OS.

> Also, as far as I can tell from patch #7, the AP seems to be
> busy-looping (with a CpuPause() added in), for the entire lifetime of
> the OS. Do I understand right? If so -- is it a temporary trait as
> well?

In our approach the MH continuously checks for commands from the
hypervisor. There are ways to optimize this, such as having the
hypervisor de-schedule the MH vCPU while no migration is in progress.
You could also shut down the MH on the target after it receives the
MH_RESET command (when the migration finishes), but what if you then
want to migrate that VM somewhere else?
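
To give a concrete picture of what that vCPU is doing, the loop has
roughly the following shape. The mailbox layout and the command values
other than MH_RESET are simplified stand-ins for illustration, not the
exact protocol used in the patches.

  #include <Uefi.h>
  #include <Library/BaseLib.h>          // CpuPause()

  //
  // Simplified stand-in for the mailbox shared with the hypervisor.
  //
  typedef struct {
    volatile UINT64  Cmd;               // written by the HV, cleared by the MH
    UINT64           Gpa;               // guest-physical address of the page
    UINT8            Page[SIZE_4KB];    // transport-encrypted page contents
  } MH_MAILBOX;

  #define MH_IDLE     0
  #define MH_SAVE     1                 // illustrative command numbers
  #define MH_RESTORE  2
  #define MH_RESET    3

  //
  // Runs forever on the extra vCPU. It has the EFI_AP_PROCEDURE shape,
  // so it can be started on an AP through the MP Services.
  //
  VOID
  EFIAPI
  MigrationHandlerMain (
    IN OUT VOID  *Buffer
    )
  {
    MH_MAILBOX  *Mailbox;

    Mailbox = Buffer;
    for ( ; ; ) {
      switch (Mailbox->Cmd) {
      case MH_SAVE:
        // Encrypt the page at Mailbox->Gpa with the transport key and put
        // the result in Mailbox->Page for the HV to read out.
        Mailbox->Cmd = MH_IDLE;
        break;
      case MH_RESTORE:
        // Decrypt Mailbox->Page with the transport key and write it to
        // Mailbox->Gpa on the target.
        Mailbox->Cmd = MH_IDLE;
        break;
      case MH_RESET:
        // Migration finished; drop per-migration state but keep looping
        // so the guest can be migrated again later.
        Mailbox->Cmd = MH_IDLE;
        break;
      default:
        break;
      }
      CpuPause ();                      // the busy-wait you noticed in patch #7
    }
  }

The de-scheduling idea above would at least keep this loop from
occupying a host CPU outside of migrations.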

> Sorry if my questions are "premature", in the sense that I could get my
> own answers as well if I actually read the patches in detail --
> however, I wouldn't like to do that at once, because then I'll be
> distracted by many style issues and other "trivial" stuff. Examples for
> the latter:

Not premature at all. I think you hit the nail on the head with
everything you raised.

-Tobin
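
P.S. For reference on the variable question below: whether a UEFI
variable survives a reboot is controlled by the EFI_VARIABLE_NON_VOLATILE
attribute, so a per-boot flag would simply omit it. A minimal sketch
(the GUID here is a placeholder, not whatever the series defines):

  #include <Uefi.h>
  #include <Library/UefiRuntimeServicesTableLib.h>

  //
  // Placeholder vendor GUID for the illustration.
  //
  STATIC EFI_GUID  mSevLiveMigrationGuid = {
    0x00000000, 0x0000, 0x0000, { 0, 0, 0, 0, 0, 0, 0, 0 }
  };

  EFI_STATUS
  SetLiveMigrationEnabled (
    VOID
    )
  {
    UINT8  Enabled;

    Enabled = 1;
    //
    // Volatile: no EFI_VARIABLE_NON_VOLATILE, so the flag is recreated
    // on every boot instead of persisting in the variable store.
    //
    return gRT->SetVariable (
                  L"SevLiveMigrationEnabled",
                  &mSevLiveMigrationGuid,
                  EFI_VARIABLE_BOOTSERVICE_ACCESS | EFI_VARIABLE_RUNTIME_ACCESS,
                  sizeof (Enabled),
                  &Enabled
                  );
  }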

> - patch#1 calls SetMemoryEncDecHypercall3(), but there is no such
> function in edk2, so minimally it's a patch ordering bug in the series,
>
> - in patch#1, there's minimally one whitespace error (no whitespace
> right after "EFI_SIZE_TO_PAGES")
>
> - in patch#1, the alphabetical ordering in the [LibraryClasses] section,
> and in the matching #include directives, gets broken,
>
> - I'd prefer if the "SevLiveMigrationEnabled" UEFI variable were set in
> ConfidentialMigrationDxe, rather than PlatformDxe (patch #3), or at
> least another AMD SEV related DXE driver (OvmfPkg/AmdSevDxe etc).
>
> - any particular reason for making the UEFI variable non-volatile? I
> don't think it should survive any particular boot of the guest.
>
> - Why do we need a variable in the first place?
>
> etc etc
>
> Thanks!
> Laszlo
>
>> Ashish Kalra (2):
>>   OvmfPkg/PlatformPei: Mark SEC GHCB page in the page encrpytion bitmap.
>>   OvmfPkg/PlatformDxe: Add support for SEV live migration.
>>
>> Brijesh Singh (1):
>>   OvmfPkg/BaseMemEncryptLib: Support to issue unencrypted hypercall
>>
>> Dov Murik (1):
>>   OvmfPkg/AmdSev: Build page table for migration handler
>>
>> Tobin Feldman-Fitzthum (10):
>>   OvmfPkg/AmdSev: Base for Confidential Migration Handler
>>   OvmfPkg/PlatfomPei: Set Confidential Migration PCD
>>   OvmfPkg/AmdSev: Setup Migration Handler Mailbox
>>   OvmfPkg/AmdSev: MH support for mailbox protocol
>>   UefiCpuPkg/MpInitLib: temp removal of MpLib cleanup
>>   UefiCpuPkg/MpInitLib: Allocate MP buffer as runtime memory
>>   UefiCpuPkg/CpuExceptionHandlerLib: Exception handling as runtime memory
>>   OvmfPkg/AmdSev: Don't overwrite mailbox or pagetables
>>   OvmfPkg/AmdSev: Don't overwrite MH stack
>>   OvmfPkg/AmdSev: MH page encryption POC
>>
>>  OvmfPkg/OvmfPkg.dec | 11 +
>>  OvmfPkg/AmdSev/AmdSevX64.dsc | 2 +
>>  OvmfPkg/AmdSev/AmdSevX64.fdf | 13 +-
>>  .../ConfidentialMigrationDxe.inf | 45 +++
>>  .../ConfidentialMigrationPei.inf | 35 ++
>>  .../DxeMemEncryptSevLib.inf | 1 +
>>  .../PeiMemEncryptSevLib.inf | 1 +
>>  OvmfPkg/PlatformDxe/Platform.inf | 2 +
>>  OvmfPkg/PlatformPei/PlatformPei.inf | 2 +
>>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf | 2 +
>>  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf | 2 +
>>  OvmfPkg/AmdSev/ConfidentialMigration/MpLib.h | 235 +++++++++++++
>>  .../ConfidentialMigration/VirtualMemory.h | 177 ++++++++++
>>  OvmfPkg/Include/Guid/MemEncryptLib.h | 16 +
>>  OvmfPkg/PlatformDxe/PlatformConfig.h | 5 +
>>  .../ConfidentialMigrationDxe.c | 325 ++++++++++++++++++
>>  .../ConfidentialMigrationPei.c | 25 ++
>>  .../X64/PeiDxeVirtualMemory.c | 18 +
>>  OvmfPkg/PlatformDxe/AmdSev.c | 99 ++++++
>>  OvmfPkg/PlatformDxe/Platform.c | 6 +
>>  OvmfPkg/PlatformPei/AmdSev.c | 10 +
>>  OvmfPkg/PlatformPei/Platform.c | 10 +
>>  .../CpuExceptionHandlerLib/DxeException.c | 8 +-
>>  UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 21 +-
>>  UefiCpuPkg/Library/MpInitLib/MpLib.c | 7 +-
>>  25 files changed, 1061 insertions(+), 17 deletions(-)
>>  create mode 100644 OvmfPkg/AmdSev/ConfidentialMigration/ConfidentialMigrationDxe.inf
>>  create mode 100644 OvmfPkg/AmdSev/ConfidentialMigration/ConfidentialMigrationPei.inf
>>  create mode 100644 OvmfPkg/AmdSev/ConfidentialMigration/MpLib.h
>>  create mode 100644 OvmfPkg/AmdSev/ConfidentialMigration/VirtualMemory.h
>>  create mode 100644 OvmfPkg/Include/Guid/MemEncryptLib.h
>>  create mode 100644 OvmfPkg/AmdSev/ConfidentialMigration/ConfidentialMigrationDxe.c
>>  create mode 100644 OvmfPkg/AmdSev/ConfidentialMigration/ConfidentialMigrationPei.c
>>  create mode 100644 OvmfPkg/PlatformDxe/AmdSev.c
>>