Subject: Re: [edk2-devel] [RFC PATCH 00/14] Firmware Support for Fast Live Migration for AMD SEV
To: Ashish Kalra
Cc: Laszlo Ersek, devel@edk2.groups.io, Dov Murik, Tobin Feldman-Fitzthum, James Bottomley, Hubertus Franke, Brijesh Singh, Jon Grimm, Tom Lendacky
References: <20210302204839.82042-1-tobin@linux.ibm.com> <9d7de545-7902-4d38-ba49-f084a750ee2a@redhat.com> <4869b086-f329-b79b-39ee-e21bfc5f95b5@linux.ibm.com> <20210305104422.GA1984@ashkalra_ubuntu_server> <20210305161053.GA2148@ashkalra_ubuntu_server>
From: "Tobin Feldman-Fitzthum"
Message-ID: <5dfa8dba-d627-d30b-e42e-31ab34f721f8@linux.ibm.com>
Date: Fri, 5 Mar 2021 16:22:25 -0500
In-Reply-To: <20210305161053.GA2148@ashkalra_ubuntu_server>

> On Fri, Mar 05, 2021 at 10:44:23AM +0000, Ashish Kalra wrote:
>> On Wed, Mar 03, 2021 at 01:25:40PM -0500, Tobin Feldman-Fitzthum wrote:
>>>> Hi Tobin,
>>>>
>>>> On 03/02/21 21:48, Tobin Feldman-Fitzthum wrote:
>>>>> This is a demonstration of fast migration for encrypted virtual machines using a Migration Handler that lives in OVMF. This demo uses AMD SEV, but the ideas may generalize to other confidential computing platforms. With AMD SEV, guest memory is encrypted and the hypervisor cannot access or move it. This makes migration tricky. In this demo, we show how the HV can ask a Migration Handler (MH) in the firmware for an encrypted page. The MH encrypts the page with a transport key prior to releasing it to the HV. The target machine also runs an MH that decrypts the page once it is passed in by the target HV. These patches are not ready for production, but they are a full end-to-end solution that facilitates a fast live migration between two SEV VMs.
>>>>>
>>>>> Corresponding patches for QEMU have been posted by my colleague Dov Murik on qemu-devel. Our approach needs little kernel support, requiring only one hypercall that the guest can use to mark a page as encrypted or shared. This series includes updated patches from Ashish Kalra and Brijesh Singh that allow OVMF to use this hypercall.
>>>>>
>>>>> The MH runs continuously in the guest, waiting for communication from the HV. The HV starts an additional vCPU for the MH but does not expose it to the guest OS via ACPI. We use the MpService to start the MH. The MpService is only available at boot time, and processes that are started by it are usually cleaned up on ExitBootServices. Since we need the MH to run continuously, we had to make some modifications. Ideally, a feature could be added to the MpService to allow for the starting of long-running processes. Besides migration, this could support other background processes that need to operate within the encryption boundary. For now, we have included a handful of patches that modify the MpService to allow the MH to keep running after ExitBootServices. These are temporary.
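From the guest side, that single hypercall would look roughly like the sketch below. The hypercall number, argument order, and all names here are placeholders I made up to illustrate the idea; they are not the exact ABI from the posted kernel patches.

  #include <Uefi.h>

  #define KVM_HC_PAGE_ENC_STATUS  12   // placeholder hypercall number
  #define PAGE_STATE_ENCRYPTED    1
  #define PAGE_STATE_SHARED       0

  STATIC
  UINTN
  KvmHypercall3 (
    IN UINTN  Nr,
    IN UINTN  Arg0,
    IN UINTN  Arg1,
    IN UINTN  Arg2
    )
  {
    UINTN  Ret;

    //
    // KVM hypercall convention: number in RAX, arguments in
    // RBX/RCX/RDX, result returned in RAX. VMMCALL is the AMD
    // instruction for exiting to the hypervisor.
    //
    __asm__ __volatile__ (
      "vmmcall"
      : "=a" (Ret)
      : "a" (Nr), "b" (Arg0), "c" (Arg1), "d" (Arg2)
      : "memory"
      );
    return Ret;
  }

  //
  // Tell the HV whether the NumPages pages starting at Gpa are
  // encrypted (private) or shared, so its migration code knows which
  // pages must be fetched through the MH rather than read directly.
  //
  VOID
  SetPageEncStatus (
    IN UINTN    Gpa,
    IN UINTN    NumPages,
    IN BOOLEAN  Encrypted
    )
  {
    KvmHypercall3 (
      KVM_HC_PAGE_ENC_STATUS,
      Gpa,
      NumPages,
      Encrypted ? PAGE_STATE_ENCRYPTED : PAGE_STATE_SHARED
      );
  }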
>>>> I plan to do a lightweight review for this series. (My understanding is that it's an RFC and not actually being proposed for merging.)
>>>>
>>>> Regarding the MH's availability at runtime -- does that necessarily require the isolation of an AP? Because in the current approach, allowing the MP Services to survive into OS runtime (in some form or another) seems critical, and I don't think it's going to fly.
>>>>
>>>> I agree that the UefiCpuPkg patches have been well separated from the rest of the series, but I'm somewhat doubtful the "firmware-initiated background process" idea will be accepted. Have you investigated exposing a new "runtime service" (a function pointer) via the UEFI Configuration table, and calling that (perhaps periodically?) from the guest kernel? It would be a form of polling, I guess. Or maybe, poll the mailbox directly in the kernel, and call the new firmware runtime service when there's an actual command to process.
>>>
>>> Continuous runtime availability for the MH is almost certainly the most controversial part of this proposal, which is why I put it in the cover letter and why it's good to discuss.
>>>
>>>> (You do spell out "little kernel support", and I'm not sure if that's a technical benefit, or a political / community benefit.)
>>>
>>> As you allude to, minimal kernel support is really one of the main things that shapes our approach. This is partly a political and practical benefit, but there are also technical benefits. Having the MH in firmware likely leads to higher availability. It can be accessed when the OS is unreachable, perhaps during boot or when the OS is hung. There are also potential portability advantages, although we do currently require support for one hypercall. The cost of implementing this hypercall is low.
>>>
>>> Generally speaking, our task is to find a home for functionality that was traditionally provided by the hypervisor, but that needs to be inside the trust domain and isn't really part of the guest. A meta-goal of this project is to figure out the best way to do this.
>>>
>>>> I'm quite uncomfortable with an attempt to hide a CPU from the OS via ACPI. The OS has other ways to learn (for example, a boot loader could use the MP services itself, stash the information, and hand it to the OS kernel -- this would minimally allow for detecting an inconsistency in the OS). What about "all-but-self" IPIs too -- the kernel might think all the processors it's poking like that were under its control.
>>>
>>> This might be the second most controversial piece. Here's a question: if we could successfully hide the MH vCPU from the OS, would it still make you uncomfortable? In other words, is the worry that there might be some inconsistency, or more generally that there is something hidden from the OS? One thing to think about is that the guest owner should generally be aware that there is a migration handler running. The way I see it, a guest owner of an SEV VM would need to opt in to migration and should then expect that there is an MH running even if they aren't able to see it. Of course, we need to be certain that the MH isn't going to break the OS.
>>>
>>>> Also, as far as I can tell from patch #7, the AP seems to be busy-looping (with a CpuPause() added in) for the entire lifetime of the OS. Do I understand right? If so -- is it a temporary trait as well?
>>>
>>> In our approach, the MH continuously checks for commands from the hypervisor. There are potentially ways to optimize this, such as having the hypervisor de-schedule the MH vCPU while not migrating. You could potentially shut down the MH on the target after receiving the MH_RESET command (when the migration finishes), but what if you want to migrate that VM somewhere else?
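To make that concrete, the loop in question has roughly the shape sketched below. This is an invented illustration: the mailbox layout, command names, and crypto helpers are placeholders, not the code from patch #7.

  #include <Uefi.h>
  #include <Library/BaseLib.h>

  typedef enum {
    MhCmdNone = 0,
    MhCmdGetPage,   // source: encrypt a guest page with the transport key
    MhCmdSetPage,   // target: decrypt an incoming page
    MhCmdReset      // migration finished; clear transport state
  } MH_COMMAND;

  typedef struct {
    volatile UINT32  Command;   // MH_COMMAND value written by the HV
    UINT64           Gpa;       // page the current command refers to
  } MH_MAILBOX;

  //
  // Assumed helpers; stand-ins for the real transport-key crypto.
  //
  VOID EncryptPageWithTransportKey (IN UINT64 Gpa);
  VOID DecryptPageWithTransportKey (IN UINT64 Gpa);
  VOID ResetTransportState (VOID);

  //
  // Entry point for the dedicated AP (signature compatible with an
  // MpService AP procedure): loop forever, polling the mailbox shared
  // with the HV and pausing between polls.
  //
  VOID
  EFIAPI
  MigrationHandlerMain (
    IN VOID  *Context
    )
  {
    MH_MAILBOX  *Mailbox;

    Mailbox = (MH_MAILBOX *)Context;

    for (;;) {
      if (Mailbox->Command == MhCmdNone) {
        CpuPause ();   // idle politely while waiting for the HV
        continue;
      }

      switch (Mailbox->Command) {
        case MhCmdGetPage:
          EncryptPageWithTransportKey (Mailbox->Gpa);
          break;
        case MhCmdSetPage:
          DecryptPageWithTransportKey (Mailbox->Gpa);
          break;
        case MhCmdReset:
          //
          // Keep looping after a reset, so the same VM can be
          // migrated again later.
          //
          ResetTransportState ();
          break;
        default:
          break;
      }

      Mailbox->Command = MhCmdNone;   // signal completion to the HV
    }
  }

Having the HV de-schedule the MH vCPU while no migration is in flight would sit entirely on the host side; the loop above wouldn't need to change.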
>>
>> I think another approach can be considered here: why not implement the MH vCPU(s) as hot-plugged vCPU(s)? Basically, hot-plug a new vCPU when migration is started and hot-unplug it when migration is completed. Then we won't need a vCPU running (and potentially consuming cycles) forever, busy-looping with CpuPause().
>>
> After internal discussions, we realized that this approach will not work, as vCPU hotplug will not work for SEV-ES and SEV-SNP. Since the VMSA has to be encrypted as part of the LAUNCH command, we can't create/add a new vCPU after LAUNCH has completed.
>
> Thanks,
> Ashish

Hm yeah, we talked about hotplug a bit. It was never clear how it would square with OVMF.

-Tobin