From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: mx.groups.io; dkim=missing; spf=pass (domain: redhat.com, ip: 209.132.183.28, mailfrom: lersek@redhat.com) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by groups.io with SMTP; Thu, 03 Oct 2019 03:12:58 -0700 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 93D883084025; Thu, 3 Oct 2019 10:12:57 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-154.rdu2.redhat.com [10.10.120.154]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1B3745D9E1; Thu, 3 Oct 2019 10:12:51 +0000 (UTC) Subject: Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES To: "Lendacky, Thomas" , "devel@edk2.groups.io" Cc: Jordan Justen , Ard Biesheuvel , Michael D Kinney , Liming Gao , Eric Dong , Ray Ni , "Singh, Brijesh" , =?UTF-8?Q?Philippe_Mathieu-Daud=c3=a9?= References: <81e310d1f2929f839cd166d1c7de6694220743b6.1568922729.git.thomas.lendacky@amd.com> <284e15f0-25ee-bb69-dcd1-09e146346c69@redhat.com> From: "Laszlo Ersek" Message-ID: Date: Thu, 3 Oct 2019 12:12:51 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.40]); Thu, 03 Oct 2019 10:12:57 +0000 (UTC) Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit On 10/02/19 20:07, Lendacky, Thomas wrote: > On 10/2/19 10:26 AM, Laszlo Ersek wrote: >> On 10/02/19 17:15, Laszlo Ersek wrote: >>> Adding Phil. >>> >>> I'm looking at this patch only because one thing caught my attention in >>> the previous one, "OvmfPkg: Add support for SEV-ES AP reset vector >>> re-directing": >>> >>> On 09/19/19 21:53, Lendacky, Thomas wrote: >>>> From: Tom Lendacky >>>> >>>> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2198 >>>> >>>> Typically, an AP is booted using the INIT-SIPI-SIPI sequence. This >>>> sequence is intercepted by the hypervisor, which sets the AP's registers >>>> to the values requested by the sequence. At that point, the hypervisor can >>>> start the AP, which will then begin execution at the appropriate location. >>>> >>>> Under SEV-ES, AP booting presents some challenges since the hypervisor is >>>> not allowed to alter the AP's register state. In this situation, we have >>>> to distinguish between the AP's first boot and AP's subsequent boots. >>>> >>>> First boot: >>>> Once the AP's register state has been defined (which is before the guest >>>> is first booted) it cannot be altered. Should the hypervisor attempt to >>>> alter the register state, the change would be detected by the hardware >>>> and the VMRUN instruction would fail. Given this, the first boot for the >>>> AP is required to begin execution with this initial register state, which >>>> is typically the reset vector. This prevents the BSP from directing the >>>> AP startup location through the INIT-SIPI-SIPI sequence. >>>> >>>> To work around this, provide a four-byte field at offset 0xffffffd0 that >>>> can contain an IP / CS register combination, that if non-zero, causes >>>> the AP to perform a far jump to that location instead of a near jump to >>>> EarlyBspInitReal16. Before booting the AP for the first time, the BSP >>>> should set the IP / CS value for the AP based on the value that would be >>>> derived from the INIT-SIPI-SIPI sequence. >>> >>> I don't understand how this can work: the guest-phys address 0xffffffd0 >>> is backed by read-only pflash in most OVMF deployments. >>> >>> In addition: >>> >>> [...] >>> >>>> @@ -1002,6 +1204,7 @@ WakeUpAP ( >>>> CpuMpData->InitFlag != ApInitDone) { >>>> ResetVectorRequired = TRUE; >>>> AllocateResetVector (CpuMpData); >>>> + AllocateSevEsAPMemory (CpuMpData); >>>> FillExchangeInfoData (CpuMpData); >>>> SaveLocalApicTimerSetting (CpuMpData); >>>> } >>>> @@ -1038,6 +1241,15 @@ WakeUpAP ( >>>> } >>>> } >>>> if (ResetVectorRequired) { >>>> + // >>>> + // For SEV-ES, set the jump address for initial AP boot >>>> + // >>>> + if (CpuMpData->SevEsActive) { >>>> + SEV_ES_AP_JMP_FAR *JmpFar = (SEV_ES_AP_JMP_FAR *)0xFFFFFFD0; >>>> + >>>> + JmpFar->ApStart.Rip = 0; >>>> + JmpFar->ApStart.Segment = (UINT16) (ExchangeInfo->BufferStart >> 4); >>>> + } >>> >>> Even if the address is backed by a single "unified" pflash, mapped r/w >>> -- which we can call a "non-standard OVMF deployment" nowadays --, a >>> normal store doesn't appear sufficient to me. The first write to pflash >>> will flip it to "programming mode", and the values stored are supposed >>> to be pflash commands (not just the raw data we intend to put in place). >>> >>> See for example the QemuFlashWrite() function in >>> "OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlash.c". (But, again, that >>> is used with the pflash chip that hosts the variable store, and is >>> therefore mapped r/w.) >>> >>> >>> Taking a step back... I don't think APs execute any code from pflash, >>> when MpInitLib boots them. >>> >>> In OVMF, PcdCpuApLoopMode is ApInHltLoop (value 1), therefore >>> "CpuMpData->WakeUpByInitSipiSipi" should be TRUE, and >>> "ResetVectorRequired" too should be TRUE, at first AP boot. >>> Consequently, the reset vector seems to be allocated with >>> AllocateResetVector(). >>> >>> AllocateResetVector() has separate implementations for PEI and DXE, but >>> in both cases, it returns RAM. So I don't see where the AP accesses (or >>> executes) pflash. >> >> ... I believe I understand that this is precisely what cannot work under >> SEV-ES -- because we cannot launch an AP at an address that's >> dynamically chosen by the firmware (and passed to the hypervisor), like >> with INIT-SIPI-SIPI. > > Correct. > >> >> And so firmware and hypervisor have to agree upon a *constant* AP reset >> vector address, in advance. >> >> We have two options: >> >> - pick the reset vector address *constant* such that it falls into RAM, or >> >> - let the AP reset vector reside in pflash, but then the code in pflash >> has to look for a parameter block at a fixed address in RAM. >> >> So in the end, both options require the same -- we need a RAM address >> constant that is determined at firmware build time. Either for the reset >> vector itself, or for the reset vector's parameter block. > > As long as the hypervisor has a way to determine a build time address, > that would be the preferred method. I couldn't come up with a way (or at > least don't know of a way) to allow the hypervisor to discover that build > time address. Hopefully someone else does and we can eliminate the need > to map the ROM read/write and just program the vCPUs initial registers to > the build time value. > > I think the address should be for the AP reset vector itself, otherwise > you would have to guarantee that the reset vector parameter block is zero > for the initial BSP boot. Rough idea: (1) Introduce a new fixed PCD (UINT32) pointing into RAM -- the AP reset vector address. You could carve it out from another unused part of FD.MEMFD, and set the PCD in the FDF files. (2) In PlatformPei, reserve the area. (3) In "ResetVectorVtf0.asm", we should introduce a packed structure as follows, so that it end at label "applicationProcessorEntryPoint": UINT32 ApEntryPoint; EFI_GUID SevEsFooterGuid; UINT16 Size; I think the assembly source code could populate this structure with DD / DB / DW pseudo-instructions. - The "ApEntryPoint" field would be filled from FixedPcdGet32. - The SevEsFooterGuid field would be open-coded with DB. We'd pick a new GUID with uuidgen for this. - The "Size" field would cover the entire structure. I think we should be able to auto-calculate it for a DW pseudo-instruction, placing a label at the top of the structure, and then using ($ - ThatLabel) or similar. (4) The following information would be fixed (by convention) and known to QEMU: - the address of the SevEsFooterGuid field (offset backwards from the end of flash / 4GB) - the value of SevEsFooterGuid. (5) Whenever QEMU finds the GUID, it can check the Size field. Given the Size field, QEMU can check fields *prepended* to the footer structure. Incompatible changes to the structure (i.e. any change other than prepending new fields) require a new GUID to be generated for SevEsFooterGuid. (6) Thus QEMU could read ApEntryPoint out of the pflash image. Thanks Laszlo