From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 5437021CEB0F6 for ; Thu, 7 Sep 2017 07:45:52 -0700 (PDT) Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Sep 2017 07:48:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.42,358,1500966000"; d="scan'208,217";a="309092678" Received: from fmsmsx108.amr.corp.intel.com ([10.18.124.206]) by fmsmga004.fm.intel.com with ESMTP; 07 Sep 2017 07:48:30 -0700 Received: from fmsmsx155.amr.corp.intel.com (10.18.116.71) by FMSMSX108.amr.corp.intel.com (10.18.124.206) with Microsoft SMTP Server (TLS) id 14.3.319.2; Thu, 7 Sep 2017 07:48:30 -0700 Received: from shsmsx151.ccr.corp.intel.com (10.239.6.50) by FMSMSX155.amr.corp.intel.com (10.18.116.71) with Microsoft SMTP Server (TLS) id 14.3.319.2; Thu, 7 Sep 2017 07:48:29 -0700 Received: from shsmsx102.ccr.corp.intel.com ([169.254.2.39]) by SHSMSX151.ccr.corp.intel.com ([169.254.3.98]) with mapi id 14.03.0319.002; Thu, 7 Sep 2017 22:48:04 +0800 From: "Yao, Jiewen" To: Brijesh Singh , Laszlo Ersek , "Zeng, Star" , edk2-devel-01 CC: "Dong, Eric" Thread-Topic: [edk2] [PATCH 0/4] MdeModulePkg: some PCI HC drivers: unmap common buffers at ExitBootServices() Thread-Index: AQHTJO6HlEgpQf73uk+WGvJRCtH+06KkA0GAgACz0wCAAMTM8IAAA0SAgADDw8D//836AIAA9pEQgAA8GYCAADkqgIABX6yQ///xnwCAADB3AIAAhodA Date: Thu, 7 Sep 2017 14:48:03 +0000 Message-ID: <74D8A39837DF1E4DA445A8C0B3885C503A9A92BA@shsmsx102.ccr.corp.intel.com> References: <20170903195449.30261-1-lersek@redhat.com> <0C09AFA07DD0434D9E2A0C6AEB0483103B93A125@shsmsx102.ccr.corp.intel.com> <4b24b1eb-362f-3b46-97e2-bdfda53f40c9@redhat.com> <74D8A39837DF1E4DA445A8C0B3885C503A9A79BD@shsmsx102.ccr.corp.intel.com> <5f1fdc84-5824-bee2-5a1a-fbd67adf5443@redhat.com> <74D8A39837DF1E4DA445A8C0B3885C503A9A7F10@shsmsx102.ccr.corp.intel.com> <12d71f32-9dcf-4f6e-b033-d5f82104caca@redhat.com> <74D8A39837DF1E4DA445A8C0B3885C503A9A852C@shsmsx102.ccr.corp.intel.com> <24d66e81-336e-3924-8045-4749d98e2fbb@redhat.com> <94154cfe-92bf-ba6f-f25f-3963891f6932@amd.com> <74D8A39837DF1E4DA445A8C0B3885C503A9A8F0C@shsmsx102.ccr.corp.intel.com> <976f687a-1add-f86a-0061-0137f2df0eb6@amd.com> In-Reply-To: <976f687a-1add-f86a-0061-0137f2df0eb6@amd.com> Accept-Language: zh-CN, en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-version: 11.0.0.116 dlp-reaction: no-action x-originating-ip: [10.239.127.40] MIME-Version: 1.0 X-Content-Filtered-By: Mailman/MimeDel 2.1.22 Subject: Re: [PATCH 0/4] MdeModulePkg: some PCI HC drivers: unmap common buffers at ExitBootServices() X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Sep 2017 14:45:52 -0000 Content-Language: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Great. Thanks to confirm that. Now it is clear to me. Then let's wait Laszlo's new patch to make all DMA buffer to private. :) Thank you Yao, Jiewen From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Brij= esh Singh Sent: Thursday, September 7, 2017 10:40 PM To: Laszlo Ersek ; Yao, Jiewen ; Z= eng, Star ; edk2-devel-01 Cc: brijesh.singh@amd.com; Dong, Eric Subject: Re: [edk2] [PATCH 0/4] MdeModulePkg: some PCI HC drivers: unmap co= mmon buffers at ExitBootServices() Hi Jiewen, On 09/07/2017 06:46 AM, Laszlo Ersek wrote: > On 09/07/17 06:46, Yao, Jiewen wrote: >> Thanks for the sharing, Brijesh. >> >> "If a page was marked as "shared" >> then its guest responsibility to make it "private" after its done commun= icating with >> hypervisor." >> >> I believe I have same understanding - a *guest* should guarantee that. >> >> My question is: During the *transition* from firmware to OS, *which gues= t* should make the shared buffer to be private? Is it "guest firmware" or "= guest OS"? >> >> Maybe I can ask the specific question to get it more clear. >> >> 1) If the common DMA buffer is not unmapped at ExitBootService, is= that treated as an issue? >> >> 2) If the read/write DMA buffer is not unmapped at ExitBootService= , is that treated as an issue? > > Very good questions, totally to the point. > > On the authoritative answer, I defer to Brijesh. > Both the above cases (#1 and #2) are problems. Since buffers was owned by f= irmware and firmware made it "shared" hence firmware is responsible to mark them as= private after its done with the buffer. In other words, we must call Unmap() from E= xitBootServices to ensure that buffers mapped using BusMasterCommonBuffer/BusMasterRead/Bus= MasterWrite is marked as "private" before we make it available to the guest OS. (we do = similar thing in Linux OS). Having any kind of side channel to communicate the encryption status of a p= age will not work -- we should be able to support a usecase where we boot OVMF = in 64-bit but launch 32-bit Linux guest. When Linux boots in 32-bit mode it do= es not have access to encryption bit (C-bit is bit-47 in page table) and can't mark the= page as private (even if we provide some kind of side-channel). thank you very much for all your help. > ( > > My personal opinion is that both situations (#1 and #2) are problems, > because they break the *practical* security invariant for SEV guests: > > - most memory should be encrypted at all times, *and* > > - any memory that is decrypted must have an owner that is responsible > for re-encrypting the memory eventually. > > Therefore, *either* the firmware has to re-encrypt all remaining DMA > buffers at ExitBootServices(), *or* a new information channel must be > designed, from firmware to OS, to carry over the decryption status. > > I strongly prefer the first option, for the following reason: the same > questions apply to all EDK2 IOMMU protocol interfaces, not just the one > exported by the SEV driver. > > ) > > Thanks, > Laszlo > >> From: Brijesh Singh [mailto:brijesh.singh@amd.com] >> Sent: Wednesday, September 6, 2017 11:40 PM >> To: Laszlo Ersek >; Yao, Jie= wen >; Zeng, Star >; edk2-devel-01 > >> Cc: brijesh.singh@amd.com; Dong, Eric > >> Subject: Re: [edk2] [PATCH 0/4] MdeModulePkg: some PCI HC drivers: unmap= common buffers at ExitBootServices() >> >> >> >> On 09/06/2017 07:14 AM, Laszlo Ersek wrote: >>> On 09/06/17 06:37, Yao, Jiewen wrote: >>>> Thanks for the clarification. Comment in line. >>>> >>>> From: Laszlo Ersek [mailto:lersek@redhat.com] >>>> Sent: Wednesday, September 6, 2017 1:57 AM >>>> To: Yao, Jiewen >>; Zeng, Star >>; edk2-devel-01 >> >>>> Cc: Dong, Eric >>; Brijesh Singh >> >>>> Subject: Re: [edk2] [PATCH 0/4] MdeModulePkg: some PCI HC drivers: unm= ap common buffers at ExitBootServices() >>> >>>>> Then after ExitBootService, the OS will take control of CR3 and set c= orrect >>>>> encryption bit. >>>> >>>> This is true, the guest OS will set up new page tables, and in those >>>> PTEs, the C-bit ("encrypt") will likely be set by default. >>>> >>>> However, the guest OS will not *rewrite* all of the memory, with the >>>> C-bit set. This means that any memory that the firmware didn't actuall= y >>>> re-encrypt (in the IOMMU driver) will still expose stale data to the >>>> hypervisor. >>>> [Jiewen] That is an interesting question. >>>> Is there any security concern associated? >>> >>> I can't tell for sure. Answering this question is up to the proponents >>> of SEV, who have designed the threat model for SEV. >>> >>> My operating assumption is that we should keep memory as tightly >>> encrypted as possible at the firmware --> OS control transfer. It could >>> be an exaggerated expectation from my side; I'd just better be safe tha= n >>> sorry :) >>> >>> >> >> Let me give some brief intro on SEV (Secure Encrypted Virtualization) an= d then >> we can discuss a security model (if needed). >> >> SEV is an extension to the AMD-V architecture which supports running enc= rypted >> virtual machines (VMs) under the control of a hypervisor. Encrypted VMs = have their >> pages (code and data) secured such that only the guest itself has access= to the >> unencrypted version. Each encrypted VMs is associated with a unique encr= yption >> key; if its data is accessed by a different entity using a different key= the >> encrypted guest data will be incorrectly decrypted, leading to unintelli= gible data. >> You can also find some detail for SEV in white paper [1]. >> >> SEV encrypted Vs have the concept of private and shared memory. The priv= ate memory >> is encrypted with the guest-specific key, while shared memory may be enc= rypted >> with hypervisor key. SEV guest VMs can choose which pages they would lik= e to >> be private. But the instruction pages and guest page tables are always t= reated >> as private by the hardware. The DMA operation inside the guest must be p= erformed >> on shared pages -- this is mainly because in virtualization world the de= vice >> DMA needs some assistance from the hypervisor. >> >> The security model provided by the SEV ensure that hypervisor will no lo= nger able >> to inspect or alter any guest code or data. The guest OS controls what i= t want to >> share with outside world (in this case with Hypervisor). >> >> In software implementation we took approach to encrypt everything possib= le starting >> early in boot. In this approach whenever OVMF wants to perform certain D= MA operations >> it allocate a shared page, issues the command, free the shared page afte= r the command >> is completed (in other word we use sw bounce buffer to complete the DMA = operation). >> >> We have implemented IOMMU driver to provide the following functions: >> >> AllocateBuffer(): >> -------------------- >> it allocate a private pages, as per UEFI spec the driver will map the bu= ffer allocated >> from this routine using BusMasterCommonBuffer hence we allocate extra st= ash pages >> in addition to requested pages. >> >> >> Map >> ---- >> If requested operation is BusMasterRead or BusMasterWrite then we alloca= te a shared buffer >> and DeviceAddress point to shared buffer. >> >> If requested operation is BusMasterCommonBuffer then we perform in-place= decryption of the >> contents and update the page-table to clear the C-bit (basically make it= shared page) >> >> Unmap >> ------ >> If operation was BusMasterRead or BusMasterWrite then we complete the un= mapping and free >> the shared buffer allocated in Map(). >> >> If operation was BusMasterCommonBuffer then we perform in-place encrypti= on and set the C-bit >> (basically make the page private) >> >> FreeBuffer: >> ----------- >> Free the pages >> >> >> Lets run with the below scenario: >> >> 1) guest marks a page as "shared" and gives its physical address to HV (= e.g DMA read) >> 2) hypervisor completes the request operation >> 3) hypervisor caches the shared physical address and monitor it for info= rmation leak >> 4) sometime later if guest write data in its "shared" physical address t= hen hypervisor can >> easily read it without guest knowledge. >> >> SEV hardware can protect us against the attack where someone tries to in= ject or alter the >> guest code. As I noted above any instruction fetch is forced as private = hence if attacker >> write some code into a shared buffer and point the RIP to his/her code t= hen instruction >> fetch will try to decrypt it and get unintelligible opcode. If a page wa= s marked as "shared" >> then its guest responsibility to make it "private" after its done commun= icating with >> hypervisor. >> >> [1] http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_M= emory_Encryption_Whitepaper_v7-Public.pdf >> >> >>>> If the C-bit is cleared for a read/write buffer, is the content in the >>>> read/write buffer also exposed to hypervisor? >>> >>> Not immediately. Only when the guest also rewrites the memory through >>> the appropriate virtual address. >>> >>> Basically, in the virtual machine, the C-bit in a PTE that covers a >>> particular guest virtual address (GVA) controls whether a guest write >>> through that GVA will result in memory encryption, and if a gues read >>> through that GVA will result in memory decryption. >>> >>> The current state of the C-bit doesn't matter for the hypervisor, what >>> matters is the last guest write through a GVA whose PTE had the C-bit >>> set, or clear. If the C-bit was clear when the guest last wrote the >>> area, then the hypervisor can read the data. If the C-bit was set, then >>> the hypervisor can only read garbage. >>> >>> >>>> I means if there is security concern, the concern should be applied to >>>> both common buffer and read/write buffer. >>>> Then we have to figure a way to resolve both buffer. >>> >>> Yes, this is correct. The PciIo.Unmap operation, as currently >>> implemented in OvmfPkg/IoMmuDxe/, handles the encryption/decryption >>> correctly for all operations, but it can only guarantee *not* freeing >>> memory for CommonBuffer. So Unmap()ing CommonBuffer at ExitBootServices >>> is safe, while Unmap()ing Read/Write is not. The encryption would be >>> re-established just fine for Read/Write as well, but we would change th= e >>> UEFI memmap. >>> >>> In OVMF, we currently manage this problem by making all asynchronous DM= A >>> mappings CommonBuffer, even if they could othewise be Read or Write. We >>> use Read and Write only if the DMA operation completes before the >>> higher-level protocol function returns (so we can immediately Unmap the >>> operation, and the ExitBootServices() handler doesn't have to care). >>> >>> >>>> To be honest, that is exactly my biggest question on this patch: >>>> why do we only handle the common buffer specially? >>> >>> For the following reason: >>> >>> - Given that CommonBuffer mappings take a separate AllocateBuffer() / >>> FreeBuffer() call-pair, around Map/Unmap, we can *reasonably ask* PciIo >>> implementors -- beyond what the UEFI spec requries -- to keep *all* >>> dynamic memory management out of Map/Unmap. If they need dynamic memory >>> management, we ask them to do it in AllocateBuffer/FreeBuffer instead. >>> This way Unmap is safe in ExitBootServices handlers. >>> >>> - We cannot *reasonably ask* PciIo implementors to provide the same >>> guarantee for Read and Write mappings, simply because there are no othe= r >>> APIs that could manage the dynamic memory for Map and Unmap -- >>> AllocateBuffer and FreeBuffer are not required for Read and Write >>> mappings. PciIo implementors couldn't agree to our request even if they >>> wanted to. Therefore Unmapping Read/Write operations is inherently >>> unsafe in EBS handlers. >>> >>> >>>>> NOTE: The device should still be halt state, because the device is >>>>> also controlled by OS. >>>> >>>> This is the key point -- the device is *not* controlled by the guest O= S, >>>> in the worst case. >>>> >>>> The virtual device is ultimately implemented by the hypervisor. We don= 't >>>> expect the virtual device (=3D the hypervisor) to mis-behave on purpos= e. >>>> However, if the hypervisor is compromised by an external attacker (for >>>> example over the network, or via privilege escalation from another >>>> guest), then all guest RAM that has not been encrypted up to that poin= t >>>> becomes visible to the attacker. In other words, the hypervisor ("the >>>> device") becomes retro-actively malicious. >>>> [Jiewen] If that is the threat model, I have a question on the exposur= e: >>>> 1) If the concern is for existing data, it means all DMA data already = written. >>>> However, the DMA data is already consumed or produced by virtual devic= e >>>> or a hypervisor. If the hypervisor is attacked, it already gets all th= e data content. >>> >>> Maybe, maybe not. The data may have been sent to a different host over >>> the network, and wiped from memory. >>> >>> Or, the data may have been written to a disk image that is separately >>> encrypted by the host OS, and has been detached (unplugged) from the >>> guest, and also unmounted from the host, with the disk key securely >>> wiped from host memory. >>> >>> We can also imagine the reverse direction. Assume that the user of the >>> virtual machine goes into the UEFI shell in OVMF, starts the EDIT >>> program, and types some secret information into a text file on the ESP. >>> Further assume that the disk image is encrypted on the host OS. It is >>> conceivable that fragments of the secret could remain stuck in the AHCI >>> (disk) and USB (keyboard) DMA buffers. >>> >>> Maybe you think that these are "made up" examples, and I agree, I just >>> made them up. The point is, it is not my place to discount possible >>> attack vectors (I haven't designed SEV). Attackers will be limited by >>> their *own* imaginations only, not by mine :) >>> >>> >>>> 2) if the concern is for future data, it means all user data to be wri= tten. >>>> However, the C-bit already prevent that. >>> >>> As far as I understand SEV, future data is out of scope. The goal is to >>> prevent *retroactive* guest data leaks (=3D whatever is currently in ho= st >>> OS memory) if an attacker compromises an otherwise non-malicious hyperv= isor. >>> >>> If an attacker compromises a virtualization host, then they can just si= t >>> silent and watch data flow into and out of guests from that point >>> onward, because they can then monitor all DMA (which always happens in >>> clear-text) real-time. >>> >>> >>>> Maybe I do not fully understand the threat model defined. >>>> If you can explain more, that would be great. >>> >>> Well I tried to explain *my* understanding of SEV :) I hope Brijesh wil= l >>> correct me if I'm wrong. >>> >>> >>>> The point of SEV is to keep as much guest data encrypted at all times = as >>>> possible, so that *whenever* the hypervisor is compromised by an >>>> attacker, the guest memory -- which the attacker can inevitably dump >>>> from the host side -- remains un-intellegible to the attacker. >>>> [Jiewen] OK. If this is a security question, I suggest we define a cle= ar >>>> threat model at first. >>>> Or what we are doing might be unnecessary or insufficient. >>> >>> I agree. >>> >>> As I said above, my operating principle has been to re-encrypt all DMA >>> buffers as soon as possible. For long-standing DMA buffers, re-encrypt >>> them at ExitBootServices at the latest. >>> >>> >>>> [Jiewen] For "require that Unmap() work for CommonBuffer >>>> operations without releasing memory", I would hold my opinion until >>>> it is documented clearly in UEFI spec. >>> >>> You are right, of course. >>> >>> It's just that we cannot avoid exploring ways, for this feature, that >>> currently fall outside of the spec. Whatever we do here will be outside >>> of the spec, one way or another. I suggested what I thought could be a >>> reasonable extension to the spec, one that could be implemented by PciI= o >>> implementors even before the spec is modified. >>> >>> edk2's PciIo.Unmap already behaves like this, without breaking the spec= . >>> >>> If there's a better way -- for example modifying CoreExitBootServices() >>> in edk2, to signal IOMMU drivers separately, *after* notifying other >>> ExitBootServices() handlers --, that might work as well. >>> >>> >>>> My current concern is: >>>> First, this sentence is NOT defined in UEFI specification. >>> >>> Correct. >>> >>> >>>> Second, it might break an existing implementation or existing feature,= such as tracing. >>> >>> Correct. >>> >>>> Till now, I did not see any restriction in UEFI spec say: In this func= tion, you are not allowed >>>> to call memory services. >>>> The only restriction is >>>> 1) TPL restriction, where memory service must be <=3D TPL_NOTIFY. >>>> 2) AP restriction, where no UEFI service/protocol is allowed for AP. >>> >>> I agree. >>> >>> >>>> I'm just trying to operate with the APIs currently defined by the UEFI >>>> spec, and these assumptions were the best I could come up with. >>>> [Jiewen] The PCI device driver must follow and *only* follow UEFI spec= . >>>> Especially the IHV Option ROM should not consume any private API. >>> >>> I disagree about "only follow". If there are additional requirements >>> that can be satisfied without breaking the spec, driver implementors ca= n >>> choose to conform to both sets of requirements. >>> >>> For example (if I understand correctly), Microsoft / Windows present a >>> bunch of requirements *beyond* the UEFI spec, for both platform and >>> add-on card firmware. Most vendors conform :) >>> >>> >>>>> [Jiewen] I am not sure who will control "When control is transferred = to the OS, all >>>>> guest memory should be encrypted again, even those areas that were on= ce >>>>> used as CommonBuffers." >>>>> For SEV case, I think it should be controlled by OS, because it is ju= st CR3. >>>> >>>> If it was indeed just CR3, then I would fully agree with you. >>>> >>>> But -- to my understanding --, ensuring that the memory is actually >>>> encrypted requires that the guest *write* to the memory through a >>>> virtual address whose PTE has the C-bit set. >>>> >>>> And I don't think the guest OS can be expected to rewrite all of its >>>> memory at launch. :( >>>> >>>> [Jiewen] That is fine. >>>> I suggest we get clear on the threat model as the first step. >>>> The threat model for SEV usage in OVMF. >>>> >>>> I am sorry if that is already discussed before. I might ignore the con= versation. >>> >>> No problem; it's always good to re-investigate our assumptions and >>> operating principles. >>> >>> >>>> If you or any SEV owner can share the insight, that will be great. >>>> See my question above "If that is the threat model, I have a question = on the exposure:..." >>> >>> I shared *my* impressions of the threat model (which are somewhat >>> unclear, admittedly, and thus could make me overly cautious). >>> >>> I hope Brijesh can extend and/or correct my description. >>> >>> >>>>> So apparently the default behaviors of the VT-d IOMMU and the SEV IOM= MU >>>>> are opposites -- VT-d permits all DMA unless configured otherwise, wh= ile >>>>> SEV forbids all DMA unless configured otherwise. >>>>> [Jiewen] I do not think so. The default behaviors of VT-d IOMMU is to= disable all DMA access. >>>>> I setup translation table, but mark all entry to be NOT-PRESENT. >>>>> I grant DMA access only if there is a specific request by a device. >>>>> >>>>> I open DMA access in ExitBootServices, just want to make sure it is c= ompatible to >>>>> the existing OS, which might not have knowledge on IOMMU. >>>>> I will provide a patch to make it a policy very soon. As such VTd IOM= MU may >>>>> turn off IOMMU, or keep it enabled at ExitBootService. >>>>> An IOMMU aware OS may take over IOMMU directly and reprogram it. >>>> >>>> Thanks for the clarification. >>>> >>>> But, again, will this not lead to the possibility that the VT-d IOMMU'= s >>>> ExitBootServices() handler disables all DMA *before* the PCI drivers g= et >>>> a chance to run their own ExitBootServices() handlers, disabling the >>>> individual devices? >>>> [Jiewen] Sorry for the confusing. Let me explain: >>>> No, VTd never disables *all* DMA buffer at ExitBootService. >>>> >>>> "disable VTd" means to "disable IOMMU protection" and "allow all DMA". >>>> "Keep VTd enabled" means to "keep IOMMU protection enabled" and >>>> "only allow the DMA from Map() request". >>> >>> Okay, but this interpretation was exactly what I thought of first (see >>> above): "VT-d permits all DMA unless configured otherwise". You answere= d >>> that it wasn't the case. >>> >>> So basically, if I understand it correctly now, at ExitBootServices the >>> VT-d IOMMU driver opens up all RAM for DMA access. Is that correct? >>> >>> That is equivalent to decrypting all memory under SEV, and is the exact >>> opposite of what we want. (As I understand it.) >>> >>> >>>> If that happens, then a series of IOMMU faults could be generated, whi= ch >>>> I described above. (I.e., such IOMMU faults could result, at least in >>>> the case of SEV, in garbage being written to disk, via queued commands= .) >>>> [Jiewen] You are right. That would not happen. >>>> IOMMU driver should not bring any compatibility problem for the PCI dr= iver, >>>> iff the PCI driver follows the UEFI specification. >>> >>> Agreed. >>> >>> >>>> Now, to finish up, here's an idea I just had. >>>> >>>> Are we allowed to call gBS->SignalEvent() in an ExitBootServices() >>>> notification function? >>>> >>>> If we are, then we could do the following: >>>> >>>> * PciIo.Unmap() and friends would work as always (no restrictions on >>>> dynamic memory allocation or freeing, for any kind of DMA operatio= n). >>>> >>>> * PCI drivers would not be expected to call PciIo.Unmap() in their >>>> ExitBootServices() handlers. >>>> >>>> * The IOMMU driver would have an ExitBootServices() notification >>>> function, to be enqueued at the TPL_CALLBACK or TPL_NOTIFY level >>>> (doesn't matter which). >>>> >>>> * In this notification function, the IOMMU driver would signal *anothe= r* >>>> event (a private one). The notification function for this event wo= uld >>>> be enqueued strictly at the TPL_CALLBACK level. >>>> >>>> * The notification function for the second event (private to the IOMMU= ) >>>> would iterate over all existent mappings, and unmap them, without >>>> allocating or freeing memory. >>>> >>>> The key point is that by signaling the second event, the IOMMU driver >>>> could order its own cleanup code after all other ExitBootServices() >>>> callbacks. (Assuming, at least, that no other ExitBootServices() >>>> callback employs the same trick to defer itself!) Therefore by the tim= e >>>> the second callback runs, all PCI devices have been halted, and it is >>>> safe to tear down the translations. >>>> >>>> The ordering would be ensured by the following: >>>> >>>> - The UEFI spec says under CreateEventEx(), "All events are guaranteed >>>> to be signaled before the first notification action is taken." Thi= s >>>> means that, by the time the IOMMU's ExitBootServices() handler is >>>> entered, all other ExitBootServices() handlers have been *queued* = at >>>> least, at TPL_CALLBACK or TPL_NOTIFY. >>>> >>>> - Therefore, when we signal our second callback from there, for >>>> TPL_CALLBACK, the second callback will be queued at the end of the >>>> TPL_CALLBACK queue. >>>> >>>> - CreateEventEx() also says that EFI_EVENT_GROUP_EXIT_BOOT_SERVICES "i= s >>>> functionally equivalent to the EVT_SIGNAL_EXIT_BOOT_SERVICES flag >>>> for the Type argument of CreateEvent." So it wouldn't matter if a >>>> driver used CreateEvent() or CreateEventEx() for setting up its ow= n >>>> handler, the handler would be queued just the same. >>>> >>>> I think this is ugly and brittle, but perhaps the only way to clean up >>>> *all* translations safely, with regard to PciIo.Unmap() + >>>> ExitBootServices(), without updating the UEFI spec. >>>> >>>> What do you think? >>>> >>>> [Jiewen] If the order is correct, and all PCI device driver is halt at= ExitBootServices, that works. >>>> We do not need update PCI driver and we do not need update UEFI spec. >>>> We only need update IOMMU driver which is concerned, based upon the th= reat model. >>>> That probably is best solution. :-) >>> >>> I'm very glad to hear that you like the idea. >>> >>> However, it depends on whether we are permitted, by the UEFI spec, to >>> signal another event in an ExitBootServices() notification function. >>> >>> Are we permitted to do that? >>> >>> Does the UEFI spec guarantee that the notification function for the >>> *second* event will be queued just like it would be under "normal" >>> circumstances? >>> >>> (I know we must not allocate or free memory in the notification functio= n >>> of the *second* event either; I just want to know if the second event's >>> handler is *queued* like it would normally be.) >>> >>> >>>> I assume you want to handle both common buffer and read/write buffer, = right? >>> >>> Yes. Under this idea, all kinds of operations would be cleaned up. >>> >>> >>>> And if possible, I still have interest to get clear on the threat mode= l for SEV in OVMF. >>>> If you or any SEV owner can share ... >>> >>> Absolutely. Please see above. >>> >>> Thank you! >>> Laszlo >>> >> _______________________________________________ >> edk2-devel mailing list >> edk2-devel@lists.01.org >> https://lists.01.org/mailman/listinfo/edk2-devel >> > _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel