From: Laszlo Ersek <lersek@redhat.com>
To: "Yao, Jiewen" <jiewen.yao@intel.com>,
"Zeng, Star" <star.zeng@intel.com>,
edk2-devel-01 <edk2-devel@lists.01.org>
Cc: "Dong, Eric" <eric.dong@intel.com>
Subject: Re: [PATCH 0/4] MdeModulePkg: some PCI HC drivers: unmap common buffers at ExitBootServices()
Date: Tue, 5 Sep 2017 11:15:47 +0200 [thread overview]
Message-ID: <5f1fdc84-5824-bee2-5a1a-fbd67adf5443@redhat.com> (raw)
In-Reply-To: <74D8A39837DF1E4DA445A8C0B3885C503A9A79BD@shsmsx102.ccr.corp.intel.com>
On 09/05/17 04:18, Yao, Jiewen wrote:
> HI Laszlo
> Thank you very much to raise this DMA topic.
>
> I have a chat with Star. We have couples of question, from high level to low level.
> Hope you can clarify:
>
>
> 1) About disable DMA
>
> As you mentioned, in UEFI_Driver_Writer_Guide_V1.0.1_120308.pdf, 7.7 Adding the Exit Boot Services feature
> =========================
> Examples from the EDK II that use this feature are the PCI device drivers for USB Host
> Controllers. Some USB Host Controllers are PCI Bus Masters that continuously access a
> memory buffer to poll for operation requests. Access to this memory buffer by a USB
> Host Controller may be required to boot an operation system, but this activity must be
> terminated when the OS calls ExitBootServices(). The typical action in the Exit Boot
> Services Event for these types of drivers is to disable the PCI bus master and place the
> USB Host Controller into a halted state
> =========================
>
> I believe the fundamental requirement is to "disable DMA".
> As such, the simplest action is to clear PCI Bus Master Enable bit in the command register.
>
> Do you think clearing BME is enough?
> Is there any special reason that we must unmap the DMA buffer?
Modifying the device state is enough to ask the device *nicely* to stop
doing DMA.
But, I think, if we are on an IOMMU-protected system, where part of
PciIo.Map() is to add a translation (= system memory access permission)
to the IOMMU, then we should also revoke this permission explicitly,
*after* we asked the device nicely to stop doing DMA.
Basically, ask the device nicely first, and then enforce the protection.
... BTW, I mentioned "IOMMU faults" earlier, and I have some further
thoughts on them. This is about tearing down IOMMU translations *first*,
in the IOMMU driver's ExitBootServices() handler, *before* the PCI
drivers get a chance -- in *their* ExitBootServices() handlers -- to ask
the PCI devices nicely to stop doing DMA.
I think a quite problematic failure mode is possible here. Assume that a
disk controller has some write commands queued (which are mapped with
Bus Master Read or Bus Master CommonBuffer operations). If the
ExitBootServices() handler for the IOMMU driver runs first, then the
device could act upon these queued commands with missing IOMMU
translations, before the device's own PCI driver asked the device to
shut down DMA. In some cases, I guess this could result in those queued
write commands simply failing, without ill effects. However, dependent
on the IOMMU implementation, the write commands might not fail
explicitly, but read garbage from system memory, and write garbage to
disk blocks. That would be bad.
I think this is exactly what could happen with the SEV IOMMU we have.
The memory would be first re-encrypted (this is equivalent to revoking a
translation), and then the write commands (queued in the emulated AHCI
controller, in the hypervisor) would read garbage from guest memory, and
write garbage to the virtual disk.
>
>
> 2) About common buffer
> If there is a special requirement that we must unmap the DMA buffer, why we only unmap the common buffer?
>
> I understand your statement that unmapping a read buffer or write buffer may cause free memory action. But that is implementation choice.
I believe I disagree with this (i.e., I think it is not an
implementation choice).
In the general case, we can assume that PciIo.Map() allocates a bounce
buffer internally, when Read or Write action is requested.
In turn, PciIo.Unmap() has to release the same buffer, internally.
PciIo.Unmap() does not know if it is called from an ExitBootServices()
notification function, or from a normal context. It cannot decide if it
is allowed to release memory, or not. There's no additional parameter to
provide this hint either. If PciIo.Unmap() never releases buffers, then
ExitBootServices() will be happy, but we're going to leak memory.
Now, PciIo.Unmap() could move the bounce buffer to some internal "free
list" instead, rather than freeing it. And, the next PciIo.Map() call
could try to reuse a buffer from the "free list". But this is difficult
to do, because the buffers may have any size, and recycling them would
require an internal, general "page pool", in parallel to the Boot
Services page pool (AllocatePages, FreePages).
If PciIo.Unmap() could be informed that it is running in
ExitBootServices() notification context -- called from PCI drivers --,
then unmapping Read and Write operations would be simple.
> If we have strong requirement to unmap the DMA buffer, we should achieve that.
I totally agree that we *should* achieve that, but I don't see how the
current PciIo.Map() and PciIo.Unmap() APIs make it possible, for Read
and Write operations.
For CommonBuffer operations, it is possible, because AllocateBuffer()
and FreeBuffer() are separate actions, and an ExitBootServices()
notification function can *avoid* calling FreeBuffer. (But only for
CommonBuffer operations.) PciIo.Unmap() cannot be told *not* to release
a bounce buffer for Read/Write.
>
>
> 3) About interaction with IOMMU
> I am not worried about IOMMU interaction.
> I believe an proper IOMMU implementation should not block any action taken by a PCI device driver.
> Current IntelVTdDxe driver disables VTd engine in ExitBootService, which means the DMA access is no longer managed by VTd engine. It is NOT related to disable DMA buffer.
Ah, I understand!
So, on one hand, that is totally safe, for PCI device drivers. It means
all pending DMA can get through, until the PCI drivers ask the devices
(nicely) to stop doing DMA.
On the other hand, it is the opposite of the security goal of SEV. With
SEV, the default state is "all DMA forbidden" (= all guest memory
encrypted). This is also the status that we'd like to achieve when
ExitBootServices() completes. When control is transferred to the OS, all
guest memory should be encrypted again, even those areas that were once
used as CommonBuffers.
So apparently the default behaviors of the VT-d IOMMU and the SEV IOMMU
are opposites -- VT-d permits all DMA unless configured otherwise, while
SEV forbids all DMA unless configured otherwise.
> 4) About the driver you modified
> I believe we have some other PCI device driver (NVMe/SDMMC/UFS), besides the ones you modified in this patch (ATA/UCHI/EHCI/XHCI).
> If we need take this approach, I assume they also need update, right?
Yes, they should be updated similarly.
SDMMC and UFS are not included in OVMF, so I couldn't test their behavior.
NVMe is included in OVMF, but it is used very rarely in practice (we can
call it a corner case), so I thought I would first submit patches for
the four main drivers (UHCI, EHCI, XHCI, AHCI) that are frequently used
with OVMF and QEMU. Tracking down the CommonBuffer lifecycles was
difficult :)
Thank you!
Laszlo
>
> Thank you
> Yao Jiewen
>
>
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
> Sent: Tuesday, September 5, 2017 5:20 AM
> To: Zeng, Star <star.zeng@intel.com>; edk2-devel-01 <edk2-devel@lists.01.org>
> Cc: Dong, Eric <eric.dong@intel.com>
> Subject: Re: [edk2] [PATCH 0/4] MdeModulePkg: some PCI HC drivers: unmap common buffers at ExitBootServices()
>
> On 09/04/17 12:36, Zeng, Star wrote:
>> Curious about one question when I am reviewing this patch series.
>>
>> Does/Should the IOMMU component disable the address translation at ExitBootServices?
>
> It is a very good and interesting question.
>
> My answer is, I don't know. The IOMMU protocol abstraction is specific
> to edk2, and its behavior is not documented in the PI or UEFI specs (as
> far as I know).
>
> For one example, the VT-d IOMMU implementation in
>
> IntelSiliconPkg/IntelVTdDxe
>
> zaps all translations at ExitBootServices():
>
> VOID
> EFIAPI
> OnExitBootServices (
> IN EFI_EVENT Event,
> IN VOID *Context
> )
> {
> DEBUG ((DEBUG_INFO, "Vtd OnExitBootServices\n"));
> DumpVtdRegsAll ();
> DisableDmar ();
> DumpVtdRegsAll ();
> }
>
> Now, in case someone suggested that the same approach should be followed
> by all IOMMU implementations, I'd have three counter-arguments:
>
> (1) My series states the problem (and seeks to solve it) on the PciIo
> protocol level, which is standardized. The IOMMU protocol is edk2-only.
> I feel that this kind of "unmap on ExitBootServices" should be done by
> all UEFI drivers that map DMA common buffers, regardless of
> platform-specific PciIo implementations.
>
> (2) ExitBootServices() notify functions are called in unspecified order
> (there's no way to express dependencies between them). This means that,
> following the above pattern, an IO address translation could be torn
> down before a reliant PCI device is de-configured in its own UEFI_DRIVER
> ExitBootServices() handler. The result can be a series of IOMMU faults.
> I'm not sure if that's bad in practice, but it does look like a layering
> violation. (We shouldn't tear down resources from the bottom up.)
>
> (3) The DisableDmar() function is relatively simple. It uses VT-d
> registers that exist independently of any PCI devices, and of any "live"
> mappings.
>
> Some IOMMU protocol implementations might be different -- for them the
> live configuration might only be known by constantly tracking the full
> set of Map() and Unmap() operations. That's a hurdle for the implementation.
>
> Furthermore, what is supposed to happen if both the IOMMU driver and the
> individual PCI driver destroy the mapping at ExitBootServices()? If the
> PCI driver's notification function runs first, then the IOMMU driver can
> mark the mapping as "unmapped", and ignore it in its own notification
> function. If the IOMMU driver's notification function runs first, then
> it can again only mark the mapping as "unmapped", so that when the PCI
> driver tries to unmap it as well, things don't blow up. This seems to
> imply that an unmapped mapping data structure can never be recycled,
> because we never know what action might follow. This sounds very
> confusing to me.
>
>
> Now, there are two arguments in favor of the IOMMU ExitBootServices()
> callback as well:
>
> - Security. It doesn't matter if some driver forgets to de-configure a
> translation, the IOMMU driver won't. Layering violations be damned.
>
> - In ExitBootServices() notification functions, the UEFI memory map must
> not be altered (memory cannot be released). This means that Bus Master
> Write and Bus Master Read operations, for pending transfers, cannot be
> unmapped (only CommonBuffer operations can be), because unmapping
> Write/Read might free memory (bounce buffers) *implicitly*. For
> CommonBuffer operations, separate PciIo.FreeBuffer() calls are necessary
> for releasing memory, thus the notification functions can simply *not*
> call FreeBuffer(), to stay safe. But this is not possible for pending
> Read/Write operations.
>
>
> My current thinking is that (a) PCI drivers should unmap pending Common
> Buffer operations at ExitBootServices; (b) PCI drivers should never
> return from their protocol member functions with pending Write/Read
> operations, only CommonBuffer operations (put differently, if the
> operation is asynchronous on the higher protocol level, then make the
> underlying BMDMA operation CommonBuffer); (c) IOMMU protocol
> implementations should not be required to clean up translations, only
> allowed to, if they can do it without interfering with (a).
>
> Basically, the question is, who *owns* the translations. Based on the
> PciIo interfaces, and on the requirement that PCI drivers Map() and
> Unmap() buffers, my opinion is that the PCI drivers own the translations.
>
> If you think that we should take this question to the USWG, I'm OK with
> that.
>
> Thanks,
> Laszlo
>
>
>
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
>> Sent: Monday, September 4, 2017 3:55 AM
>> To: edk2-devel-01 <edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>>
>> Cc: Dong, Eric <eric.dong@intel.com<mailto:eric.dong@intel.com>>; Zeng, Star <star.zeng@intel.com<mailto:star.zeng@intel.com>>
>> Subject: [edk2] [PATCH 0/4] MdeModulePkg: some PCI HC drivers: unmap common buffers at ExitBootServices()
>>
>> Repo: https://github.com/lersek/edk2.git
>> Branch: pci_host_controllers_unmap
>>
>> This series updates four PCI Host Controller drivers so that they don't leave CommonBuffer mappings hanging when control is transfered to the OS.
>>
>> Cc: Brijesh Singh <brijesh.singh@amd.com<mailto:brijesh.singh@amd.com>>
>> Cc: Eric Dong <eric.dong@intel.com<mailto:eric.dong@intel.com>>
>> Cc: Star Zeng <star.zeng@intel.com<mailto:star.zeng@intel.com>>
>>
>> Thanks
>> Laszlo
>>
>> Laszlo Ersek (4):
>> MdeModulePkg/UhciDxe: unmap BM common buffers when exiting boot
>> services
>> MdeModulePkg/EhciDxe: unmap BM common buffers when exiting boot
>> services
>> MdeModulePkg/XhciDxe: unmap BM common buffers when exiting boot
>> services
>> MdeModulePkg/AtaAtapiPassThru: unmap common buffers at
>> ExitBootServices()
>>
>> MdeModulePkg/Bus/Ata/AtaAtapiPassThru/AhciMode.h | 18 ++++++
>> MdeModulePkg/Bus/Ata/AtaAtapiPassThru/AtaAtapiPassThru.h | 5 ++ MdeModulePkg/Bus/Ata/AtaAtapiPassThru/AtaAtapiPassThru.c | 67 +++++++++++++++++++-
>> MdeModulePkg/Bus/Pci/EhciDxe/Ehci.c | 25 +++++++-
>> MdeModulePkg/Bus/Pci/UhciDxe/Uhci.c | 25 +++++++-
>> MdeModulePkg/Bus/Pci/XhciDxe/Xhci.c | 31 +++++++++
>> 6 files changed, 168 insertions(+), 3 deletions(-)
>>
>> --
>> 2.14.1.3.gb7cf6e02401b
>>
>> _______________________________________________
>> edk2-devel mailing list
>> edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>
>> https://lists.01.org/mailman/listinfo/edk2-devel
>>
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>
> https://lists.01.org/mailman/listinfo/edk2-devel
>
next prev parent reply other threads:[~2017-09-05 9:13 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-03 19:54 [PATCH 0/4] MdeModulePkg: some PCI HC drivers: unmap common buffers at ExitBootServices() Laszlo Ersek
2017-09-03 19:54 ` [PATCH 1/4] MdeModulePkg/UhciDxe: unmap BM common buffers when exiting boot services Laszlo Ersek
2017-09-03 19:54 ` [PATCH 2/4] MdeModulePkg/EhciDxe: " Laszlo Ersek
2017-09-03 19:54 ` [PATCH 3/4] MdeModulePkg/XhciDxe: " Laszlo Ersek
2017-09-03 19:54 ` [PATCH 4/4] MdeModulePkg/AtaAtapiPassThru: unmap common buffers at ExitBootServices() Laszlo Ersek
2017-09-04 10:36 ` [PATCH 0/4] MdeModulePkg: some PCI HC drivers: " Zeng, Star
2017-09-04 21:19 ` Laszlo Ersek
2017-09-05 2:18 ` Yao, Jiewen
2017-09-05 9:15 ` Laszlo Ersek [this message]
2017-09-05 13:44 ` Yao, Jiewen
2017-09-05 17:57 ` Laszlo Ersek
2017-09-06 4:37 ` Yao, Jiewen
2017-09-06 12:14 ` Laszlo Ersek
2017-09-06 15:39 ` Brijesh Singh
2017-09-07 4:46 ` Yao, Jiewen
2017-09-07 11:46 ` Laszlo Ersek
2017-09-07 14:40 ` Brijesh Singh
2017-09-07 14:48 ` Yao, Jiewen
2017-09-07 16:40 ` Laszlo Ersek
2017-09-07 4:34 ` Yao, Jiewen
2017-09-07 12:11 ` Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5f1fdc84-5824-bee2-5a1a-fbd67adf5443@redhat.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox