From: Brijesh Singh <brijesh.singh@amd.com>
To: Laszlo Ersek <lersek@redhat.com>, edk2-devel@lists.01.org
Cc: brijesh.singh@amd.com, Tom Lendacky <thomas.lendacky@amd.com>,
Jordan Justen <jordan.l.justen@intel.com>,
Jason Wang <jasowang@redhat.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
Gerd Hoffmann <kraxel@redhat.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: Re: [RFC v1 0/3] Add VIRTIO_F_IOMMU_PLATFORM support
Date: Tue, 25 Jul 2017 18:42:21 -0500 [thread overview]
Message-ID: <4e7e2808-318e-90a2-6c15-a75e59086640@amd.com> (raw)
In-Reply-To: <841bec5f-6f6e-8b1f-25ba-0fd37a915b72@redhat.com>
Hi Laszlo,
On 07/25/2017 01:17 PM, Laszlo Ersek wrote:
> Adding Ard
>
> On 07/20/17 00:09, Brijesh Singh wrote:
>> I have found that OVMF fails to detect the disk when iommu_platform is set from
>> qemu cli. The failure occurs during the feature bit negotiation.
>>
>> Recently, EDKII introduced IOMMU protocol d1fddc4533bf. SEV patch series introduced
>> a IoMmu protocol driver f9d129e68a45 to set a DMA access attribute and methods to
>> allocate, free, map and unmap the DMA memory for the master bus devices
>>
>> In this patch series, I have tried to enable the IOMMU_PLATFORM feature for
>> VirtioBlkDevice. I am sending this as RFC to seek feedback before I extend the support
>> for other Virtio devices. The patch has been tested in SEV guest - mainly because
>> IoMmuDxe driver installs the IOMMU protocol for SEV guest only. If needed, I can
>> extend the IoMmuDxe driver to install IOMMU protocol for non SEV guests.
>>
>> qemu cli used for testing:
>>
>> # $QEMU \
>> ...
>> -drive file=${HDA_FILE},if=none,id=disk0,format=qcow2 \
>> -device virtio-blk-pci,drive=disk0,disable-legacy=on,iommu_platform=true,disable-modern=off,scsi=off
>> ...
>>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jason Wang <jasowang@redhat.com>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>>
>> Brijesh Singh (3):
>> OvmfPkg/Include/Virtio10: Define VIRTIO_F_IOMMU_PLATFORM feature bit
>> OvmfPkg/VirtioLib: Add IOMMU_PLATFORM support
>> OvmfPkg/VirtioBlkDxe: Add VIRITO_F_IOMMU_PLATFORM support
>
> In the course of processing my post-PTO email backlog, yesterday I
> skimmed this topic very quickly, and thought about it for an hour or so
> (while away for the computer). Here's my take on it.
>
>
> (1) The closest to any formal language that I found for
> VIRTIO_F_IOMMU_PLATFORM are:
>
> https://issues.oasis-open.org/browse/VIRTIO-154
> https://lists.oasis-open.org/archives/virtio-dev/201610/msg00121.html
>
> Are these references up-to-date? The ticket also references SVN r587 as
> the resolution, but that link is dead.
>
>
> (2) A few weeks back I saw Jason's SeaBIOS patch (also pointed out to me
> more recently by Brijesh, in private):
>
> [SeaBIOS] [PATCH] virtio: IOMMU support
>
> I don't understand this patch. (I also don't understand the
> "iommu_platform" device property on the QEMU command line.) According to
> the spec language quoted from the mailing list above, we have four cases:
>
> (2.1) device does not offer VIRITO_F_IOMMU_PLATFORM --> everything works
> like before
>
> (2.2) device offers VIRITO_F_IOMMU_PLATFORM, but the driver does not
> negotiate it --> device is allowed to reject functioning
>
> * device offers VIRITO_F_IOMMU_PLATFORM and the driver negotiates it:
> there are two possibilities:
> (2.3) the driver *disables* the IOMMU, and works like before
> (2.4) the driver *configures* the IOMMU and works accordingly
>
> The SeaBIOS patch negotiates VIRITO_F_IOMMU_PLATFORM, but it *neither*
> disables *nor* configures the IOMMU. It simply *ignores* the IOMMU. I
> don't see how that is any different *in effect* from simply not
> negotiating VIRITO_F_IOMMU_PLATFORM -- case (2.2) --, and I don't
> understand why QEMU tolerates "ignoring the IOMMU" differently from "not
> negotiating the flag".
>
>
> (3) *If* we indeed just wanted to follow the SeaBIOS patch, then
> VIRTIO_F_IOMMU_PLATFORM should be treated in OVMF *precisely* in
> parallel with VIRTIO_F_VERSION_1.
>
>
> (4) I'm confused by the intersection of the SEV and
> VIRTIO_F_IOMMU_PLATFORM use cases. The IoMmu DXE driver added in edk2
> commit f9d129e68a45 ("OvmfPkg: Add IoMmuDxe driver", 2017-07-06) is
> specific to SEV.
>
> In SEV-less guests, the IoMmu DXE driver will not install the IOMMU
> protocol, but the QEMU command line may still contain the
> "iommu_platform=true" property. For example, as far as I recall, DPDK
> guest payloads use an emulated "viommu" device. For this OVMF, would
> have to provide a separate IOMMU DXE driver, one that could actually
> interact with the "viommu" device. And there should be some coordination
> that exactly one instance of gEdkiiIoMmuProtocolGuid OR
> gIoMmuAbsentProtocolGuid be installed.
>
> I see that the SEV references are only in the blurb of the patch set; no
> actual commit message refers to SEV. That's OK; I just think the blurb
> is confusing.
>
I was trying to figure out why setting iommu_platform fails to detect the HDD
in OVMF. Since SEV IOMMU work was still fresh in my mind hence I thought we
simply need to update the Virtio drivers to consume IOMMU protocol directly.
But your explanation on how things work makes sense; thanks for explaining
it in great detail.
I will follow your recommendation, and look into extending VIRTIO_DEVICE_PROTOCOL
with necessary functions and delegate the work to EFI_PCI_IO_PROTOCOL (which will
use IOMMU driver if available).
Thanks
Brijesh
>
> (5) Now I'm coming to my main point. The virtio device drivers in
> OvmfPkg are UEFI_DRIVER modules that conform to the UEFI driver model.
> They consume our home-grown the VIRTIO_DEVICE_PROTOCOL, and produce
> whatever UEFI protocol is appropriate on top (for example,
> EFI_BLOCK_IO_PROTOCOL in VirtioBlkDxe).
>
> Such a driver has no business talking to the platform's IOMMU protocol
> directly (even if there is one). Instead:
>
>
> (5.1) The VIRTIO_DEVICE_PROTOCOL has to be extended with the necessary
>
> AllocateSharedPages, FreeSharedPages, MapSharedPages, UnmapSharedPages
>
> functions.
>
>
> (5.2) All top-level virtio driver code, including VirtioLib, has to be
> rebased to the new VIRTIO_DEVICE_PROTOCOL member functions. This covers
> the ring memory itself, the memory pointed-to by the descriptors placed
> on the ring(s) -- i.e., the device specific requests --, and all further
> memory that is referenced by those device specific requests.
>
> This will result in a larger memory footprint, as all current pool
> allocations will be turned into page allocations, but I guess that is
> tolerable.
>
>
> (5.3) All virtio driver code should treat VIRTIO_F_IOMMU_PLATFORM simply
> in parallel with VIRTIO_F_VERSION_1, and don't act upon
> VIRTIO_F_IOMMU_PLATFORM in any special shape or form. So basically this
> is just my point (3) from above.
>
>
> (5.4) There are three VIRTIO_DEVICE_PROTOCOL implementations in edk2:
>
> - "OvmfPkg/VirtioPciDeviceDxe" binds legacy-only and transitional
> virtio-pci devices, and offers virtio 0.9.5 semantics.
>
> - "ArmVirtPkg/VirtioFdtDxe" (via "OvmfPkg/Library/VirtioMmioDeviceLib")
> binds virtio-mmio devices, and offers virtio 0.9.5 semantics.
>
> - "OvmfPkg/Virtio10Dxe" binds modern-only virtio-pci devices, and offers
> virtio 1.0.0 semantics.
>
> The first two drivers should implement the AllocateSharedPages() and
> FreeSharedPages() member functions simply with the corresponding
> MemoryAllocationLib functions (using BootServicesData type memory), and
> implement the MapSharedPages() and UnmapSharedPages() member functions
> as no-ops (return the input addresses transparently).
>
> The third driver should implement all four new member functions by
> respectively delegating the job to:
> - EFI_PCI_IO_PROTOCOL.AllocateBuffer() -- with BootServicesData --
> - EFI_PCI_IO_PROTOCOL.FreeBuffer()
> - EFI_PCI_IO_PROTOCOL.Map() -- with BusMasterCommonBuffer64 --
> - EFI_PCI_IO_PROTOCOL.Unmap()
>
> The EFI_PCI_IO_PROTOCOL implementation will delegate these calls to the
> platform-specific PCI host bridge / root bridge driver, and *that*
> driver in turn is allowed to talk to an IOMMU protocol (if any).
>
> (This last step is already covered by the following edk2 commits:
> - generally, c15da8eb3587 ("MdeModulePkg/PciHostBridge: Add IOMMU
> support.", 2017-04-29),
> - specifically for SEV in OVMF, c6ab9aecb71b ("OvmfPkg: update
> PciHostBridgeDxe to use PlatformHasIoMmuLib", 2017-07-06).)
>
>
> (5.5) There's a delicate question that has to be considered with care;
> namely the ExitBootServices() callbacks in the virtio drivers.
>
> Currently these functions perform virtio resets. They cause the devices
> to forget all their configuration (including guest RAM references).
> Aborting in-flight DMA (e.g. in VirtioNetDxe) is a practice that is
> specifically recommended in the UEFI Driver Writers' Guide; plus at some
> point the guest kernel will reclaim and overwrite BootServicesData type
> memory. If at that point QEMU is still looking at some (originally
> firmware-allocated) areas as virtio rings, the results won't be amusing.
> (Speaking from experience.) Hence the resetting of virtio devices upon
> ExitBootServices().
>
> Now, remember that ExitBootServices() callbacks *must not* change the
> UEFI memory map, so *no* memory allocations *must* be released in the
> callbacks. The virtio resets performed in the callbacks are surgical for
> this reason; nothing else is being done. The memory will be reclaimed by
> the OS, later.
>
> With an IOMMU in the picture, further actions become necessary: *after*
> the virtio reset, any buffers previously in use (including rings, device
> specific requests pointed-to by descriptors, and any further memory
> referenced by those requests) must be *unmapped*, but *not freed*.
>
> (Speaking in SEV terms, this will result in those memory areas seeing
> their C bits restored, without changing the UEFI memmap.)
>
> This means that:
> - the new VIRTIO_DEVICE_PROTOCOL.UnmapSharedPages() function has to be
> called judiciously from these callbacks, after the virtio reset,
> - *and* that the *entire* call chain originating from
> UnmapSharedPages(), through PciIo, through PciRootBridgeIo, to
> EdkiiIoMmu, *must* not call gBS->FreePool() or gBS->FreePages() (or the
> equivalent MemoryAllocationLib functions).
>
> Note that if the last requirement is currently violated (outside of
> OvmfPkg), then that is a general problem for physical platforms as well
> -- IMO, a physical NIC driver too should be able to abort DMA in its
> exit-boot-services callback and then unmap any relevant IOMMU mappings
> (via PciIo->Unmap().)
>
> Thanks
> Laszlo
>
next prev parent reply other threads:[~2017-07-25 23:40 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-19 22:09 [RFC v1 0/3] Add VIRTIO_F_IOMMU_PLATFORM support Brijesh Singh
2017-07-19 22:09 ` [RFC v1 1/3] OvmfPkg/Include/Virtio10: Define VIRTIO_F_IOMMU_PLATFORM feature bit Brijesh Singh
2017-07-19 22:09 ` [RFC v1 2/3] OvmfPkg/VirtioLib: Add IOMMU_PLATFORM support Brijesh Singh
2017-07-19 22:09 ` [RFC v1 3/3] OvmfPkg/VirtioBlkDxe: Add VIRITO_F_IOMMU_PLATFORM support Brijesh Singh
[not found] ` <62320c1a-0cec-947c-8c63-5eb0416e4e33@redhat.com>
2017-07-21 11:17 ` [RFC v1 0/3] Add VIRTIO_F_IOMMU_PLATFORM support Brijesh Singh
[not found] ` <20170722024318-mutt-send-email-mst@kernel.org>
2017-07-24 8:25 ` Gerd Hoffmann
2017-07-25 18:17 ` Laszlo Ersek
2017-07-25 23:42 ` Brijesh Singh [this message]
[not found] ` <904dae9f-e515-01ba-e16f-6561616c78af@redhat.com>
2017-07-26 15:30 ` Laszlo Ersek
2017-07-27 14:21 ` Brijesh Singh
2017-07-27 17:16 ` Laszlo Ersek
2017-07-27 17:56 ` Ard Biesheuvel
2017-07-27 19:00 ` Brijesh Singh
2017-07-27 20:55 ` Brijesh Singh
2017-07-27 21:31 ` Ard Biesheuvel
2017-07-27 21:38 ` Andrew Fish
2017-07-27 22:13 ` Brijesh Singh
2017-07-27 22:10 ` Brijesh Singh
2017-07-28 8:39 ` Ard Biesheuvel
2017-07-28 15:27 ` Laszlo Ersek
2017-07-28 13:38 ` Laszlo Ersek
2017-07-28 16:00 ` Brijesh Singh
2017-07-28 16:16 ` Laszlo Ersek
2017-07-28 19:21 ` Laszlo Ersek
2017-07-28 19:59 ` Laszlo Ersek
2017-07-29 0:52 ` Brijesh Singh
2017-07-29 1:37 ` Brijesh Singh
2017-07-31 18:20 ` Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4e7e2808-318e-90a2-6c15-a75e59086640@amd.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox