public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: Brijesh Singh <brijesh.singh@amd.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Laszlo Ersek <lersek@redhat.com>
Cc: brijesh.singh@amd.com,
	"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Jordan Justen <jordan.l.justen@intel.com>,
	Jason Wang <jasowang@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Gerd Hoffmann <kraxel@redhat.com>
Subject: Re: [RFC v1 0/3] Add VIRTIO_F_IOMMU_PLATFORM support
Date: Thu, 27 Jul 2017 14:00:40 -0500	[thread overview]
Message-ID: <c5c3c3cc-d0c1-4227-306c-da19f2e43203@amd.com> (raw)
In-Reply-To: <CAKv+Gu-he69K-2bWPzi604na9NnhniTnVcrjicxRY78aEOjnRw@mail.gmail.com>



On 07/27/2017 12:56 PM, Ard Biesheuvel wrote:
> On 27 July 2017 at 18:16, Laszlo Ersek <lersek@redhat.com> wrote:
>> On 07/27/17 16:21, Brijesh Singh wrote:
>>> Hi Laszlo,
>>>
>>>
>>>>
>>>> (5.3) All virtio driver code should treat VIRTIO_F_IOMMU_PLATFORM
>>>> simply in parallel with VIRTIO_F_VERSION_1, and don't act upon
>>>> VIRTIO_F_IOMMU_PLATFORM in any special shape or form. So basically
>>>> this is just my point (3) from above.
>>>>
>>>>
>>>> (5.4) There are three VIRTIO_DEVICE_PROTOCOL implementations in edk2:
>>>>
>>>> - "OvmfPkg/VirtioPciDeviceDxe" binds legacy-only and transitional
>>>>     virtio-pci devices, and offers virtio 0.9.5 semantics.
>>>>
>>>> - "ArmVirtPkg/VirtioFdtDxe" (via
>>>>     "OvmfPkg/Library/VirtioMmioDeviceLib") binds virtio-mmio devices,
>>>>     and offers virtio 0.9.5 semantics.
>>>>
>>>> - "OvmfPkg/Virtio10Dxe" binds modern-only virtio-pci devices, and
>>>>     offers virtio 1.0.0 semantics.
>>>>
>>>> The first two drivers should implement the AllocateSharedPages() and
>>>> FreeSharedPages() member functions simply with the corresponding
>>>> MemoryAllocationLib functions (using BootServicesData type memory),
>>>> and implement the MapSharedPages() and UnmapSharedPages() member
>>>> functions as no-ops (return the input addresses transparently).
>>>>
>>>> The third driver should implement all four new member functions by
>>>> respectively delegating the job to:
>>>> - EFI_PCI_IO_PROTOCOL.AllocateBuffer() -- with BootServicesData --
>>>> - EFI_PCI_IO_PROTOCOL.FreeBuffer()
>>>> - EFI_PCI_IO_PROTOCOL.Map() -- with BusMasterCommonBuffer64 --
>>>> - EFI_PCI_IO_PROTOCOL.Unmap()
>>>>
>>>
>>> I have working to implement patch per your recommendation. I assume
>>> you mean map the buffers with EfiPciIoOperationBusMasterCommonBuffer
>>> [1].
>>
>> Yes.
>>
>>> If so, I see one issue with SEV guest. When SEV is active, IOMMU
>>> driver uses a bounce buffer to map host address to a device address.
>>> While creating bounce buffer we can map it either for
>>> EfiPciIoOperationBusMasterRead or EfiPciIoOperationBusMasterWrite
>>> Operation. If caller wants to map
>>> EfiPciIoOperationBusMasterCommonBuffer then it must allocate the
>>> buffer using EFI_PCI_IO_PROTOCOL.AllocateBuffer() [2] otherwise we
>>> will fail to map.
>>
>> Correct.
>>
>>> I see that PciRootBridgeIo.c has similar check when using a bounce
>>> buffer for < 4GB use cases [3].
>>
>> Correct.
>>
>>> Do you see any issue if we use EfiPciIoOperationBusMasterRead or
>>> EfiPciIoOperationBusMasterWrite instead of
>>> EfiPciIoOperationBusMasterCommonBuffer ?
>>
>> Yes, I do. The BusMasterRead and BusMasterWrite operations are suitable
>> for one-shot, uni-directional transfers, where the bounce buffer
>> contents and the client code contents are exchanged on Map/Unmap.
>>
>> However virtio transfers, generally speaking, are not like this. While
>> the individual memory areas pointed-to by separate virtio descriptors
>> can me associated with one specific transfer direction (see the
>> VRING_DESC_F_WRITE flag), the virtio ring area itself is long-lived, and
>> it is simultaneously read and written by both host and guest,
>> asynchronously and repeatedly. This calls for BusMasterCommonBuffer.
>> Once we implement BusMasterCommonBuffer, we can use it for everything
>> else.
>>
>>
>> Another reason why separate allocation and mapping (and conversely,
>> separate unmapping and deallocation) are required is the
>> ExitBootServices() callbacks. In that callback, unmapping must happen
>> *without* memory allocation or deallocation (because the IOMMU must be
>> de-programmed, but the UEFI memmap must not be changed), covering all
>> memory areas that are at that point shared between host and guest
>> (regardless of transfer direction).
>>
>> Normally, Map allocates the bounce buffer internally, and Unmap releases
>> the bounce buffer internally (for BusMasterRead and BusMasterWrite),
>> which is not right here. If you use AllocateBuffer() separately, then
>> Map() -- with BusMasterCommonBuffer -- will not allocate internally, and
>> Unmap() will also not deallocate internally. So, in the
>> ExitBootServices() callback, you will be able to call Unmap() only, and
>> then *not* call FreeBuffer() at all.
>>
>> This is why I suggest introducing all four functions (Allocate, Free,
>> Map, Unmap) to the VIRTIO_DEVICE_PROTOCOL, and why all virtio drivers
>> should use all four functions explicitly, not just Map and Unmap.
>>
>> ... Actually, I think there is a bug in
>> "OvmfPkg/IoMmuDxe/AmdSevIoMmu.c". You have the following distribution of
>> operations at the moment:
>>
>> - IoMmuAllocateBuffer() allocates pages and clears the memory encryption
>>    mask
>>
>> - IoMmuFreeBuffer() re-sets the memory encryption mask, and deallocates
>>    pages
>>
>> - IoMmuMap() does nothing at all when BusMasterCommonBuffer is requested
>>    (and IoMmuAllocateBuffer() was called previously). Otherwise,
>>    IoMmuMap() allocates pages, allocates MAP_INFO, and clears the memory
>>    encryption mask.
>>
>> - IoMmuUnmap() does nothing when cleaning up a BusMasterCommonBuffer
>>    operation (--> NO_MAPPING). Otherwise, IoMmuUnmap() clears the
>>    encryption mask, and frees both the pages and MAP_INFO.
>>

I agree with you, there is a bug in AmdSevIoMmu.c. I will fix it. And introduce
list to track if the buffer was allocated by us. If buffer was allocated by
us then we can safely say that it does not require a bounce buffer. While
implementing the initial code I was not able to see BusMasterCommonBuffer
usage. But with virtio things are getting a bit more clear in my mind.

>> This distribution of operations seems wrong. The key point is that
>> AllocateBuffer() *need not* result in a buffer that is immediately
>> usable, and that client code is required to call Map()
>> *unconditionally*, even if BusMasterCommonBuffer is the desired
>> operation. Therefore, the right distribution of operations is:
>>
>> - IoMmuAllocateBuffer() allocates pages and does not touch the
>>    encryption mask..
>>
>> - IoMmuFreeBuffer() deallocates pages and does not touch the encryption
>>    mask.
>>

Actually one of main reason why we cleared and restored the memory encryption mask
during allocate/free is because we also consume the IOMMU protocol in QemuFwCfgLib
as a method to allocate and free a DMA buffer. I am certainly open to suggestions.

[1] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159
[2] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197

>> - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is
>>    requested, and it allocates pages (bounce buffer) otherwise.
>>

I am trying to wrap my head around how we can support BusMasterCommonBuffer
when buffer was not allocated by us. Changing the memory encryption mask in
a page table will not update the contents. Also since the memory encryption
mask works on PAGE_SIZE hence changing the encryption mask on not our allocated
buffer could mess things up (e.g if NumberOfBytes is not PAGE_SIZE aligned).

>>    *Regardless* of BusMaster operation, the following actions are carried
>>    out unconditionally:
>>
>>    . the memory encryption mask is cleared in this function (and in this
>>      function only),
>>
>>    . An attempt is made to grab a MAP_INFO structure from an internal
>>      free list (to be introduced!). The head of the list is a new static
>>      variable. If the free list is empty, then a MAP_INFO structure is
>>      allocated with AllocatePool(). The NO_MAPPING macro becomes unused
>>      and can be deleted from the source code.
>>
>> - IoMmuUnmap() clears the encryption mask unconditionally. (For this, it
>>    has to consult the MAP_INFO structure that is being passed in from the
>>    caller.) In addition:
>>
>>    . If MapInfo->Operation is BusMasterCommonBuffer, then we know the
>>      allocation was done separately in AllocateBuffer, so we do not
>>      release the pages. Otherwise, we do release the pages.
>>
>>    . MapInfo is linked back on the internal free list (see above). It is
>>      *never* released with FreePool().
>>
>>    This approach guarantees that IoMmuUnmap() can de-program the IOMMU (=
>>    re-set the memory encryption mask) without changing the UEFI memory
>>    map. (I trust that MemEncryptSevSetPageEncMask() will not split page
>>    tables internally when it *re*sets the encryption mask -- is that
>>    correct?)

Yes, MemEncryptSevSetPageEncMask() will not split pages when it restores mask.

>>>> I'm sorry that I didn't catch this earlier in your commit f9d129e68a45
>> ("OvmfPkg: Add IoMmuDxe driver", 2017-07-06), but as you see, I gave you
>> only an Acked-by on that, not a Reviewed-by. I've really had to think it
>> through from the virtio perspective; I didn't foresee this use case back
>> then (I only felt that I wasn't getting the full picture about the IOMMU
>> protocol details).
>>

> 
> I have to concur with Laszlo here: the respective semantics of the
> allocate/map/unmap/free operations should be identical to their
> counterparts in the PCI I/O protocol, and in the DmaLib in
> EmbeddedPkg.
> 
> Note that this is likely to cause problems with proprietary x86
> drivers in option ROMs, which are used to getting away with using host
> addresses for DMA, and fail to invoke Map() for common buffers (or
> fail to invoke AllocateBuffer() altogether). The API is clear, and
> drivers that abide by the rules should operate correctly even when
> using encrypted memory or non-1:1 mapping between the host and PCI
> address space (which is how I ran into these issues)
> 


  reply	other threads:[~2017-07-27 18:58 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-19 22:09 [RFC v1 0/3] Add VIRTIO_F_IOMMU_PLATFORM support Brijesh Singh
2017-07-19 22:09 ` [RFC v1 1/3] OvmfPkg/Include/Virtio10: Define VIRTIO_F_IOMMU_PLATFORM feature bit Brijesh Singh
2017-07-19 22:09 ` [RFC v1 2/3] OvmfPkg/VirtioLib: Add IOMMU_PLATFORM support Brijesh Singh
2017-07-19 22:09 ` [RFC v1 3/3] OvmfPkg/VirtioBlkDxe: Add VIRITO_F_IOMMU_PLATFORM support Brijesh Singh
     [not found] ` <62320c1a-0cec-947c-8c63-5eb0416e4e33@redhat.com>
2017-07-21 11:17   ` [RFC v1 0/3] Add VIRTIO_F_IOMMU_PLATFORM support Brijesh Singh
     [not found]     ` <20170722024318-mutt-send-email-mst@kernel.org>
2017-07-24  8:25       ` Gerd Hoffmann
2017-07-25 18:17 ` Laszlo Ersek
2017-07-25 23:42   ` Brijesh Singh
     [not found]   ` <904dae9f-e515-01ba-e16f-6561616c78af@redhat.com>
2017-07-26 15:30     ` Laszlo Ersek
2017-07-27 14:21   ` Brijesh Singh
2017-07-27 17:16     ` Laszlo Ersek
2017-07-27 17:56       ` Ard Biesheuvel
2017-07-27 19:00         ` Brijesh Singh [this message]
2017-07-27 20:55           ` Brijesh Singh
2017-07-27 21:31             ` Ard Biesheuvel
2017-07-27 21:38               ` Andrew Fish
2017-07-27 22:13                 ` Brijesh Singh
2017-07-27 22:10               ` Brijesh Singh
2017-07-28  8:39                 ` Ard Biesheuvel
2017-07-28 15:27                   ` Laszlo Ersek
2017-07-28 13:38           ` Laszlo Ersek
2017-07-28 16:00             ` Brijesh Singh
2017-07-28 16:16               ` Laszlo Ersek
2017-07-28 19:21               ` Laszlo Ersek
2017-07-28 19:59               ` Laszlo Ersek
2017-07-29  0:52                 ` Brijesh Singh
2017-07-29  1:37                   ` Brijesh Singh
2017-07-31 18:20                     ` Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c5c3c3cc-d0c1-4227-306c-da19f2e43203@amd.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox