From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 4AE612095DB96 for ; Thu, 27 Jul 2017 10:14:53 -0700 (PDT) Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7EFD010D515; Thu, 27 Jul 2017 17:16:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 7EFD010D515 Authentication-Results: ext-mx02.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx02.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=lersek@redhat.com Received: from lacos-laptop-7.usersys.redhat.com (ovpn-116-71.phx2.redhat.com [10.3.116.71]) by smtp.corp.redhat.com (Postfix) with ESMTP id 048EF5C323; Thu, 27 Jul 2017 17:16:50 +0000 (UTC) To: Brijesh Singh Cc: edk2-devel@lists.01.org, Tom Lendacky , Jordan Justen , Jason Wang , "Michael S . Tsirkin" , Gerd Hoffmann , Ard Biesheuvel References: <1500502151-13508-1-git-send-email-brijesh.singh@amd.com> <841bec5f-6f6e-8b1f-25ba-0fd37a915b72@redhat.com> <4e2fc623-3656-eea7-09a8-b5c6d2f694e1@amd.com> From: Laszlo Ersek Message-ID: <4071596d-32c9-e6d9-8c93-0d43d28e9b5a@redhat.com> Date: Thu, 27 Jul 2017 19:16:49 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <4e2fc623-3656-eea7-09a8-b5c6d2f694e1@amd.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Thu, 27 Jul 2017 17:16:56 +0000 (UTC) Subject: Re: [RFC v1 0/3] Add VIRTIO_F_IOMMU_PLATFORM support X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Jul 2017 17:14:53 -0000 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit On 07/27/17 16:21, Brijesh Singh wrote: > Hi Laszlo, > > >> >> (5.3) All virtio driver code should treat VIRTIO_F_IOMMU_PLATFORM >> simply in parallel with VIRTIO_F_VERSION_1, and don't act upon >> VIRTIO_F_IOMMU_PLATFORM in any special shape or form. So basically >> this is just my point (3) from above. >> >> >> (5.4) There are three VIRTIO_DEVICE_PROTOCOL implementations in edk2: >> >> - "OvmfPkg/VirtioPciDeviceDxe" binds legacy-only and transitional >> virtio-pci devices, and offers virtio 0.9.5 semantics. >> >> - "ArmVirtPkg/VirtioFdtDxe" (via >> "OvmfPkg/Library/VirtioMmioDeviceLib") binds virtio-mmio devices, >> and offers virtio 0.9.5 semantics. >> >> - "OvmfPkg/Virtio10Dxe" binds modern-only virtio-pci devices, and >> offers virtio 1.0.0 semantics. >> >> The first two drivers should implement the AllocateSharedPages() and >> FreeSharedPages() member functions simply with the corresponding >> MemoryAllocationLib functions (using BootServicesData type memory), >> and implement the MapSharedPages() and UnmapSharedPages() member >> functions as no-ops (return the input addresses transparently). >> >> The third driver should implement all four new member functions by >> respectively delegating the job to: >> - EFI_PCI_IO_PROTOCOL.AllocateBuffer() -- with BootServicesData -- >> - EFI_PCI_IO_PROTOCOL.FreeBuffer() >> - EFI_PCI_IO_PROTOCOL.Map() -- with BusMasterCommonBuffer64 -- >> - EFI_PCI_IO_PROTOCOL.Unmap() >> > > I have working to implement patch per your recommendation. I assume > you mean map the buffers with EfiPciIoOperationBusMasterCommonBuffer > [1]. Yes. > If so, I see one issue with SEV guest. When SEV is active, IOMMU > driver uses a bounce buffer to map host address to a device address. > While creating bounce buffer we can map it either for > EfiPciIoOperationBusMasterRead or EfiPciIoOperationBusMasterWrite > Operation. If caller wants to map > EfiPciIoOperationBusMasterCommonBuffer then it must allocate the > buffer using EFI_PCI_IO_PROTOCOL.AllocateBuffer() [2] otherwise we > will fail to map. Correct. > I see that PciRootBridgeIo.c has similar check when using a bounce > buffer for < 4GB use cases [3]. Correct. > Do you see any issue if we use EfiPciIoOperationBusMasterRead or > EfiPciIoOperationBusMasterWrite instead of > EfiPciIoOperationBusMasterCommonBuffer ? Yes, I do. The BusMasterRead and BusMasterWrite operations are suitable for one-shot, uni-directional transfers, where the bounce buffer contents and the client code contents are exchanged on Map/Unmap. However virtio transfers, generally speaking, are not like this. While the individual memory areas pointed-to by separate virtio descriptors can me associated with one specific transfer direction (see the VRING_DESC_F_WRITE flag), the virtio ring area itself is long-lived, and it is simultaneously read and written by both host and guest, asynchronously and repeatedly. This calls for BusMasterCommonBuffer. Once we implement BusMasterCommonBuffer, we can use it for everything else. Another reason why separate allocation and mapping (and conversely, separate unmapping and deallocation) are required is the ExitBootServices() callbacks. In that callback, unmapping must happen *without* memory allocation or deallocation (because the IOMMU must be de-programmed, but the UEFI memmap must not be changed), covering all memory areas that are at that point shared between host and guest (regardless of transfer direction). Normally, Map allocates the bounce buffer internally, and Unmap releases the bounce buffer internally (for BusMasterRead and BusMasterWrite), which is not right here. If you use AllocateBuffer() separately, then Map() -- with BusMasterCommonBuffer -- will not allocate internally, and Unmap() will also not deallocate internally. So, in the ExitBootServices() callback, you will be able to call Unmap() only, and then *not* call FreeBuffer() at all. This is why I suggest introducing all four functions (Allocate, Free, Map, Unmap) to the VIRTIO_DEVICE_PROTOCOL, and why all virtio drivers should use all four functions explicitly, not just Map and Unmap. ... Actually, I think there is a bug in "OvmfPkg/IoMmuDxe/AmdSevIoMmu.c". You have the following distribution of operations at the moment: - IoMmuAllocateBuffer() allocates pages and clears the memory encryption mask - IoMmuFreeBuffer() re-sets the memory encryption mask, and deallocates pages - IoMmuMap() does nothing at all when BusMasterCommonBuffer is requested (and IoMmuAllocateBuffer() was called previously). Otherwise, IoMmuMap() allocates pages, allocates MAP_INFO, and clears the memory encryption mask. - IoMmuUnmap() does nothing when cleaning up a BusMasterCommonBuffer operation (--> NO_MAPPING). Otherwise, IoMmuUnmap() clears the encryption mask, and frees both the pages and MAP_INFO. This distribution of operations seems wrong. The key point is that AllocateBuffer() *need not* result in a buffer that is immediately usable, and that client code is required to call Map() *unconditionally*, even if BusMasterCommonBuffer is the desired operation. Therefore, the right distribution of operations is: - IoMmuAllocateBuffer() allocates pages and does not touch the encryption mask.. - IoMmuFreeBuffer() deallocates pages and does not touch the encryption mask. - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is requested, and it allocates pages (bounce buffer) otherwise. *Regardless* of BusMaster operation, the following actions are carried out unconditionally: . the memory encryption mask is cleared in this function (and in this function only), . An attempt is made to grab a MAP_INFO structure from an internal free list (to be introduced!). The head of the list is a new static variable. If the free list is empty, then a MAP_INFO structure is allocated with AllocatePool(). The NO_MAPPING macro becomes unused and can be deleted from the source code. - IoMmuUnmap() clears the encryption mask unconditionally. (For this, it has to consult the MAP_INFO structure that is being passed in from the caller.) In addition: . If MapInfo->Operation is BusMasterCommonBuffer, then we know the allocation was done separately in AllocateBuffer, so we do not release the pages. Otherwise, we do release the pages. . MapInfo is linked back on the internal free list (see above). It is *never* released with FreePool(). This approach guarantees that IoMmuUnmap() can de-program the IOMMU (= re-set the memory encryption mask) without changing the UEFI memory map. (I trust that MemEncryptSevSetPageEncMask() will not split page tables internally when it *re*sets the encryption mask -- is that correct?) I'm sorry that I didn't catch this earlier in your commit f9d129e68a45 ("OvmfPkg: Add IoMmuDxe driver", 2017-07-06), but as you see, I gave you only an Acked-by on that, not a Reviewed-by. I've really had to think it through from the virtio perspective; I didn't foresee this use case back then (I only felt that I wasn't getting the full picture about the IOMMU protocol details). Thanks Laszlo