From: Laszlo Ersek
To: Ard Biesheuvel
Cc: "edk2-devel@lists.01.org", Leif Lindholm
Date: Fri, 2 Sep 2016 19:21:59 +0200
Subject: Re: [PATCH v2 0/6] ArmVirtQemu: move to generic PciHostBridgeDxe
References: <1472666379-25426-1-git-send-email-ard.biesheuvel@linaro.org> <207F4239-0958-4A0E-9DAA-36ABB56E7BB7@linaro.org>

On 09/02/16 18:26, Ard Biesheuvel wrote:
> On 2 September 2016 at 17:13, Laszlo Ersek wrote:
>> On 09/02/16 17:27, Laszlo Ersek wrote:
>>> On 09/02/16 16:58, Ard Biesheuvel wrote:
>>>> (on the road atm, will
>>>> reply in full later)
>>>>
>>>>> On 2 sep. 2016, at 14:09, Laszlo Ersek wrote:
>>>
>>>>> (2) aarch64 KVM, using virtio-gpu-pci and USB 2 keyboard and
>>>>> tablet. I actually booted a Fedora 24 guest with this, and in the
>>>>> guest, everything works just fine (display, keyboard,
>>>>> mouse/tablet). Most of the firmware log looks good too.
>>>>>
>>>>> (2a) However, the USB 2 keyboard is broken while in the firmware
>>>>> (in spite of it working well in the guest OS).
>>>>>
>>>>> -device ich9-usb-ehci1,multifunction=on,id=ehci,addr=05.0 \
>>>>> -device ich9-usb-uhci1,multifunction=on,masterbus=ehci.0,firstport=0,addr=05.1 \
>>>>> -device ich9-usb-uhci2,multifunction=on,masterbus=ehci.0,firstport=2,addr=05.2 \
>>>>> -device ich9-usb-uhci3,multifunction=on,masterbus=ehci.0,firstport=4,addr=05.3 \
>>>>> -device usb-kbd,bus=ehci.0 \
>>>>> -device usb-tablet,bus=ehci.0 \
>>>>>
>>>>> My QEMU has your commit 5d636e21c44e ("hw/arm/virt: mark the PCIe
>>>>> host controller as DMA coherent in the DT"), but I guess the EHCI
>>>>> driver in edk2 doesn't comply with the "guest drivers should use
>>>>> cacheable accesses as well when running under KVM" part. :(
>>>>>
>>>>> The following snippet repeats in the log:
>>>>>
>>>>> EhcClearLegacySupport: called to clear legacy support
>>>>> processing error - resetting ehci HC
>>>>> EhcInitHC: failed to enable period schedule
>>>>> EhcDriverBindingStart: failed to init host controller
>>>>> EhcCreateUsb2Hc: capability length 32
>>>>>
>>>>> Interestingly, if I back out your series, then USB2 works in the
>>>>> firmware. I don't understand this, given that my build includes
>>>>> commit 3ef3209d3028 ("ArmVirtPkg: remove
>>>>> PcdKludgeMapPciMmioAsCached") from the master branch!
>>>>>
>>>>
>>>> Does it work when you limit DMA to < 4 GB?
>>>
>>> You are one wicked genius, man; the following change
>>>
>>>> diff --git a/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c b/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c
>>>> index efccedcca14f..1f0f87cac8a9 100644
>>>> --- a/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c
>>>> +++ b/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c
>>>> @@ -317,7 +317,7 @@ PciHostBridgeGetRootBridges (
>>>> EFI_PCI_ATTRIBUTE_VGA_PALETTE_IO_16;
>>>> mRootBridge.Attributes = mRootBridge.Supports;
>>>>
>>>> - mRootBridge.DmaAbove4G = TRUE;
>>>> + mRootBridge.DmaAbove4G = FALSE;
>>>> mRootBridge.NoExtendedConfigSpace = FALSE;
>>>> mRootBridge.ResourceAssigned = FALSE;
>>>>
>>>
>>> does make it work! Excellent!
>>>
>>> Explain please. :) (Although, I'll look into PciHostBridgeDxe in a moment too. :))
>>
>
> Thanks. You seem to have a good handle on things already, though :-)
>
>> Well okay, I reviewed the RootBridgeIoMap() and RootBridgeIoUnmap()
>> functions in "MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c".
>> They implement bounce buffering when DmaAbove4G is set to FALSE, and
>> when the original RAM buffer, to be DMA'd from or to by the PCI device,
>> would end outside of 32-bit space.
>>
>> For common buffer operations (when device and CPU collaborate on memory
>> repeatedly, without intervening Map() and Unmap() calls), Map() and
>> Unmap() cannot implement bounce buffering, so the initial buffer must be
>> allocated low enough. This is what RootBridgeIoAllocateBuffer() does,
>> and yes it considers DmaAbove4G as well.
>>
>> EhciDxe uses these functions quite a bit. And, my test VM has 4G of
>> memory, with a base at 0x4000_0000 (1GB); the base is fixed of course,
>> from "-M virt". So, I guess, some buffers that EhciDxe allocated itself,
>> for DMA'ing from/to the device, and some buffers that it allocated with
>> AllocateBuffer(), for common operations with the device, ended up in the
>> 4GB..5GB range.
>> Due to DmaAbove4G = TRUE, those host addresses got
>> passed to the PCI device (the USB 2 host controller) verbatim, but that
>> device can only access host RAM in the 32-bit address range?....
>>
>> Hm, let me check the QEMU code (hw/usb/hcd-ehci.c)...
>>
>> Alright, I've found it. According to the EHCI specification
>> ("ehci-specification-for-usb.pdf", link found under
>> ),
>> revision 1.0, section "2.2.4 HCCPARAMS -- Capability Parameters", bit #0
>> (value 1) in the HCCPARAMS capability register stands for:
>>
>>
>> 64-bit Addressing Capability. This field documents the addressing
>> range capability of this implementation. The value of this field
>> determines whether software should use the data structures defined
>> in Section 3 (32-bit) or those defined in Appendix B (64-bit).
>> Values for this field have the following interpretation:
>>
>> 0b data structures using 32-bit address memory pointers
>> 1b data structures using 64-bit address memory pointers
>>
>> Furthermore, the HCCPARAMS register lives at address "Base + (08h)".
>>
>> Now, looking at the QEMU code, we have usb_ehci_init()
>> [hw/usb/hcd-ehci.c] performing the following assignment:
>>
>> s->caps[0x08] = 0x80; /* We can cache whole frame, no 64-bit */
>>
>> (And, the "cache whole frame" reference, for bit #7, is consistent with
>> the documentation of that bit in the spec: "When bit [7] is a
>> one, then host software assumes the host controller may cache an
>> isochronous data structure for an entire frame.")
>>
>> So, bingo. Please flip DmaAbove4G to FALSE in patch #3, and please drop
>> the "DMA above 4 GB" paragraph from the commit message of patch #4.
>>
>
> Actually, I suspect this is a bug in PciHostBridgeDxe. It ignores the
> absence of the EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE attribute, which
> should be set by the driver if it knows the device is capable of
> 64-bit DMA.
>
> Could you please try the below?
>
>
> diff --git a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c
> index b2d76d67afa2..b53b9a834816 100644
> --- a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c
> +++ b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c
> @@ -1308,7 +1308,8 @@ RootBridgeIoAllocateBuffer (
> RootBridge = ROOT_BRIDGE_FROM_THIS (This);
>
> AllocateType = AllocateAnyPages;
> - if (!RootBridge->DmaAbove4G) {
> + if (!RootBridge->DmaAbove4G ||
> + (Attributes & EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE) == 0) {
> //
> // Limit allocations to memory below 4GB
> //
>
> Thanks,
> Ard.
>

Before trying it, I'll say that I don't like it, for two reasons :)

(1) This will affect AllocateBuffer(), yes, but it doesn't affect Map()
and Unmap(). In fact I don't understand how the spec allows those
functions to communicate this kind of information between PciIo and
PciRootBridgeIo: while for AllocateBuffer(), the PciIo implementation
can check the device itself, and pass
EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE to PciRootBridgeIo, I don't see the
same possibility, in the spec, for Map(). There is no Attributes
parameter there. So how will PciRootBridgeIo know?

In more direct terms, you can't extend the DmaAbove4G check in
RootBridgeIoMap() in a similar fashion. (Is this a spec bug actually?)

(2) I tried to track down where EFI_PCI_ATTRIBUTE_DUAL_ADDRESS_CYCLE
would come from, in edk2. The only location that passes it is
PciIoAllocateBuffer() in "MdeModulePkg/Bus/Pci/PciBusDxe/PciIo.c" (i.e.,
the implementation of the similarly named PciIo protocol member). The
condition for passing this attribute to
PciRootBridgeIo.AllocateBuffer() is that
EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE (note: a different constant!) be
set in PciIo.Attributes -- i.e., on the PciIo device itself. Makes
sense, right?

So, what sets EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE on the device?
- PciSetDeviceAttribute() in
  "MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c" sets the bit
  in PCI_IO_DEVICE.Supports (not .Attributes!) unconditionally,

- in DetermineDeviceAttribute()
  [MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c], the attribute
  is again set unconditionally (only in PCI_IO_DEVICE.Supports),
  accompanied by the comment "Assume the PCI Root Bridge supports DAC",

- ModifyRootBridgeAttributes() in
  "MdeModulePkg/Bus/Pci/PciBusDxe/PciIo.c" seems to exclude this bit
  from the set of bits that can be toggled.

So, I think unless a UEFI_DRIVER that consumes PciIo actively calls
PciIo.Attributes() with OperationSet / OperationEnable for
EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE, this code will never make a
difference.

UEFI_DRIVERs are actually expected to massage the PciIo attributes as
they see fit, for example EFI_PCI_IO_ATTRIBUTE_IO is frequently set for
IO BAR decoding. However, I couldn't find any driver in the tree that
would set EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE.

*Maybe*, I guess, EhciDxe could look at the HCCPARAMS register discussed
above, and then set EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE? I've got no
clue.

Anyway, after this wall of text, I should reenable >4GB DMA, and
actually test your patch...

Yep, while it might be justified per se, it definitely does not suffice
for making things work. The USB 2 keyboard remains broken with it.

Thanks
Laszlo
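
For illustration only, here is a minimal standalone sketch in plain C of
the two checks discussed in this message: decoding bit #0 of HCCPARAMS,
and deciding when a mapping needs a bounce buffer. This is not edk2 or
QEMU code. All names in it (ehci_supports_64bit, needs_bounce_buffer,
the HCCPARAMS_* macros, LIMIT_4GB) are made up for the example, and the
bounce-buffer condition is a simplified model of the behaviour described
for RootBridgeIoMap(), not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the EHCI spec text quoted above. */
#define HCCPARAMS_64BIT_ADDRESSING   (1u << 0)  /* bit #0: 64-bit data structures */
#define HCCPARAMS_CACHE_WHOLE_FRAME  (1u << 7)  /* bit #7: may cache a whole frame */

/* First address that is no longer reachable with a 32-bit pointer. */
#define LIMIT_4GB 0x100000000ULL

/* Returns true if the controller advertises 64-bit addressing. */
static bool ehci_supports_64bit(uint32_t hccparams)
{
    return (hccparams & HCCPARAMS_64BIT_ADDRESSING) != 0;
}

/*
 * Simplified model (an assumption, not the real function): a bounce
 * buffer is needed when DMA above 4 GB is not allowed and the buffer
 * would end outside of 32-bit space.
 */
static bool needs_bounce_buffer(bool dma_above_4g,
                                uint64_t host_addr,
                                uint64_t length)
{
    assert(length > 0);
    assert(host_addr <= UINT64_MAX - length);  /* guard against wraparound */

    return !dma_above_4g && (host_addr + length > LIMIT_4GB);
}
```

With the value 0x80 that QEMU stores at caps[0x08],
ehci_supports_64bit() returns false. With 4G of RAM based at
0x4000_0000, RAM extends up to 0x1_4000_0000, so a page at, say,
0x1_0000_1000 lies in the 4GB..5GB range. In this model such a page is
bounced only when dma_above_4g is false, which matches the observation
that flipping DmaAbove4G to FALSE makes the keyboard work.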