From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ml01.01.org (Postfix) with ESMTPS id 39D151A1E06
 for ; Fri, 2 Sep 2016 09:13:42 -0700 (PDT)
Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id A3E3E43A3E;
 Fri, 2 Sep 2016 16:13:41 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com (ovpn-116-76.phx2.redhat.com [10.3.116.76])
 by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u82GDdB6021822;
 Fri, 2 Sep 2016 12:13:40 -0400
To: Ard Biesheuvel
References: <1472666379-25426-1-git-send-email-ard.biesheuvel@linaro.org>
 <207F4239-0958-4A0E-9DAA-36ABB56E7BB7@linaro.org>
Cc: edk2-devel@ml01.01.org, leif.lindholm@linaro.org
From: Laszlo Ersek
Message-ID:
Date: Fri, 2 Sep 2016 18:13:39 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0
MIME-Version: 1.0
In-Reply-To:
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Fri, 02 Sep 2016 16:13:41 +0000 (UTC)
Subject: Re: [PATCH v2 0/6] ArmVirtQemu: move to generic PciHostBridgeDxe
X-BeenThere: edk2-devel@lists.01.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: EDK II Development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 02 Sep 2016 16:13:42 -0000
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit

On 09/02/16 17:27, Laszlo Ersek wrote:
> On 09/02/16 16:58, Ard Biesheuvel wrote:
>> (on the road atm, will reply in full later)
>>
>>> On 2 sep. 2016, at 14:09, Laszlo Ersek wrote:
>>>
>>> (2) aarch64 KVM, using virtio-gpu-pci and USB 2 keyboard and
>>> tablet. I actually booted a Fedora 24 guest with this, and in the
>>> guest, everything works just fine (display, keyboard,
>>> mouse/tablet). Most of the firmware log looks good too.
>>>
>>> (2a) However, the USB 2 keyboard is broken while in the firmware
>>> (in spite of it working well in the guest OS).
>>>
>>>   -device ich9-usb-ehci1,multifunction=on,id=ehci,addr=05.0 \
>>>   -device ich9-usb-uhci1,multifunction=on,masterbus=ehci.0,firstport=0,addr=05.1 \
>>>   -device ich9-usb-uhci2,multifunction=on,masterbus=ehci.0,firstport=2,addr=05.2 \
>>>   -device ich9-usb-uhci3,multifunction=on,masterbus=ehci.0,firstport=4,addr=05.3 \
>>>   -device usb-kbd,bus=ehci.0 \
>>>   -device usb-tablet,bus=ehci.0 \
>>>
>>> My QEMU has your commit 5d636e21c44e ("hw/arm/virt: mark the PCIe
>>> host controller as DMA coherent in the DT"), but I guess the EHCI
>>> driver in edk2 doesn't comply with the "guest drivers should use
>>> cacheable accesses as well when running under KVM" part. :(
>>>
>>> The following snippet repeats in the log:
>>>
>>>   EhcClearLegacySupport: called to clear legacy support
>>>   processing error - resetting ehci HC
>>>   EhcInitHC: failed to enable period schedule
>>>   EhcDriverBindingStart: failed to init host controller
>>>   EhcCreateUsb2Hc: capability length 32
>>>
>>> Interestingly, if I back out your series, then USB2 works in the
>>> firmware. I don't understand this, given that my build includes
>>> commit 3ef3209d3028 ("ArmVirtPkg: remove
>>> PcdKludgeMapPciMmioAsCached") from the master branch!
>>>
>>
>> Does it work when you limit DMA to < 4 GB?
>
> You are one wicked genius, man; the following change
>
>> diff --git a/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c b/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c
>> index efccedcca14f..1f0f87cac8a9 100644
>> --- a/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c
>> +++ b/ArmVirtPkg/Library/FdtPciHostBridgeLib/FdtPciHostBridgeLib.c
>> @@ -317,7 +317,7 @@ PciHostBridgeGetRootBridges (
>>      EFI_PCI_ATTRIBUTE_VGA_PALETTE_IO_16;
>>    mRootBridge.Attributes = mRootBridge.Supports;
>>
>> -  mRootBridge.DmaAbove4G = TRUE;
>> +  mRootBridge.DmaAbove4G = FALSE;
>>    mRootBridge.NoExtendedConfigSpace = FALSE;
>>    mRootBridge.ResourceAssigned = FALSE;
>>
>
> does make it work! Excellent!
>
> Explain please. :) (Although, I'll look into PciHostBridgeDxe in a moment too. :))

Well okay, I reviewed the RootBridgeIoMap() and RootBridgeIoUnmap()
functions in "MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c".
They implement bounce buffering when DmaAbove4G is set to FALSE, and
when the original RAM buffer, to be DMA'd from or to by the PCI device,
would end outside of 32-bit space.

For common buffer operations (when device and CPU collaborate on memory
repeatedly, without intervening Map() and Unmap() calls), Map() and
Unmap() cannot implement bounce buffering, so the initial buffer must be
allocated low enough. This is what RootBridgeIoAllocateBuffer() does,
and yes it considers DmaAbove4G as well.

EhciDxe uses these functions quite a bit. And, my test VM has 4G of
memory, with a base at 0x4000_0000 (1GB); the base is fixed of course,
from "-M virt".

So, I guess, some buffers that EhciDxe allocated itself, for DMA'ing
from/to the device, and some buffers that it allocated with
AllocateBuffer(), for common operations with the device, ended up in
the 4GB..5GB range.
Due to DmaAbove4G = TRUE, those host addresses got passed to the PCI
device (the USB 2 host controller) verbatim, but that device can only
access host RAM in the 32-bit address range?....

Hm, let me check the QEMU code (hw/usb/hcd-ehci.c)...

Alright, I've found it. According to the EHCI specification
("ehci-specification-for-usb.pdf", link found under ), revision 1.0,
section "2.2.4 HCCPARAMS -- Capability Parameters", bit #0 (value 1) in
the HCCPARAMS capability register stands for:

    64-bit Addressing Capability. This field documents the addressing
    range capability of this implementation. The value of this field
    determines whether software should use the data structures defined
    in Section 3 (32-bit) or those defined in Appendix B (64-bit).
    Values for this field have the following interpretation:

    0b data structures using 32-bit address memory pointers
    1b data structures using 64-bit address memory pointers

Furthermore, the HCCPARAMS register lives at address "Base + (08h)".

Now, looking at the QEMU code, we have usb_ehci_init()
[hw/usb/hcd-ehci.c] performing the following assignment:

    s->caps[0x08] = 0x80; /* We can cache whole frame, no 64-bit */

(And, the "cache whole frame" reference, for bit #7, is consistent with
the documentation of that bit in the spec: "When bit [7] is a one, then
host software assumes the host controller may cache an isochronous data
structure for an entire frame.")

So, bingo. Please flip DmaAbove4G to FALSE in patch #3, and please drop
the "DMA above 4 GB" paragraph from the commit message of patch #4.

Thanks!
Laszlo
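
For illustration only, the two checks discussed above could be sketched
in plain C roughly as follows. This is not the edk2 or QEMU code; the
function and macro names are hypothetical and made up for this sketch.
It assumes only what is stated above: HCCPARAMS bit #0 means 64-bit
addressing, bit #7 means whole-frame caching, and a buffer that would
end above 4GB needs bouncing when DMA above 4GB is not allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the HCCPARAMS bits quoted from the EHCI spec. */
#define EHCI_HCCPARAMS_64BIT_ADDR  (1u << 0) /* bit #0: 64-bit addressing */
#define EHCI_HCCPARAMS_FRAME_CACHE (1u << 7) /* bit #7: whole-frame cache */

#define LIMIT_4GB 0x100000000ULL

/* True if the controller advertises 64-bit data structure pointers. */
bool ehci_has_64bit_addressing(uint32_t hccparams)
{
    return (hccparams & EHCI_HCCPARAMS_64BIT_ADDR) != 0;
}

/*
 * True if a buffer [base, base + len) would have to be bounced below
 * 4GB, given whether the root bridge permits DMA above 4GB.
 * Written so that base + len is never computed (no wraparound).
 */
bool map_needs_bounce(uint64_t base, uint64_t len, bool dma_above_4g)
{
    if (dma_above_4g || len == 0) {
        return false;
    }
    if (base >= LIMIT_4GB) {
        return true;
    }
    return len > LIMIT_4GB - base;
}

/* Usage sketch with the value QEMU stores in caps[0x08]. */
void ehci_example(void)
{
    uint32_t hccparams = 0x80; /* whole-frame cache, no 64-bit */

    assert((hccparams & EHCI_HCCPARAMS_FRAME_CACHE) != 0);
    assert(!ehci_has_64bit_addressing(hccparams));

    /* A page at 4GB must be bounced unless DMA above 4GB is allowed. */
    assert(map_needs_bounce(LIMIT_4GB, 0x1000, false));
    assert(!map_needs_bounce(LIMIT_4GB, 0x1000, true));
}
```

With RAM based at 0x4000_0000 and 4G of memory, any buffer placed in
the top 1GB of guest RAM falls into the first case of ehci_example(),
which would match the failure seen with DmaAbove4G = TRUE.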