From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id F265B21CE7490 for ; Fri, 28 Jul 2017 08:48:29 -0700 (PDT) Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 8EC5B6AA49; Fri, 28 Jul 2017 15:50:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 8EC5B6AA49 Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=lersek@redhat.com Received: from lacos-laptop-7.usersys.redhat.com (ovpn-116-27.phx2.redhat.com [10.3.116.27]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8978E17C16; Fri, 28 Jul 2017 15:50:30 +0000 (UTC) To: "Ni, Ruiyu" , "edk2-devel@lists.01.org" Cc: Marcel Apfelbaum , Aleksandr Bezzubikov , "Zeng, Star" References: <20170727030537.51784-1-ruiyu.ni@intel.com> <8efaf70a-ac43-5f29-5ff3-cd0dbb0f5d91@redhat.com> <734D49CCEBEEF84792F5B80ED585239D5B9C4316@SHSMSX104.ccr.corp.intel.com> From: Laszlo Ersek Message-ID: <0ed18d93-1d0b-56a2-7636-ac499bbc5c6d@redhat.com> Date: Fri, 28 Jul 2017 17:50:29 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <734D49CCEBEEF84792F5B80ED585239D5B9C4316@SHSMSX104.ccr.corp.intel.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Fri, 28 Jul 2017 15:50:34 +0000 (UTC) Subject: Re: bus number padding [was: MdeModulePkg/PciBus: Avoid hang when BUS pad resource is not in top] X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Jul 2017 15:48:30 -0000 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit On 07/28/17 12:19, Ni, Ruiyu wrote: > Laszlo, > Thanks for raising this questions. > Please see my embedded reply in below > > Thanks/Ray > >> -----Original Message----- >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of >> Laszlo Ersek >> Sent: Thursday, July 27, 2017 7:23 PM >> To: Ni, Ruiyu ; edk2-devel@lists.01.org >> Cc: Marcel Apfelbaum ; Aleksandr Bezzubikov >> ; Zeng, Star >> Subject: [edk2] bus number padding [was: MdeModulePkg/PciBus: Avoid >> hang when BUS pad resource is not in top] >> >> Hello Ray, >> >> please excuse me for hijacking this thread a little bit, but I'm very happy that >> you happen to be looking at this exact part of the code, which is also what >> I've been discussing with Marcel and Aleksandr (CC'd) due to a different >> reason. >> >> (The relevant sub-thread is at >> >> Allow RedHat PCI bridges reserve more buses than necessary during init >> >> http://mid.mail-archive.com/d0bd04f4-1ac4-0e3a-885f- >> 2ceb6180f69a@redhat.com >> >> but it's very long, so I'm not asking you to read that -- instead, please let me >> ask you our question independently and in a self-contained >> form.) >> >> In "OvmfPkg/PciHotPlugInitDxe/", we plan to return some BUS resource >> paddings, from the >> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding() >> function. >> >> QEMU has no "root" hotplug controllers, only "non-root" hotplug controllers >> (using the terminology from "12.6 PCI Hot Plug PCI Initialization Protocol" in >> the PI-1.6 spec). Therefore OVMF's >> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetRootHpcList() returns an empty >> array. >> The BUS resource padding that we'd like to return from >> GetResourcePadding() would be for "non-root" hotplug controllers. >> >> According to the PI spec, this is a valid thing to do: >> >> * From "12.5 Sample Implementation for a Platform Containing PCI >> Hot Plug* Slots": >> >>> [...] For all the root HPCs and the nonroot HPCs, call >>> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding() to obtain the >>> amount of overallocation and add that amount to the requests from the >>> physical devices. Reprogram the bus numbers by taking into account the >>> bus resource padding information. [...] >> >> * From "12.6 PCI Hot Plug PCI Initialization Protocol, >> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding()": >> >>> [...] This member function is called for all the root HPCs and nonroot >>> HPCs that are detected by the PCI bus enumerator. [...] The PCI bus >>> driver is responsible for adding this resource request to the resource >>> requests by the physical PCI devices. [...] >> >> I searched PciBusDxe for GetResourcePadding() call sites, and there are >> two: >> >> (1) In "MdeModulePkg/Bus/Pci/PciBusDxe/PciHotPlugSupport.c", function >> GetResourcePaddingForHpb(). This call site seems to work as follows, >> for our purposes: >> >> GetResourcePaddingPpb() >> [MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c] >> GetResourcePaddingForHpb() >> [MdeModulePkg/Bus/Pci/PciBusDxe/PciHotPlugSupport.c] >> gPciHotPlugInit->GetResourcePadding() >> // >> // and then save the resource descriptors, including the bus >> // padding, in PciIoDevice->ResourcePaddingDescriptors >> // >> >> ResourcePaddingPolicy() >> [MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c] >> ApplyResourcePadding() >> [MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c] >> // >> // translate IO and MMIO resources from >> // PciDev->ResourcePaddingDescriptors to internal padding >> // representation >> // >> >> This call site and processing seem to be active for "non-root" >> hotplug controllers, but they do not consider bus numbers, only IO >> and MMIO. > [Ray] It's in bus allocation phase. So the PciBus driver only cares about the > bus padding. > PciBus driver firstly allocates the bus resource and programs the PCI-PCI bridges > to set the bus numbers. > After that, PciBus allocates the IO/MMIO resources and program the BARs in > PCI devices & PCI-PCI bridges. > So it's good here. >> >> (2) In the PciScanBus() function, file >> "MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c", we have another >> GetResourcePadding() call -- your present patch is updating this >> code part. >> >> From the resource padding list returned by this GetResourcePadding() >> invocation, PciBusDxe cares only about the bus numbers. I think >> that's probably fine. However, in this location, >> GetResourcePadding() is called *only* for "root" hotplug >> controllers, of which QEMU has none. >> >> The end result is that bus number paddings are not considered at all for >> "non-root" hotplug controllers: > [Ray] Bus number padding should be considered for "non-root" HPCs. > So I agree it's a bug/gap I need to fix. I noticed that already before your question. > And I do have a plan to fix it though the priority is not very high. Awesome, thank you! It is not very urgent for us. Do you agree that I file a TianoCore BZ about this? > >> - site (1) is active for "non-root" HPCs, but it ignores bus number >> paddings, >> - site (2) processes bus number paddings, but it is inactive for >> "non-root" HPCs. >> >> This doesn't seem to match what the PI spec says. Is this behavior intentional >> in PciBusDxe? If so, what is the reason for it? > [Ray] In summary, site(1) is good. Site(2) contains a bug/gap. > Would you please submit a Bugzilla to record this issue? Thank you for answering my question :) I've now filed the following TianoCore BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=656 > But can you just return the non-root HPCs as root for a workaround? > What prevent you from reporting the HPC as non-root? Yes, I've thought of this as well. The problem is that the "a-priori" knowledge of the root HPCs, that the PI spec mentions for GetRootHpcList(), is actually *dynamic* knowledge in QEMU's case. In "OvmfPkg/PciHotPlugInitDxe", I could only collect the affected bridges by doing a full (recursive) PCI hierarchy enumeration, and look into the config space of each "candidate" controller. However, PciBusDxe already implements the full enumeration, so I don't think PciHotPlugInitDxe should it do internally. (In particular in a platform hook that could be called from a "sensitive" location in PciBusDxe.) > My opinion is the differences between non-root and root HPCs are: > 1. Explicitly initialization is needed for root HPCs I agree that this is one difference. Such initialization is not needed in QEMU's case however. > 2. Root HPCs may not be detected as hot-pluggable by PciBus using standard ways I agree that this is the other difference. All of QEMU's hotplug controllers can be enumerated in standard ways however. > If your platform doesn't report any root HPCs, why not just report non-root as root ones? (See above:) because the list of non-root HPCs is not known to OVMF in any static way, it is dynamic (boot time) information. And, QEMU does not expose this information to OVMF in any dedicated, easy-to-consume way. The only way to find out about the non-root HPCs is to actually traverse the PCI hierarchy. Given that PciBusDxe is in the middle of doing exactly that, I think that OvmfPkg/PciHotPlugInitDxe should not perform an "internal" enumeration in GetRootHpcList(). I don't think that OvmfPkg should have a private implementation of this PCI traversal, when the "main" traversal is already done in PciBusDxe. (And, likely much better than what I could write.) When you fix TianoCore BZ#656, that will be perfect for us. :) Thank you very much! Laszlo