public inbox for devel@edk2.groups.io
From: Laszlo Ersek <lersek@redhat.com>
To: "Ni, Ruiyu" <ruiyu.ni@intel.com>,
	"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Cc: Marcel Apfelbaum <marcel@redhat.com>,
	Aleksandr Bezzubikov <zuban32s@gmail.com>,
	"Zeng, Star" <star.zeng@intel.com>
Subject: Re: bus number padding [was: MdeModulePkg/PciBus: Avoid hang when BUS pad resource is not in top]
Date: Fri, 28 Jul 2017 17:50:29 +0200	[thread overview]
Message-ID: <0ed18d93-1d0b-56a2-7636-ac499bbc5c6d@redhat.com> (raw)
In-Reply-To: <734D49CCEBEEF84792F5B80ED585239D5B9C4316@SHSMSX104.ccr.corp.intel.com>

On 07/28/17 12:19, Ni, Ruiyu wrote:
> Laszlo,
> Thanks for raising this question.
> Please see my embedded replies below.
> 
> Thanks/Ray
> 
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
>> Laszlo Ersek
>> Sent: Thursday, July 27, 2017 7:23 PM
>> To: Ni, Ruiyu <ruiyu.ni@intel.com>; edk2-devel@lists.01.org
>> Cc: Marcel Apfelbaum <marcel@redhat.com>; Aleksandr Bezzubikov
>> <zuban32s@gmail.com>; Zeng, Star <star.zeng@intel.com>
>> Subject: [edk2] bus number padding [was: MdeModulePkg/PciBus: Avoid
>> hang when BUS pad resource is not in top]
>>
>> Hello Ray,
>>
>> please excuse me for hijacking this thread a little bit, but I'm very happy that
>> you happen to be looking at this exact part of the code, which is also what
>> I've been discussing with Marcel and Aleksandr (CC'd) due to a different
>> reason.
>>
>> (The relevant sub-thread is at
>>
>>   Allow RedHat PCI bridges reserve more buses than necessary during init
>>
>>   http://mid.mail-archive.com/d0bd04f4-1ac4-0e3a-885f-2ceb6180f69a@redhat.com
>>
>> but it's very long, so I'm not asking you to read that -- instead, please let me
>> ask you our question independently and in a self-contained
>> form.)
>>
>> In "OvmfPkg/PciHotPlugInitDxe/", we plan to return some BUS resource
>> paddings, from the
>> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding()
>> function.
>>
>> QEMU has no "root" hotplug controllers, only "non-root" hotplug controllers
>> (using the terminology from "12.6 PCI Hot Plug PCI Initialization Protocol" in
>> the PI-1.6 spec). Therefore OVMF's
>> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetRootHpcList() returns an empty
>> array.
>> The BUS resource padding that we'd like to return from
>> GetResourcePadding() would be for "non-root" hotplug controllers.
>>
>> According to the PI spec, this is a valid thing to do:
>>
>> * From "12.5 Sample Implementation for a Platform Containing PCI
>>   Hot Plug Slots":
>>
>>> [...] For all the root HPCs and the nonroot HPCs, call
>>> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding() to obtain the
>>> amount of overallocation and add that amount to the requests from the
>>> physical devices. Reprogram the bus numbers by taking into account the
>>> bus resource padding information. [...]
>>
>> * From "12.6 PCI Hot Plug PCI Initialization Protocol,
>>   EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding()":
>>
>>> [...] This member function is called for all the root HPCs and nonroot
>>> HPCs that are detected by the PCI bus enumerator. [...] The PCI bus
>>> driver is responsible for adding this resource request to the resource
>>> requests by the physical PCI devices. [...]
>>
>> I searched PciBusDxe for GetResourcePadding() call sites, and there are
>> two:
>>
>> (1) In "MdeModulePkg/Bus/Pci/PciBusDxe/PciHotPlugSupport.c", function
>>     GetResourcePaddingForHpb(). This call site seems to work as follows,
>>     for our purposes:
>>
>>     GetResourcePaddingPpb()
>>         [MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c]
>>       GetResourcePaddingForHpb()
>>         [MdeModulePkg/Bus/Pci/PciBusDxe/PciHotPlugSupport.c]
>>         gPciHotPlugInit->GetResourcePadding()
>>         //
>>         // and then save the resource descriptors, including the bus
>>         // padding, in PciIoDevice->ResourcePaddingDescriptors
>>         //
>>
>>     ResourcePaddingPolicy()
>>         [MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c]
>>       ApplyResourcePadding()
>>         [MdeModulePkg/Bus/Pci/PciBusDxe/PciResourceSupport.c]
>>         //
>>         // translate IO and MMIO resources from
>>         // PciDev->ResourcePaddingDescriptors to internal padding
>>         // representation
>>         //
>>
>>     This call site and processing seem to be active for "non-root"
>>     hotplug controllers, but they do not consider bus numbers, only IO
>>     and MMIO.
> [Ray] It's in the bus allocation phase, so the PciBus driver only cares
> about the bus padding.
> The PciBus driver first allocates the bus resource and programs the
> PCI-PCI bridges to set the bus numbers.
> After that, PciBus allocates the IO/MMIO resources and programs the
> BARs in the PCI devices & PCI-PCI bridges.
> So it's good here.
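
(To make the bus padding concrete for readers following along: the padding
returned by GetResourcePadding() is a list of ACPI address-space
descriptors terminated by an end tag. The sketch below is plain C with
local stand-in types rather than the real edk2 definitions from MdePkg,
and it assumes that the AddrLen field carries the number of extra bus
numbers to reserve; it is not buildable edk2 code.)

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for the edk2 descriptor types (the real ones live in
 * MdePkg's IndustryStandard headers); packed so the Len field matches. */
#pragma pack(push, 1)
typedef struct {
  uint8_t  Desc;                   /* 0x8A: QWORD address space descriptor */
  uint16_t Len;                    /* descriptor length minus three bytes  */
  uint8_t  ResType;                /* 2: bus number range                  */
  uint8_t  GenFlag;
  uint8_t  SpecificFlag;
  uint64_t AddrSpaceGranularity;
  uint64_t AddrRangeMin;
  uint64_t AddrRangeMax;
  uint64_t AddrTranslationOffset;
  uint64_t AddrLen;                /* assumed: extra bus numbers requested */
} ADDR_SPACE_DESC;

typedef struct {
  uint8_t Desc;                    /* 0x79: end tag */
  uint8_t Checksum;
} END_TAG_DESC;
#pragma pack(pop)

/* Fill in the descriptor pair that a GetResourcePadding() sketch would
 * hand back for a hotplug bridge wanting ExtraBuses more bus numbers. */
static void
BuildBusPadding (ADDR_SPACE_DESC *Desc, END_TAG_DESC *End, uint64_t ExtraBuses)
{
  memset (Desc, 0, sizeof *Desc);
  Desc->Desc    = 0x8A;
  Desc->Len     = (uint16_t)(sizeof *Desc - 3);
  Desc->ResType = 2;               /* bus numbers, not IO/MMIO */
  Desc->AddrLen = ExtraBuses;

  End->Desc     = 0x79;
  End->Checksum = 0;               /* 0: consumer may skip verification */
}
```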
>>
>> (2) In the PciScanBus() function, file
>>     "MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c", we have another
>>     GetResourcePadding() call -- your present patch is updating this
>>     code part.
>>
>>     From the resource padding list returned by this GetResourcePadding()
>>     invocation, PciBusDxe cares only about the bus numbers. I think
>>     that's probably fine. However, in this location,
>>     GetResourcePadding() is called *only* for "root" hotplug
>>     controllers, of which QEMU has none.
>>
>> The end result is that bus number paddings are not considered at all for
>> "non-root" hotplug controllers:
> [Ray] Bus number padding should be considered for "non-root" HPCs.
> So I agree it's a bug/gap I need to fix. I had already noticed that
> before your question, and I do have a plan to fix it, though the
> priority is not very high.

Awesome, thank you!

It is not very urgent for us. Do you agree that I should file a
TianoCore BZ about this?

> 
>> - site (1) is active for "non-root" HPCs, but it ignores bus number
>>   paddings,
>> - site (2) processes bus number paddings, but it is inactive for
>>   "non-root" HPCs.
>>
>> This doesn't seem to match what the PI spec says. Is this behavior intentional
>> in PciBusDxe? If so, what is the reason for it?
> [Ray] In summary, site (1) is good. Site (2) contains a bug/gap.
> Would you please submit a Bugzilla to record this issue?

Thank you for answering my question :) I've now filed the following
TianoCore BZ:

https://bugzilla.tianocore.org/show_bug.cgi?id=656

> But can you just return the non-root HPCs as root ones for a workaround?
> What prevents you from reporting the HPCs as root?

Yes, I've thought of this as well.

The problem is that the "a-priori" knowledge of the root HPCs, that the
PI spec mentions for GetRootHpcList(), is actually *dynamic* knowledge
in QEMU's case.

In "OvmfPkg/PciHotPlugInitDxe", I could only collect the affected
bridges by doing a full (recursive) PCI hierarchy enumeration, and by
looking into the config space of each "candidate" controller. However,
PciBusDxe already implements the full enumeration, so I don't think
PciHotPlugInitDxe should do it internally. (In particular in a platform
hook that could be called from a "sensitive" location in PciBusDxe.)
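
For illustration, the per-candidate config-space check could look roughly
like the following. This is a self-contained sketch, not OVMF code: the
flat 256-byte config-space snapshot is an assumption, and the bits tested
are the "Slot Implemented" bit of the PCI Express Capabilities register
and the "Hot-Plug Capable" bit of the Slot Capabilities register.

```c
#include <assert.h>
#include <stdint.h>

/* Helpers to read little-endian fields from a 256-byte config space
 * snapshot (using a flat buffer instead of a real config access method
 * is an assumption made for this sketch). */
static uint16_t
Read16 (const uint8_t *ConfigSpace, uint8_t Offset)
{
  return (uint16_t)(ConfigSpace[Offset] |
                    ((uint16_t)ConfigSpace[Offset + 1] << 8));
}

static uint32_t
Read32 (const uint8_t *ConfigSpace, uint8_t Offset)
{
  return (uint32_t)Read16 (ConfigSpace, Offset) |
         ((uint32_t)Read16 (ConfigSpace, (uint8_t)(Offset + 2)) << 16);
}

/* Walk the capability list of one candidate function and test whether it
 * advertises a hot-pluggable PCI Express slot. */
static int
IsHotplugSlot (const uint8_t *ConfigSpace)
{
  uint8_t CapPtr;

  /* Status register (0x06) bit 4: a capability list is present. */
  if ((Read16 (ConfigSpace, 0x06) & (1 << 4)) == 0) {
    return 0;
  }

  for (CapPtr = (uint8_t)(ConfigSpace[0x34] & 0xFC);
       CapPtr != 0;
       CapPtr = (uint8_t)(ConfigSpace[CapPtr + 1] & 0xFC)) {
    if (ConfigSpace[CapPtr] == 0x10) {       /* PCI Express capability */
      uint16_t PcieCaps = Read16 (ConfigSpace, (uint8_t)(CapPtr + 2));
      uint32_t SlotCaps = Read32 (ConfigSpace, (uint8_t)(CapPtr + 0x14));

      return (PcieCaps & (1 << 8)) != 0 &&   /* Slot Implemented */
             (SlotCaps & (1 << 6)) != 0;     /* Hot-Plug Capable */
    }
  }
  return 0;
}
```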

> My opinion is that the differences between non-root and root HPCs are:
> 1. Explicit initialization is needed for root HPCs

I agree that this is one difference.

Such initialization is not needed in QEMU's case however.

> 2. Root HPCs may not be detected as hot-pluggable by PciBus using standard ways

I agree that this is the other difference.

All of QEMU's hotplug controllers can be enumerated in standard ways
however.

> If your platform doesn't report any root HPCs, why not just report non-root as root ones?

(See above:) because the list of non-root HPCs is not known to OVMF in
any static way; it is dynamic (boot-time) information. And, QEMU does
not expose this information to OVMF in any dedicated, easy-to-consume way.

The only way to find out about the non-root HPCs is to actually traverse
the PCI hierarchy.
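
Sketched as self-contained C (with a hypothetical ReadCfg callback and a
made-up demo topology standing in for real config-space access), such a
traversal would look roughly like this:

```c
#include <assert.h>
#include <stdint.h>

/* Skeleton of the recursive walk: ReadCfg abstracts config space access
 * (a hypothetical callback, not a real edk2 interface), Visit fires once
 * per discovered function, and PCI-PCI bridges (header type 0x01) are
 * descended into via their secondary bus number register (0x19). */
typedef uint8_t (*READ_CFG) (uint8_t Bus, uint8_t Dev, uint8_t Func,
                             uint8_t Off, void *Ctx);
typedef void (*VISIT) (uint8_t Bus, uint8_t Dev, uint8_t Func, void *Ctx);

static void
ScanBus (uint8_t Bus, READ_CFG ReadCfg, VISIT Visit, void *Ctx)
{
  uint8_t Dev, Func;

  for (Dev = 0; Dev < 32; Dev++) {
    for (Func = 0; Func < 8; Func++) {
      uint16_t Vendor = (uint16_t)(ReadCfg (Bus, Dev, Func, 0x00, Ctx) |
                        ((uint16_t)ReadCfg (Bus, Dev, Func, 0x01, Ctx) << 8));
      uint8_t  HeaderType;

      if (Vendor == 0xFFFF) {
        if (Func == 0) break;       /* empty slot */
        continue;
      }
      Visit (Bus, Dev, Func, Ctx);

      HeaderType = ReadCfg (Bus, Dev, Func, 0x0E, Ctx);
      if ((HeaderType & 0x7F) == 0x01) {
        /* PCI-PCI bridge: recurse into its secondary bus */
        ScanBus (ReadCfg (Bus, Dev, Func, 0x19, Ctx), ReadCfg, Visit, Ctx);
      }
      if (Func == 0 && (HeaderType & 0x80) == 0) {
        break;                      /* single-function device */
      }
    }
  }
}

/* A made-up two-bus demo topology: one bridge on bus 0 whose secondary
 * bus 1 holds one endpoint. Entirely synthetic, for illustration only. */
static uint8_t
DemoReadCfg (uint8_t Bus, uint8_t Dev, uint8_t Func, uint8_t Off, void *Ctx)
{
  (void)Ctx;
  if (Bus > 1 || Dev != 0 || Func != 0) {
    return 0xFF;                    /* reads as vendor 0xFFFF: empty */
  }
  switch (Off) {
  case 0x00:
  case 0x01: return 0x12;           /* some fake vendor ID */
  case 0x0E: return Bus == 0 ? 0x01 : 0x00;  /* bridge only on bus 0 */
  case 0x19: return 0x01;           /* bridge's secondary bus number */
  default:   return 0x00;
  }
}

static void
CountVisits (uint8_t Bus, uint8_t Dev, uint8_t Func, void *Ctx)
{
  (void)Bus; (void)Dev; (void)Func;
  ++*(int *)Ctx;
}
```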

Given that PciBusDxe is in the middle of doing exactly that, I think
that OvmfPkg/PciHotPlugInitDxe should not perform an "internal"
enumeration in GetRootHpcList(). I don't think that OvmfPkg should have
a private implementation of this PCI traversal, when the "main"
traversal is already done in PciBusDxe. (And, likely much better than
what I could write.)

When you fix TianoCore BZ#656, that will be perfect for us. :)

Thank you very much!
Laszlo


Thread overview: 5+ messages

2017-07-27  3:05 [PATCH] MdeModulePkg/PciBus: Avoid hang when BUS pad resource is not in top Ruiyu Ni
2017-07-27  4:54 ` Zeng, Star
2017-07-27 11:23 ` bus number padding [was: MdeModulePkg/PciBus: Avoid hang when BUS pad resource is not in top] Laszlo Ersek
2017-07-28 10:19   ` Ni, Ruiyu
2017-07-28 15:50     ` Laszlo Ersek [this message]
