* Re: [PATCH] MdeModulePkg/PciBus: Reserve Bus number for non-root and root HPBs
2018-01-05 8:23 [PATCH] MdeModulePkg/PciBus: Reserve Bus number for non-root and root HPBs Ruiyu Ni
@ 2018-01-05 20:12 ` Laszlo Ersek
0 siblings, 0 replies; 2+ messages in thread
From: Laszlo Ersek @ 2018-01-05 20:12 UTC (permalink / raw)
To: Ruiyu Ni; +Cc: edk2-devel, Marcel Apfelbaum, Alexander Bezzubikov
Hello Ray,
thanks very much for the patch; please see my comments below.
(+Marcel, +Aleksandr, just for their information)
On 01/05/18 09:23, Ruiyu Ni wrote:
> REF: https://bugzilla.tianocore.org/show_bug.cgi?id=656
>
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Ruiyu Ni <ruiyu.ni@intel.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> ---
> MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c b/MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c
> index 8b076e8791..a777f91f2c 100644
> --- a/MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c
> +++ b/MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c
> @@ -1,7 +1,7 @@
> /** @file
> Internal library implementation for PCI Bus module.
>
> -Copyright (c) 2006 - 2017, Intel Corporation. All rights reserved.<BR>
> +Copyright (c) 2006 - 2018, Intel Corporation. All rights reserved.<BR>
> (C) Copyright 2015 Hewlett Packard Enterprise Development LP<BR>
> This program and the accompanying materials
> are licensed and made available under the terms and conditions of the BSD License
> @@ -1125,14 +1125,14 @@ PciScanBus (
> BusPadding = FALSE;
> if (gPciHotPlugInit != NULL) {
>
> - if (IsRootPciHotPlugBus (PciDevice->DevicePath, &HpIndex)) {
> + if (IsPciHotPlugBus (PciDevice)) {
>
> //
> // If it is initialized, get the padded bus range
> //
> Status = gPciHotPlugInit->GetResourcePadding (
> gPciHotPlugInit,
> - gPciRootHpcPool[HpIndex].HpbDevicePath,
> + PciDevice->DevicePath,
> PciAddress,
> &State,
> (VOID **) &Descriptors,
>
In order to test this patch (i.e., bus number reservation) in OVMF, I
created the following virtual machine configuration / PCI Express
hierarchy:
- Four PCI Express root complexes / root bridges in total (one main, and
three extra), assigning bus numbers 0, 64 (0x40), 128 (0x80) and 192
(0xc0) to them, respectively. These root bridges partition the 0..255
bus number space (which is what QEMU and OVMF support) accordingly.
- Onto each root bridge, I placed two PCI Express root ports. The basic
configuration of these root ports is that they do not reserve any bus
numbers, but this can be changed later.
- In the first two root ports, I cold-plugged an XHCI and a virtio SCSI
controller, respectively. In the other six root ports, I plugged
virtio NICs.
Please find the QEMU command line / script below:
> ISO=Fedora-Workstation-Live-x86_64-27-1.6.iso
> CODE=OVMF_CODE.4m.3264.fd
> TMPL=OVMF_VARS.fd
> VARS=try13.fd
> DEBUG=debug13.log
> MODERN=disable-legacy=on,disable-modern=off
> BUS_RES=0
>
> cp -- "${TMPL}" "${VARS}"
>
> qemu-system-x86_64 \
> -no-user-config \
> -nodefaults \
> \
> -s \
> \
> -machine q35,smm=on,accel=kvm \
> -global driver=cfi.pflash01,property=secure,value=on \
> -cpu Haswell-noTSX,+vmx,enforce=on \
> \
> -m 4096 \
> -smp sockets=4,cores=1,threads=1 \
> \
> -drive if=pflash,format=raw,file=$CODE,readonly \
> -drive if=pflash,format=raw,file=$VARS \
> \
> -chardev file,id=debugfile,path=$DEBUG \
> -device isa-debugcon,iobase=0x402,chardev=debugfile \
> \
> -chardev stdio,id=char0,signal=off,mux=on \
> -mon chardev=char0,mode=readline \
> -serial chardev:char0 \
> \
> -device intel-iommu \
> \
> -numa node,nodeid=0,cpus=0,mem=1024 \
> -numa node,nodeid=1,cpus=1,mem=1024 \
> -numa node,nodeid=2,cpus=2,mem=1024 \
> -numa node,nodeid=3,cpus=3,mem=1024 \
> \
> -drive id=fedora-iso,if=none,format=raw,readonly,file=$ISO \
> \
> -netdev user,id=net0,net=10.0.2.0/24,hostfwd=tcp:127.0.0.1:2222-:22 \
> -netdev user,id=net1,net=10.0.3.0/24 \
> -netdev user,id=net2,net=10.0.4.0/24 \
> -netdev user,id=net3,net=10.0.5.0/24 \
> -netdev user,id=net4,net=10.0.6.0/24 \
> -netdev user,id=net5,net=10.0.7.0/24 \
> \
> -device pxb-pcie,bus_nr=64,id=rootbr1,numa_node=1,bus=pcie.0,addr=0x3 \
> -device pxb-pcie,bus_nr=128,id=rootbr2,numa_node=2,bus=pcie.0,addr=0x4 \
> -device pxb-pcie,bus_nr=192,id=rootbr3,numa_node=3,bus=pcie.0,addr=0x5 \
> \
> -device qxl-vga,bus=pcie.0,addr=0x6 \
> \
> -device pcie-root-port,id=rootbr0-port1,bus=pcie.0,addr=0x1,slot=1,io-reserve=0,bus-reserve=$((BUS_RES * 2)) \
> -device nec-usb-xhci,id=usb-host-controller,bus=rootbr0-port1 \
> -device usb-kbd,bus=usb-host-controller.0 \
> -device usb-tablet,bus=usb-host-controller.0 \
> \
> -device pcie-root-port,id=rootbr0-port2,bus=pcie.0,addr=0x2,slot=2,io-reserve=0,bus-reserve=$((BUS_RES * 3)) \
> -device virtio-scsi-pci,bus=rootbr0-port2,$MODERN,id=scsi0 \
> -device scsi-cd,drive=fedora-iso,bus=scsi0.0,bootindex=1 \
> \
> -device pcie-root-port,id=rootbr1-port1,bus=rootbr1,addr=0x1,slot=3,io-reserve=0,bus-reserve=$((BUS_RES * 4)) \
> -device virtio-net-pci,$MODERN,bus=rootbr1-port1,netdev=net0 \
> \
> -device pcie-root-port,id=rootbr1-port2,bus=rootbr1,addr=0x2,slot=4,io-reserve=0,bus-reserve=$((BUS_RES * 5)) \
> -device virtio-net-pci,$MODERN,bus=rootbr1-port2,netdev=net1 \
> \
> -device pcie-root-port,id=rootbr2-port1,bus=rootbr2,addr=0x1,slot=5,io-reserve=0,bus-reserve=$((BUS_RES * 6)) \
> -device virtio-net-pci,$MODERN,bus=rootbr2-port1,netdev=net2 \
> \
> -device pcie-root-port,id=rootbr2-port2,bus=rootbr2,addr=0x2,slot=6,io-reserve=0,bus-reserve=$((BUS_RES * 7)) \
> -device virtio-net-pci,$MODERN,bus=rootbr2-port2,netdev=net3 \
> \
> -device pcie-root-port,id=rootbr3-port1,bus=rootbr3,addr=0x1,slot=7,io-reserve=0,bus-reserve=$((BUS_RES * 8)) \
> -device virtio-net-pci,$MODERN,bus=rootbr3-port1,netdev=net4 \
> \
> -device pcie-root-port,id=rootbr3-port2,bus=rootbr3,addr=0x2,slot=8,io-reserve=0,bus-reserve=$((BUS_RES * 9)) \
> -device virtio-net-pci,$MODERN,bus=rootbr3-port2,netdev=net5 \
> \
>
I. This command line allows for testing three cold-plug scenarios:
(1) Capture a log from OVMF, and an "lspci -t -v" output from the Linux
guest, without your patch.
In this case, the "bus-reserve" properties of the "pcie-root-port"
devices are irrelevant -- although OVMF would return them to
PciBusDxe as padding requests, were BUS_RES=1 set, without your
patch PciBusDxe ignores them (because QEMU only has "non-root"
hotplug controllers, in the PI spec sense).
The purpose of this step is to capture a "baseline", without bus
number reservations.
(2) Apply your patch, and set BUS_RES to zero. Capture the OVMF log and
the "lspci -t -v" output again, and compare them to (1).
The idea here is that the patch should make no difference relative
to the current status, if bus numbers are not reserved.
(3) Apply your patch, and set BUS_RES to one. Capture the OVMF log and
the "lspci -t -v" output again, and compare them to (2) (or,
equivalently, to (1)).
The idea is that BUS_RES=1 will set the reservation numbers to 2, 3,
... 9 for the eight root ports, respectively.
Results:
(1) The baseline "lspci -t -v" output (i.e., without your patch) is:
> -+-[0000:c0]-+-01.0-[c1]----00.0 Red Hat, Inc Virtio network device
> | \-02.0-[c2]----00.0 Red Hat, Inc Virtio network device
> +-[0000:80]-+-01.0-[81]----00.0 Red Hat, Inc Virtio network device
> | \-02.0-[82]----00.0 Red Hat, Inc Virtio network device
> +-[0000:40]-+-01.0-[41]----00.0 Red Hat, Inc Virtio network device
> | \-02.0-[42]----00.0 Red Hat, Inc Virtio network device
> \-[0000:00]-+-00.0 Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> +-01.0-[01]----00.0 NEC Corporation uPD720200 USB 3.0 Host Controller
> +-02.0-[02]----00.0 Red Hat, Inc Virtio SCSI
> +-03.0 Red Hat, Inc. QEMU PCIe Expander bridge
> +-04.0 Red Hat, Inc. QEMU PCIe Expander bridge
> +-05.0 Red Hat, Inc. QEMU PCIe Expander bridge
> +-06.0 Red Hat, Inc. QXL paravirtual graphic card
> +-1f.0 Intel Corporation 82801IB (ICH9) LPC Interface Controller
> +-1f.2 Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
> \-1f.3 Intel Corporation 82801I (ICH9 Family) SMBus Controller
We can see that
- the root complexes / bridges get bus numbers 00, 40, 80, c0 (all
hex);
- the downstream bridges of the root ports -- in total, there are
eight of them -- get bus numbers 01, 02; 41, 42; 81, 82; c1, c2;
respectively (all hex);
- the individual (single-function) PCI Express devices occupy slot 0
of the respective root ports. (The only possible slot.)
OK.
(2) Applying your patch and then launching the script with BUS_RES=0
prevents OVMF from booting.
Under this scenario, OVMF does not request any bus number padding
from PciBusDxe (this is a valid thing to do). Therefore, the
PciGetBusRange() function
[MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c] returns
EFI_NOT_FOUND, and the PciScanBus() function is terminated:
> if (IsPciHotPlugBus (PciDevice)) {
>
> //
> // If it is initialized, get the padded bus range
> //
> Status = gPciHotPlugInit->GetResourcePadding (
> gPciHotPlugInit,
> PciDevice->DevicePath,
> PciAddress,
> &State,
> (VOID **) &Descriptors,
> &Attributes
> );
>
> if (EFI_ERROR (Status)) {
> return Status;
> }
>
> BusRange = 0;
> NextDescriptors = Descriptors;
> Status = PciGetBusRange (
> &NextDescriptors,
> NULL,
> NULL,
> &BusRange
> );
>
> FreePool (Descriptors);
>
> if (EFI_ERROR (Status)) {
> return Status; <------- HERE
> }
>
> BusPadding = TRUE;
> }
This should be fixed; I'll propose a patch at the bottom.
(3) With your patch applied, and BUS_RES set to 1, the guest boots fine,
and the "lspci -t -v" output is:
> -+-[0000:c0]-+-01.0-[c1-c9]----00.0 Red Hat, Inc Virtio network device
> | \-02.0-[ca-d3]----00.0 Red Hat, Inc Virtio network device
> +-[0000:80]-+-01.0-[81-87]----00.0 Red Hat, Inc Virtio network device
> | \-02.0-[88-8f]----00.0 Red Hat, Inc Virtio network device
> +-[0000:40]-+-01.0-[41-45]----00.0 Red Hat, Inc Virtio network device
> | \-02.0-[46-4b]----00.0 Red Hat, Inc Virtio network device
> \-[0000:00]-+-00.0 Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> +-01.0-[01-03]----00.0 NEC Corporation uPD720200 USB 3.0 Host Controller
> +-02.0-[04-07]----00.0 Red Hat, Inc Virtio SCSI
> +-03.0 Red Hat, Inc. QEMU PCIe Expander bridge
> +-04.0 Red Hat, Inc. QEMU PCIe Expander bridge
> +-05.0 Red Hat, Inc. QEMU PCIe Expander bridge
> +-06.0 Red Hat, Inc. QXL paravirtual graphic card
> +-1f.0 Intel Corporation 82801IB (ICH9) LPC Interface Controller
> +-1f.2 Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
> \-1f.3 Intel Corporation 82801I (ICH9 Family) SMBus Controller
- There is no change to the root complexes / bridges; so that's OK.
- The downstream bridges of the PCI Express root ports (there are
eight of them) now end up with the following
[secondary..subordinate] bus number ranges, respectively:
[01-03] - 1 nr. for the port itself (0x01), 2 nrs reserved (0x02-0x03)
[04-07] - 1 nr. for the port itself (0x04), 3 nrs reserved (0x05-0x07)
[41-45] - 1 nr. for the port itself (0x41), 4 nrs reserved (0x42-0x45)
[46-4b] - 1 nr. for the port itself (0x46), 5 nrs reserved (0x47-0x4b)
[81-87] - 1 nr. for the port itself (0x81), 6 nrs reserved (0x82-0x87)
[88-8f] - 1 nr. for the port itself (0x88), 7 nrs reserved (0x89-0x8f)
[c1-c9] - 1 nr. for the port itself (0xc1), 8 nrs reserved (0xc2-0xc9)
[ca-d3] - 1 nr. for the port itself (0xca), 9 nrs reserved (0xcb-0xd3)
This matches the QEMU command line (the "bus-reserve" properties)
precisely.
- The individual (single-function) PCI Express devices continue to
occupy slot 0 of the respective root ports (using the root port's
"own" bus number as bus number, like in (1) -- i.e. the ports
"secondary" bus nr).
So this is OK as well.
--------*--------
II. Then, I modified the command line, to test the hotplug scenario for
which QEMU actually introduced bus number reservations -- plug a
PCIE-PCI bridge into a PCI Express root port. (This is documented in the
"docs/pcie_pci_bridge.txt" file in the QEMU source tree.) The bridge
being hot-plugged will require its own bus number(s), and that range is
allocated from the reservation.
The command line changes are:
- remove:
> -device virtio-net-pci,$MODERN,bus=rootbr3-port1,netdev=net4 \
- replace
> -device pcie-root-port,id=rootbr3-port1,bus=rootbr3,addr=0x1,slot=7,io-reserve=0,bus-reserve=$((BUS_RES * 8)) \
with
> -device pcie-root-port,id=rootbr3-port1,bus=rootbr3,addr=0x1,slot=7,io-reserve=4K,pref64-reserve=1M,bus-reserve=$((BUS_RES * 8)) \
(This is so that we reserve all the resources for the to-be-hotplugged
PCIE-PCI bridge that Linux will want to assign it.)
(4) Once the Linux guest is booted up like this, the hotplug commands at
the QEMU monitor prompt are (switch to the monitor with [Ctrl-a c]):
> (qemu) device_add pcie-pci-bridge,id=rootbr3-port1-pci-bridge,bus=rootbr3-port1
> (qemu) device_add e1000,bus=rootbr3-port1-pci-bridge,addr=1,netdev=net4
This results in the following "lspci -t -v" output in the guest:
> -+-[0000:c0]-+-01.0-[c1-c9]----00.0-[c2]----01.0 Intel Corporation 82540EM Gigabit Ethernet Controller
> | \-02.0-[ca-d3]----00.0 Red Hat, Inc. Virtio network device
> +-[0000:80]-+-01.0-[81-87]----00.0 Red Hat, Inc. Virtio network device
> | \-02.0-[88-8f]----00.0 Red Hat, Inc. Virtio network device
> +-[0000:40]-+-01.0-[41-45]----00.0 Red Hat, Inc. Virtio network device
> | \-02.0-[46-4b]----00.0 Red Hat, Inc. Virtio network device
> \-[0000:00]-+-00.0 Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> +-01.0-[01-03]----00.0 NEC Corporation uPD720200 USB 3.0 Host Controller
> +-02.0-[04-07]----00.0 Red Hat, Inc. Virtio SCSI
> +-03.0 Red Hat, Inc. QEMU PCIe Expander bridge
> +-04.0 Red Hat, Inc. QEMU PCIe Expander bridge
> +-05.0 Red Hat, Inc. QEMU PCIe Expander bridge
> +-06.0 Red Hat, Inc. QXL paravirtual graphic card
> +-1f.0 Intel Corporation 82801IB (ICH9) LPC Interface Controller
> +-1f.2 Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
> \-1f.3 Intel Corporation 82801I (ICH9 Family) SMBus Controller
- The PCIE-PCI bridge got bus number 0xc2 (carved out of the
0xc2..0xc9 reservation), and under it, we see the E1000
(traditional PCI) device.
So, OK.
--------*--------
III. Summary: everything works fine with this patch, as long as the HPCs
actually request bus number paddings. A fix is needed for the case when
a HPC does not request bus number padding; because as-is, the patch
causes PciScanBus() to terminate early in that case.
I suggest squashing the following patch:
> commit eadc0e1521229521f4ef5d35bb0ec3323a290b85
> Author: Laszlo Ersek <lersek@redhat.com>
> Date: Fri Jan 5 20:33:55 2018 +0100
>
> MdeModulePkg/PciBusDxe: cope with HPCs that request no bus nr padding
>
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>
> diff --git a/MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c b/MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c
> index a777f91f2c1d..83d4d5e74624 100644
> --- a/MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c
> +++ b/MdeModulePkg/Bus/Pci/PciBusDxe/PciLib.c
> @@ -1154,11 +1154,18 @@ PciScanBus (
>
> FreePool (Descriptors);
>
> - if (EFI_ERROR (Status)) {
> + switch (Status) {
> + case EFI_SUCCESS:
> + BusPadding = TRUE;
> + break;
> + case EFI_NOT_FOUND:
> + //
> + // no bus number padding requested
> + //
> + break;
> + default:
> return Status;
> }
> -
> - BusPadding = TRUE;
> }
> }
> }
With that, you can add:
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Thank you!
Laszlo
^ permalink raw reply [flat|nested] 2+ messages in thread