On Mon, May 23, 2022 at 11:24 PM, Gerd Hoffmann wrote:
On Mon, May 23, 2022 at 04:48:05AM -0700, Albecki, Mateusz wrote:
@Ni, Ray

I think EDK2 needs to provide a way for root port to operate without IO space assigned in a platform-independent way. I can think of the following cases when root port didn't get IO space:

1. We have run out of IO space but it's fine since the device under the root port doesn't use IO or has only non-critical functionalities under IO
2. We have run out of IO space and it's really not fine since device needs IO
3. We are running on a CPU which doesn't support IO

For 1. the question is whether the device driver in EDK2 understands that IO bar for that device is optional and will bother to check if it has been assigned and either fail gracefully or continue operation in limited capacity. For 2. the question is whether the driver will fail gracefully. 3 is for completeness at this point I think since the only other architecture that uses EDK2 is ARM which has to deal with it in some way right now which I think maps IO region into MMIO so in a way it supports IO.
Well, the case I'm trying to handle here is qemu microvm. It's x86, but
io address space support for pcie devices is not wired up. So the pcie
host bridge doesn't support io, which is rather close to case (3).

I've checked the device driver behavior in EDK2 for devices which use IO bar here is the rundown:
1. IDE - Doesn't check if IO has been assigned, not giving IO results in undefined behavior
2. SerialIo -> Doesn't check, will assert the system when IO is not assigned (although the logic there is really strange as it can use 3 different access methods)
3. UHCI -> Checks but too late, will most likely result in undefined behavior
Current edk2 behavior is that the initialization of the pcie host bridge
fails in case no io space is present (and all devices connected to it are
not initialized either of course).

With this patch applied pcie host bridge initialization works. PCIe
devices without io bars are enumerated and initialized sucessfully.
PCIe devices with io bar fail to initialize. That isn't much of a
problem tough as a qemu microvm typically has no pcie devices with io
bars.
You mention that devices with IO bar fail to initialize but that is contrary to what I would expect from code review. I've run an experiment with your change in which I am telling
EDK2 that no IO space is available on the system by not updating the IO range in PciHostBridgeLib. Sure enough Devices that need an IO are still enumerated, device path and PciIo are installed
and in general everything works as it used to. If I had an UHCI controller on that system UHCI driver would be loaded and it could potentially result in some strange behavior since that driver isn't smart enough to check
if IO space has been allocated for the device.

To make things worse I see that if we return success there EDK2 will actually go ahead and start assigning trash addresses to the device and enable IO space decoding in case of the PCI root port which means that device will try to decode
invalid IO ranges. Not an issue for a system without an IO but for a system in which we have run out of the IO and we have entered this code branch this new behavior is potentially more dangerous then simply not enumerating the device.

Logs:
PciBus: Resource Map for Root Bridge PciRoot(0x0)
Type =   Io16; Base = 0xFFFFFFFFFFFFFFFF; Length = 0x1000; Alignment = 0xFFF
   Base = 0x0; Length = 0x4; Alignment = 0x3; Owner = PPB [00|06|04:14]
   Base = 0x0; Length = 0x4; Alignment = 0x3; Owner = PPB [00|06|00:14]
   Base = 0xFFFFFFFC; Length = 0x4; Alignment = 0x3; Owner = PCI [00|04|00:24]
   Base = 0xFFFFFFFC; Length = 0x4; Alignment = 0x3; Owner = PCI [00|04|00:20]
   Base = 0xFFFFFFFC; Length = 0x4; Alignment = 0x3; Owner = PCI [00|04|00:1C]
   Base = 0xFFFFFFFC; Length = 0x4; Alignment = 0x3; Owner = PCI [00|04|00:18]


Even with those bad device drivers I would agree that taking this
change presents low risk given that those devices are pretty old and
should be mostly unused on new systems(SerialIo being an exception but
that one is usually an RCIEP).
Also note that for pcie root bridges which do support io address space
this patch changes nothing.
It seems to me like it does. Specifically the error scenario where the system has run out of IO space will not be handled properly I think.

That said I think we are missing a larger issue here - why are we
running out of IO when we have 16 root ports?
I don't think so. I see the *linux kernel* hand out io address space to
pcie root ports (until it runs out). edk2 doesn't.
Ok I misunderstood previous mails

take care,
Gerd

I think to really handle it we would have to have a more involved change. Specifically in the PciHostBridge.c:NoitfyPhase function we need to have a way to tell the PciBus driver which resources specifically failed to allocate and treat this as a condition we need to handle instead of panicking and asserting the system or dropping the entire host bridge. When PciBus driver sees that IO failed to allocate it would skip IO bars and would not allow to set the IO space enable bit in the root bridge. We also would need to change the logic in PciResourceSupport.c:ProgramBar because that function is very optimistic and assumes that if we were able to program one BAR then surely all resources for the device are allocated:

  //
  // Indicate pci bus driver has allocated
  // resource for this device
  // It might be a temporary solution here since
  // pci device could have multiple bar
  //
  Node->PciDev->Allocated = TRUE;

Which simply isn't the case.

Thanks,
Mateusz