From: Laszlo Ersek
To: "Zhoujian (jay)"
Cc: "Yao, Jiewen", "edk2-devel@lists.01.org", "Huangweidong (C)",
 "liujunjie (A)", "wangxin (U)", "wujing (O)", "dengkai (A)"
References: <74D8A39837DF1E4DA445A8C0B3885C503F462A65@shsmsx102.ccr.corp.intel.com>
 <74D8A39837DF1E4DA445A8C0B3885C503F464A14@shsmsx102.ccr.corp.intel.com>
Date: Tue, 25 Dec 2018 11:18:26 +0100
Subject: Re: Question about hotplugging NIC devices to an empty pci-bridge
List-Id: EDK II Development

Brief answer while I'm on PTO.

(It's difficult to reply to this thread in any sensible manner, because
of the brain-damaged top-posting that outlook and gmail perpetuate. I'll
try my best anyway, but you might have to reverse the order of my
answers to get a good logical explanation. Again, the damage is
self-inflicted here; use a better MUA please.)

On 12/21/18 14:50, Zhoujian (jay) wrote:
>> -----Original Message-----
>> From: Yao, Jiewen [mailto:jiewen.yao@intel.com]
>> Sent: Friday, December 21, 2018 1:28 PM
>> To: Zhoujian (jay); edk2-devel@lists.01.org; lersek@redhat.com
>> Cc: Huangweidong (C); liujunjie (A); wangxin (U); wujing (O);
>> dengkai (A)
>> Subject: RE: Question about hotplugging NIC devices to an empty
>> pci-bridge

When you hotplug a traditional PCI, or PCI Express, device, at OS
runtime, the OS can generally only satisfy the resource requirements of
the device from reserved (pre-allocated) resources.

This means that hotplug plans have to be considered in advance when the
initial PCI enumeration and resource assignment occurs, in the firmware.
The reservations should be considered / propagated upstream (to the root
complex(es)) from the leaf bridge(s) where the hotplug actions are
expected. PciBusDxe covers the propagation, but the "leaves" have to
expose the reservations ("paddings").

The default reservation sizes may be both wasteful and insufficient. One
example for waste is when you have many traditional PCI bridges, each
requiring 4KB IO space, but the platform doesn't have much IO space in
total (the theoretical maximum is 64KB anyway), and so you run out of IO
space during enumeration.

More below:

>>
>> You need to have a PciHotPlug driver to produce the
>> EFI_PCI_HOT_PLUG_INIT_PROTOCOL
>>
>> One example:
>> https://github.com/tianocore/edk2/tree/master/OvmfPkg/PciHotPlugInitDxe
>> Laszlo added it. He may provide comment on how to use it.
>>
>> Another example:
>> https://github.com/tianocore/edk2-platforms/tree/devel-MinPlatform/Platform/Intel/KabylakeOpenBoardPkg/Features/PciHotPlug
>> This is to add Thunderbolt support in Kabylake platform.
>
> I've checked the dsc, and confirmed that the OVMF.fd already had the
> PciHotPlug driver.
> Then I found the resource info through the debug log like below:
>
> InitRootBridge: populated root bus 0, with room for 255 subordinate bus(es)
> RootBridge: PciRoot(0x0)
> Support/Attr: 70069 / 70069
> DmaAbove4G: No
> NoExtConfSpace: Yes
> AllocAttr: 3 (CombineMemPMem Mem64Decode)
> Bus: 0 - FF Translation=0
> Io: C000 - FFFF Translation=0
> Mem: C0000000 - FBFFFFFF Translation=0
> MemAbove4G: 41800000000 - 41FFFFFFFFF Translation=0
> PMem: FFFFFFFFFFFFFFFF - 0 Translation=0
> PMemAbove4G: FFFFFFFFFFFFFFFF - 0 Translation=0
>
> In the OvmfPkg/PlatformPei/Platform.c, the function
> MemMapInitialization sets the PciIoBase=0xC000 and PciIoSize=0x4000 (on
> Q35, the PciIoBase=0x6000 and PciIoSize=0xA000).
>
> So my questions are:
> 1) Why is the default value of PciIoBase 0xC000? Each pci-bridge needs
> a 0x0fff IO window, which means only 4 pci-bridges can be reserved?

The IO space aperture sizes that you see on i440fx and Q35 in
OvmfPkg/PlatformPei emerge like that simply because those are the
largest contiguous IO space ranges that fit between IO ports that belong
to platform devices.

If you run

  git blame -- OvmfPkg/PlatformPei/Platform.c

you soon end up with a pointer to commit bba734ab4c7c
("OvmfPkg/PlatformPei: provide 10 * 4KB of PCI IO Port space on Q35",
2016-05-17). The commit message on that commit should help, and it also
mentions

  https://bugzilla.redhat.com/show_bug.cgi?id=1333238

which is where I had investigated the IO space sizes that were
*practically* available on i440fx and Q35.

> 2) If I set the PciIoBase=0x1000, PciIoSize=0xA000 and start a vm with
> 8 empty pci-bridges, hotplugging a virtual nic to the pci-bridge, the
> problem disappears.
> But will this cause any side effects?

Yes, it could; if you override PciIoBase like this, then PciBusDxe may
easily allocate IO BARs of devices such that they overlap IO ports of
other (built-in, platform) devices.

The solution to the IO space shortage is to use Q35 with a PCI Express
(that is, not traditional PCI) hierarchy. PCI Express devices are
required to function without IO BARs, and you can use PCI Express Root
Ports, and Switches (consisting of Upstream Ports and a number of
Downstream Ports) without consuming IO space at all.

This is documented in great detail in the following two documents in the
QEMU source tree:

[1] docs/pcie.txt
[2] docs/pcie_pci_bridge.txt

Now, if you switch to Q35 / PCIE, then you likely won't run out of IO
space; however, the other issue may still arise, where not enough MMIO
is reserved for hot-plugging devices with large MMIO demands.

For that, OvmfPkg/PciHotPlugInitDxe implements the firmware side for
QEMU's "PCI resource reservation capability". This is a vendor-specific
PCI capability (in traditional config space) that can be added to the
generic PCI Express Root Port device model of QEMU, using the
appropriate command line switches (see again [1] and [2]). When you do
that, PciHotPlugInitDxe instructs PciBusDxe to reserve the given sizes
from the given resource types on the given root port, and then you can
hotplug a large device at OS runtime into that root port.
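
To make the numbers in this thread concrete, here is a minimal Python
sketch. It is illustrative arithmetic only, not code from edk2 or QEMU;
the function names are made up for the example. It assumes the usual
bridge window granularities (4KB for IO, 1MB for MMIO) and that BARs are
naturally aligned power-of-two sizes. It shows how many bridges can get
an IO window from a given aperture, and whether a hotplugged device's
BARs fit into what was reserved on a bridge.

```python
# Illustrative arithmetic only; not taken from edk2 or QEMU.

IO_WINDOW = 0x1000       # assumed: bridge IO windows are 4KB-granular
MMIO_WINDOW = 0x100000   # assumed: bridge memory windows are 1MB-granular


def io_bridges_that_fit(aperture_size):
    """Number of bridges that can each receive one 4KB IO window."""
    return aperture_size // IO_WINDOW


def align_up(value, alignment):
    """Round value up to a multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def window_needed(bar_sizes):
    """Bridge memory window size needed to hold the given BARs.

    BARs are naturally aligned, so placing them largest-first leaves no
    gaps; the total is then rounded up to the window granularity.
    """
    offset = 0
    for size in sorted(bar_sizes, reverse=True):
        offset = align_up(offset, size) + size
    return align_up(offset, MMIO_WINDOW)


def fits(reserved, bar_sizes):
    """True if the BARs fit into the window reserved on the bridge."""
    return window_needed(bar_sizes) <= reserved


# IO apertures quoted in the thread: 0x4000 on i440fx, 0xA000 on Q35.
print(io_bridges_that_fit(0x4000), io_bridges_that_fit(0xA000))

# BAR sizes from the dmesg output in the original report.
nic_bars = [0x10000, 0x4000]
print(hex(window_needed(nic_bars)))
print(fits(0, nic_bars), fits(0x200000, nic_bars))
```

With nothing reserved on an empty bridge, the two prefetchable BARs from
the dmesg output cannot be placed, which matches the "no space for"
messages; a reservation of a single 1MB window would already be enough
for this particular NIC under the assumptions above.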

For more details (beyond the two documents above), please refer to

[3] git log -- OvmfPkg/PciHotPlugInitDxe

[4] https://bugzilla.redhat.com/show_bug.cgi?id=1434740#c5
[5] https://lists.01.org/pipermail/edk2-devel/2017-September/015296.html

More below:

>>> -----Original Message-----
>>> From: Zhoujian (jay) [mailto:jianjay.zhou@huawei.com]
>>> Sent: Friday, December 21, 2018 11:04 AM
>>> To: Yao, Jiewen; edk2-devel@lists.01.org; lersek@redhat.com
>>> Cc: Huangweidong (C); liujunjie (A); wangxin (U); wujing (O);
>>> dengkai (A)
>>> Subject: RE: Question about hotplugging NIC devices to an empty
>>> pci-bridge
>>>
>>> I've tried to set PcdPciBusHotplugDeviceSupport to be true in
>>> MdeModulePkg.dec like below:
>>> gEfiMdeModulePkgTokenSpaceGuid.PcdPciBusHotplugDeviceSupport|TRUE|BOOLEAN|0x0001003d
>>> But the problem still exists. Are there any steps I missed? Or does
>>> some info need to be passed to OVMF by Qemu?
>>>
>>> Could you give me more info?
>>>
>>> Thanks,
>>> Jay Zhou
>>>
>>>> -----Original Message-----
>>>> From: Yao, Jiewen [mailto:jiewen.yao@intel.com]
>>>> Sent: Thursday, December 20, 2018 8:09 PM
>>>> To: Zhoujian (jay); edk2-devel@lists.01.org
>>>> Cc: Huangweidong (C); liujunjie (A); wangxin (U); wujing (O);
>>>> dengkai (A)
>>>> Subject: RE: Question about hotplugging NIC devices to an empty
>>>> pci-bridge
>>>>
>>>> Maybe you can use EFI_PCI_HOT_PLUG_INIT_PROTOCOL to reserve some
>>>> resource.
>>>>
>>>> See MdePkg\Include\Protocol\PciHotPlugInit.h
>>>>
>>>> Thank you
>>>> Yao Jiewen
>>>>
>>>>> -----Original Message-----
>>>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On
>>>>> Behalf Of Zhoujian (jay)
>>>>> Sent: Thursday, December 20, 2018 7:34 PM
>>>>> To: edk2-devel@lists.01.org
>>>>> Cc: Huangweidong (C); liujunjie (A); wangxin (U); wujing (O);
>>>>> dengkai (A)
>>>>> Subject: [edk2] Question about hotplugging NIC devices to an empty
>>>>> pci-bridge
>>>>>
>>>>> Hi all,
>>>>>
>>>>> The issue occurs when I start a virtual machine in UEFI mode via
>>>>> libvirt on the qemu-kvm platform; the vm is configured with 8
>>>>> pci-bridges on root bus0. I hotplug a device like a virtual nic to
>>>>> an empty pci-bridge which has no device connected. Logging in to
>>>>> the vm, I can see the device with "lspci", but it isn't shown by
>>>>> "ifconfig -a". Dmesg shows like below:
>>>>>
>>>>> pci 0000:04:01.0: BAR 0: no space for [mem size 0x00010000 64bit pref]
>>>>> pci 0000:04:01.0: BAR 0: failed to assign [mem size 0x00010000 64bit pref]
>>>>> pci 0000:04:01.0: BAR 3: no space for [mem size 0x00004000 64bit pref]
>>>>> pci 0000:04:01.0: BAR 3: failed to assign [mem size 0x00004000 64bit pref]
>>>>>
>>>>> Reboot the vm, everything turns back to normal and I can see the
>>>>> new hotplugged nic by "ifconfig -a".
>>>>>
>>>>> Using the OVMF compiled from the latest edk2 source code, the same
>>>>> problem arises.
>>>>>
>>>>> So, my questions are:
>>>>> 1) Does the generic PCI bus driver in edk2 not allocate IO and/or
>>>>> MMIO for a bridge if there is no device behind the bridge that
>>>>> consumes that kind of resource?
>>>>> 2) What's the purpose of this strategy?
>>>>> 3) Why not allocate resources to all bridges like seabios?
>>>>> 4) Is there any switch for me to turn off this constraint so that
>>>>> every pci-bridge including empty ones can be assigned IO and
>>>>> memory window?
>>>>> Otherwise, each time I hotplug a device to empty pci-bridge, a
>>>>> reboot operation should be implemented to use the device?
>>>>>
>>>>> Any help will be appreciated, Thanks!

Currently, the resource reservation capability is implemented on the
Generic PCI Express Root Port device model, which is only usable on Q35.

If you really want to hotplug a traditional PCI device, *while* sizing
the reservation appropriately, I believe you'll have to:

- size the reservation on a Root Port as needed,
- cold-plug a PCIE-PCI bridge first into the Root Port,
- hotplug the traditional PCI device into the PCIE-PCI bridge.

(You can also *hot*plug the PCIE-PCI bridge itself, because has been
fixed, but then remember to reserve bus numbers as well, at the Root
Port level.)

We worked out this exact scenario with another developer earlier, on the
SeaBIOS mailing list. Please read through the thread below:

[SeaBIOS] hotplug failure issue on pci-bridge
http://mid.mail-archive.com/da8e8d1c-ab1e-c790-0c34-ef094a438a77@linux.intel.com
https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/WKHZ6LVPOAXRPPT4M5HZKUPON2Z7EZWB/

Hope this helps,
Laszlo