From mboxrd@z Thu Jan 1 00:00:00 1970
Received-SPF: Pass (sender SPF authorized) identity=mailfrom;
 client-ip=209.132.183.28; helo=mx1.redhat.com;
 envelope-from=lersek@redhat.com; receiver=edk2-devel@lists.01.org
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ml01.01.org (Postfix) with ESMTPS id DF8AF210F30CE;
 Wed, 23 Jan 2019 07:58:03 -0800 (PST)
Received: from smtp.corp.redhat.com
 (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id 51E19C05D3EC;
 Wed, 23 Jan 2019 15:58:02 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com
 (ovpn-120-117.rdu2.redhat.com [10.10.120.117])
 by smtp.corp.redhat.com (Postfix) with ESMTP id A2B6D67E71;
 Wed, 23 Jan 2019 15:57:54 +0000 (UTC)
To: "Wuzongyong (Euler Dept)"
Cc: "edk2-devel@lists.01.org", chenlixin, "Wanzongshun (Vincent)",
 Alex Williamson, Gerd Hoffmann, "Dr. David Alan Gilbert"
References: <9BD73EA91F8E404F851CF3F519B14AA8036C60EF@DGGEMI521-MBX.china.huawei.com>
From: Laszlo Ersek
Date: Wed, 23 Jan 2019 16:57:43 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To: <9BD73EA91F8E404F851CF3F519B14AA8036C60EF@DGGEMI521-MBX.china.huawei.com>
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.32]); Wed, 23 Jan 2019 15:58:02 +0000 (UTC)
Subject: Re: A VM failed to boot when I changed the
 gUefiOvmfPkgTokenSpaceGuid.PcdPciMmio64Size
X-BeenThere: edk2-devel@lists.01.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: EDK II Development
X-List-Received-Date: Wed, 23 Jan 2019 15:58:04 -0000
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit

Hi,

(adding Alex, Gerd, Dave)

On 01/23/19 12:40, Wuzongyong (Euler Dept) wrote:
>
> Hi,
>
> Recently I do a test with edk2 on rhel platform.

Cool :) I don't frequently see RHEL-related reports on this list.

> I resized the gUefiOvmfPkgTokenSpaceGuid.PcdPciMmio64Size value to 1TB
> for supporting multiple GPUs passthrough which have large bars .

Out of curiosity, did you modify the PCD directly (that is, by changing
the DSC file, or using the --pcd build option), or else, did you use the
QEMU switch

  -fw_cfg name=opt/ovmf/X-PciMmio64Mb,string=1048576

that was described in commit 7e5b1b670c38 ("OvmfPkg: PlatformPei:
determine the 64-bit PCI host aperture for X64 DXE", 2016-03-23)?

> But when I started a VM with a virtio nic and booted from the changed OVMF,

Ah, OK, you wrote "changed OVMF". I guess you modified the DSC then.

So, for a bit more convenience, see above: the aperture size can be set
on the QEMU command line too.
The fw_cfg file name carries "X-" in the name so that it's clear that
the knob is experimental and might change in the future incompatibly.

> it seems the VM failed to boot before loading ipxe.
> The uefi error log is like this:
>
> !!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!!
> ExceptionData - 000000000000000B I:0 R:1 U:0 W:1 P:1 PK:0 S:0
> RIP - 00000000BE4AD7F7, CS - 0000000000000038, RFLAGS - 0000000000010206
> RAX - 0000000000000000, RCX - 0000000000000014, RDX - 0000010000000014
> RBX - 00000000BE4BEFE0, RSP - 00000000BFEDE6F8, RBP - 00000000BE4BEFF0
> RSI - 00000000BE4BEFF0, RDI - 00000000BE4BEFE0
> R8 - 0000000000000000, R9 - 0000000000000000, R10 - 0000000000000000
> R11 - 00000000BE4BB900, R12 - 00000000BE4BEFD0, R13 - 0000000000000060
> R14 - 0000000000000084, R15 - 0000000000000070
> DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
> GS - 0000000000000030, SS - 0000000000000030
> CR0 - 0000000080010033, CR2 - 0000010000000014, CR3 - 00000000BF6BA000
> CR4 - 0000000000000668, CR8 - 0000000000000000
> DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
> DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
> GDTR - 00000000BF6A8A98 0000000000000047, LDTR - 0000000000000000
> IDTR - 00000000BF29E018 0000000000000FFF, TR - 0000000000000000
> FXSAVE_STATE - 00000000BFEDE350
> !!!! Find image 1af41000.efidrv (ImageBase=00000000BE499000, EntryPoint=00000000BE49F1EB) !!!!

"1af41000.efidrv" is the iPXE UEFI option ROM for the virtio-net NIC.
QEMU loads the combined BIOS+UEFI option ROM from an external file into
the ROM BAR of the virtio-net NIC, and then the PciBusDxe driver built
into OVMF makes sure the driver is dispatched. (The dispatch is deferred
until after EndOfDxe, in BDS.)

The external file that provides this ROM image may be known under
different pathnames.
On RHEL7 for example, it is installed as
"/usr/share/ipxe/1af41000.rom", as part of the ipxe-roms-qemu package;
however QEMU loads it through the symlink
"/usr/share/qemu-kvm/pxe-virtio.rom" that is part of the qemu-kvm-rhev
package. If you use upstream QEMU, then the file is
"$prefix/share/qemu/efi-virtio.rom".

(1) Either way, if you'd like to check whether this issue is specific to
the iPXE option ROM, you could prevent QEMU from loading the ROM image
into the NIC's ROM BAR, and retest. The QEMU option for that is

  -device virtio-net-pci,[some properties,]romfile=''

The corresponding domain XML element is

In this case, virtio-net NICs will be bound by OVMF's built-in
virtio-net driver (OvmfPkg/VirtioNetDxe).

> But when I decrease the PcdPciMmio64Size to 10 * 32 GB, the VM booted successfully.
> I'm not familiar with uefi, could you please point out what wrong I have done?

(2) One important factor to check (on your host) is:

  $ grep 'address sizes' /proc/cpuinfo

This matters because the size that you choose for the 64-bit PCI MMIO
aperture influences the total address space size (the "address width")
for which the DXE IPL PEIM in OVMF has to create page tables (1:1
virtual->physical mapping). When using an aperture >= 1TB, this address
width is at least 40.

And, if you use KVM with nested paging enabled ("ept" on Intel, "npt" on
AMD), *but* the physical address size of your processor is *smaller*
than 40 (i.e. smaller than the address width required by the guest),
then accesses to high addresses in the guest will silently fail (usually
with very bad results).

(3) If you confirm that your physical CPU has a phys addr size that is
large enough, then another test could be to enable 1GB pages in the VCPU
model.
Pass

  -cpu [whatever model you already use],+pdpe1gb

The equivalent libvirt domain XML fragment would be

(4) Assuming you are testing upstream OVMF, if you rebuild it with the
following switch:

  --pcd gEfiMdePkgTokenSpaceGuid.PcdDebugPrintErrorLevel=0x8040004F

and then boot it on QEMU with

  -global isa-debugcon.iobase=0x402 -debugcon file:ovmf.log

then I might be able to tell you more, from "ovmf.log". (Downstream OVMF
is already built with that PCD setting.)

The libvirt domain XML knob for capturing the OVMF debug log (the QEMU
debug port) is a bit more involved. First, modify the root element of
the domain XML as follows:

then (in the same editing session) add the following near the end, just
before :

(BTW this is also how you could hook the "X-PciMmio64Mb" setting from
the top into the domain XML.)

Thanks!
Laszlo
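
A minimal Python sketch of the host check from point (2), under stated
assumptions: it parses the "address sizes" line that the grep command
prints and tests whether a guest-physical address fits under the host's
physical address width. The function names and the sample cpuinfo line
(39 bits) are hypothetical and only for illustration; the fault address
is the CR2 value from the quoted exception dump.

```python
import re


def host_phys_bits(cpuinfo_text):
    """Return the 'bits physical' value from /proc/cpuinfo text, or None."""
    match = re.search(r"address sizes\s*:\s*(\d+) bits physical", cpuinfo_text)
    return int(match.group(1)) if match else None


def bits_needed(address):
    """Minimum number of address bits that can express `address`."""
    return max(address.bit_length(), 1)


def is_reachable(address, phys_bits):
    """True if `address` lies below 2**phys_bits."""
    return address < (1 << phys_bits)


# CR2 from the quoted page-fault dump: just above the 1 TB boundary.
FAULT_ADDRESS = 0x0000010000000014

# Hypothetical host line, for illustration only (not taken from the report).
SAMPLE_CPUINFO = "address sizes\t: 39 bits physical, 48 bits virtual\n"

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE_CPUINFO  # fall back to the hypothetical sample
    bits = host_phys_bits(text)
    print("fault address needs", bits_needed(FAULT_ADDRESS), "bits;",
          "host reports", bits)
    if bits is not None:
        print("reachable on this host:", is_reachable(FAULT_ADDRESS, bits))
```

If the host reports fewer physical address bits than the faulting
address needs, that would point toward the address-width mismatch
described in (2) rather than toward the iPXE option ROM itself.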