In-Reply-To: <77e8ddad-9039-efe4-f6f7-1dbc66d4eb6c@redhat.com>
References: <597E798B.1020806@huawei.com> <443e01eb-28ec-6e4d-43ac-6f6f16f7f3d4@redhat.com> <59803D0A.6020305@huawei.com> <77e8ddad-9039-efe4-f6f7-1dbc66d4eb6c@redhat.com>
From: Ard
Biesheuvel
Date: Tue, 1 Aug 2017 18:23:55 +0100
To: Laszlo Ersek
Cc: Zhu Yijun, "edk2-devel@lists.01.org", "Richard W.M. Jones"
Subject: Re: issue about booting centos fail with edk2

On 1 August 2017 at 16:42, Laszlo Ersek wrote:
> On 08/01/17 10:34, Zhu Yijun wrote:
>> Thanks for your reply!
>>
>> On 2017/8/1 3:02, Laszlo Ersek wrote:
>>> On 07/31/17 02:27, Zhu Yijun wrote:
>>>> Hi all,
>>>>
>>>> I installed a CentOS-7-aarch64 guest image from a QEMU CD-ROM, but it hangs in UEFI with some probability.
>>>>
>>>> Basic info:
>>>> libvirt 1.3.5
>>>> QEMU 2.6.2
>>>> UEFI: master branch with commit "688c7d2 BaseTools: Fix the bug that warn() function with only 1 argument"
>>>>
>>>> Config pflash and two disks in xml:
>>>>
>>>> ...
>>>>
>>>> hvm
>>>> /usr/share/edk2/aarch64/QEMU_EFI-pflash.raw
>>>>
>>>> ...
>>>>
>>>> ...
>>>>
>>>> I found that it failed in the "Match (Translated, TranslatedSize, ActiveOption[Idx].BootOption->FilePath)" call in "SetBootOrderFromQemu"; the UEFI debug info is as follows:
>>> No, that's not where the problem is.
>>> See below:
>>>
>>>> start-console-fail.log
>>>> FSOpen: Open '\EFI\BOOT\fallback.efi' Success
>>>> FSOpen: Open '\EFI\BOOT\fallback.efi' Success
>>>>
>>>>
>>>> Synchronous Exception at 0x00000002384B1104
>>>> PC 0x0002384B1104
>>>> PC 0x0002384A916C
>>>> PC 0x0002384CA2D0
>>>> PC 0x00023EEB7DF8 (0x00023EEB1000+0x00006DF8) [ 1] DxeCore.dll
>>>> PC 0x00023BD1568C (0x00023BD02000+0x0001368C) [ 2] BdsDxe.dll
>>>> PC 0x00023BD03F98 (0x00023BD02000+0x00001F98) [ 2] BdsDxe.dll
>>>> PC 0x00023BD05640 (0x00023BD02000+0x00003640) [ 2] BdsDxe.dll
>>>> PC 0x00023EEB3704 (0x00023EEB1000+0x00002704) [ 3] DxeCore.dll
>>>> PC 0x00023EEB27C8 (0x00023EEB1000+0x000017C8) [ 3] DxeCore.dll
>>>> PC 0x00023EEB2024 (0x00023EEB1000+0x00001024) [ 3] DxeCore.dll
>>>> [ 1] /root/rpmbuild/BUILD/edk2-2.6.0/Build/ArmVirtQemu-AARCH64/DEBUG_GCC49/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
>>>> [ 2] /root/rpmbuild/BUILD/edk2-2.6.0/Build/ArmVirtQemu-AARCH64/DEBUG_GCC49/AARCH64/MdeModulePkg/Universal/BdsDxe/BdsDxe/DEBUG/BdsDxe.dll
>>>> [ 3] /root/rpmbuild/BUILD/edk2-2.6.0/Build/ArmVirtQemu-AARCH64/DEBUG_GCC49/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
>>>>
>>>> X0 0x00000002384A9000 X1 0x00000002384B2990 X2 0x000000023AAFDF98 X3 0x000000023BFF0018
>>>> X4 0x0000000000000000 X5 0x0000000000000007 X6 0x0000000238533300 X7 0x0000000000000000
>>>> X8 0x000000023C01F548 X9 0x0000000200000000 X10 0x00000002384A8000 X11 0x00000002384C5FFF
>>>> X12 0x0000000000000000 X13 0x0000000000000008 X14 0x259511BDAEB1F36C X15 0x1378CC1DF3F5DDBB
>>>> X16 0x000000023EEB0BE0 X17 0x0000000000000000 X18 0x0000000000000000 X19 0x0000000000000013
>>>> X20 0x0000000000000000 X21 0x0000000000000000 X22 0x0000000000000000 X23 0x0000000000000000
>>>> X24 0x0000000000000000 X25 0x0000000000000000 X26 0x0000000000000000 X27 0x0000000000000000
>>>> X28 0x0000000000000000 FP 0x000000023EEB0A40 LR 0x00000002384A916C
>>>>
>>>> V0 0xAFAFAFAFAFAFAFAF AFAFAFAFAFAFAFAF V1 0x63702F6666666666 6666666666666666
>>>> V2 0x40697363732F3340 6567646972622D69 V3 0x0000000000000000 0000000000000000
>>>> V4 0x0000000000000000 0000000000000000 V5 0x4010040140100401 4010040140100401
>>>> V6 0x0000000000000000 0000000000000000 V7 0x0000000000000000 0000000000000000
>>>> V8 0x0000000000000000 0000000000000000 V9 0x0000000000000000 0000000000000000
>>>> V10 0x0000000000000000 0000000000000000 V11 0x0000000000000000 0000000000000000
>>>> V12 0x0000000000000000 0000000000000000 V13 0x0000000000000000 0000000000000000
>>>> V14 0x0000000000000000 0000000000000000 V15 0x0000000000000000 0000000000000000
>>>> V16 0x0000000000000000 0000000000000000 V17 0x0000000000000000 0000000000000000
>>>> V18 0x0000000000000000 0000000000000000 V19 0x0000000000000000 0000000000000000
>>>> V20 0x0000000000000000 0000000000000000 V21 0x0000000000000000 0000000000000000
>>>> V22 0x0000000000000000 0000000000000000 V23 0x0000000000000000 0000000000000000
>>>> V24 0x0000000000000000 0000000000000000 V25 0x0000000000000000 0000000000000000
>>>> V26 0x0000000000000000 0000000000000000 V27 0x0000000000000000 0000000000000000
>>>> V28 0x0000000000000000 0000000000000000 V29 0x0000000000000000 0000000000000000
>>>> V30 0x0000000000000000 0000000000000000 V31 0x0000000000000000 0000000000000000
>>>>
>>>> SP 0x000000023EEB0A40 ELR 0x00000002384B1104 SPSR 0x60000205 FPSR 0x00000000
>>>> ESR 0x02000000 FAR 0x1DE7EC7EDBADC0DE
>>>>
>>>> ESR : EC 0x00 IL 0x1 ISS 0x00000000
>>>>
>>>> Stack dump:
>>>> 000023EEB0940: 0000C0E000000148 00000002384A9000 00000002384CA254 0000000000000000
>>>> 000023EEB0960: 000000023EEB0BC0 000000023AC006C0 0000F2503EEB0BC0 00000002384B6018
>>>> 000023EEB0980: 000000023EEB0BC0 0000000000000000 000000000000C0E0 0000000000000148
>>>> 000023EEB09A0: 0000000000000148 0000100000020A8C 00000002384B6110 00000002384B6108
>>>> 000023EEB09C0: 00000002384B6100 0000000000000006 00000002384B6058 00000002384B50DF
>>>> 000023EEB09E0: 00000002384A9148 0000000000000000 00000002384A9000 00000002384A9000
>>>> 000023EEB0A00: 0000000000000000 00000002398DA518 00000002385375B2 00000002385629A0
>>>> 000023EEB0A20: 000000023854C1C0 00000002398DA518 000000023EEB0BC0 0000000000000000
>>>>> 000023EEB0A40: 000000023EEB0BC0 00000002384CA2D0 000000023AAFDF98 000000023BFF0018
>>>> 000023EEB0A60: 00000002384CA360 000000023EEC8348 00000002385375B0 000000023AAFDF98
>>>> 000023EEB0A80: 000000023EEB0AC0 0000F25038533338 00000002384B6018 0000000000000000
>>>> 000023EEB0AA0: 0000000000000000 0000000238B63D18 0000000000001000 0000000000000000
>>>> 000023EEB0AC0: 000000023BFF0018 00000002398DA518 00000002398CE598 0000000000000000
>>>> 000023EEB0AE0: 0000000000000000 0000000000000000 00000002384C6000 00000000000C99C0
>>>> 000023EEB0B00: 0000000200000001 0000000000000000 000000023AC006C0 11D295625B1B31A1
>>>> 000023EEB0B20: 3B7269C9A0003F8E 0000000000000000 0000000238B63F98 000000163EEB0B68
>>>> ASSERT [ArmCpuDxe] /root/rpmbuild/BUILD/edk2-2.6.0/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c(271): ((BOOLEAN)(0==1))
>>> This is a guest that you didn't install from installer media. I think
>>> you may have gotten the preinstalled disk image from some image provider
>>> service. The UEFI boot variable(s) are not set up to boot the CentOS
>>> installation, in your nvram / pflash file.
>> Yes, the boot variable must be stored in the domain's nvram file ("/var/lib/libvirt/qemu/nvram/centos_VARS.fd"). After installation, it generates a new boot menu
>> entry called "CentOS Linux AltArch", whose device path is "HD(1,GPT,D562CAA6-F61B-4F93-87FB-22DDADF6CAE2,0x800,0x64000)/\EFI\centos\shim.efi".
>>
>> For example:
>> Boot Manager Menu
>> CentOS Linux AltArch -> device path: PciRoot(0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x0)/HD(1,GPT,D562CAA6-F61B-4F93-87FB-22DDADF6CAE2,0x800,0x64000)/\EFI\centos\shim.efi
>> UEFI Misc Device
>> UEFI Misc Device 2
>> EFI Internal Shell
>> UEFI QEMU QEMU CD-ROM -> device path: PciRoot(0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x1)
>> UEFI QEMU QEMU HARDDISK -> device path: PciRoot(0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x0)
>> UEFI PXEv4 (MAC:5254002D2EB6)
>>
>> But when I shut down and undefine this domain, and then use virsh create to start a new domain with the centos.qcow2 disk that was just installed, the UEFI boot manager
>> menu is:
>> Boot Manager Menu
>> UEFI QEMU QEMU HARDDISK -> device path: PciRoot(0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x0)
>> UEFI Misc Device
>> UEFI Misc Device 2
>> EFI Internal Shell
>> UEFI PXEv4 (MAC:5254002D2EB6)
>
> Right. In this case you have lost your original nvram contents, and you
> only have the boot options that are auto-generated by the
> EfiBootManagerRefreshAllBootOption() function. This function lives in
> UefiBootManagerLib, and is called from OVMF's PlatformBootManagerLib
> instance.
>
> The filtering and reordering still occur in OVMF, but now the first
> boot option that matches QEMU's fw_cfg bootorder specification is not
> the "CentOS Linux AltArch" boot option that you originally had. Instead,
> now QemuBootOrderLib encounters the "UEFI QEMU QEMU HARDDISK"
> auto-generated boot option first as a match.
>
> This boot option in turn means "fallback.efi", according to the blog
> post I linked earlier.
>
> When "fallback.efi" executes successfully, your original "CentOS Linux
> AltArch" boot option is restored / recreated (at the top of the boot
> option list). But, when "fallback.efi" crashes, you get a crash instead.
>
>> I am confused about two points:
>> 1) The new domain still has a chance to load "EFI\centos\shim.efi" and boot the kernel successfully, which means that sometimes the system firmware launches
>> BOOTAA64.EFI and sometimes it launches shim.efi. It is probabilistic.
>
> "EFI\centos\shim.efi" is never automatically loaded. It needs a
> dedicated UEFI boot option. Thus, it can be loaded in your "new" domain
> *only* if "fallback.efi" runs first, successfully.
>
> So what you are seeing is that "fallback.efi" sometimes works, and
> sometimes crashes. That's the nature of memory corruption bugs.
>
>>
>> 2) Is there a way to make the "CentOS Linux AltArch" boot menu entry persistent?
>
> There isn't. If you lose your nvram, you lose the non-auto-generated
> boot options with it.
>
> Remedying such situations is what "fallback.efi" exists for.
>
>>>
>>> In such cases, the "fallback.efi" utility is invoked (called
>>> "\EFI\BOOT\BOOTAA64.EFI"). Please refer to:
>>>
>>> https://blog.uncooperative.org/blog/2014/02/06/the-efi-system-partition/
>>>
>>> Unfortunately, "fallback.efi" (from the shim package) used to have a few
>>> bugs over time and sometimes it would crash. See for example:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1196114
>>>
>>> I'm unsure what version of shim / fallback.efi is in the installed
>>> CentOS image, but it looks like the same (or another similar)
>>> fallback.efi issue to me.
>>
>> The shim version on my side is shim-0.9-2.el7.aarch64.
>
> This confirms that you are not seeing the exact bug described in
> RHBZ#1196114, because that bug was fixed in shim-0.9 (see
> ).
>
> It remains a fact that your original log contains a crash register dump
> after fallback.efi is launched. The V0 register contains
> 0xAFAFAFAFAFAFAFAF AFAFAFAFAFAFAFAF; the pattern 0xAF is used to fill
> released (freed) pages in debug builds. So this seems to be a
> use-after-free issue. I suggest adding debug instrumentation to
> fallback.efi, and seeing where exactly it blows up.
>

The presence of the 0xAF pattern in register V0 does not, by itself,
suggest anything at all: V0 is a SIMD register, which the SetMem() routine
uses to poison the memory. Very little other code (if any) actually uses
the SIMD registers, so the value can simply be left over from that routine.
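As a rough illustration of the point about the SIMD registers, the V0..V2
values from the quoted dump can be decoded as raw bytes. This is only a
sketch under one assumption: that the exception handler prints each V
register as "high 64 bits, low 64 bits". The helper names below are made up
for the example and are not part of edk2.

```python
# Values copied from the register dump quoted above.
# ASSUMPTION: each V register is printed as (high 64 bits, low 64 bits).
DUMPED = {
    "V0": (0xAFAFAFAFAFAFAFAF, 0xAFAFAFAFAFAFAFAF),
    "V1": (0x63702F6666666666, 0x6666666666666666),
    "V2": (0x40697363732F3340, 0x6567646972622D69),
}


def as_memory_bytes(high, low):
    """Return the 16 bytes in the order a little-endian load would have read them."""
    return low.to_bytes(8, "little") + high.to_bytes(8, "little")


def printable(data):
    """Render bytes as ASCII, replacing non-printable bytes with '.'."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)


def decode(name):
    """Decode one dumped V register (hypothetical helper for this sketch)."""
    high, low = DUMPED[name]
    return printable(as_memory_bytes(high, low))


if __name__ == "__main__":
    for name in DUMPED:
        print(name, repr(decode(name)))
```

Under that assumption, V1 and V2 decode to the ASCII fragments
"fffffffffffff/pc" and "i-bridge@3/scsi@", which look like pieces of a
device path string rather than anything fallback.efi computed. That would
be consistent with the SIMD registers merely holding whatever the last
memory or string routine loaded into them, V0 included.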