From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 by mx.groups.io with SMTP id smtpd.web11.8547.1570794839613907324
 for ; Fri, 11 Oct 2019 04:53:59 -0700
Authentication-Results: mx.groups.io;
 dkim=missing; spf=pass (domain: redhat.com, ip: 209.132.183.28, mailfrom: lersek@redhat.com)
Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id 36C0310CC1EE;
 Fri, 11 Oct 2019 11:53:59 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-177.rdu2.redhat.com [10.10.120.177])
 by smtp.corp.redhat.com (Postfix) with ESMTP id 1408010016EB;
 Fri, 11 Oct 2019 11:53:57 +0000 (UTC)
Subject: Re: [edk2-devel] OVMF is crashing for me in master
To: devel@edk2.groups.io, afish@apple.com, "Gao, Liming"
Cc: Pete Batard
References: <4A89E2EF3DFEDB4C8BFDE51014F606A14E515A05@SHSMSX104.ccr.corp.intel.com>
From: "Laszlo Ersek"
Message-ID: <51a06bcf-c22f-464c-bbe4-4006b7e8b4c5@redhat.com>
Date: Fri, 11 Oct 2019 13:53:57 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To:
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.65]); Fri, 11 Oct 2019 11:53:59 +0000 (UTC)
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable

On 10/11/19 06:59, Andrew Fish via Groups.Io wrote:
> Liming,
>
> Thanks for looking into this!
>
> Can someone also answer my question about the expected behavior of taking an exception in OVMF? Is the CpuDeadloop() expected?

Yes, it is. The exception handler dumps the register file and the stack
to the emulated serial port, not to the QEMU debug port.
When looking for OVMF debug messages, people usually (and rightly) check
wherever the QEMU debug port has been redirected (unless they built
OVMF with -D DEBUG_ON_SERIAL_PORT). However, the exception handler
always dumps the state to the serial port, regardless of
DEBUG_ON_SERIAL_PORT. This can be confusing, as the normally consulted
debug log has no indication of the issue.

Thanks
Laszlo

>
> Thanks,
>
> Andrew Fish
>
>> On Oct 10, 2019, at 6:19 PM, Gao, Liming wrote:
>>
>> Andrew:
>>   I verified the change (2de1f611be06ded3a59726a4052a9039be7d459b MdeModulePkg/BdsDxe: Also call PlatformBootManagerWaitCallback on 0) in Emulator.
>>   It works, because the PCD value is set to 10 in Emulator.
>>
>> Before this change, if the TimeOut PCD is zero, BdsEntry doesn’t call PlatformBootManagerWaitCallback().
>> After this change, if the TimeOut PCD is zero, BdsEntry still calls PlatformBootManagerWaitCallback(). So, it triggers this issue. I agree with your fix.
>>
>> Pete:
>>   Will you contribute the patch to fix this hang issue in OVMF?
>>
>> Thanks
>> Liming
>> From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Andrew Fish via Groups.Io
>> Sent: Friday, October 11, 2019 5:12 AM
>> To: devel@edk2.groups.io; Pete Batard
>> Subject: [edk2-devel] OVMF is crashing for me in master
>>
>> This is my flavor of OVMF: build -p OvmfPkg/OvmfPkgX64.dsc -a X64 -t XCODE5
>>
>> It looks like I took an exception? Is it expected that an unhandled exception just hangs in a dead loop? I would have expected some serial output about the failure?
>>
>> Looks like a divide by zero exception. The exception context has PC and FP so I can manually walk the stack.
>> Yikes I see PlatformBootManagerWaitCallback() will fault if PcdPlatformBootTimeOut is zero?
>> /Volumes/Case/UDK2018(master)>git grep PcdPlatformBootTimeOut -- *.dsc
>> ArmVirtPkg/ArmVirtQemu.dsc:194:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|3
>> ArmVirtPkg/ArmVirtQemuKernel.dsc:191:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|3
>> ArmVirtPkg/ArmVirtXen.dsc:122:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|3
>> EmulatorPkg/EmulatorPkg.dsc:236:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|L"Timeout"|gEfiGlobalVariableGuid|0x0|10
>> OvmfPkg/OvmfPkgIa32.dsc:541:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|0
>> OvmfPkg/OvmfPkgIa32X64.dsc:553:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|0
>> OvmfPkg/OvmfPkgX64.dsc:552:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|0
>> OvmfPkg/OvmfXen.dsc:470:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|0
>> UefiPayloadPkg/UefiPayloadPkgIa32.dsc:344:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|3
>> UefiPayloadPkg/UefiPayloadPkgIa32X64.dsc:345:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|3
>>
>>
>> OK PcdPlatformBootTimeOut is zero on Ovmf, so how did this ever work?
>>
>>
>> Ahhh gotom....
>> /Volumes/Case/UDK2018(master)>git blame -L344,344 /Volumes/Case/UDK2018/MdeModulePkg/Universal/BdsDxe/BdsEntry.c
>> 2de1f611be0 (Pete Batard 2019-09-25 23:50:05 +0800 344)   PlatformBootManagerWaitCallback (0);
>>
>>
>> This call causes a divide by zero if PcdPlatformBootTimeOut == 0.
>>
>> This fixes my crash:
>> diff --git a/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c b/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c
>> index 70df6b841a..d6ae43e900 100644
>> --- a/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c
>> +++ b/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c
>> @@ -1634,6 +1634,11 @@ PlatformBootManagerWaitCallback (
>>    UINT16                              Timeout;
>>
>>    Timeout = PcdGet16 (PcdPlatformBootTimeOut);
>> +  if (Timeout == 0) {
>> +    Timeout = 100;
>> +  } else {
>> +    Timeout = (Timeout - TimeoutRemain) * 100 / Timeout;
>> +  }
>>
>>    Black.Raw = 0x00000000;
>>    White.Raw = 0x00FFFFFF;
>> @@ -1643,7 +1648,7 @@ PlatformBootManagerWaitCallback (
>>      Black.Pixel,
>>      L"Start boot option",
>>      White.Pixel,
>> -    (Timeout - TimeoutRemain) * 100 / Timeout,
>> +    Timeout,
>>      0
>>      );
>>  }
>>
>>
>>
>> lldb debugger output:
>>
>> (lldb) bt
>> * thread #1, stop reason = signal SIGTRAP
>>   * frame #0: 0x0000000007b58a70 CpuDxe.dll:CpuDeadLoop() + 13 at /Volumes/Case/UDK2018/MdePkg/Library/BaseLib/CpuDeadLoop.c:31
>>     frame #1: 0x0000000007b61222 CpuDxe.dll:CommonExceptionHandlerWorker() + 674 at /Volumes/Case/UDK2018/UefiCpuPkg/Library/CpuExceptionHandlerLib/PeiDxeSmmCpuException.c:115
>>     frame #2: 0x0000000007b61624 CpuDxe.dll:CommonExceptionHandler() + 36 at /Volumes/Case/UDK2018/UefiCpuPkg/Library/CpuExceptionHandlerLib/DxeException.c:40
>>     frame #3: 0x0000000007b5ff26 CpuDxe.dll:HasErrorCode() + 230
>> (lldb) fr sel 1
>> frame #1: 0x0000000007b61222 CpuDxe.dll:CommonExceptionHandlerWorker() + 674 at /Volumes/Case/UDK2018/UefiCpuPkg/Library/CpuExceptionHandlerLib/PeiDxeSmmCpuException.c:115
>>    91       (ExternalInterruptHandler[ExceptionType]) (ExceptionType, SystemContext);
>>    92     } else if (ExceptionType < CPU_EXCEPTION_NUM) {
>>    93       //
>>    94       // Get Spinlock to display CPU information
>>    95       //
>>    96       while (!AcquireSpinLockOrFail (&ExceptionHandlerData->DisplayMessageSpinLock)) {
>>    97         CpuPause ();
>>    98       }
>>    99       //
>>    100      // Initialize the serial port before dumping.
>>    101      //
>>    102      SerialPortInitialize ();
>>    103      //
>>    104      // Display ExceptionType, CPU information and Image information
>>    105      //
>>    106      DumpImageAndCpuContent (ExceptionType, SystemContext);
>>    107      //
>>    108      // Release Spinlock of output message
>>    109      //
>>    110      ReleaseSpinLock (&ExceptionHandlerData->DisplayMessageSpinLock);
>>    111      //
>>    112      // Enter a dead loop if needn't to execute old IDT handler further
>>    113      //
>>    114      if (ReservedVectors[ExceptionType].Attribute != EFI_VECTOR_HANDOFF_HOOK_BEFORE) {
>> -> 115        CpuDeadLoop ();
>>    116      }
>>    117    }
>>    118  }
>>    119
>> (lldb) p ExceptionType
>> (EFI_EXCEPTION_TYPE) $0 = 0
>> (lldb) p SystemContext.SystemContextX64->Rip
>> (UINT64) $1 = 0x0000000007a9cc38
>> (lldb) p SystemContext.SystemContextX64->Rbp
>> (UINT64) $2 = 0x0000000007e8fc20
>> (lldb) efi_backtrace --pc 0x0000000007a9cc38 --frame 0x0000000007e8fc20 --symbols
>>   frame  0: 0x07a9cc38 BdsDxe:PlatformBootManagerWaitCallback + 35 at /Volumes/Case/UDK2018/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c:1646:37
>>   frame  1: 0x07a9e744 BdsDxe:BdsWait + 269 at /Volumes/Case/UDK2018/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:344:3
>>   frame  2: 0x07a9dff9 BdsDxe:BdsEntry + 2620 at /Volumes/Case/UDK2018/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:1002:5
>>   frame  3: 0x07eb367d DxeCore:DxeMain + 2680 at /Volumes/Case/UDK2018/MdeModulePkg/Core/Dxe/DxeMain/DxeMain.c:544:3
>>   frame  4: 0x07e943cb DxeCore:_ModuleEntryPoint + 20 at /Volumes/Case/UDK2018/MdePkg/Library/DxeCoreEntryPoint/DxeCoreEntryPoint.c:48:3
>>   frame  5: 0x07edc947 DxeIpl.dll`AsmEnableCache
>>   frame  6: 0x07ee1e4e DxeIpl:HandOffToDxeCore + 509 at /Volumes/Case/UDK2018/MdeModulePkg/Core/DxeIplPeim/X64/DxeLoadFunc.c:113:3
>>   frame  7: 0x07ee0604 DxeIpl:DxeLoadCore + 1354 at /Volumes/Case/UDK2018/MdeModulePkg/Core/DxeIplPeim/DxeLoad.c:449:3
>>   frame  8: 0x07eeff2f PeiCore.dll`PeiCore.cold.3 + 847
>>   frame  9: 0x07ee9d04 PeiCore:PeiCore + 163 at /Volumes/Case/UDK2018/MdeModulePkg/Core/Pei/PeiMain/PeiMain.c:502:3
>>   frame 10: 0x0082c387 0x0082c387
>>   frame 11: 0x00825dd7 0x00825dd7
>>   frame 12: 0x0082ad27 0x0082ad27
>>   frame 13: 0x0082b3a8 0x0082b3a8
>>   frame 14: 0x0082bf23 0x0082bf23
>>   frame 15: 0x00825e24 0x00825e24
>>   frame 16: 0x00823af2 0x00823af2
>>   frame 17: 0xfffd1db8 SecMain:SecStartupPhase2 + 67 at /Volumes/Case/UDK2018/OvmfPkg/Sec/SecMain.c:858:3
>>   frame 18: 0xfffd1d67 SecMain:SecCoreStartupWithStack + 420 at /Volumes/Case/UDK2018/OvmfPkg/Sec/SecMain.c:821:3
>>   frame 19: 0xfffd1e14 SecMain:ProcessLibraryConstructorList + 0 at /Volumes/Case/UDK2018/Build/OvmfX64/DEBUG_XCODE5/X64/OvmfPkg/Sec/SecMain/DEBUG/AutoGen.c:201
>>
>> (lldb) l /Volumes/Case/UDK2018/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c:1646
>>    1646      (Timeout - TimeoutRemain) * 100 / Timeout,
>>    1647      0
>>    1648      );
>>    1649  }
>>    1650
>>    1651  /**
>>    1652    The function is called when no boot option could be launched,
>>    1653    including platform recovery options and options pointing to applications
>>    1654    built into firmware volumes.
>>    1655
>> (lldb) l /Volumes/Case/UDK2018/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:344
>>    344    PlatformBootManagerWaitCallback (0);
>>    345    DEBUG ((EFI_D_INFO, "[Bds]Exit the waiting!\n"));
>>    346  }
>>    347
>>    348  /**
>>    349    Attempt to boot each boot option in the BootOptions array.
>>    350
>>    351    @param BootOptions       Input boot option array.
>>    352    @param BootOptionCount   Input boot option count.
>>    353    @param BootManagerMenu   Input boot manager menu.
>>    354
>>
>>
>> Thanks,
>>
>> Andrew Fish
>>
>>
>
>
>
>
>
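
For anyone who wants to try the guard outside the firmware tree, the
zero-timeout check from the quoted patch can be sketched as a standalone
C function. This is only a minimal sketch, not edk2 code:
BootProgressPercent is a hypothetical name, standard C integer types
stand in for UINT16, and the clamp on TimeoutRemain is an added
assumption that the quoted patch does not contain.

```c
#include <stdint.h>

/*
 * Hypothetical standalone helper (not an edk2 function): percentage of the
 * boot timeout that has already elapsed, suitable for a progress bar.
 */
uint16_t BootProgressPercent(uint16_t Timeout, uint16_t TimeoutRemain)
{
    /* A zero timeout means there is nothing to wait for: report 100%
       instead of dividing by zero. */
    if (Timeout == 0) {
        return 100;
    }

    /* Assumption (not in the quoted patch): clamp the remainder so the
       subtraction below cannot go negative. */
    if (TimeoutRemain > Timeout) {
        TimeoutRemain = Timeout;
    }

    /* Widen before multiplying so large timeouts cannot overflow. */
    return (uint16_t)(((uint32_t)(Timeout - TimeoutRemain) * 100u) / Timeout);
}
```

With a timeout of 0 and a remaining time of 0, which matches the OVMF
.dsc setting and the PlatformBootManagerWaitCallback (0) call quoted
above, this returns 100 rather than faulting.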