On Oct 11, 2019, at 4:53 AM, Laszlo Ersek <lersek@redhat.com> wrote:

On 10/11/19 06:59, Andrew Fish via Groups.Io wrote:
Liming,

Thanks for looking into this!

Can someone also answer my question about the expected behavior of taking an exception in OVMF? Is the CpuDeadLoop() expected?

Yes, it is.

The exception handler dumps the register file and the stack to the
emulated serial port, not to the QEMU debug port.

When looking for OVMF debug messages, people usually (and rightly) check
wherever the QEMU debug port has been redirected. (Unless they built
OVMF with -D DEBUG_ON_SERIAL_PORT.) However, the exception handler
always dumps the state to the serial port, regardless of
DEBUG_ON_SERIAL_PORT. This can be confusing, as the normally consulted
debug log has no indication of the issue.


Laszlo,

Thanks. If I add -serial file:serial.log to the QEMU command line, I see the exception in serial.log.

Thanks,

Andrew Fish

Thanks
Laszlo


Thanks,

Andrew Fish

On Oct 10, 2019, at 6:19 PM, Gao, Liming <liming.gao@intel.com> wrote:

Andrew:
 I verified the change (2de1f611be06ded3a59726a4052a9039be7d459b MdeModulePkg/BdsDxe: Also call PlatformBootManagerWaitCallback on 0) in the Emulator.
 It works there because the PCD value is set to 10 in the Emulator.

 Before this change, if the TimeOut PCD is zero, BdsEntry doesn’t call PlatformBootManagerWaitCallback().
 After this change, if the TimeOut PCD is zero, BdsEntry still calls PlatformBootManagerWaitCallback(), which triggers this issue. I agree with your fix.
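
 To illustrate the before/after behavior described above, here is a simplified model in C. It is only a sketch, not the actual BdsEntry.c code: ModelBdsWait, ModelWaitCallback and the CallOnZero switch are hypothetical names, and the countdown loop is an assumption about the overall shape of the wait logic. It counts how often the platform callback is reached for a given timeout.

```c
#include <stdint.h>

/* Number of times the (hypothetical) platform callback was reached. */
unsigned CallbackCalls;

/* Stand-in for PlatformBootManagerWaitCallback(); it only counts calls. */
void ModelWaitCallback(uint16_t TimeoutRemain)
{
  (void)TimeoutRemain;
  CallbackCalls++;
}

/*
 * Simplified model of the wait loop (an assumption, not the edk2 source).
 * CallOnZero == 0 models the behavior before commit 2de1f611be0,
 * CallOnZero != 0 models the behavior after it.
 */
unsigned ModelBdsWait(uint16_t Timeout, int CallOnZero)
{
  uint16_t TimeoutRemain = Timeout;

  CallbackCalls = 0;
  while (TimeoutRemain != 0) {
    ModelWaitCallback(TimeoutRemain);
    TimeoutRemain--;
  }
  if (CallOnZero) {
    /* The extra final call: reached even when Timeout was 0 all along. */
    ModelWaitCallback(0);
  }
  return CallbackCalls;
}
```

 With Timeout == 0, the model reaches the callback zero times before the change and once after it, which is the case where a platform callback that divides by the timeout faults.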

Pete:
 Will you contribute the patch to fix this hang issue in OVMF?

Thanks
Liming
From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Andrew Fish via Groups.Io
Sent: Friday, October 11, 2019 5:12 AM
To: devel@edk2.groups.io; Pete Batard <pete@akeo.ie>
Subject: [edk2-devel] OVMF is crashing for me in master

This is my flavor of OVMF:  build -p OvmfPkg/OvmfPkgX64.dsc -a X64 -t XCODE5

It looks like I took an exception? Is it expected that an unhandled exception just hangs in a dead loop? I would have expected some serial output about the failure?

Looks like a divide by zero exception. The exception context has PC and FP so I can manually walk the stack. Yikes, I see PlatformBootManagerWaitCallback() will fault if PcdPlatformBootTimeOut is zero?
/Volumes/Case/UDK2018(master)>git grep PcdPlatformBootTimeOut -- *.dsc
ArmVirtPkg/ArmVirtQemu.dsc:194:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|3
ArmVirtPkg/ArmVirtQemuKernel.dsc:191:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|3
ArmVirtPkg/ArmVirtXen.dsc:122:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|3
EmulatorPkg/EmulatorPkg.dsc:236:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|L"Timeout"|gEfiGlobalVariableGuid|0x0|10
OvmfPkg/OvmfPkgIa32.dsc:541:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|0
OvmfPkg/OvmfPkgIa32X64.dsc:553:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|0
OvmfPkg/OvmfPkgX64.dsc:552:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|0
OvmfPkg/OvmfXen.dsc:470:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|0
UefiPayloadPkg/UefiPayloadPkgIa32.dsc:344:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|3
UefiPayloadPkg/UefiPayloadPkgIa32X64.dsc:345:  gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|3


OK, PcdPlatformBootTimeOut is zero on OVMF, so how did this ever work?


Ahhh, got it....
/Volumes/Case/UDK2018(master)>git blame -L344,344  /Volumes/Case/UDK2018/MdeModulePkg/Universal/BdsDxe/BdsEntry.c 
2de1f611be0 (Pete Batard 2019-09-25 23:50:05 +0800 344)   PlatformBootManagerWaitCallback (0);


This call causes a divide by zero if PcdPlatformBootTimeOut == 0. 

This fixes my crash:
diff --git a/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c b/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c
index 70df6b841a..d6ae43e900 100644
--- a/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c
+++ b/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c
@@ -1634,6 +1634,11 @@ PlatformBootManagerWaitCallback (
  UINT16                              Timeout;

  Timeout = PcdGet16 (PcdPlatformBootTimeOut);
+  if (Timeout ==  0) {
+    Timeout = 100;
+  } else {
+    Timeout = (Timeout - TimeoutRemain) * 100 / Timeout;
+  }

  Black.Raw = 0x00000000;
  White.Raw = 0x00FFFFFF;
@@ -1643,7 +1648,7 @@ PlatformBootManagerWaitCallback (
    Black.Pixel,
    L"Start boot option",
    White.Pixel,
-    (Timeout - TimeoutRemain) * 100 / Timeout,
+    Timeout,
    0
    );
}
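
The guarded computation in the diff above can also be sketched as a standalone helper. This is a minimal illustration only: BootProgressPercent is a hypothetical name that does not exist in edk2, and the clamp for TimeoutRemain > Timeout is an extra assumption that is not part of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not an edk2 function): percentage of the boot
 * timeout that has elapsed, safe to call with Timeout == 0.
 */
uint16_t BootProgressPercent(uint16_t Timeout, uint16_t TimeoutRemain)
{
  if (Timeout == 0) {
    /* No countdown configured: report completion instead of dividing by 0. */
    return 100;
  }
  if (TimeoutRemain > Timeout) {
    /* Assumption for this sketch: clamp inconsistent input to 0%. */
    return 0;
  }
  /* Widen before multiplying so the intermediate value cannot overflow. */
  return (uint16_t)(((uint32_t)(Timeout - TimeoutRemain) * 100u) / Timeout);
}
```

For example, BootProgressPercent (0, 0) yields 100, matching the Timeout == 0 branch of the patch, and BootProgressPercent (10, 5) yields 50.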



lldb debugger output:

(lldb) bt
* thread #1, stop reason = signal SIGTRAP
 * frame #0: 0x0000000007b58a70 CpuDxe.dll:CpuDeadLoop() + 13 at /Volumes/Case/UDK2018/MdePkg/Library/BaseLib/CpuDeadLoop.c:31
   frame #1: 0x0000000007b61222 CpuDxe.dll:CommonExceptionHandlerWorker() + 674 at /Volumes/Case/UDK2018/UefiCpuPkg/Library/CpuExceptionHandlerLib/PeiDxeSmmCpuException.c:115
   frame #2: 0x0000000007b61624 CpuDxe.dll:CommonExceptionHandler() + 36 at /Volumes/Case/UDK2018/UefiCpuPkg/Library/CpuExceptionHandlerLib/DxeException.c:40
   frame #3: 0x0000000007b5ff26 CpuDxe.dll:HasErrorCode() + 230
(lldb) fr sel 1
frame #1: 0x0000000007b61222 CpuDxe.dll:CommonExceptionHandlerWorker() + 674 at /Volumes/Case/UDK2018/UefiCpuPkg/Library/CpuExceptionHandlerLib/PeiDxeSmmCpuException.c:115
  91             (ExternalInterruptHandler[ExceptionType]) (ExceptionType, SystemContext);
  92           } else if (ExceptionType < CPU_EXCEPTION_NUM) {
  93             //
  94             // Get Spinlock to display CPU information
  95             //
  96             while (!AcquireSpinLockOrFail (&ExceptionHandlerData->DisplayMessageSpinLock)) {
  97               CpuPause ();
  98             }
  99             //
  100           // Initialize the serial port before dumping.
  101           //
  102           SerialPortInitialize ();
  103           //
  104           // Display ExceptionType, CPU information and Image information
  105           //
  106           DumpImageAndCpuContent (ExceptionType, SystemContext);
  107           //
  108           // Release Spinlock of output message
  109           //
  110           ReleaseSpinLock (&ExceptionHandlerData->DisplayMessageSpinLock);
  111           //
  112           // Enter a dead loop if needn't to execute old IDT handler further
  113           //
  114           if (ReservedVectors[ExceptionType].Attribute != EFI_VECTOR_HANDOFF_HOOK_BEFORE) {
-> 115            CpuDeadLoop ();
  116           }
  117         }
  118       }
  119
(lldb) p ExceptionType
(EFI_EXCEPTION_TYPE) $0 = 0
(lldb) p SystemContext.SystemContextX64->Rip
(UINT64) $1 = 0x0000000007a9cc38
(lldb) p SystemContext.SystemContextX64->Rbp
(UINT64) $2 = 0x0000000007e8fc20
(lldb) efi_backtrace --pc 0x0000000007a9cc38 --frame 0x0000000007e8fc20 --symbols
 frame  0: 0x07a9cc38 BdsDxe:PlatformBootManagerWaitCallback + 35 at /Volumes/Case/UDK2018/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c:1646:37
 frame  1: 0x07a9e744 BdsDxe:BdsWait + 269 at /Volumes/Case/UDK2018/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:344:3
 frame  2: 0x07a9dff9 BdsDxe:BdsEntry + 2620 at /Volumes/Case/UDK2018/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:1002:5
 frame  3: 0x07eb367d DxeCore:DxeMain + 2680 at /Volumes/Case/UDK2018/MdeModulePkg/Core/Dxe/DxeMain/DxeMain.c:544:3
 frame  4: 0x07e943cb DxeCore:_ModuleEntryPoint + 20 at /Volumes/Case/UDK2018/MdePkg/Library/DxeCoreEntryPoint/DxeCoreEntryPoint.c:48:3
 frame  5: 0x07edc947 DxeIpl.dll`AsmEnableCache
 frame  6: 0x07ee1e4e DxeIpl:HandOffToDxeCore + 509 at /Volumes/Case/UDK2018/MdeModulePkg/Core/DxeIplPeim/X64/DxeLoadFunc.c:113:3
 frame  7: 0x07ee0604 DxeIpl:DxeLoadCore + 1354 at /Volumes/Case/UDK2018/MdeModulePkg/Core/DxeIplPeim/DxeLoad.c:449:3
 frame  8: 0x07eeff2f PeiCore.dll`PeiCore.cold.3 + 847
 frame  9: 0x07ee9d04 PeiCore:PeiCore + 163 at /Volumes/Case/UDK2018/MdeModulePkg/Core/Pei/PeiMain/PeiMain.c:502:3
 frame 10: 0x0082c387 0x0082c387
 frame 11: 0x00825dd7 0x00825dd7
 frame 12: 0x0082ad27 0x0082ad27
 frame 13: 0x0082b3a8 0x0082b3a8
 frame 14: 0x0082bf23 0x0082bf23
 frame 15: 0x00825e24 0x00825e24
 frame 16: 0x00823af2 0x00823af2
 frame 17: 0xfffd1db8 SecMain:SecStartupPhase2 + 67 at /Volumes/Case/UDK2018/OvmfPkg/Sec/SecMain.c:858:3
 frame 18: 0xfffd1d67 SecMain:SecCoreStartupWithStack + 420 at /Volumes/Case/UDK2018/OvmfPkg/Sec/SecMain.c:821:3
 frame 19: 0xfffd1e14 SecMain:ProcessLibraryConstructorList + 0 at /Volumes/Case/UDK2018/Build/OvmfX64/DEBUG_XCODE5/X64/OvmfPkg/Sec/SecMain/DEBUG/AutoGen.c:201
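
The backtrace above starts from the faulting Rip and Rbp taken from the exception context. A minimal sketch of that kind of frame-pointer walk follows. It assumes code built with frame pointers and the usual x86-64 layout, where each frame stores the caller's RBP followed by the return address. FrameRecord and WalkFrames are hypothetical names, and this is not how the efi_backtrace command is actually implemented.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed frame layout after "push rbp; mov rbp, rsp":
 *   [rbp]     = caller's saved rbp
 *   [rbp + 8] = return address into the caller
 */
typedef struct FrameRecord {
  const struct FrameRecord *CallerFrame;
  uint64_t                  ReturnAddress;
} FrameRecord;

/*
 * Record the faulting PC, then one return address per frame, until the
 * chain ends (NULL) or Out is full. Returns the number of entries written.
 * A walker on real firmware memory would also have to validate each frame
 * pointer before dereferencing it; that check is omitted here.
 */
size_t WalkFrames(uint64_t Pc, const FrameRecord *Frame,
                  uint64_t *Out, size_t Max)
{
  size_t Count = 0;

  if (Max == 0) {
    return 0;
  }
  Out[Count++] = Pc;
  while (Frame != NULL && Count < Max) {
    Out[Count++] = Frame->ReturnAddress;
    Frame = Frame->CallerFrame;
  }
  return Count;
}
```

Each recorded address can then be resolved against the loaded image symbols, which is what turns the raw values into the function and line information shown in the listing.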

(lldb) l /Volumes/Case/UDK2018/OvmfPkg/Library/PlatformBootManagerLib/BdsPlatform.c:1646
  1646         (Timeout - TimeoutRemain) * 100 / Timeout,
  1647         0
  1648         );
  1649     }
  1650
  1651     /**
  1652       The function is called when no boot option could be launched,
  1653       including platform recovery options and options pointing to applications
  1654       built into firmware volumes.
  1655
(lldb) l /Volumes/Case/UDK2018/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:344
  344         PlatformBootManagerWaitCallback (0);
  345         DEBUG ((EFI_D_INFO, "[Bds]Exit the waiting!\n"));
  346       }
  347
  348       /**
  349         Attempt to boot each boot option in the BootOptions array.
  350
  351         @param BootOptions       Input boot option array.
  352         @param BootOptionCount   Input boot option count.
  353         @param BootManagerMenu   Input boot manager menu.
  354


Thanks,

Andrew Fish