public inbox for devel@edk2.groups.io
* [edk2-devel] [PATCH 1/1] ArmPkg/ExceptionSupport: Support backtrace through an exception
@ 2023-08-29 13:29 Ard Biesheuvel
  2023-08-29 14:36 ` Laszlo Ersek
  0 siblings, 1 reply; 4+ messages in thread
From: Ard Biesheuvel @ 2023-08-29 13:29 UTC
  To: devel; +Cc: quic_llindhol, lersek, Ard Biesheuvel

Laszlo reports that the efi_gdb.py script fails to produce a full
backtrace when attaching it to an ARM firmware build that has halted on
an unhandled exception.

The reason is that the asm code that processes the exception was not
written with unwinding in mind: it creates no frame record and carries
no CFI metadata, so the debugger cannot unwind through it.

So let's add this: create a dummy frame record suitable for chasing the
frame pointer, and add the CFI metadata to describe where the return
address can be found on the stack.
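
As an illustration of why the dummy frame record matters, here is a
minimal Python sketch that chases AArch64 frame records over a simulated
stack. It is not part of the patch. All addresses are made up, and the
"before" layout assumes that the entry code leaves x29 untouched, so the
C handler's frame record links straight to the interrupted function's
record.

```python
# Simulated stack memory (address -> 64-bit value); every address is made up.
def walk_frame_records(mem, fp, pc, max_depth=32):
    """Collect return addresses by chasing AArch64 frame records.

    A frame record is two 64-bit words: [fp] holds the caller's frame
    pointer and [fp + 8] holds the return address.
    """
    trace = [pc]
    for _ in range(max_depth):
        if fp == 0 or fp not in mem or fp + 8 not in mem:
            break
        trace.append(mem[fp + 8])
        fp = mem[fp]
    return trace

ELR = 0x7FCDE944  # address of the faulting instruction (hypothetical)
interrupted = {
    0x5000: 0x5040, 0x5008: 0x7FCD1000,  # frame record of the interrupted function
    0x5040: 0x0,    0x5048: 0x7FCD2000,  # outermost frame record
}

# Assumed layout without a dummy record: the C handler's record at 0x8FC0
# points at the interrupted function's record, so the ELR is never visited.
before = dict(interrupted)
before.update({0x8FC0: 0x5000, 0x8FC8: 0x7FD48EB8})

# Layout with a dummy record {x29, ELR} pushed by the entry code at 0x9000.
after = dict(interrupted)
after.update({0x9000: 0x5000, 0x9008: ELR,
              0x8FC0: 0x9000, 0x8FC8: 0x7FD48EB8})

trace_before = walk_frame_records(before, fp=0x8FC0, pc=0x7FD479EC)
trace_after = walk_frame_records(after, fp=0x8FC0, pc=0x7FD479EC)
print([hex(a) for a in trace_before])
print([hex(a) for a in trace_after])
```

In this model the faulting address only shows up in the second trace,
because the dummy record is the one place where the ELR is stored as a
return address.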

When using a GCC5 build, this produces a stack trace such as

  (gdb) bt
  #0  0x000000007fd4537c in CpuDeadLoop () at /home/ardb/build/edk2/MdePkg/Library/BaseLib/CpuDeadLoop.c:30
  #1  0x000000007fd454f8 in DebugAssert (
      FileName=FileName@entry=0x7fd4a8a8 <MmioWrite64Internal+3604> "/home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c",
      LineNumber=LineNumber@entry=343, Description=Description@entry=0x7fd4a896 <MmioWrite64Internal+3586> "((BOOLEAN)(0==1))")
      at /home/ardb/build/edk2/MdePkg/Library/BaseDebugLibSerialPort/DebugLib.c:235
  #2  0x000000007fd479ec in DefaultExceptionHandler (ExceptionType=<optimized out>, SystemContext=...)
      at /home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c:343
  #3  0x000000007fd48eb8 in ExceptionHandlersEnd ()
  #4  0x000000007fcde944 in QemuLoadKernelImage (ImageHandle=<synthetic pointer>) at /home/ardb/build/edk2/OvmfPkg/Library/GenericQemuLoadImageLib/GenericQemuLoadImageLib.c:201
  #5  TryRunningQemuKernel () at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/QemuKernel.c:46
  #6  PlatformBootManagerAfterConsole () at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/PlatformBm.c:1139
  #7  BdsEntry (This=<optimized out>) at /home/ardb/build/edk2/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:931
  #8  0x000000007ffd0018 in ?? ()
  Backtrace stopped: previous frame inner to this frame (corrupt stack?)

when QemuLoadKernelImage() has been tweaked to trigger an exception, as
is shown by GDB when walking the call stack:

|    0x7fcde938 <BdsEntry+3292>      b.ne    0x7fcdf134 <BdsEntry+5336>  // b.any
|    0x7fcde93c <BdsEntry+3296>      mov     x0, #0x40                       // #64
|    0x7fcde940 <BdsEntry+3300>      bl      0x7fcd7aec <DebugPrint>
|  > 0x7fcde944 <BdsEntry+3304>      brk     #0x4d2
|    0x7fcde948 <BdsEntry+3308>      bl      0x7fce4354 <ConnectDevicesFromQemu>
|    0x7fcde94c <BdsEntry+3312>      tbz     x0, #63, 0x7fcde954 <BdsEntry+3320>
|    0x7fcde950 <BdsEntry+3316>      bl      0x7fcd844c <EfiBootManagerConnectAll>
|    0x7fcde954 <BdsEntry+3320>      bl      0x7fcd990c <EfiBootManagerRefreshAllBootOption

Unfortunately, this does not work as well with a CLANGDWARF build: GDB
identifies the call frame where the exception originated, but does not
show any of its callers. (This could be related to the fact that the
exception code uses a separate exception stack for handling synchronous
exceptions.)
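
To make the separate-stack suspicion concrete: GDB's "previous frame
inner to this frame" message comes from a sanity check that expects each
caller's frame to sit at a higher stack address than its callee's. The
Python sketch below is a simplified model of that ordering test, not
GDB's actual code, and the frame addresses are hypothetical. Whether this
check is what truncates the CLANGDWARF backtrace is not established here.

```python
def first_inner_frame(frame_addrs):
    """Return the index of the first frame that lies below its callee.

    frame_addrs lists frame addresses from innermost (callee) to
    outermost (caller). On a downward-growing stack each entry should
    be higher than the one before it.
    """
    for i in range(1, len(frame_addrs)):
        if frame_addrs[i] < frame_addrs[i - 1]:
            return i
    return None

# All frames on one stack: addresses increase towards the callers.
same_stack = [0x5000, 0x5040, 0x5080]

# Handler frames on an exception stack located above the interrupted
# stack: crossing over to the interrupted code looks like an inner frame.
exc_stack_above = [0x8FC0, 0x9000, 0x5000, 0x5040]

# Exception stack located below the interrupted stack: the ordering holds.
exc_stack_below = [0x3FC0, 0x4000, 0x5000, 0x5040]

print(first_inner_frame(same_stack))
print(first_inner_frame(exc_stack_above))
print(first_inner_frame(exc_stack_below))
```

Under this model the outcome depends only on where the exception stack
happens to live relative to the interrupted stack.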

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 ArmPkg/Library/ArmExceptionLib/AArch64/ExceptionSupport.S | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/ArmPkg/Library/ArmExceptionLib/AArch64/ExceptionSupport.S b/ArmPkg/Library/ArmExceptionLib/AArch64/ExceptionSupport.S
index cd9437b6aab8..345b566932bb 100644
--- a/ArmPkg/Library/ArmExceptionLib/AArch64/ExceptionSupport.S
+++ b/ArmPkg/Library/ArmExceptionLib/AArch64/ExceptionSupport.S
@@ -259,6 +259,8 @@ ASM_PFX(ExceptionHandlersEnd):
 
 
 ASM_PFX(CommonExceptionEntry):
+  .cfi_sections .debug_frame
+  .cfi_startproc
 
   EL1_OR_EL2_OR_EL3(x1)
 1:mrs      x2, elr_el1   // Exception Link Register
@@ -280,6 +282,13 @@ ASM_PFX(CommonExceptionEntry):
 
 4:mrs      x4, fpsr      // Floating point Status Register  32bit
 
+  // Create a dummy frame record using the ELR as the return address
+  stp      x29, x2, [sp, #-16]!
+  .cfi_def_cfa_offset (GP_CONTEXT_SIZE + FP_CONTEXT_SIZE + SYS_CONTEXT_SIZE + 16)
+  .cfi_rel_offset x29, 0
+  .cfi_rel_offset x30, 8
+  mov      x29, sp
+
   // Save the SYS regs
   stp      x2,  x3,  [x28, #-SYS_CONTEXT_SIZE]!
   stp      x4,  x5,  [x28, #0x10]
@@ -305,7 +314,7 @@ ASM_PFX(CommonExceptionEntry):
 
   // x0 still holds the exception type.
   // Set x1 to point to the top of our struct on the Stack
-  mov      x1, sp
+  add      x1, sp, #16
 
 // CommonCExceptionHandler (
 //   IN     EFI_EXCEPTION_TYPE           ExceptionType,   R0
@@ -318,6 +327,9 @@ ASM_PFX(CommonExceptionEntry):
   // We do not try to recover.
   bl       ASM_PFX(CommonCExceptionHandler) // Call exception handler
 
+  // Pop dummy frame record
+  add      sp, sp, #16
+
   // Pop as many GP regs as we can before entering the critical section below
   ldp      x2,  x3,  [sp, #0x10]
   ldp      x4,  x5,  [sp, #0x20]
@@ -378,13 +390,17 @@ ASM_PFX(CommonExceptionEntry):
 
   // pop remaining GP regs and return from exception.
   ldr      x30, [sp, #0xf0 - 0xe0]
+  .cfi_restore 30
   ldp      x28, x29, [sp], #GP_CONTEXT_SIZE - 0xe0
+  .cfi_restore 29
 
   // Adjust SP to be where we started from when we came into the handler.
   // The handler can not change the SP.
   add      sp, sp, #FP_CONTEXT_SIZE + SYS_CONTEXT_SIZE
+  .cfi_def_cfa_offset 0
 
   eret
+  .cfi_endproc
 
 ASM_FUNC(RegisterEl0Stack)
   msr     sp_el0, x0
-- 
2.42.0.rc2.253.gd59a3bf2b4-goog



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#108092): https://edk2.groups.io/g/devel/message/108092
Mute This Topic: https://groups.io/mt/101030910/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-




* Re: [edk2-devel] [PATCH 1/1] ArmPkg/ExceptionSupport: Support backtrace through an exception
  2023-08-29 13:29 [edk2-devel] [PATCH 1/1] ArmPkg/ExceptionSupport: Support backtrace through an exception Ard Biesheuvel
@ 2023-08-29 14:36 ` Laszlo Ersek
  2023-08-30 13:00   ` Ard Biesheuvel
  0 siblings, 1 reply; 4+ messages in thread
From: Laszlo Ersek @ 2023-08-29 14:36 UTC
  To: Ard Biesheuvel, devel; +Cc: quic_llindhol

On 8/29/23 15:29, Ard Biesheuvel wrote:
> Laszlo reports that the efi_gdb.py script fails to produce a full
> backtrace when attaching it to an ARM firmware build that has halted on
> an unhandled exception.
> 
> The reason is that the asm code that processes the exception was not
> implemented with this in mind, and therefore lacks any handling of it.
> 
> So let's add this: create a dummy frame record suitable for chasing the
> frame pointer, and add the CFI metadata to describe where the return
> value can be found on the stack.
> 
> When using a GCC5 build, this produces a stack trace such as
> 
>   (gdb) bt
>   #0  0x000000007fd4537c in CpuDeadLoop () at /home/ardb/build/edk2/MdePkg/Library/BaseLib/CpuDeadLoop.c:30
>   #1  0x000000007fd454f8 in DebugAssert (
>       FileName=FileName@entry=0x7fd4a8a8 <MmioWrite64Internal+3604> "/home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c",
>       LineNumber=LineNumber@entry=343, Description=Description@entry=0x7fd4a896 <MmioWrite64Internal+3586> "((BOOLEAN)(0==1))")
>       at /home/ardb/build/edk2/MdePkg/Library/BaseDebugLibSerialPort/DebugLib.c:235
>   #2  0x000000007fd479ec in DefaultExceptionHandler (ExceptionType=<optimized out>, SystemContext=...)
>       at /home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c:343
>   #3  0x000000007fd48eb8 in ExceptionHandlersEnd ()
>   #4  0x000000007fcde944 in QemuLoadKernelImage (ImageHandle=<synthetic pointer>) at /home/ardb/build/edk2/OvmfPkg/Library/GenericQemuLoadImageLib/GenericQemuLoadImageLib.c:201
>   #5  TryRunningQemuKernel () at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/QemuKernel.c:46
>   #6  PlatformBootManagerAfterConsole () at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/PlatformBm.c:1139
>   #7  BdsEntry (This=<optimized out>) at /home/ardb/build/edk2/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:931
>   #8  0x000000007ffd0018 in ?? ()
>   Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> 
> when QemuLoadKernelImage() has been tweaked to trigger an exception, as
> is shown by GDB when walking the call stack:
> 
> |    0x7fcde938 <BdsEntry+3292>      b.ne    0x7fcdf134 <BdsEntry+5336>  // b.any
> |    0x7fcde93c <BdsEntry+3296>      mov     x0, #0x40                       // #64
> |    0x7fcde940 <BdsEntry+3300>      bl      0x7fcd7aec <DebugPrint>
> |  > 0x7fcde944 <BdsEntry+3304>      brk     #0x4d2
> |    0x7fcde948 <BdsEntry+3308>      bl      0x7fce4354 <ConnectDevicesFromQemu>
> |    0x7fcde94c <BdsEntry+3312>      tbz     x0, #63, 0x7fcde954 <BdsEntry+3320>
> |    0x7fcde950 <BdsEntry+3316>      bl      0x7fcd844c <EfiBootManagerConnectAll>
> |    0x7fcde954 <BdsEntry+3320>      bl      0x7fcd990c <EfiBootManagerRefreshAllBootOption
> 
> Unfortunately, CLANGDWARF does not seem entirely happy with this
> arrangement: it identifies the call frame where the exception
> originated, but does not show any frames above that. (This could be
> related to the fact that the exception code uses a separate exception
> stack for handling synchronous exceptions)

First of all, thanks for writing this patch so incredibly quickly. :)

Second, something must be off with my gdb.

Before your patch, I kept experimenting with manually resetting FP, SP,
and LR to the values printed in the register dump, using gdb "set"
commands. Strangely, that did result in complete pre-exception stack
traces, but *only sometimes*. Most of the time gdb complains about
"corrupted stack". And I just can't figure out what distinguishes the
broken from the functional "bt" commands -- I did walk the allegedly
corrupt stack manually, and there is nothing corrupt in the FP and LR
parts of the stack frames. They all chain nicely and point to valid
instructions, respectively. I don't know what it is that gdb doesn't like.

Third, when I test your patch, I seem to experience precisely what you
describe under CLANGDWARF -- it shows the faulting frame (the frame just
before the exception), but nothing before it! And I'm not building with
clang :(

Thanks,
Laszlo






* Re: [edk2-devel] [PATCH 1/1] ArmPkg/ExceptionSupport: Support backtrace through an exception
  2023-08-29 14:36 ` Laszlo Ersek
@ 2023-08-30 13:00   ` Ard Biesheuvel
  2023-08-30 13:31     ` Laszlo Ersek
  0 siblings, 1 reply; 4+ messages in thread
From: Ard Biesheuvel @ 2023-08-30 13:00 UTC
  To: Laszlo Ersek; +Cc: devel, quic_llindhol

On Tue, 29 Aug 2023 at 16:37, Laszlo Ersek <lersek@redhat.com> wrote:
>
> On 8/29/23 15:29, Ard Biesheuvel wrote:
> > Laszlo reports that the efi_gdb.py script fails to produce a full
> > backtrace when attaching it to an ARM firmware build that has halted on
> > an unhandled exception.
> >
> > The reason is that the asm code that processes the exception was not
> > implemented with this in mind, and therefore lacks any handling of it.
> >
> > So let's add this: create a dummy frame record suitable for chasing the
> > frame pointer, and add the CFI metadata to describe where the return
> > value can be found on the stack.
> >
> > When using a GCC5 build, this produces a stack trace such as
> >
> >   (gdb) bt
> >   #0  0x000000007fd4537c in CpuDeadLoop () at /home/ardb/build/edk2/MdePkg/Library/BaseLib/CpuDeadLoop.c:30
> >   #1  0x000000007fd454f8 in DebugAssert (
> >       FileName=FileName@entry=0x7fd4a8a8 <MmioWrite64Internal+3604> "/home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c",
> >       LineNumber=LineNumber@entry=343, Description=Description@entry=0x7fd4a896 <MmioWrite64Internal+3586> "((BOOLEAN)(0==1))")
> >       at /home/ardb/build/edk2/MdePkg/Library/BaseDebugLibSerialPort/DebugLib.c:235
> >   #2  0x000000007fd479ec in DefaultExceptionHandler (ExceptionType=<optimized out>, SystemContext=...)
> >       at /home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c:343
> >   #3  0x000000007fd48eb8 in ExceptionHandlersEnd ()
> >   #4  0x000000007fcde944 in QemuLoadKernelImage (ImageHandle=<synthetic pointer>) at /home/ardb/build/edk2/OvmfPkg/Library/GenericQemuLoadImageLib/GenericQemuLoadImageLib.c:201
> >   #5  TryRunningQemuKernel () at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/QemuKernel.c:46
> >   #6  PlatformBootManagerAfterConsole () at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/PlatformBm.c:1139
> >   #7  BdsEntry (This=<optimized out>) at /home/ardb/build/edk2/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:931
> >   #8  0x000000007ffd0018 in ?? ()
> >   Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> >
> > when QemuLoadKernelImage() has been tweaked to trigger an exception, as
> > is shown by GDB when walking the call stack:
> >
> > |    0x7fcde938 <BdsEntry+3292>      b.ne    0x7fcdf134 <BdsEntry+5336>  // b.any
> > |    0x7fcde93c <BdsEntry+3296>      mov     x0, #0x40                       // #64
> > |    0x7fcde940 <BdsEntry+3300>      bl      0x7fcd7aec <DebugPrint>
> > |  > 0x7fcde944 <BdsEntry+3304>      brk     #0x4d2
> > |    0x7fcde948 <BdsEntry+3308>      bl      0x7fce4354 <ConnectDevicesFromQemu>
> > |    0x7fcde94c <BdsEntry+3312>      tbz     x0, #63, 0x7fcde954 <BdsEntry+3320>
> > |    0x7fcde950 <BdsEntry+3316>      bl      0x7fcd844c <EfiBootManagerConnectAll>
> > |    0x7fcde954 <BdsEntry+3320>      bl      0x7fcd990c <EfiBootManagerRefreshAllBootOption
> >
> > Unfortunately, CLANGDWARF does not seem entirely happy with this
> > arrangement: it identifies the call frame where the exception
> > originated, but does not show any frames above that. (This could be
> > related to the fact that the exception code uses a separate exception
> > stack for handling synchronous exceptions)
>
> First of all, thanks for writing this patch so incredibly quickly. :)
>

My pleasure.

> Second, something must be off with my gdb.
>
> Before your patch, I kept experimenting with manually resetting FP, SP,
> and LR to the values printed in the register dump, using gdb "set"
> commands. Strangely, that did result in complete pre-exception stack
> traces, but *only sometimes*. Most of the time gdb complains about
> "corrupted stack". And I just can't figure out what distinguishes the
> broken from the functional "bt" commands -- I did walk the allegedly
> corrupt stack manually, and there is nothing corrupt in the FP and LR
> parts of the stack frames. They all chain nicely and point to valid
> instructions, respectively. I don't know what it is that gdb doesn't like.
>

I suspect that gdb is filled with heuristics and tweaks, and uses a
combination of the frame records, the actual value of LR and the
unwind data to figure out what the call stack looks like.

> Third, when I test your patch, I seem to experience precisely what you
> describe under CLANGDWARF -- it shows the faulting frame (the frame just
> before the exception), but nothing before it! And I'm not building with
> clang :(
>

Shame. Unfortunately, I don't have a lot of time to spend on this
right now, but it is something I have been wanting to fix forever so
hopefully I'll get back to it at some point.



* Re: [edk2-devel] [PATCH 1/1] ArmPkg/ExceptionSupport: Support backtrace through an exception
  2023-08-30 13:00   ` Ard Biesheuvel
@ 2023-08-30 13:31     ` Laszlo Ersek
  0 siblings, 0 replies; 4+ messages in thread
From: Laszlo Ersek @ 2023-08-30 13:31 UTC
  To: Ard Biesheuvel; +Cc: devel, quic_llindhol

On 8/30/23 15:00, Ard Biesheuvel wrote:
> On Tue, 29 Aug 2023 at 16:37, Laszlo Ersek <lersek@redhat.com> wrote:
>>
>> On 8/29/23 15:29, Ard Biesheuvel wrote:
>>> Laszlo reports that the efi_gdb.py script fails to produce a full
>>> backtrace when attaching it to an ARM firmware build that has halted on
>>> an unhandled exception.
>>>
>>> The reason is that the asm code that processes the exception was not
>>> implemented with this in mind, and therefore lacks any handling of it.
>>>
>>> So let's add this: create a dummy frame record suitable for chasing the
>>> frame pointer, and add the CFI metadata to describe where the return
>>> value can be found on the stack.
>>>
>>> When using a GCC5 build, this produces a stack trace such as
>>>
>>>   (gdb) bt
>>>   #0  0x000000007fd4537c in CpuDeadLoop () at /home/ardb/build/edk2/MdePkg/Library/BaseLib/CpuDeadLoop.c:30
>>>   #1  0x000000007fd454f8 in DebugAssert (
>>>       FileName=FileName@entry=0x7fd4a8a8 <MmioWrite64Internal+3604> "/home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c",
>>>       LineNumber=LineNumber@entry=343, Description=Description@entry=0x7fd4a896 <MmioWrite64Internal+3586> "((BOOLEAN)(0==1))")
>>>       at /home/ardb/build/edk2/MdePkg/Library/BaseDebugLibSerialPort/DebugLib.c:235
>>>   #2  0x000000007fd479ec in DefaultExceptionHandler (ExceptionType=<optimized out>, SystemContext=...)
>>>       at /home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c:343
>>>   #3  0x000000007fd48eb8 in ExceptionHandlersEnd ()
>>>   #4  0x000000007fcde944 in QemuLoadKernelImage (ImageHandle=<synthetic pointer>) at /home/ardb/build/edk2/OvmfPkg/Library/GenericQemuLoadImageLib/GenericQemuLoadImageLib.c:201
>>>   #5  TryRunningQemuKernel () at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/QemuKernel.c:46
>>>   #6  PlatformBootManagerAfterConsole () at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/PlatformBm.c:1139
>>>   #7  BdsEntry (This=<optimized out>) at /home/ardb/build/edk2/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:931
>>>   #8  0x000000007ffd0018 in ?? ()
>>>   Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>>>
>>> when QemuLoadKernelImage() has been tweaked to trigger an exception, as
>>> is shown by GDB when walking the call stack:
>>>
>>> |    0x7fcde938 <BdsEntry+3292>      b.ne    0x7fcdf134 <BdsEntry+5336>  // b.any
>>> |    0x7fcde93c <BdsEntry+3296>      mov     x0, #0x40                       // #64
>>> |    0x7fcde940 <BdsEntry+3300>      bl      0x7fcd7aec <DebugPrint>
>>> |  > 0x7fcde944 <BdsEntry+3304>      brk     #0x4d2
>>> |    0x7fcde948 <BdsEntry+3308>      bl      0x7fce4354 <ConnectDevicesFromQemu>
>>> |    0x7fcde94c <BdsEntry+3312>      tbz     x0, #63, 0x7fcde954 <BdsEntry+3320>
>>> |    0x7fcde950 <BdsEntry+3316>      bl      0x7fcd844c <EfiBootManagerConnectAll>
>>> |    0x7fcde954 <BdsEntry+3320>      bl      0x7fcd990c <EfiBootManagerRefreshAllBootOption
>>>
>>> Unfortunately, CLANGDWARF does not seem entirely happy with this
>>> arrangement: it identifies the call frame where the exception
>>> originated, but does not show any frames above that. (This could be
>>> related to the fact that the exception code uses a separate exception
>>> stack for handling synchronous exceptions)
>>
>> First of all, thanks for writing this patch so incredibly quickly. :)
>>
> 
> My pleasure.
> 
>> Second, something must be off with my gdb.
>>
>> Before your patch, I kept experimenting with manually resetting FP, SP,
>> and LR to the values printed in the register dump, using gdb "set"
>> commands. Strangely, that did result in complete pre-exception stack
>> traces, but *only sometimes*. Most of the time gdb complains about
>> "corrupted stack". And I just can't figure out what distinguishes the
>> broken from the functional "bt" commands -- I did walk the allegedly
>> corrupt stack manually, and there is nothing corrupt in the FP and LR
>> parts of the stack frames. They all chain nicely and point to valid
>> instructions, respectively. I don't know what it is that gdb doesn't like.
>>
> 
> I suspect that gdb is filled with heuristics and tweaks, and uses a
> combination of the frame records, the actual value of LR and the
> unwind data to figure out what the call stack looks like.

That's what I feared :/

> 
>> Third, when I test your patch, I seem to experience precisely what you
>> describe under CLANGDWARF -- it shows the faulting frame (the frame just
>> before the exception), but nothing before it! And I'm not building with
>> clang :(
>>
> 
> Shame. Unfortunately, I don't have a lot of time to spend on this
> right now, but it is something I have been wanting to fix forever so
> hopefully I'll get back to it at some point.
> 

I'm grateful that you wrote v1! :)

Thank you!
Laszlo


