Message-ID: <6b8cc4f2-5f37-7d08-ca8c-faf7b409a4f3@redhat.com>
Date: Wed, 30 Aug 2023 15:31:50 +0200
Subject: Re: [edk2-devel] [PATCH 1/1] ArmPkg/ExceptionSupport: Support backtrace through an exception
To: Ard Biesheuvel
Cc: devel@edk2.groups.io, quic_llindhol@quicinc.com
References: <20230829132921.123407-1-ardb@kernel.org>
From: "Laszlo Ersek"
Reply-To: devel@edk2.groups.io,lersek@redhat.com

On 8/30/23 15:00, Ard Biesheuvel wrote:
> On Tue, 29 Aug 2023 at 16:37, Laszlo Ersek wrote:
>>
>> On 8/29/23 15:29, Ard Biesheuvel wrote:
>>> Laszlo reports that the efi_gdb.py script fails to produce a full
>>> backtrace when attaching it to an ARM firmware build that has halted on
>>> an unhandled exception.
>>>
>>> The reason is that the asm code that processes the exception was not
>>> implemented with this in mind, and therefore lacks any handling of it.
>>>
>>> So let's add this: create a dummy frame record suitable for chasing the
>>> frame pointer, and add the CFI metadata to describe where the return
>>> value can be found on the stack.
>>>
>>> When using a GCC5 build, this produces a stack trace such as
>>>
>>> (gdb) bt
>>> #0 0x000000007fd4537c in CpuDeadLoop () at /home/ardb/build/edk2/MdePkg/Library/BaseLib/CpuDeadLoop.c:30
>>> #1 0x000000007fd454f8 in DebugAssert (
>>>     FileName=FileName@entry=0x7fd4a8a8 "/home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c",
>>>     LineNumber=LineNumber@entry=343, Description=Description@entry=0x7fd4a896 "((BOOLEAN)(0==1))")
>>>     at /home/ardb/build/edk2/MdePkg/Library/BaseDebugLibSerialPort/DebugLib.c:235
>>> #2 0x000000007fd479ec in DefaultExceptionHandler (ExceptionType=, SystemContext=...)
>>>     at /home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c:343
>>> #3 0x000000007fd48eb8 in ExceptionHandlersEnd ()
>>> #4 0x000000007fcde944 in QemuLoadKernelImage (ImageHandle=) at /home/ardb/build/edk2/OvmfPkg/Library/GenericQemuLoadImageLib/GenericQemuLoadImageLib.c:201
>>> #5 TryRunningQemuKernel () at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/QemuKernel.c:46
>>> #6 PlatformBootManagerAfterConsole () at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/PlatformBm.c:1139
>>> #7 BdsEntry (This=) at /home/ardb/build/edk2/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:931
>>> #8 0x000000007ffd0018 in ?? ()
>>> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>>>
>>> when QemuLoadKernelImage() has been tweaked to trigger an exception, as
>>> is shown by GDB when walking the call stack:
>>>
>>> |   0x7fcde938  b.ne  0x7fcdf134  // b.any
>>> |   0x7fcde93c  mov   x0, #0x40   // #64
>>> |   0x7fcde940  bl    0x7fcd7aec
>>> | > 0x7fcde944  brk   #0x4d2
>>> |   0x7fcde948  bl    0x7fce4354
>>> |   0x7fcde94c  tbz   x0, #63, 0x7fcde954
>>> |   0x7fcde950  bl    0x7fcd844c
>>> |   0x7fcde954  bl    0x7fcd990c
>>
>>> Unfortunately, CLANGDWARF does not seem entirely happy with this
>>> arrangement: it identifies the call frame where the exception
>>> originated, but does not show any frames above that. (This could be
>>> related to the fact that the exception code uses a separate exception
>>> stack for handling synchronous exceptions)
>>
>> First of all, thanks for writing this patch so incredibly quickly. :)
>>
>
> My pleasure.
>
>> Second, something must be off with my gdb.
>>
>> Before your patch, I kept experimenting with manually resetting FP, SP,
>> and LR to the values printed in the register dump, using gdb "set"
>> commands. Strangely, that did result in complete pre-exception stack
>> traces, but *only sometimes*. Most of the time gdb complains about
>> "corrupted stack". And I just can't figure out what distinguishes the
>> broken from the functional "bt" commands -- I did walk the allegedly
>> corrupt stack manually, and there is nothing corrupt in the FP and LR
>> parts of the stack frames. They all chain nicely and point to valid
>> instructions, respectively. I don't know what it is that gdb doesn't like.
>>
>
> I suspect that gdb is filled with heuristics and tweaks, and uses a
> combination of the frame records, the actual value of LR and the
> unwind data to figure out what the call stack looks like.

That's what I feared :/

>
>> Third, when I test your patch, I seem to experience precisely what you
>> describe under CLANGDWARF -- it shows the faulting frame (the frame just
>> before the exception), but nothing before it! And I'm not building with
>> clang :(
>>
>
> Shame. Unfortunately, I don't have a lot of time to spend on this
> right now, but it is something I have been wanting to fix forever so
> hopefully I'll get back to it at some point.
>

I'm grateful that you wrote v1! :)

Thank you!
Laszlo

-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#108146): https://edk2.groups.io/g/devel/message/108146
-=-=-=-=-=-=-=-=-=-=-=-
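
For reference, the manual walk of the FP/LR chain mentioned in the thread can be sketched roughly as follows. This is only an illustration, not what gdb or efi_gdb.py actually does: it assumes the usual AArch64 frame record layout (the caller's FP stored at [FP], the saved LR at [FP + 8]), and it models memory as a plain Python dict. The names walk_frame_records, mem and strict_order are made up for this sketch. The strict_order flag mimics the kind of sanity check an unwinder might apply (each caller frame must sit at a higher address); whether that is what trips gdb here is a guess.

```python
def walk_frame_records(mem, fp, strict_order=False, max_frames=64):
    """Follow a chain of AArch64-style frame records.

    mem: dict mapping addresses to 64-bit values (hypothetical memory model).
    fp:  address of the innermost frame record.
    Returns a list of (frame_record_address, saved_lr) tuples, innermost first.
    """
    frames = []
    seen = set()
    while fp != 0 and len(frames) < max_frames:
        # Stop on misaligned, looping, or unreadable frame records.
        if fp % 8 != 0 or fp in seen or fp not in mem or fp + 8 not in mem:
            break
        seen.add(fp)
        prev_fp, lr = mem[fp], mem[fp + 8]
        frames.append((fp, lr))
        # Optional check (an assumption, see above): the stack grows down,
        # so a caller's frame record is expected at a higher address.
        if strict_order and prev_fp != 0 and prev_fp <= fp:
            break
        fp = prev_fp
    return frames


# Made-up example: one record on a separate (higher) exception stack at
# 0x2000 that links back into the normal stack at 0x1000.
mem = {
    0x2000: 0x1000, 0x2008: 0xDDDD,
    0x1000: 0x1020, 0x1008: 0xAAAA,
    0x1020: 0x1040, 0x1028: 0xBBBB,
    0x1040: 0x0,    0x1048: 0xCCCC,
}
full = walk_frame_records(mem, 0x2000)
strict = walk_frame_records(mem, 0x2000, strict_order=True)
```

In this made-up layout the plain walk follows all four records, while the strict walk gives up after the first one because the next record lies at a lower address.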