From: "Ard Biesheuvel" <ardb@kernel.org>
To: Oliver Steffen <osteffen@redhat.com>
Cc: devel@edk2.groups.io, Gerd Hoffmann <kraxel@redhat.com>,
	Marc Zyngier <maz@kernel.org>,
	 dann.frazier@canonical.com
Subject: Re: [edk2-devel] [PATCH v2 2/2] ArmVirtPkg/ArmVirtQemu: Avoid early ID map on ThunderX
Date: Fri, 19 May 2023 23:36:53 +0200
Message-ID: <CAMj1kXG04NbUtK1ia9Fy5thNCr=2UxzJhdmcBKdJDXVjV_or2A@mail.gmail.com>
In-Reply-To: <CA+bRGFrUkJ2KvZHVpTCJjHL+QQ0D-ePZw+H2sTbqk_1LjyrTiQ@mail.gmail.com>

On Fri, 19 May 2023 at 18:32, Oliver Steffen <osteffen@redhat.com> wrote:
>
>
> Hi all,
>
> I had another look at this and I can now reproduce the issue consistently,
> with a quite minimal setup, on recent Linux kernel, Qemu, and EDK2.
> It requires rebooting the guest in a tight loop. It happens in silent and
> verbose builds alike, but since the verbose ones are slowed down by the
> serial output, it takes longer to hit the issue.
> It is possible to reproduce it with the silent builds within a few minutes.
> For the verbose case I recommend running multiple Qemu instances in parallel
> (as many as the machine allows, in my case ~100).
>

Thanks a lot for all these details, this is extremely helpful.

So what appears to be happening is that we split the 2M block mapping
that covers the code that we were called from, and hit a level 2
translation fault because the updated page table entry is still
observed to be in its transient 'invalid' state as we return to it.

Could you please check whether this makes a difference?

--- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
+++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
@@ -65,6 +65,7 @@
   // write updated entry
   str   x1, [x0]
   dsb   nshst
+  isb

 .L2_\@:
   .endm
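
As a rough cross-check of the reading above against the register dump quoted below, the following sketch works out which level 2 entry covers the faulting address. It assumes a 4 KiB granule with 512 entries per table. It also assumes, going only by the values in the dump, that X0 still holds the address of the entry that was written, X1 the new descriptor, and X22 the base of the level 2 table; none of that is confirmed by the log itself.

```python
# Values copied from the register dump quoted in this message.
FAR = 0x37FD3C0A8   # faulting instruction address (same as ELR and LR)
X0 = 0x37F10BFF0    # assumption: address of the entry that was replaced
X1 = 0x37F106003    # assumption: new descriptor value that was stored
X22 = 0x37F10B000   # assumption: base address of the level 2 table

PAGE_SHIFT = 12      # assumption: 4 KiB translation granule
BITS_PER_LEVEL = 9   # 512 eight-byte entries per table
ENTRY_SIZE = 8


def level_shift(level):
    """Number of address bits resolved below the given lookup level."""
    return PAGE_SHIFT + BITS_PER_LEVEL * (3 - level)


def table_index(va, level):
    """Index into the translation table at the given level for va."""
    return (va >> level_shift(level)) & ((1 << BITS_PER_LEVEL) - 1)


def covered_range(va, level):
    """Start and end of the region mapped by one entry at this level."""
    size = 1 << level_shift(level)
    base = va & ~(size - 1)
    return base, base + size


idx = table_index(FAR, 2)
entry_addr = X22 + idx * ENTRY_SIZE
base, end = covered_range(FAR, 2)

# Bits [1:0] == 0b11 at level 2 mark a table descriptor, i.e. the former
# 2M block now points at a level 3 table.
is_table_descriptor = (X1 & 0x3) == 0x3
next_table = X1 & 0x0000FFFFFFFFF000

print(f"level 2 index      : {idx}")
print(f"entry address      : {entry_addr:#x} (X0 is {X0:#x})")
print(f"2M region          : {base:#x} - {end:#x}")
print(f"table descriptor   : {is_table_descriptor}, next table {next_table:#x}")
```

With the values from the dump this gives index 510, an entry address of 0x37F10BFF0 (equal to X0), and a 2M region of 0x37FC00000 - 0x37FE00000, which is the same range the log shows being split right before the exception and which contains the faulting PC.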



> Details:
>
> CPU: Cavium ThunderX2(R) CPU CN9975
> Tested on 3 different machines:
>     HPE apache, HPE apollo, Gigabyte R181
> Kernels tested:
>  - 6.2.15-100.fc36.aarch64
>  - 5.14.0-312.el9.aarch64
>    (contains 406504c7b0405d74d74c15a667cd4c4620c3e7a9,
>    "KVM: arm64: Fix S1PTW handling on RO memslots")
> Qemu v8.0.0 (RHEL version and build from upstream repo)
> EDK2: master branch from 2023-05-16 (cafb4f3f)
> gcc 11.3.1
>
> EDK2 build command line:
> build \
>   -a AARCH64 \
>   -p ArmVirtPkg/ArmVirtQemu.dsc \
>   -t GCC5 -b DEBUG \
>   -D NETWORK_IP6_ENABLE \
>   -D NETWORK_HTTP_BOOT_ENABLE \
>   -D NETWORK_TLS_ENABLE \
>   -D NETWORK_ISCSI_ENABLE \
>   -D NETWORK_ALLOW_HTTP_CONNECTIONS \
>   -D CAVIUM_ERRATUM_27456=TRUE \
>   -D TPM2_ENABLE=TRUE \
>   -D TPM1_ENABLE=FALSE \
>   -D DEBUG_PRINT_ERROR_LEVEL=0x80000000  \
>   -D BUILD_SHELL=TRUE \
>   --pcd="gEfiShellPkgTokenSpaceGuid.PcdShellDefaultDelay=0" \
>   --pcd="gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut=0" \
>   --hash --cmd-len=65536
>
> To reproduce the issue I launched the firmware in Qemu and had it reboot
> once it finished booting up, via a startup.nsh on the ESP.
>
> Qemu command line:
> qemu-system-aarch64 \
>     -machine virt,accel=kvm -m 13G \
>     -boot menu=off \
>     -cpu host \
>     -blockdev node-name=code,driver=file,filename="${FW_CODE}",read-only=on \
>     -blockdev node-name=vars,driver=file,filename="${FW_VARS}" \
>     -machine pflash0=code \
>     -machine pflash1=vars \
>     -serial stdio \
>     -net none \
>     -drive file=esp.img,snapshot=on
>
> Other things like number of CPUs or the presence of a vTPM have no
> influence. I did not try different amounts of RAM yet.
>
> Serial output:
> [...]
> InitializeDxeNxMemoryProtectionPolicy: StackBase = 0x00000000476C5000 StackSize = 0x0000000000020000
> InitializeDxeNxMemoryProtectionPolicy: applying strict permissions to active memory regions
> SetUefiImageMemoryAttributes - 0x0000000040000000 - 0x00000000076E5000 (0x0000000000004000)
> UpdateRegionMappingRecursive(0): 40000000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 40000000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 40000000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 47600000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> SetUefiImageMemoryAttributes - 0x00000000476C5000 - 0x0000000000001000 (0x0000000000006000)
> UpdateRegionMappingRecursive(0): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> SetUefiImageMemoryAttributes - 0x000000004772B000 - 0x00000000007C0000 (0x0000000000004000)
> UpdateRegionMappingRecursive(0): 4772B000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 4772B000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 4772B000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 4772B000 - 47800000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 47E00000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> SetUefiImageMemoryAttributes - 0x0000000047EF3000 - 0x0000000000101000 (0x0000000000004000)
> UpdateRegionMappingRecursive(0): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> SetUefiImageMemoryAttributes - 0x0000000047FFA000 - 0x0000000334AA6000 (0x0000000000004000)
> UpdateRegionMappingRecursive(0): 47FFA000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 47FFA000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 47FFA000 - 80000000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 47FFA000 - 48000000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 340000000 - 380000000 set 70C clr 0
> UpdateRegionMappingRecursive(3): 37F000000 - 37F200000 set 70C clr 0
> UpdateRegionMappingRecursive(2): 340000000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 37CA00000 - 37CC00000 set 70C clr 0
> UpdateRegionMappingRecursive(3): 37CA00000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> SetUefiImageMemoryAttributes - 0x000000037CB40000 - 0x00000000031F9000 (0x0000000000004000)
> UpdateRegionMappingRecursive(0): 37CB40000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 37CB40000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 37CB40000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 37CB40000 - 37CC00000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 37F000000 - 37F200000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 37FC00000 - 37FE00000 set 70C clr 0
> UpdateRegionMappingRecursive(3): 37FC00000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
>
>
> Synchronous Exception at 0x000000037FD3C0A8
> PC 0x00037FD3C0A8 (0x00037FD39000+0x000030A8) [ 0] ArmCpuDxe.dll
> PC 0x00037FD3C0A8 (0x00037FD39000+0x000030A8) [ 0] ArmCpuDxe.dll
> PC 0x00037FD3BE70 (0x00037FD39000+0x00002E70) [ 0] ArmCpuDxe.dll
> PC 0x00037FD3BE70 (0x00037FD39000+0x00002E70) [ 0] ArmCpuDxe.dll
> PC 0x00037FD3C2E4 (0x00037FD39000+0x000032E4) [ 0] ArmCpuDxe.dll
> PC 0x0000476E78F8 (0x0000476E5000+0x000028F8) [ 1] DxeCore.dll
> PC 0x0000476ED680 (0x0000476E5000+0x00008680) [ 1] DxeCore.dll
> PC 0x0000476F2744 (0x0000476E5000+0x0000D744) [ 1] DxeCore.dll
> PC 0x0000476ECDE8 (0x0000476E5000+0x00007DE8) [ 1] DxeCore.dll
> PC 0x00037FD3D2DC (0x00037FD39000+0x000042DC) [ 2] ArmCpuDxe.dll
> PC 0x0000476EC788 (0x0000476E5000+0x00007788) [ 3] DxeCore.dll
> PC 0x0000476F9CA8 (0x0000476E5000+0x00014CA8) [ 3] DxeCore.dll
> PC 0x0000476EFEF0 (0x0000476E5000+0x0000AEF0) [ 3] DxeCore.dll
>
> [ 0] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/ArmPkg/Drivers/CpuDxe/CpuDxe/DEBUG/ArmCpuDxe.dll
> [ 1] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
> [ 2] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/ArmPkg/Drivers/CpuDxe/CpuDxe/DEBUG/ArmCpuDxe.dll
> [ 3] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
>
>   X0 0x000000037F10BFF0   X1 0x000000037F106003   X2 0x000000000037FC00   X3 0x0000000000000000
>   X4 0x0000000000000200   X5 0x0000000000000004   X6 0x0000000000000000   X7 0x000000037FD3F4B5
>   X8 0x0000000000000000   X9 0x0000000000000002  X10 0x0000000000000000  X11 0x0000000000000000
>  X12 0x0000000000000002  X13 0x0000000000000002  X14 0x0000000000000001  X15 0x0000000000000002
>  X16 0x000000037FD3A268  X17 0x00000000007AFA10  X18 0x0000000000000000  X19 0x000000037FC00000
>  X20 0x0000000000000002  X21 0x000000037F106003  X22 0x000000037F10B000  X23 0x000000037FD42000
>  X24 0x00000000001FFFFF  X25 0x000000037FD39000  X26 0x000000037F106000  X27 0x0000000000000003
>  X28 0x000000037F10BFF0   FP 0x00000000476E4780   LR 0x000000037FD3C0A8
>
>   V0 0x0000000000000000 0000000000000000   V1 0x0000000000000000 0000000000000000
>   V2 0x0000000000000000 0000000000000000   V3 0x0000000000000000 0000000000000000
>   V4 0x0000000000000000 0000000000000000   V5 0x0000000000000000 0000000000000000
>   V6 0x0000000000000000 0000000000000000   V7 0x0000000000000000 0000000000000000
>   V8 0x0000000000000000 0000000000000000   V9 0x0000000000000000 0000000000000000
>  V10 0x0000000000000000 0000000000000000  V11 0x0000000000000000 0000000000000000
>  V12 0x0000000000000000 0000000000000000  V13 0x0000000000000000 0000000000000000
>  V14 0x0000000000000000 0000000000000000  V15 0x0000000000000000 0000000000000000
>  V16 0x0000000000000000 0000000000000000  V17 0x0000000000000000 0000000000000000
>  V18 0x0000000000000000 0000000000000000  V19 0x0000000000000000 0000000000000000
>  V20 0x0000000000000000 0000000000000000  V21 0x0000000000000000 0000000000000000
>  V22 0x0000000000000000 0000000000000000  V23 0x0000000000000000 0000000000000000
>  V24 0x0000000000000000 0000000000000000  V25 0x0000000000000000 0000000000000000
>  V26 0x0000000000000000 0000000000000000  V27 0x0000000000000000 0000000000000000
>  V28 0x0000000000000000 0000000000000000  V29 0x0000000000000000 0000000000000000
>  V30 0x0000000000000000 0000000000000000  V31 0x0000000000000000 0000000000000000
>
>   SP 0x00000000476E4780  ELR 0x000000037FD3C0A8  SPSR 0x80000205  FPSR 0x00000000
>  ESR 0x86000006          FAR 0x000000037FD3C0A8
>
>  ESR : EC 0x21  IL 0x1  ISS 0x00000006
>
> Instruction abort: Translation fault, second level
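
The decoding printed by the exception handler above can be reproduced by hand from the raw ESR value. The sketch below splits ESR into its EC, IL and ISS fields and interprets the low six ISS bits as the instruction fault status code. The function names and lookup tables are made up for illustration and cover only the abort classes and fault types relevant here.

```python
# Exception classes relevant to this report (not a complete list).
EXCEPTION_CLASSES = {
    0x20: "Instruction abort from a lower Exception level",
    0x21: "Instruction abort without a change in Exception level",
    0x24: "Data abort from a lower Exception level",
    0x25: "Data abort without a change in Exception level",
}

# Upper four bits of the 6-bit fault status code for the common MMU faults.
FAULT_TYPES = {
    0b0000: "Address size fault",
    0b0001: "Translation fault",
    0b0010: "Access flag fault",
    0b0011: "Permission fault",
}


def decode_esr(esr):
    """Split a 32-bit ESR value into (EC, IL, ISS)."""
    ec = (esr >> 26) & 0x3F      # bits [31:26]
    il = (esr >> 25) & 0x1       # bit  [25]
    iss = esr & 0x1FFFFFF        # bits [24:0]
    return ec, il, iss


def describe_fault(iss):
    """Describe the fault status code held in ISS[5:0] of an abort."""
    fsc = iss & 0x3F
    fault_type = FAULT_TYPES.get(fsc >> 2)
    if fault_type is None:
        return f"Unhandled fault status code {fsc:#04x}"
    return f"{fault_type}, level {fsc & 0x3}"


ec, il, iss = decode_esr(0x86000006)
print(f"EC {ec:#x} IL {il:#x} ISS {iss:#010x}")
print(EXCEPTION_CLASSES.get(ec, "other exception class"))
print(describe_fault(iss))
```

For 0x86000006 this yields EC 0x21, IL 1 and ISS 0x6, that is, an instruction abort taken at the current Exception level with a level 2 translation fault, in agreement with the handler's output.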
>
> Stack dump:
>   00000476E4680: 0000000000000001 0000000000000004 00000000476E4700 00000000476F3980
>   00000476E46A0: 000000037FD40CBD 0000000000000003 000000037FC00000 000000037FD39000
>   00000476E46C0: 0060000000000400 FF9F000000000B3F 00000000476E4780 000000037FD3BE70
>   00000476E46E0: 000000037FC00000 0000000000000002 000000037F106000 000000037F10B000
>   00000476E4700: 0000000000000FF0 00000000001FFFFF 000000037FD39000 000000037F106000
>   00000476E4720: 0000000000000003 000000037F10BFF0 0060000000000400 FF9F000000000B3F
>   00000476E4740: 000000037FD39000 000000037FD39000 00000000476E4780 0060000000000403
>   00000476E4760: 0000000C00000001 000000037FD3F90E 0000000000000400 000000037F10B000
> > 00000476E4780: 00000000476E4830 000000037FD3BE70 000000037CB40000 0000000000000001
>   00000476E47A0: 000000037F10B000 0000000047FFE000 0000000000000068 000000003FFFFFFF
>   00000476E47C0: 000000037FD39000 000000037F10C528 0000000000000002 0000000047FFE068
>   00000476E47E0: 0060000000000400 FF9F000000000B3F 0000000300000001 000000037FD39000
>   00000476E4800: 000000017FD40CBD 0060000000000401 0000001500000001 000000037FD3F90E
>   00000476E4820: 0060000000000400 000000037F106000 00000000476E48E0 000000037FD3BE70
>   00000476E4840: 000000037CB40000 0000000000000000 0000000047FFE000 0000000047FFF000
>   00000476E4860: 0000000000000000 0000007FFFFFFFFF 000000037FD39000 000000037F10C528
> ASSERT [ArmCpuDxe] /root/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c(333): ((BOOLEAN)(0==1))
>
>
>
> The full log is available here:
> https://gitlab.com/osteffen/thunderx2-debug/-/raw/main/2023-05-19/85.log?inline=false
>
> Debug files, firmware binaries, and the full build tree are here:
> https://gitlab.com/osteffen/thunderx2-debug/-/tree/main/2023-05-19
>
> I am able to reproduce this quickly, so any ideas for what I can try
> are welcome :-)
>
> Thanks
> -Oliver
>
