From mboxrd@z Thu Jan 1 00:00:00 1970
References: <173FFD60429C89C3.3213@groups.io> <17489D498A098DB9.9697@groups.io>
From: "Ard Biesheuvel"
Date: Fri, 19 May 2023 23:36:53 +0200
Subject: Re: [edk2-devel] [PATCH v2 2/2] ArmVirtPkg/ArmVirtQemu: Avoid early ID map on ThunderX
To: Oliver Steffen
Cc: devel@edk2.groups.io, Gerd Hoffmann, Marc Zyngier, dann.frazier@canonical.com

On Fri, 19 May 2023 at 18:32, Oliver Steffen wrote:
>
>
> Hi all,
>
> I had another look at this and I can now reproduce the issue consistently,
> with a quite minimal setup, on recent Linux kernel, Qemu, and EDK2.
> It requires rebooting the guest in a tight loop. It happens in silent
> and verbose builds alike, but since the verbose ones are slowed down by
> the serial output, it takes longer to hit the issue.
> It is possible to reproduce it with the silent builds within a few minutes.
> For the verbose case I recommend running multiple Qemu instances in
> parallel (as many as the machine allows, in my case ~100).
>

Thanks a lot for all these details, this is extremely helpful.

So what appears to be happening is that we split the 2M block mapping
that covers the code that we were called from, and hit a level 2
translation fault because the updated page table entry is still
observed to be in its transient 'invalid' state as we return to it.

Could you please check whether this makes a difference?

--- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
+++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
@@ -65,6 +65,7 @@
   // write updated entry
   str   x1, [x0]
   dsb   nshst
+  isb
 
 .L2_\@:
   .endm

> Details:
>
> CPU: Cavium ThunderX2(R) CPU CN9975
> Tested on 3 different machines:
> HPE apache, HPE apollo, Gigabyte R181
> Kernels tested:
> - 6.2.15-100.fc36.aarch64
> - 5.14.0-312.el9.aarch64
>   (contains 406504c7b0405d74d74c15a667cd4c4620c3e7a9,
>   "KVM: arm64: Fix S1PTW handling on RO memslots")
> Qemu v8.0.0 (RHEL version and build from upstream repo)
> EDK2: master branch from 2023-05-16 (cafb4f3f)
> gcc 11.3.1
>
> EDK2 build command line:
> build \
>   -a AARCH64 \
>   -p ArmVirtPkg/ArmVirtQemu.dsc \
>   -t GCC5 -b DEBUG \
>   -D NETWORK_IP6_ENABLE \
>   -D NETWORK_HTTP_BOOT_ENABLE \
>   -D NETWORK_TLS_ENABLE \
>   -D NETWORK_ISCSI_ENABLE \
>   -D NETWORK_ALLOW_HTTP_CONNECTIONS \
>   -D CAVIUM_ERRATUM_27456=TRUE \
>   -D TPM2_ENABLE=TRUE \
>   -D TPM1_ENABLE=FALSE \
>   -D DEBUG_PRINT_ERROR_LEVEL=0x80000000 \
>   -D BUILD_SHELL=TRUE \
>   --pcd="gEfiShellPkgTokenSpaceGuid.PcdShellDefaultDelay=0" \
>   --pcd="gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut=0" \
>   --hash --cmd-len=65536
>
> To reproduce the issue I launched the firmware in Qemu and had it
> reboot once it finished booting up, via a startup.nsh on the ESP.
>
> Qemu command line:
> qemu-system-aarch64 \
>   -machine virt,accel=kvm -m 13G \
>   -boot menu=off \
>   -cpu host \
>   -blockdev node-name=code,driver=file,filename="${FW_CODE}",read-only=on \
>   -blockdev node-name=vars,driver=file,filename="${FW_VARS}" \
>   -machine pflash0=code \
>   -machine pflash1=vars \
>   -serial stdio \
>   -net none \
>   -drive file=esp.img,snapshot=on
>
> Other things like number of CPUs or the presence of a vTPM have no
> influence. I did not try different amounts of RAM yet.
>
> Serial output:
> [...]
> InitializeDxeNxMemoryProtectionPolicy: StackBase = 0x00000000476C5000 StackSize = 0x0000000000020000
> InitializeDxeNxMemoryProtectionPolicy: applying strict permissions to active memory regions
> SetUefiImageMemoryAttributes - 0x0000000040000000 - 0x00000000076E5000 (0x0000000000004000)
> UpdateRegionMappingRecursive(0): 40000000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 40000000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 40000000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 47600000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> SetUefiImageMemoryAttributes - 0x00000000476C5000 - 0x0000000000001000 (0x0000000000006000)
> UpdateRegionMappingRecursive(0): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> SetUefiImageMemoryAttributes - 0x000000004772B000 - 0x00000000007C0000 (0x0000000000004000)
> UpdateRegionMappingRecursive(0): 4772B000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 4772B000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 4772B000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 4772B000 - 47800000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 47E00000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> SetUefiImageMemoryAttributes - 0x0000000047EF3000 - 0x0000000000101000 (0x0000000000004000)
> UpdateRegionMappingRecursive(0): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> SetUefiImageMemoryAttributes - 0x0000000047FFA000 - 0x0000000334AA6000 (0x0000000000004000)
> UpdateRegionMappingRecursive(0): 47FFA000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 47FFA000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 47FFA000 - 80000000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 47FFA000 - 48000000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 340000000 - 380000000 set 70C clr 0
> UpdateRegionMappingRecursive(3): 37F000000 - 37F200000 set 70C clr 0
> UpdateRegionMappingRecursive(2): 340000000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 37CA00000 - 37CC00000 set 70C clr 0
> UpdateRegionMappingRecursive(3): 37CA00000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> SetUefiImageMemoryAttributes - 0x000000037CB40000 - 0x00000000031F9000 (0x0000000000004000)
> UpdateRegionMappingRecursive(0): 37CB40000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(1): 37CB40000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(2): 37CB40000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 37CB40000 - 37CC00000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 37F000000 - 37F200000 set 60000000000400 clr FF9F000000000B3F
> UpdateRegionMappingRecursive(3): 37FC00000 - 37FE00000 set 70C clr 0
> UpdateRegionMappingRecursive(3): 37FC00000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
>
>
> Synchronous Exception at 0x000000037FD3C0A8
> PC 0x00037FD3C0A8 (0x00037FD39000+0x000030A8) [ 0] ArmCpuDxe.dll
> PC 0x00037FD3C0A8 (0x00037FD39000+0x000030A8) [ 0] ArmCpuDxe.dll
> PC 0x00037FD3BE70 (0x00037FD39000+0x00002E70) [ 0] ArmCpuDxe.dll
> PC 0x00037FD3BE70 (0x00037FD39000+0x00002E70) [ 0] ArmCpuDxe.dll
> PC 0x00037FD3C2E4 (0x00037FD39000+0x000032E4) [ 0] ArmCpuDxe.dll
> PC 0x0000476E78F8 (0x0000476E5000+0x000028F8) [ 1] DxeCore.dll
> PC 0x0000476ED680 (0x0000476E5000+0x00008680) [ 1] DxeCore.dll
> PC 0x0000476F2744 (0x0000476E5000+0x0000D744) [ 1] DxeCore.dll
> PC 0x0000476ECDE8 (0x0000476E5000+0x00007DE8) [ 1] DxeCore.dll
> PC 0x00037FD3D2DC (0x00037FD39000+0x000042DC) [ 2] ArmCpuDxe.dll
> PC 0x0000476EC788 (0x0000476E5000+0x00007788) [ 3] DxeCore.dll
> PC 0x0000476F9CA8 (0x0000476E5000+0x00014CA8) [ 3] DxeCore.dll
> PC 0x0000476EFEF0 (0x0000476E5000+0x0000AEF0) [ 3] DxeCore.dll
>
> [ 0] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/ArmPkg/Drivers/CpuDxe/CpuDxe/DEBUG/ArmCpuDxe.dll
> [ 1] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
> [ 2] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/ArmPkg/Drivers/CpuDxe/CpuDxe/DEBUG/ArmCpuDxe.dll
> [ 3] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
>
> X0 0x000000037F10BFF0 X1 0x000000037F106003 X2 0x000000000037FC00 X3 0x0000000000000000
> X4 0x0000000000000200 X5 0x0000000000000004 X6 0x0000000000000000 X7 0x000000037FD3F4B5
> X8 0x0000000000000000 X9 0x0000000000000002 X10 0x0000000000000000 X11 0x0000000000000000
> X12 0x0000000000000002 X13 0x0000000000000002 X14 0x0000000000000001 X15 0x0000000000000002
> X16 0x000000037FD3A268 X17 0x00000000007AFA10 X18 0x0000000000000000 X19 0x000000037FC00000
> X20 0x0000000000000002 X21 0x000000037F106003 X22 0x000000037F10B000 X23 0x000000037FD42000
> X24 0x00000000001FFFFF X25 0x000000037FD39000 X26 0x000000037F106000 X27 0x0000000000000003
> X28 0x000000037F10BFF0 FP 0x00000000476E4780 LR 0x000000037FD3C0A8
>
> V0 0x0000000000000000 0000000000000000 V1 0x0000000000000000 0000000000000000
> V2 0x0000000000000000 0000000000000000 V3 0x0000000000000000 0000000000000000
> V4 0x0000000000000000 0000000000000000 V5 0x0000000000000000 0000000000000000
> V6 0x0000000000000000 0000000000000000 V7 0x0000000000000000 0000000000000000
> V8 0x0000000000000000 0000000000000000 V9 0x0000000000000000 0000000000000000
> V10 0x0000000000000000 0000000000000000 V11 0x0000000000000000 0000000000000000
> V12 0x0000000000000000 0000000000000000 V13 0x0000000000000000 0000000000000000
> V14 0x0000000000000000 0000000000000000 V15 0x0000000000000000 0000000000000000
> V16 0x0000000000000000 0000000000000000 V17 0x0000000000000000 0000000000000000
> V18 0x0000000000000000 0000000000000000 V19 0x0000000000000000 0000000000000000
> V20 0x0000000000000000 0000000000000000 V21 0x0000000000000000 0000000000000000
> V22 0x0000000000000000 0000000000000000 V23 0x0000000000000000 0000000000000000
> V24 0x0000000000000000 0000000000000000 V25 0x0000000000000000 0000000000000000
> V26 0x0000000000000000 0000000000000000 V27 0x0000000000000000 0000000000000000
> V28 0x0000000000000000 0000000000000000 V29 0x0000000000000000 0000000000000000
> V30 0x0000000000000000 0000000000000000 V31 0x0000000000000000 0000000000000000
>
> SP 0x00000000476E4780 ELR 0x000000037FD3C0A8 SPSR 0x80000205 FPSR 0x00000000
> ESR 0x86000006 FAR 0x000000037FD3C0A8
>
> ESR : EC 0x21 IL 0x1 ISS 0x00000006
>
> Instruction abort: Translation fault, second level
>
> Stack dump:
>   00000476E4680: 0000000000000001 0000000000000004 00000000476E4700 00000000476F3980
>   00000476E46A0: 000000037FD40CBD 0000000000000003 000000037FC00000 000000037FD39000
>   00000476E46C0: 0060000000000400 FF9F000000000B3F 00000000476E4780 000000037FD3BE70
>   00000476E46E0: 000000037FC00000 0000000000000002 000000037F106000 000000037F10B000
>   00000476E4700: 0000000000000FF0 00000000001FFFFF 000000037FD39000 000000037F106000
>   00000476E4720: 0000000000000003 000000037F10BFF0 0060000000000400 FF9F000000000B3F
>   00000476E4740: 000000037FD39000 000000037FD39000 00000000476E4780 0060000000000403
>   00000476E4760: 0000000C00000001 000000037FD3F90E 0000000000000400 000000037F10B000
> > 00000476E4780: 00000000476E4830 000000037FD3BE70 000000037CB40000 0000000000000001
>   00000476E47A0: 000000037F10B000 0000000047FFE000 0000000000000068 000000003FFFFFFF
>   00000476E47C0: 000000037FD39000 000000037F10C528 0000000000000002 0000000047FFE068
>   00000476E47E0: 0060000000000400 FF9F000000000B3F 0000000300000001 000000037FD39000
>   00000476E4800: 000000017FD40CBD 0060000000000401 0000001500000001 000000037FD3F90E
>   00000476E4820: 0060000000000400 000000037F106000 00000000476E48E0 000000037FD3BE70
>   00000476E4840: 000000037CB40000 0000000000000000 0000000047FFE000 0000000047FFF000
>   00000476E4860: 0000000000000000 0000007FFFFFFFFF 000000037FD39000 000000037F10C528
> ASSERT [ArmCpuDxe] /root/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c(333): ((BOOLEAN)(0==1))
>
>
>
> The full log is available here:
> https://gitlab.com/osteffen/thunderx2-debug/-/raw/main/2023-05-19/85.log?inline=false
>
> Debug files, firmware binaries, and the full build tree are here:
> https://gitlab.com/osteffen/thunderx2-debug/-/tree/main/2023-05-19
>
> I am able to reproduce this quickly, so any ideas for what I can try
> are welcome :-)
>
> Thanks
> -Oliver
>
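
For reference, the numbers in the quoted exception dump can be
cross-checked against the reading of the fault given at the top of this
message. The snippet below is only a sketch under stated assumptions:
it assumes the 4 KB translation granule (so a level 2 entry maps 2 MB
and each table holds 512 eight-byte descriptors), and it assumes that
X0 and X1 still hold the operands of the "str x1, [x0]" shown in the
patch context. The helper names are made up for illustration; the
constants are copied from the dump.

```python
# Sketch only: cross-check values copied from the quoted exception dump.
# Assumes the 4 KB translation granule; helper names are illustrative.

ESR = 0x86000006
FAR = 0x000000037FD3C0A8
X0 = 0x000000037F10BFF0  # assumed: address operand of "str x1, [x0]"
X1 = 0x000000037F106003  # assumed: descriptor value being stored

L2_SHIFT = 21            # a level 2 entry maps 2 MB with a 4 KB granule
ENTRIES_PER_TABLE = 512  # 4 KB table / 8-byte descriptors

# Upper four bits of the 6-bit fault status code; the low two bits
# give the translation level for these encodings.
FAULT_KINDS = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_esr(esr):
    """Split an ESR_ELx value into (EC, IL, ISS)."""
    return (esr >> 26) & 0x3F, (esr >> 25) & 0x1, esr & 0x1FFFFFF


def decode_fault_status(iss):
    """Return (kind, level) for the common fault status codes, else None."""
    fsc = iss & 0x3F
    kind = FAULT_KINDS.get(fsc >> 2)
    return (kind, fsc & 0x3) if kind else None


def l2_block_base(va):
    """Base address of the 2 MB region containing va."""
    return va & ~((1 << L2_SHIFT) - 1)


def l2_entry_offset(va):
    """Byte offset of va's descriptor within its level 2 table."""
    return ((va >> L2_SHIFT) & (ENTRIES_PER_TABLE - 1)) * 8


if __name__ == "__main__":
    ec, il, iss = decode_esr(ESR)
    print(f"EC {ec:#x} IL {il:#x} ISS {iss:#x}", decode_fault_status(iss))
    print(f"2 MB block base  {l2_block_base(FAR):#x}")
    print(f"L2 entry offset  {l2_entry_offset(FAR):#x}")
    print(f"X0 & 0xFFF       {X0 & 0xFFF:#x}")
    print(f"X1 type bits     {X1 & 0x3:#b}")
```

If those assumptions hold, the faulting address lies in the 2 MB region
at 0x37FC00000 that the log shows being remapped at level 3, and the low
12 bits of X0 match the slot for that region in its level 2 table. That
would be consistent with the entry being replaced while we are executing
from the region it maps.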