From: "Oliver Steffen"
Date: Sat, 20 May 2023 08:37:26 +0000
Subject: Re: [edk2-devel] [PATCH v2 2/2] ArmVirtPkg/ArmVirtQemu: Avoid early ID map on ThunderX
To: Ard Biesheuvel
Cc: devel@edk2.groups.io, Gerd Hoffmann, Marc Zyngier, dann.frazier@canonical.com
References: <173FFD60429C89C3.3213@groups.io> <17489D498A098DB9.9697@groups.io>

Quoting Ard Biesheuvel (2023-05-19 23:36:53)
> On Fri, 19 May 2023 at 18:32, Oliver Steffen wrote:
> >
> > Hi all,
> >
> > I had another look at this and I can now reproduce the issue
> > consistently, with a quite minimal setup, on recent Linux kernel, Qemu,
> > and EDK2. It requires rebooting the guest in a tight loop. It happens
> > in silent and verbose builds alike, but since the verbose ones are
> > slowed down by the serial output, it takes longer to hit the issue.
> > It is possible to reproduce it with the silent builds within a few minutes.
> > For the verbose case I recommend running multiple Qemu instances in
> > parallel (as many as the machine allows, in my case ~100).
> >
>
> Thanks a lot for all these details, this is extremely helpful.
>
> So what appears to be happening is that we split the 2M block mapping
> that covers the code that we were called from, and hit a level 2
> translation fault because the updated page table entry is still
> observed to be in its transient 'invalid' state as we return to it.
>
> Could you please check whether this makes a difference?
>
> --- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> +++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> @@ -65,6 +65,7 @@
>    // write updated entry
>    str   x1, [x0]
>    dsb   nshst
> +  isb
>
>  .L2_\@:
>    .endm

That fixes it - no crash observed within 150k iterations.

Thanks, Ard!

- Oliver

> >
> > Details:
> >
> > CPU: Cavium ThunderX2(R) CPU CN9975
> > Tested on 3 different machines:
> >   HPE apache, HPE apollo, Gigabyte R181
> > Kernels tested:
> >   - 6.2.15-100.fc36.aarch64
> >   - 5.14.0-312.el9.aarch64
> >     (contains 406504c7b0405d74d74c15a667cd4c4620c3e7a9,
> >     "KVM: arm64: Fix S1PTW handling on RO memslots")
> > Qemu v8.0.0 (RHEL version and build from upstream repo)
> > EDK2: master branch from 2023-05-16 (cafb4f3f)
> > gcc 11.3.1
> >
> > EDK2 build command line:
> > build \
> >   -a AARCH64 \
> >   -p ArmVirtPkg/ArmVirtQemu.dsc \
> >   -t GCC5 -b DEBUG \
> >   -D NETWORK_IP6_ENABLE \
> >   -D NETWORK_HTTP_BOOT_ENABLE \
> >   -D NETWORK_TLS_ENABLE \
> >   -D NETWORK_ISCSI_ENABLE \
> >   -D NETWORK_ALLOW_HTTP_CONNECTIONS \
> >   -D CAVIUM_ERRATUM_27456=TRUE \
> >   -D TPM2_ENABLE=TRUE \
> >   -D TPM1_ENABLE=FALSE \
> >   -D DEBUG_PRINT_ERROR_LEVEL=0x80000000 \
> >   -D BUILD_SHELL=TRUE \
> >   --pcd="gEfiShellPkgTokenSpaceGuid.PcdShellDefaultDelay=0" \
> >   --pcd="gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut=0" \
> >   --hash \
> >   --cmd-len=65536
> >
> > To reproduce the issue I launched the firmware in Qemu and had it
> > reboot once it finished booting up, via a startup.nsh on the ESP.
> >
> > Qemu command line:
> > qemu-system-aarch64 \
> >   -machine virt,accel=kvm -m 13G \
> >   -boot menu=off \
> >   -cpu host \
> >   -blockdev node-name=code,driver=file,filename="${FW_CODE}",read-only=on \
> >   -blockdev node-name=vars,driver=file,filename="${FW_VARS}" \
> >   -machine pflash0=code \
> >   -machine pflash1=vars \
> >   -serial stdio \
> >   -net none \
> >   -drive file=esp.img,snapshot=on
> >
> > Other things like number of CPUs or the presence of a vTPM have no
> > influence. I did not try different amounts of RAM yet.
> >
> > Serial output:
> > [...]
> > InitializeDxeNxMemoryProtectionPolicy: StackBase = 0x00000000476C5000 StackSize = 0x0000000000020000
> > InitializeDxeNxMemoryProtectionPolicy: applying strict permissions to active memory regions
> > SetUefiImageMemoryAttributes - 0x0000000040000000 - 0x00000000076E5000 (0x0000000000004000)
> > UpdateRegionMappingRecursive(0): 40000000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(1): 40000000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(2): 40000000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(3): 47600000 - 476E5000 set 60000000000400 clr FF9F000000000B3F
> > SetUefiImageMemoryAttributes - 0x00000000476C5000 - 0x0000000000001000 (0x0000000000006000)
> > UpdateRegionMappingRecursive(0): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(1): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(2): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(3): 476C5000 - 476C6000 set 60000000000000 clr FF9F000000000B3F
> > SetUefiImageMemoryAttributes - 0x000000004772B000 - 0x00000000007C0000 (0x0000000000004000)
> > UpdateRegionMappingRecursive(0): 4772B000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(1): 4772B000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(2): 4772B000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(3): 4772B000 - 47800000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(3): 47E00000 - 47EEB000 set 60000000000400 clr FF9F000000000B3F
> > SetUefiImageMemoryAttributes - 0x0000000047EF3000 - 0x0000000000101000 (0x0000000000004000)
> > UpdateRegionMappingRecursive(0): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(1): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(2): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(3): 47EF3000 - 47FF4000 set 60000000000400 clr FF9F000000000B3F
> > SetUefiImageMemoryAttributes - 0x0000000047FFA000 - 0x0000000334AA6000 (0x0000000000004000)
> > UpdateRegionMappingRecursive(0): 47FFA000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(1): 47FFA000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(2): 47FFA000 - 80000000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(3): 47FFA000 - 48000000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(2): 340000000 - 380000000 set 70C clr 0
> > UpdateRegionMappingRecursive(3): 37F000000 - 37F200000 set 70C clr 0
> > UpdateRegionMappingRecursive(2): 340000000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(3): 37CA00000 - 37CC00000 set 70C clr 0
> > UpdateRegionMappingRecursive(3): 37CA00000 - 37CAA0000 set 60000000000400 clr FF9F000000000B3F
> > SetUefiImageMemoryAttributes - 0x000000037CB40000 - 0x00000000031F9000 (0x0000000000004000)
> > UpdateRegionMappingRecursive(0): 37CB40000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(1): 37CB40000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(2): 37CB40000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(3): 37CB40000 - 37CC00000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(3): 37F000000 - 37F200000 set 60000000000400 clr FF9F000000000B3F
> > UpdateRegionMappingRecursive(3): 37FC00000 - 37FE00000 set 70C clr 0
> > UpdateRegionMappingRecursive(3): 37FC00000 - 37FD39000 set 60000000000400 clr FF9F000000000B3F
> >
> >
> > Synchronous Exception at 0x000000037FD3C0A8
> > PC 0x00037FD3C0A8 (0x00037FD39000+0x000030A8) [ 0] ArmCpuDxe.dll
> > PC 0x00037FD3C0A8 (0x00037FD39000+0x000030A8) [ 0] ArmCpuDxe.dll
> > PC 0x00037FD3BE70 (0x00037FD39000+0x00002E70) [ 0] ArmCpuDxe.dll
> > PC 0x00037FD3BE70 (0x00037FD39000+0x00002E70) [ 0] ArmCpuDxe.dll
> > PC 0x00037FD3C2E4 (0x00037FD39000+0x000032E4) [ 0] ArmCpuDxe.dll
> > PC 0x0000476E78F8 (0x0000476E5000+0x000028F8) [ 1] DxeCore.dll
> > PC 0x0000476ED680 (0x0000476E5000+0x00008680) [ 1] DxeCore.dll
> > PC 0x0000476F2744 (0x0000476E5000+0x0000D744) [ 1] DxeCore.dll
> > PC 0x0000476ECDE8 (0x0000476E5000+0x00007DE8) [ 1] DxeCore.dll
> > PC 0x00037FD3D2DC (0x00037FD39000+0x000042DC) [ 2] ArmCpuDxe.dll
> > PC 0x0000476EC788 (0x0000476E5000+0x00007788) [ 3] DxeCore.dll
> > PC 0x0000476F9CA8 (0x0000476E5000+0x00014CA8) [ 3] DxeCore.dll
> > PC 0x0000476EFEF0 (0x0000476E5000+0x0000AEF0) [ 3] DxeCore.dll
> >
> > [ 0] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/ArmPkg/Drivers/CpuDxe/CpuDxe/DEBUG/ArmCpuDxe.dll
> > [ 1] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
> > [ 2] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/ArmPkg/Drivers/CpuDxe/CpuDxe/DEBUG/ArmCpuDxe.dll
> > [ 3] /root/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
> >
> > X0 0x000000037F10BFF0 X1 0x000000037F106003 X2 0x000000000037FC00 X3 0x0000000000000000
> > X4 0x0000000000000200 X5 0x0000000000000004 X6 0x0000000000000000 X7 0x000000037FD3F4B5
> > X8 0x0000000000000000 X9 0x0000000000000002 X10 0x0000000000000000 X11 0x0000000000000000
> > X12 0x0000000000000002 X13 0x0000000000000002 X14 0x0000000000000001 X15 0x0000000000000002
> > X16 0x000000037FD3A268 X17 0x00000000007AFA10 X18 0x0000000000000000 X19 0x000000037FC00000
> > X20 0x0000000000000002 X21 0x000000037F106003 X22 0x000000037F10B000 X23 0x000000037FD42000
> > X24 0x00000000001FFFFF X25 0x000000037FD39000 X26 0x000000037F106000 X27 0x0000000000000003
> > X28 0x000000037F10BFF0 FP 0x00000000476E4780 LR 0x000000037FD3C0A8
> >
> > V0 0x0000000000000000 0000000000000000 V1 0x0000000000000000 0000000000000000
> > V2 0x0000000000000000 0000000000000000 V3 0x0000000000000000 0000000000000000
> > V4 0x0000000000000000 0000000000000000 V5 0x0000000000000000 0000000000000000
> > V6 0x0000000000000000 0000000000000000 V7 0x0000000000000000 0000000000000000
> > V8 0x0000000000000000 0000000000000000 V9 0x0000000000000000 0000000000000000
> > V10 0x0000000000000000 0000000000000000 V11 0x0000000000000000 0000000000000000
> > V12 0x0000000000000000 0000000000000000 V13 0x0000000000000000 0000000000000000
> > V14 0x0000000000000000 0000000000000000 V15 0x0000000000000000 0000000000000000
> > V16 0x0000000000000000 0000000000000000 V17 0x0000000000000000 0000000000000000
> > V18 0x0000000000000000 0000000000000000 V19 0x0000000000000000 0000000000000000
> > V20 0x0000000000000000 0000000000000000 V21 0x0000000000000000 0000000000000000
> > V22 0x0000000000000000 0000000000000000 V23 0x0000000000000000 0000000000000000
> > V24 0x0000000000000000 0000000000000000 V25 0x0000000000000000 0000000000000000
> > V26 0x0000000000000000 0000000000000000 V27 0x0000000000000000 0000000000000000
> > V28 0x0000000000000000 0000000000000000 V29 0x0000000000000000 0000000000000000
> > V30 0x0000000000000000 0000000000000000 V31 0x0000000000000000 0000000000000000
> >
> > SP 0x00000000476E4780 ELR 0x000000037FD3C0A8 SPSR 0x80000205 FPSR 0x00000000
> > ESR 0x86000006 FAR 0x000000037FD3C0A8
> >
> > ESR : EC 0x21 IL 0x1 ISS 0x00000006
> >
> > Instruction abort: Translation fault, second level
> >
> > Stack dump:
> >   00000476E4680: 0000000000000001 0000000000000004 00000000476E4700 00000000476F3980
> >   00000476E46A0: 000000037FD40CBD 0000000000000003 000000037FC00000 000000037FD39000
> >   00000476E46C0: 0060000000000400 FF9F000000000B3F 00000000476E4780 000000037FD3BE70
> >   00000476E46E0: 000000037FC00000 0000000000000002 000000037F106000 000000037F10B000
> >   00000476E4700: 0000000000000FF0 00000000001FFFFF 000000037FD39000 000000037F106000
> >   00000476E4720: 0000000000000003 000000037F10BFF0 0060000000000400 FF9F000000000B3F
> >   00000476E4740: 000000037FD39000 000000037FD39000 00000000476E4780 0060000000000403
> >   00000476E4760: 0000000C00000001 000000037FD3F90E 0000000000000400 000000037F10B000
> > > 00000476E4780: 00000000476E4830 000000037FD3BE70 000000037CB40000 0000000000000001
> >   00000476E47A0: 000000037F10B000 0000000047FFE000 0000000000000068 000000003FFFFFFF
> >   00000476E47C0: 000000037FD39000 000000037F10C528 0000000000000002 0000000047FFE068
> >   00000476E47E0: 0060000000000400 FF9F000000000B3F 0000000300000001 000000037FD39000
> >   00000476E4800: 000000017FD40CBD 0060000000000401 0000001500000001 000000037FD3F90E
> >   00000476E4820: 0060000000000400 000000037F106000 00000000476E48E0 000000037FD3BE70
> >   00000476E4840: 000000037CB40000 0000000000000000 0000000047FFE000 0000000047FFF000
> >   00000476E4860: 0000000000000000 0000007FFFFFFFFF 000000037FD39000 000000037F10C528
> > ASSERT [ArmCpuDxe] /root/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c(333): ((BOOLEAN)(0==1))
> >
> >
> > The full log is available here:
> > https://gitlab.com/osteffen/thunderx2-debug/-/raw/main/2023-05-19/85.log?inline=false
> >
> > Debug files, firmware binaries, and the full build tree are here:
> > https://gitlab.com/osteffen/thunderx2-debug/-/tree/main/2023-05-19
> >
> > I am able to reproduce this quickly, so any ideas for what I can try
> > are welcome :-)
> >
> > Thanks
> > -Oliver
> >
>

-- 
🎩Oliver Steffen (he/him) - Software Engineer, Virtualization
Red Hat GmbH, Registered seat: Werner-von-Siemens-Ring 12, D-85630 Grasbrunn, Germany
Commercial register: Amtsgericht München/Munich, HRB 153243,
Managing Directors: Ryan Barnhart, Charles Cachera, Michael O'Neill, Amy Ross

Everyone has different working hours… Please do not feel obligated to
reply outside of your normal work schedule.
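
The syndrome and fault address in the quoted exception dump can be cross-checked by hand. Below is a minimal sketch in Python, assuming the ESR_ELx field layout from the Arm architecture reference manual (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], fault status code in ISS[5:0]) and a 2 MiB level 2 block size for the 4 KiB granule. The helper names are made up for illustration and are not part of EDK2.

```python
# Hypothetical helpers for reading the exception dump; not EDK2 code.

BLOCK_2M = 2 * 1024 * 1024  # assumed level 2 block size (4 KiB granule)

# Subset of the instruction-abort fault status codes (ISS[5:0]):
# translation fault at level 0..3.
TRANSLATION_FAULT_LEVEL = {0b000100: 0, 0b000101: 1, 0b000110: 2, 0b000111: 3}


def decode_esr(esr):
    """Split an ESR_ELx value into its (EC, IL, ISS) fields."""
    ec = (esr >> 26) & 0x3F    # exception class, bits [31:26]
    il = (esr >> 25) & 0x1     # instruction length, bit 25
    iss = esr & 0x1FFFFFF      # instruction specific syndrome, bits [24:0]
    return ec, il, iss


def translation_fault_level(iss):
    """Return the table level for a translation fault, or None otherwise."""
    return TRANSLATION_FAULT_LEVEL.get(iss & 0x3F)


def block_base(addr, size=BLOCK_2M):
    """Round an address down to the start of its enclosing block."""
    return addr & ~(size - 1)


ec, il, iss = decode_esr(0x86000006)      # ESR from the dump
print(f"EC {ec:#x} IL {il:#x} ISS {iss:#010x}")
print("translation fault level:", translation_fault_level(iss))
print(f"block containing FAR: {block_base(0x37FD3C0A8):#x}")  # FAR from the dump
```

The decoded fields agree with the "EC 0x21 IL 0x1 ISS 0x00000006" line and the "Translation fault, second level" message in the dump. The faulting address falls in the 2 MiB block starting at 0x37FC00000, which is the block touched by the last logged UpdateRegionMappingRecursive(3) calls, so the numbers are at least consistent with the block-split explanation given in the thread.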