From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20230105162528.1430368-1-ardb@kernel.org>
From: "Oliver Steffen"
User-Agent: alot/0.8.1
Date: Thu, 19 Jan 2023 03:25:10 -0800
Subject: Re: [edk2-devel] [PATCH v2 2/2] ArmVirtPkg/ArmVirtQemu: Avoid early ID map on ThunderX
To: Ard Biesheuvel, Marc Zyngier
Cc: devel@edk2.groups.io, dann.frazier@canonical.com, kraxel@redhat.com
Content-Type: text/plain; charset="UTF-8"

Quoting Ard Biesheuvel (2023-01-19 12:11:34)
> (cc Marc)
>
> Context:
> - on my TX2 (with the S1PTW r/o memslot fix applied), the new version
> of ArmVirtQemu that uses an initial ID map in emulated NOR flash works
> fine.
> - in Oliver's case (which is a slightly different flavor of TX2), it
> crashes extremely early, presumably at the point where this ID map is
> activated.
>
> More details at the end.
>
> On Thu, 19 Jan 2023 at 12:03, Oliver Steffen wrote:
> >
> > Quoting Ard Biesheuvel (2023-01-18 10:22:12)
> > > On Wed, 18 Jan 2023 at 09:48, Ard Biesheuvel wrote:
> > > >
> > > > On Wed, 18 Jan 2023 at 09:28, Oliver Steffen wrote:
> > > > >
> > > > > Quoting Ard Biesheuvel (2023-01-18 08:34:32)
> > > > > > On Wed, 18 Jan 2023 at 07:37, Oliver Steffen wrote:
> > > > > > >
> > > > > > > On Tue, Jan 17, 2023 at 3:57 PM Ard Biesheuvel wrote:
> > > > > > >>
> > > > > > >> On Tue, 17 Jan 2023 at 13:48, Oliver Steffen wrote:
> > > > > > >> >
> > > > > > >> > Hi Ard, Hi everyone,
> > > > > > >> >
> > > > > > >> > Thanks for the work!
> > > > > > >> >
> > > > > > >> > But somehow this patch (as it was merged into master branch) does not
> > > > > > >> > work for me on the ThunderX box we have.
> > > > > > >> >
> > > > > > >> > Any idea what could be wrong?
> > > > > > >>
> > > > > > >> I'm not sure I understand the question. The patch targets ThunderX,
> > > > > > >> and you are using a ThunderX2.
> > > > > > >>
> > > > > > >> What were you expecting to happen, and what is happening instead?
> > > > > > >
> > > > > > >
> > > > > > > Firmware does not start at all when using KVM.
> > > > > > >
> > > > > > > Please excuse my limited knowledge of Arm processor variants.
> > > > > > > I assumed that ThunderX and ThunderX2 are very similar and hoped
> > > > > > > the fix would also work for this case.
> > > > > > >
> > > > > > > The issue was introduced by the same commit that Dann
> > > > > > > reported (07be1d34d95460a238fcd0f6693efb747c28b329):
> > > > > > > "ArmVirtPkg/ArmVirtQemu: enable initial ID map at early boot".
> > > > > > >
> > > > > > >
> > > > > > Can you share the QEMU command line that you are using? I use a
> > > > > > ThunderX2 basically 24/7 to do all my Linux and EDK2 development, so
> > > > > > this change was developed on ThunderX2 and so I'm surprised you are
> > > > > > seeing this issue.
> > > > > >
> > > > > > Did you try the DEBUG build as well?
> > > > >
> > > > > Yes, debug is on.
> > > > >
> > > > > Here is what I have, trying with the master branch from just now
> > > > > (998ebe5ca0ae5c449e83ede533bee872f97d63af):
> > > > >
> > > > > # make -C BaseTools && \
> > > > >   . ./edksetup.sh && \
> > > > >   build -t GCC5 -a AARCH64 \
> > > > >   -p ArmVirtPkg/ArmVirtQemu.dsc \
> > > > >   -DCAVIUM_ERRATUM_27456 \
> > > > >   -b DEBUG
> > > > >
> > > > > # /usr/libexec/qemu-kvm \
> > > > >   -machine accel=kvm -m 1G -boot menu=on \
> > > > >   -blockdev node-name=code,driver=file,filename="${FW_CODE_RESIZED}",read-only=on
> > > > > \
> > > > >   -blockdev node-name=vars,driver=file,filename="${FW_VARS}" \
> > > > >   -machine pflash0=code \
> > > > >   -machine pflash1=vars \
> > > > >   -cpu max \
> > > > >   -net none \
> > > > >   -serial stdio
> > > > >
> > > > >
> > > > My distro does not have qemu-kvm, and using the command line above
> > > > results in the following if I try it with qemu-system-aarch64
> > > >
> > > > """
> > > > qemu-system-aarch64: No machine specified, and there is no default
> > > > Use -machine help to list supported machines
> > > > """
> > > >
> > > > unless I change it to
> > > >
> > > > qemu-system-aarch64 -machine virt,accel=kvm -m 1G -boot menu=on \
> > > >   -blockdev node-name=code,driver=file,filename=$HOME/bin/flash0.img,read-only=on
> > > > \
> > > >   -blockdev node-name=vars,driver=file,filename=$HOME/bin/flash1.img \
> > > >   -machine pflash0=code \
> > > >   -machine pflash1=vars \
> > > >   -cpu max \
> > > >   -net none \
> > > >   -nographic
> > > >
> > > > and that works fine with my firmware build.
> > > >
> > > >
> > > > > # /usr/libexec/qemu-kvm --version
> > > > > QEMU emulator version 7.2.0 (qemu-kvm-7.2.0-3.el9)
> > > > >
> > > > > # uname -r
> > > > > 5.14.0-234.el9.aarch64
> > > > >
> > > >
> > > > Yeah, that is quite old. One potential issue that comes to mind here
> > > > is the one addressed by the patch below
> > > >
> > > >
> > > > >
> > > > >
> > > > > Since you have the same CPU... Might this be a bug in KVM?
> > > > >
> > > >
> > > > Indeed. Could you try applying this patch?
> > > >
> > > > commit 406504c7b0405d74d74c15a667cd4c4620c3e7a9
> > > > Author: Marc Zyngier
> > > > Date: Tue Dec 20 14:03:52 2022 +0000
> > > >
> > > >     KVM: arm64: Fix S1PTW handling on RO memslots
> > > >
> > > > Or check whether this is generally reproducible with newer kernels?
> > >
> > > Another thing you might try:
> > >
> > > - build the firmware with the following hunk applied
> > >
> > > """
> > > diff --git a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> > > b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> > > index 5ac7c732f6ec..f4e1285beefc 100644
> > > --- a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> > > +++ b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> > > @@ -40,6 +40,12 @@
> > >  .set sctlrval, SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_SA |
> > > SCTLR_EL1_ITD | SCTLR_EL1_SED
> > >  .set sctlrval, sctlrval | SCTLR_ELx_I | SCTLR_EL1_SPAN | SCTLR_EL1_RES1
> > >
> > > + .align 11
> > > +.Lvectors:
> > > + .rept 16
> > > + .align 7
> > > + b .
> > > + .endr
> > >
> > >  ASM_FUNC(ArmPlatformPeiBootAction)
> > >  #ifdef CAVIUM_ERRATUM_27456
> > > @@ -90,6 +96,8 @@ ASM_FUNC(ArmPlatformPeiBootAction)
> > >  msr mair_el1, x0 // set up the 1:1 mapping
> > >  msr tcr_el1, x1
> > >  msr ttbr0_el1, x2
> > > + adr x0, .Lvectors
> > > + msr vbar_el1, x0
> > >  isb
> > >
> > >  tlbi vmalle1 // invalidate any cached translations
> > > """
> > >
> > > - run qemu with the -s option and let it crash
> > >
> > > - connect with gdb and dump the exception context
> > >
> > > target remote:1234
> > > set radix 16
> > > p $FAR_EL1
> > > p $ESR_EL1
> > > p $ELR_EL1
> > >
> > > That should at least tell us why the crash is occurring.
> > >
> >
> > I tried the most recent Qemu master (v7.2.50) and also v7.0.0,
> > on the 5.14 (RHEL) kernel and on 6.1.6-200.fc37.aarch64 (from Fedora).
> > No luck.
> >
>
> Does that include a backport of commit 406504c7b0405d74d74c15a667cd4c4620c3e7a9?
>
> > I applied the patch and attached gdb, as described (Qemu 7.2.50):
> >
> > p $ELR_EL1
> > (gdb) p $FAR_EL1
> > $1 = 0x6200
> > (gdb) p $ESR_EL1
> > $2 = 0x86000010
> > (gdb) p $ELR_EL1
> > $3 = 0x6200
> >
> > There is no sign of any crash. It seems like it does not even start
> > running.
> >
>
> So 0x6200 is the sync exception vector, which is both the code
> location of the crash and the faulting address. This means fetching
> the instructions to handle the original exception failed, and so the
> original exception reason (ESR) is lost. However, the synchronous
> external abort (https://esr.arm64.dev/?#0x86000010) that you are
> seeing might point to an issue similar to (or the same as) the one
> that Marc recently fixed in KVM.
>
> It is quite odd that this does not reproduce *at all* on my TX2.
> Fedora kernels don't use 64k pages right?
>

Kernel config says: CONFIG_ARM64_4K_PAGES=y
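
For reference, the register values from the gdb session above can be
decoded by hand. The following is only a minimal sketch, not part of any
tool used in this thread: the helper names (decode_esr, vector_slot) are
made up for illustration, the lookup tables cover just the few encodings
relevant here, and the vector table base is assumed to be the 2 KiB
aligned block containing the address (which is what the ".align 11" in
the hunk suggests).

```python
# Illustrative helpers (hypothetical names) for reading AArch64 exception state.

# Partial table of ESR_ELx.EC values; only the abort classes are listed.
EC_NAMES = {
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

# Partial table of fault status codes (ISS[5:0] for aborts).
FSC_NAMES = {
    0x10: "Synchronous External abort, not on translation table walk",
}

VECTOR_GROUPS = [
    "current EL with SP_EL0",
    "current EL with SP_ELx",
    "lower EL using AArch64",
    "lower EL using AArch32",
]
VECTOR_KINDS = ["Synchronous", "IRQ", "FIQ", "SError"]


def decode_esr(esr):
    """Split an ESR_ELx value into EC (bits 31:26), IL (bit 25) and ISS (bits 24:0)."""
    ec = (esr >> 26) & 0x3F
    iss = esr & 0x1FFFFFF
    fsc = iss & 0x3F  # only meaningful for instruction/data aborts
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not in this partial table"),
        "il": (esr >> 25) & 0x1,
        "iss": iss,
        "fsc": fsc,
        "fsc_name": FSC_NAMES.get(fsc, "not in this partial table"),
    }


def vector_slot(addr):
    """Map an address inside a 2 KiB vector table to (base, group, kind).

    Assumes the table base is addr rounded down to 2 KiB; each of the
    16 entries is 0x80 bytes, in four groups of four.
    """
    base = addr & ~0x7FF
    index = (addr & 0x7FF) >> 7
    return base, VECTOR_GROUPS[index // 4], VECTOR_KINDS[index % 4]


if __name__ == "__main__":
    info = decode_esr(0x86000010)
    print(hex(info["ec"]), info["ec_name"])
    print(hex(info["fsc"]), info["fsc_name"])
    print(vector_slot(0x6200))
```

Under those assumptions, 0x86000010 splits into EC 0x21 with fault status
0x10, and 0x6200 falls on the synchronous slot for the current EL with
SP_ELx, which matches the reading given in the quoted reply.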