From: "Ard Biesheuvel"
Date: Wed, 23 Feb 2022 18:40:59 +0100
Subject: Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
To: edk2-devel-groups-io, Ashish Singhal
Cc: Marc Zyngier, Sami Mujawar, Ard Biesheuvel, Leif Lindholm

On Wed, 23 Feb 2022 at 18:36, Ashish Singhal via groups.io wrote:
>
> Hello Ard and Marc,
>
> I apologize for not providing the background on this in the commit message and I understand the commit message is not very clear as well. Let me try to summarize the problem.
>
> In our UEFI implementation, we are doing the following as part of the initial MMU setup:
>
> Set the applicable device memory as nGnRnE memory.
> Set the whole DRAM as normal memory that translates to RW and executable memory.
> Enable caches and MMU.
>
> At this time, the memory map looks correct when I check that from DS-5.
>

I have asked you a number of times about the XN attribute. How and where are you setting it for the device regions?

> When we start dispatching drivers, DxeCore dispatches a driver and marks its code area as RO and executable and its data region as RW and non-executable. What we are seeing randomly is that some of the page tables (using DS-5) have invalid output address that leads to the correct input address from UEFI being translated to an unavailable memory location causing a crash sometime in EL2 or sometimes as a RAS error in EL3.
>

At which EL does UEFI run? If at EL1, is it running under a stage2 mapping? If so, does the stage 2 mapping set the XN attribute appropriately for the device mappings?
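
For readers following the XN question, here is a minimal sketch in C of how one might flag a stage 1 leaf descriptor that maps device memory without the execute-never bit. It is not taken from the patch or from ArmPkg. The helper names are hypothetical, and it assumes the VMSAv8-64 long-descriptor layout (valid in bit 0, AttrIndx in bits [4:2], XN/UXN in bit 54), a MAIR value read from the same translation regime, and that the caller passes a block or page descriptor rather than a table descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stage 1 block/page descriptor fields (VMSAv8-64 long-descriptor format). */
#define DESC_VALID          (1ULL << 0)
#define DESC_ATTRINDX_SHIFT 2
#define DESC_ATTRINDX_MASK  0x7ULL
#define DESC_XN             (1ULL << 54) /* XN at EL2/EL3, UXN in the EL1&0 regime */

static_assert(DESC_XN == 0x0040000000000000ULL, "XN/UXN is descriptor bit 54");

/* Hypothetical helper: fetch the MAIR attribute byte selected by AttrIndx. */
static inline uint8_t mair_attr(uint64_t desc, uint64_t mair)
{
    unsigned idx = (unsigned)((desc >> DESC_ATTRINDX_SHIFT) & DESC_ATTRINDX_MASK);
    return (uint8_t)(mair >> (8 * idx));
}

/* Device memory types are encoded as 0b0000dd00; Device-nGnRnE is 0x00. */
static inline bool attr_is_device(uint8_t attr)
{
    return (attr & 0xF3) == 0;
}

/*
 * Hypothetical helper: true when a valid leaf descriptor maps device memory
 * but leaves bit 54 clear, i.e. the region is not marked execute-never.
 * Assumption: desc is a block or page descriptor, not a table descriptor.
 */
static inline bool device_mapping_lacks_xn(uint64_t desc, uint64_t mair)
{
    if ((desc & DESC_VALID) == 0) {
        return false;
    }
    return attr_is_device(mair_attr(desc, mair)) && (desc & DESC_XN) == 0;
}
```

A table walker could call device_mapping_lacks_xn() on every leaf entry and report the ones that return true. The sketch only looks at bit 54; in the EL1&0 regime the PXN bit (bit 53) and any stage 2 XN setting would need the same kind of check.
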
> When I reached out to the CPU team here, they said Arm® Cortex®-A78AE is a highly speculative core and we need to have appropriate barriers in place so that there is consistency in the way an address is accessed especially if it is done right after there is a change in translation tables. Based on this, I started some experimentation wrt caches whenever MMUs are enabled and I found that invalidating the instruction cache after enabling MMUs solves this problem.
>

It may hide it, but I don't think it is a proper fix.

> Please note that I could be wrong with my hypothesis here and I may just be masking the issue. If that is the case, please let me know what I should be trying as I am out of ideas at this point. Also, the same UEFI works on NVIDIA's Xavier Silicon that has Carmel cores but shows this issue on Orin Silicon that has Arm® Cortex®-A78AE v8.2 64-bit CPU.
>

Thanks,
Ard.