* [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable @ 2022-02-22 2:42 Ashish Singhal 2022-02-23 5:07 ` Ashish Singhal 2022-02-23 7:02 ` Ard Biesheuvel 0 siblings, 2 replies; 10+ messages in thread From: Ashish Singhal @ 2022-02-22 2:42 UTC (permalink / raw) To: devel, sami.mujawar, ardb+tianocore, quic_llindhol; +Cc: Ashish Singhal Even with MMU turned off, instruction cache can speculate and fetch instructions. This can cause a crash if region being executed has been modified recently. With this patch, we ensure that instruction cache is invalidated right after MMU has been enabled and any potentially stale instruction fetched earlier has been discarded. This is specially helpful when the memory attributes of a region in MMU are being changed and some instructions operating on the region are prefetched in the instruction cache. Signed-off-by: Ashish Singhal <ashishsingha@nvidia.com> --- ArmPkg/Library/ArmLib/AArch64/AArch64Support.S | 4 +++- ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S index d3cc1e8671..9648245182 100644 --- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S +++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S @@ -89,7 +89,9 @@ ASM_FUNC(ArmEnableMmu) dsb nsh isb msr sctlr_el3, x0 // Write back -4: isb +4: ic iallu + dsb sy + isb ret diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S index 66ebca571e..56cc2dd73f 100644 --- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S +++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S @@ -37,6 +37,8 @@ // re-enable the MMU msr sctlr_el\el, x8 + ic iallu + dsb sy isb .endm -- 2.17.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable 2022-02-22 2:42 [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable Ashish Singhal @ 2022-02-23 5:07 ` Ashish Singhal 2022-02-23 7:02 ` Ard Biesheuvel 1 sibling, 0 replies; 10+ messages in thread From: Ashish Singhal @ 2022-02-23 5:07 UTC (permalink / raw) To: devel@edk2.groups.io, sami.mujawar@arm.com, ardb+tianocore@kernel.org, quic_llindhol@quicinc.com, Samer El-Haj-Mahmoud [-- Attachment #1: Type: text/plain, Size: 2253 bytes --] + @Samer El-Haj-Mahmoud<mailto:Samer.El-Haj-Mahmoud@arm.com> Hello Leif/Ard/Sami/Samer, Can you please look at this patch and provide some feedback? Thanks Ashish ________________________________ From: Ashish Singhal <ashishsingha@nvidia.com> Sent: Monday, February 21, 2022 7:42 PM To: devel@edk2.groups.io <devel@edk2.groups.io>; sami.mujawar@arm.com <sami.mujawar@arm.com>; ardb+tianocore@kernel.org <ardb+tianocore@kernel.org>; quic_llindhol@quicinc.com <quic_llindhol@quicinc.com> Cc: Ashish Singhal <ashishsingha@nvidia.com> Subject: [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable Even with MMU turned off, instruction cache can speculate and fetch instructions. This can cause a crash if region being executed has been modified recently. With this patch, we ensure that instruction cache is invalidated right after MMU has been enabled and any potentially stale instruction fetched earlier has been discarded. This is specially helpful when the memory attributes of a region in MMU are being changed and some instructions operating on the region are prefetched in the instruction cache. Signed-off-by: Ashish Singhal <ashishsingha@nvidia.com> --- ArmPkg/Library/ArmLib/AArch64/AArch64Support.S | 4 +++- ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S index d3cc1e8671..9648245182 100644 --- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S +++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S @@ -89,7 +89,9 @@ ASM_FUNC(ArmEnableMmu) dsb nsh isb msr sctlr_el3, x0 // Write back -4: isb +4: ic iallu + dsb sy + isb ret diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S index 66ebca571e..56cc2dd73f 100644 --- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S +++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S @@ -37,6 +37,8 @@ // re-enable the MMU msr sctlr_el\el, x8 + ic iallu + dsb sy isb .endm -- 2.17.1 [-- Attachment #2: Type: text/html, Size: 4451 bytes --] ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable 2022-02-22 2:42 [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable Ashish Singhal 2022-02-23 5:07 ` Ashish Singhal @ 2022-02-23 7:02 ` Ard Biesheuvel 2022-02-23 8:58 ` Marc Zyngier 1 sibling, 1 reply; 10+ messages in thread From: Ard Biesheuvel @ 2022-02-23 7:02 UTC (permalink / raw) To: Ashish Singhal, Marc Zyngier Cc: edk2-devel-groups-io, Sami Mujawar, Ard Biesheuvel, Leif Lindholm (+ Marc) On Tue, 22 Feb 2022 at 03:42, Ashish Singhal <ashishsingha@nvidia.com> wrote: > > Even with MMU turned off, instruction cache can speculate > and fetch instructions. This can cause a crash if region > being executed has been modified recently. With this patch, > we ensure that instruction cache is invalidated right after > MMU has been enabled and any potentially stale instruction > fetched earlier has been discarded. > I don't follow this reasoning. Every time the UEFI image loader loads an image into memory, it does so with MMU and caches enabled, and the code regions are cleaned to the PoU before being invalidated in the I-cache. So how could these regions be stale? The only code that needs special care here is the little trampoline that executes with the MMU off, but this is being taken care of explicitly, by cleaning the code to the PoC before invoking it. > This is specially helpful when the memory attributes of a > region in MMU are being changed and some instructions > operating on the region are prefetched in the instruction > cache. > This sounds like a TLB problem not a cache problem. For the sake of posterity, could you include a more detailed description of the issue in the commit log? IIRC, this is about speculative instruction fetches hitting device memory locations? As I mentioned before, the architecture does not permit speculative instruction fetches for regions mapped with the XN attribute. Is the issue occurring under virtualization? Is there a stage 2 mapping that lacks the XN attribute for the device memory region in question? > Signed-off-by: Ashish Singhal <ashishsingha@nvidia.com> > --- > ArmPkg/Library/ArmLib/AArch64/AArch64Support.S | 4 +++- > ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++ > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S > index d3cc1e8671..9648245182 100644 > --- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S > +++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S > @@ -89,7 +89,9 @@ ASM_FUNC(ArmEnableMmu) > dsb nsh > isb > msr sctlr_el3, x0 // Write back > -4: isb > +4: ic iallu > + dsb sy > + isb > ret > > > diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S > index 66ebca571e..56cc2dd73f 100644 > --- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S > +++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S > @@ -37,6 +37,8 @@ > > // re-enable the MMU > msr sctlr_el\el, x8 > + ic iallu > + dsb sy > isb > .endm > > -- > 2.17.1 > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable 2022-02-23 7:02 ` Ard Biesheuvel @ 2022-02-23 8:58 ` Marc Zyngier 2022-02-23 17:36 ` Ashish Singhal 0 siblings, 1 reply; 10+ messages in thread From: Marc Zyngier @ 2022-02-23 8:58 UTC (permalink / raw) To: Ard Biesheuvel, Ashish Singhal Cc: edk2-devel-groups-io, Sami Mujawar, Ard Biesheuvel, Leif Lindholm On Wed, 23 Feb 2022 07:02:28 +0000, Ard Biesheuvel <ardb@kernel.org> wrote: > > (+ Marc) > > On Tue, 22 Feb 2022 at 03:42, Ashish Singhal <ashishsingha@nvidia.com> wrote: > > > > Even with MMU turned off, instruction cache can speculate > > and fetch instructions. This can cause a crash if region > > being executed has been modified recently. With this patch, > > we ensure that instruction cache is invalidated right after > > MMU has been enabled and any potentially stale instruction > > fetched earlier has been discarded. Are you doing code patching while the MMU is off? > > > > I don't follow this reasoning. Every time the UEFI image loader loads > an image into memory, it does so with MMU and caches enabled, and the > code regions are cleaned to the PoU before being invalidated in the > I-cache. So how could these regions be stale? The only code that needs > special care here is the little trampoline that executes with the MMU > off, but this is being taken care of explicitly, by cleaning the code > to the PoC before invoking it. > > > This is specially helpful when the memory attributes of a > > region in MMU are being changed and some instructions > > operating on the region are prefetched in the instruction > > cache. > > Huh, this is way too vague. How do you expect anyone to understand your problem based on this? > > This sounds like a TLB problem not a cache problem. For the sake of > posterity, could you include a more detailed description of the issue > in the commit log? IIRC, this is about speculative instruction fetches > hitting device memory locations? > > As I mentioned before, the architecture does not permit speculative > instruction fetches for regions mapped with the XN attribute. > > Is the issue occurring under virtualization? Is there a stage 2 > mapping that lacks the XN attribute for the device memory region in > question? > > > Signed-off-by: Ashish Singhal <ashishsingha@nvidia.com> > > --- > > ArmPkg/Library/ArmLib/AArch64/AArch64Support.S | 4 +++- > > ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++ > > 2 files changed, 5 insertions(+), 1 deletion(-) > > > > diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S > > index d3cc1e8671..9648245182 100644 > > --- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S > > +++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S > > @@ -89,7 +89,9 @@ ASM_FUNC(ArmEnableMmu) > > dsb nsh > > isb > > msr sctlr_el3, x0 // Write back > > -4: isb > > +4: ic iallu > > + dsb sy Why DSB SY? Given that you are only invalidating on a single CPU, DSB NSH should be enough. > > + isb > > ret > > > > > > diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S > > index 66ebca571e..56cc2dd73f 100644 > > --- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S > > +++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S > > @@ -37,6 +37,8 @@ > > > > // re-enable the MMU > > msr sctlr_el\el, x8 > > + ic iallu > > + dsb sy Same thing here. M. -- Without deviation from the norm, progress is not possible. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable 2022-02-23 8:58 ` Marc Zyngier @ 2022-02-23 17:36 ` Ashish Singhal 2022-02-23 17:40 ` [edk2-devel] " Ard Biesheuvel 0 siblings, 1 reply; 10+ messages in thread From: Ashish Singhal @ 2022-02-23 17:36 UTC (permalink / raw) To: Marc Zyngier, Ard Biesheuvel Cc: edk2-devel-groups-io, Sami Mujawar, Ard Biesheuvel, Leif Lindholm [-- Attachment #1: Type: text/plain, Size: 5846 bytes --] Hello Ard and Marc, I apologize for not providing the background on this in the commit message and I understand the commit message is not very clear as well. Let me try to summarize the problem. In our UEFI implementation, we are doing the following as part of the initial MMU setup: 1. Set the applicable device memory as nGnRnE memory. 2. Set the whole DRAM as normal memory that translates to RW and executable memory. 3. Enable caches and MMU. 4. At this time, the memory map looks correct when I check that from DS-5. When we start dispatching drivers, DxeCore dispatches a driver and marks its code area as RO and executable and its data region as RW and non-executable. What we are seeing randomly is that some of the page tables (using DS-5) have invalid output address that leads to the correct input address from UEFI being translated to an unavailable memory location causing a crash sometime in EL2 or sometimes as a RAS error in EL3. When I reached out to the CPU team here, they said Arm® Cortex®-A78AE is a highly speculative core and we need to have appropriate barriers in place so that there is consistency in the way an address is accessed especially if it is done right after there is a change in translation tables. Based on this, I started some experimentation wrt caches whenever MMUs are enabled and I found that invalidating the instruction cache after enabling MMUs solves this problem. Please note that I could be wrong with my hypothesis here and I may just be masking the issue. If that is the case, please let me know what I should be trying as I am out of ideas at this point. Also, the same UEFI works on NVIDIA's Xavier Silicon that has Carmel cores but shows this issue on Orin Silicon that has Arm® Cortex®-A78AE v8.2 64-bit CPU. @Marc Zyngier<mailto:maz@kernel.org>, I agree with your concern about using "dsb sy" and we should replace that with "dsb nsh" as we are running this only on a single core during boot. Thanks Ashish ________________________________ From: Marc Zyngier <maz@kernel.org> Sent: Wednesday, February 23, 2022 1:58 AM To: Ard Biesheuvel <ardb@kernel.org>; Ashish Singhal <ashishsingha@nvidia.com> Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Sami Mujawar <sami.mujawar@arm.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Leif Lindholm <quic_llindhol@quicinc.com> Subject: Re: [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable External email: Use caution opening links or attachments On Wed, 23 Feb 2022 07:02:28 +0000, Ard Biesheuvel <ardb@kernel.org> wrote: > > (+ Marc) > > On Tue, 22 Feb 2022 at 03:42, Ashish Singhal <ashishsingha@nvidia.com> wrote: > > > > Even with MMU turned off, instruction cache can speculate > > and fetch instructions. This can cause a crash if region > > being executed has been modified recently. With this patch, > > we ensure that instruction cache is invalidated right after > > MMU has been enabled and any potentially stale instruction > > fetched earlier has been discarded. Are you doing code patching while the MMU is off? > > > > I don't follow this reasoning. Every time the UEFI image loader loads > an image into memory, it does so with MMU and caches enabled, and the > code regions are cleaned to the PoU before being invalidated in the > I-cache. So how could these regions be stale? The only code that needs > special care here is the little trampoline that executes with the MMU > off, but this is being taken care of explicitly, by cleaning the code > to the PoC before invoking it. > > > This is specially helpful when the memory attributes of a > > region in MMU are being changed and some instructions > > operating on the region are prefetched in the instruction > > cache. > > Huh, this is way too vague. How do you expect anyone to understand your problem based on this? > > This sounds like a TLB problem not a cache problem. For the sake of > posterity, could you include a more detailed description of the issue > in the commit log? IIRC, this is about speculative instruction fetches > hitting device memory locations? > > As I mentioned before, the architecture does not permit speculative > instruction fetches for regions mapped with the XN attribute. > > Is the issue occurring under virtualization? Is there a stage 2 > mapping that lacks the XN attribute for the device memory region in > question? > > > Signed-off-by: Ashish Singhal <ashishsingha@nvidia.com> > > --- > > ArmPkg/Library/ArmLib/AArch64/AArch64Support.S | 4 +++- > > ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++ > > 2 files changed, 5 insertions(+), 1 deletion(-) > > > > diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S > > index d3cc1e8671..9648245182 100644 > > --- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S > > +++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S > > @@ -89,7 +89,9 @@ ASM_FUNC(ArmEnableMmu) > > dsb nsh > > isb > > msr sctlr_el3, x0 // Write back > > -4: isb > > +4: ic iallu > > + dsb sy Why DSB SY? Given that you are only invalidating on a single CPU, DSB NSH should be enough. > > + isb > > ret > > > > > > diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S > > index 66ebca571e..56cc2dd73f 100644 > > --- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S > > +++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S > > @@ -37,6 +37,8 @@ > > > > // re-enable the MMU > > msr sctlr_el\el, x8 > > + ic iallu > > + dsb sy Same thing here. M. -- Without deviation from the norm, progress is not possible. [-- Attachment #2: Type: text/html, Size: 13347 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable 2022-02-23 17:36 ` Ashish Singhal @ 2022-02-23 17:40 ` Ard Biesheuvel 2022-02-23 18:13 ` Ashish Singhal 0 siblings, 1 reply; 10+ messages in thread From: Ard Biesheuvel @ 2022-02-23 17:40 UTC (permalink / raw) To: edk2-devel-groups-io, Ashish Singhal Cc: Marc Zyngier, Sami Mujawar, Ard Biesheuvel, Leif Lindholm On Wed, 23 Feb 2022 at 18:36, Ashish Singhal via groups.io <ashishsingha=nvidia.com@groups.io> wrote: > > Hello Ard and Marc, > > I apologize for not providing the background on this in the commit message and I understand the commit message is not very clear as well. Let me try to summarize the problem. > > In our UEFI implementation, we are doing the following as part of the initial MMU setup: > > Set the applicable device memory as nGnRnE memory. > Set the whole DRAM as normal memory that translates to RW and executable memory. > Enable caches and MMU. > At this time, the memory map looks correct when I check that from DS-5. > I have asked you a number of times about the XN attribute. How and where are you setting it for the device regions? > When we start dispatching drivers, DxeCore dispatches a driver and marks its code area as RO and executable and its data region as RW and non-executable. What we are seeing randomly is that some of the page tables (using DS-5) have invalid output address that leads to the correct input address from UEFI being translated to an unavailable memory location causing a crash sometime in EL2 or sometimes as a RAS error in EL3. > At which EL does UEFI run? If at EL1, is it running under a stage2 mapping? If so, does the stage 2 mapping set the XN attribute appropriately for the device mappings? > When I reached out to the CPU team here, they said Arm® Cortex®-A78AE is a highly speculative core and we need to have appropriate barriers in place so that there is consistency in the way an address is accessed especially if it is done right after there is a change in translation tables. Based on this, I started some experimentation wrt caches whenever MMUs are enabled and I found that invalidating the instruction cache after enabling MMUs solves this problem. > It may hide it, but I don't think it is a proper fix. > Please note that I could be wrong with my hypothesis here and I may just be masking the issue. If that is the case, please let me know what I should be trying as I am out of ideas at this point. Also, the same UEFI works on NVIDIA's Xavier Silicon that has Carmel cores but shows this issue on Orin Silicon that has Arm® Cortex®-A78AE v8.2 64-bit CPU. > Thanks, Ard. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable 2022-02-23 17:40 ` [edk2-devel] " Ard Biesheuvel @ 2022-02-23 18:13 ` Ashish Singhal 2022-02-23 22:54 ` Ard Biesheuvel 0 siblings, 1 reply; 10+ messages in thread From: Ashish Singhal @ 2022-02-23 18:13 UTC (permalink / raw) To: Ard Biesheuvel, edk2-devel-groups-io Cc: Marc Zyngier, Sami Mujawar, Ard Biesheuvel, Leif Lindholm [-- Attachment #1: Type: text/plain, Size: 3706 bytes --] Ard, During PrePi, I setup the initial memory map by calling into ArmConfigureMmu function with my memory table where device memory regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_DEVICE and DRAM regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK. For device memory, XN bit is set by ArmMemoryAttributeToPageAttribute function. After PrePi, when I add a region of memory to device memory from a DXE driver, I call gDS->AddMemorySpace with EfiGcdMemoryTypeMemoryMappedIo and EFI_MEMORY_UC | EFI_MEMORY_RUNTIME followed by gDS->SetMemorySpaceAttributes with EFI_MEMORY_UC. Please let me know in case I have still not understood your question. In our case, UEFI is running in NS-EL2. I understand what I am proposing may not be a proper fix. I am looking for help on what could be a fix for this as we are using all upstream code for memory management. Thanks Ashish ________________________________ From: Ard Biesheuvel <ardb@kernel.org> Sent: Wednesday, February 23, 2022 10:40 AM To: edk2-devel-groups-io <devel@edk2.groups.io>; Ashish Singhal <ashishsingha@nvidia.com> Cc: Marc Zyngier <maz@kernel.org>; Sami Mujawar <sami.mujawar@arm.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Leif Lindholm <quic_llindhol@quicinc.com> Subject: Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable External email: Use caution opening links or attachments On Wed, 23 Feb 2022 at 18:36, Ashish Singhal via groups.io <ashishsingha=nvidia.com@groups.io> wrote: > > Hello Ard and Marc, > > I apologize for not providing the background on this in the commit message and I understand the commit message is not very clear as well. Let me try to summarize the problem. > > In our UEFI implementation, we are doing the following as part of the initial MMU setup: > > Set the applicable device memory as nGnRnE memory. > Set the whole DRAM as normal memory that translates to RW and executable memory. > Enable caches and MMU. > At this time, the memory map looks correct when I check that from DS-5. > I have asked you a number of times about the XN attribute. How and where are you setting it for the device regions? > When we start dispatching drivers, DxeCore dispatches a driver and marks its code area as RO and executable and its data region as RW and non-executable. What we are seeing randomly is that some of the page tables (using DS-5) have invalid output address that leads to the correct input address from UEFI being translated to an unavailable memory location causing a crash sometime in EL2 or sometimes as a RAS error in EL3. > At which EL does UEFI run? If at EL1, is it running under a stage2 mapping? If so, does the stage 2 mapping set the XN attribute appropriately for the device mappings? > When I reached out to the CPU team here, they said Arm® Cortex®-A78AE is a highly speculative core and we need to have appropriate barriers in place so that there is consistency in the way an address is accessed especially if it is done right after there is a change in translation tables. Based on this, I started some experimentation wrt caches whenever MMUs are enabled and I found that invalidating the instruction cache after enabling MMUs solves this problem. > It may hide it, but I don't think it is a proper fix. > Please note that I could be wrong with my hypothesis here and I may just be masking the issue. If that is the case, please let me know what I should be trying as I am out of ideas at this point. Also, the same UEFI works on NVIDIA's Xavier Silicon that has Carmel cores but shows this issue on Orin Silicon that has Arm® Cortex®-A78AE v8.2 64-bit CPU. > Thanks, Ard. [-- Attachment #2: Type: text/html, Size: 6099 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable 2022-02-23 18:13 ` Ashish Singhal @ 2022-02-23 22:54 ` Ard Biesheuvel 2022-02-24 6:01 ` Ashish Singhal 0 siblings, 1 reply; 10+ messages in thread From: Ard Biesheuvel @ 2022-02-23 22:54 UTC (permalink / raw) To: Ashish Singhal Cc: edk2-devel-groups-io, Marc Zyngier, Sami Mujawar, Ard Biesheuvel, Leif Lindholm On Wed, 23 Feb 2022 at 19:14, Ashish Singhal <ashishsingha@nvidia.com> wrote: > > Ard, > > During PrePi, I setup the initial memory map by calling into ArmConfigureMmu function with my memory table where device memory regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_DEVICE and DRAM regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK. > > For device memory, XN bit is set by ArmMemoryAttributeToPageAttribute function. After PrePi, when I add a region of memory to device memory from a DXE driver, I call gDS->AddMemorySpace with EfiGcdMemoryTypeMemoryMappedIo and EFI_MEMORY_UC | EFI_MEMORY_RUNTIME followed by gDS->SetMemorySpaceAttributes with EFI_MEMORY_UC. > > Please let me know in case I have still not understood your question. > This all looks ok. But the real question is whether the address that the speculative access targets is mapped using the XN attribute or not, so you will need to find a way to check that. So there are a couple of options: - The XN attribute is set correctly, but the CPU is speculatively fetching instructions anyway. This would imply a severe hardware bug, and flushing the I-cache is unlikely to make a difference. - The speculative access is not the result of an instruction fetch, in which case I-cache maintenance is unlikely to help either. - The XN bit is not being set correctly, and so the MMU code needs to be fixed. Papering over this by adding I-cache maintenance doesn't seem the best course of action tbh. -- Ard. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable 2022-02-23 22:54 ` Ard Biesheuvel @ 2022-02-24 6:01 ` Ashish Singhal 2022-02-26 4:46 ` Ashish Singhal 0 siblings, 1 reply; 10+ messages in thread From: Ashish Singhal @ 2022-02-24 6:01 UTC (permalink / raw) To: Ard Biesheuvel Cc: edk2-devel-groups-io, Marc Zyngier, Sami Mujawar, Ard Biesheuvel, Leif Lindholm [-- Attachment #1: Type: text/plain, Size: 2974 bytes --] Hello Ard, When we had a discussion on this topic earlier, maybe a few weeks back, we thought device memory is being accessed in a speculative manner. Our latest debug where we focussed on MMU page tables at the time of error tells that the issue is actually DRAM mapping in page tables getting corrupt (as per DS-5) where descriptor for a page seems to be something garbage. What this causes is that a valid input address in DRAM may get translated to an address in a different region of DRAM or some address in device memory. When I invalidate the instruction cache after enabling MMUs, this issue seems to be getting resolved. Again, I am not saying this is a fix but this is something that solves the issue while we are looking for suggestions from you for a proper fix. I have tried to summarize the problem based on the latest findings a few emails down the trail if you want to have a look at that again. Thanks Ashish ________________________________ From: Ard Biesheuvel <ardb@kernel.org> Sent: Wednesday, February 23, 2022 3:54 PM To: Ashish Singhal <ashishsingha@nvidia.com> Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Marc Zyngier <maz@kernel.org>; Sami Mujawar <sami.mujawar@arm.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Leif Lindholm <quic_llindhol@quicinc.com> Subject: Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable External email: Use caution opening links or attachments On Wed, 23 Feb 2022 at 19:14, Ashish Singhal <ashishsingha@nvidia.com> wrote: > > Ard, > > During PrePi, I setup the initial memory map by calling into ArmConfigureMmu function with my memory table where device memory regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_DEVICE and DRAM regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK. > > For device memory, XN bit is set by ArmMemoryAttributeToPageAttribute function. After PrePi, when I add a region of memory to device memory from a DXE driver, I call gDS->AddMemorySpace with EfiGcdMemoryTypeMemoryMappedIo and EFI_MEMORY_UC | EFI_MEMORY_RUNTIME followed by gDS->SetMemorySpaceAttributes with EFI_MEMORY_UC. > > Please let me know in case I have still not understood your question. > This all looks ok. But the real question is whether the address that the speculative access targets is mapped using the XN attribute or not, so you will need to find a way to check that. So there are a couple of options: - The XN attribute is set correctly, but the CPU is speculatively fetching instructions anyway. This would imply a severe hardware bug, and flushing the I-cache is unlikely to make a difference. - The speculative access is not the result of an instruction fetch, in which case I-cache maintenance is unlikely to help either. - The XN bit is not being set correctly, and so the MMU code needs to be fixed. Papering over this by adding I-cache maintenance doesn't seem the best course of action tbh. -- Ard. [-- Attachment #2: Type: text/html, Size: 5809 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable 2022-02-24 6:01 ` Ashish Singhal @ 2022-02-26 4:46 ` Ashish Singhal 0 siblings, 0 replies; 10+ messages in thread From: Ashish Singhal @ 2022-02-26 4:46 UTC (permalink / raw) To: Ard Biesheuvel Cc: edk2-devel-groups-io, Marc Zyngier, Sami Mujawar, Ard Biesheuvel, Leif Lindholm [-- Attachment #1: Type: text/plain, Size: 3880 bytes --] @Marc Zyngier<mailto:maz@kernel.org> I have addressed your comment regarding the type of data synchronization barrier and have posted v2 of the patch set<https://edk2.groups.io/g/devel/message/87030>. @Ard Biesheuvel<mailto:ardb@kernel.org> I am still looking for a better solution to the problem and I am open to working with you or anyone else from ARM on that. Please let me know if you want me to try something else. Thanks Ashish ________________________________ From: Ashish Singhal <ashishsingha@nvidia.com> Sent: Wednesday, February 23, 2022 11:01 PM To: Ard Biesheuvel <ardb@kernel.org> Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Marc Zyngier <maz@kernel.org>; Sami Mujawar <sami.mujawar@arm.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Leif Lindholm <quic_llindhol@quicinc.com> Subject: Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable Hello Ard, When we had a discussion on this topic earlier, maybe a few weeks back, we thought device memory is being accessed in a speculative manner. Our latest debug where we focussed on MMU page tables at the time of error tells that the issue is actually DRAM mapping in page tables getting corrupt (as per DS-5) where descriptor for a page seems to be something garbage. What this causes is that a valid input address in DRAM may get translated to an address in a different region of DRAM or some address in device memory. When I invalidate the instruction cache after enabling MMUs, this issue seems to be getting resolved. Again, I am not saying this is a fix but this is something that solves the issue while we are looking for suggestions from you for a proper fix. I have tried to summarize the problem based on the latest findings a few emails down the trail if you want to have a look at that again. Thanks Ashish ________________________________ From: Ard Biesheuvel <ardb@kernel.org> Sent: Wednesday, February 23, 2022 3:54 PM To: Ashish Singhal <ashishsingha@nvidia.com> Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Marc Zyngier <maz@kernel.org>; Sami Mujawar <sami.mujawar@arm.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Leif Lindholm <quic_llindhol@quicinc.com> Subject: Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable External email: Use caution opening links or attachments On Wed, 23 Feb 2022 at 19:14, Ashish Singhal <ashishsingha@nvidia.com> wrote: > > Ard, > > During PrePi, I setup the initial memory map by calling into ArmConfigureMmu function with my memory table where device memory regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_DEVICE and DRAM regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK. > > For device memory, XN bit is set by ArmMemoryAttributeToPageAttribute function. After PrePi, when I add a region of memory to device memory from a DXE driver, I call gDS->AddMemorySpace with EfiGcdMemoryTypeMemoryMappedIo and EFI_MEMORY_UC | EFI_MEMORY_RUNTIME followed by gDS->SetMemorySpaceAttributes with EFI_MEMORY_UC. > > Please let me know in case I have still not understood your question. > This all looks ok. But the real question is whether the address that the speculative access targets is mapped using the XN attribute or not, so you will need to find a way to check that. So there are a couple of options: - The XN attribute is set correctly, but the CPU is speculatively fetching instructions anyway. This would imply a severe hardware bug, and flushing the I-cache is unlikely to make a difference. - The speculative access is not the result of an instruction fetch, in which case I-cache maintenance is unlikely to help either. - The XN bit is not being set correctly, and so the MMU code needs to be fixed. Papering over this by adding I-cache maintenance doesn't seem the best course of action tbh. -- Ard. [-- Attachment #2: Type: text/html, Size: 8471 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2022-02-26 4:46 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-02-22 2:42 [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable Ashish Singhal 2022-02-23 5:07 ` Ashish Singhal 2022-02-23 7:02 ` Ard Biesheuvel 2022-02-23 8:58 ` Marc Zyngier 2022-02-23 17:36 ` Ashish Singhal 2022-02-23 17:40 ` [edk2-devel] " Ard Biesheuvel 2022-02-23 18:13 ` Ashish Singhal 2022-02-23 22:54 ` Ard Biesheuvel 2022-02-24 6:01 ` Ashish Singhal 2022-02-26 4:46 ` Ashish Singhal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox