public inbox for devel@edk2.groups.io
* [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops
@ 2020-02-26 10:03 Ard Biesheuvel
  2020-02-26 10:03 ` [PATCH 1/6] ArmPkg/ArmMmuLib ARM: remove dummy constructor Ard Biesheuvel
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2020-02-26 10:03 UTC (permalink / raw)
  To: devel; +Cc: leif, lersek, sami.mujawar, Ard Biesheuvel

As it turns out, there were still some instances of set/way ops left in
the core code, in ArmMmuLib to be precise.

This series fixes ArmMmuLib to perform the appropriate cache invalidation
when populating page tables with the MMU and caches off, allowing us to
get rid of the cache clean/disable/enable sequences which are incorrect
and pointless at the same time.

While at it, do some mild refactoring of the ARM version of ArmMmuLib
first, and finally, deprecate the set/way ops in ArmLib.

Note that the last patch needs coordination with edk2-platforms, as
there are a couple of users there which will need to be fixed, or
have their definitions of DISABLE_NEW_DEPRECATED_INTERFACES dropped.
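
For reference, the define in question typically lives in a platform's
.dsc [BuildOptions] section, along these lines (illustrative, not
lifted from any particular platform):

  [BuildOptions]
    # Drop this line, or fix the remaining set/way callers, before
    # taking the last patch in this series.
    *_*_*_CC_FLAGS = -D DISABLE_NEW_DEPRECATED_INTERFACES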

Ard Biesheuvel (6):
  ArmPkg/ArmMmuLib ARM: remove dummy constructor
  ArmPkg/ArmMmuLib ARM: split ArmMmuLibCore.c into core and update code
  ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries
  ArmPkg/ArmMmuLib AARCH64: cache-invalidate initial page table entries
  ArmPkg/ArmLib: move set/way helper functions into private header
  ArmPkg/ArmLib: deprecate set/way cache maintenance routines

 ArmPkg/Include/Library/ArmLib.h               |  22 +-
 ArmPkg/Library/ArmLib/AArch64/AArch64Lib.h    |  18 +
 ArmPkg/Library/ArmLib/AArch64/ArmLibSupport.S |   9 +-
 ArmPkg/Library/ArmLib/Arm/ArmV7Lib.h          |  18 +
 .../Library/ArmMmuLib/AArch64/ArmMmuLibCore.c |   9 -
 .../Library/ArmMmuLib/Arm/ArmMmuLibConvert.c  |  32 ++
 ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c  | 468 +-----------------
 .../Library/ArmMmuLib/Arm/ArmMmuLibUpdate.c   | 435 ++++++++++++++++
 ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf    |   4 +
 9 files changed, 530 insertions(+), 485 deletions(-)
 create mode 100644 ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibConvert.c
 create mode 100644 ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibUpdate.c

-- 
2.17.1



* [PATCH 1/6] ArmPkg/ArmMmuLib ARM: remove dummy constructor
  2020-02-26 10:03 [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Ard Biesheuvel
@ 2020-02-26 10:03 ` Ard Biesheuvel
  2020-02-26 10:03 ` [PATCH 2/6] ArmPkg/ArmMmuLib ARM: split ArmMmuLibCore.c into core and update code Ard Biesheuvel
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2020-02-26 10:03 UTC (permalink / raw)
  To: devel; +Cc: leif, lersek, sami.mujawar, Ard Biesheuvel

Make the CONSTRUCTOR define in the .INF AARCH64 only, so we can drop
the empty stub that exists for ARM.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c | 9 ---------
 ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf   | 2 ++
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
index 74ac31de98cc..a6601258bee0 100644
--- a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
+++ b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
@@ -830,12 +830,3 @@ ArmClearMemoryRegionReadOnly (
 {
   return ArmSetMemoryAttributes (BaseAddress, Length, __EFI_MEMORY_RWX);
 }
-
-RETURN_STATUS
-EFIAPI
-ArmMmuBaseLibConstructor (
-  VOID
-  )
-{
-  return RETURN_SUCCESS;
-}
diff --git a/ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf b/ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf
index 5028a955afac..3dfe68ba48a6 100644
--- a/ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf
+++ b/ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf
@@ -14,6 +14,8 @@ [Defines]
   MODULE_TYPE                    = BASE
   VERSION_STRING                 = 1.0
   LIBRARY_CLASS                  = ArmMmuLib
+
+[Defines.AARCH64]
   CONSTRUCTOR                    = ArmMmuBaseLibConstructor
 
 [Sources.AARCH64]
-- 
2.17.1



* [PATCH 2/6] ArmPkg/ArmMmuLib ARM: split ArmMmuLibCore.c into core and update code
  2020-02-26 10:03 [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Ard Biesheuvel
  2020-02-26 10:03 ` [PATCH 1/6] ArmPkg/ArmMmuLib ARM: remove dummy constructor Ard Biesheuvel
@ 2020-02-26 10:03 ` Ard Biesheuvel
  2020-02-26 10:03 ` [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries Ard Biesheuvel
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2020-02-26 10:03 UTC (permalink / raw)
  To: devel; +Cc: leif, lersek, sami.mujawar, Ard Biesheuvel

Unlike the AArch64 implementation of ArmMmuLib, which combines the
initial page table population code with the code that runs at later
stages to manage permission attributes in the page tables, ARM uses
two completely separate sets of routines for this.

Since ArmMmuLib is a static library, splitting these routines into
separate source files prevents users, which usually need only one or
the other, from carrying copies of code they never call. (Note that
LTO should achieve the same, where available.)

This also makes it easier to reason about modifying the cache
maintenance handling, and to replace the set/way ops with by-VA
ops, since the code that performs the set/way ops only executes
while the MMU is still off.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibConvert.c |  32 ++
 ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c    | 434 -------------------
 ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibUpdate.c  | 435 ++++++++++++++++++++
 ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf      |   2 +
 4 files changed, 469 insertions(+), 434 deletions(-)

diff --git a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibConvert.c b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibConvert.c
new file mode 100644
index 000000000000..e3b02a9fba57
--- /dev/null
+++ b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibConvert.c
@@ -0,0 +1,32 @@
+/** @file
+*  File managing the MMU for ARMv7 architecture
+*
+*  Copyright (c) 2011-2016, ARM Limited. All rights reserved.
+*
+*  SPDX-License-Identifier: BSD-2-Clause-Patent
+*
+**/
+
+#include <Uefi.h>
+
+#include <Library/ArmLib.h>
+
+#include <Chipset/ArmV7.h>
+
+UINT32
+ConvertSectionAttributesToPageAttributes (
+  IN UINT32   SectionAttributes,
+  IN BOOLEAN  IsLargePage
+  )
+{
+  UINT32 PageAttributes;
+
+  PageAttributes = 0;
+  PageAttributes |= TT_DESCRIPTOR_CONVERT_TO_PAGE_CACHE_POLICY (SectionAttributes, IsLargePage);
+  PageAttributes |= TT_DESCRIPTOR_CONVERT_TO_PAGE_AP (SectionAttributes);
+  PageAttributes |= TT_DESCRIPTOR_CONVERT_TO_PAGE_XN (SectionAttributes, IsLargePage);
+  PageAttributes |= TT_DESCRIPTOR_CONVERT_TO_PAGE_NG (SectionAttributes);
+  PageAttributes |= TT_DESCRIPTOR_CONVERT_TO_PAGE_S (SectionAttributes);
+
+  return PageAttributes;
+}
diff --git a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
index a6601258bee0..aca7a37facac 100644
--- a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
+++ b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
@@ -31,15 +31,6 @@
 #define ID_MMFR0_SHR_IMP_HW_COHERENT   1
 #define ID_MMFR0_SHR_IGNORED         0xf
 
-#define __EFI_MEMORY_RWX               0    // no restrictions
-
-#define CACHE_ATTRIBUTE_MASK   (EFI_MEMORY_UC | \
-                                EFI_MEMORY_WC | \
-                                EFI_MEMORY_WT | \
-                                EFI_MEMORY_WB | \
-                                EFI_MEMORY_UCE | \
-                                EFI_MEMORY_WP)
-
 UINTN
 EFIAPI
 ArmReadIdMmfr0 (
@@ -52,24 +43,6 @@ ArmHasMpExtensions (
   VOID
   );
 
-UINT32
-ConvertSectionAttributesToPageAttributes (
-  IN UINT32   SectionAttributes,
-  IN BOOLEAN  IsLargePage
-  )
-{
-  UINT32 PageAttributes;
-
-  PageAttributes = 0;
-  PageAttributes |= TT_DESCRIPTOR_CONVERT_TO_PAGE_CACHE_POLICY (SectionAttributes, IsLargePage);
-  PageAttributes |= TT_DESCRIPTOR_CONVERT_TO_PAGE_AP (SectionAttributes);
-  PageAttributes |= TT_DESCRIPTOR_CONVERT_TO_PAGE_XN (SectionAttributes, IsLargePage);
-  PageAttributes |= TT_DESCRIPTOR_CONVERT_TO_PAGE_NG (SectionAttributes);
-  PageAttributes |= TT_DESCRIPTOR_CONVERT_TO_PAGE_S (SectionAttributes);
-
-  return PageAttributes;
-}
-
 STATIC
 BOOLEAN
 PreferNonshareableMemory (
@@ -423,410 +396,3 @@ ArmConfigureMmu (
   ArmEnableMmu();
   return RETURN_SUCCESS;
 }
-
-STATIC
-EFI_STATUS
-ConvertSectionToPages (
-  IN EFI_PHYSICAL_ADDRESS  BaseAddress
-  )
-{
-  UINT32                  FirstLevelIdx;
-  UINT32                  SectionDescriptor;
-  UINT32                  PageTableDescriptor;
-  UINT32                  PageDescriptor;
-  UINT32                  Index;
-
-  volatile ARM_FIRST_LEVEL_DESCRIPTOR   *FirstLevelTable;
-  volatile ARM_PAGE_TABLE_ENTRY         *PageTable;
-
-  DEBUG ((EFI_D_PAGE, "Converting section at 0x%x to pages\n", (UINTN)BaseAddress));
-
-  // Obtain page table base
-  FirstLevelTable = (ARM_FIRST_LEVEL_DESCRIPTOR *)ArmGetTTBR0BaseAddress ();
-
-  // Calculate index into first level translation table for start of modification
-  FirstLevelIdx = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(BaseAddress) >> TT_DESCRIPTOR_SECTION_BASE_SHIFT;
-  ASSERT (FirstLevelIdx < TRANSLATION_TABLE_SECTION_COUNT);
-
-  // Get section attributes and convert to page attributes
-  SectionDescriptor = FirstLevelTable[FirstLevelIdx];
-  PageDescriptor = TT_DESCRIPTOR_PAGE_TYPE_PAGE | ConvertSectionAttributesToPageAttributes (SectionDescriptor, FALSE);
-
-  // Allocate a page table for the 4KB entries (we use up a full page even though we only need 1KB)
-  PageTable = (volatile ARM_PAGE_TABLE_ENTRY *)AllocatePages (1);
-  if (PageTable == NULL) {
-    return EFI_OUT_OF_RESOURCES;
-  }
-
-  // Write the page table entries out
-  for (Index = 0; Index < TRANSLATION_TABLE_PAGE_COUNT; Index++) {
-    PageTable[Index] = TT_DESCRIPTOR_PAGE_BASE_ADDRESS(BaseAddress + (Index << 12)) | PageDescriptor;
-  }
-
-  // Formulate page table entry, Domain=0, NS=0
-  PageTableDescriptor = (((UINTN)PageTable) & TT_DESCRIPTOR_SECTION_PAGETABLE_ADDRESS_MASK) | TT_DESCRIPTOR_SECTION_TYPE_PAGE_TABLE;
-
-  // Write the page table entry out, replacing section entry
-  FirstLevelTable[FirstLevelIdx] = PageTableDescriptor;
-
-  return EFI_SUCCESS;
-}
-
-STATIC
-EFI_STATUS
-UpdatePageEntries (
-  IN  EFI_PHYSICAL_ADDRESS      BaseAddress,
-  IN  UINT64                    Length,
-  IN  UINT64                    Attributes,
-  OUT BOOLEAN                   *FlushTlbs OPTIONAL
-  )
-{
-  EFI_STATUS    Status;
-  UINT32        EntryValue;
-  UINT32        EntryMask;
-  UINT32        FirstLevelIdx;
-  UINT32        Offset;
-  UINT32        NumPageEntries;
-  UINT32        Descriptor;
-  UINT32        p;
-  UINT32        PageTableIndex;
-  UINT32        PageTableEntry;
-  UINT32        CurrentPageTableEntry;
-  VOID          *Mva;
-
-  volatile ARM_FIRST_LEVEL_DESCRIPTOR   *FirstLevelTable;
-  volatile ARM_PAGE_TABLE_ENTRY         *PageTable;
-
-  Status = EFI_SUCCESS;
-
-  // EntryMask: bitmask of values to change (1 = change this value, 0 = leave alone)
-  // EntryValue: values at bit positions specified by EntryMask
-  EntryMask = TT_DESCRIPTOR_PAGE_TYPE_MASK | TT_DESCRIPTOR_PAGE_AP_MASK;
-  if (Attributes & EFI_MEMORY_XP) {
-    EntryValue = TT_DESCRIPTOR_PAGE_TYPE_PAGE_XN;
-  } else {
-    EntryValue = TT_DESCRIPTOR_PAGE_TYPE_PAGE;
-  }
-
-  // Although the PI spec is unclear on this, the GCD guarantees that only
-  // one Attribute bit is set at a time, so the order of the conditionals below
-  // is irrelevant. If no memory attribute is specified, we preserve whatever
-  // memory type is set in the page tables, and update the permission attributes
-  // only.
-  if (Attributes & EFI_MEMORY_UC) {
-    // modify cacheability attributes
-    EntryMask |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_MASK;
-    // map to strongly ordered
-    EntryValue |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_STRONGLY_ORDERED; // TEX[2:0] = 0, C=0, B=0
-  } else if (Attributes & EFI_MEMORY_WC) {
-    // modify cacheability attributes
-    EntryMask |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_MASK;
-    // map to normal non-cachable
-    EntryValue |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_NON_CACHEABLE; // TEX [2:0]= 001 = 0x2, B=0, C=0
-  } else if (Attributes & EFI_MEMORY_WT) {
-    // modify cacheability attributes
-    EntryMask |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_MASK;
-    // write through with no-allocate
-    EntryValue |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_WRITE_THROUGH_NO_ALLOC; // TEX [2:0] = 0, C=1, B=0
-  } else if (Attributes & EFI_MEMORY_WB) {
-    // modify cacheability attributes
-    EntryMask |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_MASK;
-    // write back (with allocate)
-    EntryValue |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_WRITE_BACK_ALLOC; // TEX [2:0] = 001, C=1, B=1
-  } else if (Attributes & CACHE_ATTRIBUTE_MASK) {
-    // catch unsupported memory type attributes
-    ASSERT (FALSE);
-    return EFI_UNSUPPORTED;
-  }
-
-  if (Attributes & EFI_MEMORY_RO) {
-    EntryValue |= TT_DESCRIPTOR_PAGE_AP_RO_RO;
-  } else {
-    EntryValue |= TT_DESCRIPTOR_PAGE_AP_RW_RW;
-  }
-
-  // Obtain page table base
-  FirstLevelTable = (ARM_FIRST_LEVEL_DESCRIPTOR *)ArmGetTTBR0BaseAddress ();
-
-  // Calculate number of 4KB page table entries to change
-  NumPageEntries = Length / TT_DESCRIPTOR_PAGE_SIZE;
-
-  // Iterate for the number of 4KB pages to change
-  Offset = 0;
-  for(p = 0; p < NumPageEntries; p++) {
-    // Calculate index into first level translation table for page table value
-
-    FirstLevelIdx = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(BaseAddress + Offset) >> TT_DESCRIPTOR_SECTION_BASE_SHIFT;
-    ASSERT (FirstLevelIdx < TRANSLATION_TABLE_SECTION_COUNT);
-
-    // Read the descriptor from the first level page table
-    Descriptor = FirstLevelTable[FirstLevelIdx];
-
-    // Does this descriptor need to be converted from section entry to 4K pages?
-    if (!TT_DESCRIPTOR_SECTION_TYPE_IS_PAGE_TABLE(Descriptor)) {
-      Status = ConvertSectionToPages (FirstLevelIdx << TT_DESCRIPTOR_SECTION_BASE_SHIFT);
-      if (EFI_ERROR(Status)) {
-        // Exit for loop
-        break;
-      }
-
-      // Re-read descriptor
-      Descriptor = FirstLevelTable[FirstLevelIdx];
-      if (FlushTlbs != NULL) {
-        *FlushTlbs = TRUE;
-      }
-    }
-
-    // Obtain page table base address
-    PageTable = (ARM_PAGE_TABLE_ENTRY *)TT_DESCRIPTOR_PAGE_BASE_ADDRESS(Descriptor);
-
-    // Calculate index into the page table
-    PageTableIndex = ((BaseAddress + Offset) & TT_DESCRIPTOR_PAGE_INDEX_MASK) >> TT_DESCRIPTOR_PAGE_BASE_SHIFT;
-    ASSERT (PageTableIndex < TRANSLATION_TABLE_PAGE_COUNT);
-
-    // Get the entry
-    CurrentPageTableEntry = PageTable[PageTableIndex];
-
-    // Mask off appropriate fields
-    PageTableEntry = CurrentPageTableEntry & ~EntryMask;
-
-    // Mask in new attributes and/or permissions
-    PageTableEntry |= EntryValue;
-
-    if (CurrentPageTableEntry  != PageTableEntry) {
-      Mva = (VOID *)(UINTN)((((UINTN)FirstLevelIdx) << TT_DESCRIPTOR_SECTION_BASE_SHIFT) + (PageTableIndex << TT_DESCRIPTOR_PAGE_BASE_SHIFT));
-
-      // Only need to update if we are changing the entry
-      PageTable[PageTableIndex] = PageTableEntry;
-      ArmUpdateTranslationTableEntry ((VOID *)&PageTable[PageTableIndex], Mva);
-    }
-
-    Status = EFI_SUCCESS;
-    Offset += TT_DESCRIPTOR_PAGE_SIZE;
-
-  } // End first level translation table loop
-
-  return Status;
-}
-
-STATIC
-EFI_STATUS
-UpdateSectionEntries (
-  IN EFI_PHYSICAL_ADDRESS      BaseAddress,
-  IN UINT64                    Length,
-  IN UINT64                    Attributes
-  )
-{
-  EFI_STATUS    Status = EFI_SUCCESS;
-  UINT32        EntryMask;
-  UINT32        EntryValue;
-  UINT32        FirstLevelIdx;
-  UINT32        NumSections;
-  UINT32        i;
-  UINT32        CurrentDescriptor;
-  UINT32        Descriptor;
-  VOID          *Mva;
-  volatile ARM_FIRST_LEVEL_DESCRIPTOR   *FirstLevelTable;
-
-  // EntryMask: bitmask of values to change (1 = change this value, 0 = leave alone)
-  // EntryValue: values at bit positions specified by EntryMask
-
-  // Make sure we handle a section range that is unmapped
-  EntryMask = TT_DESCRIPTOR_SECTION_TYPE_MASK | TT_DESCRIPTOR_SECTION_XN_MASK |
-              TT_DESCRIPTOR_SECTION_AP_MASK;
-  EntryValue = TT_DESCRIPTOR_SECTION_TYPE_SECTION;
-
-  // Although the PI spec is unclear on this, the GCD guarantees that only
-  // one Attribute bit is set at a time, so the order of the conditionals below
-  // is irrelevant. If no memory attribute is specified, we preserve whatever
-  // memory type is set in the page tables, and update the permission attributes
-  // only.
-  if (Attributes & EFI_MEMORY_UC) {
-    // modify cacheability attributes
-    EntryMask |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_MASK;
-    // map to strongly ordered
-    EntryValue |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_STRONGLY_ORDERED; // TEX[2:0] = 0, C=0, B=0
-  } else if (Attributes & EFI_MEMORY_WC) {
-    // modify cacheability attributes
-    EntryMask |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_MASK;
-    // map to normal non-cachable
-    EntryValue |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_NON_CACHEABLE; // TEX [2:0]= 001 = 0x2, B=0, C=0
-  } else if (Attributes & EFI_MEMORY_WT) {
-    // modify cacheability attributes
-    EntryMask |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_MASK;
-    // write through with no-allocate
-    EntryValue |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_WRITE_THROUGH_NO_ALLOC; // TEX [2:0] = 0, C=1, B=0
-  } else if (Attributes & EFI_MEMORY_WB) {
-    // modify cacheability attributes
-    EntryMask |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_MASK;
-    // write back (with allocate)
-    EntryValue |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_WRITE_BACK_ALLOC; // TEX [2:0] = 001, C=1, B=1
-  } else if (Attributes & CACHE_ATTRIBUTE_MASK) {
-    // catch unsupported memory type attributes
-    ASSERT (FALSE);
-    return EFI_UNSUPPORTED;
-  }
-
-  if (Attributes & EFI_MEMORY_RO) {
-    EntryValue |= TT_DESCRIPTOR_SECTION_AP_RO_RO;
-  } else {
-    EntryValue |= TT_DESCRIPTOR_SECTION_AP_RW_RW;
-  }
-
-  if (Attributes & EFI_MEMORY_XP) {
-    EntryValue |= TT_DESCRIPTOR_SECTION_XN_MASK;
-  }
-
-  // obtain page table base
-  FirstLevelTable = (ARM_FIRST_LEVEL_DESCRIPTOR *)ArmGetTTBR0BaseAddress ();
-
-  // calculate index into first level translation table for start of modification
-  FirstLevelIdx = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(BaseAddress) >> TT_DESCRIPTOR_SECTION_BASE_SHIFT;
-  ASSERT (FirstLevelIdx < TRANSLATION_TABLE_SECTION_COUNT);
-
-  // calculate number of 1MB first level entries this applies to
-  NumSections = Length / TT_DESCRIPTOR_SECTION_SIZE;
-
-  // iterate through each descriptor
-  for(i=0; i<NumSections; i++) {
-    CurrentDescriptor = FirstLevelTable[FirstLevelIdx + i];
-
-    // has this descriptor already been converted to pages?
-    if (TT_DESCRIPTOR_SECTION_TYPE_IS_PAGE_TABLE(CurrentDescriptor)) {
-      // forward this 1MB range to page table function instead
-      Status = UpdatePageEntries (
-                 (FirstLevelIdx + i) << TT_DESCRIPTOR_SECTION_BASE_SHIFT,
-                 TT_DESCRIPTOR_SECTION_SIZE,
-                 Attributes,
-                 NULL);
-    } else {
-      // still a section entry
-
-      if (CurrentDescriptor != 0) {
-        // mask off appropriate fields
-        Descriptor = CurrentDescriptor & ~EntryMask;
-      } else {
-        Descriptor = ((UINTN)FirstLevelIdx + i) << TT_DESCRIPTOR_SECTION_BASE_SHIFT;
-      }
-
-      // mask in new attributes and/or permissions
-      Descriptor |= EntryValue;
-
-      if (CurrentDescriptor  != Descriptor) {
-        Mva = (VOID *)(UINTN)(((UINTN)FirstLevelIdx + i) << TT_DESCRIPTOR_SECTION_BASE_SHIFT);
-
-        // Only need to update if we are changing the descriptor
-        FirstLevelTable[FirstLevelIdx + i] = Descriptor;
-        ArmUpdateTranslationTableEntry ((VOID *)&FirstLevelTable[FirstLevelIdx + i], Mva);
-      }
-
-      Status = EFI_SUCCESS;
-    }
-  }
-
-  return Status;
-}
-
-EFI_STATUS
-ArmSetMemoryAttributes (
-  IN EFI_PHYSICAL_ADDRESS      BaseAddress,
-  IN UINT64                    Length,
-  IN UINT64                    Attributes
-  )
-{
-  EFI_STATUS    Status;
-  UINT64        ChunkLength;
-  BOOLEAN       FlushTlbs;
-
-  if (BaseAddress > (UINT64)MAX_ADDRESS) {
-    return EFI_UNSUPPORTED;
-  }
-
-  Length = MIN (Length, (UINT64)MAX_ADDRESS - BaseAddress + 1);
-  if (Length == 0) {
-    return EFI_SUCCESS;
-  }
-
-  FlushTlbs = FALSE;
-  while (Length > 0) {
-    if ((BaseAddress % TT_DESCRIPTOR_SECTION_SIZE == 0) &&
-        Length >= TT_DESCRIPTOR_SECTION_SIZE) {
-
-      ChunkLength = Length - Length % TT_DESCRIPTOR_SECTION_SIZE;
-
-      DEBUG ((DEBUG_PAGE,
-        "SetMemoryAttributes(): MMU section 0x%lx length 0x%lx to %lx\n",
-        BaseAddress, ChunkLength, Attributes));
-
-      Status = UpdateSectionEntries (BaseAddress, ChunkLength, Attributes);
-
-      FlushTlbs = TRUE;
-    } else {
-
-      //
-      // Process page by page until the next section boundary, but only if
-      // we have more than a section's worth of area to deal with after that.
-      //
-      ChunkLength = TT_DESCRIPTOR_SECTION_SIZE -
-                    (BaseAddress % TT_DESCRIPTOR_SECTION_SIZE);
-      if (ChunkLength + TT_DESCRIPTOR_SECTION_SIZE > Length) {
-        ChunkLength = Length;
-      }
-
-      DEBUG ((DEBUG_PAGE,
-        "SetMemoryAttributes(): MMU page 0x%lx length 0x%lx to %lx\n",
-        BaseAddress, ChunkLength, Attributes));
-
-      Status = UpdatePageEntries (BaseAddress, ChunkLength, Attributes,
-                 &FlushTlbs);
-    }
-
-    if (EFI_ERROR (Status)) {
-      break;
-    }
-
-    BaseAddress += ChunkLength;
-    Length -= ChunkLength;
-  }
-
-  if (FlushTlbs) {
-    ArmInvalidateTlb ();
-  }
-  return Status;
-}
-
-EFI_STATUS
-ArmSetMemoryRegionNoExec (
-  IN  EFI_PHYSICAL_ADDRESS      BaseAddress,
-  IN  UINT64                    Length
-  )
-{
-  return ArmSetMemoryAttributes (BaseAddress, Length, EFI_MEMORY_XP);
-}
-
-EFI_STATUS
-ArmClearMemoryRegionNoExec (
-  IN  EFI_PHYSICAL_ADDRESS      BaseAddress,
-  IN  UINT64                    Length
-  )
-{
-  return ArmSetMemoryAttributes (BaseAddress, Length, __EFI_MEMORY_RWX);
-}
-
-EFI_STATUS
-ArmSetMemoryRegionReadOnly (
-  IN  EFI_PHYSICAL_ADDRESS      BaseAddress,
-  IN  UINT64                    Length
-  )
-{
-  return ArmSetMemoryAttributes (BaseAddress, Length, EFI_MEMORY_RO);
-}
-
-EFI_STATUS
-ArmClearMemoryRegionReadOnly (
-  IN  EFI_PHYSICAL_ADDRESS      BaseAddress,
-  IN  UINT64                    Length
-  )
-{
-  return ArmSetMemoryAttributes (BaseAddress, Length, __EFI_MEMORY_RWX);
-}
diff --git a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibUpdate.c b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibUpdate.c
new file mode 100644
index 000000000000..3dafe1d964cd
--- /dev/null
+++ b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibUpdate.c
@@ -0,0 +1,435 @@
+/** @file
+*  File managing the MMU for ARMv7 architecture
+*
+*  Copyright (c) 2011-2016, ARM Limited. All rights reserved.
+*
+*  SPDX-License-Identifier: BSD-2-Clause-Patent
+*
+**/
+
+#include <Uefi.h>
+
+#include <Library/ArmLib.h>
+#include <Library/BaseLib.h>
+#include <Library/BaseMemoryLib.h>
+#include <Library/DebugLib.h>
+#include <Library/CacheMaintenanceLib.h>
+#include <Library/MemoryAllocationLib.h>
+
+#include <Chipset/ArmV7.h>
+
+#define __EFI_MEMORY_RWX               0    // no restrictions
+
+#define CACHE_ATTRIBUTE_MASK   (EFI_MEMORY_UC | \
+                                EFI_MEMORY_WC | \
+                                EFI_MEMORY_WT | \
+                                EFI_MEMORY_WB | \
+                                EFI_MEMORY_UCE | \
+                                EFI_MEMORY_WP)
+
+STATIC
+EFI_STATUS
+ConvertSectionToPages (
+  IN EFI_PHYSICAL_ADDRESS  BaseAddress
+  )
+{
+  UINT32                  FirstLevelIdx;
+  UINT32                  SectionDescriptor;
+  UINT32                  PageTableDescriptor;
+  UINT32                  PageDescriptor;
+  UINT32                  Index;
+
+  volatile ARM_FIRST_LEVEL_DESCRIPTOR   *FirstLevelTable;
+  volatile ARM_PAGE_TABLE_ENTRY         *PageTable;
+
+  DEBUG ((EFI_D_PAGE, "Converting section at 0x%x to pages\n", (UINTN)BaseAddress));
+
+  // Obtain page table base
+  FirstLevelTable = (ARM_FIRST_LEVEL_DESCRIPTOR *)ArmGetTTBR0BaseAddress ();
+
+  // Calculate index into first level translation table for start of modification
+  FirstLevelIdx = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(BaseAddress) >> TT_DESCRIPTOR_SECTION_BASE_SHIFT;
+  ASSERT (FirstLevelIdx < TRANSLATION_TABLE_SECTION_COUNT);
+
+  // Get section attributes and convert to page attributes
+  SectionDescriptor = FirstLevelTable[FirstLevelIdx];
+  PageDescriptor = TT_DESCRIPTOR_PAGE_TYPE_PAGE | ConvertSectionAttributesToPageAttributes (SectionDescriptor, FALSE);
+
+  // Allocate a page table for the 4KB entries (we use up a full page even though we only need 1KB)
+  PageTable = (volatile ARM_PAGE_TABLE_ENTRY *)AllocatePages (1);
+  if (PageTable == NULL) {
+    return EFI_OUT_OF_RESOURCES;
+  }
+
+  // Write the page table entries out
+  for (Index = 0; Index < TRANSLATION_TABLE_PAGE_COUNT; Index++) {
+    PageTable[Index] = TT_DESCRIPTOR_PAGE_BASE_ADDRESS(BaseAddress + (Index << 12)) | PageDescriptor;
+  }
+
+  // Formulate page table entry, Domain=0, NS=0
+  PageTableDescriptor = (((UINTN)PageTable) & TT_DESCRIPTOR_SECTION_PAGETABLE_ADDRESS_MASK) | TT_DESCRIPTOR_SECTION_TYPE_PAGE_TABLE;
+
+  // Write the page table entry out, replacing section entry
+  FirstLevelTable[FirstLevelIdx] = PageTableDescriptor;
+
+  return EFI_SUCCESS;
+}
+
+STATIC
+EFI_STATUS
+UpdatePageEntries (
+  IN  EFI_PHYSICAL_ADDRESS      BaseAddress,
+  IN  UINT64                    Length,
+  IN  UINT64                    Attributes,
+  OUT BOOLEAN                   *FlushTlbs OPTIONAL
+  )
+{
+  EFI_STATUS    Status;
+  UINT32        EntryValue;
+  UINT32        EntryMask;
+  UINT32        FirstLevelIdx;
+  UINT32        Offset;
+  UINT32        NumPageEntries;
+  UINT32        Descriptor;
+  UINT32        p;
+  UINT32        PageTableIndex;
+  UINT32        PageTableEntry;
+  UINT32        CurrentPageTableEntry;
+  VOID          *Mva;
+
+  volatile ARM_FIRST_LEVEL_DESCRIPTOR   *FirstLevelTable;
+  volatile ARM_PAGE_TABLE_ENTRY         *PageTable;
+
+  Status = EFI_SUCCESS;
+
+  // EntryMask: bitmask of values to change (1 = change this value, 0 = leave alone)
+  // EntryValue: values at bit positions specified by EntryMask
+  EntryMask = TT_DESCRIPTOR_PAGE_TYPE_MASK | TT_DESCRIPTOR_PAGE_AP_MASK;
+  if (Attributes & EFI_MEMORY_XP) {
+    EntryValue = TT_DESCRIPTOR_PAGE_TYPE_PAGE_XN;
+  } else {
+    EntryValue = TT_DESCRIPTOR_PAGE_TYPE_PAGE;
+  }
+
+  // Although the PI spec is unclear on this, the GCD guarantees that only
+  // one Attribute bit is set at a time, so the order of the conditionals below
+  // is irrelevant. If no memory attribute is specified, we preserve whatever
+  // memory type is set in the page tables, and update the permission attributes
+  // only.
+  if (Attributes & EFI_MEMORY_UC) {
+    // modify cacheability attributes
+    EntryMask |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_MASK;
+    // map to strongly ordered
+    EntryValue |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_STRONGLY_ORDERED; // TEX[2:0] = 0, C=0, B=0
+  } else if (Attributes & EFI_MEMORY_WC) {
+    // modify cacheability attributes
+    EntryMask |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_MASK;
+    // map to normal non-cachable
+    EntryValue |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_NON_CACHEABLE; // TEX [2:0]= 001 = 0x2, B=0, C=0
+  } else if (Attributes & EFI_MEMORY_WT) {
+    // modify cacheability attributes
+    EntryMask |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_MASK;
+    // write through with no-allocate
+    EntryValue |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_WRITE_THROUGH_NO_ALLOC; // TEX [2:0] = 0, C=1, B=0
+  } else if (Attributes & EFI_MEMORY_WB) {
+    // modify cacheability attributes
+    EntryMask |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_MASK;
+    // write back (with allocate)
+    EntryValue |= TT_DESCRIPTOR_PAGE_CACHE_POLICY_WRITE_BACK_ALLOC; // TEX [2:0] = 001, C=1, B=1
+  } else if (Attributes & CACHE_ATTRIBUTE_MASK) {
+    // catch unsupported memory type attributes
+    ASSERT (FALSE);
+    return EFI_UNSUPPORTED;
+  }
+
+  if (Attributes & EFI_MEMORY_RO) {
+    EntryValue |= TT_DESCRIPTOR_PAGE_AP_RO_RO;
+  } else {
+    EntryValue |= TT_DESCRIPTOR_PAGE_AP_RW_RW;
+  }
+
+  // Obtain page table base
+  FirstLevelTable = (ARM_FIRST_LEVEL_DESCRIPTOR *)ArmGetTTBR0BaseAddress ();
+
+  // Calculate number of 4KB page table entries to change
+  NumPageEntries = Length / TT_DESCRIPTOR_PAGE_SIZE;
+
+  // Iterate for the number of 4KB pages to change
+  Offset = 0;
+  for(p = 0; p < NumPageEntries; p++) {
+    // Calculate index into first level translation table for page table value
+
+    FirstLevelIdx = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(BaseAddress + Offset) >> TT_DESCRIPTOR_SECTION_BASE_SHIFT;
+    ASSERT (FirstLevelIdx < TRANSLATION_TABLE_SECTION_COUNT);
+
+    // Read the descriptor from the first level page table
+    Descriptor = FirstLevelTable[FirstLevelIdx];
+
+    // Does this descriptor need to be converted from section entry to 4K pages?
+    if (!TT_DESCRIPTOR_SECTION_TYPE_IS_PAGE_TABLE(Descriptor)) {
+      Status = ConvertSectionToPages (FirstLevelIdx << TT_DESCRIPTOR_SECTION_BASE_SHIFT);
+      if (EFI_ERROR(Status)) {
+        // Exit for loop
+        break;
+      }
+
+      // Re-read descriptor
+      Descriptor = FirstLevelTable[FirstLevelIdx];
+      if (FlushTlbs != NULL) {
+        *FlushTlbs = TRUE;
+      }
+    }
+
+    // Obtain page table base address
+    PageTable = (ARM_PAGE_TABLE_ENTRY *)TT_DESCRIPTOR_PAGE_BASE_ADDRESS(Descriptor);
+
+    // Calculate index into the page table
+    PageTableIndex = ((BaseAddress + Offset) & TT_DESCRIPTOR_PAGE_INDEX_MASK) >> TT_DESCRIPTOR_PAGE_BASE_SHIFT;
+    ASSERT (PageTableIndex < TRANSLATION_TABLE_PAGE_COUNT);
+
+    // Get the entry
+    CurrentPageTableEntry = PageTable[PageTableIndex];
+
+    // Mask off appropriate fields
+    PageTableEntry = CurrentPageTableEntry & ~EntryMask;
+
+    // Mask in new attributes and/or permissions
+    PageTableEntry |= EntryValue;
+
+    if (CurrentPageTableEntry  != PageTableEntry) {
+      Mva = (VOID *)(UINTN)((((UINTN)FirstLevelIdx) << TT_DESCRIPTOR_SECTION_BASE_SHIFT) + (PageTableIndex << TT_DESCRIPTOR_PAGE_BASE_SHIFT));
+
+      // Only need to update if we are changing the entry
+      PageTable[PageTableIndex] = PageTableEntry;
+      ArmUpdateTranslationTableEntry ((VOID *)&PageTable[PageTableIndex], Mva);
+    }
+
+    Status = EFI_SUCCESS;
+    Offset += TT_DESCRIPTOR_PAGE_SIZE;
+
+  } // End first level translation table loop
+
+  return Status;
+}
+
+STATIC
+EFI_STATUS
+UpdateSectionEntries (
+  IN EFI_PHYSICAL_ADDRESS      BaseAddress,
+  IN UINT64                    Length,
+  IN UINT64                    Attributes
+  )
+{
+  EFI_STATUS    Status = EFI_SUCCESS;
+  UINT32        EntryMask;
+  UINT32        EntryValue;
+  UINT32        FirstLevelIdx;
+  UINT32        NumSections;
+  UINT32        i;
+  UINT32        CurrentDescriptor;
+  UINT32        Descriptor;
+  VOID          *Mva;
+  volatile ARM_FIRST_LEVEL_DESCRIPTOR   *FirstLevelTable;
+
+  // EntryMask: bitmask of values to change (1 = change this value, 0 = leave alone)
+  // EntryValue: values at bit positions specified by EntryMask
+
+  // Make sure we handle a section range that is unmapped
+  EntryMask = TT_DESCRIPTOR_SECTION_TYPE_MASK | TT_DESCRIPTOR_SECTION_XN_MASK |
+              TT_DESCRIPTOR_SECTION_AP_MASK;
+  EntryValue = TT_DESCRIPTOR_SECTION_TYPE_SECTION;
+
+  // Although the PI spec is unclear on this, the GCD guarantees that only
+  // one Attribute bit is set at a time, so the order of the conditionals below
+  // is irrelevant. If no memory attribute is specified, we preserve whatever
+  // memory type is set in the page tables, and update the permission attributes
+  // only.
+  if (Attributes & EFI_MEMORY_UC) {
+    // modify cacheability attributes
+    EntryMask |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_MASK;
+    // map to strongly ordered
+    EntryValue |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_STRONGLY_ORDERED; // TEX[2:0] = 0, C=0, B=0
+  } else if (Attributes & EFI_MEMORY_WC) {
+    // modify cacheability attributes
+    EntryMask |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_MASK;
+    // map to normal non-cachable
+    EntryValue |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_NON_CACHEABLE; // TEX [2:0]= 001 = 0x2, B=0, C=0
+  } else if (Attributes & EFI_MEMORY_WT) {
+    // modify cacheability attributes
+    EntryMask |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_MASK;
+    // write through with no-allocate
+    EntryValue |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_WRITE_THROUGH_NO_ALLOC; // TEX [2:0] = 0, C=1, B=0
+  } else if (Attributes & EFI_MEMORY_WB) {
+    // modify cacheability attributes
+    EntryMask |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_MASK;
+    // write back (with allocate)
+    EntryValue |= TT_DESCRIPTOR_SECTION_CACHE_POLICY_WRITE_BACK_ALLOC; // TEX [2:0] = 001, C=1, B=1
+  } else if (Attributes & CACHE_ATTRIBUTE_MASK) {
+    // catch unsupported memory type attributes
+    ASSERT (FALSE);
+    return EFI_UNSUPPORTED;
+  }
+
+  if (Attributes & EFI_MEMORY_RO) {
+    EntryValue |= TT_DESCRIPTOR_SECTION_AP_RO_RO;
+  } else {
+    EntryValue |= TT_DESCRIPTOR_SECTION_AP_RW_RW;
+  }
+
+  if (Attributes & EFI_MEMORY_XP) {
+    EntryValue |= TT_DESCRIPTOR_SECTION_XN_MASK;
+  }
+
+  // obtain page table base
+  FirstLevelTable = (ARM_FIRST_LEVEL_DESCRIPTOR *)ArmGetTTBR0BaseAddress ();
+
+  // calculate index into first level translation table for start of modification
+  FirstLevelIdx = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(BaseAddress) >> TT_DESCRIPTOR_SECTION_BASE_SHIFT;
+  ASSERT (FirstLevelIdx < TRANSLATION_TABLE_SECTION_COUNT);
+
+  // calculate number of 1MB first level entries this applies to
+  NumSections = Length / TT_DESCRIPTOR_SECTION_SIZE;
+
+  // iterate through each descriptor
+  for(i=0; i<NumSections; i++) {
+    CurrentDescriptor = FirstLevelTable[FirstLevelIdx + i];
+
+    // has this descriptor already been converted to pages?
+    if (TT_DESCRIPTOR_SECTION_TYPE_IS_PAGE_TABLE(CurrentDescriptor)) {
+      // forward this 1MB range to page table function instead
+      Status = UpdatePageEntries (
+                 (FirstLevelIdx + i) << TT_DESCRIPTOR_SECTION_BASE_SHIFT,
+                 TT_DESCRIPTOR_SECTION_SIZE,
+                 Attributes,
+                 NULL);
+    } else {
+      // still a section entry
+
+      if (CurrentDescriptor != 0) {
+        // mask off appropriate fields
+        Descriptor = CurrentDescriptor & ~EntryMask;
+      } else {
+        Descriptor = ((UINTN)FirstLevelIdx + i) << TT_DESCRIPTOR_SECTION_BASE_SHIFT;
+      }
+
+      // mask in new attributes and/or permissions
+      Descriptor |= EntryValue;
+
+      if (CurrentDescriptor  != Descriptor) {
+        Mva = (VOID *)(UINTN)(((UINTN)FirstLevelIdx + i) << TT_DESCRIPTOR_SECTION_BASE_SHIFT);
+
+        // Only need to update if we are changing the descriptor
+        FirstLevelTable[FirstLevelIdx + i] = Descriptor;
+        ArmUpdateTranslationTableEntry ((VOID *)&FirstLevelTable[FirstLevelIdx + i], Mva);
+      }
+
+      Status = EFI_SUCCESS;
+    }
+  }
+
+  return Status;
+}
+
+EFI_STATUS
+ArmSetMemoryAttributes (
+  IN EFI_PHYSICAL_ADDRESS      BaseAddress,
+  IN UINT64                    Length,
+  IN UINT64                    Attributes
+  )
+{
+  EFI_STATUS    Status;
+  UINT64        ChunkLength;
+  BOOLEAN       FlushTlbs;
+
+  if (BaseAddress > (UINT64)MAX_ADDRESS) {
+    return EFI_UNSUPPORTED;
+  }
+
+  Length = MIN (Length, (UINT64)MAX_ADDRESS - BaseAddress + 1);
+  if (Length == 0) {
+    return EFI_SUCCESS;
+  }
+
+  FlushTlbs = FALSE;
+  while (Length > 0) {
+    if ((BaseAddress % TT_DESCRIPTOR_SECTION_SIZE == 0) &&
+        Length >= TT_DESCRIPTOR_SECTION_SIZE) {
+
+      ChunkLength = Length - Length % TT_DESCRIPTOR_SECTION_SIZE;
+
+      DEBUG ((DEBUG_PAGE,
+        "SetMemoryAttributes(): MMU section 0x%lx length 0x%lx to %lx\n",
+        BaseAddress, ChunkLength, Attributes));
+
+      Status = UpdateSectionEntries (BaseAddress, ChunkLength, Attributes);
+
+      FlushTlbs = TRUE;
+    } else {
+
+      //
+      // Process page by page until the next section boundary, but only if
+      // we have more than a section's worth of area to deal with after that.
+      //
+      ChunkLength = TT_DESCRIPTOR_SECTION_SIZE -
+                    (BaseAddress % TT_DESCRIPTOR_SECTION_SIZE);
+      if (ChunkLength + TT_DESCRIPTOR_SECTION_SIZE > Length) {
+        ChunkLength = Length;
+      }
+
+      DEBUG ((DEBUG_PAGE,
+        "SetMemoryAttributes(): MMU page 0x%lx length 0x%lx to %lx\n",
+        BaseAddress, ChunkLength, Attributes));
+
+      Status = UpdatePageEntries (BaseAddress, ChunkLength, Attributes,
+                 &FlushTlbs);
+    }
+
+    if (EFI_ERROR (Status)) {
+      break;
+    }
+
+    BaseAddress += ChunkLength;
+    Length -= ChunkLength;
+  }
+
+  if (FlushTlbs) {
+    ArmInvalidateTlb ();
+  }
+  return Status;
+}
+
+EFI_STATUS
+ArmSetMemoryRegionNoExec (
+  IN  EFI_PHYSICAL_ADDRESS      BaseAddress,
+  IN  UINT64                    Length
+  )
+{
+  return ArmSetMemoryAttributes (BaseAddress, Length, EFI_MEMORY_XP);
+}
+
+EFI_STATUS
+ArmClearMemoryRegionNoExec (
+  IN  EFI_PHYSICAL_ADDRESS      BaseAddress,
+  IN  UINT64                    Length
+  )
+{
+  return ArmSetMemoryAttributes (BaseAddress, Length, __EFI_MEMORY_RWX);
+}
+
+EFI_STATUS
+ArmSetMemoryRegionReadOnly (
+  IN  EFI_PHYSICAL_ADDRESS      BaseAddress,
+  IN  UINT64                    Length
+  )
+{
+  return ArmSetMemoryAttributes (BaseAddress, Length, EFI_MEMORY_RO);
+}
+
+EFI_STATUS
+ArmClearMemoryRegionReadOnly (
+  IN  EFI_PHYSICAL_ADDRESS      BaseAddress,
+  IN  UINT64                    Length
+  )
+{
+  return ArmSetMemoryAttributes (BaseAddress, Length, __EFI_MEMORY_RWX);
+}
diff --git a/ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf b/ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf
index 3dfe68ba48a6..2a7e7147958c 100644
--- a/ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf
+++ b/ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf
@@ -23,7 +23,9 @@ [Sources.AARCH64]
   AArch64/ArmMmuLibReplaceEntry.S
 
 [Sources.ARM]
+  Arm/ArmMmuLibConvert.c
   Arm/ArmMmuLibCore.c
+  Arm/ArmMmuLibUpdate.c
   Arm/ArmMmuLibV7Support.S   |GCC
   Arm/ArmMmuLibV7Support.asm |RVCT
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries
  2020-02-26 10:03 [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Ard Biesheuvel
  2020-02-26 10:03 ` [PATCH 1/6] ArmPkg/ArmMmuLib ARM: remove dummy constructor Ard Biesheuvel
  2020-02-26 10:03 ` [PATCH 2/6] ArmPkg/ArmMmuLib ARM: split ArmMmuLibCore.c into core and update code Ard Biesheuvel
@ 2020-02-26 10:03 ` Ard Biesheuvel
  2020-02-26 10:37   ` Ard Biesheuvel
  2020-03-02 12:25   ` [edk2-devel] " Leif Lindholm
  2020-02-26 10:03 ` [PATCH 4/6] ArmPkg/ArmMmuLib AARCH64: " Ard Biesheuvel
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2020-02-26 10:03 UTC (permalink / raw)
  To: devel; +Cc: leif, lersek, sami.mujawar, Ard Biesheuvel

In the ARM version of ArmMmuLib, we are currently relying on set/way
invalidation to ensure that the caches are in a consistent state with
respect to main memory once we turn the MMU on. Even if set/way
operations were the appropriate method to achieve this, doing an
invalidate-all first and then populating the page table entries creates
a window where page table entries could be loaded speculatively into
the caches before we modify them, and shadow the new values that we
write there.

So let's get rid of the blanket clean/invalidate operations, and
instead, update ArmUpdateTranslationTableEntry () to invalidate each
page table entry *after* it is written if the MMU is still disabled
at this point.

On ARMv7, cache maintenance may be required also when the MMU is
enabled, in case the page table walker is not cache coherent. However,
the code being updated here is guaranteed to run only when the MMU is
still off, and so we can disregard the case when the MMU and caches
are on.

Since the MMU and D-cache are already off when we reach this point, we
can drop the MMU and D-cache disables as well. Maintenance of the I-cache
is unnecessary, since we are not modifying any code, and the installed
mapping is guaranteed to be 1:1. This means we can also leave it enabled
while the page table population code is running.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c | 25 +++++++++-----------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
index aca7a37facac..c5906b4310cc 100644
--- a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
+++ b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
@@ -183,6 +183,8 @@ PopulateLevel2PageTable (
     PhysicalBase += TT_DESCRIPTOR_PAGE_SIZE;
   }
 
+  InvalidateDataCacheRange ((UINT32 *)TranslationTable + FirstPageOffset,
+    RemainLength / TT_DESCRIPTOR_PAGE_SIZE * sizeof (*PageEntry));
 }
 
 STATIC
@@ -257,7 +259,11 @@ FillTranslationTable (
         RemainLength >= TT_DESCRIPTOR_SECTION_SIZE) {
       // Case: Physical address aligned on the Section Size (1MB) && the length
       // is greater than the Section Size
-      *SectionEntry++ = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
+      *SectionEntry = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
+
+      ArmDataSynchronizationBarrier ();
+      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
+
       PhysicalBase += TT_DESCRIPTOR_SECTION_SIZE;
       RemainLength -= TT_DESCRIPTOR_SECTION_SIZE;
     } else {
@@ -267,9 +273,12 @@ FillTranslationTable (
       // Case: Physical address aligned on the Section Size (1MB) && the length
       //       does not fill a section
       // Case: Physical address NOT aligned on the Section Size (1MB)
-      PopulateLevel2PageTable (SectionEntry++, PhysicalBase, PageMapLength,
+      PopulateLevel2PageTable (SectionEntry, PhysicalBase, PageMapLength,
         MemoryRegion->Attributes);
 
+      ArmDataSynchronizationBarrier ();
+      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
+
       // If it is the last entry
       if (RemainLength < TT_DESCRIPTOR_SECTION_SIZE) {
         break;
@@ -349,18 +358,6 @@ ArmConfigureMmu (
     }
   }
 
-  ArmCleanInvalidateDataCache ();
-  ArmInvalidateInstructionCache ();
-
-  ArmDisableDataCache ();
-  ArmDisableInstructionCache();
-  // TLBs are also invalidated when calling ArmDisableMmu()
-  ArmDisableMmu ();
-
-  // Make sure nothing sneaked into the cache
-  ArmCleanInvalidateDataCache ();
-  ArmInvalidateInstructionCache ();
-
   ArmSetTTBR0 ((VOID *)(UINTN)(((UINTN)TranslationTable & ~TRANSLATION_TABLE_SECTION_ALIGNMENT_MASK) | (TTBRAttributes & 0x7F)));
 
   //
-- 
2.17.1



* [PATCH 4/6] ArmPkg/ArmMmuLib AARCH64: cache-invalidate initial page table entries
  2020-02-26 10:03 [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2020-02-26 10:03 ` [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries Ard Biesheuvel
@ 2020-02-26 10:03 ` Ard Biesheuvel
  2020-02-26 10:03 ` [PATCH 5/6] ArmPkg/ArmLib: move set/way helper functions into private header Ard Biesheuvel
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2020-02-26 10:03 UTC (permalink / raw)
  To: devel; +Cc: leif, lersek, sami.mujawar, Ard Biesheuvel

In the AARCH64 version of ArmMmuLib, we are currently relying on
set/way invalidation to ensure that the caches are in a consistent
state with respect to main memory once we turn the MMU on. Even if
set/way operations were the appropriate method to achieve this, doing
an invalidate-all first and then populating the page table entries
creates a window where page table entries could be loaded speculatively
into the caches before we modify them, and shadow the new values that
we write there.

So let's get rid of the blanket clean/invalidate operations, and
instead, update ArmUpdateTranslationTableEntry () to invalidate each
page table entry *after* it is written if the MMU is still disabled
at this point.

On ARMv8, it is guaranteed that memory accesses done by the page table
walker are cache coherent, and so we can ignore the case where the
MMU is on.
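
In C-like pseudocode (helper names are hypothetical; the actual change
is in the ArmLibSupport.S hunk below), the updated routine amounts to:

  VOID
  ArmUpdateTranslationTableEntry (
    IN  VOID  *Entry,    // address of the descriptor that was written
    IN  VOID  *Mva       // VA covered by the descriptor
    )
  {
    TlbInvalidateVa (Mva);               // tlbi by VA at the current EL
    if ((ReadSctlr () & BIT0) == 0) {    // SCTLR_ELx.M clear: MMU still off
      DcInvalidateVa (Entry);            // dc ivac on the entry's cache line
    }
    DataSynchronizationBarrier ();       // dsb nsh
    InstructionSynchronizationBarrier (); // isb
  }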

Since the MMU and D-cache are already off when we reach this point, we
can drop the MMU and D-cache disables as well. Maintenance of the I-cache
is unnecessary, since we are not modifying any code, and the installed
mapping is guaranteed to be 1:1. This means we can also leave it enabled
while the page table population code is running.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 ArmPkg/Library/ArmLib/AArch64/ArmLibSupport.S    | 9 ++++++++-
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c | 9 ---------
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/ArmPkg/Library/ArmLib/AArch64/ArmLibSupport.S b/ArmPkg/Library/ArmLib/AArch64/ArmLibSupport.S
index 1adf960377a2..f744cd6738b9 100644
--- a/ArmPkg/Library/ArmLib/AArch64/ArmLibSupport.S
+++ b/ArmPkg/Library/ArmLib/AArch64/ArmLibSupport.S
@@ -13,6 +13,8 @@
 .set DAIF_RD_FIQ_BIT,   (1 << 6)
 .set DAIF_RD_IRQ_BIT,   (1 << 7)
 
+.set SCTLR_ELx_M_BIT_POS, (0)
+
 ASM_FUNC(ArmReadMidr)
   mrs     x0, midr_el1        // Read from Main ID Register (MIDR)
   ret
@@ -122,11 +124,16 @@ ASM_FUNC(ArmUpdateTranslationTableEntry)
    lsr     x1, x1, #12
    EL1_OR_EL2_OR_EL3(x0)
 1: tlbi    vaae1, x1             // TLB Invalidate VA , EL1
+   mrs     x2, sctlr_el1
    b       4f
 2: tlbi    vae2, x1              // TLB Invalidate VA , EL2
+   mrs     x2, sctlr_el2
    b       4f
 3: tlbi    vae3, x1              // TLB Invalidate VA , EL3
-4: dsb     nsh
+   mrs     x2, sctlr_el3
+4: tbnz    x2, SCTLR_ELx_M_BIT_POS, 5f
+   dc      ivac, x0              // invalidate in Dcache if MMU is still off
+5: dsb     nsh
    isb
    ret
 
diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
index e8f5c69e3136..204e33c75f95 100644
--- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
+++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
@@ -699,15 +699,6 @@ ArmConfigureMmu (
 
   ZeroMem (TranslationTable, RootTableEntryCount * sizeof(UINT64));
 
-  // Disable MMU and caches. ArmDisableMmu() also invalidates the TLBs
-  ArmDisableMmu ();
-  ArmDisableDataCache ();
-  ArmDisableInstructionCache ();
-
-  // Make sure nothing sneaked into the cache
-  ArmCleanInvalidateDataCache ();
-  ArmInvalidateInstructionCache ();
-
   TranslationTableAttribute = TT_ATTR_INDX_INVALID;
   while (MemoryTable->Length != 0) {
 
-- 
2.17.1



* [PATCH 5/6] ArmPkg/ArmLib: move set/way helper functions into private header
  2020-02-26 10:03 [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2020-02-26 10:03 ` [PATCH 4/6] ArmPkg/ArmMmuLib AARCH64: " Ard Biesheuvel
@ 2020-02-26 10:03 ` Ard Biesheuvel
  2020-02-26 10:03 ` [PATCH 6/6] ArmPkg/ArmLib: deprecate set/way cache maintenance routines Ard Biesheuvel
  2020-02-26 10:29 ` [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Laszlo Ersek
  6 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2020-02-26 10:03 UTC (permalink / raw)
  To: devel; +Cc: leif, lersek, sami.mujawar, Ard Biesheuvel

The clean/invalidate helper functions that operate on a single cache
line identified by set, way and level in a special, architected format
are only used by the implementations of the clean/invalidate routines
that operate on the entire cache hierarchy, as exposed by ArmLib.

The latter routines will be deprecated soon, so move the helpers out
of ArmLib.h and into a private header so they are safe from abuse.
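
For reference, the architected format referred to above is the operand
layout of the DC CSW/ISW/CISW instructions: way in the top bits, set in
the middle, cache level in bits [3:1]. A sketch of how such an operand
is composed (names made up for illustration):

  // Log2LineLen is log2 of the cache line length in bytes; Log2Ways is
  // the number of bits needed to encode the way, per the ARM ARM.
  UINTN
  BuildSetWayFormat (
    IN UINTN  Set,
    IN UINTN  Way,
    IN UINTN  Level,        // 0-based cache level
    IN UINTN  Log2LineLen,
    IN UINTN  Log2Ways
    )
  {
    return (Way << (32 - Log2Ways)) | (Set << Log2LineLen) | (Level << 1);
  }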

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 ArmPkg/Include/Library/ArmLib.h            | 18 ------------------
 ArmPkg/Library/ArmLib/AArch64/AArch64Lib.h | 18 ++++++++++++++++++
 ArmPkg/Library/ArmLib/Arm/ArmV7Lib.h       | 18 ++++++++++++++++++
 3 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/ArmPkg/Include/Library/ArmLib.h b/ArmPkg/Include/Library/ArmLib.h
index e76a46d5f4ce..5a27b7c2fc27 100644
--- a/ArmPkg/Include/Library/ArmLib.h
+++ b/ArmPkg/Include/Library/ArmLib.h
@@ -211,24 +211,6 @@ ArmCleanInvalidateDataCacheEntryByMVA (
   IN  UINTN   Address
   );
 
-VOID
-EFIAPI
-ArmInvalidateDataCacheEntryBySetWay (
-  IN  UINTN  SetWayFormat
-  );
-
-VOID
-EFIAPI
-ArmCleanDataCacheEntryBySetWay (
-  IN  UINTN  SetWayFormat
-  );
-
-VOID
-EFIAPI
-ArmCleanInvalidateDataCacheEntryBySetWay (
-  IN  UINTN   SetWayFormat
-  );
-
 VOID
 EFIAPI
 ArmEnableDataCache (
diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Lib.h b/ArmPkg/Library/ArmLib/AArch64/AArch64Lib.h
index ab9bcf553c4d..b2c8a8ea0b84 100644
--- a/ArmPkg/Library/ArmLib/AArch64/AArch64Lib.h
+++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Lib.h
@@ -17,5 +17,23 @@ AArch64AllDataCachesOperation (
   IN  AARCH64_CACHE_OPERATION  DataCacheOperation
   );
 
+VOID
+EFIAPI
+ArmInvalidateDataCacheEntryBySetWay (
+  IN  UINTN  SetWayFormat
+  );
+
+VOID
+EFIAPI
+ArmCleanDataCacheEntryBySetWay (
+  IN  UINTN  SetWayFormat
+  );
+
+VOID
+EFIAPI
+ArmCleanInvalidateDataCacheEntryBySetWay (
+  IN  UINTN   SetWayFormat
+  );
+
 #endif // __AARCH64_LIB_H__
 
diff --git a/ArmPkg/Library/ArmLib/Arm/ArmV7Lib.h b/ArmPkg/Library/ArmLib/Arm/ArmV7Lib.h
index c52fb9a1b484..93183e67230e 100644
--- a/ArmPkg/Library/ArmLib/Arm/ArmV7Lib.h
+++ b/ArmPkg/Library/ArmLib/Arm/ArmV7Lib.h
@@ -30,5 +30,23 @@ ArmV7AllDataCachesOperation (
   IN  ARM_V7_CACHE_OPERATION  DataCacheOperation
   );
 
+VOID
+EFIAPI
+ArmInvalidateDataCacheEntryBySetWay (
+  IN  UINTN  SetWayFormat
+  );
+
+VOID
+EFIAPI
+ArmCleanDataCacheEntryBySetWay (
+  IN  UINTN  SetWayFormat
+  );
+
+VOID
+EFIAPI
+ArmCleanInvalidateDataCacheEntryBySetWay (
+  IN  UINTN   SetWayFormat
+  );
+
 #endif // __ARM_V7_LIB_H__
 
-- 
2.17.1



* [PATCH 6/6] ArmPkg/ArmLib: deprecate set/way cache maintenance routines
  2020-02-26 10:03 [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Ard Biesheuvel
                   ` (4 preceding siblings ...)
  2020-02-26 10:03 ` [PATCH 5/6] ArmPkg/ArmLib: move set/way helper functions into private header Ard Biesheuvel
@ 2020-02-26 10:03 ` Ard Biesheuvel
  2020-03-02 13:13   ` Leif Lindholm
  2020-02-26 10:29 ` [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Laszlo Ersek
  6 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2020-02-26 10:03 UTC (permalink / raw)
  To: devel; +Cc: leif, lersek, sami.mujawar, Ard Biesheuvel

Cache maintenance on ARMv7 systems and up should be done by virtual
address if the purpose is to manage the cached state of contents of
memory. Set/way operations are only intended to maintain the caches
themselves, e.g., to ensure that the contents of dirty cachelines
are brought to main memory before the core is powered off entirely.

UEFI on ARM is typically not involved in the latter at all, and any
cache maintenance it does is to ensure that the memory it occupies
and modifies remains in a consistent state with respect to the
caches.

So let's deprecate the set/way routines now that we have removed all
uses of them from the core code.
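
With the guard in place, any remaining caller is flagged at build time
rather than silently doing the wrong kind of maintenance. For example
(hypothetical caller; Buffer/BufferSize are placeholders):

  // No longer compiles once DISABLE_NEW_DEPRECATED_INTERFACES is
  // defined, since the declaration is compiled out of ArmLib.h:
  ArmCleanInvalidateDataCache ();

  // By-VA maintenance via CacheMaintenanceLib is the replacement:
  WriteBackInvalidateDataCacheRange (Buffer, BufferSize);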

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 ArmPkg/Include/Library/ArmLib.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/ArmPkg/Include/Library/ArmLib.h b/ArmPkg/Include/Library/ArmLib.h
index 5a27b7c2fc27..8330339302ca 100644
--- a/ArmPkg/Include/Library/ArmLib.h
+++ b/ArmPkg/Include/Library/ArmLib.h
@@ -156,6 +156,8 @@ ArmIsMpCore (
   VOID
   );
 
+#ifndef DISABLE_NEW_DEPRECATED_INTERFACES
+
 VOID
 EFIAPI
 ArmInvalidateDataCache (
@@ -169,6 +171,8 @@ ArmCleanInvalidateDataCache (
   VOID
   );
 
+#endif
+
 VOID
 EFIAPI
 ArmCleanDataCache (
-- 
2.17.1



* Re: [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops
  2020-02-26 10:03 [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Ard Biesheuvel
                   ` (5 preceding siblings ...)
  2020-02-26 10:03 ` [PATCH 6/6] ArmPkg/ArmLib: deprecate set/way cache maintenance routines Ard Biesheuvel
@ 2020-02-26 10:29 ` Laszlo Ersek
  6 siblings, 0 replies; 17+ messages in thread
From: Laszlo Ersek @ 2020-02-26 10:29 UTC (permalink / raw)
  To: Ard Biesheuvel, devel; +Cc: leif, sami.mujawar

On 02/26/20 11:03, Ard Biesheuvel wrote:
> As it turns out, there were still some instances of set/way ops left in
> the core code, in ArmMmuLib to be precise.
> 
> This series fixes ArmMmuLib to perform the appropriate cache invalidation
> when populating page tables with the MMU and caches off, allowing us to
> get rid of the cache clean/disable/enable sequences which are incorrect
> and pointless at the same time.
> 
> While at it, do some mild refactoring of the ARM version of ArmMmuLib
> first, and finally, deprecate the set/way ops in ArmLib.
> 
> Note that the last patch needs coordination with edk2-platforms, as
> there are a couple of users there which will need to be fixed, or
> have their definitions of DISABLE_NEW_DEPRECATED_INTERFACES dropped.
> 
> Ard Biesheuvel (6):
>   ArmPkg/ArmMmuLib ARM: remove dummy constructor
>   ArmPkg/ArmMmuLib ARM: split ArmMmuLibCore.c into core and update code
>   ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries
>   ArmPkg/ArmMmuLib AARCH64: cache-invalidate initial page table entries
>   ArmPkg/ArmLib: move set/way helper functions into private header
>   ArmPkg/ArmLib: deprecate set/way cache maintenance routines
> 
>  ArmPkg/Include/Library/ArmLib.h               |  22 +-
>  ArmPkg/Library/ArmLib/AArch64/AArch64Lib.h    |  18 +
>  ArmPkg/Library/ArmLib/AArch64/ArmLibSupport.S |   9 +-
>  ArmPkg/Library/ArmLib/Arm/ArmV7Lib.h          |  18 +
>  .../Library/ArmMmuLib/AArch64/ArmMmuLibCore.c |   9 -
>  .../Library/ArmMmuLib/Arm/ArmMmuLibConvert.c  |  32 ++
>  ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c  | 468 +-----------------
>  .../Library/ArmMmuLib/Arm/ArmMmuLibUpdate.c   | 435 ++++++++++++++++
>  ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf    |   4 +
>  9 files changed, 530 insertions(+), 485 deletions(-)
>  create mode 100644 ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibConvert.c
>  create mode 100644 ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibUpdate.c
> 

I'd like to leave reviewing this to Leif :)

Thanks!
Laszlo



* Re: [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries
  2020-02-26 10:03 ` [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries Ard Biesheuvel
@ 2020-02-26 10:37   ` Ard Biesheuvel
  2020-03-02 12:25   ` [edk2-devel] " Leif Lindholm
  1 sibling, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2020-02-26 10:37 UTC (permalink / raw)
  To: edk2-devel-groups-io; +Cc: Leif Lindholm, Laszlo Ersek, Sami Mujawar

On Wed, 26 Feb 2020 at 11:04, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>
> In the ARM version of ArmMmuLib, we are currently relying on set/way
> invalidation to ensure that the caches are in a consistent state with
> respect to main memory once we turn the MMU on. Even if set/way
> operations were the appropriate method to achieve this, doing an
> invalidate-all first and then populating the page table entries creates
> a window where page table entries could be loaded speculatively into
> the caches before we modify them, and shadow the new values that we
> write there.
>
> So let's get rid of the blanket clean/invalidate operations, and
> instead, update ArmUpdateTranslationTableEntry () to invalidate each
> page table entry *after* it is written if the MMU is still disabled
> at this point.
>

Uhm, apologies. This paragraph was copy-pasted from the AARCH64
version (along with the preceding one), but it doesn't apply here.
Instead, it should read,

"""
So let's get rid of the blanket clean/invalidate operations, and
instead, invalidate each section entry right after it is updated, and
invalidate sets of level 2 entries in blocks, using the generic
invalidation routine from CacheMaintenanceLib.
"""

> On ARMv7, cache maintenance may be required also when the MMU is
> enabled, in case the page table walker is not cache coherent. However,
> the code being updated here is guaranteed to run only when the MMU is
> still off, and so we can disregard the case when the MMU and caches
> are on.
>
> Since the MMU and D-cache are already off when we reach this point, we
> can drop the MMU and D-cache disables as well. Maintenance of the I-cache
> is unnecessary, since we are not modifying any code, and the installed
> mapping is guaranteed to be 1:1. This means we can also leave it enabled
> while the page table population code is running.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c | 25 +++++++++-----------
>  1 file changed, 11 insertions(+), 14 deletions(-)
>
> diff --git a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> index aca7a37facac..c5906b4310cc 100644
> --- a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> +++ b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> @@ -183,6 +183,8 @@ PopulateLevel2PageTable (
>      PhysicalBase += TT_DESCRIPTOR_PAGE_SIZE;
>    }
>
> +  InvalidateDataCacheRange ((UINT32 *)TranslationTable + FirstPageOffset,
> +    RemainLength / TT_DESCRIPTOR_PAGE_SIZE * sizeof (*PageEntry));
>  }
>
>  STATIC
> @@ -257,7 +259,11 @@ FillTranslationTable (
>          RemainLength >= TT_DESCRIPTOR_SECTION_SIZE) {
>        // Case: Physical address aligned on the Section Size (1MB) && the length
>        // is greater than the Section Size
> -      *SectionEntry++ = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> +      *SectionEntry = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> +
> +      ArmDataSynchronizationBarrier ();
> +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> +
>        PhysicalBase += TT_DESCRIPTOR_SECTION_SIZE;
>        RemainLength -= TT_DESCRIPTOR_SECTION_SIZE;
>      } else {
> @@ -267,9 +273,12 @@ FillTranslationTable (
>        // Case: Physical address aligned on the Section Size (1MB) && the length
>        //       does not fill a section
>        // Case: Physical address NOT aligned on the Section Size (1MB)
> -      PopulateLevel2PageTable (SectionEntry++, PhysicalBase, PageMapLength,
> +      PopulateLevel2PageTable (SectionEntry, PhysicalBase, PageMapLength,
>          MemoryRegion->Attributes);
>
> +      ArmDataSynchronizationBarrier ();
> +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> +
>        // If it is the last entry
>        if (RemainLength < TT_DESCRIPTOR_SECTION_SIZE) {
>          break;
> @@ -349,18 +358,6 @@ ArmConfigureMmu (
>      }
>    }
>
> -  ArmCleanInvalidateDataCache ();
> -  ArmInvalidateInstructionCache ();
> -
> -  ArmDisableDataCache ();
> -  ArmDisableInstructionCache();
> -  // TLBs are also invalidated when calling ArmDisableMmu()
> -  ArmDisableMmu ();
> -
> -  // Make sure nothing sneaked into the cache
> -  ArmCleanInvalidateDataCache ();
> -  ArmInvalidateInstructionCache ();
> -
>    ArmSetTTBR0 ((VOID *)(UINTN)(((UINTN)TranslationTable & ~TRANSLATION_TABLE_SECTION_ALIGNMENT_MASK) | (TTBRAttributes & 0x7F)));
>
>    //
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 17+ messages in thread
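
For reference, the per-entry maintenance pattern described in the corrected
commit message, condensed into a minimal sketch (the helper names are taken
from the quoted diff; the surrounding ArmMmuLib types and the exact
descriptor values are assumed, not reproduced in full):

  // Assumes Library/ArmLib.h for the cache maintenance helpers.
  STATIC
  VOID
  UpdateSectionEntry (
    IN  UINT32  *SectionEntry,
    IN  UINT32  Descriptor
    )
  {
    // Write the descriptor with the MMU and D-cache still off.
    *SectionEntry = Descriptor;

    // Complete the store before the maintenance op, then drop any
    // clean but stale copy of the line, so the page table walker
    // cannot observe the old value once the MMU is turned on.
    ArmDataSynchronizationBarrier ();
    ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry);
  }

  // Level 2 entries are invalidated in blocks instead, using the
  // generic CacheMaintenanceLib routine, e.g.:
  //   InvalidateDataCacheRange (PageEntries, NumEntries * sizeof (UINT32));

This mirrors what the quoted hunks of patch 3/6 do inline in
FillTranslationTable () and PopulateLevel2PageTable ().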

* Re: [edk2-devel] [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries
  2020-02-26 10:03 ` [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries Ard Biesheuvel
  2020-02-26 10:37   ` Ard Biesheuvel
@ 2020-03-02 12:25   ` Leif Lindholm
  2020-03-02 12:58     ` Ard Biesheuvel
  1 sibling, 1 reply; 17+ messages in thread
From: Leif Lindholm @ 2020-03-02 12:25 UTC (permalink / raw)
  To: devel, ard.biesheuvel; +Cc: lersek, sami.mujawar

On Wed, Feb 26, 2020 at 11:03:50 +0100, Ard Biesheuvel wrote:
> In the ARM version of ArmMmuLib, we are currently relying on set/way
> invalidation to ensure that the caches are in a consistent state with
> respect to main memory once we turn the MMU on. Even if set/way
> operations were the appropriate method to achieve this, doing an
> invalidate-all first and then populating the page table entries creates
> a window where page table entries could be loaded speculatively into
> the caches before we modify them, and shadow the new values that we
> write there.
> 
> So let's get rid of the blanket clean/invalidate operations, and
> instead, update ArmUpdateTranslationTableEntry () to invalidate each
> page table entry *after* it is written if the MMU is still disabled
> at this point.
> 
> On ARMv7, cache maintenance may be required also when the MMU is
> enabled, in case the page table walker is not cache coherent. However,
> the code being updated here is guaranteed to run only when the MMU is
> still off, and so we can disregard the case when the MMU and caches
> are on.
> 
> Since the MMU and D-cache are already off when we reach this point, we
> can drop the MMU and D-cache disables as well. Maintenance of the I-cache
> is unnecessary, since we are not modifying any code, and the installed
> mapping is guaranteed to be 1:1. This means we can also leave it enabled
> while the page table population code is running.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c | 25 +++++++++-----------
>  1 file changed, 11 insertions(+), 14 deletions(-)
> 
> diff --git a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> index aca7a37facac..c5906b4310cc 100644
> --- a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> +++ b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> @@ -183,6 +183,8 @@ PopulateLevel2PageTable (
>      PhysicalBase += TT_DESCRIPTOR_PAGE_SIZE;
>    }
>  
> +  InvalidateDataCacheRange ((UINT32 *)TranslationTable + FirstPageOffset,
> +    RemainLength / TT_DESCRIPTOR_PAGE_SIZE * sizeof (*PageEntry));
>  }
>  
>  STATIC
> @@ -257,7 +259,11 @@ FillTranslationTable (
>          RemainLength >= TT_DESCRIPTOR_SECTION_SIZE) {
>        // Case: Physical address aligned on the Section Size (1MB) && the length
>        // is greater than the Section Size
> -      *SectionEntry++ = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> +      *SectionEntry = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> +
> +      ArmDataSynchronizationBarrier ();
> +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> +

Since the sequence is somewhat counterintuitive, could we add a comment
to the effect of // Force subsequent accesses to fetch from main memory?

Obnoxious question: do we need another DSB here? Or are we reasonably
guaranteed that one will appear in the instruction stream between here
and anything else that would touch the same line?

>        PhysicalBase += TT_DESCRIPTOR_SECTION_SIZE;
>        RemainLength -= TT_DESCRIPTOR_SECTION_SIZE;
>      } else {
> @@ -267,9 +273,12 @@ FillTranslationTable (
>        // Case: Physical address aligned on the Section Size (1MB) && the length
>        //       does not fill a section
>        // Case: Physical address NOT aligned on the Section Size (1MB)
> -      PopulateLevel2PageTable (SectionEntry++, PhysicalBase, PageMapLength,
> +      PopulateLevel2PageTable (SectionEntry, PhysicalBase, PageMapLength,
>          MemoryRegion->Attributes);
>  
> +      ArmDataSynchronizationBarrier ();
> +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> +

Same pattern, so same questions.

/
    Leif

>        // If it is the last entry
>        if (RemainLength < TT_DESCRIPTOR_SECTION_SIZE) {
>          break;
> @@ -349,18 +358,6 @@ ArmConfigureMmu (
>      }
>    }
>  
> -  ArmCleanInvalidateDataCache ();
> -  ArmInvalidateInstructionCache ();
> -
> -  ArmDisableDataCache ();
> -  ArmDisableInstructionCache();
> -  // TLBs are also invalidated when calling ArmDisableMmu()
> -  ArmDisableMmu ();
> -
> -  // Make sure nothing sneaked into the cache
> -  ArmCleanInvalidateDataCache ();
> -  ArmInvalidateInstructionCache ();
> -
>    ArmSetTTBR0 ((VOID *)(UINTN)(((UINTN)TranslationTable & ~TRANSLATION_TABLE_SECTION_ALIGNMENT_MASK) | (TTBRAttributes & 0x7F)));
>  
>    //
> -- 
> 2.17.1
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread
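
The comment Leif asks for could slot in directly above the barrier pair; a
sketch of how the hunk might then read (all code is from the quoted diff,
the comment wording is the suggested one, lightly adjusted):

  *SectionEntry = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;

  // Force subsequent accesses to fetch the entry from main memory:
  // order the store, then discard the (clean) cached copy of the line.
  ArmDataSynchronizationBarrier ();
  ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);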

* Re: [edk2-devel] [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries
  2020-03-02 12:25   ` [edk2-devel] " Leif Lindholm
@ 2020-03-02 12:58     ` Ard Biesheuvel
  2020-03-02 13:10       ` Leif Lindholm
  0 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2020-03-02 12:58 UTC (permalink / raw)
  To: Leif Lindholm; +Cc: edk2-devel-groups-io, Laszlo Ersek, Sami Mujawar

On Mon, 2 Mar 2020 at 13:25, Leif Lindholm <leif@nuviainc.com> wrote:
>
> On Wed, Feb 26, 2020 at 11:03:50 +0100, Ard Biesheuvel wrote:
> > In the ARM version of ArmMmuLib, we are currently relying on set/way
> > invalidation to ensure that the caches are in a consistent state with
> > respect to main memory once we turn the MMU on. Even if set/way
> > operations were the appropriate method to achieve this, doing an
> > invalidate-all first and then populating the page table entries creates
> > a window where page table entries could be loaded speculatively into
> > the caches before we modify them, and shadow the new values that we
> > write there.
> >
> > So let's get rid of the blanket clean/invalidate operations, and
> > instead, update ArmUpdateTranslationTableEntry () to invalidate each
> > page table entry *after* it is written if the MMU is still disabled
> > at this point.
> >
> > On ARMv7, cache maintenance may be required also when the MMU is
> > enabled, in case the page table walker is not cache coherent. However,
> > the code being updated here is guaranteed to run only when the MMU is
> > still off, and so we can disregard the case when the MMU and caches
> > are on.
> >
> > Since the MMU and D-cache are already off when we reach this point, we
> > can drop the MMU and D-cache disables as well. Maintenance of the I-cache
> > is unnecessary, since we are not modifying any code, and the installed
> > mapping is guaranteed to be 1:1. This means we can also leave it enabled
> > while the page table population code is running.
> >
> > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > ---
> >  ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c | 25 +++++++++-----------
> >  1 file changed, 11 insertions(+), 14 deletions(-)
> >
> > diff --git a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> > index aca7a37facac..c5906b4310cc 100644
> > --- a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> > +++ b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> > @@ -183,6 +183,8 @@ PopulateLevel2PageTable (
> >      PhysicalBase += TT_DESCRIPTOR_PAGE_SIZE;
> >    }
> >
> > +  InvalidateDataCacheRange ((UINT32 *)TranslationTable + FirstPageOffset,
> > +    RemainLength / TT_DESCRIPTOR_PAGE_SIZE * sizeof (*PageEntry));
> >  }
> >
> >  STATIC
> > @@ -257,7 +259,11 @@ FillTranslationTable (
> >          RemainLength >= TT_DESCRIPTOR_SECTION_SIZE) {
> >        // Case: Physical address aligned on the Section Size (1MB) && the length
> >        // is greater than the Section Size
> > -      *SectionEntry++ = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> > +      *SectionEntry = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> > +
> > +      ArmDataSynchronizationBarrier ();
> > +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> > +
>
> Since the sequence is somewhat counterintuitive, could we add a comment
> to the effect of // Force subsequent accesses to fetch from main memory?
>

The barrier is there to ensure that the write made it to main memory,
so we could actually relax this to a DMB.

> Obnoxious question: do we need another DSB here? Or are we reasonably
> guaranteed that one will appear in the instruction stream between here
> and anything else that would touch the same line?
>

The MMU enable will issue a DSB to ensure that all the cache
invalidations have completed.

> >        PhysicalBase += TT_DESCRIPTOR_SECTION_SIZE;
> >        RemainLength -= TT_DESCRIPTOR_SECTION_SIZE;
> >      } else {
> > @@ -267,9 +273,12 @@ FillTranslationTable (
> >        // Case: Physical address aligned on the Section Size (1MB) && the length
> >        //       does not fill a section
> >        // Case: Physical address NOT aligned on the Section Size (1MB)
> > -      PopulateLevel2PageTable (SectionEntry++, PhysicalBase, PageMapLength,
> > +      PopulateLevel2PageTable (SectionEntry, PhysicalBase, PageMapLength,
> >          MemoryRegion->Attributes);
> >
> > +      ArmDataSynchronizationBarrier ();
> > +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> > +
>
> Same pattern, so same questions.
>

Same answer :-)


> >        // If it is the last entry
> >        if (RemainLength < TT_DESCRIPTOR_SECTION_SIZE) {
> >          break;
> > @@ -349,18 +358,6 @@ ArmConfigureMmu (
> >      }
> >    }
> >
> > -  ArmCleanInvalidateDataCache ();
> > -  ArmInvalidateInstructionCache ();
> > -
> > -  ArmDisableDataCache ();
> > -  ArmDisableInstructionCache();
> > -  // TLBs are also invalidated when calling ArmDisableMmu()
> > -  ArmDisableMmu ();
> > -
> > -  // Make sure nothing sneaked into the cache
> > -  ArmCleanInvalidateDataCache ();
> > -  ArmInvalidateInstructionCache ();
> > -
> >    ArmSetTTBR0 ((VOID *)(UINTN)(((UINTN)TranslationTable & ~TRANSLATION_TABLE_SECTION_ALIGNMENT_MASK) | (TTBRAttributes & 0x7F)));
> >
> >    //
> > --
> > 2.17.1
> >
> >
> > 
> >

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [edk2-devel] [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries
  2020-03-02 12:58     ` Ard Biesheuvel
@ 2020-03-02 13:10       ` Leif Lindholm
  2020-03-02 13:15         ` Ard Biesheuvel
  0 siblings, 1 reply; 17+ messages in thread
From: Leif Lindholm @ 2020-03-02 13:10 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: edk2-devel-groups-io, Laszlo Ersek, Sami Mujawar

On Mon, Mar 02, 2020 at 13:58:39 +0100, Ard Biesheuvel wrote:
> On Mon, 2 Mar 2020 at 13:25, Leif Lindholm <leif@nuviainc.com> wrote:
> >
> > On Wed, Feb 26, 2020 at 11:03:50 +0100, Ard Biesheuvel wrote:
> > > In the ARM version of ArmMmuLib, we are currently relying on set/way
> > > invalidation to ensure that the caches are in a consistent state with
> > > respect to main memory once we turn the MMU on. Even if set/way
> > > operations were the appropriate method to achieve this, doing an
> > > invalidate-all first and then populating the page table entries creates
> > > a window where page table entries could be loaded speculatively into
> > > the caches before we modify them, and shadow the new values that we
> > > write there.
> > >
> > > So let's get rid of the blanket clean/invalidate operations, and
> > > instead, update ArmUpdateTranslationTableEntry () to invalidate each
> > > page table entry *after* it is written if the MMU is still disabled
> > > at this point.
> > >
> > > On ARMv7, cache maintenance may be required also when the MMU is
> > > enabled, in case the page table walker is not cache coherent. However,
> > > the code being updated here is guaranteed to run only when the MMU is
> > > still off, and so we can disregard the case when the MMU and caches
> > > are on.
> > >
> > > Since the MMU and D-cache are already off when we reach this point, we
> > > can drop the MMU and D-cache disables as well. Maintenance of the I-cache
> > > is unnecessary, since we are not modifying any code, and the installed
> > > mapping is guaranteed to be 1:1. This means we can also leave it enabled
> > > while the page table population code is running.
> > >
> > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > > ---
> > >  ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c | 25 +++++++++-----------
> > >  1 file changed, 11 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> > > index aca7a37facac..c5906b4310cc 100644
> > > --- a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> > > +++ b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> > > @@ -183,6 +183,8 @@ PopulateLevel2PageTable (
> > >      PhysicalBase += TT_DESCRIPTOR_PAGE_SIZE;
> > >    }
> > >
> > > +  InvalidateDataCacheRange ((UINT32 *)TranslationTable + FirstPageOffset,
> > > +    RemainLength / TT_DESCRIPTOR_PAGE_SIZE * sizeof (*PageEntry));
> > >  }
> > >
> > >  STATIC
> > > @@ -257,7 +259,11 @@ FillTranslationTable (
> > >          RemainLength >= TT_DESCRIPTOR_SECTION_SIZE) {
> > >        // Case: Physical address aligned on the Section Size (1MB) && the length
> > >        // is greater than the Section Size
> > > -      *SectionEntry++ = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> > > +      *SectionEntry = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> > > +
> > > +      ArmDataSynchronizationBarrier ();
> > > +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> > > +
> >
> > Since the sequence is somewhat counterintuitive, could we add a comment
> > to the effect of // Force subsequent accesses to fetch from main memory?
> 
> The barrier is there to ensure that the write made it to main memory,
> so we could actually relax this to a DMB.

That's fine if there's no risk of a stale entry for that line (i.e.,
if the D-cache has not been enabled since reset). Otherwise, I *think*
there could be a potential race condition in v7.

> > Obnoxious question: do we need another DSB here? Or are we reasonably
> > guaranteed that one will appear in the instruction stream between here
> > and anything else that would touch the same line?
> 
> The MMU enable will issue a DSB to ensure that all the cache
> invalidations have completed.

And that happens on our return path from here?
If so, fine. 

> > >        PhysicalBase += TT_DESCRIPTOR_SECTION_SIZE;
> > >        RemainLength -= TT_DESCRIPTOR_SECTION_SIZE;
> > >      } else {
> > > @@ -267,9 +273,12 @@ FillTranslationTable (
> > >        // Case: Physical address aligned on the Section Size (1MB) && the length
> > >        //       does not fill a section
> > >        // Case: Physical address NOT aligned on the Section Size (1MB)
> > > -      PopulateLevel2PageTable (SectionEntry++, PhysicalBase, PageMapLength,
> > > +      PopulateLevel2PageTable (SectionEntry, PhysicalBase, PageMapLength,
> > >          MemoryRegion->Attributes);
> > >
> > > +      ArmDataSynchronizationBarrier ();
> > > +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> > > +
> >
> > Same pattern, so same questions.
> >
> 
> Same answer :-)

Efficient!

/
    Leif

> > >        // If it is the last entry
> > >        if (RemainLength < TT_DESCRIPTOR_SECTION_SIZE) {
> > >          break;
> > > @@ -349,18 +358,6 @@ ArmConfigureMmu (
> > >      }
> > >    }
> > >
> > > -  ArmCleanInvalidateDataCache ();
> > > -  ArmInvalidateInstructionCache ();
> > > -
> > > -  ArmDisableDataCache ();
> > > -  ArmDisableInstructionCache();
> > > -  // TLBs are also invalidated when calling ArmDisableMmu()
> > > -  ArmDisableMmu ();
> > > -
> > > -  // Make sure nothing sneaked into the cache
> > > -  ArmCleanInvalidateDataCache ();
> > > -  ArmInvalidateInstructionCache ();
> > > -
> > >    ArmSetTTBR0 ((VOID *)(UINTN)(((UINTN)TranslationTable & ~TRANSLATION_TABLE_SECTION_ALIGNMENT_MASK) | (TTBRAttributes & 0x7F)));
> > >
> > >    //
> > > --
> > > 2.17.1
> > >
> > >
> > > 
> > >

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/6] ArmPkg/ArmLib: deprecate set/way cache maintenance routines
  2020-02-26 10:03 ` [PATCH 6/6] ArmPkg/ArmLib: deprecate set/way cache maintenance routines Ard Biesheuvel
@ 2020-03-02 13:13   ` Leif Lindholm
  2020-03-02 13:16     ` Ard Biesheuvel
  0 siblings, 1 reply; 17+ messages in thread
From: Leif Lindholm @ 2020-03-02 13:13 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: devel, lersek, sami.mujawar

On Wed, Feb 26, 2020 at 11:03:53 +0100, Ard Biesheuvel wrote:
> Cache maintenance on ARMv7 systems and up should be done by virtual
> address if the purpose is to manage the cached state of contents of
> memory. Set/way operations are only intended to maintain the caches
> themselves, e.g., to ensure that the contents of dirty cachelines
> are brought to main memory before the core is powered off entirely.
> 
> UEFI on ARM is typically not involved in the latter at all, and any
> cache maintenance it does is to ensure that the memory it occupies
> and modifies remains in a consistent state with respect to the
> caches.
> 
> So let's deprecate the set/way routines now that we have removed all
> uses of them in the core code.

Does this patch simply get dropped in favour of the ASSERT variant?

/
    Leif

> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  ArmPkg/Include/Library/ArmLib.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/ArmPkg/Include/Library/ArmLib.h b/ArmPkg/Include/Library/ArmLib.h
> index 5a27b7c2fc27..8330339302ca 100644
> --- a/ArmPkg/Include/Library/ArmLib.h
> +++ b/ArmPkg/Include/Library/ArmLib.h
> @@ -156,6 +156,8 @@ ArmIsMpCore (
>    VOID
>    );
>  
> +#ifndef DISABLE_NEW_DEPRECATED_INTERFACES
> +
>  VOID
>  EFIAPI
>  ArmInvalidateDataCache (
> @@ -169,6 +171,8 @@ ArmCleanInvalidateDataCache (
>    VOID
>    );
>  
> +#endif
> +
>  VOID
>  EFIAPI
>  ArmCleanDataCache (
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [edk2-devel] [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries
  2020-03-02 13:10       ` Leif Lindholm
@ 2020-03-02 13:15         ` Ard Biesheuvel
  2020-03-04 12:10           ` Leif Lindholm
  0 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2020-03-02 13:15 UTC (permalink / raw)
  To: Leif Lindholm; +Cc: edk2-devel-groups-io, Laszlo Ersek, Sami Mujawar

On Mon, 2 Mar 2020 at 14:10, Leif Lindholm <leif@nuviainc.com> wrote:
>
> On Mon, Mar 02, 2020 at 13:58:39 +0100, Ard Biesheuvel wrote:
> > On Mon, 2 Mar 2020 at 13:25, Leif Lindholm <leif@nuviainc.com> wrote:
> > >
> > > On Wed, Feb 26, 2020 at 11:03:50 +0100, Ard Biesheuvel wrote:
> > > > In the ARM version of ArmMmuLib, we are currently relying on set/way
> > > > invalidation to ensure that the caches are in a consistent state with
> > > > respect to main memory once we turn the MMU on. Even if set/way
> > > > operations were the appropriate method to achieve this, doing an
> > > > invalidate-all first and then populating the page table entries creates
> > > > a window where page table entries could be loaded speculatively into
> > > > the caches before we modify them, and shadow the new values that we
> > > > write there.
> > > >
> > > > So let's get rid of the blanket clean/invalidate operations, and
> > > > instead, update ArmUpdateTranslationTableEntry () to invalidate each
> > > > page table entry *after* it is written if the MMU is still disabled
> > > > at this point.
> > > >
> > > > On ARMv7, cache maintenance may be required also when the MMU is
> > > > enabled, in case the page table walker is not cache coherent. However,
> > > > the code being updated here is guaranteed to run only when the MMU is
> > > > still off, and so we can disregard the case when the MMU and caches
> > > > are on.
> > > >
> > > > Since the MMU and D-cache are already off when we reach this point, we
> > > > can drop the MMU and D-cache disables as well. Maintenance of the I-cache
> > > > is unnecessary, since we are not modifying any code, and the installed
> > > > mapping is guaranteed to be 1:1. This means we can also leave it enabled
> > > > while the page table population code is running.
> > > >
> > > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > > > ---
> > > >  ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c | 25 +++++++++-----------
> > > >  1 file changed, 11 insertions(+), 14 deletions(-)
> > > >
> > > > diff --git a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> > > > index aca7a37facac..c5906b4310cc 100644
> > > > --- a/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> > > > +++ b/ArmPkg/Library/ArmMmuLib/Arm/ArmMmuLibCore.c
> > > > @@ -183,6 +183,8 @@ PopulateLevel2PageTable (
> > > >      PhysicalBase += TT_DESCRIPTOR_PAGE_SIZE;
> > > >    }
> > > >
> > > > +  InvalidateDataCacheRange ((UINT32 *)TranslationTable + FirstPageOffset,
> > > > +    RemainLength / TT_DESCRIPTOR_PAGE_SIZE * sizeof (*PageEntry));
> > > >  }
> > > >
> > > >  STATIC
> > > > @@ -257,7 +259,11 @@ FillTranslationTable (
> > > >          RemainLength >= TT_DESCRIPTOR_SECTION_SIZE) {
> > > >        // Case: Physical address aligned on the Section Size (1MB) && the length
> > > >        // is greater than the Section Size
> > > > -      *SectionEntry++ = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> > > > +      *SectionEntry = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> > > > +
> > > > +      ArmDataSynchronizationBarrier ();
> > > > +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> > > > +
> > >
> > > Since the sequence is somewhat counterintuitive, could we add a comment
> > > to the effect of // Force subsequent accesses to fetch from main memory?
> >
> > The barrier is there to ensure that the write made it to main memory,
> > so we could actually relax this to a DMB.
>
> That's fine if there's no risk of a stale entry for that line (i.e.,
> if the D-cache has not been enabled since reset). Otherwise, I *think*
> there could be a potential race condition in v7.
>

I don't follow. Getting rid of stale, clean cachelines is the whole
point. But to ensure that a speculative fetch doesn't fetch the same
stale value again, we need to order the invalidate wrt the store.


> > > Obnoxious question: do we need another DSB here? Or are we reasonably
> > > guaranteed that one will appear in the instruction stream between here
> > > and anything else that would touch the same line?
> >
> > The MMU enable will issue a DSB to ensure that all the cache
> > invalidations have completed.
>
> And that happens on our return path from here?
> If so, fine.
>

It happens later. But remember that all of this code runs with the MMU
and caches off. We are just ensuring that our page table changes are
not shadowed in the PTW's view by clean but stale cachelines when we
turn on the MMU a bit later.

> > > >        PhysicalBase += TT_DESCRIPTOR_SECTION_SIZE;
> > > >        RemainLength -= TT_DESCRIPTOR_SECTION_SIZE;
> > > >      } else {
> > > > @@ -267,9 +273,12 @@ FillTranslationTable (
> > > >        // Case: Physical address aligned on the Section Size (1MB) && the length
> > > >        //       does not fill a section
> > > >        // Case: Physical address NOT aligned on the Section Size (1MB)
> > > > -      PopulateLevel2PageTable (SectionEntry++, PhysicalBase, PageMapLength,
> > > > +      PopulateLevel2PageTable (SectionEntry, PhysicalBase, PageMapLength,
> > > >          MemoryRegion->Attributes);
> > > >
> > > > +      ArmDataSynchronizationBarrier ();
> > > > +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> > > > +
> > >
> > > Same pattern, so same questions.
> > >
> >
> > Same answer :-)
>
> Efficient!
>
> /
>     Leif
>
> > > >        // If it is the last entry
> > > >        if (RemainLength < TT_DESCRIPTOR_SECTION_SIZE) {
> > > >          break;
> > > > @@ -349,18 +358,6 @@ ArmConfigureMmu (
> > > >      }
> > > >    }
> > > >
> > > > -  ArmCleanInvalidateDataCache ();
> > > > -  ArmInvalidateInstructionCache ();
> > > > -
> > > > -  ArmDisableDataCache ();
> > > > -  ArmDisableInstructionCache();
> > > > -  // TLBs are also invalidated when calling ArmDisableMmu()
> > > > -  ArmDisableMmu ();
> > > > -
> > > > -  // Make sure nothing sneaked into the cache
> > > > -  ArmCleanInvalidateDataCache ();
> > > > -  ArmInvalidateInstructionCache ();
> > > > -
> > > >    ArmSetTTBR0 ((VOID *)(UINTN)(((UINTN)TranslationTable & ~TRANSLATION_TABLE_SECTION_ALIGNMENT_MASK) | (TTBRAttributes & 0x7F)));
> > > >
> > > >    //
> > > > --
> > > > 2.17.1
> > > >
> > > >
> > > > 
> > > >

^ permalink raw reply	[flat|nested] 17+ messages in thread
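
Put together, the bring-up order the thread converges on looks roughly
like this (a sketch only; the TTBR0 attribute handling is abridged from
the diff, and ArmEnableMmu () is assumed to issue the completing DSB, as
Ard states above):

  // 1. Populate the tables with MMU and D-cache off; every entry is
  //    invalidated by MVA right after it is written (see the pattern
  //    above), so no blanket set/way maintenance is needed.
  FillTranslationTable (TranslationTable, MemoryRegion);

  // 2. Install the table (attribute bits omitted here; see the diff).
  ArmSetTTBR0 (TranslationTable);

  // 3. Enable the MMU; the enable path's DSB guarantees that all the
  //    outstanding invalidations from step 1 have completed first.
  ArmEnableMmu ();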

* Re: [PATCH 6/6] ArmPkg/ArmLib: deprecate set/way cache maintenance routines
  2020-03-02 13:13   ` Leif Lindholm
@ 2020-03-02 13:16     ` Ard Biesheuvel
  2020-03-04 12:04       ` Ard Biesheuvel
  0 siblings, 1 reply; 17+ messages in thread
From: Ard Biesheuvel @ 2020-03-02 13:16 UTC (permalink / raw)
  To: Leif Lindholm; +Cc: edk2-devel-groups-io, Laszlo Ersek, Sami Mujawar

On Mon, 2 Mar 2020 at 14:13, Leif Lindholm <leif@nuviainc.com> wrote:
>
> On Wed, Feb 26, 2020 at 11:03:53 +0100, Ard Biesheuvel wrote:
> > Cache maintenance on ARMv7 systems and up should be done by virtual
> > address if the purpose is to manage the cached state of contents of
> > memory. Set/way operations are only intended to maintain the caches
> > themselves, e.g., to ensure that the contents of dirty cachelines
> > are brought to main memory before the core is powered off entirely.
> >
> > UEFI on ARM is typically not involved in the latter at all, and any
> > cache maintenance it does is to ensure that the memory it occupies
> > and modifies remains in a consistent state with respect to the
> > caches.
> >
> > So let's deprecate the set/way routines now that we have removed all
> > uses of them in the core code.
>
> Does this patch simply get dropped in favour of the ASSERT variant?
>

Yes.

But I realised that we still need a fix for CmdRunAxf in that case :-(

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 6/6] ArmPkg/ArmLib: deprecate set/way cache maintenance routines
  2020-03-02 13:16     ` Ard Biesheuvel
@ 2020-03-04 12:04       ` Ard Biesheuvel
  0 siblings, 0 replies; 17+ messages in thread
From: Ard Biesheuvel @ 2020-03-04 12:04 UTC (permalink / raw)
  To: Leif Lindholm; +Cc: edk2-devel-groups-io, Laszlo Ersek, Sami Mujawar

On Mon, 2 Mar 2020 at 14:16, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>
> On Mon, 2 Mar 2020 at 14:13, Leif Lindholm <leif@nuviainc.com> wrote:
> >
> > On Wed, Feb 26, 2020 at 11:03:53 +0100, Ard Biesheuvel wrote:
> > > Cache maintenance on ARMv7 systems and up should be done by virtual
> > > address if the purpose is to manage the cached state of contents of
> > > memory. Set/way operations are only intended to maintain the caches
> > > themselves, e.g., to ensure that the contents of dirty cachelines
> > > are brought to main memory before the core is powered off entirely.
> > >
> > > UEFI on ARM is typically not involved in the latter at all, and any
> > > cache maintenance it does is to ensure that the memory it occupies
> > > and modifies remains in a consistent state with respect to the
> > > caches.
> > >
> > > So let's deprecate the set/way routines now that we have removed all
> > > uses of them in the core code.
> >
> > Does this patch simply get dropped in favour of the ASSERT variant?
> >
>
> Yes.
>
> But I realised that we still need a fix for CmdRunAxf in that case :-(

As discussed on IRC, with CmdRunAxf out of the way, we can ASSERT ()
on any use of the set/way ops with the MMU/dcache enabled. In any
case, this 6/6 will simply be dropped, and superseded by my followup
series '[PATCH 0/3] ArmPkg/ArmLib: ASSERT() on misuse of set/way ops'

^ permalink raw reply	[flat|nested] 17+ messages in thread
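
In rough outline, the ASSERT () variant mentioned above would take the
following shape (an assumed sketch for illustration; the follow-up series
is the authoritative version):

  VOID
  EFIAPI
  ArmInvalidateDataCache (
    VOID
    )
  {
    // Set/way maintenance is only coherent when nothing can allocate
    // into the caches concurrently, i.e., with the MMU and D-cache
    // off -- so trip an assert on misuse instead of silently
    // corrupting memory.
    ASSERT (!ArmMmuEnabled ());

    // ... existing set/way implementation follows ...
  }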

* Re: [edk2-devel] [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries
  2020-03-02 13:15         ` Ard Biesheuvel
@ 2020-03-04 12:10           ` Leif Lindholm
  0 siblings, 0 replies; 17+ messages in thread
From: Leif Lindholm @ 2020-03-04 12:10 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: edk2-devel-groups-io, Laszlo Ersek, Sami Mujawar

On Mon, Mar 02, 2020 at 14:15:58 +0100, Ard Biesheuvel wrote:
> > > > > @@ -257,7 +259,11 @@ FillTranslationTable (
> > > > >          RemainLength >= TT_DESCRIPTOR_SECTION_SIZE) {
> > > > >        // Case: Physical address aligned on the Section Size (1MB) && the length
> > > > >        // is greater than the Section Size
> > > > > -      *SectionEntry++ = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> > > > > +      *SectionEntry = TT_DESCRIPTOR_SECTION_BASE_ADDRESS(PhysicalBase) | Attributes;
> > > > > +
> > > > > +      ArmDataSynchronizationBarrier ();
> > > > > +      ArmInvalidateDataCacheEntryByMVA ((UINTN)SectionEntry++);
> > > > > +
> > > >
> > > > Since the sequence is somewhat counterintuitive, could we add a comment
> > > > to the effect of // Force subsequent accesses to fetch from main memory?
> > >
> > > The barrier is there to ensure that the write made it to main memory,
> > > so we could actually relax this to a DMB.
> >
> > If there's no risk there could be a stale entry for that line (i.e.,
> > D-cache has not been enabled since reset). Otherwise, I *think* there
> > could be a potential race condition in v7.
> 
> I don't follow. Getting rid of stale, clean cachelines is the whole
> point. But to ensure that a speculative fetch doesn't fetch the same
> stale value again, we need to order the invalidate wrt the store.

What I'm getting at is that *some* v7 platforms don't use TF-A, and
some of those could have run with caches enabled before ever going
into EDK2.

Since v7 architecture says:
---
When a cache is disabled for a type of access, for a particular
translation regime:
- it is IMPLEMENTATION DEFINED whether a cache hit occurs if a location
  that is held in the cache is accessed by that type of access.
- any location that is not held in the cache is not brought into the
  cache as a result of a memory access of that type.
---

Which I *think* means that if there *was* a stale entry in the D-cache
for that location, we need to ensure that entry is gone before we do
the write, or we risk the write going missing.

Race condition was an inaccurate way to describe this, but I had
forgotten about the second bullet point.

> > > > Obnoxious question: do we need another DSB here? Or are we reasonably
> > > > guaranteed that one will appear in the instruction stream between here
> > > > and anything else that would touch the same line?
> > >
> > > The MMU enable will issue a DSB to ensure that all the cache
> > > invalidations have completed.
> >
> > And that happens on our return path from here?
> > If so, fine.
> 
> It happens later. But remember that all of this code runs with the MMU
> and caches off. We are just ensuring that our page table changes are
> not shadowed in the PTW's view by clean but stale cachelines when we
> turn on the MMU a bit later.

Second bullet point again. Ignore me.

/
    Leif

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-03-04 12:10 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-26 10:03 [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Ard Biesheuvel
2020-02-26 10:03 ` [PATCH 1/6] ArmPkg/ArmMmuLib ARM: remove dummy constructor Ard Biesheuvel
2020-02-26 10:03 ` [PATCH 2/6] ArmPkg/ArmMmuLib ARM: split ArmMmuLibCore.c into core and update code Ard Biesheuvel
2020-02-26 10:03 ` [PATCH 3/6] ArmPkg/ArmMmuLib ARM: cache-invalidate initial page table entries Ard Biesheuvel
2020-02-26 10:37   ` Ard Biesheuvel
2020-03-02 12:25   ` [edk2-devel] " Leif Lindholm
2020-03-02 12:58     ` Ard Biesheuvel
2020-03-02 13:10       ` Leif Lindholm
2020-03-02 13:15         ` Ard Biesheuvel
2020-03-04 12:10           ` Leif Lindholm
2020-02-26 10:03 ` [PATCH 4/6] ArmPkg/ArmMmuLib AARCH64: " Ard Biesheuvel
2020-02-26 10:03 ` [PATCH 5/6] ArmPkg/ArmLib: move set/way helper functions into private header Ard Biesheuvel
2020-02-26 10:03 ` [PATCH 6/6] ArmPkg/ArmLib: deprecate set/way cache maintenance routines Ard Biesheuvel
2020-03-02 13:13   ` Leif Lindholm
2020-03-02 13:16     ` Ard Biesheuvel
2020-03-04 12:04       ` Ard Biesheuvel
2020-02-26 10:29 ` [PATCH 0/6] ArmPkg: eradicate and deprecate by set/way cache ops Laszlo Ersek
