public inbox for devel@edk2.groups.io
* [PATCH v2 0/7] ArmVirtPkg/ArmVirtQemu: avoid stores with MMU off
From: Ard Biesheuvel @ 2022-09-06 15:06 UTC
  To: devel
  Cc: quic_llindhol, sami.mujawar, Ard Biesheuvel, Marc Zyngier,
	Alexander Graf

We currently do a substantial amount of processing before enabling the
MMU and caches, which is bad for performance, but also fragile, as it
requires cache coherency to be managed by hand.

This also means that when running under virtualization, the hypervisor
must do a non-trivial amount of work to ensure that the host's cached
view of memory is consistent with the guest's uncached view.

So let's update the ArmVirtQemu early boot sequence to improve the
situation:
- instead of switching the MMU off and on again to meet
  break-before-make (BBM) requirements when running at EL1, use two sets
  of page tables and switch between them using different ASIDs;
- use a compile time generated ID map that covers the first bank of NOR
  flash, the first MMIO region (for the UART), and the first 128 MiB of
  DRAM, and switch to it straight out of reset.

The resulting build no longer performs any non-coherent memory accesses
via the data side, and only relies on instruction fetches before the MMU
is enabled.
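
For reference, the compile time generated ID map introduced in patch 5
below covers roughly the following mach-virt layout (see IdMap.S in that
patch for the exact descriptors):

  0x0000_0000 - 0x0020_0000   first 2 MiB of NOR flash bank 0 (read-only, executable)
  0x0800_0000 - 0x4000_0000   peripheral/MMIO region, including the UART (RW, non-executable)
  0x4000_0000 - 0x4800_0000   first 128 MiB of DRAM (RW, non-executable)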

Changes since v1:
- coding style tweaks to placate our CI overlord
- drop -mstrict-align which is no longer needed now that all C code runs
  with the MMU and caches on

Cc: Marc Zyngier <maz@kernel.org>
Cc: Alexander Graf <graf@amazon.com>

Ard Biesheuvel (7):
  ArmPkg/ArmMmuLib: don't replace table entries with block entries
  ArmPkg/ArmMmuLib: use shadow page tables for break-before-make at EL1
  ArmPkg/ArmMmuLib: permit initial configuration with MMU enabled
  ArmPlatformPkg/PrePeiCore: permit entry with the MMU enabled
  ArmVirtPkg/ArmVirtQemu: implement ArmPlatformLib with static ID map
  ArmVirtPkg/ArmVirtQemu: use first 128 MiB as permanent PEI memory
  ArmVirtPkg/ArmVirtQemu: enable initial ID map at early boot

 ArmPkg/Drivers/CpuDxe/AArch64/Mmu.c                               |   4 +
 ArmPkg/Include/Chipset/AArch64Mmu.h                               |   1 +
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c                  | 204 +++++++++++++-------
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S          |  15 +-
 ArmPlatformPkg/PrePeiCore/PrePeiCore.c                            |  22 ++-
 ArmVirtPkg/ArmVirtQemu.dsc                                        |  12 +-
 ArmVirtPkg/ArmVirtQemu.fdf                                        |   2 +-
 ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S | 111 +++++++++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.c        |  64 ++++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf      |  40 ++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/IdMap.S                     |  57 ++++++
 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.c                         | 105 ++++++++++
 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.inf                       |  68 +++++++
 13 files changed, 624 insertions(+), 81 deletions(-)
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.c
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/IdMap.S
 create mode 100644 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.c
 create mode 100644 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.inf

-- 
2.35.1


* [PATCH v2 1/7] ArmPkg/ArmMmuLib: don't replace table entries with block entries
From: Ard Biesheuvel @ 2022-09-06 15:06 UTC
  To: devel; +Cc: quic_llindhol, sami.mujawar, Ard Biesheuvel

Drop the optimization that replaces table entries with block entries and
frees the page tables in the subhierarchy that is being replaced. This
rarely occurs in practice anyway, and will require more elaborate TLB
maintenance once we switch to a different approach when running at EL1,
where we no longer disable the MMU and nuke the TLB entirely every time
we update a descriptor in a way that requires break-before-make (BBM).

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c | 20 ++------------------
 1 file changed, 2 insertions(+), 18 deletions(-)

diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
index e5ecc7375153..34f1031c4de3 100644
--- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
+++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
@@ -197,12 +197,9 @@ UpdateRegionMappingRecursive (
     // than a block, and recurse to create the block or page entries at
     // the next level. No block mappings are allowed at all at level 0,
     // so in that case, we have to recurse unconditionally.
-    // If we are changing a table entry and the AttributeClearMask is non-zero,
-    // we cannot replace it with a block entry without potentially losing
-    // attribute information, so keep the table entry in that case.
     //
     if ((Level == 0) || (((RegionStart | BlockEnd) & BlockMask) != 0) ||
-        (IsTableEntry (*Entry, Level) && (AttributeClearMask != 0)))
+        IsTableEntry (*Entry, Level))
     {
       ASSERT (Level < 3);
 
@@ -294,20 +291,7 @@ UpdateRegionMappingRecursive (
       EntryValue |= (Level == 3) ? TT_TYPE_BLOCK_ENTRY_LEVEL3
                                  : TT_TYPE_BLOCK_ENTRY;
 
-      if (IsTableEntry (*Entry, Level)) {
-        //
-        // We are replacing a table entry with a block entry. This is only
-        // possible if we are keeping none of the original attributes.
-        // We can free the table entry's page table, and all the ones below
-        // it, since we are dropping the only possible reference to it.
-        //
-        ASSERT (AttributeClearMask == 0);
-        TranslationTable = (VOID *)(UINTN)(*Entry & TT_ADDRESS_MASK_BLOCK_ENTRY);
-        ReplaceTableEntry (Entry, EntryValue, RegionStart, TRUE);
-        FreePageTablesRecursive (TranslationTable, Level + 1);
-      } else {
-        ReplaceTableEntry (Entry, EntryValue, RegionStart, FALSE);
-      }
+      ReplaceTableEntry (Entry, EntryValue, RegionStart, FALSE);
     }
   }
 
-- 
2.35.1


* [PATCH v2 2/7] ArmPkg/ArmMmuLib: use shadow page tables for break-before-make at EL1
From: Ard Biesheuvel @ 2022-09-06 15:06 UTC
  To: devel; +Cc: quic_llindhol, sami.mujawar, Ard Biesheuvel

When executing at EL1, disabling and re-enabling the MMU every time we
need to replace a live translation entry is slightly problematic, given
that memory accesses performed with the MMU off have non-cacheable
attributes and are therefore non-coherent. On bare metal, we can deal
with this by adding some barriers and cache invalidation instructions,
but when running under virtualization, elaborate trapping and cache
maintenance logic is necessary on the part of the hypervisor, and this
is better avoided.

So let's switch to a different approach when running at EL1: use two
sets of page tables with different ASIDs, and give all mappings
non-global attributes. This allows us to switch between those sets
without having to care about break-before-make, which means we can
manipulate the primary set of translations while running from the
secondary one.

To avoid splitting block mappings unnecessarily in the shadow page
tables, add a special case to the recursive mapping routines that
retains an existing block mapping if it already covers the region we
are trying to map and already has the right attributes.
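
For illustration, here is a rough C level restatement of the live entry
update sequence implemented by the ArmMmuLibReplaceEntry.S changes
further down. This is a sketch only; the register accessors, barrier and
TLBI helpers are hypothetical stand-ins for the corresponding
instructions:

  //
  // Sketch only, not part of the patch: ReadTtbr0El1/WriteTtbr0El1,
  // TlbInvalidateByVa and the barrier helpers are hypothetical.
  //
  STATIC
  VOID
  ReplaceLiveEntrySketch (
    IN UINT64  *Entry,            // live descriptor in the primary tables
    IN UINT64  NewDescriptor,     // value to install
    IN UINT64  VirtualAddress     // VA covered by the descriptor
    )
  {
    UINT64  Ttbr0;

    Ttbr0 = ReadTtbr0El1 ();                            // primary tables, ASID #0
    WriteTtbr0El1 ((Ttbr0 + SIZE_4KB) | (1ULL << 48));  // shadow tables, ASID #1
    InstructionSynchronizationBarrier ();

    *Entry = NewDescriptor;              // update the now inactive primary tables
    DataSynchronizationBarrier ();       // make the store visible to the table walker
    TlbInvalidateByVa (VirtualAddress);  // drop the stale ASID #0 translation
    DataSynchronizationBarrier ();

    WriteTtbr0El1 (Ttbr0);               // switch back to the primary tables
    InstructionSynchronizationBarrier ();
  }

The entry is never live while it is being rewritten, and the stale TLB
entry is invalidated before the primary ASID becomes active again, which
is what allows the MMU to remain enabled throughout.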

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 ArmPkg/Drivers/CpuDxe/AArch64/Mmu.c                      |   4 +
 ArmPkg/Include/Chipset/AArch64Mmu.h                      |   1 +
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c         | 154 +++++++++++++++-----
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S |  15 +-
 4 files changed, 138 insertions(+), 36 deletions(-)

diff --git a/ArmPkg/Drivers/CpuDxe/AArch64/Mmu.c b/ArmPkg/Drivers/CpuDxe/AArch64/Mmu.c
index 8bb33046e707..d15eb158ad60 100644
--- a/ArmPkg/Drivers/CpuDxe/AArch64/Mmu.c
+++ b/ArmPkg/Drivers/CpuDxe/AArch64/Mmu.c
@@ -313,6 +313,10 @@ EfiAttributeToArmAttribute (
     ArmAttributes |= TT_PXN_MASK;
   }
 
+  if (ArmReadCurrentEL () == AARCH64_EL1) {
+    ArmAttributes |= TT_NG;
+  }
+
   return ArmAttributes;
 }
 
diff --git a/ArmPkg/Include/Chipset/AArch64Mmu.h b/ArmPkg/Include/Chipset/AArch64Mmu.h
index 2ea2cc0a874d..763dc53908e2 100644
--- a/ArmPkg/Include/Chipset/AArch64Mmu.h
+++ b/ArmPkg/Include/Chipset/AArch64Mmu.h
@@ -67,6 +67,7 @@
 
 #define TT_NS  BIT5
 #define TT_AF  BIT10
+#define TT_NG  BIT11
 
 #define TT_SH_NON_SHAREABLE    (0x0 << 8)
 #define TT_SH_OUTER_SHAREABLE  (0x2 << 8)
diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
index 34f1031c4de3..747ebc533511 100644
--- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
+++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
@@ -25,23 +25,31 @@ ArmMemoryAttributeToPageAttribute (
   IN ARM_MEMORY_REGION_ATTRIBUTES  Attributes
   )
 {
+  UINT64  NonGlobal;
+
+  if (ArmReadCurrentEL () == AARCH64_EL1) {
+    NonGlobal = TT_NG;
+  } else {
+    NonGlobal = 0;
+  }
+
   switch (Attributes) {
     case ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK_NONSHAREABLE:
     case ARM_MEMORY_REGION_ATTRIBUTE_NONSECURE_WRITE_BACK_NONSHAREABLE:
-      return TT_ATTR_INDX_MEMORY_WRITE_BACK;
+      return TT_ATTR_INDX_MEMORY_WRITE_BACK | NonGlobal;
 
     case ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK:
     case ARM_MEMORY_REGION_ATTRIBUTE_NONSECURE_WRITE_BACK:
-      return TT_ATTR_INDX_MEMORY_WRITE_BACK | TT_SH_INNER_SHAREABLE;
+      return TT_ATTR_INDX_MEMORY_WRITE_BACK | TT_SH_INNER_SHAREABLE | NonGlobal;
 
     case ARM_MEMORY_REGION_ATTRIBUTE_WRITE_THROUGH:
     case ARM_MEMORY_REGION_ATTRIBUTE_NONSECURE_WRITE_THROUGH:
-      return TT_ATTR_INDX_MEMORY_WRITE_THROUGH | TT_SH_INNER_SHAREABLE;
+      return TT_ATTR_INDX_MEMORY_WRITE_THROUGH | TT_SH_INNER_SHAREABLE | NonGlobal;
 
     // Uncached and device mappings are treated as outer shareable by default,
     case ARM_MEMORY_REGION_ATTRIBUTE_UNCACHED_UNBUFFERED:
     case ARM_MEMORY_REGION_ATTRIBUTE_NONSECURE_UNCACHED_UNBUFFERED:
-      return TT_ATTR_INDX_MEMORY_NON_CACHEABLE;
+      return TT_ATTR_INDX_MEMORY_NON_CACHEABLE | NonGlobal;
 
     default:
       ASSERT (0);
@@ -50,7 +58,7 @@ ArmMemoryAttributeToPageAttribute (
       if (ArmReadCurrentEL () == AARCH64_EL2) {
         return TT_ATTR_INDX_DEVICE_MEMORY | TT_XN_MASK;
       } else {
-        return TT_ATTR_INDX_DEVICE_MEMORY | TT_UXN_MASK | TT_PXN_MASK;
+        return TT_ATTR_INDX_DEVICE_MEMORY | TT_UXN_MASK | TT_PXN_MASK | TT_NG;
       }
   }
 }
@@ -203,7 +211,14 @@ UpdateRegionMappingRecursive (
     {
       ASSERT (Level < 3);
 
-      if (!IsTableEntry (*Entry, Level)) {
+      if (IsBlockEntry (*Entry, Level) && (AttributeClearMask == 0) &&
+          ((*Entry & TT_ATTRIBUTES_MASK) == AttributeSetMask))
+      {
+        // The existing block entry already covers the region we are
+        // trying to map with the correct attributes so no need to do
+        // anything here
+        continue;
+      } else if (!IsTableEntry (*Entry, Level)) {
         //
         // No table entry exists yet, so we need to allocate a page table
         // for the next level.
@@ -304,7 +319,8 @@ UpdateRegionMapping (
   IN  UINT64  RegionStart,
   IN  UINT64  RegionLength,
   IN  UINT64  AttributeSetMask,
-  IN  UINT64  AttributeClearMask
+  IN  UINT64  AttributeClearMask,
+  IN  UINT64  *RootTable
   )
 {
   UINTN  T0SZ;
@@ -320,7 +336,7 @@ UpdateRegionMapping (
            RegionStart + RegionLength,
            AttributeSetMask,
            AttributeClearMask,
-           ArmGetTTBR0BaseAddress (),
+           RootTable,
            GetRootTableLevel (T0SZ)
            );
 }
@@ -329,14 +345,26 @@ STATIC
 EFI_STATUS
 FillTranslationTable (
   IN  UINT64                        *RootTable,
-  IN  ARM_MEMORY_REGION_DESCRIPTOR  *MemoryRegion
+  IN  ARM_MEMORY_REGION_DESCRIPTOR  *MemoryRegion,
+  IN  BOOLEAN                       IsSecondary
   )
 {
+  //
+  // Omit non-memory mappings from the shadow page tables, which only need to
+  // cover system RAM.
+  //
+  if (IsSecondary &&
+      (MemoryRegion->Attributes != ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK))
+  {
+    return EFI_SUCCESS;
+  }
+
   return UpdateRegionMapping (
            MemoryRegion->VirtualBase,
            MemoryRegion->Length,
            ArmMemoryAttributeToPageAttribute (MemoryRegion->Attributes) | TT_AF,
-           0
+           0,
+           RootTable
            );
 }
 
@@ -380,6 +408,10 @@ GcdAttributeToPageAttribute (
     PageAttributes |= TT_AP_NO_RO;
   }
 
+  if (ArmReadCurrentEL () == AARCH64_EL1) {
+    PageAttributes |= TT_NG;
+  }
+
   return PageAttributes | TT_AF;
 }
 
@@ -390,8 +422,9 @@ ArmSetMemoryAttributes (
   IN UINT64                Attributes
   )
 {
-  UINT64  PageAttributes;
-  UINT64  PageAttributeMask;
+  UINT64      PageAttributes;
+  UINT64      PageAttributeMask;
+  EFI_STATUS  Status;
 
   PageAttributes    = GcdAttributeToPageAttribute (Attributes);
   PageAttributeMask = 0;
@@ -404,13 +437,36 @@ ArmSetMemoryAttributes (
     PageAttributes   &= TT_AP_MASK | TT_UXN_MASK | TT_PXN_MASK;
     PageAttributeMask = ~(TT_ADDRESS_MASK_BLOCK_ENTRY | TT_AP_MASK |
                           TT_PXN_MASK | TT_XN_MASK);
+  } else if ((ArmReadCurrentEL () == AARCH64_EL1) &&
+             ((PageAttributes & TT_ATTR_INDX_MASK) == TT_ATTR_INDX_MEMORY_WRITE_BACK))
+  {
+    //
+    // Update the shadow page tables if we are running at EL1 and are mapping
+    // memory. This is needed because we may be adding memory that may be used
+    // later on for allocating page tables, and these need to be shadowed as
+    // well so we can update them safely. Strip the attributes so we don't
+    // fragment the shadow page tables unnecessarily.  (Note that adding device
+    // memory here and stripping the XN attributes would be bad, as it could
+    // result in speculative instruction fetches from MMIO regions.)
+    //
+    Status = UpdateRegionMapping (
+               BaseAddress,
+               Length,
+               (PageAttributes & ~(TT_AP_MASK | TT_PXN_MASK | TT_UXN_MASK)) | TT_AP_NO_RW,
+               0,
+               ArmGetTTBR0BaseAddress () + EFI_PAGE_SIZE
+               );
+    if (EFI_ERROR (Status)) {
+      return Status;
+    }
   }
 
   return UpdateRegionMapping (
            BaseAddress,
            Length,
            PageAttributes,
-           PageAttributeMask
+           PageAttributeMask,
+           ArmGetTTBR0BaseAddress ()
            );
 }
 
@@ -423,7 +479,13 @@ SetMemoryRegionAttribute (
   IN  UINT64                BlockEntryMask
   )
 {
-  return UpdateRegionMapping (BaseAddress, Length, Attributes, BlockEntryMask);
+  return UpdateRegionMapping (
+           BaseAddress,
+           Length,
+           Attributes,
+           BlockEntryMask,
+           ArmGetTTBR0BaseAddress ()
+           );
 }
 
 EFI_STATUS
@@ -503,13 +565,15 @@ ArmConfigureMmu (
   OUT UINTN                         *TranslationTableSize OPTIONAL
   )
 {
-  VOID        *TranslationTable;
-  UINTN       MaxAddressBits;
-  UINT64      MaxAddress;
-  UINTN       T0SZ;
-  UINTN       RootTableEntryCount;
-  UINT64      TCR;
-  EFI_STATUS  Status;
+  UINT64                        *TranslationTable;
+  UINTN                         MaxAddressBits;
+  UINT64                        MaxAddress;
+  UINTN                         T0SZ;
+  UINTN                         RootTableEntryCount;
+  UINT64                        TCR;
+  EFI_STATUS                    Status;
+  ARM_MEMORY_REGION_DESCRIPTOR  *MemTab;
+  UINTN                         NumRootPages;
 
   if (MemoryTable == NULL) {
     ASSERT (MemoryTable != NULL);
@@ -538,6 +602,8 @@ ArmConfigureMmu (
     // Note: Bits 23 and 31 are reserved(RES1) bits in TCR_EL2
     TCR = T0SZ | (1UL << 31) | (1UL << 23) | TCR_TG0_4KB;
 
+    NumRootPages = 1;
+
     // Set the Physical Address Size using MaxAddress
     if (MaxAddress < SIZE_4GB) {
       TCR |= TCR_PS_4GB;
@@ -564,6 +630,8 @@ ArmConfigureMmu (
     // Due to Cortex-A57 erratum #822227 we must set TG1[1] == 1, regardless of EPD1.
     TCR = T0SZ | TCR_TG0_4KB | TCR_TG1_4KB | TCR_EPD1;
 
+    NumRootPages = 2;
+
     // Set the Physical Address Size using MaxAddress
     if (MaxAddress < SIZE_4GB) {
       TCR |= TCR_IPS_4GB;
@@ -608,19 +676,11 @@ ArmConfigureMmu (
   ArmSetTCR (TCR);
 
   // Allocate pages for translation table
-  TranslationTable = AllocatePages (1);
+  TranslationTable = AllocatePages (NumRootPages);
   if (TranslationTable == NULL) {
     return EFI_OUT_OF_RESOURCES;
   }
 
-  //
-  // We set TTBR0 just after allocating the table to retrieve its location from
-  // the subsequent functions without needing to pass this value across the
-  // functions. The MMU is only enabled after the translation tables are
-  // populated.
-  //
-  ArmSetTTBR0 (TranslationTable);
-
   if (TranslationTableBase != NULL) {
     *TranslationTableBase = TranslationTable;
   }
@@ -637,15 +697,14 @@ ArmConfigureMmu (
     TranslationTable,
     RootTableEntryCount * sizeof (UINT64)
     );
+
   ZeroMem (TranslationTable, RootTableEntryCount * sizeof (UINT64));
 
-  while (MemoryTable->Length != 0) {
-    Status = FillTranslationTable (TranslationTable, MemoryTable);
+  for (MemTab = MemoryTable; MemTab->Length != 0; MemTab++) {
+    Status = FillTranslationTable (TranslationTable, MemTab, FALSE);
     if (EFI_ERROR (Status)) {
       goto FreeTranslationTable;
     }
-
-    MemoryTable++;
   }
 
   //
@@ -661,16 +720,41 @@ ArmConfigureMmu (
     MAIR_ATTR (TT_ATTR_INDX_MEMORY_WRITE_BACK, MAIR_ATTR_NORMAL_MEMORY_WRITE_BACK)
     );
 
+  ArmSetTTBR0 (TranslationTable);
+
   ArmDisableAlignmentCheck ();
   ArmEnableStackAlignmentCheck ();
   ArmEnableInstructionCache ();
   ArmEnableDataCache ();
 
   ArmEnableMmu ();
+
+  if (NumRootPages > 1) {
+    //
+    // Clone all memory ranges into the shadow page tables that we will use
+    // to temporarily switch to when manipulating live entries
+    //
+    ZeroMem (
+      TranslationTable + TT_ENTRY_COUNT,
+      RootTableEntryCount * sizeof (UINT64)
+      );
+
+    for (MemTab = MemoryTable; MemTab->Length != 0; MemTab++) {
+      Status = FillTranslationTable (
+                 TranslationTable + TT_ENTRY_COUNT,
+                 MemTab,
+                 TRUE
+                 );
+      if (EFI_ERROR (Status)) {
+        goto FreeTranslationTable;
+      }
+    }
+  }
+
   return EFI_SUCCESS;
 
 FreeTranslationTable:
-  FreePages (TranslationTable, 1);
+  FreePages (TranslationTable, NumRootPages);
   return Status;
 }
 
diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
index 66ebca571e63..6929e081ed8d 100644
--- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
+++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
@@ -59,7 +59,20 @@ ASM_FUNC(ArmReplaceLiveTranslationEntry)
   dsb   nsh
 
   EL1_OR_EL2_OR_EL3(x3)
-1:__replace_entry 1
+1:mrs   x8, ttbr0_el1
+  add   x9, x8, #0x1000        // advance to shadow page table
+  orr   x9, x9, #1 << 48       // use different ASID for shadow translations
+  msr   ttbr0_el1, x9
+  isb
+  str   x1, [x0]               // install the entry and make it observable
+  dsb   ishst                  // to the page table walker
+  isb
+  lsr   x2, x2, #12
+  tlbi  vae1is, x2             // invalidate the updated entry
+  dsb   ish
+  isb
+  msr   ttbr0_el1, x8          // switch back to original translation
+  isb
   b     4f
 2:__replace_entry 2
   b     4f
-- 
2.35.1


* [PATCH v2 3/7] ArmPkg/ArmMmuLib: permit initial configuration with MMU enabled
From: Ard Biesheuvel @ 2022-09-06 15:06 UTC
  To: devel; +Cc: quic_llindhol, sami.mujawar, Ard Biesheuvel

Permit the use of this library with the MMU and caches already enabled.
This removes the need for any cache maintenance for coherency, and is
generally better for robustness and performance, especially when running
under virtualization.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c | 30 +++++++++++---------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
index 747ebc533511..ebd39ab4a657 100644
--- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
+++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c
@@ -689,14 +689,16 @@ ArmConfigureMmu (
     *TranslationTableSize = RootTableEntryCount * sizeof (UINT64);
   }
 
-  //
-  // Make sure we are not inadvertently hitting in the caches
-  // when populating the page tables.
-  //
-  InvalidateDataCacheRange (
-    TranslationTable,
-    RootTableEntryCount * sizeof (UINT64)
-    );
+  if (!ArmMmuEnabled ()) {
+    //
+    // Make sure we are not inadvertently hitting in the caches
+    // when populating the page tables.
+    //
+    InvalidateDataCacheRange (
+      TranslationTable,
+      RootTableEntryCount * sizeof (UINT64)
+      );
+  }
 
   ZeroMem (TranslationTable, RootTableEntryCount * sizeof (UINT64));
 
@@ -722,12 +724,14 @@ ArmConfigureMmu (
 
   ArmSetTTBR0 (TranslationTable);
 
-  ArmDisableAlignmentCheck ();
-  ArmEnableStackAlignmentCheck ();
-  ArmEnableInstructionCache ();
-  ArmEnableDataCache ();
+  if (!ArmMmuEnabled ()) {
+    ArmDisableAlignmentCheck ();
+    ArmEnableStackAlignmentCheck ();
+    ArmEnableInstructionCache ();
+    ArmEnableDataCache ();
 
-  ArmEnableMmu ();
+    ArmEnableMmu ();
+  }
 
   if (NumRootPages > 1) {
     //
-- 
2.35.1


* [PATCH v2 4/7] ArmPlatformPkg/PrePeiCore: permit entry with the MMU enabled
From: Ard Biesheuvel @ 2022-09-06 15:06 UTC
  To: devel; +Cc: quic_llindhol, sami.mujawar, Ard Biesheuvel

Some platforms may set up a preliminary ID map in flash and enter EFI
with the MMU and caches enabled, as this removes a lot of the complexity
around cache coherency. Let's take this into account, and avoid touching
the MMU controls or performing cache invalidation when the MMU is enabled
at entry.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 ArmPlatformPkg/PrePeiCore/PrePeiCore.c | 22 +++++++++++---------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/ArmPlatformPkg/PrePeiCore/PrePeiCore.c b/ArmPlatformPkg/PrePeiCore/PrePeiCore.c
index 9c4b25df953d..8b86c6e69abd 100644
--- a/ArmPlatformPkg/PrePeiCore/PrePeiCore.c
+++ b/ArmPlatformPkg/PrePeiCore/PrePeiCore.c
@@ -58,17 +58,19 @@ CEntryPoint (
   IN  EFI_PEI_CORE_ENTRY_POINT  PeiCoreEntryPoint
   )
 {
-  // Data Cache enabled on Primary core when MMU is enabled.
-  ArmDisableDataCache ();
-  // Invalidate instruction cache
-  ArmInvalidateInstructionCache ();
-  // Enable Instruction Caches on all cores.
-  ArmEnableInstructionCache ();
+  if (!ArmMmuEnabled ()) {
+    // Data Cache enabled on Primary core when MMU is enabled.
+    ArmDisableDataCache ();
+    // Invalidate instruction cache
+    ArmInvalidateInstructionCache ();
+    // Enable Instruction Caches on all cores.
+    ArmEnableInstructionCache ();
 
-  InvalidateDataCacheRange (
-    (VOID *)(UINTN)PcdGet64 (PcdCPUCoresStackBase),
-    PcdGet32 (PcdCPUCorePrimaryStackSize)
-    );
+    InvalidateDataCacheRange (
+      (VOID *)(UINTN)PcdGet64 (PcdCPUCoresStackBase),
+      PcdGet32 (PcdCPUCorePrimaryStackSize)
+      );
+  }
 
   //
   // Note: Doesn't have to Enable CPU interface in non-secure world,
-- 
2.35.1


* [PATCH v2 5/7] ArmVirtPkg/ArmVirtQemu: implement ArmPlatformLib with static ID map
From: Ard Biesheuvel @ 2022-09-06 15:06 UTC
  To: devel; +Cc: quic_llindhol, sami.mujawar, Ard Biesheuvel

To substantially reduce the amount of processing that takes place with
the MMU and caches off, implement a version of ArmPlatformLib specific
to QEMU/mach-virt in AArch64 mode that carries a statically allocated
and populated ID map that covers the NOR flash and device region, and
128 MiB of DRAM at the base of memory (0x4000_0000).

Note that 128 MiB has always been the minimum amount of DRAM we support
for this configuration, and the existing code already ASSERT()s in DEBUG
mode when booting with less.
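
As a worked example (illustration only, not part of the patch), the level 2
block descriptor that IdMap.S below emits for the first 2 MiB of DRAM at
0x4000_0000 is composed from the same field values as the .set definitions
in that file:

  #define TT_TYPE_BLOCK  0x1ULL
  #define TT_MT_MEM      ((0x3ULL << 2) | (0x3ULL << 8))  // MAIR index 3, inner shareable
  #define TT_AF          (0x1ULL << 10)                   // access flag
  #define TT_NG          (0x1ULL << 11)                   // non-global, i.e. ASID tagged
  #define TT_XN          (0x3ULL << 53)                   // execute never (UXN + PXN at EL1)

  // BLOCK_MEM | (idx << 21), with idx == 0x4000_0000 >> 21
  #define DRAM_BLOCK_0   (TT_TYPE_BLOCK | TT_MT_MEM | TT_AF | TT_NG | TT_XN | 0x40000000ULL)
  // == 0x0060_0000_4000_0F0D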

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S | 111 ++++++++++++++++++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.c        |  64 +++++++++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf      |  40 +++++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/IdMap.S                     |  57 ++++++++++
 4 files changed, 272 insertions(+)

diff --git a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
new file mode 100644
index 000000000000..7b78e2928710
--- /dev/null
+++ b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
@@ -0,0 +1,111 @@
+//
+//  Copyright (c) 2022, Google LLC. All rights reserved.
+//
+//  SPDX-License-Identifier: BSD-2-Clause-Patent
+//
+//
+
+#include <AsmMacroIoLibV8.h>
+
+  .macro mov_i, reg:req, imm:req
+  movz   \reg, :abs_g3:\imm
+  movk   \reg, :abs_g2_nc:\imm
+  movk   \reg, :abs_g1_nc:\imm
+  movk   \reg, :abs_g0_nc:\imm
+  .endm
+
+ .set    MAIR_DEV_nGnRnE,     0x00
+ .set    MAIR_MEM_NC,         0x44
+ .set    MAIR_MEM_WT,         0xbb
+ .set    MAIR_MEM_WBWA,       0xff
+ .set    mairval, MAIR_DEV_nGnRnE | (MAIR_MEM_NC << 8) | (MAIR_MEM_WT << 16) | (MAIR_MEM_WBWA << 24)
+
+ .set    TCR_TG0_4KB,         0x0 << 14
+ .set    TCR_TG1_4KB,         0x2 << 30
+ .set    TCR_IPS_SHIFT,       32
+ .set    TCR_EPD1,            0x1 << 23
+ .set    TCR_SH_INNER,        0x3 << 12
+ .set    TCR_RGN_OWB,         0x1 << 10
+ .set    TCR_RGN_IWB,         0x1 << 8
+ .set    tcrval, TCR_TG0_4KB | TCR_TG1_4KB | TCR_EPD1 | TCR_RGN_OWB
+ .set    tcrval, tcrval | TCR_RGN_IWB | TCR_SH_INNER
+
+ .set    SCTLR_ELx_I,         0x1 << 12
+ .set    SCTLR_ELx_SA,        0x1 << 3
+ .set    SCTLR_ELx_C,         0x1 << 2
+ .set    SCTLR_ELx_M,         0x1 << 0
+ .set    SCTLR_EL1_SPAN,      0x1 << 23
+ .set    SCTLR_EL1_WXN,       0x1 << 19
+ .set    SCTLR_EL1_SED,       0x1 << 8
+ .set    SCTLR_EL1_ITD,       0x1 << 7
+ .set    SCTLR_EL1_RES1,      (0x1 << 11) | (0x1 << 20) | (0x1 << 22) | (0x1 << 28) | (0x1 << 29)
+ .set    sctlrval, SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_SA | SCTLR_EL1_ITD | SCTLR_EL1_SED
+ .set    sctlrval, sctlrval | SCTLR_ELx_I | SCTLR_EL1_SPAN | SCTLR_EL1_RES1
+
+
+ASM_FUNC(ArmPlatformPeiBootAction)
+  mov_i  x0, mairval
+  mov_i  x1, tcrval
+  adrp   x2, idmap
+  orr    x2, x2, #0xff << 48     // set non-zero ASID
+  mov_i  x3, sctlrval
+
+  mrs    x6, id_aa64mmfr0_el1    // get the supported PA range
+  and    x6, x6, #0xf            // isolate PArange bits
+  cmp    x6, #6                  // 0b0110 == 52 bits
+  sub    x6, x6, #1              // subtract 1
+  cinc   x6, x6, ne              // add back 1 unless PArange == 52 bits
+  bfi    x1, x6, #32, #3         // copy updated PArange into TCR_EL1.IPS
+
+  cmp    x6, #3                  // 0b0011 == 42 bits
+  sub    x6, x6, #1              // subtract 1
+  cinc   x6, x6, lt              // add back 1 unless VA range >= 42
+
+  mov    x7, #32
+  sub    x6, x7, x6, lsl #2      // T0SZ for PArange != 42
+  mov    x7, #64 - 42            // T0SZ for PArange == 42
+  csel   x6, x6, x7, ne
+  orr    x1, x1, x6              // set T0SZ field in TCR
+
+  cmp    x6, #64 - 40            // VA size < 40 bits?
+  add    x4, x2, #0x1000         // advance to level 1 descriptor
+  csel   x2, x4, x2, gt
+
+  msr    mair_el1, x0            // set up the 1:1 mapping
+  msr    tcr_el1, x1
+  msr    ttbr0_el1, x2
+  isb
+
+  tlbi   vmalle1                 // invalidate any cached translations
+  ic     iallu                   // invalidate the I-cache
+  dsb    nsh
+  isb
+
+  msr    sctlr_el1, x3           // enable MMU and caches
+  isb
+  ret
+
+//UINTN
+//ArmPlatformGetCorePosition (
+//  IN UINTN MpId
+//  );
+// Only a single core is supported, so the core position is always 0
+ASM_FUNC(ArmPlatformGetCorePosition)
+  mov   x0, xzr
+  ret
+
+//UINTN
+//ArmPlatformGetPrimaryCoreMpId (
+//  VOID
+//  );
+ASM_FUNC(ArmPlatformGetPrimaryCoreMpId)
+  MOV32  (w0, FixedPcdGet32 (PcdArmPrimaryCore))
+  ret
+
+//UINTN
+//ArmPlatformIsPrimaryCore (
+//  IN UINTN MpId
+//  );
+ASM_FUNC(ArmPlatformIsPrimaryCore)
+  mov   x0, #1
+  ret
diff --git a/ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.c b/ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.c
new file mode 100644
index 000000000000..1de80422ee4c
--- /dev/null
+++ b/ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.c
@@ -0,0 +1,64 @@
+/** @file
+
+  Copyright (c) 2011-2012, ARM Limited. All rights reserved.
+
+  SPDX-License-Identifier: BSD-2-Clause-Patent
+
+**/
+
+#include <Library/ArmLib.h>
+#include <Library/ArmPlatformLib.h>
+
+/**
+  Return the current Boot Mode.
+
+  This function returns the boot reason on the platform
+
+  @return   Return the current Boot Mode of the platform
+
+**/
+EFI_BOOT_MODE
+ArmPlatformGetBootMode (
+  VOID
+  )
+{
+  return BOOT_WITH_FULL_CONFIGURATION;
+}
+
+/**
+  Initialize controllers that must be set up in the normal world.
+
+  This function is called by the ArmPlatformPkg/PrePi or
+  ArmPlatformPkg/PlatformPei in the PEI phase.
+
+  @param[in]     MpId               ID of the calling CPU
+
+  @return        RETURN_SUCCESS unless the operation failed
+**/
+RETURN_STATUS
+ArmPlatformInitialize (
+  IN  UINTN  MpId
+  )
+{
+  return RETURN_SUCCESS;
+}
+
+/**
+  Return the Platform specific PPIs.
+
+  This function exposes the Platform Specific PPIs. They can be used by any
+  PrePi modules or passed to the PeiCore by PrePeiCore.
+
+  @param[out]   PpiListSize         Size in Bytes of the Platform PPI List
+  @param[out]   PpiList             Platform PPI List
+
+**/
+VOID
+ArmPlatformGetPlatformPpiList (
+  OUT UINTN                   *PpiListSize,
+  OUT EFI_PEI_PPI_DESCRIPTOR  **PpiList
+  )
+{
+  *PpiListSize = 0;
+  *PpiList     = NULL;
+}
diff --git a/ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf b/ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf
new file mode 100644
index 000000000000..b2ecdfa061cb
--- /dev/null
+++ b/ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf
@@ -0,0 +1,40 @@
+## @file
+#  ArmPlatformLib implementation for QEMU/mach-virt on AArch64 that contains a
+#  statically allocated 1:1 mapping of the first 128 MiB of DRAM, as well as
+#  the NOR flash and the device region
+#
+#  Copyright (c) 2011-2012, ARM Limited. All rights reserved.
+#  Copyright (c) 2022, Google LLC. All rights reserved.
+#
+#  SPDX-License-Identifier: BSD-2-Clause-Patent
+#
+##
+
+[Defines]
+  INF_VERSION                    = 1.27
+  BASE_NAME                      = ArmPlatformLibQemu
+  FILE_GUID                      = 40af3a25-f02c-4aef-94ef-7ac0282d21d4
+  MODULE_TYPE                    = BASE
+  VERSION_STRING                 = 1.0
+  LIBRARY_CLASS                  = ArmPlatformLib
+
+[Packages]
+  MdePkg/MdePkg.dec
+  MdeModulePkg/MdeModulePkg.dec
+  ArmPkg/ArmPkg.dec
+  ArmPlatformPkg/ArmPlatformPkg.dec
+
+[LibraryClasses]
+  ArmLib
+  DebugLib
+
+[Sources.common]
+  ArmPlatformLibQemu.c
+  IdMap.S
+
+[Sources.AArch64]
+  AArch64/ArmPlatformHelper.S
+
+[FixedPcd]
+  gArmTokenSpaceGuid.PcdArmPrimaryCoreMask
+  gArmTokenSpaceGuid.PcdArmPrimaryCore
diff --git a/ArmVirtPkg/Library/ArmPlatformLibQemu/IdMap.S b/ArmVirtPkg/Library/ArmPlatformLibQemu/IdMap.S
new file mode 100644
index 000000000000..4a4b7b77ed83
--- /dev/null
+++ b/ArmVirtPkg/Library/ArmPlatformLibQemu/IdMap.S
@@ -0,0 +1,57 @@
+// SPDX-License-Identifier: BSD-2-Clause-Patent
+// Copyright 2022 Google LLC
+// Author: Ard Biesheuvel <ardb@google.com>
+
+  .set      TT_TYPE_BLOCK, 0x1
+  .set      TT_TYPE_PAGE,  0x3
+  .set      TT_TYPE_TABLE, 0x3
+
+  .set      TT_AF, 0x1 << 10
+  .set      TT_NG, 0x1 << 11
+  .set      TT_RO, 0x2 << 6
+  .set      TT_XN, 0x3 << 53
+
+  .set      TT_MT_DEV, 0x0 << 2                 // MAIR #0
+  .set      TT_MT_MEM, (0x3 << 2) | (0x3 << 8)  // MAIR #3
+
+  .set      PAGE_XIP,  TT_TYPE_PAGE  | TT_MT_MEM | TT_AF | TT_RO | TT_NG
+  .set      BLOCK_DEV, TT_TYPE_BLOCK | TT_MT_DEV | TT_AF | TT_XN | TT_NG
+  .set      BLOCK_MEM, TT_TYPE_BLOCK | TT_MT_MEM | TT_AF | TT_XN | TT_NG
+
+  .globl    idmap
+  .section  ".rodata.idmap", "a", %progbits
+  .align    12
+
+idmap:      /* level 0 */
+  .quad     1f + TT_TYPE_TABLE
+  .fill     511, 8, 0x0
+
+1:          /* level 1 */
+  .quad     20f + TT_TYPE_TABLE           // 1 GB of flash and device mappings
+  .quad     21f + TT_TYPE_TABLE           // up to 1 GB of DRAM
+  .fill     510, 8, 0x0                   // 510 GB of remaining VA space
+
+20:         /* level 2 */
+  .quad     3f + TT_TYPE_TABLE            // up to 2 MB of flash
+  .fill     63, 8, 0x0                    // 126 MB of unused flash
+  .set      idx, 64
+  .rept     448
+  .quad     BLOCK_DEV | (idx << 21)       // 896 MB of RW- device mappings
+  .set      idx, idx + 1
+  .endr
+
+21:         /* level 2 */
+  .set      idx, 0x40000000 >> 21
+  .rept     64
+  .quad     BLOCK_MEM | (idx << 21)       // 128 MB of RW- memory mappings
+  .set      idx, idx + 1
+  .endr
+  .fill     448, 8, 0x0
+
+3:          /* level 3 */
+  .quad     0x0                           // omit first 4k page
+  .set      idx, 1
+  .rept     511
+  .quad     PAGE_XIP | (idx << 12)        // 2044 KiB of R-X flash mappings
+  .set      idx, idx + 1
+  .endr
-- 
2.35.1


* [PATCH v2 6/7] ArmVirtPkg/ArmVirtQemu: use first 128 MiB as permanent PEI memory
From: Ard Biesheuvel @ 2022-09-06 15:06 UTC
  To: devel; +Cc: quic_llindhol, sami.mujawar, Ard Biesheuvel

In order to allow booting with the MMU and caches enabled really early,
we need to ensure that the code that populates the page tables can
access those page tables with the statically defined ID map active.

So let's put the permanent PEI RAM in the first 128 MiB of memory, which
we will cover with this initial ID map (as it is the minimum supported
DRAM size for ArmVirtQemu).
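
As a worked example (illustration only; 64 MiB is an assumed figure, the
actual PcdSystemMemoryUefiRegionSize is platform configuration), the
placement computed by the new PEIM below works out as:

  // Sketch, assuming PcdSystemMemoryBase == 0x4000_0000 and a hypothetical
  // 64 MiB PcdSystemMemoryUefiRegionSize
  UINTN  UefiMemoryBase;

  UefiMemoryBase = 0x40000000UL + SIZE_128MB - SIZE_64MB;  // == 0x4400_0000

so the permanent PEI region [0x4400_0000, 0x4800_0000) sits at the top of,
and entirely within, the first 128 MiB of DRAM covered by the static ID map.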

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.c   | 105 ++++++++++++++++++++
 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.inf |  68 +++++++++++++
 2 files changed, 173 insertions(+)

diff --git a/ArmVirtPkg/MemoryInitPei/MemoryInitPeim.c b/ArmVirtPkg/MemoryInitPei/MemoryInitPeim.c
new file mode 100644
index 000000000000..d61fa55efaaa
--- /dev/null
+++ b/ArmVirtPkg/MemoryInitPei/MemoryInitPeim.c
@@ -0,0 +1,105 @@
+/** @file
+
+  Copyright (c) 2011, ARM Limited. All rights reserved.
+  Copyright (c) 2022, Google LLC. All rights reserved.
+
+  SPDX-License-Identifier: BSD-2-Clause-Patent
+
+**/
+
+#include <PiPei.h>
+#include <Library/ArmPlatformLib.h>
+#include <Library/DebugLib.h>
+#include <Library/HobLib.h>
+#include <Library/PeimEntryPoint.h>
+#include <Library/PeiServicesLib.h>
+#include <Library/PcdLib.h>
+#include <Guid/MemoryTypeInformation.h>
+
+EFI_STATUS
+EFIAPI
+MemoryPeim (
+  IN EFI_PHYSICAL_ADDRESS  UefiMemoryBase,
+  IN UINT64                UefiMemorySize
+  );
+
+/**
+  Build the memory type information HOB that describes how many pages of each
+  type to preallocate when initializing the GCD memory map.
+**/
+VOID
+EFIAPI
+BuildMemoryTypeInformationHob (
+  VOID
+  )
+{
+  EFI_MEMORY_TYPE_INFORMATION  Info[10];
+
+  Info[0].Type          = EfiACPIReclaimMemory;
+  Info[0].NumberOfPages = PcdGet32 (PcdMemoryTypeEfiACPIReclaimMemory);
+  Info[1].Type          = EfiACPIMemoryNVS;
+  Info[1].NumberOfPages = PcdGet32 (PcdMemoryTypeEfiACPIMemoryNVS);
+  Info[2].Type          = EfiReservedMemoryType;
+  Info[2].NumberOfPages = PcdGet32 (PcdMemoryTypeEfiReservedMemoryType);
+  Info[3].Type          = EfiRuntimeServicesData;
+  Info[3].NumberOfPages = PcdGet32 (PcdMemoryTypeEfiRuntimeServicesData);
+  Info[4].Type          = EfiRuntimeServicesCode;
+  Info[4].NumberOfPages = PcdGet32 (PcdMemoryTypeEfiRuntimeServicesCode);
+  Info[5].Type          = EfiBootServicesCode;
+  Info[5].NumberOfPages = PcdGet32 (PcdMemoryTypeEfiBootServicesCode);
+  Info[6].Type          = EfiBootServicesData;
+  Info[6].NumberOfPages = PcdGet32 (PcdMemoryTypeEfiBootServicesData);
+  Info[7].Type          = EfiLoaderCode;
+  Info[7].NumberOfPages = PcdGet32 (PcdMemoryTypeEfiLoaderCode);
+  Info[8].Type          = EfiLoaderData;
+  Info[8].NumberOfPages = PcdGet32 (PcdMemoryTypeEfiLoaderData);
+
+  // Terminator for the list
+  Info[9].Type          = EfiMaxMemoryType;
+  Info[9].NumberOfPages = 0;
+
+  BuildGuidDataHob (&gEfiMemoryTypeInformationGuid, &Info, sizeof (Info));
+}
+
+/**
+  Module entry point.
+
+  @param[in]    FileHandle    Handle of the file being invoked.
+  @param[in]    PeiServices   Describes the list of possible PEI Services.
+
+  @return       EFI_SUCCESS unless the operation failed.
+**/
+EFI_STATUS
+EFIAPI
+InitializeMemory (
+  IN       EFI_PEI_FILE_HANDLE  FileHandle,
+  IN CONST EFI_PEI_SERVICES     **PeiServices
+  )
+{
+  UINTN       UefiMemoryBase;
+  EFI_STATUS  Status;
+
+  ASSERT (PcdGet64 (PcdSystemMemorySize) >= SIZE_128MB);
+  ASSERT (PcdGet64 (PcdSystemMemoryBase) < (UINT64)MAX_ALLOC_ADDRESS);
+
+  //
+  // Put the permanent PEI memory in the first 128 MiB of DRAM so that
+  // it is covered by the statically configured ID map.
+  //
+  UefiMemoryBase = (UINTN)PcdGet64 (PcdSystemMemoryBase) + SIZE_128MB
+                   - FixedPcdGet32 (PcdSystemMemoryUefiRegionSize);
+
+  Status = PeiServicesInstallPeiMemory (
+             UefiMemoryBase,
+             FixedPcdGet32 (PcdSystemMemoryUefiRegionSize)
+             );
+  ASSERT_EFI_ERROR (Status);
+
+  Status = MemoryPeim (
+             UefiMemoryBase,
+             FixedPcdGet32 (PcdSystemMemoryUefiRegionSize)
+             );
+  ASSERT_EFI_ERROR (Status);
+
+  return Status;
+}
diff --git a/ArmVirtPkg/MemoryInitPei/MemoryInitPeim.inf b/ArmVirtPkg/MemoryInitPei/MemoryInitPeim.inf
new file mode 100644
index 000000000000..f4492719c350
--- /dev/null
+++ b/ArmVirtPkg/MemoryInitPei/MemoryInitPeim.inf
@@ -0,0 +1,68 @@
+## @file
+#  Implementation of MemoryInitPeim that uses the first 128 MiB at the base of
+#  DRAM as permanent PEI memory
+#
+#  Copyright (c) 2011-2014, ARM Ltd. All rights reserved.<BR>
+#  Copyright (c) 2022, Google LLC. All rights reserved.<BR>
+#
+#  SPDX-License-Identifier: BSD-2-Clause-Patent
+#
+##
+
+[Defines]
+  INF_VERSION                    = 1.27
+  BASE_NAME                      = MemoryInit
+  FILE_GUID                      = 0fbffd44-f98f-4e1c-9922-e9b21f13c3f8
+  MODULE_TYPE                    = PEIM
+  VERSION_STRING                 = 1.0
+  ENTRY_POINT                    = InitializeMemory
+
+#
+# The following information is for reference only and not required by the build tools.
+#
+#  VALID_ARCHITECTURES           = IA32 X64 EBC ARM
+#
+
+[Sources]
+  MemoryInitPeim.c
+
+[Packages]
+  MdePkg/MdePkg.dec
+  MdeModulePkg/MdeModulePkg.dec
+  EmbeddedPkg/EmbeddedPkg.dec
+  ArmPkg/ArmPkg.dec
+  ArmPlatformPkg/ArmPlatformPkg.dec
+
+[LibraryClasses]
+  PeimEntryPoint
+  DebugLib
+  HobLib
+  ArmLib
+  ArmPlatformLib
+  MemoryInitPeiLib
+
+[Guids]
+  gEfiMemoryTypeInformationGuid
+
+[FeaturePcd]
+  gEmbeddedTokenSpaceGuid.PcdPrePiProduceMemoryTypeInformationHob
+
+[FixedPcd]
+  gArmPlatformTokenSpaceGuid.PcdSystemMemoryUefiRegionSize
+
+  gEmbeddedTokenSpaceGuid.PcdMemoryTypeEfiACPIReclaimMemory
+  gEmbeddedTokenSpaceGuid.PcdMemoryTypeEfiACPIMemoryNVS
+  gEmbeddedTokenSpaceGuid.PcdMemoryTypeEfiReservedMemoryType
+  gEmbeddedTokenSpaceGuid.PcdMemoryTypeEfiRuntimeServicesData
+  gEmbeddedTokenSpaceGuid.PcdMemoryTypeEfiRuntimeServicesCode
+  gEmbeddedTokenSpaceGuid.PcdMemoryTypeEfiBootServicesCode
+  gEmbeddedTokenSpaceGuid.PcdMemoryTypeEfiBootServicesData
+  gEmbeddedTokenSpaceGuid.PcdMemoryTypeEfiLoaderCode
+  gEmbeddedTokenSpaceGuid.PcdMemoryTypeEfiLoaderData
+
+[Pcd]
+  gArmTokenSpaceGuid.PcdSystemMemoryBase
+  gArmTokenSpaceGuid.PcdSystemMemorySize
+
+[Depex]
+  TRUE
-- 
2.35.1


* [PATCH v2 7/7] ArmVirtPkg/ArmVirtQemu: enable initial ID map at early boot
From: Ard Biesheuvel @ 2022-09-06 15:06 UTC
  To: devel; +Cc: quic_llindhol, sami.mujawar, Ard Biesheuvel

Now that we have all the pieces in place, switch the AArch64 version of
ArmVirtQemu to a mode where the first thing it does out of reset is
enable a preliminary ID map that covers the NOR flash and sufficient
DRAM to create the UEFI page tables as usual.

The advantage of this is that no manipulation of memory occurs any
longer before the MMU is enabled, removing the need for explicit
coherency management, which is cumbersome and bad for performance.

It also means we no longer need to build all components that may execute
with the MMU off (including BASE libraries) with strict alignment.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 ArmVirtPkg/ArmVirtQemu.dsc | 12 +++++++++---
 ArmVirtPkg/ArmVirtQemu.fdf |  2 +-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/ArmVirtPkg/ArmVirtQemu.dsc b/ArmVirtPkg/ArmVirtQemu.dsc
index 302c0d2a4e29..2bf360d1b87b 100644
--- a/ArmVirtPkg/ArmVirtQemu.dsc
+++ b/ArmVirtPkg/ArmVirtQemu.dsc
@@ -63,8 +63,6 @@ [LibraryClasses.common]
   QemuFwCfgSimpleParserLib|OvmfPkg/Library/QemuFwCfgSimpleParserLib/QemuFwCfgSimpleParserLib.inf
   QemuLoadImageLib|OvmfPkg/Library/GenericQemuLoadImageLib/GenericQemuLoadImageLib.inf
 
-  ArmPlatformLib|ArmPlatformPkg/Library/ArmPlatformLibNull/ArmPlatformLibNull.inf
-
   TimerLib|ArmPkg/Library/ArmArchTimerLib/ArmArchTimerLib.inf
   NorFlashPlatformLib|ArmVirtPkg/Library/NorFlashQemuLib/NorFlashQemuLib.inf
 
@@ -92,6 +90,12 @@ [LibraryClasses.common]
   TpmPlatformHierarchyLib|SecurityPkg/Library/PeiDxeTpmPlatformHierarchyLibNull/PeiDxeTpmPlatformHierarchyLib.inf
 !endif
 
+[LibraryClasses.AARCH64]
+  ArmPlatformLib|ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf
+
+[LibraryClasses.ARM]
+  ArmPlatformLib|ArmPlatformPkg/Library/ArmPlatformLibNull/ArmPlatformLibNull.inf
+
 [LibraryClasses.common.PEIM]
   ArmVirtMemInfoLib|ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLib.inf
 
@@ -112,6 +116,8 @@ [LibraryClasses.common.UEFI_DRIVER]
   UefiScsiLib|MdePkg/Library/UefiScsiLib/UefiScsiLib.inf
 
 [BuildOptions]
+  GCC:*_*_AARCH64_CC_XIPFLAGS = -mno-strict-align
+
 !include NetworkPkg/NetworkBuildOptions.dsc.inc
 
 ################################################################################
@@ -310,7 +316,7 @@ [Components.common]
       PcdLib|MdePkg/Library/BasePcdLibNull/BasePcdLibNull.inf
   }
   ArmPlatformPkg/PlatformPei/PlatformPeim.inf
-  ArmPlatformPkg/MemoryInitPei/MemoryInitPeim.inf
+  ArmVirtPkg/MemoryInitPei/MemoryInitPeim.inf
   ArmPkg/Drivers/CpuPei/CpuPei.inf
 
   MdeModulePkg/Universal/Variable/Pei/VariablePei.inf
diff --git a/ArmVirtPkg/ArmVirtQemu.fdf b/ArmVirtPkg/ArmVirtQemu.fdf
index b5e2253295fe..7f17aeb3ad0d 100644
--- a/ArmVirtPkg/ArmVirtQemu.fdf
+++ b/ArmVirtPkg/ArmVirtQemu.fdf
@@ -107,7 +107,7 @@ [FV.FVMAIN_COMPACT]
   INF ArmPlatformPkg/PrePeiCore/PrePeiCoreUniCore.inf
   INF MdeModulePkg/Core/Pei/PeiMain.inf
   INF ArmPlatformPkg/PlatformPei/PlatformPeim.inf
-  INF ArmPlatformPkg/MemoryInitPei/MemoryInitPeim.inf
+  INF ArmVirtPkg/MemoryInitPei/MemoryInitPeim.inf
   INF ArmPkg/Drivers/CpuPei/CpuPei.inf
   INF MdeModulePkg/Universal/PCD/Pei/Pcd.inf
   INF MdeModulePkg/Universal/Variable/Pei/VariablePei.inf
-- 
2.35.1

