public inbox for devel@edk2.groups.io
* [PATCH v3 00/16] ArmVirtPkg/ArmVirtQemu: Performance streamlining
From: Ard Biesheuvel @ 2022-09-26  8:24 UTC
  To: devel; +Cc: Ard Biesheuvel, Leif Lindholm, Alexander Graf

We currently do a substantial amount of processing before enabling the
MMU and caches, which is bad for performance, but also fragile, as it
requires cache coherency to be managed in software.

It also means that when running under virtualization, the hypervisor
must do a non-trivial amount of work to ensure that the host's cached
view of memory is consistent with the guest's uncached view.

So let's update the ArmVirtQemu early boot sequence to improve the
situation:
- modify the page table building logic to avoid the MMU disable/enable
  unless really necessary, i.e., only when the entry in question maps
  itself, or the code that performs the actual update;
- map any regions that cover page tables in memory eagerly down to
  pages, so that we will not need to split them later, and be forced to
  go through the MMU-off path to unmap and remap them;
- allow the asm helper routine that lives in the MemoryInit XIP PEIM to
  be exposed via a HOB so we can fall back to it from DXE;
- use a compile-time generated ID map that covers the first bank of NOR
  flash, the first MMIO region (for the UART), and the first 128 MiB of
  DRAM, and switch to it straight out of reset.
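
For illustration only, here is a minimal C sketch of what the static ID
map in the last item covers. The base addresses and sizes are
assumptions based on the usual QEMU 'virt' memory layout (NOR flash at
0x0 with 64 MiB banks, UART at 0x09000000, DRAM at 0x40000000), and the
names ID_MAP_REGION, mInitialIdMap and InitialIdMapLookup are
hypothetical. The series itself builds the translation tables at compile
time in assembly (IdMap.S), not from a C table like this one.

```c
#include <stddef.h>
#include <stdint.h>

/* Kind of region covered by the initial ID map (hypothetical names). */
typedef enum {
  ID_MAP_UNMAPPED = 0,
  ID_MAP_NOR_FLASH,
  ID_MAP_MMIO,
  ID_MAP_DRAM
} ID_MAP_KIND;

typedef struct {
  uint64_t     Base;   /* identity mapped: virtual == physical */
  uint64_t     Size;
  ID_MAP_KIND  Kind;
} ID_MAP_REGION;

/* ASSUMPTION: addresses and sizes follow the QEMU 'virt' layout. */
static const ID_MAP_REGION mInitialIdMap[] = {
  { 0x00000000ULL, 64ULL << 20,  ID_MAP_NOR_FLASH }, /* first NOR flash bank   */
  { 0x09000000ULL, 0x1000ULL,    ID_MAP_MMIO      }, /* first MMIO region/UART */
  { 0x40000000ULL, 128ULL << 20, ID_MAP_DRAM      }, /* first 128 MiB of DRAM  */
};

/* Return the kind of region that covers Address, or ID_MAP_UNMAPPED. */
ID_MAP_KIND
InitialIdMapLookup (uint64_t Address)
{
  size_t  Index;

  for (Index = 0; Index < sizeof (mInitialIdMap) / sizeof (mInitialIdMap[0]); Index++) {
    const ID_MAP_REGION  *Region = &mInitialIdMap[Index];

    /* Subtraction form avoids overflow of Base + Size. */
    if ((Address >= Region->Base) && (Address - Region->Base < Region->Size)) {
      return Region->Kind;
    }
  }

  return ID_MAP_UNMAPPED;
}
```

Anything outside these three windows is not reachable until the full
page tables have been built, which is why the first 128 MiB of DRAM also
serve as permanent PEI memory later in the series.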

The resulting build no longer performs any non-coherent memory accesses
via the data side, and only relies on instruction fetches before the MMU
is enabled. It also avoids any cache maintenance to the PoC.

Changes since v2:
- drop shadow page table approach - it only works at EL1, and is a bit
  more intrusive than needed; instead, do a proper break-before-make
  (BBM) unless the break unmaps the page table itself or the code that
  is modifying it;
- add a couple of only tangentially related performance streamlining
  changes, to avoid dispatching and shadowing drivers that we don't need
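
The BBM rule in the first item above can be sketched in C roughly as
follows. This is a simplified illustration, not the ArmMmuLib code: the
names ADDRESS_RANGE, UPDATE_METHOD and ChooseUpdateMethod are
hypothetical, and all ranges are assumed to have a non-zero size.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
  uint64_t  Base;
  uint64_t  Size;   /* assumed non-zero */
} ADDRESS_RANGE;

typedef enum {
  UPDATE_LIVE_BBM,  /* break-before-make with the MMU left enabled  */
  UPDATE_MMU_OFF    /* fall back to the MMU disable/enable helper   */
} UPDATE_METHOD;

/* True if the two ranges share at least one byte (overflow-safe). */
static bool
RangesOverlap (const ADDRESS_RANGE *A, const ADDRESS_RANGE *B)
{
  if (A->Base >= B->Base) {
    return A->Base - B->Base < B->Size;
  }

  return B->Base - A->Base < A->Size;
}

/*
 * Mapped:     region translated by the entry that is about to be replaced
 * PageTable:  memory holding the page table being modified
 * UpdateCode: memory holding the code that performs the update
 */
UPDATE_METHOD
ChooseUpdateMethod (
  const ADDRESS_RANGE  *Mapped,
  const ADDRESS_RANGE  *PageTable,
  const ADDRESS_RANGE  *UpdateCode
  )
{
  /* The "break" step would unmap the table or the running code. */
  if (RangesOverlap (Mapped, PageTable) || RangesOverlap (Mapped, UpdateCode)) {
    return UPDATE_MMU_OFF;
  }

  return UPDATE_LIVE_BBM;
}
```

Mapping the regions that hold page tables down to pages up front keeps
the number of entries that hit the UPDATE_MMU_OFF case small.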

Changes since v1:
- coding style tweaks to placate our CI overlord
- drop -mstrict-align which is no longer needed now that all C code runs
  with the MMU and caches on

Cc: Leif Lindholm <quic_llindhol@quicinc.com>
Cc: Alexander Graf <agraf@csgraf.de>

Ard Biesheuvel (16):
  ArmVirtPkg: remove EbcDxe from all platforms
  ArmVirtPkg: do not enable iSCSI driver by default
  ArmVirtPkg: make EFI_LOADER_DATA non-executable
  ArmVirtPkg/ArmVirtQemu: wire up timeout PCD to Timeout variable
  ArmPkg/ArmMmuLib: don't replace table entries with block entries
  ArmPkg/ArmMmuLib: Disable and re-enable MMU only when needed
  ArmPkg/ArmMmuLib: permit initial configuration with MMU enabled
  ArmPkg/ArmMmuLib: Reuse XIP MMU routines when splitting entries
  ArmPlatformPkg/PrePeiCore: permit entry with the MMU enabled
  ArmVirtPkg/ArmVirtQemu: implement ArmPlatformLib with static ID map
  ArmVirtPkg/ArmVirtQemu: use first 128 MiB as permanent PEI memory
  ArmVirtPkg/ArmVirtQemu: enable initial ID map at early boot
  ArmVirtPkg/ArmVirtQemu: Drop unused variable PEIM
  ArmVirtPkg/ArmVirtQemu: avoid shadowing PEIMs unless necessary
  ArmVirtPkg/QemuVirtMemInfoLib: use HOB not PCD to record the memory
    size
  ArmVirtPkg/ArmVirtQemu: omit PCD PEIM unless TPM support is enabled

 ArmPkg/ArmPkg.dec                                                                                            |   2 +
 ArmPkg/Include/Library/ArmMmuLib.h                                                                           |   7 +-
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c                                                             | 191 +++++++++++++-------
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S                                                     |  43 ++++-
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuPeiLibConstructor.c                                                   |  17 ++
 ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf                                                                   |   4 +
 ArmPkg/Library/ArmMmuLib/ArmMmuPeiLib.inf                                                                    |   4 +
 ArmPlatformPkg/PrePeiCore/PrePeiCore.c                                                                       |  22 ++-
 ArmVirtPkg/ArmVirt.dsc.inc                                                                                   |   7 +-
 ArmVirtPkg/ArmVirtCloudHv.fdf                                                                                |   5 -
 ArmVirtPkg/ArmVirtPkg.dec                                                                                    |   1 +
 ArmVirtPkg/ArmVirtQemu.dsc                                                                                   |  53 ++++--
 ArmVirtPkg/ArmVirtQemu.fdf                                                                                   |   5 +-
 ArmVirtPkg/ArmVirtQemuFvMain.fdf.inc                                                                         |   5 -
 ArmVirtPkg/ArmVirtQemuKernel.dsc                                                                             |   1 -
 ArmVirtPkg/ArmVirtXen.fdf                                                                                    |   5 -
 ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S                                            | 115 ++++++++++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.c                                                   |  64 +++++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf                                                 |  40 ++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/IdMap.S                                                                |  57 ++++++
 ArmVirtPkg/Library/ArmVirtMemoryInitPeiLib/ArmVirtMemoryInitPeiLib.c                                         |  14 +-
 ArmVirtPkg/Library/ArmVirtMemoryInitPeiLib/ArmVirtMemoryInitPeiLib.inf                                       |   1 +
 ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoLib.c                                                   |  35 +++-
 ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoLib.inf                                                 |   5 +-
 ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLib.inf                                              |   8 +-
 ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c                                     |  30 +--
 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.c                                                                    | 104 +++++++++++
 ArmVirtPkg/{Library/ArmVirtMemoryInitPeiLib/ArmVirtMemoryInitPeiLib.inf => MemoryInitPei/MemoryInitPeim.inf} |  36 ++--
 28 files changed, 714 insertions(+), 167 deletions(-)
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.c
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/IdMap.S
 create mode 100644 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.c
 copy ArmVirtPkg/{Library/ArmVirtMemoryInitPeiLib/ArmVirtMemoryInitPeiLib.inf => MemoryInitPei/MemoryInitPeim.inf} (64%)

-- 
2.35.1


