From: "Ard Biesheuvel"
To: devel@edk2.groups.io
Cc: Ard Biesheuvel, Leif Lindholm, Alexander Graf
Subject: [PATCH v3 00/16] ArmVirtPkg/ArmVirtQemu: Performance streamlining
Date: Mon, 26 Sep 2022 10:23:57 +0200
Message-Id: <20220926082357.2110760-1-ardb@kernel.org>

We currently do a substantial amount of processing before enabling the
MMU and caches, which is bad for performance, but also fragile, as it
requires cache coherency to be managed in software.
It also means that when running under virtualization, the hypervisor
must do a non-trivial amount of work to ensure that the host's cached
view of memory is consistent with the guest's uncached view.

So let's update the ArmVirtQemu early boot sequence to improve the
situation:
- modify the page table building logic to avoid the MMU disable/enable
  unless really necessary, i.e., only when the entry in question maps
  itself, or the code that performs the actual update;
- map any regions that cover page tables in memory eagerly down to
  pages, so that we will not need to split them later, and be forced to
  go through the MMU-off path to unmap and remap them;
- allow the asm helper routine that lives in the MemoryInit XIP PEIM to
  be exposed via a HOB so we can fall back to it from DXE;
- use a compile time generated ID map that covers the first bank of NOR
  flash, the first MMIO region (for the UART), and the first 128 MiB of
  DRAM, and switch to it straight out of reset.

The resulting build no longer performs any non-coherent memory accesses
via the data side, and only relies on instruction fetches before the MMU
is enabled. It also avoids any cache maintenance to the PoC.
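
As a rough illustration of the "only when the entry in question maps
itself, or the code that performs the actual update" condition, the
check could look something like the sketch below. This is not code from
the series: NeedsMmuOffUpdate and RangesOverlap are hypothetical names,
and the sketch assumes an identity mapping, so that the address of the
entry and of the update routine can be compared directly against the
range the entry currently maps.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the series): returns true when the
 * half-open ranges [BaseA, BaseA + SizeA) and [BaseB, BaseB + SizeB)
 * share at least one byte. Assumes neither range wraps past 2^64.
 */
static bool
RangesOverlap (uint64_t BaseA, uint64_t SizeA, uint64_t BaseB, uint64_t SizeB)
{
  return BaseA < BaseB + SizeB && BaseB < BaseA + SizeA;
}

/*
 * Hypothetical decision function (not part of the series).
 *
 * EntryAddress    address of the live 8-byte descriptor being updated
 * RegionBase/Size range that the descriptor currently maps
 * UpdateCodeBase/
 * UpdateCodeSize  range occupied by the routine performing the update
 *
 * Assumes an identity map (VA == PA). Returns true when breaking the
 * entry would unmap either the entry itself or the code updating it,
 * in which case the update cannot be done with the MMU enabled.
 */
static bool
NeedsMmuOffUpdate (
  uint64_t EntryAddress,
  uint64_t RegionBase,
  uint64_t RegionSize,
  uint64_t UpdateCodeBase,
  uint64_t UpdateCodeSize
  )
{
  /* Case 1: the entry maps the memory that holds the entry itself. */
  if (RangesOverlap (RegionBase, RegionSize, EntryAddress, sizeof (uint64_t))) {
    return true;
  }

  /* Case 2: the entry maps the code that performs the actual update. */
  if (RangesOverlap (RegionBase, RegionSize, UpdateCodeBase, UpdateCodeSize)) {
    return true;
  }

  /* Otherwise the entry can be updated in place with the MMU on. */
  return false;
}
```

In every other case, the entry can be invalidated and rewritten while
the MMU and caches stay enabled.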

Changes since v2:
- drop shadow page table approach - it only works at EL1, and is a bit
  more intrusive than needed; instead, do a proper break-before-make
  (BBM) unless the break unmaps the page table itself or the code that
  is modifying it;
- add a couple of only tangentially related performance streamlining
  changes, to avoid dispatching and shadowing drivers that we don't need

Changes since v1:
- coding style tweaks to placate our CI overlord
- drop -mstrict-align which is no longer needed now that all C code runs
  with the MMU and caches on

Cc: Leif Lindholm
Cc: Alexander Graf

Ard Biesheuvel (16):
  ArmVirtPkg: remove EbcDxe from all platforms
  ArmVirtPkg: do not enable iSCSI driver by default
  ArmVirtPkg: make EFI_LOADER_DATA non-executable
  ArmVirtPkg/ArmVirtQemu: wire up timeout PCD to Timeout variable
  ArmPkg/ArmMmuLib: don't replace table entries with block entries
  ArmPkg/ArmMmuLib: Disable and re-enable MMU only when needed
  ArmPkg/ArmMmuLib: permit initial configuration with MMU enabled
  ArmPkg/ArmMmuLib: Reuse XIP MMU routines when splitting entries
  ArmPlatformPkg/PrePeiCore: permit entry with the MMU enabled
  ArmVirtPkg/ArmVirtQemu: implement ArmPlatformLib with static ID map
  ArmVirtPkg/ArmVirtQemu: use first 128 MiB as permanent PEI memory
  ArmVirtPkg/ArmVirtQemu: enable initial ID map at early boot
  ArmVirtPkg/ArmVirtQemu: Drop unused variable PEIM
  ArmVirtPkg/ArmVirtQemu: avoid shadowing PEIMs unless necessary
  ArmVirtPkg/QemuVirtMemInfoLib: use HOB not PCD to record the memory size
  ArmVirtPkg/ArmVirtQemu: omit PCD PEIM unless TPM support is enabled

 ArmPkg/ArmPkg.dec | 2 +
 ArmPkg/Include/Library/ArmMmuLib.h | 7 +-
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c | 191 +++++++++++++-------
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 43 ++++-
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuPeiLibConstructor.c | 17 ++
 ArmPkg/Library/ArmMmuLib/ArmMmuBaseLib.inf | 4 +
 ArmPkg/Library/ArmMmuLib/ArmMmuPeiLib.inf | 4 +
 ArmPlatformPkg/PrePeiCore/PrePeiCore.c | 22 ++-
 ArmVirtPkg/ArmVirt.dsc.inc | 7 +-
 ArmVirtPkg/ArmVirtCloudHv.fdf | 5 -
 ArmVirtPkg/ArmVirtPkg.dec | 1 +
 ArmVirtPkg/ArmVirtQemu.dsc | 53 ++++--
 ArmVirtPkg/ArmVirtQemu.fdf | 5 +-
 ArmVirtPkg/ArmVirtQemuFvMain.fdf.inc | 5 -
 ArmVirtPkg/ArmVirtQemuKernel.dsc | 1 -
 ArmVirtPkg/ArmVirtXen.fdf | 5 -
 ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S | 115 ++++++++++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.c | 64 +++++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf | 40 ++++
 ArmVirtPkg/Library/ArmPlatformLibQemu/IdMap.S | 57 ++++++
 ArmVirtPkg/Library/ArmVirtMemoryInitPeiLib/ArmVirtMemoryInitPeiLib.c | 14 +-
 ArmVirtPkg/Library/ArmVirtMemoryInitPeiLib/ArmVirtMemoryInitPeiLib.inf | 1 +
 ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoLib.c | 35 +++-
 ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoLib.inf | 5 +-
 ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLib.inf | 8 +-
 ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoPeiLibConstructor.c | 30 +--
 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.c | 104 +++++++++++
 ArmVirtPkg/{Library/ArmVirtMemoryInitPeiLib/ArmVirtMemoryInitPeiLib.inf => MemoryInitPei/MemoryInitPeim.inf} | 36 ++--
 28 files changed, 714 insertions(+), 167 deletions(-)
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.c
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/ArmPlatformLibQemu.inf
 create mode 100644 ArmVirtPkg/Library/ArmPlatformLibQemu/IdMap.S
 create mode 100644 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.c
 copy ArmVirtPkg/{Library/ArmVirtMemoryInitPeiLib/ArmVirtMemoryInitPeiLib.inf => MemoryInitPei/MemoryInitPeim.inf} (64%)

-- 
2.35.1