From: "Laszlo Ersek" <lersek@redhat.com>
To: edk2-devel-groups-io <devel@edk2.groups.io>
Cc: Anthony Perard <anthony.perard@citrix.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>,
Igor Mammedov <imammedo@redhat.com>,
Jordan Justen <jordan.l.justen@intel.com>,
Julien Grall <julien.grall@arm.com>
Subject: [PATCH v2 3/3] OvmfPkg/PlatformPei: rewrite MaxCpuCountInitialization() for CPU hotplug
Date: Wed, 23 Oct 2019 00:15:54 +0200 [thread overview]
Message-ID: <20191022221554.14963-4-lersek@redhat.com> (raw)
In-Reply-To: <20191022221554.14963-1-lersek@redhat.com>
MaxCpuCountInitialization() currently handles the following options:
(1) QEMU does not report the boot CPU count (FW_CFG_NB_CPUS is 0)
In this case, PlatformPei makes MpInitLib enumerate APs up to the
default PcdCpuMaxLogicalProcessorNumber value (64) minus 1, or until
the default PcdCpuApInitTimeOutInMicroSeconds (50,000) elapses.
(Whichever is reached first.)
Time-limited AP enumeration had never been reliable on QEMU/KVM, which
is why commit 45a70db3c3a5 strated handling case (2) below, in OVMF.
(2) QEMU reports the boot CPU count (FW_CFG_NB_CPUS is nonzero)
In this case, PlatformPei sets
- PcdCpuMaxLogicalProcessorNumber to the reported boot CPU count
(FW_CFG_NB_CPUS, which exports "PCMachineState.boot_cpus"),
- and PcdCpuApInitTimeOutInMicroSeconds to practically "infinity"
(MAX_UINT32, ~71 minutes).
That causes MpInitLib to enumerate exactly the present (boot) APs.
With CPU hotplug in mind, this method is not good enough. Because,
using QEMU terminology, UefiCpuPkg expects
PcdCpuMaxLogicalProcessorNumber to provide the "possible CPUs" count
("MachineState.smp.max_cpus"), which includes present and not present
CPUs both (with not present CPUs being subject for hot-plugging).
FW_CFG_NB_CPUS does not include not present CPUs.
Rewrite MaxCpuCountInitialization() for handling the following cases:
(1) The behavior of case (1) does not change. (No UefiCpuPkg PCDs are set
to values different from the defaults.)
(2) QEMU reports the boot CPU count ("PCMachineState.boot_cpus", via
FW_CFG_NB_CPUS), but not the possible CPUs count
("MachineState.smp.max_cpus").
In this case, the behavior remains unchanged.
The way MpInitLib is instructed to do the same differs however: we now
set the new PcdCpuBootLogicalProcessorNumber to the boot CPU count
(while continuing to set PcdCpuMaxLogicalProcessorNumber identically).
PcdCpuApInitTimeOutInMicroSeconds becomes irrelevant.
(3) QEMU reports both the boot CPU count ("PCMachineState.boot_cpus", via
FW_CFG_NB_CPUS), and the possible CPUs count
("MachineState.smp.max_cpus").
We tell UefiCpuPkg about the possible CPUs count through
PcdCpuMaxLogicalProcessorNumber. We also tell MpInitLib the boot CPU
count for precise and quick AP enumeration, via
PcdCpuBootLogicalProcessorNumber. PcdCpuApInitTimeOutInMicroSeconds is
irrelevant again.
This patch is a pre-requisite for enabling CPU hotplug with SMM_REQUIRE.
As a side effect, the patch also enables S3 to work with CPU hotplug at
once, *without* SMM_REQUIRE.
(Without the patch, S3 resume fails, if a CPU is hot-plugged at OS
runtime, prior to suspend: the FW_CFG_NB_CPUS increase seen during resume
causes PcdCpuMaxLogicalProcessorNumber to increase as well, which is not
permitted.
With the patch, PcdCpuMaxLogicalProcessorNumber stays the same, namely
"MachineState.smp.max_cpus". Therefore, the CPU structures allocated
during normal boot can accommodate the CPUs at S3 resume that have been
hotplugged prior to S3 suspend.)
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Julien Grall <julien.grall@arm.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
Notes:
v2:
- use "possible CPUs" term in the code and the commit message [Igor]
- add details about S3 to the commit message [Igor]
- use QEMU's existent CPU hotplug register block, rather than a new
named file in fw_cfg [Igor]
- tested on QEMU v2.6.2 (legacy-only CPU hotplug register block), v2.7.1
(modern register block, but buggy fw_cfg), v2.8.1.1 (no QEMU issues),
v4.0.0 (no QEMU issues)
OvmfPkg/OvmfPkgIa32.dsc | 2 +-
OvmfPkg/OvmfPkgIa32X64.dsc | 2 +-
OvmfPkg/OvmfPkgX64.dsc | 2 +-
OvmfPkg/PlatformPei/PlatformPei.inf | 2 +-
OvmfPkg/PlatformPei/Platform.c | 172 +++++++++++++++++---
5 files changed, 151 insertions(+), 29 deletions(-)
diff --git a/OvmfPkg/OvmfPkgIa32.dsc b/OvmfPkg/OvmfPkgIa32.dsc
index 4301e7821902..68d8a9fb9655 100644
--- a/OvmfPkg/OvmfPkgIa32.dsc
+++ b/OvmfPkg/OvmfPkgIa32.dsc
@@ -553,11 +553,11 @@ [PcdsDynamicDefault]
gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE
gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE
# UefiCpuPkg PCDs related to initial AP bringup and general AP management.
gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64
- gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000
+ gUefiCpuPkgTokenSpaceGuid.PcdCpuBootLogicalProcessorNumber|0
# Set memory encryption mask
gEfiMdeModulePkgTokenSpaceGuid.PcdPteMemoryEncryptionAddressOrMask|0x0
!if $(SMM_REQUIRE) == TRUE
diff --git a/OvmfPkg/OvmfPkgIa32X64.dsc b/OvmfPkg/OvmfPkgIa32X64.dsc
index 803fd74ae8e4..e5a6260b6088 100644
--- a/OvmfPkg/OvmfPkgIa32X64.dsc
+++ b/OvmfPkg/OvmfPkgIa32X64.dsc
@@ -565,11 +565,11 @@ [PcdsDynamicDefault]
gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE
gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE
# UefiCpuPkg PCDs related to initial AP bringup and general AP management.
gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64
- gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000
+ gUefiCpuPkgTokenSpaceGuid.PcdCpuBootLogicalProcessorNumber|0
# Set memory encryption mask
gEfiMdeModulePkgTokenSpaceGuid.PcdPteMemoryEncryptionAddressOrMask|0x0
!if $(SMM_REQUIRE) == TRUE
diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
index 5dbd1b793a90..f5d904945103 100644
--- a/OvmfPkg/OvmfPkgX64.dsc
+++ b/OvmfPkg/OvmfPkgX64.dsc
@@ -564,11 +564,11 @@ [PcdsDynamicDefault]
gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE
gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE
# UefiCpuPkg PCDs related to initial AP bringup and general AP management.
gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64
- gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000
+ gUefiCpuPkgTokenSpaceGuid.PcdCpuBootLogicalProcessorNumber|0
# Set memory encryption mask
gEfiMdeModulePkgTokenSpaceGuid.PcdPteMemoryEncryptionAddressOrMask|0x0
!if $(SMM_REQUIRE) == TRUE
diff --git a/OvmfPkg/PlatformPei/PlatformPei.inf b/OvmfPkg/PlatformPei/PlatformPei.inf
index d9fd9c8f05b3..30eaebdfae63 100644
--- a/OvmfPkg/PlatformPei/PlatformPei.inf
+++ b/OvmfPkg/PlatformPei/PlatformPei.inf
@@ -96,11 +96,11 @@ [Pcd]
gEfiMdeModulePkgTokenSpaceGuid.PcdAcpiS3Enable
gEfiMdeModulePkgTokenSpaceGuid.PcdPteMemoryEncryptionAddressOrMask
gEfiSecurityPkgTokenSpaceGuid.PcdOptionRomImageVerificationPolicy
gUefiCpuPkgTokenSpaceGuid.PcdCpuLocalApicBaseAddress
gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber
- gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds
+ gUefiCpuPkgTokenSpaceGuid.PcdCpuBootLogicalProcessorNumber
gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize
[FixedPcd]
gEfiMdePkgTokenSpaceGuid.PcdPciExpressBaseAddress
diff --git a/OvmfPkg/PlatformPei/Platform.c b/OvmfPkg/PlatformPei/Platform.c
index 3ba2459872d9..e5e8581752b5 100644
--- a/OvmfPkg/PlatformPei/Platform.c
+++ b/OvmfPkg/PlatformPei/Platform.c
@@ -28,11 +28,14 @@
#include <Library/QemuFwCfgLib.h>
#include <Library/QemuFwCfgS3Lib.h>
#include <Library/ResourcePublicationLib.h>
#include <Guid/MemoryTypeInformation.h>
#include <Ppi/MasterBootMode.h>
+#include <IndustryStandard/I440FxPiix4.h>
#include <IndustryStandard/Pci22.h>
+#include <IndustryStandard/Q35MchIch9.h>
+#include <IndustryStandard/QemuCpuHotplug.h>
#include <OvmfPlatforms.h>
#include "Platform.h"
#include "Cmos.h"
@@ -562,47 +565,165 @@ S3Verification (
#endif
}
/**
- Fetch the number of boot CPUs from QEMU and expose it to UefiCpuPkg modules.
- Set the mMaxCpuCount variable.
+ Fetch the boot CPU count and the possible CPU count from QEMU, and expose
+ them to UefiCpuPkg modules. Set the mMaxCpuCount variable.
**/
VOID
MaxCpuCountInitialization (
VOID
)
{
- UINT16 ProcessorCount;
+ UINT16 BootCpuCount;
RETURN_STATUS PcdStatus;
+ //
+ // Try to fetch the boot CPU count.
+ //
QemuFwCfgSelectItem (QemuFwCfgItemSmpCpuCount);
- ProcessorCount = QemuFwCfgRead16 ();
- //
- // If the fw_cfg key or fw_cfg entirely is unavailable, load mMaxCpuCount
- // from the PCD default. No change to PCDs.
- //
- if (ProcessorCount == 0) {
+ BootCpuCount = QemuFwCfgRead16 ();
+ if (BootCpuCount == 0) {
+ //
+ // QEMU doesn't report the boot CPU count. (BootCpuCount == 0) will let
+ // MpInitLib count APs up to (PcdCpuMaxLogicalProcessorNumber - 1), or
+ // until PcdCpuApInitTimeOutInMicroSeconds elapses (whichever is reached
+ // first).
+ //
+ DEBUG ((DEBUG_WARN, "%a: boot CPU count unavailable\n", __FUNCTION__));
mMaxCpuCount = PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
- return;
+ } else {
+ //
+ // We will expose BootCpuCount to MpInitLib. MpInitLib will count APs up to
+ // (BootCpuCount - 1) precisely, regardless of timeout.
+ //
+ // Now try to fetch the possible CPU count.
+ //
+ UINTN CpuHpBase;
+ UINT32 CmdData2;
+
+ CpuHpBase = ((mHostBridgeDevId == INTEL_Q35_MCH_DEVICE_ID) ?
+ ICH9_CPU_HOTPLUG_BASE : PIIX4_CPU_HOTPLUG_BASE);
+
+ //
+ // If only legacy mode is available in the CPU hotplug register block, or
+ // the register block is completely missing, then the writes below are
+ // no-ops.
+ //
+ // 1. Switch the hotplug register block to modern mode.
+ //
+ IoWrite32 (CpuHpBase + QEMU_CPUHP_W_CPU_SEL, 0);
+ //
+ // 2. Select a valid CPU for deterministic reading of
+ // QEMU_CPUHP_R_CMD_DATA2.
+ //
+ // CPU#0 is always valid; it is the always present and non-removable
+ // BSP.
+ //
+ IoWrite32 (CpuHpBase + QEMU_CPUHP_W_CPU_SEL, 0);
+ //
+ // 3. Send a command after which QEMU_CPUHP_R_CMD_DATA2 is specified to
+ // read as zero, and which does not invalidate the selector. (The
+ // selector may change, but it must not become invalid.)
+ //
+ // Send QEMU_CPUHP_CMD_GET_PENDING, as it will prove useful later.
+ //
+ IoWrite8 (CpuHpBase + QEMU_CPUHP_W_CMD, QEMU_CPUHP_CMD_GET_PENDING);
+ //
+ // 4. Read QEMU_CPUHP_R_CMD_DATA2.
+ //
+ // If the register block is entirely missing, then this is an unassigned
+ // IO read, returning all-bits-one.
+ //
+ // If only legacy mode is available, then bit#0 stands for CPU#0 in the
+ // "CPU present bitmap". CPU#0 is always present.
+ //
+ // Otherwise, QEMU_CPUHP_R_CMD_DATA2 is either still reserved (returning
+ // all-bits-zero), or it is specified to read as zero after the above
+ // steps. Both cases confirm modern mode.
+ //
+ CmdData2 = IoRead32 (CpuHpBase + QEMU_CPUHP_R_CMD_DATA2);
+ DEBUG ((DEBUG_VERBOSE, "%a: CmdData2=0x%x\n", __FUNCTION__, CmdData2));
+ if (CmdData2 != 0) {
+ //
+ // QEMU doesn't support the modern CPU hotplug interface. Assume that the
+ // possible CPU count equals the boot CPU count (precluding hotplug).
+ //
+ DEBUG ((DEBUG_WARN, "%a: modern CPU hotplug interface unavailable\n",
+ __FUNCTION__));
+ mMaxCpuCount = BootCpuCount;
+ } else {
+ //
+ // Grab the possible CPU count from the modern CPU hotplug interface.
+ //
+ UINT32 Present, Possible, Selected;
+
+ Present = 0;
+ Possible = 0;
+
+ //
+ // We've sent QEMU_CPUHP_CMD_GET_PENDING last; this ensures
+ // QEMU_CPUHP_RW_CMD_DATA can now be read usefully. However,
+ // QEMU_CPUHP_CMD_GET_PENDING may have selected a CPU with actual pending
+ // hotplug events; therefore, select CPU#0 forcibly.
+ //
+ IoWrite32 (CpuHpBase + QEMU_CPUHP_W_CPU_SEL, Possible);
+
+ do {
+ UINT8 CpuStatus;
+
+ //
+ // Read the status of the currently selected CPU. This will help with a
+ // sanity check against "BootCpuCount".
+ //
+ CpuStatus = IoRead8 (CpuHpBase + QEMU_CPUHP_R_CPU_STAT);
+ if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) != 0) {
+ ++Present;
+ }
+ //
+ // Attempt to select the next CPU.
+ //
+ ++Possible;
+ IoWrite32 (CpuHpBase + QEMU_CPUHP_W_CPU_SEL, Possible);
+ //
+ // If the selection is successful, then the following read will return
+ // the selector (which we know is positive at this point). Otherwise,
+ // the read will return 0.
+ //
+ Selected = IoRead32 (CpuHpBase + QEMU_CPUHP_RW_CMD_DATA);
+ ASSERT (Selected == Possible || Selected == 0);
+ } while (Selected > 0);
+
+ //
+ // Sanity check: fw_cfg and the modern CPU hotplug interface should
+ // return the same boot CPU count.
+ //
+ if (BootCpuCount != Present) {
+ DEBUG ((DEBUG_WARN, "%a: QEMU v2.7 reset bug: BootCpuCount=%d "
+ "Present=%u\n", __FUNCTION__, BootCpuCount, Present));
+ //
+ // The handling of QemuFwCfgItemSmpCpuCount, across CPU hotplug plus
+ // platform reset (including S3), was corrected in QEMU commit
+ // e3cadac073a9 ("pc: fix FW_CFG_NB_CPUS to account for -device added
+ // CPUs", 2016-11-16), part of release v2.8.0.
+ //
+ BootCpuCount = (UINT16)Present;
+ }
+
+ mMaxCpuCount = Possible;
+ }
}
- //
- // Otherwise, set mMaxCpuCount to the value reported by QEMU.
- //
- mMaxCpuCount = ProcessorCount;
- //
- // Additionally, tell UefiCpuPkg modules (a) the exact number of VCPUs, (b)
- // to wait, in the initial AP bringup, exactly as long as it takes for all of
- // the APs to report in. For this, we set the longest representable timeout
- // (approx. 71 minutes).
- //
- PcdStatus = PcdSet32S (PcdCpuMaxLogicalProcessorNumber, ProcessorCount);
+
+ DEBUG ((DEBUG_INFO, "%a: BootCpuCount=%d mMaxCpuCount=%u\n", __FUNCTION__,
+ BootCpuCount, mMaxCpuCount));
+ ASSERT (BootCpuCount <= mMaxCpuCount);
+
+ PcdStatus = PcdSet32S (PcdCpuBootLogicalProcessorNumber, BootCpuCount);
ASSERT_RETURN_ERROR (PcdStatus);
- PcdStatus = PcdSet32S (PcdCpuApInitTimeOutInMicroSeconds, MAX_UINT32);
+ PcdStatus = PcdSet32S (PcdCpuMaxLogicalProcessorNumber, mMaxCpuCount);
ASSERT_RETURN_ERROR (PcdStatus);
- DEBUG ((DEBUG_INFO, "%a: QEMU reports %d processor(s)\n", __FUNCTION__,
- ProcessorCount));
}
/**
Perform Platform PEI initialization.
@@ -636,17 +757,18 @@ InitializePlatform (
}
S3Verification ();
BootModeInitialization ();
AddressWidthInitialization ();
- MaxCpuCountInitialization ();
//
// Query Host Bridge DID
//
mHostBridgeDevId = PciRead16 (OVMF_HOSTBRIDGE_DID);
+ MaxCpuCountInitialization ();
+
if (FeaturePcdGet (PcdSmmSmramRequire)) {
Q35TsegMbytesInitialization ();
}
PublishPeiMemory ();
--
2.19.1.3.g30247aa5d201
next prev parent reply other threads:[~2019-10-22 22:16 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-22 22:15 [PATCH v2 0/3] OvmfPkg: distinguish boot CPU count from possible CPU count Laszlo Ersek
2019-10-22 22:15 ` [PATCH v2 1/3] OvmfPkg/OvmfXen.dsc: remove PcdCpu* dynamic defaults Laszlo Ersek
2019-10-24 14:18 ` Anthony PERARD
2019-10-22 22:15 ` [PATCH v2 2/3] OvmfPkg/IndustryStandard: define macros for QEMU's CPU hotplug registers Laszlo Ersek
2019-10-23 12:05 ` [edk2-devel] " Philippe Mathieu-Daudé
2019-10-24 10:29 ` Laszlo Ersek
2019-10-24 15:12 ` Philippe Mathieu-Daudé
2019-10-25 8:21 ` Laszlo Ersek
2020-01-24 11:40 ` Laszlo Ersek
2020-01-29 14:43 ` Philippe Mathieu-Daudé
2020-01-29 17:30 ` Laszlo Ersek
2019-10-22 22:15 ` Laszlo Ersek [this message]
2019-10-24 14:27 ` [PATCH v2 3/3] OvmfPkg/PlatformPei: rewrite MaxCpuCountInitialization() for CPU hotplug Anthony PERARD
2019-10-24 15:28 ` Laszlo Ersek
2019-10-24 15:33 ` [edk2-devel] " Philippe Mathieu-Daudé
2019-10-25 8:29 ` Laszlo Ersek
2019-10-25 9:17 ` Philippe Mathieu-Daudé
2019-10-23 8:52 ` [PATCH v2 0/3] OvmfPkg: distinguish boot CPU count from possible CPU count Ard Biesheuvel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191022221554.14963-4-lersek@redhat.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox