From: "Laszlo Ersek" <lersek@redhat.com>
To: devel@edk2.groups.io, ankur.a.arora@oracle.com
Cc: imammedo@redhat.com, boris.ostrovsky@oracle.com,
Jordan Justen <jordan.l.justen@intel.com>,
Ard Biesheuvel <ard.biesheuvel@arm.com>,
Aaron Young <aaron.young@oracle.com>
Subject: Re: [edk2-devel] [PATCH v8 08/10] OvmfPkg/CpuHotplugSmm: add EjectCpu()
Date: Tue, 23 Feb 2021 21:36:35 +0100 [thread overview]
Message-ID: <6e37922f-afbe-2b41-4eb9-4b3c62d28aa9@redhat.com> (raw)
In-Reply-To: <20210222071928.1401820-9-ankur.a.arora@oracle.com>
superficial comments only; the patch is nearly ready:
On 02/22/21 08:19, Ankur Arora wrote:
> Add EjectCpu(), which handles the CPU ejection, and provides a holding
> area for said CPUs. It is called via SmmCpuFeaturesRendezvousExit(),
> at the tail end of the SMI handling.
>
> Also UnplugCpus() now stashes QEMU Selectors of CPUs which need to be
> ejected in CPU_HOT_EJECT_DATA.QemuSelectorMap. This is used by
> EjectCpu() to identify CPUs marked for ejection.
>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>
> Notes:
> Address these review comments from v6:
> (1) s/CpuEject/EjectCpu/g
> (2) Ensure that the added include is in sorted order.
> (3) Switch to a cheaper CpuSleep() based loop instead of
> CpuDeadLoop(). Also add the CpuLib LibraryClass.
> (4) Remove the nested else clause
> (5) Use Laszlo's much clearer comment when we try to map multiple
> QemuSelector to the same ProcessorNum.
> (6a) Fix indentation of the debug print in the block in (5).
> (6b,6c,6d) Fix printf types for ProcessorNum, use FMT_APIC_ID for
> APIC_ID and 0x%Lx for QemuSelector[].
> () As discussed elsewhere add an DEBUG_INFO print logging the
> correspondence between ProcessorNum, APIC_ID, QemuSelector.
> (7a,7b) Use EFI_ALREADY_STARTED instead of EFI_INVALID_PARAMETER and
> document it in the UnplugCpus() comment block.
> () As discussed elsewhere, add the import statement for
> PcdCpuHotEjectDataAddress.
> (9) Use Laszlo's comment in the PcdGet64(PcdCpuHotEjectDataAddress)
> description block.
> (10) Change mCpuHotEjectData init state checks from ASSERT to ones
> consistent with similar checks for mCpuHotPlugData.
> (11-14) Get rid of mCpuHotEjectData init loop: moved to a prior
> patch so it can be done at allocation time.
> (15) s/SmmCpuFeaturesSmiRendezvousExit/SmmCpuFeaturesRendezvousExit/
> (16,17) Document the ordering requirements of
> mCpuHotEjectData->Handler, and mCpuHotEjectData->QemuSelectorMap.
>
> Not addressed:
> (8) Not removing the EjectCount variable as I'd like to minimize
> stores/loads to CPU_HOT_EJECT_DATA->Handler and so would like to do this
> a single time at the end of the iteration. (It is safe to write multiple
> times to the handler in UnplugCpus() but given the ordering concerns
> around it, it seems cleaner to not access it unnecessarily.)
>
> OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf | 2 +
> OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 157 ++++++++++++++++++++++++++++++--
> 2 files changed, 151 insertions(+), 8 deletions(-)
>
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
> index 04322b0d7855..ebcc7e2ac63a 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
> @@ -40,6 +40,7 @@ [Packages]
> [LibraryClasses]
> BaseLib
> BaseMemoryLib
> + CpuLib
> DebugLib
> LocalApicLib
> MmServicesTableLib
> @@ -54,6 +55,7 @@ [Protocols]
>
> [Pcd]
> gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress ## CONSUMES
> + gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress ## CONSUMES
> gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase ## CONSUMES
>
> [FeaturePcd]
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> index f07b5072749a..851e2b28aad9 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> @@ -10,10 +10,12 @@
> #include <IndustryStandard/Q35MchIch9.h> // ICH9_APM_CNT
> #include <IndustryStandard/QemuCpuHotplug.h> // QEMU_CPUHP_CMD_GET_PENDING
> #include <Library/BaseLib.h> // CpuDeadLoop()
> +#include <Library/CpuLib.h> // CpuSleep()
> #include <Library/DebugLib.h> // ASSERT()
> #include <Library/MmServicesTableLib.h> // gMmst
> #include <Library/PcdLib.h> // PcdGetBool()
> #include <Library/SafeIntLib.h> // SafeUintnSub()
> +#include <Pcd/CpuHotEjectData.h> // CPU_HOT_EJECT_DATA
> #include <Protocol/MmCpuIo.h> // EFI_MM_CPU_IO_PROTOCOL
> #include <Protocol/SmmCpuService.h> // EFI_SMM_CPU_SERVICE_PROTOCOL
> #include <Uefi/UefiBaseType.h> // EFI_STATUS
> @@ -32,11 +34,12 @@ STATIC EFI_MM_CPU_IO_PROTOCOL *mMmCpuIo;
> //
> STATIC EFI_SMM_CPU_SERVICE_PROTOCOL *mMmCpuService;
> //
> -// This structure is a communication side-channel between the
> +// These structures serve as communication side-channels between the
> // EFI_SMM_CPU_SERVICE_PROTOCOL consumer (i.e., this driver) and provider
> // (i.e., PiSmmCpuDxeSmm).
> //
> STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
> +STATIC CPU_HOT_EJECT_DATA *mCpuHotEjectData;
> //
> // SMRAM arrays for fetching the APIC IDs of processors with pending events (of
> // known event types), for the time of just one MMI.
> @@ -188,18 +191,72 @@ RevokeNewSlot:
> }
>
> /**
> + CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit()
> + on each CPU at exit from SMM.
> +
> + If, the executing CPU is not being ejected, nothing to be done.
> + If, the executing CPU is being ejected, wait in a halted loop
> + until ejected.
> +
> + @param[in] ProcessorNum ProcessorNum denotes the CPU exiting SMM,
> + and will be used as an index into
> + CPU_HOT_EJECT_DATA->QemuSelectorMap. It is
> + identical to the processor handle number in
> + EFI_SMM_CPU_SERVICE_PROTOCOL.
> +**/
> +VOID
> +EFIAPI
> +EjectCpu (
> + IN UINTN ProcessorNum
> + )
> +{
> + UINT64 QemuSelector;
> +
> + QemuSelector = mCpuHotEjectData->QemuSelectorMap[ProcessorNum];
> + if (QemuSelector == CPU_EJECT_QEMU_SELECTOR_INVALID) {
> + return;
> + }
> +
> + //
> + // CPU(s) being unplugged get here from SmmCpuFeaturesRendezvousExit()
> + // after having been cleared to exit the SMI by the BSP and thus have
> + // no SMM processing remaining.
> + //
> + // Given that we cannot allow them to escape to the guest, we pen them
> + // here until the BSP tells QEMU to unplug them.
> + //
> + for (;;) {
> + DisableInterrupts ();
> + CpuSleep ();
> + }
> +}
> +
> +/**
> Process to be hot-unplugged CPUs, per QemuCpuhpCollectApicIds().
>
> For each such CPU, report the CPU to PiSmmCpuDxeSmm via
> - EFI_SMM_CPU_SERVICE_PROTOCOL. If the to be hot-unplugged CPU is
> - unknown, skip it silently.
> + EFI_SMM_CPU_SERVICE_PROTOCOL and stash the APIC ID for later ejection.
> + If the to be hot-unplugged CPU is unknown, skip it silently.
> +
> + Additonally, if we do stash any APIC IDs, also install a CPU eject handler
> + which would handle the ejection.
(1) Please update the two APIC ID references above to QEMU CPU selectors
(the commit message has been updated already, correctly).
>
> @param[in] ToUnplugApicIds The APIC IDs of the CPUs that are about to be
> hot-unplugged.
>
> + @param[in] ToUnplugSelector The QEMU Selectors of the CPUs that are about to
> + be hot-unplugged.
> +
(2) Please rename the parameter to "ToUnplugSelectors" (plural), both
here in the comment and in the actual parameter list.
> @param[in] ToUnplugCount The number of filled-in APIC IDs in
> ToUnplugApicIds.
>
> + @retval EFI_ALREADY_STARTED For the ProcessorNum that
> + EFI_SMM_CPU_SERVICE_PROTOCOL had assigned to
> + one of the APIC ID in ToUnplugApicIds,
> + mCpuHotEjectData->QemuSelectorMap already has
> + the QemuSelector value stashed. (This should
> + never happen.)
> +
(3) Unfortunately I made a typing error in my v6 review where I
suggested this language: it should say
one of the APIC ID*s* in ToUnplugApicIds
(emphasis only here, in this comment)
> @retval EFI_SUCCESS Known APIC IDs have been removed from SMM data
> structures.
>
> @@ -210,23 +267,36 @@ STATIC
> EFI_STATUS
> UnplugCpus (
> IN APIC_ID *ToUnplugApicIds,
> + IN UINT32 *ToUnplugSelector,
> IN UINT32 ToUnplugCount
> )
> {
> EFI_STATUS Status;
> UINT32 ToUnplugIdx;
> + UINT32 EjectCount;
> UINTN ProcessorNum;
>
> ToUnplugIdx = 0;
> + EjectCount = 0;
> while (ToUnplugIdx < ToUnplugCount) {
> APIC_ID RemoveApicId;
> + UINT32 QemuSelector;
>
> RemoveApicId = ToUnplugApicIds[ToUnplugIdx];
> + QemuSelector = ToUnplugSelector[ToUnplugIdx];
>
> //
> - // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use it to find
> - // the ProcessorNum for the APIC ID to be removed.
> + // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use RemoveApicId
> + // to find the corresponding ProcessorNum for the CPU to be removed.
> //
> + // With this we can establish a 3 way mapping:
> + // APIC_ID -- ProcessorNum -- QemuSelector
> + //
> + // We stash the ProcessorNum -> QemuSelector mapping so it can later be
> + // used for CPU hot-eject in SmmCpuFeaturesRendezvousExit() context (where
> + // we only have ProcessorNum available.)
> + //
> +
> for (ProcessorNum = 0;
> ProcessorNum < mCpuHotPlugData->ArrayLength;
> ProcessorNum++) {
> @@ -255,11 +325,64 @@ UnplugCpus (
> return Status;
> }
>
> + if (mCpuHotEjectData->QemuSelectorMap[ProcessorNum] !=
> + CPU_EJECT_QEMU_SELECTOR_INVALID) {
(4) Invalid indentation; in this (controlling expression) context,
CPU_EJECT_QEMU_SELECTOR_INVALID should line up with "mCpuHotEjectData".
> + //
> + // mCpuHotEjectData->QemuSelectorMap[ProcessorNum] is set to
> + // CPU_EJECT_QEMU_SELECTOR_INVALID when mCpuHotEjectData->QemuSelectorMap
> + // is allocated, and once the subject processsor is ejected.
> + //
> + // Additionally, mMmCpuService->RemoveProcessor(ProcessorNum) invalidates
> + // mCpuHotPlugData->ApicId[ProcessorNum], so a given ProcessorNum can
> + // never match more than one APIC ID and by transitivity, more than one
> + // QemuSelector in a single invocation of UnplugCpus().
> + //
(5) good comment, but please make it a tiny bit easier to read: please
insert two "em dashes", as follows:
// never match more than one APIC ID -- and by transitivity, designate
// more than one QemuSelector -- in a single invocation of UnplugCpus().
(I've also inserted the verb "designate", because "match" doesn't seem
to apply in that context.)
> + DEBUG ((DEBUG_ERROR, "%a: ProcessorNum %Lu maps to QemuSelector 0x%Lx, "
> + "cannot also map to 0x%Lx\n", __FUNCTION__, (UINT64)ProcessorNum,
> + (UINT64)mCpuHotEjectData->QemuSelectorMap[ProcessorNum], QemuSelector));
(6a) "mCpuHotEjectData->QemuSelectorMap[ProcessorNum]" has type UINT64,
so the cast is superfluous
(6b) we log selectors in all other places in decimal, so please use %Lu
here, not 0x%Lx, for logging
"mCpuHotEjectData->QemuSelectorMap[ProcessorNum]"
(6c) "QemuSelector" has type UINT32 (correctly), so we need to log it
with either "0x%x" or "%u". Given my above point about logging selectors
in decimal everywhere else, please use "%u".
DEBUG ((DEBUG_ERROR, "%a: ProcessorNum %Lu maps to QemuSelector %Lu, "
"cannot also map to %u\n", __FUNCTION__, (UINT64)ProcessorNum,
mCpuHotEjectData->QemuSelectorMap[ProcessorNum], QemuSelector));
> +
> + Status = EFI_ALREADY_STARTED;
> + return Status;
> + }
(7) style nit -- this looks better as "return EFI_ALREADY_STARTED".
> +
> + //
> + // Stash the QemuSelector so we can do the actual ejection later.
> + //
> + mCpuHotEjectData->QemuSelectorMap[ProcessorNum] = (UINT64)QemuSelector;
> +
> + DEBUG ((DEBUG_INFO, "%a: Started hot-unplug on ProcessorNum %Lu, APIC ID "
> + FMT_APIC_ID ", QemuSelector 0x%Lx\n", __FUNCTION__, (UINT64)ProcessorNum,
> + RemoveApicId, mCpuHotEjectData->QemuSelectorMap[ProcessorNum]));
(8a) In the format string, please replace "QemuSelector 0x%Lx" with
"QemuSelector %u" -- the main point is the decimal notation
(8b) As a suggestion, I think we should simplify this DEBUG call by
replacing the "mCpuHotEjectData->QemuSelectorMap[ProcessorNum]" argument
with just "QemuSelector".
DEBUG ((DEBUG_INFO, "%a: Started hot-unplug on ProcessorNum %Lu, APIC ID "
FMT_APIC_ID ", QemuSelector %u\n", __FUNCTION__, (UINT64)ProcessorNum,
RemoveApicId, QemuSelector));
> +
> + EjectCount++;
(I'm fine with keeping the "EjectCount" variable.)
> ToUnplugIdx++;
> }
>
> + if (EjectCount != 0) {
> + //
> + // We have processors to be ejected; install the handler.
> + //
> + mCpuHotEjectData->Handler = EjectCpu;
> +
> + //
> + // The BSP, CPUs to be ejected dereference mCpuHotEjectData->Handler, and
> + // mCpuHotEjectData->QemuSelectorMap[] in SmmCpuFeaturesRendezvousExit().
> + //
> + // Assignments to both of these are ordered-before the BSP's SMI exit signal
> + // which happens via a write to SMM_DISPATCHER_MP_SYNC_DATA->AllCpusInSync.
> + // Dereferences of both are ordered-after the synchronization via
> + // "AllCpusInSync".
> + //
> + // So we are guaranteed that the Handler would see the assignments above.
> + // However, add a MemoryFence() here in-lieu of a compiler barrier to
> + // ensure that the compiler doesn't monkey around with the stores.
> + //
> + MemoryFence ();
> + }
> +
(9) Per previous discussion (under patch v8 #7), please replace the last
paragraph of this comment, with a "ReleaseMemoryFence" reference.
The rest looks good.
Thanks!
Laszlo
> //
> - // We've removed this set of APIC IDs from SMM data structures.
> + // We've removed this set of APIC IDs from SMM data structures and
> + // have installed an ejection handler if needed.
> //
> return EFI_SUCCESS;
> }
> @@ -387,7 +510,7 @@ CpuHotplugMmi (
> }
>
> if (ToUnplugCount > 0) {
> - Status = UnplugCpus (mToUnplugApicIds, ToUnplugCount);
> + Status = UnplugCpus (mToUnplugApicIds, mToUnplugSelector, ToUnplugCount);
> if (EFI_ERROR (Status)) {
> goto Fatal;
> }
> @@ -458,9 +581,14 @@ CpuHotplugEntry (
>
> //
> // Our DEPEX on EFI_SMM_CPU_SERVICE_PROTOCOL guarantees that PiSmmCpuDxeSmm
> - // has pointed PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM.
> + // has pointed:
> + // - PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM,
> + // - PcdCpuHotEjectDataAddress to CPU_HOT_EJECT_DATA in SMRAM, if the
> + // possible CPU count is greater than 1.
> //
> mCpuHotPlugData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotPlugDataAddress);
> + mCpuHotEjectData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotEjectDataAddress);
> +
> if (mCpuHotPlugData == NULL) {
> Status = EFI_NOT_FOUND;
> DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_PLUG_DATA: %r\n", __FUNCTION__, Status));
> @@ -472,6 +600,19 @@ CpuHotplugEntry (
> if (mCpuHotPlugData->ArrayLength == 1) {
> return EFI_UNSUPPORTED;
> }
> +
> + if (mCpuHotEjectData == NULL) {
> + Status = EFI_NOT_FOUND;
> + } else if (mCpuHotPlugData->ArrayLength != mCpuHotEjectData->ArrayLength) {
> + Status = EFI_INVALID_PARAMETER;
> + } else {
> + Status = EFI_SUCCESS;
> + }
> + if (EFI_ERROR (Status)) {
> + DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_EJECT_DATA: %r\n", __FUNCTION__, Status));
> + goto Fatal;
> + }
> +
> //
> // Allocate the data structures that depend on the possible CPU count.
> //
>
next prev parent reply other threads:[~2021-02-23 20:39 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-22 7:19 [PATCH v8 00/10] support CPU hot-unplug Ankur Arora
2021-02-22 7:19 ` [PATCH v8 01/10] OvmfPkg/CpuHotplugSmm: refactor hotplug logic Ankur Arora
2021-02-22 11:49 ` [edk2-devel] " Laszlo Ersek
2021-02-22 7:19 ` [PATCH v8 02/10] OvmfPkg/CpuHotplugSmm: collect hot-unplug events Ankur Arora
2021-02-22 12:27 ` [edk2-devel] " Laszlo Ersek
2021-02-22 22:03 ` Ankur Arora
2021-02-23 16:44 ` Laszlo Ersek
2021-02-22 7:19 ` [PATCH v8 03/10] OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper Ankur Arora
2021-02-22 12:31 ` [edk2-devel] " Laszlo Ersek
2021-02-22 22:22 ` Ankur Arora
2021-02-22 7:19 ` [PATCH v8 04/10] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus() Ankur Arora
2021-02-22 12:39 ` [edk2-devel] " Laszlo Ersek
2021-02-22 22:22 ` Ankur Arora
2021-02-22 7:19 ` [PATCH v8 05/10] OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA Ankur Arora
2021-02-22 13:06 ` [edk2-devel] " Laszlo Ersek
2021-02-22 22:33 ` Ankur Arora
2021-02-22 7:19 ` [PATCH v8 06/10] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state Ankur Arora
2021-02-22 14:19 ` [edk2-devel] " Laszlo Ersek
2021-02-23 7:37 ` Ankur Arora
2021-02-22 7:19 ` [PATCH v8 07/10] OvmfPkg/SmmCpuFeaturesLib: call CPU hot-eject handler Ankur Arora
2021-02-22 14:53 ` [edk2-devel] " Laszlo Ersek
2021-02-23 7:37 ` Ankur Arora
2021-02-23 16:52 ` Laszlo Ersek
2021-02-23 7:45 ` Paolo Bonzini
2021-02-23 17:06 ` Laszlo Ersek
2021-02-23 17:18 ` Paolo Bonzini
2021-02-23 20:46 ` Ankur Arora
2021-02-22 7:19 ` [PATCH v8 08/10] OvmfPkg/CpuHotplugSmm: add EjectCpu() Ankur Arora
2021-02-23 20:36 ` Laszlo Ersek [this message]
2021-02-23 20:51 ` [edk2-devel] " Ankur Arora
2021-02-22 7:19 ` [PATCH v8 09/10] OvmfPkg/CpuHotplugSmm: do actual CPU hot-eject Ankur Arora
2021-02-23 21:39 ` [edk2-devel] " Laszlo Ersek
2021-02-24 3:44 ` Ankur Arora
2021-02-25 19:22 ` Laszlo Ersek
2021-02-22 7:19 ` [PATCH v8 10/10] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug Ankur Arora
2021-02-23 21:52 ` [edk2-devel] " Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6e37922f-afbe-2b41-4eb9-4b3c62d28aa9@redhat.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox