From: "Ankur Arora" <ankur.a.arora@oracle.com>
To: Laszlo Ersek <lersek@redhat.com>, devel@edk2.groups.io
Cc: imammedo@redhat.com, boris.ostrovsky@oracle.com,
Jordan Justen <jordan.l.justen@intel.com>,
Ard Biesheuvel <ard.biesheuvel@arm.com>,
Aaron Young <aaron.young@oracle.com>
Subject: Re: [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection
Date: Mon, 1 Feb 2021 11:21:28 -0800 [thread overview]
Message-ID: <f8c88396-94aa-5647-ec2b-9d982c559f36@oracle.com> (raw)
In-Reply-To: <2c25a9d0-38dc-754d-fcef-b372e1211a92@redhat.com>
On 2021-02-01 9:22 a.m., Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Designate a worker CPU (we use the one executing the root MMI
>> handler), which will do the actual ejection via QEMU in CpuEject().
>>
>> CpuEject(), on the worker CPU, ejects each marked CPU by first
>> selecting its APIC ID and then sending the QEMU "eject" command.
>> QEMU in-turn signals the remote VCPU thread which context-switches
>> it out of the SMI.
>>
>> CpuEject(), on the CPU being ejected, spins around in its holding
>> area until this final context-switch. This does mean that there is
>> some CPU state that would ordinarily be restored (in SmiRendezvous()
>> and in SmiEntry.nasm::CommonHandler), but will not be anymore.
>> This unrestored state includes FPU state, CET enable, stuffing of
>> RSB and the final RSM. Since the CPU state is destroyed by QEMU,
>> this should be okay.
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>> OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 73 ++++++++++++++++++++++++++++++++++----
>> 1 file changed, 67 insertions(+), 6 deletions(-)
>
> (1) s/CpuEject/EjectCpu/g, per previous request (affects commit message
> and code too).
>
>
>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> index 526f51faf070..bf91344eef9c 100644
>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> @@ -193,9 +193,12 @@ RevokeNewSlot:
>> CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
>> on each CPU at exit from SMM.
>>
>> - If, the executing CPU is not being ejected, nothing to be done.
>> + If, the executing CPU is neither a worker, nor being ejected, nothing
>> + to be done.
>> If, the executing CPU is being ejected, wait in a CpuDeadLoop()
>> until ejected.
>> + If, the executing CPU is a worker CPU, set QEMU CPU status to eject
>> + for CPUs being ejected.
>>
>> @param[in] ProcessorNum Index of executing CPU.
>>
>> @@ -217,6 +220,56 @@ CpuEject (
>> return;
>> }
>>
>> + if (ApicId == CPU_EJECT_WORKER) {
>
> (2) The CPU_EJECT_WORKER approach is needlessly complicated (speculative
> generality). I wish I understood this idea earlier in the patch set.
>
> (2a) In patch #5 (subject "OvmfPkg/CpuHotplugSmm: define
> CPU_HOT_EJECT_DATA"), the CPU_EJECT_WORKER macro definition should be
> dropped.
>
> (2b) In this patch, the question whether the executing CPU is the BSP or
> not, should be decided with the same logic that is visible in
> PlatformSmmBspElection()
> [OvmfPkg/Library/SmmCpuPlatformHookLibQemu/SmmCpuPlatformHookLibQemu.c]:
>
> MSR_IA32_APIC_BASE_REGISTER ApicBaseMsr;
> BOOLEAN IsBsp;
>
> ApicBaseMsr.Uint64 = AsmReadMsr64 (MSR_IA32_APIC_BASE);
> IsBsp = (BOOLEAN)(ApicBaseMsr.Bits.BSP == 1);
>
> (2c) Point (2b) obviates the explicit "mark as worker" logic entirely,
> in UnplugCpus() below.
>
> (2d) The "is worker" language (in comments etc) should be replaced with
> direct "is BSP" language.
>
>
>> + UINT32 CpuIndex;
>> +
>> + for (CpuIndex = 0; CpuIndex < mCpuHotEjectData->ArrayLength; CpuIndex++) {
>> + UINT64 RemoveApicId;
>> +
>> + RemoveApicId = mCpuHotEjectData->ApicIdMap[CpuIndex];
>> +
>> + if ((RemoveApicId != CPU_EJECT_INVALID &&
>> + RemoveApicId != CPU_EJECT_WORKER)) {
>> + //
>> + // This to-be-ejected-CPU has already received the BSP's SMI exit
>> + // signal and, will execute SmmCpuFeaturesSmiRendezvousExit()
>> + // followed by this callback or is already waiting in the
>> + // CpuDeadLoop() below.
>> + //
>> + // Tell QEMU to context-switch it out.
>> + //
>> + QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId);
>> + QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED);
>
> (3) While the QEMU CPU selector value *usually* matches the APIC ID,
> it's not an invariant. APIC IDs have an internal structure, composed of
> bit-fields, where each bit-field accommodates one hierarchy level in the
> CPU topology (thread, core, die (maybe), and socket).
>
> However, this mapping need not be surjective. QEMU lets you create
> "pathological" CPU topologies, for example one with:
> - 3 threads/core,
> - 5 cores/socket,
> - (say) 2 sockets.
>
> Under that example, the bit-field standing for the "thread number" level
> would have 2 bits, theoretically permitting *4* threads/core, and the
> bit-field standing for the "core number" level would have 3 bits,
> theoretically allowing for *8* cores/socket.
>
> Considering the fully populated topology, you'd see the CPU selector
> range from 0 to (3*5*2-1)=29, inclusive (corresponding to 30 logical
> processors in total). However, the APIC ID *image* of this CPU selector
> *domain* would not be "contiguous" -- the APIC ID space, with the
> above-described structure, would accommodate 4*8*2=64 logical
> processors. For example, each APIC ID that stood for the nonexistent
> "thread#3" on a particular core would be left unused (no CPU selector
> would map to it).
>
> All in all, you can't write the APIC ID to the CPU selector register,
> for ejection. You need to select the CPU whose APIC ID is the APIC ID
> you want to eject, and then initiate ejection.
Yeah, this is a clear bug. Should have seen it earlier. Thanks for
pointing it out.
>
> This requires one of two alternatives:
>
>
> (3a) The first option is to keep the change local to this patch.
>
> This alternative is the more CPU-hungry (and uglier) one.
>
> The idea is to perform a QEMU_CPUHP_CMD_GET_ARCH_ID loop over all
> possible CPUs, somewhat similarly to QemuCpuhpCollectApicIds(). At every
> CPU, knowing the APIC ID, try to find the APIC ID in "ApicIdMap". If
> there is a match, eject.
>
>
> (3b) The second option is much more elegant (and it's faster too), but
> it requires a much more intrusive update to the patch set.
>
> First, the *element type* of the arrays that QemuCpuhpCollectApicIds()
> operates on, has to be changed from APIC_ID to a structure type that
> pairs APIC_ID with the QEMU CPU selector. [*]
>
> Second, whenever QemuCpuhpCollectApicIds() outputs an APIC_ID, it should
> also save the "CurrentSelector" value (in the other field of the output
> array element structure).
>
> Third, the element type of CPU_HOT_EJECT_DATA.ApicIdMap should be
> replaced with a structure type similar (or identical) to the one
> described at [*]. The ProcessorNumber lookup in UnplugCpus() would still
> be based upon the APIC ID, but CPU_HOT_EJECT_DATA should remember both
> the QEMU selector for that processor, and the APIC ID.
>
> Fourth, the actual ejection should use the selector.
>
> Fifth, the debug message (below) should continue logging the APIC ID, to
> mirror the DEBUG_INFO message in ProcessHotAddedCpus().
>
>
> Would you be willing to implement (3b)?
3b is clearly the better solution. However, is there enough value in
the print message containing APIC ID, that CPU_HOT_EJECT_DATA.ApicIdMap
carry both the cpu-selector and APIC ID?
As you say, the ejection itself just needs the ProcessorNum -> QEMU cpu-selector
mapping.
Ankur
>
>
>> +
>> + //
>> + // Compiler barrier to ensure the next store isn't reordered
>> + //
>> + MemoryFence ();
>> +
>> + //
>> + // Clear the eject status for CpuIndex to ensure that an invalid
>> + // SMI later does not end up trying to eject it or a newly
>> + // hotplugged CpuIndex does not go into the dead loop.
>> + //
>> + mCpuHotEjectData->ApicIdMap[CpuIndex] = CPU_EJECT_INVALID;
>> +
>> + DEBUG ((DEBUG_INFO, "%a: Unplugged CPU %u -> " FMT_APIC_ID "\n",
>> + __FUNCTION__, CpuIndex, RemoveApicId));
>
> (4) The DEBUG_INFO log message is in the right place (and uses the right
> debug mask), but it is afflicted by the usual warts (indentation, format
> specifiers etc). Please reapply the comments I made elsewhere.
>
>
> (5a) Please replace "CPU" with "ProcessorNumber" (so that we know it's
> the protocol-assigned number, not the QEMU selector).
>
> (5b) Please replace the arrow " -> " with the string " APIC ID ".
>
>
> Thanks!
> Laszlo
>
>> + }
>> + }
>> +
>> + //
>> + // Clear our own worker status.
>> + //
>> + mCpuHotEjectData->ApicIdMap[ProcessorNum] = CPU_EJECT_INVALID;
>> +
>> + //
>> + // We are done until the next hot-unplug; clear the handler.
>> + //
>> + mCpuHotEjectData->Handler = NULL;
>> + return;
>> + }
>> +
>> //
>> // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit()
>> // after having been cleared to exit the SMI by the monarch and thus have
>> @@ -327,6 +380,19 @@ UnplugCpus (
>> }
>>
>> if (EjectCount != 0) {
>> + UINTN Worker;
>> +
>> + Status = mMmCpuService->WhoAmI (mMmCpuService, &Worker);
>> + ASSERT_EFI_ERROR (Status);
>> + //
>> + // UnplugCpus() is called via the root MMI handler and thus we are
>> + // executing in the BSP context.
>> + //
>> + // Mark ourselves as the worker CPU.
>> + //
>> + ASSERT (mCpuHotEjectData->ApicIdMap[Worker] == CPU_EJECT_INVALID);
>> + mCpuHotEjectData->ApicIdMap[Worker] = CPU_EJECT_WORKER;
>> +
>> //
>> // We have processors to be ejected; install the handler.
>> //
>> @@ -451,11 +517,6 @@ CpuHotplugMmi (
>> if (EFI_ERROR (Status)) {
>> goto Fatal;
>> }
>> - if (ToUnplugCount > 0) {
>> - DEBUG ((DEBUG_ERROR, "%a: hot-unplug is not supported yet\n",
>> - __FUNCTION__));
>> - goto Fatal;
>> - }
>>
>> if (PluggedCount > 0) {
>> Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
>>
>
next prev parent reply other threads:[~2021-02-01 19:21 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-29 0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
2021-01-29 0:59 ` [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic Ankur Arora
2021-01-30 1:15 ` [edk2-devel] " Laszlo Ersek
2021-02-02 6:19 ` Ankur Arora
2021-02-01 2:59 ` Laszlo Ersek
2021-01-29 0:59 ` [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events Ankur Arora
2021-01-30 2:18 ` Laszlo Ersek
2021-01-30 2:23 ` Laszlo Ersek
2021-02-02 6:03 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 3/9] OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper Ankur Arora
2021-01-30 2:36 ` Laszlo Ersek
2021-02-02 6:04 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus() Ankur Arora
2021-01-30 2:37 ` Laszlo Ersek
2021-02-01 3:13 ` Laszlo Ersek
2021-02-03 4:28 ` Ankur Arora
2021-02-03 19:20 ` Laszlo Ersek
2021-01-29 0:59 ` [PATCH v6 5/9] OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA Ankur Arora
2021-02-01 4:53 ` Laszlo Ersek
2021-02-02 6:15 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state Ankur Arora
2021-02-01 13:36 ` Laszlo Ersek
2021-02-03 5:20 ` Ankur Arora
2021-02-03 20:36 ` Laszlo Ersek
2021-02-04 2:58 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject() Ankur Arora
2021-02-01 16:11 ` Laszlo Ersek
2021-02-01 19:08 ` Laszlo Ersek
2021-02-01 20:12 ` Ankur Arora
2021-02-02 14:00 ` Laszlo Ersek
2021-02-02 14:15 ` Laszlo Ersek
2021-02-03 6:45 ` Ankur Arora
2021-02-03 20:58 ` Laszlo Ersek
2021-02-04 2:49 ` Ankur Arora
2021-02-04 8:58 ` Laszlo Ersek
2021-02-05 16:06 ` [edk2-devel] " Laszlo Ersek
2021-02-08 5:04 ` Ankur Arora
2021-02-03 6:13 ` Ankur Arora
2021-02-03 20:55 ` Laszlo Ersek
2021-02-04 2:57 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection Ankur Arora
2021-02-01 17:22 ` Laszlo Ersek
2021-02-01 19:21 ` Ankur Arora [this message]
2021-02-02 13:23 ` Laszlo Ersek
2021-02-03 5:41 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug Ankur Arora
2021-02-01 17:37 ` Laszlo Ersek
2021-02-01 17:40 ` Laszlo Ersek
2021-02-01 17:48 ` Laszlo Ersek
2021-02-03 5:46 ` Ankur Arora
2021-02-03 20:45 ` Laszlo Ersek
2021-02-04 3:04 ` Ankur Arora
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f8c88396-94aa-5647-ec2b-9d982c559f36@oracle.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox