From: "Ankur Arora" <ankur.a.arora@oracle.com>
To: Laszlo Ersek <lersek@redhat.com>, devel@edk2.groups.io
Cc: imammedo@redhat.com, boris.ostrovsky@oracle.com,
Jordan Justen <jordan.l.justen@intel.com>,
Ard Biesheuvel <ard.biesheuvel@arm.com>,
Aaron Young <aaron.young@oracle.com>
Subject: Re: [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection
Date: Tue, 2 Feb 2021 21:41:03 -0800 [thread overview]
Message-ID: <905bc1e6-c499-6ae5-9e61-ca12c61a5c94@oracle.com> (raw)
In-Reply-To: <ef1bda82-eb37-b16c-d9ba-5b3226bc1ee7@redhat.com>
On 2021-02-02 5:23 a.m., Laszlo Ersek wrote:
> On 02/01/21 20:21, Ankur Arora wrote:
>> On 2021-02-01 9:22 a.m., Laszlo Ersek wrote:
>>> On 01/29/21 01:59, Ankur Arora wrote:
>>>> Designate a worker CPU (we use the one executing the root MMI
>>>> handler), which will do the actual ejection via QEMU in CpuEject().
>>>>
>>>> CpuEject(), on the worker CPU, ejects each marked CPU by first
>>>> selecting its APIC ID and then sending the QEMU "eject" command.
>>>> QEMU in-turn signals the remote VCPU thread which context-switches
>>>> it out of the SMI.
>>>>
>>>> CpuEject(), on the CPU being ejected, spins around in its holding
>>>> area until this final context-switch. This does mean that there is
>>>> some CPU state that would ordinarily be restored (in SmiRendezvous()
>>>> and in SmiEntry.nasm::CommonHandler), but will not be anymore.
>>>> This unrestored state includes FPU state, CET enable, stuffing of
>>>> RSB and the final RSM. Since the CPU state is destroyed by QEMU,
>>>> this should be okay.
>>>>
>>>> Cc: Laszlo Ersek <lersek@redhat.com>
>>>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>>>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>>>> Cc: Igor Mammedov <imammedo@redhat.com>
>>>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>> Cc: Aaron Young <aaron.young@oracle.com>
>>>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>>>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>>> ---
>>>> OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 73
>>>> ++++++++++++++++++++++++++++++++++----
>>>> 1 file changed, 67 insertions(+), 6 deletions(-)
>>>
>>> (1) s/CpuEject/EjectCpu/g, per previous request (affects commit message
>>> and code too).
>>>
>>>
>>>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>>> b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>>> index 526f51faf070..bf91344eef9c 100644
>>>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>>> @@ -193,9 +193,12 @@ RevokeNewSlot:
>>>> CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
>>>> on each CPU at exit from SMM.
>>>>
>>>> - If, the executing CPU is not being ejected, nothing to be done.
>>>> + If, the executing CPU is neither a worker, nor being ejected, nothing
>>>> + to be done.
>>>> If, the executing CPU is being ejected, wait in a CpuDeadLoop()
>>>> until ejected.
>>>> + If, the executing CPU is a worker CPU, set QEMU CPU status to eject
>>>> + for CPUs being ejected.
>>>>
>>>> @param[in] ProcessorNum Index of executing CPU.
>>>>
>>>> @@ -217,6 +220,56 @@ CpuEject (
>>>> return;
>>>> }
>>>>
>>>> + if (ApicId == CPU_EJECT_WORKER) {
>>>
>>> (2) The CPU_EJECT_WORKER approach is needlessly complicated (speculative
>>> generality). I wish I understood this idea earlier in the patch set.
>>>
>>> (2a) In patch #5 (subject "OvmfPkg/CpuHotplugSmm: define
>>> CPU_HOT_EJECT_DATA"), the CPU_EJECT_WORKER macro definition should be
>>> dropped.
>>>
>>> (2b) In this patch, the question whether the executing CPU is the BSP or
>>> not, should be decided with the same logic that is visible in
>>> PlatformSmmBspElection()
>>> [OvmfPkg/Library/SmmCpuPlatformHookLibQemu/SmmCpuPlatformHookLibQemu.c]:
>>>
>>> MSR_IA32_APIC_BASE_REGISTER ApicBaseMsr;
>>> BOOLEAN IsBsp;
>>>
>>> ApicBaseMsr.Uint64 = AsmReadMsr64 (MSR_IA32_APIC_BASE);
>>> IsBsp = (BOOLEAN)(ApicBaseMsr.Bits.BSP == 1);
>>>
>>> (2c) Point (2b) obviates the explicit "mark as worker" logic entirely,
>>> in UnplugCpus() below.
>>>
>>> (2d) The "is worker" language (in comments etc) should be replaced with
>>> direct "is BSP" language.
>>>
>>>
>>>> + UINT32 CpuIndex;
>>>> +
>>>> + for (CpuIndex = 0; CpuIndex < mCpuHotEjectData->ArrayLength;
>>>> CpuIndex++) {
>>>> + UINT64 RemoveApicId;
>>>> +
>>>> + RemoveApicId = mCpuHotEjectData->ApicIdMap[CpuIndex];
>>>> +
>>>> + if ((RemoveApicId != CPU_EJECT_INVALID &&
>>>> + RemoveApicId != CPU_EJECT_WORKER)) {
>>>> + //
>>>> + // This to-be-ejected-CPU has already received the BSP's SMI
>>>> exit
>>>> + // signal and, will execute SmmCpuFeaturesSmiRendezvousExit()
>>>> + // followed by this callback or is already waiting in the
>>>> + // CpuDeadLoop() below.
>>>> + //
>>>> + // Tell QEMU to context-switch it out.
>>>> + //
>>>> + QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId);
>>>> + QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED);
>>>
>>> (3) While the QEMU CPU selector value *usually* matches the APIC ID,
>>> it's not an invariant. APIC IDs have an internal structure, composed of
>>> bit-fields, where each bit-field accommodates one hierarchy level in the
>>> CPU topology (thread, core, die (maybe), and socket).
>>>
>>> However, this mapping need not be surjective. QEMU lets you create
>>> "pathological" CPU topologies, for example one with:
>>> - 3 threads/core,
>>> - 5 cores/socket,
>>> - (say) 2 sockets.
>>>
>>> Under that example, the bit-field standing for the "thread number" level
>>> would have 2 bits, theoretically permitting *4* threads/core, and the
>>> bit-field standing for the "core number" level would have 3 bits,
>>> theoretically allowing for *8* cores/socket.
>>>
>>> Considering the fully populated topology, you'd see the CPU selector
>>> range from 0 to (3*5*2-1)=29, inclusive (corresponding to 30 logical
>>> processors in total). However, the APIC ID *image* of this CPU selector
>>> *domain* would not be "contiguous" -- the APIC ID space, with the
>>> above-described structure, would accommodate 4*8*2=64 logical
>>> processors. For example, each APIC ID that stood for the nonexistent
>>> "thread#3" on a particular core would be left unused (no CPU selector
>>> would map to it).
>>>
>>> All in all, you can't write the APIC ID to the CPU selector register,
>>> for ejection. You need to select the CPU whose APIC ID is the APIC ID
>>> you want to eject, and then initiate ejection.
>>
>> Yeah, this is a clear bug. Should have seen it earlier. Thanks for
>> pointing it out.
>>
>>>
>>> This requires one of two alternatives:
>>>
>>>
>>> (3a) The first option is to keep the change local to this patch.
>>>
>>> This alternative is the more CPU-hungry (and uglier) one.
>>>
>>> The idea is to perform a QEMU_CPUHP_CMD_GET_ARCH_ID loop over all
>>> possible CPUs, somewhat similarly to QemuCpuhpCollectApicIds(). At every
>>> CPU, knowing the APIC ID, try to find the APIC ID in "ApicIdMap". If
>>> there is a match, eject.
>>>
>>>
>>> (3b) The second option is much more elegant (and it's faster too), but
>>> it requires a much more intrusive update to the patch set.
>>>
>>> First, the *element type* of the arrays that QemuCpuhpCollectApicIds()
>>> operates on, has to be changed from APIC_ID to a structure type that
>>> pairs APIC_ID with the QEMU CPU selector. [*]
>>>
>>> Second, whenever QemuCpuhpCollectApicIds() outputs an APIC_ID, it should
>>> also save the "CurrentSelector" value (in the other field of the output
>>> array element structure).
>>>
>>> Third, the element type of CPU_HOT_EJECT_DATA.ApicIdMap should be
>>> replaced with a structure type similar (or identical) to the one
>>> described at [*]. The ProcessorNumber lookup in UnplugCpus() would still
>>> be based upon the APIC ID, but CPU_HOT_EJECT_DATA should remember both
>>> the QEMU selector for that processor, and the APIC ID.
>>>
>>> Fourth, the actual ejection should use the selector.
>>>
>>> Fifth, the debug message (below) should continue logging the APIC ID, to
>>> mirror the DEBUG_INFO message in ProcessHotAddedCpus().
>>>
>>>
>>> Would you be willing to implement (3b)?
>>
>> 3b is clearly the better solution. However, is there enough value in
>> the print message containing APIC ID, that CPU_HOT_EJECT_DATA.ApicIdMap
>> carry both the cpu-selector and APIC ID?
>>
>> As you say, the ejection itself just needs the ProcessorNum -> QEMU
>> cpu-selector mapping.
>
> Good question, and I'm torn.
>
> The default DEBUG build of OVMF enables INFO messages, and more severe
> ones. It does not enable VERBOSE messages.
>
> In such a DEBUG build (i.e., with otherwise default settings), the log
> entries that relate to hot-plug do not report selectors. Conversely, on
> hot-unplug, the log would report selectors *only* (not APIC IDs). I feel
> this makes it more difficult to get useful OVMF debug logs in bug
> reports, especially from users of distro-provided OVMF builds.
>
> If we can ask the user to rebuild with VERBOSE enabled and re-run the
> test, that's always great, but frequently users don't know how to
> rebuild OVMF, plus usually OVMF is hidden so deep in their virt stack
> that they have no idea how to make that stack use a custom OVMF build.
>
> ... How about this: in UnplugCpus(), we could emit an INFO message about
> the ProcessorNumber <-> APIC ID <-> Selector correspondence, and when
> the eject happens, it would suffice to log (at INFO level) only
> ProcessorNumber <-> Selector.
This makes sense to me. Will do.
Thanks
Ankur
>
> Thanks,
> Laszlo
>
next prev parent reply other threads:[~2021-02-03 5:41 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-29 0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
2021-01-29 0:59 ` [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic Ankur Arora
2021-01-30 1:15 ` [edk2-devel] " Laszlo Ersek
2021-02-02 6:19 ` Ankur Arora
2021-02-01 2:59 ` Laszlo Ersek
2021-01-29 0:59 ` [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events Ankur Arora
2021-01-30 2:18 ` Laszlo Ersek
2021-01-30 2:23 ` Laszlo Ersek
2021-02-02 6:03 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 3/9] OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper Ankur Arora
2021-01-30 2:36 ` Laszlo Ersek
2021-02-02 6:04 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus() Ankur Arora
2021-01-30 2:37 ` Laszlo Ersek
2021-02-01 3:13 ` Laszlo Ersek
2021-02-03 4:28 ` Ankur Arora
2021-02-03 19:20 ` Laszlo Ersek
2021-01-29 0:59 ` [PATCH v6 5/9] OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA Ankur Arora
2021-02-01 4:53 ` Laszlo Ersek
2021-02-02 6:15 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state Ankur Arora
2021-02-01 13:36 ` Laszlo Ersek
2021-02-03 5:20 ` Ankur Arora
2021-02-03 20:36 ` Laszlo Ersek
2021-02-04 2:58 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject() Ankur Arora
2021-02-01 16:11 ` Laszlo Ersek
2021-02-01 19:08 ` Laszlo Ersek
2021-02-01 20:12 ` Ankur Arora
2021-02-02 14:00 ` Laszlo Ersek
2021-02-02 14:15 ` Laszlo Ersek
2021-02-03 6:45 ` Ankur Arora
2021-02-03 20:58 ` Laszlo Ersek
2021-02-04 2:49 ` Ankur Arora
2021-02-04 8:58 ` Laszlo Ersek
2021-02-05 16:06 ` [edk2-devel] " Laszlo Ersek
2021-02-08 5:04 ` Ankur Arora
2021-02-03 6:13 ` Ankur Arora
2021-02-03 20:55 ` Laszlo Ersek
2021-02-04 2:57 ` Ankur Arora
2021-01-29 0:59 ` [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection Ankur Arora
2021-02-01 17:22 ` Laszlo Ersek
2021-02-01 19:21 ` Ankur Arora
2021-02-02 13:23 ` Laszlo Ersek
2021-02-03 5:41 ` Ankur Arora [this message]
2021-01-29 0:59 ` [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug Ankur Arora
2021-02-01 17:37 ` Laszlo Ersek
2021-02-01 17:40 ` Laszlo Ersek
2021-02-01 17:48 ` Laszlo Ersek
2021-02-03 5:46 ` Ankur Arora
2021-02-03 20:45 ` Laszlo Ersek
2021-02-04 3:04 ` Ankur Arora
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=905bc1e6-c499-6ae5-9e61-ca12c61a5c94@oracle.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox