public inbox for devel@edk2.groups.io
From: "Laszlo Ersek" <lersek@redhat.com>
To: Ankur Arora <ankur.a.arora@oracle.com>, devel@edk2.groups.io
Cc: imammedo@redhat.com, boris.ostrovsky@oracle.com,
	Jordan Justen <jordan.l.justen@intel.com>,
	Ard Biesheuvel <ard.biesheuvel@arm.com>,
	Aaron Young <aaron.young@oracle.com>
Subject: Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
Date: Mon, 1 Feb 2021 17:11:24 +0100
Message-ID: <180a8efb-1a26-3bab-f50a-2d7aeff6d582@redhat.com>
In-Reply-To: <20210129005950.467638-8-ankur.a.arora@oracle.com>

On 01/29/21 01:59, Ankur Arora wrote:
> Add CpuEject(), which handles the CPU ejection, and provides a holding
> area for said CPUs. It is called via SmmCpuFeaturesRendezvousExit(),
> at the tail end of the SMI handling.

(1) The functions introduced thus far by this patch series are all named
"Verb + Object", which is great; so please call this function EjectCpu()
as well, rather than CpuEject().

Please modify all three: subject line, commit message, and patch body.


>
> Also UnplugCpus() now stashes APIC IDs of CPUs which need to be
> ejected in CPU_HOT_EJECT_DATA.ApicIdMap. These are used by CpuEject()
> to identify such CPUs.
>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 109 +++++++++++++++++++++++++++++++++++--
>  1 file changed, 105 insertions(+), 4 deletions(-)
>
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> index 70d69f6ed65b..526f51faf070 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> @@ -14,6 +14,7 @@
>  #include <Library/MmServicesTableLib.h>      // gMmst
>  #include <Library/PcdLib.h>                  // PcdGetBool()
>  #include <Library/SafeIntLib.h>              // SafeUintnSub()
> +#include <Library/CpuHotEjectData.h>         // CPU_HOT_EJECT_DATA
>  #include <Protocol/MmCpuIo.h>                // EFI_MM_CPU_IO_PROTOCOL
>  #include <Protocol/SmmCpuService.h>          // EFI_SMM_CPU_SERVICE_PROTOCOL
>  #include <Uefi/UefiBaseType.h>               // EFI_STATUS

(2) This will change due to the movement of the header file, but: please
keep the #include directive list alphabetically sorted.


> @@ -32,11 +33,12 @@ STATIC EFI_MM_CPU_IO_PROTOCOL *mMmCpuIo;
>  //
>  STATIC EFI_SMM_CPU_SERVICE_PROTOCOL *mMmCpuService;
>  //
> -// This structure is a communication side-channel between the
> +// These structures serve as communication side-channels between the
>  // EFI_SMM_CPU_SERVICE_PROTOCOL consumer (i.e., this driver) and provider
>  // (i.e., PiSmmCpuDxeSmm).
>  //
>  STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
> +STATIC CPU_HOT_EJECT_DATA *mCpuHotEjectData;
>  //
>  // SMRAM arrays for fetching the APIC IDs of processors with pending events (of
>  // known event types), for the time of just one MMI.
> @@ -188,11 +190,53 @@ RevokeNewSlot:
>  }
>
>  /**
> +  CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
> +  on each CPU at exit from SMM.
> +
> +  If, the executing CPU is not being ejected, nothing to be done.
> +  If, the executing CPU is being ejected, wait in a CpuDeadLoop()
> +  until ejected.
> +
> +  @param[in] ProcessorNum      Index of executing CPU.
> +
> +**/
> +VOID
> +EFIAPI
> +CpuEject (
> +  IN UINTN ProcessorNum
> +  )
> +{
> +  //
> +  // APIC ID is UINT32, but mCpuHotEjectData->ApicIdMap[] is UINT64
> +  // so use UINT64 throughout.
> +  //
> +  UINT64 ApicId;
> +
> +  ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
> +  if (ApicId == CPU_EJECT_INVALID) {
> +    return;
> +  }
> +
> +  //
> +  // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit()
> +  // after having been cleared to exit the SMI by the monarch and thus have
> +  // no SMM processing remaining.
> +  //
> +  // Given that we cannot allow them to escape to the guest, we pen them
> +  // here until the SMM monarch tells the HW to unplug them.
> +  //
> +  CpuDeadLoop ();
> +}

(3a) We can make this less resource-hungry by replacing CpuDeadLoop()
with:

  for (;;) {
    DisableInterrupts ();
    CpuSleep ();
  }

This basically translates to a { CLI; HLT; } loop.

(Both functions come from BaseLib, which CpuHotplugSmm already consumes,
thus there is no need to modify #include's or [LibraryClasses].)


(3b) Please refresh the CpuDeadLoop() reference in the function's
leading comment as well.


> +
> +/**
>    Process to be hot-unplugged CPUs, per QemuCpuhpCollectApicIds().
>
>    For each such CPU, report the CPU to PiSmmCpuDxeSmm via
> -  EFI_SMM_CPU_SERVICE_PROTOCOL. If the to be hot-unplugged CPU is
> -  unknown, skip it silently.
> +  EFI_SMM_CPU_SERVICE_PROTOCOL and stash the APIC ID for later ejection.
> +  If the to be hot-unplugged CPU is unknown, skip it silently.
> +
> +  Additonally, if we do stash any APIC IDs, also install a CPU eject handler
> +  which would handle the ejection.
>
>    @param[in] ToUnplugApicIds    The APIC IDs of the CPUs that are about to be
>                                  hot-unplugged.
> @@ -216,9 +260,11 @@ UnplugCpus (
>  {
>    EFI_STATUS Status;
>    UINT32     ToUnplugIdx;
> +  UINT32     EjectCount;
>    UINTN      ProcessorNum;
>
>    ToUnplugIdx = 0;
> +  EjectCount = 0;
>    while (ToUnplugIdx < ToUnplugCount) {
>      APIC_ID    RemoveApicId;
>
> @@ -255,13 +301,41 @@ UnplugCpus (
>        DEBUG ((DEBUG_ERROR, "%a: RemoveProcessor(" FMT_APIC_ID "): %r\n",
>          __FUNCTION__, RemoveApicId, Status));
>        goto Fatal;
> +    } else {

(Under patch v6 4/9, I request that the "goto" be replaced with a
"return" -- my point (4) below applies regardless:)

(4) Please don't add an "else" branch, if the first branch of the "if"
ends with a jump statement. Because, in that case, the code that follows
the "if" statement is not reachable after the first branch anyway.
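
A generic illustration of the point, in plain C rather than edk2 code.
The names step_nested(), step_flat() and the status values are made up
for this sketch. Both functions behave identically; the second one
simply drops the indentation level that the "else" added.

```c
#include <stddef.h>

/* Hypothetical status codes, for this sketch only. */
enum { STATUS_OK = 0, STATUS_ERROR = 1 };

/* Nested form: the "else" costs an indentation level for no benefit. */
static int step_nested(int failed, size_t *count)
{
  if (failed) {
    return STATUS_ERROR;
  } else {
    (*count)++;
  }
  return STATUS_OK;
}

/* Unnested form: the error branch has already left the function, so the
   code after the "if" only runs on the success path anyway. */
static int step_flat(int failed, size_t *count)
{
  if (failed) {
    return STATUS_ERROR;
  }
  (*count)++;
  return STATUS_OK;
}
```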

So please just unnest the next part:


> +      //
> +      // Stash the APIC IDs so we can do the actual ejection later.
> +      //
> +      if (mCpuHotEjectData->ApicIdMap[ProcessorNum] != CPU_EJECT_INVALID) {
> +        //
> +        // Since ProcessorNum and APIC-ID map 1-1, so a valid
> +        // mCpuHotEjectData->ApicIdMap[ProcessorNum] means something
> +        // is horribly wrong.
> +        //

(5) To be honest, I would replace this with:

      //
      // - mCpuHotEjectData->ApicIdMap[ProcessorNum] is initialized to
      //   CPU_EJECT_INVALID when mCpuHotEjectData->ApicIdMap is allocated,
      //
      // - mCpuHotEjectData->ApicIdMap[ProcessorNum] is restored to
      //   CPU_EJECT_INVALID when the subject processor is ejected,
      //
      // - mMmCpuService->RemoveProcessor(ProcessorNum) invalidates
      //   mCpuHotPlugData->ApicId[ProcessorNum], so a given ProcessorNum can
      //   never match more than one APIC ID in a single invocation of
      //   UnplugCpus().
      //


> +        DEBUG ((DEBUG_ERROR, "%a: ProcessorNum %u maps to %llx, cannot "
> +                "map to " FMT_APIC_ID "\n", __FUNCTION__, ProcessorNum,
> +                mCpuHotEjectData->ApicIdMap[ProcessorNum], RemoveApicId));

(6a) The indentation of the 2nd and 3rd lines is incorrect.

(6b) For logging UINTN values (i.e., ProcessorNum) portably between IA32
and X64, %u is not correct. Instead:

- cast the UINTN value to UINT64 explicitly,
- use the %Lu or %Lx format specifier.

(6c) There is no "%llx" format string in edk2's PrintLib (no "ll" length
modifier, to be more precise). UINT64 values need to be printed with
"%lu" or "%lx", or -- identically -- with "%Lu" or "%Lx". I prefer the
latter, because standard C does not define the "L" size modifier for
integers, and that makes it clear that we're using an edk2-specific
feature. The "l" (ell) length modifier could be misunderstood as "long"
(which is something we don't use in edk2).

(6d) FMT_APIC_ID is defined as "0x%08x"; to remain consistent with that,
I would print the ApicIdMap element not just with "%Lx", but with
"0x%016Lx".

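To illustrate (6b) and (6d) outside of edk2: the sketch below is a
standard-C analog, not PrintLib code. It assumes size_t as a stand-in
for UINTN, and the <inttypes.h> macros PRIu64 / PRIx64 / PRIx32 as
stand-ins for the edk2 "%Lu" / "%Lx" / "%x" specifiers; format_mapping()
is a hypothetical helper. The idea carried over is the same: widen the
native-width value to 64 bits explicitly, then print it with a 64-bit
specifier and a fixed field width.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, standard C only (NOT edk2 PrintLib).
 * processor_num has a platform-dependent width, so it is cast to
 * uint64_t before being printed with a 64-bit specifier.
 */
static int format_mapping(char *buf, size_t len, size_t processor_num,
                          uint64_t stashed_id, uint32_t new_apic_id)
{
  return snprintf(buf, len,
                  "ProcessorNum %" PRIu64 " maps to 0x%016" PRIx64
                  ", cannot map to 0x%08" PRIx32,
                  (uint64_t)processor_num, stashed_id, new_apic_id);
}
```

In the patch itself, the equivalent would use the PrintLib spellings
discussed above rather than the <inttypes.h> macros.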

> +
> +        Status = EFI_INVALID_PARAMETER;
> +        goto Fatal;

(7a) Please just "return EFI_ALREADY_STARTED".

(7b) Please also modify the leading comment on the function -- the new
return value EFI_ALREADY_STARTED should be documented. I suggest:

   @retval EFI_ALREADY_STARTED   For the ProcessorNumber that
                                 EFI_SMM_CPU_SERVICE_PROTOCOL had assigned to
                                 one of the APIC IDs in ToUnplugApicIds,
                                 mCpuHotEjectData->ApicIdMap already has an
                                 APIC ID stashed. (This should never happen.)
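
A small plain-C model of the invariant behind (5) and (7), for
illustration only. Everything here is an assumption of the sketch:
EJECT_INVALID stands in for CPU_EJECT_INVALID (whose real value is not
shown in this patch), MAP_LEN is arbitrary, and init_map() / stash() and
the status values are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define EJECT_INVALID UINT64_MAX  /* assumed stand-in for CPU_EJECT_INVALID */
#define MAP_LEN       4           /* arbitrary size for this sketch */

/* Hypothetical status codes. */
enum { OK = 0, ALREADY_STARTED = 1 };

static uint64_t apic_id_map[MAP_LEN];

/* Every slot starts out (and, after ejection, ends up) invalid. */
static void init_map(void)
{
  for (size_t i = 0; i < MAP_LEN; i++) {
    apic_id_map[i] = EJECT_INVALID;
  }
}

/* Stash an APIC ID; a slot that is already occupied is an error and is
   reported with a plain return, without overwriting the slot. */
static int stash(size_t processor_num, uint32_t apic_id)
{
  if (apic_id_map[processor_num] != EJECT_INVALID) {
    return ALREADY_STARTED;
  }
  apic_id_map[processor_num] = (uint64_t)apic_id;
  return OK;
}
```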


> +      }
> +
> +      mCpuHotEjectData->ApicIdMap[ProcessorNum] = (UINT64)RemoveApicId;
> +      EjectCount++;
>      }
>
>      ToUnplugIdx++;
>    }
>
> +  if (EjectCount != 0) {
> +    //
> +    // We have processors to be ejected; install the handler.
> +    //
> +    mCpuHotEjectData->Handler = CpuEject;
> +  }
> +

(8) I suggest removing the "EjectCount" local variable, and setting the
"Handler" member where you currently increment "EjectCount".

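A plain-C sketch of what (8) amounts to, under made-up names (struct
eject_data, eject_cpu(), stash_all() are all hypothetical, and the map
size is arbitrary). Assigning the handler is idempotent, so it can sit
right next to the stash, and no counter is needed: if the loop never
stashes anything, the handler is never installed.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*eject_handler)(size_t processor_num);

/* Hypothetical stand-in for the eject data structure. */
struct eject_data {
  eject_handler handler;
  uint64_t      apic_id_map[4];
};

/* Placeholder handler; a real one would pen the CPU. */
static void eject_cpu(size_t processor_num)
{
  (void)processor_num;
}

/* Stash each APIC ID and install the handler at the same spot. */
static void stash_all(struct eject_data *data, const size_t *processor_nums,
                      const uint32_t *apic_ids, size_t count)
{
  for (size_t i = 0; i < count; i++) {
    data->apic_id_map[processor_nums[i]] = (uint64_t)apic_ids[i];
    data->handler = eject_cpu;  /* repeated assignment is harmless */
  }
}
```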

>    //
> -  // We've removed this set of APIC IDs from SMM data structures.
> +  // We've removed this set of APIC IDs from SMM data structures and
> +  // have installed an ejection handler if needed.
>    //
>    return EFI_SUCCESS;
>
> @@ -458,7 +532,13 @@ CpuHotplugEntry (
>    // Our DEPEX on EFI_SMM_CPU_SERVICE_PROTOCOL guarantees that PiSmmCpuDxeSmm
>    // has pointed PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM.
>    //
> +  // Additionally, CPU Hot-unplug is available only if CPU Hotplug is, so
> +  // the same DEPEX also guarantees that PcdCpuHotEjectDataAddress points
> +  // to CPU_HOT_EJECT_DATA in SMRAM.
> +  //

(9) I don't see the relevance of "hot-unplug depends on hot-plug" here.

I recommend the following comment instead:

   //
   // Our DEPEX on EFI_SMM_CPU_SERVICE_PROTOCOL guarantees that PiSmmCpuDxeSmm
   // has pointed:
   // - PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM,
   // - PcdCpuHotEjectDataAddress to CPU_HOT_EJECT_DATA in SMRAM, if the
   //   possible CPU count is greater than 1.
   //

>    mCpuHotPlugData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotPlugDataAddress);
> +  mCpuHotEjectData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotEjectDataAddress);
> +
>    if (mCpuHotPlugData == NULL) {
>      Status = EFI_NOT_FOUND;
>      DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_PLUG_DATA: %r\n", __FUNCTION__, Status));
> @@ -470,6 +550,9 @@ CpuHotplugEntry (
>    if (mCpuHotPlugData->ArrayLength == 1) {
>      return EFI_UNSUPPORTED;
>    }
> +  ASSERT (mCpuHotEjectData &&
> +          (mCpuHotPlugData->ArrayLength == mCpuHotEjectData->ArrayLength));
> +
>    //
>    // Allocate the data structures that depend on the possible CPU count.
>    //

(10) To remain consistent with the check performed on "mCpuHotPlugData",
please do:

  if (mCpuHotEjectData == NULL) {
    Status = EFI_NOT_FOUND;
  } else if (mCpuHotPlugData->ArrayLength != mCpuHotEjectData->ArrayLength) {
    Status = EFI_INVALID_PARAMETER;
  } else {
    Status = EFI_SUCCESS;
  }
  if (EFI_ERROR (Status)) {
    DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_EJECT_DATA: %r\n", __FUNCTION__, Status));
    goto Fatal;
  }

(

  As a digression, I'll make some comments on the ASSERT() too:

  - Given ASSERT ((C1) && (C2)), it is best to express the same as
    ASSERT (C1); ASSERT (C2); -- the effect is the same, but the error
    messages have finer granularity.

  - Checking a pointer against NULL must be explicit at all times, in
    edk2. IOW, ASSERT (mCpuHotEjectData) should be spelled
    ASSERT (mCpuHotEjectData != NULL).

)


> @@ -552,6 +635,24 @@ CpuHotplugEntry (
>    //
>    SmbaseInstallFirstSmiHandler ();
>
> +  if (mCpuHotEjectData) {

(11) This condition is guaranteed to evaluate to TRUE; see the ASSERT()
above.

Anyway, ignore this...


> +  UINT32     Idx;

(12) Incorrect indentation, but ignore this too...


> +    //
> +    // For CPU ejection we need to map ProcessorNum -> APIC_ID. By the time
> +    // we need the mapping, however, the Processor's APIC ID has already been
> +    // removed from SMM data structures. So we will maintain a local map
> +    // in mCpuHotEjectData->ApicIdMap.
> +    //
> +    for (Idx = 0; Idx < mCpuHotEjectData->ArrayLength; Idx++) {
> +      mCpuHotEjectData->ApicIdMap[Idx] = CPU_EJECT_INVALID;
> +    }

(13) ... because this init loop should be moved to patch #6 (subject
"OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state"), as I mentioned
there...


> +
> +    //
> +    // Wait to init the handler until an ejection is warranted
> +    //
> +    mCpuHotEjectData->Handler = NULL;

(14) ... and because this nulling is performed by patch #6 already
(subject "OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state").


Therefore, this whole conditional block should be removed please.

Thanks!
Laszlo

> +  }
> +
>    return EFI_SUCCESS;
>
>  ReleasePostSmmPen:
>


