public inbox for devel@edk2.groups.io
* [PATCH v6 0/9] support CPU hot-unplug
@ 2021-01-29  0:59 Ankur Arora
  2021-01-29  0:59 ` [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic Ankur Arora
                   ` (8 more replies)
  0 siblings, 9 replies; 52+ messages in thread
From: Ankur Arora @ 2021-01-29  0:59 UTC (permalink / raw)
  To: devel; +Cc: lersek, imammedo, boris.ostrovsky, Ankur Arora

Hi,

This series adds support for CPU hot-unplug with OVMF.

Please see this in conjunction with the QEMU secureboot hot-unplug v2
series posted here (now upstreamed):
  https://lore.kernel.org/qemu-devel/20201207140739.3829993-1-imammedo@redhat.com/

Patches 1 and 3,
  ("OvmfPkg/CpuHotplugSmm: refactor hotplug logic"),
  ("OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper")
are either refactors or add support functions.

Patches 2 and 9,
  ("OvmfPkg/CpuHotplugSmm: collect hot-unplug events"),
  ("OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug")
handle the QEMU protocol logic: collecting CPU hot-unplug events and
negotiating the hot-unplug feature.

Patch 4,
  ("OvmfPkg/CpuHotplugSmm: introduce UnplugCpus()")
adds the MMI logic for handling CPU hot-unplug and informing
PiSmmCpuDxeSmm of CPU removal.

Patches 5 and 6,
  ("OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA"),
  ("OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state")
set up the state needed to eject CPUs as part of hot-unplug.

Patches 7 and 8,
  ("OvmfPkg/CpuHotplugSmm: add CpuEject()"),
  ("OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection")
add the CPU ejection logic.

Testing (with QEMU 5.2.50):
 - Stable with randomized CPU plug/unplug (guest maxcpus=1,8,128)
 - Synthetic tests with simultaneous multi CPU hot-unplug
 - Negotiation with/without CPU hotplug enabled

Also at:
  github.com/terminus/edk2/ hot-unplug-v6

Changelog:
v6:
  - addresses v5 review comments.

v5:
  - fixes ECC errors (all but one in "OvmfPkg/CpuHotplugSmm: add
    Qemu Cpu Status helper").
  URL: https://patchew.org/EDK2/20210126064440.299596-1-ankur.a.arora@oracle.com/

v4:
  - Gets rid of unnecessary UefiCpuPkg changes
  URL: https://patchew.org/EDK2/20210118063457.358581-1-ankur.a.arora@oracle.com/

v3:
  - Use a saner PCD based interface to share state between PiSmmCpuDxeSmm
    and OvmfPkg/CpuHotplugSmm
  - Cleaner split of the hot-unplug code
  URL: https://patchew.org/EDK2/20210115074533.277448-1-ankur.a.arora@oracle.com/

v2:
  - Do the ejection via SmmCpuFeaturesRendezvousExit()
  URL: https://patchew.org/EDK2/20210107195515.106158-1-ankur.a.arora@oracle.com/

RFC:
  URL: https://patchew.org/EDK2/20201208053432.2690694-1-ankur.a.arora@oracle.com/


Please review.

Thanks
Ankur

Ankur Arora (9):
  OvmfPkg/CpuHotplugSmm: refactor hotplug logic
  OvmfPkg/CpuHotplugSmm: collect hot-unplug events
  OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper
  OvmfPkg/CpuHotplugSmm: introduce UnplugCpus()
  OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA
  OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state
  OvmfPkg/CpuHotplugSmm: add CpuEject()
  OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection
  OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug

 OvmfPkg/OvmfPkg.dec                                |  10 +
 OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf            |   1 +
 .../SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf        |   3 +
 OvmfPkg/CpuHotplugSmm/QemuCpuhp.h                  |   6 +
 OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h  |   2 +
 OvmfPkg/Include/Library/CpuHotEjectData.h          |  35 ++
 OvmfPkg/CpuHotplugSmm/CpuHotplug.c                 | 472 +++++++++++++++++----
 OvmfPkg/CpuHotplugSmm/QemuCpuhp.c                  |  51 ++-
 .../Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c  |  78 ++++
 OvmfPkg/SmmControl2Dxe/SmiFeatures.c               |  25 +-
 10 files changed, 584 insertions(+), 99 deletions(-)
 create mode 100644 OvmfPkg/Include/Library/CpuHotEjectData.h

-- 
2.9.3



* [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic
  2021-01-29  0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
@ 2021-01-29  0:59 ` Ankur Arora
  2021-01-30  1:15   ` [edk2-devel] " Laszlo Ersek
  2021-02-01  2:59   ` Laszlo Ersek
  2021-01-29  0:59 ` [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events Ankur Arora
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 52+ messages in thread
From: Ankur Arora @ 2021-01-29  0:59 UTC (permalink / raw)
  To: devel
  Cc: lersek, imammedo, boris.ostrovsky, Ankur Arora, Jordan Justen,
	Ard Biesheuvel, Aaron Young

Refactor CpuHotplugMmi() to pull out the CPU hotplug logic into
ProcessHotAddedCpus(). This is in preparation for supporting CPU
hot-unplug.
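
For readers following the refactor, the slot bookkeeping that moves into
ProcessHotAddedCpus() can be sketched in isolation. This is a simplified,
standalone illustration rather than the driver code: AddApicId(),
SLOT_EMPTY and the integer return codes are hypothetical names, and the
SMBASE relocation and AddProcessor() steps are left out.

```c
#include <stdint.h>

/* Mirrors the MAX_UINT64 "empty slot" sentinel used in the patch. */
#define SLOT_EMPTY UINT64_MAX

/*
 * Hypothetical helper: record ApicId in the first empty slot at or after
 * *NewSlot. Returns the slot index used, -1 if the APIC ID is already
 * known (skipped), or -2 if the table has no room left.
 */
int
AddApicId (uint64_t *Slots, uint32_t Length, uint32_t *NewSlot,
           uint64_t ApicId)
{
  uint32_t Check;
  uint32_t Used;

  /* Skip APIC IDs that were already recorded earlier. */
  for (Check = 0; Check < Length; Check++) {
    if (Slots[Check] == ApicId) {
      return -1;
    }
  }

  /* Advance to the first empty slot; the cursor never moves backwards. */
  while (*NewSlot < Length && Slots[*NewSlot] != SLOT_EMPTY) {
    (*NewSlot)++;
  }
  if (*NewSlot == Length) {
    return -2;
  }

  Used = *NewSlot;
  Slots[Used] = ApicId;
  (*NewSlot)++;
  return (int)Used;
}
```

Keeping the NewSlot cursor across calls corresponds to the patch keeping
NewSlot outside the per-CPU loop body, so each batch scans the table for
empty slots at most once.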

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---

Notes:
     > +  if (EFI_ERROR(Status)) {
     > +    goto Fatal;
     >    }
    
     (13) Without having seen the rest of the patches, I think this error
     check should be nested under the same (PluggedCount > 0) condition; in
     other words, I think it only makes sense to check Status after we
     actually call ProcessHotAddedCpus().
    
    Addresses all comments from v5, except for this one, since the lack of
    nesting makes more sense after patch 4, "OvmfPkg/CpuHotplugSmm: introduce
    UnplugCpus()".

 OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 214 ++++++++++++++++++++++---------------
 1 file changed, 129 insertions(+), 85 deletions(-)

diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
index cfe698ed2b5e..05b1f8cb63a6 100644
--- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
+++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
@@ -62,6 +62,130 @@ STATIC UINT32 mPostSmmPenAddress;
 //
 STATIC EFI_HANDLE mDispatchHandle;
 
+/**
+  Process CPUs that have been hot-added, per QemuCpuhpCollectApicIds().
+
+  For each such CPU, relocate the SMBASE, and report the CPU to PiSmmCpuDxeSmm
+  via EFI_SMM_CPU_SERVICE_PROTOCOL. If the supposedly hot-added CPU is already
+  known, skip it silently.
+
+  @param[in] PluggedApicIds    The APIC IDs of the CPUs that have been
+                               hot-plugged.
+
+  @param[in] PluggedCount      The number of filled-in APIC IDs in
+                               PluggedApicIds.
+
+  @retval EFI_SUCCESS          CPUs corresponding to all the APIC IDs are
+                               populated.
+
+  @retval EFI_OUT_OF_RESOURCES Out of APIC ID space in "mCpuHotPlugData".
+
+  @return                      Error codes propagated from SmbaseRelocate()
+                               and mMmCpuService->AddProcessor().
+
+**/
+STATIC
+EFI_STATUS
+ProcessHotAddedCpus (
+  IN APIC_ID                      *PluggedApicIds,
+  IN UINT32                       PluggedCount
+  )
+{
+  EFI_STATUS Status;
+  UINT32     PluggedIdx;
+  UINT32     NewSlot;
+
+  //
+  // The Post-SMM Pen need not be reinstalled multiple times within a single
+  // root MMI handling. Even reinstalling once per root MMI is only prudence;
+  // in theory installing the pen in the driver's entry point function should
+  // suffice.
+  //
+  SmbaseReinstallPostSmmPen (mPostSmmPenAddress);
+
+  PluggedIdx = 0;
+  NewSlot = 0;
+  while (PluggedIdx < PluggedCount) {
+    APIC_ID NewApicId;
+    UINT32  CheckSlot;
+    UINTN   NewProcessorNumberByProtocol;
+
+    NewApicId = PluggedApicIds[PluggedIdx];
+
+    //
+    // Check if the supposedly hot-added CPU is already known to us.
+    //
+    for (CheckSlot = 0;
+         CheckSlot < mCpuHotPlugData->ArrayLength;
+         CheckSlot++) {
+      if (mCpuHotPlugData->ApicId[CheckSlot] == NewApicId) {
+        break;
+      }
+    }
+    if (CheckSlot < mCpuHotPlugData->ArrayLength) {
+      DEBUG ((DEBUG_VERBOSE, "%a: APIC ID " FMT_APIC_ID " was hot-plugged "
+        "before; ignoring it\n", __FUNCTION__, NewApicId));
+      PluggedIdx++;
+      continue;
+    }
+
+    //
+    // Find the first empty slot in CPU_HOT_PLUG_DATA.
+    //
+    while (NewSlot < mCpuHotPlugData->ArrayLength &&
+           mCpuHotPlugData->ApicId[NewSlot] != MAX_UINT64) {
+      NewSlot++;
+    }
+    if (NewSlot == mCpuHotPlugData->ArrayLength) {
+      DEBUG ((DEBUG_ERROR, "%a: no room for APIC ID " FMT_APIC_ID "\n",
+        __FUNCTION__, NewApicId));
+      return EFI_OUT_OF_RESOURCES;
+    }
+
+    //
+    // Store the APIC ID of the new processor to the slot.
+    //
+    mCpuHotPlugData->ApicId[NewSlot] = NewApicId;
+
+    //
+    // Relocate the SMBASE of the new CPU.
+    //
+    Status = SmbaseRelocate (NewApicId, mCpuHotPlugData->SmBase[NewSlot],
+               mPostSmmPenAddress);
+    if (EFI_ERROR (Status)) {
+      goto RevokeNewSlot;
+    }
+
+    //
+    // Add the new CPU with EFI_SMM_CPU_SERVICE_PROTOCOL.
+    //
+    Status = mMmCpuService->AddProcessor (mMmCpuService, NewApicId,
+                              &NewProcessorNumberByProtocol);
+    if (EFI_ERROR (Status)) {
+      DEBUG ((DEBUG_ERROR, "%a: AddProcessor(" FMT_APIC_ID "): %r\n",
+        __FUNCTION__, NewApicId, Status));
+      goto RevokeNewSlot;
+    }
+
+    DEBUG ((DEBUG_INFO, "%a: hot-added APIC ID " FMT_APIC_ID ", SMBASE 0x%Lx, "
+      "EFI_SMM_CPU_SERVICE_PROTOCOL assigned number %Lu\n", __FUNCTION__,
+      NewApicId, (UINT64)mCpuHotPlugData->SmBase[NewSlot],
+      (UINT64)NewProcessorNumberByProtocol));
+
+    NewSlot++;
+    PluggedIdx++;
+  }
+
+  //
+  // We've processed this batch of hot-added CPUs.
+  //
+  return EFI_SUCCESS;
+
+RevokeNewSlot:
+  mCpuHotPlugData->ApicId[NewSlot] = MAX_UINT64;
+
+  return Status;
+}
 
 /**
   CPU Hotplug MMI handler function.
@@ -122,8 +246,6 @@ CpuHotplugMmi (
   UINT8      ApmControl;
   UINT32     PluggedCount;
   UINT32     ToUnplugCount;
-  UINT32     PluggedIdx;
-  UINT32     NewSlot;
 
   //
   // Assert that we are entering this function due to our root MMI handler
@@ -179,87 +301,12 @@ CpuHotplugMmi (
     goto Fatal;
   }
 
-  //
-  // Process hot-added CPUs.
-  //
-  // The Post-SMM Pen need not be reinstalled multiple times within a single
-  // root MMI handling. Even reinstalling once per root MMI is only prudence;
-  // in theory installing the pen in the driver's entry point function should
-  // suffice.
-  //
-  SmbaseReinstallPostSmmPen (mPostSmmPenAddress);
+  if (PluggedCount > 0) {
+    Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
+  }
 
-  PluggedIdx = 0;
-  NewSlot = 0;
-  while (PluggedIdx < PluggedCount) {
-    APIC_ID NewApicId;
-    UINT32  CheckSlot;
-    UINTN   NewProcessorNumberByProtocol;
-
-    NewApicId = mPluggedApicIds[PluggedIdx];
-
-    //
-    // Check if the supposedly hot-added CPU is already known to us.
-    //
-    for (CheckSlot = 0;
-         CheckSlot < mCpuHotPlugData->ArrayLength;
-         CheckSlot++) {
-      if (mCpuHotPlugData->ApicId[CheckSlot] == NewApicId) {
-        break;
-      }
-    }
-    if (CheckSlot < mCpuHotPlugData->ArrayLength) {
-      DEBUG ((DEBUG_VERBOSE, "%a: APIC ID " FMT_APIC_ID " was hot-plugged "
-        "before; ignoring it\n", __FUNCTION__, NewApicId));
-      PluggedIdx++;
-      continue;
-    }
-
-    //
-    // Find the first empty slot in CPU_HOT_PLUG_DATA.
-    //
-    while (NewSlot < mCpuHotPlugData->ArrayLength &&
-           mCpuHotPlugData->ApicId[NewSlot] != MAX_UINT64) {
-      NewSlot++;
-    }
-    if (NewSlot == mCpuHotPlugData->ArrayLength) {
-      DEBUG ((DEBUG_ERROR, "%a: no room for APIC ID " FMT_APIC_ID "\n",
-        __FUNCTION__, NewApicId));
-      goto Fatal;
-    }
-
-    //
-    // Store the APIC ID of the new processor to the slot.
-    //
-    mCpuHotPlugData->ApicId[NewSlot] = NewApicId;
-
-    //
-    // Relocate the SMBASE of the new CPU.
-    //
-    Status = SmbaseRelocate (NewApicId, mCpuHotPlugData->SmBase[NewSlot],
-               mPostSmmPenAddress);
-    if (EFI_ERROR (Status)) {
-      goto RevokeNewSlot;
-    }
-
-    //
-    // Add the new CPU with EFI_SMM_CPU_SERVICE_PROTOCOL.
-    //
-    Status = mMmCpuService->AddProcessor (mMmCpuService, NewApicId,
-                              &NewProcessorNumberByProtocol);
-    if (EFI_ERROR (Status)) {
-      DEBUG ((DEBUG_ERROR, "%a: AddProcessor(" FMT_APIC_ID "): %r\n",
-        __FUNCTION__, NewApicId, Status));
-      goto RevokeNewSlot;
-    }
-
-    DEBUG ((DEBUG_INFO, "%a: hot-added APIC ID " FMT_APIC_ID ", SMBASE 0x%Lx, "
-      "EFI_SMM_CPU_SERVICE_PROTOCOL assigned number %Lu\n", __FUNCTION__,
-      NewApicId, (UINT64)mCpuHotPlugData->SmBase[NewSlot],
-      (UINT64)NewProcessorNumberByProtocol));
-
-    NewSlot++;
-    PluggedIdx++;
+  if (EFI_ERROR(Status)) {
+    goto Fatal;
   }
 
   //
@@ -267,9 +314,6 @@ CpuHotplugMmi (
   //
   return EFI_SUCCESS;
 
-RevokeNewSlot:
-  mCpuHotPlugData->ApicId[NewSlot] = MAX_UINT64;
-
 Fatal:
   ASSERT (FALSE);
   CpuDeadLoop ();
-- 
2.9.3



* [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events
  2021-01-29  0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
  2021-01-29  0:59 ` [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic Ankur Arora
@ 2021-01-29  0:59 ` Ankur Arora
  2021-01-30  2:18   ` Laszlo Ersek
  2021-01-29  0:59 ` [PATCH v6 3/9] OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper Ankur Arora
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-01-29  0:59 UTC (permalink / raw)
  To: devel
  Cc: lersek, imammedo, boris.ostrovsky, Ankur Arora, Jordan Justen,
	Ard Biesheuvel, Aaron Young

Process fw_remove events in QemuCpuhpCollectApicIds() and collect
corresponding APIC IDs for CPUs that are being hot-unplugged.

In addition, we now ignore CPUs that have only the "remove" bit set.
OSPM has not processed these CPUs yet.
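
The resulting decision order can be sketched as a standalone function.
This is an illustration only: ClassifyCpuStatus() and the CPU_EVENT names
are hypothetical, and the bit values are assumed to match the
QEMU_CPUHP_STAT_* definitions in this patch.

```c
#include <stdint.h>

/* Assumed to mirror QEMU_CPUHP_STAT_* (BIT0, BIT1, BIT2, BIT4). */
#define STAT_ENABLED   (1u << 0)
#define STAT_INSERT    (1u << 1)
#define STAT_REMOVE    (1u << 2)
#define STAT_FW_REMOVE (1u << 4)

/* Hypothetical event classes, for illustration only. */
typedef enum {
  EVENT_NONE,
  EVENT_PLUG,
  EVENT_UNPLUG,
  EVENT_IGNORED,
  EVENT_PROTOCOL_ERROR
} CPU_EVENT;

CPU_EVENT
ClassifyCpuStatus (uint8_t CpuStatus)
{
  if ((CpuStatus & STAT_INSERT) != 0) {
    /* "insert" requires "enabled" and excludes "fw_remove". */
    if ((CpuStatus & STAT_ENABLED) == 0 ||
        (CpuStatus & STAT_FW_REMOVE) != 0) {
      return EVENT_PROTOCOL_ERROR;
    }
    return EVENT_PLUG;
  }

  if ((CpuStatus & STAT_FW_REMOVE) != 0) {
    /* "fw_remove" requires "enabled". */
    if ((CpuStatus & STAT_ENABLED) == 0) {
      return EVENT_PROTOCOL_ERROR;
    }
    return EVENT_UNPLUG;
  }

  if ((CpuStatus & STAT_REMOVE) != 0) {
    /* Only "remove" is set: left for OSPM to handle. */
    return EVENT_IGNORED;
  }

  return EVENT_NONE;
}
```

With this ordering, a status of 0x05 (enabled plus remove) is skipped,
while 0x11 (enabled plus fw_remove) is collected for unplug.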

This is based on the QEMU hot-unplug protocol documented here:
  https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/

Also define QEMU_CPUHP_STAT_EJECTED while we are at it.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---

Notes:
    I'm treating events (insert=1, fw_remove=1) below as invalid (return
    EFI_PROTOCOL_ERROR, which ends up as an assert), but I'm not sure
    that is correct:
    
         if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
           //
           // The "insert" event guarantees the "enabled" status; plus it excludes
    -      // the "remove" event.
    +      // the "fw_remove" event.
           //
           if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
    -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
    +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
             DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
               "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
               CpuStatus));
    
    QEMU's handling in cpu_hotplug_rd() can return both of these:
    
    cpu_hotplug_rd() {
       ...
       case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
    	val |= cdev->cpu ? 1 : 0;
    	val |= cdev->is_inserting ? 2 : 0;
    	val |= cdev->is_removing  ? 4 : 0;
    	val |= cdev->fw_remove  ? 16 : 0;
       ...
    }
    and I don't see any code that treats is_inserting and is_removing as
    exclusive.

    One specific case where this looks like it might be a problem is if the
    user unplugs a CPU and plugs it back in right after that.
    
    As part of the unplug handling, the ACPI AML would, in the scan loop,
    asynchronously trigger the notify, which would do the OS unplug, set
    "fw_remove" and then call the SMI_CMD.
    
    The subsequent plug could then come and set the "insert" bit.
    
    Assuming what I'm describing could happen, I'm not sure what's the right
    handling: QEMU could treat these bits as exclusive and then OVMF could
    justifiably treat it as a protocol error?

 OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h |  2 ++
 OvmfPkg/CpuHotplugSmm/QemuCpuhp.c                 | 29 +++++++++++++++++++----
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
index a34a6d3fae61..692e3072598c 100644
--- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
+++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
@@ -34,6 +34,8 @@
 #define QEMU_CPUHP_STAT_ENABLED                BIT0
 #define QEMU_CPUHP_STAT_INSERT                 BIT1
 #define QEMU_CPUHP_STAT_REMOVE                 BIT2
+#define QEMU_CPUHP_STAT_EJECTED                BIT3
+#define QEMU_CPUHP_STAT_FW_REMOVE              BIT4
 
 #define QEMU_CPUHP_RW_CMD_DATA               0x8
 
diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
index 8d4a6693c8d6..f871e50c377b 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
@@ -245,10 +245,10 @@ QemuCpuhpCollectApicIds (
     if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
       //
       // The "insert" event guarantees the "enabled" status; plus it excludes
-      // the "remove" event.
+      // the "fw_remove" event.
       //
       if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
-          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
+          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
         DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
           "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
           CpuStatus));
@@ -260,12 +260,31 @@ QemuCpuhpCollectApicIds (
 
       ExtendIds   = PluggedApicIds;
       ExtendCount = PluggedCount;
-    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
-      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove\n", __FUNCTION__,
-        CurrentSelector));
+    } else if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
+      //
+      // "fw_remove" event guarantees "enabled".
+      //
+      if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
+        DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
+          "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
+          CpuStatus));
+        return EFI_PROTOCOL_ERROR;
+      }
+
+      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: fw_remove\n",
+        __FUNCTION__, CurrentSelector));
 
       ExtendIds   = ToUnplugApicIds;
       ExtendCount = ToUnplugCount;
+    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
+      //
+      // Let the OSPM deal with the "remove" event.
+      //
+      DEBUG ((DEBUG_INFO, "%a: CurrentSelector=%u: remove (ignored)\n",
+        __FUNCTION__, CurrentSelector));
+
+      CurrentSelector++;
+      continue;
     } else {
       DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: no event\n",
         __FUNCTION__, CurrentSelector));
-- 
2.9.3



* [PATCH v6 3/9] OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper
  2021-01-29  0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
  2021-01-29  0:59 ` [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic Ankur Arora
  2021-01-29  0:59 ` [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events Ankur Arora
@ 2021-01-29  0:59 ` Ankur Arora
  2021-01-30  2:36   ` Laszlo Ersek
  2021-01-29  0:59 ` [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus() Ankur Arora
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-01-29  0:59 UTC (permalink / raw)
  To: devel
  Cc: lersek, imammedo, boris.ostrovsky, Ankur Arora, Jordan Justen,
	Ard Biesheuvel, Aaron Young

Add QemuCpuhpWriteCpuStatus(), which will be used to update the QEMU
CPU status register. On error, it hangs, in the same way as the other
helper functions.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 OvmfPkg/CpuHotplugSmm/QemuCpuhp.h |  6 ++++++
 OvmfPkg/CpuHotplugSmm/QemuCpuhp.c | 22 ++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
index 8adaa0ad91f0..804809846890 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
@@ -30,6 +30,12 @@ QemuCpuhpReadCpuStatus (
   IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo
   );
 
+VOID
+QemuCpuhpWriteCpuStatus (
+  IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo,
+  IN UINT8                        CpuStatus
+  );
+
 UINT32
 QemuCpuhpReadCommandData (
   IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo
diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
index f871e50c377b..ed44264de934 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
@@ -67,6 +67,28 @@ QemuCpuhpReadCpuStatus (
   return CpuStatus;
 }
 
+VOID
+QemuCpuhpWriteCpuStatus (
+  IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo,
+  IN UINT8                        CpuStatus
+  )
+{
+  EFI_STATUS Status;
+
+  Status = MmCpuIo->Io.Write (
+                         MmCpuIo,
+                         MM_IO_UINT8,
+                         ICH9_CPU_HOTPLUG_BASE + QEMU_CPUHP_R_CPU_STAT,
+                         1,
+                         &CpuStatus
+                         );
+  if (EFI_ERROR (Status)) {
+    DEBUG ((DEBUG_ERROR, "%a: %r\n", __FUNCTION__, Status));
+    ASSERT (FALSE);
+    CpuDeadLoop ();
+  }
+}
+
 UINT32
 QemuCpuhpReadCommandData (
   IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo
-- 
2.9.3



* [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus()
  2021-01-29  0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
                   ` (2 preceding siblings ...)
  2021-01-29  0:59 ` [PATCH v6 3/9] OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper Ankur Arora
@ 2021-01-29  0:59 ` Ankur Arora
  2021-01-30  2:37   ` Laszlo Ersek
  2021-02-01  3:13   ` Laszlo Ersek
  2021-01-29  0:59 ` [PATCH v6 5/9] OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA Ankur Arora
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 52+ messages in thread
From: Ankur Arora @ 2021-01-29  0:59 UTC (permalink / raw)
  To: devel
  Cc: lersek, imammedo, boris.ostrovsky, Ankur Arora, Jordan Justen,
	Ard Biesheuvel, Aaron Young

Introduce UnplugCpus() which maps each APIC ID being unplugged
onto the hardware ID of the processor and informs PiSmmCpuDxeSmm
of removal by calling EFI_SMM_CPU_SERVICE_PROTOCOL.RemoveProcessor().

With this change we handle the first phase of unplug where we collect
the CPUs that need to be unplugged and mark them for removal in SMM
data structures.
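
The lookup that maps an APIC ID back to a processor number can be
sketched on its own. This is a simplified stand-in, not the driver code:
FindProcessorNum() is a hypothetical name, and the table stands in for
mCpuHotPlugData->ApicId.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the index of ApicId in ApicIdTable, or
 * Length if the APIC ID is not present (the "not found" convention the
 * patch relies on when it compares against ArrayLength).
 */
uint32_t
FindProcessorNum (const uint64_t *ApicIdTable, uint32_t Length,
                  uint64_t ApicId)
{
  uint32_t ProcessorNum;

  for (ProcessorNum = 0; ProcessorNum < Length; ProcessorNum++) {
    if (ApicIdTable[ProcessorNum] == ApicId) {
      break;
    }
  }
  return ProcessorNum;
}
```

A caller would skip the unplug request when the result equals Length and
otherwise pass the index on to RemoveProcessor().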

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 84 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
index 05b1f8cb63a6..70d69f6ed65b 100644
--- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
+++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
@@ -188,6 +188,88 @@ RevokeNewSlot:
 }
 
 /**
+  Process to be hot-unplugged CPUs, per QemuCpuhpCollectApicIds().
+
+  For each such CPU, report the CPU to PiSmmCpuDxeSmm via
+  EFI_SMM_CPU_SERVICE_PROTOCOL. If the to be hot-unplugged CPU is
+  unknown, skip it silently.
+
+  @param[in] ToUnplugApicIds    The APIC IDs of the CPUs that are about to be
+                                hot-unplugged.
+
+  @param[in] ToUnplugCount      The number of filled-in APIC IDs in
+                                ToUnplugApicIds.
+
+  @retval EFI_SUCCESS           Known APIC IDs have been removed from SMM data
+                                structures.
+
+  @return                       Error codes propagated from
+                                mMmCpuService->RemoveProcessor().
+
+**/
+STATIC
+EFI_STATUS
+UnplugCpus (
+  IN APIC_ID                      *ToUnplugApicIds,
+  IN UINT32                       ToUnplugCount
+  )
+{
+  EFI_STATUS Status;
+  UINT32     ToUnplugIdx;
+  UINTN      ProcessorNum;
+
+  ToUnplugIdx = 0;
+  while (ToUnplugIdx < ToUnplugCount) {
+    APIC_ID    RemoveApicId;
+
+    RemoveApicId = ToUnplugApicIds[ToUnplugIdx];
+
+    //
+    // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use it to find
+    // the ProcessorNum for the APIC ID to be removed.
+    //
+    for (ProcessorNum = 0;
+         ProcessorNum < mCpuHotPlugData->ArrayLength;
+         ProcessorNum++) {
+      if (mCpuHotPlugData->ApicId[ProcessorNum] == RemoveApicId) {
+        break;
+      }
+    }
+
+    //
+    // Ignore the unplug if APIC ID not found
+    //
+    if (ProcessorNum == mCpuHotPlugData->ArrayLength) {
+      DEBUG ((DEBUG_INFO, "%a: did not find APIC ID " FMT_APIC_ID
+          " to unplug\n", __FUNCTION__, RemoveApicId));
+      ToUnplugIdx++;
+      continue;
+    }
+
+    //
+    // Mark ProcessorNum for removal from SMM data structures
+    //
+    Status = mMmCpuService->RemoveProcessor (mMmCpuService, ProcessorNum);
+
+    if (EFI_ERROR (Status)) {
+      DEBUG ((DEBUG_ERROR, "%a: RemoveProcessor(" FMT_APIC_ID "): %r\n",
+        __FUNCTION__, RemoveApicId, Status));
+      goto Fatal;
+    }
+
+    ToUnplugIdx++;
+  }
+
+  //
+  // We've removed this set of APIC IDs from SMM data structures.
+  //
+  return EFI_SUCCESS;
+
+Fatal:
+  return Status;
+}
+
+/**
   CPU Hotplug MMI handler function.
 
   This is a root MMI handler.
@@ -303,6 +385,8 @@ CpuHotplugMmi (
 
   if (PluggedCount > 0) {
     Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
+  } else if (ToUnplugCount > 0) {
+    Status = UnplugCpus (mToUnplugApicIds, ToUnplugCount);
   }
 
   if (EFI_ERROR(Status)) {
-- 
2.9.3



* [PATCH v6 5/9] OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA
  2021-01-29  0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
                   ` (3 preceding siblings ...)
  2021-01-29  0:59 ` [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus() Ankur Arora
@ 2021-01-29  0:59 ` Ankur Arora
  2021-02-01  4:53   ` Laszlo Ersek
  2021-01-29  0:59 ` [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state Ankur Arora
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-01-29  0:59 UTC (permalink / raw)
  To: devel
  Cc: lersek, imammedo, boris.ostrovsky, Ankur Arora, Jordan Justen,
	Ard Biesheuvel, Aaron Young

Define CPU_HOT_EJECT_DATA and add PCD PcdCpuHotEjectDataAddress, which
will be used to share CPU ejection state between OvmfPkg/CpuHotplugSmm
and PiSmmCpuDxeSmm.
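
As a rough idea of how the two sides might use the structure, here is a
standalone sketch. It is an assumption about usage, not code from this
series: QueueEject(), RunEjectHandler(), CountingHandler() and
gEjectCalls are hypothetical, and the types are plain C stand-ins for the
EDK2 ones.

```c
#include <stddef.h>
#include <stdint.h>

#define CPU_EJECT_INVALID UINT64_MAX   /* mirrors the header's sentinel */

typedef void (*CPU_HOT_EJECT_FN)(size_t ProcessorNum);

/* Plain C stand-in for CPU_HOT_EJECT_DATA as defined in this patch. */
typedef struct {
  uint32_t         Revision;
  uint32_t         ArrayLength;
  uint64_t         *ApicIdMap;
  CPU_HOT_EJECT_FN Handler;
  uint64_t         Reserved;
} CPU_HOT_EJECT_DATA;

/* Hypothetical: counts handler invocations, for demonstration only. */
unsigned gEjectCalls = 0;

void
CountingHandler (size_t ProcessorNum)
{
  (void)ProcessorNum;
  gEjectCalls++;
}

/*
 * Hypothetical producer side: mark ProcessorNum as pending ejection.
 * Returns 0 on success, -1 if the index is out of range or an ejection
 * is already pending for that processor.
 */
int
QueueEject (CPU_HOT_EJECT_DATA *Data, size_t ProcessorNum, uint64_t ApicId)
{
  if (ProcessorNum >= Data->ArrayLength ||
      Data->ApicIdMap[ProcessorNum] != CPU_EJECT_INVALID) {
    return -1;
  }
  Data->ApicIdMap[ProcessorNum] = ApicId;
  return 0;
}

/* Hypothetical consumer side: call the handler only if one is installed. */
void
RunEjectHandler (const CPU_HOT_EJECT_DATA *Data, size_t ProcessorNum)
{
  if (Data != NULL && Data->Handler != NULL) {
    Data->Handler (ProcessorNum);
  }
}
```

Leaving Handler as NULL until there is work to do would let the consumer
skip the call cheaply; whether the series does exactly that is decided by
the later patches, not by this header.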

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 OvmfPkg/OvmfPkg.dec                       | 10 +++++++++
 OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf   |  1 +
 OvmfPkg/Include/Library/CpuHotEjectData.h | 35 +++++++++++++++++++++++++++++++
 3 files changed, 46 insertions(+)
 create mode 100644 OvmfPkg/Include/Library/CpuHotEjectData.h

diff --git a/OvmfPkg/OvmfPkg.dec b/OvmfPkg/OvmfPkg.dec
index 4348bb45c64a..1a2debb821d7 100644
--- a/OvmfPkg/OvmfPkg.dec
+++ b/OvmfPkg/OvmfPkg.dec
@@ -106,6 +106,10 @@ [LibraryClasses]
   #
   XenPlatformLib|Include/Library/XenPlatformLib.h
 
+  ##  @libraryclass  Share CPU hot-eject state
+  #
+  CpuHotEjectData|Include/Library/CpuHotEjectData.h
+
 [Guids]
   gUefiOvmfPkgTokenSpaceGuid            = {0x93bb96af, 0xb9f2, 0x4eb8, {0x94, 0x62, 0xe0, 0xba, 0x74, 0x56, 0x42, 0x36}}
   gEfiXenInfoGuid                       = {0xd3b46f3b, 0xd441, 0x1244, {0x9a, 0x12, 0x0, 0x12, 0x27, 0x3f, 0xc1, 0x4d}}
@@ -352,6 +356,12 @@ [PcdsDynamic, PcdsDynamicEx]
   #  This PCD is only accessed if PcdSmmSmramRequire is TRUE (see below).
   gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase|FALSE|BOOLEAN|0x34
 
+  ## This PCD adds a communication channel between PiSmmCpuDxeSmm and
+  #  CpuHotplugSmm.
+  #
+  #  Only accessed if PcdCpuHotPlugSupport is TRUE
+  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress|0|UINT64|0x46
+
 [PcdsFeatureFlag]
   gUefiOvmfPkgTokenSpaceGuid.PcdQemuBootOrderPciTranslation|TRUE|BOOLEAN|0x1c
   gUefiOvmfPkgTokenSpaceGuid.PcdQemuBootOrderMmioTranslation|FALSE|BOOLEAN|0x1d
diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
index 04322b0d7855..e08b572ef169 100644
--- a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
+++ b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
@@ -54,6 +54,7 @@ [Protocols]
 
 [Pcd]
   gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress                ## CONSUMES
+  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress              ## CONSUMES
   gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase             ## CONSUMES
 
 [FeaturePcd]
diff --git a/OvmfPkg/Include/Library/CpuHotEjectData.h b/OvmfPkg/Include/Library/CpuHotEjectData.h
new file mode 100644
index 000000000000..b6fb629a1283
--- /dev/null
+++ b/OvmfPkg/Include/Library/CpuHotEjectData.h
@@ -0,0 +1,35 @@
+/** @file
+  Definition for a CPU hot-eject state sharing structure.
+
+  Copyright (C) 2021, Oracle Corporation.
+
+  SPDX-License-Identifier: BSD-2-Clause-Patent
+**/
+
+#ifndef _CPU_HOT_EJECT_DATA_H_
+#define _CPU_HOT_EJECT_DATA_H_
+
+typedef
+VOID
+(EFIAPI *CPU_HOT_EJECT_FN)(
+  IN UINTN  ProcessorNum
+  );
+
+#define CPU_EJECT_INVALID               (MAX_UINT64)
+#define CPU_EJECT_WORKER                (MAX_UINT64-1)
+
+#define  CPU_HOT_EJECT_DATA_REVISION_1  0x00000001
+
+typedef struct {
+  UINT32           Revision;          // Used for version identification of
+                                      // this structure
+  UINT32           ArrayLength;       // Entries in the ApicIdMap array
+
+  UINT64           *ApicIdMap;        // Pointer to CpuIndex->ApicId map for
+                                      // pending hot-ejects
+  CPU_HOT_EJECT_FN Handler;           // Handler to do the CPU ejection
+
+  UINT64           Reserved;
+} CPU_HOT_EJECT_DATA;
+
+#endif /* _CPU_HOT_EJECT_DATA_H_ */
-- 
2.9.3



* [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state
  2021-01-29  0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
                   ` (4 preceding siblings ...)
  2021-01-29  0:59 ` [PATCH v6 5/9] OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA Ankur Arora
@ 2021-01-29  0:59 ` Ankur Arora
  2021-02-01 13:36   ` Laszlo Ersek
  2021-01-29  0:59 ` [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject() Ankur Arora
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-01-29  0:59 UTC (permalink / raw)
  To: devel
  Cc: lersek, imammedo, boris.ostrovsky, Ankur Arora, Jordan Justen,
	Ard Biesheuvel, Aaron Young

Init CPU_HOT_EJECT_DATA, which will be used to share CPU ejection state
between SmmCpuFeaturesLib (via PiSmmCpuDxeSmm) and CpuHotplugSmm.
CpuHotplugSmm also sets up the CPU ejection mechanism via
CPU_HOT_EJECT_DATA->Handler.

Additionally, expose CPU_HOT_EJECT_DATA via PcdCpuHotEjectDataAddress.
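
For readers outside the firmware tree, the producer/consumer hand-off
described above can be sketched in plain C. This is only a minimal sketch
under stated assumptions: hot_eject_data, g_eject_data_address,
init_hot_eject() and lookup_hot_eject() are hypothetical stand-ins for
CPU_HOT_EJECT_DATA, the PcdCpuHotEjectDataAddress PCD, and the two drivers;
malloc() replaces SMRAM pool allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for CPU_HOT_EJECT_FN. */
typedef void (*eject_fn)(uintptr_t processor_num);

/* Hypothetical stand-in for CPU_HOT_EJECT_DATA. */
typedef struct {
  uint32_t  revision;      /* structure version                    */
  uint32_t  array_length;  /* entries in apic_id_map               */
  uint64_t *apic_id_map;   /* CpuIndex -> APIC ID, pending ejects  */
  eject_fn  handler;       /* installed later by the consumer      */
  uint64_t  reserved;
} hot_eject_data;

/* Stand-in for the PCD that publishes the structure's address. */
static uint64_t g_eject_data_address;

/* Producer side: returns NULL when there is no spare CPU to eject. */
static hot_eject_data *init_hot_eject(uint32_t max_cpus)
{
  hot_eject_data *data;

  if (max_cpus <= 1) {
    return NULL;
  }

  data = malloc(sizeof *data);
  assert(data != NULL);

  data->revision     = 1;
  data->array_length = max_cpus;
  data->apic_id_map  = malloc(sizeof *data->apic_id_map * max_cpus);
  assert(data->apic_id_map != NULL);
  data->handler      = NULL;   /* no ejection mechanism yet */
  data->reserved     = 0;

  /* Publish the address so that the consumer can find the structure. */
  g_eject_data_address = (uint64_t)(uintptr_t)data;
  return data;
}

/* Consumer side: recover the pointer from the published address. */
static hot_eject_data *lookup_hot_eject(void)
{
  return (hot_eject_data *)(uintptr_t)g_eject_data_address;
}
```

In this sketch, as in the patch, the producer leaves the handler unset and
the map contents uninitialized; the consumer is expected to fill in both.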

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 .../SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf        |  3 +
 .../Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c  | 78 ++++++++++++++++++++++
 2 files changed, 81 insertions(+)

diff --git a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
index 97a10afb6e27..32c63722ee62 100644
--- a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
+++ b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
@@ -35,4 +35,7 @@ [LibraryClasses]
   UefiBootServicesTableLib
 
 [Pcd]
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugSupport
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber
+  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress
   gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase
diff --git a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c
index 7ef7ed98342e..33dd5da92432 100644
--- a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c
+++ b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c
@@ -14,7 +14,9 @@
 #include <Library/PcdLib.h>
 #include <Library/SmmCpuFeaturesLib.h>
 #include <Library/SmmServicesTableLib.h>
+#include <Library/MemoryAllocationLib.h>
 #include <Library/UefiBootServicesTableLib.h>
+#include <Library/CpuHotEjectData.h>
 #include <PiSmm.h>
 #include <Register/Intel/SmramSaveStateMap.h>
 #include <Register/QemuSmramSaveStateMap.h>
@@ -171,6 +173,70 @@ SmmCpuFeaturesHookReturnFromSmm (
   return OriginalInstructionPointer;
 }
 
+GLOBAL_REMOVE_IF_UNREFERENCED
+CPU_HOT_EJECT_DATA *mCpuHotEjectData = NULL;
+
+/**
+  Initialize CpuHotEjectData if PcdCpuHotPlugSupport is enabled
+  and more than 1 CPU is configured.
+
+  Also sets up the corresponding PcdCpuHotEjectDataAddress.
+**/
+STATIC
+VOID
+SmmCpuFeaturesSmmInitHotEject (
+  VOID
+  )
+{
+  UINT32      mMaxNumberOfCpus;
+  EFI_STATUS  Status;
+
+  if (!FeaturePcdGet (PcdCpuHotPlugSupport)) {
+    return;
+  }
+
+  // PcdCpuHotPlugSupport => PcdCpuMaxLogicalProcessorNumber
+  mMaxNumberOfCpus = PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
+
+  // No spare CPUs to hot-eject
+  if (mMaxNumberOfCpus == 1) {
+    return;
+  }
+
+  mCpuHotEjectData =
+    (CPU_HOT_EJECT_DATA *)AllocatePool (sizeof (*mCpuHotEjectData));
+  ASSERT (mCpuHotEjectData != NULL);
+
+  //
+  // Allocate buffer for pointers to array in CPU_HOT_EJECT_DATA.
+  //
+
+  // Revision
+  mCpuHotEjectData->Revision = CPU_HOT_EJECT_DATA_REVISION_1;
+
+  // Array Length of APIC ID
+  mCpuHotEjectData->ArrayLength = mMaxNumberOfCpus;
+
+  // CpuIndex -> APIC ID map
+  mCpuHotEjectData->ApicIdMap = (UINT64 *)AllocatePool
+      (sizeof (*mCpuHotEjectData->ApicIdMap) * mCpuHotEjectData->ArrayLength);
+
+  // Hot-eject handler
+  mCpuHotEjectData->Handler = NULL;
+
+  // Reserved
+  mCpuHotEjectData->Reserved = 0;
+
+  ASSERT (mCpuHotEjectData->ApicIdMap != NULL);
+
+  //
+  // Expose address of CPU Hot eject Data structure
+  //
+  Status = PcdSet64S (PcdCpuHotEjectDataAddress,
+                      (UINT64)(VOID *)mCpuHotEjectData);
+  ASSERT_EFI_ERROR (Status);
+}
+
 /**
   Hook point in normal execution mode that allows the one CPU that was elected
   as monarch during System Management Mode initialization to perform additional
@@ -188,6 +254,9 @@ SmmCpuFeaturesSmmRelocationComplete (
   UINTN      MapPagesBase;
   UINTN      MapPagesCount;
 
+
+  SmmCpuFeaturesSmmInitHotEject ();
+
   if (!MemEncryptSevIsEnabled ()) {
     return;
   }
@@ -375,6 +444,15 @@ SmmCpuFeaturesRendezvousExit (
   IN UINTN  CpuIndex
   )
 {
+  //
+  // CPU Hot-eject not enabled.
+  //
+  if (mCpuHotEjectData == NULL ||
+      mCpuHotEjectData->Handler == NULL) {
+    return;
+  }
+
+  mCpuHotEjectData->Handler (CpuIndex);
 }
 
 /**
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-01-29  0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
                   ` (5 preceding siblings ...)
  2021-01-29  0:59 ` [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state Ankur Arora
@ 2021-01-29  0:59 ` Ankur Arora
  2021-02-01 16:11   ` Laszlo Ersek
  2021-02-01 19:08   ` Laszlo Ersek
  2021-01-29  0:59 ` [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection Ankur Arora
  2021-01-29  0:59 ` [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug Ankur Arora
  8 siblings, 2 replies; 52+ messages in thread
From: Ankur Arora @ 2021-01-29  0:59 UTC (permalink / raw)
  To: devel
  Cc: lersek, imammedo, boris.ostrovsky, Ankur Arora, Jordan Justen,
	Ard Biesheuvel, Aaron Young

Add CpuEject(), which handles the CPU ejection, and provides a holding
area for said CPUs. It is called via SmmCpuFeaturesRendezvousExit(),
at the tail end of the SMI handling.

Also, UnplugCpus() now stashes the APIC IDs of CPUs that need to be
ejected in CPU_HOT_EJECT_DATA.ApicIdMap. CpuEject() uses these to
identify such CPUs.
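
The stash-then-check flow described above can be sketched in standalone C.
This is only an illustration under stated assumptions: apic_id_map,
eject_handler, stash_for_eject() and rendezvous_exit() are hypothetical
names, MAX_CPUS is an arbitrary size, and cpu_eject() returns true where the
firmware would instead spin in CpuDeadLoop().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EJECT_INVALID UINT64_MAX   /* mirrors CPU_EJECT_INVALID */
#define MAX_CPUS      8u           /* arbitrary size for the sketch */

static uint64_t apic_id_map[MAX_CPUS];
static bool (*eject_handler)(size_t processor_num);

/* Driver entry: no pending ejections, no handler installed. */
static void reset_map(void)
{
  for (size_t i = 0; i < MAX_CPUS; i++) {
    apic_id_map[i] = EJECT_INVALID;
  }
  eject_handler = NULL;
}

/* True when the calling CPU must be held (the firmware would spin here). */
static bool cpu_eject(size_t processor_num)
{
  return apic_id_map[processor_num] != EJECT_INVALID;
}

/* Unplug path: remember the APIC ID and install the handler on demand.
   Returns false if the slot is out of range or already occupied. */
static bool stash_for_eject(size_t processor_num, uint32_t apic_id)
{
  if (processor_num >= MAX_CPUS ||
      apic_id_map[processor_num] != EJECT_INVALID) {
    return false;
  }
  apic_id_map[processor_num] = apic_id;
  eject_handler = cpu_eject;
  return true;
}

/* Per-CPU hook at SMI exit: a no-op until a handler has been installed. */
static bool rendezvous_exit(size_t processor_num)
{
  if (eject_handler == NULL) {
    return false;
  }
  return eject_handler(processor_num);
}
```

Installing the handler only once something has been stashed keeps the
common SMI exit path down to a single NULL check.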

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 109 +++++++++++++++++++++++++++++++++++--
 1 file changed, 105 insertions(+), 4 deletions(-)

diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
index 70d69f6ed65b..526f51faf070 100644
--- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
+++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
@@ -14,6 +14,7 @@
 #include <Library/MmServicesTableLib.h>      // gMmst
 #include <Library/PcdLib.h>                  // PcdGetBool()
 #include <Library/SafeIntLib.h>              // SafeUintnSub()
+#include <Library/CpuHotEjectData.h>         // CPU_HOT_EJECT_DATA
 #include <Protocol/MmCpuIo.h>                // EFI_MM_CPU_IO_PROTOCOL
 #include <Protocol/SmmCpuService.h>          // EFI_SMM_CPU_SERVICE_PROTOCOL
 #include <Uefi/UefiBaseType.h>               // EFI_STATUS
@@ -32,11 +33,12 @@ STATIC EFI_MM_CPU_IO_PROTOCOL *mMmCpuIo;
 //
 STATIC EFI_SMM_CPU_SERVICE_PROTOCOL *mMmCpuService;
 //
-// This structure is a communication side-channel between the
+// These structures serve as communication side-channels between the
 // EFI_SMM_CPU_SERVICE_PROTOCOL consumer (i.e., this driver) and provider
 // (i.e., PiSmmCpuDxeSmm).
 //
 STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
+STATIC CPU_HOT_EJECT_DATA *mCpuHotEjectData;
 //
 // SMRAM arrays for fetching the APIC IDs of processors with pending events (of
 // known event types), for the time of just one MMI.
@@ -188,11 +190,53 @@ RevokeNewSlot:
 }
 
 /**
+  CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
+  on each CPU at exit from SMM.
+
+  If, the executing CPU is not being ejected, nothing to be done.
+  If, the executing CPU is being ejected, wait in a CpuDeadLoop()
+  until ejected.
+
+  @param[in] ProcessorNum      Index of executing CPU.
+
+**/
+VOID
+EFIAPI
+CpuEject (
+  IN UINTN ProcessorNum
+  )
+{
+  //
+  // APIC ID is UINT32, but mCpuHotEjectData->ApicIdMap[] is UINT64
+  // so use UINT64 throughout.
+  //
+  UINT64 ApicId;
+
+  ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
+  if (ApicId == CPU_EJECT_INVALID) {
+    return;
+  }
+
+  //
+  // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit()
+  // after having been cleared to exit the SMI by the monarch and thus have
+  // no SMM processing remaining.
+  //
+  // Given that we cannot allow them to escape to the guest, we pen them
+  // here until the SMM monarch tells the HW to unplug them.
+  //
+  CpuDeadLoop ();
+}
+
+/**
   Process to be hot-unplugged CPUs, per QemuCpuhpCollectApicIds().
 
   For each such CPU, report the CPU to PiSmmCpuDxeSmm via
-  EFI_SMM_CPU_SERVICE_PROTOCOL. If the to be hot-unplugged CPU is
-  unknown, skip it silently.
+  EFI_SMM_CPU_SERVICE_PROTOCOL and stash the APIC ID for later ejection.
+  If the to be hot-unplugged CPU is unknown, skip it silently.
+
+  Additionally, if we do stash any APIC IDs, also install a CPU eject handler
+  which would handle the ejection.
 
   @param[in] ToUnplugApicIds    The APIC IDs of the CPUs that are about to be
                                 hot-unplugged.
@@ -216,9 +260,11 @@ UnplugCpus (
 {
   EFI_STATUS Status;
   UINT32     ToUnplugIdx;
+  UINT32     EjectCount;
   UINTN      ProcessorNum;
 
   ToUnplugIdx = 0;
+  EjectCount = 0;
   while (ToUnplugIdx < ToUnplugCount) {
     APIC_ID    RemoveApicId;
 
@@ -255,13 +301,41 @@ UnplugCpus (
       DEBUG ((DEBUG_ERROR, "%a: RemoveProcessor(" FMT_APIC_ID "): %r\n",
         __FUNCTION__, RemoveApicId, Status));
       goto Fatal;
+    } else {
+      //
+      // Stash the APIC IDs so we can do the actual ejection later.
+      //
+      if (mCpuHotEjectData->ApicIdMap[ProcessorNum] != CPU_EJECT_INVALID) {
+        //
+        // Since ProcessorNum and APIC ID map 1-1, a valid
+        // mCpuHotEjectData->ApicIdMap[ProcessorNum] means something
+        // is horribly wrong.
+        //
+        DEBUG ((DEBUG_ERROR, "%a: ProcessorNum %u maps to %llx, cannot "
+                "map to " FMT_APIC_ID "\n", __FUNCTION__, ProcessorNum,
+                mCpuHotEjectData->ApicIdMap[ProcessorNum], RemoveApicId));
+
+        Status = EFI_INVALID_PARAMETER;
+        goto Fatal;
+      }
+
+      mCpuHotEjectData->ApicIdMap[ProcessorNum] = (UINT64)RemoveApicId;
+      EjectCount++;
     }
 
     ToUnplugIdx++;
   }
 
+  if (EjectCount != 0) {
+    //
+    // We have processors to be ejected; install the handler.
+    //
+    mCpuHotEjectData->Handler = CpuEject;
+  }
+
   //
-  // We've removed this set of APIC IDs from SMM data structures.
+  // We've removed this set of APIC IDs from SMM data structures and
+  // have installed an ejection handler if needed.
   //
   return EFI_SUCCESS;
 
@@ -458,7 +532,13 @@ CpuHotplugEntry (
   // Our DEPEX on EFI_SMM_CPU_SERVICE_PROTOCOL guarantees that PiSmmCpuDxeSmm
   // has pointed PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM.
   //
+  // Additionally, CPU Hot-unplug is available only if CPU Hotplug is, so
+  // the same DEPEX also guarantees that PcdCpuHotEjectDataAddress points
+  // to CPU_HOT_EJECT_DATA in SMRAM.
+  //
   mCpuHotPlugData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotPlugDataAddress);
+  mCpuHotEjectData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotEjectDataAddress);
+
   if (mCpuHotPlugData == NULL) {
     Status = EFI_NOT_FOUND;
     DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_PLUG_DATA: %r\n", __FUNCTION__, Status));
@@ -470,6 +550,9 @@ CpuHotplugEntry (
   if (mCpuHotPlugData->ArrayLength == 1) {
     return EFI_UNSUPPORTED;
   }
+  ASSERT (mCpuHotEjectData &&
+          (mCpuHotPlugData->ArrayLength == mCpuHotEjectData->ArrayLength));
+
   //
   // Allocate the data structures that depend on the possible CPU count.
   //
@@ -552,6 +635,24 @@ CpuHotplugEntry (
   //
   SmbaseInstallFirstSmiHandler ();
 
+  if (mCpuHotEjectData) {
+    UINT32     Idx;
+    //
+    // For CPU ejection we need to map ProcessorNum -> APIC_ID. By the time
+    // we need the mapping, however, the Processor's APIC ID has already been
+    // removed from SMM data structures. So we will maintain a local map
+    // in mCpuHotEjectData->ApicIdMap.
+    //
+    for (Idx = 0; Idx < mCpuHotEjectData->ArrayLength; Idx++) {
+      mCpuHotEjectData->ApicIdMap[Idx] = CPU_EJECT_INVALID;
+    }
+
+    //
+    // Wait to init the handler until an ejection is warranted
+    //
+    mCpuHotEjectData->Handler = NULL;
+  }
+
   return EFI_SUCCESS;
 
 ReleasePostSmmPen:
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection
  2021-01-29  0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
                   ` (6 preceding siblings ...)
  2021-01-29  0:59 ` [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject() Ankur Arora
@ 2021-01-29  0:59 ` Ankur Arora
  2021-02-01 17:22   ` Laszlo Ersek
  2021-01-29  0:59 ` [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug Ankur Arora
  8 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-01-29  0:59 UTC (permalink / raw)
  To: devel
  Cc: lersek, imammedo, boris.ostrovsky, Ankur Arora, Jordan Justen,
	Ard Biesheuvel, Aaron Young

Designate a worker CPU (we use the one executing the root MMI
handler), which will do the actual ejection via QEMU in CpuEject().

CpuEject(), on the worker CPU, ejects each marked CPU by first
selecting its APIC ID and then sending the QEMU "eject" command.
QEMU in turn signals the remote VCPU thread, which context-switches
it out of the SMI.

CpuEject(), on the CPU being ejected, spins around in its holding
area until this final context-switch. This does mean that there is
some CPU state that would ordinarily be restored (in SmiRendezvous()
and in SmiEntry.nasm::CommonHandler), but will not be anymore.
This unrestored state includes FPU state, CET enable, stuffing of
RSB and the final RSM. Since the CPU state is destroyed by QEMU,
this should be okay.
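
The three roles a CPU can take at SMI exit can be sketched in standalone C.
This is only a minimal sketch under stated assumptions: send_eject() is a
hypothetical stand-in for the QEMU selector and status writes, ejected_log
exists purely so the result can be inspected, and ROLE_PARKED is returned
where the firmware would spin until QEMU removes the VCPU.

```c
#include <stddef.h>
#include <stdint.h>

#define EJECT_INVALID UINT64_MAX         /* mirrors CPU_EJECT_INVALID */
#define EJECT_WORKER  (UINT64_MAX - 1)   /* mirrors CPU_EJECT_WORKER  */
#define MAX_CPUS      4u                 /* arbitrary size for the sketch */

static uint64_t apic_id_map[4] = {
  EJECT_INVALID, EJECT_INVALID, EJECT_INVALID, EJECT_INVALID
};

/* Hypothetical record of the eject commands "sent" to the hypervisor. */
static uint32_t ejected_log[4];
static size_t   ejected_count;

static void send_eject(uint32_t apic_id)
{
  if (ejected_count < MAX_CPUS) {
    ejected_log[ejected_count++] = apic_id;
  }
}

typedef enum { ROLE_NONE, ROLE_WORKER, ROLE_PARKED } role;

static role cpu_eject(size_t self)
{
  uint64_t mine = apic_id_map[self];

  if (mine == EJECT_INVALID) {
    return ROLE_NONE;               /* not involved in this unplug */
  }
  if (mine != EJECT_WORKER) {
    return ROLE_PARKED;             /* firmware would spin here */
  }

  /* Worker: eject every marked CPU, then clear its map entry. */
  for (size_t i = 0; i < MAX_CPUS; i++) {
    uint64_t id = apic_id_map[i];

    if (id != EJECT_INVALID && id != EJECT_WORKER) {
      send_eject((uint32_t)id);
      apic_id_map[i] = EJECT_INVALID;
    }
  }

  apic_id_map[self] = EJECT_INVALID;  /* clear our own worker status */
  return ROLE_WORKER;
}
```

Clearing each entry right after the eject command means a stale entry cannot
pen a CPU that is later hot-plugged into the same index.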

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 73 ++++++++++++++++++++++++++++++++++----
 1 file changed, 67 insertions(+), 6 deletions(-)

diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
index 526f51faf070..bf91344eef9c 100644
--- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
+++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
@@ -193,9 +193,12 @@ RevokeNewSlot:
   CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
   on each CPU at exit from SMM.
 
-  If, the executing CPU is not being ejected, nothing to be done.
+  If, the executing CPU is neither a worker, nor being ejected, nothing
+  to be done.
   If, the executing CPU is being ejected, wait in a CpuDeadLoop()
   until ejected.
+  If, the executing CPU is a worker CPU, set QEMU CPU status to eject
+  for CPUs being ejected.
 
   @param[in] ProcessorNum      Index of executing CPU.
 
@@ -217,6 +220,56 @@ CpuEject (
     return;
   }
 
+  if (ApicId == CPU_EJECT_WORKER) {
+    UINT32 CpuIndex;
+
+    for (CpuIndex = 0; CpuIndex < mCpuHotEjectData->ArrayLength; CpuIndex++) {
+      UINT64 RemoveApicId;
+
+      RemoveApicId = mCpuHotEjectData->ApicIdMap[CpuIndex];
+
+      if ((RemoveApicId != CPU_EJECT_INVALID &&
+           RemoveApicId != CPU_EJECT_WORKER)) {
+        //
+        // This to-be-ejected CPU has already received the BSP's SMI exit
+        // signal and either will execute SmmCpuFeaturesRendezvousExit()
+        // followed by this callback, or is already waiting in the
+        // CpuDeadLoop() below.
+        //
+        // Tell QEMU to context-switch it out.
+        //
+        QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId);
+        QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED);
+
+        //
+        // Compiler barrier to ensure the next store isn't reordered
+        //
+        MemoryFence ();
+
+        //
+        // Clear the eject status for CpuIndex to ensure that an invalid
+        // SMI later does not end up trying to eject it or a newly
+        // hotplugged CpuIndex does not go into the dead loop.
+        //
+        mCpuHotEjectData->ApicIdMap[CpuIndex] = CPU_EJECT_INVALID;
+
+        DEBUG ((DEBUG_INFO, "%a: Unplugged CPU %u -> " FMT_APIC_ID "\n",
+               __FUNCTION__, CpuIndex, RemoveApicId));
+      }
+    }
+
+    //
+    // Clear our own worker status.
+    //
+    mCpuHotEjectData->ApicIdMap[ProcessorNum] = CPU_EJECT_INVALID;
+
+    //
+    // We are done until the next hot-unplug; clear the handler.
+    //
+    mCpuHotEjectData->Handler = NULL;
+    return;
+  }
+
   //
   // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit()
   // after having been cleared to exit the SMI by the monarch and thus have
@@ -327,6 +380,19 @@ UnplugCpus (
   }
 
   if (EjectCount != 0) {
+    UINTN  Worker;
+
+    Status = mMmCpuService->WhoAmI (mMmCpuService, &Worker);
+    ASSERT_EFI_ERROR (Status);
+    //
+    // UnplugCpus() is called via the root MMI handler and thus we are
+    // executing in the BSP context.
+    //
+    // Mark ourselves as the worker CPU.
+    //
+    ASSERT (mCpuHotEjectData->ApicIdMap[Worker] == CPU_EJECT_INVALID);
+    mCpuHotEjectData->ApicIdMap[Worker] = CPU_EJECT_WORKER;
+
     //
     // We have processors to be ejected; install the handler.
     //
@@ -451,11 +517,6 @@ CpuHotplugMmi (
   if (EFI_ERROR (Status)) {
     goto Fatal;
   }
-  if (ToUnplugCount > 0) {
-    DEBUG ((DEBUG_ERROR, "%a: hot-unplug is not supported yet\n",
-      __FUNCTION__));
-    goto Fatal;
-  }
 
   if (PluggedCount > 0) {
     Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug
  2021-01-29  0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
                   ` (7 preceding siblings ...)
  2021-01-29  0:59 ` [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection Ankur Arora
@ 2021-01-29  0:59 ` Ankur Arora
  2021-02-01 17:37   ` Laszlo Ersek
  8 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-01-29  0:59 UTC (permalink / raw)
  To: devel
  Cc: lersek, imammedo, boris.ostrovsky, Ankur Arora, Jordan Justen,
	Ard Biesheuvel, Aaron Young

As part of the negotiation, treat ICH9_LPC_SMI_F_CPU_HOT_UNPLUG as a
subfeature of the feature flag ICH9_LPC_SMI_F_CPU_HOTPLUG, and enable it
only if the latter is also being negotiated.
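
The subfeature rule can be sketched as a pure function on the feature
bitmask. This is only an illustration under stated assumptions: negotiate()
and the SMI_F_* names are hypothetical, the hotplug and hot-unplug bit
positions follow the BIT1/BIT2 defines in the patch, and BIT0 for broadcast
SMI is assumed here rather than taken from the patch.

```c
#include <stdint.h>

#define SMI_F_BROADCAST      (1ull << 0)  /* assumed bit position */
#define SMI_F_CPU_HOTPLUG    (1ull << 1)
#define SMI_F_CPU_HOT_UNPLUG (1ull << 2)

/* Returns the feature set to request, given what the host offers. */
static uint64_t negotiate(uint64_t host_features, int sev_enabled)
{
  uint64_t requested = SMI_F_BROADCAST;
  uint64_t features;

  if (!sev_enabled) {
    requested |= SMI_F_CPU_HOTPLUG | SMI_F_CPU_HOT_UNPLUG;
  }

  features = host_features & requested;

  /* Hot-unplug is a subfeature: drop it when hotplug is not available. */
  if ((features & SMI_F_CPU_HOTPLUG) == 0) {
    features &= ~SMI_F_CPU_HOT_UNPLUG;
  }

  return features;
}
```

With these assumptions, a host offering only broadcast and hot-unplug ends
up with broadcast alone, which matches the "only if the other is also being
negotiated" rule.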

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 OvmfPkg/SmmControl2Dxe/SmiFeatures.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/OvmfPkg/SmmControl2Dxe/SmiFeatures.c b/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
index c9d875543205..e70f3f8b58cb 100644
--- a/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
+++ b/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
@@ -29,6 +29,13 @@
 //
 #define ICH9_LPC_SMI_F_CPU_HOTPLUG BIT1
 
+// The following bit value stands for "enable CPU hot unplug, and inject an SMI
+// with control value ICH9_APM_CNT_CPU_HOT_UNPLUG upon hot unplug", in the
+// "etc/smi/supported-features" and "etc/smi/requested-features" fw_cfg files.
+// It is only negotiated alongside ICH9_LPC_SMI_F_CPU_HOTPLUG.
+//
+#define ICH9_LPC_SMI_F_CPU_HOT_UNPLUG BIT2
+
 //
 // Provides a scratch buffer (allocated in EfiReservedMemoryType type memory)
 // for the S3 boot script fragment to write to and read from.
@@ -112,7 +119,8 @@ NegotiateSmiFeatures (
   QemuFwCfgReadBytes (sizeof mSmiFeatures, &mSmiFeatures);
 
   //
-  // We want broadcast SMI, SMI on CPU hotplug, and nothing else.
+  // We want broadcast SMI, SMI on CPU hotplug, on CPU hot-unplug
+  // and nothing else.
   //
   RequestedFeaturesMask = ICH9_LPC_SMI_F_BROADCAST;
   if (!MemEncryptSevIsEnabled ()) {
@@ -120,8 +128,18 @@ NegotiateSmiFeatures (
     // For now, we only support hotplug with SEV disabled.
     //
     RequestedFeaturesMask |= ICH9_LPC_SMI_F_CPU_HOTPLUG;
+    RequestedFeaturesMask |= ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
   }
   mSmiFeatures &= RequestedFeaturesMask;
+
+  if (!(mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOTPLUG) &&
+      (mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOT_UNPLUG)) {
+    DEBUG ((DEBUG_WARN, "%a CPU host-features %Lx, requested mask %Lx\n",
+      __FUNCTION__, mSmiFeatures, RequestedFeaturesMask));
+
+    mSmiFeatures &= ~ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
+  }
+
   QemuFwCfgSelectItem (mRequestedFeaturesItem);
   QemuFwCfgWriteBytes (sizeof mSmiFeatures, &mSmiFeatures);
 
@@ -162,8 +180,9 @@ NegotiateSmiFeatures (
   if ((mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOTPLUG) == 0) {
     DEBUG ((DEBUG_INFO, "%a: CPU hotplug not negotiated\n", __FUNCTION__));
   } else {
-    DEBUG ((DEBUG_INFO, "%a: CPU hotplug with SMI negotiated\n",
-      __FUNCTION__));
+    DEBUG ((DEBUG_INFO, "%a: CPU hotplug%s with SMI negotiated\n",
+      __FUNCTION__,
+      (mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOT_UNPLUG) ? ", unplug" : ""));
   }
 
   //
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [edk2-devel] [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic
  2021-01-29  0:59 ` [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic Ankur Arora
@ 2021-01-30  1:15   ` Laszlo Ersek
  2021-02-02  6:19     ` Ankur Arora
  2021-02-01  2:59   ` Laszlo Ersek
  1 sibling, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-01-30  1:15 UTC (permalink / raw)
  To: devel, ankur.a.arora
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> Refactor CpuHotplugMmi() to pull out the CPU hotplug logic into
> ProcessHotAddedCpus(). This is in preparation for supporting CPU
> hot-unplug.
> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> 
> Notes:
>      > +  if (EFI_ERROR(Status)) {
>      > +    goto Fatal;
>      >    }
>     
>      (13) Without having seen the rest of the patches, I think this error
>      check should be nested under the same (PluggedCount > 0) condition; in
>      other words, I think it only makes sense to check Status after we
>      actually call ProcessHotAddedCpus().
>     
>     Addresses all comments from v5, except for this one, since the (lack) of
>     nesting makes more sense after patch 4, "OvmfPkg/CpuHotplugSmm: introduce
>     UnplugCpus()".
> 
>  OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 214 ++++++++++++++++++++++---------------
>  1 file changed, 129 insertions(+), 85 deletions(-)
> 
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> index cfe698ed2b5e..05b1f8cb63a6 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> @@ -62,6 +62,130 @@ STATIC UINT32 mPostSmmPenAddress;
>  //
>  STATIC EFI_HANDLE mDispatchHandle;
>  
> +/**
> +  Process CPUs that have been hot-added, per QemuCpuhpCollectApicIds().
> +
> +  For each such CPU, relocate the SMBASE, and report the CPU to PiSmmCpuDxeSmm
> +  via EFI_SMM_CPU_SERVICE_PROTOCOL. If the supposedly hot-added CPU is already
> +  known, skip it silently.
> +
> +  @param[in] PluggedApicIds    The APIC IDs of the CPUs that have been
> +                               hot-plugged.
> +
> +  @param[in] PluggedCount      The number of filled-in APIC IDs in
> +                               PluggedApicIds.
> +
> +  @retval EFI_SUCCESS          CPUs corresponding to all the APIC IDs are
> +                               populated.
> +
> +  @retval EFI_OUT_OF_RESOURCES Out of APIC ID space in "mCpuHotPlugData".
> +
> +  @return                      Error codes propagated from SmbaseRelocate()
> +                               and mMmCpuService->AddProcessor().
> +
> +**/
> +STATIC
> +EFI_STATUS
> +ProcessHotAddedCpus (
> +  IN APIC_ID                      *PluggedApicIds,
> +  IN UINT32                       PluggedCount
> +  )
> +{
> +  EFI_STATUS Status;
> +  UINT32     PluggedIdx;
> +  UINT32     NewSlot;
> +
> +  //
> +  // The Post-SMM Pen need not be reinstalled multiple times within a single
> +  // root MMI handling. Even reinstalling once per root MMI is only prudence;
> +  // in theory installing the pen in the driver's entry point function should
> +  // suffice.
> +  //
> +  SmbaseReinstallPostSmmPen (mPostSmmPenAddress);
> +
> +  PluggedIdx = 0;
> +  NewSlot = 0;
> +  while (PluggedIdx < PluggedCount) {
> +    APIC_ID NewApicId;
> +    UINT32  CheckSlot;
> +    UINTN   NewProcessorNumberByProtocol;
> +
> +    NewApicId = PluggedApicIds[PluggedIdx];
> +
> +    //
> +    // Check if the supposedly hot-added CPU is already known to us.
> +    //
> +    for (CheckSlot = 0;
> +         CheckSlot < mCpuHotPlugData->ArrayLength;
> +         CheckSlot++) {
> +      if (mCpuHotPlugData->ApicId[CheckSlot] == NewApicId) {
> +        break;
> +      }
> +    }
> +    if (CheckSlot < mCpuHotPlugData->ArrayLength) {
> +      DEBUG ((DEBUG_VERBOSE, "%a: APIC ID " FMT_APIC_ID " was hot-plugged "
> +        "before; ignoring it\n", __FUNCTION__, NewApicId));
> +      PluggedIdx++;
> +      continue;
> +    }
> +
> +    //
> +    // Find the first empty slot in CPU_HOT_PLUG_DATA.
> +    //
> +    while (NewSlot < mCpuHotPlugData->ArrayLength &&
> +           mCpuHotPlugData->ApicId[NewSlot] != MAX_UINT64) {
> +      NewSlot++;
> +    }
> +    if (NewSlot == mCpuHotPlugData->ArrayLength) {
> +      DEBUG ((DEBUG_ERROR, "%a: no room for APIC ID " FMT_APIC_ID "\n",
> +        __FUNCTION__, NewApicId));
> +      return EFI_OUT_OF_RESOURCES;
> +    }
> +
> +    //
> +    // Store the APIC ID of the new processor to the slot.
> +    //
> +    mCpuHotPlugData->ApicId[NewSlot] = NewApicId;
> +
> +    //
> +    // Relocate the SMBASE of the new CPU.
> +    //
> +    Status = SmbaseRelocate (NewApicId, mCpuHotPlugData->SmBase[NewSlot],
> +               mPostSmmPenAddress);
> +    if (EFI_ERROR (Status)) {
> +      goto RevokeNewSlot;
> +    }
> +
> +    //
> +    // Add the new CPU with EFI_SMM_CPU_SERVICE_PROTOCOL.
> +    //
> +    Status = mMmCpuService->AddProcessor (mMmCpuService, NewApicId,
> +                              &NewProcessorNumberByProtocol);
> +    if (EFI_ERROR (Status)) {
> +      DEBUG ((DEBUG_ERROR, "%a: AddProcessor(" FMT_APIC_ID "): %r\n",
> +        __FUNCTION__, NewApicId, Status));
> +      goto RevokeNewSlot;
> +    }
> +
> +    DEBUG ((DEBUG_INFO, "%a: hot-added APIC ID " FMT_APIC_ID ", SMBASE 0x%Lx, "
> +      "EFI_SMM_CPU_SERVICE_PROTOCOL assigned number %Lu\n", __FUNCTION__,
> +      NewApicId, (UINT64)mCpuHotPlugData->SmBase[NewSlot],
> +      (UINT64)NewProcessorNumberByProtocol));
> +
> +    NewSlot++;
> +    PluggedIdx++;
> +  }
> +
> +  //
> +  // We've processed this batch of hot-added CPUs.
> +  //
> +  return EFI_SUCCESS;
> +
> +RevokeNewSlot:
> +  mCpuHotPlugData->ApicId[NewSlot] = MAX_UINT64;
> +
> +  return Status;
> +}
>  
>  /**
>    CPU Hotplug MMI handler function.
> @@ -122,8 +246,6 @@ CpuHotplugMmi (
>    UINT8      ApmControl;
>    UINT32     PluggedCount;
>    UINT32     ToUnplugCount;
> -  UINT32     PluggedIdx;
> -  UINT32     NewSlot;
>  
>    //
>    // Assert that we are entering this function due to our root MMI handler
> @@ -179,87 +301,12 @@ CpuHotplugMmi (
>      goto Fatal;
>    }
>  
> -  //
> -  // Process hot-added CPUs.
> -  //
> -  // The Post-SMM Pen need not be reinstalled multiple times within a single
> -  // root MMI handling. Even reinstalling once per root MMI is only prudence;
> -  // in theory installing the pen in the driver's entry point function should
> -  // suffice.
> -  //
> -  SmbaseReinstallPostSmmPen (mPostSmmPenAddress);
> +  if (PluggedCount > 0) {
> +    Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
> +  }
>  
> -  PluggedIdx = 0;
> -  NewSlot = 0;
> -  while (PluggedIdx < PluggedCount) {
> -    APIC_ID NewApicId;
> -    UINT32  CheckSlot;
> -    UINTN   NewProcessorNumberByProtocol;
> -
> -    NewApicId = mPluggedApicIds[PluggedIdx];
> -
> -    //
> -    // Check if the supposedly hot-added CPU is already known to us.
> -    //
> -    for (CheckSlot = 0;
> -         CheckSlot < mCpuHotPlugData->ArrayLength;
> -         CheckSlot++) {
> -      if (mCpuHotPlugData->ApicId[CheckSlot] == NewApicId) {
> -        break;
> -      }
> -    }
> -    if (CheckSlot < mCpuHotPlugData->ArrayLength) {
> -      DEBUG ((DEBUG_VERBOSE, "%a: APIC ID " FMT_APIC_ID " was hot-plugged "
> -        "before; ignoring it\n", __FUNCTION__, NewApicId));
> -      PluggedIdx++;
> -      continue;
> -    }
> -
> -    //
> -    // Find the first empty slot in CPU_HOT_PLUG_DATA.
> -    //
> -    while (NewSlot < mCpuHotPlugData->ArrayLength &&
> -           mCpuHotPlugData->ApicId[NewSlot] != MAX_UINT64) {
> -      NewSlot++;
> -    }
> -    if (NewSlot == mCpuHotPlugData->ArrayLength) {
> -      DEBUG ((DEBUG_ERROR, "%a: no room for APIC ID " FMT_APIC_ID "\n",
> -        __FUNCTION__, NewApicId));
> -      goto Fatal;
> -    }
> -
> -    //
> -    // Store the APIC ID of the new processor to the slot.
> -    //
> -    mCpuHotPlugData->ApicId[NewSlot] = NewApicId;
> -
> -    //
> -    // Relocate the SMBASE of the new CPU.
> -    //
> -    Status = SmbaseRelocate (NewApicId, mCpuHotPlugData->SmBase[NewSlot],
> -               mPostSmmPenAddress);
> -    if (EFI_ERROR (Status)) {
> -      goto RevokeNewSlot;
> -    }
> -
> -    //
> -    // Add the new CPU with EFI_SMM_CPU_SERVICE_PROTOCOL.
> -    //
> -    Status = mMmCpuService->AddProcessor (mMmCpuService, NewApicId,
> -                              &NewProcessorNumberByProtocol);
> -    if (EFI_ERROR (Status)) {
> -      DEBUG ((DEBUG_ERROR, "%a: AddProcessor(" FMT_APIC_ID "): %r\n",
> -        __FUNCTION__, NewApicId, Status));
> -      goto RevokeNewSlot;
> -    }
> -
> -    DEBUG ((DEBUG_INFO, "%a: hot-added APIC ID " FMT_APIC_ID ", SMBASE 0x%Lx, "
> -      "EFI_SMM_CPU_SERVICE_PROTOCOL assigned number %Lu\n", __FUNCTION__,
> -      NewApicId, (UINT64)mCpuHotPlugData->SmBase[NewSlot],
> -      (UINT64)NewProcessorNumberByProtocol));
> -
> -    NewSlot++;
> -    PluggedIdx++;
> +  if (EFI_ERROR(Status)) {

(1) I understand why you skipped point (13) from the previous review,
but you missed point (14) as well -- space character missing after
"EFI_ERROR":

https://edk2.groups.io/g/devel/message/70785

Anyway, in case v7 will not be necessary, I can fix this up myself.

With the space character added:

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thanks
Laszlo


> +    goto Fatal;
>    }
>  
>    //
> @@ -267,9 +314,6 @@ CpuHotplugMmi (
>    //
>    return EFI_SUCCESS;
>  
> -RevokeNewSlot:
> -  mCpuHotPlugData->ApicId[NewSlot] = MAX_UINT64;
> -
>  Fatal:
>    ASSERT (FALSE);
>    CpuDeadLoop ();
> 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events
  2021-01-29  0:59 ` [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events Ankur Arora
@ 2021-01-30  2:18   ` Laszlo Ersek
  2021-01-30  2:23     ` Laszlo Ersek
  2021-02-02  6:03     ` Ankur Arora
  0 siblings, 2 replies; 52+ messages in thread
From: Laszlo Ersek @ 2021-01-30  2:18 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> Process fw_remove events in QemuCpuhpCollectApicIds() and collect
> corresponding APIC IDs for CPUs that are being hot-unplugged.
>
> In addition, we now ignore CPUs which only have remove set. These
> CPUs haven't been processed by OSPM yet.
>
> This is based on the QEMU hot-unplug protocol documented here:
>   https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/
>
> Also define QEMU_CPUHP_STAT_EJECTED while we are at it.

(1) Please move the addition of QEMU_CPUHP_STAT_EJECTED to patch 8
("OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection"), where you
first use it.

>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>
> Notes:
>     I'm treating events (insert=1, fw_remove=1) below as invalid (return
>     EFI_PROTOCOL_ERROR, which ends up as an assert), but I'm not sure
>     that is correct:
>
>          if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
>            //
>            // The "insert" event guarantees the "enabled" status; plus it excludes
>     -      // the "remove" event.
>     +      // the "fw_remove" event.
>            //
>            if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
>     -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>     +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>              DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>                "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>                CpuStatus));
>
>     QEMU's handling in cpu_hotplug_rd() can return both of these:
>
>     cpu_hotplug_rd() {
>        ...
>        case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
>     	val |= cdev->cpu ? 1 : 0;
>     	val |= cdev->is_inserting ? 2 : 0;
>     	val |= cdev->is_removing  ? 4 : 0;
>     	val |= cdev->fw_remove  ? 16 : 0;
>        ...
>     }
>     and I don't see any code that treats is_inserting and is_removing as
>     exclusive.
>
>     One specific case where this looks like it might be a problem is if the user
>     unplugs a CPU and right after that plugs it.
>
>     As part of the unplug handling, the ACPI AML would, in the scan loop,
>     asynchronously trigger the notify, which would do the OS unplug, set
>     "fw_remove" and then call the SMI_CMD.
>
>     The subsequent plug could then come and set the "insert" bit.
>
>     Assuming what I'm describing could happen, I'm not sure what's the right
>     handling: QEMU could treat these bits as exclusive and then OVMF could
>     justifiably treat it as a protocol error?

I'm OK with the related part of your patch (i.e., returning
EFI_PROTOCOL_ERROR for (insert=1, fw_remove=1)).
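
For readers following along, the status-bit decisions in the v6 hunk can be
modelled roughly as below. This is only a sketch in plain C: the bit values
are taken from the quoted QemuCpuHotplug.h hunk, while the CPU_EVENT type and
ClassifyCpuStatus() are hypothetical names that do not exist in the driver.

```c
#include <stdint.h>

/* Bit values as listed in the quoted QemuCpuHotplug.h hunk. */
#define QEMU_CPUHP_STAT_ENABLED    0x01u
#define QEMU_CPUHP_STAT_INSERT     0x02u
#define QEMU_CPUHP_STAT_REMOVE     0x04u
#define QEMU_CPUHP_STAT_FW_REMOVE  0x10u

/* Hypothetical result type, for illustration only. */
typedef enum {
  CPU_EVENT_NONE,
  CPU_EVENT_PLUG,
  CPU_EVENT_UNPLUG,
  CPU_EVENT_REMOVE_IGNORED,
  CPU_EVENT_PROTOCOL_ERROR
} CPU_EVENT;

/* Hypothetical helper mirroring the branch order of the v6 patch. */
CPU_EVENT
ClassifyCpuStatus (
  uint8_t CpuStatus
  )
{
  if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
    /* "insert" requires "enabled" and excludes "fw_remove". */
    if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
        (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
      return CPU_EVENT_PROTOCOL_ERROR;
    }
    return CPU_EVENT_PLUG;
  }

  if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
    /* "fw_remove" requires "enabled". */
    if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
      return CPU_EVENT_PROTOCOL_ERROR;
    }
    return CPU_EVENT_UNPLUG;
  }

  if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
    /* Only "remove" is set: left to the OSPM. */
    return CPU_EVENT_REMOVE_IGNORED;
  }

  return CPU_EVENT_NONE;
}
```

Under this model, a status of 0x13 (enabled, insert, fw_remove) is the
combination discussed in the Notes section, and it maps to the protocol
error case.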

>
>  OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h |  2 ++
>  OvmfPkg/CpuHotplugSmm/QemuCpuhp.c                 | 29 +++++++++++++++++++----
>  2 files changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
> index a34a6d3fae61..692e3072598c 100644
> --- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
> +++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
> @@ -34,6 +34,8 @@
>  #define QEMU_CPUHP_STAT_ENABLED                BIT0
>  #define QEMU_CPUHP_STAT_INSERT                 BIT1
>  #define QEMU_CPUHP_STAT_REMOVE                 BIT2
> +#define QEMU_CPUHP_STAT_EJECTED                BIT3
> +#define QEMU_CPUHP_STAT_FW_REMOVE              BIT4
>
>  #define QEMU_CPUHP_RW_CMD_DATA               0x8
>
> diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
> index 8d4a6693c8d6..f871e50c377b 100644
> --- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
> +++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
> @@ -245,10 +245,10 @@ QemuCpuhpCollectApicIds (
>      if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
>        //
>        // The "insert" event guarantees the "enabled" status; plus it excludes
> -      // the "remove" event.
> +      // the "fw_remove" event.
>        //
>        if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
> -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
> +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>          DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>            "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>            CpuStatus));
> @@ -260,12 +260,31 @@ QemuCpuhpCollectApicIds (
>
>        ExtendIds   = PluggedApicIds;
>        ExtendCount = PluggedCount;
> -    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
> -      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove\n", __FUNCTION__,
> -        CurrentSelector));
> +    } else if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
> +      //
> +      // "fw_remove" event guarantees "enabled".
> +      //
> +      if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
> +        DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
> +          "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
> +          CpuStatus));
> +        return EFI_PROTOCOL_ERROR;
> +      }
> +
> +      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: fw_remove\n",
> +        __FUNCTION__, CurrentSelector));
>
>        ExtendIds   = ToUnplugApicIds;
>        ExtendCount = ToUnplugCount;
> +    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
> +      //
> +      // Let the OSPM deal with the "remove" event.
> +      //
> +      DEBUG ((DEBUG_INFO, "%a: CurrentSelector=%u: remove (ignored)\n",
> +        __FUNCTION__, CurrentSelector));

(2) Please downgrade this debug mask from DEBUG_INFO to DEBUG_VERBOSE.

(If you want your OVMF build to emit DEBUG_VERBOSE messages to the log,
you can set PcdDebugPrintErrorLevel to 0x8040004F in the DSC file --
DEBUG_VERBOSE has value 0x00400000.)
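
As a rough illustration of how that mask value breaks down, here is a
standalone sketch. Only the DEBUG_VERBOSE value is stated above; the other
bit values are assumed to match MdePkg's DebugLib.h, and VerboseErrorLevel()
and MessageIsEmitted() are hypothetical helpers, not DebugLib functions.

```c
#include <stdint.h>

/* Assumed to match MdePkg/Include/Library/DebugLib.h. */
#define DEBUG_INIT     0x00000001u
#define DEBUG_WARN     0x00000002u
#define DEBUG_LOAD     0x00000004u
#define DEBUG_FS       0x00000008u
#define DEBUG_INFO     0x00000040u
#define DEBUG_VERBOSE  0x00400000u
#define DEBUG_ERROR    0x80000000u

/* Hypothetical helper: compose the mask suggested for the DSC file. */
uint32_t
VerboseErrorLevel (
  void
  )
{
  return DEBUG_ERROR | DEBUG_VERBOSE | DEBUG_INFO |
         DEBUG_FS | DEBUG_LOAD | DEBUG_WARN | DEBUG_INIT;
}

/* Hypothetical helper: a message is printed if its level bit is in the mask. */
int
MessageIsEmitted (
  uint32_t ErrorLevelMask,
  uint32_t MessageLevel
  )
{
  return (ErrorLevelMask & MessageLevel) != 0;
}
```

With these assumed values, VerboseErrorLevel() evaluates to 0x8040004F, and
clearing the DEBUG_VERBOSE bit from the mask suppresses DEBUG_VERBOSE
messages while leaving DEBUG_INFO ones enabled.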

> +
> +      CurrentSelector++;
> +      continue;

(3) This change is logically correct; however I request a different
implementation, as I indicated here:

  https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg06737.html
  msgid: <926113ec-8fa1-7d3b-ff3f-f1eda692e83d@redhat.com>

Namely:

(3a) On this branch, please set both "ExtendIds" and "ExtendCount" to
NULL, replacing the currently proposed "CurrentSelector" increment and
the "continue" statement.

(3b) Locate the section of code that starts with the comment "Save the
APIC ID of the CPU with the pending event...", and make it conditional
like this:

    ASSERT ((ExtendIds == NULL) == (ExtendCount == NULL));
    if (ExtendIds != NULL) {
      ...
    }

(3c) and then simply proceed to the end of the loop body, where we
increment "CurrentSelector" already.


Here's why I'm asking for this: with your proposed v6 patch, the loop
body would receive a "CurrentSelector" increment operation that did not
explain itself. And I'd really like to keep *any* "CurrentSelector"
increment operation explained by the comment that we currently have at
the end of the loop body:

     //
     // We've processed the CPU with (known) pending events, but we must never
     // clear events. Therefore we need to advance past this CPU manually;
     // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
     // selected CPU.
     //

Keeping up that "well-explained" status would require one of two
options:

- copy the comment into the new branch (duplicating the comment) just
  before you add the new "CurrentSelector" increment operation, or

- make sure we have just one spot where we increment "CurrentSelector",
  and preserve the comment there.

The second option looks much better to me, so that's what I'm asking
for.

If we didn't have that big comment on the increment, your solution would
be just fine, but said comment is really important IMO.
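
The shape requested in (3a)-(3c) might look roughly like the following
standalone model. It is a simplified sketch, not the driver code: the
Status[] and ApicIds[] arrays stand in for the register reads, every
selector is visited (the real loop uses QEMU_CPUHP_CMD_GET_PENDING and stops
when no event is pending), the consistency and array-bounds checks are
omitted, and CollectSketch() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t APIC_ID;

#define STAT_INSERT     0x02u
#define STAT_REMOVE     0x04u
#define STAT_FW_REMOVE  0x10u

/* Hypothetical model of the collection loop with a single increment. */
void
CollectSketch (
  const uint8_t  *Status,
  const APIC_ID  *ApicIds,
  uint32_t       CpuCount,
  APIC_ID        *PluggedApicIds,
  uint32_t       *PluggedCount,
  APIC_ID        *ToUnplugApicIds,
  uint32_t       *ToUnplugCount
  )
{
  uint32_t CurrentSelector;

  *PluggedCount   = 0;
  *ToUnplugCount  = 0;
  CurrentSelector = 0;
  while (CurrentSelector < CpuCount) {
    APIC_ID  *ExtendIds;
    uint32_t *ExtendCount;
    uint8_t  CpuStatus;

    CpuStatus = Status[CurrentSelector];
    if ((CpuStatus & STAT_INSERT) != 0) {
      ExtendIds   = PluggedApicIds;
      ExtendCount = PluggedCount;
    } else if ((CpuStatus & STAT_FW_REMOVE) != 0) {
      ExtendIds   = ToUnplugApicIds;
      ExtendCount = ToUnplugCount;
    } else {
      /* "remove" only, or no event: nothing to save for this CPU. */
      ExtendIds   = NULL;
      ExtendCount = NULL;
    }

    /* Save the APIC ID only when a target array was selected. */
    assert ((ExtendIds == NULL) == (ExtendCount == NULL));
    if (ExtendIds != NULL) {
      ExtendIds[*ExtendCount] = ApicIds[CurrentSelector];
      (*ExtendCount)++;
    }

    /* The one and only place where the selector advances. */
    CurrentSelector++;
  }
}
```

The point of the arrangement is that the ignored "remove" branch needs
neither its own increment nor a "continue"; it simply falls through to the
shared increment at the bottom of the loop body.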

Thanks!
Laszlo

>      } else {
>        DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: no event\n",
>          __FUNCTION__, CurrentSelector));
>



* Re: [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events
  2021-01-30  2:18   ` Laszlo Ersek
@ 2021-01-30  2:23     ` Laszlo Ersek
  2021-02-02  6:03     ` Ankur Arora
  1 sibling, 0 replies; 52+ messages in thread
From: Laszlo Ersek @ 2021-01-30  2:23 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/30/21 03:18, Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Process fw_remove events in QemuCpuhpCollectApicIds() and collect
>> corresponding APIC IDs for CPUs that are being hot-unplugged.
>>
>> In addition, we now ignore CPUs which only have remove set. These
>> CPUs haven't been processed by OSPM yet.
>>
>> This is based on the QEMU hot-unplug protocol documented here:
>>   https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/
>>
>> Also define QEMU_CPUHP_STAT_EJECTED while we are at it.
> 
> (1) Please move the addition of QEMU_CPUHP_STAT_EJECTED to patch 8
> ("OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection"), where you
> first use it.

(4) Apologies for the bikeshedding, but I also suggest that we call the
macro "QEMU_CPUHP_STAT_EJECT", rather than "_EJECTED".

Reason: QEMU documents this bit (on write) as "initiates device eject";
in other words, it's not a status, but a signal (or request) from the
guest code to QEMU.

Thanks
Laszlo



* Re: [PATCH v6 3/9] OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper
  2021-01-29  0:59 ` [PATCH v6 3/9] OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper Ankur Arora
@ 2021-01-30  2:36   ` Laszlo Ersek
  2021-02-02  6:04     ` Ankur Arora
  0 siblings, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-01-30  2:36 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> Add QemuCpuhpWriteCpuStatus() which will be used to update the QEMU
> CPU status register. On error, it hangs in a similar fashion as
> other helper functions.
> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  OvmfPkg/CpuHotplugSmm/QemuCpuhp.h |  6 ++++++
>  OvmfPkg/CpuHotplugSmm/QemuCpuhp.c | 22 ++++++++++++++++++++++
>  2 files changed, 28 insertions(+)
> 
> diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
> index 8adaa0ad91f0..804809846890 100644
> --- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
> +++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
> @@ -30,6 +30,12 @@ QemuCpuhpReadCpuStatus (
>    IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo
>    );
>  
> +VOID
> +QemuCpuhpWriteCpuStatus (
> +  IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo,
> +  IN UINT8                        CpuStatus
> +  );
> +
>  UINT32
>  QemuCpuhpReadCommandData (
>    IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo
> diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
> index f871e50c377b..ed44264de934 100644
> --- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
> +++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
> @@ -67,6 +67,28 @@ QemuCpuhpReadCpuStatus (
>    return CpuStatus;
>  }
>  
> +VOID
> +QemuCpuhpWriteCpuStatus (
> +  IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo,
> +  IN UINT8                        CpuStatus
> +  )
> +{
> +  EFI_STATUS Status;
> +
> +  Status = MmCpuIo->Io.Write (
> +                         MmCpuIo,
> +                         MM_IO_UINT8,
> +                         ICH9_CPU_HOTPLUG_BASE + QEMU_CPUHP_R_CPU_STAT,
> +                         1,
> +                         &CpuStatus
> +                         );
> +  if (EFI_ERROR (Status)) {
> +    DEBUG ((DEBUG_ERROR, "%a: %r\n", __FUNCTION__, Status));
> +    ASSERT (FALSE);
> +    CpuDeadLoop ();
> +  }
> +}
> +
>  UINT32
>  QemuCpuhpReadCommandData (
>    IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo
> 

The code is fine, but please move the new function (both declaration and
definition) between QemuCpuhpWriteCpuSelector() and QemuCpuhpWriteCommand().

Reason: the pre-patch order of the functions matches the order of the
register descriptions in QEMU's "docs/specs/acpi_cpu_hotplug.txt".

There, we first have a section called "read access", then another called
"write access". And in each section, registers are listed in increasing
offset order, within the hotplug register block.

Thanks!
Laszlo



* Re: [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus()
  2021-01-29  0:59 ` [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus() Ankur Arora
@ 2021-01-30  2:37   ` Laszlo Ersek
  2021-02-01  3:13   ` Laszlo Ersek
  1 sibling, 0 replies; 52+ messages in thread
From: Laszlo Ersek @ 2021-01-30  2:37 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> Introduce UnplugCpus() which maps each APIC ID being unplugged
> onto the hardware ID of the processor and informs PiSmmCpuDxeSmm
> of removal by calling EFI_SMM_CPU_SERVICE_PROTOCOL.RemoveProcessor().
> 
> With this change we handle the first phase of unplug where we collect
> the CPUs that need to be unplugged and mark them for removal in SMM
> data structures.
> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 84 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 84 insertions(+)

I intend to continue the review at this patch, next week.

Thanks
Laszlo



* Re: [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic
  2021-01-29  0:59 ` [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic Ankur Arora
  2021-01-30  1:15   ` [edk2-devel] " Laszlo Ersek
@ 2021-02-01  2:59   ` Laszlo Ersek
  1 sibling, 0 replies; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-01  2:59 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> Refactor CpuHotplugMmi() to pull out the CPU hotplug logic into
> ProcessHotAddedCpus(). This is in preparation for supporting CPU
> hot-unplug.
> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> 
> Notes:
>      > +  if (EFI_ERROR(Status)) {
>      > +    goto Fatal;
>      >    }
>     
>      (13) Without having seen the rest of the patches, I think this error
>      check should be nested under the same (PluggedCount > 0) condition; in
>      other words, I think it only makes sense to check Status after we
>      actually call ProcessHotAddedCpus().
>     
>     Addresses all comments from v5, except for this one, since the lack of
>     nesting makes more sense after patch 4, "OvmfPkg/CpuHotplugSmm: introduce
>     UnplugCpus()".
> 
>  OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 214 ++++++++++++++++++++++---------------
>  1 file changed, 129 insertions(+), 85 deletions(-)

I've got one more trivial comment for this patch, to improve style
consistency with the rest of this driver:

> 
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> index cfe698ed2b5e..05b1f8cb63a6 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> @@ -62,6 +62,130 @@ STATIC UINT32 mPostSmmPenAddress;
>  //
>  STATIC EFI_HANDLE mDispatchHandle;
>  
> +/**
> +  Process CPUs that have been hot-added, per QemuCpuhpCollectApicIds().
> +
> +  For each such CPU, relocate the SMBASE, and report the CPU to PiSmmCpuDxeSmm
> +  via EFI_SMM_CPU_SERVICE_PROTOCOL. If the supposedly hot-added CPU is already
> +  known, skip it silently.
> +
> +  @param[in] PluggedApicIds    The APIC IDs of the CPUs that have been
> +                               hot-plugged.
> +
> +  @param[in] PluggedCount      The number of filled-in APIC IDs in
> +                               PluggedApicIds.
> +
> +  @retval EFI_SUCCESS          CPUs corresponding to all the APIC IDs are
> +                               populated.
> +
> +  @retval EFI_OUT_OF_RESOURCES Out of APIC ID space in "mCpuHotPlugData".
> +
> +  @return                      Error codes propagated from SmbaseRelocate()
> +                               and mMmCpuService->AddProcessor().
> +

(2) This empty line is not "wrong" in any case, just a bit inconsistent
with the rest of this driver; please drop it.

Thanks
Laszlo

> +**/
> +STATIC
> +EFI_STATUS
> +ProcessHotAddedCpus (
> +  IN APIC_ID                      *PluggedApicIds,
> +  IN UINT32                       PluggedCount
> +  )
> +{
> +  EFI_STATUS Status;
> +  UINT32     PluggedIdx;
> +  UINT32     NewSlot;
> +
> +  //
> +  // The Post-SMM Pen need not be reinstalled multiple times within a single
> +  // root MMI handling. Even reinstalling once per root MMI is only prudence;
> +  // in theory installing the pen in the driver's entry point function should
> +  // suffice.
> +  //
> +  SmbaseReinstallPostSmmPen (mPostSmmPenAddress);
> +
> +  PluggedIdx = 0;
> +  NewSlot = 0;
> +  while (PluggedIdx < PluggedCount) {
> +    APIC_ID NewApicId;
> +    UINT32  CheckSlot;
> +    UINTN   NewProcessorNumberByProtocol;
> +
> +    NewApicId = PluggedApicIds[PluggedIdx];
> +
> +    //
> +    // Check if the supposedly hot-added CPU is already known to us.
> +    //
> +    for (CheckSlot = 0;
> +         CheckSlot < mCpuHotPlugData->ArrayLength;
> +         CheckSlot++) {
> +      if (mCpuHotPlugData->ApicId[CheckSlot] == NewApicId) {
> +        break;
> +      }
> +    }
> +    if (CheckSlot < mCpuHotPlugData->ArrayLength) {
> +      DEBUG ((DEBUG_VERBOSE, "%a: APIC ID " FMT_APIC_ID " was hot-plugged "
> +        "before; ignoring it\n", __FUNCTION__, NewApicId));
> +      PluggedIdx++;
> +      continue;
> +    }
> +
> +    //
> +    // Find the first empty slot in CPU_HOT_PLUG_DATA.
> +    //
> +    while (NewSlot < mCpuHotPlugData->ArrayLength &&
> +           mCpuHotPlugData->ApicId[NewSlot] != MAX_UINT64) {
> +      NewSlot++;
> +    }
> +    if (NewSlot == mCpuHotPlugData->ArrayLength) {
> +      DEBUG ((DEBUG_ERROR, "%a: no room for APIC ID " FMT_APIC_ID "\n",
> +        __FUNCTION__, NewApicId));
> +      return EFI_OUT_OF_RESOURCES;
> +    }
> +
> +    //
> +    // Store the APIC ID of the new processor to the slot.
> +    //
> +    mCpuHotPlugData->ApicId[NewSlot] = NewApicId;
> +
> +    //
> +    // Relocate the SMBASE of the new CPU.
> +    //
> +    Status = SmbaseRelocate (NewApicId, mCpuHotPlugData->SmBase[NewSlot],
> +               mPostSmmPenAddress);
> +    if (EFI_ERROR (Status)) {
> +      goto RevokeNewSlot;
> +    }
> +
> +    //
> +    // Add the new CPU with EFI_SMM_CPU_SERVICE_PROTOCOL.
> +    //
> +    Status = mMmCpuService->AddProcessor (mMmCpuService, NewApicId,
> +                              &NewProcessorNumberByProtocol);
> +    if (EFI_ERROR (Status)) {
> +      DEBUG ((DEBUG_ERROR, "%a: AddProcessor(" FMT_APIC_ID "): %r\n",
> +        __FUNCTION__, NewApicId, Status));
> +      goto RevokeNewSlot;
> +    }
> +
> +    DEBUG ((DEBUG_INFO, "%a: hot-added APIC ID " FMT_APIC_ID ", SMBASE 0x%Lx, "
> +      "EFI_SMM_CPU_SERVICE_PROTOCOL assigned number %Lu\n", __FUNCTION__,
> +      NewApicId, (UINT64)mCpuHotPlugData->SmBase[NewSlot],
> +      (UINT64)NewProcessorNumberByProtocol));
> +
> +    NewSlot++;
> +    PluggedIdx++;
> +  }
> +
> +  //
> +  // We've processed this batch of hot-added CPUs.
> +  //
> +  return EFI_SUCCESS;
> +
> +RevokeNewSlot:
> +  mCpuHotPlugData->ApicId[NewSlot] = MAX_UINT64;
> +
> +  return Status;
> +}
>  
>  /**
>    CPU Hotplug MMI handler function.
> @@ -122,8 +246,6 @@ CpuHotplugMmi (
>    UINT8      ApmControl;
>    UINT32     PluggedCount;
>    UINT32     ToUnplugCount;
> -  UINT32     PluggedIdx;
> -  UINT32     NewSlot;
>  
>    //
>    // Assert that we are entering this function due to our root MMI handler
> @@ -179,87 +301,12 @@ CpuHotplugMmi (
>      goto Fatal;
>    }
>  
> -  //
> -  // Process hot-added CPUs.
> -  //
> -  // The Post-SMM Pen need not be reinstalled multiple times within a single
> -  // root MMI handling. Even reinstalling once per root MMI is only prudence;
> -  // in theory installing the pen in the driver's entry point function should
> -  // suffice.
> -  //
> -  SmbaseReinstallPostSmmPen (mPostSmmPenAddress);
> +  if (PluggedCount > 0) {
> +    Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
> +  }
>  
> -  PluggedIdx = 0;
> -  NewSlot = 0;
> -  while (PluggedIdx < PluggedCount) {
> -    APIC_ID NewApicId;
> -    UINT32  CheckSlot;
> -    UINTN   NewProcessorNumberByProtocol;
> -
> -    NewApicId = mPluggedApicIds[PluggedIdx];
> -
> -    //
> -    // Check if the supposedly hot-added CPU is already known to us.
> -    //
> -    for (CheckSlot = 0;
> -         CheckSlot < mCpuHotPlugData->ArrayLength;
> -         CheckSlot++) {
> -      if (mCpuHotPlugData->ApicId[CheckSlot] == NewApicId) {
> -        break;
> -      }
> -    }
> -    if (CheckSlot < mCpuHotPlugData->ArrayLength) {
> -      DEBUG ((DEBUG_VERBOSE, "%a: APIC ID " FMT_APIC_ID " was hot-plugged "
> -        "before; ignoring it\n", __FUNCTION__, NewApicId));
> -      PluggedIdx++;
> -      continue;
> -    }
> -
> -    //
> -    // Find the first empty slot in CPU_HOT_PLUG_DATA.
> -    //
> -    while (NewSlot < mCpuHotPlugData->ArrayLength &&
> -           mCpuHotPlugData->ApicId[NewSlot] != MAX_UINT64) {
> -      NewSlot++;
> -    }
> -    if (NewSlot == mCpuHotPlugData->ArrayLength) {
> -      DEBUG ((DEBUG_ERROR, "%a: no room for APIC ID " FMT_APIC_ID "\n",
> -        __FUNCTION__, NewApicId));
> -      goto Fatal;
> -    }
> -
> -    //
> -    // Store the APIC ID of the new processor to the slot.
> -    //
> -    mCpuHotPlugData->ApicId[NewSlot] = NewApicId;
> -
> -    //
> -    // Relocate the SMBASE of the new CPU.
> -    //
> -    Status = SmbaseRelocate (NewApicId, mCpuHotPlugData->SmBase[NewSlot],
> -               mPostSmmPenAddress);
> -    if (EFI_ERROR (Status)) {
> -      goto RevokeNewSlot;
> -    }
> -
> -    //
> -    // Add the new CPU with EFI_SMM_CPU_SERVICE_PROTOCOL.
> -    //
> -    Status = mMmCpuService->AddProcessor (mMmCpuService, NewApicId,
> -                              &NewProcessorNumberByProtocol);
> -    if (EFI_ERROR (Status)) {
> -      DEBUG ((DEBUG_ERROR, "%a: AddProcessor(" FMT_APIC_ID "): %r\n",
> -        __FUNCTION__, NewApicId, Status));
> -      goto RevokeNewSlot;
> -    }
> -
> -    DEBUG ((DEBUG_INFO, "%a: hot-added APIC ID " FMT_APIC_ID ", SMBASE 0x%Lx, "
> -      "EFI_SMM_CPU_SERVICE_PROTOCOL assigned number %Lu\n", __FUNCTION__,
> -      NewApicId, (UINT64)mCpuHotPlugData->SmBase[NewSlot],
> -      (UINT64)NewProcessorNumberByProtocol));
> -
> -    NewSlot++;
> -    PluggedIdx++;
> +  if (EFI_ERROR(Status)) {
> +    goto Fatal;
>    }
>  
>    //
> @@ -267,9 +314,6 @@ CpuHotplugMmi (
>    //
>    return EFI_SUCCESS;
>  
> -RevokeNewSlot:
> -  mCpuHotPlugData->ApicId[NewSlot] = MAX_UINT64;
> -
>  Fatal:
>    ASSERT (FALSE);
>    CpuDeadLoop ();
> 



* Re: [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus()
  2021-01-29  0:59 ` [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus() Ankur Arora
  2021-01-30  2:37   ` Laszlo Ersek
@ 2021-02-01  3:13   ` Laszlo Ersek
  2021-02-03  4:28     ` Ankur Arora
  1 sibling, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-01  3:13 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> Introduce UnplugCpus() which maps each APIC ID being unplugged
> onto the hardware ID of the processor and informs PiSmmCpuDxeSmm
> of removal by calling EFI_SMM_CPU_SERVICE_PROTOCOL.RemoveProcessor().
>
> With this change we handle the first phase of unplug where we collect
> the CPUs that need to be unplugged and mark them for removal in SMM
> data structures.
>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 84 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 84 insertions(+)
>
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> index 05b1f8cb63a6..70d69f6ed65b 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> @@ -188,6 +188,88 @@ RevokeNewSlot:
>  }
>
>  /**
> +  Process to be hot-unplugged CPUs, per QemuCpuhpCollectApicIds().
> +
> +  For each such CPU, report the CPU to PiSmmCpuDxeSmm via
> +  EFI_SMM_CPU_SERVICE_PROTOCOL. If the to be hot-unplugged CPU is
> +  unknown, skip it silently.
> +
> +  @param[in] ToUnplugApicIds    The APIC IDs of the CPUs that are about to be
> +                                hot-unplugged.
> +
> +  @param[in] ToUnplugCount      The number of filled-in APIC IDs in
> +                                ToUnplugApicIds.
> +
> +  @retval EFI_SUCCESS           Known APIC IDs have been removed from SMM data
> +                                structures.
> +
> +  @return                       Error codes propagated from
> +                                mMmCpuService->RemoveProcessor().
> +

(1) Please drop this empty line (just before the '**/').


> +**/
> +STATIC
> +EFI_STATUS
> +UnplugCpus (
> +  IN APIC_ID                      *ToUnplugApicIds,
> +  IN UINT32                       ToUnplugCount
> +  )
> +{
> +  EFI_STATUS Status;
> +  UINT32     ToUnplugIdx;
> +  UINTN      ProcessorNum;
> +
> +  ToUnplugIdx = 0;
> +  while (ToUnplugIdx < ToUnplugCount) {
> +    APIC_ID    RemoveApicId;
> +
> +    RemoveApicId = ToUnplugApicIds[ToUnplugIdx];
> +
> +    //
> +    // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use it to find
> +    // the ProcessorNum for the APIC ID to be removed.
> +    //
> +    for (ProcessorNum = 0;
> +         ProcessorNum < mCpuHotPlugData->ArrayLength;
> +         ProcessorNum++) {
> +      if (mCpuHotPlugData->ApicId[ProcessorNum] == RemoveApicId) {
> +        break;
> +      }
> +    }
> +
> +    //
> +    // Ignore the unplug if APIC ID not found
> +    //
> +    if (ProcessorNum == mCpuHotPlugData->ArrayLength) {
> +      DEBUG ((DEBUG_INFO, "%a: did not find APIC ID " FMT_APIC_ID
> +          " to unplug\n", __FUNCTION__, RemoveApicId));

(2) Please use DEBUG_VERBOSE here.

(I agree that we should have *one* DEBUG_INFO message that relates to
the removal of an individual processor; however, I think we should emit
that message when we finally signal QEMU to eject the processor.)


(3) Please un-indent ("outdent"?) the second line by two spaces.


> +      ToUnplugIdx++;
> +      continue;
> +    }
> +
> +    //
> +    // Mark ProcessorNum for removal from SMM data structures
> +    //
> +    Status = mMmCpuService->RemoveProcessor (mMmCpuService, ProcessorNum);
> +

(4) It would be more idiomatic to remove this empty line (between Status
assignment and check).


> +    if (EFI_ERROR (Status)) {
> +      DEBUG ((DEBUG_ERROR, "%a: RemoveProcessor(" FMT_APIC_ID "): %r\n",
> +        __FUNCTION__, RemoveApicId, Status));
> +      goto Fatal;

(5) Please just "return Status" here, and drop the "Fatal" label.


> +    }
> +
> +    ToUnplugIdx++;
> +  }
> +
> +  //
> +  // We've removed this set of APIC IDs from SMM data structures.
> +  //
> +  return EFI_SUCCESS;
> +
> +Fatal:
> +  return Status;
> +}
> +
> +/**
>    CPU Hotplug MMI handler function.
>
>    This is a root MMI handler.
> @@ -303,6 +385,8 @@ CpuHotplugMmi (
>
>    if (PluggedCount > 0) {
>      Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
> +  } else if (ToUnplugCount > 0) {
> +    Status = UnplugCpus (mToUnplugApicIds, ToUnplugCount);
>    }
>
>    if (EFI_ERROR(Status)) {
>

(6) Hmm... What's the reason for the exclusivity?

Why is the following not better:

  if (PluggedCount > 0) {
    Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
    if (EFI_ERROR (Status)) {
      goto Fatal;
    }
  }
  if (ToUnplugCount > 0) {
    Status = UnplugCpus (mToUnplugApicIds, ToUnplugCount);
    if (EFI_ERROR (Status)) {
      goto Fatal;
    }
  }

QemuCpuhpCollectApicIds() intentionally populates both arrays in a
single go. As I suggested earlier:

https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg06711.html
msgid: <a92b50df-f693-ebda-e549-7bc9e6d2d7a5@redhat.com>

> [...] please handle plugs first, for which unused slots in
> mCpuHotPlugData.ApicId will be populated, and *then* handle removals
> (in the same invocation of CpuHotplugMmi()).

Did that turn out to be unviable (the "same invocation of CpuHotplugMmi()"
part)?


(7) As a side note, addressing point (6) above would allow you to
address my point (13) from the v5 patch#1 thread, too; i.e., nesting the
Status check under (PluggedCount > 0).

Thanks!
Laszlo



* Re: [PATCH v6 5/9] OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA
  2021-01-29  0:59 ` [PATCH v6 5/9] OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA Ankur Arora
@ 2021-02-01  4:53   ` Laszlo Ersek
  2021-02-02  6:15     ` Ankur Arora
  0 siblings, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-01  4:53 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> Define CPU_HOT_EJECT_DATA and add PCD PcdCpuHotEjectDataAddress, which
> will be used to share CPU ejection state between OvmfPkg/CpuHotPlugSmm
> and PiSmmCpuDxeSmm.
>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  OvmfPkg/OvmfPkg.dec                       | 10 +++++++++
>  OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf   |  1 +
>  OvmfPkg/Include/Library/CpuHotEjectData.h | 35 +++++++++++++++++++++++++++++++
>  3 files changed, 46 insertions(+)
>  create mode 100644 OvmfPkg/Include/Library/CpuHotEjectData.h
>
> diff --git a/OvmfPkg/OvmfPkg.dec b/OvmfPkg/OvmfPkg.dec
> index 4348bb45c64a..1a2debb821d7 100644
> --- a/OvmfPkg/OvmfPkg.dec
> +++ b/OvmfPkg/OvmfPkg.dec
> @@ -106,6 +106,10 @@ [LibraryClasses]
>    #
>    XenPlatformLib|Include/Library/XenPlatformLib.h
>
> +  ##  @libraryclass  Share CPU hot-eject state
> +  #
> +  CpuHotEjectData|Include/Library/CpuHotEjectData.h
> +
>  [Guids]
>    gUefiOvmfPkgTokenSpaceGuid            = {0x93bb96af, 0xb9f2, 0x4eb8, {0x94, 0x62, 0xe0, 0xba, 0x74, 0x56, 0x42, 0x36}}
>    gEfiXenInfoGuid                       = {0xd3b46f3b, 0xd441, 0x1244, {0x9a, 0x12, 0x0, 0x12, 0x27, 0x3f, 0xc1, 0x4d}}

(1) Please drop this hunk -- the [LibraryClasses] section should not be
modified, as we're not introducing a new library class.


> @@ -352,6 +356,12 @@ [PcdsDynamic, PcdsDynamicEx]
>    #  This PCD is only accessed if PcdSmmSmramRequire is TRUE (see below).
>    gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase|FALSE|BOOLEAN|0x34
>
> +  ## This PCD adds a communication channel between PiSmmCpuDxeSmm and
> +  #  CpuHotplugSmm.

(2) I suggest:

  ## This PCD adds a communication channel between OVMF's SmmCpuFeaturesLib
  #  instance in PiSmmCpuDxeSmm, and CpuHotplugSmm.


> +  #
> +  #  Only accessed if PcdCpuHotPlugSupport is TRUE

(3) This statement is technically true, but I suggest dropping it. In my
opinion, it is not useful (it's a superfluous statement). Here's why:

- We set the "PcdCpuHotPlugSupport" feature flag to TRUE in the OVMF DSC
  files exactly when the SMM_REQUIRE feature test macro is set on the
  "build" command line.

- The whole SMM infrastructure is included in the firmware binary
  exactly when SMM_REQUIRE is set.

In other words, PcdCpuHotPlugSupport is *equivalent* to
SmmCpuFeaturesLib, PiSmmCpuDxeSmm, and CpuHotplugSmm being included in
the firmware binary.

Given that the first comment already declares the PCD as an info channel
between SmmCpuFeaturesLib (as built into PiSmmCpuDxeSmm) and
CpuHotplugSmm, the second comment adds nothing.


> +  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress|0|UINT64|0x46
> +
>  [PcdsFeatureFlag]
>    gUefiOvmfPkgTokenSpaceGuid.PcdQemuBootOrderPciTranslation|TRUE|BOOLEAN|0x1c
>    gUefiOvmfPkgTokenSpaceGuid.PcdQemuBootOrderMmioTranslation|FALSE|BOOLEAN|0x1d
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
> index 04322b0d7855..e08b572ef169 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
> @@ -54,6 +54,7 @@ [Protocols]
>
>  [Pcd]
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress                ## CONSUMES
> +  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress              ## CONSUMES
>    gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase             ## CONSUMES
>
>  [FeaturePcd]

(4) Please move this hunk to patch#7 (subject: "OvmfPkg/CpuHotplugSmm:
add CpuEject()"). That's where CpuHotplugSmm first needs the new PCD.


> diff --git a/OvmfPkg/Include/Library/CpuHotEjectData.h b/OvmfPkg/Include/Library/CpuHotEjectData.h
> new file mode 100644
> index 000000000000..b6fb629a1283
> --- /dev/null
> +++ b/OvmfPkg/Include/Library/CpuHotEjectData.h
> @@ -0,0 +1,35 @@
> +/** @file
> +  Definition for a CPU hot-eject state sharing structure.
> +

(5a) I suggest the following language:

  Definition of the CPU_HOT_EJECT_DATA structure, which shares CPU hot-eject
  state between OVMF's SmmCpuFeaturesLib instance in PiSmmCpuDxeSmm, and
  CpuHotplugSmm.

  CPU_HOT_EJECT_DATA is allocated in SMRAM, and pointed-to by
  PcdCpuHotEjectDataAddress.

(5b) Please append at least one more sentence to state the condition
when the PCD is *not* NULL.


(6) This new header file should be located at:

  OvmfPkg/Include/Pcd/CpuHotEjectData.h

please.

The (more or less) general rule is this:

- if you have a macro definition or a structure type that is accessible
  through a Pcd, a Protocol, a Guid -- HOB, VenHw() devpath node etc --,
  a Library, a Register, etc,

- and the Pcd, Protocol, Guid, Library etc in question is declared in
  "WhateverPkg/WhateverPkg.dec",

- then the header file defining the structure or macro should be placed
  in the following directory (according to the access type):

  WhateverPkg/Include/Pcd/
  WhateverPkg/Include/Protocol/
  WhateverPkg/Include/Guid/
  WhateverPkg/Include/Library/
  WhateverPkg/Include/Register/

Admittedly, while this rule is universally honored in edk2 in the
Protocol, Guid, and Library cases, the Register case is somewhat less
frequently followed, and the Pcd case is almost nonexistent. For
example, "UefiCpuPkg/Include/CpuHotPlugData.h" itself does not follow
the rule (no "Pcd" subdir). However, there are examples that do follow
the rule:

  CryptoPkg/Include/Pcd/PcdCryptoServiceFamilyEnable.h
  RedfishPkg/Include/Pcd/RestExServiceDevicePath.h


> +  Copyright (C) 2021, Oracle Corporation.
> +
> +  SPDX-License-Identifier: BSD-2-Clause-Patent
> +**/
> +
> +#ifndef _CPU_HOT_EJECT_DATA_H_
> +#define _CPU_HOT_EJECT_DATA_H_

(7) Please use the following guard macro:

  CPU_HOT_EJECT_DATA_H_

(i.e., please drop the leading underscore).

Although the leading underscore is widely used in edk2, in include guard
macros, it's a bad practice (it creates identifiers that are reserved by
the C standard), so we should not introduce more of it.


> +
> +typedef
> +VOID
> +(EFIAPI *CPU_HOT_EJECT_FN)(

(8) Please replace _FN with _HANDLER or _FUNCTION.

In edk2, we tend to avoid abbreviations. (Yes, the practice has not
entirely been consistent, and sometimes it's actually *annoying* that
our type names are too long. But that's what we got.)

... _HANDLER would be better, as you call the related field "Handler" in
the structure.


(9) Missing space character before the last parenthesis on the line.


(10) Please add a leading comment block on this function prototype.
(Well, yes, I realize it is technically a *pointer* type, but still.)

This is not just a formality; I'd really like the "ProcessorNum"
parameter to be described, for example its relationship with the
"ProcessorNumber" parameter of EFI_SMM_CPU_SERVICE_PROTOCOL member
functions, and/or the "CPU_HOT_PLUG_DATA.ApicId" array.


> +  IN UINTN  ProcessorNum
> +  );
> +
> +#define CPU_EJECT_INVALID               (MAX_UINT64)
> +#define CPU_EJECT_WORKER                (MAX_UINT64-1)

(11a) If these are meant as special values for the "ApicIdMap" array,
then I'd suggest something like:

  CPU_EJECT_APIC_ID_INVALID
  CPU_EJECT_APIC_ID_WORKER

(11b) Can you add a single-sentence comment to each macro? (Observe the
comment style while at it, please.)


> +
> +#define  CPU_HOT_EJECT_DATA_REVISION_1  0x00000001
> +
> +typedef struct {
> +  UINT32           Revision;          // Used for version identification of
> +                                      // this structure

(12) Please drop both the "CPU_HOT_EJECT_DATA_REVISION_1" macro and the
"Revision" field.

The "CPU_HOT_PLUG_DATA" structure, from
"UefiCpuPkg/Include/CpuHotPlugData.h", is different. That structure is
versioned because it establishes a communication channel between a core
module (PiSmmCpuDxeSmm) and a platform module (such as
OvmfPkg/CpuHotplugSmm); what's more, those modules could even be built
separately, and be available in binary-only form.

(Side note: we ignore "CPU_HOT_PLUG_DATA.Revision" in
"OvmfPkg/CpuHotplugSmm" because the OVMF platforms exist in the exact
same repository as PiSmmCpuDxeSmm, so we can keep them in sync. This is
BTW one reason why I absolutely want OVMF to live in the core edk2
repository. Anyway, digression ends.)

However, the same versioning idea (or requirement) does not hold for the
present use case. The new communication channel is between:

- OVMF's SmmCpuFeaturesLib instance in PiSmmCpuDxeSmm,
- and CpuHotplugSmm.

Both of those are OVMF platform modules, and we never build one without
building the other. (Put differently, we never build PiSmmCpuDxeSmm and
CpuHotplugSmm separately, for any particular OVMF binary.)

Thus, the "Revision" field is useless.


> +  UINT32           ArrayLength;       // Entries in the ApicIdMap array
> +
> +  UINT64           *ApicIdMap;        // Pointer to CpuIndex->ApicId map for
> +                                      // pending hot-ejects

(13a) "CpuIndex" is yet another name here; if you mean
"ProcessorNum[ber]" -- see point (10) above --, then please use that
word.

(13b) Also, the "->" arrow is a bit confusing (is "CpuIndex" a
pointer???), so please either use " -> " (spaces on both sides) or write
"ProcessorNumber-to-ApicId".


> +  CPU_HOT_EJECT_FN Handler;           // Handler to do the CPU ejection
> +
> +  UINT64           Reserved;

(14) Please drop the "Reserved" field as well, with the justification
given in (12).


> +} CPU_HOT_EJECT_DATA;
> +
> +#endif /* _CPU_HOT_EJECT_DATA_H_ */
>

(15) Comment style is wrong; should be //.

(I admit that you may find many examples for the wrong comment style,
near such "#endif" directives, under OvmfPkg/Include; sorry about that.)


(16) Please drop the leading underscore here too.
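
For illustration only, points (5a) through (16) taken together could give
a header roughly like the sketch below. The typedefs and macros at the top
are stand-ins (assumptions) so that the fragment compiles outside an edk2
tree; the wording of the comments, in particular for the WORKER value, is
a guess and not text from the patch.

```c
#include <stdint.h>

//
// Stand-ins for edk2 base definitions (assumptions for this sketch only; in
// the tree they come from Base.h via the usual includes).
//
typedef uint32_t   UINT32;
typedef uint64_t   UINT64;
typedef uintptr_t  UINTN;
#define VOID        void
#define IN
#define EFIAPI
#define MAX_UINT64  ((UINT64)0xFFFFFFFFFFFFFFFFULL)

#ifndef CPU_HOT_EJECT_DATA_H_
#define CPU_HOT_EJECT_DATA_H_

/**
  CPU hot-eject handler, called on each CPU at exit from SMM.

  @param[in] ProcessorNumber  Index of the executing CPU. (Assumed here to be
                              the index used by EFI_SMM_CPU_SERVICE_PROTOCOL
                              and by the CPU_HOT_PLUG_DATA.ApicId array; the
                              real comment should state the relationship.)
**/
typedef
VOID
(EFIAPI *CPU_HOT_EJECT_HANDLER) (
  IN UINTN  ProcessorNumber
  );

//
// ApicIdMap value: no ejection is pending for this processor.
//
#define CPU_EJECT_APIC_ID_INVALID  (MAX_UINT64)
//
// ApicIdMap value: this processor is the ejection worker (wording guessed
// from the patch #8 commit message).
//
#define CPU_EJECT_APIC_ID_WORKER   (MAX_UINT64 - 1)

typedef struct {
  UINT32                   ArrayLength;  // Entries in the ApicIdMap array
  UINT64                   *ApicIdMap;   // ProcessorNumber-to-ApicId map for
                                         // pending hot-ejects
  CPU_HOT_EJECT_HANDLER    Handler;      // Handler to do the CPU ejection
} CPU_HOT_EJECT_DATA;

#endif // CPU_HOT_EJECT_DATA_H_
```

The sketch omits the "Revision" and "Reserved" fields, per (12) and (14),
and keeps the two special values at the top of the UINT64 range, as in the
patch.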


I plan to continue the review either today, or sometime later this week.

Thanks!
Laszlo



* Re: [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state
  2021-01-29  0:59 ` [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state Ankur Arora
@ 2021-02-01 13:36   ` Laszlo Ersek
  2021-02-03  5:20     ` Ankur Arora
  0 siblings, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-01 13:36 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> Init CPU_HOT_EJECT_DATA, which will be used to share CPU ejection state
> between SmmCpuFeaturesLib (via PiSmmCpuDxeSmm) and CpuHotPlugSmm.
> CpuHotplugSmm also sets up the CPU ejection mechanism via
> CPU_HOT_EJECT_DATA->Handler.
>
> Additionally, expose CPU_HOT_EJECT_DATA via PcdCpuHotEjectDataAddress.

(1) Please mention that the logic is added to
SmmCpuFeaturesSmmRelocationComplete(), and so it will run as part of the
PiSmmCpuDxeSmm entry point function, PiCpuSmmEntry().


>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  .../SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf        |  3 +
>  .../Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c  | 78 ++++++++++++++++++++++
>  2 files changed, 81 insertions(+)
>
> diff --git a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
> index 97a10afb6e27..32c63722ee62 100644
> --- a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
> +++ b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
> @@ -35,4 +35,7 @@ [LibraryClasses]
>    UefiBootServicesTableLib
>
>  [Pcd]
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugSupport
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber
> +  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress
>    gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase
> diff --git a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c
> index 7ef7ed98342e..33dd5da92432 100644
> --- a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c
> +++ b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c
> @@ -14,7 +14,9 @@
>  #include <Library/PcdLib.h>
>  #include <Library/SmmCpuFeaturesLib.h>
>  #include <Library/SmmServicesTableLib.h>
> +#include <Library/MemoryAllocationLib.h>

(2) The MemoryAllocationLib class is not listed in the [LibraryClasses]
section of "OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf"; so
please list it there as well.

(Please keep the [LibraryClasses] section in the INF file sorted, while
at it.)


>  #include <Library/UefiBootServicesTableLib.h>
> +#include <Library/CpuHotEjectData.h>

(3) This will change, once you move the header file under
"OvmfPkg/Include/Pcd/"; either way, please keep the #include directives
alphabetically sorted.

(Before this patch, the #include list is well-sorted.)


>  #include <PiSmm.h>
>  #include <Register/Intel/SmramSaveStateMap.h>
>  #include <Register/QemuSmramSaveStateMap.h>
> @@ -171,6 +173,70 @@ SmmCpuFeaturesHookReturnFromSmm (
>    return OriginalInstructionPointer;
>  }
>
> +GLOBAL_REMOVE_IF_UNREFERENCED

(4a) This is useless unless building with MSVC; I don't really remember
introducing any instance of this macro myself, ever. I suggest dropping
it.

(4b) On the other hand, it should be STATIC.


> +CPU_HOT_EJECT_DATA *mCpuHotEjectData = NULL;
> +
> +/**
> +  Initialize CpuHotEjectData if PcdCpuHotPlugSupport is enabled
> +  and, if more than 1 CPU is configured.
> +
> +  Also sets up the corresponding PcdCpuHotEjectDataAddress.
> +**/

(5) typo: s/CpuHotEjectData/mCpuHotEjectData/


(6) As requested elsewhere under v6, there's no need to make this
dependent on PcdCpuHotPlugSupport.


(7) "Initialize" is imperative mood, "sets up" is indicative mood.
Either one is fine, just be consistent please.


> +STATIC
> +VOID
> +SmmCpuFeaturesSmmInitHotEject (

(8) This is a STATIC function (i.e., it has internal linkage); there's
no need to complicate its name with the "SmmCpuFeatures..." prefix.

I suggest "InitCpuHotEjectData".


> +  VOID
> +  )
> +{
> +  UINT32      mMaxNumberOfCpus;

(9) This is a variable with automatic storage duration, so the "m"
prefix is invalid.


> +  EFI_STATUS  Status;
> +
> +  if (!FeaturePcdGet (PcdCpuHotPlugSupport)) {
> +    return;
> +  }

(10a) Please drop this, per prior discussion.

(10b) Please drop the PCD from the INF file too.


(11) In the rest of this function, the comment style is incorrect in
several spots. The idiomatic style is:

  //
  // Blah.
  //

I.e., normally we'd need leading and trailing empty comment lines.

*However*, most of those comments don't really explain much beyond
what's already evident from the code (to me, at least); thus, I would
simply suggest dropping those comments.


> +
> +  // PcdCpuHotPlugSupport => PcdCpuMaxLogicalProcessorNumber
> +  mMaxNumberOfCpus = PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
> +
> +  // No spare CPUs to hot-eject
> +  if (mMaxNumberOfCpus == 1) {
> +    return;
> +  }
> +
> +  mCpuHotEjectData =
> +    (CPU_HOT_EJECT_DATA *)AllocatePool (sizeof (*mCpuHotEjectData));

(12) The cast is superfluous (it only wastes screen real estate), as
AllocatePool() returns (VOID *).

(Hopefully this will also let us avoid the somewhat awkward line break.)


> +  ASSERT (mCpuHotEjectData != NULL);

(13) Here we need to hang harder than this -- even in a RELEASE build,
in case AllocatePool() fails. The following should work:

  if (mCpuHotEjectData == NULL) {
    ASSERT (mCpuHotEjectData != NULL);
    CpuDeadLoop ();
  }

I'll have another comment on this, below...


> +
> +  //
> +  // Allocate buffer for pointers to array in CPU_HOT_EJECT_DATA.
> +  //
> +
> +  // Revision
> +  mCpuHotEjectData->Revision = CPU_HOT_EJECT_DATA_REVISION_1;
> +
> +  // Array Length of APIC ID
> +  mCpuHotEjectData->ArrayLength = mMaxNumberOfCpus;
> +
> +  // CpuIndex -> APIC ID map
> +  mCpuHotEjectData->ApicIdMap = (UINT64 *)AllocatePool
> +      (sizeof (*mCpuHotEjectData->ApicIdMap) * mCpuHotEjectData->ArrayLength);
> +
> +  // Hot-eject handler
> +  mCpuHotEjectData->Handler = NULL;
> +
> +  // Reserved
> +  mCpuHotEjectData->Reserved = 0;
> +
> +  ASSERT (mCpuHotEjectData->ApicIdMap != NULL);
> +

(14) I would propose the following:

(14a) Add SafeIntLib to both the #include directive list, and the
[LibraryClasses] section in the INF file.

(14b) Use SafeIntLib functions to calculate the cumulative size for both
CPU_HOT_EJECT_DATA, and the ApicIdMap placed right after it, in a local
UINTN variable.

(14c) Use a single AllocatePool() call. This simplifies error handling
-- you'll need just one instance of point (13) above --, plus it might
even reduce SMRAM fragmentation a tiny bit.
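
A rough sketch of the idea in (14a) through (14c), written in plain C so
that it can be compiled on its own: CheckedMult() and CheckedAdd() are
hypothetical stand-ins for the SafeIntLib helpers (SafeUintnMult(),
SafeUintnAdd()), malloc() stands in for AllocatePool(), and the function
name AllocCpuHotEjectData() is made up. Rounding the structure size up to
a multiple of 8 is an extra assumption, added to keep the UINT64 array
aligned in a 32-bit build.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CPU_EJECT_INVALID  UINT64_MAX

typedef struct {
  uint32_t  ArrayLength;
  uint64_t  *ApicIdMap;
  void      (*Handler)(size_t ProcessorNumber);
} CPU_HOT_EJECT_DATA;

// Hypothetical stand-in for SafeUintnMult(): 0 on success, -1 on overflow.
static int
CheckedMult (size_t A, size_t B, size_t *Result)
{
  if (B != 0 && A > SIZE_MAX / B) {
    return -1;
  }
  *Result = A * B;
  return 0;
}

// Hypothetical stand-in for SafeUintnAdd(): 0 on success, -1 on overflow.
static int
CheckedAdd (size_t A, size_t B, size_t *Result)
{
  if (A > SIZE_MAX - B) {
    return -1;
  }
  *Result = A + B;
  return 0;
}

// One allocation holds the (padded) structure followed by the ApicIdMap
// array. Returns NULL on size overflow or allocation failure.
static CPU_HOT_EJECT_DATA *
AllocCpuHotEjectData (uint32_t MaxNumberOfCpus)
{
  size_t              HeaderSize;
  size_t              Size;
  CPU_HOT_EJECT_DATA  *Data;
  uint32_t            Idx;

  // Round the header size up to a multiple of 8, for the UINT64 array.
  HeaderSize = (sizeof (*Data) + 7) & ~(size_t)7;

  if (CheckedMult (sizeof (uint64_t), MaxNumberOfCpus, &Size) != 0 ||
      CheckedAdd (HeaderSize, Size, &Size) != 0) {
    return NULL;
  }

  Data = malloc (Size);
  if (Data == NULL) {
    return NULL;
  }

  Data->ArrayLength = MaxNumberOfCpus;
  Data->ApicIdMap   = (uint64_t *)((unsigned char *)Data + HeaderSize);
  Data->Handler     = NULL;

  // Mark every slot as "no ejection pending" before the structure is used.
  for (Idx = 0; Idx < Data->ArrayLength; Idx++) {
    Data->ApicIdMap[Idx] = CPU_EJECT_INVALID;
  }
  return Data;
}
```

The sketch returns NULL on failure only so that it can be exercised in
isolation; in the firmware, a failed allocation should hang as described
in (13).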


(15) The following initialization logic, from patch v6 7/9
("OvmfPkg/CpuHotplugSmm: add CpuEject()"), belongs in the present patch,
in my opinion:

    //
    // For CPU ejection we need to map ProcessorNum -> APIC_ID. By the time
    // we need the mapping, however, the Processor's APIC ID has already been
    // removed from SMM data structures. So we will maintain a local map
    // in mCpuHotEjectData->ApicIdMap.
    //
    for (Idx = 0; Idx < mCpuHotEjectData->ArrayLength; Idx++) {
      mCpuHotEjectData->ApicIdMap[Idx] = CPU_EJECT_INVALID;
    }

If necessary, feel free to trim or reword the comment; I just think the
data structure is not ready for publishing via the PCD until the
"ApicIdMap" elements have been set to INVALID. (IOW, I'm kind of making
a "RAII" argument.)


> +  //
> +  // Expose address of CPU Hot eject Data structure
> +  //

(this comment is helpful, please keep it)


> +  Status = PcdSet64S (PcdCpuHotEjectDataAddress,
> +                      (UINT64)(VOID *)mCpuHotEjectData);

(16) Incorrect indentation on the second line.


(17) The (UINT64) cast could trigger a warning in an IA32 build (casting
between integer and pointer types should keep the width); please replace
(UINT64) with (UINTN).
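
A small sketch of the width-preserving conversion, in plain C: uintptr_t
stands in for UINTN, and the two helper names are hypothetical. The
pointer is first converted to the pointer-sized integer type and only
then widened to the 64-bit PCD value; the reverse direction mirrors the
(VOID *)(UINTN)PcdGet64 (...) idiom used on the consumer side.

```c
#include <stdint.h>

typedef uint64_t   UINT64;
typedef uintptr_t  UINTN;   // stand-in: pointer-sized unsigned integer

// Pointer -> PCD value: pointer-sized integer first, then widen to 64 bits.
static UINT64
PointerToPcdValue (const void *Pointer)
{
  return (UINT64)(UINTN)Pointer;
}

// PCD value -> pointer: narrow to pointer-sized integer, then convert.
static void *
PcdValueToPointer (UINT64 Value)
{
  return (void *)(UINTN)Value;
}
```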


> +  ASSERT_EFI_ERROR (Status);

(18) Given that we don't use the "Status" variable for anything else in
this function, it's more idiomatic for "Status" to directly match the
type returned by PcdSet64S() -- RETURN_STATUS. In such cases, we
generally call the variable "PcdStatus". So the idea is

  RETURN_STATUS PcdStatus;

  PcdStatus = PcdSet64S (...);
  ASSERT_RETURN_ERROR (PcdStatus);

RETURN_STATUS and EFI_STATUS behave identically in practice, but (again)
if we use the status variable *only* for a PcdSet retval, then
RETURN_STATUS is more elegant.

(RETURN_STATUS is basically a BASE type, while EFI_STATUS exists in
connection with the PI and UEFI specs; IOW, RETURN_STATUS is
semantically more primitive / foundational.)

The usage of an ASSERT is fine here, BTW; we don't expect this PcdSet
call to ever fail.


> +}
> +
>  /**
>    Hook point in normal execution mode that allows the one CPU that was elected
>    as monarch during System Management Mode initialization to perform additional
> @@ -188,6 +254,9 @@ SmmCpuFeaturesSmmRelocationComplete (
>    UINTN      MapPagesBase;
>    UINTN      MapPagesCount;
>
> +
> +  SmmCpuFeaturesSmmInitHotEject ();
> +
>    if (!MemEncryptSevIsEnabled ()) {
>      return;
>    }
> @@ -375,6 +444,15 @@ SmmCpuFeaturesRendezvousExit (
>    IN UINTN  CpuIndex
>    )
>  {
> +  //
> +  // CPU Hot-eject not enabled.
> +  //
> +  if (mCpuHotEjectData == NULL ||
> +      mCpuHotEjectData->Handler == NULL) {
> +    return;
> +  }
> +
> +  mCpuHotEjectData->Handler (CpuIndex);
>  }
>
>  /**
>

(19a) Please split the SmmCpuFeaturesRendezvousExit() change to a
separate patch.

In particular, "init CPU ejection state" in the subject does not cover
the SmmCpuFeaturesRendezvousExit() change at all.

(19b) In the separate patch's commit message, it would be nice to
mention the *call site* of SmmCpuFeaturesRendezvousExit(), such as "one
of the last actions in SmiRendezvous()".


(20) I think we should refine the comment "CPU Hot-eject not enabled".
That comment covers the (mCpuHotEjectData == NULL) case, yes; but it
doesn't cover (mCpuHotEjectData->Handler == NULL).

The latter condition certainly seems valid, because:

- some SMIs are likely handled before the SMM driver dispatch reaches
  the CpuHotplugSmm driver, and the latter gets a chance to set up the
  callback, as a part of erecting the CPU hot-(un)plug support,

- and even after CpuHotplugSmm is loaded, an unplug request may never
  happen.

However, we should document this particular state, with a dedicated
comment -- perhaps just say, "hot-eject has not been requested yet".
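
As a sketch only, the two states could be documented separately along the
following lines. The types are reduced stand-ins, the function is named
RendezvousExit() here to make clear it is not the real library function,
and RecordingHandler() with mLastEjectIndex is a hypothetical handler
that exists purely to make the fragment testable.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*CPU_HOT_EJECT_HANDLER)(size_t ProcessorNumber);

typedef struct {
  CPU_HOT_EJECT_HANDLER  Handler;
} CPU_HOT_EJECT_DATA;

static CPU_HOT_EJECT_DATA  *mCpuHotEjectData = NULL;

// Hypothetical handler: records the index it was called with.
static size_t  mLastEjectIndex = SIZE_MAX;

static void
RecordingHandler (size_t ProcessorNumber)
{
  mLastEjectIndex = ProcessorNumber;
}

static void
RendezvousExit (size_t CpuIndex)
{
  //
  // CPU hot-eject is not enabled: no structure has been set up.
  //
  if (mCpuHotEjectData == NULL) {
    return;
  }

  //
  // Hot-eject has not been requested yet: no handler has been installed.
  //
  if (mCpuHotEjectData->Handler == NULL) {
    return;
  }

  mCpuHotEjectData->Handler (CpuIndex);
}
```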

Thanks!
Laszlo



* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-01-29  0:59 ` [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject() Ankur Arora
@ 2021-02-01 16:11   ` Laszlo Ersek
  2021-02-01 19:08   ` Laszlo Ersek
  1 sibling, 0 replies; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-01 16:11 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> Add CpuEject(), which handles the CPU ejection, and provides a holding
> area for said CPUs. It is called via SmmCpuFeaturesRendezvousExit(),
> at the tail end of the SMI handling.

(1) The functions introduced thus far by this patch series are all named
"Verb + Object", which is great; so please call this function EjectCpu()
as well, rather than CpuEject().

Modify all three of: subject line, commit message, patch body; please.


>
> Also UnplugCpus() now stashes APIC IDs of CPUs which need to be
> ejected in CPU_HOT_EJECT_DATA.ApicIdMap. These are used by CpuEject()
> to identify such CPUs.
>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 109 +++++++++++++++++++++++++++++++++++--
>  1 file changed, 105 insertions(+), 4 deletions(-)
>
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> index 70d69f6ed65b..526f51faf070 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> @@ -14,6 +14,7 @@
>  #include <Library/MmServicesTableLib.h>      // gMmst
>  #include <Library/PcdLib.h>                  // PcdGetBool()
>  #include <Library/SafeIntLib.h>              // SafeUintnSub()
> +#include <Library/CpuHotEjectData.h>         // CPU_HOT_EJECT_DATA
>  #include <Protocol/MmCpuIo.h>                // EFI_MM_CPU_IO_PROTOCOL
>  #include <Protocol/SmmCpuService.h>          // EFI_SMM_CPU_SERVICE_PROTOCOL
>  #include <Uefi/UefiBaseType.h>               // EFI_STATUS

(2) This will change due to the movement of the header file, but: please
keep the #include directive list alphabetically sorted.


> @@ -32,11 +33,12 @@ STATIC EFI_MM_CPU_IO_PROTOCOL *mMmCpuIo;
>  //
>  STATIC EFI_SMM_CPU_SERVICE_PROTOCOL *mMmCpuService;
>  //
> -// This structure is a communication side-channel between the
> +// These structures serve as communication side-channels between the
>  // EFI_SMM_CPU_SERVICE_PROTOCOL consumer (i.e., this driver) and provider
>  // (i.e., PiSmmCpuDxeSmm).
>  //
>  STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
> +STATIC CPU_HOT_EJECT_DATA *mCpuHotEjectData;
>  //
>  // SMRAM arrays for fetching the APIC IDs of processors with pending events (of
>  // known event types), for the time of just one MMI.
> @@ -188,11 +190,53 @@ RevokeNewSlot:
>  }
>
>  /**
> +  CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
> +  on each CPU at exit from SMM.
> +
> +  If, the executing CPU is not being ejected, nothing to be done.
> +  If, the executing CPU is being ejected, wait in a CpuDeadLoop()
> +  until ejected.
> +
> +  @param[in] ProcessorNum      Index of executing CPU.
> +
> +**/
> +VOID
> +EFIAPI
> +CpuEject (
> +  IN UINTN ProcessorNum
> +  )
> +{
> +  //
> +  // APIC ID is UINT32, but mCpuHotEjectData->ApicIdMap[] is UINT64
> +  // so use UINT64 throughout.
> +  //
> +  UINT64 ApicId;
> +
> +  ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
> +  if (ApicId == CPU_EJECT_INVALID) {
> +    return;
> +  }
> +
> +  //
> +  // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit()
> +  // after having been cleared to exit the SMI by the monarch and thus have
> +  // no SMM processing remaining.
> +  //
> +  // Given that we cannot allow them to escape to the guest, we pen them
> +  // here until the SMM monarch tells the HW to unplug them.
> +  //
> +  CpuDeadLoop ();
> +}

(3a) We can make this less resource-hungry, by replacing CpuDeadLoop()
with:

  for (;;) {
    DisableInterrupts ();
    CpuSleep ();
  }

This basically translates to a { CLI; HLT; } loop.

(Both functions come from BaseLib, which CpuHotplugSmm already consumes,
thus there is no need to modify #include's or [LibraryClasses].)


(3b) Please refresh the CpuDeadLoop() reference in the function's
leading comment as well.


> +
> +/**
>    Process to be hot-unplugged CPUs, per QemuCpuhpCollectApicIds().
>
>    For each such CPU, report the CPU to PiSmmCpuDxeSmm via
> -  EFI_SMM_CPU_SERVICE_PROTOCOL. If the to be hot-unplugged CPU is
> -  unknown, skip it silently.
> +  EFI_SMM_CPU_SERVICE_PROTOCOL and stash the APIC ID for later ejection.
> +  If the to be hot-unplugged CPU is unknown, skip it silently.
> +
> +  Additonally, if we do stash any APIC IDs, also install a CPU eject handler
> +  which would handle the ejection.
>
>    @param[in] ToUnplugApicIds    The APIC IDs of the CPUs that are about to be
>                                  hot-unplugged.
> @@ -216,9 +260,11 @@ UnplugCpus (
>  {
>    EFI_STATUS Status;
>    UINT32     ToUnplugIdx;
> +  UINT32     EjectCount;
>    UINTN      ProcessorNum;
>
>    ToUnplugIdx = 0;
> +  EjectCount = 0;
>    while (ToUnplugIdx < ToUnplugCount) {
>      APIC_ID    RemoveApicId;
>
> @@ -255,13 +301,41 @@ UnplugCpus (
>        DEBUG ((DEBUG_ERROR, "%a: RemoveProcessor(" FMT_APIC_ID "): %r\n",
>          __FUNCTION__, RemoveApicId, Status));
>        goto Fatal;
> +    } else {

(Under patch v6 4/9, I request that the "goto" be replaced with a
"return" -- my point (4) below applies regardless:)

(4) Please don't add an "else" branch, if the first branch of the "if"
ends with a jump statement. Because, in that case, the code that follows
the "if" statement is not reachable after the first branch anyway.

So please just unnest the next part:


> +      //
> +      // Stash the APIC IDs so we can do the actual ejection later.
> +      //
> +      if (mCpuHotEjectData->ApicIdMap[ProcessorNum] != CPU_EJECT_INVALID) {
> +        //
> +        // Since ProcessorNum and APIC-ID map 1-1, so a valid
> +        // mCpuHotEjectData->ApicIdMap[ProcessorNum] means something
> +        // is horribly wrong.
> +        //

(5) To be honest, I would replace this with:

      //
      // - mCpuHotEjectData->ApicIdMap[ProcessorNum] is initialized to
      //   CPU_EJECT_INVALID when mCpuHotEjectData->ApicIdMap is allocated,
      //
      // - mCpuHotEjectData->ApicIdMap[ProcessorNum] is restored to
      //   CPU_EJECT_INVALID when the subject processor is ejected,
      //
      // - mMmCpuService->RemoveProcessor(ProcessorNum) invalidates
      //   mCpuHotPlugData->ApicId[ProcessorNum], so a given ProcessorNum can
      //   never match more than one APIC ID in a single invocation of
      //   UnplugCpus().
      //


> +        DEBUG ((DEBUG_ERROR, "%a: ProcessorNum %u maps to %llx, cannot "
> +                "map to " FMT_APIC_ID "\n", __FUNCTION__, ProcessorNum,
> +                mCpuHotEjectData->ApicIdMap[ProcessorNum], RemoveApicId));

(6a) The indentation of the 2nd and 3rd lines is incorrect.

(6b) For logging UINTN values (i.e., ProcessorNum) portably between IA32
and X64, %u is not correct. Instead:

- cast the UINTN value to UINT64 explicitly,
- use the %Lu or %Lx format specifier.

(6c) There is no "%llx" format string in edk2's PrintLib (no "ll" length
modifier, to be more precise). UINT64 values need to be printed with
"%lu" or "%lx", or -- identically -- with "%Lu" or "%Lx". I prefer the
latter, because standard C does not define the "L" size modifier for
integers, and that makes it clear that we're using an edk2-specific
feature. The "l" (ell) length modifier could be misunderstood as "long"
(which is something we don't use in edk2).

(6d) FMT_APIC_ID is defined as "0x%08x"; to remain consistent with that,
I would print the ApicIdMap element not just with "%Lx", but with
"0x%016Lx".

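To illustrate the casts and field widths from (6b) and (6d) outside of
edk2, here is a standard C analogue. Note the assumption: edk2's PrintLib
spells the 64-bit conversions "%Lu" and "%Lx", which standard printf()
does not, so the sketch uses the <inttypes.h> macros instead; only the
explicit widening of the index and the zero-padded widths carry over. The
function name FormatMapConflict() is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Formats the "already mapped" message into Buffer; returns what
// snprintf() returns.
static int
FormatMapConflict (
  char       *Buffer,
  size_t     Size,
  uintptr_t  ProcessorNum,    // UINTN in edk2: width differs per arch
  uint64_t   MappedApicId,    // ApicIdMap element
  uint32_t   RemoveApicId     // APIC_ID
  )
{
  return snprintf (
           Buffer,
           Size,
           "ProcessorNum %" PRIu64 " maps to 0x%016" PRIx64
           ", cannot map to 0x%08" PRIx32 "\n",
           (uint64_t)ProcessorNum,   // widen explicitly, as in (6b)
           MappedApicId,
           RemoveApicId
           );
}
```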

> +
> +        Status = EFI_INVALID_PARAMETER;
> +        goto Fatal;

(7a) Please just "return EFI_ALREADY_STARTED".

(7b) Please also modify the leading comment on the function -- the new
return value EFI_ALREADY_STARTED should be documented. I suggest:

   @retval EFI_ALREADY_STARTED   For the ProcessorNumber that
                                 EFI_SMM_CPU_SERVICE_PROTOCOL had assigned to
                                 one of the APIC IDs in ToUnplugApicIds,
                                 mCpuHotEjectData->ApicIdMap already has an
                                 APIC ID stashed. (This should never happen.)


> +      }
> +
> +      mCpuHotEjectData->ApicIdMap[ProcessorNum] = (UINT64)RemoveApicId;
> +      EjectCount++;
>      }
>
>      ToUnplugIdx++;
>    }
>
> +  if (EjectCount != 0) {
> +    //
> +    // We have processors to be ejected; install the handler.
> +    //
> +    mCpuHotEjectData->Handler = CpuEject;
> +  }
> +

(8) I suggest removing the "EjectCount" local variable, and setting the
"Handler" member where you currently increment "EjectCount".

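Taking (4), (7a) and (8) together, the stash step could look roughly like
the sketch below. It is plain C with reduced stand-in types; StashApicId()
and the STASH_STATUS enum are hypothetical (the enum stands in for
EFI_SUCCESS / EFI_ALREADY_STARTED), and EjectCpu() is an empty
placeholder for the real handler.

```c
#include <stddef.h>
#include <stdint.h>

#define CPU_EJECT_INVALID  UINT64_MAX

typedef void (*CPU_HOT_EJECT_HANDLER)(size_t ProcessorNumber);

typedef struct {
  uint32_t               ArrayLength;
  uint64_t               *ApicIdMap;
  CPU_HOT_EJECT_HANDLER  Handler;
} CPU_HOT_EJECT_DATA;

// Stand-in for EFI_SUCCESS / EFI_ALREADY_STARTED.
typedef enum {
  STASH_SUCCESS = 0,
  STASH_ALREADY_STARTED
} STASH_STATUS;

// Placeholder for the real ejection handler.
static void
EjectCpu (size_t ProcessorNumber)
{
  (void)ProcessorNumber;
}

static STASH_STATUS
StashApicId (
  CPU_HOT_EJECT_DATA  *Data,
  size_t              ProcessorNumber,
  uint32_t            RemoveApicId
  )
{
  // A slot that is not INVALID already has an APIC ID stashed; return
  // directly, with no "else" branch needed for the code that follows.
  if (Data->ApicIdMap[ProcessorNumber] != CPU_EJECT_INVALID) {
    return STASH_ALREADY_STARTED;
  }

  // Stash the APIC ID and install the handler right here, so that no
  // separate counter is needed.
  Data->ApicIdMap[ProcessorNumber] = (uint64_t)RemoveApicId;
  Data->Handler                    = EjectCpu;
  return STASH_SUCCESS;
}
```

Installing the handler on every successful stash is idempotent, which is
what makes the separate counter unnecessary.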

>    //
> -  // We've removed this set of APIC IDs from SMM data structures.
> +  // We've removed this set of APIC IDs from SMM data structures and
> +  // have installed an ejection handler if needed.
>    //
>    return EFI_SUCCESS;
>
> @@ -458,7 +532,13 @@ CpuHotplugEntry (
>    // Our DEPEX on EFI_SMM_CPU_SERVICE_PROTOCOL guarantees that PiSmmCpuDxeSmm
>    // has pointed PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM.
>    //
> +  // Additionally, CPU Hot-unplug is available only if CPU Hotplug is, so
> +  // the same DEPEX also guarantees that PcdCpuHotEjectDataAddress points
> +  // to CPU_HOT_EJECT_DATA in SMRAM.
> +  //

(9) I don't see the relevance of "hot-unplug depends on hot-plug" here.

I recommend the following comment instead:

   //
   // Our DEPEX on EFI_SMM_CPU_SERVICE_PROTOCOL guarantees that PiSmmCpuDxeSmm
   // has pointed:
   // - PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM,
   // - PcdCpuHotEjectDataAddress to CPU_HOT_EJECT_DATA in SMRAM, if the
   //   possible CPU count is greater than 1.
   //

>    mCpuHotPlugData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotPlugDataAddress);
> +  mCpuHotEjectData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotEjectDataAddress);
> +
>    if (mCpuHotPlugData == NULL) {
>      Status = EFI_NOT_FOUND;
>      DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_PLUG_DATA: %r\n", __FUNCTION__, Status));
> @@ -470,6 +550,9 @@ CpuHotplugEntry (
>    if (mCpuHotPlugData->ArrayLength == 1) {
>      return EFI_UNSUPPORTED;
>    }
> +  ASSERT (mCpuHotEjectData &&
> +          (mCpuHotPlugData->ArrayLength == mCpuHotEjectData->ArrayLength));
> +
>    //
>    // Allocate the data structures that depend on the possible CPU count.
>    //

(10) To remain consistent with the check performed on "mCpuHotPlugData",
please do:

  if (mCpuHotEjectData == NULL) {
    Status = EFI_NOT_FOUND;
  } else if (mCpuHotPlugData->ArrayLength != mCpuHotEjectData->ArrayLength) {
    Status = EFI_INVALID_PARAMETER;
  } else {
    Status = EFI_SUCCESS;
  }
  if (EFI_ERROR (Status)) {
    DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_EJECT_DATA: %r\n", __FUNCTION__, Status));
    goto Fatal;
  }

(

  As a digression, I'll make some comments on the ASSERT() too:

  - Given ASSERT ((C1) && (C2)), it is best to express the same as
    ASSERT (C1); ASSERT (C2); -- the effect is the same, but the error
    messages have finer granularity.

  - Checking a pointer against NULL must be explicit at all times, in
    edk2. IOW, ASSERT (mCpuHotEjectData) should be spelled
    ASSERT (mCpuHotEjectData != NULL).

)
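
The two points of the digression can be sketched in portable C. This is only an illustration: standard assert() stands in for the edk2 ASSERT() macro, and the PLUG_DATA / EJECT_DATA types and CheckedArrayLength() are hypothetical stand-ins, not edk2 definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for CPU_HOT_PLUG_DATA / CPU_HOT_EJECT_DATA. */
typedef struct { uint32_t ArrayLength; } PLUG_DATA;
typedef struct { uint32_t ArrayLength; } EJECT_DATA;

/* Returns the common array length after checking each invariant separately. */
static uint32_t
CheckedArrayLength (const PLUG_DATA *Plug, const EJECT_DATA *Eject)
{
  /* One condition per assertion: a failure names the exact condition. */
  assert (Plug != NULL);
  /* Pointer checks are spelled out with an explicit comparison to NULL. */
  assert (Eject != NULL);
  assert (Plug->ArrayLength == Eject->ArrayLength);
  return Plug->ArrayLength;
}
```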


> @@ -552,6 +635,24 @@ CpuHotplugEntry (
>    //
>    SmbaseInstallFirstSmiHandler ();
>
> +  if (mCpuHotEjectData) {

(11) This condition is guaranteed to evaluate to TRUE; see the ASSERT()
above.

Anyway, ignore this...


> +  UINT32     Idx;

(12) Incorrect indentation, but ignore this too...


> +    //
> +    // For CPU ejection we need to map ProcessorNum -> APIC_ID. By the time
> +    // we need the mapping, however, the Processor's APIC ID has already been
> +    // removed from SMM data structures. So we will maintain a local map
> +    // in mCpuHotEjectData->ApicIdMap.
> +    //
> +    for (Idx = 0; Idx < mCpuHotEjectData->ArrayLength; Idx++) {
> +      mCpuHotEjectData->ApicIdMap[Idx] = CPU_EJECT_INVALID;
> +    }

(13) ... because this init loop should be moved to patch #6 (subject
"OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state"), as I mentioned
there...


> +
> +    //
> +    // Wait to init the handler until an ejection is warranted
> +    //
> +    mCpuHotEjectData->Handler = NULL;

(14) ... and because this nulling is performed by patch #6 already
(subject "OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state").


Therefore, this whole conditional block should be removed please.

Thanks!
Laszlo

> +  }
> +
>    return EFI_SUCCESS;
>
>  ReleasePostSmmPen:
>



* Re: [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection
  2021-01-29  0:59 ` [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection Ankur Arora
@ 2021-02-01 17:22   ` Laszlo Ersek
  2021-02-01 19:21     ` Ankur Arora
  0 siblings, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-01 17:22 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> Designate a worker CPU (we use the one executing the root MMI
> handler), which will do the actual ejection via QEMU in CpuEject().
>
> CpuEject(), on the worker CPU, ejects each marked CPU by first
> selecting its APIC ID and then sending the QEMU "eject" command.
> QEMU in-turn signals the remote VCPU thread which context-switches
> it out of the SMI.
>
> CpuEject(), on the CPU being ejected, spins around in its holding
> area until this final context-switch. This does mean that there is
> some CPU state that would ordinarily be restored (in SmiRendezvous()
> and in SmiEntry.nasm::CommonHandler), but will not be anymore.
> This unrestored state includes FPU state, CET enable, stuffing of
> RSB and the final RSM. Since the CPU state is destroyed by QEMU,
> this should be okay.
>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 73 ++++++++++++++++++++++++++++++++++----
>  1 file changed, 67 insertions(+), 6 deletions(-)

(1) s/CpuEject/EjectCpu/g, per previous request (affects commit message
and code too).


> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> index 526f51faf070..bf91344eef9c 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> @@ -193,9 +193,12 @@ RevokeNewSlot:
>    CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
>    on each CPU at exit from SMM.
>
> -  If, the executing CPU is not being ejected, nothing to be done.
> +  If, the executing CPU is neither a worker, nor being ejected, nothing
> +  to be done.
>    If, the executing CPU is being ejected, wait in a CpuDeadLoop()
>    until ejected.
> +  If, the executing CPU is a worker CPU, set QEMU CPU status to eject
> +  for CPUs being ejected.
>
>    @param[in] ProcessorNum      Index of executing CPU.
>
> @@ -217,6 +220,56 @@ CpuEject (
>      return;
>    }
>
> +  if (ApicId == CPU_EJECT_WORKER) {

(2) The CPU_EJECT_WORKER approach is needlessly complicated (speculative
generality). I wish I had understood this idea earlier in the patch set.

(2a) In patch #5 (subject "OvmfPkg/CpuHotplugSmm: define
CPU_HOT_EJECT_DATA"), the CPU_EJECT_WORKER macro definition should be
dropped.

(2b) In this patch, the question whether the executing CPU is the BSP or
not, should be decided with the same logic that is visible in
PlatformSmmBspElection()
[OvmfPkg/Library/SmmCpuPlatformHookLibQemu/SmmCpuPlatformHookLibQemu.c]:

  MSR_IA32_APIC_BASE_REGISTER ApicBaseMsr;
  BOOLEAN                     IsBsp;

  ApicBaseMsr.Uint64 = AsmReadMsr64 (MSR_IA32_APIC_BASE);
  IsBsp = (BOOLEAN)(ApicBaseMsr.Bits.BSP == 1);

(2c) Point (2b) obviates the explicit "mark as worker" logic entirely,
in UnplugCpus() below.

(2d) The "is worker" language (in comments etc) should be replaced with
direct "is BSP" language.
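
For readers unfamiliar with the MSR test in (2b): per the Intel SDM, bit 8 of IA32_APIC_BASE is the BSP flag. The sketch below decodes that bit from a raw 64-bit value; IsBspFromApicBase() and the macro name are hypothetical, and real code would read the MSR with AsmReadMsr64() as shown in (2b).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IA32_APIC_BASE bit 8 is the BSP flag (macro name is made up here). */
#define APIC_BASE_BSP_BIT  8u

/* Decode the BSP flag from a raw IA32_APIC_BASE value (hypothetical helper). */
static bool
IsBspFromApicBase (uint64_t ApicBaseMsr)
{
  return ((ApicBaseMsr >> APIC_BASE_BSP_BIT) & 1u) != 0;
}
```

With the usual base address 0xFEE00000 and the APIC globally enabled (bit 11), a BSP would read 0xFEE00900 and an AP 0xFEE00800.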


> +    UINT32 CpuIndex;
> +
> +    for (CpuIndex = 0; CpuIndex < mCpuHotEjectData->ArrayLength; CpuIndex++) {
> +      UINT64 RemoveApicId;
> +
> +      RemoveApicId = mCpuHotEjectData->ApicIdMap[CpuIndex];
> +
> +      if ((RemoveApicId != CPU_EJECT_INVALID &&
> +           RemoveApicId != CPU_EJECT_WORKER)) {
> +        //
> +        // This to-be-ejected-CPU has already received the BSP's SMI exit
> +        // signal and, will execute SmmCpuFeaturesSmiRendezvousExit()
> +        // followed by this callback or is already waiting in the
> +        // CpuDeadLoop() below.
> +        //
> +        // Tell QEMU to context-switch it out.
> +        //
> +        QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId);
> +        QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED);

(3) While the QEMU CPU selector value *usually* matches the APIC ID,
it's not an invariant. APIC IDs have an internal structure, composed of
bit-fields, where each bit-field accommodates one hierarchy level in the
CPU topology (thread, core, die (maybe), and socket).

However, this mapping need not be surjective. QEMU lets you create
"pathological" CPU topologies, for example one with:
- 3 threads/core,
- 5 cores/socket,
- (say) 2 sockets.

Under that example, the bit-field standing for the "thread number" level
would have 2 bits, theoretically permitting *4* threads/core, and the
bit-field standing for the "core number" level would have 3 bits,
theoretically allowing for *8* cores/socket.

Considering the fully populated topology, you'd see the CPU selector
range from 0 to (3*5*2-1)=29, inclusive (corresponding to 30 logical
processors in total). However, the APIC ID *image* of this CPU selector
*domain* would not be "contiguous" -- the APIC ID space, with the
above-described structure, would accommodate 4*8*2=64 logical
processors. For example, each APIC ID that stood for the nonexistent
"thread#3" on a particular core would be left unused (no CPU selector
would map to it).

All in all, you can't write the APIC ID to the CPU selector register,
for ejection. You need to select the CPU whose APIC ID is the APIC ID
you want to eject, and then initiate ejection.
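
The 3/5/2 example can be worked through in a few lines of C. This is a sketch under two assumptions: the selector enumerates CPUs thread-first, then core, then socket, and there is no die level. FieldWidth() and ApicIdFromSelector() are hypothetical helpers, not QEMU or edk2 functions.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest bit-field width that can hold the values 0 .. Count-1. */
static unsigned
FieldWidth (uint32_t Count)
{
  unsigned Width = 0;

  assert (Count > 0);
  while ((1u << Width) < Count) {
    Width++;
  }
  return Width;
}

/* Map a contiguous selector to a bit-field structured APIC ID. */
static uint32_t
ApicIdFromSelector (uint32_t Selector, uint32_t Threads, uint32_t Cores)
{
  unsigned ThreadBits = FieldWidth (Threads);
  unsigned CoreBits   = FieldWidth (Cores);
  uint32_t Thread     = Selector % Threads;
  uint32_t Core       = (Selector / Threads) % Cores;
  uint32_t Socket     = Selector / (Threads * Cores);

  return (Socket << (ThreadBits + CoreBits)) | (Core << ThreadBits) | Thread;
}
```

Under these assumptions, selector 3 (thread 0 of core 1) maps to APIC ID 4, so APIC ID 3 is never used, and the last selector, 29, maps to APIC ID 50.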

This requires one of two alternatives:


(3a) The first option is to keep the change local to this patch.

This alternative is the more CPU-hungry (and uglier) one.

The idea is to perform a QEMU_CPUHP_CMD_GET_ARCH_ID loop over all
possible CPUs, somewhat similarly to QemuCpuhpCollectApicIds(). At every
CPU, knowing the APIC ID, try to find the APIC ID in "ApicIdMap". If
there is a match, eject.
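
A rough model of the (3a) loop, with QEMU replaced by a plain array: ArchIds[Selector] stands in for the result of a QEMU_CPUHP_CMD_GET_ARCH_ID query, and EJECT_INVALID stands in for CPU_EJECT_INVALID. The function name and signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EJECT_INVALID  UINT64_MAX   /* stand-in for CPU_EJECT_INVALID */

/*
  For every possible selector, take its APIC ID and search the pending-eject
  map for it. Matching selectors are stored in ToEject[]; the number of
  matches is returned. ToEject[] must have room for PossibleCpus entries.
*/
static size_t
CollectSelectorsToEject (
  const uint32_t *ArchIds,
  size_t         PossibleCpus,
  const uint64_t *ApicIdMap,
  size_t         MapLength,
  uint32_t       *ToEject
  )
{
  size_t Count = 0;

  assert (ToEject != NULL);
  for (size_t Selector = 0; Selector < PossibleCpus; Selector++) {
    for (size_t Idx = 0; Idx < MapLength; Idx++) {
      if (ApicIdMap[Idx] != EJECT_INVALID &&
          ApicIdMap[Idx] == ArchIds[Selector]) {
        ToEject[Count++] = (uint32_t)Selector;
        break;
      }
    }
  }
  return Count;
}
```

The nested search is what makes this alternative the CPU-hungry one: its cost grows with the product of the possible CPU count and the map length.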


(3b) The second option is much more elegant (and it's faster too), but
it requires a much more intrusive update to the patch set.

First, the *element type* of the arrays that QemuCpuhpCollectApicIds()
operates on, has to be changed from APIC_ID to a structure type that
pairs APIC_ID with the QEMU CPU selector. [*]

Second, whenever QemuCpuhpCollectApicIds() outputs an APIC_ID, it should
also save the "CurrentSelector" value (in the other field of the output
array element structure).

Third, the element type of CPU_HOT_EJECT_DATA.ApicIdMap should be
replaced with a structure type similar (or identical) to the one
described at [*]. The ProcessorNumber lookup in UnplugCpus() would still
be based upon the APIC ID, but CPU_HOT_EJECT_DATA should remember both
the QEMU selector for that processor, and the APIC ID.

Fourth, the actual ejection should use the selector.

Fifth, the debug message (below) should continue logging the APIC ID, to
mirror the DEBUG_INFO message in ProcessHotAddedCpus().
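
The data-structure side of (3b) might look roughly like the following. CPU_ID_PAIR, RecordCpu() and SelectorForEjection() are hypothetical names used only for illustration; the actual type and field names are up to the patch author.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t APIC_ID;

/* Hypothetical element type pairing an APIC ID with its QEMU selector. */
typedef struct {
  APIC_ID  ApicId;        /* used for the ProcessorNumber lookup and logging */
  uint32_t QemuSelector;  /* used for the actual ejection */
} CPU_ID_PAIR;

/* Collection time: remember both values for a processor number. */
static void
RecordCpu (CPU_ID_PAIR *Map, size_t ProcessorNumber, APIC_ID ApicId,
           uint32_t CurrentSelector)
{
  assert (Map != NULL);
  Map[ProcessorNumber].ApicId       = ApicId;
  Map[ProcessorNumber].QemuSelector = CurrentSelector;
}

/* Ejection time: the value to write to the CPU selector register. */
static uint32_t
SelectorForEjection (const CPU_ID_PAIR *Map, size_t ProcessorNumber)
{
  assert (Map != NULL);
  return Map[ProcessorNumber].QemuSelector;
}
```

Because the selector is captured at collection time, ejection needs no further search.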


Would you be willing to implement (3b)?


> +
> +        //
> +        // Compiler barrier to ensure the next store isn't reordered
> +        //
> +        MemoryFence ();
> +
> +        //
> +        // Clear the eject status for CpuIndex to ensure that an invalid
> +        // SMI later does not end up trying to eject it or a newly
> +        // hotplugged CpuIndex does not go into the dead loop.
> +        //
> +        mCpuHotEjectData->ApicIdMap[CpuIndex] = CPU_EJECT_INVALID;
> +
> +        DEBUG ((DEBUG_INFO, "%a: Unplugged CPU %u -> " FMT_APIC_ID "\n",
> +               __FUNCTION__, CpuIndex, RemoveApicId));

(4) The DEBUG_INFO log message is in the right place (and uses the right
debug mask), but it is afflicted by the usual warts (indentation, format
specifiers etc). Please reapply the comments I made elsewhere.


(5a) Please replace "CPU" with "ProcessorNumber" (so that we know it's
the protocol-assigned number, not the QEMU selector).

(5b) Please replace the arrow " -> " with the string " APIC ID ".


Thanks!
Laszlo

> +      }
> +    }
> +
> +    //
> +    // Clear our own worker status.
> +    //
> +    mCpuHotEjectData->ApicIdMap[ProcessorNum] = CPU_EJECT_INVALID;
> +
> +    //
> +    // We are done until the next hot-unplug; clear the handler.
> +    //
> +    mCpuHotEjectData->Handler = NULL;
> +    return;
> +  }
> +
>    //
>    // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit()
>    // after having been cleared to exit the SMI by the monarch and thus have
> @@ -327,6 +380,19 @@ UnplugCpus (
>    }
>
>    if (EjectCount != 0) {
> +    UINTN  Worker;
> +
> +    Status = mMmCpuService->WhoAmI (mMmCpuService, &Worker);
> +    ASSERT_EFI_ERROR (Status);
> +    //
> +    // UnplugCpus() is called via the root MMI handler and thus we are
> +    // executing in the BSP context.
> +    //
> +    // Mark ourselves as the worker CPU.
> +    //
> +    ASSERT (mCpuHotEjectData->ApicIdMap[Worker] == CPU_EJECT_INVALID);
> +    mCpuHotEjectData->ApicIdMap[Worker] = CPU_EJECT_WORKER;
> +
>      //
>      // We have processors to be ejected; install the handler.
>      //
> @@ -451,11 +517,6 @@ CpuHotplugMmi (
>    if (EFI_ERROR (Status)) {
>      goto Fatal;
>    }
> -  if (ToUnplugCount > 0) {
> -    DEBUG ((DEBUG_ERROR, "%a: hot-unplug is not supported yet\n",
> -      __FUNCTION__));
> -    goto Fatal;
> -  }
>
>    if (PluggedCount > 0) {
>      Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
>



* Re: [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug
  2021-01-29  0:59 ` [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug Ankur Arora
@ 2021-02-01 17:37   ` Laszlo Ersek
  2021-02-01 17:40     ` Laszlo Ersek
  2021-02-03  5:46     ` Ankur Arora
  0 siblings, 2 replies; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-01 17:37 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 01/29/21 01:59, Ankur Arora wrote:
> As part of the negotiation treat ICH9_LPC_SMI_F_CPU_HOT_UNPLUG as a
> subfeature of feature flag ICH9_LPC_SMI_F_CPU_HOTPLUG, so enable it
> only if the other is also being negotiated.
> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  OvmfPkg/SmmControl2Dxe/SmiFeatures.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/OvmfPkg/SmmControl2Dxe/SmiFeatures.c b/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
> index c9d875543205..e70f3f8b58cb 100644
> --- a/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
> +++ b/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
> @@ -29,6 +29,13 @@
>  //
>  #define ICH9_LPC_SMI_F_CPU_HOTPLUG BIT1
>  
> +// The following bit value stands for "enable CPU hot unplug, and inject an SMI

(1) s/hot unplug/hot-unplug/


> +// with control value ICH9_APM_CNT_CPU_HOT_UNPLUG upon hot unplug", in the

(2) There is no such thing as ICH9_APM_CNT_CPU_HOT_UNPLUG; we use the
same SMI command value ICH9_APM_CNT_CPU_HOTPLUG (= 4) for unplug.

In QEMU, the macro is called OVMF_CPUHP_SMI_CMD.


(3) s/hot unplug/hot-unplug/.


> +// "etc/smi/supported-features" and "etc/smi/requested-features" fw_cfg files.
> +// Is only negotiated alongside ICH9_LPC_SMI_F_CPU_HOTPLUG.

(4) Please drop the last sentence (see more on it below).


> +//
> +#define ICH9_LPC_SMI_F_CPU_HOT_UNPLUG BIT2
> +
>  //
>  // Provides a scratch buffer (allocated in EfiReservedMemoryType type memory)
>  // for the S3 boot script fragment to write to and read from.
> @@ -112,7 +119,8 @@ NegotiateSmiFeatures (
>    QemuFwCfgReadBytes (sizeof mSmiFeatures, &mSmiFeatures);
>  
>    //
> -  // We want broadcast SMI, SMI on CPU hotplug, and nothing else.
> +  // We want broadcast SMI, SMI on CPU hotplug, on CPU hot-unplug
> +  // and nothing else.
>    //
>    RequestedFeaturesMask = ICH9_LPC_SMI_F_BROADCAST;
>    if (!MemEncryptSevIsEnabled ()) {

(5) Please spell out the full expression "SMI on CPU hot-unplug".


> @@ -120,8 +128,18 @@ NegotiateSmiFeatures (
>      // For now, we only support hotplug with SEV disabled.
>      //
>      RequestedFeaturesMask |= ICH9_LPC_SMI_F_CPU_HOTPLUG;
> +    RequestedFeaturesMask |= ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
>    }
>    mSmiFeatures &= RequestedFeaturesMask;
> +
> +  if (!(mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOTPLUG) &&
> +      (mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOT_UNPLUG)) {
> +    DEBUG ((DEBUG_WARN, "%a CPU host-features %Lx, requested mask %Lx\n",
> +      __FUNCTION__, mSmiFeatures, RequestedFeaturesMask));
> +
> +    mSmiFeatures &= ~ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
> +  }
> +
>    QemuFwCfgSelectItem (mRequestedFeaturesItem);
>    QemuFwCfgWriteBytes (sizeof mSmiFeatures, &mSmiFeatures);
>  

(6) Please drop this hunk. We don't try to be smarter than QEMU, in
general, whenever we perform feature negotiation.

For example, the pre-patch code doesn't attempt to notice if QEMU
acknowledges ICH9_LPC_SMI_F_CPU_HOTPLUG but not ICH9_LPC_SMI_F_BROADCAST.


> @@ -162,8 +180,9 @@ NegotiateSmiFeatures (
>    if ((mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOTPLUG) == 0) {
>      DEBUG ((DEBUG_INFO, "%a: CPU hotplug not negotiated\n", __FUNCTION__));
>    } else {
> -    DEBUG ((DEBUG_INFO, "%a: CPU hotplug with SMI negotiated\n",
> -      __FUNCTION__));
> +    DEBUG ((DEBUG_INFO, "%a: CPU hotplug%s with SMI negotiated\n",
> +      __FUNCTION__,
> +      (mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOT_UNPLUG) ? ", unplug" : ""));
>    }
>  
>    //
> 

(7) Rather than combining these two in a common debug message, please
just add a separate "if" that follows the whole pattern seen with
ICH9_LPC_SMI_F_CPU_HOTPLUG. Thus, for each feature bit we care about,
we'll have a dedicated log message, saying yes or no.
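
Points (6) and (7) together, as a standalone sketch: the negotiated set is simply the supported set masked by what is requested, with no cross-checking between bits, and each feature gets its own yes/no report. The macro names are shortened stand-ins, bit 0 for broadcast SMI is an assumption not stated in this message, and printf() stands in for DEBUG().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SMI_F_BROADCAST       (1ull << 0)  /* assumed bit position */
#define SMI_F_CPU_HOTPLUG     (1ull << 1)
#define SMI_F_CPU_HOT_UNPLUG  (1ull << 2)

/* Mask the host's supported features with what the firmware wants. */
static uint64_t
NegotiateFeatures (uint64_t Supported, bool SevEnabled)
{
  uint64_t Requested = SMI_F_BROADCAST;

  if (!SevEnabled) {
    Requested |= SMI_F_CPU_HOTPLUG | SMI_F_CPU_HOT_UNPLUG;
  }
  /* No attempt to second-guess the combination the host offers. */
  return Supported & Requested;
}

/* One dedicated log line per feature bit; returns whether it is enabled. */
static bool
ReportFeature (uint64_t Negotiated, uint64_t FeatureBit, const char *Name)
{
  bool Enabled = (Negotiated & FeatureBit) != 0;

  printf ("%s %s\n", Name, Enabled ? "negotiated" : "not negotiated");
  return Enabled;
}
```

Calling ReportFeature() once for hotplug and once for hot-unplug yields a separate yes or no line for each bit.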

Thanks!
Laszlo



* Re: [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug
  2021-02-01 17:37   ` Laszlo Ersek
@ 2021-02-01 17:40     ` Laszlo Ersek
  2021-02-01 17:48       ` Laszlo Ersek
  2021-02-03  5:46     ` Ankur Arora
  1 sibling, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-01 17:40 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/01/21 18:37, Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> As part of the negotiation treat ICH9_LPC_SMI_F_CPU_HOT_UNPLUG as a
>> subfeature of feature flag ICH9_LPC_SMI_F_CPU_HOTPLUG, so enable it
>> only if the other is also being negotiated.
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>  OvmfPkg/SmmControl2Dxe/SmiFeatures.c | 25 ++++++++++++++++++++++---
>>  1 file changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/OvmfPkg/SmmControl2Dxe/SmiFeatures.c b/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
>> index c9d875543205..e70f3f8b58cb 100644
>> --- a/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
>> +++ b/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
>> @@ -29,6 +29,13 @@
>>  //
>>  #define ICH9_LPC_SMI_F_CPU_HOTPLUG BIT1
>>  
>> +// The following bit value stands for "enable CPU hot unplug, and inject an SMI
> 
> (1) s/hot unplug/hot-unplug/
> 
> 
>> +// with control value ICH9_APM_CNT_CPU_HOT_UNPLUG upon hot unplug", in the
> 
> (2) There is no such thing as ICH9_APM_CNT_CPU_HOT_UNPLUG; we use the
> same SMI command value ICH9_APM_CNT_CPU_HOTPLUG (= 4) for unplug.
> 
> In QEMU, the macro is called OVMF_CPUHP_SMI_CMD.
> 
> 
> (3) s/hot unplug/hot-unplug/.
> 
> 
>> +// "etc/smi/supported-features" and "etc/smi/requested-features" fw_cfg files.
>> +// Is only negotiated alongside ICH9_LPC_SMI_F_CPU_HOTPLUG.
> 
> (4) Please drop the last sentence (see more on it below).
> 
> 
>> +//
>> +#define ICH9_LPC_SMI_F_CPU_HOT_UNPLUG BIT2
>> +
>>  //
>>  // Provides a scratch buffer (allocated in EfiReservedMemoryType type memory)
>>  // for the S3 boot script fragment to write to and read from.
>> @@ -112,7 +119,8 @@ NegotiateSmiFeatures (
>>    QemuFwCfgReadBytes (sizeof mSmiFeatures, &mSmiFeatures);
>>  
>>    //
>> -  // We want broadcast SMI, SMI on CPU hotplug, and nothing else.
>> +  // We want broadcast SMI, SMI on CPU hotplug, on CPU hot-unplug
>> +  // and nothing else.
>>    //
>>    RequestedFeaturesMask = ICH9_LPC_SMI_F_BROADCAST;
>>    if (!MemEncryptSevIsEnabled ()) {
> 
> (5) Please spell out the full expression "SMI on CPU hot-unplug".
> 
> 
>> @@ -120,8 +128,18 @@ NegotiateSmiFeatures (
>>      // For now, we only support hotplug with SEV disabled.
>>      //
>>      RequestedFeaturesMask |= ICH9_LPC_SMI_F_CPU_HOTPLUG;
>> +    RequestedFeaturesMask |= ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
>>    }
>>    mSmiFeatures &= RequestedFeaturesMask;
>> +
>> +  if (!(mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOTPLUG) &&
>> +      (mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOT_UNPLUG)) {
>> +    DEBUG ((DEBUG_WARN, "%a CPU host-features %Lx, requested mask %Lx\n",
>> +      __FUNCTION__, mSmiFeatures, RequestedFeaturesMask));
>> +
>> +    mSmiFeatures &= ~ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
>> +  }
>> +
>>    QemuFwCfgSelectItem (mRequestedFeaturesItem);
>>    QemuFwCfgWriteBytes (sizeof mSmiFeatures, &mSmiFeatures);
>>  
> 
> (6) Please drop this hunk. We don't try to be smarter than QEMU, in
> general, whenever we perform feature negotiation.

... obviously: don't drop the part where you set the new bit! :) Sorry,
"hunk" was not the correct term.

Thanks!
Laszlo

> 
> For example, the pre-patch code doesn't attempt to notice if QEMU
> acknowledges ICH9_LPC_SMI_F_CPU_HOTPLUG but not ICH9_LPC_SMI_F_BROADCAST.
> 
> 
>> @@ -162,8 +180,9 @@ NegotiateSmiFeatures (
>>    if ((mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOTPLUG) == 0) {
>>      DEBUG ((DEBUG_INFO, "%a: CPU hotplug not negotiated\n", __FUNCTION__));
>>    } else {
>> -    DEBUG ((DEBUG_INFO, "%a: CPU hotplug with SMI negotiated\n",
>> -      __FUNCTION__));
>> +    DEBUG ((DEBUG_INFO, "%a: CPU hotplug%s with SMI negotiated\n",
>> +      __FUNCTION__,
>> +      (mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOT_UNPLUG) ? ", unplug" : ""));
>>    }
>>  
>>    //
>>
> 
> (7) Rather than combining these two in a common debug message, please
> just add a separate "if" that follows the whole pattern seen with
> ICH9_LPC_SMI_F_CPU_HOTPLUG. Thus, for each feature bit we care about,
> we'll have a dedicated log message, saying yes or no.
> 
> Thanks!
> Laszlo
> 



* Re: [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug
  2021-02-01 17:40     ` Laszlo Ersek
@ 2021-02-01 17:48       ` Laszlo Ersek
  0 siblings, 0 replies; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-01 17:48 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/01/21 18:40, Laszlo Ersek wrote:
> On 02/01/21 18:37, Laszlo Ersek wrote:
>> On 01/29/21 01:59, Ankur Arora wrote:
>>> As part of the negotiation treat ICH9_LPC_SMI_F_CPU_HOT_UNPLUG as a
>>> subfeature of feature flag ICH9_LPC_SMI_F_CPU_HOTPLUG, so enable it
>>> only if the other is also being negotiated.
>>>
>>> Cc: Laszlo Ersek <lersek@redhat.com>
>>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>>> Cc: Igor Mammedov <imammedo@redhat.com>
>>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>> Cc: Aaron Young <aaron.young@oracle.com>
>>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>> ---
>>>  OvmfPkg/SmmControl2Dxe/SmiFeatures.c | 25 ++++++++++++++++++++++---
>>>  1 file changed, 22 insertions(+), 3 deletions(-)

[...]

>>> @@ -120,8 +128,18 @@ NegotiateSmiFeatures (
>>>      // For now, we only support hotplug with SEV disabled.
>>>      //
>>>      RequestedFeaturesMask |= ICH9_LPC_SMI_F_CPU_HOTPLUG;
>>> +    RequestedFeaturesMask |= ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
>>>    }
>>>    mSmiFeatures &= RequestedFeaturesMask;
>>> +
>>> +  if (!(mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOTPLUG) &&
>>> +      (mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOT_UNPLUG)) {
>>> +    DEBUG ((DEBUG_WARN, "%a CPU host-features %Lx, requested mask %Lx\n",
>>> +      __FUNCTION__, mSmiFeatures, RequestedFeaturesMask));
>>> +
>>> +    mSmiFeatures &= ~ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
>>> +  }
>>> +
>>>    QemuFwCfgSelectItem (mRequestedFeaturesItem);
>>>    QemuFwCfgWriteBytes (sizeof mSmiFeatures, &mSmiFeatures);
>>>  
>>
>> (6) Please drop this hunk. We don't try to be smarter than QEMU, in
>> general, whenever we perform feature negotiation.

(8) ... Please refresh the commit message accordingly.

> ... obviously: don't drop the part where you set the new bit! :) Sorry,
> "hunk" was not the correct term.

Thanks!
Laszlo



* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-01-29  0:59 ` [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject() Ankur Arora
  2021-02-01 16:11   ` Laszlo Ersek
@ 2021-02-01 19:08   ` Laszlo Ersek
  2021-02-01 20:12     ` Ankur Arora
  1 sibling, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-01 19:08 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

Apologies, I've got more comments here:

On 01/29/21 01:59, Ankur Arora wrote:

>  /**
> +  CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
> +  on each CPU at exit from SMM.
> +
> +  If, the executing CPU is not being ejected, nothing to be done.
> +  If, the executing CPU is being ejected, wait in a CpuDeadLoop()
> +  until ejected.
> +
> +  @param[in] ProcessorNum      Index of executing CPU.
> +
> +**/
> +VOID
> +EFIAPI
> +CpuEject (
> +  IN UINTN ProcessorNum
> +  )
> +{
> +  //
> +  // APIC ID is UINT32, but mCpuHotEjectData->ApicIdMap[] is UINT64
> +  // so use UINT64 throughout.
> +  //
> +  UINT64 ApicId;
> +
> +  ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
> +  if (ApicId == CPU_EJECT_INVALID) {
> +    return;
> +  }
> +
> +  //
> +  // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit()
> +  // after having been cleared to exit the SMI by the monarch and thus have
> +  // no SMM processing remaining.
> +  //
> +  // Given that we cannot allow them to escape to the guest, we pen them
> +  // here until the SMM monarch tells the HW to unplug them.
> +  //
> +  CpuDeadLoop ();
> +}

(15) There is no such function as SmmCpuFeaturesSmiRendezvousExit() --
it's SmmCpuFeaturesRendezvousExit().

(16) This function uses a data structure for communication between BSP
and APs -- mCpuHotEjectData->ApicIdMap is modified in UnplugCpus() on
the BSP, and checked above by the APs (too).

What guarantees the visibility of mCpuHotEjectData->ApicIdMap?

I think we might want to use InterlockedCompareExchange64() in both
EjectCpu() and UnplugCpus() (and make "ApicIdMap" volatile, in
addition). InterlockedCompareExchange64() can be used just for
comparison as well, by passing ExchangeValue=CompareValue.

(17) I think a similar observation applies to the "Handler" field too,
as APs call it, while the BSP keeps flipping it between NULL and a real
function address. We might have to turn that field into an
EFI_PHYSICAL_ADDRESS (just a fancy name for UINT64), and use
InterlockedCompareExchange64() again.
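
The "compare-exchange as a read" idea from (16) and (17) can be modelled with C11 atomics. CompareExchange64() below is a stand-in that follows the same contract as InterlockedCompareExchange64() (store ExchangeValue only if the current value equals CompareValue, always return the original value); it is not the edk2 implementation, and AtomicRead64() is a hypothetical helper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
  If *Value equals CompareValue, store ExchangeValue. In every case, return
  the value that *Value held before the call.
*/
static uint64_t
CompareExchange64 (_Atomic uint64_t *Value, uint64_t CompareValue,
                   uint64_t ExchangeValue)
{
  uint64_t Expected = CompareValue;

  /* On mismatch, Expected is overwritten with the current value. */
  atomic_compare_exchange_strong (Value, &Expected, ExchangeValue);
  return Expected;
}

/* Passing ExchangeValue == CompareValue never changes *Value: a pure read. */
static uint64_t
AtomicRead64 (_Atomic uint64_t *Value)
{
  return CompareExchange64 (Value, 0, 0);
}
```

The same pattern would apply to the "Handler" field once it is stored as a 64-bit integer: the BSP publishes or clears the address with a compare-exchange, and the APs read it with AtomicRead64()-style calls.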

Thanks
Laszlo



* Re: [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection
  2021-02-01 17:22   ` Laszlo Ersek
@ 2021-02-01 19:21     ` Ankur Arora
  2021-02-02 13:23       ` Laszlo Ersek
  0 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-02-01 19:21 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-01 9:22 a.m., Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Designate a worker CPU (we use the one executing the root MMI
>> handler), which will do the actual ejection via QEMU in CpuEject().
>>
>> CpuEject(), on the worker CPU, ejects each marked CPU by first
>> selecting its APIC ID and then sending the QEMU "eject" command.
>> QEMU in-turn signals the remote VCPU thread which context-switches
>> it out of the SMI.
>>
>> CpuEject(), on the CPU being ejected, spins around in its holding
>> area until this final context-switch. This does mean that there is
>> some CPU state that would ordinarily be restored (in SmiRendezvous()
>> and in SmiEntry.nasm::CommonHandler), but will not be anymore.
>> This unrestored state includes FPU state, CET enable, stuffing of
>> RSB and the final RSM. Since the CPU state is destroyed by QEMU,
>> this should be okay.
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>   OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 73 ++++++++++++++++++++++++++++++++++----
>>   1 file changed, 67 insertions(+), 6 deletions(-)
> 
> (1) s/CpuEject/EjectCpu/g, per previous request (affects commit message
> and code too).
> 
> 
>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> index 526f51faf070..bf91344eef9c 100644
>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> @@ -193,9 +193,12 @@ RevokeNewSlot:
>>     CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
>>     on each CPU at exit from SMM.
>>
>> -  If, the executing CPU is not being ejected, nothing to be done.
>> +  If, the executing CPU is neither a worker, nor being ejected, nothing
>> +  to be done.
>>     If, the executing CPU is being ejected, wait in a CpuDeadLoop()
>>     until ejected.
>> +  If, the executing CPU is a worker CPU, set QEMU CPU status to eject
>> +  for CPUs being ejected.
>>
>>     @param[in] ProcessorNum      Index of executing CPU.
>>
>> @@ -217,6 +220,56 @@ CpuEject (
>>       return;
>>     }
>>
>> +  if (ApicId == CPU_EJECT_WORKER) {
> 
> (2) The CPU_EJECT_WORKER approach is needlessly complicated (speculative
> generality). I wish I had understood this idea earlier in the patch set.
> 
> (2a) In patch #5 (subject "OvmfPkg/CpuHotplugSmm: define
> CPU_HOT_EJECT_DATA"), the CPU_EJECT_WORKER macro definition should be
> dropped.
> 
> (2b) In this patch, the question whether the executing CPU is the BSP or
> not, should be decided with the same logic that is visible in
> PlatformSmmBspElection()
> [OvmfPkg/Library/SmmCpuPlatformHookLibQemu/SmmCpuPlatformHookLibQemu.c]:
> 
>    MSR_IA32_APIC_BASE_REGISTER ApicBaseMsr;
>    BOOLEAN                     IsBsp;
> 
>    ApicBaseMsr.Uint64 = AsmReadMsr64 (MSR_IA32_APIC_BASE);
>    IsBsp = (BOOLEAN)(ApicBaseMsr.Bits.BSP == 1);
> 
> (2c) Point (2b) obviates the explicit "mark as worker" logic entirely,
> in UnplugCpus() below.
> 
> (2d) The "is worker" language (in comments etc) should be replaced with
> direct "is BSP" language.
> 
> 
>> +    UINT32 CpuIndex;
>> +
>> +    for (CpuIndex = 0; CpuIndex < mCpuHotEjectData->ArrayLength; CpuIndex++) {
>> +      UINT64 RemoveApicId;
>> +
>> +      RemoveApicId = mCpuHotEjectData->ApicIdMap[CpuIndex];
>> +
>> +      if ((RemoveApicId != CPU_EJECT_INVALID &&
>> +           RemoveApicId != CPU_EJECT_WORKER)) {
>> +        //
>> +        // This to-be-ejected-CPU has already received the BSP's SMI exit
>> +        // signal and, will execute SmmCpuFeaturesSmiRendezvousExit()
>> +        // followed by this callback or is already waiting in the
>> +        // CpuDeadLoop() below.
>> +        //
>> +        // Tell QEMU to context-switch it out.
>> +        //
>> +        QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId);
>> +        QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED);
> 
> (3) While the QEMU CPU selector value *usually* matches the APIC ID,
> it's not an invariant. APIC IDs have an internal structure, composed of
> bit-fields, where each bit-field accommodates one hierarchy level in the
> CPU topology (thread, core, die (maybe), and socket).
> 
> However, this mapping need not be surjective. QEMU lets you create
> "pathological" CPU topologies, for example one with:
> - 3 threads/core,
> - 5 cores/socket,
> - (say) 2 sockets.
> 
> Under that example, the bit-field standing for the "thread number" level
> would have 2 bits, theoretically permitting *4* threads/core, and the
> bit-field standing for the "core number" level would have 3 bits,
> theoretically allowing for *8* cores/socket.
> 
> Considering the fully populated topology, you'd see the CPU selector
> range from 0 to (3*5*2-1)=29, inclusive (corresponding to 30 logical
> processors in total). However, the APIC ID *image* of this CPU selector
> *domain* would not be "contiguous" -- the APIC ID space, with the
> above-described structure, would accommodate 4*8*2=64 logical
> processors. For example, each APIC ID that stood for the nonexistent
> "thread#3" on a particular core would be left unused (no CPU selector
> would map to it).
> 
> All in all, you can't write the APIC ID to the CPU selector register,
> for ejection. You need to select the CPU whose APIC ID is the APIC ID
> you want to eject, and then initiate ejection.

Yeah, this is a clear bug. Should have seen it earlier. Thanks for
pointing it out.

> 
> This requires one of two alternatives:
> 
> 
> (3a) The first option is to keep the change local to this patch.
> 
> This alternative is the more CPU-hungry (and uglier) one.
> 
> The idea is to perform a QEMU_CPUHP_CMD_GET_ARCH_ID loop over all
> possible CPUs, somewhat similarly to QemuCpuhpCollectApicIds(). At every
> CPU, knowing the APIC ID, try to find the APIC ID in "ApicIdMap". If
> there is a match, eject.
> 
> 
> (3b) The second option is much more elegant (and it's faster too), but
> it requires a much more intrusive update to the patch set.
> 
> First, the *element type* of the arrays that QemuCpuhpCollectApicIds()
> operates on, has to be changed from APIC_ID to a structure type that
> pairs APIC_ID with the QEMU CPU selector. [*]
> 
> Second, whenever QemuCpuhpCollectApicIds() outputs an APIC_ID, it should
> also save the "CurrentSelector" value (in the other field of the output
> array element structure).
> 
> Third, the element type of CPU_HOT_EJECT_DATA.ApicIdMap should be
> replaced with a structure type similar (or identical) to the one
> described at [*]. The ProcessorNumber lookup in UnplugCpus() would still
> be based upon the APIC ID, but CPU_HOT_EJECT_DATA should remember both
> the QEMU selector for that processor, and the APIC ID.
> 
> Fourth, the actual ejection should use the selector.
> 
> Fifth, the debug message (below) should continue logging the APIC ID, to
> mirror the DEBUG_INFO message in ProcessHotAddedCpus().
> 
> 
> Would you be willing to implement (3b)?

3b is clearly the better solution. However, is logging the APIC ID in the
debug message valuable enough to justify CPU_HOT_EJECT_DATA.ApicIdMap
carrying both the cpu-selector and the APIC ID?

As you say, the ejection itself just needs the ProcessorNum -> QEMU cpu-selector
mapping.
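
For reference, the pairing described in (3b) could look roughly like the following. This is a sketch in standalone C under stated assumptions: the type name APIC_ID_AND_SELECTOR, its field names, and FindSelectorByApicId() are hypothetical and do not appear in the patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type pairing an APIC ID with the QEMU CPU selector. */
typedef struct {
  uint32_t ApicId;
  uint32_t QemuSelector;
} APIC_ID_AND_SELECTOR;

/*
 * Look up the selector recorded for an APIC ID. Returns 1 and stores the
 * selector in *Selector on a match; returns 0 otherwise.
 */
static int
FindSelectorByApicId (
  const APIC_ID_AND_SELECTOR *Map,
  size_t                     Count,
  uint32_t                   ApicId,
  uint32_t                   *Selector
  )
{
  size_t Index;

  assert (Map != NULL && Selector != NULL);

  for (Index = 0; Index < Count; Index++) {
    if (Map[Index].ApicId == ApicId) {
      *Selector = Map[Index].QemuSelector;
      return 1;
    }
  }
  return 0;
}
```

With such a pair saved at collection time, the ejection path can write the remembered selector to the register while the debug message keeps printing the APIC ID.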

Ankur

> 
> 
>> +
>> +        //
>> +        // Compiler barrier to ensure the next store isn't reordered
>> +        //
>> +        MemoryFence ();
>> +
>> +        //
>> +        // Clear the eject status for CpuIndex to ensure that an invalid
>> +        // SMI later does not end up trying to eject it or a newly
>> +        // hotplugged CpuIndex does not go into the dead loop.
>> +        //
>> +        mCpuHotEjectData->ApicIdMap[CpuIndex] = CPU_EJECT_INVALID;
>> +
>> +        DEBUG ((DEBUG_INFO, "%a: Unplugged CPU %u -> " FMT_APIC_ID "\n",
>> +               __FUNCTION__, CpuIndex, RemoveApicId));
> 
> (4) The DEBUG_INFO log message is in the right place (and uses the right
> debug mask), but it is afflicted by the usual warts (indentation, format
> specifiers etc). Please reapply the comments I made elsewhere.
> 
> 
> (5a) Please replace "CPU" with "ProcessorNumber" (so that we know it's
> the protocol-assigned number, not the QEMU selector).
> 
> (5b) Please replace the arrow " -> " with the string " APIC ID ".
> 
> 
> Thanks!
> Laszlo
> 
>> +      }
>> +    }
>> +
>> +    //
>> +    // Clear our own worker status.
>> +    //
>> +    mCpuHotEjectData->ApicIdMap[ProcessorNum] = CPU_EJECT_INVALID;
>> +
>> +    //
>> +    // We are done until the next hot-unplug; clear the handler.
>> +    //
>> +    mCpuHotEjectData->Handler = NULL;
>> +    return;
>> +  }
>> +
>>     //
>>     // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit()
>>     // after having been cleared to exit the SMI by the monarch and thus have
>> @@ -327,6 +380,19 @@ UnplugCpus (
>>     }
>>
>>     if (EjectCount != 0) {
>> +    UINTN  Worker;
>> +
>> +    Status = mMmCpuService->WhoAmI (mMmCpuService, &Worker);
>> +    ASSERT_EFI_ERROR (Status);
>> +    //
>> +    // UnplugCpus() is called via the root MMI handler and thus we are
>> +    // executing in the BSP context.
>> +    //
>> +    // Mark ourselves as the worker CPU.
>> +    //
>> +    ASSERT (mCpuHotEjectData->ApicIdMap[Worker] == CPU_EJECT_INVALID);
>> +    mCpuHotEjectData->ApicIdMap[Worker] = CPU_EJECT_WORKER;
>> +
>>       //
>>       // We have processors to be ejected; install the handler.
>>       //
>> @@ -451,11 +517,6 @@ CpuHotplugMmi (
>>     if (EFI_ERROR (Status)) {
>>       goto Fatal;
>>     }
>> -  if (ToUnplugCount > 0) {
>> -    DEBUG ((DEBUG_ERROR, "%a: hot-unplug is not supported yet\n",
>> -      __FUNCTION__));
>> -    goto Fatal;
>> -  }
>>
>>     if (PluggedCount > 0) {
>>       Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
>>
> 


* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-01 19:08   ` Laszlo Ersek
@ 2021-02-01 20:12     ` Ankur Arora
  2021-02-02 14:00       ` Laszlo Ersek
  0 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-02-01 20:12 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-01 11:08 a.m., Laszlo Ersek wrote:
> apologies, I've got more comments here:
> 
> On 01/29/21 01:59, Ankur Arora wrote:
> 
>>   /**
>> +  CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
>> +  on each CPU at exit from SMM.
>> +
>> +  If, the executing CPU is not being ejected, nothing to be done.
>> +  If, the executing CPU is being ejected, wait in a CpuDeadLoop()
>> +  until ejected.
>> +
>> +  @param[in] ProcessorNum      Index of executing CPU.
>> +
>> +**/
>> +VOID
>> +EFIAPI
>> +CpuEject (
>> +  IN UINTN ProcessorNum
>> +  )
>> +{
>> +  //
>> +  // APIC ID is UINT32, but mCpuHotEjectData->ApicIdMap[] is UINT64
>> +  // so use UINT64 throughout.
>> +  //
>> +  UINT64 ApicId;
>> +
>> +  ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
>> +  if (ApicId == CPU_EJECT_INVALID) {
>> +    return;
>> +  }
>> +
>> +  //
>> +  // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit()
>> +  // after having been cleared to exit the SMI by the monarch and thus have
>> +  // no SMM processing remaining.
>> +  //
>> +  // Given that we cannot allow them to escape to the guest, we pen them
>> +  // here until the SMM monarch tells the HW to unplug them.
>> +  //
>> +  CpuDeadLoop ();
>> +}
> 
> (15) There is no such function as SmmCpuFeaturesSmiRendezvousExit() --
> it's SmmCpuFeaturesRendezvousExit().
> 
> (16) This function uses a data structure for communication between BSP
> and APs -- mCpuHotEjectData->ApicIdMap is modified in UnplugCpus() on
> the BSP, and checked above by the APs (too).
> 
> What guarantees the visibility of mCpuHotEjectData->ApicIdMap?

I was banking on SmiRendezvous() explicitly signalling that all
processing on the BSP was done before any AP will look at
mCpuHotEjectData in SmmCpuFeaturesRendezvousExit().

    //
    // Wait for BSP's signal to exit SMI
    //
    while (*mSmmMpSyncData->AllCpusInSync) {
      CpuPause ();
    }
  }

Exit:
  SmmCpuFeaturesRendezvousExit (CpuIndex);

> 
> I think we might want to use InterlockedCompareExchange64() in both
> CpuEject() and UnplugCpus() (and make "ApicIdMap" volatile, in
> addition). InterlockedCompareExchange64() can be used just for
> comparison as well, by passing ExchangeValue=CompareValue.


Speaking specifically about the ApicIdMap, I'm not sure I fully
agree (assuming my comment just above is correct.)


The only AP (reader) ApicIdMap deref is here:

CpuEject():
  ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];

For the to-be-ejected-AP, this value can only move from
    valid-APIC-ID (=> wait in CpuDeadLoop()) -> CPU_EJECT_INVALID.

Given that, by the time the worker does the write on line 254, this
AP is guaranteed to be dead already, I don't think there's any
scenario where the to-be-ejected-AP can see anything other than
a valid-APIC-ID.

241         QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId);
242         QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED);
243
244         //
245         // Compiler barrier to ensure the next store isn't reordered
246         //
247         MemoryFence ();
248
249         //
250         // Clear the eject status for CpuIndex to ensure that an invalid
251         // SMI later does not end up trying to eject it or a newly
252         // hotplugged CpuIndex does not go into the dead loop.
253         //
254         mCpuHotEjectData->ApicIdMap[CpuIndex] = CPU_EJECT_INVALID;
   
For APs that are not being ejected, they will always see CPU_EJECT_INVALID
since the writer never changes that.

The one scenario in which bad things could happen is if entries in the
ApicIdMap are unaligned (or if the compiler or cpu-arch tears aligned
writes).

> 
> (17) I think a similar observation applies to the "Handler" field too,
> as APs call it, while the BSP keeps flipping it between NULL and a real
> function address. We might have to turn that field into an
From a real function address to NULL is the problem part, right?

(Same argument as above for the transition in UnplugCpus() from
NULL -> function-address.)


> EFI_PHYSICAL_ADDRESS (just a fancy name for UINT64), and use
> InterlockedCompareExchange64() again.

AFAICS, these are the problematic derefs:

SmmCpuFeaturesRendezvousExit():

  if (mCpuHotEjectData == NULL ||
      mCpuHotEjectData->Handler == NULL) {
    return;

and problematic assignments:

    //
    // We are done until the next hot-unplug; clear the handler.
    //
    mCpuHotEjectData->Handler = NULL;
    return;
  }

Here as well, I've been banking on aligned writes such that the APs would
only see the before or after value not an intermediate value.
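
One way to avoid depending on how the compiler emits the pointer store is to read the handler once into a local through an atomic load, and only then test and call it. The following is a standalone C11 sketch, not edk2 code; EJECT_HANDLER, SetHandler(), RendezvousExit() and RecordEject() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*EJECT_HANDLER)(size_t ProcessorNum);

/* Zero-initialized, i.e. NULL: no handler installed. */
static _Atomic(EJECT_HANDLER) mHandler;

/* Records the last processor number passed to the example handler. */
static size_t mLastEjected = (size_t)-1;

static void
RecordEject (size_t ProcessorNum)
{
  mLastEjected = ProcessorNum;
}

/* Writer side: install a handler, or clear it by passing NULL. */
static void
SetHandler (EJECT_HANDLER Handler)
{
  atomic_store_explicit (&mHandler, Handler, memory_order_release);
}

/* Reader side: load the pointer exactly once, then test and call the copy. */
static void
RendezvousExit (size_t ProcessorNum)
{
  EJECT_HANDLER Handler;

  Handler = atomic_load_explicit (&mHandler, memory_order_acquire);
  if (Handler != NULL) {
    Handler (ProcessorNum);
  }
}
```

Because the reader works on a local copy, a concurrent transition from a function address to NULL cannot slip in between the NULL check and the call.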

Thanks
Ankur

> 
> Thanks
> Laszlo
> 


* Re: [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events
  2021-01-30  2:18   ` Laszlo Ersek
  2021-01-30  2:23     ` Laszlo Ersek
@ 2021-02-02  6:03     ` Ankur Arora
  1 sibling, 0 replies; 52+ messages in thread
From: Ankur Arora @ 2021-02-02  6:03 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-01-29 6:18 p.m., Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Process fw_remove events in QemuCpuhpCollectApicIds() and collect
>> corresponding APIC IDs for CPUs that are being hot-unplugged.
>>
>> In addition, we now ignore CPUs which only have remove set. These
>> CPUs haven't been processed by OSPM yet.
>>
>> This is based on the QEMU hot-unplug protocol documented here:
>>    https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/
>>
>> Also define QEMU_CPUHP_STAT_EJECTED while we are at it.
> 
> (1) Please move the addition of QEMU_CPUHP_STAT_EJECTED to patch 8
> ("OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection"), where you
> first use it.
> 
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>
>> Notes:
>>      I'm treating events (insert=1, fw_remove=1) below as invalid (return
>>      EFI_PROTOCOL_ERROR, which ends up as an assert), but I'm not sure
>>      that is correct:
>>
>>           if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
>>             //
>>             // The "insert" event guarantees the "enabled" status; plus it excludes
>>      -      // the "remove" event.
>>      +      // the "fw_remove" event.
>>             //
>>             if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
>>      -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>>      +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>>               DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>>                 "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>>                 CpuStatus));
>>
>>      QEMU's handling in cpu_hotplug_rd() can return both of these:
>>
>>      cpu_hotplug_rd() {
>>         ...
>>         case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
>>      	val |= cdev->cpu ? 1 : 0;
>>      	val |= cdev->is_inserting ? 2 : 0;
>>      	val |= cdev->is_removing  ? 4 : 0;
>>      	val |= cdev->fw_remove  ? 16 : 0;
>>         ...
>>      }
>>      and I don't see any code that treats is_inserting and is_removing as
>>      exclusive.
>>
>>      One specific case where this looks like it might be a problem is if the
>>      user unplugs a CPU and plugs it back in right after that.
>>
>>      As part of the unplug handling, the ACPI AML would, in the scan loop,
>>      asynchronously trigger the notify, which would do the OS unplug, set
>>      "fw_remove" and then call the SMI_CMD.
>>
>>      The subsequent plug could then come and set the "insert" bit.
>>
>>      Assuming what I'm describing could happen, I'm not sure what's the right
>>      handling: QEMU could treat these bits as exclusive and then OVMF could
>>      justifiably treat it as a protocol error?
> 
> I'm OK with the related part of your patch (i.e., returning
> EFI_PROTOCOL_ERROR for (insert=1, fw_remove=1)).
> 
>>
>>   OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h |  2 ++
>>   OvmfPkg/CpuHotplugSmm/QemuCpuhp.c                 | 29 +++++++++++++++++++----
>>   2 files changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
>> index a34a6d3fae61..692e3072598c 100644
>> --- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
>> +++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
>> @@ -34,6 +34,8 @@
>>   #define QEMU_CPUHP_STAT_ENABLED                BIT0
>>   #define QEMU_CPUHP_STAT_INSERT                 BIT1
>>   #define QEMU_CPUHP_STAT_REMOVE                 BIT2
>> +#define QEMU_CPUHP_STAT_EJECTED                BIT3
>> +#define QEMU_CPUHP_STAT_FW_REMOVE              BIT4
>>
>>   #define QEMU_CPUHP_RW_CMD_DATA               0x8
>>
>> diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> index 8d4a6693c8d6..f871e50c377b 100644
>> --- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> +++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> @@ -245,10 +245,10 @@ QemuCpuhpCollectApicIds (
>>       if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
>>         //
>>         // The "insert" event guarantees the "enabled" status; plus it excludes
>> -      // the "remove" event.
>> +      // the "fw_remove" event.
>>         //
>>         if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
>> -          (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>> +          (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>>           DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>>             "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>>             CpuStatus));
>> @@ -260,12 +260,31 @@ QemuCpuhpCollectApicIds (
>>
>>         ExtendIds   = PluggedApicIds;
>>         ExtendCount = PluggedCount;
>> -    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>> -      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove\n", __FUNCTION__,
>> -        CurrentSelector));
>> +    } else if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
>> +      //
>> +      // "fw_remove" event guarantees "enabled".
>> +      //
>> +      if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
>> +        DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
>> +          "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
>> +          CpuStatus));
>> +        return EFI_PROTOCOL_ERROR;
>> +      }
>> +
>> +      DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: fw_remove\n",
>> +        __FUNCTION__, CurrentSelector));
>>
>>         ExtendIds   = ToUnplugApicIds;
>>         ExtendCount = ToUnplugCount;
>> +    } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
>> +      //
>> +      // Let the OSPM deal with the "remove" event.
>> +      //
>> +      DEBUG ((DEBUG_INFO, "%a: CurrentSelector=%u: remove (ignored)\n",
>> +        __FUNCTION__, CurrentSelector));
> 
> (2) Please downgrade this debug mask from DEBUG_INFO to DEBUG_VERBOSE.
> 
> (If you want your OVMF build to emit DEBUG_VERBOSE messages to the log,
> you can set PcdDebugPrintErrorLevel to 0x8040004F in the DSC file --
> DEBUG_VERBOSE has value 0x00400000.)
> 
>> +
>> +      CurrentSelector++;
>> +      continue;
> 
> (3) This change is logically correct; however I request a different
> implementation, as I indicated here:
> 
>    https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg06737.html
>    msgid: <926113ec-8fa1-7d3b-ff3f-f1eda692e83d@redhat.com>
> 
> Namely:
> 
> (3a) On this branch, please set both "ExtendIds" and "ExtendCount" to
> NULL, replacing the currently proposed "CurrentSelector" increment and
> the "continue" statement.
> 
> (3b) Locate the section of code that starts with the comment "Save the
> APIC ID of the CPU with the pending event...", and make it conditional
> like this:
> 
>      ASSERT ((ExtendIds == NULL) == (ExtendCount == NULL));
>      if (ExtendIds != NULL) {
>        ...
>      }
> 
> (3c) and then simply proceed to the end of the loop body, where we
> increment "CurrentSelector" already.
> 
> 
> Here's why I'm asking for this: with your proposed v6 patch, the loop
> body would receive a "CurrentSelector" increment operation that did not
> explain itself. And I'd really like to keep *any* "CurrentSelector"
> increment operation explained by the comment that we currently have at
> the end of the loop body:
> 
>       //
>       // We've processed the CPU with (known) pending events, but we must never
>       // clear events. Therefore we need to advance past this CPU manually;
>       // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
>       // selected CPU.
>       //
> 
> Keeping up that "well-explained" status would require one of two
> options:
> 
> - copy the comment into the new branch (duplicating the comment) just
>    before you add the new "CurrentSelector" increment operation, or
> 
> - make sure we have just one spot where we increment "CurrentSelector",
>    and preserve the comment there.
> 
> The second option looks much better to me, so that's what I'm asking
> for.
> 
> If we didn't have that big comment on the increment, your solution would
> be just fine, but said comment is really important IMO.

Yeah you are right, that comment is quite useful to keep in mind. Will fix.

Also acking the other review comments here (including the EJECTED -> EJECT
change.)
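
The shape requested in (3a)-(3c) can be sketched as below. This is a simplified standalone C model, not the real QemuCpuhpCollectApicIds(): it walks an in-memory status array instead of issuing QEMU_CPUHP_CMD_GET_PENDING, it records selectors rather than APIC IDs, and it assumes each output array has room for Count entries. CollectSelectors() and the STAT_* names are hypothetical; the bit values mirror the QEMU_CPUHP_STAT_* definitions quoted earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STAT_ENABLED    0x01u  /* BIT0 */
#define STAT_INSERT     0x02u  /* BIT1 */
#define STAT_REMOVE     0x04u  /* BIT2 */
#define STAT_FW_REMOVE  0x10u  /* BIT4 */

/*
 * Classify each selector's status. Returns 0 on success, -1 on an
 * inconsistent status (standing in for EFI_PROTOCOL_ERROR).
 */
static int
CollectSelectors (
  const uint8_t *Status,
  uint32_t      Count,
  uint32_t      *Plugged,
  uint32_t      *PluggedCount,
  uint32_t      *ToUnplug,
  uint32_t      *ToUnplugCount
  )
{
  uint32_t CurrentSelector;

  assert (Status != NULL && Plugged != NULL && ToUnplug != NULL);
  assert (PluggedCount != NULL && ToUnplugCount != NULL);

  *PluggedCount   = 0;
  *ToUnplugCount  = 0;
  CurrentSelector = 0;

  while (CurrentSelector < Count) {
    uint8_t  CpuStatus = Status[CurrentSelector];
    uint32_t *ExtendIds;
    uint32_t *ExtendCount;

    if ((CpuStatus & STAT_INSERT) != 0) {
      /* "insert" guarantees "enabled" and excludes "fw_remove". */
      if ((CpuStatus & STAT_ENABLED) == 0 ||
          (CpuStatus & STAT_FW_REMOVE) != 0) {
        return -1;
      }
      ExtendIds   = Plugged;
      ExtendCount = PluggedCount;
    } else if ((CpuStatus & STAT_FW_REMOVE) != 0) {
      /* "fw_remove" guarantees "enabled". */
      if ((CpuStatus & STAT_ENABLED) == 0) {
        return -1;
      }
      ExtendIds   = ToUnplug;
      ExtendCount = ToUnplugCount;
    } else {
      /* "remove" only (left to the OSPM) or no event: nothing to save. */
      ExtendIds   = NULL;
      ExtendCount = NULL;
    }

    assert ((ExtendIds == NULL) == (ExtendCount == NULL));
    if (ExtendIds != NULL) {
      ExtendIds[(*ExtendCount)++] = CurrentSelector;
    }

    /* The single place where CurrentSelector advances. */
    CurrentSelector++;
  }

  return 0;
}
```

The point of the structure is that the "remove"-only branch no longer needs its own increment and continue, so the explanatory comment can stay attached to the one increment at the end of the loop body.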

Thanks
Ankur

> 
> Thanks!
> Laszlo
> 
>>       } else {
>>         DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: no event\n",
>>           __FUNCTION__, CurrentSelector));
>>
> 


* Re: [PATCH v6 3/9] OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper
  2021-01-30  2:36   ` Laszlo Ersek
@ 2021-02-02  6:04     ` Ankur Arora
  0 siblings, 0 replies; 52+ messages in thread
From: Ankur Arora @ 2021-02-02  6:04 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-01-29 6:36 p.m., Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Add QemuCpuhpWriteCpuStatus() which will be used to update the QEMU
>> CPU status register. On error, it hangs in a similar fashion as
>> other helper functions.
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>   OvmfPkg/CpuHotplugSmm/QemuCpuhp.h |  6 ++++++
>>   OvmfPkg/CpuHotplugSmm/QemuCpuhp.c | 22 ++++++++++++++++++++++
>>   2 files changed, 28 insertions(+)
>>
>> diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
>> index 8adaa0ad91f0..804809846890 100644
>> --- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
>> +++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
>> @@ -30,6 +30,12 @@ QemuCpuhpReadCpuStatus (
>>     IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo
>>     );
>>   
>> +VOID
>> +QemuCpuhpWriteCpuStatus (
>> +  IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo,
>> +  IN UINT8                        CpuStatus
>> +  );
>> +
>>   UINT32
>>   QemuCpuhpReadCommandData (
>>     IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo
>> diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> index f871e50c377b..ed44264de934 100644
>> --- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> +++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
>> @@ -67,6 +67,28 @@ QemuCpuhpReadCpuStatus (
>>     return CpuStatus;
>>   }
>>   
>> +VOID
>> +QemuCpuhpWriteCpuStatus (
>> +  IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo,
>> +  IN UINT8                        CpuStatus
>> +  )
>> +{
>> +  EFI_STATUS Status;
>> +
>> +  Status = MmCpuIo->Io.Write (
>> +                         MmCpuIo,
>> +                         MM_IO_UINT8,
>> +                         ICH9_CPU_HOTPLUG_BASE + QEMU_CPUHP_R_CPU_STAT,
>> +                         1,
>> +                         &CpuStatus
>> +                         );
>> +  if (EFI_ERROR (Status)) {
>> +    DEBUG ((DEBUG_ERROR, "%a: %r\n", __FUNCTION__, Status));
>> +    ASSERT (FALSE);
>> +    CpuDeadLoop ();
>> +  }
>> +}
>> +
>>   UINT32
>>   QemuCpuhpReadCommandData (
>>     IN CONST EFI_MM_CPU_IO_PROTOCOL *MmCpuIo
>>
> 
> The code is fine, but please move the new function (both declaration and
> definition) between QemuCpuhpWriteCpuSelector() and QemuCpuhpWriteCommand().
> 
> Reason: the pre-patch order of the functions matches the order of the
> register descriptions in QEMU's "docs/specs/acpi_cpu_hotplug.txt".
> 
> There, we first have a section called "read access", then another called
> "write access". And in each section, registers are listed in increasing
> offset order, within the hotplug register block.

Will fix.

Thanks
Ankur

> 
> Thanks!
> Laszlo
> 


* Re: [PATCH v6 5/9] OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA
  2021-02-01  4:53   ` Laszlo Ersek
@ 2021-02-02  6:15     ` Ankur Arora
  0 siblings, 0 replies; 52+ messages in thread
From: Ankur Arora @ 2021-02-02  6:15 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-01-31 8:53 p.m., Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Define CPU_HOT_EJECT_DATA and add PCD PcdCpuHotEjectDataAddress, which
>> will be used to share CPU ejection state between OvmfPkg/CpuHotPlugSmm
>> and PiSmmCpuDxeSmm.
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>   OvmfPkg/OvmfPkg.dec                       | 10 +++++++++
>>   OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf   |  1 +
>>   OvmfPkg/Include/Library/CpuHotEjectData.h | 35 +++++++++++++++++++++++++++++++
>>   3 files changed, 46 insertions(+)
>>   create mode 100644 OvmfPkg/Include/Library/CpuHotEjectData.h
>>
>> diff --git a/OvmfPkg/OvmfPkg.dec b/OvmfPkg/OvmfPkg.dec
>> index 4348bb45c64a..1a2debb821d7 100644
>> --- a/OvmfPkg/OvmfPkg.dec
>> +++ b/OvmfPkg/OvmfPkg.dec
>> @@ -106,6 +106,10 @@ [LibraryClasses]
>>     #
>>     XenPlatformLib|Include/Library/XenPlatformLib.h
>>
>> +  ##  @libraryclass  Share CPU hot-eject state
>> +  #
>> +  CpuHotEjectData|Include/Library/CpuHotEjectData.h
>> +
>>   [Guids]
>>     gUefiOvmfPkgTokenSpaceGuid            = {0x93bb96af, 0xb9f2, 0x4eb8, {0x94, 0x62, 0xe0, 0xba, 0x74, 0x56, 0x42, 0x36}}
>>     gEfiXenInfoGuid                       = {0xd3b46f3b, 0xd441, 0x1244, {0x9a, 0x12, 0x0, 0x12, 0x27, 0x3f, 0xc1, 0x4d}}
> 
> (1) Please drop this hunk -- the [LibraryClasses] section should not be
> modified, as we're not introducing a new library class.
> 
> 
>> @@ -352,6 +356,12 @@ [PcdsDynamic, PcdsDynamicEx]
>>     #  This PCD is only accessed if PcdSmmSmramRequire is TRUE (see below).
>>     gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase|FALSE|BOOLEAN|0x34
>>
>> +  ## This PCD adds a communication channel between PiSmmCpuDxeSmm and
>> +  #  CpuHotplugSmm.
> 
> (2) I suggest:
> 
>    ## This PCD adds a communication channel between OVMF's SmmCpuFeaturesLib
>    #  instance in PiSmmCpuDxeSmm, and CpuHotplugSmm.
> 
> 
>> +  #
>> +  #  Only accessed if PcdCpuHotPlugSupport is TRUE
> 
> (3) This statement is technically true, but I suggest dropping it. In my
> opinion, it is not useful (it's a superfluous statement). Here's why:
> 
> - We set the "PcdCpuHotPlugSupport" feature flag to TRUE in the OVMF DSC
>    files exactly when the SMM_REQUIRE feature test macro is set on the
>    "build" command line.
> 
> - The whole SMM infrastructure is included in the firmware binary
>    exactly when SMM_REQUIRE is set.
> 
> In other words, PcdCpuHotPlugSupport is *equivalent* with
> SmmCpuFeaturesLib, PiSmmCpuDxeSmm, and CpuHotplugSmm being included in
> the firmware binary.
> 
> Given that the first comment already declares the PCD as an info channel
> between SmmCpuFeaturesLib (as built into PiSmmCpuDxeSmm) and
> CpuHotplugSmm, the second comment adds nothing.

That makes sense.

> 
> 
>> +  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress|0|UINT64|0x46
>> +
>>   [PcdsFeatureFlag]
>>     gUefiOvmfPkgTokenSpaceGuid.PcdQemuBootOrderPciTranslation|TRUE|BOOLEAN|0x1c
>>     gUefiOvmfPkgTokenSpaceGuid.PcdQemuBootOrderMmioTranslation|FALSE|BOOLEAN|0x1d
>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
>> index 04322b0d7855..e08b572ef169 100644
>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
>> @@ -54,6 +54,7 @@ [Protocols]
>>
>>   [Pcd]
>>     gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress                ## CONSUMES
>> +  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress              ## CONSUMES
>>     gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase             ## CONSUMES
>>
>>   [FeaturePcd]
> 
> (4) Please move this hunk to patch#7 (subject: "OvmfPkg/CpuHotplugSmm:
> add CpuEject()"). That's where CpuHotplugSmm first needs the new PCD.
> 
> 
>> diff --git a/OvmfPkg/Include/Library/CpuHotEjectData.h b/OvmfPkg/Include/Library/CpuHotEjectData.h
>> new file mode 100644
>> index 000000000000..b6fb629a1283
>> --- /dev/null
>> +++ b/OvmfPkg/Include/Library/CpuHotEjectData.h
>> @@ -0,0 +1,35 @@
>> +/** @file
>> +  Definition for a CPU hot-eject state sharing structure.
>> +
> 
> (5a) I suggest the following language:
> 
>    Definition of the CPU_HOT_EJECT_DATA structure, which shares CPU hot-eject
>    state between OVMF's SmmCpuFeaturesLib instance in PiSmmCpuDxeSmm, and
>    CpuHotplugSmm.
> 
>    CPU_HOT_EJECT_DATA is allocated in SMRAM, and pointed-to by
>    PcdCpuHotEjectDataAddress.
> 
> (5b) Please append at least one more sentence to state the condition
> when the PCD is *not* NULL.
> 
> 
> (6) This new header file should be located at:
> 
>    OvmfPkg/Include/Pcd/CpuHotEjectData.h

Your explanation below makes sense. OvmfPkg/Include/Library did not
quite seem like the right place but I wasn't sure what would be
better.

Will fix.

> 
> please.
> 
> The (more or less) general rule is this:
> 
> - if you have a macro definition or a structure type that is accessible
>    through a Pcd, a Protocol, a Guid -- HOB, VenHw() devpath node etc --,
>    a Library, a Register, etc,
> 
> - and the Pcd, Protocol, Guid, Library etc in question is declared in
>    "WhateverPkg/WhateverPkg.dec",
> 
> - then the header file defining the structure or macro should be placed
>    in the following directory (according to the access type):
> 
>    WhateverPkg/Include/Pcd/
>    WhateverPkg/Include/Protocol/
>    WhateverPkg/Include/Guid/
>    WhateverPkg/Include/Library/
>    WhateverPkg/Include/Register/
> 
> Admittedly, while this rule is universally honored in edk2 in the
> Protocol, Guid, and Library cases, the Register case is somewhat less
> frequently followed, and the Pcd case is almost nonexistent. For
> example, "UefiCpuPkg/Include/CpuHotPlugData.h" itself does not follow
> the rule (no "Pcd" subdir). However, there are examples that do follow
> the rule:
> 
>    CryptoPkg/Include/Pcd/PcdCryptoServiceFamilyEnable.h
>    RedfishPkg/Include/Pcd/RestExServiceDevicePath.h
> 
> 
>> +  Copyright (C) 2021, Oracle Corporation.
>> +
>> +  SPDX-License-Identifier: BSD-2-Clause-Patent
>> +**/
>> +
>> +#ifndef _CPU_HOT_EJECT_DATA_H_
>> +#define _CPU_HOT_EJECT_DATA_H_
> 
> (7) Please use the following guard macro:
> 
>    CPU_HOT_EJECT_DATA_H_
> 
> (i.e., please drop the leading underscore).
> 
> Although the leading underscore is widely used in edk2, in include guard
> macros, it's a bad practice (it creates identifiers that are reserved by
> the C standard), so we should not introduce more of it.
> 
> 
>> +
>> +typedef
>> +VOID
>> +(EFIAPI *CPU_HOT_EJECT_FN)(
> 
> (8) Please replace _FN with _HANDLER or _FUNCTION.
> 
> In edk2, we tend to avoid abbreviations. (Yes, the practice has not
> entirely been consistent, and sometimes it's actually *annoying* that
> our type names are too long. But that's what we got.)
> 
> ... _HANDLER would be better, as you call the related field "Handler" in
> the structure.
> 
> 
> (9) Missing space character before the last parenthesis on the line.
> 
> 
> (10) Please add a leading comment block on this function prototype.
> (Well, yes, I realize it is technically a *pointer* type, but still.)
> 
> This is not just a formality; I'd really like the "ProcessorNum"
> parameter to be described, for example its relationship with the
> "ProcessorNumber" parameter of EFI_SMM_CPU_SERVICE_PROTOCOL member
> functions, and/or the "CPU_HOT_PLUG_DATA.ApicId" array.
> 
> 
>> +  IN UINTN  ProcessorNum
>> +  );
>> +
>> +#define CPU_EJECT_INVALID               (MAX_UINT64)
>> +#define CPU_EJECT_WORKER                (MAX_UINT64-1)
> 
> (11a) If these are meant as special values for the "ApicIdMap" array,
> then I'd suggest something like:
> 
>    CPU_EJECT_APIC_ID_INVALID
>    CPU_EJECT_APIC_ID_WORKER
> 
> (11b) Can you add a single-sentence comment to each macro? (Observe the
> comment style while at it, please.)
> 
> 
>> +
>> +#define  CPU_HOT_EJECT_DATA_REVISION_1  0x00000001
>> +
>> +typedef struct {
>> +  UINT32           Revision;          // Used for version identification of
>> +                                      // this structure
> 
> (12) Please drop both the "CPU_HOT_EJECT_DATA_REVISION_1" macro and the
> "Revision" field.
> 
> The "CPU_HOT_PLUG_DATA" structure, from
> "UefiCpuPkg/Include/CpuHotPlugData.h", is different. That structure is
> versioned because it establishes a communication channel between a core
> module (PiSmmCpuDxeSmm) and a platform module (such as
> OvmfPkg/CpuHotplugSmm); what's more, those modules could even be built
> separately, and be available in binary-only form.
> 
> (Side note: we ignore "CPU_HOT_PLUG_DATA.Revision" in
> "OvmfPkg/CpuHotplugSmm" because the OVMF platforms exist in the exact
> same repository as PiSmmCpuDxeSmm, so we can keep them in sync. This is
> BTW one reason why I absolutely want OVMF to live in the core edk2
> repository. Anyway, digression ends.)
> 
> However, the same versioning idea (or requirement) does not hold for the
> present use case. The new communication channel is between:
> 
> - OVMF's SmmCpuFeaturesLib instance in PiSmmCpuDxeSmm,
> - and CpuHotplugSmm.
> 
> Both of those are OVMF platform modules, and we never build one without
> building the other. (Put differently, we never build PiSmmCpuDxeSmm and
> CpuHotplugSmm separately, for any particular OVMF binary.)
> 
> Thus, the "Revision" field is useless.

Agreed. I was unsure about adding this field, because I did not know
whether there might be situations where CpuHotplugSmm and PiSmmCpuDxeSmm
are built separately.

Thanks for clarifying it.

> 
> 
>> +  UINT32           ArrayLength;       // Entries in the ApicIdMap array
>> +
>> +  UINT64           *ApicIdMap;        // Pointer to CpuIndex->ApicId map for
>> +                                      // pending hot-ejects
> 
> (13a) "CpuIndex" is yet another name here; if you mean
> "ProcessorNum[ber]" -- see point (10) above --, then please use that
> word.

I did. Will fix.

> 
> (13b) Also, the "->" arrow is a bit confusing (is "CpuIndex" a
> pointer???), so please either use " -> " (spaces on both sides) or write
> "ProcessorNumber-to-ApicId".
> 
> 
>> +  CPU_HOT_EJECT_FN Handler;           // Handler to do the CPU ejection
>> +
>> +  UINT64           Reserved;
> 
> (14) Please drop the "Reserved" field as well, with the justification
> given in (12).
> 
> 
>> +} CPU_HOT_EJECT_DATA;
>> +
>> +#endif /* _CPU_HOT_EJECT_DATA_H_ */
>>
> 
> (15) Comment style is wrong; should be //.
> 
> (I admit that you may find many examples for the wrong comment style,
> near such "#endif" directives, under OvmfPkg/Include; sorry about that.)
> 
> 
> (16) Please drop the leading underscore here too.
> 

Will fix. I am also acking all the other comments that I did not
explicitly address.

Thanks
Ankur

> 
> I plan to continue the review either today, or sometime later this week.
> 
> Thanks!
> Laszlo
> 
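For reference, the header changes requested in points (11) through (16)
could look roughly like the sketch below. This is only an illustration
of the review comments, not the code that was eventually posted: the
typedefs at the top are stand-ins for the edk2 base types so that the
fragment compiles on its own, and InitCpuHotEjectData() is a
hypothetical helper that does not appear in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

//
// Stand-ins for the edk2 base types (assumption: plain C99 environment).
//
typedef uint64_t UINT64;
typedef uint32_t UINT32;
typedef size_t   UINTN;
#define MAX_UINT64  UINT64_MAX

typedef void (*CPU_HOT_EJECT_FN) (UINTN ProcessorNum);

//
// ApicIdMap entry value: no ejection is pending for this processor.
//
#define CPU_EJECT_APIC_ID_INVALID  (MAX_UINT64)
//
// ApicIdMap entry value: this processor performs the ejections.
//
#define CPU_EJECT_APIC_ID_WORKER   (MAX_UINT64 - 1)

//
// No "Revision" and no "Reserved" field: both users of the structure are
// always built together, per points (12) and (14).
//
typedef struct {
  UINT32            ArrayLength;  // Entries in the ApicIdMap array
  UINT64            *ApicIdMap;   // ProcessorNumber-to-ApicId map for
                                  // pending hot-ejects
  CPU_HOT_EJECT_FN  Handler;      // Handler to do the CPU ejection
} CPU_HOT_EJECT_DATA;

//
// Hypothetical helper: start with no pending ejections and no handler.
//
static void
InitCpuHotEjectData (
  CPU_HOT_EJECT_DATA  *Data,
  UINT64              *Map,
  UINT32              Length
  )
{
  UINT32  Idx;

  assert (Data != NULL && Map != NULL);
  Data->ArrayLength = Length;
  Data->ApicIdMap   = Map;
  Data->Handler     = NULL;
  for (Idx = 0; Idx < Length; Idx++) {
    Map[Idx] = CPU_EJECT_APIC_ID_INVALID;
  }
}
```
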


* Re: [edk2-devel] [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic
  2021-01-30  1:15   ` [edk2-devel] " Laszlo Ersek
@ 2021-02-02  6:19     ` Ankur Arora
  0 siblings, 0 replies; 52+ messages in thread
From: Ankur Arora @ 2021-02-02  6:19 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-01-29 5:15 p.m., Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Refactor CpuHotplugMmi() to pull out the CPU hotplug logic into
>> ProcessHotAddedCpus(). This is in preparation for supporting CPU
>> hot-unplug.
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>
>> Notes:
>>       > +  if (EFI_ERROR(Status)) {
>>       > +    goto Fatal;
>>       >    }
>>      
>>       (13) Without having seen the rest of the patches, I think this error
>>       check should be nested under the same (PluggedCount > 0) condition; in
>>       other words, I think it only makes sense to check Status after we
>>       actually call ProcessHotAddedCpus().
>>      
>>      Addresses all comments from v5, except for this one, since the (lack) of
>>      nesting makes more sense after patch 4, "OvmfPkg/CpuHotplugSmm: introduce
>>      UnplugCpus()".
>>
>>   OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 214 ++++++++++++++++++++++---------------
>>   1 file changed, 129 insertions(+), 85 deletions(-)
>>
>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> index cfe698ed2b5e..05b1f8cb63a6 100644
>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> @@ -62,6 +62,130 @@ STATIC UINT32 mPostSmmPenAddress;
>>   //
>>   STATIC EFI_HANDLE mDispatchHandle;
>>   
>> +/**
>> +  Process CPUs that have been hot-added, per QemuCpuhpCollectApicIds().
>> +
>> +  For each such CPU, relocate the SMBASE, and report the CPU to PiSmmCpuDxeSmm
>> +  via EFI_SMM_CPU_SERVICE_PROTOCOL. If the supposedly hot-added CPU is already
>> +  known, skip it silently.
>> +
>> +  @param[in] PluggedApicIds    The APIC IDs of the CPUs that have been
>> +                               hot-plugged.
>> +
>> +  @param[in] PluggedCount      The number of filled-in APIC IDs in
>> +                               PluggedApicIds.
>> +
>> +  @retval EFI_SUCCESS          CPUs corresponding to all the APIC IDs are
>> +                               populated.
>> +
>> +  @retval EFI_OUT_OF_RESOURCES Out of APIC ID space in "mCpuHotPlugData".
>> +
>> +  @return                      Error codes propagated from SmbaseRelocate()
>> +                               and mMmCpuService->AddProcessor().
>> +
>> +**/
>> +STATIC
>> +EFI_STATUS
>> +ProcessHotAddedCpus (
>> +  IN APIC_ID                      *PluggedApicIds,
>> +  IN UINT32                       PluggedCount
>> +  )
>> +{
>> +  EFI_STATUS Status;
>> +  UINT32     PluggedIdx;
>> +  UINT32     NewSlot;
>> +
>> +  //
>> +  // The Post-SMM Pen need not be reinstalled multiple times within a single
>> +  // root MMI handling. Even reinstalling once per root MMI is only prudence;
>> +  // in theory installing the pen in the driver's entry point function should
>> +  // suffice.
>> +  //
>> +  SmbaseReinstallPostSmmPen (mPostSmmPenAddress);
>> +
>> +  PluggedIdx = 0;
>> +  NewSlot = 0;
>> +  while (PluggedIdx < PluggedCount) {
>> +    APIC_ID NewApicId;
>> +    UINT32  CheckSlot;
>> +    UINTN   NewProcessorNumberByProtocol;
>> +
>> +    NewApicId = PluggedApicIds[PluggedIdx];
>> +
>> +    //
>> +    // Check if the supposedly hot-added CPU is already known to us.
>> +    //
>> +    for (CheckSlot = 0;
>> +         CheckSlot < mCpuHotPlugData->ArrayLength;
>> +         CheckSlot++) {
>> +      if (mCpuHotPlugData->ApicId[CheckSlot] == NewApicId) {
>> +        break;
>> +      }
>> +    }
>> +    if (CheckSlot < mCpuHotPlugData->ArrayLength) {
>> +      DEBUG ((DEBUG_VERBOSE, "%a: APIC ID " FMT_APIC_ID " was hot-plugged "
>> +        "before; ignoring it\n", __FUNCTION__, NewApicId));
>> +      PluggedIdx++;
>> +      continue;
>> +    }
>> +
>> +    //
>> +    // Find the first empty slot in CPU_HOT_PLUG_DATA.
>> +    //
>> +    while (NewSlot < mCpuHotPlugData->ArrayLength &&
>> +           mCpuHotPlugData->ApicId[NewSlot] != MAX_UINT64) {
>> +      NewSlot++;
>> +    }
>> +    if (NewSlot == mCpuHotPlugData->ArrayLength) {
>> +      DEBUG ((DEBUG_ERROR, "%a: no room for APIC ID " FMT_APIC_ID "\n",
>> +        __FUNCTION__, NewApicId));
>> +      return EFI_OUT_OF_RESOURCES;
>> +    }
>> +
>> +    //
>> +    // Store the APIC ID of the new processor to the slot.
>> +    //
>> +    mCpuHotPlugData->ApicId[NewSlot] = NewApicId;
>> +
>> +    //
>> +    // Relocate the SMBASE of the new CPU.
>> +    //
>> +    Status = SmbaseRelocate (NewApicId, mCpuHotPlugData->SmBase[NewSlot],
>> +               mPostSmmPenAddress);
>> +    if (EFI_ERROR (Status)) {
>> +      goto RevokeNewSlot;
>> +    }
>> +
>> +    //
>> +    // Add the new CPU with EFI_SMM_CPU_SERVICE_PROTOCOL.
>> +    //
>> +    Status = mMmCpuService->AddProcessor (mMmCpuService, NewApicId,
>> +                              &NewProcessorNumberByProtocol);
>> +    if (EFI_ERROR (Status)) {
>> +      DEBUG ((DEBUG_ERROR, "%a: AddProcessor(" FMT_APIC_ID "): %r\n",
>> +        __FUNCTION__, NewApicId, Status));
>> +      goto RevokeNewSlot;
>> +    }
>> +
>> +    DEBUG ((DEBUG_INFO, "%a: hot-added APIC ID " FMT_APIC_ID ", SMBASE 0x%Lx, "
>> +      "EFI_SMM_CPU_SERVICE_PROTOCOL assigned number %Lu\n", __FUNCTION__,
>> +      NewApicId, (UINT64)mCpuHotPlugData->SmBase[NewSlot],
>> +      (UINT64)NewProcessorNumberByProtocol));
>> +
>> +    NewSlot++;
>> +    PluggedIdx++;
>> +  }
>> +
>> +  //
>> +  // We've processed this batch of hot-added CPUs.
>> +  //
>> +  return EFI_SUCCESS;
>> +
>> +RevokeNewSlot:
>> +  mCpuHotPlugData->ApicId[NewSlot] = MAX_UINT64;
>> +
>> +  return Status;
>> +}
>>   
>>   /**
>>     CPU Hotplug MMI handler function.
>> @@ -122,8 +246,6 @@ CpuHotplugMmi (
>>     UINT8      ApmControl;
>>     UINT32     PluggedCount;
>>     UINT32     ToUnplugCount;
>> -  UINT32     PluggedIdx;
>> -  UINT32     NewSlot;
>>   
>>     //
>>     // Assert that we are entering this function due to our root MMI handler
>> @@ -179,87 +301,12 @@ CpuHotplugMmi (
>>       goto Fatal;
>>     }
>>   
>> -  //
>> -  // Process hot-added CPUs.
>> -  //
>> -  // The Post-SMM Pen need not be reinstalled multiple times within a single
>> -  // root MMI handling. Even reinstalling once per root MMI is only prudence;
>> -  // in theory installing the pen in the driver's entry point function should
>> -  // suffice.
>> -  //
>> -  SmbaseReinstallPostSmmPen (mPostSmmPenAddress);
>> +  if (PluggedCount > 0) {
>> +    Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
>> +  }
>>   
>> -  PluggedIdx = 0;
>> -  NewSlot = 0;
>> -  while (PluggedIdx < PluggedCount) {
>> -    APIC_ID NewApicId;
>> -    UINT32  CheckSlot;
>> -    UINTN   NewProcessorNumberByProtocol;
>> -
>> -    NewApicId = mPluggedApicIds[PluggedIdx];
>> -
>> -    //
>> -    // Check if the supposedly hot-added CPU is already known to us.
>> -    //
>> -    for (CheckSlot = 0;
>> -         CheckSlot < mCpuHotPlugData->ArrayLength;
>> -         CheckSlot++) {
>> -      if (mCpuHotPlugData->ApicId[CheckSlot] == NewApicId) {
>> -        break;
>> -      }
>> -    }
>> -    if (CheckSlot < mCpuHotPlugData->ArrayLength) {
>> -      DEBUG ((DEBUG_VERBOSE, "%a: APIC ID " FMT_APIC_ID " was hot-plugged "
>> -        "before; ignoring it\n", __FUNCTION__, NewApicId));
>> -      PluggedIdx++;
>> -      continue;
>> -    }
>> -
>> -    //
>> -    // Find the first empty slot in CPU_HOT_PLUG_DATA.
>> -    //
>> -    while (NewSlot < mCpuHotPlugData->ArrayLength &&
>> -           mCpuHotPlugData->ApicId[NewSlot] != MAX_UINT64) {
>> -      NewSlot++;
>> -    }
>> -    if (NewSlot == mCpuHotPlugData->ArrayLength) {
>> -      DEBUG ((DEBUG_ERROR, "%a: no room for APIC ID " FMT_APIC_ID "\n",
>> -        __FUNCTION__, NewApicId));
>> -      goto Fatal;
>> -    }
>> -
>> -    //
>> -    // Store the APIC ID of the new processor to the slot.
>> -    //
>> -    mCpuHotPlugData->ApicId[NewSlot] = NewApicId;
>> -
>> -    //
>> -    // Relocate the SMBASE of the new CPU.
>> -    //
>> -    Status = SmbaseRelocate (NewApicId, mCpuHotPlugData->SmBase[NewSlot],
>> -               mPostSmmPenAddress);
>> -    if (EFI_ERROR (Status)) {
>> -      goto RevokeNewSlot;
>> -    }
>> -
>> -    //
>> -    // Add the new CPU with EFI_SMM_CPU_SERVICE_PROTOCOL.
>> -    //
>> -    Status = mMmCpuService->AddProcessor (mMmCpuService, NewApicId,
>> -                              &NewProcessorNumberByProtocol);
>> -    if (EFI_ERROR (Status)) {
>> -      DEBUG ((DEBUG_ERROR, "%a: AddProcessor(" FMT_APIC_ID "): %r\n",
>> -        __FUNCTION__, NewApicId, Status));
>> -      goto RevokeNewSlot;
>> -    }
>> -
>> -    DEBUG ((DEBUG_INFO, "%a: hot-added APIC ID " FMT_APIC_ID ", SMBASE 0x%Lx, "
>> -      "EFI_SMM_CPU_SERVICE_PROTOCOL assigned number %Lu\n", __FUNCTION__,
>> -      NewApicId, (UINT64)mCpuHotPlugData->SmBase[NewSlot],
>> -      (UINT64)NewProcessorNumberByProtocol));
>> -
>> -    NewSlot++;
>> -    PluggedIdx++;
>> +  if (EFI_ERROR(Status)) {
> 
> (1) I understand why you skipped point (13) from the previous review,
> but you missed point (14) as well -- space character missing after
> "EFI_ERROR":
> 
> https://edk2.groups.io/g/devel/message/70785
> 
> Anyway, in case v7 will not be necessary, I can fix this up myself.
> 
> With the space character added:
> 
> Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thanks. Will fix in v7 (not quite sure how this one escaped me; my eyes
are too used to not having the additional space, but I did have a
grep command that should have caught this one as well.)

Thanks
Ankur

> 
> Thanks
> Laszlo
> 
> 
>> +    goto Fatal;
>>     }
>>   
>>     //
>> @@ -267,9 +314,6 @@ CpuHotplugMmi (
>>     //
>>     return EFI_SUCCESS;
>>   
>> -RevokeNewSlot:
>> -  mCpuHotPlugData->ApicId[NewSlot] = MAX_UINT64;
>> -
>>   Fatal:
>>     ASSERT (FALSE);
>>     CpuDeadLoop ();
>>
> 


* Re: [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection
  2021-02-01 19:21     ` Ankur Arora
@ 2021-02-02 13:23       ` Laszlo Ersek
  2021-02-03  5:41         ` Ankur Arora
  0 siblings, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-02 13:23 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/01/21 20:21, Ankur Arora wrote:
> On 2021-02-01 9:22 a.m., Laszlo Ersek wrote:
>> On 01/29/21 01:59, Ankur Arora wrote:
>>> Designate a worker CPU (we use the one executing the root MMI
>>> handler), which will do the actual ejection via QEMU in CpuEject().
>>>
>>> CpuEject(), on the worker CPU, ejects each marked CPU by first
>>> selecting its APIC ID and then sending the QEMU "eject" command.
>>> QEMU in-turn signals the remote VCPU thread which context-switches
>>> it out of the SMI.
>>>
>>> CpuEject(), on the CPU being ejected, spins around in its holding
>>> area until this final context-switch. This does mean that there is
>>> some CPU state that would ordinarily be restored (in SmiRendezvous()
>>> and in SmiEntry.nasm::CommonHandler), but will not be anymore.
>>> This unrestored state includes FPU state, CET enable, stuffing of
>>> RSB and the final RSM. Since the CPU state is destroyed by QEMU,
>>> this should be okay.
>>>
>>> Cc: Laszlo Ersek <lersek@redhat.com>
>>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>>> Cc: Igor Mammedov <imammedo@redhat.com>
>>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>> Cc: Aaron Young <aaron.young@oracle.com>
>>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>> ---
>>>   OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 73
>>> ++++++++++++++++++++++++++++++++++----
>>>   1 file changed, 67 insertions(+), 6 deletions(-)
>>
>> (1) s/CpuEject/EjectCpu/g, per previous request (affects commit message
>> and code too).
>>
>>
>>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>> b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>> index 526f51faf070..bf91344eef9c 100644
>>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>> @@ -193,9 +193,12 @@ RevokeNewSlot:
>>>     CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
>>>     on each CPU at exit from SMM.
>>>
>>> -  If, the executing CPU is not being ejected, nothing to be done.
>>> +  If, the executing CPU is neither a worker, nor being ejected, nothing
>>> +  to be done.
>>>     If, the executing CPU is being ejected, wait in a CpuDeadLoop()
>>>     until ejected.
>>> +  If, the executing CPU is a worker CPU, set QEMU CPU status to eject
>>> +  for CPUs being ejected.
>>>
>>>     @param[in] ProcessorNum      Index of executing CPU.
>>>
>>> @@ -217,6 +220,56 @@ CpuEject (
>>>       return;
>>>     }
>>>
>>> +  if (ApicId == CPU_EJECT_WORKER) {
>>
>> (2) The CPU_EJECT_WORKER approach is needlessly complicated (speculative
>> generality). I wish I understood this idea earlier in the patch set.
>>
>> (2a) In patch #5 (subject "OvmfPkg/CpuHotplugSmm: define
>> CPU_HOT_EJECT_DATA"), the CPU_EJECT_WORKER macro definition should be
>> dropped.
>>
>> (2b) In this patch, the question whether the executing CPU is the BSP or
>> not, should be decided with the same logic that is visible in
>> PlatformSmmBspElection()
>> [OvmfPkg/Library/SmmCpuPlatformHookLibQemu/SmmCpuPlatformHookLibQemu.c]:
>>
>>    MSR_IA32_APIC_BASE_REGISTER ApicBaseMsr;
>>    BOOLEAN                     IsBsp;
>>
>>    ApicBaseMsr.Uint64 = AsmReadMsr64 (MSR_IA32_APIC_BASE);
>>    IsBsp = (BOOLEAN)(ApicBaseMsr.Bits.BSP == 1);
>>
>> (2c) Point (2b) obviates the explicit "mark as worker" logic entirely,
>> in UnplugCpus() below.
>>
>> (2d) The "is worker" language (in comments etc) should be replaced with
>> direct "is BSP" language.
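The BSP check in (2b) boils down to testing one bit of the
IA32_APIC_BASE MSR value. The sketch below shows only that bit test on a
value that has already been read; it does not read the MSR itself (that
needs ring 0, which is what AsmReadMsr64() provides in the firmware).
IsBspFromApicBase() is a hypothetical name used for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

//
// Bit 8 of IA32_APIC_BASE is the BSP flag; this corresponds to the
// "Bits.BSP" field used in point (2b).
//
#define APIC_BASE_BSP_FLAG  (1ULL << 8)

//
// Hypothetical helper: decide "is BSP" from an already-read MSR value.
//
static bool
IsBspFromApicBase (
  uint64_t  ApicBaseMsrValue
  )
{
  return (ApicBaseMsrValue & APIC_BASE_BSP_FLAG) != 0;
}
```

With this kind of check, no per-processor "worker" marker needs to be
stored in the map: every processor can find out locally whether it is
the BSP.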
>>
>>
>>> +    UINT32 CpuIndex;
>>> +
>>> +    for (CpuIndex = 0; CpuIndex < mCpuHotEjectData->ArrayLength;
>>> CpuIndex++) {
>>> +      UINT64 RemoveApicId;
>>> +
>>> +      RemoveApicId = mCpuHotEjectData->ApicIdMap[CpuIndex];
>>> +
>>> +      if ((RemoveApicId != CPU_EJECT_INVALID &&
>>> +           RemoveApicId != CPU_EJECT_WORKER)) {
>>> +        //
>>> +        // This to-be-ejected-CPU has already received the BSP's SMI
>>> exit
>>> +        // signal and, will execute SmmCpuFeaturesSmiRendezvousExit()
>>> +        // followed by this callback or is already waiting in the
>>> +        // CpuDeadLoop() below.
>>> +        //
>>> +        // Tell QEMU to context-switch it out.
>>> +        //
>>> +        QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId);
>>> +        QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED);
>>
>> (3) While the QEMU CPU selector value *usually* matches the APIC ID,
>> it's not an invariant. APIC IDs have an internal structure, composed of
>> bit-fields, where each bit-field accommodates one hierarchy level in the
>> CPU topology (thread, core, die (maybe), and socket).
>>
>> However, this mapping need not be surjective. QEMU lets you create
>> "pathological" CPU topologies, for example one with:
>> - 3 threads/core,
>> - 5 cores/socket,
>> - (say) 2 sockets.
>>
>> Under that example, the bit-field standing for the "thread number" level
>> would have 2 bits, theoretically permitting *4* threads/core, and the
>> bit-field standing for the "core number" level would have 3 bits,
>> theoretically allowing for *8* cores/socket.
>>
>> Considering the fully populated topology, you'd see the CPU selector
>> range from 0 to (3*5*2-1)=29, inclusive (corresponding to 30 logical
>> processors in total). However, the APIC ID *image* of this CPU selector
>> *domain* would not be "contiguous" -- the APIC ID space, with the
>> above-described structure, would accommodate 4*8*2=64 logical
>> processors. For example, each APIC ID that stood for the nonexistent
>> "thread#3" on a particular core would be left unused (no CPU selector
>> would map to it).
>>
>> All in all, you can't write the APIC ID to the CPU selector register,
>> for ejection. You need to select the CPU whose APIC ID is the APIC ID
>> you want to eject, and then initiate ejection.
> 
> Yeah, this is a clear bug. Should have seen it earlier. Thanks for
> pointing it out.
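The gap between selector and APIC ID in point (3) can be made concrete
with a small sketch. It assumes that selectors enumerate threads
fastest, then cores, then sockets, and that each topology level gets
the smallest bit-field that fits it; both are assumptions made for
illustration, and BitsFor() and SelectorToApicId() are hypothetical
helpers, not QEMU or edk2 functions.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
  uint32_t  Threads;  // threads per core
  uint32_t  Cores;    // cores per socket
  uint32_t  Sockets;  // sockets in total
} TOPOLOGY;

//
// Number of bits needed to encode Count distinct values (Count >= 1).
//
static uint32_t
BitsFor (
  uint32_t  Count
  )
{
  uint32_t  Bits;

  assert (Count >= 1);
  Bits = 0;
  while ((1u << Bits) < Count) {
    Bits++;
  }
  return Bits;
}

//
// Map a contiguous selector to an APIC ID built from bit-fields.
//
static uint32_t
SelectorToApicId (
  const TOPOLOGY  *Topo,
  uint32_t        Selector
  )
{
  uint32_t  Thread;
  uint32_t  Core;
  uint32_t  Socket;
  uint32_t  ThreadBits;
  uint32_t  CoreBits;

  Thread     = Selector % Topo->Threads;
  Core       = (Selector / Topo->Threads) % Topo->Cores;
  Socket     = Selector / (Topo->Threads * Topo->Cores);
  ThreadBits = BitsFor (Topo->Threads);
  CoreBits   = BitsFor (Topo->Cores);
  assert (Socket < Topo->Sockets);

  return (Socket << (CoreBits + ThreadBits)) | (Core << ThreadBits) | Thread;
}
```

Under these assumptions, with 3 threads/core, 5 cores/socket and 2
sockets, selector 2 maps to APIC ID 2 but selector 3 already maps to
APIC ID 4, so APIC ID 3 is never used, and selector 29 maps to APIC
ID 50.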
> 
>>
>> This requires one of two alternatives:
>>
>>
>> (3a) The first option is to keep the change local to this patch.
>>
>> This alternative is the more CPU-hungry (and uglier) one.
>>
>> The idea is to perform a QEMU_CPUHP_CMD_GET_ARCH_ID loop over all
>> possible CPUs, somewhat similarly to QemuCpuhpCollectApicIds(). At every
>> CPU, knowing the APIC ID, try to find the APIC ID in "ApicIdMap". If
>> there is a match, eject.
>>
>>
>> (3b) The second option is much more elegant (and it's faster too), but
>> it requires a much more intrusive update to the patch set.
>>
>> First, the *element type* of the arrays that QemuCpuhpCollectApicIds()
>> operates on, has to be changed from APIC_ID to a structure type that
>> pairs APIC_ID with the QEMU CPU selector. [*]
>>
>> Second, whenever QemuCpuhpCollectApicIds() outputs an APIC_ID, it should
>> also save the "CurrentSelector" value (in the other field of the output
>> array element structure).
>>
>> Third, the element type of CPU_HOT_EJECT_DATA.ApicIdMap should be
>> replaced with a structure type similar (or identical) to the one
>> described at [*]. The ProcessorNumber lookup in UnplugCpus() would still
>> be based upon the APIC ID, but CPU_HOT_EJECT_DATA should remember both
>> the QEMU selector for that processor, and the APIC ID.
>>
>> Fourth, the actual ejection should use the selector.
>>
>> Fifth, the debug message (below) should continue logging the APIC ID, to
>> mirror the DEBUG_INFO message in ProcessHotAddedCpus().
>>
>>
>> Would you be willing to implement (3b)?
> 
> 3b is clearly the better solution. However, is there enough value in
> the print message containing APIC ID, that CPU_HOT_EJECT_DATA.ApicIdMap
> carry both the cpu-selector and APIC ID?
> 
> As you say, the ejection itself just needs the ProcessorNum -> QEMU
> cpu-selector mapping.

Good question, and I'm torn.

The default DEBUG build of OVMF enables INFO messages, and more severe
ones. It does not enable VERBOSE messages.

In such a DEBUG build (i.e., with otherwise default settings), the log
entries that relate to hot-plug do not report selectors. Conversely, on
hot-unplug, the log would report selectors *only* (not APIC IDs). I feel
this makes it more difficult to get useful OVMF debug logs in bug
reports, especially from users of distro-provided OVMF builds.

If we can ask the user to rebuild with VERBOSE enabled and re-run the
test, that's always great, but frequently users don't know how to
rebuild OVMF, plus usually OVMF is hidden so deep in their virt stack
that they have no idea how to make that stack use a custom OVMF build.

... How about this: in UnplugCpus(), we could emit an INFO message about
the ProcessorNumber <-> APIC ID <-> Selector correspondence, and when
the eject happens, it would suffice to log (at INFO level) only
ProcessorNumber <-> Selector.

Thanks,
Laszlo
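A minimal sketch of the element type discussed under (3b), assuming the
map stays indexed by ProcessorNumber and each entry remembers both the
APIC ID (for logging) and the QEMU selector (for the actual ejection).
EJECT_ENTRY, its "Pending" flag and GetEjectSelector() are hypothetical
names chosen for this illustration; the series may well encode "no
pending ejection" differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t APIC_ID;

//
// Hypothetical map entry: pairs the APIC ID with the QEMU CPU selector.
//
typedef struct {
  APIC_ID   ApicId;        // logged at INFO level when the unplug is seen
  uint32_t  QemuSelector;  // written to the selector register on eject
  bool      Pending;       // true while an ejection is outstanding
} EJECT_ENTRY;

//
// Return the selector to eject for ProcessorNumber, if one is pending.
//
static bool
GetEjectSelector (
  const EJECT_ENTRY  *Map,
  size_t             Length,
  size_t             ProcessorNumber,
  uint32_t           *Selector
  )
{
  assert (Map != NULL && Selector != NULL);
  if (ProcessorNumber >= Length || !Map[ProcessorNumber].Pending) {
    return false;
  }
  *Selector = Map[ProcessorNumber].QemuSelector;
  return true;
}
```

The eject path only ever consumes the selector; the APIC ID is carried
along so that the log can report the ProcessorNumber, APIC ID and
selector together at the time the unplug request is recorded.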



* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-01 20:12     ` Ankur Arora
@ 2021-02-02 14:00       ` Laszlo Ersek
  2021-02-02 14:15         ` Laszlo Ersek
  2021-02-03  6:13         ` Ankur Arora
  0 siblings, 2 replies; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-02 14:00 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/01/21 21:12, Ankur Arora wrote:
> On 2021-02-01 11:08 a.m., Laszlo Ersek wrote:
>> apologies, I've got more comments here:
>>
>> On 01/29/21 01:59, Ankur Arora wrote:
>>
>>>   /**
>>> +  CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
>>> +  on each CPU at exit from SMM.
>>> +
>>> +  If, the executing CPU is not being ejected, nothing to be done.
>>> +  If, the executing CPU is being ejected, wait in a CpuDeadLoop()
>>> +  until ejected.
>>> +
>>> +  @param[in] ProcessorNum      Index of executing CPU.
>>> +
>>> +**/
>>> +VOID
>>> +EFIAPI
>>> +CpuEject (
>>> +  IN UINTN ProcessorNum
>>> +  )
>>> +{
>>> +  //
>>> +  // APIC ID is UINT32, but mCpuHotEjectData->ApicIdMap[] is UINT64
>>> +  // so use UINT64 throughout.
>>> +  //
>>> +  UINT64 ApicId;
>>> +
>>> +  ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
>>> +  if (ApicId == CPU_EJECT_INVALID) {
>>> +    return;
>>> +  }
>>> +
>>> +  //
>>> +  // CPU(s) being unplugged get here from
>>> SmmCpuFeaturesSmiRendezvousExit()
>>> +  // after having been cleared to exit the SMI by the monarch and
>>> thus have
>>> +  // no SMM processing remaining.
>>> +  //
>>> +  // Given that we cannot allow them to escape to the guest, we pen
>>> them
>>> +  // here until the SMM monarch tells the HW to unplug them.
>>> +  //
>>> +  CpuDeadLoop ();
>>> +}
>>
>> (15) There is no such function as SmmCpuFeaturesSmiRendezvousExit() --
>> it's SmmCpuFeaturesRendezvousExit().
>>
>> (16) This function uses a data structure for communication between BSP
>> and APs -- mCpuHotEjectData->ApicIdMap is modified in UnplugCpus() on
>> the BSP, and checked above by the APs (too).
>>
>> What guarantees the visibility of mCpuHotEjectData->ApicIdMap?
>
> I was banking on SmiRendezvous() explicitly signalling that all
> processing on the BSP was done before any AP will look at
> mCpuHotEjectData in SmmCpuFeaturesRendezvousExit().
>
> 1716     //
> 1717     // Wait for BSP's signal to exit SMI
> 1718     //
> 1719     while (*mSmmMpSyncData->AllCpusInSync) {
> 1720       CpuPause ();
> 1721     }
> 1722   }
> 1723
> 1724 Exit:
> 1725   SmmCpuFeaturesRendezvousExit (CpuIndex);

Right; it's a general pattern in edk2: volatile UINT8 (aka BOOLEAN)
objects are considered atomic. (See
SMM_DISPATCHER_MP_SYNC_DATA.AllCpusInSync -- it's a pointer to a
volatile BOOLEAN.)

But our UINT64 values are neither volatile nor UINT8, and I suddenly got
doubtful about "AllCpusInSync" working as a multiprocessor barrier.

(I could be unjustifiably worried, as a bunch of other fields in
SMM_DISPATCHER_MP_SYNC_DATA are volatile, wider than UINT8, and *not*
accessed with InterlockedCompareExchangeXx().)


>
>>
>> I think we might want to use InterlockedCompareExchange64() in both
>> EjectCpu() and UnplugCpus() (and make "ApicIdMap" volatile, in
>> addition). InterlockedCompareExchange64() can be used just for
>> comparison as well, by passing ExchangeValue=CompareValue.
>
>
> Speaking specifically about the ApicIdMap, I'm not sure I fully
> agree (assuming my comment just above is correct.)
>
>
> The only AP (reader) ApicIdMap deref is here:
>
> CpuEject():
> 218   ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
>
> For the to-be-ejected-AP, this value can only move from
>    valid-APIC-ID (=> wait in CpuDeadLoop()) -> CPU_EJECT_INVALID.
>
> Given that, by the time the worker does the write on line 254, this
> AP is guaranteed to be dead already, I don't think there's any
> scenario where the to-be-ejected-AP can see anything other than
> a valid-APIC-ID.

The scenario I had in mind was different: what guarantees that the
effect of

   375        mCpuHotEjectData->ApicIdMap[ProcessorNum] = (UINT64)RemoveApicId;

which is performed by the BSP in UnplugCpus(), is visible to the AP on
line 218 (see your quote above)?

What if the AP gets to line 218 before the BSP's write on line 375
*propagates* sufficiently?

There's no question that the BSP writes before the AP reads, but I'm
uncertain if that suffices for the *effect* of the write to be visible
to the AP. My concern is not whether the AP sees a partial vs. a settled
update; my concern is if the AP could see an entirely *stale* value.

The consequence of that problem would be that an AP that the BSP was
about to eject would return from CpuEject() to
SmmCpuFeaturesRendezvousExit() to SmiRendezvous().

... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
combination with the sync-up point that you quoted. This seems to match
existing practice in PiSmmCpuDxeSmm -- there are no concurrent accesses,
so atomicity is not a concern, and serializing the instruction streams
coarsely, with the sync-up, in combination with volatile accesses,
should presumably guarantee visibility (on x86 anyway).

Thanks
Laszlo
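The volatile-qualification suggested above might look like the sketch
below. It assumes, as the message does, that the existing sync-up point
already orders the BSP's store before the AP's load, so the qualifiers
only need to stop the compiler from caching or eliding the accesses.
EJECT_DATA, BspMarkForEject() and ApMustStayPenned() are hypothetical
names, and the test-free fragment says nothing about what a real
multiprocessor run would observe.

```c
#include <assert.h>
#include <stdint.h>

#define EJECT_APIC_ID_INVALID  UINT64_MAX

//
// Both the structure (when accessed through a volatile pointer) and the
// array it points to are volatile, so every access is a real load/store.
//
typedef struct {
  uint32_t           ArrayLength;
  volatile uint64_t  *ApicIdMap;
} EJECT_DATA;

//
// BSP side: record that ProcessorNumber is to be ejected.
//
static void
BspMarkForEject (
  volatile EJECT_DATA  *Data,
  uint32_t             ProcessorNumber,
  uint64_t             ApicId
  )
{
  assert (ProcessorNumber < Data->ArrayLength);
  Data->ApicIdMap[ProcessorNumber] = ApicId;
}

//
// AP side: nonzero if this processor must wait to be ejected.
//
static int
ApMustStayPenned (
  volatile EJECT_DATA  *Data,
  uint32_t             ProcessorNumber
  )
{
  assert (ProcessorNumber < Data->ArrayLength);
  return Data->ApicIdMap[ProcessorNumber] != EJECT_APIC_ID_INVALID;
}
```
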


>
> 241         QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId);
> 242         QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED);
> 243
> 244         //
> 245         // Compiler barrier to ensure the next store isn't reordered
> 246         //
> 247         MemoryFence ();
> 248
> 249         //
> 250         // Clear the eject status for CpuIndex to ensure that an
> invalid
> 251         // SMI later does not end up trying to eject it or a newly
> 252         // hotplugged CpuIndex does not go into the dead loop.
> 253         //
> 254         mCpuHotEjectData->ApicIdMap[CpuIndex] = CPU_EJECT_INVALID;
>
> For APs that are not being ejected, they will always see
> CPU_EJECT_INVALID since the writer never changes that.
>
> The one scenario in which bad things could happen is if entries in the
> ApicIdMap are unaligned (or if the compiler or cpu-arch tears aligned
> writes).
>
>>
>> (17) I think a similar observation applies to the "Handler" field too,
>> as APs call it, while the BSP keeps flipping it between NULL and a real
>> function address. We might have to turn that field into an
> From a real function address, to NULL is the problem part right?
>
> (Same argument as above for the transition in UnplugCpus() from
> NULL -> function-address.)
>
>
>> EFI_PHYSICAL_ADDRESS (just a fancy name for UINT64), and use
>> InterlockedCompareExchange64() again.
>
> AFAICS, these are the problematic derefs:
>
> SmmCpuFeaturesRendezvousExit():
>
> 450   if (mCpuHotEjectData == NULL ||
> 451       mCpuHotEjectData->Handler == NULL) {
> 452     return;
>
> and problematic assignments:
>
> 266     //
> 267     // We are done until the next hot-unplug; clear the handler.
> 268     //
> 269     mCpuHotEjectData->Handler = NULL;
> 270     return;
> 271   }
>
> Here as well, I've been banking on aligned writes such that the APs would
> only see the before or after value not an intermediate value.
>
> Thanks
> Ankur
>
>>
>> Thanks
>> Laszlo
>>
>
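The idea in points (16) and (17) of using a compare-exchange purely as
an atomic read (ExchangeValue = CompareValue) can be sketched as
follows. CompareExchange64() here is a stand-in built on the GCC/Clang
__sync_val_compare_and_swap() builtin, chosen so the fragment compiles
outside edk2; it is assumed to mirror the "return the original value"
behaviour of InterlockedCompareExchange64(), and AtomicRead64() is a
hypothetical wrapper.

```c
#include <assert.h>
#include <stdint.h>

//
// Stand-in for InterlockedCompareExchange64(): if *Value equals Compare,
// store Exchange; in all cases return the value that was in *Value.
//
static uint64_t
CompareExchange64 (
  volatile uint64_t  *Value,
  uint64_t           Compare,
  uint64_t           Exchange
  )
{
  return __sync_val_compare_and_swap (Value, Compare, Exchange);
}

//
// Atomic read: exchanging Probe with itself never changes *Value, whether
// or not the comparison succeeds, so only the returned original matters.
//
static uint64_t
AtomicRead64 (
  volatile uint64_t  *Value,
  uint64_t           Probe
  )
{
  return CompareExchange64 (Value, Probe, Probe);
}
```

Whichever Probe value is passed, the stored value is left unchanged; a
caller that wants a comparison simply checks the result against the
value it is interested in.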



* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-02 14:00       ` Laszlo Ersek
@ 2021-02-02 14:15         ` Laszlo Ersek
  2021-02-03  6:45           ` Ankur Arora
  2021-02-03  6:13         ` Ankur Arora
  1 sibling, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-02 14:15 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/02/21 15:00, Laszlo Ersek wrote:

> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
> array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
> combination with the sync-up point that you quoted. This seems to match
> existing practice in PiSmmCpuDxeSmm -- there are no concurrent accesses,
> so atomicity is not a concern, and serializing the instruction streams
> coarsely, with the sync-up, in combination with volatile accesses,
> should presumably guarantee visibility (on x86 anyway).

To summarize, this is what I would ask for:

- make CPU_HOT_EJECT_DATA volatile

- make (*CPU_HOT_EJECT_DATA.ApicIdMap) volatile

- after storing something to CPU_HOT_EJECT_DATA or
CPU_HOT_EJECT_DATA.ApicIdMap on the BSP, execute a MemoryFence()

- before fetching something from CPU_HOT_EJECT_DATA or
CPU_HOT_EJECT_DATA.ApicIdMap on an AP, execute a MemoryFence()


Except: MemoryFence() isn't a *memory fence* in fact.

See "MdePkg/Library/BaseLib/X64/GccInline.c".

It's just a compiler barrier, which may not add anything beyond what
we'd already have from "volatile".

Case in point: PiSmmCpuDxeSmm performs heavy multi-processing, but does
not contain a single invocation of MemoryFence(). It uses volatile
objects, and a handful of InterlockedCompareExchangeXx() calls, for
implementing semaphores. (NB: there is no 8-bit variant of
InterlockedCompareExchange(), as "volatile UINT8" is considered atomic
in itself, and a suitable basis for a semaphore too.) And given the
synchronization from those semaphores, PiSmmCpuDxeSmm trusts that
updates to the *other* volatile objects are both atomic and visible.

I'm pretty sure this only works because x86 has a strongly ordered
memory model. There are instruction stream barriers in place, and
compiler barriers too, but no actual memory barriers.

Now the question is whether we have managed to *sufficiently* imitate
these patterns from PiSmmCpuDxeSmm, in this patch set.

Making stuff volatile, and relying on the existent sync-up point, might
suffice.

Thanks
Laszlo
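To illustrate the distinction drawn above between a compiler barrier and
an actual memory fence, here is a sketch using GCC/Clang facilities
rather than edk2 ones. CompilerBarrier(), HardwareFence(), BspPublish()
and ApFetch() are hypothetical names; the store-then-fence and
fence-then-load pairing follows the summary list in this message, with a
real fence substituted where the list says MemoryFence().

```c
#include <assert.h>
#include <stdint.h>

//
// Compiler barrier only: prevents the compiler from moving memory accesses
// across this point, but emits no instruction.
//
static inline void
CompilerBarrier (
  void
  )
{
  __asm__ __volatile__ ("" ::: "memory");
}

//
// Full fence: also orders the accesses at the processor level (on x86
// compilers typically emit MFENCE or a locked instruction for this).
//
static inline void
HardwareFence (
  void
  )
{
  __atomic_thread_fence (__ATOMIC_SEQ_CST);
}

//
// BSP side: store, then fence.
//
static void
BspPublish (
  volatile uint64_t  *Slot,
  uint64_t           Value
  )
{
  *Slot = Value;
  HardwareFence ();
}

//
// AP side: fence, then load.
//
static uint64_t
ApFetch (
  const volatile uint64_t  *Slot
  )
{
  HardwareFence ();
  return *Slot;
}
```

Swapping HardwareFence() for CompilerBarrier() in the two accessors
gives the weaker variant that the message suspects is all MemoryFence()
provides on top of volatile.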



* Re: [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus()
  2021-02-01  3:13   ` Laszlo Ersek
@ 2021-02-03  4:28     ` Ankur Arora
  2021-02-03 19:20       ` Laszlo Ersek
  0 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-02-03  4:28 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-01-31 7:13 p.m., Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Introduce UnplugCpus() which maps each APIC ID being unplugged
>> onto the hardware ID of the processor and informs PiSmmCpuDxeSmm
>> of removal by calling EFI_SMM_CPU_SERVICE_PROTOCOL.RemoveProcessor().
>>
>> With this change we handle the first phase of unplug where we collect
>> the CPUs that need to be unplugged and mark them for removal in SMM
>> data structures.
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>   OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 84 ++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 84 insertions(+)
>>
>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> index 05b1f8cb63a6..70d69f6ed65b 100644
>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> @@ -188,6 +188,88 @@ RevokeNewSlot:
>>   }
>>
>>   /**
>> +  Process to be hot-unplugged CPUs, per QemuCpuhpCollectApicIds().
>> +
>> +  For each such CPU, report the CPU to PiSmmCpuDxeSmm via
>> +  EFI_SMM_CPU_SERVICE_PROTOCOL. If the to be hot-unplugged CPU is
>> +  unknown, skip it silently.
>> +
>> +  @param[in] ToUnplugApicIds    The APIC IDs of the CPUs that are about to be
>> +                                hot-unplugged.
>> +
>> +  @param[in] ToUnplugCount      The number of filled-in APIC IDs in
>> +                                ToUnplugApicIds.
>> +
>> +  @retval EFI_SUCCESS           Known APIC IDs have been removed from SMM data
>> +                                structures.
>> +
>> +  @return                       Error codes propagated from
>> +                                mMmCpuService->RemoveProcessor().
>> +
> 
> (1) Please drop this empty line (just before the '**/').
> 
> 
>> +**/
>> +STATIC
>> +EFI_STATUS
>> +UnplugCpus (
>> +  IN APIC_ID                      *ToUnplugApicIds,
>> +  IN UINT32                       ToUnplugCount
>> +  )
>> +{
>> +  EFI_STATUS Status;
>> +  UINT32     ToUnplugIdx;
>> +  UINTN      ProcessorNum;
>> +
>> +  ToUnplugIdx = 0;
>> +  while (ToUnplugIdx < ToUnplugCount) {
>> +    APIC_ID    RemoveApicId;
>> +
>> +    RemoveApicId = ToUnplugApicIds[ToUnplugIdx];
>> +
>> +    //
>> +    // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use it to find
>> +    // the ProcessorNum for the APIC ID to be removed.
>> +    //
>> +    for (ProcessorNum = 0;
>> +         ProcessorNum < mCpuHotPlugData->ArrayLength;
>> +         ProcessorNum++) {
>> +      if (mCpuHotPlugData->ApicId[ProcessorNum] == RemoveApicId) {
>> +        break;
>> +      }
>> +    }
>> +
>> +    //
>> +    // Ignore the unplug if APIC ID not found
>> +    //
>> +    if (ProcessorNum == mCpuHotPlugData->ArrayLength) {
>> +      DEBUG ((DEBUG_INFO, "%a: did not find APIC ID " FMT_APIC_ID
>> +          " to unplug\n", __FUNCTION__, RemoveApicId));
> 
> (2) Please use DEBUG_VERBOSE here.
> 
> (I agree that we should have *one* DEBUG_INFO message that relates to
> the removal of an individual processor; however, I think we should emit
> that message when we finally signal QEMU to eject the processor.)

Based on our discussion around establishing the correspondence between
the ProcessorNum, APIC ID and CPU selector, I'll change this to
DEBUG_VERBOSE and add a new DEBUG_INFO print after successfully
putting it in the ApicIdMap.

> 
> 
> (3) Please un-indent ("outdent"?) the second line by two spaces.
> 
> 
>> +      ToUnplugIdx++;
>> +      continue;
>> +    }
>> +
>> +    //
>> +    // Mark ProcessorNum for removal from SMM data structures
>> +    //
>> +    Status = mMmCpuService->RemoveProcessor (mMmCpuService, ProcessorNum);
>> +
> 
> (4) It would be more idiomatic to remove this empty line (between Status
> assignment and check).
> 
> 
>> +    if (EFI_ERROR (Status)) {
>> +      DEBUG ((DEBUG_ERROR, "%a: RemoveProcessor(" FMT_APIC_ID "): %r\n",
>> +        __FUNCTION__, RemoveApicId, Status));
>> +      goto Fatal;
> 
> (5) Please just "return Status" here, and drop the "Fatal" label.
> 
> 
>> +    }
>> +
>> +    ToUnplugIdx++;
>> +  }
>> +
>> +  //
>> +  // We've removed this set of APIC IDs from SMM data structures.
>> +  //
>> +  return EFI_SUCCESS;
>> +
>> +Fatal:
>> +  return Status;
>> +}
>> +
>> +/**
>>     CPU Hotplug MMI handler function.
>>
>>     This is a root MMI handler.
>> @@ -303,6 +385,8 @@ CpuHotplugMmi (
>>
>>     if (PluggedCount > 0) {
>>       Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
>> +  } else if (ToUnplugCount > 0) {
>> +    Status = UnplugCpus (mToUnplugApicIds, ToUnplugCount);
>>     }
>>
>>     if (EFI_ERROR(Status)) {
>>
> 
> (6) Hmm... What's the reason for the exclusivity?
> 
> Why is the following not better:
> 
>    if (PluggedCount > 0) {
>      Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
>      if (EFI_ERROR (Status)) {
>        goto Fatal;
>      }
>    }
>    if (ToUnplugCount > 0) {
>      Status = UnplugCpus (mToUnplugApicIds, ToUnplugCount);
>      if (EFI_ERROR (Status)) {
>        goto Fatal;
>      }
>    }
> 
> QemuCpuhpCollectApicIds() intentionally populates both arrays in a
> single go. As I suggested earlier:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg06711.html
> msgid: <a92b50df-f693-ebda-e549-7bc9e6d2d7a5@redhat.com>
> 
>> [...] please handle plugs first, for which unused slots in
>> mCpuHotPlugData.ApicId will be populated, and *then* handle removals
>> (in the same invocation of CpuHotplugMmi()).
> 
> Did that turn out as unviable (the "same invocation of CpuHotplugMmi()"
> part)?

No, I had some confusion while looking at the underlying AddProcessor() and
RemoveProcessor() logic.

Looking at it again, it should work. Will fix.

Also acking the rest of your comments here.
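
For reference, a self-contained sketch of how the lookup loop might read once
points (4) and (5) are applied: no blank line between the call and its status
check, and a plain return instead of the "Fatal" label. This is not the edk2
code: STATUS, HOT_PLUG_DATA, REMOVE_PROCESSOR, UnplugCpusSketch() and
RecordRemoval() are hypothetical stand-ins for EFI_STATUS, mCpuHotPlugData,
mMmCpuService->RemoveProcessor() and so on, and the outer loop is written as
a "for" only for brevity.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t APIC_ID;
typedef int      STATUS;            /* stand-in for EFI_STATUS */
#define STATUS_SUCCESS 0

/* Stand-in for mCpuHotPlugData: maps ProcessorNum -> APIC ID. */
typedef struct {
  size_t         ArrayLength;
  const APIC_ID *ApicId;
} HOT_PLUG_DATA;

/* Stand-in for the RemoveProcessor() protocol member. */
typedef STATUS (*REMOVE_PROCESSOR)(size_t ProcessorNum);

/* Mock removal callback: records removed processor numbers in a bitmask. */
static uint64_t gRemovedMask;

STATUS RecordRemoval(size_t ProcessorNum)
{
  gRemovedMask |= (uint64_t)1 << ProcessorNum;
  return STATUS_SUCCESS;
}

STATUS UnplugCpusSketch(
  const HOT_PLUG_DATA *Data,
  const APIC_ID       *ToUnplugApicIds,
  uint32_t             ToUnplugCount,
  REMOVE_PROCESSOR     Remove
  )
{
  for (uint32_t Idx = 0; Idx < ToUnplugCount; Idx++) {
    size_t ProcessorNum;

    /* Find the ProcessorNum whose APIC ID matches the one to remove. */
    for (ProcessorNum = 0; ProcessorNum < Data->ArrayLength; ProcessorNum++) {
      if (Data->ApicId[ProcessorNum] == ToUnplugApicIds[Idx]) {
        break;
      }
    }

    /* Unknown APIC ID: skip it silently. */
    if (ProcessorNum == Data->ArrayLength) {
      continue;
    }

    STATUS Status = Remove(ProcessorNum);
    if (Status != STATUS_SUCCESS) {
      return Status;                /* propagate the error directly */
    }
  }

  return STATUS_SUCCESS;
}
```

With a map of {0, 4, 8, 12} and an unplug list of {8, 99, 4}, the sketch
removes processors 2 and 1 and ignores the unknown APIC ID 99.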

Thanks
Ankur

> 
> 
> (7) As a side note, addressing point (6) above would allow you to
> address my point (13) from the v5 patch#1 thread, too; i.e., nesting the
> Status check under (PluggedCount > 0).
> 
> Thanks!
> Laszlo
> 


* Re: [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state
  2021-02-01 13:36   ` Laszlo Ersek
@ 2021-02-03  5:20     ` Ankur Arora
  2021-02-03 20:36       ` Laszlo Ersek
  0 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-02-03  5:20 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-01 5:36 a.m., Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> Init CPU_HOT_EJECT_DATA, which will be used to share CPU ejection state
>> between SmmCpuFeaturesLib (via PiSmmCpuDxeSmm) and CpuHotPlugSmm.
>> CpuHotplugSmm also sets up the CPU ejection mechanism via
>> CPU_HOT_EJECT_DATA->Handler.
>>
>> Additionally, expose CPU_HOT_EJECT_DATA via PcdCpuHotEjectDataAddress.
> 
> (1) Please mention that the logic is added to
> SmmCpuFeaturesSmmRelocationComplete(), and so it will run as part of the
> PiSmmCpuDxeSmm entry point function, PiCpuSmmEntry().
> 
> 
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>   .../SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf        |  3 +
>>   .../Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c  | 78 ++++++++++++++++++++++
>>   2 files changed, 81 insertions(+)
>>
>> diff --git a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
>> index 97a10afb6e27..32c63722ee62 100644
>> --- a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
>> +++ b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
>> @@ -35,4 +35,7 @@ [LibraryClasses]
>>     UefiBootServicesTableLib
>>
>>   [Pcd]
>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugSupport
>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber
>> +  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress
>>     gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase
>> diff --git a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c
>> index 7ef7ed98342e..33dd5da92432 100644
>> --- a/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c
>> +++ b/OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c
>> @@ -14,7 +14,9 @@
>>   #include <Library/PcdLib.h>
>>   #include <Library/SmmCpuFeaturesLib.h>
>>   #include <Library/SmmServicesTableLib.h>
>> +#include <Library/MemoryAllocationLib.h>
> 
> (2) The MemoryAllocationLib class is not listed in the [LibraryClasses]
> section of "OvmfPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf"; so
> please list it there as well.
> 
> (Please keep the [LibraryClasses] section in the INF file sorted, while
> at it.)
> 
> 
>>   #include <Library/UefiBootServicesTableLib.h>
>> +#include <Library/CpuHotEjectData.h>
> 
> (3) This will change, once you move the header file under
> "OvmfPkg/Include/Pcd/"; either way, please keep the #include directives
> alphabetically sorted.
> 
> (Before this patch, the #include list is well-sorted.)
> 
> 
>>   #include <PiSmm.h>
>>   #include <Register/Intel/SmramSaveStateMap.h>
>>   #include <Register/QemuSmramSaveStateMap.h>
>> @@ -171,6 +173,70 @@ SmmCpuFeaturesHookReturnFromSmm (
>>     return OriginalInstructionPointer;
>>   }
>>
>> +GLOBAL_REMOVE_IF_UNREFERENCED
> 
> (4a) This is useless unless building with MSVC; I don't really remember
> introducing any instance of this macro myself, ever. I suggest dropping
> it.

Will drop. As you said, it's a NOP for !MSVC.

Just as a side note, I do see two copies of mCpuHotEjectData in
the PiSmmCpuDxeSmm and CpuHotplugSmm maps (which makes sense, given
that both include SmmCpuFeaturesLib):

.bss.mCpuHotEjectData
0x0000000000017d60        0x8 /tmp/PiSmmCpuDxeSmm.dll.0k4hl8.ltrans1.ltrans.o

.bss.mCpuHotEjectData
0x0000000000005110        0x8 /tmp/CpuHotplugSmm.dll.ixiN9a.ltrans0.ltrans.o

I imagine they do get unified in the build process later, but that's the point
where my understanding stops.

> 
> (4b) On the other hand, STATIC it should be.

Yeah, that it should be.
  
> 
>> +CPU_HOT_EJECT_DATA *mCpuHotEjectData = NULL;
>> +
>> +/**
>> +  Initialize CpuHotEjectData if PcdCpuHotPlugSupport is enabled
>> +  and, if more than 1 CPU is configured.
>> +
>> +  Also sets up the corresponding PcdCpuHotEjectDataAddress.
>> +**/
> 
> (5) typo: s/CpuHotEjectData/mCpuHotEjectData/
> 
> 
> (6) As requested elsewhere under v6, there's no need to make this
> dependent on PcdCpuHotPlugSupport.
> 
> 
> (7) "Initialize" is imperative mood, "sets up" is indicative mood.
> Either one is fine, just be consistent please.

Will fix, and thanks for the very careful reading.

Thanks
Ankur

> 
> 
>> +STATIC
>> +VOID
>> +SmmCpuFeaturesSmmInitHotEject (
> 
> (8) This is a STATIC function (i.e., it has internal linkage); there's
> no need to complicate its name with the "SmmCpuFeatures..." prefix.
> 
> I suggest "InitCpuHotEjectData".
> 
> 
>> +  VOID
>> +  )
>> +{
>> +  UINT32      mMaxNumberOfCpus;
> 
> (9) This is a variable with automatic storage duration, so the "m"
> prefix is invalid.
> 
> 
>> +  EFI_STATUS  Status;
>> +
>> +  if (!FeaturePcdGet (PcdCpuHotPlugSupport)) {
>> +    return;
>> +  }
> 
> (10a) Please drop this, per prior discussion.
> 
> (10b) Please drop the PCD from the INF file too.
> 
> 
> (11) In the rest of this function, the comment style is incorrect in
> several spots. The idiomatic style is:
> 
>    //
>    // Blah.
>    //
> 
> I.e., normally we'd need leading and trailing empty comment lines.
> 
> *However*, most of those comments don't really explain much beyond
> what's emergent from the code anyway, to me anyway, thus, I would simply
> suggest dropping those comments.
> 
> 
>> +
>> +  // PcdCpuHotPlugSupport => PcdCpuMaxLogicalProcessorNumber
>> +  mMaxNumberOfCpus = PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
>> +
>> +  // No spare CPUs to hot-eject
>> +  if (mMaxNumberOfCpus == 1) {
>> +    return;
>> +  }
>> +
>> +  mCpuHotEjectData =
>> +    (CPU_HOT_EJECT_DATA *)AllocatePool (sizeof (*mCpuHotEjectData));
> 
> (12) The cast is superfluous (it only wastes screen real estate), as
> AllocatePool() returns (VOID *).
> 
> (Hopefully this will also let us avoid the somewhat awkward line break.)
> 
> 
>> +  ASSERT (mCpuHotEjectData != NULL);
> 
> (13) Here we need to hang harder than this -- even in a RELEASE build,
> in case AllocatePool() fails. The following should work:
> 
>    if (mCpuHotEjectData == NULL) {
>      ASSERT (mCpuHotEjectData != NULL);
>      CpuDeadLoop ();
>    }
> 
> I'll have another comment on this, below...
> 
> 
>> +
>> +  //
>> +  // Allocate buffer for pointers to array in CPU_HOT_EJECT_DATA.
>> +  //
>> +
>> +  // Revision
>> +  mCpuHotEjectData->Revision = CPU_HOT_EJECT_DATA_REVISION_1;
>> +
>> +  // Array Length of APIC ID
>> +  mCpuHotEjectData->ArrayLength = mMaxNumberOfCpus;
>> +
>> +  // CpuIndex -> APIC ID map
>> +  mCpuHotEjectData->ApicIdMap = (UINT64 *)AllocatePool
>> +      (sizeof (*mCpuHotEjectData->ApicIdMap) * mCpuHotEjectData->ArrayLength);
>> +
>> +  // Hot-eject handler
>> +  mCpuHotEjectData->Handler = NULL;
>> +
>> +  // Reserved
>> +  mCpuHotEjectData->Reserved = 0;
>> +
>> +  ASSERT (mCpuHotEjectData->ApicIdMap != NULL);
>> +
> 
> (14) I would propose the following:
> 
> (14a) Add SafeIntLib to both the #include directive list, and the
> [LibraryClasses] section in the INF file.
> 
> (14b) Use SafeIntLib functions to calculate the cumulative size for both
> CPU_HOT_EJECT_DATA, and the ApicIdMap placed right after it, in a local
> UINTN variable.

So I faintly see why using SafeIntLib might be a good idea (edge cases,
overflow, etc.), but given that the SMRAM region (and the max CPU count) is
much smaller than any overflow case, could you expand on the logic behind
using SafeIntLib?

Either way, will fix.

> 
> (14c) Use a single AllocatePool() call. This simplifies error handling
> -- you'll need just one instance of point (13) above --, plus it might
> even reduce SMRAM fragmentation a tiny bit.

Makes sense.
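
A minimal sketch of what (14b), (14c) and (15) could amount to, in plain C
rather than edk2 idiom. The assumptions: SafeTotalSize() stands in for the
SafeIntLib pair SafeUintnMult() / SafeUintnAdd(), malloc() stands in for
AllocatePool(), and EJECT_DATA / EJECT_INVALID are hypothetical, cut-down
stand-ins for CPU_HOT_EJECT_DATA / CPU_EJECT_INVALID.

```c
#include <stdint.h>
#include <stdlib.h>

#define EJECT_INVALID UINT64_MAX    /* stand-in for CPU_EJECT_INVALID */

/* Cut-down stand-in for CPU_HOT_EJECT_DATA. */
typedef struct {
  uint32_t  ArrayLength;
  uint64_t *ApicIdMap;              /* points just past this header */
} EJECT_DATA;

/* Compute Header + Count * Element; return -1 instead of wrapping around. */
int SafeTotalSize(size_t Header, size_t Count, size_t Element, size_t *Total)
{
  size_t Array;

  if (Element != 0 && Count > SIZE_MAX / Element) {
    return -1;                      /* multiplication would overflow */
  }
  Array = Count * Element;
  if (Array > SIZE_MAX - Header) {
    return -1;                      /* addition would overflow */
  }
  *Total = Header + Array;
  return 0;
}

/* One allocation holds the header and the map; every map entry is set to
   EJECT_INVALID before the structure is handed to anyone else. */
EJECT_DATA *AllocEjectData(uint32_t MaxCpus)
{
  size_t      Size;
  EJECT_DATA *Data;

  if (SafeTotalSize(sizeof *Data, MaxCpus, sizeof(uint64_t), &Size) != 0) {
    return NULL;
  }
  Data = malloc(Size);
  if (Data == NULL) {
    return NULL;                    /* firmware would hang here instead */
  }

  Data->ArrayLength = MaxCpus;
  Data->ApicIdMap   = (uint64_t *)(Data + 1);
  for (uint32_t Idx = 0; Idx < MaxCpus; Idx++) {
    Data->ApicIdMap[Idx] = EJECT_INVALID;
  }
  return Data;
}
```

The checked arithmetic costs two comparisons, so the size calculation does
not have to rely on an argument about how small SMRAM is in practice.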

> 
> 
> (15) The following initialization logic, from patch v6 7/9
> ("OvmfPkg/CpuHotplugSmm: add CpuEject()"), belongs in the present patch,
> in my opinion:
> 
>      //
>      // For CPU ejection we need to map ProcessorNum -> APIC_ID. By the time
>      // we need the mapping, however, the Processor's APIC ID has already been
>      // removed from SMM data structures. So we will maintain a local map
>      // in mCpuHotEjectData->ApicIdMap.
>      //
>      for (Idx = 0; Idx < mCpuHotEjectData->ArrayLength; Idx++) {
>        mCpuHotEjectData->ApicIdMap[Idx] = CPU_EJECT_INVALID;
>      }
> 
> If necessary, feel free to trim or reword the comment; I just think the
> data structure is not ready for publishing via the PCD until the
> "ApicIdMap" elements have been set to INVALID. (IOW, I'm kind of making
> a "RAII" argument.)

The original reason was that I did not want to expose the CPU_EJECT_* values
outside CpuHotplugSmm. That's not really true any more -- given that they are
in a common header.

Will move that logic here.

> 
> 
>> +  //
>> +  // Expose address of CPU Hot eject Data structure
>> +  //
> 
> (this comment is helpful, please keep it)
> 
> 
>> +  Status = PcdSet64S (PcdCpuHotEjectDataAddress,
>> +                      (UINT64)(VOID *)mCpuHotEjectData);
> 
> (16) Incorrect indentation on the second line.
> 
> 
> (17) The (UINT64) cast could trigger a warning in an IA32 build (casting
> between integer and pointer types should keep the width); please replace
> (UINT64) with (UINTN).
> 
> 
>> +  ASSERT_EFI_ERROR (Status);
> 
> (18) Given that we don't use the "Status" variable for anything else in
> this function, it's more idiomatic for "Status" to directly match the
> type returned by PcdSet64S() -- RETURN_STATUS. In such cases, we
> generally call the variable "PcdStatus". So the idea is
> 
>    RETURN_STATUS PcdStatus;
> 
>    PcdStatus = PcdSet64S (...);
>    ASSERT_RETURN_ERROR (PcdStatus);
> 
> RETURN_STATUS and EFI_STATUS behave identically in practice, but (again)
> if we use the status variable *only* for a PcdSet retval, then
> RETURN_STATUS is more elegant.
> 
> (RETURN_STATUS is basically a BASE type, while EFI_STATUS exists in
> connection with the PI and UEFI specs; IOW, RETURN_STATUS is
> semantically more primitive / foundational.)
> 
> The usage of an ASSERT is fine here, BTW; we don't expect this PcdSet
> call to ever fail.
> 
> 
>> +}
>> +
>>   /**
>>     Hook point in normal execution mode that allows the one CPU that was elected
>>     as monarch during System Management Mode initialization to perform additional
>> @@ -188,6 +254,9 @@ SmmCpuFeaturesSmmRelocationComplete (
>>     UINTN      MapPagesBase;
>>     UINTN      MapPagesCount;
>>
>> +
>> +  SmmCpuFeaturesSmmInitHotEject ();
>> +
>>     if (!MemEncryptSevIsEnabled ()) {
>>       return;
>>     }
>> @@ -375,6 +444,15 @@ SmmCpuFeaturesRendezvousExit (
>>     IN UINTN  CpuIndex
>>     )
>>   {
>> +  //
>> +  // CPU Hot-eject not enabled.
>> +  //
>> +  if (mCpuHotEjectData == NULL ||
>> +      mCpuHotEjectData->Handler == NULL) {
>> +    return;
>> +  }
>> +
>> +  mCpuHotEjectData->Handler (CpuIndex);
>>   }
>>
>>   /**
>>
> 
> (19a) Please split the SmmCpuFeaturesRendezvousExit() change to a
> separate patch.
> 
> In particular, "init CPU ejection state" in the subject does not cover
> the SmmCpuFeaturesRendezvousExit() change at all.
> 
> (19b) In the separate patch's commit message, it would be nice to
> mention the *call site* of SmmCpuFeaturesRendezvousExit(), such as "one
> of the last actions in SmiRendezvous()".
> 
> 
> (20) I think we should refine the comment "CPU Hot-eject not enabled".
> That comment covers the (mCpuHotEjectData == NULL) case, yes; but it
> doesn't cover (mCpuHotEjectData->Handler == NULL).
> 
> The latter condition certainly seems valid, because:
> 
> - some SMIs are likely handled before the SMM driver dispatch reaches
>    the CpuHotplugSmm driver, and the latter gets a chance to set up the
>    callback, as a part of erecting the CPU hot-(un)plug support,
> 
> - and even after CpuHotplugSmm is loaded, an unplug request may never
>    happen.
> 
> However, we should document this particular state, with a dedicated
> comment -- perhaps just say, "hot-eject has not been requested yet".

Thanks, these are quite helpful. Will fix.

Ankur

> 
> Thanks!
> Laszlo
> 


* Re: [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection
  2021-02-02 13:23       ` Laszlo Ersek
@ 2021-02-03  5:41         ` Ankur Arora
  0 siblings, 0 replies; 52+ messages in thread
From: Ankur Arora @ 2021-02-03  5:41 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-02 5:23 a.m., Laszlo Ersek wrote:
> On 02/01/21 20:21, Ankur Arora wrote:
>> On 2021-02-01 9:22 a.m., Laszlo Ersek wrote:
>>> On 01/29/21 01:59, Ankur Arora wrote:
>>>> Designate a worker CPU (we use the one executing the root MMI
>>>> handler), which will do the actual ejection via QEMU in CpuEject().
>>>>
>>>> CpuEject(), on the worker CPU, ejects each marked CPU by first
>>>> selecting its APIC ID and then sending the QEMU "eject" command.
>>>> QEMU in-turn signals the remote VCPU thread which context-switches
>>>> it out of the SMI.
>>>>
>>>> CpuEject(), on the CPU being ejected, spins around in its holding
>>>> area until this final context-switch. This does mean that there is
>>>> some CPU state that would ordinarily be restored (in SmiRendezvous()
>>>> and in SmiEntry.nasm::CommonHandler), but will not be anymore.
>>>> This unrestored state includes FPU state, CET enable, stuffing of
>>>> RSB and the final RSM. Since the CPU state is destroyed by QEMU,
>>>> this should be okay.
>>>>
>>>> Cc: Laszlo Ersek <lersek@redhat.com>
>>>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>>>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>>>> Cc: Igor Mammedov <imammedo@redhat.com>
>>>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>> Cc: Aaron Young <aaron.young@oracle.com>
>>>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>>>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>>> ---
>>>>    OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 73
>>>> ++++++++++++++++++++++++++++++++++----
>>>>    1 file changed, 67 insertions(+), 6 deletions(-)
>>>
>>> (1) s/CpuEject/EjectCpu/g, per previous request (affects commit message
>>> and code too).
>>>
>>>
>>>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>>> b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>>> index 526f51faf070..bf91344eef9c 100644
>>>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>>>> @@ -193,9 +193,12 @@ RevokeNewSlot:
>>>>      CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
>>>>      on each CPU at exit from SMM.
>>>>
>>>> -  If, the executing CPU is not being ejected, nothing to be done.
>>>> +  If, the executing CPU is neither a worker, nor being ejected, nothing
>>>> +  to be done.
>>>>      If, the executing CPU is being ejected, wait in a CpuDeadLoop()
>>>>      until ejected.
>>>> +  If, the executing CPU is a worker CPU, set QEMU CPU status to eject
>>>> +  for CPUs being ejected.
>>>>
>>>>      @param[in] ProcessorNum      Index of executing CPU.
>>>>
>>>> @@ -217,6 +220,56 @@ CpuEject (
>>>>        return;
>>>>      }
>>>>
>>>> +  if (ApicId == CPU_EJECT_WORKER) {
>>>
>>> (2) The CPU_EJECT_WORKER approach is needlessly complicated (speculative
>>> generality). I wish I understood this idea earlier in the patch set.
>>>
>>> (2a) In patch #5 (subject "OvmfPkg/CpuHotplugSmm: define
>>> CPU_HOT_EJECT_DATA"), the CPU_EJECT_WORKER macro definition should be
>>> dropped.
>>>
>>> (2b) In this patch, the question whether the executing CPU is the BSP or
>>> not, should be decided with the same logic that is visible in
>>> PlatformSmmBspElection()
>>> [OvmfPkg/Library/SmmCpuPlatformHookLibQemu/SmmCpuPlatformHookLibQemu.c]:
>>>
>>>     MSR_IA32_APIC_BASE_REGISTER ApicBaseMsr;
>>>     BOOLEAN                     IsBsp;
>>>
>>>     ApicBaseMsr.Uint64 = AsmReadMsr64 (MSR_IA32_APIC_BASE);
>>>     IsBsp = (BOOLEAN)(ApicBaseMsr.Bits.BSP == 1);
>>>
>>> (2c) Point (2b) obviates the explicit "mark as worker" logic entirely,
>>> in UnplugCpus() below.
>>>
>>> (2d) The "is worker" language (in comments etc) should be replaced with
>>> direct "is BSP" language.
>>>
>>>
>>>> +    UINT32 CpuIndex;
>>>> +
>>>> +    for (CpuIndex = 0; CpuIndex < mCpuHotEjectData->ArrayLength;
>>>> CpuIndex++) {
>>>> +      UINT64 RemoveApicId;
>>>> +
>>>> +      RemoveApicId = mCpuHotEjectData->ApicIdMap[CpuIndex];
>>>> +
>>>> +      if ((RemoveApicId != CPU_EJECT_INVALID &&
>>>> +           RemoveApicId != CPU_EJECT_WORKER)) {
>>>> +        //
>>>> +        // This to-be-ejected-CPU has already received the BSP's SMI
>>>> exit
>>>> +        // signal and, will execute SmmCpuFeaturesSmiRendezvousExit()
>>>> +        // followed by this callback or is already waiting in the
>>>> +        // CpuDeadLoop() below.
>>>> +        //
>>>> +        // Tell QEMU to context-switch it out.
>>>> +        //
>>>> +        QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId);
>>>> +        QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED);
>>>
>>> (3) While the QEMU CPU selector value *usually* matches the APIC ID,
>>> it's not an invariant. APIC IDs have an internal structure, composed of
>>> bit-fields, where each bit-field accommodates one hierarchy level in the
>>> CPU topology (thread, core, die (maybe), and socket).
>>>
>>> However, this mapping need not be surjective. QEMU lets you create
>>> "pathological" CPU topologies, for example one with:
>>> - 3 threads/core,
>>> - 5 cores/socket,
>>> - (say) 2 sockets.
>>>
>>> Under that example, the bit-field standing for the "thread number" level
>>> would have 2 bits, theoretically permitting *4* threads/core, and the
>>> bit-field standing for the "core number" level would have 3 bits,
>>> theoretically allowing for *8* cores/socket.
>>>
>>> Considering the fully populated topology, you'd see the CPU selector
>>> range from 0 to (3*5*2-1)=29, inclusive (corresponding to 30 logical
>>> processors in total). However, the APIC ID *image* of this CPU selector
>>> *domain* would not be "contiguous" -- the APIC ID space, with the
>>> above-described structure, would accommodate 4*8*2=64 logical
>>> processors. For example, each APIC ID that stood for the nonexistent
>>> "thread#3" on a particular core would be left unused (no CPU selector
>>> would map to it).
>>>
>>> All in all, you can't write the APIC ID to the CPU selector register,
>>> for ejection. You need to select the CPU whose APIC ID is the APIC ID
>>> you want to eject, and then initiate ejection.
>>
>> Yeah, this is a clear bug. Should have seen it earlier. Thanks for
>> pointing it out.
>>
>>>
>>> This requires one of two alternatives:
>>>
>>>
>>> (3a) The first option is to keep the change local to this patch.
>>>
>>> This alternative is the more CPU-hungry (and uglier) one.
>>>
>>> The idea is to perform a QEMU_CPUHP_CMD_GET_ARCH_ID loop over all
>>> possible CPUs, somewhat similarly to QemuCpuhpCollectApicIds(). At every
>>> CPU, knowing the APIC ID, try to find the APIC ID in "ApicIdMap". If
>>> there is a match, eject.
>>>
>>>
>>> (3b) The second option is much more elegant (and it's faster too), but
>>> it requires a much more intrusive update to the patch set.
>>>
>>> First, the *element type* of the arrays that QemuCpuhpCollectApicIds()
>>> operates on, has to be changed from APIC_ID to a structure type that
>>> pairs APIC_ID with the QEMU CPU selector. [*]
>>>
>>> Second, whenever QemuCpuhpCollectApicIds() outputs an APIC_ID, it should
>>> also save the "CurrentSelector" value (in the other field of the output
>>> array element structure).
>>>
>>> Third, the element type of CPU_HOT_EJECT_DATA.ApicIdMap should be
>>> replaced with a structure type similar (or identical) to the one
>>> described at [*]. The ProcessorNumber lookup in UnplugCpus() would still
>>> be based upon the APIC ID, but CPU_HOT_EJECT_DATA should remember both
>>> the QEMU selector for that processor, and the APIC ID.
>>>
>>> Fourth, the actual ejection should use the selector.
>>>
>>> Fifth, the debug message (below) should continue logging the APIC ID, to
>>> mirror the DEBUG_INFO message in ProcessHotAddedCpus().
>>>
>>>
>>> Would you be willing to implement (3b)?
>>
>> 3b is clearly the better solution. However, is there enough value in
>> the print message containing APIC ID, that CPU_HOT_EJECT_DATA.ApicIdMap
>> carry both the cpu-selector and APIC ID?
>>
>> As you say, the ejection itself just needs the ProcessorNum -> QEMU
>> cpu-selector mapping.
> 
> Good question, and I'm torn.
> 
> The default DEBUG build of OVMF enables INFO messages, and more severe
> ones. It does not enable VERBOSE messages.
> 
> In such a DEBUG build (i.e., with otherwise default settings), the log
> entries that relate to hot-plug do not report selectors. Conversely, on
> hot-unplug, the log would report selectors *only* (not APIC IDs). I feel
> this makes it more difficult to get useful OVMF debug logs in bug
> reports, especially from users of distro-provided OVMF builds.
> 
> If we can ask the user to rebuild with VERBOSE enabled and re-run the
> test, that's always great, but frequently users don't know how to
> rebuild OVMF, plus usually OVMF is hidden so deep in their virt stack
> that they have no idea how to make that stack use a custom OVMF build.
> 
> ... How about this: in UnplugCpus(), we could emit an INFO message about
> the ProcessorNumber <-> APIC ID <-> Selector correspondence, and when
> the eject happens, it would suffice to log (at INFO level) only
> ProcessorNumber <-> Selector.

This makes sense to me. Will do.
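
To make the selector vs. APIC ID distinction from point (3) concrete, here is
a small sketch of the 3 threads/core, 5 cores/socket, 2 sockets example. It
assumes that selectors enumerate threads first, then cores, then sockets, and
that each APIC ID bit-field is just wide enough for its level; TOPOLOGY,
SelectorToApicId() and ApicIdToSelector() are hypothetical names. Real
firmware would ask QEMU (QEMU_CPUHP_CMD_GET_ARCH_ID) rather than compute
this.

```c
#include <stdint.h>

typedef struct {
  uint32_t Threads;                 /* threads per core   */
  uint32_t Cores;                   /* cores per socket   */
  uint32_t Sockets;                 /* sockets in total   */
} TOPOLOGY;

/* Number of bits needed to encode the values 0 .. Count-1. */
static uint32_t BitWidth(uint32_t Count)
{
  uint32_t Width = 0;

  while ((1u << Width) < Count) {
    Width++;
  }
  return Width;
}

/* Map a contiguous selector onto the bit-field structured APIC ID. */
uint32_t SelectorToApicId(const TOPOLOGY *T, uint32_t Selector)
{
  uint32_t Thread     = Selector % T->Threads;
  uint32_t Core       = (Selector / T->Threads) % T->Cores;
  uint32_t Socket     = Selector / (T->Threads * T->Cores);
  uint32_t ThreadBits = BitWidth(T->Threads);
  uint32_t CoreBits   = BitWidth(T->Cores);

  return (Socket << (CoreBits + ThreadBits)) | (Core << ThreadBits) | Thread;
}

/* Option (3a)-style reverse lookup: scan all selectors for a given APIC ID.
   Returns -1 if no selector maps to it. */
int64_t ApicIdToSelector(const TOPOLOGY *T, uint32_t ApicId)
{
  uint32_t Count = T->Threads * T->Cores * T->Sockets;

  for (uint32_t Selector = 0; Selector < Count; Selector++) {
    if (SelectorToApicId(T, Selector) == ApicId) {
      return Selector;
    }
  }
  return -1;
}
```

Under these assumptions selector 3 maps to APIC ID 4, APIC ID 3 (the
nonexistent "thread#3" of core 0) has no selector at all, and selector 29
maps to APIC ID 50. Storing the selector next to the APIC ID, as in (3b),
avoids the reverse scan altogether.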

Thanks
Ankur

> 
> Thanks,
> Laszlo
> 


* Re: [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug
  2021-02-01 17:37   ` Laszlo Ersek
  2021-02-01 17:40     ` Laszlo Ersek
@ 2021-02-03  5:46     ` Ankur Arora
  2021-02-03 20:45       ` Laszlo Ersek
  1 sibling, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-02-03  5:46 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-01 9:37 a.m., Laszlo Ersek wrote:
> On 01/29/21 01:59, Ankur Arora wrote:
>> As part of the negotiation treat ICH9_LPC_SMI_F_CPU_HOT_UNPLUG as a
>> subfeature of feature flag ICH9_LPC_SMI_F_CPU_HOTPLUG, so enable it
>> only if the other is also being negotiated.
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>   OvmfPkg/SmmControl2Dxe/SmiFeatures.c | 25 ++++++++++++++++++++++---
>>   1 file changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/OvmfPkg/SmmControl2Dxe/SmiFeatures.c b/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
>> index c9d875543205..e70f3f8b58cb 100644
>> --- a/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
>> +++ b/OvmfPkg/SmmControl2Dxe/SmiFeatures.c
>> @@ -29,6 +29,13 @@
>>   //
>>   #define ICH9_LPC_SMI_F_CPU_HOTPLUG BIT1
>>   
>> +// The following bit value stands for "enable CPU hot unplug, and inject an SMI
> 
> (1) s/hot unplug/hot-unplug/
> 
> 
>> +// with control value ICH9_APM_CNT_CPU_HOT_UNPLUG upon hot unplug", in the
> 
> (2) There is no such thing as ICH9_APM_CNT_CPU_HOT_UNPLUG; we use the
> same SMI command value ICH9_APM_CNT_CPU_HOTPLUG (= 4) for unplug.
> 
> In QEMU, the macro is called OVMF_CPUHP_SMI_CMD.
> 
> 
> (3) s/hot unplug/hot-unplug/.
> 
> 
>> +// "etc/smi/supported-features" and "etc/smi/requested-features" fw_cfg files.
>> +// Is only negotiated alongside ICH9_LPC_SMI_F_CPU_HOTPLUG.
> 
> (4) Please drop the last sentence (see more on it below).
> 
> 
>> +//
>> +#define ICH9_LPC_SMI_F_CPU_HOT_UNPLUG BIT2
>> +
>>   //
>>   // Provides a scratch buffer (allocated in EfiReservedMemoryType type memory)
>>   // for the S3 boot script fragment to write to and read from.
>> @@ -112,7 +119,8 @@ NegotiateSmiFeatures (
>>     QemuFwCfgReadBytes (sizeof mSmiFeatures, &mSmiFeatures);
>>   
>>     //
>> -  // We want broadcast SMI, SMI on CPU hotplug, and nothing else.
>> +  // We want broadcast SMI, SMI on CPU hotplug, on CPU hot-unplug
>> +  // and nothing else.
>>     //
>>     RequestedFeaturesMask = ICH9_LPC_SMI_F_BROADCAST;
>>     if (!MemEncryptSevIsEnabled ()) {
> 
> (5) Please spell out the full expression "SMI on CPU hot-unplug".
> 
> 
>> @@ -120,8 +128,18 @@ NegotiateSmiFeatures (
>>       // For now, we only support hotplug with SEV disabled.
>>       //
>>       RequestedFeaturesMask |= ICH9_LPC_SMI_F_CPU_HOTPLUG;
>> +    RequestedFeaturesMask |= ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
>>     }
>>     mSmiFeatures &= RequestedFeaturesMask;
>> +
>> +  if (!(mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOTPLUG) &&
>> +      (mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOT_UNPLUG)) {
>> +    DEBUG ((DEBUG_WARN, "%a CPU host-features %Lx, requested mask %Lx\n",
>> +      __FUNCTION__, mSmiFeatures, RequestedFeaturesMask));
>> +
>> +    mSmiFeatures &= ~ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
>> +  }
>> +
>>     QemuFwCfgSelectItem (mRequestedFeaturesItem);
>>     QemuFwCfgWriteBytes (sizeof mSmiFeatures, &mSmiFeatures);
>>   
> 
> (6) Please drop this hunk. We don't try to be smarter than QEMU, in
> general, whenever we perform feature negotiation.

Also, AFAICS, we will do the hotplug (and now hot-unplug) even if it wasn't
negotiated?

> 
> For example, the pre-patch code doesn't attempt to notice if QEMU
> acknowledges ICH9_LPC_SMI_F_CPU_HOTPLUG but not ICH9_LPC_SMI_F_BROADCAST.
> 
> 
>> @@ -162,8 +180,9 @@ NegotiateSmiFeatures (
>>     if ((mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOTPLUG) == 0) {
>>       DEBUG ((DEBUG_INFO, "%a: CPU hotplug not negotiated\n", __FUNCTION__));
>>     } else {
>> -    DEBUG ((DEBUG_INFO, "%a: CPU hotplug with SMI negotiated\n",
>> -      __FUNCTION__));
>> +    DEBUG ((DEBUG_INFO, "%a: CPU hotplug%s with SMI negotiated\n",
>> +      __FUNCTION__,
>> +      (mSmiFeatures & ICH9_LPC_SMI_F_CPU_HOT_UNPLUG) ? ", unplug" : ""));
>>     }
>>   
>>     //
>>
> 
> (7) Rather than combining these two in a common debug message, please
> just add a separate "if" that follows the whole pattern seen with
> ICH9_LPC_SMI_F_CPU_HOTPLUG. Thus, for each feature bit we care about,
> we'll have a dedicated log message, saying yes or no.

Acking this set (and the ones down-thread.) Will fix.
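
For reference, review points (6) and (7) taken together could look
roughly like the sketch below. This is plain C, not edk2 code: printf()
stands in for DEBUG(), the helper names NegotiateMask() and
ReportFeature() are made up for illustration, and the BIT0/BIT1 values
for broadcast SMI and CPU hotplug are assumptions (only the BIT2 value
for hot-unplug appears in the patch above).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Only the hot-unplug value (BIT2) comes from the patch under review;
   the other two bit positions are assumptions made for this sketch. */
#define ICH9_LPC_SMI_F_BROADCAST      (1ULL << 0)
#define ICH9_LPC_SMI_F_CPU_HOTPLUG    (1ULL << 1)
#define ICH9_LPC_SMI_F_CPU_HOT_UNPLUG (1ULL << 2)

/* Build the request mask and keep only what the host also offers.
   No attempt is made to second-guess the host, per review point (6). */
static uint64_t
NegotiateMask (uint64_t HostFeatures, int SevEnabled)
{
  uint64_t Requested = ICH9_LPC_SMI_F_BROADCAST;

  if (!SevEnabled) {
    Requested |= ICH9_LPC_SMI_F_CPU_HOTPLUG;
    Requested |= ICH9_LPC_SMI_F_CPU_HOT_UNPLUG;
  }
  return HostFeatures & Requested;
}

/* Hypothetical helper: one dedicated yes/no message per feature bit,
   per review point (7). Returns 1 if the bit was negotiated. */
static int
ReportFeature (uint64_t Negotiated, uint64_t Bit, const char *Name)
{
  int Yes;

  assert (Name != NULL);
  Yes = (Negotiated & Bit) != 0;
  printf ("%s: %s %s\n", "NegotiateSmiFeatures", Name,
          Yes ? "with SMI negotiated" : "not negotiated");
  return Yes;
}
```

Calling ReportFeature() once for ICH9_LPC_SMI_F_CPU_HOTPLUG and once
for ICH9_LPC_SMI_F_CPU_HOT_UNPLUG gives the two independent log lines.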

Thanks
Ankur

> 
> Thanks!
> Laszlo
> 


* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-02 14:00       ` Laszlo Ersek
  2021-02-02 14:15         ` Laszlo Ersek
@ 2021-02-03  6:13         ` Ankur Arora
  2021-02-03 20:55           ` Laszlo Ersek
  1 sibling, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-02-03  6:13 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-02 6:00 a.m., Laszlo Ersek wrote:
> On 02/01/21 21:12, Ankur Arora wrote:
>> On 2021-02-01 11:08 a.m., Laszlo Ersek wrote:
>>> apologies, I've got more comments here:
>>>
>>> On 01/29/21 01:59, Ankur Arora wrote:
>>>
>>>>    /**
>>>> +  CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(),
>>>> +  on each CPU at exit from SMM.
>>>> +
>>>> +  If, the executing CPU is not being ejected, nothing to be done.
>>>> +  If, the executing CPU is being ejected, wait in a CpuDeadLoop()
>>>> +  until ejected.
>>>> +
>>>> +  @param[in] ProcessorNum      Index of executing CPU.
>>>> +
>>>> +**/
>>>> +VOID
>>>> +EFIAPI
>>>> +CpuEject (
>>>> +  IN UINTN ProcessorNum
>>>> +  )
>>>> +{
>>>> +  //
>>>> +  // APIC ID is UINT32, but mCpuHotEjectData->ApicIdMap[] is UINT64
>>>> +  // so use UINT64 throughout.
>>>> +  //
>>>> +  UINT64 ApicId;
>>>> +
>>>> +  ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
>>>> +  if (ApicId == CPU_EJECT_INVALID) {
>>>> +    return;
>>>> +  }
>>>> +
>>>> +  //
>>>> +  // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit()
>>>> +  // after having been cleared to exit the SMI by the monarch and thus have
>>>> +  // no SMM processing remaining.
>>>> +  //
>>>> +  // Given that we cannot allow them to escape to the guest, we pen them
>>>> +  // here until the SMM monarch tells the HW to unplug them.
>>>> +  //
>>>> +  CpuDeadLoop ();
>>>> +}
>>>
>>> (15) There is no such function as SmmCpuFeaturesSmiRendezvousExit() --
>>> it's SmmCpuFeaturesRendezvousExit().
>>>
>>> (16) This function uses a data structure for communication between BSP
>>> and APs -- mCpuHotEjectData->ApicIdMap is modified in UnplugCpus() on
>>> the BSP, and checked above by the APs (too).
>>>
>>> What guarantees the visibility of mCpuHotEjectData->ApicIdMap?
>>
>> I was banking on SmiRendezvous() explicitly signalling that all
>> processing on the BSP was done before any AP will look at
>> mCpuHotEjectData in SmmCpuFeaturesRendezvousExit().
>>
>> 1716     //
>> 1717     // Wait for BSP's signal to exit SMI
>> 1718     //
>> 1719     while (*mSmmMpSyncData->AllCpusInSync) {
>> 1720       CpuPause ();
>> 1721     }
>> 1722   }
>> 1723
>> 1724 Exit:
>> 1725   SmmCpuFeaturesRendezvousExit (CpuIndex);
> 
> Right; it's a general pattern in edk2: volatile UINT8 (aka BOOLEAN)
> objects are considered atomic. (See
> SMM_DISPATCHER_MP_SYNC_DATA.AllCpusInSync -- it's a pointer to a
> volatile BOOLEAN.)
> 
> But our UINT64 values are neither volatile nor UINT8, and I got suddenly
> doubtful about "AllCpusInSync" working as a multiprocessor barrier.
> 
> (I could be unjustifiably worried, as a bunch of other fields in
> SMM_DISPATCHER_MP_SYNC_DATA are volatile, wider than UINT8, and *not*
> accessed with InterlockedCompareExchangeXx().)

Thanks for pointing me to this code. There's a curious comment in the
declaration here about making this structure uncacheable (though I
couldn't figure out how that is done):

418 typedef struct {
419   //
420   // Pointer to an array. The array should be located immediately after this structure
421   // so that UC cache-ability can be set together.
422   //
423   SMM_CPU_DATA_BLOCK            *CpuData;
424   volatile UINT32               *Counter;
425   volatile UINT32               BspIndex;
426   volatile BOOLEAN              *InsideSmm;
427   volatile BOOLEAN              *AllCpusInSync;
428   volatile SMM_CPU_SYNC_MODE    EffectiveSyncMode;
429   volatile BOOLEAN              SwitchBsp;
430   volatile BOOLEAN              *CandidateBsp;
431   EFI_AP_PROCEDURE              StartupProcedure;
432   VOID                          *StartupProcArgs;
433 } SMM_DISPATCHER_MP_SYNC_DATA;

Also, is there an expectation that these fields (at least some of
them) switch over when a new leader is chosen?

Otherwise I'm not sure why for instance, AllCpusInSync would be
a pointer.

> 
> 
>>
>>>
>>> I think we might want to use InterlockedCompareExchange64() in both
>>> CpuEject() and UnplugCpus() (and make "ApicIdMap" volatile, in
>>> addition). InterlockedCompareExchange64() can be used just for
>>> comparison as well, by passing ExchangeValue=CompareValue.
>>
>>
>> Speaking specifically about the ApicIdMap, I'm not sure I fully
>> agree (assuming my comment just above is correct.)
>>
>>
>> The only AP (reader) ApicIdMap deref is here:
>>
>> CpuEject():
>> 218   ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
>>
>> For the to-be-ejected-AP, this value can only move from
>>     valid-APIC-ID (=> wait in CpuDeadLoop()) -> CPU_EJECT_INVALID.
>>
>> Given that, by the time the worker does the write on line 254, this
>> AP is guaranteed to be dead already, I don't think there's any
>> scenario where the to-be-ejected-AP can see anything other than
>> a valid-APIC-ID.
> 
> The scenario I had in mind was different: what guarantees that the
> effect of
> 
>     375        mCpuHotEjectData->ApicIdMap[ProcessorNum] = (UINT64)RemoveApicId;
> 
> which is performed by the BSP in UnplugCpus(), is visible by the AP on
> line 218 (see your quote above)?
> 
> What if the AP gets to line 218 before the BSP's write on line 375
> *propagates* sufficiently?

I understand. That does make sense. And, as you said elsewhere, a real
memory fence would come in useful here.

We could use AsmCpuid() as a poor man's mfence, but that seems overkill
given that x86 at least guarantees store-order.
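
To make the compiler-barrier versus real-fence distinction concrete,
here is a small sketch in standard C11, not edk2 code. On GCC and Clang,
atomic_signal_fence() in practice only constrains the compiler, which is
the role the quoted worker code gives MemoryFence(); atomic_thread_fence()
may additionally emit a fence instruction. All names below (mSlot,
SLOT_INVALID, the helper functions) are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SLOT_INVALID UINT64_MAX        /* stand-in for CPU_EJECT_INVALID */

static uint64_t mSlot = SLOT_INVALID;  /* stand-in for one ApicIdMap[] entry */

/* Stops the compiler from moving memory accesses across this point;
   emits no instruction. */
static void CompilerBarrier (void)
{
  atomic_signal_fence (memory_order_seq_cst);
}

/* Also orders the accesses as seen by other processors; on x86 this
   typically costs a fence or locked instruction. */
static void HardwareFence (void)
{
  atomic_thread_fence (memory_order_seq_cst);
}

/* Store that the compiler may not sink past the barrier. */
static void StoreSlotCompilerOrdered (uint64_t ApicId)
{
  assert (ApicId != SLOT_INVALID);
  mSlot = ApicId;
  CompilerBarrier ();
}

/* Store followed by a full hardware fence. */
static void StoreSlotFenced (uint64_t ApicId)
{
  assert (ApicId != SLOT_INVALID);
  mSlot = ApicId;
  HardwareFence ();
}

/* Load that the compiler may not hoist above the barrier. */
static uint64_t LoadSlot (void)
{
  CompilerBarrier ();
  return mSlot;
}
```

Single-threaded, all three helpers behave identically; the difference
only matters for what another processor is guaranteed to observe.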


Ankur

> 
> There's no question that the BSP writes before the AP reads, but I'm
> uncertain if that suffices for the *effect* of the write to be visible
> to the AP. My concern is not whether the AP sees a partial vs. a settled
> update; my concern is if the AP could see an entirely *stale* value.
> 
> The consequence of that problem would be that an AP that the BSP were
> about to eject would return from CpuEject() to
> SmmCpuFeaturesRendezvousExit() to SmiRendezvous().
> 
> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
> array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
> combination with the sync-up point that you quoted. This seems to match
> existing practice in PiSmmCpuDxeSmm -- there are no concurrent accesses,
> so atomicity is not a concern, and serializing the instruction streams
> coarsely, with the sync-up, in combination with volatile accesses,
> should presumably guarantee visibility (on x86 anyway).
> 
> Thanks
> Laszlo
> 
> 
>>
>> 241         QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId);
>> 242         QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED);
>> 243
>> 244         //
>> 245         // Compiler barrier to ensure the next store isn't reordered
>> 246         //
>> 247         MemoryFence ();
>> 248
>> 249         //
>> 250         // Clear the eject status for CpuIndex to ensure that an invalid
>> 251         // SMI later does not end up trying to eject it or a newly
>> 252         // hotplugged CpuIndex does not go into the dead loop.
>> 253         //
>> 254         mCpuHotEjectData->ApicIdMap[CpuIndex] = CPU_EJECT_INVALID;
>>
>> For APs that are not being ejected, they will always see CPU_EJECT_INVALID
>> since the writer never changes that.
>>
>> The one scenario in which bad things could happen is if entries in the
>> ApicIdMap are unaligned (or if the compiler or cpu-arch tears aligned
>> writes).
>>
>>>
>>> (17) I think a similar observation applies to the "Handler" field too,
>>> as APs call it, while the BSP keeps flipping it between NULL and a real
>>> function address. We might have to turn that field into an
>>  From a real function address, to NULL is the problem part right?
>>
>> (Same argument as above for the transition in UnplugCpus() from
>> NULL -> function-address.)
>>
>>
>>> EFI_PHYSICAL_ADDRESS (just a fancy name for UINT64), and use
>>> InterlockedCompareExchange64() again.
>>
>> AFAICS, these are the problematic derefs:
>>
>> SmmCpuFeaturesRendezvousExit():
>>
>> 450   if (mCpuHotEjectData == NULL ||
>> 451       mCpuHotEjectData->Handler == NULL) {
>> 452     return;
>>
>> and problematic assignments:
>>
>> 266     //
>> 267     // We are done until the next hot-unplug; clear the handler.
>> 268     //
>> 269     mCpuHotEjectData->Handler = NULL;
>> 270     return;
>> 271   }
>>
>> Here as well, I've been banking on aligned writes such that the APs would
>> only see the before or after value not an intermediate value.
>>
>> Thanks
>> Ankur
>>
>>>
>>> Thanks
>>> Laszlo
>>>
>>
> 


* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-02 14:15         ` Laszlo Ersek
@ 2021-02-03  6:45           ` Ankur Arora
  2021-02-03 20:58             ` Laszlo Ersek
  0 siblings, 1 reply; 52+ messages in thread
From: Ankur Arora @ 2021-02-03  6:45 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-02 6:15 a.m., Laszlo Ersek wrote:
> On 02/02/21 15:00, Laszlo Ersek wrote:
> 
>> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
>> array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
>> combination with the sync-up point that you quoted. This seems to match
>> existing practice in PiSmmCpuDxeSmm -- there are no concurrent accesses,
>> so atomicity is not a concern, and serializing the instruction streams
>> coarsely, with the sync-up, in combination with volatile accesses,
>> should presumably guarantee visibility (on x86 anyway).
> 
> To summarize, this is what I would ask for:
> 
> - make CPU_HOT_EJECT_DATA volatile
> 
> - make (*CPU_HOT_EJECT_DATA.ApicIdMap) volatile
> 
> - after storing something to CPU_HOT_EJECT_DATA or
> CPU_HOT_EJECT_DATA.ApicIdMap on the BSP, execute a MemoryFence()
> 
> - before fetching something from CPU_HOT_EJECT_DATA or
> CPU_HOT_EJECT_DATA.ApicIdMap on an AP, execute a MemoryFence()
> 
> 
> Except: MemoryFence() isn't a *memory fence* in fact.
> 
> See "MdePkg/Library/BaseLib/X64/GccInline.c".
> 
> It's just a compiler barrier, which may not add anything beyond what
> we'd already have from "volatile".
> 
> Case in point: PiSmmCpuDxeSmm performs heavy multi-processing, but does
> not contain a single invocation of MemoryFence(). It uses volatile
> objects, and a handful of InterlockedCompareExchangeXx() calls, for
> implementing semaphores. (NB: there is no 8-bit variant of
> InterlockedCompareExchange(), as "volatile UINT8" is considered atomic
> in itself, and a suitable basis for a semaphore too.) And given the
> synchronization from those semaphores, PiSmmCpuDxeSmm trusts that
> updates to the *other* volatile objects are both atomic and visible.
> 
> I'm pretty sure this only works because x86 is in-order. There are
> instruction stream barriers in place, and compiler barriers too, but no
> actual memory barriers.

Right and just to separate them explicitly, there are two problems:

  - compiler: where the compiler caches values or looks at stale memory
locations. Now, AFAICS, this should not happen because the ApicIdMap would
never change once set so the compiler should reasonably be able to cache
the address of ApicIdMap and dereference it (thus obviating the need for
volatile.)
The compiler could, however, cache any assignments to ApicIdMap[Idx]
(especially given LTO) and we could use a MemoryFence() (as the compiler
barrier that it is) to force the store.

  - CPU pipeline: as you said, right now we basically depend on x86 store
order semantics (and the CpuPause() loop around AllCpusInSync, kinda provides
a barrier.)

So the BSP writes in this order:
ApicIdMap[Idx]=x; ... ->AllCpusInSync = FALSE

And whenever the AP sees ->AllCpusInSync == FALSE, it has to now see
ApicIdMap[Idx] == x.

Maybe the thing to do is to detail this barrier in a commit note/comment?
And add the MemoryFence() but not the volatile.


Thanks
Ankur


> 
> Now the question is whether we have managed to *sufficiently* imitate
> these patterns from PiSmmCpuDxeSmm, in this patch set.
> 
> Making stuff volatile, and relying on the existent sync-up point, might
> suffice.
> 
> Thanks
> Laszlo
> 


* Re: [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus()
  2021-02-03  4:28     ` Ankur Arora
@ 2021-02-03 19:20       ` Laszlo Ersek
  0 siblings, 0 replies; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-03 19:20 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/03/21 05:28, Ankur Arora wrote:
> On 2021-01-31 7:13 p.m., Laszlo Ersek wrote:
>> On 01/29/21 01:59, Ankur Arora wrote:

>>> +**/
>>> +STATIC
>>> +EFI_STATUS
>>> +UnplugCpus (
>>> +  IN APIC_ID                      *ToUnplugApicIds,
>>> +  IN UINT32                       ToUnplugCount
>>> +  )
>>> +{
>>> +  EFI_STATUS Status;
>>> +  UINT32     ToUnplugIdx;
>>> +  UINTN      ProcessorNum;
>>> +
>>> +  ToUnplugIdx = 0;
>>> +  while (ToUnplugIdx < ToUnplugCount) {
>>> +    APIC_ID    RemoveApicId;
>>> +
>>> +    RemoveApicId = ToUnplugApicIds[ToUnplugIdx];
>>> +
>>> +    //
>>> +    // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use it to find
>>> +    // the ProcessorNum for the APIC ID to be removed.
>>> +    //
>>> +    for (ProcessorNum = 0;
>>> +         ProcessorNum < mCpuHotPlugData->ArrayLength;
>>> +         ProcessorNum++) {
>>> +      if (mCpuHotPlugData->ApicId[ProcessorNum] == RemoveApicId) {
>>> +        break;
>>> +      }
>>> +    }
>>> +
>>> +    //
>>> +    // Ignore the unplug if APIC ID not found
>>> +    //
>>> +    if (ProcessorNum == mCpuHotPlugData->ArrayLength) {
>>> +      DEBUG ((DEBUG_INFO, "%a: did not find APIC ID " FMT_APIC_ID
>>> +          " to unplug\n", __FUNCTION__, RemoveApicId));
>>
>> (2) Please use DEBUG_VERBOSE here.
>>
>> (I agree that we should have *one* DEBUG_INFO message that relates to
>> the removal of an individual processor; however, I think we should emit
>> that message when we finally signal QEMU to eject the processor.)
> 
> Based on our discussion around establishing the correspondence between
> the ProcessorNum, APIC ID and CPU selector, I'll change this to
> DEBUG_VERBOSE and add a new DEBUG_INFO print after successfully
> putting it in the ApicIdMap.

Thanks!

[...]

Laszlo



* Re: [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state
  2021-02-03  5:20     ` Ankur Arora
@ 2021-02-03 20:36       ` Laszlo Ersek
  2021-02-04  2:58         ` Ankur Arora
  0 siblings, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-03 20:36 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/03/21 06:20, Ankur Arora wrote:

> Just as a sidenote, I do see two copies of the mCpuHotEjectData in
> the PiSmmCpuDxeSmm and CpuHotplugSmm maps (which makes sense, given
> that both include SmmCpuFeaturesLib):
> 
> .bss.mCpuHotEjectData
> 0x0000000000017d60        0x8
> /tmp/PiSmmCpuDxeSmm.dll.0k4hl8.ltrans1.ltrans.o
> 
> .bss.mCpuHotEjectData
> 0x0000000000005110        0x8
> /tmp/CpuHotplugSmm.dll.ixiN9a.ltrans0.ltrans.o
> 
> I imagine they do get unified in the build process later, but that's the
> point my understanding stops.

The PiSmmCpuDxeSmm binary has a (static global) variable called
"mCpuHotEjectData" via OVMF's SmmCpuFeaturesLib instance, from this
patch (patch#6).

The CpuHotplugSmm binary has a (static global) variable called
"mCpuHotEjectData" because the "CpuHotplug.c" source file defines that
variable, from patch#7. (CpuHotplugSmm does not consume the
SmmCpuFeaturesLib class -- it has no reason to.)

In other words, there's nothing common between the two variables, beyond
the name. If you rename the first to mCpuHotEjectData1, and the second
to mCpuHotEjectData2, just for the experiment's sake, nothing will break.

PiSmmCpuDxeSmm and CpuHotplugSmm never get unified in the build process;
they are independent binaries.

Thanks
Laszlo



* Re: [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug
  2021-02-03  5:46     ` Ankur Arora
@ 2021-02-03 20:45       ` Laszlo Ersek
  2021-02-04  3:04         ` Ankur Arora
  0 siblings, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-03 20:45 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/03/21 06:46, Ankur Arora wrote:
> On 2021-02-01 9:37 a.m., Laszlo Ersek wrote:

>> (6) Please drop this hunk. We don't try to be smarter than QEMU, in
>> general, whenever we perform feature negotiation.
> 
> Also, AFAICS, we will do the hotplug (and now hot-unplug) even if it wasn't
> negotiated?

Yes, totally. We don't try to "evict" CpuHotplugSmm in case the related
features are not supported/offered by QEMU, we'll just leave
CpuHotplugSmm unused.

Here's why: the SMI feature negotiation interface is locked down at a
certain point; the negotiation of all of the feature bits needs to
happen centrally, in a common spot; and it would require a really quirky
solution in the firmware to let independent drivers negotiate *subsets*
of the features.

You have correctly determined that SmmControl2Dxe, the runtime DXE
driver that produces EFI_SMM_CONTROL2_PROTOCOL, has nothing much to do
with CPU hot-(un)plug. It's just that this is the driver that first
used, and therefore now *owns*, the SMI feature negotiation. (See commit
5ba203b54e59 ("OvmfPkg/SmmControl2Dxe: negotiate
ICH9_LPC_SMI_F_CPU_HOTPLUG", 2020-08-24).)

So, to reformulate your question/statement: the firmware will retain the
ability to do hot-(un)plug even if QEMU doesn't contain (or enable)
those particular features.

Thanks
Laszlo



* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-03  6:13         ` Ankur Arora
@ 2021-02-03 20:55           ` Laszlo Ersek
  2021-02-04  2:57             ` Ankur Arora
  0 siblings, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-03 20:55 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/03/21 07:13, Ankur Arora wrote:
> On 2021-02-02 6:00 a.m., Laszlo Ersek wrote:
>> On 02/01/21 21:12, Ankur Arora wrote:
>>> On 2021-02-01 11:08 a.m., Laszlo Ersek wrote:

>>>> (16) This function uses a data structure for communication between BSP
>>>> and APs -- mCpuHotEjectData->ApicIdMap is modified in UnplugCpus() on
>>>> the BSP, and checked above by the APs (too).
>>>>
>>>> What guarantees the visibility of mCpuHotEjectData->ApicIdMap?
>>>
>>> I was banking on SmiRendezvous() explicitly signalling that all
>>> processing on the BSP was done before any AP will look at
>>> mCpuHotEjectData in SmmCpuFeaturesRendezvousExit().
>>>
>>> 1716     //
>>> 1717     // Wait for BSP's signal to exit SMI
>>> 1718     //
>>> 1719     while (*mSmmMpSyncData->AllCpusInSync) {
>>> 1720       CpuPause ();
>>> 1721     }
>>> 1722   }
>>> 1723
>>> 1724 Exit:
>>> 1725   SmmCpuFeaturesRendezvousExit (CpuIndex);
>>
>> Right; it's a general pattern in edk2: volatile UINT8 (aka BOOLEAN)
>> objects are considered atomic. (See
>> SMM_DISPATCHER_MP_SYNC_DATA.AllCpusInSync -- it's a pointer to a
>> volatile BOOLEAN.)
>>
>> But our UINT64 values are neither volatile nor UINT8, and I got suddenly
>> doubtful about "AllCpusInSync" working as a multiprocessor barrier.
>>
>> (I could be unjustifiably worried, as a bunch of other fields in
>> SMM_DISPATCHER_MP_SYNC_DATA are volatile, wider than UINT8, and *not*
>> accessed with InterlockedCompareExchangeXx().)
> 
> Thanks for pointing me to this code. There's a curious comment in the
> declaration here about making this structure uncacheable (though I
> couldn't figure out how that is done):
> 
> 418 typedef struct {
> 419   //
> 420   // Pointer to an array. The array should be located immediately after this structure
> 421   // so that UC cache-ability can be set together.
> 422   //

This is probably through SMRR manipulation.

The "UefiCpuPkg/Library/SmmCpuFeaturesLib" instance contains SMRR support.

The "OvmfPkg/Library/SmmCpuFeaturesLib" instance contains no SMRR
support. (Just search both source files for "SMRR".)


> 423   SMM_CPU_DATA_BLOCK            *CpuData;
> 424   volatile UINT32               *Counter;
> 425   volatile UINT32               BspIndex;
> 426   volatile BOOLEAN              *InsideSmm;
> 427   volatile BOOLEAN              *AllCpusInSync;
> 428   volatile SMM_CPU_SYNC_MODE    EffectiveSyncMode;
> 429   volatile BOOLEAN              SwitchBsp;
> 430   volatile BOOLEAN              *CandidateBsp;
> 431   EFI_AP_PROCEDURE              StartupProcedure;
> 432   VOID                          *StartupProcArgs;
> 433 } SMM_DISPATCHER_MP_SYNC_DATA;
> 
> Also, is there an expectation that these fields (at least some of
> them) switch over when a new leader is chosen?

Yes, see for example the "Elect BSP" section in SmiRendezvous().


> Otherwise I'm not sure why for instance, AllCpusInSync would be
> a pointer.

TBH I can't explain that; I'm not too familiar with those parts...


>>> CpuEject():
>>> 218   ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
>>>
>>> For the to-be-ejected-AP, this value can only move from
>>>     valid-APIC-ID (=> wait in CpuDeadLoop()) -> CPU_EJECT_INVALID.
>>>
>>> Given that, by the time the worker does the write on line 254, this
>>> AP is guaranteed to be dead already, I don't think there's any
>>> scenario where the to-be-ejected-AP can see anything other than
>>> a valid-APIC-ID.
>>
>> The scenario I had in mind was different: what guarantees that the
>> effect of
>>
>>     375        mCpuHotEjectData->ApicIdMap[ProcessorNum] = (UINT64)RemoveApicId;
>>
>> which is performed by the BSP in UnplugCpus(), is visible by the AP on
>> line 218 (see your quote above)?
>>
>> What if the AP gets to line 218 before the BSP's write on line 375
>> *propagates* sufficiently?
> 
> I understand. That does make sense. And, as you said elsewhere, a real
> memory fence would come in useful here.
> 
> We could use AsmCpuid() as a poor man's mfence, but that seems overkill
> given that x86 at least guarantees store-order.

Right -- I don't recall any examples of AsmCpuid() being used like that
in edk2.

Thanks!
Laszlo



* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-03  6:45           ` Ankur Arora
@ 2021-02-03 20:58             ` Laszlo Ersek
  2021-02-04  2:49               ` Ankur Arora
  0 siblings, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-03 20:58 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/03/21 07:45, Ankur Arora wrote:
> On 2021-02-02 6:15 a.m., Laszlo Ersek wrote:
>> On 02/02/21 15:00, Laszlo Ersek wrote:
>>
>>> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
>>> array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
>>> combination with the sync-up point that you quoted. This seems to match
>>> existing practice in PiSmmCpuDxeSmm -- there are no concurrent accesses,
>>> so atomicity is not a concern, and serializing the instruction streams
>>> coarsely, with the sync-up, in combination with volatile accesses,
>>> should presumably guarantee visibility (on x86 anyway).
>>
>> To summarize, this is what I would ask for:
>>
>> - make CPU_HOT_EJECT_DATA volatile
>>
>> - make (*CPU_HOT_EJECT_DATA.ApicIdMap) volatile
>>
>> - after storing something to CPU_HOT_EJECT_DATA or
>> CPU_HOT_EJECT_DATA.ApicIdMap on the BSP, execute a MemoryFence()
>>
>> - before fetching something from CPU_HOT_EJECT_DATA or
>> CPU_HOT_EJECT_DATA.ApicIdMap on an AP, execute a MemoryFence()
>>
>>
>> Except: MemoryFence() isn't a *memory fence* in fact.
>>
>> See "MdePkg/Library/BaseLib/X64/GccInline.c".
>>
>> It's just a compiler barrier, which may not add anything beyond what
>> we'd already have from "volatile".
>>
>> Case in point: PiSmmCpuDxeSmm performs heavy multi-processing, but does
>> not contain a single invocation of MemoryFence(). It uses volatile
>> objects, and a handful of InterlockedCompareExchangeXx() calls, for
>> implementing semaphores. (NB: there is no 8-bit variant of
>> InterlockedCompareExchange(), as "volatile UINT8" is considered atomic
>> in itself, and a suitable basis for a semaphore too.) And given the
>> synchronization from those semaphores, PiSmmCpuDxeSmm trusts that
>> updates to the *other* volatile objects are both atomic and visible.
>>
>> I'm pretty sure this only works because x86 is in-order. There are
>> instruction stream barriers in place, and compiler barriers too, but no
>> actual memory barriers.
> 
> Right and just to separate them explicitly, there are two problems:
> 
>  - compiler: where the compiler caches values or looks at stale memory
> locations. Now, AFAICS, this should not happen because the ApicIdMap would
> never change once set so the compiler should reasonably be able to cache
> the address of ApicIdMap and dereference it (thus obviating the need for
> volatile.)

(CPU_HOT_EJECT_DATA.Handler does change though.)

> The compiler could, however, cache any assignments to ApicIdMap[Idx]
> (especially given LTO) and we could use a MemoryFence() (as the compiler
> barrier that it is) to force the store.
> 
>  - CPU pipeline: as you said, right now we basically depend on x86 store
> order semantics (and the CpuPause() loop around AllCpusInSync, kinda provides
> a barrier.)
> 
> So the BSP writes in this order:
> ApicIdMap[Idx]=x; ... ->AllCpusInSync = FALSE
> 
> And whenever the AP sees ->AllCpusInSync == FALSE, it has to now see
> ApicIdMap[Idx] == x.

Yes.

> 
> Maybe the thing to do is to detail this barrier in a commit note/comment?

That would be nice.

> And add the MemoryFence() but not the volatile.

Yes, that should work.

Thanks,
Laszlo



* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-03 20:58             ` Laszlo Ersek
@ 2021-02-04  2:49               ` Ankur Arora
  2021-02-04  8:58                 ` Laszlo Ersek
  2021-02-05 16:06                 ` [edk2-devel] " Laszlo Ersek
  0 siblings, 2 replies; 52+ messages in thread
From: Ankur Arora @ 2021-02-04  2:49 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-03 12:58 p.m., Laszlo Ersek wrote:
> On 02/03/21 07:45, Ankur Arora wrote:
>> On 2021-02-02 6:15 a.m., Laszlo Ersek wrote:
>>> On 02/02/21 15:00, Laszlo Ersek wrote:
>>>
>>>> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
>>>> array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
>>>> combination with the sync-up point that you quoted. This seems to match
>>>> existing practice in PiSmmCpuDxeSmm -- there are no concurrent accesses,
>>>> so atomicity is not a concern, and serializing the instruction streams
>>>> coarsely, with the sync-up, in combination with volatile accesses,
>>>> should presumably guarantee visibility (on x86 anyway).
>>>
>>> To summarize, this is what I would ask for:
>>>
>>> - make CPU_HOT_EJECT_DATA volatile
>>>
>>> - make (*CPU_HOT_EJECT_DATA.ApicIdMap) volatile
>>>
>>> - after storing something to CPU_HOT_EJECT_DATA or
>>> CPU_HOT_EJECT_DATA.ApicIdMap on the BSP, execute a MemoryFence()
>>>
>>> - before fetching something from CPU_HOT_EJECT_DATA or
>>> CPU_HOT_EJECT_DATA.ApicIdMap on an AP, execute a MemoryFence()
>>>
>>>
>>> Except: MemoryFence() isn't a *memory fence* in fact.
>>>
>>> See "MdePkg/Library/BaseLib/X64/GccInline.c".
>>>
>>> It's just a compiler barrier, which may not add anything beyond what
>>> we'd already have from "volatile".
>>>
>>> Case in point: PiSmmCpuDxeSmm performs heavy multi-processing, but does
>>> not contain a single invocation of MemoryFence(). It uses volatile
>>> objects, and a handful of InterlockedCompareExchangeXx() calls, for
>>> implementing semaphores. (NB: there is no 8-bit variant of
>>> InterlockedCompareExchange(), as "volatile UINT8" is considered atomic
>>> in itself, and a suitable basis for a semaphore too.) And given the
>>> synchronization from those semaphores, PiSmmCpuDxeSmm trusts that
>>> updates to the *other* volatile objects are both atomic and visible.
>>>
>>> I'm pretty sure this only works because x86 is in-order. There are
>>> instruction stream barriers in place, and compiler barriers too, but no
>>> actual memory barriers.
>>
>> Right and just to separate them explicitly, there are two problems:
>>
>>   - compiler: where the compiler caches values or looks at stale memory
>> locations. Now, AFAICS, this should not happen because the ApicIdMap would
>> never change once set so the compiler should reasonably be able to cache
>> the address of ApicIdMap and dereference it (thus obviating the need for
>> volatile.)
> 
> (CPU_HOT_EJECT_DATA.Handler does change though.)

Yeah, I did kinda gloss over that. Let me think this through in v7
and add more explicit comments and then we can see if it still looks
fishy?

Thanks
Ankur

> 
>> The compiler could, however, cache any assignments to ApicIdMap[Idx]
>> (especially given LTO) and we could use a MemoryFence() (as the compiler
>> barrier that it is) to force the store.
>>
>>   - CPU pipeline: as you said, right now we basically depend on x86 store
>> order semantics (and the CpuPause() loop around AllCpusInSync, kinda provides
>> a barrier.)
>>
>> So the BSP writes in this order:
>> ApicIdMap[Idx]=x; ... ->AllCpusInSync = FALSE
>>
>> And whenever the AP sees ->AllCpusInSync == FALSE, it has to now see
>> ApicIdMap[Idx] == x.
> 
> Yes.
> 
>>
>> Maybe the thing to do is to detail this barrier in a commit note/comment?
> 
> That would be nice.
> 
>> And add the MemoryFence() but not the volatile.
> 
> Yes, that should work.
> 
> Thanks,
> Laszlo
> 


* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-03 20:55           ` Laszlo Ersek
@ 2021-02-04  2:57             ` Ankur Arora
  0 siblings, 0 replies; 52+ messages in thread
From: Ankur Arora @ 2021-02-04  2:57 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-03 12:55 p.m., Laszlo Ersek wrote:
> On 02/03/21 07:13, Ankur Arora wrote:
>> On 2021-02-02 6:00 a.m., Laszlo Ersek wrote:
>>> On 02/01/21 21:12, Ankur Arora wrote:
>>>> On 2021-02-01 11:08 a.m., Laszlo Ersek wrote:
> 
>>>>> (16) This function uses a data structure for communication between BSP
>>>>> and APs -- mCpuHotEjectData->ApicIdMap is modified in UnplugCpus() on
>>>>> the BSP, and checked above by the APs (too).
>>>>>
>>>>> What guarantees the visibility of mCpuHotEjectData->ApicIdMap?
>>>>
>>>> I was banking on SmiRendezvous() explicitly signalling that all
>>>> processing on the BSP was done before any AP will look at
>>>> mCpuHotEjectData in SmmCpuFeaturesRendezvousExit().
>>>>
>>>>     //
>>>>     // Wait for BSP's signal to exit SMI
>>>>     //
>>>>     while (*mSmmMpSyncData->AllCpusInSync) {
>>>>       CpuPause ();
>>>>     }
>>>>   }
>>>>
>>>> Exit:
>>>>   SmmCpuFeaturesRendezvousExit (CpuIndex);
>>>
>>> Right; it's a general pattern in edk2: volatile UINT8 (aka BOOLEAN)
>>> objects are considered atomic. (See
>>> SMM_DISPATCHER_MP_SYNC_DATA.AllCpusInSync -- it's a pointer to a
>>> volatile BOOLEAN.)
>>>
>>> But our UINT64 values are neither volatile nor UINT8, and I suddenly got
>>> doubtful about "AllCpusInSync" working as a multiprocessor barrier.
>>>
>>> (I could be unjustifiably worried, as a bunch of other fields in
>>> SMM_DISPATCHER_MP_SYNC_DATA are volatile, wider than UINT8, and *not*
>>> accessed with InterlockedCompareExchangeXx().)
>>
>> Thanks for pointing me to this code. There's a curious comment about
>> making this structure uncacheable in the declaration here (though I
>> couldn't figure out how that is done):
>>
>> typedef struct {
>>   //
>>   // Pointer to an array. The array should be located immediately after this structure
>>   // so that UC cache-ability can be set together.
>>   //
> 
> This is probably through SMRR manipulation.
> 
> The "UefiCpuPkg/Library/SmmCpuFeaturesLib" instance contains SMRR support.
> 
> The "OvmfPkg/Library/SmmCpuFeaturesLib" instance contains no SMRR
> support. (Just search both source files for "SMRR".)
> 

Oh, now I see what SMRR does. Thanks that helps make sense of
what's going on here.

Ankur

> 
>>   SMM_CPU_DATA_BLOCK            *CpuData;
>>   volatile UINT32               *Counter;
>>   volatile UINT32               BspIndex;
>>   volatile BOOLEAN              *InsideSmm;
>>   volatile BOOLEAN              *AllCpusInSync;
>>   volatile SMM_CPU_SYNC_MODE    EffectiveSyncMode;
>>   volatile BOOLEAN              SwitchBsp;
>>   volatile BOOLEAN              *CandidateBsp;
>>   EFI_AP_PROCEDURE              StartupProcedure;
>>   VOID                          *StartupProcArgs;
>> } SMM_DISPATCHER_MP_SYNC_DATA;
>>
>> Also, is there an expectation that these fields (at least some of
>> them) switch over when a new leader is chosen?
> 
> Yes, see for example the "Elect BSP" section in SmiRendezvous().
> 
> 
>> Otherwise I'm not sure why for instance, AllCpusInSync would be
>> a pointer.
> 
> TBH I can't explain that; I'm not too familiar with those parts...
> 
> 
>>>> CpuEject():
>>>> 218   ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum];
>>>>
>>>> For the to-be-ejected-AP, this value can only move from
>>>>      valid-APIC-ID (=> wait in CpuDeadLoop()) -> CPU_EJECT_INVALID.
>>>>
>>>> Given that, by the time the worker does the write on line 254, this
>>>> AP is guaranteed to be dead already, I don't think there's any
>>>> scenario where the to-be-ejected-AP can see anything other than
>>>> a valid-APIC-ID.
>>>
>>> The scenario I had in mind was different: what guarantees that the
>>> effect of
>>>
>>>      375        mCpuHotEjectData->ApicIdMap[ProcessorNum] = (UINT64)RemoveApicId;
>>>
>>> which is performed by the BSP in UnplugCpus(), is visible by the AP on
>>> line 218 (see your quote above)?
>>>
>>> What if the AP gets to line 218 before the BSP's write on line 375
>>> *propagates* sufficiently?
>>
>> I understand. That does make sense. And, as you said elsewhere, a real
>> memory fence would come in useful here.
>>
>> We could use AsmCpuid() as a poor man's mfence, but that seems overkill
>> given that x86 at least guarantees store-order.
> 
> Right -- I don't recall any examples of AsmCpuid() being used like that
> in edk2.
> 
> Thanks!
> Laszlo
> 
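
For reference, a minimal sketch of what the "real memory fence" variant
of the BSP -> AP hand-off could look like, assuming portable C11 atomics
rather than any edk2 primitive. All Sketch* names are hypothetical
stand-ins (for ApicIdMap and AllCpusInSync), and, as in the quoted
SmiRendezvous() loop, the AP proceeds once the flag is cleared:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS  8

/* Hypothetical stand-ins for ApicIdMap and AllCpusInSync. */
static uint64_t     SketchApicIdMap[SKETCH_MAX_CPUS];
static atomic_bool  SketchAllCpusInSync = true;

/* BSP side: record the APIC ID first, then signal the APs to exit. */
void
SketchBspPublish (unsigned Idx, uint64_t ApicId)
{
  SketchApicIdMap[Idx] = ApicId;
  /* Release store: the map write above may not be reordered after it. */
  atomic_store_explicit (&SketchAllCpusInSync, false, memory_order_release);
}

/* AP side: wait for the BSP's signal, then read the map entry. */
uint64_t
SketchApConsume (unsigned Idx)
{
  /* Acquire load: pairs with the release store on the BSP. */
  while (atomic_load_explicit (&SketchAllCpusInSync, memory_order_acquire)) {
    /* spin; the real loop calls CpuPause() here */
  }
  return SketchApicIdMap[Idx];
}
```

With the release/acquire pair, an AP that observes the cleared flag is
also guaranteed to observe the earlier map write, independent of the
store ordering that x86 happens to provide.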


* Re: [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state
  2021-02-03 20:36       ` Laszlo Ersek
@ 2021-02-04  2:58         ` Ankur Arora
  0 siblings, 0 replies; 52+ messages in thread
From: Ankur Arora @ 2021-02-04  2:58 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-03 12:36 p.m., Laszlo Ersek wrote:
> On 02/03/21 06:20, Ankur Arora wrote:
> 
>> Just as a sidenote, I do see two copies of the mCpuHotEjectData in
>> the PiSmmCpuDxeSmm and CpuHotplugSmm maps (which makes sense, given
>> that both include SmmCpuFeaturesLib):
>>
>> .bss.mCpuHotEjectData
>> 0x0000000000017d60        0x8
>> /tmp/PiSmmCpuDxeSmm.dll.0k4hl8.ltrans1.ltrans.o
>>
>> .bss.mCpuHotEjectData
>> 0x0000000000005110        0x8
>> /tmp/CpuHotplugSmm.dll.ixiN9a.ltrans0.ltrans.o
>>
>> I imagine they do get unified in the build process later, but that's the
>> point my understanding stops.
> 
> The PiSmmCpuDxeSmm binary has a (static global) variable called
> "mCpuHotEjectData" via OVMF's SmmCpuFeaturesLib instance, from this
> patch (patch#6).
> 
> The CpuHotplugSmm binary has a (static global) variable called
> "mCpuHotEjectData" because the "CpuHotplug.c" source file defines that
> variable, from patch#7. (CpuHotplugSmm does not consume the
> SmmCpuFeaturesLib class -- it has no reason to.)
> 
> In other words, there's nothing common between the two variables, beyond
> the name. If you rename the first to mCpuHotEjectData1, and the second
> to mCpuHotEjectData2, just for the experiment's sake, nothing will break.

Yeah you are right. I completely forgot that I had defined mCpuHotEjectData
in CpuHotplug.c and then assumed that it was because we link with
SmmCpuFeaturesLib.a

Sorry for the confusion.

Ankur

> 
> PiSmmCpuDxeSmm and CpuHotplugSmm never get unified in the build process;
> they are independent binaries.
> 
> Thanks
> Laszlo
> 


* Re: [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug
  2021-02-03 20:45       ` Laszlo Ersek
@ 2021-02-04  3:04         ` Ankur Arora
  0 siblings, 0 replies; 52+ messages in thread
From: Ankur Arora @ 2021-02-04  3:04 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-03 12:45 p.m., Laszlo Ersek wrote:
> On 02/03/21 06:46, Ankur Arora wrote:
>> On 2021-02-01 9:37 a.m., Laszlo Ersek wrote:
> 
>>> (6) Please drop this hunk. We don't try to be smarter than QEMU, in
>>> general, whenever we perform feature negotiation.
>>
>> Also, AFAICS, we will do the hotplug (and now hot-unplug) even if it wasn't
>> negotiated?
> 
> Yes, totally. We don't try to "evict" CpuHotplugSmm in case the related
> features are not supported/offered by QEMU, we'll just leave
> CpuHotplugSmm unused.
> 
> Here's why: the SMI feature negotiation interface is locked down at a
> certain point; the negotiation of all of the feature bits needs to
> happen centrally, in a common spot; and it would require a really quirky
> solution in the firmware to let independent drivers negotiate *subsets*
> of the features.

Right, I see your point. Firmware doesn't really get to stand on
ceremony when HW asks it to do stuff.

Thanks
Ankur

> 
> You have correctly determined that SmmControl2Dxe, the runtime DXE
> driver that produces EFI_SMM_CONTROL2_PROTOCOL, has nothing much to do
> with CPU hot-(un)plug. It's just that this is the driver that first
> used, and therefore now *owns*, the SMI feature negotiation. (See commit
> 5ba203b54e59 ("OvmfPkg/SmmControl2Dxe: negotiate
> ICH9_LPC_SMI_F_CPU_HOTPLUG", 2020-08-24).)
> 
> So, to reformulate your question/statement: the firmware will retain the
> ability to do hot-(un)plug even if QEMU doesn't contain (or enable)
> those particular features.
> 
> Thanks
> Laszlo
> 
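
A rough sketch of the "negotiate centrally, accept whatever the host
offers" idea described above. The SKETCH_* bit names and values are
assumptions made for illustration only (the real bits are the
ICH9_LPC_SMI_F_* macros), and SketchNegotiate() is a hypothetical helper,
not a function from SmmControl2Dxe:

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define SKETCH_F_BROADCAST       (1ULL << 0)
#define SKETCH_F_CPU_HOTPLUG     (1ULL << 1)
#define SKETCH_F_CPU_HOT_UNPLUG  (1ULL << 2)

/*
 * Central negotiation: ask for every feature the firmware knows about in
 * one step, and settle for the subset that the host offers.
 */
uint64_t
SketchNegotiate (uint64_t HostOffered)
{
  uint64_t Requested;

  Requested = SKETCH_F_BROADCAST | SKETCH_F_CPU_HOTPLUG |
              SKETCH_F_CPU_HOT_UNPLUG;
  return Requested & HostOffered;
}

/* Nonzero if hot-unplug ended up in the negotiated feature set. */
int
SketchUnplugNegotiated (uint64_t Negotiated)
{
  return (Negotiated & SKETCH_F_CPU_HOT_UNPLUG) != 0;
}
```

If the host does not offer the unplug bit, nothing is unloaded; the
unplug code path simply never gets triggered.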


* Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-04  2:49               ` Ankur Arora
@ 2021-02-04  8:58                 ` Laszlo Ersek
  2021-02-05 16:06                 ` [edk2-devel] " Laszlo Ersek
  1 sibling, 0 replies; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-04  8:58 UTC (permalink / raw)
  To: Ankur Arora, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 02/04/21 03:49, Ankur Arora wrote:
> On 2021-02-03 12:58 p.m., Laszlo Ersek wrote:
>> On 02/03/21 07:45, Ankur Arora wrote:
>>> On 2021-02-02 6:15 a.m., Laszlo Ersek wrote:
>>>> On 02/02/21 15:00, Laszlo Ersek wrote:
>>>>
>>>>> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
>>>>> array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
>>>>> combination with the sync-up point that you quoted. This seems to
>>>>> match existing practice in PiSmmCpuDxeSmm -- there are no concurrent
>>>>> accesses, so atomicity is not a concern, and serializing the
>>>>> instruction streams coarsely, with the sync-up, in combination with
>>>>> volatile accesses, should presumably guarantee visibility (on x86
>>>>> anyway).
>>>>
>>>> To summarize, this is what I would ask for:
>>>>
>>>> - make CPU_HOT_EJECT_DATA volatile
>>>>
>>>> - make (*CPU_HOT_EJECT_DATA.ApicIdMap) volatile
>>>>
>>>> - after storing something to CPU_HOT_EJECT_DATA or
>>>> CPU_HOT_EJECT_DATA.ApicIdMap on the BSP, execute a MemoryFence()
>>>>
>>>> - before fetching something from CPU_HOT_EJECT_DATA or
>>>> CPU_HOT_EJECT_DATA.ApicIdMap on an AP, execute a MemoryFence()
>>>>
>>>>
>>>> Except: MemoryFence() isn't a *memory fence* in fact.
>>>>
>>>> See "MdePkg/Library/BaseLib/X64/GccInline.c".
>>>>
>>>> It's just a compiler barrier, which may not add anything beyond what
>>>> we'd already have from "volatile".
>>>>
>>>> Case in point: PiSmmCpuDxeSmm performs heavy multi-processing, but does
>>>> not contain a single invocation of MemoryFence(). It uses volatile
>>>> objects, and a handful of InterlockedCompareExchangeXx() calls, for
>>>> implementing semaphores. (NB: there is no 8-bit variant of
>>>> InterlockedCompareExchange(), as "volatile UINT8" is considered atomic
>>>> in itself, and a suitable basis for a semaphore too.) And given the
>>>> synchronization from those semaphores, PiSmmCpuDxeSmm trusts that
>>>> updates to the *other* volatile objects are both atomic and visible.
>>>>
>>>> I'm pretty sure this only works because x86 is in-order. There are
>>>> instruction stream barriers in place, and compiler barriers too, but no
>>>> actual memory barriers.
>>>
>>> Right and just to separate them explicitly, there are two problems:
>>>
>>>   - compiler: where the compiler caches stuff in or looks at stale
>>> memory locations. Now, AFAICS, this should not happen because the
>>> ApicIdMap would never change once set so the compiler should reasonably
>>> be able to cache the address of ApicIdMap and dereference it (thus
>>> obviating the need for volatile.)
>>
>> (CPU_HOT_EJECT_DATA.Handler does change though.)
> 
> Yeah, I did kinda elide over that. Let me think this through in v7
> and add more explicit comments and then we can see if it still looks
> fishy?

OK.

(Clearly, I don't want to block progress on this with concerns that are
purely theoretical.)

Thanks,
Laszlo


> Thanks
> Ankur
> 
>>
>>> The compiler could, however, cache any assignments to ApicIdMap[Idx]
>>> (especially given LTO) and we could use a MemoryFence() (as the compiler
>>> barrier that it is) to force the store.
>>>
>>>   - CPU pipeline: as you said, right now we basically depend on x86
>>> store order semantics (and the CpuPause() loop around AllCpusInSync
>>> kinda provides a barrier.)
>>>
>>> So the BSP writes in this order:
>>> ApicIdMap[Idx]=x; ... ->AllCpusInSync = FALSE
>>>
>>> And whenever the AP sees ->AllCpusInSync == FALSE, it has to now see
>>> ApicIdMap[Idx] == x.
>>
>> Yes.
>>
>>>
>>> Maybe the thing to do is to detail this barrier in a commit note/comment?
>>
>> That would be nice.
>>
>>> And add the MemoryFence() but not the volatile.
>>
>> Yes, that should work.
>>
>> Thanks,
>> Laszlo
>>
> 



* Re: [edk2-devel] [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-04  2:49               ` Ankur Arora
  2021-02-04  8:58                 ` Laszlo Ersek
@ 2021-02-05 16:06                 ` Laszlo Ersek
  2021-02-08  5:04                   ` Ankur Arora
  1 sibling, 1 reply; 52+ messages in thread
From: Laszlo Ersek @ 2021-02-05 16:06 UTC (permalink / raw)
  To: devel, ankur.a.arora
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

Hi Ankur,

I figure it's prudent for me to follow up here too:

On 02/04/21 03:49, Ankur Arora wrote:
> On 2021-02-03 12:58 p.m., Laszlo Ersek wrote:
>> On 02/03/21 07:45, Ankur Arora wrote:
>>> On 2021-02-02 6:15 a.m., Laszlo Ersek wrote:
>>>> On 02/02/21 15:00, Laszlo Ersek wrote:
>>>>
>>>>> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
>>>>> array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
>>>>> combination with the sync-up point that you quoted. This seems to
>>>>> match existing practice in PiSmmCpuDxeSmm -- there are no concurrent
>>>>> accesses, so atomicity is not a concern, and serializing the
>>>>> instruction streams coarsely, with the sync-up, in combination with
>>>>> volatile accesses, should presumably guarantee visibility (on x86
>>>>> anyway).
>>>>
>>>> To summarize, this is what I would ask for:
>>>>
>>>> - make CPU_HOT_EJECT_DATA volatile
>>>>
>>>> - make (*CPU_HOT_EJECT_DATA.ApicIdMap) volatile
>>>>
>>>> - after storing something to CPU_HOT_EJECT_DATA or
>>>> CPU_HOT_EJECT_DATA.ApicIdMap on the BSP, execute a MemoryFence()
>>>>
>>>> - before fetching something from CPU_HOT_EJECT_DATA or
>>>> CPU_HOT_EJECT_DATA.ApicIdMap on an AP, execute a MemoryFence()
>>>>
>>>>
>>>> Except: MemoryFence() isn't a *memory fence* in fact.
>>>>
>>>> See "MdePkg/Library/BaseLib/X64/GccInline.c".
>>>>
>>>> It's just a compiler barrier, which may not add anything beyond what
>>>> we'd already have from "volatile".
>>>>
>>>> Case in point: PiSmmCpuDxeSmm performs heavy multi-processing, but does
>>>> not contain a single invocation of MemoryFence(). It uses volatile
>>>> objects, and a handful of InterlockedCompareExchangeXx() calls, for
>>>> implementing semaphores. (NB: there is no 8-bit variant of
>>>> InterlockedCompareExchange(), as "volatile UINT8" is considered atomic
>>>> in itself, and a suitable basis for a semaphore too.) And given the
>>>> synchronization from those semaphores, PiSmmCpuDxeSmm trusts that
>>>> updates to the *other* volatile objects are both atomic and visible.
>>>>
>>>> I'm pretty sure this only works because x86 is in-order. There are
>>>> instruction stream barriers in place, and compiler barriers too, but no
>>>> actual memory barriers.
>>>
>>> Right and just to separate them explicitly, there are two problems:
>>>
>>>   - compiler: where the compiler caches stuff in or looks at stale
>>> memory locations. Now, AFAICS, this should not happen because the
>>> ApicIdMap would never change once set so the compiler should reasonably
>>> be able to cache the address of ApicIdMap and dereference it (thus
>>> obviating the need for volatile.)
>>
>> (CPU_HOT_EJECT_DATA.Handler does change though.)
> 
> Yeah, I did kinda elide over that. Let me think this through in v7
> and add more explicit comments and then we can see if it still looks
> fishy?
> 
> Thanks
> Ankur
> 
>>
>>> The compiler could, however, cache any assignments to ApicIdMap[Idx]
>>> (especially given LTO) and we could use a MemoryFence() (as the compiler
>>> barrier that it is) to force the store.
>>>
>>>   - CPU pipeline: as you said, right now we basically depend on x86
>>> store order semantics (and the CpuPause() loop around AllCpusInSync
>>> kinda provides a barrier.)
>>>
>>> So the BSP writes in this order:
>>> ApicIdMap[Idx]=x; ... ->AllCpusInSync = FALSE
>>>
>>> And whenever the AP sees ->AllCpusInSync == FALSE, it has to now see
>>> ApicIdMap[Idx] == x.
>>
>> Yes.
>>
>>>
>>> Maybe the thing to do is to detail this barrier in a commit note/comment?
>>
>> That would be nice.
>>
>>> And add the MemoryFence() but not the volatile.
>>
>> Yes, that should work.

Please *do* add the volatile, and also the MemoryFence(). When built
with Visual Studio, MemoryFence() does nothing at all (at least when LTO
is in effect -- which it almost always is). So we should have the
volatile for making things work, and MemoryFence() as a conceptual
reminder, so we know where to fix up things, when (if!) we come around
fixing this mess with MemoryFence(). Reference:

https://edk2.groups.io/g/rfc/message/500
https://edk2.groups.io/g/rfc/message/501
https://edk2.groups.io/g/rfc/message/502
https://edk2.groups.io/g/rfc/message/503

Thanks!
Laszlo
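
To make the request concrete, here is a minimal sketch of the
volatile-plus-barrier pattern, assuming GCC or Clang. SKETCH_EJECT_DATA
is a simplified, hypothetical stand-in for CPU_HOT_EJECT_DATA, and
SKETCH_COMPILER_BARRIER() is a compiler-only barrier standing in for
MemoryFence(); neither is the actual patch code:

```c
#include <stdint.h>

/* Compiler-only barrier (no CPU fence instruction); GCC/Clang syntax. */
#define SKETCH_COMPILER_BARRIER()  __asm__ __volatile__ ("" ::: "memory")

#define SKETCH_MAX_CPUS  8

/* Hypothetical, simplified stand-in for CPU_HOT_EJECT_DATA. */
typedef struct {
  volatile uint64_t  *ApicIdMap;    /* the pointed-to array is volatile */
  uint32_t           ArrayLength;
} SKETCH_EJECT_DATA;

static volatile uint64_t           SketchMap[SKETCH_MAX_CPUS];
static volatile SKETCH_EJECT_DATA  SketchEjectData = {
  SketchMap, SKETCH_MAX_CPUS
};

/* BSP side: store, then barrier. */
void
SketchBspStore (uint32_t Idx, uint64_t ApicId)
{
  SketchEjectData.ApicIdMap[Idx] = ApicId;
  SKETCH_COMPILER_BARRIER ();
}

/* AP side: barrier, then fetch. */
uint64_t
SketchApLoad (uint32_t Idx)
{
  SKETCH_COMPILER_BARRIER ();
  return SketchEjectData.ApicIdMap[Idx];
}
```

Here the volatile qualifiers are what force the compiler to emit the
accesses; the barrier only marks the spots where a real fence would go.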



* Re: [edk2-devel] [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
  2021-02-05 16:06                 ` [edk2-devel] " Laszlo Ersek
@ 2021-02-08  5:04                   ` Ankur Arora
  0 siblings, 0 replies; 52+ messages in thread
From: Ankur Arora @ 2021-02-08  5:04 UTC (permalink / raw)
  To: Laszlo Ersek, devel
  Cc: imammedo, boris.ostrovsky, Jordan Justen, Ard Biesheuvel,
	Aaron Young

On 2021-02-05 8:06 a.m., Laszlo Ersek wrote:
> Hi Ankur,
> 
> I figure it's prudent for me to follow up here too:
> 
> On 02/04/21 03:49, Ankur Arora wrote:
>> On 2021-02-03 12:58 p.m., Laszlo Ersek wrote:
>>> On 02/03/21 07:45, Ankur Arora wrote:
>>>> On 2021-02-02 6:15 a.m., Laszlo Ersek wrote:
>>>>> On 02/02/21 15:00, Laszlo Ersek wrote:
>>>>>
>>>>>> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
>>>>>> array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
>>>>>> combination with the sync-up point that you quoted. This seems to
>>>>>> match existing practice in PiSmmCpuDxeSmm -- there are no concurrent
>>>>>> accesses, so atomicity is not a concern, and serializing the
>>>>>> instruction streams coarsely, with the sync-up, in combination with
>>>>>> volatile accesses, should presumably guarantee visibility (on x86
>>>>>> anyway).
>>>>>
>>>>> To summarize, this is what I would ask for:
>>>>>
>>>>> - make CPU_HOT_EJECT_DATA volatile
>>>>>
>>>>> - make (*CPU_HOT_EJECT_DATA.ApicIdMap) volatile
>>>>>
>>>>> - after storing something to CPU_HOT_EJECT_DATA or
>>>>> CPU_HOT_EJECT_DATA.ApicIdMap on the BSP, execute a MemoryFence()
>>>>>
>>>>> - before fetching something from CPU_HOT_EJECT_DATA or
>>>>> CPU_HOT_EJECT_DATA.ApicIdMap on an AP, execute a MemoryFence()
>>>>>
>>>>>
>>>>> Except: MemoryFence() isn't a *memory fence* in fact.
>>>>>
>>>>> See "MdePkg/Library/BaseLib/X64/GccInline.c".
>>>>>
>>>>> It's just a compiler barrier, which may not add anything beyond what
>>>>> we'd already have from "volatile".
>>>>>
>>>>> Case in point: PiSmmCpuDxeSmm performs heavy multi-processing, but does
>>>>> not contain a single invocation of MemoryFence(). It uses volatile
>>>>> objects, and a handful of InterlockedCompareExchangeXx() calls, for
>>>>> implementing semaphores. (NB: there is no 8-bit variant of
>>>>> InterlockedCompareExchange(), as "volatile UINT8" is considered atomic
>>>>> in itself, and a suitable basis for a semaphore too.) And given the
>>>>> synchronization from those semaphores, PiSmmCpuDxeSmm trusts that
>>>>> updates to the *other* volatile objects are both atomic and visible.
>>>>>
>>>>> I'm pretty sure this only works because x86 is in-order. There are
>>>>> instruction stream barriers in place, and compiler barriers too, but no
>>>>> actual memory barriers.
>>>>
>>>> Right and just to separate them explicitly, there are two problems:
>>>>
>>>>    - compiler: where the compiler caches stuff in or looks at stale
>>>> memory locations. Now, AFAICS, this should not happen because the
>>>> ApicIdMap would never change once set so the compiler should reasonably
>>>> be able to cache the address of ApicIdMap and dereference it (thus
>>>> obviating the need for volatile.)
>>>
>>> (CPU_HOT_EJECT_DATA.Handler does change though.)
>>
>> Yeah, I did kinda elide over that. Let me think this through in v7
>> and add more explicit comments and then we can see if it still looks
>> fishy?
>>
>> Thanks
>> Ankur
>>
>>>
>>>> The compiler could, however, cache any assignments to ApicIdMap[Idx]
>>>> (especially given LTO) and we could use a MemoryFence() (as the compiler
>>>> barrier that it is) to force the store.
>>>>
>>>>    - CPU pipeline: as you said, right now we basically depend on x86
>>>> store order semantics (and the CpuPause() loop around AllCpusInSync
>>>> kinda provides a barrier.)
>>>>
>>>> So the BSP writes in this order:
>>>> ApicIdMap[Idx]=x; ... ->AllCpusInSync = FALSE
>>>>
>>>> And whenever the AP sees ->AllCpusInSync == FALSE, it has to now see
>>>> ApicIdMap[Idx] == x.
>>>
>>> Yes.
>>>
>>>>
>>>> Maybe the thing to do is to detail this barrier in a commit note/comment?
>>>
>>> That would be nice.
>>>
>>>> And add the MemoryFence() but not the volatile.
>>>
>>> Yes, that should work.
> 
> Please *do* add the volatile, and also the MemoryFence(). When built
> with Visual Studio, MemoryFence() does nothing at all (at least when LTO
> is in effect -- which it almost always is). So we should have the
> volatile for making things work, and MemoryFence() as a conceptual
> reminder, so we know where to fix up things, when (if!) we come around
> fixing this mess with MemoryFence(). Reference:
> 
> https://edk2.groups.io/g/rfc/message/500
> https://edk2.groups.io/g/rfc/message/501
> https://edk2.groups.io/g/rfc/message/502
> https://edk2.groups.io/g/rfc/message/503

Did see it on the thread. Yeah agreed, Visual Studio does necessitate volatile here.
Will add.

Thanks
Ankur

> 
> Thanks!
> Laszlo
> 


end of thread, other threads:[~2021-02-08  5:04 UTC | newest]

Thread overview: 52+ messages
2021-01-29  0:59 [PATCH v6 0/9] support CPU hot-unplug Ankur Arora
2021-01-29  0:59 ` [PATCH v6 1/9] OvmfPkg/CpuHotplugSmm: refactor hotplug logic Ankur Arora
2021-01-30  1:15   ` [edk2-devel] " Laszlo Ersek
2021-02-02  6:19     ` Ankur Arora
2021-02-01  2:59   ` Laszlo Ersek
2021-01-29  0:59 ` [PATCH v6 2/9] OvmfPkg/CpuHotplugSmm: collect hot-unplug events Ankur Arora
2021-01-30  2:18   ` Laszlo Ersek
2021-01-30  2:23     ` Laszlo Ersek
2021-02-02  6:03     ` Ankur Arora
2021-01-29  0:59 ` [PATCH v6 3/9] OvmfPkg/CpuHotplugSmm: add Qemu Cpu Status helper Ankur Arora
2021-01-30  2:36   ` Laszlo Ersek
2021-02-02  6:04     ` Ankur Arora
2021-01-29  0:59 ` [PATCH v6 4/9] OvmfPkg/CpuHotplugSmm: introduce UnplugCpus() Ankur Arora
2021-01-30  2:37   ` Laszlo Ersek
2021-02-01  3:13   ` Laszlo Ersek
2021-02-03  4:28     ` Ankur Arora
2021-02-03 19:20       ` Laszlo Ersek
2021-01-29  0:59 ` [PATCH v6 5/9] OvmfPkg/CpuHotplugSmm: define CPU_HOT_EJECT_DATA Ankur Arora
2021-02-01  4:53   ` Laszlo Ersek
2021-02-02  6:15     ` Ankur Arora
2021-01-29  0:59 ` [PATCH v6 6/9] OvmfPkg/SmmCpuFeaturesLib: init CPU ejection state Ankur Arora
2021-02-01 13:36   ` Laszlo Ersek
2021-02-03  5:20     ` Ankur Arora
2021-02-03 20:36       ` Laszlo Ersek
2021-02-04  2:58         ` Ankur Arora
2021-01-29  0:59 ` [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject() Ankur Arora
2021-02-01 16:11   ` Laszlo Ersek
2021-02-01 19:08   ` Laszlo Ersek
2021-02-01 20:12     ` Ankur Arora
2021-02-02 14:00       ` Laszlo Ersek
2021-02-02 14:15         ` Laszlo Ersek
2021-02-03  6:45           ` Ankur Arora
2021-02-03 20:58             ` Laszlo Ersek
2021-02-04  2:49               ` Ankur Arora
2021-02-04  8:58                 ` Laszlo Ersek
2021-02-05 16:06                 ` [edk2-devel] " Laszlo Ersek
2021-02-08  5:04                   ` Ankur Arora
2021-02-03  6:13         ` Ankur Arora
2021-02-03 20:55           ` Laszlo Ersek
2021-02-04  2:57             ` Ankur Arora
2021-01-29  0:59 ` [PATCH v6 8/9] OvmfPkg/CpuHotplugSmm: add worker to do CPU ejection Ankur Arora
2021-02-01 17:22   ` Laszlo Ersek
2021-02-01 19:21     ` Ankur Arora
2021-02-02 13:23       ` Laszlo Ersek
2021-02-03  5:41         ` Ankur Arora
2021-01-29  0:59 ` [PATCH v6 9/9] OvmfPkg/SmmControl2Dxe: negotiate CPU hot-unplug Ankur Arora
2021-02-01 17:37   ` Laszlo Ersek
2021-02-01 17:40     ` Laszlo Ersek
2021-02-01 17:48       ` Laszlo Ersek
2021-02-03  5:46     ` Ankur Arora
2021-02-03 20:45       ` Laszlo Ersek
2021-02-04  3:04         ` Ankur Arora
