public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* [edk2-devel] [PATCH v3 0/6] Refine SMM CPU Sync flow and abstract SmmCpuSyncLib
@ 2023-12-06 10:01 Wu, Jiaxin
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 1/6] UefiCpuPkg/PiSmmCpuDxeSmm: Optimize Semaphore Sync between BSP and AP Wu, Jiaxin
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-06 10:01 UTC (permalink / raw)
  To: devel

This patch series refines the SMM CPU Sync flow.
After the refinement, SmmCpuSyncLib is abstracted so that
any user can provide a different SMM CPU Sync implementation.

Compared to V2, this version has the following refinements & changes:
1. Rename SMM_CPU_SYNC_CXT to SMM_CPU_SYNC_CONTEXT.
2. Rename SemBlock to SemBuffer, and SemBlockPages to SemBufferSize.
3. Remove empty lines among the local variable declarations.
4. Add an assert before the if condition check.
5. Update the comments for SmmCpuSyncCheckOutCpu,
SmmCpuSyncGetArrivedCpuCount and SmmCpuSyncLockDoor.
6. Remove unnecessary cxt local variable declarations.

Jiaxin Wu (6):
  UefiCpuPkg/PiSmmCpuDxeSmm: Optimize Semaphore Sync between BSP and AP
  UefiCpuPkg: Adds SmmCpuSyncLib library class
  UefiCpuPkg: Implements SmmCpuSyncLib library instance
  OvmfPkg: Specifies SmmCpuSyncLib instance
  UefiPayloadPkg: Specifies SmmCpuSyncLib instance
  UefiCpuPkg/PiSmmCpuDxeSmm: Consume SmmCpuSyncLib

 OvmfPkg/CloudHv/CloudHvX64.dsc                     |   2 +
 OvmfPkg/OvmfPkgIa32.dsc                            |   2 +
 OvmfPkg/OvmfPkgIa32X64.dsc                         |   2 +
 OvmfPkg/OvmfPkgX64.dsc                             |   1 +
 UefiCpuPkg/Include/Library/SmmCpuSyncLib.h         | 275 +++++++++
 UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c   | 647 +++++++++++++++++++++
 UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf |  39 ++
 UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c              | 275 ++++-----
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h         |   6 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf       |   1 +
 UefiCpuPkg/UefiCpuPkg.dec                          |   3 +
 UefiCpuPkg/UefiCpuPkg.dsc                          |   3 +
 UefiPayloadPkg/UefiPayloadPkg.dsc                  |   1 +
 13 files changed, 1086 insertions(+), 171 deletions(-)
 create mode 100644 UefiCpuPkg/Include/Library/SmmCpuSyncLib.h
 create mode 100644 UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
 create mode 100644 UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf

-- 
2.16.2.windows.1



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112108): https://edk2.groups.io/g/devel/message/112108
Mute This Topic: https://groups.io/mt/103010162/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [edk2-devel] [PATCH v3 1/6] UefiCpuPkg/PiSmmCpuDxeSmm: Optimize Semaphore Sync between BSP and AP
  2023-12-06 10:01 [edk2-devel] [PATCH v3 0/6] Refine SMM CPU Sync flow and abstract SmmCpuSyncLib Wu, Jiaxin
@ 2023-12-06 10:01 ` Wu, Jiaxin
  2023-12-12 19:27   ` Laszlo Ersek
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class Wu, Jiaxin
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-06 10:01 UTC (permalink / raw)
  To: devel
  Cc: Laszlo Ersek, Eric Dong, Ray Ni, Zeng Star, Rahul Kumar,
	Gerd Hoffmann

This patch defines 3 new functions (WaitForBsp, ReleaseBsp and
ReleaseOneAp) used for the semaphore sync between the BSP and APs.
With the change, the BSP and AP sync flow is easy to understand,
as below:
BSP: ReleaseAllAPs or ReleaseOneAp --> AP: WaitForBsp
BSP: WaitForAllAPs                 <-- AP: ReleaseBsp

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Zeng Star <star.zeng@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
---
 UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c | 72 ++++++++++++++++++++++++++++-------
 1 file changed, 58 insertions(+), 14 deletions(-)

diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
index b279f5dfcc..54542262a2 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
@@ -120,10 +120,11 @@ LockdownSemaphore (
 
   return Value;
 }
 
 /**
+  Used by the BSP to wait for all APs.
   Wait all APs to performs an atomic compare exchange operation to release semaphore.
 
   @param   NumberOfAPs      AP number
 
 **/
@@ -139,10 +140,11 @@ WaitForAllAPs (
     WaitForSemaphore (mSmmMpSyncData->CpuData[BspIndex].Run);
   }
 }
 
 /**
+  Used by the BSP to release all APs.
   Performs an atomic compare exchange operation to release semaphore
   for each AP.
 
 **/
 VOID
@@ -157,10 +159,52 @@ ReleaseAllAPs (
       ReleaseSemaphore (mSmmMpSyncData->CpuData[Index].Run);
     }
   }
 }
 
+/**
+  Used by the BSP to release one AP.
+
+  @param      ApSem     IN:  32-bit unsigned integer
+                        OUT: original integer + 1
+**/
+VOID
+ReleaseOneAp   (
+  IN OUT  volatile UINT32  *ApSem
+  )
+{
+  ReleaseSemaphore (ApSem);
+}
+
+/**
+  Used by an AP to wait for the BSP.
+
+  @param      ApSem      IN:  32-bit unsigned integer
+                         OUT: original integer - 1
+**/
+VOID
+WaitForBsp  (
+  IN OUT  volatile UINT32  *ApSem
+  )
+{
+  WaitForSemaphore (ApSem);
+}
+
+/**
+  Used by an AP to release the BSP.
+
+  @param      BspSem     IN:  32-bit unsigned integer
+                         OUT: original integer + 1
+**/
+VOID
+ReleaseBsp   (
+  IN OUT  volatile UINT32  *BspSem
+  )
+{
+  ReleaseSemaphore (BspSem);
+}
+
 /**
   Check whether the index of CPU perform the package level register
   programming during System Management Mode initialization.
 
   The index of Processor specified by mPackageFirstThreadIndex[PackageIndex]
@@ -632,11 +676,11 @@ BSPHandler (
       // Signal all APs it's time for backup MTRRs
       //
       ReleaseAllAPs ();
 
       //
-      // WaitForSemaphore() may wait for ever if an AP happens to enter SMM at
+      // WaitForAllAPs() may wait forever if an AP happens to enter SMM at
       // exactly this point. Please make sure PcdCpuSmmMaxSyncLoops has been set
       // to a large enough value to avoid this situation.
       // Note: For HT capable CPUs, threads within a core share the same set of MTRRs.
       // We do the backup first and then set MTRR to avoid race condition for threads
       // in the same core.
@@ -652,11 +696,11 @@ BSPHandler (
       // Let all processors program SMM MTRRs together
       //
       ReleaseAllAPs ();
 
       //
-      // WaitForSemaphore() may wait for ever if an AP happens to enter SMM at
+      // WaitForAllAPs() may wait forever if an AP happens to enter SMM at
       // exactly this point. Please make sure PcdCpuSmmMaxSyncLoops has been set
       // to a large enough value to avoid this situation.
       //
       ReplaceOSMtrrs (CpuIndex);
 
@@ -898,50 +942,50 @@ APHandler (
 
   if ((SyncMode == SmmCpuSyncModeTradition) || SmmCpuFeaturesNeedConfigureMtrrs ()) {
     //
     // Notify BSP of arrival at this point
     //
-    ReleaseSemaphore (mSmmMpSyncData->CpuData[BspIndex].Run);
+    ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
   }
 
   if (SmmCpuFeaturesNeedConfigureMtrrs ()) {
     //
     // Wait for the signal from BSP to backup MTRRs
     //
-    WaitForSemaphore (mSmmMpSyncData->CpuData[CpuIndex].Run);
+    WaitForBsp (mSmmMpSyncData->CpuData[CpuIndex].Run);
 
     //
     // Backup OS MTRRs
     //
     MtrrGetAllMtrrs (&Mtrrs);
 
     //
     // Signal BSP the completion of this AP
     //
-    ReleaseSemaphore (mSmmMpSyncData->CpuData[BspIndex].Run);
+    ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
 
     //
     // Wait for BSP's signal to program MTRRs
     //
-    WaitForSemaphore (mSmmMpSyncData->CpuData[CpuIndex].Run);
+    WaitForBsp (mSmmMpSyncData->CpuData[CpuIndex].Run);
 
     //
     // Replace OS MTRRs with SMI MTRRs
     //
     ReplaceOSMtrrs (CpuIndex);
 
     //
     // Signal BSP the completion of this AP
     //
-    ReleaseSemaphore (mSmmMpSyncData->CpuData[BspIndex].Run);
+    ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
   }
 
   while (TRUE) {
     //
     // Wait for something to happen
     //
-    WaitForSemaphore (mSmmMpSyncData->CpuData[CpuIndex].Run);
+    WaitForBsp (mSmmMpSyncData->CpuData[CpuIndex].Run);
 
     //
     // Check if BSP wants to exit SMM
     //
     if (!(*mSmmMpSyncData->InsideSmm)) {
@@ -977,16 +1021,16 @@ APHandler (
 
   if (SmmCpuFeaturesNeedConfigureMtrrs ()) {
     //
     // Notify BSP the readiness of this AP to program MTRRs
     //
-    ReleaseSemaphore (mSmmMpSyncData->CpuData[BspIndex].Run);
+    ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
 
     //
     // Wait for the signal from BSP to program MTRRs
     //
-    WaitForSemaphore (mSmmMpSyncData->CpuData[CpuIndex].Run);
+    WaitForBsp (mSmmMpSyncData->CpuData[CpuIndex].Run);
 
     //
     // Restore OS MTRRs
     //
     SmmCpuFeaturesReenableSmrr ();
@@ -994,26 +1038,26 @@ APHandler (
   }
 
   //
   // Notify BSP the readiness of this AP to Reset states/semaphore for this processor
   //
-  ReleaseSemaphore (mSmmMpSyncData->CpuData[BspIndex].Run);
+  ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
 
   //
   // Wait for the signal from BSP to Reset states/semaphore for this processor
   //
-  WaitForSemaphore (mSmmMpSyncData->CpuData[CpuIndex].Run);
+  WaitForBsp (mSmmMpSyncData->CpuData[CpuIndex].Run);
 
   //
   // Reset states/semaphore for this processor
   //
   *(mSmmMpSyncData->CpuData[CpuIndex].Present) = FALSE;
 
   //
   // Notify BSP the readiness of this AP to exit SMM
   //
-  ReleaseSemaphore (mSmmMpSyncData->CpuData[BspIndex].Run);
+  ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
 }
 
 /**
   Checks whether the input token is the current used token.
 
@@ -1277,11 +1321,11 @@ InternalSmmStartupThisAp (
   mSmmMpSyncData->CpuData[CpuIndex].Status = CpuStatus;
   if (mSmmMpSyncData->CpuData[CpuIndex].Status != NULL) {
     *mSmmMpSyncData->CpuData[CpuIndex].Status = EFI_NOT_READY;
   }
 
-  ReleaseSemaphore (mSmmMpSyncData->CpuData[CpuIndex].Run);
+  ReleaseOneAp (mSmmMpSyncData->CpuData[CpuIndex].Run);
 
   if (Token == NULL) {
     AcquireSpinLock (mSmmMpSyncData->CpuData[CpuIndex].Busy);
     ReleaseSpinLock (mSmmMpSyncData->CpuData[CpuIndex].Busy);
   }
-- 
2.16.2.windows.1



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112109): https://edk2.groups.io/g/devel/message/112109
Mute This Topic: https://groups.io/mt/103010163/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [edk2-devel] [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class
  2023-12-06 10:01 [edk2-devel] [PATCH v3 0/6] Refine SMM CPU Sync flow and abstract SmmCpuSyncLib Wu, Jiaxin
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 1/6] UefiCpuPkg/PiSmmCpuDxeSmm: Optimize Semaphore Sync between BSP and AP Wu, Jiaxin
@ 2023-12-06 10:01 ` Wu, Jiaxin
  2023-12-07  9:07   ` Ni, Ray
  2023-12-12 20:18   ` Laszlo Ersek
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance Wu, Jiaxin
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-06 10:01 UTC (permalink / raw)
  To: devel
  Cc: Laszlo Ersek, Eric Dong, Ray Ni, Zeng Star, Gerd Hoffmann,
	Rahul Kumar

Intel is planning to provide a different SMM CPU Sync implementation
along with some specific registers to improve SMI performance, hence
the need for a SmmCpuSyncLib library class.

This patch:
1. Adds the SmmCpuSyncLib library class in UefiCpuPkg.dec.
2. Adds the SmmCpuSyncLib.h function declaration header file.

The new SmmCpuSyncLib provides 3 sets of APIs:

1. ContextInit/ContextDeinit/ContextReset:
ContextInit() is called in the driver's entry point to allocate and
initialize the SMM CPU Sync context. ContextDeinit() is called in the
driver's unload function to deinitialize the SMM CPU Sync context.
ContextReset() is called before the CPU exits the SMI, which allows
the CPU to check in to the next SMI from that point.

2. GetArrivedCpuCount/CheckInCpu/CheckOutCpu/LockDoor:
When an SMI happens, all processors including the BSP enter SMM by
calling CheckInCpu(). The elected BSP calls LockDoor() so that
CheckInCpu() returns an error code after that point. CheckOutCpu()
can be called in the error handling flow by a CPU that called
CheckInCpu() earlier. GetArrivedCpuCount() returns the number of
checked-in CPUs.

3. WaitForAPs/ReleaseOneAp/WaitForBsp/ReleaseBsp:
WaitForAPs() & ReleaseOneAp() are called from the BSP to wait for a
number of APs and to release one specific AP. WaitForBsp() &
ReleaseBsp() are called from APs to wait for and release the BSP.
These 4 APIs are used to synchronize the running flow between the
BSP and APs. The BSP and AP sync flow can be easily understood as
below:
BSP: ReleaseOneAp  -->  AP: WaitForBsp
BSP: WaitForAPs    <--  AP: ReleaseBsp

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Zeng Star <star.zeng@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
---
 UefiCpuPkg/Include/Library/SmmCpuSyncLib.h | 275 +++++++++++++++++++++++++++++
 UefiCpuPkg/UefiCpuPkg.dec                  |   3 +
 2 files changed, 278 insertions(+)
 create mode 100644 UefiCpuPkg/Include/Library/SmmCpuSyncLib.h

diff --git a/UefiCpuPkg/Include/Library/SmmCpuSyncLib.h b/UefiCpuPkg/Include/Library/SmmCpuSyncLib.h
new file mode 100644
index 0000000000..0f9eb3414a
--- /dev/null
+++ b/UefiCpuPkg/Include/Library/SmmCpuSyncLib.h
@@ -0,0 +1,275 @@
+/** @file
+  Library that provides SMM CPU Sync related operations.
+  The lib provides 3 sets of APIs:
+  1. ContextInit/ContextDeinit/ContextReset:
+  ContextInit() is called in driver's entrypoint to allocate and initialize the SMM CPU Sync context.
+  ContextDeinit() is called in driver's unload function to deinitialize the SMM CPU Sync context.
+  ContextReset() is called before the CPU exits the SMI, which allows the CPU to check in to the next SMI from this point.
+
+  2. GetArrivedCpuCount/CheckInCpu/CheckOutCpu/LockDoor:
+  When an SMI happens, all processors including the BSP enter SMM by calling CheckInCpu().
+  The elected BSP calls LockDoor() so that CheckInCpu() returns an error code after that point.
+  CheckOutCpu() can be called in the error handling flow by a CPU that called CheckInCpu() earlier.
+  GetArrivedCpuCount() returns the number of checked-in CPUs.
+
+  3. WaitForAPs/ReleaseOneAp/WaitForBsp/ReleaseBsp
+  WaitForAPs() & ReleaseOneAp() are called from the BSP to wait for a number of APs and to release one specific AP.
+  WaitForBsp() & ReleaseBsp() are called from APs to wait for and release the BSP.
+  The 4 APIs are used to synchronize the running flow between the BSP and APs. The BSP and AP sync flow can be
+  easily understood as below:
+  BSP: ReleaseOneAp  -->  AP: WaitForBsp
+  BSP: WaitForAPs    <--  AP: ReleaseBsp
+
+  Copyright (c) 2023, Intel Corporation. All rights reserved.<BR>
+  SPDX-License-Identifier: BSD-2-Clause-Patent
+
+**/
+
+#ifndef SMM_CPU_SYNC_LIB_H_
+#define SMM_CPU_SYNC_LIB_H_
+
+#include <Uefi/UefiBaseType.h>
+
+//
+// Opaque structure for SMM CPU Sync context.
+//
+typedef struct SMM_CPU_SYNC_CONTEXT SMM_CPU_SYNC_CONTEXT;
+
+/**
+  Create and initialize the SMM CPU Sync context.
+
+  SmmCpuSyncContextInit() function is to allocate and initialize the SMM CPU Sync context.
+
+  @param[in]  NumberOfCpus          The number of Logical Processors in the system.
+  @param[out] SmmCpuSyncCtx         Pointer to the new created and initialized SMM CPU Sync context object.
+                                    NULL will be returned if any error happen during init.
+
+  @retval RETURN_SUCCESS            The SMM CPU Sync context was successful created and initialized.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
+  @retval RETURN_BUFFER_TOO_SMALL   Overflow happen
+  @retval RETURN_OUT_OF_RESOURCES   There are not enough resources available to create and initialize SMM CPU Sync context.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncContextInit (
+  IN   UINTN                 NumberOfCpus,
+  OUT  SMM_CPU_SYNC_CONTEXT  **SmmCpuSyncCtx
+  );
+
+/**
+  Deinit an allocated SMM CPU Sync context.
+
+  SmmCpuSyncContextDeinit() function is to deinitialize SMM CPU Sync context, the resources allocated in
+  SmmCpuSyncContextInit() will be freed.
+
+  Note: This function only can be called after SmmCpuSyncContextInit() return success.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object to be deinitialized.
+
+  @retval RETURN_SUCCESS            The SMM CPU Sync context was successful deinitialized.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
+  @retval RETURN_UNSUPPORTED        Unsupported operation.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncContextDeinit (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
+  );
+
+/**
+  Reset SMM CPU Sync context.
+
+  SmmCpuSyncContextReset() function is to reset SMM CPU Sync context to the initialized state.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object to be reset.
+
+  @retval RETURN_SUCCESS            The SMM CPU Sync context was successful reset.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncContextReset (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
+  );
+
+/**
+  Get current number of arrived CPU in SMI.
+
+  For the traditional CPU synchronization method, the BSP might need to know the current number of arrived CPUs in
+  the SMI to make sure all APs are in the SMI. This API can be used for that purpose.
+
+  @param[in]      SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in,out]  CpuCount          Current count of arrived CPU in SMI.
+
+  @retval RETURN_SUCCESS            Get current number of arrived CPU in SMI successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is NULL.
+  @retval RETURN_UNSUPPORTED        Unsupported operation.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncGetArrivedCpuCount (
+  IN     SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN OUT UINTN                 *CpuCount
+  );
+
+/**
+  Performs an atomic operation to check in CPU.
+
+  When SMI happens, all processors including BSP enter to SMM mode by calling SmmCpuSyncCheckInCpu().
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex          Check in CPU index.
+
+  @retval RETURN_SUCCESS            Check in CPU (CpuIndex) successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
+  @retval RETURN_ABORTED            Check in CPU failed due to SmmCpuSyncLockDoor() has been called by one elected CPU.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncCheckInCpu (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex
+  );
+
+/**
+  Performs an atomic operation to check out CPU.
+
+  CheckOutCpu() can be called in error handling flow for the CPU who calls CheckInCpu() earlier.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex          Check out CPU index.
+
+  @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
+  @retval RETURN_NOT_READY          The CPU is not checked-in.
+  @retval RETURN_UNSUPPORTED        Unsupported operation.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncCheckOutCpu (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex
+  );
+
+/**
+  Performs an atomic operation to lock the door for CPU check-in and check-out.
+
+  After this function is called, a CPU can no longer check in via SmmCpuSyncCheckInCpu().
+
+  The CPU specified by CpuIndex is elected to lock door.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex          Indicates which CPU locks the door.
+  @param[in,out]  CpuCount          Number of arrived CPUs in SMI after the door is locked.
+
+  @retval RETURN_SUCCESS            Lock door for CPU successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is NULL.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncLockDoor (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex,
+  IN OUT UINTN                 *CpuCount
+  );
+
+/**
+  Used by the BSP to wait for APs.
+
+  The number of APs that need to be waited for is specified by NumberOfAPs. The BSP is specified by BspIndex.
+
+  Note: This function is blocking, and it will return only after the specified number of APs have been
+  released by calling SmmCpuSyncReleaseBsp():
+  BSP: WaitForAPs    <--  AP: ReleaseBsp
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      NumberOfAPs       Number of APs need to be waited by BSP.
+  @param[in]      BspIndex          The BSP Index to wait for APs.
+
+  @retval RETURN_SUCCESS            BSP to wait for APs successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or NumberOfAPs > total number of processors in system.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncWaitForAPs (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 NumberOfAPs,
+  IN     UINTN                 BspIndex
+  );
+
+/**
+  Used by the BSP to release one AP.
+
+  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex          Indicates which AP needs to be released.
+  @param[in]      BspIndex          The BSP Index to release AP.
+
+  @retval RETURN_SUCCESS            BSP to release one AP successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncReleaseOneAp   (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex,
+  IN     UINTN                 BspIndex
+  );
+
+/**
+  Used by the AP to wait for the BSP.
+
+  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
+
+  Note: This function is blocking, and it will return only after the AP is released by
+  calling SmmCpuSyncReleaseOneAp():
+  BSP: ReleaseOneAp  -->  AP: WaitForBsp
+
+  @param[in,out]  SmmCpuSyncCtx    Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex         Indicates which AP waits for the BSP.
+  @param[in]      BspIndex         The BSP Index to be waited.
+
+  @retval RETURN_SUCCESS            AP to wait BSP successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncWaitForBsp (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex,
+  IN     UINTN                 BspIndex
+  );
+
+/**
+  Used by the AP to release BSP.
+
+  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex          Indicates which AP releases the BSP.
+  @param[in]      BspIndex          The BSP Index to be released.
+
+  @retval RETURN_SUCCESS            AP to release BSP successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncReleaseBsp (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex,
+  IN     UINTN                 BspIndex
+  );
+
+#endif
diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec
index 0b5431dbf7..20ab079219 100644
--- a/UefiCpuPkg/UefiCpuPkg.dec
+++ b/UefiCpuPkg/UefiCpuPkg.dec
@@ -62,10 +62,13 @@
   CpuPageTableLib|Include/Library/CpuPageTableLib.h
 
   ## @libraryclass   Provides functions for manipulating smram savestate registers.
   MmSaveStateLib|Include/Library/MmSaveStateLib.h
 
+  ## @libraryclass   Provides functions for SMM CPU Sync Operation.
+  SmmCpuSyncLib|Include/Library/SmmCpuSyncLib.h
+
 [LibraryClasses.RISCV64]
   ##  @libraryclass  Provides functions to manage MMU features on RISCV64 CPUs.
   ##
   RiscVMmuLib|Include/Library/BaseRiscVMmuLib.h
 
-- 
2.16.2.windows.1



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112110): https://edk2.groups.io/g/devel/message/112110
Mute This Topic: https://groups.io/mt/103010164/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance
  2023-12-06 10:01 [edk2-devel] [PATCH v3 0/6] Refine SMM CPU Sync flow and abstract SmmCpuSyncLib Wu, Jiaxin
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 1/6] UefiCpuPkg/PiSmmCpuDxeSmm: Optimize Semaphore Sync between BSP and AP Wu, Jiaxin
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class Wu, Jiaxin
@ 2023-12-06 10:01 ` Wu, Jiaxin
  2023-12-13 14:34   ` Laszlo Ersek
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 4/6] OvmfPkg: Specifies SmmCpuSyncLib instance Wu, Jiaxin
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-06 10:01 UTC (permalink / raw)
  To: devel
  Cc: Laszlo Ersek, Eric Dong, Ray Ni, Zeng Star, Gerd Hoffmann,
	Rahul Kumar

This patch implements a SmmCpuSyncLib library instance. The instance
refers to the existing SMM CPU driver (PiSmmCpuDxeSmm) sync
implementation and behavior:
1. Abstracts the Counter and Run semaphores into SmmCpuSyncCtx.
2. Abstracts the CPU arrival count operations into
SmmCpuSyncGetArrivedCpuCount(), SmmCpuSyncCheckInCpu(),
SmmCpuSyncCheckOutCpu() and SmmCpuSyncLockDoor(). The implementation
is aligned with the existing SMM CPU driver.
3. Abstracts the SMM CPU Sync flow to:
BSP: SmmCpuSyncReleaseOneAp  -->  AP: SmmCpuSyncWaitForBsp
BSP: SmmCpuSyncWaitForAPs    <--  AP: SmmCpuSyncReleaseBsp
Semaphore release & wait during the sync flow is the same as in the
existing SMM CPU driver.
4. Performs the same operations on the Counter and Run semaphores by
leveraging the atomic compare exchange.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Zeng Star <star.zeng@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
---
 UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c   | 647 +++++++++++++++++++++
 UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf |  39 ++
 UefiCpuPkg/UefiCpuPkg.dsc                          |   3 +
 3 files changed, 689 insertions(+)
 create mode 100644 UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
 create mode 100644 UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf

diff --git a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
new file mode 100644
index 0000000000..3c2835f8de
--- /dev/null
+++ b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
@@ -0,0 +1,647 @@
+/** @file
+  SMM CPU Sync lib implementation.
+  The lib provides 3 sets of APIs:
+  1. ContextInit/ContextDeinit/ContextReset:
+  ContextInit() is called in driver's entrypoint to allocate and initialize the SMM CPU Sync context.
+  ContextDeinit() is called in driver's unload function to deinitialize the SMM CPU Sync context.
+  ContextReset() is called before the CPU exits the SMI, which allows the CPU to check in to the next SMI from this point.
+
+  2. GetArrivedCpuCount/CheckInCpu/CheckOutCpu/LockDoor:
+  When an SMI happens, all processors including the BSP enter SMM by calling CheckInCpu().
+  The elected BSP calls LockDoor() so that CheckInCpu() returns an error code after that point.
+  CheckOutCpu() can be called in the error handling flow by a CPU that called CheckInCpu() earlier.
+  GetArrivedCpuCount() returns the number of checked-in CPUs.
+
+  3. WaitForAPs/ReleaseOneAp/WaitForBsp/ReleaseBsp
+  WaitForAPs() & ReleaseOneAp() are called from the BSP to wait for a number of APs and to release one specific AP.
+  WaitForBsp() & ReleaseBsp() are called from APs to wait for and release the BSP.
+  The 4 APIs are used to synchronize the running flow between the BSP and APs. The BSP and AP sync flow can be
+  easily understood as below:
+  BSP: ReleaseOneAp  -->  AP: WaitForBsp
+  BSP: WaitForAPs    <--  AP: ReleaseBsp
+
+  Copyright (c) 2023, Intel Corporation. All rights reserved.<BR>
+  SPDX-License-Identifier: BSD-2-Clause-Patent
+
+**/
+
+#include <Base.h>
+#include <Uefi.h>
+#include <Library/UefiLib.h>
+#include <Library/BaseLib.h>
+#include <Library/DebugLib.h>
+#include <Library/SafeIntLib.h>
+#include <Library/SynchronizationLib.h>
+#include <Library/DebugLib.h>
+#include <Library/BaseMemoryLib.h>
+#include <Library/SmmServicesTableLib.h>
+#include <Library/MemoryAllocationLib.h>
+#include <Library/SmmCpuSyncLib.h>
+
+typedef struct {
+  ///
+  /// Indicates how many CPUs have entered SMM.
+  ///
+  volatile UINT32    *Counter;
+} SMM_CPU_SYNC_SEMAPHORE_GLOBAL;
+
+typedef struct {
+  ///
+  /// Used to control whether each CPU continues to run or waits for a signal
+  ///
+  volatile UINT32    *Run;
+} SMM_CPU_SYNC_SEMAPHORE_CPU;
+
+struct SMM_CPU_SYNC_CONTEXT  {
+  ///
+  ///  All global semaphores' pointer in SMM CPU Sync
+  ///
+  SMM_CPU_SYNC_SEMAPHORE_GLOBAL    *GlobalSem;
+  ///
+  ///  All semaphores for each processor in SMM CPU Sync
+  ///
+  SMM_CPU_SYNC_SEMAPHORE_CPU       *CpuSem;
+  ///
+  /// The number of processors in the system.
+  /// This does not indicate the number of processors that entered SMM.
+  ///
+  UINTN                            NumberOfCpus;
+  ///
+  /// Address of global and each CPU semaphores
+  ///
+  UINTN                            *SemBuffer;
+  ///
+  /// Size in bytes of global and each CPU semaphores
+  ///
+  UINTN                            SemBufferSize;
+};
+
+/**
+  Performs an atomic compare exchange operation to get semaphore.
+  The compare exchange operation must be performed using MP safe
+  mechanisms.
+
+  @param[in,out]  Sem    IN:  32-bit unsigned integer
+                         OUT: original integer - 1
+
+  @retval     Original integer - 1
+
+**/
+UINT32
+InternalWaitForSemaphore (
+  IN OUT  volatile UINT32  *Sem
+  )
+{
+  UINT32  Value;
+
+  for ( ; ;) {
+    Value = *Sem;
+    if ((Value != 0) &&
+        (InterlockedCompareExchange32 (
+           (UINT32 *)Sem,
+           Value,
+           Value - 1
+           ) == Value))
+    {
+      break;
+    }
+
+    CpuPause ();
+  }
+
+  return Value - 1;
+}
+
+/**
+  Performs an atomic compare exchange operation to release semaphore.
+  The compare exchange operation must be performed using MP safe
+  mechanisms.
+
+  @param[in,out]  Sem    IN:  32-bit unsigned integer
+                         OUT: original integer + 1
+
+  @retval    Original integer + 1
+
+**/
+UINT32
+InternalReleaseSemaphore (
+  IN OUT  volatile UINT32  *Sem
+  )
+{
+  UINT32  Value;
+
+  do {
+    Value = *Sem;
+  } while (Value + 1 != 0 &&
+           InterlockedCompareExchange32 (
+             (UINT32 *)Sem,
+             Value,
+             Value + 1
+             ) != Value);
+
+  return Value + 1;
+}
+
+/**
+  Performs an atomic compare exchange operation to lock semaphore.
+  The compare exchange operation must be performed using MP safe
+  mechanisms.
+
+  @param[in,out]  Sem    IN:  32-bit unsigned integer
+                         OUT: -1
+
+  @retval    Original integer
+
+**/
+UINT32
+InternalLockdownSemaphore (
+  IN OUT  volatile UINT32  *Sem
+  )
+{
+  UINT32  Value;
+
+  do {
+    Value = *Sem;
+  } while (InterlockedCompareExchange32 (
+             (UINT32 *)Sem,
+             Value,
+             (UINT32)-1
+             ) != Value);
+
+  return Value;
+}
+
+/**
+  Create and initialize the SMM CPU Sync context.
+
+  SmmCpuSyncContextInit() allocates and initializes the SMM CPU Sync context.
+
+  @param[in]  NumberOfCpus          The number of Logical Processors in the system.
+  @param[out] SmmCpuSyncCtx         Pointer to the newly created and initialized SMM CPU Sync context object.
+                                    NULL will be returned if any error happens during initialization.
+
+  @retval RETURN_SUCCESS            The SMM CPU Sync context was successfully created and initialized.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
+  @retval RETURN_BUFFER_TOO_SMALL   Overflow happened during size calculation.
+  @retval RETURN_OUT_OF_RESOURCES   There are not enough resources available to create and initialize SMM CPU Sync context.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncContextInit (
+  IN   UINTN                 NumberOfCpus,
+  OUT  SMM_CPU_SYNC_CONTEXT  **SmmCpuSyncCtx
+  )
+{
+  RETURN_STATUS  Status;
+  UINTN          CpuSemInCtxSize;
+  UINTN          CtxSize;
+  UINTN          OneSemSize;
+  UINTN          GlobalSemSize;
+  UINTN          OneCpuSemSize;
+  UINTN          CpuSemSize;
+  UINTN          TotalSemSize;
+  UINTN          SemAddr;
+  UINTN          CpuIndex;
+
+  ASSERT (SmmCpuSyncCtx != NULL);
+  if (SmmCpuSyncCtx == NULL) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  //
+  // Count the CtxSize
+  //
+  Status = SafeUintnMult (NumberOfCpus, sizeof (SMM_CPU_SYNC_SEMAPHORE_CPU), &CpuSemInCtxSize);
+  if (EFI_ERROR (Status)) {
+    return Status;
+  }
+
+  Status = SafeUintnAdd (sizeof (SMM_CPU_SYNC_CONTEXT), CpuSemInCtxSize, &CtxSize);
+  if (EFI_ERROR (Status)) {
+    return Status;
+  }
+
+  Status = SafeUintnAdd (CtxSize, sizeof (SMM_CPU_SYNC_SEMAPHORE_GLOBAL), &CtxSize);
+  if (EFI_ERROR (Status)) {
+    return Status;
+  }
+
+  //
+  // Allocate CtxSize buffer for the *SmmCpuSyncCtx
+  //
+  *SmmCpuSyncCtx = NULL;
+  *SmmCpuSyncCtx = (SMM_CPU_SYNC_CONTEXT *)AllocatePages (EFI_SIZE_TO_PAGES (CtxSize));
+  ASSERT (*SmmCpuSyncCtx != NULL);
+  if (*SmmCpuSyncCtx == NULL) {
+    return RETURN_OUT_OF_RESOURCES;
+  }
+
+  (*SmmCpuSyncCtx)->GlobalSem    = (SMM_CPU_SYNC_SEMAPHORE_GLOBAL *)((UINT8 *)(*SmmCpuSyncCtx) + sizeof (SMM_CPU_SYNC_CONTEXT));
+  (*SmmCpuSyncCtx)->CpuSem       = (SMM_CPU_SYNC_SEMAPHORE_CPU *)((UINT8 *)(*SmmCpuSyncCtx) + sizeof (SMM_CPU_SYNC_CONTEXT) + sizeof (SMM_CPU_SYNC_SEMAPHORE_GLOBAL));
+  (*SmmCpuSyncCtx)->NumberOfCpus = NumberOfCpus;
+
+  //
+  // Count the TotalSemSize
+  //
+  OneSemSize = GetSpinLockProperties ();
+
+  Status = SafeUintnMult (OneSemSize, sizeof (SMM_CPU_SYNC_SEMAPHORE_GLOBAL) / sizeof (VOID *), &GlobalSemSize);
+  if (EFI_ERROR (Status)) {
+    goto ON_ERROR;
+  }
+
+  Status = SafeUintnMult (OneSemSize, sizeof (SMM_CPU_SYNC_SEMAPHORE_CPU) / sizeof (VOID *), &OneCpuSemSize);
+  if (EFI_ERROR (Status)) {
+    goto ON_ERROR;
+  }
+
+  Status = SafeUintnMult (NumberOfCpus, OneCpuSemSize, &CpuSemSize);
+  if (EFI_ERROR (Status)) {
+    goto ON_ERROR;
+  }
+
+  Status = SafeUintnAdd (GlobalSemSize, CpuSemSize, &TotalSemSize);
+  if (EFI_ERROR (Status)) {
+    goto ON_ERROR;
+  }
+
+  DEBUG ((DEBUG_INFO, "[%a] - One Semaphore Size    = 0x%x\n", __func__, OneSemSize));
+  DEBUG ((DEBUG_INFO, "[%a] - Total Semaphores Size = 0x%x\n", __func__, TotalSemSize));
+
+  //
+  // Allocate for Semaphores in the *SmmCpuSyncCtx
+  //
+  (*SmmCpuSyncCtx)->SemBufferSize = TotalSemSize;
+  (*SmmCpuSyncCtx)->SemBuffer     = AllocatePages (EFI_SIZE_TO_PAGES ((*SmmCpuSyncCtx)->SemBufferSize));
+  ASSERT ((*SmmCpuSyncCtx)->SemBuffer != NULL);
+  if ((*SmmCpuSyncCtx)->SemBuffer == NULL) {
+    Status = RETURN_OUT_OF_RESOURCES;
+    goto ON_ERROR;
+  }
+
+  ZeroMem ((*SmmCpuSyncCtx)->SemBuffer, TotalSemSize);
+
+  //
+  // Assign Global Semaphore pointer
+  //
+  SemAddr                               = (UINTN)(*SmmCpuSyncCtx)->SemBuffer;
+  (*SmmCpuSyncCtx)->GlobalSem->Counter  = (UINT32 *)SemAddr;
+  *(*SmmCpuSyncCtx)->GlobalSem->Counter = 0;
+  DEBUG ((DEBUG_INFO, "[%a] - (*SmmCpuSyncCtx)->GlobalSem->Counter Address: 0x%08x\n", __func__, (UINTN)(*SmmCpuSyncCtx)->GlobalSem->Counter));
+
+  SemAddr += GlobalSemSize;
+
+  //
+  // Assign CPU Semaphore pointer
+  //
+  for (CpuIndex = 0; CpuIndex < NumberOfCpus; CpuIndex++) {
+    (*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run  = (UINT32 *)(SemAddr + (CpuSemSize / NumberOfCpus) * CpuIndex);
+    *(*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run = 0;
+    DEBUG ((DEBUG_INFO, "[%a] - (*SmmCpuSyncCtx)->CpuSem[%d].Run Address: 0x%08x\n", __func__, CpuIndex, (UINTN)(*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run));
+  }
+
+  return RETURN_SUCCESS;
+
+ON_ERROR:
+  FreePages (*SmmCpuSyncCtx, EFI_SIZE_TO_PAGES (CtxSize));
+  *SmmCpuSyncCtx = NULL;
+  return Status;
+}
+
+/**
+  Deinit an allocated SMM CPU Sync context.
+
+  SmmCpuSyncContextDeinit() deinitializes the SMM CPU Sync context; the resources allocated in
+  SmmCpuSyncContextInit() are freed.
+
+  Note: This function can only be called after SmmCpuSyncContextInit() returns success.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object to be deinitialized.
+
+  @retval RETURN_SUCCESS            The SMM CPU Sync context was successfully deinitialized.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
+  @retval RETURN_UNSUPPORTED        Unsupported operation.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncContextDeinit (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
+  )
+{
+  UINTN  SmmCpuSyncCtxSize;
+
+  ASSERT (SmmCpuSyncCtx != NULL);
+  if (SmmCpuSyncCtx == NULL) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  SmmCpuSyncCtxSize = sizeof (SMM_CPU_SYNC_CONTEXT) + sizeof (SMM_CPU_SYNC_SEMAPHORE_GLOBAL) + sizeof (SMM_CPU_SYNC_SEMAPHORE_CPU) * (SmmCpuSyncCtx->NumberOfCpus);
+
+  FreePages (SmmCpuSyncCtx->SemBuffer, EFI_SIZE_TO_PAGES (SmmCpuSyncCtx->SemBufferSize));
+
+  FreePages (SmmCpuSyncCtx, EFI_SIZE_TO_PAGES (SmmCpuSyncCtxSize));
+
+  return RETURN_SUCCESS;
+}
+
+/**
+  Reset SMM CPU Sync context.
+
+  SmmCpuSyncContextReset() resets the SMM CPU Sync context to its initialized state.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object to be reset.
+
+  @retval RETURN_SUCCESS            The SMM CPU Sync context was successfully reset.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncContextReset (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
+  )
+{
+  ASSERT (SmmCpuSyncCtx != NULL);
+  if (SmmCpuSyncCtx == NULL) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  *SmmCpuSyncCtx->GlobalSem->Counter = 0;
+
+  return RETURN_SUCCESS;
+}
+
+/**
+  Get the current number of arrived CPUs in SMI.
+
+  With the traditional CPU synchronization method, the BSP might need to know the current number of
+  arrived CPUs in SMI to make sure all APs are in SMI. This API serves that purpose.
+
+  @param[in]      SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in,out]  CpuCount          Returns the current count of arrived CPUs in SMI.
+
+  @retval RETURN_SUCCESS            Got the current number of arrived CPUs in SMI successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is NULL.
+  @retval RETURN_UNSUPPORTED        Unsupported operation: the arrival counter has been locked.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncGetArrivedCpuCount (
+  IN     SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN OUT UINTN                 *CpuCount
+  )
+{
+  ASSERT (SmmCpuSyncCtx != NULL && CpuCount != NULL);
+  if ((SmmCpuSyncCtx == NULL) || (CpuCount == NULL)) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  if ((INT32)(*SmmCpuSyncCtx->GlobalSem->Counter) < 0) {
+    return RETURN_UNSUPPORTED;
+  }
+
+  *CpuCount = *SmmCpuSyncCtx->GlobalSem->Counter;
+
+  return RETURN_SUCCESS;
+}
+
+/**
+  Performs an atomic operation to check in a CPU.
+
+  When an SMI happens, all processors including the BSP enter SMM by calling SmmCpuSyncCheckInCpu().
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex          Check in CPU index.
+
+  @retval RETURN_SUCCESS            Checked in CPU (CpuIndex) successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
+  @retval RETURN_ABORTED            Check-in failed because SmmCpuSyncLockDoor() has already been called by the elected CPU.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncCheckInCpu (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex
+  )
+{
+  ASSERT (SmmCpuSyncCtx != NULL);
+  if (SmmCpuSyncCtx == NULL) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  //
+  // Check to return if Counter has already been locked.
+  //
+  if ((INT32)InternalReleaseSemaphore (SmmCpuSyncCtx->GlobalSem->Counter) <= 0) {
+    return RETURN_ABORTED;
+  }
+
+  return RETURN_SUCCESS;
+}
+
+/**
+  Performs an atomic operation to check out a CPU.
+
+  SmmCpuSyncCheckOutCpu() can be called in the error handling flow by a CPU that called SmmCpuSyncCheckInCpu() earlier.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex          Check out CPU index.
+
+  @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
+  @retval RETURN_NOT_READY          The CPU is not checked-in.
+  @retval RETURN_UNSUPPORTED        Unsupported operation.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncCheckOutCpu (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex
+  )
+{
+  ASSERT (SmmCpuSyncCtx != NULL);
+  if (SmmCpuSyncCtx == NULL) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  if (*SmmCpuSyncCtx->GlobalSem->Counter == 0) {
+    return RETURN_NOT_READY;
+  }
+
+  if ((INT32)InternalWaitForSemaphore (SmmCpuSyncCtx->GlobalSem->Counter) < 0) {
+    return RETURN_UNSUPPORTED;
+  }
+
+  return RETURN_SUCCESS;
+}
+
+/**
+  Performs an atomic operation to lock the door for CPU check-in and check-out.
+
+  After this function returns, a CPU can no longer check in via SmmCpuSyncCheckInCpu().
+
+  The CPU specified by CpuIndex is elected to lock the door.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex          Indicates which CPU locks the door.
+  @param[in,out]  CpuCount          Returns the number of arrived CPUs in SMI after the door is locked.
+
+  @retval RETURN_SUCCESS            The door was locked for CPU check-in and check-out successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is NULL.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncLockDoor (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex,
+  IN OUT UINTN                 *CpuCount
+  )
+{
+  ASSERT (SmmCpuSyncCtx != NULL && CpuCount != NULL);
+  if ((SmmCpuSyncCtx == NULL) || (CpuCount == NULL)) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  *CpuCount = InternalLockdownSemaphore (SmmCpuSyncCtx->GlobalSem->Counter);
+
+  return RETURN_SUCCESS;
+}
+
+/**
+  Used by the BSP to wait for APs.
+
+  The number of APs that must be waited for is specified by NumberOfAPs. The BSP is specified by BspIndex.
+
+  Note: This function blocks, and returns only after NumberOfAPs have released the BSP by
+  calling SmmCpuSyncReleaseBsp():
+  BSP: WaitForAPs    <--  AP: ReleaseBsp
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      NumberOfAPs       Number of APs the BSP needs to wait for.
+  @param[in]      BspIndex          The Index of the BSP that waits for the APs.
+
+  @retval RETURN_SUCCESS            The BSP waited for the APs successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL, or NumberOfAPs exceeds the total number of processors in the system.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncWaitForAPs (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 NumberOfAPs,
+  IN     UINTN                 BspIndex
+  )
+{
+  ASSERT (SmmCpuSyncCtx != NULL && NumberOfAPs <= SmmCpuSyncCtx->NumberOfCpus);
+  if ((SmmCpuSyncCtx == NULL) || (NumberOfAPs > SmmCpuSyncCtx->NumberOfCpus)) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  while (NumberOfAPs-- > 0) {
+    InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
+  }
+
+  return RETURN_SUCCESS;
+}
+
+/**
+  Used by the BSP to release one AP.
+
+  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex          Indicates which AP needs to be released.
+  @param[in]      BspIndex          The Index of the BSP that releases the AP.
+
+  @retval RETURN_SUCCESS            The BSP released the AP successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncReleaseOneAp (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex,
+  IN     UINTN                 BspIndex
+  )
+{
+  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);
+  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  InternalReleaseSemaphore (SmmCpuSyncCtx->CpuSem[CpuIndex].Run);
+
+  return RETURN_SUCCESS;
+}
+
+/**
+  Used by an AP to wait for the BSP.
+
+  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
+
+  Note: This function blocks, and returns only after the AP is released by the BSP
+  calling SmmCpuSyncReleaseOneAp():
+  BSP: ReleaseOneAp  -->  AP: WaitForBsp
+
+  @param[in,out]  SmmCpuSyncCtx    Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex         Indicates which AP waits for the BSP.
+  @param[in]      BspIndex         The Index of the BSP to be waited for.
+
+  @retval RETURN_SUCCESS            The AP waited for the BSP successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL, or CpuIndex is the same as BspIndex.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncWaitForBsp (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex,
+  IN     UINTN                 BspIndex
+  )
+{
+  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);
+  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[CpuIndex].Run);
+
+  return RETURN_SUCCESS;
+}
+
+/**
+  Used by an AP to release the BSP.
+
+  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
+
+  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
+  @param[in]      CpuIndex          Indicates which AP releases the BSP.
+  @param[in]      BspIndex          The Index of the BSP to be released.
+
+  @retval RETURN_SUCCESS            The AP released the BSP successfully.
+  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL, or CpuIndex is the same as BspIndex.
+
+**/
+RETURN_STATUS
+EFIAPI
+SmmCpuSyncReleaseBsp (
+  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
+  IN     UINTN                 CpuIndex,
+  IN     UINTN                 BspIndex
+  )
+{
+  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);
+  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  InternalReleaseSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
+
+  return RETURN_SUCCESS;
+}
diff --git a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
new file mode 100644
index 0000000000..6bb1895577
--- /dev/null
+++ b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
@@ -0,0 +1,39 @@
+## @file
+# SMM CPU Synchronization lib.
+#
+# This is SMM CPU Synchronization lib used for SMM CPU sync operations.
+#
+# Copyright (c) 2023, Intel Corporation. All rights reserved.<BR>
+# SPDX-License-Identifier: BSD-2-Clause-Patent
+#
+##
+
+[Defines]
+  INF_VERSION                    = 0x00010005
+  BASE_NAME                      = SmmCpuSyncLib
+  FILE_GUID                      = 1ca1bc1a-16a4-46ef-956a-ca500fd3381f
+  MODULE_TYPE                    = DXE_SMM_DRIVER
+  LIBRARY_CLASS                  = SmmCpuSyncLib|DXE_SMM_DRIVER
+
+[Sources]
+  SmmCpuSyncLib.c
+
+[Packages]
+  MdePkg/MdePkg.dec
+  MdeModulePkg/MdeModulePkg.dec
+  UefiCpuPkg/UefiCpuPkg.dec
+
+[LibraryClasses]
+  UefiLib
+  BaseLib
+  DebugLib
+  PrintLib
+  SafeIntLib
+  SynchronizationLib
+  BaseMemoryLib
+  SmmServicesTableLib
+  MemoryAllocationLib
+
+[Pcd]
+
+[Protocols]
diff --git a/UefiCpuPkg/UefiCpuPkg.dsc b/UefiCpuPkg/UefiCpuPkg.dsc
index 074fd77461..f264031c77 100644
--- a/UefiCpuPkg/UefiCpuPkg.dsc
+++ b/UefiCpuPkg/UefiCpuPkg.dsc
@@ -23,10 +23,11 @@
 #
 
 !include MdePkg/MdeLibs.dsc.inc
 
 [LibraryClasses]
+  SafeIntLib|MdePkg/Library/BaseSafeIntLib/BaseSafeIntLib.inf
   BaseLib|MdePkg/Library/BaseLib/BaseLib.inf
   BaseMemoryLib|MdePkg/Library/BaseMemoryLib/BaseMemoryLib.inf
   CpuLib|MdePkg/Library/BaseCpuLib/BaseCpuLib.inf
   DebugLib|MdePkg/Library/BaseDebugLibNull/BaseDebugLibNull.inf
   SerialPortLib|MdePkg/Library/BaseSerialPortLibNull/BaseSerialPortLibNull.inf
@@ -54,10 +55,11 @@
   CacheMaintenanceLib|MdePkg/Library/BaseCacheMaintenanceLib/BaseCacheMaintenanceLib.inf
   PciLib|MdePkg/Library/BasePciLibPciExpress/BasePciLibPciExpress.inf
   PciExpressLib|MdePkg/Library/BasePciExpressLib/BasePciExpressLib.inf
   SmmCpuPlatformHookLib|UefiCpuPkg/Library/SmmCpuPlatformHookLibNull/SmmCpuPlatformHookLibNull.inf
   SmmCpuFeaturesLib|UefiCpuPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
+  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
   PeCoffGetEntryPointLib|MdePkg/Library/BasePeCoffGetEntryPointLib/BasePeCoffGetEntryPointLib.inf
   PeCoffExtraActionLib|MdePkg/Library/BasePeCoffExtraActionLibNull/BasePeCoffExtraActionLibNull.inf
   TpmMeasurementLib|MdeModulePkg/Library/TpmMeasurementLibNull/TpmMeasurementLibNull.inf
   CcExitLib|UefiCpuPkg/Library/CcExitLibNull/CcExitLibNull.inf
   MicrocodeLib|UefiCpuPkg/Library/MicrocodeLib/MicrocodeLib.inf
@@ -154,10 +156,11 @@
   UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.inf
   UefiCpuPkg/Library/SmmCpuPlatformHookLibNull/SmmCpuPlatformHookLibNull.inf
   UefiCpuPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
   UefiCpuPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLibStm.inf
   UefiCpuPkg/Library/SmmCpuFeaturesLib/StandaloneMmCpuFeaturesLib.inf
+  UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
   UefiCpuPkg/Library/CcExitLibNull/CcExitLibNull.inf
   UefiCpuPkg/PiSmmCommunication/PiSmmCommunicationPei.inf
   UefiCpuPkg/PiSmmCommunication/PiSmmCommunicationSmm.inf
   UefiCpuPkg/SecCore/SecCore.inf
   UefiCpuPkg/SecCore/SecCoreNative.inf
-- 
2.16.2.windows.1



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112111): https://edk2.groups.io/g/devel/message/112111
Mute This Topic: https://groups.io/mt/103010165/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [edk2-devel] [PATCH v3 4/6] OvmfPkg: Specifies SmmCpuSyncLib instance
  2023-12-06 10:01 [edk2-devel] [PATCH v3 0/6] Refine SMM CPU Sync flow and abstract SmmCpuSyncLib Wu, Jiaxin
                   ` (2 preceding siblings ...)
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance Wu, Jiaxin
@ 2023-12-06 10:01 ` Wu, Jiaxin
  2023-12-13 16:52   ` Laszlo Ersek
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 5/6] UefiPayloadPkg: " Wu, Jiaxin
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 6/6] UefiCpuPkg/PiSmmCpuDxeSmm: Consume SmmCpuSyncLib Wu, Jiaxin
  5 siblings, 1 reply; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-06 10:01 UTC (permalink / raw)
  To: devel
  Cc: Laszlo Ersek, Ard Biesheuvel, Jiewen Yao, Jordan Justen,
	Eric Dong, Ray Ni, Zeng Star, Rahul Kumar, Gerd Hoffmann

This patch is to specify SmmCpuSyncLib instance for OvmfPkg.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Zeng Star <star.zeng@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
---
 OvmfPkg/CloudHv/CloudHvX64.dsc | 2 ++
 OvmfPkg/OvmfPkgIa32.dsc        | 2 ++
 OvmfPkg/OvmfPkgIa32X64.dsc     | 2 ++
 OvmfPkg/OvmfPkgX64.dsc         | 1 +
 4 files changed, 7 insertions(+)

diff --git a/OvmfPkg/CloudHv/CloudHvX64.dsc b/OvmfPkg/CloudHv/CloudHvX64.dsc
index 821ad1b9fa..f735b69a37 100644
--- a/OvmfPkg/CloudHv/CloudHvX64.dsc
+++ b/OvmfPkg/CloudHv/CloudHvX64.dsc
@@ -183,10 +183,12 @@
   PeiHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/PeiHardwareInfoLib.inf
   DxeHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/DxeHardwareInfoLib.inf
   ImagePropertiesRecordLib|MdeModulePkg/Library/ImagePropertiesRecordLib/ImagePropertiesRecordLib.inf
 !if $(SMM_REQUIRE) == FALSE
   LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
+!else
+  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
 !endif
   CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
   FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
   MemEncryptTdxLib|OvmfPkg/Library/BaseMemEncryptTdxLib/BaseMemEncryptTdxLib.inf
 
diff --git a/OvmfPkg/OvmfPkgIa32.dsc b/OvmfPkg/OvmfPkgIa32.dsc
index bce2aedcd7..b05b13b18c 100644
--- a/OvmfPkg/OvmfPkgIa32.dsc
+++ b/OvmfPkg/OvmfPkgIa32.dsc
@@ -188,10 +188,12 @@
   PeiHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/PeiHardwareInfoLib.inf
   DxeHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/DxeHardwareInfoLib.inf
   ImagePropertiesRecordLib|MdeModulePkg/Library/ImagePropertiesRecordLib/ImagePropertiesRecordLib.inf
 !if $(SMM_REQUIRE) == FALSE
   LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
+!else
+  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
 !endif
   CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
   FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
 
 !if $(SOURCE_DEBUG_ENABLE) == TRUE
diff --git a/OvmfPkg/OvmfPkgIa32X64.dsc b/OvmfPkg/OvmfPkgIa32X64.dsc
index 631e909a54..5a16eb7abe 100644
--- a/OvmfPkg/OvmfPkgIa32X64.dsc
+++ b/OvmfPkg/OvmfPkgIa32X64.dsc
@@ -193,10 +193,12 @@
   PeiHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/PeiHardwareInfoLib.inf
   DxeHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/DxeHardwareInfoLib.inf
   ImagePropertiesRecordLib|MdeModulePkg/Library/ImagePropertiesRecordLib/ImagePropertiesRecordLib.inf
 !if $(SMM_REQUIRE) == FALSE
   LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
+!else
+  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
 !endif
   CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
   FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
 
 !if $(SOURCE_DEBUG_ENABLE) == TRUE
diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
index 4ea3008cc6..6bb4c777b9 100644
--- a/OvmfPkg/OvmfPkgX64.dsc
+++ b/OvmfPkg/OvmfPkgX64.dsc
@@ -209,10 +209,11 @@
 !if $(SMM_REQUIRE) == FALSE
   LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
   CcProbeLib|OvmfPkg/Library/CcProbeLib/DxeCcProbeLib.inf
 !else
   CcProbeLib|MdePkg/Library/CcProbeLibNull/CcProbeLibNull.inf
+  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
 !endif
   CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
   FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
 
 !if $(SOURCE_DEBUG_ENABLE) == TRUE
-- 
2.16.2.windows.1






^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [edk2-devel] [PATCH v3 5/6] UefiPayloadPkg: Specifies SmmCpuSyncLib instance
  2023-12-06 10:01 [edk2-devel] [PATCH v3 0/6] Refine SMM CPU Sync flow and abstract SmmCpuSyncLib Wu, Jiaxin
                   ` (3 preceding siblings ...)
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 4/6] OvmfPkg: Specifies SmmCpuSyncLib instance Wu, Jiaxin
@ 2023-12-06 10:01 ` Wu, Jiaxin
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 6/6] UefiCpuPkg/PiSmmCpuDxeSmm: Consume SmmCpuSyncLib Wu, Jiaxin
  5 siblings, 0 replies; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-06 10:01 UTC (permalink / raw)
  To: devel
  Cc: Laszlo Ersek, Guo Dong, Sean Rhodes, James Lu, Gua Guo, Ray Ni,
	Zeng Star

This patch is to specify SmmCpuSyncLib instance for UefiPayloadPkg.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Guo Dong <guo.dong@intel.com>
Cc: Sean Rhodes <sean@starlabs.systems>
Cc: James Lu <james.lu@intel.com>
Cc: Gua Guo <gua.guo@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Zeng Star <star.zeng@intel.com>
Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
Reviewed-by: Gua Guo <gua.guo@intel.com>
---
 UefiPayloadPkg/UefiPayloadPkg.dsc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/UefiPayloadPkg/UefiPayloadPkg.dsc b/UefiPayloadPkg/UefiPayloadPkg.dsc
index a65f9d5b83..b8b13ad201 100644
--- a/UefiPayloadPkg/UefiPayloadPkg.dsc
+++ b/UefiPayloadPkg/UefiPayloadPkg.dsc
@@ -253,10 +253,11 @@
   #
   MtrrLib|UefiCpuPkg/Library/MtrrLib/MtrrLib.inf
   LocalApicLib|UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.inf
   MicrocodeLib|UefiCpuPkg/Library/MicrocodeLib/MicrocodeLib.inf
   CpuPageTableLib|UefiCpuPkg/Library/CpuPageTableLib/CpuPageTableLib.inf
+  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
 
   #
   # Platform
   #
 !if $(CPU_TIMER_LIB_ENABLE) == TRUE && $(UNIVERSAL_PAYLOAD) == TRUE
-- 
2.16.2.windows.1






^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [edk2-devel] [PATCH v3 6/6] UefiCpuPkg/PiSmmCpuDxeSmm: Consume SmmCpuSyncLib
  2023-12-06 10:01 [edk2-devel] [PATCH v3 0/6] Refine SMM CPU Sync flow and abstract SmmCpuSyncLib Wu, Jiaxin
                   ` (4 preceding siblings ...)
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 5/6] UefiPayloadPkg: " Wu, Jiaxin
@ 2023-12-06 10:01 ` Wu, Jiaxin
  5 siblings, 0 replies; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-06 10:01 UTC (permalink / raw)
  To: devel
  Cc: Laszlo Ersek, Eric Dong, Ray Ni, Zeng Star, Gerd Hoffmann,
	Rahul Kumar

There is the SmmCpuSyncLib Library class define the SMM CPU sync
flow, which is aligned with existing SMM CPU driver sync behavior.
This patch is to consume SmmCpuSyncLib instance directly.

With this change, SMM CPU Sync flow/logic can be customized
with different implementation no matter for any purpose, e.g.
performance tuning, handle specific register, etc.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Zeng Star <star.zeng@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
---
 UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c        | 317 +++++++++------------------
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h   |   6 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf |   1 +
 3 files changed, 110 insertions(+), 214 deletions(-)

diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
index 54542262a2..e37c03d0e5 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
@@ -27,122 +27,10 @@ MM_COMPLETION                mSmmStartupThisApToken;
 //
 // Processor specified by mPackageFirstThreadIndex[PackageIndex] will do the package-scope register check.
 //
 UINT32  *mPackageFirstThreadIndex = NULL;
 
-/**
-  Performs an atomic compare exchange operation to get semaphore.
-  The compare exchange operation must be performed using
-  MP safe mechanisms.
-
-  @param      Sem        IN:  32-bit unsigned integer
-                         OUT: original integer - 1
-  @return     Original integer - 1
-
-**/
-UINT32
-WaitForSemaphore (
-  IN OUT  volatile UINT32  *Sem
-  )
-{
-  UINT32  Value;
-
-  for ( ; ;) {
-    Value = *Sem;
-    if ((Value != 0) &&
-        (InterlockedCompareExchange32 (
-           (UINT32 *)Sem,
-           Value,
-           Value - 1
-           ) == Value))
-    {
-      break;
-    }
-
-    CpuPause ();
-  }
-
-  return Value - 1;
-}
-
-/**
-  Performs an atomic compare exchange operation to release semaphore.
-  The compare exchange operation must be performed using
-  MP safe mechanisms.
-
-  @param      Sem        IN:  32-bit unsigned integer
-                         OUT: original integer + 1
-  @return     Original integer + 1
-
-**/
-UINT32
-ReleaseSemaphore (
-  IN OUT  volatile UINT32  *Sem
-  )
-{
-  UINT32  Value;
-
-  do {
-    Value = *Sem;
-  } while (Value + 1 != 0 &&
-           InterlockedCompareExchange32 (
-             (UINT32 *)Sem,
-             Value,
-             Value + 1
-             ) != Value);
-
-  return Value + 1;
-}
-
-/**
-  Performs an atomic compare exchange operation to lock semaphore.
-  The compare exchange operation must be performed using
-  MP safe mechanisms.
-
-  @param      Sem        IN:  32-bit unsigned integer
-                         OUT: -1
-  @return     Original integer
-
-**/
-UINT32
-LockdownSemaphore (
-  IN OUT  volatile UINT32  *Sem
-  )
-{
-  UINT32  Value;
-
-  do {
-    Value = *Sem;
-  } while (InterlockedCompareExchange32 (
-             (UINT32 *)Sem,
-             Value,
-             (UINT32)-1
-             ) != Value);
-
-  return Value;
-}
-
-/**
-  Used for BSP to wait all APs.
-  Wait all APs to performs an atomic compare exchange operation to release semaphore.
-
-  @param   NumberOfAPs      AP number
-
-**/
-VOID
-WaitForAllAPs (
-  IN      UINTN  NumberOfAPs
-  )
-{
-  UINTN  BspIndex;
-
-  BspIndex = mSmmMpSyncData->BspIndex;
-  while (NumberOfAPs-- > 0) {
-    WaitForSemaphore (mSmmMpSyncData->CpuData[BspIndex].Run);
-  }
-}
-
 /**
   Used for BSP to release all APs.
   Performs an atomic compare exchange operation to release semaphore
   for each AP.
 
@@ -154,57 +42,15 @@ ReleaseAllAPs (
 {
   UINTN  Index;
 
   for (Index = 0; Index < mMaxNumberOfCpus; Index++) {
     if (IsPresentAp (Index)) {
-      ReleaseSemaphore (mSmmMpSyncData->CpuData[Index].Run);
+      SmmCpuSyncReleaseOneAp (mSmmMpSyncData->SmmCpuSyncCtx, Index, gSmmCpuPrivate->SmmCoreEntryContext.CurrentlyExecutingCpu);
     }
   }
 }
 
-/**
-  Used for BSP to release one AP.
-
-  @param      ApSem     IN:  32-bit unsigned integer
-                        OUT: original integer + 1
-**/
-VOID
-ReleaseOneAp   (
-  IN OUT  volatile UINT32  *ApSem
-  )
-{
-  ReleaseSemaphore (ApSem);
-}
-
-/**
-  Used for AP to wait BSP.
-
-  @param      ApSem      IN:  32-bit unsigned integer
-                         OUT: original integer - 1
-**/
-VOID
-WaitForBsp  (
-  IN OUT  volatile UINT32  *ApSem
-  )
-{
-  WaitForSemaphore (ApSem);
-}
-
-/**
-  Used for AP to release BSP.
-
-  @param      BspSem     IN:  32-bit unsigned integer
-                         OUT: original integer + 1
-**/
-VOID
-ReleaseBsp   (
-  IN OUT  volatile UINT32  *BspSem
-  )
-{
-  ReleaseSemaphore (BspSem);
-}
-
 /**
   Check whether the index of CPU perform the package level register
   programming during System Management Mode initialization.
 
   The index of Processor specified by mPackageFirstThreadIndex[PackageIndex]
@@ -285,42 +131,53 @@ GetSmmDelayedBlockedDisabledCount (
 BOOLEAN
 AllCpusInSmmExceptBlockedDisabled (
   VOID
   )
 {
+  RETURN_STATUS  Status;
+
+  UINTN   CpuCount;
   UINT32  BlockedCount;
   UINT32  DisabledCount;
 
+  CpuCount      = 0;
   BlockedCount  = 0;
   DisabledCount = 0;
 
+  Status = SmmCpuSyncGetArrivedCpuCount (mSmmMpSyncData->SmmCpuSyncCtx, &CpuCount);
+  if (EFI_ERROR (Status)) {
+    DEBUG ((DEBUG_ERROR, "AllCpusInSmmExceptBlockedDisabled: SmmCpuSyncGetArrivedCpuCount return error %r!\n", Status));
+    CpuDeadLoop ();
+    return FALSE;
+  }
+
   //
-  // Check to make sure mSmmMpSyncData->Counter is valid and not locked.
+  // Check to make sure the CPU arrival count is valid and not locked.
   //
-  ASSERT (*mSmmMpSyncData->Counter <= mNumberOfCpus);
+  ASSERT (CpuCount <= mNumberOfCpus);
 
   //
   // Check whether all CPUs in SMM.
   //
-  if (*mSmmMpSyncData->Counter == mNumberOfCpus) {
+  if (CpuCount == mNumberOfCpus) {
     return TRUE;
   }
 
   //
   // Check for the Blocked & Disabled Exceptions Case.
   //
   GetSmmDelayedBlockedDisabledCount (NULL, &BlockedCount, &DisabledCount);
 
   //
-  // *mSmmMpSyncData->Counter might be updated by all APs concurrently. The value
+  // The CPU arrival count might be updated by all APs concurrently. The value
   // can be dynamic changed. If some Aps enter the SMI after the BlockedCount &
-  // DisabledCount check, then the *mSmmMpSyncData->Counter will be increased, thus
-  // leading the *mSmmMpSyncData->Counter + BlockedCount + DisabledCount > mNumberOfCpus.
+  // DisabledCount check, then the CPU arrival count will be increased, thus
+  // leading to the retrieved CPU arrival count + BlockedCount + DisabledCount > mNumberOfCpus.
   // since the BlockedCount & DisabledCount are local variable, it's ok here only for
   // the checking of all CPUs In Smm.
   //
-  if (*mSmmMpSyncData->Counter + BlockedCount + DisabledCount >= mNumberOfCpus) {
+  if (CpuCount + BlockedCount + DisabledCount >= mNumberOfCpus) {
     return TRUE;
   }
 
   return FALSE;
 }
@@ -384,23 +241,35 @@ IsLmceSignaled (
 VOID
 SmmWaitForApArrival (
   VOID
   )
 {
+  RETURN_STATUS  Status;
+
+  UINTN    CpuCount;
   UINT64   Timer;
   UINTN    Index;
   BOOLEAN  LmceEn;
   BOOLEAN  LmceSignal;
   UINT32   DelayedCount;
   UINT32   BlockedCount;
 
   PERF_FUNCTION_BEGIN ();
 
+  CpuCount     = 0;
   DelayedCount = 0;
   BlockedCount = 0;
 
-  ASSERT (*mSmmMpSyncData->Counter <= mNumberOfCpus);
+  Status = SmmCpuSyncGetArrivedCpuCount (mSmmMpSyncData->SmmCpuSyncCtx, &CpuCount);
+  if (EFI_ERROR (Status)) {
+    DEBUG ((DEBUG_ERROR, "SmmWaitForApArrival: SmmCpuSyncGetArrivedCpuCount return error %r!\n", Status));
+    CpuDeadLoop ();
+    PERF_FUNCTION_END ();
+    return;
+  }
+
+  ASSERT (CpuCount <= mNumberOfCpus);
 
   LmceEn     = FALSE;
   LmceSignal = FALSE;
   if (mMachineCheckSupported) {
     LmceEn     = IsLmceOsEnabled ();
@@ -431,10 +300,21 @@ SmmWaitForApArrival (
     }
 
     CpuPause ();
   }
 
+  //
+  // Re-check the CpuCount after the first round of waiting for AP arrival.
+  //
+  Status = SmmCpuSyncGetArrivedCpuCount (mSmmMpSyncData->SmmCpuSyncCtx, &CpuCount);
+  if (EFI_ERROR (Status)) {
+    DEBUG ((DEBUG_ERROR, "SmmWaitForApArrival: SmmCpuSyncGetArrivedCpuCount return error %r!\n", Status));
+    CpuDeadLoop ();
+    PERF_FUNCTION_END ();
+    return;
+  }
+
   //
   // Not all APs have arrived, so we need 2nd round of timeout. IPIs should be sent to ALL none present APs,
   // because:
   // a) Delayed AP may have just come out of the delayed state. Blocked AP may have just been brought out of blocked state by some AP running
   //    normal mode code. These APs need to be guaranteed to have an SMI pending to insure that once they are out of delayed / blocked state, they
@@ -447,11 +327,11 @@ SmmWaitForApArrival (
   // d) We don't add code to check SMI disabling status to skip sending IPI to SMI disabled APs, because:
   //    - In traditional flow, SMI disabling is discouraged.
   //    - In relaxed flow, CheckApArrival() will check SMI disabling status before calling this function.
   //    In both cases, adding SMI-disabling checking code increases overhead.
   //
-  if (*mSmmMpSyncData->Counter < mNumberOfCpus) {
+  if (CpuCount < mNumberOfCpus) {
     //
     // Send SMI IPIs to bring outside processors in
     //
     for (Index = 0; Index < mMaxNumberOfCpus; Index++) {
       if (!(*(mSmmMpSyncData->CpuData[Index].Present)) && (gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId != INVALID_APIC_ID)) {
@@ -610,18 +490,22 @@ VOID
 BSPHandler (
   IN      UINTN              CpuIndex,
   IN      SMM_CPU_SYNC_MODE  SyncMode
   )
 {
+  RETURN_STATUS  Status;
+
+  UINTN          CpuCount;
   UINTN          Index;
   MTRR_SETTINGS  Mtrrs;
   UINTN          ApCount;
   BOOLEAN        ClearTopLevelSmiResult;
   UINTN          PresentCount;
 
   ASSERT (CpuIndex == mSmmMpSyncData->BspIndex);
-  ApCount = 0;
+  CpuCount = 0;
+  ApCount  = 0;
 
   PERF_FUNCTION_BEGIN ();
 
   //
   // Flag BSP's presence
@@ -659,28 +543,35 @@ BSPHandler (
     // Wait for APs to arrive
     //
     SmmWaitForApArrival ();
 
     //
-    // Lock the counter down and retrieve the number of APs
+    // Lock the door against late-coming CPU check-in and retrieve the number of arrived APs
     //
     *mSmmMpSyncData->AllCpusInSync = TRUE;
-    ApCount                        = LockdownSemaphore (mSmmMpSyncData->Counter) - 1;
+
+    Status = SmmCpuSyncLockDoor (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, &CpuCount);
+    if (EFI_ERROR (Status)) {
+      DEBUG ((DEBUG_ERROR, "BSPHandler: SmmCpuSyncLockDoor return error %r!\n", Status));
+      CpuDeadLoop ();
+    }
+
+    ApCount = CpuCount - 1;
 
     //
     // Wait for all APs to get ready for programming MTRRs
     //
-    WaitForAllAPs (ApCount);
+    SmmCpuSyncWaitForAPs (mSmmMpSyncData->SmmCpuSyncCtx, ApCount, CpuIndex);
 
     if (SmmCpuFeaturesNeedConfigureMtrrs ()) {
       //
       // Signal all APs it's time for backup MTRRs
       //
       ReleaseAllAPs ();
 
       //
-      // WaitForAllAPs() may wait for ever if an AP happens to enter SMM at
+      // SmmCpuSyncWaitForAPs() may wait forever if an AP happens to enter SMM at
       // exactly this point. Please make sure PcdCpuSmmMaxSyncLoops has been set
       // to a large enough value to avoid this situation.
       // Note: For HT capable CPUs, threads within a core share the same set of MTRRs.
       // We do the backup first and then set MTRR to avoid race condition for threads
       // in the same core.
@@ -688,28 +579,28 @@ BSPHandler (
       MtrrGetAllMtrrs (&Mtrrs);
 
       //
       // Wait for all APs to complete their MTRR saving
       //
-      WaitForAllAPs (ApCount);
+      SmmCpuSyncWaitForAPs (mSmmMpSyncData->SmmCpuSyncCtx, ApCount, CpuIndex);
 
       //
       // Let all processors program SMM MTRRs together
       //
       ReleaseAllAPs ();
 
       //
-      // WaitForAllAPs() may wait for ever if an AP happens to enter SMM at
+      // SmmCpuSyncWaitForAPs() may wait forever if an AP happens to enter SMM at
       // exactly this point. Please make sure PcdCpuSmmMaxSyncLoops has been set
       // to a large enough value to avoid this situation.
       //
       ReplaceOSMtrrs (CpuIndex);
 
       //
       // Wait for all APs to complete their MTRR programming
       //
-      WaitForAllAPs (ApCount);
+      SmmCpuSyncWaitForAPs (mSmmMpSyncData->SmmCpuSyncCtx, ApCount, CpuIndex);
     }
   }
 
   //
   // The BUSY lock is initialized to Acquired state
@@ -741,14 +632,21 @@ BSPHandler (
   // make those APs to exit SMI synchronously. APs which arrive later will be excluded and
   // will run through freely.
   //
   if ((SyncMode != SmmCpuSyncModeTradition) && !SmmCpuFeaturesNeedConfigureMtrrs ()) {
     //
-    // Lock the counter down and retrieve the number of APs
+    // Lock the door against late-coming CPU check-in and retrieve the number of arrived APs
     //
     *mSmmMpSyncData->AllCpusInSync = TRUE;
-    ApCount                        = LockdownSemaphore (mSmmMpSyncData->Counter) - 1;
+    Status                         = SmmCpuSyncLockDoor (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, &CpuCount);
+    if (EFI_ERROR (Status)) {
+      DEBUG ((DEBUG_ERROR, "BSPHandler: SmmCpuSyncLockDoor return error %r!\n", Status));
+      CpuDeadLoop ();
+    }
+
+    ApCount = CpuCount - 1;
+
     //
     // Make sure all APs have their Present flag set
     //
     while (TRUE) {
       PresentCount = 0;
@@ -771,11 +669,11 @@ BSPHandler (
   ReleaseAllAPs ();
 
   //
   // Wait for all APs to complete their pending tasks
   //
-  WaitForAllAPs (ApCount);
+  SmmCpuSyncWaitForAPs (mSmmMpSyncData->SmmCpuSyncCtx, ApCount, CpuIndex);
 
   if (SmmCpuFeaturesNeedConfigureMtrrs ()) {
     //
     // Signal APs to restore MTRRs
     //
@@ -788,11 +686,11 @@ BSPHandler (
     MtrrSetAllMtrrs (&Mtrrs);
 
     //
     // Wait for all APs to complete MTRR programming
     //
-    WaitForAllAPs (ApCount);
+    SmmCpuSyncWaitForAPs (mSmmMpSyncData->SmmCpuSyncCtx, ApCount, CpuIndex);
   }
 
   //
   // Stop source level debug in BSP handler, the code below will not be
   // debugged.
@@ -816,11 +714,11 @@ BSPHandler (
 
   //
   // Gather APs to exit SMM synchronously. Note the Present flag is cleared by now but
   // WaitForAllAps does not depend on the Present flag.
   //
-  WaitForAllAPs (ApCount);
+  SmmCpuSyncWaitForAPs (mSmmMpSyncData->SmmCpuSyncCtx, ApCount, CpuIndex);
 
   //
   // At this point, all APs should have exited from APHandler().
   // Migrate the SMM MP performance logging to standard SMM performance logging.
   // Any SMM MP performance logging after this point will be migrated in next SMI.
@@ -842,11 +740,11 @@ BSPHandler (
   }
 
   //
   // Allow APs to check in from this point on
   //
-  *mSmmMpSyncData->Counter                  = 0;
+  SmmCpuSyncContextReset (mSmmMpSyncData->SmmCpuSyncCtx);
   *mSmmMpSyncData->AllCpusInSync            = FALSE;
   mSmmMpSyncData->AllApArrivedWithException = FALSE;
 
   PERF_FUNCTION_END ();
 }
@@ -912,21 +810,21 @@ APHandler (
 
       if (!(*mSmmMpSyncData->InsideSmm)) {
         //
         // Give up since BSP is unable to enter SMM
         // and signal the completion of this AP
-        // Reduce the mSmmMpSyncData->Counter!
+        // Reduce the CPU arrival count!
         //
-        WaitForSemaphore (mSmmMpSyncData->Counter);
+        SmmCpuSyncCheckOutCpu (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex);
         return;
       }
     } else {
       //
       // Don't know BSP index. Give up without sending IPI to BSP.
-      // Reduce the mSmmMpSyncData->Counter!
+      // Reduce the CPU arrival count!
       //
-      WaitForSemaphore (mSmmMpSyncData->Counter);
+      SmmCpuSyncCheckOutCpu (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex);
       return;
     }
   }
 
   //
@@ -942,50 +840,50 @@ APHandler (
 
   if ((SyncMode == SmmCpuSyncModeTradition) || SmmCpuFeaturesNeedConfigureMtrrs ()) {
     //
     // Notify BSP of arrival at this point
     //
-    ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
+    SmmCpuSyncReleaseBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
   }
 
   if (SmmCpuFeaturesNeedConfigureMtrrs ()) {
     //
     // Wait for the signal from BSP to backup MTRRs
     //
-    WaitForBsp (mSmmMpSyncData->CpuData[CpuIndex].Run);
+    SmmCpuSyncWaitForBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
 
     //
     // Backup OS MTRRs
     //
     MtrrGetAllMtrrs (&Mtrrs);
 
     //
     // Signal BSP the completion of this AP
     //
-    ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
+    SmmCpuSyncReleaseBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
 
     //
     // Wait for BSP's signal to program MTRRs
     //
-    WaitForBsp (mSmmMpSyncData->CpuData[CpuIndex].Run);
+    SmmCpuSyncWaitForBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
 
     //
     // Replace OS MTRRs with SMI MTRRs
     //
     ReplaceOSMtrrs (CpuIndex);
 
     //
     // Signal BSP the completion of this AP
     //
-    ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
+    SmmCpuSyncReleaseBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
   }
 
   while (TRUE) {
     //
     // Wait for something to happen
     //
-    WaitForBsp (mSmmMpSyncData->CpuData[CpuIndex].Run);
+    SmmCpuSyncWaitForBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
 
     //
     // Check if BSP wants to exit SMM
     //
     if (!(*mSmmMpSyncData->InsideSmm)) {
@@ -1021,16 +919,16 @@ APHandler (
 
   if (SmmCpuFeaturesNeedConfigureMtrrs ()) {
     //
     // Notify BSP the readiness of this AP to program MTRRs
     //
-    ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
+    SmmCpuSyncReleaseBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
 
     //
     // Wait for the signal from BSP to program MTRRs
     //
-    WaitForBsp (mSmmMpSyncData->CpuData[CpuIndex].Run);
+    SmmCpuSyncWaitForBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
 
     //
     // Restore OS MTRRs
     //
     SmmCpuFeaturesReenableSmrr ();
@@ -1038,26 +936,26 @@ APHandler (
   }
 
   //
   // Notify BSP the readiness of this AP to Reset states/semaphore for this processor
   //
-  ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
+  SmmCpuSyncReleaseBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
 
   //
   // Wait for the signal from BSP to Reset states/semaphore for this processor
   //
-  WaitForBsp (mSmmMpSyncData->CpuData[CpuIndex].Run);
+  SmmCpuSyncWaitForBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
 
   //
   // Reset states/semaphore for this processor
   //
   *(mSmmMpSyncData->CpuData[CpuIndex].Present) = FALSE;
 
   //
   // Notify BSP the readiness of this AP to exit SMM
   //
-  ReleaseBsp (mSmmMpSyncData->CpuData[BspIndex].Run);
+  SmmCpuSyncReleaseBsp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, BspIndex);
 }
 
 /**
   Checks whether the input token is the current used token.
 
@@ -1321,11 +1219,11 @@ InternalSmmStartupThisAp (
   mSmmMpSyncData->CpuData[CpuIndex].Status = CpuStatus;
   if (mSmmMpSyncData->CpuData[CpuIndex].Status != NULL) {
     *mSmmMpSyncData->CpuData[CpuIndex].Status = EFI_NOT_READY;
   }
 
-  ReleaseOneAp (mSmmMpSyncData->CpuData[CpuIndex].Run);
+  SmmCpuSyncReleaseOneAp (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex, gSmmCpuPrivate->SmmCoreEntryContext.CurrentlyExecutingCpu);
 
   if (Token == NULL) {
     AcquireSpinLock (mSmmMpSyncData->CpuData[CpuIndex].Busy);
     ReleaseSpinLock (mSmmMpSyncData->CpuData[CpuIndex].Busy);
   }
@@ -1450,11 +1348,11 @@ InternalSmmStartupAllAPs (
 
       //
       // Decrease the count to mark this processor(AP or BSP) as finished.
       //
       if (ProcToken != NULL) {
-        WaitForSemaphore (&ProcToken->RunningApCount);
+        InterlockedDecrement (&ProcToken->RunningApCount);
       }
     }
   }
 
   ReleaseAllAPs ();
@@ -1725,14 +1623,15 @@ SmiRendezvous (
     //
     goto Exit;
   } else {
     //
     // Signal presence of this processor
-    // mSmmMpSyncData->Counter is increased here!
-    // "ReleaseSemaphore (mSmmMpSyncData->Counter) == 0" means BSP has already ended the synchronization.
+    // The CPU checks in here!
+    // "SmmCpuSyncCheckInCpu (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex)" returning an error
+    // means the CPU failed to check in: the BSP has already ended the synchronization.
     //
-    if (ReleaseSemaphore (mSmmMpSyncData->Counter) == 0) {
+    if (RETURN_ERROR (SmmCpuSyncCheckInCpu (mSmmMpSyncData->SmmCpuSyncCtx, CpuIndex))) {
       //
       // BSP has already ended the synchronization, so QUIT!!!
       // Existing AP is too late now to enter SMI since BSP has already ended the synchronization!!!
       //
 
@@ -1824,12 +1723,10 @@ SmiRendezvous (
       } else {
         APHandler (CpuIndex, ValidSmi, mSmmMpSyncData->EffectiveSyncMode);
       }
     }
 
-    ASSERT (*mSmmMpSyncData->CpuData[CpuIndex].Run == 0);
-
     //
     // Wait for BSP's signal to exit SMI
     //
     while (*mSmmMpSyncData->AllCpusInSync) {
       CpuPause ();
@@ -1945,12 +1842,10 @@ InitializeSmmCpuSemaphores (
   SemaphoreBlock = AllocatePages (Pages);
   ASSERT (SemaphoreBlock != NULL);
   ZeroMem (SemaphoreBlock, TotalSize);
 
   SemaphoreAddr                                   = (UINTN)SemaphoreBlock;
-  mSmmCpuSemaphores.SemaphoreGlobal.Counter       = (UINT32 *)SemaphoreAddr;
-  SemaphoreAddr                                  += SemaphoreSize;
   mSmmCpuSemaphores.SemaphoreGlobal.InsideSmm     = (BOOLEAN *)SemaphoreAddr;
   SemaphoreAddr                                  += SemaphoreSize;
   mSmmCpuSemaphores.SemaphoreGlobal.AllCpusInSync = (BOOLEAN *)SemaphoreAddr;
   SemaphoreAddr                                  += SemaphoreSize;
   mSmmCpuSemaphores.SemaphoreGlobal.PFLock        = (SPIN_LOCK *)SemaphoreAddr;
@@ -1960,12 +1855,10 @@ InitializeSmmCpuSemaphores (
   SemaphoreAddr += SemaphoreSize;
 
   SemaphoreAddr                          = (UINTN)SemaphoreBlock + GlobalSemaphoresSize;
   mSmmCpuSemaphores.SemaphoreCpu.Busy    = (SPIN_LOCK *)SemaphoreAddr;
   SemaphoreAddr                         += ProcessorCount * SemaphoreSize;
-  mSmmCpuSemaphores.SemaphoreCpu.Run     = (UINT32 *)SemaphoreAddr;
-  SemaphoreAddr                         += ProcessorCount * SemaphoreSize;
   mSmmCpuSemaphores.SemaphoreCpu.Present = (BOOLEAN *)SemaphoreAddr;
 
   mPFLock                       = mSmmCpuSemaphores.SemaphoreGlobal.PFLock;
   mConfigSmmCodeAccessCheckLock = mSmmCpuSemaphores.SemaphoreGlobal.CodeAccessCheckLock;
 
@@ -1980,10 +1873,12 @@ VOID
 EFIAPI
 InitializeMpSyncData (
   VOID
   )
 {
+  RETURN_STATUS  Status;
+
   UINTN  CpuIndex;
 
   if (mSmmMpSyncData != NULL) {
     //
     // mSmmMpSyncDataSize includes one structure of SMM_DISPATCHER_MP_SYNC_DATA, one
@@ -2009,32 +1904,34 @@ InitializeMpSyncData (
       }
     }
 
     mSmmMpSyncData->EffectiveSyncMode = mCpuSmmSyncMode;
 
-    mSmmMpSyncData->Counter       = mSmmCpuSemaphores.SemaphoreGlobal.Counter;
+    Status = SmmCpuSyncContextInit (gSmmCpuPrivate->SmmCoreEntryContext.NumberOfCpus, &(mSmmMpSyncData->SmmCpuSyncCtx));
+    if (EFI_ERROR (Status)) {
+      DEBUG ((DEBUG_ERROR, "InitializeMpSyncData: SmmCpuSyncContextInit return error %r!\n", Status));
+      CpuDeadLoop ();
+      return;
+    }
+
     mSmmMpSyncData->InsideSmm     = mSmmCpuSemaphores.SemaphoreGlobal.InsideSmm;
     mSmmMpSyncData->AllCpusInSync = mSmmCpuSemaphores.SemaphoreGlobal.AllCpusInSync;
     ASSERT (
-      mSmmMpSyncData->Counter != NULL && mSmmMpSyncData->InsideSmm != NULL &&
+      mSmmMpSyncData->SmmCpuSyncCtx != NULL && mSmmMpSyncData->InsideSmm != NULL &&
       mSmmMpSyncData->AllCpusInSync != NULL
       );
-    *mSmmMpSyncData->Counter       = 0;
     *mSmmMpSyncData->InsideSmm     = FALSE;
     *mSmmMpSyncData->AllCpusInSync = FALSE;
 
     mSmmMpSyncData->AllApArrivedWithException = FALSE;
 
     for (CpuIndex = 0; CpuIndex < gSmmCpuPrivate->SmmCoreEntryContext.NumberOfCpus; CpuIndex++) {
       mSmmMpSyncData->CpuData[CpuIndex].Busy =
         (SPIN_LOCK *)((UINTN)mSmmCpuSemaphores.SemaphoreCpu.Busy + mSemaphoreSize * CpuIndex);
-      mSmmMpSyncData->CpuData[CpuIndex].Run =
-        (UINT32 *)((UINTN)mSmmCpuSemaphores.SemaphoreCpu.Run + mSemaphoreSize * CpuIndex);
       mSmmMpSyncData->CpuData[CpuIndex].Present =
         (BOOLEAN *)((UINTN)mSmmCpuSemaphores.SemaphoreCpu.Present + mSemaphoreSize * CpuIndex);
       *(mSmmMpSyncData->CpuData[CpuIndex].Busy)    = 0;
-      *(mSmmMpSyncData->CpuData[CpuIndex].Run)     = 0;
       *(mSmmMpSyncData->CpuData[CpuIndex].Present) = FALSE;
     }
   }
 }
 
diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h
index 20ada465c2..5b18ddde66 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h
@@ -52,10 +52,11 @@ SPDX-License-Identifier: BSD-2-Clause-Patent
 #include <Library/PeCoffGetEntryPointLib.h>
 #include <Library/RegisterCpuFeaturesLib.h>
 #include <Library/PerformanceLib.h>
 #include <Library/CpuPageTableLib.h>
 #include <Library/MmSaveStateLib.h>
+#include <Library/SmmCpuSyncLib.h>
 
 #include <AcpiCpuData.h>
 #include <CpuHotPlugData.h>
 
 #include <Register/Intel/Cpuid.h>
@@ -403,11 +404,10 @@ SmmRelocationSemaphoreComplete (
 ///
 typedef struct {
   SPIN_LOCK                     *Busy;
   volatile EFI_AP_PROCEDURE2    Procedure;
   volatile VOID                 *Parameter;
-  volatile UINT32               *Run;
   volatile BOOLEAN              *Present;
   PROCEDURE_TOKEN               *Token;
   EFI_STATUS                    *Status;
 } SMM_CPU_DATA_BLOCK;
 
@@ -421,29 +421,28 @@ typedef struct {
   //
   // Pointer to an array. The array should be located immediately after this structure
   // so that UC cache-ability can be set together.
   //
   SMM_CPU_DATA_BLOCK            *CpuData;
-  volatile UINT32               *Counter;
   volatile UINT32               BspIndex;
   volatile BOOLEAN              *InsideSmm;
   volatile BOOLEAN              *AllCpusInSync;
   volatile SMM_CPU_SYNC_MODE    EffectiveSyncMode;
   volatile BOOLEAN              SwitchBsp;
   volatile BOOLEAN              *CandidateBsp;
   volatile BOOLEAN              AllApArrivedWithException;
   EFI_AP_PROCEDURE              StartupProcedure;
   VOID                          *StartupProcArgs;
+  SMM_CPU_SYNC_CONTEXT          *SmmCpuSyncCtx;
 } SMM_DISPATCHER_MP_SYNC_DATA;
 
 #define SMM_PSD_OFFSET  0xfb00
 
 ///
 /// All global semaphores' pointer
 ///
 typedef struct {
-  volatile UINT32     *Counter;
   volatile BOOLEAN    *InsideSmm;
   volatile BOOLEAN    *AllCpusInSync;
   SPIN_LOCK           *PFLock;
   SPIN_LOCK           *CodeAccessCheckLock;
 } SMM_CPU_SEMAPHORE_GLOBAL;
@@ -451,11 +450,10 @@ typedef struct {
 ///
 /// All semaphores for each processor
 ///
 typedef struct {
   SPIN_LOCK           *Busy;
-  volatile UINT32     *Run;
   volatile BOOLEAN    *Present;
   SPIN_LOCK           *Token;
 } SMM_CPU_SEMAPHORE_CPU;
 
 ///
diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf
index 5d52ed7d13..e92b8c747d 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf
@@ -101,10 +101,11 @@
   SmmCpuFeaturesLib
   PeCoffGetEntryPointLib
   PerformanceLib
   CpuPageTableLib
   MmSaveStateLib
+  SmmCpuSyncLib
 
 [Protocols]
   gEfiSmmAccess2ProtocolGuid               ## CONSUMES
   gEfiMpServiceProtocolGuid                ## CONSUMES
   gEfiSmmConfigurationProtocolGuid         ## PRODUCES
-- 
2.16.2.windows.1
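
[Editor's note] The check-in/lock-door arithmetic that replaces the raw
Counter semaphore in the patch above can be sketched as a single-threaded
model. The names below (ModelCheckInCpu, ModelLockDoor) are illustrative
only and are not the edk2 SmmCpuSyncLib API; the real implementation uses
interlocked operations on a shared counter.

```c
#include <assert.h>
#include <stdint.h>

/* Sentinel value the BSP stores to "lock the door" for late comers. */
#define DOOR_LOCKED  ((uint32_t)-1)

/*
  A CPU checks in on SMI entry: it increments the arrival counter,
  unless the BSP has already locked the door, in which case it fails.
  The real code performs the increment atomically.
*/
static int
ModelCheckInCpu (volatile uint32_t *Counter)
{
  if (*Counter == DOOR_LOCKED) {
    return -1;            /* late comer: BSP already ended the sync */
  }

  (*Counter)++;
  return 0;
}

/*
  The BSP locks the door and retrieves how many CPUs arrived before
  the lock. The real code uses InterlockedCompareExchange32 for this.
*/
static uint32_t
ModelLockDoor (volatile uint32_t *Counter)
{
  uint32_t  Arrived;

  Arrived  = *Counter;
  *Counter = DOOR_LOCKED;
  return Arrived;
}
```

With two CPUs checked in, locking the door returns 2 (so ApCount becomes
CpuCount - 1 = 1 in BSPHandler), and any later check-in attempt fails.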



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112115): https://edk2.groups.io/g/devel/message/112115
Mute This Topic: https://groups.io/mt/103010168/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class Wu, Jiaxin
@ 2023-12-07  9:07   ` Ni, Ray
  2023-12-12 20:18   ` Laszlo Ersek
  1 sibling, 0 replies; 22+ messages in thread
From: Ni, Ray @ 2023-12-07  9:07 UTC (permalink / raw)
  To: Wu, Jiaxin, devel@edk2.groups.io
  Cc: Laszlo Ersek, Dong, Eric, Zeng, Star, Gerd Hoffmann,
	Kumar, Rahul R



Thanks,
Ray
> -----Original Message-----
> From: Wu, Jiaxin <jiaxin.wu@intel.com>
> Sent: Wednesday, December 6, 2023 6:01 PM
> To: devel@edk2.groups.io
> Cc: Laszlo Ersek <lersek@redhat.com>; Dong, Eric <eric.dong@intel.com>; Ni,
> Ray <ray.ni@intel.com>; Zeng, Star <star.zeng@intel.com>; Gerd Hoffmann
> <kraxel@redhat.com>; Kumar, Rahul R <rahul.r.kumar@intel.com>
> Subject: [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class
> 
> Intel is planning to provide different SMM CPU Sync implementation
> along with some specific registers to improve the SMI performance,
> hence need SmmCpuSyncLib Library for Intel.
> 
> This patch is to:
> 1.Adds SmmCpuSyncLib Library class in UefiCpuPkg.dec.
> 2.Adds SmmCpuSyncLib.h function declaration header file.
> 
> For the new SmmCpuSyncLib, it provides 3 sets of APIs:
> 
> 1. ContextInit/ContextDeinit/ContextReset:

1. add extra empty line.

> ContextInit() is called in driver's entrypoint to allocate and

2. add extra spaces to show the paragraph is body of each bullet item.

In general, please make necessary changes to make the comments easy to read.
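
[Editor's note] One possible layout applying both of Ray's suggestions
(a blank line after each bullet heading, and an indented body under it),
using the header's existing text; the exact wording remains the author's:

```
  1. ContextInit/ContextDeinit/ContextReset:

     ContextInit() is called in the driver's entry point to allocate and
     initialize the SMM CPU Sync context. ContextDeinit() is called in
     the driver's unload function to deinitialize the SMM CPU Sync
     context. ContextReset() is called before the CPU exits SMI, which
     allows the CPU to check in to the next SMI from this point.
```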


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112172): https://edk2.groups.io/g/devel/message/112172
Mute This Topic: https://groups.io/mt/103010164/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 1/6] UefiCpuPkg/PiSmmCpuDxeSmm: Optimize Semaphore Sync between BSP and AP
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 1/6] UefiCpuPkg/PiSmmCpuDxeSmm: Optimize Semaphore Sync between BSP and AP Wu, Jiaxin
@ 2023-12-12 19:27   ` Laszlo Ersek
  0 siblings, 0 replies; 22+ messages in thread
From: Laszlo Ersek @ 2023-12-12 19:27 UTC (permalink / raw)
  To: devel, jiaxin.wu; +Cc: Eric Dong, Ray Ni, Zeng Star, Rahul Kumar, Gerd Hoffmann

On 12/6/23 11:01, Wu, Jiaxin wrote:
> This patch is to define 3 new functions (WaitForBsp & ReleaseBsp &
> ReleaseOneAp) used for the semaphore sync between BSP & AP. With the
> change, BSP and AP Sync flow will be easy understand as below:
> BSP: ReleaseAllAPs or ReleaseOneAp --> AP: WaitForBsp
> BSP: WaitForAllAPs                 <-- AP: ReleaseBsp
> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Ray Ni <ray.ni@intel.com>
> Cc: Zeng Star <star.zeng@intel.com>
> Cc: Rahul Kumar <rahul1.kumar@intel.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
> ---
>  UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c | 72 ++++++++++++++++++++++++++++-------
>  1 file changed, 58 insertions(+), 14 deletions(-)

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
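
[Editor's note] The Run-semaphore pairing described in the commit message
(Release* increments the target's counter, Wait* spins until it is positive
and then decrements it) can be modeled single-threaded as below. This is a
sketch of the arithmetic only; the real code uses InterlockedIncrement /
InterlockedCompareExchange32 and spins with CpuPause().

```c
#include <assert.h>
#include <stdint.h>

/* ReleaseBsp()/ReleaseOneAp()/ReleaseAllAPs() boil down to this: post one
   signal on the peer's Run semaphore. */
static void
ModelRelease (volatile uint32_t *Sem)
{
  (*Sem)++;
}

/* WaitForBsp()/WaitForAllAPs() boil down to this: consume one pending
   signal if available. Returns nonzero when a signal was consumed;
   the real code would keep spinning instead of returning 0. */
static int
ModelTryWait (volatile uint32_t *Sem)
{
  if (*Sem == 0) {
    return 0;             /* nothing posted yet: would spin here */
  }

  (*Sem)--;
  return 1;
}
```

Two releases allow exactly two successful waits, which is why the BSP's
WaitForAllAPs(ApCount) completes only after every AP has done ReleaseBsp().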




-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112454): https://edk2.groups.io/g/devel/message/112454
Mute This Topic: https://groups.io/mt/103010163/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class Wu, Jiaxin
  2023-12-07  9:07   ` Ni, Ray
@ 2023-12-12 20:18   ` Laszlo Ersek
  2023-12-13  4:23     ` Wu, Jiaxin
  1 sibling, 1 reply; 22+ messages in thread
From: Laszlo Ersek @ 2023-12-12 20:18 UTC (permalink / raw)
  To: devel, jiaxin.wu; +Cc: Eric Dong, Ray Ni, Zeng Star, Gerd Hoffmann, Rahul Kumar

On 12/6/23 11:01, Wu, Jiaxin wrote:
> Intel is planning to provide different SMM CPU Sync implementation
> along with some specific registers to improve the SMI performance,
> hence need SmmCpuSyncLib Library for Intel.
> 
> This patch is to:
> 1.Adds SmmCpuSyncLib Library class in UefiCpuPkg.dec.
> 2.Adds SmmCpuSyncLib.h function declaration header file.
> 
> For the new SmmCpuSyncLib, it provides 3 sets of APIs:
> 
> 1. ContextInit/ContextDeinit/ContextReset:
> ContextInit() is called in driver's entrypoint to allocate and
> initialize the SMM CPU Sync context. ContextDeinit() is called in
> driver's unload function to deinitialize SMM CPU Sync context.
> ContextReset() is called before CPU exist SMI, which allows CPU to
> check into the next SMI from this point.
> 
> 2. GetArrivedCpuCount/CheckInCpu/CheckOutCpu/LockDoor:
> When SMI happens, all processors including BSP enter to SMM mode by
> calling CheckInCpu(). The elected BSP calls LockDoor() so that
> CheckInCpu() will return the error code after that. CheckOutCpu() can
> be called in error handling flow for the CPU who calls CheckInCpu()
> earlier. GetArrivedCpuCount() returns the number of checked-in CPUs.
> 
> 3. WaitForAPs/ReleaseOneAp/WaitForBsp/ReleaseBsp
> WaitForAPs() & ReleaseOneAp() are called from BSP to wait the number
> of APs and release one specific AP. WaitForBsp() & ReleaseBsp() are
> called from APs to wait and release BSP. The 4 APIs are used to
> synchronize the running flow among BSP and APs. BSP and AP Sync flow
> can be easy understand as below:
> BSP: ReleaseOneAp  -->  AP: WaitForBsp
> BSP: WaitForAPs    <--  AP: ReleaseBsp
> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Ray Ni <ray.ni@intel.com>
> Cc: Zeng Star <star.zeng@intel.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Rahul Kumar <rahul1.kumar@intel.com>
> Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
> ---
>  UefiCpuPkg/Include/Library/SmmCpuSyncLib.h | 275 +++++++++++++++++++++++++++++
>  UefiCpuPkg/UefiCpuPkg.dec                  |   3 +
>  2 files changed, 278 insertions(+)
>  create mode 100644 UefiCpuPkg/Include/Library/SmmCpuSyncLib.h
> 
> diff --git a/UefiCpuPkg/Include/Library/SmmCpuSyncLib.h b/UefiCpuPkg/Include/Library/SmmCpuSyncLib.h
> new file mode 100644
> index 0000000000..0f9eb3414a
> --- /dev/null
> +++ b/UefiCpuPkg/Include/Library/SmmCpuSyncLib.h
> @@ -0,0 +1,275 @@
> +/** @file
> +  Library that provides SMM CPU Sync related operations.
> +  The lib provides 3 sets of APIs:
> +  1. ContextInit/ContextDeinit/ContextReset:
> +  ContextInit() is called in driver's entrypoint to allocate and initialize the SMM CPU Sync context.
> +  ContextDeinit() is called in driver's unload function to deinitialize the SMM CPU Sync context.
> +  ContextReset() is called before CPU exist SMI, which allows CPU to check into the next SMI from this point.
> +
> +  2. GetArrivedCpuCount/CheckInCpu/CheckOutCpu/LockDoor:
> +  When SMI happens, all processors including BSP enter to SMM mode by calling CheckInCpu().
> +  The elected BSP calls LockDoor() so that CheckInCpu() will return the error code after that.
> +  CheckOutCpu() can be called in error handling flow for the CPU who calls CheckInCpu() earlier.
> +  GetArrivedCpuCount() returns the number of checked-in CPUs.
> +
> +  3. WaitForAPs/ReleaseOneAp/WaitForBsp/ReleaseBsp
> +  WaitForAPs() & ReleaseOneAp() are called from BSP to wait the number of APs and release one specific AP.
> +  WaitForBsp() & ReleaseBsp() are called from APs to wait and release BSP.
> +  The 4 APIs are used to synchronize the running flow among BSP and APs. BSP and AP Sync flow can be
> +  easy understand as below:
> +  BSP: ReleaseOneAp  -->  AP: WaitForBsp
> +  BSP: WaitForAPs    <--  AP: ReleaseBsp
> +
> +  Copyright (c) 2023, Intel Corporation. All rights reserved.<BR>
> +  SPDX-License-Identifier: BSD-2-Clause-Patent
> +
> +**/

Thanks. This documentation (in the commit message and the lib class
header file) seems really good (especially with the formatting updates
suggested by Ray).

(1) I think there is one typo: exist <-> exits.

> +
> +#ifndef SMM_CPU_SYNC_LIB_H_
> +#define SMM_CPU_SYNC_LIB_H_
> +
> +#include <Uefi/UefiBaseType.h>
> +
> +//
> +// Opaque structure for SMM CPU Sync context.
> +//
> +typedef struct SMM_CPU_SYNC_CONTEXT SMM_CPU_SYNC_CONTEXT;
> +
> +/**
> +  Create and initialize the SMM CPU Sync context.
> +
> +  SmmCpuSyncContextInit() function is to allocate and initialize the SMM CPU Sync context.
> +
> +  @param[in]  NumberOfCpus          The number of Logical Processors in the system.
> +  @param[out] SmmCpuSyncCtx         Pointer to the new created and initialized SMM CPU Sync context object.
> +                                    NULL will be returned if any error happen during init.
> +
> +  @retval RETURN_SUCCESS            The SMM CPU Sync context was successful created and initialized.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> +  @retval RETURN_BUFFER_TOO_SMALL   Overflow happen
> +  @retval RETURN_OUT_OF_RESOURCES   There are not enough resources available to create and initialize SMM CPU Sync context.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncContextInit (
> +  IN   UINTN                 NumberOfCpus,
> +  OUT  SMM_CPU_SYNC_CONTEXT  **SmmCpuSyncCtx
> +  );
> +
> +/**
> +  Deinit an allocated SMM CPU Sync context.
> +
> +  SmmCpuSyncContextDeinit() function is to deinitialize SMM CPU Sync context, the resources allocated in
> +  SmmCpuSyncContextInit() will be freed.
> +
> +  Note: This function only can be called after SmmCpuSyncContextInit() return success.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object to be deinitialized.
> +
> +  @retval RETURN_SUCCESS            The SMM CPU Sync context was successful deinitialized.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> +  @retval RETURN_UNSUPPORTED        Unsupported operation.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncContextDeinit (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
> +  );
> +
> +/**
> +  Reset SMM CPU Sync context.
> +
> +  SmmCpuSyncContextReset() function is to reset SMM CPU Sync context to the initialized state.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object to be reset.
> +
> +  @retval RETURN_SUCCESS            The SMM CPU Sync context was successful reset.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncContextReset (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
> +  );

(2) The file-top documentation says about this API: "ContextReset() is
called before CPU exist SMI, which allows CPU to check into the next SMI
from this point".

It is not clear *which* CPU is supposed to call ContextReset (and the
function does not take a CPU index). Can you explain this in the
documentation? (Assuming my observation makes sense.)

> +
> +/**
> +  Get current number of arrived CPU in SMI.
> +
> +  For traditional CPU synchronization method, BSP might need to know the current number of arrived CPU in
> +  SMI to make sure all APs in SMI. This API can be for that purpose.
> +
> +  @param[in]      SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in,out]  CpuCount          Current count of arrived CPU in SMI.
> +
> +  @retval RETURN_SUCCESS            Get current number of arrived CPU in SMI successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is NULL.
> +  @retval RETURN_UNSUPPORTED        Unsupported operation.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncGetArrivedCpuCount (
> +  IN     SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN OUT UINTN                 *CpuCount
> +  );

(3) Why is CpuCount IN OUT? I would think just OUT should suffice.


> +
> +/**
> +  Performs an atomic operation to check in CPU.
> +
> +  When SMI happens, all processors including BSP enter to SMM mode by calling SmmCpuSyncCheckInCpu().
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex          Check in CPU index.
> +
> +  @retval RETURN_SUCCESS            Check in CPU (CpuIndex) successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> +  @retval RETURN_ABORTED            Check in CPU failed due to SmmCpuSyncLockDoor() has been called by one elected CPU.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncCheckInCpu (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex
> +  );

(4) Do we need an error condition for CpuIndex being out of range?

(5) Do we have a special CpuIndex value for the BSP?

> +
> +/**
> +  Performs an atomic operation to check out CPU.
> +
> +  CheckOutCpu() can be called in error handling flow for the CPU who calls CheckInCpu() earlier.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex          Check out CPU index.
> +
> +  @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> +  @retval RETURN_NOT_READY          The CPU is not checked-in.
> +  @retval RETURN_UNSUPPORTED        Unsupported operation.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncCheckOutCpu (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex
> +  );
> +

(6) Looks good, my only question is again if we need a status code for
CpuIndex being out of range.

> +/**
> +  Performs an atomic operation lock door for CPU checkin or checkout.
> +
> +  After this function, CPU can not check in via SmmCpuSyncCheckInCpu().
> +
> +  The CPU specified by CpuIndex is elected to lock door.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex          Indicate which CPU to lock door.
> +  @param[in,out]  CpuCount          Number of arrived CPU in SMI after look door.
> +
> +  @retval RETURN_SUCCESS            Lock door for CPU successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is NULL.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncLockDoor (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex,
> +  IN OUT UINTN                 *CpuCount
> +  );

This is where it's getting tricky :)

(7) error condition for CpuIndex being out of range?

(8) why is CpuCount IN OUT and not just OUT? (Other than that, I can see
how outputting CpuCout at once can be useful.)

(9) do we need error conditions for:

(9.1) CpuIndex being "wrong" (i.e., not the CPU that's "supposed" to
lock the door)?

(9.2) CpuIndex not having checked in already, before trying to lock the
door?

Now, I can imagine that problem (9.1) is undetectable, i.e., it causes
undefined behavior. That's fine, but then we should mention that.

> +
> +/**
> +  Used by the BSP to wait for APs.
> +
> +  The number of APs need to be waited is specified by NumberOfAPs. The BSP is specified by BspIndex.
> +
> +  Note: This function is blocking mode, and it will return only after the number of APs released by
> +  calling SmmCpuSyncReleaseBsp():
> +  BSP: WaitForAPs    <--  AP: ReleaseBsp
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      NumberOfAPs       Number of APs need to be waited by BSP.
> +  @param[in]      BspIndex          The BSP Index to wait for APs.
> +
> +  @retval RETURN_SUCCESS            BSP to wait for APs successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or NumberOfAPs > total number of processors in system.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncWaitForAPs (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 NumberOfAPs,
> +  IN     UINTN                 BspIndex
> +  );

The "NumberOfAPs > total number of processors in system" check is nice!

(10) Again, do we need a similar error condition for BspIndex being out
of range?

(11) Do we need to document / enforce explicitly (status code) that the
BSP and the APs must have checked in, and/or the door must have been
locked? Again -- if we can't detect / enforce these conditions, that's
fine, but then we should mention the expected call environment. The
file-top description does not seem very explicit about it.

> +
> +/**
> +  Used by the BSP to release one AP.
> +
> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex          Indicate which AP need to be released.
> +  @param[in]      BspIndex          The BSP Index to release AP.
> +
> +  @retval RETURN_SUCCESS            BSP to release one AP successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncReleaseOneAp   (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex,
> +  IN     UINTN                 BspIndex
> +  );

(12) Same comments as elsewhere:

- it's good that we check CpuIndex versus BspIndex, but do we also need
to range-check each?

- document that both affected CPUs need to have checked in, with the
door potentially locked?


> +
> +/**
> +  Used by the AP to wait BSP.
> +
> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> +
> +  Note: This function is blocking mode, and it will return only after the AP released by
> +  calling SmmCpuSyncReleaseOneAp():
> +  BSP: ReleaseOneAp  -->  AP: WaitForBsp
> +
> +  @param[in,out]  SmmCpuSyncCtx    Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex         Indicate which AP wait BSP.
> +  @param[in]      BspIndex         The BSP Index to be waited.
> +
> +  @retval RETURN_SUCCESS            AP to wait BSP successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncWaitForBsp (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex,
> +  IN     UINTN                 BspIndex
> +  );
> +

(13) Same questions as under (12).

> +/**
> +  Used by the AP to release BSP.
> +
> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex          Indicate which AP release BSP.
> +  @param[in]      BspIndex          The BSP Index to be released.
> +
> +  @retval RETURN_SUCCESS            AP to release BSP successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncReleaseBsp (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex,
> +  IN     UINTN                 BspIndex
> +  );

(14) Same questions as under (12).

> +
> +#endif
> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec
> index 0b5431dbf7..20ab079219 100644
> --- a/UefiCpuPkg/UefiCpuPkg.dec
> +++ b/UefiCpuPkg/UefiCpuPkg.dec
> @@ -62,10 +62,13 @@
>    CpuPageTableLib|Include/Library/CpuPageTableLib.h
>  
>    ## @libraryclass   Provides functions for manipulating smram savestate registers.
>    MmSaveStateLib|Include/Library/MmSaveStateLib.h
>  
> +  ## @libraryclass   Provides functions for SMM CPU Sync Operation.
> +  SmmCpuSyncLib|Include/Library/SmmCpuSyncLib.h
> +
>  [LibraryClasses.RISCV64]
>    ##  @libraryclass  Provides functions to manage MMU features on RISCV64 CPUs.
>    ##
>    RiscVMmuLib|Include/Library/BaseRiscVMmuLib.h
>  

These interfaces look real nice, my comments/questions are all docs-related.

Thanks!
Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112455): https://edk2.groups.io/g/devel/message/112455
Mute This Topic: https://groups.io/mt/103010164/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class
  2023-12-12 20:18   ` Laszlo Ersek
@ 2023-12-13  4:23     ` Wu, Jiaxin
  2023-12-13 15:02       ` Laszlo Ersek
  0 siblings, 1 reply; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-13  4:23 UTC (permalink / raw)
  To: Laszlo Ersek, devel@edk2.groups.io
  Cc: Dong, Eric, Ni, Ray, Zeng, Star, Gerd Hoffmann, Kumar, Rahul R

> 
> Thanks. This documentation (in the commit message and the lib class
> header file) seems really good (especially with the formatting updates
> suggested by Ray).
> 
> (1) I think there is one typo: exist <-> exits.
> 

agree, I will fix this.

> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncContextReset (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
> > +  );
> 
> (2) The file-top documentation says about this API: "ContextReset() is
> called before CPU exist SMI, which allows CPU to check into the next SMI
> from this point".
> 
> It is not clear *which* CPU is supposed to call ContextReset (and the
> function does not take a CPU index). Can you explain this in the
> documentation? (Assuming my observation makes sense.)
> 

For the SMM CPU driver, it shall be the BSP that calls this function, since the BSP gathers all APs to exit SMM synchronously; it is the role that controls the overall SMI execution flow.

For the API itself, I don't restrict which CPU can do that; it depends on the consumer. So it's not mandatory, which is why I didn't mention it.

But you are right, there is a requirement here: the elected CPU calling this function needs to make sure all CPUs are ready to exit SMI. I can clearly document this requirement as below:

"This function is called by one of the CPUs after all CPUs are ready to exit SMI, which allows CPUs to check into the next SMI from this point."

Besides, one comment from Ray: we can ASSERT that SmmCpuSyncCtx is not NULL, and don't need a return status to handle such cases. If so, the RETURN_STATUS is not required.

So, based on all above, I will update the API as below:

/**
  Reset SMM CPU Sync context. SMM CPU Sync context will be reset to the initialized state.

  This function is called by one of the CPUs after all CPUs are ready to exit SMI, which allows CPUs to
  check into the next SMI from this point.

  If Context is NULL, then ASSERT().

  @param[in,out]  Context     Pointer to the SMM CPU Sync context object to be reset.

**/
VOID
EFIAPI
SmmCpuSyncContextReset (
  IN OUT SMM_CPU_SYNC_CONTEXT  *Context
  );


> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncGetArrivedCpuCount (
> > +  IN     SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN OUT UINTN                 *CpuCount
> > +  );
> 
> (3) Why is CpuCount IN OUT? I would think just OUT should suffice.
> 
> 

Agree. I will correct all similar cases. Besides, I also received these comments from Ray offline:
1. we can ASSERT that SmmCpuSyncCtx is not NULL; no need to return a status to handle that.
2. we don't need RETURN_UNSUPPORTED; GetArrivedCpuCount() should always be supported.
With the above comments, I will update the API as below to return the count directly, which also aligns with the function name (get arrived CPU count):

/**
  Get current number of arrived CPU in SMI.

  BSP might need to know the current number of arrived CPUs in SMI to make sure all APs
  are in SMI. This API can be used for that purpose.

  If Context is NULL, then ASSERT().

  @param[in]      Context     Pointer to the SMM CPU Sync context object.

  @retval    Current number of arrived CPU in SMI.

**/
UINTN
EFIAPI
SmmCpuSyncGetArrivedCpuCount (
  IN  SMM_CPU_SYNC_CONTEXT  *Context
  );

 
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncCheckInCpu (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex
> > +  );
> 
> (4) Do we need an error condition for CpuIndex being out of range?
> 

Good idea. We can handle this check with an ASSERT. Then I will update all similar cases by adding the below comment in the APIs:

"If CpuIndex exceeds the range of all CPUs in the system, then ASSERT()."

For example:
/**
  Performs an atomic operation to check in CPU.

  When SMI happens, all processors including BSP enter to SMM mode by calling SmmCpuSyncCheckInCpu().

  If Context is NULL, then ASSERT().
  If CpuIndex exceeds the range of all CPUs in the system, then ASSERT().

  @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
  @param[in]      CpuIndex          Check in CPU index.

  @retval RETURN_SUCCESS            Check in CPU (CpuIndex) successfully.
  @retval RETURN_ABORTED            Check in CPU failed due to SmmCpuSyncLockDoor() has been called by one elected CPU.

**/
RETURN_STATUS
EFIAPI
SmmCpuSyncCheckInCpu (
  IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
  IN     UINTN                 CpuIndex
  );


> (5) Do we have a special CpuIndex value for the BSP?
> 

No, the elected BSP is also one of the CPUs, with its own CpuIndex value.


> > +
> > +/**
> > +  Performs an atomic operation to check out CPU.
> > +
> > +  CheckOutCpu() can be called in error handling flow for the CPU who calls
> CheckInCpu() earlier.
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      CpuIndex          Check out CPU index.
> > +
> > +  @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> > +  @retval RETURN_NOT_READY          The CPU is not checked-in.
> > +  @retval RETURN_UNSUPPORTED        Unsupported operation.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncCheckOutCpu (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex
> > +  );
> > +
> 
> (6) Looks good, my only question is again if we need a status code for
> CpuIndex being out of range.
> 

Yes, agree. Ray's comment is that we don't need to handle RETURN_INVALID_PARAMETER; just ASSERT for better performance, so we can avoid the if-check in release builds. Just document the assert.

To better define the API behavior, I will remove RETURN_UNSUPPORTED and replace it with RETURN_ABORTED, which aligns with the check-in behavior if we don't want to support check-out after the door is locked. RETURN_ABORTED documents the behavior more clearly than RETURN_UNSUPPORTED.

So, the API would be as below:
/**
  Performs an atomic operation to check out CPU.

  This function can be called in error handling flow for the CPU who calls CheckInCpu() earlier.

  If Context is NULL, then ASSERT().
  If CpuIndex exceeds the range of all CPUs in the system, then ASSERT().

  @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
  @param[in]      CpuIndex          Check out CPU index.

  @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
  @retval RETURN_ABORTED            Check out CPU failed due to SmmCpuSyncLockDoor() has been called by one elected CPU.

**/
RETURN_STATUS
EFIAPI
SmmCpuSyncCheckOutCpu (
  IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
  IN     UINTN                 CpuIndex
  );


> > +/**
> > +  Performs an atomic operation lock door for CPU checkin or checkout.
> > +
> > +  After this function, CPU can not check in via SmmCpuSyncCheckInCpu().
> > +
> > +  The CPU specified by CpuIndex is elected to lock door.
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      CpuIndex          Indicate which CPU to lock door.
> > +  @param[in,out]  CpuCount          Number of arrived CPU in SMI after look
> door.
> > +
> > +  @retval RETURN_SUCCESS            Lock door for CPU successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is
> NULL.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncLockDoor (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex,
> > +  IN OUT UINTN                 *CpuCount
> > +  );
> 
> This is where it's getting tricky :)
> 
> (7) error condition for CpuIndex being out of range?

Yes, agree. Same as the cases above: it will be handled with an ASSERT and documented in the comments.

> 
> (8) why is CpuCount IN OUT and not just OUT? (Other than that, I can see
> how outputting CpuCout at once can be useful.)
> 

Well, I agree it should be OUT only.
CpuCount is to tell the number of arrived CPUs in SMI after the door is locked. The SMM CPU driver needs that number for later rendezvous & sync usage. So, it returns the CpuCount.


> (9) do we need error conditions for:
> 
> (9.1) CpuIndex being "wrong" (i.e., not the CPU that's "supposed" to
> lock the door)?
> 
> (9.2) CpuIndex not having checked in already, before trying to lock the
> door?
> 
> Now, I can imagine that problem (9.1) is undetectable, i.e., it causes
> undefined behavior. That's fine, but then we should mention that.
> 

Actually, CpuIndex might not be used by the lib instance. It is here for future extension, in case some lib instance needs to know which CPU locked the door; it's information provided through the API.
I don't add the error checks for those because I don't want to force the implementation to do the check.

 But I agree, we can document this undefined behavior. How about the below:
  "The CPU specified by CpuIndex is elected to lock the door. The caller shall make sure the CpuIndex is the actual CPU calling this function to avoid undefined behavior."

 With above, I will update the API like below:

/**
  Performs an atomic operation to lock the door for CPU check-in and check-out. After this function:
  CPU can not check in via SmmCpuSyncCheckInCpu().
  CPU can not check out via SmmCpuSyncCheckOutCpu().

  The CPU specified by CpuIndex is elected to lock the door. The caller shall make sure the CpuIndex
  is the actual CPU calling this function to avoid undefined behavior.

  If Context is NULL, then ASSERT().
  If CpuCount is NULL, then ASSERT().

  @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
  @param[in]      CpuIndex          Indicate which CPU to lock door.
  @param[out]     CpuCount          Number of arrived CPUs in SMI after the door is locked.

**/
VOID
EFIAPI
SmmCpuSyncLockDoor (
  IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
  IN     UINTN                 CpuIndex,
  OUT    UINTN                 *CpuCount
  );



> > +
> > +/**
> > +  Used by the BSP to wait for APs.
> > +
> > +  The number of APs need to be waited is specified by NumberOfAPs. The
> BSP is specified by BspIndex.
> > +
> > +  Note: This function is blocking mode, and it will return only after the
> number of APs released by
> > +  calling SmmCpuSyncReleaseBsp():
> > +  BSP: WaitForAPs    <--  AP: ReleaseBsp
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      NumberOfAPs       Number of APs need to be waited by
> BSP.
> > +  @param[in]      BspIndex          The BSP Index to wait for APs.
> > +
> > +  @retval RETURN_SUCCESS            BSP to wait for APs successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
> NumberOfAPs > total number of processors in system.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncWaitForAPs (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 NumberOfAPs,
> > +  IN     UINTN                 BspIndex
> > +  );
> 
> The "NumberOfAPs > total number of processors in system" check is nice!
> 
> (10) Again, do we need a similar error condition for BspIndex being out
> of range?
> 

Agree, I will handle the case the same way as above, with an ASSERT. If so, no need to return a status.


> (11) Do we need to document / enforce explicitly (status code) that the
> BSP and the APs must have checked in, and/or the door must have been
> locked? Again -- if we can't detect / enforce these conditions, that's
> fine, but then we should mention the expected call environment. The
> file-top description does not seem very explicit about it.
> 

Agree: if BspIndex is the actual CPU calling this function, it must have checked in before. So, how about adding the comment as below:
  "The caller shall make sure the BspIndex is the actual CPU calling this function to avoid undefined behavior."

Based on above, I propose the API to be:

/**
  Used by the BSP to wait for APs.

  The number of APs that need to be waited on is specified by NumberOfAPs. The BSP is specified by BspIndex.
  The caller shall make sure the BspIndex is the actual CPU calling this function to avoid undefined behavior.
  The caller shall make sure that NumberOfAPs APs have checked in to avoid undefined behavior.

  If Context is NULL, then ASSERT().
  If NumberOfAPs > the total number of CPUs in the system, then ASSERT().
  If BspIndex exceeds the range of all CPUs in the system, then ASSERT().

  Note:
  This function is blocking, and it will return only after NumberOfAPs APs have been released by
  calling SmmCpuSyncReleaseBsp():
  BSP: WaitForAPs    <--  AP: ReleaseBsp

  @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
  @param[in]      NumberOfAPs       Number of APs need to be waited by BSP.
  @param[in]      BspIndex          The BSP Index to wait for APs.

**/
VOID
EFIAPI
SmmCpuSyncWaitForAPs (
  IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
  IN     UINTN                 NumberOfAPs,
  IN     UINTN                 BspIndex
  );

> > +
> > +/**
> > +  Used by the BSP to release one AP.
> > +
> > +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      CpuIndex          Indicate which AP need to be released.
> > +  @param[in]      BspIndex          The BSP Index to release AP.
> > +
> > +  @retval RETURN_SUCCESS            BSP to release one AP successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
> CpuIndex is same as BspIndex.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncReleaseOneAp   (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex,
> > +  IN     UINTN                 BspIndex
> > +  );
> 
> (12) Same comments as elsewhere:
> 
> - it's good that we check CpuIndex versus BspIndex, but do we also need
> to range-check each?
> 

Agree.

> - document that both affected CPUs need to have checked in, with the
> door potentially locked?
> 

Yes, for the SMM CPU driver, it shall be called after the door is locked. For the API itself, it's better not to restrict that; the only requirement I can see is that CpuIndex must have checked in. So, I will refine it as below:
/**
  Used by the BSP to release one AP.

  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
  The caller shall make sure the BspIndex is the actual CPU calling this function to avoid undefined behavior.
  The caller shall make sure the CpuIndex has checked in to avoid undefined behavior.

  If Context is NULL, then ASSERT().
  If CpuIndex == BspIndex, then ASSERT().
  If BspIndex or CpuIndex exceeds the range of all CPUs in the system, then ASSERT().

  @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
  @param[in]      CpuIndex          Indicate which AP need to be released.
  @param[in]      BspIndex          The BSP Index to release AP.

**/
VOID
EFIAPI
SmmCpuSyncReleaseOneAp   (
  IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
  IN     UINTN                 CpuIndex,
  IN     UINTN                 BspIndex
  );



> 
> > +
> > +/**
> > +  Used by the AP to wait BSP.
> > +
> > +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> > +
> > +  Note: This function is blocking mode, and it will return only after the AP
> released by
> > +  calling SmmCpuSyncReleaseOneAp():
> > +  BSP: ReleaseOneAp  -->  AP: WaitForBsp
> > +
> > +  @param[in,out]  SmmCpuSyncCtx    Pointer to the SMM CPU Sync context
> object.
> > +  @param[in]      CpuIndex         Indicate which AP wait BSP.
> > +  @param[in]      BspIndex         The BSP Index to be waited.
> > +
> > +  @retval RETURN_SUCCESS            AP to wait BSP successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
> CpuIndex is same as BspIndex.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncWaitForBsp (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex,
> > +  IN     UINTN                 BspIndex
> > +  );
> > +
> 
> (13) Same questions as under (12).
> 

See the proposed API below:

/**
  Used by the AP to wait for the BSP.

  The AP is specified by CpuIndex.
  The caller shall make sure the CpuIndex is the actual CPU calling this function to avoid undefined behavior.
  The BSP is specified by BspIndex.

  If Context is NULL, then ASSERT().
  If CpuIndex == BspIndex, then ASSERT().
  If BspIndex or CpuIndex exceeds the range of all CPUs in the system, then ASSERT().

  Note:
  This function is blocking, and it will return only after the AP is released by
  calling SmmCpuSyncReleaseOneAp():
  BSP: ReleaseOneAp  -->  AP: WaitForBsp

  @param[in,out]  Context          Pointer to the SMM CPU Sync context object.
  @param[in]      CpuIndex         Indicate which AP wait BSP.
  @param[in]      BspIndex         The BSP Index to be waited.

**/
VOID
EFIAPI
SmmCpuSyncWaitForBsp (
  IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
  IN     UINTN                 CpuIndex,
  IN     UINTN                 BspIndex
  );


> > +/**
> > +  Used by the AP to release BSP.
> > +
> > +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      CpuIndex          Indicate which AP release BSP.
> > +  @param[in]      BspIndex          The BSP Index to be released.
> > +
> > +  @retval RETURN_SUCCESS            AP to release BSP successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
> CpuIndex is same as BspIndex.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncReleaseBsp (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex,
> > +  IN     UINTN                 BspIndex
> > +  );
> 
> (14) Same questions as under (12).
> 

See the proposed API below:

/**
  Used by the AP to release the BSP.

  The AP is specified by CpuIndex.
  The caller shall make sure the CpuIndex is the actual CPU calling this function to avoid undefined behavior.
  The BSP is specified by BspIndex.

  If Context is NULL, then ASSERT().
  If CpuIndex == BspIndex, then ASSERT().
  If BspIndex or CpuIndex exceeds the range of all CPUs in the system, then ASSERT().

  @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
  @param[in]      CpuIndex          Indicate which AP release BSP.
  @param[in]      BspIndex          The BSP Index to be released.

**/
VOID
EFIAPI
SmmCpuSyncReleaseBsp (
  IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
  IN     UINTN                 CpuIndex,
  IN     UINTN                 BspIndex
  );


Thanks,
Jiaxin 


> > +
> > +#endif
> > diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec
> > index 0b5431dbf7..20ab079219 100644
> > --- a/UefiCpuPkg/UefiCpuPkg.dec
> > +++ b/UefiCpuPkg/UefiCpuPkg.dec
> > @@ -62,10 +62,13 @@
> >    CpuPageTableLib|Include/Library/CpuPageTableLib.h
> >
> >    ## @libraryclass   Provides functions for manipulating smram savestate
> registers.
> >    MmSaveStateLib|Include/Library/MmSaveStateLib.h
> >
> > +  ## @libraryclass   Provides functions for SMM CPU Sync Operation.
> > +  SmmCpuSyncLib|Include/Library/SmmCpuSyncLib.h
> > +
> >  [LibraryClasses.RISCV64]
> >    ##  @libraryclass  Provides functions to manage MMU features on RISCV64
> CPUs.
> >    ##
> >    RiscVMmuLib|Include/Library/BaseRiscVMmuLib.h
> >
> 
> These interfaces look real nice, my comments/questions are all docs-related.
> 
> Thanks!
> Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112465): https://edk2.groups.io/g/devel/message/112465
Mute This Topic: https://groups.io/mt/103010164/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance Wu, Jiaxin
@ 2023-12-13 14:34   ` Laszlo Ersek
  2023-12-14 11:11     ` Wu, Jiaxin
  0 siblings, 1 reply; 22+ messages in thread
From: Laszlo Ersek @ 2023-12-13 14:34 UTC (permalink / raw)
  To: devel, jiaxin.wu; +Cc: Eric Dong, Ray Ni, Zeng Star, Gerd Hoffmann, Rahul Kumar

On 12/6/23 11:01, Wu, Jiaxin wrote:
> Implements SmmCpuSyncLib Library instance. The instance refers to the
> existing SMM CPU driver (PiSmmCpuDxeSmm) sync implementation
> and behavior:
> 1. Abstract Counter and Run semaphores into SmmCpuSyncCtx.
> 2. Abstract CPU arrival count operation to
> SmmCpuSyncGetArrivedCpuCount(), SmmCpuSyncCheckInCpu(),
> SmmCpuSyncCheckOutCpu(), SmmCpuSyncLockDoor().
> Implementation is aligned with existing SMM CPU driver.
> 3. Abstract SMM CPU Sync flow to:
> BSP: SmmCpuSyncReleaseOneAp  -->  AP: SmmCpuSyncWaitForBsp
> BSP: SmmCpuSyncWaitForAPs    <--  AP: SmmCpuSyncReleaseBsp
> Semaphore release & wait during the sync flow is the same as in the
> existing SMM CPU driver.
> 4. Same operations on Counter and Run semaphores by leveraging the atomic
> compare exchange.
>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Ray Ni <ray.ni@intel.com>
> Cc: Zeng Star <star.zeng@intel.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Rahul Kumar <rahul1.kumar@intel.com>
> Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
> ---
>  UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c   | 647 +++++++++++++++++++++
>  UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf |  39 ++
>  UefiCpuPkg/UefiCpuPkg.dsc                          |   3 +
>  3 files changed, 689 insertions(+)
>  create mode 100644 UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
>  create mode 100644 UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
>
> diff --git a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
> new file mode 100644
> index 0000000000..3c2835f8de
> --- /dev/null
> +++ b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
> @@ -0,0 +1,647 @@
> +/** @file
> +  SMM CPU Sync lib implementation.
> +  The lib provides 3 sets of APIs:
> +  1. ContextInit/ContextDeinit/ContextReset:
> +  ContextInit() is called in the driver's entry point to allocate and initialize the SMM CPU Sync context.
> +  ContextDeinit() is called in the driver's unload function to deinitialize the SMM CPU Sync context.
> +  ContextReset() is called before the CPU exits SMI, which allows the CPU to check into the next SMI from this point.
> +
> +  2. GetArrivedCpuCount/CheckInCpu/CheckOutCpu/LockDoor:
> +  When SMI happens, all processors including the BSP enter SMM mode by calling CheckInCpu().
> +  The elected BSP calls LockDoor() so that CheckInCpu() will return an error code after that.
> +  CheckOutCpu() can be called in the error handling flow for a CPU that called CheckInCpu() earlier.
> +  GetArrivedCpuCount() returns the number of checked-in CPUs.
> +
> +  3. WaitForAPs/ReleaseOneAp/WaitForBsp/ReleaseBsp
> +  WaitForAPs() & ReleaseOneAp() are called from the BSP to wait for a number of APs and release one specific AP.
> +  WaitForBsp() & ReleaseBsp() are called from the APs to wait for and release the BSP.
> +  The 4 APIs are used to synchronize the running flow among the BSP and APs. The BSP and AP sync flow can be
> +  easily understood as below:
> +  BSP: ReleaseOneAp  -->  AP: WaitForBsp
> +  BSP: WaitForAPs    <--  AP: ReleaseBsp
> +
> +  Copyright (c) 2023, Intel Corporation. All rights reserved.<BR>
> +  SPDX-License-Identifier: BSD-2-Clause-Patent
> +
> +**/

(1) If / when you update the documentation in patch#2, please update
this one as well.

> +
> +#include <Base.h>
> +#include <Uefi.h>
> +#include <Library/UefiLib.h>
> +#include <Library/BaseLib.h>
> +#include <Library/DebugLib.h>
> +#include <Library/SafeIntLib.h>
> +#include <Library/SynchronizationLib.h>
> +#include <Library/DebugLib.h>
> +#include <Library/BaseMemoryLib.h>
> +#include <Library/SmmServicesTableLib.h>
> +#include <Library/MemoryAllocationLib.h>
> +#include <Library/SmmCpuSyncLib.h>

(2) Please sort the #include list alphabetically.

(The idea is that the [LibraryClasses] section in the INF file should be
sorted as well, and then we can easily verify whether those two lists
match each other -- modulo <Library/SmmCpuSyncLib.h>, of course.)

> +
> +typedef struct {
> +  ///
> +  /// Indicates how many CPUs entered SMM.
> +  ///
> +  volatile UINT32    *Counter;
> +} SMM_CPU_SYNC_SEMAPHORE_GLOBAL;
> +
> +typedef struct {
> +  ///
> +  /// Used to control whether each CPU continues to run or waits for a signal
> +  ///
> +  volatile UINT32    *Run;
> +} SMM_CPU_SYNC_SEMAPHORE_CPU;

(3) We can improve this, as follows:

  typedef volatile UINT32 SMM_CPU_SYNC_SEMAPHORE;

  typedef struct {
    SMM_CPU_SYNC_SEMAPHORE *Counter;
  } SMM_CPU_SYNC_SEMAPHORE_GLOBAL;

  typedef struct {
    SMM_CPU_SYNC_SEMAPHORE *Run;
  } SMM_CPU_SYNC_SEMAPHORE_CPU;

Because, while it *indeed* makes some sense to introduce these separate
wrapper structures, we should still ensure that the internals are
identical. This will come in handy later.

> +
> +struct SMM_CPU_SYNC_CONTEXT  {
> +  ///
> +  ///  All global semaphores' pointer in SMM CPU Sync
> +  ///
> +  SMM_CPU_SYNC_SEMAPHORE_GLOBAL    *GlobalSem;
> +  ///
> +  ///  All semaphores for each processor in SMM CPU Sync
> +  ///
> +  SMM_CPU_SYNC_SEMAPHORE_CPU       *CpuSem;
> +  ///
> +  /// The number of processors in the system.
> +  /// This does not indicate the number of processors that entered SMM.
> +  ///
> +  UINTN                            NumberOfCpus;
> +  ///
> +  /// Address of global and each CPU semaphores
> +  ///
> +  UINTN                            *SemBuffer;
> +  ///
> +  /// Size in bytes of global and each CPU semaphores
> +  ///
> +  UINTN                            SemBufferSize;
> +};

(4) This is too complicated, in my opinion.

(4.1) First of all, please add a *conspicuous* comment to the
SMM_CPU_SYNC_CONTEXT here, explaining that the whole idea is to place
the Counter and Run semaphores on different CPU cache lines, for good
performance. That's the *core* principle of this whole structure --
that's why we have an array of pointers to semaphores, rather than an
array of semaphores directly.

You didn't document that principle, and I had to spend a lot of time
deducing that fact from the SmmCpuSyncContextInit() function.

(4.2) The structure should go like this:

struct SMM_CPU_SYNC_CONTEXT  {
  UINTN                            NumberOfCpus;
  VOID                             *SemBuffer;
  UINTN                            SemBufferPages;
  SMM_CPU_SYNC_SEMAPHORE_GLOBAL    GlobalSem;
  SMM_CPU_SYNC_SEMAPHORE_CPU       CpuSem[];
};

Details:

- move NumberOfCpus to the top

- change the type of SemBuffer from (UINTN*) to (VOID*)

- replace SemBufferSize with SemBufferPages

- move GlobalSem and CpuSem to the end

- We need exactly one SMM_CPU_SYNC_SEMAPHORE_GLOBAL, therefore embed
GlobalSem directly as a field (it should not be a pointer)

- We can much simplify the code by turning CpuSem into a *flexible array
member* (this is a C99 feature that is already widely used in edk2).
This is why we move CpuSem to the end (and then we keep GlobalSem
nearby, for clarity).

I'll make more comments on this under SmmCpuSyncContextInit().

> +
> +/**
> +  Performs an atomic compare exchange operation to get semaphore.
> +  The compare exchange operation must be performed using MP safe
> +  mechanisms.
> +
> +  @param[in,out]  Sem    IN:  32-bit unsigned integer
> +                         OUT: original integer - 1
> +
> +  @retval     Original integer - 1
> +
> +**/
> +UINT32
> +InternalWaitForSemaphore (
> +  IN OUT  volatile UINT32  *Sem
> +  )
> +{
> +  UINT32  Value;
> +
> +  for ( ; ;) {
> +    Value = *Sem;
> +    if ((Value != 0) &&
> +        (InterlockedCompareExchange32 (
> +           (UINT32 *)Sem,
> +           Value,
> +           Value - 1
> +           ) == Value))
> +    {
> +      break;
> +    }
> +
> +    CpuPause ();
> +  }
> +
> +  return Value - 1;
> +}
> +
> +/**
> +  Performs an atomic compare exchange operation to release semaphore.
> +  The compare exchange operation must be performed using MP safe
> +  mechanisms.
> +
> +  @param[in,out]  Sem    IN:  32-bit unsigned integer
> +                         OUT: original integer + 1
> +
> +  @retval    Original integer + 1
> +
> +**/
> +UINT32
> +InternalReleaseSemaphore (
> +  IN OUT  volatile UINT32  *Sem
> +  )
> +{
> +  UINT32  Value;
> +
> +  do {
> +    Value = *Sem;
> +  } while (Value + 1 != 0 &&
> +           InterlockedCompareExchange32 (
> +             (UINT32 *)Sem,
> +             Value,
> +             Value + 1
> +             ) != Value);
> +
> +  return Value + 1;
> +}
> +
> +/**
> +  Performs an atomic compare exchange operation to lock semaphore.
> +  The compare exchange operation must be performed using MP safe
> +  mechanisms.
> +
> +  @param[in,out]  Sem    IN:  32-bit unsigned integer
> +                         OUT: -1
> +
> +  @retval    Original integer
> +
> +**/
> +UINT32
> +InternalLockdownSemaphore (
> +  IN OUT  volatile UINT32  *Sem
> +  )
> +{
> +  UINT32  Value;
> +
> +  do {
> +    Value = *Sem;
> +  } while (InterlockedCompareExchange32 (
> +             (UINT32 *)Sem,
> +             Value,
> +             (UINT32)-1
> +             ) != Value);
> +
> +  return Value;
> +}

(5) Please make these Internal functions STATIC.

Better yet, please make *all* functions that are not EFIAPI, STATIC.

> +
> +/**
> +  Create and initialize the SMM CPU Sync context.
> +
> +  SmmCpuSyncContextInit() function is to allocate and initialize the SMM CPU Sync context.
> +
> +  @param[in]  NumberOfCpus          The number of Logical Processors in the system.
> +  @param[out] SmmCpuSyncCtx         Pointer to the new created and initialized SMM CPU Sync context object.
> +                                    NULL will be returned if any error happens during init.
> +
> +  @retval RETURN_SUCCESS            The SMM CPU Sync context was successfully created and initialized.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> +  @retval RETURN_BUFFER_TOO_SMALL   Overflow happened.
> +  @retval RETURN_OUT_OF_RESOURCES   There are not enough resources available to create and initialize SMM CPU Sync context.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncContextInit (
> +  IN   UINTN                 NumberOfCpus,
> +  OUT  SMM_CPU_SYNC_CONTEXT  **SmmCpuSyncCtx
> +  )
> +{
> +  RETURN_STATUS  Status;
> +  UINTN          CpuSemInCtxSize;
> +  UINTN          CtxSize;
> +  UINTN          OneSemSize;
> +  UINTN          GlobalSemSize;
> +  UINTN          OneCpuSemSize;
> +  UINTN          CpuSemSize;
> +  UINTN          TotalSemSize;
> +  UINTN          SemAddr;
> +  UINTN          CpuIndex;
> +
> +  ASSERT (SmmCpuSyncCtx != NULL);

(6) This assert is unnecessary and wrong; we perform correct error
checking already.

> +  if (SmmCpuSyncCtx == NULL) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Count the CtxSize
> +  //
> +  Status = SafeUintnMult (NumberOfCpus, sizeof (SMM_CPU_SYNC_SEMAPHORE_CPU), &CpuSemInCtxSize);
> +  if (EFI_ERROR (Status)) {
> +    return Status;
> +  }
> +
> +  Status = SafeUintnAdd (sizeof (SMM_CPU_SYNC_CONTEXT), CpuSemInCtxSize, &CtxSize);
> +  if (EFI_ERROR (Status)) {
> +    return Status;
> +  }
> +
> +  Status = SafeUintnAdd (CtxSize, sizeof (SMM_CPU_SYNC_SEMAPHORE_GLOBAL), &CtxSize);
> +  if (EFI_ERROR (Status)) {
> +    return Status;
> +  }
> +
> +  //
> +  // Allocate CtxSize buffer for the *SmmCpuSyncCtx
> +  //
> +  *SmmCpuSyncCtx = NULL;
> +  *SmmCpuSyncCtx = (SMM_CPU_SYNC_CONTEXT *)AllocatePages (EFI_SIZE_TO_PAGES (CtxSize));
> +  ASSERT (*SmmCpuSyncCtx != NULL);
> +  if (*SmmCpuSyncCtx == NULL) {
> +    return RETURN_OUT_OF_RESOURCES;
> +  }

So, several comments on this section:

(7) the separate NULL assignment to (*SmmCpuSyncCtx) is superfluous; we
overwrite the object immediately after.

(8) the ASSERT() is superfluous and wrong; we already check for -- and
report -- allocation failure correctly.

(9) *page* allocation is useless / wasteful here; the main sync context
structure can be allocated from *pool*

(10) SafeIntLib APIs return RETURN_STATUS (and Status already has type
RETURN_STATUS), so we should use RETURN_ERROR() rather than EFI_ERROR()
-- it's more idiomatic

(11) Referring back to (4), SMM_CPU_SYNC_SEMAPHORE_GLOBAL should not be
counted (added) separately, because it is embedded in
SMM_CPU_SYNC_CONTEXT directly.

> +
> +  (*SmmCpuSyncCtx)->GlobalSem    = (SMM_CPU_SYNC_SEMAPHORE_GLOBAL *)((UINT8 *)(*SmmCpuSyncCtx) + sizeof (SMM_CPU_SYNC_CONTEXT));
> +  (*SmmCpuSyncCtx)->CpuSem       = (SMM_CPU_SYNC_SEMAPHORE_CPU *)((UINT8 *)(*SmmCpuSyncCtx) + sizeof (SMM_CPU_SYNC_CONTEXT) + sizeof (SMM_CPU_SYNC_SEMAPHORE_GLOBAL));

(12) And then these two assignments should be dropped.

> +  (*SmmCpuSyncCtx)->NumberOfCpus = NumberOfCpus;
> +
> +  //
> +  // Count the TotalSemSize
> +  //
> +  OneSemSize = GetSpinLockProperties ();
> +
> +  Status = SafeUintnMult (OneSemSize, sizeof (SMM_CPU_SYNC_SEMAPHORE_GLOBAL) / sizeof (VOID *), &GlobalSemSize);
> +  if (EFI_ERROR (Status)) {
> +    goto ON_ERROR;
> +  }
> +
> +  Status = SafeUintnMult (OneSemSize, sizeof (SMM_CPU_SYNC_SEMAPHORE_CPU) / sizeof (VOID *), &OneCpuSemSize);
> +  if (EFI_ERROR (Status)) {
> +    goto ON_ERROR;
> +  }

(13) I find this obscure and misleading. How about this one instead:

  UINTN  CacheLineSize;

  CacheLineSize = GetSpinLockProperties ();
  OneSemSize    = ALIGN_VALUE (sizeof (SMM_CPU_SYNC_SEMAPHORE), CacheLineSize);

and then eliminate GlobalSemSize and OneCpuSemSize altogether.

The above construct will ensure that

(a) OneSemSize is just large enough for placing semaphores on different
cache lines, and that

(b) OneSemSize is suitable for *both* SMM_CPU_SYNC_SEMAPHORE_GLOBAL and
SMM_CPU_SYNC_SEMAPHORE_CPU. This is where we rely on the common internal
type SMM_CPU_SYNC_SEMAPHORE.

> +
> +  Status = SafeUintnMult (NumberOfCpus, OneCpuSemSize, &CpuSemSize);
> +  if (EFI_ERROR (Status)) {
> +    goto ON_ERROR;
> +  }
> +
> +  Status = SafeUintnAdd (GlobalSemSize, CpuSemSize, &TotalSemSize);
> +  if (EFI_ERROR (Status)) {
> +    goto ON_ERROR;
> +  }

(14) This is probably better written as

  UINTN  NumSem;

  Status = SafeUintnAdd (1, NumberOfCpus, &NumSem);
  if (RETURN_ERROR (Status)) {
    goto ON_ERROR;
  }

  Status = SafeUintnMult (NumSem, OneSemSize, &TotalSemSize);
  if (RETURN_ERROR (Status)) {
    goto ON_ERROR;
  }

and remove the variable CpuSemSize as well.

> +
> +  DEBUG ((DEBUG_INFO, "[%a] - One Semaphore Size    = 0x%x\n", __func__, OneSemSize));
> +  DEBUG ((DEBUG_INFO, "[%a] - Total Semaphores Size = 0x%x\n", __func__, TotalSemSize));

(15) These are useful, but %x is not suitable for formatting UINTN.
Instead, use %Lx, and cast the values to UINT64:

  DEBUG ((DEBUG_INFO, "[%a] - One Semaphore Size    = 0x%Lx\n", __func__, (UINT64)OneSemSize));
  DEBUG ((DEBUG_INFO, "[%a] - Total Semaphores Size = 0x%Lx\n", __func__, (UINT64)TotalSemSize));

> +
> +  //
> +  // Allocate for Semaphores in the *SmmCpuSyncCtx
> +  //
> +  (*SmmCpuSyncCtx)->SemBufferSize = TotalSemSize;
> +  (*SmmCpuSyncCtx)->SemBuffer     = AllocatePages (EFI_SIZE_TO_PAGES ((*SmmCpuSyncCtx)->SemBufferSize));

(16) I suggest reworking this as follows (will be beneficial later), in
accordance with (4):

  (*SmmCpuSyncCtx)->SemBufferPages = EFI_SIZE_TO_PAGES (TotalSemSize);
  (*SmmCpuSyncCtx)->SemBuffer      = AllocatePages (
                                       (*SmmCpuSyncCtx)->SemBufferPages
                                       );

> +  ASSERT ((*SmmCpuSyncCtx)->SemBuffer != NULL);

(17) Bogus assert; same reason as in point (8).

> +  if ((*SmmCpuSyncCtx)->SemBuffer == NULL) {
> +    Status = RETURN_OUT_OF_RESOURCES;
> +    goto ON_ERROR;
> +  }
> +
> +  ZeroMem ((*SmmCpuSyncCtx)->SemBuffer, TotalSemSize);

(18) First approach: simplify the code by calling AllocateZeroPages()
instead. (It may zero more bytes than strictly necessary, but it's not a
big deal, and the code simplification is worth it.)

(19) Second approach: even better, just drop this call. There is no need
for zeroing the semaphore buffer at all, as we are going to manually set
both the Counter and the individual Run elements, below!

(20) With ZeroMem() gone, evaluate if we still depend on the
BaseMemoryLib class (#include and [LibraryClasses]).

> +
> +  //
> +  // Assign Global Semaphore pointer
> +  //
> +  SemAddr                               = (UINTN)(*SmmCpuSyncCtx)->SemBuffer;
> +  (*SmmCpuSyncCtx)->GlobalSem->Counter  = (UINT32 *)SemAddr;

(21) Side comment (the compiler will catch it for you anyway): it's not
"GlobalSem->Counter" but "GlobalSem.Counter", after point (4).

(22) The explicit (UINT32 *) cast is ugly. We should cast to
(SMM_CPU_SYNC_SEMAPHORE *).

> +  *(*SmmCpuSyncCtx)->GlobalSem->Counter = 0;
> +  DEBUG ((DEBUG_INFO, "[%a] - (*SmmCpuSyncCtx)->GlobalSem->Counter Address: 0x%08x\n", __func__, (UINTN)(*SmmCpuSyncCtx)->GlobalSem->Counter));

(23) problems with this DEBUG line:

(23.1) needlessly verbose,

(23.2) prints UINTN with %x,

(23.3) pads to 8 nibbles even though UINTN can be 64-bit

How about:

  DEBUG ((
    DEBUG_INFO,
    "[%a] - GlobalSem.Counter @ 0x%016Lx\n",
    __func__,
    (UINT64)SemAddr
    ));

> +
> +  SemAddr += GlobalSemSize;

(24) Should be "+= OneSemSize".

> +
> +  //
> +  // Assign CPU Semaphore pointer
> +  //
> +  for (CpuIndex = 0; CpuIndex < NumberOfCpus; CpuIndex++) {
> +    (*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run  = (UINT32 *)(SemAddr + (CpuSemSize / NumberOfCpus) * CpuIndex);
> +    *(*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run = 0;
> +    DEBUG ((DEBUG_INFO, "[%a] - (*SmmCpuSyncCtx)->CpuSem[%d].Run Address: 0x%08x\n", __func__, CpuIndex, (UINTN)(*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run));
> +  }

(25) Extremely over-complicated.

(25.1) The quotient (CpuSemSize / NumberOfCpus) is just OneCpuSemSize,
from the previous SafeUintnMult() call.

(25.2) Using SemAddr as a base address, and then performing a separate
multiplication, is wasteful -- not just computationally, but
semantically. We can simply advance SemAddr here!

(25.3) the expression

  (*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run

is tiresome to read, and so we shouldn't repeat it multiple times!

(25.4) the usual problems with the DEBUG line:

(25.4.1) needlessly verbose

(25.4.2) uses %d for formatting CpuIndex (which is UINTN)

(25.4.3) uses %x for formatting (UINTN)Run

(25.4.4) pads to 8 nibbles even though the Run address can be 64-bit

So:

  SMM_CPU_SYNC_SEMAPHORE_CPU *CpuSem;

  CpuSem = (*SmmCpuSyncCtx)->CpuSem;
  for (CpuIndex = 0; CpuIndex < NumberOfCpus; CpuIndex++) {
    CpuSem->Run  = (SMM_CPU_SYNC_SEMAPHORE *)SemAddr;
    *CpuSem->Run = 0;

    DEBUG ((
      DEBUG_INFO,
      "[%a] - CpuSem[%Lu].Run @ 0x%016Lx\n",
      __func__,
      (UINT64)CpuIndex,
      (UINT64)SemAddr
      ));

    CpuSem++;
    SemAddr += OneSemSize;
  }

> +
> +  return RETURN_SUCCESS;
> +
> +ON_ERROR:
> +  FreePages (*SmmCpuSyncCtx, EFI_SIZE_TO_PAGES (CtxSize));

(26) And then this can be

  FreePool (*SmmCpuSyncCtx);

per comment (9).

> +  return Status;
> +}
> +
> +/**
> +  Deinit an allocated SMM CPU Sync context.
> +
> +  SmmCpuSyncContextDeinit() function is to deinitialize the SMM CPU Sync context; the resources allocated in
> +  SmmCpuSyncContextInit() will be freed.
> +
> +  Note: This function can only be called after SmmCpuSyncContextInit() returns success.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object to be deinitialized.
> +
> +  @retval RETURN_SUCCESS            The SMM CPU Sync context was successfully deinitialized.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> +  @retval RETURN_UNSUPPORTED        Unsupported operation.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncContextDeinit (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
> +  )
> +{
> +  UINTN  SmmCpuSyncCtxSize;
> +
> +  ASSERT (SmmCpuSyncCtx != NULL);

(27) bogus ASSERT

> +  if (SmmCpuSyncCtx == NULL) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  SmmCpuSyncCtxSize = sizeof (SMM_CPU_SYNC_CONTEXT) + sizeof (SMM_CPU_SYNC_SEMAPHORE_GLOBAL) + sizeof (SMM_CPU_SYNC_SEMAPHORE_CPU) * (SmmCpuSyncCtx->NumberOfCpus);
> +
> +  FreePages (SmmCpuSyncCtx->SemBuffer, EFI_SIZE_TO_PAGES (SmmCpuSyncCtx->SemBufferSize));

(28) Per comment (16), this can be simplified as:

  FreePages (SmmCpuSyncCtx->SemBuffer, SmmCpuSyncCtx->SemBufferPages);

> +
> +  FreePages (SmmCpuSyncCtx, EFI_SIZE_TO_PAGES (SmmCpuSyncCtxSize));

(29) Per comments (9) and (26), this should be just

  FreePool (SmmCpuSyncCtx);

(and the variable "SmmCpuSyncCtxSize" should be removed).

> +
> +  return RETURN_SUCCESS;
> +}
> +
> +/**
> +  Reset SMM CPU Sync context.
> +
> +  SmmCpuSyncContextReset() function is to reset SMM CPU Sync context to the initialized state.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object to be reset.
> +
> +  @retval RETURN_SUCCESS            The SMM CPU Sync context was successfully reset.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncContextReset (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
> +  )
> +{
> +  ASSERT (SmmCpuSyncCtx != NULL);

(30) bogus assert

> +  if (SmmCpuSyncCtx == NULL) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  *SmmCpuSyncCtx->GlobalSem->Counter = 0;
> +
> +  return RETURN_SUCCESS;
> +}

(31) Is there anything to do about the Run semaphores here?

> +
> +/**
> +  Get current number of arrived CPU in SMI.
> +
> +  For the traditional CPU synchronization method, the BSP might need to know the current number of arrived CPUs in
> +  SMI to make sure all APs are in SMI. This API can be used for that purpose.
> +
> +  @param[in]      SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in,out]  CpuCount          Current count of arrived CPU in SMI.
> +
> +  @retval RETURN_SUCCESS            Get current number of arrived CPU in SMI successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is NULL.
> +  @retval RETURN_UNSUPPORTED        Unsupported operation.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncGetArrivedCpuCount (
> +  IN     SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN OUT UINTN                 *CpuCount
> +  )
> +{
> +  ASSERT (SmmCpuSyncCtx != NULL && CpuCount != NULL);

(32) bogus assert

> +  if ((SmmCpuSyncCtx == NULL) || (CpuCount == NULL)) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  if (*SmmCpuSyncCtx->GlobalSem->Counter < 0) {

(33) The type of Counter is

  volatile UINT32

therefore this condition will never evaluate to true.

If you want to check for the door being locked, then I suggest

  *SmmCpuSyncCtx->GlobalSem.Counter == (UINT32)-1

or

  *SmmCpuSyncCtx->GlobalSem.Counter == MAX_UINT32

> +    return RETURN_UNSUPPORTED;
> +  }
> +
> +  *CpuCount = *SmmCpuSyncCtx->GlobalSem->Counter;
> +
> +  return RETURN_SUCCESS;
> +}
> +
> +/**
> +  Performs an atomic operation to check in CPU.
> +
> +  When SMI happens, all processors including the BSP enter SMM mode by calling SmmCpuSyncCheckInCpu().
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex          Check in CPU index.
> +
> +  @retval RETURN_SUCCESS            Check in CPU (CpuIndex) successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> +  @retval RETURN_ABORTED            Check in CPU failed due to SmmCpuSyncLockDoor() has been called by one elected CPU.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncCheckInCpu (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex
> +  )
> +{
> +  ASSERT (SmmCpuSyncCtx != NULL);

(34) bogus ASSERT

> +  if (SmmCpuSyncCtx == NULL) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Check to return if Counter has already been locked.
> +  //
> +  if ((INT32)InternalReleaseSemaphore (SmmCpuSyncCtx->GlobalSem->Counter) <= 0) {

(35) The cast and the comparison are bogus.

InternalReleaseSemaphore():

- returns 0, and leaves the semaphore unchanged, if the current value of
the semaphore is MAX_UINT32,

- increments the semaphore, and returns the incremented -- hence:
strictly positive -- UINT32 value, otherwise.

So the condition for

  semaphore unchanged because door has been locked

is:

  InternalReleaseSemaphore (SmmCpuSyncCtx->GlobalSem->Counter) == 0

No INT32 cast, and no "<".


> +    return RETURN_ABORTED;
> +  }
> +
> +  return RETURN_SUCCESS;
> +}
> +
> +/**
> +  Performs an atomic operation to check out CPU.
> +
> +  CheckOutCpu() can be called in the error handling flow for a CPU that called CheckInCpu() earlier.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex          Check out CPU index.
> +
> +  @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> +  @retval RETURN_NOT_READY          The CPU is not checked-in.
> +  @retval RETURN_UNSUPPORTED        Unsupported operation.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncCheckOutCpu (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex
> +  )
> +{
> +  ASSERT (SmmCpuSyncCtx != NULL);

(36) bogus assert


> +  if (SmmCpuSyncCtx == NULL) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  if (*SmmCpuSyncCtx->GlobalSem->Counter == 0) {
> +    return RETURN_NOT_READY;
> +  }

(37) This preliminary check is not particularly useful.

Assume that Counter is currently 1, but -- due to a programming error
somewhere -- there are two APs executing SmmCpuSyncCheckOutCpu() in
parallel. Both may pass this check (due to Counter being 1), and then
one of the APs will consume the semaphore and return, and the other AP
will hang forever.

So this check is "best effort". It's fine -- some programming errors
just inevitably lead to undefined behavior; not all bad usage can be
explicitly caught.

Maybe add a comment?

> +  if ((INT32)InternalWaitForSemaphore (SmmCpuSyncCtx->GlobalSem->Counter) < 0) {
> +    return RETURN_UNSUPPORTED;
> +  }

(38) This doesn't look right. InternalWaitForSemaphore() blocks for as
long as the semaphore is zero. When the semaphore is nonzero,
InternalWaitForSemaphore() decrements it, and returns the decremented
value. Thus, InternalWaitForSemaphore() cannot return negative values
(it returns UINT32), and it also cannot return MAX_UINT32.

So I simply don't understand the purpose of this code.

As written, this condition could only fire if InternalWaitForSemaphore()
successfully decremented the semaphore, and the *new* value of the
semaphore were >=0x8000_0000. Because in that case, the INT32 cast (=
implementation-defined behavior) would produce a negative value. But for
that, we'd first have to increase Counter to 0x8000_0001 at least, and
that could never happen in practice, IMO.

So this is basically dead code. What is the intent?

> +
> +  return RETURN_SUCCESS;
> +}
> +
> +/**
> +  Performs an atomic operation to lock the door for CPU check-in or check-out.
> +
> +  After this function, a CPU cannot check in via SmmCpuSyncCheckInCpu().
> +
> +  The CPU specified by CpuIndex is elected to lock door.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex          Indicate which CPU to lock door.
> +  @param[in,out]  CpuCount          Number of arrived CPUs in SMI after the door is locked.
> +
> +  @retval RETURN_SUCCESS            Lock door for CPU successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is NULL.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncLockDoor (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex,
> +  IN OUT UINTN                 *CpuCount
> +  )
> +{
> +  ASSERT (SmmCpuSyncCtx != NULL && CpuCount != NULL);

(39) bogus assert

> +  if ((SmmCpuSyncCtx == NULL) || (CpuCount == NULL)) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  *CpuCount = InternalLockdownSemaphore (SmmCpuSyncCtx->GlobalSem->Counter);
> +
> +  return RETURN_SUCCESS;
> +}
> +
> +/**
> +  Used by the BSP to wait for APs.
> +
> +  The number of APs to be waited for is specified by NumberOfAPs. The BSP is specified by BspIndex.
> +
> +  Note: This function is blocking, and it will return only after the specified number of APs are released by
> +  calling SmmCpuSyncReleaseBsp():
> +  BSP: WaitForAPs    <--  AP: ReleaseBsp
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      NumberOfAPs       Number of APs the BSP needs to wait for.
> +  @param[in]      BspIndex          The BSP Index to wait for APs.
> +
> +  @retval RETURN_SUCCESS            BSP to wait for APs successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or NumberOfAPs > total number of processors in system.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncWaitForAPs (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 NumberOfAPs,
> +  IN     UINTN                 BspIndex
> +  )
> +{
> +  ASSERT (SmmCpuSyncCtx != NULL && NumberOfAPs <= SmmCpuSyncCtx->NumberOfCpus);

(40) bogus assert

> +  if ((SmmCpuSyncCtx == NULL) || (NumberOfAPs > SmmCpuSyncCtx->NumberOfCpus)) {
> +    return RETURN_INVALID_PARAMETER;
> +  }

(41) Question for both the library instance and the library class (i.e.,
API documentation):

Is it ever valid to call this function with (NumberOfAPs ==
SmmCpuSyncCtx->NumberOfCpus)?

I would think not. NumberOfCpus is supposed to include the BSP and the
APs. Therefore the highest permitted NumberOfAPs value, on input, is
(SmmCpuSyncCtx->NumberOfCpus - 1).

So I think we should modify the lib class and the lib instance both.
RETURN_INVALID_PARAMETER applies to "NumberOfAPs *>=* total number of
processors in system".

> +
> +  while (NumberOfAPs-- > 0) {
> +    InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
> +  }

(42) In my opinion, this is an ugly pattern.

First, after the loop, NumberOfAPs will be MAX_UINTN.

Second, modifying input parameters is also an anti-pattern. Assume you
debug a problem, and fetch a backtrace where the two innermost frames
are SmmCpuSyncWaitForAPs() and InternalWaitForSemaphore(). If you look
at the stack frame that belongs to SmmCpuSyncWaitForAPs(), you may be
led to think that the function was *invoked* with a *low* NumberOfAPs
value. Whereas in fact NumberOfAPs may have been a larger value at the
time of call, only the function decremented NumberOfAPs by the time the
stack trace was fetched.

So, please add a new helper variable, and write a proper "for" loop.

  UINTN  Arrived;

  for (Arrived = 0; Arrived < NumberOfAPs; Arrived++) {
    InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
  }

(43) I mentioned this while reviewing the lib class header (patch#2), so
let me repeat it here:

BspIndex is used for indexing the CpuSem array, but we perform no range
checking, against "SmmCpuSyncCtx->NumberOfCpus".

That error should be documented (in the lib class header), and
caught/reported here.

> +
> +  return RETURN_SUCCESS;
> +}
> +
> +/**
> +  Used by the BSP to release one AP.
> +
> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex          Indicates which AP needs to be released.
> +  @param[in]      BspIndex          The BSP Index to release AP.
> +
> +  @retval RETURN_SUCCESS            BSP to release one AP successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncReleaseOneAp   (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex,
> +  IN     UINTN                 BspIndex
> +  )
> +{
> +  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);

(44) bogus assert

> +  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
> +    return RETURN_INVALID_PARAMETER;
> +  }

(45) range checks for BspIndex and CpuIndex missing (in both lib class
and lib instance)

> +
> +  InternalReleaseSemaphore (SmmCpuSyncCtx->CpuSem[CpuIndex].Run);
> +
> +  return RETURN_SUCCESS;
> +}
> +
> +/**
> +  Used by the AP to wait BSP.
> +
> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> +
> +  Note: This function is blocking mode, and it will return only after the AP released by
> +  calling SmmCpuSyncReleaseOneAp():
> +  BSP: ReleaseOneAp  -->  AP: WaitForBsp
> +
> +  @param[in,out]  SmmCpuSyncCtx    Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex         Indicate which AP wait BSP.
> +  @param[in]      BspIndex         The BSP Index to be waited.
> +
> +  @retval RETURN_SUCCESS            AP to wait BSP successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncWaitForBsp (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex,
> +  IN     UINTN                 BspIndex
> +  )
> +{
> +  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);

(46) bogus assert

> +  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
> +    return RETURN_INVALID_PARAMETER;
> +  }

(47) range checks missing (lib class and instance)

> +
> +  InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[CpuIndex].Run);
> +
> +  return RETURN_SUCCESS;
> +}
> +
> +/**
> +  Used by the AP to release BSP.
> +
> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> +
> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
> +  @param[in]      CpuIndex          Indicate which AP release BSP.
> +  @param[in]      BspIndex          The BSP Index to be released.
> +
> +  @retval RETURN_SUCCESS            AP to release BSP successfully.
> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
> +
> +**/
> +RETURN_STATUS
> +EFIAPI
> +SmmCpuSyncReleaseBsp (
> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> +  IN     UINTN                 CpuIndex,
> +  IN     UINTN                 BspIndex
> +  )
> +{
> +  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);

(48) bogus assert

> +  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
> +    return RETURN_INVALID_PARAMETER;
> +  }

(49) range checks missing (lib class and instance)

> +
> +  InternalReleaseSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
> +
> +  return RETURN_SUCCESS;
> +}
> diff --git a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
> new file mode 100644
> index 0000000000..6bb1895577
> --- /dev/null
> +++ b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
> @@ -0,0 +1,39 @@
> +## @file
> +# SMM CPU Synchronization lib.
> +#
> +# This is SMM CPU Synchronization lib used for SMM CPU sync operations.
> +#
> +# Copyright (c) 2023, Intel Corporation. All rights reserved.<BR>
> +# SPDX-License-Identifier: BSD-2-Clause-Patent
> +#
> +##
> +
> +[Defines]
> +  INF_VERSION                    = 0x00010005
> +  BASE_NAME                      = SmmCpuSyncLib
> +  FILE_GUID                      = 1ca1bc1a-16a4-46ef-956a-ca500fd3381f
> +  MODULE_TYPE                    = DXE_SMM_DRIVER
> +  LIBRARY_CLASS                  = SmmCpuSyncLib|DXE_SMM_DRIVER
> +
> +[Sources]
> +  SmmCpuSyncLib.c
> +
> +[Packages]
> +  MdePkg/MdePkg.dec
> +  MdeModulePkg/MdeModulePkg.dec
> +  UefiCpuPkg/UefiCpuPkg.dec
> +
> +[LibraryClasses]
> +  UefiLib
> +  BaseLib
> +  DebugLib
> +  PrintLib
> +  SafeIntLib
> +  SynchronizationLib
> +  BaseMemoryLib
> +  SmmServicesTableLib
> +  MemoryAllocationLib

(50) Please sort this list alphabetically (cf. comment (2)).

> +
> +[Pcd]
> +
> +[Protocols]

(51) Useless empty INF file sections; please remove them.

> diff --git a/UefiCpuPkg/UefiCpuPkg.dsc b/UefiCpuPkg/UefiCpuPkg.dsc
> index 074fd77461..f264031c77 100644
> --- a/UefiCpuPkg/UefiCpuPkg.dsc
> +++ b/UefiCpuPkg/UefiCpuPkg.dsc
> @@ -23,10 +23,11 @@
>  #
>
>  !include MdePkg/MdeLibs.dsc.inc
>
>  [LibraryClasses]
> +  SafeIntLib|MdePkg/Library/BaseSafeIntLib/BaseSafeIntLib.inf
>    BaseLib|MdePkg/Library/BaseLib/BaseLib.inf
>    BaseMemoryLib|MdePkg/Library/BaseMemoryLib/BaseMemoryLib.inf
>    CpuLib|MdePkg/Library/BaseCpuLib/BaseCpuLib.inf
>    DebugLib|MdePkg/Library/BaseDebugLibNull/BaseDebugLibNull.inf
>    SerialPortLib|MdePkg/Library/BaseSerialPortLibNull/BaseSerialPortLibNull.inf

(52) Just from the context visible here, this list seems alphabetically
sorted pre-patch; if that's the case, please stick with it (don't break
the sort order).

> @@ -54,10 +55,11 @@
>    CacheMaintenanceLib|MdePkg/Library/BaseCacheMaintenanceLib/BaseCacheMaintenanceLib.inf
>    PciLib|MdePkg/Library/BasePciLibPciExpress/BasePciLibPciExpress.inf
>    PciExpressLib|MdePkg/Library/BasePciExpressLib/BasePciExpressLib.inf
>    SmmCpuPlatformHookLib|UefiCpuPkg/Library/SmmCpuPlatformHookLibNull/SmmCpuPlatformHookLibNull.inf
>    SmmCpuFeaturesLib|UefiCpuPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
> +  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
>    PeCoffGetEntryPointLib|MdePkg/Library/BasePeCoffGetEntryPointLib/BasePeCoffGetEntryPointLib.inf
>    PeCoffExtraActionLib|MdePkg/Library/BasePeCoffExtraActionLibNull/BasePeCoffExtraActionLibNull.inf
>    TpmMeasurementLib|MdeModulePkg/Library/TpmMeasurementLibNull/TpmMeasurementLibNull.inf
>    CcExitLib|UefiCpuPkg/Library/CcExitLibNull/CcExitLibNull.inf
>    MicrocodeLib|UefiCpuPkg/Library/MicrocodeLib/MicrocodeLib.inf
> @@ -154,10 +156,11 @@
>    UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.inf
>    UefiCpuPkg/Library/SmmCpuPlatformHookLibNull/SmmCpuPlatformHookLibNull.inf
>    UefiCpuPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.inf
>    UefiCpuPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLibStm.inf
>    UefiCpuPkg/Library/SmmCpuFeaturesLib/StandaloneMmCpuFeaturesLib.inf
> +  UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
>    UefiCpuPkg/Library/CcExitLibNull/CcExitLibNull.inf
>    UefiCpuPkg/PiSmmCommunication/PiSmmCommunicationPei.inf
>    UefiCpuPkg/PiSmmCommunication/PiSmmCommunicationSmm.inf
>    UefiCpuPkg/SecCore/SecCore.inf
>    UefiCpuPkg/SecCore/SecCoreNative.inf

Thanks
Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112476): https://edk2.groups.io/g/devel/message/112476
Mute This Topic: https://groups.io/mt/103010165/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class
  2023-12-13  4:23     ` Wu, Jiaxin
@ 2023-12-13 15:02       ` Laszlo Ersek
  0 siblings, 0 replies; 22+ messages in thread
From: Laszlo Ersek @ 2023-12-13 15:02 UTC (permalink / raw)
  To: Wu, Jiaxin, devel@edk2.groups.io
  Cc: Dong, Eric, Ni, Ray, Zeng, Star, Gerd Hoffmann, Kumar, Rahul R

On 12/13/23 05:23, Wu, Jiaxin wrote:
>>
>> Thanks. This documentation (in the commit message and the lib class
>> header file) seems really good (especially with the formatting updates
>> suggested by Ray).
>>
>> (1) I think there is one typo: exist <-> exits.
>>
> 
> agree, I will fix this.
> 
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncContextReset (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
>>> +  );
>>
>> (2) The file-top documentation says about this API: "ContextReset() is
>> called before CPU exist SMI, which allows CPU to check into the next SMI
>> from this point".
>>
>> It is not clear *which* CPU is supposed to call ContextReset (and the
>> function does not take a CPU index). Can you explain this in the
>> documentation? (Assuming my observation makes sense.)
>>
> 
> For the SMM CPU driver, it shall be the BSP that calls the function, since the BSP gathers all APs to exit SMM synchronously; it is the BSP's role to control the overall SMI execution flow.
> 
> For the API itself, I don't restrict which CPU can do that; it depends on the consumer. So it's not mandatory, which is the reason I don't mention it.
> 
> But you are right, there is a requirement here: the elected CPU calling this function needs to make sure all CPUs are ready to exit SMI. I can clearly document this requirement as below:
> 
> "This function is called by one of CPUs after all CPUs are ready to exist SMI, which allows CPU to check into the next SMI from this point."
> 
> Besides, one comment from Ray: we can ASSERT that SmmCpuSyncCtx is not NULL; we don't need a return status to handle all such cases. If so, the RETURN_STATUS is not required.
> 
> So, based on all above, I will update the API as below:
> 
> /**
>   Reset SMM CPU Sync context. SMM CPU Sync context will be reset to the initialized state.
> 
>   This function is called by one of CPUs after all CPUs are ready to exist SMI, which allows CPU to
>   check into the next SMI from this point.
> 
>   If Context is NULL, then ASSERT().
> 
>   @param[in,out]  Context     Pointer to the SMM CPU Sync context object to be reset.
> 
> **/
> VOID
> EFIAPI
> SmmCpuSyncContextReset (
>   IN OUT SMM_CPU_SYNC_CONTEXT  *Context
>   );

Looks good, thanks -- except, there is again the same typo: "ready to
exist SMI". Should be "ready to exit".

I also agree that asserting (Context != NULL) is valid, as long as we
document that passing in a non-NULL context is the caller's
responsibility. (And we do that above, so fine.)

> 
> 
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncGetArrivedCpuCount (
>>> +  IN     SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN OUT UINTN                 *CpuCount
>>> +  );
>>
>> (3) Why is CpuCount IN OUT? I would think just OUT should suffice.
>>
>>
> 
> Agree. I will correct all similar cases. Besides, I also received comments from Ray offline:
> 1. We can ASSERT that SmmCpuSyncCtx is not NULL; no need to return a status to handle that.
> 2. We don't need RETURN_UNSUPPORTED; GetArrivedCpuCount() should always be supported.
> With the above comments, I will update the API as below to return the count directly, which also aligns with the function name:
> 
> /**
>   Get current number of arrived CPU in SMI.
> 
>   BSP might need to know the current number of arrived CPU in SMI to make sure all APs
>   in SMI. This API can be for that purpose.
> 
>   If Context is NULL, then ASSERT().
> 
>   @param[in]      Context     Pointer to the SMM CPU Sync context object.
> 
>   @retval    Current number of arrived CPU in SMI.
> 
> **/
> UINTN
> EFIAPI
> SmmCpuSyncGetArrivedCpuCount (
>   IN  SMM_CPU_SYNC_CONTEXT  *Context
>   );

Sure, if you can guarantee at the lib *class* level that this API always
succeeds as long as Context is not NULL, then this update looks fine.

> 
>  
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncCheckInCpu (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex
>>> +  );
>>
>> (4) Do we need an error condition for CpuIndex being out of range?
>>
> 
> Good idea. We can handle this check within ASSERT. Then I will update all similar cases by adding the below comment in the API:
> 
> "If CpuIndex exceeds the range of all CPUs in the system, then ASSERT()."
> 
> For example:
> /**
>   Performs an atomic operation to check in CPU.
> 
>   When SMI happens, all processors including BSP enter to SMM mode by calling SmmCpuSyncCheckInCpu().
> 
>   If Context is NULL, then ASSERT().
>   If CpuIndex exceeds the range of all CPUs in the system, then ASSERT().
> 
>   @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
>   @param[in]      CpuIndex          Check in CPU index.
> 
>   @retval RETURN_SUCCESS            Check in CPU (CpuIndex) successfully.
>   @retval RETURN_ABORTED            Check in CPU failed due to SmmCpuSyncLockDoor() has been called by one elected CPU.
> 
> **/
> RETURN_STATUS
> EFIAPI
> SmmCpuSyncCheckInCpu (
>   IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
>   IN     UINTN                 CpuIndex
>   );

Works for me.

My main points are:

- if we consider a particular input condition a *programming error* (and
we document it as such), then asserting the opposite of that condition
is fine

- if we consider an input condition a particular data / environment
issue, then catching it / reporting it with an explicit status code is fine.

- point is, for *any* problematic input condition, we should decide if
we handle it with *either* assert (in which case the caller is
responsible for preventing that condition upon input), *or* with a
retval (in which case the caller is responsible for handling the
circumstance after the call returns). Handling a given input state with
*both* approaches at the same time is totally bogus.
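The two contract styles can be sketched side by side — a minimal plain-C sketch, using libc assert as a stand-in for edk2's ASSERT, with reduced stand-in types (SYNC_CTX and both function names are invented for illustration):

```c
#include <assert.h>
#include <stddef.h>

typedef int RETURN_STATUS;
#define RETURN_SUCCESS            0
#define RETURN_INVALID_PARAMETER  2

typedef struct {
  unsigned NumberOfCpus;
} SYNC_CTX;   /* reduced stand-in for SMM_CPU_SYNC_CONTEXT */

/* Style 1: a NULL context is a documented *programming error*.  The
   caller guarantees Ctx != NULL; the callee only asserts, and the
   condition never appears among the return values. */
unsigned
GetCpuCountAssertStyle (SYNC_CTX *Ctx)
{
  assert (Ctx != NULL);
  return Ctx->NumberOfCpus;
}

/* Style 2: a NULL context is a reportable *data/environment issue*.
   The callee returns a status and the caller handles it; no assert. */
RETURN_STATUS
GetCpuCountRetvalStyle (SYNC_CTX *Ctx, unsigned *Count)
{
  if ((Ctx == NULL) || (Count == NULL)) {
    return RETURN_INVALID_PARAMETER;
  }
  *Count = Ctx->NumberOfCpus;
  return RETURN_SUCCESS;
}
```

Asserting a condition *and* returning a status for the same condition, as several functions in the patch currently do, commits to neither contract — which is the point above.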



> 
> 
>> (5) Do we have a special CpuIndex value for the BSP?
>>
> 
> No, the elected BSP is also part of the CPUs, with its own CpuIndex value.
> 
> 
>>> +
>>> +/**
>>> +  Performs an atomic operation to check out CPU.
>>> +
>>> +  CheckOutCpu() can be called in error handling flow for the CPU who calls
>> CheckInCpu() earlier.
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
>> context object.
>>> +  @param[in]      CpuIndex          Check out CPU index.
>>> +
>>> +  @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
>>> +  @retval RETURN_NOT_READY          The CPU is not checked-in.
>>> +  @retval RETURN_UNSUPPORTED        Unsupported operation.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncCheckOutCpu (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex
>>> +  );
>>> +
>>
>> (6) Looks good, my only question is again if we need a status code for
>> CpuIndex being out of range.
>>
> 
> Yes, agree. Ray's comment is that we don't need to handle RETURN_INVALID_PARAMETER; just ASSERT for better performance, so we can avoid the if-check in the release version. Just document the assert.
> 
> To better define the API behavior, I will remove RETURN_UNSUPPORTED and replace it with RETURN_ABORTED, which aligns with the check-in behavior if we don't want to support check-out after the door is locked. RETURN_ABORTED documents the behavior more clearly than RETURN_UNSUPPORTED.
> 
> So, the API would be as below:
> /**
>   Performs an atomic operation to check out CPU.
> 
>   This function can be called in error handling flow for the CPU who calls CheckInCpu() earlier.
> 
>   If Context is NULL, then ASSERT().
>   If CpuIndex exceeds the range of all CPUs in the system, then ASSERT().
> 
>   @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
>   @param[in]      CpuIndex          Check out CPU index.
> 
>   @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
>   @retval RETURN_ABORTED            Check out CPU failed due to SmmCpuSyncLockDoor() has been called by one elected CPU.
> 
> **/
> RETURN_STATUS
> EFIAPI
> SmmCpuSyncCheckOutCpu (
>   IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
>   IN     UINTN                 CpuIndex
>   );
> 
> 
>>> +/**
>>> +  Performs an atomic operation lock door for CPU checkin or checkout.
>>> +
>>> +  After this function, CPU can not check in via SmmCpuSyncCheckInCpu().
>>> +
>>> +  The CPU specified by CpuIndex is elected to lock door.
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
>> context object.
>>> +  @param[in]      CpuIndex          Indicate which CPU to lock door.
>>> +  @param[in,out]  CpuCount          Number of arrived CPU in SMI after look
>> door.
>>> +
>>> +  @retval RETURN_SUCCESS            Lock door for CPU successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is
>> NULL.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncLockDoor (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex,
>>> +  IN OUT UINTN                 *CpuCount
>>> +  );
>>
>> This is where it's getting tricky :)
>>
>> (7) error condition for CpuIndex being out of range?
> 
> Yes, agree. Same as the case above: it will be handled in the ASSERT and documented in the comments.
> 
>>
>> (8) why is CpuCount IN OUT and not just OUT? (Other than that, I can see
>> how outputting CpuCout at once can be useful.)
>>
> 
> Well, I agree it should be OUT only.
> CpuCount tells the number of CPUs that arrived in SMI after the door is locked. The SMM CPU driver needs to know that number for later rendezvous & sync usage. So, it returns the CpuCount.
> 
> 
>> (9) do we need error conditions for:
>>
>> (9.1) CpuIndex being "wrong" (i.e., not the CPU that's "supposed" to
>> lock the door)?
>>
>> (9.2) CpuIndex not having checked in already, before trying to lock the
>> door?
>>
>> Now, I can imagine that problem (9.1) is undetectable, i.e., it causes
>> undefined behavior. That's fine, but then we should mention that.
>>
> 
> Actually, CpuIndex might not be used by the lib instance. It is here just for future extension, in case some lib instance needs to know which CPU locks the door; it's information provided through the API.
> I don't add the error check for those because I don't want to force the implementation to do this check.
> 
>  But I agree, we can document this undefined behavior. How about like below:
>   "The CPU specified by CpuIndex is elected to lock door. The caller shall make sure the CpuIndex is the actual CPU calling this function to avoid the undefined behavior."
> 
>  With above, I will update the API like below:
> 
> /**
>   Performs an atomic operation lock door for CPU checkin and checkout. After this function:
>   CPU can not check in via SmmCpuSyncCheckInCpu().
>   CPU can not check out via SmmCpuSyncCheckOutCpu().
> 
>   The CPU specified by CpuIndex is elected to lock door. The caller shall make sure the CpuIndex
>   is the actual CPU calling this function to avoid the undefined behavior.
> 
>   If Context is NULL, then ASSERT().
>   If CpuCount is NULL, then ASSERT().
> 
>   @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
>   @param[in]      CpuIndex          Indicate which CPU to lock door.
>   @param[in,out]  CpuCount          Number of arrived CPU in SMI after look door.
> 
> **/
> VOID
> EFIAPI
> SmmCpuSyncLockDoor (
>   IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
>   IN          UINTN                 CpuIndex,
>        OUT UINTN                 *CpuCount
>   );

Works for me!

> 
> 
> 
>>> +
>>> +/**
>>> +  Used by the BSP to wait for APs.
>>> +
>>> +  The number of APs need to be waited is specified by NumberOfAPs. The
>> BSP is specified by BspIndex.
>>> +
>>> +  Note: This function is blocking mode, and it will return only after the
>> number of APs released by
>>> +  calling SmmCpuSyncReleaseBsp():
>>> +  BSP: WaitForAPs    <--  AP: ReleaseBsp
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
>> context object.
>>> +  @param[in]      NumberOfAPs       Number of APs need to be waited by
>> BSP.
>>> +  @param[in]      BspIndex          The BSP Index to wait for APs.
>>> +
>>> +  @retval RETURN_SUCCESS            BSP to wait for APs successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
>> NumberOfAPs > total number of processors in system.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncWaitForAPs (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 NumberOfAPs,
>>> +  IN     UINTN                 BspIndex
>>> +  );
>>
>> The "NumberOfAPs > total number of processors in system" check is nice!
>>
>> (10) Again, do we need a similar error condition for BspIndex being out
>> of range?
>>
> 
> Agree, I will handle the case in the same way as above in the ASSERT. If so, no need return the status.
> 
> 
>> (11) Do we need to document / enforce explicitly (status code) that the
>> BSP and the APs must have checked in, and/or the door must have been
>> locked? Again -- if we can't detect / enforce these conditions, that's
>> fine, but then we should mention the expected call environment. The
>> file-top description does not seem very explicit about it.
>>
> 
> Agree; if BspIndex is the actual CPU calling this function, it must have checked in before. So, how about adding the comment as below:
>   " The caller shall make sure the BspIndex is the actual CPU calling this function to avoid the undefined behavior."
> 
> Based on above, I propose the API to be:
> 
> /**
>   Used by the BSP to wait for APs.
> 
>   The number of APs need to be waited is specified by NumberOfAPs. The BSP is specified by BspIndex.
>   The caller shall make sure the BspIndex is the actual CPU calling this function to avoid the undefined behavior.
>   The caller shall make sure the NumberOfAPs have checked-in to avoid the undefined behavior.
> 
>   If Context is NULL, then ASSERT().
>   If NumberOfAPs > All CPUs in system, then ASSERT().
>   If BspIndex exceeds the range of all CPUs in the system, then ASSERT().
> 
>   Note:
>   This function is blocking mode, and it will return only after the number of APs released by
>   calling SmmCpuSyncReleaseBsp():
>   BSP: WaitForAPs    <--  AP: ReleaseBsp
> 
>   @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
>   @param[in]      NumberOfAPs       Number of APs need to be waited by BSP.
>   @param[in]      BspIndex          The BSP Index to wait for APs.
> 
> **/
> VOID
> EFIAPI
> SmmCpuSyncWaitForAPs (
>   IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
>   IN     UINTN                 NumberOfAPs,
>   IN     UINTN                 BspIndex
>   );

OK, thanks.

> 
>>> +
>>> +/**
>>> +  Used by the BSP to release one AP.
>>> +
>>> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
>> context object.
>>> +  @param[in]      CpuIndex          Indicate which AP need to be released.
>>> +  @param[in]      BspIndex          The BSP Index to release AP.
>>> +
>>> +  @retval RETURN_SUCCESS            BSP to release one AP successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
>> CpuIndex is same as BspIndex.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncReleaseOneAp   (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex,
>>> +  IN     UINTN                 BspIndex
>>> +  );
>>
>> (12) Same comments as elsewhere:
>>
>> - it's good that we check CpuIndex versus BspIndex, but do we also need
>> to range-check each?
>>
> 
> Agree.
> 
>> - document that both affected CPUs need to have checked in, with the
>> door potentially locked?
>>
> 
> Yes, for the SMM CPU driver, it shall be called after the door is locked. For the API itself, it's better not to restrict it; the only requirement I can see is that CpuIndex must have checked in. So, I will refine it as below:
> /**
>   Used by the BSP to release one AP.
> 
>   The AP is specified by CpuIndex. The BSP is specified by BspIndex.
>   The caller shall make sure the BspIndex is the actual CPU calling this function to avoid the undefined behavior.
>   The caller shall make sure the CpuIndex has checked-in to avoid the undefined behavior.
> 
>   If Context is NULL, then ASSERT().
>   If CpuIndex == BspIndex, then ASSERT().
>   If BspIndex and CpuIndex exceed the range of all CPUs in the system, then ASSERT().
> 
>   @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
>   @param[in]      CpuIndex          Indicate which AP need to be released.
>   @param[in]      BspIndex          The BSP Index to release AP.
> 
> **/
> VOID
> EFIAPI
> SmmCpuSyncReleaseOneAp   (
>   IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
>   IN     UINTN                 CpuIndex,
>   IN     UINTN                 BspIndex
>   );

OK.

(Small update: the comment should say: "If BspIndex *or* CpuIndex exceed
the range ...". For the other functions too, below.)

> 
> 
> 
>>
>>> +
>>> +/**
>>> +  Used by the AP to wait BSP.
>>> +
>>> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
>>> +
>>> +  Note: This function is blocking mode, and it will return only after the AP
>> released by
>>> +  calling SmmCpuSyncReleaseOneAp():
>>> +  BSP: ReleaseOneAp  -->  AP: WaitForBsp
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx    Pointer to the SMM CPU Sync context
>> object.
>>> +  @param[in]      CpuIndex         Indicate which AP wait BSP.
>>> +  @param[in]      BspIndex         The BSP Index to be waited.
>>> +
>>> +  @retval RETURN_SUCCESS            AP to wait BSP successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
>> CpuIndex is same as BspIndex.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncWaitForBsp (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex,
>>> +  IN     UINTN                 BspIndex
>>> +  );
>>> +
>>
>> (13) Same questions as under (12).
>>
> 
> See below proposed API:
> 
> /**
>   Used by the AP to wait BSP.
> 
>   The AP is specified by CpuIndex.
>   The caller shall make sure the CpuIndex is the actual CPU calling this function to avoid the undefined behavior.
>   The BSP is specified by BspIndex.
> 
>   If Context is NULL, then ASSERT().
>   If CpuIndex == BspIndex, then ASSERT().
>   If BspIndex and CpuIndex exceed the range of all CPUs in the system, then ASSERT().
> 
>   Note:
>   This function is blocking mode, and it will return only after the AP released by
>   calling SmmCpuSyncReleaseOneAp():
>   BSP: ReleaseOneAp  -->  AP: WaitForBsp
> 
>   @param[in,out]  Context          Pointer to the SMM CPU Sync context object.
>   @param[in]      CpuIndex         Indicate which AP wait BSP.
>   @param[in]      BspIndex         The BSP Index to be waited.
> 
> **/
> VOID
> EFIAPI
> SmmCpuSyncWaitForBsp (
>   IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
>   IN     UINTN                 CpuIndex,
>   IN     UINTN                 BspIndex
>   );

OK, thanks.

> 
> 
>>> +/**
>>> +  Used by the AP to release BSP.
>>> +
>>> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
>> context object.
>>> +  @param[in]      CpuIndex          Indicate which AP release BSP.
>>> +  @param[in]      BspIndex          The BSP Index to be released.
>>> +
>>> +  @retval RETURN_SUCCESS            AP to release BSP successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
>> CpuIndex is same as BspIndex.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncReleaseBsp (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex,
>>> +  IN     UINTN                 BspIndex
>>> +  );
>>
>> (14) Same questions as under (12).
>>
> 
> See below proposed API:
> 
> /**
>   Used by the AP to release BSP.
> 
>   The AP is specified by CpuIndex.
>   The caller shall make sure the CpuIndex is the actual CPU calling this function to avoid the undefined behavior.
>   The BSP is specified by BspIndex.
> 
>   If Context is NULL, then ASSERT().
>   If CpuIndex == BspIndex, then ASSERT().
>   If BspIndex and CpuIndex exceed the range of all CPUs in the system, then ASSERT().
> 
>   @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
>   @param[in]      CpuIndex          Indicate which AP release BSP.
>   @param[in]      BspIndex          The BSP Index to be released.
> 
> **/
> VOID
> EFIAPI
> SmmCpuSyncReleaseBsp (
>   IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
>   IN     UINTN                 CpuIndex,
>   IN     UINTN                 BspIndex
>   );
> 

Thanks!
Laszlo

> 
> Thanks,
> Jiaxin 
> 
> 
>>> +
>>> +#endif
>>> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec
>>> index 0b5431dbf7..20ab079219 100644
>>> --- a/UefiCpuPkg/UefiCpuPkg.dec
>>> +++ b/UefiCpuPkg/UefiCpuPkg.dec
>>> @@ -62,10 +62,13 @@
>>>    CpuPageTableLib|Include/Library/CpuPageTableLib.h
>>>
>>>    ## @libraryclass   Provides functions for manipulating smram savestate
>> registers.
>>>    MmSaveStateLib|Include/Library/MmSaveStateLib.h
>>>
>>> +  ## @libraryclass   Provides functions for SMM CPU Sync Operation.
>>> +  SmmCpuSyncLib|Include/Library/SmmCpuSyncLib.h
>>> +
>>>  [LibraryClasses.RISCV64]
>>>    ##  @libraryclass  Provides functions to manage MMU features on RISCV64
>> CPUs.
>>>    ##
>>>    RiscVMmuLib|Include/Library/BaseRiscVMmuLib.h
>>>
>>
>> These interfaces look real nice, my comments/questions are all docs-related.
>>
>> Thanks!
>> Laszlo
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112484): https://edk2.groups.io/g/devel/message/112484
Mute This Topic: https://groups.io/mt/103010164/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 4/6] OvmfPkg: Specifies SmmCpuSyncLib instance
  2023-12-06 10:01 ` [edk2-devel] [PATCH v3 4/6] OvmfPkg: Specifies SmmCpuSyncLib instance Wu, Jiaxin
@ 2023-12-13 16:52   ` Laszlo Ersek
  2023-12-14 13:43     ` Wu, Jiaxin
  2023-12-15  0:21     ` Ni, Ray
  0 siblings, 2 replies; 22+ messages in thread
From: Laszlo Ersek @ 2023-12-13 16:52 UTC (permalink / raw)
  To: devel, jiaxin.wu
  Cc: Ard Biesheuvel, Jiewen Yao, Jordan Justen, Eric Dong, Ray Ni,
	Zeng Star, Rahul Kumar, Gerd Hoffmann

On 12/6/23 11:01, Wu, Jiaxin wrote:
> This patch is to specify SmmCpuSyncLib instance for OvmfPkg.
> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
> Cc: Jiewen Yao <jiewen.yao@intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Ray Ni <ray.ni@intel.com>
> Cc: Zeng Star <star.zeng@intel.com>
> Cc: Rahul Kumar <rahul1.kumar@intel.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
> ---
>  OvmfPkg/CloudHv/CloudHvX64.dsc | 2 ++
>  OvmfPkg/OvmfPkgIa32.dsc        | 2 ++
>  OvmfPkg/OvmfPkgIa32X64.dsc     | 2 ++
>  OvmfPkg/OvmfPkgX64.dsc         | 1 +
>  4 files changed, 7 insertions(+)
> 
> diff --git a/OvmfPkg/CloudHv/CloudHvX64.dsc b/OvmfPkg/CloudHv/CloudHvX64.dsc
> index 821ad1b9fa..f735b69a37 100644
> --- a/OvmfPkg/CloudHv/CloudHvX64.dsc
> +++ b/OvmfPkg/CloudHv/CloudHvX64.dsc
> @@ -183,10 +183,12 @@
>    PeiHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/PeiHardwareInfoLib.inf
>    DxeHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/DxeHardwareInfoLib.inf
>    ImagePropertiesRecordLib|MdeModulePkg/Library/ImagePropertiesRecordLib/ImagePropertiesRecordLib.inf
>  !if $(SMM_REQUIRE) == FALSE
>    LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
> +!else
> +  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
>  !endif
>    CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
>    FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
>    MemEncryptTdxLib|OvmfPkg/Library/BaseMemEncryptTdxLib/BaseMemEncryptTdxLib.inf
>  
> diff --git a/OvmfPkg/OvmfPkgIa32.dsc b/OvmfPkg/OvmfPkgIa32.dsc
> index bce2aedcd7..b05b13b18c 100644
> --- a/OvmfPkg/OvmfPkgIa32.dsc
> +++ b/OvmfPkg/OvmfPkgIa32.dsc
> @@ -188,10 +188,12 @@
>    PeiHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/PeiHardwareInfoLib.inf
>    DxeHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/DxeHardwareInfoLib.inf
>    ImagePropertiesRecordLib|MdeModulePkg/Library/ImagePropertiesRecordLib/ImagePropertiesRecordLib.inf
>  !if $(SMM_REQUIRE) == FALSE
>    LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
> +!else
> +  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
>  !endif
>    CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
>    FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
>  
>  !if $(SOURCE_DEBUG_ENABLE) == TRUE
> diff --git a/OvmfPkg/OvmfPkgIa32X64.dsc b/OvmfPkg/OvmfPkgIa32X64.dsc
> index 631e909a54..5a16eb7abe 100644
> --- a/OvmfPkg/OvmfPkgIa32X64.dsc
> +++ b/OvmfPkg/OvmfPkgIa32X64.dsc
> @@ -193,10 +193,12 @@
>    PeiHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/PeiHardwareInfoLib.inf
>    DxeHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/DxeHardwareInfoLib.inf
>    ImagePropertiesRecordLib|MdeModulePkg/Library/ImagePropertiesRecordLib/ImagePropertiesRecordLib.inf
>  !if $(SMM_REQUIRE) == FALSE
>    LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
> +!else
> +  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
>  !endif
>    CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
>    FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
>  
>  !if $(SOURCE_DEBUG_ENABLE) == TRUE
> diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
> index 4ea3008cc6..6bb4c777b9 100644
> --- a/OvmfPkg/OvmfPkgX64.dsc
> +++ b/OvmfPkg/OvmfPkgX64.dsc
> @@ -209,10 +209,11 @@
>  !if $(SMM_REQUIRE) == FALSE
>    LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
>    CcProbeLib|OvmfPkg/Library/CcProbeLib/DxeCcProbeLib.inf
>  !else
>    CcProbeLib|MdePkg/Library/CcProbeLibNull/CcProbeLibNull.inf
> +  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
>  !endif
>    CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
>    FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
>  
>  !if $(SOURCE_DEBUG_ENABLE) == TRUE

All four DSC files already include "PiSmmCpuDxeSmm.inf" like this:

  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf {
    <LibraryClasses>
      ...
  }

Given that this new library class is again exclusively used by
PiSmmCpuDxeSmm, can you please resolve this lib class too in module
scope only?
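For illustration only -- a sketch of what I mean, not copied from any actual patch, and the exact surrounding content of each DSC will differ:

```
  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf {
    <LibraryClasses>
      ...
      SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
  }
```

That way the lib class resolution stays scoped to the only module that consumes it.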

Thanks!
Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112487): https://edk2.groups.io/g/devel/message/112487
Mute This Topic: https://groups.io/mt/103010166/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance
  2023-12-13 14:34   ` Laszlo Ersek
@ 2023-12-14 11:11     ` Wu, Jiaxin
  2023-12-14 13:48       ` Laszlo Ersek
  0 siblings, 1 reply; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-14 11:11 UTC (permalink / raw)
  To: Laszlo Ersek, devel@edk2.groups.io
  Cc: Dong, Eric, Ni, Ray, Zeng, Star, Gerd Hoffmann, Kumar, Rahul R

Hi Laszlo,

Really appreciate your comments! I checked them one by one; my feedback is below. Thank you and Ray, again and again, for the patch refinement!


> 
> (1) If / when you update the documentation in patch#2, please update
> this one as well.
> 

Yes, I will do the alignment.

> (2) Please sort the #include list alphabetically.
> 
> (The idea is that the [LibraryClasses] section in the INF file should be
> sorted as well, and then we can easily verify whether those two lists
> match each other -- modulo <Library/SmmCpuSyncLib.h>, of course.)
> 

Agree.

> (3) We can improve this, as follows:
> 
>   typedef volatile UINT32 SMM_CPU_SYNC_SEMAPHORE;

Good comment. Agree.


> 
>   typedef struct {
>     SMM_CPU_SYNC_SEMAPHORE *Counter;
>   } SMM_CPU_SYNC_SEMAPHORE_GLOBAL;
> 
>   typedef struct {
>     SMM_CPU_SYNC_SEMAPHORE *Run;
>   } SMM_CPU_SYNC_SEMAPHORE_CPU;
> 
> Because, while it *indeed* makes some sense to introduce these separate
> wrapper structures, we should still ensure that the internals are
> identical. This will come handy later.
> 

After checking with Ray, I'm convinced we don't need to wrap "Counter" into SMM_CPU_SYNC_SEMAPHORE_GLOBAL. He thinks it's overdesigned, since currently we only have one global semaphore. Defining "Counter" directly in SMM_CPU_SYNC_CONTEXT can also give "Counter" its own CPU cache line, and we don't need to consider future extension. So, I agree to move Counter into SMM_CPU_SYNC_CONTEXT. What's your opinion?

But for *Run*, he is OK with wrapping it into a structure, since there is one per CPU, and it benefits and simplifies the coding logic: it lets us easily reach a specific CPU's semaphore by index while meeting the semaphore size alignment requirement.

> 
> (4) This is too complicated, in my opinion.
> 
> (4.1) First of all, please add a *conspicuous* comment to the
> SMM_CPU_SYNC_CONTEXT here, explaining that the whole idea is to place
> the Counter and Run semaphores on different CPU cache lines, for good
> performance. That's the *core* principle of this whole structure --
> that's why we have an array of pointers to semaphores, rather than an
> array of semaphores directly.
> 
> You didn't document that principle, and I had to spend a lot of time
> deducing that fact from the SmmCpuSyncContextInit() function.

Sorry about that, I will document for the SMM_CPU_SYNC_CONTEXT definition.


> 
> (4.2) The structure should go like this:
> 
> struct SMM_CPU_SYNC_CONTEXT  {
>   UINTN                            NumberOfCpus;
>   VOID                             *SemBuffer;
>   UINTN                            SemBufferPages;
>   SMM_CPU_SYNC_SEMAPHORE_GLOBAL    GlobalSem;
>   SMM_CPU_SYNC_SEMAPHORE_CPU       CpuSem[];
> };
> 
> Details:
> 
> - move NumberOfCpus to the top
> 
> - change the type of SemBuffer from (UINTN*) to (VOID*)
> 
> - replace SemBufferSize with SemBufferPages

Oh? Laszlo, could you explain why it's better to define it as "SemBufferPages" instead of "SemBufferSize" in bytes?


> 
> - move GlobalSem and CpuSem to the end
> 
> - We need exactly one SMM_CPU_SYNC_SEMAPHORE_GLOBAL, therefore
> embed
> GlobalSem directly as a field (it should not be a pointer)
> 
> - We can much simplify the code by turning CpuSem into a *flexible array
> member* (this is a C99 feature that is already widely used in edk2).
> This is why we move CpuSem to the end (and then we keep GlobalSem
> nearby, for clarity).

Agree! The same comment from Ray! Thank you and Ray, both!!!

>

Really great comment here:

Based on my above explanation, I propose to below definition:

///
/// Each semaphore shall be placed on an exclusive cache line for good performance.
///
typedef volatile UINT32 SMM_CPU_SYNC_SEMAPHORE;

typedef struct {
  ///
  /// Used to control whether each CPU continues to run or waits for a signal
  ///
  SMM_CPU_SYNC_SEMAPHORE    *Run;
} SMM_CPU_SYNC_SEMAPHORE_FOR_EACH_CPU;

struct SMM_CPU_SYNC_CONTEXT  {
  ///
  /// Indicate all CPUs in the system.
  ///
  UINTN                                  NumberOfCpus;
  ///
  /// Address of semaphores
  ///
  VOID                                   *SemBuffer;
  ///
  /// Size in bytes of semaphores
  ///
  UINTN                                  SemBufferSize;      ----> I can change to pages based on your feedback:).
  ///
  /// Indicate CPUs entered SMM.
  ///
  SMM_CPU_SYNC_SEMAPHORE                 *CpuCount;
  ///
  /// Define an array of structures, one per CPU semaphore, due to the size alignment
  /// requirement. With this array it is easy to reach a specific CPU's own semaphore
  /// via its CPU index: CpuSem[CpuIndex].
  ///
  SMM_CPU_SYNC_SEMAPHORE_FOR_EACH_CPU    CpuSem[];
};


> 
> (5) Please make these Internal functions STATIC.
> 
> Better yet, please make *all* functions that are not EFIAPI, STATIC.

Agree.


> > +  ASSERT (SmmCpuSyncCtx != NULL);
> 
> (6) This assert is unnecessary and wrong; we perform correct error
> checking already.

Agree, I will remove the assert, but keep the check.

> 
> So, several comments on this section:
> 
> (7) the separate NULL assignment to (*SmmCpuSyncCtx) is superfluous, we
> overwrite the object immediately after.
> 

Agree.

> (8) the ASSERT() is superfluous and wrong; we already check for -- and
> report -- allocation failure correctly.

Agree, will remove the assert here.


> 
> (9) *page* allocation is useless / wasteful here; the main sync context
> structure can be allocated from *pool*
> 

Agree! The same comment from Ray! Again, thank you and Ray, both!!!



> (10) SafeIntLib APIs return RETURN_STATUS (and Status already has type
> RETURN_STATUS), so we should use RETURN_ERROR() rather than
> EFI_ERROR()
> -- it's more idiomatic

Agree.

> 
> (11) Referring back to (4), SMM_CPU_SYNC_SEMAPHORE_GLOBAL should
> not be
> counted (added) separately, because it is embedded in
> SMM_CPU_SYNC_CONTEXT directly.
> 
> > +
> > +  (*SmmCpuSyncCtx)->GlobalSem    =
> (SMM_CPU_SYNC_SEMAPHORE_GLOBAL *)((UINT8 *)(*SmmCpuSyncCtx) +
> sizeof (SMM_CPU_SYNC_CONTEXT));
> > +  (*SmmCpuSyncCtx)->CpuSem       = (SMM_CPU_SYNC_SEMAPHORE_CPU
> *)((UINT8 *)(*SmmCpuSyncCtx) + sizeof (SMM_CPU_SYNC_CONTEXT) +
> sizeof (SMM_CPU_SYNC_SEMAPHORE_GLOBAL));
> 
> (12) And then these two assignments should be dropped.
> 

Yes, agree, with my proposed definition, I will refine it.


> > +  (*SmmCpuSyncCtx)->NumberOfCpus = NumberOfCpus;
> > +
> > +  //
> > +  // Count the TotalSemSize
> > +  //
> > +  OneSemSize = GetSpinLockProperties ();
> > +
> > +  Status = SafeUintnMult (OneSemSize, sizeof
> (SMM_CPU_SYNC_SEMAPHORE_GLOBAL) / sizeof (VOID *), &GlobalSemSize);
> > +  if (EFI_ERROR (Status)) {
> > +    goto ON_ERROR;
> > +  }
> > +
> > +  Status = SafeUintnMult (OneSemSize, sizeof
> (SMM_CPU_SYNC_SEMAPHORE_CPU) / sizeof (VOID *), &OneCpuSemSize);
> > +  if (EFI_ERROR (Status)) {
> > +    goto ON_ERROR;
> > +  }
> 
> (13) I find this obscure and misleading. How about this one instead:
> 
>   UINTN  CacheLineSize;
> 
>   CacheLineSize = GetSpinLockProperties ();
>   OneSemSize    = ALIGN_VALUE (sizeof (SMM_CPU_SYNC_SEMAPHORE),
> CacheLineSize);
> 
> and then eliminate GlobalSemSize and OneCpuSemSize altogether.
> 
> The above construct will ensure that
> 
> (a) OneSemSize is just large enough for placing semaphores on different
> cache lines, and that
> 
> (b) OneSemSize is suitable for *both*
> SMM_CPU_SYNC_SEMAPHORE_GLOBAL and
> SMM_CPU_SYNC_SEMAPHORE_CPU. This is where we rely on the common
> internal
> type SMM_CPU_SYNC_SEMAPHORE.
> 
> > +
> > +  Status = SafeUintnMult (NumberOfCpus, OneCpuSemSize, &CpuSemSize);
> > +  if (EFI_ERROR (Status)) {
> > +    goto ON_ERROR;
> > +  }
> > +
> > +  Status = SafeUintnAdd (GlobalSemSize, CpuSemSize, &TotalSemSize);
> > +  if (EFI_ERROR (Status)) {
> > +    goto ON_ERROR;
> > +  }
> 

Agree!


> (14) This is probably better written as
> 
>   UINTN  NumSem;
> 
>   Status = SafeUintnAdd (1, NumberOfCpus, &NumSem);
>   if (RETURN_ERROR (Status)) {
>     goto ON_ERROR;
>   }
> 
>   Status = SafeUintnMult (NumSem, OneSemSize, &TotalSemSize);
>   if (RETURN_ERROR (Status)) {
>     goto ON_ERROR;
>   }
> 
> and remove the variable CpuSemSize as well.
> 
> > +
> > +  DEBUG ((DEBUG_INFO, "[%a] - One Semaphore Size    = 0x%x\n",
> __func__, OneSemSize));
> > +  DEBUG ((DEBUG_INFO, "[%a] - Total Semaphores Size = 0x%x\n",
> __func__, TotalSemSize));
> 
> (15) These are useful, but %x is not suitable for formatting UINTN.
> Instead, use %Lx, and cast the values to UINT64:
> 
>   DEBUG ((DEBUG_INFO, "[%a] - One Semaphore Size    = 0x%Lx\n", __func__,
> (UINT64)OneSemSize));
>   DEBUG ((DEBUG_INFO, "[%a] - Total Semaphores Size = 0x%Lx\n", __func__,
> (UINT64)TotalSemSize));
> 

Agree!

> > +
> > +  //
> > +  // Allocate for Semaphores in the *SmmCpuSyncCtx
> > +  //
> > +  (*SmmCpuSyncCtx)->SemBufferSize = TotalSemSize;
> > +  (*SmmCpuSyncCtx)->SemBuffer     = AllocatePages (EFI_SIZE_TO_PAGES
> ((*SmmCpuSyncCtx)->SemBufferSize));
> 
> (16) I suggest reworking this as follows (will be beneficial later), in
> accordance with (4):
> 
>   (*SmmCpuSyncCtx)->SemBufferPages = EFI_SIZE_TO_PAGES (TotalSemSize);
>   (*SmmCpuSyncCtx)->SemBuffer      = AllocatePages (
>                                        (*SmmCpuSyncCtx)->SemBufferPages
>                                        );
> 
> > +  ASSERT ((*SmmCpuSyncCtx)->SemBuffer != NULL);
> 
> (17) Bogus assert; same reason as in point (8).


Agree!

> 
> > +  if ((*SmmCpuSyncCtx)->SemBuffer == NULL) {
> > +    Status = RETURN_OUT_OF_RESOURCES;
> > +    goto ON_ERROR;
> > +  }
> > +
> > +  ZeroMem ((*SmmCpuSyncCtx)->SemBuffer, TotalSemSize);
> 
> (18) First approach: simplify the code by calling AllocateZeroPages()
> instead. (It may zero more bytes than strictly necessary, but it's not a
> big deal, and the code simplification is worth it.)
> 
> (19) Second approach: even better, just drop this call. There is no need
> for zeroing the semaphore buffer at all, as we are going to manually set
> both the Counter and the individual Run elements, below!
> 
> (20) With ZeroMem() gone, evaluate if we still depend on the
> BaseMemoryLib class (#include and [LibraryClasses]).

Agree, I will drop the ZeroMem()!


> 
> > +
> > +  //
> > +  // Assign Global Semaphore pointer
> > +  //
> > +  SemAddr                               = (UINTN)(*SmmCpuSyncCtx)->SemBuffer;
> > +  (*SmmCpuSyncCtx)->GlobalSem->Counter  = (UINT32 *)SemAddr;
> 
> (21) Side comment (the compiler will catch it for you anyway): it's not
> "GlobalSem->Counter" but "GlobalSem.Counter", after point (4).
> 

I will rename Counter --> CpuCount, and move it into SmmCpuSyncCtx directly, based on the comment from Ray.


> (22) The explicit (UINT32 *) cast is ugly. We should cast to
> (SMM_CPU_SYNC_SEMAPHORE *).
> 

Agree!

> > +  *(*SmmCpuSyncCtx)->GlobalSem->Counter = 0;
> > +  DEBUG ((DEBUG_INFO, "[%a] - (*SmmCpuSyncCtx)->GlobalSem->Counter
> Address: 0x%08x\n", __func__, (UINTN)(*SmmCpuSyncCtx)->GlobalSem-
> >Counter));
> 
> (23) problems with this DEBUG line:
> 
> (23.1) needlessly verbose,
> 
> (23.2) prints UINTN with %x,
> 
> (23.3) pads to 8 nibbles even though UINTN can be 64-bit
> 
> How about:
> 
>   DEBUG ((
>     DEBUG_INFO,
>     "[%a] - GlobalSem.Counter @ 0x%016Lx\n",
>     __func__,
>     (UINT64)SemAddr
>     ));
> 
> > +
> > +  SemAddr += GlobalSemSize;
> 
> (24) Should be "+= OneSemSize".

Agree; even with Counter moved into the context, "+= OneSemSize" is correct.

> 
> > +
> > +  //
> > +  // Assign CPU Semaphore pointer
> > +  //
> > +  for (CpuIndex = 0; CpuIndex < NumberOfCpus; CpuIndex++) {
> > +    (*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run  = (UINT32 *)(SemAddr +
> (CpuSemSize / NumberOfCpus) * CpuIndex);
> > +    *(*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run = 0;
> > +    DEBUG ((DEBUG_INFO, "[%a] - (*SmmCpuSyncCtx)->CpuSem[%d].Run
> Address: 0x%08x\n", __func__, CpuIndex, (UINTN)(*SmmCpuSyncCtx)-
> >CpuSem[CpuIndex].Run));
> > +  }
> 
> (25) Extremely over-complicated.
> 

Yes, agree, now I have realized that. Thank you again!


> (25.1) The quotient (CpuSemSize / NumberOfCpus) is just OneCpuSemSize,
> from the previous SafeUintnMult() call.
> 

Yes.

> (25.2) Using SemAddr as a base address, and then performing a separate
> multiplication, is wasteful -- not just computationally, but
> semantically. We can simply advance SemAddr here!
> 

Agree.

> (25.3) the expression
> 
>   (*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run
> 
> is tiresome to read, and so we shouldn't repeat it multiple times!

Agree

> 
> (25.4) the usual problems with the DEBUG line:
> 
> (25.4.1) needlessly verbose
> 
> (25.4.2) uses %d for formatting CpuIndex (which is UINTN)
> 
> (25.4.3) uses %x for formatting (UINTN)Run
> 
> (25.4.4) pads to 8 nibbles even though the Run address can be 64-bit
> 
> So:
> 
>   SMM_CPU_SYNC_SEMAPHORE_CPU *CpuSem;
> 
>   CpuSem = (*SmmCpuSyncCtx)->CpuSem;
>   for (CpuIndex = 0; CpuIndex < NumberOfCpus; CpuIndex++) {
>     CpuSem->Run  = (SMM_CPU_SYNC_SEMAPHORE *)SemAddr;
>     *CpuSem->Run = 0;
> 
>     DEBUG ((
>       DEBUG_INFO,
>       "[%a] - CpuSem[%Lu].Run @ 0x%016Lx\n",
>       __func__,
>       (UINT64)CpuIndex,
>       (UINT64)SemAddr
>       ));
> 
>     CpuSem++;
>     SemAddr += OneSemSize;
>   }
> 

Really good comments, thanks. Totally agree!


> > +
> > +  return RETURN_SUCCESS;
> > +
> > +ON_ERROR:
> > +  FreePages (*SmmCpuSyncCtx, EFI_SIZE_TO_PAGES (CtxSize));
> 
> (26) And then this can be
> 
>   FreePool (*SmmCpuSyncCtx);
> 

Yes, agree.

> per comment (9).
> 
> > +  return Status;
> > +}
> > +
> > +/**
> > +  Deinit an allocated SMM CPU Sync context.
> > +
> > +  SmmCpuSyncContextDeinit() function is to deinitialize SMM CPU Sync
> context, the resources allocated in
> > +  SmmCpuSyncContextInit() will be freed.
> > +
> > +  Note: This function only can be called after SmmCpuSyncContextInit()
> return success.
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object to be deinitialized.
> > +
> > +  @retval RETURN_SUCCESS            The SMM CPU Sync context was
> successful deinitialized.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> > +  @retval RETURN_UNSUPPORTED        Unsupported operation.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncContextDeinit (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
> > +  )
> > +{
> > +  UINTN  SmmCpuSyncCtxSize;
> > +
> > +  ASSERT (SmmCpuSyncCtx != NULL);
> 
> (27) bogus ASSERT
> 

According to the interface definition we aligned on, I will refine the implementation.

> > +  if (SmmCpuSyncCtx == NULL) {
> > +    return RETURN_INVALID_PARAMETER;
> > +  }
> > +
> > +  SmmCpuSyncCtxSize = sizeof (SMM_CPU_SYNC_CONTEXT) + sizeof
> (SMM_CPU_SYNC_SEMAPHORE_GLOBAL) + sizeof
> (SMM_CPU_SYNC_SEMAPHORE_CPU) * (SmmCpuSyncCtx->NumberOfCpus);
> > +
> > +  FreePages (SmmCpuSyncCtx->SemBuffer, EFI_SIZE_TO_PAGES
> (SmmCpuSyncCtx->SemBufferSize));
> 
> (28) Per comment (16), this can be simplified as:
> 
>   FreePages (SmmCpuSyncCtx->SemBuffer, SmmCpuSyncCtx-
> >SemBufferPages);
> 
> > +
> > +  FreePages (SmmCpuSyncCtx, EFI_SIZE_TO_PAGES (SmmCpuSyncCtxSize));
> 
> (29) Per comments (9) and (26), this should be just
> 
>   FreePool (SmmCpuSyncCtx);
> 
> (and the variable "SmmCpuSyncCtxSize" should be removed).
> 

Yes, agree!


> > +
> > +  return RETURN_SUCCESS;
> > +}
> > +
> > +/**
> > +  Reset SMM CPU Sync context.
> > +
> > +  SmmCpuSyncContextReset() function is to reset SMM CPU Sync context
> to the initialized state.
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object to be reset.
> > +
> > +  @retval RETURN_SUCCESS            The SMM CPU Sync context was
> successful reset.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncContextReset (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
> > +  )
> > +{
> > +  ASSERT (SmmCpuSyncCtx != NULL);
> 
> (30) bogus assert
> 

Will refine the implementation based on the latest interface.

> > +  if (SmmCpuSyncCtx == NULL) {
> > +    return RETURN_INVALID_PARAMETER;
> > +  }
> > +
> > +  *SmmCpuSyncCtx->GlobalSem->Counter = 0;
> > +
> > +  return RETURN_SUCCESS;
> > +}
> 
> (31) Is there anything to do about the Run semaphores here?
> 

No; the Run semaphores naturally reset once all CPUs are ready to exit, provided the caller follows the API's calling requirements.


> > +
> > +/**
> > +  Get current number of arrived CPU in SMI.
> > +
> > +  For traditional CPU synchronization method, BSP might need to know the
> current number of arrived CPU in
> > +  SMI to make sure all APs in SMI. This API can be for that purpose.
> > +
> > +  @param[in]      SmmCpuSyncCtx     Pointer to the SMM CPU Sync context
> object.
> > +  @param[in,out]  CpuCount          Current count of arrived CPU in SMI.
> > +
> > +  @retval RETURN_SUCCESS            Get current number of arrived CPU in SMI
> successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is
> NULL.
> > +  @retval RETURN_UNSUPPORTED        Unsupported operation.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncGetArrivedCpuCount (
> > +  IN     SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN OUT UINTN                 *CpuCount
> > +  )
> > +{
> > +  ASSERT (SmmCpuSyncCtx != NULL && CpuCount != NULL);
> 
> (32) bogus assert
> 
> > +  if ((SmmCpuSyncCtx == NULL) || (CpuCount == NULL)) {
> > +    return RETURN_INVALID_PARAMETER;
> > +  }
> > +
> > +  if (*SmmCpuSyncCtx->GlobalSem->Counter < 0) {
> 
> (33) The type of Counter is
> 
>   volatile UINT32
> 
> therefore this condition will never evaluate to true.
> 
> If you want to check for the door being locked, then I suggest
> 
>   *SmmCpuSyncCtx->GlobalSem.Counter == (UINT32)-1
> 
> or
> 
>   *SmmCpuSyncCtx->GlobalSem.Counter == MAX_UINT32
> 

Yes, I do need to check whether it is locked or not; I will fix this!


For this API, I will rewrite the implementation according to the interface we aligned on.



> > +    return RETURN_UNSUPPORTED;
> > +  }
> > +
> > +  *CpuCount = *SmmCpuSyncCtx->GlobalSem->Counter;
> > +
> > +  return RETURN_SUCCESS;
> > +}
> > +
> > +/**
> > +  Performs an atomic operation to check in CPU.
> > +
> > +  When SMI happens, all processors including BSP enter to SMM mode by
> calling SmmCpuSyncCheckInCpu().
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      CpuIndex          Check in CPU index.
> > +
> > +  @retval RETURN_SUCCESS            Check in CPU (CpuIndex) successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> > +  @retval RETURN_ABORTED            Check in CPU failed due to
> SmmCpuSyncLockDoor() has been called by one elected CPU.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncCheckInCpu (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex
> > +  )
> > +{
> > +  ASSERT (SmmCpuSyncCtx != NULL);
> 
> (34) bogus ASSERT
> 
> > +  if (SmmCpuSyncCtx == NULL) {
> > +    return RETURN_INVALID_PARAMETER;
> > +  }

Agree, I will refine!

> > +
> > +  //
> > +  // Check to return if Counter has already been locked.
> > +  //
> > +  if ((INT32)InternalReleaseSemaphore (SmmCpuSyncCtx->GlobalSem-
> >Counter) <= 0) {
> 
> (35) The cast and the comparison are bogus.
> 
> InternalReleaseSemaphore():
> 
> - returns 0, and leaves the semaphore unchanged, if the current value of
> the semaphore is MAX_UINT32,
> 
> - increments the semaphore, and returns the incremented -- hence:
> strictly positive -- UINT32 value, otherwise.
> 
> So the condition for
> 
>   semaphore unchanged because door has been locked
> 
> is:
> 
>   InternalReleaseSemaphore (SmmCpuSyncCtx->GlobalSem->Counter) == 0
> 
> No INT32 cast, and no "<".
> 

Agree! I will do the fix. 


> 
> > +    return RETURN_ABORTED;
> > +  }
> > +
> > +  return RETURN_SUCCESS;
> > +}
> > +
> > +/**
> > +  Performs an atomic operation to check out CPU.
> > +
> > +  CheckOutCpu() can be called in error handling flow for the CPU who calls
> CheckInCpu() earlier.
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      CpuIndex          Check out CPU index.
> > +
> > +  @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
> > +  @retval RETURN_NOT_READY          The CPU is not checked-in.
> > +  @retval RETURN_UNSUPPORTED        Unsupported operation.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncCheckOutCpu (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex
> > +  )
> > +{
> > +  ASSERT (SmmCpuSyncCtx != NULL);
> 
> (36) bogus assert
> 
> 
> > +  if (SmmCpuSyncCtx == NULL) {
> > +    return RETURN_INVALID_PARAMETER;
> > +  }
> > +

The same, I will refine it.





> > +  if (*SmmCpuSyncCtx->GlobalSem->Counter == 0) {
> > +    return RETURN_NOT_READY;
> > +  }
> 
> (37) This preliminary check is not particularly useful.
> 
> Assume that Counter is currently 1, but -- due to a programming error
> somewhere -- there are two APs executing SmmCpuSyncCheckOutCpu() in
> parallel. Both may pass this check (due to Counter being 1), and then
> one of the APs will consume the semaphore and return, and the other AP
> will hang forever.
> 
> So this check is "best effort". It's fine -- some programming errors
> just inevitably lead to undefined behavior; not all bad usage can be
> explicitly caught.
> 
> Maybe add a comment?
> 


Yes, I will remove the RETURN_NOT_READY, and add the comment for the API:

"The caller shall make sure the CPU specified by CpuIndex has already checked-in."



> > +  if ((INT32)InternalWaitForSemaphore (SmmCpuSyncCtx->GlobalSem-
> >Counter) < 0) {
> > +    return RETURN_UNSUPPORTED;
> > +  }
> 
> (38) This doesn't look right. InternalWaitForSemaphore() blocks for as
> long as the semaphore is zero. When the semaphore is nonzero,
> InternalWaitForSemaphore() decrements it, and returns the decremented
> value. Thus, InternalWaitForSemaphore() cannot return negative values
> (it returns UINT32), and it also cannot return MAX_UINT32.
> 
> So I simply don't understand the purpose of this code.

The Counter can be locked by calling SmmCpuSyncLockDoor(); after that, Counter will be (UINT32)-1. In that case, InternalWaitForSemaphore() will still decrement it, to (UINT32)-2, and that is how the lock case is caught. But according to the interface we aligned on below:
/**
  Performs an atomic operation to check out CPU.

  This function can be called in error handling flow for the CPU who calls CheckInCpu() earlier.
  The caller shall make sure the CPU specified by CpuIndex has already checked-in.

  If Context is NULL, then ASSERT().
  If CpuIndex exceeds the range of all CPUs in the system, then ASSERT().

  @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
  @param[in]      CpuIndex          Check out CPU index.

  @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
  @retval RETURN_ABORTED            Check out CPU failed due to SmmCpuSyncLockDoor() has been called by one elected CPU.

**/
RETURN_STATUS
EFIAPI
SmmCpuSyncCheckOutCpu (
  IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
  IN     UINTN                 CpuIndex
  );

The code will be changed to:

  if ((INT32)InternalWaitForSemaphore (Context->CpuCount) < 0) {
    return RETURN_ABORTED;
  }


> 
> As written, this condition could only fire if InternalWaitForSemaphore()
> successfully decremented the semaphore, and the *new* value of the
> semaphore were >=0x8000_0000. Because in that case, the INT32 cast (=
> implementation-defined behavior) would produce a negative value. But for
> that, we'd first have to increase Counter to 0x8000_0001 at least, and
> that could never happen in practice, IMO.
> 
> So this is basically dead code. What is the intent?


It can happen when the Counter has been locked.


> 
> > +
> > +  return RETURN_SUCCESS;
> > +}
> > +
> > +/**
> > +  Performs an atomic operation lock door for CPU checkin or checkout.
> > +
> > +  After this function, CPU can not check in via SmmCpuSyncCheckInCpu().
> > +
> > +  The CPU specified by CpuIndex is elected to lock door.
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      CpuIndex          Indicate which CPU to lock door.
> > +  @param[in,out]  CpuCount          Number of arrived CPU in SMI after lock
> door.
> > +
> > +  @retval RETURN_SUCCESS            Lock door for CPU successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is
> NULL.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncLockDoor (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex,
> > +  IN OUT UINTN                 *CpuCount
> > +  )
> > +{
> > +  ASSERT (SmmCpuSyncCtx != NULL && CpuCount != NULL);
> 
> (39) bogus assert
> 
> > +  if ((SmmCpuSyncCtx == NULL) || (CpuCount == NULL)) {
> > +    return RETURN_INVALID_PARAMETER;
> > +  }
> > +
> > +  *CpuCount = InternalLockdownSemaphore (SmmCpuSyncCtx-
> >GlobalSem->Counter);
> > +
> > +  return RETURN_SUCCESS;
> > +}
> > +
> > +/**
> > +  Used by the BSP to wait for APs.
> > +
> > +  The number of APs need to be waited is specified by NumberOfAPs. The
> BSP is specified by BspIndex.
> > +
> > +  Note: This function is blocking mode, and it will return only after the
> number of APs released by
> > +  calling SmmCpuSyncReleaseBsp():
> > +  BSP: WaitForAPs    <--  AP: ReleaseBsp
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      NumberOfAPs       Number of APs need to be waited by
> BSP.
> > +  @param[in]      BspIndex          The BSP Index to wait for APs.
> > +
> > +  @retval RETURN_SUCCESS            BSP to wait for APs successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
> NumberOfAPs > total number of processors in system.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncWaitForAPs (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 NumberOfAPs,
> > +  IN     UINTN                 BspIndex
> > +  )
> > +{
> > +  ASSERT (SmmCpuSyncCtx != NULL && NumberOfAPs <= SmmCpuSyncCtx-
> >NumberOfCpus);
> 
> (40) bogus assert
> 
> > +  if ((SmmCpuSyncCtx == NULL) || (NumberOfAPs > SmmCpuSyncCtx-
> >NumberOfCpus)) {
> > +    return RETURN_INVALID_PARAMETER;
> > +  }
> 
> (41) Question for both the library instance and the library class (i.e.,
> API documentation):
> 
> Is it ever valid to call this function with (NumberOfAPs ==
> SmmCpuSyncCtx->NumberOfCpus)?
> 
> I would think not. NumberOfCpus is supposed to include the BSP and the
> APs. Therefore the highest permitted NumberOfAPs value, on input, is
> (SmmCpuSyncCtx->NumberOfCpus - 1).
> 
> So I think we should modify the lib class and the lib instance both.
> RETURN_INVALID_PARAMETER applies to "NumberOfAPs *>=* total number
> of
> processors in system".
> 

Good catch!!! I will fix it.

> > +
> > +  while (NumberOfAPs-- > 0) {
> > +    InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
> > +  }
> 
> (42) In my opinion, this is an ugly pattern.
> 
> First, after the loop, NumberOfAPs will be MAX_UINTN.
> 
> Second, modifying input parameters is also an anti-pattern. Assume you
> debug a problem, and fetch a backtrace where the two innermost frames
> are SmmCpuSyncWaitForAPs() and InternalWaitForSemaphore(). If you look
> at the stack frame that belongs to SmmCpuSyncWaitForAPs(), you may be
> led to think that the function was *invoked* with a *low* NumberOfAPs
> value. Whereas in fact NumberOfAPs may have been a larger value at the
> time of call, only the function decremented NumberOfAPs by the time the
> stack trace was fetched.
> 
> So, please add a new helper variable, and write a proper "for" loop.
> 
>   UINTN  Arrived;
> 
>   for (Arrived = 0; Arrived < NumberOfAPs; Arrived++) {
>     InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
>   }
> 

Agree. Thank you!


> (43) I mentioned this while reviewing the lib class header (patch#2), so
> let me repeat it here:
> 
> BspIndex is used for indexing the CpuSem array, but we perform no range
> checking, against "SmmCpuSyncCtx->NumberOfCpus".
> 
> That error should be documented (in the lib class header), and
> caught/reported here.
> 

Yes, I will refine the patch based on the lib class header we aligned on.


> > +
> > +  return RETURN_SUCCESS;
> > +}
> > +
> > +/**
> > +  Used by the BSP to release one AP.
> > +
> > +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      CpuIndex          Indicate which AP need to be released.
> > +  @param[in]      BspIndex          The BSP Index to release AP.
> > +
> > +  @retval RETURN_SUCCESS            BSP to release one AP successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
> CpuIndex is same as BspIndex.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncReleaseOneAp   (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex,
> > +  IN     UINTN                 BspIndex
> > +  )
> > +{
> > +  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);
> 
> (44) bogus assert
> 
> > +  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
> > +    return RETURN_INVALID_PARAMETER;
> > +  }
> 
> (45) range checks for BspIndex and CpuIndex missing (in both lib class
> and lib instance)
> 

Same here; I will align the code with the lib class we agreed on.


> > +
> > +  InternalReleaseSemaphore (SmmCpuSyncCtx->CpuSem[CpuIndex].Run);
> > +
> > +  return RETURN_SUCCESS;
> > +}
> > +
> > +/**
> > +  Used by the AP to wait BSP.
> > +
> > +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> > +
> > +  Note: This function is blocking mode, and it will return only after the AP
> released by
> > +  calling SmmCpuSyncReleaseOneAp():
> > +  BSP: ReleaseOneAp  -->  AP: WaitForBsp
> > +
> > +  @param[in,out]  SmmCpuSyncCtx    Pointer to the SMM CPU Sync context
> object.
> > +  @param[in]      CpuIndex         Indicate which AP wait BSP.
> > +  @param[in]      BspIndex         The BSP Index to be waited.
> > +
> > +  @retval RETURN_SUCCESS            AP to wait BSP successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
> CpuIndex is same as BspIndex.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncWaitForBsp (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex,
> > +  IN     UINTN                 BspIndex
> > +  )
> > +{
> > +  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);
> 
> (46) bogus assert
> 
> > +  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
> > +    return RETURN_INVALID_PARAMETER;
> > +  }
> 
> (47) range checks missing (lib class and instance)
> 


Same here; I will align the code with the lib class we agreed on.

> > +
> > +  InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[CpuIndex].Run);
> > +
> > +  return RETURN_SUCCESS;
> > +}
> > +
> > +/**
> > +  Used by the AP to release BSP.
> > +
> > +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
> > +
> > +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
> context object.
> > +  @param[in]      CpuIndex          Indicate which AP release BSP.
> > +  @param[in]      BspIndex          The BSP Index to be released.
> > +
> > +  @retval RETURN_SUCCESS            AP to release BSP successfully.
> > +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or
> CpuIndex is same as BspIndex.
> > +
> > +**/
> > +RETURN_STATUS
> > +EFIAPI
> > +SmmCpuSyncReleaseBsp (
> > +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
> > +  IN     UINTN                 CpuIndex,
> > +  IN     UINTN                 BspIndex
> > +  )
> > +{
> > +  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);
> 
> (48) bogus assert
> 
> > +  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
> > +    return RETURN_INVALID_PARAMETER;
> > +  }
> 
> (49) range checks missing (lib class and instance)
> 

Same here; I will align the code with the lib class we agreed on.


> > +
> > +  InternalReleaseSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
> > +
> > +  return RETURN_SUCCESS;
> > +}
> > diff --git a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
> b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
> > new file mode 100644
> > index 0000000000..6bb1895577
> > --- /dev/null
> > +++ b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
> > @@ -0,0 +1,39 @@
> > +## @file
> > +# SMM CPU Synchronization lib.
> > +#
> > +# This is SMM CPU Synchronization lib used for SMM CPU sync operations.
> > +#
> > +# Copyright (c) 2023, Intel Corporation. All rights reserved.<BR>
> > +# SPDX-License-Identifier: BSD-2-Clause-Patent
> > +#
> > +##
> > +
> > +[Defines]
> > +  INF_VERSION                    = 0x00010005
> > +  BASE_NAME                      = SmmCpuSyncLib
> > +  FILE_GUID                      = 1ca1bc1a-16a4-46ef-956a-ca500fd3381f
> > +  MODULE_TYPE                    = DXE_SMM_DRIVER
> > +  LIBRARY_CLASS                  = SmmCpuSyncLib|DXE_SMM_DRIVER
> > +
> > +[Sources]
> > +  SmmCpuSyncLib.c
> > +
> > +[Packages]
> > +  MdePkg/MdePkg.dec
> > +  MdeModulePkg/MdeModulePkg.dec
> > +  UefiCpuPkg/UefiCpuPkg.dec
> > +
> > +[LibraryClasses]
> > +  UefiLib
> > +  BaseLib
> > +  DebugLib
> > +  PrintLib
> > +  SafeIntLib
> > +  SynchronizationLib
> > +  BaseMemoryLib
> > +  SmmServicesTableLib
> > +  MemoryAllocationLib
> 
> (50) Please sort this list alphabetically (cf. comment (2)).
> 

Ok, will refine it.

> > +
> > +[Pcd]
> > +
> > +[Protocols]
> 
> (51) Useless empty INF file sections; please remove them.
> 
> > diff --git a/UefiCpuPkg/UefiCpuPkg.dsc b/UefiCpuPkg/UefiCpuPkg.dsc
> > index 074fd77461..f264031c77 100644
> > --- a/UefiCpuPkg/UefiCpuPkg.dsc
> > +++ b/UefiCpuPkg/UefiCpuPkg.dsc
> > @@ -23,10 +23,11 @@
> >  #
> >
> >  !include MdePkg/MdeLibs.dsc.inc
> >
> >  [LibraryClasses]
> > +  SafeIntLib|MdePkg/Library/BaseSafeIntLib/BaseSafeIntLib.inf
> >    BaseLib|MdePkg/Library/BaseLib/BaseLib.inf
> >    BaseMemoryLib|MdePkg/Library/BaseMemoryLib/BaseMemoryLib.inf
> >    CpuLib|MdePkg/Library/BaseCpuLib/BaseCpuLib.inf
> >    DebugLib|MdePkg/Library/BaseDebugLibNull/BaseDebugLibNull.inf
> >
> SerialPortLib|MdePkg/Library/BaseSerialPortLibNull/BaseSerialPortLibNull.inf
> 
> (52) Just from the context visible here, this list seems alphabetically
> sorted pre-patch; if that's the case, please stick with it (don't break
> the sort order).


How about adding SafeIntLib to MdePkg/MdeLibs.dsc.inc? If you agree, I will do that in a separate patch.


Thanks,
Jiaxin



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112520): https://edk2.groups.io/g/devel/message/112520
Mute This Topic: https://groups.io/mt/103010165/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 4/6] OvmfPkg: Specifies SmmCpuSyncLib instance
  2023-12-13 16:52   ` Laszlo Ersek
@ 2023-12-14 13:43     ` Wu, Jiaxin
  2023-12-15  0:21     ` Ni, Ray
  1 sibling, 0 replies; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-14 13:43 UTC (permalink / raw)
  To: Laszlo Ersek, devel@edk2.groups.io
  Cc: Ard Biesheuvel, Yao, Jiewen, Justen, Jordan L, Dong, Eric,
	Ni, Ray, Zeng, Star, Kumar, Rahul R, Gerd Hoffmann

> >
> >  !if $(SOURCE_DEBUG_ENABLE) == TRUE
> 
> All four DSC files already include "PiSmmCpuDxeSmm.inf" like this:
> 
>   UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf {
>     <LibraryClasses>
>       ...
>   }
> 
> Given that this new library class is again exclusively used by
> PiSmmCpuDxeSmm, can you please resolve this lib class too in module
> scope only?
> 

Yes, I will put it under the PiSmmCpuDxeSmm

> Thanks!
> Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112528): https://edk2.groups.io/g/devel/message/112528
Mute This Topic: https://groups.io/mt/103010166/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance
  2023-12-14 11:11     ` Wu, Jiaxin
@ 2023-12-14 13:48       ` Laszlo Ersek
  2023-12-14 15:34         ` Wu, Jiaxin
                           ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Laszlo Ersek @ 2023-12-14 13:48 UTC (permalink / raw)
  To: Wu, Jiaxin, devel@edk2.groups.io
  Cc: Dong, Eric, Ni, Ray, Zeng, Star, Gerd Hoffmann, Kumar, Rahul R

On 12/14/23 12:11, Wu, Jiaxin wrote:
> Hi Laszlo,
> 
> I really appreciate your comments! I checked them one by one; my feedback is below. Thank you and Ray, again and again, for the patch refinement!
> 
> 
>>
>> (1) If / when you update the documentation in patch#2, please update
>> this one as well.
>>
> 
> Yes, I will do the alignment.
> 
>> (2) Please sort the #include list alphabetically.
>>
>> (The idea is that the [LibraryClasses] section in the INF file should be
>> sorted as well, and then we can easily verify whether those two lists
>> match each other -- modulo <Library/SmmCpuSyncLib.h>, of course.)
>>
> 
> Agree.
> 
>> (3) We can improve this, as follows:
>>
>>   typedef volatile UINT32 SMM_CPU_SYNC_SEMAPHORE;
> 
> Good comment. Agree.
> 
> 
>>
>>   typedef struct {
>>     SMM_CPU_SYNC_SEMAPHORE *Counter;
>>   } SMM_CPU_SYNC_SEMAPHORE_GLOBAL;
>>
>>   typedef struct {
>>     SMM_CPU_SYNC_SEMAPHORE *Run;
>>   } SMM_CPU_SYNC_SEMAPHORE_CPU;
>>
>> Because, while it *indeed* makes some sense to introduce these separate
>> wrapper structures, we should still ensure that the internals are
>> identical. This will come handy later.
>>
> 
> After checking with Ray, I'm convinced we don't need to wrap "Counter" into SMM_CPU_SYNC_SEMAPHORE_GLOBAL. He thinks it's overdesigned, since currently we only have one global semaphore. Defining "Counter" directly in SMM_CPU_SYNC_CONTEXT also gives "Counter" its own CPU cache line, and we don't need to consider future extension. So, I agree to move Counter into SMM_CPU_SYNC_CONTEXT. What's your opinion?
> 
> But for *Run*, he is OK with wrapping it into the structure, since there is one per CPU: it simplifies the coding logic, makes it easy to reach a specific CPU's semaphore by index, and meets the semaphore size alignment requirement.

My actual opinion is that *both* wrapper structures are unnecessary. You
can just embed the Counter pointer into the outer sync structure, and
you can have a Run *array of pointers* in the outer sync structure as well.

However, many people find double indirection confusing, and therefore
wrap one level into thin structures. That's fine with me. It has zero
runtime cost, and if it makes the code more manageable to the submitter,
I don't mind. So, up to you.
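
A rough sketch of the two layout options, with plain C stand-ins for the edk2 types (the names here are illustrative, not the real SmmCpuSyncLib definitions):

```c
#include <stdint.h>

typedef volatile uint32_t SYNC_SEMAPHORE;   /* stand-in for SMM_CPU_SYNC_SEMAPHORE */

/* Option 1 (the patch): a thin one-pointer wrapper per CPU. */
typedef struct {
  SYNC_SEMAPHORE  *Run;
} SEMAPHORE_CPU;

typedef struct {
  SYNC_SEMAPHORE  *Counter;    /* global semaphore pointer, embedded directly */
  SEMAPHORE_CPU   CpuSem[4];   /* accessed as CpuSem[CpuIndex].Run            */
} CTX_WRAPPED;

/* Option 2 (the review): a plain array of pointers, double indirection. */
typedef struct {
  SYNC_SEMAPHORE  *Counter;
  SYNC_SEMAPHORE  *Run[4];     /* accessed as Run[CpuIndex]                   */
} CTX_BARE;

/* On common ABIs the thin wrapper adds no storage: both layouts match. */
static int
LayoutsMatch (void)
{
  return (sizeof (SEMAPHORE_CPU) == sizeof (SYNC_SEMAPHORE *)) &&
         (sizeof (CTX_WRAPPED) == sizeof (CTX_BARE));
}
```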


> 
>>
>> (4) This is too complicated, in my opinion.
>>
>> (4.1) First of all, please add a *conspicuous* comment to the
>> SMM_CPU_SYNC_CONTEXT here, explaining that the whole idea is to place
>> the Counter and Run semaphores on different CPU cache lines, for good
>> performance. That's the *core* principle of this whole structure --
>> that's why we have an array of pointers to semaphores, rather than an
>> array of semaphores directly.
>>
>> You didn't document that principle, and I had to spend a lot of time
>> deducing that fact from the SmmCpuSyncContextInit() function.
> 
> Sorry about that; I will document it in the SMM_CPU_SYNC_CONTEXT definition.
> 
> 
>>
>> (4.2) The structure should go like this:
>>
>> struct SMM_CPU_SYNC_CONTEXT  {
>>   UINTN                            NumberOfCpus;
>>   VOID                             *SemBuffer;
>>   UINTN                            SemBufferPages;
>>   SMM_CPU_SYNC_SEMAPHORE_GLOBAL    GlobalSem;
>>   SMM_CPU_SYNC_SEMAPHORE_CPU       CpuSem[];
>> };
>>
>> Details:
>>
>> - move NumberOfCpus to the top
>>
>> - change the type of SemBuffer from (UINTN*) to (VOID*)
>>
>> - replace SemBufferSize with SemBufferPages
> 
> Oh? Laszlo, could you explain why it's better to define it as "SemBufferPages" instead of "SemBufferSize" in bytes?

Semantically, there is no difference, or even SemBufferSize is more
flexible. My recommendation is basically a "forward declaration" here:
using "SemBufferPages" will simplify the code later on. Because, you
never need to *remember* the precise byte count. What you need to
*remember* is the page count only. So that way the declaration matches
the usage better, and you can save some code logic later on.

> 
> 
>>
>> - move GlobalSem and CpuSem to the end
>>
>> - We need exactly one SMM_CPU_SYNC_SEMAPHORE_GLOBAL, therefore
>> embed
>> GlobalSem directly as a field (it should not be a pointer)
>>
>> - We can much simplify the code by turning CpuSem into a *flexible array
>> member* (this is a C99 feature that is already widely used in edk2).
>> This is why we move CpuSem to the end (and then we keep GlobalSem
>> nearby, for clarity).
> 
> Agree! The same comment from Ray! Thank you and Ray, both!!!
> 
>>
> 
> Really great comment here:
> 
> Based on my explanation above, I propose the definition below:
> 
> ///
> /// Each semaphore shall be placed on an exclusive cache line for good performance.
> ///
> typedef volatile UINT32 SMM_CPU_SYNC_SEMAPHORE;
> 
> typedef struct {
>   ///
>   /// Used to control whether each CPU continues to run or waits for a signal
>   ///
>   SMM_CPU_SYNC_SEMAPHORE    *Run;
> } SMM_CPU_SYNC_SEMAPHORE_FOR_EACH_CPU;
> 
> struct SMM_CPU_SYNC_CONTEXT  {
>   ///
>   /// Number of all CPUs in the system.
>   ///
>   UINTN                                  NumberOfCpus;
>   ///
>   /// Address of semaphores
>   ///
>   VOID                                   *SemBuffer;
>   ///
>   /// Size in bytes of semaphores
>   ///
>   UINTN                                  SemBufferSize;      ----> I can change to pages based on your feedback:).
>   ///
>   /// Count of CPUs that have entered SMM.
>   ///
>   SMM_CPU_SYNC_SEMAPHORE                 *CpuCount;
>   ///
>   /// An array of structures, one per CPU semaphore, to satisfy the size alignment
>   /// requirement. With it, it's easy to reach a specific CPU's own semaphore with
>   /// its CPU index: CpuSem[CpuIndex].
>   ///
>   SMM_CPU_SYNC_SEMAPHORE_FOR_EACH_CPU    CpuSem[];
> };

Yes. This works too. In earnest, as I state above, I would prefer

  SMM_CPU_SYNC_SEMAPHORE *CpuSem[];

but again, some people dislike an array-of-pointers. It's a matter of taste.

> 
> 
>>
>> (5) Please make these Internal functions STATIC.
>>
>> Better yet, please make *all* functions that are not EFIAPI, STATIC.
> 
> Agree.
> 
> 
>>> +  ASSERT (SmmCpuSyncCtx != NULL);
>>
>> (6) This assert is unnecessary and wrong; we perform correct error
>> checking already.
> 
> Agree, I will remove the check, but keep assert.

Right, per our previous discussion, that works too, as long as the API
documents the requirement for the caller.

> 
>>
>> So, several comments on this section:
>>
>> (7) the separate NULL assignment to (*SmmCpuSyncCtx) is superfluous, we
>> overwrite the object immediately after.
>>
> 
> Agree.
> 
>> (8) the ASSERT() is superfluous and wrong; we already check for -- and
>> report -- allocation failure correctly.
> 
> Agree, will remove the assert here.
> 
> 
>>
>> (9) *page* allocation is useless / wasteful here; the main sync context
>> structure can be allocated from *pool*
>>
> 
> Agree! The same comment from Ray! Again, thank you and Ray, both!!!
> 
> 
> 
>> (10) SafeIntLib APIs return RETURN_STATUS (and Status already has type
>> RETURN_STATUS), so we should use RETURN_ERROR() rather than
>> EFI_ERROR()
>> -- it's more idiomatic
> 
> Agree.
> 
>>
>> (11) Referring back to (4), SMM_CPU_SYNC_SEMAPHORE_GLOBAL should
>> not be
>> counted (added) separately, because it is embedded in
>> SMM_CPU_SYNC_CONTEXT directly.
>>
>>> +
>>> +  (*SmmCpuSyncCtx)->GlobalSem    =
>> (SMM_CPU_SYNC_SEMAPHORE_GLOBAL *)((UINT8 *)(*SmmCpuSyncCtx) +
>> sizeof (SMM_CPU_SYNC_CONTEXT));
>>> +  (*SmmCpuSyncCtx)->CpuSem       = (SMM_CPU_SYNC_SEMAPHORE_CPU
>> *)((UINT8 *)(*SmmCpuSyncCtx) + sizeof (SMM_CPU_SYNC_CONTEXT) +
>> sizeof (SMM_CPU_SYNC_SEMAPHORE_GLOBAL));
>>
>> (12) And then these two assignments should be dropped.
>>
> 
> Yes, agree, with my proposed definition, I will refine it.
> 
> 
>>> +  (*SmmCpuSyncCtx)->NumberOfCpus = NumberOfCpus;
>>> +
>>> +  //
>>> +  // Count the TotalSemSize
>>> +  //
>>> +  OneSemSize = GetSpinLockProperties ();
>>> +
>>> +  Status = SafeUintnMult (OneSemSize, sizeof
>> (SMM_CPU_SYNC_SEMAPHORE_GLOBAL) / sizeof (VOID *), &GlobalSemSize);
>>> +  if (EFI_ERROR (Status)) {
>>> +    goto ON_ERROR;
>>> +  }
>>> +
>>> +  Status = SafeUintnMult (OneSemSize, sizeof
>> (SMM_CPU_SYNC_SEMAPHORE_CPU) / sizeof (VOID *), &OneCpuSemSize);
>>> +  if (EFI_ERROR (Status)) {
>>> +    goto ON_ERROR;
>>> +  }
>>
>> (13) I find this obscure and misleading. How about this one instead:
>>
>>   UINTN  CacheLineSize;
>>
>>   CacheLineSize = GetSpinLockProperties ();
>>   OneSemSize    = ALIGN_VALUE (sizeof (SMM_CPU_SYNC_SEMAPHORE),
>> CacheLineSize);
>>
>> and then eliminate GlobalSemSize and OneCpuSemSize altogether.
>>
>> The above construct will ensure that
>>
>> (a) OneSemSize is just large enough for placing semaphores on different
>> cache lines, and that
>>
>> (b) OneSemSize is suitable for *both*
>> SMM_CPU_SYNC_SEMAPHORE_GLOBAL and
>> SMM_CPU_SYNC_SEMAPHORE_CPU. This is where we rely on the common
>> internal
>> type SMM_CPU_SYNC_SEMAPHORE.
>>
>>> +
>>> +  Status = SafeUintnMult (NumberOfCpus, OneCpuSemSize, &CpuSemSize);
>>> +  if (EFI_ERROR (Status)) {
>>> +    goto ON_ERROR;
>>> +  }
>>> +
>>> +  Status = SafeUintnAdd (GlobalSemSize, CpuSemSize, &TotalSemSize);
>>> +  if (EFI_ERROR (Status)) {
>>> +    goto ON_ERROR;
>>> +  }
>>
> 
> Agree!
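
The sizing suggested in (13) can be sketched outside edk2 like this (ALIGN_VALUE is copied from edk2's MdePkg Base.h; the 64-byte line size used in testing is an assumption, since the real code gets it from GetSpinLockProperties()):

```c
#include <stdint.h>

typedef volatile uint32_t SYNC_SEMAPHORE;   /* stand-in for SMM_CPU_SYNC_SEMAPHORE */

/* ALIGN_VALUE() as defined in edk2's MdePkg Base.h; Alignment must be a
   power of two. */
#define ALIGN_VALUE(Value, Alignment)  \
  ((Value) + (((Alignment) - (Value)) & ((Alignment) - 1)))

/* One semaphore, rounded up to a whole cache line, so that every
   semaphore gets a line of its own. */
static uintptr_t
OneSemSize (uintptr_t CacheLineSize)
{
  return ALIGN_VALUE (sizeof (SYNC_SEMAPHORE), CacheLineSize);
}

/* Resulting buffer layout: Counter at offset 0, CpuSem[CpuIndex].Run at
   (1 + CpuIndex) * OneSemSize, each semaphore on a distinct line. */
static uintptr_t
RunSemOffset (uintptr_t CpuIndex, uintptr_t CacheLineSize)
{
  return (1 + CpuIndex) * OneSemSize (CacheLineSize);
}
```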
> 
> 
>> (14) This is probably better written as
>>
>>   UINTN  NumSem;
>>
>>   Status = SafeUintnAdd (1, NumberOfCpus, &NumSem);
>>   if (RETURN_ERROR (Status)) {
>>     goto ON_ERROR;
>>   }
>>
>>   Status = SafeUintnMult (NumSem, OneSemSize, &TotalSemSize);
>>   if (RETURN_ERROR (Status)) {
>>     goto ON_ERROR;
>>   }
>>
>> and remove the variable CpuSemSize as well.
>>
>>> +
>>> +  DEBUG ((DEBUG_INFO, "[%a] - One Semaphore Size    = 0x%x\n",
>> __func__, OneSemSize));
>>> +  DEBUG ((DEBUG_INFO, "[%a] - Total Semaphores Size = 0x%x\n",
>> __func__, TotalSemSize));
>>
>> (15) These are useful, but %x is not suitable for formatting UINTN.
>> Instead, use %Lx, and cast the values to UINT64:
>>
>>   DEBUG ((DEBUG_INFO, "[%a] - One Semaphore Size    = 0x%Lx\n", __func__,
>> (UINT64)OneSemSize));
>>   DEBUG ((DEBUG_INFO, "[%a] - Total Semaphores Size = 0x%Lx\n", __func__,
>> (UINT64)TotalSemSize));
>>
> 
> Agree!
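
The overflow-checked size computation in (14) can be sketched like this (SafeAdd/SafeMult below are minimal illustrative analogues of SafeIntLib's SafeUintnAdd()/SafeUintnMult(), not the real implementations):

```c
#include <stdint.h>

/* Minimal analogues of SafeUintnAdd()/SafeUintnMult(): they fail instead
   of wrapping, which is what comment (14) relies on when sizing
   (1 + NumberOfCpus) cache-line-sized semaphores. */
static int
SafeAdd (uintptr_t A, uintptr_t B, uintptr_t *Out)
{
  if (A > UINTPTR_MAX - B) {
    return -1;               /* would overflow */
  }
  *Out = A + B;
  return 0;
}

static int
SafeMult (uintptr_t A, uintptr_t B, uintptr_t *Out)
{
  if ((B != 0) && (A > UINTPTR_MAX / B)) {
    return -1;               /* would overflow */
  }
  *Out = A * B;
  return 0;
}

/* TotalSemSize = (1 + NumberOfCpus) * OneSemSize, both steps checked. */
static int
TotalSemSize (uintptr_t NumberOfCpus, uintptr_t OneSemSize, uintptr_t *Total)
{
  uintptr_t  NumSem;

  if (SafeAdd (1, NumberOfCpus, &NumSem) != 0) {
    return -1;
  }
  return SafeMult (NumSem, OneSemSize, Total);
}
```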
> 
>>> +
>>> +  //
>>> +  // Allocate for Semaphores in the *SmmCpuSyncCtx
>>> +  //
>>> +  (*SmmCpuSyncCtx)->SemBufferSize = TotalSemSize;
>>> +  (*SmmCpuSyncCtx)->SemBuffer     = AllocatePages (EFI_SIZE_TO_PAGES
>> ((*SmmCpuSyncCtx)->SemBufferSize));
>>
>> (16) I suggest reworking this as follows (will be beneficial later), in
>> accordance with (4):
>>
>>   (*SmmCpuSyncCtx)->SemBufferPages = EFI_SIZE_TO_PAGES (TotalSemSize);
>>   (*SmmCpuSyncCtx)->SemBuffer      = AllocatePages (
>>                                        (*SmmCpuSyncCtx)->SemBufferPages
>>                                        );
>>
>>> +  ASSERT ((*SmmCpuSyncCtx)->SemBuffer != NULL);
>>
>> (17) Bogus assert; same reason as in point (8).
> 
> 
> Agree!
> 
>>
>>> +  if ((*SmmCpuSyncCtx)->SemBuffer == NULL) {
>>> +    Status = RETURN_OUT_OF_RESOURCES;
>>> +    goto ON_ERROR;
>>> +  }
>>> +
>>> +  ZeroMem ((*SmmCpuSyncCtx)->SemBuffer, TotalSemSize);
>>
>> (18) First approach: simplify the code by calling AllocateZeroPages()
>> instead. (It may zero more bytes than strictly necessary, but it's not a
>> big deal, and the code simplification is worth it.)
>>
>> (19) Second approach: even better, just drop this call. There is no need
>> for zeroing the semaphore buffer at all, as we are going to manually set
>> both the Counter and the individual Run elements, below!
>>
>> (20) With ZeroMem() gone, evaluate if we still depend on the
>> BaseMemoryLib class (#include and [LibraryClasses]).
> 
> Agree, I will drop the zeromem!
> 
> 
>>
>>> +
>>> +  //
>>> +  // Assign Global Semaphore pointer
>>> +  //
>>> +  SemAddr                               = (UINTN)(*SmmCpuSyncCtx)->SemBuffer;
>>> +  (*SmmCpuSyncCtx)->GlobalSem->Counter  = (UINT32 *)SemAddr;
>>
>> (21) Side comment (the compiler will catch it for you anyway): it's not
>> "GlobalSem->Counter" but "GlobalSem.Counter", after point (4).
>>
> 
> I will rename the Counter --> CpuCount, and move it to SmmCpuSyncCtx directly, based on the comment from Ray.
> 
> 
>> (22) The explicit (UINT32 *) cast is ugly. We should cast to
>> (SMM_CPU_SYNC_SEMAPHORE *).
>>
> 
> Agree!
> 
>>> +  *(*SmmCpuSyncCtx)->GlobalSem->Counter = 0;
>>> +  DEBUG ((DEBUG_INFO, "[%a] - (*SmmCpuSyncCtx)->GlobalSem->Counter
>> Address: 0x%08x\n", __func__, (UINTN)(*SmmCpuSyncCtx)->GlobalSem-
>>> Counter));
>>
>> (23) problems with this DEBUG line:
>>
>> (23.1) needlessly verbose,
>>
>> (23.2) prints UINTN with %x,
>>
>> (23.3) pads to 8 nibbles even though UINTN can be 64-bit
>>
>> How about:
>>
>>   DEBUG ((
>>     DEBUG_INFO,
>>     "[%a] - GlobalSem.Counter @ 0x%016Lx\n",
>>     __func__,
>>     (UINT64)SemAddr
>>     ));
>>
>>> +
>>> +  SemAddr += GlobalSemSize;
>>
>> (24) Should be "+= OneSemSize".
> 
> Agree; even with the Counter moved into the context, "+= OneSemSize" works.
> 
>>
>>> +
>>> +  //
>>> +  // Assign CPU Semaphore pointer
>>> +  //
>>> +  for (CpuIndex = 0; CpuIndex < NumberOfCpus; CpuIndex++) {
>>> +    (*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run  = (UINT32 *)(SemAddr +
>> (CpuSemSize / NumberOfCpus) * CpuIndex);
>>> +    *(*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run = 0;
>>> +    DEBUG ((DEBUG_INFO, "[%a] - (*SmmCpuSyncCtx)->CpuSem[%d].Run
>> Address: 0x%08x\n", __func__, CpuIndex, (UINTN)(*SmmCpuSyncCtx)-
>>> CpuSem[CpuIndex].Run));
>>> +  }
>>
>> (25) Extremely over-complicated.
>>
> 
> Yes, agree, now I have realized that. Thank you again!
> 
> 
>> (25.1) The quotient (CpuSemSize / NumberOfCpus) is just OneCpuSemSize,
>> from the previous SafeUintnMult() call.
>>
> 
> Yes.
> 
>> (25.2) Using SemAddr as a base address, and then performing a separate
>> multiplication, is wasteful -- not just computationally, but
>> semantically. We can simply advance SemAddr here!
>>
> 
> Agree.
> 
>> (25.3) the expression
>>
>>   (*SmmCpuSyncCtx)->CpuSem[CpuIndex].Run
>>
>> is tiresome to read, and so we shouldn't repeat it multiple times!
> 
> Agree
> 
>>
>> (25.4) the usual problems with the DEBUG line:
>>
>> (25.4.1) needlessly verbose
>>
>> (25.4.2) uses %d for formatting CpuIndex (which is UINTN)
>>
>> (25.4.3) uses %x for formatting (UINTN)Run
>>
>> (25.4.4) pads to 8 nibbles even though the Run address can be 64-bit
>>
>> So:
>>
>>   SMM_CPU_SYNC_SEMAPHORE_CPU *CpuSem;
>>
>>   CpuSem = (*SmmCpuSyncCtx)->CpuSem;
>>   for (CpuIndex = 0; CpuIndex < NumberOfCpus; CpuIndex++) {
>>     CpuSem->Run  = (SMM_CPU_SYNC_SEMAPHORE *)SemAddr;
>>     *CpuSem->Run = 0;
>>
>>     DEBUG ((
>>       DEBUG_INFO,
>>       "[%a] - CpuSem[%Lu].Run @ 0x%016Lx\n",
>>       __func__,
>>       (UINT64)CpuIndex,
>>       (UINT64)SemAddr
>>       ));
>>
>>     CpuSem++;
>>     SemAddr += OneSemSize;
>>   }
>>
> 
> Really good comments, thanks. Totally agree!
> 
> 
>>> +
>>> +  return RETURN_SUCCESS;
>>> +
>>> +ON_ERROR:
>>> +  FreePages (*SmmCpuSyncCtx, EFI_SIZE_TO_PAGES (CtxSize));
>>
>> (26) And then this can be
>>
>>   FreePool (*SmmCpuSyncCtx);
>>
> 
> Yes, agree.
> 
>> per comment (9).
>>
>>> +  return Status;
>>> +}
>>> +
>>> +/**
>>> +  Deinit an allocated SMM CPU Sync context.
>>> +
>>> +  SmmCpuSyncContextDeinit() function is to deinitialize SMM CPU Sync
>> context, the resources allocated in
>>> +  SmmCpuSyncContextInit() will be freed.
>>> +
>>> +  Note: This function only can be called after SmmCpuSyncContextInit()
>> return success.
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
>> context object to be deinitialized.
>>> +
>>> +  @retval RETURN_SUCCESS            The SMM CPU Sync context was
>> successful deinitialized.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
>>> +  @retval RETURN_UNSUPPORTED        Unsupported operation.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncContextDeinit (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
>>> +  )
>>> +{
>>> +  UINTN  SmmCpuSyncCtxSize;
>>> +
>>> +  ASSERT (SmmCpuSyncCtx != NULL);
>>
>> (27) bogus ASSERT
>>
> 
> According to our original definition, I will refine the interface.
> 
>>> +  if (SmmCpuSyncCtx == NULL) {
>>> +    return RETURN_INVALID_PARAMETER;
>>> +  }
>>> +
>>> +  SmmCpuSyncCtxSize = sizeof (SMM_CPU_SYNC_CONTEXT) + sizeof
>> (SMM_CPU_SYNC_SEMAPHORE_GLOBAL) + sizeof
>> (SMM_CPU_SYNC_SEMAPHORE_CPU) * (SmmCpuSyncCtx->NumberOfCpus);
>>> +
>>> +  FreePages (SmmCpuSyncCtx->SemBuffer, EFI_SIZE_TO_PAGES
>> (SmmCpuSyncCtx->SemBufferSize));
>>
>> (28) Per comment (16), this can be simplified as:
>>
>>   FreePages (SmmCpuSyncCtx->SemBuffer, SmmCpuSyncCtx-
>>> SemBufferPages);
>>
>>> +
>>> +  FreePages (SmmCpuSyncCtx, EFI_SIZE_TO_PAGES (SmmCpuSyncCtxSize));
>>
>> (29) Per comments (9) and (26), this should be just
>>
>>   FreePool (SmmCpuSyncCtx);
>>
>> (and the variable "SmmCpuSyncCtxSize" should be removed).
>>
> 
> Yes, agree!
> 
> 
>>> +
>>> +  return RETURN_SUCCESS;
>>> +}
>>> +
>>> +/**
>>> +  Reset SMM CPU Sync context.
>>> +
>>> +  SmmCpuSyncContextReset() function is to reset SMM CPU Sync context
>> to the initialized state.
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
>> context object to be reset.
>>> +
>>> +  @retval RETURN_SUCCESS            The SMM CPU Sync context was
>> successful reset.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncContextReset (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx
>>> +  )
>>> +{
>>> +  ASSERT (SmmCpuSyncCtx != NULL);
>>
>> (30) bogus assert
>>
> 
> Will refine the implementation based on the latest interface.
> 
>>> +  if (SmmCpuSyncCtx == NULL) {
>>> +    return RETURN_INVALID_PARAMETER;
>>> +  }
>>> +
>>> +  *SmmCpuSyncCtx->GlobalSem->Counter = 0;
>>> +
>>> +  return RETURN_SUCCESS;
>>> +}
>>
>> (31) Is there anything to do about the Run semaphores here?
>>
> 
> No; the Run semaphores will naturally reset once all CPUs are ready to exit, provided the caller follows the API's calling requirements.

OK, thanks!

> 
> 
>>> +
>>> +/**
>>> +  Get current number of arrived CPU in SMI.
>>> +
>>> +  For traditional CPU synchronization method, BSP might need to know the
>> current number of arrived CPU in
>>> +  SMI to make sure all APs in SMI. This API can be for that purpose.
>>> +
>>> +  @param[in]      SmmCpuSyncCtx     Pointer to the SMM CPU Sync context
>> object.
>>> +  @param[in,out]  CpuCount          Current count of arrived CPU in SMI.
>>> +
>>> +  @retval RETURN_SUCCESS            Get current number of arrived CPU in SMI
>> successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is
>> NULL.
>>> +  @retval RETURN_UNSUPPORTED        Unsupported operation.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncGetArrivedCpuCount (
>>> +  IN     SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN OUT UINTN                 *CpuCount
>>> +  )
>>> +{
>>> +  ASSERT (SmmCpuSyncCtx != NULL && CpuCount != NULL);
>>
>> (32) bogus assert
>>
>>> +  if ((SmmCpuSyncCtx == NULL) || (CpuCount == NULL)) {
>>> +    return RETURN_INVALID_PARAMETER;
>>> +  }
>>> +
>>> +  if (*SmmCpuSyncCtx->GlobalSem->Counter < 0) {
>>
>> (33) The type of Counter is
>>
>>   volatile UINT32
>>
>> therefore this condition will never evaluate to true.
>>
>> If you want to check for the door being locked, then I suggest
>>
>>   *SmmCpuSyncCtx->GlobalSem.Counter == (UINT32)-1
>>
>> or
>>
>>   *SmmCpuSyncCtx->GlobalSem.Counter == MAX_UINT32
>>
> 
> Yes, I do need to check whether it is locked; I will fix this!
> 
> 
> For this API, I will rewrite the implementation according to the interface we aligned on.
> 
> 
> 
>>> +    return RETURN_UNSUPPORTED;
>>> +  }
>>> +
>>> +  *CpuCount = *SmmCpuSyncCtx->GlobalSem->Counter;
>>> +
>>> +  return RETURN_SUCCESS;
>>> +}
>>> +
>>> +/**
>>> +  Performs an atomic operation to check in CPU.
>>> +
>>> +  When SMI happens, all processors including BSP enter to SMM mode by
>> calling SmmCpuSyncCheckInCpu().
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
>> context object.
>>> +  @param[in]      CpuIndex          Check in CPU index.
>>> +
>>> +  @retval RETURN_SUCCESS            Check in CPU (CpuIndex) successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
>>> +  @retval RETURN_ABORTED            Check in CPU failed due to
>> SmmCpuSyncLockDoor() has been called by one elected CPU.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncCheckInCpu (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex
>>> +  )
>>> +{
>>> +  ASSERT (SmmCpuSyncCtx != NULL);
>>
>> (34) bogus ASSERT
>>
>>> +  if (SmmCpuSyncCtx == NULL) {
>>> +    return RETURN_INVALID_PARAMETER;
>>> +  }
> 
> Agree, I will refine!
> 
>>> +
>>> +  //
>>> +  // Check to return if Counter has already been locked.
>>> +  //
>>> +  if ((INT32)InternalReleaseSemaphore (SmmCpuSyncCtx->GlobalSem-
>>> Counter) <= 0) {
>>
>> (35) The cast and the comparison are bogus.
>>
>> InternalReleaseSemaphore():
>>
>> - returns 0, and leaves the semaphore unchanged, if the current value of
>> the semaphore is MAX_UINT32,
>>
>> - increments the semaphore, and returns the incremented -- hence:
>> strictly positive -- UINT32 value, otherwise.
>>
>> So the condition for
>>
>>   semaphore unchanged because door has been locked
>>
>> is:
>>
>>   InternalReleaseSemaphore (SmmCpuSyncCtx->GlobalSem->Counter) == 0
>>
>> No INT32 cast, and no "<".
>>
> 
> Agree! I will do the fix. 
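
The contract described in (35) can be modeled in plain C as follows (a single-threaded sketch of the return-value semantics only; the real InternalReleaseSemaphore() uses an atomic compare-exchange loop):

```c
#include <stdint.h>

/* Sketch of InternalReleaseSemaphore()'s contract, per comment (35):
   MAX_UINT32 is the "door locked" sentinel, so the value is left
   unchanged and 0 is returned; otherwise the count is incremented and
   the new, strictly positive value is returned. */
static uint32_t
ReleaseSemaphore (volatile uint32_t *Sem)
{
  uint32_t  Value;

  Value = *Sem;
  if (Value == UINT32_MAX) {
    return 0;                /* door locked: leave the count untouched */
  }
  *Sem = Value + 1;
  return Value + 1;
}

/* Comment (35): the correct check-in test is "== 0", no INT32 cast. */
static int
CheckInCpu (volatile uint32_t *Counter)
{
  return (ReleaseSemaphore (Counter) == 0) ? -1 /* aborted */ : 0;
}
```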
> 
> 
>>
>>> +    return RETURN_ABORTED;
>>> +  }
>>> +
>>> +  return RETURN_SUCCESS;
>>> +}
>>> +
>>> +/**
>>> +  Performs an atomic operation to check out CPU.
>>> +
>>> +  CheckOutCpu() can be called in error handling flow for the CPU who calls
>> CheckInCpu() earlier.
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync
>> context object.
>>> +  @param[in]      CpuIndex          Check out CPU index.
>>> +
>>> +  @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL.
>>> +  @retval RETURN_NOT_READY          The CPU is not checked-in.
>>> +  @retval RETURN_UNSUPPORTED        Unsupported operation.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncCheckOutCpu (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex
>>> +  )
>>> +{
>>> +  ASSERT (SmmCpuSyncCtx != NULL);
>>
>> (36) bogus assert
>>
>>
>>> +  if (SmmCpuSyncCtx == NULL) {
>>> +    return RETURN_INVALID_PARAMETER;
>>> +  }
>>> +
> 
> The same, I will refine it.
> 
> 
> 
> 
> 
>>> +  if (*SmmCpuSyncCtx->GlobalSem->Counter == 0) {
>>> +    return RETURN_NOT_READY;
>>> +  }
>>
>> (37) This preliminary check is not particularly useful.
>>
>> Assume that Counter is currently 1, but -- due to a programming error
>> somewhere -- there are two APs executing SmmCpuSyncCheckOutCpu() in
>> parallel. Both may pass this check (due to Counter being 1), and then
>> one of the APs will consume the semaphore and return, and the other AP
>> will hang forever.
>>
>> So this check is "best effort". It's fine -- some programming errors
>> just inevitably lead to undefined behavior; not all bad usage can be
>> explicitly caught.
>>
>> Maybe add a comment?
>>
> 
> 
> Yes, I will remove the RETURN_NOT_READY, and add the comment for the API:
> 
> "The caller shall make sure the CPU specified by CpuIndex has already checked-in."

OK, thanks.

> 
> 
> 
>>> +  if ((INT32)InternalWaitForSemaphore (SmmCpuSyncCtx->GlobalSem->Counter) < 0) {
>>> +    return RETURN_UNSUPPORTED;
>>> +  }
>>
>> (38) This doesn't look right. InternalWaitForSemaphore() blocks for as
>> long as the semaphore is zero. When the semaphore is nonzero,
>> InternalWaitForSemaphore() decrements it, and returns the decremented
>> value. Thus, InternalWaitForSemaphore() cannot return negative values
>> (it returns UINT32), and it also cannot return MAX_UINT32.
>>
>> So I simply don't understand the purpose of this code.
> 
> The Counter can be locked by calling SmmCpuSyncLockDoor(); after that, Counter will be (UINT32)-1. In that case, InternalWaitForSemaphore() will still decrement it, to (UINT32)-2, and this check is there to catch the locked case. But per the interface we aligned on below:
> /**
>   Performs an atomic operation to check out CPU.
> 
>   This function can be called in error handling flow for the CPU who calls CheckInCpu() earlier.
>   The caller shall make sure the CPU specified by CpuIndex has already checked-in.
> 
>   If Context is NULL, then ASSERT().
>   If CpuIndex exceeds the range of all CPUs in the system, then ASSERT().
> 
>   @param[in,out]  Context           Pointer to the SMM CPU Sync context object.
>   @param[in]      CpuIndex          Check out CPU index.
> 
>   @retval RETURN_SUCCESS            Check out CPU (CpuIndex) successfully.
>   @retval RETURN_ABORTED            Check out CPU failed due to SmmCpuSyncLockDoor() has been called by one elected CPU.
> 
> **/
> RETURN_STATUS
> EFIAPI
> SmmCpuSyncCheckOutCpu (
>   IN OUT SMM_CPU_SYNC_CONTEXT  *Context,
>   IN     UINTN                 CpuIndex
>   );
> 
> The code will be changed to:
> 
>   if ((INT32)InternalWaitForSemaphore (Context->CpuCount) < 0) {
>     return RETURN_ABORTED;
>   }

I find this quite ugly. In the "semaphore post" operation, we already
have code that prevents incrementing if the semaphore is "locked". Can
we perhaps create a "semaphore pend" operation that does the same?

How about this:

diff --git a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
index 3c2835f8def6..5d7fc58ef23f 100644
--- a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
+++ b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
@@ -91,35 +91,38 @@ UINT32
 InternalWaitForSemaphore (
   IN OUT  volatile UINT32  *Sem
   )
 {
   UINT32  Value;

   for ( ; ;) {
     Value = *Sem;
+    if (Value == MAX_UINT32) {
+      return Value;
+    }
     if ((Value != 0) &&
         (InterlockedCompareExchange32 (
            (UINT32 *)Sem,
            Value,
            Value - 1
            ) == Value))
     {
       break;
     }

     CpuPause ();
   }

   return Value - 1;
 }

Note, I'm just brainstorming here, I've not thought it through. Just to
illustrate the direction I'm thinking of.

This change should be mostly OK. InternalWaitForSemaphore() returns the
decremented value. So, for InternalWaitForSemaphore() to return
MAX_UINT32 *without* this update, the function would have to decrement
the semaphore when the semaphore is zero. But in that case, the function
*blocks*. Thus, a return value of MAX_UINT32 is not possible without
this extension; ergo, if MAX_UINT32 is returned (with this extension),
we know the door was locked earlier (and the semaphore is not changed).

At the same time, we might want to update InternalReleaseSemaphore() as
well, so that it cannot validly increment the semaphore value to MAX_UINT32.
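The proposed "pend" semantics are easy to sanity-check outside of edk2 with a single-threaded model. This is only a sketch: `WaitForSemaphoreModel` is an illustrative name, and a GCC `__atomic` builtin stands in for edk2's InterlockedCompareExchange32 (CpuPause is dropped).

```c
#include <assert.h>
#include <stdint.h>

#define MAX_UINT32 0xFFFFFFFFu

/* Model of the proposed InternalWaitForSemaphore(): return the locked
 * sentinel unchanged, otherwise decrement a nonzero semaphore. */
static uint32_t
WaitForSemaphoreModel (volatile uint32_t *Sem)
{
  uint32_t Value;

  for ( ; ;) {
    Value = *Sem;
    if (Value == MAX_UINT32) {
      return Value;               /* door locked: semaphore untouched */
    }

    if ((Value != 0) &&
        __atomic_compare_exchange_n ((uint32_t *)Sem, &Value, Value - 1,
                                     0, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) {
      break;                      /* consumed one unit of the semaphore */
    }
    /* a zero semaphore keeps spinning here (CpuPause in the real code) */
  }

  return Value - 1;               /* decremented value, never MAX_UINT32 */
}
```

With this shape, a return value of MAX_UINT32 can only mean "door locked", exactly as argued above.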



> 
> 
>>
>> As written, this condition could only fire if InternalWaitForSemaphore()
>> successfully decremented the semaphore, and the *new* value of the
>> semaphore were >=0x8000_0000. Because in that case, the INT32 cast (=
>> implementation-defined behavior) would produce a negative value. But for
>> that, we'd first have to increase Counter to 0x8000_0001 at least, and
>> that could never happen in practice, IMO.
>>
>> So this is basically dead code. What is the intent?
> 
> 
> It can happen when the Counter has been locked.
> 
> 
>>
>>> +
>>> +  return RETURN_SUCCESS;
>>> +}
>>> +
>>> +/**
>>> +  Performs an atomic operation lock door for CPU checkin or checkout.
>>> +
>>> +  After this function, CPU can not check in via SmmCpuSyncCheckInCpu().
>>> +
>>> +  The CPU specified by CpuIndex is elected to lock door.
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
>>> +  @param[in]      CpuIndex          Indicate which CPU to lock door.
>>> +  @param[in,out]  CpuCount          Number of arrived CPU in SMI after lock door.
>>> +
>>> +  @retval RETURN_SUCCESS            Lock door for CPU successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx or CpuCount is NULL.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncLockDoor (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex,
>>> +  IN OUT UINTN                 *CpuCount
>>> +  )
>>> +{
>>> +  ASSERT (SmmCpuSyncCtx != NULL && CpuCount != NULL);
>>
>> (39) bogus assert
>>
>>> +  if ((SmmCpuSyncCtx == NULL) || (CpuCount == NULL)) {
>>> +    return RETURN_INVALID_PARAMETER;
>>> +  }
>>> +
>>> +  *CpuCount = InternalLockdownSemaphore (SmmCpuSyncCtx->GlobalSem->Counter);
>>> +
>>> +  return RETURN_SUCCESS;
>>> +}
>>> +
>>> +/**
>>> +  Used by the BSP to wait for APs.
>>> +
>>> +  The number of APs need to be waited is specified by NumberOfAPs. The BSP is specified by BspIndex.
>>> +
>>> +  Note: This function is blocking mode, and it will return only after the number of APs released by
>>> +  calling SmmCpuSyncReleaseBsp():
>>> +  BSP: WaitForAPs    <--  AP: ReleaseBsp
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
>>> +  @param[in]      NumberOfAPs       Number of APs need to be waited by BSP.
>>> +  @param[in]      BspIndex          The BSP Index to wait for APs.
>>> +
>>> +  @retval RETURN_SUCCESS            BSP to wait for APs successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or NumberOfAPs > total number of processors in system.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncWaitForAPs (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 NumberOfAPs,
>>> +  IN     UINTN                 BspIndex
>>> +  )
>>> +{
>>> +  ASSERT (SmmCpuSyncCtx != NULL && NumberOfAPs <= SmmCpuSyncCtx->NumberOfCpus);
>>
>> (40) bogus assert
>>
>>> +  if ((SmmCpuSyncCtx == NULL) || (NumberOfAPs > SmmCpuSyncCtx->NumberOfCpus)) {
>>> +    return RETURN_INVALID_PARAMETER;
>>> +  }
>>
>> (41) Question for both the library instance and the library class (i.e.,
>> API documentation):
>>
>> Is it ever valid to call this function with (NumberOfAPs ==
>> SmmCpuSyncCtx->NumberOfCpus)?
>>
>> I would think not. NumberOfCpus is supposed to include the BSP and the
>> APs. Therefore the highest permitted NumberOfAPs value, on input, is
>> (SmmCpuSyncCtx->NumberOfCpus - 1).
>>
>> So I think we should modify the lib class and the lib instance both.
>> RETURN_INVALID_PARAMETER applies to "NumberOfAPs *>=* total number of
>> processors in system".
>>
> 
> Good catch!!! I will fix it.
> 
>>> +
>>> +  while (NumberOfAPs-- > 0) {
>>> +    InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
>>> +  }
>>
>> (42) In my opinion, this is an ugly pattern.
>>
>> First, after the loop, NumberOfAPs will be MAX_UINTN.
>>
>> Second, modifying input parameters is also an anti-pattern. Assume you
>> debug a problem, and fetch a backtrace where the two innermost frames
>> are SmmCpuSyncWaitForAPs() and InternalWaitForSemaphore(). If you look
>> at the stack frame that belongs to SmmCpuSyncWaitForAPs(), you may be
>> led to think that the function was *invoked* with a *low* NumberOfAPs
>> value. Whereas in fact NumberOfAPs may have been a larger value at the
>> time of call, only the function decremented NumberOfAPs by the time the
>> stack trace was fetched.
>>
>> So, please add a new helper variable, and write a proper "for" loop.
>>
>>   UINTN  Arrived;
>>
>>   for (Arrived = 0; Arrived < NumberOfAPs; Arrived++) {
>>     InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
>>   }
>>
> 
> Agree. Thank you!
> 
> 
>> (43) I mentioned this while reviewing the lib class header (patch#2), so
>> let me repeat it here:
>>
>> BspIndex is used for indexing the CpuSem array, but we perform no range
>> checking, against "SmmCpuSyncCtx->NumberOfCpus".
>>
>> That error should be documented (in the lib class header), and
>> caught/reported here.
>>
> 
> Yes, I will refine the patch based on the lib class header we aligned on.
> 
> 
>>> +
>>> +  return RETURN_SUCCESS;
>>> +}
>>> +
>>> +/**
>>> +  Used by the BSP to release one AP.
>>> +
>>> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
>>> +  @param[in]      CpuIndex          Indicate which AP need to be released.
>>> +  @param[in]      BspIndex          The BSP Index to release AP.
>>> +
>>> +  @retval RETURN_SUCCESS            BSP to release one AP successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncReleaseOneAp   (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex,
>>> +  IN     UINTN                 BspIndex
>>> +  )
>>> +{
>>> +  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);
>>
>> (44) bogus assert
>>
>>> +  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
>>> +    return RETURN_INVALID_PARAMETER;
>>> +  }
>>
>> (45) range checks for BspIndex and CpuIndex missing (in both lib class
>> and lib instance)
>>
> 
> The same; I will align the code with the lib class we aligned on.
> 
> 
>>> +
>>> +  InternalReleaseSemaphore (SmmCpuSyncCtx->CpuSem[CpuIndex].Run);
>>> +
>>> +  return RETURN_SUCCESS;
>>> +}
>>> +
>>> +/**
>>> +  Used by the AP to wait BSP.
>>> +
>>> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
>>> +
>>> +  Note: This function is blocking mode, and it will return only after the AP released by
>>> +  calling SmmCpuSyncReleaseOneAp():
>>> +  BSP: ReleaseOneAp  -->  AP: WaitForBsp
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx    Pointer to the SMM CPU Sync context object.
>>> +  @param[in]      CpuIndex         Indicate which AP wait BSP.
>>> +  @param[in]      BspIndex         The BSP Index to be waited.
>>> +
>>> +  @retval RETURN_SUCCESS            AP to wait BSP successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncWaitForBsp (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex,
>>> +  IN     UINTN                 BspIndex
>>> +  )
>>> +{
>>> +  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);
>>
>> (46) bogus assert
>>
>>> +  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
>>> +    return RETURN_INVALID_PARAMETER;
>>> +  }
>>
>> (47) range checks missing (lib class and instance)
>>
> 
> 
> The same; I will align the code with the lib class we aligned on.
> 
>>> +
>>> +  InternalWaitForSemaphore (SmmCpuSyncCtx->CpuSem[CpuIndex].Run);
>>> +
>>> +  return RETURN_SUCCESS;
>>> +}
>>> +
>>> +/**
>>> +  Used by the AP to release BSP.
>>> +
>>> +  The AP is specified by CpuIndex. The BSP is specified by BspIndex.
>>> +
>>> +  @param[in,out]  SmmCpuSyncCtx     Pointer to the SMM CPU Sync context object.
>>> +  @param[in]      CpuIndex          Indicate which AP release BSP.
>>> +  @param[in]      BspIndex          The BSP Index to be released.
>>> +
>>> +  @retval RETURN_SUCCESS            AP to release BSP successfully.
>>> +  @retval RETURN_INVALID_PARAMETER  SmmCpuSyncCtx is NULL or CpuIndex is same as BspIndex.
>>> +
>>> +**/
>>> +RETURN_STATUS
>>> +EFIAPI
>>> +SmmCpuSyncReleaseBsp (
>>> +  IN OUT SMM_CPU_SYNC_CONTEXT  *SmmCpuSyncCtx,
>>> +  IN     UINTN                 CpuIndex,
>>> +  IN     UINTN                 BspIndex
>>> +  )
>>> +{
>>> +  ASSERT (SmmCpuSyncCtx != NULL && BspIndex != CpuIndex);
>>
>> (48) bogus assert
>>
>>> +  if ((SmmCpuSyncCtx == NULL) || (BspIndex == CpuIndex)) {
>>> +    return RETURN_INVALID_PARAMETER;
>>> +  }
>>
>> (49) range checks missing (lib class and instance)
>>
> 
> The same; I will align the code with the lib class we aligned on.
> 
> 
>>> +
>>> +  InternalReleaseSemaphore (SmmCpuSyncCtx->CpuSem[BspIndex].Run);
>>> +
>>> +  return RETURN_SUCCESS;
>>> +}
>>> diff --git a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
>>> new file mode 100644
>>> index 0000000000..6bb1895577
>>> --- /dev/null
>>> +++ b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
>>> @@ -0,0 +1,39 @@
>>> +## @file
>>> +# SMM CPU Synchronization lib.
>>> +#
>>> +# This is SMM CPU Synchronization lib used for SMM CPU sync operations.
>>> +#
>>> +# Copyright (c) 2023, Intel Corporation. All rights reserved.<BR>
>>> +# SPDX-License-Identifier: BSD-2-Clause-Patent
>>> +#
>>> +##
>>> +
>>> +[Defines]
>>> +  INF_VERSION                    = 0x00010005
>>> +  BASE_NAME                      = SmmCpuSyncLib
>>> +  FILE_GUID                      = 1ca1bc1a-16a4-46ef-956a-ca500fd3381f
>>> +  MODULE_TYPE                    = DXE_SMM_DRIVER
>>> +  LIBRARY_CLASS                  = SmmCpuSyncLib|DXE_SMM_DRIVER
>>> +
>>> +[Sources]
>>> +  SmmCpuSyncLib.c
>>> +
>>> +[Packages]
>>> +  MdePkg/MdePkg.dec
>>> +  MdeModulePkg/MdeModulePkg.dec
>>> +  UefiCpuPkg/UefiCpuPkg.dec
>>> +
>>> +[LibraryClasses]
>>> +  UefiLib
>>> +  BaseLib
>>> +  DebugLib
>>> +  PrintLib
>>> +  SafeIntLib
>>> +  SynchronizationLib
>>> +  BaseMemoryLib
>>> +  SmmServicesTableLib
>>> +  MemoryAllocationLib
>>
>> (50) Please sort this list alphabetically (cf. comment (2)).
>>
> 
> Ok, will refine it.
> 
>>> +
>>> +[Pcd]
>>> +
>>> +[Protocols]
>>
>> (51) Useless empty INF file sections; please remove them.
>>
>>> diff --git a/UefiCpuPkg/UefiCpuPkg.dsc b/UefiCpuPkg/UefiCpuPkg.dsc
>>> index 074fd77461..f264031c77 100644
>>> --- a/UefiCpuPkg/UefiCpuPkg.dsc
>>> +++ b/UefiCpuPkg/UefiCpuPkg.dsc
>>> @@ -23,10 +23,11 @@
>>>  #
>>>
>>>  !include MdePkg/MdeLibs.dsc.inc
>>>
>>>  [LibraryClasses]
>>> +  SafeIntLib|MdePkg/Library/BaseSafeIntLib/BaseSafeIntLib.inf
>>>    BaseLib|MdePkg/Library/BaseLib/BaseLib.inf
>>>    BaseMemoryLib|MdePkg/Library/BaseMemoryLib/BaseMemoryLib.inf
>>>    CpuLib|MdePkg/Library/BaseCpuLib/BaseCpuLib.inf
>>>    DebugLib|MdePkg/Library/BaseDebugLibNull/BaseDebugLibNull.inf
>>>  SerialPortLib|MdePkg/Library/BaseSerialPortLibNull/BaseSerialPortLibNull.inf
>>
>> (52) Just from the context visible here, this list seems alphabetically
>> sorted pre-patch; if that's the case, please stick with it (don't break
>> the sort order).
> 
> 
> How about adding SafeIntLib to MdePkg/MdeLibs.dsc.inc? If you agree, I will do that in a separate patch.

Sounds like a good idea -- there shouldn't ever be further instances of
SafeIntLib, and SafeIntLib should be used wherever possible (integer
overflow may be a risk anywhere at all). So a central lib class ->
instance resolution sounds useful.

Thanks!
Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#112529): https://edk2.groups.io/g/devel/message/112529
Mute This Topic: https://groups.io/mt/103010165/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance
  2023-12-14 13:48       ` Laszlo Ersek
@ 2023-12-14 15:34         ` Wu, Jiaxin
  2023-12-14 15:54         ` Wu, Jiaxin
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-14 15:34 UTC (permalink / raw)
  To: devel@edk2.groups.io, lersek@redhat.com
  Cc: Dong, Eric, Ni, Ray, Zeng, Star, Gerd Hoffmann, Kumar, Rahul R

> > The code will be changed to:
> >
> >   if ((INT32)InternalWaitForSemaphore (Context->CpuCount) < 0) {
> >     return RETURN_ABORTED;
> >   }
> 
> I find this quite ugly. In the "semaphore post" operation, we already
> have code that prevents incrementing if the semaphore is "locked". Can
> we perhaps create a "semaphore pend" operation that does the same?
> 
> How about this:
> 
> diff --git a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
> b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
> index 3c2835f8def6..5d7fc58ef23f 100644
> --- a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
> +++ b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
> @@ -91,35 +91,38 @@ UINT32
>  InternalWaitForSemaphore (
>    IN OUT  volatile UINT32  *Sem
>    )
>  {
>    UINT32  Value;
> 
>    for ( ; ;) {
>      Value = *Sem;
> +    if (Value == MAX_UINT32) {
> +      return Value;
> +    }
>      if ((Value != 0) &&
>          (InterlockedCompareExchange32 (
>             (UINT32 *)Sem,
>             Value,
>             Value - 1
>             ) == Value))
>      {
>        break;
>      }
> 
>      CpuPause ();
>    }
> 
>    return Value - 1;
>  }
> 
> Note, I'm just brainstorming here, I've not thought it through. Just to
> illustrate the direction I'm thinking of.
> 
> This change should be mostly OK. InternalWaitForSemaphore() returns the
> decremented value. So, for InternalWaitForSemaphore() to return
> MAX_UINT32 *without* this update, the function would have to decrement
> the semaphore when the semaphore is zero. But in that case, the function
> *blocks*. Thus, a return value of MAX_UINT32 is not possible without
> this extension; ergo, if MAX_UINT32 is returned (with this extension),

Yes, that's the semaphore sync usage: we have to block on the sem while it's zero and decrement it on return. That's why I said it naturally makes sure Run is reset once everyone is ready to exit. It achieves the flow below:
    BSP: ReleaseOneAp  -->  AP: WaitForBsp
    BSP: WaitForAPs    <--  AP: ReleaseBsp


For the locked case, I just copied the existing logic from the SMM CPU driver (as I documented in the commit message: the instance follows the existing SMM CPU driver (PiSmmCpuDxeSmm) sync implementation and behavior):
the existing ReleaseSemaphore() prevents the increment, but it still returns the original semaphore value + 1 --> that's why we have to check whether the return value is 0 in SmmCpuSyncCheckInCpu();
the existing WaitForSemaphore() still decrements the semaphore when locked, and returns the original semaphore value - 1 --> that's why we have to check whether the return value is < 0 in SmmCpuSyncCheckOutCpu().
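The existing-driver behavior described above can be modeled in a few lines, which makes the two return-value checks visible. This is a single-threaded sketch under stated assumptions: plain stores replace the interlocked loops, the wait model assumes a nonzero semaphore (no blocking), and the function names are illustrative, not edk2's.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_UINT32 0xFFFFFFFFu

/* Existing ReleaseSemaphore(): skip the increment when locked, but
 * still report original + 1 -- a locked semaphore reports 0. */
static uint32_t
ReleaseSemaphoreExisting (volatile uint32_t *Sem)
{
  uint32_t Value = *Sem;

  if (Value + 1 != 0) {
    *Sem = Value + 1;            /* not locked: increment takes effect */
  }

  return Value + 1;              /* 0 tells CheckInCpu() the door is locked */
}

/* Existing WaitForSemaphore(): decrements even a locked semaphore and
 * reports original - 1 -- negative as INT32 when it was locked. */
static uint32_t
WaitForSemaphoreExisting (volatile uint32_t *Sem)
{
  uint32_t Value = *Sem;         /* assumed nonzero; real code blocks at 0 */

  *Sem = Value - 1;
  return Value - 1;              /* (INT32) < 0 tells CheckOutCpu() it was locked */
}
```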
 
So, do you want to align the behavior as below?

ReleaseSemaphore() prevents the increment when locked, and returns the locked value (MAX_UINT32) --> then we can check whether the return value is MAX_UINT32 in SmmCpuSyncCheckInCpu(), and the sem itself won't be changed.
WaitForSemaphore() prevents the decrement when locked, and returns the locked value (MAX_UINT32) --> then we can check whether the return value is MAX_UINT32 in SmmCpuSyncCheckOutCpu(), and the sem itself won't be changed.

I think:
ReleaseSemaphore must cover the following 2 usages:
1. semaphore sync (Run): the lock case doesn't apply and the return value isn't checked; only the semaphore itself matters.
2. rendezvous (Counter): it must not only report via the return value whether the door is locked, but also only increment the semaphore when not locked.

WaitForSemaphore must cover the following 2 usages:
1. semaphore sync (Run): the lock case doesn't apply and the return value isn't checked; but the semaphore itself must block at 0 and be decremented on return.
2. rendezvous (Counter): only the locked-or-not indication in the return value matters; the semaphore value itself isn't of interest.

So, based on the above: yes, we can make the change to align the lock behavior:

/**
  Performs an atomic compare exchange operation to get semaphore.
  The compare exchange operation must be performed using MP safe
  mechanisms.

  @param[in,out]  Sem    IN:  32-bit unsigned integer
                         OUT: original integer - 1 if Sem is not locked.
                         OUT: original integer (MAX_UINT32) if Sem is locked.

  @retval     Original integer - 1 if Sem is not locked.
              Original integer (MAX_UINT32) if Sem is locked.

**/
STATIC
UINT32
InternalWaitForSemaphore (
  IN OUT  volatile UINT32  *Sem
  )
{
  UINT32  Value;

  for ( ; ;) {
    Value = *Sem;
    if (Value == MAX_UINT32) {
      return Value;
    }

    if ((Value != 0) &&
        (InterlockedCompareExchange32 (
           (UINT32 *)Sem,
           Value,
           Value - 1
           ) == Value))
    {
      break;
    }

    CpuPause ();
  }

  return Value - 1;
}

/**
  Performs an atomic compare exchange operation to release semaphore.
  The compare exchange operation must be performed using MP safe
  mechanisms.

  @param[in,out]  Sem    IN:  32-bit unsigned integer
                         OUT: original integer + 1 if Sem is not locked.
                         OUT: original integer (MAX_UINT32) if Sem is locked.

  @retval    Original integer + 1 if Sem is not locked.
             Original integer (MAX_UINT32) if Sem is locked.

**/
STATIC
UINT32
InternalReleaseSemaphore (
  IN OUT  volatile UINT32  *Sem
  )
{
  UINT32  Value;

  do {
    Value = *Sem;
  } while (Value + 1 != 0 &&
           InterlockedCompareExchange32 (
             (UINT32 *)Sem,
             Value,
             Value + 1
             ) != Value);

  if (Value == MAX_UINT32) {
    return Value;
  }

  return Value + 1;
}

I haven't seen any issue with this change.
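The aligned InternalReleaseSemaphore() above can be exercised the same way in a single-threaded model; here `__sync_bool_compare_and_swap` stands in for edk2's InterlockedCompareExchange32, and the model name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_UINT32 0xFFFFFFFFu

/* Model of the aligned InternalReleaseSemaphore(): never increment a
 * locked semaphore, and return the lock sentinel unchanged. */
static uint32_t
ReleaseSemaphoreAligned (volatile uint32_t *Sem)
{
  uint32_t Value;

  do {
    Value = *Sem;
  } while ((Value + 1 != 0) &&   /* locked: leave the CAS loop at once */
           !__sync_bool_compare_and_swap ((uint32_t *)Sem, Value, Value + 1));

  if (Value == MAX_UINT32) {
    return Value;                /* locked: semaphore untouched */
  }

  return Value + 1;
}
```

Together with the pend side, neither operation can move the semaphore once it holds MAX_UINT32, which is the invariant the lock depends on.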

> we know the door was locked earlier (and the semaphore is not changed).
> 
> At the same time, we might want to update InternalReleaseSemaphore() as
> well, so that it cannot validly increment the semaphore value to MAX_UINT32.
> 
> 
> 
> >
> >



-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#112542): https://edk2.groups.io/g/devel/message/112542
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance
  2023-12-14 13:48       ` Laszlo Ersek
  2023-12-14 15:34         ` Wu, Jiaxin
@ 2023-12-14 15:54         ` Wu, Jiaxin
  2023-12-15  6:41         ` Wu, Jiaxin
  2023-12-15  6:44         ` Wu, Jiaxin
  3 siblings, 0 replies; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-14 15:54 UTC (permalink / raw)
  To: devel@edk2.groups.io, lersek@redhat.com
  Cc: Dong, Eric, Ni, Ray, Zeng, Star, Gerd Hoffmann, Kumar, Rahul R

BTW, for SmmCpuSyncGetArrivedCpuCount ():

We can't check whether CpuCount (originally named the Counter sem) is locked and then decide whether to return *Context->CpuCount or the count captured at lock time. Something like:

if (*Context->CpuCount == MAX_UINT32) {        ------> condition not met, meaning unlocked!
	Return the real CpuCount from SmmCpuSyncLockDoor().
}
                ----> lock operation happens here!!!! *Context->CpuCount changes to MAX_UINT32
Return *Context->CpuCount;   --> wrong value returned, since MAX_UINT32 comes back.

Because if we find it unlocked during the check but it gets locked just before the return, then -1 will be returned; the check-then-return sequence is not atomic, and that behavior is not expected. Making it atomic here would surely impact the existing performance.

And the real use case only needs this API before the lock; I don't want to make it complex.

So, based on this, we add this comment to the function:
  The caller shall not call this function for the number of arrived CPUs after the door is
  locked in SMI, since the value has been returned in the parameter of LockDoor().
 
See below: 

/**
  Get current number of arrived CPU in SMI.

  BSP might need to know the current number of arrived CPUs in SMI to make sure all APs
  are in SMI. This API can be used for that purpose.

  The caller shall not call this function for the number of arrived CPUs after the door is
  locked in SMI, since the value has been returned in the parameter of LockDoor().

  If Context is NULL, then ASSERT().

  @param[in]      Context     Pointer to the SMM CPU Sync context object.

  @retval    Current number of arrived CPU in SMI.

**/
UINTN
EFIAPI
SmmCpuSyncGetArrivedCpuCount (
  IN  SMM_CPU_SYNC_CONTEXT  *Context
  )
{
  ASSERT (Context != NULL);

  return *Context->CpuCount;
}
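The calling contract in the comment above can be sketched as follows. The struct and function names are illustrative stand-ins for the real SMM_CPU_SYNC_CONTEXT and SmmCpuSyncLockDoor(), and a plain store models the interlocked exchange: once the door is locked, the raw counter holds the sentinel, so the BSP must use the count handed back by the lock operation.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_UINT32 0xFFFFFFFFu

typedef struct {
  volatile uint32_t CpuCount;    /* the Counter semaphore */
} SyncContextModel;

/* Valid only while the door is still open. */
static uint32_t
GetArrivedCpuCountModel (SyncContextModel *Ctx)
{
  return Ctx->CpuCount;
}

/* Lock the door: swap in the sentinel, return the real arrived count. */
static uint32_t
LockDoorModel (SyncContextModel *Ctx)
{
  uint32_t Arrived = Ctx->CpuCount;

  Ctx->CpuCount = MAX_UINT32;    /* interlocked exchange in the real code */
  return Arrived;
}
```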

Thanks,
Jiaxin 


-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#112544): https://edk2.groups.io/g/devel/message/112544
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 4/6] OvmfPkg: Specifies SmmCpuSyncLib instance
  2023-12-13 16:52   ` Laszlo Ersek
  2023-12-14 13:43     ` Wu, Jiaxin
@ 2023-12-15  0:21     ` Ni, Ray
  1 sibling, 0 replies; 22+ messages in thread
From: Ni, Ray @ 2023-12-15  0:21 UTC (permalink / raw)
  To: Laszlo Ersek, devel@edk2.groups.io, Wu, Jiaxin
  Cc: Ard Biesheuvel, Yao, Jiewen, Justen, Jordan L, Dong, Eric,
	Zeng, Star, Kumar, Rahul R, Gerd Hoffmann

Laszlo,
I like the "local lib override" idea, similar to the C language recommendation to use local variables instead of global ones when possible.

Thanks,
Ray
> -----Original Message-----
> From: Laszlo Ersek <lersek@redhat.com>
> Sent: Thursday, December 14, 2023 12:53 AM
> To: devel@edk2.groups.io; Wu, Jiaxin <jiaxin.wu@intel.com>
> Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>; Yao, Jiewen
> <jiewen.yao@intel.com>; Justen, Jordan L <jordan.l.justen@intel.com>; Dong,
> Eric <eric.dong@intel.com>; Ni, Ray <ray.ni@intel.com>; Zeng, Star
> <star.zeng@intel.com>; Kumar, Rahul R <rahul.r.kumar@intel.com>; Gerd
> Hoffmann <kraxel@redhat.com>
> Subject: Re: [edk2-devel] [PATCH v3 4/6] OvmfPkg: Specifies SmmCpuSyncLib
> instance
> 
> On 12/6/23 11:01, Wu, Jiaxin wrote:
> > This patch is to specify SmmCpuSyncLib instance for OvmfPkg.
> >
> > Cc: Laszlo Ersek <lersek@redhat.com>
> > Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
> > Cc: Jiewen Yao <jiewen.yao@intel.com>
> > Cc: Jordan Justen <jordan.l.justen@intel.com>
> > Cc: Eric Dong <eric.dong@intel.com>
> > Cc: Ray Ni <ray.ni@intel.com>
> > Cc: Zeng Star <star.zeng@intel.com>
> > Cc: Rahul Kumar <rahul1.kumar@intel.com>
> > Cc: Gerd Hoffmann <kraxel@redhat.com>
> > Signed-off-by: Jiaxin Wu <jiaxin.wu@intel.com>
> > ---
> >  OvmfPkg/CloudHv/CloudHvX64.dsc | 2 ++
> >  OvmfPkg/OvmfPkgIa32.dsc        | 2 ++
> >  OvmfPkg/OvmfPkgIa32X64.dsc     | 2 ++
> >  OvmfPkg/OvmfPkgX64.dsc         | 1 +
> >  4 files changed, 7 insertions(+)
> >
> > diff --git a/OvmfPkg/CloudHv/CloudHvX64.dsc b/OvmfPkg/CloudHv/CloudHvX64.dsc
> > index 821ad1b9fa..f735b69a37 100644
> > --- a/OvmfPkg/CloudHv/CloudHvX64.dsc
> > +++ b/OvmfPkg/CloudHv/CloudHvX64.dsc
> > @@ -183,10 +183,12 @@
> >    PeiHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/PeiHardwareInfoLib.inf
> >    DxeHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/DxeHardwareInfoLib.inf
> >    ImagePropertiesRecordLib|MdeModulePkg/Library/ImagePropertiesRecordLib/ImagePropertiesRecordLib.inf
> >  !if $(SMM_REQUIRE) == FALSE
> >    LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
> > +!else
> > +  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
> >  !endif
> >    CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
> >    FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
> >    MemEncryptTdxLib|OvmfPkg/Library/BaseMemEncryptTdxLib/BaseMemEncryptTdxLib.inf
> >
> > diff --git a/OvmfPkg/OvmfPkgIa32.dsc b/OvmfPkg/OvmfPkgIa32.dsc
> > index bce2aedcd7..b05b13b18c 100644
> > --- a/OvmfPkg/OvmfPkgIa32.dsc
> > +++ b/OvmfPkg/OvmfPkgIa32.dsc
> > @@ -188,10 +188,12 @@
> >    PeiHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/PeiHardwareInfoLib.inf
> >    DxeHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/DxeHardwareInfoLib.inf
> >    ImagePropertiesRecordLib|MdeModulePkg/Library/ImagePropertiesRecordLib/ImagePropertiesRecordLib.inf
> >  !if $(SMM_REQUIRE) == FALSE
> >    LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
> > +!else
> > +  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
> >  !endif
> >    CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
> >    FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
> >
> >  !if $(SOURCE_DEBUG_ENABLE) == TRUE
> > diff --git a/OvmfPkg/OvmfPkgIa32X64.dsc b/OvmfPkg/OvmfPkgIa32X64.dsc
> > index 631e909a54..5a16eb7abe 100644
> > --- a/OvmfPkg/OvmfPkgIa32X64.dsc
> > +++ b/OvmfPkg/OvmfPkgIa32X64.dsc
> > @@ -193,10 +193,12 @@
> >    PeiHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/PeiHardwareInfoLib.inf
> >    DxeHardwareInfoLib|OvmfPkg/Library/HardwareInfoLib/DxeHardwareInfoLib.inf
> >    ImagePropertiesRecordLib|MdeModulePkg/Library/ImagePropertiesRecordLib/ImagePropertiesRecordLib.inf
> >  !if $(SMM_REQUIRE) == FALSE
> >    LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
> > +!else
> > +  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
> >  !endif
> >    CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
> >    FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
> >
> >  !if $(SOURCE_DEBUG_ENABLE) == TRUE
> > diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
> > index 4ea3008cc6..6bb4c777b9 100644
> > --- a/OvmfPkg/OvmfPkgX64.dsc
> > +++ b/OvmfPkg/OvmfPkgX64.dsc
> > @@ -209,10 +209,11 @@
> >  !if $(SMM_REQUIRE) == FALSE
> >    LockBoxLib|OvmfPkg/Library/LockBoxLib/LockBoxBaseLib.inf
> >    CcProbeLib|OvmfPkg/Library/CcProbeLib/DxeCcProbeLib.inf
> >  !else
> >    CcProbeLib|MdePkg/Library/CcProbeLibNull/CcProbeLibNull.inf
> > +  SmmCpuSyncLib|UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.inf
> >  !endif
> >    CustomizedDisplayLib|MdeModulePkg/Library/CustomizedDisplayLib/CustomizedDisplayLib.inf
> >    FrameBufferBltLib|MdeModulePkg/Library/FrameBufferBltLib/FrameBufferBltLib.inf
> >
> >  !if $(SOURCE_DEBUG_ENABLE) == TRUE
> 
> All four DSC files already include "PiSmmCpuDxeSmm.inf" like this:
> 
>   UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf {
>     <LibraryClasses>
>       ...
>   }
> 
> Given that this new library class is again exclusively used by
> PiSmmCpuDxeSmm, can you please resolve this lib class too in module
> scope only?
> 
> Thanks!
> Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
View/Reply Online (#112579): https://edk2.groups.io/g/devel/message/112579
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance
  2023-12-14 13:48       ` Laszlo Ersek
  2023-12-14 15:34         ` Wu, Jiaxin
  2023-12-14 15:54         ` Wu, Jiaxin
@ 2023-12-15  6:41         ` Wu, Jiaxin
  2023-12-15  6:44         ` Wu, Jiaxin
  3 siblings, 0 replies; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-15  6:41 UTC (permalink / raw)
  To: devel@edk2.groups.io, lersek@redhat.com
  Cc: Dong, Eric, Ni, Ray, Zeng, Star, Gerd Hoffmann, Kumar, Rahul R

I will align the ReleaseSemaphore & WaitForSemaphore behavior as below:

ReleaseSemaphore() prevents increasing the semaphore if locked, and it should return the locked value (MAX_UINT32);  --> then we can check whether the return value is MAX_UINT32 in SmmCpuSyncCheckInCpu(), and the sem itself won't be changed.
WaitForSemaphore() prevents decreasing the semaphore if locked, and it should return the locked value (MAX_UINT32); --> then we can check whether the return value is MAX_UINT32 in SmmCpuSyncCheckOutCpu(), and the sem itself won't be changed.

Thanks,
Jiaxin 


> -----Original Message-----
> From: Wu, Jiaxin
> Sent: Thursday, December 14, 2023 11:35 PM
> To: devel@edk2.groups.io; lersek@redhat.com
> Cc: Dong, Eric <eric.dong@intel.com>; Ni, Ray <ray.ni@intel.com>; Zeng, Star
> <star.zeng@intel.com>; Gerd Hoffmann <kraxel@redhat.com>; Kumar, Rahul R
> <rahul.r.kumar@intel.com>
> Subject: RE: [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements
> SmmCpuSyncLib library instance
> 
> > > The code will be changed to:
> > >
> > >   if ((INT32)InternalWaitForSemaphore (Context->CpuCount) < 0) {
> > >     return RETURN_ABORTED;
> > >   }
> >
> > I find this quite ugly. In the "semaphore post" operation, we already
> > have code that prevents incrementing if the semaphore is "locked". Can
> > we perhaps create a "semaphore pend" operation that does the same?
> >
> > How about this:
> >
> > diff --git a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
> > b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
> > index 3c2835f8def6..5d7fc58ef23f 100644
> > --- a/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
> > +++ b/UefiCpuPkg/Library/SmmCpuSyncLib/SmmCpuSyncLib.c
> > @@ -91,35 +91,38 @@ UINT32
> >  InternalWaitForSemaphore (
> >    IN OUT  volatile UINT32  *Sem
> >    )
> >  {
> >    UINT32  Value;
> >
> >    for ( ; ;) {
> >      Value = *Sem;
> > +    if (Value == MAX_UINT32) {
> > +      return Value;
> > +    }
> >      if ((Value != 0) &&
> >          (InterlockedCompareExchange32 (
> >             (UINT32 *)Sem,
> >             Value,
> >             Value - 1
> >             ) == Value))
> >      {
> >        break;
> >      }
> >
> >      CpuPause ();
> >    }
> >
> >    return Value - 1;
> >  }
> >
> > Note, I'm just brainstorming here, I've not thought it through. Just to
> > illustrate the direction I'm thinking of.
> >
> > This change should be mostly OK. InternalWaitForSemaphore() returns the
> > decremented value. So, for InternalWaitForSemaphore() to return
> > MAX_UINT32 *without* this update, the function would have to decrement
> > the semaphore when the semaphore is zero. But in that case, the function
> > *blocks*. Thus, a return value of MAX_UINT32 is not possible without
> > this extension; ergo, if MAX_UINT32 is returned (with this extension),
> 
> Yes, that's for the semaphore sync usage: we have to block on the sem if it's zero,
> and decrease it when returning. That's why I said it naturally makes sure Run is
> reset after all are ready to exit. Then it can achieve the below flow:
>     BSP: ReleaseOneAp  -->  AP: WaitForBsp
>     BSP: WaitForAPs    <--  AP: ReleaseBsp
> 
> 
> For the locked case, I just copied the existing logic from the SMM CPU driver (as I
> documented in the commit message: the instance refers to the existing SMM CPU
> driver (PiSmmCpuDxeSmm) sync implementation and behavior):
> the existing ReleaseSemaphore() prevents increasing the semaphore, but it still
> returns the original semaphore value + 1; --> that's why we have to check whether
> the return value is 0 in SmmCpuSyncCheckInCpu();
> the existing WaitForSemaphore() allows decreasing the semaphore even if locked, and
> it also returns the original semaphore value - 1; --> that's why we have to check
> whether the return value is < 0 in SmmCpuSyncCheckOutCpu().
> 
> so, do you want to align the behavior as below?
> 
> ReleaseSemaphore() prevents increasing the semaphore if locked, and it should
> return the locked value (MAX_UINT32);  --> then we can check whether the return
> value is MAX_UINT32 in SmmCpuSyncCheckInCpu(), and the sem itself
> won't be changed.
> WaitForSemaphore() prevents decreasing the semaphore if locked, and it should
> return the locked value (MAX_UINT32); --> then we can check whether the return
> value is MAX_UINT32 in SmmCpuSyncCheckOutCpu(), and the sem itself won't
> be changed.
> 
> I think:
> for ReleaseSemaphore, it must meet the below 2 usage cases:
> 1. for semaphore sync usage (Run), it doesn't care about the lock case, and the
> returned value is not checked. Just check the semaphore itself.
> 2. for the Rendezvous case (Counter), it not only needs to check from the return
> value whether it is locked, but also requires "only increase the semaphore if not
> locked".
> 
> for WaitForSemaphore, it must meet the below 2 usage cases:
> 1. for semaphore sync usage (Run), it doesn't care about the lock case, and the
> returned value is not checked. But the semaphore itself needs to block at 0, and
> decrease when returning.
> 2. for the Rendezvous case (Counter), it only needs to check from the return
> value whether it is locked. The semaphore itself is not cared about.
> 
> So, based on above, I think, yes, we can do the change to align the lock
> behavior:
> 
> /**
>   Performs an atomic compare exchange operation to get semaphore.
>   The compare exchange operation must be performed using MP safe
>   mechanisms.
> 
>   @param[in,out]  Sem    IN:  32-bit unsigned integer
>                          OUT: original integer - 1 if Sem is not locked.
>                          OUT: original integer (MAX_UINT32) if Sem is locked.
> 
>   @retval     Original integer - 1 if Sem is not locked.
>               Original integer (MAX_UINT32) if Sem is locked.
> 
> **/
> STATIC
> UINT32
> InternalWaitForSemaphore (
>   IN OUT  volatile UINT32  *Sem
>   )
> {
>   UINT32  Value;
> 
>   for ( ; ;) {
>     Value = *Sem;
>     if (Value == MAX_UINT32) {
>       return Value;
>     }
> 
>     if ((Value != 0) &&
>         (InterlockedCompareExchange32 (
>            (UINT32 *)Sem,
>            Value,
>            Value - 1
>            ) == Value))
>     {
>       break;
>     }
> 
>     CpuPause ();
>   }
> 
>   return Value - 1;
> }
> 
> /**
>   Performs an atomic compare exchange operation to release semaphore.
>   The compare exchange operation must be performed using MP safe
>   mechanisms.
> 
>   @param[in,out]  Sem    IN:  32-bit unsigned integer
>                          OUT: original integer + 1 if Sem is not locked.
>                          OUT: original integer (MAX_UINT32) if Sem is locked.
> 
>   @retval    Original integer + 1 if Sem is not locked.
>              Original integer (MAX_UINT32) if Sem is locked.
> 
> **/
> STATIC
> UINT32
> InternalReleaseSemaphore (
>   IN OUT  volatile UINT32  *Sem
>   )
> {
>   UINT32  Value;
> 
>   do {
>     Value = *Sem;
>   } while (Value + 1 != 0 &&
>            InterlockedCompareExchange32 (
>              (UINT32 *)Sem,
>              Value,
>              Value + 1
>              ) != Value);
> 
>   if (Value == MAX_UINT32) {
>     return Value;
>   }
> 
>   return Value + 1;
> }
> 
> I haven't see any issue with this change.
> 
> > we know the door was locked earlier (and the semaphore is not changed).
> >
> > At the same time, we might want to update InternalReleaseSemaphore() as
> > well, so that it cannot validly increment the semaphore value to
> MAX_UINT32.





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance
  2023-12-14 13:48       ` Laszlo Ersek
                           ` (2 preceding siblings ...)
  2023-12-15  6:41         ` Wu, Jiaxin
@ 2023-12-15  6:44         ` Wu, Jiaxin
  3 siblings, 0 replies; 22+ messages in thread
From: Wu, Jiaxin @ 2023-12-15  6:44 UTC (permalink / raw)
  To: devel@edk2.groups.io, lersek@redhat.com
  Cc: Dong, Eric, Ni, Ray, Zeng, Star, Gerd Hoffmann, Kumar, Rahul R

SmmCpuSyncGetArrivedCpuCount() won't impose the below requirement on the caller:

"The caller shall not call this function for the number of arrived CPU after lock door in SMI since the value has been returned in the parameter of LockDoor()."

The API will always support getting the ArrivedCpuCount, no matter whether the door is locked or not.

So, ignore the below case.

Thanks,
Jiaxin 

> -----Original Message-----
> From: Wu, Jiaxin
> Sent: Thursday, December 14, 2023 11:55 PM
> To: devel@edk2.groups.io; lersek@redhat.com
> Cc: Dong, Eric <eric.dong@intel.com>; Ni, Ray <ray.ni@intel.com>; Zeng, Star
> <star.zeng@intel.com>; Gerd Hoffmann <kraxel@redhat.com>; Kumar, Rahul R
> <rahul.r.kumar@intel.com>
> Subject: RE: [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements
> SmmCpuSyncLib library instance
> 
> BTW, for SmmCpuSyncGetArrivedCpuCount():
> 
> we can't check whether the CpuCount (originally named the Counter sem) is locked
> and then decide to return either *Context->CpuCount or the value saved at lock
> time for the arrived CPUs in SMI. Just like:
> 
> if (*Context->CpuCount == MAX_UINT32) {   ------> does not meet this condition, means unlocked!
> 	Return real CpuCount from the SmmCpuSyncLockDoor().
> }
>                 ----> lock operation happens here!!!! *Context->CpuCount changes to MAX_UINT32
> Return *Context->CpuCount;   --> returns the wrong value since MAX_UINT32 is returned.
> 
> Because if we find it's not locked during the check, but it suddenly gets locked
> before the return, then -1 will be returned. This is not an atomic operation. The
> behavior is not expected. If we add an atomic operation here, I believe it will
> surely impact the existing performance.
> 
> And the real usage case is that we only need this API before the lock. I don't
> want to make it complex.
> 
> So, based on this, we add the comment in the function:
>   The caller shall not call this function for the number of arrived CPU after lock door
>   in SMI since the value has been returned in the parameter of LockDoor().
> 
> See below:
> 
> /**
>   Get current number of arrived CPU in SMI.
> 
>   BSP might need to know the current number of arrived CPU in SMI to make
>   sure all APs are in SMI. This API can be for that purpose.
> 
>   The caller shall not call this function for the number of arrived CPU after lock door
>   in SMI since the value has been returned in the parameter of LockDoor().
> 
>   If Context is NULL, then ASSERT().
> 
>   @param[in]      Context     Pointer to the SMM CPU Sync context object.
> 
>   @retval    Current number of arrived CPU in SMI.
> 
> **/
> UINTN
> EFIAPI
> SmmCpuSyncGetArrivedCpuCount (
>   IN  SMM_CPU_SYNC_CONTEXT  *Context
>   )
> {
>   ASSERT (Context != NULL);
> 
>   return *Context->CpuCount;
> }
> 
> Thanks,
> Jiaxin
> 






^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-12-15  6:44 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-06 10:01 [edk2-devel] [PATCH v3 0/6] Refine SMM CPU Sync flow and abstract SmmCpuSyncLib Wu, Jiaxin
2023-12-06 10:01 ` [edk2-devel] [PATCH v3 1/6] UefiCpuPkg/PiSmmCpuDxeSmm: Optimize Semaphore Sync between BSP and AP Wu, Jiaxin
2023-12-12 19:27   ` Laszlo Ersek
2023-12-06 10:01 ` [edk2-devel] [PATCH v3 2/6] UefiCpuPkg: Adds SmmCpuSyncLib library class Wu, Jiaxin
2023-12-07  9:07   ` Ni, Ray
2023-12-12 20:18   ` Laszlo Ersek
2023-12-13  4:23     ` Wu, Jiaxin
2023-12-13 15:02       ` Laszlo Ersek
2023-12-06 10:01 ` [edk2-devel] [PATCH v3 3/6] UefiCpuPkg: Implements SmmCpuSyncLib library instance Wu, Jiaxin
2023-12-13 14:34   ` Laszlo Ersek
2023-12-14 11:11     ` Wu, Jiaxin
2023-12-14 13:48       ` Laszlo Ersek
2023-12-14 15:34         ` Wu, Jiaxin
2023-12-14 15:54         ` Wu, Jiaxin
2023-12-15  6:41         ` Wu, Jiaxin
2023-12-15  6:44         ` Wu, Jiaxin
2023-12-06 10:01 ` [edk2-devel] [PATCH v3 4/6] OvmfPkg: Specifies SmmCpuSyncLib instance Wu, Jiaxin
2023-12-13 16:52   ` Laszlo Ersek
2023-12-14 13:43     ` Wu, Jiaxin
2023-12-15  0:21     ` Ni, Ray
2023-12-06 10:01 ` [edk2-devel] [PATCH v3 5/6] UefiPayloadPkg: " Wu, Jiaxin
2023-12-06 10:01 ` [edk2-devel] [PATCH v3 6/6] UefiCpuPkg/PiSmmCpuDxeSmm: Consume SmmCpuSyncLib Wu, Jiaxin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox