public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* [PATCH v2 0/3] UefiCpuPkg, OvmfPkg: set maximum (V)CPU count dynamically
@ 2016-11-24 20:56 Laszlo Ersek
  2016-11-24 20:56 ` [PATCH v2 1/3] UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic PcdCpuMaxLogicalProcessorNumber Laszlo Ersek
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Laszlo Ersek @ 2016-11-24 20:56 UTC (permalink / raw)
  To: edk2-devel-01; +Cc: Igor Mammedov, Jeff Fan, Jordan Justen, Michael Kinney

This is version 2 of the series at
<https://lists.01.org/pipermail/edk2-devel/2016-November/005073.html>.

Changes in this version:

- Patches #1 and #2 from v1 have been committed.

- Patches #3 and #4 from v1 have been reworked based on Jeff's advice.
  Namely, we now set PcdCpuMaxLogicalProcessorNumber and
  PcdCpuApInitTimeOutInMicroSeconds dynamically, rather than adding a
  new PCD. The method of waiting has been changed too, we now wait for
  the CPUs and the timeout simultaneously.

- Making PcdCpuMaxLogicalProcessorNumber actually dynamic required
  changes for PiSmmCpuDxeSmm too, so a new patch has been inserted at
  the front.

Repo:   https://github.com/lersek/edk2/
Branch: mpinit_v2

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>

Thanks
Laszlo

Laszlo Ersek (3):
  UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic
    PcdCpuMaxLogicalProcessorNumber
  UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP
    startup
  OvmfPkg/PlatformPei: take VCPU count from QEMU and configure MpInitLib

 OvmfPkg/OvmfPkgIa32.dsc                    |  4 ++
 OvmfPkg/OvmfPkgIa32X64.dsc                 |  4 ++
 OvmfPkg/OvmfPkgX64.dsc                     |  4 ++
 OvmfPkg/PlatformPei/MemDetect.c            |  2 +-
 OvmfPkg/PlatformPei/Platform.c             | 43 +++++++++++++
 OvmfPkg/PlatformPei/Platform.h             |  2 +
 OvmfPkg/PlatformPei/PlatformPei.inf        |  1 +
 UefiCpuPkg/Library/MpInitLib/MpLib.c       | 67 +++++++++++++++++++-
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c |  2 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c     | 18 +++---
 UefiCpuPkg/UefiCpuPkg.dec                  |  3 +-
 11 files changed, 137 insertions(+), 13 deletions(-)

-- 
2.9.2



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/3] UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic PcdCpuMaxLogicalProcessorNumber
  2016-11-24 20:56 [PATCH v2 0/3] UefiCpuPkg, OvmfPkg: set maximum (V)CPU count dynamically Laszlo Ersek
@ 2016-11-24 20:56 ` Laszlo Ersek
  2016-11-28  2:15   ` Fan, Jeff
  2016-11-24 20:56 ` [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup Laszlo Ersek
  2016-11-24 20:56 ` [PATCH v2 3/3] OvmfPkg/PlatformPei: take VCPU count from QEMU and configure MpInitLib Laszlo Ersek
  2 siblings, 1 reply; 9+ messages in thread
From: Laszlo Ersek @ 2016-11-24 20:56 UTC (permalink / raw)
  To: edk2-devel-01; +Cc: Igor Mammedov, Jeff Fan, Jordan Justen, Michael Kinney

"UefiCpuPkg/UefiCpuPkg.dec" already allows platforms to make
PcdCpuMaxLogicalProcessorNumber dynamic, however PiSmmCpuDxeSmm does not
take this into account everywhere. As soon as a platform turns the PCD
into a dynamic one, at least S3 fails. When the PCD is dynamic, all
PcdGet() calls translate into PCD DXE protocol calls, which are only
permitted at boot time, not at runtime or during S3 resume.

We already have a variable called mMaxNumberOfCpus; it is initialized in
the entry point function like this:

> //
> // If support CPU hot plug, we need to allocate resources for possibly
> // hot-added processors
> //
> if (FeaturePcdGet (PcdCpuHotPlugSupport)) {
>   mMaxNumberOfCpus = PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
> } else {
>   mMaxNumberOfCpus = mNumberOfCpus;
> }

There's another use of the PCD a bit higher up, also in the entry point
function:

> //
> // Use MP Services Protocol to retrieve the number of processors and
> // number of enabled processors
> //
> Status = MpServices->GetNumberOfProcessors (MpServices, &mNumberOfCpus,
>                        &NumberOfEnabledProcessors);
> ASSERT_EFI_ERROR (Status);
> ASSERT (mNumberOfCpus <= PcdGet32 (PcdCpuMaxLogicalProcessorNumber));

Preserve these calls in the entry point function, and replace all other
uses of PcdCpuMaxLogicalProcessorNumber -- there are only reads -- with
mMaxNumberOfCpus.

For PcdCpuHotPlugSupport==TRUE, this is an unobservable change.

For PcdCpuHotPlugSupport==FALSE, we even save SMRAM, because we no longer
allocate resources needlessly for CPUs that can never appear in the
system.

PcdCpuMaxLogicalProcessorNumber is also retrieved in
"UefiCpuPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c",  but only in
the library instance constructor, which runs even before the entry point
function is called.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=116
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c |  2 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c     | 18 +++++++++---------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
index 4baba1e9f8dc..f957de1f4764 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
@@ -344,7 +344,7 @@ SmmInitHandler (
   AsmWriteIdtr (&gcSmiIdtr);
   ApicId = GetApicId ();
 
-  ASSERT (mNumberOfCpus <= PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+  ASSERT (mNumberOfCpus <= mMaxNumberOfCpus);
 
   for (Index = 0; Index < mNumberOfCpus; Index++) {
     if (ApicId == (UINT32)gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId) {
diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c b/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c
index c1a48d100e0f..f53819ee24c2 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c
@@ -139,7 +139,7 @@ GetCpuIndex (
 
   ApicId = GetApicId ();
 
-  for (Index = 0; Index < PcdGet32 (PcdCpuMaxLogicalProcessorNumber); Index++) {
+  for (Index = 0; Index < mMaxNumberOfCpus; Index++) {
     if (gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId == ApicId) {
       return Index;
     }
@@ -825,13 +825,13 @@ InitSmmProfileInternal (
   UINTN                      MsrDsAreaSizePerCpu;
   UINTN                      TotalSize;
 
-  mPFEntryCount = (UINTN *)AllocateZeroPool (sizeof (UINTN) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+  mPFEntryCount = (UINTN *)AllocateZeroPool (sizeof (UINTN) * mMaxNumberOfCpus);
   ASSERT (mPFEntryCount != NULL);
   mLastPFEntryValue = (UINT64  (*)[MAX_PF_ENTRY_COUNT])AllocateZeroPool (
-                                                         sizeof (mLastPFEntryValue[0]) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+                                                         sizeof (mLastPFEntryValue[0]) * mMaxNumberOfCpus);
   ASSERT (mLastPFEntryValue != NULL);
   mLastPFEntryPointer = (UINT64 *(*)[MAX_PF_ENTRY_COUNT])AllocateZeroPool (
-                                                           sizeof (mLastPFEntryPointer[0]) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+                                                           sizeof (mLastPFEntryPointer[0]) * mMaxNumberOfCpus);
   ASSERT (mLastPFEntryPointer != NULL);
 
   //
@@ -872,17 +872,17 @@ InitSmmProfileInternal (
   mSmmProfileBase->NumCpus        = gSmmCpuPrivate->SmmCoreEntryContext.NumberOfCpus;
 
   if (mBtsSupported) {
-    mMsrDsArea = (MSR_DS_AREA_STRUCT **)AllocateZeroPool (sizeof (MSR_DS_AREA_STRUCT *) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+    mMsrDsArea = (MSR_DS_AREA_STRUCT **)AllocateZeroPool (sizeof (MSR_DS_AREA_STRUCT *) * mMaxNumberOfCpus);
     ASSERT (mMsrDsArea != NULL);
-    mMsrBTSRecord = (BRANCH_TRACE_RECORD **)AllocateZeroPool (sizeof (BRANCH_TRACE_RECORD *) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+    mMsrBTSRecord = (BRANCH_TRACE_RECORD **)AllocateZeroPool (sizeof (BRANCH_TRACE_RECORD *) * mMaxNumberOfCpus);
     ASSERT (mMsrBTSRecord != NULL);
-    mMsrPEBSRecord = (PEBS_RECORD **)AllocateZeroPool (sizeof (PEBS_RECORD *) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+    mMsrPEBSRecord = (PEBS_RECORD **)AllocateZeroPool (sizeof (PEBS_RECORD *) * mMaxNumberOfCpus);
     ASSERT (mMsrPEBSRecord != NULL);
 
     mMsrDsAreaBase  = (MSR_DS_AREA_STRUCT *)((UINTN)Base + mSmmProfileSize);
-    MsrDsAreaSizePerCpu = mMsrDsAreaSize / PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
+    MsrDsAreaSizePerCpu = mMsrDsAreaSize / mMaxNumberOfCpus;
     mBTSRecordNumber    = (MsrDsAreaSizePerCpu - sizeof(PEBS_RECORD) * PEBS_RECORD_NUMBER - sizeof(MSR_DS_AREA_STRUCT)) / sizeof(BRANCH_TRACE_RECORD);
-    for (Index = 0; Index < PcdGet32 (PcdCpuMaxLogicalProcessorNumber); Index++) {
+    for (Index = 0; Index < mMaxNumberOfCpus; Index++) {
       mMsrDsArea[Index]     = (MSR_DS_AREA_STRUCT *)((UINTN)mMsrDsAreaBase + MsrDsAreaSizePerCpu * Index);
       mMsrBTSRecord[Index]  = (BRANCH_TRACE_RECORD *)((UINTN)mMsrDsArea[Index] + sizeof(MSR_DS_AREA_STRUCT));
       mMsrPEBSRecord[Index] = (PEBS_RECORD *)((UINTN)mMsrDsArea[Index] + MsrDsAreaSizePerCpu - sizeof(PEBS_RECORD) * PEBS_RECORD_NUMBER);
-- 
2.9.2




^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup
  2016-11-24 20:56 [PATCH v2 0/3] UefiCpuPkg, OvmfPkg: set maximum (V)CPU count dynamically Laszlo Ersek
  2016-11-24 20:56 ` [PATCH v2 1/3] UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic PcdCpuMaxLogicalProcessorNumber Laszlo Ersek
@ 2016-11-24 20:56 ` Laszlo Ersek
  2016-11-28  5:37   ` Fan, Jeff
  2016-11-24 20:56 ` [PATCH v2 3/3] OvmfPkg/PlatformPei: take VCPU count from QEMU and configure MpInitLib Laszlo Ersek
  2 siblings, 1 reply; 9+ messages in thread
From: Laszlo Ersek @ 2016-11-24 20:56 UTC (permalink / raw)
  To: edk2-devel-01; +Cc: Igor Mammedov, Jeff Fan, Jordan Justen, Michael Kinney

Sometimes a platform knows exactly how many CPUs it has at boot. It should
be able to
- set PcdCpuMaxLogicalProcessorNumber dynamically to this number,
- set PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0),
- and expect that MpInitLib wait exactly as long as necessary for all APs
  to report in.

Other platforms should be able to continue setting a reasonably large
upper bound on supported CPUs, and waiting for a reasonable, fixed amount
of time for all APs to report in.

Add this functionality. The TimedWaitForApFinish() function will return
when all APs have reported in, or the timeout has expired -- whichever
happens first.

(Accessing these PCDs dynamically is safe. The PEI and DXE phase instances
of this library are restricted to PEIM and DXE_DRIVER client modules, thus
the PCD accesses cannot be linked into runtime code.)

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=116
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/UefiCpuPkg.dec            |  3 +-
 UefiCpuPkg/Library/MpInitLib/MpLib.c | 67 +++++++++++++++++++-
 2 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec
index ca560398bbef..8607ae8b7ae8 100644
--- a/UefiCpuPkg/UefiCpuPkg.dec
+++ b/UefiCpuPkg/UefiCpuPkg.dec
@@ -174,7 +174,8 @@ [PcdsFixedAtBuild, PcdsPatchableInModule, PcdsDynamic, PcdsDynamicEx]
   # @Prompt Configure max supported number of Logical Processors
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64|UINT32|0x00000002
   ## Specifies timeout value in microseconds for the BSP to detect all APs for the first time.
-  # @Prompt Timeout for the BSP to detect all APs for the first time.
+  #  When set to zero, the BSP will wait forever for all (PcdCpuMaxLogicalProcessorNumber-1) APs to report in.
+  # @Prompt Timeout for the BSP to detect all APs for the first time (zero means await maximum supported AP count).
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000|UINT32|0x00000004
   ## Specifies the base address of the first microcode Patch in the microcode Region.
   # @Prompt Microcode Region base address.
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 15dbfa1e7d6c..76de92fa552a 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -684,6 +684,22 @@ FillExchangeInfoData (
 }
 
 /**
+  Helper function that waits until the finished AP count reaches the specified
+  limit, or the specified timeout elapses (whichever comes first).
+
+  @param[in] CpuMpData        Pointer to CPU MP Data.
+  @param[in] FinishedApLimit  The number of finished APs to wait for.
+  @param[in] TimeLimit        The number of microseconds to wait for. Zero
+                              means infinity.
+**/
+VOID
+TimedWaitForApFinish (
+  IN CPU_MP_DATA               *CpuMpData,
+  IN UINT32                    FinishedApLimit,
+  IN UINT32                    TimeLimit
+  );
+
+/**
   This function will be called by BSP to wakeup AP.
 
   @param[in] CpuMpData          Pointer to CPU MP Data
@@ -748,7 +764,11 @@ WakeUpAP (
       //
       // Wait for all potential APs waken up in one specified period
       //
-      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
+      TimedWaitForApFinish (
+        CpuMpData,
+        PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
+        PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
+        );
     } else {
       //
       // Wait all APs waken up if this is not the 1st broadcast of SIPI
@@ -895,6 +915,51 @@ CheckTimeout (
 }
 
 /**
+  Helper function that waits until the finished AP count reaches the specified
+  limit, or the specified timeout elapses (whichever comes first).
+
+  @param[in] CpuMpData        Pointer to CPU MP Data.
+  @param[in] FinishedApLimit  The number of finished APs to wait for.
+  @param[in] TimeLimit        The number of microseconds to wait for. Zero
+                              means infinity.
+**/
+VOID
+TimedWaitForApFinish (
+  IN CPU_MP_DATA               *CpuMpData,
+  IN UINT32                    FinishedApLimit,
+  IN UINT32                    TimeLimit
+  )
+{
+  CpuMpData->TotalTime = 0;
+  CpuMpData->ExpectedTime = CalculateTimeout (
+                              TimeLimit,
+                              &CpuMpData->CurrentTime
+                              );
+  while (CpuMpData->FinishedCount < FinishedApLimit &&
+         !CheckTimeout (
+            &CpuMpData->CurrentTime,
+            &CpuMpData->TotalTime,
+            CpuMpData->ExpectedTime
+            )) {
+    CpuPause ();
+  }
+
+  if (CpuMpData->FinishedCount >= FinishedApLimit) {
+    DEBUG ((
+      DEBUG_VERBOSE,
+      "%a: reached FinishedApLimit=%u in %Lu microseconds\n",
+      __FUNCTION__,
+      FinishedApLimit,
+      DivU64x64Remainder (
+        MultU64x32 (CpuMpData->TotalTime, 1000000),
+        GetPerformanceCounterProperties (NULL, NULL),
+        NULL
+        )
+      ));
+  }
+}
+
+/**
   Reset an AP to Idle state.
 
   Any task being executed by the AP will be aborted and the AP
-- 
2.9.2




^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 3/3] OvmfPkg/PlatformPei: take VCPU count from QEMU and configure MpInitLib
  2016-11-24 20:56 [PATCH v2 0/3] UefiCpuPkg, OvmfPkg: set maximum (V)CPU count dynamically Laszlo Ersek
  2016-11-24 20:56 ` [PATCH v2 1/3] UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic PcdCpuMaxLogicalProcessorNumber Laszlo Ersek
  2016-11-24 20:56 ` [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup Laszlo Ersek
@ 2016-11-24 20:56 ` Laszlo Ersek
  2 siblings, 0 replies; 9+ messages in thread
From: Laszlo Ersek @ 2016-11-24 20:56 UTC (permalink / raw)
  To: edk2-devel-01; +Cc: Igor Mammedov, Jeff Fan, Jordan Justen, Michael Kinney

These settings will allow CpuMpPei and CpuDxe to wait for the initial AP
check-ins exactly as long as necessary.

It is safe to set PcdCpuMaxLogicalProcessorNumber and
PcdCpuApInitTimeOutInMicroSeconds in OvmfPkg/PlatformPei.
OvmfPkg/PlatformPei installs the permanent PEI RAM, producing
gEfiPeiMemoryDiscoveredPpiGuid, and UefiCpuPkg/CpuMpPei has a depex on
gEfiPeiMemoryDiscoveredPpiGuid.

It is safe to read the fw_cfg item QemuFwCfgItemSmpCpuCount (0x0005). It
was added to QEMU in 2008 as key FW_CFG_NB_CPUS, in commit 905fdcb5264c
("Add common keys to firmware configuration"). Even if the key is
unavailable (or if fw_cfg is entirely unavailable, for example on Xen),
QemuFwCfgRead16() will return 0, and then we stick with the current
behavior.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 OvmfPkg/OvmfPkgIa32.dsc             |  4 ++
 OvmfPkg/OvmfPkgIa32X64.dsc          |  4 ++
 OvmfPkg/OvmfPkgX64.dsc              |  4 ++
 OvmfPkg/PlatformPei/PlatformPei.inf |  1 +
 OvmfPkg/PlatformPei/Platform.h      |  2 +
 OvmfPkg/PlatformPei/MemDetect.c     |  2 +-
 OvmfPkg/PlatformPei/Platform.c      | 43 ++++++++++++++++++++
 7 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/OvmfPkg/OvmfPkgIa32.dsc b/OvmfPkg/OvmfPkgIa32.dsc
index ed43c4514491..d91303052263 100644
--- a/OvmfPkg/OvmfPkgIa32.dsc
+++ b/OvmfPkg/OvmfPkgIa32.dsc
@@ -488,6 +488,10 @@ [PcdsDynamicDefault]
   gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE
   gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE
 
+  # UefiCpuPkg PCDs related to initial AP bringup and general AP management.
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000
+
 ################################################################################
 #
 # Components Section - list of all EDK II Modules needed by this Platform.
diff --git a/OvmfPkg/OvmfPkgIa32X64.dsc b/OvmfPkg/OvmfPkgIa32X64.dsc
index ccd156d9231a..8143ea9c2837 100644
--- a/OvmfPkg/OvmfPkgIa32X64.dsc
+++ b/OvmfPkg/OvmfPkgIa32X64.dsc
@@ -496,6 +496,10 @@ [PcdsDynamicDefault]
   gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE
   gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE
 
+  # UefiCpuPkg PCDs related to initial AP bringup and general AP management.
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000
+
 ################################################################################
 #
 # Components Section - list of all EDK II Modules needed by this Platform.
diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
index 012ce85462c5..d48d603d33d4 100644
--- a/OvmfPkg/OvmfPkgX64.dsc
+++ b/OvmfPkg/OvmfPkgX64.dsc
@@ -495,6 +495,10 @@ [PcdsDynamicDefault]
   gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE
   gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE
 
+  # UefiCpuPkg PCDs related to initial AP bringup and general AP management.
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000
+
 ################################################################################
 #
 # Components Section - list of all EDK II Modules needed by this Platform.
diff --git a/OvmfPkg/PlatformPei/PlatformPei.inf b/OvmfPkg/PlatformPei/PlatformPei.inf
index 776a4ab11f79..fbaed3182dcf 100644
--- a/OvmfPkg/PlatformPei/PlatformPei.inf
+++ b/OvmfPkg/PlatformPei/PlatformPei.inf
@@ -95,6 +95,7 @@ [Pcd]
   gEfiMdeModulePkgTokenSpaceGuid.PcdAcpiS3Enable
   gUefiCpuPkgTokenSpaceGuid.PcdCpuLocalApicBaseAddress
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize
 
 [FixedPcd]
diff --git a/OvmfPkg/PlatformPei/Platform.h b/OvmfPkg/PlatformPei/Platform.h
index eda765be30de..18f42c3f0ea8 100644
--- a/OvmfPkg/PlatformPei/Platform.h
+++ b/OvmfPkg/PlatformPei/Platform.h
@@ -101,4 +101,6 @@ extern BOOLEAN mS3Supported;
 
 extern UINT8 mPhysMemAddressWidth;
 
+extern UINT32 mMaxCpuCount;
+
 #endif // _PLATFORM_PEI_H_INCLUDED_
diff --git a/OvmfPkg/PlatformPei/MemDetect.c b/OvmfPkg/PlatformPei/MemDetect.c
index d00a570d4381..af96a04d194a 100644
--- a/OvmfPkg/PlatformPei/MemDetect.c
+++ b/OvmfPkg/PlatformPei/MemDetect.c
@@ -358,7 +358,7 @@ PublishPeiMemory (
   //
   if (mS3Supported) {
     mS3AcpiReservedMemorySize = SIZE_512KB +
-      PcdGet32 (PcdCpuMaxLogicalProcessorNumber) *
+      mMaxCpuCount *
       PcdGet32 (PcdCpuApStackSize);
     mS3AcpiReservedMemoryBase = LowerMemorySize - mS3AcpiReservedMemorySize;
     LowerMemorySize = mS3AcpiReservedMemoryBase;
diff --git a/OvmfPkg/PlatformPei/Platform.c b/OvmfPkg/PlatformPei/Platform.c
index c6e1106c9ed0..8522fe5d831d 100644
--- a/OvmfPkg/PlatformPei/Platform.c
+++ b/OvmfPkg/PlatformPei/Platform.c
@@ -68,6 +68,7 @@ EFI_BOOT_MODE mBootMode = BOOT_WITH_FULL_CONFIGURATION;
 
 BOOLEAN mS3Supported = FALSE;
 
+UINT32 mMaxCpuCount;
 
 VOID
 AddIoMemoryBaseSizeHob (
@@ -568,6 +569,47 @@ S3Verification (
 
 
 /**
+  Fetch the number of boot CPUs from QEMU and expose it to UefiCpuPkg modules.
+  Set the mMaxCpuCount variable.
+**/
+VOID
+MaxCpuCountInitialization (
+  VOID
+  )
+{
+  UINT16        ProcessorCount;
+  RETURN_STATUS PcdStatus;
+
+  QemuFwCfgSelectItem (QemuFwCfgItemSmpCpuCount);
+  ProcessorCount = QemuFwCfgRead16 ();
+  //
+  // If the fw_cfg key or fw_cfg entirely is unavailable, load mMaxCpuCount
+  // from the PCD default. No change to PCDs.
+  //
+  if (ProcessorCount == 0) {
+    mMaxCpuCount = PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
+    return;
+  }
+  //
+  // Otherwise, set mMaxCpuCount to the value reported by QEMU.
+  //
+  mMaxCpuCount = ProcessorCount;
+  //
+  // Additionally, tell UefiCpuPkg modules (a) the exact number of VCPUs, (b)
+  // to wait, in the initial AP bringup, exactly as long as it takes for all of
+  // the APs to report in. For this, we set an infinite timeout (represented by
+  // 0 microseconds).
+  //
+  PcdStatus = PcdSet32S (PcdCpuMaxLogicalProcessorNumber, ProcessorCount);
+  ASSERT_RETURN_ERROR (PcdStatus);
+  PcdStatus = PcdSet32S (PcdCpuApInitTimeOutInMicroSeconds, 0);
+  ASSERT_RETURN_ERROR (PcdStatus);
+  DEBUG ((DEBUG_INFO, "%a: QEMU reports %d processor(s)\n", __FUNCTION__,
+    ProcessorCount));
+}
+
+
+/**
   Perform Platform PEI initialization.
 
   @param  FileHandle      Handle of the file being invoked.
@@ -601,6 +643,7 @@ InitializePlatform (
   S3Verification ();
   BootModeInitialization ();
   AddressWidthInitialization ();
+  MaxCpuCountInitialization ();
 
   PublishPeiMemory ();
 
-- 
2.9.2



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/3] UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic PcdCpuMaxLogicalProcessorNumber
  2016-11-24 20:56 ` [PATCH v2 1/3] UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic PcdCpuMaxLogicalProcessorNumber Laszlo Ersek
@ 2016-11-28  2:15   ` Fan, Jeff
  2016-11-28 11:23     ` Laszlo Ersek
  0 siblings, 1 reply; 9+ messages in thread
From: Fan, Jeff @ 2016-11-28  2:15 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-01
  Cc: Igor Mammedov, Justen, Jordan L, Kinney, Michael D

Laszlo,

Even getting PcdCpuMaxLogicalProcessorNumber is legal in InitSmmProfileInternal() that is only invoked on boot time, I agree to do this updating in this patch.
We need to try to use mMaxNumberOfCpus as possible for size saving and performance improvement.

Reviewed-by: Jeff Fan <jeff.fan@intel.com>

Jeff

-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com] 
Sent: Friday, November 25, 2016 4:57 AM
To: edk2-devel-01
Cc: Igor Mammedov; Fan, Jeff; Justen, Jordan L; Kinney, Michael D
Subject: [PATCH v2 1/3] UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic PcdCpuMaxLogicalProcessorNumber

"UefiCpuPkg/UefiCpuPkg.dec" already allows platforms to make PcdCpuMaxLogicalProcessorNumber dynamic, however PiSmmCpuDxeSmm does not take this into account everywhere. As soon as a platform turns the PCD into a dynamic one, at least S3 fails. When the PCD is dynamic, all
PcdGet() calls translate into PCD DXE protocol calls, which are only permitted at boot time, not at runtime or during S3 resume.

We already have a variable called mMaxNumberOfCpus; it is initialized in the entry point function like this:

> //
> // If support CPU hot plug, we need to allocate resources for possibly 
> // hot-added processors // if (FeaturePcdGet (PcdCpuHotPlugSupport)) {
>   mMaxNumberOfCpus = PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
> } else {
>   mMaxNumberOfCpus = mNumberOfCpus;
> }

There's another use of the PCD a bit higher up, also in the entry point
function:

> //
> // Use MP Services Protocol to retrieve the number of processors and 
> // number of enabled processors // Status = 
> MpServices->GetNumberOfProcessors (MpServices, &mNumberOfCpus,
>                        &NumberOfEnabledProcessors); ASSERT_EFI_ERROR 
> (Status); ASSERT (mNumberOfCpus <= PcdGet32 
> (PcdCpuMaxLogicalProcessorNumber));

Preserve these calls in the entry point function, and replace all other uses of PcdCpuMaxLogicalProcessorNumber -- there are only reads -- with mMaxNumberOfCpus.

For PcdCpuHotPlugSupport==TRUE, this is an unobservable change.

For PcdCpuHotPlugSupport==FALSE, we even save SMRAM, because we no longer allocate resources needlessly for CPUs that can never appear in the system.

PcdCpuMaxLogicalProcessorNumber is also retrieved in "UefiCpuPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c",  but only in the library instance constructor, which runs even before the entry point function is called.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=116
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c |  2 +-
 UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c     | 18 +++++++++---------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
index 4baba1e9f8dc..f957de1f4764 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
@@ -344,7 +344,7 @@ SmmInitHandler (
   AsmWriteIdtr (&gcSmiIdtr);
   ApicId = GetApicId ();
 
-  ASSERT (mNumberOfCpus <= PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+  ASSERT (mNumberOfCpus <= mMaxNumberOfCpus);
 
   for (Index = 0; Index < mNumberOfCpus; Index++) {
     if (ApicId == (UINT32)gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId) { diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c b/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c
index c1a48d100e0f..f53819ee24c2 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c
@@ -139,7 +139,7 @@ GetCpuIndex (
 
   ApicId = GetApicId ();
 
-  for (Index = 0; Index < PcdGet32 (PcdCpuMaxLogicalProcessorNumber); Index++) {
+  for (Index = 0; Index < mMaxNumberOfCpus; Index++) {
     if (gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId == ApicId) {
       return Index;
     }
@@ -825,13 +825,13 @@ InitSmmProfileInternal (
   UINTN                      MsrDsAreaSizePerCpu;
   UINTN                      TotalSize;
 
-  mPFEntryCount = (UINTN *)AllocateZeroPool (sizeof (UINTN) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+  mPFEntryCount = (UINTN *)AllocateZeroPool (sizeof (UINTN) * 
+ mMaxNumberOfCpus);
   ASSERT (mPFEntryCount != NULL);
   mLastPFEntryValue = (UINT64  (*)[MAX_PF_ENTRY_COUNT])AllocateZeroPool (
-                                                         sizeof (mLastPFEntryValue[0]) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+                                                         sizeof 
+ (mLastPFEntryValue[0]) * mMaxNumberOfCpus);
   ASSERT (mLastPFEntryValue != NULL);
   mLastPFEntryPointer = (UINT64 *(*)[MAX_PF_ENTRY_COUNT])AllocateZeroPool (
-                                                           sizeof (mLastPFEntryPointer[0]) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+                                                           sizeof 
+ (mLastPFEntryPointer[0]) * mMaxNumberOfCpus);
   ASSERT (mLastPFEntryPointer != NULL);
 
   //
@@ -872,17 +872,17 @@ InitSmmProfileInternal (
   mSmmProfileBase->NumCpus        = gSmmCpuPrivate->SmmCoreEntryContext.NumberOfCpus;
 
   if (mBtsSupported) {
-    mMsrDsArea = (MSR_DS_AREA_STRUCT **)AllocateZeroPool (sizeof (MSR_DS_AREA_STRUCT *) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+    mMsrDsArea = (MSR_DS_AREA_STRUCT **)AllocateZeroPool (sizeof 
+ (MSR_DS_AREA_STRUCT *) * mMaxNumberOfCpus);
     ASSERT (mMsrDsArea != NULL);
-    mMsrBTSRecord = (BRANCH_TRACE_RECORD **)AllocateZeroPool (sizeof (BRANCH_TRACE_RECORD *) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+    mMsrBTSRecord = (BRANCH_TRACE_RECORD **)AllocateZeroPool (sizeof 
+ (BRANCH_TRACE_RECORD *) * mMaxNumberOfCpus);
     ASSERT (mMsrBTSRecord != NULL);
-    mMsrPEBSRecord = (PEBS_RECORD **)AllocateZeroPool (sizeof (PEBS_RECORD *) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
+    mMsrPEBSRecord = (PEBS_RECORD **)AllocateZeroPool (sizeof 
+ (PEBS_RECORD *) * mMaxNumberOfCpus);
     ASSERT (mMsrPEBSRecord != NULL);
 
     mMsrDsAreaBase  = (MSR_DS_AREA_STRUCT *)((UINTN)Base + mSmmProfileSize);
-    MsrDsAreaSizePerCpu = mMsrDsAreaSize / PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
+    MsrDsAreaSizePerCpu = mMsrDsAreaSize / mMaxNumberOfCpus;
     mBTSRecordNumber    = (MsrDsAreaSizePerCpu - sizeof(PEBS_RECORD) * PEBS_RECORD_NUMBER - sizeof(MSR_DS_AREA_STRUCT)) / sizeof(BRANCH_TRACE_RECORD);
-    for (Index = 0; Index < PcdGet32 (PcdCpuMaxLogicalProcessorNumber); Index++) {
+    for (Index = 0; Index < mMaxNumberOfCpus; Index++) {
       mMsrDsArea[Index]     = (MSR_DS_AREA_STRUCT *)((UINTN)mMsrDsAreaBase + MsrDsAreaSizePerCpu * Index);
       mMsrBTSRecord[Index]  = (BRANCH_TRACE_RECORD *)((UINTN)mMsrDsArea[Index] + sizeof(MSR_DS_AREA_STRUCT));
       mMsrPEBSRecord[Index] = (PEBS_RECORD *)((UINTN)mMsrDsArea[Index] + MsrDsAreaSizePerCpu - sizeof(PEBS_RECORD) * PEBS_RECORD_NUMBER);
--
2.9.2




^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup
  2016-11-24 20:56 ` [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup Laszlo Ersek
@ 2016-11-28  5:37   ` Fan, Jeff
  2016-11-28 10:15     ` Laszlo Ersek
  0 siblings, 1 reply; 9+ messages in thread
From: Fan, Jeff @ 2016-11-28  5:37 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-01; +Cc: Kinney, Michael D, Justen, Jordan L

Laszlo,

Setting PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0) is behavior changing. 

Even there is no any multi-processor supporting platform set this PCD to zero, this PCD maybe set to 0 on single processor platform.

I prefer to setting PcdCpuApInitTimeOutInMicroSeconds to 0xFFFFFFFF (about 1.1 hour) for such purpose.

Thanks!
Jeff

-----Original Message-----
From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
Sent: Friday, November 25, 2016 4:57 AM
To: edk2-devel-01
Cc: Kinney, Michael D; Fan, Jeff; Justen, Jordan L
Subject: [edk2] [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup

Sometimes a platform knows exactly how many CPUs it has at boot. It should be able to
- set PcdCpuMaxLogicalProcessorNumber dynamically to this number,
- set PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0),
- and expect that MpInitLib wait exactly as long as necessary for all APs
  to report in.

Other platforms should be able to continue setting a reasonably large upper bound on supported CPUs, and waiting for a reasonable, fixed amount of time for all APs to report in.

Add this functionality. The TimedWaitForApFinish() function will return when all APs have reported in, or the timeout has expired -- whichever happens first.

(Accessing these PCDs dynamically is safe. The PEI and DXE phase instances of this library are restricted to PEIM and DXE_DRIVER client modules, thus the PCD accesses cannot be linked into runtime code.)

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=116
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/UefiCpuPkg.dec            |  3 +-
 UefiCpuPkg/Library/MpInitLib/MpLib.c | 67 +++++++++++++++++++-
 2 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec index ca560398bbef..8607ae8b7ae8 100644
--- a/UefiCpuPkg/UefiCpuPkg.dec
+++ b/UefiCpuPkg/UefiCpuPkg.dec
@@ -174,7 +174,8 @@ [PcdsFixedAtBuild, PcdsPatchableInModule, PcdsDynamic, PcdsDynamicEx]
   # @Prompt Configure max supported number of Logical Processors
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64|UINT32|0x00000002
   ## Specifies timeout value in microseconds for the BSP to detect all APs for the first time.
-  # @Prompt Timeout for the BSP to detect all APs for the first time.
+  #  When set to zero, the BSP will wait forever for all (PcdCpuMaxLogicalProcessorNumber-1) APs to report in.
+  # @Prompt Timeout for the BSP to detect all APs for the first time (zero means await maximum supported AP count).
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000|UINT32|0x00000004
   ## Specifies the base address of the first microcode Patch in the microcode Region.
   # @Prompt Microcode Region base address.
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 15dbfa1e7d6c..76de92fa552a 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -684,6 +684,22 @@ FillExchangeInfoData (  }
 
 /**
+  Helper function that waits until the finished AP count reaches the 
+ specified  limit, or the specified timeout elapses (whichever comes first).
+
+  @param[in] CpuMpData        Pointer to CPU MP Data.
+  @param[in] FinishedApLimit  The number of finished APs to wait for.
+  @param[in] TimeLimit        The number of microseconds to wait for. Zero
+                              means infinity.
+**/
+VOID
+TimedWaitForApFinish (
+  IN CPU_MP_DATA               *CpuMpData,
+  IN UINT32                    FinishedApLimit,
+  IN UINT32                    TimeLimit
+  );
+
+/**
   This function will be called by BSP to wakeup AP.
 
   @param[in] CpuMpData          Pointer to CPU MP Data
@@ -748,7 +764,11 @@ WakeUpAP (
       //
       // Wait for all potential APs waken up in one specified period
       //
-      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
+      TimedWaitForApFinish (
+        CpuMpData,
+        PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
+        PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
+        );
     } else {
       //
       // Wait all APs waken up if this is not the 1st broadcast of SIPI @@ -895,6 +915,51 @@ CheckTimeout (  }
 
 /**
+  Helper function that waits until the finished AP count reaches the 
+ specified  limit, or the specified timeout elapses (whichever comes first).
+
+  @param[in] CpuMpData        Pointer to CPU MP Data.
+  @param[in] FinishedApLimit  The number of finished APs to wait for.
+  @param[in] TimeLimit        The number of microseconds to wait for. Zero
+                              means infinity.
+**/
+VOID
+TimedWaitForApFinish (
+  IN CPU_MP_DATA               *CpuMpData,
+  IN UINT32                    FinishedApLimit,
+  IN UINT32                    TimeLimit
+  )
+{
+  CpuMpData->TotalTime = 0;
+  CpuMpData->ExpectedTime = CalculateTimeout (
+                              TimeLimit,
+                              &CpuMpData->CurrentTime
+                              );
+  while (CpuMpData->FinishedCount < FinishedApLimit &&
+         !CheckTimeout (
+            &CpuMpData->CurrentTime,
+            &CpuMpData->TotalTime,
+            CpuMpData->ExpectedTime
+            )) {
+    CpuPause ();
+  }
+
+  if (CpuMpData->FinishedCount >= FinishedApLimit) {
+    DEBUG ((
+      DEBUG_VERBOSE,
+      "%a: reached FinishedApLimit=%u in %Lu microseconds\n",
+      __FUNCTION__,
+      FinishedApLimit,
+      DivU64x64Remainder (
+        MultU64x32 (CpuMpData->TotalTime, 1000000),
+        GetPerformanceCounterProperties (NULL, NULL),
+        NULL
+        )
+      ));
+  }
+}
+
+/**
   Reset an AP to Idle state.
 
   Any task being executed by the AP will be aborted and the AP
--
2.9.2


_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup
  2016-11-28  5:37   ` Fan, Jeff
@ 2016-11-28 10:15     ` Laszlo Ersek
  2016-11-28 11:58       ` Igor Mammedov
  0 siblings, 1 reply; 9+ messages in thread
From: Laszlo Ersek @ 2016-11-28 10:15 UTC (permalink / raw)
  To: Fan, Jeff, edk2-devel-01; +Cc: Kinney, Michael D, Justen, Jordan L

On 11/28/16 06:37, Fan, Jeff wrote:
> Laszlo,
> 
> Setting PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0) is behavior changing. 
> 
> Even there is no any multi-processor supporting platform set this PCD to zero, this PCD maybe set to 0 on single processor platform.

On a uniprocessor platform, PcdCpuApInitTimeOutInMicroSeconds makes no
difference, since the code that uses the PCD is not reached, due to
commit 14e8137c8223 ("UefiCpuPkg/MpInitLib: Do not wakeup AP if only one
processor supported").

> I prefer to setting PcdCpuApInitTimeOutInMicroSeconds to 0xFFFFFFFF (about 1.1 hour) for such purpose.

In practice that is good enough for OVMF too, I guess.

However, I'm unsure how you would like me to update the patch. The patch
does not add any specific code for handling the 0 timeout as infinity.
Instead, it simply documents the existing behavior of CalculateTimeout()
and CheckTimeout(). Those helper functions are just right for
implementing the feature at hand, and they already handle a zero timeout
as infinity.

In fact, my original approach to implement TimedWaitForApFinish() was to
add brand new code for the timer loop, using the performance counter.
However then I noticed CalculateTimeout() and CheckTimeout(), and
thought that you would ask me to implement TimedWaitForApFinish() with
those two functions. This way the code reuse is good, and a zero timeout
happens to mean infinity (which is just right for OVMF, is not used by
any physical multiprocessor platform, and is irrelevant for uniprocessor).

... Hm, wait, do you mean that a uniprocessor platform might leave
PcdCpuMaxLogicalProcessorNumber > 1 (for example, the default 64), and
set *only* the timeout to zero (because they "know" there aren't more CPUs)?

If that's what you mean, I consider it a platform bug: if they know
there's only one CPU, they should set PcdCpuMaxLogicalProcessorNumber to
1 -- for saving memory.

Anyway, if you prefer, I can modify TimedWaitForApFinish() like this:
add an explicit check for TimeLimit==0 in the beginning, and return
immediately if that condition evaluates to true. (Plus, update the
comments, and modify patch #3 to use MAX_UINT32 as TimeLinit.)

What do you think?

Thanks
Laszlo

> 
> Thanks!
> Jeff
> 
> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
> Sent: Friday, November 25, 2016 4:57 AM
> To: edk2-devel-01
> Cc: Kinney, Michael D; Fan, Jeff; Justen, Jordan L
> Subject: [edk2] [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup
> 
> Sometimes a platform knows exactly how many CPUs it has at boot. It should be able to
> - set PcdCpuMaxLogicalProcessorNumber dynamically to this number,
> - set PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0),
> - and expect that MpInitLib wait exactly as long as necessary for all APs
>   to report in.
> 
> Other platforms should be able to continue setting a reasonably large upper bound on supported CPUs, and waiting for a reasonable, fixed amount of time for all APs to report in.
> 
> Add this functionality. The TimedWaitForApFinish() function will return when all APs have reported in, or the timeout has expired -- whichever happens first.
> 
> (Accessing these PCDs dynamically is safe. The PEI and DXE phase instances of this library are restricted to PEIM and DXE_DRIVER client modules, thus the PCD accesses cannot be linked into runtime code.)
> 
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Jeff Fan <jeff.fan@intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Michael Kinney <michael.d.kinney@intel.com>
> Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=116
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  UefiCpuPkg/UefiCpuPkg.dec            |  3 +-
>  UefiCpuPkg/Library/MpInitLib/MpLib.c | 67 +++++++++++++++++++-
>  2 files changed, 68 insertions(+), 2 deletions(-)
> 
> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec index ca560398bbef..8607ae8b7ae8 100644
> --- a/UefiCpuPkg/UefiCpuPkg.dec
> +++ b/UefiCpuPkg/UefiCpuPkg.dec
> @@ -174,7 +174,8 @@ [PcdsFixedAtBuild, PcdsPatchableInModule, PcdsDynamic, PcdsDynamicEx]
>    # @Prompt Configure max supported number of Logical Processors
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64|UINT32|0x00000002
>    ## Specifies timeout value in microseconds for the BSP to detect all APs for the first time.
> -  # @Prompt Timeout for the BSP to detect all APs for the first time.
> +  #  When set to zero, the BSP will wait forever for all (PcdCpuMaxLogicalProcessorNumber-1) APs to report in.
> +  # @Prompt Timeout for the BSP to detect all APs for the first time (zero means await maximum supported AP count).
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000|UINT32|0x00000004
>    ## Specifies the base address of the first microcode Patch in the microcode Region.
>    # @Prompt Microcode Region base address.
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> index 15dbfa1e7d6c..76de92fa552a 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> @@ -684,6 +684,22 @@ FillExchangeInfoData (  }
>  
>  /**
> +  Helper function that waits until the finished AP count reaches the 
> + specified  limit, or the specified timeout elapses (whichever comes first).
> +
> +  @param[in] CpuMpData        Pointer to CPU MP Data.
> +  @param[in] FinishedApLimit  The number of finished APs to wait for.
> +  @param[in] TimeLimit        The number of microseconds to wait for. Zero
> +                              means infinity.
> +**/
> +VOID
> +TimedWaitForApFinish (
> +  IN CPU_MP_DATA               *CpuMpData,
> +  IN UINT32                    FinishedApLimit,
> +  IN UINT32                    TimeLimit
> +  );
> +
> +/**
>    This function will be called by BSP to wakeup AP.
>  
>    @param[in] CpuMpData          Pointer to CPU MP Data
> @@ -748,7 +764,11 @@ WakeUpAP (
>        //
>        // Wait for all potential APs waken up in one specified period
>        //
> -      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
> +      TimedWaitForApFinish (
> +        CpuMpData,
> +        PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
> +        PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
> +        );
>      } else {
>        //
>        // Wait all APs waken up if this is not the 1st broadcast of SIPI @@ -895,6 +915,51 @@ CheckTimeout (  }
>  
>  /**
> +  Helper function that waits until the finished AP count reaches the 
> + specified  limit, or the specified timeout elapses (whichever comes first).
> +
> +  @param[in] CpuMpData        Pointer to CPU MP Data.
> +  @param[in] FinishedApLimit  The number of finished APs to wait for.
> +  @param[in] TimeLimit        The number of microseconds to wait for. Zero
> +                              means infinity.
> +**/
> +VOID
> +TimedWaitForApFinish (
> +  IN CPU_MP_DATA               *CpuMpData,
> +  IN UINT32                    FinishedApLimit,
> +  IN UINT32                    TimeLimit
> +  )
> +{
> +  CpuMpData->TotalTime = 0;
> +  CpuMpData->ExpectedTime = CalculateTimeout (
> +                              TimeLimit,
> +                              &CpuMpData->CurrentTime
> +                              );
> +  while (CpuMpData->FinishedCount < FinishedApLimit &&
> +         !CheckTimeout (
> +            &CpuMpData->CurrentTime,
> +            &CpuMpData->TotalTime,
> +            CpuMpData->ExpectedTime
> +            )) {
> +    CpuPause ();
> +  }
> +
> +  if (CpuMpData->FinishedCount >= FinishedApLimit) {
> +    DEBUG ((
> +      DEBUG_VERBOSE,
> +      "%a: reached FinishedApLimit=%u in %Lu microseconds\n",
> +      __FUNCTION__,
> +      FinishedApLimit,
> +      DivU64x64Remainder (
> +        MultU64x32 (CpuMpData->TotalTime, 1000000),
> +        GetPerformanceCounterProperties (NULL, NULL),
> +        NULL
> +        )
> +      ));
> +  }
> +}
> +
> +/**
>    Reset an AP to Idle state.
>  
>    Any task being executed by the AP will be aborted and the AP
> --
> 2.9.2
> 
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/3] UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic PcdCpuMaxLogicalProcessorNumber
  2016-11-28  2:15   ` Fan, Jeff
@ 2016-11-28 11:23     ` Laszlo Ersek
  0 siblings, 0 replies; 9+ messages in thread
From: Laszlo Ersek @ 2016-11-28 11:23 UTC (permalink / raw)
  To: Fan, Jeff, edk2-devel-01; +Cc: Kinney, Michael D, Justen, Jordan L

On 11/28/16 03:15, Fan, Jeff wrote:
> Laszlo,
> 
> Even getting PcdCpuMaxLogicalProcessorNumber is legal in InitSmmProfileInternal() that is only invoked on boot time, I agree to do this updating in this patch.
> We need to try to use mMaxNumberOfCpus as possible for size saving and performance improvement.
> 
> Reviewed-by: Jeff Fan <jeff.fan@intel.com>

Thanks, I committed this patch (for PiSmmCpuDxeSmm) as bb767506b265.
I'll submit a new version for the rest.

Cheers
Laszlo

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com] 
> Sent: Friday, November 25, 2016 4:57 AM
> To: edk2-devel-01
> Cc: Igor Mammedov; Fan, Jeff; Justen, Jordan L; Kinney, Michael D
> Subject: [PATCH v2 1/3] UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic PcdCpuMaxLogicalProcessorNumber
> 
> "UefiCpuPkg/UefiCpuPkg.dec" already allows platforms to make PcdCpuMaxLogicalProcessorNumber dynamic, however PiSmmCpuDxeSmm does not take this into account everywhere. As soon as a platform turns the PCD into a dynamic one, at least S3 fails. When the PCD is dynamic, all
> PcdGet() calls translate into PCD DXE protocol calls, which are only permitted at boot time, not at runtime or during S3 resume.
> 
> We already have a variable called mMaxNumberOfCpus; it is initialized in the entry point function like this:
> 
>> //
>> // If support CPU hot plug, we need to allocate resources for possibly 
>> // hot-added processors // if (FeaturePcdGet (PcdCpuHotPlugSupport)) {
>>   mMaxNumberOfCpus = PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
>> } else {
>>   mMaxNumberOfCpus = mNumberOfCpus;
>> }
> 
> There's another use of the PCD a bit higher up, also in the entry point
> function:
> 
>> //
>> // Use MP Services Protocol to retrieve the number of processors and 
>> // number of enabled processors // Status = 
>> MpServices->GetNumberOfProcessors (MpServices, &mNumberOfCpus,
>>                        &NumberOfEnabledProcessors); ASSERT_EFI_ERROR 
>> (Status); ASSERT (mNumberOfCpus <= PcdGet32 
>> (PcdCpuMaxLogicalProcessorNumber));
> 
> Preserve these calls in the entry point function, and replace all other uses of PcdCpuMaxLogicalProcessorNumber -- there are only reads -- with mMaxNumberOfCpus.
> 
> For PcdCpuHotPlugSupport==TRUE, this is an unobservable change.
> 
> For PcdCpuHotPlugSupport==FALSE, we even save SMRAM, because we no longer allocate resources needlessly for CPUs that can never appear in the system.
> 
> PcdCpuMaxLogicalProcessorNumber is also retrieved in "UefiCpuPkg/Library/SmmCpuFeaturesLib/SmmCpuFeaturesLib.c",  but only in the library instance constructor, which runs even before the entry point function is called.
> 
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Jeff Fan <jeff.fan@intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Michael Kinney <michael.d.kinney@intel.com>
> Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=116
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c |  2 +-
>  UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c     | 18 +++++++++---------
>  2 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
> index 4baba1e9f8dc..f957de1f4764 100644
> --- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
> +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.c
> @@ -344,7 +344,7 @@ SmmInitHandler (
>    AsmWriteIdtr (&gcSmiIdtr);
>    ApicId = GetApicId ();
>  
> -  ASSERT (mNumberOfCpus <= PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
> +  ASSERT (mNumberOfCpus <= mMaxNumberOfCpus);
>  
>    for (Index = 0; Index < mNumberOfCpus; Index++) {
>      if (ApicId == (UINT32)gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId) { diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c b/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c
> index c1a48d100e0f..f53819ee24c2 100644
> --- a/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c
> +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/SmmProfile.c
> @@ -139,7 +139,7 @@ GetCpuIndex (
>  
>    ApicId = GetApicId ();
>  
> -  for (Index = 0; Index < PcdGet32 (PcdCpuMaxLogicalProcessorNumber); Index++) {
> +  for (Index = 0; Index < mMaxNumberOfCpus; Index++) {
>      if (gSmmCpuPrivate->ProcessorInfo[Index].ProcessorId == ApicId) {
>        return Index;
>      }
> @@ -825,13 +825,13 @@ InitSmmProfileInternal (
>    UINTN                      MsrDsAreaSizePerCpu;
>    UINTN                      TotalSize;
>  
> -  mPFEntryCount = (UINTN *)AllocateZeroPool (sizeof (UINTN) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
> +  mPFEntryCount = (UINTN *)AllocateZeroPool (sizeof (UINTN) * 
> + mMaxNumberOfCpus);
>    ASSERT (mPFEntryCount != NULL);
>    mLastPFEntryValue = (UINT64  (*)[MAX_PF_ENTRY_COUNT])AllocateZeroPool (
> -                                                         sizeof (mLastPFEntryValue[0]) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
> +                                                         sizeof 
> + (mLastPFEntryValue[0]) * mMaxNumberOfCpus);
>    ASSERT (mLastPFEntryValue != NULL);
>    mLastPFEntryPointer = (UINT64 *(*)[MAX_PF_ENTRY_COUNT])AllocateZeroPool (
> -                                                           sizeof (mLastPFEntryPointer[0]) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
> +                                                           sizeof 
> + (mLastPFEntryPointer[0]) * mMaxNumberOfCpus);
>    ASSERT (mLastPFEntryPointer != NULL);
>  
>    //
> @@ -872,17 +872,17 @@ InitSmmProfileInternal (
>    mSmmProfileBase->NumCpus        = gSmmCpuPrivate->SmmCoreEntryContext.NumberOfCpus;
>  
>    if (mBtsSupported) {
> -    mMsrDsArea = (MSR_DS_AREA_STRUCT **)AllocateZeroPool (sizeof (MSR_DS_AREA_STRUCT *) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
> +    mMsrDsArea = (MSR_DS_AREA_STRUCT **)AllocateZeroPool (sizeof 
> + (MSR_DS_AREA_STRUCT *) * mMaxNumberOfCpus);
>      ASSERT (mMsrDsArea != NULL);
> -    mMsrBTSRecord = (BRANCH_TRACE_RECORD **)AllocateZeroPool (sizeof (BRANCH_TRACE_RECORD *) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
> +    mMsrBTSRecord = (BRANCH_TRACE_RECORD **)AllocateZeroPool (sizeof 
> + (BRANCH_TRACE_RECORD *) * mMaxNumberOfCpus);
>      ASSERT (mMsrBTSRecord != NULL);
> -    mMsrPEBSRecord = (PEBS_RECORD **)AllocateZeroPool (sizeof (PEBS_RECORD *) * PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
> +    mMsrPEBSRecord = (PEBS_RECORD **)AllocateZeroPool (sizeof 
> + (PEBS_RECORD *) * mMaxNumberOfCpus);
>      ASSERT (mMsrPEBSRecord != NULL);
>  
>      mMsrDsAreaBase  = (MSR_DS_AREA_STRUCT *)((UINTN)Base + mSmmProfileSize);
> -    MsrDsAreaSizePerCpu = mMsrDsAreaSize / PcdGet32 (PcdCpuMaxLogicalProcessorNumber);
> +    MsrDsAreaSizePerCpu = mMsrDsAreaSize / mMaxNumberOfCpus;
>      mBTSRecordNumber    = (MsrDsAreaSizePerCpu - sizeof(PEBS_RECORD) * PEBS_RECORD_NUMBER - sizeof(MSR_DS_AREA_STRUCT)) / sizeof(BRANCH_TRACE_RECORD);
> -    for (Index = 0; Index < PcdGet32 (PcdCpuMaxLogicalProcessorNumber); Index++) {
> +    for (Index = 0; Index < mMaxNumberOfCpus; Index++) {
>        mMsrDsArea[Index]     = (MSR_DS_AREA_STRUCT *)((UINTN)mMsrDsAreaBase + MsrDsAreaSizePerCpu * Index);
>        mMsrBTSRecord[Index]  = (BRANCH_TRACE_RECORD *)((UINTN)mMsrDsArea[Index] + sizeof(MSR_DS_AREA_STRUCT));
>        mMsrPEBSRecord[Index] = (PEBS_RECORD *)((UINTN)mMsrDsArea[Index] + MsrDsAreaSizePerCpu - sizeof(PEBS_RECORD) * PEBS_RECORD_NUMBER);
> --
> 2.9.2
> 
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup
  2016-11-28 10:15     ` Laszlo Ersek
@ 2016-11-28 11:58       ` Igor Mammedov
  0 siblings, 0 replies; 9+ messages in thread
From: Igor Mammedov @ 2016-11-28 11:58 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Fan, Jeff, edk2-devel-01, Kinney, Michael D, Justen, Jordan L

On Mon, 28 Nov 2016 11:15:15 +0100
Laszlo Ersek <lersek@redhat.com> wrote:

> On 11/28/16 06:37, Fan, Jeff wrote:
> > Laszlo,
> > 
> > Setting PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0) is behavior changing. 
> > 
> > Even there is no any multi-processor supporting platform set this PCD to zero, this PCD maybe set to 0 on single processor platform.  
> 
> On a uniprocessor platform, PcdCpuApInitTimeOutInMicroSeconds makes no
> difference, since the code that uses the PCD is not reached, due to
> commit 14e8137c8223 ("UefiCpuPkg/MpInitLib: Do not wakeup AP if only one
> processor supported").
> 
> > I prefer to setting PcdCpuApInitTimeOutInMicroSeconds to 0xFFFFFFFF (about 1.1 hour) for such purpose.  
> 
> In practice that is good enough for OVMF too, I guess.
> 
> However, I'm unsure how you would like me to update the patch. The patch
> does not add any specific code for handling the 0 timeout as infinity.
> Instead, it simply documents the existing behavior of CalculateTimeout()
> and CheckTimeout(). Those helper functions are just right for
> implementing the feature at hand, and they already handle a zero timeout
> as infinity.
> 
> In fact, my original approach to implement TimedWaitForApFinish() was to
> add brand new code for the timer loop, using the performance counter.
> However then I noticed CalculateTimeout() and CheckTimeout(), and
> thought that you would ask me to implement TimedWaitForApFinish() with
> those two functions. This way the code reuse is good, and a zero timeout
> happens to mean infinity (which is just right for OVMF, is not used by
> any physical multiprocessor platform, and is irrelevant for uniprocessor).
> 
> ... Hm, wait, do you mean that a uniprocessor platform might leave
> PcdCpuMaxLogicalProcessorNumber > 1 (for example, the default 64), and
> set *only* the timeout to zero (because they "know" there aren't more CPUs)?
> 
> If that's what you mean, I consider it a platform bug: if they know
> there's only one CPU, they should set PcdCpuMaxLogicalProcessorNumber to
> 1 -- for saving memory.
In my understanding PcdCpuMaxLogicalProcessorNumber is max possible CPUs
number and not currently present CPUs which platform might not know about
in advance, so it sends broadcast INIT/SIPI to find out how many CPUs are
present upto PcdCpuMaxLogicalProcessorNumber limit.

It's just we are cannibalizing PcdCpuMaxLogicalProcessorNumber for present
CPUs number as it were suggested in v1 comments to make some progress
and fix OVMF crash when booting with more than 64 CPU
and optimize memory usage/boot and resume time.

I'd prefer v1 where PDCs were used the way they've been meant to but
this route also sufficient as intermediate step as it allows to boot
with more than 64 CPUs.

> 
> Anyway, if you prefer, I can modify TimedWaitForApFinish() like this:
> add an explicit check for TimeLimit==0 in the beginning, and return
> immediately if that condition evaluates to true. (Plus, update the
> comments, and modify patch #3 to use MAX_UINT32 as TimeLinit.)
> 
> What do you think?
> 
> Thanks
> Laszlo
> 
> > 
> > Thanks!
> > Jeff
> > 
> > -----Original Message-----
> > From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
> > Sent: Friday, November 25, 2016 4:57 AM
> > To: edk2-devel-01
> > Cc: Kinney, Michael D; Fan, Jeff; Justen, Jordan L
> > Subject: [edk2] [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup
> > 
> > Sometimes a platform knows exactly how many CPUs it has at boot. It should be able to
> > - set PcdCpuMaxLogicalProcessorNumber dynamically to this number,
> > - set PcdCpuApInitTimeOutInMicroSeconds to infinity (marked with 0),
> > - and expect that MpInitLib wait exactly as long as necessary for all APs
> >   to report in.
> > 
> > Other platforms should be able to continue setting a reasonably large upper bound on supported CPUs, and waiting for a reasonable, fixed amount of time for all APs to report in.
> > 
> > Add this functionality. The TimedWaitForApFinish() function will return when all APs have reported in, or the timeout has expired -- whichever happens first.
> > 
> > (Accessing these PCDs dynamically is safe. The PEI and DXE phase instances of this library are restricted to PEIM and DXE_DRIVER client modules, thus the PCD accesses cannot be linked into runtime code.)
> > 
> > Cc: Igor Mammedov <imammedo@redhat.com>
> > Cc: Jeff Fan <jeff.fan@intel.com>
> > Cc: Jordan Justen <jordan.l.justen@intel.com>
> > Cc: Michael Kinney <michael.d.kinney@intel.com>
> > Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=116
> > Contributed-under: TianoCore Contribution Agreement 1.0
> > Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> > ---
> >  UefiCpuPkg/UefiCpuPkg.dec            |  3 +-
> >  UefiCpuPkg/Library/MpInitLib/MpLib.c | 67 +++++++++++++++++++-
> >  2 files changed, 68 insertions(+), 2 deletions(-)
> > 
> > diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec index ca560398bbef..8607ae8b7ae8 100644
> > --- a/UefiCpuPkg/UefiCpuPkg.dec
> > +++ b/UefiCpuPkg/UefiCpuPkg.dec
> > @@ -174,7 +174,8 @@ [PcdsFixedAtBuild, PcdsPatchableInModule, PcdsDynamic, PcdsDynamicEx]
> >    # @Prompt Configure max supported number of Logical Processors
> >    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64|UINT32|0x00000002
> >    ## Specifies timeout value in microseconds for the BSP to detect all APs for the first time.
> > -  # @Prompt Timeout for the BSP to detect all APs for the first time.
> > +  #  When set to zero, the BSP will wait forever for all (PcdCpuMaxLogicalProcessorNumber-1) APs to report in.
> > +  # @Prompt Timeout for the BSP to detect all APs for the first time (zero means await maximum supported AP count).
> >    gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000|UINT32|0x00000004
> >    ## Specifies the base address of the first microcode Patch in the microcode Region.
> >    # @Prompt Microcode Region base address.
> > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> > index 15dbfa1e7d6c..76de92fa552a 100644
> > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> > @@ -684,6 +684,22 @@ FillExchangeInfoData (  }
> >  
> >  /**
> > +  Helper function that waits until the finished AP count reaches the 
> > + specified  limit, or the specified timeout elapses (whichever comes first).
> > +
> > +  @param[in] CpuMpData        Pointer to CPU MP Data.
> > +  @param[in] FinishedApLimit  The number of finished APs to wait for.
> > +  @param[in] TimeLimit        The number of microseconds to wait for. Zero
> > +                              means infinity.
> > +**/
> > +VOID
> > +TimedWaitForApFinish (
> > +  IN CPU_MP_DATA               *CpuMpData,
> > +  IN UINT32                    FinishedApLimit,
> > +  IN UINT32                    TimeLimit
> > +  );
> > +
> > +/**
> >    This function will be called by BSP to wakeup AP.
> >  
> >    @param[in] CpuMpData          Pointer to CPU MP Data
> > @@ -748,7 +764,11 @@ WakeUpAP (
> >        //
> >        // Wait for all potential APs waken up in one specified period
> >        //
> > -      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
> > +      TimedWaitForApFinish (
> > +        CpuMpData,
> > +        PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
> > +        PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
> > +        );
> >      } else {
> >        //
> >        // Wait all APs waken up if this is not the 1st broadcast of SIPI @@ -895,6 +915,51 @@ CheckTimeout (  }
> >  
> >  /**
> > +  Helper function that waits until the finished AP count reaches the 
> > + specified  limit, or the specified timeout elapses (whichever comes first).
> > +
> > +  @param[in] CpuMpData        Pointer to CPU MP Data.
> > +  @param[in] FinishedApLimit  The number of finished APs to wait for.
> > +  @param[in] TimeLimit        The number of microseconds to wait for. Zero
> > +                              means infinity.
> > +**/
> > +VOID
> > +TimedWaitForApFinish (
> > +  IN CPU_MP_DATA               *CpuMpData,
> > +  IN UINT32                    FinishedApLimit,
> > +  IN UINT32                    TimeLimit
> > +  )
> > +{
> > +  CpuMpData->TotalTime = 0;
> > +  CpuMpData->ExpectedTime = CalculateTimeout (
> > +                              TimeLimit,
> > +                              &CpuMpData->CurrentTime
> > +                              );
> > +  while (CpuMpData->FinishedCount < FinishedApLimit &&
> > +         !CheckTimeout (
> > +            &CpuMpData->CurrentTime,
> > +            &CpuMpData->TotalTime,
> > +            CpuMpData->ExpectedTime
> > +            )) {
> > +    CpuPause ();
> > +  }
> > +
> > +  if (CpuMpData->FinishedCount >= FinishedApLimit) {
> > +    DEBUG ((
> > +      DEBUG_VERBOSE,
> > +      "%a: reached FinishedApLimit=%u in %Lu microseconds\n",
> > +      __FUNCTION__,
> > +      FinishedApLimit,
> > +      DivU64x64Remainder (
> > +        MultU64x32 (CpuMpData->TotalTime, 1000000),
> > +        GetPerformanceCounterProperties (NULL, NULL),
> > +        NULL
> > +        )
> > +      ));
> > +  }
> > +}
> > +
> > +/**
> >    Reset an AP to Idle state.
> >  
> >    Any task being executed by the AP will be aborted and the AP
> > --
> > 2.9.2
> > 
> > 
> > _______________________________________________
> > edk2-devel mailing list
> > edk2-devel@lists.01.org
> > https://lists.01.org/mailman/listinfo/edk2-devel
> >   
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-11-28 11:58 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-24 20:56 [PATCH v2 0/3] UefiCpuPkg, OvmfPkg: set maximum (V)CPU count dynamically Laszlo Ersek
2016-11-24 20:56 ` [PATCH v2 1/3] UefiCpuPkg/PiSmmCpuDxeSmm: handle dynamic PcdCpuMaxLogicalProcessorNumber Laszlo Ersek
2016-11-28  2:15   ` Fan, Jeff
2016-11-28 11:23     ` Laszlo Ersek
2016-11-24 20:56 ` [PATCH v2 2/3] UefiCpuPkg/MpInitLib: wait no longer than necessary for initial AP startup Laszlo Ersek
2016-11-28  5:37   ` Fan, Jeff
2016-11-28 10:15     ` Laszlo Ersek
2016-11-28 11:58       ` Igor Mammedov
2016-11-24 20:56 ` [PATCH v2 3/3] OvmfPkg/PlatformPei: take VCPU count from QEMU and configure MpInitLib Laszlo Ersek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox