public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements
@ 2016-11-22 20:26 Laszlo Ersek
  2016-11-22 20:26 ` [PATCH 1/4] UefiCpuPkg/LocalApicLib: fix feature test for Extended Topology CPUID leaf Laszlo Ersek
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Laszlo Ersek @ 2016-11-22 20:26 UTC (permalink / raw)
  To: edk2-devel-01; +Cc: Igor Mammedov, Jeff Fan, Jordan Justen

Patches 1 and 2 fix several instances of the same CPUID-related bug in
UefiCpuPkg (BaseXApicLib, BaseXApicX2ApicLib, MpInitLib/Ia32,
MpInitLib/X64).

Patches 3 and 4 are independent from the bugfix, but they relate
nonetheless to multiprocessing, so I'm including them in the same
series. They allow OvmfPkg to fetch the number of boot processors from
QEMU, and to make CpuMpPei / CpuDxe wait in the initial enumeration
exactly as long as it takes for all of the APs to come up. This makes MP
boot much more reliable (and no time is wasted waiting).

(

  Note that I have another patch set pending on the list:

    [edk2] [PATCH v2 0/4] OvmfPkg: broadcast SMIs and dynamically revert
                          to traditional AP sync mode

  That series and this series have a trivial context dependency in
  OvmfPkg/OvmfPkg*.dsc. Because that series consumes an interface from
  QEMU that will be part of the 2.9 release only, not the pending 2.8
  release, that series comes *second*. In comparison, this series works
  as-is, so it comes first. Locally I have rebased the broadcast SMI
  series already (it is trivial); it does not interfere with the review.

)

Repo:   https://github.com/lersek/edk2/
Branch: mpinit

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>

Thanks
Laszlo

Laszlo Ersek (4):
  UefiCpuPkg/LocalApicLib: fix feature test for Extended Topology CPUID
    leaf
  UefiCpuPkg/MpInitLib: fix feature test for Extended Topology CPUID
    leaf
  UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count
    upfront
  OvmfPkg/PlatformPei: set PcdCpuKnownLogicalProcessorNumber for
    MpInitLib

 OvmfPkg/OvmfPkgIa32.dsc                                    |  2 ++
 OvmfPkg/OvmfPkgIa32X64.dsc                                 |  2 ++
 OvmfPkg/OvmfPkgX64.dsc                                     |  2 ++
 OvmfPkg/PlatformPei/Platform.c                             | 22 ++++++++++++++++++++
 OvmfPkg/PlatformPei/PlatformPei.inf                        |  1 +
 UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c             |  7 +++++--
 UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c |  7 +++++--
 UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf              |  1 +
 UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf              |  4 ++--
 UefiCpuPkg/Library/MpInitLib/MpLib.c                       | 16 ++++++++++----
 UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm             | 21 +++++++++++--------
 UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm              | 21 +++++++++++--------
 UefiCpuPkg/UefiCpuPkg.dec                                  | 11 ++++++++++
 13 files changed, 89 insertions(+), 28 deletions(-)

-- 
2.9.2



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/4] UefiCpuPkg/LocalApicLib: fix feature test for Extended Topology CPUID leaf
  2016-11-22 20:26 [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements Laszlo Ersek
@ 2016-11-22 20:26 ` Laszlo Ersek
  2016-11-23  5:10   ` Fan, Jeff
  2016-11-22 20:26 ` [PATCH 2/4] UefiCpuPkg/MpInitLib: " Laszlo Ersek
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Laszlo Ersek @ 2016-11-22 20:26 UTC (permalink / raw)
  To: edk2-devel-01; +Cc: Jeff Fan

According to the Intel SDM (325462-060US / September 2016),

> INPUT EAX = 0BH: Returns Extended Topology Information
>
> [...] Software must detect the presence of CPUID leaf 0BH by verifying
> (a) the highest leaf index supported by CPUID is >= 0BH, and
> (b) CPUID.0BH:EBX[15:0] reports a non-zero value. [...]

The LocalApicLib instances in UefiCpuPkg do not perform check (b).

This causes an actual bug in the following OVMF setup:

- Intel W3550 host processor <http://ark.intel.com/products/39720/>,

- the QEMU/KVM guest's VCPU model is set to "host", that is, "the CPU
  visible to the guest should be exactly the same as the host CPU".

In the GetInitialApicId() function, check (a) passes: the CPUID level of
the W3550 is exactly 11 decimal. However, leaf 11 itself is not supported,
therefore EDX is set to zero:

> If a value entered for CPUID.EAX is less than or equal to the maximum
> input value and the leaf is not supported on that processor then 0 is
> returned in all the registers.

Because we don't check (b), we return 0 as initial APIC ID on the BSP and
on all of the APs as well.

Add the missing check.

Cc: Jeff Fan <jeff.fan@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c             | 7 +++++--
 UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c | 7 +++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c b/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c
index 4064049807b7..f81bbb2252d9 100644
--- a/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c
+++ b/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c
@@ -314,12 +314,15 @@ GetInitialApicId (
 
   //
   // If CPUID Leaf B is supported, 
+  // And CPUID.0BH:EBX[15:0] reports a non-zero value,
   // Then the initial 32-bit APIC ID = CPUID.0BH:EDX
   // Else the initial 8-bit APIC ID = CPUID.1:EBX[31:24]
   //
   if (MaxCpuIdIndex >= CPUID_EXTENDED_TOPOLOGY) {
-    AsmCpuidEx (CPUID_EXTENDED_TOPOLOGY, 0, NULL, NULL, NULL, &ApicId);
-    return ApicId;
+    AsmCpuidEx (CPUID_EXTENDED_TOPOLOGY, 0, NULL, &RegEbx, NULL, &ApicId);
+    if ((RegEbx & (BIT16 - 1)) != 0) {
+      return ApicId;
+    }
   }
 
   AsmCpuid (CPUID_VERSION_INFO, NULL, &RegEbx, NULL, NULL);
diff --git a/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c b/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c
index 9720d26e60e2..e690d2aa1445 100644
--- a/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c
+++ b/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c
@@ -411,12 +411,15 @@ GetInitialApicId (
     AsmCpuid (CPUID_SIGNATURE, &MaxCpuIdIndex, NULL, NULL, NULL);
     //
     // If CPUID Leaf B is supported, 
+    // And CPUID.0BH:EBX[15:0] reports a non-zero value,
     // Then the initial 32-bit APIC ID = CPUID.0BH:EDX
     // Else the initial 8-bit APIC ID = CPUID.1:EBX[31:24]
     //
     if (MaxCpuIdIndex >= CPUID_EXTENDED_TOPOLOGY) {
-      AsmCpuidEx (CPUID_EXTENDED_TOPOLOGY, 0, NULL, NULL, NULL, &ApicId);
-      return ApicId;
+      AsmCpuidEx (CPUID_EXTENDED_TOPOLOGY, 0, NULL, &RegEbx, NULL, &ApicId);
+      if ((RegEbx & (BIT16 - 1)) != 0) {
+        return ApicId;
+      }
     }
     AsmCpuid (CPUID_VERSION_INFO, NULL, &RegEbx, NULL, NULL);
     return RegEbx >> 24;
-- 
2.9.2




^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/4] UefiCpuPkg/MpInitLib: fix feature test for Extended Topology CPUID leaf
  2016-11-22 20:26 [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements Laszlo Ersek
  2016-11-22 20:26 ` [PATCH 1/4] UefiCpuPkg/LocalApicLib: fix feature test for Extended Topology CPUID leaf Laszlo Ersek
@ 2016-11-22 20:26 ` Laszlo Ersek
  2016-11-23  5:10   ` Fan, Jeff
  2016-11-22 20:26 ` [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront Laszlo Ersek
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Laszlo Ersek @ 2016-11-22 20:26 UTC (permalink / raw)
  To: edk2-devel-01; +Cc: Jeff Fan

According to the Intel SDM (325462-060US / September 2016),

> INPUT EAX = 0BH: Returns Extended Topology Information
>
> [...] Software must detect the presence of CPUID leaf 0BH by verifying
> (a) the highest leaf index supported by CPUID is >= 0BH, and
> (b) CPUID.0BH:EBX[15:0] reports a non-zero value. [...]

The "GetApicId" sections in the Ia32 and X64 "MpFuncs.nasm" files do not
perform check (b).

This causes an actual bug in the following OVMF setup:

- Intel W3550 host processor <http://ark.intel.com/products/39720/>,

- the QEMU/KVM guest's VCPU model is set to "host", that is, "the CPU
  visible to the guest should be exactly the same as the host CPU".

Under "GetApicId", check (a) passes: the CPUID level of the W3550 is
exactly 11 decimal. However, leaf 11 itself is not supported, therefore
EDX is set to zero:

> If a value entered for CPUID.EAX is less than or equal to the maximum
> input value and the leaf is not supported on that processor then 0 is
> returned in all the registers.

Because we don't check (b), the "GetProcessorNumber" section of the code
is reached with an initial APIC ID of 0 in EDX on all of the APs. Given
that "GetProcessorNumber" searches the
"MP_CPU_EXCHANGE_INFO.CpuInfo[*].InitialApicId" fields for a match, all
APs enter ApWakeupFunction() with an identical "NumApsExecuting"
parameter. This results in unpredictable guest behavior (crashes, reboots,
hangs etc).

Reorganize the "GetApicId" section and add the missing check in both
assembly files.

Cc: Jeff Fan <jeff.fan@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 21 +++++++++++---------
 UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm  | 21 +++++++++++---------
 2 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
index 64e51d87ae24..9067f7807098 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
@@ -154,21 +154,24 @@ GetApicId:
     mov        eax, 0
     cpuid
     cmp        eax, 0bh
-    jnb        X2Apic
+    jb         NoX2Apic             ; CPUID level below CPUID_EXTENDED_TOPOLOGY
+
+    mov        eax, 0bh
+    xor        ecx, ecx
+    cpuid
+    test       ebx, 0ffffh
+    jz         NoX2Apic             ; CPUID.0BH:EBX[15:0] is zero
+
+    ; Processor is x2APIC capable; 32-bit x2APIC ID is already in EDX
+    jmp        GetProcessorNumber
+
+NoX2Apic:
     ; Processor is not x2APIC capable, so get 8-bit APIC ID
     mov        eax, 1
     cpuid
     shr        ebx, 24
     mov        edx, ebx
-    jmp        GetProcessorNumber
 
-X2Apic:
-    ; Processor is x2APIC capable, so get 32-bit x2APIC ID
-    mov        eax, 0bh
-    xor        ecx, ecx
-    cpuid                   
-    ; edx save x2APIC ID
-    
 GetProcessorNumber:
     ;
     ; Get processor number for this AP
diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
index aaabb50c5468..e7e7d8086dd0 100644
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
@@ -158,21 +158,24 @@ GetApicId:
     mov        eax, 0
     cpuid
     cmp        eax, 0bh
-    jnb        X2Apic
+    jb         NoX2Apic             ; CPUID level below CPUID_EXTENDED_TOPOLOGY
+
+    mov        eax, 0bh
+    xor        ecx, ecx
+    cpuid
+    test       ebx, 0ffffh
+    jz         NoX2Apic             ; CPUID.0BH:EBX[15:0] is zero
+
+    ; Processor is x2APIC capable; 32-bit x2APIC ID is already in EDX
+    jmp        GetProcessorNumber
+
+NoX2Apic:
     ; Processor is not x2APIC capable, so get 8-bit APIC ID
     mov        eax, 1
     cpuid
     shr        ebx, 24
     mov        edx, ebx
-    jmp        GetProcessorNumber
 
-X2Apic:
-    ; Processor is x2APIC capable, so get 32-bit x2APIC ID
-    mov        eax, 0bh
-    xor        ecx, ecx
-    cpuid                   
-    ; edx save x2APIC ID
-    
 GetProcessorNumber:
     ;
     ; Get processor number for this AP
-- 
2.9.2




^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
  2016-11-22 20:26 [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements Laszlo Ersek
  2016-11-22 20:26 ` [PATCH 1/4] UefiCpuPkg/LocalApicLib: fix feature test for Extended Topology CPUID leaf Laszlo Ersek
  2016-11-22 20:26 ` [PATCH 2/4] UefiCpuPkg/MpInitLib: " Laszlo Ersek
@ 2016-11-22 20:26 ` Laszlo Ersek
  2016-11-23  5:36   ` Fan, Jeff
  2016-11-22 20:26 ` [PATCH 4/4] OvmfPkg/PlatformPei: set PcdCpuKnownLogicalProcessorNumber for MpInitLib Laszlo Ersek
  2016-11-23 20:43 ` [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements Laszlo Ersek
  4 siblings, 1 reply; 15+ messages in thread
From: Laszlo Ersek @ 2016-11-22 20:26 UTC (permalink / raw)
  To: edk2-devel-01; +Cc: Igor Mammedov, Jeff Fan

On QEMU/KVM, the VCPU topology for a guest is specified dynamically. It
can be a low number or a high number.

Waiting for PcdCpuApInitTimeOutInMicroSeconds during the initial AP
collection is impractical, because in a VM, the time needed for an AP to
wake up can vary widely:

- if the APs report back quickly, then we could be wasting time,

- if the APs report back late (due to scheduling artifacts or VCPU
  over-subscription on the virtualization host), then the timeout can
  elapse before all APs increment CpuMpData->CpuCount in
  ApWakeupFunction().

Trying to set PcdCpuApInitTimeOutInMicroSeconds dynamically is also
impractical, as scheduling artifacts on the KVM host may delay AP threads
arbitrarily.

Instead, allow OVMF (and other platforms) to tell MpInitLib the number of
boot-time CPUs in advance.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/UefiCpuPkg.dec                     | 11 +++++++++++
 UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |  1 +
 UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |  4 ++--
 UefiCpuPkg/Library/MpInitLib/MpLib.c          | 16 ++++++++++++----
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec
index ca560398bbef..35903c4386e4 100644
--- a/UefiCpuPkg/UefiCpuPkg.dec
+++ b/UefiCpuPkg/UefiCpuPkg.dec
@@ -225,5 +225,16 @@ [PcdsDynamic, PcdsDynamicEx]
   # @ValidList   0x80000001 | 0
   gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress|0x0|UINT64|0x60000011
 
+  ## On platforms where the number of boot-time CPUs can be dynamically
+  #  retrieved from a platform-specific information source, the BSP does not
+  #  have to wait for PcdCpuApInitTimeOutInMicroSeconds in order to detect all
+  #  APs for the first time. On such platforms, this PCD specifies the number
+  #  of CPUs available at boot. If the platform doesn't support this feature,
+  #  this PCD should be set to 0. The platform is responsible for ensuring that
+  #  this PCD is never set to a value larger than
+  #  PcdCpuMaxLogicalProcessorNumber.
+  # @Prompt Known number of CPUs available at boot.
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0|UINT32|0x60000012
+
 [UserExtensions.TianoCore."ExtraFiles"]
   UefiCpuPkgExtra.uni
diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
index 11b230174ec8..dc18eaf6152d 100644
--- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
+++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
@@ -61,6 +61,7 @@ [Guids]
 
 [Pcd]
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## SOMETIMES_CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
index 0c6873da79db..2bcfa70ae7e5 100644
--- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
+++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
@@ -61,10 +61,10 @@ [Ppis]
 
 [Pcd]
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
-  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## CONSUMES
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## CONSUMES
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchRegionSize         ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApLoopMode                       ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApTargetCstate                   ## SOMETIMES_CONSUMES
-
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 15dbfa1e7d6c..f7dfbd5bad13 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -707,6 +707,7 @@ WakeUpAP (
   CPU_AP_DATA                      *CpuData;
   BOOLEAN                          ResetVectorRequired;
   CPU_INFO_IN_HOB                  *CpuInfoInHob;
+  UINT32                           KnownProcessorCount;
 
   CpuMpData->FinishedCount = 0;
   ResetVectorRequired = FALSE;
@@ -745,10 +746,17 @@ WakeUpAP (
       SendInitSipiSipiAllExcludingSelf ((UINT32) ExchangeInfo->BufferStart);
     }
     if (CpuMpData->InitFlag == ApInitConfig) {
-      //
-      // Wait for all potential APs waken up in one specified period
-      //
-      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
+      KnownProcessorCount = PcdGet32 (PcdCpuKnownLogicalProcessorNumber);
+      if (KnownProcessorCount > 0) {
+        while (CpuMpData->FinishedCount < (KnownProcessorCount - 1)) {
+          CpuPause ();
+        }
+      } else {
+        //
+        // Wait for all potential APs waken up in one specified period
+        //
+        MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
+      }
     } else {
       //
       // Wait all APs waken up if this is not the 1st broadcast of SIPI
-- 
2.9.2




^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/4] OvmfPkg/PlatformPei: set PcdCpuKnownLogicalProcessorNumber for MpInitLib
  2016-11-22 20:26 [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements Laszlo Ersek
                   ` (2 preceding siblings ...)
  2016-11-22 20:26 ` [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront Laszlo Ersek
@ 2016-11-22 20:26 ` Laszlo Ersek
  2016-11-23 20:43 ` [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements Laszlo Ersek
  4 siblings, 0 replies; 15+ messages in thread
From: Laszlo Ersek @ 2016-11-22 20:26 UTC (permalink / raw)
  To: edk2-devel-01; +Cc: Igor Mammedov, Jeff Fan, Jordan Justen

This setting will allow CpuMpPei and CpuDxe to wait for the initial AP
check-ins exactly as long as necessary.

It is safe to set PcdCpuKnownLogicalProcessorNumber in
OvmfPkg/PlatformPei. OvmfPkg/PlatformPei installs the permanent PEI RAM,
producing gEfiPeiMemoryDiscoveredPpiGuid, and UefiCpuPkg/CpuMpPei has a
depex on gEfiPeiMemoryDiscoveredPpiGuid.

It is safe to read the fw_cfg item QemuFwCfgItemSmpCpuCount (0x0005). It
was added to QEMU in 2008 as key FW_CFG_NB_CPUS, in commit 905fdcb5264c
("Add common keys to firmware configuration"). Even if the key is
unavailable (or if fw_cfg is entirely unavailable, for example on Xen),
QemuFwCfgRead16() will return 0, and then we stick with the current
behavior (PcdCpuKnownLogicalProcessorNumber == 0, which is the default in
"UefiCpuPkg/UefiCpuPkg.dec".)

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 OvmfPkg/OvmfPkgIa32.dsc             |  2 ++
 OvmfPkg/OvmfPkgIa32X64.dsc          |  2 ++
 OvmfPkg/OvmfPkgX64.dsc              |  2 ++
 OvmfPkg/PlatformPei/PlatformPei.inf |  1 +
 OvmfPkg/PlatformPei/Platform.c      | 22 ++++++++++++++++++++
 5 files changed, 29 insertions(+)

diff --git a/OvmfPkg/OvmfPkgIa32.dsc b/OvmfPkg/OvmfPkgIa32.dsc
index ed43c4514491..4973a5c12719 100644
--- a/OvmfPkg/OvmfPkgIa32.dsc
+++ b/OvmfPkg/OvmfPkgIa32.dsc
@@ -488,6 +488,8 @@ [PcdsDynamicDefault]
   gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE
   gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE
 
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0
+
 ################################################################################
 #
 # Components Section - list of all EDK II Modules needed by this Platform.
diff --git a/OvmfPkg/OvmfPkgIa32X64.dsc b/OvmfPkg/OvmfPkgIa32X64.dsc
index ccd156d9231a..1a7de33450b2 100644
--- a/OvmfPkg/OvmfPkgIa32X64.dsc
+++ b/OvmfPkg/OvmfPkgIa32X64.dsc
@@ -496,6 +496,8 @@ [PcdsDynamicDefault]
   gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE
   gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE
 
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0
+
 ################################################################################
 #
 # Components Section - list of all EDK II Modules needed by this Platform.
diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
index 012ce85462c5..08f1542df158 100644
--- a/OvmfPkg/OvmfPkgX64.dsc
+++ b/OvmfPkg/OvmfPkgX64.dsc
@@ -495,6 +495,8 @@ [PcdsDynamicDefault]
   gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE
   gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE
 
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0
+
 ################################################################################
 #
 # Components Section - list of all EDK II Modules needed by this Platform.
diff --git a/OvmfPkg/PlatformPei/PlatformPei.inf b/OvmfPkg/PlatformPei/PlatformPei.inf
index 776a4ab11f79..872b3c075421 100644
--- a/OvmfPkg/PlatformPei/PlatformPei.inf
+++ b/OvmfPkg/PlatformPei/PlatformPei.inf
@@ -95,6 +95,7 @@ [Pcd]
   gEfiMdeModulePkgTokenSpaceGuid.PcdAcpiS3Enable
   gUefiCpuPkgTokenSpaceGuid.PcdCpuLocalApicBaseAddress
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize
 
 [FixedPcd]
diff --git a/OvmfPkg/PlatformPei/Platform.c b/OvmfPkg/PlatformPei/Platform.c
index c6e1106c9ed0..ce8f38561fbe 100644
--- a/OvmfPkg/PlatformPei/Platform.c
+++ b/OvmfPkg/PlatformPei/Platform.c
@@ -394,6 +394,7 @@ MiscInitialization (
   UINTN         AcpiCtlReg;
   UINT8         AcpiEnBit;
   RETURN_STATUS PcdStatus;
+  UINT16        KnownProcessorCount;
 
   //
   // Disable A20 Mask
@@ -473,6 +474,27 @@ MiscInitialization (
     //
     PciExBarInitialization ();
   }
+
+  //
+  // Fetch the number of boot CPUs from QEMU and expose it to MpInitLib in
+  // UefiCpuPkg.
+  //
+  QemuFwCfgSelectItem (QemuFwCfgItemSmpCpuCount);
+  KnownProcessorCount = QemuFwCfgRead16 ();
+  if (KnownProcessorCount > PcdGet32 (PcdCpuMaxLogicalProcessorNumber)) {
+    DEBUG ((DEBUG_ERROR,
+      "%a: KnownProcessorCount (%d) > PcdCpuMaxLogicalProcessorNumber (%u)\n",
+      __FUNCTION__, KnownProcessorCount,
+      PcdGet32 (PcdCpuMaxLogicalProcessorNumber)));
+    ASSERT (FALSE);
+    CpuDeadLoop ();
+  } else if (KnownProcessorCount > 0) {
+    PcdStatus = PcdSet32S (PcdCpuKnownLogicalProcessorNumber,
+                  KnownProcessorCount);
+    ASSERT_RETURN_ERROR (PcdStatus);
+    DEBUG ((DEBUG_INFO, "%a: QEMU reports %d processors\n", __FUNCTION__,
+      KnownProcessorCount));
+  }
 }
 
 
-- 
2.9.2



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4] UefiCpuPkg/LocalApicLib: fix feature test for Extended Topology CPUID leaf
  2016-11-22 20:26 ` [PATCH 1/4] UefiCpuPkg/LocalApicLib: fix feature test for Extended Topology CPUID leaf Laszlo Ersek
@ 2016-11-23  5:10   ` Fan, Jeff
  0 siblings, 0 replies; 15+ messages in thread
From: Fan, Jeff @ 2016-11-23  5:10 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-01

Yes. This is a bug to get Initial APIC ID.

Reviewed-by: Jeff Fan <jeff.fan@intel.com>


-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com] 
Sent: Wednesday, November 23, 2016 4:26 AM
To: edk2-devel-01
Cc: Fan, Jeff
Subject: [PATCH 1/4] UefiCpuPkg/LocalApicLib: fix feature test for Extended Topology CPUID leaf

According to the Intel SDM (325462-060US / September 2016),

> INPUT EAX = 0BH: Returns Extended Topology Information
>
> [...] Software must detect the presence of CPUID leaf 0BH by verifying
> (a) the highest leaf index supported by CPUID is >= 0BH, and
> (b) CPUID.0BH:EBX[15:0] reports a non-zero value. [...]

The LocalApicLib instances in UefiCpuPkg do not perform check (b).

This causes an actual bug in the following OVMF setup:

- Intel W3550 host processor <http://ark.intel.com/products/39720/>,

- the QEMU/KVM guest's VCPU model is set to "host", that is, "the CPU
  visible to the guest should be exactly the same as the host CPU".

In the GetInitialApicId() function, check (a) passes: the CPUID level of the W3550 is exactly 11 decimal. However, leaf 11 itself is not supported, therefore EDX is set to zero:

> If a value entered for CPUID.EAX is less than or equal to the maximum 
> input value and the leaf is not supported on that processor then 0 is 
> returned in all the registers.

Because we don't check (b), we return 0 as initial APIC ID on the BSP and on all of the APs as well.

Add the missing check.

Cc: Jeff Fan <jeff.fan@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c             | 7 +++++--
 UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c | 7 +++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c b/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c
index 4064049807b7..f81bbb2252d9 100644
--- a/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c
+++ b/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c
@@ -314,12 +314,15 @@ GetInitialApicId (
 
   //
   // If CPUID Leaf B is supported, 
+  // And CPUID.0BH:EBX[15:0] reports a non-zero value,
   // Then the initial 32-bit APIC ID = CPUID.0BH:EDX
   // Else the initial 8-bit APIC ID = CPUID.1:EBX[31:24]
   //
   if (MaxCpuIdIndex >= CPUID_EXTENDED_TOPOLOGY) {
-    AsmCpuidEx (CPUID_EXTENDED_TOPOLOGY, 0, NULL, NULL, NULL, &ApicId);
-    return ApicId;
+    AsmCpuidEx (CPUID_EXTENDED_TOPOLOGY, 0, NULL, &RegEbx, NULL, &ApicId);
+    if ((RegEbx & (BIT16 - 1)) != 0) {
+      return ApicId;
+    }
   }
 
   AsmCpuid (CPUID_VERSION_INFO, NULL, &RegEbx, NULL, NULL); diff --git a/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c b/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c
index 9720d26e60e2..e690d2aa1445 100644
--- a/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c
+++ b/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c
@@ -411,12 +411,15 @@ GetInitialApicId (
     AsmCpuid (CPUID_SIGNATURE, &MaxCpuIdIndex, NULL, NULL, NULL);
     //
     // If CPUID Leaf B is supported, 
+    // And CPUID.0BH:EBX[15:0] reports a non-zero value,
     // Then the initial 32-bit APIC ID = CPUID.0BH:EDX
     // Else the initial 8-bit APIC ID = CPUID.1:EBX[31:24]
     //
     if (MaxCpuIdIndex >= CPUID_EXTENDED_TOPOLOGY) {
-      AsmCpuidEx (CPUID_EXTENDED_TOPOLOGY, 0, NULL, NULL, NULL, &ApicId);
-      return ApicId;
+      AsmCpuidEx (CPUID_EXTENDED_TOPOLOGY, 0, NULL, &RegEbx, NULL, &ApicId);
+      if ((RegEbx & (BIT16 - 1)) != 0) {
+        return ApicId;
+      }
     }
     AsmCpuid (CPUID_VERSION_INFO, NULL, &RegEbx, NULL, NULL);
     return RegEbx >> 24;
--
2.9.2




^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/4] UefiCpuPkg/MpInitLib: fix feature test for Extended Topology CPUID leaf
  2016-11-22 20:26 ` [PATCH 2/4] UefiCpuPkg/MpInitLib: " Laszlo Ersek
@ 2016-11-23  5:10   ` Fan, Jeff
  0 siblings, 0 replies; 15+ messages in thread
From: Fan, Jeff @ 2016-11-23  5:10 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-01

Reviewed-by: Jeff Fan <jeff.fan@intel.com>

-----Original Message-----
From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
Sent: Wednesday, November 23, 2016 4:26 AM
To: edk2-devel-01
Cc: Fan, Jeff
Subject: [edk2] [PATCH 2/4] UefiCpuPkg/MpInitLib: fix feature test for Extended Topology CPUID leaf

According to the Intel SDM (325462-060US / September 2016),

> INPUT EAX = 0BH: Returns Extended Topology Information
>
> [...] Software must detect the presence of CPUID leaf 0BH by verifying
> (a) the highest leaf index supported by CPUID is >= 0BH, and
> (b) CPUID.0BH:EBX[15:0] reports a non-zero value. [...]

The "GetApicId" sections in the Ia32 and X64 "MpFuncs.nasm" files do not perform check (b).

This causes an actual bug in the following OVMF setup:

- Intel W3550 host processor <http://ark.intel.com/products/39720/>,

- the QEMU/KVM guest's VCPU model is set to "host", that is, "the CPU
  visible to the guest should be exactly the same as the host CPU".

Under "GetApicId", check (a) passes: the CPUID level of the W3550 is exactly 11 decimal. However, leaf 11 itself is not supported, therefore EDX is set to zero:

> If a value entered for CPUID.EAX is less than or equal to the maximum 
> input value and the leaf is not supported on that processor then 0 is 
> returned in all the registers.

Because we don't check (b), the "GetProcessorNumber" section of the code is reached with an initial APIC ID of 0 in EDX on all of the APs. Given that "GetProcessorNumber" searches the "MP_CPU_EXCHANGE_INFO.CpuInfo[*].InitialApicId" fields for a match, all APs enter ApWakeupFunction() with an identical "NumApsExecuting"
parameter. This results in unpredictable guest behavior (crashes, reboots, hangs etc).

Reorganize the "GetApicId" section and add the missing check in both assembly files.

Cc: Jeff Fan <jeff.fan@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 21 +++++++++++---------  UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm  | 21 +++++++++++---------
 2 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
index 64e51d87ae24..9067f7807098 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
@@ -154,21 +154,24 @@ GetApicId:
     mov        eax, 0
     cpuid
     cmp        eax, 0bh
-    jnb        X2Apic
+    jb         NoX2Apic             ; CPUID level below CPUID_EXTENDED_TOPOLOGY
+
+    mov        eax, 0bh
+    xor        ecx, ecx
+    cpuid
+    test       ebx, 0ffffh
+    jz         NoX2Apic             ; CPUID.0BH:EBX[15:0] is zero
+
+    ; Processor is x2APIC capable; 32-bit x2APIC ID is already in EDX
+    jmp        GetProcessorNumber
+
+NoX2Apic:
     ; Processor is not x2APIC capable, so get 8-bit APIC ID
     mov        eax, 1
     cpuid
     shr        ebx, 24
     mov        edx, ebx
-    jmp        GetProcessorNumber
 
-X2Apic:
-    ; Processor is x2APIC capable, so get 32-bit x2APIC ID
-    mov        eax, 0bh
-    xor        ecx, ecx
-    cpuid                   
-    ; edx save x2APIC ID
-    
 GetProcessorNumber:
     ;
     ; Get processor number for this AP
diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
index aaabb50c5468..e7e7d8086dd0 100644
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
@@ -158,21 +158,24 @@ GetApicId:
     mov        eax, 0
     cpuid
     cmp        eax, 0bh
-    jnb        X2Apic
+    jb         NoX2Apic             ; CPUID level below CPUID_EXTENDED_TOPOLOGY
+
+    mov        eax, 0bh
+    xor        ecx, ecx
+    cpuid
+    test       ebx, 0ffffh
+    jz         NoX2Apic             ; CPUID.0BH:EBX[15:0] is zero
+
+    ; Processor is x2APIC capable; 32-bit x2APIC ID is already in EDX
+    jmp        GetProcessorNumber
+
+NoX2Apic:
     ; Processor is not x2APIC capable, so get 8-bit APIC ID
     mov        eax, 1
     cpuid
     shr        ebx, 24
     mov        edx, ebx
-    jmp        GetProcessorNumber
 
-X2Apic:
-    ; Processor is x2APIC capable, so get 32-bit x2APIC ID
-    mov        eax, 0bh
-    xor        ecx, ecx
-    cpuid                   
-    ; edx save x2APIC ID
-    
 GetProcessorNumber:
     ;
     ; Get processor number for this AP
--
2.9.2


_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
  2016-11-22 20:26 ` [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront Laszlo Ersek
@ 2016-11-23  5:36   ` Fan, Jeff
  2016-11-23 16:04     ` Laszlo Ersek
  0 siblings, 1 reply; 15+ messages in thread
From: Fan, Jeff @ 2016-11-23  5:36 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-01; +Cc: Igor Mammedov

Laszlo,

I ever submitted one bugzilla for such request at https://bugzilla.tianocore.org/show_bug.cgi?id=116

My point is not to add new PCD and just to use existing PCD PcdCpuMaxLogicalProcessorNumber that supports dynamic type.

Platform Peim could update it is platform pei module before memory ready.

MpInitLib just exits from wait loop once the discovered processors number reaches the PcdCpuMaxLogicalProcessorNumber.

Does it meet your requirement?

Thanks!
Jeff

-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com] 
Sent: Wednesday, November 23, 2016 4:26 AM
To: edk2-devel-01
Cc: Igor Mammedov; Fan, Jeff
Subject: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront

On QEMU/KVM, the VCPU topology for a guest is specified dynamically. It can be a low number or a high number.

Waiting for PcdCpuApInitTimeOutInMicroSeconds during the initial AP collection is impractical, because in a VM, the time needed for an AP to wake up can vary widely:

- if the APs report back quickly, then we could be wasting time,

- if the APs report back late (due to scheduling artifacts or VCPU
  over-subscription on the virtualization host), then the timeout can
  elapse before all APs increment CpuMpData->CpuCount in
  ApWakeupFunction().

Trying to set PcdCpuApInitTimeOutInMicroSeconds dynamically is also impractical, as scheduling artifacts on the KVM host may delay AP threads arbitrarily.

Instead, allow OVMF (and other platforms) to tell MpInitLib the number of boot-time CPUs in advance.

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Fan <jeff.fan@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/UefiCpuPkg.dec                     | 11 +++++++++++
 UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |  1 +  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |  4 ++--
 UefiCpuPkg/Library/MpInitLib/MpLib.c          | 16 ++++++++++++----
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec index ca560398bbef..35903c4386e4 100644
--- a/UefiCpuPkg/UefiCpuPkg.dec
+++ b/UefiCpuPkg/UefiCpuPkg.dec
@@ -225,5 +225,16 @@ [PcdsDynamic, PcdsDynamicEx]
   # @ValidList   0x80000001 | 0
   gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress|0x0|UINT64|0x60000011
 
+  ## On platforms where the number of boot-time CPUs can be dynamically  
+ #  retrieved from a platform-specific information source, the BSP does 
+ not  #  have to wait for PcdCpuApInitTimeOutInMicroSeconds in order to 
+ detect all  #  APs for the first time. On such platforms, this PCD 
+ specifies the number  #  of CPUs available at boot. If the platform 
+ doesn't support this feature,  #  this PCD should be set to 0. The 
+ platform is responsible for ensuring that  #  this PCD is never set to 
+ a value larger than  #  PcdCpuMaxLogicalProcessorNumber.
+  # @Prompt Known number of CPUs available at boot.
+  
+ gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0|UINT32|0
+ x60000012
+
 [UserExtensions.TianoCore."ExtraFiles"]
   UefiCpuPkgExtra.uni
diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
index 11b230174ec8..dc18eaf6152d 100644
--- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
+++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
@@ -61,6 +61,7 @@ [Guids]
 
 [Pcd]
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## SOMETIMES_CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
index 0c6873da79db..2bcfa70ae7e5 100644
--- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
+++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
@@ -61,10 +61,10 @@ [Ppis]
 
 [Pcd]
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
-  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## CONSUMES
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## CONSUMES
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchRegionSize         ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApLoopMode                       ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApTargetCstate                   ## SOMETIMES_CONSUMES
-
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 15dbfa1e7d6c..f7dfbd5bad13 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -707,6 +707,7 @@ WakeUpAP (
   CPU_AP_DATA                      *CpuData;
   BOOLEAN                          ResetVectorRequired;
   CPU_INFO_IN_HOB                  *CpuInfoInHob;
+  UINT32                           KnownProcessorCount;
 
   CpuMpData->FinishedCount = 0;
   ResetVectorRequired = FALSE;
@@ -745,10 +746,17 @@ WakeUpAP (
       SendInitSipiSipiAllExcludingSelf ((UINT32) ExchangeInfo->BufferStart);
     }
     if (CpuMpData->InitFlag == ApInitConfig) {
-      //
-      // Wait for all potential APs waken up in one specified period
-      //
-      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
+      KnownProcessorCount = PcdGet32 (PcdCpuKnownLogicalProcessorNumber);
+      if (KnownProcessorCount > 0) {
+        while (CpuMpData->FinishedCount < (KnownProcessorCount - 1)) {
+          CpuPause ();
+        }
+      } else {
+        //
+        // Wait for all potential APs waken up in one specified period
+        //
+        MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
+      }
     } else {
       //
       // Wait all APs waken up if this is not the 1st broadcast of SIPI
--
2.9.2




^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
  2016-11-23  5:36   ` Fan, Jeff
@ 2016-11-23 16:04     ` Laszlo Ersek
  2016-11-24  1:18       ` Fan, Jeff
  0 siblings, 1 reply; 15+ messages in thread
From: Laszlo Ersek @ 2016-11-23 16:04 UTC (permalink / raw)
  To: Fan, Jeff, edk2-devel-01

On 11/23/16 06:36, Fan, Jeff wrote:
> Laszlo,
> 
> I ever submitted one bugzilla for such request at
> https://bugzilla.tianocore.org/show_bug.cgi?id=116

Huh, I didn't know about that. :)

> My point is not to add new PCD and just to use existing PCD
> PcdCpuMaxLogicalProcessorNumber that supports dynamic type.
> 
> Platform Peim could update it is platform pei module before memory ready.
> 
> MpInitLib just exits from wait loop once the discovered processors
> number reaches the PcdCpuMaxLogicalProcessorNumber.
> 
> Does it meet your requirement?

This was the first idea I had as well, yes. However, I considered the
following case:

- QEMU is started (and the guest is booted) with N online processors and
M > N *possible* processors.

- During OS runtime, the user hotplugs one or more additional
processors, so that the number of currently online processors is now
larger than N.

- The user suspends and resumes (S3) the virtual machine.

- During S3 resume, QEMU will report the increased number of boot
processors. (At S3 resume, QEMU specifically refreshes the fw_cfg item
that returns this value to the firmware, from the increased number of
VCPUs.)

- During S3 resume, we cannot modify PcdCpuMaxLogicalProcessorNumber any
longer, because the UefiCpuPkg modules that consume the PCD have already
allocated their data structures using the first (lower) value. Those
allocations cannot be resized / extended during S3 resume.

So the idea is, set PcdCpuMaxLogicalProcessorNumber statically to the
number of the expected maximum VCPU count, which will ensure the proper
allocations during first boot. Then allow
PcdCpuKnownLogicalProcessorNumber to change -- hot-unplug is also
possible -- from normal boot to S3 resume.

(The concept of "expected maximum" is not specific to VCPU hotplug in
QEMU, it applies to memory (DIMM) hotplug and memory ballooning as well.)

Of course, I don't know (or expect) that the UefiCpuPkg modules fully
support this use case right now; it's just that I didn't want to
*prevent* this use case. So I figured I'd preserve the original role of
PcdCpuMaxLogicalProcessorNumber, and introduce a more dynamic value for
the "current boot VCPU count".

Do you agree that it makes sense to keep these quantities separate?

If you think we should just set PcdCpuMaxLogicalProcessorNumber
dynamically at this point, and introduce no new PCD for now, I can do
that too. I just wanted to make sure that we discuss VCPU hot-plug /
hot-unplug and S3 resume first.

Thanks!
Laszlo

> 
> Thanks!
> Jeff
> 
> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com] 
> Sent: Wednesday, November 23, 2016 4:26 AM
> To: edk2-devel-01
> Cc: Igor Mammedov; Fan, Jeff
> Subject: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
> 
> On QEMU/KVM, the VCPU topology for a guest is specified dynamically. It can be a low number or a high number.
> 
> Waiting for PcdCpuApInitTimeOutInMicroSeconds during the initial AP collection is impractical, because in a VM, the time needed for an AP to wake up can vary widely:
> 
> - if the APs report back quickly, then we could be wasting time,
> 
> - if the APs report back late (due to scheduling artifacts or VCPU
>   over-subscription on the virtualization host), then the timeout can
>   elapse before all APs increment CpuMpData->CpuCount in
>   ApWakeupFunction().
> 
> Trying to set PcdCpuApInitTimeOutInMicroSeconds dynamically is also impractical, as scheduling artifacts on the KVM host may delay AP threads arbitrarily.
> 
> Instead, allow OVMF (and other platforms) to tell MpInitLib the number of boot-time CPUs in advance.
> 
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Jeff Fan <jeff.fan@intel.com>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  UefiCpuPkg/UefiCpuPkg.dec                     | 11 +++++++++++
>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |  1 +  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |  4 ++--
>  UefiCpuPkg/Library/MpInitLib/MpLib.c          | 16 ++++++++++++----
>  4 files changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec index ca560398bbef..35903c4386e4 100644
> --- a/UefiCpuPkg/UefiCpuPkg.dec
> +++ b/UefiCpuPkg/UefiCpuPkg.dec
> @@ -225,5 +225,16 @@ [PcdsDynamic, PcdsDynamicEx]
>    # @ValidList   0x80000001 | 0
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress|0x0|UINT64|0x60000011
>  
> +  ## On platforms where the number of boot-time CPUs can be dynamically  
> + #  retrieved from a platform-specific information source, the BSP does 
> + not  #  have to wait for PcdCpuApInitTimeOutInMicroSeconds in order to 
> + detect all  #  APs for the first time. On such platforms, this PCD 
> + specifies the number  #  of CPUs available at boot. If the platform 
> + doesn't support this feature,  #  this PCD should be set to 0. The 
> + platform is responsible for ensuring that  #  this PCD is never set to 
> + a value larger than  #  PcdCpuMaxLogicalProcessorNumber.
> +  # @Prompt Known number of CPUs available at boot.
> +  
> + gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0|UINT32|0
> + x60000012
> +
>  [UserExtensions.TianoCore."ExtraFiles"]
>    UefiCpuPkgExtra.uni
> diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
> index 11b230174ec8..dc18eaf6152d 100644
> --- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
> +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
> @@ -61,6 +61,7 @@ [Guids]
>  
>  [Pcd]
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## SOMETIMES_CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
> diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
> index 0c6873da79db..2bcfa70ae7e5 100644
> --- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
> +++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
> @@ -61,10 +61,10 @@ [Ppis]
>  
>  [Pcd]
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
> -  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## CONSUMES
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## CONSUMES
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchRegionSize         ## CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApLoopMode                       ## CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApTargetCstate                   ## SOMETIMES_CONSUMES
> -
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> index 15dbfa1e7d6c..f7dfbd5bad13 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> @@ -707,6 +707,7 @@ WakeUpAP (
>    CPU_AP_DATA                      *CpuData;
>    BOOLEAN                          ResetVectorRequired;
>    CPU_INFO_IN_HOB                  *CpuInfoInHob;
> +  UINT32                           KnownProcessorCount;
>  
>    CpuMpData->FinishedCount = 0;
>    ResetVectorRequired = FALSE;
> @@ -745,10 +746,17 @@ WakeUpAP (
>        SendInitSipiSipiAllExcludingSelf ((UINT32) ExchangeInfo->BufferStart);
>      }
>      if (CpuMpData->InitFlag == ApInitConfig) {
> -      //
> -      // Wait for all potential APs waken up in one specified period
> -      //
> -      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
> +      KnownProcessorCount = PcdGet32 (PcdCpuKnownLogicalProcessorNumber);
> +      if (KnownProcessorCount > 0) {
> +        while (CpuMpData->FinishedCount < (KnownProcessorCount - 1)) {
> +          CpuPause ();
> +        }
> +      } else {
> +        //
> +        // Wait for all potential APs waken up in one specified period
> +        //
> +        MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
> +      }
>      } else {
>        //
>        // Wait all APs waken up if this is not the 1st broadcast of SIPI
> --
> 2.9.2
> 
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements
  2016-11-22 20:26 [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements Laszlo Ersek
                   ` (3 preceding siblings ...)
  2016-11-22 20:26 ` [PATCH 4/4] OvmfPkg/PlatformPei: set PcdCpuKnownLogicalProcessorNumber for MpInitLib Laszlo Ersek
@ 2016-11-23 20:43 ` Laszlo Ersek
  4 siblings, 0 replies; 15+ messages in thread
From: Laszlo Ersek @ 2016-11-23 20:43 UTC (permalink / raw)
  To: edk2-devel-01; +Cc: Igor Mammedov, Jeff Fan, Jordan Justen

On 11/22/16 21:26, Laszlo Ersek wrote:
> Patches 1 and 2 fix several instances of the same CPUID-related bug in
> UefiCpuPkg (BaseXApicLib, BaseXApicX2ApicLib, MpInitLib/Ia32,
> MpInitLib/X64).

I committed the first two patches (as they constitute a separate
bugfix), with Jeff's R-b's. Commit range 7b9b576c71c7..1cbd83308989.

Thanks!
Laszlo

> Patches 3 and 4 are independent from the bugfix, but they relate
> nonetheless to multiprocessing, so I'm including them in the same
> series. They allow OvmfPkg to fetch the number of boot processors from
> QEMU, and to make CpuMpPei / CpuDxe wait in the initial enumeration
> exactly as long as it takes for all of the APs to come up. This makes MP
> boot much more reliable (and no time is wasted waiting).
> 
> (
> 
>   Note that I have another patch set pending on the list:
> 
>     [edk2] [PATCH v2 0/4] OvmfPkg: broadcast SMIs and dynamically revert
>                           to traditional AP sync mode
> 
>   That series and this series have a trivial context dependency in
>   OvmfPkg/OvmfPkg*.dsc. Because that series consumes an interface from
>   QEMU that will be part of the 2.9 release only, not the pending 2.8
>   release, that series comes *second*. In comparison, this series works
>   as-is, so it comes first. Locally I have rebased the broadcast SMI
>   series already (it is trivial); it does not interfere with the review.
> 
> )
> 
> Repo:   https://github.com/lersek/edk2/
> Branch: mpinit
> 
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Jeff Fan <jeff.fan@intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> 
> Thanks
> Laszlo
> 
> Laszlo Ersek (4):
>   UefiCpuPkg/LocalApicLib: fix feature test for Extended Topology CPUID
>     leaf
>   UefiCpuPkg/MpInitLib: fix feature test for Extended Topology CPUID
>     leaf
>   UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count
>     upfront
>   OvmfPkg/PlatformPei: set PcdCpuKnownLogicalProcessorNumber for
>     MpInitLib
> 
>  OvmfPkg/OvmfPkgIa32.dsc                                    |  2 ++
>  OvmfPkg/OvmfPkgIa32X64.dsc                                 |  2 ++
>  OvmfPkg/OvmfPkgX64.dsc                                     |  2 ++
>  OvmfPkg/PlatformPei/Platform.c                             | 22 ++++++++++++++++++++
>  OvmfPkg/PlatformPei/PlatformPei.inf                        |  1 +
>  UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c             |  7 +++++--
>  UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c |  7 +++++--
>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf              |  1 +
>  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf              |  4 ++--
>  UefiCpuPkg/Library/MpInitLib/MpLib.c                       | 16 ++++++++++----
>  UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm             | 21 +++++++++++--------
>  UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm              | 21 +++++++++++--------
>  UefiCpuPkg/UefiCpuPkg.dec                                  | 11 ++++++++++
>  13 files changed, 89 insertions(+), 28 deletions(-)
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
  2016-11-23 16:04     ` Laszlo Ersek
@ 2016-11-24  1:18       ` Fan, Jeff
  2016-11-24  9:36         ` Laszlo Ersek
  0 siblings, 1 reply; 15+ messages in thread
From: Fan, Jeff @ 2016-11-24  1:18 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-01

Laszlo,

Please use existing PcdCpuMaxLogicalProcessorNumber firstly.

So, for the platforms do not support CPU hot-plug. It could benefit the boot performance.
For the platforms supporting CPU hot-plug.  It will have no any impact.

Introducing new PCD is good idea to benefit boot performance on S3 path, but it is some complex. 
For example, if DXE phase disables some CPU and do reboot. Real platform may need another policy to set this PCD used by PEI phase.

Could you file one bugzilla for this new PCD requirement?

Thanks!
Jeff

-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com] 
Sent: Thursday, November 24, 2016 12:05 AM
To: Fan, Jeff; edk2-devel-01
Cc: Igor Mammedov
Subject: Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront

On 11/23/16 06:36, Fan, Jeff wrote:
> Laszlo,
> 
> I ever submitted one bugzilla for such request at
> https://bugzilla.tianocore.org/show_bug.cgi?id=116

Huh, I didn't know about that. :)

> My point is not to add new PCD and just to use existing PCD 
> PcdCpuMaxLogicalProcessorNumber that supports dynamic type.
> 
> Platform Peim could update it is platform pei module before memory ready.
> 
> MpInitLib just exits from wait loop once the discovered processors 
> number reaches the PcdCpuMaxLogicalProcessorNumber.
> 
> Does it meet your requirement?

This was the first idea I had as well, yes. However, I considered the following case:

- QEMU is started (and the guest is booted) with N online processors and M > N *possible* processors.

- During OS runtime, the user hotplugs one or more additional processors, so that the number of currently online processors is now larger than N.

- The user suspends and resumes (S3) the virtual machine.

- During S3 resume, QEMU will report the increased number of boot processors. (At S3 resume, QEMU specifically refreshes the fw_cfg item that returns this value to the firmware, from the increased number of
VCPUs.)

- During S3 resume, we cannot modify PcdCpuMaxLogicalProcessorNumber any longer, because the UefiCpuPkg modules that consume the PCD have already allocated their data structures using the first (lower) value. Those allocations cannot be resized / extended during S3 resume.

So the idea is, set PcdCpuMaxLogicalProcessorNumber statically to the number of the expected maximum VCPU count, which will ensure the proper allocations during first boot. Then allow PcdCpuKnownLogicalProcessorNumber to change -- hot-unplug is also possible -- from normal boot to S3 resume.

(The concept of "expected maximum" is not specific to VCPU hotplug in QEMU, it applies to memory (DIMM) hotplug and memory ballooning as well.)

Of course, I don't know (or expect) that the UefiCpuPkg modules fully support this use case right now; it's just that I didn't want to
*prevent* this use case. So I figured I'd preserve the original role of PcdCpuMaxLogicalProcessorNumber, and introduce a more dynamic value for the "current boot VCPU count".

Do you agree that it makes sense to keep these quantities separate?

If you think we should just set PcdCpuMaxLogicalProcessorNumber dynamically at this point, and introduce no new PCD for now, I can do that too. I just wanted to make sure that we discuss VCPU hot-plug / hot-unplug and S3 resume first.

Thanks!
Laszlo

> 
> Thanks!
> Jeff
> 
> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Wednesday, November 23, 2016 4:26 AM
> To: edk2-devel-01
> Cc: Igor Mammedov; Fan, Jeff
> Subject: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide 
> a known CPU count upfront
> 
> On QEMU/KVM, the VCPU topology for a guest is specified dynamically. It can be a low number or a high number.
> 
> Waiting for PcdCpuApInitTimeOutInMicroSeconds during the initial AP collection is impractical, because in a VM, the time needed for an AP to wake up can vary widely:
> 
> - if the APs report back quickly, then we could be wasting time,
> 
> - if the APs report back late (due to scheduling artifacts or VCPU
>   over-subscription on the virtualization host), then the timeout can
>   elapse before all APs increment CpuMpData->CpuCount in
>   ApWakeupFunction().
> 
> Trying to set PcdCpuApInitTimeOutInMicroSeconds dynamically is also impractical, as scheduling artifacts on the KVM host may delay AP threads arbitrarily.
> 
> Instead, allow OVMF (and other platforms) to tell MpInitLib the number of boot-time CPUs in advance.
> 
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Jeff Fan <jeff.fan@intel.com>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  UefiCpuPkg/UefiCpuPkg.dec                     | 11 +++++++++++
>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |  1 +  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |  4 ++--
>  UefiCpuPkg/Library/MpInitLib/MpLib.c          | 16 ++++++++++++----
>  4 files changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec 
> index ca560398bbef..35903c4386e4 100644
> --- a/UefiCpuPkg/UefiCpuPkg.dec
> +++ b/UefiCpuPkg/UefiCpuPkg.dec
> @@ -225,5 +225,16 @@ [PcdsDynamic, PcdsDynamicEx]
>    # @ValidList   0x80000001 | 0
>    
> gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress|0x0|UINT64|0x600000
> 11
>  
> +  ## On platforms where the number of boot-time CPUs can be 
> + dynamically #  retrieved from a platform-specific information 
> + source, the BSP does not  #  have to wait for 
> + PcdCpuApInitTimeOutInMicroSeconds in order to detect all  #  APs for 
> + the first time. On such platforms, this PCD specifies the number  #  
> + of CPUs available at boot. If the platform doesn't support this 
> + feature,  #  this PCD should be set to 0. The platform is 
> + responsible for ensuring that  #  this PCD is never set to a value larger than  #  PcdCpuMaxLogicalProcessorNumber.
> +  # @Prompt Known number of CPUs available at boot.
> +  
> + gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0|UINT32
> + |0
> + x60000012
> +
>  [UserExtensions.TianoCore."ExtraFiles"]
>    UefiCpuPkgExtra.uni
> diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf 
> b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
> index 11b230174ec8..dc18eaf6152d 100644
> --- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
> +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
> @@ -61,6 +61,7 @@ [Guids]
>  
>  [Pcd]
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## SOMETIMES_CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
> diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf 
> b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
> index 0c6873da79db..2bcfa70ae7e5 100644
> --- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
> +++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
> @@ -61,10 +61,10 @@ [Ppis]
>  
>  [Pcd]
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
> -  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## CONSUMES
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## CONSUMES
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchRegionSize         ## CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApLoopMode                       ## CONSUMES
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApTargetCstate                   ## SOMETIMES_CONSUMES
> -
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c 
> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> index 15dbfa1e7d6c..f7dfbd5bad13 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> @@ -707,6 +707,7 @@ WakeUpAP (
>    CPU_AP_DATA                      *CpuData;
>    BOOLEAN                          ResetVectorRequired;
>    CPU_INFO_IN_HOB                  *CpuInfoInHob;
> +  UINT32                           KnownProcessorCount;
>  
>    CpuMpData->FinishedCount = 0;
>    ResetVectorRequired = FALSE;
> @@ -745,10 +746,17 @@ WakeUpAP (
>        SendInitSipiSipiAllExcludingSelf ((UINT32) ExchangeInfo->BufferStart);
>      }
>      if (CpuMpData->InitFlag == ApInitConfig) {
> -      //
> -      // Wait for all potential APs waken up in one specified period
> -      //
> -      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
> +      KnownProcessorCount = PcdGet32 (PcdCpuKnownLogicalProcessorNumber);
> +      if (KnownProcessorCount > 0) {
> +        while (CpuMpData->FinishedCount < (KnownProcessorCount - 1)) {
> +          CpuPause ();
> +        }
> +      } else {
> +        //
> +        // Wait for all potential APs waken up in one specified period
> +        //
> +        MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
> +      }
>      } else {
>        //
>        // Wait all APs waken up if this is not the 1st broadcast of 
> SIPI
> --
> 2.9.2
> 
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
  2016-11-24  1:18       ` Fan, Jeff
@ 2016-11-24  9:36         ` Laszlo Ersek
  2016-11-24 13:49           ` Fan, Jeff
  2016-11-24 18:32           ` Igor Mammedov
  0 siblings, 2 replies; 15+ messages in thread
From: Laszlo Ersek @ 2016-11-24  9:36 UTC (permalink / raw)
  To: Fan, Jeff, edk2-devel-01

On 11/24/16 02:18, Fan, Jeff wrote:
> Laszlo,
> 
> Please use existing PcdCpuMaxLogicalProcessorNumber firstly.

Okay, I will submit a new version of the series with this change.

> So, for the platforms do not support CPU hot-plug. It could benefit the boot performance.
> For the platforms supporting CPU hot-plug.  It will have no any impact.
> 
> Introducing new PCD is good idea to benefit boot performance on S3 path, but it is some complex. 
> For example, if DXE phase disables some CPU and do reboot. Real platform may need another policy to set this PCD used by PEI phase.
> 
> Could you file one bugzilla for this new PCD requirement?

I filed <https://bugzilla.tianocore.org/show_bug.cgi?id=251> and
assigned it to you; I hope that's alright. Feel free to edit the BZ
title; I might not have captured the use case exactly.

Thanks!
Laszlo


> Thanks!
> Jeff
> 
> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com] 
> Sent: Thursday, November 24, 2016 12:05 AM
> To: Fan, Jeff; edk2-devel-01
> Cc: Igor Mammedov
> Subject: Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
> 
> On 11/23/16 06:36, Fan, Jeff wrote:
>> Laszlo,
>>
>> I ever submitted one bugzilla for such request at
>> https://bugzilla.tianocore.org/show_bug.cgi?id=116
> 
> Huh, I didn't know about that. :)
> 
>> My point is not to add new PCD and just to use existing PCD 
>> PcdCpuMaxLogicalProcessorNumber that supports dynamic type.
>>
>> Platform Peim could update it is platform pei module before memory ready.
>>
>> MpInitLib just exits from wait loop once the discovered processors 
>> number reaches the PcdCpuMaxLogicalProcessorNumber.
>>
>> Does it meet your requirement?
> 
> This was the first idea I had as well, yes. However, I considered the following case:
> 
> - QEMU is started (and the guest is booted) with N online processors and M > N *possible* processors.
> 
> - During OS runtime, the user hotplugs one or more additional processors, so that the number of currently online processors is now larger than N.
> 
> - The user suspends and resumes (S3) the virtual machine.
> 
> - During S3 resume, QEMU will report the increased number of boot processors. (At S3 resume, QEMU specifically refreshes the fw_cfg item that returns this value to the firmware, from the increased number of
> VCPUs.)
> 
> - During S3 resume, we cannot modify PcdCpuMaxLogicalProcessorNumber any longer, because the UefiCpuPkg modules that consume the PCD have already allocated their data structures using the first (lower) value. Those allocations cannot be resized / extended during S3 resume.
> 
> So the idea is, set PcdCpuMaxLogicalProcessorNumber statically to the number of the expected maximum VCPU count, which will ensure the proper allocations during first boot. Then allow PcdCpuKnownLogicalProcessorNumber to change -- hot-unplug is also possible -- from normal boot to S3 resume.
> 
> (The concept of "expected maximum" is not specific to VCPU hotplug in QEMU, it applies to memory (DIMM) hotplug and memory ballooning as well.)
> 
> Of course, I don't know (or expect) that the UefiCpuPkg modules fully support this use case right now; it's just that I didn't want to
> *prevent* this use case. So I figured I'd preserve the original role of PcdCpuMaxLogicalProcessorNumber, and introduce a more dynamic value for the "current boot VCPU count".
> 
> Do you agree that it makes sense to keep these quantities separate?
> 
> If you think we should just set PcdCpuMaxLogicalProcessorNumber dynamically at this point, and introduce no new PCD for now, I can do that too. I just wanted to make sure that we discuss VCPU hot-plug / hot-unplug and S3 resume first.
> 
> Thanks!
> Laszlo
> 
>>
>> Thanks!
>> Jeff
>>
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>> Sent: Wednesday, November 23, 2016 4:26 AM
>> To: edk2-devel-01
>> Cc: Igor Mammedov; Fan, Jeff
>> Subject: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide 
>> a known CPU count upfront
>>
>> On QEMU/KVM, the VCPU topology for a guest is specified dynamically. It can be a low number or a high number.
>>
>> Waiting for PcdCpuApInitTimeOutInMicroSeconds during the initial AP collection is impractical, because in a VM, the time needed for an AP to wake up can vary widely:
>>
>> - if the APs report back quickly, then we could be wasting time,
>>
>> - if the APs report back late (due to scheduling artifacts or VCPU
>>   over-subscription on the virtualization host), then the timeout can
>>   elapse before all APs increment CpuMpData->CpuCount in
>>   ApWakeupFunction().
>>
>> Trying to set PcdCpuApInitTimeOutInMicroSeconds dynamically is also impractical, as scheduling artifacts on the KVM host may delay AP threads arbitrarily.
>>
>> Instead, allow OVMF (and other platforms) to tell MpInitLib the number of boot-time CPUs in advance.
>>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Jeff Fan <jeff.fan@intel.com>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>>  UefiCpuPkg/UefiCpuPkg.dec                     | 11 +++++++++++
>>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |  1 +  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |  4 ++--
>>  UefiCpuPkg/Library/MpInitLib/MpLib.c          | 16 ++++++++++++----
>>  4 files changed, 26 insertions(+), 6 deletions(-)
>>
>> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec 
>> index ca560398bbef..35903c4386e4 100644
>> --- a/UefiCpuPkg/UefiCpuPkg.dec
>> +++ b/UefiCpuPkg/UefiCpuPkg.dec
>> @@ -225,5 +225,16 @@ [PcdsDynamic, PcdsDynamicEx]
>>    # @ValidList   0x80000001 | 0
>>    
>> gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress|0x0|UINT64|0x600000
>> 11
>>  
>> +  ## On platforms where the number of boot-time CPUs can be 
>> + dynamically #  retrieved from a platform-specific information 
>> + source, the BSP does not  #  have to wait for 
>> + PcdCpuApInitTimeOutInMicroSeconds in order to detect all  #  APs for 
>> + the first time. On such platforms, this PCD specifies the number  #  
>> + of CPUs available at boot. If the platform doesn't support this 
>> + feature,  #  this PCD should be set to 0. The platform is 
>> + responsible for ensuring that  #  this PCD is never set to a value larger than  #  PcdCpuMaxLogicalProcessorNumber.
>> +  # @Prompt Known number of CPUs available at boot.
>> +  
>> + gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0|UINT32
>> + |0
>> + x60000012
>> +
>>  [UserExtensions.TianoCore."ExtraFiles"]
>>    UefiCpuPkgExtra.uni
>> diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf 
>> b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
>> index 11b230174ec8..dc18eaf6152d 100644
>> --- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
>> +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
>> @@ -61,6 +61,7 @@ [Guids]
>>  
>>  [Pcd]
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## SOMETIMES_CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
>> diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf 
>> b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
>> index 0c6873da79db..2bcfa70ae7e5 100644
>> --- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
>> +++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
>> @@ -61,10 +61,10 @@ [Ppis]
>>  
>>  [Pcd]
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
>> -  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## CONSUMES
>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## CONSUMES
>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchRegionSize         ## CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApLoopMode                       ## CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApTargetCstate                   ## SOMETIMES_CONSUMES
>> -
>> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c 
>> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> index 15dbfa1e7d6c..f7dfbd5bad13 100644
>> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> @@ -707,6 +707,7 @@ WakeUpAP (
>>    CPU_AP_DATA                      *CpuData;
>>    BOOLEAN                          ResetVectorRequired;
>>    CPU_INFO_IN_HOB                  *CpuInfoInHob;
>> +  UINT32                           KnownProcessorCount;
>>  
>>    CpuMpData->FinishedCount = 0;
>>    ResetVectorRequired = FALSE;
>> @@ -745,10 +746,17 @@ WakeUpAP (
>>        SendInitSipiSipiAllExcludingSelf ((UINT32) ExchangeInfo->BufferStart);
>>      }
>>      if (CpuMpData->InitFlag == ApInitConfig) {
>> -      //
>> -      // Wait for all potential APs waken up in one specified period
>> -      //
>> -      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
>> +      KnownProcessorCount = PcdGet32 (PcdCpuKnownLogicalProcessorNumber);
>> +      if (KnownProcessorCount > 0) {
>> +        while (CpuMpData->FinishedCount < (KnownProcessorCount - 1)) {
>> +          CpuPause ();
>> +        }
>> +      } else {
>> +        //
>> +        // Wait for all potential APs waken up in one specified period
>> +        //
>> +        MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
>> +      }
>>      } else {
>>        //
>>        // Wait all APs waken up if this is not the 1st broadcast of 
>> SIPI
>> --
>> 2.9.2
>>
>>
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
  2016-11-24  9:36         ` Laszlo Ersek
@ 2016-11-24 13:49           ` Fan, Jeff
  2016-11-24 18:32           ` Igor Mammedov
  1 sibling, 0 replies; 15+ messages in thread
From: Fan, Jeff @ 2016-11-24 13:49 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-01

Laszlo,

Thanks!

Jeff
-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com] 
Sent: Thursday, November 24, 2016 5:37 PM
To: Fan, Jeff; edk2-devel-01
Cc: Igor Mammedov
Subject: Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront

On 11/24/16 02:18, Fan, Jeff wrote:
> Laszlo,
> 
> Please use existing PcdCpuMaxLogicalProcessorNumber firstly.

Okay, I will submit a new version of the series with this change.

> So, for the platforms do not support CPU hot-plug. It could benefit the boot performance.
> For the platforms supporting CPU hot-plug.  It will have no any impact.
> 
> Introducing new PCD is good idea to benefit boot performance on S3 path, but it is some complex. 
> For example, if DXE phase disables some CPU and do reboot. Real platform may need another policy to set this PCD used by PEI phase.
> 
> Could you file one bugzilla for this new PCD requirement?

I filed <https://bugzilla.tianocore.org/show_bug.cgi?id=251> and assigned it to you; I hope that's alright. Feel free to edit the BZ title; I might not have captured the use case exactly.

Thanks!
Laszlo


> Thanks!
> Jeff
> 
> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Thursday, November 24, 2016 12:05 AM
> To: Fan, Jeff; edk2-devel-01
> Cc: Igor Mammedov
> Subject: Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to 
> provide a known CPU count upfront
> 
> On 11/23/16 06:36, Fan, Jeff wrote:
>> Laszlo,
>>
>> I ever submitted one bugzilla for such request at
>> https://bugzilla.tianocore.org/show_bug.cgi?id=116
> 
> Huh, I didn't know about that. :)
> 
>> My point is not to add new PCD and just to use existing PCD 
>> PcdCpuMaxLogicalProcessorNumber that supports dynamic type.
>>
>> Platform Peim could update it is platform pei module before memory ready.
>>
>> MpInitLib just exits from wait loop once the discovered processors 
>> number reaches the PcdCpuMaxLogicalProcessorNumber.
>>
>> Does it meet your requirement?
> 
> This was the first idea I had as well, yes. However, I considered the following case:
> 
> - QEMU is started (and the guest is booted) with N online processors and M > N *possible* processors.
> 
> - During OS runtime, the user hotplugs one or more additional processors, so that the number of currently online processors is now larger than N.
> 
> - The user suspends and resumes (S3) the virtual machine.
> 
> - During S3 resume, QEMU will report the increased number of boot 
> processors. (At S3 resume, QEMU specifically refreshes the fw_cfg item 
> that returns this value to the firmware, from the increased number of
> VCPUs.)
> 
> - During S3 resume, we cannot modify PcdCpuMaxLogicalProcessorNumber any longer, because the UefiCpuPkg modules that consume the PCD have already allocated their data structures using the first (lower) value. Those allocations cannot be resized / extended during S3 resume.
> 
> So the idea is, set PcdCpuMaxLogicalProcessorNumber statically to the number of the expected maximum VCPU count, which will ensure the proper allocations during first boot. Then allow PcdCpuKnownLogicalProcessorNumber to change -- hot-unplug is also possible -- from normal boot to S3 resume.
> 
> (The concept of "expected maximum" is not specific to VCPU hotplug in 
> QEMU, it applies to memory (DIMM) hotplug and memory ballooning as 
> well.)
> 
> Of course, I don't know (or expect) that the UefiCpuPkg modules fully 
> support this use case right now; it's just that I didn't want to
> *prevent* this use case. So I figured I'd preserve the original role of PcdCpuMaxLogicalProcessorNumber, and introduce a more dynamic value for the "current boot VCPU count".
> 
> Do you agree that it makes sense to keep these quantities separate?
> 
> If you think we should just set PcdCpuMaxLogicalProcessorNumber dynamically at this point, and introduce no new PCD for now, I can do that too. I just wanted to make sure that we discuss VCPU hot-plug / hot-unplug and S3 resume first.
> 
> Thanks!
> Laszlo
> 
>>
>> Thanks!
>> Jeff
>>
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>> Sent: Wednesday, November 23, 2016 4:26 AM
>> To: edk2-devel-01
>> Cc: Igor Mammedov; Fan, Jeff
>> Subject: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide 
>> a known CPU count upfront
>>
>> On QEMU/KVM, the VCPU topology for a guest is specified dynamically. It can be a low number or a high number.
>>
>> Waiting for PcdCpuApInitTimeOutInMicroSeconds during the initial AP collection is impractical, because in a VM, the time needed for an AP to wake up can vary widely:
>>
>> - if the APs report back quickly, then we could be wasting time,
>>
>> - if the APs report back late (due to scheduling artifacts or VCPU
>>   over-subscription on the virtualization host), then the timeout can
>>   elapse before all APs increment CpuMpData->CpuCount in
>>   ApWakeupFunction().
>>
>> Trying to set PcdCpuApInitTimeOutInMicroSeconds dynamically is also impractical, as scheduling artifacts on the KVM host may delay AP threads arbitrarily.
>>
>> Instead, allow OVMF (and other platforms) to tell MpInitLib the number of boot-time CPUs in advance.
>>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Jeff Fan <jeff.fan@intel.com>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>>  UefiCpuPkg/UefiCpuPkg.dec                     | 11 +++++++++++
>>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |  1 +  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |  4 ++--
>>  UefiCpuPkg/Library/MpInitLib/MpLib.c          | 16 ++++++++++++----
>>  4 files changed, 26 insertions(+), 6 deletions(-)
>>
>> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec 
>> index ca560398bbef..35903c4386e4 100644
>> --- a/UefiCpuPkg/UefiCpuPkg.dec
>> +++ b/UefiCpuPkg/UefiCpuPkg.dec
>> @@ -225,5 +225,16 @@ [PcdsDynamic, PcdsDynamicEx]
>>    # @ValidList   0x80000001 | 0
>>    
>> gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress|0x0|UINT64|0x60000
>> 0
>> 11
>>  
>> +  ## On platforms where the number of boot-time CPUs can be 
>> + dynamically #  retrieved from a platform-specific information 
>> + source, the BSP does not  #  have to wait for 
>> + PcdCpuApInitTimeOutInMicroSeconds in order to detect all  #  APs 
>> + for the first time. On such platforms, this PCD specifies the 
>> + number  # of CPUs available at boot. If the platform doesn't 
>> + support this feature,  #  this PCD should be set to 0. The platform 
>> + is responsible for ensuring that  #  this PCD is never set to a value larger than  #  PcdCpuMaxLogicalProcessorNumber.
>> +  # @Prompt Known number of CPUs available at boot.
>> +  
>> + gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0|UINT3
>> + 2
>> + |0
>> + x60000012
>> +
>>  [UserExtensions.TianoCore."ExtraFiles"]
>>    UefiCpuPkgExtra.uni
>> diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
>> b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
>> index 11b230174ec8..dc18eaf6152d 100644
>> --- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
>> +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
>> @@ -61,6 +61,7 @@ [Guids]
>>  
>>  [Pcd]
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## SOMETIMES_CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
>> diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
>> b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
>> index 0c6873da79db..2bcfa70ae7e5 100644
>> --- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
>> +++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
>> @@ -61,10 +61,10 @@ [Ppis]
>>  
>>  [Pcd]
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
>> -  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## CONSUMES
>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## CONSUMES
>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchRegionSize         ## CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApLoopMode                       ## CONSUMES
>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApTargetCstate                   ## SOMETIMES_CONSUMES
>> -
>> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> index 15dbfa1e7d6c..f7dfbd5bad13 100644
>> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> @@ -707,6 +707,7 @@ WakeUpAP (
>>    CPU_AP_DATA                      *CpuData;
>>    BOOLEAN                          ResetVectorRequired;
>>    CPU_INFO_IN_HOB                  *CpuInfoInHob;
>> +  UINT32                           KnownProcessorCount;
>>  
>>    CpuMpData->FinishedCount = 0;
>>    ResetVectorRequired = FALSE;
>> @@ -745,10 +746,17 @@ WakeUpAP (
>>        SendInitSipiSipiAllExcludingSelf ((UINT32) ExchangeInfo->BufferStart);
>>      }
>>      if (CpuMpData->InitFlag == ApInitConfig) {
>> -      //
>> -      // Wait for all potential APs waken up in one specified period
>> -      //
>> -      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
>> +      KnownProcessorCount = PcdGet32 (PcdCpuKnownLogicalProcessorNumber);
>> +      if (KnownProcessorCount > 0) {
>> +        while (CpuMpData->FinishedCount < (KnownProcessorCount - 1)) {
>> +          CpuPause ();
>> +        }
>> +      } else {
>> +        //
>> +        // Wait for all potential APs waken up in one specified period
>> +        //
>> +        MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
>> +      }
>>      } else {
>>        //
>>        // Wait all APs waken up if this is not the 1st broadcast of 
>> SIPI
>> --
>> 2.9.2
>>
>>
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
  2016-11-24  9:36         ` Laszlo Ersek
  2016-11-24 13:49           ` Fan, Jeff
@ 2016-11-24 18:32           ` Igor Mammedov
  2016-11-24 21:20             ` Laszlo Ersek
  1 sibling, 1 reply; 15+ messages in thread
From: Igor Mammedov @ 2016-11-24 18:32 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: Fan, Jeff, edk2-devel-01

On Thu, 24 Nov 2016 10:36:38 +0100
Laszlo Ersek <lersek@redhat.com> wrote:

> On 11/24/16 02:18, Fan, Jeff wrote:
> > Laszlo,
> > 
> > Please use existing PcdCpuMaxLogicalProcessorNumber firstly.
> 
> Okay, I will submit a new version of the series with this change.
Using present count of CPUs might potentially break things if
there were hotplugged CPUs with follow up S3 and wakeup dute this code:

MpInitLibInitialize()
  OldCpuMpData = GetCpuMpDataFromGuidedHob ();                                   
  if (OldCpuMpData == NULL) {                                                    
    MaxLogicalProcessorNumber = PcdGet32(PcdCpuMaxLogicalProcessorNumber);       
  } else {                                                                       
    MaxLogicalProcessorNumber = OldCpuMpData->CpuCount;                          
  }

I've traced this part of code under QEMU with PcdCpuMaxLogicalProcessorNumber
set to present number of CPUs and currently OldCpuMpData is always NULL on resume
so PcdCpuMaxLogicalProcessorNumber gets updated CPU count from platform and
it resumes seemingly fine but I won't bet that it's correct.

here is a patch I've used for this experiment:

diff --git a/OvmfPkg/OvmfPkgIa32.dsc b/OvmfPkg/OvmfPkgIa32.dsc
index 3f4d42d..ce80690 100644
--- a/OvmfPkg/OvmfPkgIa32.dsc
+++ b/OvmfPkg/OvmfPkgIa32.dsc
@@ -455,6 +455,7 @@
 ################################################################################
 
 [PcdsDynamicDefault]
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|10
   gEfiMdeModulePkgTokenSpaceGuid.PcdEmuVariableNvStoreReserved|0
   gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageVariableBase64|0
   gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageFtwWorkingBase|0
diff --git a/OvmfPkg/OvmfPkgIa32X64.dsc b/OvmfPkg/OvmfPkgIa32X64.dsc
index 5688475..f7697b1 100644
--- a/OvmfPkg/OvmfPkgIa32X64.dsc
+++ b/OvmfPkg/OvmfPkgIa32X64.dsc
@@ -461,6 +461,7 @@
 ################################################################################
 
 [PcdsDynamicDefault]
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|10
   gEfiMdeModulePkgTokenSpaceGuid.PcdEmuVariableNvStoreReserved|0
   gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageVariableBase64|0
   gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageFtwWorkingBase|0
diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
index dcf64b9..4034989 100644
--- a/OvmfPkg/OvmfPkgX64.dsc
+++ b/OvmfPkg/OvmfPkgX64.dsc
@@ -460,6 +460,7 @@
 ################################################################################
 
 [PcdsDynamicDefault]
+  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|10
   gEfiMdeModulePkgTokenSpaceGuid.PcdEmuVariableNvStoreReserved|0
   gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageVariableBase64|0
   gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageFtwWorkingBase|0
diff --git a/OvmfPkg/PlatformPei/MemDetect.c b/OvmfPkg/PlatformPei/MemDetect.c
index d00a570..e205fe7 100644
--- a/OvmfPkg/PlatformPei/MemDetect.c
+++ b/OvmfPkg/PlatformPei/MemDetect.c
@@ -342,6 +342,7 @@ PublishPeiMemory (
   UINT64                      MemorySize;
   UINT32                      LowerMemorySize;
   UINT32                      PeiMemoryCap;
+  UINT16       CpuCount;
 
   LowerMemorySize = GetSystemMemorySizeBelow4gb ();
   if (FeaturePcdGet (PcdSmmSmramRequire)) {
@@ -351,6 +352,12 @@ PublishPeiMemory (
     LowerMemorySize -= FixedPcdGet8 (PcdQ35TsegMbytes) * SIZE_1MB;
   }
 
+  QemuFwCfgSelectItem (QemuFwCfgItemSmpCpuCount);
+  //QemuFwCfgSelectItem (QemuFwCfgItemMaximumCpuCount);
+  CpuCount = QemuFwCfgRead16 ();
+  Status = PcdSet32S(PcdCpuMaxLogicalProcessorNumber, CpuCount);
+  DEBUG ((EFI_D_INFO, "XXXXXXXXXXXXXX: QemuFwCfgItemSmpCpuCount: %u\n", CpuCount));
+
   //
   // If S3 is supported, then the S3 permanent PEI memory is placed next,
   // downwards. Its size is primarily dictated by CpuMpPei. The formula below
 
> > So, for the platforms do not support CPU hot-plug. It could benefit the boot performance.
> > For the platforms supporting CPU hot-plug.  It will have no any impact.
> > 
> > Introducing new PCD is good idea to benefit boot performance on S3 path, but it is some complex. 
> > For example, if DXE phase disables some CPU and do reboot. Real platform may need another policy to set this PCD used by PEI phase.
> > 
> > Could you file one bugzilla for this new PCD requirement?
> 
> I filed <https://bugzilla.tianocore.org/show_bug.cgi?id=251> and
> assigned it to you; I hope that's alright. Feel free to edit the BZ
> title; I might not have captured the use case exactly.
> 
> Thanks!
> Laszlo
> 
> 
> > Thanks!
> > Jeff
> > 
> > -----Original Message-----
> > From: Laszlo Ersek [mailto:lersek@redhat.com] 
> > Sent: Thursday, November 24, 2016 12:05 AM
> > To: Fan, Jeff; edk2-devel-01
> > Cc: Igor Mammedov
> > Subject: Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
> > 
> > On 11/23/16 06:36, Fan, Jeff wrote:
> >> Laszlo,
> >>
> >> I ever submitted one bugzilla for such request at
> >> https://bugzilla.tianocore.org/show_bug.cgi?id=116
> > 
> > Huh, I didn't know about that. :)
> > 
> >> My point is not to add new PCD and just to use existing PCD 
> >> PcdCpuMaxLogicalProcessorNumber that supports dynamic type.
> >>
> >> Platform Peim could update it is platform pei module before memory ready.
> >>
> >> MpInitLib just exits from wait loop once the discovered processors 
> >> number reaches the PcdCpuMaxLogicalProcessorNumber.
> >>
> >> Does it meet your requirement?
> > 
> > This was the first idea I had as well, yes. However, I considered the following case:
> > 
> > - QEMU is started (and the guest is booted) with N online processors and M > N *possible* processors.
> > 
> > - During OS runtime, the user hotplugs one or more additional processors, so that the number of currently online processors is now larger than N.
> > 
> > - The user suspends and resumes (S3) the virtual machine.
> > 
> > - During S3 resume, QEMU will report the increased number of boot processors. (At S3 resume, QEMU specifically refreshes the fw_cfg item that returns this value to the firmware, from the increased number of
> > VCPUs.)
> > 
> > - During S3 resume, we cannot modify PcdCpuMaxLogicalProcessorNumber any longer, because the UefiCpuPkg modules that consume the PCD have already allocated their data structures using the first (lower) value. Those allocations cannot be resized / extended during S3 resume.
> > 
> > So the idea is, set PcdCpuMaxLogicalProcessorNumber statically to the number of the expected maximum VCPU count, which will ensure the proper allocations during first boot. Then allow PcdCpuKnownLogicalProcessorNumber to change -- hot-unplug is also possible -- from normal boot to S3 resume.
> > 
> > (The concept of "expected maximum" is not specific to VCPU hotplug in QEMU, it applies to memory (DIMM) hotplug and memory ballooning as well.)
> > 
> > Of course, I don't know (or expect) that the UefiCpuPkg modules fully support this use case right now; it's just that I didn't want to
> > *prevent* this use case. So I figured I'd preserve the original role of PcdCpuMaxLogicalProcessorNumber, and introduce a more dynamic value for the "current boot VCPU count".
> > 
> > Do you agree that it makes sense to keep these quantities separate?
> > 
> > If you think we should just set PcdCpuMaxLogicalProcessorNumber dynamically at this point, and introduce no new PCD for now, I can do that too. I just wanted to make sure that we discuss VCPU hot-plug / hot-unplug and S3 resume first.
> > 
> > Thanks!
> > Laszlo
> > 
> >>
> >> Thanks!
> >> Jeff
> >>
> >> -----Original Message-----
> >> From: Laszlo Ersek [mailto:lersek@redhat.com]
> >> Sent: Wednesday, November 23, 2016 4:26 AM
> >> To: edk2-devel-01
> >> Cc: Igor Mammedov; Fan, Jeff
> >> Subject: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide 
> >> a known CPU count upfront
> >>
> >> On QEMU/KVM, the VCPU topology for a guest is specified dynamically. It can be a low number or a high number.
> >>
> >> Waiting for PcdCpuApInitTimeOutInMicroSeconds during the initial AP collection is impractical, because in a VM, the time needed for an AP to wake up can vary widely:
> >>
> >> - if the APs report back quickly, then we could be wasting time,
> >>
> >> - if the APs report back late (due to scheduling artifacts or VCPU
> >>   over-subscription on the virtualization host), then the timeout can
> >>   elapse before all APs increment CpuMpData->CpuCount in
> >>   ApWakeupFunction().
> >>
> >> Trying to set PcdCpuApInitTimeOutInMicroSeconds dynamically is also impractical, as scheduling artifacts on the KVM host may delay AP threads arbitrarily.
> >>
> >> Instead, allow OVMF (and other platforms) to tell MpInitLib the number of boot-time CPUs in advance.
> >>
> >> Cc: Igor Mammedov <imammedo@redhat.com>
> >> Cc: Jeff Fan <jeff.fan@intel.com>
> >> Contributed-under: TianoCore Contribution Agreement 1.0
> >> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> >> ---
> >>  UefiCpuPkg/UefiCpuPkg.dec                     | 11 +++++++++++
> >>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |  1 +  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |  4 ++--
> >>  UefiCpuPkg/Library/MpInitLib/MpLib.c          | 16 ++++++++++++----
> >>  4 files changed, 26 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec 
> >> index ca560398bbef..35903c4386e4 100644
> >> --- a/UefiCpuPkg/UefiCpuPkg.dec
> >> +++ b/UefiCpuPkg/UefiCpuPkg.dec
> >> @@ -225,5 +225,16 @@ [PcdsDynamic, PcdsDynamicEx]
> >>    # @ValidList   0x80000001 | 0
> >>    
> >> gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress|0x0|UINT64|0x600000
> >> 11
> >>  
> >> +  ## On platforms where the number of boot-time CPUs can be 
> >> + dynamically #  retrieved from a platform-specific information 
> >> + source, the BSP does not  #  have to wait for 
> >> + PcdCpuApInitTimeOutInMicroSeconds in order to detect all  #  APs for 
> >> + the first time. On such platforms, this PCD specifies the number  #  
> >> + of CPUs available at boot. If the platform doesn't support this 
> >> + feature,  #  this PCD should be set to 0. The platform is 
> >> + responsible for ensuring that  #  this PCD is never set to a value larger than  #  PcdCpuMaxLogicalProcessorNumber.
> >> +  # @Prompt Known number of CPUs available at boot.
> >> +  
> >> + gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0|UINT32
> >> + |0
> >> + x60000012
> >> +
> >>  [UserExtensions.TianoCore."ExtraFiles"]
> >>    UefiCpuPkgExtra.uni
> >> diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf 
> >> b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
> >> index 11b230174ec8..dc18eaf6152d 100644
> >> --- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
> >> +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
> >> @@ -61,6 +61,7 @@ [Guids]
> >>  
> >>  [Pcd]
> >>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
> >> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## SOMETIMES_CONSUMES
> >>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
> >>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
> >>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
> >> diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf 
> >> b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
> >> index 0c6873da79db..2bcfa70ae7e5 100644
> >> --- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
> >> +++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
> >> @@ -61,10 +61,10 @@ [Ppis]
> >>  
> >>  [Pcd]
> >>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
> >> -  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## CONSUMES
> >> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## CONSUMES
> >> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
> >>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
> >>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
> >>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchRegionSize         ## CONSUMES
> >>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApLoopMode                       ## CONSUMES
> >>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApTargetCstate                   ## SOMETIMES_CONSUMES
> >> -
> >> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c 
> >> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> >> index 15dbfa1e7d6c..f7dfbd5bad13 100644
> >> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> >> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> >> @@ -707,6 +707,7 @@ WakeUpAP (
> >>    CPU_AP_DATA                      *CpuData;
> >>    BOOLEAN                          ResetVectorRequired;
> >>    CPU_INFO_IN_HOB                  *CpuInfoInHob;
> >> +  UINT32                           KnownProcessorCount;
> >>  
> >>    CpuMpData->FinishedCount = 0;
> >>    ResetVectorRequired = FALSE;
> >> @@ -745,10 +746,17 @@ WakeUpAP (
> >>        SendInitSipiSipiAllExcludingSelf ((UINT32) ExchangeInfo->BufferStart);
> >>      }
> >>      if (CpuMpData->InitFlag == ApInitConfig) {
> >> -      //
> >> -      // Wait for all potential APs waken up in one specified period
> >> -      //
> >> -      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
> >> +      KnownProcessorCount = PcdGet32 (PcdCpuKnownLogicalProcessorNumber);
> >> +      if (KnownProcessorCount > 0) {
> >> +        while (CpuMpData->FinishedCount < (KnownProcessorCount - 1)) {
> >> +          CpuPause ();
> >> +        }
> >> +      } else {
> >> +        //
> >> +        // Wait for all potential APs waken up in one specified period
> >> +        //
> >> +        MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
> >> +      }
> >>      } else {
> >>        //
> >>        // Wait all APs waken up if this is not the 1st broadcast of 
> >> SIPI
> >> --
> >> 2.9.2
> >>
> >>
> > 
> 



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
  2016-11-24 18:32           ` Igor Mammedov
@ 2016-11-24 21:20             ` Laszlo Ersek
  0 siblings, 0 replies; 15+ messages in thread
From: Laszlo Ersek @ 2016-11-24 21:20 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: Fan, Jeff, edk2-devel-01

On 11/24/16 19:32, Igor Mammedov wrote:
> On Thu, 24 Nov 2016 10:36:38 +0100
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 11/24/16 02:18, Fan, Jeff wrote:
>>> Laszlo,
>>>
>>> Please use existing PcdCpuMaxLogicalProcessorNumber firstly.
>>
>> Okay, I will submit a new version of the series with this change.
> Using present count of CPUs might potentially break things if
> there were hotplugged CPUs with follow up S3 and wakeup dute this code:
> 
> MpInitLibInitialize()
>   OldCpuMpData = GetCpuMpDataFromGuidedHob ();                                   
>   if (OldCpuMpData == NULL) {                                                    
>     MaxLogicalProcessorNumber = PcdGet32(PcdCpuMaxLogicalProcessorNumber);       
>   } else {                                                                       
>     MaxLogicalProcessorNumber = OldCpuMpData->CpuCount;                          
>   }

Yes, this is a correct assessment. And, we're aware of it in general; I
raised the same scenario in my previous email.

It seems that Jeff prefers the simpler solution for now. The hotplug
case is tracked by <https://bugzilla.tianocore.org/show_bug.cgi?id=251>.

Thanks
Laszlo

> I've traced this part of code under QEMU with PcdCpuMaxLogicalProcessorNumber
> set to present number of CPUs and currently OldCpuMpData is always NULL on resume
> so PcdCpuMaxLogicalProcessorNumber gets updated CPU count from platform and
> it resumes seemingly fine but I won't bet that it's correct.
> 
> here is a patch I've used for this experiment:
> 
> diff --git a/OvmfPkg/OvmfPkgIa32.dsc b/OvmfPkg/OvmfPkgIa32.dsc
> index 3f4d42d..ce80690 100644
> --- a/OvmfPkg/OvmfPkgIa32.dsc
> +++ b/OvmfPkg/OvmfPkgIa32.dsc
> @@ -455,6 +455,7 @@
>  ################################################################################
>  
>  [PcdsDynamicDefault]
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|10
>    gEfiMdeModulePkgTokenSpaceGuid.PcdEmuVariableNvStoreReserved|0
>    gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageVariableBase64|0
>    gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageFtwWorkingBase|0
> diff --git a/OvmfPkg/OvmfPkgIa32X64.dsc b/OvmfPkg/OvmfPkgIa32X64.dsc
> index 5688475..f7697b1 100644
> --- a/OvmfPkg/OvmfPkgIa32X64.dsc
> +++ b/OvmfPkg/OvmfPkgIa32X64.dsc
> @@ -461,6 +461,7 @@
>  ################################################################################
>  
>  [PcdsDynamicDefault]
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|10
>    gEfiMdeModulePkgTokenSpaceGuid.PcdEmuVariableNvStoreReserved|0
>    gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageVariableBase64|0
>    gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageFtwWorkingBase|0
> diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
> index dcf64b9..4034989 100644
> --- a/OvmfPkg/OvmfPkgX64.dsc
> +++ b/OvmfPkg/OvmfPkgX64.dsc
> @@ -460,6 +460,7 @@
>  ################################################################################
>  
>  [PcdsDynamicDefault]
> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|10
>    gEfiMdeModulePkgTokenSpaceGuid.PcdEmuVariableNvStoreReserved|0
>    gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageVariableBase64|0
>    gEfiMdeModulePkgTokenSpaceGuid.PcdFlashNvStorageFtwWorkingBase|0
> diff --git a/OvmfPkg/PlatformPei/MemDetect.c b/OvmfPkg/PlatformPei/MemDetect.c
> index d00a570..e205fe7 100644
> --- a/OvmfPkg/PlatformPei/MemDetect.c
> +++ b/OvmfPkg/PlatformPei/MemDetect.c
> @@ -342,6 +342,7 @@ PublishPeiMemory (
>    UINT64                      MemorySize;
>    UINT32                      LowerMemorySize;
>    UINT32                      PeiMemoryCap;
> +  UINT16       CpuCount;
>  
>    LowerMemorySize = GetSystemMemorySizeBelow4gb ();
>    if (FeaturePcdGet (PcdSmmSmramRequire)) {
> @@ -351,6 +352,12 @@ PublishPeiMemory (
>      LowerMemorySize -= FixedPcdGet8 (PcdQ35TsegMbytes) * SIZE_1MB;
>    }
>  
> +  QemuFwCfgSelectItem (QemuFwCfgItemSmpCpuCount);
> +  //QemuFwCfgSelectItem (QemuFwCfgItemMaximumCpuCount);
> +  CpuCount = QemuFwCfgRead16 ();
> +  Status = PcdSet32S(PcdCpuMaxLogicalProcessorNumber, CpuCount);
> +  DEBUG ((EFI_D_INFO, "XXXXXXXXXXXXXX: QemuFwCfgItemSmpCpuCount: %u\n", CpuCount));
> +
>    //
>    // If S3 is supported, then the S3 permanent PEI memory is placed next,
>    // downwards. Its size is primarily dictated by CpuMpPei. The formula below
>  
>>> So, for the platforms do not support CPU hot-plug. It could benefit the boot performance.
>>> For the platforms supporting CPU hot-plug.  It will have no any impact.
>>>
>>> Introducing new PCD is good idea to benefit boot performance on S3 path, but it is some complex. 
>>> For example, if DXE phase disables some CPU and do reboot. Real platform may need another policy to set this PCD used by PEI phase.
>>>
>>> Could you file one bugzilla for this new PCD requirement?
>>
>> I filed <https://bugzilla.tianocore.org/show_bug.cgi?id=251> and
>> assigned it to you; I hope that's alright. Feel free to edit the BZ
>> title; I might not have captured the use case exactly.
>>
>> Thanks!
>> Laszlo
>>
>>
>>> Thanks!
>>> Jeff
>>>
>>> -----Original Message-----
>>> From: Laszlo Ersek [mailto:lersek@redhat.com] 
>>> Sent: Thursday, November 24, 2016 12:05 AM
>>> To: Fan, Jeff; edk2-devel-01
>>> Cc: Igor Mammedov
>>> Subject: Re: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront
>>>
>>> On 11/23/16 06:36, Fan, Jeff wrote:
>>>> Laszlo,
>>>>
>>>> I ever submitted one bugzilla for such request at
>>>> https://bugzilla.tianocore.org/show_bug.cgi?id=116
>>>
>>> Huh, I didn't know about that. :)
>>>
>>>> My point is not to add new PCD and just to use existing PCD 
>>>> PcdCpuMaxLogicalProcessorNumber that supports dynamic type.
>>>>
>>>> Platform Peim could update it is platform pei module before memory ready.
>>>>
>>>> MpInitLib just exits from wait loop once the discovered processors 
>>>> number reaches the PcdCpuMaxLogicalProcessorNumber.
>>>>
>>>> Does it meet your requirement?
>>>
>>> This was the first idea I had as well, yes. However, I considered the following case:
>>>
>>> - QEMU is started (and the guest is booted) with N online processors and M > N *possible* processors.
>>>
>>> - During OS runtime, the user hotplugs one or more additional processors, so that the number of currently online processors is now larger than N.
>>>
>>> - The user suspends and resumes (S3) the virtual machine.
>>>
>>> - During S3 resume, QEMU will report the increased number of boot processors. (At S3 resume, QEMU specifically refreshes the fw_cfg item that returns this value to the firmware, from the increased number of
>>> VCPUs.)
>>>
>>> - During S3 resume, we cannot modify PcdCpuMaxLogicalProcessorNumber any longer, because the UefiCpuPkg modules that consume the PCD have already allocated their data structures using the first (lower) value. Those allocations cannot be resized / extended during S3 resume.
>>>
>>> So the idea is, set PcdCpuMaxLogicalProcessorNumber statically to the number of the expected maximum VCPU count, which will ensure the proper allocations during first boot. Then allow PcdCpuKnownLogicalProcessorNumber to change -- hot-unplug is also possible -- from normal boot to S3 resume.
>>>
>>> (The concept of "expected maximum" is not specific to VCPU hotplug in QEMU, it applies to memory (DIMM) hotplug and memory ballooning as well.)
>>>
>>> Of course, I don't know (or expect) that the UefiCpuPkg modules fully support this use case right now; it's just that I didn't want to
>>> *prevent* this use case. So I figured I'd preserve the original role of PcdCpuMaxLogicalProcessorNumber, and introduce a more dynamic value for the "current boot VCPU count".
>>>
>>> Do you agree that it makes sense to keep these quantities separate?
>>>
>>> If you think we should just set PcdCpuMaxLogicalProcessorNumber dynamically at this point, and introduce no new PCD for now, I can do that too. I just wanted to make sure that we discuss VCPU hot-plug / hot-unplug and S3 resume first.
>>>
>>> Thanks!
>>> Laszlo
>>>
>>>>
>>>> Thanks!
>>>> Jeff
>>>>
>>>> -----Original Message-----
>>>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>>>> Sent: Wednesday, November 23, 2016 4:26 AM
>>>> To: edk2-devel-01
>>>> Cc: Igor Mammedov; Fan, Jeff
>>>> Subject: [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide 
>>>> a known CPU count upfront
>>>>
>>>> On QEMU/KVM, the VCPU topology for a guest is specified dynamically. It can be a low number or a high number.
>>>>
>>>> Waiting for PcdCpuApInitTimeOutInMicroSeconds during the initial AP collection is impractical, because in a VM, the time needed for an AP to wake up can vary widely:
>>>>
>>>> - if the APs report back quickly, then we could be wasting time,
>>>>
>>>> - if the APs report back late (due to scheduling artifacts or VCPU
>>>>   over-subscription on the virtualization host), then the timeout can
>>>>   elapse before all APs increment CpuMpData->CpuCount in
>>>>   ApWakeupFunction().
>>>>
>>>> Trying to set PcdCpuApInitTimeOutInMicroSeconds dynamically is also impractical, as scheduling artifacts on the KVM host may delay AP threads arbitrarily.
>>>>
>>>> Instead, allow OVMF (and other platforms) to tell MpInitLib the number of boot-time CPUs in advance.
>>>>
>>>> Cc: Igor Mammedov <imammedo@redhat.com>
>>>> Cc: Jeff Fan <jeff.fan@intel.com>
>>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>>>> ---
>>>>  UefiCpuPkg/UefiCpuPkg.dec                     | 11 +++++++++++
>>>>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |  1 +  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |  4 ++--
>>>>  UefiCpuPkg/Library/MpInitLib/MpLib.c          | 16 ++++++++++++----
>>>>  4 files changed, 26 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec 
>>>> index ca560398bbef..35903c4386e4 100644
>>>> --- a/UefiCpuPkg/UefiCpuPkg.dec
>>>> +++ b/UefiCpuPkg/UefiCpuPkg.dec
>>>> @@ -225,5 +225,16 @@ [PcdsDynamic, PcdsDynamicEx]
>>>>    # @ValidList   0x80000001 | 0
>>>>    
>>>> gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress|0x0|UINT64|0x600000
>>>> 11
>>>>  
>>>> +  ## On platforms where the number of boot-time CPUs can be 
>>>> + dynamically #  retrieved from a platform-specific information 
>>>> + source, the BSP does not  #  have to wait for 
>>>> + PcdCpuApInitTimeOutInMicroSeconds in order to detect all  #  APs for 
>>>> + the first time. On such platforms, this PCD specifies the number  #  
>>>> + of CPUs available at boot. If the platform doesn't support this 
>>>> + feature,  #  this PCD should be set to 0. The platform is 
>>>> + responsible for ensuring that  #  this PCD is never set to a value larger than  #  PcdCpuMaxLogicalProcessorNumber.
>>>> +  # @Prompt Known number of CPUs available at boot.
>>>> +  
>>>> + gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber|0|UINT32
>>>> + |0
>>>> + x60000012
>>>> +
>>>>  [UserExtensions.TianoCore."ExtraFiles"]
>>>>    UefiCpuPkgExtra.uni
>>>> diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf 
>>>> b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
>>>> index 11b230174ec8..dc18eaf6152d 100644
>>>> --- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
>>>> +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
>>>> @@ -61,6 +61,7 @@ [Guids]
>>>>  
>>>>  [Pcd]
>>>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
>>>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## SOMETIMES_CONSUMES
>>>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
>>>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
>>>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
>>>> diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf 
>>>> b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
>>>> index 0c6873da79db..2bcfa70ae7e5 100644
>>>> --- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
>>>> +++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
>>>> @@ -61,10 +61,10 @@ [Ppis]
>>>>  
>>>>  [Pcd]
>>>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
>>>> -  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## CONSUMES
>>>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownLogicalProcessorNumber      ## CONSUMES
>>>> +  gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds      ## SOMETIMES_CONSUMES
>>>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize                      ## CONSUMES
>>>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress            ## CONSUMES
>>>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchRegionSize         ## CONSUMES
>>>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApLoopMode                       ## CONSUMES
>>>>    gUefiCpuPkgTokenSpaceGuid.PcdCpuApTargetCstate                   ## SOMETIMES_CONSUMES
>>>> -
>>>> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c 
>>>> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>>> index 15dbfa1e7d6c..f7dfbd5bad13 100644
>>>> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>>> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>>> @@ -707,6 +707,7 @@ WakeUpAP (
>>>>    CPU_AP_DATA                      *CpuData;
>>>>    BOOLEAN                          ResetVectorRequired;
>>>>    CPU_INFO_IN_HOB                  *CpuInfoInHob;
>>>> +  UINT32                           KnownProcessorCount;
>>>>  
>>>>    CpuMpData->FinishedCount = 0;
>>>>    ResetVectorRequired = FALSE;
>>>> @@ -745,10 +746,17 @@ WakeUpAP (
>>>>        SendInitSipiSipiAllExcludingSelf ((UINT32) ExchangeInfo->BufferStart);
>>>>      }
>>>>      if (CpuMpData->InitFlag == ApInitConfig) {
>>>> -      //
>>>> -      // Wait for all potential APs waken up in one specified period
>>>> -      //
>>>> -      MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
>>>> +      KnownProcessorCount = PcdGet32 (PcdCpuKnownLogicalProcessorNumber);
>>>> +      if (KnownProcessorCount > 0) {
>>>> +        while (CpuMpData->FinishedCount < (KnownProcessorCount - 1)) {
>>>> +          CpuPause ();
>>>> +        }
>>>> +      } else {
>>>> +        //
>>>> +        // Wait for all potential APs waken up in one specified period
>>>> +        //
>>>> +        MicroSecondDelay (PcdGet32(PcdCpuApInitTimeOutInMicroSeconds));
>>>> +      }
>>>>      } else {
>>>>        //
>>>>        // Wait all APs waken up if this is not the 1st broadcast of 
>>>> SIPI
>>>> --
>>>> 2.9.2
>>>>
>>>>
>>>
>>
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-11-24 21:20 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-22 20:26 [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements Laszlo Ersek
2016-11-22 20:26 ` [PATCH 1/4] UefiCpuPkg/LocalApicLib: fix feature test for Extended Topology CPUID leaf Laszlo Ersek
2016-11-23  5:10   ` Fan, Jeff
2016-11-22 20:26 ` [PATCH 2/4] UefiCpuPkg/MpInitLib: " Laszlo Ersek
2016-11-23  5:10   ` Fan, Jeff
2016-11-22 20:26 ` [PATCH 3/4] UefiCpuPkg/MpInitLib: allow platforms to provide a known CPU count upfront Laszlo Ersek
2016-11-23  5:36   ` Fan, Jeff
2016-11-23 16:04     ` Laszlo Ersek
2016-11-24  1:18       ` Fan, Jeff
2016-11-24  9:36         ` Laszlo Ersek
2016-11-24 13:49           ` Fan, Jeff
2016-11-24 18:32           ` Igor Mammedov
2016-11-24 21:20             ` Laszlo Ersek
2016-11-22 20:26 ` [PATCH 4/4] OvmfPkg/PlatformPei: set PcdCpuKnownLogicalProcessorNumber for MpInitLib Laszlo Ersek
2016-11-23 20:43 ` [PATCH 0/4] UefiCpuPkg, OvmfPkg: multiprocessing fixes and improvements Laszlo Ersek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox