* [Patch v3 1/3] UefiCpuPkg/MpInitLib: Relocate uCode to memory to save time.
2018-07-13 0:47 [Patch v3 0/3] Optimize load uCode performance Eric Dong
@ 2018-07-13 0:47 ` Eric Dong
2018-07-13 0:47 ` [Patch v3 2/3] UefiCpuPkg/MpInitLib: Use BSP uCode for APs if possible Eric Dong
2018-07-13 0:47 ` [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for each core Eric Dong
2 siblings, 0 replies; 6+ messages in thread
From: Eric Dong @ 2018-07-13 0:47 UTC (permalink / raw)
Cc: Laszlo Ersek, Ruiyu Ni
Reading uCode from memory is faster than reading it from flash,
but the BSP must first copy the uCode from flash to memory.
Because the BSP already enables the cache in the SEC phase, this
relocation takes little time. Verification shows that if the
system has more than one processor, loading uCode from memory
reduces the overall load time.
This change enables this optimization.
V3 changes:
Remove the incorrect ASSERT.
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
---
UefiCpuPkg/Library/MpInitLib/MpLib.c | 33 ++++++++++++++++++++++++++++++++-
1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 108eea0a6f..d8b56f149f 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -1520,6 +1520,7 @@ MpInitLibInitialize (
UINTN ApResetVectorSize;
UINTN BackupBufferAddr;
UINTN ApIdtBase;
+ VOID *MicrocodePatchInRam;
OldCpuMpData = GetCpuMpDataFromGuidedHob ();
if (OldCpuMpData == NULL) {
@@ -1587,8 +1588,38 @@ MpInitLibInitialize (
CpuMpData->SwitchBspFlag = FALSE;
CpuMpData->CpuData = (CPU_AP_DATA *) (CpuMpData + 1);
CpuMpData->CpuInfoInHob = (UINT64) (UINTN) (CpuMpData->CpuData + MaxLogicalProcessorNumber);
- CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
CpuMpData->MicrocodePatchRegionSize = PcdGet64 (PcdCpuMicrocodePatchRegionSize);
+ //
+ // If the platform has more than one CPU, relocate the microcode to memory
+ // to reduce microcode loading time.
+ //
+ MicrocodePatchInRam = NULL;
+ if (MaxLogicalProcessorNumber > 1) {
+ MicrocodePatchInRam = AllocatePages (
+ EFI_SIZE_TO_PAGES (
+ (UINTN)CpuMpData->MicrocodePatchRegionSize
+ )
+ );
+ }
+ if (MicrocodePatchInRam == NULL) {
+ //
+ // there is only one processor, or no microcode patch is available, or
+ // memory allocation failed
+ //
+ CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
+ } else {
+ //
+ // there are multiple processors, and a microcode patch is available, and
+ // memory allocation succeeded
+ //
+ CopyMem (
+ MicrocodePatchInRam,
+ (VOID *)(UINTN)PcdGet64 (PcdCpuMicrocodePatchAddress),
+ (UINTN)CpuMpData->MicrocodePatchRegionSize
+ );
+ CpuMpData->MicrocodePatchAddress = (UINTN)MicrocodePatchInRam;
+ }
+
InitializeSpinLock(&CpuMpData->MpLock);
//
--
2.15.0.windows.1
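
For readers who want to try the relocation logic outside firmware, here is a minimal standalone sketch of the same decision: use a RAM copy when more than one processor will read the patch region and the copy can be allocated, otherwise keep reading from the original location. PATCH_REGION and SelectPatchSource are hypothetical names, malloc/memcpy stand in for AllocatePages/CopyMem, and the explicit zero-size check is an assumption of this sketch (the patch itself relies on the allocation returning NULL).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the CPU_MP_DATA fields touched by the patch. */
typedef struct {
  const void *PatchAddress;     /* where loaders read the patch region from */
  size_t      PatchRegionSize;  /* size of the patch region in bytes */
  int         InRam;            /* 1 if PatchAddress points at the RAM copy */
} PATCH_REGION;

/*
 * Choose the source for microcode loading.
 * A RAM copy is used only when several processors will read the region and
 * the copy can be allocated; every other case falls back to the original
 * (flash) address, mirroring the if/else in the patch.
 */
static void
SelectPatchSource (PATCH_REGION *Region, const void *Flash, size_t Size,
                   unsigned ProcessorCount)
{
  void *Copy = NULL;

  assert (Region != NULL);

  /* Assumption of this sketch: skip the allocation for an empty region. */
  if (ProcessorCount > 1 && Size != 0) {
    Copy = malloc (Size);
  }

  Region->PatchRegionSize = Size;
  if (Copy == NULL) {
    /* One processor, no patch data, or allocation failure. */
    Region->PatchAddress = Flash;
    Region->InRam        = 0;
  } else {
    /* Several processors and the copy succeeded. */
    memcpy (Copy, Flash, Size);
    Region->PatchAddress = Copy;
    Region->InRam        = 1;
  }
}
```

Because the fallback leaves PatchAddress pointing at the original region, callers do not need to know which path was taken.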
* [Patch v3 2/3] UefiCpuPkg/MpInitLib: Use BSP uCode for APs if possible.
From: Eric Dong @ 2018-07-13 0:47 UTC (permalink / raw)
Cc: Laszlo Ersek, Ruiyu Ni
Searching for uCode takes a long time. If an AP has the same
processor type as the BSP, the AP can reuse the uCode info saved
by the BSP to get better performance.
This change enables this solution.
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
---
UefiCpuPkg/Library/MpInitLib/Microcode.c | 34 +++++++++++++++++++++++++++++---
UefiCpuPkg/Library/MpInitLib/MpLib.c | 4 ++--
UefiCpuPkg/Library/MpInitLib/MpLib.h | 11 +++++++++--
3 files changed, 42 insertions(+), 7 deletions(-)
diff --git a/UefiCpuPkg/Library/MpInitLib/Microcode.c b/UefiCpuPkg/Library/MpInitLib/Microcode.c
index e47f9f4f8f..351975e2a2 100644
--- a/UefiCpuPkg/Library/MpInitLib/Microcode.c
+++ b/UefiCpuPkg/Library/MpInitLib/Microcode.c
@@ -35,11 +35,13 @@ GetCurrentMicrocodeSignature (
/**
Detect whether specified processor can find matching microcode patch and load it.
- @param[in] CpuMpData The pointer to CPU MP Data structure.
+ @param[in] CpuMpData The pointer to CPU MP Data structure.
+ @param[in] IsBspCallIn Indicate whether the caller is BSP or not.
**/
VOID
MicrocodeDetect (
- IN CPU_MP_DATA *CpuMpData
+ IN CPU_MP_DATA *CpuMpData,
+ IN BOOLEAN IsBspCallIn
)
{
UINT32 ExtendedTableLength;
@@ -58,6 +60,7 @@ MicrocodeDetect (
BOOLEAN CorrectMicrocode;
VOID *MicrocodeData;
MSR_IA32_PLATFORM_ID_REGISTER PlatformIdMsr;
+ UINT32 ProcessorFlags;
if (CpuMpData->MicrocodePatchRegionSize == 0) {
//
@@ -67,7 +70,7 @@ MicrocodeDetect (
}
CurrentRevision = GetCurrentMicrocodeSignature ();
- if (CurrentRevision != 0) {
+ if (CurrentRevision != 0 && !IsBspCallIn) {
//
// Skip loading microcode if it has been loaded successfully
//
@@ -87,6 +90,19 @@ MicrocodeDetect (
PlatformIdMsr.Uint64 = AsmReadMsr64 (MSR_IA32_PLATFORM_ID);
PlatformId = (UINT8) PlatformIdMsr.Bits.PlatformId;
+ //
+ // Check whether the AP has the same processor type as the BSP.
+ // If so, directly use the microcode info saved by the BSP.
+ //
+ if (!IsBspCallIn) {
+ if ((CpuMpData->ProcessorSignature == Eax.Uint32) &&
+ (CpuMpData->ProcessorFlags & (1 << PlatformId)) != 0) {
+ MicrocodeData = (VOID *)(UINTN) CpuMpData->MicrocodeDataAddress;
+ LatestRevision = CpuMpData->MicrocodeRevision;
+ goto Done;
+ }
+ }
+
LatestRevision = 0;
MicrocodeData = NULL;
MicrocodeEnd = (UINTN) (CpuMpData->MicrocodePatchAddress + CpuMpData->MicrocodePatchRegionSize);
@@ -117,6 +133,7 @@ MicrocodeDetect (
}
if (CheckSum32 == 0) {
CorrectMicrocode = TRUE;
+ ProcessorFlags = MicrocodeEntryPoint->ProcessorFlags;
}
} else if ((MicrocodeEntryPoint->DataSize != 0) &&
(MicrocodeEntryPoint->UpdateRevision > LatestRevision)) {
@@ -151,6 +168,7 @@ MicrocodeDetect (
// Find one
//
CorrectMicrocode = TRUE;
+ ProcessorFlags = ExtendedTable->ProcessorFlag;
break;
}
}
@@ -188,6 +206,7 @@ MicrocodeDetect (
MicrocodeEntryPoint = (CPU_MICROCODE_HEADER *) (((UINTN) MicrocodeEntryPoint) + TotalSize);
} while (((UINTN) MicrocodeEntryPoint < MicrocodeEnd));
+Done:
if (LatestRevision > CurrentRevision) {
//
// BIOS only authenticate updates that contain a numerically larger revision
@@ -211,4 +230,13 @@ MicrocodeDetect (
ReleaseSpinLock(&CpuMpData->MpLock);
}
}
+
+ if (IsBspCallIn && (LatestRevision != 0)) {
+ CpuMpData->ProcessorSignature = Eax.Uint32;
+ CpuMpData->ProcessorFlags = ProcessorFlags;
+ CpuMpData->MicrocodeDataAddress = (UINTN) MicrocodeData;
+ CpuMpData->MicrocodeRevision = LatestRevision;
+ DEBUG ((DEBUG_INFO, "BSP Microcode:: signature [0x%08x], ProcessorFlags [0x%08x], \
+ MicroData [0x%08x], Revision [0x%08x]\n", Eax.Uint32, ProcessorFlags, (UINTN) MicrocodeData, LatestRevision));
+ }
}
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index d8b56f149f..722db2a01f 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -410,7 +410,7 @@ ApInitializeSync (
//
// Load microcode on AP
//
- MicrocodeDetect (CpuMpData);
+ MicrocodeDetect (CpuMpData, FALSE);
//
// Sync BSP's MTRR table to AP
//
@@ -1658,7 +1658,7 @@ MpInitLibInitialize (
//
// Load Microcode on BSP
//
- MicrocodeDetect (CpuMpData);
+ MicrocodeDetect (CpuMpData, TRUE);
//
// Store BSP's MTRR setting
//
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h
index 9aedb52636..6958080ac1 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
@@ -245,6 +245,11 @@ struct _CPU_MP_DATA {
BOOLEAN TimerInterruptState;
UINT64 MicrocodePatchAddress;
UINT64 MicrocodePatchRegionSize;
+
+ UINT32 ProcessorSignature;
+ UINT32 ProcessorFlags;
+ UINT64 MicrocodeDataAddress;
+ UINT32 MicrocodeRevision;
};
extern EFI_GUID mCpuInitMpLibHobGuid;
@@ -546,11 +551,13 @@ CheckAndUpdateApsStatus (
/**
Detect whether specified processor can find matching microcode patch and load it.
- @param[in] CpuMpData The pointer to CPU MP Data structure.
+ @param[in] CpuMpData The pointer to CPU MP Data structure.
+ @param[in] IsBspCallIn Indicate whether the caller is BSP or not.
**/
VOID
MicrocodeDetect (
- IN CPU_MP_DATA *CpuMpData
+ IN CPU_MP_DATA *CpuMpData,
+ IN BOOLEAN IsBspCallIn
);
/**
--
2.15.0.windows.1
* [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for each core.
From: Eric Dong @ 2018-07-13 0:47 UTC (permalink / raw)
Cc: Laszlo Ersek, Ruiyu Ni
The SDM requires only one thread per core to load the
microcode.
This change skips the microcode load on every thread that is
not the first thread in its core.
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
---
UefiCpuPkg/Library/MpInitLib/Microcode.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/UefiCpuPkg/Library/MpInitLib/Microcode.c b/UefiCpuPkg/Library/MpInitLib/Microcode.c
index 351975e2a2..122c23469d 100644
--- a/UefiCpuPkg/Library/MpInitLib/Microcode.c
+++ b/UefiCpuPkg/Library/MpInitLib/Microcode.c
@@ -61,6 +61,7 @@ MicrocodeDetect (
VOID *MicrocodeData;
MSR_IA32_PLATFORM_ID_REGISTER PlatformIdMsr;
UINT32 ProcessorFlags;
+ UINT32 ThreadId;
if (CpuMpData->MicrocodePatchRegionSize == 0) {
//
@@ -77,6 +78,14 @@ MicrocodeDetect (
return;
}
+ GetProcessorLocationByApicId (GetInitialApicId (), NULL, NULL, &ThreadId);
+ if (ThreadId != 0) {
+ //
+ // Skip loading microcode if it is not the first thread in one core.
+ //
+ return;
+ }
+
ExtendedTableLength = 0;
//
// Here data of CPUID leafs have not been collected into context buffer, so
--
2.15.0.windows.1
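
The per-core filter in this patch depends on extracting a thread ID from the initial APIC ID. A minimal standalone sketch of that idea follows, assuming the thread ID occupies the low ThreadBits bits of the APIC ID (in firmware the width would come from the CPUID topology leaves, which GetProcessorLocationByApicId handles). ThreadIdFromApicId and ShouldLoadMicrocode are hypothetical helper names, not MpInitLib functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Extract the thread ID from an APIC ID.
 * Assumption of this sketch: the thread ID is the low ThreadBits bits of the
 * APIC ID, and ThreadBits is supplied by the caller.
 */
static uint32_t
ThreadIdFromApicId (uint32_t ApicId, unsigned ThreadBits)
{
  assert (ThreadBits < 32);
  return ApicId & ((UINT32_C (1) << ThreadBits) - 1);
}

/*
 * Only the first thread of each core (thread ID 0) loads microcode;
 * every sibling thread skips the load.
 */
static bool
ShouldLoadMicrocode (uint32_t ApicId, unsigned ThreadBits)
{
  return ThreadIdFromApicId (ApicId, ThreadBits) == 0;
}
```

With ThreadBits set to 0 (no SMT) every processor loads microcode, which matches the behavior before this patch.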