* [Patch v3 1/3] UefiCpuPkg/MpInitLib: Relocate uCode to memory to save time.
2018-07-16 3:08 [Patch v3 0/3] Optimize load uCode performance Eric Dong
@ 2018-07-16 3:08 ` Eric Dong
2018-07-17 10:02 ` Ni, Ruiyu
2018-07-16 3:08 ` [Patch v3 2/3] UefiCpuPkg/MpInitLib: Use BSP uCode for APs if possible Eric Dong
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Eric Dong @ 2018-07-16 3:08 UTC (permalink / raw)
To: edk2-devel; +Cc: Laszlo Ersek, Ruiyu Ni
Read uCode from memory has better performance than from flash.
But it needs extra effort to let BSP copy uCode from flash to
memory. Also BSP already enable cache in SEC phase, so it use
less time to relocate uCode from flash to memory. After
verification, if system has more than one processor, it will
reduce some time if load uCode from memory.
This change enable this optimization.
V3 changes:
Remove the ASSERT which is not correct.
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
---
UefiCpuPkg/Library/MpInitLib/MpLib.c | 33 ++++++++++++++++++++++++++++++++-
1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 108eea0a6f..d8b56f149f 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -1520,6 +1520,7 @@ MpInitLibInitialize (
UINTN ApResetVectorSize;
UINTN BackupBufferAddr;
UINTN ApIdtBase;
+ VOID *MicrocodePatchInRam;
OldCpuMpData = GetCpuMpDataFromGuidedHob ();
if (OldCpuMpData == NULL) {
@@ -1587,8 +1588,38 @@ MpInitLibInitialize (
CpuMpData->SwitchBspFlag = FALSE;
CpuMpData->CpuData = (CPU_AP_DATA *) (CpuMpData + 1);
CpuMpData->CpuInfoInHob = (UINT64) (UINTN) (CpuMpData->CpuData + MaxLogicalProcessorNumber);
- CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
CpuMpData->MicrocodePatchRegionSize = PcdGet64 (PcdCpuMicrocodePatchRegionSize);
+ //
+ // If platform has more than one CPU, relocate microcode to memory to reduce
+ // loading microcode time.
+ //
+ MicrocodePatchInRam = NULL;
+ if (MaxLogicalProcessorNumber > 1) {
+ MicrocodePatchInRam = AllocatePages (
+ EFI_SIZE_TO_PAGES (
+ (UINTN)CpuMpData->MicrocodePatchRegionSize
+ )
+ );
+ }
+ if (MicrocodePatchInRam == NULL) {
+ //
+ // there is only one processor, or no microcode patch is available, or
+ // memory allocation failed
+ //
+ CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
+ } else {
+ //
+ // there are multiple processors, and a microcode patch is available, and
+ // memory allocation succeeded
+ //
+ CopyMem (
+ MicrocodePatchInRam,
+ (VOID *)(UINTN)PcdGet64 (PcdCpuMicrocodePatchAddress),
+ (UINTN)CpuMpData->MicrocodePatchRegionSize
+ );
+ CpuMpData->MicrocodePatchAddress = (UINTN)MicrocodePatchInRam;
+ }
+
InitializeSpinLock(&CpuMpData->MpLock);
//
--
2.15.0.windows.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [Patch v3 1/3] UefiCpuPkg/MpInitLib: Relocate uCode to memory to save time.
2018-07-16 3:08 ` [Patch v3 1/3] UefiCpuPkg/MpInitLib: Relocate uCode to memory to save time Eric Dong
@ 2018-07-17 10:02 ` Ni, Ruiyu
0 siblings, 0 replies; 14+ messages in thread
From: Ni, Ruiyu @ 2018-07-17 10:02 UTC (permalink / raw)
To: Eric Dong, edk2-devel; +Cc: Laszlo Ersek
On 7/16/2018 11:08 AM, Eric Dong wrote:
> Read uCode from memory has better performance than from flash.
> But it needs extra effort to let BSP copy uCode from flash to
> memory. Also BSP already enable cache in SEC phase, so it use
> less time to relocate uCode from flash to memory. After
> verification, if system has more than one processor, it will
> reduce some time if load uCode from memory.
>
> This change enable this optimization.
>
> V3 changes:
> Remove the ASSERT which is not correct.
>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Eric Dong <eric.dong@intel.com>
> ---
> UefiCpuPkg/Library/MpInitLib/MpLib.c | 33 ++++++++++++++++++++++++++++++++-
> 1 file changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> index 108eea0a6f..d8b56f149f 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> @@ -1520,6 +1520,7 @@ MpInitLibInitialize (
> UINTN ApResetVectorSize;
> UINTN BackupBufferAddr;
> UINTN ApIdtBase;
> + VOID *MicrocodePatchInRam;
>
> OldCpuMpData = GetCpuMpDataFromGuidedHob ();
> if (OldCpuMpData == NULL) {
> @@ -1587,8 +1588,38 @@ MpInitLibInitialize (
> CpuMpData->SwitchBspFlag = FALSE;
> CpuMpData->CpuData = (CPU_AP_DATA *) (CpuMpData + 1);
> CpuMpData->CpuInfoInHob = (UINT64) (UINTN) (CpuMpData->CpuData + MaxLogicalProcessorNumber);
> - CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
> CpuMpData->MicrocodePatchRegionSize = PcdGet64 (PcdCpuMicrocodePatchRegionSize);
> + //
> + // If platform has more than one CPU, relocate microcode to memory to reduce
> + // loading microcode time.
> + //
> + MicrocodePatchInRam = NULL;
> + if (MaxLogicalProcessorNumber > 1) {
> + MicrocodePatchInRam = AllocatePages (
> + EFI_SIZE_TO_PAGES (
> + (UINTN)CpuMpData->MicrocodePatchRegionSize
> + )
> + );
> + }
> + if (MicrocodePatchInRam == NULL) {
> + //
> + // there is only one processor, or no microcode patch is available, or
> + // memory allocation failed
> + //
> + CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
> + } else {
> + //
> + // there are multiple processors, and a microcode patch is available, and
> + // memory allocation succeeded
> + //
> + CopyMem (
> + MicrocodePatchInRam,
> + (VOID *)(UINTN)PcdGet64 (PcdCpuMicrocodePatchAddress),
> + (UINTN)CpuMpData->MicrocodePatchRegionSize
> + );
> + CpuMpData->MicrocodePatchAddress = (UINTN)MicrocodePatchInRam;
> + }
> +
> InitializeSpinLock(&CpuMpData->MpLock);
>
> //
>
Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>
--
Thanks,
Ray
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Patch v3 2/3] UefiCpuPkg/MpInitLib: Use BSP uCode for APs if possible.
2018-07-16 3:08 [Patch v3 0/3] Optimize load uCode performance Eric Dong
2018-07-16 3:08 ` [Patch v3 1/3] UefiCpuPkg/MpInitLib: Relocate uCode to memory to save time Eric Dong
@ 2018-07-16 3:08 ` Eric Dong
2018-07-17 10:00 ` Ni, Ruiyu
2018-07-16 3:08 ` [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for each core Eric Dong
2018-07-17 16:38 ` [Patch v3 0/3] Optimize load uCode performance Laszlo Ersek
3 siblings, 1 reply; 14+ messages in thread
From: Eric Dong @ 2018-07-16 3:08 UTC (permalink / raw)
To: edk2-devel; +Cc: Laszlo Ersek, Ruiyu Ni
Search uCode costs much time, if AP has same processor type
with BSP, AP can use BSP saved uCode info to get better performance.
This change enables this solution.
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
---
UefiCpuPkg/Library/MpInitLib/Microcode.c | 34 +++++++++++++++++++++++++++++---
UefiCpuPkg/Library/MpInitLib/MpLib.c | 4 ++--
UefiCpuPkg/Library/MpInitLib/MpLib.h | 11 +++++++++--
3 files changed, 42 insertions(+), 7 deletions(-)
diff --git a/UefiCpuPkg/Library/MpInitLib/Microcode.c b/UefiCpuPkg/Library/MpInitLib/Microcode.c
index e47f9f4f8f..351975e2a2 100644
--- a/UefiCpuPkg/Library/MpInitLib/Microcode.c
+++ b/UefiCpuPkg/Library/MpInitLib/Microcode.c
@@ -35,11 +35,13 @@ GetCurrentMicrocodeSignature (
/**
Detect whether specified processor can find matching microcode patch and load it.
- @param[in] CpuMpData The pointer to CPU MP Data structure.
+ @param[in] CpuMpData The pointer to CPU MP Data structure.
+ @param[in] IsBspCallIn Indicate whether the caller is BSP or not.
**/
VOID
MicrocodeDetect (
- IN CPU_MP_DATA *CpuMpData
+ IN CPU_MP_DATA *CpuMpData,
+ IN BOOLEAN IsBspCallIn
)
{
UINT32 ExtendedTableLength;
@@ -58,6 +60,7 @@ MicrocodeDetect (
BOOLEAN CorrectMicrocode;
VOID *MicrocodeData;
MSR_IA32_PLATFORM_ID_REGISTER PlatformIdMsr;
+ UINT32 ProcessorFlags;
if (CpuMpData->MicrocodePatchRegionSize == 0) {
//
@@ -67,7 +70,7 @@ MicrocodeDetect (
}
CurrentRevision = GetCurrentMicrocodeSignature ();
- if (CurrentRevision != 0) {
+ if (CurrentRevision != 0 && !IsBspCallIn) {
//
// Skip loading microcode if it has been loaded successfully
//
@@ -87,6 +90,19 @@ MicrocodeDetect (
PlatformIdMsr.Uint64 = AsmReadMsr64 (MSR_IA32_PLATFORM_ID);
PlatformId = (UINT8) PlatformIdMsr.Bits.PlatformId;
+ //
+ // Check whether AP has same processor with BSP.
+ // If yes, direct use microcode info saved by BSP.
+ //
+ if (!IsBspCallIn) {
+ if ((CpuMpData->ProcessorSignature == Eax.Uint32) &&
+ (CpuMpData->ProcessorFlags & (1 << PlatformId)) != 0) {
+ MicrocodeData = (VOID *)(UINTN) CpuMpData->MicrocodeDataAddress;
+ LatestRevision = CpuMpData->MicrocodeRevision;
+ goto Done;
+ }
+ }
+
LatestRevision = 0;
MicrocodeData = NULL;
MicrocodeEnd = (UINTN) (CpuMpData->MicrocodePatchAddress + CpuMpData->MicrocodePatchRegionSize);
@@ -117,6 +133,7 @@ MicrocodeDetect (
}
if (CheckSum32 == 0) {
CorrectMicrocode = TRUE;
+ ProcessorFlags = MicrocodeEntryPoint->ProcessorFlags;
}
} else if ((MicrocodeEntryPoint->DataSize != 0) &&
(MicrocodeEntryPoint->UpdateRevision > LatestRevision)) {
@@ -151,6 +168,7 @@ MicrocodeDetect (
// Find one
//
CorrectMicrocode = TRUE;
+ ProcessorFlags = ExtendedTable->ProcessorFlag;
break;
}
}
@@ -188,6 +206,7 @@ MicrocodeDetect (
MicrocodeEntryPoint = (CPU_MICROCODE_HEADER *) (((UINTN) MicrocodeEntryPoint) + TotalSize);
} while (((UINTN) MicrocodeEntryPoint < MicrocodeEnd));
+Done:
if (LatestRevision > CurrentRevision) {
//
// BIOS only authenticate updates that contain a numerically larger revision
@@ -211,4 +230,13 @@ MicrocodeDetect (
ReleaseSpinLock(&CpuMpData->MpLock);
}
}
+
+ if (IsBspCallIn && (LatestRevision != 0)) {
+ CpuMpData->ProcessorSignature = Eax.Uint32;
+ CpuMpData->ProcessorFlags = ProcessorFlags;
+ CpuMpData->MicrocodeDataAddress = (UINTN) MicrocodeData;
+ CpuMpData->MicrocodeRevision = LatestRevision;
+ DEBUG ((DEBUG_INFO, "BSP Microcode:: signature [0x%08x], ProcessorFlags [0x%08x], \
+ MicroData [0x%08x], Revision [0x%08x]\n", Eax.Uint32, ProcessorFlags, (UINTN) MicrocodeData, LatestRevision));
+ }
}
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index d8b56f149f..722db2a01f 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -410,7 +410,7 @@ ApInitializeSync (
//
// Load microcode on AP
//
- MicrocodeDetect (CpuMpData);
+ MicrocodeDetect (CpuMpData, FALSE);
//
// Sync BSP's MTRR table to AP
//
@@ -1658,7 +1658,7 @@ MpInitLibInitialize (
//
// Load Microcode on BSP
//
- MicrocodeDetect (CpuMpData);
+ MicrocodeDetect (CpuMpData, TRUE);
//
// Store BSP's MTRR setting
//
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h
index 9aedb52636..6958080ac1 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
@@ -245,6 +245,11 @@ struct _CPU_MP_DATA {
BOOLEAN TimerInterruptState;
UINT64 MicrocodePatchAddress;
UINT64 MicrocodePatchRegionSize;
+
+ UINT32 ProcessorSignature;
+ UINT32 ProcessorFlags;
+ UINT64 MicrocodeDataAddress;
+ UINT32 MicrocodeRevision;
};
extern EFI_GUID mCpuInitMpLibHobGuid;
@@ -546,11 +551,13 @@ CheckAndUpdateApsStatus (
/**
Detect whether specified processor can find matching microcode patch and load it.
- @param[in] CpuMpData The pointer to CPU MP Data structure.
+ @param[in] CpuMpData The pointer to CPU MP Data structure.
+ @param[in] IsBspCallIn Indicate whether the caller is BSP or not.
**/
VOID
MicrocodeDetect (
- IN CPU_MP_DATA *CpuMpData
+ IN CPU_MP_DATA *CpuMpData,
+ IN BOOLEAN IsBspCallIn
);
/**
--
2.15.0.windows.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [Patch v3 2/3] UefiCpuPkg/MpInitLib: Use BSP uCode for APs if possible.
2018-07-16 3:08 ` [Patch v3 2/3] UefiCpuPkg/MpInitLib: Use BSP uCode for APs if possible Eric Dong
@ 2018-07-17 10:00 ` Ni, Ruiyu
0 siblings, 0 replies; 14+ messages in thread
From: Ni, Ruiyu @ 2018-07-17 10:00 UTC (permalink / raw)
To: edk2-devel
On 7/16/2018 11:08 AM, Eric Dong wrote:
> +
> + if (IsBspCallIn && (LatestRevision != 0)) {
> + CpuMpData->ProcessorSignature = Eax.Uint32;
> + CpuMpData->ProcessorFlags = ProcessorFlags;
> + CpuMpData->MicrocodeDataAddress = (UINTN) MicrocodeData;
> + CpuMpData->MicrocodeRevision = LatestRevision;
> + DEBUG ((DEBUG_INFO, "BSP Microcode:: signature [0x%08x], ProcessorFlags [0x%08x], \
> + MicroData [0x%08x], Revision [0x%08x]\n", Eax.Uint32, ProcessorFlags, (UINTN) MicrocodeData, LatestRevision));
> + }
Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>
Can you add more comments for the above logic when check in the patch?
And make the "=" aligned.
--
Thanks,
Ray
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for each core.
2018-07-16 3:08 [Patch v3 0/3] Optimize load uCode performance Eric Dong
2018-07-16 3:08 ` [Patch v3 1/3] UefiCpuPkg/MpInitLib: Relocate uCode to memory to save time Eric Dong
2018-07-16 3:08 ` [Patch v3 2/3] UefiCpuPkg/MpInitLib: Use BSP uCode for APs if possible Eric Dong
@ 2018-07-16 3:08 ` Eric Dong
2018-07-17 10:02 ` Ni, Ruiyu
2018-07-17 16:38 ` [Patch v3 0/3] Optimize load uCode performance Laszlo Ersek
3 siblings, 1 reply; 14+ messages in thread
From: Eric Dong @ 2018-07-16 3:08 UTC (permalink / raw)
To: edk2-devel; +Cc: Laszlo Ersek, Ruiyu Ni
The SDM requires only one thread per core to load the
microcode.
This change enables this solution.
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
---
UefiCpuPkg/Library/MpInitLib/Microcode.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/UefiCpuPkg/Library/MpInitLib/Microcode.c b/UefiCpuPkg/Library/MpInitLib/Microcode.c
index 351975e2a2..122c23469d 100644
--- a/UefiCpuPkg/Library/MpInitLib/Microcode.c
+++ b/UefiCpuPkg/Library/MpInitLib/Microcode.c
@@ -61,6 +61,7 @@ MicrocodeDetect (
VOID *MicrocodeData;
MSR_IA32_PLATFORM_ID_REGISTER PlatformIdMsr;
UINT32 ProcessorFlags;
+ UINT32 ThreadId;
if (CpuMpData->MicrocodePatchRegionSize == 0) {
//
@@ -77,6 +78,14 @@ MicrocodeDetect (
return;
}
+ GetProcessorLocationByApicId (GetInitialApicId (), NULL, NULL, &ThreadId);
+ if (ThreadId != 0) {
+ //
+ // Skip loading microcode if it is not the first thread in one core.
+ //
+ return;
+ }
+
ExtendedTableLength = 0;
//
// Here data of CPUID leafs have not been collected into context buffer, so
--
2.15.0.windows.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for each core.
2018-07-16 3:08 ` [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for each core Eric Dong
@ 2018-07-17 10:02 ` Ni, Ruiyu
2018-07-17 14:07 ` Laszlo Ersek
0 siblings, 1 reply; 14+ messages in thread
From: Ni, Ruiyu @ 2018-07-17 10:02 UTC (permalink / raw)
To: edk2-devel
On 7/16/2018 11:08 AM, Eric Dong wrote:
> GetProcessorLocationByApicId (GetInitialApicId (), NULL, NULL, &ThreadId);
> + if (ThreadId != 0) {
> + //
> + // Skip loading microcode if it is not the first thread in one core.
> + //
> + return;
> + }
> +
Eric,
Is it possible that Thread#0 is disabled while Thread#1 is enabled? It
may cause the certain core with no uCode applied.
--
Thanks,
Ray
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for each core.
2018-07-17 10:02 ` Ni, Ruiyu
@ 2018-07-17 14:07 ` Laszlo Ersek
2018-07-18 0:11 ` Dong, Eric
0 siblings, 1 reply; 14+ messages in thread
From: Laszlo Ersek @ 2018-07-17 14:07 UTC (permalink / raw)
To: Ni, Ruiyu, Eric Dong; +Cc: edk2-devel
On 07/17/18 12:02, Ni, Ruiyu wrote:
> On 7/16/2018 11:08 AM, Eric Dong wrote:
>> GetProcessorLocationByApicId (GetInitialApicId (), NULL, NULL,
>> &ThreadId);
>> + if (ThreadId != 0) {
>> + //
>> + // Skip loading microcode if it is not the first thread in one core.
>> + //
>> + return;
>> + }
>> +
>
> Eric,
> Is it possible that Thread#0 is disabled while Thread#1 is enabled? It
> may cause the certain core with no uCode applied.
>
I thought of this (superficially) but I figured the code would run
anyway at CpuMpPei / CpuDxe startup only, at which point no logical
processors could have been disabled yet -- programmatically, through the
MP PPI / protocol. Is there any other reason why ThreadId=0 could be
missing?
(I plan to check the rest of the series later.)
Thanks
Laszlo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for each core.
2018-07-17 14:07 ` Laszlo Ersek
@ 2018-07-18 0:11 ` Dong, Eric
2018-07-19 4:42 ` Ni, Ruiyu
0 siblings, 1 reply; 14+ messages in thread
From: Dong, Eric @ 2018-07-18 0:11 UTC (permalink / raw)
To: Laszlo Ersek, Ni, Ruiyu; +Cc: edk2-devel@lists.01.org
Hi Laszlo & Ray,
> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Tuesday, July 17, 2018 10:07 PM
> To: Ni, Ruiyu <ruiyu.ni@intel.com>; Dong, Eric <eric.dong@intel.com>
> Cc: edk2-devel@lists.01.org
> Subject: Re: [edk2] [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for
> each core.
>
> On 07/17/18 12:02, Ni, Ruiyu wrote:
> > On 7/16/2018 11:08 AM, Eric Dong wrote:
> >> GetProcessorLocationByApicId (GetInitialApicId (), NULL, NULL,
> >> &ThreadId);
> >> + if (ThreadId != 0) {
> >> + //
> >> + // Skip loading microcode if it is not the first thread in one core.
> >> + //
> >> + return;
> >> + }
> >> +
> >
> > Eric,
> > Is it possible that Thread#0 is disabled while Thread#1 is enabled? It
> > may cause the certain core with no uCode applied.
> >
>
> I thought of this (superficially) but I figured the code would run anyway at
> CpuMpPei / CpuDxe startup only, at which point no logical processors could
> have been disabled yet -- programmatically, through the MP PPI / protocol. Is
> there any other reason why ThreadId=0 could be missing?
>
Yes, I think at this point, all the threads should at the ready state. The only case I can image is that the hardware has broken and the core can't be wake up. But I think in this case, the core should can't work, not only thread 0 can't work. What do you think?
> (I plan to check the rest of the series later.)
>
> Thanks
> Laszlo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for each core.
2018-07-18 0:11 ` Dong, Eric
@ 2018-07-19 4:42 ` Ni, Ruiyu
0 siblings, 0 replies; 14+ messages in thread
From: Ni, Ruiyu @ 2018-07-19 4:42 UTC (permalink / raw)
To: Dong, Eric, Laszlo Ersek; +Cc: edk2-devel@lists.01.org
Thanks/Ray
> -----Original Message-----
> From: Dong, Eric
> Sent: Wednesday, July 18, 2018 8:12 AM
> To: Laszlo Ersek <lersek@redhat.com>; Ni, Ruiyu <ruiyu.ni@intel.com>
> Cc: edk2-devel@lists.01.org
> Subject: RE: [edk2] [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once
> for each core.
>
> Hi Laszlo & Ray,
>
> > -----Original Message-----
> > From: Laszlo Ersek [mailto:lersek@redhat.com]
> > Sent: Tuesday, July 17, 2018 10:07 PM
> > To: Ni, Ruiyu <ruiyu.ni@intel.com>; Dong, Eric <eric.dong@intel.com>
> > Cc: edk2-devel@lists.01.org
> > Subject: Re: [edk2] [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode
> > once for each core.
> >
> > On 07/17/18 12:02, Ni, Ruiyu wrote:
> > > On 7/16/2018 11:08 AM, Eric Dong wrote:
> > >> GetProcessorLocationByApicId (GetInitialApicId (), NULL, NULL,
> > >> &ThreadId);
> > >> + if (ThreadId != 0) {
> > >> + //
> > >> + // Skip loading microcode if it is not the first thread in one core.
> > >> + //
> > >> + return;
> > >> + }
> > >> +
> > >
> > > Eric,
> > > Is it possible that Thread#0 is disabled while Thread#1 is enabled?
> > > It may cause the certain core with no uCode applied.
> > >
> >
> > I thought of this (superficially) but I figured the code would run
> > anyway at CpuMpPei / CpuDxe startup only, at which point no logical
> > processors could have been disabled yet -- programmatically, through
> > the MP PPI / protocol. Is there any other reason why ThreadId=0 could be
> missing?
> >
>
> Yes, I think at this point, all the threads should at the ready state. The only
> case I can image is that the hardware has broken and the core can't be wake
> up. But I think in this case, the core should can't work, not only thread 0 can't
> work. What do you think?
I am not sure. We need to check the IA32 SDM carefully for any potential unexpected cases.
>
> > (I plan to check the rest of the series later.)
> >
> > Thanks
> > Laszlo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Patch v3 0/3] Optimize load uCode performance
2018-07-16 3:08 [Patch v3 0/3] Optimize load uCode performance Eric Dong
` (2 preceding siblings ...)
2018-07-16 3:08 ` [Patch v3 3/3] UefiCpuPkg/MpInitLib: Load uCode once for each core Eric Dong
@ 2018-07-17 16:38 ` Laszlo Ersek
2018-07-17 19:08 ` Laszlo Ersek
2018-07-18 0:18 ` Dong, Eric
3 siblings, 2 replies; 14+ messages in thread
From: Laszlo Ersek @ 2018-07-17 16:38 UTC (permalink / raw)
To: Eric Dong; +Cc: edk2-devel
Hi Eric,
On 07/16/18 05:08, Eric Dong wrote:
> Use below three rules to optimize load uCode performance:
> 1. Let BSP relocate uCode from flash to memory for better performance.
> 2. BSP caches the CPU ID and address of uCode so AP doesn’t need to look
> for the uCode again if the CPU ID is same as BSP’s.
> 3. Only apply uCode in one thread of a core when hyper threading is enabled.
>
> v2 changes:
> Fix potential issue if allocate memory failed.
>
> V3 Changes:
> Remove the ASSERT code which is not correct.
Based on the above, my understanding is that you didn't modify patches
#2 and #3, from v2 to v3. Is that correct?
If it's correct, then you should have picked up my Acked-by tags from
the v2 review, for the v3 posting:
http://mid.mail-archive.com/dcf4df85-1d35-65e9-2c9b-d5d47a0988aa@redhat.com
http://mid.mail-archive.com/dbc8439f-448e-306c-cdbd-1b2edc1f4aef@redhat.com
If you don't pick up my previous review tags for un-modified patches in
new postings of the patch series, then I have to re-review those patches
every single time. I described this workflow element here:
https://github.com/tianocore/tianocore.github.io/wiki/Laszlo's-unkempt-git-guide-for-edk2-contributors-and-maintainers#contrib-26
----------
26. § More frequently though, you will get requests for changes for
*some* of your patches, while *others* of your patches will be
fine, and garner Reviewed-by, Acked-by, and Tested-by tags. What
you need to do in this case is:
* create the next version of your local branch
* pick up the tags that you got on the list
* implement the requested changes
* mark the v2 changes on each patch outside of the commit
message
* push the next version to your personal repo again
* post the next version to the list
In the following steps, we'll go through each of these in more
detail.
----------
Thanks
Laszlo
> Test:
> Use an sample platform which has 1 socket, 4 core, 8 threads, the
> CpuMpPei driver cost time reduce from 108.4ms to 27.2ms
>
>
> Eric Dong (3):
> UefiCpuPkg/MpInitLib: Use BSP uCode for APs if possible.
> UefiCpuPkg/MpInitLib: Load uCode once for each core.
> UefiCpuPkg/MpInitLib: Relocate uCode to memory to save time.
>
> UefiCpuPkg/Library/MpInitLib/Microcode.c | 43 +++++++++++++++++++++++++++++---
> UefiCpuPkg/Library/MpInitLib/MpLib.c | 38 +++++++++++++++++++++++++---
> UefiCpuPkg/Library/MpInitLib/MpLib.h | 11 ++++++--
> 3 files changed, 84 insertions(+), 8 deletions(-)
>
>
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Patch v3 0/3] Optimize load uCode performance
2018-07-17 16:38 ` [Patch v3 0/3] Optimize load uCode performance Laszlo Ersek
@ 2018-07-17 19:08 ` Laszlo Ersek
2018-07-18 0:18 ` Dong, Eric
1 sibling, 0 replies; 14+ messages in thread
From: Laszlo Ersek @ 2018-07-17 19:08 UTC (permalink / raw)
To: Eric Dong; +Cc: edk2-devel
On 07/17/18 18:38, Laszlo Ersek wrote:
> Hi Eric,
>
> On 07/16/18 05:08, Eric Dong wrote:
>> Use below three rules to optimize load uCode performance:
>> 1. Let BSP relocate uCode from flash to memory for better performance.
>> 2. BSP caches the CPU ID and address of uCode so AP doesn’t need to look
>> for the uCode again if the CPU ID is same as BSP’s.
>> 3. Only apply uCode in one thread of a core when hyper threading is enabled.
>>
>> v2 changes:
>> Fix potential issue if allocate memory failed.
>>
>> V3 Changes:
>> Remove the ASSERT code which is not correct.
>
> Based on the above, my understanding is that you didn't modify patches
> #2 and #3, from v2 to v3. Is that correct?
I compared the v2 and v3 patches pair-wise; indeed the ASSERT()'s
removal in patch #1 is the only difference.
> If it's correct, then you should have picked up my Acked-by tags from
> the v2 review, for the v3 posting:
>
> http://mid.mail-archive.com/dcf4df85-1d35-65e9-2c9b-d5d47a0988aa@redhat.com
>
> http://mid.mail-archive.com/dbc8439f-448e-306c-cdbd-1b2edc1f4aef@redhat.com
>
> If you don't pick up my previous review tags for un-modified patches in
> new postings of the patch series, then I have to re-review those patches
> every single time. I described this workflow element here:
>
> https://github.com/tianocore/tianocore.github.io/wiki/Laszlo's-unkempt-git-guide-for-edk2-contributors-and-maintainers#contrib-26
>
> ----------
> 26. § More frequently though, you will get requests for changes for
> *some* of your patches, while *others* of your patches will be
> fine, and garner Reviewed-by, Acked-by, and Tested-by tags. What
> you need to do in this case is:
>
> * create the next version of your local branch
> * pick up the tags that you got on the list
> * implement the requested changes
> * mark the v2 changes on each patch outside of the commit
> message
> * push the next version to your personal repo again
> * post the next version to the list
>
> In the following steps, we'll go through each of these in more
> detail.
> ----------
* For patch #1:
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
* For patches #2 and #3:
Acked-by: Laszlo Ersek <lersek@redhat.com>
Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
Thanks
Laszlo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Patch v3 0/3] Optimize load uCode performance
2018-07-17 16:38 ` [Patch v3 0/3] Optimize load uCode performance Laszlo Ersek
2018-07-17 19:08 ` Laszlo Ersek
@ 2018-07-18 0:18 ` Dong, Eric
1 sibling, 0 replies; 14+ messages in thread
From: Dong, Eric @ 2018-07-18 0:18 UTC (permalink / raw)
To: Laszlo Ersek; +Cc: edk2-devel@lists.01.org
Hi Laszlo,
> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> Laszlo Ersek
> Sent: Wednesday, July 18, 2018 12:38 AM
> To: Dong, Eric <eric.dong@intel.com>
> Cc: edk2-devel@lists.01.org
> Subject: Re: [edk2] [Patch v3 0/3] Optimize load uCode performance
>
> Hi Eric,
>
> On 07/16/18 05:08, Eric Dong wrote:
> > Use below three rules to optimize load uCode performance:
> > 1. Let BSP relocate uCode from flash to memory for better performance.
> > 2. BSP caches the CPU ID and address of uCode so AP doesn’t need to look
> > for the uCode again if the CPU ID is same as BSP’s.
> > 3. Only apply uCode in one thread of a core when hyper threading is
> enabled.
> >
> > v2 changes:
> > Fix potential issue if allocate memory failed.
> >
> > V3 Changes:
> > Remove the ASSERT code which is not correct.
>
> Based on the above, my understanding is that you didn't modify patches
> #2 and #3, from v2 to v3. Is that correct?
>
> If it's correct, then you should have picked up my Acked-by tags from the v2
> review, for the v3 posting:
>
> http://mid.mail-archive.com/dcf4df85-1d35-65e9-2c9b-
> d5d47a0988aa@redhat.com
>
> http://mid.mail-archive.com/dbc8439f-448e-306c-cdbd-
> 1b2edc1f4aef@redhat.com
>
> If you don't pick up my previous review tags for un-modified patches in new
> postings of the patch series, then I have to re-review those patches every
> single time. I described this workflow element here:
>
Got it, will follow this rule later. Thanks.
> https://github.com/tianocore/tianocore.github.io/wiki/Laszlo's-unkempt-git-
> guide-for-edk2-contributors-and-maintainers#contrib-26
>
> ----------
> 26. § More frequently though, you will get requests for changes for
> *some* of your patches, while *others* of your patches will be
> fine, and garner Reviewed-by, Acked-by, and Tested-by tags. What
> you need to do in this case is:
>
> * create the next version of your local branch
> * pick up the tags that you got on the list
> * implement the requested changes
> * mark the v2 changes on each patch outside of the commit
> message
> * push the next version to your personal repo again
> * post the next version to the list
>
> In the following steps, we'll go through each of these in more
> detail.
> ----------
>
> Thanks
> Laszlo
>
> > Test:
> > Use an sample platform which has 1 socket, 4 core, 8 threads, the
> > CpuMpPei driver cost time reduce from 108.4ms to 27.2ms
> >
> >
> > Eric Dong (3):
> > UefiCpuPkg/MpInitLib: Use BSP uCode for APs if possible.
> > UefiCpuPkg/MpInitLib: Load uCode once for each core.
> > UefiCpuPkg/MpInitLib: Relocate uCode to memory to save time.
> >
> > UefiCpuPkg/Library/MpInitLib/Microcode.c | 43
> +++++++++++++++++++++++++++++---
> > UefiCpuPkg/Library/MpInitLib/MpLib.c | 38
> +++++++++++++++++++++++++---
> > UefiCpuPkg/Library/MpInitLib/MpLib.h | 11 ++++++--
> > 3 files changed, 84 insertions(+), 8 deletions(-)
> >
> >
> >
> > _______________________________________________
> > edk2-devel mailing list
> > edk2-devel@lists.01.org
> > https://lists.01.org/mailman/listinfo/edk2-devel
> >
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
^ permalink raw reply [flat|nested] 14+ messages in thread