Use the following three rules to optimize uCode loading performance:

1. Let the BSP relocate the uCode from flash to memory for better performance.
2. The BSP caches its CPU ID and the address of its uCode, so an AP doesn’t
   need to search for the uCode again if its CPU ID is the same as the BSP’s.
3. Only apply the uCode in one thread of each core when hyper-threading is
   enabled.

Test:
On a sample platform with 1 socket, 4 cores, and 8 threads, the time spent in
the CpuMpPei driver drops from 108.4 ms to 27.2 ms.

Eric Dong (3):
  UefiCpuPkg/MpInitLib: Relocate uCode to memory to save time.
  UefiCpuPkg/MpInitLib: Use BSP uCode for APs if possible.
  UefiCpuPkg/MpInitLib: Load uCode once for one core.

 UefiCpuPkg/Library/MpInitLib/Microcode.c | 43 +++++++++++++++++++++++++++++---
 UefiCpuPkg/Library/MpInitLib/MpLib.c     | 17 ++++++++++---
 UefiCpuPkg/Library/MpInitLib/MpLib.h     | 11 ++++++--
 3 files changed, 63 insertions(+), 8 deletions(-)

--
2.15.0.windows.1
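
For readers who want a picture of rules 2 and 3, here is a minimal standalone sketch. It is not the code from these patches: UCODE_CACHE, FullMicrocodeSearch, FindMicrocodeForAp, and ShouldLoadMicrocode are hypothetical names made up for illustration, and the sketch assumes rule 1 has already copied the uCode into memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache the BSP fills in after its own uCode search. */
typedef struct {
  uint32_t    ProcessorSignature;  /* CPU ID the BSP matched against */
  const void  *MicrocodeAddress;   /* matching patch, assumed already in memory */
} UCODE_CACHE;

/* Counts how often the slow path runs, for illustration only. */
static int gFullSearchCount = 0;

/* Hypothetical stand-in for the full scan over the uCode region. */
static const void *
FullMicrocodeSearch (uint32_t Signature)
{
  (void)Signature;
  gFullSearchCount++;
  return NULL;  /* sketch only: pretend no patch matched */
}

/* Rule 2: an AP with the same CPU ID as the BSP reuses the cached address. */
static const void *
FindMicrocodeForAp (const UCODE_CACHE *Cache, uint32_t ApSignature)
{
  if ((Cache->MicrocodeAddress != NULL) &&
      (Cache->ProcessorSignature == ApSignature)) {
    return Cache->MicrocodeAddress;
  }
  return FullMicrocodeSearch (ApSignature);
}

/* Rule 3: only the first thread of each core applies the uCode. */
static bool
ShouldLoadMicrocode (uint32_t ThreadIndexInCore)
{
  return ThreadIndexInCore == 0;
}
```

In this sketch an AP whose CPU ID matches the cached one never reaches the full search, and the second thread of a core skips the load entirely.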