* [PATCH v3 0/4] Use XADD to avoid lock acquire/release
@ 2021-02-09 14:16 Ni, Ray
2021-02-09 14:16 ` [PATCH v3 1/4] UefiCpuPkg/MpInitLib: " Ni, Ray
` (5 more replies)
0 siblings, 6 replies; 15+ messages in thread
From: Ni, Ray @ 2021-02-09 14:16 UTC (permalink / raw)
To: devel
Patch #1 follows Mike's suggestion to use XADD to avoid lock acquire/release.
Patch #2 follows Laszlo's suggestion to add global NASM macros for NASM struc usage.
Patch #3 simply remves all hardcode offset in NASM without changing any logic.
Patch #4 removes the dead code.
The final code is the same as that of V2.
Ray Ni (4):
UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
MdePkg/Nasm.inc: add macros for C types used in structure definition
UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offset
UefiCpuPkg/MpInitLib: Remove unused Lock from MP_CPU_EXCHANGE_INFO
MdePkg/Include/Ia32/Nasm.inc | 38 ++++++
MdePkg/Include/X64/Nasm.inc | 38 ++++++
UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf | 5 +-
UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc | 43 -------
.../Library/MpInitLib/Ia32/MpFuncs.nasm | 98 +++++++---------
UefiCpuPkg/Library/MpInitLib/MpEqu.inc | 99 ++++++++++++++++
UefiCpuPkg/Library/MpInitLib/MpLib.c | 1 -
UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 +-
UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf | 5 +-
UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc | 45 --------
UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 108 ++++++++----------
11 files changed, 272 insertions(+), 211 deletions(-)
delete mode 100644 UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
create mode 100644 UefiCpuPkg/Library/MpInitLib/MpEqu.inc
delete mode 100644 UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
--
2.27.0.windows.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
2021-02-09 14:16 [PATCH v3 0/4] Use XADD to avoid lock acquire/release Ni, Ray
@ 2021-02-09 14:16 ` Ni, Ray
2021-02-22 9:06 ` Dong, Eric
2021-02-23 18:11 ` [edk2-devel] " Michael D Kinney
2021-02-09 14:16 ` [PATCH v3 2/4] MdePkg/Nasm.inc: add macros for C types used in structure definition Ni, Ray
` (4 subsequent siblings)
5 siblings, 2 replies; 15+ messages in thread
From: Ni, Ray @ 2021-02-09 14:16 UTC (permalink / raw)
To: devel; +Cc: Eric Dong, Laszlo Ersek, Rahul Kumar
When AP firstly wakes up, MpFuncs.nasm contains below logic to assign
an unique ApIndex to each AP according to who comes first:
---ASM---
TestLock:
xchg [edi], eax
cmp eax, NotVacantFlag
jz TestLock
mov ecx, esi
add ecx, ApIndexLocation
inc dword [ecx]
mov ebx, [ecx]
Releaselock:
mov eax, VacantFlag
xchg [edi], eax
---ASM END---
"lock inc" cannot be used to increase ApIndex because not only the
global ApIndex should be increased, but also the result should be
stored to a local general purpose register EBX.
This patch learns from the NASM implementation of
InternalSyncIncrement() to use "XADD" instruction which can increase
the global ApIndex and store the original ApIndex to EBX in one
instruction.
With this patch, OVMF when running in a 255 threads QEMU spends about
one second to wakeup all APs. Original implementation needs more than
10 seconds.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
---
.../Library/MpInitLib/Ia32/MpFuncs.nasm | 20 ++++++-------------
UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 18 ++++++-----------
2 files changed, 12 insertions(+), 26 deletions(-)
diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
index 7e81d24aa6..2eaddc93bc 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
@@ -1,5 +1,5 @@
;------------------------------------------------------------------------------ ;
-; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
+; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
; SPDX-License-Identifier: BSD-2-Clause-Patent
;
; Module Name:
@@ -125,19 +125,11 @@ SkipEnableExecuteDisable:
add edi, LockLocation
mov eax, NotVacantFlag
-TestLock:
- xchg [edi], eax
- cmp eax, NotVacantFlag
- jz TestLock
-
- mov ecx, esi
- add ecx, ApIndexLocation
- inc dword [ecx]
- mov ebx, [ecx]
-
-Releaselock:
- mov eax, VacantFlag
- xchg [edi], eax
+ mov edi, esi
+ add edi, ApIndexLocation
+ mov ebx, 1
+ lock xadd dword [edi], ebx ; EBX = ApIndex++
+ inc ebx ; EBX is CpuNumber
mov edi, esi
add edi, StackSizeLocation
diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
index aecfd07bc0..5b588f2dcb 100644
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
@@ -1,5 +1,5 @@
;------------------------------------------------------------------------------ ;
-; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
+; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
; SPDX-License-Identifier: BSD-2-Clause-Patent
;
; Module Name:
@@ -161,18 +161,12 @@ LongModeStart:
add edi, LockLocation
mov rax, NotVacantFlag
-TestLock:
- xchg qword [edi], rax
- cmp rax, NotVacantFlag
- jz TestLock
-
- lea ecx, [esi + ApIndexLocation]
- inc dword [ecx]
- mov ebx, [ecx]
+ mov edi, esi
+ add edi, ApIndexLocation
+ mov ebx, 1
+ lock xadd dword [edi], ebx ; EBX = ApIndex++
+ inc ebx ; EBX is CpuNumber
-Releaselock:
- mov rax, VacantFlag
- xchg qword [edi], rax
; program stack
mov edi, esi
add edi, StackSizeLocation
--
2.27.0.windows.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 2/4] MdePkg/Nasm.inc: add macros for C types used in structure definition
2021-02-09 14:16 [PATCH v3 0/4] Use XADD to avoid lock acquire/release Ni, Ray
2021-02-09 14:16 ` [PATCH v3 1/4] UefiCpuPkg/MpInitLib: " Ni, Ray
@ 2021-02-09 14:16 ` Ni, Ray
2021-02-18 3:24 ` 回复: " gaoliming
2021-02-09 14:16 ` [PATCH v3 3/4] UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offset Ni, Ray
` (3 subsequent siblings)
5 siblings, 1 reply; 15+ messages in thread
From: Ni, Ray @ 2021-02-09 14:16 UTC (permalink / raw)
To: devel; +Cc: Michael D Kinney, Liming Gao, Zhiguang Liu
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Cc: Liming Gao <gaoliming@byosoft.com.cn>
Cc: Zhiguang Liu <zhiguang.liu@intel.com>
---
MdePkg/Include/Ia32/Nasm.inc | 38 ++++++++++++++++++++++++++++++++++++
MdePkg/Include/X64/Nasm.inc | 38 ++++++++++++++++++++++++++++++++++++
2 files changed, 76 insertions(+)
diff --git a/MdePkg/Include/Ia32/Nasm.inc b/MdePkg/Include/Ia32/Nasm.inc
index 31ce861f1e..017fe5ffd8 100644
--- a/MdePkg/Include/Ia32/Nasm.inc
+++ b/MdePkg/Include/Ia32/Nasm.inc
@@ -20,3 +20,41 @@
%macro INCSSP_EAX 0
DB 0xF3, 0x0F, 0xAE, 0xE8
%endmacro
+
+; NASM provides built-in macros STRUC and ENDSTRUC for structure definition.
+; For example, to define a structure called mytype containing a longword,
+; a word, a byte and a string of bytes, you might code
+;
+; struc mytype
+;
+; mt_long: resd 1
+; mt_word: resw 1
+; mt_byte: resb 1
+; mt_str: resb 32
+;
+; endstruc
+;
+; Below macros are help to map the C types and the RESB family of pseudo-instructions.
+; So that the above structure definition can be coded as
+;
+; struc mytype
+;
+; mt_long: CTYPE_UINT32 1
+; mt_word: CTYPE_UINT16 1
+; mt_byte: CTYPE_UINT8 1
+; mt_str: CTYPE_CHAR8 32
+;
+; endstruc
+%define CTYPE_UINT64 resq
+%define CTYPE_INT64 resq
+%define CTYPE_UINT32 resd
+%define CTYPE_INT32 resd
+%define CTYPE_UINT16 resw
+%define CTYPE_INT16 resw
+%define CTYPE_BOOLEAN resb
+%define CTYPE_UINT8 resb
+%define CTYPE_CHAR8 resb
+%define CTYPE_INT8 resb
+
+%define CTYPE_UINTN resd
+%define CTYPE_INTN resd
diff --git a/MdePkg/Include/X64/Nasm.inc b/MdePkg/Include/X64/Nasm.inc
index 42412735ea..b48d8680bb 100644
--- a/MdePkg/Include/X64/Nasm.inc
+++ b/MdePkg/Include/X64/Nasm.inc
@@ -20,3 +20,41 @@
%macro INCSSP_RAX 0
DB 0xF3, 0x48, 0x0F, 0xAE, 0xE8
%endmacro
+
+; NASM provides built-in macros STRUC and ENDSTRUC for structure definition.
+; For example, to define a structure called mytype containing a longword,
+; a word, a byte and a string of bytes, you might code
+;
+; struc mytype
+;
+; mt_long: resd 1
+; mt_word: resw 1
+; mt_byte: resb 1
+; mt_str: resb 32
+;
+; endstruc
+;
+; Below macros are help to map the C types and the RESB family of pseudo-instructions.
+; So that the above structure definition can be coded as
+;
+; struc mytype
+;
+; mt_long: CTYPE_UINT32 1
+; mt_word: CTYPE_UINT16 1
+; mt_byte: CTYPE_UINT8 1
+; mt_str: CTYPE_CHAR8 32
+;
+; endstruc
+%define CTYPE_UINT64 resq
+%define CTYPE_INT64 resq
+%define CTYPE_UINT32 resd
+%define CTYPE_INT32 resd
+%define CTYPE_UINT16 resw
+%define CTYPE_INT16 resw
+%define CTYPE_BOOLEAN resb
+%define CTYPE_UINT8 resb
+%define CTYPE_CHAR8 resb
+%define CTYPE_INT8 resb
+
+%define CTYPE_UINTN resq
+%define CTYPE_INTN resq
--
2.27.0.windows.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 3/4] UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offset
2021-02-09 14:16 [PATCH v3 0/4] Use XADD to avoid lock acquire/release Ni, Ray
2021-02-09 14:16 ` [PATCH v3 1/4] UefiCpuPkg/MpInitLib: " Ni, Ray
2021-02-09 14:16 ` [PATCH v3 2/4] MdePkg/Nasm.inc: add macros for C types used in structure definition Ni, Ray
@ 2021-02-09 14:16 ` Ni, Ray
2021-02-22 9:06 ` Dong, Eric
2021-02-09 14:16 ` [PATCH v3 4/4] UefiCpuPkg/MpInitLib: Remove unused Lock from MP_CPU_EXCHANGE_INFO Ni, Ray
` (2 subsequent siblings)
5 siblings, 1 reply; 15+ messages in thread
From: Ni, Ray @ 2021-02-09 14:16 UTC (permalink / raw)
To: devel; +Cc: Eric Dong, Laszlo Ersek, Rahul Kumar
In Windows environment, "dumpbin /disasm" is used to verify the
disassembly before and after using NASM struc doesn't change.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
---
UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf | 5 +-
UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc | 43 --------
.../Library/MpInitLib/Ia32/MpFuncs.nasm | 80 +++++++-------
UefiCpuPkg/Library/MpInitLib/MpEqu.inc | 103 ++++++++++++++++++
UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf | 5 +-
UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc | 45 --------
UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 92 ++++++++--------
7 files changed, 193 insertions(+), 180 deletions(-)
delete mode 100644 UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
create mode 100644 UefiCpuPkg/Library/MpInitLib/MpEqu.inc
delete mode 100644 UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
index 1771575c69..860a9750e2 100644
--- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
+++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
@@ -1,7 +1,7 @@
## @file
# MP Initialize Library instance for DXE driver.
#
-# Copyright (c) 2016 - 2020, Intel Corporation. All rights reserved.<BR>
+# Copyright (c) 2016 - 2021, Intel Corporation. All rights reserved.<BR>
# SPDX-License-Identifier: BSD-2-Clause-Patent
#
##
@@ -22,14 +22,13 @@ [Defines]
#
[Sources.IA32]
- Ia32/MpEqu.inc
Ia32/MpFuncs.nasm
[Sources.X64]
- X64/MpEqu.inc
X64/MpFuncs.nasm
[Sources.common]
+ MpEqu.inc
DxeMpLib.c
MpLib.c
MpLib.h
diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
deleted file mode 100644
index 4f5a7c859a..0000000000
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
+++ /dev/null
@@ -1,43 +0,0 @@
-;------------------------------------------------------------------------------ ;
-; Copyright (c) 2015 - 2016, Intel Corporation. All rights reserved.<BR>
-; SPDX-License-Identifier: BSD-2-Clause-Patent
-;
-; Module Name:
-;
-; MpEqu.inc
-;
-; Abstract:
-;
-; This is the equates file for Multiple Processor support
-;
-;-------------------------------------------------------------------------------
-
-VacantFlag equ 00h
-NotVacantFlag equ 0ffh
-
-CPU_SWITCH_STATE_IDLE equ 0
-CPU_SWITCH_STATE_STORED equ 1
-CPU_SWITCH_STATE_LOADED equ 2
-
-LockLocation equ (SwitchToRealProcEnd - RendezvousFunnelProcStart)
-StackStartAddressLocation equ LockLocation + 04h
-StackSizeLocation equ LockLocation + 08h
-ApProcedureLocation equ LockLocation + 0Ch
-GdtrLocation equ LockLocation + 10h
-IdtrLocation equ LockLocation + 16h
-BufferStartLocation equ LockLocation + 1Ch
-ModeOffsetLocation equ LockLocation + 20h
-ApIndexLocation equ LockLocation + 24h
-CodeSegmentLocation equ LockLocation + 28h
-DataSegmentLocation equ LockLocation + 2Ch
-EnableExecuteDisableLocation equ LockLocation + 30h
-Cr3Location equ LockLocation + 34h
-InitFlagLocation equ LockLocation + 38h
-CpuInfoLocation equ LockLocation + 3Ch
-NumApsExecutingLocation equ LockLocation + 40h
-InitializeFloatingPointUnitsAddress equ LockLocation + 48h
-ModeTransitionMemoryLocation equ LockLocation + 4Ch
-ModeTransitionSegmentLocation equ LockLocation + 50h
-ModeHighMemoryLocation equ LockLocation + 52h
-ModeHighSegmentLocation equ LockLocation + 56h
-
diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
index 2eaddc93bc..4363ad9a18 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
@@ -39,21 +39,21 @@ BITS 16
mov fs, ax
mov gs, ax
- mov si, BufferStartLocation
+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (BufferStart)
mov ebx, [si]
- mov si, DataSegmentLocation
+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (DataSegment)
mov edx, [si]
;
; Get start address of 32-bit code in low memory (<1MB)
;
- mov edi, ModeTransitionMemoryLocation
+ mov edi, MP_CPU_EXCHANGE_INFO_FIELD (ModeTransitionMemory)
- mov si, GdtrLocation
+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (GdtrProfile)
o32 lgdt [cs:si]
- mov si, IdtrLocation
+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (IdtrProfile)
o32 lidt [cs:si]
;
@@ -82,7 +82,7 @@ Flat32Start: ; protected mode entry point
mov esi, ebx
mov edi, esi
- add edi, EnableExecuteDisableLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (EnableExecuteDisable)
cmp byte [edi], 0
jz SkipEnableExecuteDisable
@@ -96,7 +96,7 @@ Flat32Start: ; protected mode entry point
wrmsr
mov edi, esi
- add edi, Cr3Location
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (Cr3)
mov eax, dword [edi]
mov cr3, eax
@@ -110,35 +110,35 @@ Flat32Start: ; protected mode entry point
SkipEnableExecuteDisable:
mov edi, esi
- add edi, InitFlagLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (InitFlag)
cmp dword [edi], 1 ; 1 == ApInitConfig
jnz GetApicId
; Increment the number of APs executing here as early as possible
; This is decremented in C code when AP is finished executing
mov edi, esi
- add edi, NumApsExecutingLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (NumApsExecuting)
lock inc dword [edi]
; AP init
mov edi, esi
- add edi, LockLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (Lock)
mov eax, NotVacantFlag
mov edi, esi
- add edi, ApIndexLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (ApIndex)
mov ebx, 1
lock xadd dword [edi], ebx ; EBX = ApIndex++
inc ebx ; EBX is CpuNumber
mov edi, esi
- add edi, StackSizeLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (StackSize)
mov eax, [edi]
mov ecx, ebx
inc ecx
mul ecx ; EAX = StackSize * (CpuNumber + 1)
mov edi, esi
- add edi, StackStartAddressLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (StackStart)
add eax, [edi]
mov esp, eax
jmp CProcedureInvoke
@@ -171,18 +171,18 @@ GetProcessorNumber:
; Note that BSP may become an AP due to SwitchBsp()
;
xor ebx, ebx
- lea eax, [esi + CpuInfoLocation]
+ lea eax, [esi + MP_CPU_EXCHANGE_INFO_FIELD (CpuInfo)]
mov edi, [eax]
GetNextProcNumber:
- cmp [edi], edx ; APIC ID match?
+ cmp dword [edi + CPU_INFO_IN_HOB.InitialApicId], edx ; APIC ID match?
jz ProgramStack
- add edi, 20
+ add edi, CPU_INFO_IN_HOB_size
inc ebx
jmp GetNextProcNumber
ProgramStack:
- mov esp, [edi + 12]
+ mov esp, dword [edi + CPU_INFO_IN_HOB.ApTopOfStack]
CProcedureInvoke:
push ebp ; push BIST data at top of AP stack
@@ -195,11 +195,11 @@ CProcedureInvoke:
push ebx ; Push ApIndex
mov eax, esi
- add eax, LockLocation
+ add eax, MP_CPU_EXCHANGE_INFO_OFFSET
push eax ; push address of exchange info data buffer
mov edi, esi
- add edi, ApProcedureLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (CFunction)
mov eax, [edi]
call eax ; Invoke C function
@@ -262,17 +262,17 @@ ASM_PFX(AsmGetAddressMap):
mov ebp,esp
mov ebx, [ebp + 24h]
- mov dword [ebx], RendezvousFunnelProcStart
- mov dword [ebx + 4h], Flat32Start - RendezvousFunnelProcStart
- mov dword [ebx + 8h], RendezvousFunnelProcEnd - RendezvousFunnelProcStart
- mov dword [ebx + 0Ch], AsmRelocateApLoopStart
- mov dword [ebx + 10h], AsmRelocateApLoopEnd - AsmRelocateApLoopStart
- mov dword [ebx + 14h], Flat32Start - RendezvousFunnelProcStart
- mov dword [ebx + 18h], SwitchToRealProcEnd - SwitchToRealProcStart ; SwitchToRealSize
- mov dword [ebx + 1Ch], SwitchToRealProcStart - RendezvousFunnelProcStart ; SwitchToRealOffset
- mov dword [ebx + 20h], SwitchToRealProcStart - Flat32Start ; SwitchToRealNoNxOffset
- mov dword [ebx + 24h], 0 ; SwitchToRealPM16ModeOffset
- mov dword [ebx + 28h], 0 ; SwitchToRealPM16ModeSize
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.RendezvousFunnelAddress], RendezvousFunnelProcStart
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.ModeEntryOffset], Flat32Start - RendezvousFunnelProcStart
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.RendezvousFunnelSize], RendezvousFunnelProcEnd - RendezvousFunnelProcStart
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.RelocateApLoopFuncAddress], AsmRelocateApLoopStart
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.RelocateApLoopFuncSize], AsmRelocateApLoopEnd - AsmRelocateApLoopStart
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.ModeTransitionOffset], Flat32Start - RendezvousFunnelProcStart
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealSize], SwitchToRealProcEnd - SwitchToRealProcStart
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealOffset], SwitchToRealProcStart - RendezvousFunnelProcStart
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealNoNxOffset], SwitchToRealProcStart - Flat32Start
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealPM16ModeOffset], 0
+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealPM16ModeSize], 0
popad
ret
@@ -302,18 +302,18 @@ ASM_PFX(AsmExchangeRole):
mov eax, cr0
push eax
- sgdt [esi + 8]
- sidt [esi + 14]
+ sgdt [esi + CPU_EXCHANGE_ROLE_INFO.Gdtr]
+ sidt [esi + CPU_EXCHANGE_ROLE_INFO.Idtr]
; Store the its StackPointer
- mov [esi + 4],esp
+ mov [esi + CPU_EXCHANGE_ROLE_INFO.StackPointer],esp
; update its switch state to STORED
- mov byte [esi], CPU_SWITCH_STATE_STORED
+ mov byte [esi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_STORED
WaitForOtherStored:
; wait until the other CPU finish storing its state
- cmp byte [edi], CPU_SWITCH_STATE_STORED
+ cmp byte [edi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_STORED
jz OtherStored
pause
jmp WaitForOtherStored
@@ -321,21 +321,21 @@ WaitForOtherStored:
OtherStored:
; Since another CPU already stored its state, load them
; load GDTR value
- lgdt [edi + 8]
+ lgdt [edi + CPU_EXCHANGE_ROLE_INFO.Gdtr]
; load IDTR value
- lidt [edi + 14]
+ lidt [edi + CPU_EXCHANGE_ROLE_INFO.Idtr]
; load its future StackPointer
- mov esp, [edi + 4]
+ mov esp, [edi + CPU_EXCHANGE_ROLE_INFO.StackPointer]
; update the other CPU's switch state to LOADED
- mov byte [edi], CPU_SWITCH_STATE_LOADED
+ mov byte [edi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_LOADED
WaitForOtherLoaded:
; wait until the other CPU finish loading new state,
; otherwise the data in stack may corrupt
- cmp byte [esi], CPU_SWITCH_STATE_LOADED
+ cmp byte [esi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_LOADED
jz OtherLoaded
pause
jmp WaitForOtherLoaded
diff --git a/UefiCpuPkg/Library/MpInitLib/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/MpEqu.inc
new file mode 100644
index 0000000000..46c2b5c116
--- /dev/null
+++ b/UefiCpuPkg/Library/MpInitLib/MpEqu.inc
@@ -0,0 +1,103 @@
+;------------------------------------------------------------------------------ ;
+; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
+; SPDX-License-Identifier: BSD-2-Clause-Patent
+;
+; Module Name:
+;
+; MpEqu.inc
+;
+; Abstract:
+;
+; This is the equates file for Multiple Processor support
+;
+;-------------------------------------------------------------------------------
+%include "Nasm.inc"
+
+VacantFlag equ 00h
+NotVacantFlag equ 0ffh
+
+CPU_SWITCH_STATE_IDLE equ 0
+CPU_SWITCH_STATE_STORED equ 1
+CPU_SWITCH_STATE_LOADED equ 2
+
+;
+; Equivalent NASM structure of MP_ASSEMBLY_ADDRESS_MAP
+;
+struc MP_ASSEMBLY_ADDRESS_MAP
+ .RendezvousFunnelAddress CTYPE_UINTN 1
+ .ModeEntryOffset CTYPE_UINTN 1
+ .RendezvousFunnelSize CTYPE_UINTN 1
+ .RelocateApLoopFuncAddress CTYPE_UINTN 1
+ .RelocateApLoopFuncSize CTYPE_UINTN 1
+ .ModeTransitionOffset CTYPE_UINTN 1
+ .SwitchToRealSize CTYPE_UINTN 1
+ .SwitchToRealOffset CTYPE_UINTN 1
+ .SwitchToRealNoNxOffset CTYPE_UINTN 1
+ .SwitchToRealPM16ModeOffset CTYPE_UINTN 1
+ .SwitchToRealPM16ModeSize CTYPE_UINTN 1
+endstruc
+
+;
+; Equivalent NASM structure of IA32_DESCRIPTOR
+;
+struc IA32_DESCRIPTOR
+ .Limit CTYPE_UINT16 1
+ .Base CTYPE_UINTN 1
+endstruc
+
+;
+; Equivalent NASM structure of CPU_EXCHANGE_ROLE_INFO
+;
+struc CPU_EXCHANGE_ROLE_INFO
+ ; State is defined as UINT8 in C header file
+ ; Define it as UINTN here to guarantee the fields that follow State
+ ; is naturally aligned. The structure layout doesn't change.
+ .State CTYPE_UINTN 1
+ .StackPointer CTYPE_UINTN 1
+ .Gdtr CTYPE_UINT8 IA32_DESCRIPTOR_size
+ .Idtr CTYPE_UINT8 IA32_DESCRIPTOR_size
+endstruc
+
+;
+; Equivalent NASM structure of CPU_INFO_IN_HOB
+;
+struc CPU_INFO_IN_HOB
+ .InitialApicId CTYPE_UINT32 1
+ .ApicId CTYPE_UINT32 1
+ .Health CTYPE_UINT32 1
+ .ApTopOfStack CTYPE_UINT64 1
+endstruc
+
+;
+; Equivalent NASM structure of MP_CPU_EXCHANGE_INFO
+;
+struc MP_CPU_EXCHANGE_INFO
+ .Lock: CTYPE_UINTN 1
+ .StackStart: CTYPE_UINTN 1
+ .StackSize: CTYPE_UINTN 1
+ .CFunction: CTYPE_UINTN 1
+ .GdtrProfile: CTYPE_UINT8 IA32_DESCRIPTOR_size
+ .IdtrProfile: CTYPE_UINT8 IA32_DESCRIPTOR_size
+ .BufferStart: CTYPE_UINTN 1
+ .ModeOffset: CTYPE_UINTN 1
+ .ApIndex: CTYPE_UINTN 1
+ .CodeSegment: CTYPE_UINTN 1
+ .DataSegment: CTYPE_UINTN 1
+ .EnableExecuteDisable: CTYPE_UINTN 1
+ .Cr3: CTYPE_UINTN 1
+ .InitFlag: CTYPE_UINTN 1
+ .CpuInfo: CTYPE_UINTN 1
+ .NumApsExecuting: CTYPE_UINTN 1
+ .CpuMpData: CTYPE_UINTN 1
+ .InitializeFloatingPointUnits: CTYPE_UINTN 1
+ .ModeTransitionMemory: CTYPE_UINT32 1
+ .ModeTransitionSegment: CTYPE_UINT16 1
+ .ModeHighMemory: CTYPE_UINT32 1
+ .ModeHighSegment: CTYPE_UINT16 1
+ .Enable5LevelPaging: CTYPE_BOOLEAN 1
+ .SevEsIsEnabled: CTYPE_BOOLEAN 1
+ .GhcbBase: CTYPE_UINTN 1
+endstruc
+
+MP_CPU_EXCHANGE_INFO_OFFSET equ (SwitchToRealProcEnd - RendezvousFunnelProcStart)
+%define MP_CPU_EXCHANGE_INFO_FIELD(Field) (MP_CPU_EXCHANGE_INFO_OFFSET + MP_CPU_EXCHANGE_INFO. %+ Field)
diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
index 34abf25d43..49b0ffe8be 100644
--- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
+++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
@@ -1,7 +1,7 @@
## @file
# MP Initialize Library instance for PEI driver.
#
-# Copyright (c) 2016 - 2020, Intel Corporation. All rights reserved.<BR>
+# Copyright (c) 2016 - 2021, Intel Corporation. All rights reserved.<BR>
# SPDX-License-Identifier: BSD-2-Clause-Patent
#
##
@@ -22,14 +22,13 @@ [Defines]
#
[Sources.IA32]
- Ia32/MpEqu.inc
Ia32/MpFuncs.nasm
[Sources.X64]
- X64/MpEqu.inc
X64/MpFuncs.nasm
[Sources.common]
+ MpEqu.inc
PeiMpLib.c
MpLib.c
MpLib.h
diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
deleted file mode 100644
index c92daaaffd..0000000000
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
+++ /dev/null
@@ -1,45 +0,0 @@
-;------------------------------------------------------------------------------ ;
-; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
-; SPDX-License-Identifier: BSD-2-Clause-Patent
-;
-; Module Name:
-;
-; MpEqu.inc
-;
-; Abstract:
-;
-; This is the equates file for Multiple Processor support
-;
-;-------------------------------------------------------------------------------
-
-VacantFlag equ 00h
-NotVacantFlag equ 0ffh
-
-CPU_SWITCH_STATE_IDLE equ 0
-CPU_SWITCH_STATE_STORED equ 1
-CPU_SWITCH_STATE_LOADED equ 2
-
-LockLocation equ (SwitchToRealProcEnd - RendezvousFunnelProcStart)
-StackStartAddressLocation equ LockLocation + 08h
-StackSizeLocation equ LockLocation + 10h
-ApProcedureLocation equ LockLocation + 18h
-GdtrLocation equ LockLocation + 20h
-IdtrLocation equ LockLocation + 2Ah
-BufferStartLocation equ LockLocation + 34h
-ModeOffsetLocation equ LockLocation + 3Ch
-ApIndexLocation equ LockLocation + 44h
-CodeSegmentLocation equ LockLocation + 4Ch
-DataSegmentLocation equ LockLocation + 54h
-EnableExecuteDisableLocation equ LockLocation + 5Ch
-Cr3Location equ LockLocation + 64h
-InitFlagLocation equ LockLocation + 6Ch
-CpuInfoLocation equ LockLocation + 74h
-NumApsExecutingLocation equ LockLocation + 7Ch
-InitializeFloatingPointUnitsAddress equ LockLocation + 8Ch
-ModeTransitionMemoryLocation equ LockLocation + 94h
-ModeTransitionSegmentLocation equ LockLocation + 98h
-ModeHighMemoryLocation equ LockLocation + 9Ah
-ModeHighSegmentLocation equ LockLocation + 9Eh
-Enable5LevelPagingLocation equ LockLocation + 0A0h
-SevEsIsEnabledLocation equ LockLocation + 0A1h
-GhcbBaseLocation equ LockLocation + 0A2h
diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
index 5b588f2dcb..db297f5cca 100644
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
@@ -43,21 +43,21 @@ BITS 16
mov fs, ax
mov gs, ax
- mov si, BufferStartLocation
+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (BufferStart)
mov ebx, [si]
- mov si, DataSegmentLocation
+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (DataSegment)
mov edx, [si]
;
; Get start address of 32-bit code in low memory (<1MB)
;
- mov edi, ModeTransitionMemoryLocation
+ mov edi, MP_CPU_EXCHANGE_INFO_FIELD (ModeTransitionMemory)
- mov si, GdtrLocation
+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (GdtrProfile)
o32 lgdt [cs:si]
- mov si, IdtrLocation
+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (IdtrProfile)
o32 lidt [cs:si]
;
@@ -85,7 +85,7 @@ Flat32Start: ; protected mode entry point
;
; Enable execute disable bit
;
- mov esi, EnableExecuteDisableLocation
+ mov esi, MP_CPU_EXCHANGE_INFO_FIELD (EnableExecuteDisable)
cmp byte [ebx + esi], 0
jz SkipEnableExecuteDisableBit
@@ -101,7 +101,7 @@ SkipEnableExecuteDisableBit:
mov eax, cr4
bts eax, 5
- mov esi, Enable5LevelPagingLocation
+ mov esi, MP_CPU_EXCHANGE_INFO_FIELD (Enable5LevelPaging)
cmp byte [ebx + esi], 0
jz SkipEnable5LevelPaging
@@ -117,7 +117,7 @@ SkipEnable5LevelPaging:
;
; Load page table
;
- mov esi, Cr3Location ; Save CR3 in ecx
+ mov esi, MP_CPU_EXCHANGE_INFO_FIELD (Cr3) ; Save CR3 in ecx
mov ecx, [ebx + esi]
mov cr3, ecx ; Load CR3
@@ -139,47 +139,47 @@ SkipEnable5LevelPaging:
;
; Far jump to 64-bit code
;
- mov edi, ModeHighMemoryLocation
+ mov edi, MP_CPU_EXCHANGE_INFO_FIELD (ModeHighMemory)
add edi, ebx
jmp far [edi]
BITS 64
LongModeStart:
mov esi, ebx
- lea edi, [esi + InitFlagLocation]
+ lea edi, [esi + MP_CPU_EXCHANGE_INFO_FIELD (InitFlag)]
cmp qword [edi], 1 ; ApInitConfig
jnz GetApicId
; Increment the number of APs executing here as early as possible
; This is decremented in C code when AP is finished executing
mov edi, esi
- add edi, NumApsExecutingLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (NumApsExecuting)
lock inc dword [edi]
; AP init
mov edi, esi
- add edi, LockLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (Lock)
mov rax, NotVacantFlag
mov edi, esi
- add edi, ApIndexLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (ApIndex)
mov ebx, 1
lock xadd dword [edi], ebx ; EBX = ApIndex++
inc ebx ; EBX is CpuNumber
; program stack
mov edi, esi
- add edi, StackSizeLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (StackSize)
mov eax, dword [edi]
mov ecx, ebx
inc ecx
mul ecx ; EAX = StackSize * (CpuNumber + 1)
mov edi, esi
- add edi, StackStartAddressLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (StackStart)
add rax, qword [edi]
mov rsp, rax
- lea edi, [esi + SevEsIsEnabledLocation]
+ lea edi, [esi + MP_CPU_EXCHANGE_INFO_FIELD (SevEsIsEnabled)]
cmp byte [edi], 1 ; SevEsIsEnabled
jne CProcedureInvoke
@@ -193,7 +193,7 @@ LongModeStart:
mov ecx, ebx
mul ecx ; EAX = SIZE_4K * 2 * CpuNumber
mov edi, esi
- add edi, GhcbBaseLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (GhcbBase)
add rax, qword [edi]
mov rdx, rax
shr rdx, 32
@@ -202,7 +202,7 @@ LongModeStart:
jmp CProcedureInvoke
GetApicId:
- lea edi, [esi + SevEsIsEnabledLocation]
+ lea edi, [esi + MP_CPU_EXCHANGE_INFO_FIELD (SevEsIsEnabled)]
cmp byte [edi], 1 ; SevEsIsEnabled
jne DoCpuid
@@ -296,18 +296,18 @@ GetProcessorNumber:
; Note that BSP may become an AP due to SwitchBsp()
;
xor ebx, ebx
- lea eax, [esi + CpuInfoLocation]
+ lea eax, [esi + MP_CPU_EXCHANGE_INFO_FIELD (CpuInfo)]
mov rdi, [eax]
GetNextProcNumber:
- cmp dword [rdi], edx ; APIC ID match?
+ cmp dword [rdi + CPU_INFO_IN_HOB.InitialApicId], edx ; APIC ID match?
jz ProgramStack
- add rdi, 20
+ add rdi, CPU_INFO_IN_HOB_size
inc ebx
jmp GetNextProcNumber
ProgramStack:
- mov rsp, qword [rdi + 12]
+ mov rsp, qword [rdi + CPU_INFO_IN_HOB.ApTopOfStack]
CProcedureInvoke:
push rbp ; Push BIST data at top of AP stack
@@ -315,17 +315,17 @@ CProcedureInvoke:
push rbp
mov rbp, rsp
- mov rax, qword [esi + InitializeFloatingPointUnitsAddress]
+ mov rax, qword [esi + MP_CPU_EXCHANGE_INFO_FIELD (InitializeFloatingPointUnits)]
sub rsp, 20h
call rax ; Call assembly function to initialize FPU per UEFI spec
add rsp, 20h
mov edx, ebx ; edx is ApIndex
mov ecx, esi
- add ecx, LockLocation ; rcx is address of exchange info data buffer
+ add ecx, MP_CPU_EXCHANGE_INFO_OFFSET ; rcx is address of exchange info data buffer
mov edi, esi
- add edi, ApProcedureLocation
+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (CFunction)
mov rax, qword [edi]
sub rsp, 20h
@@ -661,18 +661,18 @@ AsmRelocateApLoopEnd:
global ASM_PFX(AsmGetAddressMap)
ASM_PFX(AsmGetAddressMap):
lea rax, [ASM_PFX(RendezvousFunnelProc)]
- mov qword [rcx], rax
- mov qword [rcx + 8h], LongModeStart - RendezvousFunnelProcStart
- mov qword [rcx + 10h], RendezvousFunnelProcEnd - RendezvousFunnelProcStart
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.RendezvousFunnelAddress], rax
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.ModeEntryOffset], LongModeStart - RendezvousFunnelProcStart
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.RendezvousFunnelSize], RendezvousFunnelProcEnd - RendezvousFunnelProcStart
lea rax, [ASM_PFX(AsmRelocateApLoop)]
- mov qword [rcx + 18h], rax
- mov qword [rcx + 20h], AsmRelocateApLoopEnd - AsmRelocateApLoopStart
- mov qword [rcx + 28h], Flat32Start - RendezvousFunnelProcStart
- mov qword [rcx + 30h], SwitchToRealProcEnd - SwitchToRealProcStart ; SwitchToRealSize
- mov qword [rcx + 38h], SwitchToRealProcStart - RendezvousFunnelProcStart ; SwitchToRealOffset
- mov qword [rcx + 40h], SwitchToRealProcStart - Flat32Start ; SwitchToRealNoNxOffset
- mov qword [rcx + 48h], PM16Mode - RendezvousFunnelProcStart ; SwitchToRealPM16ModeOffset
- mov qword [rcx + 50h], SwitchToRealProcEnd - PM16Mode ; SwitchToRealPM16ModeSize
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.RelocateApLoopFuncAddress], rax
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.RelocateApLoopFuncSize], AsmRelocateApLoopEnd - AsmRelocateApLoopStart
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.ModeTransitionOffset], Flat32Start - RendezvousFunnelProcStart
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealSize], SwitchToRealProcEnd - SwitchToRealProcStart
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealOffset], SwitchToRealProcStart - RendezvousFunnelProcStart
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealNoNxOffset], SwitchToRealProcStart - Flat32Start
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealPM16ModeOffset], PM16Mode - RendezvousFunnelProcStart
+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealPM16ModeSize], SwitchToRealProcEnd - PM16Mode
ret
;-------------------------------------------------------------------------------------
@@ -715,18 +715,18 @@ ASM_PFX(AsmExchangeRole):
;Store EFLAGS, GDTR and IDTR regiter to stack
pushfq
- sgdt [rsi + 16]
- sidt [rsi + 26]
+ sgdt [rsi + CPU_EXCHANGE_ROLE_INFO.Gdtr]
+ sidt [rsi + CPU_EXCHANGE_ROLE_INFO.Idtr]
; Store the its StackPointer
- mov [rsi + 8], rsp
+ mov [rsi + CPU_EXCHANGE_ROLE_INFO.StackPointer], rsp
; update its switch state to STORED
- mov byte [rsi], CPU_SWITCH_STATE_STORED
+ mov byte [rsi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_STORED
WaitForOtherStored:
; wait until the other CPU finish storing its state
- cmp byte [rdi], CPU_SWITCH_STATE_STORED
+ cmp byte [rdi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_STORED
jz OtherStored
pause
jmp WaitForOtherStored
@@ -734,21 +734,21 @@ WaitForOtherStored:
OtherStored:
; Since another CPU already stored its state, load them
; load GDTR value
- lgdt [rdi + 16]
+ lgdt [rdi + CPU_EXCHANGE_ROLE_INFO.Gdtr]
; load IDTR value
- lidt [rdi + 26]
+ lidt [rdi + CPU_EXCHANGE_ROLE_INFO.Idtr]
; load its future StackPointer
- mov rsp, [rdi + 8]
+ mov rsp, [rdi + CPU_EXCHANGE_ROLE_INFO.StackPointer]
; update the other CPU's switch state to LOADED
- mov byte [rdi], CPU_SWITCH_STATE_LOADED
+ mov byte [rdi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_LOADED
WaitForOtherLoaded:
; wait until the other CPU finish loading new state,
; otherwise the data in stack may corrupt
- cmp byte [rsi], CPU_SWITCH_STATE_LOADED
+ cmp byte [rsi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_LOADED
jz OtherLoaded
pause
jmp WaitForOtherLoaded
--
2.27.0.windows.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 4/4] UefiCpuPkg/MpInitLib: Remove unused Lock from MP_CPU_EXCHANGE_INFO
2021-02-09 14:16 [PATCH v3 0/4] Use XADD to avoid lock acquire/release Ni, Ray
` (2 preceding siblings ...)
2021-02-09 14:16 ` [PATCH v3 3/4] UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offset Ni, Ray
@ 2021-02-09 14:16 ` Ni, Ray
2021-02-22 9:07 ` Dong, Eric
[not found] ` <166219FF4C25D9C5.16853@groups.io>
2021-02-25 19:03 ` [edk2-devel] [PATCH v3 0/4] " Laszlo Ersek
5 siblings, 1 reply; 15+ messages in thread
From: Ni, Ray @ 2021-02-09 14:16 UTC (permalink / raw)
To: devel; +Cc: Eric Dong, Laszlo Ersek, Rahul Kumar
The Lock is no longer needed since "LOCK XADD" was used in
MpFuncs.nasm for ApIndex atomic increment.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
---
UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 4 ----
UefiCpuPkg/Library/MpInitLib/MpEqu.inc | 4 ----
UefiCpuPkg/Library/MpInitLib/MpLib.c | 1 -
UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 +--
UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 4 ----
5 files changed, 1 insertion(+), 15 deletions(-)
diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
index 4363ad9a18..7bd2415670 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
@@ -121,10 +121,6 @@ SkipEnableExecuteDisable:
lock inc dword [edi]
; AP init
- mov edi, esi
- add edi, MP_CPU_EXCHANGE_INFO_FIELD (Lock)
- mov eax, NotVacantFlag
-
mov edi, esi
add edi, MP_CPU_EXCHANGE_INFO_FIELD (ApIndex)
mov ebx, 1
diff --git a/UefiCpuPkg/Library/MpInitLib/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/MpEqu.inc
index 46c2b5c116..2e9368a374 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpEqu.inc
+++ b/UefiCpuPkg/Library/MpInitLib/MpEqu.inc
@@ -13,9 +13,6 @@
;-------------------------------------------------------------------------------
%include "Nasm.inc"
-VacantFlag equ 00h
-NotVacantFlag equ 0ffh
-
CPU_SWITCH_STATE_IDLE equ 0
CPU_SWITCH_STATE_STORED equ 1
CPU_SWITCH_STATE_LOADED equ 2
@@ -72,7 +69,6 @@ endstruc
; Equivalent NASM structure of MP_CPU_EXCHANGE_INFO
;
struc MP_CPU_EXCHANGE_INFO
- .Lock: CTYPE_UINTN 1
.StackStart: CTYPE_UINTN 1
.StackSize: CTYPE_UINTN 1
.CFunction: CTYPE_UINTN 1
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 2568986d8c..5040053dad 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -1006,7 +1006,6 @@ FillExchangeInfoData (
IA32_CR4 Cr4;
ExchangeInfo = CpuMpData->MpCpuExchangeInfo;
- ExchangeInfo->Lock = 0;
ExchangeInfo->StackStart = CpuMpData->Buffer;
ExchangeInfo->StackSize = CpuMpData->CpuApStackSize;
ExchangeInfo->BufferStart = CpuMpData->WakeupBuffer;
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h
index 02652eaae1..0bd60388b1 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
@@ -1,7 +1,7 @@
/** @file
Common header file for MP Initialize Library.
- Copyright (c) 2016 - 2020, Intel Corporation. All rights reserved.<BR>
+ Copyright (c) 2016 - 2021, Intel Corporation. All rights reserved.<BR>
Copyright (c) 2020, AMD Inc. All rights reserved.<BR>
SPDX-License-Identifier: BSD-2-Clause-Patent
@@ -190,7 +190,6 @@ typedef struct _CPU_MP_DATA CPU_MP_DATA;
// into this structure are used in assembly code in this module
//
typedef struct {
- UINTN Lock;
UINTN StackStart;
UINTN StackSize;
UINTN CFunction;
diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
index db297f5cca..50df802d1f 100644
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
@@ -157,10 +157,6 @@ LongModeStart:
lock inc dword [edi]
; AP init
- mov edi, esi
- add edi, MP_CPU_EXCHANGE_INFO_FIELD (Lock)
- mov rax, NotVacantFlag
-
mov edi, esi
add edi, MP_CPU_EXCHANGE_INFO_FIELD (ApIndex)
mov ebx, 1
--
2.27.0.windows.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* 回复: [PATCH v3 2/4] MdePkg/Nasm.inc: add macros for C types used in structure definition
2021-02-09 14:16 ` [PATCH v3 2/4] MdePkg/Nasm.inc: add macros for C types used in structure definition Ni, Ray
@ 2021-02-18 3:24 ` gaoliming
0 siblings, 0 replies; 15+ messages in thread
From: gaoliming @ 2021-02-18 3:24 UTC (permalink / raw)
To: 'Ray Ni', devel
Cc: 'Michael D Kinney', 'Zhiguang Liu'
Reviewed-by: Liming Gao <gaoliming@byosoft.com.cn>
> -----邮件原件-----
> 发件人: Ray Ni <ray.ni@intel.com>
> 发送时间: 2021年2月9日 22:17
> 收件人: devel@edk2.groups.io
> 抄送: Michael D Kinney <michael.d.kinney@intel.com>; Liming Gao
> <gaoliming@byosoft.com.cn>; Zhiguang Liu <zhiguang.liu@intel.com>
> 主题: [PATCH v3 2/4] MdePkg/Nasm.inc: add macros for C types used in
> structure definition
>
> Signed-off-by: Ray Ni <ray.ni@intel.com>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Cc: Liming Gao <gaoliming@byosoft.com.cn>
> Cc: Zhiguang Liu <zhiguang.liu@intel.com>
> ---
> MdePkg/Include/Ia32/Nasm.inc | 38
> ++++++++++++++++++++++++++++++++++++
> MdePkg/Include/X64/Nasm.inc | 38
> ++++++++++++++++++++++++++++++++++++
> 2 files changed, 76 insertions(+)
>
> diff --git a/MdePkg/Include/Ia32/Nasm.inc
> b/MdePkg/Include/Ia32/Nasm.inc
> index 31ce861f1e..017fe5ffd8 100644
> --- a/MdePkg/Include/Ia32/Nasm.inc
> +++ b/MdePkg/Include/Ia32/Nasm.inc
> @@ -20,3 +20,41 @@
> %macro INCSSP_EAX 0
>
> DB 0xF3, 0x0F, 0xAE, 0xE8
>
> %endmacro
>
> +
>
> +; NASM provides built-in macros STRUC and ENDSTRUC for structure
> definition.
>
> +; For example, to define a structure called mytype containing a longword,
>
> +; a word, a byte and a string of bytes, you might code
>
> +;
>
> +; struc mytype
>
> +;
>
> +; mt_long: resd 1
>
> +; mt_word: resw 1
>
> +; mt_byte: resb 1
>
> +; mt_str: resb 32
>
> +;
>
> +; endstruc
>
> +;
>
> +; Below macros are help to map the C types and the RESB family of
> pseudo-instructions.
>
> +; So that the above structure definition can be coded as
>
> +;
>
> +; struc mytype
>
> +;
>
> +; mt_long: CTYPE_UINT32 1
>
> +; mt_word: CTYPE_UINT16 1
>
> +; mt_byte: CTYPE_UINT8 1
>
> +; mt_str: CTYPE_CHAR8 32
>
> +;
>
> +; endstruc
>
> +%define CTYPE_UINT64 resq
>
> +%define CTYPE_INT64 resq
>
> +%define CTYPE_UINT32 resd
>
> +%define CTYPE_INT32 resd
>
> +%define CTYPE_UINT16 resw
>
> +%define CTYPE_INT16 resw
>
> +%define CTYPE_BOOLEAN resb
>
> +%define CTYPE_UINT8 resb
>
> +%define CTYPE_CHAR8 resb
>
> +%define CTYPE_INT8 resb
>
> +
>
> +%define CTYPE_UINTN resd
>
> +%define CTYPE_INTN resd
>
> diff --git a/MdePkg/Include/X64/Nasm.inc b/MdePkg/Include/X64/Nasm.inc
> index 42412735ea..b48d8680bb 100644
> --- a/MdePkg/Include/X64/Nasm.inc
> +++ b/MdePkg/Include/X64/Nasm.inc
> @@ -20,3 +20,41 @@
> %macro INCSSP_RAX 0
>
> DB 0xF3, 0x48, 0x0F, 0xAE, 0xE8
>
> %endmacro
>
> +
>
> +; NASM provides built-in macros STRUC and ENDSTRUC for structure
> definition.
>
> +; For example, to define a structure called mytype containing a longword,
>
> +; a word, a byte and a string of bytes, you might code
>
> +;
>
> +; struc mytype
>
> +;
>
> +; mt_long: resd 1
>
> +; mt_word: resw 1
>
> +; mt_byte: resb 1
>
> +; mt_str: resb 32
>
> +;
>
> +; endstruc
>
> +;
>
> +; Below macros are help to map the C types and the RESB family of
> pseudo-instructions.
>
> +; So that the above structure definition can be coded as
>
> +;
>
> +; struc mytype
>
> +;
>
> +; mt_long: CTYPE_UINT32 1
>
> +; mt_word: CTYPE_UINT16 1
>
> +; mt_byte: CTYPE_UINT8 1
>
> +; mt_str: CTYPE_CHAR8 32
>
> +;
>
> +; endstruc
>
> +%define CTYPE_UINT64 resq
>
> +%define CTYPE_INT64 resq
>
> +%define CTYPE_UINT32 resd
>
> +%define CTYPE_INT32 resd
>
> +%define CTYPE_UINT16 resw
>
> +%define CTYPE_INT16 resw
>
> +%define CTYPE_BOOLEAN resb
>
> +%define CTYPE_UINT8 resb
>
> +%define CTYPE_CHAR8 resb
>
> +%define CTYPE_INT8 resb
>
> +
>
> +%define CTYPE_UINTN resq
>
> +%define CTYPE_INTN resq
>
> --
> 2.27.0.windows.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
2021-02-09 14:16 ` [PATCH v3 1/4] UefiCpuPkg/MpInitLib: " Ni, Ray
@ 2021-02-22 9:06 ` Dong, Eric
2021-02-23 18:11 ` [edk2-devel] " Michael D Kinney
1 sibling, 0 replies; 15+ messages in thread
From: Dong, Eric @ 2021-02-22 9:06 UTC (permalink / raw)
To: Ni, Ray, devel@edk2.groups.io; +Cc: Laszlo Ersek, Kumar, Rahul1
Reviewed-by: Eric Dong <eric.dong@intel.com>
-----Original Message-----
From: Ni, Ray <ray.ni@intel.com>
Sent: Tuesday, February 9, 2021 10:17 PM
To: devel@edk2.groups.io
Cc: Dong, Eric <eric.dong@intel.com>; Laszlo Ersek <lersek@redhat.com>; Kumar, Rahul1 <rahul1.kumar@intel.com>
Subject: [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
When AP firstly wakes up, MpFuncs.nasm contains below logic to assign an unique ApIndex to each AP according to who comes first:
---ASM---
TestLock:
xchg [edi], eax
cmp eax, NotVacantFlag
jz TestLock
mov ecx, esi
add ecx, ApIndexLocation
inc dword [ecx]
mov ebx, [ecx]
Releaselock:
mov eax, VacantFlag
xchg [edi], eax
---ASM END---
"lock inc" cannot be used to increase ApIndex because not only the global ApIndex should be increased, but also the result should be stored to a local general purpose register EBX.
This patch learns from the NASM implementation of
InternalSyncIncrement() to use "XADD" instruction which can increase the global ApIndex and store the original ApIndex to EBX in one instruction.
With this patch, OVMF when running in a 255 threads QEMU spends about one second to wakeup all APs. Original implementation needs more than
10 seconds.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
---
.../Library/MpInitLib/Ia32/MpFuncs.nasm | 20 ++++++-------------
UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 18 ++++++-----------
2 files changed, 12 insertions(+), 26 deletions(-)
diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
index 7e81d24aa6..2eaddc93bc 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
@@ -1,5 +1,5 @@
;------------------------------------------------------------------------------ ;-; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>+; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR> ; SPDX-License-Identifier: BSD-2-Clause-Patent ; ; Module Name:@@ -125,19 +125,11 @@ SkipEnableExecuteDisable:
add edi, LockLocation mov eax, NotVacantFlag -TestLock:- xchg [edi], eax- cmp eax, NotVacantFlag- jz TestLock-- mov ecx, esi- add ecx, ApIndexLocation- inc dword [ecx]- mov ebx, [ecx]--Releaselock:- mov eax, VacantFlag- xchg [edi], eax+ mov edi, esi+ add edi, ApIndexLocation+ mov ebx, 1+ lock xadd dword [edi], ebx ; EBX = ApIndex+++ inc ebx ; EBX is CpuNumber mov edi, esi add edi, StackSizeLocationdiff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
index aecfd07bc0..5b588f2dcb 100644
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
@@ -1,5 +1,5 @@
;------------------------------------------------------------------------------ ;-; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>+; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR> ; SPDX-License-Identifier: BSD-2-Clause-Patent ; ; Module Name:@@ -161,18 +161,12 @@ LongModeStart:
add edi, LockLocation mov rax, NotVacantFlag -TestLock:- xchg qword [edi], rax- cmp rax, NotVacantFlag- jz TestLock-- lea ecx, [esi + ApIndexLocation]- inc dword [ecx]- mov ebx, [ecx]+ mov edi, esi+ add edi, ApIndexLocation+ mov ebx, 1+ lock xadd dword [edi], ebx ; EBX = ApIndex+++ inc ebx ; EBX is CpuNumber -Releaselock:- mov rax, VacantFlag- xchg qword [edi], rax ; program stack mov edi, esi add edi, StackSizeLocation--
2.27.0.windows.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v3 3/4] UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offset
2021-02-09 14:16 ` [PATCH v3 3/4] UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offset Ni, Ray
@ 2021-02-22 9:06 ` Dong, Eric
0 siblings, 0 replies; 15+ messages in thread
From: Dong, Eric @ 2021-02-22 9:06 UTC (permalink / raw)
To: Ni, Ray, devel@edk2.groups.io; +Cc: Laszlo Ersek, Kumar, Rahul1
Reviewed-by: Eric Dong <eric.dong@intel.com>
-----Original Message-----
From: Ni, Ray <ray.ni@intel.com>
Sent: Tuesday, February 9, 2021 10:17 PM
To: devel@edk2.groups.io
Cc: Dong, Eric <eric.dong@intel.com>; Laszlo Ersek <lersek@redhat.com>; Kumar, Rahul1 <rahul1.kumar@intel.com>
Subject: [PATCH v3 3/4] UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offset
In Windows environment, "dumpbin /disasm" is used to verify the disassembly before and after using NASM struc doesn't change.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
---
UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf | 5 +-
UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc | 43 --------
.../Library/MpInitLib/Ia32/MpFuncs.nasm | 80 +++++++-------
UefiCpuPkg/Library/MpInitLib/MpEqu.inc | 103 ++++++++++++++++++
UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf | 5 +-
UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc | 45 --------
UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 92 ++++++++--------
7 files changed, 193 insertions(+), 180 deletions(-) delete mode 100644 UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
create mode 100644 UefiCpuPkg/Library/MpInitLib/MpEqu.inc
delete mode 100644 UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
index 1771575c69..860a9750e2 100644
--- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
+++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
@@ -1,7 +1,7 @@
## @file # MP Initialize Library instance for DXE driver. #-# Copyright (c) 2016 - 2020, Intel Corporation. All rights reserved.<BR>+# Copyright (c) 2016 - 2021, Intel Corporation. All rights reserved.<BR> # SPDX-License-Identifier: BSD-2-Clause-Patent # ##@@ -22,14 +22,13 @@ [Defines]
# [Sources.IA32]- Ia32/MpEqu.inc Ia32/MpFuncs.nasm [Sources.X64]- X64/MpEqu.inc X64/MpFuncs.nasm [Sources.common]+ MpEqu.inc DxeMpLib.c MpLib.c MpLib.hdiff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
deleted file mode 100644
index 4f5a7c859a..0000000000
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
+++ /dev/null
@@ -1,43 +0,0 @@
-;------------------------------------------------------------------------------ ;-; Copyright (c) 2015 - 2016, Intel Corporation. All rights reserved.<BR>-; SPDX-License-Identifier: BSD-2-Clause-Patent-;-; Module Name:-;-; MpEqu.inc-;-; Abstract:-;-; This is the equates file for Multiple Processor support-;-;---------------------------------------------------------------------------------VacantFlag equ 00h-NotVacantFlag equ 0ffh--CPU_SWITCH_STATE_IDLE equ 0-CPU_SWITCH_STATE_STORED equ 1-CPU_SWITCH_STATE_LOADED equ 2--LockLocation equ (SwitchToRealProcEnd - RendezvousFunnelProcStart)-StackStartAddressLocation equ LockLocation + 04h-StackSizeLocation equ LockLocation + 08h-ApProcedureLocation equ LockLocation + 0Ch-GdtrLocation equ LockLocation + 10h-IdtrLocation equ LockLocation + 16h-BufferStartLocation equ LockLocation + 1Ch-ModeOffsetLocation equ LockLocation + 20h-ApIndexLocation equ LockLocation + 24h-CodeSegmentLocation equ LockLocation + 28h-DataSegmentLocation equ LockLocation + 2Ch-EnableExecuteDisableLocation equ LockLocation + 30h-Cr3Location equ LockLocation + 34h-InitFlagLocation equ LockLocation + 38h-CpuInfoLocation equ LockLocation + 3Ch-NumApsExecutingLocation equ LockLocation + 40h-InitializeFloatingPointUnitsAddress equ LockLocation + 48h-ModeTransitionMemoryLocation equ LockLocation + 4Ch-ModeTransitionSegmentLocation equ LockLocation + 50h-ModeHighMemoryLocation equ LockLocation + 52h-ModeHighSegmentLocation equ LockLocation + 56h-diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
index 2eaddc93bc..4363ad9a18 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
@@ -39,21 +39,21 @@ BITS 16
mov fs, ax mov gs, ax - mov si, BufferStartLocation+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (BufferStart) mov ebx, [si] - mov si, DataSegmentLocation+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (DataSegment) mov edx, [si] ; ; Get start address of 32-bit code in low memory (<1MB) ;- mov edi, ModeTransitionMemoryLocation+ mov edi, MP_CPU_EXCHANGE_INFO_FIELD (ModeTransitionMemory) - mov si, GdtrLocation+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (GdtrProfile) o32 lgdt [cs:si] - mov si, IdtrLocation+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (IdtrProfile) o32 lidt [cs:si] ;@@ -82,7 +82,7 @@ Flat32Start: ; protected mode entry point
mov esi, ebx mov edi, esi- add edi, EnableExecuteDisableLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (EnableExecuteDisable) cmp byte [edi], 0 jz SkipEnableExecuteDisable @@ -96,7 +96,7 @@ Flat32Start: ; protected mode entry point
wrmsr mov edi, esi- add edi, Cr3Location+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (Cr3) mov eax, dword [edi] mov cr3, eax @@ -110,35 +110,35 @@ Flat32Start: ; protected mode entry point
SkipEnableExecuteDisable: mov edi, esi- add edi, InitFlagLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (InitFlag) cmp dword [edi], 1 ; 1 == ApInitConfig jnz GetApicId ; Increment the number of APs executing here as early as possible ; This is decremented in C code when AP is finished executing mov edi, esi- add edi, NumApsExecutingLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (NumApsExecuting) lock inc dword [edi] ; AP init mov edi, esi- add edi, LockLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (Lock) mov eax, NotVacantFlag mov edi, esi- add edi, ApIndexLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (ApIndex) mov ebx, 1 lock xadd dword [edi], ebx ; EBX = ApIndex++ inc ebx ; EBX is CpuNumber mov edi, esi- add edi, StackSizeLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (StackSize) mov eax, [edi] mov ecx, ebx inc ecx mul ecx ; EAX = StackSize * (CpuNumber + 1) mov edi, esi- add edi, StackStartAddressLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (StackStart) add eax, [edi] mov esp, eax jmp CProcedureInvoke@@ -171,18 +171,18 @@ GetProcessorNumber:
; Note that BSP may become an AP due to SwitchBsp() ; xor ebx, ebx- lea eax, [esi + CpuInfoLocation]+ lea eax, [esi + MP_CPU_EXCHANGE_INFO_FIELD (CpuInfo)] mov edi, [eax] GetNextProcNumber:- cmp [edi], edx ; APIC ID match?+ cmp dword [edi + CPU_INFO_IN_HOB.InitialApicId], edx ; APIC ID match? jz ProgramStack- add edi, 20+ add edi, CPU_INFO_IN_HOB_size inc ebx jmp GetNextProcNumber ProgramStack:- mov esp, [edi + 12]+ mov esp, dword [edi + CPU_INFO_IN_HOB.ApTopOfStack] CProcedureInvoke: push ebp ; push BIST data at top of AP stack@@ -195,11 +195,11 @@ CProcedureInvoke:
push ebx ; Push ApIndex mov eax, esi- add eax, LockLocation+ add eax, MP_CPU_EXCHANGE_INFO_OFFSET push eax ; push address of exchange info data buffer mov edi, esi- add edi, ApProcedureLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (CFunction) mov eax, [edi] call eax ; Invoke C function@@ -262,17 +262,17 @@ ASM_PFX(AsmGetAddressMap):
mov ebp,esp mov ebx, [ebp + 24h]- mov dword [ebx], RendezvousFunnelProcStart- mov dword [ebx + 4h], Flat32Start - RendezvousFunnelProcStart- mov dword [ebx + 8h], RendezvousFunnelProcEnd - RendezvousFunnelProcStart- mov dword [ebx + 0Ch], AsmRelocateApLoopStart- mov dword [ebx + 10h], AsmRelocateApLoopEnd - AsmRelocateApLoopStart- mov dword [ebx + 14h], Flat32Start - RendezvousFunnelProcStart- mov dword [ebx + 18h], SwitchToRealProcEnd - SwitchToRealProcStart ; SwitchToRealSize- mov dword [ebx + 1Ch], SwitchToRealProcStart - RendezvousFunnelProcStart ; SwitchToRealOffset- mov dword [ebx + 20h], SwitchToRealProcStart - Flat32Start ; SwitchToRealNoNxOffset- mov dword [ebx + 24h], 0 ; SwitchToRealPM16ModeOffset- mov dword [ebx + 28h], 0 ; SwitchToRealPM16ModeSize+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.RendezvousFunnelAddress], RendezvousFunnelProcStart+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.ModeEntryOffset], Flat32Start - RendezvousFunnelProcStart+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.RendezvousFunnelSize], RendezvousFunnelProcEnd - RendezvousFunnelProcStart+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.RelocateApLoopFuncAddress], AsmRelocateApLoopStart+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.RelocateApLoopFuncSize], AsmRelocateApLoopEnd - AsmRelocateApLoopStart+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.ModeTransitionOffset], Flat32Start - RendezvousFunnelProcStart+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealSize], SwitchToRealProcEnd - SwitchToRealProcStart+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealOffset], SwitchToRealProcStart - RendezvousFunnelProcStart+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealNoNxOffset], SwitchToRealProcStart - Flat32Start+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealPM16ModeOffset], 0+ mov dword [ebx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealPM16ModeSize], 0 popad ret@@ -302,18 +302,18 @@ ASM_PFX(AsmExchangeRole):
mov eax, cr0 push eax - sgdt [esi + 8]- sidt [esi + 14]+ sgdt [esi + CPU_EXCHANGE_ROLE_INFO.Gdtr]+ sidt [esi + CPU_EXCHANGE_ROLE_INFO.Idtr] ; Store the its StackPointer- mov [esi + 4],esp+ mov [esi + CPU_EXCHANGE_ROLE_INFO.StackPointer],esp ; update its switch state to STORED- mov byte [esi], CPU_SWITCH_STATE_STORED+ mov byte [esi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_STORED WaitForOtherStored: ; wait until the other CPU finish storing its state- cmp byte [edi], CPU_SWITCH_STATE_STORED+ cmp byte [edi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_STORED jz OtherStored pause jmp WaitForOtherStored@@ -321,21 +321,21 @@ WaitForOtherStored:
OtherStored: ; Since another CPU already stored its state, load them ; load GDTR value- lgdt [edi + 8]+ lgdt [edi + CPU_EXCHANGE_ROLE_INFO.Gdtr] ; load IDTR value- lidt [edi + 14]+ lidt [edi + CPU_EXCHANGE_ROLE_INFO.Idtr] ; load its future StackPointer- mov esp, [edi + 4]+ mov esp, [edi + CPU_EXCHANGE_ROLE_INFO.StackPointer] ; update the other CPU's switch state to LOADED- mov byte [edi], CPU_SWITCH_STATE_LOADED+ mov byte [edi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_LOADED WaitForOtherLoaded: ; wait until the other CPU finish loading new state, ; otherwise the data in stack may corrupt- cmp byte [esi], CPU_SWITCH_STATE_LOADED+ cmp byte [esi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_LOADED jz OtherLoaded pause jmp WaitForOtherLoadeddiff --git a/UefiCpuPkg/Library/MpInitLib/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/MpEqu.inc
new file mode 100644
index 0000000000..46c2b5c116
--- /dev/null
+++ b/UefiCpuPkg/Library/MpInitLib/MpEqu.inc
@@ -0,0 +1,103 @@
+;------------------------------------------------------------------------------ ;+; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>+; SPDX-License-Identifier: BSD-2-Clause-Patent+;+; Module Name:+;+; MpEqu.inc+;+; Abstract:+;+; This is the equates file for Multiple Processor support+;+;-------------------------------------------------------------------------------+%include "Nasm.inc"++VacantFlag equ 00h+NotVacantFlag equ 0ffh++CPU_SWITCH_STATE_IDLE equ 0+CPU_SWITCH_STATE_STORED equ 1+CPU_SWITCH_STATE_LOADED equ 2++;+; Equivalent NASM structure of MP_ASSEMBLY_ADDRESS_MAP+;+struc MP_ASSEMBLY_ADDRESS_MAP+ .RendezvousFunnelAddress CTYPE_UINTN 1+ .ModeEntryOffset CTYPE_UINTN 1+ .RendezvousFunnelSize CTYPE_UINTN 1+ .RelocateApLoopFuncAddress CTYPE_UINTN 1+ .RelocateApLoopFuncSize CTYPE_UINTN 1+ .ModeTransitionOffset CTYPE_UINTN 1+ .SwitchToRealSize CTYPE_UINTN 1+ .SwitchToRealOffset CTYPE_UINTN 1+ .SwitchToRealNoNxOffset CTYPE_UINTN 1+ .SwitchToRealPM16ModeOffset CTYPE_UINTN 1+ .SwitchToRealPM16ModeSize CTYPE_UINTN 1+endstruc++;+; Equivalent NASM structure of IA32_DESCRIPTOR+;+struc IA32_DESCRIPTOR+ .Limit CTYPE_UINT16 1+ .Base CTYPE_UINTN 1+endstruc++;+; Equivalent NASM structure of CPU_EXCHANGE_ROLE_INFO+;+struc CPU_EXCHANGE_ROLE_INFO+ ; State is defined as UINT8 in C header file+ ; Define it as UINTN here to guarantee the fields that follow State+ ; is naturally aligned. The structure layout doesn't change.+ .State CTYPE_UINTN 1+ .StackPointer CTYPE_UINTN 1+ .Gdtr CTYPE_UINT8 IA32_DESCRIPTOR_size+ .Idtr CTYPE_UINT8 IA32_DESCRIPTOR_size+endstruc++;+; Equivalent NASM structure of CPU_INFO_IN_HOB+;+struc CPU_INFO_IN_HOB+ .InitialApicId CTYPE_UINT32 1+ .ApicId CTYPE_UINT32 1+ .Health CTYPE_UINT32 1+ .ApTopOfStack CTYPE_UINT64 1+endstruc++;+; Equivalent NASM structure of MP_CPU_EXCHANGE_INFO+;+struc MP_CPU_EXCHANGE_INFO+ .Lock: CTYPE_UINTN 1+ .StackStart: CTYPE_UINTN 1+ .StackSize: CTYPE_UINTN 1+ .CFunction: CTYPE_UINTN 1+ .GdtrProfile: CTYPE_UINT8 IA32_DESCRIPTOR_size+ .IdtrProfile: CTYPE_UINT8 IA32_DESCRIPTOR_size+ .BufferStart: CTYPE_UINTN 1+ .ModeOffset: CTYPE_UINTN 1+ .ApIndex: CTYPE_UINTN 1+ .CodeSegment: CTYPE_UINTN 1+ .DataSegment: CTYPE_UINTN 1+ .EnableExecuteDisable: CTYPE_UINTN 1+ .Cr3: CTYPE_UINTN 1+ .InitFlag: CTYPE_UINTN 1+ .CpuInfo: CTYPE_UINTN 1+ .NumApsExecuting: CTYPE_UINTN 1+ .CpuMpData: CTYPE_UINTN 1+ .InitializeFloatingPointUnits: CTYPE_UINTN 1+ .ModeTransitionMemory: CTYPE_UINT32 1+ .ModeTransitionSegment: CTYPE_UINT16 1+ .ModeHighMemory: CTYPE_UINT32 1+ .ModeHighSegment: CTYPE_UINT16 1+ .Enable5LevelPaging: CTYPE_BOOLEAN 1+ .SevEsIsEnabled: CTYPE_BOOLEAN 1+ .GhcbBase: CTYPE_UINTN 1+endstruc++MP_CPU_EXCHANGE_INFO_OFFSET equ (SwitchToRealProcEnd - RendezvousFunnelProcStart)+%define MP_CPU_EXCHANGE_INFO_FIELD(Field) (MP_CPU_EXCHANGE_INFO_OFFSET + MP_CPU_EXCHANGE_INFO. %+ Field)diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
index 34abf25d43..49b0ffe8be 100644
--- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
+++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
@@ -1,7 +1,7 @@
## @file # MP Initialize Library instance for PEI driver. #-# Copyright (c) 2016 - 2020, Intel Corporation. All rights reserved.<BR>+# Copyright (c) 2016 - 2021, Intel Corporation. All rights reserved.<BR> # SPDX-License-Identifier: BSD-2-Clause-Patent # ##@@ -22,14 +22,13 @@ [Defines]
# [Sources.IA32]- Ia32/MpEqu.inc Ia32/MpFuncs.nasm [Sources.X64]- X64/MpEqu.inc X64/MpFuncs.nasm [Sources.common]+ MpEqu.inc PeiMpLib.c MpLib.c MpLib.hdiff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
deleted file mode 100644
index c92daaaffd..0000000000
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
+++ /dev/null
@@ -1,45 +0,0 @@
-;------------------------------------------------------------------------------ ;-; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>-; SPDX-License-Identifier: BSD-2-Clause-Patent-;-; Module Name:-;-; MpEqu.inc-;-; Abstract:-;-; This is the equates file for Multiple Processor support-;-;---------------------------------------------------------------------------------VacantFlag equ 00h-NotVacantFlag equ 0ffh--CPU_SWITCH_STATE_IDLE equ 0-CPU_SWITCH_STATE_STORED equ 1-CPU_SWITCH_STATE_LOADED equ 2--LockLocation equ (SwitchToRealProcEnd - RendezvousFunnelProcStart)-StackStartAddressLocation equ LockLocation + 08h-StackSizeLocation equ LockLocation + 10h-ApProcedureLocation equ LockLocation + 18h-GdtrLocation equ LockLocation + 20h-IdtrLocation equ LockLocation + 2Ah-BufferStartLocation equ LockLocation + 34h-ModeOffsetLocation equ LockLocation + 3Ch-ApIndexLocation equ LockLocation + 44h-CodeSegmentLocation equ LockLocation + 4Ch-DataSegmentLocation equ LockLocation + 54h-EnableExecuteDisableLocation equ LockLocation + 5Ch-Cr3Location equ LockLocation + 64h-InitFlagLocation equ LockLocation + 6Ch-CpuInfoLocation equ LockLocation + 74h-NumApsExecutingLocation equ LockLocation + 7Ch-InitializeFloatingPointUnitsAddress equ LockLocation + 8Ch-ModeTransitionMemoryLocation equ LockLocation + 94h-ModeTransitionSegmentLocation equ LockLocation + 98h-ModeHighMemoryLocation equ LockLocation + 9Ah-ModeHighSegmentLocation equ LockLocation + 9Eh-Enable5LevelPagingLocation equ LockLocation + 0A0h-SevEsIsEnabledLocation equ LockLocation + 0A1h-GhcbBaseLocation equ LockLocation + 0A2hdiff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
index 5b588f2dcb..db297f5cca 100644
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
@@ -43,21 +43,21 @@ BITS 16
mov fs, ax mov gs, ax - mov si, BufferStartLocation+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (BufferStart) mov ebx, [si] - mov si, DataSegmentLocation+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (DataSegment) mov edx, [si] ; ; Get start address of 32-bit code in low memory (<1MB) ;- mov edi, ModeTransitionMemoryLocation+ mov edi, MP_CPU_EXCHANGE_INFO_FIELD (ModeTransitionMemory) - mov si, GdtrLocation+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (GdtrProfile) o32 lgdt [cs:si] - mov si, IdtrLocation+ mov si, MP_CPU_EXCHANGE_INFO_FIELD (IdtrProfile) o32 lidt [cs:si] ;@@ -85,7 +85,7 @@ Flat32Start: ; protected mode entry point
; ; Enable execute disable bit ;- mov esi, EnableExecuteDisableLocation+ mov esi, MP_CPU_EXCHANGE_INFO_FIELD (EnableExecuteDisable) cmp byte [ebx + esi], 0 jz SkipEnableExecuteDisableBit @@ -101,7 +101,7 @@ SkipEnableExecuteDisableBit:
mov eax, cr4 bts eax, 5 - mov esi, Enable5LevelPagingLocation+ mov esi, MP_CPU_EXCHANGE_INFO_FIELD (Enable5LevelPaging) cmp byte [ebx + esi], 0 jz SkipEnable5LevelPaging @@ -117,7 +117,7 @@ SkipEnable5LevelPaging:
; ; Load page table ;- mov esi, Cr3Location ; Save CR3 in ecx+ mov esi, MP_CPU_EXCHANGE_INFO_FIELD (Cr3) ; Save CR3 in ecx mov ecx, [ebx + esi] mov cr3, ecx ; Load CR3 @@ -139,47 +139,47 @@ SkipEnable5LevelPaging:
; ; Far jump to 64-bit code ;- mov edi, ModeHighMemoryLocation+ mov edi, MP_CPU_EXCHANGE_INFO_FIELD (ModeHighMemory) add edi, ebx jmp far [edi] BITS 64 LongModeStart: mov esi, ebx- lea edi, [esi + InitFlagLocation]+ lea edi, [esi + MP_CPU_EXCHANGE_INFO_FIELD (InitFlag)] cmp qword [edi], 1 ; ApInitConfig jnz GetApicId ; Increment the number of APs executing here as early as possible ; This is decremented in C code when AP is finished executing mov edi, esi- add edi, NumApsExecutingLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (NumApsExecuting) lock inc dword [edi] ; AP init mov edi, esi- add edi, LockLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (Lock) mov rax, NotVacantFlag mov edi, esi- add edi, ApIndexLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (ApIndex) mov ebx, 1 lock xadd dword [edi], ebx ; EBX = ApIndex++ inc ebx ; EBX is CpuNumber ; program stack mov edi, esi- add edi, StackSizeLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (StackSize) mov eax, dword [edi] mov ecx, ebx inc ecx mul ecx ; EAX = StackSize * (CpuNumber + 1) mov edi, esi- add edi, StackStartAddressLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (StackStart) add rax, qword [edi] mov rsp, rax - lea edi, [esi + SevEsIsEnabledLocation]+ lea edi, [esi + MP_CPU_EXCHANGE_INFO_FIELD (SevEsIsEnabled)] cmp byte [edi], 1 ; SevEsIsEnabled jne CProcedureInvoke @@ -193,7 +193,7 @@ LongModeStart:
mov ecx, ebx mul ecx ; EAX = SIZE_4K * 2 * CpuNumber mov edi, esi- add edi, GhcbBaseLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (GhcbBase) add rax, qword [edi] mov rdx, rax shr rdx, 32@@ -202,7 +202,7 @@ LongModeStart:
jmp CProcedureInvoke GetApicId:- lea edi, [esi + SevEsIsEnabledLocation]+ lea edi, [esi + MP_CPU_EXCHANGE_INFO_FIELD (SevEsIsEnabled)] cmp byte [edi], 1 ; SevEsIsEnabled jne DoCpuid @@ -296,18 +296,18 @@ GetProcessorNumber:
; Note that BSP may become an AP due to SwitchBsp() ; xor ebx, ebx- lea eax, [esi + CpuInfoLocation]+ lea eax, [esi + MP_CPU_EXCHANGE_INFO_FIELD (CpuInfo)] mov rdi, [eax] GetNextProcNumber:- cmp dword [rdi], edx ; APIC ID match?+ cmp dword [rdi + CPU_INFO_IN_HOB.InitialApicId], edx ; APIC ID match? jz ProgramStack- add rdi, 20+ add rdi, CPU_INFO_IN_HOB_size inc ebx jmp GetNextProcNumber ProgramStack:- mov rsp, qword [rdi + 12]+ mov rsp, qword [rdi + CPU_INFO_IN_HOB.ApTopOfStack] CProcedureInvoke: push rbp ; Push BIST data at top of AP stack@@ -315,17 +315,17 @@ CProcedureInvoke:
push rbp mov rbp, rsp - mov rax, qword [esi + InitializeFloatingPointUnitsAddress]+ mov rax, qword [esi + MP_CPU_EXCHANGE_INFO_FIELD (InitializeFloatingPointUnits)] sub rsp, 20h call rax ; Call assembly function to initialize FPU per UEFI spec add rsp, 20h mov edx, ebx ; edx is ApIndex mov ecx, esi- add ecx, LockLocation ; rcx is address of exchange info data buffer+ add ecx, MP_CPU_EXCHANGE_INFO_OFFSET ; rcx is address of exchange info data buffer mov edi, esi- add edi, ApProcedureLocation+ add edi, MP_CPU_EXCHANGE_INFO_FIELD (CFunction) mov rax, qword [edi] sub rsp, 20h@@ -661,18 +661,18 @@ AsmRelocateApLoopEnd:
global ASM_PFX(AsmGetAddressMap) ASM_PFX(AsmGetAddressMap): lea rax, [ASM_PFX(RendezvousFunnelProc)]- mov qword [rcx], rax- mov qword [rcx + 8h], LongModeStart - RendezvousFunnelProcStart- mov qword [rcx + 10h], RendezvousFunnelProcEnd - RendezvousFunnelProcStart+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.RendezvousFunnelAddress], rax+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.ModeEntryOffset], LongModeStart - RendezvousFunnelProcStart+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.RendezvousFunnelSize], RendezvousFunnelProcEnd - RendezvousFunnelProcStart lea rax, [ASM_PFX(AsmRelocateApLoop)]- mov qword [rcx + 18h], rax- mov qword [rcx + 20h], AsmRelocateApLoopEnd - AsmRelocateApLoopStart- mov qword [rcx + 28h], Flat32Start - RendezvousFunnelProcStart- mov qword [rcx + 30h], SwitchToRealProcEnd - SwitchToRealProcStart ; SwitchToRealSize- mov qword [rcx + 38h], SwitchToRealProcStart - RendezvousFunnelProcStart ; SwitchToRealOffset- mov qword [rcx + 40h], SwitchToRealProcStart - Flat32Start ; SwitchToRealNoNxOffset- mov qword [rcx + 48h], PM16Mode - RendezvousFunnelProcStart ; SwitchToRealPM16ModeOffset- mov qword [rcx + 50h], SwitchToRealProcEnd - PM16Mode ; SwitchToRealPM16ModeSize+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.RelocateApLoopFuncAddress], rax+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.RelocateApLoopFuncSize], AsmRelocateApLoopEnd - AsmRelocateApLoopStart+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.ModeTransitionOffset], Flat32Start - RendezvousFunnelProcStart+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealSize], SwitchToRealProcEnd - SwitchToRealProcStart+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealOffset], SwitchToRealProcStart - RendezvousFunnelProcStart+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealNoNxOffset], SwitchToRealProcStart - Flat32Start+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealPM16ModeOffset], PM16Mode - RendezvousFunnelProcStart+ mov qword [rcx + MP_ASSEMBLY_ADDRESS_MAP.SwitchToRealPM16ModeSize], SwitchToRealProcEnd - PM16Mode ret ;-------------------------------------------------------------------------------------@@ -715,18 +715,18 @@ ASM_PFX(AsmExchangeRole):
;Store EFLAGS, GDTR and IDTR regiter to stack pushfq- sgdt [rsi + 16]- sidt [rsi + 26]+ sgdt [rsi + CPU_EXCHANGE_ROLE_INFO.Gdtr]+ sidt [rsi + CPU_EXCHANGE_ROLE_INFO.Idtr] ; Store the its StackPointer- mov [rsi + 8], rsp+ mov [rsi + CPU_EXCHANGE_ROLE_INFO.StackPointer], rsp ; update its switch state to STORED- mov byte [rsi], CPU_SWITCH_STATE_STORED+ mov byte [rsi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_STORED WaitForOtherStored: ; wait until the other CPU finish storing its state- cmp byte [rdi], CPU_SWITCH_STATE_STORED+ cmp byte [rdi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_STORED jz OtherStored pause jmp WaitForOtherStored@@ -734,21 +734,21 @@ WaitForOtherStored:
OtherStored: ; Since another CPU already stored its state, load them ; load GDTR value- lgdt [rdi + 16]+ lgdt [rdi + CPU_EXCHANGE_ROLE_INFO.Gdtr] ; load IDTR value- lidt [rdi + 26]+ lidt [rdi + CPU_EXCHANGE_ROLE_INFO.Idtr] ; load its future StackPointer- mov rsp, [rdi + 8]+ mov rsp, [rdi + CPU_EXCHANGE_ROLE_INFO.StackPointer] ; update the other CPU's switch state to LOADED- mov byte [rdi], CPU_SWITCH_STATE_LOADED+ mov byte [rdi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_LOADED WaitForOtherLoaded: ; wait until the other CPU finish loading new state, ; otherwise the data in stack may corrupt- cmp byte [rsi], CPU_SWITCH_STATE_LOADED+ cmp byte [rsi + CPU_EXCHANGE_ROLE_INFO.State], CPU_SWITCH_STATE_LOADED jz OtherLoaded pause jmp WaitForOtherLoaded--
2.27.0.windows.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v3 4/4] UefiCpuPkg/MpInitLib: Remove unused Lock from MP_CPU_EXCHANGE_INFO
2021-02-09 14:16 ` [PATCH v3 4/4] UefiCpuPkg/MpInitLib: Remove unused Lock from MP_CPU_EXCHANGE_INFO Ni, Ray
@ 2021-02-22 9:07 ` Dong, Eric
0 siblings, 0 replies; 15+ messages in thread
From: Dong, Eric @ 2021-02-22 9:07 UTC (permalink / raw)
To: Ni, Ray, devel@edk2.groups.io; +Cc: Laszlo Ersek, Kumar, Rahul1
Reviewed-by: Eric Dong <eric.dong@intel.com>
-----Original Message-----
From: Ni, Ray <ray.ni@intel.com>
Sent: Tuesday, February 9, 2021 10:17 PM
To: devel@edk2.groups.io
Cc: Dong, Eric <eric.dong@intel.com>; Laszlo Ersek <lersek@redhat.com>; Kumar, Rahul1 <rahul1.kumar@intel.com>
Subject: [PATCH v3 4/4] UefiCpuPkg/MpInitLib: Remove unused Lock from MP_CPU_EXCHANGE_INFO
The Lock is no longer needed since "LOCK XADD" was used in MpFuncs.nasm for ApIndex atomic increment.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
---
UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 4 ----
UefiCpuPkg/Library/MpInitLib/MpEqu.inc | 4 ----
UefiCpuPkg/Library/MpInitLib/MpLib.c | 1 -
UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 +--
UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 4 ----
5 files changed, 1 insertion(+), 15 deletions(-)
diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
index 4363ad9a18..7bd2415670 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
@@ -121,10 +121,6 @@ SkipEnableExecuteDisable:
lock inc dword [edi] ; AP init- mov edi, esi- add edi, MP_CPU_EXCHANGE_INFO_FIELD (Lock)- mov eax, NotVacantFlag- mov edi, esi add edi, MP_CPU_EXCHANGE_INFO_FIELD (ApIndex) mov ebx, 1diff --git a/UefiCpuPkg/Library/MpInitLib/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/MpEqu.inc
index 46c2b5c116..2e9368a374 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpEqu.inc
+++ b/UefiCpuPkg/Library/MpInitLib/MpEqu.inc
@@ -13,9 +13,6 @@
;------------------------------------------------------------------------------- %include "Nasm.inc" -VacantFlag equ 00h-NotVacantFlag equ 0ffh- CPU_SWITCH_STATE_IDLE equ 0 CPU_SWITCH_STATE_STORED equ 1 CPU_SWITCH_STATE_LOADED equ 2@@ -72,7 +69,6 @@ endstruc
; Equivalent NASM structure of MP_CPU_EXCHANGE_INFO ; struc MP_CPU_EXCHANGE_INFO- .Lock: CTYPE_UINTN 1 .StackStart: CTYPE_UINTN 1 .StackSize: CTYPE_UINTN 1 .CFunction: CTYPE_UINTN 1diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 2568986d8c..5040053dad 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -1006,7 +1006,6 @@ FillExchangeInfoData (
IA32_CR4 Cr4; ExchangeInfo = CpuMpData->MpCpuExchangeInfo;- ExchangeInfo->Lock = 0; ExchangeInfo->StackStart = CpuMpData->Buffer; ExchangeInfo->StackSize = CpuMpData->CpuApStackSize; ExchangeInfo->BufferStart = CpuMpData->WakeupBuffer;diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h
index 02652eaae1..0bd60388b1 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
@@ -1,7 +1,7 @@
/** @file Common header file for MP Initialize Library. - Copyright (c) 2016 - 2020, Intel Corporation. All rights reserved.<BR>+ Copyright (c) 2016 - 2021, Intel Corporation. All rights reserved.<BR> Copyright (c) 2020, AMD Inc. All rights reserved.<BR> SPDX-License-Identifier: BSD-2-Clause-Patent@@ -190,7 +190,6 @@ typedef struct _CPU_MP_DATA CPU_MP_DATA;
// into this structure are used in assembly code in this module // typedef struct {- UINTN Lock; UINTN StackStart; UINTN StackSize; UINTN CFunction;diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
index db297f5cca..50df802d1f 100644
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
@@ -157,10 +157,6 @@ LongModeStart:
lock inc dword [edi] ; AP init- mov edi, esi- add edi, MP_CPU_EXCHANGE_INFO_FIELD (Lock)- mov rax, NotVacantFlag- mov edi, esi add edi, MP_CPU_EXCHANGE_INFO_FIELD (ApIndex) mov ebx, 1--
2.27.0.windows.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
[not found] ` <166219FF4C25D9C5.16853@groups.io>
@ 2021-02-23 2:22 ` Ni, Ray
0 siblings, 0 replies; 15+ messages in thread
From: Ni, Ray @ 2021-02-23 2:22 UTC (permalink / raw)
To: Kinney, Michael D
Cc: Dong, Eric, Laszlo Ersek, Kumar, Rahul1, devel@edk2.groups.io,
Ni, Ray
Mike,
This patch follows your suggestion to fix the performance issue first, clean up the code next.
Can you check specifically this patch?
Thanks,
Ray
> -----Original Message-----
> From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Ni, Ray
> Sent: Tuesday, February 9, 2021 10:17 PM
> To: devel@edk2.groups.io
> Cc: Dong, Eric <eric.dong@intel.com>; Laszlo Ersek <lersek@redhat.com>;
> Kumar, Rahul1 <rahul1.kumar@intel.com>
> Subject: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to
> avoid lock acquire/release
>
> When AP firstly wakes up, MpFuncs.nasm contains below logic to assign
> an unique ApIndex to each AP according to who comes first:
> ---ASM---
> TestLock:
> xchg [edi], eax
> cmp eax, NotVacantFlag
> jz TestLock
>
> mov ecx, esi
> add ecx, ApIndexLocation
> inc dword [ecx]
> mov ebx, [ecx]
>
> Releaselock:
> mov eax, VacantFlag
> xchg [edi], eax
> ---ASM END---
>
> "lock inc" cannot be used to increase ApIndex because not only the
> global ApIndex should be increased, but also the result should be
> stored to a local general purpose register EBX.
>
> This patch learns from the NASM implementation of
> InternalSyncIncrement() to use "XADD" instruction which can increase
> the global ApIndex and store the original ApIndex to EBX in one
> instruction.
>
> With this patch, OVMF when running in a 255 threads QEMU spends about
> one second to wakeup all APs. Original implementation needs more than
> 10 seconds.
>
> Signed-off-by: Ray Ni <ray.ni@intel.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Rahul Kumar <rahul1.kumar@intel.com>
> ---
> .../Library/MpInitLib/Ia32/MpFuncs.nasm | 20 ++++++-------------
> UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 18 ++++++-----------
> 2 files changed, 12 insertions(+), 26 deletions(-)
>
> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> index 7e81d24aa6..2eaddc93bc 100644
> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> @@ -1,5 +1,5 @@
> ;------------------------------------------------------------------------------ ;
>
> -; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
>
> +; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
>
> ; SPDX-License-Identifier: BSD-2-Clause-Patent
>
> ;
>
> ; Module Name:
>
> @@ -125,19 +125,11 @@ SkipEnableExecuteDisable:
> add edi, LockLocation
>
> mov eax, NotVacantFlag
>
>
>
> -TestLock:
>
> - xchg [edi], eax
>
> - cmp eax, NotVacantFlag
>
> - jz TestLock
>
> -
>
> - mov ecx, esi
>
> - add ecx, ApIndexLocation
>
> - inc dword [ecx]
>
> - mov ebx, [ecx]
>
> -
>
> -Releaselock:
>
> - mov eax, VacantFlag
>
> - xchg [edi], eax
>
> + mov edi, esi
>
> + add edi, ApIndexLocation
>
> + mov ebx, 1
>
> + lock xadd dword [edi], ebx ; EBX = ApIndex++
>
> + inc ebx ; EBX is CpuNumber
>
>
>
> mov edi, esi
>
> add edi, StackSizeLocation
>
> diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> index aecfd07bc0..5b588f2dcb 100644
> --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> @@ -1,5 +1,5 @@
> ;------------------------------------------------------------------------------ ;
>
> -; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
>
> +; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
>
> ; SPDX-License-Identifier: BSD-2-Clause-Patent
>
> ;
>
> ; Module Name:
>
> @@ -161,18 +161,12 @@ LongModeStart:
> add edi, LockLocation
>
> mov rax, NotVacantFlag
>
>
>
> -TestLock:
>
> - xchg qword [edi], rax
>
> - cmp rax, NotVacantFlag
>
> - jz TestLock
>
> -
>
> - lea ecx, [esi + ApIndexLocation]
>
> - inc dword [ecx]
>
> - mov ebx, [ecx]
>
> + mov edi, esi
>
> + add edi, ApIndexLocation
>
> + mov ebx, 1
>
> + lock xadd dword [edi], ebx ; EBX = ApIndex++
>
> + inc ebx ; EBX is CpuNumber
>
>
>
> -Releaselock:
>
> - mov rax, VacantFlag
>
> - xchg qword [edi], rax
>
> ; program stack
>
> mov edi, esi
>
> add edi, StackSizeLocation
>
> --
> 2.27.0.windows.1
>
>
>
> -=-=-=-=-=-=
> Groups.io Links: You receive all messages sent to this group.
> View/Reply Online (#71517): https://edk2.groups.io/g/devel/message/71517
> Mute This Topic: https://groups.io/mt/80504936/1712937
> Group Owner: devel+owner@edk2.groups.io
> Unsubscribe: https://edk2.groups.io/g/devel/unsub [ray.ni@intel.com]
> -=-=-=-=-=-=
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
2021-02-09 14:16 ` [PATCH v3 1/4] UefiCpuPkg/MpInitLib: " Ni, Ray
2021-02-22 9:06 ` Dong, Eric
@ 2021-02-23 18:11 ` Michael D Kinney
2021-02-25 4:04 ` Ni, Ray
1 sibling, 1 reply; 15+ messages in thread
From: Michael D Kinney @ 2021-02-23 18:11 UTC (permalink / raw)
To: devel@edk2.groups.io, Ni, Ray, Kinney, Michael D
Cc: Dong, Eric, Laszlo Ersek, Kumar, Rahul1
Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
> -----Original Message-----
> From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Ni, Ray
> Sent: Tuesday, February 9, 2021 6:17 AM
> To: devel@edk2.groups.io
> Cc: Dong, Eric <eric.dong@intel.com>; Laszlo Ersek <lersek@redhat.com>; Kumar, Rahul1 <rahul1.kumar@intel.com>
> Subject: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
>
> When AP firstly wakes up, MpFuncs.nasm contains below logic to assign
> an unique ApIndex to each AP according to who comes first:
> ---ASM---
> TestLock:
> xchg [edi], eax
> cmp eax, NotVacantFlag
> jz TestLock
>
> mov ecx, esi
> add ecx, ApIndexLocation
> inc dword [ecx]
> mov ebx, [ecx]
>
> Releaselock:
> mov eax, VacantFlag
> xchg [edi], eax
> ---ASM END---
>
> "lock inc" cannot be used to increase ApIndex because not only the
> global ApIndex should be increased, but also the result should be
> stored to a local general purpose register EBX.
>
> This patch learns from the NASM implementation of
> InternalSyncIncrement() to use "XADD" instruction which can increase
> the global ApIndex and store the original ApIndex to EBX in one
> instruction.
>
> With this patch, OVMF when running in a 255 threads QEMU spends about
> one second to wakeup all APs. Original implementation needs more than
> 10 seconds.
>
> Signed-off-by: Ray Ni <ray.ni@intel.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Rahul Kumar <rahul1.kumar@intel.com>
> ---
> .../Library/MpInitLib/Ia32/MpFuncs.nasm | 20 ++++++-------------
> UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 18 ++++++-----------
> 2 files changed, 12 insertions(+), 26 deletions(-)
>
> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> index 7e81d24aa6..2eaddc93bc 100644
> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> @@ -1,5 +1,5 @@
> ;------------------------------------------------------------------------------ ;
>
> -; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
>
> +; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
>
> ; SPDX-License-Identifier: BSD-2-Clause-Patent
>
> ;
>
> ; Module Name:
>
> @@ -125,19 +125,11 @@ SkipEnableExecuteDisable:
> add edi, LockLocation
>
> mov eax, NotVacantFlag
>
>
>
> -TestLock:
>
> - xchg [edi], eax
>
> - cmp eax, NotVacantFlag
>
> - jz TestLock
>
> -
>
> - mov ecx, esi
>
> - add ecx, ApIndexLocation
>
> - inc dword [ecx]
>
> - mov ebx, [ecx]
>
> -
>
> -Releaselock:
>
> - mov eax, VacantFlag
>
> - xchg [edi], eax
>
> + mov edi, esi
>
> + add edi, ApIndexLocation
>
> + mov ebx, 1
>
> + lock xadd dword [edi], ebx ; EBX = ApIndex++
>
> + inc ebx ; EBX is CpuNumber
>
>
>
> mov edi, esi
>
> add edi, StackSizeLocation
>
> diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> index aecfd07bc0..5b588f2dcb 100644
> --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> @@ -1,5 +1,5 @@
> ;------------------------------------------------------------------------------ ;
>
> -; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
>
> +; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
>
> ; SPDX-License-Identifier: BSD-2-Clause-Patent
>
> ;
>
> ; Module Name:
>
> @@ -161,18 +161,12 @@ LongModeStart:
> add edi, LockLocation
>
> mov rax, NotVacantFlag
>
>
>
> -TestLock:
>
> - xchg qword [edi], rax
>
> - cmp rax, NotVacantFlag
>
> - jz TestLock
>
> -
>
> - lea ecx, [esi + ApIndexLocation]
>
> - inc dword [ecx]
>
> - mov ebx, [ecx]
>
> + mov edi, esi
>
> + add edi, ApIndexLocation
>
> + mov ebx, 1
>
> + lock xadd dword [edi], ebx ; EBX = ApIndex++
>
> + inc ebx ; EBX is CpuNumber
>
>
>
> -Releaselock:
>
> - mov rax, VacantFlag
>
> - xchg qword [edi], rax
>
> ; program stack
>
> mov edi, esi
>
> add edi, StackSizeLocation
>
> --
> 2.27.0.windows.1
>
>
>
> -=-=-=-=-=-=
> Groups.io Links: You receive all messages sent to this group.
> View/Reply Online (#71517): https://edk2.groups.io/g/devel/message/71517
> Mute This Topic: https://groups.io/mt/80504936/1643496
> Group Owner: devel+owner@edk2.groups.io
> Unsubscribe: https://edk2.groups.io/g/devel/unsub [michael.d.kinney@intel.com]
> -=-=-=-=-=-=
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
2021-02-23 18:11 ` [edk2-devel] " Michael D Kinney
@ 2021-02-25 4:04 ` Ni, Ray
2021-02-25 19:02 ` Laszlo Ersek
0 siblings, 1 reply; 15+ messages in thread
From: Ni, Ray @ 2021-02-25 4:04 UTC (permalink / raw)
To: devel@edk2.groups.io, Laszlo Ersek
Cc: Dong, Eric, Kumar, Rahul1, Kinney, Michael D
Laszlo,
Do you think that Mike's R-b to the first patch can be an ack to the V3 patch set?
Can you please review and provide comments?
Thanks,
Ray
> -----Original Message-----
> From: Kinney, Michael D <michael.d.kinney@intel.com>
> Sent: Wednesday, February 24, 2021 2:12 AM
> To: devel@edk2.groups.io; Ni, Ray <ray.ni@intel.com>; Kinney, Michael D
> <michael.d.kinney@intel.com>
> Cc: Dong, Eric <eric.dong@intel.com>; Laszlo Ersek <lersek@redhat.com>;
> Kumar, Rahul1 <rahul1.kumar@intel.com>
> Subject: RE: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD
> to avoid lock acquire/release
>
> Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
>
> > -----Original Message-----
> > From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Ni,
> Ray
> > Sent: Tuesday, February 9, 2021 6:17 AM
> > To: devel@edk2.groups.io
> > Cc: Dong, Eric <eric.dong@intel.com>; Laszlo Ersek <lersek@redhat.com>;
> Kumar, Rahul1 <rahul1.kumar@intel.com>
> > Subject: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to
> avoid lock acquire/release
> >
> > When AP firstly wakes up, MpFuncs.nasm contains below logic to assign
> > an unique ApIndex to each AP according to who comes first:
> > ---ASM---
> > TestLock:
> > xchg [edi], eax
> > cmp eax, NotVacantFlag
> > jz TestLock
> >
> > mov ecx, esi
> > add ecx, ApIndexLocation
> > inc dword [ecx]
> > mov ebx, [ecx]
> >
> > Releaselock:
> > mov eax, VacantFlag
> > xchg [edi], eax
> > ---ASM END---
> >
> > "lock inc" cannot be used to increase ApIndex because not only the
> > global ApIndex should be increased, but also the result should be
> > stored to a local general purpose register EBX.
> >
> > This patch learns from the NASM implementation of
> > InternalSyncIncrement() to use "XADD" instruction which can increase
> > the global ApIndex and store the original ApIndex to EBX in one
> > instruction.
> >
> > With this patch, OVMF when running in a 255 threads QEMU spends about
> > one second to wakeup all APs. Original implementation needs more than
> > 10 seconds.
> >
> > Signed-off-by: Ray Ni <ray.ni@intel.com>
> > Cc: Eric Dong <eric.dong@intel.com>
> > Cc: Laszlo Ersek <lersek@redhat.com>
> > Cc: Rahul Kumar <rahul1.kumar@intel.com>
> > ---
> > .../Library/MpInitLib/Ia32/MpFuncs.nasm | 20 ++++++-------------
> > UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 18 ++++++-----------
> > 2 files changed, 12 insertions(+), 26 deletions(-)
> >
> > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> > index 7e81d24aa6..2eaddc93bc 100644
> > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> > @@ -1,5 +1,5 @@
> > ;------------------------------------------------------------------------------ ;
> >
> > -; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
> >
> > +; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
> >
> > ; SPDX-License-Identifier: BSD-2-Clause-Patent
> >
> > ;
> >
> > ; Module Name:
> >
> > @@ -125,19 +125,11 @@ SkipEnableExecuteDisable:
> > add edi, LockLocation
> >
> > mov eax, NotVacantFlag
> >
> >
> >
> > -TestLock:
> >
> > - xchg [edi], eax
> >
> > - cmp eax, NotVacantFlag
> >
> > - jz TestLock
> >
> > -
> >
> > - mov ecx, esi
> >
> > - add ecx, ApIndexLocation
> >
> > - inc dword [ecx]
> >
> > - mov ebx, [ecx]
> >
> > -
> >
> > -Releaselock:
> >
> > - mov eax, VacantFlag
> >
> > - xchg [edi], eax
> >
> > + mov edi, esi
> >
> > + add edi, ApIndexLocation
> >
> > + mov ebx, 1
> >
> > + lock xadd dword [edi], ebx ; EBX = ApIndex++
> >
> > + inc ebx ; EBX is CpuNumber
> >
> >
> >
> > mov edi, esi
> >
> > add edi, StackSizeLocation
> >
> > diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> > index aecfd07bc0..5b588f2dcb 100644
> > --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> > +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> > @@ -1,5 +1,5 @@
> > ;------------------------------------------------------------------------------ ;
> >
> > -; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
> >
> > +; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
> >
> > ; SPDX-License-Identifier: BSD-2-Clause-Patent
> >
> > ;
> >
> > ; Module Name:
> >
> > @@ -161,18 +161,12 @@ LongModeStart:
> > add edi, LockLocation
> >
> > mov rax, NotVacantFlag
> >
> >
> >
> > -TestLock:
> >
> > - xchg qword [edi], rax
> >
> > - cmp rax, NotVacantFlag
> >
> > - jz TestLock
> >
> > -
> >
> > - lea ecx, [esi + ApIndexLocation]
> >
> > - inc dword [ecx]
> >
> > - mov ebx, [ecx]
> >
> > + mov edi, esi
> >
> > + add edi, ApIndexLocation
> >
> > + mov ebx, 1
> >
> > + lock xadd dword [edi], ebx ; EBX = ApIndex++
> >
> > + inc ebx ; EBX is CpuNumber
> >
> >
> >
> > -Releaselock:
> >
> > - mov rax, VacantFlag
> >
> > - xchg qword [edi], rax
> >
> > ; program stack
> >
> > mov edi, esi
> >
> > add edi, StackSizeLocation
> >
> > --
> > 2.27.0.windows.1
> >
> >
> >
> > -=-=-=-=-=-=
> > Groups.io Links: You receive all messages sent to this group.
> > View/Reply Online (#71517):
> https://edk2.groups.io/g/devel/message/71517
> > Mute This Topic: https://groups.io/mt/80504936/1643496
> > Group Owner: devel+owner@edk2.groups.io
> > Unsubscribe: https://edk2.groups.io/g/devel/unsub
> [michael.d.kinney@intel.com]
> > -=-=-=-=-=-=
> >
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
2021-02-25 4:04 ` Ni, Ray
@ 2021-02-25 19:02 ` Laszlo Ersek
0 siblings, 0 replies; 15+ messages in thread
From: Laszlo Ersek @ 2021-02-25 19:02 UTC (permalink / raw)
To: Ni, Ray, devel@edk2.groups.io
Cc: Dong, Eric, Kumar, Rahul1, Kinney, Michael D
On 02/25/21 05:04, Ni, Ray wrote:
> Laszlo,
> Do you think that Mike's R-b to the first patch can be an ack to the V3 patch set?
No, I don't think so. If an R-b is given in response to a specific patch
(not the cover letter), and the reviewer doesn't explicitly say "series
Reviewed-by" or "for the entire series:", then the R-b applies only to
the specific patch.
Now, a different question is whether you want or need Mike's R-b for
*all* of the patches. That's up to you and Mike to decide.
> Can you please review and provide comments?
Yes, I'll comment soon.
Thanks
Laszlo
>> -----Original Message-----
>> From: Kinney, Michael D <michael.d.kinney@intel.com>
>> Sent: Wednesday, February 24, 2021 2:12 AM
>> To: devel@edk2.groups.io; Ni, Ray <ray.ni@intel.com>; Kinney, Michael D
>> <michael.d.kinney@intel.com>
>> Cc: Dong, Eric <eric.dong@intel.com>; Laszlo Ersek <lersek@redhat.com>;
>> Kumar, Rahul1 <rahul1.kumar@intel.com>
>> Subject: RE: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD
>> to avoid lock acquire/release
>>
>> Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
>>
>>> -----Original Message-----
>>> From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Ni,
>> Ray
>>> Sent: Tuesday, February 9, 2021 6:17 AM
>>> To: devel@edk2.groups.io
>>> Cc: Dong, Eric <eric.dong@intel.com>; Laszlo Ersek <lersek@redhat.com>;
>> Kumar, Rahul1 <rahul1.kumar@intel.com>
>>> Subject: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to
>> avoid lock acquire/release
>>>
>>> When AP firstly wakes up, MpFuncs.nasm contains below logic to assign
>>> an unique ApIndex to each AP according to who comes first:
>>> ---ASM---
>>> TestLock:
>>> xchg [edi], eax
>>> cmp eax, NotVacantFlag
>>> jz TestLock
>>>
>>> mov ecx, esi
>>> add ecx, ApIndexLocation
>>> inc dword [ecx]
>>> mov ebx, [ecx]
>>>
>>> Releaselock:
>>> mov eax, VacantFlag
>>> xchg [edi], eax
>>> ---ASM END---
>>>
>>> "lock inc" cannot be used to increase ApIndex because not only the
>>> global ApIndex should be increased, but also the result should be
>>> stored to a local general purpose register EBX.
>>>
>>> This patch learns from the NASM implementation of
>>> InternalSyncIncrement() to use "XADD" instruction which can increase
>>> the global ApIndex and store the original ApIndex to EBX in one
>>> instruction.
>>>
>>> With this patch, OVMF when running in a 255 threads QEMU spends about
>>> one second to wakeup all APs. Original implementation needs more than
>>> 10 seconds.
>>>
>>> Signed-off-by: Ray Ni <ray.ni@intel.com>
>>> Cc: Eric Dong <eric.dong@intel.com>
>>> Cc: Laszlo Ersek <lersek@redhat.com>
>>> Cc: Rahul Kumar <rahul1.kumar@intel.com>
>>> ---
>>> .../Library/MpInitLib/Ia32/MpFuncs.nasm | 20 ++++++-------------
>>> UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 18 ++++++-----------
>>> 2 files changed, 12 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> index 7e81d24aa6..2eaddc93bc 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> @@ -1,5 +1,5 @@
>>> ;------------------------------------------------------------------------------ ;
>>>
>>> -; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
>>>
>>> +; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
>>>
>>> ; SPDX-License-Identifier: BSD-2-Clause-Patent
>>>
>>> ;
>>>
>>> ; Module Name:
>>>
>>> @@ -125,19 +125,11 @@ SkipEnableExecuteDisable:
>>> add edi, LockLocation
>>>
>>> mov eax, NotVacantFlag
>>>
>>>
>>>
>>> -TestLock:
>>>
>>> - xchg [edi], eax
>>>
>>> - cmp eax, NotVacantFlag
>>>
>>> - jz TestLock
>>>
>>> -
>>>
>>> - mov ecx, esi
>>>
>>> - add ecx, ApIndexLocation
>>>
>>> - inc dword [ecx]
>>>
>>> - mov ebx, [ecx]
>>>
>>> -
>>>
>>> -Releaselock:
>>>
>>> - mov eax, VacantFlag
>>>
>>> - xchg [edi], eax
>>>
>>> + mov edi, esi
>>>
>>> + add edi, ApIndexLocation
>>>
>>> + mov ebx, 1
>>>
>>> + lock xadd dword [edi], ebx ; EBX = ApIndex++
>>>
>>> + inc ebx ; EBX is CpuNumber
>>>
>>>
>>>
>>> mov edi, esi
>>>
>>> add edi, StackSizeLocation
>>>
>>> diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
>> b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
>>> index aecfd07bc0..5b588f2dcb 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
>>> +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
>>> @@ -1,5 +1,5 @@
>>> ;------------------------------------------------------------------------------ ;
>>>
>>> -; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.<BR>
>>>
>>> +; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.<BR>
>>>
>>> ; SPDX-License-Identifier: BSD-2-Clause-Patent
>>>
>>> ;
>>>
>>> ; Module Name:
>>>
>>> @@ -161,18 +161,12 @@ LongModeStart:
>>> add edi, LockLocation
>>>
>>> mov rax, NotVacantFlag
>>>
>>>
>>>
>>> -TestLock:
>>>
>>> - xchg qword [edi], rax
>>>
>>> - cmp rax, NotVacantFlag
>>>
>>> - jz TestLock
>>>
>>> -
>>>
>>> - lea ecx, [esi + ApIndexLocation]
>>>
>>> - inc dword [ecx]
>>>
>>> - mov ebx, [ecx]
>>>
>>> + mov edi, esi
>>>
>>> + add edi, ApIndexLocation
>>>
>>> + mov ebx, 1
>>>
>>> + lock xadd dword [edi], ebx ; EBX = ApIndex++
>>>
>>> + inc ebx ; EBX is CpuNumber
>>>
>>>
>>>
>>> -Releaselock:
>>>
>>> - mov rax, VacantFlag
>>>
>>> - xchg qword [edi], rax
>>>
>>> ; program stack
>>>
>>> mov edi, esi
>>>
>>> add edi, StackSizeLocation
>>>
>>> --
>>> 2.27.0.windows.1
>>>
>>>
>>>
>>> -=-=-=-=-=-=
>>> Groups.io Links: You receive all messages sent to this group.
>>> View/Reply Online (#71517):
>> https://edk2.groups.io/g/devel/message/71517
>>> Mute This Topic: https://groups.io/mt/80504936/1643496
>>> Group Owner: devel+owner@edk2.groups.io
>>> Unsubscribe: https://edk2.groups.io/g/devel/unsub
>> [michael.d.kinney@intel.com]
>>> -=-=-=-=-=-=
>>>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [edk2-devel] [PATCH v3 0/4] Use XADD to avoid lock acquire/release
2021-02-09 14:16 [PATCH v3 0/4] Use XADD to avoid lock acquire/release Ni, Ray
` (4 preceding siblings ...)
[not found] ` <166219FF4C25D9C5.16853@groups.io>
@ 2021-02-25 19:03 ` Laszlo Ersek
5 siblings, 0 replies; 15+ messages in thread
From: Laszlo Ersek @ 2021-02-25 19:03 UTC (permalink / raw)
To: devel, ray.ni
On 02/09/21 15:16, Ni, Ray wrote:
> Patch #1 follows Mike's suggestion to use XADD to avoid lock acquire/release.
> Patch #2 follows Laszlo's suggestion to add global NASM macros for NASM struc usage.
> Patch #3 simply remves all hardcode offset in NASM without changing any logic.
> Patch #4 removes the dead code.
>
> The final code is the same as that of V2.
Given that I was OK with v2:
series
Acked-by: Laszlo Ersek <lersek@redhat.com>
Thanks
Laszlo
> Ray Ni (4):
> UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
> MdePkg/Nasm.inc: add macros for C types used in structure definition
> UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offset
> UefiCpuPkg/MpInitLib: Remove unused Lock from MP_CPU_EXCHANGE_INFO
>
> MdePkg/Include/Ia32/Nasm.inc | 38 ++++++
> MdePkg/Include/X64/Nasm.inc | 38 ++++++
> UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf | 5 +-
> UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc | 43 -------
> .../Library/MpInitLib/Ia32/MpFuncs.nasm | 98 +++++++---------
> UefiCpuPkg/Library/MpInitLib/MpEqu.inc | 99 ++++++++++++++++
> UefiCpuPkg/Library/MpInitLib/MpLib.c | 1 -
> UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 +-
> UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf | 5 +-
> UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc | 45 --------
> UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 108 ++++++++----------
> 11 files changed, 272 insertions(+), 211 deletions(-)
> delete mode 100644 UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
> create mode 100644 UefiCpuPkg/Library/MpInitLib/MpEqu.inc
> delete mode 100644 UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [edk2-devel] [PATCH v3 0/4] Use XADD to avoid lock acquire/release
[not found] <166219FED0821D4E.31350@groups.io>
@ 2021-02-27 11:29 ` Ni, Ray
0 siblings, 0 replies; 15+ messages in thread
From: Ni, Ray @ 2021-02-27 11:29 UTC (permalink / raw)
To: gaoliming@byosoft.com.cn; +Cc: devel@edk2.groups.io, Ni, Ray
Liming,
I pushed the first patch as it's a bug fix.
For the remaining 3 patches, I am ok to push them after the freeze time.
> -----Original Message-----
> From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Ni, Ray
> Sent: Tuesday, February 9, 2021 10:17 PM
> To: devel@edk2.groups.io
> Subject: [edk2-devel] [PATCH v3 0/4] Use XADD to avoid lock acquire/release
>
> Patch #1 follows Mike's suggestion to use XADD to avoid lock acquire/release.
> Patch #2 follows Laszlo's suggestion to add global NASM macros for NASM struc usage.
> Patch #3 simply remves all hardcode offset in NASM without changing any logic.
> Patch #4 removes the dead code.
>
> The final code is the same as that of V2.
>
> Ray Ni (4):
> UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
> MdePkg/Nasm.inc: add macros for C types used in structure definition
> UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offset
> UefiCpuPkg/MpInitLib: Remove unused Lock from MP_CPU_EXCHANGE_INFO
>
> MdePkg/Include/Ia32/Nasm.inc | 38 ++++++
> MdePkg/Include/X64/Nasm.inc | 38 ++++++
> UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf | 5 +-
> UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc | 43 -------
> .../Library/MpInitLib/Ia32/MpFuncs.nasm | 98 +++++++---------
> UefiCpuPkg/Library/MpInitLib/MpEqu.inc | 99 ++++++++++++++++
> UefiCpuPkg/Library/MpInitLib/MpLib.c | 1 -
> UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 +-
> UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf | 5 +-
> UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc | 45 --------
> UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 108 ++++++++----------
> 11 files changed, 272 insertions(+), 211 deletions(-)
> delete mode 100644 UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
> create mode 100644 UefiCpuPkg/Library/MpInitLib/MpEqu.inc
> delete mode 100644 UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
>
> --
> 2.27.0.windows.1
>
>
>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2021-02-27 11:29 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-02-09 14:16 [PATCH v3 0/4] Use XADD to avoid lock acquire/release Ni, Ray
2021-02-09 14:16 ` [PATCH v3 1/4] UefiCpuPkg/MpInitLib: " Ni, Ray
2021-02-22 9:06 ` Dong, Eric
2021-02-23 18:11 ` [edk2-devel] " Michael D Kinney
2021-02-25 4:04 ` Ni, Ray
2021-02-25 19:02 ` Laszlo Ersek
2021-02-09 14:16 ` [PATCH v3 2/4] MdePkg/Nasm.inc: add macros for C types used in structure definition Ni, Ray
2021-02-18 3:24 ` 回复: " gaoliming
2021-02-09 14:16 ` [PATCH v3 3/4] UefiCpuPkg/MpInitLib: Use NASM struc to avoid hardcode offset Ni, Ray
2021-02-22 9:06 ` Dong, Eric
2021-02-09 14:16 ` [PATCH v3 4/4] UefiCpuPkg/MpInitLib: Remove unused Lock from MP_CPU_EXCHANGE_INFO Ni, Ray
2021-02-22 9:07 ` Dong, Eric
[not found] ` <166219FF4C25D9C5.16853@groups.io>
2021-02-23 2:22 ` [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release Ni, Ray
2021-02-25 19:03 ` [edk2-devel] [PATCH v3 0/4] " Laszlo Ersek
[not found] <166219FED0821D4E.31350@groups.io>
2021-02-27 11:29 ` Ni, Ray
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox