* [PATCH 0/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB @ 2016-11-23 14:01 Jeff Fan 2016-11-23 14:01 ` [PATCH 1/3] UefiCpuPkg/DxeMpLib: Get safe AP loop handler from global variable Jeff Fan ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Jeff Fan @ 2016-11-23 14:01 UTC (permalink / raw) To: edk2-devel; +Cc: Laszlo Ersek, Feng Tian, Michael D Kinney Allocate safe AP stack under 4GB and make sure BSP wait till all APs running in safe code. Cc: Laszlo Ersek <lersek@redhat.com> Cc: Feng Tian <feng.tian@intel.com> Cc: Michael D Kinney <michael.d.kinney@intel.com> Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Jeff Fan <jeff.fan@intel.com> Jeff Fan (3): UefiCpuPkg/DxeMpLib: Get safe AP loop handler from global variable UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB UefiCpuPkg/DxeMpLib: Make sure APs in safe loop code UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 29 ++++++++++++++++++++++---- UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 11 ++++++++-- UefiCpuPkg/Library/MpInitLib/MpLib.h | 4 +++- UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 5 +++++ 4 files changed, 42 insertions(+), 7 deletions(-) -- 2.9.3.windows.2 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] UefiCpuPkg/DxeMpLib: Get safe AP loop handler from global variable 2016-11-23 14:01 [PATCH 0/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB Jeff Fan @ 2016-11-23 14:01 ` Jeff Fan 2016-11-23 17:08 ` Laszlo Ersek 2016-11-23 14:01 ` [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB Jeff Fan 2016-11-23 14:01 ` [PATCH 3/3] UefiCpuPkg/DxeMpLib: Make sure APs in safe loop code Jeff Fan 2 siblings, 1 reply; 12+ messages in thread From: Jeff Fan @ 2016-11-23 14:01 UTC (permalink / raw) To: edk2-devel; +Cc: Laszlo Ersek, Feng Tian, Michael D Kinney Cc: Laszlo Ersek <lersek@redhat.com> Cc: Feng Tian <feng.tian@intel.com> Cc: Michael D Kinney <michael.d.kinney@intel.com> Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Jeff Fan <jeff.fan@intel.com> --- UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c index 7f3900b..a0d5eeb 100644 --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c @@ -244,7 +244,7 @@ RelocateApLoop ( CpuMpData = GetCpuMpData (); MwaitSupport = IsMwaitSupport (); - AsmRelocateApLoopFunc = (ASM_RELOCATE_AP_LOOP) (UINTN) Buffer; + AsmRelocateApLoopFunc = (ASM_RELOCATE_AP_LOOP) (UINTN) mReservedApLoopFunc; AsmRelocateApLoopFunc (MwaitSupport, CpuMpData->ApTargetCState, CpuMpData->PmCodeSegment); // // It should never reach here @@ -273,7 +273,7 @@ MpInitChangeApLoopCallback ( CpuMpData->SaveRestoreFlag = TRUE; CpuMpData->PmCodeSegment = GetProtectedModeCS (); CpuMpData->ApLoopMode = PcdGet8 (PcdCpuApLoopMode); - WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, mReservedApLoopFunc); + WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL); DEBUG ((DEBUG_INFO, "%a() done!\n", __FUNCTION__)); } -- 2.9.3.windows.2 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] UefiCpuPkg/DxeMpLib: Get safe AP loop handler from global variable 2016-11-23 14:01 ` [PATCH 1/3] UefiCpuPkg/DxeMpLib: Get safe AP loop handler from global variable Jeff Fan @ 2016-11-23 17:08 ` Laszlo Ersek 0 siblings, 0 replies; 12+ messages in thread From: Laszlo Ersek @ 2016-11-23 17:08 UTC (permalink / raw) To: Jeff Fan, edk2-devel; +Cc: Feng Tian, Michael D Kinney On 11/23/16 15:01, Jeff Fan wrote: > Cc: Laszlo Ersek <lersek@redhat.com> > Cc: Feng Tian <feng.tian@intel.com> > Cc: Michael D Kinney <michael.d.kinney@intel.com> > Contributed-under: TianoCore Contribution Agreement 1.0 > Signed-off-by: Jeff Fan <jeff.fan@intel.com> > --- > UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > index 7f3900b..a0d5eeb 100644 > --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > @@ -244,7 +244,7 @@ RelocateApLoop ( > > CpuMpData = GetCpuMpData (); > MwaitSupport = IsMwaitSupport (); > - AsmRelocateApLoopFunc = (ASM_RELOCATE_AP_LOOP) (UINTN) Buffer; > + AsmRelocateApLoopFunc = (ASM_RELOCATE_AP_LOOP) (UINTN) mReservedApLoopFunc; > AsmRelocateApLoopFunc (MwaitSupport, CpuMpData->ApTargetCState, CpuMpData->PmCodeSegment); > // > // It should never reach here > @@ -273,7 +273,7 @@ MpInitChangeApLoopCallback ( > CpuMpData->SaveRestoreFlag = TRUE; > CpuMpData->PmCodeSegment = GetProtectedModeCS (); > CpuMpData->ApLoopMode = PcdGet8 (PcdCpuApLoopMode); > - WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, mReservedApLoopFunc); > + WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL); > DEBUG ((DEBUG_INFO, "%a() done!\n", __FUNCTION__)); > } > > This patch looks okay (there are no other references to the RelocateApLoop() function), but can you please modify the commit message to explain why this change is a good thing? Thanks Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB 2016-11-23 14:01 [PATCH 0/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB Jeff Fan 2016-11-23 14:01 ` [PATCH 1/3] UefiCpuPkg/DxeMpLib: Get safe AP loop handler from global variable Jeff Fan @ 2016-11-23 14:01 ` Jeff Fan 2016-11-23 18:31 ` Laszlo Ersek 2016-11-23 14:01 ` [PATCH 3/3] UefiCpuPkg/DxeMpLib: Make sure APs in safe loop code Jeff Fan 2 siblings, 1 reply; 12+ messages in thread From: Jeff Fan @ 2016-11-23 14:01 UTC (permalink / raw) To: edk2-devel; +Cc: Laszlo Ersek, Feng Tian, Michael D Kinney For long mode DXE, we will disable paging on AP to protected mode to execute AP safe loop code in ACPI NVS range under 4GB. But we forget to allocate stack for AP under 4GB and AP still are using original AP stack. If original AP stack is larger than 4GB, it cannot be used after AP is transferred to protected mode. Moreover, even though AP stack is always under 4GB on protected mode DXE, AP stack(in BootServiceData) maybe crashed by OS after Exit Boot Service event. This fix is to allocate ACPI NVS range under 4GB together with AP safe loop code. APs will switch to new stack at beginning of safe loop code. Cc: Laszlo Ersek <lersek@redhat.com> Cc: Feng Tian <feng.tian@intel.com> Cc: Michael D Kinney <michael.d.kinney@intel.com> Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Jeff Fan <jeff.fan@intel.com> --- UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 19 +++++++++++++++++-- UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 9 +++++++-- UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 ++- UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 3 +++ 4 files changed, 29 insertions(+), 5 deletions(-) diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c index a0d5eeb..d0f9f7e 100644 --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c @@ -18,6 +18,7 @@ #include <Library/UefiBootServicesTableLib.h> #define AP_CHECK_INTERVAL (EFI_TIMER_PERIOD_MILLISECONDS (100)) +#define AP_SAFT_STACK_SIZE 128 CPU_MP_DATA *mCpuMpData = NULL; EFI_EVENT mCheckAllApsEvent = NULL; @@ -25,6 +26,7 @@ EFI_EVENT mMpInitExitBootServicesEvent = NULL; EFI_EVENT mLegacyBootEvent = NULL; volatile BOOLEAN mStopCheckAllApsStatus = TRUE; VOID *mReservedApLoopFunc = NULL; +UINTN mReservedTopOfApStack; /** Get the pointer to CPU MP Data structure. @@ -241,11 +243,18 @@ RelocateApLoop ( CPU_MP_DATA *CpuMpData; BOOLEAN MwaitSupport; ASM_RELOCATE_AP_LOOP AsmRelocateApLoopFunc; + UINTN ProcessorNumber; + MpInitLibWhoAmI (&ProcessorNumber); CpuMpData = GetCpuMpData (); MwaitSupport = IsMwaitSupport (); AsmRelocateApLoopFunc = (ASM_RELOCATE_AP_LOOP) (UINTN) mReservedApLoopFunc; - AsmRelocateApLoopFunc (MwaitSupport, CpuMpData->ApTargetCState, CpuMpData->PmCodeSegment); + AsmRelocateApLoopFunc ( + MwaitSupport, + CpuMpData->ApTargetCState, + CpuMpData->PmCodeSegment, + mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE + ); // // It should never reach here // @@ -289,6 +298,7 @@ InitMpGlobalData ( { EFI_STATUS Status; EFI_PHYSICAL_ADDRESS Address; + UINTN ApSafeBufferSize; SaveCpuMpData (CpuMpData); @@ -307,16 +317,21 @@ InitMpGlobalData ( // Allocating it in advance since memory services are not available in // Exit Boot Services callback function. // + ApSafeBufferSize = CpuMpData->AddressMap.RelocateApLoopFuncSize; + ApSafeBufferSize += CpuMpData->CpuCount * AP_SAFT_STACK_SIZE; + Address = BASE_4GB - 1; Status = gBS->AllocatePages ( AllocateMaxAddress, EfiReservedMemoryType, - EFI_SIZE_TO_PAGES (sizeof (CpuMpData->AddressMap.RelocateApLoopFuncSize)), + EFI_SIZE_TO_PAGES (ApSafeBufferSize), &Address ); ASSERT_EFI_ERROR (Status); mReservedApLoopFunc = (VOID *) (UINTN) Address; ASSERT (mReservedApLoopFunc != NULL); + mReservedTopOfApStack = (UINTN) Address + EFI_SIZE_TO_PAGES (ApSafeBufferSize) * SIZE_4KB; + ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0); CopyMem ( mReservedApLoopFunc, CpuMpData->AddressMap.RelocateApLoopFuncAddress, diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm index 64e51d8..4e55760 100644 --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm @@ -217,14 +217,19 @@ RendezvousFunnelProcEnd: global ASM_PFX(AsmRelocateApLoop) ASM_PFX(AsmRelocateApLoop): AsmRelocateApLoopStart: - cmp byte [esp + 4], 1 + push ebp + mov ebp, esp + mov ecx, [ebp + 8] ; MwaitSupport + mov ebx, [ebp + 12] ; ApTargetCState + mov esp, [ebp + 20] ; TopOfApStack + cmp cl, 1 ; Check mwait-monitor support jnz HltLoop MwaitLoop: mov eax, esp xor ecx, ecx xor edx, edx monitor - mov eax, [esp + 8] ; Mwait Cx, Target C-State per eax[7:4] + mov eax, ebx ; Mwait Cx, Target C-State per eax[7:4] shl eax, 4 mwait jmp MwaitLoop diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h index f73a469..e6dea18 100644 --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h @@ -250,7 +250,8 @@ VOID (EFIAPI * ASM_RELOCATE_AP_LOOP) ( IN BOOLEAN MwaitSupport, IN UINTN ApTargetCState, - IN UINTN PmCodeSegment + IN UINTN PmCodeSegment, + IN UINTN TopOfApStack ); /** diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm index aaabb50..7c8fa45 100644 --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm @@ -224,6 +224,9 @@ RendezvousFunnelProcEnd: global ASM_PFX(AsmRelocateApLoop) ASM_PFX(AsmRelocateApLoop): AsmRelocateApLoopStart: + push rbp + mov rbp, rsp + mov rsp, r9 push rcx push rdx -- 2.9.3.windows.2 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB 2016-11-23 14:01 ` [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB Jeff Fan @ 2016-11-23 18:31 ` Laszlo Ersek 2016-11-24 2:23 ` Fan, Jeff 0 siblings, 1 reply; 12+ messages in thread From: Laszlo Ersek @ 2016-11-23 18:31 UTC (permalink / raw) To: Jeff Fan, edk2-devel; +Cc: Feng Tian, Michael D Kinney On 11/23/16 15:01, Jeff Fan wrote: > For long mode DXE, we will disable paging on AP to protected mode to execute AP > safe loop code in ACPI NVS range under 4GB. But we forget to allocate stack for > AP under 4GB and AP still are using original AP stack. If original AP stack is > larger than 4GB, it cannot be used after AP is transferred to protected mode. > > Moreover, even though AP stack is always under 4GB on protected mode DXE, AP > stack(in BootServiceData) maybe crashed by OS after Exit Boot Service event. > > This fix is to allocate ACPI NVS range under 4GB together with AP safe loop > code. APs will switch to new stack at beginning of safe loop code. (1) We are actually allocating EfiReservedMemoryType -- please update the commit message. (2) For a minute I was confused about using the stack *at all* in the HLT loop. But looking farther down, and reading up on the MONITOR statement, I see that the MWAIT loops in both the Ia32 and X64 source files use the stack inherentily. So, the stack usage is unavoidable in the MwaitSupport==TRUE case. Can you mention in the commit message that the stack usage is only really necessary for MwaitSupport==TRUE, and in the HLT loop mode, it could be technically avoided? (I'm not suggesting that we implement a separate solution for the HLT loop mode, but we should be clear about the true incentive for this patch.) > Cc: Laszlo Ersek <lersek@redhat.com> > Cc: Feng Tian <feng.tian@intel.com> > Cc: Michael D Kinney <michael.d.kinney@intel.com> > Contributed-under: TianoCore Contribution Agreement 1.0 > Signed-off-by: Jeff Fan <jeff.fan@intel.com> > --- > UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 19 +++++++++++++++++-- > UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 9 +++++++-- > UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 ++- > UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 3 +++ > 4 files changed, 29 insertions(+), 5 deletions(-) > > diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > index a0d5eeb..d0f9f7e 100644 > --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > @@ -18,6 +18,7 @@ > #include <Library/UefiBootServicesTableLib.h> > > #define AP_CHECK_INTERVAL (EFI_TIMER_PERIOD_MILLISECONDS (100)) > +#define AP_SAFT_STACK_SIZE 128 (3) Is "SAFT" a typo? Did you mean "SAFE"? > > CPU_MP_DATA *mCpuMpData = NULL; > EFI_EVENT mCheckAllApsEvent = NULL; > @@ -25,6 +26,7 @@ EFI_EVENT mMpInitExitBootServicesEvent = NULL; > EFI_EVENT mLegacyBootEvent = NULL; > volatile BOOLEAN mStopCheckAllApsStatus = TRUE; > VOID *mReservedApLoopFunc = NULL; > +UINTN mReservedTopOfApStack; > > /** > Get the pointer to CPU MP Data structure. > @@ -241,11 +243,18 @@ RelocateApLoop ( > CPU_MP_DATA *CpuMpData; > BOOLEAN MwaitSupport; > ASM_RELOCATE_AP_LOOP AsmRelocateApLoopFunc; > + UINTN ProcessorNumber; > > + MpInitLibWhoAmI (&ProcessorNumber); > CpuMpData = GetCpuMpData (); > MwaitSupport = IsMwaitSupport (); > AsmRelocateApLoopFunc = (ASM_RELOCATE_AP_LOOP) (UINTN) mReservedApLoopFunc; > - AsmRelocateApLoopFunc (MwaitSupport, CpuMpData->ApTargetCState, CpuMpData->PmCodeSegment); > + AsmRelocateApLoopFunc ( > + MwaitSupport, > + CpuMpData->ApTargetCState, > + CpuMpData->PmCodeSegment, > + mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE > + ); > // > // It should never reach here > // > @@ -289,6 +298,7 @@ InitMpGlobalData ( > { > EFI_STATUS Status; > EFI_PHYSICAL_ADDRESS Address; > + UINTN ApSafeBufferSize; > > SaveCpuMpData (CpuMpData); > > @@ -307,16 +317,21 @@ InitMpGlobalData ( > // Allocating it in advance since memory services are not available in > // Exit Boot Services callback function. > // > + ApSafeBufferSize = CpuMpData->AddressMap.RelocateApLoopFuncSize; > + ApSafeBufferSize += CpuMpData->CpuCount * AP_SAFT_STACK_SIZE; > + (So, I think this could be theoretically avoided for the HLT loop case, but I'm fine if we do it for both loop styles, for simplicity.) > Address = BASE_4GB - 1; > Status = gBS->AllocatePages ( > AllocateMaxAddress, > EfiReservedMemoryType, > - EFI_SIZE_TO_PAGES (sizeof (CpuMpData->AddressMap.RelocateApLoopFuncSize)), > + EFI_SIZE_TO_PAGES (ApSafeBufferSize), > &Address > ); > ASSERT_EFI_ERROR (Status); > mReservedApLoopFunc = (VOID *) (UINTN) Address; > ASSERT (mReservedApLoopFunc != NULL); > + mReservedTopOfApStack = (UINTN) Address + EFI_SIZE_TO_PAGES (ApSafeBufferSize) * SIZE_4KB; Ah, here you are moving the stacks to the end of the allocated area, so if there's any gap left after the assembly code and the stacks, the gap is now moved into the middle. Looks okay. (4) I propose to replace the multiplication with the more idiomatic EFI_PAGES_TO_SIZE (EFI_SIZE_TO_PAGES (ApSafeBufferSize)) or else ALIGN_VALUE (ApSafeBufferSize, EFI_PAGE_SIZE) Although the current code is correct too. > + ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0); > CopyMem ( > mReservedApLoopFunc, > CpuMpData->AddressMap.RelocateApLoopFuncAddress, > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > index 64e51d8..4e55760 100644 > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > @@ -217,14 +217,19 @@ RendezvousFunnelProcEnd: > global ASM_PFX(AsmRelocateApLoop) > ASM_PFX(AsmRelocateApLoop): > AsmRelocateApLoopStart: > - cmp byte [esp + 4], 1 > + push ebp (5) Why do you save EBP on the stack? (Sorry if it is trivial, my assembly is not that great.) And, it is saved on the original AP stack. > + mov ebp, esp > + mov ecx, [ebp + 8] ; MwaitSupport > + mov ebx, [ebp + 12] ; ApTargetCState > + mov esp, [ebp + 20] ; TopOfApStack > + cmp cl, 1 ; Check mwait-monitor support > jnz HltLoop > MwaitLoop: > mov eax, esp > xor ecx, ecx > xor edx, edx > monitor > - mov eax, [esp + 8] ; Mwait Cx, Target C-State per eax[7:4] > + mov eax, ebx ; Mwait Cx, Target C-State per eax[7:4] > shl eax, 4 > mwait > jmp MwaitLoop (6) These code changes look okay to me, but are they necessary in the 32-bit case as well? The original AP stack is under 4GB too. Oh wait, I understand. Our purpose is dual here. The stack space used by MONITOR should be *both* under 4GB *and* in reserved type memory. So, the code is fine, but can you please modify the following sentence in the commit message: Moreover, even though AP stack is always under 4GB on protected mode DXE, ... to: Moreover, even though AP stack is always under 4GB (a) in Ia32 DXE and (b) with this patch, after transfering to protected mode from X64 DXE, ... It should be clear from the commit message that for Ia32, we solve problem #2 (memory type for the MONITOR area), while for X64, we solve problem #1 (address range) and problem #2 (memory type) *both*. > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h > index f73a469..e6dea18 100644 > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h > @@ -250,7 +250,8 @@ VOID > (EFIAPI * ASM_RELOCATE_AP_LOOP) ( > IN BOOLEAN MwaitSupport, > IN UINTN ApTargetCState, > - IN UINTN PmCodeSegment > + IN UINTN PmCodeSegment, > + IN UINTN TopOfApStack > ); > > /** > diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > index aaabb50..7c8fa45 100644 > --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > @@ -224,6 +224,9 @@ RendezvousFunnelProcEnd: > global ASM_PFX(AsmRelocateApLoop) > ASM_PFX(AsmRelocateApLoop): > AsmRelocateApLoopStart: > + push rbp (7) Again, not sure why we're saving RBP (on the current AP stack). > + mov rbp, rsp > + mov rsp, r9 > push rcx > push rdx > > This looks okay. I have some more questions about the preexistent code: (8) The MONITOR statement seems to set up an address *range* for monitoring with MWAIT. EAX provides the base address of the range, and we point it to our new stack. However, what determines the *size* of the address range? Obviously, it must fit in our new stack. (9) In the original (pre-patch) X64 code, I see this: MwaitLoop: mov eax, esp ; Set Monitor Address xor ecx, ecx ; ecx = 0 xor edx, edx ; edx = 0 monitor shl ebx, 4 mov eax, ebx ; Mwait Cx, Target C-State per eax[7:4] mwait jmp MwaitLoop I think this is wrong: EBX is supposed to contain ApTargetCState. In order to pass it to MWAIT, it has to be shifted left four bit positions, and moved to EAX, yes. But, *unlike* in the Ia32 case, in the X64 code you shift EBX itself, and then you move the result from EBX to EAX. (In the Ia32 case, you move EBX to EAX first, and then shift EAX). This means that on the second iteration, EBX will contain garbage, and MWAIT won't be configured correctly. EBX must be treated read-only in the loop, in my opinion. It should be fixed in a separate patch. (10) The X64 HLT loop goes like this: HltLoop: cli hlt jmp HltLoop ret Can we remove the "ret" (in a separate patch)? Thanks! Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB 2016-11-23 18:31 ` Laszlo Ersek @ 2016-11-24 2:23 ` Fan, Jeff 2016-11-24 9:20 ` Laszlo Ersek 0 siblings, 1 reply; 12+ messages in thread From: Fan, Jeff @ 2016-11-24 2:23 UTC (permalink / raw) To: Laszlo Ersek, edk2-devel@ml01.01.org; +Cc: Tian, Feng, Kinney, Michael D Laszlo, Thanks your comments. I update the feedback as below in [Jeff] Jeff -----Original Message----- From: Laszlo Ersek [mailto:lersek@redhat.com] Sent: Thursday, November 24, 2016 2:31 AM To: Fan, Jeff; edk2-devel@ml01.01.org Cc: Tian, Feng; Kinney, Michael D Subject: Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB On 11/23/16 15:01, Jeff Fan wrote: > For long mode DXE, we will disable paging on AP to protected mode to > execute AP safe loop code in ACPI NVS range under 4GB. But we forget > to allocate stack for AP under 4GB and AP still are using original AP > stack. If original AP stack is larger than 4GB, it cannot be used after AP is transferred to protected mode. > > Moreover, even though AP stack is always under 4GB on protected mode > DXE, AP stack(in BootServiceData) maybe crashed by OS after Exit Boot Service event. > > This fix is to allocate ACPI NVS range under 4GB together with AP safe > loop code. APs will switch to new stack at beginning of safe loop code. (1) We are actually allocating EfiReservedMemoryType -- please update the commit message. [Jeff] OK (2) For a minute I was confused about using the stack *at all* in the HLT loop. But looking farther down, and reading up on the MONITOR statement, I see that the MWAIT loops in both the Ia32 and X64 source files use the stack inherentily. So, the stack usage is unavoidable in the MwaitSupport==TRUE case. Can you mention in the commit message that the stack usage is only really necessary for MwaitSupport==TRUE, and in the HLT loop mode, it could be technically avoided? [Jeff] OK (I'm not suggesting that we implement a separate solution for the HLT loop mode, but we should be clear about the true incentive for this patch.) > Cc: Laszlo Ersek <lersek@redhat.com> > Cc: Feng Tian <feng.tian@intel.com> > Cc: Michael D Kinney <michael.d.kinney@intel.com> > Contributed-under: TianoCore Contribution Agreement 1.0 > Signed-off-by: Jeff Fan <jeff.fan@intel.com> > --- > UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 19 +++++++++++++++++-- > UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 9 +++++++-- > UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 ++- > UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 3 +++ > 4 files changed, 29 insertions(+), 5 deletions(-) > > diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > index a0d5eeb..d0f9f7e 100644 > --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > @@ -18,6 +18,7 @@ > #include <Library/UefiBootServicesTableLib.h> > > #define AP_CHECK_INTERVAL (EFI_TIMER_PERIOD_MILLISECONDS (100)) > +#define AP_SAFT_STACK_SIZE 128 (3) Is "SAFT" a typo? Did you mean "SAFE"? [Jeff] Yes. Will correct it. > > CPU_MP_DATA *mCpuMpData = NULL; > EFI_EVENT mCheckAllApsEvent = NULL; > @@ -25,6 +26,7 @@ EFI_EVENT mMpInitExitBootServicesEvent = NULL; > EFI_EVENT mLegacyBootEvent = NULL; > volatile BOOLEAN mStopCheckAllApsStatus = TRUE; > VOID *mReservedApLoopFunc = NULL; > +UINTN mReservedTopOfApStack; > > /** > Get the pointer to CPU MP Data structure. > @@ -241,11 +243,18 @@ RelocateApLoop ( > CPU_MP_DATA *CpuMpData; > BOOLEAN MwaitSupport; > ASM_RELOCATE_AP_LOOP AsmRelocateApLoopFunc; > + UINTN ProcessorNumber; > > + MpInitLibWhoAmI (&ProcessorNumber); > CpuMpData = GetCpuMpData (); > MwaitSupport = IsMwaitSupport (); > AsmRelocateApLoopFunc = (ASM_RELOCATE_AP_LOOP) (UINTN) > mReservedApLoopFunc; > - AsmRelocateApLoopFunc (MwaitSupport, CpuMpData->ApTargetCState, > CpuMpData->PmCodeSegment); > + AsmRelocateApLoopFunc ( > + MwaitSupport, > + CpuMpData->ApTargetCState, > + CpuMpData->PmCodeSegment, > + mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE > + ); > // > // It should never reach here > // > @@ -289,6 +298,7 @@ InitMpGlobalData ( { > EFI_STATUS Status; > EFI_PHYSICAL_ADDRESS Address; > + UINTN ApSafeBufferSize; > > SaveCpuMpData (CpuMpData); > > @@ -307,16 +317,21 @@ InitMpGlobalData ( > // Allocating it in advance since memory services are not available in > // Exit Boot Services callback function. > // > + ApSafeBufferSize = CpuMpData->AddressMap.RelocateApLoopFuncSize; > + ApSafeBufferSize += CpuMpData->CpuCount * AP_SAFT_STACK_SIZE; > + (So, I think this could be theoretically avoided for the HLT loop case, but I'm fine if we do it for both loop styles, for simplicity.) > Address = BASE_4GB - 1; > Status = gBS->AllocatePages ( > AllocateMaxAddress, > EfiReservedMemoryType, > - EFI_SIZE_TO_PAGES (sizeof (CpuMpData->AddressMap.RelocateApLoopFuncSize)), > + EFI_SIZE_TO_PAGES (ApSafeBufferSize), > &Address > ); > ASSERT_EFI_ERROR (Status); > mReservedApLoopFunc = (VOID *) (UINTN) Address; > ASSERT (mReservedApLoopFunc != NULL); > + mReservedTopOfApStack = (UINTN) Address + EFI_SIZE_TO_PAGES > + (ApSafeBufferSize) * SIZE_4KB; Ah, here you are moving the stacks to the end of the allocated area, so if there's any gap left after the assembly code and the stacks, the gap is now moved into the middle. Looks okay. (4) I propose to replace the multiplication with the more idiomatic EFI_PAGES_TO_SIZE (EFI_SIZE_TO_PAGES (ApSafeBufferSize)) [Jeff] Will correct it. or else ALIGN_VALUE (ApSafeBufferSize, EFI_PAGE_SIZE) Although the current code is correct too. > + ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) > + == 0); > CopyMem ( > mReservedApLoopFunc, > CpuMpData->AddressMap.RelocateApLoopFuncAddress, > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > index 64e51d8..4e55760 100644 > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > @@ -217,14 +217,19 @@ RendezvousFunnelProcEnd: > global ASM_PFX(AsmRelocateApLoop) > ASM_PFX(AsmRelocateApLoop): > AsmRelocateApLoopStart: > - cmp byte [esp + 4], 1 > + push ebp (5) Why do you save EBP on the stack? (Sorry if it is trivial, my assembly is not that great.) And, it is saved on the original AP stack. [Jeff] Good point. It could be saved in new stack and used for stack trace. I will fix it. > + mov ebp, esp > + mov ecx, [ebp + 8] ; MwaitSupport > + mov ebx, [ebp + 12] ; ApTargetCState > + mov esp, [ebp + 20] ; TopOfApStack > + cmp cl, 1 ; Check mwait-monitor support > jnz HltLoop > MwaitLoop: > mov eax, esp > xor ecx, ecx > xor edx, edx > monitor > - mov eax, [esp + 8] ; Mwait Cx, Target C-State per eax[7:4] > + mov eax, ebx ; Mwait Cx, Target C-State per eax[7:4] > shl eax, 4 > mwait > jmp MwaitLoop (6) These code changes look okay to me, but are they necessary in the 32-bit case as well? The original AP stack is under 4GB too. Oh wait, I understand. Our purpose is dual here. The stack space used by MONITOR should be *both* under 4GB *and* in reserved type memory. So, the code is fine, but can you please modify the following sentence in the commit message: Moreover, even though AP stack is always under 4GB on protected mode DXE, ... to: Moreover, even though AP stack is always under 4GB (a) in Ia32 DXE and (b) with this patch, after transfering to protected mode from X64 DXE, ... It should be clear from the commit message that for Ia32, we solve problem #2 (memory type for the MONITOR area), while for X64, we solve problem #1 (address range) and problem #2 (memory type) *both*. [Jeff] OK. > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h > b/UefiCpuPkg/Library/MpInitLib/MpLib.h > index f73a469..e6dea18 100644 > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h > @@ -250,7 +250,8 @@ VOID > (EFIAPI * ASM_RELOCATE_AP_LOOP) ( > IN BOOLEAN MwaitSupport, > IN UINTN ApTargetCState, > - IN UINTN PmCodeSegment > + IN UINTN PmCodeSegment, > + IN UINTN TopOfApStack > ); > > /** > diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > index aaabb50..7c8fa45 100644 > --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > @@ -224,6 +224,9 @@ RendezvousFunnelProcEnd: > global ASM_PFX(AsmRelocateApLoop) > ASM_PFX(AsmRelocateApLoop): > AsmRelocateApLoopStart: > + push rbp (7) Again, not sure why we're saving RBP (on the current AP stack). [Jeff] For x64, saving RBP is not necessary. It cannot used to do stack trace. I will remove it. > + mov rbp, rsp > + mov rsp, r9 > push rcx > push rdx > > This looks okay. I have some more questions about the preexistent code: (8) The MONITOR statement seems to set up an address *range* for monitoring with MWAIT. EAX provides the base address of the range, and we point it to our new stack. However, what determines the *size* of the address range? Obviously, it must fit in our new stack. [Jeff] Yes. Monitor has size requirement to avoid fault wakeup. But this case, fault wakeup has no any real impact. It has rare case to write the memory. Even though fault wakeup happens, safe mwait-loop could make sure AP enter into C-state again. (9) In the original (pre-patch) X64 code, I see this: MwaitLoop: mov eax, esp ; Set Monitor Address xor ecx, ecx ; ecx = 0 xor edx, edx ; edx = 0 monitor shl ebx, 4 mov eax, ebx ; Mwait Cx, Target C-State per eax[7:4] mwait jmp MwaitLoop I think this is wrong: EBX is supposed to contain ApTargetCState. In order to pass it to MWAIT, it has to be shifted left four bit positions, and moved to EAX, yes. But, *unlike* in the Ia32 case, in the X64 code you shift EBX itself, and then you move the result from EBX to EAX. (In the Ia32 case, you move EBX to EAX first, and then shift EAX). This means that on the second iteration, EBX will contain garbage, and MWAIT won't be configured correctly. EBX must be treated read-only in the loop, in my opinion. It should be fixed in a separate patch. [Jeff] Good catch. Will fix it. (10) The X64 HLT loop goes like this: HltLoop: cli hlt jmp HltLoop ret Can we remove the "ret" (in a separate patch)? [Jeff] OK Thanks! Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB 2016-11-24 2:23 ` Fan, Jeff @ 2016-11-24 9:20 ` Laszlo Ersek 2016-11-24 13:48 ` Fan, Jeff 0 siblings, 1 reply; 12+ messages in thread From: Laszlo Ersek @ 2016-11-24 9:20 UTC (permalink / raw) To: Fan, Jeff, edk2-devel@ml01.01.org; +Cc: Tian, Feng, Kinney, Michael D On 11/24/16 03:23, Fan, Jeff wrote: > -----Original Message----- > From: Laszlo Ersek [mailto:lersek@redhat.com] > Sent: Thursday, November 24, 2016 2:31 AM > To: Fan, Jeff; edk2-devel@ml01.01.org > Cc: Tian, Feng; Kinney, Michael D > Subject: Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB > I have some more questions about the preexistent code: > > (8) The MONITOR statement seems to set up an address *range* for > monitoring with MWAIT. EAX provides the base address of the range, > and we point it to our new stack. However, what determines the *size* > of the address range? Obviously, it must fit in our new stack. > [Jeff] Yes. Monitor has size requirement to avoid fault wakeup. But > this case, fault wakeup has no any real impact. It has rare case to > write the memory. Even though fault wakeup happens, safe mwait-loop > could make sure AP enter into C-state again. I'm sorry, I think my question was not clear enough. I'm actually interested in three things here. * The first question is just for my general education, because I'm not familiar with the MONITOR configuration. The question is independent of the code, it is just for me to learn about MONITOR. So, how does the programmer configure the size of the monitored area? The base address is taken from EAX. Where does the size come from? * My second note is that whatever size we configure for the monitor, it should fit into our new stack. I see that you agree (although spurious wakeups don't matter), so this part is done then. * And third, I thought of something else last night. Please consider the Ia32 case: > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > index 64e51d8..4e55760 100644 > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > @@ -217,14 +217,19 @@ RendezvousFunnelProcEnd: > global ASM_PFX(AsmRelocateApLoop) > ASM_PFX(AsmRelocateApLoop): > AsmRelocateApLoopStart: > - cmp byte [esp + 4], 1 > + push ebp > + mov ebp, esp > + mov ecx, [ebp + 8] ; MwaitSupport > + mov ebx, [ebp + 12] ; ApTargetCState > + mov esp, [ebp + 20] ; TopOfApStack We set ESP precisely to TopOfApStack here. > + cmp cl, 1 ; Check mwait-monitor support > jnz HltLoop > MwaitLoop: > mov eax, esp Here we move the same value to EAX -- so EAX equals TopOfApStack. > xor ecx, ecx > xor edx, edx > monitor And here we point the monitor to TopOfApStack exactly. Note that nothing has been pushed. > - mov eax, [esp + 8] ; Mwait Cx, Target C-State per eax[7:4] > + mov eax, ebx ; Mwait Cx, Target C-State per eax[7:4] > shl eax, 4 > mwait > jmp MwaitLoop The stack grows down. Whenever we push something, the stack pointer is decremented first, and then the value being pushed is written to the location pointed-to by the new (decremented) stack pointer. (I verified this order of operations in the Intel SDM). "mReservedTopOfApStack" points to the byte *right past* the allocated area. In turn, for ProcessorNumber = 0, "TopOfApStack" is set exactly to "mReservedTopOfApStack". This is valid if we push something, because then the pointer is decremented first, and the value will be written into the allocated area. However, with MONITOR, the area to watch extends *upward* from EAX. This means that for ProcessorNumber = 0, we will be monitoring an area that is strictly outside of the allocated range, regardless of the size of the monitored range. And, for ProcessorNumber (N+1), the monitored area will actually be located in the stack slice of ProcessorNumber (N). So, my point is, we should determine the size of the monitored area exactly, and push that much empty space (== decrement ESP manually) before invoking MONITOR. ... Hm, okay, here's what the Intel SDM tells us about MONITOR: "the address range that the monitoring hardware checks for store operations can be determined by using CPUID". We should consider this address range in both the assembly code, and in the original allocation (so replace AP_SAFE_STACK_SIZE with a dynamic value, or at least make sure that AP_SAFT_STACK_SIZE is larger than any imaginable monitor area size). The same issue could apply to the X64 code, but that's more complex so I didn't check it. Thank you, Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB 2016-11-24 9:20 ` Laszlo Ersek @ 2016-11-24 13:48 ` Fan, Jeff 2016-11-24 21:13 ` Laszlo Ersek 0 siblings, 1 reply; 12+ messages in thread From: Fan, Jeff @ 2016-11-24 13:48 UTC (permalink / raw) To: Laszlo Ersek, edk2-devel@ml01.01.org; +Cc: Tian, Feng, Kinney, Michael D Laszlo, I added my feedback for your three questions. -----Original Message----- From: Laszlo Ersek [mailto:lersek@redhat.com] Sent: Thursday, November 24, 2016 5:21 PM To: Fan, Jeff; edk2-devel@ml01.01.org Cc: Tian, Feng; Kinney, Michael D Subject: Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB On 11/24/16 03:23, Fan, Jeff wrote: > -----Original Message----- > From: Laszlo Ersek [mailto:lersek@redhat.com] > Sent: Thursday, November 24, 2016 2:31 AM > To: Fan, Jeff; edk2-devel@ml01.01.org > Cc: Tian, Feng; Kinney, Michael D > Subject: Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack > < 4GB > I have some more questions about the preexistent code: > > (8) The MONITOR statement seems to set up an address *range* for > monitoring with MWAIT. EAX provides the base address of the range, and > we point it to our new stack. However, what determines the *size* of > the address range? Obviously, it must fit in our new stack. > [Jeff] Yes. Monitor has size requirement to avoid fault wakeup. But > this case, fault wakeup has no any real impact. It has rare case to > write the memory. Even though fault wakeup happens, safe mwait-loop > could make sure AP enter into C-state again. I'm sorry, I think my question was not clear enough. I'm actually interested in three things here. * The first question is just for my general education, because I'm not familiar with the MONITOR configuration. The question is independent of the code, it is just for me to learn about MONITOR. So, how does the programmer configure the size of the monitored area? The base address is taken from EAX. Where does the size come from? [Jeff] Monitor size is from CPUID. I observed it equal to the cache line size. But we needn't to set size of Monitor and only set base address of Monitor buffer when executing MONITOR instruction. * My second note is that whatever size we configure for the monitor, it should fit into our new stack. I see that you agree (although spurious wakeups don't matter), so this part is done then. [Jeff] Yes. It should be in new stack. * And third, I thought of something else last night. Please consider the Ia32 case: > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > index 64e51d8..4e55760 100644 > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > @@ -217,14 +217,19 @@ RendezvousFunnelProcEnd: > global ASM_PFX(AsmRelocateApLoop) > ASM_PFX(AsmRelocateApLoop): > AsmRelocateApLoopStart: > - cmp byte [esp + 4], 1 > + push ebp > + mov ebp, esp > + mov ecx, [ebp + 8] ; MwaitSupport > + mov ebx, [ebp + 12] ; ApTargetCState > + mov esp, [ebp + 20] ; TopOfApStack We set ESP precisely to TopOfApStack here. > + cmp cl, 1 ; Check mwait-monitor support > jnz HltLoop > MwaitLoop: > mov eax, esp Here we move the same value to EAX -- so EAX equals TopOfApStack. > xor ecx, ecx > xor edx, edx > monitor And here we point the monitor to TopOfApStack exactly. Note that nothing has been pushed. > - mov eax, [esp + 8] ; Mwait Cx, Target C-State per eax[7:4] > + mov eax, ebx ; Mwait Cx, Target C-State per eax[7:4] > shl eax, 4 > mwait > jmp MwaitLoop The stack grows down. Whenever we push something, the stack pointer is decremented first, and then the value being pushed is written to the location pointed-to by the new (decremented) stack pointer. (I verified this order of operations in the Intel SDM). "mReservedTopOfApStack" points to the byte *right past* the allocated area. In turn, for ProcessorNumber = 0, "TopOfApStack" is set exactly to "mReservedTopOfApStack". This is valid if we push something, because then the pointer is decremented first, and the value will be written into the allocated area. However, with MONITOR, the area to watch extends *upward* from EAX. This means that for ProcessorNumber = 0, we will be monitoring an area that is strictly outside of the allocated range, regardless of the size of the monitored range. And, for ProcessorNumber (N+1), the monitored area will actually be located in the stack slice of ProcessorNumber (N). So, my point is, we should determine the size of the monitored area exactly, and push that much empty space (== decrement ESP manually) before invoking MONITOR. ... Hm, okay, here's what the Intel SDM tells us about MONITOR: "the address range that the monitoring hardware checks for store operations can be determined by using CPUID". We should consider this address range in both the assembly code, and in the original allocation (so replace AP_SAFE_STACK_SIZE with a dynamic value, or at least make sure that AP_SAFT_STACK_SIZE is larger than any imaginable monitor area size). The same issue could apply to the X64 code, but that's more complex so I didn't check it. [Jeff] On normal case, we DO should check monitor size and make sure we write the correct monitor buffer to wakeup Aps. But for this case, we do not want to wake up Aps at all. So, we just have one mwait deadloop here. Once we found AP is wake up by any reason (SMI, Interrupt, or monitor buffer writing), we will go on executing mwait and place Aps into C-State again. As long as this mwait loop does not touch any unsafe code/stack, it should be OK. This could simply the implementation. Thank you, Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB 2016-11-24 13:48 ` Fan, Jeff @ 2016-11-24 21:13 ` Laszlo Ersek 0 siblings, 0 replies; 12+ messages in thread From: Laszlo Ersek @ 2016-11-24 21:13 UTC (permalink / raw) To: Fan, Jeff, edk2-devel@ml01.01.org; +Cc: Tian, Feng, Kinney, Michael D On 11/24/16 14:48, Fan, Jeff wrote: > Laszlo, > > I added my feedback for your three questions. > > -----Original Message----- > From: Laszlo Ersek [mailto:lersek@redhat.com] > Sent: Thursday, November 24, 2016 5:21 PM > To: Fan, Jeff; edk2-devel@ml01.01.org > Cc: Tian, Feng; Kinney, Michael D > Subject: Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB > > On 11/24/16 03:23, Fan, Jeff wrote: >> -----Original Message----- >> From: Laszlo Ersek [mailto:lersek@redhat.com] >> Sent: Thursday, November 24, 2016 2:31 AM >> To: Fan, Jeff; edk2-devel@ml01.01.org >> Cc: Tian, Feng; Kinney, Michael D >> Subject: Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack >> < 4GB > >> I have some more questions about the preexistent code: >> >> (8) The MONITOR statement seems to set up an address *range* for >> monitoring with MWAIT. EAX provides the base address of the range, and >> we point it to our new stack. However, what determines the *size* of >> the address range? Obviously, it must fit in our new stack. > >> [Jeff] Yes. Monitor has size requirement to avoid fault wakeup. But >> this case, fault wakeup has no any real impact. It has rare case to >> write the memory. Even though fault wakeup happens, safe mwait-loop >> could make sure AP enter into C-state again. > > I'm sorry, I think my question was not clear enough. I'm actually interested in three things here. > > * The first question is just for my general education, because I'm not familiar with the MONITOR configuration. The question is independent of the code, it is just for me to learn about MONITOR. > > So, how does the programmer configure the size of the monitored area? > The base address is taken from EAX. Where does the size come from? > [Jeff] Monitor size is from CPUID. I observed it equal to the cache line size. But we needn't to set size of Monitor and only set base address of Monitor buffer when executing MONITOR instruction. > > * My second note is that whatever size we configure for the monitor, it should fit into our new stack. I see that you agree (although spurious wakeups don't matter), so this part is done then. > [Jeff] Yes. It should be in new stack. > > * And third, I thought of something else last night. Please consider the > Ia32 case: > >> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm >> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm >> index 64e51d8..4e55760 100644 >> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm >> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm >> @@ -217,14 +217,19 @@ RendezvousFunnelProcEnd: >> global ASM_PFX(AsmRelocateApLoop) >> ASM_PFX(AsmRelocateApLoop): >> AsmRelocateApLoopStart: >> - cmp byte [esp + 4], 1 >> + push ebp >> + mov ebp, esp >> + mov ecx, [ebp + 8] ; MwaitSupport >> + mov ebx, [ebp + 12] ; ApTargetCState >> + mov esp, [ebp + 20] ; TopOfApStack > > We set ESP precisely to TopOfApStack here. > >> + cmp cl, 1 ; Check mwait-monitor support >> jnz HltLoop >> MwaitLoop: >> mov eax, esp > > Here we move the same value to EAX -- so EAX equals TopOfApStack. > >> xor ecx, ecx >> xor edx, edx >> monitor > > And here we point the monitor to TopOfApStack exactly. Note that nothing has been pushed. > >> - mov eax, [esp + 8] ; Mwait Cx, Target C-State per eax[7:4] >> + mov eax, ebx ; Mwait Cx, Target C-State per eax[7:4] >> shl eax, 4 >> mwait >> jmp MwaitLoop > > The stack grows down. Whenever we push something, the stack pointer is decremented first, and then the value being pushed is written to the location pointed-to by the new (decremented) stack pointer. (I verified this order of operations in the Intel SDM). > > "mReservedTopOfApStack" points to the byte *right past* the allocated area. In turn, for ProcessorNumber = 0, "TopOfApStack" is set exactly to "mReservedTopOfApStack". > > This is valid if we push something, because then the pointer is decremented first, and the value will be written into the allocated area. > > However, with MONITOR, the area to watch extends *upward* from EAX. This means that for ProcessorNumber = 0, we will be monitoring an area that is strictly outside of the allocated range, regardless of the size of the monitored range. And, for ProcessorNumber (N+1), the monitored area will actually be located in the stack slice of ProcessorNumber (N). > > So, my point is, we should determine the size of the monitored area exactly, and push that much empty space (== decrement ESP manually) before invoking MONITOR. > > ... Hm, okay, here's what the Intel SDM tells us about MONITOR: "the address range that the monitoring hardware checks for store operations can be determined by using CPUID". > > We should consider this address range in both the assembly code, and in the original allocation (so replace AP_SAFE_STACK_SIZE with a dynamic value, or at least make sure that AP_SAFT_STACK_SIZE is larger than any imaginable monitor area size). > > The same issue could apply to the X64 code, but that's more complex so I didn't check it. > > [Jeff] On normal case, we DO should check monitor size and make sure we write the correct monitor buffer to wakeup Aps. But for this case, we do not want to wake up Aps at all. So, we just have one mwait deadloop here. Once we found AP is wake up by any reason (SMI, Interrupt, or monitor buffer writing), we will go on executing mwait and place Aps into C-State again. > As long as this mwait loop does not touch any unsafe code/stack, it should be OK. This could simply the implementation. But, if we don't actually care where the monitor / mwait are looking (and we only want to make sure that the base address is in accessible address space), then what do we need the AP stacks for at all? As far as I recall from last night when I was looking at the code, the HLT loop doesn't seem to need a stack at all, and the MONITOR / MWAIT loops could share a single address between all of the APs. No need to allocate an array of stacks. ... I'm not trying to block this series, just to understand why the code is designed like this. Anyway, I think I have no more open questions, so if you wish, you could post v2, and I'd try to test it soon. Thanks! Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 3/3] UefiCpuPkg/DxeMpLib: Make sure APs in safe loop code 2016-11-23 14:01 [PATCH 0/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB Jeff Fan 2016-11-23 14:01 ` [PATCH 1/3] UefiCpuPkg/DxeMpLib: Get safe AP loop handler from global variable Jeff Fan 2016-11-23 14:01 ` [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB Jeff Fan @ 2016-11-23 14:01 ` Jeff Fan 2016-11-23 18:52 ` Laszlo Ersek 2 siblings, 1 reply; 12+ messages in thread From: Jeff Fan @ 2016-11-23 14:01 UTC (permalink / raw) To: edk2-devel; +Cc: Laszlo Ersek, Feng Tian, Michael D Kinney Add one semaphore to make sure BSP to wait till all APs run in AP safe loop code. Cc: Laszlo Ersek <lersek@redhat.com> Cc: Feng Tian <feng.tian@intel.com> Cc: Michael D Kinney <michael.d.kinney@intel.com> Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Jeff Fan <jeff.fan@intel.com> --- UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 8 +++++++- UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 2 ++ UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 ++- UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 2 ++ 4 files changed, 13 insertions(+), 2 deletions(-) diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c index d0f9f7e..64c0c52 100644 --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c @@ -27,6 +27,7 @@ EFI_EVENT mLegacyBootEvent = NULL; volatile BOOLEAN mStopCheckAllApsStatus = TRUE; VOID *mReservedApLoopFunc = NULL; UINTN mReservedTopOfApStack; +volatile UINT32 mNumberToFinish = 0; /** Get the pointer to CPU MP Data structure. @@ -253,7 +254,8 @@ RelocateApLoop ( MwaitSupport, CpuMpData->ApTargetCState, CpuMpData->PmCodeSegment, - mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE + mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE, + (UINTN) &mNumberToFinish ); // // It should never reach here @@ -282,8 +284,12 @@ MpInitChangeApLoopCallback ( CpuMpData->SaveRestoreFlag = TRUE; CpuMpData->PmCodeSegment = GetProtectedModeCS (); CpuMpData->ApLoopMode = PcdGet8 (PcdCpuApLoopMode); + mNumberToFinish = CpuMpData->CpuCount - 1; WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL); DEBUG ((DEBUG_INFO, "%a() done!\n", __FUNCTION__)); + while (mNumberToFinish > 0) { + CpuPause (); + } } /** diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm index 4e55760..6235748 100644 --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm @@ -222,6 +222,8 @@ AsmRelocateApLoopStart: mov ecx, [ebp + 8] ; MwaitSupport mov ebx, [ebp + 12] ; ApTargetCState mov esp, [ebp + 20] ; TopOfApStack + mov eax, [ebp + 24] ; CountTofinish + lock dec dword [eax] ; CountTofinish-- cmp cl, 1 ; Check mwait-monitor support jnz HltLoop MwaitLoop: diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h index e6dea18..49305ad 100644 --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h @@ -251,7 +251,8 @@ VOID IN BOOLEAN MwaitSupport, IN UINTN ApTargetCState, IN UINTN PmCodeSegment, - IN UINTN TopOfApStack + IN UINTN TopOfApStack, + IN UINTN NumberToFinish ); /** diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm index 7c8fa45..deaee9e 100644 --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm @@ -227,6 +227,8 @@ AsmRelocateApLoopStart: push rbp mov rbp, rsp mov rsp, r9 + mov rax, [rbp + 48] ; CountTofinish + lock dec dword [rax] ; CountTofinish-- push rcx push rdx -- 2.9.3.windows.2 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] UefiCpuPkg/DxeMpLib: Make sure APs in safe loop code 2016-11-23 14:01 ` [PATCH 3/3] UefiCpuPkg/DxeMpLib: Make sure APs in safe loop code Jeff Fan @ 2016-11-23 18:52 ` Laszlo Ersek 2016-11-24 2:37 ` Fan, Jeff 0 siblings, 1 reply; 12+ messages in thread From: Laszlo Ersek @ 2016-11-23 18:52 UTC (permalink / raw) To: Jeff Fan, edk2-devel; +Cc: Feng Tian, Michael D Kinney On 11/23/16 15:01, Jeff Fan wrote: > Add one semaphore to make sure BSP to wait till all APs run in AP safe loop > code. > > Cc: Laszlo Ersek <lersek@redhat.com> > Cc: Feng Tian <feng.tian@intel.com> > Cc: Michael D Kinney <michael.d.kinney@intel.com> > Contributed-under: TianoCore Contribution Agreement 1.0 > Signed-off-by: Jeff Fan <jeff.fan@intel.com> > --- > UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 8 +++++++- > UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 2 ++ > UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 ++- > UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 2 ++ > 4 files changed, 13 insertions(+), 2 deletions(-) > > diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > index d0f9f7e..64c0c52 100644 > --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > @@ -27,6 +27,7 @@ EFI_EVENT mLegacyBootEvent = NULL; > volatile BOOLEAN mStopCheckAllApsStatus = TRUE; > VOID *mReservedApLoopFunc = NULL; > UINTN mReservedTopOfApStack; > +volatile UINT32 mNumberToFinish = 0; > > /** > Get the pointer to CPU MP Data structure. > @@ -253,7 +254,8 @@ RelocateApLoop ( > MwaitSupport, > CpuMpData->ApTargetCState, > CpuMpData->PmCodeSegment, > - mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE > + mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE, > + (UINTN) &mNumberToFinish > ); > // > // It should never reach here > @@ -282,8 +284,12 @@ MpInitChangeApLoopCallback ( > CpuMpData->SaveRestoreFlag = TRUE; > CpuMpData->PmCodeSegment = GetProtectedModeCS (); > CpuMpData->ApLoopMode = PcdGet8 (PcdCpuApLoopMode); > + mNumberToFinish = CpuMpData->CpuCount - 1; > WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL); > DEBUG ((DEBUG_INFO, "%a() done!\n", __FUNCTION__)); > + while (mNumberToFinish > 0) { > + CpuPause (); > + } > } (1) Can you hoist the counting above the "done" message? > > /** > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > index 4e55760..6235748 100644 > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > @@ -222,6 +222,8 @@ AsmRelocateApLoopStart: > mov ecx, [ebp + 8] ; MwaitSupport > mov ebx, [ebp + 12] ; ApTargetCState > mov esp, [ebp + 20] ; TopOfApStack > + mov eax, [ebp + 24] ; CountTofinish > + lock dec dword [eax] ; CountTofinish-- (2) The comments should say ; NumberToFinish ; (*NumberToFinish)-- > cmp cl, 1 ; Check mwait-monitor support > jnz HltLoop > MwaitLoop: > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h > index e6dea18..49305ad 100644 > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h > @@ -251,7 +251,8 @@ VOID > IN BOOLEAN MwaitSupport, > IN UINTN ApTargetCState, > IN UINTN PmCodeSegment, > - IN UINTN TopOfApStack > + IN UINTN TopOfApStack, > + IN UINTN NumberToFinish > ); > > /** > diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > index 7c8fa45..deaee9e 100644 > --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > @@ -227,6 +227,8 @@ AsmRelocateApLoopStart: > push rbp > mov rbp, rsp > mov rsp, r9 > + mov rax, [rbp + 48] ; CountTofinish > + lock dec dword [rax] ; CountTofinish-- > push rcx > push rdx > > Hmmm, in the X64 case, this decrement seems to be pretty far from the actual loop; a window still remains. On the other hand, if we wanted to do the decrement from protected mode, then the semaphore would have to be moved into the specially allocated area too. Anyway, the point is that we are on the new stack, and will make no further accesses to BootServicesData type memory. (3) The comments should say ; NumberToFinish ; (*NumberToFinish)-- I think I'll postpone testing until the next version (if you decide to post v3). Thanks! Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] UefiCpuPkg/DxeMpLib: Make sure APs in safe loop code 2016-11-23 18:52 ` Laszlo Ersek @ 2016-11-24 2:37 ` Fan, Jeff 0 siblings, 0 replies; 12+ messages in thread From: Fan, Jeff @ 2016-11-24 2:37 UTC (permalink / raw) To: Laszlo Ersek, edk2-devel@ml01.01.org; +Cc: Tian, Feng, Kinney, Michael D Laszlo, I added my feedback in [Jeff]. -----Original Message----- From: Laszlo Ersek [mailto:lersek@redhat.com] Sent: Thursday, November 24, 2016 2:52 AM To: Fan, Jeff; edk2-devel@ml01.01.org Cc: Tian, Feng; Kinney, Michael D Subject: Re: [PATCH 3/3] UefiCpuPkg/DxeMpLib: Make sure APs in safe loop code On 11/23/16 15:01, Jeff Fan wrote: > Add one semaphore to make sure BSP to wait till all APs run in AP safe > loop code. > > Cc: Laszlo Ersek <lersek@redhat.com> > Cc: Feng Tian <feng.tian@intel.com> > Cc: Michael D Kinney <michael.d.kinney@intel.com> > Contributed-under: TianoCore Contribution Agreement 1.0 > Signed-off-by: Jeff Fan <jeff.fan@intel.com> > --- > UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 8 +++++++- > UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 2 ++ > UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 ++- > UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 2 ++ > 4 files changed, 13 insertions(+), 2 deletions(-) > > diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > index d0f9f7e..64c0c52 100644 > --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > @@ -27,6 +27,7 @@ EFI_EVENT mLegacyBootEvent = NULL; > volatile BOOLEAN mStopCheckAllApsStatus = TRUE; > VOID *mReservedApLoopFunc = NULL; > UINTN mReservedTopOfApStack; > +volatile UINT32 mNumberToFinish = 0; > > /** > Get the pointer to CPU MP Data structure. > @@ -253,7 +254,8 @@ RelocateApLoop ( > MwaitSupport, > CpuMpData->ApTargetCState, > CpuMpData->PmCodeSegment, > - mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE > + mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE, > + (UINTN) &mNumberToFinish > ); > // > // It should never reach here > @@ -282,8 +284,12 @@ MpInitChangeApLoopCallback ( > CpuMpData->SaveRestoreFlag = TRUE; > CpuMpData->PmCodeSegment = GetProtectedModeCS (); > CpuMpData->ApLoopMode = PcdGet8 (PcdCpuApLoopMode); > + mNumberToFinish = CpuMpData->CpuCount - 1; > WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL); > DEBUG ((DEBUG_INFO, "%a() done!\n", __FUNCTION__)); > + while (mNumberToFinish > 0) { > + CpuPause (); > + } > } (1) Can you hoist the counting above the "done" message? [Jeff] Agree. > > /** > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > index 4e55760..6235748 100644 > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > @@ -222,6 +222,8 @@ AsmRelocateApLoopStart: > mov ecx, [ebp + 8] ; MwaitSupport > mov ebx, [ebp + 12] ; ApTargetCState > mov esp, [ebp + 20] ; TopOfApStack > + mov eax, [ebp + 24] ; CountTofinish > + lock dec dword [eax] ; CountTofinish-- (2) The comments should say ; NumberToFinish ; (*NumberToFinish)-- [Jeff] Agree. > cmp cl, 1 ; Check mwait-monitor support > jnz HltLoop > MwaitLoop: > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h > b/UefiCpuPkg/Library/MpInitLib/MpLib.h > index e6dea18..49305ad 100644 > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h > @@ -251,7 +251,8 @@ VOID > IN BOOLEAN MwaitSupport, > IN UINTN ApTargetCState, > IN UINTN PmCodeSegment, > - IN UINTN TopOfApStack > + IN UINTN TopOfApStack, > + IN UINTN NumberToFinish > ); > > /** > diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > index 7c8fa45..deaee9e 100644 > --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > @@ -227,6 +227,8 @@ AsmRelocateApLoopStart: > push rbp > mov rbp, rsp > mov rsp, r9 > + mov rax, [rbp + 48] ; CountTofinish > + lock dec dword [rax] ; CountTofinish-- > push rcx > push rdx > > Hmmm, in the X64 case, this decrement seems to be pretty far from the actual loop; a window still remains. On the other hand, if we wanted to do the decrement from protected mode, then the semaphore would have to be moved into the specially allocated area too. [Jeff] There is no window on this case. After "move rsp, r9" all AP code will be updated in safe code / safe stack. And it is in long mode. Anyway, the point is that we are on the new stack, and will make no further accesses to BootServicesData type memory. (3) The comments should say ; NumberToFinish ; (*NumberToFinish)-- [Jeff] Agree. I think I'll postpone testing until the next version (if you decide to post v3). Thanks! Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2016-11-24 21:14 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-11-23 14:01 [PATCH 0/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB Jeff Fan 2016-11-23 14:01 ` [PATCH 1/3] UefiCpuPkg/DxeMpLib: Get safe AP loop handler from global variable Jeff Fan 2016-11-23 17:08 ` Laszlo Ersek 2016-11-23 14:01 ` [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB Jeff Fan 2016-11-23 18:31 ` Laszlo Ersek 2016-11-24 2:23 ` Fan, Jeff 2016-11-24 9:20 ` Laszlo Ersek 2016-11-24 13:48 ` Fan, Jeff 2016-11-24 21:13 ` Laszlo Ersek 2016-11-23 14:01 ` [PATCH 3/3] UefiCpuPkg/DxeMpLib: Make sure APs in safe loop code Jeff Fan 2016-11-23 18:52 ` Laszlo Ersek 2016-11-24 2:37 ` Fan, Jeff
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox