* [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB @ 2023-01-05 6:21 Yuanhao Xie 2023-01-05 6:28 ` [edk2-devel] " Ni, Ray 2023-01-05 9:38 ` Ard Biesheuvel 0 siblings, 2 replies; 17+ messages in thread From: Yuanhao Xie @ 2023-01-05 6:21 UTC (permalink / raw) To: devel Kept 4GB allocation limit for the case that APs are still transferred to 32-bit protected mode on long mode DXE. Fixed AsmRelocateApLoopStart stack offset in Ia32. Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=4234 Cc: Eric Dong eric.dong@intel.com Cc: Ray Ni ray.ni@intel.com Cc: Rahul Kumar rahul1.kumar@intel.com Signed-off-by: Yuanhao Xie <yuanhao.xie@intel.com> --- UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 35 ++++++++++++------- .../Library/MpInitLib/Ia32/MpFuncs.nasm | 9 ++--- 2 files changed, 25 insertions(+), 19 deletions(-) diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c index beab06a5b1..1994ee44c2 100644 --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c @@ -389,7 +389,7 @@ RelocateApLoop ( MpInitLibWhoAmI (&ProcessorNumber); CpuMpData = GetCpuMpData (); MwaitSupport = IsMwaitSupport (); - if (StandardSignatureIsAuthenticAMD ()) { + if (StandardSignatureIsAuthenticAMD () && (sizeof (UINTN) == sizeof (UINT64))) { StackStart = CpuMpData->UseSevEsAPMethod ? CpuMpData->SevEsAPResetStackStart : mReservedTopOfApStack; AsmRelocateApLoopFuncAmd = (ASM_RELOCATE_AP_LOOP_AMD)(UINTN)mReservedApLoopFunc; AsmRelocateApLoopFuncAmd ( @@ -480,6 +480,7 @@ InitMpGlobalData ( EFI_GCD_MEMORY_SPACE_DESCRIPTOR MemDesc; UINTN StackBase; CPU_INFO_IN_HOB *CpuInfoInHob; + EFI_PHYSICAL_ADDRESS Address; SaveCpuMpData (CpuMpData); @@ -536,9 +537,9 @@ InitMpGlobalData ( // // Avoid APs access invalid buffer data which allocated by BootServices, - // so we will allocate reserved data for AP loop code. We also need to - // allocate this buffer below 4GB due to APs may be transferred to 32bit - // protected mode on long mode DXE. + // so we will allocate reserved data for AP loop code. We need to + // allocate this buffer below 4GB for the case that APs are transferred + // to 32-bit protected mode on long mode DXE. // Allocating it in advance since memory services are not available in // Exit Boot Services callback function. // @@ -555,19 +556,25 @@ InitMpGlobalData ( ) ); - mReservedTopOfApStack = (UINTN)AllocateReservedPages (EFI_SIZE_TO_PAGES (ApSafeBufferSize)); - ASSERT (mReservedTopOfApStack != 0); - ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0); - ASSERT ((AP_SAFE_STACK_SIZE & (CPU_STACK_ALIGNMENT - 1)) == 0); - - mReservedApLoopFunc = (VOID *)(mReservedTopOfApStack + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); - if (StandardSignatureIsAuthenticAMD ()) { + if (StandardSignatureIsAuthenticAMD () && (sizeof (UINTN) == sizeof (UINT64))) { + Address = BASE_4GB - 1; + Status = gBS->AllocatePages ( + AllocateMaxAddress, + EfiReservedMemoryType, + EFI_SIZE_TO_PAGES (ApSafeBufferSize), + &Address + ); + ASSERT_EFI_ERROR (Status); + mReservedApLoopFunc = (VOID *)((UINTN)Address + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); CopyMem ( mReservedApLoopFunc, CpuMpData->AddressMap.RelocateApLoopFuncAddressAmd, CpuMpData->AddressMap.RelocateApLoopFuncSizeAmd ); } else { + Address = (UINTN)AllocateReservedPages (EFI_SIZE_TO_PAGES (ApSafeBufferSize)); + ASSERT (Address != 0); + mReservedApLoopFunc = (VOID *)((UINTN)Address + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); CopyMem ( mReservedApLoopFunc, CpuMpData->AddressMap.RelocateApLoopFuncAddress, @@ -575,12 +582,14 @@ InitMpGlobalData ( ); mApPageTable = CreatePageTable ( - mReservedTopOfApStack, + (UINTN)Address, ApSafeBufferSize ); } - mReservedTopOfApStack += CpuMpData->CpuCount * AP_SAFE_STACK_SIZE; + mReservedTopOfApStack = (UINTN)Address + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE; + ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0); + ASSERT ((AP_SAFE_STACK_SIZE & (CPU_STACK_ALIGNMENT - 1)) == 0); Status = gBS->CreateEvent ( EVT_TIMER | EVT_NOTIFY_SIGNAL, diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm index bfcdbd31c1..5cffa632ab 100644 --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm @@ -219,20 +219,17 @@ SwitchToRealProcEnd: RendezvousFunnelProcEnd: ;------------------------------------------------------------------------------------- -; AsmRelocateApLoop (MwaitSupport, ApTargetCState, PmCodeSegment, TopOfApStack, CountTofinish, Pm16CodeSegment, SevEsAPJumpTable, WakeupBuffer); -; -; The last three parameters (Pm16CodeSegment, SevEsAPJumpTable and WakeupBuffer) are -; specific to SEV-ES support and are not applicable on IA32. +; AsmRelocateApLoop (MwaitSupport, ApTargetCState, TopOfApStack, CountTofinish, Cr3); ;------------------------------------------------------------------------------------- AsmRelocateApLoopStart: mov eax, esp - mov esp, [eax + 16] ; TopOfApStack + mov esp, [eax + 12] ; TopOfApStack push dword [eax] ; push return address for stack trace push ebp mov ebp, esp mov ebx, [eax + 8] ; ApTargetCState mov ecx, [eax + 4] ; MwaitSupport - mov eax, [eax + 20] ; CountTofinish + mov eax, [eax + 16] ; CountTofinish lock dec dword [eax] ; (*CountTofinish)-- cmp cl, 1 ; Check mwait-monitor support jnz HltLoop -- 2.36.1.windows.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-05 6:21 [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB Yuanhao Xie @ 2023-01-05 6:28 ` Ni, Ray 2023-01-05 7:19 ` Yuanhao Xie 2023-01-05 9:38 ` Ard Biesheuvel 1 sibling, 1 reply; 17+ messages in thread From: Ni, Ray @ 2023-01-05 6:28 UTC (permalink / raw) To: devel@edk2.groups.io, Xie, Yuanhao Cc: Laszlo Ersek, Gerd Hoffmann, Ard Biesheuvel > Fixed AsmRelocateApLoopStart stack offset in Ia32. The above commit message is not quite clear. How about "Fix 32bit version AsmRelocateApLoopStart to retrieve the parameters from correct stack offset"? I also recommend that you split to 2 patches because the code change here fixes two bugs. Have you confirmed that the 32bit Linux boot hangs without the changes and succeeds with the changes? Thanks, Ray ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-05 6:28 ` [edk2-devel] " Ni, Ray @ 2023-01-05 7:19 ` Yuanhao Xie 0 siblings, 0 replies; 17+ messages in thread From: Yuanhao Xie @ 2023-01-05 7:19 UTC (permalink / raw) To: Ni, Ray, devel@edk2.groups.io; +Cc: Laszlo Ersek, Gerd Hoffmann, Ard Biesheuvel -The above commit message is not quite clear. -How about "Fix 32bit version AsmRelocateApLoopStart to retrieve the parameters from correct stack offset"? -I also recommend that you split to 2 patches because the code change here fixes two bugs. Ok, I will split it as 2 patches with updated commit messages. - Have you confirmed that the 32bit Linux boot hangs without the changes and succeeds with the changes? Yes, I confirmed. It hangs at: "CpuDxe: Level 5 paging = 0" without code changes. With the code changes, the "debug.log" continues to read "MpInitChangeApLoopCallback() done!". Linux debug messages are also displayed on the terminal. Thanks, Yuanhao -----Original Message----- From: Ni, Ray <ray.ni@intel.com> Sent: Thursday, January 5, 2023 2:29 PM To: devel@edk2.groups.io; Xie, Yuanhao <yuanhao.xie@intel.com> Cc: Laszlo Ersek <lersek@redhat.com>; Gerd Hoffmann <kraxel@redhat.com>; Ard Biesheuvel <ardb@kernel.org> Subject: RE: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB > Fixed AsmRelocateApLoopStart stack offset in Ia32. The above commit message is not quite clear. How about "Fix 32bit version AsmRelocateApLoopStart to retrieve the parameters from correct stack offset"? I also recommend that you split to 2 patches because the code change here fixes two bugs. Have you confirmed that the 32bit Linux boot hangs without the changes and succeeds with the changes? Thanks, Ray ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-05 6:21 [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB Yuanhao Xie 2023-01-05 6:28 ` [edk2-devel] " Ni, Ray @ 2023-01-05 9:38 ` Ard Biesheuvel 2023-01-06 4:12 ` Ni, Ray 1 sibling, 1 reply; 17+ messages in thread From: Ard Biesheuvel @ 2023-01-05 9:38 UTC (permalink / raw) To: devel, yuanhao.xie On Thu, 5 Jan 2023 at 07:21, Yuanhao Xie <yuanhao.xie@intel.com> wrote: > > Kept 4GB allocation limit for the case that APs are still transferred to > 32-bit protected mode on long mode DXE. > Fixed AsmRelocateApLoopStart stack offset in Ia32. > > Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=4234 > > Cc: Eric Dong eric.dong@intel.com > Cc: Ray Ni ray.ni@intel.com > Cc: Rahul Kumar rahul1.kumar@intel.com > Signed-off-by: Yuanhao Xie <yuanhao.xie@intel.com> > --- > UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 35 ++++++++++++------- > .../Library/MpInitLib/Ia32/MpFuncs.nasm | 9 ++--- > 2 files changed, 25 insertions(+), 19 deletions(-) > > diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > index beab06a5b1..1994ee44c2 100644 > --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > @@ -389,7 +389,7 @@ RelocateApLoop ( > MpInitLibWhoAmI (&ProcessorNumber); > CpuMpData = GetCpuMpData (); > MwaitSupport = IsMwaitSupport (); > - if (StandardSignatureIsAuthenticAMD ()) { > + if (StandardSignatureIsAuthenticAMD () && (sizeof (UINTN) == sizeof (UINT64))) { > StackStart = CpuMpData->UseSevEsAPMethod ? CpuMpData->SevEsAPResetStackStart : mReservedTopOfApStack; > AsmRelocateApLoopFuncAmd = (ASM_RELOCATE_AP_LOOP_AMD)(UINTN)mReservedApLoopFunc; > AsmRelocateApLoopFuncAmd ( > @@ -480,6 +480,7 @@ InitMpGlobalData ( > EFI_GCD_MEMORY_SPACE_DESCRIPTOR MemDesc; > UINTN StackBase; > CPU_INFO_IN_HOB *CpuInfoInHob; > + EFI_PHYSICAL_ADDRESS Address; > > SaveCpuMpData (CpuMpData); > > @@ -536,9 +537,9 @@ InitMpGlobalData ( > > // > // Avoid APs access invalid buffer data which allocated by BootServices, > - // so we will allocate reserved data for AP loop code. We also need to > - // allocate this buffer below 4GB due to APs may be transferred to 32bit > - // protected mode on long mode DXE. > + // so we will allocate reserved data for AP loop code. We need to > + // allocate this buffer below 4GB for the case that APs are transferred > + // to 32-bit protected mode on long mode DXE. > // Allocating it in advance since memory services are not available in > // Exit Boot Services callback function. > // > @@ -555,19 +556,25 @@ InitMpGlobalData ( > ) > ); > > - mReservedTopOfApStack = (UINTN)AllocateReservedPages (EFI_SIZE_TO_PAGES (ApSafeBufferSize)); > - ASSERT (mReservedTopOfApStack != 0); > - ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0); > - ASSERT ((AP_SAFE_STACK_SIZE & (CPU_STACK_ALIGNMENT - 1)) == 0); > - > - mReservedApLoopFunc = (VOID *)(mReservedTopOfApStack + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); > - if (StandardSignatureIsAuthenticAMD ()) { > + if (StandardSignatureIsAuthenticAMD () && (sizeof (UINTN) == sizeof (UINT64))) { This looks the wrong way around. > + Address = BASE_4GB - 1; > + Status = gBS->AllocatePages ( > + AllocateMaxAddress, > + EfiReservedMemoryType, > + EFI_SIZE_TO_PAGES (ApSafeBufferSize), > + &Address > + ); > + ASSERT_EFI_ERROR (Status); > + mReservedApLoopFunc = (VOID *)((UINTN)Address + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); > CopyMem ( > mReservedApLoopFunc, > CpuMpData->AddressMap.RelocateApLoopFuncAddressAmd, > CpuMpData->AddressMap.RelocateApLoopFuncSizeAmd > ); > } else { > + Address = (UINTN)AllocateReservedPages (EFI_SIZE_TO_PAGES (ApSafeBufferSize)); > + ASSERT (Address != 0); Isn't this code path still used for the IA32 version? > + mReservedApLoopFunc = (VOID *)((UINTN)Address + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); > CopyMem ( > mReservedApLoopFunc, > CpuMpData->AddressMap.RelocateApLoopFuncAddress, > @@ -575,12 +582,14 @@ InitMpGlobalData ( > ); > > mApPageTable = CreatePageTable ( > - mReservedTopOfApStack, > + (UINTN)Address, > ApSafeBufferSize > ); > } > > - mReservedTopOfApStack += CpuMpData->CpuCount * AP_SAFE_STACK_SIZE; > + mReservedTopOfApStack = (UINTN)Address + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE; > + ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0); > + ASSERT ((AP_SAFE_STACK_SIZE & (CPU_STACK_ALIGNMENT - 1)) == 0); > > Status = gBS->CreateEvent ( > EVT_TIMER | EVT_NOTIFY_SIGNAL, > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > index bfcdbd31c1..5cffa632ab 100644 > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > @@ -219,20 +219,17 @@ SwitchToRealProcEnd: > RendezvousFunnelProcEnd: > > ;------------------------------------------------------------------------------------- > -; AsmRelocateApLoop (MwaitSupport, ApTargetCState, PmCodeSegment, TopOfApStack, CountTofinish, Pm16CodeSegment, SevEsAPJumpTable, WakeupBuffer); > -; > -; The last three parameters (Pm16CodeSegment, SevEsAPJumpTable and WakeupBuffer) are > -; specific to SEV-ES support and are not applicable on IA32. > +; AsmRelocateApLoop (MwaitSupport, ApTargetCState, TopOfApStack, CountTofinish, Cr3); > ;------------------------------------------------------------------------------------- > AsmRelocateApLoopStart: > mov eax, esp > - mov esp, [eax + 16] ; TopOfApStack > + mov esp, [eax + 12] ; TopOfApStack > push dword [eax] ; push return address for stack trace > push ebp > mov ebp, esp > mov ebx, [eax + 8] ; ApTargetCState > mov ecx, [eax + 4] ; MwaitSupport > - mov eax, [eax + 20] ; CountTofinish > + mov eax, [eax + 16] ; CountTofinish > lock dec dword [eax] ; (*CountTofinish)-- > cmp cl, 1 ; Check mwait-monitor support > jnz HltLoop > -- > 2.36.1.windows.1 > > > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-05 9:38 ` Ard Biesheuvel @ 2023-01-06 4:12 ` Ni, Ray 2023-01-06 6:42 ` Laszlo Ersek 0 siblings, 1 reply; 17+ messages in thread From: Ni, Ray @ 2023-01-06 4:12 UTC (permalink / raw) To: devel@edk2.groups.io, ardb@kernel.org, Xie, Yuanhao > > @@ -555,19 +556,25 @@ InitMpGlobalData ( > > ) > > ); > > > > - mReservedTopOfApStack = (UINTN)AllocateReservedPages (EFI_SIZE_TO_PAGES (ApSafeBufferSize)); > > - ASSERT (mReservedTopOfApStack != 0); > > - ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0); > > - ASSERT ((AP_SAFE_STACK_SIZE & (CPU_STACK_ALIGNMENT - 1)) == 0); > > - > > - mReservedApLoopFunc = (VOID *)(mReservedTopOfApStack + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); > > - if (StandardSignatureIsAuthenticAMD ()) { > > + if (StandardSignatureIsAuthenticAMD () && (sizeof (UINTN) == sizeof (UINT64))) { > > This looks the wrong way around. Ard, Only AMD X64 (including SEV and without SEV) runs the code that switches to 32bit paging disabled mode. Intel X64 runs the code that stays at 64bit paging mode. So no need for <4G memory. All IA32 CPUs (including intel and AMD) stays at 32bit paging disabled mode. The AllocateReservedPages() call should not return a memory above 4GB in 32bit env. Did I miss anything? Thanks, Ray ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 4:12 ` Ni, Ray @ 2023-01-06 6:42 ` Laszlo Ersek 2023-01-06 8:03 ` Gerd Hoffmann 0 siblings, 1 reply; 17+ messages in thread From: Laszlo Ersek @ 2023-01-06 6:42 UTC (permalink / raw) To: devel, ray.ni, ardb@kernel.org, Xie, Yuanhao, Gerd Hoffmann On 1/6/23 05:12, Ni, Ray wrote: >>> @@ -555,19 +556,25 @@ InitMpGlobalData ( >>> ) >>> ); >>> >>> - mReservedTopOfApStack = (UINTN)AllocateReservedPages (EFI_SIZE_TO_PAGES (ApSafeBufferSize)); >>> - ASSERT (mReservedTopOfApStack != 0); >>> - ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0); >>> - ASSERT ((AP_SAFE_STACK_SIZE & (CPU_STACK_ALIGNMENT - 1)) == 0); >>> - >>> - mReservedApLoopFunc = (VOID *)(mReservedTopOfApStack + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); >>> - if (StandardSignatureIsAuthenticAMD ()) { >>> + if (StandardSignatureIsAuthenticAMD () && (sizeof (UINTN) == sizeof (UINT64))) { >> >> This looks the wrong way around. > > > Ard, > > Only AMD X64 (including SEV and without SEV) runs the code that > switches to 32bit paging disabled mode. > Intel X64 runs the code that stays at 64bit paging mode. So no need > for <4G memory. > All IA32 CPUs (including intel and AMD) stays at 32bit paging disabled > mode. The AllocateReservedPages() call should not return a memory > above 4GB in 32bit env. This argument about the allocations sounds valid, thanks. The code still remains incredibly hard to read. It needs serious cleanup. (1) Wherever we have "Amd" in an identifier, let's rename it to "Amd64", to better reflect the revised check. (2) Wherever we have an _AMD in a typedef, rename it to _AMD64. For example, in the ASM_RELOCATE_AP_LOOP_AMD typedef. The leading *comment* on that typedef should be updated, too. (3) The way the mReservedApLoopFunc variable is used (populated, and then consumed) in C code is super awkward. We have casts left and right, and duplicated code, too. (4) Before commit 73ccde8f6d04, we had two separate allocations: namely, for the AP loop func, and then the stacks of the CPUs together. The size of the loop func was rounded up to a whole multiple of EFI_PAGE_SIZE, and then the pages accommodating the loop func were set to executable. (Unfortunately the name for the rounded-up size was terribly non-descriptive: "ApSafeBufferSize".) This was done in commit bc2288f59ba2 ("UefiCpuPkg/MpInitLib: put mReservedApLoopFunc in executable memory", 2018-03-08). And, because we had two separate allocations, which could of course be discontiguous between each other, we needed two variables for storing their addresses, mReservedApLoopFunc and mReservedTopOfApStack. Commit 73ccde8f6d04 ("UefiCpuPkg: Has APs in 64 bit long-mode before booting to OS.", 2022-12-20) *removed* the executable marking. (4a) Is that not a problem? And if it's not a problem, why was it not done (or at least explained) separately? (4b) After commit 73ccde8f6d04, we have a single allocation only. The two allocations have been fused. The "mReservedTopOfApStack" variable is now effectively a duplicate of mReservedApLoopFunc, so it's only good for confusion. It should be eliminated. (5) The "ApSafeBufferSize" variable name is now totally bogus. It's OK to have a variable for expressing the allocation size, but "ApSafeBufferSize" is wrong. It should be renamed. (6) Here's yet another bug in commit 73ccde8f6d04 (which is not fixed by the currently proposed patch): the size of the fused allocation, expressed in "ApSafeBufferSize", does not take into account which variant of the AP loop func we're going to use; it only accounts for the non-AMD64 version. This bug is likely masked by the rounding-up to EFI_PAGE_SIZE, but it's still a bug. (7) We should assert that AP_SAFE_STACK_SIZE is correctly aligned with STATIC_ASSERT(), not ASSERT(). Here's a rough patch (on top of this one) to demonstrate some of the improvements, all squashed together (just to show the ideas): > diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > index 1994ee44c259..e77bdc4f201d 100644 > --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c > @@ -25,11 +25,16 @@ EFI_EVENT mCheckAllApsEvent = NULL; > EFI_EVENT mMpInitExitBootServicesEvent = NULL; > EFI_EVENT mLegacyBootEvent = NULL; > volatile BOOLEAN mStopCheckAllApsStatus = TRUE; > -VOID *mReservedApLoopFunc = NULL; > UINTN mReservedTopOfApStack; > volatile UINT32 mNumberToFinish = 0; > UINTN mApPageTable; > > +STATIC union { > + VOID *Data; > + ASM_RELOCATE_AP_LOOP_AMD Amd64Entry; > + ASM_RELOCATE_AP_LOOP GenericEntry; > +} mReservedApLoop; > + > // > // Begin wakeup buffer allocation below 0x88000 > // > @@ -381,8 +386,6 @@ RelocateApLoop ( > { > CPU_MP_DATA *CpuMpData; > BOOLEAN MwaitSupport; > - ASM_RELOCATE_AP_LOOP AsmRelocateApLoopFunc; > - ASM_RELOCATE_AP_LOOP_AMD AsmRelocateApLoopFuncAmd; > UINTN ProcessorNumber; > UINTN StackStart; > > @@ -391,8 +394,7 @@ RelocateApLoop ( > MwaitSupport = IsMwaitSupport (); > if (StandardSignatureIsAuthenticAMD () && (sizeof (UINTN) == sizeof (UINT64))) { > StackStart = CpuMpData->UseSevEsAPMethod ? CpuMpData->SevEsAPResetStackStart : mReservedTopOfApStack; > - AsmRelocateApLoopFuncAmd = (ASM_RELOCATE_AP_LOOP_AMD)(UINTN)mReservedApLoopFunc; > - AsmRelocateApLoopFuncAmd ( > + mReservedApLoop.Amd64Entry ( > MwaitSupport, > CpuMpData->ApTargetCState, > CpuMpData->PmCodeSegment, > @@ -404,8 +406,7 @@ RelocateApLoop ( > ); > } else { > StackStart = mReservedTopOfApStack; > - AsmRelocateApLoopFunc = (ASM_RELOCATE_AP_LOOP)(UINTN)mReservedApLoopFunc; > - AsmRelocateApLoopFunc ( > + mReservedApLoop.GenericEntry ( > MwaitSupport, > CpuMpData->ApTargetCState, > StackStart - ProcessorNumber * AP_SAFE_STACK_SIZE, > @@ -475,12 +476,15 @@ InitMpGlobalData ( > ) > { > EFI_STATUS Status; > - UINTN ApSafeBufferSize; > + MP_ASSEMBLY_ADDRESS_MAP *AddressMap; > + UINTN AllocSize; > UINTN Index; > EFI_GCD_MEMORY_SPACE_DESCRIPTOR MemDesc; > UINTN StackBase; > CPU_INFO_IN_HOB *CpuInfoInHob; > EFI_PHYSICAL_ADDRESS Address; > + UINT8 *ApLoopFuncData; > + UINTN ApLoopFuncSize; > > SaveCpuMpData (CpuMpData); > > @@ -549,48 +553,58 @@ InitMpGlobalData ( > // | Stack * N | > // +------------+ (low address) > // > - ApSafeBufferSize = EFI_PAGES_TO_SIZE ( > - EFI_SIZE_TO_PAGES ( > - CpuMpData->CpuCount * AP_SAFE_STACK_SIZE > - + CpuMpData->AddressMap.RelocateApLoopFuncSize > - ) > - ); > + > + AddressMap = &CpuMpData->AddressMap; > > if (StandardSignatureIsAuthenticAMD () && (sizeof (UINTN) == sizeof (UINT64))) { > - Address = BASE_4GB - 1; > - Status = gBS->AllocatePages ( > - AllocateMaxAddress, > - EfiReservedMemoryType, > - EFI_SIZE_TO_PAGES (ApSafeBufferSize), > - &Address > - ); > - ASSERT_EFI_ERROR (Status); > - mReservedApLoopFunc = (VOID *)((UINTN)Address + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); > - CopyMem ( > - mReservedApLoopFunc, > - CpuMpData->AddressMap.RelocateApLoopFuncAddressAmd, > - CpuMpData->AddressMap.RelocateApLoopFuncSizeAmd > - ); > + // > + // 64-bit AMD Processor > + // > + Address = BASE_4GB - 1; > + ApLoopFuncData = AddressMap->RelocateApLoopFuncAddressAmd; > + ApLoopFuncSize = AddressMap->RelocateApLoopFuncSizeAmd; > } else { > - Address = (UINTN)AllocateReservedPages (EFI_SIZE_TO_PAGES (ApSafeBufferSize)); > - ASSERT (Address != 0); > - mReservedApLoopFunc = (VOID *)((UINTN)Address + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); > - CopyMem ( > - mReservedApLoopFunc, > - CpuMpData->AddressMap.RelocateApLoopFuncAddress, > - CpuMpData->AddressMap.RelocateApLoopFuncSize > - ); > + // > + // Intel Processor (32-bit or 64-bit), or 32-bit AMD Processor > + // > + Address = MAX_ADDRESS; > + ApLoopFuncData = AddressMap->RelocateApLoopFuncAddress; > + ApLoopFuncSize = AddressMap->RelocateApLoopFuncSize; > + } > > + STATIC_ASSERT ((AP_SAFE_STACK_SIZE & (CPU_STACK_ALIGNMENT - 1)) == 0); > + AllocSize = EFI_PAGES_TO_SIZE ( > + EFI_SIZE_TO_PAGES ( > + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE + ApLoopFuncSize > + ) > + ); > + > + Status = gBS->AllocatePages ( > + AllocateMaxAddress, > + EfiReservedMemoryType, > + EFI_SIZE_TO_PAGES (AllocSize), > + &Address > + ); > + ASSERT_EFI_ERROR (Status); > + > + mReservedTopOfApStack = ((UINTN)Address + > + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE); > + ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0); > + > + mReservedApLoop.Data = (VOID *)mReservedTopOfApStack; > + CopyMem (mReservedApLoop.Data, ApLoopFuncData, ApLoopFuncSize); > + > + if (!StandardSignatureIsAuthenticAMD () && > + (sizeof (UINTN) == sizeof (UINT64))) { > + // > + // 64-bit Intel Processor > + // > mApPageTable = CreatePageTable ( > (UINTN)Address, > - ApSafeBufferSize > + AllocSize > ); > } > > - mReservedTopOfApStack = (UINTN)Address + CpuMpData->CpuCount * AP_SAFE_STACK_SIZE; > - ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0); > - ASSERT ((AP_SAFE_STACK_SIZE & (CPU_STACK_ALIGNMENT - 1)) == 0); > - > Status = gBS->CreateEvent ( > EVT_TIMER | EVT_NOTIFY_SIGNAL, > TPL_NOTIFY, Honestly, at this point I'm *even more convinced* that the original series should be reverted, and redone from the ground up. There is way too much cruft for sensible incremental fixing. If you consider just the renames I'm requesting, that's going to introduce huge churn already. So let's please first undo the damage done by 73ccde8f6d04, and then build up the desired functionality in *very small*, careful steps, using the correct variable and type names, right off the bat. And please let us consider *cleaning up* the source code a primary goal while at it, removing code duplication and so on. Thanks, Laszlo ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 6:42 ` Laszlo Ersek @ 2023-01-06 8:03 ` Gerd Hoffmann 2023-01-06 8:30 ` Laszlo Ersek ` (2 more replies) 0 siblings, 3 replies; 17+ messages in thread From: Gerd Hoffmann @ 2023-01-06 8:03 UTC (permalink / raw) To: Laszlo Ersek Cc: devel, ray.ni, ardb@kernel.org, Xie, Yuanhao, thomas.lendacky On Fri, Jan 06, 2023 at 07:42:20AM +0100, Laszlo Ersek wrote: > On 1/6/23 05:12, Ni, Ray wrote: > > > > Ard, > > > > Only AMD X64 (including SEV and without SEV) runs the code that > > switches to 32bit paging disabled mode. > > Intel X64 runs the code that stays at 64bit paging mode. So no need > > for <4G memory. > > All IA32 CPUs (including intel and AMD) stays at 32bit paging disabled > > mode. The AllocateReservedPages() call should not return a memory > > above 4GB in 32bit env. > > This argument about the allocations sounds valid, thanks. > > The code still remains incredibly hard to read. It needs serious > cleanup. > > (1) Wherever we have "Amd" in an identifier, let's rename it to "Amd64", > to better reflect the revised check. Maybe even better: Use PcdConfidentialComputingGuestAttr to figure whenever SEV is active, if so branch into Amd assembler code. Rename "Amd" to "AmdSev". Otherwise just call normal X64 / Ia32 code. Amd assembler code can subsequently be simplified, the checks for SEV are not needed any more (but should not harm either). [ Adding Tom to CC ] > Commit 73ccde8f6d04 ("UefiCpuPkg: Has APs in 64 bit long-mode before > booting to OS.", 2022-12-20) *removed* the executable marking. > > (4a) Is that not a problem? I think so. Booting with strict NX checking (PcdDxeNxMemoryProtectionPolicy = 0xC000000000007FD5) and "qemu -smp 2" makes my qemu hang with 200% CPU, so probably both vcpus are spinning in a dead loop. For the BSP this is expected behavior (buggy grub.efi, see parallel thread). For the AP it is not, so apparently it is not running idle in hlt like it is supposed to. > Honestly, at this point I'm *even more convinced* that the original > series should be reverted, and redone from the ground up. Yes, "back to drawing board" looks like the best option at this point. take care, Gerd ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 8:03 ` Gerd Hoffmann @ 2023-01-06 8:30 ` Laszlo Ersek 2023-01-06 8:39 ` Ni, Ray 2023-01-06 8:43 ` Yuanhao Xie 2023-01-06 15:42 ` Lendacky, Thomas 2 siblings, 1 reply; 17+ messages in thread From: Laszlo Ersek @ 2023-01-06 8:30 UTC (permalink / raw) To: Gerd Hoffmann Cc: devel, ray.ni, ardb@kernel.org, Xie, Yuanhao, thomas.lendacky On 1/6/23 09:03, Gerd Hoffmann wrote: > On Fri, Jan 06, 2023 at 07:42:20AM +0100, Laszlo Ersek wrote: >> On 1/6/23 05:12, Ni, Ray wrote: >>> >>> Ard, >>> >>> Only AMD X64 (including SEV and without SEV) runs the code that >>> switches to 32bit paging disabled mode. >>> Intel X64 runs the code that stays at 64bit paging mode. So no need >>> for <4G memory. >>> All IA32 CPUs (including intel and AMD) stays at 32bit paging disabled >>> mode. The AllocateReservedPages() call should not return a memory >>> above 4GB in 32bit env. >> >> This argument about the allocations sounds valid, thanks. >> >> The code still remains incredibly hard to read. It needs serious >> cleanup. >> >> (1) Wherever we have "Amd" in an identifier, let's rename it to "Amd64", >> to better reflect the revised check. > > Maybe even better: Use PcdConfidentialComputingGuestAttr to figure > whenever SEV is active, if so branch into Amd assembler code. Rename > "Amd" to "AmdSev". > > Otherwise just call normal X64 / Ia32 code. > > Amd assembler code can subsequently be simplified, the checks for SEV > are not needed any more (but should not harm either). > > [ Adding Tom to CC ] > >> Commit 73ccde8f6d04 ("UefiCpuPkg: Has APs in 64 bit long-mode before >> booting to OS.", 2022-12-20) *removed* the executable marking. >> >> (4a) Is that not a problem? > > I think so. Ah... OK, my fault: one should never ask questions in English the negative! :) So, based on your next paragraph, I think you agree that this *is* a problem. (I first thought you agreed with the lack of executable marking *not* being a problem -- again, my mistake for formulating the question in the negative!) > > Booting with strict NX checking (PcdDxeNxMemoryProtectionPolicy = > 0xC000000000007FD5) and "qemu -smp 2" makes my qemu hang with 200% CPU, > so probably both vcpus are spinning in a dead loop. For the BSP this is > expected behavior (buggy grub.efi, see parallel thread). For the AP it > is not, so apparently it is not running idle in hlt like it is supposed > to. > >> Honestly, at this point I'm *even more convinced* that the original >> series should be reverted, and redone from the ground up. > > Yes, "back to drawing board" looks like the best option at this point. Let me see if I can post a revert series today. Thanks Laszlo ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 8:30 ` Laszlo Ersek @ 2023-01-06 8:39 ` Ni, Ray 2023-01-06 9:19 ` Laszlo Ersek 0 siblings, 1 reply; 17+ messages in thread From: Ni, Ray @ 2023-01-06 8:39 UTC (permalink / raw) To: Laszlo Ersek, Gerd Hoffmann Cc: devel@edk2.groups.io, ardb@kernel.org, Xie, Yuanhao, thomas.lendacky@amd.com > -----Original Message----- > From: Laszlo Ersek <lersek@redhat.com> > Sent: Friday, January 6, 2023 4:31 PM > To: Gerd Hoffmann <kraxel@redhat.com> > Cc: devel@edk2.groups.io; Ni, Ray <ray.ni@intel.com>; ardb@kernel.org; Xie, Yuanhao <yuanhao.xie@intel.com>; > thomas.lendacky@amd.com > Subject: Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB > > On 1/6/23 09:03, Gerd Hoffmann wrote: > > On Fri, Jan 06, 2023 at 07:42:20AM +0100, Laszlo Ersek wrote: > >> On 1/6/23 05:12, Ni, Ray wrote: > >>> > >>> Ard, > >>> > >>> Only AMD X64 (including SEV and without SEV) runs the code that > >>> switches to 32bit paging disabled mode. > >>> Intel X64 runs the code that stays at 64bit paging mode. So no need > >>> for <4G memory. > >>> All IA32 CPUs (including intel and AMD) stays at 32bit paging disabled > >>> mode. The AllocateReservedPages() call should not return a memory > >>> above 4GB in 32bit env. > >> > >> This argument about the allocations sounds valid, thanks. > >> > >> The code still remains incredibly hard to read. It needs serious > >> cleanup. > >> > >> (1) Wherever we have "Amd" in an identifier, let's rename it to "Amd64", > >> to better reflect the revised check. > > > > Maybe even better: Use PcdConfidentialComputingGuestAttr to figure > > whenever SEV is active, if so branch into Amd assembler code. Rename > > "Amd" to "AmdSev". > > > > Otherwise just call normal X64 / Ia32 code. > > > > Amd assembler code can subsequently be simplified, the checks for SEV > > are not needed any more (but should not harm either). > > > > [ Adding Tom to CC ] > > > >> Commit 73ccde8f6d04 ("UefiCpuPkg: Has APs in 64 bit long-mode before > >> booting to OS.", 2022-12-20) *removed* the executable marking. > >> > >> (4a) Is that not a problem? > > > > I think so. > > Ah... OK, my fault: one should never ask questions in English the > negative! :) > > So, based on your next paragraph, I think you agree that this *is* a > problem. (I first thought you agreed with the lack of executable marking > *not* being a problem -- again, my mistake for formulating the question > in the negative!) I agree it's a problem. Original thought was since AP is using a brand-new page table that doesn't have the XD bit set. There is no need for removing the XD bit in existing page table. But the final code change forgot to switch to the new page table before calling to code in the reserved memory. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 8:39 ` Ni, Ray @ 2023-01-06 9:19 ` Laszlo Ersek 2023-01-06 9:45 ` Ni, Ray 0 siblings, 1 reply; 17+ messages in thread From: Laszlo Ersek @ 2023-01-06 9:19 UTC (permalink / raw) To: devel, ray.ni, Gerd Hoffmann Cc: ardb@kernel.org, Xie, Yuanhao, thomas.lendacky@amd.com On 1/6/23 09:39, Ni, Ray wrote: > > >> -----Original Message----- >> From: Laszlo Ersek <lersek@redhat.com> >> Sent: Friday, January 6, 2023 4:31 PM >> To: Gerd Hoffmann <kraxel@redhat.com> >> Cc: devel@edk2.groups.io; Ni, Ray <ray.ni@intel.com>; ardb@kernel.org; Xie, Yuanhao <yuanhao.xie@intel.com>; >> thomas.lendacky@amd.com >> Subject: Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB >> >> On 1/6/23 09:03, Gerd Hoffmann wrote: >>> On Fri, Jan 06, 2023 at 07:42:20AM +0100, Laszlo Ersek wrote: >>>> On 1/6/23 05:12, Ni, Ray wrote: >>>>> >>>>> Ard, >>>>> >>>>> Only AMD X64 (including SEV and without SEV) runs the code that >>>>> switches to 32bit paging disabled mode. >>>>> Intel X64 runs the code that stays at 64bit paging mode. So no need >>>>> for <4G memory. >>>>> All IA32 CPUs (including intel and AMD) stays at 32bit paging disabled >>>>> mode. The AllocateReservedPages() call should not return a memory >>>>> above 4GB in 32bit env. >>>> >>>> This argument about the allocations sounds valid, thanks. >>>> >>>> The code still remains incredibly hard to read. It needs serious >>>> cleanup. >>>> >>>> (1) Wherever we have "Amd" in an identifier, let's rename it to "Amd64", >>>> to better reflect the revised check. >>> >>> Maybe even better: Use PcdConfidentialComputingGuestAttr to figure >>> whenever SEV is active, if so branch into Amd assembler code. Rename >>> "Amd" to "AmdSev". >>> >>> Otherwise just call normal X64 / Ia32 code. >>> >>> Amd assembler code can subsequently be simplified, the checks for SEV >>> are not needed any more (but should not harm either). >>> >>> [ Adding Tom to CC ] >>> >>>> Commit 73ccde8f6d04 ("UefiCpuPkg: Has APs in 64 bit long-mode before >>>> booting to OS.", 2022-12-20) *removed* the executable marking. >>>> >>>> (4a) Is that not a problem? >>> >>> I think so. >> >> Ah... OK, my fault: one should never ask questions in English the >> negative! :) >> >> So, based on your next paragraph, I think you agree that this *is* a >> problem. (I first thought you agreed with the lack of executable marking >> *not* being a problem -- again, my mistake for formulating the question >> in the negative!) > > I agree it's a problem. Original thought was since AP is using a brand-new page table > that doesn't have the XD bit set. There is no need for removing the XD bit in > existing page table. This makes sense, but, again, even disregarding the problem that the code forgot to switch to the new page table, the idea should be spelled out in the commit message and/or in code comments. Preferably: both. (In fact if the idea had been documented, Yuanhao might not have forgotten to implement the switch.) Laszlo > But the final code change forgot to switch to the new page table before calling to > code in the reserved memory. > > > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 9:19 ` Laszlo Ersek @ 2023-01-06 9:45 ` Ni, Ray 2023-01-06 10:35 ` Laszlo Ersek 0 siblings, 1 reply; 17+ messages in thread From: Ni, Ray @ 2023-01-06 9:45 UTC (permalink / raw) To: Laszlo Ersek, devel@edk2.groups.io, Gerd Hoffmann Cc: ardb@kernel.org, Xie, Yuanhao, thomas.lendacky@amd.com > This makes sense, but, again, even disregarding the problem that the > code forgot to switch to the new page table, the idea should be spelled > out in the commit message and/or in code comments. Preferably: both. > > (In fact if the idea had been documented, Yuanhao might not have > forgotten to implement the switch.) > But the following cases still require the XD bit be removed from the active page table: *. 32bit mode In fact, old 32bit logic contains a bug that paging might still be enabled when AP is in the idle loop. That's a long-exist bug. QEMU doesn't enable the memory protection feature resulting the paging is disabled in 32bit mode. Hence the bug doesn't appear in QEMU 32bit image. Yuanhao's logic tries to not create page table for 32bit mode. I agree with this because it simplifies the 32bit flow. But considering the paging might be enabled, XD removal logic should be kept. *. 64bit AMD Today Yuanhao's logic tries to not modify AMD 64bit flow no matter SEV is enabled or not. But if the BSP page table is used by AP when running the code in the reserved memory, the XD bit removal logic should be kept for 64bit AMD. So, that means only 64bit Intel flow doesn't require the XD bit removal logic. Then, for code simplicity (reducing the branches), I recommend keeping the XD bit removal logic always. Imaging in a long future that 32bit x86 support is removed from edk2 and the legacy SEV is removed as well, we can cleanup the unnecessary XD bit removal then. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 9:45 ` Ni, Ray @ 2023-01-06 10:35 ` Laszlo Ersek 2023-01-06 11:14 ` Gerd Hoffmann 0 siblings, 1 reply; 17+ messages in thread From: Laszlo Ersek @ 2023-01-06 10:35 UTC (permalink / raw) To: devel, ray.ni, Gerd Hoffmann Cc: ardb@kernel.org, Xie, Yuanhao, thomas.lendacky@amd.com On 1/6/23 10:45, Ni, Ray wrote: >> This makes sense, but, again, even disregarding the problem that the >> code forgot to switch to the new page table, the idea should be spelled >> out in the commit message and/or in code comments. Preferably: both. >> >> (In fact if the idea had been documented, Yuanhao might not have >> forgotten to implement the switch.) >> > > But the following cases still require the XD bit be removed from the active page table: > *. 32bit mode > In fact, old 32bit logic contains a bug that paging might still be enabled when AP is in the idle loop. > That's a long-exist bug. QEMU doesn't enable the memory protection feature resulting the paging is disabled in 32bit mode. > Hence the bug doesn't appear in QEMU 32bit image. > Yuanhao's logic tries to not create page table for 32bit mode. I agree with this because it simplifies the 32bit flow. > But considering the paging might be enabled, XD removal logic should be kept. > > *. 64bit AMD > Today Yuanhao's logic tries to not modify AMD 64bit flow no matter SEV is enabled or not. > But if the BSP page table is used by AP when running the code in the reserved memory, the XD bit removal logic should be > kept for 64bit AMD. > So, that means only 64bit Intel flow doesn't require the XD bit removal logic. > Then, for code simplicity (reducing the branches), I recommend keeping the XD bit removal logic always. That sounds OK to me. But then, can we go back to the original purpose here? What is achieved by simplifying the current MpFuncs stuff, for 64-bit Intel processors? What is achieved? Commit 73ccde8f6d04 says things like "avoiding switching modes", and "reclaiming memory by OS". I don't think I understand the importance of those. First, "reclaiming memory by OS" seems questionable, as both before and after, we need reserved memory. In fact, with the modification, we might need even more reserved memory (which the OS cannot reclaim): we still need the portion for the loop func, we still need the portion for the stacks, and we also need a new page table. So that actually seems *worse*. The edk2 codebase seen as a whole gets more complicated, that's a negative too. There are some savings in X64/MpFuncs.nasm for 64-bit Intel processors, but we need to preserve (and maintain) the preexistent code too, so it's a pure code growth, from the codebase perspective. All this against the alleged benefit of "avoiding switching modes". What is wrong with switching modes? The OS needs to boot up the APs *once*, when it starts. The mode in which the firmware parked the APs is effectively irrelevant. Is this change worth the effort and code complications at all? Now, there *is* one benefit I can imagine. For Intel maintainers, it may be difficult to maintain and to "route around" the SEV-related stuff in "X64/MpFuncs.nasm", in the long term. I can wholly accept that. So splitting and duplicating the assembly code for that purpose is justified. But then the commit message should state this, and not present "staying in 64-bit" as a benefit per se. Then the purpose is to ease the assembly code maintenance for Intel developers. Entirely justified goal in my view; nobody likes to work with complicated code they can't regression-test (and I presume Intel developers can't easily test the various SEV enablement levels in-house, on a range of AMD processors). Thanks Laszlo > Imaging in a long future that 32bit x86 support is removed from edk2 and the legacy SEV is removed as well, we can cleanup > the unnecessary XD bit removal then. > > > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 10:35 ` Laszlo Ersek @ 2023-01-06 11:14 ` Gerd Hoffmann 2023-01-06 12:20 ` Laszlo Ersek 0 siblings, 1 reply; 17+ messages in thread From: Gerd Hoffmann @ 2023-01-06 11:14 UTC (permalink / raw) To: Laszlo Ersek Cc: devel, ray.ni, ardb@kernel.org, Xie, Yuanhao, thomas.lendacky@amd.com Hi, > Now, there *is* one benefit I can imagine. For Intel maintainers, it may > be difficult to maintain and to "route around" the SEV-related stuff in > "X64/MpFuncs.nasm", in the long term. I can wholly accept that. So > splitting and duplicating the assembly code for that purpose is > justified. But then the commit message should state this, and not > present "staying in 64-bit" as a benefit per se. > > Then the purpose is to ease the assembly code maintenance for Intel > developers. Entirely justified goal in my view; nobody likes to work > with complicated code they can't regression-test (and I presume Intel > developers can't easily test the various SEV enablement levels in-house, > on a range of AMD processors). Which is exactly why I suggested to catch the SEV case by checking the PCD we have for that in C code. That'll also remove the confusion we have right now wrt intel + amd processors. The special case we have to worry about is SEV being active, not running on a AMD processor. In case SEV is not active we'll just have the IA32 and X64 cases. take care, Gerd ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 11:14 ` Gerd Hoffmann @ 2023-01-06 12:20 ` Laszlo Ersek 0 siblings, 0 replies; 17+ messages in thread From: Laszlo Ersek @ 2023-01-06 12:20 UTC (permalink / raw) To: Gerd Hoffmann Cc: devel, ray.ni, ardb@kernel.org, Xie, Yuanhao, thomas.lendacky@amd.com On 1/6/23 12:14, Gerd Hoffmann wrote: > Hi, > >> Now, there *is* one benefit I can imagine. For Intel maintainers, it may >> be difficult to maintain and to "route around" the SEV-related stuff in >> "X64/MpFuncs.nasm", in the long term. I can wholly accept that. So >> splitting and duplicating the assembly code for that purpose is >> justified. But then the commit message should state this, and not >> present "staying in 64-bit" as a benefit per se. >> >> Then the purpose is to ease the assembly code maintenance for Intel >> developers. Entirely justified goal in my view; nobody likes to work >> with complicated code they can't regression-test (and I presume Intel >> developers can't easily test the various SEV enablement levels in-house, >> on a range of AMD processors). > > Which is exactly why I suggested to catch the SEV case by checking the > PCD we have for that in C code. That'll also remove the confusion we > have right now wrt intel + amd processors. The special case we have to > worry about is SEV being active, not running on a AMD processor. In > case SEV is not active we'll just have the IA32 and X64 cases. Thanks for repeating your suggestion. It seems very plausible, on second reading. I guess I just couldn't grasp it enough the first time you proposed it, sorry. I'd be very interested to see this in actual code! Thanks! Laszlo ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 8:03 ` Gerd Hoffmann 2023-01-06 8:30 ` Laszlo Ersek @ 2023-01-06 8:43 ` Yuanhao Xie 2023-01-06 9:04 ` Laszlo Ersek 2023-01-06 15:42 ` Lendacky, Thomas 2 siblings, 1 reply; 17+ messages in thread From: Yuanhao Xie @ 2023-01-06 8:43 UTC (permalink / raw) To: devel@edk2.groups.io, kraxel@redhat.com, Laszlo Ersek Cc: Ni, Ray, ardb@kernel.org, thomas.lendacky@amd.com Hi all, Thanks for feedbacks. I will revert the original ones, and send the new version. Yuanhao -----Original Message----- From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Gerd Hoffmann Sent: Friday, January 6, 2023 4:03 PM To: Laszlo Ersek <lersek@redhat.com> Cc: devel@edk2.groups.io; Ni, Ray <ray.ni@intel.com>; ardb@kernel.org; Xie, Yuanhao <yuanhao.xie@intel.com>; thomas.lendacky@amd.com Subject: Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB On Fri, Jan 06, 2023 at 07:42:20AM +0100, Laszlo Ersek wrote: > On 1/6/23 05:12, Ni, Ray wrote: > > > > Ard, > > > > Only AMD X64 (including SEV and without SEV) runs the code that > > switches to 32bit paging disabled mode. > > Intel X64 runs the code that stays at 64bit paging mode. So no need > > for <4G memory. > > All IA32 CPUs (including intel and AMD) stays at 32bit paging > > disabled mode. The AllocateReservedPages() call should not return a > > memory above 4GB in 32bit env. > > This argument about the allocations sounds valid, thanks. > > The code still remains incredibly hard to read. It needs serious > cleanup. > > (1) Wherever we have "Amd" in an identifier, let's rename it to "Amd64", > to better reflect the revised check. Maybe even better: Use PcdConfidentialComputingGuestAttr to figure whenever SEV is active, if so branch into Amd assembler code. Rename "Amd" to "AmdSev". Otherwise just call normal X64 / Ia32 code. Amd assembler code can subsequently be simplified, the checks for SEV are not needed any more (but should not harm either). [ Adding Tom to CC ] > Commit 73ccde8f6d04 ("UefiCpuPkg: Has APs in 64 bit long-mode before > booting to OS.", 2022-12-20) *removed* the executable marking. > > (4a) Is that not a problem? I think so. Booting with strict NX checking (PcdDxeNxMemoryProtectionPolicy = 0xC000000000007FD5) and "qemu -smp 2" makes my qemu hang with 200% CPU, so probably both vcpus are spinning in a dead loop. For the BSP this is expected behavior (buggy grub.efi, see parallel thread). For the AP it is not, so apparently it is not running idle in hlt like it is supposed to. > Honestly, at this point I'm *even more convinced* that the original > series should be reverted, and redone from the ground up. Yes, "back to drawing board" looks like the best option at this point. take care, Gerd ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 8:43 ` Yuanhao Xie @ 2023-01-06 9:04 ` Laszlo Ersek 0 siblings, 0 replies; 17+ messages in thread From: Laszlo Ersek @ 2023-01-06 9:04 UTC (permalink / raw) To: Xie, Yuanhao, devel@edk2.groups.io, kraxel@redhat.com Cc: Ni, Ray, ardb@kernel.org, thomas.lendacky@amd.com On 1/6/23 09:43, Xie, Yuanhao wrote: > Hi all, > > Thanks for feedbacks. I will revert the original ones, and send the > new version. OK, thanks. Please pay attention the ordering of the reverts. The original series was merged in the following order: (a) 1 7bda8c648192 UefiCpuPkg: Duplicated AsmRelocateApLoop as AsmRelocateApLoopAmd 2 73ccde8f6d04 UefiCpuPkg: Has APs in 64 bit long-mode before booting to OS. 3 4a8642422460 OvmfPkg: Add CpuPageTableLib required by MpInitLib. 4 3f378450dfaf UefiPayloadPkg: Add CpuPageTableLib required by MpInitLib. Commit #2 introduced a new lib class dependency. The dependency was resolved for OvmfPkg and UefiPayloadPkg only in patches #3 and #4, respectively. This means that, if someone checks out the tree at #2 or #3, then at #2, neither the OvmfPkg nor UefiPayloadPkg platforms build, and at #3, the UefiPayloadPkg platforms don't build: > $ git checkout 73ccde8f6d04 > > $ build -a X64 -b NOOPT -p OvmfPkg/OvmfPkgX64.dsc -t GCC5 > > [...] > > .../OvmfPkg/OvmfPkgX64.dsc(...): error 4000: Instance of library class [CpuPageTableLib] is not found > in [.../UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf] [X64] > consumed by module [.../UefiCpuPkg/CpuDxe/CpuDxe.inf] This is a problem. It's a problem because a broken build interferes with the "git bisect" command's ability to narrow down a functional (runtime) issue to a specific patch. If you can't build the tree at a particular commit, then you cannot test whether that commit already contains the regression, or not. Usually when a series is reverted, the revert commits are ordered in reverse to the original commits. However, in this case, we shouldn't do that, because then we'd introduce two more commits into the git history at which the tree doesn't build. The proper original order (for keeping the tree buildable at all times) would have been the following (moving (a)/#2 to the end, so that by the time the CpuPageTableLib dependency is introduced to CpuDxe, all CpuDxe-dependent DSC files in the tree have a CpuPageTableLib resolution): (b) 1 7bda8c648192 UefiCpuPkg: Duplicated AsmRelocateApLoop as AsmRelocateApLoopAmd 2 4a8642422460 OvmfPkg: Add CpuPageTableLib required by MpInitLib. 3 3f378450dfaf UefiPayloadPkg: Add CpuPageTableLib required by MpInitLib. 4 73ccde8f6d04 UefiCpuPkg: Has APs in 64 bit long-mode before booting to OS. Therefore the right order to revert is the inverse of (b), and not the inverse of (a): git revert 73ccde8f6d04 # UefiCpuPkg: Has APs in 64 bit long-mode before booting to OS. git revert 3f378450dfaf # UefiPayloadPkg: Add CpuPageTableLib required by MpInitLib. git revert 4a8642422460 # OvmfPkg: Add CpuPageTableLib required by MpInitLib. git revert 7bda8c648192 # UefiCpuPkg: Duplicated AsmRelocateApLoop as AsmRelocateApLoopAmd This keeps the tree buildable at every stage of the revert. The importance of this cannot be over-emphasized. If someone investigates a completely unrelated problem in edk2 in a year from now, and they don't even know what package the issue could be in, so they can't restrict "git-bisect" to a particular package, then their git-bisect run could very well land on one of the non-building commits, at some point. The present UefiCpuPkg commits may be totally irrelevant for their problem, but they won't be able to tell or test that, because the tree will simply not build for them. "git-bisect" supports a command called "git bisect skip", which more or less deals with such situations, by picking a "nearby" commit to try, but that's really just a kludge. Laszlo ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [edk2-devel] [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB 2023-01-06 8:03 ` Gerd Hoffmann 2023-01-06 8:30 ` Laszlo Ersek 2023-01-06 8:43 ` Yuanhao Xie @ 2023-01-06 15:42 ` Lendacky, Thomas 2 siblings, 0 replies; 17+ messages in thread From: Lendacky, Thomas @ 2023-01-06 15:42 UTC (permalink / raw) To: Gerd Hoffmann, Laszlo Ersek; +Cc: devel, ray.ni, ardb@kernel.org, Xie, Yuanhao On 1/6/23 02:03, Gerd Hoffmann wrote: > On Fri, Jan 06, 2023 at 07:42:20AM +0100, Laszlo Ersek wrote: >> On 1/6/23 05:12, Ni, Ray wrote: >>> >>> Ard, >>> >>> Only AMD X64 (including SEV and without SEV) runs the code that >>> switches to 32bit paging disabled mode. >>> Intel X64 runs the code that stays at 64bit paging mode. So no need >>> for <4G memory. >>> All IA32 CPUs (including intel and AMD) stays at 32bit paging disabled >>> mode. The AllocateReservedPages() call should not return a memory >>> above 4GB in 32bit env. >> >> This argument about the allocations sounds valid, thanks. >> >> The code still remains incredibly hard to read. It needs serious >> cleanup. >> >> (1) Wherever we have "Amd" in an identifier, let's rename it to "Amd64", >> to better reflect the revised check. > > Maybe even better: Use PcdConfidentialComputingGuestAttr to figure > whenever SEV is active, if so branch into Amd assembler code. Rename > "Amd" to "AmdSev". > > Otherwise just call normal X64 / Ia32 code. > > Amd assembler code can subsequently be simplified, the checks for SEV > are not needed any more (but should not harm either). > > [ Adding Tom to CC ] Yes, I agree with all this. Thanks, Tom > >> Commit 73ccde8f6d04 ("UefiCpuPkg: Has APs in 64 bit long-mode before >> booting to OS.", 2022-12-20) *removed* the executable marking. >> >> (4a) Is that not a problem? > > I think so. > > Booting with strict NX checking (PcdDxeNxMemoryProtectionPolicy = > 0xC000000000007FD5) and "qemu -smp 2" makes my qemu hang with 200% CPU, > so probably both vcpus are spinning in a dead loop. For the BSP this is > expected behavior (buggy grub.efi, see parallel thread). For the AP it > is not, so apparently it is not running idle in hlt like it is supposed > to. > >> Honestly, at this point I'm *even more convinced* that the original >> series should be reverted, and redone from the ground up. > > Yes, "back to drawing board" looks like the best option at this point. > > take care, > Gerd > ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2023-01-06 15:42 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-01-05 6:21 [PATCH] UefiCpuPkg:Fixed AsmRelocateApLoopStart and ensure allocated memory <4GB Yuanhao Xie 2023-01-05 6:28 ` [edk2-devel] " Ni, Ray 2023-01-05 7:19 ` Yuanhao Xie 2023-01-05 9:38 ` Ard Biesheuvel 2023-01-06 4:12 ` Ni, Ray 2023-01-06 6:42 ` Laszlo Ersek 2023-01-06 8:03 ` Gerd Hoffmann 2023-01-06 8:30 ` Laszlo Ersek 2023-01-06 8:39 ` Ni, Ray 2023-01-06 9:19 ` Laszlo Ersek 2023-01-06 9:45 ` Ni, Ray 2023-01-06 10:35 ` Laszlo Ersek 2023-01-06 11:14 ` Gerd Hoffmann 2023-01-06 12:20 ` Laszlo Ersek 2023-01-06 8:43 ` Yuanhao Xie 2023-01-06 9:04 ` Laszlo Ersek 2023-01-06 15:42 ` Lendacky, Thomas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox