From: "Fan, Jeff" <jeff.fan@intel.com>
To: Laszlo Ersek <lersek@redhat.com>,
"edk2-devel@ml01.01.org" <edk2-devel@ml01.01.org>
Cc: "Tian, Feng" <feng.tian@intel.com>,
"Kinney, Michael D" <michael.d.kinney@intel.com>
Subject: Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB
Date: Thu, 24 Nov 2016 02:23:22 +0000 [thread overview]
Message-ID: <542CF652F8836A4AB8DBFAAD40ED192A4A2E686B@shsmsx102.ccr.corp.intel.com> (raw)
In-Reply-To: <25db000e-ad85-2fd6-6936-e0147e7c20b4@redhat.com>
Laszlo,
Thanks your comments. I update the feedback as below in [Jeff]
Jeff
-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Thursday, November 24, 2016 2:31 AM
To: Fan, Jeff; edk2-devel@ml01.01.org
Cc: Tian, Feng; Kinney, Michael D
Subject: Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB
On 11/23/16 15:01, Jeff Fan wrote:
> For long mode DXE, we will disable paging on AP to protected mode to
> execute AP safe loop code in ACPI NVS range under 4GB. But we forget
> to allocate stack for AP under 4GB and AP still are using original AP
> stack. If original AP stack is larger than 4GB, it cannot be used after AP is transferred to protected mode.
>
> Moreover, even though AP stack is always under 4GB on protected mode
> DXE, AP stack(in BootServiceData) maybe crashed by OS after Exit Boot Service event.
>
> This fix is to allocate ACPI NVS range under 4GB together with AP safe
> loop code. APs will switch to new stack at beginning of safe loop code.
(1) We are actually allocating EfiReservedMemoryType -- please update the commit message.
[Jeff] OK
(2) For a minute I was confused about using the stack *at all* in the HLT loop. But looking farther down, and reading up on the MONITOR statement, I see that the MWAIT loops in both the Ia32 and X64 source files use the stack inherentily. So, the stack usage is unavoidable in the MwaitSupport==TRUE case.
Can you mention in the commit message that the stack usage is only really necessary for MwaitSupport==TRUE, and in the HLT loop mode, it could be technically avoided?
[Jeff] OK
(I'm not suggesting that we implement a separate solution for the HLT loop mode, but we should be clear about the true incentive for this patch.)
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Feng Tian <feng.tian@intel.com>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Jeff Fan <jeff.fan@intel.com>
> ---
> UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 19 +++++++++++++++++--
> UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm | 9 +++++++--
> UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 ++-
> UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 3 +++
> 4 files changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> index a0d5eeb..d0f9f7e 100644
> --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> @@ -18,6 +18,7 @@
> #include <Library/UefiBootServicesTableLib.h>
>
> #define AP_CHECK_INTERVAL (EFI_TIMER_PERIOD_MILLISECONDS (100))
> +#define AP_SAFT_STACK_SIZE 128
(3) Is "SAFT" a typo? Did you mean "SAFE"?
[Jeff] Yes. Will correct it.
>
> CPU_MP_DATA *mCpuMpData = NULL;
> EFI_EVENT mCheckAllApsEvent = NULL;
> @@ -25,6 +26,7 @@ EFI_EVENT mMpInitExitBootServicesEvent = NULL;
> EFI_EVENT mLegacyBootEvent = NULL;
> volatile BOOLEAN mStopCheckAllApsStatus = TRUE;
> VOID *mReservedApLoopFunc = NULL;
> +UINTN mReservedTopOfApStack;
>
> /**
> Get the pointer to CPU MP Data structure.
> @@ -241,11 +243,18 @@ RelocateApLoop (
> CPU_MP_DATA *CpuMpData;
> BOOLEAN MwaitSupport;
> ASM_RELOCATE_AP_LOOP AsmRelocateApLoopFunc;
> + UINTN ProcessorNumber;
>
> + MpInitLibWhoAmI (&ProcessorNumber);
> CpuMpData = GetCpuMpData ();
> MwaitSupport = IsMwaitSupport ();
> AsmRelocateApLoopFunc = (ASM_RELOCATE_AP_LOOP) (UINTN)
> mReservedApLoopFunc;
> - AsmRelocateApLoopFunc (MwaitSupport, CpuMpData->ApTargetCState,
> CpuMpData->PmCodeSegment);
> + AsmRelocateApLoopFunc (
> + MwaitSupport,
> + CpuMpData->ApTargetCState,
> + CpuMpData->PmCodeSegment,
> + mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE
> + );
> //
> // It should never reach here
> //
> @@ -289,6 +298,7 @@ InitMpGlobalData ( {
> EFI_STATUS Status;
> EFI_PHYSICAL_ADDRESS Address;
> + UINTN ApSafeBufferSize;
>
> SaveCpuMpData (CpuMpData);
>
> @@ -307,16 +317,21 @@ InitMpGlobalData (
> // Allocating it in advance since memory services are not available in
> // Exit Boot Services callback function.
> //
> + ApSafeBufferSize = CpuMpData->AddressMap.RelocateApLoopFuncSize;
> + ApSafeBufferSize += CpuMpData->CpuCount * AP_SAFT_STACK_SIZE;
> +
(So, I think this could be theoretically avoided for the HLT loop case, but I'm fine if we do it for both loop styles, for simplicity.)
> Address = BASE_4GB - 1;
> Status = gBS->AllocatePages (
> AllocateMaxAddress,
> EfiReservedMemoryType,
> - EFI_SIZE_TO_PAGES (sizeof (CpuMpData->AddressMap.RelocateApLoopFuncSize)),
> + EFI_SIZE_TO_PAGES (ApSafeBufferSize),
> &Address
> );
> ASSERT_EFI_ERROR (Status);
> mReservedApLoopFunc = (VOID *) (UINTN) Address;
> ASSERT (mReservedApLoopFunc != NULL);
> + mReservedTopOfApStack = (UINTN) Address + EFI_SIZE_TO_PAGES
> + (ApSafeBufferSize) * SIZE_4KB;
Ah, here you are moving the stacks to the end of the allocated area, so if there's any gap left after the assembly code and the stacks, the gap is now moved into the middle. Looks okay.
(4) I propose to replace the multiplication with the more idiomatic
EFI_PAGES_TO_SIZE (EFI_SIZE_TO_PAGES (ApSafeBufferSize))
[Jeff] Will correct it.
or else
ALIGN_VALUE (ApSafeBufferSize, EFI_PAGE_SIZE)
Although the current code is correct too.
> + ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1))
> + == 0);
> CopyMem (
> mReservedApLoopFunc,
> CpuMpData->AddressMap.RelocateApLoopFuncAddress,
> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> index 64e51d8..4e55760 100644
> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> @@ -217,14 +217,19 @@ RendezvousFunnelProcEnd:
> global ASM_PFX(AsmRelocateApLoop)
> ASM_PFX(AsmRelocateApLoop):
> AsmRelocateApLoopStart:
> - cmp byte [esp + 4], 1
> + push ebp
(5) Why do you save EBP on the stack? (Sorry if it is trivial, my assembly is not that great.) And, it is saved on the original AP stack.
[Jeff] Good point. It could be saved in new stack and used for stack trace. I will fix it.
> + mov ebp, esp
> + mov ecx, [ebp + 8] ; MwaitSupport
> + mov ebx, [ebp + 12] ; ApTargetCState
> + mov esp, [ebp + 20] ; TopOfApStack
> + cmp cl, 1 ; Check mwait-monitor support
> jnz HltLoop
> MwaitLoop:
> mov eax, esp
> xor ecx, ecx
> xor edx, edx
> monitor
> - mov eax, [esp + 8] ; Mwait Cx, Target C-State per eax[7:4]
> + mov eax, ebx ; Mwait Cx, Target C-State per eax[7:4]
> shl eax, 4
> mwait
> jmp MwaitLoop
(6) These code changes look okay to me, but are they necessary in the 32-bit case as well? The original AP stack is under 4GB too.
Oh wait, I understand. Our purpose is dual here. The stack space used by MONITOR should be *both* under 4GB *and* in reserved type memory.
So, the code is fine, but can you please modify the following sentence in the commit message:
Moreover, even though AP stack is always under 4GB on protected
mode DXE, ...
to:
Moreover, even though AP stack is always under 4GB (a) in Ia32 DXE
and (b) with this patch, after transfering to protected mode from
X64 DXE, ...
It should be clear from the commit message that for Ia32, we solve problem #2 (memory type for the MONITOR area), while for X64, we solve problem #1 (address range) and problem #2 (memory type) *both*.
[Jeff] OK.
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h
> b/UefiCpuPkg/Library/MpInitLib/MpLib.h
> index f73a469..e6dea18 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
> @@ -250,7 +250,8 @@ VOID
> (EFIAPI * ASM_RELOCATE_AP_LOOP) (
> IN BOOLEAN MwaitSupport,
> IN UINTN ApTargetCState,
> - IN UINTN PmCodeSegment
> + IN UINTN PmCodeSegment,
> + IN UINTN TopOfApStack
> );
>
> /**
> diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> index aaabb50..7c8fa45 100644
> --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> @@ -224,6 +224,9 @@ RendezvousFunnelProcEnd:
> global ASM_PFX(AsmRelocateApLoop)
> ASM_PFX(AsmRelocateApLoop):
> AsmRelocateApLoopStart:
> + push rbp
(7) Again, not sure why we're saving RBP (on the current AP stack).
[Jeff] For x64, saving RBP is not necessary. It cannot used to do stack trace. I will remove it.
> + mov rbp, rsp
> + mov rsp, r9
> push rcx
> push rdx
>
>
This looks okay.
I have some more questions about the preexistent code:
(8) The MONITOR statement seems to set up an address *range* for monitoring with MWAIT. EAX provides the base address of the range, and we point it to our new stack. However, what determines the *size* of the address range? Obviously, it must fit in our new stack.
[Jeff] Yes. Monitor has size requirement to avoid fault wakeup. But this case, fault wakeup has no any real impact. It has rare case to write the memory. Even though fault wakeup happens, safe mwait-loop could make sure AP enter into C-state
again.
(9) In the original (pre-patch) X64 code, I see this:
MwaitLoop:
mov eax, esp ; Set Monitor Address
xor ecx, ecx ; ecx = 0
xor edx, edx ; edx = 0
monitor
shl ebx, 4
mov eax, ebx ; Mwait Cx, Target C-State per eax[7:4]
mwait
jmp MwaitLoop
I think this is wrong: EBX is supposed to contain ApTargetCState. In order to pass it to MWAIT, it has to be shifted left four bit positions, and moved to EAX, yes.
But, *unlike* in the Ia32 case, in the X64 code you shift EBX itself, and then you move the result from EBX to EAX. (In the Ia32 case, you move EBX to EAX first, and then shift EAX). This means that on the second iteration, EBX will contain garbage, and MWAIT won't be configured correctly.
EBX must be treated read-only in the loop, in my opinion. It should be fixed in a separate patch.
[Jeff] Good catch. Will fix it.
(10) The X64 HLT loop goes like this:
HltLoop:
cli
hlt
jmp HltLoop
ret
Can we remove the "ret" (in a separate patch)?
[Jeff] OK
Thanks!
Laszlo
next prev parent reply other threads:[~2016-11-24 2:23 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-11-23 14:01 [PATCH 0/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB Jeff Fan
2016-11-23 14:01 ` [PATCH 1/3] UefiCpuPkg/DxeMpLib: Get safe AP loop handler from global variable Jeff Fan
2016-11-23 17:08 ` Laszlo Ersek
2016-11-23 14:01 ` [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB Jeff Fan
2016-11-23 18:31 ` Laszlo Ersek
2016-11-24 2:23 ` Fan, Jeff [this message]
2016-11-24 9:20 ` Laszlo Ersek
2016-11-24 13:48 ` Fan, Jeff
2016-11-24 21:13 ` Laszlo Ersek
2016-11-23 14:01 ` [PATCH 3/3] UefiCpuPkg/DxeMpLib: Make sure APs in safe loop code Jeff Fan
2016-11-23 18:52 ` Laszlo Ersek
2016-11-24 2:37 ` Fan, Jeff
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=542CF652F8836A4AB8DBFAAD40ED192A4A2E686B@shsmsx102.ccr.corp.intel.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox