From: Laszlo Ersek <lersek@redhat.com>
To: Jeff Fan <jeff.fan@intel.com>, edk2-devel@ml01.01.org
Cc: Feng Tian <feng.tian@intel.com>,
	Michael D Kinney <michael.d.kinney@intel.com>
Subject: Re: [PATCH 2/3] UefiCpuPkg/DxeMpLib: Allocate new safe stack < 4GB
Date: Wed, 23 Nov 2016 19:31:28 +0100
Message-ID: <25db000e-ad85-2fd6-6936-e0147e7c20b4@redhat.com>
In-Reply-To: <20161123140138.15976-3-jeff.fan@intel.com>

On 11/23/16 15:01, Jeff Fan wrote:
> For long mode DXE, we will disable paging on AP to protected mode to execute AP
> safe loop code in ACPI NVS range under 4GB. But we forget to allocate stack for
> AP under 4GB and AP still are using original AP stack. If original AP stack is
> larger than 4GB, it cannot be used after AP is transferred to protected mode.
> 
> Moreover, even though AP stack is always under 4GB on protected mode DXE, AP
> stack(in BootServiceData) maybe crashed by OS after Exit Boot Service event.
> 
> This fix is to allocate ACPI NVS range under 4GB together with AP safe loop
> code. APs will switch to new stack at beginning of safe loop code.

(1) We are actually allocating EfiReservedMemoryType, not ACPI NVS --
please update the commit message.

(2) For a minute I was confused about using the stack *at all* in the
HLT loop. But looking further down, and reading up on the MONITOR
instruction, I see that the MWAIT loops in both the Ia32 and X64 source
files use the stack inherently: the monitored address is taken from the
stack pointer. So, the stack usage is unavoidable in the
MwaitSupport==TRUE case.

Can you mention in the commit message that the stack usage is only
really necessary for MwaitSupport==TRUE, and in the HLT loop mode, it
could be technically avoided?

(I'm not suggesting that we implement a separate solution for the HLT
loop mode, but we should be clear about the true incentive for this patch.)

> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Feng Tian <feng.tian@intel.com>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Jeff Fan <jeff.fan@intel.com>
> ---
>  UefiCpuPkg/Library/MpInitLib/DxeMpLib.c        | 19 +++++++++++++++++--
>  UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm |  9 +++++++--
>  UefiCpuPkg/Library/MpInitLib/MpLib.h           |  3 ++-
>  UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm  |  3 +++
>  4 files changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> index a0d5eeb..d0f9f7e 100644
> --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> @@ -18,6 +18,7 @@
>  #include <Library/UefiBootServicesTableLib.h>
>  
>  #define  AP_CHECK_INTERVAL     (EFI_TIMER_PERIOD_MILLISECONDS (100))
> +#define  AP_SAFT_STACK_SIZE    128

(3) Is "SAFT" a typo? Did you mean "SAFE"?

>  
>  CPU_MP_DATA      *mCpuMpData = NULL;
>  EFI_EVENT        mCheckAllApsEvent = NULL;
> @@ -25,6 +26,7 @@ EFI_EVENT        mMpInitExitBootServicesEvent = NULL;
>  EFI_EVENT        mLegacyBootEvent = NULL;
>  volatile BOOLEAN mStopCheckAllApsStatus = TRUE;
>  VOID             *mReservedApLoopFunc = NULL;
> +UINTN            mReservedTopOfApStack;
>  
>  /**
>    Get the pointer to CPU MP Data structure.
> @@ -241,11 +243,18 @@ RelocateApLoop (
>    CPU_MP_DATA            *CpuMpData;
>    BOOLEAN                MwaitSupport;
>    ASM_RELOCATE_AP_LOOP   AsmRelocateApLoopFunc;
> +  UINTN                  ProcessorNumber;
>  
> +  MpInitLibWhoAmI (&ProcessorNumber); 
>    CpuMpData    = GetCpuMpData ();
>    MwaitSupport = IsMwaitSupport ();
>    AsmRelocateApLoopFunc = (ASM_RELOCATE_AP_LOOP) (UINTN) mReservedApLoopFunc;
> -  AsmRelocateApLoopFunc (MwaitSupport, CpuMpData->ApTargetCState, CpuMpData->PmCodeSegment);
> +  AsmRelocateApLoopFunc (
> +    MwaitSupport,
> +    CpuMpData->ApTargetCState,
> +    CpuMpData->PmCodeSegment,
> +    mReservedTopOfApStack - ProcessorNumber * AP_SAFT_STACK_SIZE
> +    );
>    //
>    // It should never reach here
>    //
> @@ -289,6 +298,7 @@ InitMpGlobalData (
>  {
>    EFI_STATUS                 Status;
>    EFI_PHYSICAL_ADDRESS       Address;
> +  UINTN                      ApSafeBufferSize;
>  
>    SaveCpuMpData (CpuMpData);
>  
> @@ -307,16 +317,21 @@ InitMpGlobalData (
>    // Allocating it in advance since memory services are not available in
>    // Exit Boot Services callback function.
>    //
> +  ApSafeBufferSize  = CpuMpData->AddressMap.RelocateApLoopFuncSize;
> +  ApSafeBufferSize += CpuMpData->CpuCount * AP_SAFT_STACK_SIZE;
> +

(So, I think this could be theoretically avoided for the HLT loop case,
but I'm fine if we do it for both loop styles, for simplicity.)

>    Address = BASE_4GB - 1;
>    Status  = gBS->AllocatePages (
>                     AllocateMaxAddress,
>                     EfiReservedMemoryType,
> -                   EFI_SIZE_TO_PAGES (sizeof (CpuMpData->AddressMap.RelocateApLoopFuncSize)),
> +                   EFI_SIZE_TO_PAGES (ApSafeBufferSize),
>                     &Address
>                     );
>    ASSERT_EFI_ERROR (Status);
>    mReservedApLoopFunc = (VOID *) (UINTN) Address;
>    ASSERT (mReservedApLoopFunc != NULL);
> +  mReservedTopOfApStack = (UINTN) Address + EFI_SIZE_TO_PAGES (ApSafeBufferSize) * SIZE_4KB;

Ah, here you are moving the stacks to the end of the allocated area, so
if there's any gap left after the assembly code and the stacks, the gap
is now moved into the middle. Looks okay.
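
To double-check the layout described above, here is a minimal standalone
sketch in plain C (not EDK II code). The function names, the base
address used in the usage note, and the AP_SAFE_STACK_SIZE spelling are
my own assumptions for illustration; only the arithmetic is taken from
the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_4K        4096u
/* Per-AP stack size from the patch (spelled AP_SAFT_STACK_SIZE there). */
#define AP_SAFE_STACK_SIZE  128u

/* Round a byte count up to whole 4 KiB pages (stand-in for EFI_SIZE_TO_PAGES). */
uint64_t SizeToPages(uint64_t Size) {
  return (Size + PAGE_SIZE_4K - 1) / PAGE_SIZE_4K;
}

/* Top of the reserved buffer: base plus the page-rounded allocation size. */
uint64_t ReservedTopOfApStack(uint64_t Base, uint64_t FuncSize, uint64_t CpuCount) {
  uint64_t BufferSize = FuncSize + CpuCount * AP_SAFE_STACK_SIZE;
  return Base + SizeToPages(BufferSize) * PAGE_SIZE_4K;
}

/* Initial stack pointer for a given processor number (stacks grow downward). */
uint64_t ApStackTop(uint64_t Top, uint64_t ProcessorNumber, uint64_t CpuCount) {
  assert(ProcessorNumber < CpuCount);
  return Top - ProcessorNumber * AP_SAFE_STACK_SIZE;
}

/* Unused bytes between the end of the copied loop code and the lowest stack. */
uint64_t GapBetweenCodeAndStacks(uint64_t Base, uint64_t FuncSize, uint64_t CpuCount) {
  uint64_t Top = ReservedTopOfApStack(Base, FuncSize, CpuCount);
  uint64_t LowestStackBottom = Top - CpuCount * AP_SAFE_STACK_SIZE;
  return LowestStackBottom - (Base + FuncSize);
}
```

For example, with a hypothetical base of 0x7FFF0000, 200 bytes of loop
code and 4 processors, the buffer rounds up to one page, the stacks
occupy the last 512 bytes of that page, and the gap sits between the
code and the lowest stack.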

(4) I propose to replace the multiplication with the more idiomatic

  EFI_PAGES_TO_SIZE (EFI_SIZE_TO_PAGES (ApSafeBufferSize))

or else

  ALIGN_VALUE (ApSafeBufferSize, EFI_PAGE_SIZE)

Although the current code is correct too.
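
As a sanity check that the three spellings agree, here is a small
standalone sketch in plain C. The Sketch* helpers are local stand-ins
that I modeled on the EDK II macros as I remember them; they are
assumptions for illustration, and the real definitions live in the
MdePkg headers.

```c
#include <stdint.h>

/* Local stand-ins for EFI_PAGE_SIZE / EFI_PAGE_SHIFT / EFI_PAGE_MASK. */
#define SKETCH_PAGE_SIZE   0x1000u
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_PAGE_MASK   0xFFFu

/* Modeled on EFI_SIZE_TO_PAGES: number of pages needed to hold Size bytes. */
uint64_t SketchSizeToPages(uint64_t Size) {
  return (Size >> SKETCH_PAGE_SHIFT) + ((Size & SKETCH_PAGE_MASK) ? 1 : 0);
}

/* Modeled on EFI_PAGES_TO_SIZE: byte size of a whole number of pages. */
uint64_t SketchPagesToSize(uint64_t Pages) {
  return Pages << SKETCH_PAGE_SHIFT;
}

/* Modeled on ALIGN_VALUE: round Value up to a power-of-two Alignment. */
uint64_t SketchAlignValue(uint64_t Value, uint64_t Alignment) {
  return Value + ((Alignment - Value) & (Alignment - 1));
}

/* Form used in the patch: page count multiplied by 4 KiB. */
uint64_t RoundByMultiply(uint64_t Size) {
  return SketchSizeToPages(Size) * SKETCH_PAGE_SIZE;
}

/* First proposed form: pages-to-size of size-to-pages. */
uint64_t RoundByPagesToSize(uint64_t Size) {
  return SketchPagesToSize(SketchSizeToPages(Size));
}

/* Second proposed form: align the size up to the page size. */
uint64_t RoundByAlignValue(uint64_t Size) {
  return SketchAlignValue(Size, SKETCH_PAGE_SIZE);
}
```

All three return the same page-rounded size for any input, so the choice
is purely about readability.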

> +  ASSERT ((mReservedTopOfApStack & (UINTN)(CPU_STACK_ALIGNMENT - 1)) == 0);
>    CopyMem (
>      mReservedApLoopFunc,
>      CpuMpData->AddressMap.RelocateApLoopFuncAddress,
> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> index 64e51d8..4e55760 100644
> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> @@ -217,14 +217,19 @@ RendezvousFunnelProcEnd:
>  global ASM_PFX(AsmRelocateApLoop)
>  ASM_PFX(AsmRelocateApLoop):
>  AsmRelocateApLoopStart:
> -    cmp        byte [esp + 4], 1
> +    push       ebp

(5) Why do you save EBP on the stack? (Sorry if it is trivial, my
assembly is not that great.) Also, note that it is saved on the original
AP stack, before ESP is switched to the new one.

> +    mov        ebp, esp
> +    mov        ecx, [ebp + 8]      ; MwaitSupport
> +    mov        ebx, [ebp + 12]     ; ApTargetCState
> +    mov        esp, [ebp + 20]     ; TopOfApStack
> +    cmp        cl,  1              ; Check mwait-monitor support
>      jnz        HltLoop
>  MwaitLoop:
>      mov        eax, esp
>      xor        ecx, ecx
>      xor        edx, edx
>      monitor
> -    mov        eax, [esp + 8]    ; Mwait Cx, Target C-State per eax[7:4]
> +    mov        eax, ebx            ; Mwait Cx, Target C-State per eax[7:4]
>      shl        eax, 4
>      mwait
>      jmp        MwaitLoop

(6) These code changes look okay to me, but are they necessary in the
32-bit case as well? The original AP stack is under 4GB too.

Oh wait, I understand. Our purpose is dual here. The stack space used by
MONITOR should be *both* under 4GB *and* in reserved type memory.

So, the code is fine, but can you please modify the following sentence
in the commit message:

    Moreover, even though AP stack is always under 4GB on protected
    mode DXE, ...

to:

    Moreover, even though AP stack is always under 4GB (a) in Ia32 DXE
    and (b) with this patch, after transferring to protected mode from
    X64 DXE, ...

It should be clear from the commit message that for Ia32, we solve
problem #2 (memory type for the MONITOR area), while for X64, we solve
problem #1 (address range) and problem #2 (memory type) *both*.

> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h
> index f73a469..e6dea18 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
> @@ -250,7 +250,8 @@ VOID
>  (EFIAPI * ASM_RELOCATE_AP_LOOP) (
>    IN BOOLEAN                 MwaitSupport,
>    IN UINTN                   ApTargetCState,
> -  IN UINTN                   PmCodeSegment
> +  IN UINTN                   PmCodeSegment,
> +  IN UINTN                   TopOfApStack
>    );
>  
>  /**
> diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> index aaabb50..7c8fa45 100644
> --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> @@ -224,6 +224,9 @@ RendezvousFunnelProcEnd:
>  global ASM_PFX(AsmRelocateApLoop)
>  ASM_PFX(AsmRelocateApLoop):
>  AsmRelocateApLoopStart:
> +    push       rbp

(7) Again, not sure why we're saving RBP (on the current AP stack).

> +    mov        rbp, rsp
> +    mov        rsp, r9
>      push       rcx
>      push       rdx
>  
> 

This looks okay.


I have some more questions about the preexistent code:

(8) The MONITOR instruction seems to set up an address *range* for
monitoring with MWAIT. EAX provides the base address of the range, and
we point it to our new stack. However, what determines the *size* of the
address range? Obviously, it must fit in our new stack.

(9) In the original (pre-patch) X64 code, I see this:

MwaitLoop:
    mov        eax, esp           ; Set Monitor Address
    xor        ecx, ecx           ; ecx = 0
    xor        edx, edx           ; edx = 0
    monitor
    shl        ebx, 4
    mov        eax, ebx           ; Mwait Cx, Target C-State per eax[7:4]
    mwait
    jmp        MwaitLoop

I think this is wrong: EBX is supposed to contain ApTargetCState. In
order to pass it to MWAIT, it has to be shifted left by four bit
positions, and moved to EAX, yes.

But, *unlike* in the Ia32 case, in the X64 code you shift EBX itself,
and then you move the result from EBX to EAX. (In the Ia32 case, you
move EBX to EAX first, and then shift EAX.) This means that from the
second iteration on, EBX no longer holds ApTargetCState (it accumulates
another 4-bit shift on every pass), and MWAIT won't be configured
correctly.

EBX must be treated read-only in the loop, in my opinion. It should be
fixed in a separate patch.
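
To illustrate the difference, here is a small plain C model of the two
loop bodies. It is only a sketch of the register arithmetic, not the
real assembly; the function name and parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Return the value that would be in EAX when MWAIT executes on the given
 * (1-based) loop iteration.
 *
 * ShiftInPlace != 0 models the pre-patch X64 loop body:
 *     shl ebx, 4 ; mov eax, ebx
 * ShiftInPlace == 0 models the Ia32 loop body:
 *     mov eax, ebx ; shl eax, 4
 */
uint32_t MwaitHintOnIteration(uint32_t ApTargetCState, unsigned Iteration, int ShiftInPlace) {
  uint32_t Ebx = ApTargetCState;  /* EBX is loaded once, before the loop */
  uint32_t Eax = 0;

  for (unsigned i = 0; i < Iteration; i++) {
    if (ShiftInPlace) {
      Ebx <<= 4;   /* clobbers the saved C-state on every pass */
      Eax = Ebx;
    } else {
      Eax = Ebx;   /* EBX stays read-only */
      Eax <<= 4;
    }
  }
  return Eax;
}
```

With ApTargetCState == 2, the copy-then-shift variant yields 0x20 on
every iteration, while the shift-in-place variant yields 0x20, then
0x200, and so on, until the bits are shifted out of the 32-bit register
entirely.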

(10) The X64 HLT loop goes like this:

HltLoop:
    cli
    hlt
    jmp        HltLoop
    ret

Can we remove the "ret" (in a separate patch)?

Thanks!
Laszlo

