From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mout01.posteo.de (mout01.posteo.de [185.67.36.65]) by mx.groups.io with SMTP id smtpd.web10.8741.1647515323455873032 for ; Thu, 17 Mar 2022 04:08:44 -0700 Authentication-Results: mx.groups.io; dkim=fail reason="body hash did not verify" header.i=@posteo.de header.s=2017 header.b=MlBB6ZjR; spf=pass (domain: posteo.de, ip: 185.67.36.65, mailfrom: mhaeuser@posteo.de) Received: from submission (posteo.de [185.67.36.169]) by mout01.posteo.de (Postfix) with ESMTPS id F1C3324002C for ; Thu, 17 Mar 2022 12:08:40 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.de; s=2017; t=1647515321; bh=uBHJjOyWEFY2a5u23lJqrYpGuwCloT+caVfltEty6ng=; h=Date:From:Subject:To:Cc:From; b=MlBB6ZjR8eIlo3DLzUUhTbZOBMSdR6WQH3WjMYzPF3wIZ/80BLUzwQ9f2jEbq7YFi PsOk/e4XUmIXok87f0OxJOq6nMxm34Xdm8dir/qtObjxcWWmNtNNpwwZAOQAcf10LS 6nuHPUlUQMbJLUAFekaqPqBPyAXkHTz7K1F04qLJBAVzqA8ZMbKpS0bexELpQTMJyk 8ZHV+zhnXomfN3S+tPJauLvEhtAlNZThRogQuyDYAjIgGCqjcgdc5ALWPik5rrZIG6 du6jELgbChP34Y76mLf0m0ScYaP3EuivW3gFeIKdWPlMqHCWE8vFpph8D3OZN8wQFy zEC5KAwao4OBg== Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4KK4Bt3Kzwz9rxD; Thu, 17 Mar 2022 12:08:38 +0100 (CET) Message-ID: <944dec73-e095-601e-7331-38d2c72f013f@posteo.de> Date: Thu, 17 Mar 2022 11:08:37 +0000 MIME-Version: 1.0 From: =?UTF-8?B?TWFydmluIEjDpHVzZXI=?= Subject: Re: [edk2-devel][PATCH] MdeModulePkg: Make RSP 16-byte boundary aligned for PEI 64bit To: devel@edk2.groups.io, ted.kuo@intel.com Cc: "Bi, Dandan" , "Gao, Liming" , "De, Debkumar" , "Han, Harry" , "West, Catharine" , "Wang, Jian J" , "S, Ashraf Ali" , "Kinney, Michael D" References: In-Reply-To: Content-Language: en-GB Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable Good day, > On 17. Mar 2022, at 02:05, Kuo, Ted wrote: > > =EF=BB=BFHi Liming and Mike, > > Can you please review the change? > > Thanks, > Ted > > -----Original Message----- > From: devel@edk2.groups.io On Behalf Of Kuo, Ted > Sent: Thursday, March 10, 2022 2:21 PM > To: devel@edk2.groups.io > Cc: Bi, Dandan ; Gao, Liming=20 > ; De, Debkumar ; Han,=20 > Harry ; West, Catharine=20 > ; Wang, Jian J ; S,=20 > Ashraf Ali > Subject: [edk2-devel][PATCH] MdeModulePkg: Make RSP 16-byte boundary=20 > aligned for PEI 64bit > > REF:https://bugzilla.tianocore.org/show_bug.cgi?id=3D3865 > Use SwitchPeiCore instead of calling PeiCore directly when switching=20 > PeiCore from temporary memory to permanent memory. For PEI 32bit,=20 > SwitchPeiCore always calls PeiCore without any additional step. For=20 > PEI 64bit, SwitchPeiCore makes RSP 16-byte boundary aligned and then=20 > allocate 32 bytes as a shadow store on call stack before calling PeiCore. > > Cc: Dandan Bi > Cc: Liming Gao > Cc: Debkumar De > Cc: Harry Han > Cc: Catharine West > Cc: Jian J Wang > Cc: Ashraf Ali S > Signed-off-by: Ted Kuo > --- > MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c | =C2=A02 +-=20 > =C2=A0MdeModulePkg/Core/Pei/Ia32/SwitchPeiCore.nasm | 33=20 > +++++++++++++++++++++++ > MdeModulePkg/Core/Pei/PeiMain.h =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0| 25 ++++++++++++++++++ > MdeModulePkg/Core/Pei/PeiMain.inf =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0| =C2=A06 +++++ > MdeModulePkg/Core/Pei/X64/SwitchPeiCore.nasm =C2=A0| 38=20 > +++++++++++++++++++++++++++ > 5 files changed, 103 insertions(+), 1 deletion(-) =C2=A0create mode 10064= 4=20 > MdeModulePkg/Core/Pei/Ia32/SwitchPeiCore.nasm > create mode 100644 MdeModulePkg/Core/Pei/X64/SwitchPeiCore.nasm > > diff --git a/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c=20 > b/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c > index 3552feda8f..5af6e6e86f 100644 > --- a/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c > +++ b/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c > @@ -871,7 +871,7 @@ PeiCheckAndSwitchStack ( > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0// > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0// Entry PEI Phase 2 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0// > - =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0PeiCore (SecCoreData, NULL, Private); > + =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0SwitchPeiCore (SecCoreData, NULL, Private= ); > =C2=A0=C2=A0=C2=A0=C2=A0} else { > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0// > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0// Migrate memory pages allocated in = pre-memory phase. > diff --git a/MdeModulePkg/Core/Pei/Ia32/SwitchPeiCore.nasm=20 > b/MdeModulePkg/Core/Pei/Ia32/SwitchPeiCore.nasm > new file mode 100644 > index 0000000000..23cfb5090b > --- /dev/null > +++ b/MdeModulePkg/Core/Pei/Ia32/SwitchPeiCore.nasm > @@ -0,0 +1,33 @@ > +;---------------------------------------------------------------------- > +-------- > +; > +; Copyright (c) 2022, Intel Corporation. All rights reserved.
; > +SPDX-License-Identifier: BSD-2-Clause-Patent ; ; Abstract: > +; > +; =C2=A0=C2=A0Switch PeiCore from temporary memory to permanent memory. > +; > +;---------------------------------------------------------------------- > +-------- > + > + =C2=A0=C2=A0=C2=A0SECTION .text > + > +extern ASM_PFX(PeiCore) > + > +;---------------------------------------------------------------------- > +-------- > +; VOID > +; EFIAPI > +; SwitchPeiCore ( > +; =C2=A0=C2=A0EFI_SEC_PEI_HAND_OFF =C2=A0=C2=A0=C2=A0*SecCoreDataPtr, > +; =C2=A0=C2=A0EFI_PEI_PPI_DESCRIPTOR =C2=A0*PpiList, > +; =C2=A0=C2=A0VOID =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*Data > +; =C2=A0=C2=A0); > +;---------------------------------------------------------------------- > +-------- > +global ASM_PFX(SwitchPeiCore) > +ASM_PFX(SwitchPeiCore): > + =C2=A0push =C2=A0=C2=A0DWORD [esp + 12] > + =C2=A0push =C2=A0=C2=A0DWORD [esp + 12] > + =C2=A0push =C2=A0=C2=A0DWORD [esp + 12] > + =C2=A0call =C2=A0=C2=A0ASM_PFX(PeiCore) > + =C2=A0jmp =C2=A0=C2=A0=C2=A0$ =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0; Should never reach her= e > + =C2=A0ret > + I think there were efforts in the past to avoid ASM whenever possible.=20 Can=E2=80=99t this just remain a C function (for IA32 only of course) and i= f=20 not, wouldn't a simple jmp instruction be sufficient? > diff --git a/MdeModulePkg/Core/Pei/PeiMain.h=20 > b/MdeModulePkg/Core/Pei/PeiMain.h index 556beddad5..8e8ed3dadf 100644 > --- a/MdeModulePkg/Core/Pei/PeiMain.h > +++ b/MdeModulePkg/Core/Pei/PeiMain.h > @@ -2038,4 +2038,29 @@ PeiReinitializeFv ( > =C2=A0=C2=A0IN =C2=A0PEI_CORE_INSTANCE =C2=A0*PrivateData > =C2=A0=C2=A0); > > +/** > + =C2=A0This routine is invoked by main entry of PeiMain module during > +transition > + =C2=A0from temporary memory to permanent memory. > + > + =C2=A0@param SecCoreDataPtr =C2=A0Points to a data structure containing= =20 > information about the PEI core's operating > + =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= environment, such as the size and location=20 > of temporary RAM, the stack location and > + =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= the BFV location. > + =C2=A0@param PpiList =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0Po= ints to a list of one or more PPI=20 > descriptors to be installed initially by the PEI core. > + =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= An empty PPI list consists of a single=20 > descriptor with the end-tag > + =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= EFI_PEI_PPI_DESCRIPTOR_TERMINATE_LIST. As=20 > part of its initialization > + =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= phase, the PEI Foundation will add these=20 > SEC-hosted PPIs to its PPI database such > + =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= that both the PEI Foundation and any modules=20 > can leverage the associated service > + =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= calls and/or code in these early PPIs > + =C2=A0@param Data =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0Pointer to old core data that is used to=20 > initialize the > + =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= core's data areas. > + =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= If NULL, it is first PeiCore entering. > + > +**/ > +VOID > +EFIAPI > +SwitchPeiCore ( > + =C2=A0IN CONST EFI_SEC_PEI_HAND_OFF =C2=A0=C2=A0=C2=A0*SecCoreDataPtr, > + =C2=A0IN CONST EFI_PEI_PPI_DESCRIPTOR =C2=A0*PpiList, > + =C2=A0IN VOID =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0*Data > + =C2=A0); > #endif > diff --git a/MdeModulePkg/Core/Pei/PeiMain.inf=20 > b/MdeModulePkg/Core/Pei/PeiMain.inf > index 0cf357371a..b597aed8f6 100644 > --- a/MdeModulePkg/Core/Pei/PeiMain.inf > +++ b/MdeModulePkg/Core/Pei/PeiMain.inf > @@ -47,6 +47,12 @@ > =C2=A0=C2=A0PciCfg2/PciCfg2.c > =C2=A0=C2=A0PeiMain.h > > +[Sources.IA32] > + =C2=A0Ia32/SwitchPeiCore.nasm > + > +[Sources.X64] > + =C2=A0X64/SwitchPeiCore.nasm > + > [Packages] > =C2=A0=C2=A0MdePkg/MdePkg.dec > =C2=A0=C2=A0MdeModulePkg/MdeModulePkg.dec > diff --git a/MdeModulePkg/Core/Pei/X64/SwitchPeiCore.nasm=20 > b/MdeModulePkg/Core/Pei/X64/SwitchPeiCore.nasm > new file mode 100644 > index 0000000000..94e09be757 > --- /dev/null > +++ b/MdeModulePkg/Core/Pei/X64/SwitchPeiCore.nasm > @@ -0,0 +1,38 @@ > +;---------------------------------------------------------------------- > +-------- > +; > +; Copyright (c) 2022, Intel Corporation. All rights reserved.
; > +SPDX-License-Identifier: BSD-2-Clause-Patent ; ; Abstract: > +; > +; =C2=A0=C2=A0Switch PeiCore from temporary memory to permanent memory. > +; > +;---------------------------------------------------------------------- > +-------- > + > + =C2=A0=C2=A0=C2=A0SECTION .text > + > +extern ASM_PFX(PeiCore) > + > +;---------------------------------------------------------------------- > +-------- > +; VOID > +; EFIAPI > +; SwitchPeiCore ( > +; =C2=A0=C2=A0EFI_SEC_PEI_HAND_OFF =C2=A0=C2=A0=C2=A0*SecCoreDataPtr, > +; =C2=A0=C2=A0EFI_PEI_PPI_DESCRIPTOR =C2=A0*PpiList, > +; =C2=A0=C2=A0VOID =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*Data > +; =C2=A0=C2=A0); > +;---------------------------------------------------------------------- > +-------- > +global ASM_PFX(SwitchPeiCore) > +ASM_PFX(SwitchPeiCore): > + =C2=A0; > + =C2=A0; Per X64 calling convention, make sure RSP is 16-byte aligned. > + =C2=A0; > + =C2=A0mov =C2=A0=C2=A0=C2=A0rax, rsp > + =C2=A0and =C2=A0=C2=A0=C2=A0rax, 0fh > + =C2=A0sub =C2=A0=C2=A0=C2=A0rsp, rax > + > + =C2=A0sub =C2=A0=C2=A0=C2=A0rsp, 20h =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0; Allocate 32 bytes as a shadow store on=20 > call stack > + =C2=A0call =C2=A0=C2=A0ASM_PFX(PeiCore) > + =C2=A0jmp =C2=A0=C2=A0=C2=A0$ =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0; Should never reach her= e > + =C2=A0ret This might be my own ignorance, but this approach really worries me. The=20 stack is correctly aligned by this function, ok. But this code is called=20 from C, and the stack is not fully switched (only aligned relative to=20 the current location), and no architectural mode switch seems to occur,=20 which means 64-bit code already ran with an unaligned stack (otherwise=20 it would not need to be aligned here), agreed? The stack is last switched by SecCore's TemporaryRamMigration() to=20 TopOfNewStack - TemporaryStackSize. By ABI constraints,=20 TemporaryStackSize must be 16-Byte-aligned (is it?). Furthermore, we=20 have TopOfNewStack =3D Private->PhysicalMemoryBegin + NewStackSize.=20 NewStackSize is manually aligned to 4K (ok), but may be shortened to=20 PcdGet32 (PcdPeiCoreMaxPeiStackSize), which also must be 16-Byte-aligned=20 for ABI constraints (is it?). Private->PhysicalMemoryBegin is where my=20 knowledge falls flat, which guarantees hold for it? If it's not=20 guaranteed to be 16-Byte-aligned, this looks like a culprit and the=20 stack should be aligned in TemporaryRamMigration(). Otherwise, may one=20 of the constraints I mentioned not hold? Otherwise, does some ASM=20 in-between bork the stack alignment? Or am I completely lost as to what=20 is going on? :) Thanks! Best regards, Marvin > + > -- > 2.16.2.windows.1 > > > > > > > > > >=20 > >