From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mout02.posteo.de (mout02.posteo.de [185.67.36.66]) by mx.groups.io with SMTP id smtpd.web10.305.1647891989268899485 for ; Mon, 21 Mar 2022 12:46:30 -0700 Authentication-Results: mx.groups.io; dkim=fail reason="body hash did not verify" header.i=@posteo.de header.s=2017 header.b=UWjViqQD; spf=pass (domain: posteo.de, ip: 185.67.36.66, mailfrom: mhaeuser@posteo.de) Received: from submission (posteo.de [185.67.36.169]) by mout02.posteo.de (Postfix) with ESMTPS id A883F240105 for ; Mon, 21 Mar 2022 20:46:26 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.de; s=2017; t=1647891986; bh=BZ+q//uNok7t4t/+UBDxjZ6L4GWjzjiiKwbJ2pbuUrs=; h=Date:Subject:To:Cc:From:From; b=UWjViqQDxMjw7Lw14OCwaWs4RxvGB1bFfa5qCd26UeLS35xGYPF3MqCsORW9OzO5N aARFWZflucjZkUePEJ1fLb4f1OLUwhwz5k6H1AtM0rRM91N8FLWXjk2nYS9h501nrh zK9wkwVRSGSyFSWdAvUzZFB58zajb6tm+DGHFtPHdaOZNDlDN7GpnL6IlsQqVFlCkj 7lVSYxvMcsmjQj+bhk3Wx8MOkWbFW1GaM1uXXZtvscja6EnoWAxgrFDGGOp6TRWxle gC3K94my51gXK8YSr9w8ke0D0xWjjaEkxEYe6/7GygPL5h+MoZUooPovmJbF5wNAtZ LWKpvYV2ZpICw== Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4KMlVS1Jryz9rxN; Mon, 21 Mar 2022 20:46:23 +0100 (CET) Message-ID: Date: Mon, 21 Mar 2022 19:46:23 +0000 MIME-Version: 1.0 Subject: Re: [edk2-devel][PATCH 1/2] MdeModulePkg: StackOffset must be aligned to a 16-byte boundary in X64 To: devel@edk2.groups.io, ted.kuo@intel.com Cc: Michael D Kinney , Dandan Bi , Liming Gao , Debkumar De , Harry Han , Catharine West , Jian J Wang References: <6301e56ae7ec1852c8bf499c2df69e0a04420443.1647864782.git.ted.kuo@intel.com> From: =?UTF-8?B?TWFydmluIEjDpHVzZXI=?= In-Reply-To: <6301e56ae7ec1852c8bf499c2df69e0a04420443.1647864782.git.ted.kuo@intel.com> Content-Language: en-GB Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable Good day, Thanks for the update! On 21.03.22 13:43, Kuo, Ted wrote: > REF:https://bugzilla.tianocore.org/show_bug.cgi?id=3D3865 > For X64, StackOffset must be aligned to a 16-byte boundary as well as > old stack and new stack. Otherwise, it'll get wrong data from Private > pointer after switching from old stack to new stack. > > Cc: Michael D Kinney > Cc: Dandan Bi > Cc: Liming Gao > Cc: Debkumar De > Cc: Harry Han > Cc: Catharine West > Cc: Jian J Wang > Cc: Marvin H=C3=A4user > Signed-off-by: Ted Kuo > --- > MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c | 18 +++++++++++++++++- > 1 file changed, 17 insertions(+), 1 deletion(-) > > diff --git a/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c b/MdeModuleP= kg/Core/Pei/Dispatcher/Dispatcher.c > index 3552feda8f..8a2c1ec779 100644 > --- a/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c > +++ b/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c > @@ -823,6 +823,19 @@ PeiCheckAndSwitchStack ( > (VOID **)&TemporaryRamSupportPpi > ); > if (!EFI_ERROR (Status)) { > + // > + // For X64, StackOffset must be aligned to a 16-byte boundary. O= therwise, it'll get wrong data > + // from Private pointer after switching to new stack. > + // > + if ((sizeof (UINTN) =3D=3D sizeof (UINT64)) && ((StackOffset & 0= x0F) =3D=3D 8)) { > + if (StackOffsetPositive =3D=3D TRUE) { > + StackOffset -=3D 8; > + } else { > + StackOffset +=3D 8; > + } > + Private->StackOffset =3D StackOffset; > + } > + Hmm, the overall design (not your patch) looks very broken to me. So, if=20 the PPI exists, it's responsible for the migration of the stack, but it=20 is neither passed where to migrate the stack to, nor does it return=20 where it did migrate it to. This means the StackOffset calculated here=20 may be out-of-sync with what actually happens in the PPI, e.g., if the=20 PPI code is changed. There also is no detailed explanation for the=20 memory layout with FSP separate stack vs bootloader shared stack, so I=20 cannot really give detailed comments quickly. *Sigh*. Anyhow, as for the patch, I do not understand a few things: 1) Maybe most important of all, what even is broken? Which address is=20 not 16-Byte-aligned to cause this issue in the first place? 2) Why do you align StackOffset? Like yes, if the old top of the stack=20 and the offset to the new top of the stack are both 16-Byte-aligned,=20 then the new top of the stack is 16-Byte-aligned too. However,=20 StackOffset is more of a by-product and TopOfNewStack remains holding=20 the old value. I just don't really understand the idea of this approach. 3) This only works when StackOffset is guaranteed to be 8-Byte-aligned=20 (is it?). As we are dealing with the *top* of the stack (which should be=20 4K-aligned even for memory protection!), what would be wrong with just=20 aligning down and up instead? (Same question for the second patch to the FSP code) 4) The next patch performs a similar alignment operation (as mentioned=20 before). However, while this patch aligns the *top* of the stack, the=20 FSP patch aligns the *bottom* of the stack. This may or may not be=20 correct based on your premises. Can you maybe document why this is=20 correct, or even better, try to align the top of the stack in FSP as=20 well? (By transitivity, if you align the top correctly, the bottom=20 should be aligned correctly as well, as every nested call must preserve=20 the alignment invariant) 5) Why does this only happen when the PPI is found? Would that not risk=20 an unaligned stack if the PPI is not provided, the same way it is=20 unaligned now? 6) The comment explicitly mentions X64, but checks only for 64-bit=20 pointer size. So this should affect AArch64 and RISC-V-64 as well. Are=20 they guaranteed to function correctly after this patch (especially with=20 the PPI sync guarantees mentioned earlier)? 7) This only updates FSP code, similar to 5), but to QEMU and friends=20 continue to work? Thanks! Best regards, Marvin > // > // Heap Offset > // > @@ -852,7 +865,10 @@ PeiCheckAndSwitchStack ( > // Temporary Ram Support PPI is provided by platform, it will c= opy > // temporary memory to permanent memory and do stack switching. > // After invoking Temporary Ram Support PPI, the following code= 's > - // stack is in permanent memory. > + // stack is in permanent memory. For X64, the bit3:0 of the new = stack > + // produced by TemporaryRamMigration must be aligned with the bi= t3:0 of > + // the old stack. Otherwise, it'll break the original stack alig= nment > + // after switching to new stack. > // > TemporaryRamSupportPpi->TemporaryRamMigration ( > PeiServices,