From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mout02.posteo.de (mout02.posteo.de [185.67.36.66]) by mx.groups.io with SMTP id smtpd.web11.7780.1647941241776235203 for ; Tue, 22 Mar 2022 02:27:22 -0700 Authentication-Results: mx.groups.io; dkim=pass header.i=@posteo.de header.s=2017 header.b=MkUlXJ6U; spf=pass (domain: posteo.de, ip: 185.67.36.66, mailfrom: mhaeuser@posteo.de) Received: from submission (posteo.de [185.67.36.169]) by mout02.posteo.de (Postfix) with ESMTPS id 95BC4240104 for ; Tue, 22 Mar 2022 10:27:19 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.de; s=2017; t=1647941239; bh=GR47ks5TLcjWaou8fbRKTXLPot73p0YzT9Z1ua7xB+o=; h=From:Subject:Date:Cc:To:From; b=MkUlXJ6U5BKkGevbBbO9hAMwTpPZwvL8zVreNFwIasSYiaO4uukXp1BVBxC4aP3bI DuhfQvacjzkodcNdyoY0b3lB/3KpjyzbM8FwE5X4JbuokU/PAg8I9hTpcDuj+qGVV8 BHjqiREzc8M8yNeuj8tAG25tXI9FfHLYTotfgjxkcyUFoWalp6Ujwe17XYJL3U8Gma oG7o81cMHWML7LPBshvV3XSxCSXPM7iSfXrjkgejMZLSv+FKL2IGYHTplR32gXIwwH q/8Me3hSfLXvDTLA3JE+HBavigdNEdoJORix5g8c0vbk6HYXX/pfEF6kXG01sYyw4g aCeQPtc/RYAnA== Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4KN5jc6HK2z9rxN; Tue, 22 Mar 2022 10:27:16 +0100 (CET) From: =?UTF-8?B?TWFydmluIEjDpHVzZXI=?= Mime-Version: 1.0 (1.0) Subject: Re: [edk2-devel][PATCH 1/2] MdeModulePkg: StackOffset must be aligned to a 16-byte boundary in X64 Date: Tue, 22 Mar 2022 09:27:16 +0000 Message-Id: References: Cc: devel@edk2.groups.io, "Kinney, Michael D" , "Bi, Dandan" , "Gao, Liming" , "De, Debkumar" , "Han, Harry" , "West, Catharine" , "Wang, Jian J" In-Reply-To: To: "Kuo, Ted" Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Good day, Thanks for the updates! > On 22. Mar 2022, at 08:23, Kuo, Ted wrote: >=20 > =EF=BB=BFHi Marvin, >=20 > Good day. Thanks for your valuable comments. After checking all of your co= mments, I decide to drop the patches and close the bugzilla ticket since the= changes should be specific to X64 in IntelFspPkg. You still can find my inl= ine comments with [Ted] for your questions. >=20 > Thanks, > Ted >=20 > -----Original Message----- > From: Marvin H=C3=A4user =20 > Sent: Tuesday, March 22, 2022 3:46 AM > To: devel@edk2.groups.io; Kuo, Ted > Cc: Kinney, Michael D ; Bi, Dandan ; Gao, Liming ; De, Debkumar ; Han, Harry ; West, Catharine ; Wang, Jian J > Subject: Re: [edk2-devel][PATCH 1/2] MdeModulePkg: StackOffset must be ali= gned to a 16-byte boundary in X64 >=20 > Good day, >=20 > Thanks for the update! >=20 >> On 21.03.22 13:43, Kuo, Ted wrote: >> REF:https://bugzilla.tianocore.org/show_bug.cgi?id=3D3865 >> For X64, StackOffset must be aligned to a 16-byte boundary as well as=20 >> old stack and new stack. Otherwise, it'll get wrong data from Private=20 >> pointer after switching from old stack to new stack. >>=20 >> Cc: Michael D Kinney >> Cc: Dandan Bi >> Cc: Liming Gao >> Cc: Debkumar De >> Cc: Harry Han >> Cc: Catharine West >> Cc: Jian J Wang >> Cc: Marvin H=C3=A4user >> Signed-off-by: Ted Kuo >> --- >> MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c | 18 +++++++++++++++++- >> 1 file changed, 17 insertions(+), 1 deletion(-) >>=20 >> diff --git a/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c=20 >> b/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c >> index 3552feda8f..8a2c1ec779 100644 >> --- a/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c >> +++ b/MdeModulePkg/Core/Pei/Dispatcher/Dispatcher.c >> @@ -823,6 +823,19 @@ PeiCheckAndSwitchStack ( >> (VOID **)&TemporaryRamSupportPpi >> ); >> if (!EFI_ERROR (Status)) { >> + // >> + // For X64, StackOffset must be aligned to a 16-byte boundary. Oth= erwise, it'll get wrong data >> + // from Private pointer after switching to new stack. >> + // >> + if ((sizeof (UINTN) =3D=3D sizeof (UINT64)) && ((StackOffset & 0x0= F) =3D=3D 8)) { >> + if (StackOffsetPositive =3D=3D TRUE) { >> + StackOffset -=3D 8; >> + } else { >> + StackOffset +=3D 8; >> + } >> + Private->StackOffset =3D StackOffset; >> + } >> + >=20 > Hmm, the overall design (not your patch) looks very broken to me. So, if t= he PPI exists, it's responsible for the migration of the stack, but it is ne= ither passed where to migrate the stack to, nor does it return where it did m= igrate it to. This means the StackOffset calculated here may be out-of-sync w= ith what actually happens in the PPI, e.g., if the PPI code is changed. Ther= e also is no detailed explanation for the memory layout with FSP separate st= ack vs bootloader shared stack, so I cannot really give detailed comments qu= ickly. *Sigh*. >=20 > Anyhow, as for the patch, I do not understand a few things: >=20 > 1) Maybe most important of all, what even is broken? Which address is not 1= 6-Byte-aligned to cause this issue in the first place? > [Ted]: CPU will generate exception when running some X64 instructions whic= h need input/output memory address to be 16-Byte-aligned. Yes, I understood as much. I built a chain of reasoning for alignment in the= response for the first revision, because a proper fix needs knowledge of wh= ich assumption is wrong in the first place. The question is, which *exact* v= alue in the chain is not 16-Byte aligned, and should it be? Did you ever pri= nt all involved values, like PhysicalMemoryBegin (or whatever its name was, s= orry, I=E2=80=99m responding from mobile)? >=20 > 2) Why do you align StackOffset? Like yes, if the old top of the stack and= the offset to the new top of the stack are both 16-Byte-aligned, then the n= ew top of the stack is 16-Byte-aligned too. However, StackOffset is more of a= by-product and TopOfNewStack remains holding the old value. I just don't re= ally understand the idea of this approach. > [Ted]: Since new stack must keep the original stack alignment as old stack= , it means stack offset must be 16-Byte-aligned too. Yes, I agreed above. But StackOffset is not the definition of where the new s= tack is located, but only a consequence from it. It is not the primary descr= iptor is what I=E2=80=99m trying to say. > And the OldStack/NewStack in the fsp patch indicates the *current* old/new= stack. The fsp patch just makes left shift 8-byte of whole used stack data w= hen new stack not aligning with old stack. Hence I think no need to update T= opOfNewStack. Yes, I understand what is happening, I don=E2=80=99t understand why it is ha= ppening. My comment here was separate from the FSP patch and only concerned t= he code here. If the top of the stack (as described by StackOffset) is align= ed from its original value, why can TopOfNewStack remain unchanged, when it a= ll points to the old, unaligned top, where no data should ever be stored? > e.g. > case1: > old stack: 0xfef5e000 > new stack: 0x49c8f3b0 > stack: 0x9c8f3b0 -> 16-Byte-aligned > case2:=20 > old stack: 0xfef5e008 Is this the top or bottom? If it=E2=80=99s the top, why can it *not* be 16-B= yte-aligned? If it=E2=80=99s the bottom, how is it relevant here? We are onl= y dealing with the top addresses in this patch, no? > new stack: 0x49c8f3b8 > stack: 0x9c8f3b0 -> 16-Byte-aligned >=20 > 3) This only works when StackOffset is guaranteed to be 8-Byte-aligned (is= it?). As we are dealing with the *top* of the stack (which should be 4K-ali= gned even for memory protection!), what would be wrong with just aligning do= wn and up instead? > (Same question for the second patch to the FSP code) > As my answer in Q2, what we adjust in the fsp patch is the new "current" s= tack in order to keep the same stack alignment as old stack after switching s= tack. Top of the new stack remains unchanged. If StackOffset is not adjusted= accordingly, bios will get wrong data from Private pointer after switching t= o new stack. But StackOffset describes the offset from the top of the old stack to the to= p of the new stack. How does the top =E2=80=9Cremain unchanged=E2=80=9D? Top= alignment is the root for the transitive alignment chain, if it is aligned,= and it must be, everything is aligned. >=20 > 4) The next patch performs a similar alignment operation (as mentioned bef= ore). However, while this patch aligns the *top* of the stack, the FSP patch= aligns the *bottom* of the stack. This may or may not be correct based on y= our premises. Can you maybe document why this is correct, or even better, tr= y to align the top of the stack in FSP as well? (By transitivity, if you ali= gn the top correctly, the bottom should be aligned correctly as well, as eve= ry nested call must preserve the alignment invariant) > [Ted]: Actually the new stack region (from top to bottom) is not changed w= ith the patches. It just adjusted the *current* new stack to align with the *= current* old stack. Confused, sorry. Maybe explaining the rest will make this clearer. >=20 > 5) Why does this only happen when the PPI is found? Would that not risk an= unaligned stack if the PPI is not provided, the same way it is unaligned no= w? > [Ted]: I didn't observe the unaligned stack issue in the else case (the PP= I is not found). This is why I asked about 1). Things like this are fundamental and should no= t be changed without understanding both the full scope of the issue and the f= ull scope of the changes. >=20 > 6) The comment explicitly mentions X64, but checks only for 64-bit pointer= size. So this should affect AArch64 and RISC-V-64 as well. Are they guarant= eed to function correctly after this patch (especially with the PPI sync gua= rantees mentioned earlier)? > [Ted]: Good point. The changes should only be needed for X64. I shall make= the changes specific to X64 only in IntelFspPkg. I'll drop the patch and cl= ose the bugzilla ticket. But how will StackOffset be in-sync with the changes in FSP SecCore then? If= I=E2=80=99m not mistaken with my intro from last mail, both calculate the s= tack address separately from each other, and their calculations must be in-s= ync. If I=E2=80=99m mistaken, how does it work then? >=20 > 7) This only updates FSP code, similar to 5), but to QEMU and friends cont= inue to work? > [Ted]: The changes should be specific to X64 architecture in IntelFspPkg w= ithout any impact to QEMU and other architectures. Should be solved by 6). Thanks! Best regards, Marvin >=20 >=20 > Thanks! >=20 > Best regards, > Marvin >=20 >> // >> // Heap Offset >> // >> @@ -852,7 +865,10 @@ PeiCheckAndSwitchStack ( >> // Temporary Ram Support PPI is provided by platform, it will copy= >> // temporary memory to permanent memory and do stack switching. >> // After invoking Temporary Ram Support PPI, the following code's >> - // stack is in permanent memory. >> + // stack is in permanent memory. For X64, the bit3:0 of the new st= ack >> + // produced by TemporaryRamMigration must be aligned with the bit3= :0 of >> + // the old stack. Otherwise, it'll break the original stack alignm= ent >> + // after switching to new stack. >> // >> TemporaryRamSupportPpi->TemporaryRamMigration ( >> PeiServices, >=20