From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail02.groups.io (mail02.groups.io [66.175.222.108]) by spool.mail.gandi.net (Postfix) with ESMTPS id 162CBD800F6 for ; Thu, 14 Mar 2024 21:58:16 +0000 (UTC) DKIM-Signature: a=rsa-sha256; bh=6SEdgTYygjvzvgc2hgNuxBex39bK+KcAAVbmOL6PdB8=; c=relaxed/simple; d=groups.io; h=From:Message-Id:Mime-Version:Subject:Date:In-Reply-To:Cc:To:References:Precedence:List-Subscribe:List-Help:Sender:List-Id:Mailing-List:Delivered-To:Resent-Date:Reply-To:List-Unsubscribe-Post:List-Unsubscribe:Content-Type; s=20240206; t=1710453495; v=1; b=l48mHsSLOUJypeIzFoFZkhja6UTl9zpeiN9ri5Rpp0POSFwA9AbDt+c21/R/i4HFffrjP+r3 rV7xfu9KKKs7czk8udeOL9+ycn6VZjhjSdOUovTUq9Z0w//x6GebjjqgQHjykJEwep5quyvHEDj Si1eyMVL/iC6xH7O91XEmSfQoWSNZRGyTeTf7Dk6kUL4PRqsAgC4CND5ziYNyX6AB7eQgiCh75H vOS3UoUvTJJ5ifE+UJPTKmcQbR/sZYM2c3NrYwjjvtKNE5gdf93tkv/dkiyx0Sh21F85QaK3G8l galAi5U3zzb1FdFROh2SpQzSfqEtzVkwEGJ9BCcM565rw== X-Received: by 127.0.0.2 with SMTP id bM2pYY7687511xsghtHSdHaK; Thu, 14 Mar 2024 14:58:15 -0700 X-Received: from mout01.posteo.de (mout01.posteo.de [185.67.36.65]) by mx.groups.io with SMTP id smtpd.web11.5270.1710453493847644624 for ; Thu, 14 Mar 2024 14:58:14 -0700 X-Received: from submission (posteo.de [185.67.36.169]) by mout01.posteo.de (Postfix) with ESMTPS id DE229240027 for ; Thu, 14 Mar 2024 22:58:11 +0100 (CET) X-Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4Twh8K2qfYz9rxD; Thu, 14 Mar 2024 22:58:09 +0100 (CET) From: =?UTF-8?B?TWFydmluIEjDpHVzZXI=?= Message-Id: Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.500.171.1.1\)) Subject: Re: [edk2-devel][PATCH v1 1/3] MdeModulePkg: ImagePropertiesRecordLib: Use SectionAlignment for CodeSize Date: Thu, 14 Mar 2024 21:57:58 +0000 In-Reply-To: <9d6f07c2-3579-4555-94eb-fb7b79a8e05a@linux.microsoft.com> Cc: edk2-devel-groups-io , Leif Lindholm , Ard Biesheuvel , Sami Mujawar , Taylor Beebe To: Oliver Smith-Denny References: <20240227202721.30070-1-osde@...> <20240227202721.30070-2-osde@...> <5d95548d-d25f-4306-a271-a2960cc81ccf@...> <531b3cfe-b57c-4b5a-97fd-45e428f97853@...> <8DF936BA-F1D0-4513-A8F4-36A2CBEFD84A@posteo.de> <9d6f07c2-3579-4555-94eb-fb7b79a8e05a@linux.microsoft.com> Precedence: Bulk List-Subscribe: List-Help: Sender: devel@edk2.groups.io List-Id: Mailing-List: list devel@edk2.groups.io; contact devel+owner@edk2.groups.io Resent-Date: Thu, 14 Mar 2024 14:58:14 -0700 Reply-To: devel@edk2.groups.io,mhaeuser@posteo.de List-Unsubscribe-Post: List-Unsubscribe=One-Click List-Unsubscribe: X-Gm-Message-State: dIw03gBh9wmYQvCGRFnBKKesx7686176AA= Content-Type: multipart/alternative; boundary="Apple-Mail=_9DFA2612-38A3-4937-A2C1-139F47805772" X-GND-Status: LEGIT Authentication-Results: spool.mail.gandi.net; dkim=pass header.d=groups.io header.s=20240206 header.b=l48mHsSL; spf=pass (spool.mail.gandi.net: domain of bounce@groups.io designates 66.175.222.108 as permitted sender) smtp.mailfrom=bounce@groups.io; dmarc=fail reason="SPF not aligned (strict), DKIM not aligned (strict)" header.from=posteo.de (policy=none) --Apple-Mail=_9DFA2612-38A3-4937-A2C1-139F47805772 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On 14. Mar 2024, at 15:45, Oliver Smith-Denny = wrote: >=20 > This does not appear to be the case with MSVC built binaries. I am > seeing that VirtualSize is the size of the data-initialized part of the > section. What? :( So you are saying modern MSVC generates PEs that either have non-c= ontiguous sections, or they have sections that contain zero-initialized por= tions explicitly? Both variants sound badly borked and should be investigat= ed, even if only for resource constraints. Just thinking about it, is there any chance you set ALIGN =3D FILEALIGN? Be= cause then this will be invoked and your file will be converted to XIP: htt= ps://github.com/tianocore/edk2/blob/5572b43c6767f7cc46b074ae1fc288f6eccdc65= d/BaseTools/Source/C/GenFw/GenFw.c#L612 BaseTools has no concept of what should be a XIP and what should not, so if= a file somewhat qualifies, it will just always emit a XIP: https://github.= com/tianocore/edk2/blob/5572b43c6767f7cc46b074ae1fc288f6eccdc65d/BaseTools/= Source/C/GenFw/GenFw.c#L2150 In AUDK I addressed this by tying XIP to reloc stripping, but that is more = of a convenient heuristic: https://github.com/acidanthera/audk/blob/13f99fa= 7cbde0829ba98e1e4f2a03891a0791ac6/BaseTools/Source/Python/GenFds/EfiSection= .py#L293 > Whereas on our ElfConverted binaries, what you say is true. The > difference concerns me. Yes, very concerning. Would you mind sharing dumps (section table, etc.) an= d/or binaries (unless you indeed hit what I outlined above)? I recall MSVC = to rather be more aggressive with space-savings, whereas ElfConvert would i= nclude section-alignment padding within the file (but not zero-init data). > Agreed. I have done further research and v2 uses VirtualSize. I was > initially led astray here because the ElfConvert tool sets VirtualSize > and SizeOfRawData to the SectionAlignment universally: >=20 > https://github.com/tianocore/edk2/blob/5572b43c6767f7cc46b074ae1fc288f6ec= cdc65d/BaseTools/Source/C/GenFw/ElfConvert.c#L134-L136 Yes=E2=80=A6 plenty of more things in the tool to lead people astray. :) > Yep, the PE headers are definitely filled with information that can be > deduced from other sources and the spec is not very verbose, which is > why I definitely want to tread carefully here. For example, it says > SizeOfImage is the size of the image loaded into memory, which > current implementations seem to take to mean the sum of the headers > plus each section's VirtualSize rounded to the SectionAlignment (which > the spec does call out that SizeOfImage must be aligned to the > SectionAlignment). VirtualSize it says is also the size of the section > as loaded into memory, but current implementations (other than our > ElfConvert script which I'm currently investigating if we should > change that) seem to take this to mean the size of the initialized > data in memory, i.e. the "real" data from the image, not the zeroed > data to get to the SectionAlignment. Do you have an example for that? I=E2=80=99m quite certain SizeOfImage shou= ld always cover the whole thing. > But the language in the spec > is almost identical between the two. And again, we have a difference > in our MSVC and gcc built binaries here. Do we? :( > I agree with you there. That being said, the current Image record code > is, unsurprisingly, broken, and does need to be fixed. Can you review > the v2 of this? The only difference is that I changed to VirtualSize. LGTM. > Do you think the code should be updated for the case of VirtualSize > =3D=3D 0? In the current implementation, it will round 0 up to the > SectionAlignment It should not. Well, it will technically round, but it will remain 0. It do= es not round up to the strictly next boundary, but if the current value is = a multiple of the alignment (0 is a multiple of any alignment), it remains = unchanged. > , which is what I believe will happen in the PE > loader, also, but want to get your thoughts on it. >=20 > This would still be a short term solution to fix the active bug > (ARM64 cannot boot a 64k pagesize Linux with a broken MAT table or > we will boot without the benefit of the MAT, if we happen to have > an aligned total region but not each entry, testing shows). I do > think it is a broken pattern to try to determine the image records > after we've loaded them into memory, the PE loader should create > them with the factual data of what was loaded. There are many more broken patterns everywhere. Did you notice yet that the= API itself is broken from the ground up? Not to keep =E2=80=9Cadvertising= =E2=80=9D it, but maybe our paper will give you some fun too. :) https://ar= xiv.org/pdf/2012.05471.pdf Best regards, Marvin >=20 > Thanks, > Oliver -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#116780): https://edk2.groups.io/g/devel/message/116780 Mute This Topic: https://groups.io/mt/104610770/7686176 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io] -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D- --Apple-Mail=_9DFA2612-38A3-4937-A2C1-139F47805772 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8
On 14. Mar 2024, at 15:45, Oliver Smith-Denny <osde@linux.microso= ft.com> wrote:

This does not appear to be th= e case with MSVC built binaries. I am
seeing that V= irtualSize is the size of the data-initialized part of the
section.

What? :( So = you are saying modern MSVC generates PEs that either have non-contiguous se= ctions, or they have sections that contain zero-initialized portions explic= itly? Both variants sound badly borked and should be investigated, even if = only for resource constraints.

Just thinking about= it, is there any chance you set ALIGN =3D FILEALIGN? Because then this wil= l be invoked and your file will be converted to XIP: https://github.com/tianocore/edk2/blo= b/5572b43c6767f7cc46b074ae1fc288f6eccdc65d/BaseTools/Source/C/GenFw/GenFw.c= #L612

BaseTools has no concept of what should = be a XIP and what should not, so if a file somewhat qualifies, it will just= always emit a XIP: https://github.com/tianocore/edk2/blob/5572b43c6767f7cc46b074ae1fc288= f6eccdc65d/BaseTools/Source/C/GenFw/GenFw.c#L2150

<= div>In AUDK I addressed this by tying XIP to reloc stripping, but that is m= ore of a convenient heuristic: https://github.com/acidanthera/audk/blob/13f9= 9fa7cbde0829ba98e1e4f2a03891a0791ac6/BaseTools/Source/Python/GenFds/EfiSect= ion.py#L293

Whereas on our ElfConverte= d binaries, what you say is true. The
difference co= ncerns me.

Yes, very concerning. Would you mind sharing dumps (section table, et= c.) and/or binaries (unless you indeed hit what I outlined above)? I recall= MSVC to rather be more aggressive with space-savings, whereas ElfConvert w= ould include section-alignment padding within the file (but not zero-init d= ata).

Agreed. I have done further research= and v2 uses VirtualSize. I was
initially led astr= ay here because the ElfConvert tool sets VirtualSize

https://github.com/tianocore/edk2/blob/5572b43c= 6767f7cc46b074ae1fc288f6eccdc65d/BaseTools/Source/C/GenFw/ElfConvert.c#L134= -L136

Ye= s=E2=80=A6 plenty of more things in the tool to lead people astray. :)
Yep, the PE headers are definitely filled with= information that can be
deduced from other sources= and the spec is not very verbose, which is
= why I d= efinitely want to tread carefully here. For example, it says
SizeOfImage is the size of the image loaded into memory, which
current implementations seem to take to mean the sum = of the headers
plus each section's VirtualSize roun= ded to the SectionAlignment (which
the spec does ca= ll out that SizeOfImage must be aligned to the
Sect= ionAlignment). VirtualSize it says is also the size of the sectionas loaded into memory, but current implementations (other t= han our
ElfConvert script which I'm currently inves= tigating if we should
change that) seem to take thi= s to mean the size of the initialized
data in memor= y, i.e. the "real" data from the image, not the zeroed
data to get to the SectionAlignment.

=
Do you have an example for that? I=E2=80=99m quite certain SizeO= fImage should always cover the whole thing.

is almost identica= l between the two. And again, we have a difference
= in our MSVC and gcc built binaries here.

Do we? :(

I agree with you there. That being said, the current Image r= ecord code
is, unsurprisingly, broken, and does nee= d to be fixed. Can you review
the v2 of this? The o= nly difference is that I changed to VirtualSize.

LGTM.

Do you think the code should be updated for the case of VirtualSize=
=3D=3D 0? In the current implementation, it will r= ound 0 up to the
SectionAlignment

It should not. Well, it will technically round,= but it will remain 0. It does not round up to the strictly next boundary, = but if the current value is a multiple of the alignment (0 is a multiple of= any alignment), it remains unchanged.

<= div>, w= hich is what I believe will happen in the PE
loader= , also, but want to get your thoughts on it.

This would still be a short term solution to fix the act= ive bug
(ARM64 cannot boot a 64k pagesize Linux wit= h a broken MAT table or
we will boot without the be= nefit of the MAT, if we happen to have
an aligned t= otal region but not each entry, testing shows). I do
after we've loaded them into memory, the PE loader should= create
them with the factual data of what was load= ed.

T= here are many more broken patterns everywhere. Did you notice yet that the = API itself is broken from the ground up? Not to keep =E2=80=9Cadvertising= =E2=80=9D it, but maybe our paper will give you some fun too. :) https://arxiv.org/pdf/2012.0547= 1.pdf

Best regards,
Marvin

=

Thanks,
Oliver

_._,_._,_

Groups.io Links:

=20 You receive all messages sent to this group. =20 =20

View/Reply Online (#116780) | =20 | Mute= This Topic | New Topic
Your Subscriptio= n | Contact Group Owner | Unsubscribe [rebecca@openfw.io]

_._,_._,_
--Apple-Mail=_9DFA2612-38A3-4937-A2C1-139F47805772--