On 14. Mar 2024, at 15:45, Oliver Smith-Denny <osde@linux.microsoft.com> wrote:

This does not appear to be the case with MSVC built binaries. I am
seeing that VirtualSize is the size of the data-initialized part of the
section.

What? :( So you are saying modern MSVC generates PEs that either have non-contiguous sections, or they have sections that contain zero-initialized portions explicitly? Both variants sound badly borked and should be investigated, even if only for resource constraints.

Just thinking about it, is there any chance you set ALIGN = FILEALIGN? Because then this will be invoked and your file will be converted to XIP: https://github.com/tianocore/edk2/blob/5572b43c6767f7cc46b074ae1fc288f6eccdc65d/BaseTools/Source/C/GenFw/GenFw.c#L612

BaseTools has no concept of what should be a XIP and what should not, so if a file somewhat qualifies, it will just always emit a XIP: https://github.com/tianocore/edk2/blob/5572b43c6767f7cc46b074ae1fc288f6eccdc65d/BaseTools/Source/C/GenFw/GenFw.c#L2150

In AUDK I addressed this by tying XIP to reloc stripping, but that is more of a convenient heuristic: https://github.com/acidanthera/audk/blob/13f99fa7cbde0829ba98e1e4f2a03891a0791ac6/BaseTools/Source/Python/GenFds/EfiSection.py#L293

Whereas on our ElfConverted binaries, what you say is true. The
difference concerns me.

Yes, very concerning. Would you mind sharing dumps (section table, etc.) and/or binaries (unless you indeed hit what I outlined above)? I recall MSVC to rather be more aggressive with space-savings, whereas ElfConvert would include section-alignment padding within the file (but not zero-init data).

Agreed. I have done further research and v2 uses VirtualSize. I was
initially led astray here because the ElfConvert tool sets VirtualSize
and SizeOfRawData to the SectionAlignment universally:

https://github.com/tianocore/edk2/blob/5572b43c6767f7cc46b074ae1fc288f6eccdc65d/BaseTools/Source/C/GenFw/ElfConvert.c#L134-L136

Yes… plenty of more things in the tool to lead people astray. :)

Yep, the PE headers are definitely filled with information that can be
deduced from other sources and the spec is not very verbose, which is
why I definitely want to tread carefully here. For example, it says
SizeOfImage is the size of the image loaded into memory, which
current implementations seem to take to mean the sum of the headers
plus each section's VirtualSize rounded to the SectionAlignment (which
the spec does call out that SizeOfImage must be aligned to the
SectionAlignment). VirtualSize it says is also the size of the section
as loaded into memory, but current implementations (other than our
ElfConvert script which I'm currently investigating if we should
change that) seem to take this to mean the size of the initialized
data in memory, i.e. the "real" data from the image, not the zeroed
data to get to the SectionAlignment.

Do you have an example for that? I’m quite certain SizeOfImage should always cover the whole thing.

But the language in the spec
is almost identical between the two. And again, we have a difference
in our MSVC and gcc built binaries here.

Do we? :(

I agree with you there. That being said, the current Image record code
is, unsurprisingly, broken, and does need to be fixed. Can you review
the v2 of this? The only difference is that I changed to VirtualSize.

LGTM.

Do you think the code should be updated for the case of VirtualSize
== 0? In the current implementation, it will round 0 up to the
SectionAlignment

It should not. Well, it will technically round, but it will remain 0. It does not round up to the strictly next boundary, but if the current value is a multiple of the alignment (0 is a multiple of any alignment), it remains unchanged.

, which is what I believe will happen in the PE
loader, also, but want to get your thoughts on it.

This would still be a short term solution to fix the active bug
(ARM64 cannot boot a 64k pagesize Linux with a broken MAT table or
we will boot without the benefit of the MAT, if we happen to have
an aligned total region but not each entry, testing shows). I do
think it is a broken pattern to try to determine the image records
after we've loaded them into memory, the PE loader should create
them with the factual data of what was loaded.

There are many more broken patterns everywhere. Did you notice yet that the API itself is broken from the ground up? Not to keep “advertising” it, but maybe our paper will give you some fun too. :) https://arxiv.org/pdf/2012.05471.pdf

Best regards,

Marvin

Thanks,
Oliver