public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Oliver Smith-Denny" <osde@linux.microsoft.com>
To: devel@edk2.groups.io, mhaeuser@posteo.de
Cc: Leif Lindholm <quic_llindhol@quicinc.com>,
	Ard Biesheuvel <ardb+tianocore@kernel.org>,
	Sami Mujawar <sami.mujawar@arm.com>,
	Taylor Beebe <Taylor.Beebe@microsoft.com>
Subject: Re: [edk2-devel][PATCH v1 1/3] MdeModulePkg: ImagePropertiesRecordLib: Use SectionAlignment for CodeSize
Date: Thu, 14 Mar 2024 07:45:17 -0700	[thread overview]
Message-ID: <9d6f07c2-3579-4555-94eb-fb7b79a8e05a@linux.microsoft.com> (raw)
In-Reply-To: <8DF936BA-F1D0-4513-A8F4-36A2CBEFD84A@posteo.de>

Hi Marvin,

Thanks for looking at this. Since sending out the v1, I have done a lot
more digging into our PE loader, others, the spec, and differences
between MSVC and our ElfConvert tool. I agree that my understanding was
flawed in my original comments.

On 3/14/2024 2:57 AM, Marvin Häuser wrote:
> Good day everyone,
> 
> Sorry for interjecting, but I’d like to avoid premature changes to PE 
> code that would only make the current mess even worse.
> 
>> On 4. Mar 2024, at 20:24, Oliver Smith-Denny <osde@...> wrote:
>>
>> I relooked at the spec, you are correct, SizeOfRawData is aligned to
>> the FileAlignment. So this is only a bug for a file with
>> SectionAlignment != FileAlignment.
> 
> Still no, for some reasons you already mentioned, but also reasons you 
> didn’t:
> - VirtualSize often is not aligned, so they can still trivially mismatch.
> - VirtualSize includes zero-initialized data not part of the file.

This does not appear to be the case with MSVC built binaries. I am
seeing that VirtualSize is the size of the data-initialized part of the
section. Whereas on our ElfConverted binaries, what you say is true. The
difference concerns me.

> - In the old days, VirtualSize = 0 was used to disable loading of debug 
> data, but SizeOfRawData remained intact, so it could be loaded on demand.
> - SizeOfRawData may be unaligned due to bugs. In fact, there is no 
> reason to trust FileAlignment, it has no real meaning in an UEFI context 
> (I am not even sure it does on Windows).
> 
>> I see, yes I do believe VirtualSize could be used here instead.
> 
> *Must*. As Ard said, SizeOfRawData is only interesting to know how many 
> bytes to copy from the file and for *nothing* else.

Agreed. I have done further research and v2 uses VirtualSize. I was
initially led astray here because the ElfConvert tool sets VirtualSize
and SizeOfRawData to the SectionAlignment universally:

https://github.com/tianocore/edk2/blob/5572b43c6767f7cc46b074ae1fc288f6eccdc65d/BaseTools/Source/C/GenFw/ElfConvert.c#L134-L136

> 
>> Two
>> things give me pause. One is that the PE spec states that SizeOfRawData
>> is rounded and VirtualSize is not, so that SizeOfRawData may be greater
>> than the VirtualSize in some cases (which seems incorrect).
>>
>> The other is that the image loader partially uses VirtualSize when
>> loading:
>>
>> https://github.com/tianocore/edk2/blob/918288ab5a7c3abe9c58d576ccc0ae32e2c7dea0/MdePkg/Library/BasePeCoffLib/BasePeCoff.c#L1399-L1400
>>
>> However, when determining the size of a loaded image (and therefore the
>> number of pages to allocate) it will allocate an extra page:
>>
>> https://github.com/tianocore/edk2/blob/918288ab5a7c3abe9c58d576ccc0ae32e2c7dea0/MdeModulePkg/Core/Dxe/Image/Image.c#L646-L652
>>
> 
> Sorry, I don’t understand the “however”, this is sound.
> 
>> as ImageSize here is:
>>
>> https://github.com/tianocore/edk2/blob/918288ab5a7c3abe9c58d576ccc0ae32e2c7dea0/MdePkg/Library/BasePeCoffLib/BasePeCoff.c#L312
>>
>> Which according to the spec, SizeOfImage is the size of the image as
>> loaded into memory and must be a multiple of section alignment.
> 
> Side note, SizeOfImage may be borked, not only due to bugs, but also due 
> to malice. There is literally no reason whatsoever to use it - to 
> validate it, you need to calculate a sane value, and once you have that, 
> you obviously do not need SizeOfImage anymore. You could reject images 
> with malformed SizeOfImage, but there likely are various 
> legitimate-bugged ones out there. This is just an ancient artifact from 
> before memory safety was a thing. :)

Yep, the PE headers are definitely filled with information that can be
deduced from other sources and the spec is not very verbose, which is
why I definitely want to tread carefully here. For example, it says
SizeOfImage is the size of the image loaded into memory, which
current implementations seem to take to mean the sum of the headers
plus each section's VirtualSize rounded to the SectionAlignment (which
the spec does call out that SizeOfImage must be aligned to the
SectionAlignment). VirtualSize it says is also the size of the section
as loaded into memory, but current implementations (other than our
ElfConvert script which I'm currently investigating if we should
change that) seem to take this to mean the size of the initialized
data in memory, i.e. the "real" data from the image, not the zeroed
data to get to the SectionAlignment. But the language in the spec
is almost identical between the two. And again, we have a difference
in our MSVC and gcc built binaries here.

> 
>> So, reflecting on this, let me test with VirtualSize here, I think
>> that is the right value to use, the only time we would have a
>> SizeOfRawData that is greater than the VirtualSize would be if our
>> FileAlignment is greater than our SectionAlignment, which would be
>> a misconfiguration.
> 
> No, it can simply be that FileAlignment = SectionAlignment and 
> SizeOfRawData is aligned and VirtualSize is not. Even when both are 
> aligned, the former may carry extra optional data (like previously 
> mentioned debug data), or it might be some weird I/O performance related 
> padding or something.

Agreed, this was misinformed by looking at gcc built binaries.

> 
>> I do think the ImageLoader should also be fixed to only allocate
>> ImageSize number of pages (which should be the sum of the section
>> VirtualSizes + any headers that aren't included). Might as well not
>> waste an extra page for each image and then our image record code is
>> simpler as well (ensuring we are protecting the right pages).
> 
> NO! :) There are *many* issues surrounding this:
> - Sections may overlap (observed with early macOS EFI images, could be 
> ignored, but needs to be validated across the landscape).
> - Sections may have gaps (observed with “old" Linux EFI Shims iirc, 
> there’ll be plenty of broken images in the wild yet).
> - The extra page is not pointless - as you can see, it is allocated when 
> exceeding EFI page size, which is the only granularity you can 
> page-allocate at. So, when you require > 4K alignment, you need the 
> extra page to be able to “shift” the image into proper alignment in the 
> allocated region as needed. In AUDK, I resolved this by abstracting 
> existing, buried aligned-allocation code into a MemoryAllocationLibEx 
> class: 
> https://github.com/acidanthera/audk/blob/13f99fa7cbde0829ba98e1e4f2a03891a0791ac6/MdeModulePkg/Library/CommonMemoryAllocationLib/CommonMemoryAllocationLibEx.c#L70 <https://github.com/acidanthera/audk/blob/13f99fa7cbde0829ba98e1e4f2a03891a0791ac6/MdeModulePkg/Library/CommonMemoryAllocationLib/CommonMemoryAllocationLibEx.c#L70>
> Without aligned-allocation, you cannot get rid of this extra page.
> 
> For reference, my image record code is here: 
> https://github.com/acidanthera/audk/blob/13f99fa7cbde0829ba98e1e4f2a03891a0791ac6/MdePkg/Library/BaseUefiImageLib/PeSupport.c#L335 <https://github.com/acidanthera/audk/blob/13f99fa7cbde0829ba98e1e4f2a03891a0791ac6/MdePkg/Library/BaseUefiImageLib/PeSupport.c#L335>

Yep, this was me being dumb and not looking a few lines further ahead :)
After sending this I looked deeper at this offhand comment and realized
the purpose.

> 
>> I think this patch series should go in, as it is fixing an active bug,
>> but I will also take a look at the image loader creating the image
>> records and having a protocol it produces to retrieve the list, to
>> attempt to avoid issues like this in the future.
> 
> Why does there need to be a protocol? Why should non-Core modules get 
> any information about image topology?

Agreed, only core code needs this. I think I said protocol originally
so that the existing ImagePropertiesRecordLib could use it without
depending on a core only function, but in a new implementation we
would drop the lib.

> 
> A few general words of caution when dealing with PE:
> - Do not trust the spec.
> - Do not trust the loader.
> - Do not trust the converter.
> - Do *not* trust your intuition.
> 
> *Everything* is broken, from the producers to the consumers, from the 
> signing to the validation tools, technically, security-wise, and 
> design-wise. Do *not* change things without as little as a collection of 
> Windows and Linux images from various versions and various OpROMs to 
> regression-test with. Otherwise, we will all be in a world of more pain. :)
> 

I agree with you there. That being said, the current Image record code
is, unsurprisingly, broken, and does need to be fixed. Can you review
the v2 of this? The only difference is that I changed to VirtualSize.
Do you think the code should be updated for the case of VirtualSize
== 0? In the current implementation, it will round 0 up to the
SectionAlignment, which is what I believe will happen in the PE
loader, also, but want to get your thoughts on it.

This would still be a short term solution to fix the active bug
(ARM64 cannot boot a 64k pagesize Linux with a broken MAT table or
we will boot without the benefit of the MAT, if we happen to have
an aligned total region but not each entry, testing shows). I do
think it is a broken pattern to try to determine the image records
after we've loaded them into memory, the PE loader should create
them with the factual data of what was loaded.

Thanks,
Oliver


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#116756): https://edk2.groups.io/g/devel/message/116756
Mute This Topic: https://groups.io/mt/104610770/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



  reply	other threads:[~2024-03-14 14:45 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20240227202721.30070-1-osde@...>
     [not found] ` <20240227202721.30070-2-osde@...>
     [not found]   ` <CAMj1kXGDCsFZmPsKe4GD1XdT+t9SKHVStNSx-FsA_YwBiVh6ng@...>
     [not found]     ` <5d95548d-d25f-4306-a271-a2960cc81ccf@...>
     [not found]       ` <CAMj1kXGsso6BjOajEgqE3AOC1ZSdStQmeYmCoJMQo_tK1QHTWw@...>
     [not found]         ` <531b3cfe-b57c-4b5a-97fd-45e428f97853@...>
2024-03-14  9:57           ` [edk2-devel][PATCH v1 1/3] MdeModulePkg: ImagePropertiesRecordLib: Use SectionAlignment for CodeSize Marvin Häuser
2024-03-14 14:45             ` Oliver Smith-Denny [this message]
2024-03-14 21:57               ` Marvin Häuser
2024-03-15 22:56                 ` Oliver Smith-Denny
2024-03-15 23:45                   ` Marvin Häuser
2024-03-16 15:47                     ` Oliver Smith-Denny
2024-02-27 20:27 [edk2-devel][PATCH v1 0/3] MdeModulePkg: ImagePropertiesRecordLib Fixes Oliver Smith-Denny
2024-02-27 20:27 ` [edk2-devel][PATCH v1 1/3] MdeModulePkg: ImagePropertiesRecordLib: Use SectionAlignment for CodeSize Oliver Smith-Denny
2024-03-01 11:58   ` Ard Biesheuvel
2024-03-04 17:49     ` Oliver Smith-Denny
2024-03-04 18:54       ` Ard Biesheuvel
2024-03-04 19:24         ` Oliver Smith-Denny
     [not found]         ` <17B9A63355E03AE5.30946@groups.io>
2024-03-04 22:38           ` Oliver Smith-Denny
     [not found]           ` <17B9B0CE1BDDC06B.30946@groups.io>
2024-03-11 19:34             ` Oliver Smith-Denny
2024-03-12  8:32               ` Ard Biesheuvel
2024-03-12 14:38                 ` Oliver Smith-Denny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9d6f07c2-3579-4555-94eb-fb7b79a8e05a@linux.microsoft.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox