From: "Marvin Häuser" <mhaeuser@posteo.de>
To: devel@edk2.groups.io, ray.ni@intel.com
Cc: "Ma, Maurice" <maurice.ma@intel.com>,
"Dong, Guo" <guo.dong@intel.com>,
"You, Benjamin" <benjamin.you@intel.com>
Subject: Re: [edk2-devel] [PATCH v2 2/3] UefiPayloadPkg: Add PayloadLoaderPeim which can load ELF payload
Date: Wed, 9 Jun 2021 10:03:31 +0000 [thread overview]
Message-ID: <8eb8db11-90c2-57e0-6868-3532c5af8073@posteo.de> (raw)
In-Reply-To: <CO1PR11MB4930473BF50D013EDA2AB0AF8C369@CO1PR11MB4930.namprd11.prod.outlook.com>
On 09.06.21 11:49, Ni, Ray wrote:
>> Thank you for your quick reply, comments inline.
> I have to be quick because my project depends on the check-in of this code😊
Sure, so thanks a lot for taking the time!
One non-trivial comment at the bottom.
>
>>>>> + for ( Index = 0
>>>>> + ; RelaEntrySize * Index < RelaSize
>>>> Overflow?
>>>>
>>> Will change from:
>>> RelaEntrySize * Index < RelaSize
>>> to:
>>> Index < RelaSize / RelaEntrySize
>> imo add ASSERT for RelaEntrySize > 0 then.
> Sure. To avoid dividend by zero error.
>
>> Basically the alignment of any offset with which a pointer to
>> non-trivially-aligned (i.e. requirement > 1 Byte) data can be forged
>> must be checked.
>>
>> Examples from our new PE loader:
>> https://github.com/mhaeuser/ISPRASOpen-
>> SecurePE/tree/6a7abcd8647cf6faa733082f6d8dcc9adc141d7e/src/PeCoffInit.
>> c#L1226
>> ->
>> https://github.com/mhaeuser/ISPRASOpen-
>> SecurePE/tree/6a7abcd8647cf6faa733082f6d8dcc9adc141d7e/src/PeCoffInit.
>> c#L1242
>> https://github.com/mhaeuser/ISPRASOpen-
>> SecurePE/tree/6a7abcd8647cf6faa733082f6d8dcc9adc141d7e/src/PeCoffInit.
>> c#L1389
>> ->
>> https://github.com/mhaeuser/ISPRASOpen-
>> SecurePE/tree/6a7abcd8647cf6faa733082f6d8dcc9adc141d7e/src/PeCoffInit.
>> c#L148
>>
>> The idea here is that the base pointer (i.e. image header) is "maximally
>> aligned" (i.e. can hold data of any platform data alignment). For the 8
>> Bytes AllocatePool() guarantees (file data), this is sufficient for any
>> primitive and composite data type. For the 4 KB AllocatePages()
>> guarantees (destination), this is sufficient of that, and for advanced
>> things like AVX (however not needed here). If the base is maximally
>> aligned, Base + X is guaranteed aligned for A if X is aligned for A,
>> i.e. X % _Alignof(A) = 0. Failing to verify this can cause exceptions on
>> platforms which don't support or have disabled the capability to perform
>> unaligned memory accesses.
>>
> I understand now. I remember that X86 contains a control flag that can trigger
> CPU exception as well when unaligned access happens.
>
> But adding such check in all places might require a huge change to today's code.
> Can you accept that I ignore such check for now and add it later?
Of course, I mean, it needs some EDK II wide concept first anyway. Just
the overall situation is similar (but a lot worse in severity) with the
PE loader and now it's not easy to address the issues. :)
So if there are plans to address it, that's great!
>
>>>>> + //
>>>>> + // It's abnormal a DYN ELF doesn't contain a dynamic section.
>>>>> + //
>>>>> + ASSERT (DynShdr != NULL);
>>>>> + if (DynShdr == NULL) {
>>>>> + return EFI_UNSUPPORTED;
>>>>> + }
>>>>> + ASSERT (DynShdr->sh_type == SHT_DYNAMIC);
>>>>> + ASSERT (DynShdr->sh_entsize >= sizeof (*Dyn));
>>>> Abnormalities in unknown/untrusted data must be filtered with a runtime
>>>> check, not with an ASSERT.
>>>>
>>> Sure. I will add if-check below the assertion so assertion-enabled build can
>>> report the errors earlier.
>> I have started this discussion under another patch, maybe I should write
>> a broader-scope mail to the list for comments. Basically using ASSERTs
>> for anything but *impossible* (*not* assuming the input data is
>> well-formed) situations significantly reduces the efficacy of dynamic
>> analysis. When doing fuzzing for example, you want to keep the ASSERTs
>> enabled to be made aware of any internal invariant violations. But if
>> ASSERTs happen on possible conditions, they will kill the fuzzing
>> process for no good reason. Turning them off will not analyse your
>> ASSERTs for possible code defects.
>>
>> Maybe fuzzing would be a good idea for this library? :)
>>
> I understand now. I am ok to remove assertion for external inputs.
Thank you.
>
>>>
>>>>> + for ( Index = 0, Dyn = (Elf32_Dyn *) (ElfCt->FileBase + DynShdr-
>>> sh_offset)
>>>>> + ; Index < DynShdr->sh_size / DynShdr->sh_entsize
>>>> Is "sh_entsize" checked for 0?
>>>>
>>> No need. Because code above makes sure sh_entsize >= sizeof (*Dyn).
>> When you turn it into a runtime check as discussed above, yes.
>>
>>>
>>>>> + ASSERT (RelShdr->sh_type == RelaType);
>>>>> + ASSERT (RelShdr->sh_entsize == RelaEntrySize);
>>>> See above.
>>>>
>>> Agree. Will add if-checks.
>>>
>>>
>>>>> + DEBUG ((DEBUG_INFO, "DYN ELF: Relocate using dynamic
>> sections...\n"));
>>>>> + Status = RelocateElf32Dynamic (ElfCt);
>>>>> + ASSERT_EFI_ERROR (Status);
>>>> Why cannot this fail?
>>>>
>>> A DYN type ELF image should contain dynamic section. So only an abnormal
>> ELF image can fail.
>>
>> Same ASSERT point as above, "cannot" refers to both well-formed and
>> ill-formed images.
>>
> Sure. Will remove assertion.
>
>>>
>>>>> + return (Elf64_Phdr *)(ImageBase + Ehdr->e_phoff + Index * Ehdr-
>>> e_phentsize);
>>>> Alignment checks? Bounds checks?
>>>>
>>> For the alignment checks, do you suggest that I should make sure the
>> segment should be placed
>>> in the address that meets the alignment requirement?
>> It could be implemented, PE code does it, but I meant pointer alignment
>> as discussed above somewhere. I don't think ELFs would likely request
>> more than page alignment, but abort + DEBUG message sounds like a good
>> idea.
>>
>>> ELF spec just requires below for Elf64_Phdr.p_align:
>>> loadable process segments must have congruent values for p_vaddr and
>> p_offset, modulo the page size.
>>> I can add such check in ParseElfImage().
>>>
>>>>> + ProcessRelocation64 (
>>>>> + (Elf64_Rela *) (ElfCt->FileBase + RelShdr->sh_offset),
>>>> Alignment? :) I know there is no real concept in EDK II yet, but it
>>>> really is needed.
>>>>
>>> Can you explain a bit more on the alignment?
>> Done above, sorry.
>>
>>>
>>>>> +
>>>>> +/**
>>>>> + Check if the ELF image is valid.
>>>>> +
>>>>> + @param[in] ImageBase Memory address of an image.
>>>>> +
>>>>> + @retval TRUE if valid.
>>>>> +
>>>>> +**/
>>>>> +BOOLEAN
>>>>> +IsElfFormat (
>>>>> + IN CONST UINT8 *ImageBase
>>>> You cannot safely inspect untrusted/unknown data without a size field,
>>>> also needs checks below.
>>>>
>>> Agree. Original idea was to add a ELF loader that can load the ELF assuming
>>> the ELF image is well-formatted.
>> I get that idea, but the reality is that people will start using it for
>> external images once it is needed. :)
>> Sorry for being pedantic.
>>
>>> But with your help, I am glad to enhance the logic a bit more to expand
>>> the support of external ELF images.
>>>
>>> Will add a "UINTN ImageSize" parameter.
>>>
>>>>> + )
>>>>> +{
>>>>> + Elf32_Ehdr *Elf32Hdr;
>>>>> + Elf64_Ehdr *Elf64Hdr;
>>>>> +
>>>>> + ASSERT (ImageBase != NULL);
>>>>> +
>>>>> + Elf32Hdr = (Elf32_Ehdr *)ImageBase;
>>>>> +
>>>>> + //
>>>>> + // Start with correct signature "\7fELF"
>>>>> + //
>>>>> + if ((Elf32Hdr->e_ident[EI_MAG0] != ELFMAG0) ||
>>>>> + (Elf32Hdr->e_ident[EI_MAG1] != ELFMAG1) ||
>>>>> + (Elf32Hdr->e_ident[EI_MAG1] != ELFMAG1) ||
>>>>> + (Elf32Hdr->e_ident[EI_MAG2] != ELFMAG2)
>>>>> + ) {
>>>>> + return FALSE;
>>>>> + }
>>>>> +
>>>>> + //
>>>>> + // Support little-endian only
>>>>> + //
>>>>> + if (Elf32Hdr->e_ident[EI_DATA] != ELFDATA2LSB) {
>>>>> + return FALSE;
>>>>> + }
>>>>> +
>>>>> + //
>>>>> + // Check 32/64-bit architecture
>>>>> + //
>>>>> + if (Elf32Hdr->e_ident[EI_CLASS] == ELFCLASS64) {
>>>>> + Elf64Hdr = (Elf64_Ehdr *)Elf32Hdr;
>>>>> + Elf32Hdr = NULL;
>>>>> + } else if (Elf32Hdr->e_ident[EI_CLASS] == ELFCLASS32) {
>>>>> + Elf64Hdr = NULL;
>>>>> + } else {
>>>>> + return FALSE;
>>>>> + }
>>>> Why are the branches above and below separated when they map
>> basically 1:1?
>>> Indeed. Thanks for catching this.
>>> Will merge the separate "if" together.
>>>
>>>>> +
>>>>> + if (Elf64Hdr != NULL) {
>>>>> + //
>>>>> + // Support intel architecture only for now
>>>>> + //
>>>>> + if (Elf64Hdr->e_machine != EM_X86_64) {
>>>>> + return FALSE;
>>>>> + }
>>>>> +
>>>>> + // Use last section as end of file
>>>>> + Status = GetElfSectionPos (ElfCt, ElfCt->ShNum - 1, &Offset, &Size);
>>>> What if ShNum is 0?
>>>>
>>> Agree. The logic to calculate file size might not be needed.
>>> Let me confirm and try to remove the entire function.
>>>
>>>
>>>>> + if (ElfCt->EiClass == ELFCLASS32) {
>>>>> + Elf32Hdr = (Elf32_Ehdr *)ElfCt->FileBase;
>>>>> + FileSize2 = Elf32Hdr->e_shoff + Elf32Hdr->e_shentsize * Elf32Hdr-
>>> e_shnum;
>>>>> + } else if (ElfCt->EiClass == ELFCLASS64) {
>>>>> + Elf64Hdr = (Elf64_Ehdr *)ElfCt->FileBase;
>>>>> + FileSize2 = (UINTN)(Elf64Hdr->e_shoff + Elf64Hdr->e_shentsize *
>> Elf64Hdr->e_shnum);
>>>>> + }
>>>> Overflows?
>>>>
>>> Integer overflow?
>> Yes, sorry.
>>
>>> Will add integer overflow check if this file size calculation logic is still needed.
>>>
>>>
>>>>> +
>>>>> + if (ElfCt == NULL) {
>>>>> + return EFI_INVALID_PARAMETER;
>>>>> + }
>>>> As this is function contract, I'd replace this with an ASSERT, or at
>>>> least have both.
>>>>
>>> I will put "ASSERT (ElfCt != NULL)" above the if.
>>>
>>>
>>>>> + ZeroMem (ElfCt, sizeof(ELF_IMAGE_CONTEXT));
>>>>> +
>>>>> + if (ImageBase == NULL) {
>>>>> + return (ElfCt->ParseStatus = EFI_INVALID_PARAMETER);
>>>> If I see it correctly, all instances that can assign ParseStatus also
>>>> return it. Why is the member needed at all?
>>>>
>>> I expect that caller needs to call ParseElfImage() to get the ParseStatus
>>> properly assigned before calling LoadElfImage().
>> But it just throws back the error without doing anything as far as I can
>> see. For the new PE loader, there are "PeCoffInitializeContext" (more or
>> less "ParseElfImage") and "PeCoffLoadImage" (more or less
>> "LoadElfImage"), and there is a precondition to not call latter when
>> former error'd.
>> A minimal caller cal look like this:
>>
>> Status = PeCoffInitializeContext (&Context, FileBuffer, FileSize);
>> if (RETURN_ERROR (Status)) {
>> return Status;
>> }
>>
>> // [ ... hash image, allocate destination, and so on ... ]
>>
>> PeCoffLoadImage (Context, Destination, DestinationSize);
>>
>> The load function is never invoked if the init function fails. This
>> gives an intuitive and easy-to-comprehend control flow. The old PE lib
>> also has a status member in the context, and it was one of the first
>> things I went away with. Callers should not read from the context, and
>> callees have clear contracts.
>>
> Without the ParseStatus field, callee cannot know whether ParseElfImage() is called.
It can by function contracts, the caller guarantees it. I.e. with the PE
library I linked, no other function must be called before the init function.
Your "ParseElfImage" function is very similar. The context is
initialized by it, i.e. it is trash if it is not called, i.e. it must be
called before other functions.
If it is called, which we know, the caller has the return status. For
PE, it means the caller must not proceed with any further PE processing
and abort immediately.
Is there any scenario where this does not work for ELF? Sorry if I
missed something.
Best regards,
Marvin
> There are several APIs which all depend on the well format of ELF image.
> For example:
> GetElfSectionName
> GetElfSectionPos
> LoadElfImage
>
> If the ParseStatus is removed, all above API implementations need to call
> ParseElfImage() again internally to make sure the ELF image is well formatted.
>
> Caller doesn't need to read the ParseStatus. It just need to check the return
> status of API calls.
>
>>> The member ParseStatus is checked in LoadElfImage() later.
>>> Today it's just PayloadLoaderPeim that calls the ElfLib functions.
>>> But I expect that the ElfLib functions can be public lib APIs in future
>>> if needs appear.
>>>
>>>
>>>>> + Name = (CHAR8 *)(ElfCt->FileBase + ElfCt->ShStrOff + Elf64Shdr-
>>> sh_name);
>>>> 0-termination checks, or return size?
>>>>
>>> I will validate the string section in ParseElfImage(). The validation logic will:
>>> 1. Verify that each section name is pointed from the e_shstrndx
>>> 2. Verify that section name strings don't occupy spaces outside of the string
>> section.
>>>
>>>>> +
>>>>> + ZeroMem (&Context, sizeof (Context));
>>>> This is done by the callee already.
>>>>
>>> Indeed. Will remove this.
>> Rest looks good, thanks a lot!
>>
>> If you have some time, please consider checking the rest for similar
>> issues. I maybe could help with it if you would like that, but not right
>> now, sorry. :)
>>
>> Best regards,
>> Marvin
>>
>
>
>
>
>
next prev parent reply other threads:[~2021-06-09 10:03 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-03 6:22 [PATCH v2 0/3] Add PayloadLoaderPeim which can load ELF payload Ni, Ray
2021-06-03 6:22 ` [PATCH v2 1/3] MdeModulePkg/UniversalPayload: Add definition for extra info in payload Ni, Ray
2021-06-03 6:37 ` [edk2-devel] " Wu, Hao A
2021-06-04 1:01 ` Ni, Ray
2021-06-04 1:02 ` Wu, Hao A
2021-06-07 9:07 ` Ni, Ray
2021-06-07 23:25 ` Wu, Hao A
2021-06-03 6:22 ` [PATCH v2 2/3] UefiPayloadPkg: Add PayloadLoaderPeim which can load ELF payload Ni, Ray
2021-06-07 1:47 ` Guo Dong
2021-06-07 21:53 ` [edk2-devel] " Marvin Häuser
2021-06-08 2:06 ` Ni, Ray
2021-06-08 3:10 ` Ni, Ray
2021-06-08 8:12 ` Marvin Häuser
2021-06-09 9:49 ` Ni, Ray
2021-06-09 10:03 ` Marvin Häuser [this message]
2021-06-10 3:40 ` Ni, Ray
2021-06-10 7:30 ` Marvin Häuser
2021-06-10 9:39 ` Ni, Ray
2021-06-10 10:13 ` Marvin Häuser
2021-06-10 10:43 ` Michael Brown
2021-06-10 11:37 ` Ni, Ray
[not found] ` <168735878F610E03.10233@groups.io>
2021-06-15 14:36 ` Ni, Ray
2021-06-15 17:31 ` Marvin Häuser
2021-06-03 6:22 ` [PATCH v2 3/3] PeiCore: Remove assertion when failing to load PE image Ni, Ray
2021-06-07 23:28 ` Wu, Hao A
2021-06-07 20:33 ` [edk2-devel] [PATCH v2 0/3] Add PayloadLoaderPeim which can load ELF payload Guo Dong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8eb8db11-90c2-57e0-6868-3532c5af8073@posteo.de \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox