From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mout02.posteo.de (mout02.posteo.de [185.67.36.66]) by mx.groups.io with SMTP id smtpd.web12.5712.1623233017173458816 for ; Wed, 09 Jun 2021 03:03:37 -0700 Authentication-Results: mx.groups.io; dkim=fail reason="body hash did not verify" header.i=@posteo.de header.s=2017 header.b=cqRdLzt9; spf=pass (domain: posteo.de, ip: 185.67.36.66, mailfrom: mhaeuser@posteo.de) Received: from submission (posteo.de [89.146.220.130]) by mout02.posteo.de (Postfix) with ESMTPS id 298A5240101 for ; Wed, 9 Jun 2021 12:03:34 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.de; s=2017; t=1623233015; bh=c+WjPPOdLpmgwsUPMIvHNnD+AqrLhy9X5N4gY4nDuu0=; h=Subject:To:Cc:From:Date:From; b=cqRdLzt9uhpw8yWYYcIrZgOr+1dV3XuMuLi7LYDkPhf5wxApbuzezDye/3ZW22IbX tZa65pmIl9R0iOg10q2bnYg8ANZUtDme2K6ne++M7iRNMhZYCQEdMa06loodq7oD3K NFeJnteRnNz69l7mgf2aB2jmjBdak8pEvRB7Y2k9sIPQ5glu8mQvfNvZwH6xlt42km ZsUu/IdNoRKz/R0MlkGyEEPuRoaubw+pSHqRA5qBCVwleKGNJIvkXUC/KdQXD9K/zv ABfsCq0kGtVGHeMdB6s5N8BmziWxG6uz4g42V00oQOsV8Le1h4vM3qiVgzhWL0N1SF uIC+vBplGlUig== Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4G0N3S5sf3z6tmm; Wed, 9 Jun 2021 12:03:32 +0200 (CEST) Subject: Re: [edk2-devel] [PATCH v2 2/3] UefiPayloadPkg: Add PayloadLoaderPeim which can load ELF payload To: devel@edk2.groups.io, ray.ni@intel.com Cc: "Ma, Maurice" , "Dong, Guo" , "You, Benjamin" References: <20210603062259.1390-1-ray.ni@intel.com> <20210603062259.1390-3-ray.ni@intel.com> <812b8f13-e951-5d27-9bd1-61711e6dd840@posteo.de> <486c5ab8-240e-3ac5-5a4a-7f368cb68644@posteo.de> From: =?UTF-8?B?TWFydmluIEjDpHVzZXI=?= Message-ID: <8eb8db11-90c2-57e0-6868-3532c5af8073@posteo.de> Date: Wed, 9 Jun 2021 10:03:31 +0000 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-GB Content-Transfer-Encoding: quoted-printable On 09.06.21 11:49, Ni, Ray wrote: >> Thank you for your quick reply, comments inline. > I have to be quick because my project depends on the check-in of this co= de=F0=9F=98=8A Sure, so thanks a lot for taking the time! One non-trivial comment at the bottom. > >>>>> + for ( Index =3D 0 >>>>> + ; RelaEntrySize * Index < RelaSize >>>> Overflow? >>>> >>> Will change from: >>> RelaEntrySize * Index < RelaSize >>> to: >>> Index < RelaSize / RelaEntrySize >> imo add ASSERT for RelaEntrySize > 0 then. > Sure. To avoid dividend by zero error. > >> Basically the alignment of any offset with which a pointer to >> non-trivially-aligned (i.e. requirement > 1 Byte) data can be forged >> must be checked. >> >> Examples from our new PE loader: >> https://github.com/mhaeuser/ISPRASOpen- >> SecurePE/tree/6a7abcd8647cf6faa733082f6d8dcc9adc141d7e/src/PeCoffInit. >> c#L1226 >> -> >> https://github.com/mhaeuser/ISPRASOpen- >> SecurePE/tree/6a7abcd8647cf6faa733082f6d8dcc9adc141d7e/src/PeCoffInit. >> c#L1242 >> https://github.com/mhaeuser/ISPRASOpen- >> SecurePE/tree/6a7abcd8647cf6faa733082f6d8dcc9adc141d7e/src/PeCoffInit. >> c#L1389 >> -> >> https://github.com/mhaeuser/ISPRASOpen- >> SecurePE/tree/6a7abcd8647cf6faa733082f6d8dcc9adc141d7e/src/PeCoffInit. >> c#L148 >> >> The idea here is that the base pointer (i.e. image header) is "maximall= y >> aligned" (i.e. can hold data of any platform data alignment). For the 8 >> Bytes AllocatePool() guarantees (file data), this is sufficient for any >> primitive and composite data type. For the 4 KB AllocatePages() >> guarantees (destination), this is sufficient of that, and for advanced >> things like AVX (however not needed here). If the base is maximally >> aligned, Base + X is guaranteed aligned for A if X is aligned for A, >> i.e. X % _Alignof(A) =3D 0. Failing to verify this can cause exceptions= on >> platforms which don't support or have disabled the capability to perfor= m >> unaligned memory accesses. >> > I understand now. I remember that X86 contains a control flag that can t= rigger > CPU exception as well when unaligned access happens. > > But adding such check in all places might require a huge change to today= 's code. > Can you accept that I ignore such check for now and add it later? Of course, I mean, it needs some EDK II wide concept first anyway. Just=20 the overall situation is similar (but a lot worse in severity) with the=20 PE loader and now it's not easy to address the issues. :) So if there are plans to address it, that's great! > >>>>> + // >>>>> + // It's abnormal a DYN ELF doesn't contain a dynamic section. >>>>> + // >>>>> + ASSERT (DynShdr !=3D NULL); >>>>> + if (DynShdr =3D=3D NULL) { >>>>> + return EFI_UNSUPPORTED; >>>>> + } >>>>> + ASSERT (DynShdr->sh_type =3D=3D SHT_DYNAMIC); >>>>> + ASSERT (DynShdr->sh_entsize >=3D sizeof (*Dyn)); >>>> Abnormalities in unknown/untrusted data must be filtered with a runti= me >>>> check, not with an ASSERT. >>>> >>> Sure. I will add if-check below the assertion so assertion-enabled bui= ld can >>> report the errors earlier. >> I have started this discussion under another patch, maybe I should writ= e >> a broader-scope mail to the list for comments. Basically using ASSERTs >> for anything but *impossible* (*not* assuming the input data is >> well-formed) situations significantly reduces the efficacy of dynamic >> analysis. When doing fuzzing for example, you want to keep the ASSERTs >> enabled to be made aware of any internal invariant violations. But if >> ASSERTs happen on possible conditions, they will kill the fuzzing >> process for no good reason. Turning them off will not analyse your >> ASSERTs for possible code defects. >> >> Maybe fuzzing would be a good idea for this library? :) >> > I understand now. I am ok to remove assertion for external inputs. Thank you. > >>> >>>>> + for ( Index =3D 0, Dyn =3D (Elf32_Dyn *) (ElfCt->FileBase + DynSh= dr- >>> sh_offset) >>>>> + ; Index < DynShdr->sh_size / DynShdr->sh_entsize >>>> Is "sh_entsize" checked for 0? >>>> >>> No need. Because code above makes sure sh_entsize >=3D sizeof (*Dyn). >> When you turn it into a runtime check as discussed above, yes. >> >>> >>>>> + ASSERT (RelShdr->sh_type =3D=3D RelaType); >>>>> + ASSERT (RelShdr->sh_entsize =3D=3D RelaEntrySize); >>>> See above. >>>> >>> Agree. Will add if-checks. >>> >>> >>>>> + DEBUG ((DEBUG_INFO, "DYN ELF: Relocate using dynamic >> sections...\n")); >>>>> + Status =3D RelocateElf32Dynamic (ElfCt); >>>>> + ASSERT_EFI_ERROR (Status); >>>> Why cannot this fail? >>>> >>> A DYN type ELF image should contain dynamic section. So only an abnorm= al >> ELF image can fail. >> >> Same ASSERT point as above, "cannot" refers to both well-formed and >> ill-formed images. >> > Sure. Will remove assertion. > >>> >>>>> + return (Elf64_Phdr *)(ImageBase + Ehdr->e_phoff + Index * Ehdr- >>> e_phentsize); >>>> Alignment checks? Bounds checks? >>>> >>> For the alignment checks, do you suggest that I should make sure the >> segment should be placed >>> in the address that meets the alignment requirement? >> It could be implemented, PE code does it, but I meant pointer alignment >> as discussed above somewhere. I don't think ELFs would likely request >> more than page alignment, but abort + DEBUG message sounds like a good >> idea. >> >>> ELF spec just requires below for Elf64_Phdr.p_align: >>> loadable process segments must have congruent values for p_vaddr a= nd >> p_offset, modulo the page size. >>> I can add such check in ParseElfImage(). >>> >>>>> + ProcessRelocation64 ( >>>>> + (Elf64_Rela *) (ElfCt->FileBase + RelShdr->sh_offset), >>>> Alignment? :) I know there is no real concept in EDK II yet, but it >>>> really is needed. >>>> >>> Can you explain a bit more on the alignment? >> Done above, sorry. >> >>> >>>>> + >>>>> +/** >>>>> + Check if the ELF image is valid. >>>>> + >>>>> + @param[in] ImageBase Memory address of an image. >>>>> + >>>>> + @retval TRUE if valid. >>>>> + >>>>> +**/ >>>>> +BOOLEAN >>>>> +IsElfFormat ( >>>>> + IN CONST UINT8 *ImageBase >>>> You cannot safely inspect untrusted/unknown data without a size field= , >>>> also needs checks below. >>>> >>> Agree. Original idea was to add a ELF loader that can load the ELF ass= uming >>> the ELF image is well-formatted. >> I get that idea, but the reality is that people will start using it for >> external images once it is needed. :) >> Sorry for being pedantic. >> >>> But with your help, I am glad to enhance the logic a bit more to expan= d >>> the support of external ELF images. >>> >>> Will add a "UINTN ImageSize" parameter. >>> >>>>> + ) >>>>> +{ >>>>> + Elf32_Ehdr *Elf32Hdr; >>>>> + Elf64_Ehdr *Elf64Hdr; >>>>> + >>>>> + ASSERT (ImageBase !=3D NULL); >>>>> + >>>>> + Elf32Hdr =3D (Elf32_Ehdr *)ImageBase; >>>>> + >>>>> + // >>>>> + // Start with correct signature "\7fELF" >>>>> + // >>>>> + if ((Elf32Hdr->e_ident[EI_MAG0] !=3D ELFMAG0) || >>>>> + (Elf32Hdr->e_ident[EI_MAG1] !=3D ELFMAG1) || >>>>> + (Elf32Hdr->e_ident[EI_MAG1] !=3D ELFMAG1) || >>>>> + (Elf32Hdr->e_ident[EI_MAG2] !=3D ELFMAG2) >>>>> + ) { >>>>> + return FALSE; >>>>> + } >>>>> + >>>>> + // >>>>> + // Support little-endian only >>>>> + // >>>>> + if (Elf32Hdr->e_ident[EI_DATA] !=3D ELFDATA2LSB) { >>>>> + return FALSE; >>>>> + } >>>>> + >>>>> + // >>>>> + // Check 32/64-bit architecture >>>>> + // >>>>> + if (Elf32Hdr->e_ident[EI_CLASS] =3D=3D ELFCLASS64) { >>>>> + Elf64Hdr =3D (Elf64_Ehdr *)Elf32Hdr; >>>>> + Elf32Hdr =3D NULL; >>>>> + } else if (Elf32Hdr->e_ident[EI_CLASS] =3D=3D ELFCLASS32) { >>>>> + Elf64Hdr =3D NULL; >>>>> + } else { >>>>> + return FALSE; >>>>> + } >>>> Why are the branches above and below separated when they map >> basically 1:1? >>> Indeed. Thanks for catching this. >>> Will merge the separate "if" together. >>> >>>>> + >>>>> + if (Elf64Hdr !=3D NULL) { >>>>> + // >>>>> + // Support intel architecture only for now >>>>> + // >>>>> + if (Elf64Hdr->e_machine !=3D EM_X86_64) { >>>>> + return FALSE; >>>>> + } >>>>> + >>>>> + // Use last section as end of file >>>>> + Status =3D GetElfSectionPos (ElfCt, ElfCt->ShNum - 1, &Offset, &S= ize); >>>> What if ShNum is 0? >>>> >>> Agree. The logic to calculate file size might not be needed. >>> Let me confirm and try to remove the entire function. >>> >>> >>>>> + if (ElfCt->EiClass =3D=3D ELFCLASS32) { >>>>> + Elf32Hdr =3D (Elf32_Ehdr *)ElfCt->FileBase; >>>>> + FileSize2 =3D Elf32Hdr->e_shoff + Elf32Hdr->e_shentsize * Elf32= Hdr- >>> e_shnum; >>>>> + } else if (ElfCt->EiClass =3D=3D ELFCLASS64) { >>>>> + Elf64Hdr =3D (Elf64_Ehdr *)ElfCt->FileBase; >>>>> + FileSize2 =3D (UINTN)(Elf64Hdr->e_shoff + Elf64Hdr->e_shentsize= * >> Elf64Hdr->e_shnum); >>>>> + } >>>> Overflows? >>>> >>> Integer overflow? >> Yes, sorry. >> >>> Will add integer overflow check if this file size calculation logic is= still needed. >>> >>> >>>>> + >>>>> + if (ElfCt =3D=3D NULL) { >>>>> + return EFI_INVALID_PARAMETER; >>>>> + } >>>> As this is function contract, I'd replace this with an ASSERT, or at >>>> least have both. >>>> >>> I will put "ASSERT (ElfCt !=3D NULL)" above the if. >>> >>> >>>>> + ZeroMem (ElfCt, sizeof(ELF_IMAGE_CONTEXT)); >>>>> + >>>>> + if (ImageBase =3D=3D NULL) { >>>>> + return (ElfCt->ParseStatus =3D EFI_INVALID_PARAMETER); >>>> If I see it correctly, all instances that can assign ParseStatus also >>>> return it. Why is the member needed at all? >>>> >>> I expect that caller needs to call ParseElfImage() to get the ParseSta= tus >>> properly assigned before calling LoadElfImage(). >> But it just throws back the error without doing anything as far as I ca= n >> see. For the new PE loader, there are "PeCoffInitializeContext" (more o= r >> less "ParseElfImage") and "PeCoffLoadImage" (more or less >> "LoadElfImage"), and there is a precondition to not call latter when >> former error'd. >> A minimal caller cal look like this: >> >> Status =3D PeCoffInitializeContext (&Context, FileBuffer, FileSize)= ; >> if (RETURN_ERROR (Status)) { >> return Status; >> } >> >> // [ ... hash image, allocate destination, and so on ... ] >> >> PeCoffLoadImage (Context, Destination, DestinationSize); >> >> The load function is never invoked if the init function fails. This >> gives an intuitive and easy-to-comprehend control flow. The old PE lib >> also has a status member in the context, and it was one of the first >> things I went away with. Callers should not read from the context, and >> callees have clear contracts. >> > Without the ParseStatus field, callee cannot know whether ParseElfImage(= ) is called. It can by function contracts, the caller guarantees it. I.e. with the PE= =20 library I linked, no other function must be called before the init functio= n. Your "ParseElfImage" function is very similar. The context is=20 initialized by it, i.e. it is trash if it is not called, i.e. it must be= =20 called before other functions. If it is called, which we know, the caller has the return status. For=20 PE, it means the caller must not proceed with any further PE processing=20 and abort immediately. Is there any scenario where this does not work for ELF? Sorry if I=20 missed something. Best regards, Marvin > There are several APIs which all depend on the well format of ELF image. > For example: > GetElfSectionName > GetElfSectionPos > LoadElfImage > > If the ParseStatus is removed, all above API implementations need to cal= l > ParseElfImage() again internally to make sure the ELF image is well form= atted. > > Caller doesn't need to read the ParseStatus. It just need to check the r= eturn > status of API calls. > >>> The member ParseStatus is checked in LoadElfImage() later. >>> Today it's just PayloadLoaderPeim that calls the ElfLib functions. >>> But I expect that the ElfLib functions can be public lib APIs in futur= e >>> if needs appear. >>> >>> >>>>> + Name =3D (CHAR8 *)(ElfCt->FileBase + ElfCt->ShStrOff + Elf64S= hdr- >>> sh_name); >>>> 0-termination checks, or return size? >>>> >>> I will validate the string section in ParseElfImage(). The validation = logic will: >>> 1. Verify that each section name is pointed from the e_shstrndx >>> 2. Verify that section name strings don't occupy spaces outside of the= string >> section. >>> >>>>> + >>>>> + ZeroMem (&Context, sizeof (Context)); >>>> This is done by the callee already. >>>> >>> Indeed. Will remove this. >> Rest looks good, thanks a lot! >> >> If you have some time, please consider checking the rest for similar >> issues. I maybe could help with it if you would like that, but not righ= t >> now, sorry. :) >> >> Best regards, >> Marvin >> > > >=20 > >