From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail02.groups.io (mail02.groups.io [66.175.222.108]) by spool.mail.gandi.net (Postfix) with ESMTPS id 2D86974004C for ; Thu, 14 Mar 2024 14:45:20 +0000 (UTC) DKIM-Signature: a=rsa-sha256; bh=Zhdp9xegBGr/ekCNbtuhJENXK5gAkHiHrVoZNbG9Z/w=; c=relaxed/simple; d=groups.io; h=DKIM-Filter:Message-ID:Date:MIME-Version:User-Agent:Subject:To:Cc:References:From:In-Reply-To:Precedence:List-Subscribe:List-Help:Sender:List-Id:Mailing-List:Delivered-To:Resent-Date:Reply-To:List-Unsubscribe-Post:List-Unsubscribe:Content-Language:Content-Type:Content-Transfer-Encoding; s=20240206; t=1710427518; v=1; b=QidS266xhrEuh+ALA/3sb59gWMmwfalJvv+0BoBwftqrr8JS8JG7ez37gqRcKzRA4bhMuVSi +p1aOzOPeB9izGfCGd273MTtWad+hwCS1yub/muGFpZ+MEJn2GSfiZp0lCQONW/6TopuF5ymT3p /DELWVfV4suHPqrcMJ4JJpFaWQ5gu0aiuy7WMygSiHXrHzoqUndicHIzdaVQ7CtEFUd5F/r300Z RW3Y4hQaN94xEpK+HZsxuKDz5XcDgtoNFTAS+BavERFZbnDm+ai8gwgEHawVONYfY7J7tv7T4xT +8ZyyyTzj99zzDsghK1/MFoLNfJxejcuPRP/0axyRWHtQ== X-Received: by 127.0.0.2 with SMTP id J5PBYY7687511xOzPA9dAtBv; Thu, 14 Mar 2024 07:45:18 -0700 X-Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mx.groups.io with SMTP id smtpd.web11.14322.1710427518208732645 for ; Thu, 14 Mar 2024 07:45:18 -0700 X-Received: from [10.137.194.171] (unknown [131.107.159.43]) by linux.microsoft.com (Postfix) with ESMTPSA id 98AFD20B74C0; Thu, 14 Mar 2024 07:45:17 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 98AFD20B74C0 Message-ID: <9d6f07c2-3579-4555-94eb-fb7b79a8e05a@linux.microsoft.com> Date: Thu, 14 Mar 2024 07:45:17 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [edk2-devel][PATCH v1 1/3] MdeModulePkg: ImagePropertiesRecordLib: Use SectionAlignment for CodeSize To: devel@edk2.groups.io, mhaeuser@posteo.de Cc: Leif Lindholm , Ard Biesheuvel , Sami Mujawar , Taylor Beebe References: <20240227202721.30070-1-osde@...> <20240227202721.30070-2-osde@...> <5d95548d-d25f-4306-a271-a2960cc81ccf@...> <531b3cfe-b57c-4b5a-97fd-45e428f97853@...> <8DF936BA-F1D0-4513-A8F4-36A2CBEFD84A@posteo.de> From: "Oliver Smith-Denny" In-Reply-To: <8DF936BA-F1D0-4513-A8F4-36A2CBEFD84A@posteo.de> Precedence: Bulk List-Subscribe: List-Help: Sender: devel@edk2.groups.io List-Id: Mailing-List: list devel@edk2.groups.io; contact devel+owner@edk2.groups.io Resent-Date: Thu, 14 Mar 2024 07:45:18 -0700 Reply-To: devel@edk2.groups.io,osde@linux.microsoft.com List-Unsubscribe-Post: List-Unsubscribe=One-Click List-Unsubscribe: X-Gm-Message-State: 6ROrRf51GIiktBK4S0KwSZfHx7686176AA= Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable X-GND-Status: LEGIT Authentication-Results: spool.mail.gandi.net; dkim=pass header.d=groups.io header.s=20240206 header.b=QidS266x; dmarc=fail reason="SPF not aligned (relaxed), DKIM not aligned (relaxed)" header.from=linux.microsoft.com (policy=none); spf=pass (spool.mail.gandi.net: domain of bounce@groups.io designates 66.175.222.108 as permitted sender) smtp.mailfrom=bounce@groups.io Hi Marvin, Thanks for looking at this. Since sending out the v1, I have done a lot more digging into our PE loader, others, the spec, and differences between MSVC and our ElfConvert tool. I agree that my understanding was flawed in my original comments. On 3/14/2024 2:57 AM, Marvin H=C3=A4user wrote: > Good day everyone, >=20 > Sorry for interjecting, but I=E2=80=99d like to avoid premature changes t= o PE=20 > code that would only make the current mess even worse. >=20 >> On 4. Mar 2024, at 20:24, Oliver Smith-Denny wrote: >> >> I relooked at the spec, you are correct, SizeOfRawData is aligned to >> the FileAlignment. So this is only a bug for a file with >> SectionAlignment !=3D FileAlignment. >=20 > Still no, for some reasons you already mentioned, but also reasons you=20 > didn=E2=80=99t: > - VirtualSize often is not aligned, so they can still trivially mismatch. > - VirtualSize includes zero-initialized data not part of the file. This does not appear to be the case with MSVC built binaries. I am seeing that VirtualSize is the size of the data-initialized part of the section. Whereas on our ElfConverted binaries, what you say is true. The difference concerns me. > - In the old days, VirtualSize =3D 0 was used to disable loading of debug= =20 > data, but SizeOfRawData remained intact, so it could be loaded on demand. > - SizeOfRawData may be unaligned due to bugs. In fact, there is no=20 > reason to trust FileAlignment, it has no real meaning in an UEFI context= =20 > (I am not even sure it does on Windows). >=20 >> I see, yes I do believe VirtualSize could be used here instead. >=20 > *Must*. As Ard said, SizeOfRawData is only interesting to know how many= =20 > bytes to copy from the file and for *nothing* else. Agreed. I have done further research and v2 uses VirtualSize. I was initially led astray here because the ElfConvert tool sets VirtualSize and SizeOfRawData to the SectionAlignment universally: https://github.com/tianocore/edk2/blob/5572b43c6767f7cc46b074ae1fc288f6eccd= c65d/BaseTools/Source/C/GenFw/ElfConvert.c#L134-L136 >=20 >> Two >> things give me pause. One is that the PE spec states that SizeOfRawData >> is rounded and VirtualSize is not, so that SizeOfRawData may be greater >> than the VirtualSize in some cases (which seems incorrect). >> >> The other is that the image loader partially uses VirtualSize when >> loading: >> >> https://github.com/tianocore/edk2/blob/918288ab5a7c3abe9c58d576ccc0ae32e= 2c7dea0/MdePkg/Library/BasePeCoffLib/BasePeCoff.c#L1399-L1400 >> >> However, when determining the size of a loaded image (and therefore the >> number of pages to allocate) it will allocate an extra page: >> >> https://github.com/tianocore/edk2/blob/918288ab5a7c3abe9c58d576ccc0ae32e= 2c7dea0/MdeModulePkg/Core/Dxe/Image/Image.c#L646-L652 >> >=20 > Sorry, I don=E2=80=99t understand the =E2=80=9Chowever=E2=80=9D, this is = sound. >=20 >> as ImageSize here is: >> >> https://github.com/tianocore/edk2/blob/918288ab5a7c3abe9c58d576ccc0ae32e= 2c7dea0/MdePkg/Library/BasePeCoffLib/BasePeCoff.c#L312 >> >> Which according to the spec, SizeOfImage is the size of the image as >> loaded into memory and must be a multiple of section alignment. >=20 > Side note, SizeOfImage may be borked, not only due to bugs, but also due= =20 > to malice. There is literally no reason whatsoever to use it - to=20 > validate it, you need to calculate a sane value, and once you have that,= =20 > you obviously do not need SizeOfImage anymore. You could reject images=20 > with malformed SizeOfImage, but there likely are various=20 > legitimate-bugged ones out there. This is just an ancient artifact from= =20 > before memory safety was a thing. :) Yep, the PE headers are definitely filled with information that can be deduced from other sources and the spec is not very verbose, which is why I definitely want to tread carefully here. For example, it says SizeOfImage is the size of the image loaded into memory, which current implementations seem to take to mean the sum of the headers plus each section's VirtualSize rounded to the SectionAlignment (which the spec does call out that SizeOfImage must be aligned to the SectionAlignment). VirtualSize it says is also the size of the section as loaded into memory, but current implementations (other than our ElfConvert script which I'm currently investigating if we should change that) seem to take this to mean the size of the initialized data in memory, i.e. the "real" data from the image, not the zeroed data to get to the SectionAlignment. But the language in the spec is almost identical between the two. And again, we have a difference in our MSVC and gcc built binaries here. >=20 >> So, reflecting on this, let me test with VirtualSize here, I think >> that is the right value to use, the only time we would have a >> SizeOfRawData that is greater than the VirtualSize would be if our >> FileAlignment is greater than our SectionAlignment, which would be >> a misconfiguration. >=20 > No, it can simply be that FileAlignment =3D SectionAlignment and=20 > SizeOfRawData is aligned and VirtualSize is not. Even when both are=20 > aligned, the former may carry extra optional data (like previously=20 > mentioned debug data), or it might be some weird I/O performance related= =20 > padding or something. Agreed, this was misinformed by looking at gcc built binaries. >=20 >> I do think the ImageLoader should also be fixed to only allocate >> ImageSize number of pages (which should be the sum of the section >> VirtualSizes + any headers that aren't included). Might as well not >> waste an extra page for each image and then our image record code is >> simpler as well (ensuring we are protecting the right pages). >=20 > NO! :) There are *many* issues surrounding this: > - Sections may overlap (observed with early macOS EFI images, could be=20 > ignored, but needs to be validated across the landscape). > - Sections may have gaps (observed with =E2=80=9Cold" Linux EFI Shims iir= c,=20 > there=E2=80=99ll be plenty of broken images in the wild yet). > - The extra page is not pointless - as you can see, it is allocated when= =20 > exceeding EFI page size, which is the only granularity you can=20 > page-allocate at. So, when you require > 4K alignment, you need the=20 > extra page to be able to =E2=80=9Cshift=E2=80=9D the image into proper al= ignment in the=20 > allocated region as needed. In AUDK, I resolved this by abstracting=20 > existing, buried aligned-allocation code into a MemoryAllocationLibEx=20 > class:=20 > https://github.com/acidanthera/audk/blob/13f99fa7cbde0829ba98e1e4f2a03891= a0791ac6/MdeModulePkg/Library/CommonMemoryAllocationLib/CommonMemoryAllocat= ionLibEx.c#L70 > Without aligned-allocation, you cannot get rid of this extra page. >=20 > For reference, my image record code is here:=20 > https://github.com/acidanthera/audk/blob/13f99fa7cbde0829ba98e1e4f2a03891= a0791ac6/MdePkg/Library/BaseUefiImageLib/PeSupport.c#L335 Yep, this was me being dumb and not looking a few lines further ahead :) After sending this I looked deeper at this offhand comment and realized the purpose. >=20 >> I think this patch series should go in, as it is fixing an active bug, >> but I will also take a look at the image loader creating the image >> records and having a protocol it produces to retrieve the list, to >> attempt to avoid issues like this in the future. >=20 > Why does there need to be a protocol? Why should non-Core modules get=20 > any information about image topology? Agreed, only core code needs this. I think I said protocol originally so that the existing ImagePropertiesRecordLib could use it without depending on a core only function, but in a new implementation we would drop the lib. >=20 > A few general words of caution when dealing with PE: > - Do not trust the spec. > - Do not trust the loader. > - Do not trust the converter. > - Do *not* trust your intuition. >=20 > *Everything* is broken, from the producers to the consumers, from the=20 > signing to the validation tools, technically, security-wise, and=20 > design-wise. Do *not* change things without as little as a collection of= =20 > Windows and Linux images from various versions and various OpROMs to=20 > regression-test with. Otherwise, we will all be in a world of more pain. = :) >=20 I agree with you there. That being said, the current Image record code is, unsurprisingly, broken, and does need to be fixed. Can you review the v2 of this? The only difference is that I changed to VirtualSize. Do you think the code should be updated for the case of VirtualSize =3D=3D 0? In the current implementation, it will round 0 up to the SectionAlignment, which is what I believe will happen in the PE loader, also, but want to get your thoughts on it. This would still be a short term solution to fix the active bug (ARM64 cannot boot a 64k pagesize Linux with a broken MAT table or we will boot without the benefit of the MAT, if we happen to have an aligned total region but not each entry, testing shows). I do think it is a broken pattern to try to determine the image records after we've loaded them into memory, the PE loader should create them with the factual data of what was loaded. Thanks, Oliver -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#116756): https://edk2.groups.io/g/devel/message/116756 Mute This Topic: https://groups.io/mt/104610770/7686176 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io] -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-