> On 15. Mar 2024, at 23:57, Oliver Smith-Denny wrote:
>
> I don't think this is what I'm saying. What I am trying to say is that
> on MSVC, I see PE images getting created that have VirtualSize set to
> the actual number of initialized bytes in that section (not padded to
> the section alignment). On ElfConverted binaries, I see the VirtualSize
> is padded to the section alignment. I've dropped an example below

Ah, mismatched terminology. Zero-initialized as Ard and I used it refers to implicitly or explicitly 0-initialized global variables and such, which are not stored in the file, not to the padding. So when you mentioned “real data”, I assumed you meant strictly the non-0 data from the file. Same misunderstanding with SizeOfImage, so that’s all fine. Whew. :)

> No, the specific case where I was researching this was explicitly
> setting /ALIGN:0x10000 and /FILEALIGN:0x1000 for DXE_RUNTIME_DRIVERs
> on ARM64 (a UEFI spec requirement). So I would see the SizeOfRawData
> is aligned to the file alignment, as expected, but VirtualSize would
> be the actual size of the data. Again, the troubling thing here for
> me is that the same binary built with gcc has the VirtualSize aligned
> to the section alignment. And I have seen other code that loads PE
> images that relies on VirtualSize not including the padding. The spec
> is vague here, it says VirtualSize is the size of the section as
> loaded in memory (which would lead me to believe this should include
> padding) but it does not explicitly say it should be a multiple of
> the section alignment (as other fields do). But at a minimum I think
> we should have different toolchains doing the same behavior here.

Well, not rounding to pad is somewhat superior in some scenarios. If you round, you lose the information on what is section data and what is padding, so you might end up treating padding as data for some reason (because it is indistinguishable from the mentioned 0-initialized data).
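To make the information loss concrete, here is a rough Python sketch (not edk2 code; `align_up` and `describe_section` are hypothetical helper names made up for illustration). It assumes VirtualSize is the unpadded size, as MSVC emits it, and splits a section's in-memory footprint into file-backed data, zero-fill, and alignment padding, using the .text values from the dumps quoted further down:

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def describe_section(virtual_size: int, raw_size: int, section_alignment: int) -> dict:
    """Split a section's in-memory footprint into its parts.

    Assumption: virtual_size is the unpadded size (MSVC style). If the
    toolchain already rounded it up, "padding" comes out as 0 and the
    real boundary between data and padding cannot be recovered.
    """
    footprint = align_up(virtual_size, section_alignment)
    return {
        "file_backed": min(virtual_size, raw_size),   # bytes read from the file
        "zero_fill": max(0, virtual_size - raw_size),  # 0-initialized data not in the file
        "padding": footprint - virtual_size,           # alignment padding only
        "footprint": footprint,                        # memory actually occupied
    }


# MSVC .text: VirtualSize 0x30DF, SizeOfRawData 0x3200, SectionAlignment 0x1000.
msvc_text = describe_section(0x30DF, 0x3200, 0x1000)
# GCC ElfConverted .text: VirtualSize 0x4000, SizeOfRawData 0x4000, SectionAlignment 0x1000.
gcc_text = describe_section(0x4000, 0x4000, 0x1000)

print({name: hex(size) for name, size in msvc_text.items()})
print({name: hex(size) for name, size in gcc_text.items()})
```

Both sections occupy 0x4000 bytes in memory, but only the MSVC values let you see that 0xF21 of those bytes are padding. With the rounded GCC value the padding reports as 0, so a loader has no way to tell where the section data ends.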
This shouldn’t matter too much for executables and libraries, but MSVC/PE have a lot less of a distinction between object file and executable/library concepts (e.g. no distinction between sections and segments). That might be why they do it this way.

> See below for the VirtualSize examples, I'm confused on your comment on
> SizeOfImage. I agree that SizeOfImage covers everything as loaded into
> memory and I have not seen any issues there.

See first comment.

> Do you mind adding your RB to v2? And certainly if you have any other
> comments that is greatly appreciated.

Will try to remember tomorrow. :)

> Examples of the differences I see between MSVC and gcc binaries:
>
> I originally noticed this on ARM64 on edk2, but wanted to make sure I
> saw it on x64 too, so this is with binaries from Project Mu's QemuQ35Pkg
> (edk2 doesn't have VS2022 support and I didn't feel like adding it
> or reverting back to VS2019). For reference, this is building the
> current top of tree at a4dd5d9785f48302c95fec852338d6bd26dd961a.
>
> I dumped ReportStatusCodeRouterRuntimeDxe.efi from both using dumpbin
> (from VS2022) to examine the PE headers.
>
> MSVC selected header values:
>
> Main header:
> 0x3200 size of code
> 0x2400 size of initialized data
> 0x0 size of uninitialized data
> 0x1000 section alignment
> 0x200 file alignment
> 0xB000 size of image
>
> 6 sections: .data, .pdata, .rdata, .reloc, .text, .xdata
>
> .text section:
> 0x30DF virtual size
> 0x3200 size of raw data
>
> .data section:
> 0x130 virtual size
> 0x200 size of raw data
>
>
> GCC ElfConverted selected header values:
>
> Main header:
> 0x4000 size of code
> 0x1000 size of initialized data
> 0x0 size of uninitialized data
> 0x1000 section alignment
> 0x1000 file alignment
> 0x7000 size of image
>
> 3 sections: .data, .text, .reloc
>
> .text section:
> 0x4000 virtual size
> 0x4000 size of raw data
>
> .data section:
> 0x1000 virtual size
> 0x1000 size of raw data
>
> So my concern here is that ElfConvert takes a
> different view of VirtualSize, that it should be
> section aligned, whereas MSVC binaries take
> VirtualSize to be the actual size without padding.
> I think the correct thing to do would be to change
> ElfConvert to do what MSVC does (although the spec
> is vague and so I can understand the confusion).

I don’t think it really matters, but it wouldn’t hurt either. Both kinds of binaries are in the wild, so you cannot really leverage any of the choices’ advantages either way. Adjusting to MSVC’s behaviour would be right though, as you can at least properly distinguish between padding and 0-data with new binaries.

> In practice this will tend to work as is, since we
> are using SizeOfImage to allocate with, but as you
> have pointed out there are so many edge cases that
> having a difference here makes me worried we would
> see something weird with only binaries built by one toolchain.
There is plenty of room for that as-is, including that MSVC emits .rdata (and others), but GCC does not, and there is a super awkward heuristic in DxeCore to determine section permissions rather than using the PE values at face value - R-- is literally not respected. Also there had been issues with NASM sections, because PE needs .rdata, but ELF needs .rodata naming there.

> I am curious to hear your thoughts though. This is
> very easy to reproduce, build any UEFI binary with
> MSVC and GCC and compare the headers.

Yes, sorry for the confusion, this all looks as expected.

Best regards,
Marvin

>
> Thanks,
> Oliver