On Apr 8, 2021, at 9:06 AM, Marvin Häuser <mhaeuser@posteo.de> wrote:

We use the loader code in userspace anyway for fuzzing and such. I also want to build a database of all sorts of UEFI binaries some time before the merge to confirm they are all accepted (Windows / macOS / Linux bootloaders, tools like memtest, drivers like iPXE). As part of that, I'm sure we can have a userspace tool that uses the code to emit parsing information.

But as the EDK II build system is very... not so userspace friendly, I will not promise it will be very nice. :)


Marvin,

The BaseTools can easily build C command line tools that are cross platform?

Actually GenFw [1] already does a lot of PE/COFF magic, so it should be relatively easy to add a -I, —info, and dump out an overview of a PE/COFF image, and make comments on things that are not secure. It would also probably be useful to dump out information about the Debug Directory entries, His sections, etc. for general debug.

[1] https://github.com/tianocore/edk2/tree/master/BaseTools/Source/C/GenFw
/Volumes/Case/edk2-github(eng/PR-557-XcodeResourceSections)>. edksetup.sh
Loading previous configuration from /Volumes/Case/edk2-github/Conf/BuildEnv.sh
WORKSPACE: /Volumes/Case/edk2-github
EDK_TOOLS_PATH: /Volumes/Case/edk2-github/BaseTools
CONF_PATH: /Volumes/Case/edk2-github/Conf
/Volumes/Case/edk2-github(eng/PR-557-XcodeResourceSections)>GenFw -h
GenFw Version 0.2 Developer Build based on Revision: Unknown 

Usage: GenFw [options] <input_file>

Copyright (c) 2007 - 2018, Intel Corporation. All rights reserved.

Options:
  -o FileName, --outputfile FileName
                        File will be created to store the output content.
  -e EFI_FILETYPE, --efiImage EFI_FILETYPE
                        Create Efi Image. EFI_FILETYPE is one of BASE,SMM_CORE,
                        PEI_CORE, PEIM, DXE_CORE, DXE_DRIVER, UEFI_APPLICATION,
                        SEC, DXE_SAL_DRIVER, UEFI_DRIVER, DXE_RUNTIME_DRIVER,
                        DXE_SMM_DRIVER, SECURITY_CORE, COMBINED_PEIM_DRIVER,
                        MM_STANDALONE, MM_CORE_STANDALONE,
                        PIC_PEIM, RELOCATABLE_PEIM, BS_DRIVER, RT_DRIVER,
                        APPLICATION, SAL_RT_DRIVER to support all module types
                        It can only be used together with --keepexceptiontable,
                        --keepzeropending, --keepoptionalheader, -r, -o option.
                        It is a action option. If it is combined with other action options,
                        the later input action option will override the previous one.
  -c, --acpi            Create Acpi table.
                        It can't be combined with other action options
                        except for -o, -r option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  -t, --terse           Create Te Image.
                        It can only be used together with --keepexceptiontable,
                        --keepzeropending, --keepoptionalheader, -r, -o option.
                        It is a action option. If it is combined with other action options,
                        the later input action option will override the previous one.
  -u, --dump            Dump TeImage Header.
                        It can't be combined with other action options
                        except for -o, -r option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  -z, --zero            Zero the Debug Data Fields in the PE input image file.
                        It also zeros the time stamp fields.
                        This option can be used to compare the binary efi image.
                        It can't be combined with other action options
                        except for -o, -r option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  -b, --exe2bin         Convert the input EXE to the output BIN file.
                        It can't be combined with other action options
                        except for -o, -r option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  -l, --stripped        Strip off the relocation info from PE or TE image.
                        It can't be combined with other action options
                        except for -o, -r option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  -s timedate, --stamp timedate
                        timedate format is "yyyy-mm-dd 00:00:00". if timedata 
                        is set to NOW, current system time is used. The support
                        date scope is 1970-01-01 00+timezone:00:00
                        ~ 2038-01-19 03+timezone:14:07
                        The scope is adjusted according to the different zones.
                        It can't be combined with other action options
                        except for -o, -r option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  -m, --mcifile         Convert input microcode txt file to microcode bin file.
                        It can't be combined with other action options
                        except for -o option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  -j, --join            Combine multi microcode bin files to one file.
                        It can be specified with -a, -p, -o option.
                        No other options can be combined with it.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  -a NUM, --align NUM   NUM is one HEX or DEC format alignment value.
                        This option is only used together with -j option.
  -p NUM, --pad NUM     NUM is one HEX or DEC format padding value.
                        This option is only used together with -j option.
  --keepexceptiontable  Don't clear exception table.
                        This option can be used together with -e or -t.
                        It doesn't work for other options.
  --keepoptionalheader  Don't zero PE/COFF optional header fields.
                        This option can be used together with -e or -t.
                        It doesn't work for other options.
  --keepzeropending     Don't strip zero pending of .reloc.
                        This option can be used together with -e or -t.
                        It doesn't work for other options.
  -r, --replace         Overwrite the input file with the output content.
                        If more input files are specified,
                        the last input file will be as the output file.
  -g HiiPackageListGuid, --hiiguid HiiPackageListGuid
                        Guid is used to specify hii package list guid.
                        Its format is xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
                        If not specified, the first Form FormSet guid is used.
  --hiipackage          Combine all input binary hii packages into 
                        a single package list as the text resource data(RC).
                        It can't be combined with other action options
                        except for -o option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  --hiibinpackage       Combine all input binary hii packages into 
                        a single package list as the binary resource section.
                        It can't be combined with other action options
                        except for -o option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  --rc FlieName         Append a Hii resource section to the
                        last PE/COFF section. The FileName is the resource section to append
                        If FileName does not exist this operation is skipped. This feature is
                        only intended for toolchains, like XCODE, that don't suport $(RC).
                        This option can only be combined with -e
  --rebase NewAddress   Rebase image to new base address. New address 
                        is also set to the first none code section header.
                        It can't be combined with other action options
                        except for -o or -r option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  --address NewAddress  Set new address into the first none code 
                        section header of the input image.
                        It can't be combined with other action options
                        except for -o or -r option. It is a action option.
                        If it is combined with other action options, the later
                        input action option will override the previous one.
  -v, --verbose         Turn on verbose output with informational messages.
  -q, --quiet           Disable all messages except key message and fatal error
  -d, --debug level     Enable debug messages, at input debug level.
  --version             Show program's version number and exit
  -h, --help            Show this help message and exit

Thanks,

Andrew Fish

Best regards,
Marvin

On 08.04.21 16:13, Andrew (EFI) Fish wrote:
At a minimum it would be nice if we had a tool that would point out the security faults with a given PE/COFF file layout.



On Apr 8, 2021, at 4:16 AM, Laszlo Ersek <lersek@redhat.com> wrote:

On 04/06/21 12:06, Marvin Häuser wrote:
Good day Nate,

Comments are inline.

Best regards,
Marvin

On 06.04.21 11:41, Nate DeSimone wrote:
Hi Marvin,

Great to meet you and welcome back! Glad you hear you are interested!
Completing a formal verification of a PE/COFF loader is certainly
impressive. Was this done with some sort of automated theorem proving?
It would seem a rather arduous task doing an inductive proof for an
algorithm like that by hand!
I would call it "semi-automated", a great deal of intermediate goals
(preconditions, postconditions, invariants, assertions, ...) were
required to show all interesting properties. But yes, the actual proof
steps are automated by common SMT solvers. It was done using the
AstraVer Toolset and ACSL, latter basically a language to express logic
statements with C-like syntax.

I completely agree with you that getting a formally verified PE/COFF
loader into mainline is undoubtably valuable and would pay security
dividends for years to come.
I'm glad to hear that. :)

Admittedly, this is an area of computer science that I don't have a
great deal of experience with. The furthest I have gone on this topic
is writing out proofs for simple algorithms on exams in my Algorithms
class in college. Regardless you have a much better idea of what the
current status is of the work that you and Vitaly have done. I guess
my only question is do you think there is sufficient work remaining to
fill the 10 week GSoC development window?
Please don't get me wrong, but I would be surprised if the UEFI
specification changes I'd like to discuss alone would be completed
within 10 weeks, let alone implementation throughout the codebase. While
I think the plain amount of code may be a bit less than say a
MinPlatform port, the changes are much deeper and require much more
caution to avoid regressions (e.g. by invalidating undocumented
assertions). This sadly is not a matter of just replacing the underlying
library implementation or "plug-in and play" at all. It furthermore
affects many parts of the stack, the core dispatchers used for all
platforms, image emulation (EBC), UEFI userland emulation (EmuPkg), and
so on. I was rather worried the scope is too broad time-wise, but it can
be narrowed/widened as you see fit really. This is one of *the* core
components used on millions of device, and many package maintainers need
to review and validate the changes, this must really be done right the
first try. :)

Certainly we can use some of that time to perform the code reviews you
mention and write up formal ECRs for the UEFI spec changes that you
believe are needed.
I believed that was part of the workload, yes, but even without it I
think there is plenty to do.

Thank you for sending the application and alerting us to the great
work you and Vitaly have done! I'll read your paper more closely and
come back with any questions I still have.
Thank you, I will gladly explain anything unclear. Just try to not give
Laszlo too many flashbacks. :)
I haven't commented yet in this thread, as I thought my stance on this
undertaking was (or should be) obvious.

I very much welcome a replacement for the PE/COFF parser (as I consider
its security issues unfixable in an incremental manner). From my reading
of Marvin's and Vitaly's paper (draft), they have my full trust, and I'm
ready to put their upcoming code to use in ArmVirtPkg and OvmfPkg with
minimal actual code review. If fixing the pervasive security problems
around this area cannot avoid spiraling out to other core code in edk2,
such as dispatchers, and even to the PI / UEFI specs, so be it.

Regarding GSoC itself: as I stated elsewhere previously, I support
edk2's participation in GSoC, while at the same time I'm not
volunteering for mentorship at all. I'm uncertain if GSoC is the best
framework for upstreaming such a large undertaking, but if it can help,
we should use it as much as possible.

Thanks
Laszlo





With Best Regards,
Nate

-----Original Message-----
From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Marvin
Häuser
Sent: Sunday, April 4, 2021 4:02 PM
To: devel@edk2.groups.io; Laszlo Ersek <lersek@redhat.com>; Andrew Fish
<afish@apple.com>; Kinney, Michael D <michael.d.kinney@intel.com>
Subject: [edk2-devel] [GSoC proposal] Secure Image Loader

Good day everyone,

I'll keep the introduction brief because I've been around for a while
now. :) I'm
Marvin Häuser, a third-year Computer Science student from TU
Kaiserslautern,
Germany. Late last year, my colleague Vitaly from ISP RAS and me
introduced a
formally verified Image Loader for UEFI usage at ISP RAS Open[1] due
to various
defects we outlined in the corresponding paper. Thank you once again
Laszlo
for your *incredible* review work on the publication part.

I now want to make an effort to mainline it, preferably as part of
the current
Google Summer of Code event. To be clear, my internship at ISP RAS has
concluded, and while Vitaly will be available for design discussion,
he has other
priorities at the moment and the practical part will be on me. I have
previously
submitted a proposal via the GSoC website for your review.

There are many things to consider:
1. The Image Loader is a core component, and there needs to be a
significant
level of quality and security assurance.
2. Being consumed by many packages, the proposed patch set will take
a lot of
time to review and integrate.
3. During my initial exploration, I discovered defective PPIs and
protocols (e.g.
returning data with no corresponding size) originating from the UEFI
PI and
UEFI specifications. Changes need to be discussed, settled on, and
submitted to
the UEFI Forum.
4. Some UEFI APIs like the Security Architecture protocols are
inconveniently
abstract, see 5.
5. Some of the current code does not use the existing context, or
accesses it
outside of the exposed APIs. The control flow of the dispatchers may
need to be
adapted to make the context available to appropriate APIs.

But obviously there are not only unpleasant considerations:
A. The Image Loader is mostly formally verified, and only very few
changes will
be required from the last proven state. This gives a lot of trust in
its correctness
and safety.
B. All outlined defects that are of critical nature have been fixed
successfully.
C. The Image Loader has been tested with real-world code loading
real-world
OSes on thousands of machines in the past few months, including
rejecting
malformed images (configurable by PCD).
D. The new APIs will centralise everything PE, reducing code
duplication and
potentially unsafe operations.
E. Centralising and reduced parse duplication may improve overall boot
performance.
F. The code has been coverage-tested to not contain dead code.
G. The code has been fuzz-tested including sanitizers to not invoke
undefined
behaviour.
H. I already managed to identify a malformed image in OVMF with its help
(incorrectly reported section alignment of an Intel IPXE driver). A
fix will be
submitted shortly.
I. I plan to support PE section permissions, allowing for read-only data
segments when enabled.

There are likely more points for both lists, but I hope this gives a
decent
starting point for discussion. What are your thoughts on the matter?
I strongly
encourage everyone to read the section regarding defects of our
publication[2]
to better understand the motivation. The vague points above can of
course be
elaborated in due time, as you see fit.

Thank you for your time!

Best regards,
Marvin


[1] https://github.com/mhaeuser/ISPRASOpen-SecurePE
[2] https://arxiv.org/pdf/2012.05471.pdf