public inbox for devel@edk2.groups.io
From: "Marvin Häuser" <mhaeuser@posteo.de>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: Bret Barkelew <Bret.Barkelew@microsoft.com>,
	Thomas Abraham <thomas.abraham@arm.com>,
	"Ard Biesheuvel (TianoCore)" <ardb+tianocore@kernel.org>,
	"Lindholm, Leif" <leif@nuviainc.com>,
	Laszlo Ersek <lersek@redhat.com>,
	Sami Mujawar <sami.mujawar@arm.com>,
	"devel@edk2.groups.io" <devel@edk2.groups.io>, nd <nd@arm.com>
Subject: Re: ArmVirt and Self-Updating Code
Date: Fri, 23 Jul 2021 14:27:00 +0000
Message-ID: <39afb072-3e01-6946-9387-e414a8f62f8d@posteo.de>
In-Reply-To: <CAMj1kXEHRT3_O-wwMUXruW_bng2JQy=qb0DEb-5JdDttSbYfZg@mail.gmail.com>



On 23.07.21 16:09, Ard Biesheuvel wrote:
> On Fri, 23 Jul 2021 at 12:47, Marvin Häuser <mhaeuser@posteo.de> wrote:
>> On 23.07.21 12:13, Ard Biesheuvel wrote:
>>> On Fri, 23 Jul 2021 at 11:55, Marvin Häuser <mhaeuser@posteo.de> wrote:
>>>> On 22.07.21 17:14, Ard Biesheuvel wrote:
>>>>> On Thu, 22 Jul 2021 at 16:54, Bret Barkelew <Bret.Barkelew@microsoft.com> wrote:
>>>>>> Expanding audience to the full dev list…
>>>>>>
>>>>>> See below…
>>>>>>
>>>>>>
>>>>>>
>>>>>> - Bret
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Thomas Abraham
>>>>>> Sent: Wednesday, July 7, 2021 11:07 PM
>>>>>> To: Bret Barkelew; Ard Biesheuvel (TianoCore); Lindholm, Leif; Laszlo Ersek; Marvin Häuser; Sami Mujawar
>>>>>> Cc: nd
>>>>>> Subject: [EXTERNAL] RE: ArmVirt and Self-Updating Code
>>>>>>
>>>>>>
>>>>>>
>>>>>> + Sami
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Bret Barkelew <Bret.Barkelew@microsoft.com>
>>>>>> Sent: Thursday, July 8, 2021 11:05 AM
>>>>>> To: Thomas Abraham <thomas.abraham@arm.com>; Ard Biesheuvel (TianoCore) <ardb+tianocore@kernel.org>; Lindholm, Leif <leif@nuviainc.com>; Laszlo Ersek <lersek@redhat.com>; Marvin Häuser <mhaeuser@posteo.de>
>>>>>> Subject: ArmVirt and Self-Updating Code
>>>>>>
>>>>>>
>>>>>>
>>>>>> All,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Marvin asked me a question on the UEFI Talkbox Discord that’s a little beyond my ken…
>>>>>>
>>>>>>
>>>>>>
>>>>>> “There is self-relocating code in ArmVirtPkg:
>>>>>>
>>>>>> https://github.com/tianocore/edk2/blob/17143c4837393d42c484b42d1789b85b2cff1aaf/ArmVirtPkg/PrePi/PrePi.c#L133-L165
>>>>>>
>>>>>> According to comments in the ASM, it seems like this is for Linux-based RAM boot (I saw further stuff for KVM, so it makes sense I guess?). It seems unfortunate it cannot be mapped into a known address range so that self-relocation is not necessary, but that's out of my scope to understand.
>>>>>>
>>>>> "Mapping" implies that the MMU is on, but this code boots with the MMU
>>>>> off. Unlike x86, ARM does not define any physical address ranges that
>>>>> are guaranteed to be backed by DRAM, so a portable image either needs
>>>>> to be fully position independent, or carry the metadata it needs to
>>>>> relocate itself as it is invoked.
>>>> And did I understand correctly that the idea is to use "-fpie" to
>>>> 1) have all control flow instructions be position-independent (i.e.
>>>> jumps, calls, etc; ARM docs don't spell it out, but vaguely imply this
>>>> is always possible?), and
>>> The primary reason to use -fpie and PIE linking is to ensure that the
>>> resulting ELF executable contains a RELA section that describes every
>>> location in the binary where a memory address is stored that needs to
>>> be updated according to the actual placement in memory. The side
>>> effect of -fpie is that position independent global references are
>>> emitted (i.e., ADRP/ADD instructions which are relative to the program
>>> counter). However, the AArch64 compiler uses those by default anyway,
>>> so for this it is not strictly needed.
>>>
>>>> 2) emit a GOT, which ends up being converted to PE/COFF Relocations (->
>>>> self-relocation), for global data that cannot be referenced relatively?
>>>> Is there any way to know/force that no symbol in GOT is accessed up
>>>> until the end of the self-relocation process?

Do you maybe have one final comment regarding that second question, 
please? :)
Let's drop "GOT" and make it "any instruction that requires prior 
relocation to function correctly".
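
To make concrete what "prior relocation" means here, below is a minimal
sketch of the fix-up loop that self-relocation amounts to. It assumes
ELF64 RELA entries and only handles R_AARCH64_RELATIVE. The names Rela64
and apply_relative_relocs() are hypothetical and not taken from edk2, and
the separate "image" and "load_base" parameters exist only so the sketch
can be tried on a host buffer. Until such a loop has finished, every
location it patches still holds its link-time value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as Elf64_Rela; redeclared so the sketch is self-contained. */
typedef struct {
  uint64_t r_offset;  /* link-time address of the 64-bit field to patch */
  uint64_t r_info;    /* relocation type in the low 32 bits */
  int64_t  r_addend;  /* link-time address the field should point at */
} Rela64;

#define R_AARCH64_RELATIVE 1027u  /* value defined by the AArch64 ELF ABI */

/*
 * Hypothetical helper (an assumption for illustration, not edk2 code).
 *   image     - buffer holding the image contents
 *   link_base - address the image was linked at
 *   load_base - address the image actually runs from
 * Returns the number of entries that were applied.
 */
static size_t apply_relative_relocs(uint8_t *image, uint64_t link_base,
                                    uint64_t load_base,
                                    const Rela64 *rela, size_t count)
{
  size_t applied = 0;
  uint64_t delta = load_base - link_base;  /* how far the image has moved */

  assert(image != NULL && rela != NULL);
  for (size_t i = 0; i < count; i++) {
    /* Skip anything that is not a plain "rebase this pointer" entry. */
    if ((uint32_t)(rela[i].r_info & 0xffffffffu) != R_AARCH64_RELATIVE)
      continue;
    uint64_t value = (uint64_t)rela[i].r_addend + delta;
    /* memcpy avoids alignment and aliasing assumptions on the target. */
    memcpy(image + (rela[i].r_offset - link_base), &value, sizeof value);
    applied++;
  }
  return applied;
}
```

In a real self-relocating image, "image" and "load_base" would be the same
address, and the table bounds would come from linker-provided symbols.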

>>> It is not really a GOT. Actually, a GOT is undesirable, as it forces
>>> global variables to be referenced via an absolute address, even when a
>>> relative reference could be used.
>> Hmm, the GCC docs say a GOT is used for "all constant addresses" (I took
>> it as "absolute"?), it is kind of vague. I understood it this way:
>> 1) no-pie emits relocations that can target the .text and .data sections
>> for instructions that embed and variables that hold an absolute address
>> (I thought this was RELA?)
>> 2) pie emits a GOT such that there are no relocations as described in
>> 1), because all absolute addresses are indirected by GOT (just GOT
>> references are relocated)
>>
> Correct. And this works really well for shared libraries, where all
> text and data sections can be shared between processes, as they will
> not be modified by the loader. All locations targeted by relocations
> will be nicely lumped together in the GOT.
>
> However, for bare metal style programs, there is no sharing, and there
> is no advantage to lumping anything together. It is much better to use
> relative references where possible, and simply apply relocations
> wherever needed across the text and data sections.
>
>> If I understood the process right, but the term (GOT) is wrong, sorry,
>> that is what I gathered from the docs. :)
>> I have an x86 + PE background, so ARM + ELF is a bit of a learning curve...
>>
> The GOT is a special data structure used for implicit variable
> accesses, i.e., global vars used in the code. Statically initialized
> pointer variables are the other category, which are not code, and for
> which the same considerations do not apply, given that the right value
> simply needs to be stored in the variable before the program starts.
>
>>> For instance, a statically initialized pointer always carries an
>>> absolute address, and so it always needs an entry in the RELA table
>>>
>>> E.g.,
>>>
>>> int foo = 10; // external linkage
>>> static int *bar = &foo;
>>>
>>> In this case, there is no way to use relative addressing because the
>>> address of foo is taken at build time.
>>>
>>> However, if bar were something like
>>>
>>> static int *bar() { return &foo; }
>>>
>>> the address is only taken at runtime, and the compiler can use a
>>> relative reference instead, and no RELA entry is needed. With a GOT,
>>> we force the compiler to allocate a variable that holds the absolute
>>> address, which we would prefer to avoid.
>> And this is not forced by whatever table -fpie uses, as per my
>> understanding above?
>>
> The selection of 'code model' as it is called is controlled by GCC's
> -mcmodel= argument, which defaults to 'small' on AArch64, regardless
> of whether you use PIC/PIE or not.

Aha, makes sense, thanks!
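
For reference, the two cases from the quoted example can be put side by
side in one compilable file. This is only a sketch: the names bar_ptr,
bar_fn and both_agree are made up here so that both variants can coexist,
and whether a relocation entry is actually emitted depends on the
toolchain and flags in use.

```c
#include <assert.h>

int foo = 10;  /* external linkage, as in the quoted example */

/*
 * The address of foo is stored in initialized data at build time, so a
 * position-independent image needs a relocation entry for this field.
 */
static int *bar_ptr = &foo;

/*
 * The address of foo is only formed at run time, so the compiler is free
 * to use a PC-relative reference (ADRP/ADD on AArch64) with no entry.
 */
static int *bar_fn(void) { return &foo; }

/* Hypothetical check that both variants resolve to the same object. */
static int both_agree(void)
{
  assert(bar_ptr != 0);
  return bar_ptr == bar_fn() && *bar_ptr == 10;
}
```

Building this with -fpie -pie and listing the relocations with
"readelf -r" should show which of the two ended up with an entry.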

Best regards,
Marvin

>>>>>> “Now, StandaloneMmPkg has similar (self-)relocation code too: https://github.com/tianocore/edk2/blob/17143c4837393d42c484b42d1789b85b2cff1aaf/StandaloneMmPkg/Library/StandaloneMmCoreEntryPoint/AArch64/StandaloneMmCoreEntryPoint.c#L379-L386
>>>>>>
>>>>>> Because I cannot find such code elsewhere, I assume it must be for the same ARM virtualised environment as above.
>>>>> No.
>>>>>
>>>>>> The binary it applies the Relocations to is documented to be the Standalone MM core, but in fact SecCore is located:
>>>>>>
>>>>>> https://github.com/tianocore/edk2/blob/17143c4837393d42c484b42d1789b85b2cff1aaf/StandaloneMmPkg/Library/StandaloneMmCoreEntryPoint/AArch64/SetPermissions.c#L131-L158
>>>> As per your comments below, I think SecCore should not be located here.
>>>> Is the Standalone MM core of *type* SecCore in the FFS (without *being*
>>>> SecCore)? This confused me the most.
>>>>
>>> If the FFS SecCore section type is used here, it does not mean that
>>> the image is a SEC image in the strict PI sense.
>>>
>>> Perhaps we were just too lazy to add a new type to the FFS spec?
>> That is what I meant to imply with the middle question (well, not
>> necessarily "lazy", for ARM there simply seems to not be any reason to
>> distinguish if the environments are fully separate), just wanted to make
>> sure I understand what the code does before modifying it.
>>
>> Thank you again!
>>
>> Best regards,
>> Marvin
>>
>>>>>> “This yields the following questions to me:
>>>>>>
>>>>>> 1) What even invokes Standalone MM on ARM? It is documented as being spawned during SEC, but I could not find any actual invocation.
>>>>>>
>>>>> It is not spawned by the normal world code that runs UEFI. It is a
>>>>> secure world component that runs in a completely different execution
>>>>> context (TrustZone). The code does run with the MMU enabled from the
>>>>> start, but running from an a priori fixed offset was considered to be
>>>>> a security hazard, so we added self relocation support.
>>>>>
>>>>> The alternative would have been to add metadata to the StMmCore
>>>>> component that can be interpreted by the secure world component that
>>>>> loads it, but this would go beyond any existing specs, and make
>>>>> portability more problematic.
>>>>>
>>>>>> 2) Why does Standalone MM (self-)relocation locate SecCore? Should it not already have been relocated with the code from ArmPlatformPkg? Is Standalone MM embedded into ARM SecCore?
>>>>>>
>>>>> No and no. Standalone MM has nothing to do with the code that runs as
>>>>> part of UEFI itself. ArmPlatformPkg is completely separate from
>>>>> StandaloneMmPkg.
>>>>>
>>>>>> 3) Why is SecCore the only module relocated? Are all others guaranteed to be "properly" loaded?
>>>>>>
>>>>> SecCore contains a PE/COFF loader, so all subsequent modules are
>>>>> loaded normally. This is similar to the ArmVirtQemuKernel
>>>>> self-relocating SEC module, which only relocates itself in this
>>>>> manner, and relies on standard PE/COFF metadata for loading other
>>>>> modules.
>>>> Interesting... this definitely is vastly different from the x86 side of
>>>> things. I think most things became very clear. Thanks a lot!
>>>>
>>>>>> 4) Is there maybe some high-level documentation about the ARM boot flow? It seems to differ vastly from the x86 routes.”
>>>>>>
>>>>> trustedfirmware.org may have some useful documentation.
>>>> I'll check it some time, hopefully this weekend. Thanks!
>>>>
>>> My pleasure.



