public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: Andrew Fish <afish@apple.com>
To: Jordan Justen <jordan.l.justen@intel.com>
Cc: Mike Kinney <michael.d.kinney@intel.com>,
	Liu Yu <pedroa.liu@outlook.com>,
	 "edk2-devel@lists.01.org" <edk2-devel@lists.01.org>,
	Laszlo Ersek <lersek@redhat.com>,
	Leif Lindholm <leif.lindholm@linaro.org>
Subject: Re: EmulatorPkg Unix Host Segmentation fault.
Date: Mon, 19 Nov 2018 16:54:03 -0800	[thread overview]
Message-ID: <07BAEF47-7784-46F4-8FDA-29C73B9C74D5@apple.com> (raw)
In-Reply-To: <154266655812.13241.5433680527994241263@jljusten-skl>



> On Nov 19, 2018, at 2:29 PM, Jordan Justen <jordan.l.justen@intel.com> wrote:
> 
> On 2018-11-19 13:22:27, Andrew Fish wrote:
>> 
>> 
>>> On Nov 19, 2018, at 11:16 AM, Jordan Justen <jordan.l.justen@intel.com> wrote:
>>> 
>>> On 2018-11-18 17:13:21, Jordan Justen wrote:
>>>> On 2018-11-18 14:37:09, Andrew Fish wrote:
>>>>> 
>>>>> 
>>>>>> On Nov 18, 2018, at 4:07 AM, Liu Yu <pedroa.liu@outlook.com> wrote:
>>>>>> 
>>>>>> sorry your  path can't fix this issue.   if this path just turn off 
>>>>>> optimization option within sec.c not global project.
>>>>>> 
>>>>>> I have tested different version GCC such as (GCC4,8, GCC5.x, GCC7.x)  
>>>>>> and all of them can duplicate this issue  (Ubuntu 16.04, 16.10,18.04 )
>>>>>> 
>>>>>> I have traced this issue on my hand.
>>>>>> 
>>>>>> you can see Dispatcher.c (MdeModulePkg/Pei/DIspatcher/) Line 792:
>>>>>> 
>>>>>> 
>>>>>> 790      if (StackOffsetPositive) {
>>>>>> 791       SecCoreData = (CONST EFI_SEC_PEI_HAND_OFF *)((UINTN)(VOID 
>>>>>> *)SecCoreData + StackOffset);
>>>>>> 792      Private = (PEI_CORE_INSTANCE *)((UINTN)(VOID *)Private + 
>>>>>> StackOffset);
>>>>>> 793     } else {
>>>>>> 794      ..........
>>>>>> 795      ..........
>>>>>> 796    }
>>>>>> 
>>>>>> 790 --792 disassembly code
>>>>>> 
>>>>>> 0x10200f2ca <PeiCheckAndSwitchStack+1030>:    test %r14b,%r14b
>>>>>> 0x10200f2cd <PeiCheckAndSwitchStack+1033>:    je 0x10200f2df 
>>>>>> <PeiCheckAndSwitchStack+1051>
>>>>>> 0x10200f2cf <PeiCheckAndSwitchStack+1035>:    mov 0x38(%rsp),%rax
>>>>>> 0x10200f2d4 <PeiCheckAndSwitchStack+1040>:    lea 0x0(%rbp,%rax,1),%r14
>>>>>> 0x10200f2d9 <PeiCheckAndSwitchStack+1045>:    lea (%rbx,%rax,1),%rbp
>>>>>> 
>>>>>> we can see Private value have been stored in %rbp  (rbp register be 
>>>>>> used as general register )   so when call 
>>>>>> TemporaryRamSupportPpi->TemporaryRamMigration()
>>>>>> 
>>>>> 
>>>>> The calling conventions define RBP as non-volatile must be preserved
>>>>> by callee. Using RBP as the frame pointer is optional.
>>>> 
>>>> This is kind of a mess. I think the definition of
>>>> EFI_PEI_TEMPORARY_RAM_SUPPORT_PPI is really to blame. It probably
>>>> should not have been spec'd as 'change stack and return'. Instead, it
>>>> should have been given a new function pointer to switch-stack and call
>>>> into.
>>> 
>>> Andrew, Mike,
>>> 
>>> I developed a PEI Core fix for this issue, which Liu Yu verified.
>>> 
>>> Unfortunately, it involves add assembly code in a key area of the PEI
>>> Core. See the top 2 commits of:
>>> 
>>> https://github.com/jljusten/edk2/commits/emulator-temp-ram <https://github.com/jljusten/edk2/commits/emulator-temp-ram>
>>> 
>>> I only wrote assembly for X64 nasm. But, I notice that neither the PEI
>>> or DXE Core modules include any assembly code. So, I want to make sure
>>> I'm heading in the right direction before working on it further.
>>> 
>> 
>> Mike,
>> 
>> I seem to remember we hit an issue like this a long time ago? Do you
>> remember the details? Maybe it was we needed to write the
>> TempRamSupport code in assembly?
> 
> It does also creep up there, which is why we also adjust ebp/rbp in
> the TemporaryRamMigration function in OVMF.
> 
> In that case, it helps prevent the stack from reverting to the old
> stack when TemporaryRamMigration is returning to it's caller. (As
> opposed to after it returns, which I think is what is happening now.)
> 

The stack is switched in both cases. The problem is the C code, for the gcc code gen, does a leave so it moves %rbp into %rsp. The %rbp in the C code was the temp stack frame on entry. So the leave instruction from the C code gen reverts to the old stack. 


>> The issue we are hitting is the gEfiTemporaryRamSupportPpiGuid
>> TemporaryRamMigration() function call does a stack switch, but it
>> returns. This causes an issue with the C calling conventions as RBP
>> is optionally a frame pointer. To quote the MSFT spec: "May be used
>> as a frame pointer; must be preserved by callee"
>> 1) It is used as a frame pointer. It looks like our existing code
>> fixes up the frame pointer to match the new location the stack got
>> moved to.
>> 2) Used as a general purpose register, and the value must be
>> preserved.
> 
> This is true, and I think it makes writing TemporaryRamMigration in C
> unsafe. Nevertheless, we've been doing that in OVMF for a while now.
> 

The OVMF form is probably more portable as it is at least following the coding standards. 

What I've observed is clang (at least the Xcode version) does not use the leave instruction as it just does math to the %rsp directly and thus does not move the %rbp into %rsp. VC++ and some flavors of gcc don't emit the frame pointer. So there is dependency on code construction.  

>> 
>>> As I mentioned below, I think PIWG could consider an new
>>> TemporarayRemSupport interface that might work better, but that also
>>> may not be worth the effort.
>> 
>> If the current API is not really portable I don;t think it is a bad
>> idea to update it.
> 
> The trickier part is that the C ABI doesn't cover exactly how the
> stack may be used after the call to TemporaryRamMigration returns. For
> example, there's no reason that the compiler couldn't have saved the
> stack pointer onto the stack (which was migrated) and somehow restore
> it to an old value from the migrated stack.
> 
> I think the only safe way to implement calling the current
> TemporaryRamMigration is by using assembly code, and then calling into
> a new function from the assembly code. Once we call into a new C
> function, there is no way C code-gen can accidentally restore the old
> stack pointer.
> 

Not having a function to call back into is the bug in the current API that we should fix. 

Ironically we have assembler code in BaseLib that does a proper stack switch. 
https://github.com/tianocore/edk2/blob/master/MdePkg/Library/BaseLib/X64/SwitchStack.S#L38


#------------------------------------------------------------------------------
# Routine Description:
#
#   Routine for switching stacks with 2 parameters
#
# Arguments:
#
#   (rcx) EntryPoint    - Entry point with new stack.
#   (rdx) Context1      - Parameter1 for entry point.
#   (r8)  Context2      - Parameter2 for entry point.
#   (r9)  NewStack      - The pointer to new stack.
#
# Returns:
#
#   None
#
#------------------------------------------------------------------------------
ASM_GLOBAL ASM_PFX(InternalSwitchStack)
ASM_PFX(InternalSwitchStack):
    pushq   %rbp
    movq    %rsp, %rbp

    mov     %rcx, %rax  // Shift registers for new call
    mov     %rdx, %rcx
    mov     %r8, %rdx
    #
    # Reserve space for register parameters (rcx, rdx, r8 & r9) on the stack,
    # in case the callee wishes to spill them.
    #
    lea     -0x20(%r9), %rsp
    pushq   $0        // stop gdb stack unwind
    jmp     *%rax     // call EntryPoint ()




>>> 
>>> Can something like this change be integrated into the PEI Core?
>> 
>> Jordan,
>> 
>> I'm not sure?
> 
> Part of my question was whether assembly code was allowed in the PEI
> Core.
> 

It seems like if we need it we should use it?

>> For the RBP == frame pointer case the frame pointer is
>> no longer valid (it is in temp RAM, not DRAM). It seems like the
>> point of SecTemporaryRamSupport() fixing up the callers RBP is for
>> the benefit of the caller. It looks to me like your fix is just
>> negating that fixup.
> 
> Actually, the rbp thing is a not important in terms of the
> implementation. Sorry, that part confuses the situation and I should
> have dropped it. (I was just trying to figure out how to get gdb to be
> able backtrace passed the assembly function, but it didn't work.)
> 

I though the original bug was RBP getting trashed? 

>From a gdb point of view the assumption is %rbp is always the valid frame pointer  (fp) for your current location. And you walk the frame via: 
next_pc = contents of fp + 8
next_fp = contents of fp 

if next_pc == 0 you stop unwinding. 

The InternalSwitchStack is an example of a context less stack switch. The push $0 jmp *%rax stops the stack unwind. The saving of %rsp to %rbp means if you take a breakpoint in this function you can unwind the OLD stack. As soon as the EntryPoint preamble saves the new %rsp to %rbp then the debugger will only see the new stack. 

In the temp RAM migration we actually copy the old stack over so you unwind as long as the temp memory is still available. If the temp RAM gets zero'ed out it will stop the stack unwind as when you point back into the temp ram stack the pc on that frame would always be zero. 

> The key part of moving it to assembly is to prevent anything from the
> C code gen from potentially being used to reset the stack back to the
> old temp-ram.
> 

Given this is a generic issue with this API and the way C works don't we really need to fix it for all supported CPU architectures?

The only thing I can think of is the Emulator code is hacking on %rbp to work around the issue with C code that is using the leave instruction in the post-amble. It would be easy enough on OVFM to fill the TempRam with 0xAFAFAFAFAFAFAFAF, or any value that will GP fault then dereferenced. But it is likely sensitive to the flavor of compiler that is used. 

Thanks,

Andrew Fish

> Thanks,
> 
> -Jordan
> 
>> So that would imply that either we could just
>> fix this in SecTemporaryRamSupport() or we have 2 code gen paths and
>> we need to know how the compiler is using RBP to "do the right
>> thing"
>> 
>> Do we have other code that supports X64 PEI? Is see OvmfPkg....
>> https://github.com/tianocore/edk2/blob/master/OvmfPkg/Sec/SecMain.c#L875
>> 
>> Looks like OvmfPkg uses SetJump()/LongJump() to change the stack. 
>> 
>>  //
>>  // Use SetJump()/LongJump() to switch to a new stack.
>>  // 
>>  if (SetJump (&JumpBuffer) == 0) {
>> #if defined (MDE_CPU_IA32)
>>    JumpBuffer.Esp = JumpBuffer.Esp + DebugAgentContext.StackMigrateOffset;
>>    JumpBuffer.Ebp = JumpBuffer.Ebp + DebugAgentContext.StackMigrateOffset;
>> #endif    
>> #if defined (MDE_CPU_X64)
>>    JumpBuffer.Rsp = JumpBuffer.Rsp + DebugAgentContext.StackMigrateOffset;
>>    JumpBuffer.Rbp = JumpBuffer.Rbp + DebugAgentContext.StackMigrateOffset;
>> #endif    
>>    LongJump (&JumpBuffer, (UINTN)-1);
>>  }
>> 
>>  SaveAndSetDebugTimerInterrupt (OldStatus);
>> 
>>  return EFI_SUCCESS;
>> }
>> 
>> But given the above code is C code RBP is going to be restored on return. This would seem to imply that the adjusting of the callers RBP is not required? So maybe try changing over Emulator to the OvmfPkg TemporaryRamMigration() algorithm? 
>> 
>> Thanks,
>> 
>> Andrew Fish
>> 
>>> 
>>> Thanks for your time,
>>> 
>>> -Jordan
>>> 
>>>> By returning with a switched stack, we don't really know what the
>>>> returning code could do with regard to the stack. For example, it
>>>> could have saved the stack value on the stack and then pop it into a
>>>> register and somehow switch the stack back to the old stack. It seems
>>>> unlikely, but I don't think anything prevents it.
>>>> 
>>>> Additionally, there is the issue of rbp/ebp being used as a frame
>>>> pointer. This can lead to some variables being used from the temp ram
>>>> location after we return from TemporaryRamMigration. (Assuming rbp/ebp
>>>> is not adjusted to point to the new stack.)
>>>> 
>>>> So, is it safe to adjust rbp? Unknown. It may not be if rbp is not
>>>> used as a frame pointer. Is it safe to *not* adjust rbp and
>>>> potentially allow the old temp ram stack to be used? Unknown.
>>>> 
>>>> Now, if TemporaryRamMigration received a new function to call into, we
>>>> can safely assume that the new stack transition would proceed as
>>>> expected without having to worry if a compiler is using rbp/ebp as a
>>>> framepointer or not.
>>>> 
>>>> Another advantage could have been that something like the BasePkg
>>>> SwitchStack function could have been used, hopefully preventing people
>>>> from trying to write error prone assembly code for
>>>> TemporaryRamMigration.
>>>> 
>>>> -Jordan
>> 



  reply	other threads:[~2018-11-20  0:55 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-18  4:51 EmulatorPkg Unix Host Segmentation fault Liu Yu
2018-11-18  9:27 ` Jordan Justen
2018-11-18 12:07   ` Liu Yu
2018-11-18 22:37     ` Andrew Fish
2018-11-19  1:13       ` Jordan Justen
2018-11-19 19:16         ` Jordan Justen
2018-11-19 21:22           ` Andrew Fish
2018-11-19 22:12             ` Laszlo Ersek
2018-11-19 23:39               ` Andrew Fish
2018-11-19 22:29             ` Jordan Justen
2018-11-20  0:54               ` Andrew Fish [this message]
2018-11-20  8:57               ` Laszlo Ersek
2019-02-16  7:29 ` Ni, Ray
2019-02-16  7:40 ` Ni, Ray
2019-02-16 20:23   ` Andrew Fish
2019-02-16  7:43 ` Ni, Ray
2019-02-16  8:05   ` Ni, Ray
2019-02-16  9:11     ` Jordan Justen
2019-02-18  2:25       ` Ni, Ray
2019-02-18  2:45         ` Jordan Justen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=07BAEF47-7784-46F4-8FDA-29C73B9C74D5@apple.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox