public inbox for devel@edk2.groups.io
* [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements
@ 2016-08-30 14:27 Ard Biesheuvel
  2016-08-30 14:27 ` [PATCH v4 1/4] MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file Ard Biesheuvel
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:27 UTC (permalink / raw)
  To: edk2-devel, leif.lindholm; +Cc: Ard Biesheuvel

This is v4 of my proposed changes to the AARCH64 implementation of EbcDxe
contributed by Jeff Brasen, which has recently been merged into Tianocore.

Changes since v3:
- fix typo in comment (#1)
- clarify comments around computed goto in EBC to native thunk, and make sure
  the jump target is 32-bit aligned (#3)
- fix comment and constify Args9_16[] in EbcInterpret() prototype (#4)
- add Leif's R-b (#1, #2, #4)

Ard Biesheuvel (4):
  MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file
  MdeModulePkg/EbcDxe AARCH64: use a fixed size thunk structure
  MdeModulePkg/EbcDxe AARCH64: use tail call for EBC to native thunk
  MdeModulePkg/EbcDxe AARCH64: simplify interpreter entry point thunks

 MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 285 +++++++++++---------
 MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c  | 193 ++++---------
 2 files changed, 210 insertions(+), 268 deletions(-)

-- 
2.7.4




* [PATCH v4 1/4] MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file
  2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
@ 2016-08-30 14:27 ` Ard Biesheuvel
  2016-08-30 14:27 ` [PATCH v4 2/4] MdeModulePkg/EbcDxe AARCH64: use a fixed size thunk structure Ard Biesheuvel
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:27 UTC (permalink / raw)
  To: edk2-devel, leif.lindholm; +Cc: Ard Biesheuvel

Change to consistent // style comments. Also, remove bogus global
definitions for external functions, and move the real exports to
the top of the file.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
---
 MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 111 +++++++++-----------
 1 file changed, 52 insertions(+), 59 deletions(-)

diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
index e858227586a8..17f379248a62 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
@@ -1,40 +1,35 @@
-#/** @file
-#
-#    This code provides low level routines that support the Virtual Machine
-#   for option ROMs.
-#
-#  Copyright (c) 2015, The Linux Foundation. All rights reserved.
-#  Copyright (c) 2007 - 2014, Intel Corporation. All rights reserved.<BR>
-#  This program and the accompanying materials
-#  are licensed and made available under the terms and conditions of the BSD License
-#  which accompanies this distribution.  The full text of the license may be found at
-#  http://opensource.org/licenses/bsd-license.php
-#
-#  THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
-#  WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
-#
-#**/
-
-#---------------------------------------------------------------------------
-# Equate files needed.
-#---------------------------------------------------------------------------
-
-ASM_GLOBAL ASM_PFX(CopyMem);
-ASM_GLOBAL ASM_PFX(EbcInterpret);
-ASM_GLOBAL ASM_PFX(ExecuteEbcImageEntryPoint);
-
-#****************************************************************************
-# EbcLLCALLEX
-#
-# This function is called to execute an EBC CALLEX instruction.
-# This instruction requires that we thunk out to external native
-# code. For AArch64, we copy the VM stack into the main stack and then pop
-# the first 8 arguments off according to the AArch64 Procedure Call Standard
-# On return, we restore the stack pointer to its original location.
-#
-#****************************************************************************
-# UINTN EbcLLCALLEXNative(UINTN FuncAddr, UINTN NewStackPointer, VOID *FramePtr)
-ASM_GLOBAL ASM_PFX(EbcLLCALLEXNative);
+///** @file
+//
+//  This code provides low level routines that support the Virtual Machine
+//  for option ROMs.
+//
+//  Copyright (c) 2015, The Linux Foundation. All rights reserved.
+//  Copyright (c) 2007 - 2014, Intel Corporation. All rights reserved.<BR>
+//  This program and the accompanying materials
+//  are licensed and made available under the terms and conditions of the BSD License
+//  which accompanies this distribution.  The full text of the license may be found at
+//  http://opensource.org/licenses/bsd-license.php
+//
+//  THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
+//  WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
+//
+//**/
+
+ASM_GLOBAL ASM_PFX(EbcLLCALLEXNative)
+ASM_GLOBAL ASM_PFX(EbcLLEbcInterpret)
+ASM_GLOBAL ASM_PFX(EbcLLExecuteEbcImageEntryPoint)
+
+//****************************************************************************
+// EbcLLCALLEX
+//
+// This function is called to execute an EBC CALLEX instruction.
+// This instruction requires that we thunk out to external native
+// code. For AArch64, we copy the VM stack into the main stack and then pop
+// the first 8 arguments off according to the AArch64 Procedure Call Standard
+// On return, we restore the stack pointer to its original location.
+//
+//****************************************************************************
+// UINTN EbcLLCALLEXNative(UINTN FuncAddr, UINTN NewStackPointer, VOID *FramePtr)
 ASM_PFX(EbcLLCALLEXNative):
       stp  x19, x20, [sp, #-16]!
       stp  x29, x30, [sp, #-16]!
@@ -61,16 +56,15 @@ ASM_PFX(EbcLLCALLEXNative):
 
       ret
 
-#****************************************************************************
-# EbcLLEbcInterpret
-#
-# This function is called by the thunk code to handle an Native to EBC call
-# This can handle up to 16 arguments (1-8 on in x0-x7, 9-16 are on the stack)
-# x9 contains the Entry point that will be the first argument when
-# EBCInterpret is called.
-#
-#****************************************************************************
-ASM_GLOBAL ASM_PFX(EbcLLEbcInterpret);
+//****************************************************************************
+// EbcLLEbcInterpret
+//
+// This function is called by the thunk code to handle an Native to EBC call
+// This can handle up to 16 arguments (1-8 on in x0-x7, 9-16 are on the stack)
+// x9 contains the Entry point that will be the first argument when
+// EBCInterpret is called.
+//
+//****************************************************************************
 ASM_PFX(EbcLLEbcInterpret):
     stp  x29, x30, [sp, #-16]!
 
@@ -105,7 +99,7 @@ ASM_PFX(EbcLLEbcInterpret):
     mov x1, x0
     mov x0, x9
 
-    # call C-code
+    // call C-code
     bl ASM_PFX(EbcInterpret)
     add sp, sp, #80
 
@@ -113,23 +107,22 @@ ASM_PFX(EbcLLEbcInterpret):
 
     ret
 
-#****************************************************************************
-# EbcLLExecuteEbcImageEntryPoint
-#
-# This function is called by the thunk code to handle the image entry point
-# x9 contains the Entry point that will be the first argument when
-# ExecuteEbcImageEntryPoint is called.
-#
-#****************************************************************************
-ASM_GLOBAL ASM_PFX(EbcLLExecuteEbcImageEntryPoint);
+//****************************************************************************
+// EbcLLExecuteEbcImageEntryPoint
+//
+// This function is called by the thunk code to handle the image entry point
+// x9 contains the Entry point that will be the first argument when
+// ExecuteEbcImageEntryPoint is called.
+//
+//****************************************************************************
 ASM_PFX(EbcLLExecuteEbcImageEntryPoint):
     stp  x29, x30, [sp, #-16]!
-    # build new paramater calling convention
+    // build new parameter calling convention
     mov  x2, x1
     mov  x1, x0
     mov  x0, x9
 
-    # call C-code
+    // call C-code
     bl ASM_PFX(ExecuteEbcImageEntryPoint)
     ldp  x29, x30, [sp], #16
     ret
-- 
2.7.4




* [PATCH v4 2/4] MdeModulePkg/EbcDxe AARCH64: use a fixed size thunk structure
  2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
  2016-08-30 14:27 ` [PATCH v4 1/4] MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file Ard Biesheuvel
@ 2016-08-30 14:27 ` Ard Biesheuvel
  2016-08-30 14:27 ` [PATCH v4 3/4] MdeModulePkg/EbcDxe AARCH64: use tail call for EBC to native thunk Ard Biesheuvel
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:27 UTC (permalink / raw)
  To: edk2-devel, leif.lindholm; +Cc: Ard Biesheuvel

The thunk generation is needlessly complex, given that it attempts to
deal with variable length instructions, which don't exist on AArch64.

So replace it with a simple template coded in assembler, with a matching
struct definition in C. That way, we can create and manipulate the thunks
easily without looping over the instructions looking for 'magic' numbers.

Also, use x16 rather than x9, since it is the architectural register to
use for thunks/veneers.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
---
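Note for reviewers: a minimal sketch, in portable C, of how the fixed size
layout lets us patch and recognise a thunk without scanning for magic
numbers. ThunkBuffer, kTemplate, thunk_init() and is_ebc_thunk() are
hypothetical stand-ins for the code in this patch, and the template words
are placeholders rather than real A64 encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for EBC_INSTRUCTION_BUFFER, using plain C types. */
#pragma pack(1)
typedef struct {
  uint32_t Instr[3];         /* adr / ldp / br in the real template */
  uint32_t Magic;            /* hlt #0xEBC in the real template */
  uint64_t EbcEntryPoint;    /* patched per thunk */
  uint64_t EbcLlEntryPoint;  /* patched per thunk */
} ThunkBuffer;
#pragma pack()

/* Placeholder words only: these are NOT the real instruction encodings. */
static const ThunkBuffer kTemplate = {
  { 0x11111111u, 0x22222222u, 0x33333333u }, 0x00000EBCu, 0, 0
};

/* Everything before the two patched pointers is identical in every thunk. */
#define THUNK_HEADER_SIZE offsetof (ThunkBuffer, EbcEntryPoint)

/* Copy the template, then patch the two pointers by field name. */
static void thunk_init (ThunkBuffer *t, uint64_t ebc_entry, uint64_t ll_entry)
{
  memcpy (t, &kTemplate, sizeof *t);
  t->EbcEntryPoint   = ebc_entry;
  t->EbcLlEntryPoint = ll_entry;
}

/* A callee is a thunk to EBC if its fixed header matches the template. */
static int is_ebc_thunk (const void *func)
{
  return memcmp (func, &kTemplate, THUNK_HEADER_SIZE) == 0;
}
```

Comparing only the first offsetof (ThunkBuffer, EbcEntryPoint) bytes covers
the same 16 bytes as the CompareMem() call in EbcLLCALLEX(), which uses
sizeof (EBC_INSTRUCTION_BUFFER) - 2 * sizeof (UINT64).
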
 MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S |  32 ++++-
 MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c  | 133 +++++---------------
 2 files changed, 58 insertions(+), 107 deletions(-)

diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
index 17f379248a62..b4b8531f1a01 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
@@ -3,8 +3,10 @@
 //  This code provides low level routines that support the Virtual Machine
 //  for option ROMs.
 //
-//  Copyright (c) 2015, The Linux Foundation. All rights reserved.
+//  Copyright (c) 2016, Linaro, Ltd. All rights reserved.<BR>
+//  Copyright (c) 2015, The Linux Foundation. All rights reserved.<BR>
 //  Copyright (c) 2007 - 2014, Intel Corporation. All rights reserved.<BR>
+//
 //  This program and the accompanying materials
 //  are licensed and made available under the terms and conditions of the BSD License
 //  which accompanies this distribution.  The full text of the license may be found at
@@ -19,6 +21,8 @@ ASM_GLOBAL ASM_PFX(EbcLLCALLEXNative)
 ASM_GLOBAL ASM_PFX(EbcLLEbcInterpret)
 ASM_GLOBAL ASM_PFX(EbcLLExecuteEbcImageEntryPoint)
 
+ASM_GLOBAL ASM_PFX(mEbcInstructionBufferTemplate)
+
 //****************************************************************************
 // EbcLLCALLEX
 //
@@ -61,7 +65,7 @@ ASM_PFX(EbcLLCALLEXNative):
 //
 // This function is called by the thunk code to handle an Native to EBC call
 // This can handle up to 16 arguments (1-8 on in x0-x7, 9-16 are on the stack)
-// x9 contains the Entry point that will be the first argument when
+// x16 contains the Entry point that will be the first argument when
 // EBCInterpret is called.
 //
 //****************************************************************************
@@ -97,7 +101,7 @@ ASM_PFX(EbcLLEbcInterpret):
     mov x3, x2
     mov x2, x1
     mov x1, x0
-    mov x0, x9
+    mov x0, x16
 
     // call C-code
     bl ASM_PFX(EbcInterpret)
@@ -111,7 +115,7 @@ ASM_PFX(EbcLLEbcInterpret):
 // EbcLLExecuteEbcImageEntryPoint
 //
 // This function is called by the thunk code to handle the image entry point
-// x9 contains the Entry point that will be the first argument when
+// x16 contains the Entry point that will be the third argument when
 // ExecuteEbcImageEntryPoint is called.
 //
 //****************************************************************************
@@ -120,9 +124,27 @@ ASM_PFX(EbcLLExecuteEbcImageEntryPoint):
     // build new parameter calling convention
     mov  x2, x1
     mov  x1, x0
-    mov  x0, x9
+    mov  x0, x16
 
     // call C-code
     bl ASM_PFX(ExecuteEbcImageEntryPoint)
     ldp  x29, x30, [sp], #16
     ret
+
+//****************************************************************************
+// mEbcInstructionBufferTemplate
+//****************************************************************************
+    .section    ".rodata", "a"
+    .align      3
+ASM_PFX(mEbcInstructionBufferTemplate):
+    adr     x17, 0f
+    ldp     x16, x17, [x17]
+    br      x17
+
+    //
+    // Add a magic code here to help the VM recognize the thunk.
+    //
+    hlt     #0xEBC
+
+0:  .quad   0   // EBC_ENTRYPOINT_SIGNATURE
+    .quad   0   // EBC_LL_EBC_ENTRYPOINT_SIGNATURE
diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
index 23261a070143..a5f21f400274 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
@@ -2,8 +2,10 @@
   This module contains EBC support routines that are customized based on
   the target AArch64 processor.
 
-Copyright (c) 2015, The Linux Foundation. All rights reserved.
+Copyright (c) 2016, Linaro, Ltd. All rights reserved.<BR>
+Copyright (c) 2015, The Linux Foundation. All rights reserved.<BR>
 Copyright (c) 2006 - 2014, Intel Corporation. All rights reserved.<BR>
+
 This program and the accompanying materials
 are licensed and made available under the terms and conditions of the BSD License
 which accompanies this distribution.  The full text of the license may be found at
@@ -22,47 +24,16 @@ WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
 //
 #define STACK_REMAIN_SIZE (1024 * 4)
 
-//
-// This is instruction buffer used to create EBC thunk
-//
-#define EBC_MAGIC_SIGNATURE                0xCA112EBCCA112EBCull
-#define EBC_ENTRYPOINT_SIGNATURE           0xAFAFAFAFAFAFAFAFull
-#define EBC_LL_EBC_ENTRYPOINT_SIGNATURE    0xFAFAFAFAFAFAFAFAull
-UINT8  mInstructionBufferTemplate[] = {
-  0x03,  0x00, 0x00, 0x14, //b pc+16
-  //
-  // Add a magic code here to help the VM recognize the thunk..
-  //
-    (UINT8)(EBC_MAGIC_SIGNATURE & 0xFF),
-    (UINT8)((EBC_MAGIC_SIGNATURE >> 8) & 0xFF),
-    (UINT8)((EBC_MAGIC_SIGNATURE >> 16) & 0xFF),
-    (UINT8)((EBC_MAGIC_SIGNATURE >> 24) & 0xFF),
-    (UINT8)((EBC_MAGIC_SIGNATURE >> 32) & 0xFF),
-    (UINT8)((EBC_MAGIC_SIGNATURE >> 40) & 0xFF),
-    (UINT8)((EBC_MAGIC_SIGNATURE >> 48) & 0xFF),
-    (UINT8)((EBC_MAGIC_SIGNATURE >> 56) & 0xFF),
-  0x69, 0x00, 0x00, 0x58, //ldr x9, #32
-  0x8A, 0x00, 0x00, 0x58, //ldr x10, #40
-  0x05, 0x00, 0x00, 0x14, //b pc+32
-    (UINT8)(EBC_ENTRYPOINT_SIGNATURE & 0xFF),
-    (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 8) & 0xFF),
-    (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 16) & 0xFF),
-    (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 24) & 0xFF),
-    (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 32) & 0xFF),
-    (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 40) & 0xFF),
-    (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 48) & 0xFF),
-    (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 56) & 0xFF),
-    (UINT8)(EBC_LL_EBC_ENTRYPOINT_SIGNATURE & 0xFF),
-    (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 8) & 0xFF),
-    (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 16) & 0xFF),
-    (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 24) & 0xFF),
-    (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 32) & 0xFF),
-    (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 40) & 0xFF),
-    (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 48) & 0xFF),
-    (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 56) & 0xFF),
-  0x40, 0x01, 0x1F, 0xD6 //br x10
-
-};
+#pragma pack(1)
+typedef struct {
+  UINT32    Instr[3];
+  UINT32    Magic;
+  UINT64    EbcEntryPoint;
+  UINT64    EbcLlEntryPoint;
+} EBC_INSTRUCTION_BUFFER;
+#pragma pack()
+
+extern CONST EBC_INSTRUCTION_BUFFER       mEbcInstructionBufferTemplate;
 
 /**
   Begin executing an EBC image.
@@ -414,10 +385,7 @@ EbcCreateThunks (
   IN  UINT32              Flags
   )
 {
-  UINT8       *Ptr;
-  UINT8       *ThunkBase;
-  UINT32      Index;
-  INT32       ThunkSize;
+  EBC_INSTRUCTION_BUFFER       *InstructionBuffer;
 
   //
   // Check alignment of pointer to EBC code
@@ -426,51 +394,38 @@ EbcCreateThunks (
     return EFI_INVALID_PARAMETER;
   }
 
-  ThunkSize = sizeof(mInstructionBufferTemplate);
-
-  Ptr = AllocatePool (sizeof(mInstructionBufferTemplate));
-
-  if (Ptr == NULL) {
+  InstructionBuffer = AllocatePool (sizeof (EBC_INSTRUCTION_BUFFER));
+  if (InstructionBuffer == NULL) {
     return EFI_OUT_OF_RESOURCES;
   }
-  //
-  //  Print(L"Allocate TH: 0x%X\n", (UINT32)Ptr);
-  //
-  // Save the start address so we can add a pointer to it to a list later.
-  //
-  ThunkBase = Ptr;
 
   //
   // Give them the address of our buffer we're going to fix up
   //
-  *Thunk = (VOID *) Ptr;
+  *Thunk = InstructionBuffer;
 
   //
   // Copy whole thunk instruction buffer template
   //
-  CopyMem (Ptr, mInstructionBufferTemplate, sizeof(mInstructionBufferTemplate));
+  CopyMem (InstructionBuffer, &mEbcInstructionBufferTemplate,
+    sizeof (EBC_INSTRUCTION_BUFFER));
 
   //
   // Patch EbcEntryPoint and EbcLLEbcInterpret
   //
-  for (Index = 0; Index < sizeof(mInstructionBufferTemplate) - sizeof(UINTN); Index++) {
-    if (*(UINTN *)&Ptr[Index] == EBC_ENTRYPOINT_SIGNATURE) {
-      *(UINTN *)&Ptr[Index] = (UINTN)EbcEntryPoint;
-    }
-    if (*(UINTN *)&Ptr[Index] == EBC_LL_EBC_ENTRYPOINT_SIGNATURE) {
-      if ((Flags & FLAG_THUNK_ENTRY_POINT) != 0) {
-        *(UINTN *)&Ptr[Index] = (UINTN)EbcLLExecuteEbcImageEntryPoint;
-      } else {
-        *(UINTN *)&Ptr[Index] = (UINTN)EbcLLEbcInterpret;
-      }
-    }
+  InstructionBuffer->EbcEntryPoint = (UINT64)EbcEntryPoint;
+  if ((Flags & FLAG_THUNK_ENTRY_POINT) != 0) {
+    InstructionBuffer->EbcLlEntryPoint = (UINT64)EbcLLExecuteEbcImageEntryPoint;
+  } else {
+    InstructionBuffer->EbcLlEntryPoint = (UINT64)EbcLLEbcInterpret;
   }
 
   //
   // Add the thunk to the list for this image. Do this last since the add
   // function flushes the cache for us.
   //
-  EbcAddImageThunk (ImageHandle, (VOID *) ThunkBase, ThunkSize);
+  EbcAddImageThunk (ImageHandle, InstructionBuffer,
+    sizeof (EBC_INSTRUCTION_BUFFER));
 
   return EFI_SUCCESS;
 }
@@ -500,40 +455,15 @@ EbcLLCALLEX (
   IN UINT8        Size
   )
 {
-  UINTN    IsThunk;
-  UINTN    TargetEbcAddr;
-  UINT8    InstructionBuffer[sizeof(mInstructionBufferTemplate)];
-  UINTN    Index;
-  UINTN    IndexOfEbcEntrypoint;
-
-  IsThunk       = 1;
-  TargetEbcAddr = 0;
-  IndexOfEbcEntrypoint = 0;
+  CONST EBC_INSTRUCTION_BUFFER *InstructionBuffer;
 
   //
   // Processor specific code to check whether the callee is a thunk to EBC.
   //
-  CopyMem (InstructionBuffer, (VOID *)FuncAddr, sizeof(InstructionBuffer));
-  //
-  // Fill the signature according to mInstructionBufferTemplate
-  //
-  for (Index = 0; Index < sizeof(mInstructionBufferTemplate) - sizeof(UINTN); Index++) {
-    if (*(UINTN *)&mInstructionBufferTemplate[Index] == EBC_ENTRYPOINT_SIGNATURE) {
-      *(UINTN *)&InstructionBuffer[Index] = EBC_ENTRYPOINT_SIGNATURE;
-      IndexOfEbcEntrypoint = Index;
-    }
-    if (*(UINTN *)&mInstructionBufferTemplate[Index] == EBC_LL_EBC_ENTRYPOINT_SIGNATURE) {
-      *(UINTN *)&InstructionBuffer[Index] = EBC_LL_EBC_ENTRYPOINT_SIGNATURE;
-    }
-  }
-  //
-  // Check if we need thunk to native
-  //
-  if (CompareMem (InstructionBuffer, mInstructionBufferTemplate, sizeof(mInstructionBufferTemplate)) != 0) {
-    IsThunk = 0;
-  }
+  InstructionBuffer = (EBC_INSTRUCTION_BUFFER *)FuncAddr;
 
-  if (IsThunk == 1){
+  if (CompareMem (InstructionBuffer, &mEbcInstructionBufferTemplate,
+        sizeof(EBC_INSTRUCTION_BUFFER) - 2 * sizeof (UINT64)) == 0) {
     //
     // The callee is a thunk to EBC, adjust the stack pointer down 16 bytes and
     // put our return address and frame pointer on the VM stack.
@@ -545,8 +475,7 @@ EbcLLCALLEX (
     VmPtr->Gpr[0] -= 8;
     VmWriteMem64 (VmPtr, (UINTN) VmPtr->Gpr[0], (UINT64) (UINTN) (VmPtr->Ip + Size));
 
-    CopyMem (&TargetEbcAddr, (UINT8 *)FuncAddr + IndexOfEbcEntrypoint, sizeof(UINTN));
-    VmPtr->Ip = (VMIP) (UINTN) TargetEbcAddr;
+    VmPtr->Ip = (VMIP) InstructionBuffer->EbcEntryPoint;
   } else {
     //
     // The callee is not a thunk to EBC, call native code,
-- 
2.7.4




* [PATCH v4 3/4] MdeModulePkg/EbcDxe AARCH64: use tail call for EBC to native thunk
  2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
  2016-08-30 14:27 ` [PATCH v4 1/4] MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file Ard Biesheuvel
  2016-08-30 14:27 ` [PATCH v4 2/4] MdeModulePkg/EbcDxe AARCH64: use a fixed size thunk structure Ard Biesheuvel
@ 2016-08-30 14:27 ` Ard Biesheuvel
  2016-08-30 18:52   ` Leif Lindholm
  2016-08-30 14:27 ` [PATCH v4 4/4] MdeModulePkg/EbcDxe AARCH64: simplify interpreter entry point thunks Ard Biesheuvel
  2016-08-30 14:34 ` [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
  4 siblings, 1 reply; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:27 UTC (permalink / raw)
  To: edk2-devel, leif.lindholm; +Cc: Ard Biesheuvel

Instead of pessimistically copying at least 64 bytes from the VM stack
to the native stack, and popping off the register arguments again
before doing the native call, try to avoid touching the stack completely
if the VM stack frame is <= 64 bytes. Also, if the stack frame does exceed
64 bytes, there is no need to copy the first 64 bytes, since we are passing
those in registers anyway.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
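Note for reviewers: a rough C model of the arithmetic behind the two paths
in EbcLLCALLEXNative. It only computes what the assembly would do; it does
not perform the call. CallPlan and plan_native_call() are hypothetical
names, and the sketch assumes FramePtr >= NewStackPointer and, on the
slow path, a frame size that is a multiple of 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what the thunk does for a given VM frame. */
typedef struct {
  int    tail_call;           /* 1 if the native stack is left untouched */
  size_t reg_args;            /* argument registers loaded from the frame */
  size_t branch_back_bytes;   /* distance of the computed jump from 'br x8' */
  size_t native_stack_bytes;  /* stacked args copied, padded to 16 bytes */
} CallPlan;

static CallPlan plan_native_call (uintptr_t new_sp, uintptr_t frame_ptr)
{
  CallPlan plan = { 0, 0, 0, 0 };
  size_t   len  = (size_t)(frame_ptr - new_sp);   /* VM frame size */

  if (len <= 64) {
    /* At most 8 register arguments: tail call, no stack traffic. */
    size_t aligned = len & ~(size_t)7;            /* bic x3, x3, #7 */

    plan.tail_call         = 1;
    plan.reg_args          = aligned / 8;         /* one 'ldr' per 8 bytes */
    plan.branch_back_bytes = aligned / 2;         /* 4 bytes per 'ldr' */
  } else {
    /* First 64 bytes go in x0-x7; only the remainder is copied. */
    plan.reg_args           = 8;
    plan.native_stack_bytes = ((len - 64) + 15) & ~(size_t)15;
  }
  return plan;
}
```

For example, a 24 byte frame loads x0-x2 by jumping 12 bytes back from the
final 'br', while an 88 byte frame copies 24 bytes of stacked arguments
into 32 bytes of native stack so that sp stays 16 byte aligned.
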
 MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 85 +++++++++++++++-----
 1 file changed, 65 insertions(+), 20 deletions(-)

diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
index b4b8531f1a01..34794c06a644 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
@@ -35,30 +35,75 @@ ASM_GLOBAL ASM_PFX(mEbcInstructionBufferTemplate)
 //****************************************************************************
 // UINTN EbcLLCALLEXNative(UINTN FuncAddr, UINTN NewStackPointer, VOID *FramePtr)
 ASM_PFX(EbcLLCALLEXNative):
-      stp  x19, x20, [sp, #-16]!
-      stp  x29, x30, [sp, #-16]!
+    mov     x8, x0                 // Preserve x0
+    mov     x9, x1                 // Preserve x1
 
-      mov  x19, x0
-      mov  x20, sp
-      sub  x2, x2, x1   // Length = NewStackPointer-FramePtr
-      sub  sp, sp, x2
-      sub  sp, sp, #64  // Make sure there is room for at least 8 args in the new stack
-      mov  x0, sp
-
-      bl   CopyMem      // Sp, NewStackPointer, Length
-
-      ldp  x0, x1, [sp], #16
-      ldp  x2, x3, [sp], #16
-      ldp  x4, x5, [sp], #16
-      ldp  x6, x7, [sp], #16
+    //
+    // If the EBC stack frame is smaller than or equal to 64 bytes, we know there
+    // are no stacked arguments #9 and beyond that we need to copy to the native
+    // stack. In this case, we can perform a tail call which is much more
+    // efficient, since there is no need to touch the native stack at all.
+    //
+    sub     x3, x2, x1              // Length = FramePtr - NewStackPointer
+    cmp     x3, #64
+    b.gt    1f
 
-      blr  x19
+    //
+    // While probably harmless in practice, we should not access the VM stack
+    // outside of the interval [NewStackPointer, FramePtr), which means we
+    // should not blindly fill all 8 argument registers with VM stack data.
+    // So instead, calculate how many argument registers we can fill based on
+    // the size of the VM stack frame, and skip the remaining ones.
+    //
+    adr     x0, 0f                  // Take address of 'br' instruction below
+    bic     x3, x3, #7              // Ensure correct alignment
+    sub     x0, x0, x3, lsr #1      // Subtract 4 bytes for each arg to unstack
+    br      x0                      // Skip remaining argument registers
+
+    ldr     x7, [x9, #56]           // Call with 8 arguments
+    ldr     x6, [x9, #48]           //  |
+    ldr     x5, [x9, #40]           //  |
+    ldr     x4, [x9, #32]           //  |
+    ldr     x3, [x9, #24]           //  |
+    ldr     x2, [x9, #16]           //  |
+    ldr     x1, [x9, #8]            //  V
+    ldr     x0, [x9]                // Call with 1 argument
+
+0:  br      x8                      // Call with no arguments
 
-      mov  sp,  x20
-      ldp  x29, x30, [sp], #16
-      ldp  x19, x20, [sp], #16
+    //
+    // More than 64 bytes: we need to build the full native stack frame and copy
+    // the part of the VM stack exceeding 64 bytes (which may contain stacked
+    // arguments) to the native stack
+    //
+1:  stp     x29, x30, [sp, #-16]!
+    mov     x29, sp
 
-      ret
+    //
+    // Ensure that the stack pointer remains 16 byte aligned,
+    // even if the size of the VM stack frame is not a multiple of 16
+    //
+    add     x1, x1, #64             // Skip over [potential] reg params
+    tbz     x3, #3, 2f              // Multiple of 16?
+    ldr     x4, [x2, #-8]!          // No? Then push one word
+    str     x4, [sp, #-16]!         // ... but use two slots
+    b       3f
+
+2:  ldp     x4, x5, [x2, #-16]!
+    stp     x4, x5, [sp, #-16]!
+3:  cmp     x2, x1
+    b.gt    2b
+
+    ldp     x0, x1, [x9]
+    ldp     x2, x3, [x9, #16]
+    ldp     x4, x5, [x9, #32]
+    ldp     x6, x7, [x9, #48]
+
+    blr     x8
+
+    mov     sp, x29
+    ldp     x29, x30, [sp], #16
+    ret
 
 //****************************************************************************
 // EbcLLEbcInterpret
-- 
2.7.4




* [PATCH v4 4/4] MdeModulePkg/EbcDxe AARCH64: simplify interpreter entry point thunks
  2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2016-08-30 14:27 ` [PATCH v4 3/4] MdeModulePkg/EbcDxe AARCH64: use tail call for EBC to native thunk Ard Biesheuvel
@ 2016-08-30 14:27 ` Ard Biesheuvel
  2016-08-30 14:34 ` [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
  4 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:27 UTC (permalink / raw)
  To: edk2-devel, leif.lindholm; +Cc: Ard Biesheuvel

The prototypes of EbcInterpret() and ExecuteEbcImageEntryPoint() are
private to the AARCH64 implementation of EbcDxe, so we can shuffle
the arguments around a bit and make the assembler thunking glue a lot
simpler.

For ExecuteEbcImageEntryPoint(), this involves passing the EntryPoint
argument as the third parameter, rather than the first, which allows
us to do a tail call. For EbcInterpret(), instead of copying each
argument beyond #8 from one native stack frame to the next (before
another copy is made into the VM stack), pass a pointer to the
argument stack.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
---
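Note for reviewers: a small C sketch of the reshuffled EbcInterpret()
argument order. With the entry point and the pointer to args #9 - #16 as
the ninth and tenth parameters, the first eight arguments stay in their
registers and nothing needs to be shifted. interpret_sketch(), push_u64()
and the 16 slot g_vm_stack are hypothetical stand-ins; the real function
sets up a full VM context before pushing anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16 slot VM stack that grows towards lower indices. */
static uint64_t g_vm_stack[16];
static size_t   g_vm_top = 16;

static void push_u64 (uint64_t value)
{
  g_vm_stack[--g_vm_top] = value;
}

/* Same parameter order as the new prototype: Arg1..Arg8, EntryPoint, array. */
static uint64_t interpret_sketch (
  uint64_t arg1, uint64_t arg2, uint64_t arg3, uint64_t arg4,
  uint64_t arg5, uint64_t arg6, uint64_t arg7, uint64_t arg8,
  uint64_t entry_point, const uint64_t args9_16[]
  )
{
  int i;

  g_vm_top = 16;

  /* Stacked arguments are read in place through the pointer, last first. */
  for (i = 7; i >= 0; i--) {
    push_u64 (args9_16[i]);
  }

  /* Register arguments follow, so Arg1 ends up at the lowest address. */
  push_u64 (arg8);
  push_u64 (arg7);
  push_u64 (arg6);
  push_u64 (arg5);
  push_u64 (arg4);
  push_u64 (arg3);
  push_u64 (arg2);
  push_u64 (arg1);

  return entry_point;   /* the real code would start interpreting here */
}
```

The point of the sketch is only the ordering: the caller's stacked
arguments are copied once, straight into the VM stack, instead of being
copied to a second native frame first.
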
 MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 59 +++++--------------
 MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c  | 60 ++++++++------------
 2 files changed, 36 insertions(+), 83 deletions(-)

diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
index 34794c06a644..b1f09725ecf0 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
@@ -110,50 +110,23 @@ ASM_PFX(EbcLLCALLEXNative):
 //
 // This function is called by the thunk code to handle an Native to EBC call
 // This can handle up to 16 arguments (1-8 on in x0-x7, 9-16 are on the stack)
-// x16 contains the Entry point that will be the first argument when
+// x16 contains the Entry point that will be the first stacked argument when
 // EBCInterpret is called.
 //
 //****************************************************************************
 ASM_PFX(EbcLLEbcInterpret):
-    stp  x29, x30, [sp, #-16]!
-
-    // copy the current arguments 9-16 from old location and add arg 7 to stack
-    // keeping 16 byte stack alignment
-    sub sp, sp, #80
-    str x7, [sp]
-    ldr x11, [sp, #96]
-    str x11, [sp, #8]
-    ldr x11, [sp, #104]
-    str x11, [sp, #16]
-    ldr x11, [sp, #112]
-    str x11, [sp, #24]
-    ldr x11, [sp, #120]
-    str x11, [sp, #32]
-    ldr x11, [sp, #128]
-    str x11, [sp, #40]
-    ldr x11, [sp, #136]
-    str x11, [sp, #48]
-    ldr x11, [sp, #144]
-    str x11, [sp, #56]
-    ldr x11, [sp, #152]
-    str x11, [sp, #64]
-
-    // Shift arguments and add entry point and as argument 1
-    mov x7, x6
-    mov x6, x5
-    mov x5, x4
-    mov x4, x3
-    mov x3, x2
-    mov x2, x1
-    mov x1, x0
-    mov x0, x16
+    stp     x29, x30, [sp, #-16]!
+    mov     x29, sp
 
-    // call C-code
-    bl ASM_PFX(EbcInterpret)
-    add sp, sp, #80
+    // push the entry point and the address of args #9 - #16 onto the stack
+    add     x17, sp, #16
+    stp     x16, x17, [sp, #-16]!
 
-    ldp  x29, x30, [sp], #16
+    // call C-code
+    bl      ASM_PFX(EbcInterpret)
 
+    add     sp, sp, #16
+    ldp     x29, x30, [sp], #16
     ret
 
 //****************************************************************************
@@ -165,16 +138,10 @@ ASM_PFX(EbcLLEbcInterpret):
 //
 //****************************************************************************
 ASM_PFX(EbcLLExecuteEbcImageEntryPoint):
-    stp  x29, x30, [sp, #-16]!
-    // build new parameter calling convention
-    mov  x2, x1
-    mov  x1, x0
-    mov  x0, x16
+    mov     x2, x16
 
-    // call C-code
-    bl ASM_PFX(ExecuteEbcImageEntryPoint)
-    ldp  x29, x30, [sp], #16
-    ret
+    // tail call to C code
+    b       ASM_PFX(ExecuteEbcImageEntryPoint)
 
 //****************************************************************************
 // mEbcInstructionBufferTemplate
diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
index a5f21f400274..c5cc76d7bdcb 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
@@ -89,7 +89,6 @@ PushU64 (
 
   This is a thunk function.
 
-  @param  EntryPoint            The entrypoint of EBC code.
   @param  Arg1                  The 1st argument.
   @param  Arg2                  The 2nd argument.
   @param  Arg3                  The 3rd argument.
@@ -98,14 +97,8 @@ PushU64 (
   @param  Arg6                  The 6th argument.
   @param  Arg7                  The 7th argument.
   @param  Arg8                  The 8th argument.
-  @param  Arg9                  The 9th argument.
-  @param  Arg10                 The 10th argument.
-  @param  Arg11                 The 11th argument.
-  @param  Arg12                 The 12th argument.
-  @param  Arg13                 The 13th argument.
-  @param  Arg14                 The 14th argument.
-  @param  Arg15                 The 15th argument.
-  @param  Arg16                 The 16th argument.
+  @param  EntryPoint            The entrypoint of EBC code.
+  @param  Args9_16[]            Array containing arguments #9 to #16.
 
   @return The value returned by the EBC application we're going to run.
 
@@ -113,23 +106,16 @@ PushU64 (
 UINT64
 EFIAPI
 EbcInterpret (
-  IN UINTN      EntryPoint,
-  IN UINTN      Arg1,
-  IN UINTN      Arg2,
-  IN UINTN      Arg3,
-  IN UINTN      Arg4,
-  IN UINTN      Arg5,
-  IN UINTN      Arg6,
-  IN UINTN      Arg7,
-  IN UINTN      Arg8,
-  IN UINTN      Arg9,
-  IN UINTN      Arg10,
-  IN UINTN      Arg11,
-  IN UINTN      Arg12,
-  IN UINTN      Arg13,
-  IN UINTN      Arg14,
-  IN UINTN      Arg15,
-  IN UINTN      Arg16
+  IN UINTN        Arg1,
+  IN UINTN        Arg2,
+  IN UINTN        Arg3,
+  IN UINTN        Arg4,
+  IN UINTN        Arg5,
+  IN UINTN        Arg6,
+  IN UINTN        Arg7,
+  IN UINTN        Arg8,
+  IN UINTN        EntryPoint,
+  IN CONST UINTN  Args9_16[]
   )
 {
   //
@@ -193,14 +179,14 @@ EbcInterpret (
   // For the worst case, assume there are 4 arguments passed in registers, store
   // them to VM's stack.
   //
-  PushU64 (&VmContext, (UINT64) Arg16);
-  PushU64 (&VmContext, (UINT64) Arg15);
-  PushU64 (&VmContext, (UINT64) Arg14);
-  PushU64 (&VmContext, (UINT64) Arg13);
-  PushU64 (&VmContext, (UINT64) Arg12);
-  PushU64 (&VmContext, (UINT64) Arg11);
-  PushU64 (&VmContext, (UINT64) Arg10);
-  PushU64 (&VmContext, (UINT64) Arg9);
+  PushU64 (&VmContext, (UINT64) Args9_16[7]);
+  PushU64 (&VmContext, (UINT64) Args9_16[6]);
+  PushU64 (&VmContext, (UINT64) Args9_16[5]);
+  PushU64 (&VmContext, (UINT64) Args9_16[4]);
+  PushU64 (&VmContext, (UINT64) Args9_16[3]);
+  PushU64 (&VmContext, (UINT64) Args9_16[2]);
+  PushU64 (&VmContext, (UINT64) Args9_16[1]);
+  PushU64 (&VmContext, (UINT64) Args9_16[0]);
   PushU64 (&VmContext, (UINT64) Arg8);
   PushU64 (&VmContext, (UINT64) Arg7);
   PushU64 (&VmContext, (UINT64) Arg6);
@@ -252,10 +238,10 @@ EbcInterpret (
 /**
   Begin executing an EBC image.
 
-  @param  EntryPoint       The entrypoint of EBC code.
   @param  ImageHandle      image handle for the EBC application we're executing
   @param  SystemTable      standard system table passed into an driver's entry
                            point
+  @param  EntryPoint       The entrypoint of EBC code.
 
   @return The value returned by the EBC application we're going to run.
 
@@ -263,9 +249,9 @@ EbcInterpret (
 UINT64
 EFIAPI
 ExecuteEbcImageEntryPoint (
-  IN UINTN                EntryPoint,
   IN EFI_HANDLE           ImageHandle,
-  IN EFI_SYSTEM_TABLE     *SystemTable
+  IN EFI_SYSTEM_TABLE     *SystemTable,
+  IN UINTN                EntryPoint
   )
 {
   //
-- 
2.7.4
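
As an illustration of the reordered EbcInterpret() prototype, here is a
minimal C sketch (not the EbcDxe code itself) of how the sixteen arguments
end up on the VM stack when arguments #9 to #16 arrive through an array.
DEMO_VM, DemoInit, DemoPush and DemoPushArgs are hypothetical names invented
for this example; only the parameter order and the push order are taken from
the patch above.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_STACK_SLOTS 32

/* Hypothetical stand-in for the VM context: a stack that grows downward. */
typedef struct {
  uint64_t Slots[DEMO_STACK_SLOTS];
  uint64_t *Sp;                 /* points at the most recently pushed slot */
} DEMO_VM;

static void DemoInit(DEMO_VM *Vm)
{
  Vm->Sp = Vm->Slots + DEMO_STACK_SLOTS;   /* empty stack */
}

static void DemoPush(DEMO_VM *Vm, uint64_t Value)
{
  assert(Vm->Sp > Vm->Slots);              /* no overflow in this demo */
  *--Vm->Sp = Value;
}

/*
 * Same parameter order as the reworked prototype: Arg1..Arg8 first,
 * then EntryPoint, then the array holding arguments #9 to #16.
 */
static void DemoPushArgs(DEMO_VM *Vm,
                         uint64_t Arg1, uint64_t Arg2, uint64_t Arg3,
                         uint64_t Arg4, uint64_t Arg5, uint64_t Arg6,
                         uint64_t Arg7, uint64_t Arg8,
                         uint64_t EntryPoint, const uint64_t Args9_16[])
{
  const uint64_t RegArgs[8] = { Arg1, Arg2, Arg3, Arg4,
                                Arg5, Arg6, Arg7, Arg8 };
  int i;

  (void)EntryPoint;             /* unused here; the demo only models the pushes */

  /* Push in reverse so that Arg1 ends up at the top of the stack. */
  for (i = 7; i >= 0; i--) {
    DemoPush(Vm, Args9_16[i]);
  }
  for (i = 7; i >= 0; i--) {
    DemoPush(Vm, RegArgs[i]);
  }
}
```

Pushing in reverse order leaves Arg1 at the top of the downward-growing
stack and argument #16 at the highest address, which is the same layout the
Arg9..Arg16 version produced.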




* Re: [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements
  2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2016-08-30 14:27 ` [PATCH v4 4/4] MdeModulePkg/EbcDxe AARCH64: simplify interpreter entry point thunks Ard Biesheuvel
@ 2016-08-30 14:34 ` Ard Biesheuvel
  4 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:34 UTC (permalink / raw)
  To: edk2-devel-01, Leif Lindholm; +Cc: Ard Biesheuvel

On 30 August 2016 at 15:27, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> This is v4 of my proposed changes to the AARCH64 implementation of EbcDxe
> contributed by Jeff Brasen, which has recently been merged into Tianocore.
>
> Changes since v3:
> - fix typo in comment (#1)
> - clarify comments around computed goto in EBC to native thunk, and make sure
>   the jump target is 32-bit aligned (#3)
> - fix comment and constify Args9_16[] in EbcInterpret() prototype (#4)
> - add Leif's R-b (#1, #2, #4)
>
> Ard Biesheuvel (4):
>   MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file
>   MdeModulePkg/EbcDxe AARCH64: use a fixed size thunk structure
>   MdeModulePkg/EbxDxe AARCH64: use tail call for EBC to native thunk
>   MdeModulePkg/EbcDxe AARCH64: simplify interpreter entry point thunks
>
>  MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 285 +++++++++++---------
>  MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c  | 193 ++++---------
>  2 files changed, 210 insertions(+), 268 deletions(-)
>

Hmm, I seem to have forgotten to put the actual MdeModulePkg maintainers on cc.



* Re: [PATCH v4 3/4] MdeModulePkg/EbxDxe AARCH64: use tail call for EBC to native thunk
  2016-08-30 14:27 ` [PATCH v4 3/4] MdeModulePkg/EbxDxe AARCH64: use tail call for EBC to native thunk Ard Biesheuvel
@ 2016-08-30 18:52   ` Leif Lindholm
  0 siblings, 0 replies; 7+ messages in thread
From: Leif Lindholm @ 2016-08-30 18:52 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: edk2-devel

On Tue, Aug 30, 2016 at 03:27:23PM +0100, Ard Biesheuvel wrote:
> Instead of pessimistically copying at least 64 bytes from the VM stack
> to the native stack, and popping off the register arguments again
> before doing the native call, try to avoid touching the stack completely
> if the VM stack frame is <= 64 bytes. Also, if the stack frame does exceed
> 64 bytes, there is no need to copy the first 64 bytes, since we are passing
> those in registers anyway.
> 
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>

> ---
>  MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 85 +++++++++++++++-----
>  1 file changed, 65 insertions(+), 20 deletions(-)
> 
> diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
> index b4b8531f1a01..34794c06a644 100644
> --- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
> +++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
> @@ -35,30 +35,75 @@ ASM_GLOBAL ASM_PFX(mEbcInstructionBufferTemplate)
>  //****************************************************************************
>  // UINTN EbcLLCALLEXNative(UINTN FuncAddr, UINTN NewStackPointer, VOID *FramePtr)
>  ASM_PFX(EbcLLCALLEXNative):
> -      stp  x19, x20, [sp, #-16]!
> -      stp  x29, x30, [sp, #-16]!
> +    mov     x8, x0                 // Preserve x0
> +    mov     x9, x1                 // Preserve x1
>  
> -      mov  x19, x0
> -      mov  x20, sp
> -      sub  x2, x2, x1   // Length = NewStackPointer-FramePtr
> -      sub  sp, sp, x2
> -      sub  sp, sp, #64  // Make sure there is room for at least 8 args in the new stack
> -      mov  x0, sp
> -
> -      bl   CopyMem      // Sp, NewStackPointer, Length
> -
> -      ldp  x0, x1, [sp], #16
> -      ldp  x2, x3, [sp], #16
> -      ldp  x4, x5, [sp], #16
> -      ldp  x6, x7, [sp], #16
> +    //
> +    // If the EBC stack frame is smaller than or equal to 64 bytes, we know there
> +    // are no stacked arguments #9 and beyond that we need to copy to the native
> +    // stack. In this case, we can perform a tail call which is much more
> +    // efficient, since there is no need to touch the native stack at all.
> +    //
> +    sub     x3, x2, x1              // Length = FramePtr - NewStackPointer
> +    cmp     x3, #64
> +    b.gt    1f
>  
> -      blr  x19
> +    //
> +    // While probably harmless in practice, we should not access the VM stack
> +    // outside of the interval [NewStackPointer, FramePtr), which means we
> +    // should not blindly fill all 8 argument registers with VM stack data.
> +    // So instead, calculate how many argument registers we can fill based on
> +    // the size of the VM stack frame, and skip the remaining ones.
> +    //
> +    adr     x0, 0f                  // Take address of 'br' instruction below
> +    bic     x3, x3, #7              // Ensure correct alignment
> +    sub     x0, x0, x3, lsr #1      // Subtract 4 bytes for each arg to unstack
> +    br      x0                      // Skip remaining argument registers
> +
> +    ldr     x7, [x9, #56]           // Call with 8 arguments
> +    ldr     x6, [x9, #48]           //  |
> +    ldr     x5, [x9, #40]           //  |
> +    ldr     x4, [x9, #32]           //  |
> +    ldr     x3, [x9, #24]           //  |
> +    ldr     x2, [x9, #16]           //  |
> +    ldr     x1, [x9, #8]            //  V
> +    ldr     x0, [x9]                // Call with 1 argument
> +
> +0:  br      x8                      // Call with no arguments
>  
> -      mov  sp,  x20
> -      ldp  x29, x30, [sp], #16
> -      ldp  x19, x20, [sp], #16
> +    //
> +    // More than 64 bytes: we need to build the full native stack frame and copy
> +    // the part of the VM stack exceeding 64 bytes (which may contain stacked
> +    // arguments) to the native stack
> +    //
> +1:  stp     x29, x30, [sp, #-16]!
> +    mov     x29, sp
>  
> -      ret
> +    //
> +    // Ensure that the stack pointer remains 16 byte aligned,
> +    // even if the size of the VM stack frame is not a multiple of 16
> +    //
> +    add     x1, x1, #64             // Skip over [potential] reg params
> +    tbz     x3, #3, 2f              // Multiple of 16?
> +    ldr     x4, [x2, #-8]!          // No? Then push one word
> +    str     x4, [sp, #-16]!         // ... but use two slots
> +    b       3f
> +
> +2:  ldp     x4, x5, [x2, #-16]!
> +    stp     x4, x5, [sp, #-16]!
> +3:  cmp     x2, x1
> +    b.gt    2b
> +
> +    ldp     x0, x1, [x9]
> +    ldp     x2, x3, [x9, #16]
> +    ldp     x4, x5, [x9, #32]
> +    ldp     x6, x7, [x9, #48]
> +
> +    blr     x8
> +
> +    mov     sp, x29
> +    ldp     x29, x30, [sp], #16
> +    ret
>  
>  //****************************************************************************
>  // EbcLLEbcInterpret
> -- 
> 2.7.4
> 
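
For readers following the arithmetic in the thunk quoted above, here is a
rough C model of the two paths. It is only a sketch: DEMO_THUNK_PLAN and
DemoPlanThunk are hypothetical names, and the model assumes that FramePtr is
not below NewStackPointer and that the frame size is a multiple of 8 bytes.
It computes how many argument registers get loaded, how far back from label 0
the computed branch lands, and how many bytes the slow path pushes on the
native stack (not counting the saved x29/x30 pair).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of what the thunk does for a given VM stack frame. */
typedef struct {
  unsigned RegArgs;      /* argument registers loaded from the VM stack   */
  uint64_t BranchBack;   /* fast path: bytes subtracted from label 0      */
  uint64_t NativeBytes;  /* slow path: bytes pushed on the native stack   */
} DEMO_THUNK_PLAN;

static DEMO_THUNK_PLAN DemoPlanThunk(uint64_t NewStackPointer,
                                     uint64_t FramePtr)
{
  DEMO_THUNK_PLAN Plan = { 0, 0, 0 };
  uint64_t Length;

  assert(FramePtr >= NewStackPointer);       /* assumption of this model */
  Length = FramePtr - NewStackPointer;       /* sub x3, x2, x1 */

  if (Length <= 64) {
    /* Fast path: tail call, native stack untouched. */
    uint64_t Aligned = Length & ~(uint64_t)7;  /* bic x3, x3, #7 */

    /* Each 8-byte argument maps to one 4-byte ldr, hence the shift by 1. */
    Plan.BranchBack = Aligned >> 1;            /* sub x0, x0, x3, lsr #1 */
    Plan.RegArgs    = (unsigned)(Aligned / 8);
  } else {
    /* Slow path: copy everything beyond the first 64 bytes, padding an
       odd number of 8-byte words so that sp stays 16-byte aligned. */
    uint64_t Extra = Length - 64;

    Plan.RegArgs     = 8;
    Plan.NativeBytes = (Extra + 15) & ~(uint64_t)15;
  }
  return Plan;
}
```

With a 24-byte frame the model yields three register arguments and a branch
target 12 bytes before label 0, i.e. the 'ldr x2' instruction. A 72-byte
frame takes the slow path and uses one 16-byte native stack slot for the
single stacked word.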



Thread overview: 7+ messages
2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
2016-08-30 14:27 ` [PATCH v4 1/4] MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file Ard Biesheuvel
2016-08-30 14:27 ` [PATCH v4 2/4] MdeModulePkg/EbcDxe AARCH64: use a fixed size thunk structure Ard Biesheuvel
2016-08-30 14:27 ` [PATCH v4 3/4] MdeModulePkg/EbxDxe AARCH64: use tail call for EBC to native thunk Ard Biesheuvel
2016-08-30 18:52   ` Leif Lindholm
2016-08-30 14:27 ` [PATCH v4 4/4] MdeModulePkg/EbcDxe AARCH64: simplify interpreter entry point thunks Ard Biesheuvel
2016-08-30 14:34 ` [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
