* [PATCH v4 1/4] MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file
2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
@ 2016-08-30 14:27 ` Ard Biesheuvel
2016-08-30 14:27 ` [PATCH v4 2/4] MdeModulePkg/EbcDxe AARCH64: use a fixed size thunk structure Ard Biesheuvel
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:27 UTC (permalink / raw)
To: edk2-devel, leif.lindholm; +Cc: Ard Biesheuvel
Change to consistent // style comments. Also, remove bogus global
definitions for external functions, and move the real exports to
the top of the file.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
---
MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 111 +++++++++-----------
1 file changed, 52 insertions(+), 59 deletions(-)
diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
index e858227586a8..17f379248a62 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
@@ -1,40 +1,35 @@
-#/** @file
-#
-# This code provides low level routines that support the Virtual Machine
-# for option ROMs.
-#
-# Copyright (c) 2015, The Linux Foundation. All rights reserved.
-# Copyright (c) 2007 - 2014, Intel Corporation. All rights reserved.<BR>
-# This program and the accompanying materials
-# are licensed and made available under the terms and conditions of the BSD License
-# which accompanies this distribution. The full text of the license may be found at
-# http://opensource.org/licenses/bsd-license.php
-#
-# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
-# WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
-#
-#**/
-
-#---------------------------------------------------------------------------
-# Equate files needed.
-#---------------------------------------------------------------------------
-
-ASM_GLOBAL ASM_PFX(CopyMem);
-ASM_GLOBAL ASM_PFX(EbcInterpret);
-ASM_GLOBAL ASM_PFX(ExecuteEbcImageEntryPoint);
-
-#****************************************************************************
-# EbcLLCALLEX
-#
-# This function is called to execute an EBC CALLEX instruction.
-# This instruction requires that we thunk out to external native
-# code. For AArch64, we copy the VM stack into the main stack and then pop
-# the first 8 arguments off according to the AArch64 Procedure Call Standard
-# On return, we restore the stack pointer to its original location.
-#
-#****************************************************************************
-# UINTN EbcLLCALLEXNative(UINTN FuncAddr, UINTN NewStackPointer, VOID *FramePtr)
-ASM_GLOBAL ASM_PFX(EbcLLCALLEXNative);
+///** @file
+//
+// This code provides low level routines that support the Virtual Machine
+// for option ROMs.
+//
+// Copyright (c) 2015, The Linux Foundation. All rights reserved.
+// Copyright (c) 2007 - 2014, Intel Corporation. All rights reserved.<BR>
+// This program and the accompanying materials
+// are licensed and made available under the terms and conditions of the BSD License
+// which accompanies this distribution. The full text of the license may be found at
+// http://opensource.org/licenses/bsd-license.php
+//
+// THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
+// WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
+//
+//**/
+
+ASM_GLOBAL ASM_PFX(EbcLLCALLEXNative)
+ASM_GLOBAL ASM_PFX(EbcLLEbcInterpret)
+ASM_GLOBAL ASM_PFX(EbcLLExecuteEbcImageEntryPoint)
+
+//****************************************************************************
+// EbcLLCALLEX
+//
+// This function is called to execute an EBC CALLEX instruction.
+// This instruction requires that we thunk out to external native
+// code. For AArch64, we copy the VM stack into the main stack and then pop
+// the first 8 arguments off according to the AArch64 Procedure Call Standard
+// On return, we restore the stack pointer to its original location.
+//
+//****************************************************************************
+// UINTN EbcLLCALLEXNative(UINTN FuncAddr, UINTN NewStackPointer, VOID *FramePtr)
ASM_PFX(EbcLLCALLEXNative):
stp x19, x20, [sp, #-16]!
stp x29, x30, [sp, #-16]!
@@ -61,16 +56,15 @@ ASM_PFX(EbcLLCALLEXNative):
ret
-#****************************************************************************
-# EbcLLEbcInterpret
-#
-# This function is called by the thunk code to handle an Native to EBC call
-# This can handle up to 16 arguments (1-8 on in x0-x7, 9-16 are on the stack)
-# x9 contains the Entry point that will be the first argument when
-# EBCInterpret is called.
-#
-#****************************************************************************
-ASM_GLOBAL ASM_PFX(EbcLLEbcInterpret);
+//****************************************************************************
+// EbcLLEbcInterpret
+//
+// This function is called by the thunk code to handle an Native to EBC call
+// This can handle up to 16 arguments (1-8 on in x0-x7, 9-16 are on the stack)
+// x9 contains the Entry point that will be the first argument when
+// EBCInterpret is called.
+//
+//****************************************************************************
ASM_PFX(EbcLLEbcInterpret):
stp x29, x30, [sp, #-16]!
@@ -105,7 +99,7 @@ ASM_PFX(EbcLLEbcInterpret):
mov x1, x0
mov x0, x9
- # call C-code
+ // call C-code
bl ASM_PFX(EbcInterpret)
add sp, sp, #80
@@ -113,23 +107,22 @@ ASM_PFX(EbcLLEbcInterpret):
ret
-#****************************************************************************
-# EbcLLExecuteEbcImageEntryPoint
-#
-# This function is called by the thunk code to handle the image entry point
-# x9 contains the Entry point that will be the first argument when
-# ExecuteEbcImageEntryPoint is called.
-#
-#****************************************************************************
-ASM_GLOBAL ASM_PFX(EbcLLExecuteEbcImageEntryPoint);
+//****************************************************************************
+// EbcLLExecuteEbcImageEntryPoint
+//
+// This function is called by the thunk code to handle the image entry point
+// x9 contains the Entry point that will be the first argument when
+// ExecuteEbcImageEntryPoint is called.
+//
+//****************************************************************************
ASM_PFX(EbcLLExecuteEbcImageEntryPoint):
stp x29, x30, [sp, #-16]!
- # build new paramater calling convention
+ // build new parameter calling convention
mov x2, x1
mov x1, x0
mov x0, x9
- # call C-code
+ // call C-code
bl ASM_PFX(ExecuteEbcImageEntryPoint)
ldp x29, x30, [sp], #16
ret
--
2.7.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v4 2/4] MdeModulePkg/EbcDxe AARCH64: use a fixed size thunk structure
2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
2016-08-30 14:27 ` [PATCH v4 1/4] MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file Ard Biesheuvel
@ 2016-08-30 14:27 ` Ard Biesheuvel
2016-08-30 14:27 ` [PATCH v4 3/4] MdeModulePkg/EbxDxe AARCH64: use tail call for EBC to native thunk Ard Biesheuvel
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:27 UTC (permalink / raw)
To: edk2-devel, leif.lindholm; +Cc: Ard Biesheuvel
The thunk generation is needlessly complex, given that it attempts to
deal with variable length instructions, which don't exist on AArch64.
So replace it with a simple template coded in assembler, with a matching
struct definition in C. That way, we can create and manipulate the thunks
easily without looping over the instructions looking for 'magic' numbers.
Also, use x16 rather than x9, since it is the architectural register to
use for thunks/veneers.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
---
MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 32 ++++-
MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c | 133 +++++---------------
2 files changed, 58 insertions(+), 107 deletions(-)
diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
index 17f379248a62..b4b8531f1a01 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
@@ -3,8 +3,10 @@
// This code provides low level routines that support the Virtual Machine
// for option ROMs.
//
-// Copyright (c) 2015, The Linux Foundation. All rights reserved.
+// Copyright (c) 2016, Linaro, Ltd. All rights reserved.<BR>
+// Copyright (c) 2015, The Linux Foundation. All rights reserved.<BR>
// Copyright (c) 2007 - 2014, Intel Corporation. All rights reserved.<BR>
+//
// This program and the accompanying materials
// are licensed and made available under the terms and conditions of the BSD License
// which accompanies this distribution. The full text of the license may be found at
@@ -19,6 +21,8 @@ ASM_GLOBAL ASM_PFX(EbcLLCALLEXNative)
ASM_GLOBAL ASM_PFX(EbcLLEbcInterpret)
ASM_GLOBAL ASM_PFX(EbcLLExecuteEbcImageEntryPoint)
+ASM_GLOBAL ASM_PFX(mEbcInstructionBufferTemplate)
+
//****************************************************************************
// EbcLLCALLEX
//
@@ -61,7 +65,7 @@ ASM_PFX(EbcLLCALLEXNative):
//
// This function is called by the thunk code to handle an Native to EBC call
// This can handle up to 16 arguments (1-8 on in x0-x7, 9-16 are on the stack)
-// x9 contains the Entry point that will be the first argument when
+// x16 contains the Entry point that will be the first argument when
// EBCInterpret is called.
//
//****************************************************************************
@@ -97,7 +101,7 @@ ASM_PFX(EbcLLEbcInterpret):
mov x3, x2
mov x2, x1
mov x1, x0
- mov x0, x9
+ mov x0, x16
// call C-code
bl ASM_PFX(EbcInterpret)
@@ -111,7 +115,7 @@ ASM_PFX(EbcLLEbcInterpret):
// EbcLLExecuteEbcImageEntryPoint
//
// This function is called by the thunk code to handle the image entry point
-// x9 contains the Entry point that will be the first argument when
+// x16 contains the Entry point that will be the third argument when
// ExecuteEbcImageEntryPoint is called.
//
//****************************************************************************
@@ -120,9 +124,27 @@ ASM_PFX(EbcLLExecuteEbcImageEntryPoint):
// build new parameter calling convention
mov x2, x1
mov x1, x0
- mov x0, x9
+ mov x0, x16
// call C-code
bl ASM_PFX(ExecuteEbcImageEntryPoint)
ldp x29, x30, [sp], #16
ret
+
+//****************************************************************************
+// mEbcInstructionBufferTemplate
+//****************************************************************************
+ .section ".rodata", "a"
+ .align 3
+ASM_PFX(mEbcInstructionBufferTemplate):
+ adr x17, 0f
+ ldp x16, x17, [x17]
+ br x17
+
+ //
+ // Add a magic code here to help the VM recognize the thunk.
+ //
+ hlt #0xEBC
+
+0: .quad 0 // EBC_ENTRYPOINT_SIGNATURE
+ .quad 0 // EBC_LL_EBC_ENTRYPOINT_SIGNATURE
diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
index 23261a070143..a5f21f400274 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
@@ -2,8 +2,10 @@
This module contains EBC support routines that are customized based on
the target AArch64 processor.
-Copyright (c) 2015, The Linux Foundation. All rights reserved.
+Copyright (c) 2016, Linaro, Ltd. All rights reserved.<BR>
+Copyright (c) 2015, The Linux Foundation. All rights reserved.<BR>
Copyright (c) 2006 - 2014, Intel Corporation. All rights reserved.<BR>
+
This program and the accompanying materials
are licensed and made available under the terms and conditions of the BSD License
which accompanies this distribution. The full text of the license may be found at
@@ -22,47 +24,16 @@ WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
//
#define STACK_REMAIN_SIZE (1024 * 4)
-//
-// This is instruction buffer used to create EBC thunk
-//
-#define EBC_MAGIC_SIGNATURE 0xCA112EBCCA112EBCull
-#define EBC_ENTRYPOINT_SIGNATURE 0xAFAFAFAFAFAFAFAFull
-#define EBC_LL_EBC_ENTRYPOINT_SIGNATURE 0xFAFAFAFAFAFAFAFAull
-UINT8 mInstructionBufferTemplate[] = {
- 0x03, 0x00, 0x00, 0x14, //b pc+16
- //
- // Add a magic code here to help the VM recognize the thunk..
- //
- (UINT8)(EBC_MAGIC_SIGNATURE & 0xFF),
- (UINT8)((EBC_MAGIC_SIGNATURE >> 8) & 0xFF),
- (UINT8)((EBC_MAGIC_SIGNATURE >> 16) & 0xFF),
- (UINT8)((EBC_MAGIC_SIGNATURE >> 24) & 0xFF),
- (UINT8)((EBC_MAGIC_SIGNATURE >> 32) & 0xFF),
- (UINT8)((EBC_MAGIC_SIGNATURE >> 40) & 0xFF),
- (UINT8)((EBC_MAGIC_SIGNATURE >> 48) & 0xFF),
- (UINT8)((EBC_MAGIC_SIGNATURE >> 56) & 0xFF),
- 0x69, 0x00, 0x00, 0x58, //ldr x9, #32
- 0x8A, 0x00, 0x00, 0x58, //ldr x10, #40
- 0x05, 0x00, 0x00, 0x14, //b pc+32
- (UINT8)(EBC_ENTRYPOINT_SIGNATURE & 0xFF),
- (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 8) & 0xFF),
- (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 16) & 0xFF),
- (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 24) & 0xFF),
- (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 32) & 0xFF),
- (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 40) & 0xFF),
- (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 48) & 0xFF),
- (UINT8)((EBC_ENTRYPOINT_SIGNATURE >> 56) & 0xFF),
- (UINT8)(EBC_LL_EBC_ENTRYPOINT_SIGNATURE & 0xFF),
- (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 8) & 0xFF),
- (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 16) & 0xFF),
- (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 24) & 0xFF),
- (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 32) & 0xFF),
- (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 40) & 0xFF),
- (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 48) & 0xFF),
- (UINT8)((EBC_LL_EBC_ENTRYPOINT_SIGNATURE >> 56) & 0xFF),
- 0x40, 0x01, 0x1F, 0xD6 //br x10
-
-};
+#pragma pack(1)
+typedef struct {
+ UINT32 Instr[3];
+ UINT32 Magic;
+ UINT64 EbcEntryPoint;
+ UINT64 EbcLlEntryPoint;
+} EBC_INSTRUCTION_BUFFER;
+#pragma pack()
+
+extern CONST EBC_INSTRUCTION_BUFFER mEbcInstructionBufferTemplate;
/**
Begin executing an EBC image.
@@ -414,10 +385,7 @@ EbcCreateThunks (
IN UINT32 Flags
)
{
- UINT8 *Ptr;
- UINT8 *ThunkBase;
- UINT32 Index;
- INT32 ThunkSize;
+ EBC_INSTRUCTION_BUFFER *InstructionBuffer;
//
// Check alignment of pointer to EBC code
@@ -426,51 +394,38 @@ EbcCreateThunks (
return EFI_INVALID_PARAMETER;
}
- ThunkSize = sizeof(mInstructionBufferTemplate);
-
- Ptr = AllocatePool (sizeof(mInstructionBufferTemplate));
-
- if (Ptr == NULL) {
+ InstructionBuffer = AllocatePool (sizeof (EBC_INSTRUCTION_BUFFER));
+ if (InstructionBuffer == NULL) {
return EFI_OUT_OF_RESOURCES;
}
- //
- // Print(L"Allocate TH: 0x%X\n", (UINT32)Ptr);
- //
- // Save the start address so we can add a pointer to it to a list later.
- //
- ThunkBase = Ptr;
//
// Give them the address of our buffer we're going to fix up
//
- *Thunk = (VOID *) Ptr;
+ *Thunk = InstructionBuffer;
//
// Copy whole thunk instruction buffer template
//
- CopyMem (Ptr, mInstructionBufferTemplate, sizeof(mInstructionBufferTemplate));
+ CopyMem (InstructionBuffer, &mEbcInstructionBufferTemplate,
+ sizeof (EBC_INSTRUCTION_BUFFER));
//
// Patch EbcEntryPoint and EbcLLEbcInterpret
//
- for (Index = 0; Index < sizeof(mInstructionBufferTemplate) - sizeof(UINTN); Index++) {
- if (*(UINTN *)&Ptr[Index] == EBC_ENTRYPOINT_SIGNATURE) {
- *(UINTN *)&Ptr[Index] = (UINTN)EbcEntryPoint;
- }
- if (*(UINTN *)&Ptr[Index] == EBC_LL_EBC_ENTRYPOINT_SIGNATURE) {
- if ((Flags & FLAG_THUNK_ENTRY_POINT) != 0) {
- *(UINTN *)&Ptr[Index] = (UINTN)EbcLLExecuteEbcImageEntryPoint;
- } else {
- *(UINTN *)&Ptr[Index] = (UINTN)EbcLLEbcInterpret;
- }
- }
+ InstructionBuffer->EbcEntryPoint = (UINT64)EbcEntryPoint;
+ if ((Flags & FLAG_THUNK_ENTRY_POINT) != 0) {
+ InstructionBuffer->EbcLlEntryPoint = (UINT64)EbcLLExecuteEbcImageEntryPoint;
+ } else {
+ InstructionBuffer->EbcLlEntryPoint = (UINT64)EbcLLEbcInterpret;
}
//
// Add the thunk to the list for this image. Do this last since the add
// function flushes the cache for us.
//
- EbcAddImageThunk (ImageHandle, (VOID *) ThunkBase, ThunkSize);
+ EbcAddImageThunk (ImageHandle, InstructionBuffer,
+ sizeof (EBC_INSTRUCTION_BUFFER));
return EFI_SUCCESS;
}
@@ -500,40 +455,15 @@ EbcLLCALLEX (
IN UINT8 Size
)
{
- UINTN IsThunk;
- UINTN TargetEbcAddr;
- UINT8 InstructionBuffer[sizeof(mInstructionBufferTemplate)];
- UINTN Index;
- UINTN IndexOfEbcEntrypoint;
-
- IsThunk = 1;
- TargetEbcAddr = 0;
- IndexOfEbcEntrypoint = 0;
+ CONST EBC_INSTRUCTION_BUFFER *InstructionBuffer;
//
// Processor specific code to check whether the callee is a thunk to EBC.
//
- CopyMem (InstructionBuffer, (VOID *)FuncAddr, sizeof(InstructionBuffer));
- //
- // Fill the signature according to mInstructionBufferTemplate
- //
- for (Index = 0; Index < sizeof(mInstructionBufferTemplate) - sizeof(UINTN); Index++) {
- if (*(UINTN *)&mInstructionBufferTemplate[Index] == EBC_ENTRYPOINT_SIGNATURE) {
- *(UINTN *)&InstructionBuffer[Index] = EBC_ENTRYPOINT_SIGNATURE;
- IndexOfEbcEntrypoint = Index;
- }
- if (*(UINTN *)&mInstructionBufferTemplate[Index] == EBC_LL_EBC_ENTRYPOINT_SIGNATURE) {
- *(UINTN *)&InstructionBuffer[Index] = EBC_LL_EBC_ENTRYPOINT_SIGNATURE;
- }
- }
- //
- // Check if we need thunk to native
- //
- if (CompareMem (InstructionBuffer, mInstructionBufferTemplate, sizeof(mInstructionBufferTemplate)) != 0) {
- IsThunk = 0;
- }
+ InstructionBuffer = (EBC_INSTRUCTION_BUFFER *)FuncAddr;
- if (IsThunk == 1){
+ if (CompareMem (InstructionBuffer, &mEbcInstructionBufferTemplate,
+ sizeof(EBC_INSTRUCTION_BUFFER) - 2 * sizeof (UINT64)) == 0) {
//
// The callee is a thunk to EBC, adjust the stack pointer down 16 bytes and
// put our return address and frame pointer on the VM stack.
@@ -545,8 +475,7 @@ EbcLLCALLEX (
VmPtr->Gpr[0] -= 8;
VmWriteMem64 (VmPtr, (UINTN) VmPtr->Gpr[0], (UINT64) (UINTN) (VmPtr->Ip + Size));
- CopyMem (&TargetEbcAddr, (UINT8 *)FuncAddr + IndexOfEbcEntrypoint, sizeof(UINTN));
- VmPtr->Ip = (VMIP) (UINTN) TargetEbcAddr;
+ VmPtr->Ip = (VMIP) InstructionBuffer->EbcEntryPoint;
} else {
//
// The callee is not a thunk to EBC, call native code,
--
2.7.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v4 3/4] MdeModulePkg/EbxDxe AARCH64: use tail call for EBC to native thunk
2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
2016-08-30 14:27 ` [PATCH v4 1/4] MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file Ard Biesheuvel
2016-08-30 14:27 ` [PATCH v4 2/4] MdeModulePkg/EbcDxe AARCH64: use a fixed size thunk structure Ard Biesheuvel
@ 2016-08-30 14:27 ` Ard Biesheuvel
2016-08-30 18:52 ` Leif Lindholm
2016-08-30 14:27 ` [PATCH v4 4/4] MdeModulePkg/EbcDxe AARCH64: simplify interpreter entry point thunks Ard Biesheuvel
2016-08-30 14:34 ` [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
4 siblings, 1 reply; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:27 UTC (permalink / raw)
To: edk2-devel, leif.lindholm; +Cc: Ard Biesheuvel
Instead of pessimistically copying at least 64 bytes from the VM stack
to the native stack, and popping off the register arguments again
before doing the native call, try to avoid touching the stack completely
if the VM stack frame is <= 64 bytes. Also, if the stack frame does exceed
64 bytes, there is no need to copy the first 64 bytes, since we are passing
those in registers anyway.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 85 +++++++++++++++-----
1 file changed, 65 insertions(+), 20 deletions(-)
diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
index b4b8531f1a01..34794c06a644 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
@@ -35,30 +35,75 @@ ASM_GLOBAL ASM_PFX(mEbcInstructionBufferTemplate)
//****************************************************************************
// UINTN EbcLLCALLEXNative(UINTN FuncAddr, UINTN NewStackPointer, VOID *FramePtr)
ASM_PFX(EbcLLCALLEXNative):
- stp x19, x20, [sp, #-16]!
- stp x29, x30, [sp, #-16]!
+ mov x8, x0 // Preserve x0
+ mov x9, x1 // Preserve x1
- mov x19, x0
- mov x20, sp
- sub x2, x2, x1 // Length = NewStackPointer-FramePtr
- sub sp, sp, x2
- sub sp, sp, #64 // Make sure there is room for at least 8 args in the new stack
- mov x0, sp
-
- bl CopyMem // Sp, NewStackPointer, Length
-
- ldp x0, x1, [sp], #16
- ldp x2, x3, [sp], #16
- ldp x4, x5, [sp], #16
- ldp x6, x7, [sp], #16
+ //
+ // If the EBC stack frame is smaller than or equal to 64 bytes, we know there
+ // are no stacked arguments #9 and beyond that we need to copy to the native
+ // stack. In this case, we can perform a tail call which is much more
+ // efficient, since there is no need to touch the native stack at all.
+ //
+ sub x3, x2, x1 // Length = NewStackPointer - FramePtr
+ cmp x3, #64
+ b.gt 1f
- blr x19
+ //
+ // While probably harmless in practice, we should not access the VM stack
+ // outside of the interval [NewStackPointer, FramePtr), which means we
+ // should not blindly fill all 8 argument registers with VM stack data.
+ // So instead, calculate how many argument registers we can fill based on
+ // the size of the VM stack frame, and skip the remaining ones.
+ //
+ adr x0, 0f // Take address of 'br' instruction below
+ bic x3, x3, #7 // Ensure correct alignment
+ sub x0, x0, x3, lsr #1 // Subtract 4 bytes for each arg to unstack
+ br x0 // Skip remaining argument registers
+
+ ldr x7, [x9, #56] // Call with 8 arguments
+ ldr x6, [x9, #48] // |
+ ldr x5, [x9, #40] // |
+ ldr x4, [x9, #32] // |
+ ldr x3, [x9, #24] // |
+ ldr x2, [x9, #16] // |
+ ldr x1, [x9, #8] // V
+ ldr x0, [x9] // Call with 1 argument
+
+0: br x8 // Call with no arguments
- mov sp, x20
- ldp x29, x30, [sp], #16
- ldp x19, x20, [sp], #16
+ //
+ // More than 64 bytes: we need to build the full native stack frame and copy
+ // the part of the VM stack exceeding 64 bytes (which may contain stacked
+ // arguments) to the native stack
+ //
+1: stp x29, x30, [sp, #-16]!
+ mov x29, sp
- ret
+ //
+ // Ensure that the stack pointer remains 16 byte aligned,
+ // even if the size of the VM stack frame is not a multiple of 16
+ //
+ add x1, x1, #64 // Skip over [potential] reg params
+ tbz x3, #3, 2f // Multiple of 16?
+ ldr x4, [x2, #-8]! // No? Then push one word
+ str x4, [sp, #-16]! // ... but use two slots
+ b 3f
+
+2: ldp x4, x5, [x2, #-16]!
+ stp x4, x5, [sp, #-16]!
+3: cmp x2, x1
+ b.gt 2b
+
+ ldp x0, x1, [x9]
+ ldp x2, x3, [x9, #16]
+ ldp x4, x5, [x9, #32]
+ ldp x6, x7, [x9, #48]
+
+ blr x8
+
+ mov sp, x29
+ ldp x29, x30, [sp], #16
+ ret
//****************************************************************************
// EbcLLEbcInterpret
--
2.7.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v4 3/4] MdeModulePkg/EbxDxe AARCH64: use tail call for EBC to native thunk
2016-08-30 14:27 ` [PATCH v4 3/4] MdeModulePkg/EbxDxe AARCH64: use tail call for EBC to native thunk Ard Biesheuvel
@ 2016-08-30 18:52 ` Leif Lindholm
0 siblings, 0 replies; 7+ messages in thread
From: Leif Lindholm @ 2016-08-30 18:52 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: edk2-devel
On Tue, Aug 30, 2016 at 03:27:23PM +0100, Ard Biesheuvel wrote:
> Instead of pessimistically copying at least 64 bytes from the VM stack
> to the native stack, and popping off the register arguments again
> before doing the native call, try to avoid touching the stack completely
> if the VM stack frame is <= 64 bytes. Also, if the stack frame does exceed
> 64 bytes, there is no need to copy the first 64 bytes, since we are passing
> those in registers anyway.
>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
> ---
> MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 85 +++++++++++++++-----
> 1 file changed, 65 insertions(+), 20 deletions(-)
>
> diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
> index b4b8531f1a01..34794c06a644 100644
> --- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
> +++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
> @@ -35,30 +35,75 @@ ASM_GLOBAL ASM_PFX(mEbcInstructionBufferTemplate)
> //****************************************************************************
> // UINTN EbcLLCALLEXNative(UINTN FuncAddr, UINTN NewStackPointer, VOID *FramePtr)
> ASM_PFX(EbcLLCALLEXNative):
> - stp x19, x20, [sp, #-16]!
> - stp x29, x30, [sp, #-16]!
> + mov x8, x0 // Preserve x0
> + mov x9, x1 // Preserve x1
>
> - mov x19, x0
> - mov x20, sp
> - sub x2, x2, x1 // Length = NewStackPointer-FramePtr
> - sub sp, sp, x2
> - sub sp, sp, #64 // Make sure there is room for at least 8 args in the new stack
> - mov x0, sp
> -
> - bl CopyMem // Sp, NewStackPointer, Length
> -
> - ldp x0, x1, [sp], #16
> - ldp x2, x3, [sp], #16
> - ldp x4, x5, [sp], #16
> - ldp x6, x7, [sp], #16
> + //
> + // If the EBC stack frame is smaller than or equal to 64 bytes, we know there
> + // are no stacked arguments #9 and beyond that we need to copy to the native
> + // stack. In this case, we can perform a tail call which is much more
> + // efficient, since there is no need to touch the native stack at all.
> + //
> + sub x3, x2, x1 // Length = NewStackPointer - FramePtr
> + cmp x3, #64
> + b.gt 1f
>
> - blr x19
> + //
> + // While probably harmless in practice, we should not access the VM stack
> + // outside of the interval [NewStackPointer, FramePtr), which means we
> + // should not blindly fill all 8 argument registers with VM stack data.
> + // So instead, calculate how many argument registers we can fill based on
> + // the size of the VM stack frame, and skip the remaining ones.
> + //
> + adr x0, 0f // Take address of 'br' instruction below
> + bic x3, x3, #7 // Ensure correct alignment
> + sub x0, x0, x3, lsr #1 // Subtract 4 bytes for each arg to unstack
> + br x0 // Skip remaining argument registers
> +
> + ldr x7, [x9, #56] // Call with 8 arguments
> + ldr x6, [x9, #48] // |
> + ldr x5, [x9, #40] // |
> + ldr x4, [x9, #32] // |
> + ldr x3, [x9, #24] // |
> + ldr x2, [x9, #16] // |
> + ldr x1, [x9, #8] // V
> + ldr x0, [x9] // Call with 1 argument
> +
> +0: br x8 // Call with no arguments
>
> - mov sp, x20
> - ldp x29, x30, [sp], #16
> - ldp x19, x20, [sp], #16
> + //
> + // More than 64 bytes: we need to build the full native stack frame and copy
> + // the part of the VM stack exceeding 64 bytes (which may contain stacked
> + // arguments) to the native stack
> + //
> +1: stp x29, x30, [sp, #-16]!
> + mov x29, sp
>
> - ret
> + //
> + // Ensure that the stack pointer remains 16 byte aligned,
> + // even if the size of the VM stack frame is not a multiple of 16
> + //
> + add x1, x1, #64 // Skip over [potential] reg params
> + tbz x3, #3, 2f // Multiple of 16?
> + ldr x4, [x2, #-8]! // No? Then push one word
> + str x4, [sp, #-16]! // ... but use two slots
> + b 3f
> +
> +2: ldp x4, x5, [x2, #-16]!
> + stp x4, x5, [sp, #-16]!
> +3: cmp x2, x1
> + b.gt 2b
> +
> + ldp x0, x1, [x9]
> + ldp x2, x3, [x9, #16]
> + ldp x4, x5, [x9, #32]
> + ldp x6, x7, [x9, #48]
> +
> + blr x8
> +
> + mov sp, x29
> + ldp x29, x30, [sp], #16
> + ret
>
> //****************************************************************************
> // EbcLLEbcInterpret
> --
> 2.7.4
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v4 4/4] MdeModulePkg/EbcDxe AARCH64: simplify interpreter entry point thunks
2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
` (2 preceding siblings ...)
2016-08-30 14:27 ` [PATCH v4 3/4] MdeModulePkg/EbxDxe AARCH64: use tail call for EBC to native thunk Ard Biesheuvel
@ 2016-08-30 14:27 ` Ard Biesheuvel
2016-08-30 14:34 ` [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
4 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:27 UTC (permalink / raw)
To: edk2-devel, leif.lindholm; +Cc: Ard Biesheuvel
The prototypes of EbcInterpret() and ExecuteEbcImageEntryPoint() are
private to the AARCH64 implementation of EbcDxe, so we can shuffle
the arguments around a bit and make the assembler thunking glue a lot
simpler.
For ExecuteEbcImageEntryPoint(), this involves passing the EntryPoint
argument as the third parameter, rather than the first, which allows
us to do a tail call. For EbcInterpret(), instead of copying each
argument beyond #8 from one native stack frame to the next (before
another copy is made into the VM stack), pass a pointer to the
argument stack.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
---
MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 59 +++++--------------
MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c | 60 ++++++++------------
2 files changed, 36 insertions(+), 83 deletions(-)
diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
index 34794c06a644..b1f09725ecf0 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S
@@ -110,50 +110,23 @@ ASM_PFX(EbcLLCALLEXNative):
//
// This function is called by the thunk code to handle an Native to EBC call
// This can handle up to 16 arguments (1-8 on in x0-x7, 9-16 are on the stack)
-// x16 contains the Entry point that will be the first argument when
+// x16 contains the Entry point that will be the first stacked argument when
// EBCInterpret is called.
//
//****************************************************************************
ASM_PFX(EbcLLEbcInterpret):
- stp x29, x30, [sp, #-16]!
-
- // copy the current arguments 9-16 from old location and add arg 7 to stack
- // keeping 16 byte stack alignment
- sub sp, sp, #80
- str x7, [sp]
- ldr x11, [sp, #96]
- str x11, [sp, #8]
- ldr x11, [sp, #104]
- str x11, [sp, #16]
- ldr x11, [sp, #112]
- str x11, [sp, #24]
- ldr x11, [sp, #120]
- str x11, [sp, #32]
- ldr x11, [sp, #128]
- str x11, [sp, #40]
- ldr x11, [sp, #136]
- str x11, [sp, #48]
- ldr x11, [sp, #144]
- str x11, [sp, #56]
- ldr x11, [sp, #152]
- str x11, [sp, #64]
-
- // Shift arguments and add entry point and as argument 1
- mov x7, x6
- mov x6, x5
- mov x5, x4
- mov x4, x3
- mov x3, x2
- mov x2, x1
- mov x1, x0
- mov x0, x16
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
- // call C-code
- bl ASM_PFX(EbcInterpret)
- add sp, sp, #80
+ // push the entry point and the address of args #9 - #16 onto the stack
+ add x17, sp, #16
+ stp x16, x17, [sp, #-16]!
- ldp x29, x30, [sp], #16
+ // call C-code
+ bl ASM_PFX(EbcInterpret)
+ add sp, sp, #16
+ ldp x29, x30, [sp], #16
ret
//****************************************************************************
@@ -165,16 +138,10 @@ ASM_PFX(EbcLLEbcInterpret):
//
//****************************************************************************
ASM_PFX(EbcLLExecuteEbcImageEntryPoint):
- stp x29, x30, [sp, #-16]!
- // build new parameter calling convention
- mov x2, x1
- mov x1, x0
- mov x0, x16
+ mov x2, x16
- // call C-code
- bl ASM_PFX(ExecuteEbcImageEntryPoint)
- ldp x29, x30, [sp], #16
- ret
+ // tail call to C code
+ b ASM_PFX(ExecuteEbcImageEntryPoint)
//****************************************************************************
// mEbcInstructionBufferTemplate
diff --git a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
index a5f21f400274..c5cc76d7bdcb 100644
--- a/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
+++ b/MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c
@@ -89,7 +89,6 @@ PushU64 (
This is a thunk function.
- @param EntryPoint The entrypoint of EBC code.
@param Arg1 The 1st argument.
@param Arg2 The 2nd argument.
@param Arg3 The 3rd argument.
@@ -98,14 +97,8 @@ PushU64 (
@param Arg6 The 6th argument.
@param Arg7 The 7th argument.
@param Arg8 The 8th argument.
- @param Arg9 The 9th argument.
- @param Arg10 The 10th argument.
- @param Arg11 The 11th argument.
- @param Arg12 The 12th argument.
- @param Arg13 The 13th argument.
- @param Arg14 The 14th argument.
- @param Arg15 The 15th argument.
- @param Arg16 The 16th argument.
+ @param EntryPoint The entrypoint of EBC code.
+ @param Args9_16[] Array containing arguments #9 to #16.
@return The value returned by the EBC application we're going to run.
@@ -113,23 +106,16 @@ PushU64 (
UINT64
EFIAPI
EbcInterpret (
- IN UINTN EntryPoint,
- IN UINTN Arg1,
- IN UINTN Arg2,
- IN UINTN Arg3,
- IN UINTN Arg4,
- IN UINTN Arg5,
- IN UINTN Arg6,
- IN UINTN Arg7,
- IN UINTN Arg8,
- IN UINTN Arg9,
- IN UINTN Arg10,
- IN UINTN Arg11,
- IN UINTN Arg12,
- IN UINTN Arg13,
- IN UINTN Arg14,
- IN UINTN Arg15,
- IN UINTN Arg16
+ IN UINTN Arg1,
+ IN UINTN Arg2,
+ IN UINTN Arg3,
+ IN UINTN Arg4,
+ IN UINTN Arg5,
+ IN UINTN Arg6,
+ IN UINTN Arg7,
+ IN UINTN Arg8,
+ IN UINTN EntryPoint,
+ IN CONST UINTN Args9_16[]
)
{
//
@@ -193,14 +179,14 @@ EbcInterpret (
// For the worst case, assume there are 4 arguments passed in registers, store
// them to VM's stack.
//
- PushU64 (&VmContext, (UINT64) Arg16);
- PushU64 (&VmContext, (UINT64) Arg15);
- PushU64 (&VmContext, (UINT64) Arg14);
- PushU64 (&VmContext, (UINT64) Arg13);
- PushU64 (&VmContext, (UINT64) Arg12);
- PushU64 (&VmContext, (UINT64) Arg11);
- PushU64 (&VmContext, (UINT64) Arg10);
- PushU64 (&VmContext, (UINT64) Arg9);
+ PushU64 (&VmContext, (UINT64) Args9_16[7]);
+ PushU64 (&VmContext, (UINT64) Args9_16[6]);
+ PushU64 (&VmContext, (UINT64) Args9_16[5]);
+ PushU64 (&VmContext, (UINT64) Args9_16[4]);
+ PushU64 (&VmContext, (UINT64) Args9_16[3]);
+ PushU64 (&VmContext, (UINT64) Args9_16[2]);
+ PushU64 (&VmContext, (UINT64) Args9_16[1]);
+ PushU64 (&VmContext, (UINT64) Args9_16[0]);
PushU64 (&VmContext, (UINT64) Arg8);
PushU64 (&VmContext, (UINT64) Arg7);
PushU64 (&VmContext, (UINT64) Arg6);
@@ -252,10 +238,10 @@ EbcInterpret (
/**
Begin executing an EBC image.
- @param EntryPoint The entrypoint of EBC code.
@param ImageHandle image handle for the EBC application we're executing
@param SystemTable standard system table passed into an driver's entry
point
+ @param EntryPoint The entrypoint of EBC code.
@return The value returned by the EBC application we're going to run.
@@ -263,9 +249,9 @@ EbcInterpret (
UINT64
EFIAPI
ExecuteEbcImageEntryPoint (
- IN UINTN EntryPoint,
IN EFI_HANDLE ImageHandle,
- IN EFI_SYSTEM_TABLE *SystemTable
+ IN EFI_SYSTEM_TABLE *SystemTable,
+ IN UINTN EntryPoint
)
{
//
--
2.7.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements
2016-08-30 14:27 [PATCH v4 0/4] MdeModulePkg/EbcDxe: AARCH64 improvements Ard Biesheuvel
` (3 preceding siblings ...)
2016-08-30 14:27 ` [PATCH v4 4/4] MdeModulePkg/EbcDxe AARCH64: simplify interpreter entry point thunks Ard Biesheuvel
@ 2016-08-30 14:34 ` Ard Biesheuvel
4 siblings, 0 replies; 7+ messages in thread
From: Ard Biesheuvel @ 2016-08-30 14:34 UTC (permalink / raw)
To: edk2-devel-01, Leif Lindholm; +Cc: Ard Biesheuvel
On 30 August 2016 at 15:27, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> This is v4 of my proposed changes to the AARCH64 implementation of EbcDxe
> contributed by Jeff Brasen, which has recently been merged into Tianocore.
>
> Changes since v3:
> - fix typo in comment (#1)
> - clarify comments around computed goto in EBC to native thunk, and make sure
> the jump target is 32-bit aligned (#3)
> - fix comment and constify Args9_16[] in EbcInterpret() prototype (#4)
> - add Leif's R-b (#1, #2, #4)
>
> Ard Biesheuvel (4):
> MdeModulePkg/EbcDxe AARCH64: clean up comment style in ASM file
> MdeModulePkg/EbcDxe AARCH64: use a fixed size thunk structure
> MdeModulePkg/EbxDxe AARCH64: use tail call for EBC to native thunk
> MdeModulePkg/EbcDxe AARCH64: simplify interpreter entry point thunks
>
> MdeModulePkg/Universal/EbcDxe/AArch64/EbcLowLevel.S | 285 +++++++++++---------
> MdeModulePkg/Universal/EbcDxe/AArch64/EbcSupport.c | 193 ++++---------
> 2 files changed, 210 insertions(+), 268 deletions(-)
>
Hmm, I seem to have forgotten to put the actual MdeModulePkg maintainers on cc.
^ permalink raw reply [flat|nested] 7+ messages in thread