* [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
@ 2016-09-22 9:43 Pete Batard
2016-09-22 10:06 ` Ard Biesheuvel
0 siblings, 1 reply; 12+ messages in thread
From: Pete Batard @ 2016-09-22 9:43 UTC (permalink / raw)
To: edk2-devel
Hi,
The following is an updated/fixed version of the patch(es), put forward
by Ard Biesheuvel on August 9 ([1], [2]), and re-submitted for formal
inclusion, so that the EDK2 can provide EBC functionality for all of
IA32, IA64, X64, AARCH64 and ARM at last.
This updated patch now includes the necessary corollary dsc/fdf updates
as well as fixes to the assembly's EbcLLCALLEXNative, as I found the
following issues there:
- At least gcc5 didn't seem to like the manually optimized branching for
all register args ("sub r1, r1, r3, lsr #1"), and one can never be sure
of the actual size instructions will be assembled into, in case of
assembler internal alignment/optimization, so I broke it down into
actual labelled branches. There are only 4 of those anyway.
- For register + stack calls, while 8 x 64 bit registers on AARCH64 do
equate to #64 bytes that need to be taken off the stack, on ARM the 4 x
32 bit registers equate to #16 bytes, not #32
- Even after fixing the above, I found some issues with the manual stack
duplication assembly code, so I switched to using a call to CopyMem(),
like IA32 does.
With these changes, I believe that the ARM/EBC feature should be fully
functional, especially as I have heavily tested multiparameter calls
from EBC into native, using an fasmg-based EBC assembler [3], to confirm
that they performed just as well with ARM as with AARCH64, IA32 or X64.
Regards,
/Pete
[1] https://patches.linaro.org/patch/73554/
[2] https://patches.linaro.org/patch/73559/
[3] https://github.com/pbatard/fasmg-ebc
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 9:43 [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support Pete Batard
@ 2016-09-22 10:06 ` Ard Biesheuvel
2016-09-22 10:13 ` Ard Biesheuvel
2016-09-22 11:05 ` Pete Batard
0 siblings, 2 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-09-22 10:06 UTC (permalink / raw)
To: Pete Batard; +Cc: edk2-devel@lists.01.org
On 22 September 2016 at 10:43, Pete Batard <pete@akeo.ie> wrote:
> Hi,
>
> The following is an updated/fixed version of the patch(es), put forward by
> Ard Biesheuvel on August 9 ([1], [2]), and re-submitted for formal
> inclusion, so that the EDK2 can provide EBC functionality for all of IA32,
> IA64, X64, AARCH64 and ARM at last.
>
> This updated patch now includes the necessary corollary dsc/fdf updates as
> well as fixes to the assembly's EbcLLCALLEXNative, as I found the following
> issues there:
> - At least gcc5 didn't seem to like the manually optimized branching for all
> register args ("sub r1, r1, r3, lsr #1"), and one can never be sure of the
> actual size instructions will be assembled into, in case of assembler
> internal alignment/optimization, so I broke it down into actual labelled
> branches. There are only 4 of those anyway.
> - For register + stack calls, while 8 x 64 bit registers on AARCH64 do
> equate to #64 bytes that need to be taken off the stack, on ARM the 4 x 32
> bit registers equate to #16 bytes, not #32
> - Even after fixing the above, I found some issues with the manual stack
> duplication assembly code, so I switched to using a call to CopyMem(), like
> IA32 does.
>
> With these changes, I believe that the ARM/EBC feature should be fully
> functional, especially as I have heavily tested multiparameter calls from
> EBC into native, using an fasmg-based EBC assembler [3], to confirm that
> they performed just as well with ARM as with AARCH64, IA32 or X64.
>
Hello Pete,
Thanks a lot for this contribution. I had spotted (and fixed) some of
the above issues as well.
However, there is a fundamental issue with EBC on ARM that has not
been addressed yet, which makes EBC support problematic:
ARM uses natural alignment for 64-bit types, which means it leaves
gaps in the stack frame, and the thunking code has no way of dealing
with that.
I am pasting my analysis below, which I sent out internally a couple
of weeks ago. In summary, we need language spec and compiler updates
before we can fully support this on 32-bit ARM.
Thanks,
Ard.
--------------------------------------
This compares the EBC argument stack with the argument assignment across
registers and stack expected by the respective Procedure Call Standards
for AArch64 and AArch32.
Since the EBC thunking layer is not aware of the actual prototype signature
of the function that is being called (it does not even know which part of the
stack frame consists of outgoing arguments, and so it needs to assume that the
entire stack frame needs to be copied into arguments and the native stack), the
calls can only execute correctly if the EBC stack frame happens to align all
arguments natively, in which case the AAPCS happens to agree with the EBC
cdecl calling conventions (although the first 8 resp 4 arguments are passed
via registers). In the diagrams below, this is the case if the diagrams line
up horizontally.
In summary, EBC on AArch64 seems to be OK (although more testing is needed),
as long as we don't pass arguments whose size exceeds 64 bits (which the EBC
compiler is unlikely to support anyway)
EBC on AArch32 happens to work as long as no UINT64 values appear as the
return value or as an odd-numbered argument. Since it is impossible to infer
from EBC bytecode whether any such function calls are being performed, the only
way to fix this is to update the EBC spec (and the compiler) to insert hints
into the bytecode when such problematic values occur.
Below is a comparison between the stack frame layouts of various protocol
entry points that are relevant to EBC drivers, i.e., PCI I/O, block I/O and
network I/O)
--------------------------------------------------------------------------------
typedef
EFI_STATUS
(EFIAPI *EFI_BLOCK_READ)(
IN EFI_BLOCK_IO_PROTOCOL *This,
IN UINT32 MediaId,
IN EFI_LBA Lba,
IN UINTN BufferSize,
OUT VOID *Buffer
);
Executing on 64-bit (ok)
------------------------
EBC stack AArch64 registers
0x00 +----------------+ +----------------+
| This | x0 | This |
0x08 +----------------+ +----------------+
| MediaId | x1 | MediaId |
0x10 +----------------+ +----------------+
| Lba | x2 | Lba |
0x18 +----------------+ +----------------+
| BufferSize | x3 | BufferSize |
0x20 +----------------+ +----------------+
| Buffer | x4 | Buffer |
0x28 +----------------+ +----------------+
: :
+----------------+ +----------------+
R7 | Return value | x0 | Return value |
+----------------+ +----------------+
Executing on 32-bit (ok)
------------------------
EBC stack AArch32 registers + stack
0x00 +----------------+ +----------------+
| This | r0 | This |
0x04 +----------------+ +----------------+
| MediaId | r1 | MediaId |
0x08 +----------------+ +----------------+
| Lba | r2 | Lba |
| | r3 | |
0x10 +----------------+ 0x0 +================+
| BufferSize | | BufferSize |
0x14 +----------------+ 0x4 +----------------+
| Buffer | | Buffer |
0x18 +----------------+ 0x8 +----------------+
: :
+----------------+ +----------------+
R7 | Return value | r0 | Return value |
+----------------+ +----------------+
--------------------------------------------------------------------------------
typedef
EFI_STATUS
(EFIAPI *EFI_PCI_IO_PROTOCOL_ALLOCATE_BUFFER)(
IN EFI_PCI_IO_PROTOCOL *This,
IN EFI_ALLOCATE_TYPE Type,
IN EFI_MEMORY_TYPE MemoryType,
IN UINTN Pages,
OUT VOID **HostAddress,
IN UINT64 Attributes
);
Executing on 64-bit (ok)
------------------------
EBC stack AArch64 registers
0x00 +----------------+ +----------------+
| This | x0 | This |
0x08 +----------------+ +----------------+
| Type | x1 | Type |
0x10 +----------------+ +----------------+
| MemoryType | x2 | MemoryType |
0x18 +----------------+ +----------------+
| Pages | x3 | Pages |
0x20 +----------------+ +----------------+
| HostAddress | x4 | HostAddress |
0x28 +----------------+ +----------------+
| Attributes | x5 | Attributes |
0x30 +----------------+ +----------------+
: :
+----------------+ +----------------+
R7 | Return value | x0 | Return value |
+----------------+ +----------------+
Executing on 32-bit (FAIL)
--------------------------
EBC stack AArch32 registers + stack
0x00 +----------------+ +----------------+
| This | r0 | This |
0x04 +----------------+ +----------------+
| Type | r1 | Type |
0x08 +----------------+ +----------------+
| MemoryType | r2 | MemoryType |
0x0c +----------------+ +----------------+
| Pages | r3 | Pages |
0x10 +----------------+ 0x00 +================+
| HostAddress | | HostAddress |
0x14 +----------------+ 0x04 +----------------+
| Attributes | | <padding> | <----------
| | 0x08 +----------------+
0x1c +----------------+ | Attributes |
: | |
: 0x10 +----------------+
: :
+----------------+ +----------------+
R7 | Return value | r0 | Return value |
+----------------+ +----------------+
--------------------------------------------------------------------------------
typedef
EFI_STATUS
(EFIAPI *EFI_SIMPLE_NETWORK_RECEIVE)(
IN EFI_SIMPLE_NETWORK_PROTOCOL *This,
OUT UINTN *HeaderSize OPTIONAL,
IN OUT UINTN *BufferSize,
OUT VOID *Buffer,
OUT EFI_MAC_ADDRESS *SrcAddr OPTIONAL,
OUT EFI_MAC_ADDRESS *DestAddr OPTIONAL,
OUT UINT16 *Protocol OPTIONAL
);
Executing on 64-bit (ok)
------------------------
EBC stack AArch64 registers
0x00 +----------------+ +----------------+
| This | x0 | This |
0x08 +----------------+ +----------------+
| HeaderSize | x1 | HeaderSize |
0x10 +----------------+ +----------------+
| BufferSize | x2 | BufferSize |
0x18 +----------------+ +----------------+
| Buffer | x3 | Buffer |
0x20 +----------------+ +----------------+
| SrcAddr | x4 | SrcAddr |
0x28 +----------------+ +----------------+
| DestAddr | x5 | DestAddr |
0x30 +----------------+ +----------------+
| Protocol | x6 | Protocol |
0x38 +----------------+ +----------------+
+----------------+ +----------------+
R7 | Return value | x0 | Return value |
+----------------+ +----------------+
Executing on 32-bit (ok)
------------------------
EBC stack AArch32 registers + stack
0x00 +----------------+ +----------------+
| This | r0 | This |
0x04 +----------------+ +----------------+
| HeaderSize | r1 | HeaderSize |
0x08 +----------------+ +----------------+
| BufferSize | r2 | BufferSize |
0x0c +----------------+ +----------------+
| Buffer | r3 | Buffer |
0x10 +----------------+ 0x0 +================+
| SrcAddr | | SrcAddr |
0x14 +----------------+ 0x4 +----------------+
| DestAddr | | DestAddr |
0x18 +----------------+ 0xc +----------------+
| Protocol | | Protocol |
0x1c +----------------+ 0x8 +----------------+
: :
+----------------+ +----------------+
R7 | Return value | r0 | Return value |
+----------------+ +----------------+
--------------------------------------------------------------------------------
typedef
EFI_STATUS
(EFIAPI *EFI_TIMER_SET_TIMER_PERIOD)(
IN EFI_TIMER_ARCH_PROTOCOL *This,
IN UINT64 TimerPeriod
);
Executing on 64-bit (ok)
------------------------
EBC stack AArch64 registers
0x00 +----------------+ +----------------+
| This | x0 | This |
0x08 +----------------+ +----------------+
| TimerPeriod | x1 | TimerPeriod |
0x10 +----------------+ +----------------+
: :
+----------------+ +----------------+
R7 | Return value | x0 | Return value |
+----------------+ +----------------+
Executing on 32-bit (FAIL)
--------------------------
EBC stack AArch32 registers
0x00 +----------------+ +----------------+
| This | r0 | This |
0x04 +----------------+ +----------------+
| TimerPeriod | r1 | <padding> | <----------
| | +----------------+
0x0c +----------------+ r2 | TimerPeriod |
: r3 | |
: +----------------+
: :
+----------------+ +----------------+
R7 | Return value | r0 | Return value |
+----------------+ +----------------+
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 10:06 ` Ard Biesheuvel
@ 2016-09-22 10:13 ` Ard Biesheuvel
2016-09-22 11:05 ` Pete Batard
1 sibling, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-09-22 10:13 UTC (permalink / raw)
To: Pete Batard; +Cc: edk2-devel@lists.01.org
On 22 September 2016 at 11:06, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 22 September 2016 at 10:43, Pete Batard <pete@akeo.ie> wrote:
>> Hi,
>>
>> The following is an updated/fixed version of the patch(es), put forward by
>> Ard Biesheuvel on August 9 ([1], [2]), and re-submitted for formal
>> inclusion, so that the EDK2 can provide EBC functionality for all of IA32,
>> IA64, X64, AARCH64 and ARM at last.
>>
>> This updated patch now includes the necessary corollary dsc/fdf updates as
>> well as fixes to the assembly's EbcLLCALLEXNative, as I found the following
>> issues there:
>> - At least gcc5 didn't seem to like the manually optimized branching for all
>> register args ("sub r1, r1, r3, lsr #1"), and one can never be sure of the
>> actual size instructions will be assembled into, in case of assembler
>> internal alignment/optimization, so I broke it down into actual labelled
>> branches. There are only 4 of those anyway.
>> - For register + stack calls, while 8 x 64 bit registers on AARCH64 do
>> equate to #64 bytes that need to be taken off the stack, on ARM the 4 x 32
>> bit registers equate to #16 bytes, not #32
>> - Even after fixing the above, I found some issues with the manual stack
>> duplication assembly code, so I switched to using a call to CopyMem(), like
>> IA32 does.
>>
>> With these changes, I believe that the ARM/EBC feature should be fully
>> functional, especially as I have heavily tested multiparameter calls from
>> EBC into native, using an fasmg-based EBC assembler [3], to confirm that
>> they performed just as well with ARM as with AARCH64, IA32 or X64.
>>
>
> Hello Pete,
>
> Thanks a lot for this contribution. I had spotted (and fixed) some of
> the above issues as well.
>
> However, there is a fundamental issue with EBC on ARM that has not
> been addressed yet, which makes EBC support problematic:
> ARM uses natural alignment for 64-bit types, which means it leaves
> gaps in the stack frame, and the thunking code has no way of dealing
> with that.
>
> I am pasting my analysis below, which I sent out internally a couple
> of weeks ago. In summary, we need language spec and compiler updates
> before we can fully support this on 32-bit ARM.
>
BTW, the EDK2 tree has an EBC version of the FAT filesystem driver,
which is what I have been using to test EBC. I have a Frankenstein
version of the 32-bit ARM one (shared below) that deals with the
padding of known protocol methods that contain UINT64 arguments at odd
positions, but it is not pretty, and a clear example why the spec
needs to be updated to accommodate ARM
https://git.linaro.org/people/ard.biesheuvel/uefi-next.git/shortlog/refs/heads/ebc3
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 10:06 ` Ard Biesheuvel
2016-09-22 10:13 ` Ard Biesheuvel
@ 2016-09-22 11:05 ` Pete Batard
2016-09-22 11:14 ` Ard Biesheuvel
` (2 more replies)
1 sibling, 3 replies; 12+ messages in thread
From: Pete Batard @ 2016-09-22 11:05 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: edk2-devel@lists.01.org
On 2016.09.22 11:06, Ard Biesheuvel wrote:
> However, there is a fundamental issue with EBC on ARM that has not
> been addressed yet, which makes EBC support problematic:
> ARM uses natural alignment for 64-bit types, which means it leaves
> gaps in the stack frame, and the thunking code has no way of dealing
> with that.
I was hoping you would comment on this, as I believe the issue is larger
than Arm (which is why I thought the Arm patch could be integrated),
since I ran into something similar for X64 and AARCH64, with the
conversion from stack to register parameters.
Let me start by showing a real-life example of what the current EBC
implementation does, on different architectures, if you CALLEX from EBC
into a native function call such as:
VOID MultiParamNative(
UINT32,
UINT64,
UINT64,
UINT64,
UINT32,
UINT32,
UINT64
);
with values:
0x1C1C1C1C,
0x2B2B2B2B2A2A2A2A,
0x3B3B3B3B3A3A3A3A,
0x4B4B4B4B4A4A4A4A,
0x5C5C5C5C,
0x6C6C6C6C,
0x7B7B7B7B7A7A7A7A
If you do that, then the parameter values seen by each Arch will be as
follows:
IA32:
p1 = 0x1C1C1C1C
p2 = 0x2B2B2B2B2A2A2A2A
p3 = 0x3B3B3B3B3A3A3A3A
p4 = 0x4B4B4B4B4A4A4A4A
p5 = 0x5C5C5C5C
p6 = 0x6C6C6C6C
p7 = 0x7B7B7B7B7A7A7A7A
X64:
p1 = 0x1C1C1C1C
p2 = 0x3A3A3A3A2B2B2B2B
p3 = 0x4A4A4A4A3B3B3B3B
p4 = 0x5C5C5C5C4B4B4B4B
p5 = 0x6C6C6C6C
p6 = 0x7B7B7B7B
p7 = 0x06F23E4012345678
ARM:
p1 = 0x1C1C1C1C
p2 = 0x3A3A3A3A2B2B2B2B
p3 = 0x4A4A4A4A3B3B3B3B
p4 = 0x5C5C5C5C4B4B4B4B
p5 = 0x6C6C6C6C
p6 = 0x7A7A7A7A
p7 = 0x446EEC467B7B7B7B
AA64:
p1 = 0x1C1C1C1C
p2 = 0x3A3A3A3A2B2B2B2B
p3 = 0x4A4A4A4A3B3B3B3B
p4 = 0x5C5C5C5C4B4B4B4B
p5 = 0x6C6C6C6C
p6 = 0x7B7B7B7B
p7 = 0x00000000FFFFFF91
Note that these are real-life results gotten from a native set of
drivers [1] + EBC sample [2], specifically designed to test the above.
So, as you can see, only IA32 currently retrieves the parameters with
their expected values. All the other platforms, and not just Arm, have
an issue with parameter retrieval.
I too performed some analysis [3], to understand the problem, the result
of which can be summarized as follows:
Let's say you have native protocol function:
ProtocolCall(UINT32, UINT64, UINT64)
to which you want to pass values:
(0x1C1C1C1C, 0x2B2B2B2B2A2A2A2A, 0x3B3B3B3B3A3A3A3A)
With the EBC VM, the parameters then get stacked as (little endian,
CDECL and using 32-bit longwords to represent the stack):
+--------+
|1C1C1C1C|
+--------+
|2A2A2A2A|
+--------+
|2B2B2B2B|
+--------+
|3A3A3A3A|
+--------+
|3B3B3B3B|
+--------+
+????????+
+--------+
Now, if you are calling into an x86_32 arch, this is no issue, as the
native call reads the parameters off the stack, and finds each one it
its expected location.
But if, say, you are calling into native Arm, then the calling
convention dictates that the first four 32 bits parameters must be
placed into Arm registers r0-r3, rather than on the stack, and what's
more, that if there exist 64 bit parameters among the register ones,
they must start with an even register (r0 or r2).
What this means is that, with the current EBC handling, which simply
maps the top of the stack onto registers for native CALLEX (as the VM
cannot guess the parameter signature of the function it is calling into,
and therefore will not do anything else but a blind mapping of the stack
onto registers), the native Arm function effectively gets called with
the following parameter mapping:
+--------+
|1C1C1C1C| -> r0 (32-bit first parameter)
+--------+
|2A2A2A2A| -> (r1/unused, since first parameter is 32-bit)
+--------+
|2B2B2B2B| -> r2 (lower half of 64-bit second parameter)
+--------+
|3A3A3A3A| -> r3 (upper half of 64-bit second parameter)
+--------+
|3B3B3B3B| -> lower half of 64-bit third parameter (stack)
+--------+
+????????+ -> upper half of 64-bit third parameter (stack)
+--------+
The end result is that, the Arm call ends up with these values:
(0x1C1C1C1C, 0x3A3A3A3A2B2B2B2B, 0x????????3B3B3B3B)
However, while we used Arm for this example, this is not an Arm specific
issue, as x86_64 and Arm64 also expect any of the first eight parameters
to a native call, that are smaller than 64-bit, to get passed as a
64-bit register, which means they too have the same issue as the one
illustrated above.
Now, I'm not sure what the solution to that issue would be. I tend to
agree that, short of including a parameter signature for function calls,
this function argument marshalling issue between EBC and native will be
difficult to solve. A possible half-workaround I thought of could be to
keep track of all the PUSHes having been carried out before a CALLEX,
and *assume* (or mandate in the specs) that all the arguments were
pushed individually and that the size of the PUSH matches the desired
size for a register argument, but even that would still add a lot of
complexity and be prone to breakage...
The other solution of course is to ask EBC code developers to be aware
of and compensate for the calling convention issue, and pad the stack as
required depending on the ISA they are calling into, which is how I made
my protocol.asm sample work [4], but this is still rather inconvenient,
especially if not coding in EBC assembly, and defeats the point of
trying to write Arch independent code.
Considering this, I'm not entirely convinced ARM EBC integration should
be held back, as the problem we are faced with is part of a larger issue
that we'll need to resolve for all archs (except IA32), and not just ARM...
Regards,
/Pete
[1] https://github.com/pbatard/fasmg-ebc/tree/master/Protocol
[2] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm
[3] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L113-L177
[4] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L211-L219
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 11:05 ` Pete Batard
@ 2016-09-22 11:14 ` Ard Biesheuvel
2016-09-22 11:26 ` Pete Batard
2016-09-22 20:27 ` Andrew Fish
2016-09-22 21:24 ` Andrew Fish
2 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2016-09-22 11:14 UTC (permalink / raw)
To: Pete Batard; +Cc: edk2-devel@lists.01.org
On 22 September 2016 at 12:05, Pete Batard <pete@akeo.ie> wrote:
> On 2016.09.22 11:06, Ard Biesheuvel wrote:
>>
>> However, there is a fundamental issue with EBC on ARM that has not
>> been addressed yet, which makes EBC support problematic:
>> ARM uses natural alignment for 64-bit types, which means it leaves
>> gaps in the stack frame, and the thunking code has no way of dealing
>> with that.
>
>
> I was hoping you would comment on this, as I believe the issue is larger
> than Arm (which is why I thought the Arm patch could be integrated), since I
> ran into something similar for X64 and AARCH64, with the conversion from
> stack to register parameters.
>
> Let me start by showing a real-life example of what the current EBC
> implementation does, on different architectures, if you CALLEX from EBC into
> a native function call such as:
>
> VOID MultiParamNative(
> UINT32,
> UINT64,
> UINT64,
> UINT64,
> UINT32,
> UINT32,
> UINT64
> );
>
> with values:
>
> 0x1C1C1C1C,
> 0x2B2B2B2B2A2A2A2A,
> 0x3B3B3B3B3A3A3A3A,
> 0x4B4B4B4B4A4A4A4A,
> 0x5C5C5C5C,
> 0x6C6C6C6C,
> 0x7B7B7B7B7A7A7A7A
>
> If you do that, then the parameter values seen by each Arch will be as
> follows:
>
> IA32:
> p1 = 0x1C1C1C1C
> p2 = 0x2B2B2B2B2A2A2A2A
> p3 = 0x3B3B3B3B3A3A3A3A
> p4 = 0x4B4B4B4B4A4A4A4A
> p5 = 0x5C5C5C5C
> p6 = 0x6C6C6C6C
> p7 = 0x7B7B7B7B7A7A7A7A
>
> X64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x06F23E4012345678
>
> ARM:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7A7A7A7A
> p7 = 0x446EEC467B7B7B7B
>
> AA64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x00000000FFFFFF91
>
> Note that these are real-life results gotten from a native set of drivers
> [1] + EBC sample [2], specifically designed to test the above.
>
> So, as you can see, only IA32 currently retrieves the parameters with their
> expected values. All the other platforms, and not just Arm, have an issue
> with parameter retrieval.
>
> I too performed some analysis [3], to understand the problem, the result of
> which can be summarized as follows:
>
> Let's say you have native protocol function:
>
> ProtocolCall(UINT32, UINT64, UINT64)
>
> to which you want to pass values:
>
> (0x1C1C1C1C, 0x2B2B2B2B2A2A2A2A, 0x3B3B3B3B3A3A3A3A)
>
> With the EBC VM, the parameters then get stacked as (little endian, CDECL
> and using 32-bit longwords to represent the stack):
>
> +--------+
> |1C1C1C1C|
> +--------+
> |2A2A2A2A|
> +--------+
> |2B2B2B2B|
> +--------+
> |3A3A3A3A|
> +--------+
> |3B3B3B3B|
> +--------+
> +????????+
> +--------+
>
> Now, if you are calling into an x86_32 arch, this is no issue, as the native
> call reads the parameters off the stack, and finds each one it its expected
> location.
>
> But if, say, you are calling into native Arm, then the calling convention
> dictates that the first four 32 bits parameters must be placed into Arm
> registers r0-r3, rather than on the stack, and what's more, that if there
> exist 64 bit parameters among the register ones, they must start with an
> even register (r0 or r2).
>
> What this means is that, with the current EBC handling, which simply maps
> the top of the stack onto registers for native CALLEX (as the VM cannot
> guess the parameter signature of the function it is calling into, and
> therefore will not do anything else but a blind mapping of the stack onto
> registers), the native Arm function effectively gets called with the
> following parameter mapping:
>
> +--------+
> |1C1C1C1C| -> r0 (32-bit first parameter)
> +--------+
> |2A2A2A2A| -> (r1/unused, since first parameter is 32-bit)
> +--------+
> |2B2B2B2B| -> r2 (lower half of 64-bit second parameter)
> +--------+
> |3A3A3A3A| -> r3 (upper half of 64-bit second parameter)
> +--------+
> |3B3B3B3B| -> lower half of 64-bit third parameter (stack)
> +--------+
> +????????+ -> upper half of 64-bit third parameter (stack)
> +--------+
>
> The end result is that, the Arm call ends up with these values:
>
> (0x1C1C1C1C, 0x3A3A3A3A2B2B2B2B, 0x????????3B3B3B3B)
>
> However, while we used Arm for this example, this is not an Arm specific
> issue, as x86_64 and Arm64 also expect any of the first eight parameters to
> a native call, that are smaller than 64-bit, to get passed as a 64-bit
> register, which means they too have the same issue as the one illustrated
> above.
>
> Now, I'm not sure what the solution to that issue would be. I tend to agree
> that, short of including a parameter signature for function calls, this
> function argument marshalling issue between EBC and native will be difficult
> to solve. A possible half-workaround I thought of could be to keep track of
> all the PUSHes having been carried out before a CALLEX, and *assume* (or
> mandate in the specs) that all the arguments were pushed individually and
> that the size of the PUSH matches the desired size for a register argument,
> but even that would still add a lot of complexity and be prone to
> breakage...
>
> The other solution of course is to ask EBC code developers to be aware of
> and compensate for the calling convention issue, and pad the stack as
> required depending on the ISA they are calling into, which is how I made my
> protocol.asm sample work [4], but this is still rather inconvenient,
> especially if not coding in EBC assembly, and defeats the point of trying to
> write Arch independent code.
>
> Considering this, I'm not entirely convinced ARM EBC integration should be
> held back, as the problem we are faced with is part of a larger issue that
> we'll need to resolve for all archs (except IA32), and not just ARM...
>
Hello Peter,
For X64 and AARCH64, the issue does not exist because the EBC spec
mandates that all function arguments are widened to the native word
size. So when executing on a 64-bit architecture, the EBC stack looks
differently from what you describe above, and maps seamlessly onto the
register assignment mandated by the respective calling conventions.
--
Ard.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 11:14 ` Ard Biesheuvel
@ 2016-09-22 11:26 ` Pete Batard
2016-09-22 11:40 ` Ard Biesheuvel
0 siblings, 1 reply; 12+ messages in thread
From: Pete Batard @ 2016-09-22 11:26 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: edk2-devel@lists.01.org
On 2016.09.22 12:14, Ard Biesheuvel wrote:
> For X64 and AARCH64, the issue does not exist because the EBC spec
> mandates that all function arguments are widened to the native word
> size. So when executing on a 64-bit architecture, the EBC stack looks
> differently from what you describe above, and maps seamlessly onto the
> register assignment mandated by the respective calling conventions.
Ah, I see that you're right, and that I was trying to solve an issue
that shouldn't exist:
From UEFI 2.6, paragraph 21.9.3:
"32-bit integers are pushed as natural size (since they
should be passed as 64-bit parameter values on 64-bit machines)."
I must admit I was a bit curious as to why this problem wouldn't have
been picked before.
So that leaves only the issue you mentioned. But then I'm not too
hopeful with the timeframe for Arm/EBC integration when you say "we need
language spec and compiler updates before we can fully support this"...
Do you know if work has been started on this? Or are we just going to
consider that this is too troublesome a problem to fix?
Regards,
/Pete
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 11:26 ` Pete Batard
@ 2016-09-22 11:40 ` Ard Biesheuvel
0 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-09-22 11:40 UTC (permalink / raw)
To: Pete Batard; +Cc: edk2-devel@lists.01.org
On 22 September 2016 at 12:26, Pete Batard <pete@akeo.ie> wrote:
> On 2016.09.22 12:14, Ard Biesheuvel wrote:
>>
>> For X64 and AARCH64, the issue does not exist because the EBC spec
>> mandates that all function arguments are widened to the native word
>> size. So when executing on a 64-bit architecture, the EBC stack looks
>> differently from what you describe above, and maps seamlessly onto the
>> register assignment mandated by the respective calling conventions.
>
>
> Ah, I see that you're right, and that I was trying to solve an issue that
> shouldn't exist:
>
> From UEFI 2.6, paragraph 21.9.3:
>
> "32-bit integers are pushed as natural size (since they
> should be passed as 64-bit parameter values on 64-bit machines)."
>
> I must admit I was a bit curious as to why this problem wouldn't have been
> picked before.
>
>
> So that leaves only the issue you mentioned. But then I'm not too hopeful
> with the timeframe for Arm/EBC integration when you say "we need language
> spec and compiler updates before we can fully support this"...
>
> Do you know if work has been started on this? Or are we just going to
> consider that this is too troublesome a problem to fix?
>
We are debating this internally. Even on AArch64, there are numerous
issues with EBC drivers, even if the code itself executes fine (this
is mainly related to PCI drivers that don't bother to enable support
for 64-bit DMA since this is never needed on Intel platforms)
So while EBC is of high importance to many Linaro members, I am not
sure if that includes 32-bit ARM support.
--
Ard.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 11:05 ` Pete Batard
2016-09-22 11:14 ` Ard Biesheuvel
@ 2016-09-22 20:27 ` Andrew Fish
2016-09-22 21:21 ` Ard Biesheuvel
2016-09-22 21:40 ` Pete Batard
2016-09-22 21:24 ` Andrew Fish
2 siblings, 2 replies; 12+ messages in thread
From: Andrew Fish @ 2016-09-22 20:27 UTC (permalink / raw)
To: Pete Batard; +Cc: Ard Biesheuvel, edk2-devel@lists.01.org
> On Sep 22, 2016, at 4:05 AM, Pete Batard <pete@akeo.ie> wrote:
>
> On 2016.09.22 11:06, Ard Biesheuvel wrote:
>> However, there is a fundamental issue with EBC on ARM that has not
>> been addressed yet, which makes EBC support problematic:
>> ARM uses natural alignment for 64-bit types, which means it leaves
>> gaps in the stack frame, and the thunking code has no way of dealing
>> with that.
>
> I was hoping you would comment on this, as I believe the issue is larger than Arm (which is why I thought the Arm patch could be integrated), since I ran into something similar for X64 and AARCH64, with the conversion from stack to register parameters.
>
> Let me start by showing a real-life example of what the current EBC implementation does, on different architectures, if you CALLEX from EBC into a native function call such as:
>
> VOID MultiParamNative(
> UINT32,
> UINT64,
> UINT64,
> UINT64,
> UINT32,
> UINT32,
> UINT64
> );
>
> with values:
>
> 0x1C1C1C1C,
> 0x2B2B2B2B2A2A2A2A,
> 0x3B3B3B3B3A3A3A3A,
> 0x4B4B4B4B4A4A4A4A,
> 0x5C5C5C5C,
> 0x6C6C6C6C,
> 0x7B7B7B7B7A7A7A7A
>
> If you do that, then the parameter values seen by each Arch will be as follows:
>
> IA32:
> p1 = 0x1C1C1C1C
> p2 = 0x2B2B2B2B2A2A2A2A
> p3 = 0x3B3B3B3B3A3A3A3A
> p4 = 0x4B4B4B4B4A4A4A4A
> p5 = 0x5C5C5C5C
> p6 = 0x6C6C6C6C
> p7 = 0x7B7B7B7B7A7A7A7A
>
> X64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x06F23E4012345678
>
> ARM:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7A7A7A7A
> p7 = 0x446EEC467B7B7B7B
>
> AA64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x00000000FFFFFF91
>
> Note that these are real-life results gotten from a native set of drivers [1] + EBC sample [2], specifically designed to test the above.
>
> So, as you can see, only IA32 currently retrieves the parameters with their expected values. All the other platforms, and not just Arm, have an issue with parameter retrieval.
>
> I too performed some analysis [3], to understand the problem, the result of which can be summarized as follows:
>
> Let's say you have native protocol function:
>
> ProtocolCall(UINT32, UINT64, UINT64)
>
> to which you want to pass values:
>
> (0x1C1C1C1C, 0x2B2B2B2B2A2A2A2A, 0x3B3B3B3B3A3A3A3A)
>
> With the EBC VM, the parameters then get stacked as (little endian, CDECL and using 32-bit longwords to represent the stack):
>
> +--------+
> |1C1C1C1C|
> +--------+
> |2A2A2A2A|
> +--------+
> |2B2B2B2B|
> +--------+
> |3A3A3A3A|
> +--------+
> |3B3B3B3B|
> +--------+
> +????????+
> +--------+
>
> Now, if you are calling into an x86_32 arch, this is no issue, as the native call reads the parameters off the stack, and finds each one it its expected location.
>
> But if, say, you are calling into native Arm, then the calling convention dictates that the first four 32 bits parameters must be placed into Arm registers r0-r3, rather than on the stack, and what's more, that if there exist 64 bit parameters among the register ones, they must start with an even register (r0 or r2).
>
> What this means is that, with the current EBC handling, which simply maps the top of the stack onto registers for native CALLEX (as the VM cannot guess the parameter signature of the function it is calling into, and therefore will not do anything else but a blind mapping of the stack onto registers), the native Arm function effectively gets called with the following parameter mapping:
>
> +--------+
> |1C1C1C1C| -> r0 (32-bit first parameter)
> +--------+
> |2A2A2A2A| -> (r1/unused, since first parameter is 32-bit)
> +--------+
> |2B2B2B2B| -> r2 (lower half of 64-bit second parameter)
> +--------+
> |3A3A3A3A| -> r3 (upper half of 64-bit second parameter)
> +--------+
> |3B3B3B3B| -> lower half of 64-bit third parameter (stack)
> +--------+
> +????????+ -> upper half of 64-bit third parameter (stack)
> +--------+
>
> The end result is that, the Arm call ends up with these values:
>
> (0x1C1C1C1C, 0x3A3A3A3A2B2B2B2B, 0x????????3B3B3B3B)
>
> However, while we used Arm for this example, this is not an Arm specific issue, as x86_64 and Arm64 also expect any of the first eight parameters to a native call, that are smaller than 64-bit, to get passed as a 64-bit register, which means they too have the same issue as the one illustrated above.
>
> Now, I'm not sure what the solution to that issue would be. I tend to agree that, short of including a parameter signature for function calls, this function argument marshalling issue between EBC and native will be difficult to solve. A possible half-workaround I thought of could be to keep track of all the PUSHes having been carried out before a CALLEX, and *assume* (or mandate in the specs) that all the arguments were pushed individually and that the size of the PUSH matches the desired size for a register argument, but even that would still add a lot of complexity and be prone to breakage...
>
It seems like tracking the PUSHes would work? We could update the spec to require this behavior. But it brings up another issue in that you don't know how many PUSHes to go back? What is a calling convention PUSH vs. a state save push? Maybe we could also add a max number of args that are supported?
Thanks,
Andrew Fish
PS I'm guessing that from a clang/LLVM point of view enforcing the push rules in the code gen should not be hard if the bit code gets optimized and then is converted to EBC. As the bit code for the external code call will contain all the info EBC needs to know about. For example:
%7 = call i32 @MultiParamNative(i32 %4, i64 %5, i32 %6)
Example:
~/work/Compiler/llvm>cat callit.c
typedef unsigned int UINT32;
typedef unsigned long long UINT64;
int MultiParamNative(UINT32, UINT64, UINT32);
int Example (UINT32 A, UINT64 B, UINT32 C)
{
return MultiParamNative (A, B, C);
}
~/work/Compiler/llvm>clang -S -emit-llvm callit.c
~/work/Compiler/llvm>cat callit.ll
; ModuleID = 'callit.c'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.12.0"
; Function Attrs: nounwind ssp uwtable
define i32 @Example(i32 %A, i64 %B, i32 %C) #0 {
%1 = alloca i32, align 4
%2 = alloca i64, align 8
%3 = alloca i32, align 4
store i32 %A, i32* %1, align 4
store i64 %B, i64* %2, align 8
store i32 %C, i32* %3, align 4
%4 = load i32* %1, align 4
%5 = load i64* %2, align 8
%6 = load i32* %3, align 4
%7 = call i32 @MultiParamNative(i32 %4, i64 %5, i32 %6)
ret i32 %7
}
declare i32 @MultiParamNative(i32, i64, i32) #1
attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.ident = !{!0}
!0 = metadata !{metadata !"Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)"}
> The other solution of course is to ask EBC code developers to be aware of and compensate for the calling convention issue, and pad the stack as required depending on the ISA they are calling into, which is how I made my protocol.asm sample work [4], but this is still rather inconvenient, especially if not coding in EBC assembly, and defeats the point of trying to write Arch independent code.
>
> Considering this, I'm not entirely convinced ARM EBC integration should be held back, as the problem we are faced with is part of a larger issue that we'll need to resolve for all archs (except IA32), and not just ARM...
>
> Regards,
>
> /Pete
>
> [1] https://github.com/pbatard/fasmg-ebc/tree/master/Protocol
> [2] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm
> [3] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L113-L177
> [4] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L211-L219
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 20:27 ` Andrew Fish
@ 2016-09-22 21:21 ` Ard Biesheuvel
2016-09-22 21:40 ` Pete Batard
1 sibling, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-09-22 21:21 UTC (permalink / raw)
To: Andrew Fish; +Cc: Pete Batard, edk2-devel@lists.01.org
On 22 September 2016 at 21:27, Andrew Fish <afish@apple.com> wrote:
>
>> On Sep 22, 2016, at 4:05 AM, Pete Batard <pete@akeo.ie> wrote:
>>
>> On 2016.09.22 11:06, Ard Biesheuvel wrote:
>>>
>>> However, there is a fundamental issue with EBC on ARM that has not
>>> been addressed yet, which makes EBC support problematic:
>>> ARM uses natural alignment for 64-bit types, which means it leaves
>>> gaps in the stack frame, and the thunking code has no way of dealing
>>> with that.
[...]
>> Now, I'm not sure what the solution to that issue would be. I tend to agree
>> that, short of including a parameter signature for function calls, this
>> function argument marshalling issue between EBC and native will be difficult
>> to solve. A possible half-workaround I thought of could be to keep track of
>> all the PUSHes having been carried out before a CALLEX, and *assume* (or
>> mandate in the specs) that all the arguments were pushed individually and
>> that the size of the PUSH matches the desired size for a register argument,
>> but even that would still add a lot of complexity and be prone to
>> breakage...
>
> It seems like tracking the PUSHes would work? We could update the spec to
> require this behavior. But it brings up another issue in that you don't know
> how many PUSHes to go back? What is a calling convention PUSH vs. a state
> save push? Maybe we could also add a max number of args that are supported?
>
Forcing the compiler to emit every function call as a series of
in-order argument pushes may be overly restrictive, and only solves
half of the problem. Native to EBC calls suffer from the same issue,
and so we have to deal with this in the thunking layer.
The EBC code already issues BRK instructions (IIRC) to call into the
interpreter to generate the native to EBC thunks, so I think we should
extend that mechanism to publish prototype annotations for all entry
points subject to thunking (in either direction). The existing
implementations for x86 could ignore these entirely, or perhaps use
them to trim the size of the native to EBC stack copy (or vice versa).
ARM would require these annotations (or perhaps only for prototypes
where UINT64 arguments appear in odd positions)
--
Ard.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 11:05 ` Pete Batard
2016-09-22 11:14 ` Ard Biesheuvel
2016-09-22 20:27 ` Andrew Fish
@ 2016-09-22 21:24 ` Andrew Fish
2016-09-22 21:29 ` Ard Biesheuvel
2 siblings, 1 reply; 12+ messages in thread
From: Andrew Fish @ 2016-09-22 21:24 UTC (permalink / raw)
To: Pete Batard; +Cc: Ard Biesheuvel, edk2-devel@lists.01.org
> On Sep 22, 2016, at 4:05 AM, Pete Batard <pete@akeo.ie> wrote:
>
> On 2016.09.22 11:06, Ard Biesheuvel wrote:
>> However, there is a fundamental issue with EBC on ARM that has not
>> been addressed yet, which makes EBC support problematic:
>> ARM uses natural alignment for 64-bit types, which means it leaves
>> gaps in the stack frame, and the thunking code has no way of dealing
>> with that.
>
> I was hoping you would comment on this, as I believe the issue is larger than Arm (which is why I thought the Arm patch could be integrated), since I ran into something similar for X64 and AARCH64, with the conversion from stack to register parameters.
>
> Let me start by showing a real-life example of what the current EBC implementation does, on different architectures, if you CALLEX from EBC into a native function call such as:
>
> VOID MultiParamNative(
> UINT32,
> UINT64,
> UINT64,
> UINT64,
> UINT32,
> UINT32,
> UINT64
> );
>
Pete,
Stupid question. Does adding EFIAPI fix this issue? So:
VOID
EFIAPI
MultiParammNative(
UINT32,
UINT64,
UINT64,
UINT64,
UINT32,
UINT32,
UINT64
);
I'm sitting next to Mike Kinney at the UEFI Forum Plug Fest and we were wondering if the calling convention stuff was mucked up on C side.
Thanks,
Andrew Fish
> with values:
>
> 0x1C1C1C1C,
> 0x2B2B2B2B2A2A2A2A,
> 0x3B3B3B3B3A3A3A3A,
> 0x4B4B4B4B4A4A4A4A,
> 0x5C5C5C5C,
> 0x6C6C6C6C,
> 0x7B7B7B7B7A7A7A7A
>
> If you do that, then the parameter values seen by each Arch will be as follows:
>
> IA32:
> p1 = 0x1C1C1C1C
> p2 = 0x2B2B2B2B2A2A2A2A
> p3 = 0x3B3B3B3B3A3A3A3A
> p4 = 0x4B4B4B4B4A4A4A4A
> p5 = 0x5C5C5C5C
> p6 = 0x6C6C6C6C
> p7 = 0x7B7B7B7B7A7A7A7A
>
> X64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x06F23E4012345678
>
> ARM:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7A7A7A7A
> p7 = 0x446EEC467B7B7B7B
>
> AA64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x00000000FFFFFF91
>
> Note that these are real-life results gotten from a native set of drivers [1] + EBC sample [2], specifically designed to test the above.
>
> So, as you can see, only IA32 currently retrieves the parameters with their expected values. All the other platforms, and not just Arm, have an issue with parameter retrieval.
>
> I too performed some analysis [3], to understand the problem, the result of which can be summarized as follows:
>
> Let's say you have native protocol function:
>
> ProtocolCall(UINT32, UINT64, UINT64)
>
> to which you want to pass values:
>
> (0x1C1C1C1C, 0x2B2B2B2B2A2A2A2A, 0x3B3B3B3B3A3A3A3A)
>
> With the EBC VM, the parameters then get stacked as (little endian, CDECL and using 32-bit longwords to represent the stack):
>
> +--------+
> |1C1C1C1C|
> +--------+
> |2A2A2A2A|
> +--------+
> |2B2B2B2B|
> +--------+
> |3A3A3A3A|
> +--------+
> |3B3B3B3B|
> +--------+
> +????????+
> +--------+
>
> Now, if you are calling into an x86_32 arch, this is no issue, as the native call reads the parameters off the stack, and finds each one it its expected location.
>
> But if, say, you are calling into native Arm, then the calling convention dictates that the first four 32 bits parameters must be placed into Arm registers r0-r3, rather than on the stack, and what's more, that if there exist 64 bit parameters among the register ones, they must start with an even register (r0 or r2).
>
> What this means is that, with the current EBC handling, which simply maps the top of the stack onto registers for native CALLEX (as the VM cannot guess the parameter signature of the function it is calling into, and therefore will not do anything else but a blind mapping of the stack onto registers), the native Arm function effectively gets called with the following parameter mapping:
>
> +--------+
> |1C1C1C1C| -> r0 (32-bit first parameter)
> +--------+
> |2A2A2A2A| -> (r1/unused, since first parameter is 32-bit)
> +--------+
> |2B2B2B2B| -> r2 (lower half of 64-bit second parameter)
> +--------+
> |3A3A3A3A| -> r3 (upper half of 64-bit second parameter)
> +--------+
> |3B3B3B3B| -> lower half of 64-bit third parameter (stack)
> +--------+
> +????????+ -> upper half of 64-bit third parameter (stack)
> +--------+
>
> The end result is that, the Arm call ends up with these values:
>
> (0x1C1C1C1C, 0x3A3A3A3A2B2B2B2B, 0x????????3B3B3B3B)
>
> However, while we used Arm for this example, this is not an Arm specific issue, as x86_64 and Arm64 also expect any of the first eight parameters to a native call, that are smaller than 64-bit, to get passed as a 64-bit register, which means they too have the same issue as the one illustrated above.
>
> Now, I'm not sure what the solution to that issue would be. I tend to agree that, short of including a parameter signature for function calls, this function argument marshalling issue between EBC and native will be difficult to solve. A possible half-workaround I thought of could be to keep track of all the PUSHes having been carried out before a CALLEX, and *assume* (or mandate in the specs) that all the arguments were pushed individually and that the size of the PUSH matches the desired size for a register argument, but even that would still add a lot of complexity and be prone to breakage...
>
> The other solution of course is to ask EBC code developers to be aware of and compensate for the calling convention issue, and pad the stack as required depending on the ISA they are calling into, which is how I made my protocol.asm sample work [4], but this is still rather inconvenient, especially if not coding in EBC assembly, and defeats the point of trying to write Arch independent code.
>
> Considering this, I'm not entirely convinced ARM EBC integration should be held back, as the problem we are faced with is part of a larger issue that we'll need to resolve for all archs (except IA32), and not just ARM...
>
> Regards,
>
> /Pete
>
> [1] https://github.com/pbatard/fasmg-ebc/tree/master/Protocol
> [2] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm
> [3] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L113-L177
> [4] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L211-L219
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 21:24 ` Andrew Fish
@ 2016-09-22 21:29 ` Ard Biesheuvel
0 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-09-22 21:29 UTC (permalink / raw)
To: Andrew Fish; +Cc: Pete Batard, edk2-devel@lists.01.org
On 22 September 2016 at 22:24, Andrew Fish <afish@apple.com> wrote:
>
[..]
> Pete,
>
> Stupid question. Does adding EFIAPI fix this issue? So:
>
> VOID
> EFIAPI
> MultiParammNative(
> UINT32,
> UINT64,
> UINT64,
> UINT64,
> UINT32,
> UINT32,
> UINT64
> );
>
> I'm sitting next to Mike Kinney at the UEFI Forum Plug Fest and we were
> wondering if the calling convention stuff was mucked up on C side.
>
Pete already confirmed that his analysis regarding X64 and AARCH64 was
incorrect. The only architecture that is affected by this issue is
ARM, and purely due to the fact that its calling convention leaves
'holes' in the register/stack slot assignment to ensure UINT64 values
appear naturally aligned (i.e., the protoype above would not use r1,
but pass the second argument in r2 and r3)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 20:27 ` Andrew Fish
2016-09-22 21:21 ` Ard Biesheuvel
@ 2016-09-22 21:40 ` Pete Batard
1 sibling, 0 replies; 12+ messages in thread
From: Pete Batard @ 2016-09-22 21:40 UTC (permalink / raw)
To: Andrew Fish; +Cc: Ard Biesheuvel, edk2-devel@lists.01.org
Hi Andrew,
On 2016.09.22 21:27, Andrew Fish wrote:
> It seems like tracking the PUSHes would work?
First of all, as got pointed out by Ard, I need to mention that much of
my earlier analysis results, which you quoted, were due to a misread of
the specs, as it does mandates pushing 32 bit parameters as native.
So the output I gave for AARCH64 and X64 was erroneous, since it was a
result of using PUSH32, and the problematic output disappears when using
PUSHn.
The only issue therefore, is with 32 bit ARM.
> We could update the spec to require this behavior.
If the idea is to add the tracking when processing POP/PUSH on Arm
(which is how I originally saw it), please note that we may also have to
track stack manipulation with MOV against R0, the EBC stack pointer, if
someone issues something like 'MOV R0, R0(-1,0)' (equivalent to PUSHn)
or 'MOV R0, R0(0,-8)' (equivalent to PUSH64) for some 32 or 64 bit
optional parameter...
However, this brings us back to our original issue if there exists a
succession of optional 32 and 64 bit parameters, which the application
doesn't care about filling, and that may be handled with a single 'MOV
R0, R0(-1,-8)' in the assembly. If we are faced with this in our
tracking, then we still have no way of knowing which of the 32 or 64 bit
parameter comes first.
So, notwithstanding other issues, we'd have to mandate something very
specific against doing this in the UEFI/EBC specs (such as only using
individual PUSHes)...
Regards,
/Pete
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2016-09-22 21:40 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-09-22 9:43 [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support Pete Batard
2016-09-22 10:06 ` Ard Biesheuvel
2016-09-22 10:13 ` Ard Biesheuvel
2016-09-22 11:05 ` Pete Batard
2016-09-22 11:14 ` Ard Biesheuvel
2016-09-22 11:26 ` Pete Batard
2016-09-22 11:40 ` Ard Biesheuvel
2016-09-22 20:27 ` Andrew Fish
2016-09-22 21:21 ` Ard Biesheuvel
2016-09-22 21:40 ` Pete Batard
2016-09-22 21:24 ` Andrew Fish
2016-09-22 21:29 ` Ard Biesheuvel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox