* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 11:05 ` Pete Batard
@ 2016-09-22 11:14 ` Ard Biesheuvel
2016-09-22 11:26 ` Pete Batard
2016-09-22 20:27 ` Andrew Fish
2016-09-22 21:24 ` Andrew Fish
2 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2016-09-22 11:14 UTC (permalink / raw)
To: Pete Batard; +Cc: edk2-devel@lists.01.org
On 22 September 2016 at 12:05, Pete Batard <pete@akeo.ie> wrote:
> On 2016.09.22 11:06, Ard Biesheuvel wrote:
>>
>> However, there is a fundamental issue with EBC on ARM that has not
>> been addressed yet, which makes EBC support problematic:
>> ARM uses natural alignment for 64-bit types, which means it leaves
>> gaps in the stack frame, and the thunking code has no way of dealing
>> with that.
>
>
> I was hoping you would comment on this, as I believe the issue is larger
> than Arm (which is why I thought the Arm patch could be integrated), since I
> ran into something similar for X64 and AARCH64, with the conversion from
> stack to register parameters.
>
> Let me start by showing a real-life example of what the current EBC
> implementation does, on different architectures, if you CALLEX from EBC into
> a native function call such as:
>
> VOID MultiParamNative(
> UINT32,
> UINT64,
> UINT64,
> UINT64,
> UINT32,
> UINT32,
> UINT64
> );
>
> with values:
>
> 0x1C1C1C1C,
> 0x2B2B2B2B2A2A2A2A,
> 0x3B3B3B3B3A3A3A3A,
> 0x4B4B4B4B4A4A4A4A,
> 0x5C5C5C5C,
> 0x6C6C6C6C,
> 0x7B7B7B7B7A7A7A7A
>
> If you do that, then the parameter values seen by each Arch will be as
> follows:
>
> IA32:
> p1 = 0x1C1C1C1C
> p2 = 0x2B2B2B2B2A2A2A2A
> p3 = 0x3B3B3B3B3A3A3A3A
> p4 = 0x4B4B4B4B4A4A4A4A
> p5 = 0x5C5C5C5C
> p6 = 0x6C6C6C6C
> p7 = 0x7B7B7B7B7A7A7A7A
>
> X64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x06F23E4012345678
>
> ARM:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7A7A7A7A
> p7 = 0x446EEC467B7B7B7B
>
> AA64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x00000000FFFFFF91
>
> Note that these are real-life results gotten from a native set of drivers
> [1] + EBC sample [2], specifically designed to test the above.
>
> So, as you can see, only IA32 currently retrieves the parameters with their
> expected values. All the other platforms, and not just Arm, have an issue
> with parameter retrieval.
>
> I too performed some analysis [3], to understand the problem, the result of
> which can be summarized as follows:
>
> Let's say you have native protocol function:
>
> ProtocolCall(UINT32, UINT64, UINT64)
>
> to which you want to pass values:
>
> (0x1C1C1C1C, 0x2B2B2B2B2A2A2A2A, 0x3B3B3B3B3A3A3A3A)
>
> With the EBC VM, the parameters then get stacked as (little endian, CDECL
> and using 32-bit longwords to represent the stack):
>
> +--------+
> |1C1C1C1C|
> +--------+
> |2A2A2A2A|
> +--------+
> |2B2B2B2B|
> +--------+
> |3A3A3A3A|
> +--------+
> |3B3B3B3B|
> +--------+
> +????????+
> +--------+
>
> Now, if you are calling into an x86_32 arch, this is no issue, as the native
> call reads the parameters off the stack, and finds each one it its expected
> location.
>
> But if, say, you are calling into native Arm, then the calling convention
> dictates that the first four 32 bits parameters must be placed into Arm
> registers r0-r3, rather than on the stack, and what's more, that if there
> exist 64 bit parameters among the register ones, they must start with an
> even register (r0 or r2).
>
> What this means is that, with the current EBC handling, which simply maps
> the top of the stack onto registers for native CALLEX (as the VM cannot
> guess the parameter signature of the function it is calling into, and
> therefore will not do anything else but a blind mapping of the stack onto
> registers), the native Arm function effectively gets called with the
> following parameter mapping:
>
> +--------+
> |1C1C1C1C| -> r0 (32-bit first parameter)
> +--------+
> |2A2A2A2A| -> (r1/unused, since first parameter is 32-bit)
> +--------+
> |2B2B2B2B| -> r2 (lower half of 64-bit second parameter)
> +--------+
> |3A3A3A3A| -> r3 (upper half of 64-bit second parameter)
> +--------+
> |3B3B3B3B| -> lower half of 64-bit third parameter (stack)
> +--------+
> +????????+ -> upper half of 64-bit third parameter (stack)
> +--------+
>
> The end result is that, the Arm call ends up with these values:
>
> (0x1C1C1C1C, 0x3A3A3A3A2B2B2B2B, 0x????????3B3B3B3B)
>
> However, while we used Arm for this example, this is not an Arm specific
> issue, as x86_64 and Arm64 also expect any of the first eight parameters to
> a native call, that are smaller than 64-bit, to get passed as a 64-bit
> register, which means they too have the same issue as the one illustrated
> above.
>
> Now, I'm not sure what the solution to that issue would be. I tend to agree
> that, short of including a parameter signature for function calls, this
> function argument marshalling issue between EBC and native will be difficult
> to solve. A possible half-workaround I thought of could be to keep track of
> all the PUSHes having been carried out before a CALLEX, and *assume* (or
> mandate in the specs) that all the arguments were pushed individually and
> that the size of the PUSH matches the desired size for a register argument,
> but even that would still add a lot of complexity and be prone to
> breakage...
>
> The other solution of course is to ask EBC code developers to be aware of
> and compensate for the calling convention issue, and pad the stack as
> required depending on the ISA they are calling into, which is how I made my
> protocol.asm sample work [4], but this is still rather inconvenient,
> especially if not coding in EBC assembly, and defeats the point of trying to
> write Arch independent code.
>
> Considering this, I'm not entirely convinced ARM EBC integration should be
> held back, as the problem we are faced with is part of a larger issue that
> we'll need to resolve for all archs (except IA32), and not just ARM...
>
Hello Peter,
For X64 and AARCH64, the issue does not exist because the EBC spec
mandates that all function arguments are widened to the native word
size. So when executing on a 64-bit architecture, the EBC stack looks
differently from what you describe above, and maps seamlessly onto the
register assignment mandated by the respective calling conventions.
--
Ard.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 11:14 ` Ard Biesheuvel
@ 2016-09-22 11:26 ` Pete Batard
2016-09-22 11:40 ` Ard Biesheuvel
0 siblings, 1 reply; 12+ messages in thread
From: Pete Batard @ 2016-09-22 11:26 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: edk2-devel@lists.01.org
On 2016.09.22 12:14, Ard Biesheuvel wrote:
> For X64 and AARCH64, the issue does not exist because the EBC spec
> mandates that all function arguments are widened to the native word
> size. So when executing on a 64-bit architecture, the EBC stack looks
> differently from what you describe above, and maps seamlessly onto the
> register assignment mandated by the respective calling conventions.
Ah, I see that you're right, and that I was trying to solve an issue
that shouldn't exist:
From UEFI 2.6, paragraph 21.9.3:
"32-bit integers are pushed as natural size (since they
should be passed as 64-bit parameter values on 64-bit machines)."
I must admit I was a bit curious as to why this problem wouldn't have
been picked before.
So that leaves only the issue you mentioned. But then I'm not too
hopeful with the timeframe for Arm/EBC integration when you say "we need
language spec and compiler updates before we can fully support this"...
Do you know if work has been started on this? Or are we just going to
consider that this is too troublesome a problem to fix?
Regards,
/Pete
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 11:26 ` Pete Batard
@ 2016-09-22 11:40 ` Ard Biesheuvel
0 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-09-22 11:40 UTC (permalink / raw)
To: Pete Batard; +Cc: edk2-devel@lists.01.org
On 22 September 2016 at 12:26, Pete Batard <pete@akeo.ie> wrote:
> On 2016.09.22 12:14, Ard Biesheuvel wrote:
>>
>> For X64 and AARCH64, the issue does not exist because the EBC spec
>> mandates that all function arguments are widened to the native word
>> size. So when executing on a 64-bit architecture, the EBC stack looks
>> differently from what you describe above, and maps seamlessly onto the
>> register assignment mandated by the respective calling conventions.
>
>
> Ah, I see that you're right, and that I was trying to solve an issue that
> shouldn't exist:
>
> From UEFI 2.6, paragraph 21.9.3:
>
> "32-bit integers are pushed as natural size (since they
> should be passed as 64-bit parameter values on 64-bit machines)."
>
> I must admit I was a bit curious as to why this problem wouldn't have been
> picked before.
>
>
> So that leaves only the issue you mentioned. But then I'm not too hopeful
> with the timeframe for Arm/EBC integration when you say "we need language
> spec and compiler updates before we can fully support this"...
>
> Do you know if work has been started on this? Or are we just going to
> consider that this is too troublesome a problem to fix?
>
We are debating this internally. Even on AArch64, there are numerous
issues with EBC drivers, even if the code itself executes fine (this
is mainly related to PCI drivers that don't bother to enable support
for 64-bit DMA since this is never needed on Intel platforms)
So while EBC is of high importance to many Linaro members, I am not
sure if that includes 32-bit ARM support.
--
Ard.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 11:05 ` Pete Batard
2016-09-22 11:14 ` Ard Biesheuvel
@ 2016-09-22 20:27 ` Andrew Fish
2016-09-22 21:21 ` Ard Biesheuvel
2016-09-22 21:40 ` Pete Batard
2016-09-22 21:24 ` Andrew Fish
2 siblings, 2 replies; 12+ messages in thread
From: Andrew Fish @ 2016-09-22 20:27 UTC (permalink / raw)
To: Pete Batard; +Cc: Ard Biesheuvel, edk2-devel@lists.01.org
> On Sep 22, 2016, at 4:05 AM, Pete Batard <pete@akeo.ie> wrote:
>
> On 2016.09.22 11:06, Ard Biesheuvel wrote:
>> However, there is a fundamental issue with EBC on ARM that has not
>> been addressed yet, which makes EBC support problematic:
>> ARM uses natural alignment for 64-bit types, which means it leaves
>> gaps in the stack frame, and the thunking code has no way of dealing
>> with that.
>
> I was hoping you would comment on this, as I believe the issue is larger than Arm (which is why I thought the Arm patch could be integrated), since I ran into something similar for X64 and AARCH64, with the conversion from stack to register parameters.
>
> Let me start by showing a real-life example of what the current EBC implementation does, on different architectures, if you CALLEX from EBC into a native function call such as:
>
> VOID MultiParamNative(
> UINT32,
> UINT64,
> UINT64,
> UINT64,
> UINT32,
> UINT32,
> UINT64
> );
>
> with values:
>
> 0x1C1C1C1C,
> 0x2B2B2B2B2A2A2A2A,
> 0x3B3B3B3B3A3A3A3A,
> 0x4B4B4B4B4A4A4A4A,
> 0x5C5C5C5C,
> 0x6C6C6C6C,
> 0x7B7B7B7B7A7A7A7A
>
> If you do that, then the parameter values seen by each Arch will be as follows:
>
> IA32:
> p1 = 0x1C1C1C1C
> p2 = 0x2B2B2B2B2A2A2A2A
> p3 = 0x3B3B3B3B3A3A3A3A
> p4 = 0x4B4B4B4B4A4A4A4A
> p5 = 0x5C5C5C5C
> p6 = 0x6C6C6C6C
> p7 = 0x7B7B7B7B7A7A7A7A
>
> X64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x06F23E4012345678
>
> ARM:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7A7A7A7A
> p7 = 0x446EEC467B7B7B7B
>
> AA64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x00000000FFFFFF91
>
> Note that these are real-life results gotten from a native set of drivers [1] + EBC sample [2], specifically designed to test the above.
>
> So, as you can see, only IA32 currently retrieves the parameters with their expected values. All the other platforms, and not just Arm, have an issue with parameter retrieval.
>
> I too performed some analysis [3], to understand the problem, the result of which can be summarized as follows:
>
> Let's say you have native protocol function:
>
> ProtocolCall(UINT32, UINT64, UINT64)
>
> to which you want to pass values:
>
> (0x1C1C1C1C, 0x2B2B2B2B2A2A2A2A, 0x3B3B3B3B3A3A3A3A)
>
> With the EBC VM, the parameters then get stacked as (little endian, CDECL and using 32-bit longwords to represent the stack):
>
> +--------+
> |1C1C1C1C|
> +--------+
> |2A2A2A2A|
> +--------+
> |2B2B2B2B|
> +--------+
> |3A3A3A3A|
> +--------+
> |3B3B3B3B|
> +--------+
> +????????+
> +--------+
>
> Now, if you are calling into an x86_32 arch, this is no issue, as the native call reads the parameters off the stack, and finds each one it its expected location.
>
> But if, say, you are calling into native Arm, then the calling convention dictates that the first four 32 bits parameters must be placed into Arm registers r0-r3, rather than on the stack, and what's more, that if there exist 64 bit parameters among the register ones, they must start with an even register (r0 or r2).
>
> What this means is that, with the current EBC handling, which simply maps the top of the stack onto registers for native CALLEX (as the VM cannot guess the parameter signature of the function it is calling into, and therefore will not do anything else but a blind mapping of the stack onto registers), the native Arm function effectively gets called with the following parameter mapping:
>
> +--------+
> |1C1C1C1C| -> r0 (32-bit first parameter)
> +--------+
> |2A2A2A2A| -> (r1/unused, since first parameter is 32-bit)
> +--------+
> |2B2B2B2B| -> r2 (lower half of 64-bit second parameter)
> +--------+
> |3A3A3A3A| -> r3 (upper half of 64-bit second parameter)
> +--------+
> |3B3B3B3B| -> lower half of 64-bit third parameter (stack)
> +--------+
> +????????+ -> upper half of 64-bit third parameter (stack)
> +--------+
>
> The end result is that, the Arm call ends up with these values:
>
> (0x1C1C1C1C, 0x3A3A3A3A2B2B2B2B, 0x????????3B3B3B3B)
>
> However, while we used Arm for this example, this is not an Arm specific issue, as x86_64 and Arm64 also expect any of the first eight parameters to a native call, that are smaller than 64-bit, to get passed as a 64-bit register, which means they too have the same issue as the one illustrated above.
>
> Now, I'm not sure what the solution to that issue would be. I tend to agree that, short of including a parameter signature for function calls, this function argument marshalling issue between EBC and native will be difficult to solve. A possible half-workaround I thought of could be to keep track of all the PUSHes having been carried out before a CALLEX, and *assume* (or mandate in the specs) that all the arguments were pushed individually and that the size of the PUSH matches the desired size for a register argument, but even that would still add a lot of complexity and be prone to breakage...
>
It seems like tracking the PUSHes would work? We could update the spec to require this behavior. But it brings up another issue in that you don't know how many PUSHes to go back? What is a calling convention PUSH vs. a state save push? Maybe we could also add a max number of args that are supported?
Thanks,
Andrew Fish
PS I'm guessing that from a clang/LLVM point of view enforcing the push rules in the code gen should not be hard if the bit code gets optimized and then is converted to EBC. As the bit code for the external code call will contain all the info EBC needs to know about. For example:
%7 = call i32 @MultiParamNative(i32 %4, i64 %5, i32 %6)
Example:
~/work/Compiler/llvm>cat callit.c
typedef unsigned int UINT32;
typedef unsigned long long UINT64;
int MultiParamNative(UINT32, UINT64, UINT32);
int Example (UINT32 A, UINT64 B, UINT32 C)
{
return MultiParamNative (A, B, C);
}
~/work/Compiler/llvm>clang -S -emit-llvm callit.c
~/work/Compiler/llvm>cat callit.ll
; ModuleID = 'callit.c'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.12.0"
; Function Attrs: nounwind ssp uwtable
define i32 @Example(i32 %A, i64 %B, i32 %C) #0 {
%1 = alloca i32, align 4
%2 = alloca i64, align 8
%3 = alloca i32, align 4
store i32 %A, i32* %1, align 4
store i64 %B, i64* %2, align 8
store i32 %C, i32* %3, align 4
%4 = load i32* %1, align 4
%5 = load i64* %2, align 8
%6 = load i32* %3, align 4
%7 = call i32 @MultiParamNative(i32 %4, i64 %5, i32 %6)
ret i32 %7
}
declare i32 @MultiParamNative(i32, i64, i32) #1
attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.ident = !{!0}
!0 = metadata !{metadata !"Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)"}
> The other solution of course is to ask EBC code developers to be aware of and compensate for the calling convention issue, and pad the stack as required depending on the ISA they are calling into, which is how I made my protocol.asm sample work [4], but this is still rather inconvenient, especially if not coding in EBC assembly, and defeats the point of trying to write Arch independent code.
>
> Considering this, I'm not entirely convinced ARM EBC integration should be held back, as the problem we are faced with is part of a larger issue that we'll need to resolve for all archs (except IA32), and not just ARM...
>
> Regards,
>
> /Pete
>
> [1] https://github.com/pbatard/fasmg-ebc/tree/master/Protocol
> [2] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm
> [3] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L113-L177
> [4] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L211-L219
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 20:27 ` Andrew Fish
@ 2016-09-22 21:21 ` Ard Biesheuvel
2016-09-22 21:40 ` Pete Batard
1 sibling, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-09-22 21:21 UTC (permalink / raw)
To: Andrew Fish; +Cc: Pete Batard, edk2-devel@lists.01.org
On 22 September 2016 at 21:27, Andrew Fish <afish@apple.com> wrote:
>
>> On Sep 22, 2016, at 4:05 AM, Pete Batard <pete@akeo.ie> wrote:
>>
>> On 2016.09.22 11:06, Ard Biesheuvel wrote:
>>>
>>> However, there is a fundamental issue with EBC on ARM that has not
>>> been addressed yet, which makes EBC support problematic:
>>> ARM uses natural alignment for 64-bit types, which means it leaves
>>> gaps in the stack frame, and the thunking code has no way of dealing
>>> with that.
[...]
>> Now, I'm not sure what the solution to that issue would be. I tend to agree
>> that, short of including a parameter signature for function calls, this
>> function argument marshalling issue between EBC and native will be difficult
>> to solve. A possible half-workaround I thought of could be to keep track of
>> all the PUSHes having been carried out before a CALLEX, and *assume* (or
>> mandate in the specs) that all the arguments were pushed individually and
>> that the size of the PUSH matches the desired size for a register argument,
>> but even that would still add a lot of complexity and be prone to
>> breakage...
>
> It seems like tracking the PUSHes would work? We could update the spec to
> require this behavior. But it brings up another issue in that you don't know
> how many PUSHes to go back? What is a calling convention PUSH vs. a state
> save push? Maybe we could also add a max number of args that are supported?
>
Forcing the compiler to emit every function call as a series of
in-order argument pushes may be overly restrictive, and only solves
half of the problem. Native to EBC calls suffer from the same issue,
and so we have to deal with this in the thunking layer.
The EBC code already issues BRK instructions (IIRC) to call into the
interpreter to generate the native to EBC thunks, so I think we should
extend that mechanism to publish prototype annotations for all entry
points subject to thunking (in either direction). The existing
implementations for x86 could ignore these entirely, or perhaps use
them to trim the size of the native to EBC stack copy (or vice versa).
ARM would require these annotations (or perhaps only for prototypes
where UINT64 arguments appear in odd positions)
--
Ard.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 20:27 ` Andrew Fish
2016-09-22 21:21 ` Ard Biesheuvel
@ 2016-09-22 21:40 ` Pete Batard
1 sibling, 0 replies; 12+ messages in thread
From: Pete Batard @ 2016-09-22 21:40 UTC (permalink / raw)
To: Andrew Fish; +Cc: Ard Biesheuvel, edk2-devel@lists.01.org
Hi Andrew,
On 2016.09.22 21:27, Andrew Fish wrote:
> It seems like tracking the PUSHes would work?
First of all, as got pointed out by Ard, I need to mention that much of
my earlier analysis results, which you quoted, were due to a misread of
the specs, as it does mandates pushing 32 bit parameters as native.
So the output I gave for AARCH64 and X64 was erroneous, since it was a
result of using PUSH32, and the problematic output disappears when using
PUSHn.
The only issue therefore, is with 32 bit ARM.
> We could update the spec to require this behavior.
If the idea is to add the tracking when processing POP/PUSH on Arm
(which is how I originally saw it), please note that we may also have to
track stack manipulation with MOV against R0, the EBC stack pointer, if
someone issues something like 'MOV R0, R0(-1,0)' (equivalent to PUSHn)
or 'MOV R0, R0(0,-8)' (equivalent to PUSH64) for some 32 or 64 bit
optional parameter...
However, this brings us back to our original issue if there exists a
succession of optional 32 and 64 bit parameters, which the application
doesn't care about filling, and that may be handled with a single 'MOV
R0, R0(-1,-8)' in the assembly. If we are faced with this in our
tracking, then we still have no way of knowing which of the 32 or 64 bit
parameter comes first.
So, notwithstanding other issues, we'd have to mandate something very
specific against doing this in the UEFI/EBC specs (such as only using
individual PUSHes)...
Regards,
/Pete
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 11:05 ` Pete Batard
2016-09-22 11:14 ` Ard Biesheuvel
2016-09-22 20:27 ` Andrew Fish
@ 2016-09-22 21:24 ` Andrew Fish
2016-09-22 21:29 ` Ard Biesheuvel
2 siblings, 1 reply; 12+ messages in thread
From: Andrew Fish @ 2016-09-22 21:24 UTC (permalink / raw)
To: Pete Batard; +Cc: Ard Biesheuvel, edk2-devel@lists.01.org
> On Sep 22, 2016, at 4:05 AM, Pete Batard <pete@akeo.ie> wrote:
>
> On 2016.09.22 11:06, Ard Biesheuvel wrote:
>> However, there is a fundamental issue with EBC on ARM that has not
>> been addressed yet, which makes EBC support problematic:
>> ARM uses natural alignment for 64-bit types, which means it leaves
>> gaps in the stack frame, and the thunking code has no way of dealing
>> with that.
>
> I was hoping you would comment on this, as I believe the issue is larger than Arm (which is why I thought the Arm patch could be integrated), since I ran into something similar for X64 and AARCH64, with the conversion from stack to register parameters.
>
> Let me start by showing a real-life example of what the current EBC implementation does, on different architectures, if you CALLEX from EBC into a native function call such as:
>
> VOID MultiParamNative(
> UINT32,
> UINT64,
> UINT64,
> UINT64,
> UINT32,
> UINT32,
> UINT64
> );
>
Pete,
Stupid question. Does adding EFIAPI fix this issue? So:
VOID
EFIAPI
MultiParammNative(
UINT32,
UINT64,
UINT64,
UINT64,
UINT32,
UINT32,
UINT64
);
I'm sitting next to Mike Kinney at the UEFI Forum Plug Fest and we were wondering if the calling convention stuff was mucked up on C side.
Thanks,
Andrew Fish
> with values:
>
> 0x1C1C1C1C,
> 0x2B2B2B2B2A2A2A2A,
> 0x3B3B3B3B3A3A3A3A,
> 0x4B4B4B4B4A4A4A4A,
> 0x5C5C5C5C,
> 0x6C6C6C6C,
> 0x7B7B7B7B7A7A7A7A
>
> If you do that, then the parameter values seen by each Arch will be as follows:
>
> IA32:
> p1 = 0x1C1C1C1C
> p2 = 0x2B2B2B2B2A2A2A2A
> p3 = 0x3B3B3B3B3A3A3A3A
> p4 = 0x4B4B4B4B4A4A4A4A
> p5 = 0x5C5C5C5C
> p6 = 0x6C6C6C6C
> p7 = 0x7B7B7B7B7A7A7A7A
>
> X64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x06F23E4012345678
>
> ARM:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7A7A7A7A
> p7 = 0x446EEC467B7B7B7B
>
> AA64:
> p1 = 0x1C1C1C1C
> p2 = 0x3A3A3A3A2B2B2B2B
> p3 = 0x4A4A4A4A3B3B3B3B
> p4 = 0x5C5C5C5C4B4B4B4B
> p5 = 0x6C6C6C6C
> p6 = 0x7B7B7B7B
> p7 = 0x00000000FFFFFF91
>
> Note that these are real-life results gotten from a native set of drivers [1] + EBC sample [2], specifically designed to test the above.
>
> So, as you can see, only IA32 currently retrieves the parameters with their expected values. All the other platforms, and not just Arm, have an issue with parameter retrieval.
>
> I too performed some analysis [3], to understand the problem, the result of which can be summarized as follows:
>
> Let's say you have native protocol function:
>
> ProtocolCall(UINT32, UINT64, UINT64)
>
> to which you want to pass values:
>
> (0x1C1C1C1C, 0x2B2B2B2B2A2A2A2A, 0x3B3B3B3B3A3A3A3A)
>
> With the EBC VM, the parameters then get stacked as (little endian, CDECL and using 32-bit longwords to represent the stack):
>
> +--------+
> |1C1C1C1C|
> +--------+
> |2A2A2A2A|
> +--------+
> |2B2B2B2B|
> +--------+
> |3A3A3A3A|
> +--------+
> |3B3B3B3B|
> +--------+
> +????????+
> +--------+
>
> Now, if you are calling into an x86_32 arch, this is no issue, as the native call reads the parameters off the stack, and finds each one it its expected location.
>
> But if, say, you are calling into native Arm, then the calling convention dictates that the first four 32 bits parameters must be placed into Arm registers r0-r3, rather than on the stack, and what's more, that if there exist 64 bit parameters among the register ones, they must start with an even register (r0 or r2).
>
> What this means is that, with the current EBC handling, which simply maps the top of the stack onto registers for native CALLEX (as the VM cannot guess the parameter signature of the function it is calling into, and therefore will not do anything else but a blind mapping of the stack onto registers), the native Arm function effectively gets called with the following parameter mapping:
>
> +--------+
> |1C1C1C1C| -> r0 (32-bit first parameter)
> +--------+
> |2A2A2A2A| -> (r1/unused, since first parameter is 32-bit)
> +--------+
> |2B2B2B2B| -> r2 (lower half of 64-bit second parameter)
> +--------+
> |3A3A3A3A| -> r3 (upper half of 64-bit second parameter)
> +--------+
> |3B3B3B3B| -> lower half of 64-bit third parameter (stack)
> +--------+
> +????????+ -> upper half of 64-bit third parameter (stack)
> +--------+
>
> The end result is that, the Arm call ends up with these values:
>
> (0x1C1C1C1C, 0x3A3A3A3A2B2B2B2B, 0x????????3B3B3B3B)
>
> However, while we used Arm for this example, this is not an Arm specific issue, as x86_64 and Arm64 also expect any of the first eight parameters to a native call, that are smaller than 64-bit, to get passed as a 64-bit register, which means they too have the same issue as the one illustrated above.
>
> Now, I'm not sure what the solution to that issue would be. I tend to agree that, short of including a parameter signature for function calls, this function argument marshalling issue between EBC and native will be difficult to solve. A possible half-workaround I thought of could be to keep track of all the PUSHes having been carried out before a CALLEX, and *assume* (or mandate in the specs) that all the arguments were pushed individually and that the size of the PUSH matches the desired size for a register argument, but even that would still add a lot of complexity and be prone to breakage...
>
> The other solution of course is to ask EBC code developers to be aware of and compensate for the calling convention issue, and pad the stack as required depending on the ISA they are calling into, which is how I made my protocol.asm sample work [4], but this is still rather inconvenient, especially if not coding in EBC assembly, and defeats the point of trying to write Arch independent code.
>
> Considering this, I'm not entirely convinced ARM EBC integration should be held back, as the problem we are faced with is part of a larger issue that we'll need to resolve for all archs (except IA32), and not just ARM...
>
> Regards,
>
> /Pete
>
> [1] https://github.com/pbatard/fasmg-ebc/tree/master/Protocol
> [2] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm
> [3] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L113-L177
> [4] https://github.com/pbatard/fasmg-ebc/blob/master/protocol.asm#L211-L219
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/1] MdeModulePkg/EbcDxe: add ARM support
2016-09-22 21:24 ` Andrew Fish
@ 2016-09-22 21:29 ` Ard Biesheuvel
0 siblings, 0 replies; 12+ messages in thread
From: Ard Biesheuvel @ 2016-09-22 21:29 UTC (permalink / raw)
To: Andrew Fish; +Cc: Pete Batard, edk2-devel@lists.01.org
On 22 September 2016 at 22:24, Andrew Fish <afish@apple.com> wrote:
>
[..]
> Pete,
>
> Stupid question. Does adding EFIAPI fix this issue? So:
>
> VOID
> EFIAPI
> MultiParammNative(
> UINT32,
> UINT64,
> UINT64,
> UINT64,
> UINT32,
> UINT32,
> UINT64
> );
>
> I'm sitting next to Mike Kinney at the UEFI Forum Plug Fest and we were
> wondering if the calling convention stuff was mucked up on C side.
>
Pete already confirmed that his analysis regarding X64 and AARCH64 was
incorrect. The only architecture that is affected by this issue is
ARM, and purely due to the fact that its calling convention leaves
'holes' in the register/stack slot assignment to ensure UINT64 values
appear naturally aligned (i.e., the protoype above would not use r1,
but pass the second argument in r2 and r3)
^ permalink raw reply [flat|nested] 12+ messages in thread