From: Laszlo Ersek <lersek@redhat.com>
To: Ruiyu Ni <ruiyu.ni@intel.com>, edk2-devel@lists.01.org
Cc: Michael D Kinney <michael.d.kinney@intel.com>,
Jiewen Yao <jiewen.yao@intel.com>,
Liming Gao <liming.gao@intel.com>
Subject: Re: [PATCH v3] MdePkg/SynchronizationLib: fix Interlocked[De|In]crement return value
Date: Tue, 25 Sep 2018 20:29:10 +0200 [thread overview]
Message-ID: <9a0e968d-f6e4-18ad-f869-60fc88806735@redhat.com> (raw)
In-Reply-To: <f45090cf-91f7-e2a9-9139-cf22df6b6c7c@redhat.com>
On 09/25/18 18:18, Laszlo Ersek wrote:
> On 09/25/18 16:22, Laszlo Ersek wrote:
>> On 09/13/18 11:50, Ruiyu Ni wrote:
>>
>>> diff --git a/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c b/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
>>> index 5224dd063f..4c4d6e3fc7 100644
>>> --- a/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
>>> +++ b/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
>>> @@ -15,14 +15,12 @@
>>>
>>>
>>>
>>> -
>>> /**
>>> Performs an atomic increment of an 32-bit unsigned integer.
>>>
>>> Performs an atomic increment of the 32-bit unsigned integer specified by
>>> Value and returns the incremented value. The increment operation must be
>>> - performed using MP safe mechanisms. The state of the return value is not
>>> - guaranteed to be MP safe.
>>> + performed using MP safe mechanisms.
>>>
>>> @param Value A pointer to the 32-bit value to increment.
>>>
>>> @@ -38,9 +36,10 @@ InternalSyncIncrement (
>>> UINT32 Result;
>>>
>>> __asm__ __volatile__ (
>>> + "movl $1, %%eax \n\t"
>>> "lock \n\t"
>>> - "incl %2 \n\t"
>>> - "mov %2, %%eax "
>>> + "xadd %%eax, %2 \n\t"
>>> + "inc %%eax "
>>> : "=a" (Result), // %0
>>> "=m" (*Value) // %1
>>> : "m" (*Value) // %2
>>
>> The operand list for the inline assembly is incorrect, I suspect. When I
>> build this code with GCC48 as part of OVMF, for the NOOPT target, then
>> disassemble "CpuMpPei.debug", I get:
>>
>>> 0000000000004383 <InternalSyncIncrement>:
>>> UINT32
>>> EFIAPI
>>> InternalSyncIncrement (
>>> IN volatile UINT32 *Value
>>> )
>>> {
>>> 4383: 55 push %rbp
>>> 4384: 48 89 e5 mov %rsp,%rbp
>>> 4387: 48 83 ec 10 sub $0x10,%rsp
>>> 438b: 48 89 4d 10 mov %rcx,0x10(%rbp)
>>> UINT32 Result;
>>>
>>> __asm__ __volatile__ (
>>> 438f: 48 8b 55 10 mov 0x10(%rbp),%rdx
>>> 4393: 48 8b 45 10 mov 0x10(%rbp),%rax
>>> 4397: b8 01 00 00 00 mov $0x1,%eax
>>> 439c: f0 0f c1 00 lock xadd %eax,(%rax)
>>> 43a0: ff c0 inc %eax
>>> 43a2: 89 45 fc mov %eax,-0x4(%rbp)
>>> : "m" (*Value) // %2
>>> : "memory",
>>> "cc"
>>> );
>>>
>>> return Result;
>>> 43a5: 8b 45 fc mov -0x4(%rbp),%eax
>>> }
>>> 43a8: c9 leaveq
>>> 43a9: c3 retq
>>>
>>
>> The generated code is trying to use EAX for both (a) the
>> exchange-and-add, i.e. as the source operand of the XADD instruction,
>> and (b) as (part of) the address of the (*Value) argument.
>>
>> In effect, before we reach the XADD instruction, the least significant
>> dword of Value's address is overwritten with 0x0000_0001. The exchange
>> then happens against that memory location.
>>
>> It is interesting that GCC first spills the parameter from RCX to the
>> stack, and then it loads the parameter from the stack to both RDX and
>> RAX.
>>
>> I think the operand list used with the inline assembly should be updated
>> to prevent GCC from associating %2 with EAX.
>>
>> I'll try to experiment some more.
>
> The operand lists for almost all of the functions in
>
> - MdePkg/Library/BaseSynchronizationLib/Ia32/GccInline.c
> - MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
>
> are wrong. The InternalSyncIncrement() and InternalSyncDecrement() issue
> has now bitten us in practice. The InternalSyncCompareExchangeXx()
> functions also have incorrect operand lists, they just haven't blown up yet.
>
> The core problem is that input-output operands for the inline assembly
> templates must be marked as such. The GCC documentation describes two
> ways for that:
>
> - List the r/w operand only in the output operand list, and use the "+"
> (not "=") constraint.
>
> - Alternatively, list the operand in both input and output lists, and
> use the "=" constraint in the output list. However, in the input list,
> the constraing must be "N" (a decimal number), where N stands for the
> operand number in the output list. This is how GCC knows those operands
> are actually the same. Using just the same *expression* to fill in the
> two instances (= input and output) of the operand is not guaranteed to
> work; the gcc documentation explicitly highlights this fact.
>
> In addition, in the ASM template, operands can be referenced by "[name]"
> instead of %N. This has been supported since at least gcc-3.1. This
> makes the template a lot more readable.
>
> I'm working on a patch set.
Apologies for conversing myself, but now I've managed to review more
background discussion from the mailing list.
In particular, in the following messages:
http://mid.mail-archive.com/4A89E2EF3DFEDB4C8BFDE51014F606A14E2F666F@SHSMSX104.ccr.corp.intel.com
http://mid.mail-archive.com/E92EE9817A31E24EB0585FDF735412F5B8AD687E@ORSMSX113.amr.corp.intel.com
an agreement was reached that ultimately *two* implementations would
remain: compiler intrinsics for the MSFT toolchain family, and the NASM
implementation for *everything else*, including *both* GCC and INTEL
toolchain families.
So, why was version 3 of the patch then accepted and pushed? Because now
we still have *three* implementations (excerpt):
[Sources.X64]
InterlockedIncrementMsc.c | MSFT
InterlockedDecrementMsc.c | MSFT
X64/InterlockedDecrement.nasm | INTEL
X64/InterlockedIncrement.nasm | INTEL
X64/GccInline.c | GCC
This is not in the spirit of the agreement on the list.
Why this is relevant to me: because now I realize I shouldn't *fix* the
GCC inline assembly. Instead, we should *remove* it, and consume the
NASM implementation.
So here's what I'm going to do. I will submit a standalone, "surgical"
patch, for fixing the regression introduced in 17634d026f96.
Additionally, I will file a TianoCore BZ about the *other* bugs in
GccInline.c, and suggest that we fix those issues by eliminating the
GCC-specific variant altogether. This way the regression is going to be
averted while we discuss that BZ.
Thanks
Laszlo
next prev parent reply other threads:[~2018-09-25 18:29 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-13 9:50 [PATCH v3] MdePkg/SynchronizationLib: fix Interlocked[De|In]crement return value Ruiyu Ni
2018-09-21 7:45 ` Gao, Liming
2018-09-25 14:08 ` Laszlo Ersek
2018-09-25 14:22 ` Laszlo Ersek
2018-09-25 16:18 ` Laszlo Ersek
2018-09-25 18:29 ` Laszlo Ersek [this message]
2018-09-25 19:25 ` Laszlo Ersek
2018-09-26 17:39 ` Kinney, Michael D
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9a0e968d-f6e4-18ad-f869-60fc88806735@redhat.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox