Re: [PATCH v3] MdePkg/SynchronizationLib: fix Interlocked[De|In]crement return value

public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed

From: Laszlo Ersek <lersek@redhat.com>
To: Ruiyu Ni <ruiyu.ni@intel.com>, edk2-devel@lists.01.org
Cc: Michael D Kinney <michael.d.kinney@intel.com>,
	Jiewen Yao <jiewen.yao@intel.com>,
	Liming Gao <liming.gao@intel.com>
Subject: Re: [PATCH v3] MdePkg/SynchronizationLib: fix Interlocked[De|In]crement return value
Date: Tue, 25 Sep 2018 20:29:10 +0200	[thread overview]
Message-ID: <9a0e968d-f6e4-18ad-f869-60fc88806735@redhat.com> (raw)
In-Reply-To: <f45090cf-91f7-e2a9-9139-cf22df6b6c7c@redhat.com>

On 09/25/18 18:18, Laszlo Ersek wrote:
> On 09/25/18 16:22, Laszlo Ersek wrote:
>> On 09/13/18 11:50, Ruiyu Ni wrote:
>>
>>> diff --git a/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c b/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
>>> index 5224dd063f..4c4d6e3fc7 100644
>>> --- a/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
>>> +++ b/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
>>> @@ -15,14 +15,12 @@
>>>
>>>
>>>
>>> -
>>>  /**
>>>    Performs an atomic increment of an 32-bit unsigned integer.
>>>
>>>    Performs an atomic increment of the 32-bit unsigned integer specified by
>>>    Value and returns the incremented value. The increment operation must be
>>> -  performed using MP safe mechanisms. The state of the return value is not
>>> -  guaranteed to be MP safe.
>>> +  performed using MP safe mechanisms.
>>>
>>>    @param  Value A pointer to the 32-bit value to increment.
>>>
>>> @@ -38,9 +36,10 @@ InternalSyncIncrement (
>>>    UINT32  Result;
>>>
>>>    __asm__ __volatile__ (
>>> +    "movl    $1, %%eax  \n\t"
>>>      "lock               \n\t"
>>> -    "incl    %2         \n\t"
>>> -    "mov     %2, %%eax      "
>>> +    "xadd    %%eax, %2  \n\t"
>>> +    "inc     %%eax          "
>>>      : "=a" (Result),          // %0
>>>        "=m" (*Value)           // %1
>>>      : "m"  (*Value)           // %2
>>
>> The operand list for the inline assembly is incorrect, I suspect. When I
>> build this code with GCC48 as part of OVMF, for the NOOPT target, then
>> disassemble "CpuMpPei.debug", I get:
>>
>>> 0000000000004383 <InternalSyncIncrement>:
>>> UINT32
>>> EFIAPI
>>> InternalSyncIncrement (
>>>   IN      volatile UINT32    *Value
>>>   )
>>> {
>>>     4383:       55                      push   %rbp
>>>     4384:       48 89 e5                mov    %rsp,%rbp
>>>     4387:       48 83 ec 10             sub    $0x10,%rsp
>>>     438b:       48 89 4d 10             mov    %rcx,0x10(%rbp)
>>>   UINT32  Result;
>>>
>>>   __asm__ __volatile__ (
>>>     438f:       48 8b 55 10             mov    0x10(%rbp),%rdx
>>>     4393:       48 8b 45 10             mov    0x10(%rbp),%rax
>>>     4397:       b8 01 00 00 00          mov    $0x1,%eax
>>>     439c:       f0 0f c1 00             lock xadd %eax,(%rax)
>>>     43a0:       ff c0                   inc    %eax
>>>     43a2:       89 45 fc                mov    %eax,-0x4(%rbp)
>>>     : "m"  (*Value)           // %2
>>>     : "memory",
>>>       "cc"
>>>     );
>>>
>>>   return Result;
>>>     43a5:       8b 45 fc                mov    -0x4(%rbp),%eax
>>> }
>>>     43a8:       c9                      leaveq
>>>     43a9:       c3                      retq
>>>
>>
>> The generated code is trying to use EAX for both (a) the
>> exchange-and-add, i.e.  as the source operand of the XADD instruction,
>> and (b) as (part of) the address of the (*Value) argument.
>>
>> In effect, before we reach the XADD instruction, the least significant
>> dword of Value's address is overwritten with 0x0000_0001. The exchange
>> then happens against that memory location.
>>
>> It is interesting that GCC first spills the parameter from RCX to the
>> stack, and then it loads the parameter from the stack to both RDX and
>> RAX.
>>
>> I think the operand list used with the inline assembly should be updated
>> to prevent GCC from associating %2 with EAX.
>>
>> I'll try to experiment some more.
> 
> The operand lists for almost all of the functions in
> 
> - MdePkg/Library/BaseSynchronizationLib/Ia32/GccInline.c
> - MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
> 
> are wrong. The InternalSyncIncrement() and InternalSyncDecrement() issue
> has now bitten us in practice. The InternalSyncCompareExchangeXx()
> functions also have incorrect operand lists, they just haven't blown up yet.
> 
> The core problem is that input-output operands for the inline assembly
> templates must be marked as such. The GCC documentation describes two
> ways for that:
> 
> - List the r/w operand only in the output operand list, and use the "+"
> (not "=") constraint.
> 
> - Alternatively, list the operand in both input and output lists, and
> use the "=" constraint in the output list. However, in the input list,
> the constraing must be "N" (a decimal number), where N stands for the
> operand number in the output list. This is how GCC knows those operands
> are actually the same. Using just the same *expression* to fill in the
> two instances (= input and output) of the operand is not guaranteed to
> work; the gcc documentation explicitly highlights this fact.
> 
> In addition, in the ASM template, operands can be referenced by "[name]"
> instead of %N. This has been supported since at least gcc-3.1. This
> makes the template a lot more readable.
> 
> I'm working on a patch set.

Apologies for conversing myself, but now I've managed to review more
background discussion from the mailing list.

In particular, in the following messages:

http://mid.mail-archive.com/4A89E2EF3DFEDB4C8BFDE51014F606A14E2F666F@SHSMSX104.ccr.corp.intel.com

http://mid.mail-archive.com/E92EE9817A31E24EB0585FDF735412F5B8AD687E@ORSMSX113.amr.corp.intel.com

an agreement was reached that ultimately *two* implementations would
remain: compiler intrinsics for the MSFT toolchain family, and the NASM
implementation for *everything else*, including *both* GCC and INTEL
toolchain families.

So, why was version 3 of the patch then accepted and pushed? Because now
we still have *three* implementations (excerpt):

[Sources.X64]
  InterlockedIncrementMsc.c | MSFT
  InterlockedDecrementMsc.c | MSFT

  X64/InterlockedDecrement.nasm | INTEL
  X64/InterlockedIncrement.nasm | INTEL

  X64/GccInline.c | GCC

This is not in the spirit of the agreement on the list.

Why this is relevant to me: because now I realize I shouldn't *fix* the
GCC inline assembly. Instead, we should *remove* it, and consume the
NASM implementation.

So here's what I'm going to do. I will submit a standalone, "surgical"
patch, for fixing the regression introduced in 17634d026f96.
Additionally, I will file a TianoCore BZ about the *other* bugs in
GccInline.c, and suggest that we fix those issues by eliminating the
GCC-specific variant altogether. This way the regression is going to be
averted while we discuss that BZ.

Thanks
Laszlo

next prev parent reply	other threads:[~2018-09-25 18:29 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-13  9:50 [PATCH v3] MdePkg/SynchronizationLib: fix Interlocked[De|In]crement return value Ruiyu Ni
2018-09-21  7:45 ` Gao, Liming
2018-09-25 14:08 ` Laszlo Ersek
2018-09-25 14:22 ` Laszlo Ersek
2018-09-25 16:18   ` Laszlo Ersek
2018-09-25 18:29     ` Laszlo Ersek [this message]
2018-09-25 19:25       ` Laszlo Ersek
2018-09-26 17:39         ` Kinney, Michael D

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9a0e968d-f6e4-18ad-f869-60fc88806735@redhat.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox