From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <lersek@redhat.com>
Received-SPF: Pass (sender SPF authorized) identity=mailfrom;
 client-ip=209.132.183.28; helo=mx1.redhat.com; envelope-from=lersek@redhat.com;
 receiver=edk2-devel@lists.01.org 
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ml01.01.org (Postfix) with ESMTPS id 8D6DC21C8EFB5
 for <edk2-devel@lists.01.org>; Tue, 25 Sep 2018 09:18:33 -0700 (PDT)
Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com
 [10.5.11.12])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id 3E454C004A99;
 Tue, 25 Sep 2018 16:18:33 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-71.rdu2.redhat.com
 [10.10.120.71])
 by smtp.corp.redhat.com (Postfix) with ESMTP id A3A4160BE8;
 Tue, 25 Sep 2018 16:18:31 +0000 (UTC)
From: Laszlo Ersek <lersek@redhat.com>
To: Ruiyu Ni <ruiyu.ni@intel.com>, edk2-devel@lists.01.org
Cc: Michael D Kinney <michael.d.kinney@intel.com>,
 Jiewen Yao <jiewen.yao@intel.com>, Liming Gao <liming.gao@intel.com>
References: <20180913095046.9480-1-ruiyu.ni@intel.com>
 <f4217b25-a4f9-48ff-0a70-1e233016972d@redhat.com>
Message-ID: <f45090cf-91f7-e2a9-9139-cf22df6b6c7c@redhat.com>
Date: Tue, 25 Sep 2018 18:18:30 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To: <f4217b25-a4f9-48ff-0a70-1e233016972d@redhat.com>
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.32]); Tue, 25 Sep 2018 16:18:33 +0000 (UTC)
Subject: Re: [PATCH v3] MdePkg/SynchronizationLib: fix Interlocked[De|In]crement return value
X-BeenThere: edk2-devel@lists.01.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: EDK II Development  <edk2-devel.lists.01.org>
List-Unsubscribe: <https://lists.01.org/mailman/options/edk2-devel>,
 <mailto:edk2-devel-request@lists.01.org?subject=unsubscribe>
List-Archive: <http://lists.01.org/pipermail/edk2-devel/>
List-Post: <mailto:edk2-devel@lists.01.org>
List-Help: <mailto:edk2-devel-request@lists.01.org?subject=help>
List-Subscribe: <https://lists.01.org/mailman/listinfo/edk2-devel>,
 <mailto:edk2-devel-request@lists.01.org?subject=subscribe>
X-List-Received-Date: Tue, 25 Sep 2018 16:18:34 -0000
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit

On 09/25/18 16:22, Laszlo Ersek wrote:
> On 09/13/18 11:50, Ruiyu Ni wrote:
> 
>> diff --git a/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c b/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
>> index 5224dd063f..4c4d6e3fc7 100644
>> --- a/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
>> +++ b/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c
>> @@ -15,14 +15,12 @@
>>
>>
>>
>> -
>>  /**
>>    Performs an atomic increment of an 32-bit unsigned integer.
>>
>>    Performs an atomic increment of the 32-bit unsigned integer specified by
>>    Value and returns the incremented value. The increment operation must be
>> -  performed using MP safe mechanisms. The state of the return value is not
>> -  guaranteed to be MP safe.
>> +  performed using MP safe mechanisms.
>>
>>    @param  Value A pointer to the 32-bit value to increment.
>>
>> @@ -38,9 +36,10 @@ InternalSyncIncrement (
>>    UINT32  Result;
>>
>>    __asm__ __volatile__ (
>> +    "movl    $1, %%eax  \n\t"
>>      "lock               \n\t"
>> -    "incl    %2         \n\t"
>> -    "mov     %2, %%eax      "
>> +    "xadd    %%eax, %2  \n\t"
>> +    "inc     %%eax          "
>>      : "=a" (Result),          // %0
>>        "=m" (*Value)           // %1
>>      : "m"  (*Value)           // %2
> 
> The operand list for the inline assembly is incorrect, I suspect. When I
> build this code with GCC48 as part of OVMF, for the NOOPT target, then
> disassemble "CpuMpPei.debug", I get:
> 
>> 0000000000004383 <InternalSyncIncrement>:
>> UINT32
>> EFIAPI
>> InternalSyncIncrement (
>>   IN      volatile UINT32    *Value
>>   )
>> {
>>     4383:       55                      push   %rbp
>>     4384:       48 89 e5                mov    %rsp,%rbp
>>     4387:       48 83 ec 10             sub    $0x10,%rsp
>>     438b:       48 89 4d 10             mov    %rcx,0x10(%rbp)
>>   UINT32  Result;
>>
>>   __asm__ __volatile__ (
>>     438f:       48 8b 55 10             mov    0x10(%rbp),%rdx
>>     4393:       48 8b 45 10             mov    0x10(%rbp),%rax
>>     4397:       b8 01 00 00 00          mov    $0x1,%eax
>>     439c:       f0 0f c1 00             lock xadd %eax,(%rax)
>>     43a0:       ff c0                   inc    %eax
>>     43a2:       89 45 fc                mov    %eax,-0x4(%rbp)
>>     : "m"  (*Value)           // %2
>>     : "memory",
>>       "cc"
>>     );
>>
>>   return Result;
>>     43a5:       8b 45 fc                mov    -0x4(%rbp),%eax
>> }
>>     43a8:       c9                      leaveq
>>     43a9:       c3                      retq
>>
> 
> The generated code is trying to use EAX for both (a) the
> exchange-and-add, i.e.  as the source operand of the XADD instruction,
> and (b) as (part of) the address of the (*Value) argument.
> 
> In effect, before we reach the XADD instruction, the least significant
> dword of Value's address is overwritten with 0x0000_0001. The exchange
> then happens against that memory location.
> 
> It is interesting that GCC first spills the parameter from RCX to the
> stack, and then it loads the parameter from the stack to both RDX and
> RAX.
> 
> I think the operand list used with the inline assembly should be updated
> to prevent GCC from associating %2 with EAX.
> 
> I'll try to experiment some more.

The operand lists for almost all of the functions in

- MdePkg/Library/BaseSynchronizationLib/Ia32/GccInline.c
- MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c

are wrong. The InternalSyncIncrement() and InternalSyncDecrement() issue
has now bitten us in practice. The InternalSyncCompareExchangeXx()
functions also have incorrect operand lists, they just haven't blown up yet.

The core problem is that input-output operands for the inline assembly
templates must be marked as such. The GCC documentation describes two
ways for that:

- List the r/w operand only in the output operand list, and use the "+"
(not "=") constraint.

- Alternatively, list the operand in both input and output lists, and
use the "=" constraint in the output list. However, in the input list,
the constraing must be "N" (a decimal number), where N stands for the
operand number in the output list. This is how GCC knows those operands
are actually the same. Using just the same *expression* to fill in the
two instances (= input and output) of the operand is not guaranteed to
work; the gcc documentation explicitly highlights this fact.

In addition, in the ASM template, operands can be referenced by "[name]"
instead of %N. This has been supported since at least gcc-3.1. This
makes the template a lot more readable.

I'm working on a patch set.

Thanks
Laszlo