Subject: Re: [edk2-devel] [PATCH v2] ArmPkg/CompilerIntrinsicsLib: provide atomics intrinsics
From: Philippe Mathieu-Daudé
To: devel@edk2.groups.io, ard.biesheuvel@arm.com
Cc: glin@suse.com, leif@nuviainc.com, lersek@redhat.com, liming.gao@intel.com
References: <20200520114448.26104-1-ard.biesheuvel@arm.com> <47f54425-df5d-17a3-e134-fe9e01fb08bd@redhat.com>
Message-ID: <6c9ec3a3-8aa1-c4d4-c7ba-1b9e28fd0866@redhat.com>
Date: Thu, 21 May 2020 18:40:25 +0200
In-Reply-To: <47f54425-df5d-17a3-e134-fe9e01fb08bd@redhat.com>

On 5/20/20 2:37 PM, Philippe Mathieu-Daudé wrote:
> Hi Ard,
>
> On 5/20/20 1:44 PM, Ard Biesheuvel wrote:
>> Gary reports the GCC 10 will emit calls to atomics intrinsics routines
>> unless -mno-outline-atomics is specified.
>> This means GCC-10 introduces
>> new intrinsics, and even though it would be possible to work around this
>> by specifying the command line option, this would require a new GCC10
>> toolchain profile to be created, which we prefer to avoid.
>>
>> So instead, add the new intrinsics to our library so they are provided
>> when necessary.
>>
>> Signed-off-by: Ard Biesheuvel
>> ---
>> v2:
>> - add missing .globl to export the functions from the object file
>> - add function end markers so the size of each is visible in the ELF
>> metadata
>> - add some comments to describe what is going on
>
> Thanks, head hurts a bit less...
>
>>
>>   ArmPkg/Library/CompilerIntrinsicsLib/CompilerIntrinsicsLib.inf |   3 +
>>   ArmPkg/Library/CompilerIntrinsicsLib/AArch64/Atomics.S         | 142
>> ++++++++++++++++++++
>>   2 files changed, 145 insertions(+)
>>
>> diff --git
>> a/ArmPkg/Library/CompilerIntrinsicsLib/CompilerIntrinsicsLib.inf
>> b/ArmPkg/Library/CompilerIntrinsicsLib/CompilerIntrinsicsLib.inf
>> index d5bad9467758..fcf48c678119 100644
>> --- a/ArmPkg/Library/CompilerIntrinsicsLib/CompilerIntrinsicsLib.inf
>> +++ b/ArmPkg/Library/CompilerIntrinsicsLib/CompilerIntrinsicsLib.inf
>> @@ -79,6 +79,9 @@ [Sources.ARM]
>>     Arm/ldivmod.asm      | MSFT
>>     Arm/llsr.asm         | MSFT
>> +[Sources.AARCH64]
>> +  AArch64/Atomics.S    | GCC
>> +
>>   [Packages]
>>     MdePkg/MdePkg.dec
>>     ArmPkg/ArmPkg.dec
>> diff --git a/ArmPkg/Library/CompilerIntrinsicsLib/AArch64/Atomics.S
>> b/ArmPkg/Library/CompilerIntrinsicsLib/AArch64/Atomics.S
>> new file mode 100644
>> index 000000000000..dc61d6bb8e52
>> --- /dev/null
>> +++ b/ArmPkg/Library/CompilerIntrinsicsLib/AArch64/Atomics.S
>> @@ -0,0 +1,142 @@
>> +#------------------------------------------------------------------------------
>>
>> +#
>> +# Copyright (c) 2020, Arm, Limited. All rights reserved.
>> +#
>> +# SPDX-License-Identifier: BSD-2-Clause-Patent
>> +#
>> +#------------------------------------------------------------------------------
>>
>> +
>> +    /*
>> +     * Provide the GCC intrinsics that are required when using GCC 9 or
>> +     * later with the -moutline-atomics options (which became the
>> default
>> +     * in GCC 10)
>> +     */
>> +    .arch armv8-a
>> +
>> +    .macro        reg_alias, pfx, sz
>> +    r0_\sz        .req    \pfx\()0
>> +    r1_\sz        .req    \pfx\()1
>> +    tmp0_\sz    .req    \pfx\()16
>> +    tmp1_\sz    .req    \pfx\()17
>> +    .endm
>> +
>> +    /*
>> +     * Define register aliases of the right type for each size
>> +     * (xN for 8 bytes, wN for everything smaller)
>> +     */
>> +    reg_alias    w, 1
>> +    reg_alias    w, 2
>> +    reg_alias    w, 4
>> +    reg_alias    x, 8
>> +
>> +    .macro        fn_start, name:req
>> +    .section    .text.\name
>> +    .globl        \name
>> +    .type        \name, %function
>> +\name\():
>> +    .endm
>> +
>> +    .macro        fn_end, name:req
>> +    .size        \name, . - \name
>> +    .endm
>> +
>> +    /*
>> +     * Emit an atomic helper for \model with operands of size \sz, using
>> +     * the operation specified by \insn (which is the LSE name), and
>> which
>> +     * can be implemented using the generic
>> load-locked/store-conditional
>> +     * (LL/SC) sequence below, using the arithmetic operation given by
>> +     * \opc.
>> +     */
>> +    .macro         emit_ld_sz, sz:req, insn:req, opc:req, model:req,
>> s, a, l
>> +    fn_start    __aarch64_\insn\()\sz\()\model
>> +    mov        tmp0_\sz, r0_\sz
>> +0:    ld\a\()xr\s    r0_\sz, [x1]
>> +    .ifnc        \insn, swp
>> +    \opc        tmp1_\sz, r0_\sz, tmp0_\sz
>> +    .else
>> +    \opc        tmp1_\sz, tmp0_\sz
>> +    .endif
>> +    st\l\()xr\s    w15, tmp1_\sz, [x1]
>> +    cbnz        w15, 0b
>
> I see at the end \s is in {,b,h} range.
>
> Don't you need to use x15 on 64-bit?

Ard, I expanded all the macros and reviewed this patch, but I am still
having a hard time figuring out why the w15 temp is OK instead of x15.
Any hint?

>
>> +    ret
>> +    fn_end        __aarch64_\insn\()\sz\()\model
>> +    .endm
>> +
>> +    /*
>> +     * Emit atomic helpers for \model for operand sizes in the
>> +     * set {1, 2, 4, 8}, for the instruction pattern given by
>> +     * \insn. (This is the LSE name, but this implementation uses
>> +     * the generic LL/SC sequence using \opc as the arithmetic
>> +     * operation on the target.)
>> +     */
>> +    .macro        emit_ld, insn:req, opc:req, model:req, a, l
>> +    emit_ld_sz    1, \insn, \opc, \model, b, \a, \l
>> +    emit_ld_sz    2, \insn, \opc, \model, h, \a, \l
>> +    emit_ld_sz    4, \insn, \opc, \model,  , \a, \l
>> +    emit_ld_sz    8, \insn, \opc, \model,  , \a, \l
>> +    .endm
>> +
>> +    /*
>> +     * Emit the compare and swap helper for \model and size \sz
>> +     * using LL/SC instructions.
>> +     */
>> +    .macro         emit_cas_sz, sz:req, model:req, uxt:req, s, a, l
>> +    fn_start    __aarch64_cas\sz\()\model
>> +    \uxt        tmp0_\sz, r0_\sz
>> +0:    ld\a\()xr\s    r0_\sz, [x2]
>> +    cmp        r0_\sz, tmp0_\sz
>> +    bne        1f
>> +    st\l\()xr\s    w15, r1_\sz, [x2]
>> +    cbnz        w15, 0b
>> +1:    ret
>> +    fn_end        __aarch64_cas\sz\()\model
>> +    .endm
>> +
>> +    /*
>> +     * Emit compare-and-swap helpers for \model for operand sizes in the
>> +     * set {1, 2, 4, 8, 16}.
>> +     */
>> +    .macro        emit_cas, model:req, a, l
>> +    emit_cas_sz    1, \model, uxtb, b, \a, \l
>> +    emit_cas_sz    2, \model, uxth, h, \a, \l
>> +    emit_cas_sz    4, \model, mov ,  , \a, \l
>> +    emit_cas_sz    8, \model, mov ,  , \a, \l
>> +
>> +    /*
>> +     * We cannot use the parameterized sequence for 16 byte CAS, so we
>> +     * need to define it explicitly.
>> +     */
>> +    fn_start    __aarch64_cas16\model
>> +    mov        x16, x0
>> +    mov        x17, x1
>> +0:    ld\a\()xp    x0, x1, [x4]
>> +    cmp        x0, x16
>> +    ccmp        x1, x17, #0, eq
>> +    bne        1f
>> +    st\l\()xp    w15, x16, x17, [x4]
>> +    cbnz        w15, 0b
>> +1:    ret
>> +    fn_end        __aarch64_cas16\model
>> +    .endm
>> +
>> +    /*
>> +     * Emit the set of GCC outline atomic helper functions for
>> +     * the memory ordering model given by \model:
>> +     * - relax    unordered loads and stores
>> +     * - acq    load-acquire, unordered store
>> +     * - rel    unordered load, store-release
>> +     * - acq_rel    load-acquire, store-release
>> +     */
>> +    .macro        emit_model, model:req, a, l
>> +    emit_ld        ldadd, add, \model, \a, \l
>> +    emit_ld        ldclr, bic, \model, \a, \l
>> +    emit_ld        ldeor, eor, \model, \a, \l
>> +    emit_ld        ldset, orr, \model, \a, \l
>> +    emit_ld        swp,   mov, \model, \a, \l
>> +    emit_cas    \model, \a, \l
>> +    .endm
>> +
>> +    emit_model    _relax
>> +    emit_model    _acq, a
>> +    emit_model    _rel,, l
>> +    emit_model    _acq_rel, a, l
>>
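
For anyone else following the macros, here is a rough C model of what two of
the expanded helpers appear to do, as I read the patch. It is only a sketch
under a few assumptions. The names model_ldadd8_relax and model_cas8_relax are
made up for illustration. The argument order (value in x0 and pointer in x1
for the ld* helpers; expected in x0, desired in x1 and pointer in x2 for cas)
is inferred from the register usage in the quoted code. The GCC __atomic
builtins stand in for the LL/SC loop, so this models the behaviour and not the
instruction sequence.

```c
#include <stdint.h>

/*
 * Hypothetical model of the 8-byte relaxed ldadd helper:
 * atomically do *ptr += val and return the value seen before the add.
 * The weak compare-exchange may fail spuriously, much like a
 * store-exclusive can, so it is retried until it succeeds.
 */
uint64_t model_ldadd8_relax(uint64_t val, uint64_t *ptr)
{
    uint64_t old = __atomic_load_n(ptr, __ATOMIC_RELAXED);

    /* On failure, 'old' is refreshed with the current contents of *ptr. */
    while (!__atomic_compare_exchange_n(ptr, &old, old + val,
                                        1 /* weak */,
                                        __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
    }
    return old;
}

/*
 * Hypothetical model of the 8-byte relaxed cas helper:
 * store 'desired' only if *ptr equals 'expected', and return the value
 * that was observed in memory either way.
 */
uint64_t model_cas8_relax(uint64_t expected, uint64_t desired, uint64_t *ptr)
{
    /* On mismatch, 'expected' is overwritten with the observed value. */
    __atomic_compare_exchange_n(ptr, &expected, desired,
                                0 /* strong */,
                                __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    return expected;
}
```

In both models the return value is the old memory contents, which matches the
helpers leaving the loaded value in r0_\sz before the ret.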