According to Ard's explanation, it seems we do not need to worry about multi-core issues here. However, this relies on the assumption that spinlocks are not shared with DMA masters.

 

Maugan, Sean,

If you have any other comments or concerns about this, please leave a comment.

 

Following Ard's suggestion, we may improve the code as below if needed.

 

SPIN_LOCK *
EFIAPI
ReleaseSpinLock (
  IN OUT  SPIN_LOCK                 *SpinLock
  )
{
  SPIN_LOCK  LockValue;

  ASSERT (SpinLock != NULL);

  LockValue = (SPIN_LOCK)InterlockedCompareExchangePointer (
                           (VOID **)SpinLock,
                           (VOID *)SPIN_LOCK_ACQUIRED,
                           (VOID *)SPIN_LOCK_RELEASED
                           );
  //
  // Releasing a lock that is not currently held is a caller bug.
  // Unlike the current permissive check, this ASSERT fails on a
  // double release, as Ard suggested.
  //
  ASSERT (LockValue == SPIN_LOCK_ACQUIRED);

  return SpinLock;
}

 

--Bin

 

From: Ard Biesheuvel <ard.biesheuvel@arm.com>
Sent: Wednesday, January 6, 2021 10:28 PM
To: Bin, Sung-Uk (Bin) <sunguk-bin@hp.com>; devel@edk2.groups.io
Cc: gaoliming@byosoft.com.cn; Villatel, Maugan <maugan.villatel@hp.com>; Collison, Sean <scollison@hp.com>
Subject: Re: [RFC] Incorrect memory ordering in ReleaseSpinLock()

 

On 1/6/21 12:29 PM, Bin, Sung-Uk (Bin) wrote:
> Dear, Ard and maintainers
>
>  
>
> We are concerned that ReleaseSpinLock() does not have a memory barrier.
> This is reported in https://bugzilla.tianocore.org/show_bug.cgi?id=3005.
> We'd like to hear from you whether the current implementation needs
> improvement.
>

I think you are correct that the current implementation is insufficient.
However, I would prefer for someone to do a comprehensive audit of all
the locking primitives for concurrency problems.


>  
>
> The concern comes from *'weak memory ordering' and multi-core.* (we are
> using AARCH64) And the scenario we are concerned about is as below:
>

When does UEFI run multi-core on an AArch64 system? The UEFI spec does
not permit SMP at boot time, and at runtime, the runtime services are
not reentrant, in which case we should be able to rely on barriers in
the OS's critical section code to ensure visibility when several cores
compete for the UEFI runtime services from the OS.


>  
>
> AcquireSpinLock(); // contains ‘dmb sy’ and prevents "a = *b" from
> moving up (and unnecessarily prevents other things from moving down)
>
> a = *b;
>
> a = a + 1;
>
> *b = a;
>
> *ReleaseSpinLock(); // No write barrier here, so "*b = a" can move down.
> Another core acquires the spinlock and can read stale data*
>
>  
>
>  
>
> Please let me know if it would be helpful to add MemoryFence like below:
>

For symmetry, I'd prefer it if we could simply implement the release
side in terms of InterlockedCompareExchangePointer(), and ASSERT() on
the output.

*However*, looking at the current code, there seems to be something
seriously wrong: ReleaseSpinLock() has

ASSERT (SPIN_LOCK_ACQUIRED == LockValue || SPIN_LOCK_RELEASED == LockValue);


which means you can release a released spinlock even on a DEBUG build
without a diagnostic being printed - that seems like a bug to me.

>  
>
> SPIN_LOCK *
>
> EFIAPI
>
> ReleaseSpinLock (
>
>   IN OUT  SPIN_LOCK                 *SpinLock
>
>   )
>
> {
>
>   SPIN_LOCK    LockValue;
>
>  
>
>   ASSERT (SpinLock != NULL);
>
>  
>
> *  MemoryFence(); *
>
>  
>
>   LockValue = *SpinLock;
>
>   ASSERT (SPIN_LOCK_ACQUIRED == LockValue || SPIN_LOCK_RELEASED ==
> LockValue);
>
>  
>
>   *SpinLock = SPIN_LOCK_RELEASED;
>
>   return SpinLock;
>
> }
>
> * *
>
> *MemoryFence is implemented with 'dmb', but I just wonder if it is okay
> to not implement it with 'dsb'.*
>

DSB is for cache and TLB maintenance, not for memory ordering. DMB
should be sufficient here. And actually, we don't need a system wide DMB
here, an inner shareable DMB should be sufficient (given that we don't
share spinlocks with DMA masters)




>  
>
> * Attaching linux documentation describing SMP barrier pairing
>
> https://github.com/torvalds/linux/blob/master/Documentation/memory-barriers.txt
>
>  
>
> SMP BARRIER PAIRING
> -------------------
>
> When dealing with CPU-CPU interactions, certain types of memory barrier should
> always be paired.  A lack of appropriate pairing is almost certainly an error.
>
> General barriers pair with each other, though they also pair with most
> other types of barriers, albeit without multicopy atomicity.  An acquire
> barrier pairs with a release barrier, but both may also pair with other
> barriers, including of course general barriers.  A write barrier pairs
> with a data dependency barrier, a control dependency, an acquire barrier,
> a release barrier, a read barrier, or a general barrier.  Similarly a
> read barrier, control dependency, or a data dependency barrier pairs
> with a write barrier, an acquire barrier, a release barrier, or a
> general barrier:
>
>        CPU 1                 CPU 2
>        ===============       ===============
>        WRITE_ONCE(a, 1);
>        <write barrier>
>        WRITE_ONCE(b, 2);     x = READ_ONCE(b);
>                              <read barrier>
>                              y = READ_ONCE(a);
>
> Or:
>
>        CPU 1                 CPU 2
>        ===============       ===============================
>        a = 1;
>        <write barrier>
>        WRITE_ONCE(b, &a);    x = READ_ONCE(b);
>                              <data dependency barrier>
>                              y = *x;
>
> Or even:
>
>        CPU 1                 CPU 2
>        ===============       ===============================
>        r1 = READ_ONCE(y);
>        <general barrier>
>        WRITE_ONCE(x, 1);     if (r2 = READ_ONCE(x)) {
>                                 <implicit control dependency>
>                                 WRITE_ONCE(y, 1);
>                              }
>
>        assert(r1 == 0 || r2 == 0);
>
> Basically, the read barrier always has to be there, even though it can be of
> the "weaker" type.
>
> [!] Note that the stores before the write barrier would normally be
> expected to match the loads after the read barrier or the data dependency
> barrier, and vice versa:
>
>        CPU 1                               CPU 2
>        ===================                 ===================
>        WRITE_ONCE(a, 1);    }----   --->{  v = READ_ONCE(c);
>        WRITE_ONCE(b, 2);    }    \ /    {  w = READ_ONCE(d);
>        <write barrier>            \        <read barrier>
>        WRITE_ONCE(c, 3);    }    / \    {  x = READ_ONCE(a);
>        WRITE_ONCE(d, 4);    }----   --->{  y = READ_ONCE(b);
>
>  
>
>  
>
>  
>
> Thanks,
>
> Bin
>
>  
>
> *From:* bugzilla-daemon@bugzilla.tianocore.org
> <bugzilla-daemon@bugzilla.tianocore.org>
> *Sent:* Wednesday, November 4, 2020 10:44 AM
> *To:* Bin, Sung-Uk (Bin) <sunguk-bin@hp.com>
> *Subject:* [Bug 3005] ReleaseSpinLock() requires a barrier at the beginning
>
>  
>
> https://bugzilla.tianocore.org/show_bug.cgi?id=3005
>
> gaoliming@byosoft.com.cn changed:
>
>            What           |Removed                  |Added
> ----------------------------------------------------------------------------
>            Priority       |Lowest                   |Normal
>            Status         |UNCONFIRMED              |CONFIRMED
>            CC             |                         |leif@nuviainc.com
>            Assignee       |unassigned@tianocore.org |ard.biesheuvel@arm.com
>            Ever confirmed |0                        |1
>
> --- Comment #5 from gaoliming@byosoft.com.cn
> <mailto:gaoliming@byosoft.com.cn> ---
> Ard: can you help check it? This issue in AARCH64.
>
> --
> You are receiving this mail because:
> You reported the bug.
>