From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
To: Ankur Arora, devel@edk2.groups.io
Cc: imammedo@redhat.com, boris.ostrovsky@oracle.com, Jordan Justen,
 Ard Biesheuvel, Aaron Young
References: <20210129005950.467638-1-ankur.a.arora@oracle.com>
 <20210129005950.467638-8-ankur.a.arora@oracle.com>
 <14068eb6-dc43-c244-5985-709d685fc750@oracle.com>
From: "Laszlo Ersek" <lersek@redhat.com>
Date: Wed, 3 Feb 2021 21:58:50 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

On 02/03/21 07:45, Ankur Arora wrote:
> On 2021-02-02 6:15 a.m., Laszlo Ersek wrote:
>> On 02/02/21 15:00, Laszlo Ersek wrote:
>>
>>> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
>>> array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
>>> combination with the sync-up point that you quoted. This seems to match
>>> existing practice in PiSmmCpuDxeSmm -- there are no concurrent accesses,
>>> so atomicity is not a concern, and serializing the instruction streams
>>> coarsely, with the sync-up, in combination with volatile accesses,
>>> should presumably guarantee visibility (on x86 anyway).
>>
>> To summarize, this is what I would ask for:
>>
>> - make CPU_HOT_EJECT_DATA volatile
>>
>> - make (*CPU_HOT_EJECT_DATA.ApicIdMap) volatile
>>
>> - after storing something to CPU_HOT_EJECT_DATA or
>>   CPU_HOT_EJECT_DATA.ApicIdMap on the BSP, execute a MemoryFence()
>>
>> - before fetching something from CPU_HOT_EJECT_DATA or
>>   CPU_HOT_EJECT_DATA.ApicIdMap on an AP, execute a MemoryFence()
>>
>> Except: MemoryFence() isn't a *memory fence* in fact.
>>
>> See "MdePkg/Library/BaseLib/X64/GccInline.c".
>>
>> It's just a compiler barrier, which may not add anything beyond what
>> we'd already have from "volatile".
>>
>> Case in point: PiSmmCpuDxeSmm performs heavy multi-processing, but does
>> not contain a single invocation of MemoryFence().
>> It uses volatile
>> objects, and a handful of InterlockedCompareExchangeXx() calls, for
>> implementing semaphores. (NB: there is no 8-bit variant of
>> InterlockedCompareExchange(), as "volatile UINT8" is considered atomic
>> in itself, and a suitable basis for a semaphore too.) And given the
>> synchronization from those semaphores, PiSmmCpuDxeSmm trusts that
>> updates to the *other* volatile objects are both atomic and visible.
>>
>> I'm pretty sure this only works because x86 is in-order. There are
>> instruction stream barriers in place, and compiler barriers too, but no
>> actual memory barriers.
>
> Right, and just to separate them explicitly, there are two problems:
>
>  - compiler: where the compiler caches stuff in registers, or looks at
>    stale memory locations. Now, AFAICS, this should not happen, because
>    the ApicIdMap would never change once set, so the compiler should
>    reasonably be able to cache the address of ApicIdMap and dereference
>    it (thus obviating the need for volatile.)
>    (CPU_HOT_EJECT_DATA.Handler does change though.)
>    The compiler could, however, cache any assignments to ApicIdMap[Idx]
>    (especially given LTO), and we could use a MemoryFence() (as the
>    compiler barrier that it is) to force the store.
>
>  - CPU pipeline: as you said, right now we basically depend on x86
>    store-order semantics (and the CpuPause() loop around AllCpusInSync
>    kinda provides a barrier.)
>
> So the BSP writes in this order:
>   ApicIdMap[Idx] = x; ... ->AllCpusInSync = TRUE;
>
> And whenever the AP sees ->AllCpusInSync == TRUE, it has to now see
> ApicIdMap[Idx] == x.

Yes.

> Maybe the thing to do is to detail this barrier in a commit
> note/comment?

That would be nice.

> And add the MemoryFence() but not the volatile.

Yes, that should work.

Thanks,
Laszlo