Subject: Re: [edk2-devel] [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
To: devel@edk2.groups.io, ankur.a.arora@oracle.com
Cc: imammedo@redhat.com, boris.ostrovsky@oracle.com, Jordan Justen, Ard Biesheuvel, Aaron Young
References: <20210129005950.467638-1-ankur.a.arora@oracle.com>
 <20210129005950.467638-8-ankur.a.arora@oracle.com>
 <14068eb6-dc43-c244-5985-709d685fc750@oracle.com>
 <039138dc-c20b-39fb-8c14-953c090b8961@oracle.com>
From: "Laszlo Ersek"
Message-ID: <35f1e98b-7a6a-ca94-881a-59cf49015be7@redhat.com>
Date: Fri, 5 Feb 2021 17:06:29 +0100
In-Reply-To: <039138dc-c20b-39fb-8c14-953c090b8961@oracle.com>

Hi Ankur,

I figure it's prudent for me to follow up here too:

On 02/04/21 03:49, Ankur Arora wrote:
> On 2021-02-03 12:58 p.m., Laszlo Ersek wrote:
>> On 02/03/21 07:45, Ankur Arora wrote:
>>> On 2021-02-02 6:15 a.m., Laszlo Ersek wrote:
>>>> On 02/02/21 15:00, Laszlo Ersek wrote:
>>>>
>>>>> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the
>>>>> array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In
>>>>> combination with the sync-up point that you quoted. This seems to
>>>>> match existing practice in PiSmmCpuDxeSmm -- there are no concurrent
>>>>> accesses, so atomicity is not a concern, and serializing the
>>>>> instruction streams coarsely, with the sync-up, in combination with
>>>>> volatile accesses, should presumably guarantee visibility (on x86
>>>>> anyway).
>>>>
>>>> To summarize, this is what I would ask for:
>>>>
>>>> - make CPU_HOT_EJECT_DATA volatile
>>>>
>>>> - make (*CPU_HOT_EJECT_DATA.ApicIdMap) volatile
>>>>
>>>> - after storing something to CPU_HOT_EJECT_DATA or
>>>>   CPU_HOT_EJECT_DATA.ApicIdMap on the BSP, execute a MemoryFence()
>>>>
>>>> - before fetching something from CPU_HOT_EJECT_DATA or
>>>>   CPU_HOT_EJECT_DATA.ApicIdMap on an AP, execute a MemoryFence()
>>>>
>>>>
>>>> Except: MemoryFence() isn't a *memory fence* in fact.
>>>>
>>>> See "MdePkg/Library/BaseLib/X64/GccInline.c".
>>>>
>>>> It's just a compiler barrier, which may not add anything beyond what
>>>> we'd already have from "volatile".
>>>>
>>>> Case in point: PiSmmCpuDxeSmm performs heavy multi-processing, but does
>>>> not contain a single invocation of MemoryFence(). It uses volatile
>>>> objects, and a handful of InterlockedCompareExchangeXx() calls, for
>>>> implementing semaphores. (NB: there is no 8-bit variant of
>>>> InterlockedCompareExchange(), as "volatile UINT8" is considered atomic
>>>> in itself, and a suitable basis for a semaphore too.) And given the
>>>> synchronization from those semaphores, PiSmmCpuDxeSmm trusts that
>>>> updates to the *other* volatile objects are both atomic and visible.
>>>>
>>>> I'm pretty sure this only works because x86 is in-order. There are
>>>> instruction stream barriers in place, and compiler barriers too, but no
>>>> actual memory barriers.
>>>
>>> Right and just to separate them explicitly, there are two problems:
>>>
>>>   - compiler: where the compiler caches stuff in or looks at stale
>>> memory locations. Now, AFAICS, this should not happen because the
>>> ApicIdMap would never change once set so the compiler should reasonably
>>> be able to cache the address of ApicIdMap and dereference it (thus
>>> obviating the need for volatile.)
>>
>> (CPU_HOT_EJECT_DATA.Handler does change though.)
>
> Yeah, I did kinda elide over that.
> Let me think this through in v7
> and add more explicit comments and then we can see if it still looks
> fishy?
>
> Thanks
> Ankur
>
>>
>>> The compiler could, however, cache any assignments to ApicIdMap[Idx]
>>> (especially given LTO) and we could use a MemoryFence() (as the compiler
>>> barrier that it is) to force the store.
>>>
>>>   - CPU pipeline: as you said, right now we basically depend on x86
>>> store order semantics (and the CpuPause() loop around AllCpusInSync,
>>> kinda provides a barrier.)
>>>
>>> So the BSP writes in this order:
>>> ApicIdMap[Idx]=x; ... ->AllCpusInSync = true
>>>
>>> And whenever the AP sees ->AllCpusInSync == True, it has to now see
>>> ApicIdMap[Idx] == x.
>>
>> Yes.
>>
>>>
>>> Maybe the thing to do is to detail this barrier in a commit
>>> note/comment?
>>
>> That would be nice.
>>
>>> And add the MemoryFence() but not the volatile.
>>
>> Yes, that should work.

Please *do* add the volatile, and also the MemoryFence().

When built with Visual Studio, MemoryFence() does nothing at all (at
least when LTO is in effect -- which it almost always is). So we should
have the volatile for making things work, and MemoryFence() as a
conceptual reminder, so that we know where to fix things up when (if!)
we get around to fixing this mess with MemoryFence().

Reference:

https://edk2.groups.io/g/rfc/message/500
https://edk2.groups.io/g/rfc/message/501
https://edk2.groups.io/g/rfc/message/502
https://edk2.groups.io/g/rfc/message/503

Thanks!
Laszlo
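
For illustration, the ordering discussed in this thread (the BSP stores to ApicIdMap[Idx], then raises AllCpusInSync; an AP that sees the flag must also see the stored ID) can be sketched in plain C roughly as follows. This is a minimal, single-threaded walk-through under stated assumptions, not the patch itself: EJECT_DATA_SKETCH, COMPILER_BARRIER(), INVALID_APIC_ID, BspPublish() and ApConsume() are hypothetical names, the macro assumes GCC or Clang inline assembly as a stand-in for a compiler-barrier-only MemoryFence(), and the AP side returns a sentinel where real code would spin with CpuPause().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a MemoryFence() that is only a compiler barrier.
   Assumes GCC or Clang; it emits no CPU fence instruction. */
#define COMPILER_BARRIER() __asm__ __volatile__ ("" ::: "memory")

#define INVALID_APIC_ID UINT64_MAX  /* hypothetical "nothing published" value */
#define CPU_COUNT       4           /* arbitrary size for the sketch */

/* Simplified, hypothetical model of CPU_HOT_EJECT_DATA. */
typedef struct {
  volatile uint64_t *ApicIdMap;     /* the pointed-to array is volatile */
  uint32_t          ArrayLength;
} EJECT_DATA_SKETCH;

static volatile uint64_t          mApicIdMap[CPU_COUNT];
/* The structure itself is volatile as well, as requested in the thread. */
static volatile EJECT_DATA_SKETCH mEjectData = { mApicIdMap, CPU_COUNT };
static volatile uint8_t           mAllCpusInSync;

/* BSP side: store the APIC ID first, then raise the sync flag. */
void BspPublish(uint32_t Idx, uint64_t ApicId)
{
  assert(Idx < mEjectData.ArrayLength);
  mEjectData.ApicIdMap[Idx] = ApicId;
  COMPILER_BARRIER();               /* marks where a real fence would belong */
  mAllCpusInSync = 1;
}

/* AP side: read the map only after observing the sync flag. Returns
   INVALID_APIC_ID if the flag is not yet set (real code would keep polling). */
uint64_t ApConsume(uint32_t Idx)
{
  assert(Idx < mEjectData.ArrayLength);
  if (mAllCpusInSync == 0) {
    return INVALID_APIC_ID;
  }
  COMPILER_BARRIER();               /* marks where a real fence would belong */
  return mEjectData.ApicIdMap[Idx];
}
```

Calling ApConsume(2) before any BspPublish() yields INVALID_APIC_ID; after BspPublish(2, 0x42), ApConsume(2) returns 0x42. In this sketch the volatile qualifiers are what force the compiler to emit the accesses in program order, and the barrier macro only documents where a true fence would go. Cross-processor visibility still rests on x86 store ordering, which a single-threaded example cannot demonstrate.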