Subject: Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject()
From: "Laszlo Ersek" <lersek@redhat.com>
To: Ankur Arora <ankur.a.arora@oracle.com>, devel@edk2.groups.io
Cc: imammedo@redhat.com, boris.ostrovsky@oracle.com, Jordan Justen,
 Ard Biesheuvel, Aaron Young
Date: Thu, 4 Feb 2021 09:58:00 +0100
Message-ID: <6079af8e-8f48-d67f-0f87-66d4ea72af0a@redhat.com>
In-Reply-To: <039138dc-c20b-39fb-8c14-953c090b8961@oracle.com>
References: <20210129005950.467638-1-ankur.a.arora@oracle.com>
 <20210129005950.467638-8-ankur.a.arora@oracle.com>
 <14068eb6-dc43-c244-5985-709d685fc750@oracle.com>
 <039138dc-c20b-39fb-8c14-953c090b8961@oracle.com>

On 02/04/21 03:49, Ankur Arora wrote:
> On 2021-02-03 12:58 p.m., Laszlo Ersek wrote:
>> On 02/03/21 07:45, Ankur Arora wrote:
>>> On 2021-02-02 6:15 a.m., Laszlo Ersek wrote:
>>>> On 02/02/21 15:00, Laszlo Ersek wrote:
>>>>
>>>>> ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and
>>>>> the array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should
>>>>> suffice, in combination with the sync-up point that you quoted.
>>>>> This seems to match existing practice in PiSmmCpuDxeSmm -- there
>>>>> are no concurrent accesses, so atomicity is not a concern, and
>>>>> serializing the instruction streams coarsely, with the sync-up, in
>>>>> combination with volatile accesses, should presumably guarantee
>>>>> visibility (on x86 anyway).
>>>>
>>>> To summarize, this is what I would ask for:
>>>>
>>>> - make CPU_HOT_EJECT_DATA volatile
>>>>
>>>> - make (*CPU_HOT_EJECT_DATA.ApicIdMap) volatile
>>>>
>>>> - after storing something to CPU_HOT_EJECT_DATA or
>>>>   CPU_HOT_EJECT_DATA.ApicIdMap on the BSP, execute a MemoryFence()
>>>>
>>>> - before fetching something from CPU_HOT_EJECT_DATA or
>>>>   CPU_HOT_EJECT_DATA.ApicIdMap on an AP, execute a MemoryFence()
>>>>
>>>> Except: MemoryFence() isn't a *memory fence*, in fact.
>>>>
>>>> See "MdePkg/Library/BaseLib/X64/GccInline.c".
>>>>
>>>> It's just a compiler barrier, which may not add anything beyond
>>>> what we'd already have from "volatile".
>>>>
>>>> Case in point: PiSmmCpuDxeSmm performs heavy multi-processing, but
>>>> does not contain a single invocation of MemoryFence(). It uses
>>>> volatile objects, and a handful of InterlockedCompareExchangeXx()
>>>> calls, for implementing semaphores. (NB: there is no 8-bit variant
>>>> of InterlockedCompareExchange(), as "volatile UINT8" is considered
>>>> atomic in itself, and a suitable basis for a semaphore too.) And
>>>> given the synchronization from those semaphores, PiSmmCpuDxeSmm
>>>> trusts that updates to the *other* volatile objects are both
>>>> atomic and visible.
>>>>
>>>> I'm pretty sure this only works because x86 is in-order. There are
>>>> instruction stream barriers in place, and compiler barriers too,
>>>> but no actual memory barriers.
>>>
>>> Right, and just to separate them explicitly, there are two problems:
>>>
>>> - compiler: where the compiler caches stuff in registers, or looks
>>> at stale memory locations. Now, AFAICS, this should not happen,
>>> because the ApicIdMap would never change once set, so the compiler
>>> should reasonably be able to cache the address of ApicIdMap and
>>> dereference it (thus obviating the need for volatile.)
>>
>> (CPU_HOT_EJECT_DATA.Handler does change though.)
>
> Yeah, I did kinda elide over that. Let me think this through in v7
> and add more explicit comments, and then we can see if it still looks
> fishy?

OK. (Clearly, I don't want to block progress on this with concerns
that are purely theoretical.)

Thanks,
Laszlo

> Thanks
> Ankur
>
>>
>>> The compiler could, however, cache any assignments to ApicIdMap[Idx]
>>> (especially given LTO), and we could use a MemoryFence() (as the
>>> compiler barrier that it is) to force the store.
>>>
>>> - CPU pipeline: as you said, right now we basically depend on x86
>>> store-order semantics (and the CpuPause() loop around AllCpusInSync
>>> kinda provides a barrier.)
>>>
>>> So the BSP writes in this order:
>>>
>>>   ApicIdMap[Idx] = x; ...; ->AllCpusInSync = TRUE;
>>>
>>> And whenever the AP sees ->AllCpusInSync == TRUE, it has to now see
>>> ApicIdMap[Idx] == x.
>>
>> Yes.
>>
>>> Maybe the thing to do is to detail this barrier in a commit
>>> note/comment?
>>
>> That would be nice.
>>
>>> And add the MemoryFence() but not the volatile.
>>
>> Yes, that should work.
>>
>> Thanks,
>> Laszlo
>>
>
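
P.S. For readers of the archive: the MemoryFence() definition in
"MdePkg/Library/BaseLib/X64/GccInline.c" that the discussion refers to
was, at the time, essentially the following -- an empty asm statement
whose "memory" clobber makes it a pure compiler barrier:

VOID
EFIAPI
MemoryFence (
  VOID
  )
{
  //
  // The "memory" clobber forbids the compiler from caching memory
  // contents in registers across this point, but no fence instruction
  // (MFENCE / SFENCE / LFENCE) is emitted -- it constrains the
  // compiler only, not the CPU.
  //
  __asm__ __volatile__ ("" : : : "memory");
}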
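
The PiSmmCpuDxeSmm semaphore pattern mentioned above is built around
InterlockedCompareExchange32(). A rough sketch, reconstructed from
memory rather than copied from the tree (consult
UefiCpuPkg/PiSmmCpuDxeSmm for the authoritative version):

UINT32
WaitForSemaphore (
  IN OUT volatile UINT32  *Sem
  )
{
  UINT32  Value;

  //
  // Spin until the count is nonzero, then try to decrement it
  // atomically; retry if another processor won the race (i.e., the
  // compare-exchange returned something other than the value we read).
  //
  do {
    Value = *Sem;
  } while (Value == 0 ||
           InterlockedCompareExchange32 (
             (UINT32 *)Sem,
             Value,
             Value - 1
             ) != Value);

  return Value - 1;
}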
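
Finally, a minimal sketch of the BSP/AP publish-consume ordering the
thread converges on, following the bullet list quoted above (volatile
plus MemoryFence()). Only the names ApicIdMap and AllCpusInSync come
from the discussion; BspPublish()/ApConsume() and the rest of the
scaffolding are hypothetical, not the actual patch code:

volatile UINT64   *mApicIdMap;     // written by the BSP, read by APs
volatile BOOLEAN  mAllCpusInSync;  // the sync-up flag

VOID
BspPublish (
  IN UINTN   Idx,
  IN UINT64  ApicId
  )
{
  mApicIdMap[Idx] = ApicId;
  //
  // Compiler barrier: keeps the map store above from being sunk past
  // the flag store below. On x86, stores are not reordered with older
  // stores anyway, so no fence instruction is needed for the CPU.
  //
  MemoryFence ();
  mAllCpusInSync = TRUE;
}

UINT64
ApConsume (
  IN UINTN  Idx
  )
{
  while (!mAllCpusInSync) {
    CpuPause ();
  }
  //
  // Compiler barrier: keeps the map load below from being hoisted
  // above the spin loop. On x86, loads are not reordered with older
  // loads.
  //
  MemoryFence ();
  return mApicIdMap[Idx];
}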