From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by mx.groups.io with SMTP id smtpd.web12.52433.1612274563795865108 for ; Tue, 02 Feb 2021 06:02:43 -0800 Authentication-Results: mx.groups.io; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=MAuTHSAw; spf=pass (domain: redhat.com, ip: 216.205.24.124, mailfrom: lersek@redhat.com) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612274563; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bOrHmhH3sbCVtIRXA5TDYoqXpJVNWe19NkWTQ3LxGwE=; b=MAuTHSAwnhygK7a2ta9/NYCXxXL5JfDC2YYBRhn7au7l2VE8EN6SjT2aVbnoGEUA0v/0Z3 xk41Lk1NIOUSsgHE9sa0ED7ZvNNrU0qzqLDniU/FiXFEjS+uG1dQqvDvdcj4TKD5CHDzYz 1Ww5Ll9vLoAuy6a5ho8SEpzZnHUmIaA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-459-mEaHh0xuPp6oZBMtKn4-tg-1; Tue, 02 Feb 2021 09:02:41 -0500 X-MC-Unique: mEaHh0xuPp6oZBMtKn4-tg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D37491372D3; Tue, 2 Feb 2021 14:00:49 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-115-241.ams2.redhat.com [10.36.115.241]) by smtp.corp.redhat.com (Postfix) with ESMTP id D353560C5F; Tue, 2 Feb 2021 14:00:47 +0000 (UTC) Subject: Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject() To: Ankur Arora , devel@edk2.groups.io Cc: imammedo@redhat.com, boris.ostrovsky@oracle.com, Jordan Justen , Ard Biesheuvel , Aaron Young References: <20210129005950.467638-1-ankur.a.arora@oracle.com> <20210129005950.467638-8-ankur.a.arora@oracle.com> <14068eb6-dc43-c244-5985-709d685fc750@oracle.com> From: "Laszlo Ersek" Message-ID: Date: Tue, 2 Feb 2021 15:00:46 +0100 MIME-Version: 1.0 In-Reply-To: <14068eb6-dc43-c244-5985-709d685fc750@oracle.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=lersek@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit On 02/01/21 21:12, Ankur Arora wrote: > On 2021-02-01 11:08 a.m., Laszlo Ersek wrote: >> apologies, I've got more comments here: >> >> On 01/29/21 01:59, Ankur Arora wrote: >> >>> /** >>> + CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(), >>> + on each CPU at exit from SMM. >>> + >>> + If, the executing CPU is not being ejected, nothing to be done. >>> + If, the executing CPU is being ejected, wait in a CpuDeadLoop() >>> + until ejected. >>> + >>> + @param[in] ProcessorNum Index of executing CPU. >>> + >>> +**/ >>> +VOID >>> +EFIAPI >>> +CpuEject ( >>> + IN UINTN ProcessorNum >>> + ) >>> +{ >>> + // >>> + // APIC ID is UINT32, but mCpuHotEjectData->ApicIdMap[] is UINT64 >>> + // so use UINT64 throughout. >>> + // >>> + UINT64 ApicId; >>> + >>> + ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum]; >>> + if (ApicId == CPU_EJECT_INVALID) { >>> + return; >>> + } >>> + >>> + // >>> + // CPU(s) being unplugged get here from >>> SmmCpuFeaturesSmiRendezvousExit() >>> + // after having been cleared to exit the SMI by the monarch and >>> thus have >>> + // no SMM processing remaining. >>> + // >>> + // Given that we cannot allow them to escape to the guest, we pen >>> them >>> + // here until the SMM monarch tells the HW to unplug them. >>> + // >>> + CpuDeadLoop (); >>> +} >> >> (15) There is no such function as SmmCpuFeaturesSmiRendezvousExit() -- >> it's SmmCpuFeaturesRendezvousExit(). >> >> (16) This function uses a data structure for communication between BSP >> and APs -- mCpuHotEjectData->ApicIdMap is modified in UnplugCpus() on >> the BSP, and checked above by the APs (too). >> >> What guarantees the visibility of mCpuHotEjectData->ApicIdMap? > > I was banking on SmiRendezvous() explicitly signalling that all > processing on the BSP was done before any AP will look at > mCpuHotEjectData in SmmCpuFeaturesRendezvousExit(). > > 1716 // > 1717 // Wait for BSP's signal to exit SMI > 1718 // > 1719 while (*mSmmMpSyncData->AllCpusInSync) { > 1720 CpuPause (); > 1721 } > 1722 } > 1723 > 1724 Exit: > 1725 SmmCpuFeaturesRendezvousExit (CpuIndex); Right; it's a general pattern in edk2: volatile UINT8 (aka BOOLEAN) objects are considered atomic. (See SMM_DISPATCHER_MP_SYNC_DATA.AllCpusInSync -- it's a pointer to a volatile BOOLEAN.) But our UINT64 values are neither volatile nor UINT8, and I got suddenly doubtful about "AllCpusInSync" working as a multiprocessor barrier. (I could be unjustifiedly worried, as a bunch of other fields in SMM_DISPATCHER_MP_SYNC_DATA are volatile, wider than UINT8, and *not* accessed with InterlockedCompareExchageXx().) > >> >> I think we might want to use InterlockedCompareExchange64() in both >> EjectCpu() and UnplugCpus() (and make "ApicIdMap" volatile, in >> addition). InterlockedCompareExchange64() can be used just for >> comparison as well, by passing ExchangeValue=CompareValue. > > > Speaking specifically about the ApicIdMap, I'm not sure I fully > agree (assuming my comment just above is correct.) > > > The only AP (reader) ApicIdMap deref is here: > > CpuEject(): > 218 ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum]; > > For the to-be-ejected-AP, this value can only move from > valid-APIC-ID (=> wait in CpuDeadLoop()) -> CPU_EJECT_INVALID. > > Given that, by the time the worker does the write on line 254, this > AP is guaranteed to be dead already, I don't think there's any > scenario where the to-be-ejected-AP can see anything other than > a valid-APIC-ID. The scenario I had in mind was different: what guarantees that the effect of 375 mCpuHotEjectData->ApicIdMap[ProcessorNum] = (UINT64)RemoveApicId; which is performed by the BSP in UnplugCpus(), is visible by the AP on line 218 (see your quote above)? What if the AP gets to line 218 before the BSP's write on line 375 *propagates* sufficiently? There's no question that the BSP writes before the AP reads, but I'm uncertain if that suffices for the *effect* of the write to be visible to the AP. My concern is not whether the AP sees a partial vs. a settled update; my concern is if the AP could see an entirely *stale* value. The consequence of that problem would be that an AP that the BSP were about to eject would return from CpuEject() to SmmCpuFeaturesRendezvousExit() to SmiRendezvous(). ... I guess that volatile-qualifying both CPU_HOT_EJECT_DATA, and the array pointed-to by CPU_HOT_EJECT_DATA.ApicIdMap, should suffice. In combination with the sync-up point that you quoted. This seems to match existing practice in PiSmmCpuDxeSmm -- there are no concurrent accesses, so atomicity is not a concern, and serializing the instruction streams coarsely, with the sync-up, in combination with volatile accesses, should presumably guarantee visibility (on x86 anyway). Thanks Laszlo > > 241 QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId); > 242 QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED); > 243 > 244 // > 245 // Compiler barrier to ensure the next store isn't reordered > 246 // > 247 MemoryFence (); > 248 > 249 // > 250 // Clear the eject status for CpuIndex to ensure that an > invalid > 251 // SMI later does not end up trying to eject it or a newly > 252 // hotplugged CpuIndex does not go into the dead loop. > 253 // > 254 mCpuHotEjectData->ApicIdMap[CpuIndex] = CPU_EJECT_INVALID; > For APs that are not being ejected, they will always see > CPU_EJECT_INVALID > since the writer never changes that. > > The one scenario in which bad things could happen is if entries in the > ApicIdMap are unaligned (or if the compiler or cpu-arch tears aligned > writes). > >> >> (17) I think a similar observation applies to the "Handler" field too, >> as APs call it, while the BSP keeps flipping it between NULL and a real >> function address. We might have to turn that field into an > From a real function address, to NULL is the problem part right? > > (Same argument as above for the transition in UnplugCpus() from > NULL -> function-address.) > > >> EFI_PHYSICAL_ADDRESS (just a fancy name for UINT64), and use >> InterlockedCompareExchange64() again. > > AFAICS, these are the problematic derefs: > > SmmCpuFeaturesRendezvousExit(): > > 450 if (mCpuHotEjectData == NULL || > 451 mCpuHotEjectData->Handler == NULL) { > 452 return; > > and problematic assignments: > > 266 // > 267 // We are done until the next hot-unplug; clear the handler. > 268 // > 269 mCpuHotEjectData->Handler = NULL; > 270 return; > 271 } > > Here as well, I've been banking on aligned writes such that the APs would > only see the before or after value not an intermediate value. > > Thanks > Ankur > >> >> Thanks >> Laszlo >> >