From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from userp2130.oracle.com (userp2130.oracle.com [156.151.31.86]) by mx.groups.io with SMTP id smtpd.web11.41053.1612210483005724861 for ; Mon, 01 Feb 2021 12:14:43 -0800 Authentication-Results: mx.groups.io; dkim=pass header.i=@oracle.com header.s=corp-2020-01-29 header.b=zXqHMp/j; spf=pass (domain: oracle.com, ip: 156.151.31.86, mailfrom: ankur.a.arora@oracle.com) Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 111KE9I4001827; Mon, 1 Feb 2021 20:14:39 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : to : cc : references : from : message-id : date : in-reply-to : content-type : content-transfer-encoding : mime-version; s=corp-2020-01-29; bh=Wg4py4zgT1pZym8QgDHNqKkWUac4vWsPKTmJZR6dvaQ=; b=zXqHMp/jrIiJf5eZtr6X3b7xLcUeN/Pdk/ez0mx+JfFp9h8lUMbBdJQ1bRyz+g9zSsvQ bNawzMQPC1VQdhPlvLN74PavoL7NvwZd6m/KFz8mO2faMlLeot8x9wNKp5s2jdYIaDjB +QrRmxGHUdfX/ttEDgHQWS5vfDCxQoJbWZK3EojdCBj6ADZrhyIZz8+MmYkrKVKvYnq3 OP5LA1bWEZS9kQ6CLSTpWENafonhvw/gw2jml0bedOVp6J/wsi4O/UQ1ArZKy4lzdI7J qGg0WcLrzb8Ac3MPaCtCEiMhCU0tbwkEAYg/jHTELzrqfacLodWSs+oWLhJfwZH0UfnX sg== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by userp2130.oracle.com with ESMTP id 36cxvqyc8n-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 01 Feb 2021 20:14:39 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 111KBRBk088564; Mon, 1 Feb 2021 20:12:38 GMT Received: from nam11-dm6-obe.outbound.protection.outlook.com (mail-dm6nam11lp2176.outbound.protection.outlook.com [104.47.57.176]) by aserp3030.oracle.com with ESMTP id 36dh1mwvj7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 01 Feb 2021 20:12:38 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=YUGCtDBRRcT7K9G5pEVoiCLAgrgv4USNZ5M5oUOJ0z1fbSCL8rYatMaed+KrDYMTDZQ62nTf7IlKCnrKsIAoSYWJ4CNmK2+tSwjN+phRC7+KMl6VxQlBdVTzMnxEvmaM0XizRUmc/lFwTZuq8TMsXzHOS+0Hopd16cP+0JmzW3y8SNSKTpw6YcaQ8PJ8Nr6S18nzf060rXZn8OxQ7SG7TeNQwJrnh91yMz8Yc+cdhLawWf4r0ULdTy4A8m8QsviOS7wsOPAiHAuOiiFEX0zAlrErFJfwV4QZzR0KRcT59NsKuIrCY6Nt4Ox3A8muEIJllgPYgd96sRur45VQ+OhmeA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Wg4py4zgT1pZym8QgDHNqKkWUac4vWsPKTmJZR6dvaQ=; b=eqEHfo/6sSaT0c6PziUlqrTs798wsCfKFHLBsRMrRe8Wzc35twMPp/gdmPcty6Njfen0GmZVaS5cVBYzANbwUlfDGQ3+P2RA7Cw+vNRC0TxICBElOpzjCo0Y6g9ZdiYY8MwsJy6t+4R2TuZsLWfJ0Futnth467kvisPELVhDChly6gEOT0f/BDneqtFoQlJNjy721kwMbiMeNKIXxvdBG6w9omvDJ64oQPPzZugWFcZ57TOcv3cz7YQlr7pWvv68UXixrFpXx3y1q8cqmbLyihBWYA7KCyxMY0s46+MOHYMrj8ivdf+CiixkQhagKLR40LLCWH3wtUK1bTzpvLcikg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Wg4py4zgT1pZym8QgDHNqKkWUac4vWsPKTmJZR6dvaQ=; b=aRjcdAotUznLYX8eatBHciBMZeFh/21WCn1aygHw9qd95RTkUFNzQjrCsT9LvLjcokteHg5BAbw+8c9jIjaiQgC1F2RJ7O0ORYMQI9XL4s+Q2gDDkhkBf2jhVmSyWQ1l7e+G5wB7WabGPOvTLiK5Mdg2zN3w8DFZAB0XtE2XuRI= Received: from SJ0PR10MB4605.namprd10.prod.outlook.com (2603:10b6:a03:2d9::24) by SJ0PR10MB4495.namprd10.prod.outlook.com (2603:10b6:a03:2d6::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3805.17; Mon, 1 Feb 2021 20:12:36 +0000 Received: from SJ0PR10MB4605.namprd10.prod.outlook.com ([fe80::b1e9:811d:aa8e:f62a]) by SJ0PR10MB4605.namprd10.prod.outlook.com ([fe80::b1e9:811d:aa8e:f62a%6]) with mapi id 15.20.3805.025; Mon, 1 Feb 2021 20:12:36 +0000 Subject: Re: [PATCH v6 7/9] OvmfPkg/CpuHotplugSmm: add CpuEject() To: Laszlo Ersek , devel@edk2.groups.io Cc: imammedo@redhat.com, boris.ostrovsky@oracle.com, Jordan Justen , Ard Biesheuvel , Aaron Young References: <20210129005950.467638-1-ankur.a.arora@oracle.com> <20210129005950.467638-8-ankur.a.arora@oracle.com> From: "Ankur Arora" Message-ID: <14068eb6-dc43-c244-5985-709d685fc750@oracle.com> Date: Mon, 1 Feb 2021 12:12:33 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0 In-Reply-To: X-Originating-IP: [70.36.60.91] X-ClientProxiedBy: MW4PR04CA0063.namprd04.prod.outlook.com (2603:10b6:303:6b::8) To SJ0PR10MB4605.namprd10.prod.outlook.com (2603:10b6:a03:2d9::24) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from [192.168.0.108] (70.36.60.91) by MW4PR04CA0063.namprd04.prod.outlook.com (2603:10b6:303:6b::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3805.16 via Frontend Transport; Mon, 1 Feb 2021 20:12:36 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: f24d0ece-3e1d-4cea-b9ca-08d8c6edb7ac X-MS-TrafficTypeDiagnostic: SJ0PR10MB4495: X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:10000; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: LagWsM7IXRxIPCVkNfdWRHPiz9NzXnhzyQzPj5DRBqq25ASD0sIu+3L3nCamFMIPtkuDn2mKGndw38Ilxf7MQNt7gioi8HzkyZ9fyvr4+Cvi40F5ETR7AC5IDw8XzzPJTMh090WKteGKSlOZ81VpkPZVZufWOC6Gx0pfS/xyR/pQWlUBUFdrebw0GOkFBmWN8hCBZIEibBOJXrioo034yz9gopbIjdJNhPr2Xakb4iK0+gXtHBqVY13/VseoNnAfoFtMEOm5nso/fJIYiUi0dpe5BvnsHC0g1gqEQTuAcgvdDJZtBUAyOlbUlwXy8regiyVs05kFwepB8VQSvOOOnn8FJX/9bggHm3tAuKUKSF49+snck1uthE5vfdkwNZg7tvBovUcsDe2BZwBde9vyat12DKHYjpkmEDkynmDqXfWjq4I7++Ug18MePHfba4cC+KuGVcaj/Ze/zAbg40ISWJkhwO/PonXTEVa9HSYAxq9Fnq50g1MnyOPzU/S+mOExI0VtwE1oOiEa+Y/J9F+SMPi0ODQuDYxkqVVsdS/odN8DgHBM0k3Hr/TLo1IX1Nv8SYjG1xyaOlzplObdfLpKJ7jL835SRuWJ6EHr1rbLvws= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ0PR10MB4605.namprd10.prod.outlook.com;PTR:;CAT:NONE;SFS:(366004)(39860400002)(136003)(376002)(396003)(346002)(26005)(6666004)(478600001)(107886003)(16526019)(6486002)(31696002)(186003)(53546011)(86362001)(5660300002)(83380400001)(31686004)(4326008)(36756003)(8936002)(2616005)(54906003)(2906002)(316002)(956004)(66476007)(66946007)(66556008)(8676002)(16576012)(43740500002)(45980500001);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData: =?utf-8?B?R2VaMkIwZSthMXlGcG13bGhkb0VMYmwrZzNrQUJReUtNTW8zM3lia2I1NXVD?= =?utf-8?B?VmpKOFU5STQ0ellBTDNBZnhtYy83SVVzVENGcVpnQWExQWd6SVpsWGw3Rm8r?= =?utf-8?B?Z1RmbzBuUnJPMExlNWhPNkpEK0FKdklZRmI3YVBQSlZQWFRoQ29HNFA4OVpo?= =?utf-8?B?a2FUYXBOaS9iSHFRaXVqR0NGUWVpdERzTEFLVG1OdHlTcUdLOEVEUS94TjA0?= =?utf-8?B?WkNHQ3Rnbmo3N3JMcVM1aHl3UXUwVmpRaDdLUUVrN2tmTyszYlJ3SlpkZHQx?= =?utf-8?B?UTNnQ3l6bHhld0lqdWF6QWlkWFkvSmZqbHZhaFZjVlJsWmQ5WHJJLzJ5aEJH?= =?utf-8?B?a2ZxL09IWEpycXU2eExRWThjQ3lzZXY1ZlhrNDJvWjVPcXRFRkMvUmxTMGNG?= =?utf-8?B?NXoyT1lnSlJ1YWh5MUJQQndiWVpYaHYrR0NZUFZYdWwxQndlUXkyNEJEM213?= =?utf-8?B?cS9oTE0wZnFCMjVUay9CTFl6Q2lLN09LQm0yaFpoMWpha0k5bjNSaWJRQVFK?= =?utf-8?B?Y3dqNHVrRUJMS3ZlK2crQW5rajhOWEdQVERLK051VnRHdDV4WjNrT0NUSDdI?= =?utf-8?B?K3U4OURKTVdNMm5PMlJKSVQxVXp4WTAwQnBla2hadFQ4VjArVmNxbGU5Skxs?= =?utf-8?B?QWhBVi9WTk1xcXFPWlNmQnp2MS8wYTc5V1dwcTQrSC9hWW1idkhEb08ydWl5?= =?utf-8?B?Rk1RTnNlOFFNb1Y3NXp1OXAycGFTcnNHeEppNWY4NTQ2aTVvemllRTl0UDhN?= =?utf-8?B?N0NtRkhtQ0pkQlhDNENkMTdsWW1vMm1WcEhkdytEajhQSWtGcDZYWUk1bkh0?= =?utf-8?B?eVV2bVlWR3dkSjRBQy9od3ZWQ2ttM2NnWGxnTlBUOWVFc2FyRHBnQk9KOU9Q?= =?utf-8?B?ZHIyNHlSUHpCZVl4amF1cmpzNERRcUwxeHJabng2R25iSWRnK2hGa3hUVS94?= =?utf-8?B?UWpFN2ErY1U2cGhFUEtOOVRoVDh3S0dxOWZxSVpCR0VBM2tkV1RNaC92NDRE?= =?utf-8?B?OW05NUYrM0plQ3hYVjBoOUVJT3RFUUtUWThDVzJzQ1l4UnZ5Ni9FWjNvWmEv?= =?utf-8?B?TEZZR2FsQ0wydGljZC9nakVMeTdaZ0NKV0xyaUNKNVh0ZndkY0NJeWNVSXFF?= =?utf-8?B?Wkp4K3FOdlZMcERjMmh4akZhR0RSMndDdkJtRUhNU2pNNFYvL0h6VHdKbUR5?= =?utf-8?B?SkhUR0FCdiswZUM0L1lJQjJacmhQcDZUcmY0V013TWlUcTNQNHlSa0N5WlB3?= =?utf-8?B?U2tuTWJWSlA4YTJVai9adjFSZFFub255VXU1SDVMNnhrcXp1K2hLVktleWVm?= =?utf-8?B?VTRZVjJ3QmNEcHUrOVVrdk5rVFoyRDRpQlI3S2RrdE5SZnQxYVdXWXNoTzlC?= =?utf-8?B?T25DVm9qQVVlMTd6WmIyR1NFRzlkOTVBTHZaMDkwRmxWaXMxSG85bk1mTWtR?= =?utf-8?Q?hHRvUhDl?= X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: f24d0ece-3e1d-4cea-b9ca-08d8c6edb7ac X-MS-Exchange-CrossTenant-AuthSource: SJ0PR10MB4605.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 01 Feb 2021 20:12:36.5617 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 6QeZyKmEQueXZQMo6mdMMTV1gyTUQiGfF4EaE45iGswT+sXEL0wouDrE3Y/EYePzn3CHpKYOo6sEw1WiOw1wOM0ISvbi+IhxwZsjXlvTT0o= X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ0PR10MB4495 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=9882 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 spamscore=0 phishscore=0 suspectscore=0 mlxlogscore=999 bulkscore=0 mlxscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2102010108 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=9882 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 malwarescore=0 mlxlogscore=999 mlxscore=0 priorityscore=1501 spamscore=0 impostorscore=0 clxscore=1015 suspectscore=0 lowpriorityscore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2102010109 Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit On 2021-02-01 11:08 a.m., Laszlo Ersek wrote: > apologies, I've got more comments here: > > On 01/29/21 01:59, Ankur Arora wrote: > >> /** >> + CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit(), >> + on each CPU at exit from SMM. >> + >> + If, the executing CPU is not being ejected, nothing to be done. >> + If, the executing CPU is being ejected, wait in a CpuDeadLoop() >> + until ejected. >> + >> + @param[in] ProcessorNum Index of executing CPU. >> + >> +**/ >> +VOID >> +EFIAPI >> +CpuEject ( >> + IN UINTN ProcessorNum >> + ) >> +{ >> + // >> + // APIC ID is UINT32, but mCpuHotEjectData->ApicIdMap[] is UINT64 >> + // so use UINT64 throughout. >> + // >> + UINT64 ApicId; >> + >> + ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum]; >> + if (ApicId == CPU_EJECT_INVALID) { >> + return; >> + } >> + >> + // >> + // CPU(s) being unplugged get here from SmmCpuFeaturesSmiRendezvousExit() >> + // after having been cleared to exit the SMI by the monarch and thus have >> + // no SMM processing remaining. >> + // >> + // Given that we cannot allow them to escape to the guest, we pen them >> + // here until the SMM monarch tells the HW to unplug them. >> + // >> + CpuDeadLoop (); >> +} > > (15) There is no such function as SmmCpuFeaturesSmiRendezvousExit() -- > it's SmmCpuFeaturesRendezvousExit(). > > (16) This function uses a data structure for communication between BSP > and APs -- mCpuHotEjectData->ApicIdMap is modified in UnplugCpus() on > the BSP, and checked above by the APs (too). > > What guarantees the visibility of mCpuHotEjectData->ApicIdMap? I was banking on SmiRendezvous() explicitly signalling that all processing on the BSP was done before any AP will look at mCpuHotEjectData in SmmCpuFeaturesRendezvousExit(). 1716 // 1717 // Wait for BSP's signal to exit SMI 1718 // 1719 while (*mSmmMpSyncData->AllCpusInSync) { 1720 CpuPause (); 1721 } 1722 } 1723 1724 Exit: 1725 SmmCpuFeaturesRendezvousExit (CpuIndex); > > I think we might want to use InterlockedCompareExchange64() in both > EjectCpu() and UnplugCpus() (and make "ApicIdMap" volatile, in > addition). InterlockedCompareExchange64() can be used just for > comparison as well, by passing ExchangeValue=CompareValue. Speaking specifically about the ApicIdMap, I'm not sure I fully agree (assuming my comment just above is correct.) The only AP (reader) ApicIdMap deref is here: CpuEject(): 218 ApicId = mCpuHotEjectData->ApicIdMap[ProcessorNum]; For the to-be-ejected-AP, this value can only move from valid-APIC-ID (=> wait in CpuDeadLoop()) -> CPU_EJECT_INVALID. Given that, by the time the worker does the write on line 254, this AP is guaranteed to be dead already, I don't think there's any scenario where the to-be-ejected-AP can see anything other than a valid-APIC-ID. 241 QemuCpuhpWriteCpuSelector (mMmCpuIo, (APIC_ID) RemoveApicId); 242 QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECTED); 243 244 // 245 // Compiler barrier to ensure the next store isn't reordered 246 // 247 MemoryFence (); 248 249 // 250 // Clear the eject status for CpuIndex to ensure that an invalid 251 // SMI later does not end up trying to eject it or a newly 252 // hotplugged CpuIndex does not go into the dead loop. 253 // 254 mCpuHotEjectData->ApicIdMap[CpuIndex] = CPU_EJECT_INVALID; For APs that are not being ejected, they will always see CPU_EJECT_INVALID since the writer never changes that. The one scenario in which bad things could happen is if entries in the ApicIdMap are unaligned (or if the compiler or cpu-arch tears aligned writes). > > (17) I think a similar observation applies to the "Handler" field too, > as APs call it, while the BSP keeps flipping it between NULL and a real > function address. We might have to turn that field into an From a real function address, to NULL is the problem part right? (Same argument as above for the transition in UnplugCpus() from NULL -> function-address.) > EFI_PHYSICAL_ADDRESS (just a fancy name for UINT64), and use > InterlockedCompareExchange64() again. AFAICS, these are the problematic derefs: SmmCpuFeaturesRendezvousExit(): 450 if (mCpuHotEjectData == NULL || 451 mCpuHotEjectData->Handler == NULL) { 452 return; and problematic assignments: 266 // 267 // We are done until the next hot-unplug; clear the handler. 268 // 269 mCpuHotEjectData->Handler = NULL; 270 return; 271 } Here as well, I've been banking on aligned writes such that the APs would only see the before or after value not an intermediate value. Thanks Ankur > > Thanks > Laszlo >