Message-ID: <5a82610a-4898-24e9-9687-be985d5cea36@redhat.com>
Date: Thu, 26 Oct 2023 15:36:22 +0200
Subject: Re: [edk2-devel] [Patch V3] UefiCpuPkg/MpInitLib: Wait for all APs to finish initialization
To: devel@edk2.groups.io, yuanhao.xie@intel.com
Cc: Ray Ni, Eric Dong, Rahul Kumar, Tom Lendacky
References: <20231025114216.2824-1-yuanhao.xie@intel.com>
From: Laszlo Ersek
In-Reply-To: <20231025114216.2824-1-yuanhao.xie@intel.com>

On 10/25/23 13:42, Yuanhao Xie wrote:
> Aim:
> - To solve the assertion that checks if CpuMpData->FinishedCount
> equals (CpuMpData->CpuCount - 1). The assertion arises from a timing
> discrepancy between the BSP's completion of startup signal checks and
> the APs' incrementation of the FinishedCount.
> - This patch also ensures that "finished" reporting from the APs is as
> late as possible.
>
> More specifically:
>
> In the SwitchApContext() function, the BSP triggers
> the startup signal and checks whether the APs have received it. After
> completing this check, the BSP then verifies if the FinishedCount is
> equal to CpuCount-1.
>
> On the AP side, upon receiving the startup signal, they invoke
> SwitchContextPerAp() and increase the FinishedCount to indicate their
> activation. However, even when all APs have received the startup signal,
> they might not have finished incrementing the FinishedCount. This timing
> gap results in the triggering of the assertion.
>
> Solution:
> Instead of the assertion, use a while loop that waits until all the APs
> have incremented the FinishedCount.
>
> Fixes: 964a4f032dcd
>
> Signed-off-by: Yuanhao Xie
> Cc: Ray Ni
> Cc: Eric Dong
> Cc: Rahul Kumar
> Cc: Tom Lendacky
> ---
>  UefiCpuPkg/Library/MpInitLib/MpLib.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> index 6f1456cfe1..9a6ec5db5c 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> @@ -913,8 +913,8 @@ DxeApEntryPoint (
>    UINTN ProcessorNumber;
>
>    GetProcessorNumber (CpuMpData, &ProcessorNumber);
> -  InterlockedIncrement ((UINT32 *)&CpuMpData->FinishedCount);
>    RestoreVolatileRegisters (&CpuMpData->CpuData[0].VolatileRegisters, FALSE);
> +  InterlockedIncrement ((UINT32 *)&CpuMpData->FinishedCount);
>    PlaceAPInMwaitLoopOrRunLoop (
>      CpuMpData->ApLoopMode,
>      CpuMpData->CpuData[ProcessorNumber].StartupApSignal,
> @@ -2201,7 +2201,12 @@ MpInitLibInitialize (
>      // looping process there.
>      //
>      SwitchApContext (MpHandOff);
> -    ASSERT (CpuMpData->FinishedCount == (CpuMpData->CpuCount - 1));
> +    //
> +    // Wait for all APs finished initialization
> +    //
> +    while (CpuMpData->FinishedCount < (CpuMpData->CpuCount - 1)) {
> +      CpuPause ();
> +    }
>
>      //
>      // Set Apstate as Idle, otherwise Aps cannot be waken-up again.

Reviewed-by: Laszlo Ersek

The change is not testable using OVMF, because OVMF (intentionally)
uses ApLoopMode=ApInHltLoop, and in that case, neither hunk is reachable.
(Accordingly, the log message reports WaitLoopExecutionMode as zero.)

I've still regression-tested this change, with my usual configs:

- OVMF IA32 with SMM_REQUIRE, on q35
- OVMF IA32X64 with SMM_REQUIRE, on q35
- OVMF X64 without SMM_REQUIRE, on pc (i440fx)

The test goes like

- boot with 1 cold-plugged plus 2 more hot-pluggable VCPUs
- [*]
- hotplug 2 VCPUs
- [*]
- hot-unplug 2 VCPUs
- [*]
- poweroff

where [*] stands for:

- run efibootmgr, bound to each online VCPU in separation
- ACPI S3 suspend/resume
- run efibootmgr, bound to each online VCPU in separation

I used Fedora and RHEL guests.

So:

Regression-tested-by: Laszlo Ersek

Thanks
Laszlo
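
Editorial illustration (not part of the original message): the race described in the quoted commit message, and the wait loop that replaces the assertion, can be sketched in user space. This is a minimal sketch under stated assumptions, not the MpInitLib code: POSIX threads stand in for the APs, C11 atomics stand in for InterlockedIncrement, sched_yield() stands in for CpuPause(), and the names finished_count, startup_signal, ap_entry and run_demo are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_APS 16u

/* Hypothetical stand-in for CpuMpData->FinishedCount. */
static atomic_uint finished_count;
/* Hypothetical stand-in for the startup signal raised by the BSP. */
static atomic_bool startup_signal;

/* Each "AP" waits for the startup signal, then reports completion. */
static void *ap_entry(void *arg)
{
    (void)arg;
    while (!atomic_load(&startup_signal)) {
        sched_yield();
    }
    /* Any state restore would happen here, so that "finished" is
     * reported as late as possible, as the patch intends. */
    atomic_fetch_add(&finished_count, 1u);
    return NULL;
}

/* The "BSP": start the APs, raise the signal, wait for all of them.
 * Returns the finished count observed after the wait loop. */
unsigned run_demo(unsigned ap_count)
{
    pthread_t aps[MAX_APS];
    unsigned started = 0;
    unsigned seen;

    if (ap_count > MAX_APS) {
        ap_count = MAX_APS;
    }

    atomic_store(&finished_count, 0u);
    atomic_store(&startup_signal, false);

    for (unsigned i = 0; i < ap_count; i++) {
        if (pthread_create(&aps[started], NULL, ap_entry, NULL) == 0) {
            started++;
        }
    }

    atomic_store(&startup_signal, true);

    /* Asserting finished_count == started at this point could fail:
     * an AP may have seen the signal but not yet incremented the
     * counter. Waiting closes that window. */
    while (atomic_load(&finished_count) < started) {
        sched_yield();
    }

    /* After the wait loop, the former assertion condition holds. */
    seen = atomic_load(&finished_count);
    assert(seen == started);

    for (unsigned i = 0; i < started; i++) {
        pthread_join(aps[i], NULL);
    }
    return seen;
}
```

Calling run_demo(4) starts four threads and returns once all four have incremented the counter. Moving the assert above the while loop would reintroduce the timing-dependent failure that the patch removes.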