From: "Dong, Eric"
To: "Dong, Eric", Laszlo Ersek, "edk2-devel@lists.01.org"
CC: "Ni, Ruiyu", Paolo Bonzini
Date: Tue, 24 Oct 2017 15:40:26 +0000
Subject: Re: [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP initialization logic.
References: <1508743358-3640-1-git-send-email-eric.dong@intel.com>
 <1508743358-3640-3-git-send-email-eric.dong@intel.com>
 <4fe39a52-0cd3-de2e-84f2-7363823a1b60@redhat.com>
List-Id: EDK II Development

Laszlo,

I have added more comments on the TimedWaitForApFinish function in this mail.

> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> Dong, Eric
> Sent: Tuesday, October 24, 2017 11:24 PM
> To: Laszlo Ersek; edk2-devel@lists.01.org
> Cc: Ni, Ruiyu; Paolo Bonzini
> Subject: Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for
> AP initialization logic.
>
> Laszlo,
>
> > -----Original Message-----
> > From: Laszlo Ersek [mailto:lersek@redhat.com]
> > Sent: Tuesday, October 24, 2017 6:16 PM
> > To: Dong, Eric; edk2-devel@lists.01.org
> > Cc: Ni, Ruiyu; Paolo Bonzini
> > Subject: Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting
> > for AP initialization logic.
> >
> > CC Paolo
> >
> > On 10/23/17 09:22, Eric Dong wrote:
> > > Current logic always waiting for a specific value to collect all APs
> > > count. This logic may caused some platforms cost too much time to
> > > wait for time out.
> > > This patch add new logic to collect APs count. It adds new variable
> > > NumApsExecuting to detect whether all APs have finished initialization.
> > > Each AP let NumApsExecuting++ when begin to initialize itself and let
> > > NumApsExecuting-- when it finish the initialization. BSP base on
> > > whether NumApsExecuting == 0 to finished the collect AP process.
> > >
> > > Cc: Ruiyu Ni
> > > Contributed-under: TianoCore Contribution Agreement 1.1
> > > Signed-off-by: Eric Dong
> > > ---
> > >  UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc    |  1 +
> > >  UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm |  6 ++++++
> > >  UefiCpuPkg/Library/MpInitLib/MpLib.c           | 20 ++++++++++++++------
> > >  UefiCpuPkg/Library/MpInitLib/MpLib.h           |  1 +
> > >  UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc     |  3 ++-
> > >  UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm  |  6 ++++++
> > >  6 files changed, 30 insertions(+), 7 deletions(-)
> >
> > I was out of office yesterday, and did not get a chance to comment on
> > this patch.
> >
> > In a virtualization guest, I see the following problem with the patch:
> >
> > VCPUs (virtual CPUs) may not be scheduled by the hypervisor similarly
> > to how physical CPUs behave on a physical board. It is possible that a
> > VCPU starts up and even finishes its initialization routine before
> > another VCPU starts running at all.
> >
> > Therefore the locked NumApsExecuting counter may hit zero, even
> > multiple times, before all APs have finished initializing.
> >
> > In OVMF, we query QEMU about the exact number of virtual processors,
> > in PlatformPei. So OVMF configures the logical processor count in
> > advance that MpInitLib has to wait for. Correspondingly, we also set
> > the timeout to "infinity".
> >
> > Please see the MaxCpuCountInitialization() function in following commit:
> >
> > https://github.com/tianocore/edk2/commit/45a70db3c3a59
> >
> > In the past, we used to have AP initialization problems in OVMF due to
> > the VCPU scheduling artifacts I mention above. After commit
> > 45a70db3c3a59, things have been stable; it would be nice to keep that
> > working.
> >
> > Please note that simply testing this patch on my end is not sufficient.
> > The AP init problems we used to face were sporadic and also specific
> > to the virtualization host systems (i.e., dependent on the physical
> > hardware and the host kernel).
> >
> > Furthermore:
>
> [[Eric]] Will comments later if needed, need some investigation first.
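
To make the scheduling concern above concrete, here is a minimal sketch in plain C. It is not MpInitLib code: ApEvent and FinishedWhenCounterFirstHitsZero are hypothetical names invented for this illustration. The function replays a fixed order of AP start/finish events against a counter that mimics NumApsExecuting, and reports how many APs had finished when the counter first dropped back to zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event type for this illustration; not part of MpInitLib. */
typedef enum { AP_START, AP_FINISH } ApEvent;

/*
 * Replays a fixed sequence of AP start/finish events against a counter that
 * mimics NumApsExecuting. Returns the number of APs that had finished at the
 * moment the counter first dropped back to zero (or the total number of
 * finished APs if it never does).
 */
unsigned FinishedWhenCounterFirstHitsZero(const ApEvent *events, size_t count)
{
  unsigned executing = 0;  /* stands in for NumApsExecuting */
  unsigned finished  = 0;  /* APs that completed initialization */

  for (size_t i = 0; i < count; i++) {
    if (events[i] == AP_START) {
      executing++;               /* AP entered its init path */
    } else {
      assert(executing > 0);     /* a finish must follow a start */
      executing--;               /* AP left its init path */
      finished++;
      if (executing == 0) {
        return finished;         /* a BSP polling for zero would stop here */
      }
    }
  }
  return finished;
}
```

With two APs scheduled strictly one after the other (start, finish, start, finish), the counter returns to zero after the first AP alone, so a BSP that only polls for zero would count one AP instead of two. If the two APs overlap (start, start, finish, finish), the counter reaches zero only after both are done.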
>
> >
> > > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
> > > b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
> > > index 976af1f..bdfe0d3 100644
> > > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
> > > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
> > > @@ -40,4 +40,5 @@ EnableExecuteDisableLocation equ LockLocation + 30h
> > >  Cr3Location                   equ        LockLocation + 34h
> > >  InitFlagLocation              equ        LockLocation + 38h
> > >  CpuInfoLocation               equ        LockLocation + 3Ch
> > > +NumApsExecutingLocation       equ        LockLocation + 40h
> > >
> > > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> > > b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> > > index 1b9c6a6..2b6c27d 100644
> > > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> > > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> > > @@ -86,6 +86,12 @@ Flat32Start: ; protected mode entry point
> > >
> > >      mov        esi, ebx
> > >
> > > +    ; Increment the number of APs executing here as early as possible
> > > +    ; This is decremented in C code when AP is finished executing
> > > +    mov        edi, esi
> > > +    add        edi, NumApsExecutingLocation
> > > +    lock inc   dword [edi]
> > > +
> > >      mov        edi, esi
> > >      add        edi, EnableExecuteDisableLocation
> > >      cmp        byte [edi], 0
> > > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> > > b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> > > index db923c9..48f930b 100644
> > > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> > > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> > > @@ -662,6 +662,7 @@ ApWakeupFunction (
> > >          // AP finished executing C code
> > >          //
> > >          InterlockedIncrement ((UINT32 *) &CpuMpData->FinishedCount);
> > > +        InterlockedDecrement ((UINT32 *) &CpuMpData->MpCpuExchangeInfo->NumApsExecuting);
> > >
> > >          //
> > >          // Place AP is specified loop mode
> > > @@ -765,6 +766,7 @@ FillExchangeInfoData (
> > >
> > >    ExchangeInfo->CFunction       = (UINTN) ApWakeupFunction;
> > >    ExchangeInfo->ApIndex         = 0;
> > > +  ExchangeInfo->NumApsExecuting = 0;
> > >    ExchangeInfo->InitFlag        = (UINTN) CpuMpData->InitFlag;
> > >    ExchangeInfo->CpuInfo         = (CPU_INFO_IN_HOB *) (UINTN) CpuMpData->CpuInfoInHob;
> > >    ExchangeInfo->CpuMpData       = CpuMpData;
> > > @@ -934,13 +936,19 @@ WakeUpAP (
> > >      }
> > >      if (CpuMpData->InitFlag == ApInitConfig) {
> > >        //
> > > -      // Wait for all potential APs waken up in one specified period
> > > +      // Wait for one potential AP waken up in one specified period
> > >        //
> > > -      TimedWaitForApFinish (
> > > -        CpuMpData,
> > > -        PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
> > > -        PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
> > > -        );
> > > +      if (CpuMpData->CpuCount == 0) {
> > > +        TimedWaitForApFinish (
> > > +          CpuMpData,
> > > +          PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
> > > +          PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
> > > +          );
> > > +      }
> >
> > I don't understand this change. The new comment says,
> >
> >   Wait for *one* potential AP waken up in one specified period
> >
> > However, the second parameter of TimedWaitForApFinish(), namely
> > "FinishedApLimit", gets the same value as before:
> >
> >   PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1
> >
> > It means that all of the (possible) APs are waited-for, just the same as
> > before.
>

[[Eric]] We still use the original PCD and do not change its value, because
this value needs to be tuned for each platform. So we think there is no need
to change the default value of this PCD.

TimedWaitForApFinish returns in two cases: either the specified number of
CPUs (PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1) has been found, or the
specified time (PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)) has elapsed. I
think that with the old solution this function returns when it has found the
specified number of CPUs (except in the CPU hotplug case, where the value may
be bigger than the actual number of CPUs). With the new solution, after the
PCD value is updated for the platform, it returns when the specified timeout
is reached.

> [[Eric]] This patch changes the collect AP count logic, original solution always
> waits for a specific time to let all APs start up. If the input CPU number
> (PcdGet32(PcdCpuMaxLogicalProcessorNumber) - 1) have been found or
> after a specific time(PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)). BSP
> will not wait anymore and use CpuMpData->CpuCount as the found AP
> count.
>
> New logic also wait for a specific time, but this time is smaller than the
> original one. It just wait for the first AP(any AP) begin to do the
> initialization( do CpuMpData->MpCpuExchangeInfo->NumApsExecuting++
> means it begin to do the initialization). When Ap finishes initialization, it will
> do CpuMpData->MpCpuExchangeInfo->NumApsExecuting--. So after BSP
> waits for a specific time at first, it just needs to check whether
> CpuMpData->MpCpuExchangeInfo->NumApsExecuting == 0 to know whether all Aps have
> finished initialization. Here we still use the original PCD
> (PcdCpuApInitTimeOutInMicroSeconds) for the new time value.
>
> When one AP do the initialization, it will also do CpuMpData->CpuCount++.
> So here I check whether CpuMpData->CpuCount != 0 to know whether APs
> already begin to do the initialization. If yes, I not need to do the timeout
> waiting anymore, just check the
> CpuMpData->MpCpuExchangeInfo->NumApsExecuting to know whether all Aps
> have finished initialization.
>
> >
> > In other words, I think the patch does not correctly implement what
> > the commit message says -- and for QEMU / OVMF, that's actually a good
> > thing at the moment, because a correct implementation of the
> > description would likely break on QEMU.
> >
> > Thanks
> > Laszlo
> >
> > > +
> > > +      while (CpuMpData->MpCpuExchangeInfo->NumApsExecuting != 0) {
> > > +        CpuPause();
> > > +      }
> > >      } else {
> > >        //
> > >        // Wait all APs waken up if this is not the 1st broadcast of SIPI
> > > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h
> > > b/UefiCpuPkg/Library/MpInitLib/MpLib.h
> > > index e41d2db..d13d5c0 100644
> > > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
> > > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
> > > @@ -176,6 +176,7 @@ typedef struct {
> > >    UINTN                 Cr3;
> > >    UINTN                 InitFlag;
> > >    CPU_INFO_IN_HOB       *CpuInfo;
> > > +  UINTN                 NumApsExecuting;
> > >    CPU_MP_DATA           *CpuMpData;
> > >    UINTN                 InitializeFloatingPointUnitsAddress;
> > >  } MP_CPU_EXCHANGE_INFO;
> > > diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
> > > b/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
> > > index 114f4e0..d255ca5 100644
> > > --- a/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
> > > +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
> > > @@ -40,5 +40,6 @@ EnableExecuteDisableLocation equ LockLocation + 5Ch
> > >  Cr3Location                   equ        LockLocation + 64h
> > >  InitFlagLocation              equ        LockLocation + 6Ch
> > >  CpuInfoLocation               equ        LockLocation + 74h
> > > -InitializeFloatingPointUnitsAddress equ  LockLocation + 84h
> > > +NumApsExecutingLocation       equ        LockLocation + 7Ch
> > > +InitializeFloatingPointUnitsAddress equ  LockLocation + 8Ch
> > >
> > > diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> > > b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> > > index 4ada649..21d2786 100644
> > > --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> > > +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> > > @@ -124,6 +124,12 @@ LongModeStart:
> > >      cmp        qword [edi], 1 ; ApInitConfig
> > >      jnz        GetApicId
> > >
> > > +    ; Increment the number of APs executing here as early as possible
> > > +    ; This is decremented in C code when AP is finished executing
> > > +    mov        edi, esi
> > > +    add        edi, NumApsExecutingLocation
> > > +    lock inc   dword [edi]
> > > +
> > >      ; AP init
> > >      mov        edi, esi
> > >      add        edi, LockLocation
> > >
> >
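
The two return cases of TimedWaitForApFinish described in this mail can be sketched as a small simulation. This is plain C under stated assumptions, not the MpInitLib implementation: WaitOutcome, SimTimedWaitForApFinish, the per-AP finish times and the polling interval are all hypothetical and exist only to show when the wait ends.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not part of MpInitLib. */
typedef struct {
  bool     LimitReached;   /* true: enough APs finished; false: timed out */
  uint32_t FinishedCount;  /* APs finished when the wait ended */
  uint64_t WaitedUs;       /* simulated time spent waiting */
} WaitOutcome;

/*
 * Simulates a timed wait with two exit conditions:
 *   1. at least finishedApLimit APs have finished, or
 *   2. timeLimitUs microseconds have elapsed.
 * finishUs[i] is the (assumed) time at which AP i finishes initialization.
 */
WaitOutcome SimTimedWaitForApFinish(const uint32_t *finishUs, size_t apCount,
                                    uint32_t finishedApLimit,
                                    uint32_t timeLimitUs, uint32_t pollUs)
{
  WaitOutcome out = { false, 0, 0 };

  if (pollUs == 0) {
    pollUs = 1;  /* avoid an endless loop on a zero polling interval */
  }

  for (;;) {
    /* Count the APs that have finished by the current simulated time. */
    out.FinishedCount = 0;
    for (size_t i = 0; i < apCount; i++) {
      if (finishUs[i] <= out.WaitedUs) {
        out.FinishedCount++;
      }
    }
    if (out.FinishedCount >= finishedApLimit) {
      out.LimitReached = true;   /* case 1: the expected AP count was found */
      return out;
    }
    if (out.WaitedUs >= timeLimitUs) {
      return out;                /* case 2: the timeout expired */
    }
    out.WaitedUs += pollUs;      /* stands in for one CpuPause() poll step */
  }
}
```

For example, with three APs finishing at 100, 200 and 300 us, a limit of 3, a 1000 us timeout and a 50 us polling step, the wait ends at 300 us because the limit is reached. With a limit of 7 (a maximum processor number larger than what is actually present), the same three APs only end the wait once the full 1000 us timeout has expired.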