From: Laszlo Ersek <lersek@redhat.com>
To: "Dong, Eric" <eric.dong@intel.com>,
"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Cc: "Ni, Ruiyu" <ruiyu.ni@intel.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Jeff Fan <vanjeff_919@hotmail.com>
Subject: Re: [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP initialization logic.
Date: Tue, 24 Oct 2017 19:40:21 +0200 [thread overview]
Message-ID: <1329f926-4c33-aaca-108a-7d15ae0503bc@redhat.com> (raw)
In-Reply-To: <ED077930C258884BBCB450DB737E66224AA3DA63@shsmsx102.ccr.corp.intel.com>
Hi Eric,
On 10/24/17 17:23, Dong, Eric wrote:
> Laszlo,
>
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>> Sent: Tuesday, October 24, 2017 6:16 PM
>> To: Dong, Eric <eric.dong@intel.com>; edk2-devel@lists.01.org
>> Cc: Ni, Ruiyu <ruiyu.ni@intel.com>; Paolo Bonzini <pbonzini@redhat.com>
>> Subject: Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for
>> AP initialization logic.
>>
>> CC Paolo
>>
>> On 10/23/17 09:22, Eric Dong wrote:
>>> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> index 976af1f..bdfe0d3 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> @@ -40,4 +40,5 @@ EnableExecuteDisableLocation equ LockLocation
>> + 30h
>>> Cr3Location equ LockLocation + 34h
>>> InitFlagLocation equ LockLocation + 38h
>>> CpuInfoLocation equ LockLocation + 3Ch
>>> +NumApsExecutingLocation equ LockLocation + 40h
>>>
>>> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> index 1b9c6a6..2b6c27d 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> @@ -86,6 +86,12 @@ Flat32Start: ; protected mode
>> entry point
>>>
>>> mov esi, ebx
>>>
>>> + ; Increment the number of APs executing here as early as possible
>>> + ; This is decremented in C code when AP is finished executing
>>> + mov edi, esi
>>> + add edi, NumApsExecutingLocation
>>> + lock inc dword [edi]
>>> +
>>> mov edi, esi
>>> add edi, EnableExecuteDisableLocation
>>> cmp byte [edi], 0
>>> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> index db923c9..48f930b 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> @@ -662,6 +662,7 @@ ApWakeupFunction (
>>> // AP finished executing C code
>>> //
>>> InterlockedIncrement ((UINT32 *) &CpuMpData->FinishedCount);
>>> + InterlockedDecrement ((UINT32 *)
>>> + &CpuMpData->MpCpuExchangeInfo->NumApsExecuting);
>>>
>>> //
>>> // Place AP is specified loop mode @@ -765,6 +766,7 @@
>>> FillExchangeInfoData (
>>>
>>> ExchangeInfo->CFunction = (UINTN) ApWakeupFunction;
>>> ExchangeInfo->ApIndex = 0;
>>> + ExchangeInfo->NumApsExecuting = 0;
>>> ExchangeInfo->InitFlag = (UINTN) CpuMpData->InitFlag;
>>> ExchangeInfo->CpuInfo = (CPU_INFO_IN_HOB *) (UINTN)
>> CpuMpData->CpuInfoInHob;
>>> ExchangeInfo->CpuMpData = CpuMpData;
>>> @@ -934,13 +936,19 @@ WakeUpAP (
>>> }
>>> if (CpuMpData->InitFlag == ApInitConfig) {
>>> //
>>> - // Wait for all potential APs waken up in one specified period
>>> + // Wait for one potential AP waken up in one specified period
>>> //
>>> - TimedWaitForApFinish (
>>> - CpuMpData,
>>> - PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
>>> - PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
>>> - );
>>> + if (CpuMpData->CpuCount == 0) {
>>> + TimedWaitForApFinish (
>>> + CpuMpData,
>>> + PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
>>> + PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
>>> + );
>>> + }
>>
>> I don't understand this change. The new comment says,
>>
>> Wait for *one* potential AP waken up in one specified period
>>
>> However, the second parameter of TimedWaitForApFinish(), namely
>> "FinishedApLimit", gets the same value as before:
>>
>> PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1
>>
>> It means that all of the (possible) APs are waited-for, just the same
>> as before.
>
> [[Eric]] This patch changes the collect AP count logic, original
> solution always waits for a specific time to let all APs start up. If
> the input CPU number (PcdGet32(PcdCpuMaxLogicalProcessorNumber) - 1)
> have been found or after a specific time(PcdGet32
> (PcdCpuApInitTimeOutInMicroSeconds)). BSP will not wait anymore and use
> CpuMpData->CpuCount as the found AP count.
>
> New logic also wait for a specific time, but this time is smaller than
> the original one. It just wait for the first AP(any AP) begin to do the
> initialization( do CpuMpData->MpCpuExchangeInfo->NumApsExecuting++ means
> it begin to do the initialization). When Ap finishes initialization, it
> will do CpuMpData->MpCpuExchangeInfo->NumApsExecuting--. So after BSP
> waits for a specific time at first, it just needs to check whether
> CpuMpData->MpCpuExchangeInfo->NumApsExecuting == 0 to know whether all
> Aps have finished initialization. Here we still use the original PCD
> (PcdCpuApInitTimeOutInMicroSeconds) for the new time value.
>
> When one AP do the initialization, it will also do
> CpuMpData->CpuCount++. So here I check whether CpuMpData->CpuCount != 0
> to know whether APs already begin to do the initialization. If yes, I
> not need to do the time out waiting anymore, just check the
> CpuMpData->MpCpuExchangeInfo->NumApsExecuting to know whether all Aps
> have finished initialization.
Thanks for the explanation.
The "NumApsExecuting" increment / decrement logic in this patch expects
that the APs work as follows:
(1) After the INIT-SIPI-SIPI, there may be a delay before the APs start
running. During this delay, the BSP may or may not reach the code in
question. The (CpuMpData->CpuCount != 0) check is supposed to take this
into account.
(2) After at least one AP has started running, the logic expects
"NumApsExecuting" to monotonically grow for a while, and then to
monotonically decrease, back to zero. For example, if we have 1 BSP and
7 APs, the BSP logic more or less expects the following values in
"NumApsExecuting":
1; 2; 3; 4; 5; 6; 7;
6; 5; 4; 3; 2; 1; 0
While this may be a valid expectation for physical processors (which
actually run in parallel, in the physical world), in a virtual machine,
it is not guaranteed. Dependent on hypervisor scheduling artifacts, it
is possible that, say, three APs start up *and finish* before the
remaining four APs start up *at all*. The INIT-SIPI-SIPI marks all APs
for execution / scheduling in the hypervisor, yes, but the actual code
execution may commence a lot later. For example, the BSP may witness the
following series of values in "NumApsExecuting":
1; 2; 3;
2; 1; 0;
1; 2; 3; 4;
3; 2; 1; 0
and the BSP could think that there are 3 APs only, when it sees the
first 0 value.
Now, let me get back to the use case that actually matters to OVMF and
QEMU. As I mentioned earlier, OVMF knows from QEMU the exact number of
APs. If there is a way for OVMF to tell MpInitLib to wait for this exact
number of APs -- no matter how long it takes --, then that's what I
would like to use.
Please see the original discussion around OVMF commit 45a70db3c3a59:
* In version 1, I introduced a new PCD called
PcdCpuKnownLogicalProcessorNumber
and I modified MpInitLib to wait for this AP number, ignoring timeout
completely, if the PCD was set:
https://lists.01.org/pipermail/edk2-devel/2016-November/005076.html
However, Jeff suggested to use the preexistent PCD
"PcdCpuMaxLogicalProcessorNumber" for this purpose:
https://lists.01.org/pipermail/edk2-devel/2016-November/005092.html
* In version 2, I used the PCD suggested by Jeff, but I also introduced
a new special value for the timeout. Timeout=0 would mean "infinity":
https://lists.01.org/pipermail/edk2-devel/2016-November/005209.html
Jeff didn't like the special value, and suggested that OVMF simply use
MAX_UINT32 microseconds (approx. 71 minutes) instead of "infinity":
https://lists.01.org/pipermail/edk2-devel/2016-November/005266.html
* In v3, I implemented that, and that was pushed as:
- UefiCpuPkg commit 6e1987f19af7 ("UefiCpuPkg/MpInitLib: wait no
longer than necessary for initial AP startup", 2016-11-24)
- and OvmfPkg commit 45a70db3c3a5 ("OvmfPkg/PlatformPei: take VCPU
count from QEMU and configure MpInitLib", 2016-11-24).
So, again, the use case that I would like to cover is:
* the exact number of APs is known at boot, to OvmfPkg/PlatformPei,
* after the very first INIT-SIPI-SIPI, MpInitLib should wait for exactly
this number of APs to "report in", regardless of:
- how long it takes,
- in what order / sequence the APs report in. (Again, please remember
that some APs may complete the initialization before other APs
execute their very first instruction.)
* Preferably, the case should be handled when the processor count grows
from cold boot to S3 resume (VCPU hotplug). TianoCore BZ#251 used to
track this use case separately.
(
Jeff closed BZ#251 today, with the argument that commit 0594ec417c89
(this patch) finds the CPU count dynamically anyway, so a platform can
simply set "PcdCpuMaxLogicalProcessorNumber" to the maximum possible value.
This argument does not work in a virtual machine, because commit
0594ec417c89 (this patch) may in fact not find the VCPU count correctly
-- in a VM, (NumApsExecuting==0) does not imply that all APs have finished.
)
Thanks!
Laszlo
next prev parent reply other threads:[~2017-10-24 17:36 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-23 7:22 [Patch 0/2] Enhance collect AP Count logic Eric Dong
2017-10-23 7:22 ` [Patch 1/2] UefiCpuPkg/MpInitLib: Change AP Index variable name Eric Dong
2017-10-24 6:00 ` Ni, Ruiyu
2017-10-23 7:22 ` [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP initialization logic Eric Dong
2017-10-24 6:02 ` Ni, Ruiyu
2017-10-24 6:27 ` 答复: " Fan Jeff
2017-10-24 7:18 ` Ni, Ruiyu
2017-10-24 7:32 ` 答复: " Fan Jeff
2017-10-24 10:15 ` Laszlo Ersek
2017-10-24 14:24 ` 答复: " Fan Jeff
2017-10-24 16:29 ` Laszlo Ersek
2017-10-24 15:23 ` Dong, Eric
2017-10-24 15:40 ` Dong, Eric
2017-10-24 17:40 ` Laszlo Ersek [this message]
2017-10-24 22:30 ` Brian J. Johnson
2017-10-25 5:35 ` 答复: " Fan Jeff
2017-10-25 5:32 ` Fan Jeff
2017-10-25 5:42 ` Dong, Eric
2017-10-25 15:07 ` Laszlo Ersek
2017-10-26 1:13 ` Dong, Eric
2017-10-26 20:48 ` Brian J. Johnson
2017-10-27 1:31 ` Dong, Eric
2017-10-24 7:31 ` 答复: [Patch 0/2] Enhance collect AP Count logic Fan Jeff
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1329f926-4c33-aaca-108a-7d15ae0503bc@redhat.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox