public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: Fan Jeff <vanjeff_919@hotmail.com>
To: Laszlo Ersek <lersek@redhat.com>,
	"Dong, Eric" <eric.dong@intel.com>,
	"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Cc: "Ni, Ruiyu" <ruiyu.ni@intel.com>, Paolo Bonzini <pbonzini@redhat.com>
Subject: 答复: [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP initialization logic.
Date: Wed, 25 Oct 2017 05:32:26 +0000	[thread overview]
Message-ID: <CY1PR19MB0283462DE5EBD55CC9D9D854D7440@CY1PR19MB0283.namprd19.prod.outlook.com> (raw)
In-Reply-To: <1329f926-4c33-aaca-108a-7d15ae0503bc@redhat.com>

Laszlo,

I agree BZ#251 can not be closed if commit 0594ec417c89 does not work for OVMF.

I will reopen it.

Jeff

发件人: Laszlo Ersek<mailto:lersek@redhat.com>
发送时间: 2017年10月25日 1:40
收件人: Dong, Eric<mailto:eric.dong@intel.com>; edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>
抄送: Ni, Ruiyu<mailto:ruiyu.ni@intel.com>; Paolo Bonzini<mailto:pbonzini@redhat.com>; Jeff Fan<mailto:vanjeff_919@hotmail.com>
主题: Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP initialization logic.

Hi Eric,

On 10/24/17 17:23, Dong, Eric wrote:
> Laszlo,
>
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>> Sent: Tuesday, October 24, 2017 6:16 PM
>> To: Dong, Eric <eric.dong@intel.com>; edk2-devel@lists.01.org
>> Cc: Ni, Ruiyu <ruiyu.ni@intel.com>; Paolo Bonzini <pbonzini@redhat.com>
>> Subject: Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for
>> AP initialization logic.
>>
>> CC Paolo
>>
>> On 10/23/17 09:22, Eric Dong wrote:

>>> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> index 976af1f..bdfe0d3 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
>>> @@ -40,4 +40,5 @@ EnableExecuteDisableLocation  equ        LockLocation
>> + 30h
>>>  Cr3Location                   equ        LockLocation + 34h
>>>  InitFlagLocation              equ        LockLocation + 38h
>>>  CpuInfoLocation               equ        LockLocation + 3Ch
>>> +NumApsExecutingLocation       equ        LockLocation + 40h
>>>
>>> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> index 1b9c6a6..2b6c27d 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
>>> @@ -86,6 +86,12 @@ Flat32Start:                                   ; protected mode
>> entry point
>>>
>>>      mov        esi, ebx
>>>
>>> +    ; Increment the number of APs executing here as early as possible
>>> +    ; This is decremented in C code when AP is finished executing
>>> +    mov        edi, esi
>>> +    add        edi, NumApsExecutingLocation
>>> +    lock inc   dword [edi]
>>> +
>>>      mov         edi, esi
>>>      add         edi, EnableExecuteDisableLocation
>>>      cmp         byte [edi], 0
>>> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> index db923c9..48f930b 100644
>>> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>>> @@ -662,6 +662,7 @@ ApWakeupFunction (
>>>      // AP finished executing C code
>>>      //
>>>      InterlockedIncrement ((UINT32 *) &CpuMpData->FinishedCount);
>>> +    InterlockedDecrement ((UINT32 *)
>>> + &CpuMpData->MpCpuExchangeInfo->NumApsExecuting);
>>>
>>>      //
>>>      // Place AP is specified loop mode @@ -765,6 +766,7 @@
>>> FillExchangeInfoData (
>>>
>>>    ExchangeInfo->CFunction       = (UINTN) ApWakeupFunction;
>>>    ExchangeInfo->ApIndex         = 0;
>>> +  ExchangeInfo->NumApsExecuting = 0;
>>>    ExchangeInfo->InitFlag        = (UINTN) CpuMpData->InitFlag;
>>>    ExchangeInfo->CpuInfo         = (CPU_INFO_IN_HOB *) (UINTN)
>> CpuMpData->CpuInfoInHob;
>>>    ExchangeInfo->CpuMpData       = CpuMpData;
>>> @@ -934,13 +936,19 @@ WakeUpAP (
>>>      }
>>>      if (CpuMpData->InitFlag == ApInitConfig) {
>>>        //
>>> -      // Wait for all potential APs waken up in one specified period
>>> +      // Wait for one potential AP waken up in one specified period
>>>        //
>>> -      TimedWaitForApFinish (
>>> -        CpuMpData,
>>> -        PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
>>> -        PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
>>> -        );
>>> +      if (CpuMpData->CpuCount == 0) {
>>> +        TimedWaitForApFinish (
>>> +          CpuMpData,
>>> +          PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
>>> +          PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
>>> +          );
>>> +      }
>>
>> I don't understand this change. The new comment says,
>>
>>   Wait for *one* potential AP waken up in one specified period
>>
>> However, the second parameter of TimedWaitForApFinish(), namely
>> "FinishedApLimit", gets the same value as before:
>>
>>   PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1
>>
>> It means that all of the (possible) APs are waited-for, just the same
>> as before.
>
> [[Eric]] This patch changes the collect AP count logic, original
> solution always waits for a specific time to let all APs start up. If
> the input CPU number (PcdGet32(PcdCpuMaxLogicalProcessorNumber) - 1)
> have been found or after a specific time(PcdGet32
> (PcdCpuApInitTimeOutInMicroSeconds)). BSP will not wait anymore and use
> CpuMpData->CpuCount as the found AP count.
>
> New logic also wait for a specific time, but this time is smaller than
> the original one. It just wait for the first AP(any AP) begin to do the
> initialization( do CpuMpData->MpCpuExchangeInfo->NumApsExecuting++ means
> it begin to do the initialization). When Ap finishes initialization, it
> will do CpuMpData->MpCpuExchangeInfo->NumApsExecuting--. So after BSP
> waits for a specific time at first, it just needs to check whether
> CpuMpData->MpCpuExchangeInfo->NumApsExecuting == 0 to know whether all
> Aps have finished initialization. Here we still use the original PCD
> (PcdCpuApInitTimeOutInMicroSeconds) for the new time value.
>
> When one AP do the initialization, it will also do
> CpuMpData->CpuCount++. So here I check whether CpuMpData->CpuCount != 0
> to know whether APs already begin to do the initialization. If yes, I
> not need to do the time out waiting anymore, just check the
> CpuMpData->MpCpuExchangeInfo->NumApsExecuting to know whether all Aps
> have finished initialization.

Thanks for the explanation.

The "NumApsExecuting" increment / decrement logic in this patch expects
that the APs work as follows:

(1) After the INIT-SIPI-SIPI, there may be a delay before the APs start
running. During this delay, the BSP may or may not reach the code in
question. The (CpuMpData->CpuCount != 0) check is supposed to take this
into account.

(2) After at least one AP has started running, the logic expects
"NumApsExecuting" to monotonically grow for a while, and then to
monotonically decrease, back to zero. For example, if we have 1 BSP and
7 APs, the BSP logic more or less expects the following values in
"NumApsExecuting":

1; 2; 3; 4; 5; 6; 7;
6; 5; 4; 3; 2; 1; 0


While this may be a valid expectation for physical processors (which
actually run in parallel, in the physical world), in a virtual machine,
it is not guaranteed. Dependent on hypervisor scheduling artifacts, it
is possible that, say, three APs start up *and finish* before the
remaining four APs start up *at all*. The INIT-SIPI-SIPI marks all APs
for execution / scheduling in the hypervisor, yes, but the actual code
execution may commence a lot later. For example, the BSP may witness the
following series of values in "NumApsExecuting":

1; 2; 3;
2; 1; 0;
1; 2; 3; 4;
3; 2; 1; 0

and the BSP could think that there are 3 APs only, when it sees the
first 0 value.


Now, let me get back to the use case that actually matters to OVMF and
QEMU. As I mentioned earlier, OVMF knows from QEMU the exact number of
APs. If there is a way for OVMF to tell MpInitLib to wait for this exact
number of APs -- no matter how long it takes --, then that's what I
would like to use.

Please see the original discussion around OVMF commit 45a70db3c3a59:

* In version 1, I introduced a new PCD called

  PcdCpuKnownLogicalProcessorNumber

and I modified MpInitLib to wait for this AP number, ignoring timeout
completely, if the PCD was set:

  https://lists.01.org/pipermail/edk2-devel/2016-November/005076.html

However, Jeff suggested to use the preexistent PCD
"PcdCpuMaxLogicalProcessorNumber" for this purpose:

  https://lists.01.org/pipermail/edk2-devel/2016-November/005092.html

* In version 2, I used the PCD suggested by Jeff, but I also introduced
a new special value for the timeout. Timeout=0 would mean "infinity":

  https://lists.01.org/pipermail/edk2-devel/2016-November/005209.html

Jeff didn't like the special value, and suggested that OVMF simply use
MAX_UINT32 microseconds (approx. 71 minutes) instead of "infinity":

  https://lists.01.org/pipermail/edk2-devel/2016-November/005266.html

* In v3, I implemented that, and that was pushed as:

  - UefiCpuPkg commit 6e1987f19af7 ("UefiCpuPkg/MpInitLib: wait no
    longer than necessary for initial AP startup", 2016-11-24)

  - and OvmfPkg commit 45a70db3c3a5 ("OvmfPkg/PlatformPei: take VCPU
    count from QEMU and configure MpInitLib", 2016-11-24).


So, again, the use case that I would like to cover is:

* the exact number of APs is known at boot, to OvmfPkg/PlatformPei,

* after the very first INIT-SIPI-SIPI, MpInitLib should wait for exactly
  this number of APs to "report in", regardless of:

  - how long it takes,

  - in what order / sequence the APs report in. (Again, please remember
    that some APs may complete the initialization before other APs
    execute their very first instruction.)

* Preferably, the case should be handled when the processor count grows
  from cold boot to S3 resume (VCPU hotplug). TianoCore BZ#251 used to
  track this use case separately.

(

Jeff closed BZ#251 today, with the argument that commit 0594ec417c89
(this patch) finds the CPU count dynamically anyway, so a platform can
simply set "PcdCpuMaxLogicalProcessorNumber" to the maximum possible value.

This argument does not work in a virtual machine, because commit
0594ec417c89 (this patch) may in fact not find the VCPU count correctly
-- in a VM, (NumApsExecuting==0) does not imply that all APs have finished.

)

Thanks!
Laszlo


  parent reply	other threads:[~2017-10-25  5:28 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-23  7:22 [Patch 0/2] Enhance collect AP Count logic Eric Dong
2017-10-23  7:22 ` [Patch 1/2] UefiCpuPkg/MpInitLib: Change AP Index variable name Eric Dong
2017-10-24  6:00   ` Ni, Ruiyu
2017-10-23  7:22 ` [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP initialization logic Eric Dong
2017-10-24  6:02   ` Ni, Ruiyu
2017-10-24  6:27     ` 答复: " Fan Jeff
2017-10-24  7:18       ` Ni, Ruiyu
2017-10-24  7:32         ` 答复: " Fan Jeff
2017-10-24 10:15   ` Laszlo Ersek
2017-10-24 14:24     ` 答复: " Fan Jeff
2017-10-24 16:29       ` Laszlo Ersek
2017-10-24 15:23     ` Dong, Eric
2017-10-24 15:40       ` Dong, Eric
2017-10-24 17:40       ` Laszlo Ersek
2017-10-24 22:30         ` Brian J. Johnson
2017-10-25  5:35           ` 答复: " Fan Jeff
2017-10-25  5:32         ` Fan Jeff [this message]
2017-10-25  5:42         ` Dong, Eric
2017-10-25 15:07           ` Laszlo Ersek
2017-10-26  1:13             ` Dong, Eric
2017-10-26 20:48               ` Brian J. Johnson
2017-10-27  1:31                 ` Dong, Eric
2017-10-24  7:31 ` 答复: [Patch 0/2] Enhance collect AP Count logic Fan Jeff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CY1PR19MB0283462DE5EBD55CC9D9D854D7440@CY1PR19MB0283.namprd19.prod.outlook.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox