From: "Dong, Eric" <eric.dong@intel.com>
To: Laszlo Ersek <lersek@redhat.com>,
"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Cc: "Ni, Ruiyu" <ruiyu.ni@intel.com>, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP initialization logic.
Date: Thu, 26 Oct 2017 01:13:34 +0000 [thread overview]
Message-ID: <ED077930C258884BBCB450DB737E66224AA3E322@shsmsx102.ccr.corp.intel.com> (raw)
In-Reply-To: <94c27429-4b7d-6c21-c1e5-e07f3e9a5556@redhat.com>
Laszlo,
> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> Laszlo Ersek
> Sent: Wednesday, October 25, 2017 11:08 PM
> To: Dong, Eric <eric.dong@intel.com>; edk2-devel@lists.01.org
> Cc: Ni, Ruiyu <ruiyu.ni@intel.com>; Paolo Bonzini <pbonzini@redhat.com>
> Subject: Re: [edk2] [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for
> AP initialization logic.
>
> Hi Eric,
>
> On 10/25/17 07:42, Dong, Eric wrote:
> > Hi Laszlo,
> >
> > I think I have clear about your raised issues for Ovmf platform. In this case, I
> think your platform not suit for this code change. How about I do below
> change based on the new code:
> >
> > - if (CpuMpData->CpuCount == 0) {
> > TimedWaitForApFinish (
> > CpuMpData,
> > PcdGet32 (PcdCpuMaxLogicalProcessorNumber) - 1,
> > PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds)
> > );
> > - }
> >
> > After this change, for Ovmf can still set
> > PcdCpuApInitTimeOutInMicroSeconds to MAX_UINT32 to keep old
> solution.
> > For the real platform, it can set a small value to follow the new
> > solution. For this new change, it just wait one more
> > PcdCpuApInitTimeOutInMicroSeconds timeout, should not have
> performance
> > impact (This time may also been cost in later while
> > (CpuMpData->MpCpuExchangeInfo->NumApsExecuting != 0) loop) or have
> > litter performance impact (not cover by the later loop).
> The suggested change will work for OVMF, thanks!
>
> (Just please don't forget to un-indent the TimedWaitForApFinish() call that
> will become unconditional again.)
>
> --o--
>
> Do you think Brian's concern should be investigated as well (perhaps in a
> separate Bugzilla entry)?
As Jeff has mentioned, only Ovmf can know the exist Ap numbers before
send the Init-sipi-sipi. So we don't know how many bits needed for this bitmap.
In case we can create the bitmap, we still don't know when all Aps have been
found(If the begin map bit value same as finish map bit value, we still can't get
the conclusion that all Aps have been found, because maybe other Aps not
started yet).
Thanks,
Eric
>
> Because, I believe, the scheduling pattern that I described earlier could be
> possible on physical platforms as well, at least in theory:
>
> >> (2) After at least one AP has started running, the logic expects
> >> "NumApsExecuting" to monotonically grow for a while, and then to
> >> monotonically decrease, back to zero. For example, if we have 1 BSP
> >> and
> >> 7 APs, the BSP logic more or less expects the following values in
> >> "NumApsExecuting":
> >>
> >> 1; 2; 3; 4; 5; 6; 7;
> >> 6; 5; 4; 3; 2; 1; 0
> >>
> >>
> >> While this may be a valid expectation for physical processors (which
> >> actually run in parallel, in the physical world), in a virtual machine, it is not
> guaranteed.
> >> Dependent on hypervisor scheduling artifacts, it is possible that,
> >> say, three APs start up *and finish* before the remaining four APs
> >> start up *at all*. The INIT-SIPI-SIPI marks all APs for execution /
> >> scheduling in the hypervisor, yes, but the actual code execution may
> >> commence a lot later. For example, the BSP may witness the following
> series of values in "NumApsExecuting":
> >>
> >> 1; 2; 3;
> >> 2; 1; 0;
> >> 1; 2; 3; 4;
> >> 3; 2; 1; 0
> >>
> >> and the BSP could think that there are 3 APs only, when it sees the
> >> first 0 value.
>
> I don't know if such a pattern is "likely", "unlikely", or "extremely unlikely" on
> physical platforms. I think the pattern is "theoretically possible" at least.
>
> I suggest that we go ahead with the small patch for the OVMF use case first,
> and then open a new BZ if Brian thinks there's a real safety problem with the
> code.
>
We also notice this theoretical problem when we implement this change, but
We think this is rarely to happen on physical platforms. We think the value for
the change is much bigger than not do this change because of this theoretical
problem.
Thanks,
Eric
> ... I should note that, with the small patch that's going to fix the OVMF use
> case, the previous behavior will be restored for *all* platforms that continue
> using their previous PCD values.
>
> The cumulative effect of commit 0594ec417c89 and the upcoming patch will
> be that a platform may significantly decrease
> "PcdCpuApInitTimeOutInMicroSeconds", if it chooses to, and then the
> "NumApsExecuting" loop will take the role of the preexisting
> TimedWaitForApFinish() call. If a platform doesn't want that, then it should
> keep its "PcdCpuApInitTimeOutInMicroSeconds" as high as before, and then
> any implementation details in the "NumApsExecuting" loop will be irrelevant.
>
> Thanks!
> Laszlo
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
next prev parent reply other threads:[~2017-10-26 1:09 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-23 7:22 [Patch 0/2] Enhance collect AP Count logic Eric Dong
2017-10-23 7:22 ` [Patch 1/2] UefiCpuPkg/MpInitLib: Change AP Index variable name Eric Dong
2017-10-24 6:00 ` Ni, Ruiyu
2017-10-23 7:22 ` [Patch 2/2] UefiCpuPkg/MpInitLib: Enhance waiting for AP initialization logic Eric Dong
2017-10-24 6:02 ` Ni, Ruiyu
2017-10-24 6:27 ` 答复: " Fan Jeff
2017-10-24 7:18 ` Ni, Ruiyu
2017-10-24 7:32 ` 答复: " Fan Jeff
2017-10-24 10:15 ` Laszlo Ersek
2017-10-24 14:24 ` 答复: " Fan Jeff
2017-10-24 16:29 ` Laszlo Ersek
2017-10-24 15:23 ` Dong, Eric
2017-10-24 15:40 ` Dong, Eric
2017-10-24 17:40 ` Laszlo Ersek
2017-10-24 22:30 ` Brian J. Johnson
2017-10-25 5:35 ` 答复: " Fan Jeff
2017-10-25 5:32 ` Fan Jeff
2017-10-25 5:42 ` Dong, Eric
2017-10-25 15:07 ` Laszlo Ersek
2017-10-26 1:13 ` Dong, Eric [this message]
2017-10-26 20:48 ` Brian J. Johnson
2017-10-27 1:31 ` Dong, Eric
2017-10-24 7:31 ` 答复: [Patch 0/2] Enhance collect AP Count logic Fan Jeff
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ED077930C258884BBCB450DB737E66224AA3E322@shsmsx102.ccr.corp.intel.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox