From: "Dong, Eric" <eric.dong@intel.com>
To: Laszlo Ersek <lersek@redhat.com>,
	"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Cc: "Ni, Ruiyu" <ruiyu.ni@intel.com>
Subject: Re: [Patch V2] UefiCpuPkg/MpInitLib: Remove redundant parameter.
Date: Wed, 25 Jul 2018 03:50:59 +0000
Message-ID: <ED077930C258884BBCB450DB737E66224AC5A453@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <055fb2f1-cd73-e5a9-11b2-407f31e81305@redhat.com>

Hi Laszlo,

I have root-caused this issue. It is triggered by an AP hanging in the procedure that is started when the PiSmmCpuDxeSmm driver starts up.

When the PiSmmCpuDxeSmm driver starts up, it calls StartAllAps to set memory attributes. In the StartAllAps function, after calling WakeUpAp to start the APs, it calls CheckAllAps to wait until all APs have finished the task. CheckAllAps inspects each AP's state to determine whether that AP has finished its task. The old code checked whether the AP state was CpuStateFinished. That state is set by the AP only when it has truly finished the task. In the new logic, CpuStateFinished has been replaced with CpuStateIdle, but CpuStateIdle is also the initial state of the AP. The AP changes its state from CpuStateIdle to CpuStateBusy when it starts executing the procedure, and changes it back to CpuStateIdle after it finishes the procedure.

So when the hang occurs, the AP has not yet changed its state to CpuStateBusy at the moment the BSP calls CheckAllAps to check whether the AP has finished its task. The AP's state is still CpuStateIdle, so the BSP thinks the AP has finished. The BSP therefore concludes that all APs have finished their tasks and continues to boot. But some AP may wake up later and then fail to return from the procedure, in which case its state stays at CpuStateBusy. Later, in the ChangeApLoopCallback function, because this AP's state is still CpuStateBusy, the AP will not run the procedure. But the BSP waits for all APs to run the procedure before it continues to boot (it waits for the APs to decrement mNumberToFinish in the procedure), so the hang occurs.
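
The ordering problem described above can be illustrated with a small single-threaded C model. This is only a sketch under assumptions: the names AP_MODEL, BspCheckFinished, ApBeginProcedure, ApEndProcedure and SimulateLateWakeup are hypothetical and are not the MpInitLib code. Only the state names CpuStateIdle and CpuStateBusy come from the description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not the MpInitLib sources. */
typedef enum { CpuStateIdle, CpuStateBusy } CPU_STATE;

typedef struct {
  CPU_STATE State;          /* state as seen by the BSP */
  int       ProceduresRun;  /* how many procedures the AP completed */
} AP_MODEL;

/* BSP side: in the new logic, "Idle" is interpreted as "task finished". */
bool BspCheckFinished(const AP_MODEL *Ap) {
  return Ap->State == CpuStateIdle;
}

/* AP side: the AP marks itself busy when it starts the procedure. */
void ApBeginProcedure(AP_MODEL *Ap) {
  Ap->State = CpuStateBusy;
}

/* AP side: the AP returns to idle after the procedure completes. */
void ApEndProcedure(AP_MODEL *Ap) {
  assert(Ap->State == CpuStateBusy);
  Ap->ProceduresRun++;
  Ap->State = CpuStateIdle;
}

/*
 * Replays the bad interleaving: the BSP checks the AP before the AP has
 * reached ApBeginProcedure(). Returns true if the BSP wrongly concluded
 * that the AP was done while the AP had not run anything yet.
 */
bool SimulateLateWakeup(void) {
  AP_MODEL Ap = { CpuStateIdle, 0 };

  bool BspThinksDone = BspCheckFinished(&Ap);  /* BSP checks too early */
  ApBeginProcedure(&Ap);                       /* AP wakes up only now */

  return BspThinksDone && Ap.ProceduresRun == 0 && Ap.State == CpuStateBusy;
}
```

In this model SimulateLateWakeup() returns true: the BSP has already moved on, while the AP is left in CpuStateBusy without having completed its procedure.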

I think we should keep an intermediate state that tells us whether the AP has truly finished its task. I will send another patch series for this issue. Please help to check the new patches.
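
As a rough illustration of the idea (not the actual patch), a separate "finished" state could look like the following single-threaded C model. All names except the CpuState* values are hypothetical, and the choice to let the BSP reset the state to CpuStateIdle after observing CpuStateFinished is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model with a distinct "finished" state. */
typedef enum { CpuStateIdle, CpuStateBusy, CpuStateFinished } CPU_STATE;

typedef struct {
  CPU_STATE State;
} AP_MODEL;

/*
 * BSP side: only CpuStateFinished counts as "task finished". After seeing
 * it, the BSP moves the AP back to idle (an assumption of this sketch).
 */
bool BspCheckFinished(AP_MODEL *Ap) {
  if (Ap->State != CpuStateFinished) {
    return false;  /* still idle (not started yet) or busy */
  }
  Ap->State = CpuStateIdle;
  return true;
}

/* AP side: idle -> busy when the procedure starts. */
void ApBeginProcedure(AP_MODEL *Ap) {
  assert(Ap->State == CpuStateIdle);
  Ap->State = CpuStateBusy;
}

/* AP side: busy -> finished, set only by the AP when it is truly done. */
void ApEndProcedure(AP_MODEL *Ap) {
  assert(Ap->State == CpuStateBusy);
  Ap->State = CpuStateFinished;
}
```

With this model, a BSP check that happens before the AP has started sees CpuStateIdle and keeps waiting, so a late wake-up can no longer be mistaken for completion.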

Thanks,
Eric

> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> Laszlo Ersek
> Sent: Saturday, July 21, 2018 12:30 AM
> To: Dong, Eric <eric.dong@intel.com>; edk2-devel@lists.01.org
> Cc: Ni, Ruiyu <ruiyu.ni@intel.com>
> Subject: Re: [edk2] [Patch V2] UefiCpuPkg/MpInitLib: Remove redundant
> parameter.
> 
> On 07/20/18 08:53, Dong, Eric wrote:
> >> -----Original Message----- From: Laszlo Ersek
> >> [mailto:lersek@redhat.com]
> 
> >> Therefore, please upgrade the host to Fedora 26. In Fedora 26, QEMU
> >> 2.9 is shipped:
> >>
> >> https://koji.fedoraproject.org/koji/buildinfo?buildID=986762
> >>
> >> ... It's even better if you can upgrade to Fedora 27, as Fedora 27 is
> >> the oldest Fedora release still supported at this point. The
> >> following article describes the recommended upgrade method:
> >>
> >> https://fedoraproject.org/wiki/DNF_system_upgrade
> >>
> >
> > I updated the system to Fedora 28, but it failed to boot. :(  So I
> > borrowed an existing Fedora 27 DVD and installed it. With this OS, I can
> > reproduce this issue now. I found that the issue is intermittent: I
> > booted 5 times and hit the issue.  I'm checking the issue.
> 
> Awesome!
> 
> (I'm not happy about the problem itself, of course, but I'm *very* thankful
> that you took the time to install a Linux box, for testing with
> KVM!!!)
> 
> Laszlo

