Re: [PATCH 4/4] OvmfPkg/PlatformPei: rewrite MaxCpuCountInitialization() for CPU hotplug

public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed

From: "Laszlo Ersek" <lersek@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: edk2-devel-groups-io <devel@edk2.groups.io>,
	Anthony Perard <anthony.perard@citrix.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Eric Dong <eric.dong@intel.com>,
	Jordan Justen <jordan.l.justen@intel.com>,
	Julien Grall <julien.grall@arm.com>, Ray Ni <ray.ni@intel.com>
Subject: Re: [PATCH 4/4] OvmfPkg/PlatformPei: rewrite MaxCpuCountInitialization() for CPU hotplug
Date: Tue, 8 Oct 2019 23:12:10 +0200	[thread overview]
Message-ID: <636bdf77-eda3-89ff-1f6c-ff227e8ae2a1@redhat.com> (raw)
In-Reply-To: <20191008170655.3c0362a1@redhat.com>

On 10/08/19 17:06, Igor Mammedov wrote:
> On Tue,  8 Oct 2019 13:27:14 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
>
>> MaxCpuCountInitialization() currently handles the following options:
>>
>> (1) QEMU does not report the boot CPU count.
>>
>>     In this case, PlatformPei makes MpInitLib enumerate APs up to the
>>     default PcdCpuMaxLogicalProcessorNumber value (64) minus 1, or until
>>     the default PcdCpuApInitTimeOutInMicroSeconds (50,000) elapses.
>>     (Whichever is reached first.)
>>
>>     Time-limited AP enumeration had never been reliable on QEMU/KVM, which
>>     is why commit 45a70db3c3a5 strated handling case (2) below, in OVMF.
>>
>> (2) QEMU reports the boot CPU count.
>>
>>     In this case, PlatformPei sets PcdCpuMaxLogicalProcessorNumber to the
>>     reported boot CPU count, and PcdCpuApInitTimeOutInMicroSeconds to
>>     practically "infinity" (MAX_UINT32, ~71 minutes). That causes
>>     MpInitLib to enumerate exactly the available (boot) APs.
>>
>>     With CPU hotplug in mind, this method is not good enough. While
>>     UefiCpuPkg expects PcdCpuMaxLogicalProcessorNumber to include both
>>     boot (i.e., cold-plugged), and all *potentially* hot-plugged, logical
>>     processors, the boot CPU count reported by QEMU does not cover the
>>     second category.
>
> Can you elaborate why it doesn't cover the second category?

(Elaborate just in this thread, or elaborate in the commit message?)

The boot CPU count reported by QEMU does not cover all potentially
hot-plugged logical processors ... because it is not supposed to.

Namely, FW_CFG_NB_CPUS exposes "pcms->boot_cpus", and that is not
supposed to cover all potentially hot-plugged CPUs, only the boot-time
CPUs.

Whereas PcdCpuMaxLogicalProcessorNumber needs to include all potentially
hot-plugged CPUs too. For that, we need to expose "ms->smp.max_cpus"
from QEMU.

Does this answer your question?

>> Rewrite MaxCpuCountInitialization() for handling the following cases:
>>
>> (1) The behavior of case (1) does not change. (No UefiCpuPkg PCDs are set
>>     to values different from the defaults.)
>>
>> (2) QEMU reports the boot CPU count, but not the potential maximum CPU
>>     count.
>>
>>     In this case, the behavior remains unchanged.
>>
>>     The way MpInitLib is instructed to do the same differs however: we now
>>     set the new PcdCpuBootLogicalProcessorNumber to the boot CPU count
>>     (together with PcdCpuMaxLogicalProcessorNumber).
>>     PcdCpuApInitTimeOutInMicroSeconds is irrelevant.
>>
>> (3) QEMU reports both the boot CPU count, and the potential maximum CPU
>>     count.
>>
>>     We tell UefiCpuPkg about the potential maximum through
>>     PcdCpuMaxLogicalProcessorNumber. We also tell MpInitLib the boot CPU
>>     count for precise and quick AP enumeration, via
>>     PcdCpuBootLogicalProcessorNumber. PcdCpuApInitTimeOutInMicroSeconds is
>>     irrelevant.
> What for max cpu count is/will be used?

Before this patch:

- "pcms->boot_cpus" determines the boot CPU count for OVMF
- "pcms->boot_cpus" determines the maximum CPU count for OVMF

- when OVMF is built without SMM_REQUIRE:
  - OS-time CPU hotplug works (no firmware involvement)
  - after OS-time CPU hotplug, S3 resume breaks (firmware can't handle
    unexpected CPUs appearing beyond original maximum)

- when OVMF is built with SMM_REQUIRE:
  - OS-time CPU hotplug does not work (yet)

After this patch:

- "pcms->boot_cpus" determines the boot CPU count for OVMF
- "ms->smp.max_cpus" determines the maximum CPU count for OVMF [*]

- when OVMF is built without SMM_REQUIRE:
  - OS-time CPU hotplug works (no firmware involvement)
  - after OS-time CPU hotplug, S3 resume works (any new CPUs are
    within the original maximum, therefore they are expected) [*]

- when OVMF is built with SMM_REQUIRE:
  - OS-time CPU hotplug does not work (yet)

The differences are marked with [*].

>> This patch is a pre-requisite for enabling CPU hotplug with SMM_REQUIRE.
>> As a side effect, it also enables S3 to work with CPU hotplug at once,
>> *without* SMM_REQUIRE.
> Does OVMF reread boot CPU count on resume from QEMU?

Yes, it does.

When a CPU is hotplugged at OS runtime, "pcms->boot_cpus" increases in
QEMU.

Before the patch, the "pcms->boot_cpus" increase causes
PcdCpuMaxLogicalProcessorNumber to increase as well. That breaks S3
resume. PcdCpuMaxLogicalProcessorNumber must not change from initial
boot to S3 resume.

After the patch, the "pcms->boot_cpus" increase does not increase
PcdCpuMaxLogicalProcessorNumber. PcdCpuMaxLogicalProcessorNumber remains
the same (namely, "ms->smp.max_cpus"). Therefore, the CPU structures
allocated during normal boot, for "ms->smp.max_cpus", will accommodate
the CPUs that have been hotplugged prior to S3 resume.

I tested the S3 behavior, plus the values mentioned above are also
logged to the OVMF debug log, by this patch:

> +  DEBUG ((DEBUG_INFO, "%a: BootCpuCount=%d mMaxCpuCount=%u\n", __FUNCTION__,
> +    BootCpuCount, mMaxCpuCount));

This is one libvirt domain XML snippet I've used for testing:

  <vcpu placement='static' current='3'>4</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no'/>
    <vcpu id='1' enabled='yes' hotpluggable='no'/>
    <vcpu id='2' enabled='yes' hotpluggable='no'/>
    <vcpu id='3' enabled='no' hotpluggable='yes'/>
  </vcpus>
  <os>
    <type arch='x86_64' machine='pc-i440fx-4.2'>hvm</type>
...
  <cpu mode='custom' match='exact' check='partial'>
    <model fallback='allow'>Haswell-noTSX</model>
    <topology sockets='1' cores='2' threads='2'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='pcid'/>
    <feature policy='require' name='spec-ctrl'/>
    <feature policy='require' name='pdpe1gb'/>
  </cpu>

And the command used for testing was:

$ virsh setvcpu GUEST 3 --enable --live

When this is done, the guest kernel dmesg confirms the CPU hotplug, but
it is not onlined automatically. I onlined it via /sys manually in the
guest.

After that point, I exercised S3 suspend / resume. During S3 resume, the
"BootCpuCount" log (from above) increases from 3 to 4 (and the related
edk2 infrastructure messages are in sync). All the while "mMaxCpuCount"
preserves its initial value, 4.

Finally, after S3 resume, I ran "rdmsr -a 0x3a" in the guest, which
printed "5" for all *four* CPUs. This confirms that OVMF has set the
Feature Control MSR on the new VCPU too, during S3 resume.

(The MSR is zeroed on platform reset, and has to be re-configured by
firmware, on every CPU, during S3. It is set to 5 due to the "vmx" CPU
feature in the above domain XML snippet. References:

- https://bugzilla.tianocore.org/show_bug.cgi?id=86
- https://github.com/tianocore/tianocore.github.io/wiki/Testing-SMM-with-QEMU,-KVM-and-libvirt#confirm-simple-multiprocessing-during-boot

The specific meaning of the MSR is totally irrelevant now, it's just a
good use case for testing multi-processing during S3 resume. For MSR
0x3A (Feature Control), the firmware must execute the wrmsr
instructionon on each CPU, and the "rdmsr -a" Linux userspace command
reads back the MSR on each CPU separately.)

Thanks!
Laszlo

next prev parent reply	other threads:[~2019-10-08 21:12 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-08 11:27 [PATCH 0/4] UefiCpuPkg, OvmfPkg: separate PCDs for boot CPU count vs. max CPU count Laszlo Ersek
2019-10-08 11:27 ` [PATCH 1/4] UefiCpuPkg/MpInitLib: honor the platform's boot CPU count in AP detection Laszlo Ersek
2019-10-09  0:57   ` [edk2-devel] " Dong, Eric
2019-10-09 14:55     ` Laszlo Ersek
2019-10-10  2:52       ` Ni, Ray
2019-10-10  7:38         ` Laszlo Ersek
2019-10-10  7:50           ` Ni, Ray
2019-10-10  8:28           ` Laszlo Ersek
2019-10-08 11:27 ` [PATCH 2/4] OvmfPkg/OvmfXen.dsc: remove PcdCpu* dynamic defaults Laszlo Ersek
2019-10-08 13:11   ` [edk2-devel] " Philippe Mathieu-Daudé
2019-10-08 20:27     ` Laszlo Ersek
2019-10-08 11:27 ` [PATCH 3/4] OvmfPkg/IndustryStandard: define FW_CFG_X86_TOPOLOGY structure type Laszlo Ersek
2019-10-08 11:27 ` [PATCH 4/4] OvmfPkg/PlatformPei: rewrite MaxCpuCountInitialization() for CPU hotplug Laszlo Ersek
2019-10-08 15:06   ` Igor Mammedov
2019-10-08 21:12     ` Laszlo Ersek [this message]
2019-10-09 10:23       ` Igor Mammedov
2019-10-09 14:43         ` Laszlo Ersek
2019-10-08 11:35 ` [PATCH 0/4] UefiCpuPkg, OvmfPkg: separate PCDs for boot CPU count vs. max CPU count Ard Biesheuvel
2019-10-08 20:27   ` Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=636bdf77-eda3-89ff-1f6c-ff227e8ae2a1@redhat.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox