From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [PATCH 4/4] OvmfPkg/PlatformPei: rewrite
 MaxCpuCountInitialization() for CPU hotplug
From: "Laszlo Ersek" <lersek@redhat.com>
To: Igor Mammedov
Cc: edk2-devel-groups-io, Anthony Perard, Ard Biesheuvel, Eric Dong,
 Jordan Justen, Julien Grall, Ray Ni
Date: Tue, 8 Oct 2019 23:12:10 +0200
Message-ID: <636bdf77-eda3-89ff-1f6c-ff227e8ae2a1@redhat.com>
In-Reply-To: <20191008170655.3c0362a1@redhat.com>
References: <20191008112714.6215-1-lersek@redhat.com>
 <20191008112714.6215-5-lersek@redhat.com>
 <20191008170655.3c0362a1@redhat.com>

On 10/08/19 17:06, Igor Mammedov wrote:
> On Tue, 8 Oct 2019 13:27:14 +0200
> Laszlo Ersek wrote:
>
>> MaxCpuCountInitialization() currently handles the following options:
>>
>> (1) QEMU does not report the boot CPU count.
>>
>> In this case, PlatformPei makes MpInitLib enumerate APs up to the
>> default PcdCpuMaxLogicalProcessorNumber value (64) minus 1, or until
>> the default PcdCpuApInitTimeOutInMicroSeconds (50,000) elapses.
>> (Whichever is reached first.)
>>
>> Time-limited AP enumeration had never been reliable on QEMU/KVM,
>> which is why commit 45a70db3c3a5 started handling case (2) below, in
>> OVMF.
>>
>> (2) QEMU reports the boot CPU count.
>>
>> In this case, PlatformPei sets PcdCpuMaxLogicalProcessorNumber to
>> the reported boot CPU count, and PcdCpuApInitTimeOutInMicroSeconds
>> to practically "infinity" (MAX_UINT32, ~71 minutes). That causes
>> MpInitLib to enumerate exactly the available (boot) APs.
>>
>> With CPU hotplug in mind, this method is not good enough. While
>> UefiCpuPkg expects PcdCpuMaxLogicalProcessorNumber to include both
>> boot (i.e., cold-plugged), and all *potentially* hot-plugged,
>> logical processors, the boot CPU count reported by QEMU does not
>> cover the second category.
>
> Can you elaborate why it doesn't cover the second category?

(Elaborate just in this thread, or elaborate in the commit message?)

The boot CPU count reported by QEMU does not cover all potentially
hot-plugged logical processors because it is not supposed to.

Namely, FW_CFG_NB_CPUS exposes "pcms->boot_cpus", and that is not
supposed to cover all potentially hot-plugged CPUs, only the boot-time
CPUs. Whereas PcdCpuMaxLogicalProcessorNumber needs to include all
potentially hot-plugged CPUs too. For that, we need to expose
"ms->smp.max_cpus" from QEMU.

Does this answer your question?

>> Rewrite MaxCpuCountInitialization() for handling the following
>> cases:
>>
>> (1) The behavior of case (1) does not change. (No UefiCpuPkg PCDs
>>     are set to values different from the defaults.)
>>
>> (2) QEMU reports the boot CPU count, but not the potential maximum
>>     CPU count.
>>
>> In this case, the behavior remains unchanged.
>>
>> The way MpInitLib is instructed to do the same differs however: we
>> now set the new PcdCpuBootLogicalProcessorNumber to the boot CPU
>> count (together with PcdCpuMaxLogicalProcessorNumber).
>> PcdCpuApInitTimeOutInMicroSeconds is irrelevant.
>>
>> (3) QEMU reports both the boot CPU count, and the potential maximum
>>     CPU count.
>>
>> We tell UefiCpuPkg about the potential maximum through
>> PcdCpuMaxLogicalProcessorNumber. We also tell MpInitLib the boot CPU
>> count for precise and quick AP enumeration, via
>> PcdCpuBootLogicalProcessorNumber. PcdCpuApInitTimeOutInMicroSeconds
>> is irrelevant.
>
> What for max cpu count is/will be used?

Before this patch:

- "pcms->boot_cpus" determines the boot CPU count for OVMF,
- "pcms->boot_cpus" determines the maximum CPU count for OVMF,
- when OVMF is built without SMM_REQUIRE:
  - OS-time CPU hotplug works (no firmware involvement),
  - after OS-time CPU hotplug, S3 resume breaks (firmware can't handle
    unexpected CPUs appearing beyond original maximum),
- when OVMF is built with SMM_REQUIRE:
  - OS-time CPU hotplug does not work (yet).

After this patch:

- "pcms->boot_cpus" determines the boot CPU count for OVMF,
- "ms->smp.max_cpus" determines the maximum CPU count for OVMF [*],
- when OVMF is built without SMM_REQUIRE:
  - OS-time CPU hotplug works (no firmware involvement),
  - after OS-time CPU hotplug, S3 resume works (any new CPUs are
    within the original maximum, therefore they are expected) [*],
- when OVMF is built with SMM_REQUIRE:
  - OS-time CPU hotplug does not work (yet).

The differences are marked with [*].

>> This patch is a pre-requisite for enabling CPU hotplug with
>> SMM_REQUIRE. As a side effect, it also enables S3 to work with CPU
>> hotplug at once, *without* SMM_REQUIRE.
>
> Does OVMF reread boot CPU count on resume from QEMU?

Yes, it does.

When a CPU is hotplugged at OS runtime, "pcms->boot_cpus" increases in
QEMU.
Before the patch, the "pcms->boot_cpus" increase causes
PcdCpuMaxLogicalProcessorNumber to increase as well. That breaks S3
resume: PcdCpuMaxLogicalProcessorNumber must not change from initial
boot to S3 resume.

After the patch, the "pcms->boot_cpus" increase does not increase
PcdCpuMaxLogicalProcessorNumber. PcdCpuMaxLogicalProcessorNumber
remains the same (namely, "ms->smp.max_cpus"). Therefore, the CPU
structures allocated during normal boot, for "ms->smp.max_cpus", will
accommodate the CPUs that have been hotplugged prior to S3 resume.

I tested the S3 behavior; additionally, the values mentioned above are
logged to the OVMF debug log, by this patch:

> +  DEBUG ((DEBUG_INFO, "%a: BootCpuCount=%d mMaxCpuCount=%u\n",
> +    __FUNCTION__, BootCpuCount, mMaxCpuCount));

This is one libvirt domain XML snippet I've used for testing:

  <vcpu placement='static' current='3'>4</vcpu>
  <os>
    <type>hvm</type>
  </os>
  ...
  <cpu mode='custom'>
    <model>Haswell-noTSX</model>
    <feature policy='require' name='vmx'/>
  </cpu>

And the command used for testing was:

$ virsh setvcpu GUEST 3 --enable --live

When this is done, the guest kernel dmesg confirms the CPU hotplug,
but the new CPU is not onlined automatically. I onlined it manually
via /sys in the guest. After that point, I exercised S3 suspend /
resume.

During S3 resume, the "BootCpuCount" log (from above) increases from 3
to 4 (and the related edk2 infrastructure messages are in sync). All
the while, "mMaxCpuCount" preserves its initial value, 4.

Finally, after S3 resume, I ran "rdmsr -a 0x3a" in the guest, which
printed "5" for all *four* CPUs. This confirms that OVMF has set the
Feature Control MSR on the new VCPU too, during S3 resume. (The MSR is
zeroed on platform reset, and has to be re-configured by firmware, on
every CPU, during S3. It is set to 5 due to the "vmx" CPU feature in
the above domain XML snippet.
References:

- https://bugzilla.tianocore.org/show_bug.cgi?id=86
- https://github.com/tianocore/tianocore.github.io/wiki/Testing-SMM-with-QEMU,-KVM-and-libvirt#confirm-simple-multiprocessing-during-boot

The specific meaning of the MSR is totally irrelevant now; it's just a
good use case for testing multi-processing during S3 resume. For MSR
0x3A (Feature Control), the firmware must execute the wrmsr
instruction on each CPU, and the "rdmsr -a" Linux userspace command
reads back the MSR on each CPU separately.)

Thanks!
Laszlo