Date: Wed, 9 Oct 2019 12:23:50 +0200
From: "Igor Mammedov" <imammedo@redhat.com>
To: Laszlo Ersek <lersek@redhat.com>
Cc: edk2-devel-groups-io, Anthony Perard, Ard Biesheuvel, Eric Dong,
 Jordan Justen, Julien Grall, Ray Ni
Subject: Re: [PATCH 4/4] OvmfPkg/PlatformPei: rewrite MaxCpuCountInitialization() for CPU hotplug
Message-ID: <20191009122350.1030c12c@redhat.com>
In-Reply-To: <636bdf77-eda3-89ff-1f6c-ff227e8ae2a1@redhat.com>
References: <20191008112714.6215-1-lersek@redhat.com>
 <20191008112714.6215-5-lersek@redhat.com>
 <20191008170655.3c0362a1@redhat.com>
 <636bdf77-eda3-89ff-1f6c-ff227e8ae2a1@redhat.com>

On Tue, 8 Oct 2019 23:12:10 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

> On 10/08/19 17:06, Igor Mammedov wrote:
> > On Tue, 8 Oct 2019 13:27:14 +0200
> > Laszlo Ersek <lersek@redhat.com> wrote:
> >
> >> MaxCpuCountInitialization() currently handles the following options:
> >>
> >> (1) QEMU does not report the boot CPU count.
> >>
> >> In this case, PlatformPei makes MpInitLib enumerate APs up to the
> >> default PcdCpuMaxLogicalProcessorNumber value (64) minus 1, or until
> >> the default PcdCpuApInitTimeOutInMicroSeconds (50,000) elapses.
> >> (Whichever is reached first.)
> >>
> >> Time-limited AP enumeration had never been reliable on QEMU/KVM, which
> >> is why commit 45a70db3c3a5 started handling case (2) below, in OVMF.
> >>
> >> (2) QEMU reports the boot CPU count.
> >>
> >> In this case, PlatformPei sets PcdCpuMaxLogicalProcessorNumber to the
> >> reported boot CPU count, and PcdCpuApInitTimeOutInMicroSeconds to
> >> practically "infinity" (MAX_UINT32, ~71 minutes). That causes
> >> MpInitLib to enumerate exactly the available (boot) APs.
> >>
> >> With CPU hotplug in mind, this method is not good enough. While
> >> UefiCpuPkg expects PcdCpuMaxLogicalProcessorNumber to include both
> >> boot (i.e., cold-plugged) and all *potentially* hot-plugged logical
> >> processors, the boot CPU count reported by QEMU does not cover the
> >> second category.
> >
> > Can you elaborate why it doesn't cover the second category?
>
> (Elaborate just in this thread, or elaborate in the commit message?)
>
> The boot CPU count reported by QEMU does not cover all potentially
> hot-plugged logical processors ... because it is not supposed to.
>
> Namely, FW_CFG_NB_CPUS exposes "pcms->boot_cpus", and that is not
> supposed to cover all potentially hot-plugged CPUs, only the boot-time
> CPUs.

"Potentially hot-plugged" confuses me greatly in this context (I'm not
sure whether you mean CPUs that have already been hot-plugged, or CPUs
that do not exist yet). If it's about the possible-CPU limit, I'd use
"possible CPUs" instead (which includes both present and not-present
CPUs).

> Whereas PcdCpuMaxLogicalProcessorNumber needs to include all potentially
> hot-plugged CPUs too. For that, we need to expose "ms->smp.max_cpus"
> from QEMU.
>
> Does this answer your question?

You answered the question better below [...]
> When a CPU is hotplugged at OS runtime, "pcms->boot_cpus" increases in
> QEMU.
>
> Before the patch, the "pcms->boot_cpus" increase causes
> PcdCpuMaxLogicalProcessorNumber to increase as well. That breaks S3
> resume. PcdCpuMaxLogicalProcessorNumber must not change from initial
> boot to S3 resume.

Something along these lines should probably be put in the commit
message; it clearly states the requirement for max_cpus.

> After the patch, the "pcms->boot_cpus" increase does not increase
> PcdCpuMaxLogicalProcessorNumber. PcdCpuMaxLogicalProcessorNumber remains
> the same (namely, "ms->smp.max_cpus"). Therefore, the CPU structures
> allocated during normal boot, for "ms->smp.max_cpus", will accommodate
> the CPUs that have been hotplugged prior to S3 resume.
>
> I tested the S3 behavior; plus, the values mentioned above are also
> logged to the OVMF debug log, by this patch:
>
> > +  DEBUG ((DEBUG_INFO, "%a: BootCpuCount=%d mMaxCpuCount=%u\n", __FUNCTION__,
> > +    BootCpuCount, mMaxCpuCount));
>
> This is one libvirt domain XML snippet I've used for testing:
>
>   <vcpu placement='static' current='3'>4</vcpu>
>   <os>
>     <type>hvm</type>
>     ...
>   </os>
>   <cpu>
>     <model>Haswell-noTSX</model>
>     <feature policy='require' name='vmx'/>
>   </cpu>
>
> And the command used for testing was:
>
>   $ virsh setvcpu GUEST 3 --enable --live
>
> When this is done, the guest kernel dmesg confirms the CPU hotplug, but
> it is not onlined automatically. I onlined it via /sys manually in the
> guest.
>
> After that point, I exercised S3 suspend / resume. During S3 resume, the
> "BootCpuCount" log (from above) increases from 3 to 4 (and the related
> edk2 infrastructure messages are in sync). All the while, "mMaxCpuCount"
> preserves its initial value, 4.
>
> Finally, after S3 resume, I ran "rdmsr -a 0x3a" in the guest, which
> printed "5" for all *four* CPUs. This confirms that OVMF has set the
> Feature Control MSR on the new VCPU too, during S3 resume.
>
> (The MSR is zeroed on platform reset, and has to be re-configured by
> firmware, on every CPU, during S3.
> It is set to 5 due to the "vmx" CPU
> feature in the above domain XML snippet. References:
>
> - https://bugzilla.tianocore.org/show_bug.cgi?id=86
> - https://github.com/tianocore/tianocore.github.io/wiki/Testing-SMM-with-QEMU,-KVM-and-libvirt#confirm-simple-multiprocessing-during-boot
>
> The specific meaning of the MSR is totally irrelevant now; it's just a
> good use case for testing multi-processing during S3 resume. For MSR
> 0x3A (Feature Control), the firmware must execute the wrmsr
> instruction on each CPU, and the "rdmsr -a" Linux userspace command
> reads back the MSR on each CPU separately.)
>
> Thanks!
> Laszlo