From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by mx.groups.io with SMTP id smtpd.web12.945.1571931223731719914 for ; Thu, 24 Oct 2019 08:33:43 -0700 Authentication-Results: mx.groups.io; dkim=missing; spf=pass (domain: redhat.com, ip: 209.132.183.28, mailfrom: philmd@redhat.com) Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E2313A705 for ; Thu, 24 Oct 2019 15:33:42 +0000 (UTC) Received: by mail-wr1-f70.google.com with SMTP id c6so13085511wrp.3 for ; Thu, 24 Oct 2019 08:33:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=t/bizLT+jFWGhukCbdS+HvG2IjNLtSh57V1pItvDW9A=; b=WeanQPu/caRW35XKcd1+7kpStIA+jYSfIPfk3YDV4n0u8Y4mLErp74GGUYc73NoNI7 mBv4+93OndBF064h7uqL56x6beuyXywqZ76LuwrVbvZP7NozrviOi6CBflVPgWhT9t8V YOUH8MCoPZfhCz9IELImiJT2+h6hRPuEXSSENfrXqDjhJSb+w7KDi/s6/4MTKjjh0w8P AeJEwpKq7iSBSQTMT0wRlv/QL4y7L5gMaHws6HHLbAep7ZnN9C8b1/Z3pLwo6WmeQnZy A3cvGu8oAAfnN37De7aKxUwS/o5AK7gdsh4Td+aN9TqXH9EaWu3bAS4qzLJ1f9dG6wEa ysdg== X-Gm-Message-State: APjAAAX5Db2ZFqJ+N1iYAOiOYroq0Oj+Dtbpsfz9akktkc4mZmsFRgrT cbRsRlGIQ2MvsrqrBrgRvLDdPlL2YAWlMwW2IIUHV8jNCVQsIHsmPAswsr949Z0Ng4XzkPz4dSH yuCDVQzOEeqKt9w== X-Received: by 2002:a05:600c:387:: with SMTP id w7mr5641928wmd.138.1571931221503; Thu, 24 Oct 2019 08:33:41 -0700 (PDT) X-Google-Smtp-Source: APXvYqzwV1mwkKWd2d5YZ/tFwJsLgNef6FCoMqc2QDJ1byH3KHGuTknqgZ46aMVX4mYexK2WjDDKjw== X-Received: by 2002:a05:600c:387:: with SMTP id w7mr5641901wmd.138.1571931221144; Thu, 24 Oct 2019 08:33:41 -0700 (PDT) Received: from [192.168.1.115] (129.red-83-57-174.dynamicip.rima-tde.net. [83.57.174.129]) by smtp.gmail.com with ESMTPSA id e12sm22262399wrs.49.2019.10.24.08.33.40 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 24 Oct 2019 08:33:40 -0700 (PDT) Subject: Re: [edk2-devel] [PATCH v2 3/3] OvmfPkg/PlatformPei: rewrite MaxCpuCountInitialization() for CPU hotplug To: devel@edk2.groups.io, lersek@redhat.com Cc: Anthony Perard , Ard Biesheuvel , Igor Mammedov , Jordan Justen , Julien Grall References: <20191022221554.14963-1-lersek@redhat.com> <20191022221554.14963-4-lersek@redhat.com> From: =?UTF-8?B?UGhpbGlwcGUgTWF0aGlldS1EYXVkw6k=?= Message-ID: <9826409d-43ff-f432-f894-9b5c2f9d5a2a@redhat.com> Date: Thu, 24 Oct 2019 17:33:39 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.1 MIME-Version: 1.0 In-Reply-To: <20191022221554.14963-4-lersek@redhat.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit On 10/23/19 12:15 AM, Laszlo Ersek wrote: > MaxCpuCountInitialization() currently handles the following options: > > (1) QEMU does not report the boot CPU count (FW_CFG_NB_CPUS is 0) > > In this case, PlatformPei makes MpInitLib enumerate APs up to the > default PcdCpuMaxLogicalProcessorNumber value (64) minus 1, or until > the default PcdCpuApInitTimeOutInMicroSeconds (50,000) elapses. > (Whichever is reached first.) > > Time-limited AP enumeration had never been reliable on QEMU/KVM, which > is why commit 45a70db3c3a5 strated handling case (2) below, in OVMF. > > (2) QEMU reports the boot CPU count (FW_CFG_NB_CPUS is nonzero) > > In this case, PlatformPei sets > > - PcdCpuMaxLogicalProcessorNumber to the reported boot CPU count > (FW_CFG_NB_CPUS, which exports "PCMachineState.boot_cpus"), > > - and PcdCpuApInitTimeOutInMicroSeconds to practically "infinity" > (MAX_UINT32, ~71 minutes). > > That causes MpInitLib to enumerate exactly the present (boot) APs. > > With CPU hotplug in mind, this method is not good enough. Because, > using QEMU terminology, UefiCpuPkg expects > PcdCpuMaxLogicalProcessorNumber to provide the "possible CPUs" count > ("MachineState.smp.max_cpus"), which includes present and not present > CPUs both (with not present CPUs being subject for hot-plugging). > FW_CFG_NB_CPUS does not include not present CPUs. > > Rewrite MaxCpuCountInitialization() for handling the following cases: > > (1) The behavior of case (1) does not change. (No UefiCpuPkg PCDs are set > to values different from the defaults.) > > (2) QEMU reports the boot CPU count ("PCMachineState.boot_cpus", via > FW_CFG_NB_CPUS), but not the possible CPUs count > ("MachineState.smp.max_cpus"). > > In this case, the behavior remains unchanged. > > The way MpInitLib is instructed to do the same differs however: we now > set the new PcdCpuBootLogicalProcessorNumber to the boot CPU count > (while continuing to set PcdCpuMaxLogicalProcessorNumber identically). > PcdCpuApInitTimeOutInMicroSeconds becomes irrelevant. > > (3) QEMU reports both the boot CPU count ("PCMachineState.boot_cpus", via > FW_CFG_NB_CPUS), and the possible CPUs count > ("MachineState.smp.max_cpus"). > > We tell UefiCpuPkg about the possible CPUs count through > PcdCpuMaxLogicalProcessorNumber. We also tell MpInitLib the boot CPU > count for precise and quick AP enumeration, via > PcdCpuBootLogicalProcessorNumber. PcdCpuApInitTimeOutInMicroSeconds is > irrelevant again. > > This patch is a pre-requisite for enabling CPU hotplug with SMM_REQUIRE. > As a side effect, the patch also enables S3 to work with CPU hotplug at > once, *without* SMM_REQUIRE. > > (Without the patch, S3 resume fails, if a CPU is hot-plugged at OS > runtime, prior to suspend: the FW_CFG_NB_CPUS increase seen during resume > causes PcdCpuMaxLogicalProcessorNumber to increase as well, which is not > permitted. > > With the patch, PcdCpuMaxLogicalProcessorNumber stays the same, namely > "MachineState.smp.max_cpus". Therefore, the CPU structures allocated > during normal boot can accommodate the CPUs at S3 resume that have been > hotplugged prior to S3 suspend.) > > Cc: Anthony Perard > Cc: Ard Biesheuvel > Cc: Igor Mammedov > Cc: Jordan Justen > Cc: Julien Grall > Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515 > Signed-off-by: Laszlo Ersek > --- > > Notes: > v2: > > - use "possible CPUs" term in the code and the commit message [Igor] > > - add details about S3 to the commit message [Igor] > > - use QEMU's existent CPU hotplug register block, rather than a new > named file in fw_cfg [Igor] > > - tested on QEMU v2.6.2 (legacy-only CPU hotplug register block), v2.7.1 > (modern register block, but buggy fw_cfg), v2.8.1.1 (no QEMU issues), > v4.0.0 (no QEMU issues) > > OvmfPkg/OvmfPkgIa32.dsc | 2 +- > OvmfPkg/OvmfPkgIa32X64.dsc | 2 +- > OvmfPkg/OvmfPkgX64.dsc | 2 +- > OvmfPkg/PlatformPei/PlatformPei.inf | 2 +- > OvmfPkg/PlatformPei/Platform.c | 172 +++++++++++++++++--- > 5 files changed, 151 insertions(+), 29 deletions(-) > > diff --git a/OvmfPkg/OvmfPkgIa32.dsc b/OvmfPkg/OvmfPkgIa32.dsc > index 4301e7821902..68d8a9fb9655 100644 > --- a/OvmfPkg/OvmfPkgIa32.dsc > +++ b/OvmfPkg/OvmfPkgIa32.dsc > @@ -553,11 +553,11 @@ [PcdsDynamicDefault] > gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE > gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE > > # UefiCpuPkg PCDs related to initial AP bringup and general AP management. > gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64 > - gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000 > + gUefiCpuPkgTokenSpaceGuid.PcdCpuBootLogicalProcessorNumber|0 > > # Set memory encryption mask > gEfiMdeModulePkgTokenSpaceGuid.PcdPteMemoryEncryptionAddressOrMask|0x0 > > !if $(SMM_REQUIRE) == TRUE > diff --git a/OvmfPkg/OvmfPkgIa32X64.dsc b/OvmfPkg/OvmfPkgIa32X64.dsc > index 803fd74ae8e4..e5a6260b6088 100644 > --- a/OvmfPkg/OvmfPkgIa32X64.dsc > +++ b/OvmfPkg/OvmfPkgIa32X64.dsc > @@ -565,11 +565,11 @@ [PcdsDynamicDefault] > gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE > gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE > > # UefiCpuPkg PCDs related to initial AP bringup and general AP management. > gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64 > - gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000 > + gUefiCpuPkgTokenSpaceGuid.PcdCpuBootLogicalProcessorNumber|0 > > # Set memory encryption mask > gEfiMdeModulePkgTokenSpaceGuid.PcdPteMemoryEncryptionAddressOrMask|0x0 > > !if $(SMM_REQUIRE) == TRUE > diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc > index 5dbd1b793a90..f5d904945103 100644 > --- a/OvmfPkg/OvmfPkgX64.dsc > +++ b/OvmfPkg/OvmfPkgX64.dsc > @@ -564,11 +564,11 @@ [PcdsDynamicDefault] > gEfiMdeModulePkgTokenSpaceGuid.PcdSetNxForStack|FALSE > gEfiMdeModulePkgTokenSpaceGuid.PcdPropertiesTableEnable|FALSE > > # UefiCpuPkg PCDs related to initial AP bringup and general AP management. > gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber|64 > - gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds|50000 > + gUefiCpuPkgTokenSpaceGuid.PcdCpuBootLogicalProcessorNumber|0 > > # Set memory encryption mask > gEfiMdeModulePkgTokenSpaceGuid.PcdPteMemoryEncryptionAddressOrMask|0x0 > > !if $(SMM_REQUIRE) == TRUE > diff --git a/OvmfPkg/PlatformPei/PlatformPei.inf b/OvmfPkg/PlatformPei/PlatformPei.inf > index d9fd9c8f05b3..30eaebdfae63 100644 > --- a/OvmfPkg/PlatformPei/PlatformPei.inf > +++ b/OvmfPkg/PlatformPei/PlatformPei.inf > @@ -96,11 +96,11 @@ [Pcd] > gEfiMdeModulePkgTokenSpaceGuid.PcdAcpiS3Enable > gEfiMdeModulePkgTokenSpaceGuid.PcdPteMemoryEncryptionAddressOrMask > gEfiSecurityPkgTokenSpaceGuid.PcdOptionRomImageVerificationPolicy > gUefiCpuPkgTokenSpaceGuid.PcdCpuLocalApicBaseAddress > gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber > - gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds > + gUefiCpuPkgTokenSpaceGuid.PcdCpuBootLogicalProcessorNumber > gUefiCpuPkgTokenSpaceGuid.PcdCpuApStackSize > > [FixedPcd] > gEfiMdePkgTokenSpaceGuid.PcdPciExpressBaseAddress > > diff --git a/OvmfPkg/PlatformPei/Platform.c b/OvmfPkg/PlatformPei/Platform.c > index 3ba2459872d9..e5e8581752b5 100644 > --- a/OvmfPkg/PlatformPei/Platform.c > +++ b/OvmfPkg/PlatformPei/Platform.c > @@ -28,11 +28,14 @@ > #include > #include > #include > #include > #include > +#include > #include > +#include > +#include > #include > > #include "Platform.h" > #include "Cmos.h" > > @@ -562,47 +565,165 @@ S3Verification ( > #endif > } > > > /** > - Fetch the number of boot CPUs from QEMU and expose it to UefiCpuPkg modules. > - Set the mMaxCpuCount variable. > + Fetch the boot CPU count and the possible CPU count from QEMU, and expose > + them to UefiCpuPkg modules. Set the mMaxCpuCount variable. > **/ > VOID > MaxCpuCountInitialization ( > VOID > ) > { > - UINT16 ProcessorCount; > + UINT16 BootCpuCount; > RETURN_STATUS PcdStatus; > > + // > + // Try to fetch the boot CPU count. > + // > QemuFwCfgSelectItem (QemuFwCfgItemSmpCpuCount); > - ProcessorCount = QemuFwCfgRead16 (); > - // > - // If the fw_cfg key or fw_cfg entirely is unavailable, load mMaxCpuCount > - // from the PCD default. No change to PCDs. > - // > - if (ProcessorCount == 0) { > + BootCpuCount = QemuFwCfgRead16 (); > + if (BootCpuCount == 0) { > + // > + // QEMU doesn't report the boot CPU count. (BootCpuCount == 0) will let > + // MpInitLib count APs up to (PcdCpuMaxLogicalProcessorNumber - 1), or > + // until PcdCpuApInitTimeOutInMicroSeconds elapses (whichever is reached > + // first). > + // > + DEBUG ((DEBUG_WARN, "%a: boot CPU count unavailable\n", __FUNCTION__)); > mMaxCpuCount = PcdGet32 (PcdCpuMaxLogicalProcessorNumber); > - return; > + } else { > + // > + // We will expose BootCpuCount to MpInitLib. MpInitLib will count APs up to > + // (BootCpuCount - 1) precisely, regardless of timeout. > + // > + // Now try to fetch the possible CPU count. > + // > + UINTN CpuHpBase; > + UINT32 CmdData2; > + > + CpuHpBase = ((mHostBridgeDevId == INTEL_Q35_MCH_DEVICE_ID) ? > + ICH9_CPU_HOTPLUG_BASE : PIIX4_CPU_HOTPLUG_BASE); > + > + // > + // If only legacy mode is available in the CPU hotplug register block, or > + // the register block is completely missing, then the writes below are > + // no-ops. > + // > + // 1. Switch the hotplug register block to modern mode. > + // > + IoWrite32 (CpuHpBase + QEMU_CPUHP_W_CPU_SEL, 0); > + // > + // 2. Select a valid CPU for deterministic reading of > + // QEMU_CPUHP_R_CMD_DATA2. > + // > + // CPU#0 is always valid; it is the always present and non-removable > + // BSP. > + // > + IoWrite32 (CpuHpBase + QEMU_CPUHP_W_CPU_SEL, 0); > + // > + // 3. Send a command after which QEMU_CPUHP_R_CMD_DATA2 is specified to > + // read as zero, and which does not invalidate the selector. (The > + // selector may change, but it must not become invalid.) > + // > + // Send QEMU_CPUHP_CMD_GET_PENDING, as it will prove useful later. > + // > + IoWrite8 (CpuHpBase + QEMU_CPUHP_W_CMD, QEMU_CPUHP_CMD_GET_PENDING); > + // > + // 4. Read QEMU_CPUHP_R_CMD_DATA2. > + // > + // If the register block is entirely missing, then this is an unassigned > + // IO read, returning all-bits-one. > + // > + // If only legacy mode is available, then bit#0 stands for CPU#0 in the > + // "CPU present bitmap". CPU#0 is always present. > + // > + // Otherwise, QEMU_CPUHP_R_CMD_DATA2 is either still reserved (returning > + // all-bits-zero), or it is specified to read as zero after the above > + // steps. Both cases confirm modern mode. > + // > + CmdData2 = IoRead32 (CpuHpBase + QEMU_CPUHP_R_CMD_DATA2); > + DEBUG ((DEBUG_VERBOSE, "%a: CmdData2=0x%x\n", __FUNCTION__, CmdData2)); > + if (CmdData2 != 0) { > + // > + // QEMU doesn't support the modern CPU hotplug interface. Assume that the > + // possible CPU count equals the boot CPU count (precluding hotplug). > + // > + DEBUG ((DEBUG_WARN, "%a: modern CPU hotplug interface unavailable\n", > + __FUNCTION__)); > + mMaxCpuCount = BootCpuCount; > + } else { > + // > + // Grab the possible CPU count from the modern CPU hotplug interface. > + // > + UINT32 Present, Possible, Selected; > + > + Present = 0; > + Possible = 0; > + > + // > + // We've sent QEMU_CPUHP_CMD_GET_PENDING last; this ensures > + // QEMU_CPUHP_RW_CMD_DATA can now be read usefully. However, > + // QEMU_CPUHP_CMD_GET_PENDING may have selected a CPU with actual pending > + // hotplug events; therefore, select CPU#0 forcibly. > + // > + IoWrite32 (CpuHpBase + QEMU_CPUHP_W_CPU_SEL, Possible); Nitpicking, I find this easier to read (matches the comment): IoWrite32 (CpuHpBase + QEMU_CPUHP_W_CPU_SEL, 0); Rest very well documented, thanks. Reviewed-by: Philippe Mathieu-Daude > + > + do { > + UINT8 CpuStatus; > + > + // > + // Read the status of the currently selected CPU. This will help with a > + // sanity check against "BootCpuCount". > + // > + CpuStatus = IoRead8 (CpuHpBase + QEMU_CPUHP_R_CPU_STAT); > + if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) != 0) { > + ++Present; > + } > + // > + // Attempt to select the next CPU. > + // > + ++Possible; > + IoWrite32 (CpuHpBase + QEMU_CPUHP_W_CPU_SEL, Possible); > + // > + // If the selection is successful, then the following read will return > + // the selector (which we know is positive at this point). Otherwise, > + // the read will return 0. > + // > + Selected = IoRead32 (CpuHpBase + QEMU_CPUHP_RW_CMD_DATA); > + ASSERT (Selected == Possible || Selected == 0); > + } while (Selected > 0); > + > + // > + // Sanity check: fw_cfg and the modern CPU hotplug interface should > + // return the same boot CPU count. > + // > + if (BootCpuCount != Present) { > + DEBUG ((DEBUG_WARN, "%a: QEMU v2.7 reset bug: BootCpuCount=%d " > + "Present=%u\n", __FUNCTION__, BootCpuCount, Present)); > + // > + // The handling of QemuFwCfgItemSmpCpuCount, across CPU hotplug plus > + // platform reset (including S3), was corrected in QEMU commit > + // e3cadac073a9 ("pc: fix FW_CFG_NB_CPUS to account for -device added > + // CPUs", 2016-11-16), part of release v2.8.0. > + // > + BootCpuCount = (UINT16)Present; > + } > + > + mMaxCpuCount = Possible; > + } > } > - // > - // Otherwise, set mMaxCpuCount to the value reported by QEMU. > - // > - mMaxCpuCount = ProcessorCount; > - // > - // Additionally, tell UefiCpuPkg modules (a) the exact number of VCPUs, (b) > - // to wait, in the initial AP bringup, exactly as long as it takes for all of > - // the APs to report in. For this, we set the longest representable timeout > - // (approx. 71 minutes). > - // > - PcdStatus = PcdSet32S (PcdCpuMaxLogicalProcessorNumber, ProcessorCount); > + > + DEBUG ((DEBUG_INFO, "%a: BootCpuCount=%d mMaxCpuCount=%u\n", __FUNCTION__, > + BootCpuCount, mMaxCpuCount)); > + ASSERT (BootCpuCount <= mMaxCpuCount); > + > + PcdStatus = PcdSet32S (PcdCpuBootLogicalProcessorNumber, BootCpuCount); > ASSERT_RETURN_ERROR (PcdStatus); > - PcdStatus = PcdSet32S (PcdCpuApInitTimeOutInMicroSeconds, MAX_UINT32); > + PcdStatus = PcdSet32S (PcdCpuMaxLogicalProcessorNumber, mMaxCpuCount); > ASSERT_RETURN_ERROR (PcdStatus); > - DEBUG ((DEBUG_INFO, "%a: QEMU reports %d processor(s)\n", __FUNCTION__, > - ProcessorCount)); > } > > > /** > Perform Platform PEI initialization. > @@ -636,17 +757,18 @@ InitializePlatform ( > } > > S3Verification (); > BootModeInitialization (); > AddressWidthInitialization (); > - MaxCpuCountInitialization (); > > // > // Query Host Bridge DID > // > mHostBridgeDevId = PciRead16 (OVMF_HOSTBRIDGE_DID); > > + MaxCpuCountInitialization (); > + > if (FeaturePcdGet (PcdSmmSmramRequire)) { > Q35TsegMbytesInitialization (); > } > > PublishPeiMemory (); >