From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mx.groups.io with SMTP id smtpd.web10.63677.1673547734533605338 for ; Thu, 12 Jan 2023 10:22:14 -0800 Authentication-Results: mx.groups.io; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=UUJcDjk4; spf=pass (domain: redhat.com, ip: 170.10.129.124, mailfrom: lersek@redhat.com) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673547733; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LeP+mEUz4VlTlvq+V97NyaT+deI3tq6NgH6Qs3p4kM0=; b=UUJcDjk4zCmhlujYgU4cbkqdkkInbGihBMT5hiVpM4bfam7HlQuIKSVNGzhSIunZ9Bultn mOqkeDubRXkkY0dDZ8xhLYYlm3nZIbIgbGvqb1/CsKDUk035ACSZi1X0Os3Upwy4QucrGy Y8kJhGqXsIEgi0C7rn3GRRbWygzhREY= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-633-chRjOWshPbW3f52RSqSfjQ-1; Thu, 12 Jan 2023 13:22:10 -0500 X-MC-Unique: chRjOWshPbW3f52RSqSfjQ-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CEF113C10254; Thu, 12 Jan 2023 18:22:07 +0000 (UTC) Received: from [10.39.192.93] (unknown [10.39.192.93]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 8CB3E40C2006; Thu, 12 Jan 2023 18:22:05 +0000 (UTC) Message-ID: Date: Thu, 12 Jan 2023 19:22:04 +0100 MIME-Version: 1.0 Subject: Re: [edk2-devel] [PATCH v2] OvmfPkg/PlatformInitLib: catch QEMU's CPU hotplug reg block regression From: "Laszlo Ersek" To: Michael Brown , devel@edk2.groups.io Cc: Ard Biesheuvel , Brijesh Singh , Erdem Aktas , Gerd Hoffmann , James Bottomley , Jiewen Yao , Jordan Justen , Min Xu , Oliver Steffen , Sebastien Boeuf , Tom Lendacky References: <20230112082845.128463-1-lersek@redhat.com> <01020185a568604c-e16d8581-963a-4ff3-8566-bf0640ad327d-000000@eu-west-1.amazonses.com> <407c5cee-7a6c-cbc8-35cc-8f2c2724914c@redhat.com> <01020185a6bda78a-05d82180-4d1a-4af4-9a9b-ac78088d11ed-000000@eu-west-1.amazonses.com> <49e4e8bb-3bbd-0ca8-ee59-e75560deffa7@redhat.com> In-Reply-To: <49e4e8bb-3bbd-0ca8-ee59-e75560deffa7@redhat.com> X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit On 1/12/23 18:58, Laszlo Ersek wrote: > Your proposal is entirely justified, from a practical / user > perspective, but I'm not the right person for it. This should do what you propose (untested), but I hate the code myself. diff --git a/OvmfPkg/Library/PlatformInitLib/Platform.c b/OvmfPkg/Library/PlatformInitLib/Platform.c index 3e13c5d4b34f..b4c0b47c8bb5 100644 --- a/OvmfPkg/Library/PlatformInitLib/Platform.c +++ b/OvmfPkg/Library/PlatformInitLib/Platform.c @@ -541,6 +541,22 @@ PlatformMaxCpuCountInitialization ( ASSERT (Selected == Possible || Selected == 0); } while (Selected > 0); + if (Present == 0) { + if (FeaturePcdGet (PcdSmmSmramRequire)) { + ASSERT (FALSE); + CpuDeadLoop(); + } + // + // The bug is in QEMU v5.1.0+, where we're not affected by the QEMU v2.7 + // reset bug, so BootCpuCount from fw_cfg is reliable. Assume a fully + // populated topology, precluding hotplug, like when the modern CPU + // hotplug interface is unavailable. This will satisfy the QEMU v2.7 + // reset bug sanity check below as well. + // + Present = BootCpuCount; + Possible = BootCpuCount; + } + // // Sanity check: fw_cfg and the modern CPU hotplug interface should // return the same boot CPU count. I *really* don't like this. With this, you might get through the firmware and reach the guest OS, and pre-boot, the MP PPI and protocol might even work. But, because the QEMU register block is broken, I don't have the slightest idea what's going to happen if you attempt a CPU hot-*unplug* at OS runtime (unplug because in effect you start with a fully populated topology). OVMF does not participate in that (because of no SMM), but the ACPI content generated by QEMU does, and that content (AML methods) do massage the register block. In a sense, with the register block negotiation broken, even without SMM in the picture, the guest OS could consider OVMF doing it a *service* by hanging, so that guest OS-level data is not lost when the admin attempts an unplug, and the guest OS crashes. (I've not reproduced such a crash to be clear, I just think it could be possible.) There's got to be a limit to how far we try to compensate for broken (virtual) hardware. :( The right thing to do is to wait for the QEMU patch to reach as many as possible stable branches, let the distros pick up the new stable releases, and then merge the hardliner hang. Laszlo