From: "Laszlo Ersek" <lersek@redhat.com>
To: devel@edk2.groups.io
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>,
Brijesh Singh <brijesh.singh@amd.com>,
Erdem Aktas <erdemaktas@google.com>,
Gerd Hoffmann <kraxel@redhat.com>,
James Bottomley <jejb@linux.ibm.com>,
Jiewen Yao <jiewen.yao@intel.com>,
Jordan Justen <jordan.l.justen@intel.com>,
Min Xu <min.m.xu@intel.com>, Oliver Steffen <osteffen@redhat.com>,
Sebastien Boeuf <sebastien.boeuf@intel.com>,
Tom Lendacky <thomas.lendacky@amd.com>
Subject: Re: [edk2-devel] [PATCH v2] OvmfPkg/PlatformInitLib: catch QEMU's CPU hotplug reg block regression
Date: Thu, 12 Jan 2023 19:34:50 +0100 [thread overview]
Message-ID: <8f9592c3-08b7-3e8d-47c4-7ae78f0b8c36@redhat.com> (raw)
In-Reply-To: <20230112082845.128463-1-lersek@redhat.com>
On 1/12/23 09:28, Laszlo Ersek wrote:
> In QEMU v5.1.0, the CPU hotplug register block misbehaves: the negotiation
> protocol is (effectively) broken such that it suggests that switching from
> the legacy interface to the modern interface works, but in reality the
> switch never happens. The symptom has been witnessed when using TCG
> acceleration; KVM seems to mask the issue. The issue persists with the
> following (latest) stable QEMU releases: v5.2.0, v6.2.0, v7.2.0. Currently
> there is no stable release that addresses the problem.
>
> The QEMU bug confuses the Present and Possible counting in function
> PlatformMaxCpuCountInitialization(), in
> "OvmfPkg/Library/PlatformInitLib/Platform.c". OVMF ends up with Present=0
> Possible=1. This in turn further confuses MpInitLib in UefiCpuPkg (hence
> firmware-time multiprocessing will be broken). Worse, CPU hot(un)plug with
> SMI will be summarily broken in OvmfPkg/CpuHotplugSmm, which (considering
> the privilege level of SMM) is not that great.
>
> Detect the issue in PlatformMaxCpuCountInitialization(), and print an
> error message and *hang* if the issue is present.
>
> The problem was originally reported by Ard [0]. We analyzed it at [1] and
> [2]. A QEMU patch was sent at [3]; now merged as commit dab30fbef389
> ("acpi: cpuhp: fix guest-visible maximum access size to the legacy reg
> block", 2023-01-08), to be included in QEMU v8.0.0.
>
> [0] https://bugzilla.tianocore.org/show_bug.cgi?id=4234#c2
>
> [1] https://bugzilla.tianocore.org/show_bug.cgi?id=4234#c3
>
> [2] IO port write width clamping differs between TCG and KVM
> http://mid.mail-archive.com/aaedee84-d3ed-a4f9-21e7-d221a28d1683@redhat.com
> https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html
>
> [3] acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block
> http://mid.mail-archive.com/20230104090138.214862-1-lersek@redhat.com
> https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00278.html
>
> NOTE: PlatformInitLib is used in the following platform DSCs:
>
> OvmfPkg/AmdSev/AmdSevX64.dsc
> OvmfPkg/CloudHv/CloudHvX64.dsc
> OvmfPkg/IntelTdx/IntelTdxX64.dsc
> OvmfPkg/Microvm/MicrovmX64.dsc
> OvmfPkg/OvmfPkgIa32.dsc
> OvmfPkg/OvmfPkgIa32X64.dsc
> OvmfPkg/OvmfPkgX64.dsc
>
> but I can only test this change with the last three platforms, running on
> QEMU.
>
> Test results:
>
> TCG QEMU OVMF result
> patched patched
> --- ------- ------- -------------------------------------------------
> 0 0 0 CPU counts OK (KVM masks the QEMU bug)
> 0 0 1 CPU counts OK (KVM masks the QEMU bug)
> 0 1 0 CPU counts OK (QEMU fix, but KVM masks the QEMU
> bug anyway)
> 0 1 1 CPU counts OK (QEMU fix, but KVM masks the QEMU
> bug anyway)
> 1 0 0 boot with broken CPU counts (original QEMU bug)
> 1 0 1 broken CPU count caught (boot hangs)
> 1 1 0 CPU counts OK (QEMU fix)
> 1 1 1 CPU counts OK (QEMU fix)
>
> Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
> Cc: Brijesh Singh <brijesh.singh@amd.com>
> Cc: Erdem Aktas <erdemaktas@google.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: James Bottomley <jejb@linux.ibm.com>
> Cc: Jiewen Yao <jiewen.yao@intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Min Xu <min.m.xu@intel.com>
> Cc: Oliver Steffen <osteffen@redhat.com>
> Cc: Sebastien Boeuf <sebastien.boeuf@intel.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=4250
> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>
> Notes:
> v2:
>
> - V1 was at
> <http://mid.mail-archive.com/20230104151234.286030-1-lersek@redhat.com>.
>
> - Repo: <https://pagure.io/lersek/edk2.git>, branch:
> cpuhp-reg-catch-4250-v2
>
> - Remove KVM as a proposed workaround from the error message, because in
> the QEMU discussion, we had found that the KVM accelerator's behavior
> in QEMU (masking the problem) was not right, and that a fix for that
> had been in progress for quite some time.
>
> - Add the QEMU commit hash to the commit message, the code comment, and
> the error message.
>
> - Pick up Gerd's R-b; add Oliver to the Cc list.
>
> OvmfPkg/Library/PlatformInitLib/Platform.c | 35 ++++++++++++++++++++
> 1 file changed, 35 insertions(+)
>
> diff --git a/OvmfPkg/Library/PlatformInitLib/Platform.c b/OvmfPkg/Library/PlatformInitLib/Platform.c
> index 3e13c5d4b34f..13348afb4890 100644
> --- a/OvmfPkg/Library/PlatformInitLib/Platform.c
> +++ b/OvmfPkg/Library/PlatformInitLib/Platform.c
> @@ -541,6 +541,41 @@ PlatformMaxCpuCountInitialization (
> ASSERT (Selected == Possible || Selected == 0);
> } while (Selected > 0);
>
> + //
> + // Sanity check: we need at least 1 present CPU (CPU#0 is always present).
> + //
> + // The legacy-to-modern switching of the CPU hotplug register block got
> + // broken (for TCG) in QEMU v5.1.0. Refer to "IO port write width clamping
> + // differs between TCG and KVM" at
> + // <http://mid.mail-archive.com/aaedee84-d3ed-a4f9-21e7-d221a28d1683@redhat.com>
> + // or at
> + // <https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html>.
> + //
> + // QEMU received the fix in commit dab30fbef389 ("acpi: cpuhp: fix
> + // guest-visible maximum access size to the legacy reg block",
> + // 2023-01-08), to be included in QEMU v8.0.0.
> + //
> + // If we're affected by this QEMU bug, then we must not continue: it
> + // confuses the multiprocessing in UefiCpuPkg/Library/MpInitLib, and
> + // breaks CPU hot(un)plug with SMI in OvmfPkg/CpuHotplugSmm.
> + //
> + if (Present == 0) {
> + DEBUG ((
> + DEBUG_ERROR,
> + "%a: Broken CPU hotplug register block: Present=%u Possible=%u.\n"
> + "%a: Update QEMU to v8, or to stable with dab30fbef389 backported.\n"
> + "%a: Refer to "
> + "<https://bugzilla.tianocore.org/show_bug.cgi?id=4250>.\n",
> + __FUNCTION__,
> + Present,
> + Possible,
> + __FUNCTION__,
> + __FUNCTION__
> + ));
> + ASSERT (FALSE);
> + CpuDeadLoop ();
> + }
> +
> //
> // Sanity check: fw_cfg and the modern CPU hotplug interface should
> // return the same boot CPU count.
>
please do with this what you will
prev parent reply other threads:[~2023-01-12 18:34 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-12 8:28 [PATCH v2] OvmfPkg/PlatformInitLib: catch QEMU's CPU hotplug reg block regression Laszlo Ersek
2023-01-12 9:55 ` [edk2-devel] " Michael Brown
2023-01-12 10:09 ` Ard Biesheuvel
2023-01-12 13:31 ` Laszlo Ersek
2023-01-12 13:22 ` Laszlo Ersek
2023-01-12 16:08 ` Michael Brown
2023-01-12 17:58 ` Laszlo Ersek
2023-01-12 18:22 ` Laszlo Ersek
2023-01-12 22:49 ` Michael Brown
2023-01-13 6:03 ` Gerd Hoffmann
2023-01-13 9:32 ` Gerd Hoffmann
2023-01-13 10:10 ` Laszlo Ersek
2023-01-13 12:22 ` Gerd Hoffmann
2023-01-16 14:42 ` Ard Biesheuvel
2023-01-16 14:48 ` Laszlo Ersek
2023-01-17 12:37 ` Gerd Hoffmann
2023-01-17 16:43 ` Ard Biesheuvel
2023-01-18 7:25 ` Gerd Hoffmann
2023-01-18 11:50 ` Laszlo Ersek
2023-01-18 13:10 ` Gerd Hoffmann
2023-01-18 13:25 ` Laszlo Ersek
2023-01-18 13:10 ` Ard Biesheuvel
2023-01-18 13:21 ` Laszlo Ersek
2023-01-12 18:34 ` Laszlo Ersek [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8f9592c3-08b7-3e8d-47c4-7ae78f0b8c36@redhat.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox