public inbox for devel@edk2.groups.io
From: "Laszlo Ersek" <lersek@redhat.com>
To: devel@edk2.groups.io
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>,
	Brijesh Singh <brijesh.singh@amd.com>,
	Erdem Aktas <erdemaktas@google.com>,
	Gerd Hoffmann <kraxel@redhat.com>,
	James Bottomley <jejb@linux.ibm.com>,
	Jiewen Yao <jiewen.yao@intel.com>,
	Jordan Justen <jordan.l.justen@intel.com>,
	Min Xu <min.m.xu@intel.com>, Oliver Steffen <osteffen@redhat.com>,
	Sebastien Boeuf <sebastien.boeuf@intel.com>,
	Tom Lendacky <thomas.lendacky@amd.com>
Subject: Re: [edk2-devel] [PATCH v2] OvmfPkg/PlatformInitLib: catch QEMU's CPU hotplug reg block regression
Date: Thu, 12 Jan 2023 19:34:50 +0100
Message-ID: <8f9592c3-08b7-3e8d-47c4-7ae78f0b8c36@redhat.com>
In-Reply-To: <20230112082845.128463-1-lersek@redhat.com>

On 1/12/23 09:28, Laszlo Ersek wrote:
> In QEMU v5.1.0, the CPU hotplug register block misbehaves: the negotiation
> protocol is (effectively) broken such that it suggests that switching from
> the legacy interface to the modern interface works, but in reality the
> switch never happens. The symptom has been witnessed when using TCG
> acceleration; KVM seems to mask the issue. The issue persists with the
> following (latest) stable QEMU releases: v5.2.0, v6.2.0, v7.2.0. Currently
> there is no stable release that addresses the problem.
> 
> The QEMU bug confuses the Present and Possible counting in function
> PlatformMaxCpuCountInitialization(), in
> "OvmfPkg/Library/PlatformInitLib/Platform.c". OVMF ends up with Present=0
> Possible=1. This in turn further confuses MpInitLib in UefiCpuPkg (hence
> firmware-time multiprocessing will be broken). Worse, CPU hot(un)plug with
> SMI will be summarily broken in OvmfPkg/CpuHotplugSmm, which (considering
> the privilege level of SMM) is not that great.
> 
> Detect the issue in PlatformMaxCpuCountInitialization(), and print an
> error message and *hang* if the issue is present.
> 
> The problem was originally reported by Ard [0]. We analyzed it at [1] and
> [2]. A QEMU patch was sent at [3]; now merged as commit dab30fbef389
> ("acpi: cpuhp: fix guest-visible maximum access size to the legacy reg
> block", 2023-01-08), to be included in QEMU v8.0.0.
> 
> [0] https://bugzilla.tianocore.org/show_bug.cgi?id=4234#c2
> 
> [1] https://bugzilla.tianocore.org/show_bug.cgi?id=4234#c3
> 
> [2] IO port write width clamping differs between TCG and KVM
>     http://mid.mail-archive.com/aaedee84-d3ed-a4f9-21e7-d221a28d1683@redhat.com
>     https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html
> 
> [3] acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block
>     http://mid.mail-archive.com/20230104090138.214862-1-lersek@redhat.com
>     https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00278.html
> 
> NOTE: PlatformInitLib is used in the following platform DSCs:
> 
>   OvmfPkg/AmdSev/AmdSevX64.dsc
>   OvmfPkg/CloudHv/CloudHvX64.dsc
>   OvmfPkg/IntelTdx/IntelTdxX64.dsc
>   OvmfPkg/Microvm/MicrovmX64.dsc
>   OvmfPkg/OvmfPkgIa32.dsc
>   OvmfPkg/OvmfPkgIa32X64.dsc
>   OvmfPkg/OvmfPkgX64.dsc
> 
> but I can only test this change with the last three platforms, running on
> QEMU.
> 
> Test results:
> 
>   TCG  QEMU     OVMF     result
>        patched  patched
>   ---  -------  -------  -------------------------------------------------
>   0    0        0        CPU counts OK (KVM masks the QEMU bug)
>   0    0        1        CPU counts OK (KVM masks the QEMU bug)
>   0    1        0        CPU counts OK (QEMU fix, but KVM masks the QEMU
>                          bug anyway)
>   0    1        1        CPU counts OK (QEMU fix, but KVM masks the QEMU
>                          bug anyway)
>   1    0        0        boot with broken CPU counts (original QEMU bug)
>   1    0        1        broken CPU count caught (boot hangs)
>   1    1        0        CPU counts OK (QEMU fix)
>   1    1        1        CPU counts OK (QEMU fix)
> 
> Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
> Cc: Brijesh Singh <brijesh.singh@amd.com>
> Cc: Erdem Aktas <erdemaktas@google.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: James Bottomley <jejb@linux.ibm.com>
> Cc: Jiewen Yao <jiewen.yao@intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Min Xu <min.m.xu@intel.com>
> Cc: Oliver Steffen <osteffen@redhat.com>
> Cc: Sebastien Boeuf <sebastien.boeuf@intel.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=4250
> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
> 
> Notes:
>     v2:
>     
>     - V1 was at
>       <http://mid.mail-archive.com/20230104151234.286030-1-lersek@redhat.com>.
>     
>     - Repo: <https://pagure.io/lersek/edk2.git>, branch:
>       cpuhp-reg-catch-4250-v2
>     
>     - Remove KVM as a proposed workaround from the error message, because in
>       the QEMU discussion, we had found that the KVM accelerator's behavior
>       in QEMU (masking the problem) was not right, and that a fix for that
>       had been in progress for quite some time.
>     
>     - Add the QEMU commit hash to the commit message, the code comment, and
>       the error message.
>     
>     - Pick up Gerd's R-b; add Oliver to the Cc list.
> 
>  OvmfPkg/Library/PlatformInitLib/Platform.c | 35 ++++++++++++++++++++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/OvmfPkg/Library/PlatformInitLib/Platform.c b/OvmfPkg/Library/PlatformInitLib/Platform.c
> index 3e13c5d4b34f..13348afb4890 100644
> --- a/OvmfPkg/Library/PlatformInitLib/Platform.c
> +++ b/OvmfPkg/Library/PlatformInitLib/Platform.c
> @@ -541,6 +541,41 @@ PlatformMaxCpuCountInitialization (
>          ASSERT (Selected == Possible || Selected == 0);
>        } while (Selected > 0);
>  
> +      //
> +      // Sanity check: we need at least 1 present CPU (CPU#0 is always present).
> +      //
> +      // The legacy-to-modern switching of the CPU hotplug register block got
> +      // broken (for TCG) in QEMU v5.1.0. Refer to "IO port write width clamping
> +      // differs between TCG and KVM" at
> +      // <http://mid.mail-archive.com/aaedee84-d3ed-a4f9-21e7-d221a28d1683@redhat.com>
> +      // or at
> +      // <https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html>.
> +      //
> +      // QEMU received the fix in commit dab30fbef389 ("acpi: cpuhp: fix
> +      // guest-visible maximum access size to the legacy reg block",
> +      // 2023-01-08), to be included in QEMU v8.0.0.
> +      //
> +      // If we're affected by this QEMU bug, then we must not continue: it
> +      // confuses the multiprocessing in UefiCpuPkg/Library/MpInitLib, and
> +      // breaks CPU hot(un)plug with SMI in OvmfPkg/CpuHotplugSmm.
> +      //
> +      if (Present == 0) {
> +        DEBUG ((
> +          DEBUG_ERROR,
> +          "%a: Broken CPU hotplug register block: Present=%u Possible=%u.\n"
> +          "%a: Update QEMU to v8, or to stable with dab30fbef389 backported.\n"
> +          "%a: Refer to "
> +          "<https://bugzilla.tianocore.org/show_bug.cgi?id=4250>.\n",
> +          __FUNCTION__,
> +          Present,
> +          Possible,
> +          __FUNCTION__,
> +          __FUNCTION__
> +          ));
> +        ASSERT (FALSE);
> +        CpuDeadLoop ();
> +      }
> +
>        //
>        // Sanity check: fw_cfg and the modern CPU hotplug interface should
>        // return the same boot CPU count.
> 

Please do with this what you will.
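
For anyone skimming the test matrix in the quoted commit message: the eight
rows reduce to a small decision function. The following is only a minimal
standalone sketch in plain C, not part of the patch and not edk2 code. The
names BOOT_OUTCOME, ExpectedOutcome() and CheckAgainstTable() are
hypothetical, introduced here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of this exists in edk2. */
typedef enum {
  CPU_COUNTS_OK,       /* Present and Possible are counted correctly     */
  BOOT_BROKEN_COUNTS,  /* boot continues with Present=0 Possible=1       */
  BOOT_HANGS           /* patched OVMF catches the broken count and hangs */
} BOOT_OUTCOME;

/* Model of the "result" column of the test matrix in the commit message. */
BOOT_OUTCOME
ExpectedOutcome (bool Tcg, bool QemuPatched, bool OvmfPatched)
{
  /* KVM masks the QEMU bug, and a patched QEMU does not have it. */
  if (!Tcg || QemuPatched) {
    return CPU_COUNTS_OK;
  }

  /* TCG on unpatched QEMU: the guest sees the broken register block. */
  return OvmfPatched ? BOOT_HANGS : BOOT_BROKEN_COUNTS;
}

/* Replays all eight rows of the matrix against the model above. */
void
CheckAgainstTable (void)
{
  static const struct {
    bool          Tcg;
    bool          QemuPatched;
    bool          OvmfPatched;
    BOOT_OUTCOME  Expected;
  } Rows[] = {
    { false, false, false, CPU_COUNTS_OK      },
    { false, false, true,  CPU_COUNTS_OK      },
    { false, true,  false, CPU_COUNTS_OK      },
    { false, true,  true,  CPU_COUNTS_OK      },
    { true,  false, false, BOOT_BROKEN_COUNTS },
    { true,  false, true,  BOOT_HANGS         },
    { true,  true,  false, CPU_COUNTS_OK      },
    { true,  true,  true,  CPU_COUNTS_OK      },
  };
  size_t  Index;

  for (Index = 0; Index < sizeof Rows / sizeof Rows[0]; Index++) {
    assert (
      ExpectedOutcome (
        Rows[Index].Tcg,
        Rows[Index].QemuPatched,
        Rows[Index].OvmfPatched
        ) == Rows[Index].Expected
      );
  }
}
```

Calling CheckAgainstTable() from a test harness replays the matrix. The model
only restates the reported results; it says nothing about why KVM masks the
bug, which is covered in the QEMU thread referenced as [2] above.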

