From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mx.groups.io with SMTP id smtpd.web11.8895.1672909772954853600 for ; Thu, 05 Jan 2023 01:09:33 -0800 Authentication-Results: mx.groups.io; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=C7uOP35A; spf=pass (domain: redhat.com, ip: 170.10.129.124, mailfrom: lersek@redhat.com) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1672909772; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vXPh+uBh0MrPE216uerLJeceGFIJr5oFVjbGlPJzJOA=; b=C7uOP35Ak7fAXx/1dPMvsXETwvIkN+9zZDOFClU4ePrgBZbBpiuh3McUrHJythNDCWNKDe gzP6+ea7BpH9Avrjx+/ypCYOm5rSXsToBBFVMEXyQYKyuYm6ZVVXUQCfOBeBDQAzKDTUxu UimBVDAFUQGzx6Qj3nvCir2Rux4nERI= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-649-CU_CEjVvNEyZ4jeJYrJXDA-1; Thu, 05 Jan 2023 04:09:28 -0500 X-MC-Unique: CU_CEjVvNEyZ4jeJYrJXDA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 44A3F1C06EE0; Thu, 5 Jan 2023 09:09:28 +0000 (UTC) Received: from [10.39.192.44] (unknown [10.39.192.44]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 48CE62026D4B; Thu, 5 Jan 2023 09:09:26 +0000 (UTC) Message-ID: <9b3040df-fd2e-4b2d-fda6-62760425fdce@redhat.com> Date: Thu, 5 Jan 2023 10:09:24 +0100 MIME-Version: 1.0 Subject: Re: [edk2-devel] [PATCH V1 1/1] OvmfPkg/AcpiPlatformDxe: Add back log and fix code cohesion To: devel@edk2.groups.io, min.m.xu@intel.com Cc: Erdem Aktas , James Bottomley , Jiewen Yao , Gerd Hoffmann , Tom Lendacky , Sebastien Boeuf , Michael Brown , =?UTF-8?Q?Marvin_H=c3=a4user?= References: <20230104141309.1426-1-min.m.xu@intel.com> From: "Laszlo Ersek" In-Reply-To: <20230104141309.1426-1-min.m.xu@intel.com> X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit On 1/4/23 15:13, Min Xu wrote: > From: Min M Xu > > BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=4237 > > Commit 9fdc70af6ba8 breaks log analysis and code cohesion in > AcpiPlatformDxe. See the detailed information in BZ#4237. > > There are below changes in this patch: > 1) Add back debug log > 2) Add error checking and handling if InstallProtocolInterface failed. > 3) Delete the QemuAcpiTableNotify.h because it is superfluous. > 4) Delete the global variable "mQemuAcpiHandle" instead of a local > variable. > 5) Install the QEMU_ACPI_TABLE_NOTIFY_PROTOCOL with a NULL associated > interface. > > Above changes 2/4/5 are applied in both CloudHvAcpi.c and QemuFwCfgAcpi.c. > > Cc: Laszlo Ersek > Cc: Erdem Aktas > Cc: James Bottomley > Cc: Jiewen Yao > Cc: Gerd Hoffmann > Cc: Tom Lendacky > Cc: Sebastien Boeuf > Reported-by: Laszlo Ersek > Signed-off-by: Min Xu > --- > OvmfPkg/AcpiPlatformDxe/CloudHvAcpi.c | 25 +++++------ > OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c | 42 +++++++++++++------ > .../Include/Protocol/QemuAcpiTableNotify.h | 27 ------------ > 3 files changed, 42 insertions(+), 52 deletions(-) > delete mode 100644 OvmfPkg/Include/Protocol/QemuAcpiTableNotify.h First of all, I wouldn't like to make any comments on the "CloudHvAcpi.c" source file. I wouldn't like my Reviewed-by to cover any non-trivial change made to "CloudHvAcpi.c". Therefore, I suggest that this patch be split up to a series of small patches. The first patch should, in my opinion, just remove the QEMU_ACPI_TABLE_NOTIFY_PROTOCOL structure -- that is, "step 5". This should be a trivial change to both QemuFwCfgAcpi.c and CloudHvAcpi.c, and I'm comfortable reviewing that for both of these source files. This patch can also cover "step 3", the removal of "QemuAcpiTableNotify.h". Patches #2 and #3 should cover "step 4", *split* to separate patches between CloudHvAcpi.c and QemuFwCfgAcpi.c. I would only like to review the QemuFwCfgAcpi.c change, from these. So then we are left with steps 1 and 2. Patch #4 should just cover step 1, add back the debug log. This only affects QemuFwCfgAcpi.c, which I'd like to review. For step 2, patches #5 and #6 should add the missing error handling, and move the protocol installation to the right spot, and extend the error path accordingly. This must *definitely* be seprate for QemuFwCfgAcpi.c and CloudHvAcpi.c, hence my request for splitting step 2 to two patches (#5, #6). I won't look at the CloudHvAcpi.c, but will scrutinize the heck out of QemuFwCfgAcpi.c. --*-- I can already say that your current change for step 2 in "QemuFwCfgAcpi.c" is wrong, or at least incomplete. Please consider the following pattern: It is frequently the case that (a) a function needs to create a set of *permanent* resources (objects that are effectively the function's "output"), which are supposed to be published if the functon entirely succeeds, while at the same time (b) the function needs some *temporary* resources for the creation of the permanent resource set. The "permanent" resources must be preserved (= output to the caller, or "committed" to the system as a side effect) if and only if all steps succeed, whereas the "temporary" resources must be torn down (or rolled back) *unconditionally*. In InstallQemuFwCfgTables(), we have the following *temporary* resources: (1.1) LoaderStart -- the QEMU linker/loader script downloaded from fw_cfg (1.2) AllocationsRestrictedTo32Bit -- a dictionary of fw_cfg blobs that must not be allocated from 64-bit address space (1.3) Tracker -- a dictionary collecting the fw_cfg blobs that we have thus far downloaded from QEMU, as instructed by the linker/loader script (1.4) InstalledKey -- an array collecting the keys of the ACPI tables we've thus far installed with the ACPI Table Protocol. (1.5) SeenPointers -- a dictionary for one of the command types in the linker/loader script, ensuring that we don't attempt to install referenced tables more than once And we have the following *permanent* resources (committed to the system as a side-effect) (2.1) S3Context -- effectively an S3 Boot Script representation. Some operations in the ACPI linker/loader script of QEMU -- namely, the "write pointer" commands -- convey information from the guest *back* to QEMU, over fw_cfg, and these must be replayed during S3 resume. The system reset that occurs between suspend and resume clears out a bunch of information from QEMU, and these operations re-teach a set of information to QEMU. (2.2) the *current runtime effect* of the same "write pointer" commands as discussed above. S3Context is for S3 resume, but the same commands must take primarily *right now*. (2.3) the set of ACPI tables we've installed Now, if you look at InstallQemuFwCfgTables() *before* your original change (9fdc70af6ba8), you will see that: (a) the publication of S3Context (permanent resource 2.1) as an S3 Boot Script is the *final* step in the function. Due to various reasons, rolling back this step is not possible. Therefore no action that could fail must ever be performed in the function after this action. (b) The error path is constructed with *multiple labels* in such a way that the order of rollback operations is the *reverse order* of constructions, intermixing both temporary and permanent resources. When a constructive operation fails, a jump is taken such that the rollback steps for all constructive operations *not yet reached* are skipped, and the rollbacks for the constructions that *have* succeeded are all performed. We can call this a "cascade" of error handling labels. (c) Note that in some (infrequent!) cases, *multiple* resources can be released under the same error handling label, but that is generally the exception. It is only possible if the nature of some construction operations is such that they cannot fail separately (but still need tear-down), or some of the resources are data sets which can be genuinely empty, and whose tear-down needs no separation between empty vs. non-empty. You can see this with the uninstallation of the ACPI tables we've installed (permanent resource 2.3), which is grouped together under the same "UninstallAcpiTables" label with the releasing of the "SeenPointers" dictionary (temporary resource 1.5). There is no grand theory behind this; it simply mirrors the construction path: > SeenPointers = OrderedCollectionInit (PointerCompare, PointerCompare); > if (SeenPointers == NULL) { > Status = EFI_OUT_OF_RESOURCES; > goto FreeKeys; > } > > // > // second pass: identify and install ACPI tables > // > Installed = 0; > for (LoaderEntry = LoaderStart; LoaderEntry < LoaderEnd; ++LoaderEntry) { > if (LoaderEntry->Type == QemuLoaderCmdAddPointer) { > Status = Process2ndPassCmdAddPointer ( > &LoaderEntry->Command.AddPointer, > Tracker, > AcpiProtocol, > InstalledKey, > &Installed, > SeenPointers > ); > if (EFI_ERROR (Status)) { > goto UninstallAcpiTables; > } > } > } If "SeenPointers" (a dictionary) is initialized successfully, but *any* ACPI table installation fails in the loop later, then we must tear down "SeenPointers" (regardless of how many entries it contains) *and* uninstall all previously installed ACPI tables as well (using the keys we've put into InstalledKey -- their number is given by Installed). (d) Very importantly, because we have both permanent and temporary resources, we must fall through the error path at the end of the function even if the function as a whole succeeds. Therefore the tear-down steps on the error path for the *permanent* resources are all *restricted* to "Status" differing from EFI_SUCCESS. In other words, we go through the entire cascade in this ("grand success") case, but omit those rollback steps that would affect our permanent resources, which are now all done. Only the temporaries are torn down. At the very end, we have "return Status", which is consistent with the following two requirements (which the function does satisfy): - any time we perform a "goto" to an error handling label, Status must be set to an error value, either manually, or by the operation that just failed (and because of which we're taking the goto) - by the time we reach the first error handling label via *normal progression* (no goto), "Status" must be set to EFI_SUCCESS. This is achievable in two ways: one, set Status to EFI_SUCCESS explicitly, right there; or two: prove that, by that point, Status will have been set to EFI_SUCCESS. In this function, the latter is the case. --*-- Okay. After this brief introduction, consider what you're doing with the protocol installation: - the protocol in the protocol database is a new permanent resource (a system-wide side-effect) - installing the protocol is an operation that can fail - protocol installation *can* be rolled back (I discussed the TPL concerns around this in the Bugzilla, but I'm not going to repeat them here). This fact -- i.e. that the protocol installation *can* be rolled back -- is really nice BTW, because we already have one operation in the function that *cannot*. We can't have *two* final steps in the function, so it's just as well that the protocol installation *need not* be the last one. So what you need to do in patch#5 or patch#6 is this (diff shown with 6 lines of context): > diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > index 38584fd608df..0c2b2215f316 100644 > --- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > +++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > @@ -1246,50 +1246,63 @@ InstallQemuFwCfgTables ( > if (EFI_ERROR (Status)) { > goto UninstallAcpiTables; > } > } > } > > + // > + // Install a protocol to notify that the ACPI table provided by Qemu is > + // ready. > + // > + Status = gBS->InstallProtocolInterface ( > + &QemuAcpiHandle, > + &gQemuAcpiTableNotifyProtocolGuid, > + EFI_NATIVE_INTERFACE, > + NULL > + ); > + if (EFI_ERROR (Status)) { > + goto UninstallAcpiTables; > + } > + > // > // Translating the condensed QEMU_LOADER_WRITE_POINTER commands to ACPI S3 > // Boot Script opcodes has to be the last operation in this function, because > // if it succeeds, it cannot be undone. > // > if (S3Context != NULL) { > Status = TransferS3ContextToBootScript (S3Context); > if (EFI_ERROR (Status)) { > - goto UninstallAcpiTables; > + goto UninstallQemuAcpiTableNotifyProtocol; > } > > // > // Ownership of S3Context has been transferred. > // > S3Context = NULL; > } > > +UninstallQemuAcpiTableNotifyProtocol: > + if (EFI_ERROR (Status)) { > + gBS->UninstallProtocolInterface ( > + QemuAcpiHandle, > + &gQemuAcpiTableNotifyProtocolGuid, > + NULL > + ); > + } > + > UninstallAcpiTables: > if (EFI_ERROR (Status)) { > // > // roll back partial installation > // > while (Installed > 0) { > --Installed; > AcpiProtocol->UninstallAcpiTable (AcpiProtocol, InstalledKey[Installed]); > } > } else { > DEBUG ((DEBUG_INFO, "%a: installed %d tables\n", __FUNCTION__, Installed)); > - // > - // Install a protocol to notify that the ACPI table provided by Qemu is > - // ready. > - // > - gBS->InstallProtocolInterface ( > - &mQemuAcpiHandle, > - &gQemuAcpiTableNotifyProtocolGuid, > - EFI_NATIVE_INTERFACE, > - &mAcpiNotifyProtocol > - ); > } > > for (SeenPointerEntry = OrderedCollectionMin (SeenPointers); > SeenPointerEntry != NULL; > SeenPointerEntry = SeenPointerEntry2) > { Your proposed solution on the error path -- placing the uninstallation under the existent "UninstallAcpiTables" label, and gating it with (QemuAcpiHandle != NULL) -- may not be wrong, functionally speaking, but it still breaks the cohesion of the function. Specifically, the cohesion-break is that your proposal creates a path through the code where the (QemuAcpiHandle != NULL) condition is evaluated, on the error path, *without control ever having reached* InstallProtocolInterface() previously. Namely, if one of the Process2ndPassCmdAddPointer() calls fails, in the loop above InstallProtocolInterface(). Saving that code path is why you need to set "QemuAcpiHandle" to NULL at the top of the function. In fact, I request that you *not* set QemuAcpiHandle to NULL at the top of the function. Letting *only* InstallProtocolInterface() set it, upon success, is safe, as long as you use the pattern shown above. --*-- Okay, yet another topic. Before you ask, let me comment on this (from preexistent code): > FreeS3Context: > if (S3Context != NULL) { > ReleaseS3Context (S3Context); > } This is being gated so because S3Context undergoes an ownership transfer -- in that last, impossible-to-rollback step on the construction path --, and because S3Context is *optional* as well. So at this point S3Context is non-NULL precisely if we've had a failure *AND* S3Context was *needed* in the first place. Just checking EFI_ERROR (Status) is insufficient, because even on error, S3Context could still be NULL (it's optional), and once we check for nullity, checking Status becomes superfluous. (S3Context!=NULL at this point implies EFI_ERROR(Status), per above.) --*-- Finally, I *will* admit that the original placement of the DEBUG message is not ideal. It should not be on an "else" branch under the "UninstallAcpiTables" label. Instead, it should be, and *stay*, just above the very first error handling label, where we effectively declare the function "completed". So the original incorrect placement of the DEBUG is my mistake, from earlier -- I'm sorry. Thus, in patch#4, please do reinstate the DEBUG just above the "UninstallAcpiTables" label. And in patch #5 or #6, keep it above the new label "UninstallQemuAcpiTableNotifyProtocol". Thanks Laszlo > > diff --git a/OvmfPkg/AcpiPlatformDxe/CloudHvAcpi.c b/OvmfPkg/AcpiPlatformDxe/CloudHvAcpi.c > index cbe8bb9b0c75..81f387438a64 100644 > --- a/OvmfPkg/AcpiPlatformDxe/CloudHvAcpi.c > +++ b/OvmfPkg/AcpiPlatformDxe/CloudHvAcpi.c > @@ -18,13 +18,9 @@ > > #include > #include > -#include // QEMU_ACPI_TABLE_NOTIFY_PROTOCOL > > #include "AcpiPlatform.h" > > -EFI_HANDLE mChAcpiHandle = NULL; > -QEMU_ACPI_TABLE_NOTIFY_PROTOCOL mChAcpiNotifyProtocol; > - > EFI_STATUS > EFIAPI > InstallCloudHvTablesTdx ( > @@ -33,13 +29,15 @@ InstallCloudHvTablesTdx ( > { > EFI_STATUS Status; > UINTN TableHandle; > + EFI_HANDLE ChAcpiHandle; > > EFI_PEI_HOB_POINTERS Hob; > EFI_ACPI_DESCRIPTION_HEADER *CurrentTable; > EFI_ACPI_DESCRIPTION_HEADER *DsdtTable; > > - DsdtTable = NULL; > - TableHandle = 0; > + DsdtTable = NULL; > + TableHandle = 0; > + ChAcpiHandle = NULL; > > Hob.Guid = (EFI_HOB_GUID_TYPE *)GetFirstGuidHob (&gUefiOvmfPkgTdxAcpiHobGuid); > > @@ -92,12 +90,15 @@ InstallCloudHvTablesTdx ( > // Install a protocol to notify that the ACPI table provided by CH is > // ready. > // > - gBS->InstallProtocolInterface ( > - &mChAcpiHandle, > - &gQemuAcpiTableNotifyProtocolGuid, > - EFI_NATIVE_INTERFACE, > - &mChAcpiNotifyProtocol > - ); > + Status = gBS->InstallProtocolInterface ( > + &ChAcpiHandle, > + &gQemuAcpiTableNotifyProtocolGuid, > + EFI_NATIVE_INTERFACE, > + NULL > + ); > + if (EFI_ERROR (Status)) { > + return Status; > + } > > return EFI_SUCCESS; > } > diff --git a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > index c8dee17c13e6..22340b13e3ee 100644 > --- a/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > +++ b/OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpi.c > @@ -19,10 +19,7 @@ > #include // QemuFwCfgS3Enabled() > #include // gBS > > -#include > #include "AcpiPlatform.h" > -EFI_HANDLE mQemuAcpiHandle = NULL; > -QEMU_ACPI_TABLE_NOTIFY_PROTOCOL mAcpiNotifyProtocol; > > // > // The user structure for the ordered collection that will track the fw_cfg > @@ -1103,6 +1100,9 @@ InstallQemuFwCfgTables ( > ORDERED_COLLECTION_ENTRY *TrackerEntry, *TrackerEntry2; > ORDERED_COLLECTION *SeenPointers; > ORDERED_COLLECTION_ENTRY *SeenPointerEntry, *SeenPointerEntry2; > + EFI_HANDLE QemuAcpiHandle; > + > + QemuAcpiHandle = NULL; > > Status = QemuFwCfgFindFile ("etc/table-loader", &FwCfgItem, &FwCfgSize); > if (EFI_ERROR (Status)) { > @@ -1249,6 +1249,20 @@ InstallQemuFwCfgTables ( > } > } > > + // > + // Install a protocol to notify that the ACPI table provided by Qemu is > + // ready. > + // > + Status = gBS->InstallProtocolInterface ( > + &QemuAcpiHandle, > + &gQemuAcpiTableNotifyProtocolGuid, > + EFI_NATIVE_INTERFACE, > + NULL > + ); > + if (EFI_ERROR (Status)) { > + goto UninstallAcpiTables; > + } > + > // > // Translating the condensed QEMU_LOADER_WRITE_POINTER commands to ACPI S3 > // Boot Script opcodes has to be the last operation in this function, because > @@ -1275,17 +1289,19 @@ UninstallAcpiTables: > --Installed; > AcpiProtocol->UninstallAcpiTable (AcpiProtocol, InstalledKey[Installed]); > } > + > + // > + // Uninstall the notification if it has been installed. > + // > + if (QemuAcpiHandle != NULL) { > + gBS->UninstallProtocolInterface ( > + QemuAcpiHandle, > + &gQemuAcpiTableNotifyProtocolGuid, > + NULL > + ); > + } > } else { > - // > - // Install a protocol to notify that the ACPI table provided by Qemu is > - // ready. > - // > - gBS->InstallProtocolInterface ( > - &mQemuAcpiHandle, > - &gQemuAcpiTableNotifyProtocolGuid, > - EFI_NATIVE_INTERFACE, > - &mAcpiNotifyProtocol > - ); > + DEBUG ((DEBUG_INFO, "%a: installed %d tables\n", __FUNCTION__, Installed)); > } > > for (SeenPointerEntry = OrderedCollectionMin (SeenPointers); > diff --git a/OvmfPkg/Include/Protocol/QemuAcpiTableNotify.h b/OvmfPkg/Include/Protocol/QemuAcpiTableNotify.h > deleted file mode 100644 > index a3dd2fc1dc91..000000000000 > --- a/OvmfPkg/Include/Protocol/QemuAcpiTableNotify.h > +++ /dev/null > @@ -1,27 +0,0 @@ > -/** @file > - > - SPDX-License-Identifier: BSD-2-Clause-Patent > - > -**/ > - > -#ifndef QEMU_ACPI_TABLE_NOTIFY_H_ > -#define QEMU_ACPI_TABLE_NOTIFY_H_ > - > -#define QEMU_ACPI_TABLE_NOTIFY_GUID \ > - { 0x928939b2, 0x4235, 0x462f, { 0x95, 0x80, 0xf6, 0xa2, 0xb2, 0xc2, 0x1a, 0x4f } }; > - > -/// > -/// Forward declaration > -/// > -typedef struct _QEMU_ACPI_TABLE_NOTIFY_PROTOCOL QEMU_ACPI_TABLE_NOTIFY_PROTOCOL; > - > -/// > -/// Protocol structure > -/// > -struct _QEMU_ACPI_TABLE_NOTIFY_PROTOCOL { > - UINT8 Notify; > -}; > - > -extern EFI_GUID gQemuAcpiTableNotifyProtocolGuid; > - > -#endif