From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 2F6531A1E26 for ; Tue, 23 Aug 2016 20:27:14 -0700 (PDT) Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9D9BC20266; Wed, 24 Aug 2016 03:27:13 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-116-2.phx2.redhat.com [10.3.116.2]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u7O3RBOJ027975; Tue, 23 Aug 2016 23:27:11 -0400 To: "Zeng, Star" , "Fan, Jeff" , "edk2-devel@ml01.01.org" References: <1470128388-17960-1-git-send-email-jeff.fan@intel.com> <1470128388-17960-49-git-send-email-jeff.fan@intel.com> <4f61b2b4-eeb4-8435-412f-20848347c88e@redhat.com> <542CF652F8836A4AB8DBFAAD40ED192A143D9CFD@shsmsx102.ccr.corp.intel.com> <795dd4fe-cd16-c0f1-7f04-f78601e2c7a8@redhat.com> <82c6b5c9-dcab-f3c1-5ffe-20fb27ddb1af@intel.com> <6cd44902-8c07-8f76-1204-d79c85559ae5@intel.com> <7d39ce29-88ce-c5f2-af61-7fd964acb5c0@intel.com> Cc: "Kinney, Michael D" , "Tian, Feng" , "Jordan Justen (Intel address)" From: Laszlo Ersek Message-ID: <801f810c-4cd7-d657-f536-a8471a37802c@redhat.com> Date: Tue, 23 Aug 2016 23:27:10 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <7d39ce29-88ce-c5f2-af61-7fd964acb5c0@intel.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Wed, 24 Aug 2016 03:27:13 +0000 (UTC) Subject: Re: [Patch v5 48/48] UefiCpuPkg/PiSmmCpuDxeSmm: Add gEfiVariableArchProtocolGuid dependency X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Aug 2016 03:27:14 -0000 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit On 08/23/16 22:39, Zeng, Star wrote: > On 2016/8/23 23:33, Laszlo Ersek wrote: >> On 08/18/16 22:57, Zeng, Star wrote: >>> On 2016/8/19 10:45, Zeng, Star wrote: >>>> On 2016/8/19 10:26, Laszlo Ersek wrote: >>>>> On 08/19/16 04:00, Fan, Jeff wrote: >>>>>> Laszlo, >>>>>> >>>>>> I could revert this patch firstly. >>>>> >>>>> Thank you, that would be very kind. >>>>> >>>>>> Could you please dig out the OVMF to address the potential issue, >>>>>> then I could re-commit it for those platforms required this patch. >>>>> >>>>> The problem is that this week (what remains of it) and the next week >>>>> I won't really have time for this -- tomorrow I'm hoping to finish >>>>> up something else in a mortal hurry. It was actually in preparation >>>>> for rebasing / pushing that work that I pulled "master", and found >>>>> out about the regression. >>>>> >>>>> Can we perhaps get help from the variable stack maintainers? What's >>>>> the design of the (lack of) depexes on those drivers? >>>> >>>> Variable driver consumes >>>> PcdFlashNvStorageVariableBase(64)/PcdFlashNvStorageVariableSize to >>>> produce *gEfiVariableArchProtocolGuid* protocol. Variable driver >>>> registers (SMM)FTW protocol notification function >>>> SmmFtwNotificationEvent() or FtwNotificationEvent() to produce >>>> *gEfiVariableWriteArchProtocolGuid* protocol. (SMM)FTW driver has >>>> dependency on gEfiSmmFirmwareVolumeBlockProtocolGuid or >>>> gEfiFirmwareVolumeBlockProtocolGuid. >>>> >>>> I am not so sure what you said about the (lack of) depexes on those >>>> drivers. >>>> >>>> Did you see variable driver loaded and started when ASSERT appeared >>>> on OVMF? >>> >>> >>> You may could compare the serial logs to get if there is some driver >>> execution flow differences for the images built without and with this >>> patch. >> >> The only difference is that in the working case, PiSmmCpuDxeSmm.efi is >> dispatched immediately before FvbServicesSmm.efi, while in the broken >> case, PiSmmCpuDxeSmm.efi is not dispatched (it is delayed due to the new >> depex) and FvbServicesSmm.efi is dispatched without PiSmmCpuDxeSmm.efi >> being present, and it breaks. >> >> I see that PiSmmCpuDxeSmm.efi installs: >> - gEfiSmmConfigurationProtocolGuid, >> - EfiSmmCpuProtocol, >> - EfiSmmCpuServiceProtocol, >> >> so it might not be hard to add a depex on one of those (DXE/SMM) >> protocols to FvbServicesSmm.efi. In particular, the PI 1.4a spec says in >> Volume 4, "1.6.1 SMM Drivers": >> >> the dependency expression in the file section EFI_SECTION_SMM_DEPEX >> [...] can refer to both UEFI and SMM protocols >> >> so we could easily make FvbServicesSmm.efi dependent on either of those >> protocols, I think. >> >> However, is that the official way to delay the entry point function of a >> DXE_SMM_DRIVER module until the code would actually run in SMM? Section >> "1.7 SMM Driver Initialization" says: >> >> An SMM Driver's initialization phase begins when the driver has been >> loaded into SMRAM and its entry point is called. An SMM Driver's >> initialization phase ends when the entry point returns. >> >> During SMM Driver initialization, SMM Drivers have access to two >> sets of protocols: UEFI and SMM. UEFI protocols are those which are >> installed and discovered using the UEFI Boot Services. UEFI >> protocols can be located and used by SMM drivers only during SMM >> Initialization. SMM protocols are those which are installed and >> discovered using the System Management Services Table (SMST). SMM >> protocols can be discovered by SMM drivers during initialization >> time and accessed while inside of SMM. >> >> These paragraphs seem to imply that the processor will execute the entry >> point function of a DXE_SMM_DRIVER module from SMRAM, without the >> processor necessarily being in SMM just yet. (Which further implies that >> SMRAM will not have been closed and locked at that point, but that's a >> side remark only.) >> >> This seems reasonable to me, but in the entry point of OVMF's >> FvbServicesSmm.efi, we specifically need to test a write access to the >> pflash chip, to make sure that the QEMU configuration is suitable and >> secure. For this, FvbServicesSmm.efi must be dispatched in SMM. And, >> apparently, in the current tree, when PiSmmCpuDxe launches first, >> FvbServicesSmm *is* dispatched in SMM. >> >> I find it quite non-intuitive that a DXE_SMM_DRIVER's entry point may or >> may not be executed in SMM, but more importantly, what's the best way to >> delay the driver dispatch until after PiSmmCpuDxe has been dispatched? >> >> The PiSmmIpl driver (a DXE_RUNTIME_DRIVER) produces the >> EFI_SMM_BASE2_PROTOCOL, which has a member function called InSmm(). >> However: >> >> - according to the documentation, it doesn't test for SMM, it tests for >> being executed from SMRAM (which are two different things!), and it >> only really makes sense for SMM/DXE combined drivers, >> >> - in fact I don't need a query-like function here, for the code, but >> likely a new depex for FvbServicesSmm that delays its dispatch until >> after PiSmmCpuDxe. >> >> Actually, I think the simplest way to solve this in OVMF is to add a >> depex on EFI_SMM_CPU_PROTOCOL to FvbServicesSmm. From the spec: >> >> Provides access to CPU-related information while in SMM. >> >> [...] >> >> This protocol allows SMM drivers to access architecture-standard >> registers from any of the CPU save state areas. [...] >> >> I think a DXE_SMM_DRIVER might perfectly want to use this protocol in >> its entry point function (in the general case, for whatever reason), so >> it seems very suitable for delaying FvbServicesSmm. >> >> For example, in >> "QuarkPlatformPkg/Acpi/DxeSmm/SmmPowerManagement/SmmPowerManagement.inf", >> I think we see the same pattern: >> - MODULE_TYPE = DXE_SMM_DRIVER >> - Depex on gEfiSmmCpuProtocolGuid >> >> .... Okay, I tried to test this patch (in combination with re-adding the >> gEfiVariableArchProtocolGuid dependency to PiSmmCpuDxeSmm): >> >>> diff --git >>> a/OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesSmm.inf >>> b/OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesSmm.inf >>> index ba2d3679a46d..a241f7702ca2 100644 >>> --- a/OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesSmm.inf >>> +++ b/OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesSmm.inf >>> @@ -88,4 +88,4 @@ [FeaturePcd] >>> gUefiOvmfPkgTokenSpaceGuid.PcdSmmSmramRequire >>> >>> [Depex] >>> - TRUE >>> + gEfiSmmCpuProtocolGuid >> >> With this patch, in the build report I get: >> >>> Module Name: FvbServicesSmm >>> Module INF Path: >>> OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesSmm.inf >>> >>> [...] >>> >>>> ----------------------------------------------------------------------------------------------------------------------< >>>> >>> Final Dependency Expression (DEPEX) Instructions >>> PUSH gEfiSmmCpuProtocolGuid >>> PUSH gEfiPcdProtocolGuid >>> PUSH gEfiSmmBase2ProtocolGuid >>> PUSH gEfiSmmAccess2ProtocolGuid >>> PUSH gEfiDevicePathUtilitiesProtocolGuid >>> AND >>> AND >>> AND >>> AND >>> END >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> Dependency Expression (DEPEX) from INF >>> (gEfiSmmCpuProtocolGuid) AND (gEfiPcdProtocolGuid) AND >>> (gEfiSmmBase2ProtocolGuid) AND (gEfiSmmAccess2ProtocolGuid) AND >>> (gEfiDevicePathUtilitiesProtocolGuid) >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> From Module INF: gEfiSmmCpuProtocolGuid >>> From Library INF: (gEfiPcdProtocolGuid) AND >>> (gEfiSmmBase2ProtocolGuid) AND (gEfiSmmAccess2ProtocolGuid) AND >>> (gEfiDevicePathUtilitiesProtocolGuid) >>> <----------------------------------------------------------------------------------------------------------------------> >>> >> >> and FvbServicesSmm *is* appropriately delayed. However, the variable >> driver blows up: >> >>> Loading SMM driver at 0x0007FF7D000 EntryPoint=0x0007FF7D271 >>> VariableSmm.efi >>> mSmmMemLibInternalMaximumSupportAddress = 0xFFFFFFFFF >>> VarCheckLibRegisterSetVariableCheckHandler - 0x7FF89334 Success >>> Firmware Volume for Variable Store is corrupted >>> >>> ASSERT_EFI_ERROR (Status = Volume Corrupt) >>> ASSERT MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmm.c(927): >>> !EFI_ERROR (Status) >> >> This is the call tree that fails: >> >> VariableServiceInitialize() >> [MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmm.c] >> VariableCommonInitialize() >> [MdeModulePkg/Universal/Variable/RuntimeDxe/Variable.c] >> InitNonVolatileVariableStore() >> [MdeModulePkg/Universal/Variable/RuntimeDxe/Variable.c] >> >> However, the variable store is definitely not corrupted; this failure >> reproduces with the pristine "OVMF_VARS.fd" varstore template that falls >> right out of the OVMF build. > > I am not so clear about "the pristine "OVMF_VARS.fd" varstore template > that falls right out of the OVMF build". > > Variable driver depends on PcdFlashNvStorageVariableBase(64) be set > correctly to produce gEfiVariableArchProtocolGuid protocol. > > After PiSmmCpuDxeSmm adds gEfiVariableArchProtocolGuid dependency and > FvbServicesSmm adds gEfiSmmCpuProtocolGuid dependency, there will be a > dependency circle below. > - PiSmmCpuDxeSmm depends on Variable driver to produce > gEfiVariableArchProtocolGuid protocol. > - FvbServicesSmm depends on PiSmmCpuDxeSmm to produce > gEfiSmmCpuProtocolGuid protocol. > - Variable driver depends on FvbServicesSmm to set > PcdFlashNvStorageVariableBase(64) PCD. I understand, thank you for the explanation. The 64-bit PCD that you mention is set at the end of the entry point function. > Are below approaches possible in OVMF? > 1. Do InitializeVariableFvHeader () and > PcdFlashNvStorageVariableBase(64) PCD set at PEI phase. InitializeVariableFvHeader() writes to the pflash chip, which is only possible if the code runs in SMM (under the use case we are discussing now). So running it as part of a PEIM would not be right. > 2. Do InitializeVariableFvHeader () and > PcdFlashNvStorageVariableBase(64) PCD set in entrypoint with dependency > = TRUE, and do other part of code in FvbInitialize() in another > gEfiSmmCpuProtocolGuid notification function? That's again not good, because it would allow the entry point function to exit with success (and to produce the FVB protocol interface) even if the pflash configuration on the QEMU command line is incorrect (that is, a -D SMM_REQUIRE OVMF build is being run with "-bios", rather than the pflash chips). The "other part" that you mention is about exiting the FVB driver as early as possible, if the varstore is not backed by a pflash chip: - in the SMM_REQUIRE build, this is actually a fatal error, - in the SMM-less build, this allows the EmuVariableFvbRuntimeDxe driver to run. Let me turn the question around: can PiSmmCpuDxeSmm consume its settings from HOBs? The PEI phase has read-only access to variables, and some PEIM could produce those HOBs, either directly, or from reading variables. (Also, it would be nice to know exactly what aspects of PiSmmCpuDxeSmm are planned to be controlled dynamically.) Alternatively, if Jordan agrees, we could break the cycle like this: (1) Drop EmuVariableFvbRuntimeDxe and its dependencies completely. This would eliminate OVMF's reuse of PcdFlashNvStorageVariableBase64 for arbitrating between QemuFlashFvbServicesRuntimeDxe and EmuVariableFvbRuntimeDxe. (For more background on this, please see commits 4313b26de098b and 182eb4562731f.) Dropping EmuVariableFvbRuntimeDxe would also make pflash chips mandatory for OVMF, and prevent it from running with the "-bios" option (at long last). I'd strongly support this. (2) Set PcdFlashNvStorageVariableBase64 and the other PCDs that the variable driver consumes directly in the OVMF DSC or FDF files, matching the PcdOvmfFlashNvStorageXXX settings that QemuFlashFvbServicesRuntimeDxe already uses as sorce of information. This would allow the variable driver to proceed without QemuFlashFvbServicesRuntimeDxe's assistance (for read-only access only). This would make perfect sense, because for read-only access, the variable driver doesn't need FVB, it only needs to know where to read (with direct memory references). (3) QemuFlashFvbServicesRuntimeDxe would keep its structure in other aspects; for example it would get a new depex on gEfiSmmCpuProtocolGuid. This would ensure its entry point function was executed in SMM. Thanks Laszlo