From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 306B41A1E30 for ; Wed, 24 Aug 2016 04:53:56 -0700 (PDT) Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id ABD8D81249; Wed, 24 Aug 2016 11:53:55 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-116-16.phx2.redhat.com [10.3.116.16]) by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u7OBrs6N027103; Wed, 24 Aug 2016 07:53:54 -0400 To: "Zeng, Star" , "Fan, Jeff" , "edk2-devel@ml01.01.org" References: <1470128388-17960-1-git-send-email-jeff.fan@intel.com> <1470128388-17960-49-git-send-email-jeff.fan@intel.com> <4f61b2b4-eeb4-8435-412f-20848347c88e@redhat.com> <542CF652F8836A4AB8DBFAAD40ED192A143D9CFD@shsmsx102.ccr.corp.intel.com> <795dd4fe-cd16-c0f1-7f04-f78601e2c7a8@redhat.com> <82c6b5c9-dcab-f3c1-5ffe-20fb27ddb1af@intel.com> <6cd44902-8c07-8f76-1204-d79c85559ae5@intel.com> <7d39ce29-88ce-c5f2-af61-7fd964acb5c0@intel.com> <801f810c-4cd7-d657-f536-a8471a37802c@redhat.com> Cc: "Kinney, Michael D" , "Jordan Justen (Intel address)" , "Tian, Feng" From: Laszlo Ersek Message-ID: Date: Wed, 24 Aug 2016 07:53:54 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.68 on 10.5.11.27 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Wed, 24 Aug 2016 11:53:55 +0000 (UTC) Subject: Re: [Patch v5 48/48] UefiCpuPkg/PiSmmCpuDxeSmm: Add gEfiVariableArchProtocolGuid dependency X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Aug 2016 11:53:56 -0000 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit On 08/24/16 01:22, Zeng, Star wrote: > On 2016/8/24 11:27, Laszlo Ersek wrote: >> On 08/23/16 22:39, Zeng, Star wrote: >>> On 2016/8/23 23:33, Laszlo Ersek wrote: >>>> On 08/18/16 22:57, Zeng, Star wrote: >>>>> On 2016/8/19 10:45, Zeng, Star wrote: >>>>>> On 2016/8/19 10:26, Laszlo Ersek wrote: >>>>>>> On 08/19/16 04:00, Fan, Jeff wrote: >>>>>>>> Laszlo, >>>>>>>> >>>>>>>> I could revert this patch firstly. >>>>>>> >>>>>>> Thank you, that would be very kind. >>>>>>> >>>>>>>> Could you please dig out the OVMF to address the potential issue, >>>>>>>> then I could re-commit it for those platforms required this patch. >>>>>>> >>>>>>> The problem is that this week (what remains of it) and the next week >>>>>>> I won't really have time for this -- tomorrow I'm hoping to finish >>>>>>> up something else in a mortal hurry. It was actually in preparation >>>>>>> for rebasing / pushing that work that I pulled "master", and found >>>>>>> out about the regression. >>>>>>> >>>>>>> Can we perhaps get help from the variable stack maintainers? What's >>>>>>> the design of the (lack of) depexes on those drivers? >>>>>> >>>>>> Variable driver consumes >>>>>> PcdFlashNvStorageVariableBase(64)/PcdFlashNvStorageVariableSize to >>>>>> produce *gEfiVariableArchProtocolGuid* protocol. Variable driver >>>>>> registers (SMM)FTW protocol notification function >>>>>> SmmFtwNotificationEvent() or FtwNotificationEvent() to produce >>>>>> *gEfiVariableWriteArchProtocolGuid* protocol. (SMM)FTW driver has >>>>>> dependency on gEfiSmmFirmwareVolumeBlockProtocolGuid or >>>>>> gEfiFirmwareVolumeBlockProtocolGuid. >>>>>> >>>>>> I am not so sure what you said about the (lack of) depexes on those >>>>>> drivers. >>>>>> >>>>>> Did you see variable driver loaded and started when ASSERT appeared >>>>>> on OVMF? >>>>> >>>>> >>>>> You may could compare the serial logs to get if there is some driver >>>>> execution flow differences for the images built without and with this >>>>> patch. >>>> >>>> The only difference is that in the working case, PiSmmCpuDxeSmm.efi is >>>> dispatched immediately before FvbServicesSmm.efi, while in the broken >>>> case, PiSmmCpuDxeSmm.efi is not dispatched (it is delayed due to the >>>> new >>>> depex) and FvbServicesSmm.efi is dispatched without PiSmmCpuDxeSmm.efi >>>> being present, and it breaks. >>>> >>>> I see that PiSmmCpuDxeSmm.efi installs: >>>> - gEfiSmmConfigurationProtocolGuid, >>>> - EfiSmmCpuProtocol, >>>> - EfiSmmCpuServiceProtocol, >>>> >>>> so it might not be hard to add a depex on one of those (DXE/SMM) >>>> protocols to FvbServicesSmm.efi. In particular, the PI 1.4a spec >>>> says in >>>> Volume 4, "1.6.1 SMM Drivers": >>>> >>>> the dependency expression in the file section EFI_SECTION_SMM_DEPEX >>>> [...] can refer to both UEFI and SMM protocols >>>> >>>> so we could easily make FvbServicesSmm.efi dependent on either of those >>>> protocols, I think. >>>> >>>> However, is that the official way to delay the entry point function >>>> of a >>>> DXE_SMM_DRIVER module until the code would actually run in SMM? Section >>>> "1.7 SMM Driver Initialization" says: >>>> >>>> An SMM Driver's initialization phase begins when the driver has >>>> been >>>> loaded into SMRAM and its entry point is called. An SMM Driver's >>>> initialization phase ends when the entry point returns. >>>> >>>> During SMM Driver initialization, SMM Drivers have access to two >>>> sets of protocols: UEFI and SMM. UEFI protocols are those which are >>>> installed and discovered using the UEFI Boot Services. UEFI >>>> protocols can be located and used by SMM drivers only during SMM >>>> Initialization. SMM protocols are those which are installed and >>>> discovered using the System Management Services Table (SMST). SMM >>>> protocols can be discovered by SMM drivers during initialization >>>> time and accessed while inside of SMM. >>>> >>>> These paragraphs seem to imply that the processor will execute the >>>> entry >>>> point function of a DXE_SMM_DRIVER module from SMRAM, without the >>>> processor necessarily being in SMM just yet. (Which further implies >>>> that >>>> SMRAM will not have been closed and locked at that point, but that's a >>>> side remark only.) >>>> >>>> This seems reasonable to me, but in the entry point of OVMF's >>>> FvbServicesSmm.efi, we specifically need to test a write access to the >>>> pflash chip, to make sure that the QEMU configuration is suitable and >>>> secure. For this, FvbServicesSmm.efi must be dispatched in SMM. And, >>>> apparently, in the current tree, when PiSmmCpuDxe launches first, >>>> FvbServicesSmm *is* dispatched in SMM. >>>> >>>> I find it quite non-intuitive that a DXE_SMM_DRIVER's entry point >>>> may or >>>> may not be executed in SMM, but more importantly, what's the best >>>> way to >>>> delay the driver dispatch until after PiSmmCpuDxe has been dispatched? >>>> >>>> The PiSmmIpl driver (a DXE_RUNTIME_DRIVER) produces the >>>> EFI_SMM_BASE2_PROTOCOL, which has a member function called InSmm(). >>>> However: >>>> >>>> - according to the documentation, it doesn't test for SMM, it tests for >>>> being executed from SMRAM (which are two different things!), and it >>>> only really makes sense for SMM/DXE combined drivers, >>>> >>>> - in fact I don't need a query-like function here, for the code, but >>>> likely a new depex for FvbServicesSmm that delays its dispatch until >>>> after PiSmmCpuDxe. >>>> >>>> Actually, I think the simplest way to solve this in OVMF is to add a >>>> depex on EFI_SMM_CPU_PROTOCOL to FvbServicesSmm. From the spec: >>>> >>>> Provides access to CPU-related information while in SMM. >>>> >>>> [...] >>>> >>>> This protocol allows SMM drivers to access architecture-standard >>>> registers from any of the CPU save state areas. [...] >>>> >>>> I think a DXE_SMM_DRIVER might perfectly want to use this protocol in >>>> its entry point function (in the general case, for whatever reason), so >>>> it seems very suitable for delaying FvbServicesSmm. >>>> >>>> For example, in >>>> "QuarkPlatformPkg/Acpi/DxeSmm/SmmPowerManagement/SmmPowerManagement.inf", >>>> >>>> I think we see the same pattern: >>>> - MODULE_TYPE = DXE_SMM_DRIVER >>>> - Depex on gEfiSmmCpuProtocolGuid >>>> >>>> .... Okay, I tried to test this patch (in combination with re-adding >>>> the >>>> gEfiVariableArchProtocolGuid dependency to PiSmmCpuDxeSmm): >>>> >>>>> diff --git >>>>> a/OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesSmm.inf >>>>> b/OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesSmm.inf >>>>> index ba2d3679a46d..a241f7702ca2 100644 >>>>> --- a/OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesSmm.inf >>>>> +++ b/OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesSmm.inf >>>>> @@ -88,4 +88,4 @@ [FeaturePcd] >>>>> gUefiOvmfPkgTokenSpaceGuid.PcdSmmSmramRequire >>>>> >>>>> [Depex] >>>>> - TRUE >>>>> + gEfiSmmCpuProtocolGuid >>>> >>>> With this patch, in the build report I get: >>>> >>>>> Module Name: FvbServicesSmm >>>>> Module INF Path: >>>>> OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesSmm.inf >>>>> >>>>> [...] >>>>> >>>>>> ----------------------------------------------------------------------------------------------------------------------< >>>>>> >>>>>> >>>>> Final Dependency Expression (DEPEX) Instructions >>>>> PUSH gEfiSmmCpuProtocolGuid >>>>> PUSH gEfiPcdProtocolGuid >>>>> PUSH gEfiSmmBase2ProtocolGuid >>>>> PUSH gEfiSmmAccess2ProtocolGuid >>>>> PUSH gEfiDevicePathUtilitiesProtocolGuid >>>>> AND >>>>> AND >>>>> AND >>>>> AND >>>>> END >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> >>>>> >>>>> Dependency Expression (DEPEX) from INF >>>>> (gEfiSmmCpuProtocolGuid) AND (gEfiPcdProtocolGuid) AND >>>>> (gEfiSmmBase2ProtocolGuid) AND (gEfiSmmAccess2ProtocolGuid) AND >>>>> (gEfiDevicePathUtilitiesProtocolGuid) >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> >>>>> >>>>> From Module INF: gEfiSmmCpuProtocolGuid >>>>> From Library INF: (gEfiPcdProtocolGuid) AND >>>>> (gEfiSmmBase2ProtocolGuid) AND (gEfiSmmAccess2ProtocolGuid) AND >>>>> (gEfiDevicePathUtilitiesProtocolGuid) >>>>> <----------------------------------------------------------------------------------------------------------------------> >>>>> >>>>> >>>> >>>> and FvbServicesSmm *is* appropriately delayed. However, the variable >>>> driver blows up: >>>> >>>>> Loading SMM driver at 0x0007FF7D000 EntryPoint=0x0007FF7D271 >>>>> VariableSmm.efi >>>>> mSmmMemLibInternalMaximumSupportAddress = 0xFFFFFFFFF >>>>> VarCheckLibRegisterSetVariableCheckHandler - 0x7FF89334 Success >>>>> Firmware Volume for Variable Store is corrupted >>>>> >>>>> ASSERT_EFI_ERROR (Status = Volume Corrupt) >>>>> ASSERT MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmm.c(927): >>>>> !EFI_ERROR (Status) >>>> >>>> This is the call tree that fails: >>>> >>>> VariableServiceInitialize() >>>> [MdeModulePkg/Universal/Variable/RuntimeDxe/VariableSmm.c] >>>> VariableCommonInitialize() >>>> [MdeModulePkg/Universal/Variable/RuntimeDxe/Variable.c] >>>> InitNonVolatileVariableStore() >>>> [MdeModulePkg/Universal/Variable/RuntimeDxe/Variable.c] >>>> >>>> However, the variable store is definitely not corrupted; this failure >>>> reproduces with the pristine "OVMF_VARS.fd" varstore template that >>>> falls >>>> right out of the OVMF build. >>> >>> I am not so clear about "the pristine "OVMF_VARS.fd" varstore template >>> that falls right out of the OVMF build". >>> >>> Variable driver depends on PcdFlashNvStorageVariableBase(64) be set >>> correctly to produce gEfiVariableArchProtocolGuid protocol. >>> >>> After PiSmmCpuDxeSmm adds gEfiVariableArchProtocolGuid dependency and >>> FvbServicesSmm adds gEfiSmmCpuProtocolGuid dependency, there will be a >>> dependency circle below. >>> - PiSmmCpuDxeSmm depends on Variable driver to produce >>> gEfiVariableArchProtocolGuid protocol. >>> - FvbServicesSmm depends on PiSmmCpuDxeSmm to produce >>> gEfiSmmCpuProtocolGuid protocol. >>> - Variable driver depends on FvbServicesSmm to set >>> PcdFlashNvStorageVariableBase(64) PCD. >> >> I understand, thank you for the explanation. The 64-bit PCD that you >> mention is set at the end of the entry point function. >> >>> Are below approaches possible in OVMF? >>> 1. Do InitializeVariableFvHeader () and >>> PcdFlashNvStorageVariableBase(64) PCD set at PEI phase. >> >> InitializeVariableFvHeader() writes to the pflash chip, which is only >> possible if the code runs in SMM (under the use case we are discussing >> now). So running it as part of a PEIM would not be right. >> >>> 2. Do InitializeVariableFvHeader () and >>> PcdFlashNvStorageVariableBase(64) PCD set in entrypoint with dependency >>> = TRUE, and do other part of code in FvbInitialize() in another >>> gEfiSmmCpuProtocolGuid notification function? >> >> That's again not good, because it would allow the entry point function >> to exit with success (and to produce the FVB protocol interface) even if >> the pflash configuration on the QEMU command line is incorrect (that is, >> a -D SMM_REQUIRE OVMF build is being run with "-bios", rather than the >> pflash chips). > > I am a little curious about what makes "writing to the pflash chip is > only possible if the code runs in SMM" in OVMF? QEMU does that. That is the way QEMU protects the pflash chip from direct writes by the guest OS. If you pass -global driver=cfi.pflash01,property=secure,value=on to QEMU (which we do), then all write accesses to the chip will be rejected, unless the code doing the write access runs in SMM. > I meant the code flow for approach 2) I mentioned is like below, I am > not sure if it works or not. :) > > FvbInitialize() - This may could be done at PEI phase. > InitializeVariableFvHeader() > if ValidateFvHeader () returns error status # mark1 > allocate buffer to hold data from GetFvbInfo() # mark2 > endif > return PcdOvmfFlashNvStorageVariableBase or the buffer > (allocated at mark2) pointer > set PcdFlashNvStorageVariableBase64/ > PcdFlashNvStorageFtwWorkingBase/ > PcdFlashNvStorageFtwSpareBase > according to the return from InitializeVariableFvHeader() > > FvbWriteInitialize() - Notification function on gEfiSmmCpuProtocolGuid > QemuFlashInitialize() > ... > Free buffer(allocated at mark2) > set PcdFlashNvStorageVariableBase64/ > PcdFlashNvStorageFtwWorkingBase/ > PcdFlashNvStorageFtwSpareBase > again if mark1 returned success > Install FVB protocol > ... I don't understand the purpose of the buffer allocated at mark2. Also, currently InitializeVariableFvHeader() erases blocks, after it prints "Variable FV header is not valid. It will be reinitialized". As I said, that part cannot be done in PEI. ... Anyway, I just tried to set PcdFlashNvStorageVariableBase64 / PcdFlashNvStorageFtwWorkingBase / PcdFlashNvStorageFtwSpareBase right in the FDF file, at build time -- that is, even earlier than in PEI --, to the correct values, as dynamic defaults. (I even verified those values in the build report file.) It doesn't work: I get the exact same assertion failure. In other words, even if those PCDs are set to the correct values, VariableSmm blows up with "Firmware Volume for Variable Store is corrupted". .... Ahh, I realized why this happens! QEMU prevents even *read accesses* if they don't come from code running in SMM. Independently, I have further thoughts about the following: >> The "other part" that you mention is about exiting the FVB driver as >> early as possible, if the varstore is not backed by a pflash chip: >> - in the SMM_REQUIRE build, this is actually a fatal error, >> - in the SMM-less build, this allows the EmuVariableFvbRuntimeDxe driver >> to run. >> >> Let me turn the question around: can PiSmmCpuDxeSmm consume its settings >> from HOBs? The PEI phase has read-only access to variables, and some >> PEIM could produce those HOBs, either directly, or from reading >> variables. (Also, it would be nice to know exactly what aspects of >> PiSmmCpuDxeSmm are planned to be controlled dynamically.) >> >> Alternatively, if Jordan agrees, we could break the cycle like this: >> >> (1) Drop EmuVariableFvbRuntimeDxe and its dependencies completely. Unfortunately, this won't work -- I just realized it --, because Xen has no pflash chips, and removing EmuVariableFvbRuntimeDxe would break OVMF on Xen. Can we research the HOB (or dynamic PCD) avenue for PiSmmCpuDxeSmm a bit? Because, it seems that on QEMU, *both* FvbServicesSmm *and* VariableSmm can only be dispatched after PiSmmCpuDxeSmm enters SMM. In other words, if gEfiVariableArchProtocolGuid is added as a depex to PiSmmCpuDxeSmm, then we have a much tighter cycle than you described earlier, on QEMU: - PiSmmCpuDxeSmm wants read-only variable access, but - VariableSmm must be dispatched in SMM (= after PiSmmCpuDxeSmm runs), on QEMU. In fact, this makes me wonder if PEI-phase read-only variable access is possible at all on QEMU. (OVMF has never used MdeModulePkg/Universal/Variable/Pei/.) Thanks Laszlo >> This would eliminate OVMF's reuse of PcdFlashNvStorageVariableBase64 for >> arbitrating between QemuFlashFvbServicesRuntimeDxe and >> EmuVariableFvbRuntimeDxe. (For more background on this, please see >> commits 4313b26de098b and 182eb4562731f.) >> >> Dropping EmuVariableFvbRuntimeDxe would also make pflash chips mandatory >> for OVMF, and prevent it from running with the "-bios" option (at long >> last). I'd strongly support this. >> >> (2) Set PcdFlashNvStorageVariableBase64 and the other PCDs that the >> variable driver consumes directly in the OVMF DSC or FDF files, matching >> the PcdOvmfFlashNvStorageXXX settings that >> QemuFlashFvbServicesRuntimeDxe already uses as sorce of information. >> This would allow the variable driver to proceed without >> QemuFlashFvbServicesRuntimeDxe's assistance (for read-only access only). >> This would make perfect sense, because for read-only access, the >> variable driver doesn't need FVB, it only needs to know where to read >> (with direct memory references). >> >> (3) QemuFlashFvbServicesRuntimeDxe would keep its structure in other >> aspects; for example it would get a new depex on gEfiSmmCpuProtocolGuid. >> This would ensure its entry point function was executed in SMM. >> >> Thanks >> Laszlo >> >