From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by mx.groups.io with SMTP id smtpd.web11.604.1615577461273412794 for ; Fri, 12 Mar 2021 11:31:01 -0800 Authentication-Results: mx.groups.io; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=O3yyS6Mj; spf=pass (domain: redhat.com, ip: 63.128.21.124, mailfrom: lersek@redhat.com) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1615577460; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JEOce7ULLNBhi2U5lRUPHRJs+q54haOAIIp7hV/wpPc=; b=O3yyS6MjKcb+L27ml5dhxa/vyOCmV/C+Qco4Dl1S4j9nZhUq01aZzzPChqD4KEw71sllfT sYq8BRoud9DBcu/QAckDDuaX0elacQXhPbgkCN6czi/uKK7Dq26IiNMn5sdwaD5rn2mRaQ ne2ZrEo+nR/zqMQIdJCojCoDvEHCNms= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-474-uJCUqxSFPxS98ePLths7jA-1; Fri, 12 Mar 2021 14:30:56 -0500 X-MC-Unique: uJCUqxSFPxS98ePLths7jA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 01DA7800D55; Fri, 12 Mar 2021 19:30:55 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-112-80.ams2.redhat.com [10.36.112.80]) by smtp.corp.redhat.com (Postfix) with ESMTP id E31A25D9CC; Fri, 12 Mar 2021 19:30:52 +0000 (UTC) Subject: Re: [edk2-devel] Conflicting virtual addresses causing Runtime Services issues To: Jon Nettleton , devel@edk2.groups.io Cc: Ard Biesheuvel , Hao A Wu , Liming Gao , "Ard Biesheuvel (TianoCore)" , "Leif Lindholm (Nuvia address)" References: <5363bdf0-afac-73bf-d001-77949916f511@redhat.com> <166B374585A9D8FC.18699@groups.io> <290a35ce-9116-af00-85f4-8df1c5228680@redhat.com> <4841241f-fc6d-6185-efe6-ed9a536534dd@redhat.com> <166B792D1514133B.31346@groups.io> From: "Laszlo Ersek" Message-ID: Date: Fri, 12 Mar 2021 20:30:51 +0100 MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=lersek@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit On 03/12/21 06:59, Jon Nettleton wrote: > On Fri, Mar 12, 2021 at 4:02 AM Jon Nettleton via groups.io > wrote: >> On Thu, Mar 11, 2021 at 11:39 PM Ard Biesheuvel wrote: >>> On Thu, 11 Mar 2021 at 23:25, Laszlo Ersek wrote: >>>> Adding Ard and Leif, comments below: >>>> >>>> On 03/11/21 15:50, Laszlo Ersek wrote: >>>>> On 03/11/21 10:48, Jon Nettleton wrote: >>>> [...] >>>> >>>>>> And this is where the pointer gets remapped again and into the MMIO >>>>>> space of the nor flash. If I remove the calls to ConvertPointer for >>>>>> the FvbProtocol I am still seeing those addresses getting remapped >>>>>> but only once and runtime works as expected. >>>>>> >>>>>> I am seeing that in >>>>>> MdeModulePkg/Universal/Variable/RuntimeDxe/VariableDxe.c >>>>>> &mVariableModuleGlobal->FvbInstance->* are all being converted. It >>>>>> is possible this is a long standing bug and it just so happens that >>>>>> our configuration has caused a conflict and exposed it. >>>>> Yes, this is curious, I noticed it too yesterday, trying to see where >>>>> the FVB protocol member function pointers were converted. I found that >>>>> OVMF's flash driver (OvmfPkg/QemuFlashFvbServicesRuntimeDxe) didn't do >>>>> it, but MdeModulePkg/Universal/Variable/RuntimeDxe did. That was >>>>> certainly strange, as the variable driver is a consumer of the >>>>> protocol (not the producer thereof), so I'd say it has no business >>>>> poking new values into the protocol interface structure. >>>> [...] >>>> >>>>> ... Strangely, the other flash (FVB) driver in edk2, >>>>> ArmPlatformPkg/Drivers/NorFlashDxe, *does* perform the conversion >>>>> itself! See NorFlashVirtualNotifyEvent(). >>>>> >>>>> I don't understand that. Is it possible that, with >>>>> "ArmPlatformPkg/Drivers/NorFlashDxe" too, the conversion happens >>>>> *twice*, but (at least) one of those mappings is "identity"? >>>> Confirmed. >>>> >>>> I had to write some elaborate debug patches for determining this, >>>> because in ArmVirtQemu, I cannot produce DEBUG output from the >>>> SetVirtualAddressMap() notification functions. So here's the approach I >>>> took: >>>> >>>> (1) Introduce a new GUID-ed HOB structure in MdeModulePkg. The structure >>>> itself lives in reserved memory, but its address is exposed in a GUID-ed >>>> HOB. The structure is named FVB_ADDRESS_LIST, and it has the following >>>> fields: >>>> >>>> - signature ("FVBADRLS" -- FVB Address List) >>>> - 16 entries of: >>>> - owner signature [what driver set this entry] >>>> - address >>>> - number of entries used (aka next entry to fill) >>>> >>>> (2) In PlatformPei, allocate and initialize this structure (in reserved >>>> memory), and expose its address via the GUID-ed HOB. Furthermore, >>>> produce a log message with the allocation address. >>>> >>>> (3) In NorFlashDxe, look up the structure via the GUID-ed HOB, in the >>>> entry point function; remember the address in a global variable. In the >>>> SetVirtualAddressMap() handler function, treat the conversion of the >>>> "GetPhysicalAddress" FVB member function specially: via the global >>>> variable pointer to FVB_ADDRESS_LIST in reserved memory, save both the >>>> physical (original) and the virtual (converted) address of the >>>> "GetPhysicalAddress" FVB member function, in new entries. As owner >>>> signature in both entries, use "NORFLASH". >>>> >>>> (4) In the runtime DXE variable driver, do the exact same thing, just >>>> use a different "owner signature" -- "VARIABLE". >>>> >>>> (5) Once the guest is up and running, run "efibootmgr --delete-timeout" >>>> at a root prompt in the guest, deleting the existent "Timeout" UEFI >>>> non-volatile variable, for verifying that the runtime variable (write) >>>> service is functional. >>>> >>>> (6) Using the log message from point (2): >>>> >>>>> PlatformPeim: FvbAddressList @ 13FEC9000 >>>> hexdump the guest memory containing the FVB_ADDRESS_LIST, as follows: >>>> >>>>> $ virsh qemu-monitor-command aavmf.rhel7.registered --hmp xp /268cb 0x13FEC9000 >>>> Ccomments to the right of the hexdump: >>>> >>>>> 000000013fec9000: 'F' 'V' 'B' 'A' 'D' 'R' 'L' 'S' <- structure signature: FVBADRLS >>>>> 000000013fec9008: 'N' 'O' 'R' 'F' 'L' 'A' 'S' 'H' <- entry[0], signature: NORFLASH >>>>> 000000013fec9010: 'T' ' ' '\xc6' ';' '\x01' '\x00' '\x00' '\x00' <- entry[0], GetPhysicalAddress *physical*: 0x000000013bc62054 >>>>> 000000013fec9018: 'N' 'O' 'R' 'F' 'L' 'A' 'S' 'H' <- entry[1], signature: NORFLASH >>>>> 000000013fec9020: 'T' ' ' 'N' '$' '\x00' '\x00' '\x00' '\x00' <- entry[1], GetPhysicalAddress *virtual*: 0x00000000244e2054 >>>>> 000000013fec9028: 'V' 'A' 'R' 'I' 'A' 'B' 'L' 'E' <- entry[2], signature: VARIABLE >>>>> 000000013fec9030: 'T' ' ' 'N' '$' '\x00' '\x00' '\x00' '\x00' <- entry[2], GetPhysicalAddress *physical*: 0x00000000244e2054 >>>>> 000000013fec9038: 'V' 'A' 'R' 'I' 'A' 'B' 'L' 'E' <- entry[3], signature: VARIABLE >>>>> 000000013fec9040: 'T' ' ' 'N' '$' '\x00' '\x00' '\x00' '\x00' <- entry[3], GetPhysicalAddress *virtual*: 0x00000000244e2054 >>>>> 000000013fec9048: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9050: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9058: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9060: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9068: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9070: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9078: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9080: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9088: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9090: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9098: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90a0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90a8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90b0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90b8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90c0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90c8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90d0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90d8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90e0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90e8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90f0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec90f8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9100: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' >>>>> 000000013fec9108: '\x04' '\x00' '\x00' '\x00' <- number of entries used: 4 >>>> This shows the following: >>>> >>>> - both NorFlashDxe and the runtime DXE variable driver converted the >>>> FVB.GetPhysicalAddress member function, >>>> >>>> - the NorFlashDxe driver acted first, the runtime DXE variable driver >>>> acted second, >>>> >>>> - when the runtime DXE variable driver "converted" the "physical" >>>> address to virtual address, there was no change (and no crash!), >>>> because the virtual address map passed in by the Linux kernel >>>> apparently identity maps this area -- just as I guessed. >>>> >>>> So we definitely have a bug (only Linux's page tables save us from the >>>> crash); now the question is: >>>> >>>> Which driver is wrong to even attempt the conversion of the FVB member >>>> functions? >>>> >>>> The answer must be documented somewhere highly visible. >>>> >>>> Debug patches attached, for the record (based on commit edd46cd407ea). >>>> >>> Thanks for inviting me to this party! >>> >>> So the tl;dr here is that some points get converted twice, which >>> usually is not a problem because the virtual address resulting from >>> the conversion is rarely mistaken for a physical address living in a >>> EFI_MEMORY_RUNTIME region. >>> >>> So I agree with Laszlo's assertion that the consumer of a protocol has >>> no business updating its protocol pointers, so this should definitely >>> be fixed in the core VariableRuntime driver. However, given the >>> typical nature of the variable stack, i.e., a platform specfic NOR >>> flash driver combined with the generic FTW and variable drivers, doing >>> so would likely break many out of tree platforms where the NOR flash >>> driver does not bother to update its pointers at all. >> Not ideal, but we could add a flag to Runtime Services and let the platform >> specify if it is remapping the FVB. This would allow us to not break legacy >> drivers, but still easily let properly designed platforms bypass this >> behaviour. As time progressed this could be used to for a deprecation >> warning, until it became the default handling of FVB pointer conversion. >> > Something like the attached patch possibly? > > > 0001-MdeModulePkg-Variable-VariableRuntimeDxe-Add-Bool-Pc.patch > > > From 126bd8bebf24e0269696f22a282a4b0340d9923b Mon Sep 17 00:00:00 2001 > From: Jon Nettleton > Date: Fri, 12 Mar 2021 06:52:56 +0100 > Subject: [PATCH] MdeModulePkg/Variable/VariableRuntimeDxe: Add Bool > PcdDriverConvertsFvbFuncPointers > > VariableRuntimeDxe is unconditionally converting the pointers to the Fvb protocol > functions even if the platform drivers are already converting the pointers > themselves. This leads to a double pointer conversion and can cause unexpected > runtime behaviour depending on the memory layout of the platform. > > Since we don't want to break legacy and out of tree platforms we add a Bool flag > that defaults to the existing behaviour but allows platforms to only use the > pointer conversion being done in their driver implementation that produces the > FVB Protocol. > > Signed-off-by: Jon Nettleton > --- > MdeModulePkg/MdeModulePkg.dec | 6 ++++++ > .../Universal/Variable/RuntimeDxe/VariableDxe.c | 16 +++++++++------- > .../Variable/RuntimeDxe/VariableRuntimeDxe.inf | 1 + > 3 files changed, 16 insertions(+), 7 deletions(-) > > diff --git a/MdeModulePkg/MdeModulePkg.dec b/MdeModulePkg/MdeModulePkg.dec > index 1483955110..07b84b1837 100644 > --- a/MdeModulePkg/MdeModulePkg.dec > +++ b/MdeModulePkg/MdeModulePkg.dec > @@ -1130,6 +1130,12 @@ > # @Prompt Reclaim variable space at EndOfDxe. > gEfiMdeModulePkgTokenSpaceGuid.PcdReclaimVariableSpaceAtEndOfDxe|FALSE|BOOLEAN|0x30000008 > > + ## Driver converts FVB function pointers.

> + # The value is FALSE as default to retain the current behaviour and retain compatibility with out of tree platorms.
> + # If the value is set to TRUE, variable driver does not convert the FVB function pointers.
> + # @Prompt Platform driver converts FVB pointers. > + gEfiMdeModulePkgTokenSpaceGuid.PcdDriverConvertsFvbFuncPointers|FALSE|BOOLEAN|0x3000000b > + > ## The size of volatile buffer. This buffer is used to store VOLATILE attribute variables. > # @Prompt Variable storage size. > gEfiMdeModulePkgTokenSpaceGuid.PcdVariableStoreSize|0x10000|UINT32|0x30000005 > diff --git a/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableDxe.c b/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableDxe.c > index 0fca0bb2a9..ede2a66682 100644 > --- a/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableDxe.c > +++ b/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableDxe.c > @@ -247,13 +247,15 @@ VariableClassAddressChangeEvent ( > UINTN Index; > > if (mVariableModuleGlobal->FvbInstance != NULL) { > - EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->GetBlockSize); > - EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->GetPhysicalAddress); > - EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->GetAttributes); > - EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->SetAttributes); > - EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->Read); > - EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->Write); > - EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->EraseBlocks); > + if (!PcdGetBool (PcdDriverConvertsFvbFuncPointers)) { > + EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->GetBlockSize); > + EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->GetPhysicalAddress); > + EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->GetAttributes); > + EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->SetAttributes); > + EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->Read); > + EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->Write); > + EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance->EraseBlocks); > + } > EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->FvbInstance); > } > EfiConvertPointer (0x0, (VOID **) &mVariableModuleGlobal->PlatformLangCodes); > diff --git a/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeDxe.inf b/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeDxe.inf > index c9434df631..f3373f8137 100644 > --- a/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeDxe.inf > +++ b/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeDxe.inf > @@ -137,6 +137,7 @@ > gEfiMdeModulePkgTokenSpaceGuid.PcdMaxUserNvVariableSpaceSize ## CONSUMES > gEfiMdeModulePkgTokenSpaceGuid.PcdBoottimeReservedNvVariableSpaceSize ## CONSUMES > gEfiMdeModulePkgTokenSpaceGuid.PcdReclaimVariableSpaceAtEndOfDxe ## CONSUMES > + gEfiMdeModulePkgTokenSpaceGuid.PcdDriverConvertsFvbFuncPointers ## CONSUMES > gEfiMdeModulePkgTokenSpaceGuid.PcdEmuVariableNvModeEnable ## SOMETIMES_CONSUMES > gEfiMdeModulePkgTokenSpaceGuid.PcdEmuVariableNvStoreReserved ## SOMETIMES_CONSUMES > > -- 2.27.0 > (1) I suggest filing a TianoCore BZ for this, and then submitting the patch to the mailing list with git-send-email. Please don't forget to CC the Variable Driver reviewers / maintainers. official: https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Development-Process unofficial hints: https://github.com/tianocore/tianocore.github.io/wiki/Laszlo%27s-unkempt-git-guide-for-edk2-contributors-and-maintainers (2) I don't think removing the original condition "mVariableModuleGlobal->FvbInstance" is valid. That condition comes from commit 7cd69959463a ("MdeModulePkg Variable: Add emulated variable NV mode support", 2019-01-24). The condition should likely be *restricted* with the new condition. (3) Rather than a PCD, I wonder if we should add a new FVB attribute bit instead, to EFI_FVB_ATTRIBUTES_2, returned by FVB2.GetAttributes(). This would raise the topic to the PI spec's level. For an example, we have EFI_FVB2_WEAK_ALIGNMENT (0x80000000) -- from commit 3837e91c58d4 ("MdeModulePkg: Add support for weakly aligned FVs.", 2013-09-16) --, associated with . Just an idea; it really needs to be commented upon by the maintainers. I don't "insist" that it be explained in the PI spec, just saying that it might help with the confusion. Thanks Laszlo