From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [edk2-devel] Conflicting virtual addresses causing Runtime Services issues
To: Ard Biesheuvel
Cc: devel@edk2.groups.io, jon@solid-run.com, Hao A Wu, Liming Gao, "Ard Biesheuvel (TianoCore)", "Leif Lindholm (Nuvia address)"
References:
 <5363bdf0-afac-73bf-d001-77949916f511@redhat.com> <166B374585A9D8FC.18699@groups.io> <290a35ce-9116-af00-85f4-8df1c5228680@redhat.com> <4841241f-fc6d-6185-efe6-ed9a536534dd@redhat.com>
From: "Laszlo Ersek"
Date: Fri, 12 Mar 2021 20:17:21 +0100

On 03/11/21 23:39, Ard Biesheuvel wrote:
> On Thu, 11 Mar 2021 at 23:25, Laszlo Ersek wrote:
>>
>> Adding Ard and Leif, comments below:
>>
>> On 03/11/21 15:50, Laszlo Ersek wrote:
>>> On 03/11/21 10:48, Jon Nettleton wrote:
>>
>> [...]
>>
>>>> And this is where the pointer gets remapped again and into the MMIO
>>>> space of the nor flash. If I remove the calls to ConvertPointer for
>>>> the FvbProtocol I am still seeing those addresses getting remapped
>>>> but only once and runtime works as expected.
>>>>
>>>> I am seeing that in
>>>> MdeModulePkg/Universal/Variable/RuntimeDxe/VariableDxe.c
>>>> &mVariableModuleGlobal->FvbInstance->* are all being converted. It
>>>> is possible this is a long standing bug and it just so happens that
>>>> our configuration has caused a conflict and exposed it.
>>>
>>> Yes, this is curious, I noticed it too yesterday, trying to see where
>>> the FVB protocol member function pointers were converted. I found that
>>> OVMF's flash driver (OvmfPkg/QemuFlashFvbServicesRuntimeDxe) didn't do
>>> it, but MdeModulePkg/Universal/Variable/RuntimeDxe did. That was
>>> certainly strange, as the variable driver is a consumer of the
>>> protocol (not the producer thereof), so I'd say it has no business
>>> poking new values into the protocol interface structure.
>>
>> [...]
>>
>>> ... Strangely, the other flash (FVB) driver in edk2,
>>> ArmPlatformPkg/Drivers/NorFlashDxe, *does* perform the conversion
>>> itself! See NorFlashVirtualNotifyEvent().
>>>
>>> I don't understand that. Is it possible that, with
>>> "ArmPlatformPkg/Drivers/NorFlashDxe" too, the conversion happens
>>> *twice*, but (at least) one of those mappings is "identity"?
>>
>> Confirmed.
>>
>> I had to write some elaborate debug patches for determining this,
>> because in ArmVirtQemu, I cannot produce DEBUG output from the
>> SetVirtualAddressMap() notification functions. So here's the approach I
>> took:
>>
>> (1) Introduce a new GUID-ed HOB structure in MdeModulePkg. The structure
>> itself lives in reserved memory, but its address is exposed in a GUID-ed
>> HOB. The structure is named FVB_ADDRESS_LIST, and it has the following
>> fields:
>>
>> - signature ("FVBADRLS" -- FVB Address List)
>> - 16 entries of:
>>   - owner signature [what driver set this entry]
>>   - address
>> - number of entries used (aka next entry to fill)
>>
>> (2) In PlatformPei, allocate and initialize this structure (in reserved
>> memory), and expose its address via the GUID-ed HOB. Furthermore,
>> produce a log message with the allocation address.
>>
>> (3) In NorFlashDxe, look up the structure via the GUID-ed HOB, in the
>> entry point function; remember the address in a global variable. In the
>> SetVirtualAddressMap() handler function, treat the conversion of the
>> "GetPhysicalAddress" FVB member function specially: via the global
>> variable pointer to FVB_ADDRESS_LIST in reserved memory, save both the
>> physical (original) and the virtual (converted) address of the
>> "GetPhysicalAddress" FVB member function, in new entries. As owner
>> signature in both entries, use "NORFLASH".
>>
>> (4) In the runtime DXE variable driver, do the exact same thing, just
>> use a different "owner signature" -- "VARIABLE".
>>
>> (5) Once the guest is up and running, run "efibootmgr --delete-timeout"
>> at a root prompt in the guest, deleting the existing "Timeout" UEFI
>> non-volatile variable, for verifying that the runtime variable (write)
>> service is functional.
>>
>> (6) Using the log message from point (2):
>>
>>> PlatformPeim: FvbAddressList @ 13FEC9000
>>
>> hexdump the guest memory containing the FVB_ADDRESS_LIST, as follows:
>>
>>> $ virsh qemu-monitor-command aavmf.rhel7.registered --hmp xp /268cb 0x13FEC9000
>>
>> Comments to the right of the hexdump:
>>
>>> 000000013fec9000: 'F' 'V' 'B' 'A' 'D' 'R' 'L' 'S' <- structure signature: FVBADRLS
>>> 000000013fec9008: 'N' 'O' 'R' 'F' 'L' 'A' 'S' 'H' <- entry[0], signature: NORFLASH
>>> 000000013fec9010: 'T' ' ' '\xc6' ';' '\x01' '\x00' '\x00' '\x00' <- entry[0], GetPhysicalAddress *physical*: 0x000000013bc62054
>>> 000000013fec9018: 'N' 'O' 'R' 'F' 'L' 'A' 'S' 'H' <- entry[1], signature: NORFLASH
>>> 000000013fec9020: 'T' ' ' 'N' '$' '\x00' '\x00' '\x00' '\x00' <- entry[1], GetPhysicalAddress *virtual*: 0x00000000244e2054
>>> 000000013fec9028: 'V' 'A' 'R' 'I' 'A' 'B' 'L' 'E' <- entry[2], signature: VARIABLE
>>> 000000013fec9030: 'T' ' ' 'N' '$' '\x00' '\x00' '\x00' '\x00' <- entry[2], GetPhysicalAddress *physical*: 0x00000000244e2054
>>> 000000013fec9038: 'V' 'A' 'R' 'I' 'A' 'B' 'L' 'E' <- entry[3], signature: VARIABLE
>>> 000000013fec9040: 'T' ' ' 'N' '$' '\x00' '\x00' '\x00' '\x00' <- entry[3], GetPhysicalAddress *virtual*: 0x00000000244e2054
>>> 000000013fec9048: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9050: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9058: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9060: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9068: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9070: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9078: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9080: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9088: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9090: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9098: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90a0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90a8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90b0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90b8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90c0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90c8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90d0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90d8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90e0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90e8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90f0: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec90f8: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9100: '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00' '\x00'
>>> 000000013fec9108: '\x04' '\x00' '\x00' '\x00' <- number of entries used: 4
>>
>> This shows the following:
>>
>> - both NorFlashDxe and the runtime DXE variable driver converted the
>>   FVB.GetPhysicalAddress member function,
>>
>> - the NorFlashDxe driver acted first, the runtime DXE variable driver
>>   acted second,
>>
>> - when the runtime DXE variable driver "converted" the "physical"
>>   address to virtual address, there was no change (and no crash!),
>>   because the virtual address map passed in by the Linux kernel
>>   apparently identity maps this area -- just as I guessed.
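For readers following the hexdump, here is a minimal sketch of how the FVB_ADDRESS_LIST described in point (1) could be laid out so that it occupies exactly the 268 bytes dumped above, plus a helper showing how the little-endian address bytes map to the values in the comments. The field names, FVB_ADDRESS_ENTRY, FvbAddressListAppend() and ReadLe64() are hypothetical; they are not taken from the attached debug patches, and the packed layout is an assumption inferred from the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FVB_ADDRESS_LIST_MAX_ENTRIES 16

/* Hypothetical layout, inferred from the hexdump; names are illustrative. */
#pragma pack(push, 1)
typedef struct {
  char     OwnerSignature[8];   /* "NORFLASH" or "VARIABLE", not NUL-terminated */
  uint64_t Address;             /* address recorded by the owning driver */
} FVB_ADDRESS_ENTRY;

typedef struct {
  char              Signature[8];                         /* "FVBADRLS" */
  FVB_ADDRESS_ENTRY Entry[FVB_ADDRESS_LIST_MAX_ENTRIES];
  uint32_t          NumEntriesUsed;                       /* next entry to fill */
} FVB_ADDRESS_LIST;
#pragma pack(pop)

/* 8 + 16 * 16 + 4 = 268 bytes, the length passed to "xp /268cb". */
_Static_assert(sizeof(FVB_ADDRESS_LIST) == 268, "unexpected size");
/* The counter sits at offset 0x108, as in the last row of the dump. */
_Static_assert(offsetof(FVB_ADDRESS_LIST, NumEntriesUsed) == 0x108,
               "unexpected counter offset");

/*
 * Record one address. Owner must point to at least 8 characters.
 * Returns 0 on success, -1 if all 16 entries are already used.
 */
static int FvbAddressListAppend(FVB_ADDRESS_LIST *List, const char *Owner,
                                uint64_t Address)
{
  FVB_ADDRESS_ENTRY *Entry;

  if (List->NumEntriesUsed >= FVB_ADDRESS_LIST_MAX_ENTRIES) {
    return -1;
  }
  Entry = &List->Entry[List->NumEntriesUsed];
  memcpy(Entry->OwnerSignature, Owner, sizeof Entry->OwnerSignature);
  Entry->Address = Address;
  List->NumEntriesUsed++;
  return 0;
}

/* Decode 8 dumped bytes (lowest address first) as a little-endian value. */
static uint64_t ReadLe64(const unsigned char *Bytes)
{
  uint64_t Value = 0;
  int      Index;

  for (Index = 7; Index >= 0; Index--) {
    Value = (Value << 8) | Bytes[Index];
  }
  return Value;
}
```

With this reading, the row at 0x13fec9010 ('T' ' ' '\xc6' ';' '\x01' followed by three zero bytes) decodes to 0x000000013bc62054, which is the value given in the comment for entry[0].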
>>
>> So we definitely have a bug (only Linux's page tables save us from the
>> crash); now the question is:
>>
>> Which driver is wrong to even attempt the conversion of the FVB member
>> functions?
>>
>> The answer must be documented somewhere highly visible.
>>
>> Debug patches attached, for the record (based on commit edd46cd407ea).
>>
>
> Thanks for inviting me to this party!
>
> So the tl;dr here is that some pointers get converted twice, which
> usually is not a problem because the virtual address resulting from
> the conversion is rarely mistaken for a physical address living in an
> EFI_MEMORY_RUNTIME region.

Ah, good point! Where I assumed that an identity mapping must have
existed, from the OS's mappings, there's a much simpler explanation
indeed:

If the "physical address" that's being converted doesn't fall into a
domain that's supposed to be runtime-mapped (per the "VirtualMap"
parameter of SetVirtualAddressMap()), the ConvertPointer() call simply
fails with EFI_NOT_FOUND, and the pointer is left intact.

> So I agree with Laszlo's assertion that the consumer of a protocol has
> no business updating its protocol pointers, so this should definitely
> be fixed in the core VariableRuntime driver. However, given the
> typical nature of the variable stack, i.e., a platform-specific NOR
> flash driver combined with the generic FTW and variable drivers, doing
> so would likely break many out-of-tree platforms where the NOR flash
> driver does not bother to update its pointers at all.

Yes, this is indeed the compatibility argument.

Where I see a gray area though is the PI spec. I checked PI v1.7
yesterday (all occurrences of "runtime"), and FVB drivers / protocols
are not required to be runtime drivers / protocols -- not even the
*possibility* is raised. The variable write arch protocol / driver must
be runtime, but how that may (or may not) translate to FVB is not
mentioned, as far as I recall.
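To make the double-conversion argument above concrete, here is a small stand-alone model of the lookup that ConvertPointer() performs. It is only a sketch: RUNTIME_RANGE, ModelConvertPointer() and ConvertTwice() are hypothetical names, the real service takes a DebugDisposition and a VOID ** and walks EFI_MEMORY_DESCRIPTORs, and the range bounds below (including the NOR flash window) are assumed values chosen to be consistent with the two addresses in the hexdump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one EFI_MEMORY_RUNTIME entry of the virtual map. */
typedef struct {
  uint64_t PhysicalStart;
  uint64_t VirtualStart;
  uint64_t NumberOfBytes;
} RUNTIME_RANGE;

enum { CONVERT_SUCCESS = 0, CONVERT_NOT_FOUND = 1 };

/*
 * Model of the lookup: if *Address lies in the physical range of some
 * runtime entry, rewrite it to the matching virtual address. Otherwise
 * report "not found" and leave *Address untouched.
 */
static int ModelConvertPointer(const RUNTIME_RANGE *Map, size_t Count,
                               uint64_t *Address)
{
  size_t Index;

  for (Index = 0; Index < Count; Index++) {
    if (*Address >= Map[Index].PhysicalStart &&
        *Address - Map[Index].PhysicalStart < Map[Index].NumberOfBytes) {
      *Address = Map[Index].VirtualStart +
                 (*Address - Map[Index].PhysicalStart);
      return CONVERT_SUCCESS;
    }
  }
  return CONVERT_NOT_FOUND;
}

/*
 * What happens when both the FVB producer and the variable driver convert
 * the same member pointer: the second call sees an already-virtual address.
 */
static uint64_t ConvertTwice(const RUNTIME_RANGE *Map, size_t Count,
                             uint64_t Address)
{
  (void)ModelConvertPointer(Map, Count, &Address);  /* producer */
  (void)ModelConvertPointer(Map, Count, &Address);  /* consumer, redundant */
  return Address;
}
```

If the map holds only the (assumed) range 0x13bc60000 -> 0x244e0000 of 0x10000 bytes, the second call does not find 0x244e2054 and the pointer survives. If the map also holds a runtime MMIO window whose physical range happens to cover 0x244e2054, the second call succeeds and moves the pointer into that window, which would match what Jon reported.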

FWIW, the variable driver bug goes back to historical commit
8a9e0b7274c69, dated 2009-03-09. The commit message is... obscure.

Hmmm... look at related commit 00f3851372eb ("retire FvbServiceLib class
in MdeModulePkg [...]", 2009-03-09). It looks like the ConvertPointer()
stuff was originally there? "Firmeware Volume BLock Service Library".
FvbServiceLib seems to have been a helper library for FVB drivers, and
so it was within its rights to offer pointer conversion services...
FvbLibInitialize() was the constructor, and it registered
FvbVirtualAddressChangeNotifyEvent().

FvbServiceLib was originally added in commit 677472aae492, dated
2008-10-25.

I don't know why commit 8a9e0b7274c69 merged the pointer conversion into
the variable driver; that seems to have been wrong. But... it's been
with us for 12 years now :/

Thanks
Laszlo