From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-ej1-f49.google.com (mail-ej1-f49.google.com [209.85.218.49]) by mx.groups.io with SMTP id smtpd.web10.8193.1615477838388470414 for ; Thu, 11 Mar 2021 07:50:38 -0800 Authentication-Results: mx.groups.io; dkim=pass header.i=@solid-run-com.20150623.gappssmtp.com header.s=20150623 header.b=FX940upk; spf=pass (domain: solid-run.com, ip: 209.85.218.49, mailfrom: jon@solid-run.com) Received: by mail-ej1-f49.google.com with SMTP id hs11so47195627ejc.1 for ; Thu, 11 Mar 2021 07:50:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=solid-run-com.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=NsJdRGJvN8OiKKOfla7STitZIbINWi7QVkYP4dbubaA=; b=FX940upkZxQIBEcbU0IMVd+KCOHV0kYRNnxxrcv2IkIUf2IqsgR47D3xkmxAL/ckhm mblfSJTxVxkOubTF4QWft00f6Wr4Blm0F6x2CTIF3QPy+xKTrbkpBeR92VrgsswoI9y3 K1qOkLqC0wKwkfhmrFXzBjrRnQf1DPkr9oo/2227d0qIYCnnHyLd3RAEHOuz2325bsj0 dCnIcuO0PD2+l11Pn59N+aSNAfhuPzmLEyA7WODS7bfgMaSSjrWxk79tgJNxUkfzplDV jS2C2lOI1VFXIIjqxnGyfoXU4sdthV6aRWUnaxhPpkSlp/uLrhbRoN1nXsBtfF/Udspr Z4oQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=NsJdRGJvN8OiKKOfla7STitZIbINWi7QVkYP4dbubaA=; b=dbveOqUL26uk1act8g/Kf1v+z9NPEL7MeVSwv3Zz0DK9GYAcZUgJhDct0mb6ZOttuZ X6UW/WWS0WydR4loOkeCMKgRAYxBjOTkldKMu/S4RE1guyMXD5SomByvuyg41D5EBI94 ms1Hc/OlNXhXdmY/nhoHEsVXehLZuYn2+C1ptRl0g7dztR/fj0yC3BVWhiy9vLpaFfdY ihP70zOOiQH1cZ/U3MPlUdnyhAoyu7gO8LMgS+F0Ppqy4rZvSZ1YsUXoQ+u8wWyDvRqq S/83kmj5m2evzcuCNbQCJh+FknsqsyaE81AT4ILPOHrR0tHDip0ipEHLWD0xgLRRgXtX LMeA== X-Gm-Message-State: AOAM530rdmneLcuenhcxKcOp98kVZcvLpkl4qJ0WzIlrpC+kR4ivgm7f rncUnMlDQg34Soh2FokgGf2ur2FPB66z6QNR6N+y/I//F07Eig== X-Google-Smtp-Source: ABdhPJwddayutD9OzkJ+i2sRcid4csQMRO2TBU95VhAaoFSyza+OYD1fI3RhJVnMqPGMBXjbpUW5nZ8klUJbwr88iT4= X-Received: by 2002:a17:906:fc1c:: with SMTP id ov28mr3697643ejb.342.1615477836120; Thu, 11 Mar 2021 07:50:36 -0800 (PST) MIME-Version: 1.0 References: <5363bdf0-afac-73bf-d001-77949916f511@redhat.com> <166B374585A9D8FC.18699@groups.io> <290a35ce-9116-af00-85f4-8df1c5228680@redhat.com> In-Reply-To: <290a35ce-9116-af00-85f4-8df1c5228680@redhat.com> From: "Jon Nettleton" Date: Thu, 11 Mar 2021 16:49:57 +0100 Message-ID: Subject: Re: [edk2-devel] Conflicting virtual addresses causing Runtime Services issues To: devel@edk2.groups.io, Laszlo Ersek Cc: Hao A Wu , Liming Gao Content-Type: text/plain; charset="UTF-8" On Thu, Mar 11, 2021 at 3:51 PM Laszlo Ersek wrote: > > On 03/11/21 10:48, Jon Nettleton wrote: > > On Thu, Mar 11, 2021 at 7:54 AM Jon Nettleton via groups.io > > wrote: > >> > >> On Wed, Mar 10, 2021 at 3:52 PM Laszlo Ersek wrote: > >>> > >>> On 03/10/21 09:04, Jon Nettleton wrote: > >>>> I am debugging a failure that I am seeing while using the HoneyComb's > >>>> spi-nor flash for runtime variable storage. I am hoping someone on > >>>> the list can give me some insight as what may be the problem. > >>>> > >>>> The problem showed up when we switched the MMIO region for the fspi > >>>> flash device to be marked as non executable. reading variables is > >>>> fine, however writes began throwing an error. > >>>> > >>>> [ 556.709828] Unable to handle kernel execute from non-executable > >>>> memory at virtual address 00000000206a3968 > >>>> > >>>> I have patched the kernel and removed the X86 requirement and enabled > >>>> the sysfs runtime mappings kernel config so I can get an easy view of > >>>> the mappings the kernel carries for runtime services. I then track > >>>> that virtual address to the MMIO region of nor flash, which makes > >>>> sense that region is marked as non executable. The question is why is > >>>> code being executed from this address range > >>>> > >>>> attribute > >>>> :::::::::::::: > >>>> 0x8000000000000001 > >>>> :::::::::::::: > >>>> num_pages > >>>> :::::::::::::: > >>>> 0x40 > >>>> :::::::::::::: > >>>> phys_addr > >>>> :::::::::::::: > >>>> 0x20500000 > >>>> :::::::::::::: > >>>> type > >>>> :::::::::::::: > >>>> 0xb > >>>> :::::::::::::: > >>>> virt_addr > >>>> :::::::::::::: > >>>> 0x20680000 > >>>> > >>>> So then I patched the PL011 serial driver to be able to log to the > >>>> console in runtime and I track down the access to Status = > >>>> Fvb->GetPhysicalAddress(Fvb, &FvVolHdr); in UpdateVariableStore(). > >>>> What I don't understand is why EfiConvertPointer is mapping that > >>>> pointer into the Virtual address space occupied by the runtime mmio of > >>>> the flash. The pointer is being properly remapped. Here are the > >>>> pointer addresses in EFI and Kernel Runtime > >>>> > >>>> EFI: > >>>> UpdateVariableStore:156 ECE33968 > >>>> FvbGetPhysicalAddress(BaseAddress=0x20000000) > >>>> > >>>> KERNEL: > >>>> UpdateVariableStore:156 206A3968 > >>>> [ 556.709828] Unable to handle kernel execute from non-executable > >>>> memory at virtual address 00000000206a3968 > >>>> > >>>> Any insight that anyone could provide would be much appreciated. > >>> > >>> Your platform appears misconfigured -- the flash MMIO range appears to > >>> overlap runtime services code even before SetVirtualAddressMap. The > >>> virtual address conflict is likely the result of the original physical > >>> address conflict. > >> > >> There is no physical address conflict. Running in physical mode the pointer > >> for GetPhysicalAddress is at 0xECE33968 and the MMIO physical > >> region is 0x20000000 - 0x2FFFFFFF. Shown as the BaseAddress above. > >> Obviously I don't use that full MMIO region because our flash is not > >> 256MB. > >> > >>> > >>> After the virtual address updates, the EfiMemoryMappedIO (0xb) type > >>> range is mapped at virtual address range [0x20680000, 0x206C0000). The > >>> GetPhysicalAddress function seems to be located at offset 0x23968 in > >>> that range (with 0x1C698 bytes to go in the range). That's inexplicable. > >> > >> Yes, exactly why I am reaching out. > >> > >>> > >>> What is the physical base address of the flash? What are the PCDs used > >>> in "ArmPlatformPkg/Drivers/NorFlashDxe/NorFlashFvb.c"? > >> > >> We don't use the ArmPlatformPkg > >> > >> The code is available here. > >> https://github.com/SolidRun/edk2-platforms/tree/LX2160_UEFI_ACPI_EAR3-lx2160cex7/Silicon/NXP/Drivers/SpiNorFlashDxe > >> > >> Thanks > >>> > > > > Found the root of my problem and fixed it. As used from other devices > > they are relocating the individual pointer for each function in FvbProtocol > > of the DXE. This is done in ArmPlatformPkg/Drivers/NorFlashDxe/NorFlashFvb.c, > > as well as lots of vendor drivers. I was able to dump the addresses being used > > in RuntimeDriverConvertPointer and could locate the pointer to > > GetPhysicalAddress. > > > > Convert 0xEC8C1648:0xEC8B0000:0x203D0000 > > New virtual address is then 203E1648 which is fine. > > > > However then later down in the remapping I found > > Convert 0x203E1648:0x20000000:0x20800000 > > > > And this is where the pointer gets remapped again and into the > > MMIO space of the nor flash. If I remove the calls to > > ConvertPointer for the FvbProtocol I am still seeing those addresses > > getting remapped but only once and runtime works as expected. > > > > I am seeing that in MdeModulePkg/Universal/Variable/RuntimeDxe/VariableDxe.c > > &mVariableModuleGlobal->FvbInstance->* are all being converted. It is > > possible this is a long standing bug and it just so happens that our > > configuration > > has caused a conflict and exposed it. > > Yes, this is curious, I noticed it too yesterday, trying to see where > the FVB protocol member function pointers were converted. I found that > OVMF's flash driver (OvmfPkg/QemuFlashFvbServicesRuntimeDxe) didn't do > it, but MdeModulePkg/Universal/Variable/RuntimeDxe did. That was > certainly strange, as the variable driver is a consumer of the protocol > (not the producer thereof), so I'd say it has no business poking new > values into the protocol interface structure. > > So the issue (IIUC) on your platform is that the function pointers are > doubly-converted, once by your flash driver (which is arguably the right > thing), and another time by the variable driver. Correct, that seems to be what is happening > > I recommend filing a bug about this, at > , against "VariableDxe.c". Sure > > I suggest CC'ing the "UEFI Variable modules" reviewers on the BZ as > well, from "Maintainers.txt" -- FWIW, I'm CC'ing Hao and Liming on this > email, now. > > ... It feels as if the Platform Init specification should speak to the > FVB[2] protocol's runtime aspects... But (as of v1.7), I don't see > anything runtime-related in that section. In fact I've checked the whole > PI v1.7 spec for "runtime", and the closest matches are under > EFI_VARIABLE_WRITE_ARCH_PROTOCOL. FVB2 is not spelled out as a > (potential) runtime dependency of EFI_VARIABLE_WRITE_ARCH_PROTOCOL. The > only explicitly runtime protocol (unrelated to runtime services) is > "Status Code Runtime Protocol". > > So what we're seeing here is an edk2-ism, I believe. I checked the commits and the code that does that conversion is quite old. > > I tried seeing if one of the "tour beyond BIOS" edk2 whitepapers said > anything on FVB -- but the only reference I found (in "Implementing > UEFI Authenticated Variables in SMM with EDKII", version 2) mentions > only *SMM* FVB. > > https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-white-papers > > I think we should have a TianoCore BZ about this, even if we don' change > the code as a result -- which driver is responsible for updating the > runtime FVB protocol member functions should be spelled out somewhere > "searchable". > > ... Strangely, the other flash (FVB) driver in edk2, > ArmPlatformPkg/Drivers/NorFlashDxe, *does* perform the conversion > itself! See NorFlashVirtualNotifyEvent(). > > I don't understand that. Is it possible that, with > "ArmPlatformPkg/Drivers/NorFlashDxe" too, the conversion happens > *twice*, but (at least) one of those mappings is "identity"? I believe all the drivers that are doing the conversion are getting double mappings. It just so happens our second mapping got poked into an area that we had marked non-executable which started the debugging and led to the issue. > > Thanks > Laszlo > > > > > >