From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=66.187.233.73; helo=mx1.redhat.com; envelope-from=lersek@redhat.com; receiver=edk2-devel@lists.01.org Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id B1F8D223CDC38 for ; Mon, 5 Mar 2018 10:16:40 -0800 (PST) Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 85572405CC05; Mon, 5 Mar 2018 18:22:52 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-64.rdu2.redhat.com [10.10.120.64]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2AF8B111CB9E; Mon, 5 Mar 2018 18:22:48 +0000 (UTC) To: =?UTF-8?Q?Marc-Andr=c3=a9_Lureau?= , Andrew Fish Cc: edk2-devel@lists.01.org, QEMU , Javier Martinez Canillas , Peter Jones , Jiewen Yao References: <20180223132311.26555-1-marcandre.lureau@redhat.com> <20180223132311.26555-4-marcandre.lureau@redhat.com> <91D51528-C2CA-41DF-921F-DBD5C2EC028C@apple.com> From: Laszlo Ersek Message-ID: <26353964-066c-5664-1d67-6a79a3d21ac9@redhat.com> Date: Mon, 5 Mar 2018 19:22:48 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Mon, 05 Mar 2018 18:22:52 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Mon, 05 Mar 2018 18:22:52 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'lersek@redhat.com' RCPT:'' Subject: Re: [PATCH 3/7] HACK: HobLib: workaround infinite loop X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Mar 2018 18:16:41 -0000 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit On 03/05/18 15:05, Marc-André Lureau wrote: > Hi > > On Fri, Feb 23, 2018 at 8:45 PM, Andrew Fish wrote: >> >> >>> On Feb 23, 2018, at 5:23 AM, marcandre.lureau@redhat.com wrote: >>> >>> From: Marc-André Lureau >>> >>> Without this hack, GetNextHob() loops infinitely with the next >>> patch. I don't understand the reason. >>> >>> The loop is triggered by the GetFirstGuidHob (&gTpmErrorHobGuid) >>> call. >>> >>> CC: Laszlo Ersek >>> CC: Stefan Berger >>> Contributed-under: TianoCore Contribution Agreement 1.0 >>> Signed-off-by: Marc-André Lureau >>> --- >>> MdePkg/Library/PeiHobLib/HobLib.c | 4 ++++ >>> 1 file changed, 4 insertions(+) >>> >>> diff --git a/MdePkg/Library/PeiHobLib/HobLib.c b/MdePkg/Library/PeiHobLib/HobLib.c >>> index 5c0eeb992f..ed3c5fbd6d 100644 >>> --- a/MdePkg/Library/PeiHobLib/HobLib.c >>> +++ b/MdePkg/Library/PeiHobLib/HobLib.c >>> @@ -89,6 +89,10 @@ GetNextHob ( >>> if (Hob.Header->HobType == Type) { >>> return Hob.Raw; >>> } >>> + if (GET_HOB_LENGTH (HobStart) == 0) { >> >> As Laszlo points out this error condition is likely memory >> corruption. Thus it would be better to check for all know illegal >> values? >> >> if (GET_HOB_LENGTH(HobStart) < sizeof (EFI_HOB_GENERIC_HEADER) >> > > Thanks, I have adjusted the check. > > With manual calls and printf (I don't know a better way to debug ovmf > ;), Well that's how I generally debug it too :) > I try to locate the issue. It's somehow related to > RegisterForShadow(). The "corruption" seems to happen during the > second call. After the > PeiLoadImage(...,PEIM_STATE_REGISTER_FOR_SHADOW,..), right before > calling PeimEntryPoint(), a GetFirstGuidHob() succeed, but inside the > function, it fails (with the same arguments). Right after it succeeds > again... The PeimEntryPoint() is not the Tcg2Pei:PeimEntryMA(), I > suppose there is some kind of wrapping code, but I fail to find where. > Any idea? This sounds helpful. I don't know what the problem is, but I can elaborate on your details a bit; perhaps someone else will have more ideas. Apparently there is a PEI service called RegisterForShadow(). ("Apparently", because I've never seen, let alone written, a PEIM calling this service.) The service is specified in the PI spec, which is quoted in the edk2 tree [MdePkg/Include/Pi/PiPeiCis.h]: > /** > Register a PEIM so that it will be shadowed and called again. > > This service registers a file handle so that after memory is > available, the PEIM will be re-loaded into permanent memory and > re-initialized. The PEIM registered this way will always be > initialized twice. The first time, this function call will > return EFI_SUCCESS. The second time, this function call will > return EFI_ALREADY_STARTED. Depending on the order in which > PEIMs are dispatched, the PEIM making this call may be > initialized after permanent memory is installed, even the first > time. > > @param FileHandle PEIM's file handle. Must be the currently > executing PEIM. > > @retval EFI_SUCCESS The PEIM was successfully registered for > shadowing. > @retval EFI_ALREADY_STARTED The PEIM was previously > registered for shadowing. > @retval EFI_NOT_FOUND The FileHandle does not refer to a > valid file handle. > > **/ > typedef > EFI_STATUS > (EFIAPI *EFI_PEI_REGISTER_FOR_SHADOW)( > IN EFI_PEI_FILE_HANDLE FileHandle > ); PEIMs generally "execute in place" (XIP), i.e. they run from flash, not RAM. In this status they use "temporary RAM" (e.g. CPU caches configured like RAM) for stack & heap; whatever HOBs they produce are stored in "temp RAM" as well. Then one of the PEIMs "discovers permanent RAM" (basically it programs the memory controller and publishes the RAM ranges). In turn the PEI core "migrates" PEIMs from temporary to permanent RAM, including the HOB list. Before the temporary RAM migration (when still executing in place from flash), PEIMs cannot have writeable global variables. For example, dynamic PCDs are also maintained in a HOB (the PCD HOB). A PEIM normally cannot (and shouldn't) tell whether it is dispatched before or after permanent RAM is published. If needed, a PEIM can advertise that it depends on permanent RAM (for example because it needs a lot of heap memory) by including "gEfiPeiMemoryDiscoveredPpiGuid" in its DEPEX. Finally, it seems like a PEIM can also express, "I'm fine with being dispatched from both flash (XIP) vs. permanent RAM, just the PEI core tell me whichever it is". Apparently, if the PEIM is dispatched from flash (before permanent RAM is available), its call to RegisterForShadow() returns EFI_SUCCESS (and then its entry point function will be invoked for a 2nd time, after the temp RAM migration). And when a PEIM is dispatched from RAM (either for the very first time, or for the second time, after being dispatched from flash first), the same call returns EFI_ALREADY_STARTED. Honestly, I'm unsure what this is good for (both in general, and specifically for Tcg2Pei). Apparently, Tcg2Pei needs permanent RAM for doing the measurements (which makes sense); I just wonder what exactly it does so that it cannot simply specify "gEfiPeiMemoryDiscoveredPpiGuid" in its DEPEX. I do see that the (!mImageInMemory) branch contains the TPM initialization code. However, as the PI spec itself says, it is not guaranteed that a PEIM will be dispatched from XIP (i.e., before permanent RAM) *at all*. So it's not clear to me how "business functionality" can depend on XIP. Now, OVMF is a bit different wrt. "memory controller programming" -- it runs on virtual platforms where programming the memory controllers is unnecessary. What happens is, instead, that only the SEC phase runs from flash (XIP), and it decompresses even the PEI firmware volume to normal RAM. The phase where PEIMs "run from flash" (XIP) still exists, of course, except they actually run from RAM -- so, for example, they have writeable global variables right off the bat. Perhaps this interferes somehow with the RegisterForShadow() service and/or how PEIMs expect that service to work. Regarding the PEIM entry point -- the PEIMs' "own" entry point functions are always wrapped. - The outermost (common) entry point function is called _ModuleEntryPoint(). It is declared in "MdePkg/Include/Library/PeimEntryPoint.h". This is what the PEI core calls when dispatching a PEIM. - Individual PEIMs add the PeimEntryPoint lib class to their INF files, under [LibraryClasses]. The implementation is in "MdePkg/Library/PeimEntryPoint". In particular, the function calls ProcessLibraryConstructorList(). - The build tools generate the source code for ProcessLibraryConstructorList(); based on the (recursively determined) library instance dependency tree. (Search the Build subdirectory for "AutoGen.c" files.) So, before the PEIM's actual entry point is called, the lib instances' CONSTRUCTOR functions are called, in the right order, so that the PEIM is entered with all libraries "ready to use". I guess that, when Tcg2Pei is dispatched for the 2nd time, from permanent RAM, the first (successful) GetFirstGuidHob() call that you see occurs like this, from generated code, as part of library construction. Some library instance's constructor function calls GetFirstGuidHob(), successfully. The corruption could somehow be related to the HOB list's migration from temp RAM to permanent RAM. The OVMF debug log should contain something like this: > Temp Stack : BaseAddress=0x818000 Length=0x8000 > Temp Heap : BaseAddress=0x810000 Length=0x8000 > Total temporary memory: 65536 bytes. > temporary memory stack ever used: 28656 bytes. > temporary memory heap used for HobList: 6056 bytes. > temporary memory heap occupied by memory pages: 0 bytes. Implying that, before the temp RAM migration, the HOB list consumed ~6KB in total of the 32KB temp RAM heap. It seems unlikely that we run out of temp RAM heap. Hopefully this helps you proceed with the debugging... Corrections are welcome too, obviously! Thanks, Laszlo