From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mx.groups.io with SMTP id smtpd.web11.35331.1618244605909298391 for ; Mon, 12 Apr 2021 09:23:26 -0700 Authentication-Results: mx.groups.io; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=EWz8KHk4; spf=pass (domain: redhat.com, ip: 170.10.133.124, mailfrom: lersek@redhat.com) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1618244605; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=aBs3qs8LLuOVrzR6kQBqPjrIpSsa9TYmfE1HTC7OWLc=; b=EWz8KHk43gMTBxpHEYiGQQ5SDtTvLJ9MDM+M+aCIsu+21OezTpeOK4Wj0mrmcJJdp1Ye8k 0RkUe4iIOkBXfblfveCbTKFIh62OzVvQQVSvQUVElW5EbevUsHX7xCZ0twF6tvneGvP9mh mmhYa5c/dPcghm5//00RCV8yxq2pv28= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-186-oKMWg0JWNEOd8Y72jP86cQ-1; Mon, 12 Apr 2021 12:23:22 -0400 X-MC-Unique: oKMWg0JWNEOd8Y72jP86cQ-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D109E10054F6; Mon, 12 Apr 2021 16:23:20 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-115-23.ams2.redhat.com [10.36.115.23]) by smtp.corp.redhat.com (Postfix) with ESMTP id 70E0E6062F; Mon, 12 Apr 2021 16:23:18 +0000 (UTC) From: "Laszlo Ersek" Subject: Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support To: Brijesh Singh , devel@edk2.groups.io Cc: James Bottomley , Min Xu , Jiewen Yao , Tom Lendacky , Jordan Justen , Ard Biesheuvel References: <20210324153215.17971-1-brijesh.singh@amd.com> <34a50603-32b9-1476-04a5-8476dd810fe4@redhat.com> <213ce382-0b04-16f4-dd77-f9bb2cc32698@amd.com> <2cb456d3-c407-ff00-c274-af30d4385749@amd.com> Message-ID: <867ca68c-0a55-7466-43c2-cf6845fb8d75@redhat.com> Date: Mon, 12 Apr 2021 18:23:17 +0200 MIME-Version: 1.0 In-Reply-To: <2cb456d3-c407-ff00-c274-af30d4385749@amd.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=lersek@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit On 04/10/21 00:43, Brijesh Singh wrote: > > On 4/9/21 7:24 AM, Laszlo Ersek wrote: >> On 04/08/21 13:59, Brijesh Singh wrote: >>> On 4/8/21 4:58 AM, Laszlo Ersek wrote: >>>> On 03/24/21 16:31, Brijesh Singh wrote: >>>>> At this time we only support the pre-validation. OVMF detects all >>>>> the available system RAM in the PEI phase. When SEV-SNP is >>>>> enabled, the memory is validated before it is made available to >>>>> the EDK2 core. >>>> Can you describe this in a bit more detail, before I look at the >>>> individual patches? Specifically, what existing logic in the PEI >>>> phase was taken, and extended, and how? >>> One of the key requirement is that the guest private pages much be >>> validated before the access. If guest tries to access the pages >>> before the validation then it will result in #VC >>> (page-not-validated) exception. To avoid the #VC, we propose the >>> validating the memory before the access. We will incrementally add >>> the support to lazy validate (i.e validate on access). >> What is the primary threat (in simple terms please) that validation >> is supposed to prevent? > > > To protect against the memory re-mapping attack the guest pages must > be validated. The idea is that every guest page can map only to a > single physical memory page at one time. I don't understand. How can a given guest page frame map to multiple host page frames? Do you mean that the situation to prevent is when multiple guest page frames (GPAs) map to the same host page frame, as set up by the hypervisor? >> And, against that particular threat, does pre-validation offer some >> protection, or will that only come with lazy validation? > > For the hardware it does not matter how the memory was validated -- > lazy vs prevalidate. Both approaches use the PVALIDATE instruction to > validate the page. > > In the case of pre-validation, the memory is validated before the > access. Whereas in the lazy validation, the access will cause a fault > and fault handler should validate the page and retry the access. Its > similar to a page fault handler, OS can populate the page table or > back the pages on demand. > > The only downside of pre-validation is, we will take a hit on the boot > time. The GHCB spec provides method by which we can batch multiple > requests at once to minimize the context switches. If pre-validation is simpler, and the only drawback is a one-time hit during boot, then wouldn't it be better to stick with pre-validation forever? I expect that would be a lot simpler for (a) the #VC handlers, (b) the tracking of the "already validated" information. >>> Similar to SEV, the OVMF_CODE.fd is encrypted through the SNP >>> firmware before the launch. The SNP firmware also validates the >>> memory page after encrypting. This allows us to boot the initial >>> entry code without guest going through the validation process. >>> >>> The OVMF reset vector uses few data pages (e.g page table, early Sec >>> stack). Access to these data pages will result in #VC. There are two >>> approaches we can take to validate these data pages: >>> >>> 1. Ask SNP firmware to pre-validate it -- SNP firmware provides an >>> special command that can be used to pre-validate the pages without >>> affecting the measurement. >> This means the two pflash chips, right? > > > This does not need to be two pflash chips. The SNP firmware command > does not know anything about the ROM or pflash chip. The command > accepts the system physical address that need to be validated by the > firmware. In patch #2, OVMF provides a range of data pages that need > to be validated by the SNP firmware before booting the guest. I guess my question related to the guest code that executes from pflash, and/or accesses data from pflash, before the guest itself could ask for any validation. Such as, the reset vector (which runs from pflash). Then, some non-volatile UEFI variable data that resides in the other pflash chip, and is accessed (read-only) during the PEI phase (the memory type information variable namely). I understood your comments as QEMU pre-validating those areas before guest launch. Is that correct? Anyway, I guess at this point I should just start reviewing the series. Some more comments below. >> - We need more areas made accessible in SEC than just the >> decompression buffer; for example the temporary SEC/PEI heap and >> stack, and (IIRC) various special SEV-ES areas laid out via MEMFD. >> Where are all of those handled / enumerated? > > > Sorry, I simplified my response by saying just decompression. You are > right that its more than the decompression buffer. In my current > patch, the SEC phase validates all the memory up to > PcdOvmfDecompressionScratchEnd (which includes more than just output > buffer). See patch #13. That sounds good (will check the patch out of course). >>> One of the important thing is we should *never* validate the pages >>> twice. >> What are the symptoms / consequences of: >> >> - the guest accessing an unvalidated page (I understand it causes a >> #VC, but what is the direct result of that, when this series is >> applied?), > > > After the series it applied, an access to an unvalidated page will > result in #VC. The series does not extend the VC handler to handle the > page-not-validated error code. The current VC handler > [InternalVmgExitHandleVc()] will inject a #GP to terminate the guest > if #VC is raised for unvalidated page. OK. >> - doubly-validating a page? >> >> The first question is relevant because we should crash as soon as we >> touch a page we forgot to validate (we shouldn't let any corruption >> or similar spread out until we finally realize there's a problem). >> >> The second question is relevant for security I guess. What attacks >> become possible, and/or what symptoms are produced, if we >> doubly-validate a page? > > Assuming that guest validates its memory, this guarantees the > injective mapping between GPAs and SPAs. > > As I noted that the page validation process consist of two steps #1 > RMPUPDATE and 2#PVALIDATE. The RMPUPDATE is a hypervisor instruction > and PVALIDATE is a guest instruction. A malicious hypervisor can remap > gpa to different physical address in RMP table using the RMPUPDATE > just before the second PVALIDATE instruction is executed. The guest > need to ensure that it does not leave the security vulnerability that > can be exploited by a malicious hypervisor. The SNP white paper > recommends that a guest OS should keep track of what is has validate > so that it can avoid getting into these situation. Hm, OK. I think this answers my top-most question. >> Furthermore, IIRC, we have separate #VC handlers for the different >> firmware phases; do they behave consistently / identicall when a >> #VC(page-not-validated) occurs, when this patch set is applied? >> >> My first question is basically asking whether we can *exclusively* >> rely on #VC(page-not-validated) faults to tell us that we missed >> validating a particular page. If we can do that, then the job is a >> bit easier, because from the GPA, we can more or less also derive >> *when and where* we should pre-validate the page (at least until >> validation is done completely lazily). > > Yes, we should be able to rely #VC to tell us that we missed > validating the page. In my this series so far I have not seen a #VC > due to page-not-validated because we have carefully validated the > pages before they are accessed. I am thinking that > #VC(page-not-validated) should be treated as an error in the first > phase. Incrementally we will add the lazy validation, in which the > #VC(page-not-validated) will provide a hint as to what has not been > validated. I'm OK with the current handling of #VC(page-not-validated) -- i.e., fatal error. I'm not yet convinced that lazy validation will be useful at all, but that could be just my lack of understanding. >>> The MemEncryptSevSnpValidateRam() uses a interval search tree to >>> keep the record of what has been validated. Before validating the >>> range, it lookup in its tree and if it finds that range is already >>> validated then do nothing. If it detects an overlap then it will >>> validate only non overlapping regions -- see patch #14. hat data >>> structure is used for implementing the interval tree? >> >> I'm not necessarily looking for a data structure with "nice" >> asymptotic time complexity. With pre-validation especially, I think >> simplicity (ease of review) is more important for the data structure >> than performance. If it's not an actual "tree", we shouldn't call it >> a "tree". (An "interval tree" is usually an extension of a Red-Black >> Tree, and that's not the simplest data structure in existence; >> although edk2 does offer an rbtree library.) > > > I am using binary search tree. Currently, we don't have many nodes to > track. Most of the guest memory is private and are validated before we > exit from the PEI phase. My preliminary comments about MemEncryptSevSnpValidateSystemRam(): (1) The function prototype (patch#12) promises that the function can never fail. In patch#14 however, I see some AddRangeToIntervalTree() calls whose return values are not checked. I think error checking should be strict, and if an error is detected (such as memory allocation failure), a plain ASSERT (FALSE) is not sufficient. Both ASSERT() and CpuDeadLoop() should be used explicitly, for the sake of RELEASE builds. And the function prototype (the leading comment) should highlight that the function either succeeds, or doesn't return. (2) I don't like the recursive nature of the functions. That's always convenient first, and almost always a mess to get out of later. However, this is not a request to complicate the tree's implementation, because... (3) ... you mention we don't have many nodes to track. Is a tree structure, with dynamically allocated nodes, justified at all? (4) Regarding repeating dynamic allocation -- that's never a great thing to do in the SEC / PEI phases. Can we use a fixed size, once-allocated, or even static, array somewhere? In particular, patch#14 uses AllocatePool() for SNP_VALIDATED_RANGE, in the PEI phase, that results in EFI_HOB_TYPE_MEMORY_POOL HOBs; see: AllocatePool() [MdePkg/Library/PeiMemoryAllocationLib/MemoryAllocationLib.c] PeiServicesAllocatePool() [MdePkg/Library/PeiServicesLib/PeiServicesLib.c] PeiAllocatePool() [MdeModulePkg/Core/Pei/Memory/MemoryServices.c] Now, dependent on where you call this AllocatePool() from, it could occur still on the temporary SEC/PEI RAM -- i.e., before permanent PEI RAM is installed. That means that such HOBs would be subject to the HOB list migration, after permanent PEI RAM is installed. The PI spec writes about EFI_HOB_MEMORY_POOL: "The HOB consumer phase should be able to ignore these HOBs". The PI spec writes about AllocatePool(): "The early allocations from temporary memory will be migrated to permanent memory when permanent main memory is installed; this migration shall occur when the HOB list is migrated to permanent memory". So, whenever we use AllocatePool() in the PEI phase, we need an "address stability proof", namely that the AllocatePool() call, and any later access to the allocated area (the HOB), are never separated by the installation of "gEfiPeiMemoryDiscoveredPpiGuid" in the PPI database, by the PEI Core. If we can't guarantee that, then an access later on could occur to the original allocation address, which has by then been invalidated by the temporary RAM migration (= the allocation HOB has been moved). AllocatePages() is "stable" however, in this sense; no proof like the one described above is necessary. (5) The "mRootNode" variable is defined in the file "OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiSnpSystemRamValidate.c" (patch #14). That means that every single module (i.e., PEIM) which consumes this library instance will own a separate "mRootNode" variable, with a validation-tracking tree that's correspondingly private to that PEIM. But the validation state is global to the entire firmware. I think we should somehow clarify, or even enforce, that the tracking data structure is firmware-global. Perhaps this can be solved by (a) moving the ownership of the data structure to a particular firmware module, with sole responsibility for pre-validation, and (b) updating the validation APIs to take this data structure explicitly. Alternatively, if multiple modules are expected to inter-operate on the tracker data structure (by which I mean inter-operation *in sequence*, not concurrently in a multiprocessing sense), then we might need a dynamic PCD for inter-driver understanding. (NB: both points (4) and (5) seem to conflict with *existent* code in BaseMemEncryptSevLib. However: - concerning (4), the current allocations are all performed with AllocateAlignedPages(), which are unaffected by temporary-to-permanent RAM migration, - and concerning (5), namely the existent "mPageTablePool" global variable, is a *known bug* -- see commit b721aa749b86a and TianoCore#847. This issue (namely, multiple modules messing with the page tables independently) remains unsolved, and I'd like to avoid introducing *more* paging-related allocations that are module-owned, and not firmware-level.) All in all, here's what I'd prefer (and please correct me if it's unrealistic): - make the SNP validation functions take a simple, *unresizeable* tracker structure for input/output, - allocate that data structure with a single AllocatePages() call in the proper PEIM, - if the SNP validation functions run out of space in the structure, hang hard, - avoid recursion, and use a simple linear search through the array, - evaluate whether graceful handling of overlaps (= range splitting, merging) is actually a requirement. See more on this below. (6) Can you please confirm that the order in which MemEncryptSevSnpValidateSystemRam()'s declaration, call sites, and implementations are introduced, is sensible? I can't tell immediately, but I'd like to be sure that the tree at least builds at every stage (no linkage errors at any stage). >> Furthermore, what you describe above is called idempotency. No matter >> how many times we attempt to validate a range, it may (or may not >> even) cause an actual change in the first action only. Is this >> property (=idempotency) an inherent requirement of the technology, or >> is it a simplification of the implementation? Put differently: if you >> called CpuDeadLoop() in the validation function any time an >> overlapping validation request were received, would that hugely >> complicate the call sites? > > From the technology point of view you can validate the same page again > and again. Hardware does not force any restriction. To protect against > time-based remap attack a guest also need to ensure that it does not > blindly validates the pages otherwise guest is taking risk. Avoiding a > double validation is strong recommendation. In current implementation, > the validation burden is not put on the caller. Caller calls standard > APIs to toggle the encryption attribute. Under the hood, if we see > that page is already validated then no need to issue the PVALIDATE. If > the page is getting changed from C=1 -> 0 then unvalidate it. There remains a huge gap in my understanding here. (The rest of my earlier questions did concern the relationship between the C bit and validation, but I might as well comment right at this spot.) "Invalidation in connection to C-bit toggling" means that there's going to be significant "validation churn" in the DXE phase. For example when fw_cfg is used (which relies on the edk2 IOMMU protocol), or when virtio transfers are performed (which also relies on the edk2 IOMMU protocol, although less directly -- through PciIo, namely). That, however, brings up several problems for me: - Is the tracker data structure scalable? I don't see any balancing in it. (And I wouldn't want a new, complex data structure for it anyway.) - In fact, I don't see any code related to invalidation -- more precisely, I don't see where and how the C-bit changes consult the tracker structure *at all*. - Given issue (5), I don't see how the tracker structure would be inherited from the PEI phase to the DXE phase. In particular, in patch#18, the flipping of the C-bit results in calls to SetPageStateInternal(). But SetPageStateInternal() does not go *through* the tracker structure. Even in patch#14, SevSnpValidateSystemRam() calls SetPageStateInternal() *in addition to* AddRangeToIntervalTree(). In other words, I don't see where validity is tracked in the DXE phase, in response to the C-bit toggling. >> I'm kind of "obsessing" about idempotency because you say we must >> *never* doubly-validate a page, so the *difference* between: >> - explicitly crashing on such an attempt, >> - and silently ignoring such an attempt, >> may be meaningful. > > > I said we should never doubly validate to avoid creating a > vulnerability that can later be exploited by the malicious hypervisor. > Currently we lookup address in our tree to ensure that we don't do a > double validation. The PVALIDATE instruction also can be used to > determine if we are attempting to validate an already validated page. > In current implementation after the PVALIDATE completion, I check the > rflags.CF to ensure that its not a double validation condition. If its > double validation then abort the guest. Its bug in the guest. > >> It's kind of the difference between "oops this is a call site *bug*, >> but we patched it up", vs. "this is expected of call sites, we should >> just handle it internally". > > Our attempt should be to patch to avoid the double validation. I would > still check the status of PVALIDATE, if instruction detects it as a > double validation then we have a bug in our tracking and should abort > the guest. Let me rephrase. My question concerns the *abstraction level* at which we should catch and handle double-validation attempts. If we silently split and merge ranges in the tracker structure, then the guest (as a whole) will not double-validate, which is good. However, double-validation attempts will potentially still *exist*, in higher-level modules in the firmware. My question is whether it is *right* to paper over such overlaps, coming from the high-level code, in the tracker structure. I suspect that it's not right. In other words, the spot where we should abort, due to "double-validation detected", is not when the PVALIDATE instruction reports a problem in the tracker structure. Instead, the spot where we should abort is when we detect an overlapping request *before* issuing PVALIDATE. I'd like the tracker to be as simple and dumb as possible, and I'd like the actual sites that request validation to blow up immediately, if they issue an overlapping request. Is my expectation wrong? >>> The patch #18 extend the MemEncrypt{Set,Clear}PageEncMask() to call >>> the SNP page state change during the C-bit toggle. >> - When exactly do we invalidate? > > Basically before the page is transition from C=1 -> 0 in the page > table. > >> >> - Does the time and place of invalidation depend on whether we >> perform pre-validation, or lazy validation? > It does not matter. Both type of validation update the Validated bit > in the RMP table. The important thing is that invalidation can happen > only for a already validated private page. If the page is shared then > invalidation will result in #GP. >> >> - Is page invalidation idempotent too? > We should treat this as idempotent too. As I said, tacking the page > state is strongly recommended for the security reason. The SNP white > paper has a section dedicated for the validation and may able to > clarify things which is not covered in my response. >> >> - What is the consequence (security impact) if we forget >> invalidation? > I can't think of a security implication of it but if we forget to > invalidate then it can lead issues in the tracking data structure when > it comes to validation next time. e.g a page was mapped as C=1 and we > changed it to C=0 without invalidating it. Now if the same page is > being changed from C=0 -> 1 then tracking data structure may think > that page is already validated and will skip the validation process > and thus access will result in #VC (page-not-validated). As I said above, I don't see where the validation tracking and the C-bit flipping "meet". Especially across the PEI and DXE phases. Basically I'm missing the core concept for maintaining a firmware-global structure, for tracking the validation status of GPA intervals. Up to and *excluding* the DXE phase, the validation tracking should be trivial, naive, and unforgiving. This is because, in the PEI phase, validation is one-shot. I would expect the tracker structure to live in the PlatformPei module. Starting with the DXE phase, where we need (in)validation to closely accompany C-bit toggling, we need a different, performant data structure for tracking validation status. This data structure should be primed from the status last set up in the PEI phase. The tracking should still be unforgiving. I believe *this* tracker structure may have to live in the IOMMU driver. BaseMemEncryptSevLib may offer helper functions, but even if it does, it should never attempt to *own* tracking information, via an internal global variable. Furthermore, those helper functions that only make sense for a particular firmware phase, should have such implementations that hang hard, in library instances that match the *other* firmware phases. For example, SevSnpValidateSystemRam() should have a DXE-specific implementation that hangs hard. It's entirely possible that my points above are complete garbage. I don't have a grasp on the core concept, for the firmware-wide tracking of the validation status. ... Ugh, wait. I've just noticed something. In patch#19, "SEC_SEV_ES_WORK_AREA.SnpSystemRamValidatedRootAddress" is introduced, which seems (?) to be implementing the kind of global (inter-driver, inter-phase) tracking that I've been missing. However, even if my guess (?) is correct about that, this idea definitely doesn't belong in the last patch, presented as a "bugfix". Even if it were a bugfix, it should be incorporated into one of the earlier patches (so that we preferably not introduce the bug in the first place). However, my argument is that it's not a bugfix -- to me that seems to be the core concept. That's what the whole series needs to be built upon. (In all of my thinking, before this particular paragraph, I *never* looked at patch#19. There was no reason to.) >> - There are four page states I can imagine: >> - encrypted for host access, valid for guest access >> - encrypted for host access, invalid for guest access >> - decrypted for host access, valid for guest access >> - decrypted for host access, invalid for guest access >> >> Does a state exist, from these four, that's never used? (Potentially >> caught by the hardware?) > > I would not mix the page state with the host access. Basically there > is two page state exist > > - Guest-Valid > > - Guest-Invalid > > By default all the guest memory is in Guest-Invalid state on boot. The > PVALIDATE instruction is used to transition from Valid->Invalid. Do you mean "Invalid->Valid" transition? > Guest can use the PVALIDATE to change the state from Valid to Invalid. > > A guest private page (i.e page mapped C=1) must be in Guest Valid > state. If hardware see that a private access page is not in the > Guest-Valid state then it will trigger #VC. > > >> >> Do the patches highlight / explain the validity transitions, in >> comments? My understanding is that the C-bit toggle and the guest >> access valid/invalid toggle are separate actions, so "middle" states >> do exist, but perhaps only temporarily. > > I have tried to document the steps in the patch description and I can > add more comments as required. When SEV-SNP is active then we need to > invalidate the page before toggling the C-bit from C=1 -> 0 and > similarly after the page is toggled from C=0 -> 1 we must validate it. > > >> I'm curious how it works, for example, with variousvirtio transfers >> (bus master read, bus master write, bus master common buffer). In the >> IoMmuDxe driver minimally, we access memory both through PTEs with >> the C-bit set and through PTEs with the C-bit clear, meaning that >> "encrypted, valid", and "decrypted, valid" are both required states. >> But that seems to conflict with the notion that "C-bit toggle" be >> directly coupled with a "validity toggle". > > That should not be an issue. Each page have entry in the RMP table. > So, you can have one page as guest invalid and another page as a guest > valid. > >> Put differently, I'm happy that modifying IoMmuDxe appears >> unnecessary, but then that tells me I'm missing something about the >> state transitions between the above *four* states. > > > Only thing which I would propose changing in IoMmuDxe driver is to use > a pre-allocated buffer and avoid toggling the C-bit on every > read/write. In past the C-bit toggle was all contained inside the > guest. But when SNP is enabled, each C-bit toggle causes two > additional steps #1 RMPUPDATE and #2 PVALIDATE. The RMPUPDATE is > requested using the VMEXIT so it will cause a VM context switch. I > would prefer to avoid it. > > My current patch set does not have this optimization yet. Please let's stick with the absolute minimum in this series, yes. I realize my email is complete chaos. That's because my understanding of this work is complete chaos. Can we approach this feature as follows: - Make "validation tracking" the core tenet (central principle) of this feature. - Design the validation *patterns* that are required in each firmware phase. - Identify the *sole* owner module, for validation tracking, in each firmware phase. If we can't readily do this, we might have to introduce new drivers (PEIMs, DXE drivers?) and new interfaces (PPIs, protocols?) - Design the hand-off between the owner modules (on firmware phase boundaries). - Add only helper functions to BaseMemEncryptSevLib, without any ownership of tracking info. The helpers should be as restricted as possible. I'd like to avoid "smearing" page validation responsibility over a bunch of modules. (See again: TianoCore#847.) I'd like responsibilities strictly centralized. If you feel that my points make no sense, I can start going over the individual patches. However, even that way, I'd have found patch#19 only in the end, and that doesn't seem right. And I wouldn't like to go "numb" on the actual code, being distracted by style issues and so on, before we have some agreement on the core approach. Another tip: if you can add content to MdePkg that is governed by AMD specifications, such as patches #3, #5, #9, I would recommend splitting that off to a separate (pre-requisite) series. That content seems unaffected by whatever we do in OvmfPkg, and it would help decrease the size of the OvmfPkg series. It's not like we're going to abandon the SEV-SNP feature, so I don't think there's a risk in merging the MdePkg series first, while the OvmfPkg series is not ready. The MdePkg stuff will not remain unused indefinitely. Thanks, Laszlo