From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from NAM10-DM6-obe.outbound.protection.outlook.com (NAM10-DM6-obe.outbound.protection.outlook.com [40.107.93.77]) by mx.groups.io with SMTP id smtpd.web11.823.1618008195147021382 for ; Fri, 09 Apr 2021 15:43:15 -0700 Authentication-Results: mx.groups.io; dkim=fail reason="body hash did not verify" header.i=@amd.com header.s=selector1 header.b=ud+/wJyu; spf=permerror, err=parse error for token &{10 18 %{i}._ip.%{h}._ehlo.%{d}._spf.vali.email}: invalid domain name (domain: amd.com, ip: 40.107.93.77, mailfrom: brijesh.singh@amd.com) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=S027qErS9r0GIQ229lsjtDPeHl1zw61PCwGsGpYjIHh927naYrZG9o29egCEAOqA8Jr0wUZfRgMc9JkmC60hwUu/0QqkzMyC8XfC1H36TH7msshOvCT60aLY6HHKvRiP5L92lR0zjeYwZk8cj1Pw7rDC4W5J+MkdLDjjLQd8J2Xc/95jMtQb9Bq39/hsvqk2pRz0WTD3YE6p8u5qdtU2tgJ90r1v1wgy2BCVhmex2IaOUnw9XGktUq2RQaVZ6DimOhJUFv/6s/UMdQCJX8kDWSDx0xvFWgvK12j2XlIT1OC/ADMstVsLFDB5AMKGES1eC3EbPxyKboQfvT5Z5LN8uw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=8bVOHJi661QoZQvDLSNgNQAhXTqJ5iBDRGicOUANl0I=; b=fmURiVFi1Ut1wZv/6HUKkplDQtceFIpyqbPOsxhUG5VwGZzeuwRpsHvmVya33xjqIENIQ7uE3aPDQSkd7x2HPM/8EfC+IMAo5Aij7i8EohGO1F3/Iye4MLPBnET364tSk9MdWgc8J3yvchlyF/yBUCxYrhvjs4MdnY6yzUuze5I1GkCxNNxeuDyOqMaMulk0oyDZ63pUiJdgTM4/667h8hKuK8QupmJ8PbnjJzocV/pxb2Z8KnNDr+QW2JeVPa6p3jpqzZiCikGTeDvMb6dt0RTzsUJoHoRWGd5zcW0Bnq5Qstj5BkTaYjLFm/kETQOLqjp3Ks8J4d+IJl4aMd609g== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=8bVOHJi661QoZQvDLSNgNQAhXTqJ5iBDRGicOUANl0I=; b=ud+/wJyue1C9rwfqlQw/nKqD+suBXYwooU3D9tTLJcyhrqAHs+ZIwQBL4jd9/vq4JTmVZVFox5Guex5UbYcgRlpcLaAT0D01fDGrv2ZMXaSB4sVJjVHJ4A48VMI7QPvHEl32t/SlEMEnzBEUXfA0CFDkrjpIr0UGPctiI41qIwQ= Authentication-Results: kernel.org; dkim=none (message not signed) header.d=none;kernel.org; dmarc=none action=none header.from=amd.com; Received: from SN6PR12MB2718.namprd12.prod.outlook.com (2603:10b6:805:6f::22) by SA0PR12MB4559.namprd12.prod.outlook.com (2603:10b6:806:9e::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4020.18; Fri, 9 Apr 2021 22:43:13 +0000 Received: from SN6PR12MB2718.namprd12.prod.outlook.com ([fe80::9898:5b48:a062:db94]) by SN6PR12MB2718.namprd12.prod.outlook.com ([fe80::9898:5b48:a062:db94%6]) with mapi id 15.20.4020.021; Fri, 9 Apr 2021 22:43:13 +0000 CC: brijesh.singh@amd.com, James Bottomley , Min Xu , Jiewen Yao , Tom Lendacky , Jordan Justen , Ard Biesheuvel Subject: Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support To: Laszlo Ersek , devel@edk2.groups.io References: <20210324153215.17971-1-brijesh.singh@amd.com> <34a50603-32b9-1476-04a5-8476dd810fe4@redhat.com> <213ce382-0b04-16f4-dd77-f9bb2cc32698@amd.com> From: "Brijesh Singh" Message-ID: <2cb456d3-c407-ff00-c274-af30d4385749@amd.com> Date: Fri, 9 Apr 2021 17:43:11 -0500 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.9.0 In-Reply-To: X-Originating-IP: [70.112.153.56] X-ClientProxiedBy: SA9PR11CA0027.namprd11.prod.outlook.com (2603:10b6:806:6e::32) To SN6PR12MB2718.namprd12.prod.outlook.com (2603:10b6:805:6f::22) Return-Path: brijesh.singh@amd.com MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from Brijeshs-MacBook-Pro.local (70.112.153.56) by SA9PR11CA0027.namprd11.prod.outlook.com (2603:10b6:806:6e::32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4020.21 via Frontend Transport; Fri, 9 Apr 2021 22:43:12 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 632564cd-fd12-4908-9034-08d8fba8db90 X-MS-TrafficTypeDiagnostic: SA0PR12MB4559: X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:10000; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: xE5b0+9r2wIGLeCI60wYt+OnTwraILke3dPGFX8tDfX8W2qRApWvO/k/9KjUMfkdf5Zefp+H+YO2C9ZoYhNIfxW0BL3zVfains25F4/3rD51lipGAdcxQHIWe4j8mVCjnQVNSs+0JYGupimotT3ujJM+5o9F0EOqYEXGUMwf3PQz3ToOFlWHt93j4iQW3MpJJUemyHw19N7L17Vjta883os6/BLDK80a4tJqiRVgsOmQ0CUnpCm08/suj/DEyQ7lQm2xmKpUW23Vy3ofUQ7xgbDBj+qThNhyy2SK+AnMiGKPHhpDOCYQqvr7hYRRFeJi7l2+1ZJG4ZvqPpl9NKq0Q1KlPcK1H1Pidqp1gsdh2+DY6YJZ20phK3ZAeDMB+9yMM6flCHCRlQsiHLUHHftS3KPgI3jZ04J1EIq1E2kgWfxvTdKYlyXyxYfHevkRrLnrzHkkLpjv8ROoAVZilrT8X665tSpgd74241D68RlSEX2kLRh6MhPE4MGuhon8crAqVQqOch2xoouezwz+KZGHEXu3BDc7DV10nYMpNYiiq1Fe7JTVJQqDB2mMkIp0jbddKJ2UfpuaxZhPE7GMWlxp8AUCdaEqQS3xaJ6tOm+MvL1x4sIteU7MRz/HqyDmJwShBP0/cCOmQwIIohb0B29/l70yEpR1G/DxHsOpNjavMKd9hGuj1bB5etiCJTNmxil265cbwLGulZ+wOeybS1qdeIGnOI4FDigGs0M2ULydAKWqbneup9+pyjO0GYM0iPbq X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SN6PR12MB2718.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(4636009)(396003)(366004)(136003)(376002)(346002)(39860400002)(31696002)(30864003)(2906002)(478600001)(2616005)(18074004)(44832011)(83380400001)(8936002)(86362001)(31686004)(316002)(956004)(8676002)(6486002)(53546011)(186003)(66476007)(6512007)(38350700001)(26005)(5660300002)(38100700001)(4326008)(16526019)(52116002)(6506007)(54906003)(36756003)(66556008)(66946007)(43740500002)(45980500001);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData: =?us-ascii?Q?JSN/sydW6XRc/4H+aZnEnLD1X1VCGMgA27ngK24lmsYS80ggmAWJWzRQatze?= =?us-ascii?Q?B/xfv92vupKnTOm2woCkJHWC6ye1r52LcIoNHM1fsiF6FkWARfh3Srd5uGAK?= =?us-ascii?Q?Q0ApIULMzpr01qQWqZUDbiy+anjuLAbmoPQC/dEKeOFHPeJCSKAc0plOQUBS?= =?us-ascii?Q?Xea9JZkmB83ksfsBwXEVxmDUaJoAG2hZeaxn/kIDcMbtYKBPKpvoeoec1dBf?= =?us-ascii?Q?wi6i1XRRjS/SCZoWbyuudWwjCFWducSfC1E5UOfOq77ueH8y723kH9dALwY1?= =?us-ascii?Q?A8/fEYpoEJXMr54mnBQb5/GQAr9kkTltf6WKP3Jw6ZQF0r3b5IhVExnUL8rG?= =?us-ascii?Q?NuGmmnHqNFNHSs08z9rX9Jgz7TfWxklFyQ0b9XlKQP5WsZFE4qfVNN1WMg+o?= =?us-ascii?Q?puK4iLyeMLHhjjoLpUTF5ItFc1ZjNX2mNWzqiQYyqi4OZEy0sJJkESUGlZ0L?= =?us-ascii?Q?HD+a2SxKwx6yYc4h1vuLJ569q5mbjwYxpCNyTG0e4p8nw0rE7T5Rgp+uoO1P?= =?us-ascii?Q?FUMVHAAsHDwxhDyhA4hvSVcgVN1/lFHCIsxzX0g7OM9cXBqmxWIzKQBXlHkh?= =?us-ascii?Q?E4Yu18FobKmxzcW7jrQTIfRrr5xPsiITun9LfHEbAnRSzohY15H7MRrqtmpV?= =?us-ascii?Q?OY837GlSaHrCG+7LgbBu/PAr1JrsHzw0OZ/xRtn80SLJs0Q7A+np0l6JWkfc?= =?us-ascii?Q?3fc2UHux8pBPFrvRaXeqAzYVJ5nCZNhRg/oOQS5bTyWE574uwZ7PwZ74cY8X?= =?us-ascii?Q?1I6AkbIbHF0heEsXtdnmyh1XPDqloaQXO8K2Ls9xfgsqqdSzmBVEvgQvLZGs?= =?us-ascii?Q?R7eUVMVC+53W8ZQB/l9nSI3FJL1Cqtmmg2vOszzNJc7o5ookORFvVFv0Uz/t?= =?us-ascii?Q?ZQ6Yx/BuEI/5Wfst/pXLmYnKiQoXdXSH1Sn/Ac8Vw9NEJ0dL+nFnzVYUAsRJ?= =?us-ascii?Q?V3wnGa7frA/PmHSQZ8Ixy5ceS7CXM4jWcGA9i6gGzHDPzSeNmUbNr89NWg6y?= =?us-ascii?Q?STXPQrP6uZ81TH9rDtyFtCHXs8zRnznQ8Alnw9b5EoTDSkR6ycy0/7fivhVn?= =?us-ascii?Q?qUqsfR+fYyt7nqKyXcy6M0KE1uI1uC7xPJJHjOL26rSHGhAsHPkAoFiKknjy?= =?us-ascii?Q?+Z+akZT304fCDFGafzCJrsrv3CvpJzDHJ3RLyM4MZFwk9nz0NkpA7rfFv3pO?= =?us-ascii?Q?kI8WoGPmAe0BZeWX60VWCkHhfDSONykgAhCCwJ6Q5QjRoD0oO0HPZi8NvD2U?= =?us-ascii?Q?CWHiXudcDaK7OogvasF4ityVYz0fSPHdxvusuPn3MZMUvYRWensPoqYNKgXr?= =?us-ascii?Q?fJddQBB5JqR93zUB0ZXIxW06?= X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-Network-Message-Id: 632564cd-fd12-4908-9034-08d8fba8db90 X-MS-Exchange-CrossTenant-AuthSource: SN6PR12MB2718.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Apr 2021 22:43:13.2495 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: BSIQsWUtrFuqiN+Zr/Iys7bXs64iAyB3g/aTmDeAV6JNrszkbfXYCRN5LoMj5ykqzEFEezEYS6qbsvvVGuAhlg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA0PR12MB4559 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Content-Language: en-US On 4/9/21 7:24 AM, Laszlo Ersek wrote: > On 04/08/21 13:59, Brijesh Singh wrote: >> On 4/8/21 4:58 AM, Laszlo Ersek wrote: >>> On 03/24/21 16:31, Brijesh Singh wrote: >>>> At this time we only support the pre-validation. OVMF detects all the = available >>>> system RAM in the PEI phase. When SEV-SNP is enabled, the memory is va= lidated >>>> before it is made available to the EDK2 core. >>> Can you describe this in a bit more detail, before I look at the >>> individual patches? Specifically, what existing logic in the PEI phase >>> was taken, and extended, and how? >> One of the key requirement is that the guest private pages much be >> validated before the access. If guest tries to access the pages before >> the validation then it will result in #VC (page-not-validated) >> exception. To avoid the #VC, we propose the validating the memory before >> the access. We will incrementally add the support to lazy validate (i.e >> validate on access). > What is the primary threat (in simple terms please) that validation is > supposed to prevent? To protect against the memory re-mapping attack the guest pages must be validated. The idea is that every guest page can map only to a single physical memory page at one time. > And, against that particular threat, does pre-validation offer some > protection, or will that only come with lazy validation? For the hardware it does not matter how the memory was validated -- lazy vs prevalidate. Both approaches use the PVALIDATE instruction to validate the page. In the case of pre-validation, the memory is validated before the access. Whereas in the lazy validation, the access will cause a fault and fault handler should validate the page and retry the access. Its similar to a page fault handler, OS can populate the page table or back the pages on demand. The only downside of pre-validation is, we will take a hit on the boot time. The GHCB spec provides method by which we can batch multiple requests at once to minimize the context switches. > >> Let me try explaining a bit, the page validation process consist of two >> steps: >> >> 1. Add the pages in the RMP table -- must be done by the hypervisor >> using the RMPUPDATE instruction. The guest can use VMGEXIT NAEs to ask >> hypervisor to add or remove pages from the RMP table. >> >> 2. Guest issue the PVALIDATE instruction -- this sets the validate bit >> in the RMP table. >> >> Similar to SEV, the OVMF_CODE.fd is encrypted through the SNP firmware >> before the launch. The SNP firmware also validates the memory page after >> encrypting. This allows us to boot the initial entry code without guest >> going through the validation process. >> >> The OVMF reset vector uses few data pages (e.g page table, early Sec >> stack). Access to these data pages will result in #VC. There are two >> approaches we can take to validate these data pages: >> >> 1. Ask SNP firmware to pre-validate it -- SNP firmware provides an >> special command that can be used to pre-validate the pages without >> affecting the measurement. > This means the two pflash chips, right? This does not need to be two pflash chips. The SNP firmware command does not know anything about the ROM or pflash chip. The command accepts the system physical address that need to be validated by the firmware. In patch #2, OVMF provides a range of data pages that need to be validated by the SNP firmware before booting the guest. > >> 2. Enhance the reset vector code to validate the pages. >> >> For now I choose #1. >> >> The pre-validation performed by the SNP firmware is sufficient to boot >> through the SEC phase. The SEC phase later decompress the Fv to a new >> memory location. Now we need the OVMF to take over the validation >> procedure.=C2=A0 The series extends the MemEncryptSevLib to add a new he= lper >> MemEncryptSevSnpValidateRam(). The helper is used to validate the system >> RAM. See patch #12. SEC phase calls the MemEncryptSevSnpValidateRam() to >> validate the output buffer used for the decompression. This was >> sufficient to boot into the PEI phase, see patch #13. > Two questions here: > > - Is ACPI S3 in scope for now? Its not in the scope yet. I have not looked at it. > > - We need more areas made accessible in SEC than just the decompression > buffer; for example the temporary SEC/PEI heap and stack, and (IIRC) > various special SEV-ES areas laid out via MEMFD. Where are all of those > handled / enumerated? Sorry, I simplified my response by saying just decompression. You are right that its more than the decompression buffer. In my current patch, the SEC phase validates all the memory up to PcdOvmfDecompressionScratchEnd (which includes more than just output buffer). See patch #13. > >> The PEI detects >> all the available system RAM. After the memory detection is completed >> the PlatformPei calls the AmdSevSnpInitialize(). The initialization >> routine iterate through the HOB and calls the >> MemEncryptSevSnpValidateRam() to validate all the system RAM. Is it >> possible the more system ram can be detected after the PlatformPei is >> completed ? > That would cause problems even without SEV-SNP (i.e., with plain SEV), > so I'm not worried about it. > >> One of the important thing is we should *never* validate the pages >> twice. > What are the symptoms / consequences of: > > - the guest accessing an unvalidated page (I understand it causes a #VC, > but what is the direct result of that, when this series is applied?), After the series it applied, an access to an unvalidated page will result in #VC. The series does not extend the VC handler to handle the page-not-validated error code. The current VC handler [InternalVmgExitHandleVc()] will inject a #GP to terminate the guest if #VC is raised for unvalidated page. > > - doubly-validating a page? > > The first question is relevant because we should crash as soon as we > touch a page we forgot to validate (we shouldn't let any corruption or > similar spread out until we finally realize there's a problem). > > The second question is relevant for security I guess. What attacks > become possible, and/or what symptoms are produced, if we > doubly-validate a page? Assuming that guest validates its memory, this guarantees the injective mapping between GPAs and SPAs. As I noted that the page validation process consist of two steps #1 RMPUPDATE and 2#PVALIDATE. The RMPUPDATE is a hypervisor instruction and PVALIDATE is a guest instruction. A malicious hypervisor can remap gpa to different physical address in RMP table using the RMPUPDATE just before the second PVALIDATE instruction is executed. The guest need to ensure that it does not leave the security vulnerability that can be exploited by a malicious hypervisor. The SNP white paper recommends that a guest OS should keep track of what is has validate so that it can avoid getting into these situation. > > Furthermore, IIRC, we have separate #VC handlers for the different > firmware phases; do they behave consistently / identicall when a > #VC(page-not-validated) occurs, when this patch set is applied? > > My first question is basically asking whether we can *exclusively* rely > on #VC(page-not-validated) faults to tell us that we missed validating a > particular page. If we can do that, then the job is a bit easier, > because from the GPA, we can more or less also derive *when and where* > we should pre-validate the page (at least until validation is done > completely lazily). Yes, we should be able to rely #VC to tell us that we missed validating the page. In my this series so far I have not seen a #VC due to page-not-validated because we have carefully validated the pages before they are accessed. I am thinking that #VC(page-not-validated) should be treated as an error in the first phase. Incrementally we will add the lazy validation, in which the #VC(page-not-validated) will provide a hint as to what has not been validated. > >> The MemEncryptSevSnpValidateRam() uses a interval search tree to >> keep the record of what has been validated. Before validating the range, >> it lookup in its tree and if it finds that range is already validated >> then do nothing. If it detects an overlap then it will validate only non >> overlapping regions -- see patch #14. > What data structure is used for implementing the interval tree? > > I'm not necessarily looking for a data structure with "nice" asymptotic > time complexity. With pre-validation especially, I think simplicity > (ease of review) is more important for the data structure than > performance. If it's not an actual "tree", we shouldn't call it a > "tree". (An "interval tree" is usually an extension of a Red-Black Tree, > and that's not the simplest data structure in existence; although edk2 > does offer an rbtree library.) I am using binary search tree. Currently, we don't have many nodes to track. Most of the guest memory is private and are validated before we exit from the PEI phase. > Furthermore, what you describe above is called idempotency. No matter > how many times we attempt to validate a range, it may (or may not even) > cause an actual change in the first action only. Is this property > (=3Didempotency) an inherent requirement of the technology, or is it a > simplification of the implementation? Put differently: if you called > CpuDeadLoop() in the validation function any time an overlapping > validation request were received, would that hugely complicate the call > sites? >>From the technology point of view you can validate the same page again and again. Hardware does not force any restriction. To protect against time-based remap attack a guest also need to ensure that it does not blindly validates the pages otherwise guest is taking risk. Avoiding a double validation is strong recommendation. In current implementation, the validation burden is not put on the caller. Caller calls standard APIs to toggle the encryption attribute. Under the hood, if we see that page is already validated then no need to issue the PVALIDATE. If the page is getting changed from C=3D1 -> 0 then unvalidate it. > I'm kind of "obsessing" about idempotency because you say we must > *never* doubly-validate a page, so the *difference* between: > - explicitly crashing on such an attempt, > - and silently ignoring such an attempt, > may be meaningful. I said we should never doubly validate to avoid creating a vulnerability that can later be exploited by the malicious hypervisor. Currently we lookup address in our tree to ensure that we don't do a double validation. The PVALIDATE instruction also can be used to determine if we are attempting to validate an already validated page. In current implementation after the PVALIDATE completion, I check the rflags.CF to ensure that its not a double validation condition. If its double validation then abort the guest. Its bug in the guest. > It's kind of the difference between "oops this is a call site *bug*, but > we patched it up", vs. "this is expected of call sites, we should just > handle it internally". Our attempt should be to patch to avoid the double validation. I would still check the status of PVALIDATE, if instruction detects it as a double validation then we have a bug in our tracking and should abort the guest. > >> The patch #18 extend the MemEncrypt{Set,Clear}PageEncMask() to call the >> SNP page state change during the C-bit toggle. > - When exactly do we invalidate? Basically before the page is transition from C=3D1 -> 0 in the page table. > > - Does the time and place of invalidation depend on whether we perform > pre-validation, or lazy validation? It does not matter. Both type of validation update the Validated bit in the RMP table. The important thing is that invalidation can happen only for a already validated private page. If the page is shared then invalidation will result in #GP. > > - Is page invalidation idempotent too? We should treat this as idempotent too. As I said, tacking the page state is strongly recommended for the security reason. The SNP white paper has a section dedicated for the validation and may able to clarify things which is not covered in my response. > > - What is the consequence (security impact) if we forget invalidation? I can't think of a security implication of it but if we forget to invalidate then it can lead issues in the tracking data structure when it comes to validation next time. e.g a page was mapped as C=3D1 and we changed it to C=3D0 without invalidating it. Now if the same page is being changed from C=3D0 -> 1 then tracking data structure may think that page is already validated and will skip the validation process and thus access will result in #VC (page-not-validated). > > - There are four page states I can imagine: > - encrypted for host access, valid for guest access > - encrypted for host access, invalid for guest access > - decrypted for host access, valid for guest access > - decrypted for host access, invalid for guest access > > Does a state exist, from these four, that's never used? (Potentially > caught by the hardware?) I would not mix the page state with the host access. Basically there is two page state exist - Guest-Valid - Guest-Invalid By default all the guest memory is in Guest-Invalid state on boot. The PVALIDATE instruction is used to transition from Valid->Invalid. Guest can use the PVALIDATE to change the state from Valid to Invalid. A guest private page (i.e page mapped C=3D1) must be in Guest Valid state. If hardware see that a private access page is not in the Guest-Valid state then it will trigger #VC. > > Do the patches highlight / explain the validity transitions, in > comments? My understanding is that the C-bit toggle and the guest access > valid/invalid toggle are separate actions, so "middle" states do exist, > but perhaps only temporarily. I have tried to document the steps in the patch description and I can add more comments as required. When SEV-SNP is active then we need to invalidate the page before toggling the C-bit from C=3D1 -> 0 and similarly after the page is toggled from C=3D0 -> 1 we must validate it. > I'm curious how it works, for example, with variousvirtio transfers (bus > master read, bus master write, bus master common buffer). In the > IoMmuDxe driver minimally, we access memory both through PTEs with the > C-bit set and through PTEs with the C-bit clear, meaning that > "encrypted, valid", and "decrypted, valid" are both required states. But > that seems to conflict with the notion that "C-bit toggle" be directly > coupled with a "validity toggle". That should not be an issue. Each page have entry in the RMP table. So, you can have one page as guest invalid and another page as a guest valid. > Put differently, I'm happy that modifying IoMmuDxe appears unnecessary, > but then that tells me I'm missing something about the state transitions > between the above *four* states. Only thing which I would propose changing in IoMmuDxe driver is to use a pre-allocated buffer and avoid toggling the C-bit on every read/write. In past the C-bit toggle was all contained inside the guest. But when SNP is enabled, each C-bit toggle causes two additional steps #1 RMPUPDATE and #2 PVALIDATE. The RMPUPDATE is requested using the VMEXIT so it will cause a VM context switch. I would prefer to avoid it. My current patch set does not have this optimization yet. > >> Please let me know if you have any questions. We can hash out the design >> here before you taking a closure look at the code. > Sorry that I've been (and am being) slow to start reviewing this series. No worries, take your time. Thank you for all your feedback. -Brijesh > > Thanks, > Laszlo >