From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=209.132.183.28; helo=mx1.redhat.com; envelope-from=lersek@redhat.com; receiver=edk2-devel@lists.01.org Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 820202236BA86 for ; Thu, 4 Jan 2018 11:04:31 -0800 (PST) Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 578635B300; Thu, 4 Jan 2018 19:07:04 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-121-44.rdu2.redhat.com [10.10.121.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2D26760170; Thu, 4 Jan 2018 19:07:02 +0000 (UTC) To: Brijesh Singh , edk2-devel@lists.01.org Cc: Tom Lendacky , Jian J Wang , Jiewen Yao , Jordan Justen , Ard Biesheuvel References: <20180104170602.6617-1-brijesh.singh@amd.com> From: Laszlo Ersek Message-ID: <016c210b-2f23-2624-fc7c-867d247521ee@redhat.com> Date: Thu, 4 Jan 2018 20:07:01 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0 MIME-Version: 1.0 In-Reply-To: <20180104170602.6617-1-brijesh.singh@amd.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Thu, 04 Jan 2018 19:07:04 +0000 (UTC) Subject: Re: [PATCH] OvmfPkg/BaseMemEncryptSevLib: Enable protection for newly added page table X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Jan 2018 19:04:31 -0000 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Hi Brijesh, meta comment: please also CC Ard on OvmfPkg patches; he too co-maintains OvmfPkg. More below: On 01/04/18 18:06, Brijesh Singh wrote: > Commit 2ac1730bf2a5 (MdeModulePkg/DxeIpl: Mark page table as read-only) > sets the memory pages used for page table as read-only after paging is > setup and sets CR0.WP to protect CPU modifying the read-only pages. > The commit causes #PF when MemEncryptSevClearPageEncMask() or > MemEncryptSevSetPageEncMask() tries to change the page-table attributes. > > This patch takes the similar approach as Commit 147fd35c3e38 > (UefiCpuPkg/CpuDxe: Enable protection for newly added page table). > When page table protection is enabled, we disable it temporarily before > changing the page table attributes. > > This patch makes use of the same approach as Commit 2ac1730bf2a5 > (MdeModulePkg/DxeIpl: Mark page table as read-only)) for allocating > page table memory from reserved memory pool, which helps to reduce a "reserved memory pool" is a misleading term, it invokes thoughts about EfiReservedMemoryType, which is also allocatable. > potential "split" operation. > > Cc: Jian J Wang > Cc: Jiewen Yao > Cc: Jordan Justen > Cc: Laszlo Ersek > Signed-off-by: Brijesh Singh The following line is missing, from above your S-o-b: Contributed-under: TianoCore Contribution Agreement 1.1 > --- > OvmfPkg/Library/BaseMemEncryptSevLib/X64/VirtualMemory.h | 28 ++ > OvmfPkg/Library/BaseMemEncryptSevLib/X64/VirtualMemory.c | 378 +++++++++++++++++++- > 2 files changed, 399 insertions(+), 7 deletions(-) I find it awful that we have to duplicate all this code in OvmfPkg. The page protection (+ other security) features have been constantly refined since Jian started to work on them. There's no guarantee that this is the last synchronization we have to do in OvmfPkg due to another feature (or bugfix) under UefiCpuPkg. You can see in the messages of both commits 2ac1730bf2a5 and 147fd35c3e38 that I participated in the regression-testing of those commits. You can also see on the commit messages that I simply ran out of steam while trying to keep up with rapid iterations of those patches -- I regression-tested versions that I thought would be final, but even after that, further updates were made, and I couldn't test the final versions. Even in those regression tests that I managed to do, I didn't test the patches in a SEV guest. The reason is that the test matrix has now grown to an unmanageable size, such that sometimes it doesn't even occur to me that this or that virt environment could be affected by a UefiCpuPkg or Mde*Pkg patch. I realized the risk, which is why I asked for, and got, Reviewer role under UefiCpuPkg -- purely so I could *test* (not necessarily review!) UefiCpuPkg patches first-hand, *before* they are committed. But, I'm at (and beyond) capacity, even in recognizing what affects what. There are only two fixes for this (they are independent, and both should be done): (1) An automated regression test environment. We discussed this earlier with Jian (because his work had both introduced, and elsewhere exposed, a large number of bugs). After that, I also talked to virt-QE at Red Hat. Hopefully virt-QE @RH will find some resources in February 2018 to write OVMF test cases for the avocado-vt project. I'm not sure. Currently, *zero* OVMF test cases exist in any automated test environment that I'm aware of; and I have spent a frightening amount of time on manual regression testing. (2) Code sharing / reuse between top-level Pkgs in edk2, as diligently as possible. I very much dislike that we have page table management scattered between MdeModulePkg/Core/DxeIplPeim, UefiCpuPkg/CpuDxe, and OvmfPkg. If there is a well-defined security feature, namely "protect page tables by making them read-only", then the core feature had better provide the toolkit for *all* relevant modules to reuse. Do we really need two definitions of the macro PAGING_L4_ADDRESS_SHIFT? Do we really need *three* definitions of the macro IA32_PG_RW? - MdeModulePkg/Core/DxeIplPeim/X64/VirtualMemory.h - OvmfPkg/Library/BaseMemEncryptSevLib/X64/VirtualMemory.h - UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h This is just sloppy work, mindless code duplication. I'm not willing to review ~400 lines of page table manipulation in detail that admittedly duplicates UefiCpuPkg code, from commit 147fd35c3e38. Sorry, I don't scale to that level. Here's what we can do: (a) I can ACK this patch, and push it, without looking at more than just the commit message and the diffstat. (b) Or else, you and Jian could collaborate on factoring out the shared bits to new Include/IndustryStandard headers, Include/Library class headers, and Library instances. (UefiCpuPkg can consume Mde*Pkg, and OvmfPkg can consume all of those.) After that, we might not need any such OvmfPkg patches in the first place, or if we did, I could reasonably find the time to look at them. I totally don't request that library interfaces and industry standard macros be added to public headers *upfront*, before *multiple* consumers appear. I don't believe in "design by committee". However, I do subscribe to interface extraction when organic project growth results in multiple uses of the same pattern. (c) Obviously, other OvmfPkg maintainers are welcome to review the patch for you! NB, it is not lost on me that previous edk2 practice has actively discouraged feature normalization, on occasion. The rejection of the IoFifoLib class was a prime example, and I disagree with that decision -- including the resultant duplication of the new IoFifo* functions to all IoLib instances -- to this day. Another utility function we sorely miss is scanning PCI config space for a given capability that has known size constraints. So we have open-coded such scanning at least thrice. This dire situation is not helped by the fact that upstreaming features to core packages, such as MdeModulePkg and UefiCpuPkg, is *incredibly* hard. *Way* harder than it should be. So I don't "blame" you for starting with the easier way (I have followed that path myself in the past, several times, out of desperation), but as a reviewer / co-maintainer, I simply cannot keep up with this level of code duplication, and the consequent (predictable) churn in the future. Please pick (a), (b) or (c). Thanks Laszlo