From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=209.132.183.28; helo=mx1.redhat.com; envelope-from=lersek@redhat.com; receiver=edk2-devel@lists.01.org Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id BDF0C211CFFED for ; Thu, 28 Feb 2019 06:07:24 -0800 (PST) Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1A95A308CFB3; Thu, 28 Feb 2019 14:07:24 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-128.rdu2.redhat.com [10.10.120.128]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6F0211001DE9; Thu, 28 Feb 2019 14:07:23 +0000 (UTC) To: Jian J Wang , edk2-devel@lists.01.org References: <20190228101017.2812-1-jian.j.wang@intel.com> From: Laszlo Ersek Message-ID: <96944b84-a1d2-ff6a-5ae4-93b945336763@redhat.com> Date: Thu, 28 Feb 2019 15:07:18 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <20190228101017.2812-1-jian.j.wang@intel.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.44]); Thu, 28 Feb 2019 14:07:24 +0000 (UTC) Subject: Re: [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Feb 2019 14:07:25 -0000 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Hi Jian, On 02/28/19 11:10, Jian J Wang wrote: > REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1576 > Test: > - Pass special test of accessing memory beyond 4G > - Boot to OS with Qemu emulator platform (Fedora27, Ubuntu18.04, > Windows7, Windows10) > > Jian J Wang (2): > UefiCpuPkg/CpuDxe: set TF bit in EFLAGS in C code > UefiCpuPkg/CpuDxe: remove code to set TF bit in EFLAGS > > UefiCpuPkg/CpuDxe/CpuPageTable.c | 11 ++++++++++- > .../Ia32/ExceptionHandlerAsm.nasm | 7 ------- > .../X64/ExceptionHandlerAsm.nasm | 4 ---- > 3 files changed, 10 insertions(+), 12 deletions(-) > thanks for the good description in the BZ and the patches. Also thanks for the good commit message on commit dcc026217fdc ("UefiCpuPkg/CpuDxe: implement non-stop mode for uefi", 2018-08-30), which is a handy reminder about nonstop mode. (1) My understanding is that, for the problem to arise, it is necessary for a platform to set PcdCpuSmmStaticPageTable = FALSE Because of that, I expect the code changes as well to be restricted to such code paths (i.e. I expect the code changes to be invisible with PcdCpuSmmStaticPageTable=TRUE). Therefore I'll skip regression testing this series. (2) Both commit messages say, "the fix is to *move* the logic to C code part of page fault exception handler" (In fact, the commit messages are identical.) Therefore I think the two patches should be squashed into one. It should be one atomic code movement. For example, if someone bisects the git history, and they check out the tree between the two patches, they will have the TF bit logic in both places. That's probably not good. (3) I think this is my most important comment for this series: The subject lines of the patches do not state the *actual* change. The actual change is not that we move the TF bit manipulation from assembly code to C code. Instead, the change is that we make the TF setting *conditional*. In other words, we restrict the setting of TF (--> the single stepping, = the debug exception after re-executing the originally faulting instruction) only if we *need* the debug exception, for restoring the strict page attributes, after the faulting instruction is allowed to pass. In other words, we enable the "restore page attributes" logic (#DB exception handler) only if there is a reason for that (= we are in nonstop mode). My apologies if the above paragraph is too verbose. I simply suggest that the squashed patch have the following subject: UefiCpuPkg: restore strict page attributes via #DB in nonstop mode only (71 characters). (4) I also suggest adding: Fixes: dcc026217fdc363f55c217039fc43d344f69fed6 near the end of the commit message. With (2) through (4) addressed: Acked-by: Laszlo Ersek Thanks Laszlo