public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G
@ 2019-02-28 10:10 Jian J Wang
  2019-02-28 10:10 ` [PATCH 1/2] UefiCpuPkg/CpuDxe: set TF bit in EFLAGS in C code Jian J Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jian J Wang @ 2019-02-28 10:10 UTC (permalink / raw)
  To: edk2-devel

REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1576
Test:
- Pass special test of accessing memory beyond 4G
- Boot to OS with Qemu emulator platform (Fedora27, Ubuntu18.04,
  Windows7, Windows10)

Jian J Wang (2):
  UefiCpuPkg/CpuDxe: set TF bit in EFLAGS in C code
  UefiCpuPkg/CpuDxe: remove code to set TF bit in EFLAGS

 UefiCpuPkg/CpuDxe/CpuPageTable.c                      | 11 ++++++++++-
 .../Ia32/ExceptionHandlerAsm.nasm                     |  7 -------
 .../X64/ExceptionHandlerAsm.nasm                      |  4 ----
 3 files changed, 10 insertions(+), 12 deletions(-)

-- 
2.17.1.windows.2



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] UefiCpuPkg/CpuDxe: set TF bit in EFLAGS in C code
  2019-02-28 10:10 [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G Jian J Wang
@ 2019-02-28 10:10 ` Jian J Wang
  2019-02-28 10:10 ` [PATCH 2/2] UefiCpuPkg/CpuDxe: remove code to set TF bit in EFLAGS Jian J Wang
  2019-02-28 14:07 ` [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G Laszlo Ersek
  2 siblings, 0 replies; 7+ messages in thread
From: Jian J Wang @ 2019-02-28 10:10 UTC (permalink / raw)
  To: edk2-devel; +Cc: Eric Dong, Laszlo Ersek, Ruiyu Ni, Star Zeng

REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1576

The root cause of this issue is that non-stop mode of Heap Guard and
NULL Detection set TF bit (single-step) in EFLAG unconditionally in
the common handler in CpuExceptionLib.

If PcdCpuSmmStaticPageTable is FALSE, the SMM will only create page
table for memory below 4G. If SMM tries to access memory beyond 4G,
a page fault exception will be triggered and the memory to access
will be added to page table so that SMM code can continue the access.

Because of above issue, the TF bit is set after the page fault is
handled and then fall into another DEBUG exception. Since non-stop
mode of Heap Guard and NULL Detection are not enabled, no special
DEBUG exception handler is registered. The default handler just
prints exception context and go into dead loop.

Actually EFLAGS can be changed in any standard exception handler.
There's no need to do single-step setup in assembly code. So the fix
is to move the logic to C code part of page fault exception handler
so that we can fully validate the configuration and prevent TF bit
from being set unexpectedly.

Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Star Zeng <star.zeng@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
---
 UefiCpuPkg/CpuDxe/CpuPageTable.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/UefiCpuPkg/CpuDxe/CpuPageTable.c b/UefiCpuPkg/CpuDxe/CpuPageTable.c
index 4bee8c7772..812537417d 100644
--- a/UefiCpuPkg/CpuDxe/CpuPageTable.c
+++ b/UefiCpuPkg/CpuDxe/CpuPageTable.c
@@ -1300,7 +1300,16 @@ PageFaultExceptionHandler (
   // Display ExceptionType, CPU information and Image information
   //
   DumpCpuContext (ExceptionType, SystemContext);
-  if (!NonStopMode) {
+  if (NonStopMode) {
+    //
+    // Set TF in EFLAGS
+    //
+    if (mPagingContext.MachineType == IMAGE_FILE_MACHINE_I386) {
+      SystemContext.SystemContextIa32->Eflags |= (UINT32)BIT8;
+    } else {
+      SystemContext.SystemContextX64->Rflags |= (UINT64)BIT8;
+    }
+  } else {
     CpuDeadLoop ();
   }
 }
-- 
2.17.1.windows.2



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] UefiCpuPkg/CpuDxe: remove code to set TF bit in EFLAGS
  2019-02-28 10:10 [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G Jian J Wang
  2019-02-28 10:10 ` [PATCH 1/2] UefiCpuPkg/CpuDxe: set TF bit in EFLAGS in C code Jian J Wang
@ 2019-02-28 10:10 ` Jian J Wang
  2019-02-28 14:07 ` [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G Laszlo Ersek
  2 siblings, 0 replies; 7+ messages in thread
From: Jian J Wang @ 2019-02-28 10:10 UTC (permalink / raw)
  To: edk2-devel; +Cc: Eric Dong, Laszlo Ersek, Ruiyu Ni, Star Zeng

REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1576

The root cause of this issue is that non-stop mode of Heap Guard and
NULL Detection set TF bit (single-step) in EFLAG unconditionally in
the common handler in CpuExceptionLib.

If PcdCpuSmmStaticPageTable is FALSE, the SMM will only create page
table for memory below 4G. If SMM tries to access memory beyond 4G,
a page fault exception will be triggered and the memory to access
will be added to page table so that SMM code can continue the access.

Because of above issue, the TF bit is set after the page fault is
handled and then fall into another DEBUG exception. Since non-stop
mode of Heap Guard and NULL Detection are not enabled, no special
DEBUG exception handler is registered. The default handler just
prints exception context and go into dead loop.

Actually EFLAGS can be changed in any standard exception handler.
There's no need to do single-step setup in assembly code. So the fix
is to move the logic to C code part of page fault exception handler
so that we can fully validate the configuration and prevent TF bit
from being set unexpectedly.

Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Star Zeng <star.zeng@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
---
 .../CpuExceptionHandlerLib/Ia32/ExceptionHandlerAsm.nasm   | 7 -------
 .../CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nasm    | 4 ----
 2 files changed, 11 deletions(-)

diff --git a/UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ExceptionHandlerAsm.nasm b/UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ExceptionHandlerAsm.nasm
index 6fcf5fb23f..45d6474091 100644
--- a/UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ExceptionHandlerAsm.nasm
+++ b/UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ExceptionHandlerAsm.nasm
@@ -383,13 +383,6 @@ ErrorCodeAndVectorOnStack:
     pop     dword [ebp - 4]
     mov     esp, ebp
     pop     ebp
-
-; Enable TF bit after page fault handler runs
-    cmp     dword [esp], 14       ; #PF?
-    jne     .5
-    bts     dword [esp + 16], 8   ; EFLAGS
-
-.5:
     add     esp, 8
     cmp     dword [esp - 16], 0   ; check EXCEPTION_HANDLER_CONTEXT.OldIdtHandler
     jz      DoReturn
diff --git a/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nasm b/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nasm
index f842af2336..7b97810d10 100644
--- a/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nasm
+++ b/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nasm
@@ -336,10 +336,6 @@ HasErrorCode:
     pop     r15
 
     mov     rsp, rbp
-    cmp     qword [rbp + 8], 14 ; #PF?
-    jne     .1
-    bts     qword [rsp + 40], 8 ; RFLAGS.TF
-.1:
     pop     rbp
     add     rsp, 16
     cmp     qword [rsp - 32], 0  ; check EXCEPTION_HANDLER_CONTEXT.OldIdtHandler
-- 
2.17.1.windows.2



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G
  2019-02-28 10:10 [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G Jian J Wang
  2019-02-28 10:10 ` [PATCH 1/2] UefiCpuPkg/CpuDxe: set TF bit in EFLAGS in C code Jian J Wang
  2019-02-28 10:10 ` [PATCH 2/2] UefiCpuPkg/CpuDxe: remove code to set TF bit in EFLAGS Jian J Wang
@ 2019-02-28 14:07 ` Laszlo Ersek
  2019-02-28 14:10   ` Laszlo Ersek
  2019-03-01  0:52   ` Wang, Jian J
  2 siblings, 2 replies; 7+ messages in thread
From: Laszlo Ersek @ 2019-02-28 14:07 UTC (permalink / raw)
  To: Jian J Wang, edk2-devel

Hi Jian,

On 02/28/19 11:10, Jian J Wang wrote:
> REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1576
> Test:
> - Pass special test of accessing memory beyond 4G
> - Boot to OS with Qemu emulator platform (Fedora27, Ubuntu18.04,
>   Windows7, Windows10)
> 
> Jian J Wang (2):
>   UefiCpuPkg/CpuDxe: set TF bit in EFLAGS in C code
>   UefiCpuPkg/CpuDxe: remove code to set TF bit in EFLAGS
> 
>  UefiCpuPkg/CpuDxe/CpuPageTable.c                      | 11 ++++++++++-
>  .../Ia32/ExceptionHandlerAsm.nasm                     |  7 -------
>  .../X64/ExceptionHandlerAsm.nasm                      |  4 ----
>  3 files changed, 10 insertions(+), 12 deletions(-)
> 

thanks for the good description in the BZ and the patches.

Also thanks for the good commit message on commit dcc026217fdc
("UefiCpuPkg/CpuDxe: implement non-stop mode for uefi", 2018-08-30),
which is a handy reminder about nonstop mode.


(1) My understanding is that, for the problem to arise, it is necessary
for a platform to set

  PcdCpuSmmStaticPageTable = FALSE

Because of that, I expect the code changes as well to be restricted to
such code paths (i.e. I expect the code changes to be invisible with
PcdCpuSmmStaticPageTable=TRUE). Therefore I'll skip regression testing
this series.


(2) Both commit messages say,

"the fix is to *move* the logic to C code part of page fault exception
handler"

(In fact, the commit messages are identical.)

Therefore I think the two patches should be squashed into one. It should
be one atomic code movement. For example, if someone bisects the git
history, and they check out the tree between the two patches, they will
have the TF bit logic in both places. That's probably not good.


(3) I think this is my most important comment for this series:

The subject lines of the patches do not state the *actual* change. The
actual change is not that we move the TF bit manipulation from assembly
code to C code. Instead, the change is that we make the TF setting
*conditional*. In other words, we restrict the setting of TF (--> the
single stepping, = the debug exception after re-executing the originally
faulting instruction) only if we *need* the debug exception, for
restoring the strict page attributes, after the faulting instruction is
allowed to pass.

In other words, we enable the "restore page attributes" logic (#DB
exception handler) only if there is a reason for that (= we are in
nonstop mode).

My apologies if the above paragraph is too verbose. I simply suggest
that the squashed patch have the following subject:

UefiCpuPkg: restore strict page attributes via #DB in nonstop mode only

(71 characters).


(4) I also suggest adding:

Fixes: dcc026217fdc363f55c217039fc43d344f69fed6

near the end of the commit message.


With (2) through (4) addressed:

Acked-by: Laszlo Ersek <lersek@redhat.com>

Thanks
Laszlo


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G
  2019-02-28 14:07 ` [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G Laszlo Ersek
@ 2019-02-28 14:10   ` Laszlo Ersek
  2019-03-01  0:57     ` Wang, Jian J
  2019-03-01  0:52   ` Wang, Jian J
  1 sibling, 1 reply; 7+ messages in thread
From: Laszlo Ersek @ 2019-02-28 14:10 UTC (permalink / raw)
  To: Jian J Wang, edk2-devel

On 02/28/19 15:07, Laszlo Ersek wrote:

> (4) I also suggest adding:
> 
> Fixes: dcc026217fdc363f55c217039fc43d344f69fed6

Additionally append:

Fixes: 16b918bbaf51211a32ae04d9d8a5ba6ccca25a6a

Thanks,
Laszlo

> 
> near the end of the commit message.
> 
> 
> With (2) through (4) addressed:
> 
> Acked-by: Laszlo Ersek <lersek@redhat.com>
> 
> Thanks
> Laszlo
> 



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G
  2019-02-28 14:07 ` [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G Laszlo Ersek
  2019-02-28 14:10   ` Laszlo Ersek
@ 2019-03-01  0:52   ` Wang, Jian J
  1 sibling, 0 replies; 7+ messages in thread
From: Wang, Jian J @ 2019-03-01  0:52 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel@lists.01.org

Laszlo,


> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Thursday, February 28, 2019 10:07 PM
> To: Wang, Jian J <jian.j.wang@intel.com>; edk2-devel@lists.01.org
> Subject: Re: [edk2] [PATCH 0/2] Fix SMM driver hang at accessing memory
> beyond 4G
> 
> Hi Jian,
> 
> On 02/28/19 11:10, Jian J Wang wrote:
> > REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1576
> > Test:
> > - Pass special test of accessing memory beyond 4G
> > - Boot to OS with Qemu emulator platform (Fedora27, Ubuntu18.04,
> >   Windows7, Windows10)
> >
> > Jian J Wang (2):
> >   UefiCpuPkg/CpuDxe: set TF bit in EFLAGS in C code
> >   UefiCpuPkg/CpuDxe: remove code to set TF bit in EFLAGS
> >
> >  UefiCpuPkg/CpuDxe/CpuPageTable.c                      | 11 ++++++++++-
> >  .../Ia32/ExceptionHandlerAsm.nasm                     |  7 -------
> >  .../X64/ExceptionHandlerAsm.nasm                      |  4 ----
> >  3 files changed, 10 insertions(+), 12 deletions(-)
> >
> 
> thanks for the good description in the BZ and the patches.
> 
> Also thanks for the good commit message on commit dcc026217fdc
> ("UefiCpuPkg/CpuDxe: implement non-stop mode for uefi", 2018-08-30),
> which is a handy reminder about nonstop mode.
> 
> 
> (1) My understanding is that, for the problem to arise, it is necessary
> for a platform to set
> 
>   PcdCpuSmmStaticPageTable = FALSE
> 
> Because of that, I expect the code changes as well to be restricted to
> such code paths (i.e. I expect the code changes to be invisible with
> PcdCpuSmmStaticPageTable=TRUE). Therefore I'll skip regression testing
> this series.
> 

Correct.

> 
> (2) Both commit messages say,
> 
> "the fix is to *move* the logic to C code part of page fault exception
> handler"
> 
> (In fact, the commit messages are identical.)
> 
> Therefore I think the two patches should be squashed into one. It should
> be one atomic code movement. For example, if someone bisects the git
> history, and they check out the tree between the two patches, they will
> have the TF bit logic in both places. That's probably not good.
> 
> 

Agree. I'll merge them.

> (3) I think this is my most important comment for this series:
> 
> The subject lines of the patches do not state the *actual* change. The
> actual change is not that we move the TF bit manipulation from assembly
> code to C code. Instead, the change is that we make the TF setting
> *conditional*. In other words, we restrict the setting of TF (--> the
> single stepping, = the debug exception after re-executing the originally
> faulting instruction) only if we *need* the debug exception, for
> restoring the strict page attributes, after the faulting instruction is
> allowed to pass.
> 
> In other words, we enable the "restore page attributes" logic (#DB
> exception handler) only if there is a reason for that (= we are in
> nonstop mode).
> 
> My apologies if the above paragraph is too verbose. I simply suggest
> that the squashed patch have the following subject:
> 
> UefiCpuPkg: restore strict page attributes via #DB in nonstop mode only
> 
> (71 characters).
> 
> 

Very good explanation and abstraction. Agree.

> (4) I also suggest adding:
> 
> Fixes: dcc026217fdc363f55c217039fc43d344f69fed6
> 
> near the end of the commit message.
> 
> 
> With (2) through (4) addressed:
> 
> Acked-by: Laszlo Ersek <lersek@redhat.com>
> 

It'll be added. Thanks for the great feedback.

Jian

> Thanks
> Laszlo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G
  2019-02-28 14:10   ` Laszlo Ersek
@ 2019-03-01  0:57     ` Wang, Jian J
  0 siblings, 0 replies; 7+ messages in thread
From: Wang, Jian J @ 2019-03-01  0:57 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel@lists.01.org

Laszlo,

Sure. It'll be added

Regards,
Jian


> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Thursday, February 28, 2019 10:10 PM
> To: Wang, Jian J <jian.j.wang@intel.com>; edk2-devel@lists.01.org
> Subject: Re: [edk2] [PATCH 0/2] Fix SMM driver hang at accessing memory
> beyond 4G
> 
> On 02/28/19 15:07, Laszlo Ersek wrote:
> 
> > (4) I also suggest adding:
> >
> > Fixes: dcc026217fdc363f55c217039fc43d344f69fed6
> 
> Additionally append:
> 
> Fixes: 16b918bbaf51211a32ae04d9d8a5ba6ccca25a6a
> 
> Thanks,
> Laszlo
> 
> >
> > near the end of the commit message.
> >
> >
> > With (2) through (4) addressed:
> >
> > Acked-by: Laszlo Ersek <lersek@redhat.com>
> >
> > Thanks
> > Laszlo
> >


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-03-01  0:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-02-28 10:10 [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G Jian J Wang
2019-02-28 10:10 ` [PATCH 1/2] UefiCpuPkg/CpuDxe: set TF bit in EFLAGS in C code Jian J Wang
2019-02-28 10:10 ` [PATCH 2/2] UefiCpuPkg/CpuDxe: remove code to set TF bit in EFLAGS Jian J Wang
2019-02-28 14:07 ` [PATCH 0/2] Fix SMM driver hang at accessing memory beyond 4G Laszlo Ersek
2019-02-28 14:10   ` Laszlo Ersek
2019-03-01  0:57     ` Wang, Jian J
2019-03-01  0:52   ` Wang, Jian J

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox