public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* ASSERT in QemuVideoDxe driver during reset
@ 2017-09-06  8:15 Wang, Jian J
  2017-09-06 10:20 ` Laszlo Ersek
  0 siblings, 1 reply; 13+ messages in thread
From: Wang, Jian J @ 2017-09-06  8:15 UTC (permalink / raw)
  To: lersek@redhat.com, Justen, Jordan L, Yao, Jiewen
  Cc: edk2-devel@lists.01.org, Kinney, Michael D

Hi guys,

I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe driver during reset. The assert statement is like below.

    ASSERT (Int0x10->Segment == 0x0000);
    ASSERT (Int0x10->Offset  == 0x0000);

This happened after I have enabled NULL pointer access detection feature, in which page 0 (4K)  is disabled. And because of page 0 disabled, I have to skip the memory clearing for page 0 in DXE core. Otherwise it will cause page fault exception there. It seems that QEMU may clear all its memory at startup. Skipping the action of clearing page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and QEMU will not clear all its memory during warm boot. ASSERT will be triggered after reset.

It's easy to fix this issue but there're some subtle situations which I'm not quite certain. I'd like your opinions for them.

Here're my thoughts on several solutions:
a) Remove the ASSERT statement in InstallVbeShim(). But I'm sure if it is safe to do so because I don't quite understand the purpose of the ASSERT. 
b) Instead of skipping clearing page 0, enable it, do clearing and then disable it. The problem here is that CPU arch protocol is not ready at that time. I have to "manually" do page operation, which might be non-portable and a little bit odd in DXE core.
c) Move code clearing page 0 from DXE core to another place wherever appropriate, like DxeIpl or cpu driver. But I think there's a good reason to put code there before.

Thanks,
Wang, Jian J


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-06  8:15 ASSERT in QemuVideoDxe driver during reset Wang, Jian J
@ 2017-09-06 10:20 ` Laszlo Ersek
  2017-09-06 11:16   ` Yao, Jiewen
  0 siblings, 1 reply; 13+ messages in thread
From: Laszlo Ersek @ 2017-09-06 10:20 UTC (permalink / raw)
  To: Wang, Jian J, Justen, Jordan L, Yao, Jiewen
  Cc: edk2-devel@lists.01.org, Kinney, Michael D

On 09/06/17 10:15, Wang, Jian J wrote:
> Hi guys,
>
> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
> driver during reset. The assert statement is like below.
>
>     ASSERT (Int0x10->Segment == 0x0000);
>     ASSERT (Int0x10->Offset  == 0x0000);
>
> This happened after I have enabled NULL pointer access detection
> feature, in which page 0 (4K)  is disabled.

The NULL pointer access detection conflicts with the VBE shim. For
installing the VBE shim, OVMF has to write to page 0, on purpose.

Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).

> And because of page 0 disabled, I have to skip the memory clearing for
> page 0 in DXE core.

By doing that, you invalidate the following comment in said OVMF commit:

+    //
+    // We managed to allocate the page at zero. SVN r14218 guarantees that it
+    // is NUL-filled.
+    //
+    ASSERT (Int0x10->Segment == 0x0000);
+    ASSERT (Int0x10->Offset  == 0x0000);

Because SVN r14218 was what added the clearing of page 0 to the DXE
core. (Expressed as a git commit: d436d5ca0936.)

> Otherwise it will cause page fault exception there. It seems that QEMU
> may clear all its memory at startup. Skipping the action of clearing
> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
> QEMU will not clear all its memory during warm boot. ASSERT will be
> triggered after reset.

QEMU does not clear guest RAM at reboot (or S3 resume).

> It's easy to fix this issue but there're some subtle situations which
> I'm not quite certain. I'd like your opinions for them.
>
> Here're my thoughts on several solutions:
> a) Remove the ASSERT statement in InstallVbeShim().

Not a good idea.

> But I'm sure if it is safe to do so because I don't quite understand
> the purpose of the ASSERT.

The ASSERT expresses our belief that, after we manage to *allocate* page
0, we can freely write the real mode Int0x10 vector there, without
overwriting anything else.

> b) Instead of skipping clearing page 0, enable it, do clearing and
> then disable it. The problem here is that CPU arch protocol is not
> ready at that time. I have to "manually" do page operation, which
> might be non-portable and a little bit odd in DXE core.
> c) Move code clearing page 0 from DXE core to another place wherever
> appropriate, like DxeIpl or cpu driver. But I think there's a good
> reason to put code there before.

I'm not sure how the NULL pointer access detection is enabled. Is it
enabled by a PCD?

Hm, yes, it seems, from your patch

  [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core

that the PCD is called

  gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection

I can see in that patch that the DXE core is updated *not* to clear page
0 if the PCD is set to TRUE. That makes sense.

However, OVMF should also be updated to skip the VBE shim installation
if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
simply cannot put the legacy Int0x10 vector in place, so we shouldn't
even try.

Please extend your original patch set with a patch for OvmfPkg. In the
file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
site:

> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>       Private->Variant == QEMU_VIDEO_BOCHS) {
>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>   }
> #endif

and please skip the call if the PCD is set to TRUE:

> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>        Private->Variant == QEMU_VIDEO_BOCHS)) {
>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>   }
> #endif

Now, the consequence of this would be a working OVMF build, but it would
also break booting UEFI Windows 7 on OVMF. Therefore, please append
*another* patch to your patch set, setting the Feature PCD to FALSE in
all three OVMF DSC files:

- OvmfPkg/OvmfPkgIa32.dsc
- OvmfPkg/OvmfPkgIa32X64.dsc
- OvmfPkg/OvmfPkgX64.dsc

We cannot break Windows 7 booting in upstream OVMF. If someone really
wants the null pointer detection in OVMF (and doesn't need Windows 7 for
sure), they can patch the DSC files, or else pass the

  --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE

option to the "build" utility.

... I'm now seeing in the original discussion that perhaps you will
introduce a bitmap instead (one bit per firmware phase). That's OK with
me; please reinterpret all of the above with the "DXE bit" then.

When you post the next version of the patch set, please CC Jordan and
myself on the OvmfPkg patches.

Thank you!
Laszlo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-06 10:20 ` Laszlo Ersek
@ 2017-09-06 11:16   ` Yao, Jiewen
  2017-09-06 13:18     ` Laszlo Ersek
  2017-09-07  0:41     ` Wang, Jian J
  0 siblings, 2 replies; 13+ messages in thread
From: Yao, Jiewen @ 2017-09-06 11:16 UTC (permalink / raw)
  To: Laszlo Ersek, Wang, Jian J, Justen, Jordan L
  Cc: edk2-devel@lists.01.org, Kinney, Michael D

HI
I think the NULL detection feature should *never* break any existing platform.
No real function should be skipped for PcdNullDetection.

If InstallVbeShim () is needed for CSM, we should always call InstallVbeShim() for CSM.

I suggest below options:

1)       In OVMF, if CSM is enabled, disable PcdNullDetection.

2)       In OVMF, if any driver need access first 4K page, the code should explicit call SetAttribute(0, 4K, READ|WRITE), if PcdNullDetection is enabled.

Either #1 or #2 is OK.
#1 is simple, and #2 can help detect more potential issue.

Thank you
Yao Jiewen

From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Wednesday, September 6, 2017 6:21 PM
To: Wang, Jian J <jian.j.wang@intel.com>; Justen, Jordan L <jordan.l.justen@intel.com>; Yao, Jiewen <jiewen.yao@intel.com>
Cc: edk2-devel@lists.01.org; Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: ASSERT in QemuVideoDxe driver during reset

On 09/06/17 10:15, Wang, Jian J wrote:
> Hi guys,
>
> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
> driver during reset. The assert statement is like below.
>
>     ASSERT (Int0x10->Segment == 0x0000);
>     ASSERT (Int0x10->Offset  == 0x0000);
>
> This happened after I have enabled NULL pointer access detection
> feature, in which page 0 (4K)  is disabled.

The NULL pointer access detection conflicts with the VBE shim. For
installing the VBE shim, OVMF has to write to page 0, on purpose.

Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).

> And because of page 0 disabled, I have to skip the memory clearing for
> page 0 in DXE core.

By doing that, you invalidate the following comment in said OVMF commit:

+    //
+    // We managed to allocate the page at zero. SVN r14218 guarantees that it
+    // is NUL-filled.
+    //
+    ASSERT (Int0x10->Segment == 0x0000);
+    ASSERT (Int0x10->Offset  == 0x0000);

Because SVN r14218 was what added the clearing of page 0 to the DXE
core. (Expressed as a git commit: d436d5ca0936.)

> Otherwise it will cause page fault exception there. It seems that QEMU
> may clear all its memory at startup. Skipping the action of clearing
> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
> QEMU will not clear all its memory during warm boot. ASSERT will be
> triggered after reset.

QEMU does not clear guest RAM at reboot (or S3 resume).

> It's easy to fix this issue but there're some subtle situations which
> I'm not quite certain. I'd like your opinions for them.
>
> Here're my thoughts on several solutions:
> a) Remove the ASSERT statement in InstallVbeShim().

Not a good idea.

> But I'm sure if it is safe to do so because I don't quite understand
> the purpose of the ASSERT.

The ASSERT expresses our belief that, after we manage to *allocate* page
0, we can freely write the real mode Int0x10 vector there, without
overwriting anything else.

> b) Instead of skipping clearing page 0, enable it, do clearing and
> then disable it. The problem here is that CPU arch protocol is not
> ready at that time. I have to "manually" do page operation, which
> might be non-portable and a little bit odd in DXE core.
> c) Move code clearing page 0 from DXE core to another place wherever
> appropriate, like DxeIpl or cpu driver. But I think there's a good
> reason to put code there before.

I'm not sure how the NULL pointer access detection is enabled. Is it
enabled by a PCD?

Hm, yes, it seems, from your patch

  [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core

that the PCD is called

  gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection

I can see in that patch that the DXE core is updated *not* to clear page
0 if the PCD is set to TRUE. That makes sense.

However, OVMF should also be updated to skip the VBE shim installation
if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
simply cannot put the legacy Int0x10 vector in place, so we shouldn't
even try.

Please extend your original patch set with a patch for OvmfPkg. In the
file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
site:

> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>       Private->Variant == QEMU_VIDEO_BOCHS) {
>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>   }
> #endif

and please skip the call if the PCD is set to TRUE:

> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>        Private->Variant == QEMU_VIDEO_BOCHS)) {
>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>   }
> #endif

Now, the consequence of this would be a working OVMF build, but it would
also break booting UEFI Windows 7 on OVMF. Therefore, please append
*another* patch to your patch set, setting the Feature PCD to FALSE in
all three OVMF DSC files:

- OvmfPkg/OvmfPkgIa32.dsc
- OvmfPkg/OvmfPkgIa32X64.dsc
- OvmfPkg/OvmfPkgX64.dsc

We cannot break Windows 7 booting in upstream OVMF. If someone really
wants the null pointer detection in OVMF (and doesn't need Windows 7 for
sure), they can patch the DSC files, or else pass the

  --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE

option to the "build" utility.

... I'm now seeing in the original discussion that perhaps you will
introduce a bitmap instead (one bit per firmware phase). That's OK with
me; please reinterpret all of the above with the "DXE bit" then.

When you post the next version of the patch set, please CC Jordan and
myself on the OvmfPkg patches.

Thank you!
Laszlo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-06 11:16   ` Yao, Jiewen
@ 2017-09-06 13:18     ` Laszlo Ersek
  2017-09-06 14:47       ` Gao, Liming
  2017-09-06 15:06       ` Yao, Jiewen
  2017-09-07  0:41     ` Wang, Jian J
  1 sibling, 2 replies; 13+ messages in thread
From: Laszlo Ersek @ 2017-09-06 13:18 UTC (permalink / raw)
  To: Yao, Jiewen, Wang, Jian J, Justen, Jordan L
  Cc: edk2-devel@lists.01.org, Kinney, Michael D

On 09/06/17 13:16, Yao, Jiewen wrote:
> HI
> I think the NULL detection feature should *never* break any existing platform.
> No real function should be skipped for PcdNullDetection.
> 
> If InstallVbeShim () is needed for CSM, we should always call InstallVbeShim() for CSM.

Wait a second please -- the VBE shim exists so that we can boot Windows7
on OVMF *without* the CSM.

If you build OVMF with the SeaBIOS CSM, then Windows7 will simply boot
in legacy mode, and work as usual.

But if you build OVMF without a CSM, then Windows7 will break mid-way
into booting. The reason is that Windows7 insists on some legacy VBE
services even if it is booted on a pure UEFI system. By faking a minimal
set of legacy VBE services, Windows7 boots just fine in UEFI mode.

(Windows 7 doesn't actually execute the VBE services' real mode code on
the processor; instead the code is interpreted in a software emulator.)

The dependency on legacy VBE services is arguably a bug of UEFI Windows
7. One way to work it around is to build OVMF with the SeaBIOS CSM, and
just boot Windows7 in legacy mode.

However, in some environments, building OVMF with a CSM is not
desirable. That's why we created the VBE shim. It is a terrible hack,
but it fits the Windows7 bug just fine.

> I suggest below options:
> 
> 1)       In OVMF, if CSM is enabled, disable PcdNullDetection.
> 
> 2)       In OVMF, if any driver need access first 4K page, the code should explicit call SetAttribute(0, 4K, READ|WRITE), if PcdNullDetection is enabled.
> 
> Either #1 or #2 is OK.
> #1 is simple, and #2 can help detect more potential issue.

* Regarding #1:

As I wrote above, the VBE shim is mutually exclusive with CSM:

  //
  // The allocation request may fail, eg. if LegacyBiosDxe has already run.
  //

So, if the CSM was built into OVMF, then the VBE shim is already skipped
at boot.


* Regarding #2:

I agree that *temporarily* enabling R/W access to page zero, using the
appropriate DXE service, could make the VBE Shim installation work.

However, making page 0 unaccessible right after, could very easily break
Windows 7, if some component of Windows 7 accessed page 0, before
setting up new page tables. After all, in the InstallVbeShim() function,
we write stuff to page 0 *because* Windows 7 will read it from there.


* If it is more convenient than a "--pcd=..." option for "build", we can
also add another build-time macro to the OVMF DSC files, called
NULL_DETECT_ENABLE. This would then turn on the PcdNullDetection bits,
for "-D NULL_DETECT_ENABLE".

NULL_DETECT_ENABLE should default to FALSE.


* Independently: if the CSM too conflicts with the null detection
feature, then we should probably report an "unsupported configuration"
error for the following case:

  -D NULL_DETECT_ENABLE -D CSM_ENABLE

But, I don't know if triggering build errors, with custom error
messages, is possible in DSC files.

Thanks
Laszlo


> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Wednesday, September 6, 2017 6:21 PM
> To: Wang, Jian J <jian.j.wang@intel.com>; Justen, Jordan L <jordan.l.justen@intel.com>; Yao, Jiewen <jiewen.yao@intel.com>
> Cc: edk2-devel@lists.01.org; Kinney, Michael D <michael.d.kinney@intel.com>
> Subject: Re: ASSERT in QemuVideoDxe driver during reset
> 
> On 09/06/17 10:15, Wang, Jian J wrote:
>> Hi guys,
>>
>> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
>> driver during reset. The assert statement is like below.
>>
>>     ASSERT (Int0x10->Segment == 0x0000);
>>     ASSERT (Int0x10->Offset  == 0x0000);
>>
>> This happened after I have enabled NULL pointer access detection
>> feature, in which page 0 (4K)  is disabled.
> 
> The NULL pointer access detection conflicts with the VBE shim. For
> installing the VBE shim, OVMF has to write to page 0, on purpose.
> 
> Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
> Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).
> 
>> And because of page 0 disabled, I have to skip the memory clearing for
>> page 0 in DXE core.
> 
> By doing that, you invalidate the following comment in said OVMF commit:
> 
> +    //
> +    // We managed to allocate the page at zero. SVN r14218 guarantees that it
> +    // is NUL-filled.
> +    //
> +    ASSERT (Int0x10->Segment == 0x0000);
> +    ASSERT (Int0x10->Offset  == 0x0000);
> 
> Because SVN r14218 was what added the clearing of page 0 to the DXE
> core. (Expressed as a git commit: d436d5ca0936.)
> 
>> Otherwise it will cause page fault exception there. It seems that QEMU
>> may clear all its memory at startup. Skipping the action of clearing
>> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
>> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
>> QEMU will not clear all its memory during warm boot. ASSERT will be
>> triggered after reset.
> 
> QEMU does not clear guest RAM at reboot (or S3 resume).
> 
>> It's easy to fix this issue but there're some subtle situations which
>> I'm not quite certain. I'd like your opinions for them.
>>
>> Here're my thoughts on several solutions:
>> a) Remove the ASSERT statement in InstallVbeShim().
> 
> Not a good idea.
> 
>> But I'm sure if it is safe to do so because I don't quite understand
>> the purpose of the ASSERT.
> 
> The ASSERT expresses our belief that, after we manage to *allocate* page
> 0, we can freely write the real mode Int0x10 vector there, without
> overwriting anything else.
> 
>> b) Instead of skipping clearing page 0, enable it, do clearing and
>> then disable it. The problem here is that CPU arch protocol is not
>> ready at that time. I have to "manually" do page operation, which
>> might be non-portable and a little bit odd in DXE core.
>> c) Move code clearing page 0 from DXE core to another place wherever
>> appropriate, like DxeIpl or cpu driver. But I think there's a good
>> reason to put code there before.
> 
> I'm not sure how the NULL pointer access detection is enabled. Is it
> enabled by a PCD?
> 
> Hm, yes, it seems, from your patch
> 
>   [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core
> 
> that the PCD is called
> 
>   gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection
> 
> I can see in that patch that the DXE core is updated *not* to clear page
> 0 if the PCD is set to TRUE. That makes sense.
> 
> However, OVMF should also be updated to skip the VBE shim installation
> if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
> simply cannot put the legacy Int0x10 vector in place, so we shouldn't
> even try.
> 
> Please extend your original patch set with a patch for OvmfPkg. In the
> file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
> site:
> 
>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>       Private->Variant == QEMU_VIDEO_BOCHS) {
>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>   }
>> #endif
> 
> and please skip the call if the PCD is set to TRUE:
> 
>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
>>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>        Private->Variant == QEMU_VIDEO_BOCHS)) {
>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>   }
>> #endif
> 
> Now, the consequence of this would be a working OVMF build, but it would
> also break booting UEFI Windows 7 on OVMF. Therefore, please append
> *another* patch to your patch set, setting the Feature PCD to FALSE in
> all three OVMF DSC files:
> 
> - OvmfPkg/OvmfPkgIa32.dsc
> - OvmfPkg/OvmfPkgIa32X64.dsc
> - OvmfPkg/OvmfPkgX64.dsc
> 
> We cannot break Windows 7 booting in upstream OVMF. If someone really
> wants the null pointer detection in OVMF (and doesn't need Windows 7 for
> sure), they can patch the DSC files, or else pass the
> 
>   --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE
> 
> option to the "build" utility.
> 
> ... I'm now seeing in the original discussion that perhaps you will
> introduce a bitmap instead (one bit per firmware phase). That's OK with
> me; please reinterpret all of the above with the "DXE bit" then.
> 
> When you post the next version of the patch set, please CC Jordan and
> myself on the OvmfPkg patches.
> 
> Thank you!
> Laszlo
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-06 13:18     ` Laszlo Ersek
@ 2017-09-06 14:47       ` Gao, Liming
  2017-09-06 15:35         ` Laszlo Ersek
  2017-09-06 15:06       ` Yao, Jiewen
  1 sibling, 1 reply; 13+ messages in thread
From: Gao, Liming @ 2017-09-06 14:47 UTC (permalink / raw)
  To: Laszlo Ersek, Yao, Jiewen, Wang, Jian J, Justen, Jordan L
  Cc: Kinney, Michael D, edk2-devel@lists.01.org

Laszlo:

> * Independently: if the CSM too conflicts with the null detection
> feature, then we should probably report an "unsupported configuration"
> error for the following case:
> 
>   -D NULL_DETECT_ENABLE -D CSM_ENABLE
> 
> But, I don't know if triggering build errors, with custom error
> messages, is possible in DSC files.

Now, there is no support to report the error message for the invalid combination. I think this is a good request to BaseTools. We can introduce !error "error message" syntax in DSC/FDF. If the invalid combination happen, !error will trig build error message. Could you help submit this feature request in BugZillar?

Thanks
Liming
> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
> Sent: Wednesday, September 6, 2017 9:19 PM
> To: Yao, Jiewen <jiewen.yao@intel.com>; Wang, Jian J <jian.j.wang@intel.com>; Justen, Jordan L <jordan.l.justen@intel.com>
> Cc: Kinney, Michael D <michael.d.kinney@intel.com>; edk2-devel@lists.01.org
> Subject: Re: [edk2] ASSERT in QemuVideoDxe driver during reset
> 
> On 09/06/17 13:16, Yao, Jiewen wrote:
> > HI
> > I think the NULL detection feature should *never* break any existing platform.
> > No real function should be skipped for PcdNullDetection.
> >
> > If InstallVbeShim () is needed for CSM, we should always call InstallVbeShim() for CSM.
> 
> Wait a second please -- the VBE shim exists so that we can boot Windows7
> on OVMF *without* the CSM.
> 
> If you build OVMF with the SeaBIOS CSM, then Windows7 will simply boot
> in legacy mode, and work as usual.
> 
> But if you build OVMF without a CSM, then Windows7 will break mid-way
> into booting. The reason is that Windows7 insists on some legacy VBE
> services even if it is booted on a pure UEFI system. By faking a minimal
> set of legacy VBE services, Windows7 boots just fine in UEFI mode.
> 
> (Windows 7 doesn't actually execute the VBE services' real mode code on
> the processor; instead the code is interpreted in a software emulator.)
> 
> The dependency on legacy VBE services is arguably a bug of UEFI Windows
> 7. One way to work it around is to build OVMF with the SeaBIOS CSM, and
> just boot Windows7 in legacy mode.
> 
> However, in some environments, building OVMF with a CSM is not
> desirable. That's why we created the VBE shim. It is a terrible hack,
> but it fits the Windows7 bug just fine.
> 
> > I suggest below options:
> >
> > 1)       In OVMF, if CSM is enabled, disable PcdNullDetection.
> >
> > 2)       In OVMF, if any driver need access first 4K page, the code should explicit call SetAttribute(0, 4K, READ|WRITE), if
> PcdNullDetection is enabled.
> >
> > Either #1 or #2 is OK.
> > #1 is simple, and #2 can help detect more potential issue.
> 
> * Regarding #1:
> 
> As I wrote above, the VBE shim is mutually exclusive with CSM:
> 
>   //
>   // The allocation request may fail, eg. if LegacyBiosDxe has already run.
>   //
> 
> So, if the CSM was built into OVMF, then the VBE shim is already skipped
> at boot.
> 
> 
> * Regarding #2:
> 
> I agree that *temporarily* enabling R/W access to page zero, using the
> appropriate DXE service, could make the VBE Shim installation work.
> 
> However, making page 0 unaccessible right after, could very easily break
> Windows 7, if some component of Windows 7 accessed page 0, before
> setting up new page tables. After all, in the InstallVbeShim() function,
> we write stuff to page 0 *because* Windows 7 will read it from there.
> 
> 
> * If it is more convenient than a "--pcd=..." option for "build", we can
> also add another build-time macro to the OVMF DSC files, called
> NULL_DETECT_ENABLE. This would then turn on the PcdNullDetection bits,
> for "-D NULL_DETECT_ENABLE".
> 
> NULL_DETECT_ENABLE should default to FALSE.
> 
> 
> * Independently: if the CSM too conflicts with the null detection
> feature, then we should probably report an "unsupported configuration"
> error for the following case:
> 
>   -D NULL_DETECT_ENABLE -D CSM_ENABLE
> 
> But, I don't know if triggering build errors, with custom error
> messages, is possible in DSC files.
> 
> Thanks
> Laszlo
> 
> 
> > From: Laszlo Ersek [mailto:lersek@redhat.com]
> > Sent: Wednesday, September 6, 2017 6:21 PM
> > To: Wang, Jian J <jian.j.wang@intel.com>; Justen, Jordan L <jordan.l.justen@intel.com>; Yao, Jiewen <jiewen.yao@intel.com>
> > Cc: edk2-devel@lists.01.org; Kinney, Michael D <michael.d.kinney@intel.com>
> > Subject: Re: ASSERT in QemuVideoDxe driver during reset
> >
> > On 09/06/17 10:15, Wang, Jian J wrote:
> >> Hi guys,
> >>
> >> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
> >> driver during reset. The assert statement is like below.
> >>
> >>     ASSERT (Int0x10->Segment == 0x0000);
> >>     ASSERT (Int0x10->Offset  == 0x0000);
> >>
> >> This happened after I have enabled NULL pointer access detection
> >> feature, in which page 0 (4K)  is disabled.
> >
> > The NULL pointer access detection conflicts with the VBE shim. For
> > installing the VBE shim, OVMF has to write to page 0, on purpose.
> >
> > Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
> > Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).
> >
> >> And because of page 0 disabled, I have to skip the memory clearing for
> >> page 0 in DXE core.
> >
> > By doing that, you invalidate the following comment in said OVMF commit:
> >
> > +    //
> > +    // We managed to allocate the page at zero. SVN r14218 guarantees that it
> > +    // is NUL-filled.
> > +    //
> > +    ASSERT (Int0x10->Segment == 0x0000);
> > +    ASSERT (Int0x10->Offset  == 0x0000);
> >
> > Because SVN r14218 was what added the clearing of page 0 to the DXE
> > core. (Expressed as a git commit: d436d5ca0936.)
> >
> >> Otherwise it will cause page fault exception there. It seems that QEMU
> >> may clear all its memory at startup. Skipping the action of clearing
> >> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
> >> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
> >> QEMU will not clear all its memory during warm boot. ASSERT will be
> >> triggered after reset.
> >
> > QEMU does not clear guest RAM at reboot (or S3 resume).
> >
> >> It's easy to fix this issue but there're some subtle situations which
> >> I'm not quite certain. I'd like your opinions for them.
> >>
> >> Here're my thoughts on several solutions:
> >> a) Remove the ASSERT statement in InstallVbeShim().
> >
> > Not a good idea.
> >
> >> But I'm sure if it is safe to do so because I don't quite understand
> >> the purpose of the ASSERT.
> >
> > The ASSERT expresses our belief that, after we manage to *allocate* page
> > 0, we can freely write the real mode Int0x10 vector there, without
> > overwriting anything else.
> >
> >> b) Instead of skipping clearing page 0, enable it, do clearing and
> >> then disable it. The problem here is that CPU arch protocol is not
> >> ready at that time. I have to "manually" do page operation, which
> >> might be non-portable and a little bit odd in DXE core.
> >> c) Move code clearing page 0 from DXE core to another place wherever
> >> appropriate, like DxeIpl or cpu driver. But I think there's a good
> >> reason to put code there before.
> >
> > I'm not sure how the NULL pointer access detection is enabled. Is it
> > enabled by a PCD?
> >
> > Hm, yes, it seems, from your patch
> >
> >   [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core
> >
> > that the PCD is called
> >
> >   gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection
> >
> > I can see in that patch that the DXE core is updated *not* to clear page
> > 0 if the PCD is set to TRUE. That makes sense.
> >
> > However, OVMF should also be updated to skip the VBE shim installation
> > if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
> > simply cannot put the legacy Int0x10 vector in place, so we shouldn't
> > even try.
> >
> > Please extend your original patch set with a patch for OvmfPkg. In the
> > file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
> > site:
> >
> >> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
> >>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
> >>       Private->Variant == QEMU_VIDEO_BOCHS) {
> >>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
> >>   }
> >> #endif
> >
> > and please skip the call if the PCD is set to TRUE:
> >
> >> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
> >>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
> >>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
> >>        Private->Variant == QEMU_VIDEO_BOCHS)) {
> >>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
> >>   }
> >> #endif
> >
> > Now, the consequence of this would be a working OVMF build, but it would
> > also break booting UEFI Windows 7 on OVMF. Therefore, please append
> > *another* patch to your patch set, setting the Feature PCD to FALSE in
> > all three OVMF DSC files:
> >
> > - OvmfPkg/OvmfPkgIa32.dsc
> > - OvmfPkg/OvmfPkgIa32X64.dsc
> > - OvmfPkg/OvmfPkgX64.dsc
> >
> > We cannot break Windows 7 booting in upstream OVMF. If someone really
> > wants the null pointer detection in OVMF (and doesn't need Windows 7 for
> > sure), they can patch the DSC files, or else pass the
> >
> >   --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE
> >
> > option to the "build" utility.
> >
> > ... I'm now seeing in the original discussion that perhaps you will
> > introduce a bitmap instead (one bit per firmware phase). That's OK with
> > me; please reinterpret all of the above with the "DXE bit" then.
> >
> > When you post the next version of the patch set, please CC Jordan and
> > myself on the OvmfPkg patches.
> >
> > Thank you!
> > Laszlo
> >
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-06 13:18     ` Laszlo Ersek
  2017-09-06 14:47       ` Gao, Liming
@ 2017-09-06 15:06       ` Yao, Jiewen
  2017-09-06 15:25         ` Laszlo Ersek
  1 sibling, 1 reply; 13+ messages in thread
From: Yao, Jiewen @ 2017-09-06 15:06 UTC (permalink / raw)
  To: Laszlo Ersek, Wang, Jian J, Justen, Jordan L
  Cc: Kinney, Michael D, edk2-devel@lists.01.org

HI Laszlo
Would you please clarify Window 7 is "UEFI Window 7" or "Legacy Window 7" ?

Thank you
Yao Jiewen

From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
Sent: Wednesday, September 6, 2017 9:19 PM
To: Yao, Jiewen <jiewen.yao@intel.com>; Wang, Jian J <jian.j.wang@intel.com>; Justen, Jordan L <jordan.l.justen@intel.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>; edk2-devel@lists.01.org
Subject: Re: [edk2] ASSERT in QemuVideoDxe driver during reset

On 09/06/17 13:16, Yao, Jiewen wrote:
> HI
> I think the NULL detection feature should *never* break any existing platform.
> No real function should be skipped for PcdNullDetection.
>
> If InstallVbeShim () is needed for CSM, we should always call InstallVbeShim() for CSM.

Wait a second please -- the VBE shim exists so that we can boot Windows7
on OVMF *without* the CSM.

If you build OVMF with the SeaBIOS CSM, then Windows7 will simply boot
in legacy mode, and work as usual.

But if you build OVMF without a CSM, then Windows7 will break mid-way
into booting. The reason is that Windows7 insists on some legacy VBE
services even if it is booted on a pure UEFI system. By faking a minimal
set of legacy VBE services, Windows7 boots just fine in UEFI mode.

(Windows 7 doesn't actually execute the VBE services' real mode code on
the processor; instead the code is interpreted in a software emulator.)

The dependency on legacy VBE services is arguably a bug of UEFI Windows
7. One way to work it around is to build OVMF with the SeaBIOS CSM, and
just boot Windows7 in legacy mode.

However, in some environments, building OVMF with a CSM is not
desirable. That's why we created the VBE shim. It is a terrible hack,
but it fits the Windows7 bug just fine.

> I suggest below options:
>
> 1)       In OVMF, if CSM is enabled, disable PcdNullDetection.
>
> 2)       In OVMF, if any driver need access first 4K page, the code should explicit call SetAttribute(0, 4K, READ|WRITE), if PcdNullDetection is enabled.
>
> Either #1 or #2 is OK.
> #1 is simple, and #2 can help detect more potential issue.

* Regarding #1:

As I wrote above, the VBE shim is mutually exclusive with CSM:

  //
  // The allocation request may fail, eg. if LegacyBiosDxe has already run.
  //

So, if the CSM was built into OVMF, then the VBE shim is already skipped
at boot.


* Regarding #2:

I agree that *temporarily* enabling R/W access to page zero, using the
appropriate DXE service, could make the VBE Shim installation work.

However, making page 0 unaccessible right after, could very easily break
Windows 7, if some component of Windows 7 accessed page 0, before
setting up new page tables. After all, in the InstallVbeShim() function,
we write stuff to page 0 *because* Windows 7 will read it from there.


* If it is more convenient than a "--pcd=..." option for "build", we can
also add another build-time macro to the OVMF DSC files, called
NULL_DETECT_ENABLE. This would then turn on the PcdNullDetection bits,
for "-D NULL_DETECT_ENABLE".

NULL_DETECT_ENABLE should default to FALSE.


* Independently: if the CSM too conflicts with the null detection
feature, then we should probably report an "unsupported configuration"
error for the following case:

  -D NULL_DETECT_ENABLE -D CSM_ENABLE

But, I don't know if triggering build errors, with custom error
messages, is possible in DSC files.

Thanks
Laszlo


> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Wednesday, September 6, 2017 6:21 PM
> To: Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com>>; Yao, Jiewen <jiewen.yao@intel.com<mailto:jiewen.yao@intel.com>>
> Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
> Subject: Re: ASSERT in QemuVideoDxe driver during reset
>
> On 09/06/17 10:15, Wang, Jian J wrote:
>> Hi guys,
>>
>> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
>> driver during reset. The assert statement is like below.
>>
>>     ASSERT (Int0x10->Segment == 0x0000);
>>     ASSERT (Int0x10->Offset  == 0x0000);
>>
>> This happened after I have enabled NULL pointer access detection
>> feature, in which page 0 (4K)  is disabled.
>
> The NULL pointer access detection conflicts with the VBE shim. For
> installing the VBE shim, OVMF has to write to page 0, on purpose.
>
> Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
> Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).
>
>> And because of page 0 disabled, I have to skip the memory clearing for
>> page 0 in DXE core.
>
> By doing that, you invalidate the following comment in said OVMF commit:
>
> +    //
> +    // We managed to allocate the page at zero. SVN r14218 guarantees that it
> +    // is NUL-filled.
> +    //
> +    ASSERT (Int0x10->Segment == 0x0000);
> +    ASSERT (Int0x10->Offset  == 0x0000);
>
> Because SVN r14218 was what added the clearing of page 0 to the DXE
> core. (Expressed as a git commit: d436d5ca0936.)
>
>> Otherwise it will cause page fault exception there. It seems that QEMU
>> may clear all its memory at startup. Skipping the action of clearing
>> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
>> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
>> QEMU will not clear all its memory during warm boot. ASSERT will be
>> triggered after reset.
>
> QEMU does not clear guest RAM at reboot (or S3 resume).
>
>> It's easy to fix this issue but there're some subtle situations which
>> I'm not quite certain. I'd like your opinions for them.
>>
>> Here're my thoughts on several solutions:
>> a) Remove the ASSERT statement in InstallVbeShim().
>
> Not a good idea.
>
>> But I'm sure if it is safe to do so because I don't quite understand
>> the purpose of the ASSERT.
>
> The ASSERT expresses our belief that, after we manage to *allocate* page
> 0, we can freely write the real mode Int0x10 vector there, without
> overwriting anything else.
>
>> b) Instead of skipping clearing page 0, enable it, do clearing and
>> then disable it. The problem here is that CPU arch protocol is not
>> ready at that time. I have to "manually" do page operation, which
>> might be non-portable and a little bit odd in DXE core.
>> c) Move code clearing page 0 from DXE core to another place wherever
>> appropriate, like DxeIpl or cpu driver. But I think there's a good
>> reason to put code there before.
>
> I'm not sure how the NULL pointer access detection is enabled. Is it
> enabled by a PCD?
>
> Hm, yes, it seems, from your patch
>
>   [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core
>
> that the PCD is called
>
>   gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection
>
> I can see in that patch that the DXE core is updated *not* to clear page
> 0 if the PCD is set to TRUE. That makes sense.
>
> However, OVMF should also be updated to skip the VBE shim installation
> if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
> simply cannot put the legacy Int0x10 vector in place, so we shouldn't
> even try.
>
> Please extend your original patch set with a patch for OvmfPkg. In the
> file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
> site:
>
>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>       Private->Variant == QEMU_VIDEO_BOCHS) {
>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>   }
>> #endif
>
> and please skip the call if the PCD is set to TRUE:
>
>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
>>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>        Private->Variant == QEMU_VIDEO_BOCHS)) {
>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>   }
>> #endif
>
> Now, the consequence of this would be a working OVMF build, but it would
> also break booting UEFI Windows 7 on OVMF. Therefore, please append
> *another* patch to your patch set, setting the Feature PCD to FALSE in
> all three OVMF DSC files:
>
> - OvmfPkg/OvmfPkgIa32.dsc
> - OvmfPkg/OvmfPkgIa32X64.dsc
> - OvmfPkg/OvmfPkgX64.dsc
>
> We cannot break Windows 7 booting in upstream OVMF. If someone really
> wants the null pointer detection in OVMF (and doesn't need Windows 7 for
> sure), they can patch the DSC files, or else pass the
>
>   --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE
>
> option to the "build" utility.
>
> ... I'm now seeing in the original discussion that perhaps you will
> introduce a bitmap instead (one bit per firmware phase). That's OK with
> me; please reinterpret all of the above with the "DXE bit" then.
>
> When you post the next version of the patch set, please CC Jordan and
> myself on the OvmfPkg patches.
>
> Thank you!
> Laszlo
>

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>
https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-06 15:06       ` Yao, Jiewen
@ 2017-09-06 15:25         ` Laszlo Ersek
  2017-09-07  0:54           ` Yao, Jiewen
  0 siblings, 1 reply; 13+ messages in thread
From: Laszlo Ersek @ 2017-09-06 15:25 UTC (permalink / raw)
  To: Yao, Jiewen, Wang, Jian J, Justen, Jordan L
  Cc: Kinney, Michael D, edk2-devel@lists.01.org

On 09/06/17 17:06, Yao, Jiewen wrote:
> HI Laszlo
> Would you please clarify Window 7 is "UEFI Window 7" or "Legacy Window 7" ?

I mean that you grab a Windows 7 / Windows Server 2008 R2 (X64)
installer ISO, and boot it either in Legacy mode, or in UEFI mode. The
ISO image is bootable in both modes, but even if you boot it in UEFI
mode, it wants to use VBE services.

Some ISOs that I currently have on my disk -- fully legally, of course
--, are

en_windows_server_2008_r2_with_sp1_x64_dvd_617601.iso
en_windows_7_enterprise_n_with_sp1_x64_dvd_u_677704
en_windows_7_professional_with_sp1_x64_dvd_u_676939
en_windows_7_ultimate_n_with_sp1_x86_dvd_u_677597

Thanks,
Laszlo


> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
> Sent: Wednesday, September 6, 2017 9:19 PM
> To: Yao, Jiewen <jiewen.yao@intel.com>; Wang, Jian J <jian.j.wang@intel.com>; Justen, Jordan L <jordan.l.justen@intel.com>
> Cc: Kinney, Michael D <michael.d.kinney@intel.com>; edk2-devel@lists.01.org
> Subject: Re: [edk2] ASSERT in QemuVideoDxe driver during reset
> 
> On 09/06/17 13:16, Yao, Jiewen wrote:
>> HI
>> I think the NULL detection feature should *never* break any existing platform.
>> No real function should be skipped for PcdNullDetection.
>>
>> If InstallVbeShim () is needed for CSM, we should always call InstallVbeShim() for CSM.
> 
> Wait a second please -- the VBE shim exists so that we can boot Windows7
> on OVMF *without* the CSM.
> 
> If you build OVMF with the SeaBIOS CSM, then Windows7 will simply boot
> in legacy mode, and work as usual.
> 
> But if you build OVMF without a CSM, then Windows7 will break mid-way
> into booting. The reason is that Windows7 insists on some legacy VBE
> services even if it is booted on a pure UEFI system. By faking a minimal
> set of legacy VBE services, Windows7 boots just fine in UEFI mode.
> 
> (Windows 7 doesn't actually execute the VBE services' real mode code on
> the processor; instead the code is interpreted in a software emulator.)
> 
> The dependency on legacy VBE services is arguably a bug of UEFI Windows
> 7. One way to work it around is to build OVMF with the SeaBIOS CSM, and
> just boot Windows7 in legacy mode.
> 
> However, in some environments, building OVMF with a CSM is not
> desirable. That's why we created the VBE shim. It is a terrible hack,
> but it fits the Windows7 bug just fine.
> 
>> I suggest below options:
>>
>> 1)       In OVMF, if CSM is enabled, disable PcdNullDetection.
>>
>> 2)       In OVMF, if any driver need access first 4K page, the code should explicit call SetAttribute(0, 4K, READ|WRITE), if PcdNullDetection is enabled.
>>
>> Either #1 or #2 is OK.
>> #1 is simple, and #2 can help detect more potential issue.
> 
> * Regarding #1:
> 
> As I wrote above, the VBE shim is mutually exclusive with CSM:
> 
>   //
>   // The allocation request may fail, eg. if LegacyBiosDxe has already run.
>   //
> 
> So, if the CSM was built into OVMF, then the VBE shim is already skipped
> at boot.
> 
> 
> * Regarding #2:
> 
> I agree that *temporarily* enabling R/W access to page zero, using the
> appropriate DXE service, could make the VBE Shim installation work.
> 
> However, making page 0 unaccessible right after, could very easily break
> Windows 7, if some component of Windows 7 accessed page 0, before
> setting up new page tables. After all, in the InstallVbeShim() function,
> we write stuff to page 0 *because* Windows 7 will read it from there.
> 
> 
> * If it is more convenient than a "--pcd=..." option for "build", we can
> also add another build-time macro to the OVMF DSC files, called
> NULL_DETECT_ENABLE. This would then turn on the PcdNullDetection bits,
> for "-D NULL_DETECT_ENABLE".
> 
> NULL_DETECT_ENABLE should default to FALSE.
> 
> 
> * Independently: if the CSM too conflicts with the null detection
> feature, then we should probably report an "unsupported configuration"
> error for the following case:
> 
>   -D NULL_DETECT_ENABLE -D CSM_ENABLE
> 
> But, I don't know if triggering build errors, with custom error
> messages, is possible in DSC files.
> 
> Thanks
> Laszlo
> 
> 
>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>> Sent: Wednesday, September 6, 2017 6:21 PM
>> To: Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com>>; Yao, Jiewen <jiewen.yao@intel.com<mailto:jiewen.yao@intel.com>>
>> Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
>> Subject: Re: ASSERT in QemuVideoDxe driver during reset
>>
>> On 09/06/17 10:15, Wang, Jian J wrote:
>>> Hi guys,
>>>
>>> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
>>> driver during reset. The assert statement is like below.
>>>
>>>     ASSERT (Int0x10->Segment == 0x0000);
>>>     ASSERT (Int0x10->Offset  == 0x0000);
>>>
>>> This happened after I have enabled NULL pointer access detection
>>> feature, in which page 0 (4K)  is disabled.
>>
>> The NULL pointer access detection conflicts with the VBE shim. For
>> installing the VBE shim, OVMF has to write to page 0, on purpose.
>>
>> Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
>> Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).
>>
>>> And because of page 0 disabled, I have to skip the memory clearing for
>>> page 0 in DXE core.
>>
>> By doing that, you invalidate the following comment in said OVMF commit:
>>
>> +    //
>> +    // We managed to allocate the page at zero. SVN r14218 guarantees that it
>> +    // is NUL-filled.
>> +    //
>> +    ASSERT (Int0x10->Segment == 0x0000);
>> +    ASSERT (Int0x10->Offset  == 0x0000);
>>
>> Because SVN r14218 was what added the clearing of page 0 to the DXE
>> core. (Expressed as a git commit: d436d5ca0936.)
>>
>>> Otherwise it will cause page fault exception there. It seems that QEMU
>>> may clear all its memory at startup. Skipping the action of clearing
>>> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
>>> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
>>> QEMU will not clear all its memory during warm boot. ASSERT will be
>>> triggered after reset.
>>
>> QEMU does not clear guest RAM at reboot (or S3 resume).
>>
>>> It's easy to fix this issue but there're some subtle situations which
>>> I'm not quite certain. I'd like your opinions for them.
>>>
>>> Here're my thoughts on several solutions:
>>> a) Remove the ASSERT statement in InstallVbeShim().
>>
>> Not a good idea.
>>
>>> But I'm sure if it is safe to do so because I don't quite understand
>>> the purpose of the ASSERT.
>>
>> The ASSERT expresses our belief that, after we manage to *allocate* page
>> 0, we can freely write the real mode Int0x10 vector there, without
>> overwriting anything else.
>>
>>> b) Instead of skipping clearing page 0, enable it, do clearing and
>>> then disable it. The problem here is that CPU arch protocol is not
>>> ready at that time. I have to "manually" do page operation, which
>>> might be non-portable and a little bit odd in DXE core.
>>> c) Move code clearing page 0 from DXE core to another place wherever
>>> appropriate, like DxeIpl or cpu driver. But I think there's a good
>>> reason to put code there before.
>>
>> I'm not sure how the NULL pointer access detection is enabled. Is it
>> enabled by a PCD?
>>
>> Hm, yes, it seems, from your patch
>>
>>   [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core
>>
>> that the PCD is called
>>
>>   gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection
>>
>> I can see in that patch that the DXE core is updated *not* to clear page
>> 0 if the PCD is set to TRUE. That makes sense.
>>
>> However, OVMF should also be updated to skip the VBE shim installation
>> if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
>> simply cannot put the legacy Int0x10 vector in place, so we shouldn't
>> even try.
>>
>> Please extend your original patch set with a patch for OvmfPkg. In the
>> file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
>> site:
>>
>>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>>       Private->Variant == QEMU_VIDEO_BOCHS) {
>>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>>   }
>>> #endif
>>
>> and please skip the call if the PCD is set to TRUE:
>>
>>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
>>>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>>        Private->Variant == QEMU_VIDEO_BOCHS)) {
>>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>>   }
>>> #endif
>>
>> Now, the consequence of this would be a working OVMF build, but it would
>> also break booting UEFI Windows 7 on OVMF. Therefore, please append
>> *another* patch to your patch set, setting the Feature PCD to FALSE in
>> all three OVMF DSC files:
>>
>> - OvmfPkg/OvmfPkgIa32.dsc
>> - OvmfPkg/OvmfPkgIa32X64.dsc
>> - OvmfPkg/OvmfPkgX64.dsc
>>
>> We cannot break Windows 7 booting in upstream OVMF. If someone really
>> wants the null pointer detection in OVMF (and doesn't need Windows 7 for
>> sure), they can patch the DSC files, or else pass the
>>
>>   --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE
>>
>> option to the "build" utility.
>>
>> ... I'm now seeing in the original discussion that perhaps you will
>> introduce a bitmap instead (one bit per firmware phase). That's OK with
>> me; please reinterpret all of the above with the "DXE bit" then.
>>
>> When you post the next version of the patch set, please CC Jordan and
>> myself on the OvmfPkg patches.
>>
>> Thank you!
>> Laszlo
>>
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>
> https://lists.01.org/mailman/listinfo/edk2-devel
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-06 14:47       ` Gao, Liming
@ 2017-09-06 15:35         ` Laszlo Ersek
  0 siblings, 0 replies; 13+ messages in thread
From: Laszlo Ersek @ 2017-09-06 15:35 UTC (permalink / raw)
  To: Gao, Liming, Yao, Jiewen, Wang, Jian J, Justen, Jordan L
  Cc: Kinney, Michael D, edk2-devel@lists.01.org

On 09/06/17 16:47, Gao, Liming wrote:
> Laszlo:
> 
>> * Independently: if the CSM too conflicts with the null detection
>> feature, then we should probably report an "unsupported configuration"
>> error for the following case:
>>
>>   -D NULL_DETECT_ENABLE -D CSM_ENABLE
>>
>> But, I don't know if triggering build errors, with custom error
>> messages, is possible in DSC files.
> 
> Now, there is no support to report the error message for the invalid combination. I think this is a good request to BaseTools. We can introduce !error "error message" syntax in DSC/FDF. If the invalid combination happen, !error will trig build error message. Could you help submit this feature request in BugZillar?

Thank you very much for considering it; I've filed:

https://bugzilla.tianocore.org/show_bug.cgi?id=701

Laszlo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-06 11:16   ` Yao, Jiewen
  2017-09-06 13:18     ` Laszlo Ersek
@ 2017-09-07  0:41     ` Wang, Jian J
  2017-09-07  1:28       ` Yao, Jiewen
  1 sibling, 1 reply; 13+ messages in thread
From: Wang, Jian J @ 2017-09-07  0:41 UTC (permalink / raw)
  To: Yao, Jiewen, Laszlo Ersek, Justen, Jordan L
  Cc: edk2-devel@lists.01.org, Kinney, Michael D

Hi Jiewen and Laszlo,

According to both of your comments:


1.      VBE SHIM must be installed to avoid boot failure of Windows 7.

2.      Enabling NULL detection should not break existing platforms.

Let me clarify a few things in advance. I think there’s a little misunderstanding on this issue (Sorry for my poor description originally).

1.      NULL detection is implemented by disable first 4K page (0-4095). Let me call it page 0.

2.      The ASSERT was not caused by accessing page 0. I do enabling page 0 before access int10 vector in QemuVideoDxe driver while NULL detection is enabled.

3.  The ASSERT will only be triggered by reset because QemuVideoDxe will write int10 vector and memory at that place keeps intact during reset.

4.  The root cause of ASSERT is the fact that page 0 is always cleared in DXE core except to NULL detection enabled. I changed CoreAddRange() in page.c in DXC core like below to avoid page fault in core.
//
  // If memory of type EfiConventionalMemory is being added that includes the page
  // starting at address 0, then zero the page starting at address 0.  This has
  // two benifits.  It helps find NULL pointer bugs and it also maximizes
  // compatibility with operating systems that may evaluate memory in this page
  // for legacy data structures.  If memory of any other type is added starting
  // at address 0, then do not zero the page at address 0 because the page is being
  // used for other purposes.
  //
  if (Type == EfiConventionalMemory && Start == 0 && (End >= EFI_PAGE_SIZE - 1)) {
    if ((PcdGet8(PcdNullPointerDetectionPropertyMask) & BIT0) == 0) {
      SetMem ((VOID *)(UINTN)Start, EFI_PAGE_SIZE, 0);
    }
  }

I think the best solution at present is to update above code to make sure page 0 is always cleared no matter NULL detection is enable or not. To make it possible, I have to enable page 0 before memory clearing and disable it again after then. My concern is that I cannot use CPU arch protocol to do so in CoreAddRange() function because it’s not ready at that time. There’re two options here. One is “manually” disabling page 0 (read cr3, then pm4l, then pde…) in above code; another is moving above code to another place where paging operation is at hand, like DxeIpl or CPU driver. My personal opinion is second one would be better. What’s your opinions? Or any better ideas?

Thanks,
Wang, Jian J

From: Yao, Jiewen
Sent: Wednesday, September 06, 2017 7:17 PM
To: Laszlo Ersek <lersek@redhat.com>; Wang, Jian J <jian.j.wang@intel.com>; Justen, Jordan L <jordan.l.justen@intel.com>
Cc: edk2-devel@lists.01.org; Kinney, Michael D <michael.d.kinney@intel.com>
Subject: RE: ASSERT in QemuVideoDxe driver during reset

HI
I think the NULL detection feature should *never* break any existing platform.
No real function should be skipped for PcdNullDetection.

If InstallVbeShim () is needed for CSM, we should always call InstallVbeShim() for CSM.

I suggest below options:

1)      In OVMF, if CSM is enabled, disable PcdNullDetection.

2)      In OVMF, if any driver need access first 4K page, the code should explicit call SetAttribute(0, 4K, READ|WRITE), if PcdNullDetection is enabled.

Either #1 or #2 is OK.
#1 is simple, and #2 can help detect more potential issue.

Thank you
Yao Jiewen

From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Wednesday, September 6, 2017 6:21 PM
To: Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com>>; Yao, Jiewen <jiewen.yao@intel.com<mailto:jiewen.yao@intel.com>>
Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
Subject: Re: ASSERT in QemuVideoDxe driver during reset

On 09/06/17 10:15, Wang, Jian J wrote:
> Hi guys,
>
> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
> driver during reset. The assert statement is like below.
>
>     ASSERT (Int0x10->Segment == 0x0000);
>     ASSERT (Int0x10->Offset  == 0x0000);
>
> This happened after I have enabled NULL pointer access detection
> feature, in which page 0 (4K)  is disabled.

The NULL pointer access detection conflicts with the VBE shim. For
installing the VBE shim, OVMF has to write to page 0, on purpose.

Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).

> And because of page 0 disabled, I have to skip the memory clearing for
> page 0 in DXE core.

By doing that, you invalidate the following comment in said OVMF commit:

+    //
+    // We managed to allocate the page at zero. SVN r14218 guarantees that it
+    // is NUL-filled.
+    //
+    ASSERT (Int0x10->Segment == 0x0000);
+    ASSERT (Int0x10->Offset  == 0x0000);

Because SVN r14218 was what added the clearing of page 0 to the DXE
core. (Expressed as a git commit: d436d5ca0936.)

> Otherwise it will cause page fault exception there. It seems that QEMU
> may clear all its memory at startup. Skipping the action of clearing
> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
> QEMU will not clear all its memory during warm boot. ASSERT will be
> triggered after reset.

QEMU does not clear guest RAM at reboot (or S3 resume).

> It's easy to fix this issue but there're some subtle situations which
> I'm not quite certain. I'd like your opinions for them.
>
> Here're my thoughts on several solutions:
> a) Remove the ASSERT statement in InstallVbeShim().

Not a good idea.

> But I'm sure if it is safe to do so because I don't quite understand
> the purpose of the ASSERT.

The ASSERT expresses our belief that, after we manage to *allocate* page
0, we can freely write the real mode Int0x10 vector there, without
overwriting anything else.

> b) Instead of skipping clearing page 0, enable it, do clearing and
> then disable it. The problem here is that CPU arch protocol is not
> ready at that time. I have to "manually" do page operation, which
> might be non-portable and a little bit odd in DXE core.
> c) Move code clearing page 0 from DXE core to another place wherever
> appropriate, like DxeIpl or cpu driver. But I think there's a good
> reason to put code there before.

I'm not sure how the NULL pointer access detection is enabled. Is it
enabled by a PCD?

Hm, yes, it seems, from your patch

  [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core

that the PCD is called

  gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection

I can see in that patch that the DXE core is updated *not* to clear page
0 if the PCD is set to TRUE. That makes sense.

However, OVMF should also be updated to skip the VBE shim installation
if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
simply cannot put the legacy Int0x10 vector in place, so we shouldn't
even try.

Please extend your original patch set with a patch for OvmfPkg. In the
file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
site:

> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>       Private->Variant == QEMU_VIDEO_BOCHS) {
>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>   }
> #endif

and please skip the call if the PCD is set to TRUE:

> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>        Private->Variant == QEMU_VIDEO_BOCHS)) {
>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>   }
> #endif

Now, the consequence of this would be a working OVMF build, but it would
also break booting UEFI Windows 7 on OVMF. Therefore, please append
*another* patch to your patch set, setting the Feature PCD to FALSE in
all three OVMF DSC files:

- OvmfPkg/OvmfPkgIa32.dsc
- OvmfPkg/OvmfPkgIa32X64.dsc
- OvmfPkg/OvmfPkgX64.dsc

We cannot break Windows 7 booting in upstream OVMF. If someone really
wants the null pointer detection in OVMF (and doesn't need Windows 7 for
sure), they can patch the DSC files, or else pass the

  --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE

option to the "build" utility.

... I'm now seeing in the original discussion that perhaps you will
introduce a bitmap instead (one bit per firmware phase). That's OK with
me; please reinterpret all of the above with the "DXE bit" then.

When you post the next version of the patch set, please CC Jordan and
myself on the OvmfPkg patches.

Thank you!
Laszlo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-06 15:25         ` Laszlo Ersek
@ 2017-09-07  0:54           ` Yao, Jiewen
  0 siblings, 0 replies; 13+ messages in thread
From: Yao, Jiewen @ 2017-09-07  0:54 UTC (permalink / raw)
  To: Laszlo Ersek, Wang, Jian J, Justen, Jordan L
  Cc: Kinney, Michael D, edk2-devel@lists.01.org

Got it. Thanks.

I think we need validate both UEFI Win7 and Legacy Win7 to make sure they still work.

Thank you
Yao Jiewen

From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Wednesday, September 6, 2017 11:25 PM
To: Yao, Jiewen <jiewen.yao@intel.com>; Wang, Jian J <jian.j.wang@intel.com>; Justen, Jordan L <jordan.l.justen@intel.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>; edk2-devel@lists.01.org
Subject: Re: [edk2] ASSERT in QemuVideoDxe driver during reset

On 09/06/17 17:06, Yao, Jiewen wrote:
> HI Laszlo
> Would you please clarify Window 7 is "UEFI Window 7" or "Legacy Window 7" ?

I mean that you grab a Windows 7 / Windows Server 2008 R2 (X64)
installer ISO, and boot it either in Legacy mode, or in UEFI mode. The
ISO image is bootable in both modes, but even if you boot it in UEFI
mode, it wants to use VBE services.

Some ISOs that I currently have on my disk -- fully legally, of course
--, are

en_windows_server_2008_r2_with_sp1_x64_dvd_617601.iso
en_windows_7_enterprise_n_with_sp1_x64_dvd_u_677704
en_windows_7_professional_with_sp1_x64_dvd_u_676939
en_windows_7_ultimate_n_with_sp1_x86_dvd_u_677597

Thanks,
Laszlo


> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
> Sent: Wednesday, September 6, 2017 9:19 PM
> To: Yao, Jiewen <jiewen.yao@intel.com<mailto:jiewen.yao@intel.com>>; Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com>>
> Cc: Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>; edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>
> Subject: Re: [edk2] ASSERT in QemuVideoDxe driver during reset
>
> On 09/06/17 13:16, Yao, Jiewen wrote:
>> HI
>> I think the NULL detection feature should *never* break any existing platform.
>> No real function should be skipped for PcdNullDetection.
>>
>> If InstallVbeShim () is needed for CSM, we should always call InstallVbeShim() for CSM.
>
> Wait a second please -- the VBE shim exists so that we can boot Windows7
> on OVMF *without* the CSM.
>
> If you build OVMF with the SeaBIOS CSM, then Windows7 will simply boot
> in legacy mode, and work as usual.
>
> But if you build OVMF without a CSM, then Windows7 will break mid-way
> into booting. The reason is that Windows7 insists on some legacy VBE
> services even if it is booted on a pure UEFI system. By faking a minimal
> set of legacy VBE services, Windows7 boots just fine in UEFI mode.
>
> (Windows 7 doesn't actually execute the VBE services' real mode code on
> the processor; instead the code is interpreted in a software emulator.)
>
> The dependency on legacy VBE services is arguably a bug of UEFI Windows
> 7. One way to work it around is to build OVMF with the SeaBIOS CSM, and
> just boot Windows7 in legacy mode.
>
> However, in some environments, building OVMF with a CSM is not
> desirable. That's why we created the VBE shim. It is a terrible hack,
> but it fits the Windows7 bug just fine.
>
>> I suggest below options:
>>
>> 1)       In OVMF, if CSM is enabled, disable PcdNullDetection.
>>
>> 2)       In OVMF, if any driver need access first 4K page, the code should explicit call SetAttribute(0, 4K, READ|WRITE), if PcdNullDetection is enabled.
>>
>> Either #1 or #2 is OK.
>> #1 is simple, and #2 can help detect more potential issue.
>
> * Regarding #1:
>
> As I wrote above, the VBE shim is mutually exclusive with CSM:
>
>   //
>   // The allocation request may fail, eg. if LegacyBiosDxe has already run.
>   //
>
> So, if the CSM was built into OVMF, then the VBE shim is already skipped
> at boot.
>
>
> * Regarding #2:
>
> I agree that *temporarily* enabling R/W access to page zero, using the
> appropriate DXE service, could make the VBE Shim installation work.
>
> However, making page 0 unaccessible right after, could very easily break
> Windows 7, if some component of Windows 7 accessed page 0, before
> setting up new page tables. After all, in the InstallVbeShim() function,
> we write stuff to page 0 *because* Windows 7 will read it from there.
>
>
> * If it is more convenient than a "--pcd=..." option for "build", we can
> also add another build-time macro to the OVMF DSC files, called
> NULL_DETECT_ENABLE. This would then turn on the PcdNullDetection bits,
> for "-D NULL_DETECT_ENABLE".
>
> NULL_DETECT_ENABLE should default to FALSE.
>
>
> * Independently: if the CSM too conflicts with the null detection
> feature, then we should probably report an "unsupported configuration"
> error for the following case:
>
>   -D NULL_DETECT_ENABLE -D CSM_ENABLE
>
> But, I don't know if triggering build errors, with custom error
> messages, is possible in DSC files.
>
> Thanks
> Laszlo
>
>
>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>> Sent: Wednesday, September 6, 2017 6:21 PM
>> To: Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com<mailto:jian.j.wang@intel.com%3cmailto:jian.j.wang@intel.com>>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com%3cmailto:jordan.l.justen@intel.com>>>; Yao, Jiewen <jiewen.yao@intel.com<mailto:jiewen.yao@intel.com<mailto:jiewen.yao@intel.com%3cmailto:jiewen.yao@intel.com>>>
>> Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org%3cmailto:edk2-devel@lists.01.org>>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com%3cmailto:michael.d.kinney@intel.com>>>
>> Subject: Re: ASSERT in QemuVideoDxe driver during reset
>>
>> On 09/06/17 10:15, Wang, Jian J wrote:
>>> Hi guys,
>>>
>>> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
>>> driver during reset. The assert statement is like below.
>>>
>>>     ASSERT (Int0x10->Segment == 0x0000);
>>>     ASSERT (Int0x10->Offset  == 0x0000);
>>>
>>> This happened after I have enabled NULL pointer access detection
>>> feature, in which page 0 (4K)  is disabled.
>>
>> The NULL pointer access detection conflicts with the VBE shim. For
>> installing the VBE shim, OVMF has to write to page 0, on purpose.
>>
>> Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
>> Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).
>>
>>> And because of page 0 disabled, I have to skip the memory clearing for
>>> page 0 in DXE core.
>>
>> By doing that, you invalidate the following comment in said OVMF commit:
>>
>> +    //
>> +    // We managed to allocate the page at zero. SVN r14218 guarantees that it
>> +    // is NUL-filled.
>> +    //
>> +    ASSERT (Int0x10->Segment == 0x0000);
>> +    ASSERT (Int0x10->Offset  == 0x0000);
>>
>> Because SVN r14218 was what added the clearing of page 0 to the DXE
>> core. (Expressed as a git commit: d436d5ca0936.)
>>
>>> Otherwise it will cause page fault exception there. It seems that QEMU
>>> may clear all its memory at startup. Skipping the action of clearing
>>> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
>>> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
>>> QEMU will not clear all its memory during warm boot. ASSERT will be
>>> triggered after reset.
>>
>> QEMU does not clear guest RAM at reboot (or S3 resume).
>>
>>> It's easy to fix this issue but there're some subtle situations which
>>> I'm not quite certain. I'd like your opinions for them.
>>>
>>> Here're my thoughts on several solutions:
>>> a) Remove the ASSERT statement in InstallVbeShim().
>>
>> Not a good idea.
>>
>>> But I'm sure if it is safe to do so because I don't quite understand
>>> the purpose of the ASSERT.
>>
>> The ASSERT expresses our belief that, after we manage to *allocate* page
>> 0, we can freely write the real mode Int0x10 vector there, without
>> overwriting anything else.
>>
>>> b) Instead of skipping clearing page 0, enable it, do clearing and
>>> then disable it. The problem here is that CPU arch protocol is not
>>> ready at that time. I have to "manually" do page operation, which
>>> might be non-portable and a little bit odd in DXE core.
>>> c) Move code clearing page 0 from DXE core to another place wherever
>>> appropriate, like DxeIpl or cpu driver. But I think there's a good
>>> reason to put code there before.
>>
>> I'm not sure how the NULL pointer access detection is enabled. Is it
>> enabled by a PCD?
>>
>> Hm, yes, it seems, from your patch
>>
>>   [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core
>>
>> that the PCD is called
>>
>>   gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection
>>
>> I can see in that patch that the DXE core is updated *not* to clear page
>> 0 if the PCD is set to TRUE. That makes sense.
>>
>> However, OVMF should also be updated to skip the VBE shim installation
>> if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
>> simply cannot put the legacy Int0x10 vector in place, so we shouldn't
>> even try.
>>
>> Please extend your original patch set with a patch for OvmfPkg. In the
>> file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
>> site:
>>
>>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>>       Private->Variant == QEMU_VIDEO_BOCHS) {
>>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>>   }
>>> #endif
>>
>> and please skip the call if the PCD is set to TRUE:
>>
>>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
>>>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>>        Private->Variant == QEMU_VIDEO_BOCHS)) {
>>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>>   }
>>> #endif
>>
>> Now, the consequence of this would be a working OVMF build, but it would
>> also break booting UEFI Windows 7 on OVMF. Therefore, please append
>> *another* patch to your patch set, setting the Feature PCD to FALSE in
>> all three OVMF DSC files:
>>
>> - OvmfPkg/OvmfPkgIa32.dsc
>> - OvmfPkg/OvmfPkgIa32X64.dsc
>> - OvmfPkg/OvmfPkgX64.dsc
>>
>> We cannot break Windows 7 booting in upstream OVMF. If someone really
>> wants the null pointer detection in OVMF (and doesn't need Windows 7 for
>> sure), they can patch the DSC files, or else pass the
>>
>>   --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE
>>
>> option to the "build" utility.
>>
>> ... I'm now seeing in the original discussion that perhaps you will
>> introduce a bitmap instead (one bit per firmware phase). That's OK with
>> me; please reinterpret all of the above with the "DXE bit" then.
>>
>> When you post the next version of the patch set, please CC Jordan and
>> myself on the OvmfPkg patches.
>>
>> Thank you!
>> Laszlo
>>
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org%3cmailto:edk2-devel@lists.01.org>>
> https://lists.01.org/mailman/listinfo/edk2-devel
>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-07  0:41     ` Wang, Jian J
@ 2017-09-07  1:28       ` Yao, Jiewen
  2017-09-07 10:58         ` Laszlo Ersek
  0 siblings, 1 reply; 13+ messages in thread
From: Yao, Jiewen @ 2017-09-07  1:28 UTC (permalink / raw)
  To: Wang, Jian J, Laszlo Ersek, Justen, Jordan L
  Cc: edk2-devel@lists.01.org, Kinney, Michael D

OK. Got you.

I think we can move the zero memory to DxeIpl.
The DxeIpl can check if the 0 is allocated in MemoryAllocationHob, and zero it if it is not allocated.

Thank you
Yao Jiewen

From: Wang, Jian J
Sent: Thursday, September 7, 2017 8:41 AM
To: Yao, Jiewen <jiewen.yao@intel.com>; Laszlo Ersek <lersek@redhat.com>; Justen, Jordan L <jordan.l.justen@intel.com>
Cc: edk2-devel@lists.01.org; Kinney, Michael D <michael.d.kinney@intel.com>
Subject: RE: ASSERT in QemuVideoDxe driver during reset

Hi Jiewen and Laszlo,

According to both of your comments:


1.       VBE SHIM must be installed to avoid boot failure of Windows 7.

2.       Enabling NULL detection should not break existing platforms.

Let me clarify a few things in advance. I think there’s a little misunderstanding on this issue (Sorry for my poor description originally).

1.       NULL detection is implemented by disable first 4K page (0-4095). Let me call it page 0.

2.       The ASSERT was not caused by accessing page 0. I do enabling page 0 before access int10 vector in QemuVideoDxe driver while NULL detection is enabled.

3.  The ASSERT will only be triggered by reset because QemuVideoDxe will write int10 vector and memory at that place keeps intact during reset.

4.  The root cause of ASSERT is the fact that page 0 is always cleared in DXE core except to NULL detection enabled. I changed CoreAddRange() in page.c in DXC core like below to avoid page fault in core.
//
  // If memory of type EfiConventionalMemory is being added that includes the page
  // starting at address 0, then zero the page starting at address 0.  This has
  // two benifits.  It helps find NULL pointer bugs and it also maximizes
  // compatibility with operating systems that may evaluate memory in this page
  // for legacy data structures.  If memory of any other type is added starting
  // at address 0, then do not zero the page at address 0 because the page is being
  // used for other purposes.
  //
  if (Type == EfiConventionalMemory && Start == 0 && (End >= EFI_PAGE_SIZE - 1)) {
    if ((PcdGet8(PcdNullPointerDetectionPropertyMask) & BIT0) == 0) {
      SetMem ((VOID *)(UINTN)Start, EFI_PAGE_SIZE, 0);
    }
  }

I think the best solution at present is to update above code to make sure page 0 is always cleared no matter NULL detection is enable or not. To make it possible, I have to enable page 0 before memory clearing and disable it again after then. My concern is that I cannot use CPU arch protocol to do so in CoreAddRange() function because it’s not ready at that time. There’re two options here. One is “manually” disabling page 0 (read cr3, then pm4l, then pde…) in above code; another is moving above code to another place where paging operation is at hand, like DxeIpl or CPU driver. My personal opinion is second one would be better. What’s your opinions? Or any better ideas?

Thanks,
Wang, Jian J

From: Yao, Jiewen
Sent: Wednesday, September 06, 2017 7:17 PM
To: Laszlo Ersek <lersek@redhat.com<mailto:lersek@redhat.com>>; Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com>>
Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
Subject: RE: ASSERT in QemuVideoDxe driver during reset

HI
I think the NULL detection feature should *never* break any existing platform.
No real function should be skipped for PcdNullDetection.

If InstallVbeShim () is needed for CSM, we should always call InstallVbeShim() for CSM.

I suggest below options:

1)       In OVMF, if CSM is enabled, disable PcdNullDetection.

2)       In OVMF, if any driver need access first 4K page, the code should explicit call SetAttribute(0, 4K, READ|WRITE), if PcdNullDetection is enabled.

Either #1 or #2 is OK.
#1 is simple, and #2 can help detect more potential issue.

Thank you
Yao Jiewen

From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Wednesday, September 6, 2017 6:21 PM
To: Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com>>; Yao, Jiewen <jiewen.yao@intel.com<mailto:jiewen.yao@intel.com>>
Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
Subject: Re: ASSERT in QemuVideoDxe driver during reset

On 09/06/17 10:15, Wang, Jian J wrote:
> Hi guys,
>
> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
> driver during reset. The assert statement is like below.
>
>     ASSERT (Int0x10->Segment == 0x0000);
>     ASSERT (Int0x10->Offset  == 0x0000);
>
> This happened after I have enabled NULL pointer access detection
> feature, in which page 0 (4K)  is disabled.

The NULL pointer access detection conflicts with the VBE shim. For
installing the VBE shim, OVMF has to write to page 0, on purpose.

Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).

> And because of page 0 disabled, I have to skip the memory clearing for
> page 0 in DXE core.

By doing that, you invalidate the following comment in said OVMF commit:

+    //
+    // We managed to allocate the page at zero. SVN r14218 guarantees that it
+    // is NUL-filled.
+    //
+    ASSERT (Int0x10->Segment == 0x0000);
+    ASSERT (Int0x10->Offset  == 0x0000);

Because SVN r14218 was what added the clearing of page 0 to the DXE
core. (Expressed as a git commit: d436d5ca0936.)

> Otherwise it will cause page fault exception there. It seems that QEMU
> may clear all its memory at startup. Skipping the action of clearing
> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
> QEMU will not clear all its memory during warm boot. ASSERT will be
> triggered after reset.

QEMU does not clear guest RAM at reboot (or S3 resume).

> It's easy to fix this issue but there're some subtle situations which
> I'm not quite certain. I'd like your opinions for them.
>
> Here're my thoughts on several solutions:
> a) Remove the ASSERT statement in InstallVbeShim().

Not a good idea.

> But I'm sure if it is safe to do so because I don't quite understand
> the purpose of the ASSERT.

The ASSERT expresses our belief that, after we manage to *allocate* page
0, we can freely write the real mode Int0x10 vector there, without
overwriting anything else.

> b) Instead of skipping clearing page 0, enable it, do clearing and
> then disable it. The problem here is that CPU arch protocol is not
> ready at that time. I have to "manually" do page operation, which
> might be non-portable and a little bit odd in DXE core.
> c) Move code clearing page 0 from DXE core to another place wherever
> appropriate, like DxeIpl or cpu driver. But I think there's a good
> reason to put code there before.

I'm not sure how the NULL pointer access detection is enabled. Is it
enabled by a PCD?

Hm, yes, it seems, from your patch

  [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core

that the PCD is called

  gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection

I can see in that patch that the DXE core is updated *not* to clear page
0 if the PCD is set to TRUE. That makes sense.

However, OVMF should also be updated to skip the VBE shim installation
if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
simply cannot put the legacy Int0x10 vector in place, so we shouldn't
even try.

Please extend your original patch set with a patch for OvmfPkg. In the
file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
site:

> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>       Private->Variant == QEMU_VIDEO_BOCHS) {
>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>   }
> #endif

and please skip the call if the PCD is set to TRUE:

> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>        Private->Variant == QEMU_VIDEO_BOCHS)) {
>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>   }
> #endif

Now, the consequence of this would be a working OVMF build, but it would
also break booting UEFI Windows 7 on OVMF. Therefore, please append
*another* patch to your patch set, setting the Feature PCD to FALSE in
all three OVMF DSC files:

- OvmfPkg/OvmfPkgIa32.dsc
- OvmfPkg/OvmfPkgIa32X64.dsc
- OvmfPkg/OvmfPkgX64.dsc

We cannot break Windows 7 booting in upstream OVMF. If someone really
wants the null pointer detection in OVMF (and doesn't need Windows 7 for
sure), they can patch the DSC files, or else pass the

  --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE

option to the "build" utility.

... I'm now seeing in the original discussion that perhaps you will
introduce a bitmap instead (one bit per firmware phase). That's OK with
me; please reinterpret all of the above with the "DXE bit" then.

When you post the next version of the patch set, please CC Jordan and
myself on the OvmfPkg patches.

Thank you!
Laszlo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-07  1:28       ` Yao, Jiewen
@ 2017-09-07 10:58         ` Laszlo Ersek
  2017-09-07 11:16           ` Yao, Jiewen
  0 siblings, 1 reply; 13+ messages in thread
From: Laszlo Ersek @ 2017-09-07 10:58 UTC (permalink / raw)
  To: Yao, Jiewen, Wang, Jian J, Justen, Jordan L
  Cc: edk2-devel@lists.01.org, Kinney, Michael D

On 09/07/17 03:28, Yao, Jiewen wrote:
> OK. Got you.
> 
> I think we can move the zero memory to DxeIpl.
> The DxeIpl can check if the 0 is allocated in MemoryAllocationHob, and zero it if it is not allocated.

Regarding the zeroing out of page 0, this solution seems good to me as
well. (I agree that page 0 should be zeroed out even if the null pointer
detection is enabled.)

Regarding the question whether QemuVideoDxe should, or should not,
manually change the access permissions to page 0, before and after
setting up the Int0x10 vector, if null pointer detection is enabled:

This depends on Windows 7. If Windows 7 can boot fine with the null
pointer detection enabled, then I agree QemuVideoDxe should be updated
as described above. (For massaging the page protection, it should use
DXE services, not CpuArch services.) In this case it is also fine for
OVMF to inherit the default TRUE setting for the PCD.

If Windows 7 cannot boot with null pointer detection enabled, then OVMF
should set the PCD to FALSE, and QemuVideoDxe should be changed to not
call InstallVbeShim() at all if the PCD is TRUE.

Thanks
Laszlo

> From: Wang, Jian J
> Sent: Thursday, September 7, 2017 8:41 AM
> To: Yao, Jiewen <jiewen.yao@intel.com>; Laszlo Ersek <lersek@redhat.com>; Justen, Jordan L <jordan.l.justen@intel.com>
> Cc: edk2-devel@lists.01.org; Kinney, Michael D <michael.d.kinney@intel.com>
> Subject: RE: ASSERT in QemuVideoDxe driver during reset
> 
> Hi Jiewen and Laszlo,
> 
> According to both of your comments:
> 
> 
> 1.       VBE SHIM must be installed to avoid boot failure of Windows 7.
> 
> 2.       Enabling NULL detection should not break existing platforms.
> 
> Let me clarify a few things in advance. I think there’s a little misunderstanding on this issue (Sorry for my poor description originally).
> 
> 1.       NULL detection is implemented by disable first 4K page (0-4095). Let me call it page 0.
> 
> 2.       The ASSERT was not caused by accessing page 0. I do enabling page 0 before access int10 vector in QemuVideoDxe driver while NULL detection is enabled.
> 
> 3.  The ASSERT will only be triggered by reset because QemuVideoDxe will write int10 vector and memory at that place keeps intact during reset.
> 
> 4.  The root cause of ASSERT is the fact that page 0 is always cleared in DXE core except to NULL detection enabled. I changed CoreAddRange() in page.c in DXC core like below to avoid page fault in core.
> //
>   // If memory of type EfiConventionalMemory is being added that includes the page
>   // starting at address 0, then zero the page starting at address 0.  This has
>   // two benifits.  It helps find NULL pointer bugs and it also maximizes
>   // compatibility with operating systems that may evaluate memory in this page
>   // for legacy data structures.  If memory of any other type is added starting
>   // at address 0, then do not zero the page at address 0 because the page is being
>   // used for other purposes.
>   //
>   if (Type == EfiConventionalMemory && Start == 0 && (End >= EFI_PAGE_SIZE - 1)) {
>     if ((PcdGet8(PcdNullPointerDetectionPropertyMask) & BIT0) == 0) {
>       SetMem ((VOID *)(UINTN)Start, EFI_PAGE_SIZE, 0);
>     }
>   }
> 
> I think the best solution at present is to update above code to make sure page 0 is always cleared no matter NULL detection is enable or not. To make it possible, I have to enable page 0 before memory clearing and disable it again after then. My concern is that I cannot use CPU arch protocol to do so in CoreAddRange() function because it’s not ready at that time. There’re two options here. One is “manually” disabling page 0 (read cr3, then pm4l, then pde…) in above code; another is moving above code to another place where paging operation is at hand, like DxeIpl or CPU driver. My personal opinion is second one would be better. What’s your opinions? Or any better ideas?
> 
> Thanks,
> Wang, Jian J
> 
> From: Yao, Jiewen
> Sent: Wednesday, September 06, 2017 7:17 PM
> To: Laszlo Ersek <lersek@redhat.com<mailto:lersek@redhat.com>>; Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com>>
> Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
> Subject: RE: ASSERT in QemuVideoDxe driver during reset
> 
> HI
> I think the NULL detection feature should *never* break any existing platform.
> No real function should be skipped for PcdNullDetection.
> 
> If InstallVbeShim () is needed for CSM, we should always call InstallVbeShim() for CSM.
> 
> I suggest below options:
> 
> 1)       In OVMF, if CSM is enabled, disable PcdNullDetection.
> 
> 2)       In OVMF, if any driver need access first 4K page, the code should explicit call SetAttribute(0, 4K, READ|WRITE), if PcdNullDetection is enabled.
> 
> Either #1 or #2 is OK.
> #1 is simple, and #2 can help detect more potential issue.
> 
> Thank you
> Yao Jiewen
> 
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Wednesday, September 6, 2017 6:21 PM
> To: Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com>>; Yao, Jiewen <jiewen.yao@intel.com<mailto:jiewen.yao@intel.com>>
> Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
> Subject: Re: ASSERT in QemuVideoDxe driver during reset
> 
> On 09/06/17 10:15, Wang, Jian J wrote:
>> Hi guys,
>>
>> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
>> driver during reset. The assert statement is like below.
>>
>>     ASSERT (Int0x10->Segment == 0x0000);
>>     ASSERT (Int0x10->Offset  == 0x0000);
>>
>> This happened after I have enabled NULL pointer access detection
>> feature, in which page 0 (4K)  is disabled.
> 
> The NULL pointer access detection conflicts with the VBE shim. For
> installing the VBE shim, OVMF has to write to page 0, on purpose.
> 
> Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
> Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).
> 
>> And because of page 0 disabled, I have to skip the memory clearing for
>> page 0 in DXE core.
> 
> By doing that, you invalidate the following comment in said OVMF commit:
> 
> +    //
> +    // We managed to allocate the page at zero. SVN r14218 guarantees that it
> +    // is NUL-filled.
> +    //
> +    ASSERT (Int0x10->Segment == 0x0000);
> +    ASSERT (Int0x10->Offset  == 0x0000);
> 
> Because SVN r14218 was what added the clearing of page 0 to the DXE
> core. (Expressed as a git commit: d436d5ca0936.)
> 
>> Otherwise it will cause page fault exception there. It seems that QEMU
>> may clear all its memory at startup. Skipping the action of clearing
>> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
>> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
>> QEMU will not clear all its memory during warm boot. ASSERT will be
>> triggered after reset.
> 
> QEMU does not clear guest RAM at reboot (or S3 resume).
> 
>> It's easy to fix this issue but there're some subtle situations which
>> I'm not quite certain. I'd like your opinions for them.
>>
>> Here're my thoughts on several solutions:
>> a) Remove the ASSERT statement in InstallVbeShim().
> 
> Not a good idea.
> 
>> But I'm sure if it is safe to do so because I don't quite understand
>> the purpose of the ASSERT.
> 
> The ASSERT expresses our belief that, after we manage to *allocate* page
> 0, we can freely write the real mode Int0x10 vector there, without
> overwriting anything else.
> 
>> b) Instead of skipping clearing page 0, enable it, do clearing and
>> then disable it. The problem here is that CPU arch protocol is not
>> ready at that time. I have to "manually" do page operation, which
>> might be non-portable and a little bit odd in DXE core.
>> c) Move code clearing page 0 from DXE core to another place wherever
>> appropriate, like DxeIpl or cpu driver. But I think there's a good
>> reason to put code there before.
> 
> I'm not sure how the NULL pointer access detection is enabled. Is it
> enabled by a PCD?
> 
> Hm, yes, it seems, from your patch
> 
>   [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core
> 
> that the PCD is called
> 
>   gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection
> 
> I can see in that patch that the DXE core is updated *not* to clear page
> 0 if the PCD is set to TRUE. That makes sense.
> 
> However, OVMF should also be updated to skip the VBE shim installation
> if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
> simply cannot put the legacy Int0x10 vector in place, so we shouldn't
> even try.
> 
> Please extend your original patch set with a patch for OvmfPkg. In the
> file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
> site:
> 
>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>       Private->Variant == QEMU_VIDEO_BOCHS) {
>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>   }
>> #endif
> 
> and please skip the call if the PCD is set to TRUE:
> 
>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
>>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>        Private->Variant == QEMU_VIDEO_BOCHS)) {
>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>   }
>> #endif
> 
> Now, the consequence of this would be a working OVMF build, but it would
> also break booting UEFI Windows 7 on OVMF. Therefore, please append
> *another* patch to your patch set, setting the Feature PCD to FALSE in
> all three OVMF DSC files:
> 
> - OvmfPkg/OvmfPkgIa32.dsc
> - OvmfPkg/OvmfPkgIa32X64.dsc
> - OvmfPkg/OvmfPkgX64.dsc
> 
> We cannot break Windows 7 booting in upstream OVMF. If someone really
> wants the null pointer detection in OVMF (and doesn't need Windows 7 for
> sure), they can patch the DSC files, or else pass the
> 
>   --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE
> 
> option to the "build" utility.
> 
> ... I'm now seeing in the original discussion that perhaps you will
> introduce a bitmap instead (one bit per firmware phase). That's OK with
> me; please reinterpret all of the above with the "DXE bit" then.
> 
> When you post the next version of the patch set, please CC Jordan and
> myself on the OvmfPkg patches.
> 
> Thank you!
> Laszlo
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ASSERT in QemuVideoDxe driver during reset
  2017-09-07 10:58         ` Laszlo Ersek
@ 2017-09-07 11:16           ` Yao, Jiewen
  0 siblings, 0 replies; 13+ messages in thread
From: Yao, Jiewen @ 2017-09-07 11:16 UTC (permalink / raw)
  To: Laszlo Ersek, Wang, Jian J, Justen, Jordan L
  Cc: edk2-devel@lists.01.org, Kinney, Michael D

Agree. We should not break Window7.
Let’s see the result.

Thank you
Yao Jiewen

From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Thursday, September 7, 2017 6:58 PM
To: Yao, Jiewen <jiewen.yao@intel.com>; Wang, Jian J <jian.j.wang@intel.com>; Justen, Jordan L <jordan.l.justen@intel.com>
Cc: edk2-devel@lists.01.org; Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: ASSERT in QemuVideoDxe driver during reset

On 09/07/17 03:28, Yao, Jiewen wrote:
> OK. Got you.
>
> I think we can move the zero memory to DxeIpl.
> The DxeIpl can check if the 0 is allocated in MemoryAllocationHob, and zero it if it is not allocated.

Regarding the zeroing out of page 0, this solution seems good to me as
well. (I agree that page 0 should be zeroed out even if the null pointer
detection is enabled.)

Regarding the question whether QemuVideoDxe should, or should not,
manually change the access permissions to page 0, before and after
setting up the Int0x10 vector, if null pointer detection is enabled:

This depends on Windows 7. If Windows 7 can boot fine with the null
pointer detection enabled, then I agree QemuVideoDxe should be updated
as described above. (For massaging the page protection, it should use
DXE services, not CpuArch services.) In this case it is also fine for
OVMF to inherit the default TRUE setting for the PCD.

If Windows 7 cannot boot with null pointer detection enabled, then OVMF
should set the PCD to FALSE, and QemuVideoDxe should be changed to not
call InstallVbeShim() at all if the PCD is TRUE.

Thanks
Laszlo

> From: Wang, Jian J
> Sent: Thursday, September 7, 2017 8:41 AM
> To: Yao, Jiewen <jiewen.yao@intel.com<mailto:jiewen.yao@intel.com>>; Laszlo Ersek <lersek@redhat.com<mailto:lersek@redhat.com>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com>>
> Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
> Subject: RE: ASSERT in QemuVideoDxe driver during reset
>
> Hi Jiewen and Laszlo,
>
> According to both of your comments:
>
>
> 1.       VBE SHIM must be installed to avoid boot failure of Windows 7.
>
> 2.       Enabling NULL detection should not break existing platforms.
>
> Let me clarify a few things in advance. I think there’s a little misunderstanding on this issue (Sorry for my poor description originally).
>
> 1.       NULL detection is implemented by disable first 4K page (0-4095). Let me call it page 0.
>
> 2.       The ASSERT was not caused by accessing page 0. I do enabling page 0 before access int10 vector in QemuVideoDxe driver while NULL detection is enabled.
>
> 3.  The ASSERT will only be triggered by reset because QemuVideoDxe will write int10 vector and memory at that place keeps intact during reset.
>
> 4.  The root cause of ASSERT is the fact that page 0 is always cleared in DXE core except to NULL detection enabled. I changed CoreAddRange() in page.c in DXC core like below to avoid page fault in core.
> //
>   // If memory of type EfiConventionalMemory is being added that includes the page
>   // starting at address 0, then zero the page starting at address 0.  This has
>   // two benifits.  It helps find NULL pointer bugs and it also maximizes
>   // compatibility with operating systems that may evaluate memory in this page
>   // for legacy data structures.  If memory of any other type is added starting
>   // at address 0, then do not zero the page at address 0 because the page is being
>   // used for other purposes.
>   //
>   if (Type == EfiConventionalMemory && Start == 0 && (End >= EFI_PAGE_SIZE - 1)) {
>     if ((PcdGet8(PcdNullPointerDetectionPropertyMask) & BIT0) == 0) {
>       SetMem ((VOID *)(UINTN)Start, EFI_PAGE_SIZE, 0);
>     }
>   }
>
> I think the best solution at present is to update above code to make sure page 0 is always cleared no matter NULL detection is enable or not. To make it possible, I have to enable page 0 before memory clearing and disable it again after then. My concern is that I cannot use CPU arch protocol to do so in CoreAddRange() function because it’s not ready at that time. There’re two options here. One is “manually” disabling page 0 (read cr3, then pm4l, then pde…) in above code; another is moving above code to another place where paging operation is at hand, like DxeIpl or CPU driver. My personal opinion is second one would be better. What’s your opinions? Or any better ideas?
>
> Thanks,
> Wang, Jian J
>
> From: Yao, Jiewen
> Sent: Wednesday, September 06, 2017 7:17 PM
> To: Laszlo Ersek <lersek@redhat.com<mailto:lersek@redhat.com<mailto:lersek@redhat.com%3cmailto:lersek@redhat.com>>>; Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com<mailto:jian.j.wang@intel.com%3cmailto:jian.j.wang@intel.com>>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com%3cmailto:jordan.l.justen@intel.com>>>
> Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org%3cmailto:edk2-devel@lists.01.org>>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com%3cmailto:michael.d.kinney@intel.com>>>
> Subject: RE: ASSERT in QemuVideoDxe driver during reset
>
> HI
> I think the NULL detection feature should *never* break any existing platform.
> No real function should be skipped for PcdNullDetection.
>
> If InstallVbeShim () is needed for CSM, we should always call InstallVbeShim() for CSM.
>
> I suggest below options:
>
> 1)       In OVMF, if CSM is enabled, disable PcdNullDetection.
>
> 2)       In OVMF, if any driver need access first 4K page, the code should explicit call SetAttribute(0, 4K, READ|WRITE), if PcdNullDetection is enabled.
>
> Either #1 or #2 is OK.
> #1 is simple, and #2 can help detect more potential issue.
>
> Thank you
> Yao Jiewen
>
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Wednesday, September 6, 2017 6:21 PM
> To: Wang, Jian J <jian.j.wang@intel.com<mailto:jian.j.wang@intel.com<mailto:jian.j.wang@intel.com%3cmailto:jian.j.wang@intel.com>>>; Justen, Jordan L <jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com<mailto:jordan.l.justen@intel.com%3cmailto:jordan.l.justen@intel.com>>>; Yao, Jiewen <jiewen.yao@intel.com<mailto:jiewen.yao@intel.com<mailto:jiewen.yao@intel.com%3cmailto:jiewen.yao@intel.com>>>
> Cc: edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org<mailto:edk2-devel@lists.01.org%3cmailto:edk2-devel@lists.01.org>>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com%3cmailto:michael.d.kinney@intel.com>>>
> Subject: Re: ASSERT in QemuVideoDxe driver during reset
>
> On 09/06/17 10:15, Wang, Jian J wrote:
>> Hi guys,
>>
>> I found an ASSERT issue in function InstallVbeShim() in QemuVideoDxe
>> driver during reset. The assert statement is like below.
>>
>>     ASSERT (Int0x10->Segment == 0x0000);
>>     ASSERT (Int0x10->Offset  == 0x0000);
>>
>> This happened after I have enabled NULL pointer access detection
>> feature, in which page 0 (4K)  is disabled.
>
> The NULL pointer access detection conflicts with the VBE shim. For
> installing the VBE shim, OVMF has to write to page 0, on purpose.
>
> Please see commit 90803342b1b6 ("OvmfPkg: QemuVideoDxe: Int10h stub for
> Windows 7 & 2008 (stdvga, QXL)", 2014-05-20).
>
>> And because of page 0 disabled, I have to skip the memory clearing for
>> page 0 in DXE core.
>
> By doing that, you invalidate the following comment in said OVMF commit:
>
> +    //
> +    // We managed to allocate the page at zero. SVN r14218 guarantees that it
> +    // is NUL-filled.
> +    //
> +    ASSERT (Int0x10->Segment == 0x0000);
> +    ASSERT (Int0x10->Offset  == 0x0000);
>
> Because SVN r14218 was what added the clearing of page 0 to the DXE
> core. (Expressed as a git commit: d436d5ca0936.)
>
>> Otherwise it will cause page fault exception there. It seems that QEMU
>> may clear all its memory at startup. Skipping the action of clearing
>> page 0 in core won't cause ASSERT issue in QemuVideoDxe, for the first
>> time boot. But QemuVideoDxe will write int10 vector at memory 0x10 and
>> QEMU will not clear all its memory during warm boot. ASSERT will be
>> triggered after reset.
>
> QEMU does not clear guest RAM at reboot (or S3 resume).
>
>> It's easy to fix this issue but there're some subtle situations which
>> I'm not quite certain. I'd like your opinions for them.
>>
>> Here're my thoughts on several solutions:
>> a) Remove the ASSERT statement in InstallVbeShim().
>
> Not a good idea.
>
>> But I'm sure if it is safe to do so because I don't quite understand
>> the purpose of the ASSERT.
>
> The ASSERT expresses our belief that, after we manage to *allocate* page
> 0, we can freely write the real mode Int0x10 vector there, without
> overwriting anything else.
>
>> b) Instead of skipping clearing page 0, enable it, do clearing and
>> then disable it. The problem here is that CPU arch protocol is not
>> ready at that time. I have to "manually" do page operation, which
>> might be non-portable and a little bit odd in DXE core.
>> c) Move code clearing page 0 from DXE core to another place wherever
>> appropriate, like DxeIpl or cpu driver. But I think there's a good
>> reason to put code there before.
>
> I'm not sure how the NULL pointer access detection is enabled. Is it
> enabled by a PCD?
>
> Hm, yes, it seems, from your patch
>
>   [edk2] [PATCH 1/2] Implement NULL pointer detection for EDK-II Core
>
> that the PCD is called
>
>   gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection
>
> I can see in that patch that the DXE core is updated *not* to clear page
> 0 if the PCD is set to TRUE. That makes sense.
>
> However, OVMF should also be updated to skip the VBE shim installation
> if the PCD is set to TRUE. Namely, if the PCD is set to TRUE, then we
> simply cannot put the legacy Int0x10 vector in place, so we shouldn't
> even try.
>
> Please extend your original patch set with a patch for OvmfPkg. In the
> file "OvmfPkg/QemuVideoDxe/Driver.c", locate the InstallVbeShim() call
> site:
>
>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>   if (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>       Private->Variant == QEMU_VIDEO_BOCHS) {
>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>   }
>> #endif
>
> and please skip the call if the PCD is set to TRUE:
>
>> #if defined MDE_CPU_IA32 || defined MDE_CPU_X64
>>   if (!FeaturePcdGet (PcdNullPointerDetection) &&
>>       (Private->Variant == QEMU_VIDEO_BOCHS_MMIO ||
>>        Private->Variant == QEMU_VIDEO_BOCHS)) {
>>     InstallVbeShim (Card->Name, Private->GraphicsOutput.Mode->FrameBufferBase);
>>   }
>> #endif
>
> Now, the consequence of this would be a working OVMF build, but it would
> also break booting UEFI Windows 7 on OVMF. Therefore, please append
> *another* patch to your patch set, setting the Feature PCD to FALSE in
> all three OVMF DSC files:
>
> - OvmfPkg/OvmfPkgIa32.dsc
> - OvmfPkg/OvmfPkgIa32X64.dsc
> - OvmfPkg/OvmfPkgX64.dsc
>
> We cannot break Windows 7 booting in upstream OVMF. If someone really
> wants the null pointer detection in OVMF (and doesn't need Windows 7 for
> sure), they can patch the DSC files, or else pass the
>
>   --pcd=gEfiMdeModulePkgTokenSpaceGuid.PcdNullPointerDetection=TRUE
>
> option to the "build" utility.
>
> ... I'm now seeing in the original discussion that perhaps you will
> introduce a bitmap instead (one bit per firmware phase). That's OK with
> me; please reinterpret all of the above with the "DXE bit" then.
>
> When you post the next version of the patch set, please CC Jordan and
> myself on the OvmfPkg patches.
>
> Thank you!
> Laszlo
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-09-07 11:13 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-09-06  8:15 ASSERT in QemuVideoDxe driver during reset Wang, Jian J
2017-09-06 10:20 ` Laszlo Ersek
2017-09-06 11:16   ` Yao, Jiewen
2017-09-06 13:18     ` Laszlo Ersek
2017-09-06 14:47       ` Gao, Liming
2017-09-06 15:35         ` Laszlo Ersek
2017-09-06 15:06       ` Yao, Jiewen
2017-09-06 15:25         ` Laszlo Ersek
2017-09-07  0:54           ` Yao, Jiewen
2017-09-07  0:41     ` Wang, Jian J
2017-09-07  1:28       ` Yao, Jiewen
2017-09-07 10:58         ` Laszlo Ersek
2017-09-07 11:16           ` Yao, Jiewen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox