From: Stefano Garzarella <sgarzare@redhat.com>
To: Oliver Steffen <osteffen@redhat.com>
Cc: devel@edk2.groups.io, Gerd Hoffmann <kraxel@redhat.com>,
Jiewen Yao <jiewen.yao@intel.com>,
Zachary Clark-williams <zachary.clark-williams@intel.com>,
Saloni Kasbekar <saloni.kasbekar@intel.com>,
Doug Flick <dougflick@microsoft.com>,
Daniel Berrange <berrange@redhat.com>, Cong Li <coli@redhat.com>
Subject: Re: [edk2-devel] OVMF Issue with Netboot, VirtioRng, and both COM1/COM2 configured
Date: Tue, 15 Oct 2024 15:07:10 +0200
Message-ID: <CAGxU2F7RLsH_tfAj4b1_F_7EJgXK7NvV3Tz5qSVVbEOk1NGy3A@mail.gmail.com>
In-Reply-To: <CA+bRGFrqWr86C0T7pwjpkH-7j3iVNH0UfY8=s5E7MGXmvqFB3Q@mail.gmail.com>
On Mon, Oct 14, 2024 at 9:08 PM Oliver Steffen <osteffen@redhat.com> wrote:
>
> Since the PixieFail CVE fixes, a strong random number generator is
> required to use network functionality, such as booting via PXE or
> HTTP.
> On modern x86_64 CPUs this is not a problem because these support the
> RDRAND instruction.
> On older models one needs to add a virtio-rng device otherwise network
> initialization fails.
>
> We now observe a very strange problem [1]:
> Network initialization still fails when adding a virtio-rng to a VM
> with an old CPU, under certain hardware configurations.
>
> For example in combination with COM1 and COM2 isa-serial port, while
> it works if only one of them is there (it doesn't matter which one, as
> long as they are not both configured in QEMU).
>
> Steps to reproduce the issue:
>
> Use a recent edk2 master branch, for example 596773f5e33e. We used
> qemu-8.2.7-1.fc40.
>
> Build OVMF for X64 like this:
>
> build -t GCC5 -b DEBUG -a X64 \
> -p OvmfPkg/OvmfPkgX64.dsc \
> -D NETWORK_HTTP_BOOT_ENABLE=TRUE \
> -D NETWORK_IP6_ENABLE=TRUE \
> -D NETWORK_TLS_ENABLE=TRUE \
> -D NETWORK_ALLOW_HTTP_CONNECTIONS=TRUE \
> -D DEBUG_PRINT_ERROR_LEVEL=0xFFFFFFFF
>
> Run QEMU with a CPU that does not feature RDRAND:
>
> qemu-system-x86_64 \
> -machine q35,accel=kvm -m 1G -display none -nodefaults \
> -drive file=OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on \
> -drive file=OVMF_VARS.fd,if=pflash,format=raw,unit=1,readonly=on \
> -chardev file,id=fw,path="firmware.log" \
> -device isa-debugcon,iobase=0x402,chardev=fw \
> -drive file=UefiShell.iso,format=raw,if=none,media=cdrom,id=drive-cd1,readonly=on \
> -device ide-cd,drive=drive-cd1,id=cd1,bootindex=1 \
> -netdev user,id=net0 -device virtio-net-pci,netdev=net0,bootindex=2 \
> -device virtio-rng-pci \
> -serial stdio \
> -serial null \
> -cpu core2duo
>
>
> The attached CD-ROM image [2] contains an EFI Shell executable that is booted.
> From the shell one can investigate the available boot options:
>
> # bcfg boot dump
>
> Expectation: PXE and HTTP options are listed.
> Observation: No network boot options present.
>
> Changing the CPU model on the QEMU command line to “max” makes PXE and
> HTTP options available. We suspected that a virtio-rng-pci is not
> working and network support is unavailable due to the lack of an RNG.
>
> But the same can be achieved by removing the second serial port
> (“-serial null”) while keeping the CPU model. We can’t explain this at
> all.
>
> While network boot can be achieved by changing other parts of the
> command line too (modifying bootindex, for example) it is very strange
> that simply the serial port configuration influences network boot.
>
> Bisection:
> Doing a bisection, the commit that introduces this problem is
> 4c4ceb2ceb ("NetworkPkg: SECURITY PATCH CVE-2023-45237").
>
> The problem seems to be pre-existing, but as of this commit, DxeNetLib
> has a new Depex with gEfiRngProtocolGuid
> (3152BCA5-EADE-433D-862E-C01CDC291F44) since it is now a consumer.
> Producers can be VirtioRng (when the device is present) and RngDxe
> (when the CPU supports for example instructions like RDRAND). Removing
> the Depex, just for confirmation, solves the problem, but of course
> DxeNetLib fails on an assert where it expects to find random
> generators.
>
> Observing the logs [3,4] with DEBUG_DISPATCH enabled and adding some
> printing in VirtioRng, we noticed that in both cases (PXE working or
> not), VirtioRng is started at the same time in the log (see on both
> logs attached at line 22240), but with both COM1 and COM2 we no longer
> see any dispatcher messages after VirtioRng has started, while we see
> them when there is only one of them. Just this last stage of the
> dispatcher will load the network modules, finding the dependency with
> gEfiRngProtocolGuid true.

Going in this direction, I found a hack that works around the problem, but
it is obviously not the right solution (sorry, I have little experience
with edk2).

By tracing the calls to the dispatcher (`gDS->Dispatch ()`), I found
that when only COM1 is present, EfiBootManagerConnectDevicePath() at some
point invokes `gDS->Dispatch ()` after VirtioRng has started. That call
then gets the network modules that use DxeNetLib loaded.

With both COM1 and COM2, on the other hand, I don't see this call. Maybe
that is because `RemainingDevicePath` is empty in this case, since edk2
was able to initialize both ports, but this is just a guess.
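
To make the mechanism concrete, here is a toy model in plain C of how I
understand the behaviour described above. It is not edk2 code: `ToyDriver`,
`toy_install_protocol()`, `toy_dispatch()` and `PROTO_RNG` are made-up names
for illustration, and a real dependency expression is evaluated over protocol
GUIDs rather than a bitmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical protocol bit standing in for gEfiRngProtocolGuid. */
enum { PROTO_RNG = 1u << 0 };

typedef struct {
    const char *name;
    unsigned    depex;       /* protocols that must exist before dispatch */
    bool        dispatched;  /* true once the driver has been loaded */
} ToyDriver;

/* Protocols currently installed in the toy "handle database". */
static unsigned installed;

/* Models a device driver (e.g. VirtioRng) starting and installing a protocol. */
static void toy_install_protocol(unsigned proto)
{
    installed |= proto;
}

/*
 * Models one call to the dispatcher: load every pending driver whose
 * dependencies are satisfied right now and return how many were loaded.
 * A single pass is enough here because toy drivers install nothing themselves.
 */
static size_t toy_dispatch(ToyDriver *drivers, size_t count)
{
    size_t loaded = 0;

    assert(drivers != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (!drivers[i].dispatched && (drivers[i].depex & ~installed) == 0) {
            drivers[i].dispatched = true;
            printf("dispatched %s\n", drivers[i].name);
            loaded++;
        }
    }
    return loaded;
}
```

In this model, a driver whose `depex` is `PROTO_RNG` is skipped by any
`toy_dispatch()` call made before `toy_install_protocol(PROTO_RNG)`, and it
stays pending until somebody calls `toy_dispatch()` again. That matches what I
think we see with COM1 and COM2: VirtioRng installs the protocol, but no later
dispatcher call picks up the network modules.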

So the hack is the following, where I force a call to the dispatcher
on every invocation of EfiBootManagerConnectDevicePath():

diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmConnect.c b/MdeModulePkg/Library/UefiBootManagerLib/BmConnect.c
index d1fb0f72ba..621f90d297 100644
--- a/MdeModulePkg/Library/UefiBootManagerLib/BmConnect.c
+++ b/MdeModulePkg/Library/UefiBootManagerLib/BmConnect.c
@@ -121,6 +121,8 @@ EfiBootManagerConnectDevicePath (
   }
   CurrentTpl = EfiGetCurrentTpl ();
+  Status = gDS->Dispatch ();
+  DEBUG ((DEBUG_INFO, "%a extra gDS->Dispatch () - Status: %r\n", __func__, Status));
   //
   // Start the real work of connect with RemainingDevicePath
   //

I am still trying to understand how the dispatcher works. I think the
problem is related to the dispatcher and some dependency, but my
knowledge is limited, so any suggestions are more than welcome.

Thanks,
Stefano

>
> Any help is very much appreciated!
>
> Regards,
> Stefano and Oliver
>
> [1] https://issues.redhat.com/browse/RHEL-58631
> [2] https://osteffen.fedorapeople.org/OvmfNetbootRngIssue/UefiShell.iso
> [3] https://osteffen.fedorapeople.org/OvmfNetbootRngIssue/edk2_PXE_issue_COM1_COM2.log
> [4] https://osteffen.fedorapeople.org/OvmfNetbootRngIssue/edk2_PXE_working_COM1.log
>