Date: Tue, 15 Oct 2024 15:07:10 +0200
From: Stefano Garzarella
To: Oliver Steffen
Cc: devel@edk2.groups.io, Gerd Hoffmann, Jiewen Yao, Zachary Clark-williams, Saloni Kasbekar, Doug Flick, Daniel Berrange, Cong Li
Subject: Re: [edk2-devel] OVMF Issue with Netboot, VirtioRng, and both COM1/COM2 configured

On Mon, Oct 14, 2024 at 9:08 PM Oliver Steffen wrote:
>
> Since the PixieFail CVE fixes, a strong random number generator is
> required to use network functionality, such as booting via PXE or
> HTTP.
> On modern x86_64 CPUs this is not a problem because these support the
> RDRAND instruction.
> On older models one needs to add a virtio-rng device, otherwise network
> initialization fails.
>
> We now observe a very strange problem [1]:
> Network initialization still fails when adding a virtio-rng to a VM
> with an old CPU, under certain hardware configurations.
>
> For example, it fails in combination with both COM1 and COM2 isa-serial
> ports, while it works if only one of them is there (it doesn't matter
> which one, as long as they are not both configured in QEMU).
>
> Steps to reproduce the issue:
>
> Use a recent edk2 master branch, for example 596773f5e33e. We used
> qemu-8.2.7-1.fc40.
>
> Build OVMF for X64 like this:
>
> build -t GCC5 -b DEBUG -a X64 \
>     -p OvmfPkg/OvmfPkgX64.dsc \
>     -D NETWORK_HTTP_BOOT_ENABLE=TRUE \
>     -D NETWORK_IP6_ENABLE=TRUE \
>     -D NETWORK_TLS_ENABLE=TRUE \
>     -D NETWORK_ALLOW_HTTP_CONNECTIONS=TRUE \
>     -D DEBUG_PRINT_ERROR_LEVEL=0xFFFFFFFF
>
> Run QEMU with a CPU that does not feature RDRAND:
>
> qemu-system-x86_64 \
>     -machine q35,accel=kvm -m 1G -display none -nodefaults \
>     -drive file=OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on \
>     -drive file=OVMF_VARS.fd,if=pflash,format=raw,unit=1,readonly=on \
>     -chardev file,id=fw,path="firmware.log" -device isa-debugcon,iobase=0x402,chardev=fw \
>     -drive file=UefiShell.iso,format=raw,if=none,media=cdrom,id=drive-cd1,readonly=on \
>     -device ide-cd,drive=drive-cd1,id=cd1,bootindex=1 \
>     -netdev user,id=net0 -device virtio-net-pci,netdev=net0,bootindex=2 \
>     -device virtio-rng-pci \
>     -serial stdio \
>     -serial null \
>     -cpu core2duo
>
>
> The attached CD-ROM image [2] contains an EFI Shell executable that is booted.
> From the shell one can investigate the available boot options:
>
> # bcfg boot dump
>
> Expectation: PXE and HTTP options are listed.
> Observation: No network boot options present.
>
> Changing the CPU model on the QEMU command line to “max” makes PXE and
> HTTP options available.
> We suspected that a virtio-rng-pci is not
> working and network support is unavailable due to the lack of an RNG.
>
> But the same can be achieved by removing the second serial port
> (“-serial null”) while keeping the CPU model. We can’t explain this at
> all.
>
> While network boot can be achieved by changing other parts of the
> command line too (modifying bootindex, for example), it is very strange
> that simply the serial port configuration influences network boot.
>
> Bisection:
> Doing a bisection, the commit that introduces this problem is
> 4c4ceb2ceb ("NetworkPkg: SECURITY PATCH CVE-2023-45237").
>
> The problem seems to be pre-existing, but as of this commit, DxeNetLib
> has a new Depex with gEfiRngProtocolGuid
> (3152BCA5-EADE-433D-862E-C01CDC291F44) since it is now a consumer.
> Producers can be VirtioRng (when the device is present) and RngDxe
> (when the CPU supports instructions such as RDRAND). Removing
> the Depex, just for confirmation, solves the problem, but of course
> DxeNetLib fails on an assert where it expects to find random
> generators.
>
> Observing the logs [3,4] with DEBUG_DISPATCH enabled and adding some
> printing in VirtioRng, we noticed that in both cases (PXE working or
> not), VirtioRng is started at the same point in the log (see line 22240
> in both attached logs), but with both COM1 and COM2 we no longer
> see any dispatcher messages after VirtioRng has started, while we see
> them when there is only one of them. Only this last stage of the
> dispatcher loads the network modules, because it finds the dependency
> on gEfiRngProtocolGuid satisfied.

Going in this direction, I found a hack that solves the problem, but
it's obviously not the right solution (sorry, I have little experience
with edk2).

By analyzing the calls to the dispatcher (`gDS->Dispatch ()`), I found
that when we only have COM1, EfiBootManagerConnectDevicePath() at some
point invokes `gDS->Dispatch ()` after VirtioRng has started.
This call then gets DxeNetLib loaded. With both COM1 and COM2, on the
other hand, I don't see this call, maybe because `RemainingDevicePath`
is empty in this case, since EDK2 was able to initialize both, but this
is just a guess.

So the hack is the following, where I force a call to the dispatcher on
every call of EfiBootManagerConnectDevicePath():

diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmConnect.c b/MdeModulePkg/Library/UefiBootManagerLib/BmConnect.c
index d1fb0f72ba..621f90d297 100644
--- a/MdeModulePkg/Library/UefiBootManagerLib/BmConnect.c
+++ b/MdeModulePkg/Library/UefiBootManagerLib/BmConnect.c
@@ -121,6 +121,8 @@ EfiBootManagerConnectDevicePath (
   }
 
   CurrentTpl = EfiGetCurrentTpl ();
+  Status = gDS->Dispatch ();
+  DEBUG ((DEBUG_INFO, "%a extra gDS->Dispatch () - Status: %r\n", __func__, Status));
   //
   // Start the real work of connect with RemainingDevicePath
   //

I am still trying to understand how the dispatcher works. I think the
problem is related to the dispatcher and some dependency, but my
knowledge is limited.

Any suggestions are more than welcome.

Thanks,
Stefano

>
> Any help is very much appreciated!
>
> Regards,
> Stefano and Oliver
>
> [1] https://issues.redhat.com/browse/RHEL-58631
> [2] https://osteffen.fedorapeople.org/OvmfNetbootRngIssue/UefiShell.iso
> [3] https://osteffen.fedorapeople.org/OvmfNetbootRngIssue/edk2_PXE_issue_COM1_COM2.log
> [4] https://osteffen.fedorapeople.org/OvmfNetbootRngIssue/edk2_PXE_working_COM1.log
>
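
To make the suspected ordering problem easier to discuss, here is a
minimal sketch in plain C. It is a toy model and not edk2 code: the
names ToyModule, toy_dispatch, toy_connect_virtio_rng and toy_boot are
all hypothetical. It assumes, as the logs suggest, that the RNG protocol
only becomes available once the VirtioRng device has been started, and
that a module whose Depex needs that protocol is only picked up if the
dispatcher runs again afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy protocol database: one flag per (hypothetical) protocol. */
enum { PROTO_NONE = -1, PROTO_RNG = 0, PROTO_COUNT };

typedef struct {
    const char *name;
    int         depex;   /* protocol required before loading, or PROTO_NONE */
    bool        loaded;
} ToyModule;

static bool g_installed[PROTO_COUNT];

/* One dispatcher run: keep scanning until no further module becomes
 * loadable. Returns the number of modules loaded during this run. */
static int toy_dispatch(ToyModule *mods, size_t count)
{
    int  loaded_now = 0;
    bool progress;

    do {
        progress = false;
        for (size_t i = 0; i < count; i++) {
            assert(mods[i].depex < PROTO_COUNT);
            if (mods[i].loaded) {
                continue;
            }
            if (mods[i].depex != PROTO_NONE && !g_installed[mods[i].depex]) {
                continue;   /* dependency not satisfied yet */
            }
            mods[i].loaded = true;
            loaded_now++;
            progress = true;
        }
    } while (progress);

    return loaded_now;
}

/* Models the device being started: only now does the RNG protocol
 * appear, which is later than the load of the driver image itself. */
static void toy_connect_virtio_rng(void)
{
    g_installed[PROTO_RNG] = true;
}

/* Returns true if the network module ends up loaded. */
static bool toy_boot(bool dispatch_after_connect)
{
    ToyModule mods[] = {
        { "VirtioRngDriver", PROTO_NONE, false },
        { "NetworkModule",   PROTO_RNG,  false },  /* Depex on RNG */
    };
    size_t count = sizeof(mods) / sizeof(mods[0]);

    g_installed[PROTO_RNG] = false;

    toy_dispatch(mods, count);      /* early run: only the RNG driver loads */
    toy_connect_virtio_rng();       /* device started, protocol installed   */
    if (dispatch_after_connect) {
        toy_dispatch(mods, count);  /* the extra run the hack forces        */
    }
    return mods[1].loaded;
}
```

In this model toy_boot(true) leaves the network module loaded, while
toy_boot(false) does not, which mirrors the COM1-only and COM1+COM2
observations. It says nothing about why the real code skips the later
dispatcher run when both serial ports are present.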