Subject: Re: [edk2-devel] [BUG] Extremely slow boot times with CPU and GPU passthrough and host phys-bits > 40
To: mitchell.augustin@canonical.com, devel@edk2.groups.io
From: "xpahos via groups.io"
Date: Wed, 20 Nov 2024 01:35:04 -0800
Message-ID: <2000.1732095304681550888@groups.io>
In-Reply-To: <24085.1732055112128290386@groups.io>

Hello, Mitchell.
 
> Thanks for the suggestion. I'm not necessarily saying this patch itself has an issue, just that it is the point in the git history at which this slow boot time issue manifests for us. This may be because the patch does actually fix the other issue I described above related to BAR assignment not working correctly in versions before that patch, despite boot being faster back then. (In those earlier versions, the PCI devices for the GPUs were passed through, but the BAR assignment was erroneous, so we couldn't actually use them - the Nvidia GPU driver would just throw errors.)
 
tl;dr: GPU instances need a huge amount of guest address space for the VM to be able to map their BARs. So the 64-bit MMIO aperture could be too small, and OVMF was rejecting some PCI devices during the initialisation phase. To fix this there is an opt/ovmf/X-PciMmio64Mb fw_cfg option that increases the MMIO aperture size, and the patch you bisected to adds logic that sizes the aperture automatically from the number of physical address bits. As a starting point, I would run an old build of OVMF with its debug log enabled and grep the log for 'rejected' to make sure that no GPUs were taken out of service while OVMF was running.
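If it helps, here is a minimal sketch of both checks on a raw QEMU command line (opt/ovmf/X-PciMmio64Mb and the debugcon flags are standard OVMF/QEMU usage; the log path and the 65536 value are just illustrative). Under libvirt the same flags can be passed through a <qemu:commandline> element.

```
# Route OVMF's debug output to a file (works with DEBUG builds of OVMF,
# which write their log to I/O port 0x402), and enlarge the 64-bit PCI
# MMIO aperture by hand; the value is in MB, so 65536 gives 64 GiB.
# "$@" stands for the rest of the VM definition (disks, GPUs, etc.).
qemu-system-x86_64 \
    -debugcon file:ovmf.log -global isa-debugcon.iobase=0x402 \
    -fw_cfg name=opt/ovmf/X-PciMmio64Mb,string=65536 \
    "$@"

# After boot, check whether OVMF rejected any device resources:
grep -i rejected ovmf.log
```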
 
> After I initially posted here, we also discovered another kernel issue that was contributing to the boot times for this config exceeding 5 minutes - so with that isolated, I can say that my config only takes about 5 minutes for a full boot: 1-2 minutes for `virsh start` (which scales with guest memory allocation), and about 2-3 minutes of time spent on PCIe initialization / BAR assignment for 2 to 4 GPUs (attached). This was still the case when I tried with my GPUs attached in the way you suggested. I'll attach the XML config for that and for my original VM in case I may have configured something incorrectly there.
> With that said, I have a more basic question - do you expect that it should take upwards of 30 seconds after `virsh start` completes before I see any output in `virsh console`, or that PCI devices' memory window assignments in the VM should take 45-90 seconds per passed-through GPU? (given that when the same kernel on the host initializes these devices, it doesn't take nearly this long?)
 
I'm not sure I can help you; we don't use virsh. But the Linux kernel also takes a long time to initialise NVIDIA GPUs when booting with SeaBIOS. Another way to check the boot-time cost is to hot-plug the cards after booting. I don't know how this works in virsh; I made an expect script to emulate hot-plug through QMP:
 
```
#!/bin/bash
# Hot-plug a test PCI device via QMP; qmp-shell ships with QEMU, and
# the device/bus names below are specific to this test setup.
CWD="$(dirname "$(realpath "$0")")"
/usr/bin/expect <<EOF
spawn $CWD/qmp-shell $CWD/qmp.sock
expect "(QEMU)"
send -- "query-pci\r"
expect "(QEMU)"
send -- "device_add driver=pci-gpu-testdev bus=s30 regions=mpx2M vendorid=5555 deviceid=4126\r"
expect "(QEMU)"
EOF
```
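Untested, since we don't use virsh, but something like the following should be the equivalent on your side (note HMP device_add takes comma-separated properties, unlike the space-separated form qmp-shell accepts; the domain name `guest` and the hostdev XML file name are placeholders):

```
# HMP passthrough (libvirt will not track a device added this way):
virsh qemu-monitor-command guest --hmp \
    'device_add pci-gpu-testdev,bus=s30,regions=mpx2M,vendorid=5555,deviceid=4126'

# For a real VFIO GPU, the supported route is a <hostdev> snippet:
virsh attach-device guest gpu-hostdev.xml --live
```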
 
> I'm going to attempt to profile ovmf next to see what part of the code path is taking up the most time, but if you already have an idea of what that might be (and whether it is actually a bug or expected to take that long), that insight would be appreciated.
 
We just started migrating from SeaBIOS to UEFI/Secure Boot, so I know only the parts of the OVMF code used for enumeration/initialisation of PCI devices. I'm not a core edk2 developer, just solving the same problems with starting VMs with GPUs.