From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [edk2-devel] [BUG] Extremely slow boot times with CPU and GPU passthrough and host phys-bits > 40
To: "Gerd Hoffmann", devel@edk2.groups.io
From: "mitchell.augustin via groups.io"
Date: Wed, 20 Nov 2024 07:20:39 -0800
Message-ID: <18748.1732116039858528046@groups.io>
Reply-To: devel@edk2.groups.io, mitchell.augustin@canonical.com
@Gerd

> Do you also see the slowdown without the GPU in an otherwise identical
> guest configuration?

No - without the GPUs, the entire boot process takes less than 30 seconds,
which is true both before and after the dynamic mmio window size patch
( https://github.com/tianocore/edk2/commit/ecb778d0ac62560aa172786ba19521f27bc3f650 ).

> Looks quite high to me.  What amount of guest memory are we talking
> about?

It is a pretty large memory allocation - over 900GB - so I'm not surprised
that the initial allocation during `virsh start` takes a while when PCIe
devices are passed through, since that allocation has to happen at init
time. `virsh start` takes the same amount of time with or without the
dynamic mmio window size patch, but its duration does scale with the amount
of memory allocated - which I would expect, given that the time-consuming
part is just that memory allocation.

> More details would be helpful indeed.  Is that a general overall
> slowdown?  Is it some specific part which takes a lot of time?

The part of the kernel boot that I highlighted in
https://edk2.groups.io/g/devel/attachment/120801/2/this-part-takes-2-3-minutes.txt
(which I think is PCIe device initialization and BAR assignment) is the
part that seems slower than it should be. Each section of that log starting
with "acpiphp: Slot <slot> registered" takes roughly 15 seconds, so this
whole section adds up to a few minutes. That part does not scale with
memory allocation, just with the number of GPUs passed through (in this
log, I had 4 GPUs attached, IIRC).
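To quantify that per-slot delay, one way is to diff the dmesg timestamps between consecutive "acpiphp: Slot ... registered" lines. A minimal sketch; the `sample.log` contents are fabricated placeholders in dmesg format (not my real log), so point the awk at an actual boot-log capture instead:

```shell
# Fabricated sample data mirroring the dmesg line format discussed above;
# replace with a real `dmesg` capture.
cat > sample.log <<'EOF'
[    4.000000] acpiphp: Slot [1] registered
[   19.500000] acpiphp: Slot [2] registered
[   35.000000] acpiphp: Slot [3] registered
EOF
# Split each line on "[" and "]" so field 2 is the timestamp,
# then print the gap between consecutive registrations.
awk -F'[][]' '/acpiphp: Slot/ { t = $2 + 0; if (prev) printf "%.1fs\n", t - prev; prev = t }' sample.log
# prints two lines: 15.5s and 15.5s
```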
Without the dynamic mmio window size patch, if I set my guest kernel to use
`pci=nocrs pci=realloc`, this boot slowdown disappears and I am able to use
the GPUs with some conditions (details below).

@xpahos:

> This patch adds functionality that automatically adjusts the MMIO size
> based on the number of physical bits. As a starting point, I would try
> running an old build of OVMF and running grep on 'rejected' to make sure
> that no GPUs were taken out of service while OVMF was running.

I haven't looked for this in the OVMF debug output, but what you say here
seems realistic, given that my VMs without the dynamic mmio window size
patch throw many errors like this during guest kernel boot:

[    4.650955] pci 0000:00:01.5: BAR 15: no space for [mem size 0x3000000000 64bit pref]
[    4.651700] pci 0000:00:01.5: BAR 15: failed to assign [mem size 0x3000000000 64bit pref]

Subsequently, the GPUs are not usable in those VMs (although the PCI
devices are still present). So it would make sense if the fast boot time in
those versions is simply attributable to the kernel "giving up" on all of
those BARs right away, before the slow path starts. The only confusing part
to me, then, is why I do not see this part
( https://edk2.groups.io/g/devel/attachment/120801/2/this-part-takes-2-3-minutes.txt )
going so slowly when I use a version of OVMF with the dynamic mmio window
size patch reverted but with my guest kernel booted with
`pci=realloc pci=nocrs`. Under those circumstances, I get a fast boot time
and my passed-through GPUs work.
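For anyone reproducing the workaround, a sketch of making `pci=nocrs pci=realloc` persistent inside the guest. It operates on a scratch file named `grub.demo` so it is safe to run anywhere; on a real GRUB-based guest the file would be `/etc/default/grub` (the Debian/Ubuntu default), followed by `update-grub` and a reboot:

```shell
# Demo on a scratch copy; substitute /etc/default/grub on a real guest.
cfg=grub.demo
printf 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$cfg"
# Insert the PCI workaround parameters at the front of the default cmdline.
sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="/&pci=nocrs pci=realloc /' "$cfg"
cat "$cfg"   # GRUB_CMDLINE_LINUX_DEFAULT="pci=nocrs pci=realloc quiet splash"
```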
(I do still see some output like this during Linux boot:

[    4.592009] pci 0000:06:00.0: can't claim BAR 0 [mem 0xffffffffff000000-0xffffffffffffffff 64bit pref]: no compatible bridge window
[    4.593477] pci 0000:06:00.0: can't claim BAR 2 [mem 0xffffffe000000000-0xffffffffffffffff 64bit pref]: no compatible bridge window
[    4.593817] pci 0000:06:00.0: can't claim BAR 4 [mem 0xfffffffffe000000-0xffffffffffffffff 64bit pref]: no compatible bridge window

and sometimes loading the Nvidia driver does introduce some brief lockups:
https://pastebin.ubuntu.com/p/J3TH3S7Xhd/ )

> But the linux kernel also takes a long time to initialise NVIDIA GPU
> using SeaBIOS

This is good to know... given this and the above, I'm starting to wonder if
it might actually be a kernel issue...
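As a footnote on scale: the ranges in those "can't claim BAR" messages are the pre-assignment all-ones placeholders, but end - start + 1 still yields each BAR's size, and the arithmetic shows why these allocations strain a small MMIO window. A quick check (bash arithmetic is 64-bit two's complement, so the overflow in the subtraction wraps to the correct unsigned size):

```shell
# Size of the BAR 2 range quoted above.
start=0xffffffe000000000   # BAR 2 start from the log
end=0xffffffffffffffff     # BAR 2 end
printf 'BAR 2 size: %d GiB\n' $(( (end - start + 1) / (1 << 30) ))   # BAR 2 size: 128 GiB
# Size of the bridge window that earlier failed with "no space".
printf 'BAR 15 window: %d GiB\n' $(( 0x3000000000 / (1 << 30) ))     # BAR 15 window: 192 GiB
```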
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#120805): https://edk2.groups.io/g/devel/message/120805
Mute This Topic: https://groups.io/mt/109651206/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-