From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 26 Nov 2024 09:09:54 +0100
From: "Gerd Hoffmann via groups.io" <kraxel@redhat.com>
To: mitchell.augustin@canonical.com
Cc: devel@edk2.groups.io
Subject: Re: [edk2-devel] [BUG] Extremely slow boot times with CPU and GPU passthrough and host phys-bits > 40
References: <14717.1732564682653358784@groups.io>
In-Reply-To: <14717.1732564682653358784@groups.io>

On Mon, Nov 25, 2024 at 11:58:02AM -0800, via groups.io wrote:
> Thanks.
>
> > That is extremely slow.
> > What does /proc/iomem look like?  Anything overlapping the ECAM maybe?
>
> Slow and fast guests' and host's /proc/iomem outputs are attached.  For
> the fast guest, I also included the mapping after a reboot with
> `pci=realloc pci=nocrs` set, since that is the config that actually
> allows the driver to load.  I don't see any regions labeled "PCI ECAM",
> not sure if that's an issue or if it might just appear as something
> else on some configs.

It's MMCONFIG; I think older kernels use that name instead of ECAM.
Looks normal.

> I am not sure if there's a way to force bus 0000:08 to use more of the
> overall MMIO window space after boot,

When hotplugging devices you might need the 'pref64-reserve=' property
of '-device pcie-root-port' to make the bridge window larger.  For
devices present at boot this is not needed, because OVMF can figure that
out automatically by looking at the BAR sizes of the PCI devices.

> Thinking ahead here: hypothetically, if I were to propose a patch to
> add a knob for this, similar to X-PciMmio64Mb, to MemDetect.c, do you
> think it could be acceptable?  It seems that the immediately viable
> workaround for our specific use case would be to disable
> PlatformDynamicMMIOWindow via a qemu option, and if this is an issue
> with many large-BAR Nvidia GPUs, it could be broadly useful until the
> root issue is fixed in the kernel.  I already patched and tested a knob
> for this in a local build, and it works (and shouldn't introduce any
> regressions, since omission of the flag would just mean PDMW gets
> called as it does today).

I think it would be better to just give the PciMmio64Mb config hints
higher priority, i.e. if they are present (and don't conflict with
installed memory) simply use them as-is.

But I'd also like to figure out what the exact problem is.  Ideally OVMF
should just work without requiring manual configuration.
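For reference, the two knobs mentioned above would be spelled roughly like this on a QEMU command line (fragments only; the sizes are arbitrary examples, not recommendations):

```shell
# Illustrative qemu command-line fragments, not a complete invocation.

# Reserve a larger 64-bit prefetchable bridge window on a root port,
# for devices hotplugged after boot:
#   -device pcie-root-port,id=rp1,chassis=1,pref64-reserve=256G

# Pass the manual 64-bit mmio aperture hint to OVMF via fw_cfg,
# value in megabytes (65536 => 64G):
#   -fw_cfg name=opt/ovmf/X-PciMmio64Mb,string=65536
```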
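To quantify the difference, one can sum the per-function ranges straight from the /proc/iomem excerpts quoted below; a small sketch (the helper function is mine, the addresses are copied from this mail):

```shell
# Sum the sizes of "start-end" hex ranges given as arguments.
sum_ranges() {
    total=0
    for range in "$@"; do
        start="0x${range%-*}"
        end="0x${range#*-}"
        total=$(( total + end - start + 1 ))
    done
    printf '0x%x\n' "$total"
}

# Guest 0000:06:00.0, from the slow guest's /proc/iomem below:
sum_ranges 380000000000-381fffffffff \
           382000000000-382001ffffff \
           382002000000-382002ffffff
# -> 0x2003000000  (one 128 GiB BAR plus 32 MiB + 16 MiB)

# Host 0000:1b:00.0, from the host /proc/iomem below:
sum_ranges 21a000000000-21bfffffffff \
           21c000000000-21dfffffffff \
           21e000000000-21e03fffffff \
           21e040000000-21e041ffffff \
           21e042000000-21e042ffffff
# -> 0x4043000000  (two 128 GiB BARs plus 1 GiB + 32 MiB + 16 MiB)
```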
One of the reasons to add the dynamic mmio window was to allow PCI
devices with large BARs to work fine without manual tuning ...

What is the difference between the slow dynamic mmio window
configuration and the fast manual PciMmio64Mb configuration?  Can you
try to change the manual configuration to be more like the dynamic mmio
window configuration, one change at a time (first size, next base,
possibly multiple base addresses), and see where exactly it breaks?

Does it make a difference if you add an IOMMU to the virtual machine?

> From the guest VM that booted quickly, has 52 phys bits / 57 virtual
> bits, and in which the GPU driver does not work since `pci=realloc
> pci=nocrs` isn't set:
>
> e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
>   e0000000-efffffff : Reserved
>     e0000000-efffffff : pnp 00:04

ECAM with the old name.

> From the guest VM that booted slowly, has 52 phys bits / 57 virtual
> bits, and in which the GPU driver works correctly:
>
> 380000000000-3937ffffffff : PCI Bus 0000:00
>   380000000000-382002ffffff : PCI Bus 0000:06
>     380000000000-381fffffffff : 0000:06:00.0
>     382000000000-382001ffffff : 0000:06:00.0
>     382002000000-382002ffffff : 0000:06:00.0

This is the NPU I guess?

[ host /proc/iomem below ]

> 210000000000-21ffffffffff : PCI Bus 0000:15
>   21a000000000-21e047ffffff : PCI Bus 0000:16
>     21a000000000-21e047ffffff : PCI Bus 0000:17
>       21a000000000-21e042ffffff : PCI Bus 0000:19
>         21a000000000-21e042ffffff : PCI Bus 0000:1a
>           21a000000000-21e042ffffff : PCI Bus 0000:1b
>             21a000000000-21bfffffffff : 0000:1b:00.0
>             21c000000000-21dfffffffff : 0000:1b:00.0
>             21e000000000-21e03fffffff : 0000:1b:00.0
>             21e040000000-21e041ffffff : 0000:1b:00.0
>             21e042000000-21e042ffffff : 0000:1b:00.0

Hmm.  Looks like the device has more resources on the host.  Maybe
*that* is the problem.

What does 'sudo lspci -v' print for the NPU on the host and in the
guest?

take care,
  Gerd

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#120844): https://edk2.groups.io/g/devel/message/120844
Mute This Topic: https://groups.io/mt/109651206/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-