Date: Tue, 5 Apr 2022 09:06:39 +0200
From: "Gerd Hoffmann"
To: Corvin Köhne
Cc: "virtualization@freebsd.org", Ard Biesheuvel, Jiewen Yao, Jordan Justen, Rebecca
  Cran, Peter Grehan, "devel@edk2.groups.io"
Subject: Re: [PATCH] OvmfPkg: reserve igd memory by E820
Message-ID: <20220405070639.uq5uiydxhirwu4gb@sirius.home.kraxel.org>
References: <20220404063448.280-1-c.koehne@beckhoff.com> <20220404113830.6novz55zpid3l6fl@sirius.home.kraxel.org>

  Hi,

> > First, there is no need to communicate memory regions from the
> > hypervisor to the guest.  The IGD hardware has registers pointing
> > to the opregion and to stolen memory, so the guest can simply
> > allocate and initialize memory, then program the registers
> > accordingly.  Same procedure you have when initializing IGD on
> > physical hardware.
>
> As far as I know, on physical hardware it's done by UEFI. Sadly, the
> Intel GOP driver doesn't do it.

Where does the Intel GOP driver come from?  Extracted from host firmware?

> > BTW: Do you talk about GVT-d (== build virtual IGD devices with some
> > resources of the physical device, roughly comparable to SR-IOV but
> > with the igd kernel driver instead of the hardware handling this)?
> > Or do you want pci-assign the complete igd device?
>
> Intel has different terms for their GPU passthrough, GVT-d, GVT-g and
> GVT-s. I'd like to use GVT-d which means pci-assigning the
> complete igd device.

Ah, ok, didn't notice the subtle differences with the small letter at
the end.  GVT-d + GVT-g is clear then.  What is GVT-s?

> > Lovely.  Intel should fix their broken windows drivers ...
> >
> > The etc/e820 fw_cfg file can have both 'ram' and 'reserved' entries.
> >
> > And, yes, adding a 'reserved' entry there for the region which requires
> > an identity mapping (to workaround the driver bug) is fine and should
> > make sure the region is not used for something else.  Everything else
> > should be handled by the igd efi driver / optionrom.
>
> At the moment, my bhyve implementation allocates GSM and OpRegion in
> guest memory, copies OpRegion into guest memory, inserts the GSM
> and OpRegion addresses into the PCI registers and creates an E820
> table.

Ideally the guest would allocate and initialize this all itself.  That is
hard for the GSM though when it requires an identity mapping.  Having
the guest check whether the GSM register points to reserved memory and,
if so, use it instead of allocating memory should work I think.

> > For the opregion: qemu has, for historical reasons, an (optional and
> > disabled by default) etc/igd-opregion fw_cfg file.  It was a quick hack
> > which ended up staying.  When a option rom is needed anyway the content
> > of the opregion can simply be passed to the guest as part of the option
> > rom image.  But if you prefer to use fw_cfg instead you should at least
> > use the same hack instead of inventing a new one ...
> >
> > See also:
> >   https://gitlab.com/qemu-project/qemu/-/blob/master/docs/igd-assign.txt
> >
> > Seabios uses etc/igd-opregion, guest code is here:
> >   https://gitlab.com/qemu-project/seabios/-/blob/master/src/fw/pciinit.c#L291
> > Ideally we'd move that to a proper vgabios too ...
>
> Moving all of this into a proper option rom might be a good idea.
> It'll require some work to create some tools which build a proper
> rom for each system.

Once we have the code for vgabios and PlatformGopPolicy we can roll them
with the intelgop driver into a rom image with EfiRom.  Ideally also add
opregion content to the rom somehow.

> However, with this solution there aren't any
> hacks neither in the hypervisor nor in OVMF.

Yep, that's the main point of the approach.
Make it self-contained and not depend on specific behavior/support in
hypervisor and/or firmware.

take care,
  Gerd
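
For illustration only, the "check whether the GSM register points to reserved memory" idea above could look roughly like the sketch below. It is not taken from OVMF, bhyve or qemu. The entry layout (64-bit address, 64-bit length, 32-bit type, packed, with type 1 = ram and type 2 = reserved) is what I understand the etc/e820 fw_cfg file to contain; the helper names range_is_reserved() and choose_gsm_base() and the fallback address parameter are made up for this example. Reading the GSM register and the fw_cfg file is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of one etc/e820 fw_cfg entry (packed). */
#pragma pack(push, 1)
typedef struct {
    uint64_t address;
    uint64_t length;
    uint32_t type;
} e820_entry;
#pragma pack(pop)

static_assert(sizeof(e820_entry) == 20, "packed e820 entry is 20 bytes");

#define E820_RAM      1u
#define E820_RESERVED 2u

/*
 * Hypothetical helper: true if [base, base + size) lies completely
 * inside a single 'reserved' entry of the map.
 */
static bool range_is_reserved(const e820_entry *map, size_t count,
                              uint64_t base, uint64_t size)
{
    if (size == 0 || base + size < base) {
        return false;                   /* empty range or address wrap */
    }
    for (size_t i = 0; i < count; i++) {
        const e820_entry *e = &map[i];

        if (e->type != E820_RESERVED || base < e->address) {
            continue;
        }
        uint64_t offset = base - e->address;
        /* Compare via offsets so the arithmetic cannot overflow. */
        if (offset <= e->length && size <= e->length - offset) {
            return true;
        }
    }
    return false;
}

/*
 * Hypothetical policy: keep the address the hypervisor programmed into
 * the GSM register when it is covered by a reserved entry, otherwise
 * use memory the guest allocated itself (fallback_base).
 */
static uint64_t choose_gsm_base(const e820_entry *map, size_t count,
                                uint64_t programmed_base, uint64_t size,
                                uint64_t fallback_base)
{
    return range_is_reserved(map, count, programmed_base, size)
               ? programmed_base
               : fallback_base;
}
```

With this the firmware side only needs the register value and the E820 map to decide; no extra hypervisor interface is involved, which is the self-contained property discussed above.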