From: Laszlo Ersek
To: Thomas Lamprecht, qemu-discuss@nongnu.org
Cc: edk2-devel-01, "Dr. David Alan Gilbert"
Date: Fri, 2 Mar 2018 13:19:49 +0100
Subject: Re: How to handle pflash backed OVMF FW upgrade and live migration best?
Message-ID: <03691be2-0fcd-ace6-2ec9-d005f85782c3@redhat.com>
In-Reply-To: <96f46413-6eb6-21f4-d07c-89aebea6f233@proxmox.com>
References: <96f46413-6eb6-21f4-d07c-89aebea6f233@proxmox.com>

CC Dave

On 03/01/18 12:21, Thomas Lamprecht wrote:
> Hi,
>
> I'm currently evaluating how to update the firmware (OVMF) code image
> without impacting a KVM/QEMU VM on live migration. I.e., the FW code lives
> under /usr/share/OVMF/OVMF_CODE.fd and gets passed to the QEMU command with:
>
> qemu-binary [...] -drive "if=pflash,unit=0,format=raw,readonly,file=/usr/share/OVMF/OVMF_CODE.fd"
>
> Now if the target node has an updated version of OVMF the VM does not really
> likes that, as from its POV it gets effectively another code image loaded
> from one moment at the other without any notice.

This should not cause any issues. On the destination host, the
destination QEMU instance should load the (different) OVMF_CODE.fd image
into the pflash chip, at startup. However, the incoming migration stream
contains, as a RAMBlock, the original OVMF_CODE.fd image. In other
words, the original firmware image is migrated (in memory, as part of
the migration stream) too.

(

BTW, there is very little firmware code in OVMF that actually *executes*
from pflash -- that's just the SEC module. SEC decompresses the PEI and
DXE firmware volumes from pflash to RAM, and the rest of the firmware
runs from normal RAM. This applies to runtime firmware services as well.
So about the only times when OVMF_CODE.fd (in the pflash chip) and
migration intersect are:

- if you migrate while SEC is running from pflash (i.e. the earliest
  part of the boot),

- if you warm-reboot on the destination host after migration -- in this
  case, the OVMF_CODE.fd binary (that got migrated in the pflash
  RAMBlock from the source host) will again boot from pflash.

)

> So my questions is if it would make sense to see this read-only pflash
> content as "VM state" and send it over during live migration?

That's what already happens.

Now, if you have a differently *sized* OVMF_CODE.fd image on the
destination host, that could be a problem. Avoiding such problems is an
IT / distro job. There are some "build aspects" of OVMF that can make
two OVMF binaries "incompatible" in this sense. Using *some* different
build flags it's also possible to make (a) an OVMF binary and (b) a
varstore file originally created for another OVMF binary, incompatible.

> This would
> make migration way easier. Else we need to save all FW files and track which
> one the VM is using, so that when starting the migration target VM we pass
> along the correct pflash drive file. Sending over a pflash drive could maybe
> only get done when a special flag is set for the pflash drive?
>
> As said I can work around in our management stack, but saving the FW image
> and tracking which VM uses what version, and that cluster wide, may get
> quite a headache and we would need to keep all older OVMF binaries around...

When you deploy new OVMF binaries (packages) to a subset of your
virtualization hosts, you are responsible for keeping those compatible.
(They *can* contain code updates, but those updates have to be
compatible.) If a new OVMF binary is built that is known to be
incompatible, then it has to be installed under a different pathname
(either via a separate package; or in the same package, but still under
a different pathname).

To give you the simplest example, binaries (and varstores) corresponding
to FD_SIZE_2MB and FD_SIZE_4MB are incompatible. If a domain is
originally defined on top of an FD_SIZE_2MB OVMF, then it likely cannot
be migrated to a host where the same OVMF pathname refers to an
FD_SIZE_4MB binary.

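As a rough illustration of the size point above: before migrating, a
management stack could compare the byte size of the image the domain was
started with against the image found under the same pathname on the
destination host. This is only a sketch in Python. The function names
are made up for illustration, and it assumes both image files are
readable from wherever the check runs. Equal size is a necessary
condition at best; it says nothing about the other build aspects that
can make two binaries incompatible.

```python
import os


def image_size(path):
    """Return the size of a firmware image file in bytes."""
    return os.path.getsize(path)


def sizes_match(source_image, destination_image):
    """Return True if both firmware images have the same byte size.

    A mismatch flags the "differently sized OVMF_CODE.fd" case.
    A match does NOT prove that the two builds are compatible.
    """
    return image_size(source_image) == image_size(destination_image)
```

A caller would refuse (or at least warn about) a migration when
sizes_match() returns False for the pflash image in question.
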
If you have a mixed environment, then you need to carry both binaries to
all hosts (and if you backport fixes from upstream edk2, you need to
backport those to both binaries).

In addition, assuming the domain is powered down for good (the QEMU
process terminates), and you update the domain XML from the FD_SIZE_2MB
OVMF binary to the FD_SIZE_4MB binary, you *also* have to
delete/recreate the domain's variable store file (losing all UEFI
variables the domain has accumulated until then). This is because the
FD_SIZE_4MB binary is incompatible with the varstore that was originally
created for the FD_SIZE_2MB binary (and vice versa).

Thanks
Laszlo

> If I'm missing something and there's already an easy way for this I'd be
> very happy to hear from it.
>
> Besides qemu-discuss I posted it to edk2-devel as there maybe more people
> are in the QEMU and OVMF user intersection. :)