From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au [211.29.132.249])
 by mx.groups.io with SMTP id smtpd.web09.515.1628636700493800957
 for ; Tue, 10 Aug 2021 16:05:01 -0700
Authentication-Results: mx.groups.io;
 dkim=missing;
 spf=softfail (domain: linux.com, ip: 211.29.132.249, mailfrom: chris.willing@linux.com)
Received: from d8.hgw.net.au (pa49-197-140-170.pa.qld.optusnet.com.au [49.197.140.170])
 (Authenticated sender: chris.willing@optusnet.com.au)
 by mail105.syd.optusnet.com.au (Postfix) with ESMTPSA id 09F9310485B7;
 Wed, 11 Aug 2021 09:04:57 +1000 (AEST)
Reply-To: chris.willing@linux.com
Subject: Re: [edk2-devel] [PATCH 1/1] OvmfPkg PlatformBootManagerLib: Move TryRunningQemuKernel()
To: James Bottomley , devel@edk2.groups.io
Cc: ardb+tianocore@kernel.org, jiewen.yao@intel.com, Gerd Hoffmann
References: <20210728020232.127332-1-chris.willing@linux.com>
 <1695D2E15A92C8E7.3876@groups.io>
 <62f9ffa0-786f-09dd-9546-c4c118fa2a17@linux.com>
 <1b544f28-b5b9-c08c-bab7-8c1f41778dce@linux.com>
 <5c85a3f963d1ab7d20e177db9a07a73e82a0eed0.camel@HansenPartnership.com>
From: "Christoph Willing"
Message-ID: <7dc3b9d2-5ebb-c261-46b4-658dcbbfd0f4@linux.com>
Date: Wed, 11 Aug 2021 09:04:56 +1000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.12.0
MIME-Version: 1.0
In-Reply-To: <5c85a3f963d1ab7d20e177db9a07a73e82a0eed0.camel@HansenPartnership.com>
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.3 cv=YKPhNiOx c=1 sm=1 tr=0
 a=jHkrTDijwrxVdaLaom9wOA==:117 a=jHkrTDijwrxVdaLaom9wOA==:17
 a=IkcTkHD0fZMA:10 a=hqBzw_eTAAAA:8 a=4PiWjpKz06wqSlN9RaMA:9
 a=QEXdDO2ut3YA:10 a=bkWp_v3HvcftT6DRAIDL:22
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit

On 11/8/21 12:26 am, James Bottomley wrote:
> On Tue, 2021-08-10 at 10:10 +1000, Christoph Willing wrote:
>> On 10/8/21 12:52 am, James Bottomley wrote:
>>> On Mon, 2021-08-09 at 22:53 +1000, Christoph Willing wrote:
>>>> With soft feature freeze started, I wonder if this patch could be
>>>> reviewed and pushed for edk2-stable202108 tag? I think it has
>>>> languished because I didn't initially Cc appropriately - pls add
>>>> others as necessary.
>>>>
>>>> This patch is a trivial (I think) change which fixes a long standing
>>>> and annoying bug for those booting Qemu with UEFI using external
>>>> kernel & initrd.
>>>
>>> I'm with Ard on this one: -kernel is working just fine for me and the
>>> team at IBM working on Kata containers. It sounds like this might be a
>>> problem local to your environment, so we need to debug it to understand
>>> the issue rather than blindly reverse existing commits.
>>>
>> Thanks for responding James & Ard.
>>
>> Below is the script I'm using to create, then run, the VM. To verify
>> that it works normally with UEFI boot, it initially uses the internal
>> kernel & initrd.
>>
>> The OVMF_CODE & my_VARS lines contain git hash to identify the build
>> from which OVMF_CODE.fd & OVMF_VARS.fd were taken; 97fdcb is from a
>> build of yesterday's git master.
>>
>> After the OS has been installed, I can run the VM multiple times to
>> verify that it boots under UEFI OK (I see the TianoCore splash screen)
>> with internal kernel.
>>
>>
>> #!/bin/bash
>>
>> /usr/bin/qemu-kvm \
>> -name "UEFI Testing" \
>> -enable-kvm \
>> -cpu kvm64 \
>> -smp cores=4 \
>> -boot once=c \
>> -m 8192 \
>> -device intel-hda \
>> -device hda-duplex \
>> -vga virtio \
>> -drive if=pflash,format=raw,file=OVMF_CODE_97fdcb.fd,readonly=on \
>> -drive if=pflash,format=raw,file=my_VARS_97fdcb.fd \
>> -drive file=disk.img,format=raw,cache=none,index=0,media=disk \
>> -cdrom /storage/iso/slackware/slackware64-15.0/slackware64-15.0-20210807.iso \
>> -daemonize \
>> "$@"
>
> There's no definition of a disk device in here.
>
-drive file=disk.img,format=raw,cache=none,index=0,media=disk \

>> To now use external kernel, I add the lines:
>>
>> -kernel /var/cache/vmbuilder/boot/15.0/x86_64/vmlinuz \
>> -initrd /var/cache/vmbuilder/boot/15.0/x86_64/initrd \
>> -append "root=/dev/sda2 rootfstype=ext4 ro vga=0x386" \
>>
>> to the script just after "-boot once=c" (but I doubt the exact
>> positioning makes any difference).
>>
>> In this case, I see the kernel running and initrd unpacked and its
>> modules loaded but the root partition is unable to be mounted - the disk
>> is not visible (running 'ls -l /dev/sd*' in recovery shell gives 'ls:
>> /dev/sd*: No such file or directory').
>>
>> The last lines of the Qemu screen are:
>>
>> /boot/initrd-5.13.8.gz: Loading kernel modules from initrd image:
>> insmod /lib/modules/5.13.8/kernel/fs/jbd2/jbd2.ko
>> insmod /lib/modules/5.13.8/kernel/fs/mbcache.ko
>> insmod /lib/modules/5.13.8/kernel/fs/ext4/ext4.ko
>> mount: mounting /dev/sda2 on /mnt failed: No such file or directory
>
> Which looks like why this failed.
>
Exactly. I should have mentioned that specifically in the patch comment,
although the bz report at
https://bugzilla.tianocore.org/show_bug.cgi?id=3504 already says:
"The kernel and initrd are loaded but have no access to the VM itself."

> Where's the vmm supposed to get /dev/sda from? It sort of seems like
> the CD rom boot script thinks it was mounted as a USB device in this
> case.
>
It should find /dev/sda in the virtual disk, just as it is correctly
found in the case of the patched code.

> In the working kernel dmesg Gerd requested, what does it mount as root?
> sda? In which case what does the kernel say about where it got sda
> from?
>
Yes, it mounts /dev/sda2 as root. The boot logs are now attached at
https://bugzilla.tianocore.org/show_bug.cgi?id=3504 as well as a diff
between good and bad boots (patched & unpatched code). There's nothing
obvious (to me) as to why the unpatched code can't find the virtual disk.

My simple-minded guess is that PlatformBdsConnectSequence() performs some
preparatory work that enables the kernel to have access to the VM (and
therefore to the virtual disk) by the time TryRunningQemuKernel() is
called. At the moment, however, TryRunningQemuKernel() is called before
PlatformBdsConnectSequence(), so that preparatory work hasn't been done
and the disk can't be found.

chris
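
For anyone wanting to reproduce this, the two invocations quoted above
(internal kernel, and external kernel added just after "-boot once=c")
could be folded into one script. The following is only a rough sketch:
the EXTERNAL_KERNEL switch, the KDIR variable and the build_args function
are names made up here for illustration, and the script merely prints the
argument list, one argument per line, rather than launching qemu-kvm.

```shell
#!/bin/bash
# Sketch only: prints the qemu-kvm arguments instead of starting a VM.
# EXTERNAL_KERNEL, KDIR and build_args are hypothetical names, not part
# of the original script.

KDIR=/var/cache/vmbuilder/boot/15.0/x86_64

build_args() {
    local args=(
        -name "UEFI Testing"
        -enable-kvm
        -cpu kvm64
        -smp cores=4
        -boot once=c
    )
    # External kernel, initrd and command line go right after "-boot once=c".
    if [ "${EXTERNAL_KERNEL:-0}" = 1 ]; then
        args+=(
            -kernel "$KDIR/vmlinuz"
            -initrd "$KDIR/initrd"
            -append "root=/dev/sda2 rootfstype=ext4 ro vga=0x386"
        )
    fi
    args+=(
        -m 8192
        -device intel-hda
        -device hda-duplex
        -vga virtio
        -drive if=pflash,format=raw,file=OVMF_CODE_97fdcb.fd,readonly=on
        -drive if=pflash,format=raw,file=my_VARS_97fdcb.fd
        -drive file=disk.img,format=raw,cache=none,index=0,media=disk
        -cdrom /storage/iso/slackware/slackware64-15.0/slackware64-15.0-20210807.iso
        -daemonize
    )
    # One argument per line, so the two variants are easy to diff.
    printf '%s\n' "${args[@]}"
}

build_args
```

With EXTERNAL_KERNEL=1 set in the environment, the printed list contains
the -kernel, -initrd and -append arguments; without it, the list matches
the internal-kernel script (minus the trailing "$@").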