Subject: Re: [edk2-devel] Getting Synchronous Exception while run avocado-vt tests
To: devel@edk2.groups.io, ard.biesheuvel@linaro.org, Zhanghailiang
Cc: "edk2-devel@lists.01.org", Guoheyi
References: <4e8a0c5f50b642538b310a8edd9ce248@huawei.com>
From: "Laszlo Ersek"
Message-ID: <6256d296-1985-5719-c89a-6b959be6cbc6@redhat.com>
Date: Thu, 22 Aug 2019 20:56:05 +0200

On 08/22/19 11:24, Ard Biesheuvel wrote:
> On Thu, 22 Aug 2019 at 10:40, Zhanghailiang
> wrote:
>>
>> Hi All,
>>
>> We caught an ‘Synchronous Exception’ error while booting VM with uefi firmware in the avocado-vt tests.
>>
>> The Edk2 version we used is edk2-stable201905. The qemu version is qemu-4.0.0 and kernel version is 4.19.0.
>>
>> Parts of the log we got from serial is bellow, you can get the full log from attachment.
>>
>> We can easily reproduce this issue with running avocado-vt tests. Actually, we tried the new edk2 from upstream,
>>
>> It is still can be reproduced.
>>
>> Reproduce command:
>>
>> # avocado run type_specific.io-github-autotest-qemu.qmp_event_notification --vt-type qemu --vt-guest-os Guest.Linux.Fedora.29
>>
>> Qemu command is :
>>
> ..
>>
>> It reports that this is a alignment fault from log, We analyzed the callstack from log:
>>
>> VirtioScsiPassThru-> VirtioFlush->virtio10SetQueueNotify->Virtio10Transfer->PciIoMemWrite-> CpuMemoryServiceWrite-> MmioWrite32 <- here, the address is not align.
>>
>
> The faulting address ends in 0x16, so the access is to the QueueSelect
> field in VIRTIO_PCI_COMMON_CFG. This is a UINT16 field, so the access
> should be 16-bit not 32-bits wide.
>
> Could you dump the instructions leading up to the first
> Virtio10Transfer() call in Virtio10SetQueueNotify()? (from
> Build/ArmVirtQemu-AARCH64/DEBUG_GCC49/AARCH64/OvmfPkg/Virtio10Dxe/Virtio10/DEBUG/Virtio10.dll)
>
> 2280: aa0103e5 mov x5, x1
> 2284: d2800044 mov x4, #0x2 // #2
> 2288: d28002c3 mov x3, #0x16 // #22
> 228c: 52800002 mov w2, #0x0 // #0
> 2290: aa0003e1 mov x1, x0
> 2294: aa0603e0 mov x0, x6
> 2298: 97fffcf3 bl 1664
>
> If the size is passed correctly here, we'll have to track down how the
> call gets routed to Mmio32Write instead of Mmio16Write(). Do you have
> any patches on top of edk2-stable-201905 ?

Right -- checking the "QueueSelect" (whole word) references in
Virtio10SetQueueNotify(), the "FieldSize" arguments passed to
Virtio10Transfer() are:

- sizeof SavedQueueSelect
- sizeof Index
- sizeof SavedQueueSelect

and both "SavedQueueSelect" and "Index" are of type UINT16.

Virtio10Transfer() maps (FieldSize==2) to "EfiPciIoWidthUint16".

PciIoMemWrite() can only decrease "Width" (provided
"PcdUnalignedPciIoEnable" is set to TRUE -- which is not the case in
ArmVirtPkg).
So "Width" is passed to RootBridgeIoMemWrite() unchanged, in
"MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c". The latter
passes "Width" unchanged to CpuMemoryServiceWrite(), in
"ArmPkg/Drivers/ArmPciCpuIo2Dxe/ArmPciCpuIo2Dxe.c". That function seems
to set "OperationWidth" to "EfiCpuIoWidthUint16" (value 1, unchanged),
which should result in a call to MmioWrite16()...

I have a different question. We recently saw a bunch of Synchronous
Exceptions, but those were not deterministic. Whenever they fired (which
was not always), they popped up in different spots. It turned out to be
a KVM regression, apparently a problem with the vtimer. I believe it was
fixed by a backport of upstream commit 6bc210003dff ("KVM: arm/arm64:
Don't emulate virtual timers on userspace ioctls", 2019-04-25).

I could be totally off-target, of course.

(The RHBZ is , but *of course* it has to be a private bug; it was
reported for the kernel after all! /s)

Thanks
Laszlo
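
For illustration only, the width reasoning in this thread can be sketched in a few lines of standalone C. This is not the edk2 code: the enum and both helper functions are hypothetical stand-ins, and the only things taken from the thread are the offset 0x16, the two-byte field size, and the fact that the 16-bit width enumerator has value 1. The sketch shows why a 16-bit access at an address ending in 0x16 is naturally aligned while a 32-bit access at the same address is not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the edk2 width enumerators. Only the ordering
   (8, 16, 32, 64 bits, so that the 16-bit width has value 1) is assumed. */
typedef enum {
  WIDTH_UINT8,
  WIDTH_UINT16,
  WIDTH_UINT32,
  WIDTH_UINT64,
  WIDTH_INVALID
} access_width;

/* Map a field size in bytes to an access width, in the spirit of the
   FieldSize-to-width mapping described in the thread. */
static access_width width_from_field_size(size_t field_size)
{
  switch (field_size) {
  case 1:  return WIDTH_UINT8;
  case 2:  return WIDTH_UINT16;
  case 4:  return WIDTH_UINT32;
  case 8:  return WIDTH_UINT64;
  default: return WIDTH_INVALID;
  }
}

/* Return nonzero if an access of access_size bytes at address is
   naturally aligned. access_size must be a nonzero power of two. */
static int is_naturally_aligned(uint64_t address, size_t access_size)
{
  assert(access_size != 0 && (access_size & (access_size - 1)) == 0);
  return (address & (uint64_t)(access_size - 1)) == 0;
}
```

With these helpers, is_naturally_aligned(0x16, 2) is true and is_naturally_aligned(0x16, 4) is false, which matches the observation that the fault only makes sense if a 32-bit write reached the QueueSelect offset.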