From mboxrd@z Thu Jan 1 00:00:00 1970
From: Evgeny Yakovlev
Date: Tue, 22 Nov 2016 16:58:12 +0300
To: Laszlo Ersek
Cc: edk2-devel@ml01.01.org, eyakovlev@virtuozzo.com, den@virtuozzo.com, Jeff Fan
Subject: Re: OvmfPkg: VM crashed trying to write to RO memory from CommonInterruptEntry
In-Reply-To: <2340021c-4bcb-2622-07a8-6e6173f94d81@redhat.com>
References: <2340021c-4bcb-2622-07a8-6e6173f94d81@redhat.com>
List-Id: EDK II Development
Content-Type: text/plain; charset=UTF-8

Wow, that is more than I expected :)

> I wonder if you started to see this issue very recently.

Very recently. However, we use a pretty old OVMF build, circa 2015.

> OVMF debug log

Sorry, we did not have it enabled when the VM crashed, and these crashes are
very rare. We will try to capture it when it happens again.

> - your host CPU model,

cpu family : 6
model      : 42
model name : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
stepping   : 7

> - the host kernel (KVM) version,

Our kernel is roughly based on RHEL7.2 (kernel version 3.10.0-327.36.1). We
also have some upstream KVM patches backported.

> - the guest CPU model,

-cpu SandyBridge,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+osxsave,-arat,-xsaveopt,-xgetbv1,-vmx,-xsavec,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vpindex,hv_runtime,hv_synic,hv_stimer,hv_reset,hv_crash

> - the guest CPU topology.

8 sockets, 1 core per socket, 1 thread per core

Hope that helps!

2016-11-22 16:41 GMT+03:00 Laszlo Ersek:

> Hello Evgeny,
>
> On 11/22/16 13:57, Evgeny Yakovlev wrote:
> > We are running windows UEFI-based VMs on QEMU/KVM with OvmfPkg.
> >
> > Very rarely we are experiencing a crash when VM tries to write to RO memory
> > very early during UEFI boot process.
> >
> > Crash happens when VM tries to execute this code in interrupt handler:
> > https://github.com/tianocore/edk2/blob/master/UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.asm#L244-L246
> >
> > fxsave [rdi], where RDI = 0xffe60
> >
> > Which is bad - it points to ISA BIOS F-segment area.
> >
> > This memory was mapped by qemu for read only access, which is reflected in
> > KVM EPT:
> > 00000000000e0000-00000000000fffff (prio 1, R-): isa-bios
> >
> > This is a very early IRQ0 interrupt, presumably during early initialization
> > phase (Sec or Pei).
> >
> > Looks like CommonInterruptHandler does not switch to a separate stack and
> > works on interrupted context's stack, which was fairly close to 1MB
> > boundary when IRQ0 fired (RSP around 1002c0). When CommonInterruptEntry
> > reached highlighted code it subtracted 512 bytes from current RSP which
> > dropped to 0xffe60, below 1MB and into QEMU RO region.
> >
> > We were figuring out how to best fix this. Possible solutions are to switch
> > to a separate stack in CommonInterruptEntry, relocate early OvmfPkg stack
> > to somewhere farther away from 1MB, to run with interrupts disabled until
> > we reach a later phase or maybe something else.
> >
> > Any comments would be very appreciated!
>
> I wonder if you started to see this issue very recently.
>
> I suspect (hope!) that the symptoms you are experiencing are a
> consequence of a bug in UefiCpuPkg that I've debugged and fixed just
> today. (I hope to post the patches today.)
>
> While testing those patches on your end will of course tell us if your
> issue has the same root cause, you could gather a few more symptoms even
> before I get around posting the patches. The bug that I'm working on has
> extremely varied crash symptoms (basically the APs wander off into the
> weeds), and some of those symptoms have involved CpuExceptionHandlerLib.
> The point is, by the time we get into CpuExceptionHandlerLib, all is
> lost -- it is executing on an AP whose state is corrupt anyway. The
> fxsave symptom is a red herring, most likely.
>
> CpuExceptionHandlerLib works fine otherwise, especially when invoked
> from the BSP -- we've used the output dumped by CpuExceptionHandlerLib
> to the serial port several times to track down issues.
>
> So, my request is that you please capture the OVMF debug log (please see
> the "OvmfPkg/README" file for how). I'm curious if it crashes where and
> how I suspect it crashes.
>
> Also, it would help if you provided
> - your host CPU model,
> - the host kernel (KVM) version,
> - the guest CPU model,
> - the guest CPU topology.
>
> Thanks!
> Laszlo
>
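
For reference, the address check from the quoted report can be sketched in a few lines of Python. This is only an illustration, not code from edk2 or QEMU. It parses the isa-bios region line quoted in the report and tests whether a 512-byte FXSAVE write at RDI = 0xffe60 overlaps that region. The helper names (parse_region, write_hits_region) are hypothetical, and treating a missing "W" in the flags field as "not writable" is an assumption about that output format.

```python
import re

# Region line exactly as quoted in the report.
MTREE_LINE = "00000000000e0000-00000000000fffff (prio 1, R-): isa-bios"

# FXSAVE stores a 512-byte state image at the destination address.
FXSAVE_AREA_SIZE = 512


def parse_region(line):
    """Parse 'start-end (prio N, FLAGS): name' (hypothetical helper).

    Returns (start, end, read_only, name) with an inclusive end address.
    Assumption: the region is writable only if FLAGS contains 'W'.
    """
    m = re.match(
        r"([0-9a-fA-F]+)-([0-9a-fA-F]+) \(prio (-?\d+), ([A-Z-]+)\): (.+)",
        line.strip(),
    )
    if m is None:
        raise ValueError("unrecognized region line: %r" % line)
    start = int(m.group(1), 16)
    end = int(m.group(2), 16)
    read_only = "W" not in m.group(4)
    return start, end, read_only, m.group(5)


def write_hits_region(addr, size, region):
    """True if a write of `size` bytes at `addr` overlaps a read-only region."""
    start, end, read_only, _name = region
    last = addr + size - 1  # last byte touched by the write
    return read_only and addr <= end and last >= start


region = parse_region(MTREE_LINE)
# Values taken from the report: RDI at the fxsave, and the approximate RSP
# when IRQ0 fired.
fxsave_hits = write_hits_region(0xFFE60, FXSAVE_AREA_SIZE, region)
rsp_hits = write_hits_region(0x1002C0, FXSAVE_AREA_SIZE, region)
gap = 0x1002C0 - 0xFFE60  # distance between reported RSP and RDI
```

With the reported values, fxsave_hits is True and rsp_hits is False, since 0x1002c0 lies above the 1MB boundary. The gap between the two addresses is 0x460 bytes, which is more than the 512-byte FXSAVE area alone; presumably the remainder is other state pushed by the handler before that point, but the report does not break this down.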