From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-io0-x22a.google.com (mail-io0-x22a.google.com [IPv6:2607:f8b0:4001:c06::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id E7D5481F2B for ; Wed, 7 Dec 2016 01:11:56 -0800 (PST) Received: by mail-io0-x22a.google.com with SMTP id a124so700839149ioe.2 for ; Wed, 07 Dec 2016 01:11:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=94wYx7414k8ajuzt5zt49kabJL53QVab/8Ypm6wiXhM=; b=KQ8EktE48tqRzHScnoaCsUN8Vhw80wBZERfvtCmN+tsTnodXQcgLMlQNxOBpGJipU8 BNeLNWcK4lvmQOIdXThDdB7aHvsmF7UZs68TZ2JVz5SdQf2r5aNXoVzVBc0sCthFr5zi zBzZDKlex5Gd2wCPtJ5LutG3hK+gMr+Uk5kBDXLxMmr/myh23ONS416FpymE5szE9DWP QPiDwX3MnMW6hMGtBYMbRE/KyRMZhPfOFjz9BnvsxRkVjp7rG0pVd31g02JuRpU7gm1+ OAYRvPTHt9STPJDC3sLP45vzEc/37weGLq29LYyjG5VLRXW/7Jo16D3yWklwfhYdXVLL siXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=94wYx7414k8ajuzt5zt49kabJL53QVab/8Ypm6wiXhM=; b=IucN+t//Q6i7kI8HVZLfo7eoHS9/6u2DanFVZlMDb1vwsDPfS5NRyzvwnSgsKGeLq0 v01Wk2npr0cRbt+jy7KVFHvmlJfZM3wJqE0iiEekqg+5Ov2++AmDXMzzWmZyz7LynWUV 4ENTeSldgxzKSzCQjQcZ/d2spJdC+3m+dOQie2JaNpPn4CG2aq7a36SUhl5nvpfLd/pF NWCdKBDAQciZ5ac0DBp8IcGltL6dFT/mqh2gPEVNfSt22dK+db94wZq6uR2vy0wUnUIE JzdE5m3IRdxhp3F3lJf4dfeHCUkkNujjNrxEblogTS6o9lD9Q6LuE4CRiF8FkO92bUXs 9LaQ== X-Gm-Message-State: AKaTC01pxlt1x6CIoJ+e/YBJNQfMt8fAZrL5A3a4MtitaA6tCeuiT9Qn41SCVinU+XddQa8D/gNzxoYcskCL5A== X-Received: by 10.36.185.83 with SMTP id k19mr1265199iti.59.1481101915931; Wed, 07 Dec 2016 01:11:55 -0800 (PST) MIME-Version: 1.0 Received: by 10.36.156.133 with HTTP; Wed, 7 Dec 2016 01:11:55 -0800 (PST) In-Reply-To: References: <2340021c-4bcb-2622-07a8-6e6173f94d81@redhat.com> <9fcf577d-cf9e-db6f-c0f8-6842baf8bb83@redhat.com> From: Evgeny Yakovlev Date: Wed, 7 Dec 2016 12:11:55 +0300 Message-ID: To: Laszlo Ersek Cc: edk2-devel@ml01.01.org, eyakovlev@virtuozzo.com, den@virtuozzo.com, Jeff Fan X-Content-Filtered-By: Mailman/MimeDel 2.1.21 Subject: Re: OvmfPkg: VM crashed trying to write to RO memory from CommonInterruptEntry X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Dec 2016 09:11:57 -0000 Content-Type: text/plain; charset=UTF-8 Hi Laszlo So i managed to reproduce this with debug logs this time and figured out what happened. tl;dr: KVM is spamming guest firmware with 8254 PIT timer IRQs after accumulating a lot of missed timer expiration events while win 2k16 boot loader was waiting for debugger connection on COM2. As in previous bugs we see recursive entry into IRQ0 handler in OVMF. OVMF IRQ0 handler workload is O(1) however it enables interrupts during normal processing so reenters are possible. If we manually slow down IRQ0 handler enough we can reproduce this easily. Repro always happens strictly during UEFI BDS phase, when 2k16 boot loader is already executing but still as a UEFI application (ExitBootServices hasn't been called yet). During this time boot loader waits for debugger connection on COM2 (installation ISO was configured this way). While it does so a lot of IRQ3's are generated (COM2) and guest does not react to IRQ0 interrupts which are generated by kvm-pit. KVM-PIT keeps track of all non-acked interrupts. When any ACK comes KVM tries to immediately reinject all accumulated pending interrupts without any delays: kvm_pit_ack_irq: 248 /* in this case, we had multiple outstanding pit interrupts 249 * that we needed to inject. Reinject 250 */ 251 queue_kthread_work(&ps->pit->worker, &ps->pit->expired); In one case KVM managed to accumulate 0x108b interrupts to reinject: CPU 0/KVM-47696 [000] d... 63779.924496: kvm_pit_ack_irq_44: (kvm_pit_ack_irq+0x2c/0x70 [kvm]) pending=108b Normally OVMF configures PIT to generate IRQ0 each 10 milliseconds, however guest now sees IRQ0 each 500-600 microseconds: kvm-pit/47688-47704 [007] .... 63779.924512: kvm_pic_set_irq: chip 0 pin 0 (edge) kvm-pit/47688-47704 [007] .... 63779.925132: kvm_pic_set_irq: chip 0 pin 0 (edge) Which is not enough time for normal OVMF IRQ0 handler to exit so when it enables interrupts IRQ0 handler is immediately reintered. After about 300 reinjects machine runs out of stack space and hits RO area below 1MB. Just though you'd be interested to know! 2016-11-23 19:54 GMT+03:00 Laszlo Ersek : > On 11/23/16 09:37, Evgeny Yakovlev wrote: > > You are right of course about the old tree, no objections here. I will > > try to advocate for an update however i am pretty sure we're stuck with > > our version for some time at least. > > > > Still, my original question was about is it normal for OVMF Sec/Pei > > stage to have its stack so close to 0x100000 > > The SEC phase, and the part of the PEI phase that runs before installing > and migrating to the permanent PEI RAM, use the range 8256KB..8288KB as > heap and stack (half/half, 16 KB and 16 KB). This range is above 8MB. > > The address you specify is 1MB. I don't think we ever set up such a > stack intentionally. > > It is possible that the failure occurs in one of the AP startup routines > (the APs start in real mode), and the exception handler is executed by > the AP. > > To know more, the serial port and debug port outputs would be > interesting; but, I should note, based on past experience, when the APs > run into such issues in their startup routines, the final symptoms are > usually garbage / chaotic. These issues are usually fixed by preventing > the APs from wandering off into the woods in the first place (for > example, nailing down race conditions between BSP and APs), and edk2 has > seen a lot of improvements in that area, with the arrival of MpInitLib. > > > and/or why interrupt > > handler in UefiCpuPkg/Library/CpuExceptionHandlerLib/X64 does not switch > > to a separate stack. > > I hope Jeff can provide some insight here. > > Thanks > Laszlo > > > Code in UefiCpuPkg/Library/CpuExceptionHandlerLib/X64 hasn't been > > touched for 2 years so our version is still relevant. > > > > 2016-11-22 19:58 GMT+03:00 Laszlo Ersek > >: > > > > On 11/22/16 14:58, Evgeny Yakovlev wrote: > > > Wow, that is more than i expected :) > > > > > >> I wonder if you started to see this issue very recently. > > > Very recently, however we use a pretty old OVMF build, circa 2015 > > > > Ugh. Please update OVMF first... A whole lot of things has changed in > > edk2 in this year. > > > > > > > >> OVMF debug log > > > Sorry, we hadn't had it enabled when VM crashed and these crashes > are very > > > rare. We will try to capture it when it happens again > > > > > >> - your host CPU model, > > > cpu family : 6 > > > model : 42 > > > model name : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz > > > stepping : 7 > > > > > >> - the host kernel (KVM) version, > > > Our kernel is roughly based on RHEL7.2 (kernel version > 3.10.0-327.36.1). We > > > also have some upstream KVM patches backported. > > > > > >> - the guest CPU model, > > > -cpu > > > SandyBridge,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+ > monitor,+ds_cpl,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+ > osxsave,-arat,-xsaveopt,-xgetbv1,-vmx,-xsavec,hv_time, > hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vpindex, > hv_runtime,hv_synic,hv_stimer,hv_reset,hv_crash > > > > > >> - the guest CPU topology. > > > 8 sockets, 1 core per socket, 1 thread per core > > > > > > Hope that helps! > > > > The fact that you are using 8 VCPUs is definitely relevant. However, > I > > don't think it would make sense to try to analyze any errors with an > > OVMF / edk2 tree this old. Please try to reproduce the issue with a > > fresh build from master. > > > > Thanks! > > Laszlo > > > > > 2016-11-22 16:41 GMT+03:00 Laszlo Ersek > >: > > > > > >> Hello Evgeny, > > >> > > >> On 11/22/16 13:57, Evgeny Yakovlev wrote: > > >>> We are running windows UEFI-based VMs on QEMU/KVM with OvmfPkg. > > >>> > > >>> Very rarely we are experiencing a crash when VM tries to write > to RO > > >> memory > > >>> very early during UEFI boot process. > > >>> > > >>> Crash happens when VM tries to execute this code in interrupt > > handler: > > >>> > > https://github.com/tianocore/edk2/blob/master/UefiCpuPkg/Library/ > > > > >> CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.asm#L244-L246 > > >>> > > >>> > > >>> fxsave [rdi], where RDI = 0xffe60 > > >>> > > >>> Which is bad - it points to ISA BIOS F-segment area. > > >>> > > >>> This memory was mapped by qemu for read only access, which is > > reflected > > >> in > > >>> KVM EPT: > > >>> 00000000000e0000-00000000000fffff (prio 1, R-): isa-bios > > >>> > > >>> This is a very early IRQ0 interrupt, presumably during early > > >> initialization > > >>> phase (Sec or Pei). > > >>> > > >>> Looks like CommonInterruptHandler does not switch to a separate > > stack and > > >>> works on interrupted context's stack, which was fairly close to > 1MB > > >>> boundary when IRQ0 fired (RSP around 1002c0). When > > CommonInterruptEntry > > >>> reached highlighted code it subtracted 512 bytes from current > > RSP which > > >>> dropped to 0xffe60, below 1MB and into QEMU RO region. > > >>> > > >>> We were figuring out how to best fix this. Possible solutions > are to > > >> switch > > >>> to a separate stack in CommonInterruptEntry, relocate early > > OvmfPkg stack > > >>> to somewhere farther away from 1MB, to run with interrupts > > disabled until > > >>> we reach a later phase or maybe something else. > > >>> > > >>> Any comments would be very appreciated! > > >> > > >> I wonder if you started to see this issue very recently. > > >> > > >> I suspect (hope!) that the symptoms you are experiencing are a > > >> consequence of a bug in UefiCpuPkg that I've debugged and fixed > just > > >> today. (I hope to post the patches today.) > > >> > > >> While testing those patches on your end will of course tell us if > > your > > >> issue has the same root cause, you could gather a few more > > symptoms even > > >> before I get around posting the patches. The bug that I'm working > > on has > > >> extremely varied crash symptoms (basically the APs wander off > > into the > > >> weeds), and some of those symptoms have involved > > CpuExceptionHandlerLib. > > >> The point is, by the time we get into CpuExceptionHandlerLib, all > is > > >> lost -- it is executing on an AP whose state is corrupt anyway. > The > > >> fxsave symptom is a red herring, most likely. > > >> > > >> CpuExceptionHandlerLib works fine otherwise, especially when > invoked > > >> from the BSP -- we've used the output dumped by > > CpuExceptionHandlerLib > > >> to the serial port several times to track down issues. > > >> > > >> So, my request is that you please capture the OVMF debug log > > (please see > > >> the "OvmfPkg/README" file for how). I'm curious if it crashes > > where and > > >> how I suspect it crashes. > > >> > > >> Also, it would help if you provided > > >> - your host CPU model, > > >> - the host kernel (KVM) version, > > >> - the guest CPU model, > > >> - the guest CPU topology. > > >> > > >> Thanks! > > >> Laszlo > > >> > > > _______________________________________________ > > > edk2-devel mailing list > > > edk2-devel@lists.01.org > > > https://lists.01.org/mailman/listinfo/edk2-devel > > > > > > > > > > >