From: Paolo Bonzini
To: Laszlo Ersek, "Fan, Jeff"
Cc: "edk2-devel@ml01.01.org", "Yao, Jiewen"
Subject: Re: [PATCH v2 0/3] Put AP into safe hlt-loop code on S3 path
Date: Mon, 14 Nov 2016 09:50:02 +0100
Message-ID: <4dc14e5c-9b43-4338-c7a5-9750e8a9547a@redhat.com>
In-Reply-To: <00b6828b-78c5-af4f-ab98-de4460b1b8ec@redhat.com>
References: <20161111054545.19616-1-jeff.fan@intel.com> <542CF652F8836A4AB8DBFAAD40ED192A4A2DB4F5@shsmsx102.ccr.corp.intel.com> <00b6828b-78c5-af4f-ab98-de4460b1b8ec@redhat.com>
List-Id: EDK II Development

On 14/11/2016 09:17, Laszlo Ersek wrote:
> On 11/13/16 13:51, Fan, Jeff wrote:
>> Laszlo,
>>
>> Thanks for your testing. It seems that some unknown issue still exists.
>>
>> I suggest pushing this series of patches first, because they make
>> big progress on solving the AP crash issue in
>> https://bugzilla.tianocore.org/show_bug.cgi?id=216.
>
> Sounds good to me.
>
>> I could submit another bug to handle the "AP lost" issue.
>
> I hope that Paolo can continue to help us with the KVM trace analysis.

I will, but it will take a few days. In the meantime it would be nice
if you could take a look at using SendSmiIpiAllExcludingSelf() to bridge
the difference between 0xb2 on QEMU and on real hardware.

Paolo

>> Thus, Jiewen's
>> or others' patches could be pushed as long as they have no additional
>> issue except for "AP Lost".
>
> I haven't gotten around to testing Jiewen's v3 series yet. I think it would
> be best if I could test Jiewen's v3 after this v2 series of yours is
> committed. I'll report back with results.
>
> Thanks
> Laszlo
>
>>
>> I could follow up to fix the "AP Lost" issue.
>>
>> Thanks!
>> Jeff
>>
>>
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>> Sent: Saturday, November 12, 2016 3:49 AM
>> To: Fan, Jeff
>> Cc: edk2-devel@ml01.01.org; Yao, Jiewen; Paolo Bonzini
>> Subject: Re: [edk2] [PATCH v2 0/3] Put AP into safe hlt-loop code on S3 path
>>
>> On 11/11/16 06:45, Jeff Fan wrote:
>>> On the S3 path, we wake up APs to restore CPU context in the
>>> PiSmmCpuDxeSmm driver. If an NMI or SMI happens, APs may exit
>>> from the hlt state and execute the instruction after the HLT instruction.
>>>
>>> But the APs are then not running safe code, which makes OVMF S3 boot unstable.
>>>
>>> https://bugzilla.tianocore.org/show_bug.cgi?id=216
>>>
>>> I tested on a real platform with 64-bit DXE.
>>>
>>> v2:
>>> 1. Make stack alignment per Laszlo's comment.
>>> 2. Trim whitespace at end of line per Laszlo's comment.
>>> 3. Update year mark in file header.
>>> 4. Enhancement on InterlockedDecrement() per Paolo's comment.
>>>
>>> Jeff Fan (3):
>>>   UefiCpuPkg/PiSmmCpuDxeSmm: Put AP into safe hlt-loop code on S3 path
>>>   UefiCpuPkg/PiSmmCpuDxeSmm: Place AP to 32bit protected mode on S3 path
>>>   UefiCpuPkg/PiSmmCpuDxeSmm: Decrease mNumberToFinish in AP safe code
>>>
>>>  UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c             | 33 +++++++++++++-
>>>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmFuncsArch.c | 29 +++++++++++-
>>>  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h    | 15 +++++++
>>>  UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c  | 63 ++++++++++++++++++++++++++-
>>>  4 files changed, 136 insertions(+), 4 deletions(-)
>>>
>>
>> Applied this locally to master (ffd6b0b1b65e) for testing. I tested the
>> series with a suspend-resume loop -- not a busy loop, just manually.
>> (So there was always one second or so between adjacent steps.)
>>
>> No crashes or emulation failures, but the "AP going lost" issue remains
>> present -- sometimes Linux cannot bring up one of the four VCPUs after resume.
>>
>> In the Ia32 case, this "AP lost" symptom surfaced after the 6th resume.
>>
>> In the Ia32X64 case, I experienced the symptom after the 89th resume.
>>
>> Thanks
>> Laszlo
>>