From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ml01.01.org (Postfix) with ESMTPS id DC0C281E82
	for ; Fri, 11 Nov 2016 11:49:03 -0800 (PST)
Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mx1.redhat.com (Postfix) with ESMTPS id 6506972AF2;
	Fri, 11 Nov 2016 19:49:07 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com (ovpn-116-102.phx2.redhat.com [10.3.116.102])
	by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id uABJn5B5006885;
	Fri, 11 Nov 2016 14:49:06 -0500
To: Jeff Fan
References: <20161111054545.19616-1-jeff.fan@intel.com>
Cc: edk2-devel@ml01.01.org, Jiewen Yao, Paolo Bonzini
From: Laszlo Ersek
Message-ID:
Date: Fri, 11 Nov 2016 20:49:05 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To: <20161111054545.19616-1-jeff.fan@intel.com>
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.26
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 11 Nov 2016 19:49:07 +0000 (UTC)
Subject: Re: [PATCH v2 0/3] Put AP into safe hlt-loop code on S3 path
X-BeenThere: edk2-devel@lists.01.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: EDK II Development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 11 Nov 2016 19:49:04 -0000
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit

On 11/11/16 06:45, Jeff Fan wrote:
> On S3 path, we will wake up APs to restore CPU context in PiSmmCpuDxeSmm
> driver. In case, one NMI or SMI happens, APs may exit from hlt state and
> execute the instruction after HLT instruction.
>
> But APs are not running on safe code, it leads OVMF S3 boot unstable.
>
> https://bugzilla.tianocore.org/show_bug.cgi?id=216
>
> I tested real platform with 64bit DXE.
>
> v2:
>   1. Make stack alignment per Laszlo's comment.
>   2. Trim whitespace at end of end per Laszlo's comment.
>   3. Update year mark in file header.
>   4. Enhancement on InterlockedDecrement() per Paolo's comment.
>
> Jeff Fan (3):
>   UefiCpuPkg/PiSmmCpuDxeSmm: Put AP into safe hlt-loop code on S3 path
>   UefiCpuPkg/PiSmmCpuDxeSmm: Place AP to 32bit protected mode on S3 path
>   UefiCpuPkg/PiSmmCpuDxeSmm: Decrease mNumberToFinish in AP safe code
>
>  UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c             | 33 +++++++++++++-
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmFuncsArch.c | 29 +++++++++++-
>  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h    | 15 +++++++
>  UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c  | 63 ++++++++++++++++++++++++++-
>  4 files changed, 136 insertions(+), 4 deletions(-)
>

Applied this locally to master (ffd6b0b1b65e) for testing.

I tested the series with a suspend-resume loop -- not a busy loop, just
manually. (So there was always one second or so between adjacent steps.)

No crashes or emulation failures, but the "AP going lost" issue remains
present -- sometimes Linux cannot bring up one of the four VCPUs after
resume.

In the Ia32 case, this "AP lost" symptom surfaced after the 6th resume.
In the Ia32X64 case, I experienced the symptom after the 89th resume.

Thanks
Laszlo
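
For anyone who wants to check for the "AP lost" symptom inside the guest after each resume, here is a minimal sketch. It is not the procedure used for the test report above (that was done manually). It assumes a Linux guest that exposes the online CPU list in /sys/devices/system/cpu/online, and four expected VCPUs as in the report; the function names are hypothetical.

```python
# Sketch only: detect VCPUs that did not come back online after resume.
# Assumption: the guest is Linux and /sys/devices/system/cpu/online holds
# a CPU list such as "0-3" or "0,2-3". Function names are illustrative.

def parse_cpu_list(text):
    """Expand a kernel CPU list string such as "0-2,4" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def missing_cpus(online_text, expected_count):
    """Return the CPU numbers below expected_count that are not online."""
    online = set(parse_cpu_list(online_text))
    return [cpu for cpu in range(expected_count) if cpu not in online]


def check_guest(expected_count=4, path="/sys/devices/system/cpu/online"):
    """Read the sysfs file in the guest and report missing VCPUs."""
    with open(path) as f:
        return missing_cpus(f.read(), expected_count)
```

Running check_guest() in the guest after each resume and stopping at the first non-empty result would give the resume count at which a VCPU went missing.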