From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ml01.01.org (Postfix) with ESMTPS id 7589781E2D
	for ; Thu, 10 Nov 2016 05:33:40 -0800 (PST)
Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mx1.redhat.com (Postfix) with ESMTPS id 9617390221;
	Thu, 10 Nov 2016 13:33:43 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com (ovpn-116-106.phx2.redhat.com [10.3.116.106])
	by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id uAADXgaV021897;
	Thu, 10 Nov 2016 08:33:42 -0500
To: Paolo Bonzini , Jeff Fan 
References: <20161110060708.13932-1-jeff.fan@intel.com> <0528a12e-3755-99cb-861a-ac927d484ec1@redhat.com>
Cc: edk2-devel@ml01.01.org, Jiewen Yao 
From: Laszlo Ersek 
Message-ID: <6c731344-df44-79ea-63df-64352d9af388@redhat.com>
Date: Thu, 10 Nov 2016 14:33:41 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To: 
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Thu, 10 Nov 2016 13:33:43 +0000 (UTC)
Subject: Re: [PATCH 0/2] Put AP into safe hlt-loop code on S3 path
X-BeenThere: edk2-devel@lists.01.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: EDK II Development 
List-Unsubscribe: , 
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: , 
X-List-Received-Date: Thu, 10 Nov 2016 13:33:40 -0000
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit

On 11/10/16 13:26, Paolo Bonzini wrote:
>
>
> On 10/11/2016 11:41, Laszlo Ersek wrote:
>> Here's an excerpt from the KVM trace:
>>
>>> CPU-23509 [002] 8406.908787: kvm_enter_smm: vcpu 1: entering SMM, smbase 0x30000
>>> CPU-23509 [002] 8406.908836: kvm_enter_smm: vcpu 1: leaving SMM, smbase 0x7ffb3000
>>> CPU-23510 [003] 8406.908850: kvm_enter_smm: vcpu 2: entering SMM, smbase 0x30000
>>> CPU-23510 [003] 8406.908881: kvm_enter_smm: vcpu 2: leaving SMM, smbase 0x7ffb5000
>>> CPU-23511 [001] 8406.908908: kvm_enter_smm: vcpu 3: entering SMM, smbase 0x30000
>>> CPU-23511 [001] 8406.908941: kvm_enter_smm: vcpu 3: leaving SMM, smbase 0x7ffb7000
>>> CPU-23508 [005] 8406.908951: kvm_enter_smm: vcpu 0: entering SMM, smbase 0x30000
>>> CPU-23508 [005] 8406.908989: kvm_enter_smm: vcpu 0: leaving SMM, smbase 0x7ffb1000
>>> CPU-23511 [001] 8406.920215: kvm_enter_smm: vcpu 3: entering SMM, smbase 0x7ffb7000
>>> CPU-23509 [002] 8406.920225: kvm_enter_smm: vcpu 1: entering SMM, smbase 0x7ffb3000
>>> CPU-23510 [003] 8406.920225: kvm_enter_smm: vcpu 2: entering SMM, smbase 0x7ffb5000
>>> CPU-23508 [005] 8406.920227: kvm_enter_smm: vcpu 0: entering SMM, smbase 0x7ffb1000
>>> CPU-23508 [005] 8406.920262: kvm_enter_smm: vcpu 0: leaving SMM, smbase 0x7ffb1000
>>> CPU-23511 [001] 8406.920263: kvm_enter_smm: vcpu 3: leaving SMM, smbase 0x7ffb7000
>>> CPU-23508 [005] 8407.020292: kvm_enter_smm: vcpu 0: entering SMM, smbase 0x7ffb1000
>>> CPU-23509 [006] 8407.020338: kvm_enter_smm: vcpu 1: leaving SMM, smbase 0x7ffb3000
>>> CPU-23510 [003] 8407.020338: kvm_enter_smm: vcpu 2: leaving SMM, smbase 0x7ffb5000
>>> CPU-23508 [005] 8407.020338: kvm_enter_smm: vcpu 0: leaving SMM, smbase 0x7ffb1000
>>
>> It seems that VCPU#0 still leaves (and then re-enters) SMM while VCPU#1 and VCPU#2 are firmly in SMM.
>>
>> So this series is a clear improvement, but something else remains amiss.
>>
>> If I remove Jiewen's v2 series, and apply only this one, then the symptom shows up much less frequently, but it does exist:
>> - With (Jiewen's v2 + this one), testing case 13, I hit the symptom on the second resume,
>> - With just this set applied, I hit the symptom (= one AP disappearing from Linux after resume) only on the 24th resume.
>
> Any trace I can look at?

Thank you for asking / offering, I did stash the full trace :), I just
didn't want to foist it upon you without you offering first :)

http://people.redhat.com/lersek/s3-crash-8d1dfed7-ca92-4e25-8d2b-b1c9ac2a53db/case-13-jiewen-v2-jeff-fixup-trace.txt.bz2

> What about case 14, with
> PcdCpuSmmStaticPageTable=TRUE?

I didn't test case 14, because I consider it a more demanding test case
than case 13. According to Jiewen's earlier explanation,
PcdCpuSmmStaticPageTable=FALSE implements a subset of the protections
that PcdCpuSmmStaticPageTable=TRUE implements.

Thanks
Laszlo
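
The observation about the quoted trace (a VCPU leaving SMM while other VCPUs are still inside) can be checked mechanically. What follows is only a minimal sketch, not part of the original message. It assumes the line layout seen in the excerpt ("<timestamp>: kvm_enter_smm: vcpu N: entering|leaving SMM") and assumes the lines are already in chronological order. The names `EVENT`, `SAMPLE` and `exits_with_peers_in_smm` are hypothetical, made up for illustration. Whether such an exit is actually a bug depends on the firmware's SMM synchronization logic, which this script knows nothing about.

```python
import re

# Assumed layout, taken from the excerpt quoted in the message:
#   "<timestamp>: kvm_enter_smm: vcpu <N>: entering|leaving SMM, smbase ..."
# re.search() is used so that leading ">>> " quote markers are tolerated.
EVENT = re.compile(r"(\d+\.\d+): kvm_enter_smm: vcpu (\d+): (entering|leaving) SMM")


def exits_with_peers_in_smm(lines):
    """Return (timestamp, vcpu, peers_still_in_smm) for every SMM exit
    that happens while at least one other VCPU is still inside SMM."""
    in_smm = set()   # VCPUs currently inside SMM, per the events seen so far
    flagged = []
    for line in lines:
        match = EVENT.search(line)
        if match is None:
            continue  # not a kvm_enter_smm event, skip it
        timestamp, vcpu, action = match.group(1), int(match.group(2)), match.group(3)
        if action == "entering":
            in_smm.add(vcpu)
        else:
            in_smm.discard(vcpu)
            if in_smm:
                flagged.append((timestamp, vcpu, sorted(in_smm)))
    return flagged


# Six consecutive lines copied from the quoted trace excerpt.
SAMPLE = """
CPU-23511 [001] 8406.920215: kvm_enter_smm: vcpu 3: entering SMM, smbase 0x7ffb7000
CPU-23509 [002] 8406.920225: kvm_enter_smm: vcpu 1: entering SMM, smbase 0x7ffb3000
CPU-23510 [003] 8406.920225: kvm_enter_smm: vcpu 2: entering SMM, smbase 0x7ffb5000
CPU-23508 [005] 8406.920227: kvm_enter_smm: vcpu 0: entering SMM, smbase 0x7ffb1000
CPU-23508 [005] 8406.920262: kvm_enter_smm: vcpu 0: leaving SMM, smbase 0x7ffb1000
CPU-23511 [001] 8406.920263: kvm_enter_smm: vcpu 3: leaving SMM, smbase 0x7ffb7000
"""

if __name__ == "__main__":
    for ts, vcpu, peers in exits_with_peers_in_smm(SAMPLE.splitlines()):
        print(f"{ts}: vcpu {vcpu} left SMM while vcpus {peers} were still inside")
```

On the six sample lines this flags the exit of vcpu 0 at 8406.920262 (with vcpus 1, 2 and 3 still inside) and the exit of vcpu 3 at 8406.920263 (with vcpus 1 and 2 still inside). The same function could be fed the decompressed full trace, under the same assumptions about its layout and ordering.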