From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=134.134.136.65; helo=mga03.intel.com; envelope-from=ruiyu.ni@intel.com; receiver=edk2-devel@lists.01.org Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 365B82116DF97 for ; Wed, 17 Oct 2018 19:12:37 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Oct 2018 19:12:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,393,1534834800"; d="scan'208";a="82346143" Received: from fmsmsx104.amr.corp.intel.com ([10.18.124.202]) by orsmga008.jf.intel.com with ESMTP; 17 Oct 2018 19:12:36 -0700 Received: from fmsmsx118.amr.corp.intel.com (10.18.116.18) by fmsmsx104.amr.corp.intel.com (10.18.124.202) with Microsoft SMTP Server (TLS) id 14.3.319.2; Wed, 17 Oct 2018 19:12:36 -0700 Received: from shsmsx103.ccr.corp.intel.com (10.239.4.69) by fmsmsx118.amr.corp.intel.com (10.18.116.18) with Microsoft SMTP Server (TLS) id 14.3.319.2; Wed, 17 Oct 2018 19:12:36 -0700 Received: from shsmsx104.ccr.corp.intel.com ([169.254.5.48]) by SHSMSX103.ccr.corp.intel.com ([169.254.4.245]) with mapi id 14.03.0319.002; Thu, 18 Oct 2018 10:12:33 +0800 From: "Ni, Ruiyu" To: "Dong, Eric" , "edk2-devel@lists.01.org" CC: Laszlo Ersek Thread-Topic: [Patch v2 0/6] Fix performance issue caused by Set MSR task. Thread-Index: AQHUZb96FEM7xvN9k0axYjNQnV22TaUkRH1w Date: Thu, 18 Oct 2018 02:12:32 +0000 Message-ID: <734D49CCEBEEF84792F5B80ED585239D5BE97331@SHSMSX104.ccr.corp.intel.com> References: <20181017021635.14972-1-eric.dong@intel.com> In-Reply-To: <20181017021635.14972-1-eric.dong@intel.com> Accept-Language: en-US, zh-CN X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] MIME-Version: 1.0 Subject: Re: [Patch v2 0/6] Fix performance issue caused by Set MSR task. X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2018 02:12:37 -0000 Content-Language: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Eric, Can you post your changes to github yours mirror repo? I found #3/6 cannot be applied to my code properly. Thanks/Ray > -----Original Message----- > From: Dong, Eric > Sent: Wednesday, October 17, 2018 10:16 AM > To: edk2-devel@lists.01.org > Cc: Ni, Ruiyu ; Laszlo Ersek > Subject: [Patch v2 0/6] Fix performance issue caused by Set MSR task. >=20 > V2 changes include: > 1. Include the change for CpuCommonFeaturesLib which used to set MSR > base on its scope info. > 2. Include the change for CpuS3DataDxe driver which also handle the > AcpiCpuData data. > 3. Update code base on feedback for V1 changes. >=20 > V1 changes include: > In a system which has multiple cores, current set register value task cos= ts > huge times. > After investigation, current set MSR task costs most of the times. Curren= t > logic uses SpinLock to let set MSR task as an single thread task for all = cores. > Because MSR has scope attribute which may cause GP fault if multiple APs > set MSR at the same time, current logic use an easiest solution (use Spin= Lock) > to avoid this issue, but it will cost huge times. >=20 > In order to fix this performance issue, new solution will set MSRs base o= n > their scope attribute. After this, the SpinLock will not needed. Without > SpinLock, new issue raised which is caused by MSR dependence. For > example, MSR A depends on MSR B which means MSR A must been set after > MSR B has been set. Also MSR B is package scope level and MSR A is thread > scope level. If system has multiple threads, Thread 1 needs to set the th= read > level MSRs and thread 2 needs to set thread and package level MSRs. Set > MSRs task for thread 1 and thread 2 like below: >=20 > Thread 1 Thread 2 > MSR B N Y > MSR A Y Y >=20 > If driver don't control execute MSR order, for thread 1, it will execute = MSR A > first, but at this time, MSR B not been executed yet by thread 2. system = may > trig exception at this time. >=20 > In order to fix the above issue, driver introduces semaphore logic to con= trol > the MSR execute sequence. For the above case, a semaphore will be add > between MSR A and B for all threads. Semaphore has scope info for it. The > possible scope value is core or package. > For each thread, when it meets a semaphore during it set registers, it wi= ll 1) > release semaphore (+1) for each threads in this core or package(based on > the scope info for this > semaphore) 2) acquire semaphore (-1) for all the threads in this core or > package(based on the scope info for this semaphore). With these two steps= , > driver can control MSR sequence. Sample code logic like below: >=20 > // > // First increase semaphore count by 1 for processors in this package. > // > for (ProcessorIndex =3D 0; ProcessorIndex < PackageThreadsCount ; > ProcessorIndex ++) { > LibReleaseSemaphore ((UINT32 *) &SemaphorePtr[PackageOffset + > ProcessorIndex]); > } > // > // Second, check whether the count has reach the check number. > // > for (ProcessorIndex =3D 0; ProcessorIndex < ValidApCount; ProcessorInde= x ++) > { > LibWaitForSemaphore (&SemaphorePtr[ApOffset]); > } >=20 > Platform Requirement: > 1. This change requires register MSR setting base on MSR scope info. If s= till > register MSR > for all threads, exception may raised. >=20 > Known limitation: > 1. Current CpuFeatures driver supports DXE instance and PEI instance. But > semaphore logic > requires Aps execute in async mode which is not supported by PEI drive= r. > So CpuFeature > PEI instance not works after this change. We plan to support async mod= e > for PEI in phase > 2 for this task. > 2. Current execute MSR task code in duplicated in PiSmmCpuDxeSmm driver > and > RegisterCpuFeaturesLib library because the schedule limitation. Will m= erge > the code to > RegisterCpuFeaturesLib and export as an API in phase 2 for this task. >=20 > Cc: Ruiyu Ni > Cc: Laszlo Ersek > Contributed-under: TianoCore Contribution Agreement 1.1 > Signed-off-by: Eric Dong >=20 >=20 > Eric Dong (6): > UefiCpuPkg/Include/AcpiCpuData.h: Add Semaphore related Information. > UefiCpuPkg/RegisterCpuFeaturesLib.h: Add new dependence types. > UefiCpuPkg/RegisterCpuFeaturesLib: Add logic to support semaphore > type. > UefiCpuPkg/PiSmmCpuDxeSmm: Add logic to support semaphore type. > UefiCpuPkg/CpuS3DataDxe: Keep old data if value already existed. > UefiCpuPkg/CpuCommonFeaturesLib: Register MSR base on scope Info. >=20 > UefiCpuPkg/CpuS3DataDxe/CpuS3Data.c | 2 + > UefiCpuPkg/Include/AcpiCpuData.h | 45 +- > .../Include/Library/RegisterCpuFeaturesLib.h | 25 +- > UefiCpuPkg/Library/CpuCommonFeaturesLib/C1e.c | 8 + > UefiCpuPkg/Library/CpuCommonFeaturesLib/Eist.c | 12 + > .../Library/CpuCommonFeaturesLib/ExecuteDisable.c | 10 + > .../Library/CpuCommonFeaturesLib/FastStrings.c | 12 + > .../Library/CpuCommonFeaturesLib/FeatureControl.c | 38 ++ > .../CpuCommonFeaturesLib/LimitCpuIdMaxval.c | 14 + > .../Library/CpuCommonFeaturesLib/MachineCheck.c | 38 ++ > .../Library/CpuCommonFeaturesLib/MonitorMwait.c | 15 + > .../Library/CpuCommonFeaturesLib/PendingBreak.c | 11 + > UefiCpuPkg/Library/CpuCommonFeaturesLib/Ppin.c | 11 + > .../Library/CpuCommonFeaturesLib/ProcTrace.c | 11 + > UefiCpuPkg/Library/CpuCommonFeaturesLib/X2Apic.c | 10 + > .../RegisterCpuFeaturesLib/CpuFeaturesInitialize.c | 438 > ++++++++++++++++--- > .../DxeRegisterCpuFeaturesLib.c | 71 ++- > .../DxeRegisterCpuFeaturesLib.inf | 3 + > .../PeiRegisterCpuFeaturesLib.c | 55 ++- > .../PeiRegisterCpuFeaturesLib.inf | 1 + > .../RegisterCpuFeaturesLib/RegisterCpuFeatures.h | 61 ++- > .../RegisterCpuFeaturesLib.c | 484 +++++++++++++++= +++--- > UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c | 406 +++++++++++----= -- > UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c | 3 - > UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h | 3 +- > 25 files changed, 1505 insertions(+), 282 deletions(-) >=20 > -- > 2.15.0.windows.1