From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=134.134.136.20; helo=mga02.intel.com; envelope-from=ruiyu.ni@intel.com; receiver=edk2-devel@lists.01.org Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 447EF2116325D for ; Mon, 15 Oct 2018 20:04:01 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 15 Oct 2018 20:04:00 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,386,1534834800"; d="scan'208";a="100553337" Received: from ray-dev.ccr.corp.intel.com (HELO [10.239.9.11]) ([10.239.9.11]) by orsmga002.jf.intel.com with ESMTP; 15 Oct 2018 20:03:59 -0700 To: Eric Dong , edk2-devel@lists.01.org Cc: Laszlo Ersek References: <20181015024948.228-1-eric.dong@intel.com> <20181015024948.228-4-eric.dong@intel.com> From: "Ni, Ruiyu" Message-ID: Date: Tue, 16 Oct 2018 11:05:09 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1 MIME-Version: 1.0 In-Reply-To: <20181015024948.228-4-eric.dong@intel.com> Subject: Re: [Patch 3/4] UefiCpuPkg/RegisterCpuFeaturesLib: Add logic to support semaphore type. X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2018 03:04:01 -0000 Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit On 10/15/2018 10:49 AM, Eric Dong wrote: > In a system which has multiple cores, current set register value task costs huge times. > After investigation, current set MSR task costs most of the times. Current logic uses > SpinLock to let set MSR task as an single thread task for all cores. Because MSR has > scope attribute which may cause GP fault if multiple APs set MSR at the same time, > current logic use an easiest solution (use SpinLock) to avoid this issue, but it will > cost huge times. > > In order to fix this performance issue, new solution will set MSRs base on their scope > attribute. After this, the SpinLock will not needed. Without SpinLock, new issue raised > which is caused by MSR dependence. For example, MSR A depends on MSR B which means MSR A > must been set after MSR B has been set. Also MSR B is package scope level and MSR A is > thread scope level. If system has multiple threads, Thread 1 needs to set the thread level > MSRs and thread 2 needs to set thread and package level MSRs. Set MSRs task for thread 1 > and thread 2 like below: > > Thread 1 Thread 2 > MSR B N Y > MSR A Y Y > > If driver don't control execute MSR order, for thread 1, it will execute MSR A first, but > at this time, MSR B not been executed yet by thread 2. system may trig exception at this > time. > > In order to fix the above issue, driver introduces semaphore logic to control the MSR > execute sequence. For the above case, a semaphore will be add between MSR A and B for > all threads. Semaphore has scope info for it. The possible scope value is core or package. > For each thread, when it meets a semaphore during it set registers, it will 1) release > semaphore (+1) for each threads in this core or package(based on the scope info for this > semaphore) 2) acquire semaphore (-1) for all the threads in this core or package(based > on the scope info for this semaphore). With these two steps, driver can control MSR > sequence. Sample code logic like below: > > // > // First increase semaphore count by 1 for processors in this package. > // > for (ProcessorIndex = 0; ProcessorIndex < PackageThreadsCount ; ProcessorIndex ++) { > LibReleaseSemaphore ((UINT32 *) &SemaphorePtr[PackageOffset + ProcessorIndex]); > } > // > // Second, check whether the count has reach the check number. > // > for (ProcessorIndex = 0; ProcessorIndex < ValidApCount; ProcessorIndex ++) { > LibWaitForSemaphore (&SemaphorePtr[ApOffset]); > } > > Platform Requirement: > 1. This change requires register MSR setting base on MSR scope info. If still register MSR > for all threads, exception may raised. > > Known limitation: > 1. Current CpuFeatures driver supports DXE instance and PEI instance. But semaphore logic > requires Aps execute in async mode which is not supported by PEI driver. So CpuFeature > PEI instance not works after this change. We plan to support async mode for PEI in phase > 2 for this task. > > Cc: Ruiyu Ni > Cc: Laszlo Ersek > Contributed-under: TianoCore Contribution Agreement 1.1 > Signed-off-by: Eric Dong > --- > .../RegisterCpuFeaturesLib/CpuFeaturesInitialize.c | 324 ++++++++++++--- > .../DxeRegisterCpuFeaturesLib.c | 71 +++- > .../DxeRegisterCpuFeaturesLib.inf | 3 + > .../PeiRegisterCpuFeaturesLib.c | 55 ++- > .../PeiRegisterCpuFeaturesLib.inf | 1 + > .../RegisterCpuFeaturesLib/RegisterCpuFeatures.h | 51 ++- > .../RegisterCpuFeaturesLib.c | 452 ++++++++++++++++++--- > 7 files changed, 840 insertions(+), 117 deletions(-) > > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c > index ba3fb3250f..f820b4fed7 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c > @@ -145,6 +145,20 @@ CpuInitDataInitialize ( > CPU_FEATURES_INIT_ORDER *InitOrder; > CPU_FEATURES_DATA *CpuFeaturesData; > LIST_ENTRY *Entry; > + UINT32 Core; > + UINT32 Package; > + UINT32 Thread; > + EFI_CPU_PHYSICAL_LOCATION *Location; > + UINT32 *CoreArray; > + UINTN Index; > + UINT32 ValidCount; > + UINTN CoreIndex; > + ACPI_CPU_DATA *AcpiCpuData; > + CPU_STATUS_INFORMATION *CpuStatus; > + > + Core = 0; > + Package = 0; > + Thread = 0; > > CpuFeaturesData = GetCpuFeaturesData (); > CpuFeaturesData->InitOrder = AllocateZeroPool (sizeof (CPU_FEATURES_INIT_ORDER) * NumberOfCpus); > @@ -163,6 +177,16 @@ CpuInitDataInitialize ( > Entry = Entry->ForwardLink; > } > > + CpuFeaturesData->NumberOfCpus = (UINT32) NumberOfCpus; > + > + AcpiCpuData = (ACPI_CPU_DATA *) (UINTN) PcdGet64 (PcdCpuS3DataAddress); > + ASSERT (AcpiCpuData != NULL); > + CpuFeaturesData->AcpiCpuData= AcpiCpuData; > + > + CpuStatus = &AcpiCpuData->CpuStatus; > + AcpiCpuData->ApLocation = AllocateZeroPool (sizeof (EFI_CPU_PHYSICAL_LOCATION) * NumberOfCpus); > + ASSERT (AcpiCpuData->ApLocation != NULL); > + > for (ProcessorNumber = 0; ProcessorNumber < NumberOfCpus; ProcessorNumber++) { > InitOrder = &CpuFeaturesData->InitOrder[ProcessorNumber]; > InitOrder->FeaturesSupportedMask = AllocateZeroPool (CpuFeaturesData->BitMaskSize); > @@ -175,7 +199,59 @@ CpuInitDataInitialize ( > &ProcessorInfoBuffer, > sizeof (EFI_PROCESSOR_INFORMATION) > ); > + CopyMem ( > + AcpiCpuData->ApLocation + ProcessorNumber, > + &ProcessorInfoBuffer.Location, > + sizeof (EFI_CPU_PHYSICAL_LOCATION) > + ); > + Please add more comments here to describe what the below code tries to do and why. > + if (Package < ProcessorInfoBuffer.Location.Package) { > + Package = ProcessorInfoBuffer.Location.Package; > + } > + if (Core < ProcessorInfoBuffer.Location.Core) { > + Core = ProcessorInfoBuffer.Location.Core; > + } > + if (Thread < ProcessorInfoBuffer.Location.Thread) { > + Thread = ProcessorInfoBuffer.Location.Thread; > + } > + } > + CpuStatus->PackageCount = Package + 1; > + CpuStatus->CoreCount = Core + 1; > + CpuStatus->ThreadCount = Thread + 1; > + DEBUG ((DEBUG_INFO, "Processor Info: Package: %d, Core : %d, Thread: %d\n", > + CpuStatus->PackageCount, > + CpuStatus->CoreCount, > + CpuStatus->ThreadCount)); Please use MaxCore and MaxThread in debug message. Otherwise it's confusing. > + > + // > + // Collect valid core count in each package because not all cores are valid. > + // > + CpuStatus->ValidCoresInPackages = AllocateZeroPool (sizeof (UINT32) * CpuStatus->PackageCount); > + ASSERT (CpuStatus->ValidCoresInPackages != NULL); Please add comments to describe the purpose of CoreArray. CoreArray is not a good name IMO. How about: CoreVisited - AllocatePool (sizeof (BOOLEAN) * CpuStatus->MaxCoreCount); > + CoreArray = AllocatePool (sizeof (UINT32) * CpuStatus->CoreCount); > + ASSERT (CoreArray != NULL); > + > + for (Index = 0; Index <= Package; Index ++ ) { Please stop using Package/Core/Thread. Use the field in CpuStatus structure instead. It makes the code more readable. > + ZeroMem (CoreArray, sizeof (UINT32) * (Core + 1)); > + for (ProcessorNumber = 0; ProcessorNumber < NumberOfCpus; ProcessorNumber++) { > + Location = &CpuFeaturesData->InitOrder[ProcessorNumber].CpuInfo.ProcessorInfo.Location; > + if (Location->Package == Index) { > + CoreArray[Location->Core] = 1; > + } The above if-clause can be: if ((Location->Package == Index) && !CoreVisited[Location->Core])) { CpuStatus->ValidCoreCountPerPackage[Index]++; CoreVisited[Location->Core] = TRUE; } The for-loop below can be removed. > + } > + for (CoreIndex = 0, ValidCount = 0; CoreIndex <= Core; CoreIndex ++) { > + ValidCount += CoreArray[CoreIndex]; > + } > + CpuStatus->ValidCoresInPackages[Index] = ValidCount; > } > + FreePool (CoreArray); > + for (Index = 0; Index <= Package; Index++) { > + DEBUG ((DEBUG_INFO, "Package: %d, Valid Core : %d\n", Index, CpuStatus->ValidCoresInPackages[Index])); > + } > + > + CpuFeaturesData->CpuFlags.SemaphoreCount = AllocateZeroPool (sizeof (UINT32) * CpuStatus->PackageCount * CpuStatus->CoreCount* CpuStatus->ThreadCount); > + ASSERT (CpuFeaturesData->CpuFlags.SemaphoreCount != NULL); > + > // > // Get support and configuration PCDs > // > @@ -310,7 +386,7 @@ CollectProcessorData ( > LIST_ENTRY *Entry; > CPU_FEATURES_DATA *CpuFeaturesData; > > - CpuFeaturesData = GetCpuFeaturesData (); > + CpuFeaturesData = (CPU_FEATURES_DATA *)Buffer; Is the above change more proper in a separate patch? > ProcessorNumber = GetProcessorIndex (); > CpuInfo = &CpuFeaturesData->InitOrder[ProcessorNumber].CpuInfo; > // > @@ -416,6 +492,15 @@ DumpRegisterTableOnProcessor ( > RegisterTableEntry->Value > )); > break; > + case Semaphore: > + DEBUG (( > + DebugPrintErrorLevel, > + "Processor: %d: Semaphore: Scope Value: %d\r\n", How about print the Scope value in string? This makes the debug message more meaningful. > + ProcessorNumber, > + RegisterTableEntry->Value > + )); > + break; > + > default: > break; > } > @@ -441,6 +526,11 @@ AnalysisProcessorFeatures ( > REGISTER_CPU_FEATURE_INFORMATION *CpuInfo; > LIST_ENTRY *Entry; > CPU_FEATURES_DATA *CpuFeaturesData; > + LIST_ENTRY *NextEntry; > + CPU_FEATURES_ENTRY *NextCpuFeatureInOrder; > + BOOLEAN Success; > + CPU_FEATURE_DEPENDENCE_TYPE BeforeDep; > + CPU_FEATURE_DEPENDENCE_TYPE AfterDep; > > CpuFeaturesData = GetCpuFeaturesData (); > CpuFeaturesData->CapabilityPcd = AllocatePool (CpuFeaturesData->BitMaskSize); > @@ -517,8 +607,14 @@ AnalysisProcessorFeatures ( > // > CpuInfo = &CpuFeaturesData->InitOrder[ProcessorNumber].CpuInfo; > Entry = GetFirstNode (&CpuInitOrder->OrderList); > + NextEntry = Entry->ForwardLink; > while (!IsNull (&CpuInitOrder->OrderList, Entry)) { > CpuFeatureInOrder = CPU_FEATURE_ENTRY_FROM_LINK (Entry); > + if (!IsNull (&CpuInitOrder->OrderList, NextEntry)) { > + NextCpuFeatureInOrder = CPU_FEATURE_ENTRY_FROM_LINK (NextEntry); > + } else { > + NextCpuFeatureInOrder = NULL; > + } > if (IsBitMaskMatch (CpuFeatureInOrder->FeatureMask, CpuFeaturesData->SettingPcd)) { > Status = CpuFeatureInOrder->InitializeFunc (ProcessorNumber, CpuInfo, CpuFeatureInOrder->ConfigData, TRUE); > if (EFI_ERROR (Status)) { > @@ -532,6 +628,8 @@ AnalysisProcessorFeatures ( > DEBUG ((DEBUG_WARN, "Warning :: Failed to enable Feature: Mask = ")); > DumpCpuFeatureMask (CpuFeatureInOrder->FeatureMask); > } > + } else { > + Success = TRUE; > } > } else { > Status = CpuFeatureInOrder->InitializeFunc (ProcessorNumber, CpuInfo, CpuFeatureInOrder->ConfigData, FALSE); > @@ -542,9 +640,36 @@ AnalysisProcessorFeatures ( > DEBUG ((DEBUG_WARN, "Warning :: Failed to disable Feature: Mask = ")); > DumpCpuFeatureMask (CpuFeatureInOrder->FeatureMask); > } > + } else { > + Success = TRUE; > } > } > - Entry = Entry->ForwardLink; > + > + if (Success) { > + // > + // If feature has dependence with the next feature (ONLY care core/package dependency). > + // and feature initialize succeed, add sync semaphere here. > + // > + BeforeDep = DetectFeatureScope (CpuFeatureInOrder, TRUE); > + if (NextCpuFeatureInOrder != NULL) { > + AfterDep = DetectFeatureScope (NextCpuFeatureInOrder, FALSE); > + } else { > + AfterDep = NoneDepType; > + } > + // > + // Assume only one of the depend is valid. > + // > + ASSERT (!(BeforeDep > ThreadDepType && AfterDep > ThreadDepType)); > + if (BeforeDep > ThreadDepType) { > + CPU_REGISTER_TABLE_WRITE32 (ProcessorNumber, Semaphore, 0, BeforeDep); > + } > + if (AfterDep > ThreadDepType) { > + CPU_REGISTER_TABLE_WRITE32 (ProcessorNumber, Semaphore, 0, AfterDep); > + } > + } > + > + Entry = Entry->ForwardLink; > + NextEntry = Entry->ForwardLink; > } > > // > @@ -561,27 +686,79 @@ AnalysisProcessorFeatures ( > } > } > > +/** > + Increment semaphore by 1. > + > + @param Sem IN: 32-bit unsigned integer > + > +**/ > +VOID > +LibReleaseSemaphore ( > + IN OUT volatile UINT32 *Sem > + ) > +{ > + InterlockedIncrement (Sem); > +} > + > +/** > + Decrement the semaphore by 1 if it is not zero. > + > + Performs an atomic decrement operation for semaphore. > + The compare exchange operation must be performed using > + MP safe mechanisms. > + > + @param Sem IN: 32-bit unsigned integer > + > +**/ > +VOID > +LibWaitForSemaphore ( > + IN OUT volatile UINT32 *Sem > + ) > +{ > + UINT32 Value; > + > + do { > + Value = *Sem; > + } while (Value == 0); > + > + InterlockedDecrement (Sem); > +} > + > /** > Initialize the CPU registers from a register table. > > - @param[in] ProcessorNumber The index of the CPU executing this function. > + @param[in] RegisterTable The register table for this AP. > + @param[in] ApLocation AP location info for this ap. > + @param[in] CpuStatus CPU status info for this CPU. > + @param[in] CpuFlags Flags data structure used when program the register. > > @note This service could be called by BSP/APs. > **/ > VOID > +EFIAPI > ProgramProcessorRegister ( > - IN UINTN ProcessorNumber > + IN CPU_REGISTER_TABLE *RegisterTable, > + IN EFI_CPU_PHYSICAL_LOCATION *ApLocation, > + IN CPU_STATUS_INFORMATION *CpuStatus, > + IN PROGRAM_CPU_REGISTER_FLAGS *CpuFlags > ) > { > - CPU_FEATURES_DATA *CpuFeaturesData; > - CPU_REGISTER_TABLE *RegisterTable; > CPU_REGISTER_TABLE_ENTRY *RegisterTableEntry; > UINTN Index; > UINTN Value; > CPU_REGISTER_TABLE_ENTRY *RegisterTableEntryHead; > - > - CpuFeaturesData = GetCpuFeaturesData (); > - RegisterTable = &CpuFeaturesData->RegisterTable[ProcessorNumber]; > + volatile UINT32 *SemaphorePtr; > + UINT32 CoreOffset; > + UINT32 PackageOffset; > + UINT32 PackageThreadsCount; > + UINT32 ApOffset; > + UINTN ProcessorIndex; > + UINTN ApIndex; > + UINTN ValidApCount; > + > + ApIndex = ApLocation->Package * CpuStatus->CoreCount * CpuStatus->ThreadCount \ > + + ApLocation->Core * CpuStatus->ThreadCount \ > + + ApLocation->Thread; > > // > // Traverse Register Table of this logical processor > @@ -591,6 +768,7 @@ ProgramProcessorRegister ( > for (Index = 0; Index < RegisterTable->TableLength; Index++) { > > RegisterTableEntry = &RegisterTableEntryHead[Index]; > + DEBUG ((DEBUG_INFO, "Processor = %d, Entry Index %d, Type = %d!\n", ApIndex, Index, RegisterTableEntry->RegisterType)); How about print the register type in string? > > // > // Check the type of specified register > @@ -654,10 +832,6 @@ ProgramProcessorRegister ( > // The specified register is Model Specific Register > // > case Msr: > - // > - // Get lock to avoid Package/Core scope MSRs programming issue in parallel execution mode > - // > - AcquireSpinLock (&CpuFeaturesData->MsrLock); > if (RegisterTableEntry->ValidBitLength >= 64) { > // > // If length is not less than 64 bits, then directly write without reading > @@ -677,20 +851,19 @@ ProgramProcessorRegister ( > RegisterTableEntry->Value > ); > } > - ReleaseSpinLock (&CpuFeaturesData->MsrLock); > break; > // > // MemoryMapped operations > // > case MemoryMapped: > - AcquireSpinLock (&CpuFeaturesData->MemoryMappedLock); > + AcquireSpinLock (&CpuFlags->MemoryMappedLock); > MmioBitFieldWrite32 ( > (UINTN)(RegisterTableEntry->Index | LShiftU64 (RegisterTableEntry->HighIndex, 32)), > RegisterTableEntry->ValidBitStart, > RegisterTableEntry->ValidBitStart + RegisterTableEntry->ValidBitLength - 1, > (UINT32)RegisterTableEntry->Value > ); > - ReleaseSpinLock (&CpuFeaturesData->MemoryMappedLock); > + ReleaseSpinLock (&CpuFlags->MemoryMappedLock); > break; > // > // Enable or disable cache > @@ -706,6 +879,50 @@ ProgramProcessorRegister ( > } > break; > > + case Semaphore: > + SemaphorePtr = CpuFlags->SemaphoreCount; > + switch (RegisterTableEntry->Value) { > + case CoreDepType: > + CoreOffset = (ApLocation->Package * CpuStatus->CoreCount + ApLocation->Core) * CpuStatus->ThreadCount > + ApOffset = CoreOffset + ApLocation->Thread; How about FirstThread and CurrentThread? > + // > + // First increase semaphore count by 1 for processors in this core. This comment might not be helpful for reviewer to understand. How about "Notify all threads in current Core"? > + // > + for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->ThreadCount; ProcessorIndex ++) { > + LibReleaseSemaphore ((UINT32 *) &SemaphorePtr[CoreOffset + ProcessorIndex]); > + } > + // > + // Second, check whether the count has reach the check number. How about "Wait for all threads in current Core"? Below diagram is also helpful // // V(x) = LibReleaseSemaphore (Semaphore[FirstThread + x]); // P(x) = LibWaitForSemaphore (Semaphore[FirstThread + x]); // // All threads (T0...Tn) waits in P() line and continues running // together. // // // T0 T1 ... Tn // // V(0...n) V(0...n) ... V(0...n) // n * P(0) n * P(1) ... n * P(n) // > + // > + for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->ThreadCount; ProcessorIndex ++) { > + LibWaitForSemaphore (&SemaphorePtr[ApOffset]); > + } > + break; > + > + case PackageDepType: > + PackageOffset = ApLocation->Package * CpuStatus->CoreCount * CpuStatus->ThreadCount; FirstThread? > + PackageThreadsCount = CpuStatus->ThreadCount * CpuStatus->CoreCount; ThreadCount? > + ApOffset = PackageOffset + CpuStatus->ThreadCount * ApLocation->Core + ApLocation->Thread; CurrentThread? > + ValidApCount = CpuStatus->ThreadCount * CpuStatus->ValidCoresInPackages[ApLocation->Package]; ValidThreadCount? > + // > + // First increase semaphore count by 1 for processors in this package. How about "Notify all threads in current Package"? > + // > + for (ProcessorIndex = 0; ProcessorIndex < PackageThreadsCount ; ProcessorIndex ++) { > + LibReleaseSemaphore ((UINT32 *) &SemaphorePtr[PackageOffset + ProcessorIndex]); > + } > + // > + // Second, check whether the count has reach the check number. How about "Wait for all threads in current Package"? > + // > + for (ProcessorIndex = 0; ProcessorIndex < ValidApCount; ProcessorIndex ++) { > + LibWaitForSemaphore (&SemaphorePtr[ApOffset]); > + } > + break; > + > + default: > + break; > + } > + break; > + > default: > break; > } > @@ -724,10 +941,36 @@ SetProcessorRegister ( > IN OUT VOID *Buffer > ) > { > - UINTN ProcessorNumber; > + CPU_FEATURES_DATA *CpuFeaturesData; > + CPU_REGISTER_TABLE *RegisterTable; > + CPU_REGISTER_TABLE *RegisterTables; > + UINT32 InitApicId; > + UINTN ProcIndex; > + UINTN Index; > + ACPI_CPU_DATA *AcpiCpuData; > > - ProcessorNumber = GetProcessorIndex (); > - ProgramProcessorRegister (ProcessorNumber); > + CpuFeaturesData = (CPU_FEATURES_DATA *) Buffer; > + AcpiCpuData = CpuFeaturesData->AcpiCpuData; > + > + RegisterTables = (CPU_REGISTER_TABLE *)(UINTN)AcpiCpuData->RegisterTable; > + > + InitApicId = GetInitialApicId (); > + RegisterTable = NULL; > + for (Index = 0; Index < AcpiCpuData->NumberOfCpus; Index++) { > + if (RegisterTables[Index].InitialApicId == InitApicId) { > + RegisterTable = &RegisterTables[Index]; > + ProcIndex = Index; > + break; > + } > + } > + ASSERT (RegisterTable != NULL); > + > + ProgramProcessorRegister ( > + RegisterTable, > + AcpiCpuData->ApLocation + ProcIndex, > + &AcpiCpuData->CpuStatus, > + &CpuFeaturesData->CpuFlags > + ); > } > > /** > @@ -746,6 +989,9 @@ CpuFeaturesDetect ( > { > UINTN NumberOfCpus; > UINTN NumberOfEnabledProcessors; > + CPU_FEATURES_DATA *CpuFeaturesData; > + > + CpuFeaturesData = GetCpuFeaturesData(); > > GetNumberOfProcessor (&NumberOfCpus, &NumberOfEnabledProcessors); > > @@ -754,49 +1000,13 @@ CpuFeaturesDetect ( > // > // Wakeup all APs for data collection. > // > - StartupAPsWorker (CollectProcessorData); > + StartupAPsWorker (CollectProcessorData, NULL); > > // > // Collect data on BSP > // > - CollectProcessorData (NULL); > + CollectProcessorData (CpuFeaturesData); > > AnalysisProcessorFeatures (NumberOfCpus); > } > > -/** > - Performs CPU features Initialization. > - > - This service will invoke MP service to perform CPU features > - initialization on BSP/APs per user configuration. > - > - @note This service could be called by BSP only. > -**/ > -VOID > -EFIAPI > -CpuFeaturesInitialize ( > - VOID > - ) > -{ > - CPU_FEATURES_DATA *CpuFeaturesData; > - UINTN OldBspNumber; > - > - CpuFeaturesData = GetCpuFeaturesData (); > - > - OldBspNumber = GetProcessorIndex(); > - CpuFeaturesData->BspNumber = OldBspNumber; > - // > - // Wakeup all APs for programming. > - // > - StartupAPsWorker (SetProcessorRegister); > - // > - // Programming BSP > - // > - SetProcessorRegister (NULL); > - // > - // Switch to new BSP if required > - // > - if (CpuFeaturesData->BspNumber != OldBspNumber) { > - SwitchNewBsp (CpuFeaturesData->BspNumber); > - } > -} > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c > index 1f34a3f489..8346f7004f 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c > @@ -15,6 +15,7 @@ > #include > > #include > +#include > > #include "RegisterCpuFeatures.h" > > @@ -115,14 +116,20 @@ GetProcessorInformation ( > > @param[in] Procedure A pointer to the function to be run on > enabled APs of the system. > + @param[in] MpEvent A pointer to the event to be used later > + to check whether procedure has done. > **/ > VOID > StartupAPsWorker ( > - IN EFI_AP_PROCEDURE Procedure > + IN EFI_AP_PROCEDURE Procedure, > + IN VOID *MpEvent > ) > { > EFI_STATUS Status; > EFI_MP_SERVICES_PROTOCOL *MpServices; > + CPU_FEATURES_DATA *CpuFeaturesData; > + > + CpuFeaturesData = GetCpuFeaturesData (); > > MpServices = GetMpProtocol (); > // > @@ -132,9 +139,9 @@ StartupAPsWorker ( > MpServices, > Procedure, > FALSE, > - NULL, > + (EFI_EVENT)MpEvent, > 0, > - NULL, > + CpuFeaturesData, > NULL > ); > ASSERT_EFI_ERROR (Status); > @@ -197,3 +204,61 @@ GetNumberOfProcessor ( > ASSERT_EFI_ERROR (Status); > } > > +/** > + Performs CPU features Initialization. > + > + This service will invoke MP service to perform CPU features > + initialization on BSP/APs per user configuration. > + > + @note This service could be called by BSP only. > +**/ > +VOID > +EFIAPI > +CpuFeaturesInitialize ( > + VOID > + ) > +{ > + CPU_FEATURES_DATA *CpuFeaturesData; > + UINTN OldBspNumber; > + EFI_EVENT MpEvent; > + EFI_STATUS Status; > + > + CpuFeaturesData = GetCpuFeaturesData (); > + > + OldBspNumber = GetProcessorIndex(); > + CpuFeaturesData->BspNumber = OldBspNumber; > + > + Status = gBS->CreateEvent ( > + EVT_NOTIFY_WAIT, > + TPL_CALLBACK, > + EfiEventEmptyFunction, > + NULL, > + &MpEvent > + ); > + ASSERT_EFI_ERROR (Status); > + > + // > + // Wakeup all APs for programming. > + // > + StartupAPsWorker (SetProcessorRegister, MpEvent); > + // > + // Programming BSP > + // > + SetProcessorRegister (CpuFeaturesData); > + > + // > + // Wait all processors to finish the task. > + // > + do { > + Status = gBS->CheckEvent (MpEvent); > + } while (Status == EFI_NOT_READY); > + ASSERT_EFI_ERROR (Status); > + > + // > + // Switch to new BSP if required > + // > + if (CpuFeaturesData->BspNumber != OldBspNumber) { > + SwitchNewBsp (CpuFeaturesData->BspNumber); > + } > +} > + > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.inf b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.inf > index f0f317c945..6693bae575 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.inf > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.inf > @@ -47,6 +47,9 @@ > SynchronizationLib > UefiBootServicesTableLib > IoLib > + UefiBootServicesTableLib > + UefiLib > + LocalApicLib > > [Protocols] > gEfiMpServiceProtocolGuid ## CONSUMES > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.c b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.c > index 82fe268812..799864a136 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.c > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.c > @@ -149,11 +149,15 @@ GetProcessorInformation ( > **/ > VOID > StartupAPsWorker ( > - IN EFI_AP_PROCEDURE Procedure > + IN EFI_AP_PROCEDURE Procedure, > + IN VOID *MpEvent > ) > { > EFI_STATUS Status; > EFI_PEI_MP_SERVICES_PPI *CpuMpPpi; > + CPU_FEATURES_DATA *CpuFeaturesData; > + > + CpuFeaturesData = GetCpuFeaturesData (); > > // > // Get MP Services Protocol > @@ -175,7 +179,7 @@ StartupAPsWorker ( > Procedure, > FALSE, > 0, > - NULL > + CpuFeaturesData > ); > ASSERT_EFI_ERROR (Status); > } > @@ -257,3 +261,50 @@ GetNumberOfProcessor ( > ); > ASSERT_EFI_ERROR (Status); > } > + > +/** > + Performs CPU features Initialization. > + > + This service will invoke MP service to perform CPU features > + initialization on BSP/APs per user configuration. > + > + @note This service could be called by BSP only. > +**/ > +VOID > +EFIAPI > +CpuFeaturesInitialize ( > + VOID > + ) > +{ > + CPU_FEATURES_DATA *CpuFeaturesData; > + UINTN OldBspNumber; > + > + CpuFeaturesData = GetCpuFeaturesData (); > + > + OldBspNumber = GetProcessorIndex(); > + CpuFeaturesData->BspNumber = OldBspNumber; > + > + // > + // Known limitation: In PEI phase, CpuFeatures driver not > + // support async mode execute tasks. So semaphore type > + // register can't been used for this instance, must use > + // DXE type instance. > + // > + > + // > + // Wakeup all APs for programming. > + // > + StartupAPsWorker (SetProcessorRegister, NULL); > + // > + // Programming BSP > + // > + SetProcessorRegister (CpuFeaturesData); > + > + // > + // Switch to new BSP if required > + // > + if (CpuFeaturesData->BspNumber != OldBspNumber) { > + SwitchNewBsp (CpuFeaturesData->BspNumber); > + } > +} > + > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.inf b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.inf > index fdfef98293..e95f01df0b 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.inf > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.inf > @@ -49,6 +49,7 @@ > PeiServicesLib > PeiServicesTablePointerLib > IoLib > + LocalApicLib > > [Ppis] > gEfiPeiMpServicesPpiGuid ## CONSUMES > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h > index edd266934f..39457e9730 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h > @@ -23,6 +23,7 @@ > #include > #include > #include > +#include > > #include > > @@ -46,16 +47,26 @@ typedef struct { > CPU_FEATURE_INITIALIZE InitializeFunc; > UINT8 *BeforeFeatureBitMask; > UINT8 *AfterFeatureBitMask; > + UINT8 *CoreBeforeFeatureBitMask; > + UINT8 *CoreAfterFeatureBitMask; > + UINT8 *PackageBeforeFeatureBitMask; > + UINT8 *PackageAfterFeatureBitMask; > VOID *ConfigData; > BOOLEAN BeforeAll; > BOOLEAN AfterAll; > } CPU_FEATURES_ENTRY; > > +// > +// Flags used when program the register. > +// > +typedef struct { > + volatile UINTN MemoryMappedLock; // Spinlock used to program mmio > + volatile UINT32 *SemaphoreCount; // Semaphore used to program semaphore. > +} PROGRAM_CPU_REGISTER_FLAGS; > + > typedef struct { > UINTN FeaturesCount; > UINT32 BitMaskSize; > - SPIN_LOCK MsrLock; > - SPIN_LOCK MemoryMappedLock; > LIST_ENTRY FeatureList; > > CPU_FEATURES_INIT_ORDER *InitOrder; > @@ -64,9 +75,14 @@ typedef struct { > UINT8 *ConfigurationPcd; > UINT8 *SettingPcd; > > + UINT32 NumberOfCpus; > + ACPI_CPU_DATA *AcpiCpuData; > + > CPU_REGISTER_TABLE *RegisterTable; > CPU_REGISTER_TABLE *PreSmmRegisterTable; > UINTN BspNumber; > + > + PROGRAM_CPU_REGISTER_FLAGS CpuFlags; > } CPU_FEATURES_DATA; > > #define CPU_FEATURE_ENTRY_FROM_LINK(a) \ > @@ -118,10 +134,13 @@ GetProcessorInformation ( > > @param[in] Procedure A pointer to the function to be run on > enabled APs of the system. > + @param[in] MpEvent A pointer to the event to be used later > + to check whether procedure has done. > **/ > VOID > StartupAPsWorker ( > - IN EFI_AP_PROCEDURE Procedure > + IN EFI_AP_PROCEDURE Procedure, > + IN VOID *MpEvent > ); > > /** > @@ -170,4 +189,30 @@ DumpCpuFeature ( > IN CPU_FEATURES_ENTRY *CpuFeature > ); > > +/** > + Return feature dependence result. > + > + @param[in] CpuFeature Pointer to CPU feature. > + @param[in] Before Check before dependence or after. > + > + @retval return the dependence result. > +**/ > +CPU_FEATURE_DEPENDENCE_TYPE > +DetectFeatureScope ( > + IN CPU_FEATURES_ENTRY *CpuFeature, > + IN BOOLEAN Before > + ); > + > +/** > + Programs registers for the calling processor. > + > + @param[in,out] Buffer The pointer to private data buffer. > + > +**/ > +VOID > +EFIAPI > +SetProcessorRegister ( > + IN OUT VOID *Buffer > + ); > + > #endif > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeaturesLib.c b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeaturesLib.c > index fa7e107e39..f9e3178dc1 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeaturesLib.c > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeaturesLib.c > @@ -112,6 +112,302 @@ IsBitMaskMatchCheck ( > return FALSE; > } > > +/** > + Return feature dependence result. > + > + @param[in] CpuFeature Pointer to CPU feature. > + @param[in] Before Check before dependence or after. > + > + @retval return the dependence result. > +**/ > +CPU_FEATURE_DEPENDENCE_TYPE > +DetectFeatureScope ( > + IN CPU_FEATURES_ENTRY *CpuFeature, > + IN BOOLEAN Before > + ) > +{ > + if (Before) { > + if (CpuFeature->PackageBeforeFeatureBitMask != NULL) { > + return PackageDepType; > + } > + > + if (CpuFeature->CoreBeforeFeatureBitMask != NULL) { > + return CoreDepType; > + } > + > + if (CpuFeature->BeforeFeatureBitMask != NULL) { > + return ThreadDepType; > + } > + > + return NoneDepType; > + } > + > + if (CpuFeature->PackageAfterFeatureBitMask != NULL) { > + return PackageDepType; > + } > + > + if (CpuFeature->CoreAfterFeatureBitMask != NULL) { > + return CoreDepType; > + } > + > + if (CpuFeature->AfterFeatureBitMask != NULL) { > + return ThreadDepType; > + } > + > + return NoneDepType; > +} > + > +/** > + Clear dependence for the specified type. > + > + @param[in] CurrentFeature Cpu feature need to clear. > + @param[in] Before Before or after dependence relationship. > + > +**/ > +VOID > +ClearFeatureScope ( > + IN CPU_FEATURES_ENTRY *CpuFeature, > + IN BOOLEAN Before > + ) > +{ > + if (Before) { > + if (CpuFeature->BeforeFeatureBitMask != NULL) { > + FreePool (CpuFeature->BeforeFeatureBitMask); > + CpuFeature->BeforeFeatureBitMask = NULL; > + } > + if (CpuFeature->CoreBeforeFeatureBitMask != NULL) { > + FreePool (CpuFeature->CoreBeforeFeatureBitMask); > + CpuFeature->CoreBeforeFeatureBitMask = NULL; > + } > + if (CpuFeature->PackageBeforeFeatureBitMask != NULL) { > + FreePool (CpuFeature->PackageBeforeFeatureBitMask); > + CpuFeature->PackageBeforeFeatureBitMask = NULL; > + } > + } else { > + if (CpuFeature->PackageAfterFeatureBitMask != NULL) { > + FreePool (CpuFeature->PackageAfterFeatureBitMask); > + CpuFeature->PackageAfterFeatureBitMask = NULL; > + } > + if (CpuFeature->CoreAfterFeatureBitMask != NULL) { > + FreePool (CpuFeature->CoreAfterFeatureBitMask); > + CpuFeature->CoreAfterFeatureBitMask = NULL; > + } > + if (CpuFeature->AfterFeatureBitMask != NULL) { > + FreePool (CpuFeature->AfterFeatureBitMask); > + CpuFeature->AfterFeatureBitMask = NULL; > + } > + } > +} > + > +/** > + Base on dependence relationship to asjust feature dependence. > + > + ONLY when the feature before(or after) the find feature also has > + dependence with the find feature. In this case, driver need to base > + on dependce relationship to decide how to insert current feature and > + adjust the feature dependence. > + > + @param[in] PreviousFeature CPU feature current before the find one. > + @param[in] CurrentFeature Cpu feature need to adjust. > + @param[in] Before Before or after dependence relationship. > + > + @retval TRUE means the current feature dependence has been adjusted. > + > + @retval FALSE means the previous feature dependence has been adjusted. > + or previous feature has no dependence with the find one. > + > +**/ > +BOOLEAN > +AdjustFeaturesDependence ( > + IN OUT CPU_FEATURES_ENTRY *PreviousFeature, > + IN OUT CPU_FEATURES_ENTRY *CurrentFeature, > + IN BOOLEAN Before > + ) > +{ > + CPU_FEATURE_DEPENDENCE_TYPE PreDependType; > + CPU_FEATURE_DEPENDENCE_TYPE CurrentDependType; > + > + PreDependType = DetectFeatureScope(PreviousFeature, Before); > + CurrentDependType = DetectFeatureScope(CurrentFeature, Before); > + > + // > + // If previous feature has no dependence with the find featue. > + // return FALSE. > + // > + if (PreDependType == NoneDepType) { > + return FALSE; > + } > + > + // > + // If both feature have dependence, keep the one which needs use more > + // processors and clear the dependence for the other one. > + // > + if (PreDependType >= CurrentDependType) { > + ClearFeatureScope (CurrentFeature, Before); > + return TRUE; > + } else { > + ClearFeatureScope (PreviousFeature, Before); > + return FALSE; > + } > +} > + > +/** > + Base on dependence relationship to asjust feature order. > + > + @param[in] FeatureList Pointer to CPU feature list > + @param[in] FindEntry The entry this feature depend on. > + @param[in] CurrentEntry The entry for this feature. > + @param[in] Before Before or after dependence relationship. > + > +**/ > +VOID > +AdjustEntry ( > + IN LIST_ENTRY *FeatureList, > + IN OUT LIST_ENTRY *FindEntry, > + IN OUT LIST_ENTRY *CurrentEntry, > + IN BOOLEAN Before > + ) > +{ > + LIST_ENTRY *PreviousEntry; > + CPU_FEATURES_ENTRY *PreviousFeature; > + CPU_FEATURES_ENTRY *CurrentFeature; > + > + // > + // For CPU feature which has core or package type dependence, later code need to insert > + // AcquireSpinLock/ReleaseSpinLock logic to sequency the execute order. > + // So if driver finds both feature A and B need to execute before feature C, driver will > + // base on dependence type of feature A and B to update the logic here. > + // For example, feature A has package type dependence and feature B has core type dependence, > + // because package type dependence need to wait for more processors which has strong dependence > + // than core type dependence. So driver will adjust the feature order to B -> A -> C. and driver > + // will remove the feature dependence in feature B. > + // Driver just needs to make sure before feature C been executed, feature A has finished its task > + // in all all thread. Feature A finished in all threads also means feature B have finshed in all > + // threads. > + // > + if (Before) { > + PreviousEntry = GetPreviousNode (FeatureList, FindEntry); > + } else { > > + PreviousEntry = GetNextNode (FeatureList, FindEntry); > + } > + > + CurrentFeature = CPU_FEATURE_ENTRY_FROM_LINK (CurrentEntry); > + RemoveEntryList (CurrentEntry); > + > + if (IsNull (FeatureList, PreviousEntry)) { > + // > + // If not exist the previous or next entry, just insert the current entry. > + // > + if (Before) { > + InsertTailList (FindEntry, CurrentEntry); > + } else { > + InsertHeadList (FindEntry, CurrentEntry); > + } > + } else { > + // > + // If exist the previous or next entry, need to check it before insert curent entry. > + // > + PreviousFeature = CPU_FEATURE_ENTRY_FROM_LINK (PreviousEntry); > + > + if (AdjustFeaturesDependence (PreviousFeature, CurrentFeature, Before)) { > + // > + // Return TRUE means current feature dependence has been cleared and the previous > + // feature dependence has been kept and used. So insert current feature before (or after) > + // the previous feature. > + // > + if (Before) { > + InsertTailList (PreviousEntry, CurrentEntry); > + } else { > + InsertHeadList (PreviousEntry, CurrentEntry); > + } > + } else { > + if (Before) { > + InsertTailList (FindEntry, CurrentEntry); > + } else { > + InsertHeadList (FindEntry, CurrentEntry); > + } > + } > + } > +} > > + > +/** > + Checks and adjusts current CPU features per dependency relationship. > + > + @param[in] FeatureList Pointer to CPU feature list > + @param[in] CurrentEntry Pointer to current checked CPU feature > + @param[in] FeatureMask The feature bit mask. > + > + @retval return Swapped info. > +**/ > +BOOLEAN > +InsertToBeforeEntry ( > + IN LIST_ENTRY *FeatureList, > + IN LIST_ENTRY *CurrentEntry, > + IN UINT8 *FeatureMask > + ) > +{ > + LIST_ENTRY *CheckEntry; > + CPU_FEATURES_ENTRY *CheckFeature; > + BOOLEAN Swapped; > + > + Swapped = FALSE; > + > + // > + // Check all features dispatched before this entry > + // > + CheckEntry = GetFirstNode (FeatureList); > + while (CheckEntry != CurrentEntry) { > + CheckFeature = CPU_FEATURE_ENTRY_FROM_LINK (CheckEntry); > + if (IsBitMaskMatchCheck (CheckFeature->FeatureMask, FeatureMask)) { > + AdjustEntry (FeatureList, CheckEntry, CurrentEntry, TRUE); > + Swapped = TRUE; > + break; > + } > + CheckEntry = CheckEntry->ForwardLink; > + } > + > + return Swapped; > +} > + > +/** > + Checks and adjusts current CPU features per dependency relationship. > + > + @param[in] FeatureList Pointer to CPU feature list > + @param[in] CurrentEntry Pointer to current checked CPU feature > + @param[in] FeatureMask The feature bit mask. > + > + @retval return Swapped info. > +**/ > +BOOLEAN > +InsertToAfterEntry ( > + IN LIST_ENTRY *FeatureList, > + IN LIST_ENTRY *CurrentEntry, > + IN UINT8 *FeatureMask > + ) > +{ > + LIST_ENTRY *CheckEntry; > + CPU_FEATURES_ENTRY *CheckFeature; > + BOOLEAN Swapped; > + > + Swapped = FALSE; > + > + // > + // Check all features dispatched after this entry > + // > + CheckEntry = GetNextNode (FeatureList, CurrentEntry); > + while (!IsNull (FeatureList, CheckEntry)) { > + CheckFeature = CPU_FEATURE_ENTRY_FROM_LINK (CheckEntry); > + if (IsBitMaskMatchCheck (CheckFeature->FeatureMask, FeatureMask)) { > + AdjustEntry (FeatureList, CheckEntry, CurrentEntry, FALSE); > + Swapped = TRUE; > + break; > + } > + CheckEntry = CheckEntry->ForwardLink; > + } > + > + return Swapped; > +} > + > /** > Checks and adjusts CPU features order per dependency relationship. > > @@ -128,11 +424,13 @@ CheckCpuFeaturesDependency ( > CPU_FEATURES_ENTRY *CheckFeature; > BOOLEAN Swapped; > LIST_ENTRY *TempEntry; > + LIST_ENTRY *NextEntry; > > CurrentEntry = GetFirstNode (FeatureList); > while (!IsNull (FeatureList, CurrentEntry)) { > Swapped = FALSE; > CpuFeature = CPU_FEATURE_ENTRY_FROM_LINK (CurrentEntry); > + NextEntry = CurrentEntry->ForwardLink; > if (CpuFeature->BeforeAll) { > // > // Check all features dispatched before this entry > @@ -153,6 +451,7 @@ CheckCpuFeaturesDependency ( > CheckEntry = CheckEntry->ForwardLink; > } > if (Swapped) { > + CurrentEntry = NextEntry; > continue; > } > } > @@ -179,60 +478,59 @@ CheckCpuFeaturesDependency ( > CheckEntry = CheckEntry->ForwardLink; > } > if (Swapped) { > + CurrentEntry = NextEntry; > continue; > } > } > > if (CpuFeature->BeforeFeatureBitMask != NULL) { > - // > - // Check all features dispatched before this entry > - // > - CheckEntry = GetFirstNode (FeatureList); > - while (CheckEntry != CurrentEntry) { > - CheckFeature = CPU_FEATURE_ENTRY_FROM_LINK (CheckEntry); > - if (IsBitMaskMatchCheck (CheckFeature->FeatureMask, CpuFeature->BeforeFeatureBitMask)) { > - // > - // If there is dependency, swap them > - // > - RemoveEntryList (CurrentEntry); > - InsertTailList (CheckEntry, CurrentEntry); > - Swapped = TRUE; > - break; > - } > - CheckEntry = CheckEntry->ForwardLink; > - } > + Swapped = InsertToBeforeEntry (FeatureList, CurrentEntry, CpuFeature->BeforeFeatureBitMask); > if (Swapped) { > + CurrentEntry = NextEntry; > continue; > } > } > > if (CpuFeature->AfterFeatureBitMask != NULL) { > - // > - // Check all features dispatched after this entry > - // > - CheckEntry = GetNextNode (FeatureList, CurrentEntry); > - while (!IsNull (FeatureList, CheckEntry)) { > - CheckFeature = CPU_FEATURE_ENTRY_FROM_LINK (CheckEntry); > - if (IsBitMaskMatchCheck (CheckFeature->FeatureMask, CpuFeature->AfterFeatureBitMask)) { > - // > - // If there is dependency, swap them > - // > - TempEntry = GetNextNode (FeatureList, CurrentEntry); > - RemoveEntryList (CurrentEntry); > - InsertHeadList (CheckEntry, CurrentEntry); > - CurrentEntry = TempEntry; > - Swapped = TRUE; > - break; > - } > - CheckEntry = CheckEntry->ForwardLink; > + Swapped = InsertToAfterEntry (FeatureList, CurrentEntry, CpuFeature->AfterFeatureBitMask); > + if (Swapped) { > + CurrentEntry = NextEntry; > + continue; > } > + } > + > + if (CpuFeature->CoreBeforeFeatureBitMask != NULL) { > + Swapped = InsertToBeforeEntry (FeatureList, CurrentEntry, CpuFeature->CoreBeforeFeatureBitMask); > if (Swapped) { > + CurrentEntry = NextEntry; > continue; > } > } > - // > - // No swap happened, check the next feature > - // > + > + if (CpuFeature->CoreAfterFeatureBitMask != NULL) { > + Swapped = InsertToAfterEntry (FeatureList, CurrentEntry, CpuFeature->CoreAfterFeatureBitMask); > + if (Swapped) { > + CurrentEntry = NextEntry; > + continue; > + } > + } > + > + if (CpuFeature->PackageBeforeFeatureBitMask != NULL) { > + Swapped = InsertToBeforeEntry (FeatureList, CurrentEntry, CpuFeature->PackageBeforeFeatureBitMask); > + if (Swapped) { > + CurrentEntry = NextEntry; > + continue; > + } > + } > + > + if (CpuFeature->PackageAfterFeatureBitMask != NULL) { > + Swapped = InsertToAfterEntry (FeatureList, CurrentEntry, CpuFeature->PackageAfterFeatureBitMask); > + if (Swapped) { > + CurrentEntry = NextEntry; > + continue; > + } > + } > + > CurrentEntry = CurrentEntry->ForwardLink; > } > } > @@ -265,8 +563,7 @@ RegisterCpuFeatureWorker ( > CpuFeaturesData = GetCpuFeaturesData (); > if (CpuFeaturesData->FeaturesCount == 0) { > InitializeListHead (&CpuFeaturesData->FeatureList); > - InitializeSpinLock (&CpuFeaturesData->MsrLock); > - InitializeSpinLock (&CpuFeaturesData->MemoryMappedLock); > + InitializeSpinLock (&CpuFeaturesData->CpuFlags.MemoryMappedLock); > CpuFeaturesData->BitMaskSize = (UINT32) BitMaskSize; > } > ASSERT (CpuFeaturesData->BitMaskSize == BitMaskSize); > @@ -328,6 +625,31 @@ RegisterCpuFeatureWorker ( > } > CpuFeatureEntry->AfterFeatureBitMask = CpuFeature->AfterFeatureBitMask; > } > + if (CpuFeature->CoreBeforeFeatureBitMask != NULL) { > + if (CpuFeatureEntry->CoreBeforeFeatureBitMask != NULL) { > + FreePool (CpuFeatureEntry->CoreBeforeFeatureBitMask); > + } > + CpuFeatureEntry->CoreBeforeFeatureBitMask = CpuFeature->CoreBeforeFeatureBitMask; > + } > + if (CpuFeature->CoreAfterFeatureBitMask != NULL) { > + if (CpuFeatureEntry->CoreAfterFeatureBitMask != NULL) { > + FreePool (CpuFeatureEntry->CoreAfterFeatureBitMask); > + } > + CpuFeatureEntry->CoreAfterFeatureBitMask = CpuFeature->CoreAfterFeatureBitMask; > + } > + if (CpuFeature->PackageBeforeFeatureBitMask != NULL) { > + if (CpuFeatureEntry->PackageBeforeFeatureBitMask != NULL) { > + FreePool (CpuFeatureEntry->PackageBeforeFeatureBitMask); > + } > + CpuFeatureEntry->PackageBeforeFeatureBitMask = CpuFeature->PackageBeforeFeatureBitMask; > + } > + if (CpuFeature->PackageAfterFeatureBitMask != NULL) { > + if (CpuFeatureEntry->PackageAfterFeatureBitMask != NULL) { > + FreePool (CpuFeatureEntry->PackageAfterFeatureBitMask); > + } > + CpuFeatureEntry->PackageAfterFeatureBitMask = CpuFeature->PackageAfterFeatureBitMask; > + } > + > CpuFeatureEntry->BeforeAll = CpuFeature->BeforeAll; > CpuFeatureEntry->AfterAll = CpuFeature->AfterAll; > > @@ -410,6 +732,8 @@ SetCpuFeaturesBitMask ( > @retval RETURN_UNSUPPORTED Registration of the CPU feature is not > supported due to a circular dependency between > BEFORE and AFTER features. > + @retval RETURN_NOT_READY CPU feature PCD PcdCpuFeaturesUserConfiguration > + not updated by Platform driver yet. > > @note This service could be called by BSP only. > **/ > @@ -431,12 +755,20 @@ RegisterCpuFeature ( > UINT8 *FeatureMask; > UINT8 *BeforeFeatureBitMask; > UINT8 *AfterFeatureBitMask; > + UINT8 *CoreBeforeFeatureBitMask; > + UINT8 *CoreAfterFeatureBitMask; > + UINT8 *PackageBeforeFeatureBitMask; > + UINT8 *PackageAfterFeatureBitMask; > BOOLEAN BeforeAll; > BOOLEAN AfterAll; > > - FeatureMask = NULL; > - BeforeFeatureBitMask = NULL; > - AfterFeatureBitMask = NULL; > + FeatureMask = NULL; > + BeforeFeatureBitMask = NULL; How about renaming BeforeFeatureBitMask to ThreadBeforeFeatureBitMask? I think the renaming together with redefining the macro CPU_FEATURE_BEFORE as CPU_FEATURE_THREAD_BEFORE can be in a separate patch. > + AfterFeatureBitMask = NULL; > + CoreBeforeFeatureBitMask = NULL; > + CoreAfterFeatureBitMask = NULL; > + PackageBeforeFeatureBitMask = NULL; > + PackageAfterFeatureBitMask = NULL; > BeforeAll = FALSE; > AfterAll = FALSE; > > @@ -449,6 +781,10 @@ RegisterCpuFeature ( > != (CPU_FEATURE_BEFORE | CPU_FEATURE_AFTER)); > ASSERT ((Feature & (CPU_FEATURE_BEFORE_ALL | CPU_FEATURE_AFTER_ALL)) > != (CPU_FEATURE_BEFORE_ALL | CPU_FEATURE_AFTER_ALL)); Implementation can avoid using CPU_FEATURE_BEFORE and CPU_FEATURE_AFTER. Use CPU_FEATURE_THREAD_BEFORE and CPU_FEATURE_THREAD_AFTER. > + ASSERT ((Feature & (CPU_FEATURE_CORE_BEFORE | CPU_FEATURE_CORE_AFTER)) > + != (CPU_FEATURE_CORE_BEFORE | CPU_FEATURE_CORE_AFTER)); > + ASSERT ((Feature & (CPU_FEATURE_PACKAGE_BEFORE | CPU_FEATURE_PACKAGE_AFTER)) > + != (CPU_FEATURE_PACKAGE_BEFORE | CPU_FEATURE_PACKAGE_AFTER)); > if (Feature < CPU_FEATURE_BEFORE) { > BeforeAll = ((Feature & CPU_FEATURE_BEFORE_ALL) != 0) ? TRUE : FALSE; > AfterAll = ((Feature & CPU_FEATURE_AFTER_ALL) != 0) ? TRUE : FALSE; > @@ -459,6 +795,14 @@ RegisterCpuFeature ( > SetCpuFeaturesBitMask (&BeforeFeatureBitMask, Feature & ~CPU_FEATURE_BEFORE, BitMaskSize); > } else if ((Feature & CPU_FEATURE_AFTER) != 0) { > SetCpuFeaturesBitMask (&AfterFeatureBitMask, Feature & ~CPU_FEATURE_AFTER, BitMaskSize); > + } else if ((Feature & CPU_FEATURE_CORE_BEFORE) != 0) { > + SetCpuFeaturesBitMask (&CoreBeforeFeatureBitMask, Feature & ~CPU_FEATURE_CORE_BEFORE, BitMaskSize); > + } else if ((Feature & CPU_FEATURE_CORE_AFTER) != 0) { > + SetCpuFeaturesBitMask (&CoreAfterFeatureBitMask, Feature & ~CPU_FEATURE_CORE_AFTER, BitMaskSize); > + } else if ((Feature & CPU_FEATURE_PACKAGE_BEFORE) != 0) { > + SetCpuFeaturesBitMask (&PackageBeforeFeatureBitMask, Feature & ~CPU_FEATURE_PACKAGE_BEFORE, BitMaskSize); > + } else if ((Feature & CPU_FEATURE_PACKAGE_AFTER) != 0) { > + SetCpuFeaturesBitMask (&PackageAfterFeatureBitMask, Feature & ~CPU_FEATURE_PACKAGE_AFTER, BitMaskSize); > } > Feature = VA_ARG (Marker, UINT32); > } > @@ -466,15 +810,19 @@ RegisterCpuFeature ( > > CpuFeature = AllocateZeroPool (sizeof (CPU_FEATURES_ENTRY)); > ASSERT (CpuFeature != NULL); > - CpuFeature->Signature = CPU_FEATURE_ENTRY_SIGNATURE; > - CpuFeature->FeatureMask = FeatureMask; > - CpuFeature->BeforeFeatureBitMask = BeforeFeatureBitMask; > - CpuFeature->AfterFeatureBitMask = AfterFeatureBitMask; > - CpuFeature->BeforeAll = BeforeAll; > - CpuFeature->AfterAll = AfterAll; > - CpuFeature->GetConfigDataFunc = GetConfigDataFunc; > - CpuFeature->SupportFunc = SupportFunc; > - CpuFeature->InitializeFunc = InitializeFunc; > + CpuFeature->Signature = CPU_FEATURE_ENTRY_SIGNATURE; > + CpuFeature->FeatureMask = FeatureMask; > + CpuFeature->BeforeFeatureBitMask = BeforeFeatureBitMask; > + CpuFeature->AfterFeatureBitMask = AfterFeatureBitMask; > + CpuFeature->CoreBeforeFeatureBitMask = CoreBeforeFeatureBitMask; > + CpuFeature->CoreAfterFeatureBitMask = CoreAfterFeatureBitMask; > + CpuFeature->PackageBeforeFeatureBitMask = PackageBeforeFeatureBitMask; > + CpuFeature->PackageAfterFeatureBitMask = PackageAfterFeatureBitMask; > + CpuFeature->BeforeAll = BeforeAll; > + CpuFeature->AfterAll = AfterAll; > + CpuFeature->GetConfigDataFunc = GetConfigDataFunc; > + CpuFeature->SupportFunc = SupportFunc; > + CpuFeature->InitializeFunc = InitializeFunc; > if (FeatureName != NULL) { > CpuFeature->FeatureName = AllocatePool (CPU_FEATURE_NAME_SIZE); > ASSERT (CpuFeature->FeatureName != NULL); > -- Thanks, Ray