From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=192.55.52.43; helo=mga05.intel.com; envelope-from=ruiyu.ni@intel.com; receiver=edk2-devel@lists.01.org Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 346CA21168228 for ; Thu, 18 Oct 2018 01:17:50 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 Oct 2018 01:17:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,395,1534834800"; d="scan'208";a="92981071" Received: from ray-dev.ccr.corp.intel.com (HELO [10.239.9.11]) ([10.239.9.11]) by orsmga003.jf.intel.com with ESMTP; 18 Oct 2018 01:17:48 -0700 To: Eric Dong , edk2-devel@lists.01.org Cc: Laszlo Ersek References: <20181018073448.11496-1-eric.dong@intel.com> <20181018073448.11496-4-eric.dong@intel.com> From: "Ni, Ruiyu" Message-ID: Date: Thu, 18 Oct 2018 16:19:01 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1 MIME-Version: 1.0 In-Reply-To: <20181018073448.11496-4-eric.dong@intel.com> Subject: Re: [Patch v3 3/6] UefiCpuPkg/RegisterCpuFeaturesLib: Add logic to support semaphore type. X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2018 08:17:51 -0000 Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit On 10/18/2018 3:34 PM, Eric Dong wrote: > V3 changes include: > 1. Use global variable instead of internal function to return string for register type > and dependence type. > 2. Add comments for some complicated logic. > > V2 changes include: > 1. Add more description for the code part which need easy to understand. > 2. Refine some code base on feedback for V1 changes. > > V1 changes include: > In a system which has multiple cores, current set register value task costs huge times. > After investigation, current set MSR task costs most of the times. Current logic uses > SpinLock to let set MSR task as an single thread task for all cores. Because MSR has > scope attribute which may cause GP fault if multiple APs set MSR at the same time, > current logic use an easiest solution (use SpinLock) to avoid this issue, but it will > cost huge times. > > In order to fix this performance issue, new solution will set MSRs base on their scope > attribute. After this, the SpinLock will not needed. Without SpinLock, new issue raised > which is caused by MSR dependence. For example, MSR A depends on MSR B which means MSR A > must been set after MSR B has been set. Also MSR B is package scope level and MSR A is > thread scope level. If system has multiple threads, Thread 1 needs to set the thread level > MSRs and thread 2 needs to set thread and package level MSRs. Set MSRs task for thread 1 > and thread 2 like below: > > Thread 1 Thread 2 > MSR B N Y > MSR A Y Y > > If driver don't control execute MSR order, for thread 1, it will execute MSR A first, but > at this time, MSR B not been executed yet by thread 2. system may trig exception at this > time. > > In order to fix the above issue, driver introduces semaphore logic to control the MSR > execute sequence. For the above case, a semaphore will be add between MSR A and B for > all threads. Semaphore has scope info for it. The possible scope value is core or package. > For each thread, when it meets a semaphore during it set registers, it will 1) release > semaphore (+1) for each threads in this core or package(based on the scope info for this > semaphore) 2) acquire semaphore (-1) for all the threads in this core or package(based > on the scope info for this semaphore). With these two steps, driver can control MSR > sequence. Sample code logic like below: > > // > // First increase semaphore count by 1 for processors in this package. > // > for (ProcessorIndex = 0; ProcessorIndex < PackageThreadsCount ; ProcessorIndex ++) { > LibReleaseSemaphore ((UINT32 *) &SemaphorePtr[PackageOffset + ProcessorIndex]); > } > // > // Second, check whether the count has reach the check number. > // > for (ProcessorIndex = 0; ProcessorIndex < ValidApCount; ProcessorIndex ++) { > LibWaitForSemaphore (&SemaphorePtr[ApOffset]); > } > > Platform Requirement: > 1. This change requires register MSR setting base on MSR scope info. If still register MSR > for all threads, exception may raised. > > Known limitation: > 1. Current CpuFeatures driver supports DXE instance and PEI instance. But semaphore logic > requires Aps execute in async mode which is not supported by PEI driver. So CpuFeature > PEI instance not works after this change. We plan to support async mode for PEI in phase > 2 for this task. > > Cc: Ruiyu Ni > Cc: Laszlo Ersek > Contributed-under: TianoCore Contribution Agreement 1.1 > Signed-off-by: Eric Dong > --- > .../RegisterCpuFeaturesLib/CpuFeaturesInitialize.c | 392 ++++++++++++++--- > .../DxeRegisterCpuFeaturesLib.c | 71 ++- > .../DxeRegisterCpuFeaturesLib.inf | 2 + > .../PeiRegisterCpuFeaturesLib.c | 55 ++- > .../RegisterCpuFeaturesLib/RegisterCpuFeatures.h | 61 ++- > .../RegisterCpuFeaturesLib.c | 484 ++++++++++++++++++--- > 6 files changed, 932 insertions(+), 133 deletions(-) > > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c > index ba3fb3250f..16d2b71fd8 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c > @@ -14,6 +14,9 @@ > > #include "RegisterCpuFeatures.h" > > +CHAR16 *mDependTypeStr[] = {L"None", L"Thread", L"Core", L"Package", L"Invalid" }; > +CHAR16 *mRegisterTypeStr[] = {L"MSR", L"CR", L"MMIO", L"CACHE", L"SEMAP", L"INVALID" }; > + > /** > Worker function to save PcdCpuFeaturesCapability. > > @@ -145,6 +148,19 @@ CpuInitDataInitialize ( > CPU_FEATURES_INIT_ORDER *InitOrder; > CPU_FEATURES_DATA *CpuFeaturesData; > LIST_ENTRY *Entry; > + UINT32 Core; > + UINT32 Package; > + UINT32 Thread; > + EFI_CPU_PHYSICAL_LOCATION *Location; > + BOOLEAN *CoresVisited; > + UINTN Index; > + ACPI_CPU_DATA *AcpiCpuData; > + CPU_STATUS_INFORMATION *CpuStatus; > + UINT32 *ValidCorePerPackage; 1. ValidCoreCountPerPackage > + > + Core = 0; > + Package = 0; > + Thread = 0; > > CpuFeaturesData = GetCpuFeaturesData (); > CpuFeaturesData->InitOrder = AllocateZeroPool (sizeof (CPU_FEATURES_INIT_ORDER) * NumberOfCpus); > @@ -163,6 +179,17 @@ CpuInitDataInitialize ( > Entry = Entry->ForwardLink; > } > > + CpuFeaturesData->NumberOfCpus = (UINT32) NumberOfCpus; > + > + AcpiCpuData = GetAcpiCpuData (); > + ASSERT (AcpiCpuData != NULL); > + CpuFeaturesData->AcpiCpuData= AcpiCpuData; > + > + CpuStatus = &AcpiCpuData->CpuStatus; > + Location = AllocateZeroPool (sizeof (EFI_CPU_PHYSICAL_LOCATION) * NumberOfCpus); > + ASSERT (Location != NULL); > + AcpiCpuData->ApLocation = (EFI_PHYSICAL_ADDRESS)(UINTN)Location; > + > for (ProcessorNumber = 0; ProcessorNumber < NumberOfCpus; ProcessorNumber++) { > InitOrder = &CpuFeaturesData->InitOrder[ProcessorNumber]; > InitOrder->FeaturesSupportedMask = AllocateZeroPool (CpuFeaturesData->BitMaskSize); > @@ -175,7 +202,76 @@ CpuInitDataInitialize ( > &ProcessorInfoBuffer, > sizeof (EFI_PROCESSOR_INFORMATION) > ); > + CopyMem ( > + &Location[ProcessorNumber], > + &ProcessorInfoBuffer.Location, > + sizeof (EFI_CPU_PHYSICAL_LOCATION) > + ); > + > + // > + // Collect CPU package count info. > + // > + if (Package < ProcessorInfoBuffer.Location.Package) { > + Package = ProcessorInfoBuffer.Location.Package; > + } > + // > + // Collect CPU max core count info. > + // > + if (Core < ProcessorInfoBuffer.Location.Core) { > + Core = ProcessorInfoBuffer.Location.Core; > + } > + // > + // Collect CPU max thread count info. > + // > + if (Thread < ProcessorInfoBuffer.Location.Thread) { > + Thread = ProcessorInfoBuffer.Location.Thread; > + } > + } > + CpuStatus->PackageCount = Package + 1; > + CpuStatus->MaxCoreCount = Core + 1; > + CpuStatus->MaxThreadCount = Thread + 1; > + DEBUG ((DEBUG_INFO, "Processor Info: Package: %d, MaxCore : %d, MaxThread: %d\n", > + CpuStatus->PackageCount, > + CpuStatus->MaxCoreCount, > + CpuStatus->MaxThreadCount)); > + > + // > + // Collect valid core count in each package because not all cores are valid. > + // > + ValidCorePerPackage= AllocateZeroPool (sizeof (UINT32) * CpuStatus->PackageCount); > + ASSERT (ValidCorePerPackage != 0); > + CpuStatus->ValidCoreCountPerPackage = (EFI_PHYSICAL_ADDRESS)(UINTN)ValidCorePerPackage; > + CoresVisited = AllocatePool (sizeof (BOOLEAN) * CpuStatus->MaxCoreCount); > + ASSERT (CoresVisited != NULL); > + > + for (Index = 0; Index < CpuStatus->PackageCount; Index ++ ) { > + ZeroMem (CoresVisited, sizeof (UINT32) * CpuStatus->MaxCoreCount); 2. sizeof (BOOLEAN) > + // > + // Collect valid cores in Current package. > + // > + for (ProcessorNumber = 0; ProcessorNumber < NumberOfCpus; ProcessorNumber++) { > + Location = &CpuFeaturesData->InitOrder[ProcessorNumber].CpuInfo.ProcessorInfo.Location; > + if (Location->Package == Index && !CoresVisited[Location->Core] ) { > + // > + // The ValidCores position for Location->Core is valid. > + // The possible values in ValidCores[Index] are 0 or 1. > + // FALSE means no valid threads in this Core. > + // TRUE means have valid threads in this core, no matter the thead count is 1 or more. > + // > + CoresVisited[Location->Core] = TRUE; > + ValidCorePerPackage[Index]++; > + } > + } > } > + FreePool (CoresVisited); > + > + for (Index = 0; Index <= Package; Index++) { > + DEBUG ((DEBUG_INFO, "Package: %d, Valid Core : %d\n", Index, ValidCorePerPackage[Index])); > + } > + > + CpuFeaturesData->CpuFlags.SemaphoreCount = AllocateZeroPool (sizeof (UINT32) * CpuStatus->PackageCount * CpuStatus->MaxCoreCount * CpuStatus->MaxThreadCount); > + ASSERT (CpuFeaturesData->CpuFlags.SemaphoreCount != NULL); > + > // > // Get support and configuration PCDs > // > @@ -310,7 +406,7 @@ CollectProcessorData ( > LIST_ENTRY *Entry; > CPU_FEATURES_DATA *CpuFeaturesData; > > - CpuFeaturesData = GetCpuFeaturesData (); > + CpuFeaturesData = (CPU_FEATURES_DATA *)Buffer; > ProcessorNumber = GetProcessorIndex (); > CpuInfo = &CpuFeaturesData->InitOrder[ProcessorNumber].CpuInfo; > // > @@ -416,6 +512,15 @@ DumpRegisterTableOnProcessor ( > RegisterTableEntry->Value > )); > break; > + case Semaphore: > + DEBUG (( > + DebugPrintErrorLevel, > + "Processor: %d: Semaphore: Scope Value: %s\r\n", > + ProcessorNumber, > + mDependTypeStr[MIN ((CPU_FEATURE_DEPENDENCE_TYPE)RegisterTableEntry->Value, InvalidDepType)] 3. No typecast is needed here. > + )); > + break; > + > default: > break; > } > @@ -441,6 +546,11 @@ AnalysisProcessorFeatures ( > REGISTER_CPU_FEATURE_INFORMATION *CpuInfo; > LIST_ENTRY *Entry; > CPU_FEATURES_DATA *CpuFeaturesData; > + LIST_ENTRY *NextEntry; > + CPU_FEATURES_ENTRY *NextCpuFeatureInOrder; > + BOOLEAN Success; > + CPU_FEATURE_DEPENDENCE_TYPE BeforeDep; > + CPU_FEATURE_DEPENDENCE_TYPE AfterDep; > > CpuFeaturesData = GetCpuFeaturesData (); > CpuFeaturesData->CapabilityPcd = AllocatePool (CpuFeaturesData->BitMaskSize); > @@ -517,8 +627,14 @@ AnalysisProcessorFeatures ( > // > CpuInfo = &CpuFeaturesData->InitOrder[ProcessorNumber].CpuInfo; > Entry = GetFirstNode (&CpuInitOrder->OrderList); > + NextEntry = Entry->ForwardLink; > while (!IsNull (&CpuInitOrder->OrderList, Entry)) { > CpuFeatureInOrder = CPU_FEATURE_ENTRY_FROM_LINK (Entry); > + if (!IsNull (&CpuInitOrder->OrderList, NextEntry)) { > + NextCpuFeatureInOrder = CPU_FEATURE_ENTRY_FROM_LINK (NextEntry); > + } else { > + NextCpuFeatureInOrder = NULL; > + } > if (IsBitMaskMatch (CpuFeatureInOrder->FeatureMask, CpuFeaturesData->SettingPcd)) { > Status = CpuFeatureInOrder->InitializeFunc (ProcessorNumber, CpuInfo, CpuFeatureInOrder->ConfigData, TRUE); > if (EFI_ERROR (Status)) { > @@ -532,6 +648,8 @@ AnalysisProcessorFeatures ( > DEBUG ((DEBUG_WARN, "Warning :: Failed to enable Feature: Mask = ")); > DumpCpuFeatureMask (CpuFeatureInOrder->FeatureMask); > } > + } else { > + Success = TRUE; > } > } else { > Status = CpuFeatureInOrder->InitializeFunc (ProcessorNumber, CpuInfo, CpuFeatureInOrder->ConfigData, FALSE); > @@ -542,9 +660,36 @@ AnalysisProcessorFeatures ( > DEBUG ((DEBUG_WARN, "Warning :: Failed to disable Feature: Mask = ")); > DumpCpuFeatureMask (CpuFeatureInOrder->FeatureMask); > } > + } else { > + Success = TRUE; > } > } > - Entry = Entry->ForwardLink; > + > + if (Success) { > + // > + // If feature has dependence with the next feature (ONLY care core/package dependency). > + // and feature initialize succeed, add sync semaphere here. > + // > + BeforeDep = DetectFeatureScope (CpuFeatureInOrder, TRUE); > + if (NextCpuFeatureInOrder != NULL) { > + AfterDep = DetectFeatureScope (NextCpuFeatureInOrder, FALSE); > + } else { > + AfterDep = NoneDepType; > + } > + // > + // Assume only one of the depend is valid. > + // > + ASSERT (!(BeforeDep > ThreadDepType && AfterDep > ThreadDepType)); > + if (BeforeDep > ThreadDepType) { > + CPU_REGISTER_TABLE_WRITE32 (ProcessorNumber, Semaphore, 0, BeforeDep); > + } > + if (AfterDep > ThreadDepType) { > + CPU_REGISTER_TABLE_WRITE32 (ProcessorNumber, Semaphore, 0, AfterDep); > + } > + } > + > + Entry = Entry->ForwardLink; > + NextEntry = Entry->ForwardLink; > } > > // > @@ -561,27 +706,81 @@ AnalysisProcessorFeatures ( > } > } > > +/** > + Increment semaphore by 1. > + > + @param Sem IN: 32-bit unsigned integer > + > +**/ > +VOID > +LibReleaseSemaphore ( > + IN OUT volatile UINT32 *Sem > + ) > +{ > + InterlockedIncrement (Sem); > +} > + > +/** > + Decrement the semaphore by 1 if it is not zero. > + > + Performs an atomic decrement operation for semaphore. > + The compare exchange operation must be performed using > + MP safe mechanisms. > + > + @param Sem IN: 32-bit unsigned integer > + > +**/ > +VOID > +LibWaitForSemaphore ( > + IN OUT volatile UINT32 *Sem > + ) > +{ > + UINT32 Value; > + > + do { > + Value = *Sem; > + } while (Value == 0 || > + InterlockedCompareExchange32 ( > + Sem, > + Value, > + Value - 1 > + ) != Value); > +} > + > /** > Initialize the CPU registers from a register table. > > - @param[in] ProcessorNumber The index of the CPU executing this function. > + @param[in] RegisterTable The register table for this AP. > + @param[in] ApLocation AP location info for this ap. > + @param[in] CpuStatus CPU status info for this CPU. > + @param[in] CpuFlags Flags data structure used when program the register. > > @note This service could be called by BSP/APs. > **/ > VOID > ProgramProcessorRegister ( > - IN UINTN ProcessorNumber > + IN CPU_REGISTER_TABLE *RegisterTable, > + IN EFI_CPU_PHYSICAL_LOCATION *ApLocation, > + IN CPU_STATUS_INFORMATION *CpuStatus, > + IN PROGRAM_CPU_REGISTER_FLAGS *CpuFlags > ) > { > - CPU_FEATURES_DATA *CpuFeaturesData; > - CPU_REGISTER_TABLE *RegisterTable; > CPU_REGISTER_TABLE_ENTRY *RegisterTableEntry; > UINTN Index; > UINTN Value; > CPU_REGISTER_TABLE_ENTRY *RegisterTableEntryHead; > - > - CpuFeaturesData = GetCpuFeaturesData (); > - RegisterTable = &CpuFeaturesData->RegisterTable[ProcessorNumber]; > + volatile UINT32 *SemaphorePtr; > + UINT32 FirstThread; > + UINT32 PackageThreadsCount; > + UINT32 CurrentThread; > + UINTN ProcessorIndex; > + UINTN ThreadIndex; > + UINTN ValidThreadCount; > + UINT32 *ValidCorePerPackage; > + > + ThreadIndex = ApLocation->Package * CpuStatus->MaxCoreCount * CpuStatus->MaxThreadCount \ > + + ApLocation->Core * CpuStatus->MaxThreadCount \ > + + ApLocation->Thread; > > // > // Traverse Register Table of this logical processor > @@ -591,6 +790,13 @@ ProgramProcessorRegister ( > for (Index = 0; Index < RegisterTable->TableLength; Index++) { > > RegisterTableEntry = &RegisterTableEntryHead[Index]; > + DEBUG (( > + DEBUG_INFO, > + "Processor = %ul, Entry Index %ul, Type = %s!\n", > + (UINT64)ThreadIndex, > + (UINT64)Index, > + mRegisterTypeStr[MIN ((REGISTER_TYPE)RegisterTableEntry->RegisterType, InvalidReg)] > + )); > > // > // Check the type of specified register > @@ -654,10 +860,6 @@ ProgramProcessorRegister ( > // The specified register is Model Specific Register > // > case Msr: > - // > - // Get lock to avoid Package/Core scope MSRs programming issue in parallel execution mode > - // > - AcquireSpinLock (&CpuFeaturesData->MsrLock); > if (RegisterTableEntry->ValidBitLength >= 64) { > // > // If length is not less than 64 bits, then directly write without reading > @@ -677,20 +879,19 @@ ProgramProcessorRegister ( > RegisterTableEntry->Value > ); > } > - ReleaseSpinLock (&CpuFeaturesData->MsrLock); > break; > // > // MemoryMapped operations > // > case MemoryMapped: > - AcquireSpinLock (&CpuFeaturesData->MemoryMappedLock); > + AcquireSpinLock (&CpuFlags->MemoryMappedLock); > MmioBitFieldWrite32 ( > (UINTN)(RegisterTableEntry->Index | LShiftU64 (RegisterTableEntry->HighIndex, 32)), > RegisterTableEntry->ValidBitStart, > RegisterTableEntry->ValidBitStart + RegisterTableEntry->ValidBitLength - 1, > (UINT32)RegisterTableEntry->Value > ); > - ReleaseSpinLock (&CpuFeaturesData->MemoryMappedLock); > + ReleaseSpinLock (&CpuFlags->MemoryMappedLock); > break; > // > // Enable or disable cache > @@ -706,6 +907,90 @@ ProgramProcessorRegister ( > } > break; > > + case Semaphore: > + // Semaphore works logic like below: > + // > + // V(x) = LibReleaseSemaphore (Semaphore[FirstThread + x]); > + // P(x) = LibWaitForSemaphore (Semaphore[FirstThread + x]); > + // > + // All threads (T0...Tn) waits in P() line and continues running > + // together. > + // > + // > + // T0 T1 ... Tn > + // > + // V(0...n) V(0...n) ... V(0...n) > + // n * P(0) n * P(1) ... n * P(n) > + // > + SemaphorePtr = CpuFlags->SemaphoreCount; > + switch (RegisterTableEntry->Value) { > + case CoreDepType: > + // > + // Get Offset info for the first thread in the core which current thread belongs to. > + // > + FirstThread = (ApLocation->Package * CpuStatus->MaxCoreCount + ApLocation->Core) * CpuStatus->MaxThreadCount; > + CurrentThread = FirstThread + ApLocation->Thread; > + // > + // First Notify all threads in current Core that this thread has ready. > + // > + for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->MaxThreadCount; ProcessorIndex ++) { > + LibReleaseSemaphore ((UINT32 *) &SemaphorePtr[FirstThread + ProcessorIndex]); > + } > + // > + // Second, check whether all valid threads in current core have ready. > + // > + for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->MaxThreadCount; ProcessorIndex ++) { > + LibWaitForSemaphore (&SemaphorePtr[CurrentThread]); > + } > + break; > + > + case PackageDepType: > + ValidCorePerPackage = (UINT32 *)(UINTN)CpuStatus->ValidCoreCountPerPackage; > + // > + // Get Offset info for the first thread in the package which current thread belongs to. > + // > + FirstThread = ApLocation->Package * CpuStatus->MaxCoreCount * CpuStatus->MaxThreadCount; > + // > + // Get the possible threads count for current package. > + // > + PackageThreadsCount = CpuStatus->MaxThreadCount * CpuStatus->MaxCoreCount; > + CurrentThread = FirstThread + CpuStatus->MaxThreadCount * ApLocation->Core + ApLocation->Thread; > + // > + // Get the valid thread count for current package. > + // > + ValidThreadCount = CpuStatus->MaxThreadCount * ValidCorePerPackage[ApLocation->Package]; > + > + // > + // Different packages may have different valid cores in them. If driver maintail clearly > + // cores number in different packages, the logic will be much complicated. > + // Here driver just simply records the max core number in all packages and use it as expect > + // core number for all packages. > + // In below two steps logic, first current thread will Release semaphore for each thread > + // in current package. Maybe some threads are not valid in this package, but driver don't > + // care. Second, driver will let current thread wait semaphore for all valid threads in > + // current package. Because only the valid threads will do release semaphore for this > + // thread, driver here only need to wait the valid thread count. > + // > + > + // > + // First Notify ALL THREADS in current package that this thread has ready. > + // > + for (ProcessorIndex = 0; ProcessorIndex < PackageThreadsCount ; ProcessorIndex ++) { > + LibReleaseSemaphore ((UINT32 *) &SemaphorePtr[FirstThread + ProcessorIndex]); > + } > + // > + // Second, check whether VALID THREADS (not all threads) in current package have ready. > + // > + for (ProcessorIndex = 0; ProcessorIndex < ValidThreadCount; ProcessorIndex ++) { > + LibWaitForSemaphore (&SemaphorePtr[CurrentThread]); > + } > + break; > + > + default: > + break; > + } > + break; > + > default: > break; > } > @@ -724,10 +1009,36 @@ SetProcessorRegister ( > IN OUT VOID *Buffer > ) > { > - UINTN ProcessorNumber; > + CPU_FEATURES_DATA *CpuFeaturesData; > + CPU_REGISTER_TABLE *RegisterTable; > + CPU_REGISTER_TABLE *RegisterTables; > + UINT32 InitApicId; > + UINTN ProcIndex; > + UINTN Index; > + ACPI_CPU_DATA *AcpiCpuData; > > - ProcessorNumber = GetProcessorIndex (); > - ProgramProcessorRegister (ProcessorNumber); > + CpuFeaturesData = (CPU_FEATURES_DATA *) Buffer; > + AcpiCpuData = CpuFeaturesData->AcpiCpuData; > + > + RegisterTables = (CPU_REGISTER_TABLE *)(UINTN)AcpiCpuData->RegisterTable; > + > + InitApicId = GetInitialApicId (); > + RegisterTable = NULL; > + for (Index = 0; Index < AcpiCpuData->NumberOfCpus; Index++) { > + if (RegisterTables[Index].InitialApicId == InitApicId) { > + RegisterTable = &RegisterTables[Index]; > + ProcIndex = Index; > + break; > + } > + } > + ASSERT (RegisterTable != NULL); > + > + ProgramProcessorRegister ( > + RegisterTable, > + (EFI_CPU_PHYSICAL_LOCATION *)(UINTN)AcpiCpuData->ApLocation + ProcIndex, > + &AcpiCpuData->CpuStatus, > + &CpuFeaturesData->CpuFlags > + ); > } > > /** > @@ -746,6 +1057,9 @@ CpuFeaturesDetect ( > { > UINTN NumberOfCpus; > UINTN NumberOfEnabledProcessors; > + CPU_FEATURES_DATA *CpuFeaturesData; > + > + CpuFeaturesData = GetCpuFeaturesData(); > > GetNumberOfProcessor (&NumberOfCpus, &NumberOfEnabledProcessors); > > @@ -754,49 +1068,13 @@ CpuFeaturesDetect ( > // > // Wakeup all APs for data collection. > // > - StartupAPsWorker (CollectProcessorData); > + StartupAPsWorker (CollectProcessorData, NULL); > > // > // Collect data on BSP > // > - CollectProcessorData (NULL); > + CollectProcessorData (CpuFeaturesData); > > AnalysisProcessorFeatures (NumberOfCpus); > } > > -/** > - Performs CPU features Initialization. > - > - This service will invoke MP service to perform CPU features > - initialization on BSP/APs per user configuration. > - > - @note This service could be called by BSP only. > -**/ > -VOID > -EFIAPI > -CpuFeaturesInitialize ( > - VOID > - ) > -{ > - CPU_FEATURES_DATA *CpuFeaturesData; > - UINTN OldBspNumber; > - > - CpuFeaturesData = GetCpuFeaturesData (); > - > - OldBspNumber = GetProcessorIndex(); > - CpuFeaturesData->BspNumber = OldBspNumber; > - // > - // Wakeup all APs for programming. > - // > - StartupAPsWorker (SetProcessorRegister); > - // > - // Programming BSP > - // > - SetProcessorRegister (NULL); > - // > - // Switch to new BSP if required > - // > - if (CpuFeaturesData->BspNumber != OldBspNumber) { > - SwitchNewBsp (CpuFeaturesData->BspNumber); > - } > -} > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c > index 1f34a3f489..4ee2c783ed 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c > @@ -15,6 +15,7 @@ > #include > > #include > +#include > > #include "RegisterCpuFeatures.h" > > @@ -115,14 +116,20 @@ GetProcessorInformation ( > > @param[in] Procedure A pointer to the function to be run on > enabled APs of the system. > + @param[in] MpEvent A pointer to the event to be used later > + to check whether procedure has done. > **/ > VOID > StartupAPsWorker ( > - IN EFI_AP_PROCEDURE Procedure > + IN EFI_AP_PROCEDURE Procedure, > + IN EFI_EVENT *MpEvent 4. "*" is not needed. > ) > { > EFI_STATUS Status; > EFI_MP_SERVICES_PROTOCOL *MpServices; > + CPU_FEATURES_DATA *CpuFeaturesData; > + > + CpuFeaturesData = GetCpuFeaturesData (); > > MpServices = GetMpProtocol (); > // > @@ -132,9 +139,9 @@ StartupAPsWorker ( > MpServices, > Procedure, > FALSE, > - NULL, > + MpEvent, > 0, > - NULL, > + CpuFeaturesData, > NULL > ); > ASSERT_EFI_ERROR (Status); > @@ -197,3 +204,61 @@ GetNumberOfProcessor ( > ASSERT_EFI_ERROR (Status); > } > > +/** > + Performs CPU features Initialization. > + > + This service will invoke MP service to perform CPU features > + initialization on BSP/APs per user configuration. > + > + @note This service could be called by BSP only. > +**/ > +VOID > +EFIAPI > +CpuFeaturesInitialize ( > + VOID > + ) > +{ > + CPU_FEATURES_DATA *CpuFeaturesData; > + UINTN OldBspNumber; > + EFI_EVENT MpEvent; > + EFI_STATUS Status; > + > + CpuFeaturesData = GetCpuFeaturesData (); > + > + OldBspNumber = GetProcessorIndex(); > + CpuFeaturesData->BspNumber = OldBspNumber; > + > + Status = gBS->CreateEvent ( > + EVT_NOTIFY_WAIT, > + TPL_CALLBACK, > + EfiEventEmptyFunction, > + NULL, > + &MpEvent > + ); > + ASSERT_EFI_ERROR (Status); > + > + // > + // Wakeup all APs for programming. > + // > + StartupAPsWorker (SetProcessorRegister, MpEvent); > + // > + // Programming BSP > + // > + SetProcessorRegister (CpuFeaturesData); > + > + // > + // Wait all processors to finish the task. > + // > + do { > + Status = gBS->CheckEvent (MpEvent); > + } while (Status == EFI_NOT_READY); > + ASSERT_EFI_ERROR (Status); > + > + // > + // Switch to new BSP if required > + // > + if (CpuFeaturesData->BspNumber != OldBspNumber) { > + SwitchNewBsp (CpuFeaturesData->BspNumber); > + } > +} > + > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.inf b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.inf > index f0f317c945..362e0c6cd1 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.inf > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.inf > @@ -47,6 +47,8 @@ > SynchronizationLib > UefiBootServicesTableLib > IoLib > + UefiBootServicesTableLib > + UefiLib > > [Protocols] > gEfiMpServiceProtocolGuid ## CONSUMES > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.c b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.c > index 82fe268812..7464cf17ac 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.c > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/PeiRegisterCpuFeaturesLib.c > @@ -149,11 +149,15 @@ GetProcessorInformation ( > **/ > VOID > StartupAPsWorker ( > - IN EFI_AP_PROCEDURE Procedure > + IN EFI_AP_PROCEDURE Procedure, > + IN EFI_EVENT *MpEvent 5. similar to #4. > ) > { > EFI_STATUS Status; > EFI_PEI_MP_SERVICES_PPI *CpuMpPpi; > + CPU_FEATURES_DATA *CpuFeaturesData; > + > + CpuFeaturesData = GetCpuFeaturesData (); > > // > // Get MP Services Protocol > @@ -175,7 +179,7 @@ StartupAPsWorker ( > Procedure, > FALSE, > 0, > - NULL > + CpuFeaturesData > ); > ASSERT_EFI_ERROR (Status); > } > @@ -257,3 +261,50 @@ GetNumberOfProcessor ( > ); > ASSERT_EFI_ERROR (Status); > } > + > +/** > + Performs CPU features Initialization. > + > + This service will invoke MP service to perform CPU features > + initialization on BSP/APs per user configuration. > + > + @note This service could be called by BSP only. > +**/ > +VOID > +EFIAPI > +CpuFeaturesInitialize ( > + VOID > + ) > +{ > + CPU_FEATURES_DATA *CpuFeaturesData; > + UINTN OldBspNumber; > + > + CpuFeaturesData = GetCpuFeaturesData (); > + > + OldBspNumber = GetProcessorIndex(); > + CpuFeaturesData->BspNumber = OldBspNumber; > + > + // > + // Known limitation: In PEI phase, CpuFeatures driver not > + // support async mode execute tasks. So semaphore type > + // register can't been used for this instance, must use > + // DXE type instance. > + // > + > + // > + // Wakeup all APs for programming. > + // > + StartupAPsWorker (SetProcessorRegister, NULL); > + // > + // Programming BSP > + // > + SetProcessorRegister (CpuFeaturesData); > + > + // > + // Switch to new BSP if required > + // > + if (CpuFeaturesData->BspNumber != OldBspNumber) { > + SwitchNewBsp (CpuFeaturesData->BspNumber); > + } > +} > + > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h > index edd266934f..25c25229b4 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeatures.h > @@ -23,6 +23,7 @@ > #include > #include > #include > +#include > > #include > > @@ -46,16 +47,26 @@ typedef struct { > CPU_FEATURE_INITIALIZE InitializeFunc; > UINT8 *BeforeFeatureBitMask; > UINT8 *AfterFeatureBitMask; > + UINT8 *CoreBeforeFeatureBitMask; > + UINT8 *CoreAfterFeatureBitMask; > + UINT8 *PackageBeforeFeatureBitMask; > + UINT8 *PackageAfterFeatureBitMask; > VOID *ConfigData; > BOOLEAN BeforeAll; > BOOLEAN AfterAll; > } CPU_FEATURES_ENTRY; > > +// > +// Flags used when program the register. > +// > +typedef struct { > + volatile UINTN MemoryMappedLock; // Spinlock used to program mmio > + volatile UINT32 *SemaphoreCount; // Semaphore used to program semaphore. > +} PROGRAM_CPU_REGISTER_FLAGS; > + > typedef struct { > UINTN FeaturesCount; > UINT32 BitMaskSize; > - SPIN_LOCK MsrLock; > - SPIN_LOCK MemoryMappedLock; > LIST_ENTRY FeatureList; > > CPU_FEATURES_INIT_ORDER *InitOrder; > @@ -64,9 +75,14 @@ typedef struct { > UINT8 *ConfigurationPcd; > UINT8 *SettingPcd; > > + UINT32 NumberOfCpus; > + ACPI_CPU_DATA *AcpiCpuData; > + > CPU_REGISTER_TABLE *RegisterTable; > CPU_REGISTER_TABLE *PreSmmRegisterTable; > UINTN BspNumber; > + > + PROGRAM_CPU_REGISTER_FLAGS CpuFlags; > } CPU_FEATURES_DATA; > > #define CPU_FEATURE_ENTRY_FROM_LINK(a) \ > @@ -118,10 +134,13 @@ GetProcessorInformation ( > > @param[in] Procedure A pointer to the function to be run on > enabled APs of the system. > + @param[in] MpEvent A pointer to the event to be used later > + to check whether procedure has done. > **/ > VOID > StartupAPsWorker ( > - IN EFI_AP_PROCEDURE Procedure > + IN EFI_AP_PROCEDURE Procedure, > + IN EFI_EVENT *MpEvent > ); > > /** > @@ -170,4 +189,40 @@ DumpCpuFeature ( > IN CPU_FEATURES_ENTRY *CpuFeature > ); > > +/** > + Return feature dependence result. > + > + @param[in] CpuFeature Pointer to CPU feature. > + @param[in] Before Check before dependence or after. > + > + @retval return the dependence result. > +**/ > +CPU_FEATURE_DEPENDENCE_TYPE > +DetectFeatureScope ( > + IN CPU_FEATURES_ENTRY *CpuFeature, > + IN BOOLEAN Before > + ); > + > +/** > + Programs registers for the calling processor. > + > + @param[in,out] Buffer The pointer to private data buffer. > + > +**/ > +VOID > +EFIAPI > +SetProcessorRegister ( > + IN OUT VOID *Buffer > + ); > + > +/** > + Return ACPI_CPU_DATA data. > + > + @return Pointer to ACPI_CPU_DATA data. > +**/ > +ACPI_CPU_DATA * > +GetAcpiCpuData ( > + VOID > + ); > + > #endif > diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeaturesLib.c b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeaturesLib.c > index fa7e107e39..dcf7d621a9 100644 > --- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeaturesLib.c > +++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/RegisterCpuFeaturesLib.c > @@ -112,6 +112,302 @@ IsBitMaskMatchCheck ( > return FALSE; > } > > +/** > + Return feature dependence result. > + > + @param[in] CpuFeature Pointer to CPU feature. > + @param[in] Before Check before dependence or after. > + > + @retval return the dependence result. > +**/ > +CPU_FEATURE_DEPENDENCE_TYPE > +DetectFeatureScope ( > + IN CPU_FEATURES_ENTRY *CpuFeature, > + IN BOOLEAN Before > + ) > +{ > + if (Before) { > + if (CpuFeature->PackageBeforeFeatureBitMask != NULL) { > + return PackageDepType; > + } > + > + if (CpuFeature->CoreBeforeFeatureBitMask != NULL) { > + return CoreDepType; > + } > + > + if (CpuFeature->BeforeFeatureBitMask != NULL) { > + return ThreadDepType; > + } > + > + return NoneDepType; > + } > + > + if (CpuFeature->PackageAfterFeatureBitMask != NULL) { > + return PackageDepType; > + } > + > + if (CpuFeature->CoreAfterFeatureBitMask != NULL) { > + return CoreDepType; > + } > + > + if (CpuFeature->AfterFeatureBitMask != NULL) { > + return ThreadDepType; > + } > + > + return NoneDepType; > +} > + > +/** > + Clear dependence for the specified type. > + > + @param[in] CurrentFeature Cpu feature need to clear. > + @param[in] Before Before or after dependence relationship. > + > +**/ > +VOID > +ClearFeatureScope ( > + IN CPU_FEATURES_ENTRY *CpuFeature, > + IN BOOLEAN Before > + ) > +{ > + if (Before) { > + if (CpuFeature->BeforeFeatureBitMask != NULL) { > + FreePool (CpuFeature->BeforeFeatureBitMask); > + CpuFeature->BeforeFeatureBitMask = NULL; > + } > + if (CpuFeature->CoreBeforeFeatureBitMask != NULL) { > + FreePool (CpuFeature->CoreBeforeFeatureBitMask); > + CpuFeature->CoreBeforeFeatureBitMask = NULL; > + } > + if (CpuFeature->PackageBeforeFeatureBitMask != NULL) { > + FreePool (CpuFeature->PackageBeforeFeatureBitMask); > + CpuFeature->PackageBeforeFeatureBitMask = NULL; > + } > + } else { > + if (CpuFeature->PackageAfterFeatureBitMask != NULL) { > + FreePool (CpuFeature->PackageAfterFeatureBitMask); > + CpuFeature->PackageAfterFeatureBitMask = NULL; > + } > + if (CpuFeature->CoreAfterFeatureBitMask != NULL) { > + FreePool (CpuFeature->CoreAfterFeatureBitMask); > + CpuFeature->CoreAfterFeatureBitMask = NULL; > + } > + if (CpuFeature->AfterFeatureBitMask != NULL) { > + FreePool (CpuFeature->AfterFeatureBitMask); > + CpuFeature->AfterFeatureBitMask = NULL; > + } > + } > +} > + > +/** > + Base on dependence relationship to asjust feature dependence. > + > + ONLY when the feature before(or after) the find feature also has > + dependence with the find feature. In this case, driver need to base > + on dependce relationship to decide how to insert current feature and > + adjust the feature dependence. > + > + @param[in] PreviousFeature CPU feature current before the find one. > + @param[in] CurrentFeature Cpu feature need to adjust. > + @param[in] Before Before or after dependence relationship. > + > + @retval TRUE means the current feature dependence has been adjusted. > + > + @retval FALSE means the previous feature dependence has been adjusted. > + or previous feature has no dependence with the find one. > + > +**/ > +BOOLEAN > +AdjustFeaturesDependence ( > + IN OUT CPU_FEATURES_ENTRY *PreviousFeature, > + IN OUT CPU_FEATURES_ENTRY *CurrentFeature, > + IN BOOLEAN Before > + ) > +{ > + CPU_FEATURE_DEPENDENCE_TYPE PreDependType; > + CPU_FEATURE_DEPENDENCE_TYPE CurrentDependType; > + > + PreDependType = DetectFeatureScope(PreviousFeature, Before); > + CurrentDependType = DetectFeatureScope(CurrentFeature, Before); > + > + // > + // If previous feature has no dependence with the find featue. > + // return FALSE. > + // > + if (PreDependType == NoneDepType) { > + return FALSE; > + } > + > + // > + // If both feature have dependence, keep the one which needs use more > + // processors and clear the dependence for the other one. > + // > + if (PreDependType >= CurrentDependType) { > + ClearFeatureScope (CurrentFeature, Before); > + return TRUE; > + } else { > + ClearFeatureScope (PreviousFeature, Before); > + return FALSE; > + } > +} > + > +/** > + Base on dependence relationship to asjust feature order. > + > + @param[in] FeatureList Pointer to CPU feature list > + @param[in] FindEntry The entry this feature depend on. > + @param[in] CurrentEntry The entry for this feature. > + @param[in] Before Before or after dependence relationship. > + > +**/ > +VOID > +AdjustEntry ( > + IN LIST_ENTRY *FeatureList, > + IN OUT LIST_ENTRY *FindEntry, > + IN OUT LIST_ENTRY *CurrentEntry, > + IN BOOLEAN Before > + ) > +{ > + LIST_ENTRY *PreviousEntry; > + CPU_FEATURES_ENTRY *PreviousFeature; > + CPU_FEATURES_ENTRY *CurrentFeature; > + > + // > + // For CPU feature which has core or package type dependence, later code need to insert > + // AcquireSpinLock/ReleaseSpinLock logic to sequency the execute order. > + // So if driver finds both feature A and B need to execute before feature C, driver will > + // base on dependence type of feature A and B to update the logic here. > + // For example, feature A has package type dependence and feature B has core type dependence, > + // because package type dependence need to wait for more processors which has strong dependence > + // than core type dependence. So driver will adjust the feature order to B -> A -> C. and driver > + // will remove the feature dependence in feature B. > + // Driver just needs to make sure before feature C been executed, feature A has finished its task > + // in all all thread. Feature A finished in all threads also means feature B have finshed in all > + // threads. > + // > + if (Before) { > + PreviousEntry = GetPreviousNode (FeatureList, FindEntry); > + } else { > > + PreviousEntry = GetNextNode (FeatureList, FindEntry); > + } > + > + CurrentFeature = CPU_FEATURE_ENTRY_FROM_LINK (CurrentEntry); > + RemoveEntryList (CurrentEntry); > + > + if (IsNull (FeatureList, PreviousEntry)) { > + // > + // If not exist the previous or next entry, just insert the current entry. > + // > + if (Before) { > + InsertTailList (FindEntry, CurrentEntry); > + } else { > + InsertHeadList (FindEntry, CurrentEntry); > + } > + } else { > + // > + // If exist the previous or next entry, need to check it before insert curent entry. > + // > + PreviousFeature = CPU_FEATURE_ENTRY_FROM_LINK (PreviousEntry); > + > + if (AdjustFeaturesDependence (PreviousFeature, CurrentFeature, Before)) { > + // > + // Return TRUE means current feature dependence has been cleared and the previous > + // feature dependence has been kept and used. So insert current feature before (or after) > + // the previous feature. > + // > + if (Before) { > + InsertTailList (PreviousEntry, CurrentEntry); > + } else { > + InsertHeadList (PreviousEntry, CurrentEntry); > + } > + } else { > + if (Before) { > + InsertTailList (FindEntry, CurrentEntry); > + } else { > + InsertHeadList (FindEntry, CurrentEntry); > + } > + } > + } > +} > > + > +/** > + Checks and adjusts current CPU features per dependency relationship. > + > + @param[in] FeatureList Pointer to CPU feature list > + @param[in] CurrentEntry Pointer to current checked CPU feature > + @param[in] FeatureMask The feature bit mask. > + > + @retval return Swapped info. > +**/ > +BOOLEAN > +InsertToBeforeEntry ( > + IN LIST_ENTRY *FeatureList, > + IN LIST_ENTRY *CurrentEntry, > + IN UINT8 *FeatureMask > + ) > +{ > + LIST_ENTRY *CheckEntry; > + CPU_FEATURES_ENTRY *CheckFeature; > + BOOLEAN Swapped; > + > + Swapped = FALSE; > + > + // > + // Check all features dispatched before this entry > + // > + CheckEntry = GetFirstNode (FeatureList); > + while (CheckEntry != CurrentEntry) { > + CheckFeature = CPU_FEATURE_ENTRY_FROM_LINK (CheckEntry); > + if (IsBitMaskMatchCheck (CheckFeature->FeatureMask, FeatureMask)) { > + AdjustEntry (FeatureList, CheckEntry, CurrentEntry, TRUE); > + Swapped = TRUE; > + break; > + } > + CheckEntry = CheckEntry->ForwardLink; > + } > + > + return Swapped; > +} > + > +/** > + Checks and adjusts current CPU features per dependency relationship. > + > + @param[in] FeatureList Pointer to CPU feature list > + @param[in] CurrentEntry Pointer to current checked CPU feature > + @param[in] FeatureMask The feature bit mask. > + > + @retval return Swapped info. > +**/ > +BOOLEAN > +InsertToAfterEntry ( > + IN LIST_ENTRY *FeatureList, > + IN LIST_ENTRY *CurrentEntry, > + IN UINT8 *FeatureMask > + ) > +{ > + LIST_ENTRY *CheckEntry; > + CPU_FEATURES_ENTRY *CheckFeature; > + BOOLEAN Swapped; > + > + Swapped = FALSE; > + > + // > + // Check all features dispatched after this entry > + // > + CheckEntry = GetNextNode (FeatureList, CurrentEntry); > + while (!IsNull (FeatureList, CheckEntry)) { > + CheckFeature = CPU_FEATURE_ENTRY_FROM_LINK (CheckEntry); > + if (IsBitMaskMatchCheck (CheckFeature->FeatureMask, FeatureMask)) { > + AdjustEntry (FeatureList, CheckEntry, CurrentEntry, FALSE); > + Swapped = TRUE; > + break; > + } > + CheckEntry = CheckEntry->ForwardLink; > + } > + > + return Swapped; > +} > + > /** > Checks and adjusts CPU features order per dependency relationship. > > @@ -128,11 +424,13 @@ CheckCpuFeaturesDependency ( > CPU_FEATURES_ENTRY *CheckFeature; > BOOLEAN Swapped; > LIST_ENTRY *TempEntry; > + LIST_ENTRY *NextEntry; > > CurrentEntry = GetFirstNode (FeatureList); > while (!IsNull (FeatureList, CurrentEntry)) { > Swapped = FALSE; > CpuFeature = CPU_FEATURE_ENTRY_FROM_LINK (CurrentEntry); > + NextEntry = CurrentEntry->ForwardLink; > if (CpuFeature->BeforeAll) { > // > // Check all features dispatched before this entry > @@ -153,6 +451,7 @@ CheckCpuFeaturesDependency ( > CheckEntry = CheckEntry->ForwardLink; > } > if (Swapped) { > + CurrentEntry = NextEntry; > continue; > } > } > @@ -179,60 +478,59 @@ CheckCpuFeaturesDependency ( > CheckEntry = CheckEntry->ForwardLink; > } > if (Swapped) { > + CurrentEntry = NextEntry; > continue; > } > } > > if (CpuFeature->BeforeFeatureBitMask != NULL) { > - // > - // Check all features dispatched before this entry > - // > - CheckEntry = GetFirstNode (FeatureList); > - while (CheckEntry != CurrentEntry) { > - CheckFeature = CPU_FEATURE_ENTRY_FROM_LINK (CheckEntry); > - if (IsBitMaskMatchCheck (CheckFeature->FeatureMask, CpuFeature->BeforeFeatureBitMask)) { > - // > - // If there is dependency, swap them > - // > - RemoveEntryList (CurrentEntry); > - InsertTailList (CheckEntry, CurrentEntry); > - Swapped = TRUE; > - break; > - } > - CheckEntry = CheckEntry->ForwardLink; > - } > + Swapped = InsertToBeforeEntry (FeatureList, CurrentEntry, CpuFeature->BeforeFeatureBitMask); > if (Swapped) { > + CurrentEntry = NextEntry; > continue; > } > } > > if (CpuFeature->AfterFeatureBitMask != NULL) { > - // > - // Check all features dispatched after this entry > - // > - CheckEntry = GetNextNode (FeatureList, CurrentEntry); > - while (!IsNull (FeatureList, CheckEntry)) { > - CheckFeature = CPU_FEATURE_ENTRY_FROM_LINK (CheckEntry); > - if (IsBitMaskMatchCheck (CheckFeature->FeatureMask, CpuFeature->AfterFeatureBitMask)) { > - // > - // If there is dependency, swap them > - // > - TempEntry = GetNextNode (FeatureList, CurrentEntry); > - RemoveEntryList (CurrentEntry); > - InsertHeadList (CheckEntry, CurrentEntry); > - CurrentEntry = TempEntry; > - Swapped = TRUE; > - break; > - } > - CheckEntry = CheckEntry->ForwardLink; > + Swapped = InsertToAfterEntry (FeatureList, CurrentEntry, CpuFeature->AfterFeatureBitMask); > + if (Swapped) { > + CurrentEntry = NextEntry; > + continue; > + } > + } > + > + if (CpuFeature->CoreBeforeFeatureBitMask != NULL) { > + Swapped = InsertToBeforeEntry (FeatureList, CurrentEntry, CpuFeature->CoreBeforeFeatureBitMask); > + if (Swapped) { > + CurrentEntry = NextEntry; > + continue; > } > + } > + > + if (CpuFeature->CoreAfterFeatureBitMask != NULL) { > + Swapped = InsertToAfterEntry (FeatureList, CurrentEntry, CpuFeature->CoreAfterFeatureBitMask); > if (Swapped) { > + CurrentEntry = NextEntry; > continue; > } > } > - // > - // No swap happened, check the next feature > - // > + > + if (CpuFeature->PackageBeforeFeatureBitMask != NULL) { > + Swapped = InsertToBeforeEntry (FeatureList, CurrentEntry, CpuFeature->PackageBeforeFeatureBitMask); > + if (Swapped) { > + CurrentEntry = NextEntry; > + continue; > + } > + } > + > + if (CpuFeature->PackageAfterFeatureBitMask != NULL) { > + Swapped = InsertToAfterEntry (FeatureList, CurrentEntry, CpuFeature->PackageAfterFeatureBitMask); > + if (Swapped) { > + CurrentEntry = NextEntry; > + continue; > + } > + } > + > CurrentEntry = CurrentEntry->ForwardLink; > } > } > @@ -265,8 +563,7 @@ RegisterCpuFeatureWorker ( > CpuFeaturesData = GetCpuFeaturesData (); > if (CpuFeaturesData->FeaturesCount == 0) { > InitializeListHead (&CpuFeaturesData->FeatureList); > - InitializeSpinLock (&CpuFeaturesData->MsrLock); > - InitializeSpinLock (&CpuFeaturesData->MemoryMappedLock); > + InitializeSpinLock (&CpuFeaturesData->CpuFlags.MemoryMappedLock); > CpuFeaturesData->BitMaskSize = (UINT32) BitMaskSize; > } > ASSERT (CpuFeaturesData->BitMaskSize == BitMaskSize); > @@ -328,6 +625,31 @@ RegisterCpuFeatureWorker ( > } > CpuFeatureEntry->AfterFeatureBitMask = CpuFeature->AfterFeatureBitMask; > } > + if (CpuFeature->CoreBeforeFeatureBitMask != NULL) { > + if (CpuFeatureEntry->CoreBeforeFeatureBitMask != NULL) { > + FreePool (CpuFeatureEntry->CoreBeforeFeatureBitMask); > + } > + CpuFeatureEntry->CoreBeforeFeatureBitMask = CpuFeature->CoreBeforeFeatureBitMask; > + } > + if (CpuFeature->CoreAfterFeatureBitMask != NULL) { > + if (CpuFeatureEntry->CoreAfterFeatureBitMask != NULL) { > + FreePool (CpuFeatureEntry->CoreAfterFeatureBitMask); > + } > + CpuFeatureEntry->CoreAfterFeatureBitMask = CpuFeature->CoreAfterFeatureBitMask; > + } > + if (CpuFeature->PackageBeforeFeatureBitMask != NULL) { > + if (CpuFeatureEntry->PackageBeforeFeatureBitMask != NULL) { > + FreePool (CpuFeatureEntry->PackageBeforeFeatureBitMask); > + } > + CpuFeatureEntry->PackageBeforeFeatureBitMask = CpuFeature->PackageBeforeFeatureBitMask; > + } > + if (CpuFeature->PackageAfterFeatureBitMask != NULL) { > + if (CpuFeatureEntry->PackageAfterFeatureBitMask != NULL) { > + FreePool (CpuFeatureEntry->PackageAfterFeatureBitMask); > + } > + CpuFeatureEntry->PackageAfterFeatureBitMask = CpuFeature->PackageAfterFeatureBitMask; > + } > + > CpuFeatureEntry->BeforeAll = CpuFeature->BeforeAll; > CpuFeatureEntry->AfterAll = CpuFeature->AfterAll; > > @@ -410,6 +732,8 @@ SetCpuFeaturesBitMask ( > @retval RETURN_UNSUPPORTED Registration of the CPU feature is not > supported due to a circular dependency between > BEFORE and AFTER features. > + @retval RETURN_NOT_READY CPU feature PCD PcdCpuFeaturesUserConfiguration > + not updated by Platform driver yet. > > @note This service could be called by BSP only. > **/ > @@ -431,12 +755,20 @@ RegisterCpuFeature ( > UINT8 *FeatureMask; > UINT8 *BeforeFeatureBitMask; > UINT8 *AfterFeatureBitMask; > + UINT8 *CoreBeforeFeatureBitMask; > + UINT8 *CoreAfterFeatureBitMask; > + UINT8 *PackageBeforeFeatureBitMask; > + UINT8 *PackageAfterFeatureBitMask; > BOOLEAN BeforeAll; > BOOLEAN AfterAll; > > - FeatureMask = NULL; > - BeforeFeatureBitMask = NULL; > - AfterFeatureBitMask = NULL; > + FeatureMask = NULL; > + BeforeFeatureBitMask = NULL; > + AfterFeatureBitMask = NULL; > + CoreBeforeFeatureBitMask = NULL; > + CoreAfterFeatureBitMask = NULL; > + PackageBeforeFeatureBitMask = NULL; > + PackageAfterFeatureBitMask = NULL; > BeforeAll = FALSE; > AfterAll = FALSE; > > @@ -449,6 +781,10 @@ RegisterCpuFeature ( > != (CPU_FEATURE_BEFORE | CPU_FEATURE_AFTER)); > ASSERT ((Feature & (CPU_FEATURE_BEFORE_ALL | CPU_FEATURE_AFTER_ALL)) > != (CPU_FEATURE_BEFORE_ALL | CPU_FEATURE_AFTER_ALL)); > + ASSERT ((Feature & (CPU_FEATURE_CORE_BEFORE | CPU_FEATURE_CORE_AFTER)) > + != (CPU_FEATURE_CORE_BEFORE | CPU_FEATURE_CORE_AFTER)); > + ASSERT ((Feature & (CPU_FEATURE_PACKAGE_BEFORE | CPU_FEATURE_PACKAGE_AFTER)) > + != (CPU_FEATURE_PACKAGE_BEFORE | CPU_FEATURE_PACKAGE_AFTER)); > if (Feature < CPU_FEATURE_BEFORE) { > BeforeAll = ((Feature & CPU_FEATURE_BEFORE_ALL) != 0) ? TRUE : FALSE; > AfterAll = ((Feature & CPU_FEATURE_AFTER_ALL) != 0) ? TRUE : FALSE; > @@ -459,6 +795,14 @@ RegisterCpuFeature ( > SetCpuFeaturesBitMask (&BeforeFeatureBitMask, Feature & ~CPU_FEATURE_BEFORE, BitMaskSize); > } else if ((Feature & CPU_FEATURE_AFTER) != 0) { > SetCpuFeaturesBitMask (&AfterFeatureBitMask, Feature & ~CPU_FEATURE_AFTER, BitMaskSize); > + } else if ((Feature & CPU_FEATURE_CORE_BEFORE) != 0) { > + SetCpuFeaturesBitMask (&CoreBeforeFeatureBitMask, Feature & ~CPU_FEATURE_CORE_BEFORE, BitMaskSize); > + } else if ((Feature & CPU_FEATURE_CORE_AFTER) != 0) { > + SetCpuFeaturesBitMask (&CoreAfterFeatureBitMask, Feature & ~CPU_FEATURE_CORE_AFTER, BitMaskSize); > + } else if ((Feature & CPU_FEATURE_PACKAGE_BEFORE) != 0) { > + SetCpuFeaturesBitMask (&PackageBeforeFeatureBitMask, Feature & ~CPU_FEATURE_PACKAGE_BEFORE, BitMaskSize); > + } else if ((Feature & CPU_FEATURE_PACKAGE_AFTER) != 0) { > + SetCpuFeaturesBitMask (&PackageAfterFeatureBitMask, Feature & ~CPU_FEATURE_PACKAGE_AFTER, BitMaskSize); > } > Feature = VA_ARG (Marker, UINT32); > } > @@ -466,15 +810,19 @@ RegisterCpuFeature ( > > CpuFeature = AllocateZeroPool (sizeof (CPU_FEATURES_ENTRY)); > ASSERT (CpuFeature != NULL); > - CpuFeature->Signature = CPU_FEATURE_ENTRY_SIGNATURE; > - CpuFeature->FeatureMask = FeatureMask; > - CpuFeature->BeforeFeatureBitMask = BeforeFeatureBitMask; > - CpuFeature->AfterFeatureBitMask = AfterFeatureBitMask; > - CpuFeature->BeforeAll = BeforeAll; > - CpuFeature->AfterAll = AfterAll; > - CpuFeature->GetConfigDataFunc = GetConfigDataFunc; > - CpuFeature->SupportFunc = SupportFunc; > - CpuFeature->InitializeFunc = InitializeFunc; > + CpuFeature->Signature = CPU_FEATURE_ENTRY_SIGNATURE; > + CpuFeature->FeatureMask = FeatureMask; > + CpuFeature->BeforeFeatureBitMask = BeforeFeatureBitMask; > + CpuFeature->AfterFeatureBitMask = AfterFeatureBitMask; > + CpuFeature->CoreBeforeFeatureBitMask = CoreBeforeFeatureBitMask; > + CpuFeature->CoreAfterFeatureBitMask = CoreAfterFeatureBitMask; > + CpuFeature->PackageBeforeFeatureBitMask = PackageBeforeFeatureBitMask; > + CpuFeature->PackageAfterFeatureBitMask = PackageAfterFeatureBitMask; > + CpuFeature->BeforeAll = BeforeAll; > + CpuFeature->AfterAll = AfterAll; > + CpuFeature->GetConfigDataFunc = GetConfigDataFunc; > + CpuFeature->SupportFunc = SupportFunc; > + CpuFeature->InitializeFunc = InitializeFunc; > if (FeatureName != NULL) { > CpuFeature->FeatureName = AllocatePool (CPU_FEATURE_NAME_SIZE); > ASSERT (CpuFeature->FeatureName != NULL); > @@ -489,13 +837,12 @@ RegisterCpuFeature ( > } > > /** > - Allocates boot service data to save ACPI_CPU_DATA. > + Return ACPI_CPU_DATA data. > > - @return Pointer to allocated ACPI_CPU_DATA. > + @return Pointer to ACPI_CPU_DATA data. > **/ > -STATIC > ACPI_CPU_DATA * > -AllocateAcpiCpuData ( > +GetAcpiCpuData ( > VOID > ) > { > @@ -508,9 +855,20 @@ AllocateAcpiCpuData ( > UINTN Index; > EFI_PROCESSOR_INFORMATION ProcessorInfoBuffer; > > + AcpiCpuData = (ACPI_CPU_DATA *) (UINTN) PcdGet64 (PcdCpuS3DataAddress); > + if (AcpiCpuData != NULL) { > + return AcpiCpuData; > + } > + > AcpiCpuData = AllocatePages (EFI_SIZE_TO_PAGES (sizeof (ACPI_CPU_DATA))); > ASSERT (AcpiCpuData != NULL); > > + // > + // Set PcdCpuS3DataAddress to the base address of the ACPI_CPU_DATA structure > + // > + Status = PcdSet64S (PcdCpuS3DataAddress, (UINT64)(UINTN)AcpiCpuData); > + ASSERT_EFI_ERROR (Status); > + > GetNumberOfProcessor (&NumberOfCpus, &NumberOfEnabledProcessors); > AcpiCpuData->NumberOfCpus = (UINT32)NumberOfCpus; > > @@ -606,7 +964,6 @@ CpuRegisterTableWriteWorker ( > IN UINT64 Value > ) > { > - EFI_STATUS Status; > CPU_FEATURES_DATA *CpuFeaturesData; > ACPI_CPU_DATA *AcpiCpuData; > CPU_REGISTER_TABLE *RegisterTable; > @@ -614,17 +971,8 @@ CpuRegisterTableWriteWorker ( > > CpuFeaturesData = GetCpuFeaturesData (); > if (CpuFeaturesData->RegisterTable == NULL) { > - AcpiCpuData = (ACPI_CPU_DATA *) (UINTN) PcdGet64 (PcdCpuS3DataAddress); > - if (AcpiCpuData == NULL) { > - AcpiCpuData = AllocateAcpiCpuData (); > - ASSERT (AcpiCpuData != NULL); > - // > - // Set PcdCpuS3DataAddress to the base address of the ACPI_CPU_DATA structure > - // > - Status = PcdSet64S (PcdCpuS3DataAddress, (UINT64)(UINTN)AcpiCpuData); > - ASSERT_EFI_ERROR (Status); > - } > - ASSERT (AcpiCpuData->RegisterTable != 0); > + AcpiCpuData = GetAcpiCpuData (); > + ASSERT ((AcpiCpuData != NULL) && (AcpiCpuData->RegisterTable != 0)); > CpuFeaturesData->RegisterTable = (CPU_REGISTER_TABLE *) (UINTN) AcpiCpuData->RegisterTable; > CpuFeaturesData->PreSmmRegisterTable = (CPU_REGISTER_TABLE *) (UINTN) AcpiCpuData->PreSmmInitRegisterTable; > } > With the above 5 issues fixed and tested, Reviewed-by: Ruiyu Ni -- Thanks, Ray