public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Dong, Eric" <eric.dong@intel.com>
To: Laszlo Ersek <lersek@redhat.com>,
	"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Cc: "Ni, Ruiyu" <ruiyu.ni@intel.com>
Subject: Re: [Patch 4/4] UefiCpuPkg/PiSmmCpuDxeSmm: Add logic to support semaphore type.
Date: Tue, 16 Oct 2018 14:44:31 +0000	[thread overview]
Message-ID: <ED077930C258884BBCB450DB737E66225576B8B3@shsmsx102.ccr.corp.intel.com> (raw)
In-Reply-To: <f948edb6-f41f-0ebc-2586-8be584a61e63@redhat.com>

Hi Laszlo,

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Tuesday, October 16, 2018 1:13 AM
> To: Dong, Eric <eric.dong@intel.com>; edk2-devel@lists.01.org
> Cc: Ni, Ruiyu <ruiyu.ni@intel.com>
> Subject: Re: [Patch 4/4] UefiCpuPkg/PiSmmCpuDxeSmm: Add logic to support
> semaphore type.
> 
> On 10/15/18 04:49, Eric Dong wrote:
> > Because this driver needs to set MSRs saved in normal boot phase, sync
> > semaphore logic from RegisterCpuFeaturesLib code which used for normal
> boot phase.
> 
> (My review of this patch is going to be superficial. I'm not trying to validate the
> actual algorithm. I'm mostly sanity-checking the code, and gauging whether it
> will break platforms that use CpuS3DataDxe.)
> 

Reasonable, thanks for your efforts.

> 
> > Detail see change SHA-1: dcdf1774212d87e2d7feb36286a408ea7475fd7b for
> > RegisterCpuFeaturesLib.
> 
> (1) I think it is valid to reference other patches in the same series.
> However, the commit hashes are not stable yet -- when you rebase the series,
> the commit hashes will change. Therefore, when we refer to a patch that is not
> upstream yet (i.e. it is part of the same series), it is best to spell out the full
> subject, such as:
> 
> UefiCpuPkg/RegisterCpuFeaturesLib: Add logic to support semaphore type.
> 

I aware this value change when do the rebase action. I plan to update the value when I do the rebase action.  Your suggestion is good. I can also use the change header to specify the change. I will use it in my next change.

> 
> >
> > Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> > Cc: Laszlo Ersek <lersek@redhat.com>
> > Contributed-under: TianoCore Contribution Agreement 1.1
> > Signed-off-by: Eric Dong <eric.dong@intel.com>
> > ---
> >  UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c          | 316 ++++++++++++++++-------
> ------
> >  UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c      |   3 -
> >  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h |   3 +-
> >  3 files changed, 180 insertions(+), 142 deletions(-)
> >
> > diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c
> > b/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c
> > index 52ff9679d5..5a35f7a634 100644
> > --- a/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c
> > +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c
> > @@ -38,9 +38,12 @@ typedef struct {
> >  } MP_ASSEMBLY_ADDRESS_MAP;
> >
> >  //
> > -// Spin lock used to serialize MemoryMapped operation
> > +// Flags used when program the register.
> >  //
> > -SPIN_LOCK                *mMemoryMappedLock = NULL;
> > +typedef struct {
> > +  volatile UINTN           MemoryMappedLock;     // Spinlock used to program
> mmio
> > +  volatile UINT32          *SemaphoreCount;      // Semaphore used to program
> semaphore.
> > +} PROGRAM_CPU_REGISTER_FLAGS;
> >
> >  //
> >  // Signal that SMM BASE relocation is complete.
> > @@ -62,13 +65,11 @@ AsmGetAddressMap (
> >  #define LEGACY_REGION_SIZE    (2 * 0x1000)
> >  #define LEGACY_REGION_BASE    (0xA0000 - LEGACY_REGION_SIZE)
> >
> > +PROGRAM_CPU_REGISTER_FLAGS   mCpuFlags;
> >  ACPI_CPU_DATA                mAcpiCpuData;
> >  volatile UINT32              mNumberToFinish;
> >  MP_CPU_EXCHANGE_INFO         *mExchangeInfo;
> >  BOOLEAN                      mRestoreSmmConfigurationInS3 = FALSE;
> > -MP_MSR_LOCK                  *mMsrSpinLocks = NULL;
> > -UINTN                        mMsrSpinLockCount;
> > -UINTN                        mMsrCount = 0;
> >
> >  //
> >  // S3 boot flag
> > @@ -91,89 +92,6 @@ UINT8                        mApHltLoopCodeTemplate[] = {
> >                                 0xEB, 0xFC               // jmp $-2
> >                                 };
> >
> > -/**
> > -  Get MSR spin lock by MSR index.
> > -
> > -  @param  MsrIndex       MSR index value.
> > -
> > -  @return Pointer to MSR spin lock.
> > -
> > -**/
> > -SPIN_LOCK *
> > -GetMsrSpinLockByIndex (
> > -  IN UINT32      MsrIndex
> > -  )
> > -{
> > -  UINTN     Index;
> > -  for (Index = 0; Index < mMsrCount; Index++) {
> > -    if (MsrIndex == mMsrSpinLocks[Index].MsrIndex) {
> > -      return mMsrSpinLocks[Index].SpinLock;
> > -    }
> > -  }
> > -  return NULL;
> > -}
> > -
> > -/**
> > -  Initialize MSR spin lock by MSR index.
> > -
> > -  @param  MsrIndex       MSR index value.
> > -
> > -**/
> > -VOID
> > -InitMsrSpinLockByIndex (
> > -  IN UINT32      MsrIndex
> > -  )
> > -{
> > -  UINTN    MsrSpinLockCount;
> > -  UINTN    NewMsrSpinLockCount;
> > -  UINTN    Index;
> > -  UINTN    AddedSize;
> > -
> > -  if (mMsrSpinLocks == NULL) {
> > -    MsrSpinLockCount =
> mSmmCpuSemaphores.SemaphoreMsr.AvailableCounter;
> > -    mMsrSpinLocks = (MP_MSR_LOCK *) AllocatePool (sizeof (MP_MSR_LOCK)
> * MsrSpinLockCount);
> > -    ASSERT (mMsrSpinLocks != NULL);
> > -    for (Index = 0; Index < MsrSpinLockCount; Index++) {
> > -      mMsrSpinLocks[Index].SpinLock =
> > -       (SPIN_LOCK *)((UINTN)mSmmCpuSemaphores.SemaphoreMsr.Msr +
> Index * mSemaphoreSize);
> > -      mMsrSpinLocks[Index].MsrIndex = (UINT32)-1;
> > -    }
> > -    mMsrSpinLockCount = MsrSpinLockCount;
> > -    mSmmCpuSemaphores.SemaphoreMsr.AvailableCounter = 0;
> > -  }
> > -  if (GetMsrSpinLockByIndex (MsrIndex) == NULL) {
> > -    //
> > -    // Initialize spin lock for MSR programming
> > -    //
> > -    mMsrSpinLocks[mMsrCount].MsrIndex = MsrIndex;
> > -    InitializeSpinLock (mMsrSpinLocks[mMsrCount].SpinLock);
> > -    mMsrCount ++;
> > -    if (mMsrCount == mMsrSpinLockCount) {
> > -      //
> > -      // If MSR spin lock buffer is full, enlarge it
> > -      //
> > -      AddedSize = SIZE_4KB;
> > -      mSmmCpuSemaphores.SemaphoreMsr.Msr =
> > -                        AllocatePages (EFI_SIZE_TO_PAGES(AddedSize));
> > -      ASSERT (mSmmCpuSemaphores.SemaphoreMsr.Msr != NULL);
> > -      NewMsrSpinLockCount = mMsrSpinLockCount + AddedSize /
> mSemaphoreSize;
> > -      mMsrSpinLocks = ReallocatePool (
> > -                        sizeof (MP_MSR_LOCK) * mMsrSpinLockCount,
> > -                        sizeof (MP_MSR_LOCK) * NewMsrSpinLockCount,
> > -                        mMsrSpinLocks
> > -                        );
> > -      ASSERT (mMsrSpinLocks != NULL);
> > -      mMsrSpinLockCount = NewMsrSpinLockCount;
> > -      for (Index = mMsrCount; Index < mMsrSpinLockCount; Index++) {
> > -        mMsrSpinLocks[Index].SpinLock =
> > -                 (SPIN_LOCK *)((UINTN)mSmmCpuSemaphores.SemaphoreMsr.Msr +
> > -                 (Index - mMsrCount)  * mSemaphoreSize);
> > -        mMsrSpinLocks[Index].MsrIndex = (UINT32)-1;
> > -      }
> > -    }
> > -  }
> > -}
> > -
> >  /**
> >    Sync up the MTRR values for all processors.
> >
> > @@ -204,42 +122,89 @@ Returns:
> >  }
> >
> >  /**
> > -  Programs registers for the calling processor.
> > +  Increment semaphore by 1.
> >
> > -  This function programs registers for the calling processor.
> > +  @param      Sem            IN:  32-bit unsigned integer
> >
> > -  @param  RegisterTables        Pointer to register table of the running
> processor.
> > -  @param  RegisterTableCount    Register table count.
> > +**/
> > +VOID
> > +S3ReleaseSemaphore (
> > +  IN OUT  volatile UINT32           *Sem
> > +  )
> > +{
> > +  InterlockedIncrement (Sem);
> > +}
> > +
> > +/**
> > +  Decrement the semaphore by 1 if it is not zero.
> > +
> > +  Performs an atomic decrement operation for semaphore.
> > +  The compare exchange operation must be performed using  MP safe
> > + mechanisms.
> > +
> > +  @param      Sem            IN:  32-bit unsigned integer
> > +
> > +**/
> > +VOID
> > +S3WaitForSemaphore (
> > +  IN OUT  volatile UINT32           *Sem
> > +  )
> > +{
> > +  UINT32  Value;
> > +
> > +  do {
> > +    Value = *Sem;
> > +  } while (Value == 0);
> > +
> > +  InterlockedDecrement (Sem);
> > +}
> 
> (2) I think this implementation is not correct. If threads T1 and T2 are spinning in
> the loop, and thread T3 releases the semaphore, then both T1 and T2 could see
> (Value==1). They will both exit the loop, they will both decrement (*Sem), and
> then (*Sem) will wrap around.
> 
> Instead, we should do:
> 
>   for (;;) {
>     Value = *Sem;
>     if (Value == 0) {
>       continue;
>     }
>     if (InterlockedCompareExchange32 (Sem, Value, Value - 1) == Value) {
>       break;
>     }
>   }
> 
> This implementation is not protected against the ABA problem, but that's fine.
> Namely, it doesn't matter whether, and how, the value of (*Sem) fluctuates,
> between fetching it into Value, and setting it to (Value-1).
> What matters is that we either perform a transition from Value to (Value-1), or
> nothing.
> 

Good catch. Thanks for your sample code. I will update the code in my next changes.

> 
> > +
> > +/**
> > +  Initialize the CPU registers from a register table.
> > +
> > +  @param[in]  RegisterTable         The register table for this AP.
> > +  @param[in]  ApLocation            AP location info for this ap.
> > +  @param[in]  CpuStatus             CPU status info for this CPU.
> > +  @param[in]  CpuFlags              Flags data structure used when program the
> register.
> >
> > +  @note This service could be called by BSP/APs.
> >  **/
> >  VOID
> > -SetProcessorRegister (
> > -  IN CPU_REGISTER_TABLE        *RegisterTables,
> > -  IN UINTN                     RegisterTableCount
> > +EFIAPI
> > +ProgramProcessorRegister (
> > +  IN CPU_REGISTER_TABLE           *RegisterTable,
> > +  IN EFI_CPU_PHYSICAL_LOCATION    *ApLocation,
> > +  IN CPU_STATUS_INFORMATION       *CpuStatus,
> > +  IN PROGRAM_CPU_REGISTER_FLAGS   *CpuFlags
> >    )
> 
> (3) Any particular reason for declaring this function as EFIAPI?

I plan to export this function as an API from RegisterCpuFeaturesLib, I have did some POC code change. But this version changes will not export it as an API. I forgot to remove EFIAPI when I duplicate the code here. Will remove it in my next version changes.

> 
> 
> >  {
> >    CPU_REGISTER_TABLE_ENTRY  *RegisterTableEntry;
> >    UINTN                     Index;
> >    UINTN                     Value;
> > -  SPIN_LOCK                 *MsrSpinLock;
> > -  UINT32                    InitApicId;
> > -  CPU_REGISTER_TABLE        *RegisterTable;
> > +  CPU_REGISTER_TABLE_ENTRY  *RegisterTableEntryHead;
> > +  volatile UINT32           *SemaphorePtr;
> > +  UINT32                    CoreOffset;
> > +  UINT32                    PackageOffset;
> > +  UINT32                    PackageThreadsCount;
> > +  UINT32                    ApOffset;
> > +  UINTN                     ProcessorIndex;
> > +  UINTN                     ApIndex;
> > +  UINTN                     ValidApCount;
> >
> > -  InitApicId = GetInitialApicId ();
> > -  RegisterTable = NULL;
> > -  for (Index = 0; Index < RegisterTableCount; Index++) {
> > -    if (RegisterTables[Index].InitialApicId == InitApicId) {
> > -      RegisterTable =  &RegisterTables[Index];
> > -      break;
> > -    }
> > -  }
> > -  ASSERT (RegisterTable != NULL);
> > +  ApIndex = ApLocation->Package * CpuStatus->CoreCount * CpuStatus-
> >ThreadCount \
> > +            + ApLocation->Core * CpuStatus->ThreadCount \
> > +            + ApLocation->Thread;
> 
> (4) The backslashes look useless.
> 
> In addition, the plus signs should be at the ends of the lines, according to the
> edk2 style (operators at the end).

Will update then in my next version code changes.

> 
> >
> >    //
> >    // Traverse Register Table of this logical processor
> >    //
> > -  RegisterTableEntry = (CPU_REGISTER_TABLE_ENTRY *) (UINTN)
> > RegisterTable->RegisterTableEntry;
> > -  for (Index = 0; Index < RegisterTable->TableLength; Index++,
> > RegisterTableEntry++) {
> > +  RegisterTableEntryHead = (CPU_REGISTER_TABLE_ENTRY *) (UINTN)
> > + RegisterTable->RegisterTableEntry;
> > +
> > +  for (Index = 0; Index < RegisterTable->TableLength; Index++) {
> 
> (OK, I think this should continue working with (TableLength==0), from
> CpuS3DataDxe.)
> 
> > +
> > +    RegisterTableEntry = &RegisterTableEntryHead[Index];
> > +    DEBUG ((DEBUG_INFO, "Processor = %d, Entry Index %d, Type =
> > + %d!\n", ApIndex, Index, RegisterTableEntry->RegisterType));
> 
> (5) "ApIndex" and "Index" have type UINTN; they should not be printed with
> "%d". The portable way to print them is to cast them to UINT64, and use "%lu".
> 

ok, will update then in my next version code changes.

> 
> > +
> >      //
> >      // Check the type of specified register
> >      //
> > @@ -310,12 +275,6 @@ SetProcessorRegister (
> >            RegisterTableEntry->Value
> >            );
> >        } else {
> > -        //
> > -        // Get lock to avoid Package/Core scope MSRs programming issue in
> parallel execution mode
> > -        // to make sure MSR read/write operation is atomic.
> > -        //
> > -        MsrSpinLock = GetMsrSpinLockByIndex (RegisterTableEntry->Index);
> > -        AcquireSpinLock (MsrSpinLock);
> >          //
> >          // Set the bit section according to bit start and length
> >          //
> > @@ -325,21 +284,20 @@ SetProcessorRegister (
> >            RegisterTableEntry->ValidBitStart + RegisterTableEntry->ValidBitLength -
> 1,
> >            RegisterTableEntry->Value
> >            );
> > -        ReleaseSpinLock (MsrSpinLock);
> >        }
> >        break;
> >      //
> >      // MemoryMapped operations
> >      //
> >      case MemoryMapped:
> > -      AcquireSpinLock (mMemoryMappedLock);
> > +      AcquireSpinLock (&CpuFlags->MemoryMappedLock);
> >        MmioBitFieldWrite32 (
> >          (UINTN)(RegisterTableEntry->Index | LShiftU64 (RegisterTableEntry-
> >HighIndex, 32)),
> >          RegisterTableEntry->ValidBitStart,
> >          RegisterTableEntry->ValidBitStart + RegisterTableEntry->ValidBitLength -
> 1,
> >          (UINT32)RegisterTableEntry->Value
> >          );
> > -      ReleaseSpinLock (mMemoryMappedLock);
> > +      ReleaseSpinLock (&CpuFlags->MemoryMappedLock);
> >        break;
> >      //
> >      // Enable or disable cache
> > @@ -355,12 +313,99 @@ SetProcessorRegister (
> >        }
> >        break;
> >
> > +    case Semaphore:
> > +      SemaphorePtr = CpuFlags->SemaphoreCount;
> > +      switch (RegisterTableEntry->Value) {
> > +      case CoreDepType:
> > +        CoreOffset = (ApLocation->Package * CpuStatus->CoreCount +
> ApLocation->Core) * CpuStatus->ThreadCount;
> > +        ApOffset = CoreOffset + ApLocation->Thread;
> > +        //
> > +        // First increase semaphore count by 1 for processors in this core.
> > +        //
> > +        for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->ThreadCount;
> ProcessorIndex ++) {
> > +          S3ReleaseSemaphore ((UINT32 *) &SemaphorePtr[CoreOffset +
> > + ProcessorIndex]);
> 
> (6) The explicit (UINT32*) cast is confusing and unneeded, please remove it.

ok, will update then in my next version code changes.

> 
> > +        }
> > +        //
> > +        // Second, check whether the count has reach the check number.
> > +        //
> > +        for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->ThreadCount;
> ProcessorIndex ++) {
> > +          S3WaitForSemaphore (&SemaphorePtr[ApOffset]);
> > +        }
> > +        break;
> > +
> > +      case PackageDepType:
> > +        PackageOffset = ApLocation->Package * CpuStatus->CoreCount *
> CpuStatus->ThreadCount;
> > +        PackageThreadsCount = CpuStatus->ThreadCount * CpuStatus-
> >CoreCount;
> > +        ApOffset = PackageOffset + CpuStatus->ThreadCount * ApLocation-
> >Core + ApLocation->Thread;
> > +        ValidApCount = CpuStatus->ThreadCount * CpuStatus-
> >ValidCoresInPackages[ApLocation->Package];
> > +        //
> > +        // First increase semaphore count by 1 for processors in this package.
> > +        //
> > +        for (ProcessorIndex = 0; ProcessorIndex < PackageThreadsCount ;
> ProcessorIndex ++) {
> > +          S3ReleaseSemaphore ((UINT32 *) &SemaphorePtr[PackageOffset
> > + + ProcessorIndex]);
> 
> (7) Same as (6).

ok, will update then in my next version code changes.

> 
> > +        }
> > +        //
> > +        // Second, check whether the count has reach the check number.
> > +        //
> > +        for (ProcessorIndex = 0; ProcessorIndex < ValidApCount; ProcessorIndex
> ++) {
> > +          S3WaitForSemaphore (&SemaphorePtr[ApOffset]);
> > +        }
> > +        break;
> > +
> > +      default:
> > +        break;
> > +      }
> > +      break;
> > +
> >      default:
> >        break;
> >      }
> >    }
> >  }
> >
> > +/**
> > +
> > +  Set Processor register for one AP.
> > +
> > +  @param     SmmPreRegisterTable     Use pre register table or register table.
> > +
> > +**/
> > +VOID
> > +SetRegister (
> > +  IN BOOLEAN                 SmmPreRegisterTable
> 
> (8) For consistency with the "PreSmmInitRegisterTable" field name, I think this
> parameter should be named "PreSmmRegisterTable" (in the leading comment as
> well).

ok, will update then in my next version code changes.
> 
> 
> > +  )
> > +{
> > +  CPU_REGISTER_TABLE        *RegisterTable;
> > +  CPU_REGISTER_TABLE        *RegisterTables;
> > +  UINT32                    InitApicId;
> > +  UINTN                     ProcIndex;
> > +  UINTN                     Index;
> > +
> > +  if (SmmPreRegisterTable) {
> > +    RegisterTables = (CPU_REGISTER_TABLE
> > + *)(UINTN)mAcpiCpuData.PreSmmInitRegisterTable;
> > +  } else {
> > +    RegisterTables = (CPU_REGISTER_TABLE
> > + *)(UINTN)mAcpiCpuData.RegisterTable;
> > +  }
> > +
> > +  InitApicId = GetInitialApicId ();
> > +  RegisterTable = NULL;
> > +  for (Index = 0; Index < mAcpiCpuData.NumberOfCpus; Index++) {
> > +    if (RegisterTables[Index].InitialApicId == InitApicId) {
> > +      RegisterTable =  &RegisterTables[Index];
> 
> (9) Unjustified double space after the equal sign.

ok, will update then in my next version code changes.
> 
> 
> > +      ProcIndex = Index;
> > +      break;
> > +    }
> > +  }
> > +  ASSERT (RegisterTable != NULL);
> > +
> > +  ProgramProcessorRegister (
> > +    RegisterTable,
> > +    mAcpiCpuData.ApLocation + ProcIndex,
> > +    &mAcpiCpuData.CpuStatus,
> > +    &mCpuFlags
> > +    );
> > +}
> > +
> >  /**
> >    AP initialization before then after SMBASE relocation in the S3 boot path.
> >  **/
> > @@ -374,7 +419,7 @@ InitializeAp (
> >
> >    LoadMtrrData (mAcpiCpuData.MtrrTable);
> >
> > -  SetProcessorRegister ((CPU_REGISTER_TABLE *) (UINTN)
> > mAcpiCpuData.PreSmmInitRegisterTable, mAcpiCpuData.NumberOfCpus);
> > +  SetRegister (TRUE);
> >
> >    //
> >    // Count down the number with lock mechanism.
> > @@ -391,7 +436,7 @@ InitializeAp (
> >    ProgramVirtualWireMode ();
> >    DisableLvtInterrupts ();
> >
> > -  SetProcessorRegister ((CPU_REGISTER_TABLE *) (UINTN)
> > mAcpiCpuData.RegisterTable, mAcpiCpuData.NumberOfCpus);
> > +  SetRegister (FALSE);
> >
> >    //
> >    // Place AP into the safe code, count down the number with lock mechanism
> in the safe code.
> > @@ -466,7 +511,7 @@ InitializeCpuBeforeRebase (  {
> >    LoadMtrrData (mAcpiCpuData.MtrrTable);
> >
> > -  SetProcessorRegister ((CPU_REGISTER_TABLE *) (UINTN)
> > mAcpiCpuData.PreSmmInitRegisterTable, mAcpiCpuData.NumberOfCpus);
> > +  SetRegister (TRUE);
> >
> >    ProgramVirtualWireMode ();
> >
> > @@ -502,8 +547,6 @@ InitializeCpuAfterRebase (
> >    VOID
> >    )
> >  {
> > -  SetProcessorRegister ((CPU_REGISTER_TABLE *) (UINTN)
> > mAcpiCpuData.RegisterTable, mAcpiCpuData.NumberOfCpus);
> > -
> >    mNumberToFinish = mAcpiCpuData.NumberOfCpus - 1;
> >
> >    //
> > @@ -511,6 +554,8 @@ InitializeCpuAfterRebase (
> >    //
> >    mInitApsAfterSmmBaseReloc = TRUE;
> >
> > +  SetRegister (FALSE);
> > +
> >    while (mNumberToFinish > 0) {
> >      CpuPause ();
> >    }
> 
> (10) I'm not implying this is incorrect, just asking: can you please explain why the
> function call is *moved*?
> 
> Does it merit a comment in the code perhaps?

Yes, this change used to let Aps continue it procedure before BSP begin to set the register. Old logic will let AP continue their tasks after BSP finished set its register. This is a must have change if the semaphore logic is working. Because if the dependence type is package type, the semaphore will wait for all Aps in one package finishing their tasks before set next register for all APs. If the Aps not begin its task during BSP doing its task, the BSP thread will hang because it is waiting for other Aps in the same package finishing their task.

I will add this info in the comments.

> 
> 
> > @@ -574,8 +619,6 @@ SmmRestoreCpu (
> >
> >    mSmmS3Flag = TRUE;
> >
> > -  InitializeSpinLock (mMemoryMappedLock);
> > -
> >    //
> >    // See if there is enough context to resume PEI Phase
> >    //
> > @@ -790,7 +833,6 @@ CopyRegisterTable (
> >    )
> >  {
> >    UINTN                      Index;
> > -  UINTN                      Index1;
> >    CPU_REGISTER_TABLE_ENTRY   *RegisterTableEntry;
> >
> >    CopyMem (DestinationRegisterTableList, SourceRegisterTableList,
> > NumberOfCpus * sizeof (CPU_REGISTER_TABLE)); @@ -802,17 +844,6 @@
> CopyRegisterTable (
> >          );
> >        ASSERT (RegisterTableEntry != NULL);
> >        DestinationRegisterTableList[Index].RegisterTableEntry =
> (EFI_PHYSICAL_ADDRESS)(UINTN)RegisterTableEntry;
> > -      //
> > -      // Go though all MSRs in register table to initialize MSR spin lock
> > -      //
> > -      for (Index1 = 0; Index1 < DestinationRegisterTableList[Index].TableLength;
> Index1++, RegisterTableEntry++) {
> > -        if ((RegisterTableEntry->RegisterType == Msr) && (RegisterTableEntry-
> >ValidBitLength < 64)) {
> > -          //
> > -          // Initialize MSR spin lock only for those MSRs need bit field writing
> > -          //
> > -          InitMsrSpinLockByIndex (RegisterTableEntry->Index);
> > -        }
> > -      }
> >      }
> >    }
> >  }
> > @@ -832,6 +863,7 @@ GetAcpiCpuData (
> >    VOID                       *GdtForAp;
> >    VOID                       *IdtForAp;
> >    VOID                       *MachineCheckHandlerForAp;
> > +  CPU_STATUS_INFORMATION     *CpuStatus;
> >
> >    if (!mAcpiS3Enable) {
> >      return;
> > @@ -906,6 +938,16 @@ GetAcpiCpuData (
> >    Gdtr->Base = (UINTN)GdtForAp;
> >    Idtr->Base = (UINTN)IdtForAp;
> >    mAcpiCpuData.ApMachineCheckHandlerBase =
> > (EFI_PHYSICAL_ADDRESS)(UINTN)MachineCheckHandlerForAp;
> > +
> > +  CpuStatus = &mAcpiCpuData.CpuStatus;  CopyMem (CpuStatus,
> > + &AcpiCpuData->CpuStatus, sizeof (CPU_STATUS_INFORMATION));
> > + CpuStatus->ValidCoresInPackages = AllocateCopyPool (sizeof (UINT32)
> > + * CpuStatus->PackageCount,
> > + AcpiCpuData->CpuStatus.ValidCoresInPackages);
> 
> (11) This line is 142 characters long.
> 
> Please make sure that all new lines are at most 120 chars long.

got it, will refine the code in my next changes.

> 
> 
> (12) I don't understand the multiplication. In the "ValidCoresInPackages" array,
> do we have a simple (scalar) core count, for each socket?
Yes, ValidCoresInPackages is a pointer which point to an array. This array saves valid core count for each package(socket) in the CPU. In server platform, it has multiple packages and different packages may have different valid cores. This info is required by semaphore register.

> 
> That's what the "ValidApCount" assignment above suggests. Can we perhaps
> rename the field so that it says "Count" somewhere?
> 
> 
> (13) Without modifying CpuS3DataDxe, this line will crash.	
Yes, I missed the change in CpuS3DataDxe, will include the change in my next version.

> 
> 
> > +  ASSERT (CpuStatus->ValidCoresInPackages != NULL);
> > + mAcpiCpuData.ApLocation = AllocateCopyPool
> > + (mAcpiCpuData.NumberOfCpus * sizeof (EFI_CPU_PHYSICAL_LOCATION),
> > + AcpiCpuData->ApLocation);  ASSERT (mAcpiCpuData.ApLocation != NULL);
> 
> (14) This also requires a modification to CpuS3DataDxe.
Yes, I missed the change in CpuS3DataDxe, will include the change in my next version.

> 
> 
> > +  InitializeSpinLock((SPIN_LOCK*) &mCpuFlags.MemoryMappedLock);
> > + mCpuFlags.SemaphoreCount = AllocateZeroPool (sizeof (UINT32) *
> > + CpuStatus->PackageCount * CpuStatus->CoreCount *
> > + CpuStatus->ThreadCount);  ASSERT (mCpuFlags.SemaphoreCount != NULL);
> >  }
> >
> >  /**
> > diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
> > b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
> > index 9cf508a5c7..42b040531e 100644
> > --- a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
> > +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
> > @@ -1303,8 +1303,6 @@ InitializeSmmCpuSemaphores (
> >    mSmmCpuSemaphores.SemaphoreGlobal.CodeAccessCheckLock
> >                                                    = (SPIN_LOCK *)SemaphoreAddr;
> >    SemaphoreAddr += SemaphoreSize;
> > -  mSmmCpuSemaphores.SemaphoreGlobal.MemoryMappedLock
> > -                                                  = (SPIN_LOCK *)SemaphoreAddr;
> >
> >    SemaphoreAddr = (UINTN)SemaphoreBlock + GlobalSemaphoresSize;
> >    mSmmCpuSemaphores.SemaphoreCpu.Busy    = (SPIN_LOCK
> *)SemaphoreAddr;
> > @@ -1321,7 +1319,6 @@ InitializeSmmCpuSemaphores (
> >
> >    mPFLock                       = mSmmCpuSemaphores.SemaphoreGlobal.PFLock;
> >    mConfigSmmCodeAccessCheckLock =
> mSmmCpuSemaphores.SemaphoreGlobal.CodeAccessCheckLock;
> > -  mMemoryMappedLock             =
> mSmmCpuSemaphores.SemaphoreGlobal.MemoryMappedLock;
> >
> >    mSemaphoreSize = SemaphoreSize;
> >  }
> > diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h
> > b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h
> > index 8c7f4996d1..e2970308fe 100644
> > --- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h
> > +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h
> > @@ -53,6 +53,7 @@ WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY
> KIND, EITHER EXPRESS OR IMPLIED.
> >  #include <Library/ReportStatusCodeLib.h>  #include
> > <Library/SmmCpuFeaturesLib.h>  #include
> > <Library/PeCoffGetEntryPointLib.h>
> > +#include <Library/RegisterCpuFeaturesLib.h>
> >
> >  #include <AcpiCpuData.h>
> >  #include <CpuHotPlugData.h>
> > @@ -364,7 +365,6 @@ typedef struct {
> >    volatile BOOLEAN     *AllCpusInSync;
> >    SPIN_LOCK            *PFLock;
> >    SPIN_LOCK            *CodeAccessCheckLock;
> > -  SPIN_LOCK            *MemoryMappedLock;
> >  } SMM_CPU_SEMAPHORE_GLOBAL;
> >
> >  ///
> > @@ -409,7 +409,6 @@ extern SMM_CPU_SEMAPHORES
> mSmmCpuSemaphores;
> >  extern UINTN                               mSemaphoreSize;
> >  extern SPIN_LOCK                           *mPFLock;
> >  extern SPIN_LOCK                           *mConfigSmmCodeAccessCheckLock;
> > -extern SPIN_LOCK                           *mMemoryMappedLock;
> >  extern EFI_SMRAM_DESCRIPTOR                *mSmmCpuSmramRanges;
> >  extern UINTN                               mSmmCpuSmramRangeCount;
> >  extern UINT8                               mPhysicalAddressBits;
> >
> 
> Thanks,
> Laszlo

  reply	other threads:[~2018-10-16 14:44 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-15  2:49 [Patch 0/4] Fix performance issue caused by Set MSR task Eric Dong
2018-10-15  2:49 ` [Patch 1/4] UefiCpuPkg/Include/AcpiCpuData.h: Add Semaphore related Information Eric Dong
2018-10-15 16:02   ` Laszlo Ersek
2018-10-16  3:43     ` Dong, Eric
2018-10-16  2:27   ` Ni, Ruiyu
2018-10-16  5:25     ` Dong, Eric
2018-10-15  2:49 ` [Patch 2/4] UefiCpuPkg/RegisterCpuFeaturesLib.h: Add new dependence types Eric Dong
2018-10-15  2:49 ` [Patch 3/4] UefiCpuPkg/RegisterCpuFeaturesLib: Add logic to support semaphore type Eric Dong
2018-10-16  3:05   ` Ni, Ruiyu
2018-10-16  7:43     ` Dong, Eric
2018-10-15  2:49 ` [Patch 4/4] UefiCpuPkg/PiSmmCpuDxeSmm: " Eric Dong
2018-10-15 17:13   ` Laszlo Ersek
2018-10-16 14:44     ` Dong, Eric [this message]
2018-10-16  3:16   ` Ni, Ruiyu
2018-10-16 23:52     ` Dong, Eric
2018-10-15 15:51 ` [Patch 0/4] Fix performance issue caused by Set MSR task Laszlo Ersek
2018-10-16  1:39   ` Dong, Eric
2018-10-17 11:42     ` Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ED077930C258884BBCB450DB737E66225576B8B3@shsmsx102.ccr.corp.intel.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox