From: "Dong, Eric"
To: "Ni, Ray", "devel@edk2.groups.io"
CC: "Lou, Yun", Laszlo Ersek
Subject: Re: [PATCH] UefiCpuPkg/Feature: Support different thread count per core
Date: Thu, 3 Dec 2020 07:26:23 +0000
References: <20201202115505.664-1-ray.ni@intel.com>
In-Reply-To: <20201202115505.664-1-ray.ni@intel.com>

Reviewed-by: Eric Dong

-----Original Message-----
From: Ni, Ray
Sent: Wednesday, December 2, 2020 7:55 PM
To: devel@edk2.groups.io
Cc: Dong, Eric; Lou, Yun; Laszlo Ersek
Subject: [PATCH] UefiCpuPkg/Feature: Support different thread count per core

Today's code assumes every core contains the same number of threads.
It's not always TRUE for certain model.
Such assumption causes system hang when thread count per core
is different and there is core or package dependency between
CPU features (using CPU_FEATURE_CORE_BEFORE/AFTER,
CPU_FEATURE_PACKAGE_BEFORE/AFTER).

The change removes such assumption by calculating the actual
thread count per package and per core.

Signed-off-by: Ray Ni
Cc: Eric Dong
Cc: Yun Lou
Cc: Laszlo Ersek
---
 UefiCpuPkg/Include/AcpiCpuData.h              |  16 ++-
 .../CpuFeaturesInitialize.c                   | 113 ++++++++++--------
 UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c             |  73 ++++++-----
 3 files changed, 119 insertions(+), 83 deletions(-)

diff --git a/UefiCpuPkg/Include/AcpiCpuData.h b/UefiCpuPkg/Include/AcpiCpuData.h
index 77da5d4455..b5a69ad80c 100644
--- a/UefiCpuPkg/Include/AcpiCpuData.h
+++ b/UefiCpuPkg/Include/AcpiCpuData.h
@@ -1,7 +1,7 @@
 /** @file
 Definitions for CPU S3 data.
 
-Copyright (c) 2013 - 2018, Intel Corporation. All rights reserved.
+Copyright (c) 2013 - 2020, Intel Corporation. All rights reserved.
 SPDX-License-Identifier: BSD-2-Clause-Patent
 
 **/
@@ -60,14 +60,24 @@ typedef struct {
   UINT32                  MaxThreadCount;
   //
   // This field points to an array.
-  // This array saves valid core count (type UINT32) of each package.
+  // This array saves thread count (type UINT32) of each package.
   // The array has PackageCount elements.
   //
   // If the platform does not support MSR setting at S3 resume, and
   // therefore it doesn't need the dependency semaphores, it should set
   // this field to 0.
   //
-  EFI_PHYSICAL_ADDRESS    ValidCoreCountPerPackage;
+  EFI_PHYSICAL_ADDRESS    ThreadCountPerPackage;
+  //
+  // This field points to an array.
+  // This array saves thread count (type UINT8) of each core.
+  // The array has PackageCount * MaxCoreCount elements.
+  //
+  // If the platform does not support MSR setting at S3 resume, and
+  // therefore it doesn't need the dependency semaphores, it should set
+  // this field to 0.
+  //
+  EFI_PHYSICAL_ADDRESS    ThreadCountPerCore;
 } CPU_STATUS_INFORMATION;
 
 //
diff --git a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c
index 5c673fa8cf..0cce909cc0 100644
--- a/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c
+++ b/UefiCpuPkg/Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c
@@ -103,14 +103,13 @@ CpuInitDataInitialize (
   UINT32                               Package;
   UINT32                               Thread;
   EFI_CPU_PHYSICAL_LOCATION            *Location;
-  BOOLEAN                              *CoresVisited;
-  UINTN                                Index;
   UINT32                               PackageIndex;
   UINT32                               CoreIndex;
   UINT32                               First;
   ACPI_CPU_DATA                        *AcpiCpuData;
   CPU_STATUS_INFORMATION               *CpuStatus;
-  UINT32                               *ValidCoreCountPerPackage;
+  UINT32                               *ThreadCountPerPackage;
+  UINT8                                *ThreadCountPerCore;
   UINTN                                NumberOfCpus;
   UINTN                                NumberOfEnabledProcessors;
 
@@ -202,35 +201,32 @@ CpuInitDataInitialize (
   //
   // Collect valid core count in each package because not all cores are valid.
   //
-  ValidCoreCountPerPackage= AllocateZeroPool (sizeof (UINT32) * CpuStatus->PackageCount);
-  ASSERT (ValidCoreCountPerPackage != 0);
-  CpuStatus->ValidCoreCountPerPackage = (EFI_PHYSICAL_ADDRESS)(UINTN)ValidCoreCountPerPackage;
-  CoresVisited = AllocatePool (sizeof (BOOLEAN) * CpuStatus->MaxCoreCount);
-  ASSERT (CoresVisited != NULL);
-
-  for (Index = 0; Index < CpuStatus->PackageCount; Index ++ ) {
-    ZeroMem (CoresVisited, sizeof (BOOLEAN) * CpuStatus->MaxCoreCount);
-    //
-    // Collect valid cores in Current package.
-    //
-    for (ProcessorNumber = 0; ProcessorNumber < NumberOfCpus; ProcessorNumber++) {
-      Location = &CpuFeaturesData->InitOrder[ProcessorNumber].CpuInfo.ProcessorInfo.Location;
-      if (Location->Package == Index && !CoresVisited[Location->Core] ) {
-        //
-        // The ValidCores position for Location->Core is valid.
-        // The possible values in ValidCores[Index] are 0 or 1.
-        // FALSE means no valid threads in this Core.
-        // TRUE means have valid threads in this core, no matter the thead count is 1 or more.
-        //
-        CoresVisited[Location->Core] = TRUE;
-        ValidCoreCountPerPackage[Index]++;
-      }
-    }
+  ThreadCountPerPackage = AllocateZeroPool (sizeof (UINT32) * CpuStatus->PackageCount);
+  ASSERT (ThreadCountPerPackage != NULL);
+  CpuStatus->ThreadCountPerPackage = (EFI_PHYSICAL_ADDRESS)(UINTN)ThreadCountPerPackage;
+
+  ThreadCountPerCore = AllocateZeroPool (sizeof (UINT8) * CpuStatus->PackageCount * CpuStatus->MaxCoreCount);
+  ASSERT (ThreadCountPerCore != NULL);
+  CpuStatus->ThreadCountPerCore = (EFI_PHYSICAL_ADDRESS)(UINTN)ThreadCountPerCore;
+
+  for (ProcessorNumber = 0; ProcessorNumber < NumberOfCpus; ProcessorNumber++) {
+    Location = &CpuFeaturesData->InitOrder[ProcessorNumber].CpuInfo.ProcessorInfo.Location;
+    ThreadCountPerPackage[Location->Package]++;
+    ThreadCountPerCore[Location->Package * CpuStatus->MaxCoreCount + Location->Core]++;
   }
-  FreePool (CoresVisited);
 
-  for (Index = 0; Index <= Package; Index++) {
-    DEBUG ((DEBUG_INFO, "Package: %d, Valid Core : %d\n", Index, ValidCoreCountPerPackage[Index]));
+  for (PackageIndex = 0; PackageIndex < CpuStatus->PackageCount; PackageIndex++) {
+    if (ThreadCountPerPackage[PackageIndex] != 0) {
+      DEBUG ((DEBUG_INFO, "P%02d: Thread Count = %d\n", PackageIndex, ThreadCountPerPackage[PackageIndex]));
+      for (CoreIndex = 0; CoreIndex < CpuStatus->MaxCoreCount; CoreIndex++) {
+        if (ThreadCountPerCore[PackageIndex * CpuStatus->MaxCoreCount + CoreIndex] != 0) {
+          DEBUG ((
+            DEBUG_INFO, " P%02d C%04d, Thread Count = %d\n", PackageIndex, CoreIndex,
+            ThreadCountPerCore[PackageIndex * CpuStatus->MaxCoreCount + CoreIndex]
+            ));
+        }
+      }
+    }
   }
 
   CpuFeaturesData->CpuFlags.CoreSemaphoreCount = AllocateZeroPool (sizeof (UINT32) * CpuStatus->PackageCount * CpuStatus->MaxCoreCount * CpuStatus->MaxThreadCount);
@@ -894,11 +890,11 @@ ProgramProcessorRegister (
   CPU_REGISTER_TABLE_ENTRY  *RegisterTableEntryHead;
   volatile UINT32           *SemaphorePtr;
   UINT32                    FirstThread;
-  UINT32                    PackageThreadsCount;
   UINT32                    CurrentThread;
+  UINT32                    CurrentCore;
   UINTN                     ProcessorIndex;
-  UINTN                     ValidThreadCount;
-  UINT32                    *ValidCoreCountPerPackage;
+  UINT32                    *ThreadCountPerPackage;
+  UINT8                     *ThreadCountPerCore;
   EFI_STATUS                Status;
   UINT64                    CurrentValue;
 
@@ -1029,28 +1025,44 @@ ProgramProcessorRegister (
       switch (RegisterTableEntry->Value) {
       case CoreDepType:
         SemaphorePtr = CpuFlags->CoreSemaphoreCount;
+        ThreadCountPerCore = (UINT8 *)(UINTN)CpuStatus->ThreadCountPerCore;
+
+        CurrentCore = ApLocation->Package * CpuStatus->MaxCoreCount + ApLocation->Core;
         //
         // Get Offset info for the first thread in the core which current thread belongs to.
         //
-        FirstThread = (ApLocation->Package * CpuStatus->MaxCoreCount + ApLocation->Core) * CpuStatus->MaxThreadCount;
+        FirstThread = CurrentCore * CpuStatus->MaxThreadCount;
         CurrentThread = FirstThread + ApLocation->Thread;
+
         //
-        // First Notify all threads in current Core that this thread has ready.
+        // Different cores may have different valid threads in them. If driver maintail clearly
+        // thread index in different cores, the logic will be much complicated.
+        // Here driver just simply records the max thread number in all cores and use it as expect
+        // thread number for all cores.
+        // In below two steps logic, first current thread will Release semaphore for each thread
+        // in current core. Maybe some threads are not valid in this core, but driver don't
+        // care. Second, driver will let current thread wait semaphore for all valid threads in
+        // current core. Because only the valid threads will do release semaphore for this
+        // thread, driver here only need to wait the valid thread count.
+        //
+
+        //
+        // First Notify ALL THREADs in current Core that this thread is ready.
         //
         for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->MaxThreadCount; ProcessorIndex ++) {
-          LibReleaseSemaphore ((UINT32 *) &SemaphorePtr[FirstThread + ProcessorIndex]);
+          LibReleaseSemaphore (&SemaphorePtr[FirstThread + ProcessorIndex]);
         }
         //
-        // Second, check whether all valid threads in current core have ready.
+        // Second, check whether all VALID THREADs (not all threads) in current core are ready.
         //
-        for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->MaxThreadCount; ProcessorIndex ++) {
+        for (ProcessorIndex = 0; ProcessorIndex < ThreadCountPerCore[CurrentCore]; ProcessorIndex ++) {
           LibWaitForSemaphore (&SemaphorePtr[CurrentThread]);
         }
         break;
 
       case PackageDepType:
         SemaphorePtr = CpuFlags->PackageSemaphoreCount;
-        ValidCoreCountPerPackage = (UINT32 *)(UINTN)CpuStatus->ValidCoreCountPerPackage;
+        ThreadCountPerPackage = (UINT32 *)(UINTN)CpuStatus->ThreadCountPerPackage;
         //
         // Get Offset info for the first thread in the package which current thread belongs to.
         //
@@ -1058,18 +1070,13 @@ ProgramProcessorRegister (
         //
         // Get the possible threads count for current package.
         //
-        PackageThreadsCount = CpuStatus->MaxThreadCount * CpuStatus->MaxCoreCount;
         CurrentThread = FirstThread + CpuStatus->MaxThreadCount * ApLocation->Core + ApLocation->Thread;
-        //
-        // Get the valid thread count for current package.
-        //
-        ValidThreadCount = CpuStatus->MaxThreadCount * ValidCoreCountPerPackage[ApLocation->Package];
 
         //
-        // Different packages may have different valid cores in them. If driver maintail clearly
-        // cores number in different packages, the logic will be much complicated.
-        // Here driver just simply records the max core number in all packages and use it as expect
-        // core number for all packages.
+        // Different packages may have different valid threads in them. If driver maintail clearly
+        // thread index in different packages, the logic will be much complicated.
+        // Here driver just simply records the max thread number in all packages and use it as expect
+        // thread number for all packages.
         // In below two steps logic, first current thread will Release semaphore for each thread
         // in current package. Maybe some threads are not valid in this package, but driver don't
         // care. Second, driver will let current thread wait semaphore for all valid threads in
@@ -1078,15 +1085,15 @@ ProgramProcessorRegister (
         //
 
         //
-        // First Notify ALL THREADS in current package that this thread has ready.
+        // First Notify ALL THREADS in current package that this thread is ready.
         //
-        for (ProcessorIndex = 0; ProcessorIndex < PackageThreadsCount ; ProcessorIndex ++) {
-          LibReleaseSemaphore ((UINT32 *) &SemaphorePtr[FirstThread + ProcessorIndex]);
+        for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->MaxThreadCount * CpuStatus->MaxCoreCount; ProcessorIndex ++) {
+          LibReleaseSemaphore (&SemaphorePtr[FirstThread + ProcessorIndex]);
         }
         //
-        // Second, check whether VALID THREADS (not all threads) in current package have ready.
+        // Second, check whether VALID THREADS (not all threads) in current package are ready.
         //
-        for (ProcessorIndex = 0; ProcessorIndex < ValidThreadCount; ProcessorIndex ++) {
+        for (ProcessorIndex = 0; ProcessorIndex < ThreadCountPerPackage[ApLocation->Package]; ProcessorIndex ++) {
           LibWaitForSemaphore (&SemaphorePtr[CurrentThread]);
         }
         break;
diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c b/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c
index 29e9ba92b4..9592430636 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c
@@ -1,7 +1,7 @@
 /** @file
 Code for Processor S3 restoration
 
-Copyright (c) 2006 - 2019, Intel Corporation. All rights reserved.
+Copyright (c) 2006 - 2020, Intel Corporation. All rights reserved.
 SPDX-License-Identifier: BSD-2-Clause-Patent
 
 **/
@@ -235,11 +235,11 @@ ProgramProcessorRegister (
   CPU_REGISTER_TABLE_ENTRY  *RegisterTableEntryHead;
   volatile UINT32           *SemaphorePtr;
   UINT32                    FirstThread;
-  UINT32                    PackageThreadsCount;
   UINT32                    CurrentThread;
+  UINT32                    CurrentCore;
   UINTN                     ProcessorIndex;
-  UINTN                     ValidThreadCount;
-  UINT32                    *ValidCoreCountPerPackage;
+  UINT32                    *ThreadCountPerPackage;
+  UINT8                     *ThreadCountPerCore;
   EFI_STATUS                Status;
   UINT64                    CurrentValue;
 
@@ -372,35 +372,52 @@ ProgramProcessorRegister (
       //
       ASSERT (
         (ApLocation != NULL) &&
-        (CpuStatus->ValidCoreCountPerPackage != 0) &&
+        (CpuStatus->ThreadCountPerPackage != 0) &&
+        (CpuStatus->ThreadCountPerCore != 0) &&
         (CpuFlags->CoreSemaphoreCount != NULL) &&
         (CpuFlags->PackageSemaphoreCount != NULL)
         );
       switch (RegisterTableEntry->Value) {
       case CoreDepType:
         SemaphorePtr = CpuFlags->CoreSemaphoreCount;
+        ThreadCountPerCore = (UINT8 *)(UINTN)CpuStatus->ThreadCountPerCore;
+
+        CurrentCore = ApLocation->Package * CpuStatus->MaxCoreCount + ApLocation->Core;
         //
         // Get Offset info for the first thread in the core which current thread belongs to.
         //
-        FirstThread = (ApLocation->Package * CpuStatus->MaxCoreCount + ApLocation->Core) * CpuStatus->MaxThreadCount;
+        FirstThread = CurrentCore * CpuStatus->MaxThreadCount;
         CurrentThread = FirstThread + ApLocation->Thread;
+
         //
-        // First Notify all threads in current Core that this thread has ready.
+        // Different cores may have different valid threads in them. If driver maintail clearly
+        // thread index in different cores, the logic will be much complicated.
+        // Here driver just simply records the max thread number in all cores and use it as expect
+        // thread number for all cores.
+        // In below two steps logic, first current thread will Release semaphore for each thread
+        // in current core. Maybe some threads are not valid in this core, but driver don't
+        // care. Second, driver will let current thread wait semaphore for all valid threads in
+        // current core. Because only the valid threads will do release semaphore for this
+        // thread, driver here only need to wait the valid thread count.
+        //
+
+        //
+        // First Notify ALL THREADs in current Core that this thread is ready.
         //
         for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->MaxThreadCount; ProcessorIndex ++) {
           S3ReleaseSemaphore (&SemaphorePtr[FirstThread + ProcessorIndex]);
         }
         //
-        // Second, check whether all valid threads in current core have ready.
+        // Second, check whether all VALID THREADs (not all threads) in current core are ready.
         //
-        for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->MaxThreadCount; ProcessorIndex ++) {
+        for (ProcessorIndex = 0; ProcessorIndex < ThreadCountPerCore[CurrentCore]; ProcessorIndex ++) {
           S3WaitForSemaphore (&SemaphorePtr[CurrentThread]);
         }
         break;
 
       case PackageDepType:
         SemaphorePtr = CpuFlags->PackageSemaphoreCount;
-        ValidCoreCountPerPackage = (UINT32 *)(UINTN)CpuStatus->ValidCoreCountPerPackage;
+        ThreadCountPerPackage = (UINT32 *)(UINTN)CpuStatus->ThreadCountPerPackage;
         //
         // Get Offset info for the first thread in the package which current thread belongs to.
         //
@@ -408,18 +425,13 @@ ProgramProcessorRegister (
         //
         // Get the possible threads count for current package.
         //
-        PackageThreadsCount = CpuStatus->MaxThreadCount * CpuStatus->MaxCoreCount;
         CurrentThread = FirstThread + CpuStatus->MaxThreadCount * ApLocation->Core + ApLocation->Thread;
-        //
-        // Get the valid thread count for current package.
-        //
-        ValidThreadCount = CpuStatus->MaxThreadCount * ValidCoreCountPerPackage[ApLocation->Package];
 
         //
-        // Different packages may have different valid cores in them. If driver maintail clearly
-        // cores number in different packages, the logic will be much complicated.
-        // Here driver just simply records the max core number in all packages and use it as expect
-        // core number for all packages.
+        // Different packages may have different valid threads in them. If driver maintail clearly
+        // thread index in different packages, the logic will be much complicated.
+        // Here driver just simply records the max thread number in all packages and use it as expect
+        // thread number for all packages.
         // In below two steps logic, first current thread will Release semaphore for each thread
         // in current package. Maybe some threads are not valid in this package, but driver don't
         // care. Second, driver will let current thread wait semaphore for all valid threads in
@@ -428,15 +440,15 @@ ProgramProcessorRegister (
         //
 
         //
-        // First Notify all threads in current package that this thread has ready.
+        // First Notify ALL THREADS in current package that this thread is ready.
         //
-        for (ProcessorIndex = 0; ProcessorIndex < PackageThreadsCount ; ProcessorIndex ++) {
+        for (ProcessorIndex = 0; ProcessorIndex < CpuStatus->MaxThreadCount * CpuStatus->MaxCoreCount; ProcessorIndex ++) {
           S3ReleaseSemaphore (&SemaphorePtr[FirstThread + ProcessorIndex]);
         }
         //
-        // Second, check whether all valid threads in current package have ready.
+        // Second, check whether VALID THREADS (not all threads) in current package are ready.
         //
-        for (ProcessorIndex = 0; ProcessorIndex < ValidThreadCount; ProcessorIndex ++) {
+        for (ProcessorIndex = 0; ProcessorIndex < ThreadCountPerPackage[ApLocation->Package]; ProcessorIndex ++) {
           S3WaitForSemaphore (&SemaphorePtr[CurrentThread]);
         }
         break;
@@ -1059,12 +1071,19 @@ GetAcpiCpuData (
 
   CpuStatus = &mAcpiCpuData.CpuStatus;
   CopyMem (CpuStatus, &AcpiCpuData->CpuStatus, sizeof (CPU_STATUS_INFORMATION));
-  if (AcpiCpuData->CpuStatus.ValidCoreCountPerPackage != 0) {
-    CpuStatus->ValidCoreCountPerPackage = (EFI_PHYSICAL_ADDRESS)(UINTN)AllocateCopyPool (
+  if (AcpiCpuData->CpuStatus.ThreadCountPerPackage != 0) {
+    CpuStatus->ThreadCountPerPackage = (EFI_PHYSICAL_ADDRESS)(UINTN)AllocateCopyPool (
                                             sizeof (UINT32) * CpuStatus->PackageCount,
-                                            (UINT32 *)(UINTN)AcpiCpuData->CpuStatus.ValidCoreCountPerPackage
+                                            (UINT32 *)(UINTN)AcpiCpuData->CpuStatus.ThreadCountPerPackage
+                                            );
+    ASSERT (CpuStatus->ThreadCountPerPackage != 0);
+  }
+  if (AcpiCpuData->CpuStatus.ThreadCountPerCore != 0) {
+    CpuStatus->ThreadCountPerCore = (EFI_PHYSICAL_ADDRESS)(UINTN)AllocateCopyPool (
+                                            sizeof (UINT8) * (CpuStatus->PackageCount * CpuStatus->MaxCoreCount),
+                                            (UINT32 *)(UINTN)AcpiCpuData->CpuStatus.ThreadCountPerCore
                                             );
-    ASSERT (CpuStatus->ValidCoreCountPerPackage != 0);
+    ASSERT (CpuStatus->ThreadCountPerCore != 0);
   }
   if (AcpiCpuData->ApLocation != 0) {
     mAcpiCpuData.ApLocation = (EFI_PHYSICAL_ADDRESS)(UINTN)AllocateCopyPool (
--
2.27.0.windows.1
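
For readers skimming the diff, the counting step added to CpuInitDataInitialize and the wait counts that ProgramProcessorRegister derives from it can be sketched in plain C. This is only an illustration under stated assumptions, not the EDK2 code: `CpuLocation`, `ThreadCounts`, `CountThreads`, `CoreWaitCount` and `PackageWaitCount` are hypothetical names, and `calloc` stands in for AllocateZeroPool.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for EFI_CPU_PHYSICAL_LOCATION. */
typedef struct {
  uint32_t Package;
  uint32_t Core;
  uint32_t Thread;
} CpuLocation;

/* Hypothetical container for the two tables the patch introduces. */
typedef struct {
  uint32_t PackageCount;
  uint32_t MaxCoreCount;
  uint32_t *ThreadCountPerPackage; /* PackageCount elements */
  uint8_t  *ThreadCountPerCore;    /* PackageCount * MaxCoreCount elements */
} ThreadCounts;

/* Walk every processor once and bump the counters for its package and core.
   Returns 0 on success, -1 if an allocation fails. */
static int CountThreads(const CpuLocation *Locations, size_t NumberOfCpus,
                        uint32_t PackageCount, uint32_t MaxCoreCount,
                        ThreadCounts *Out)
{
  Out->PackageCount = PackageCount;
  Out->MaxCoreCount = MaxCoreCount;
  Out->ThreadCountPerPackage = calloc(PackageCount, sizeof(uint32_t));
  Out->ThreadCountPerCore =
      calloc((size_t)PackageCount * MaxCoreCount, sizeof(uint8_t));
  if (Out->ThreadCountPerPackage == NULL || Out->ThreadCountPerCore == NULL) {
    free(Out->ThreadCountPerPackage);
    free(Out->ThreadCountPerCore);
    return -1;
  }

  for (size_t Index = 0; Index < NumberOfCpus; Index++) {
    const CpuLocation *Loc = &Locations[Index];
    assert(Loc->Package < PackageCount && Loc->Core < MaxCoreCount);
    Out->ThreadCountPerPackage[Loc->Package]++;
    /* Cores are flattened as Package * MaxCoreCount + Core, as in the patch. */
    Out->ThreadCountPerCore[Loc->Package * MaxCoreCount + Loc->Core]++;
  }
  return 0;
}

/* Number of semaphore waits for a core-scope dependency. */
static uint32_t CoreWaitCount(const ThreadCounts *Counts, const CpuLocation *Loc)
{
  return Counts->ThreadCountPerCore[Loc->Package * Counts->MaxCoreCount + Loc->Core];
}

/* Number of semaphore waits for a package-scope dependency. */
static uint32_t PackageWaitCount(const ThreadCounts *Counts, const CpuLocation *Loc)
{
  return Counts->ThreadCountPerPackage[Loc->Package];
}
```

With a package whose first core has two threads and whose second core has one, a thread on the second core waits once for a core dependency and three times for a package dependency, instead of waiting on a uniform MaxThreadCount that would never be satisfied.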