From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: mx.groups.io; dkim=pass header.i=@protonmail.com header.s=default header.b=sUEO9EMz; spf=pass (domain: protonmail.com, ip: 185.70.40.27, mailfrom: vit9696@protonmail.com) Received: from mail4.protonmail.ch (mail4.protonmail.ch [185.70.40.27]) by groups.io with SMTP; Fri, 16 Aug 2019 18:21:11 -0700 Date: Sat, 17 Aug 2019 01:21:06 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com; s=default; t=1566004868; bh=h8l55klNOTShjBuqg65if42djdSPQdBcOgXTUC0WiQE=; h=Date:To:From:Cc:Reply-To:Subject:In-Reply-To:References: Feedback-ID:From; b=sUEO9EMz4yyl6FyiMOIFTVCnHrtlwhl3F2FypF+9qfUSIO1WvU5/ZSfuCO7i5NRcd Gaw8TeKLnD1KJX2CdOr7/plkIeappNX/c2C1pmh18ZDFVWmiJzHkLMvcWG0nmDItES SRhYV5QkR04pGO83izL44d/sa/dP4e0HCEjhfGyo= To: Andrew Fish From: "Vitaly Cheptsov" Cc: devel@edk2.groups.io, jiewen.yao@intel.com Reply-To: "vit9696@protonmail.com" Subject: Re: [edk2-devel] Determining TSC frequency programmatically Message-ID: <6FD228D6-9926-4EB1-8DE9-1F44CFCB1C0D@protonmail.com> In-Reply-To: <1E44B076-EDC9-492C-9214-D4FC1D3AEAFB@apple.com> References: <8EC14D0D-DFA5-412D-A4E1-4D641576D58E@protonmail.com> <1E44B076-EDC9-492C-9214-D4FC1D3AEAFB@apple.com> Feedback-ID: p9QuX-L1wMgUm6nrSvNrf8juLupNs0VSnzXGVXuYDxlEahFdWtaedWDMB9zpwGDklGt7kzs1-RBc0cqz327Gcg==:Ext:ProtonMail MIME-Version: 1.0 X-Spam-Status: No, score=-0.7 required=7.0 tests=ALL_TRUSTED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FROM,FREEMAIL_REPLYTO_END_DIGIT,HTML_FONT_LOW_CONTRAST, HTML_MESSAGE autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on mail.protonmail.ch X-Groupsio-MsgNum: 46004 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha256; boundary="---------------------60071df18ec12809276985cd740ef443"; charset=UTF-8 -----------------------60071df18ec12809276985cd740ef443 Cc: devel@edk2.groups.io, jiewen.yao@intel.com Content-Type: multipart/alternative; boundary="Apple-Mail=_EEBC6DD4-B719-4236-96B7-8796F87A65FA" Date: Sat, 17 Aug 2019 04:21:04 +0300 From: "vit9696@protonmail.com" In-Reply-To: <1E44B076-EDC9-492C-9214-D4FC1D3AEAFB@apple.com> Message-Id: <6FD228D6-9926-4EB1-8DE9-1F44CFCB1C0D@protonmail.com> Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\)) References: <8EC14D0D-DFA5-412D-A4E1-4D641576D58E@protonmail.com> <1E44B076-EDC9-492C-9214-D4FC1D3AEAFB@apple.com> Subject: Re: [edk2-devel] Determining TSC frequency programmatically To: Andrew Fish X-Mailer: Apple Mail (2.3445.104.11) --Apple-Mail=_EEBC6DD4-B719-4236-96B7-8796F87A65FA Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 Andrew, Thanks for the reminder for CpuDxe. I think we saw CpuDxe calculation some time ago, and it was most likely a = reason we went with a very similar approach to determine the TSC frequency = as a fallback mechanism. It, however, is in fact prone to issues, putting a= side the non-guaranteed period and CPU frequency match. Basically, there ar= e two problems: 1. The code implemented in C is subject to minor time drifts, which result= s in slightly different results across the reboots. The calculation based o= n CPUID 15H is more accurate. 2. The lack of TPL changes during frequency measurement results in EFI_EVE= NTs potentially interrupting the process and thus making the resulting valu= e very inaccurate. Since the value is cached there is not always a possibil= ity to avoid it. EmulatorPkg is a bit of a special case, which we think we know about :), a= ctually thanks for fixing it up on macOS, this is appreciated! All in all, I would say that there really needs to exist a decent library,= which most likely CpuDxe should make a use of as well. Best wishes, Vitaly > 16 =D0=B0=D0=B2=D0=B3. 2019 =D0=B3., =D0=B2 21:35, Andrew Fish =D0=BD=D0=B0=D0=BF=D0=B8=D1=81=D0=B0=D0=BB(=D0=B0): >=20 > Vitaly, >=20 > As Mike mentioned platforms can know more info about how they are constr= ucted thus you may not want to have a lot of generic discovery code floatin= g about if you don't really need it.=20 >=20 > One option could be to pass up the TSC Frequency/Period via some EFI mec= hanism so generic code can be told by platform specific code.=20 >=20 > The PI spec already has an abstraction for a CPU based timer that is arc= hitecture neutral. The CPU Architectural Protocol has a GetTimerValue() mem= ber function.=20 > https://github.com/tianocore/edk2/blob/master/MdePkg/Include/Protocol/Cp= u.h#L220 >=20 > For X86 it returns TSC > https://github.com/tianocore/edk2/blob/master/UefiCpuPkg/CpuDxe/CpuDxe.c= #L289 >=20 > EFI Systems are not required to implement PI so we usually don't encoura= ge generic EFI code to go after PI APIs.=20 >=20 > I'd also point out that using TSC can break in things like the EmulatorP= kg as you end up running in ring 0 and TSC access is blocked.=20 > https://github.com/tianocore/edk2/blob/master/EmulatorPkg/CpuRuntimeDxe/= Cpu.c#L352 > https://github.com/tianocore/edk2/blob/master/EmulatorPkg/Unix/Host/EmuT= hunk.c#L250 >=20 >=20 > I would point that a library that did TSC frequency discovery would like= ly be useful for the UefiCpuPkg CpuDxe driver.=20 >=20 > Thanks, >=20 > Andrew Fish >=20 >> On Aug 15, 2019, at 2:10 PM, Vitaly Cheptsov via Groups.Io > wrote: >>=20 >> Hello, >>=20 >> I initially raised this question in a new TimerLib patch[1], but as the= discussion was getting more distracted, I decided to create a separate thr= ead in hopes new people could join. >>=20 >> The issue is that our UEFI bootloader needs to obtain TSC frequency to = pass it to our specialised operating system that uses TSC for scheduling on= x86. >>=20 >> For a while we went with ACPI power management timer to measure the fre= quency, but as modern Intel CPUs support CPUID 15H leaf (CPUID_TIME_STAMP_C= OUNTER) we try to use where possible for better accuracy. The issue with th= is CPUID leaf is that the crystal clock frequency returned in ECX register = is optional and therefore can be 0. Intel SDM suggests to use a static valu= e in this case[2], but it is completely opaque on how to match the running = CPU with its static value from SDM. >>=20 >> Initially we went with CPUID model checking, but this failed badly for = Xeon Scalable and Xeon W, as they share the CPUID (06_55H) but have differe= nt crystal clock frequencies (25 MHz vs 24 MHz accordingly). Donald Kuo gav= e a good hint in the previous thread that client CPUs usually get 24 MHz cr= ystal clock, server CPUs have 25, and Atoms have 19.2. This, however, does = not make the situation easier as we do not see a way to determine CPU verti= cal segment without e.g. parsing the CPUID brand string. >>=20 >> Apparently, we are not alone, and different open-source operating syste= ms have different workarounds to this issue. For example, Linux kernel went= with using marketing frequency from CPUID 16H leaf (CPUID_PROCESSOR_FREQUE= NCY)[3], and BSD flavours fallback to older methods when neither crystal cl= ock frequency can be obtained through CPUID 15H, nor unambiguous CPUID mode= ls exist to be able to use static values. >>=20 >> Another issue we see with EDK II TimerLib implementations for x86 is th= at they are very model specific. As Michael Kinney said, the situation is n= ot a problem when you use TimerLib for BSP bringup, as you already know the= CPU family you target, however, it is not great for other uses, although t= hey may be discouraged. In our opinion, regardless of the use, this approac= h has a number of design issues. >>=20 >> All modern Intel x86 CPUs have virtual TSC counter that has fixed frequ= ency. In fact, this is true for most, if not all, modern x86 CPUs by other = vendors as well. This makes us believe that EDK II should have generic TscT= imerLib for x86, which will work anywhere when given valid TSC frequency, a= nd TscFrequencyLib, which would determine TSC frequency in a vendor-specifi= c method. Ideally there exists GenericTscFrequencyLib that can do this for = a wide majority of CPUs through different methods at runtime, including CPU= ID 15H, ACPI power management, vendor-specific extensions, etc. >>=20 >> To summarise: >> - We need to know how to match current running CPU with its crystal clo= ck frequency when CPUID 15H reports 0 ECX. >> - We would like to suggest to split TSC-based TimerLib in two libraries= : generic with timer implementation and platform-specific with TSC frequenc= y discovery. >> - We wonder whether a generic vendor-supported TSC frequency discovery = library working on a wide range of CPUs is feasible to have in EDK II mainl= ine. >>=20 >> Best regards, >> Vitaly >>=20 >> [1] https://edk2.groups.io/g/devel/topic/patch_ueficpupkg_adding_a/3283= 9184?p=3D,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,32839184 >> [PATCH] UefiCpuPkg: Adding a new TSC library by using CPUID(0x15) T= SC leaf >> [2] 18.7.3 Determining the Processor Base Frequency >> Table 18-75. Nominal Core Crystal Clock Frequency >> [3] https://lore.kernel.org/lkml/20190509055417.13152-1-drake@endlessm.= com/ >>=20 >>=20 >=20 --Apple-Mail=_EEBC6DD4-B719-4236-96B7-8796F87A65FA Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8 Andrew,
Thanks for the reminder for CpuDxe.

I think we saw CpuDxe= calculation some time ago, and it was most likely a reason we went with a = very similar approach to determine the TSC frequency as a fallback mechanis= m. It, however, is in fact prone to issues, putting aside the non-guarantee= d period and CPU frequency match. Basically, there are two problems:
<= div class=3D"">
1. The code implemented= in C is subject to minor time drifts, which results in slightly different = results across the reboots. The calculation based on CPUID 15H is more accu= rate.
2. The lack of TPL changes during frequency meas= urement results in EFI_EVENTs potentially interrupting the process and thus= making the resulting value very inaccurate. Since the value is cached ther= e is not always a possibility to avoid it.

EmulatorPkg is a bit of a special case, which = we think we know about :), actually thanks for fixing it up on macOS, this = is appreciated!

A= ll in all, I would say that there really needs to exist a decent library, w= hich most likely CpuDxe should make a use of as well.
=
Best wishes,
Vita= ly

16 =D0=B0=D0=B2=D0=B3. 2019 =D0=B3., =D0=B2 21:35, A= ndrew Fish <afish@apple.co= m> =D0=BD=D0=B0=D0=BF=D0=B8=D1=81=D0=B0=D0=BB(=D0=B0):

Vitaly,

As= Mike mentioned platforms can know more info about how they are constructed= thus you may not want to have a lot of generic discovery code floating abo= ut if you don't really need it. 

<= /div>
One option could be to pass up the TSC Frequency/Perio= d via some EFI mechanism so generic code can be told by platform specific c= ode. 

The PI= spec already has an abstraction for a CPU based timer that is architecture= neutral. The CPU Architectural Protocol has a GetTimerValue() member funct= ion. 
https://github= .com/tianocore/edk2/blob/master/MdePkg/Include/Protocol/Cpu.h#L220

For X86 it returns TS= C

EFI Systems are not required to impl= ement PI so we usually don't encourage generic EFI code to go after PI APIs= . 

I'd also = point out that using TSC can break in things like the EmulatorPkg as you en= d up running in ring 0 and TSC access is blocked. 

=

I would point th= at a library that did TSC frequency discovery would likely be useful for th= e UefiCpuPkg CpuDxe driver. 

Thanks,

Andrew Fish

On Aug 15, 2019, at 2:10 PM, Vit= aly Cheptsov via Groups.Io <vit9696=3Dprotonmail.com@groups.io> wrote:
Hello,

I initially raised this question in a new TimerLib patch[1], bu= t as the discussion was getting more distracted, I decided to create a sepa= rate thread in hopes new people could join.

The issue is that our UEFI bootloader needs to= obtain TSC frequency to pass it to our specialised operating system that u= ses TSC for scheduling on x86.

For a while we went with ACPI power management timer to measu= re the frequency, but as modern Intel CPUs support CPUID 15H leaf (CPUID_TI= ME_STAMP_COUNTER) we try to use where possible for better accuracy. The iss= ue with this CPUID leaf is that the crystal clock frequency returned in ECX= register is optional and therefore can be 0. Intel SDM suggests to use a s= tatic value in this case[2], but it is completely opaque on how to match th= e running CPU with its static value from SDM.

Initially we went with CPUID model checking, b= ut this failed badly for Xeon Scalable and Xeon W, as they share the CPUID = (06_55H) but have different crystal clock frequencies (25 MHz vs 24 MHz acc= ordingly). Donald Kuo gave a good hint in the previous thread that client C= PUs usually get 24 MHz crystal clock, server CPUs have 25, and Atoms have 1= 9.2. This, however, does not make the situation easier as we do not see a w= ay to determine CPU vertical segment without e.g. parsing the CPUID brand s= tring.

Apparently= , we are not alone, and different open-source operating systems have differ= ent workarounds to this issue. For example, Linux kernel went with using ma= rketing frequency from CPUID 16H leaf (CPUID_PROCESSOR_FREQUENCY)[3], and B= SD flavours fallback to older methods when neither crystal clock frequency = can be obtained through CPUID 15H, nor unambiguous CPUID models exist to be= able to use static values.

Another issue we see with EDK II TimerLib implementations for x8= 6 is that they are very model specific. As Michael Kinney said, the situati= on is not a problem when you use TimerLib for BSP bringup, as you already k= now the CPU family you target, however, it is not great for other uses, alt= hough they may be discouraged. In our opinion, regardless of the use, this = approach has a number of design issues.

All modern Intel x86 CPUs have virtual TSC counter t= hat has fixed frequency. In fact, this is true for most, if not all, modern= x86 CPUs by other vendors as well. This makes us believe that EDK II shoul= d have generic TscTimerLib for x86, which will work anywhere when given val= id TSC frequency, and TscFrequencyLib, which would determine TSC frequency = in a vendor-specific method. Ideally there exists GenericTscFrequencyLib th= at can do this for a wide majority of CPUs through different methods at run= time, including CPUID 15H, ACPI power management, vendor-specific extension= s, etc.

To summar= ise:
- We need to know how to match current running CP= U with its crystal clock frequency when CPUID 15H reports 0 ECX.
- We would like to suggest to split TSC-based TimerLib in two li= braries: generic with timer implementation and platform-specific with TSC f= requency discovery.
- We wonder whether a generic vend= or-supported TSC frequency discovery library working on a wide range of CPU= s is feasible to have in EDK II mainline.

Best regards,
Vitaly

  &= nbsp; [PATCH] UefiCpuPkg: Adding a new TSC library by using CPUID(0x15) TSC= leaf
[2] 18.7.3 Determining the Processor Base F= requency
Table 18-75. Nominal Core Crystal Clock Frequ= ency



--Apple-Mail=_EEBC6DD4-B719-4236-96B7-8796F87A65FA-- -----------------------60071df18ec12809276985cd740ef443 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: ProtonMail wsBmBAEBCAAQBQJdV1aBCRBPsoxt7Hy0xQAKCRBPsoxt7Hy0xTs/CAA0gtXe 33L+wK7zrLLKluSZbq+NJlYwdHSjBoXOItXbmrbFz2CSwJtftY6TEVgrEHD/ 4GOaZOYeo9ZTZgMSMom/le4QsjWFlDQXGhHpyfx8nUcTmAnJtwHM2V7CCREs lKBy6aLMi2xyTLl+FnfDBmmr77xXbHZ4+yHumVdHXAfzKiTqw9H3ElXLjed4 Q9SbxLuFh4PYTuUxMoTL9TSAaPdto1RLIdSWxDAho4slWdLhfI+pK6eRhbm7 yZd6zBfEPUlYZyYfb9VNDvQu2UgRchEa+UKOMQtPMi4fIHwPSWJTZBNelPrj o9X6+riVWlosXRCgrkB3+tIQsVZ53chv+xSp =uDVJ -----END PGP SIGNATURE----- -----------------------60071df18ec12809276985cd740ef443--