Re: [edk2-devel] [PATCH v3 11/39] UefiCpuPkg: Add LoongArch64 CPU Timer library

public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed

From: "Chao Li" <lichao@loongson.cn>
To: devel@edk2.groups.io, lersek@redhat.com
Cc: Eric Dong <eric.dong@intel.com>, Ray Ni <ray.ni@intel.com>,
	Rahul Kumar <rahul1.kumar@intel.com>,
	Gerd Hoffmann <kraxel@redhat.com>
Subject: Re: [edk2-devel] [PATCH v3 11/39] UefiCpuPkg: Add LoongArch64 CPU Timer library
Date: Thu, 23 Nov 2023 19:43:47 +0800	[thread overview]
Message-ID: <1069d005-ef80-46c6-89fa-f3fed316e4ad@loongson.cn> (raw)
In-Reply-To: <2e42db7c-a927-f47b-7d80-632895b11e1b@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 26249 bytes --]

Hi Laszlo,

Thanks so carefully to review this patch, I'm guessing it took your a 
long time. :)


Thanks,
Chao
On 2023/11/23 00:12, Laszlo Ersek wrote:
> On 11/17/23 11:00, Chao Li wrote:
>> Add the LoongArch64 CPU Timer library, using CPUCFG 0x4 and 0x5 for
>> Stable Counter frequency.
>>
>> BZ:https://bugzilla.tianocore.org/show_bug.cgi?id=4584
>>
>> Cc: Eric Dong<eric.dong@intel.com>
>> Cc: Ray Ni<ray.ni@intel.com>
>> Cc: Rahul Kumar<rahul1.kumar@intel.com>
>> Cc: Gerd Hoffmann<kraxel@redhat.com>
>> Signed-off-by: Chao Li<lichao@loongson.cn>
>> ---
>>   .../BaseLoongArch64CpuTimerLib.inf            |  30 +++
>>   .../BaseLoongArch64CpuTimerLib.uni            |  15 ++
>>   .../BaseLoongArch64CpuTimerLib/CpuTimerLib.c  | 226 ++++++++++++++++++
>>   UefiCpuPkg/UefiCpuPkg.dsc                     |   3 +
>>   4 files changed, 274 insertions(+)
>>   create mode 100644 UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/BaseLoongArch64CpuTimerLib.inf
>>   create mode 100644 UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/BaseLoongArch64CpuTimerLib.uni
>>   create mode 100644 UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/CpuTimerLib.c
> (1) sorry about the annoying comment, but the library should be called
> (preferably) BaseTimerLibLoongArch64Cpu.
>
> We're frequently not consistent with the following library instance
> naming scheme, but in theory anyway, library instances should be named
> as follows:
>
>    <firmware module type><lib class name><library instance identifier>
>
> Thus, in this case, Base + TimerLib + LoongArch64Cpu.
>
> update UNI file name, INF file name, directory name, BASE_NAME and
> MODULE_UNI_FILE accordingly (if you agree)

There has a SPEC for naming style:

https://github.com/tianocore-docs/edk2-CCodingStandardsSpecification/blob/master/4_naming_conventions/42_directory_names.md

Please check 4.2.3 EDKII Library directory, and most directory naming 
follows this SPEC.

>
>> diff --git a/UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/BaseLoongArch64CpuTimerLib.inf b/UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/BaseLoongArch64CpuTimerLib.inf
>> new file mode 100644
>> index 0000000000..c00c215aec
>> --- /dev/null
>> +++ b/UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/BaseLoongArch64CpuTimerLib.inf
>> @@ -0,0 +1,30 @@
>> +## @file
>> +#  Base CPU Timer Library
>> +#
>> +#  Provides base timer support using CPUCFG 0x4 and 0x5 stable counter frequency.
>> +#
>> +#  Copyright (c) 2023, Loongson Technology Corporation Limited. All rights reserved.<BR>
>> +#  SPDX-License-Identifier: BSD-2-Clause-Patent
>> +#
>> +##
>> +
>> +[Defines]
>> +  INF_VERSION                       = 0x00010005
> (2) Can you use the most recent INF file version? (Generally applies to
> all new INF files in the series.)
>
> The latest version is 1.29
> <https://tianocore-docs.github.io/edk2-InfSpecification/draft/>, and you
> can just write "1.29" here.
OK, I will update this version, include all new INF files for this series.
>
>> +  BASE_NAME                         = BaseLoongArch64CpuTimerLib
>> +  FILE_GUID                         = 740389C7-CC44-4A2F-88DC-89D97D312E7C
>> +  MODULE_TYPE                       = BASE
>> +  VERSION_STRING                    = 1.0
>> +  LIBRARY_CLASS                     = TimerLib
>> +  MODULE_UNI_FILE                   = BaseLoongArch64CpuTimerLib.uni
>> +
>> +[Sources.common]
>> +  CpuTimerLib.c
> (3) ".common" is not an error, but superfluous here.
Yes, I will remove it in V4.
>
>> +
>> +[Packages]
>> +  MdePkg/MdePkg.dec
>> +  UefiCpuPkg/UefiCpuPkg.dec
> (4) Do you actually consume any artifact from "UefiCpuPkg.dec"?
>
> Unless you do, there's no need to depend on "UefiCpuPkg.dec" here.
Yes, the UefiCpuPkg.dec is indeed not necessary, so I will remove it in 
this file for V4.
>
> (5) Relatedly: the TimerLib class is declared in "MdePkg.dec". Thus, if
> you indeed don't need anything from "UefiCpuPkg.dec", I'd suggest moving
> this library instance under MdePkg.
It is inappropriate to place this library in MdePkg, because the MdePkg 
doesn't have any CpuTimerLib instance, and this class library are HW 
implementation-related, so it more appropriate to place it in UefiCpuPkg.
>
>> +
>> +[LibraryClasses]
>> +  BaseLib
>> +  PcdLib
>> +  DebugLib
> (6) If it's not too much of a burden, it's best to keep library classes
> alphabetically sorted.
OK, I will sort them for V4.
>
>> diff --git a/UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/BaseLoongArch64CpuTimerLib.uni b/UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/BaseLoongArch64CpuTimerLib.uni
>> new file mode 100644
>> index 0000000000..72d38ec679
>> --- /dev/null
>> +++ b/UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/BaseLoongArch64CpuTimerLib.uni
>> @@ -0,0 +1,15 @@
>> +// /** @file
>> +// Base CPU Timer Library
>> +//
>> +// Provides base timer support using CPUCFG 0x4 and 0x5 stable counter frequency.
>> +//
>> +// Copyright (c) 2023, Loongson Technology Corporation Limited. All rights reserved.<BR>
>> +//
>> +// SPDX-License-Identifier: BSD-2-Clause-Patent
>> +//
>> +// **/
>> +
>> +
>> +#string STR_MODULE_ABSTRACT             #language en-US "LOONGARCH CPU Timer Library"
>> +
>> +#string STR_MODULE_DESCRIPTION          #language en-US "Provides basic timer support using CPUCFG 0x4 and 0x5 stable counter frequency."
>> diff --git a/UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/CpuTimerLib.c b/UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/CpuTimerLib.c
>> new file mode 100644
>> index 0000000000..349b881cbc
>> --- /dev/null
>> +++ b/UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/CpuTimerLib.c
>> @@ -0,0 +1,226 @@
>> +/** @file
>> +  CPUCFG 0x4 and 0x5 for Stable Counter frequency instance of Timer Library.
>> +
>> +  Copyright (c) 2023, Loongson Technology Corporation Limited. All rights reserved.<BR>
>> +
>> +  SPDX-License-Identifier: BSD-2-Clause-Patent
>> +**/
>> +
>> +#include <Base.h>
>> +#include <Library/TimerLib.h>
>> +#include <Library/BaseLib.h>
>> +#include <Library/DebugLib.h>
>> +#include <Register/LoongArch64/Cpucfg.h>
> (7) It's best to keep the Library includes in sync with the
> [LibraryClasses] section from the INF file.
>
> (And to keep the #include directives alphabetically sorted as well.)
>
> You are including <Library/TimerLib.h> here because the file is going to
> define those functions. OK. This is the sole justified difference
> between the Library #includes here and the [LibraryClasses] section in
> the INF file.
>
> Regarding dependencies, you are including <Library/BaseLib.h> and
> <Library/DebugLib.h>, but [LibraryClasses] in addition lists PcdLib.
> Thus, <Library/PcdLib.h> is missing here. (You are likely getting it
> included recursively, but that's not clean enough.)
OK, I see, I will sort them and keep them in sync with corresponding 
libraries.
>
>> +
>> +/**
>> +  Calculate clock frequency using CPUCFG 0x4 and 0x5 registers.
>> +
>> +  @param  VOID.
> (8) This @param is useless and should be avoided, AFAICT. I think the
> EccCheck plugin will not complain.
The EccCheck already passed... Any way, I will remove it in V4.
>
>> +
>> +  @return The frequency in Hz.
>> +
>> +**/
>> +UINT32
>> +EFIAPI
>> +CalcConstFreq (
> (9) Should be STATIC, and should *not* be EFIAPI. It is not a public
> library or protocol interface.
I will fix it in V4.
>
>> +  VOID
>> +  )
>> +{
>> +  UINT32                 BaseFreq;
>> +  UINT32                 ClockMultiplier;
>> +  UINT32                 ClockDivide;
>> +  CPUCFG_REG4_INFO_DATA  CCFreq;
> (10) Per edk2 coding style, when acronyms and abbreviations are grouped
> together, each individual acronym itself is supposed to be CamelCase.
> For example, when we use "PCI" in identifiers, it becomes "Pci". So I
> think this variable should be called "CcFreq", where "CC" stands for
> "CPU Config". If I'm wrong about "CC", then please ignore this point.
The two letter 'C' means: first 'C' means Constant, and the second means 
Crystal. In LoongArch SPEC, this area is named CC_FREQ, which means: 
Constant frequency timer and the crystal frequency corresponding to the 
clock used by the timer. So I think CCFreq is more suitable.
>
>> +  CPUCFG_REG5_INFO_DATA  CpucfgReg5Data;
>> +  UINT32                 StableTimerFreq;
>> +
>> +  //
>> +  // Get the the crystal frequency corresponding to the constant
>> +  // frequency timer and the clock used by the timer.
>> +  //
>> +  AsmCpucfg (CPUCFG_REG4_INFO, &CCFreq.Uint32);
>> +
>> +  //
>> +  // Get the multiplication factor and frequency division factor
>> +  // corresponding to the constant frequency timer and the clock
>> +  // used by the timer.
>> +  //
>> +  AsmCpucfg (CPUCFG_REG5_INFO, &CpucfgReg5Data.Uint32);
>> +
>> +  BaseFreq        = CCFreq.Bits.CC_FREQ;
> (11) My comment here probably affects the header file where you
> introduced the CC_FREQ field name. It should be CcFreq.
>
> Same for MUL and DIV below.
The same to above.
>
>> +  ClockMultiplier = CpucfgReg5Data.Bits.CC_MUL & 0xFFFFULL;
>> +  ClockDivide     = CpucfgReg5Data.Bits.CC_DIV & 0xFFFFULL;
> (12) Seeing how the Uint32 members of the unions (?) are used for
> populating the bitfields, the ULL suffix on the masks seems superfluous.
> The results are stored to UINT32 local variables, too. 0xFFFF should
> suffice.
Yes, the ULL does seem redundant, the UINT32 means unsigned INT32 and 
0xFFFF shoud be unsigned, I thought when I wrote this code, ULL was 
added to ensure that the value was unsigned.
>
>> +
>> +  if (!BaseFreq || !ClockMultiplier || !ClockDivide) {
> (13) The edk2 coding style does not like scalar types *other* than
> BOOLEAN to be used in logical context. Please rewrite as:
>
>    (BaseFreq == 0) || (ClockMultiplier == 0) || (ClockDivide == 0)
>
>> +    DEBUG ((DEBUG_ERROR, "LoongArch Stable Timer is not available in the CPU, hence this library cannot be used.\n"));
> (13) I think this line is a bit too long (117 chars). I think the
> currently accepted limit (?) is 100 characters (but I'm unsure).
>
> Recommend to break this up to multiple lines.
>
>> +    StableTimerFreq = 0;
> (14) Needless assignment AFAICT.
>
>> +    ASSERT (0);
> (15) Should be ASSERT (FALSE).
>
> (16) I recommend that a CpuDeadLoop() call be added here too, because
> ASSERT()s are compiled out of RELEASE builds.
>
>> +  } else {
> (17) an "else" branch seems awkward here; code after ASSERT (FALSE) /
> CpuDeadLoop() is supposed to be unreachable, so whatever you do here can
> just be unnested.
For double (13), (14), (15), (16) and (17), I agree, will fix them in 
V4. But for 15, did the CpuDeadLoop() be needed? I think if the DEBUG 
mode doesn't work, the RELEASE mode can not be created.
>
>> +    StableTimerFreq = (BaseFreq * ClockMultiplier / ClockDivide);
> (18) Unsafe multiplication.
>
> ClockMultiplier is at most 0xFFFF, but BaseFreq (from CC_FREQ) may
> theoretically be as high as MAX_UINT32.
>
> - If there is a LoongArch64 specification that places a 0xFFFF limit on
> CC_FREQ, then please add such an ASSERT() here.
>
> - If there is no such specification, then cast either BaseFreq or
> ClockMultiplier to UINT64. A UINT64 multiplication will not overflow
> here (because MAX_UINT32 * 0xFFFF fits in 32+16=48 bits).
Yes, you are right, there is a risk of overflow, I think changing the 
ClockMultiplier to UINT64 would do the trick, or force transferring the 
result to ensure it doesn't overflow? Something like: ((UINT64)(BaseFreq 
* ClockMultiplier) / ClockDivide).
>
>
> (19) In turn, the result of the division needs to be checked against two
> boundaries:
>
> - it must not be zero (could occur if the divisor is too large, relative
> to the product). Returning zero from this function would case problems
> elsewhere.
I will ASSERT here if the result is zero.
>
> - it must fit in a UINT32. Better yet, change the return type of the
> function to UINT64.
Agree.
>
>
>> +  }
>> +
>> +  return StableTimerFreq;
>> +}
>> +
>> +/**
>> +  Stalls the CPU for at least the given number of microseconds.
>> +
>> +  Stalls the CPU for the number of microseconds specified by MicroSeconds.
>> +
>> +  @param  MicroSeconds  The minimum number of microseconds to delay.
>> +
>> +  @return MicroSeconds
>> +
>> +**/
>> +UINTN
>> +EFIAPI
>> +MicroSecondDelay (
>> +  IN UINTN  MicroSeconds
>> +  )
>> +{
>> +  UINTN  Count, Ticks, Start, End;
>> +
>> +  Count = (CalcConstFreq () * MicroSeconds) / 1000000;
> (20) Unchecked multiplication. CalcConstFreq() may be as large as
>
>    (MAX_UINT32 * 0xFFFF / 1)
>
> and then dependent on MicroSeconds we could overflow even a UINT64 here.
>
> This function does not permit returning errors, so I'm not fully sure
> what to do. Normally I'd recommend a safe UINT64 multiplication from
> SafeIntLib (and then storing the quotient, from the division, in a UINT64).
OK, I will use the SafeIntLib here.
>
>> +  Start = AsmReadStableCounter ();
>> +  End   = Start + Count;
>> +
>> +  do {
>> +    Ticks = AsmReadStableCounter ();
>> +  } while (Ticks < End);
> The rest of this patch indicates that
>
> (a) the performance counter is just the stable counter
> (GetPerformanceCounter() simply calls AsmReadStableCounter()), and that
>
> (b) the range is [BIT2, BIT48-1] (both sides inclusive).
Sorry, The PerformanceCounter function series are updated in our 
internal, I will update them in V4. I would say, the performance counter 
is not same to stable counter, and the stable count in LoongArch64 are 
64 bits wide, so no warparound occurs. PLS wait for V4.
>
> (21) The problem is that this simple loop does not consider wrap-around.
>
> We should do something like this:
>
>    RETURN_STATUS  Status;
>    UINT64         Remaining;
>    UINT64         Start;
>
>    Status = SafeUint64Mult (CalcConstFreq (), MicroSeconds, &Remaining);
>    ASSERT_RETURN_ERROR (Status);
>    if (RETURN_ERROR (Status)) {
>      CpuDeadLoop ();
>    }
>
>    //
>    // for the given number of microseconds, the needed tick count should
>    // be rounded upwards
>    //
>    Status = SafeUint64Add (Remaining, 1000000 - 1, &Remaining);
>    ASSERT_RETURN_ERROR (Status);
>    if (RETURN_ERROR (Status)) {
>      CpuDeadLoop ();
>    }
>
>    Remaining = DivU64x32 (Remaining, 1000000);
>    Start = AsmReadStableCounter ();
>
>    while (Remaining > 0) {
>      UINT64  Stop;
>      UINT64  Diff;
>
>      Stop = AsmReadStableCounter ();
>
>      if (Start <= Stop) {
>        //
>        // no wrap-around
>        //
>        Diff = Stop - Start;
>      } else {
>        //
>        // wrap-around
>        //
>        Diff = (BIT48 - Start) + (Stop - BIT2);
>      }
>
>      Start = Stop;
>
>      Remaining -= MIN (Remaining, Diff);
>    }
Refer to above.
>
>> +
>> +  return MicroSeconds;
>> +}
>> +
>> +/**
>> +  Stalls the CPU for at least the given number of nanoseconds.
>> +
>> +  Stalls the CPU for the number of nanoseconds specified by NanoSeconds.
>> +
>> +  @param  NanoSeconds The minimum number of nanoseconds to delay.
>> +
>> +  @return NanoSeconds
>> +
>> +**/
>> +UINTN
>> +EFIAPI
>> +NanoSecondDelay (
>> +  IN UINTN  NanoSeconds
>> +  )
>> +{
>> +  UINT32  MicroSeconds;
>> +
>> +  if (NanoSeconds % 1000 == 0) {
>> +    MicroSeconds = NanoSeconds/1000;
>> +  } else {
>> +    MicroSeconds = NanoSeconds/1000 + 1;
>> +  }
> (22) MicroSeconds should have type UINTN.
Agree.
>
> (23) Using DivU64x32Remainder() would be more idiomatic.
>
> (
>
> I agree that, if we find that the remainder is nonzero, adding 1 to the
> quotient is safe, for the original UINTN range too.
>
> Initial condition:
>
>    ns <= MAX_UINTN
>
> Divide by 1000 (mathematically):
>
>    ns/1000 <= MAX_UINTN/1000
>
> Because floor(x) <= x, we can expand the left hand side:
>
>    floor(ns/1000) <= ns/1000 <= MAX_UINTN/1000
>
> Drop the middle:
>
>    floor(ns/1000) <= MAX_UINTN/1000
>
> Add 1:
>
>    floor(ns/1000) + 1 <= MAX_UINTN/1000 + 1
>
> At the left hand side, we now have the C language result (the truncated,
> then incremented, quotinent).
>
> Now, a small detour:
>
>    1/1000 < 1
>
> Multiply by x (assuming x>0):
>
>    x / 1000 < x
>
> Substitute MAX_UINTN for x (MAX_UINTN is positive):
>
>    MAX_UINTN/1000 < MAX_UINTN
>
> Then use this to expand the right hand side:
>
>    floor(ns/1000) + 1 <= MAX_UINTN/1000 + 1 < MAX_UINTN + 1
>
> Drop the middle:
>
>    floor(ns/1000) + 1 < MAX_UINTN + 1
>
> Given that both sides are integers, we get
>
>    floor(ns/1000) + 1 <= MAX_UINTN
>
> which is what we needed.
>
> )
OK, I see. In V4, I will use the DivU64X32Remainder() to change this 
operation.
>
>> +
>> +  MicroSecondDelay (MicroSeconds);
>> +
>> +  return NanoSeconds;
>> +}
>> +
>> +/**
>> +  Retrieves the current value of a 64-bit free running performance counter.
>> +
>> +  Retrieves the current value of a 64-bit free running performance counter. The
>> +  counter can either count up by 1 or count down by 1. If the physical
>> +  performance counter counts by a larger increment, then the counter values
>> +  must be translated. The properties of the counter can be retrieved from
>> +  GetPerformanceCounterProperties().
>> +
>> +  @return The current value of the free running performance counter.
>> +
>> +**/
>> +UINT64
>> +EFIAPI
>> +GetPerformanceCounter (
>> +  VOID
>> +  )
>> +{
>> +  return AsmReadStableCounter ();
>> +}
>> +
>> +/**
>> +  Retrieves the 64-bit frequency in Hz and the range of performance counter
>> +  values.
>> +
>> +  If StartValue is not NULL, then the value that the performance counter starts
>> +  with immediately after is it rolls over is returned in StartValue. If
>> +  EndValue is not NULL, then the value that the performance counter end with
>> +  immediately before it rolls over is returned in EndValue. The 64-bit
>> +  frequency of the performance counter in Hz is always returned. If StartValue
>> +  is less than EndValue, then the performance counter counts up. If StartValue
>> +  is greater than EndValue, then the performance counter counts down. For
>> +  example, a 64-bit free running counter that counts up would have a StartValue
>> +  of 0 and an EndValue of 0xFFFFFFFFFFFFFFFF. A 24-bit free running counter
>> +  that counts down would have a StartValue of 0xFFFFFF and an EndValue of 0.
>> +
>> +  @param  StartValue  The value the performance counter starts with when it
>> +                      rolls over.
>> +  @param  EndValue    The value that the performance counter ends with before
>> +                      it rolls over.
>> +
>> +  @return The frequency in Hz.
>> +
>> +**/
>> +UINT64
>> +EFIAPI
>> +GetPerformanceCounterProperties (
>> +  OUT UINT64  *StartValue   OPTIONAL,
>> +  OUT UINT64  *EndValue     OPTIONAL
>> +  )
>> +{
>> +  if (StartValue != NULL) {
>> +    *StartValue = BIT2;
>> +  }
>> +
>> +  if (EndValue != NULL) {
>> +    *EndValue = BIT48 - 1;
>> +  }
>> +
>> +  return CalcConstFreq ();
>> +}
>> +
>> +/**
>> +  Converts elapsed ticks of performance counter to time in nanoseconds.
>> +
>> +  This function converts the elapsed ticks of running performance counter to
>> +  time value in unit of nanoseconds.
>> +
>> +  @param  Ticks     The number of elapsed ticks of running performance counter.
>> +
>> +  @return The elapsed time in nanoseconds.
>> +
>> +**/
>> +UINT64
>> +EFIAPI
>> +GetTimeInNanoSecond (
>> +  IN UINT64  Ticks
>> +  )
>> +{
>> +  UINT64  Frequency;
>> +  UINT64  NanoSeconds;
>> +  UINT64  Remainder;
>> +  INTN    Shift;
>> +
>> +  Frequency = GetPerformanceCounterProperties (NULL, NULL);
>> +
>> +  //
>> +  //          Ticks
>> +  // Time = --------- x 1,000,000,000
>> +  //        Frequency
>> +  //
>> +  NanoSeconds = MultU64x32 (DivU64x64Remainder (Ticks, Frequency, &Remainder), 1000000000u);
> (24) I understand this is being copied from elsewhere, but the
> multiplication by 10^9 should be checked against overflow, using
> SafeIntLib. If we have e.g. a 100 MHz clock, then passing in
> Ticks=MAX_UINT64 will lead to an overflow. :(
OK, in V4, I will use the SafeIntLib.
>
>> +
>> +  //
>> +  // Ensure (Remainder * 1,000,000,000) will not overflow 64-bit.
>> +  // Since 2^29 < 1,000,000,000 = 0x3B9ACA00 < 2^30, Remainder should < 2^(64-30) = 2^34,
>> +  // i.e. highest bit set in Remainder should <= 33.
>> +  //
> What we want here is (Remainder * 10^9 / Frequency), and in that we want
> the product not to overflow 2^64 - 1. Thus, we need
>
>    Remainder * 10^9 <= 2^64 - 1
>
>    Remainder <= (2^64 - 1) / 10^9
>
> The floor -- i.e., integral part -- of the RHS is 18,446,744,073
> decimal, or 100_01001011_10000010_11111010_00001001 binary. The most
> significant bit that is set is bit 34. But that's not a sufficient
> condition, if we only consider the MSB in Remainder, because other bits
> that should be clear (such as bit 33) could still be set. The sufficient
> condition is thus that the MSB be <= 33. Therefore the comment is OK.
>
>> +  Shift        = MAX (0, HighBitSet64 (Remainder) - 33);
> Quirky, but correct.
>
> If Remainder is 0, then HighBitSet64 returns -1, and we compare -34 vs.
> 0 -- we get zero. OK, no shift needed.
>
> Otherwise, if HighBitSet64 returns a value in the range [0..33], then we
> compare [-33..0] vs 0 -- we get zero from MAX. OK, no shift needed.
>
> Otherwise, HighBitSet64 returns a value from the range [34..63], then we
> compare [1..30] against zero, and the former prevails (gets stored to
> Shift). OK.
>
>> +  Remainder    = RShiftU64 (Remainder, (UINTN)Shift);
>> +  Frequency    = RShiftU64 (Frequency, (UINTN)Shift);
> I *think* this approach will ensure that Frequency is not zero.
> Remainder is strictly smaller than Frequency, and we shift out only so
> many LSBs from Remainder (and Frequency) that (Remainder * 10^9) below
> *just* not overflow. That means Remainder should not end up being zeroed
> out, and so Frequency shouldn't either (the MSB of the original
> Frequency is larger than or equal to the MSB of the original Remainder). OK.
>
>> +  NanoSeconds += DivU64x64Remainder (MultU64x32 (Remainder, 1000000000u), Frequency, NULL);
> So the RHS here may give us a proper NanoSeconds increment, but how do
> we know that the addition will not overflow NanoSeconds?
>
> Here's an example:
>
>    Ticks = 18,446,744,055,553,255,925 (0xFFFF_FFFB_C5CC_E9F5)
>    Frequency = 999,999,999 (0x3B9A_C9FF)
>
> (
>
> The idea with this example is that dividing Ticks by Frequency and then
> multiplying the result with 1,000,000,000 causes a *very slight* blow-up:
>
>    Ticks / 999,999,999 * 1,000,000,000 =
>    Ticks * (1,000,000,000 / 999,999,999)
>
> in such a tricky way that this blow-up:
>
> (a) *just* doesn't overflow the first part of the function (where we use
> the integral multiples of Frequency), but
>
> (b) does overflow the last part (where we use the remaining, fractional,
> multiple of Frequency).
>
> The way to achieve that, after setting Frequency to just below 1GHz for
> the above blow-up, is to start with Ticks = MAX_UINT64, perform the
> division, then maximize Remainder in comprison to the divisor (i.e.,
> Frequency) -- increase Remainder to (Frequency-1) --, then recalculate
> Ticks from that *backwards* (increase Ticks the same as we increased
> Remainder).
>
> )
>
> And then this happens:
>
> - the initial DivU64x64Remainder (Ticks, Frequency, &Remainder) call
> returns 18,446,744,073 (0x4_4B82_FA09) as quotient, and sets Remainder
> to 999,999,998 (0x3B9A_C9FE, binary 111011_10011010_11001001_11111110).
>
> - the first MultU64x32() call produces 18,446,744,073,000,000,000 in
> NanoSeconds (0xFFFF_FFFF_D5B5_1A00).
>
> - Remainder's MSB is bit 29, thus Shift gets assigned 0.
>
> - Multiplying Remainder by 1,000,000,000 produces
> 999,999,998,000,000,000 (0xDE0_B6B3_302E_6C00). No overflow, as intended.
>
> - Dividing that product by Frequency produces 999,999,998 (0x3B9A_C9FE)
> as the *increment* for NanoSeconds.
>
> - Adding this increment (0x3B9A_C9FE) to NanoSeconds from earlier
> (0xFFFF_FFFF_D5B5_1A00) *does* overflow: the result would be
> 0x1_0000_0000_114F_E3FE!
>
> (25) This proves that, generally speaking, the final "+=" operation
> (increment) is unsafe -- in every TimerLib instance in edk2 that uses
> this calculation --, and should be replaced with an addition from
> SafeIntLib.
Do you recommend changing the final "+=" to the SafeIntLib way? 
Something like: NanoSeconds = SafeUint64Add (NanoSeconds, 
DivU64x64Remainder (MultU64x32 (Remainder, 1000000000u), Frequency, 
NULL), &NanoSeconds).
>
> I'm not asking for a tree-wide fix, but using SafeIntLib here would be nice.
>
>
>> +
>> +  return NanoSeconds;
>> +}
>> diff --git a/UefiCpuPkg/UefiCpuPkg.dsc b/UefiCpuPkg/UefiCpuPkg.dsc
>> index 074fd77461..8e34a9cd6b 100644
>> --- a/UefiCpuPkg/UefiCpuPkg.dsc
>> +++ b/UefiCpuPkg/UefiCpuPkg.dsc
>> @@ -205,5 +205,8 @@
>>     UefiCpuPkg/CpuTimerDxeRiscV64/CpuTimerDxeRiscV64.inf
>>     UefiCpuPkg/CpuDxeRiscV64/CpuDxeRiscV64.inf
>>   
>> +[Components.LOONGARCH64]
>> +  UefiCpuPkg/Library/BaseLoongArch64CpuTimerLib/BaseLoongArch64CpuTimerLib.inf
>> +
>>   [BuildOptions]
>>     *_*_*_CC_FLAGS = -D DISABLE_NEW_DEPRECATED_INTERFACES
> (26) And then this should go into MdePkg.dec, according to points (4)
> and (5) above.
Same to above.
>
> Thanks
> Laszlo
>
>
>
> 
>


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#111667): https://edk2.groups.io/g/devel/message/111667
Mute This Topic: https://groups.io/mt/102644766/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #2: Type: text/html, Size: 33980 bytes --]

next prev parent reply	other threads:[~2023-11-23 11:44 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20231117095742.3605778-1-lichao@loongs>
2023-11-17  9:58 ` [edk2-devel] [PATCH v3 01/39] MdePkg: Add the header file named Csr.h for LoongArch64 Chao Li
2023-11-17  9:58 ` [edk2-devel] [PATCH v3 02/39] MdePkg: Add LoongArch64 FPU function set into BaseCpuLib Chao Li
2023-11-17  9:59 ` [edk2-devel] [PATCH v3 03/39] MdePkg: Add LoongArch64 exception function set into BaseLib Chao Li
2023-11-17  9:59 ` [edk2-devel] [PATCH v3 04/39] MdePkg: Add LoongArch64 local interrupt " Chao Li
2023-11-17  9:59 ` [edk2-devel] [PATCH v3 05/39] MdePkg: Add LoongArch Cpucfg function Chao Li
2023-11-17  9:59 ` [edk2-devel] [PATCH v3 06/39] MdePkg: Add read stable counter operation for LoongArch Chao Li
2023-11-17  9:59 ` [edk2-devel] [PATCH v3 07/39] MdePkg: Add CSR " Chao Li
2023-11-17  9:59 ` [edk2-devel] [PATCH v3 08/39] MdePkg: Add IOCSR " Chao Li
2023-11-17  9:59 ` [edk2-devel] [PATCH v3 09/39] MdePkg: Add a new library named PeiServicesTablePointerLibReg Chao Li
2023-11-17 11:35   ` Leif Lindholm
2023-11-20  3:07     ` Chao Li
2023-11-21 14:37   ` Laszlo Ersek
2023-11-22  1:47     ` Chao Li
2023-11-24 11:35       ` Laszlo Ersek
2023-11-27  3:27         ` Chao Li
2023-12-01  0:32           ` 回复: " gaoliming via groups.io
2023-12-01  8:20             ` Chao Li
     [not found]         ` <179B5D231F190982.32091@groups.io>
2023-11-29  1:35           ` Chao Li
2023-11-17  9:59 ` [edk2-devel] [PATCH v3 10/39] MdePkg: Add method of LoongArch64 to PeiServicesTablePointerLibReg Chao Li
2023-11-17 10:00 ` [edk2-devel] [PATCH v3 11/39] UefiCpuPkg: Add LoongArch64 CPU Timer library Chao Li
2023-11-22 16:12   ` Laszlo Ersek
2023-11-22 16:13     ` Laszlo Ersek
2023-11-23 11:43     ` Chao Li [this message]
2023-12-11 17:16       ` Laszlo Ersek
2023-12-12  3:45         ` Chao Li
2023-11-17 10:00 ` [edk2-devel] [PATCH v3 12/39] UefiCpuPkg: Add CPU exception library for LoongArch Chao Li
2023-11-17 10:00 ` [edk2-devel] [PATCH v3 13/39] UefiCpuPkg: Add CpuMmuLib.h to UefiCpuPkg Chao Li
2023-11-17 20:18   ` Andrei Warkentin
2023-11-20  3:26     ` Chao Li
2023-11-30  0:59   ` Ni, Ray
2023-11-30  2:25     ` Chao Li
     [not found]     ` <179C457B5B852375.31732@groups.io>
2023-12-04  7:31       ` Chao Li
2023-12-05  8:27         ` Ni, Ray
2023-12-05 12:27           ` Chao Li
     [not found]           ` <179DEF40376B662A.18076@groups.io>
2023-12-08  2:10             ` Chao Li
2023-12-11  8:13               ` Ni, Ray
2023-12-11  8:19                 ` Chao Li
2023-11-17 10:00 ` [edk2-devel] [PATCH v3 14/39] UefiCpuPkg: Add LoongArch64CpuMmuLib " Chao Li
2023-11-17 10:00 ` [edk2-devel] [PATCH v3 15/39] UefiCpuPkg: Add multiprocessor library for LoongArch64 Chao Li
2023-11-17 10:00 ` [edk2-devel] [PATCH v3 16/39] UefiCpuPkg: Add CpuDxe driver " Chao Li
2023-11-17 10:00 ` [edk2-devel] [PATCH v3 17/39] EmbeddedPkg: Add PcdPrePiCpuIoSize width for LOONGARCH64 Chao Li
2023-11-17 10:01 ` [edk2-devel] [PATCH v3 18/39] ArmVirtPkg: Move PCD of FDT base address and FDT padding to OvmfPkg Chao Li
2023-11-17 10:01 ` [edk2-devel] [PATCH v3 19/39] MdePkg: Add a PCD feature flag named PcdPciIoTranslationIsEnabled Chao Li
2023-11-17 10:01 ` [edk2-devel] [PATCH v3 20/39] UefiCpuPkg: Add MMIO method in CpuIo2Dxe Chao Li
2023-11-17 10:01 ` [edk2-devel] [PATCH v3 21/39] ArmVirtPkg: Enable UefiCpuPkg version CpuIo2Dxe Chao Li
2023-11-17 10:01 ` [edk2-devel] [PATCH v3 22/39] ArmPkg: Remove ArmPciCpuIo2Dxe from ArmPkg Chao Li
2023-11-17 13:13   ` Leif Lindholm
2023-11-20  3:24     ` Chao Li
2023-11-20 18:47       ` Leif Lindholm
2023-11-21  1:10         ` Chao Li
2023-11-17 10:01 ` [edk2-devel] [PATCH v3 23/39] OvmfPkg/RiscVVirt: Enable UefiCpuPkg version CpuIo2Dxe Chao Li
2023-11-17 20:15   ` Andrei Warkentin
2023-11-20  3:04     ` Chao Li
2023-11-17 10:01 ` [edk2-devel] [PATCH v3 24/39] OvmfPkg/RiscVVirt: Remove PciCpuIo2Dxe from RiscVVirt Chao Li
2023-11-17 10:02 ` [edk2-devel] [PATCH v3 25/39] ArmVirtPkg: Move the FdtSerialPortAddressLib to OvmfPkg Chao Li
2023-11-17 10:02 ` [edk2-devel] [PATCH v3 26/39] ArmVirtPkg: Move the PcdTerminalTypeGuidBuffer into OvmfPkg Chao Li
2023-11-17 10:02 ` [edk2-devel] [PATCH v3 27/39] ArmVirtPkg: Move PlatformBootManagerLib to OvmfPkg Chao Li
2023-11-17 10:02 ` [edk2-devel] [PATCH v3 28/39] OvmfPkg/LoongArchVirt: Add stable timer driver Chao Li
2023-11-17 10:02 ` [edk2-devel] [PATCH v3 29/39] OvmfPkg/LoongArchVirt: Add a NULL library named CollectApResouceLibNull Chao Li
2023-11-17 10:02 ` [edk2-devel] [PATCH v3 30/39] OvmfPkg/LoongArchVirt: Add serial port hook library Chao Li
2023-11-17 10:03 ` [edk2-devel] [PATCH v3 31/39] OvmfPkg/LoongArchVirt: Add the early serial port output library Chao Li
2023-11-17 10:03 ` [edk2-devel] [PATCH v3 32/39] OvmfPkg/LoongArchVirt: Add real time clock library Chao Li
2023-11-17 10:03 ` [edk2-devel] [PATCH v3 33/39] OvmfPkg/LoongArchVirt: Add NorFlashQemuLib Chao Li
2023-11-17 10:03 ` [edk2-devel] [PATCH v3 34/39] OvmfPkg/LoongArchVirt: Add FdtQemuFwCfgLib Chao Li
2023-11-17 10:03 ` [edk2-devel] [PATCH v3 35/39] OvmfPkg/LoongArchVirt: Add reset system library Chao Li
2023-11-17 10:03 ` [edk2-devel] [PATCH v3 36/39] OvmfPkg/LoongArchVirt: Support SEC phase Chao Li
2023-11-17 10:03 ` [edk2-devel] [PATCH v3 37/39] OvmfPkg/LoongArchVirt: Support PEI phase Chao Li
2023-11-17 10:04 ` [edk2-devel] [PATCH v3 38/39] OvmfPkg/LoongArchVirt: Add build file Chao Li
2023-11-17 10:04 ` [edk2-devel] [PATCH v3 39/39] OvmfPkg/LoongArchVirt: Add self introduction file Chao Li
     [not found] ` <179860C0A131BC70.3002@groups.io>
2023-11-20  9:55   ` [edk2-devel] [PATCH v3 14/39] UefiCpuPkg: Add LoongArch64CpuMmuLib to UefiCpuPkg Chao Li
     [not found] ` <179860DB0A3E8D83.6542@groups.io>
2023-11-21  6:39   ` [edk2-devel] [PATCH v3 27/39] ArmVirtPkg: Move PlatformBootManagerLib to OvmfPkg Chao Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1069d005-ef80-46c6-89fa-f3fed316e4ad@loongson.cn \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox