Hi Pedro, I would like to see feedback from Andrew Fish on this topic too. Thanks, Mike From: Pedro Falcato Sent: Tuesday, July 12, 2022 3:30 PM To: edk2-devel-groups-io ; Kinney, Michael D Cc: Andrew Fish (afish@apple.com) ; Gao, Liming ; Liu, Zhiguang Subject: Re: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier On Tue, Jul 12, 2022 at 11:02 PM Michael D Kinney > wrote: Hi Pedro, All good points. I agree that there are some assumptions in implementation code that sizeof(size_t) == sizeof(UINTN) == sizeof(VOID *). When the PrintLib does use a format specifier defined by the ANSI C Spec and the behavior is not identical to ANSI C, then we state that in the format string description in PrintLib.h. Since we do not define or use size_t, we are not strictly conformant. So, do you agree on %z for UINTN? Stating that %z prints UINTN and not size_t in the description could work (and indeed, my original patch does that), even though it's a bit redundant since we have no size_t. Strict conformance is not something I'm looking for (and indeed, I believe that ship has sailed for UEFI), I just want an intuitive interface that resembles ANSI C as best as it can. Mike From: devel@edk2.groups.io > On Behalf Of Pedro Falcato Sent: Tuesday, July 12, 2022 2:33 PM To: edk2-devel-groups-io >; Kinney, Michael D > Cc: Andrew Fish (afish@apple.com) >; Gao, Liming >; Liu, Zhiguang > Subject: Re: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier On Tue, Jul 12, 2022 at 9:46 PM Michael D Kinney > wrote: Hi Pedro, There is a good brief description of size_t on this page: https://en.wikipedia.org/wiki/C_data_types sizeof() is built into the C compiler, but size_t is defined by the std C lib being used. It can be as small as 16-bits. We do not use the std C lib, so size_t is not defined when we do EDK II builds. This means we do not know how large the value sizeof() returns is. Hi Mike, The C standard library and the compiler usually work together and need to agree on how large each type is (uint8_t, uint16_t, uint32_t, uint64_t, size_t), just like EDK2 does for its types. The definition of "implementation" (which encompasses both the compiler and the C standard library) is intentionally vague (as is a good chunk of the C standard) but essentially maps to "whatever makes your program run on a standard C environment". However, your request is to support printing values of type UINTN in the PrintLib. I agree this feature is not present today. Adding in a new format specifier would be required. Using %z may be confusing because size_t defined by a C lib we are not using and UINTN defined in the UEFI Spec are not the same. Using %z makes sense because in practice (and almost by definition, although not technically) UINTN will hold the native word size, and so will size_t. UINTN is used all over the place in UEFI to represent lengths of things in memory, and so is size_t in standard C codebases. Even if we go by both standards' definition of size_t and UINTN ( "unsigned integer type of the result of the sizeof operator" vs "Unsigned value of native width" ), since the two types in every platform we will ever support[1] map to the same concept and the same type, I would defend the use of %z for UINTN. Do you have a suggestion for a different specifier that perhaps is not used in the ANSI C printf() format string? I believe we have quite a good amount of specifiers that are not reserved but I think those would be better used for really non-standard types instead of UINTN (which is effectively a size_t). If we do want a 100% ANSI C conformant version of a print library, then I agree this would require a new library class and we would need to use different API names so the caller knows which format string style to use. Agreed. [1] In fact, I would dare to say that if there ever was a modern machine that we ported UEFI to, that had addresses/sizeof() > native word size, a lot of code would implicitly break as UINTN takes the role of a size_t in UEFI. As a quick example, all interfaces in boot services and runtime services that deal with memory length/size take a UINTN. This should be clarified in the spec (unless it's written down somewhere and I don't see it). My interpretation of both standards leads me to conclude that both size_t and UINTN are intended to be used to represent addresses and lengths in memory. All the best, Pedro There is a long history associated with the format string defined in PrintLib today. Mostly related to size optimizations of the PrintLib when FLASH devices were much, much smaller. Mike From: devel@edk2.groups.io > On Behalf Of Pedro Falcato Sent: Wednesday, July 6, 2022 3:27 PM To: Kinney, Michael D > Cc: devel@edk2.groups.io; Andrew Fish (afish@apple.com) >; Gao, Liming >; Liu, Zhiguang > Subject: Re: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier On Wed, Jul 6, 2022 at 7:22 PM Kinney, Michael D > wrote: Hi Pedro, This is an interesting feature. It is backwards compatible since you are adding a format specifier to the PrintLib class. There is a 2nd lib instance that needs to be updated, and that is DxePrintLibPrint2Protocol in MdeModulePkg. I think using ANSI C %z specifier syntax assumes that sizeof(size_t) == sizeof(UINTN) for all supported compilers/CPU archs. The ANSI C 99 specification does not state that this is always true. If may be true in practice for the compiler/CPU archs currently supported by Tianocore. It would be good to add a STATIC_ASSERT() to check that this is true and break the build if a compiler+CPU combination is used where that assumption is false. Not sure if this should go in Base.h with other STATIC_ASSERT()s for types. Or if it should go in PrintLib.h where this assumption is being made. Hi Mike, How can sizeof(size_t) != sizeof(UINTN)? It seems to be quite a stretch to interpret the standard in a way that the native word size will not be able to store "the maximum size of a theoretically possible object of any type". In any case, getting the size_t type is non-trivial and as far as I can tell would either depend on a C library or C extension trickery to try to get the type of sizeof(Type). I would like to see some more background on the motivation for adding this feature. I think the current workaround would be to typecast and integer of type size_t to UINT64 and use %Ld or typecase to UINT32 and use %d. Ok, some background: I essentially starting looking at DebugLib and PrintLib and I found myself very surprised by a good chunk of decisions. The API is very similar to standard printf() but it lacks support for a good portion of specifiers; some specifiers it implements don't have the C99 standard meaning, like the ones it implements for integers (%x and %lx for hex, for instance) which were non-obviously overridden to mean print UINT32 and print UINT64. Because of that, I feel that maybe reworking this library (creating PrintLib2 or something, since we can't break the existing uses) would be a good idea to have a "principle of least surprise" compliant API and would help smoothen the confusion curve between userspace code and EDK2 firmware. Since reworking the whole of PrintLib is a non-trivial amount of work, I settled for this very tiny feature that doesn't break anyone's code and is very handy to avoid odd workarounds like casting everything to a UINT64 so you can print it with %lx. Obviously, if there's interest in getting a more standard environment[1] in EDK2 (and I personally, although possibly naiively, don't see the downside), going down a PrintLib2 in the long haul is a good idea. But this patch took me around 10 minutes to write up and is probably useful in its current form. Thanks, Pedro [1] Why do I want to get a more standardized environment? Well, I simply believe it smoothens the learning curve between userspace -> kernel -> bootloader -> firmware. A good chunk of kernels and bootloaders already have relatively standard C APIs, but firmware, at least in our case, is still lacking. Also, makes you less prone to mistakes. Thanks, Mike > -----Original Message----- > From: devel@edk2.groups.io > On Behalf Of Pedro Falcato > Sent: Monday, July 4, 2022 6:16 PM > To: devel@edk2.groups.io > Cc: Kinney, Michael D >; Gao, Liming >; Liu, Zhiguang > > > Subject: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier > > BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=3977 > > %z is used in standard C99 as the printf specifier for size_t types. > Add support for it so we can portably print UINTN. > > Cc: Michael D Kinney > > Cc: Liming Gao > > Cc: Zhiguang Liu > > Signed-off-by: Pedro Falcato > > --- > MdePkg/Include/Library/PrintLib.h | 13 ++++++++----- > MdePkg/Library/BasePrintLib/PrintLibInternal.c | 9 +++++++++ > MdePkg/Library/BasePrintLib/PrintLibInternal.h | 1 + > 3 files changed, 18 insertions(+), 5 deletions(-) > > diff --git a/MdePkg/Include/Library/PrintLib.h b/MdePkg/Include/Library/PrintLib.h > index 8d523cac52..0d67f62d3f 100644 > --- a/MdePkg/Include/Library/PrintLib.h > +++ b/MdePkg/Include/Library/PrintLib.h > @@ -42,6 +42,9 @@ SPDX-License-Identifier: BSD-2-Clause-Patent > - L, l > - The number being printed is size UINT64. Only valid for types X, x, and d. > If this flag is not specified, then the number being printed is size int. > + - z > + - The number being printed is of size UINTN. Only valid for types X, x and d. > + If this flag is not specified, then the number being printed is size int. > - NOTE: All invalid flags are ignored. > > [width]: > @@ -73,18 +76,18 @@ SPDX-License-Identifier: BSD-2-Clause-Patent > using this type too by making sure bits 8..15 of the argument are set to 0. > - x > - The argument is an unsigned hexadecimal number. The characters used are 0..9 and > - A..F. If the flag 'L' is not specified, then the argument is assumed > + A..F. If the flags 'L', 'z' are not specified, then the argument is assumed > to be size int. This does not follow ANSI C. > - X > - The argument is an unsigned hexadecimal number and the number is padded with > - zeros. This is equivalent to a format string of "0x". If the flag > - 'L' is not specified, then the argument is assumed to be size int. > + zeros. This is equivalent to a format string of "0x". If the flags > + 'L', 'z' are not specified, then the argument is assumed to be size int. > This does not follow ANSI C. > - d > - - The argument is a signed decimal number. If the flag 'L' is not specified, > + - The argument is a signed decimal number. If the flags 'L', 'z' are not specified, > then the argument is assumed to be size int. > - u > - - The argument is a unsigned decimal number. If the flag 'L' is not specified, > + - The argument is a unsigned decimal number. If the flags 'L'. 'z' are not specified, > then the argument is assumed to be size int. > - p > - The argument is a pointer that is a (VOID *), and it is printed as an > diff --git a/MdePkg/Library/BasePrintLib/PrintLibInternal.c b/MdePkg/Library/BasePrintLib/PrintLibInternal.c > index 42b598a432..1cd99b2213 100644 > --- a/MdePkg/Library/BasePrintLib/PrintLibInternal.c > +++ b/MdePkg/Library/BasePrintLib/PrintLibInternal.c > @@ -720,6 +720,9 @@ BasePrintLibSPrintMarker ( > case 'l': > Flags |= LONG_TYPE; > break; > + case 'z': > + Flags |= SIZET_TYPE; > + break; > case '*': > if ((Flags & PRECISION) == 0) { > Flags |= PAD_TO_WIDTH; > @@ -833,6 +836,12 @@ BasePrintLibSPrintMarker ( > } else { > Value = BASE_ARG (BaseListMarker, int); > } > + } else if ((Flags & SIZET_TYPE) != 0) { > + if (BaseListMarker == NULL) { > + Value = VA_ARG (VaListMarker, UINTN); > + } else { > + Value = BASE_ARG (BaseListMarker, UINTN); > + } > } else { > if (BaseListMarker == NULL) { > Value = VA_ARG (VaListMarker, INT64); > diff --git a/MdePkg/Library/BasePrintLib/PrintLibInternal.h b/MdePkg/Library/BasePrintLib/PrintLibInternal.h > index 34d591c6fc..9193e6192b 100644 > --- a/MdePkg/Library/BasePrintLib/PrintLibInternal.h > +++ b/MdePkg/Library/BasePrintLib/PrintLibInternal.h > @@ -29,6 +29,7 @@ > #define ARGUMENT_REVERSED BIT12 > #define COUNT_ONLY_NO_PRINT BIT13 > #define UNSIGNED_TYPE BIT14 > +#define SIZET_TYPE BIT15 > > // > // Record date and time information > -- > 2.37.0 > > > > > -- Pedro Falcato -- Pedro Falcato -- Pedro Falcato