From: "Michael D Kinney" <michael.d.kinney@intel.com>
To: "devel@edk2.groups.io" <devel@edk2.groups.io>,
"pedro.falcato@gmail.com" <pedro.falcato@gmail.com>,
"Kinney, Michael D" <michael.d.kinney@intel.com>
Cc: "Andrew Fish (afish@apple.com)" <afish@apple.com>,
"Gao, Liming" <gaoliming@byosoft.com.cn>,
"Liu, Zhiguang" <zhiguang.liu@intel.com>
Subject: Re: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier
Date: Tue, 12 Jul 2022 22:02:22 +0000 [thread overview]
Message-ID: <CO1PR11MB49297FA6CAE59B52423FBD2ED2869@CO1PR11MB4929.namprd11.prod.outlook.com> (raw)
In-Reply-To: <CAKbZUD3DqgZN=zkW6+zTRKyaQTLu=xyKBtWPYUV7C=Kw=jgDzg@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 13407 bytes --]
Hi Pedro,
All good points. I agree that there are some assumptions in implementation code that sizeof(size_t) == sizeof(UINTN) == sizeof(VOID *).
When the PrintLib does use a format specifier defined by the ANSI C Spec and the behavior is not identical to ANSI C, then we state that in the format string description in PrintLib.h. Since we do not define or use size_t, we are not strictly conformant.
Mike
From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Pedro Falcato
Sent: Tuesday, July 12, 2022 2:33 PM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Kinney, Michael D <michael.d.kinney@intel.com>
Cc: Andrew Fish (afish@apple.com) <afish@apple.com>; Gao, Liming <gaoliming@byosoft.com.cn>; Liu, Zhiguang <zhiguang.liu@intel.com>
Subject: Re: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier
On Tue, Jul 12, 2022 at 9:46 PM Michael D Kinney <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:
Hi Pedro,
There is a good brief description of size_t on this page:
https://en.wikipedia.org/wiki/C_data_types
sizeof() is built into the C compiler, but size_t is defined by the std C lib being used. It can be as small as 16-bits.
We do not use the std C lib, so size_t is not defined when we do EDK II builds. This means we do not know how large the value sizeof() returns is.
Hi Mike,
The C standard library and the compiler usually work together and need to agree on how large each type is (uint8_t, uint16_t, uint32_t, uint64_t, size_t), just like EDK2 does for its types. The definition of "implementation" (which encompasses both the compiler and the C standard library) is intentionally vague (as is a good chunk of the C standard) but essentially maps to "whatever makes your program run on a standard C environment".
However, your request is to support printing values of type UINTN in the PrintLib. I agree this feature is not present today. Adding in a new format specifier would be required.
Using %z may be confusing because size_t defined by a C lib we are not using and UINTN defined in the UEFI Spec are not the same.
Using %z makes sense because in practice (and almost by definition, although not technically) UINTN will hold the native word size, and so will size_t. UINTN is used all over the place in UEFI to represent lengths of things in memory, and so is size_t in standard C codebases. Even if we go by both standards' definition of size_t and UINTN ( "unsigned integer type of the result of the sizeof operator" vs "Unsigned value of native width" ), since the two types in every platform we will ever support[1] map to the same concept and the same type, I would defend the use of %z for UINTN.
Do you have a suggestion for a different specifier that perhaps is not used in the ANSI C printf() format string?
I believe we have quite a good amount of specifiers that are not reserved but I think those would be better used for really non-standard types instead of UINTN (which is effectively a size_t).
If we do want a 100% ANSI C conformant version of a print library, then I agree this would require a new library class and we would need to use different API names so the caller knows which format string style to use.
Agreed.
[1] In fact, I would dare to say that if there ever was a modern machine that we ported UEFI to, that had addresses/sizeof() > native word size, a lot of code would implicitly break as UINTN takes the role of a size_t in UEFI. As a quick example, all interfaces in boot services and runtime services that deal with memory length/size take a UINTN. This should be clarified in the spec (unless it's written down somewhere and I don't see it). My interpretation of both standards leads me to conclude that both size_t and UINTN are intended to be used to represent addresses and lengths in memory.
All the best,
Pedro
There is a long history associated with the format string defined in PrintLib today. Mostly related to size optimizations of the PrintLib when FLASH devices were much, much smaller.
Mike
From: devel@edk2.groups.io<mailto:devel@edk2.groups.io> <devel@edk2.groups.io<mailto:devel@edk2.groups.io>> On Behalf Of Pedro Falcato
Sent: Wednesday, July 6, 2022 3:27 PM
To: Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
Cc: devel@edk2.groups.io<mailto:devel@edk2.groups.io>; Andrew Fish (afish@apple.com<mailto:afish@apple.com>) <afish@apple.com<mailto:afish@apple.com>>; Gao, Liming <gaoliming@byosoft.com.cn<mailto:gaoliming@byosoft.com.cn>>; Liu, Zhiguang <zhiguang.liu@intel.com<mailto:zhiguang.liu@intel.com>>
Subject: Re: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier
On Wed, Jul 6, 2022 at 7:22 PM Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:
Hi Pedro,
This is an interesting feature.
It is backwards compatible since you are adding a format specifier to the PrintLib class.
There is a 2nd lib instance that needs to be updated, and that is DxePrintLibPrint2Protocol in MdeModulePkg.
I think using ANSI C %z specifier syntax assumes that sizeof(size_t) == sizeof(UINTN) for all supported compilers/CPU archs.
The ANSI C 99 specification does not state that this is always true. If may be true in practice for the compiler/CPU archs
currently supported by Tianocore.
It would be good to add a STATIC_ASSERT() to check that this is true and break the build if a compiler+CPU combination
is used where that assumption is false. Not sure if this should go in Base.h with other STATIC_ASSERT()s for
types. Or if it should go in PrintLib.h where this assumption is being made.
Hi Mike,
How can sizeof(size_t) != sizeof(UINTN)? It seems to be quite a stretch to interpret the standard in a way that the native word size will not be able to store "the maximum size of a theoretically possible object of any type". In any case, getting the size_t type is non-trivial and as far as I can tell would either depend on a C library or C extension trickery to try to get the type of sizeof(Type).
I would like to see some more background on the motivation for adding this feature.
I think the current workaround would be to typecast and integer of type size_t to UINT64 and use %Ld or
typecase to UINT32 and use %d.
Ok, some background: I essentially starting looking at DebugLib and PrintLib and I found myself very surprised by a good chunk of decisions. The API is very similar to standard printf() but it lacks support for a good portion of specifiers; some specifiers it implements don't have the C99 standard meaning, like the ones it implements for integers (%x and %lx for hex, for instance) which were non-obviously overridden to mean print UINT32 and print UINT64. Because of that, I feel that maybe reworking this library (creating PrintLib2 or something, since we can't break the existing uses) would be a good idea to have a "principle of least surprise" compliant API and would help smoothen the confusion curve between userspace code and EDK2 firmware.
Since reworking the whole of PrintLib is a non-trivial amount of work, I settled for this very tiny feature that doesn't break anyone's code and is very handy to avoid odd workarounds like casting everything to a UINT64 so you can print it with %lx. Obviously, if there's interest in getting a more standard environment[1] in EDK2 (and I personally, although possibly naiively, don't see the downside), going down a PrintLib2 in the long haul is a good idea. But this patch took me around 10 minutes to write up and is probably useful in its current form.
Thanks,
Pedro
[1] Why do I want to get a more standardized environment? Well, I simply believe it smoothens the learning curve between userspace -> kernel -> bootloader -> firmware. A good chunk of kernels and bootloaders already have relatively standard C APIs, but firmware, at least in our case, is still lacking. Also, makes you less prone to mistakes.
Thanks,
Mike
> -----Original Message-----
> From: devel@edk2.groups.io<mailto:devel@edk2.groups.io> <devel@edk2.groups.io<mailto:devel@edk2.groups.io>> On Behalf Of Pedro Falcato
> Sent: Monday, July 4, 2022 6:16 PM
> To: devel@edk2.groups.io<mailto:devel@edk2.groups.io>
> Cc: Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>; Gao, Liming <gaoliming@byosoft.com.cn<mailto:gaoliming@byosoft.com.cn>>; Liu, Zhiguang
> <zhiguang.liu@intel.com<mailto:zhiguang.liu@intel.com>>
> Subject: [edk2-devel] [PATCH v2] MdePkg/BasePrintLib: Add %z specifier
>
> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=3977
>
> %z is used in standard C99 as the printf specifier for size_t types.
> Add support for it so we can portably print UINTN.
>
> Cc: Michael D Kinney <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
> Cc: Liming Gao <gaoliming@byosoft.com.cn<mailto:gaoliming@byosoft.com.cn>>
> Cc: Zhiguang Liu <zhiguang.liu@intel.com<mailto:zhiguang.liu@intel.com>>
> Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com<mailto:pedro.falcato@gmail.com>>
> ---
> MdePkg/Include/Library/PrintLib.h | 13 ++++++++-----
> MdePkg/Library/BasePrintLib/PrintLibInternal.c | 9 +++++++++
> MdePkg/Library/BasePrintLib/PrintLibInternal.h | 1 +
> 3 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/MdePkg/Include/Library/PrintLib.h b/MdePkg/Include/Library/PrintLib.h
> index 8d523cac52..0d67f62d3f 100644
> --- a/MdePkg/Include/Library/PrintLib.h
> +++ b/MdePkg/Include/Library/PrintLib.h
> @@ -42,6 +42,9 @@ SPDX-License-Identifier: BSD-2-Clause-Patent
> - L, l
> - The number being printed is size UINT64. Only valid for types X, x, and d.
> If this flag is not specified, then the number being printed is size int.
> + - z
> + - The number being printed is of size UINTN. Only valid for types X, x and d.
> + If this flag is not specified, then the number being printed is size int.
> - NOTE: All invalid flags are ignored.
>
> [width]:
> @@ -73,18 +76,18 @@ SPDX-License-Identifier: BSD-2-Clause-Patent
> using this type too by making sure bits 8..15 of the argument are set to 0.
> - x
> - The argument is an unsigned hexadecimal number. The characters used are 0..9 and
> - A..F. If the flag 'L' is not specified, then the argument is assumed
> + A..F. If the flags 'L', 'z' are not specified, then the argument is assumed
> to be size int. This does not follow ANSI C.
> - X
> - The argument is an unsigned hexadecimal number and the number is padded with
> - zeros. This is equivalent to a format string of "0x". If the flag
> - 'L' is not specified, then the argument is assumed to be size int.
> + zeros. This is equivalent to a format string of "0x". If the flags
> + 'L', 'z' are not specified, then the argument is assumed to be size int.
> This does not follow ANSI C.
> - d
> - - The argument is a signed decimal number. If the flag 'L' is not specified,
> + - The argument is a signed decimal number. If the flags 'L', 'z' are not specified,
> then the argument is assumed to be size int.
> - u
> - - The argument is a unsigned decimal number. If the flag 'L' is not specified,
> + - The argument is a unsigned decimal number. If the flags 'L'. 'z' are not specified,
> then the argument is assumed to be size int.
> - p
> - The argument is a pointer that is a (VOID *), and it is printed as an
> diff --git a/MdePkg/Library/BasePrintLib/PrintLibInternal.c b/MdePkg/Library/BasePrintLib/PrintLibInternal.c
> index 42b598a432..1cd99b2213 100644
> --- a/MdePkg/Library/BasePrintLib/PrintLibInternal.c
> +++ b/MdePkg/Library/BasePrintLib/PrintLibInternal.c
> @@ -720,6 +720,9 @@ BasePrintLibSPrintMarker (
> case 'l':
> Flags |= LONG_TYPE;
> break;
> + case 'z':
> + Flags |= SIZET_TYPE;
> + break;
> case '*':
> if ((Flags & PRECISION) == 0) {
> Flags |= PAD_TO_WIDTH;
> @@ -833,6 +836,12 @@ BasePrintLibSPrintMarker (
> } else {
> Value = BASE_ARG (BaseListMarker, int);
> }
> + } else if ((Flags & SIZET_TYPE) != 0) {
> + if (BaseListMarker == NULL) {
> + Value = VA_ARG (VaListMarker, UINTN);
> + } else {
> + Value = BASE_ARG (BaseListMarker, UINTN);
> + }
> } else {
> if (BaseListMarker == NULL) {
> Value = VA_ARG (VaListMarker, INT64);
> diff --git a/MdePkg/Library/BasePrintLib/PrintLibInternal.h b/MdePkg/Library/BasePrintLib/PrintLibInternal.h
> index 34d591c6fc..9193e6192b 100644
> --- a/MdePkg/Library/BasePrintLib/PrintLibInternal.h
> +++ b/MdePkg/Library/BasePrintLib/PrintLibInternal.h
> @@ -29,6 +29,7 @@
> #define ARGUMENT_REVERSED BIT12
> #define COUNT_ONLY_NO_PRINT BIT13
> #define UNSIGNED_TYPE BIT14
> +#define SIZET_TYPE BIT15
>
> //
> // Record date and time information
> --
> 2.37.0
>
>
>
>
>
--
Pedro Falcato
--
Pedro Falcato
[-- Attachment #2: Type: text/html, Size: 64640 bytes --]
next prev parent reply other threads:[~2022-07-12 22:02 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-05 1:16 [PATCH v2] MdePkg/BasePrintLib: Add %z specifier Pedro Falcato
2022-07-06 18:22 ` [edk2-devel] " Michael D Kinney
2022-07-06 22:27 ` Pedro Falcato
2022-07-12 20:46 ` Michael D Kinney
2022-07-12 21:33 ` Pedro Falcato
2022-07-12 22:02 ` Michael D Kinney [this message]
2022-07-12 22:30 ` Pedro Falcato
2022-07-15 16:30 ` Michael D Kinney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CO1PR11MB49297FA6CAE59B52423FBD2ED2869@CO1PR11MB4929.namprd11.prod.outlook.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox