public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Laszlo Ersek" <lersek@redhat.com>
To: devel@edk2.groups.io, kraxel@redhat.com
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>,
	Jiewen Yao <jiewen.yao@intel.com>,
	oliver@redhat.com
Subject: Re: [edk2-devel] [PATCH 0/4] OvmfPkg/VirtNorFlashDxe: fix corruption + misc small improvements
Date: Mon, 15 Jan 2024 13:38:42 +0100	[thread overview]
Message-ID: <3487014c-3a2b-4a4d-955e-ef4fa5fd81b8@redhat.com> (raw)
In-Reply-To: <01ac5fab-2710-4231-a7cd-1c550fa6319f@redhat.com>

On 1/15/24 11:21, Laszlo Ersek wrote:

> - please only ever apply the bit-neg operator on values that are UINT32,
>   UINTN, or UINT64. Otherwise we get sign bit flipping, and that's
>   terrible. (Most people are not even aware of it happening.)

Doing this is BTW not as obvious as it might seem, at first sight. It's
good to remember some points about integer constants:

- assuming a naked constant (no 0x or 0 prefix, and no suffix such as l,
or u), the types considered are int, long, long long, in this order, by
the compiler, for the value (whichever fits first). That is: a "naked"
integer constant will *always* be signed.

- assuming an octal or hex prefix (0 or 0x), the candidate type list is
only *extended*; in other words, these prefixes don't *force* the type
to be unsigned, only *permit* it. The list becomes int, unsigned, long,
unsigned long, long long, unsigned long long. This is why 0x7F is just
"int", for example. However, 0x8000_0000 is not "int" anymore, but
"unsigned" (the value doesn't fit in "int", and the 0x prefix "permits"
"unsigned int").

- The suffixes do restrict the candidate type list. The "u" (and U)
suffixes remove the signed types, and add in the unsigned types. The
list becomes unsigned, unsigned long, unsigned long long. Furthermore,
the "l" and "ll" suffixes force (restrict) the type selection along a
different axis: they set the minimum integer "conversion rank", so to
say. The head of the list is trimmed so that the first candidate have
the specified rank. So with just an "l" suffix, the normal candidate
type list "int, long, long long" gets trimmed to "long, long long".
Assuming an "u" suffix in place already, adding the "l" suffix trims the
candidate type list "unsigned, unsigned long, unsigned long long" to
"unsigned long, unsigned long long". Assuming a 0x prefix and no "u"
suffix to begin with, appending the "l" suffix trims the type list "int,
unsigned, long, unsigned long, long long, unsigned long long" to "long,
unsigned long, long long, unsigned long long".

The "shorthand" to remember is: "prefixes permit, suffixes force".

Why I'm posting this wall of text: if we have a macro
BOUNDARY_OF_32_WORDS #defined as 0x7F, or a macro BIT1 #defined as
0x00000002, those are *signed int* values. And applying the bit-neg
operator ~ to them directly will flip the sign bit, and the resultant
value will be *implementation-dependent*. Given that we use two's
complement representation, the resultant value will always be signed int
with a negative value. (In a sign-and-magnitude representation e.g.,
where there is +0 and -0, we'd have to think further.) And then, for
example in:

  Offset & ~BOUNDARY_OF_32_WORDS

the negative value of the RHS is converted to the (unsigned) type of the
LHS [*], due to the default arithmetic conversions that are specified
for the & operator (too). This is done with the usual modular addition /
reduction.

So, when most people think that the above expression is simple
bit-fiddling, there are actually *two steps* that they miss: first, the
creation of a negative value of type "signed int" (using two's
complement representation), and then the reduction of that negative
"signed int" value to the (possibly wider) unsigned value range that the
other type is capable of representing [*].

[*] I'm taking some shortcuts here. The actual result type of the usual
arithmetic conversions (the "common real type for the operands
and result") is more complicated, but I won't describe all that here. It
can be read in the C std (drafts).

This is why I insist on one of two things in all such cases:

- either writing the expression as

    Offset & ~(UINTN)BOUNDARY_OF_32_WORDS

  where UINTN is supposed to match the type of Offset precisely,

- or #defining BOUNDARY_OF_32_WORDS already as an unsigned type --
either with an explicit cast ((UINTN)0x7F), or with a suitable suffix
(0x7Fllu).

Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113820): https://edk2.groups.io/g/devel/message/113820
Mute This Topic: https://groups.io/mt/103680930/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



  reply	other threads:[~2024-01-16  7:25 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-12 11:37 [edk2-devel] [PATCH 0/4] OvmfPkg/VirtNorFlashDxe: fix corruption + misc small improvements Gerd Hoffmann
2024-01-12 11:37 ` [edk2-devel] [PATCH 1/4] OvmfPkg/VirtNorFlashDxe: fix shadowbuffer reads Gerd Hoffmann
2024-01-12 12:11   ` Ard Biesheuvel
2024-01-12 11:37 ` [edk2-devel] [PATCH 2/4] OvmfPkg/VirtNorFlashDxe: clarify block write logic Gerd Hoffmann
2024-01-12 12:14   ` Ard Biesheuvel
2024-01-12 11:37 ` [edk2-devel] [PATCH 3/4] OvmfPkg/VirtNorFlashDxe: allow larger writes without block erase Gerd Hoffmann
2024-01-12 12:15   ` Ard Biesheuvel
2024-01-12 11:37 ` [edk2-devel] [PATCH 4/4] OvmfPkg/VirtNorFlashDxe: ValidateFvHeader: unwritten state is EOL too Gerd Hoffmann
2024-01-12 12:16   ` Ard Biesheuvel
2024-01-12 12:41     ` Gerd Hoffmann
2024-01-15 10:21 ` [edk2-devel] [PATCH 0/4] OvmfPkg/VirtNorFlashDxe: fix corruption + misc small improvements Laszlo Ersek
2024-01-15 12:38   ` Laszlo Ersek [this message]
2024-01-15 17:56   ` Ard Biesheuvel
2024-01-16  9:37     ` Laszlo Ersek
2024-01-16 10:21       ` Ard Biesheuvel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3487014c-3a2b-4a4d-955e-ef4fa5fd81b8@redhat.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox