From mboxrd@z Thu Jan  1 00:00:00 1970
Authentication-Results: mx.groups.io;
 dkim=missing; spf=pass (domain: redhat.com, ip: 209.132.183.28, mailfrom: lersek@redhat.com)
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 by groups.io with SMTP; Tue, 24 Sep 2019 13:26:21 -0700
Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mx1.redhat.com (Postfix) with ESMTPS id 5DFC8309BDBA;
	Tue, 24 Sep 2019 20:26:20 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-118.rdu2.redhat.com [10.10.120.118])
	by smtp.corp.redhat.com (Postfix) with ESMTP id 5CD5760F80;
	Tue, 24 Sep 2019 20:26:14 +0000 (UTC)
Subject: Re: [edk2-devel] [PATCH 00/35] edk2: clean up the usage of standardized (VOID*) typedefs
To: =?UTF-8?Q?Marvin_H=c3=a4user?= <mhaeuser@outlook.de>,
 "devel@edk2.groups.io" <devel@edk2.groups.io>
Cc: Andrew Fish <afish@apple.com>, Anthony Perard
 <anthony.perard@citrix.com>, Ard Biesheuvel <ard.biesheuvel@linaro.org>,
 Benjamin You <benjamin.you@intel.com>, Chao Zhang <chao.b.zhang@intel.com>,
 Dandan Bi <dandan.bi@intel.com>, David Woodhouse <dwmw2@infradead.org>,
 Eric Dong <eric.dong@intel.com>, Guo Dong <guo.dong@intel.com>,
 Hao A Wu <hao.a.wu@intel.com>, Jaben Carsey <jaben.carsey@intel.com>,
 Jian J Wang <jian.j.wang@intel.com>, Jiaxin Wu <jiaxin.wu@intel.com>,
 Jiewen Yao <jiewen.yao@intel.com>, Jordan Justen
 <jordan.l.justen@intel.com>, Julien Grall <julien.grall@arm.com>,
 Leif Lindholm <leif.lindholm@linaro.org>, Liming Gao <liming.gao@intel.com>,
 Maurice Ma <maurice.ma@intel.com>,
 Michael D Kinney <michael.d.kinney@intel.com>, Ray Ni <ray.ni@intel.com>,
 Siyuan Fu <siyuan.fu@intel.com>,
 Supreeth Venkatesh <supreeth.venkatesh@arm.com>,
 Zhichao Gao <zhichao.gao@intel.com>
References: <20190917194935.24322-1-lersek@redhat.com>
 <VI1PR0701MB25594288EE33C92D94A7F67CAC850@VI1PR0701MB2559.eurprd07.prod.outlook.com>
From: "Laszlo Ersek" <lersek@redhat.com>
Message-ID: <06b8b728-e38e-cc64-4499-9733fc993e5d@redhat.com>
Date: Tue, 24 Sep 2019 22:26:13 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To: <VI1PR0701MB25594288EE33C92D94A7F67CAC850@VI1PR0701MB2559.eurprd07.prod.outlook.com>
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.45]); Tue, 24 Sep 2019 20:26:21 +0000 (UTC)
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable

On 09/23/19 18:27, Marvin H=C3=A4user wrote:
> Good day,
>=20
> Thank you, Laszlo, for your ambition to introduce stricter code style=20
> enforcements. Sorry to "hijack" the actual topic (I did not CC anyone o=
n=20
> purpose, as this is mostly a separate topic and I'd like a quick commen=
t=20
> first), but this seems like a good occasion to mention another few bad=20
> practices edk2 has been following. Mainly, I'd like to call *some*=20
> attention to quality problems in the code base while this has some=20
> traction, and cause a discussion on whether and how those are to be=20
> approached.
>=20
> Thank you for your time.

Sure, I can offer my personal opinion on these.


> "inadequate type punning":
> e.g.=20
> https://github.com/tianocore/edk2/blob/fcdedafd97c8f18c33a63d26b954e5db=
aee81a2b/MdeModulePkg/Bus/Usb/UsbBusPei/HubPeim.c#L446
>=20
> This is mostly about the infamous "Strict Aliasing" rule, which is=20
> basically:
> "An object shall have its stored value accessed only by an lvalue=20
> expression that has one of the
> following types:
> =E2=80=94 a type compatible with the effective type of the object,
> =E2=80=94 a qualifed version of a type compatible with the effective ty=
pe of the=20
> object,
> =E2=80=94 a type that is the signed or unsigned type corresponding to t=
he=20
> effective type of the object,
> =E2=80=94 a type that is the signed or unsigned type corresponding to a=
 qualifed=20
> version of the effective
> type of the object,
> =E2=80=94 an aggregate or union type that includes one of the aforement=
ioned=20
> types among its members
> (including, recursively, a member of a subaggregate or contained union)=
, or
> =E2=80=94 a character type."
> C18 (ISO/IEC 9899:2018), 6.5.7 (exists, though has been updated, since =
C90)
>=20
> Currently optimisations based on this are disabled. This is a bit nasty=
=20
> to work around if *seriously* needed when sticking to C90, I can only=20
> think of memcpy right now. However, even though there are compilers tha=
t=20
> do not fully support C99 (ahem, Microsoft :) ), type-punning by unions=20
> should be supported by them all, and has been legal as of C99, where th=
e=20
> following part has been dropped from the standard:
> "With one exception, if a member of a union object is accessed after a=20
> value has been stored in a different member of the object, the behavior=
=20
> is implementation-defined."
> C90 (ISO/IEC 9899:1990), 6.3.2.3

I'm opposed to enforcing the strict aliasing rules, even though in all
code that I write, I either try to conform to them, or at least I seek
to be fully conscious of breaking them.

Here's the thing: IMO, the strict aliasing rules sacrifice flexibility
for performance (optimization possibilities). Not to mention the amount
of code in edk2 that would have to be identified and updated.

BaseTools uses "-fno-strict-aliasing" everywhere, and I think that's a
good choice.

  https://blog.regehr.org/archives/1180

This proposal for a "friendly dialect of C" intended to eliminate the
strict aliasing rules altogether. (Item#10; possibly also item#13.)

Also, as it points out, the Linux kernel is built with
"-fno-strict-aliasing". I've checked now, with a *long* series of "git
blame" commands, even digging into the "history" repository (which is at
<git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git>). I can
say that the current flag has been in place since *at least* Linux
v2.5.0 (2002-02-04).

QEMU has been built with "-fno-strict-aliasing" ever since commit
b932caba32c6 ("new disk image layer", 2004-08-01), known originally as
SVN rev 1030.


Consider the following example. You have a dynamically allocated buffer.
You read some data into it, from the network or the disk, using PCI DMA.
Let's assume that, from the block read via PCI DMA, the library
function(s) or protocol member(s) that you call, directly or indirectly,
there is at least one that:
- copies data from a source buffer to a target buffer, using UINT32 or
UINT64 assignments (for speed),
- and is implemented in C.

Now, according to the effective type rules, your dynamic buffer's
effective type is "array of UINT32" or "array of UINT64". That's because
of C99 6.5 Expressions, p6:

"If a value is stored into an object having no declared type through an
lvalue having a type that is not a character type, then the type of the
lvalue becomes the effective type of the object for that access and for
subsequent accesses that do not modify the stored value."

Then if you try to parse this buffer as a UEFI device path (=3D a packed
sequence of device path node structures), *IN PLACE*, you will break the
effective type rules no matter what. Because, you will necessarily look
at the next node in the blob as an EFI_DEVICE_PATH_PROTOCOL (because
you'll want to learn the Type and the SubType fields, and the Length
array too). Note that EFI_DEVICE_PATH_PROTOCOL is a structure with no
UINT32 or UINT64 members. Boom.

So you have to resort to one of the following things:

1. Define a union type, assign the *full union* first
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D90167#c3>, and then check
the union helper variable. (First you must make sure there was enough
room in the buffer being parsed for a full union.)

2. Or, use memcpy() -- or something that the compiler is similarly
enlightened about --, to the same helper variable.

3. Or, write a manual copy loop with character access, to the helper
variable.

4. Or, use character access to read the fields.

#1 through #3 add a separate copying step, while #4 is extremely
uncomfortable to program.

In a nutshell, the effective type rules require separate
de-serialization routines for all data, and that's incredibly annoying.
It kills the utility of packed C structures.

I prefer packed C structures, size checks (!!!) against the buffers to
parse, and then type punning of pointers.


> "pointer unions":
> e.g.=20
> https://github.com/tianocore/edk2/blob/fcdedafd97c8f18c33a63d26b954e5db=
aee81a2b/MdePkg/Include/IndustryStandard/SmBios.h#L2592
>=20
> While the idea behind them is certainly style preference, using a union=
=20
> of pointers prevents two important things over a union of structs.
>=20
> 1) CONST declaration: When defining a variable of a union type=20
> containing pointers as CONST, speaking of its members, they are all=20
> going to be CONST pointers to arbitrary memory and not arbitrary=20
> pointers to CONST memory. With a union of structs, you can have either=20
> as required (e.g. CONST UNION_TYPE *union, or UNION_TYPE *COST union).

Formally, you are right, but I'm doubtful of the utility of
"pointer-to-union-of-structs". We cannot de-reference a pointer to the
union unless the buffer has enough data for the complete union. This
leaves the parsing of small structs unsolved.

A union of pointers is just syntactic sugar, of course, but it's
convenient. The member that is "pointer-to-smallest-header-substructure"
can be used for determining the actual structure type. Then we can
determine if there's enough data for that structure in the buffer. Then
we can use the matching pointer member, for accessing that structure.

Furthermore, CONST gets too complex really soon, and we have to start
adding explicit casts. My favorite link is:

  http://c-faq.com/ansi/constmismatch.html

I had never met "pointer unions" before edk2, but I've found them quite
convenient.


> 2) Well-defined header inspection:
> "if a union contains several structures that share a common initial=20
> sequence [...], it is permitted to inspect the common initial part of=20
> any of them anywhere that a declaration of the completed type of the=20
> union is visible"
> C18 (ISO/IEC 9899:2018), 6.5.2.3.6 (exists since at least C90)
>=20
> This guarantee can be used to inspect the type defined in a common=20
> header (e.g. SMBIOS_STRUCTURE) and the process the type-specific data b=
y=20
> accessing the appropiate member (e.g. SMBIOS_TABLE_TYPE0) legally.

This only applies *after* you have populated the union for inspection.
(Or at least enough bytes for the common header that you're going to
inspect.)

If you have a union collecting three structures: 5 bytes, 19 bytes, and
32 bytes, and you have a buffer with 24 bytes left (suggesting a 5 byte
structure and a 19 byte structure in it, or vice versa), you cannot
populate the full union -- you don't have 32 bytes left.

So let's then assume that the common header is 2 bytes long (with 3 vs.
17 vs. 30 additional bytes required in the specific structures). Fine,
then you read 2 bytes into the stand-alone union helper variable, for
inspecting the common header. Based on the header inspection, you then
decide to (attempt to) read 3 more bytes, or 17 bytes, continuing into
the union, and then parse the specific (now completed) structure through
the matching union member. Great?

Not really. Notice that, in this process, you didn't need the union *at
all*. You can simply use standalone structures, and you may not even
need more stack space.

Compare:

(i) with a union:

enum Type {
  type_1,
  type_2
}

struct H {
  enum Type t;
  ...
};

struct S1 {
  struct H h;
  int S1_i1;
};

struct S2 {
  struct H h;
  char S2_c;
  double S2_d;
};

union U {
  S1 s1;
  S2 s2;
};

Code:

union U u;
char unsigned *src =3D buffer;
char unsigned *dst =3D (void*)&u;
size_t specific;

if (room_left < sizeof(struct H)) {
  return BAD;
}

memcpy(dst, src, sizeof(struct H));
dst +=3D sizeof(struct H);
src +=3D sizeof(struct H);
room_left -=3D sizeof(struct H);

switch (u.s1.h.t) {
  case type_1:
    specific =3D sizeof(struct S1) - sizeof(struct H);
    if (room_left < specific) {
      return BAD;
    }
    memcpy(dst, src, specific);
    dst +=3D specific;
    src +=3D specific;
    room_left -=3D specific;
    /* now access u.s1.S1_i1 */
    return GOOD;

  case type_2:
    specific =3D sizeof(struct S2) - sizeof(struct H);
    if (room_left < specific) {
      return BAD;
    }
    memcpy(dst, src, specific);
    dst +=3D specific;
    src +=3D specific;
    room_left -=3D specific;
    /* now access u.s2.{S2_c,S2_d} */
    return GOOD;

  default:
    return BAD;
}


(ii) without a union:

enum Type {
  type_1,
  type_2
}

struct H {
  enum Type t;
  ...
};

/* note: header no longer embedded */
struct S1 {
  int S1_i1;
};

/* note: header no longer embedded */
struct S2 {
  char S2_c;
  double S2_d;
};

Code:

struct H h; /* note: no union, just the common header */
char unsigned *src =3D buffer;
/* note: no dst pointer into union */
size_t specific;

if (room_left < sizeof h) {
  return BAD;
}

memcpy(&h, src, sizeof h);
src +=3D sizeof h;
room_left -=3D sizeof h;

switch (h.t) { /* note: no awkward reference to "u.s1" */
  case type_1:
    {
      struct S1 s1; /* note: never co-exists with "s2" on the stack */

      specific =3D sizeof s1; /* note: no awkward subtraction */
      if (room_left < specific) {
        return BAD;
      }
      memcpy(&s1, src, specific);
      src +=3D specific;
      room_left -=3D specific;
      /* now access s1.S1_i1 */
    }
    return GOOD;

  case type_2:
    {
      struct S2 s2; /* note: never co-exists with "s1" on the stack */

      specific =3D sizeof s2; /* note: no awkward subtraction */
      if (room_left < specific) {
        return BAD;
      }
      memcpy(&s2, src, specific);
      src +=3D specific;
      room_left -=3D specific;
      /* now access s2.S2_c, s2.S2_d */
    }
    return GOOD;

  default:
    return BAD;
}


To me, (ii) is much cleaner than (i); the union is not needed.

Of course, I find the type punning approach even better than (ii) :) See
below:

(iii) no union, yes type punning:

struct H *h;
char unsigned *src =3D buffer;

if (room_left < sizeof *h) {
  return BAD;
}
h =3D (struct H *)src; /* note: no copying */
src +=3D sizeof *h;
room_left -=3D sizeof *h;

switch (h->t) {
  case type_1:
    {
      struct S1 *s1;

      specific =3D sizeof *s1;
      if (room_left < specific) {
        return BAD;
      }
      s1 =3D (struct S1 *)src; /* note: no copying */
      src +=3D specific;
      room_left -=3D specific;
      /* now access s1->S1_i1 */
    }
    return GOOD;

and so on.


> Plain=20
> casts and "pointer union" accesses are illegal as per the "inadequate=20
> type punning" point above.
>=20
>=20
>=20
> "casting away CONST":
> e.g.=20
> https://github.com/tianocore/edk2/blob/fcdedafd97c8f18c33a63d26b954e5db=
aee81a2b/MdePkg/Library/BaseIoLibIntrinsic/IoLibMmioBuffer.c#L236

In this case, the real problem is with the function prototype /
specification, not the implementation. The implementation follows from
the problematic function semantics -- if you take Buffer as (CONST UINT8
*), then why return the exact same buffer as (UINT8 *)?


> This should be obvious as Undefined Behaviour because memory previously=
=20
> guaranteed to be read-only is returned as a pointer to memory that=20
> allows writing,

Note: this is *per se* not undefined behavior. Casting away CONST
(explicitly) in itself is OK. Writing through the pointer is also OK
*if* the pointed-to object was never defined as CONST. Otherwise, the
behavior is undefined. See C99 6.7.3 Type qualifiers, p5:

"If an attempt is made to modify an object defined with a
const-qualified type through use of an lvalue with non-const-qualified
type, the behavior is undefined. If an attempt is made to refer to an
object defined with a volatile-qualified type through use of an lvalue
with non-volatile-qualified type, the behavior is undefined."

I've cited the part about "volatile" to highlight the difference between
"modify" and "refer to". When casting away const, the behavior is only
undefined if you actually try to modify an object that is actually const.

int       i =3D 2;
int const ci =3D 3;
int       *pi;
int const *pci;

pci =3D &i;
pi =3D (int *)pci; /* fine in itself, but now you're on your own */
*pi =3D 3;         /* fine, as "i" is not defined const */

pci =3D &ci;
pi =3D (int *)pci; /* fine in itself, but now you're on your own */
i =3D *pi;         /* fine, not modifying "ci" */
*pi =3D 3;         /* undefined, as "ci" is defined const */


But, yes, the pattern seen under the link is risky practice.


> but for easier lookup, here's the related rule:
> "the left operand has atomic, qualifed, or unqualifed pointer type, and=
=20
> [...] the type pointed to by the left has all the qualifers of the type=
=20
> pointed to by the right"
> C18 (ISO/IEC 9899:2018), 6.5.16.1 (exists since at least C90)

What you quote is from the Constraints of "Simple assignment". Look at
"6.5.4 Cast operators" as well, paragraph 3:

"Conversions that involve pointers, other than where permitted by the
constraints of 6.5.16.1, shall be specified by means of an explicit cast.=
"

In brief, casting away const is not invalid in itself; it just throws
away protections (diagnostics) that the compiler would otherwise be
required to emit. Sometimes it's unavoidable. In most cases we should
avoid it. That might require fixing a few function prototypes.


> "structs with trailing 1-length array"
> e.g.=20
> https://github.com/tianocore/edk2/blob/fcdedafd97c8f18c33a63d26b954e5db=
aee81a2b/MdePkg/Include/Guid/FileInfo.h#L51
>=20
> This is undefined as per:
> "The behavior is undefned in the following circumstances:
> [...]
> =E2=80=94 Addition or subtraction of a pointer into, or just beyond, an=
 array=20
> object and an integer type produces a result that does not point into,=20
> or just beyond, the same array object (6.5.6).
> =E2=80=94 Addition or subtraction of a pointer into, or just beyond, an=
 array=20
> object and an integer type produces a result that points just beyond th=
e=20
> array object and is used as the operand of a unary * operator that is=20
> evaluated (6.5.6).
> =E2=80=94 Pointers that do not point into, or just beyond, the same arr=
ay object=20
> are subtracted (6.5.6)."
> C18 (ISO/IEC 9899:2018), J.2 (exists since at least C90)
>=20
> Same as above, while not all compilers fully support C99, flexible=20
> arrays should be support by all reasonably new compilers and allow a=20
> legal declaration and usage where this hack is in place. At worst, a=20
> macro could be provided to declare a [1] vs a [] array on demand and a=20
> requirement be introduced to have a "SIZE_OF_" macro for each such=20
> struct, but my personal preference would be to just enforce flexible ar=
rays.

Yes, in C99, the flexible array member was introduced to replace the
"struct hack", which had always been undefined.

It would be nice to remove all toolchains that don't support the
flexible array member, and then to replace all struct hacks with the
flexible array member. I agree.

Unfortunately, there's one extra difficulty (beyond the "expected"
regressions in adjusting code for the fixed element at offset 0): the
struct hack is used in several places in the UEFI 2.8 spec. So that
would have to be updated too.


> "Missing security checks for external data":
> e.g.=20
> https://github.com/tianocore/edk2/blob/fcdedafd97c8f18c33a63d26b954e5db=
aee81a2b/MdePkg/Library/BasePeCoffLib/BasePeCoff.c#L943
>=20
> The given example misses an alignment verification of the resulting=20
> pointer (which technically has to be verified *before* casting), as=20
> documented here:
> "The behavior is undefined in the following circumstances:
> [...]
> =E2=80=94 Conversion between two pointer types produces a result that i=
s=20
> incorrectly aligned (6.3.2.3)."
> C18 (ISO/IEC 9899:2018), J.2 (exists since at least C90)

In C99 anyway, "6.3.2.3 Pointers", paragraph 7 writes,

"A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer is
not correctly aligned [...] for the pointed-to type, the behavior is
undefined. [...]"

I find this very impractical and limiting, when accessing RAM (not
MMIO). I prefer if unaligned pointers (into RAM) just work, albeit slow,
perhaps. I believe AARCH64 can be configured to trip a fault vs. work
(but more slowly). On Intel, it just works.

I think we should be given the freedom to "define" the behavior that's
left undefined in this case by the standard.


> There are more such issues throughout the codebase, including missing=20
> overflow and (or flawed, see before) bounds checks, however I cannot=20
> find such quickly.
>=20
>=20
>=20
> "signed int BIT definitions":
> e.g.=20
> https://github.com/tianocore/edk2/blob/fcdedafd97c8f18c33a63d26b954e5db=
aee81a2b/MdePkg/Include/Base.h#L348
>=20
> Fixing this would be prone to regressions, but I'd like to add it for=20
> tracking purposes. Related discussion can be found down the chain here:=
=20
> https://lists.01.org/pipermail/edk2-devel/2018-February/021919.html

I agree about this, in theory. I'm afraid it's impossible to fix in
practice, given the huge amounts of dependent code (esp. out of tree code=
).


More or less, I'd summarize my opinion as follows:

- we should try to write such new code that conforms to the standard,

- *except* when the standard doesn't give us enough guarantees (i.e.
leaves the behavior undefined) that we need for convenient *in-place*
parsing (from RAM). Integer range checks and buffer boundary checks are
extremely important and we should implement those meticulously, but once
we've made sure our accesses are in range, the compiler should just
follow our pointers wherever they point. We should drag our toolchains
kicking and screaming into a state where they build the source the way
we need, for *in-place* parsing of RAM buffers, through packed
structures. As long as the architectures that we target don't prevent us
from in-place parsing, the toolchains should neither.

Thanks
Laszlo