public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: Laszlo Ersek <lersek@redhat.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>,
	edk2-devel-01 <edk2-devel@lists.01.org>,
	Eric Dong <eric.dong@intel.com>, Star Zeng <star.zeng@intel.com>,
	Paulo Alcantara <pcacjr@zytor.com>
Subject: Re: [PATCH 0/5] MdeModulePkg: UDF fixes and cleanups
Date: Tue, 12 Sep 2017 23:29:16 +0200	[thread overview]
Message-ID: <72a0d566-dc95-b9a0-5627-de3d01049680@redhat.com> (raw)
In-Reply-To: <CAKv+Gu84uwaahe-VO3oRCoVLu8-34ejZairaEE2M_usJJzLg1Q@mail.gmail.com>

On 09/12/17 17:38, Ard Biesheuvel wrote:
> On 12 September 2017 at 03:14, Laszlo Ersek <lersek@redhat.com> wrote:
>> On 09/10/17 02:12, Laszlo Ersek wrote:
>>> Repo:   https://github.com/lersek/edk2.git
>>> Branch: udf_fixes_cleanups
>>>
>>> Patches #2, #3 and #4 are needed (and enough) for me to build OVMF for
>>> IA32 and X64 with clang-3.8, after the UDF introduction.
>>>
>>> Patches #1 and #5 are cleanups that I felt fit before patch #2 and after
>>> patch #4, respectively.
>>>
>>> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>> Cc: Eric Dong <eric.dong@intel.com>
>>> Cc: Paulo Alcantara <pcacjr@zytor.com>
>>> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
>>> Cc: Star Zeng <star.zeng@intel.com>
>>>
>>> Thanks
>>> Laszlo
>>>
>>> Laszlo Ersek (5):
>>>   MdeModulePkg/UdfDxe: ASSERT() valid ReadFileInfo Flags for INLINE_DATA
>>>     req
>>>   MdeModulePkg/UdfDxe: don't return unset Status if INLINE_DATA req
>>>     succeeds
>>>   MdeModulePkg/UdfDxe: replace zero-init of local variables with
>>>     ZeroMem()
>>>   MdeModulePkg/PartitionDxe: don't divide 64-bit values with C operators
>>>   MdeModulePkg/PartitionDxe: remove always false comparison
>>>
>>>  MdeModulePkg/Universal/Disk/PartitionDxe/Udf.c            | 9 +++++++--
>>>  MdeModulePkg/Universal/Disk/UdfDxe/File.c                 | 6 ++++--
>>>  MdeModulePkg/Universal/Disk/UdfDxe/FileSystemOperations.c | 5 +++++
>>>  3 files changed, 16 insertions(+), 4 deletions(-)
>>>
>>
>> Thanks all for the feedback, pushed as commit range
>> c05cae55ebd8..b4e5807d2492.
>>
>> (I didn't change the sizeof / sizeof() stuff -- for one, I didn't want
>> to touch the code on such an urgent push, relative to the posted and
>> tested version.)
>>
> 
> Rather unexpectedly, the build is still broken on my CI system
> 
> This time, it is GCC 4.8 that complains about uninitialized variables,
> most likely false positives again (apologies for the letter soup)
> 
> <https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/MdeModulePkg/Universal/Disk/UdfDxe/FileSystemOperations.c>:
> In function 'ReadFile':
> <https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/MdeModulePkg/Universal/Disk/UdfDxe/FileSystemOperations.c>:876:27:
> error: 'Status' may be used uninitialized in this function
> [-Werror=maybe-uninitialized]
>    EFI_STATUS              Status;
>                            ^

This is actually a bug in the code.

In order to explain that, I have to elaborate.

The UDF standard from the OSTA is introduced like this:

    The OSTA Universal Disk Format (UDF ® ) specification defines a
    subset of the standard ECMA 167 3 rd edition. The primary goal of
    the OSTA UDF is to maximize data interchange and minimize the cost
    and complexity of implementing ECMA 167.

Which (in retrospect...) means that we have to download ECMA 167 at once:

https://www.ecma-international.org/publications/standards/Ecma-167.htm

OK, so let's look at the code.

First, the ReadFile() function is very hard to read. It is long (~300
lines), but what is actually problematic about it is that it uses "goto"
statements for control flow that is *not* related to error handling.
This makes it painful to track, and it happens to violate the edk2
coding style spec as well:

https://edk2-docs.gitbooks.io/edk-ii-c-coding-standards-specification/content/5_source_files/57_c_programming.html#57-c-programming

    5.7.3.8 Goto Statements should not be used (in general)

    In almost all cases, it is possible to write the code so that a goto
    is not needed. If a goto is used, be ready to defend it during
    review.

    It is common to use goto for error handling and thus, exiting a
    routine in an error case is the only legal use of a goto. A goto
    allows the error exit code to be contained in one place in the
    routine. This one place reduces software life cycle maintenance
    issues, as there can be one copy of error cleanup code per routine.

   [...]

Second, in this case it actually pays off to read the function top-down.

- The first switch statement (on "ReadFileInfo->Flags") does not set
Status, but it also doesn't read it or jump to a read site.

- The second switch statement has four case labels (RecordingFlags):

  - INLINE_DATA: it sets Status to EFI_SUCCESS
    (thanks to my recent patch)

  - LONG_ADS_SEQUENCE, SHORT_ADS_SEQUENCE: Status is set very
    soon, when we unconditionally call GetAllocationDescriptor().

  - EXTENDED_ADS_SEQUENCE: we ASSERT(FALSE), but also set Status to
    EFI_UNSUPPORTED

Then, assuming we didn't jump to any of the error labels
(Error_Read_Disk_Blk, Error_Alloc_Buffer_To_Next_Ad, Error_Get_Aed), we
proceed to the Done label, and return Status.

(BTW, I checked all the paths to those error labels, and those *are* fine.)

So how is it possible that we reach Done, and return Status, without
having set Status? It is possible by taking *none* of the branches in
the second switch statement (RecordingFlags). The switch statement does
not have a default label, which also happens to break the coding style:

  5.7.3.7 switch Statements

  [...]

  Always include the default case. It should always end with a break
  statement, even if the break is the last thing before the closing
  brace.

OK, so how is it possible that none of the case labels match?
"RecordingFlags", i.e. the controlling expression of the switch
statement, is set like this:

  RecordingFlags = GET_FE_RECORDING_FLAGS (FileEntryData);

which in turn is #defined as

typedef enum {
  SHORT_ADS_SEQUENCE,
  LONG_ADS_SEQUENCE,
  EXTENDED_ADS_SEQUENCE,
  INLINE_DATA
} UDF_FE_RECORDING_FLAGS;

#define GET_FE_RECORDING_FLAGS(_Fe) \
  ((UDF_FE_RECORDING_FLAGS)((UDF_ICB_TAG *)( \
                  (UINT8 *)(_Fe) + \
                  sizeof (UDF_DESCRIPTOR_TAG)))->Flags & 0x07)

So basically we take a bitmask which may have values 0..7 decimal
inclusive (bits 2:0), and then compare it against values 0, 1, 2, 3
only. Because this value comes from an on-disk data structure, i.e. we
don't set it ourselves in the driver, this is an actual bug. A default
label should be added to reject the unsupported cases.

Now, the UDF spec documents this field under

2.3.5 ICB Tag
2.3.5.4 Uint16 Flags
  Bits 0-2: These bits specify the type of allocation descriptors used.
  Refer to section 2.3.10 on Allocation Descriptors for the guidelines
  on choosing which type of allocation descriptor to use.

So we go to 2.3.10:

2.3.10 Allocation Descriptors
[...]

and we find that from the 4 constants listed in the code, the UDF spec
names only two, "Short Allocation Descriptor" (SHORT_ADS_SEQUENCE) and
"Long Allocation Descriptor" (LONG_ADS_SEQUENCE), but the spec doesn't
even provide their numerical values!

This is where ECMA-167 becomes relevant. We have:

14.6 ICB Tag
14.6.8 Flags (RBP 18)
Bit  Interpretation
0-2  Shall be interpreted as a 3-bit unsigned binary number as follows.
     The value 0 shall mean that Short Allocation Descriptors
     (4/14.14.1) are used. The value 1 shall mean that Long Allocation
     Descriptors (4/14.14.2) are used. The value 2 shall mean that
     Extended Allocation Descriptors (4/14.14.3) are used. The value 3
     shall mean that the file shall be treated as though it had exactly
     one allocation descriptor describing an extent which starts with
     the first byte of the Allocation Descriptors field and has a
     length, in bytes, recorded in the Length of Allocation Descriptors
     field. The values of 4-7 are reserved for future standardisation.

The enumeration constants in UDF_FE_RECORDING_FLAGS cover the specified
cases, but the reserved values are not checked against in the 2nd switch
statement, under default case label.

This also proves that setting Status at the top of the function, to
EFI_SUCCESS namely, would have been wrong in commit 131fd40ffc34.

> "/home/buildslave/srv/toolchain/arm-tc-14.04/bin/arm-linux-gnueabihf-gcc"
> -mthumb -mcpu=cortex-a15
> -I<https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/ArmVirtPkg/Include>
> -g -Os -fshort-wchar -fno-builtin -fno-strict-aliasing -Wall -Werror
> -Wno-array-bounds -include AutoGen.h -fno-common -mlittle-endian
> -mabi=aapcs -fno-short-enums -funsigned-char -ffunction-sections
> -fdata-sections -fomit-frame-pointer -Wno-address -mthumb
> -mfloat-abi=soft -fno-pic -fno-pie -fstack-protector
> -mword-relocations -Wno-unused-but-set-variable -DMDEPKG_NDEBUG
> -DDISABLE_NEW_DEPRECATED_INTERFACES -c -o
> <https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/Build/ArmVirtQemu-ARM/RELEASE_GCC48/ARM/OvmfPkg/VirtioNetDxe/VirtioNet/OUTPUT/./SnpSharedHelpers.obj>
> -I<https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/OvmfPkg/VirtioNetDxe>
> -I<https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/Build/ArmVirtQemu-ARM/RELEASE_GCC48/ARM/OvmfPkg/VirtioNetDxe/VirtioNet/DEBUG>
> -I<https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/MdePkg>
> -I<https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/MdePkg/Include>
> -I<https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/MdePkg/Include/Arm>
> -I<https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/OvmfPkg>
> -I<https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/OvmfPkg/Include>
> <https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/OvmfPkg/VirtioNetDxe/SnpSharedHelpers.c>
> <https://ci.linaro.org/job/leg-virt-tianocore-edk2-upstream/ws/edk2/MdeModulePkg/Universal/Disk/UdfDxe/FileSystemOperations.c>:887:27:
> error: 'BytesLeft' may be used uninitialized in this function
> [-Werror=maybe-uninitialized]
>    UINT64                  BytesLeft;
>                            ^

This is a false positive indeed. There are three reads of "BytesLeft",
and they are all under:

  LONG_ADS_SEQUENCE / SHORT_ADS_SEQUENCE
    READ_FILE_SEEK_AND_READ

However, BytesLeft is also assigned near the top of the function, under
READ_FILE_SEEK_AND_READ.

(Of course, this analysis is complicated by the fact that we have a
non-error handling label, namely "Skip_File_Seek", just above the first
read location of "BytesLeft". So we have to verify that nothing jumps
there without having set "BytesLeft". Goto statements that aren't used
for error handling / end-of-function cleanup are horrible.)

I'm going to submit two patches for these issues, but that's officially
the end of the resources I can allocate for this driver now. All further
error reports, even build errors, must go into the TianoCore Bugzilla;
no exceptions.

Laszlo


      reply	other threads:[~2017-09-12 21:26 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-10  0:12 [PATCH 0/5] MdeModulePkg: UDF fixes and cleanups Laszlo Ersek
2017-09-10  0:13 ` [PATCH 1/5] MdeModulePkg/UdfDxe: ASSERT() valid ReadFileInfo Flags for INLINE_DATA req Laszlo Ersek
2017-09-10  0:13 ` [PATCH 2/5] MdeModulePkg/UdfDxe: don't return unset Status if INLINE_DATA req succeeds Laszlo Ersek
2017-09-12  9:06   ` Zeng, Star
2017-09-10  0:13 ` [PATCH 3/5] MdeModulePkg/UdfDxe: replace zero-init of local variables with ZeroMem() Laszlo Ersek
2017-09-12  8:55   ` Zeng, Star
2017-09-12  9:38     ` Ni, Ruiyu
2017-09-12  9:57       ` Laszlo Ersek
2017-09-12 10:02         ` Zeng, Star
2017-09-12  9:55     ` Laszlo Ersek
2017-09-10  0:13 ` [PATCH 4/5] MdeModulePkg/PartitionDxe: don't divide 64-bit values with C operators Laszlo Ersek
2017-09-12  5:41   ` Bi, Dandan
2017-09-12  7:58     ` Laszlo Ersek
2017-09-12  8:28       ` Ni, Ruiyu
2017-09-12  8:46       ` Zeng, Star
2017-09-10  0:13 ` [PATCH 5/5] MdeModulePkg/PartitionDxe: remove always false comparison Laszlo Ersek
2017-09-12  8:50   ` Zeng, Star
2017-09-10  4:24 ` [PATCH 0/5] MdeModulePkg: UDF fixes and cleanups Shi, Steven
2017-09-10  8:38   ` Laszlo Ersek
2017-09-10 13:52     ` Laszlo Ersek
2017-09-10 14:27       ` Shi, Steven
2017-09-10 15:51         ` Paulo Alcantara
2017-09-11  6:58           ` Laszlo Ersek
2017-09-11 13:55             ` Paulo Alcantara
2017-09-11 13:07           ` Shi, Steven
2017-09-11 13:26             ` Ni, Ruiyu
2017-09-11 13:26             ` Laszlo Ersek
2017-09-11 13:52             ` Paulo Alcantara
2017-09-11 14:00               ` Shi, Steven
2017-09-10 15:05 ` Paulo Alcantara
2017-09-11 11:58 ` Ard Biesheuvel
2017-09-12 10:14 ` Laszlo Ersek
2017-09-12 15:38   ` Ard Biesheuvel
2017-09-12 21:29     ` Laszlo Ersek [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=72a0d566-dc95-b9a0-5627-de3d01049680@redhat.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox