From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 5328B21E87964 for ; Tue, 12 Sep 2017 14:26:22 -0700 (PDT) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1319880F7A; Tue, 12 Sep 2017 21:29:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 1319880F7A Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=lersek@redhat.com Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-50.rdu2.redhat.com [10.10.120.50]) by smtp.corp.redhat.com (Postfix) with ESMTP id 812D06887E; Tue, 12 Sep 2017 21:29:17 +0000 (UTC) To: Ard Biesheuvel Cc: Ruiyu Ni , edk2-devel-01 , Eric Dong , Star Zeng , Paulo Alcantara References: <20170910001304.8628-1-lersek@redhat.com> <6bc51f3f-8b97-3a24-1ccf-aeb838eedb6f@redhat.com> From: Laszlo Ersek Message-ID: <72a0d566-dc95-b9a0-5627-de3d01049680@redhat.com> Date: Tue, 12 Sep 2017 23:29:16 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Tue, 12 Sep 2017 21:29:19 +0000 (UTC) Subject: Re: [PATCH 0/5] MdeModulePkg: UDF fixes and cleanups X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Sep 2017 21:26:22 -0000 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit On 09/12/17 17:38, Ard Biesheuvel wrote: > On 12 September 2017 at 03:14, Laszlo Ersek wrote: >> On 09/10/17 02:12, Laszlo Ersek wrote: >>> Repo: https://github.com/lersek/edk2.git >>> Branch: udf_fixes_cleanups >>> >>> Patches #2, #3 and #4 are needed (and enough) for me to build OVMF for >>> IA32 and X64 with clang-3.8, after the UDF introduction. >>> >>> Patches #1 and #5 are cleanups that I felt fit before patch #2 and after >>> patch #4, respectively. >>> >>> Cc: Ard Biesheuvel >>> Cc: Eric Dong >>> Cc: Paulo Alcantara >>> Cc: Ruiyu Ni >>> Cc: Star Zeng >>> >>> Thanks >>> Laszlo >>> >>> Laszlo Ersek (5): >>> MdeModulePkg/UdfDxe: ASSERT() valid ReadFileInfo Flags for INLINE_DATA >>> req >>> MdeModulePkg/UdfDxe: don't return unset Status if INLINE_DATA req >>> succeeds >>> MdeModulePkg/UdfDxe: replace zero-init of local variables with >>> ZeroMem() >>> MdeModulePkg/PartitionDxe: don't divide 64-bit values with C operators >>> MdeModulePkg/PartitionDxe: remove always false comparison >>> >>> MdeModulePkg/Universal/Disk/PartitionDxe/Udf.c | 9 +++++++-- >>> MdeModulePkg/Universal/Disk/UdfDxe/File.c | 6 ++++-- >>> MdeModulePkg/Universal/Disk/UdfDxe/FileSystemOperations.c | 5 +++++ >>> 3 files changed, 16 insertions(+), 4 deletions(-) >>> >> >> Thanks all for the feedback, pushed as commit range >> c05cae55ebd8..b4e5807d2492. >> >> (I didn't change the sizeof / sizeof() stuff -- for one, I didn't want >> to touch the code on such an urgent push, relative to the posted and >> tested version.) >> > > Rather unexpectedly, the build is still broken on my CI system > > This time, it is GCC 4.8 that complains about uninitialized variables, > most likely false positives again (apologies for the letter soup) > > : > In function 'ReadFile': > :876:27: > error: 'Status' may be used uninitialized in this function > [-Werror=maybe-uninitialized] > EFI_STATUS Status; > ^ This is actually a bug in the code. In order to explain that, I have to elaborate. The UDF standard from the OSTA is introduced like this: The OSTA Universal Disk Format (UDF ® ) specification defines a subset of the standard ECMA 167 3 rd edition. The primary goal of the OSTA UDF is to maximize data interchange and minimize the cost and complexity of implementing ECMA 167. Which (in retrospect...) means that we have to download ECMA 167 at once: https://www.ecma-international.org/publications/standards/Ecma-167.htm OK, so let's look at the code. First, the ReadFile() function is very hard to read. It is long (~300 lines), but what is actually problematic about it is that it uses "goto" statements for control flow that is *not* related to error handling. This makes it painful to track, and it happens to violate the edk2 coding style spec as well: https://edk2-docs.gitbooks.io/edk-ii-c-coding-standards-specification/content/5_source_files/57_c_programming.html#57-c-programming 5.7.3.8 Goto Statements should not be used (in general) In almost all cases, it is possible to write the code so that a goto is not needed. If a goto is used, be ready to defend it during review. It is common to use goto for error handling and thus, exiting a routine in an error case is the only legal use of a goto. A goto allows the error exit code to be contained in one place in the routine. This one place reduces software life cycle maintenance issues, as there can be one copy of error cleanup code per routine. [...] Second, in this case it actually pays off to read the function top-down. - The first switch statement (on "ReadFileInfo->Flags") does not set Status, but it also doesn't read it or jump to a read site. - The second switch statement has four case labels (RecordingFlags): - INLINE_DATA: it sets Status to EFI_SUCCESS (thanks to my recent patch) - LONG_ADS_SEQUENCE, SHORT_ADS_SEQUENCE: Status is set very soon, when we unconditionally call GetAllocationDescriptor(). - EXTENDED_ADS_SEQUENCE: we ASSERT(FALSE), but also set Status to EFI_UNSUPPORTED Then, assuming we didn't jump to any of the error labels (Error_Read_Disk_Blk, Error_Alloc_Buffer_To_Next_Ad, Error_Get_Aed), we proceed to the Done label, and return Status. (BTW, I checked all the paths to those error labels, and those *are* fine.) So how is it possible that we reach Done, and return Status, without having set Status? It is possible by taking *none* of the branches in the second switch statement (RecordingFlags). The switch statement does not have a default label, which also happens to break the coding style: 5.7.3.7 switch Statements [...] Always include the default case. It should always end with a break statement, even if the break is the last thing before the closing brace. OK, so how is it possible that none of the case labels match? "RecordingFlags", i.e. the controlling expression of the switch statement, is set like this: RecordingFlags = GET_FE_RECORDING_FLAGS (FileEntryData); which in turn is #defined as typedef enum { SHORT_ADS_SEQUENCE, LONG_ADS_SEQUENCE, EXTENDED_ADS_SEQUENCE, INLINE_DATA } UDF_FE_RECORDING_FLAGS; #define GET_FE_RECORDING_FLAGS(_Fe) \ ((UDF_FE_RECORDING_FLAGS)((UDF_ICB_TAG *)( \ (UINT8 *)(_Fe) + \ sizeof (UDF_DESCRIPTOR_TAG)))->Flags & 0x07) So basically we take a bitmask which may have values 0..7 decimal inclusive (bits 2:0), and then compare it against values 0, 1, 2, 3 only. Because this value comes from an on-disk data structure, i.e. we don't set it ourselves in the driver, this is an actual bug. A default label should be added to reject the unsupported cases. Now, the UDF spec documents this field under 2.3.5 ICB Tag 2.3.5.4 Uint16 Flags Bits 0-2: These bits specify the type of allocation descriptors used. Refer to section 2.3.10 on Allocation Descriptors for the guidelines on choosing which type of allocation descriptor to use. So we go to 2.3.10: 2.3.10 Allocation Descriptors [...] and we find that from the 4 constants listed in the code, the UDF spec names only two, "Short Allocation Descriptor" (SHORT_ADS_SEQUENCE) and "Long Allocation Descriptor" (LONG_ADS_SEQUENCE), but the spec doesn't even provide their numerical values! This is where ECMA-167 becomes relevant. We have: 14.6 ICB Tag 14.6.8 Flags (RBP 18) Bit Interpretation 0-2 Shall be interpreted as a 3-bit unsigned binary number as follows. The value 0 shall mean that Short Allocation Descriptors (4/14.14.1) are used. The value 1 shall mean that Long Allocation Descriptors (4/14.14.2) are used. The value 2 shall mean that Extended Allocation Descriptors (4/14.14.3) are used. The value 3 shall mean that the file shall be treated as though it had exactly one allocation descriptor describing an extent which starts with the first byte of the Allocation Descriptors field and has a length, in bytes, recorded in the Length of Allocation Descriptors field. The values of 4-7 are reserved for future standardisation. The enumeration constants in UDF_FE_RECORDING_FLAGS cover the specified cases, but the reserved values are not checked against in the 2nd switch statement, under default case label. This also proves that setting Status at the top of the function, to EFI_SUCCESS namely, would have been wrong in commit 131fd40ffc34. > "/home/buildslave/srv/toolchain/arm-tc-14.04/bin/arm-linux-gnueabihf-gcc" > -mthumb -mcpu=cortex-a15 > -I > -g -Os -fshort-wchar -fno-builtin -fno-strict-aliasing -Wall -Werror > -Wno-array-bounds -include AutoGen.h -fno-common -mlittle-endian > -mabi=aapcs -fno-short-enums -funsigned-char -ffunction-sections > -fdata-sections -fomit-frame-pointer -Wno-address -mthumb > -mfloat-abi=soft -fno-pic -fno-pie -fstack-protector > -mword-relocations -Wno-unused-but-set-variable -DMDEPKG_NDEBUG > -DDISABLE_NEW_DEPRECATED_INTERFACES -c -o > > -I > -I > -I > -I > -I > -I > -I > > :887:27: > error: 'BytesLeft' may be used uninitialized in this function > [-Werror=maybe-uninitialized] > UINT64 BytesLeft; > ^ This is a false positive indeed. There are three reads of "BytesLeft", and they are all under: LONG_ADS_SEQUENCE / SHORT_ADS_SEQUENCE READ_FILE_SEEK_AND_READ However, BytesLeft is also assigned near the top of the function, under READ_FILE_SEEK_AND_READ. (Of course, this analysis is complicated by the fact that we have a non-error handling label, namely "Skip_File_Seek", just above the first read location of "BytesLeft". So we have to verify that nothing jumps there without having set "BytesLeft". Goto statements that aren't used for error handling / end-of-function cleanup are horrible.) I'm going to submit two patches for these issues, but that's officially the end of the resources I can allocate for this driver now. All further error reports, even build errors, must go into the TianoCore Bugzilla; no exceptions. Laszlo