From: Pedro Falcato
To: devel@edk2.groups.io
Cc: Pedro Falcato, Marvin Häuser
Subject: [PATCH 2/2] Ext4Pkg: Fix and clarify handling regarding non-utf8 dir entries
Date: Wed, 11 Jan 2023 23:59:19 +0000
Message-Id: <20230111235920.252317-5-pedro.falcato@gmail.com>
In-Reply-To: <20230111235920.252317-1-pedro.falcato@gmail.com>
References: <20230111235920.252317-1-pedro.falcato@gmail.com>

Previously, the handling was mixed and/or non-existent regarding non utf-8
dirent names. Clarify it.

Signed-off-by: Pedro Falcato
Cc: Marvin Häuser
---
 Features/Ext4Pkg/Ext4Dxe/Directory.c | 37 ++++++++++++++++++++++------
 Features/Ext4Pkg/Ext4Dxe/Ext4Dxe.h   |  8 +++---
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/Features/Ext4Pkg/Ext4Dxe/Directory.c b/Features/Ext4Pkg/Ext4Dxe/Directory.c
index 6ed664fc632f..ba781bad968c 100644
--- a/Features/Ext4Pkg/Ext4Dxe/Directory.c
+++ b/Features/Ext4Pkg/Ext4Dxe/Directory.c
@@ -1,7 +1,7 @@
 /** @file
   Directory related routines

-  Copyright (c) 2021 Pedro Falcato All rights reserved.
+  Copyright (c) 2021 - 2023 Pedro Falcato All rights reserved.
   SPDX-License-Identifier: BSD-2-Clause-Patent
 **/

@@ -16,8 +16,9 @@
   @param[in]      Entry   Pointer to a EXT4_DIR_ENTRY.
   @param[out]      Ucs2FileName   Pointer to an array of CHAR16's, of size EXT4_NAME_MAX + 1.

-  @retval EFI_SUCCESS   The filename was succesfully retrieved and converted to UCS2.
-  @retval !EFI_SUCCESS  Failure.
+  @retval EFI_SUCCESS            The filename was succesfully retrieved and converted to UCS2.
+  @retval EFI_INVALID_PARAMETER  The filename is not valid UTF-8.
+  @retval !EFI_SUCCESS           Failure.
 **/
 EFI_STATUS
 Ext4GetUcs2DirentName (
@@ -174,10 +175,16 @@ Ext4RetrieveDirent (
        * need to form valid ASCII/UTF-8 sequences.
        */
       if (EFI_ERROR (Status)) {
-        // If we error out, skip this entry
-        // I'm not sure if this is correct behaviour, but I don't think there's a precedent here.
-        BlockOffset += Entry->rec_len;
-        continue;
+        if (Status == EFI_INVALID_PARAMETER) {
+          // If we error out due to a bad UTF-8 sequence (see Ext4GetUcs2DirentName), skip this entry.
+          // I'm not sure if this is correct behaviour, but I don't think there's a precedent here.
+          BlockOffset += Entry->rec_len;
+          continue;
+        }
+
+        // Other sorts of errors should just error out.
+        FreePool (Buf);
+        return Status;
       }

       if ((Entry->name_len == StrLen (Name)) &&
@@ -436,6 +443,7 @@ Ext4ReadDir (
   EXT4_FILE       *TempFile;
   BOOLEAN         ShouldSkip;
   BOOLEAN         IsDotOrDotDot;
+  CHAR16          DirentUcs2Name[EXT4_NAME_MAX + 1];

   DirIno = File->Inode;
   Status = EFI_SUCCESS;
@@ -505,6 +513,21 @@ Ext4ReadDir (
       continue;
     }

+    // Test if the dirent is valid utf-8. This is already done inside Ext4OpenDirent but EFI_INVALID_PARAMETER
+    // has the danger of its meaning being overloaded in many places, so we can't skip according to that.
+    // So test outside of it, explicitly.
+    Status = Ext4GetUcs2DirentName (&Entry, DirentUcs2Name);
+
+    if (EFI_ERROR (Status)) {
+      if (Status == EFI_INVALID_PARAMETER) {
+        // Bad UTF-8, skip.
+        Offset += Entry.rec_len;
+        continue;
+      }
+
+      goto Out;
+    }
+
     Status = Ext4OpenDirent (Partition, EFI_FILE_MODE_READ, &TempFile, &Entry, File);

     if (EFI_ERROR (Status)) {
diff --git a/Features/Ext4Pkg/Ext4Dxe/Ext4Dxe.h b/Features/Ext4Pkg/Ext4Dxe/Ext4Dxe.h
index 466e49523030..41779dad855f 100644
--- a/Features/Ext4Pkg/Ext4Dxe/Ext4Dxe.h
+++ b/Features/Ext4Pkg/Ext4Dxe/Ext4Dxe.h
@@ -944,11 +944,11 @@ Ext4StrCmpInsensitive (
   Retrieves the filename of the directory entry and converts it to UTF-16/UCS-2

   @param[in]      Entry   Pointer to a EXT4_DIR_ENTRY.
-  @param[out]      Ucs2FileName   Pointer to an array of CHAR16's, of size
-EXT4_NAME_MAX + 1.
+  @param[out]      Ucs2FileName   Pointer to an array of CHAR16's, of size EXT4_NAME_MAX + 1.

-  @retval EFI_SUCCESS   Unicode collation was successfully initialised.
-  @retval !EFI_SUCCESS  Failure.
+  @retval EFI_SUCCESS            The filename was succesfully retrieved and converted to UCS2.
+  @retval EFI_INVALID_PARAMETER  The filename is not valid UTF-8.
+  @retval !EFI_SUCCESS           Failure.
 **/
 EFI_STATUS
 Ext4GetUcs2DirentName (
-- 
2.39.0
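
Illustrative note (not part of the patch): the policy described above is "skip a
dirent whose name is not valid UTF-8, but propagate any other failure". A minimal
standalone sketch of that control flow follows. Every identifier in it
(SKETCH_STATUS, SKETCH_DIRENT, SketchIsValidUtf8, SketchConvertName,
SketchListDir) is hypothetical and only stands in for the edk2 types and
functions touched by the diff; the UTF-8 check is a generic validator and is
not claimed to match what Ext4GetUcs2DirentName does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for EFI status codes; not the edk2 definitions. */
typedef enum {
  SKETCH_SUCCESS = 0,
  SKETCH_INVALID_PARAMETER, /* plays the role of EFI_INVALID_PARAMETER */
  SKETCH_DEVICE_ERROR       /* plays the role of any other failure */
} SKETCH_STATUS;

/* Hypothetical, simplified directory entry. */
typedef struct {
  const uint8_t *name;
  size_t         name_len;
  int            io_fails;  /* non-zero simulates a failure unrelated to UTF-8 */
} SKETCH_DIRENT;

/* Generic UTF-8 validator: rejects stray or missing continuation bytes,
   overlong encodings, surrogates and code points above U+10FFFF. */
static int SketchIsValidUtf8(const uint8_t *s, size_t len)
{
  size_t i = 0;

  while (i < len) {
    uint8_t  b = s[i];
    size_t   need, k;
    uint32_t cp, min;

    if (b < 0x80) {
      i++;
      continue;
    } else if ((b & 0xE0) == 0xC0) {
      need = 1; cp = b & 0x1F; min = 0x80;
    } else if ((b & 0xF0) == 0xE0) {
      need = 2; cp = b & 0x0F; min = 0x800;
    } else if ((b & 0xF8) == 0xF0) {
      need = 3; cp = b & 0x07; min = 0x10000;
    } else {
      return 0; /* continuation byte or invalid lead byte */
    }

    if (len - i - 1 < need) {
      return 0; /* sequence truncated */
    }

    for (k = 1; k <= need; k++) {
      if ((s[i + k] & 0xC0) != 0x80) {
        return 0;
      }
      cp = (cp << 6) | (uint32_t)(s[i + k] & 0x3F);
    }

    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
      return 0;
    }

    i += need + 1;
  }

  return 1;
}

/* Stand-in for the name conversion step: only reports a status. */
static SKETCH_STATUS SketchConvertName(const SKETCH_DIRENT *e)
{
  if (e->io_fails) {
    return SKETCH_DEVICE_ERROR;
  }
  return SketchIsValidUtf8(e->name, e->name_len) ? SKETCH_SUCCESS
                                                 : SKETCH_INVALID_PARAMETER;
}

/* Counts listable entries: bad UTF-8 is skipped, other errors abort. */
SKETCH_STATUS SketchListDir(const SKETCH_DIRENT *entries, size_t count,
                            size_t *listed)
{
  size_t i;

  assert(entries != NULL && listed != NULL);
  *listed = 0;

  for (i = 0; i < count; i++) {
    SKETCH_STATUS st = SketchConvertName(&entries[i]);

    if (st == SKETCH_INVALID_PARAMETER) {
      continue;  /* bad UTF-8: skip this entry only */
    }
    if (st != SKETCH_SUCCESS) {
      return st; /* anything else: stop and report */
    }
    (*listed)++;
  }

  return SKETCH_SUCCESS;
}
```

The point of checking the name conversion status separately, as the ReadDir hunk
does, is that the "skip" decision is tied to exactly one call whose failure mode
is known, rather than to a status code that a larger helper might return for
unrelated reasons.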