From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Pedro Falcato"
To: devel@edk2.groups.io
Cc: Pedro Falcato, Marvin Häuser
Subject: [PATCH 3/3] Ext4Pkg: Fix and clarify handling regarding non-utf8 dir entries
Date: Wed, 11 Jan 2023 23:59:20 +0000
Message-Id: <20230111235920.252317-6-pedro.falcato@gmail.com>
X-Mailer: git-send-email 2.39.0
In-Reply-To: <20230111235920.252317-1-pedro.falcato@gmail.com>
References: <20230111235920.252317-1-pedro.falcato@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously, handling of non-UTF-8 dirent names was inconsistent or
missing altogether. Clarify it.

Signed-off-by: Pedro Falcato
Cc: Marvin Häuser
---
 Features/Ext4Pkg/Ext4Dxe/Directory.c | 37 ++++++++++++++++++++++------
 Features/Ext4Pkg/Ext4Dxe/Ext4Dxe.h   |  8 +++---
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/Features/Ext4Pkg/Ext4Dxe/Directory.c b/Features/Ext4Pkg/Ext4Dxe/Directory.c
index 6ed664fc632f..ba781bad968c 100644
--- a/Features/Ext4Pkg/Ext4Dxe/Directory.c
+++ b/Features/Ext4Pkg/Ext4Dxe/Directory.c
@@ -1,7 +1,7 @@
 /** @file
   Directory related routines
 
-  Copyright (c) 2021 Pedro Falcato All rights reserved.
+  Copyright (c) 2021 - 2023 Pedro Falcato All rights reserved.
 
   SPDX-License-Identifier: BSD-2-Clause-Patent
 **/
@@ -16,8 +16,9 @@
   @param[in]      Entry   Pointer to a EXT4_DIR_ENTRY.
   @param[out]      Ucs2FileName   Pointer to an array of CHAR16's, of size EXT4_NAME_MAX + 1.
 
-  @retval EFI_SUCCESS   The filename was succesfully retrieved and converted to UCS2.
-  @retval !EFI_SUCCESS  Failure.
+  @retval EFI_SUCCESS              The filename was successfully retrieved and converted to UCS2.
+  @retval EFI_INVALID_PARAMETER    The filename is not valid UTF-8.
+  @retval !EFI_SUCCESS             Failure.
 **/
 EFI_STATUS
 Ext4GetUcs2DirentName (
@@ -174,10 +175,16 @@ Ext4RetrieveDirent (
        * need to form valid ASCII/UTF-8 sequences.
        */
       if (EFI_ERROR (Status)) {
-        // If we error out, skip this entry
-        // I'm not sure if this is correct behaviour, but I don't think there's a precedent here.
-        BlockOffset += Entry->rec_len;
-        continue;
+        if (Status == EFI_INVALID_PARAMETER) {
+          // If we error out due to a bad UTF-8 sequence (see Ext4GetUcs2DirentName), skip this entry.
+          // I'm not sure if this is correct behaviour, but I don't think there's a precedent here.
+          BlockOffset += Entry->rec_len;
+          continue;
+        }
+
+        // Other sorts of errors should just error out.
+        FreePool (Buf);
+        return Status;
       }
 
       if ((Entry->name_len == StrLen (Name)) &&
@@ -436,6 +443,7 @@ Ext4ReadDir (
   EXT4_FILE       *TempFile;
   BOOLEAN         ShouldSkip;
   BOOLEAN         IsDotOrDotDot;
+  CHAR16          DirentUcs2Name[EXT4_NAME_MAX + 1];
 
   DirIno = File->Inode;
   Status = EFI_SUCCESS;
@@ -505,6 +513,21 @@ Ext4ReadDir (
       continue;
     }
 
+    // Test if the dirent is valid utf-8. This is already done inside Ext4OpenDirent but EFI_INVALID_PARAMETER
+    // has the danger of its meaning being overloaded in many places, so we can't skip according to that.
+    // So test outside of it, explicitly.
+    Status = Ext4GetUcs2DirentName (&Entry, DirentUcs2Name);
+
+    if (EFI_ERROR (Status)) {
+      if (Status == EFI_INVALID_PARAMETER) {
+        // Bad UTF-8, skip.
+        Offset += Entry.rec_len;
+        continue;
+      }
+
+      goto Out;
+    }
+
     Status = Ext4OpenDirent (Partition, EFI_FILE_MODE_READ, &TempFile, &Entry, File);
 
     if (EFI_ERROR (Status)) {
diff --git a/Features/Ext4Pkg/Ext4Dxe/Ext4Dxe.h b/Features/Ext4Pkg/Ext4Dxe/Ext4Dxe.h
index 466e49523030..41779dad855f 100644
--- a/Features/Ext4Pkg/Ext4Dxe/Ext4Dxe.h
+++ b/Features/Ext4Pkg/Ext4Dxe/Ext4Dxe.h
@@ -944,11 +944,11 @@ Ext4StrCmpInsensitive (
    Retrieves the filename of the directory entry and converts it to UTF-16/UCS-2
 
    @param[in]      Entry   Pointer to a EXT4_DIR_ENTRY.
-   @param[out]      Ucs2FileName   Pointer to an array of CHAR16's, of size
-EXT4_NAME_MAX + 1.
+   @param[out]      Ucs2FileName   Pointer to an array of CHAR16's, of size EXT4_NAME_MAX + 1.
 
-   @retval EFI_SUCCESS   Unicode collation was successfully initialised.
-   @retval !EFI_SUCCESS  Failure.
+   @retval EFI_SUCCESS              The filename was successfully retrieved and converted to UCS2.
+   @retval EFI_INVALID_PARAMETER    The filename is not valid UTF-8.
+   @retval !EFI_SUCCESS             Failure.
 **/
 EFI_STATUS
 Ext4GetUcs2DirentName (
-- 
2.39.0
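
For anyone following the thread without the tree checked out, the error-handling pattern this patch applies (skip an entry only when its name is not valid UTF-8, and propagate every other failure) can be sketched outside of EDK2 roughly as below. This is a standalone illustration under stated assumptions, not the driver code: SKETCH_STATUS, SKETCH_DIRENT, SketchCheckName and SketchWalk are hypothetical names made up for this note, and the UTF-8 check is deliberately simplified (it only looks at lead and continuation byte structure, so it does not reject overlong encodings or surrogates).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for EFI_STATUS values. */
typedef enum {
  SKETCH_SUCCESS = 0,
  SKETCH_INVALID_PARAMETER, /* stands in for EFI_INVALID_PARAMETER (bad UTF-8) */
  SKETCH_DEVICE_ERROR       /* stands in for any other kind of failure */
} SKETCH_STATUS;

/* Hypothetical directory entry: raw name bytes plus a simulated I/O failure. */
typedef struct {
  const unsigned char  *Name;
  size_t               NameLen;
  int                  IoFailure;
} SKETCH_DIRENT;

/* Returns SKETCH_INVALID_PARAMETER only for structurally malformed UTF-8. */
SKETCH_STATUS
SketchCheckName (
  const SKETCH_DIRENT  *Entry
  )
{
  size_t  Index = 0;

  if (Entry->IoFailure) {
    return SKETCH_DEVICE_ERROR;
  }

  while (Index < Entry->NameLen) {
    unsigned char  Lead = Entry->Name[Index];
    size_t         Extra;
    size_t         Cont;

    /* Work out how many continuation bytes the lead byte announces. */
    if (Lead < 0x80) {
      Extra = 0;
    } else if ((Lead & 0xE0) == 0xC0) {
      Extra = 1;
    } else if ((Lead & 0xF0) == 0xE0) {
      Extra = 2;
    } else if ((Lead & 0xF8) == 0xF0) {
      Extra = 3;
    } else {
      return SKETCH_INVALID_PARAMETER;
    }

    /* A sequence that runs past the end of the name is malformed. */
    if (Extra > Entry->NameLen - Index - 1) {
      return SKETCH_INVALID_PARAMETER;
    }

    /* Every continuation byte must look like 10xxxxxx. */
    for (Cont = 1; Cont <= Extra; Cont++) {
      if ((Entry->Name[Index + Cont] & 0xC0) != 0x80) {
        return SKETCH_INVALID_PARAMETER;
      }
    }

    Index += Extra + 1;
  }

  return SKETCH_SUCCESS;
}

/* Counts listable entries: bad UTF-8 is skipped, other errors abort the walk. */
SKETCH_STATUS
SketchWalk (
  const SKETCH_DIRENT  *Entries,
  size_t               Count,
  size_t               *Listed
  )
{
  size_t  Index;

  assert (Entries != NULL || Count == 0);
  assert (Listed != NULL);

  *Listed = 0;

  for (Index = 0; Index < Count; Index++) {
    SKETCH_STATUS  Status = SketchCheckName (&Entries[Index]);

    if (Status == SKETCH_INVALID_PARAMETER) {
      continue; /* Bad UTF-8, skip. */
    }

    if (Status != SKETCH_SUCCESS) {
      return Status; /* Other sorts of errors should just error out. */
    }

    (*Listed)++;
  }

  return SKETCH_SUCCESS;
}
```

In this sketch the name check has a single, unambiguous meaning for the "invalid parameter" result, which is the reason the patch tests the name explicitly in Ext4ReadDir instead of keying off whatever Ext4OpenDirent returns.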