From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-ua1-f43.google.com (mail-ua1-f43.google.com [209.85.222.43]) by mx.groups.io with SMTP id smtpd.web11.2900.1635370473622637759 for ; Wed, 27 Oct 2021 14:34:33 -0700 Authentication-Results: mx.groups.io; dkim=pass header.i=@gmail.com header.s=20210112 header.b=AEIoBpa3; spf=pass (domain: gmail.com, ip: 209.85.222.43, mailfrom: pedro.falcato@gmail.com) Received: by mail-ua1-f43.google.com with SMTP id f4so7683105uad.4 for ; Wed, 27 Oct 2021 14:34:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=N3nZMKyAQt7MPKsQvW/FtEttF8fgO2L3zxeRnofO7V8=; b=AEIoBpa3w4ZE2tdbYZJL4VmxCzjvMlbSP7FbVgGHUqxylqMMh9Uzza5ocsRrYW1Q7k SCJJ2R91o5x/3atS/sxKJsBuVgf1OOXW5K4eUMvQywnFjTv9vKmTFvMpxM1dSJjzeWyE BbyYE3tGfpl2XXNBwMatD9ZsKgXtf7cPmOg+ZcT6BMhH3fyIobzoiu3WBOJeqAXGMtDF YmD7jyNwilie8mZJ+x7mDJbJFD6wIIOmWPcetB0RF3dd4/bsFc+SIxDDNQceLlesvrCa VTjqJroJfnKphDQbHcBEYQnsRZjQxhXp/vFbvxLqSKE0nrExH2sB3qSwLQ9bpO5WQXit lhPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=N3nZMKyAQt7MPKsQvW/FtEttF8fgO2L3zxeRnofO7V8=; b=pYyufrreQcdk9mNxyBrUNduEc3VrVMchrWh9IkwI+jftzXuVAvHkhiQyd+mc1+tXdL bMfjjDb+G8iHxvyIUMXS73xKyirbp7AU+AQ52QPZFpgaB2lZhOnHQd74NZM9ywisMxSz rqvn2cycVPD96HUI/zJk8rStlIdNmReZLPYrTVceq2iKvRuosxRxIDA4auOkmDWtPcNd iH6pkfD668vPSq4DM3T4lfTs7IILR3S8/Y7X2Df6FlY/fWMZhfLtzL9awbLQ0lMbFpNH O5fQUbdA99r/b96B8hIsu4PyejoCYQKew3Wc4AH8vcvRrhG3JLpkCOUbPdWTNgvNcO+f 7YZQ== X-Gm-Message-State: AOAM530z4y1yJUXqw0ny6lAfXL7/vYLaYvI3UNB0bh4A8Rk/uvw/LkGB g6kkvQ2AzX4wI1r0+lDrPt4RUT4bqFwVGcJ6OILisPmEP8xz+Q== X-Google-Smtp-Source: ABdhPJywow7o3ziDzZhpPsbMXjultu2fVdeavz5nRBJ4PjviIqY+HCwFgbXftDUGzigzFrgulojdIJu6gVkC4631FVo= X-Received: by 2002:ab0:1613:: with SMTP id k19mr377542uae.135.1635370472743; Wed, 27 Oct 2021 14:34:32 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: "Pedro Falcato" Date: Wed, 27 Oct 2021 22:34:20 +0100 Message-ID: Subject: Re: [PATCH][Ext4Pkg] unwritten extent suuport To: QiZhou Cc: "devel@edk2.groups.io" Content-Type: multipart/alternative; boundary="0000000000001aa56f05cf5c5cb2" --0000000000001aa56f05cf5c5cb2 Content-Type: text/plain; charset="UTF-8" Hi Qi, If you didn't use the Linux kernel (nor the documentation) as a reference, can you please tell me what you've used? I'm asking because there's at least a line that's suspiciously similar to Linux's code: #define EXTENT_INIT_MAX_LEN (1UL << 15) the UL looks redundant to me, since there's no need for it. Also, I prefer that you fix the typos yourself and format the patch correctly, including the code. On Wed, Oct 27, 2021 at 4:45 PM QiZhou wrote: > 1. I am not familiar with freebsd, and don know if freebsd get the same > issue, > But I do found the freebsd has some code snippets related to unwritten > extent, > see: > https://github.com/freebsd/freebsd-src/blob/b3f46656393f5c8a6e8305afeb5e8c3638025c26/sys/fs/ext2fs/ext2_extents.h#L37 > > https://github.com/freebsd/freebsd-src/blob/b3f46656393f5c8a6e8305afeb5e8c3638025c26/sys/fs/ext2fs/ext2_extents.c#L1347 > Is Ext4 is freebsd's default/major file system ? > > 2. I did't look at linux kernel(ext4) berfor send this patch, I cant > found any offcial document, so I refer to linux source as a standand > when send this patch > > 3. Yes, unwritten extents are wild used, usally when a file cotains many > zeros, or mark file holes(fallocate, qemu-img...) > You can generate a file contains a lot of unwritten extents by qemu, for > example: > qemu-img convert -f raw -O qcow2 win10.img win10.qcow2 > # win10.img's size: 10G > But for files do not have any continuous zeros, like compressed files, > then there will be no any unwritten extents > unwritten extents are usally seen in very large files > > 4. You can fix the typos, My English is not so good > > On Oct 27 2021, at 10:56 pm, Pedro Falcato > wrote: > > > Hi, > > > > The patch looks OK despite the typos and lack of proper formatting on > > the commit message. > > > > But honestly, I don't know if this patch is even mergeable considering > > you looked at the Linux kernel's source code for this. The patch was > > already trivial enough > > if you looked at the documentation and the FreeBSD driver (as I had > > done in the past but never got to fixing this considering I don't even > > know if unwritten extents can appear in the wild). > > > > I *cannot* stress this enough: Ext4Pkg is a clean room implementation > > of ext4 licensed under the BSD-2-Clause-Patent license (which is NOT > > compatible with GPLv2) and cannot have random Linux kernel > > bits, or any other incompatibly-licensed project's bits for that matter. > > > > Best regards, > > Pedro > > > > > >> On Wed, Oct 27, 2021 at 2:37 PM qi zhou wrote: > >> > >>> From: "Qi Zhou" > >>> Subject: [PATCH] unwritten extent suuport > >>> > >>> the real lenght of uninitialized/unwritten extent should be (ee_len > >>> - (1UL << 15)), and > >>> all related block should been read as zeros. see: > >>> > https://github.com/torvalds/linux/blob/d25f27432f80a800a3592db128254c8140bd71bf/fs/ext4/ext4_extents.h#L156 > >>> > >>> Signed-off-by: Qi Zhou > >>> --- > >>> Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h | 5 +++++ > >>> Features/Ext4Pkg/Ext4Dxe/Extents.c | 4 ++-- > >>> Features/Ext4Pkg/Ext4Dxe/Inode.c | 5 +++++ > >>> 3 files changed, 12 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h > b/Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h > >>> index 070eb5a..7ca8eee 100644 > >>> --- a/Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h > >>> +++ b/Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h > >>> @@ -402,6 +402,11 @@ typedef struct { > >>> > >>> #define EXT4_MIN_DIR_ENTRY_LEN 8 > >>> > >>> +#define EXTENT_INIT_MAX_LEN (1UL << 15) > >>> + > >>> +#define EXTENT_REAL_LEN(x) ((UINT16)(x <= EXTENT_INIT_MAX_LEN ? x : > >>> (x - EXTENT_INIT_MAX_LEN))) > >>> +#define EXTENT_IS_UNWRITTEN(x) (x > EXTENT_INIT_MAX_LEN) > >>> + > >>> // This on-disk structure is present at the bottom of the extent tree > >>> typedef struct { > >>> // First logical block > >>> diff --git a/Features/Ext4Pkg/Ext4Dxe/Extents.c > b/Features/Ext4Pkg/Ext4Dxe/Extents.c > >>> index 5fa2fe0..21af573 100644 > >>> --- a/Features/Ext4Pkg/Ext4Dxe/Extents.c > >>> +++ b/Features/Ext4Pkg/Ext4Dxe/Extents.c > >>> @@ -332,7 +332,7 @@ Ext4GetExtent ( > >>> return EFI_NO_MAPPING; > >>> } > >>> > >>> - if (!(LogicalBlock >= Ext->ee_block && Ext->ee_block + > >>> Ext->ee_len > LogicalBlock)) { > >>> + if (!(LogicalBlock >= Ext->ee_block && Ext->ee_block + > >>> EXTENT_REAL_LEN(Ext->ee_len) > LogicalBlock)) { > >>> // This extent does not cover the block > >>> if (Buffer != NULL) { > >>> FreePool (Buffer); > >>> @@ -413,7 +413,7 @@ Ext4ExtentsMapKeyCompare ( > >>> Extent = UserStruct; > >>> Block = (UINT32)(UINTN)StandaloneKey; > >>> > >>> - if (Block >= Extent->ee_block && Block < Extent->ee_block + > >>> Extent->ee_len) { > >>> + if (Block >= Extent->ee_block && Block < Extent->ee_block + > >>> EXTENT_REAL_LEN(Extent->ee_len)) { > >>> return 0; > >>> } > >>> > >>> diff --git a/Features/Ext4Pkg/Ext4Dxe/Inode.c > b/Features/Ext4Pkg/Ext4Dxe/Inode.c > >>> index 63cecec..d691ec7 100644 > >>> --- a/Features/Ext4Pkg/Ext4Dxe/Inode.c > >>> +++ b/Features/Ext4Pkg/Ext4Dxe/Inode.c > >>> @@ -151,6 +151,11 @@ Ext4Read ( > >>> // Potential improvement: In the future, we could get the > >>> hole's tota > >>> // size and memset all that > >>> SetMem (Buffer, WasRead, 0); > >>> + } else if(EXTENT_IS_UNWRITTEN(Extent.ee_len)) { > >>> + HoleOff = CurrentSeek - (UINT64)Extent.ee_block * > Partition->BlockSize; > >>> + HoleLen = EXTENT_REAL_LEN(Extent.ee_len) * > >>> Partition->BlockSize - HoleOff; > >>> + WasRead = HoleLen > RemainingRead ? RemainingRead : HoleLen; > >>> + SetMem (Buffer, WasRead, 0); > >>> } else { > >>> ExtentStartBytes = MultU64x32 ( > >>> LShiftU64 (Extent.ee_start_hi, 32) | > >>> -- > >>> 2.17.1 > >>> > > > > > > -- > > > > Pedro Falcato > -- Pedro Falcato --0000000000001aa56f05cf5c5cb2 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Qi,

If you didn't use= the Linux kernel (nor the documentation) as a reference, can you please te= ll me what you've used? I'm asking because there's at least a l= ine that's suspiciously similar to Linux's code:

#define EXTENT_INIT_MAX_LEN (1UL << 15)

the UL looks redundant to me, since there's no need for it.

Also, I prefer that you fix the typos yourself and format = the patch correctly, including the code.


On Wed, Oct 27= , 2021 at 4:45 PM QiZhou <atmgnd@o= utlook.com> wrote:
1. I am not familiar with freebsd, and don know if freebsd get t= he same issue,
But I do found the freebsd has some code snippets related to unwritten exte= nt,
see: https://github.com/freebsd/freebsd-src/blob/b3f466563= 93f5c8a6e8305afeb5e8c3638025c26/sys/fs/ext2fs/ext2_extents.h#L37
https://github.com/freebsd/freebsd-src/blob/b3f46656393f= 5c8a6e8305afeb5e8c3638025c26/sys/fs/ext2fs/ext2_extents.c#L1347
Is Ext4 is freebsd's default/major file system ?

2. I did't look at linux kernel(ext4) berfor send this patch, I cant found any offcial document, so I refer to linux source as a standand
when send this patch

3. Yes, unwritten extents are wild used, usally when a file cotains many zeros, or mark file holes(fallocate, qemu-img...)
You can generate a file contains a lot of unwritten extents by qemu, for example:
qemu-img convert -f raw -O qcow2 win10.img win10.qcow2
# win10.img's size: 10G
But for files do not have any continuous zeros, like compressed files,
then there will be no any unwritten extents
unwritten extents are usally seen in very large files

4. You can fix the typos, My English is not so good

On Oct 27 2021, at 10:56 pm, Pedro Falcato <pedro.falcato@gmail.com> wrote:

> Hi,
>=C2=A0
> The patch looks OK despite the typos and lack of proper formatting on<= br> > the commit message.
>=C2=A0
> But honestly, I don't know if this patch is even mergeable conside= ring
> you looked at the Linux kernel's source code for this. The patch w= as
> already trivial enough
> if you looked at the documentation and the FreeBSD driver (as I had > done in the past but never got to fixing this considering I don't = even
> know if unwritten extents can appear in the wild).
>=C2=A0
> I *cannot* stress this enough: Ext4Pkg is a clean room implementation<= br> > of ext4 licensed under the BSD-2-Clause-Patent license (which is NOT > compatible with GPLv2) and cannot have random Linux kernel
> bits, or any other incompatibly-licensed project's bits for that m= atter.
>=C2=A0
> Best regards,
> Pedro
>=C2=A0
>=C2=A0
>> On Wed, Oct 27, 2021 at 2:37 PM qi zhou <atmgnd@outlook.com> wrote:
>>=C2=A0
>>> From: "Qi Zhou" <atmgnd@outlook.com>
>>> Subject: [PATCH] unwritten extent suuport
>>>=C2=A0
>>> the real lenght of uninitialized/unwritten extent should be (e= e_len
>>> - (1UL << 15)), and
>>> all related block should been read as zeros. see:
>>> https://github.com/torvalds/linux/blob/d25f27432f80a800= a3592db128254c8140bd71bf/fs/ext4/ext4_extents.h#L156
>>>=C2=A0
>>> Signed-off-by: Qi Zhou <atmgnd@outlook.com>
>>> ---
>>> =C2=A0Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h | 5 +++++
>>> =C2=A0Features/Ext4Pkg/Ext4Dxe/Extents.c=C2=A0 | 4 ++--
>>> =C2=A0Features/Ext4Pkg/Ext4Dxe/Inode.c=C2=A0 =C2=A0 | 5 +++++<= br> >>> =C2=A03 files changed, 12 insertions(+), 2 deletions(-)
>>>=C2=A0
>>> diff --git a/Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h b/Features/Ex= t4Pkg/Ext4Dxe/Ext4Disk.h
>>> index 070eb5a..7ca8eee 100644
>>> --- a/Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h
>>> +++ b/Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h
>>> @@ -402,6 +402,11 @@ typedef struct {
>>>=C2=A0
>>> =C2=A0#define EXT4_MIN_DIR_ENTRY_LEN=C2=A0 8
>>>=C2=A0
>>> +#define EXTENT_INIT_MAX_LEN (1UL << 15)
>>> +
>>> +#define EXTENT_REAL_LEN(x) ((UINT16)(x <=3D EXTENT_INIT_MA= X_LEN ? x :
>>> (x - EXTENT_INIT_MAX_LEN)))
>>> +#define EXTENT_IS_UNWRITTEN(x) (x > EXTENT_INIT_MAX_LEN) >>> +
>>> =C2=A0// This on-disk structure is present at the bottom of th= e extent tree
>>> =C2=A0typedef struct {
>>> =C2=A0 =C2=A0// First logical block
>>> diff --git a/Features/Ext4Pkg/Ext4Dxe/Extents.c b/Features/Ext= 4Pkg/Ext4Dxe/Extents.c
>>> index 5fa2fe0..21af573 100644
>>> --- a/Features/Ext4Pkg/Ext4Dxe/Extents.c
>>> +++ b/Features/Ext4Pkg/Ext4Dxe/Extents.c
>>> @@ -332,7 +332,7 @@ Ext4GetExtent (
>>> =C2=A0 =C2=A0 =C2=A0return EFI_NO_MAPPING;
>>> =C2=A0 =C2=A0}
>>>=C2=A0
>>> -=C2=A0 if (!(LogicalBlock >=3D Ext->ee_block &&= Ext->ee_block +
>>> Ext->ee_len > LogicalBlock)) {
>>> +=C2=A0 if (!(LogicalBlock >=3D Ext->ee_block &&= Ext->ee_block +
>>> EXTENT_REAL_LEN(Ext->ee_len) > LogicalBlock)) {
>>> =C2=A0 =C2=A0 =C2=A0// This extent does not cover the block >>> =C2=A0 =C2=A0 =C2=A0if (Buffer !=3D NULL) {
>>> =C2=A0 =C2=A0 =C2=A0 =C2=A0FreePool (Buffer);
>>> @@ -413,7 +413,7 @@ Ext4ExtentsMapKeyCompare (
>>> =C2=A0 =C2=A0Extent =3D UserStruct;
>>> =C2=A0 =C2=A0Block=C2=A0 =3D (UINT32)(UINTN)StandaloneKey;
>>>=C2=A0
>>> -=C2=A0 if (Block >=3D Extent->ee_block && Block= < Extent->ee_block +
>>> Extent->ee_len) {
>>> +=C2=A0 if (Block >=3D Extent->ee_block && Block= < Extent->ee_block +
>>> EXTENT_REAL_LEN(Extent->ee_len)) {
>>> =C2=A0 =C2=A0 =C2=A0return 0;
>>> =C2=A0 =C2=A0}
>>>=C2=A0
>>> diff --git a/Features/Ext4Pkg/Ext4Dxe/Inode.c b/Features/Ext4P= kg/Ext4Dxe/Inode.c
>>> index 63cecec..d691ec7 100644
>>> --- a/Features/Ext4Pkg/Ext4Dxe/Inode.c
>>> +++ b/Features/Ext4Pkg/Ext4Dxe/Inode.c
>>> @@ -151,6 +151,11 @@ Ext4Read (
>>> =C2=A0 =C2=A0 =C2=A0 =C2=A0// Potential improvement: In the fu= ture, we could get the
>>> hole's tota
>>> =C2=A0 =C2=A0 =C2=A0 =C2=A0// size and memset all that
>>> =C2=A0 =C2=A0 =C2=A0 =C2=A0SetMem (Buffer, WasRead, 0);
>>> +=C2=A0 =C2=A0 } else if(EXTENT_IS_UNWRITTEN(Extent.ee_len)) {=
>>> +=C2=A0 =C2=A0 =C2=A0 HoleOff =3D CurrentSeek - (UINT64)Extent= .ee_block * Partition->BlockSize;
>>> +=C2=A0 =C2=A0 =C2=A0 HoleLen =3D EXTENT_REAL_LEN(Extent.ee_le= n) *
>>> Partition->BlockSize - HoleOff;
>>> +=C2=A0 =C2=A0 =C2=A0 WasRead =3D HoleLen > RemainingRead ?= RemainingRead : HoleLen;
>>> +=C2=A0 =C2=A0 =C2=A0 SetMem (Buffer, WasRead, 0);
>>> =C2=A0 =C2=A0 =C2=A0} else {
>>> =C2=A0 =C2=A0 =C2=A0 =C2=A0ExtentStartBytes =3D MultU64x32 ( >>> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 LShiftU64 (Extent.ee_start_hi, 32) | >>> --
>>> 2.17.1
>>>=C2=A0
>=C2=A0
>=C2=A0
> --
>=C2=A0
> Pedro Falcato


--
Pedro Falcato
--0000000000001aa56f05cf5c5cb2--