From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from AUS01-SY4-obe.outbound.protection.outlook.com (AUS01-SY4-obe.outbound.protection.outlook.com [40.92.62.162]) by mx.groups.io with SMTP id smtpd.web11.5054.1635381816541420543 for ; Wed, 27 Oct 2021 17:43:37 -0700 Authentication-Results: mx.groups.io; dkim=pass header.i=@outlook.com header.s=selector1 header.b=ho+kxQO2; spf=pass (domain: outlook.com, ip: 40.92.62.162, mailfrom: atmgnd@outlook.com) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=PxwVmT4B9fW+ySMZg2o985UOi2wAWwF715tnTeRmpnzcmtRoyJLJVztYiCdltt4OiaHWMSD9G+r6+wwqBbNlaHhdErgiITguTj9AQhIlMWM6qwPEK5fzlFVsKa8UoooTX7BaZpg+Gsm4ixfFjTsF/aCYBeLxyWdDpa9YWeq7CoCiFKEdQdjJwa88ko4Ur4ZK4k84QayHdue9FB1qR3vrxcdeiQ506vVH82RsGJmkCLfRvkq+TNqORwGmuk3RLxa6kg9d0rJR0FjD+tQQRpWFIBAHEFWtNCqN+t4bi1iyCl7J1n59naYIxLrRJhST+O+FdpjEnlDiWF3NQbFY5z6zRw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=jJwD0Bb/XadXR5cd2lPow68Ro+iDCiGQnJP+/Q8SPpE=; b=P+2TWu+iz2H4IRVzpeSn42KkdE81cjNlJevvEMs+oL1XlqNHtdA5+sY0ktNXJDfgnGo67DKtrqvFcFuHNRlCjKcNlAj93NDSuB1QIdrMcJKI+Uv+E4j2Sn8quVzw9EatdhvCqmy6Tffhurk1PcYpyOfO+FShzPqNS5QWbhauyVczaW/Z0fWKtP86H4HI+6F2F/HWi72ars80SMKl97Bn43cOMeX24+OM9ndQ11rqBnFCfWN9zzC/Vcgcj56f6K0tc1kEH6AVuA4r4hPHhMx3ZaEMspfZcJG9o74OE5+abdu6eAZRvCv0oAJlDS/KBH51Yu4cmnY3CS+1TMkZFovQVw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=none; dmarc=none; dkim=none; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=outlook.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=jJwD0Bb/XadXR5cd2lPow68Ro+iDCiGQnJP+/Q8SPpE=; b=ho+kxQO2nUYQV8PiufW91C5VLpLbeRKC3SFLeXduniqroiMs1nylzOg4mpnHVwXAI95HLvBlLpSCCPHX6ItmpSUqnF10Sqj9anYHJvNgLcC75EnEoU2rETHvQEDT/exxvT7L1q8F6v2EQUJ4L5YysbJ+T/KK4llmcRXP9+CBpu/ttocfxMAmqzOzz1qGQ/v4akh6KvceOJ5QHnMbWgbXj6yjcw5wLk8M9CtbdJHAvubUV3fmx2DA/4kZJNWV+aeqdvrffT9dIbLxbWWUwkJtYT/o0u0/I5uzD0eNFuZjMioKauRmHv8vSJKdRJR6q6jk4vUEurRvrdP25Cx0+xPtiQ== Received: from SYZP282MB3252.AUSP282.PROD.OUTLOOK.COM (2603:10c6:10:169::6) by SY4P282MB4175.AUSP282.PROD.OUTLOOK.COM (2603:10c6:10:1ca::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16; Thu, 28 Oct 2021 00:43:32 +0000 Received: from SYZP282MB3252.AUSP282.PROD.OUTLOOK.COM ([fe80::c917:8548:a196:8972]) by SYZP282MB3252.AUSP282.PROD.OUTLOOK.COM ([fe80::c917:8548:a196:8972%7]) with mapi id 15.20.4649.015; Thu, 28 Oct 2021 00:43:32 +0000 From: "qi zhou" To: Pedro Falcato CC: "devel@edk2.groups.io" Subject: Re: [PATCH][Ext4Pkg] unwritten extent suuport Thread-Topic: [PATCH][Ext4Pkg] unwritten extent suuport Thread-Index: AQHXyzcMbiIWfYFzgEiW74tSJAK0Bqvm7xYAgAANeYCAAGHQAIAANMca Date: Thu, 28 Oct 2021 00:43:32 +0000 Message-ID: References: In-Reply-To: Accept-Language: en-US, zh-CN X-MS-Has-Attach: X-MS-TNEF-Correlator: suggested_attachment_session_id: 84905c77-941a-d0e3-7631-4f3a88bab3d9 x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: [aqrAHXeYQDkSOcfKBOxmcZL2GPNC+2bv] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: c48a1cbf-9347-427d-f181-08d999abf7d5 x-ms-traffictypediagnostic: SY4P282MB4175: x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: j9Wbn75UyxbLoj04C1CaXxDTxRJ4c3MgCTFrr2yxtX5ZipcZsQs92n38QgjwU9YlL8ktDPTviEoiC1VI2tilUEh2zqfRZuwhSB4DWu839zv676voNAbPuypcZiHKnG2hISyXkMfpOFFAZXoP+a3/XYxKQjr2W5DUW1gCvwgfeYL9dlRzzyeVLb9DU8/30x8j7TjVqkN4+1v+NA1G/8rshr7Ep9rCq9V7z+oKSp1/XG/FMIobqm8TAgWIk29+MUyMkHo0h3sTh9QA4qDqDlwtiTW8Z7H6AfvsSuCiomayF+htP8F5y9u6Yaa0+MXEyIqexVQaGfqCwpg8ExJsAdWh4vKDc43RrM0ZHxODTFxDTJBUvTkJVRR1nZfqRk5xWDtBfeWqy0h7S64r5DYczP/KCfmeF0DFt+2ot5cfIhg7nG2Mapz2KXu5JkMvnhgxnMerR8s936x32u3Dnw+cWW+4hlEnKfaUI97/+b9FNp/ZK0JVrEQWsBEbJuD6IgR5u/y3UxeLyHCLb1j5HrIilVvJ5236cA4ytO8NyFsCKpVU5lz5vv6aVWno1JFEOYPUq1wCZr3zurAr2YlwylC5WzlAZHhURYP6dJJRkjQBhCT6D0likALpU3SRzAeqNVERN6rb x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: +3hF1VAO2Q7udOt2JGyitPopThOCagIoooWxsUc8vZW14Mc3uDz2Xjk5X74L8M6BSAk9YBcopuufWd0L/c++TqAMPOg6yZE60OwVXLxmm0GAvCAGgPVDjpn1IduaJwlBh0TP9ydkUN8imbqV4hfMjlARFWOmXQk79KQOmYbTgJ2XoxuzG74ugbsgLJljEqtOiN2tpVBsA6glp8vWHW88kotqsUFr8MFHJJDCrKbJg8VrpFez5V5FtorQ3bTxWzG6aNpfg1CxAVgcgVZVXWM5d1goIK3Bwhe/xenj5ldN1V57q8VqW7InTf+3YPwBtT9cBW1bRWwNV4qqYIXgMEf0EVea4OimAAa2DhxAgvyeH0X8UyMl0n3rpIWvfmYnrSBPPScuCcXsG42HgWntoMND7OP4PLxZsGQNiQbYL/NyQQcM14zgoic6sSfOWOSsh0pfZ67Sz59Cv2beG5qgJbJ8dVAtL4cyM9pLT0VmhvrIWAMeWPeYk9f/vNUlm6LA5akoHv9iwCY+vmPaopULuaFnWddmCE2neaUbmhhr0Nx+wQmDOs1Rt39MaIGcW2gWx6uVDEAoaCILs9Rr9y+xwC+ah7dGZhdhDFi3rYJOPplHfkrJugEkNLNtJJG5Qjfmbjs0E3KHqr179/1TN92ZPd80Z8sp+WXcpyNt/AwFrydfBK6CvEQXHnehHbULJ0k+NPy05I5fbw8N0MbTeIXcPWiODA== MIME-Version: 1.0 X-OriginatorOrg: outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: SYZP282MB3252.AUSP282.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-Network-Message-Id: c48a1cbf-9347-427d-f181-08d999abf7d5 X-MS-Exchange-CrossTenant-originalarrivaltime: 28 Oct 2021 00:43:32.5250 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-CrossTenant-rms-persistedconsumerorg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-Transport-CrossTenantHeadersStamped: SY4P282MB4175 Content-Language: en-US Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable This line may do come form linux kernel, As you can see in the first=0A= link I refers says this number (1UL << 15) is kind of magic number. If=0A= you write somethimg linux standanrded, It is hard to keep abosultely no=0A= any linux involued=0A= I think even freebsd has some code from linux, like the second link I=0A= posted, the freebsd's ext4_ext_get_actual_len and EXT_INIT_MAX_LEN are=0A= exactly the same as linux=0A= =0A= It is ok if it's considering as not mergeable, I think it is also good=0A= just as a reference on the mailinng list, to those people who need to=0A= read very large files=0A= =0A= The debug/fix process, I described here=0A= =0A= on the first, I use vbox's ext4 uefi driver to read large files, but=0A= failed on verfication use some tools I writed, I share it here=0A= md5sum.efi: https://1drv.ms/u/s!As-Ec5SPH0fuillwxhIsePY0KBla?e=3DWzHaBf=0A= diff.efi: https://1drv.ms/u/s!As-Ec5SPH0fuilgMwlg6yNQOFCD1?e=3DGVoKuH=0A= =0A= then I googed to for replacement(on the first, I dont plat to fixed it=0A= myself), But no luck, the all fails on large file read verfication. But=0A= I noticed the performance of edk2-platforms's ext4 driver is most best=0A= of all those uefi ext4 drivers=0A= I did not found a working one, so I need to fix it. First I did some=0A= guess and research, and then I added=0A= some logs dump the edk2's read extents to serial on data that did't not=0A= match, (the diff.efi tool I write will stop reading when data dismatch)=0A= I compare those log dump to linux's 'filefrag -v"'s ouput, It is easy=0A= to found the difference, then I google=0A= to find the logic about unwritten extents, then did the fix=0A= =0A= =0A= From: Pedro Falcato =0A= Sent: Thursday, October 28, 2021 5:34=0A= To: QiZhou =0A= Cc: devel@edk2.groups.io =0A= Subject: Re: [PATCH][Ext4Pkg] unwritten extent suuport =0A= =A0=0A= Hi Qi,=0A= =0A= If you didn't use the Linux kernel (nor the documentation) as a reference, = can you please tell me what you've used? I'm asking because there's at leas= t a line that's suspiciously similar to Linux's code:=0A= =0A= #define EXTENT_INIT_MAX_LEN (1UL << 15)=0A= =0A= the UL looks redundant to me, since there's no need for it.=0A= =0A= Also, I prefer that you fix the typos yourself and format the patch correct= ly, including the code.=0A= =0A= =0A= On Wed, Oct 27, 2021 at 4:45 PM QiZhou wrote:=0A= 1. I am not familiar with freebsd, and don know if freebsd get the same iss= ue,=0A= But I do found the freebsd has some code snippets related to unwritten exte= nt,=0A= see: https://github.com/freebsd/freebsd-src/blob/b3f46656393f5c8a6e8305afeb= 5e8c3638025c26/sys/fs/ext2fs/ext2_extents.h#L37=0A= https://github.com/freebsd/freebsd-src/blob/b3f46656393f5c8a6e8305afeb5e8c3= 638025c26/sys/fs/ext2fs/ext2_extents.c#L1347=0A= Is Ext4 is freebsd's default/major file system ?=0A= =0A= 2. I did't look at linux kernel(ext4) berfor send this patch, I cant=0A= found any offcial document, so I refer to linux source as a standand=0A= when send this patch=0A= =0A= 3. Yes, unwritten extents are wild used, usally when a file cotains many=0A= zeros, or mark file holes(fallocate, qemu-img...)=0A= You can generate a file contains a lot of unwritten extents by qemu, for=0A= example:=0A= qemu-img convert -f raw -O qcow2 win10.img win10.qcow2=0A= # win10.img's size: 10G=0A= But for files do not have any continuous zeros, like compressed files,=0A= then there will be no any unwritten extents=0A= unwritten extents are usally seen in very large files=0A= =0A= 4. You can fix the typos, My English is not so good=0A= =0A= On Oct 27 2021, at 10:56 pm, Pedro Falcato wrote:= =0A= =0A= > Hi,=0A= >=A0 =0A= > The patch looks OK despite the typos and lack of proper formatting on=0A= > the commit message.=0A= >=A0 =0A= > But honestly, I don't know if this patch is even mergeable considering=0A= > you looked at the Linux kernel's source code for this. The patch was=0A= > already trivial enough=0A= > if you looked at the documentation and the FreeBSD driver (as I had=0A= > done in the past but never got to fixing this considering I don't even=0A= > know if unwritten extents can appear in the wild).=0A= >=A0 =0A= > I *cannot* stress this enough: Ext4Pkg is a clean room implementation=0A= > of ext4 licensed under the BSD-2-Clause-Patent license (which is NOT=0A= > compatible with GPLv2) and cannot have random Linux kernel=0A= > bits, or any other incompatibly-licensed project's bits for that matter.= =0A= >=A0 =0A= > Best regards,=0A= > Pedro=0A= >=A0 =0A= >=A0 =0A= >> On Wed, Oct 27, 2021 at 2:37 PM qi zhou wrote:=0A= >>=A0 =0A= >>> From: "Qi Zhou" =0A= >>> Subject: [PATCH] unwritten extent suuport=0A= >>>=A0 =0A= >>> the real lenght of uninitialized/unwritten extent should be (ee_len=0A= >>> - (1UL << 15)), and=0A= >>> all related block should been read as zeros. see:=0A= >>> https://github.com/torvalds/linux/blob/d25f27432f80a800a3592db128254c81= 40bd71bf/fs/ext4/ext4_extents.h#L156=0A= >>>=A0 =0A= >>> Signed-off-by: Qi Zhou =0A= >>> ---=0A= >>> =A0Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h | 5 +++++=0A= >>> =A0Features/Ext4Pkg/Ext4Dxe/Extents.c=A0 | 4 ++--=0A= >>> =A0Features/Ext4Pkg/Ext4Dxe/Inode.c=A0 =A0 | 5 +++++=0A= >>> =A03 files changed, 12 insertions(+), 2 deletions(-)=0A= >>>=A0 =0A= >>> diff --git a/Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h b/Features/Ext4Pkg/Ext= 4Dxe/Ext4Disk.h=0A= >>> index 070eb5a..7ca8eee 100644=0A= >>> --- a/Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h=0A= >>> +++ b/Features/Ext4Pkg/Ext4Dxe/Ext4Disk.h=0A= >>> @@ -402,6 +402,11 @@ typedef struct {=0A= >>>=A0 =0A= >>> =A0#define EXT4_MIN_DIR_ENTRY_LEN=A0 8=0A= >>>=A0 =0A= >>> +#define EXTENT_INIT_MAX_LEN (1UL << 15)=0A= >>> +=0A= >>> +#define EXTENT_REAL_LEN(x) ((UINT16)(x <=3D EXTENT_INIT_MAX_LEN ? x := =0A= >>> (x - EXTENT_INIT_MAX_LEN)))=0A= >>> +#define EXTENT_IS_UNWRITTEN(x) (x > EXTENT_INIT_MAX_LEN)=0A= >>> +=0A= >>> =A0// This on-disk structure is present at the bottom of the extent tre= e=0A= >>> =A0typedef struct {=0A= >>> =A0 =A0// First logical block=0A= >>> diff --git a/Features/Ext4Pkg/Ext4Dxe/Extents.c b/Features/Ext4Pkg/Ext4= Dxe/Extents.c=0A= >>> index 5fa2fe0..21af573 100644=0A= >>> --- a/Features/Ext4Pkg/Ext4Dxe/Extents.c=0A= >>> +++ b/Features/Ext4Pkg/Ext4Dxe/Extents.c=0A= >>> @@ -332,7 +332,7 @@ Ext4GetExtent (=0A= >>> =A0 =A0 =A0return EFI_NO_MAPPING;=0A= >>> =A0 =A0}=0A= >>>=A0 =0A= >>> -=A0 if (!(LogicalBlock >=3D Ext->ee_block && Ext->ee_block +=0A= >>> Ext->ee_len > LogicalBlock)) {=0A= >>> +=A0 if (!(LogicalBlock >=3D Ext->ee_block && Ext->ee_block +=0A= >>> EXTENT_REAL_LEN(Ext->ee_len) > LogicalBlock)) {=0A= >>> =A0 =A0 =A0// This extent does not cover the block=0A= >>> =A0 =A0 =A0if (Buffer !=3D NULL) {=0A= >>> =A0 =A0 =A0 =A0FreePool (Buffer);=0A= >>> @@ -413,7 +413,7 @@ Ext4ExtentsMapKeyCompare (=0A= >>> =A0 =A0Extent =3D UserStruct;=0A= >>> =A0 =A0Block=A0 =3D (UINT32)(UINTN)StandaloneKey;=0A= >>>=A0 =0A= >>> -=A0 if (Block >=3D Extent->ee_block && Block < Extent->ee_block +=0A= >>> Extent->ee_len) {=0A= >>> +=A0 if (Block >=3D Extent->ee_block && Block < Extent->ee_block +=0A= >>> EXTENT_REAL_LEN(Extent->ee_len)) {=0A= >>> =A0 =A0 =A0return 0;=0A= >>> =A0 =A0}=0A= >>>=A0 =0A= >>> diff --git a/Features/Ext4Pkg/Ext4Dxe/Inode.c b/Features/Ext4Pkg/Ext4Dx= e/Inode.c=0A= >>> index 63cecec..d691ec7 100644=0A= >>> --- a/Features/Ext4Pkg/Ext4Dxe/Inode.c=0A= >>> +++ b/Features/Ext4Pkg/Ext4Dxe/Inode.c=0A= >>> @@ -151,6 +151,11 @@ Ext4Read (=0A= >>> =A0 =A0 =A0 =A0// Potential improvement: In the future, we could get th= e=0A= >>> hole's tota=0A= >>> =A0 =A0 =A0 =A0// size and memset all that=0A= >>> =A0 =A0 =A0 =A0SetMem (Buffer, WasRead, 0);=0A= >>> +=A0 =A0 } else if(EXTENT_IS_UNWRITTEN(Extent.ee_len)) {=0A= >>> +=A0 =A0 =A0 HoleOff =3D CurrentSeek - (UINT64)Extent.ee_block * Partit= ion->BlockSize;=0A= >>> +=A0 =A0 =A0 HoleLen =3D EXTENT_REAL_LEN(Extent.ee_len) *=0A= >>> Partition->BlockSize - HoleOff;=0A= >>> +=A0 =A0 =A0 WasRead =3D HoleLen > RemainingRead ? RemainingRead : Hole= Len;=0A= >>> +=A0 =A0 =A0 SetMem (Buffer, WasRead, 0);=0A= >>> =A0 =A0 =A0} else {=0A= >>> =A0 =A0 =A0 =A0ExtentStartBytes =3D MultU64x32 (=0A= >>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 LShiftU64 (Exte= nt.ee_start_hi, 32) |=0A= >>> --=0A= >>> 2.17.1=0A= >>>=A0 =0A= >=A0 =0A= >=A0 =0A= > --=0A= >=A0 =0A= > Pedro Falcato=0A= =0A= =0A= -- =0A= Pedro Falcato=