From: "Andrew Fish"
Message-id: <6A3CAE13-52E0-4E2B-8232-3F2AB355185E@apple.com>
Subject: Re: [edk2-devel] [edk2-rfc] RFC: EXT4 filesystem driver
Date: Wed, 21 Jul 2021 19:47:55 -0700
In-reply-to: <000001d77e97$d1089100$7319b300$@byosoft.com.cn>
Cc: rfc@edk2.groups.io, pedro.falcato@gmail.com
To: edk2-devel-groups-io, gaoliming@byosoft.com.cn
References: <000001d77e97$d1089100$7319b300$@byosoft.com.cn>

> On Jul 21, 2021, at 6:20 PM, gaoliming <gaoliming@byosoft.com.cn> wrote:
>
>> -----Original Message-----
>> From: rfc@edk2.groups.io <rfc@edk2.groups.io> On Behalf Of Pedro Falcato
>> Sent: July 22, 2021 7:12
>> To: devel@edk2.groups.io
>> Cc: rfc@edk2.groups.io
>> Subject: [edk2-rfc] RFC: EXT4 filesystem driver

>> EXT4 (fourth extended filesystem) is a filesystem developed for Linux
>> that has been in wide use (desktops, servers, smartphones) since 2008.
>>
>> The Ext4Pkg implements the Simple File System Protocol for a partition
>> that is formatted with the EXT4 file system. This allows UEFI Drivers,
>> UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
>> on an EXT4 partition and supports booting a UEFI OS Loader from an
>> EXT4 partition.
>> This project is one of the TianoCore Google Summer of Code projects.
>>
>> Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
>> UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
>> I/O 2 Protocols, and produces the Simple File System protocol. It
>> allows mounting ext4 filesystems exclusively.

>> Brief overview of EXT4:
>> Layout of an EXT2/3/4 filesystem:
>> (note: this driver has been developed using
>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
>> documentation).
>>
>> An ext2/3/4 filesystem (here on out referred to as simply an ext4
>> filesystem, due to the similarities) is composed of various concepts:
>>
>> 1) Superblock
>> The superblock is the structure near the start of the partition (at an
>> offset of 1024 bytes from the start), and describes the filesystem in
>> general. Here, we get to know the size of the filesystem's blocks, which
>> features it supports or not, whether it's been cleanly unmounted, how
>> many blocks we have, etc.
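
To make the superblock description concrete, here is a minimal sketch of how the magic number and the "cleanly unmounted" state could be checked. It is not taken from Ext4Dxe. The field offsets (s_magic at 0x38, s_state at 0x3A) and the values 0xEF53 and 0x0001 come from the kernel documentation linked above, and all function and macro names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define EXT4_SUPERBLOCK_OFFSET 1024u   /* bytes from the start of the partition */
#define EXT4_S_MAGIC_OFFSET    0x38u   /* s_magic, little-endian 16-bit */
#define EXT4_S_STATE_OFFSET    0x3Au   /* s_state, little-endian 16-bit */
#define EXT4_MAGIC             0xEF53u
#define EXT4_STATE_CLEAN       0x0001u /* set when cleanly unmounted */

/* Read a little-endian 16-bit superblock field from an in-memory copy of
 * the first `len` bytes of the partition. Returns 1 on success, 0 if the
 * buffer is too short to contain the field. */
static int read_sb_le16(const uint8_t *partition, size_t len,
                        size_t field_offset, uint16_t *value)
{
    size_t pos = EXT4_SUPERBLOCK_OFFSET + field_offset;

    if (partition == NULL || len < pos + 2) {
        return 0;
    }
    *value = (uint16_t)(partition[pos] | ((uint16_t)partition[pos + 1] << 8));
    return 1;
}

/* 1 if the superblock carries the ext2/3/4 magic, 0 otherwise. */
static int ext4_has_valid_magic(const uint8_t *partition, size_t len)
{
    uint16_t magic;

    return read_sb_le16(partition, len, EXT4_S_MAGIC_OFFSET, &magic) &&
           magic == EXT4_MAGIC;
}

/* 1 if s_state says the filesystem was cleanly unmounted, 0 otherwise. */
static int ext4_was_cleanly_unmounted(const uint8_t *partition, size_t len)
{
    uint16_t state;

    return read_sb_le16(partition, len, EXT4_S_STATE_OFFSET, &state) &&
           (state & EXT4_STATE_CLEAN) != 0;
}
```

A real driver would read these bytes through Disk I/O rather than from a flat buffer; the buffer only keeps the sketch self-contained.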

>> 2) Block groups
>> EXT4 filesystems are divided into block groups, and each block group covers
>> s_blocks_per_group (8 * Block Size) blocks. Each block group has an
>> associated block group descriptor; these are present directly after the
>> superblock. Each block group descriptor contains the location of the
>> inode table, and the inode and block bitmaps (note these bitmaps are only
>> a block long, which gets us the 8 * Block Size formula covered previously).
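
The 8 * Block Size relation can be worked through in a few lines. This is only a sketch with hypothetical function names. The group lookup additionally assumes the superblock field s_first_data_block (1 on 1 KiB-block filesystems, 0 otherwise), which the text above does not mention but the kernel documentation describes.

```c
#include <stdint.h>

/* A block bitmap is one block long and holds one bit per block in the
 * group, so it can describe 8 * block_size blocks. */
static uint32_t blocks_per_group_from_bitmap(uint32_t block_size)
{
    return 8u * block_size;
}

/* Index of the block group containing `block`.
 * Precondition: block >= first_data_block and blocks_per_group != 0. */
static uint64_t block_group_of(uint64_t block, uint32_t first_data_block,
                               uint32_t blocks_per_group)
{
    return (block - first_data_block) / blocks_per_group;
}
```

For 4 KiB blocks this gives 32768 blocks (128 MiB) per group.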

>> 3) Blocks
>> The ext4 filesystem is divided into blocks, of size
>> 2 ^ (10 + s_log_block_size) bytes, i.e. 1024 << s_log_block_size.
>> Blocks can be allocated using individual block groups' bitmaps. Note
>> that block 0 is invalid and its presence on extents/block tables means
>> it's part of a file hole, and that particular location must be read as
>> a block full of zeros.
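
A small sketch of the two rules in this section: deriving the block size from s_log_block_size, and reading a hole (physical block 0) as zeros. The function names and the flat in-memory image are assumptions made for illustration, not how Ext4Dxe does its I/O.

```c
#include <stdint.h>
#include <string.h>

/* Block size in bytes: 2 ^ (10 + s_log_block_size). */
static uint32_t ext4_block_size(uint32_t s_log_block_size)
{
    return 1024u << s_log_block_size;
}

/* Copy one filesystem block from an in-memory image into `out`.
 * Physical block 0 marks a file hole and is returned as all zeros.
 * The caller must ensure the image is large enough for `phys_block`. */
static void read_fs_block(const uint8_t *image, uint32_t block_size,
                          uint64_t phys_block, uint8_t *out)
{
    if (phys_block == 0) {
        memset(out, 0, block_size);
        return;
    }
    memcpy(out, image + phys_block * block_size, block_size);
}
```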

>> 4) Inodes
>> The ext4 filesystem divides files/directories into inodes (originally
>> index nodes). Each file/socket/symlink/directory/etc (here on out referred
>> to as a file, since there is no distinction under the ext4 filesystem) is
>> stored as a /nameless/ inode, that is stored in some block group's inode
>> table. Each inode has s_inode_size size (or GOOD_OLD_INODE_SIZE if it's
>> an old filesystem), and holds various metadata about the file. Since the
>> largest inode structure right now is ~160 bytes, the rest of the inode
>> contains inline extended attributes. Inodes' data is stored using either
>> data blocks (under ext2/3) or extents (under ext4).
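
As an illustration of how an inode number maps to "some block group's inode table", here is a sketch of the arithmetic from the kernel documentation (inode numbers start at 1, and s_inodes_per_group inodes live in each group). The struct and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t group;        /* block group whose inode table holds the inode */
    uint32_t index;        /* slot within that group's inode table */
    uint64_t table_offset; /* byte offset of the slot inside the table */
} InodeLocation;

/* Precondition: ino >= 1 and inodes_per_group != 0. */
static InodeLocation locate_inode(uint32_t ino, uint32_t inodes_per_group,
                                  uint16_t inode_size)
{
    InodeLocation loc;

    loc.group = (ino - 1) / inodes_per_group;
    loc.index = (ino - 1) % inodes_per_group;
    loc.table_offset = (uint64_t)loc.index * inode_size;
    return loc;
}
```

The start of the table itself comes from the block group descriptor, so the final on-disk position is the table's first block times the block size, plus table_offset.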

>> 5) Extents
>> Ext4 inodes store data in extents. These let N contiguous logical blocks
>> that are represented by N contiguous physical blocks be represented by a
>> single extent structure, which minimizes filesystem metadata bloat and
>> speeds up block mapping (particularly due to the fact that high-quality
>> ext4 implementations like Linux's try /really/ hard to make the file
>> contiguous, so it's common to have files with almost 0 fragmentation).
>> Inodes that use extents store them in a tree, and the top of the tree
>> is stored in i_data. The tree's nodes always start with an
>> EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX entries when
>> eh_depth != 0 and EXT4_EXTENT entries when eh_depth = 0; these entries
>> are always sorted by logical block.
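
Because the entries are sorted by logical block, a leaf can be searched with a binary search. The sketch below works on a simplified, already-decoded extent array (the `Extent` struct is hypothetical and in host byte order, not the on-disk EXT4_EXTENT layout) and ignores uninitialized extents. It returns 0 for a hole, matching the block 0 convention described earlier.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t first_logical;  /* first logical block covered by the extent */
    uint16_t length;         /* number of contiguous blocks covered */
    uint64_t first_physical; /* physical block backing first_logical */
} Extent;

/* Map a logical block to a physical block using a leaf's extents, which
 * must be sorted by first_logical. Returns 0 if no extent covers it. */
static uint64_t extent_map_block(const Extent *extents, size_t count,
                                 uint32_t logical)
{
    size_t lo = 0;
    size_t hi = count;

    /* Find the first extent that starts after `logical`. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (extents[mid].first_logical <= logical) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    if (lo == 0) {
        return 0; /* every extent starts after `logical` */
    }

    /* The candidate is the last extent starting at or before `logical`. */
    const Extent *e = &extents[lo - 1];
    uint32_t delta = logical - e->first_logical;

    return delta < e->length ? e->first_physical + delta : 0;
}
```

The same search applies to index nodes, except that the chosen entry points at the next node to read instead of at data blocks.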

>> 6) Directories
>> Ext4 directories are files that store name -> inode mappings for the
>> logical directory; this is where files get their names, which means ext4
>> inodes do not themselves have names, since they can be linked (present)
>> multiple times with different names. Directories can store entries in two
>> different ways:
>> 1) Classical linear directories: They store entries as a mostly-linked
>> mostly-list of EXT4_DIR_ENTRY.
>> 2) Hash tree directories: These are used for larger directories, with
>> hundreds of entries, and are designed in a backwards-compatible way.
>> These are not yet implemented in the Ext4Dxe driver.
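
The "mostly-linked mostly-list" of a linear directory can be sketched as a walk over one directory block, hopping from entry to entry by rec_len. The entry layout assumed here (inode at offset 0, rec_len at 4, name_len at 6, file type at 7, name from 8) is the ext4_dir_entry_2 layout from the kernel documentation; the function names are hypothetical and this is not the Ext4Dxe code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIRENT_HEADER_SIZE 8u /* inode + rec_len + name_len + file_type */

static uint16_t dir_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t dir_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Look up `name` in one linear directory block. Returns the inode
 * number, or 0 if the name is absent or the block looks corrupt. */
static uint32_t ext4_dir_lookup(const uint8_t *block, size_t block_size,
                                const char *name)
{
    size_t want = strlen(name);
    size_t pos = 0;

    while (pos + DIRENT_HEADER_SIZE <= block_size) {
        uint32_t inode = dir_le32(block + pos);
        uint16_t rec_len = dir_le16(block + pos + 4);
        uint8_t name_len = block[pos + 6];

        /* rec_len links to the next entry; reject values that would
         * loop forever or run past the end of the block. */
        if (rec_len < DIRENT_HEADER_SIZE || rec_len > block_size - pos) {
            return 0;
        }
        /* inode 0 marks an unused entry. */
        if (inode != 0 && name_len == want &&
            name_len <= rec_len - DIRENT_HEADER_SIZE &&
            memcmp(block + pos + DIRENT_HEADER_SIZE, name, want) == 0) {
            return inode;
        }
        pos += rec_len;
    }
    return 0;
}
```

The bounds checks matter for the fuzzing goal mentioned later in the RFC, since rec_len comes straight from the disk image.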

>> 7) Journal
>> Ext3/4 filesystems have a journal to help protect the filesystem against
>> system crashes. This is not yet implemented in Ext4Dxe but is described
>> in detail in the Linux kernel's documentation.

>> The EDK2 implementation of ext4 is based only on the public documentation
>> available at
>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
>> and
>> the FreeBSD ext2fs driver (available at
>> https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
>> BSD-2-Clause-FreeBSD licensed). It is licensed as
>> SPDX-License-Identifier: BSD-2-Clause-Patent.

>> After a brief discussion with the community, the proposed package
>> location is edk2-platform/Features/Ext4Pkg
>> (relevant discussion: https://edk2.groups.io/g/devel/topic/83060185).
>>
>> I was the main contributor and I would like to maintain the package in
>> the future, if possible.

>> Current limitations:
>> 1) The Ext4Dxe driver is, at the moment, read-only.
>> 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
>> filesystems. Ensuring compatibility with
>> those may not be a bad idea.

>> I intend to test the package using the UEFI SCTs present in edk2-test,
>> and implement any other needed unit tests myself using the already
>> available unit test framework. I also intend to (privately) fuzz the
>> UEFI driver with bad/unusual disk images, to improve the security and
>> reliability of the driver.

>> In the future, ext4 write support should be added so edk2 has a
>> fully-featured RW ext4 driver. There could also be a focus on
>> supporting the older ext4-like filesystems, as I mentioned in the
>> limitations, but that is open for discussion.

>> The driver's handling of unclean unmounting through forced shutdown is
>> unclear.
>> Is there a position in edk2 on how to handle such cases? I don't think
>> FAT32 has a "this filesystem is/was dirty" flag and even though it seems to
>> me that stopping a system from booting/opening the partition because
>> "we may find some tiny irregularities" is not the best course of
>> action, I can't find a clear answer.

>> The driver also had to add implementations of CRC32C and CRC16, and
>> after talking with my mentor we quickly reached the conclusion that
>> these may be good candidates for inclusion in MdePkg. We also
>> discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
>> (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?

> Current MdePkg BaseLib has CalculateCrc32(). So, CRC32C and CRC16 can be
> added into BaseLib.
>
> If more modules need to consume Ucs2 <-> Utf8 conversion library,
> BaseUcs2Utf8Lib is generic enough to be placed in MdePkg.
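
For reference, a CRC32C routine along the lines discussed above could be as small as the bitwise sketch below. It uses the standard CRC-32C parameters (reflected Castagnoli polynomial 0x82F63B78, initial value and final XOR of 0xFFFFFFFF). The function name is hypothetical, this is not an existing BaseLib API, and the seeding Ext4Dxe needs for ext4 metadata checksums may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli) over `len` bytes of `data`.
 * Slow but table-free; a production version would likely use a lookup
 * table or hardware instructions. */
static uint32_t crc32c_bitwise(const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *)data;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

The usual check value applies: the ASCII string "123456789" yields 0xE3069283.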


I think the Terminal driver may have some similar logic to convert UTF-8
terminals to/from the UEFI UCS-2?

https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Universal/Console/TerminalDxe/Vtutf8.c#L186
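
To show the kind of logic I mean, here is a minimal sketch of the UCS-2 to UTF-8 direction. It is not the code from Vtutf8.c or BaseUcs2Utf8Lib, the function name is hypothetical, and it does no special handling of surrogate values (0xD800 to 0xDFFF), which are not valid UCS-2 characters.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one UCS-2 character as UTF-8. Writes 1 to 3 bytes into `out`
 * and returns how many bytes were written. */
static size_t ucs2_char_to_utf8(uint16_t ch, uint8_t out[3])
{
    if (ch < 0x80) {
        /* 0xxxxxxx */
        out[0] = (uint8_t)ch;
        return 1;
    }
    if (ch < 0x800) {
        /* 110xxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xC0 | (ch >> 6));
        out[1] = (uint8_t)(0x80 | (ch & 0x3F));
        return 2;
    }
    /* 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (uint8_t)(0xE0 | (ch >> 12));
    out[1] = (uint8_t)(0x80 | ((ch >> 6) & 0x3F));
    out[2] = (uint8_t)(0x80 | (ch & 0x3F));
    return 3;
}
```

Since a UCS-2 character never needs more than three UTF-8 bytes, a fixed 3-byte output buffer is enough.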

Thanks,

Andrew Fish

> Thanks
> Liming

>> Feel free to ask any questions you may find relevant.
>>
>> Best Regards,
>>
>> Pedro Falcato




