public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* RFC: EXT4 filesystem driver
@ 2021-07-21 23:12 Pedro Falcato
  2021-07-22  1:20 ` 回复: [edk2-rfc] " gaoliming
  2021-07-22 10:24 ` [edk2-devel] " Marvin Häuser
  0 siblings, 2 replies; 10+ messages in thread
From: Pedro Falcato @ 2021-07-21 23:12 UTC (permalink / raw)
  To: devel; +Cc: rfc

EXT4 (fourth extended filesystem) is a filesystem developed for Linux
that has been in wide use (desktops, servers, smartphones) since 2008.

The Ext4Pkg implements the Simple File System Protocol for a partition
that is formatted with the EXT4 file system. This allows UEFI Drivers,
UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
on an EXT4 partition and supports booting a UEFI OS Loader from an
EXT4 partition.
This project is one of the TianoCore Google Summer of Code projects.

Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
I/O 2 Protocols, and produces the Simple File System protocol. It
allows mounting ext4 filesystems exclusively.

Brief overhead of EXT4:
Layout of an EXT2/3/4 filesystem:
(note: this driver has been developed using
https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
documentation).

An ext2/3/4 filesystem (here on out referred to as simply an ext4 filesystem,
due to the similarities) is composed of various concepts:

1) Superblock
The superblock is the structure near (1024 bytes offset from the start)
the start of the partition, and describes the filesystem in general.
Here, we get to know the size of the filesystem's blocks, which features
it supports or not, whether it's been cleanly unmounted, how many blocks
we have, etc.

2) Block groups
EXT4 filesystems are divided into block groups, and each block group covers
s_blocks_per_group(8 * Block Size) blocks. Each block group has an
associated block group descriptor; these are present directly after the
superblock. Each block group descriptor contains the location of the
inode table, and the inode and block bitmaps (note these bitmaps are only
a block long, which gets us the 8 * Block Size formula covered previously).

3) Blocks
The ext4 filesystem is divided into blocks, of size s_log_block_size ^ 1024.
Blocks can be allocated using individual block groups's bitmaps. Note
that block 0 is invalid and its presence on extents/block tables means
it's part of a file hole, and that particular location must be read as
a block full of zeros.

4) Inodes
The ext4 filesystem divides files/directories into inodes (originally
index nodes). Each file/socket/symlink/directory/etc (here on out referred
to as a file, since there is no distinction under the ext4 filesystem) is
stored as a /nameless/ inode, that is stored in some block group's inode
table. Each inode has s_inode_size size (or GOOD_OLD_INODE_SIZE if it's
an old filesystem), and holds various metadata about the file. Since the
largest inode structure right now is ~160 bytes, the rest of the inode
contains inline extended attributes. Inodes' data is stored using either
data blocks (under ext2/3) or extents (under ext4).

5) Extents
Ext4 inodes store data in extents. These let N contiguous logical blocks
that are represented by N contiguous physical blocks be represented by a
single extent structure, which minimizes filesystem metadata bloat and
speeds up block mapping (particularly due to the fact that high-quality
ext4 implementations like linux's try /really/ hard to make the file
contiguous, so it's common to have files with almost 0 fragmentation).
Inodes that use extents store them in a tree, and the top of the tree
is stored on i_data. The tree's leaves always start with an
EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX on eh_depth != 0 and
EXT4_EXTENT on eh_depth = 0; these entries are always sorted by logical
block.

6) Directories
Ext4 directories are files that store name -> inode mappings for the
logical directory; this is where files get their names, which means ext4
inodes do not themselves have names, since they can be linked (present)
multiple times with different names. Directories can store entries in two
different ways:
1) Classical linear directories: They store entries as a mostly-linked
mostly-list of EXT4_DIR_ENTRY.
2) Hash tree directories: These are used for larger directories, with
hundreds of entries, and are designed in a backwards-compatible way.
These are not yet implemented in the Ext4Dxe driver.

7) Journal
Ext3/4 filesystems have a journal to help protect the filesystem against
system crashes. This is not yet implemented in Ext4Dxe but is described
in detail in the Linux kernel's documentation.

The EDK2 implementation of ext4 is based only on the public documentation
available at https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
and
the FreeBSD ext2fs driver (available at
https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
BSD-2-Clause-FreeBSD licensed). It is licensed as
SPDX-License-Identifier: BSD-2-Clause-Patent.

After a brief discussion with the community, the proposed package
location is edk2-platform/Features/Ext4Pkg
(relevant discussion: https://edk2.groups.io/g/devel/topic/83060185).

I was the main contributor and I would like to maintain the package in
the future, if possible.

Current limitations:
1) The Ext4Dxe driver is, at the moment, read-only.
2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
filesystems. Ensuring compatibility with
those may not be a bad idea.

I intend to test the package using the UEFI SCTs present in edk2-test,
and implement any other needed unit tests myself using the already
available unit test framework. I also intend to (privately) fuzz the
UEFI driver with bad/unusual disk images, to improve the security and
reliability of the driver.

In the future, ext4 write support should be added so edk2 has a
fully-featured RW ext4 driver. There could also be a focus on
supporting the older ext4-like filesystems, as I mentioned in the
limitations, but that is open for discussion.

The driver's handling of unclean unmounting through forced shutdown is unclear.
Is there a position in edk2 on how to handle such cases? I don't think
FAT32 has a "this filesystem is/was dirty" and even though it seems to
me that stopping a system from booting/opening the partition because
"we may find some tiny irregularities" is not the best course of
action, I can't find a clear answer.

The driver also had to add implementations of CRC32C and CRC16, and
after talking with my mentor we quickly reached the conclusion that
these may be good candidates for inclusion in MdePkg. We also
discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
(BaseUcs2Utf8Lib) into MdePkg as well. Any comments?

Feel free to ask any questions you may find relevant.

Best Regards,

Pedro Falcato

^ permalink raw reply	[flat|nested] 10+ messages in thread

* 回复: [edk2-rfc] RFC: EXT4 filesystem driver
  2021-07-21 23:12 RFC: EXT4 filesystem driver Pedro Falcato
@ 2021-07-22  1:20 ` gaoliming
  2021-07-22  2:47   ` [edk2-devel] " Andrew Fish
  2021-07-22 17:08   ` Michael D Kinney
  2021-07-22 10:24 ` [edk2-devel] " Marvin Häuser
  1 sibling, 2 replies; 10+ messages in thread
From: gaoliming @ 2021-07-22  1:20 UTC (permalink / raw)
  To: rfc, pedro.falcato, devel



> -----邮件原件-----
> 发件人: rfc@edk2.groups.io <rfc@edk2.groups.io> 代表 Pedro Falcato
> 发送时间: 2021年7月22日 7:12
> 收件人: devel@edk2.groups.io
> 抄送: rfc@edk2.groups.io
> 主题: [edk2-rfc] RFC: EXT4 filesystem driver
> 
> EXT4 (fourth extended filesystem) is a filesystem developed for Linux
> that has been in wide use (desktops, servers, smartphones) since 2008.
> 
> The Ext4Pkg implements the Simple File System Protocol for a partition
> that is formatted with the EXT4 file system. This allows UEFI Drivers,
> UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
> on an EXT4 partition and supports booting a UEFI OS Loader from an
> EXT4 partition.
> This project is one of the TianoCore Google Summer of Code projects.
> 
> Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
> UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
> I/O 2 Protocols, and produces the Simple File System protocol. It
> allows mounting ext4 filesystems exclusively.
> 
> Brief overhead of EXT4:
> Layout of an EXT2/3/4 filesystem:
> (note: this driver has been developed using
> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
> documentation).
> 
> An ext2/3/4 filesystem (here on out referred to as simply an ext4 filesystem,
> due to the similarities) is composed of various concepts:
> 
> 1) Superblock
> The superblock is the structure near (1024 bytes offset from the start)
> the start of the partition, and describes the filesystem in general.
> Here, we get to know the size of the filesystem's blocks, which features
> it supports or not, whether it's been cleanly unmounted, how many blocks
> we have, etc.
> 
> 2) Block groups
> EXT4 filesystems are divided into block groups, and each block group covers
> s_blocks_per_group(8 * Block Size) blocks. Each block group has an
> associated block group descriptor; these are present directly after the
> superblock. Each block group descriptor contains the location of the
> inode table, and the inode and block bitmaps (note these bitmaps are only
> a block long, which gets us the 8 * Block Size formula covered previously).
> 
> 3) Blocks
> The ext4 filesystem is divided into blocks, of size s_log_block_size ^ 1024.
> Blocks can be allocated using individual block groups's bitmaps. Note
> that block 0 is invalid and its presence on extents/block tables means
> it's part of a file hole, and that particular location must be read as
> a block full of zeros.
> 
> 4) Inodes
> The ext4 filesystem divides files/directories into inodes (originally
> index nodes). Each file/socket/symlink/directory/etc (here on out referred
> to as a file, since there is no distinction under the ext4 filesystem) is
> stored as a /nameless/ inode, that is stored in some block group's inode
> table. Each inode has s_inode_size size (or GOOD_OLD_INODE_SIZE if it's
> an old filesystem), and holds various metadata about the file. Since the
> largest inode structure right now is ~160 bytes, the rest of the inode
> contains inline extended attributes. Inodes' data is stored using either
> data blocks (under ext2/3) or extents (under ext4).
> 
> 5) Extents
> Ext4 inodes store data in extents. These let N contiguous logical blocks
> that are represented by N contiguous physical blocks be represented by a
> single extent structure, which minimizes filesystem metadata bloat and
> speeds up block mapping (particularly due to the fact that high-quality
> ext4 implementations like linux's try /really/ hard to make the file
> contiguous, so it's common to have files with almost 0 fragmentation).
> Inodes that use extents store them in a tree, and the top of the tree
> is stored on i_data. The tree's leaves always start with an
> EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX on eh_depth != 0
> and
> EXT4_EXTENT on eh_depth = 0; these entries are always sorted by logical
> block.
> 
> 6) Directories
> Ext4 directories are files that store name -> inode mappings for the
> logical directory; this is where files get their names, which means ext4
> inodes do not themselves have names, since they can be linked (present)
> multiple times with different names. Directories can store entries in two
> different ways:
> 1) Classical linear directories: They store entries as a mostly-linked
> mostly-list of EXT4_DIR_ENTRY.
> 2) Hash tree directories: These are used for larger directories, with
> hundreds of entries, and are designed in a backwards-compatible way.
> These are not yet implemented in the Ext4Dxe driver.
> 
> 7) Journal
> Ext3/4 filesystems have a journal to help protect the filesystem against
> system crashes. This is not yet implemented in Ext4Dxe but is described
> in detail in the Linux kernel's documentation.
> 
> The EDK2 implementation of ext4 is based only on the public documentation
> available at
> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
> and
> the FreeBSD ext2fs driver (available at
> https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
> BSD-2-Clause-FreeBSD licensed). It is licensed as
> SPDX-License-Identifier: BSD-2-Clause-Patent.
> 
> After a brief discussion with the community, the proposed package
> location is edk2-platform/Features/Ext4Pkg
> (relevant discussion: https://edk2.groups.io/g/devel/topic/83060185).
> 
> I was the main contributor and I would like to maintain the package in
> the future, if possible.
> 
> Current limitations:
> 1) The Ext4Dxe driver is, at the moment, read-only.
> 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
> filesystems. Ensuring compatibility with
> those may not be a bad idea.
> 
> I intend to test the package using the UEFI SCTs present in edk2-test,
> and implement any other needed unit tests myself using the already
> available unit test framework. I also intend to (privately) fuzz the
> UEFI driver with bad/unusual disk images, to improve the security and
> reliability of the driver.
> 
> In the future, ext4 write support should be added so edk2 has a
> fully-featured RW ext4 driver. There could also be a focus on
> supporting the older ext4-like filesystems, as I mentioned in the
> limitations, but that is open for discussion.
> 
> The driver's handling of unclean unmounting through forced shutdown is
> unclear.
> Is there a position in edk2 on how to handle such cases? I don't think
> FAT32 has a "this filesystem is/was dirty" and even though it seems to
> me that stopping a system from booting/opening the partition because
> "we may find some tiny irregularities" is not the best course of
> action, I can't find a clear answer.
> 
> The driver also had to add implementations of CRC32C and CRC16, and
> after talking with my mentor we quickly reached the conclusion that
> these may be good candidates for inclusion in MdePkg. We also
> discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
> (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?

Current MdePkg BaseLib has CalculateCrc32(). So, CRC32C and CRC16 can be added into BaseLib. 

If more modules need to consume Ucs2 <-> Utf8 conversion library, BaseUcs2Utf8Lib is generic enough
to be placed in MdePkg. 

Thanks
Liming
> 
> Feel free to ask any questions you may find relevant.
> 
> Best Regards,
> 
> Pedro Falcato
> 
> 
> 
> 




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [edk2-devel] [edk2-rfc] RFC: EXT4 filesystem driver
  2021-07-22  1:20 ` 回复: [edk2-rfc] " gaoliming
@ 2021-07-22  2:47   ` Andrew Fish
  2021-07-22 17:08   ` Michael D Kinney
  1 sibling, 0 replies; 10+ messages in thread
From: Andrew Fish @ 2021-07-22  2:47 UTC (permalink / raw)
  To: edk2-devel-groups-io, gaoliming; +Cc: rfc, pedro.falcato

[-- Attachment #1: Type: text/plain, Size: 8125 bytes --]



> On Jul 21, 2021, at 6:20 PM, gaoliming <gaoliming@byosoft.com.cn> wrote:
> 
> 
> 
>> -----邮件原件-----
>> 发件人: rfc@edk2.groups.io <mailto:rfc@edk2.groups.io> <rfc@edk2.groups.io <mailto:rfc@edk2.groups.io>> 代表 Pedro Falcato
>> 发送时间: 2021年7月22日 7:12
>> 收件人: devel@edk2.groups.io <mailto:devel@edk2.groups.io>
>> 抄送: rfc@edk2.groups.io <mailto:rfc@edk2.groups.io>
>> 主题: [edk2-rfc] RFC: EXT4 filesystem driver
>> 
>> EXT4 (fourth extended filesystem) is a filesystem developed for Linux
>> that has been in wide use (desktops, servers, smartphones) since 2008.
>> 
>> The Ext4Pkg implements the Simple File System Protocol for a partition
>> that is formatted with the EXT4 file system. This allows UEFI Drivers,
>> UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
>> on an EXT4 partition and supports booting a UEFI OS Loader from an
>> EXT4 partition.
>> This project is one of the TianoCore Google Summer of Code projects.
>> 
>> Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
>> UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
>> I/O 2 Protocols, and produces the Simple File System protocol. It
>> allows mounting ext4 filesystems exclusively.
>> 
>> Brief overhead of EXT4:
>> Layout of an EXT2/3/4 filesystem:
>> (note: this driver has been developed using
>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
>> documentation).
>> 
>> An ext2/3/4 filesystem (here on out referred to as simply an ext4 filesystem,
>> due to the similarities) is composed of various concepts:
>> 
>> 1) Superblock
>> The superblock is the structure near (1024 bytes offset from the start)
>> the start of the partition, and describes the filesystem in general.
>> Here, we get to know the size of the filesystem's blocks, which features
>> it supports or not, whether it's been cleanly unmounted, how many blocks
>> we have, etc.
>> 
>> 2) Block groups
>> EXT4 filesystems are divided into block groups, and each block group covers
>> s_blocks_per_group(8 * Block Size) blocks. Each block group has an
>> associated block group descriptor; these are present directly after the
>> superblock. Each block group descriptor contains the location of the
>> inode table, and the inode and block bitmaps (note these bitmaps are only
>> a block long, which gets us the 8 * Block Size formula covered previously).
>> 
>> 3) Blocks
>> The ext4 filesystem is divided into blocks, of size s_log_block_size ^ 1024.
>> Blocks can be allocated using individual block groups's bitmaps. Note
>> that block 0 is invalid and its presence on extents/block tables means
>> it's part of a file hole, and that particular location must be read as
>> a block full of zeros.
>> 
>> 4) Inodes
>> The ext4 filesystem divides files/directories into inodes (originally
>> index nodes). Each file/socket/symlink/directory/etc (here on out referred
>> to as a file, since there is no distinction under the ext4 filesystem) is
>> stored as a /nameless/ inode, that is stored in some block group's inode
>> table. Each inode has s_inode_size size (or GOOD_OLD_INODE_SIZE if it's
>> an old filesystem), and holds various metadata about the file. Since the
>> largest inode structure right now is ~160 bytes, the rest of the inode
>> contains inline extended attributes. Inodes' data is stored using either
>> data blocks (under ext2/3) or extents (under ext4).
>> 
>> 5) Extents
>> Ext4 inodes store data in extents. These let N contiguous logical blocks
>> that are represented by N contiguous physical blocks be represented by a
>> single extent structure, which minimizes filesystem metadata bloat and
>> speeds up block mapping (particularly due to the fact that high-quality
>> ext4 implementations like linux's try /really/ hard to make the file
>> contiguous, so it's common to have files with almost 0 fragmentation).
>> Inodes that use extents store them in a tree, and the top of the tree
>> is stored on i_data. The tree's leaves always start with an
>> EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX on eh_depth != 0
>> and
>> EXT4_EXTENT on eh_depth = 0; these entries are always sorted by logical
>> block.
>> 
>> 6) Directories
>> Ext4 directories are files that store name -> inode mappings for the
>> logical directory; this is where files get their names, which means ext4
>> inodes do not themselves have names, since they can be linked (present)
>> multiple times with different names. Directories can store entries in two
>> different ways:
>> 1) Classical linear directories: They store entries as a mostly-linked
>> mostly-list of EXT4_DIR_ENTRY.
>> 2) Hash tree directories: These are used for larger directories, with
>> hundreds of entries, and are designed in a backwards-compatible way.
>> These are not yet implemented in the Ext4Dxe driver.
>> 
>> 7) Journal
>> Ext3/4 filesystems have a journal to help protect the filesystem against
>> system crashes. This is not yet implemented in Ext4Dxe but is described
>> in detail in the Linux kernel's documentation.
>> 
>> The EDK2 implementation of ext4 is based only on the public documentation
>> available at
>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
>> and
>> the FreeBSD ext2fs driver (available at
>> https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
>> BSD-2-Clause-FreeBSD licensed). It is licensed as
>> SPDX-License-Identifier: BSD-2-Clause-Patent.
>> 
>> After a brief discussion with the community, the proposed package
>> location is edk2-platform/Features/Ext4Pkg
>> (relevant discussion: https://edk2.groups.io/g/devel/topic/83060185).
>> 
>> I was the main contributor and I would like to maintain the package in
>> the future, if possible.
>> 
>> Current limitations:
>> 1) The Ext4Dxe driver is, at the moment, read-only.
>> 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
>> filesystems. Ensuring compatibility with
>> those may not be a bad idea.
>> 
>> I intend to test the package using the UEFI SCTs present in edk2-test,
>> and implement any other needed unit tests myself using the already
>> available unit test framework. I also intend to (privately) fuzz the
>> UEFI driver with bad/unusual disk images, to improve the security and
>> reliability of the driver.
>> 
>> In the future, ext4 write support should be added so edk2 has a
>> fully-featured RW ext4 driver. There could also be a focus on
>> supporting the older ext4-like filesystems, as I mentioned in the
>> limitations, but that is open for discussion.
>> 
>> The driver's handling of unclean unmounting through forced shutdown is
>> unclear.
>> Is there a position in edk2 on how to handle such cases? I don't think
>> FAT32 has a "this filesystem is/was dirty" and even though it seems to
>> me that stopping a system from booting/opening the partition because
>> "we may find some tiny irregularities" is not the best course of
>> action, I can't find a clear answer.
>> 
>> The driver also had to add implementations of CRC32C and CRC16, and
>> after talking with my mentor we quickly reached the conclusion that
>> these may be good candidates for inclusion in MdePkg. We also
>> discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
>> (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?
> 
> Current MdePkg BaseLib has CalculateCrc32(). So, CRC32C and CRC16 can be added into BaseLib. 
> 
> If more modules need to consume Ucs2 <-> Utf8 conversion library, BaseUcs2Utf8Lib is generic enough
> to be placed in MdePkg. 
> 

I think the Terminal driver may have some similar logic to convert UTF-8 terminals to/from the UEFI UCS-2?

https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Universal/Console/TerminalDxe/Vtutf8.c#L186

Thanks,

Andrew Fish

> Thanks
> Liming
>> 
>> Feel free to ask any questions you may find relevant.
>> 
>> Best Regards,
>> 
>> Pedro Falcato
>> 
>> 
>> 
>> 
> 
> 
> 
> 
> 
> 


[-- Attachment #2: Type: text/html, Size: 18765 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [edk2-devel] RFC: EXT4 filesystem driver
  2021-07-21 23:12 RFC: EXT4 filesystem driver Pedro Falcato
  2021-07-22  1:20 ` 回复: [edk2-rfc] " gaoliming
@ 2021-07-22 10:24 ` Marvin Häuser
  2021-07-22 15:58   ` Andrew Fish
  1 sibling, 1 reply; 10+ messages in thread
From: Marvin Häuser @ 2021-07-22 10:24 UTC (permalink / raw)
  To: devel, pedro.falcato

On 22.07.21 01:12, Pedro Falcato wrote:
> EXT4 (fourth extended filesystem) is a filesystem developed for Linux
> that has been in wide use (desktops, servers, smartphones) since 2008.
>
> The Ext4Pkg implements the Simple File System Protocol for a partition
> that is formatted with the EXT4 file system. This allows UEFI Drivers,
> UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
> on an EXT4 partition and supports booting a UEFI OS Loader from an
> EXT4 partition.
> This project is one of the TianoCore Google Summer of Code projects.
>
> Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
> UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
> I/O 2 Protocols, and produces the Simple File System protocol. It
> allows mounting ext4 filesystems exclusively.
>
> Brief overhead of EXT4:
> Layout of an EXT2/3/4 filesystem:
> (note: this driver has been developed using
> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
> documentation).
>
> An ext2/3/4 filesystem (here on out referred to as simply an ext4 filesystem,
> due to the similarities) is composed of various concepts:
>
> 1) Superblock
> The superblock is the structure near (1024 bytes offset from the start)
> the start of the partition, and describes the filesystem in general.
> Here, we get to know the size of the filesystem's blocks, which features
> it supports or not, whether it's been cleanly unmounted, how many blocks
> we have, etc.
>
> 2) Block groups
> EXT4 filesystems are divided into block groups, and each block group covers
> s_blocks_per_group(8 * Block Size) blocks. Each block group has an
> associated block group descriptor; these are present directly after the
> superblock. Each block group descriptor contains the location of the
> inode table, and the inode and block bitmaps (note these bitmaps are only
> a block long, which gets us the 8 * Block Size formula covered previously).
>
> 3) Blocks
> The ext4 filesystem is divided into blocks, of size s_log_block_size ^ 1024.
> Blocks can be allocated using individual block groups's bitmaps. Note
> that block 0 is invalid and its presence on extents/block tables means
> it's part of a file hole, and that particular location must be read as
> a block full of zeros.
>
> 4) Inodes
> The ext4 filesystem divides files/directories into inodes (originally
> index nodes). Each file/socket/symlink/directory/etc (here on out referred
> to as a file, since there is no distinction under the ext4 filesystem) is
> stored as a /nameless/ inode, that is stored in some block group's inode
> table. Each inode has s_inode_size size (or GOOD_OLD_INODE_SIZE if it's
> an old filesystem), and holds various metadata about the file. Since the
> largest inode structure right now is ~160 bytes, the rest of the inode
> contains inline extended attributes. Inodes' data is stored using either
> data blocks (under ext2/3) or extents (under ext4).
>
> 5) Extents
> Ext4 inodes store data in extents. These let N contiguous logical blocks
> that are represented by N contiguous physical blocks be represented by a
> single extent structure, which minimizes filesystem metadata bloat and
> speeds up block mapping (particularly due to the fact that high-quality
> ext4 implementations like linux's try /really/ hard to make the file
> contiguous, so it's common to have files with almost 0 fragmentation).
> Inodes that use extents store them in a tree, and the top of the tree
> is stored on i_data. The tree's leaves always start with an
> EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX on eh_depth != 0 and
> EXT4_EXTENT on eh_depth = 0; these entries are always sorted by logical
> block.
>
> 6) Directories
> Ext4 directories are files that store name -> inode mappings for the
> logical directory; this is where files get their names, which means ext4
> inodes do not themselves have names, since they can be linked (present)
> multiple times with different names. Directories can store entries in two
> different ways:
> 1) Classical linear directories: They store entries as a mostly-linked
> mostly-list of EXT4_DIR_ENTRY.
> 2) Hash tree directories: These are used for larger directories, with
> hundreds of entries, and are designed in a backwards-compatible way.
> These are not yet implemented in the Ext4Dxe driver.
>
> 7) Journal
> Ext3/4 filesystems have a journal to help protect the filesystem against
> system crashes. This is not yet implemented in Ext4Dxe but is described
> in detail in the Linux kernel's documentation.
>
> The EDK2 implementation of ext4 is based only on the public documentation
> available at https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
> and
> the FreeBSD ext2fs driver (available at
> https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
> BSD-2-Clause-FreeBSD licensed). It is licensed as
> SPDX-License-Identifier: BSD-2-Clause-Patent.
>
> After a brief discussion with the community, the proposed package
> location is edk2-platform/Features/Ext4Pkg
> (relevant discussion: https://edk2.groups.io/g/devel/topic/83060185).
>
> I was the main contributor and I would like to maintain the package in
> the future, if possible.

While I personally don't like it's outside of the EDK II core, I kind of 
get it. However I would strongly suggest to choose a more general 
package name, like "LinuxFsPkg", or "NixFsPkg", or maybe even just 
"FileSystemPkg" (and move FAT over some day?). Imagine someone wants to 
import BTRFS next year, should it really be "BtrfsPkg"? I understand it 
follows the "FatPkg" convention, but I feel like people forget FatPkg 
was special as to its awkward license before Microsoft allowed a change 
a few years ago. Maintainers.txt already has the concept of different 
Reviewers per subfolder, maybe it could be extended a little to have a 
common EDK II contributor to officially maintain the package, but have 
you be a Maintainer or something like a Reviewer+ to your driver? Or you 
could maintain the entire package of course.

> Current limitations:
> 1) The Ext4Dxe driver is, at the moment, read-only.
> 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
> filesystems. Ensuring compatibility with
> those may not be a bad idea.
>
> I intend to test the package using the UEFI SCTs present in edk2-test,
> and implement any other needed unit tests myself using the already
> available unit test framework. I also intend to (privately) fuzz the
> UEFI driver with bad/unusual disk images, to improve the security and
> reliability of the driver.
>
> In the future, ext4 write support should be added so edk2 has a
> fully-featured RW ext4 driver. There could also be a focus on
> supporting the older ext4-like filesystems, as I mentioned in the
> limitations, but that is open for discussion.

I may be alone, but I strongly object. One of our projects (OpenCore) 
has a disgusting way of writing files because the FAT32 driver in Aptio 
IV firmwares may corrupt the filesystem when resizing files. To be 
honest, it may corrupt with other usages too and we never noticed, 
because basically we wrote the code to require the least amount of 
(complex) FS operations.

The issue with EDK II is, there is a lot of own code and a lot of users, 
but little testing. By that I do not mean that developers do not test 
their code, but that nobody sits down and performs all sorts of FS 
manipulations in all sorts of orders and closely observes the result for 
regression-testing. Users will not really test it either, as UEFI to 
them should just somehow boot to Windows. If at least the code was 
shared with a codebase that is known-trusted (e.g. the Linux kernel 
itself), that'd be much easier to trust, but realistically this is not 
going to happen.
My point is, if a company like AMI cannot guarantee writing does not 
damage the FS for a very simple FS, how do you plan to guarantee yours 
doesn't for a much more complex FS? I'd rather have only one simple FS 
type that supports writing for most use-cases (e.g. logging).

At the very least I would beg you to have a PCD to turn write support 
off - if it will be off by default, that'd be great of course. :)
Was there any discussion yet as to why write support is needed in the 
first place you could point me to?

Thanks for your work!

Best regards,
Marvin

> The driver's handling of unclean unmounting through forced shutdown is unclear.
> Is there a position in edk2 on how to handle such cases? I don't think
> FAT32 has a "this filesystem is/was dirty" and even though it seems to
> me that stopping a system from booting/opening the partition because
> "we may find some tiny irregularities" is not the best course of
> action, I can't find a clear answer.
>
> The driver also had to add implementations of CRC32C and CRC16, and
> after talking with my mentor we quickly reached the conclusion that
> these may be good candidates for inclusion in MdePkg. We also
> discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
> (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?
>
> Feel free to ask any questions you may find relevant.
>
> Best Regards,
>
> Pedro Falcato
>
>
> 
>
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [edk2-devel] RFC: EXT4 filesystem driver
  2021-07-22 10:24 ` [edk2-devel] " Marvin Häuser
@ 2021-07-22 15:58   ` Andrew Fish
  2021-07-22 16:07     ` Marvin Häuser
  2021-07-22 16:54     ` Pedro Falcato
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Fish @ 2021-07-22 15:58 UTC (permalink / raw)
  To: edk2-devel-groups-io, mhaeuser; +Cc: pedro.falcato

[-- Attachment #1: Type: text/plain, Size: 10711 bytes --]



> On Jul 22, 2021, at 3:24 AM, Marvin Häuser <mhaeuser@posteo.de> wrote:
> 
> On 22.07.21 01:12, Pedro Falcato wrote:
>> EXT4 (fourth extended filesystem) is a filesystem developed for Linux
>> that has been in wide use (desktops, servers, smartphones) since 2008.
>> 
>> The Ext4Pkg implements the Simple File System Protocol for a partition
>> that is formatted with the EXT4 file system. This allows UEFI Drivers,
>> UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
>> on an EXT4 partition and supports booting a UEFI OS Loader from an
>> EXT4 partition.
>> This project is one of the TianoCore Google Summer of Code projects.
>> 
>> Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
>> UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
>> I/O 2 Protocols, and produces the Simple File System protocol. It
>> allows mounting ext4 filesystems exclusively.
>> 
>> Brief overhead of EXT4:
>> Layout of an EXT2/3/4 filesystem:
>> (note: this driver has been developed using
>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
>> documentation).
>> 
>> An ext2/3/4 filesystem (here on out referred to as simply an ext4 filesystem,
>> due to the similarities) is composed of various concepts:
>> 
>> 1) Superblock
>> The superblock is the structure near (1024 bytes offset from the start)
>> the start of the partition, and describes the filesystem in general.
>> Here, we get to know the size of the filesystem's blocks, which features
>> it supports or not, whether it's been cleanly unmounted, how many blocks
>> we have, etc.
>> 
>> 2) Block groups
>> EXT4 filesystems are divided into block groups, and each block group covers
>> s_blocks_per_group(8 * Block Size) blocks. Each block group has an
>> associated block group descriptor; these are present directly after the
>> superblock. Each block group descriptor contains the location of the
>> inode table, and the inode and block bitmaps (note these bitmaps are only
>> a block long, which gets us the 8 * Block Size formula covered previously).
>> 
>> 3) Blocks
>> The ext4 filesystem is divided into blocks, of size s_log_block_size ^ 1024.
>> Blocks can be allocated using individual block groups's bitmaps. Note
>> that block 0 is invalid and its presence on extents/block tables means
>> it's part of a file hole, and that particular location must be read as
>> a block full of zeros.
>> 
>> 4) Inodes
>> The ext4 filesystem divides files/directories into inodes (originally
>> index nodes). Each file/socket/symlink/directory/etc (here on out referred
>> to as a file, since there is no distinction under the ext4 filesystem) is
>> stored as a /nameless/ inode, that is stored in some block group's inode
>> table. Each inode has s_inode_size size (or GOOD_OLD_INODE_SIZE if it's
>> an old filesystem), and holds various metadata about the file. Since the
>> largest inode structure right now is ~160 bytes, the rest of the inode
>> contains inline extended attributes. Inodes' data is stored using either
>> data blocks (under ext2/3) or extents (under ext4).
>> 
>> 5) Extents
>> Ext4 inodes store data in extents. These let N contiguous logical blocks
>> that are represented by N contiguous physical blocks be represented by a
>> single extent structure, which minimizes filesystem metadata bloat and
>> speeds up block mapping (particularly due to the fact that high-quality
>> ext4 implementations like linux's try /really/ hard to make the file
>> contiguous, so it's common to have files with almost 0 fragmentation).
>> Inodes that use extents store them in a tree, and the top of the tree
>> is stored on i_data. The tree's leaves always start with an
>> EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX on eh_depth != 0 and
>> EXT4_EXTENT on eh_depth = 0; these entries are always sorted by logical
>> block.
>> 
>> 6) Directories
>> Ext4 directories are files that store name -> inode mappings for the
>> logical directory; this is where files get their names, which means ext4
>> inodes do not themselves have names, since they can be linked (present)
>> multiple times with different names. Directories can store entries in two
>> different ways:
>> 1) Classical linear directories: They store entries as a mostly-linked
>> mostly-list of EXT4_DIR_ENTRY.
>> 2) Hash tree directories: These are used for larger directories, with
>> hundreds of entries, and are designed in a backwards-compatible way.
>> These are not yet implemented in the Ext4Dxe driver.
>> 
>> 7) Journal
>> Ext3/4 filesystems have a journal to help protect the filesystem against
>> system crashes. This is not yet implemented in Ext4Dxe but is described
>> in detail in the Linux kernel's documentation.
>> 
>> The EDK2 implementation of ext4 is based only on the public documentation
>> available at https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
>> and
>> the FreeBSD ext2fs driver (available at
>> https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
>> BSD-2-Clause-FreeBSD licensed). It is licensed as
>> SPDX-License-Identifier: BSD-2-Clause-Patent.
>> 
>> After a brief discussion with the community, the proposed package
>> location is edk2-platform/Features/Ext4Pkg
>> (relevant discussion: https://edk2.groups.io/g/devel/topic/83060185).
>> 
>> I was the main contributor and I would like to maintain the package in
>> the future, if possible.
> 
> While I personally don't like it's outside of the EDK II core, I kind of get it. However I would strongly suggest to choose a more general package name, like "LinuxFsPkg", or "NixFsPkg", or maybe even just "FileSystemPkg" (and move FAT over some day?). Imagine someone wants to import BTRFS next year, should it really be "BtrfsPkg"? I understand it follows the "FatPkg" convention, but I feel like people forget FatPkg was special as to its awkward license before Microsoft allowed a change a few years ago. Maintainers.txt already has the concept of different Reviewers per subfolder, maybe it could be extended a little to have a common EDK II contributor to officially maintain the package, but have you be a Maintainer or something like a Reviewer+ to your driver? Or you could maintain the entire package of course.
> 

Marvin,

Good point that the FatPkg was more about license boundary than anything else, so I’m not opposed to a more generic package name. 

>> Current limitations:
>> 1) The Ext4Dxe driver is, at the moment, read-only.
>> 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
>> filesystems. Ensuring compatibility with
>> those may not be a bad idea.
>> 
>> I intend to test the package using the UEFI SCTs present in edk2-test,
>> and implement any other needed unit tests myself using the already
>> available unit test framework. I also intend to (privately) fuzz the
>> UEFI driver with bad/unusual disk images, to improve the security and
>> reliability of the driver.
>> 
>> In the future, ext4 write support should be added so edk2 has a
>> fully-featured RW ext4 driver. There could also be a focus on
>> supporting the older ext4-like filesystems, as I mentioned in the
>> limitations, but that is open for discussion.
> 
> I may be alone, but I strongly object. One of our projects (OpenCore) has a disgusting way of writing files because the FAT32 driver in Aptio IV firmwares may corrupt the filesystem when resizing files. To be honest, it may corrupt with other usages too and we never noticed, because basically we wrote the code to require the least amount of (complex) FS operations.
> 
> The issue with EDK II is, there is a lot of own code and a lot of users, but little testing. By that I do not mean that developers do not test their code, but that nobody sits down and performs all sorts of FS manipulations in all sorts of orders and closely observes the result for regression-testing. Users will not really test it either, as UEFI to them should just somehow boot to Windows. If at least the code was shared with a codebase that is known-trusted (e.g. the Linux kernel itself), that'd be much easier to trust, but realistically this is not going to happen.
> My point is, if a company like AMI cannot guarantee writing does not damage the FS for a very simple FS, how do you plan to guarantee yours doesn't for a much more complex FS? I'd rather have only one simple FS type that supports writing for most use-cases (e.g. logging).
> 
> At the very least I would beg you to have a PCD to turn write support off - if it will be off by default, that'd be great of course. :)
> Was there any discussion yet as to why write support is needed in the first place you could point me to?
> 

I think having a default PCD option of read only is a good idea. 

EFI on Mac carries HFS+ and APFS EFI file system drivers and both of those are read only for safety, security, and to avoid the need to validate them. So I think some products may want to have the option to ship read only versions of the file system. 

Seems like having EFI base file system tests would be useful. I’d imaging with OVMF it would be possible to implement a very robust test infrastructure. Seems like the hard bits would be generating the test cases and figuring out how to validate the tests did the correct thing. I’m guess the OS based file system drivers are robust and try to work around bugs gracefully? Maybe there is a way to turn on OS logging, or even run an OS based fsck on the volume after the tests complete. Regardless this seems like an interesting project, maybe we can add it to next years GSoC?

Thanks,

Andrew Fish

> Thanks for your work!
> 
> Best regards,
> Marvin
> 
>> The driver's handling of unclean unmounting through forced shutdown is unclear.
>> Is there a position in edk2 on how to handle such cases? I don't think
>> FAT32 has a "this filesystem is/was dirty" and even though it seems to
>> me that stopping a system from booting/opening the partition because
>> "we may find some tiny irregularities" is not the best course of
>> action, I can't find a clear answer.
>> 
>> The driver also had to add implementations of CRC32C and CRC16, and
>> after talking with my mentor we quickly reached the conclusion that
>> these may be good candidates for inclusion in MdePkg. We also
>> discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
>> (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?
>> 
>> Feel free to ask any questions you may find relevant.
>> 
>> Best Regards,
>> 
>> Pedro Falcato
>> 
>> 
>> 
>> 
> 
> 
> 
> 


[-- Attachment #2: Type: text/html, Size: 25392 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [edk2-devel] RFC: EXT4 filesystem driver
  2021-07-22 15:58   ` Andrew Fish
@ 2021-07-22 16:07     ` Marvin Häuser
  2021-07-22 16:54     ` Pedro Falcato
  1 sibling, 0 replies; 10+ messages in thread
From: Marvin Häuser @ 2021-07-22 16:07 UTC (permalink / raw)
  To: Andrew Fish, edk2-devel-groups-io; +Cc: pedro.falcato

On 22.07.21 17:58, Andrew Fish wrote:
>
>
>> On Jul 22, 2021, at 3:24 AM, Marvin Häuser <mhaeuser@posteo.de 
>> <mailto:mhaeuser@posteo.de>> wrote:
>>
>> On 22.07.21 01:12, Pedro Falcato wrote:
>>> EXT4 (fourth extended filesystem) is a filesystem developed for Linux
>>> that has been in wide use (desktops, servers, smartphones) since 2008.
>>>
>>> The Ext4Pkg implements the Simple File System Protocol for a partition
>>> that is formatted with the EXT4 file system. This allows UEFI Drivers,
>>> UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
>>> on an EXT4 partition and supports booting a UEFI OS Loader from an
>>> EXT4 partition.
>>> This project is one of the TianoCore Google Summer of Code projects.
>>>
>>> Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
>>> UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
>>> I/O 2 Protocols, and produces the Simple File System protocol. It
>>> allows mounting ext4 filesystems exclusively.
>>>
>>> Brief overhead of EXT4:
>>> Layout of an EXT2/3/4 filesystem:
>>> (note: this driver has been developed using
>>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html 
>>> <https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html> as
>>> documentation).
>>>
>>> An ext2/3/4 filesystem (here on out referred to as simply an ext4 
>>> filesystem,
>>> due to the similarities) is composed of various concepts:
>>>
>>> 1) Superblock
>>> The superblock is the structure near (1024 bytes offset from the start)
>>> the start of the partition, and describes the filesystem in general.
>>> Here, we get to know the size of the filesystem's blocks, which features
>>> it supports or not, whether it's been cleanly unmounted, how many blocks
>>> we have, etc.
>>>
>>> 2) Block groups
>>> EXT4 filesystems are divided into block groups, and each block group 
>>> covers
>>> s_blocks_per_group(8 * Block Size) blocks. Each block group has an
>>> associated block group descriptor; these are present directly after the
>>> superblock. Each block group descriptor contains the location of the
>>> inode table, and the inode and block bitmaps (note these bitmaps are 
>>> only
>>> a block long, which gets us the 8 * Block Size formula covered 
>>> previously).
>>>
>>> 3) Blocks
>>> The ext4 filesystem is divided into blocks, of size s_log_block_size 
>>> ^ 1024.
>>> Blocks can be allocated using individual block groups's bitmaps. Note
>>> that block 0 is invalid and its presence on extents/block tables means
>>> it's part of a file hole, and that particular location must be read as
>>> a block full of zeros.
>>>
>>> 4) Inodes
>>> The ext4 filesystem divides files/directories into inodes (originally
>>> index nodes). Each file/socket/symlink/directory/etc (here on out 
>>> referred
>>> to as a file, since there is no distinction under the ext4 
>>> filesystem) is
>>> stored as a /nameless/ inode, that is stored in some block group's inode
>>> table. Each inode has s_inode_size size (or GOOD_OLD_INODE_SIZE if it's
>>> an old filesystem), and holds various metadata about the file. Since the
>>> largest inode structure right now is ~160 bytes, the rest of the inode
>>> contains inline extended attributes. Inodes' data is stored using either
>>> data blocks (under ext2/3) or extents (under ext4).
>>>
>>> 5) Extents
>>> Ext4 inodes store data in extents. These let N contiguous logical blocks
>>> that are represented by N contiguous physical blocks be represented by a
>>> single extent structure, which minimizes filesystem metadata bloat and
>>> speeds up block mapping (particularly due to the fact that high-quality
>>> ext4 implementations like linux's try /really/ hard to make the file
>>> contiguous, so it's common to have files with almost 0 fragmentation).
>>> Inodes that use extents store them in a tree, and the top of the tree
>>> is stored on i_data. The tree's leaves always start with an
>>> EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX on eh_depth != 0 and
>>> EXT4_EXTENT on eh_depth = 0; these entries are always sorted by logical
>>> block.
>>>
>>> 6) Directories
>>> Ext4 directories are files that store name -> inode mappings for the
>>> logical directory; this is where files get their names, which means ext4
>>> inodes do not themselves have names, since they can be linked (present)
>>> multiple times with different names. Directories can store entries 
>>> in two
>>> different ways:
>>> 1) Classical linear directories: They store entries as a mostly-linked
>>> mostly-list of EXT4_DIR_ENTRY.
>>> 2) Hash tree directories: These are used for larger directories, with
>>> hundreds of entries, and are designed in a backwards-compatible way.
>>> These are not yet implemented in the Ext4Dxe driver.
>>>
>>> 7) Journal
>>> Ext3/4 filesystems have a journal to help protect the filesystem against
>>> system crashes. This is not yet implemented in Ext4Dxe but is described
>>> in detail in the Linux kernel's documentation.
>>>
>>> The EDK2 implementation of ext4 is based only on the public 
>>> documentation
>>> available at 
>>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html 
>>> <https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html>
>>> and
>>> the FreeBSD ext2fs driver (available at
>>> https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs 
>>> <https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs>,
>>> BSD-2-Clause-FreeBSD licensed). It is licensed as
>>> SPDX-License-Identifier: BSD-2-Clause-Patent.
>>>
>>> After a brief discussion with the community, the proposed package
>>> location is edk2-platform/Features/Ext4Pkg
>>> (relevant discussion: https://edk2.groups.io/g/devel/topic/83060185 
>>> <https://edk2.groups.io/g/devel/topic/83060185>).
>>>
>>> I was the main contributor and I would like to maintain the package in
>>> the future, if possible.
>>
>> While I personally don't like it's outside of the EDK II core, I kind 
>> of get it. However I would strongly suggest to choose a more general 
>> package name, like "LinuxFsPkg", or "NixFsPkg", or maybe even just 
>> "FileSystemPkg" (and move FAT over some day?). Imagine someone wants 
>> to import BTRFS next year, should it really be "BtrfsPkg"? I 
>> understand it follows the "FatPkg" convention, but I feel like people 
>> forget FatPkg was special as to its awkward license before Microsoft 
>> allowed a change a few years ago. Maintainers.txt already has the 
>> concept of different Reviewers per subfolder, maybe it could be 
>> extended a little to have a common EDK II contributor to officially 
>> maintain the package, but have you be a Maintainer or something like 
>> a Reviewer+ to your driver? Or you could maintain the entire package 
>> of course.
>>
>
> Marvin,
>
> Good point that the FatPkg was more about license boundary than 
> anything else, so I’m not opposed to a more generic package name.
>
>>> Current limitations:
>>> 1) The Ext4Dxe driver is, at the moment, read-only.
>>> 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
>>> filesystems. Ensuring compatibility with
>>> those may not be a bad idea.
>>>
>>> I intend to test the package using the UEFI SCTs present in edk2-test,
>>> and implement any other needed unit tests myself using the already
>>> available unit test framework. I also intend to (privately) fuzz the
>>> UEFI driver with bad/unusual disk images, to improve the security and
>>> reliability of the driver.
>>>
>>> In the future, ext4 write support should be added so edk2 has a
>>> fully-featured RW ext4 driver. There could also be a focus on
>>> supporting the older ext4-like filesystems, as I mentioned in the
>>> limitations, but that is open for discussion.
>>
>> I may be alone, but I strongly object. One of our projects (OpenCore) 
>> has a disgusting way of writing files because the FAT32 driver in 
>> Aptio IV firmwares may corrupt the filesystem when resizing files. To 
>> be honest, it may corrupt with other usages too and we never noticed, 
>> because basically we wrote the code to require the least amount of 
>> (complex) FS operations.
>>
>> The issue with EDK II is, there is a lot of own code and a lot of 
>> users, but little testing. By that I do not mean that developers do 
>> not test their code, but that nobody sits down and performs all sorts 
>> of FS manipulations in all sorts of orders and closely observes the 
>> result for regression-testing. Users will not really test it either, 
>> as UEFI to them should just somehow boot to Windows. If at least the 
>> code was shared with a codebase that is known-trusted (e.g. the Linux 
>> kernel itself), that'd be much easier to trust, but realistically 
>> this is not going to happen.
>> My point is, if a company like AMI cannot guarantee writing does not 
>> damage the FS for a very simple FS, how do you plan to guarantee 
>> yours doesn't for a much more complex FS? I'd rather have only one 
>> simple FS type that supports writing for most use-cases (e.g. logging).
>>
>> At the very least I would beg you to have a PCD to turn write support 
>> off - if it will be off by default, that'd be great of course. :)
>> Was there any discussion yet as to why write support is needed in the 
>> first place you could point me to?
>>
>
> I think having a default PCD option of read only is a good idea.
>
> EFI on Mac carries HFS+ and APFS EFI file system drivers and both of 
> those are read only for safety, security, and to avoid the need to 
> validate them. So I think some products may want to have the option to 
> ship read only versions of the file system.
>
> Seems like having EFI base file system tests would be useful. I’d 
> imaging with OVMF it would be possible to implement a very robust test 
> infrastructure. Seems like the hard bits would be generating the test 
> cases and figuring out how to validate the tests did the correct 
> thing. I’m guess the OS based file system drivers are robust and try 
> to work around bugs gracefully? Maybe there is a way to turn on OS 
> logging, or even run an OS based fsck on the volume after the tests 
> complete. Regardless this seems like an interesting project, maybe we 
> can add it to next years GSoC?

Great idea, maybe it could be added to that wiki list of topic 
suggestions (preferably close to the top)? :)

Best regards,
Marvin

>
> Thanks,
>
> Andrew Fish
>
>> Thanks for your work!
>>
>> Best regards,
>> Marvin
>>
>>> The driver's handling of unclean unmounting through forced shutdown 
>>> is unclear.
>>> Is there a position in edk2 on how to handle such cases? I don't think
>>> FAT32 has a "this filesystem is/was dirty" and even though it seems to
>>> me that stopping a system from booting/opening the partition because
>>> "we may find some tiny irregularities" is not the best course of
>>> action, I can't find a clear answer.
>>>
>>> The driver also had to add implementations of CRC32C and CRC16, and
>>> after talking with my mentor we quickly reached the conclusion that
>>> these may be good candidates for inclusion in MdePkg. We also
>>> discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
>>> (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?
>>>
>>> Feel free to ask any questions you may find relevant.
>>>
>>> Best Regards,
>>>
>>> Pedro Falcato
>>>
>>>
>>>
>>>
>>
>>
>>
>> 
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [edk2-devel] RFC: EXT4 filesystem driver
  2021-07-22 15:58   ` Andrew Fish
  2021-07-22 16:07     ` Marvin Häuser
@ 2021-07-22 16:54     ` Pedro Falcato
  2021-07-23  2:07       ` Nate DeSimone
  1 sibling, 1 reply; 10+ messages in thread
From: Pedro Falcato @ 2021-07-22 16:54 UTC (permalink / raw)
  To: Andrew Fish; +Cc: edk2-devel-groups-io, mhaeuser, rfc

Hi Andrew, Marvin,

RE: The package name: It doesn't sound like a bad idea to have
something like a FileSystemPkg and have a bunch of different
filesystems inside of it, but I'll defer to you
and my mentors' judgement; we could also drop that issue for now and
take care of it afterwards, since it may need further changes that are
not a part of GSoC and would just delay the process.

With respect to the write capabilities of the driver, I'm not entirely
sure whether or not it's useful. I've been thinking about it today,
and it seems like there's not much that could go wrong? The write path
isn't excessively complex. Except of course in the event of an
untimely power cut, but those /should/ be easily detected by the
checksums. My initial idea was to have it up to speed with FatPkg and
other filesystems by implementing all of EFI_FILE_PROTOCOL, including
the write portions. If Apple's HFS+ and APFS drivers don't have those,
it may be a decent idea to reduce the scope of the ext4 driver as
well. I don't see a big need for write support; on the other hand,
I've only worked on UEFI bootloaders before, which may be an outlier
in that regard. Further feedback is appreciated.

As for the tests, UEFI SCTs already seem to have some tests on
EFI_FILE_PROTOCOL's. Further testing may require some sort of fuzzing,
which is what I want to, although in a
simplified way. With fuzzing we could hammer the filesystem code with
all sorts of different calls in different orders, we could also mutate
the disk structures to test if the driver is secure and
can handle corruption in a nice, safe way. A future (GSoC or not)
project could also attempt to use compiler-generated coverage
instrumentation (see LLVM's LibFuzzer and SanitizerCoverage for an
example).

I'm not sure about all OSes, but at least Linux ext2/3/4 drivers are
very robust and can handle and work around any corrupted FS I
(accidentally) throw at them. However, running fsck is the best way to
detect corruption; note that licensing may be an issue as, for
example, ext4's fsck is GPL2 licensed.

Best Regards,
Pedro

On Thu, 22 Jul 2021 at 16:58, Andrew Fish <afish@apple.com> wrote:
>
>
>
> On Jul 22, 2021, at 3:24 AM, Marvin Häuser <mhaeuser@posteo.de> wrote:
>
> On 22.07.21 01:12, Pedro Falcato wrote:
>
> EXT4 (fourth extended filesystem) is a filesystem developed for Linux
> that has been in wide use (desktops, servers, smartphones) since 2008.
>
> The Ext4Pkg implements the Simple File System Protocol for a partition
> that is formatted with the EXT4 file system. This allows UEFI Drivers,
> UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
> on an EXT4 partition and supports booting a UEFI OS Loader from an
> EXT4 partition.
> This project is one of the TianoCore Google Summer of Code projects.
>
> Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
> UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
> I/O 2 Protocols, and produces the Simple File System protocol. It
> allows mounting ext4 filesystems exclusively.
>
> Brief overhead of EXT4:
> Layout of an EXT2/3/4 filesystem:
> (note: this driver has been developed using
> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
> documentation).
>
> An ext2/3/4 filesystem (here on out referred to as simply an ext4 filesystem,
> due to the similarities) is composed of various concepts:
>
> 1) Superblock
> The superblock is the structure near (1024 bytes offset from the start)
> the start of the partition, and describes the filesystem in general.
> Here, we get to know the size of the filesystem's blocks, which features
> it supports or not, whether it's been cleanly unmounted, how many blocks
> we have, etc.
>
> 2) Block groups
> EXT4 filesystems are divided into block groups, and each block group covers
> s_blocks_per_group(8 * Block Size) blocks. Each block group has an
> associated block group descriptor; these are present directly after the
> superblock. Each block group descriptor contains the location of the
> inode table, and the inode and block bitmaps (note these bitmaps are only
> a block long, which gets us the 8 * Block Size formula covered previously).
>
> 3) Blocks
> The ext4 filesystem is divided into blocks, of size s_log_block_size ^ 1024.
> Blocks can be allocated using individual block groups's bitmaps. Note
> that block 0 is invalid and its presence on extents/block tables means
> it's part of a file hole, and that particular location must be read as
> a block full of zeros.
>
> 4) Inodes
> The ext4 filesystem divides files/directories into inodes (originally
> index nodes). Each file/socket/symlink/directory/etc (here on out referred
> to as a file, since there is no distinction under the ext4 filesystem) is
> stored as a /nameless/ inode, that is stored in some block group's inode
> table. Each inode has s_inode_size size (or GOOD_OLD_INODE_SIZE if it's
> an old filesystem), and holds various metadata about the file. Since the
> largest inode structure right now is ~160 bytes, the rest of the inode
> contains inline extended attributes. Inodes' data is stored using either
> data blocks (under ext2/3) or extents (under ext4).
>
> 5) Extents
> Ext4 inodes store data in extents. These let N contiguous logical blocks
> that are represented by N contiguous physical blocks be represented by a
> single extent structure, which minimizes filesystem metadata bloat and
> speeds up block mapping (particularly due to the fact that high-quality
> ext4 implementations like linux's try /really/ hard to make the file
> contiguous, so it's common to have files with almost 0 fragmentation).
> Inodes that use extents store them in a tree, and the top of the tree
> is stored on i_data. The tree's leaves always start with an
> EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX on eh_depth != 0 and
> EXT4_EXTENT on eh_depth = 0; these entries are always sorted by logical
> block.
>
> 6) Directories
> Ext4 directories are files that store name -> inode mappings for the
> logical directory; this is where files get their names, which means ext4
> inodes do not themselves have names, since they can be linked (present)
> multiple times with different names. Directories can store entries in two
> different ways:
> 1) Classical linear directories: They store entries as a mostly-linked
> mostly-list of EXT4_DIR_ENTRY.
> 2) Hash tree directories: These are used for larger directories, with
> hundreds of entries, and are designed in a backwards-compatible way.
> These are not yet implemented in the Ext4Dxe driver.
>
> 7) Journal
> Ext3/4 filesystems have a journal to help protect the filesystem against
> system crashes. This is not yet implemented in Ext4Dxe but is described
> in detail in the Linux kernel's documentation.
>
> The EDK2 implementation of ext4 is based only on the public documentation
> available at https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
> and
> the FreeBSD ext2fs driver (available at
> https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
> BSD-2-Clause-FreeBSD licensed). It is licensed as
> SPDX-License-Identifier: BSD-2-Clause-Patent.
>
> After a brief discussion with the community, the proposed package
> location is edk2-platform/Features/Ext4Pkg
> (relevant discussion: https://edk2.groups.io/g/devel/topic/83060185).
>
> I was the main contributor and I would like to maintain the package in
> the future, if possible.
>
>
> While I personally don't like it's outside of the EDK II core, I kind of get it. However I would strongly suggest to choose a more general package name, like "LinuxFsPkg", or "NixFsPkg", or maybe even just "FileSystemPkg" (and move FAT over some day?). Imagine someone wants to import BTRFS next year, should it really be "BtrfsPkg"? I understand it follows the "FatPkg" convention, but I feel like people forget FatPkg was special as to its awkward license before Microsoft allowed a change a few years ago. Maintainers.txt already has the concept of different Reviewers per subfolder, maybe it could be extended a little to have a common EDK II contributor to officially maintain the package, but have you be a Maintainer or something like a Reviewer+ to your driver? Or you could maintain the entire package of course.
>
>
> Marvin,
>
> Good point that the FatPkg was more about license boundary than anything else, so I’m not opposed to a more generic package name.
>
> Current limitations:
> 1) The Ext4Dxe driver is, at the moment, read-only.
> 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
> filesystems. Ensuring compatibility with
> those may not be a bad idea.
>
> I intend to test the package using the UEFI SCTs present in edk2-test,
> and implement any other needed unit tests myself using the already
> available unit test framework. I also intend to (privately) fuzz the
> UEFI driver with bad/unusual disk images, to improve the security and
> reliability of the driver.
>
> In the future, ext4 write support should be added so edk2 has a
> fully-featured RW ext4 driver. There could also be a focus on
> supporting the older ext4-like filesystems, as I mentioned in the
> limitations, but that is open for discussion.
>
>
> I may be alone, but I strongly object. One of our projects (OpenCore) has a disgusting way of writing files because the FAT32 driver in Aptio IV firmwares may corrupt the filesystem when resizing files. To be honest, it may corrupt with other usages too and we never noticed, because basically we wrote the code to require the least amount of (complex) FS operations.
>
> The issue with EDK II is, there is a lot of own code and a lot of users, but little testing. By that I do not mean that developers do not test their code, but that nobody sits down and performs all sorts of FS manipulations in all sorts of orders and closely observes the result for regression-testing. Users will not really test it either, as UEFI to them should just somehow boot to Windows. If at least the code was shared with a codebase that is known-trusted (e.g. the Linux kernel itself), that'd be much easier to trust, but realistically this is not going to happen.
> My point is, if a company like AMI cannot guarantee writing does not damage the FS for a very simple FS, how do you plan to guarantee yours doesn't for a much more complex FS? I'd rather have only one simple FS type that supports writing for most use-cases (e.g. logging).
>
> At the very least I would beg you to have a PCD to turn write support off - if it will be off by default, that'd be great of course. :)
> Was there any discussion yet as to why write support is needed in the first place you could point me to?
>
>
> I think having a default PCD option of read only is a good idea.
>
> EFI on Mac carries HFS+ and APFS EFI file system drivers and both of those are read only for safety, security, and to avoid the need to validate them. So I think some products may want to have the option to ship read only versions of the file system.
>
> Seems like having EFI base file system tests would be useful. I’d imaging with OVMF it would be possible to implement a very robust test infrastructure. Seems like the hard bits would be generating the test cases and figuring out how to validate the tests did the correct thing. I’m guess the OS based file system drivers are robust and try to work around bugs gracefully? Maybe there is a way to turn on OS logging, or even run an OS based fsck on the volume after the tests complete. Regardless this seems like an interesting project, maybe we can add it to next years GSoC?
>
> Thanks,
>
> Andrew Fish
>
> Thanks for your work!
>
> Best regards,
> Marvin
>
> The driver's handling of unclean unmounting through forced shutdown is unclear.
> Is there a position in edk2 on how to handle such cases? I don't think
> FAT32 has a "this filesystem is/was dirty" and even though it seems to
> me that stopping a system from booting/opening the partition because
> "we may find some tiny irregularities" is not the best course of
> action, I can't find a clear answer.
>
> The driver also had to add implementations of CRC32C and CRC16, and
> after talking with my mentor we quickly reached the conclusion that
> these may be good candidates for inclusion in MdePkg. We also
> discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
> (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?
>
> Feel free to ask any questions you may find relevant.
>
> Best Regards,
>
> Pedro Falcato
>
>
>
>
>
>
>
> 
>
>


-- 
Pedro Falcato

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [edk2-rfc] RFC: EXT4 filesystem driver
  2021-07-22  1:20 ` 回复: [edk2-rfc] " gaoliming
  2021-07-22  2:47   ` [edk2-devel] " Andrew Fish
@ 2021-07-22 17:08   ` Michael D Kinney
  1 sibling, 0 replies; 10+ messages in thread
From: Michael D Kinney @ 2021-07-22 17:08 UTC (permalink / raw)
  To: rfc@edk2.groups.io, gaoliming@byosoft.com.cn,
	pedro.falcato@gmail.com, devel@edk2.groups.io, Chang, Abner,
	Kinney, Michael D



> -----Original Message-----
> From: rfc@edk2.groups.io <rfc@edk2.groups.io> On Behalf Of gaoliming
> Sent: Wednesday, July 21, 2021 6:21 PM
> To: rfc@edk2.groups.io; pedro.falcato@gmail.com; devel@edk2.groups.io
> Subject: 回复: [edk2-rfc] RFC: EXT4 filesystem driver
> 
> 
> 
> > -----邮件原件-----
> > 发件人: rfc@edk2.groups.io <rfc@edk2.groups.io> 代表 Pedro Falcato
> > 发送时间: 2021年7月22日 7:12
> > 收件人: devel@edk2.groups.io
> > 抄送: rfc@edk2.groups.io
> > 主题: [edk2-rfc] RFC: EXT4 filesystem driver
> >
> > EXT4 (fourth extended filesystem) is a filesystem developed for Linux
> > that has been in wide use (desktops, servers, smartphones) since 2008.
> >
> > The Ext4Pkg implements the Simple File System Protocol for a partition
> > that is formatted with the EXT4 file system. This allows UEFI Drivers,
> > UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
> > on an EXT4 partition and supports booting a UEFI OS Loader from an
> > EXT4 partition.
> > This project is one of the TianoCore Google Summer of Code projects.
> >
> > Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
> > UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
> > I/O 2 Protocols, and produces the Simple File System protocol. It
> > allows mounting ext4 filesystems exclusively.
> >
> > Brief overhead of EXT4:
> > Layout of an EXT2/3/4 filesystem:
> > (note: this driver has been developed using
> > https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
> > documentation).
> >
> > An ext2/3/4 filesystem (here on out referred to as simply an ext4 filesystem,
> > due to the similarities) is composed of various concepts:
> >
> > 1) Superblock
> > The superblock is the structure near (1024 bytes offset from the start)
> > the start of the partition, and describes the filesystem in general.
> > Here, we get to know the size of the filesystem's blocks, which features
> > it supports or not, whether it's been cleanly unmounted, how many blocks
> > we have, etc.
> >
> > 2) Block groups
> > EXT4 filesystems are divided into block groups, and each block group covers
> > s_blocks_per_group(8 * Block Size) blocks. Each block group has an
> > associated block group descriptor; these are present directly after the
> > superblock. Each block group descriptor contains the location of the
> > inode table, and the inode and block bitmaps (note these bitmaps are only
> > a block long, which gets us the 8 * Block Size formula covered previously).
> >
> > 3) Blocks
> > The ext4 filesystem is divided into blocks, of size s_log_block_size ^ 1024.
> > Blocks can be allocated using individual block groups's bitmaps. Note
> > that block 0 is invalid and its presence on extents/block tables means
> > it's part of a file hole, and that particular location must be read as
> > a block full of zeros.
> >
> > 4) Inodes
> > The ext4 filesystem divides files/directories into inodes (originally
> > index nodes). Each file/socket/symlink/directory/etc (here on out referred
> > to as a file, since there is no distinction under the ext4 filesystem) is
> > stored as a /nameless/ inode, that is stored in some block group's inode
> > table. Each inode has s_inode_size size (or GOOD_OLD_INODE_SIZE if it's
> > an old filesystem), and holds various metadata about the file. Since the
> > largest inode structure right now is ~160 bytes, the rest of the inode
> > contains inline extended attributes. Inodes' data is stored using either
> > data blocks (under ext2/3) or extents (under ext4).
> >
> > 5) Extents
> > Ext4 inodes store data in extents. These let N contiguous logical blocks
> > that are represented by N contiguous physical blocks be represented by a
> > single extent structure, which minimizes filesystem metadata bloat and
> > speeds up block mapping (particularly due to the fact that high-quality
> > ext4 implementations like linux's try /really/ hard to make the file
> > contiguous, so it's common to have files with almost 0 fragmentation).
> > Inodes that use extents store them in a tree, and the top of the tree
> > is stored on i_data. The tree's leaves always start with an
> > EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX on eh_depth != 0
> > and
> > EXT4_EXTENT on eh_depth = 0; these entries are always sorted by logical
> > block.
> >
> > 6) Directories
> > Ext4 directories are files that store name -> inode mappings for the
> > logical directory; this is where files get their names, which means ext4
> > inodes do not themselves have names, since they can be linked (present)
> > multiple times with different names. Directories can store entries in two
> > different ways:
> > 1) Classical linear directories: They store entries as a mostly-linked
> > mostly-list of EXT4_DIR_ENTRY.
> > 2) Hash tree directories: These are used for larger directories, with
> > hundreds of entries, and are designed in a backwards-compatible way.
> > These are not yet implemented in the Ext4Dxe driver.
> >
> > 7) Journal
> > Ext3/4 filesystems have a journal to help protect the filesystem against
> > system crashes. This is not yet implemented in Ext4Dxe but is described
> > in detail in the Linux kernel's documentation.
> >
> > The EDK2 implementation of ext4 is based only on the public documentation
> > available at
> > https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
> > and
> > the FreeBSD ext2fs driver (available at
> > https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
> > BSD-2-Clause-FreeBSD licensed). It is licensed as
> > SPDX-License-Identifier: BSD-2-Clause-Patent.
> >
> > After a brief discussion with the community, the proposed package
> > location is edk2-platform/Features/Ext4Pkg
> > (relevant discussion: https://edk2.groups.io/g/devel/topic/83060185).
> >
> > I was the main contributor and I would like to maintain the package in
> > the future, if possible.
> >
> > Current limitations:
> > 1) The Ext4Dxe driver is, at the moment, read-only.
> > 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
> > filesystems. Ensuring compatibility with
> > those may not be a bad idea.
> >
> > I intend to test the package using the UEFI SCTs present in edk2-test,
> > and implement any other needed unit tests myself using the already
> > available unit test framework. I also intend to (privately) fuzz the
> > UEFI driver with bad/unusual disk images, to improve the security and
> > reliability of the driver.
> >
> > In the future, ext4 write support should be added so edk2 has a
> > fully-featured RW ext4 driver. There could also be a focus on
> > supporting the older ext4-like filesystems, as I mentioned in the
> > limitations, but that is open for discussion.
> >
> > The driver's handling of unclean unmounting through forced shutdown is
> > unclear.
> > Is there a position in edk2 on how to handle such cases? I don't think
> > FAT32 has a "this filesystem is/was dirty" and even though it seems to
> > me that stopping a system from booting/opening the partition because
> > "we may find some tiny irregularities" is not the best course of
> > action, I can't find a clear answer.
> >
> > The driver also had to add implementations of CRC32C and CRC16, and
> > after talking with my mentor we quickly reached the conclusion that
> > these may be good candidates for inclusion in MdePkg. We also
> > discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
> > (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?
> 
> Current MdePkg BaseLib has CalculateCrc32(). So, CRC32C and CRC16 can be added into BaseLib.

I agree BaseLib is as good location for these 2 APIs.

> 
> If more modules need to consume Ucs2 <-> Utf8 conversion library, BaseUcs2Utf8Lib is generic enough
> to be placed in MdePkg.

This was discussed with the RedfishPkg maintainers and was deferred until there was another use
case.  Looks like EXT4 is a 2nd use case, so it is a good time to move to this lib class and
lib instance from RedfishPkg to MdePkg.

> 
> Thanks
> Liming
> >
> > Feel free to ask any questions you may find relevant.
> >
> > Best Regards,
> >
> > Pedro Falcato
> >
> >
> >
> >
> 
> 
> 
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [edk2-devel] RFC: EXT4 filesystem driver
  2021-07-22 16:54     ` Pedro Falcato
@ 2021-07-23  2:07       ` Nate DeSimone
  2021-07-23  3:59         ` Andrew Fish
  0 siblings, 1 reply; 10+ messages in thread
From: Nate DeSimone @ 2021-07-23  2:07 UTC (permalink / raw)
  To: devel@edk2.groups.io, pedro.falcato@gmail.com, Andrew Fish
  Cc: mhaeuser@posteo.de, rfc@edk2.groups.io

Hi Pedro,

> -----Original Message-----
> From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Pedro
> Falcato
> Sent: Thursday, July 22, 2021 9:54 AM
> To: Andrew Fish <afish@apple.com>
> Cc: edk2-devel-groups-io <devel@edk2.groups.io>; mhaeuser@posteo.de;
> rfc@edk2.groups.io
> Subject: Re: [edk2-devel] RFC: EXT4 filesystem driver
> 
> Hi Andrew, Marvin,
> 
> RE: The package name: It doesn't sound like a bad idea to have something
> like a FileSystemPkg and have a bunch of different filesystems inside of it,
> but I'll defer to you and my mentors' judgement; we could also drop that
> issue for now and take care of it afterwards, since it may need further
> changes that are not a part of GSoC and would just delay the process.
> 
> With respect to the write capabilities of the driver, I'm not entirely sure
> whether or not it's useful. I've been thinking about it today, and it seems like
> there's not much that could go wrong? The write path isn't excessively
> complex. Except of course in the event of an untimely power cut, but those
> /should/ be easily detected by the checksums. My initial idea was to have it
> up to speed with FatPkg and other filesystems by implementing all of
> EFI_FILE_PROTOCOL, including the write portions. If Apple's HFS+ and APFS
> drivers don't have those, it may be a decent idea to reduce the scope of the
> ext4 driver as well. I don't see a big need for write support; on the other
> hand, I've only worked on UEFI bootloaders before, which may be an outlier
> in that regard. Further feedback is appreciated.

The most commonly used reason to for writing to the filesystem in a production environment is capsule updates. Most capsule update implementations will stage the capsule on the EFI System Partition and then reset the system to unlock flash. The second most useful is the UEFI Shell and all the many applications that run within it will write to the filesystem for a large variety of reasons. I think it would be a useful feature to have as one could conceivably start using EFI System Partitions formatted as ext4.

> 
> As for the tests, UEFI SCTs already seem to have some tests on
> EFI_FILE_PROTOCOL's. Further testing may require some sort of fuzzing,
> which is what I want to, although in a simplified way. With fuzzing we could
> hammer the filesystem code with all sorts of different calls in different
> orders, we could also mutate the disk structures to test if the driver is secure
> and can handle corruption in a nice, safe way. A future (GSoC or not) project
> could also attempt to use compiler-generated coverage instrumentation (see
> LLVM's LibFuzzer and SanitizerCoverage for an example).
> 
> I'm not sure about all OSes, but at least Linux ext2/3/4 drivers are very robust
> and can handle and work around any corrupted FS I
> (accidentally) throw at them. However, running fsck is the best way to detect
> corruption; note that licensing may be an issue as, for example, ext4's fsck is
> GPL2 licensed.
> 
> Best Regards,
> Pedro
> 
> On Thu, 22 Jul 2021 at 16:58, Andrew Fish <afish@apple.com> wrote:
> >
> >
> >
> > On Jul 22, 2021, at 3:24 AM, Marvin Häuser <mhaeuser@posteo.de> wrote:
> >
> > On 22.07.21 01:12, Pedro Falcato wrote:
> >
> > EXT4 (fourth extended filesystem) is a filesystem developed for Linux
> > that has been in wide use (desktops, servers, smartphones) since 2008.
> >
> > The Ext4Pkg implements the Simple File System Protocol for a partition
> > that is formatted with the EXT4 file system. This allows UEFI Drivers,
> > UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
> > on an EXT4 partition and supports booting a UEFI OS Loader from an
> > EXT4 partition.
> > This project is one of the TianoCore Google Summer of Code projects.
> >
> > Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
> > UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
> > I/O 2 Protocols, and produces the Simple File System protocol. It
> > allows mounting ext4 filesystems exclusively.
> >
> > Brief overhead of EXT4:
> > Layout of an EXT2/3/4 filesystem:
> > (note: this driver has been developed using
> > https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
> > documentation).
> >
> > An ext2/3/4 filesystem (here on out referred to as simply an ext4
> > filesystem, due to the similarities) is composed of various concepts:
> >
> > 1) Superblock
> > The superblock is the structure near (1024 bytes offset from the
> > start) the start of the partition, and describes the filesystem in general.
> > Here, we get to know the size of the filesystem's blocks, which
> > features it supports or not, whether it's been cleanly unmounted, how
> > many blocks we have, etc.
> >
> > 2) Block groups
> > EXT4 filesystems are divided into block groups, and each block group
> > covers
> > s_blocks_per_group(8 * Block Size) blocks. Each block group has an
> > associated block group descriptor; these are present directly after
> > the superblock. Each block group descriptor contains the location of
> > the inode table, and the inode and block bitmaps (note these bitmaps
> > are only a block long, which gets us the 8 * Block Size formula covered
> previously).
> >
> > 3) Blocks
> > The ext4 filesystem is divided into blocks, of size s_log_block_size ^ 1024.
> > Blocks can be allocated using individual block groups's bitmaps. Note
> > that block 0 is invalid and its presence on extents/block tables means
> > it's part of a file hole, and that particular location must be read as
> > a block full of zeros.
> >
> > 4) Inodes
> > The ext4 filesystem divides files/directories into inodes (originally
> > index nodes). Each file/socket/symlink/directory/etc (here on out
> > referred to as a file, since there is no distinction under the ext4
> > filesystem) is stored as a /nameless/ inode, that is stored in some
> > block group's inode table. Each inode has s_inode_size size (or
> > GOOD_OLD_INODE_SIZE if it's an old filesystem), and holds various
> > metadata about the file. Since the largest inode structure right now
> > is ~160 bytes, the rest of the inode contains inline extended
> > attributes. Inodes' data is stored using either data blocks (under ext2/3) or
> extents (under ext4).
> >
> > 5) Extents
> > Ext4 inodes store data in extents. These let N contiguous logical
> > blocks that are represented by N contiguous physical blocks be
> > represented by a single extent structure, which minimizes filesystem
> > metadata bloat and speeds up block mapping (particularly due to the
> > fact that high-quality
> > ext4 implementations like linux's try /really/ hard to make the file
> > contiguous, so it's common to have files with almost 0 fragmentation).
> > Inodes that use extents store them in a tree, and the top of the tree
> > is stored on i_data. The tree's leaves always start with an
> > EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX on eh_depth != 0
> and
> > EXT4_EXTENT on eh_depth = 0; these entries are always sorted by
> > logical block.
> >
> > 6) Directories
> > Ext4 directories are files that store name -> inode mappings for the
> > logical directory; this is where files get their names, which means
> > ext4 inodes do not themselves have names, since they can be linked
> > (present) multiple times with different names. Directories can store
> > entries in two different ways:
> > 1) Classical linear directories: They store entries as a mostly-linked
> > mostly-list of EXT4_DIR_ENTRY.
> > 2) Hash tree directories: These are used for larger directories, with
> > hundreds of entries, and are designed in a backwards-compatible way.
> > These are not yet implemented in the Ext4Dxe driver.
> >
> > 7) Journal
> > Ext3/4 filesystems have a journal to help protect the filesystem
> > against system crashes. This is not yet implemented in Ext4Dxe but is
> > described in detail in the Linux kernel's documentation.
> >
> > The EDK2 implementation of ext4 is based only on the public
> > documentation available at
> > https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
> > and
> > the FreeBSD ext2fs driver (available at
> > https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
> > BSD-2-Clause-FreeBSD licensed). It is licensed as
> > SPDX-License-Identifier: BSD-2-Clause-Patent.
> >
> > After a brief discussion with the community, the proposed package
> > location is edk2-platform/Features/Ext4Pkg (relevant discussion:
> > https://edk2.groups.io/g/devel/topic/83060185).
> >
> > I was the main contributor and I would like to maintain the package in
> > the future, if possible.
> >
> >
> > While I personally don't like it's outside of the EDK II core, I kind of get it.
> However I would strongly suggest to choose a more general package name,
> like "LinuxFsPkg", or "NixFsPkg", or maybe even just "FileSystemPkg" (and
> move FAT over some day?). Imagine someone wants to import BTRFS next
> year, should it really be "BtrfsPkg"? I understand it follows the "FatPkg"
> convention, but I feel like people forget FatPkg was special as to its awkward
> license before Microsoft allowed a change a few years ago. Maintainers.txt
> already has the concept of different Reviewers per subfolder, maybe it could
> be extended a little to have a common EDK II contributor to officially maintain
> the package, but have you be a Maintainer or something like a Reviewer+ to
> your driver? Or you could maintain the entire package of course.
> >
> >
> > Marvin,
> >
> > Good point that the FatPkg was more about license boundary than
> anything else, so I’m not opposed to a more generic package name.
> >
> > Current limitations:
> > 1) The Ext4Dxe driver is, at the moment, read-only.
> > 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
> > filesystems. Ensuring compatibility with those may not be a bad idea.
> >
> > I intend to test the package using the UEFI SCTs present in edk2-test,
> > and implement any other needed unit tests myself using the already
> > available unit test framework. I also intend to (privately) fuzz the
> > UEFI driver with bad/unusual disk images, to improve the security and
> > reliability of the driver.
> >
> > In the future, ext4 write support should be added so edk2 has a
> > fully-featured RW ext4 driver. There could also be a focus on
> > supporting the older ext4-like filesystems, as I mentioned in the
> > limitations, but that is open for discussion.
> >
> >
> > I may be alone, but I strongly object. One of our projects (OpenCore) has a
> disgusting way of writing files because the FAT32 driver in Aptio IV firmwares
> may corrupt the filesystem when resizing files. To be honest, it may corrupt
> with other usages too and we never noticed, because basically we wrote the
> code to require the least amount of (complex) FS operations.
> >
> > The issue with EDK II is, there is a lot of own code and a lot of users, but
> little testing. By that I do not mean that developers do not test their code,
> but that nobody sits down and performs all sorts of FS manipulations in all
> sorts of orders and closely observes the result for regression-testing. Users
> will not really test it either, as UEFI to them should just somehow boot to
> Windows. If at least the code was shared with a codebase that is known-
> trusted (e.g. the Linux kernel itself), that'd be much easier to trust, but
> realistically this is not going to happen.
> > My point is, if a company like AMI cannot guarantee writing does not
> damage the FS for a very simple FS, how do you plan to guarantee yours
> doesn't for a much more complex FS? I'd rather have only one simple FS type
> that supports writing for most use-cases (e.g. logging).
> >
> > At the very least I would beg you to have a PCD to turn write support
> > off - if it will be off by default, that'd be great of course. :) Was there any
> discussion yet as to why write support is needed in the first place you could
> point me to?
> >
> >
> > I think having a default PCD option of read only is a good idea.
> >
> > EFI on Mac carries HFS+ and APFS EFI file system drivers and both of those
> are read only for safety, security, and to avoid the need to validate them. So I
> think some products may want to have the option to ship read only versions
> of the file system.
> >
> > Seems like having EFI base file system tests would be useful. I’d imaging
> with OVMF it would be possible to implement a very robust test
> infrastructure. Seems like the hard bits would be generating the test cases
> and figuring out how to validate the tests did the correct thing. I’m guess the
> OS based file system drivers are robust and try to work around bugs
> gracefully? Maybe there is a way to turn on OS logging, or even run an OS
> based fsck on the volume after the tests complete. Regardless this seems
> like an interesting project, maybe we can add it to next years GSoC?
> >
> > Thanks,
> >
> > Andrew Fish
> >
> > Thanks for your work!
> >
> > Best regards,
> > Marvin
> >
> > The driver's handling of unclean unmounting through forced shutdown is
> unclear.
> > Is there a position in edk2 on how to handle such cases? I don't think
> > FAT32 has a "this filesystem is/was dirty" and even though it seems to
> > me that stopping a system from booting/opening the partition because
> > "we may find some tiny irregularities" is not the best course of
> > action, I can't find a clear answer.
> >
> > The driver also had to add implementations of CRC32C and CRC16, and
> > after talking with my mentor we quickly reached the conclusion that
> > these may be good candidates for inclusion in MdePkg. We also
> > discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
> > (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?
> >
> > Feel free to ask any questions you may find relevant.
> >
> > Best Regards,
> >
> > Pedro Falcato
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> 
> 
> --
> Pedro Falcato
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [edk2-devel] RFC: EXT4 filesystem driver
  2021-07-23  2:07       ` Nate DeSimone
@ 2021-07-23  3:59         ` Andrew Fish
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Fish @ 2021-07-23  3:59 UTC (permalink / raw)
  To: Desimone, Nathaniel L
  Cc: devel@edk2.groups.io, pedro.falcato@gmail.com, mhaeuser@posteo.de,
	rfc@edk2.groups.io

[-- Attachment #1: Type: text/plain, Size: 15157 bytes --]



> On Jul 22, 2021, at 7:07 PM, Desimone, Nathaniel L <nathaniel.l.desimone@intel.com> wrote:
> 
> Hi Pedro,
> 
>> -----Original Message-----
>> From: devel@edk2.groups.io <mailto:devel@edk2.groups.io> <devel@edk2.groups.io <mailto:devel@edk2.groups.io>> On Behalf Of Pedro
>> Falcato
>> Sent: Thursday, July 22, 2021 9:54 AM
>> To: Andrew Fish <afish@apple.com <mailto:afish@apple.com>>
>> Cc: edk2-devel-groups-io <devel@edk2.groups.io <mailto:devel@edk2.groups.io>>; mhaeuser@posteo.de <mailto:mhaeuser@posteo.de>;
>> rfc@edk2.groups.io <mailto:rfc@edk2.groups.io>
>> Subject: Re: [edk2-devel] RFC: EXT4 filesystem driver
>> 
>> Hi Andrew, Marvin,
>> 
>> RE: The package name: It doesn't sound like a bad idea to have something
>> like a FileSystemPkg and have a bunch of different filesystems inside of it,
>> but I'll defer to you and my mentors' judgement; we could also drop that
>> issue for now and take care of it afterwards, since it may need further
>> changes that are not a part of GSoC and would just delay the process.
>> 
>> With respect to the write capabilities of the driver, I'm not entirely sure
>> whether or not it's useful. I've been thinking about it today, and it seems like
>> there's not much that could go wrong? The write path isn't excessively
>> complex. Except of course in the event of an untimely power cut, but those
>> /should/ be easily detected by the checksums. My initial idea was to have it
>> up to speed with FatPkg and other filesystems by implementing all of
>> EFI_FILE_PROTOCOL, including the write portions. If Apple's HFS+ and APFS
>> drivers don't have those, it may be a decent idea to reduce the scope of the
>> ext4 driver as well. I don't see a big need for write support; on the other
>> hand, I've only worked on UEFI bootloaders before, which may be an outlier
>> in that regard. Further feedback is appreciated.
> 
> The most commonly used reason to for writing to the filesystem in a production environment is capsule updates. Most capsule update implementations will stage the capsule on the EFI System Partition and then reset the system to unlock flash. The second most useful is the UEFI Shell and all the many applications that run within it will write to the filesystem for a large variety of reasons. I think it would be a useful feature to have as one could conceivably start using EFI System Partitions formatted as ext4.
> 

The EFI System Partition is defined to be FAT32 by the UEFI Spec for interoperability. It defines the file system drivers required for the firmware and OS. So changing that is not really an option. 

You can still install the UEFI Shell to a read only file system, you just need to do it from the OS :). We actually do this on Macs quite often. You just run the macOS bless command and reboot to the UEFI Shell. 

Thanks,

Andrew Fish

>> 
>> As for the tests, UEFI SCTs already seem to have some tests on
>> EFI_FILE_PROTOCOL's. Further testing may require some sort of fuzzing,
>> which is what I want to, although in a simplified way. With fuzzing we could
>> hammer the filesystem code with all sorts of different calls in different
>> orders, we could also mutate the disk structures to test if the driver is secure
>> and can handle corruption in a nice, safe way. A future (GSoC or not) project
>> could also attempt to use compiler-generated coverage instrumentation (see
>> LLVM's LibFuzzer and SanitizerCoverage for an example).
>> 
>> I'm not sure about all OSes, but at least Linux ext2/3/4 drivers are very robust
>> and can handle and work around any corrupted FS I
>> (accidentally) throw at them. However, running fsck is the best way to detect
>> corruption; note that licensing may be an issue as, for example, ext4's fsck is
>> GPL2 licensed.
>> 
>> Best Regards,
>> Pedro
>> 
>> On Thu, 22 Jul 2021 at 16:58, Andrew Fish <afish@apple.com> wrote:
>>> 
>>> 
>>> 
>>> On Jul 22, 2021, at 3:24 AM, Marvin Häuser <mhaeuser@posteo.de> wrote:
>>> 
>>> On 22.07.21 01:12, Pedro Falcato wrote:
>>> 
>>> EXT4 (fourth extended filesystem) is a filesystem developed for Linux
>>> that has been in wide use (desktops, servers, smartphones) since 2008.
>>> 
>>> The Ext4Pkg implements the Simple File System Protocol for a partition
>>> that is formatted with the EXT4 file system. This allows UEFI Drivers,
>>> UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
>>> on an EXT4 partition and supports booting a UEFI OS Loader from an
>>> EXT4 partition.
>>> This project is one of the TianoCore Google Summer of Code projects.
>>> 
>>> Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
>>> UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
>>> I/O 2 Protocols, and produces the Simple File System protocol. It
>>> allows mounting ext4 filesystems exclusively.
>>> 
>>> Brief overhead of EXT4:
>>> Layout of an EXT2/3/4 filesystem:
>>> (note: this driver has been developed using
>>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
>>> documentation).
>>> 
>>> An ext2/3/4 filesystem (here on out referred to as simply an ext4
>>> filesystem, due to the similarities) is composed of various concepts:
>>> 
>>> 1) Superblock
>>> The superblock is the structure near (1024 bytes offset from the
>>> start) the start of the partition, and describes the filesystem in general.
>>> Here, we get to know the size of the filesystem's blocks, which
>>> features it supports or not, whether it's been cleanly unmounted, how
>>> many blocks we have, etc.
>>> 
>>> 2) Block groups
>>> EXT4 filesystems are divided into block groups, and each block group
>>> covers
>>> s_blocks_per_group(8 * Block Size) blocks. Each block group has an
>>> associated block group descriptor; these are present directly after
>>> the superblock. Each block group descriptor contains the location of
>>> the inode table, and the inode and block bitmaps (note these bitmaps
>>> are only a block long, which gets us the 8 * Block Size formula covered
>> previously).
>>> 
>>> 3) Blocks
>>> The ext4 filesystem is divided into blocks, of size s_log_block_size ^ 1024.
>>> Blocks can be allocated using individual block groups's bitmaps. Note
>>> that block 0 is invalid and its presence on extents/block tables means
>>> it's part of a file hole, and that particular location must be read as
>>> a block full of zeros.
>>> 
>>> 4) Inodes
>>> The ext4 filesystem divides files/directories into inodes (originally
>>> index nodes). Each file/socket/symlink/directory/etc (here on out
>>> referred to as a file, since there is no distinction under the ext4
>>> filesystem) is stored as a /nameless/ inode, that is stored in some
>>> block group's inode table. Each inode has s_inode_size size (or
>>> GOOD_OLD_INODE_SIZE if it's an old filesystem), and holds various
>>> metadata about the file. Since the largest inode structure right now
>>> is ~160 bytes, the rest of the inode contains inline extended
>>> attributes. Inodes' data is stored using either data blocks (under ext2/3) or
>> extents (under ext4).
>>> 
>>> 5) Extents
>>> Ext4 inodes store data in extents. These let N contiguous logical
>>> blocks that are represented by N contiguous physical blocks be
>>> represented by a single extent structure, which minimizes filesystem
>>> metadata bloat and speeds up block mapping (particularly due to the
>>> fact that high-quality
>>> ext4 implementations like linux's try /really/ hard to make the file
>>> contiguous, so it's common to have files with almost 0 fragmentation).
>>> Inodes that use extents store them in a tree, and the top of the tree
>>> is stored on i_data. The tree's leaves always start with an
>>> EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX on eh_depth != 0
>> and
>>> EXT4_EXTENT on eh_depth = 0; these entries are always sorted by
>>> logical block.
>>> 
>>> 6) Directories
>>> Ext4 directories are files that store name -> inode mappings for the
>>> logical directory; this is where files get their names, which means
>>> ext4 inodes do not themselves have names, since they can be linked
>>> (present) multiple times with different names. Directories can store
>>> entries in two different ways:
>>> 1) Classical linear directories: They store entries as a mostly-linked
>>> mostly-list of EXT4_DIR_ENTRY.
>>> 2) Hash tree directories: These are used for larger directories, with
>>> hundreds of entries, and are designed in a backwards-compatible way.
>>> These are not yet implemented in the Ext4Dxe driver.
>>> 
>>> 7) Journal
>>> Ext3/4 filesystems have a journal to help protect the filesystem
>>> against system crashes. This is not yet implemented in Ext4Dxe but is
>>> described in detail in the Linux kernel's documentation.
>>> 
>>> The EDK2 implementation of ext4 is based only on the public
>>> documentation available at
>>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
>>> and
>>> the FreeBSD ext2fs driver (available at
>>> https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
>>> BSD-2-Clause-FreeBSD licensed). It is licensed as
>>> SPDX-License-Identifier: BSD-2-Clause-Patent.
>>> 
>>> After a brief discussion with the community, the proposed package
>>> location is edk2-platform/Features/Ext4Pkg (relevant discussion:
>>> https://edk2.groups.io/g/devel/topic/83060185).
>>> 
>>> I was the main contributor and I would like to maintain the package in
>>> the future, if possible.
>>> 
>>> 
>>> While I personally don't like it's outside of the EDK II core, I kind of get it.
>> However I would strongly suggest to choose a more general package name,
>> like "LinuxFsPkg", or "NixFsPkg", or maybe even just "FileSystemPkg" (and
>> move FAT over some day?). Imagine someone wants to import BTRFS next
>> year, should it really be "BtrfsPkg"? I understand it follows the "FatPkg"
>> convention, but I feel like people forget FatPkg was special as to its awkward
>> license before Microsoft allowed a change a few years ago. Maintainers.txt
>> already has the concept of different Reviewers per subfolder, maybe it could
>> be extended a little to have a common EDK II contributor to officially maintain
>> the package, but have you be a Maintainer or something like a Reviewer+ to
>> your driver? Or you could maintain the entire package of course.
>>> 
>>> 
>>> Marvin,
>>> 
>>> Good point that the FatPkg was more about license boundary than
>> anything else, so I’m not opposed to a more generic package name.
>>> 
>>> Current limitations:
>>> 1) The Ext4Dxe driver is, at the moment, read-only.
>>> 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
>>> filesystems. Ensuring compatibility with those may not be a bad idea.
>>> 
>>> I intend to test the package using the UEFI SCTs present in edk2-test,
>>> and implement any other needed unit tests myself using the already
>>> available unit test framework. I also intend to (privately) fuzz the
>>> UEFI driver with bad/unusual disk images, to improve the security and
>>> reliability of the driver.
>>> 
>>> In the future, ext4 write support should be added so edk2 has a
>>> fully-featured RW ext4 driver. There could also be a focus on
>>> supporting the older ext4-like filesystems, as I mentioned in the
>>> limitations, but that is open for discussion.
>>> 
>>> 
>>> I may be alone, but I strongly object. One of our projects (OpenCore) has a
>> disgusting way of writing files because the FAT32 driver in Aptio IV firmwares
>> may corrupt the filesystem when resizing files. To be honest, it may corrupt
>> with other usages too and we never noticed, because basically we wrote the
>> code to require the least amount of (complex) FS operations.
>>> 
>>> The issue with EDK II is, there is a lot of own code and a lot of users, but
>> little testing. By that I do not mean that developers do not test their code,
>> but that nobody sits down and performs all sorts of FS manipulations in all
>> sorts of orders and closely observes the result for regression-testing. Users
>> will not really test it either, as UEFI to them should just somehow boot to
>> Windows. If at least the code was shared with a codebase that is known-
>> trusted (e.g. the Linux kernel itself), that'd be much easier to trust, but
>> realistically this is not going to happen.
>>> My point is, if a company like AMI cannot guarantee writing does not
>> damage the FS for a very simple FS, how do you plan to guarantee yours
>> doesn't for a much more complex FS? I'd rather have only one simple FS type
>> that supports writing for most use-cases (e.g. logging).
>>> 
>>> At the very least I would beg you to have a PCD to turn write support
>>> off - if it will be off by default, that'd be great of course. :) Was there any
>> discussion yet as to why write support is needed in the first place you could
>> point me to?
>>> 
>>> 
>>> I think having a default PCD option of read only is a good idea.
>>> 
>>> EFI on Mac carries HFS+ and APFS EFI file system drivers and both of those
>> are read only for safety, security, and to avoid the need to validate them. So I
>> think some products may want to have the option to ship read only versions
>> of the file system.
>>> 
>>> Seems like having EFI base file system tests would be useful. I’d imaging
>> with OVMF it would be possible to implement a very robust test
>> infrastructure. Seems like the hard bits would be generating the test cases
>> and figuring out how to validate the tests did the correct thing. I’m guess the
>> OS based file system drivers are robust and try to work around bugs
>> gracefully? Maybe there is a way to turn on OS logging, or even run an OS
>> based fsck on the volume after the tests complete. Regardless this seems
>> like an interesting project, maybe we can add it to next years GSoC?
>>> 
>>> Thanks,
>>> 
>>> Andrew Fish
>>> 
>>> Thanks for your work!
>>> 
>>> Best regards,
>>> Marvin
>>> 
>>> The driver's handling of unclean unmounting through forced shutdown is
>> unclear.
>>> Is there a position in edk2 on how to handle such cases? I don't think
>>> FAT32 has a "this filesystem is/was dirty" and even though it seems to
>>> me that stopping a system from booting/opening the partition because
>>> "we may find some tiny irregularities" is not the best course of
>>> action, I can't find a clear answer.
>>> 
>>> The driver also had to add implementations of CRC32C and CRC16, and
>>> after talking with my mentor we quickly reached the conclusion that
>>> these may be good candidates for inclusion in MdePkg. We also
>>> discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
>>> (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?
>>> 
>>> Feel free to ask any questions you may find relevant.
>>> 
>>> Best Regards,
>>> 
>>> Pedro Falcato
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> --
>> Pedro Falcato
>> 
>> 
>> 


[-- Attachment #2: Type: text/html, Size: 22320 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-07-23  4:00 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-07-21 23:12 RFC: EXT4 filesystem driver Pedro Falcato
2021-07-22  1:20 ` 回复: [edk2-rfc] " gaoliming
2021-07-22  2:47   ` [edk2-devel] " Andrew Fish
2021-07-22 17:08   ` Michael D Kinney
2021-07-22 10:24 ` [edk2-devel] " Marvin Häuser
2021-07-22 15:58   ` Andrew Fish
2021-07-22 16:07     ` Marvin Häuser
2021-07-22 16:54     ` Pedro Falcato
2021-07-23  2:07       ` Nate DeSimone
2021-07-23  3:59         ` Andrew Fish

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox