From: "Andrew Fish" <afish@apple.com>
Message-id: <327A8E9C-3E98-4049-B510-E98F093AB243@apple.com>
Subject: Re: [edk2-devel] RFC: EXT4 filesystem driver
Date: Thu, 22 Jul 2021 20:59:57 -0700
To: "Desimone, Nathaniel L" <nathaniel.l.desimone@intel.com>
Cc: "devel@edk2.groups.io", "pedro.falcato@gmail.com", "mhaeuser@posteo.de", "rfc@edk2.groups.io"

On Jul 22, 2021, at 7:07 PM, Desimone, Nathaniel L <nathaniel.l.desimone@intel.com> wrote:
Hi Pedro,

-----Original Message-----
From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Pedro Falcato
Sent: Thursday, July 22, 2021 9:54 AM
To: Andrew Fish <afish@apple.com>
Cc: edk2-devel-groups-io <devel@edk2.groups.io>; mhaeuser@posteo.de;
rfc@edk2.groups.io
Subject: Re: [edk2-devel] RFC: EXT4 filesystem driver
Hi Andrew, Marvin,

RE: The package name: It doesn't sound like a bad idea to have something
like a FileSystemPkg and have a bunch of different filesystems inside of it,
but I'll defer to you and my mentors' judgement; we could also drop that
issue for now and take care of it afterwards, since it may need further
changes that are not a part of GSoC and would just delay the process.

With respect to the write capabilities of the driver, I'm not entirely sure
whether or not it's useful. I've been thinking about it today, and it seems like
there's not much that could go wrong? The write path isn't excessively
complex. Except of course in the event of an untimely power cut, but those
/should/ be easily detected by the checksums. My initial idea was to have it
up to speed with FatPkg and other filesystems by implementing all of
EFI_FILE_PROTOCOL, including the write portions. If Apple's HFS+ and APFS
drivers don't have those, it may be a decent idea to reduce the scope of the
ext4 driver as well. I don't see a big need for write support; on the other
hand, I've only worked on UEFI bootloaders before, which may be an outlier
in that regard. Further feedback is appreciated.

The most common reason for writing to the filesystem in a production environment is capsule updates. Most capsule update implementations will stage the capsule on the EFI System Partition and then reset the system to unlock flash. The second most common is the UEFI Shell: the many applications that run within it write to the filesystem for a large variety of reasons. I think it would be a useful feature to have as one could conceivably start using EFI System Partitions formatted as ext4.

The EFI System Partition is defined to be FAT32 by the UEFI Spec for interoperability. It defines the file system drivers required for the firmware and OS. So changing that is not really an option.

You can still install the UEFI Shell to a read only file system, you just need to do it from the OS :). We actually do this on Macs quite often. You just run the macOS bless command and reboot to the UEFI Shell.

Thanks,

Andrew Fish


As for the tests, UEFI SCTs already seem to have some tests on
EFI_FILE_PROTOCOL. Further testing may require some sort of fuzzing,
which is what I want to do, although in a simplified way. With fuzzing we could
hammer the filesystem code with all sorts of different calls in different
orders; we could also mutate the disk structures to test if the driver is secure
and can handle corruption in a nice, safe way. A future (GSoC or not) project
could also attempt to use compiler-generated coverage instrumentation (see
LLVM's LibFuzzer and SanitizerCoverage for an example).
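
The "mutate the disk structures" idea can be illustrated with a very small loop. This is only a sketch in Python, not the fuzzer planned for Ext4Dxe: `mutate`, `fuzz` and `toy_parser` are hypothetical names, and the toy parser merely checks two superblock fields. It assumes a well-behaved parser rejects corrupt input with `ValueError`; any other exception stands in for a crash.

```python
import random
import struct


def mutate(image: bytes, rng: random.Random, flips: int = 8) -> bytes:
    """Return a copy of image with `flips` randomly chosen bytes overwritten."""
    data = bytearray(image)
    for _ in range(flips):
        data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)


def fuzz(parse, seed_image: bytes, iterations: int = 1000, seed: int = 0) -> int:
    """Feed mutated images to parse() and count failures other than ValueError."""
    rng = random.Random(seed)  # fixed seed so a failing run can be replayed
    crashes = 0
    for _ in range(iterations):
        try:
            parse(mutate(seed_image, rng))
        except ValueError:
            pass  # corruption was detected and rejected cleanly
        except Exception:
            crashes += 1  # anything else stands in for a driver crash
    return crashes


def toy_parser(image: bytes) -> int:
    """Hypothetical stand-in for a driver entry point; returns the block size."""
    if len(image) < 1024 + 0x3A:
        raise ValueError("image too small")
    (magic,) = struct.unpack_from("<H", image, 1024 + 0x38)
    if magic != 0xEF53:
        raise ValueError("bad magic")
    (log_block_size,) = struct.unpack_from("<I", image, 1024 + 0x18)
    if log_block_size > 6:
        raise ValueError("unsupported block size")
    return 1024 << log_block_size
```

A coverage-guided fuzzer would replace the blind byte flips with feedback from instrumentation, but the harness shape (seed image, mutation, controlled-failure check) stays the same.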

I'm not sure about all OSes, but at least Linux ext2/3/4 drivers are very robust
and can handle and work around any corrupted FS I
(accidentally) throw at them. However, running fsck is the best way to detect
corruption; note that licensing may be an issue as, for example, ext4's fsck is
GPL2 licensed.

Best Regards,
Pedro
On Thu, 22 Jul 2021 at 16:58, Andrew Fish <afish@apple.com> wrote:

On Jul 22, 2021, at 3:24 AM, Marvin Häuser <mhaeuser@posteo.de> wrote:

On 22.07.21 01:12, Pedro Falcato wrote:

EXT4 (fourth extended filesystem) is a filesystem developed for Linux
that has been in wide use (desktops, servers, smartphones) since 2008.

The Ext4Pkg implements the Simple File System Protocol for a partition
that is formatted with the EXT4 file system. This allows UEFI Drivers,
UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
on an EXT4 partition and supports booting a UEFI OS Loader from an
EXT4 partition.
This project is one of the TianoCore Google Summer of Code projects.

Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
I/O 2 Protocols, and produces the Simple File System protocol. It
allows mounting ext4 filesystems exclusively.

Brief overview of EXT4:
Layout of an EXT2/3/4 filesystem:
(note: this driver has been developed using
https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
documentation).

An ext2/3/4 filesystem (here on out referred to as simply an ext4
filesystem, due to the similarities) is composed of various concepts:

1) Superblock
The superblock is the structure near the start of the partition (at a
1024-byte offset from the start), and describes the filesystem in general.
Here, we get to know the size of the filesystem's blocks, which
features it supports or not, whether it's been cleanly unmounted, how
many blocks we have, etc.
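
As a rough illustration of reading those fields, here is a minimal Python sketch. It is not the Ext4Dxe code; `parse_superblock` and the dictionary keys are hypothetical names, and the field offsets (`s_blocks_count_lo` at 0x04, `s_log_block_size` at 0x18, `s_blocks_per_group` at 0x20, `s_magic` at 0x38, `s_state` at 0x3A, feature words at 0x5C) are taken from the kernel documentation linked above rather than from this thread.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # the superblock starts 1024 bytes into the partition
EXT4_MAGIC = 0xEF53       # s_magic value shared by ext2/3/4
STATE_CLEAN = 0x0001      # s_state bit set when the filesystem was cleanly unmounted


def parse_superblock(partition: bytes) -> dict:
    """Extract a few little-endian superblock fields from a partition image."""
    sb = partition[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 1024:
        raise ValueError("partition too small to hold a superblock")
    (blocks_count_lo,) = struct.unpack_from("<I", sb, 0x04)
    (log_block_size,) = struct.unpack_from("<I", sb, 0x18)
    (blocks_per_group,) = struct.unpack_from("<I", sb, 0x20)
    magic, state = struct.unpack_from("<HH", sb, 0x38)
    compat, incompat, ro_compat = struct.unpack_from("<III", sb, 0x5C)
    if magic != EXT4_MAGIC:
        raise ValueError("not an ext2/3/4 filesystem")
    return {
        "blocks_count_lo": blocks_count_lo,
        "log_block_size": log_block_size,
        "blocks_per_group": blocks_per_group,
        "cleanly_unmounted": bool(state & STATE_CLEAN),
        "features": (compat, incompat, ro_compat),
    }
```

A real driver would additionally refuse to mount when unknown bits are set in the incompatible feature word; that check is omitted here.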

2) Block groups
EXT4 filesystems are divided into block groups, and each block group
covers s_blocks_per_group (8 * Block Size) blocks. Each block group has an
associated block group descriptor; these are present directly after
the superblock. Each block group descriptor contains the location of
the inode table, and the inode and block bitmaps (note these bitmaps
are only a block long, which gets us the 8 * Block Size formula covered
previously).
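
The arithmetic behind block groups can be sketched as follows. This is an illustrative Python fragment with hypothetical function names; the use of `s_first_data_block` (1 on 1 KiB-block filesystems, 0 otherwise) comes from the kernel documentation, not from this thread.

```python
def max_blocks_per_group(block_size: int) -> int:
    """A one-block bitmap holds block_size * 8 bits, one bit per block."""
    return 8 * block_size


def group_count(blocks_count: int, first_data_block: int, blocks_per_group: int) -> int:
    """Number of block groups needed to cover the filesystem (rounded up)."""
    covered = blocks_count - first_data_block
    return (covered + blocks_per_group - 1) // blocks_per_group


def group_of_block(block: int, first_data_block: int, blocks_per_group: int) -> tuple:
    """Return (group index, bit index inside that group's block bitmap)."""
    relative = block - first_data_block
    return relative // blocks_per_group, relative % blocks_per_group
```

For example, with 4096-byte blocks a group covers 32768 blocks, so block 32768 is the first block of group 1.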

3) Blocks
The ext4 filesystem is divided into blocks, of size 1024 << s_log_block_size
(that is, 2 ^ (10 + s_log_block_size) bytes).
Blocks can be allocated using individual block groups' bitmaps. Note
that block 0 is invalid and its presence on extents/block tables means
it's part of a file hole, and that particular location must be read as
a block full of zeros.
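
A small sketch of the two rules in this section: the block size is 1024 shifted left by `s_log_block_size` (per the kernel documentation), and a mapping to block 0 is read back as zeros. The function names are hypothetical and the "disk" is just a byte string.

```python
def block_size(s_log_block_size: int) -> int:
    """Block size in bytes: 1024 shifted left by s_log_block_size."""
    return 1024 << s_log_block_size


def read_block(image: bytes, physical_block: int, size: int) -> bytes:
    """Read one block; physical block 0 denotes a hole and yields zeros."""
    if physical_block == 0:
        return bytes(size)
    start = physical_block * size
    data = image[start:start + size]
    if len(data) != size:
        raise ValueError("block lies beyond the end of the image")
    return data
```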

4) Inodes
The ext4 filesystem divides files/directories into inodes (originally
index nodes). Each file/socket/symlink/directory/etc (here on out
referred to as a file, since there is no distinction under the ext4
filesystem) is stored as a /nameless/ inode, that is stored in some
block group's inode table. Each inode has s_inode_size size (or
GOOD_OLD_INODE_SIZE if it's an old filesystem), and holds various
metadata about the file. Since the largest inode structure right now
is ~160 bytes, the rest of the inode contains inline extended
attributes. Inodes' data is stored using either data blocks (under ext2/3) or
extents (under ext4).
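
Locating an inode on disk is plain arithmetic once the group's inode table block is known. The sketch below uses hypothetical function names; the rule that inode numbers start at 1 and the value 128 for GOOD_OLD_INODE_SIZE come from the kernel documentation, and `s_inodes_per_group` is a superblock field not mentioned in this thread.

```python
GOOD_OLD_INODE_SIZE = 128  # inode size on old (revision 0) filesystems


def inode_location(ino: int, inodes_per_group: int, inode_size: int) -> tuple:
    """Return (block group index, byte offset inside that group's inode table)."""
    if ino < 1:
        raise ValueError("inode numbers start at 1")
    index = ino - 1
    return index // inodes_per_group, (index % inodes_per_group) * inode_size


def inode_byte_offset(inode_table_block: int, block_size: int, offset_in_table: int) -> int:
    """Absolute byte offset of the inode, given the group's inode table block."""
    return inode_table_block * block_size + offset_in_table
```

The inode table block itself comes from the block group descriptor of the group returned by `inode_location`.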

5) Extents
Ext4 inodes store data in extents. These let N contiguous logical
blocks that are represented by N contiguous physical blocks be
represented by a single extent structure, which minimizes filesystem
metadata bloat and speeds up block mapping (particularly due to the
fact that high-quality ext4 implementations like Linux's try /really/
hard to make the file contiguous, so it's common to have files with
almost 0 fragmentation).
Inodes that use extents store them in a tree, and the top of the tree
is stored on i_data. The tree's nodes always start with an
EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX entries on eh_depth != 0
and EXT4_EXTENT entries on eh_depth = 0; these entries are always sorted by
logical block.
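
To make the node layout concrete, here is a hedged Python sketch that parses one extent tree node and maps a logical block through a leaf. The function names are hypothetical; the 12-byte header and entry layouts and the 0xF30A magic follow the kernel documentation. It ignores uninitialized extents (lengths above 32768) and does not walk index nodes down the tree.

```python
import bisect
import struct

EXTENT_MAGIC = 0xF30A  # eh_magic in every extent header


def parse_extent_node(node: bytes) -> tuple:
    """Return (eh_depth, entries) for one node of the extent tree.

    Leaf entries are (first logical block, length, physical start);
    index entries are (first logical block, child node block).
    """
    magic, count, _max, depth, _generation = struct.unpack_from("<HHHHI", node, 0)
    if magic != EXTENT_MAGIC:
        raise ValueError("bad extent header magic")
    if 12 + 12 * count > len(node):
        raise ValueError("entry count exceeds node size")
    entries = []
    for i in range(count):
        offset = 12 + 12 * i
        if depth == 0:
            first, length, start_hi, start_lo = struct.unpack_from("<IHHI", node, offset)
            entries.append((first, length, (start_hi << 32) | start_lo))
        else:
            first, leaf_lo, leaf_hi, _unused = struct.unpack_from("<IIHH", node, offset)
            entries.append((first, (leaf_hi << 32) | leaf_lo))
    return depth, entries


def map_logical_block(leaf_entries: list, logical: int) -> int:
    """Binary-search sorted leaf extents; return 0 (a hole) when nothing covers it."""
    starts = [entry[0] for entry in leaf_entries]
    i = bisect.bisect_right(starts, logical) - 1
    if i < 0:
        return 0
    first, length, physical = leaf_entries[i]
    if logical >= first + length:
        return 0
    return physical + (logical - first)
```

Because entries are sorted by logical block, the same binary search works on index nodes to pick the child to descend into.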

6) Directories
Ext4 directories are files that store name -> inode mappings for the
logical directory; this is where files get their names, which means
ext4 inodes do not themselves have names, since they can be linked
(present) multiple times with different names. Directories can store
entries in two different ways:
1) Classical linear directories: They store entries as a mostly-linked
mostly-list of EXT4_DIR_ENTRY.
2) Hash tree directories: These are used for larger directories, with
hundreds of entries, and are designed in a backwards-compatible way.
These are not yet implemented in the Ext4Dxe driver.
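
The "mostly-linked mostly-list" of a linear directory can be walked by hopping from entry to entry using each entry's record length. The sketch below is illustrative only; `iter_dir_entries` is a hypothetical name, and the entry layout (32-bit inode, 16-bit rec_len, 8-bit name_len, 8-bit file type, then the name) is the one given in the kernel documentation for filesystems with the filetype feature.

```python
import struct


def iter_dir_entries(block: bytes):
    """Yield (inode, file_type, name) for each live entry in one directory block."""
    offset = 0
    while offset + 8 <= len(block):
        inode, rec_len, name_len, file_type = struct.unpack_from("<IHBB", block, offset)
        # rec_len links to the next entry, so a bad value must not be followed.
        if (rec_len < 8 or rec_len % 4 != 0
                or offset + rec_len > len(block) or 8 + name_len > rec_len):
            raise ValueError("corrupt directory entry")
        if inode != 0:  # inode 0 marks an unused (e.g. deleted) entry
            yield inode, file_type, block[offset + 8:offset + 8 + name_len]
        offset += rec_len
```

Names are returned as raw bytes; converting them for EFI_FILE_PROTOCOL consumers would be a separate step.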

7) Journal
Ext3/4 filesystems have a journal to help protect the filesystem
against system crashes. This is not yet implemented in Ext4Dxe but is
described in detail in the Linux kernel's documentation.

The EDK2 implementation of ext4 is based only on the public
documentation available at
https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html
and
the FreeBSD ext2fs driver (available at
https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
BSD-2-Clause-FreeBSD licensed). It is licensed as
SPDX-License-Identifier: BSD-2-Clause-Patent.

After a brief discussion with the community, the proposed package
location is edk2-platform/Features/Ext4Pkg (relevant discussion:
https://edk2.groups.io/g/devel/topic/83060185).

I was the main contributor and I would like to maintain the package in
the future, if possible.


While I personally don't like that it's outside of the EDK II core, I kind of get it.
However I would strongly suggest choosing a more general package name,
like "LinuxFsPkg", or "NixFsPkg", or maybe even just "FileSystemPkg" (and
move FAT over some day?). Imagine someone wants to import BTRFS next
year, should it really be "BtrfsPkg"? I understand it follows the "FatPkg"
convention, but I feel like people forget FatPkg was special as to its awkward
license before Microsoft allowed a change a few years ago. Maintainers.txt
already has the concept of different Reviewers per subfolder, maybe it could
be extended a little to have a common EDK II contributor to officially maintain
the package, but have you be a Maintainer or something like a Reviewer+ to
your driver? Or you could maintain the entire package of course.


Marvin,

Good point that the FatPkg was more about license boundary than
anything else, so I’m not opposed to a more generic package name.

Current limitations:
1) The Ext4Dxe driver is, at the moment, read-only.
2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
filesystems. Ensuring compatibility with those may not be a bad idea.

I intend to test the package using the UEFI SCTs present in edk2-test,
and implement any other needed unit tests myself using the already
available unit test framework. I also intend to (privately) fuzz the
UEFI driver with bad/unusual disk images, to improve the security and
reliability of the driver.

In the future, ext4 write support should be added so edk2 has a
fully-featured RW ext4 driver. There could also be a focus on
supporting the older ext4-like filesystems, as I mentioned in the
limitations, but that is open for discussion.


I may be alone, but I strongly object. One of our projects (OpenCore) has a
disgusting way of writing files because the FAT32 driver in Aptio IV firmwares
may corrupt the filesystem when resizing files. To be honest, it may corrupt
with other usages too and we never noticed, because basically we wrote the
code to require the least amount of (complex) FS operations.

The issue with EDK II is, there is a lot of own code and a lot of users, but
little testing. By that I do not mean that developers do not test their code,
but that nobody sits down and performs all sorts of FS manipulations in all
sorts of orders and closely observes the result for regression-testing. Users
will not really test it either, as UEFI to them should just somehow boot to
Windows. If at least the code was shared with a codebase that is known-trusted
(e.g. the Linux kernel itself), that'd be much easier to trust, but
realistically this is not going to happen.
My point is, if a company like AMI cannot guarantee writing does not
damage the FS for a very simple FS, how do you plan to guarantee yours
doesn't for a much more complex FS? I'd rather have only one simple FS type
that supports writing for most use-cases (e.g. logging).

At the very least I would beg you to have a PCD to turn write support
off - if it will be off by default, that'd be great of course. :) Was there any
discussion yet as to why write support is needed in the first place you could
point me to?


I think having a default PCD option of read only is a good idea.

EFI on Mac carries HFS+ and APFS EFI file system drivers and both of those
are read only for safety, security, and to avoid the need to validate them. So I
think some products may want to have the option to ship read only versions
of the file system.

Seems like having EFI-based file system tests would be useful. I’d imagine
with OVMF it would be possible to implement a very robust test
infrastructure. Seems like the hard bits would be generating the test cases
and figuring out how to validate the tests did the correct thing. I’m guessing the
OS based file system drivers are robust and try to work around bugs
gracefully? Maybe there is a way to turn on OS logging, or even run an OS
based fsck on the volume after the tests complete. Regardless this seems
like an interesting project, maybe we can add it to next year's GSoC?

Thanks,

Andrew Fish

Thanks for your work!

Best regards,
Marvin

The driver's handling of unclean unmounting through forced shutdown is
unclear.
Is there a position in edk2 on how to handle such cases? I don't think
FAT32 has a "this filesystem is/was dirty" and even though it seems to
me that stopping a system from booting/opening the partition because
"we may find some tiny irregularities" is not the best course of
action, I can't find a clear answer.

The driver also had to add implementations of CRC32C and CRC16, and
after talking with my mentor we quickly reached the conclusion that
these may be good candidates for inclusion in MdePkg. We also
discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
(BaseUcs2Utf8Lib) into MdePkg as well. Any comments?
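
For readers unfamiliar with the two checksums, here is a bit-at-a-time Python sketch. It is not the Ext4Dxe or MdePkg code. It uses the common parameterizations (CRC32C with the reflected Castagnoli polynomial 0x82F63B78 and inverted input/output, CRC16 with the reflected polynomial 0xA001 and no inversion); how ext4 seeds these for its metadata checksums is not shown, and a firmware implementation would more likely be table-driven.

```python
def crc32c(data: bytes, crc: int = 0) -> int:
    """CRC-32C (Castagnoli), reflected form, processed one bit at a time."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def crc16(data: bytes, crc: int = 0) -> int:
    """CRC-16 with reflected polynomial 0xA001, no input or output inversion."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xA001 if crc & 1 else 0)
    return crc
```

Both functions accept a running value, so a buffer can be checksummed in pieces: `crc32c(b, crc32c(a))` equals `crc32c(a + b)`.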

Feel free to ask any questions you may find relevant.

Best Regards,

Pedro Falcato

--
Pedro Falcato