Subject: Re: [edk2-devel] RFC: EXT4 filesystem driver
From: Marvin Häuser
To: Andrew Fish, edk2-devel-groups-io
Cc: pedro.falcato@gmail.com
Date: Thu, 22 Jul 2021 16:07:56 +0000

On 22.07.21 17:58, Andrew Fish wrote:
>
>> On Jul 22, 2021, at 3:24 AM, Marvin Häuser wrote:
>>
>> On 22.07.21 01:12, Pedro Falcato wrote:
>>> EXT4 (fourth extended filesystem) is a filesystem developed for Linux
>>> that has been in wide use (desktops, servers, smartphones) since 2008.
>>>
>>> The Ext4Pkg implements the Simple File System Protocol for a partition
>>> that is formatted with the EXT4 file system. This allows UEFI Drivers,
>>> UEFI Applications, UEFI OS Loaders, and the UEFI Shell to access files
>>> on an EXT4 partition and supports booting a UEFI OS Loader from an
>>> EXT4 partition.
>>> This project is one of the TianoCore Google Summer of Code projects.
>>>
>>> Right now, Ext4Pkg only contains a single member, Ext4Dxe, which is a
>>> UEFI driver that consumes Block I/O, Disk I/O and (optionally) Disk
>>> I/O 2 Protocols, and produces the Simple File System protocol. It
>>> allows mounting ext4 filesystems exclusively.
>>>
>>> Brief overview of EXT4:
>>> Layout of an EXT2/3/4 filesystem:
>>> (note: this driver has been developed using
>>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html as
>>> documentation).
>>>
>>> An ext2/3/4 filesystem (here on out referred to as simply an ext4
>>> filesystem, due to the similarities) is composed of various concepts:
>>>
>>> 1) Superblock
>>> The superblock is the structure near the start of the partition (at an
>>> offset of 1024 bytes), and describes the filesystem in general.
>>> Here, we get to know the size of the filesystem's blocks, which features
>>> it supports or not, whether it's been cleanly unmounted, how many blocks
>>> we have, etc.
>>>
>>> 2) Block groups
>>> EXT4 filesystems are divided into block groups, and each block group
>>> covers s_blocks_per_group (8 * Block Size) blocks. Each block group has
>>> an associated block group descriptor; these are present directly after
>>> the superblock. Each block group descriptor contains the location of the
>>> inode table, and the inode and block bitmaps (note these bitmaps are
>>> only a block long, which gets us the 8 * Block Size formula covered
>>> previously).
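The superblock and block group arithmetic described above can be sketched in a few lines of C. This is only an illustration, not code from Ext4Dxe: the `sb_geometry` struct and the `sb_*` helper names are hypothetical, the fields are assumed to be already read from disk and byte-swapped, the limit of 6 on s_log_block_size is an assumption of this sketch, and layouts that move the descriptor table (such as META_BG) are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of the superblock fields discussed above. */
typedef struct {
  uint64_t blocks_count;      /* total number of blocks in the filesystem */
  uint32_t first_data_block;  /* s_first_data_block: 1 for 1 KiB blocks, else 0 */
  uint32_t log_block_size;    /* s_log_block_size */
  uint32_t blocks_per_group;  /* s_blocks_per_group */
} sb_geometry;

/* Block size in bytes: 2 ^ (10 + s_log_block_size). Returns 0 when the
 * exponent exceeds the limit assumed by this sketch (6, i.e. 64 KiB). */
uint32_t sb_block_size(const sb_geometry *sb)
{
  assert(sb != NULL);
  if (sb->log_block_size > 6)
    return 0;
  return 1024u << sb->log_block_size;
}

/* Upper bound on blocks per group: a one-block bitmap holds
 * 8 * block size bits, one bit per block. */
uint32_t sb_max_blocks_per_group(const sb_geometry *sb)
{
  return 8u * sb_block_size(sb);
}

/* Number of block groups, rounding up for a partial last group.
 * Returns 0 for obviously inconsistent geometry. */
uint64_t sb_group_count(const sb_geometry *sb)
{
  assert(sb != NULL);
  if (sb->blocks_per_group == 0 || sb->blocks_count <= sb->first_data_block)
    return 0;
  return (sb->blocks_count - sb->first_data_block + sb->blocks_per_group - 1)
         / sb->blocks_per_group;
}

/* First block of the group descriptor table: the block right after the
 * one that holds the superblock. */
uint64_t sb_gdt_first_block(const sb_geometry *sb)
{
  assert(sb != NULL);
  return (uint64_t)sb->first_data_block + 1;
}
```

For a 1 GiB filesystem with 4 KiB blocks (s_log_block_size = 2, 32768 blocks per group) this gives 8 block groups, with the descriptor table starting at block 1.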
>>>
>>> 3) Blocks
>>> The ext4 filesystem is divided into blocks, of size
>>> 2 ^ (10 + s_log_block_size) bytes.
>>> Blocks can be allocated using individual block groups' bitmaps. Note
>>> that block 0 is invalid and its presence on extents/block tables means
>>> it's part of a file hole, and that particular location must be read as
>>> a block full of zeros.
>>>
>>> 4) Inodes
>>> The ext4 filesystem divides files/directories into inodes (originally
>>> index nodes). Each file/socket/symlink/directory/etc (here on out
>>> referred to as a file, since there is no distinction under the ext4
>>> filesystem) is stored as a /nameless/ inode, that is stored in some
>>> block group's inode table. Each inode has s_inode_size size (or
>>> GOOD_OLD_INODE_SIZE if it's an old filesystem), and holds various
>>> metadata about the file. Since the largest inode structure right now is
>>> ~160 bytes, the rest of the inode contains inline extended attributes.
>>> Inodes' data is stored using either data blocks (under ext2/3) or
>>> extents (under ext4).
>>>
>>> 5) Extents
>>> Ext4 inodes store data in extents. These let N contiguous logical blocks
>>> that map to N contiguous physical blocks be described by a single
>>> extent structure, which minimizes filesystem metadata bloat and
>>> speeds up block mapping (particularly due to the fact that high-quality
>>> ext4 implementations like Linux's try /really/ hard to make the file
>>> contiguous, so it's common to have files with almost 0 fragmentation).
>>> Inodes that use extents store them in a tree, and the top of the tree
>>> is stored on i_data. The tree's nodes always start with an
>>> EXT4_EXTENT_HEADER and contain EXT4_EXTENT_INDEX entries when
>>> eh_depth != 0 and EXT4_EXTENT entries when eh_depth == 0; these entries
>>> are always sorted by logical block.
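Because the entries in an extent node are sorted by logical block, mapping a logical block to a physical one inside a leaf node can be done with a binary search. The sketch below assumes the leaf's entries have already been read into memory and byte-swapped. The `leaf_extent` struct mirrors the on-disk extent fields from the kernel documentation, but the struct name and `map_logical_block` are hypothetical, not taken from Ext4Dxe. Uninitialized extents (flagged through ee_len) are not handled, and a return value of 0 stands for a hole, following the "block 0 is invalid" convention above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory copy of one leaf extent entry (hypothetical struct name). */
typedef struct {
  uint32_t ee_block;     /* first logical block covered by the extent */
  uint16_t ee_len;       /* number of blocks covered */
  uint16_t ee_start_hi;  /* upper 16 bits of the physical start block */
  uint32_t ee_start_lo;  /* lower 32 bits of the physical start block */
} leaf_extent;

/* Map a logical block to a physical block using a sorted array of leaf
 * extents. Returns 0 when the block falls into a hole. */
uint64_t map_logical_block(const leaf_extent *ext, size_t count,
                           uint32_t lblock)
{
  size_t lo = 0;
  size_t hi = count;

  assert(ext != NULL || count == 0);

  /* Binary search for the first extent that starts after lblock. */
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (ext[mid].ee_block <= lblock)
      lo = mid + 1;
    else
      hi = mid;
  }
  if (lo == 0)
    return 0;  /* every extent starts after lblock: hole */

  /* The candidate is the last extent starting at or before lblock. */
  const leaf_extent *e = &ext[lo - 1];
  uint32_t offset = lblock - e->ee_block;
  if (offset >= e->ee_len)
    return 0;  /* lblock lies past the end of the candidate: hole */

  uint64_t start = ((uint64_t)e->ee_start_hi << 32) | e->ee_start_lo;
  return start + offset;
}
```

A full lookup would first walk the index nodes (eh_depth != 0) down to the right leaf using the same kind of search; that part is left out here.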
>>>
>>> 6) Directories
>>> Ext4 directories are files that store name -> inode mappings for the
>>> logical directory; this is where files get their names, which means ext4
>>> inodes do not themselves have names, since they can be linked (present)
>>> multiple times with different names. Directories can store entries
>>> in two different ways:
>>> 1) Classical linear directories: They store entries as a mostly-linked
>>> mostly-list of EXT4_DIR_ENTRY.
>>> 2) Hash tree directories: These are used for larger directories, with
>>> hundreds of entries, and are designed in a backwards-compatible way.
>>> These are not yet implemented in the Ext4Dxe driver.
>>>
>>> 7) Journal
>>> Ext3/4 filesystems have a journal to help protect the filesystem against
>>> system crashes. This is not yet implemented in Ext4Dxe but is described
>>> in detail in the Linux kernel's documentation.
>>>
>>> The EDK2 implementation of ext4 is based only on the public
>>> documentation available at
>>> https://www.kernel.org/doc/html/latest/filesystems/ext4/index.html and
>>> the FreeBSD ext2fs driver (available at
>>> https://github.com/freebsd/freebsd-src/tree/main/sys/fs/ext2fs,
>>> BSD-2-Clause-FreeBSD licensed). It is licensed as
>>> SPDX-License-Identifier: BSD-2-Clause-Patent.
>>>
>>> After a brief discussion with the community, the proposed package
>>> location is edk2-platform/Features/Ext4Pkg
>>> (relevant discussion: https://edk2.groups.io/g/devel/topic/83060185).
>>>
>>> I was the main contributor and I would like to maintain the package in
>>> the future, if possible.
>>
>> While I personally don't like that it's outside of the EDK II core, I
>> kind of get it. However, I would strongly suggest choosing a more general
>> package name, like "LinuxFsPkg", or "NixFsPkg", or maybe even just
>> "FileSystemPkg" (and move FAT over some day?). Imagine someone wants
>> to import BTRFS next year, should it really be "BtrfsPkg"?
>> I understand it follows the "FatPkg" convention, but I feel like people
>> forget FatPkg was special as to its awkward license before Microsoft
>> allowed a change a few years ago. Maintainers.txt already has the
>> concept of different Reviewers per subfolder, maybe it could be
>> extended a little to have a common EDK II contributor to officially
>> maintain the package, but have you be a Maintainer or something like
>> a Reviewer+ to your driver? Or you could maintain the entire package
>> of course.
>>
>
> Marvin,
>
> Good point that the FatPkg was more about license boundary than
> anything else, so I’m not opposed to a more generic package name.
>
>>> Current limitations:
>>> 1) The Ext4Dxe driver is, at the moment, read-only.
>>> 2) The Ext4Dxe driver at the moment cannot mount older (ext2/3)
>>> filesystems. Ensuring compatibility with
>>> those may not be a bad idea.
>>>
>>> I intend to test the package using the UEFI SCTs present in edk2-test,
>>> and implement any other needed unit tests myself using the already
>>> available unit test framework. I also intend to (privately) fuzz the
>>> UEFI driver with bad/unusual disk images, to improve the security and
>>> reliability of the driver.
>>>
>>> In the future, ext4 write support should be added so edk2 has a
>>> fully-featured RW ext4 driver. There could also be a focus on
>>> supporting the older ext4-like filesystems, as I mentioned in the
>>> limitations, but that is open for discussion.
>>
>> I may be alone, but I strongly object. One of our projects (OpenCore)
>> has a disgusting way of writing files because the FAT32 driver in
>> Aptio IV firmwares may corrupt the filesystem when resizing files. To
>> be honest, it may corrupt with other usages too and we never noticed,
>> because basically we wrote the code to require the least amount of
>> (complex) FS operations.
>>
>> The issue with EDK II is, there is a lot of own code and a lot of
>> users, but little testing. By that I do not mean that developers do
>> not test their code, but that nobody sits down and performs all sorts
>> of FS manipulations in all sorts of orders and closely observes the
>> result for regression-testing. Users will not really test it either,
>> as UEFI to them should just somehow boot to Windows. If at least the
>> code was shared with a codebase that is known-trusted (e.g. the Linux
>> kernel itself), that'd be much easier to trust, but realistically
>> this is not going to happen.
>> My point is, if a company like AMI cannot guarantee writing does not
>> damage the FS for a very simple FS, how do you plan to guarantee
>> yours doesn't for a much more complex FS? I'd rather have only one
>> simple FS type that supports writing for most use-cases (e.g. logging).
>>
>> At the very least I would beg you to have a PCD to turn write support
>> off - if it will be off by default, that'd be great of course. :)
>> Was there any discussion yet as to why write support is needed in the
>> first place you could point me to?
>>
>
> I think having a default PCD option of read only is a good idea.
>
> EFI on Mac carries HFS+ and APFS EFI file system drivers and both of
> those are read only for safety, security, and to avoid the need to
> validate them. So I think some products may want to have the option to
> ship read only versions of the file system.
>
> Seems like having EFI base file system tests would be useful. I’d
> imagine with OVMF it would be possible to implement a very robust test
> infrastructure. Seems like the hard bits would be generating the test
> cases and figuring out how to validate the tests did the correct
> thing. I’m guessing the OS based file system drivers are robust and try
> to work around bugs gracefully?
> Maybe there is a way to turn on OS
> logging, or even run an OS based fsck on the volume after the tests
> complete. Regardless this seems like an interesting project, maybe we
> can add it to next year's GSoC?

Great idea, maybe it could be added to that wiki list of topic
suggestions (preferably close to the top)? :)

Best regards,
Marvin

>
> Thanks,
>
> Andrew Fish
>
>> Thanks for your work!
>>
>> Best regards,
>> Marvin
>>
>>> The driver's handling of unclean unmounting through forced shutdown
>>> is unclear.
>>> Is there a position in edk2 on how to handle such cases? I don't think
>>> FAT32 has a "this filesystem is/was dirty" and even though it seems to
>>> me that stopping a system from booting/opening the partition because
>>> "we may find some tiny irregularities" is not the best course of
>>> action, I can't find a clear answer.
>>>
>>> The driver also had to add implementations of CRC32C and CRC16, and
>>> after talking with my mentor we quickly reached the conclusion that
>>> these may be good candidates for inclusion in MdePkg. We also
>>> discussed moving the Ucs2 <-> Utf8 conversion library in RedfishPkg
>>> (BaseUcs2Utf8Lib) into MdePkg as well. Any comments?
>>>
>>> Feel free to ask any questions you may find relevant.
>>>
>>> Best Regards,
>>>
>>> Pedro Falcato
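On the CRC32C point raised in the RFC: for readers unfamiliar with the checksum, below is a minimal bit-at-a-time sketch of standard CRC-32C (Castagnoli polynomial, reflected form 0x82F63B78, initial value and final XOR of 0xFFFFFFFF). It is not the implementation from the driver or from MdePkg. The name `sketch_crc32c` is hypothetical, and how ext4 seeds and chains the checksum over its metadata is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C over a byte buffer (hypothetical helper name).
 * A table-driven variant would be faster; this favours clarity. */
uint32_t sketch_crc32c(const void *data, size_t len)
{
  const uint8_t *p = data;
  uint32_t crc = 0xFFFFFFFFu;

  assert(data != NULL || len == 0);

  while (len--) {
    crc ^= *p++;
    for (int bit = 0; bit < 8; bit++) {
      /* Shift right; XOR in the polynomial when the dropped bit was 1. */
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;
}
```

The usual check value applies: the nine ASCII bytes "123456789" yield 0xE3069283.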