On 18. Feb 2023, at 16:14, Pedro Falcato <pedro.falcato@gmail.com> wrote:

Proper collation (and proper unicode handling in general) is AIUI
pretty hard and involves a lot of tables, etc.
So it makes total sense to me why one wants to "dynamically link" this
in using protocols instead of statically linking it everywhere.

Should be noted that EnglishDxe makes a mockery out of unicode
collation and does plain ASCII operations only :))

Yes, and so does FatPei [1]. Also, EnglishDxe has a clear (and unvalidated!) assumption that FAT file names are strictly 7-bit ASCII [2]. Without doing any proper research, Wikipedia seems to suggest that even EASCII could be problematic for FAT file names. Under the assumption that FAT will only ever give us 7-bit ASCII file names, it's both unrealistic to assume someone will implement proper Unicode collation, but also to fear any observable deviations from FatDxe even if this was the case and Ext4Dxe used ASCII-only collation (because the extended Unicode collation support would be no-op w.r.t. FAT). The ASCII-only support is pretty tiny, it could live in BaseLib without a problem (and not be duplicated into FatPkg...).

In my opinion, merge this to fix the bug and at some later point change this to static linking (maybe alongside FatDxe, but definitely deduplicate FatPei).

[1] https://github.com/tianocore/edk2/tree/02fcfdce1e5ce86f1951191883e7e30de5aa08be/FatPei/FatLiteLib.c#L341-L365

[2] https://github.com/tianocore/edk2/tree/02fcfdce1e5ce86f1951191883e7e30de5aa08be/MdeModulePkg/Universal/Disk/UnicodeCollation/EnglishDxe/UnicodeCollationEng.c#L386-L406

I did think about that, but then decided against it since I felt that
initializing multiple times would make little logical sense.

I get it, but moving it inside would be more consistent with similar code and remove the risk for forgetting the check if it was ever used elsewhere (I know, it probably won't be, but this is a design question).

Best regards,
Marvin