Andrew, Marvin,

Thanks for the quick responses.

Here's a rundown of ASan/KASAN: you create a big virtual mapping for the shadow map (16TB on x86 without PML5), where each byte of shadow represents 8 bytes of address space. You then poison/unpoison memory as you go, as chunks of the address space get allocated (usually through malloc but, in our case, AllocatePool()/AllocatePages(), I imagine).

Since the only thing you have is a large contiguous virtual mapping, you need to either take page faults and create mappings in the shadow region as you go (very possible in user space, usually not possible in kernel space and, I assume, in UEFI), or do fun stuff with page tables. Usually this means you set up some page tables pointing to a zero page and reuse those same page tables all over the virtual mapping; then, after taking a look at all the available memory, you allocate real shadow pages for it (so you can read and write them).
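To make the shadow arithmetic concrete, here is a minimal sketch in C. It assumes a tiny static arena in place of the real 16TB mapping, and it simplifies the shadow encoding to "0 = addressable, 0xFF = poisoned". Every name in it (arena, shadow_of, poison_range, unpoison_range, is_poisoned, shadow_size_for) is made up for illustration and is not an EDK2 or compiler-rt API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHADOW_SCALE_SHIFT 3u     /* 1 shadow byte covers 2^3 = 8 bytes */
#define SHADOW_POISONED    0xFFu  /* simplified: any non-zero means "bad" */
#define ARENA_SIZE         4096u  /* hypothetical stand-in for tracked memory */

/* uint64_t storage keeps the arena suitably aligned for 8-byte granules. */
static uint64_t arena_storage[ARENA_SIZE / 8];
static uint8_t *const arena = (uint8_t *)arena_storage;
static uint8_t shadow[ARENA_SIZE >> SHADOW_SCALE_SHIFT];

/* Map an address inside the arena to the shadow byte that describes it. */
static uint8_t *shadow_of(const void *p)
{
    uintptr_t off = (uintptr_t)p - (uintptr_t)arena;
    assert(off < ARENA_SIZE);
    return &shadow[off >> SHADOW_SCALE_SHIFT];
}

/* Both helpers assume p and len are multiples of 8 (whole granules). */
static void poison_range(const void *p, size_t len)
{
    memset(shadow_of(p), SHADOW_POISONED, len >> SHADOW_SCALE_SHIFT);
}

static void unpoison_range(const void *p, size_t len)
{
    memset(shadow_of(p), 0, len >> SHADOW_SCALE_SHIFT);
}

static int is_poisoned(const void *p)
{
    return *shadow_of(p) != 0;
}

/* Shadow bytes needed to cover 2^va_bits bytes of address space. */
static uint64_t shadow_size_for(unsigned va_bits)
{
    return (1ULL << va_bits) >> SHADOW_SCALE_SHIFT;
}
```

With 47 bits of address space, shadow_size_for(47) comes out to 2^44 bytes, which is where the 16TB figure comes from. An allocator hook would poison everything up front and unpoison each chunk as it hands it out.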

Note that going a different route (with some data structure instead of the big mapping) is possible but, if you do, you can't use the faster inline ASan checks that clang/gcc can generate for you (these do the same memory accesses, but inlined instead of calling e.g. __asan_load8).
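For reference, this is roughly the check the compiler emits around each access, whether inlined or behind a call like __asan_load8. It is a sketch under assumptions: the shadow is a small static array rather than the fixed-offset mapping, and the names (mem_to_shadow, access_is_bad, checked_load8, report_error) are hypothetical. The shadow encoding follows the usual ASan convention: 0 means all 8 bytes are addressable, 1..7 means only the first k bytes are, and negative values mean poisoned.

```c
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SHIFT 3u

/* Hypothetical tracked memory (4096 bytes) and its shadow (512 bytes). */
static uint64_t arena_storage[512];
static int8_t shadow[512];
static unsigned report_count;

/* Stand-in for the runtime's error reporting; here it only counts. */
static void report_error(uintptr_t addr, size_t size)
{
    (void)addr;
    (void)size;
    report_count++;
}

/* Offset of addr from the start of the tracked arena. */
static uintptr_t arena_offset(uintptr_t addr)
{
    return addr - (uintptr_t)arena_storage;
}

static int8_t *mem_to_shadow(uintptr_t addr)
{
    return &shadow[arena_offset(addr) >> GRANULE_SHIFT];
}

/* Non-zero if an access of `size` bytes (1, 2, 4 or 8) at addr is invalid.
 * Assumes the access does not cross an 8-byte granule boundary. */
static int access_is_bad(uintptr_t addr, size_t size)
{
    int8_t s = *mem_to_shadow(addr);
    int last = (int)(arena_offset(addr) & 7u) + (int)size - 1;
    return s != 0 && last >= s;
}

/* What an instrumented 8-byte load looks like: check first, then load. */
static uint64_t checked_load8(const uint64_t *p)
{
    if (access_is_bad((uintptr_t)p, sizeof *p))
        report_error((uintptr_t)p, sizeof *p);
    return *p;
}
```

The point of the big contiguous shadow mapping is that mem_to_shadow stays a shift and an add, which is cheap enough to inline at every load and store. A lookup in some other data structure would not be.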

So yeah, if SetMemoryAttributes is the only thing we have, we're going to need some support MMU code for each architecture.

Since adding AddressSanitizer support is pretty involved (build system + actual ASAN code + MMU support code for each arch), I feel like it would be a good large project for this year. I also feel tempted to throw UBSan into the mix and just call it "Add LLVM Sanitizer support to EDK2", but I don't know if that's too much for a GSoC student. Would love some feedback on this.

Note: I would like to work on this, but since I'll be a mentor this year I prefer to first see if a student is interested in this project.

Best regards,
Pedro 

On Fri, Mar 25, 2022 at 6:42 PM Andrew Fish via groups.io <afish=apple.com@groups.io> wrote:
From a UEFI point of view, if you own the memory you can do what you want with it. The UEFI Spec does not deal with paging, but the PI Spec does have abstractions for how the CPU operates via the CPU ARCH Protocol [1].

So, for example, if you want to write-protect the page tables, add guard pages, or add a stack guard, all of that is OK and exists today [2].
PcdNullPointerDetectionPropertyMask
PcdInitValueInTempStack
PcdHeapGuardPageType
PcdHeapGuardPoolType
PcdHeapGuardPropertyMask
PcdCpuStackGuard

Does ASan just need to force page faults? Or does it want to make virtual address mappings?

If someone wants to work on ASan (or any of the other sanitizers) I’m happy to volunteer to consult. 

[1] https://github.com/tianocore/edk2/blob/master/MdePkg/Include/Protocol/Cpu.h#L221
[2] https://github.com/tianocore/edk2/blob/master/MdeModulePkg/MdeModulePkg.dec#L979

Thanks,

Andrew Fish

On Mar 25, 2022, at 2:07 AM, Marvin Häuser <mhaeuser@posteo.de> wrote:

Hey Pedro,

ASan is somewhat listed under "LLVM Optimizations".
A quick and dirty reference for UEFI UBSan can be found here: https://github.com/acidanthera/OpenCorePkg/tree/master/Library/OcGuardLib

I don’t think you need to strictly adhere to the UEFI spec for debug tooling. I cannot check the code now, but I can imagine things like ConvertPointer() will not be happy about non-identity-mapping OOTB. But the issues I can think of should be fairly easy to resolve.

Best regards,
Marvin

On 24. Mar 2022, at 23:32, Pedro Falcato <pedro.falcato@gmail.com> wrote:


Hi!

I've been thinking about adding sanitizer support (UBSan and KASAN), like coreboot already has, to the wiki's Tasks for the upcoming GSoC, but I'm a bit confused by something.
Is there anything in the UEFI spec that stops us from doing non-identity memory mappings? I know it specifies the need for the identity mappings (on the architectures where it requires the MMU to be enabled), but nowhere do I see anything about the other parts of the address space.
Of course, UEFI supporting AddressSanitizer would be kind of dependent on fancier memory mappings.

Thanks,
Pedro



--
Pedro Falcato