Hi Andrew. I do agree with having better debugging support for Rust-UEFI. However, I don't have much experience with even the C side of UEFI debugging, so it has been difficult for me to do much on the Rust side either. It isn't that Rust UEFI debugging is impossible. There are some examples ([1]), but as you might guess, they are quite few. Additionally, most of them seem to be concentrated around Windows. Finally, there is actually quite good debugging support in uefi-rs [2], but that's licensed under MPL-2.0, so I have steered clear of it.


Another reason I have gone so long without using any actual debugger is that Rust makes it quite explicit where weird errors can occur. Most functions, even inside std, don't use raw pointers but rather use `NonNull`, `MaybeUninit` or some other safe abstraction. Even on the worst failure, the program will abort instead of crashing. On aborting, Rust also gives a nice message on stderr stating exactly where the error occurred along with all the details of the error. Basically, it was always so clear what caused the error and where it occurred that I simply didn't feel the need to attach a debugger, till now.
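
To illustrate the kind of abstraction meant here, a minimal sketch using the two std types named above. The helper names `read_checked` and `init_in_place` are made up for this example and are not taken from the UEFI std port.

```rust
use std::mem::MaybeUninit;
use std::ptr::NonNull;

/// Hypothetical helper: returns the pointee if `ptr` is non-null.
/// A null pointer becomes `None` instead of being dereferenced.
fn read_checked(ptr: *mut u32) -> Option<u32> {
    // SAFETY: assumes any non-null `ptr` passed in points to a valid, aligned u32.
    NonNull::new(ptr).map(|p| unsafe { *p.as_ref() })
}

/// Hypothetical helper: initializes a value in place before reading it.
fn init_in_place() -> u32 {
    let mut slot = MaybeUninit::<u32>::uninit();
    slot.write(42);
    // SAFETY: `slot` was written on the line above.
    unsafe { slot.assume_init() }
}

fn main() {
    let mut value = 7u32;
    assert_eq!(read_checked(&mut value), Some(7));
    assert_eq!(read_checked(std::ptr::null_mut()), None);
    assert_eq!(init_in_place(), 42);
}
```

The point is only that the null check and the "is it initialized" question show up in the types, so the places where something can go wrong are visible in the source.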


There is some work going on to actually document using Rust for UEFI in the rustc docs, sponsored by Red Hat [3], so the situation should improve soon. What I do know is that the `.pdb` files generated by Rust (which are always generated) should contain all the symbols for debugging. I can work on debugging once I finish up with solving all other errors which are due to my implementation of std rather than the ones that have been caused by the rust intrinsic.


Ayush Singh


[1]: https://xitan.me/posts/rust-uefi-runtime-driver/

[2]: https://github.com/rust-osdev/uefi-rs

[3]: https://github.com/rust-lang/rust/pull/99760


On 7/26/22 23:20, Andrew Fish wrote:
On Jul 25, 2022, at 10:43 PM, Ayush Singh <ayushdevel1325@gmail.com> wrote:

Hi Andrew. Thanks for all your work. The more I look at this, the more it feels like it might be a problem on the LLVM side instead of Rust. I also found some more tests (all related to numbers btw) which can cause different types of exceptions, so I think I will try filing bugs upstream.


Ayush,

In general, if we want to move to Rust we are going to need a way to debug issues like this down to the root cause. I think figuring out how to debug will make it easier to move forward with the Rust port in general. It will be time well spent.

The best way to get LLVM fixed, if it is even broken, is to provide a simple test case that reproduces the behavior. I don’t think at this point we know what is going on. It is very unlikely that some random LLVM developer is going to invest the time in trying to set up some UEFI environment to try and root cause this bug. I generally find I have to create a simple at-desk example and then I get stuff fixed quickly. Basically a test case the LLVM developer can compile at their desk and see the error in the assembler output, or at least run it at their desk and get bogus output.

I’m not 100% sure what toolchain you are using. Can you `objdump -d hello_world_std.efi` and get some symbols with the disassembly? For VC++ I think it would be `DUMPBIN /DISASM`.

What are you planning on using for source-level debugging of Rust? I wrote some gdb[1] and lldb[2] debugging commands in Python. I’m guessing loading Rust symbols from PE/COFF images should be similar, as long as the debugger knows about Rust.

I’m happy to help you figure out stuff related to debugging Rust. 

[1] https://github.com/tianocore/edk2/blob/master/BaseTools/Scripts/efi_gdb.py
[2] https://github.com/tianocore/edk2/blob/master/BaseTools/Scripts/efi_lldb.py

Thanks,

Andrew Fish

Yours Sincerely,

Ayush Singh


On 7/26/22 00:24, Andrew Fish wrote:
I guess I could at least dump to the end (retq)…. Going backwards is a bit painful in x86.

(lldb) dis -s 0x0000000140001B60 -b -c 30
hello_world_std.efi[0x140001b60]: 48 8b 09                       movq   (%rcx), %rcx
hello_world_std.efi[0x140001b63]: 48 01 c1                       addq   %rax, %rcx
hello_world_std.efi[0x140001b66]: 4c 89 c2                       movq   %r8, %rdx
hello_world_std.efi[0x140001b69]: 48 11 c2                       adcq   %rax, %rdx
hello_world_std.efi[0x140001b6c]: 48 31 c1                       xorq   %rax, %rcx
hello_world_std.efi[0x140001b6f]: 48 31 c2                       xorq   %rax, %rdx
hello_world_std.efi[0x140001b72]: 48 be 00 00 00 00 00 00 00 80  movabsq $-0x8000000000000000, %rsi ; imm = 0x8000000000000000 
hello_world_std.efi[0x140001b7c]: 4c 21 c6                       andq   %r8, %rsi
hello_world_std.efi[0x140001b7f]: e8 5c 55 00 00                 callq  0x1400070e0
hello_world_std.efi[0x140001b84]: 48 09 f0                       orq    %rsi, %rax
hello_world_std.efi[0x140001b87]: 48 83 c4 20                    addq   $0x20, %rsp
hello_world_std.efi[0x140001b8b]: 5e                             popq   %rsi
hello_world_std.efi[0x140001b8c]: c3                             retq   
hello_world_std.efi[0x140001b8d]: cc                             int3   
hello_world_std.efi[0x140001b8e]: cc                             int3   
hello_world_std.efi[0x140001b8f]: cc                             int3   
hello_world_std.efi[0x140001b90]: e9 db 55 00 00                 jmp    0x140007170
hello_world_std.efi[0x140001b95]: cc                             int3   

Then we can guess based on how functions get aligned to find the start….

hello_world_std.efi[0x140001b50]: 56                                   pushq  %rsi
hello_world_std.efi[0x140001b51]: 48 83 ec 20                          subq   $0x20, %rsp
hello_world_std.efi[0x140001b55]: 4c 8b 41 08                          movq   0x8(%rcx), %r8
hello_world_std.efi[0x140001b59]: 4c 89 c0                             movq   %r8, %rax
hello_world_std.efi[0x140001b5c]: 48 c1 f8 3f                          sarq   $0x3f, %rax
hello_world_std.efi[0x140001b60]: 48 8b 09                             movq   (%rcx), %rcx
hello_world_std.efi[0x140001b63]: 48 01 c1                             addq   %rax, %rcx
hello_world_std.efi[0x140001b66]: 4c 89 c2                             movq   %r8, %rdx
hello_world_std.efi[0x140001b69]: 48 11 c2                             adcq   %rax, %rdx
hello_world_std.efi[0x140001b6c]: 48 31 c1                             xorq   %rax, %rcx
hello_world_std.efi[0x140001b6f]: 48 31 c2                             xorq   %rax, %rdx
hello_world_std.efi[0x140001b72]: 48 be 00 00 00 00 00 00 00 80        movabsq $-0x8000000000000000, %rsi ; imm = 0x8000000000000000 
hello_world_std.efi[0x140001b7c]: 4c 21 c6                             andq   %r8, %rsi
hello_world_std.efi[0x140001b7f]: e8 5c 55 00 00                       callq  0x1400070e0
hello_world_std.efi[0x140001b84]: 48 09 f0                             orq    %rsi, %rax
hello_world_std.efi[0x140001b87]: 48 83 c4 20                          addq   $0x20, %rsp
hello_world_std.efi[0x140001b8b]: 5e                                   popq   %rsi
hello_world_std.efi[0x140001b8c]: c3                                   retq   

So the faulting function is getting passed a bad pointer as its 1st arg. 
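
One way to read the listing above, written out in Rust as a sketch. This is a reconstruction from the disassembly only, not the actual source of whatever lives at 0x140001b50; the name `signed_via_unsigned` and the stand-in callee `u128_to_f64_bits` (for the call to 0x1400070e0) are assumptions made for illustration.

```rust
/// Hypothetical reconstruction of the disassembled routine: take a signed
/// 128-bit value, compute its magnitude, convert that with an unsigned
/// helper, then OR the sign bit back into the result.
fn signed_via_unsigned(x: i128, to_bits_unsigned: fn(u128) -> u64) -> u64 {
    let hi = (x >> 64) as i64; // movq 0x8(%rcx), %r8
    let mask = (hi >> 63) as i128; // sarq $0x3f: 0, or all ones if negative
    let magnitude = x.wrapping_add(mask) ^ mask; // addq/adcq, then xorq/xorq
    let sign = (hi as u64) & 0x8000_0000_0000_0000; // movabsq + andq
    to_bits_unsigned(magnitude as u128) | sign // callq, then orq
}

/// Stand-in for the callee; assumed to return the f64 bit pattern.
fn u128_to_f64_bits(u: u128) -> u64 {
    (u as f64).to_bits()
}

fn main() {
    for x in [0i128, 1, -1, 12345, -12345, i128::MAX, i128::MIN] {
        // With this stand-in callee the sketch matches a plain `as f64` cast.
        assert_eq!(signed_via_unsigned(x, u128_to_f64_bits), (x as f64).to_bits());
    }
}
```

If this reading is right, the routine expects `%rcx` to point at the 16-byte integer. In the exception dump both RCX and RDX hold FFFFFFFFFFFFFFFF, which together would be the 128-bit value -1 (that is, `-z`). Whether the caller passed the value itself where a pointer was expected is only a guess at this point.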

Thanks,

Andrew Fish

On Jul 25, 2022, at 11:45 AM, Andrew Fish <afish@apple.com> wrote:

Oops… Looks like your PE/COFF is linked at 0x0000000140000000, so 0x140001b60 is the interesting bit.

(lldb) dis -s 0x0000000140001B60 -b
hello_world_std.efi[0x140001b60]: 48 8b 09                       movq   (%rcx), %rcx
hello_world_std.efi[0x140001b63]: 48 01 c1                       addq   %rax, %rcx
hello_world_std.efi[0x140001b66]: 4c 89 c2                       movq   %r8, %rdx
hello_world_std.efi[0x140001b69]: 48 11 c2                       adcq   %rax, %rdx
hello_world_std.efi[0x140001b6c]: 48 31 c1                       xorq   %rax, %rcx
hello_world_std.efi[0x140001b6f]: 48 31 c2                       xorq   %rax, %rdx
hello_world_std.efi[0x140001b72]: 48 be 00 00 00 00 00 00 00 80  movabsq $-0x8000000000000000, %rsi ; imm = 0x8000000000000000 
hello_world_std.efi[0x140001b7c]: 4c 21 c6                       andq   %r8, %rsi

 RCX - FFFFFFFFFFFFFFFF

So yea that looks like the fault. 

I don’t see that pattern in your .s file…. 

Can you figure out what function is at 0x140001b60 in the PE/COFF image? Do you have a map file from the linker?

Thanks,

Andrew Fish

PS Again sorry I don’t have anything installed to crack PDB files. 

Thanks,

Andrew Fish

On Jul 25, 2022, at 10:51 AM, Andrew Fish via groups.io <afish=apple.com@groups.io> wrote:

Ayush,

CR2 is the fault address, so 0xFFFFFFFFFFFFFFFF. Given that for EFI virtual == physical, the fault address looks like a bad pointer.

Sorry, I’ve not used VC++ in a long time so I don’t know how to debug with VC++, but if I were using clang/lldb I’d look at the source and assembly for the fault address.

The image base is: 0x000000000603C000
The fault PC/RIP is: 000000000603DB60

So the faulting code is at 0x1B60 in the image. Given the images are linked at zero you should be able to load the build product into the debugger and look at what code is at offset 0x1B60. The same should work for any tools that dump the binary. 
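
The arithmetic can be sketched in a few lines of Rust. The function name `to_link_address` is made up for this example. The link base is a parameter because it depends on how the image was built: with a link base of zero the result is just the offset, while elsewhere in this thread the image turns out to be linked at 0x140000000.

```rust
/// Translate a runtime address into the address the same byte has in the
/// build product, given where the image was loaded and where it was linked.
/// Returns `None` if the address is below the load base or the sum overflows.
fn to_link_address(runtime_addr: u64, load_base: u64, link_base: u64) -> Option<u64> {
    let offset = runtime_addr.checked_sub(load_base)?;
    link_base.checked_add(offset)
}

fn main() {
    // Values taken from the exception dump: RIP and ImageBase.
    let rip = 0x0000_0000_0603_DB60;
    let image_base = 0x0000_0000_0603_C000;

    // Image linked at zero: the result is the plain offset.
    assert_eq!(to_link_address(rip, image_base, 0), Some(0x1B60));

    // Image linked at 0x140000000.
    assert_eq!(
        to_link_address(rip, image_base, 0x1_4000_0000),
        Some(0x1_4000_1B60)
    );

    // An address below the load base is not inside the image.
    assert_eq!(to_link_address(0x1000, image_base, 0), None);
}
```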

Thanks,

Andrew Fish

On Jul 25, 2022, at 10:33 AM, Ayush Singh <ayushdevel1325@gmail.com> wrote:

Hello everyone. While running Rust tests in a UEFI environment, I have come across a numeric test that causes a page fault. A simple reproducible example for this is given below:

```rust

fn main() {
    use std::hint::black_box as b;

    let z: i128 = b(1);
    assert!((-z as f64) < 0.0);
}

```


The exception output is as follows:

```

!!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000000  I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
RIP  - 000000000603DB60, CS  - 0000000000000038, RFLAGS - 0000000000000246
RAX  - 0000000000000000, RCX - FFFFFFFFFFFFFFFF, RDX - FFFFFFFFFFFFFFFF
RBX  - 0000000000000000, RSP - 0000000007EDF1D0, RBP - 0000000007EDF4C0
RSI  - 0000000007EDF360, RDI - 0000000007EDF3C0
R8   - 0000000000000000, R9  - 0000000000000038, R10 - 0000000000000000
R11  - 0000000000000000, R12 - 00000000060C6018, R13 - 0000000007EDF520
R14  - 0000000007EDF6A8, R15 - 0000000005FA9490
DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
GS   - 0000000000000030, SS  - 0000000000000030
CR0  - 0000000080010033, CR2 - FFFFFFFFFFFFFFFF, CR3 - 0000000007C01000
CR4  - 0000000000000668, CR8 - 0000000000000000
DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 00000000079DE000 0000000000000047, LDTR - 0000000000000000
IDTR - 0000000007418018 0000000000000FFF,   TR - 0000000000000000
FXSAVE_STATE - 0000000007EDEE30
!!!! Find image based on IP(0x603DB60) /var/home/ayush/Documents/Programming/Rust/uefi/hello_world_std/target/x86_64-unknown-uefi/debug/deps/hello_world_std-338028f9369e2d42.pdb (ImageBase=000000000603C000, EntryPoint=000000000603D8C0) !!!!

```
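
For reading the `ExceptionData` line, a small sketch that decodes the low bits of an x86 page-fault error code. The bit positions follow the architectural definition (bit 0 P, bit 1 W, bit 2 U, bit 3 reserved-bit violation, bit 4 instruction fetch). Mapping the dump's `R` label to the reserved-bit flag is an assumption, and the type and function names are made up for this example.

```rust
/// Decoded low bits of an x86 page-fault error code.
#[derive(Debug, PartialEq, Eq)]
struct PageFaultInfo {
    present: bool,           // P: set for a protection fault, clear if the page was not present
    write: bool,             // W: set if the access was a write
    user: bool,              // U: set if the access came from user mode
    reserved: bool,          // R: set if a reserved bit was set in a paging entry (assumed label)
    instruction_fetch: bool, // I: set if the fault was on an instruction fetch
}

fn decode_page_fault(code: u64) -> PageFaultInfo {
    PageFaultInfo {
        present: (code & (1 << 0)) != 0,
        write: (code & (1 << 1)) != 0,
        user: (code & (1 << 2)) != 0,
        reserved: (code & (1 << 3)) != 0,
        instruction_fetch: (code & (1 << 4)) != 0,
    }
}

fn main() {
    // ExceptionData from the dump is 0: a supervisor-mode read of a page
    // that is not present.
    let info = decode_page_fault(0);
    assert!(!info.present && !info.write && !info.user);
    println!("{:?}", info);
}
```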


From my testing, the exception only occurs when a few conditions are met.

1. The binary is compiled in Debug mode. No error in Release mode.

2. The `i128` is wrapped in `black_box` [1]. The exception does not occur if `black_box` is not present.

3. It has to be `i128`. `i64` or any other type works fine.

4. The cast has to be done on `-z`. Doing the same with `+z` is fine.
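
For comparison, the variants from the list can be put side by side. This is only a sketch with made-up function names. On a hosted target every assertion is expected to pass; the report above is that only the first variant faults, and only in a debug build for UEFI.

```rust
use std::hint::black_box;

/// The reported failing case: negate a black-boxed i128, then cast.
fn neg_i128_black_box(z: i128) -> f64 {
    let z = black_box(z);
    (-z) as f64
}

/// Same type, no negation (condition 4).
fn pos_i128_black_box(z: i128) -> f64 {
    let z = black_box(z);
    z as f64
}

/// Same shape with a narrower integer (condition 3).
fn neg_i64_black_box(z: i64) -> f64 {
    let z = black_box(z);
    (-z) as f64
}

/// Same cast without `black_box` (condition 2).
fn neg_i128_plain(z: i128) -> f64 {
    (-z) as f64
}

fn main() {
    assert!(neg_i128_black_box(1) < 0.0);
    assert!(pos_i128_black_box(1) > 0.0);
    assert!(neg_i64_black_box(1) < 0.0);
    assert!(neg_i128_plain(1) < 0.0);
}
```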


I have also been discussing this in the Rust zulipchat [2], so feel free to chime in there.


Additionally, here are links for more information about this program:

1. Assembly: https://rust-lang.zulipchat.com/user_uploads/4715/od51Y9Dkfjahcg9HHcOud8Fm/hello_world_std-338028f9369e2d42.s

2. EFI Binary: https://rust-lang.zulipchat.com/user_uploads/4715/CknqtXLR8SaJZmyOnXctQkpL/hello_world_std.efi

3. PDB file: https://rust-lang.zulipchat.com/user_uploads/4715/zV4i6DsjgQXotp_gS1naEsU0/hello_world_std-338028f9369e2d42.pdb


Yours Sincerely,

Ayush Singh


[1]: https://doc.rust-lang.org/std/hint/fn.black_box.html

[2]: https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Casting.20i128.20to.20f64.20in.20black_box.20causes.20exception.20in.20UEFI