public inbox for devel@edk2.groups.io
From: "Lendacky, Thomas via groups.io" <thomas.lendacky=amd.com@groups.io>
To: devel@edk2.groups.io, ardb@kernel.org, srikanth.aithal@amd.com
Cc: sachinganesh@ami.com
Subject: Re: [edk2-devel] edk2 master: AMD SEV-ES guest boot with OvmfPkgX64 fails
Date: Tue, 28 Jan 2025 16:38:38 -0600
Message-ID: <d8197d72-69ad-faa1-3b55-e493c0bebac9@amd.com>
In-Reply-To: <181EF6CF36D04674.20803@groups.io>

On 1/28/25 14:57, Lendacky, Thomas via groups.io wrote:
> On 1/28/25 10:26, Ard Biesheuvel via groups.io wrote:
>> Please retry with a build created from the latest HEAD. There was a
>> bug in that change that got fixed today.
> 
> I tried the latest HEAD and the issue is still there.
> 
> On a whim, I deleted the added DEBUG () calls from the patch and the
> issue went away. At some point during efi_set_virtual_address_map(),
> RelocBase and RelocBaseEnd are NULL and "DEBUG ((DEBUG_ERROR,
> "Relocation block is not valid\n"));" is executed, which crashes the
> boot. I'm guessing there shouldn't be any output generated during
> SetVirtualAddressMap?

After some digging, I found that the failure is caused by the attempted
DEBUG() print.

The DEBUG() ends up generating an IO instruction. The IO instruction
generates a #VC which is handled by the Linux kernel. To validate that
the IOIO error code truly came from an IO instruction, the #VC handler
attempts to read the instruction bytes. To read the bytes, the #VC
handler first determines if the context is kernel or userspace. In this
case, the context is kernel, so the instruction bytes are accessed using
copy_from_kernel_nofault(). However, the RIP that is used is the EFI
identity-mapped value, 0x7f6e1331, which appears as a userspace address,
so it fails the check in copy_from_kernel_nofault_allowed().
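
To illustrate the check that fails here, a minimal sketch follows. It is
not the kernel's code: the 2^47 boundary assumes 4-level paging, and
USER_SPACE_LIMIT and looks_like_kernel_address() are hypothetical names
standing in for the kernel's own limit and for
copy_from_kernel_nofault_allowed().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 4-level paging, where the user half of the virtual address
 * space ends at 2^47. This constant is a stand-in for illustration only.
 */
#define USER_SPACE_LIMIT (UINT64_C(1) << 47)

/*
 * Hypothetical helper: returns true only if vaddr lies at or above the
 * user/kernel boundary, i.e. it does not look like a userspace address.
 */
static bool looks_like_kernel_address(uint64_t vaddr)
{
    return vaddr >= USER_SPACE_LIMIT;
}
```

With the values from the report, looks_like_kernel_address(0x7f6e1331)
is false, because the identity-mapped EFI RIP sits far below the
boundary, while a kernel stack address such as 0xffffffff96403b80 passes.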

When that read fails, the #VC handler simulates a #PF, using information
from the #VC exception, and forwards it to the #PF handler. That is the
page fault seen in the original report.

The easiest and most backward-compatible fix would be to remove the
newly added DEBUG() invocations from: aedcaa3df8a2 ("MdePkg: Fix
overflow issue in PeCoffLoaderRelocateImageForRuntime").
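
As a rough sketch of what that could look like, the overflow check can
report failure through its return value alone, with no output on the
error path. This is standalone C, not the edk2 source: checked_add_u32()
and reloc_dir_end() are hypothetical names, with checked_add_u32() only
approximating what an overflow-checked UINT32 add from SafeIntLib does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an overflow-checked 32-bit add. Returns true
 * and stores the sum on success; returns false, leaving *sum untouched,
 * if the sum would exceed UINT32_MAX.
 */
static bool checked_add_u32(uint32_t a, uint32_t b, uint32_t *sum)
{
    if (a > UINT32_MAX - b) {
        return false;
    }
    *sum = a + b;
    return true;
}

/*
 * Hypothetical helper: compute the end offset of the relocation
 * directory from its VirtualAddress and Size. On overflow it only
 * returns false; it prints nothing, so no port I/O is generated.
 */
static bool reloc_dir_end(uint32_t virtual_address, uint32_t size,
                          uint32_t *end)
{
    return checked_add_u32(virtual_address, size, end);
}
```

The caller would simply skip relocation processing when reloc_dir_end()
returns false, which keeps the overflow protection while avoiding any
print during SetVirtualAddressMap().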

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>
>>
>> On Tue, 28 Jan 2025 at 10:09, Aithal, Srikanth via groups.io
>> <srikanth.aithal=amd.com@groups.io> wrote:
>>>
>>> Hello,
>>>
>>> With current edk2/master, booting an AMD SEV-ES guest with the OvmfPkgX64 package fails with the error below:
>>>
>>>
>>> [    0.240243] Memory Encryption Features active: AMD SEV SEV-ES
>>> [    0.241170] SEV: Status: SEV SEV-ES
>>> [    0.241783] pid_max: default: 32768 minimum: 301
>>> [    0.243627] BUG: unable to handle page fault for address: 000000007f6e1331
>>> [    0.243629] #PF: supervisor instruction fetch in kernel mode
>>> [    0.243630] #PF: error_code(0x0010) - not-present page
>>> [    0.243631] PGD 8000001933063 P4D 8000001933063 PUD 8000001934063 PMD 8000001938063 PTE 800007f6e1063
>>> [    0.243635] Oops: Oops: 0010 [#1] PREEMPT SMP NOPTI
>>> [    0.243637] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.2-7aa21fec18-11cb77746de #1
>>> [    0.243640] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
>>> [    0.243641] RIP: 0010:0x7f6e1331
>>> [    0.243643] Code: Unable to access opcode bytes at 0x7f6e1307.
>>> [    0.243643] RSP: 0000:ffffffff96403b80 EFLAGS: 00010097 ORIG_RAX: 0000000000000010
>>> [    0.243645] RAX: 0000000000000001 RBX: ffffffff96403c40 RCX: ffffffff96403c40
>>> [    0.243646] RDX: 00000000000003fd RSI: ffffffff96403d40 RDI: 0000000000000001
>>> [    0.243647] RBP: ffffffff96403c00 R08: 0000000000000001 R09: 000000007f6e2c74
>>> [    0.243648] R10: 0000000000000002 R11: 0000000000000000 R12: 000000000000001f
>>> [    0.243649] R13: ffffffff96403c5f R14: 00000000000003fd R15: ffffffff96403bb8
>>> [    0.243650] FS:  0000000000000000(0000) GS:ffff933b3cc00000(0000) knlGS:0000000000000000
>>> [    0.243652] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    0.243653] CR2: 000000007f6e1331 CR3: 000800000196e000 CR4: 00000000003506f0
>>> [    0.243655] Call Trace:
>>> [    0.243656]  <TASK>
>>> [    0.243657]  ? __die+0x1b/0x60
>>> [    0.243662]  ? page_fault_oops+0x151/0x4d0
>>> [    0.243667]  ? exc_page_fault+0x64/0x140
>>> [    0.243670]  ? vc_raw_handle_exception+0x287/0x2c0
>>> [    0.243676]  ? kernel_exc_vmm_communication+0x4d/0x100
>>> [    0.243679]  ? asm_exc_vmm_communication+0x31/0x70
>>> [    0.243685]  ? __alloc_pages_noprof+0x162/0x300
>>> [    0.243691]  ? __cpa_process_fault+0x463/0x6f0
>>> [    0.243698]  ? srso_return_thunk+0x5/0x5f
>>> [    0.243701]  ? __efi_call+0x28/0x30
>>> [    0.243705]  ? srso_return_thunk+0x5/0x5f
>>> [    0.243706]  ? efi_set_virtual_address_map+0x95/0x1e0
>>> [    0.243710]  ? sev_es_efi_map_ghcbs+0x8c/0xd0
>>> [    0.243714]  ? efi_enter_virtual_mode+0x391/0x470
>>> [    0.243718]  ? start_kernel+0x457/0x750
>>> [    0.243720]  ? x86_64_start_reservations+0x14/0x30
>>> [    0.243722]  ? x86_64_start_kernel+0xce/0xe0
>>> [    0.243723]  ? common_startup_64+0x13e/0x141
>>> [    0.243727]  </TASK>
>>> [    0.243728] Modules linked in:
>>> [    0.243730] CR2: 000000007f6e1331
>>> [    0.243731] ---[ end trace 0000000000000000 ]---
>>> [    0.243732] RIP: 0010:0x7f6e1331
>>> [    0.243733] Code: Unable to access opcode bytes at 0x7f6e1307.
>>> [    0.243734] RSP: 0000:ffffffff96403b80 EFLAGS: 00010097 ORIG_RAX: 0000000000000010
>>> [    0.243735] RAX: 0000000000000001 RBX: ffffffff96403c40 RCX: ffffffff96403c40
>>> [    0.243736] RDX: 00000000000003fd RSI: ffffffff96403d40 RDI: 0000000000000001
>>> [    0.243737] RBP: ffffffff96403c00 R08: 0000000000000001 R09: 000000007f6e2c74
>>> [    0.243738] R10: 0000000000000002 R11: 0000000000000000 R12: 000000000000001f
>>> [    0.243739] R13: ffffffff96403c5f R14: 00000000000003fd R15: ffffffff96403bb8
>>> [    0.243740] FS:  0000000000000000(0000) GS:ffff933b3cc00000(0000) knlGS:0000000000000000
>>> [    0.243741] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    0.243742] CR2: 000000007f6e1331 CR3: 000800000196e000 CR4: 00000000003506f0
>>> [    0.243744] Kernel panic - not syncing: Fatal exception in interrupt
>>> [    0.244168] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>>>
>>>
>>> I ran git bisect, which points to the commit below.
>>>
>>> commit aedcaa3df8a246ef859c234ed5c243471c5be698
>>> Author: INDIA\sachinganesh <sachinganesh@ami.com>
>>> Date:   Mon Jan 13 16:15:54 2025 +0530
>>>
>>>     MdePkg: Fix overflow issue in PeCoffLoaderRelocateImageForRuntime
>>>
>>>     RelocDir->Size is a UINT32 value, and RelocDir->VirtualAddress is
>>>     also a UINT32 value. The current code in
>>>     PeCoffLoaderRelocateImageForRuntime does not check for overflow when
>>>     adding RelocDir->Size to RelocDir->VirtualAddress. This patch uses
>>>     SafeIntLib to ensure that the addition does not overflow.
>>>
>>>     Signed-off-by: Sachin Ganesh <sachinganesh@ami.com>
>>>
>>>  MdePkg/Library/BasePeCoffLib/BasePeCoff.c      | 25 +++++++++++++------------
>>>  MdePkg/Library/BasePeCoffLib/BasePeCoffLib.inf |  1 +
>>>  2 files changed, 14 insertions(+), 12 deletions(-)
>>>
>>>
>>> Sample qemu command line which I used:
>>>
>>> qemu-system-x86_64 \
>>> -machine q35,confidential-guest-support=sev0,vmport=off \
>>> -object sev-guest,id=sev0,policy=0x5,cbitpos=51,reduced-phys-bits=1 \
>>> -name guest=vm,debug-threads=on \
>>> -drive if=pflash,format=raw,unit=0,file=OVMF_X64/OVMF.fd,readonly  \
>>> -cpu EPYC-v4 \
>>> -m 2048 \
>>> -smp 255,maxcpus=255,cores=255,threads=1,dies=1,sockets=1 \
>>> -kernel bzImage \
>>> -append "root=/dev/sda rw console=ttyS0 net.ifnames=0 biosdevname=0 movable_node swiotlb=65536 " \
>>> -drive id=disk0,file=22.04-server_seves.qcow2,if=none \
>>> -device virtio-scsi-pci,id=scsi0,disable-legacy=on,iommu_platform=true \
>>> -device scsi-hd,drive=disk0 \
>>> --enable-kvm \
>>> --nographic
>>>
>>> Thank you,
>>>
>>> Srikanth Aithal <sraithal@amd.com>
>>>


View/Reply Online (#121059): https://edk2.groups.io/g/devel/message/121059