After thinking about it some more, I suspect this is an issue in the debug
library stack specific to BootScriptExecutorDxe. Since it runs after
exit-BS, it is in some ways a runtime driver. The copy of
BootScriptExecutorDxe that is executed from the lockbox was saved there at
ReadyToLock, before the library receives its event to flip the
mEndOfBootServices variable inside the data section (as I understand it).
Therefore, on resume, the library will wrongly try to use boot services,
even though that is impossible and the variable is supposed to guard
against it.

I suppose the only fix is another library instance.

Best regards,
Benjamin

On Mon, 18 Jul 2022 at 16:33, Benjamin Doron wrote:

> Hi all,
> I've been working on implementing S3 resume support for MinPlatform during
> the past few weeks. Presently, the last line of code that I know will
> execute on resume flows is
> https://github.com/tianocore/edk2/blob/master/UefiCpuPkg/Universal/Acpi/S3Resume2Pei/S3Resume.c#L878
> - right before transferring control to BootScriptExecutorDxe.
>
> I had added a debug print at
> https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Universal/Acpi/BootScriptExecutorDxe/ScriptExecute.c#L47
> to ensure that control was successfully passed here, but it never executes
> and the platform doesn't resume. I've considered that it may be a debug
> library-specific issue, and I've been fixing some of those (but that
> certainly may still require debugging). However, after addressing that, the
> bug still too predictably occurs here. Therefore, what other assumptions
> are made for the jump here to succeed?
>
> So far, I've considered:
> - DxePcdLib could try calling the protocol after exit-BS, which is
> guaranteed to fail (then page fault). However, I've checked the disassembly
> and it's not used on resume flow. This is fine.
> - DebugLibSerialPort is used for this module, because RSC's serial port
> handler is unregistered at exit-BS. This should now be fine.
>
> Some (potentially) plausible architectural issues:
> - Page tables are used in long mode. Maybe I could verify these are sane
> by looking up the structure and printing each entry's fields, but they are
> probably fine.
> - Maybe BootScriptStackSize is too small? I sort of doubt it from looking
> at the disassembly. Also, even if the stack overflows, I'd expect the
> earlier debug prints to succeed.
> - Maybe my added debug print in S3BootScriptExecutorEntryFunction() is a
> problem? However, it's the IDTR that's written, not the GDTR. I'd expect
> that to only be an issue if an interrupt is fired. Also, SmmRestoreCpu()
> does the same. As I understand, normally there is an enormous difference
> between DXE and SMM, because SMM has some resume state in some CPU MSRs
> (etc), but I think here PiSmmCpuDxeSmm is being entered as if it were mere
> 64-bit code, like DXE.
>
> Best regards,
> Benjamin
>