From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-wr1-f68.google.com (mail-wr1-f68.google.com [209.85.221.68]) by mx.groups.io with SMTP id smtpd.web12.8743.1570796184390071314 for ; Fri, 11 Oct 2019 05:16:24 -0700 Authentication-Results: mx.groups.io; dkim=pass header.i=@akeo-ie.20150623.gappssmtp.com header.s=20150623 header.b=1KtqYr1u; spf=none, err=permanent DNS error (domain: akeo.ie, ip: 209.85.221.68, mailfrom: pete@akeo.ie) Received: by mail-wr1-f68.google.com with SMTP id h4so11693756wrv.7 for ; Fri, 11 Oct 2019 05:16:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=akeo-ie.20150623.gappssmtp.com; s=20150623; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=xQI+EuT1U+p0MtKwbH9Sc+xTTo1o9ZXVgRH/ma20T9I=; b=1KtqYr1u7OeVMb2/m07q1Hp6yw5shl2upScb5XKT4I/E8Wq+3jogkqDT2tITaPOXaS jP7kwEEXtZlSzl9WcY9uQJnku5n1HY6wf842hBPcJBcWoZlxnVIVrEApvmCIiPV1PwxT Q7YGWc/OE/3BBlVS8TEyP9FM2eE+W6NUnAgIX8Dlw7MZIiW/Ep/V55fewXQyg8BX//wu KWP4YSWzzeX9bMzgpDuwS4jiygG1h+6CxXMjf3UlxLCXL8nnq/TFGYsEnmJ1zt09l2oK Edq28Kw8huyjKpog+5GGJV9wMxaZYilygKFeFT3+Lyio1nkXB9irm5L52/2lYq1SEU2h pOtw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=xQI+EuT1U+p0MtKwbH9Sc+xTTo1o9ZXVgRH/ma20T9I=; b=ncPKyI3Mk8F6z6Lu8KoiI0lnghvB+UCQVXM5w7vSgJXHqrUaKEOsStZ/PiWKBpIAQk qVKXZFA9TcH6vfo6SsAYhyaHuyH0TnGNkctvjrbtqMviqTVis1jIz9RCon1sdC+RmWlG PRinLYvKA96vnvEPGgMrb0I7EAHebTTNaL+lVJGrGiAkcUbMV//70zV0oa4pqzk5jOWz JEg1c0WNzKUV7wwHo5WpO20r107dz5Ml+IxfEkcMnH3s9RHvk2GdV8QbSONskbZ9RnXB +wX2kJCvAulhBw+9y1GjXPw0WCHc/KSt0C7VAtfWm7YdXH8Efliy3Mbtg5bHOGTtRxcD Hn9Q== X-Gm-Message-State: APjAAAWrarC202XlARtu6SvyFJYVScr/ObEODw4xv73aACkC1OGnT5zg BjmBgWcQkwhcK08AL+jq2ZTz/A== X-Google-Smtp-Source: APXvYqykHld4gQsvDfJIYx/Zo0+vZJc/XU0vlmm74Lcye6m1lMf/vSGJLYlEeNzzQ+wWZSJXJPDyLw== X-Received: by 2002:adf:e9c8:: with SMTP id l8mr8647615wrn.270.1570796182822; Fri, 11 Oct 2019 05:16:22 -0700 (PDT) Return-Path: Received: from [10.0.0.122] ([84.203.40.122]) by smtp.googlemail.com with ESMTPSA id t8sm8949400wrx.76.2019.10.11.05.16.21 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 11 Oct 2019 05:16:21 -0700 (PDT) Subject: Re: [edk2-devel] OVMF is crashing for me in master To: Laszlo Ersek , devel@edk2.groups.io, liming.gao@intel.com, "afish@apple.com" References: <4A89E2EF3DFEDB4C8BFDE51014F606A14E515A05@SHSMSX104.ccr.corp.intel.com> <0411bb4e-7d90-df7e-8fa0-5072275f9c62@redhat.com> From: "Pete Batard" Message-ID: Date: Fri, 11 Oct 2019 13:16:20 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <0411bb4e-7d90-df7e-8fa0-5072275f9c62@redhat.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Language: en-GB Content-Transfer-Encoding: 7bit Hi Lazlo, On 2019.10.11 12:50, Laszlo Ersek wrote: > Hi, > > On 10/11/19 03:19, Liming Gao wrote: >> Andrew: >> I verify the change (2de1f611be06ded3a59726a4052a9039be7d459b >> MdeModulePkg/BdsDxe: Also call PlatformBootManagerWaitCallback on 0) >> in Emulator. >> It works, because PCD value is set to 10 in Emulator. >> >> Before this change, if TimeOut PCD is zero, BdsEntry doesn't call >> PlatformBootManagerWaitCallback(). >> After this change, if TimeOut PCD is zero, BdsEntry still call >> PlatformBootManagerWaitCallback(). So, it trigs this issue. I agree >> your fix. >> >> Pete: >> Will you contribute the patch to fix this hang issue in OVMF? > > So, when Pete submitted the BDS patch, I didn't test it, I only looked > at the patch email. See my feedback here: > > http://mid.mail-archive.com/8bacbb53-98df-724e-b704-b5059d11e378@redhat.com > https://edk2.groups.io/g/devel/message/48133 > > It seemed OK to me. In particular, I reasoned about (TimeoutRemain==0), > and the patch seemed to do the right thing for that. > > Unfortunately, what I missed was the following case: > > - if PcdPlatformBootTimeOut is zero, then the patch by Pete does not > simply make one more call to PlatformBootManagerWaitCallback(), with > argument zero, but it makes the *ONLY* call to > PlatformBootManagerWaitCallback(). > > I missed this because (like normal) the patch was sent with a 3-line > context, and I didn't think it necessary to review the full function > context. > > Here is the commit, with full function context: > >> diff --git a/MdeModulePkg/Universal/BdsDxe/BdsEntry.c b/MdeModulePkg/Universal/BdsDxe/BdsEntry.c >> index f3d5e5ac0615..7968a58f3454 100644 >> --- a/MdeModulePkg/Universal/BdsDxe/BdsEntry.c >> +++ b/MdeModulePkg/Universal/BdsDxe/BdsEntry.c >> @@ -310,47 +310,48 @@ VOID >> BdsWait ( >> IN EFI_EVENT HotkeyTriggered >> ) >> { >> EFI_STATUS Status; >> UINT16 TimeoutRemain; >> >> DEBUG ((EFI_D_INFO, "[Bds]BdsWait ...Zzzzzzzzzzzz...\n")); >> >> TimeoutRemain = PcdGet16 (PcdPlatformBootTimeOut); >> while (TimeoutRemain != 0) { >> DEBUG ((EFI_D_INFO, "[Bds]BdsWait(%d)..Zzzz...\n", (UINTN) TimeoutRemain)); >> PlatformBootManagerWaitCallback (TimeoutRemain); >> >> BdsReadKeys (); // BUGBUG: Only reading can signal HotkeyTriggered >> // Can be removed after all keyboard drivers invoke callback in timer callback. >> >> if (HotkeyTriggered != NULL) { >> Status = BdsWaitForSingleEvent (HotkeyTriggered, EFI_TIMER_PERIOD_SECONDS (1)); >> if (!EFI_ERROR (Status)) { >> break; >> } >> } else { >> gBS->Stall (1000000); >> } >> >> // >> // 0xffff means waiting forever >> // BDS with no hotkey provided and 0xffff as timeout will "hang" in the loop >> // >> if (TimeoutRemain != 0xffff) { >> TimeoutRemain--; >> } >> } >> + PlatformBootManagerWaitCallback (0); >> DEBUG ((EFI_D_INFO, "[Bds]Exit the waiting!\n")); >> } > > We can see that the function used to handle the > PcdPlatformBootTimeOut==0 situation well, and make *no* call to > PlatformBootManagerWaitCallback(). The patch broke that behavior. > > Therefore, I disagree with the idea to fix this in OvmfPkg. Any number > of platforms may exist that rely on PlatformBootManagerWaitCallback() > not being called when PcdPlatformBootTimeOut is zero. > > (To name just one other example, ArmVirtPkg too is affected.) > > And for OVMF and ArmVirtPkg, it is definitely correct to *not* display > any progress bar if PcdPlatformBootTimeOut is 0. > > Please consult the definition of the PCD in "MdePkg/MdePkg.dec": > >> ## The number of seconds that the firmware will wait before initiating the original default boot selection. >> # A value of 0 indicates that the default boot selection is to be initiated immediately on boot. >> # The value of 0xFFFF then firmware will wait for user input before booting. >> # @Prompt Boot Timeout (s) >> gEfiMdePkgTokenSpaceGuid.PcdPlatformBootTimeOut|0xffff|UINT16|0x0000002c > > Also see the specification of PlatformBootManagerWaitCallback(), in > "MdeModulePkg/Include/Library/PlatformBootManagerLib.h": > >> /** >> This function is called each second during the boot manager waits the timeout. >> >> @param TimeoutRemain The remaining timeout. >> **/ >> VOID >> EFIAPI >> PlatformBootManagerWaitCallback ( >> UINT16 TimeoutRemain >> ); > > According to the DEC file, there is no waiting if the PCD is zero. > Therefore, if the PCD is zero, the function should not be called at all. > > > There is another complication, namely, if the PCD is 0xFFFF. In this > case, the loop goes on until a hotkey is pressed. In every iteration > (that is, every second), PlatformBootManagerWaitCallback() is called > with argument 0xFFFF, invariably. Clearly, this *cannot* cause a > platform to display any progress *changes*. In the most common callback > implementation, namely in the formula > > (Timeout - TimeoutRemain) * 100 / Timeout, > > it will mean > > (0xFFFF - 0xFFFF) * 100 / 0xFFFF, > > that is, "zero percent", in every iteration. > > Based on that, it looks wrong to suddenly call > PlatformBootManagerWaitCallback() with a zero argument ("hundred percent > completion"), after it's been called, let's say for a minute or more, > with a constant 0xFFFF argument ("zero percent completion"), until the > user finally presses a hotkey. > > Jumping from zero to 100% is discontiguous and doesn't appear helpful to > me. > > > Further yet, what if there is an actual progress bar and timeout (like > 10 seconds), and the user presses a hotkey after 5 seconds? Should the > progress bar jump to 100% at once? I wouldn't think so. > > > Therefore, I claim that the right fix is: > >> diff --git a/MdeModulePkg/Universal/BdsDxe/BdsEntry.c b/MdeModulePkg/Universal/BdsDxe/BdsEntry.c >> index 7968a58f3454..d12829afeb8b 100644 >> --- a/MdeModulePkg/Universal/BdsDxe/BdsEntry.c >> +++ b/MdeModulePkg/Universal/BdsDxe/BdsEntry.c >> @@ -341,7 +341,16 @@ BdsWait ( >> TimeoutRemain--; >> } >> } >> - PlatformBootManagerWaitCallback (0); >> + // >> + // If the platform configured a nonzero and finite time-out, and we have >> + // actually reached that, report 100% completion to the platform. >> + // >> + // Note that the (TimeoutRemain == 0) condition excludes >> + // PcdPlatformBootTimeOut=0xFFFF, and that's deliberate. >> + // >> + if (PcdGet16 (PcdPlatformBootTimeOut) != 0 && TimeoutRemain == 0) { >> + PlatformBootManagerWaitCallback (0); >> + } >> DEBUG ((EFI_D_INFO, "[Bds]Exit the waiting!\n")); >> } >> I agree with the analysis and the proposed fix. My earlier patch was indeed altering existing behaviour when PcdPlatformBootTimeOut was 0, and I completely missed that. Also, apologies for not replying to the direct request you sent to me earlier. The e-mail client I use does not currently deduplicate messages that are sent both to the list as well as directly to me as CC, so I only get one copy in the list which makes these requests easy to miss. I'll try to fix that. Many thanks for looking into this! Regards, /Pete > > Thanks! > Laszlo >