From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=192.55.52.151; helo=mga17.intel.com; envelope-from=ruiyu.ni@intel.com; receiver=edk2-devel@lists.01.org Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 07FBE209574D9 for ; Mon, 26 Feb 2018 21:42:44 -0800 (PST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 26 Feb 2018 21:48:50 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.47,398,1515484800"; d="scan'208";a="33891728" Received: from ray-dev.ccr.corp.intel.com (HELO [10.239.9.13]) ([10.239.9.13]) by fmsmga001.fm.intel.com with ESMTP; 26 Feb 2018 21:48:49 -0800 To: Guo Heyi , Laszlo Ersek Cc: edk2-devel@lists.01.org, Star Zeng , Eric Dong References: <1519633779-130687-1-git-send-email-heyi.guo@linaro.org> <20180227004830.GA2261@SZX1000114654> From: "Ni, Ruiyu" Message-ID: Date: Tue, 27 Feb 2018 13:48:49 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <20180227004830.GA2261@SZX1000114654> Subject: Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Feb 2018 05:42:45 -0000 Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit On 2/27/2018 8:48 AM, Guo Heyi wrote: > Hi Laszlo, > > I agree the current patch makes the code ugly, and turning the logic into a > normal loop should be the perfect solution. If Ray also agrees on it, I can try > to do that. > > Thanks and regards, > > Heyi > > On Mon, Feb 26, 2018 at 05:23:29PM +0100, Laszlo Ersek wrote: >> On 02/26/18 09:29, Heyi Guo wrote: >>> Function BmRepairAllControllers may recursively call itself if some >>> driver health protocol returns EfiDriverHealthStatusReconnectRequired. >>> However, driver health protocol of some buggy third party driver may >>> always return such status even after one and another reconnect. The >>> endless iteration will cause stack overflow and then system exception, >>> and it may be not easy to find that the exception is actually caused >>> by stack overflow. >>> >>> So we limit the number of reconnect retry to 10 to improve code >>> robustness. >> >> Not really my place to comment on this, but how about removing the >> recursion entirely, and turning the logic into a normal (iterative) loop >> instead? >> >> (1) Rename the current function to: >> >> STATIC >> VOID >> BmRepairAllControllersWorker ( >> OUT BOOLEAN *ReconnectRequired, >> OUT BOOLEAN *RebootRequired >> ); >> >> >> (2) The worker function should end right before >> >> if (ReconnectRequired) { >> BmRepairAllControllers (); >> } >> >> >> (3) The worker function should not contain >> >> RebootRequired = FALSE; >> ReconnectRequired = FALSE; >> >> Such initialization should be left to the caller. >> >> >> (4) The worker function should be called in a loop from a new >> BmRepairAllControllers() function, like this: >> >> Reconnect = 0; >> RebootRequired = FALSE; >> do { >> ReconnectRequired = FALSE; >> BmRepairAllControllersWorker (&ReconnectRequired, &RebootRequired); >> ++Reconnect; >> } while (ReconnectRequired && Reconnect < MAX_RECONNECT_REPAIR); >> >> DEBUG_CODE (...); >> >> if (RebootRequired) { >> ... >> } >> >> >> In addition to eliminating the shoddy recursive call (and the shoddier >> global counter, ewww :) ), this would fix the following other warts with >> the code: >> >> - When a nested call chain is unwound, we currently dump a series of >> "driver health info" lists (assuming no reboot is required), in the >> DEBUG_CODE section. I would argue that we should do that only once, at >> the end. (Even if we have to do it multiple times, it can be moved into >> the worker function, to the end.) >> >> - It seems to be sufficient to accumulate RebootRequired into one >> variable (i.e. not multiple instances of the same local variable on the >> stack) and to act upon it at the very end. >> >> >> Feel free to ignore my comments -- I just think we should be moving in >> the opposite direction; that is, away from recursion (especially from >> recursion combined with global variables -- that's one difficult pattern >> to reason about). How about to just remove the global variable? I prefer to change BmRepairAllControllers in the following prototype: VOID BmRepairAllControllers ( UINTN ReconnectRepairCount ); And start to call this like BmRepairAllControllers (0). I am neutral between recursive call and while loop. But I am afraid such a big change may introduce some bugs. And I also like to move the DEBUG_CODE to before: if (ReconnectRequired) { BmRepairAllControllers (ReconnectRepairCount + 1); } So that we can dump the health info for every reconnect repair. >> >> Thanks >> Laszlo >> >>> Contributed-under: TianoCore Contribution Agreement 1.1 >>> Signed-off-by: Heyi Guo >>> Cc: Star Zeng >>> Cc: Eric Dong >>> Cc: Ruiyu Ni >>> --- >>> MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h | 6 ++++++ >>> MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 17 ++++++++++++++++- >>> 2 files changed, 22 insertions(+), 1 deletion(-) >>> >>> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h >>> index 25a1d522fe84..9aa86b096525 100644 >>> --- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h >>> +++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h >>> @@ -108,6 +108,12 @@ CHAR16 * >>> #define BM_OPTION_NAME_LEN sizeof ("PlatformRecovery####") >>> extern CHAR16 *mBmLoadOptionName[]; >>> >>> +// >>> +// Maximum number of reconnect retry to repair controller; it is to limit the >>> +// number of recursive call of BmRepairAllControllers. >>> +// >>> +#define MAX_RECONNECT_REPAIR 10 >>> + >>> /** >>> Visitor function to be called by BmForEachVariable for each variable >>> in variable storage. >>> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c >>> index ddcee8b0676f..30d70f32af84 100644 >>> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c >>> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c >>> @@ -26,6 +26,12 @@ GLOBAL_REMOVE_IF_UNREFERENCED >>> L"Reboot Required" >>> }; >>> >>> +// >>> +// Counter of reconnect retry to repair controller; it is to limit the >>> +// number of recursive call of BmRepairAllControllers. >>> +// >>> +STATIC UINTN mReconnectRepairCount; >>> + >>> /** >>> Return the controller name. >>> >>> @@ -549,7 +555,16 @@ BmRepairAllControllers ( >>> >>> >>> if (ReconnectRequired) { >>> - BmRepairAllControllers (); >>> + if (mReconnectRepairCount < MAX_RECONNECT_REPAIR) { >>> + mReconnectRepairCount++; >>> + BmRepairAllControllers (); >>> + } else { >>> + DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n", >>> + __FUNCTION__, __LINE__, mReconnectRepairCount)); >>> + // Reset counter so that it will not affect calling >>> + // BmRepairAllControllers() somewhere else >>> + mReconnectRepairCount = 0; >>> + } >>> } >>> >>> DEBUG_CODE ( >>> >> > _______________________________________________ > edk2-devel mailing list > edk2-devel@lists.01.org > https://lists.01.org/mailman/listinfo/edk2-devel > -- Thanks, Ray