public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
@ 2018-02-26  8:29 Heyi Guo
  2018-02-26  8:56 ` Wang, Sunny (HPS SW)
  2018-02-26 16:23 ` Laszlo Ersek
  0 siblings, 2 replies; 14+ messages in thread
From: Heyi Guo @ 2018-02-26  8:29 UTC (permalink / raw)
  To: edk2-devel; +Cc: Heyi Guo, Star Zeng, Eric Dong, Ruiyu Ni

Function BmRepairAllControllers may recursively call itself if some
driver health protocol returns EfiDriverHealthStatusReconnectRequired.
However, driver health protocol of some buggy third party driver may
always return such status even after one and another reconnect. The
endless iteration will cause stack overflow and then system exception,
and it may be not easy to find that the exception is actually caused
by stack overflow.

So we limit the number of reconnect retry to 10 to improve code
robustness.

Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
Cc: Star Zeng <star.zeng@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
---
 MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     |  6 ++++++
 MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 17 ++++++++++++++++-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
index 25a1d522fe84..9aa86b096525 100644
--- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
+++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
@@ -108,6 +108,12 @@ CHAR16 *
 #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
 extern CHAR16  *mBmLoadOptionName[];
 
+//
+// Maximum number of reconnect retry to repair controller; it is to limit the
+// number of recursive call of BmRepairAllControllers.
+//
+#define MAX_RECONNECT_REPAIR                        10
+
 /**
   Visitor function to be called by BmForEachVariable for each variable
   in variable storage.
diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
index ddcee8b0676f..30d70f32af84 100644
--- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
+++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
@@ -26,6 +26,12 @@ GLOBAL_REMOVE_IF_UNREFERENCED
     L"Reboot Required"
     };
 
+//
+// Counter of reconnect retry to repair controller; it is to limit the
+// number of recursive call of BmRepairAllControllers.
+//
+STATIC UINTN mReconnectRepairCount;
+
 /**
   Return the controller name.
 
@@ -549,7 +555,16 @@ BmRepairAllControllers (
 
 
   if (ReconnectRequired) {
-    BmRepairAllControllers ();
+    if (mReconnectRepairCount < MAX_RECONNECT_REPAIR) {
+      mReconnectRepairCount++;
+      BmRepairAllControllers ();
+    } else {
+      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
+        __FUNCTION__, __LINE__, mReconnectRepairCount));
+      // Reset counter so that it will not affect calling
+      // BmRepairAllControllers() somewhere else
+      mReconnectRepairCount = 0;
+    }
   }
 
   DEBUG_CODE (
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-02-26  8:29 [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth Heyi Guo
@ 2018-02-26  8:56 ` Wang, Sunny (HPS SW)
  2018-02-26 11:34   ` Guo Heyi
  2018-02-26 16:23 ` Laszlo Ersek
  1 sibling, 1 reply; 14+ messages in thread
From: Wang, Sunny (HPS SW) @ 2018-02-26  8:56 UTC (permalink / raw)
  To: Heyi Guo, edk2-devel@lists.01.org
  Cc: Ruiyu Ni, Eric Dong, Star Zeng, Wang, Sunny (HPS SW)

Hi Heyi, 
Just a suggestion. 
Is it better to use a PCD instead of a define for MAX_RECONNECT_REPAIR? So that we can easily override it in our platform dsc file.

Regards,
Sunny Wang

-----Original Message-----
From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Heyi Guo
Sent: Monday, February 26, 2018 4:30 PM
To: edk2-devel@lists.01.org
Cc: Ruiyu Ni <ruiyu.ni@intel.com>; Heyi Guo <heyi.guo@linaro.org>; Eric Dong <eric.dong@intel.com>; Star Zeng <star.zeng@intel.com>
Subject: [edk2] [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth

Function BmRepairAllControllers may recursively call itself if some driver health protocol returns EfiDriverHealthStatusReconnectRequired.
However, driver health protocol of some buggy third party driver may always return such status even after one and another reconnect. The endless iteration will cause stack overflow and then system exception, and it may be not easy to find that the exception is actually caused by stack overflow.

So we limit the number of reconnect retry to 10 to improve code robustness.

Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
Cc: Star Zeng <star.zeng@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
---
 MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     |  6 ++++++
 MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 17 ++++++++++++++++-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
index 25a1d522fe84..9aa86b096525 100644
--- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
+++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
@@ -108,6 +108,12 @@ CHAR16 *
 #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
 extern CHAR16  *mBmLoadOptionName[];
 
+//
+// Maximum number of reconnect retry to repair controller; it is to 
+limit the // number of recursive call of BmRepairAllControllers.
+//
+#define MAX_RECONNECT_REPAIR                        10
+
 /**
   Visitor function to be called by BmForEachVariable for each variable
   in variable storage.
diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
index ddcee8b0676f..30d70f32af84 100644
--- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
+++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
@@ -26,6 +26,12 @@ GLOBAL_REMOVE_IF_UNREFERENCED
     L"Reboot Required"
     };
 
+//
+// Counter of reconnect retry to repair controller; it is to limit the 
+// number of recursive call of BmRepairAllControllers.
+//
+STATIC UINTN mReconnectRepairCount;
+
 /**
   Return the controller name.
 
@@ -549,7 +555,16 @@ BmRepairAllControllers (
 
 
   if (ReconnectRequired) {
-    BmRepairAllControllers ();
+    if (mReconnectRepairCount < MAX_RECONNECT_REPAIR) {
+      mReconnectRepairCount++;
+      BmRepairAllControllers ();
+    } else {
+      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
+        __FUNCTION__, __LINE__, mReconnectRepairCount));
+      // Reset counter so that it will not affect calling
+      // BmRepairAllControllers() somewhere else
+      mReconnectRepairCount = 0;
+    }
   }
 
   DEBUG_CODE (
--
2.7.4

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-02-26  8:56 ` Wang, Sunny (HPS SW)
@ 2018-02-26 11:34   ` Guo Heyi
  2018-02-27  2:47     ` Wang, Sunny (HPS SW)
  0 siblings, 1 reply; 14+ messages in thread
From: Guo Heyi @ 2018-02-26 11:34 UTC (permalink / raw)
  To: Wang, Sunny (HPS SW)
  Cc: Heyi Guo, edk2-devel@lists.01.org, Ruiyu Ni, Eric Dong, Star Zeng

Hi Sunny,

I didn't consider it as a value necessary for platform override, for the retry
count should only have some impact on boot performance and it only happens when
there is something wrong.

May I know what value you will use for your platform and why?

Thanks and regards,

Gary (Heyi Guo)

On Mon, Feb 26, 2018 at 08:56:50AM +0000, Wang, Sunny (HPS SW) wrote:
> Hi Heyi, 
> Just a suggestion. 
> Is it better to use a PCD instead of a define for MAX_RECONNECT_REPAIR? So that we can easily override it in our platform dsc file.
> 
> Regards,
> Sunny Wang
> 
> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Heyi Guo
> Sent: Monday, February 26, 2018 4:30 PM
> To: edk2-devel@lists.01.org
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>; Heyi Guo <heyi.guo@linaro.org>; Eric Dong <eric.dong@intel.com>; Star Zeng <star.zeng@intel.com>
> Subject: [edk2] [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
> 
> Function BmRepairAllControllers may recursively call itself if some driver health protocol returns EfiDriverHealthStatusReconnectRequired.
> However, driver health protocol of some buggy third party driver may always return such status even after one and another reconnect. The endless iteration will cause stack overflow and then system exception, and it may be not easy to find that the exception is actually caused by stack overflow.
> 
> So we limit the number of reconnect retry to 10 to improve code robustness.
> 
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
> Cc: Star Zeng <star.zeng@intel.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> ---
>  MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     |  6 ++++++
>  MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 17 ++++++++++++++++-
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> index 25a1d522fe84..9aa86b096525 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> @@ -108,6 +108,12 @@ CHAR16 *
>  #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
>  extern CHAR16  *mBmLoadOptionName[];
>  
> +//
> +// Maximum number of reconnect retry to repair controller; it is to 
> +limit the // number of recursive call of BmRepairAllControllers.
> +//
> +#define MAX_RECONNECT_REPAIR                        10
> +
>  /**
>    Visitor function to be called by BmForEachVariable for each variable
>    in variable storage.
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> index ddcee8b0676f..30d70f32af84 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> @@ -26,6 +26,12 @@ GLOBAL_REMOVE_IF_UNREFERENCED
>      L"Reboot Required"
>      };
>  
> +//
> +// Counter of reconnect retry to repair controller; it is to limit the 
> +// number of recursive call of BmRepairAllControllers.
> +//
> +STATIC UINTN mReconnectRepairCount;
> +
>  /**
>    Return the controller name.
>  
> @@ -549,7 +555,16 @@ BmRepairAllControllers (
>  
>  
>    if (ReconnectRequired) {
> -    BmRepairAllControllers ();
> +    if (mReconnectRepairCount < MAX_RECONNECT_REPAIR) {
> +      mReconnectRepairCount++;
> +      BmRepairAllControllers ();
> +    } else {
> +      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
> +        __FUNCTION__, __LINE__, mReconnectRepairCount));
> +      // Reset counter so that it will not affect calling
> +      // BmRepairAllControllers() somewhere else
> +      mReconnectRepairCount = 0;
> +    }
>    }
>  
>    DEBUG_CODE (
> --
> 2.7.4
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-02-26  8:29 [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth Heyi Guo
  2018-02-26  8:56 ` Wang, Sunny (HPS SW)
@ 2018-02-26 16:23 ` Laszlo Ersek
  2018-02-27  0:48   ` Guo Heyi
  1 sibling, 1 reply; 14+ messages in thread
From: Laszlo Ersek @ 2018-02-26 16:23 UTC (permalink / raw)
  To: Heyi Guo, edk2-devel; +Cc: Ruiyu Ni, Eric Dong, Star Zeng

On 02/26/18 09:29, Heyi Guo wrote:
> Function BmRepairAllControllers may recursively call itself if some
> driver health protocol returns EfiDriverHealthStatusReconnectRequired.
> However, driver health protocol of some buggy third party driver may
> always return such status even after one and another reconnect. The
> endless iteration will cause stack overflow and then system exception,
> and it may be not easy to find that the exception is actually caused
> by stack overflow.
> 
> So we limit the number of reconnect retry to 10 to improve code
> robustness.

Not really my place to comment on this, but how about removing the
recursion entirely, and turning the logic into a normal (iterative) loop
instead?

(1) Rename the current function to:

STATIC
VOID
BmRepairAllControllersWorker (
  OUT BOOLEAN *ReconnectRequired,
  OUT BOOLEAN *RebootRequired
  );


(2) The worker function should end right before

  if (ReconnectRequired) {
    BmRepairAllControllers ();
  }


(3) The worker function should not contain

  RebootRequired    = FALSE;
  ReconnectRequired = FALSE;

Such initialization should be left to the caller.


(4) The worker function should be called in a loop from a new
BmRepairAllControllers() function, like this:

  Reconnect = 0;
  RebootRequired = FALSE;
  do {
    ReconnectRequired = FALSE;
    BmRepairAllControllersWorker (&ReconnectRequired, &RebootRequired);
    ++Reconnect;
  } while (ReconnectRequired && Reconnect < MAX_RECONNECT_REPAIR);

  DEBUG_CODE (...);

  if (RebootRequired) {
    ...
  }


In addition to eliminating the shoddy recursive call (and the shoddier
global counter, ewww :) ), this would fix the following other warts with
the code:

- When a nested call chain is unwound, we currently dump a series of
"driver health info" lists (assuming no reboot is required), in the
DEBUG_CODE section. I would argue that we should do that only once, at
the end. (Even if we have to do it multiple times, it can be moved into
the worker function, to the end.)

- It seems to be sufficient to accumulate RebootRequired into one
variable (i.e. not multiple instances of the same local variable on the
stack) and to act upon it at the very end.


Feel free to ignore my comments -- I just think we should be moving in
the opposite direction; that is, away from recursion (especially from
recursion combined with global variables -- that's one difficult pattern
to reason about).

Thanks
Laszlo

> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
> Cc: Star Zeng <star.zeng@intel.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> ---
>  MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     |  6 ++++++
>  MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 17 ++++++++++++++++-
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> index 25a1d522fe84..9aa86b096525 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> @@ -108,6 +108,12 @@ CHAR16 *
>  #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
>  extern CHAR16  *mBmLoadOptionName[];
>  
> +//
> +// Maximum number of reconnect retry to repair controller; it is to limit the
> +// number of recursive call of BmRepairAllControllers.
> +//
> +#define MAX_RECONNECT_REPAIR                        10
> +
>  /**
>    Visitor function to be called by BmForEachVariable for each variable
>    in variable storage.
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> index ddcee8b0676f..30d70f32af84 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> @@ -26,6 +26,12 @@ GLOBAL_REMOVE_IF_UNREFERENCED
>      L"Reboot Required"
>      };
>  
> +//
> +// Counter of reconnect retry to repair controller; it is to limit the
> +// number of recursive call of BmRepairAllControllers.
> +//
> +STATIC UINTN mReconnectRepairCount;
> +
>  /**
>    Return the controller name.
>  
> @@ -549,7 +555,16 @@ BmRepairAllControllers (
>  
>  
>    if (ReconnectRequired) {
> -    BmRepairAllControllers ();
> +    if (mReconnectRepairCount < MAX_RECONNECT_REPAIR) {
> +      mReconnectRepairCount++;
> +      BmRepairAllControllers ();
> +    } else {
> +      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
> +        __FUNCTION__, __LINE__, mReconnectRepairCount));
> +      // Reset counter so that it will not affect calling
> +      // BmRepairAllControllers() somewhere else
> +      mReconnectRepairCount = 0;
> +    }
>    }
>  
>    DEBUG_CODE (
> 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-02-26 16:23 ` Laszlo Ersek
@ 2018-02-27  0:48   ` Guo Heyi
  2018-02-27  5:48     ` Ni, Ruiyu
  0 siblings, 1 reply; 14+ messages in thread
From: Guo Heyi @ 2018-02-27  0:48 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: Heyi Guo, edk2-devel, Ruiyu Ni, Eric Dong, Star Zeng

Hi Laszlo,

I agree the current patch makes the code ugly, and turning the logic into a
normal loop should be the perfect solution. If Ray also agrees on it, I can try
to do that.

Thanks and regards,

Heyi

On Mon, Feb 26, 2018 at 05:23:29PM +0100, Laszlo Ersek wrote:
> On 02/26/18 09:29, Heyi Guo wrote:
> > Function BmRepairAllControllers may recursively call itself if some
> > driver health protocol returns EfiDriverHealthStatusReconnectRequired.
> > However, driver health protocol of some buggy third party driver may
> > always return such status even after one and another reconnect. The
> > endless iteration will cause stack overflow and then system exception,
> > and it may be not easy to find that the exception is actually caused
> > by stack overflow.
> > 
> > So we limit the number of reconnect retry to 10 to improve code
> > robustness.
> 
> Not really my place to comment on this, but how about removing the
> recursion entirely, and turning the logic into a normal (iterative) loop
> instead?
> 
> (1) Rename the current function to:
> 
> STATIC
> VOID
> BmRepairAllControllersWorker (
>   OUT BOOLEAN *ReconnectRequired,
>   OUT BOOLEAN *RebootRequired
>   );
> 
> 
> (2) The worker function should end right before
> 
>   if (ReconnectRequired) {
>     BmRepairAllControllers ();
>   }
> 
> 
> (3) The worker function should not contain
> 
>   RebootRequired    = FALSE;
>   ReconnectRequired = FALSE;
> 
> Such initialization should be left to the caller.
> 
> 
> (4) The worker function should be called in a loop from a new
> BmRepairAllControllers() function, like this:
> 
>   Reconnect = 0;
>   RebootRequired = FALSE;
>   do {
>     ReconnectRequired = FALSE;
>     BmRepairAllControllersWorker (&ReconnectRequired, &RebootRequired);
>     ++Reconnect;
>   } while (ReconnectRequired && Reconnect < MAX_RECONNECT_REPAIR);
> 
>   DEBUG_CODE (...);
> 
>   if (RebootRequired) {
>     ...
>   }
> 
> 
> In addition to eliminating the shoddy recursive call (and the shoddier
> global counter, ewww :) ), this would fix the following other warts with
> the code:
> 
> - When a nested call chain is unwound, we currently dump a series of
> "driver health info" lists (assuming no reboot is required), in the
> DEBUG_CODE section. I would argue that we should do that only once, at
> the end. (Even if we have to do it multiple times, it can be moved into
> the worker function, to the end.)
> 
> - It seems to be sufficient to accumulate RebootRequired into one
> variable (i.e. not multiple instances of the same local variable on the
> stack) and to act upon it at the very end.
> 
> 
> Feel free to ignore my comments -- I just think we should be moving in
> the opposite direction; that is, away from recursion (especially from
> recursion combined with global variables -- that's one difficult pattern
> to reason about).
> 
> Thanks
> Laszlo
> 
> > Contributed-under: TianoCore Contribution Agreement 1.1
> > Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
> > Cc: Star Zeng <star.zeng@intel.com>
> > Cc: Eric Dong <eric.dong@intel.com>
> > Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> > ---
> >  MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     |  6 ++++++
> >  MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 17 ++++++++++++++++-
> >  2 files changed, 22 insertions(+), 1 deletion(-)
> > 
> > diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> > index 25a1d522fe84..9aa86b096525 100644
> > --- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> > +++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> > @@ -108,6 +108,12 @@ CHAR16 *
> >  #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
> >  extern CHAR16  *mBmLoadOptionName[];
> >  
> > +//
> > +// Maximum number of reconnect retry to repair controller; it is to limit the
> > +// number of recursive call of BmRepairAllControllers.
> > +//
> > +#define MAX_RECONNECT_REPAIR                        10
> > +
> >  /**
> >    Visitor function to be called by BmForEachVariable for each variable
> >    in variable storage.
> > diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> > index ddcee8b0676f..30d70f32af84 100644
> > --- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> > +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> > @@ -26,6 +26,12 @@ GLOBAL_REMOVE_IF_UNREFERENCED
> >      L"Reboot Required"
> >      };
> >  
> > +//
> > +// Counter of reconnect retry to repair controller; it is to limit the
> > +// number of recursive call of BmRepairAllControllers.
> > +//
> > +STATIC UINTN mReconnectRepairCount;
> > +
> >  /**
> >    Return the controller name.
> >  
> > @@ -549,7 +555,16 @@ BmRepairAllControllers (
> >  
> >  
> >    if (ReconnectRequired) {
> > -    BmRepairAllControllers ();
> > +    if (mReconnectRepairCount < MAX_RECONNECT_REPAIR) {
> > +      mReconnectRepairCount++;
> > +      BmRepairAllControllers ();
> > +    } else {
> > +      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
> > +        __FUNCTION__, __LINE__, mReconnectRepairCount));
> > +      // Reset counter so that it will not affect calling
> > +      // BmRepairAllControllers() somewhere else
> > +      mReconnectRepairCount = 0;
> > +    }
> >    }
> >  
> >    DEBUG_CODE (
> > 
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-02-26 11:34   ` Guo Heyi
@ 2018-02-27  2:47     ` Wang, Sunny (HPS SW)
  0 siblings, 0 replies; 14+ messages in thread
From: Wang, Sunny (HPS SW) @ 2018-02-27  2:47 UTC (permalink / raw)
  To: Guo Heyi
  Cc: edk2-devel@lists.01.org, Ruiyu Ni, Eric Dong, Star Zeng,
	Wang, Sunny (HPS SW)

Hi Heyi, 
Thanks for looking into my suggestion. You already mentioned the value I thought. :) The value is to get better boot performance by setting the MAX_RECONNECT_REPAIR to a smaller number. 10 times reconnect may be suitable for consumer products like laptop. However, it may be not suitable for server. For example, user installs a problematic NIC 4-port card and its OPROM produced 4 driver handles to manage 4 ports. For this case, 10 times reconnect may take much longer time.  
If you think it's still not worth adding a PCD for this or have other concern about adding a PCD, I'm fine with dropping this suggestion.   

Regards,
Sunny Wang

-----Original Message-----
From: Guo Heyi [mailto:heyi.guo@linaro.org] 
Sent: Monday, February 26, 2018 7:34 PM
To: Wang, Sunny (HPS SW) <sunnywang@hpe.com>
Cc: Heyi Guo <heyi.guo@linaro.org>; edk2-devel@lists.01.org; Ruiyu Ni <ruiyu.ni@intel.com>; Eric Dong <eric.dong@intel.com>; Star Zeng <star.zeng@intel.com>
Subject: Re: [edk2] [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
Importance: High

Hi Sunny,

I didn't consider it as a value necessary for platform override, for the retry count should only have some impact on boot performance and it only happens when there is something wrong.

May I know what value you will use for your platform and why?

Thanks and regards,

Gary (Heyi Guo)

On Mon, Feb 26, 2018 at 08:56:50AM +0000, Wang, Sunny (HPS SW) wrote:
> Hi Heyi,
> Just a suggestion. 
> Is it better to use a PCD instead of a define for MAX_RECONNECT_REPAIR? So that we can easily override it in our platform dsc file.
> 
> Regards,
> Sunny Wang
> 
> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of 
> Heyi Guo
> Sent: Monday, February 26, 2018 4:30 PM
> To: edk2-devel@lists.01.org
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>; Heyi Guo <heyi.guo@linaro.org>; 
> Eric Dong <eric.dong@intel.com>; Star Zeng <star.zeng@intel.com>
> Subject: [edk2] [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit 
> recursive call depth
> 
> Function BmRepairAllControllers may recursively call itself if some driver health protocol returns EfiDriverHealthStatusReconnectRequired.
> However, driver health protocol of some buggy third party driver may always return such status even after one and another reconnect. The endless iteration will cause stack overflow and then system exception, and it may be not easy to find that the exception is actually caused by stack overflow.
> 
> So we limit the number of reconnect retry to 10 to improve code robustness.
> 
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
> Cc: Star Zeng <star.zeng@intel.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> ---
>  MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     |  6 ++++++
>  MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 17 
> ++++++++++++++++-
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h 
> b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> index 25a1d522fe84..9aa86b096525 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> @@ -108,6 +108,12 @@ CHAR16 *
>  #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
>  extern CHAR16  *mBmLoadOptionName[];
>  
> +//
> +// Maximum number of reconnect retry to repair controller; it is to 
> +limit the // number of recursive call of BmRepairAllControllers.
> +//
> +#define MAX_RECONNECT_REPAIR                        10
> +
>  /**
>    Visitor function to be called by BmForEachVariable for each variable
>    in variable storage.
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c 
> b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> index ddcee8b0676f..30d70f32af84 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> @@ -26,6 +26,12 @@ GLOBAL_REMOVE_IF_UNREFERENCED
>      L"Reboot Required"
>      };
>  
> +//
> +// Counter of reconnect retry to repair controller; it is to limit 
> +the // number of recursive call of BmRepairAllControllers.
> +//
> +STATIC UINTN mReconnectRepairCount;
> +
>  /**
>    Return the controller name.
>  
> @@ -549,7 +555,16 @@ BmRepairAllControllers (
>  
>  
>    if (ReconnectRequired) {
> -    BmRepairAllControllers ();
> +    if (mReconnectRepairCount < MAX_RECONNECT_REPAIR) {
> +      mReconnectRepairCount++;
> +      BmRepairAllControllers ();
> +    } else {
> +      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
> +        __FUNCTION__, __LINE__, mReconnectRepairCount));
> +      // Reset counter so that it will not affect calling
> +      // BmRepairAllControllers() somewhere else
> +      mReconnectRepairCount = 0;
> +    }
>    }
>  
>    DEBUG_CODE (
> --
> 2.7.4
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-02-27  0:48   ` Guo Heyi
@ 2018-02-27  5:48     ` Ni, Ruiyu
  2018-02-27 10:29       ` Laszlo Ersek
  0 siblings, 1 reply; 14+ messages in thread
From: Ni, Ruiyu @ 2018-02-27  5:48 UTC (permalink / raw)
  To: Guo Heyi, Laszlo Ersek; +Cc: edk2-devel, Star Zeng, Eric Dong

On 2/27/2018 8:48 AM, Guo Heyi wrote:
> Hi Laszlo,
> 
> I agree the current patch makes the code ugly, and turning the logic into a
> normal loop should be the perfect solution. If Ray also agrees on it, I can try
> to do that.
> 
> Thanks and regards,
> 
> Heyi
> 
> On Mon, Feb 26, 2018 at 05:23:29PM +0100, Laszlo Ersek wrote:
>> On 02/26/18 09:29, Heyi Guo wrote:
>>> Function BmRepairAllControllers may recursively call itself if some
>>> driver health protocol returns EfiDriverHealthStatusReconnectRequired.
>>> However, driver health protocol of some buggy third party driver may
>>> always return such status even after one and another reconnect. The
>>> endless iteration will cause stack overflow and then system exception,
>>> and it may be not easy to find that the exception is actually caused
>>> by stack overflow.
>>>
>>> So we limit the number of reconnect retry to 10 to improve code
>>> robustness.
>>
>> Not really my place to comment on this, but how about removing the
>> recursion entirely, and turning the logic into a normal (iterative) loop
>> instead?
>>
>> (1) Rename the current function to:
>>
>> STATIC
>> VOID
>> BmRepairAllControllersWorker (
>>    OUT BOOLEAN *ReconnectRequired,
>>    OUT BOOLEAN *RebootRequired
>>    );
>>
>>
>> (2) The worker function should end right before
>>
>>    if (ReconnectRequired) {
>>      BmRepairAllControllers ();
>>    }
>>
>>
>> (3) The worker function should not contain
>>
>>    RebootRequired    = FALSE;
>>    ReconnectRequired = FALSE;
>>
>> Such initialization should be left to the caller.
>>
>>
>> (4) The worker function should be called in a loop from a new
>> BmRepairAllControllers() function, like this:
>>
>>    Reconnect = 0;
>>    RebootRequired = FALSE;
>>    do {
>>      ReconnectRequired = FALSE;
>>      BmRepairAllControllersWorker (&ReconnectRequired, &RebootRequired);
>>      ++Reconnect;
>>    } while (ReconnectRequired && Reconnect < MAX_RECONNECT_REPAIR);
>>
>>    DEBUG_CODE (...);
>>
>>    if (RebootRequired) {
>>      ...
>>    }
>>
>>
>> In addition to eliminating the shoddy recursive call (and the shoddier
>> global counter, ewww :) ), this would fix the following other warts with
>> the code:
>>
>> - When a nested call chain is unwound, we currently dump a series of
>> "driver health info" lists (assuming no reboot is required), in the
>> DEBUG_CODE section. I would argue that we should do that only once, at
>> the end. (Even if we have to do it multiple times, it can be moved into
>> the worker function, to the end.)
>>
>> - It seems to be sufficient to accumulate RebootRequired into one
>> variable (i.e. not multiple instances of the same local variable on the
>> stack) and to act upon it at the very end.
>>
>>
>> Feel free to ignore my comments -- I just think we should be moving in
>> the opposite direction; that is, away from recursion (especially from
>> recursion combined with global variables -- that's one difficult pattern
>> to reason about).

How about to just remove the global variable?
I prefer to change BmRepairAllControllers in the following prototype:
VOID
BmRepairAllControllers (
   UINTN  ReconnectRepairCount
   );
And start to call this like BmRepairAllControllers (0).

I am neutral between recursive call and while loop.
But I am afraid such a big change may introduce some bugs.
And I also like to move the DEBUG_CODE to before:
if (ReconnectRequired) {
   BmRepairAllControllers (ReconnectRepairCount + 1);
}
So that we can dump the health info for every reconnect repair.



>>
>> Thanks
>> Laszlo
>>
>>> Contributed-under: TianoCore Contribution Agreement 1.1
>>> Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
>>> Cc: Star Zeng <star.zeng@intel.com>
>>> Cc: Eric Dong <eric.dong@intel.com>
>>> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
>>> ---
>>>   MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     |  6 ++++++
>>>   MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 17 ++++++++++++++++-
>>>   2 files changed, 22 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
>>> index 25a1d522fe84..9aa86b096525 100644
>>> --- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
>>> +++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
>>> @@ -108,6 +108,12 @@ CHAR16 *
>>>   #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
>>>   extern CHAR16  *mBmLoadOptionName[];
>>>   
>>> +//
>>> +// Maximum number of reconnect retry to repair controller; it is to limit the
>>> +// number of recursive call of BmRepairAllControllers.
>>> +//
>>> +#define MAX_RECONNECT_REPAIR                        10
>>> +
>>>   /**
>>>     Visitor function to be called by BmForEachVariable for each variable
>>>     in variable storage.
>>> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
>>> index ddcee8b0676f..30d70f32af84 100644
>>> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
>>> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
>>> @@ -26,6 +26,12 @@ GLOBAL_REMOVE_IF_UNREFERENCED
>>>       L"Reboot Required"
>>>       };
>>>   
>>> +//
>>> +// Counter of reconnect retry to repair controller; it is to limit the
>>> +// number of recursive call of BmRepairAllControllers.
>>> +//
>>> +STATIC UINTN mReconnectRepairCount;
>>> +
>>>   /**
>>>     Return the controller name.
>>>   
>>> @@ -549,7 +555,16 @@ BmRepairAllControllers (
>>>   
>>>   
>>>     if (ReconnectRequired) {
>>> -    BmRepairAllControllers ();
>>> +    if (mReconnectRepairCount < MAX_RECONNECT_REPAIR) {
>>> +      mReconnectRepairCount++;
>>> +      BmRepairAllControllers ();
>>> +    } else {
>>> +      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
>>> +        __FUNCTION__, __LINE__, mReconnectRepairCount));
>>> +      // Reset counter so that it will not affect calling
>>> +      // BmRepairAllControllers() somewhere else
>>> +      mReconnectRepairCount = 0;
>>> +    }
>>>     }
>>>   
>>>     DEBUG_CODE (
>>>
>>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
> 


-- 
Thanks,
Ray


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-02-27  5:48     ` Ni, Ruiyu
@ 2018-02-27 10:29       ` Laszlo Ersek
  2018-02-27 10:39         ` Guo Heyi
  0 siblings, 1 reply; 14+ messages in thread
From: Laszlo Ersek @ 2018-02-27 10:29 UTC (permalink / raw)
  To: Ni, Ruiyu, Guo Heyi; +Cc: edk2-devel, Star Zeng, Eric Dong

On 02/27/18 06:48, Ni, Ruiyu wrote:
> On 2/27/2018 8:48 AM, Guo Heyi wrote:
>> Hi Laszlo,
>>
>> I agree the current patch makes the code ugly, and turning the logic
>> into a
>> normal loop should be the perfect solution. If Ray also agrees on it,
>> I can try
>> to do that.
>>
>> Thanks and regards,
>>
>> Heyi
>>
>> On Mon, Feb 26, 2018 at 05:23:29PM +0100, Laszlo Ersek wrote:
>>> On 02/26/18 09:29, Heyi Guo wrote:
>>>> Function BmRepairAllControllers may recursively call itself if some
>>>> driver health protocol returns EfiDriverHealthStatusReconnectRequired.
>>>> However, driver health protocol of some buggy third party driver may
>>>> always return such status even after one and another reconnect. The
>>>> endless iteration will cause stack overflow and then system exception,
>>>> and it may be not easy to find that the exception is actually caused
>>>> by stack overflow.
>>>>
>>>> So we limit the number of reconnect retry to 10 to improve code
>>>> robustness.
>>>
>>> Not really my place to comment on this, but how about removing the
>>> recursion entirely, and turning the logic into a normal (iterative) loop
>>> instead?
>>>
>>> (1) Rename the current function to:
>>>
>>> STATIC
>>> VOID
>>> BmRepairAllControllersWorker (
>>>    OUT BOOLEAN *ReconnectRequired,
>>>    OUT BOOLEAN *RebootRequired
>>>    );
>>>
>>>
>>> (2) The worker function should end right before
>>>
>>>    if (ReconnectRequired) {
>>>      BmRepairAllControllers ();
>>>    }
>>>
>>>
>>> (3) The worker function should not contain
>>>
>>>    RebootRequired    = FALSE;
>>>    ReconnectRequired = FALSE;
>>>
>>> Such initialization should be left to the caller.
>>>
>>>
>>> (4) The worker function should be called in a loop from a new
>>> BmRepairAllControllers() function, like this:
>>>
>>>    Reconnect = 0;
>>>    RebootRequired = FALSE;
>>>    do {
>>>      ReconnectRequired = FALSE;
>>>      BmRepairAllControllersWorker (&ReconnectRequired, &RebootRequired);
>>>      ++Reconnect;
>>>    } while (ReconnectRequired && Reconnect < MAX_RECONNECT_REPAIR);
>>>
>>>    DEBUG_CODE (...);
>>>
>>>    if (RebootRequired) {
>>>      ...
>>>    }
>>>
>>>
>>> In addition to eliminating the shoddy recursive call (and the shoddier
>>> global counter, ewww :) ), this would fix the following other warts with
>>> the code:
>>>
>>> - When a nested call chain is unwound, we currently dump a series of
>>> "driver health info" lists (assuming no reboot is required), in the
>>> DEBUG_CODE section. I would argue that we should do that only once, at
>>> the end. (Even if we have to do it multiple times, it can be moved into
>>> the worker function, to the end.)
>>>
>>> - It seems to be sufficient to accumulate RebootRequired into one
>>> variable (i.e. not multiple instances of the same local variable on the
>>> stack) and to act upon it at the very end.
>>>
>>>
>>> Feel free to ignore my comments -- I just think we should be moving in
>>> the opposite direction; that is, away from recursion (especially from
>>> recursion combined with global variables -- that's one difficult pattern
>>> to reason about).
> 
> How about to just remove the global variable?
> I prefer to change BmRepairAllControllers in the following prototype:
> VOID
> BmRepairAllControllers (
>   UINTN  ReconnectRepairCount
>   );
> And start to call this like BmRepairAllControllers (0).
> 
> I am neutral between recursive call and while loop.
> But I am afraid such a big change may introduce some bugs.
> And I also like to move the DEBUG_CODE to before:
> if (ReconnectRequired) {
>   BmRepairAllControllers (ReconnectRepairCount + 1);
> }
> So that we can dump the health info for every reconnect repair.

Sure, that too works for me.

Thanks!
Laszlo


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-02-27 10:29       ` Laszlo Ersek
@ 2018-02-27 10:39         ` Guo Heyi
  0 siblings, 0 replies; 14+ messages in thread
From: Guo Heyi @ 2018-02-27 10:39 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: Ni, Ruiyu, Guo Heyi, edk2-devel, Star Zeng, Eric Dong

Thanks Ray and Laszlo, I will create v2 according to your comments.

Regards,

Heyi

On Tue, Feb 27, 2018 at 11:29:18AM +0100, Laszlo Ersek wrote:
> On 02/27/18 06:48, Ni, Ruiyu wrote:
> > On 2/27/2018 8:48 AM, Guo Heyi wrote:
> >> Hi Laszlo,
> >>
> >> I agree the current patch makes the code ugly, and turning the logic
> >> into a
> >> normal loop should be the perfect solution. If Ray also agrees on it,
> >> I can try
> >> to do that.
> >>
> >> Thanks and regards,
> >>
> >> Heyi
> >>
> >> On Mon, Feb 26, 2018 at 05:23:29PM +0100, Laszlo Ersek wrote:
> >>> On 02/26/18 09:29, Heyi Guo wrote:
> >>>> Function BmRepairAllControllers may recursively call itself if some
> >>>> driver health protocol returns EfiDriverHealthStatusReconnectRequired.
> >>>> However, driver health protocol of some buggy third party driver may
> >>>> always return such status even after one and another reconnect. The
> >>>> endless iteration will cause stack overflow and then system exception,
> >>>> and it may be not easy to find that the exception is actually caused
> >>>> by stack overflow.
> >>>>
> >>>> So we limit the number of reconnect retry to 10 to improve code
> >>>> robustness.
> >>>
> >>> Not really my place to comment on this, but how about removing the
> >>> recursion entirely, and turning the logic into a normal (iterative) loop
> >>> instead?
> >>>
> >>> (1) Rename the current function to:
> >>>
> >>> STATIC
> >>> VOID
> >>> BmRepairAllControllersWorker (
> >>>    OUT BOOLEAN *ReconnectRequired,
> >>>    OUT BOOLEAN *RebootRequired
> >>>    );
> >>>
> >>>
> >>> (2) The worker function should end right before
> >>>
> >>>    if (ReconnectRequired) {
> >>>      BmRepairAllControllers ();
> >>>    }
> >>>
> >>>
> >>> (3) The worker function should not contain
> >>>
> >>>    RebootRequired    = FALSE;
> >>>    ReconnectRequired = FALSE;
> >>>
> >>> Such initialization should be left to the caller.
> >>>
> >>>
> >>> (4) The worker function should be called in a loop from a new
> >>> BmRepairAllControllers() function, like this:
> >>>
> >>>    Reconnect = 0;
> >>>    RebootRequired = FALSE;
> >>>    do {
> >>>      ReconnectRequired = FALSE;
> >>>      BmRepairAllControllersWorker (&ReconnectRequired, &RebootRequired);
> >>>      ++Reconnect;
> >>>    } while (ReconnectRequired && Reconnect < MAX_RECONNECT_REPAIR);
> >>>
> >>>    DEBUG_CODE (...);
> >>>
> >>>    if (RebootRequired) {
> >>>      ...
> >>>    }
> >>>
> >>>
> >>> In addition to eliminating the shoddy recursive call (and the shoddier
> >>> global counter, ewww :) ), this would fix the following other warts with
> >>> the code:
> >>>
> >>> - When a nested call chain is unwound, we currently dump a series of
> >>> "driver health info" lists (assuming no reboot is required), in the
> >>> DEBUG_CODE section. I would argue that we should do that only once, at
> >>> the end. (Even if we have to do it multiple times, it can be moved into
> >>> the worker function, to the end.)
> >>>
> >>> - It seems to be sufficient to accumulate RebootRequired into one
> >>> variable (i.e. not multiple instances of the same local variable on the
> >>> stack) and to act upon it at the very end.
> >>>
> >>>
> >>> Feel free to ignore my comments -- I just think we should be moving in
> >>> the opposite direction; that is, away from recursion (especially from
> >>> recursion combined with global variables -- that's one difficult pattern
> >>> to reason about).
> > 
> > How about to just remove the global variable?
> > I prefer to change BmRepairAllControllers in the following prototype:
> > VOID
> > BmRepairAllControllers (
> >   UINTN  ReconnectRepairCount
> >   );
> > And start to call this like BmRepairAllControllers (0).
> > 
> > I am neutral between recursive call and while loop.
> > But I am afraid such a big change may introduce some bugs.
> > And I also like to move the DEBUG_CODE to before:
> > if (ReconnectRequired) {
> >   BmRepairAllControllers (ReconnectRepairCount + 1);
> > }
> > So that we can dump the health info for every reconnect repair.
> 
> Sure, that too works for me.
> 
> Thanks!
> Laszlo


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
@ 2018-03-01  2:39 Heyi Guo
  2018-03-01  2:43 ` Guo Heyi
  2018-03-01  4:46 ` Ni, Ruiyu
  0 siblings, 2 replies; 14+ messages in thread
From: Heyi Guo @ 2018-03-01  2:39 UTC (permalink / raw)
  To: edk2-devel; +Cc: Heyi Guo, Star Zeng, Eric Dong, Ruiyu Ni, Laszlo Ersek

Function BmRepairAllControllers may recursively call itself if some
driver health protocol returns EfiDriverHealthStatusReconnectRequired.
However, driver health protocol of some buggy third party driver may
always return such status even after one and another reconnect. The
endless iteration will cause stack overflow and then system exception,
and it may be not easy to find that the exception is actually caused
by stack overflow.

So we limit the number of reconnect retry to 10 to improve code
robustness, and DEBUG_CODE is moved ahead before recursive repair to
track the repair result.

We also remove a duplicated declaration of BmRepairAllControllers() in
InternalBm.h in this patch, for it is only a trivial change.

Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
Cc: Star Zeng <star.zeng@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
---

Notes:
    v2
    - Use argument instead of global variable to record the recursive
      count [Ray]
    - Move DEBUG_CODE before recursively calling BmRepairAllControllers()
      to track the change of each reconnect [Ray]
    - Remove a duplicated declaration of BmRepairAllControllers() in
      InternalBm.h.

 MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     | 19 ++++++++++---------
 MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c         |  2 +-
 MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 18 +++++++++++++-----
 3 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
index 25a1d522fe84..21ecd8584d24 100644
--- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
+++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
@@ -108,6 +108,12 @@ CHAR16 *
 #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
 extern CHAR16  *mBmLoadOptionName[];
 
+//
+// Maximum number of reconnect retry to repair controller; it is to limit the
+// number of recursive call of BmRepairAllControllers.
+//
+#define MAX_RECONNECT_REPAIR                        10
+
 /**
   Visitor function to be called by BmForEachVariable for each variable
   in variable storage.
@@ -145,10 +151,13 @@ typedef struct {
 
 /**
   Repair all the controllers according to the Driver Health status queried.
+
+  @param ReconnectRepairCount     To record the number of recursive call of
+                                  this function itself.
 **/
 VOID
 BmRepairAllControllers (
-  VOID
+  UINTN       ReconnectRepairCount
   );
 
 #define BM_HOTKEY_SIGNATURE SIGNATURE_32 ('b', 'm', 'h', 'k')
@@ -328,14 +337,6 @@ BmDelPartMatchInstance (
   );
 
 /**
-  Repair all the controllers according to the Driver Health status queried.
-**/
-VOID
-BmRepairAllControllers (
-  VOID
-  );
-
-/**
   Print the device path info.
 
   @param DevicePath           The device path need to print.
diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c b/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
index ce19ae400660..b842d5824aed 100644
--- a/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
+++ b/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
@@ -1767,7 +1767,7 @@ EfiBootManagerBoot (
     //
     // 4. Repair system through DriverHealth protocol
     //
-    BmRepairAllControllers ();
+    BmRepairAllControllers (0);
   }
 
   PERF_START_EX (gImageHandle, "BdsAttempt", NULL, 0, (UINT32) OptionNumber);
diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
index ddcee8b0676f..db2f859ae73d 100644
--- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
+++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
@@ -423,10 +423,13 @@ EfiBootManagerFreeDriverHealthInfo (
 
 /**
   Repair all the controllers according to the Driver Health status queried.
+
+  @param ReconnectRepairCount     To record the number of recursive call of
+                                  this function itself.
 **/
 VOID
 BmRepairAllControllers (
-  VOID
+  UINTN       ReconnectRepairCount
   )
 {
   EFI_STATUS                          Status;
@@ -548,10 +551,6 @@ BmRepairAllControllers (
   EfiBootManagerFreeDriverHealthInfo (DriverHealthInfo, Count);
 
 
-  if (ReconnectRequired) {
-    BmRepairAllControllers ();
-  }
-
   DEBUG_CODE (
     CHAR16 *ControllerName;
 
@@ -576,6 +575,15 @@ BmRepairAllControllers (
     EfiBootManagerFreeDriverHealthInfo (DriverHealthInfo, Count);
     );
 
+  if (ReconnectRequired) {
+    if (ReconnectRepairCount < MAX_RECONNECT_REPAIR) {
+      BmRepairAllControllers (ReconnectRepairCount + 1);
+    } else {
+      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
+        __FUNCTION__, __LINE__, ReconnectRepairCount));
+    }
+  }
+
   if (RebootRequired) {
     DEBUG ((EFI_D_INFO, "[BDS] One of the Driver Health instances requires rebooting.\n"));
     gRT->ResetSystem (EfiResetWarm, EFI_SUCCESS, 0, NULL);
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-03-01  2:39 Heyi Guo
@ 2018-03-01  2:43 ` Guo Heyi
  2018-03-01  4:46 ` Ni, Ruiyu
  1 sibling, 0 replies; 14+ messages in thread
From: Guo Heyi @ 2018-03-01  2:43 UTC (permalink / raw)
  To: Heyi Guo; +Cc: edk2-devel, Star Zeng, Eric Dong, Ruiyu Ni, Laszlo Ersek

Sorry, forgot to add a "v2" mark in the subjuect prefix...

Regards,
Heyi

On Thu, Mar 01, 2018 at 10:39:32AM +0800, Heyi Guo wrote:
> Function BmRepairAllControllers may recursively call itself if some
> driver health protocol returns EfiDriverHealthStatusReconnectRequired.
> However, driver health protocol of some buggy third party driver may
> always return such status even after one and another reconnect. The
> endless iteration will cause stack overflow and then system exception,
> and it may be not easy to find that the exception is actually caused
> by stack overflow.
> 
> So we limit the number of reconnect retry to 10 to improve code
> robustness, and DEBUG_CODE is moved ahead before recursive repair to
> track the repair result.
> 
> We also remove a duplicated declaration of BmRepairAllControllers() in
> InternalBm.h in this patch, for it is only a trivial change.
> 
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
> Cc: Star Zeng <star.zeng@intel.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> ---
> 
> Notes:
>     v2
>     - Use argument instead of global variable to record the recursive
>       count [Ray]
>     - Move DEBUG_CODE before recursively calling BmRepairAllControllers()
>       to track the change of each reconnect [Ray]
>     - Remove a duplicated declaration of BmRepairAllControllers() in
>       InternalBm.h.
> 
>  MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     | 19 ++++++++++---------
>  MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c         |  2 +-
>  MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 18 +++++++++++++-----
>  3 files changed, 24 insertions(+), 15 deletions(-)
> 
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> index 25a1d522fe84..21ecd8584d24 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> @@ -108,6 +108,12 @@ CHAR16 *
>  #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
>  extern CHAR16  *mBmLoadOptionName[];
>  
> +//
> +// Maximum number of reconnect retry to repair controller; it is to limit the
> +// number of recursive call of BmRepairAllControllers.
> +//
> +#define MAX_RECONNECT_REPAIR                        10
> +
>  /**
>    Visitor function to be called by BmForEachVariable for each variable
>    in variable storage.
> @@ -145,10 +151,13 @@ typedef struct {
>  
>  /**
>    Repair all the controllers according to the Driver Health status queried.
> +
> +  @param ReconnectRepairCount     To record the number of recursive call of
> +                                  this function itself.
>  **/
>  VOID
>  BmRepairAllControllers (
> -  VOID
> +  UINTN       ReconnectRepairCount
>    );
>  
>  #define BM_HOTKEY_SIGNATURE SIGNATURE_32 ('b', 'm', 'h', 'k')
> @@ -328,14 +337,6 @@ BmDelPartMatchInstance (
>    );
>  
>  /**
> -  Repair all the controllers according to the Driver Health status queried.
> -**/
> -VOID
> -BmRepairAllControllers (
> -  VOID
> -  );
> -
> -/**
>    Print the device path info.
>  
>    @param DevicePath           The device path need to print.
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c b/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
> index ce19ae400660..b842d5824aed 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
> @@ -1767,7 +1767,7 @@ EfiBootManagerBoot (
>      //
>      // 4. Repair system through DriverHealth protocol
>      //
> -    BmRepairAllControllers ();
> +    BmRepairAllControllers (0);
>    }
>  
>    PERF_START_EX (gImageHandle, "BdsAttempt", NULL, 0, (UINT32) OptionNumber);
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> index ddcee8b0676f..db2f859ae73d 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> @@ -423,10 +423,13 @@ EfiBootManagerFreeDriverHealthInfo (
>  
>  /**
>    Repair all the controllers according to the Driver Health status queried.
> +
> +  @param ReconnectRepairCount     To record the number of recursive call of
> +                                  this function itself.
>  **/
>  VOID
>  BmRepairAllControllers (
> -  VOID
> +  UINTN       ReconnectRepairCount
>    )
>  {
>    EFI_STATUS                          Status;
> @@ -548,10 +551,6 @@ BmRepairAllControllers (
>    EfiBootManagerFreeDriverHealthInfo (DriverHealthInfo, Count);
>  
>  
> -  if (ReconnectRequired) {
> -    BmRepairAllControllers ();
> -  }
> -
>    DEBUG_CODE (
>      CHAR16 *ControllerName;
>  
> @@ -576,6 +575,15 @@ BmRepairAllControllers (
>      EfiBootManagerFreeDriverHealthInfo (DriverHealthInfo, Count);
>      );
>  
> +  if (ReconnectRequired) {
> +    if (ReconnectRepairCount < MAX_RECONNECT_REPAIR) {
> +      BmRepairAllControllers (ReconnectRepairCount + 1);
> +    } else {
> +      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
> +        __FUNCTION__, __LINE__, ReconnectRepairCount));
> +    }
> +  }
> +
>    if (RebootRequired) {
>      DEBUG ((EFI_D_INFO, "[BDS] One of the Driver Health instances requires rebooting.\n"));
>      gRT->ResetSystem (EfiResetWarm, EFI_SUCCESS, 0, NULL);
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-03-01  2:39 Heyi Guo
  2018-03-01  2:43 ` Guo Heyi
@ 2018-03-01  4:46 ` Ni, Ruiyu
  2018-03-07  1:54   ` Guo Heyi
  1 sibling, 1 reply; 14+ messages in thread
From: Ni, Ruiyu @ 2018-03-01  4:46 UTC (permalink / raw)
  To: Heyi Guo, edk2-devel; +Cc: Star Zeng, Eric Dong, Laszlo Ersek

On 3/1/2018 10:39 AM, Heyi Guo wrote:
> Function BmRepairAllControllers may recursively call itself if some
> driver health protocol returns EfiDriverHealthStatusReconnectRequired.
> However, driver health protocol of some buggy third party driver may
> always return such status even after one and another reconnect. The
> endless iteration will cause stack overflow and then system exception,
> and it may be not easy to find that the exception is actually caused
> by stack overflow.
> 
> So we limit the number of reconnect retry to 10 to improve code
> robustness, and DEBUG_CODE is moved ahead before recursive repair to
> track the repair result.
> 
> We also remove a duplicated declaration of BmRepairAllControllers() in
> InternalBm.h in this patch, for it is only a trivial change.
> 
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
> Cc: Star Zeng <star.zeng@intel.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> ---
> 
> Notes:
>      v2
>      - Use argument instead of global variable to record the recursive
>        count [Ray]
>      - Move DEBUG_CODE before recursively calling BmRepairAllControllers()
>        to track the change of each reconnect [Ray]
>      - Remove a duplicated declaration of BmRepairAllControllers() in
>        InternalBm.h.
> 
>   MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     | 19 ++++++++++---------
>   MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c         |  2 +-
>   MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 18 +++++++++++++-----
>   3 files changed, 24 insertions(+), 15 deletions(-)
> 
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> index 25a1d522fe84..21ecd8584d24 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> @@ -108,6 +108,12 @@ CHAR16 *
>   #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
>   extern CHAR16  *mBmLoadOptionName[];
>   
> +//
> +// Maximum number of reconnect retry to repair controller; it is to limit the
> +// number of recursive call of BmRepairAllControllers.
> +//
> +#define MAX_RECONNECT_REPAIR                        10
> +
>   /**
>     Visitor function to be called by BmForEachVariable for each variable
>     in variable storage.
> @@ -145,10 +151,13 @@ typedef struct {
>   
>   /**
>     Repair all the controllers according to the Driver Health status queried.
> +
> +  @param ReconnectRepairCount     To record the number of recursive call of
> +                                  this function itself.
>   **/
>   VOID
>   BmRepairAllControllers (
> -  VOID
> +  UINTN       ReconnectRepairCount
>     );
>   
>   #define BM_HOTKEY_SIGNATURE SIGNATURE_32 ('b', 'm', 'h', 'k')
> @@ -328,14 +337,6 @@ BmDelPartMatchInstance (
>     );
>   
>   /**
> -  Repair all the controllers according to the Driver Health status queried.
> -**/
> -VOID
> -BmRepairAllControllers (
> -  VOID
> -  );
> -
> -/**
>     Print the device path info.
>   
>     @param DevicePath           The device path need to print.
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c b/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
> index ce19ae400660..b842d5824aed 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
> @@ -1767,7 +1767,7 @@ EfiBootManagerBoot (
>       //
>       // 4. Repair system through DriverHealth protocol
>       //
> -    BmRepairAllControllers ();
> +    BmRepairAllControllers (0);
>     }
>   
>     PERF_START_EX (gImageHandle, "BdsAttempt", NULL, 0, (UINT32) OptionNumber);
> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> index ddcee8b0676f..db2f859ae73d 100644
> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> @@ -423,10 +423,13 @@ EfiBootManagerFreeDriverHealthInfo (
>   
>   /**
>     Repair all the controllers according to the Driver Health status queried.
> +
> +  @param ReconnectRepairCount     To record the number of recursive call of
> +                                  this function itself.
>   **/
>   VOID
>   BmRepairAllControllers (
> -  VOID
> +  UINTN       ReconnectRepairCount
>     )
>   {
>     EFI_STATUS                          Status;
> @@ -548,10 +551,6 @@ BmRepairAllControllers (
>     EfiBootManagerFreeDriverHealthInfo (DriverHealthInfo, Count);
>   
>   
> -  if (ReconnectRequired) {
> -    BmRepairAllControllers ();
> -  }
> -
>     DEBUG_CODE (
>       CHAR16 *ControllerName;
>   
> @@ -576,6 +575,15 @@ BmRepairAllControllers (
>       EfiBootManagerFreeDriverHealthInfo (DriverHealthInfo, Count);
>       );
>   
> +  if (ReconnectRequired) {
> +    if (ReconnectRepairCount < MAX_RECONNECT_REPAIR) {
> +      BmRepairAllControllers (ReconnectRepairCount + 1);
> +    } else {
> +      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
> +        __FUNCTION__, __LINE__, ReconnectRepairCount));
> +    }
> +  }
> +
>     if (RebootRequired) {
>       DEBUG ((EFI_D_INFO, "[BDS] One of the Driver Health instances requires rebooting.\n"));
>       gRT->ResetSystem (EfiResetWarm, EFI_SUCCESS, 0, NULL);
> 
Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>

-- 
Thanks,
Ray


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-03-01  4:46 ` Ni, Ruiyu
@ 2018-03-07  1:54   ` Guo Heyi
  2018-03-08  2:53     ` Ni, Ruiyu
  0 siblings, 1 reply; 14+ messages in thread
From: Guo Heyi @ 2018-03-07  1:54 UTC (permalink / raw)
  To: Ni, Ruiyu; +Cc: Heyi Guo, edk2-devel, Star Zeng, Eric Dong, Laszlo Ersek

Hi Ray,

Sorry to disturb, but I didn't find the patch committed. Could you help to do
that?

Thanks,
Heyi


On Thu, Mar 01, 2018 at 12:46:32PM +0800, Ni, Ruiyu wrote:
> On 3/1/2018 10:39 AM, Heyi Guo wrote:
> >Function BmRepairAllControllers may recursively call itself if some
> >driver health protocol returns EfiDriverHealthStatusReconnectRequired.
> >However, driver health protocol of some buggy third party driver may
> >always return such status even after one and another reconnect. The
> >endless iteration will cause stack overflow and then system exception,
> >and it may be not easy to find that the exception is actually caused
> >by stack overflow.
> >
> >So we limit the number of reconnect retry to 10 to improve code
> >robustness, and DEBUG_CODE is moved ahead before recursive repair to
> >track the repair result.
> >
> >We also remove a duplicated declaration of BmRepairAllControllers() in
> >InternalBm.h in this patch, for it is only a trivial change.
> >
> >Contributed-under: TianoCore Contribution Agreement 1.1
> >Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
> >Cc: Star Zeng <star.zeng@intel.com>
> >Cc: Eric Dong <eric.dong@intel.com>
> >Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> >Cc: Laszlo Ersek <lersek@redhat.com>
> >---
> >
> >Notes:
> >     v2
> >     - Use argument instead of global variable to record the recursive
> >       count [Ray]
> >     - Move DEBUG_CODE before recursively calling BmRepairAllControllers()
> >       to track the change of each reconnect [Ray]
> >     - Remove a duplicated declaration of BmRepairAllControllers() in
> >       InternalBm.h.
> >
> >  MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     | 19 ++++++++++---------
> >  MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c         |  2 +-
> >  MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 18 +++++++++++++-----
> >  3 files changed, 24 insertions(+), 15 deletions(-)
> >
> >diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> >index 25a1d522fe84..21ecd8584d24 100644
> >--- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> >+++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
> >@@ -108,6 +108,12 @@ CHAR16 *
> >  #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
> >  extern CHAR16  *mBmLoadOptionName[];
> >+//
> >+// Maximum number of reconnect retry to repair controller; it is to limit the
> >+// number of recursive call of BmRepairAllControllers.
> >+//
> >+#define MAX_RECONNECT_REPAIR                        10
> >+
> >  /**
> >    Visitor function to be called by BmForEachVariable for each variable
> >    in variable storage.
> >@@ -145,10 +151,13 @@ typedef struct {
> >  /**
> >    Repair all the controllers according to the Driver Health status queried.
> >+
> >+  @param ReconnectRepairCount     To record the number of recursive call of
> >+                                  this function itself.
> >  **/
> >  VOID
> >  BmRepairAllControllers (
> >-  VOID
> >+  UINTN       ReconnectRepairCount
> >    );
> >  #define BM_HOTKEY_SIGNATURE SIGNATURE_32 ('b', 'm', 'h', 'k')
> >@@ -328,14 +337,6 @@ BmDelPartMatchInstance (
> >    );
> >  /**
> >-  Repair all the controllers according to the Driver Health status queried.
> >-**/
> >-VOID
> >-BmRepairAllControllers (
> >-  VOID
> >-  );
> >-
> >-/**
> >    Print the device path info.
> >    @param DevicePath           The device path need to print.
> >diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c b/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
> >index ce19ae400660..b842d5824aed 100644
> >--- a/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
> >+++ b/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
> >@@ -1767,7 +1767,7 @@ EfiBootManagerBoot (
> >      //
> >      // 4. Repair system through DriverHealth protocol
> >      //
> >-    BmRepairAllControllers ();
> >+    BmRepairAllControllers (0);
> >    }
> >    PERF_START_EX (gImageHandle, "BdsAttempt", NULL, 0, (UINT32) OptionNumber);
> >diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> >index ddcee8b0676f..db2f859ae73d 100644
> >--- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> >+++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
> >@@ -423,10 +423,13 @@ EfiBootManagerFreeDriverHealthInfo (
> >  /**
> >    Repair all the controllers according to the Driver Health status queried.
> >+
> >+  @param ReconnectRepairCount     To record the number of recursive call of
> >+                                  this function itself.
> >  **/
> >  VOID
> >  BmRepairAllControllers (
> >-  VOID
> >+  UINTN       ReconnectRepairCount
> >    )
> >  {
> >    EFI_STATUS                          Status;
> >@@ -548,10 +551,6 @@ BmRepairAllControllers (
> >    EfiBootManagerFreeDriverHealthInfo (DriverHealthInfo, Count);
> >-  if (ReconnectRequired) {
> >-    BmRepairAllControllers ();
> >-  }
> >-
> >    DEBUG_CODE (
> >      CHAR16 *ControllerName;
> >@@ -576,6 +575,15 @@ BmRepairAllControllers (
> >      EfiBootManagerFreeDriverHealthInfo (DriverHealthInfo, Count);
> >      );
> >+  if (ReconnectRequired) {
> >+    if (ReconnectRepairCount < MAX_RECONNECT_REPAIR) {
> >+      BmRepairAllControllers (ReconnectRepairCount + 1);
> >+    } else {
> >+      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
> >+        __FUNCTION__, __LINE__, ReconnectRepairCount));
> >+    }
> >+  }
> >+
> >    if (RebootRequired) {
> >      DEBUG ((EFI_D_INFO, "[BDS] One of the Driver Health instances requires rebooting.\n"));
> >      gRT->ResetSystem (EfiResetWarm, EFI_SUCCESS, 0, NULL);
> >
> Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>
> 
> -- 
> Thanks,
> Ray


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth
  2018-03-07  1:54   ` Guo Heyi
@ 2018-03-08  2:53     ` Ni, Ruiyu
  0 siblings, 0 replies; 14+ messages in thread
From: Ni, Ruiyu @ 2018-03-08  2:53 UTC (permalink / raw)
  To: Guo Heyi; +Cc: edk2-devel, Star Zeng, Eric Dong, Laszlo Ersek

On 3/7/2018 9:54 AM, Guo Heyi wrote:
> Hi Ray,
> 
> Sorry to disturb, but I didn't find the patch committed. Could you help to do
> that?
> 
> Thanks,
> Heyi
> 
> 
> On Thu, Mar 01, 2018 at 12:46:32PM +0800, Ni, Ruiyu wrote:
>> On 3/1/2018 10:39 AM, Heyi Guo wrote:
>>> Function BmRepairAllControllers may recursively call itself if some
>>> driver health protocol returns EfiDriverHealthStatusReconnectRequired.
>>> However, driver health protocol of some buggy third party driver may
>>> always return such status even after one and another reconnect. The
>>> endless iteration will cause stack overflow and then system exception,
>>> and it may be not easy to find that the exception is actually caused
>>> by stack overflow.
>>>
>>> So we limit the number of reconnect retry to 10 to improve code
>>> robustness, and DEBUG_CODE is moved ahead before recursive repair to
>>> track the repair result.
>>>
>>> We also remove a duplicated declaration of BmRepairAllControllers() in
>>> InternalBm.h in this patch, for it is only a trivial change.
>>>
>>> Contributed-under: TianoCore Contribution Agreement 1.1
>>> Signed-off-by: Heyi Guo <heyi.guo@linaro.org>
>>> Cc: Star Zeng <star.zeng@intel.com>
>>> Cc: Eric Dong <eric.dong@intel.com>
>>> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
>>> Cc: Laszlo Ersek <lersek@redhat.com>
>>> ---
>>>
>>> Notes:
>>>      v2
>>>      - Use argument instead of global variable to record the recursive
>>>        count [Ray]
>>>      - Move DEBUG_CODE before recursively calling BmRepairAllControllers()
>>>        to track the change of each reconnect [Ray]
>>>      - Remove a duplicated declaration of BmRepairAllControllers() in
>>>        InternalBm.h.
>>>
>>>   MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h     | 19 ++++++++++---------
>>>   MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c         |  2 +-
>>>   MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c | 18 +++++++++++++-----
>>>   3 files changed, 24 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
>>> index 25a1d522fe84..21ecd8584d24 100644
>>> --- a/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
>>> +++ b/MdeModulePkg/Library/UefiBootManagerLib/InternalBm.h
>>> @@ -108,6 +108,12 @@ CHAR16 *
>>>   #define BM_OPTION_NAME_LEN                          sizeof ("PlatformRecovery####")
>>>   extern CHAR16  *mBmLoadOptionName[];
>>> +//
>>> +// Maximum number of reconnect retry to repair controller; it is to limit the
>>> +// number of recursive call of BmRepairAllControllers.
>>> +//
>>> +#define MAX_RECONNECT_REPAIR                        10
>>> +
>>>   /**
>>>     Visitor function to be called by BmForEachVariable for each variable
>>>     in variable storage.
>>> @@ -145,10 +151,13 @@ typedef struct {
>>>   /**
>>>     Repair all the controllers according to the Driver Health status queried.
>>> +
>>> +  @param ReconnectRepairCount     To record the number of recursive call of
>>> +                                  this function itself.
>>>   **/
>>>   VOID
>>>   BmRepairAllControllers (
>>> -  VOID
>>> +  UINTN       ReconnectRepairCount
>>>     );
>>>   #define BM_HOTKEY_SIGNATURE SIGNATURE_32 ('b', 'm', 'h', 'k')
>>> @@ -328,14 +337,6 @@ BmDelPartMatchInstance (
>>>     );
>>>   /**
>>> -  Repair all the controllers according to the Driver Health status queried.
>>> -**/
>>> -VOID
>>> -BmRepairAllControllers (
>>> -  VOID
>>> -  );
>>> -
>>> -/**
>>>     Print the device path info.
>>>     @param DevicePath           The device path need to print.
>>> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c b/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
>>> index ce19ae400660..b842d5824aed 100644
>>> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
>>> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmBoot.c
>>> @@ -1767,7 +1767,7 @@ EfiBootManagerBoot (
>>>       //
>>>       // 4. Repair system through DriverHealth protocol
>>>       //
>>> -    BmRepairAllControllers ();
>>> +    BmRepairAllControllers (0);
>>>     }
>>>     PERF_START_EX (gImageHandle, "BdsAttempt", NULL, 0, (UINT32) OptionNumber);
>>> diff --git a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
>>> index ddcee8b0676f..db2f859ae73d 100644
>>> --- a/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
>>> +++ b/MdeModulePkg/Library/UefiBootManagerLib/BmDriverHealth.c
>>> @@ -423,10 +423,13 @@ EfiBootManagerFreeDriverHealthInfo (
>>>   /**
>>>     Repair all the controllers according to the Driver Health status queried.
>>> +
>>> +  @param ReconnectRepairCount     To record the number of recursive call of
>>> +                                  this function itself.
>>>   **/
>>>   VOID
>>>   BmRepairAllControllers (
>>> -  VOID
>>> +  UINTN       ReconnectRepairCount
>>>     )
>>>   {
>>>     EFI_STATUS                          Status;
>>> @@ -548,10 +551,6 @@ BmRepairAllControllers (
>>>     EfiBootManagerFreeDriverHealthInfo (DriverHealthInfo, Count);
>>> -  if (ReconnectRequired) {
>>> -    BmRepairAllControllers ();
>>> -  }
>>> -
>>>     DEBUG_CODE (
>>>       CHAR16 *ControllerName;
>>> @@ -576,6 +575,15 @@ BmRepairAllControllers (
>>>       EfiBootManagerFreeDriverHealthInfo (DriverHealthInfo, Count);
>>>       );
>>> +  if (ReconnectRequired) {
>>> +    if (ReconnectRepairCount < MAX_RECONNECT_REPAIR) {
>>> +      BmRepairAllControllers (ReconnectRepairCount + 1);
>>> +    } else {
>>> +      DEBUG ((DEBUG_ERROR, "[%a:%d] Repair failed after %d retries.\n",
>>> +        __FUNCTION__, __LINE__, ReconnectRepairCount));
>>> +    }
>>> +  }
>>> +
>>>     if (RebootRequired) {
>>>       DEBUG ((EFI_D_INFO, "[BDS] One of the Driver Health instances requires rebooting.\n"));
>>>       gRT->ResetSystem (EfiResetWarm, EFI_SUCCESS, 0, NULL);
>>>
>> Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>
>>
>> -- 
>> Thanks,
>> Ray
Done in 72208a9a90b8c6cd5011ddf174ad01e567b67454.

-- 
Thanks,
Ray


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-03-08  2:47 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-02-26  8:29 [PATCH 1/1] MdeModulePkg/UefiBootManagerLib: limit recursive call depth Heyi Guo
2018-02-26  8:56 ` Wang, Sunny (HPS SW)
2018-02-26 11:34   ` Guo Heyi
2018-02-27  2:47     ` Wang, Sunny (HPS SW)
2018-02-26 16:23 ` Laszlo Ersek
2018-02-27  0:48   ` Guo Heyi
2018-02-27  5:48     ` Ni, Ruiyu
2018-02-27 10:29       ` Laszlo Ersek
2018-02-27 10:39         ` Guo Heyi
  -- strict thread matches above, loose matches on Subject: below --
2018-03-01  2:39 Heyi Guo
2018-03-01  2:43 ` Guo Heyi
2018-03-01  4:46 ` Ni, Ruiyu
2018-03-07  1:54   ` Guo Heyi
2018-03-08  2:53     ` Ni, Ruiyu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox