On Jul 31, 2019, at 9:34 AM, Brian J. Johnson <brian.johnson@hpe.com> wrote:

On 7/31/19 7:43 AM, Laszlo Ersek wrote:
(adding Mike)
On 07/31/19 09:35, Eric Dong wrote:
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1984

The current debug messages impose significant restrictions on platforms
that use this driver.

For the PEI and DXE phases, the platform must link the base DebugLib (one
that uses no PEI/DXE services, even in its dependent libraries).

This patch disables these debug messages by default; enable them only
when you need to debug the related code.

Signed-off-by: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>

Eric Dong (2):
  UefiCpuPkg/RegisterCpuFeaturesLib: Default avoid print.
  UefiCpuPkg/PiSmmCpuDxeSmm: Default avoid print.

 .../Library/RegisterCpuFeaturesLib/CpuFeaturesInitialize.c    | 4 +++-
 UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c                             | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

The basic problem seems to be that APs should not use "thick" services
that might underlie the DebugLib instance that is picked by the
platform. That requirement appears sane to me.
I think I disagree with the proposed mitigation though. Reasons:
(a) The mitigation is duplicated to independent modules.
(b) It is not possible to change the debug mask without modifying C
language source code.
(c) Passing a zero log mask to DEBUG() on the APs does not guarantee
thread safety:
- The DEBUG() macro calls DebugPrintEnabled() regardless of the log mask
passed to DEBUG().
- The DEBUG() macro may or may not call DebugPrintLevelEnabled(),
dependent on architecture & toolchain.
- Both DebugPrintEnabled() and DebugPrintLevelEnabled() are DebugLib
interfaces. The library instance may implement them unsafely for APs,
and a zero log mask at the DEBUG call site could not prevent that.
- Finally, DebugPrint() itself could invoke thread-unsafe logic, before
consulting the log mask.
I would propose the following, instead:
(i) Introduce BIT6 for PcdDebugPropertyMask in "MdePkg.dec". The default
value should be zero. The bit stands for "DEBUG is safe to call on APs".
(ii) Add a macro called AP_DEBUG to <DebugLib.h>.
This macro should work the same as DEBUG, except it should do nothing if
BIT6 in PcdDebugPropertyMask is clear.
Fetching PcdDebugPropertyMask inside AP_DEBUG() is safe, because:
- the PCD can only be fixed-at-build or patchable-in-module (therefore
it is safe to read on APs -- no PCD PPI or PCD Protocol is needed);
- PcdDebugPropertyMask is a preexistent PCD that *all* existent DebugLib
instances are expected to consume -- per the API specifications in
<DebugLib.h> --, therefore no new PCD dependency would be introduced to
DebugLib instances.
(iii) Modules that call DEBUG on APs should replace those calls with
AP_DEBUG. Code that currently calls DEBUG while running on either BSP or
APs should discriminate those cases from each other, and use AP_DEBUG
explicitly, when it runs on APs.
As a further refinement, a macro called MP_DEBUG could be introduced
too, with a new initial parameter called "Bsp". If the Bsp parameter is
TRUE, then MP_DEBUG is identical to DEBUG. Otherwise, MP_DEBUG is
identical to AP_DEBUG. This way, DEBUG() calls such as described above
wouldn't have to be split into DEBUG / AP_DEBUG calls; they could be
changed into MP_DEBUG calls (with an extra parameter in the front).
(iv) platforms can set BIT6 in PcdDebugPropertyMask in DSC files. This
need not be a full platform-level setting: the PCD can be overridden in
module scope, just like the DebugLib resolution can be module-scoped.
As an end result, AP_DEBUG messages will disappear by default (safely),
and platforms will have to do extra work only if they want AP_DEBUG
messages to appear. Otherwise the change is transparent to platforms.
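
To make steps (ii) and (iii) concrete, here is a minimal standalone sketch of how AP_DEBUG and MP_DEBUG might be layered on top of DEBUG. It is not edk2 code: DEBUG_PROPERTY_AP_DEBUG_SAFE, DEBUG_PROPERTY_MASK, MockDebugPrint and mPrintCount are hypothetical stand-ins (the mask constant plays the role of a fixed-at-build PcdDebugPropertyMask), and DEBUG is reduced to a direct call. The point it illustrates is that the BIT6 test happens before any DebugLib function is entered.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical name for the proposed BIT6: "DEBUG is safe to call on APs". */
#define DEBUG_PROPERTY_AP_DEBUG_SAFE  0x40u

/* Stand-in for a fixed-at-build PcdDebugPropertyMask.
   Arbitrary example value; what matters is that BIT6 is clear by default. */
#ifndef DEBUG_PROPERTY_MASK
#define DEBUG_PROPERTY_MASK  0x0Fu
#endif

/* Counts messages that actually reach the mock debug output. */
static unsigned int mPrintCount;

/* Mock of DebugPrint(); stands in for whatever the DebugLib instance does. */
static void MockDebugPrint(unsigned int ErrorLevel, const char *Format, ...)
{
  va_list Args;

  (void)ErrorLevel;
  assert(Format != NULL);
  va_start(Args, Format);
  vprintf(Format, Args);
  va_end(Args);
  mPrintCount++;
}

/* Simplified DEBUG(): keeps the double-parenthesis call convention. */
#define DEBUG(Expression)  do { MockDebugPrint Expression; } while (0)

/* AP_DEBUG(): does nothing unless BIT6 is set in the property mask.
   No DebugLib function is called when the bit is clear. */
#define AP_DEBUG(Expression)                                          \
  do {                                                                \
    if ((DEBUG_PROPERTY_MASK & DEBUG_PROPERTY_AP_DEBUG_SAFE) != 0) {  \
      DEBUG(Expression);                                              \
    }                                                                 \
  } while (0)

/* MP_DEBUG(): behaves as DEBUG on the BSP and as AP_DEBUG otherwise. */
#define MP_DEBUG(Bsp, Expression)  \
  do {                             \
    if (Bsp) {                     \
      DEBUG(Expression);           \
    } else {                       \
      AP_DEBUG(Expression);        \
    }                              \
  } while (0)
```

A call site that runs on both BSP and APs would then read MP_DEBUG (IsBsp, (ErrorLevel, "...\n")), where IsBsp is whatever the module already knows about the executing processor. Building with -DDEBUG_PROPERTY_MASK=0x4F mimics a platform that sets BIT6 in its DSC file.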
And, I think that AP_DEBUG belongs in MdePkg (and not UefiCpuPkg)
because both DebugLib and EFI_MP_SERVICES_PROTOCOL are declared in
MdePkg. While UefiCpuPkg provides the multiprocessing implementation for
IA32 and X64, the problem is architecture-independent. Furthermore, the
problem is a long-standing and recurrent one -- please refer to commit
81f560498bf1, for example --, so it makes sense to solve it once and for
all.
Thanks
Laszlo

Laszlo,

Defining a PCD bit for DEBUG() AP safety is an excellent suggestion.  As you said, this is a long-standing, recurrent problem which keeps biting real platforms, and it would be good to have a solid, platform-independent solution.

I do wonder if there would be a clean way to let a DebugLib instance itself declare that AP_DEBUG() is safe.  That way a platform would only need to override the DebugLib instance in the DSC file, rather than both the instance and the PCD.  (I know, I'm nitpicking.)  A library can't override PCDs in its calling modules, of course.  I suppose the AP_DEBUG() macro could call a new DebugLib entry point to test for AP safety before doing anything else, say DebugPrintOnApIsSafe().  Or it could even be a global CONST BOOLEAN defined by the library.  But that would require all DebugLib instances to change, which is something you were trying to avoid.

However, it's not always practical to track down all uses of DEBUG(). An AP can easily call a library routine which uses DEBUG() rather than AP_DEBUG(), buried under several layers of transitive library dependencies.  In other words, it's not always practical to determine ahead of time if a given DEBUG() call may be done on an AP.  I know that AP code runs in a very restricted environment and that people who use MpServices are supposed to understand the repercussions, but it gets very difficult when libraries are involved.  :(

So would a better solution be to modify the common unsafe DebugLib instances to have DebugPrintEnabled() return FALSE on APs?  That would probably require a new BaseLib interface to determine if the caller is running on the BSP or an AP.  (For IA32/X64 this isn't too hard -- it just needs to check a bit in the local APIC.  I have no idea about other architectures.)  That wouldn't solve the problem everywhere -- anyone using a custom DebugLib would have to update it themselves.  But it would solve it solidly in the majority of cases.
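
As a rough sketch of that idea for IA32/X64 only: the BSP flag is bit 8 of the IA32_APIC_BASE MSR (index 0x1B). The code below is standalone C, not edk2 code; READ_MSR64, the two mock readers, and DebugPrintEnabledOnThisCpu() are hypothetical names, and the MSR read is injected as a function pointer so the logic can be exercised outside firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IA32_APIC_BASE_MSR_INDEX  0x1Bu
#define IA32_APIC_BASE_BSP_FLAG   (1ull << 8)   /* set only on the BSP */

/* Hypothetical MSR reader type; real firmware would use its own MSR access. */
typedef uint64_t (*READ_MSR64)(uint32_t Index);

/* Mock readers: default APIC base 0xFEE00000 with global enable (bit 11),
   with and without the BSP flag (bit 8). */
static uint64_t MockReadMsrOnBsp(uint32_t Index)
{
  (void)Index;
  return 0xFEE00900ull;
}

static uint64_t MockReadMsrOnAp(uint32_t Index)
{
  (void)Index;
  return 0xFEE00800ull;
}

/* Returns true if the given IA32_APIC_BASE value belongs to the BSP. */
static bool IsBspFromApicBase(uint64_t ApicBase)
{
  return (ApicBase & IA32_APIC_BASE_BSP_FLAG) != 0;
}

/* Sketch of a DebugPrintEnabled() variant that reports FALSE on APs.
   PrintEnabledByMask stands in for the usual property-mask check. */
static bool DebugPrintEnabledOnThisCpu(READ_MSR64 ReadMsr, bool PrintEnabledByMask)
{
  assert(ReadMsr != NULL);
  if (!PrintEnabledByMask) {
    return false;
  }
  return IsBspFromApicBase(ReadMsr(IA32_APIC_BASE_MSR_INDEX));
}
```

This only covers the DebugPrintEnabled() side; as noted earlier in the thread, DebugPrint() itself could still run thread-unsafe logic before any such check, so an instance would need the same guard there.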

Thoughts?


For DXE you can use the MpServices protocol to tell if you are on the BSP or not. 

https://github.com/tianocore/edk2/blob/master/MdePkg/Include/Protocol/MpService.h

EFI_MP_SERVICES_PROTOCOL.WhoAmI() gets you your ProcessorNumber. EFI_MP_SERVICES_PROTOCOL.GetProcessorInfo() returns the ProcessorInfoBuffer that lets you know if you are the BSP.  You would want to grab the MpServices protocol pointer early in the driver, prior to doing any work on the AP.

typedef struct {
  ///
  /// The unique processor ID determined by system hardware. For IA32 and X64,
  /// the processor ID is the same as the Local APIC ID. Only the lower 8 bits
  /// are used, and higher bits are reserved. For IPF, the lower 16 bits contains
  /// id/eid, and higher bits are reserved.
  ///
  UINT64                       ProcessorId;
  ///
  /// Flags indicating if the processor is BSP or AP, if the processor is enabled
  /// or disabled, and if the processor is healthy. Bits 3..31 are reserved and
  /// must be 0.
  ///
  /// <pre>
  /// BSP  ENABLED  HEALTH  Description
  /// ===  =======  ======  ===================================================
  ///  0      0       0     Unhealthy Disabled AP.
  ///  0      0       1     Healthy Disabled AP.
  ///  0      1       0     Unhealthy Enabled AP.
  ///  0      1       1     Healthy Enabled AP.
  ///  1      0       0     Invalid. The BSP can never be in the disabled state.
  ///  1      0       1     Invalid. The BSP can never be in the disabled state.
  ///  1      1       0     Unhealthy Enabled BSP.
  ///  1      1       1     Healthy Enabled BSP.
  /// </pre>
  ///
  UINT32                       StatusFlag;
  ///
  /// The physical location of the processor, including the physical package number
  /// that identifies the cartridge, the physical core number within package, and
  /// logical thread number within core.
  ///
  EFI_CPU_PHYSICAL_LOCATION    Location;
} EFI_PROCESSOR_INFORMATION;
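
For illustration, decoding StatusFlag could look like the standalone sketch below. The three bit values match the PROCESSOR_*_BIT definitions in MpService.h; PROCESSOR_INFO_SKETCH, IsBsp() and IsValidStatus() are hypothetical names, and the struct is a trimmed stand-in for EFI_PROCESSOR_INFORMATION (Location omitted) so the snippet builds outside edk2. In a real driver the structure would be filled in by GetProcessorInfo() for the ProcessorNumber returned by WhoAmI().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as defined in MpService.h. */
#define PROCESSOR_AS_BSP_BIT         0x00000001u
#define PROCESSOR_ENABLED_BIT        0x00000002u
#define PROCESSOR_HEALTH_STATUS_BIT  0x00000004u

/* Trimmed stand-in for EFI_PROCESSOR_INFORMATION (Location omitted). */
typedef struct {
  uint64_t ProcessorId;
  uint32_t StatusFlag;
} PROCESSOR_INFO_SKETCH;

/* Returns true if the processor described by Info is the BSP. */
static bool IsBsp(const PROCESSOR_INFO_SKETCH *Info)
{
  assert(Info != NULL);
  return (Info->StatusFlag & PROCESSOR_AS_BSP_BIT) != 0;
}

/* Returns false for the two combinations the table marks as invalid:
   BSP bit set while the ENABLED bit is clear. */
static bool IsValidStatus(const PROCESSOR_INFO_SKETCH *Info)
{
  assert(Info != NULL);
  if ((Info->StatusFlag & PROCESSOR_AS_BSP_BIT) != 0 &&
      (Info->StatusFlag & PROCESSOR_ENABLED_BIT) == 0) {
    return false;
  }
  return true;
}
```

Caching the result of IsBsp() per processor before dispatching work to the APs avoids calling back into the protocol from AP context.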

DebugPrintEnabled() and DebugAssertEnabled() would be needed at a minimum, as I don't think you really want to be ASSERTing on the AP unless that is your intent. I think a lot of platforms end up having these functions be based on FixedAtBuild PCDs, and the function calls get optimized away. Thus it seems like you would want an MP-safe version of a given library.

This same problem kind of exists for Runtime drivers too. You would not want to ASSERT or DEBUG print at runtime in a Lib that was not Runtime safe. 

Thanks,

Andrew Fish

-- 
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise