From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124])
 by mx.groups.io with SMTP id smtpd.web09.10094.1621351752382796904
 for <devel@edk2.groups.io>;
 Tue, 18 May 2021 08:29:12 -0700
Authentication-Results: mx.groups.io;
 dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=bmL1q/sc;
 spf=pass (domain: redhat.com, ip: 216.205.24.124, mailfrom: lersek@redhat.com)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1621351751;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=ElJpy42YfB+Ng44b6C3jIkWvSf96/rIUXH2SUbY1Ciw=;
	b=bmL1q/scNL/mnm2APuMbX2gvbEq2plULUlsjDzwmY+EyEKWbeughu2Xu1r4RoVakqDIJJJ
	bv0cBXpJzT8nsz9LRgLUqeEjgqgiqOILFOmTynvBv57HpqvGJlt2HV1uDD5mc4fZJwapUL
	bBTq0XLwJtErC+5x9i9jBCkG7JQ53Bg=
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com
 [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-465-O5EqccYTPz6P4jcsMK4q6g-1; Tue, 18 May 2021 11:29:09 -0400
X-MC-Unique: O5EqccYTPz6P4jcsMK4q6g-1
Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 2FCF81922960;
	Tue, 18 May 2021 15:29:07 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com (ovpn-112-30.ams2.redhat.com [10.36.112.30])
	by smtp.corp.redhat.com (Postfix) with ESMTPS id 61A2018506;
	Tue, 18 May 2021 15:29:05 +0000 (UTC)
Subject: Re: [edk2-devel] [PATCH] UefiCpuPkg/MpInitLib: Allocate a separate SEV-ES AP reset stack area
To: =?UTF-8?Q?Marvin_H=c3=a4user?= <mhaeuser@posteo.de>,
 devel@edk2.groups.io, thomas.lendacky@amd.com
Cc: Brijesh Singh <brijesh.singh@amd.com>, Eric Dong <eric.dong@intel.com>,
 Ray Ni <ray.ni@intel.com>, Rahul Kumar <rahul1.kumar@intel.com>
References: <1162d1f4fe09048aaafba6f6ea3046bebd34fc2b.1620766224.git.thomas.lendacky@amd.com>
 <a9110fb7-473e-dada-b0c1-0b5f75504395@posteo.de>
 <bd4d5bfd-b9c4-79c4-21ff-4738ec2741a0@amd.com>
 <b4dc7e35-87d9-b598-4dec-5fcaf6a345d1@posteo.de>
 <414ed071-6bed-4e41-400d-7d958b27ae3b@redhat.com>
 <19890a20-d5fb-f793-660f-f72ee610bb68@posteo.de>
From: "Laszlo Ersek" <lersek@redhat.com>
Message-ID: <3ad9c5f7-d447-78cd-0682-eee07b7aa93d@redhat.com>
Date: Tue, 18 May 2021 17:29:03 +0200
MIME-Version: 1.0
In-Reply-To: <19890a20-d5fb-f793-660f-f72ee610bb68@posteo.de>
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14
Authentication-Results: relay.mimecast.com;
	auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=lersek@redhat.com
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit

On 05/17/21 00:15, Marvin Häuser wrote:
> Am 16.05.2021 um 03:17 schrieb Laszlo Ersek:
>> On 05/14/21 17:44, Marvin Häuser wrote:
>>> On 14.05.21 17:23, Lendacky, Thomas wrote:
>>>> On 5/14/21 10:04 AM, Marvin Häuser wrote:
>>
>>>>>> +      // Check to be sure that the "allocate below" behavior hasn't
>>>>>> changed.
>>>>>> +      // This will also catch a failed allocation, as "-1" is
>>>>>> returned on
>>>>>> +      // failure.
>>>>>> +      //
>>>>>> +      if (CpuMpData->SevEsAPResetStackStart >=
>>>>>> CpuMpData->WakeupBuffer) {
>>>>>> +        DEBUG ((DEBUG_ERROR,
>>>>>> +          "SEV-ES AP reset stack is not below wakeup buffer\n"));
>>>>>> +
>>>>>> +        ASSERT (FALSE);
>>>>> Should the ASSERT not only catch the broken "allocate below"
>>>>> behaviour,
>>>>> i.e. not trigger on failed allocation?
>>>> I think it's best to trigger on a failed allocation as well rather than
>>>> continuing and allowing a page fault or some other problem to occur.
>>>
>>> Well, it should handle the error in a safe way, i.e. the deadloop below.
>>> To not ASSERT on plausible conditions is a common design guideline in
>>> most low-level projects including Linux kernel.
>>>
>>> Best regards,
>>> Marvin
>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>>>> +        CpuDeadLoop ();
>>
>> "DEBUG + ASSERT(FALSE) + CpuDeadLoop()" is a pattern in edk2.
>>
>> In RELEASE builds, it will lead to a CpuDeadLoop(). That's the main goal
>> -- don't continue execution if the condition controlling the whole block
>> fired.
>>
>> In DEBUG and NOOPT builds, the pattern will lead to a debug message
>> (usually at the "error" level), followed by an assertion failure. The
>> error message of the assertion failure is irrelevant ("FALSE"). The
>> point of adding ASSERT ahead of CpuDeadLoop() is that the way ASSERT
>> hangs execution is customizable, via "PcdDebugPropertyMask", unlike
>> CpuDeadLoop(). In many cases, ASSERT() uses CpuDeadLoop() itself, so the
>> effect is the same -- the explicit CpuDeadLoop is not reached. In other
>> configs, ASSERT() can raise a debug exception (CpuBreakpoint()).
> 
> I absolutely do not *expect* Tom to change this, it was just a slight
> remark (as many places have this anyway). I'll still try to explain why
> I made that remark, but for whom it is of no interest, I do not expect
> it to be read. I'm fine with the patch as-is myself. Thank you a lot, Tom!
> 
> 
> 
> I know it, unfortunately, is a pattern in EDK II - taking this pattern
> too far is what caused the 8-revision patch regarding untrusted inputs
> we submitted previously. :)
> 
> There are many concerns about unconventional ASSERTs, though I must
> admit none but one (and that one barely) really apply here, which is why
> I have trouble explaining why I believe it should be changed. Here are
> some reasons outside the context of this patch:
> 
> 1) Consistency between DEBUG and RELEASE builds: I think one can justify
> to have a breakpoint on a condition that may realistically occur. But a
> deadloop can give a wrong impression about how production code works.
> E.g. it also is a common pattern in EDK II to ASSERT on memory
> allocation failure but *not* have a proper check after, so DEBUG builds
> will nicely error or deadloop, while RELEASE goes ahead and causes a CPU
> exception or memory corruption depending on the context. Thus,
> real-world error handling cannot really be tested. This does not apply
> because there *is* a RELEASE deadloop.
> 
> 2) Static analysis: Some static analysers use ASSERT information for
> their own analysis, and try to give hints about unsafe or unreachable
> code based on own annotations. This kind of applies, but only when
> substituting EDK II ASSERT with properly recognisable ASSERTs (e.g.
> __builtin_unreachable).
> 
> 2) Dynamic analysis: ASSERTs can be useful when fuzzing for example.
> Enabled Sanitizers will only catch unsafe behaviour, but maybe you have
> some extra code in place to sanity-check the results further. An ASSERT
> yields an error dump (usually followed by the worker dying). However, as
> allocation failures are perfectly expected, this can cause a dramatic
> about of False Positives and testing interruption. This does not apply
> because deadloop'd code cannot really be fuzz-tested anyway.
> 
> ASSERTs really are designed as unbreakable conditions, i.e. 1)
> preconditions 2) invariants 3) postconditions. No allocator in early
> kernel-space or lower can really guarantee allocation success, thus it
> cannot be a postcondition of any such function. And while it might make
> debugging look a bit easier (but you will see from the backtrace anyway
> where you halted), it messes with all tools that assume proper usage.
> 
> Also, I just realised, you can of course see it from the address value
> when debugging, but you cannot see it from the ASSERT or DEBUG message
> *which* of the two logical error conditions failed (i.e. broken
> allocator or OOM). Changing the ASSERT would fix that. :)

I'm OK if the ASSERT() is dropped; from my perspective it's really just
a small convenience / developer/debugging aid. We still have the debug
message and the explicit deadloop.

Thanks
Laszlo