From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by mx.groups.io with SMTP id smtpd.web10.400.1600983450400274593 for ; Thu, 24 Sep 2020 14:37:30 -0700 Authentication-Results: mx.groups.io; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KcV0sUYu; spf=pass (domain: redhat.com, ip: 63.128.21.124, mailfrom: lersek@redhat.com) Dkim-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1600983449; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vEIi+biU8zmRKGpfJEi+LsWy9n7OVP1pLrb8wCRBlmo=; b=KcV0sUYuwivIq52rCgn9+mpymEeF31B8LbZdz0FaZq34FVZHuLQbgz/6aETRX6Nx4WY2nj Af8dC7OA3IzQx7uwhPVQTw7/sjcTtYAqgpNeSube9vJQoO7VibJ3DXp/+/eSEAlpZQZhj5 mpRDD9hz2nsJUdAX4CCGs03xlWeXgUY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-446-I5MHzfPHPaSDE0YmcMg9UA-1; Thu, 24 Sep 2020 17:37:23 -0400 X-MC-Unique: I5MHzfPHPaSDE0YmcMg9UA-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 9AC641084C96; Thu, 24 Sep 2020 21:37:21 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-112-39.ams2.redhat.com [10.36.112.39]) by smtp.corp.redhat.com (Postfix) with ESMTP id 77B2F5576B; Thu, 24 Sep 2020 21:37:18 +0000 (UTC) Subject: Re: [edk2-devel] [EXTERNAL] [PATCH v8 00/14] Add the VariablePolicy feature To: devel@edk2.groups.io, bret@corthon.com, Andrew Fish Cc: Bret Barkelew , "Yao, Jiewen" , Dandan Bi , Chao Zhang , Jian J Wang , Hao A Wu , "liming.gao" , Jordan Justen , Ard Biesheuvel , "Ni, Ray" References: <20200923060748.3795-1-bret.barkelew@microsoft.com> From: "Laszlo Ersek" Message-ID: <645bb333-c3f4-0acd-f0e8-b783149bec18@redhat.com> Date: Thu, 24 Sep 2020 23:37:17 +0200 MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=lersek@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit On 09/23/20 23:34, Bret Barkelew wrote: > Agreed. It's random that the previous config worked. Here's an idea. Assume we have two DXE drivers (D1 and D2), each of which registers an EndOfDxe notification function (each to be queued at TPL_CALLBACK or TPL_NOTIFY, doesn't matter). Assume D1 does something in its EndOfDxe handler that breaks if D2's EndOfDxe handler runs first. Let's call the offending action in D2's EndOfDxe handler "Nastiness". (Scientific term.) Further assume that D1 is an optional driver (while D2 is mandatory). D1 may or may not be included in the platform firmware. Here's what the drivers could do, for ensuring that D1's handler runs first, *if* D1 is included in the platform firmate: (1) D2 defines (standardizes) a new, well-known protocol GUID. There is no need for a protocol structure; it can be a null protocol. If the protocol is present in the protocol database, that means that Nastiness must not be performed by D2 in its EndOfDxe handler. For simplicity, let's call the protocol EDKII_NO_NASTINESS_PLEASE_PROTOCOL. (2) In its entry point function, D1 produces an instance of EDKII_NO_NASTINESS_PLEASE_PROTOCOL. The protocol can be the sole protocol on a new handle, or exist on another long-term handle (such as gImageHandle). (3) In its EndOfDxe handler, D1 uninstalls EDKII_NO_NASTINESS_PLEASE_PROTOCOL (potentially destroying the handle as well). (This is safe because the handler is running at TPL_NOTIFY at the highest, and uninstalling protocols is safe at <= TPL_NOTIFY. Memory allocation services are also explicitly safe at <= TPL_NOTIFY.) (4) D2 factors Nastiness out of its EndOfDxe handler to a new function. This new function -- solely consisting of Nastiness -- has the usual notification function prototype. Let's call it NastyFun(). (5) In its entry point function, D2 creates a new (private, not GUID-ed) event, for which NastyFun() is the notification function. The TPL is TPL_CALLBACK. Let's call the event NastyEvent. (6) In D2's EndOfDxe handler, the original, open-coded Nastiness is replaced with the following logic: - If EDKII_NO_NASTINESS_PLEASE_PROTOCOL exists in the protocol database (according to gBS->LocateProtocol(), then D2's EndOfDxe handler signals NastyEvent. (This is safe because LocateProtocol() is safe to call at up to and including TPL_NOTIFY, and gBS->SignalEvent is safe even at TPL_HIGH_LEVEL.) - Otherwise, D2's EndOfDxe calls NastyFun() with a normal function call. It's supposed to work like this: - If D1 is not in the firmware, it won't install EDKII_NO_NASTINESS_PLEASE_PROTOCOL, so D2 performs Nastiness (via a direct call to NastyFun()) on the call stack of its original EndOfDxe handler. That is, no change in behavior. - If D1's EndOfDxe handler completes first, D2's EndOfDxe handler will again not see EDKII_NO_NASTINESS_PLEASE_PROTOCOL; so goto previous point. - Multiple drivers similar to D1 may coexist; multiple EDKII_NO_NASTINESS_PLEASE_PROTOCOL instances may coexist (all NULL interfaces, but on different handles). If there is at least one, gBS->LocateProtocol() in D2's EndOfDxe handler will find the first one, and postpone Nastiness. - When D2's EndOfDxe handler signals NastyEvent, NastyFun() will be enqueued at TPL_CALLBACK. Note that, given that D2's EndOfDxe handler is running at the moment, the EndOfDxe event group has been signaled previously. This means *all* events in the EndOfDxe event group have been signaled, and all their notification functions have been enqueued too (in some unspecified order, except for the specified dispatch ordering between TPL_NOTIFY and TPL_CALLBACK). So when NastyEvent is signaled, NastyFun() is enqueued *after* all the other -- already enqueued -- EndOfDxe notification functions. That's because (a) NastyEvent uses TPL_CALLBACK, so the already pending TPL_NOTIFY notifications will be dispatched before it of course, and (b) regarding notify functions enqueued previously at the same TPL (= TPL_CALLBACK), functions in any given TPL queue are queued / dispatched in FIFO order. By the time NastyFun() runs, it could ASSERT() that EDKII_NO_NASTINESS_PLEASE_PROTOCOL does not exist. Thanks Laszlo > On Wed, Sep 23, 2020 at 2:32 PM Andrew Fish wrote: > >> >> >> On Sep 23, 2020, at 2:14 PM, Bret Barkelew wrote: >> >> As far as I can tell, this is exposing a pre-existing race condition in >> EndOfDxe. It just so HAPPENS that the old "Disable RequestToLock" >> callback executed after this TcgMor callback, whereas the new >> VariablePolicy callback executes first. >> >> I'm going to poke around for a solution, but this seems like an >> architectural problem with EndOfDxe. >> >> >> The architecture does not guarantee order for the events. So if the events >> are dependent on the order of other events that is a bug in implementation. >> These kind of bugs are hard to notice as the DXE Core implementation event >> dispatch in a fixed order, I think in the order the event was registered. >> So it looks correct until you add more events. >> >> Thanks, >> >> Andrew Fish >> >> On Wed, Sep 23, 2020 at 6:02 AM Laszlo Ersek wrote: >> >>> Hi Bret, >>> >>> On 09/23/20 08:12, Bret Barkelew via groups.io wrote: >>>> To whom it may concern, >>>> This is as done as it’s going to get. >>>> >>>> Thank you all for your help! >>> >>> I tried to merge this via , >>> but it failed. Please review the build report, and submit v9 if necessary. >>> >>> Thanks! >>> Laszlo >>> >>> >> >> >> > > > > > >