From: Laszlo Ersek
To: "Tomas Pilar (tpilar)"
Cc: "edk2-devel@lists.01.org"
Subject: Re: Network Stack Budgeting
Date: Thu, 24 Jan 2019 13:25:40 +0100
In-Reply-To: <5185c0b2-031f-ec50-b273-2665d83ef38a@solarflare.com>
References: <6029fb15-3820-0f05-f02a-577e99592bbc@solarflare.com> <32137552-e2b0-4392-17c7-dedaa1f05244@redhat.com> <09017041-d418-1186-9942-dfa70d82c4d6@solarflare.com> <67f8fb4a-5e1b-8bbc-90d4-670ff7e3bfe8@redhat.com> <5185c0b2-031f-ec50-b273-2665d83ef38a@solarflare.com>
List-Id: EDK II Development

On 01/24/19 12:37, Tomas Pilar (tpilar) wrote:
>
> Hi Laszlo,
>
>> Can you capture a call stack when Snp.Start() is invoked for the very
>> first time (which, IIUC, is a call that should not happen, in your
>> opinion)?
>>
> Unfortunately I do not have access to the platform firmware itself (I maintain an IHV network driver that's shipped in OptionROMs) and I don't believe a generic stack capture is available in EDK2 yet. However I have comprehensive debug from my driver that shows that our driver gets a DriverBinding.Start() at TPL_APPLICATION and we perform the entire probe at TPL_NOTIFY and as soon as that completes and we drop the TPL, the newly installed SNP gets a Snp.Start() call at TPL_CALLBACK - I assume that MNP or something higher had an event registered that fired as soon as the TPL dropped after the DriverBinding.Start() finished.
>
> Here is a snip of the debug log (that I assume will not be of much interest). We don't observe the actual call to DriverBinding.Start() because the debug output is initialised as one of the first things that method does.
>
> [...]
>
> The dots between [sfc] and | symbol indicate the TPL the operation is being carried out at. No dots mean TPL_APPLICATION, one dot is TPL_CALLBACK etc. You can see that the Snp.Start() on the first device is called before even the second device gets a DriverBinding.Start().

This reeks. :)

It looks like some driver in the platform sets up a protocol notify
callback for SNP, with gBS->CreateEvent() +
gBS->RegisterProtocolNotify().

Your driver's DriverBindingStart() function is called normally from BDS,
via ConnectController(). In DriverBindingStart(), you install an SNP
instance, which signals the event (makes it pending / queues the
notification function for it). Once you drop the TPL again, the
notification function is called, on the stack of gBS->RestoreTPL().
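
To make that ordering concrete, here is a toy model in plain C. It is a
sketch under stated assumptions, not EDK2 code: every function below is
a hypothetical stand-in for the boot service named in its comment, the
model tracks a single event, and it only covers the case seen in the
log, where the event is signalled while the TPL is already at or above
the event's notify TPL.

```c
/* Toy model only: this is NOT EDK2 code. Every function here is a
 * hypothetical stand-in that mimics the ordering, nothing more. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Numeric TPL values as defined by the UEFI specification. */
enum { TPL_APPLICATION = 4, TPL_CALLBACK = 8, TPL_NOTIFY = 16 };

typedef void (*notify_fn)(void);

static int       current_tpl = TPL_APPLICATION;
static int       event_tpl;      /* notify TPL of the one modelled event */
static notify_fn event_notify;   /* its notification function            */
static int       event_pending;  /* set once the event is signalled      */
static char      trace[256];     /* ordered record of the steps          */

static void log_step(const char *s)
{
  assert(strlen(trace) + strlen(s) + 2 <= sizeof trace);
  strcat(trace, s);
  strcat(trace, ";");
}

/* Stand-in for gBS->CreateEvent() + gBS->RegisterProtocolNotify(). */
static void register_protocol_notify(int notify_tpl, notify_fn fn)
{
  event_tpl = notify_tpl;
  event_notify = fn;
  event_pending = 0;
}

/* Stand-in for gBS->RaiseTPL(): returns the previous level. */
static int raise_tpl(int new_tpl)
{
  int old_tpl = current_tpl;
  assert(new_tpl >= old_tpl);
  current_tpl = new_tpl;
  return old_tpl;
}

/* Stand-in for gBS->RestoreTPL(): a pending notification whose TPL is
 * above the level being restored runs here, before the caller resumes. */
static void restore_tpl(int old_tpl)
{
  if (event_pending && event_tpl > old_tpl) {
    event_pending = 0;
    current_tpl = event_tpl;
    event_notify();
  }
  current_tpl = old_tpl;
}

/* Hypothetical platform callback that starts any new SNP right away. */
static void platform_notify(void)
{
  log_step(current_tpl == TPL_CALLBACK ? "Snp.Start() at TPL_CALLBACK"
                                       : "Snp.Start() at unexpected TPL");
}

/* Stand-in for the IHV driver's DriverBindingStart(). */
static void driver_binding_start(void)
{
  int old_tpl = raise_tpl(TPL_NOTIFY);
  log_step("probe begins");
  if (event_notify != NULL) {   /* installing SNP signals the event */
    event_pending = 1;
  }
  log_step("SNP installed");
  restore_tpl(old_tpl);         /* the queued notification runs in here */
  log_step("DriverBindingStart() returns");
}

/* Runs the scenario once and returns the recorded order of steps. */
const char *run_scenario(void)
{
  trace[0] = '\0';
  current_tpl = TPL_APPLICATION;
  register_protocol_notify(TPL_CALLBACK, platform_notify);
  driver_binding_start();
  return trace;
}
```

Calling run_scenario() from a small main() and printing the result
gives "probe begins;SNP installed;Snp.Start() at
TPL_CALLBACK;DriverBindingStart() returns;". In this model the
Snp.Start() call lands inside restore_tpl(), before
DriverBindingStart() returns, which matches the order in your log.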
The event's notification function probably uses the "Registration"
feature of gBS->RegisterProtocolNotify(), together with
gBS->LocateProtocol(), to process only those SNP instances that it
hasn't seen yet. In other words, it catches exactly the SNP instance
that your driver just produced, and then it calls the Start() member of
it.

This looks wrong to me; SNP is supposed to be consumed in accordance
with the UEFI Driver Model. An agent that behaved as described above
would most likely be a platform (DXE) driver lying in wait for network
connectivity (SNP), for some reason.

(The Driver Writer’s Guide for UEFI 2.3.1, Version 1.01, 03/08/2012,
explicitly lists RegisterProtocolNotify() in chapter 5.3 "Services that
UEFI drivers should not use".)

Something's fishy with the platform firmware, IMO. Even if the rest of
the network stack -- which consists of well-behaving UEFI drivers that
follow the UEFI Driver Model -- comes to life, that one sneaky platform
DXE driver will be there, poking at your SNP directly, for example under
the feet of the MNP driver. That seems plain wrong.

I've now run

  git grep -A10 -B10 -w gEfiSimpleNetworkProtocolGuid

on edk2, and then searched the output for "RegisterProtocolNotify".
There were no hits. So, I don't think anything in edk2 behaves as
described above (thankfully!).

You mentioned seeing this on "DELL 13G platforms". I suggest opening a
support case with Dell.

Thanks
Laszlo