From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by mx.groups.io with SMTP id smtpd.web11.625.1620667584883947932 for ; Mon, 10 May 2021 10:26:25 -0700 Authentication-Results: mx.groups.io; dkim=missing; spf=pass (domain: arm.com, ip: 217.140.110.172, mailfrom: jeremy.linton@arm.com) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 854BD1688; Mon, 10 May 2021 10:26:24 -0700 (PDT) Received: from [192.168.122.166] (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 411CA3F73B; Mon, 10 May 2021 10:26:24 -0700 (PDT) Subject: Re: [edk2-devel] [PATCH 1/2] Silicon/Broadcom/BcmGenetDxe: Delay for linkup in transmit To: Ard Biesheuvel , edk2-devel-groups-io Cc: Ard Biesheuvel , Leif Lindholm , Peter Batard , Samer El-Haj-Mahmoud , Andrei Warkentin References: <20210415192207.3257790-1-jeremy.linton@arm.com> <20210415192207.3257790-2-jeremy.linton@arm.com> From: "Jeremy Linton" Message-ID: <431d02c5-1c25-cb0f-3c94-e677c1d29536@arm.com> Date: Mon, 10 May 2021 12:26:23 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Hi, On 5/10/21 11:56 AM, Ard Biesheuvel wrote: > On Thu, 15 Apr 2021 at 21:22, Jeremy Linton wrote: >> >> Under normal circumstances GenetSimpleNetworkTransmit won't be >> called unless the rest of the network stack detects the link is >> up. So, during normal operation when the adapter is initialized >> the link naturally transitions to link up, and then is ready for >> activity later in the boot sequence. If that hasn't happened by >> the time PXE runs then it will itself wait for the link. >> >> OTOH, the normal distro PXE sequence involves PXE loading shim >> which in turn loads grub, which tries to read machine specific >> configs, modules, and grub.cfg in order to prepare the boot menu. >> Then, once a grub selection is picked, it might try to load the >> kernel+initrd. >> >> In this sequence the network stack is shutdown and restarted >> multiple times. Grub though, starts up, notices its been network >> booted, reads saved network parameters and immediately tries to >> transmit data assuming the link is still up. >> >> When that happens grub will print "couldn't send network packet" >> and if that lasts long enough it fails to load grub.cfg and the >> user gets dropped to the grub prompt because no one in the path >> bothers to assure the link state has transitioned back up. >> > > This is an excellent problem description. Could we dedicate a couple > of paragraphs to the solution as well? Especially given that 'wait for > random # of seconds' seems to be the taken approach here, which seems > a bit sloppy to me, given that we should be able to detect whether the > link is up, no? Isn't that what GenericPhyUpdateConfig is doing? Checking for linkup, and updating the phy state? So once the link goes up, it return EFI_SUCCESS and the loop terminates. I was actually a bit concerned that the 10s wasn't enough since AFAIK a large part of the delay is the remote side. > > > >> For reference: https://github.com/pftf/RPi4/issues/113 >> >> Signed-off-by: Jeremy Linton >> --- >> .../Drivers/Net/BcmGenetDxe/SimpleNetwork.c | 24 ++++++++++++++++++++-- >> 1 file changed, 22 insertions(+), 2 deletions(-) >> >> diff --git a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c >> index 1bda18f157..29c76d8495 100644 >> --- a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c >> +++ b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c >> @@ -13,6 +13,7 @@ >> #include >> #include >> #include >> +#include >> #include >> >> #include "BcmGenetDxe.h" >> @@ -590,9 +591,28 @@ GenetSimpleNetworkTransmit ( >> >> if (!Genet->SnpMode.MediaPresent) { >> // >> - // Don't bother transmitting if there's no link. >> + // We should only really get here if the link was up >> + // and is now down due to a stop/shutdown sequence, and >> + // the app (grub) doesn't bother to check link state >> + // because it was up a moment before. >> + // Lets wait a bit for the link to resume, rather than >> + // failing to send. In the case of grub it works either way >> + // but we can't be sure that is universally true, and >> + // hanging for a couple seconds is nicer than a screen of >> + // grub send failure messages. >> // >> - return EFI_NOT_READY; >> + int retries = 1000; >> + DEBUG ((DEBUG_INFO, "%a: Waiting 10s for link\n", __FUNCTION__)); >> + do { >> + gBS->Stall (10000); >> + Status = GenericPhyUpdateConfig (&Genet->Phy); >> + } while (EFI_ERROR (Status) && retries--); >> + if (EFI_ERROR (Status)) { >> + DEBUG ((DEBUG_ERROR, "%a: no link\n", __FUNCTION__)); >> + return EFI_NOT_READY; >> + } else { >> + Genet->SnpMode.MediaPresent = TRUE; >> + } >> } >> >> if (HeaderSize != 0) { >> -- >> 2.13.7 >> >> >> >> >> >>