public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
@ 2017-01-18 23:27 Daniil Egranov
  2017-01-19 13:49 ` Ryan Harkin
  0 siblings, 1 reply; 11+ messages in thread
From: Daniil Egranov @ 2017-01-18 23:27 UTC (permalink / raw)
  To: edk2-devel; +Cc: leif.lindholm, ryan.harkin, Daniil Egranov

The Marvell Yukon MAC address load supported only on Juno R1 and R2.
It disabled for Juno R0 due to PCI issues on this board.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
---
 ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
index 47ff587..e9e6990 100644
--- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
+++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
@@ -378,6 +378,7 @@ OnEndOfDxe (
   EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
   EFI_HANDLE                Handle;
   EFI_STATUS                Status;
+  UINT32                    JunoRevision;
 
   //
   // PCI Root Complex initialization
@@ -393,8 +394,12 @@ OnEndOfDxe (
   Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath, FALSE);
   ASSERT_EFI_ERROR (Status);
 
-  Status = ArmJunoSetNicMacAddress ();
-  ASSERT_EFI_ERROR (Status);
+  GetJunoRevision (JunoRevision);
+
+  if (JunoRevision != JUNO_REVISION_R0) {
+    Status = ArmJunoSetNicMacAddress ();
+    ASSERT_EFI_ERROR (Status);
+  }
 }
 
 STATIC
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
  2017-01-18 23:27 [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0 Daniil Egranov
@ 2017-01-19 13:49 ` Ryan Harkin
  2017-01-19 15:13   ` Leif Lindholm
  0 siblings, 1 reply; 11+ messages in thread
From: Ryan Harkin @ 2017-01-19 13:49 UTC (permalink / raw)
  To: Daniil Egranov; +Cc: edk2-devel@lists.01.org, Leif Lindholm

On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com> wrote:
> The Marvell Yukon MAC address load supported only on Juno R1 and R2.
> It disabled for Juno R0 due to PCI issues on this board.
>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>

Tested-by: Ryan Harkin <ryan.harkin@linaro.org>

> ---
>  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> index 47ff587..e9e6990 100644
> --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> @@ -378,6 +378,7 @@ OnEndOfDxe (
>    EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
>    EFI_HANDLE                Handle;
>    EFI_STATUS                Status;
> +  UINT32                    JunoRevision;
>
>    //
>    // PCI Root Complex initialization
> @@ -393,8 +394,12 @@ OnEndOfDxe (
>    Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath, FALSE);
>    ASSERT_EFI_ERROR (Status);
>
> -  Status = ArmJunoSetNicMacAddress ();
> -  ASSERT_EFI_ERROR (Status);
> +  GetJunoRevision (JunoRevision);
> +
> +  if (JunoRevision != JUNO_REVISION_R0) {
> +    Status = ArmJunoSetNicMacAddress ();
> +    ASSERT_EFI_ERROR (Status);

This is just an FYI, but I stacked your patch on top of mainline, like this:

5f81f61  2017-01-18  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
Fixed crash on Juno R0       [Daniil Egranov]
19ca06b  2017-01-19  OvmfPkg: Remove superfluous return statements.
 [Thomas Huth]

The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
assert triggered:

UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
[snip]
ASSERT_EFI_ERROR (Status = Not Found)
ASSERT [ArmJunoDxe]
/linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
!EFI_ERROR (Status)

I worked out what is happening. And it's not to do with this patch.
It's another fall-out from the re-work you did to the previous patch.
It's also ultimately due to a bug the firmware.

With the initial version of your "Set Marvell Yukon MAC address"
patch, this hang didn't happen. I suspect that was because your error
checking was weaker and certain PCIe failures didn't trigger the
assert.

To reproduce the error with this commit:
1) power on and boot R1 or R2 into Shell
  I do this by interrupting the boot by pressing ESCAPE and using the boot menu
2) At the Shell prompt, run "reset -s" to shutdown
3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
4) the board will hang while booting UEFI, assuming the board firmware
doesn't die with constant messages like this:

    ERROR: PCIe CSR read failed to respond
    ERROR: SMBus transaction not claimed

Assuming the problem is firmware, not EDK2, what should we do about it?

Prior to your "Set Marvell Yukon MAC address" patch, or with the
earlier version, the board would boot anyway, but the Yukon device
would be missing.

Now it dies.

I don't know which is worse, but I think hanging is worse than an
ethernet port dropping out. Although hanging is a bit more obvious
that there's a problem...


> +  }
>  }
>
>  STATIC
> --
> 2.7.4
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
  2017-01-19 13:49 ` Ryan Harkin
@ 2017-01-19 15:13   ` Leif Lindholm
  2017-01-20  1:34     ` Daniil Egranov
  0 siblings, 1 reply; 11+ messages in thread
From: Leif Lindholm @ 2017-01-19 15:13 UTC (permalink / raw)
  To: Ryan Harkin, Daniil Egranov; +Cc: edk2-devel@lists.01.org

On Thu, Jan 19, 2017 at 01:49:04PM +0000, Ryan Harkin wrote:
> On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com> wrote:
> > The Marvell Yukon MAC address load supported only on Juno R1 and R2.
> > It disabled for Juno R0 due to PCI issues on this board.
> >
> > Contributed-under: TianoCore Contribution Agreement 1.0
> > Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
> 
> Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
> 
> > ---
> >  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> > index 47ff587..e9e6990 100644
> > --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> > +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> > @@ -378,6 +378,7 @@ OnEndOfDxe (
> >    EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
> >    EFI_HANDLE                Handle;
> >    EFI_STATUS                Status;
> > +  UINT32                    JunoRevision;
> >
> >    //
> >    // PCI Root Complex initialization
> > @@ -393,8 +394,12 @@ OnEndOfDxe (
> >    Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath, FALSE);
> >    ASSERT_EFI_ERROR (Status);
> >
> > -  Status = ArmJunoSetNicMacAddress ();
> > -  ASSERT_EFI_ERROR (Status);
> > +  GetJunoRevision (JunoRevision);
> > +
> > +  if (JunoRevision != JUNO_REVISION_R0) {
> > +    Status = ArmJunoSetNicMacAddress ();
> > +    ASSERT_EFI_ERROR (Status);
> 
> This is just an FYI, but I stacked your patch on top of mainline, like this:
> 
> 5f81f61  2017-01-18  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
> Fixed crash on Juno R0       [Daniil Egranov]
> 19ca06b  2017-01-19  OvmfPkg: Remove superfluous return statements.
>  [Thomas Huth]
> 
> The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
> assert triggered:
> 
> UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
> [snip]
> ASSERT_EFI_ERROR (Status = Not Found)
> ASSERT [ArmJunoDxe]
> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
> !EFI_ERROR (Status)
> 
> I worked out what is happening. And it's not to do with this patch.
> It's another fall-out from the re-work you did to the previous patch.
> It's also ultimately due to a bug the firmware.
> 
> With the initial version of your "Set Marvell Yukon MAC address"
> patch, this hang didn't happen. I suspect that was because your error
> checking was weaker and certain PCIe failures didn't trigger the
> assert.
> 
> To reproduce the error with this commit:
> 1) power on and boot R1 or R2 into Shell
>   I do this by interrupting the boot by pressing ESCAPE and using the boot menu
> 2) At the Shell prompt, run "reset -s" to shutdown
> 3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
> 4) the board will hang while booting UEFI, assuming the board firmware
> doesn't die with constant messages like this:
> 
>     ERROR: PCIe CSR read failed to respond
>     ERROR: SMBus transaction not claimed
> 
> Assuming the problem is firmware, not EDK2, what should we do about it?

OK, so instinctively, my reaction was that "the reset -s bug is a
system controller firmware bug and we shouldn't work around
it". However, since it is actually disrupting Ryan's workflow, which
frequently doesn't touch PCI at all, I think downgrading the ASSERT to
an error message is a good idea short-term.

Daniil - could you make that change please?

/
    Leif

> Prior to your "Set Marvell Yukon MAC address" patch, or with the
> earlier version, the board would boot anyway, but the Yukon device
> would be missing.
> 
> Now it dies.
> 
> I don't know which is worse, but I think hanging is worse than an
> ethernet port dropping out. Although hanging is a bit more obvious
> that there's a problem...
> 
> 
> > +  }
> >  }
> >
> >  STATIC
> > --
> > 2.7.4
> >


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
  2017-01-19 15:13   ` Leif Lindholm
@ 2017-01-20  1:34     ` Daniil Egranov
  2017-01-20 10:30       ` Ryan Harkin
  0 siblings, 1 reply; 11+ messages in thread
From: Daniil Egranov @ 2017-01-20  1:34 UTC (permalink / raw)
  To: Leif Lindholm, Ryan Harkin; +Cc: edk2-devel@lists.01.org

Hi Leif, Ryan


On 01/19/2017 09:13 AM, Leif Lindholm wrote:
> On Thu, Jan 19, 2017 at 01:49:04PM +0000, Ryan Harkin wrote:
>> On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com> wrote:
>>> The Marvell Yukon MAC address load supported only on Juno R1 and R2.
>>> It disabled for Juno R0 due to PCI issues on this board.
>>>
>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>> Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
>> Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
>>
>>> ---
>>>   ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
>>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>> index 47ff587..e9e6990 100644
>>> --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>> +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>> @@ -378,6 +378,7 @@ OnEndOfDxe (
>>>     EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
>>>     EFI_HANDLE                Handle;
>>>     EFI_STATUS                Status;
>>> +  UINT32                    JunoRevision;
>>>
>>>     //
>>>     // PCI Root Complex initialization
>>> @@ -393,8 +394,12 @@ OnEndOfDxe (
>>>     Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath, FALSE);
>>>     ASSERT_EFI_ERROR (Status);
>>>
>>> -  Status = ArmJunoSetNicMacAddress ();
>>> -  ASSERT_EFI_ERROR (Status);
>>> +  GetJunoRevision (JunoRevision);
>>> +
>>> +  if (JunoRevision != JUNO_REVISION_R0) {
>>> +    Status = ArmJunoSetNicMacAddress ();
>>> +    ASSERT_EFI_ERROR (Status);
>> This is just an FYI, but I stacked your patch on top of mainline, like this:
>>
>> 5f81f61  2017-01-18  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
>> Fixed crash on Juno R0       [Daniil Egranov]
>> 19ca06b  2017-01-19  OvmfPkg: Remove superfluous return statements.
>>   [Thomas Huth]
>>
>> The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
>> assert triggered:
>>
>> UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
>> [snip]
>> ASSERT_EFI_ERROR (Status = Not Found)
>> ASSERT [ArmJunoDxe]
>> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
>> !EFI_ERROR (Status)
>>
>> I worked out what is happening. And it's not to do with this patch.
>> It's another fall-out from the re-work you did to the previous patch.
>> It's also ultimately due to a bug the firmware.
>>
>> With the initial version of your "Set Marvell Yukon MAC address"
>> patch, this hang didn't happen. I suspect that was because your error
>> checking was weaker and certain PCIe failures didn't trigger the
>> assert.
>>
>> To reproduce the error with this commit:
>> 1) power on and boot R1 or R2 into Shell
>>    I do this by interrupting the boot by pressing ESCAPE and using the boot menu
>> 2) At the Shell prompt, run "reset -s" to shutdown
>> 3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
>> 4) the board will hang while booting UEFI, assuming the board firmware
>> doesn't die with constant messages like this:
>>
>>      ERROR: PCIe CSR read failed to respond
>>      ERROR: SMBus transaction not claimed
>>
>> Assuming the problem is firmware, not EDK2, what should we do about it?
> OK, so instinctively, my reaction was that "the reset -s bug is a
> system controller firmware bug and we shouldn't work around
> it". However, since it is actually disrupting Ryan's workflow, which
> frequently doesn't touch PCI at all, I think downgrading the ASSERT to
> an error message is a good idea short-term.
>
> Daniil - could you make that change please?
>
> /
>      Leif

I've been able to reproduce "PCIe CSR read failed to respond" and "SMBus 
transaction not claimed" errors on my Juno R2. I disabled Marvell Yukon 
driver (.dsc/.fdf) and removed ArmJunoDxe patch but still see the same 
error messages during the initial boot.

Testing motherboard interfaces (FPGA build 118)...
SRAM 32MB test: PASSED
LAN9118   test: PASSED
KMI1/2    test: PASSED
MMC       test: PASSED
PB/LEDs   test: PASSED
FPGA UART test: PASSED
ERROR: PCIe CSR read failed to respond
ERROR: SMBus transaction not claimed
ERROR: PCIe CSR read failed to respond
...

Once it went through reporting these errors, the UEFI starts loading but 
still fails in OnEndOfDxe():
ASSERT_EFI_ERROR (Status = Not Found)
ASSERT [ArmJunoDxe] 
/home/user/workspace/juno/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(110): 
!EFI_ERROR (Status)

This is the original ArmJunoDxe code:
   Status = gBS->ConnectController (Handle, NULL, 
PciRootComplexDevicePath, FALSE);
   ASSERT_EFI_ERROR (Status); <---- line 110

but it actually fails here first:
   Status = gBS->LocateDevicePath (&gEfiPciRootBridgeIoProtocolGuid,
                                   &PciRootComplexDevicePath,
                                   &Handle);

Ryan - could you try to remove Marvell patches and check if you also 
catching "PCIe .." and "SMBus .." errors without them and your build 
still fails with other ASSERTs related to PCI.

Leif - in my tests, if it fails with "PCIe .." and "SMBus ..", the UEFI 
PCI enumeration is completely corrupted.

These errors do not appear if board was reset with the nPBRESET button. 
It never fails with "reset -w" and "reset -c". I also loaded Debian and 
used the shutdown command from there and got the same "PCIe .." and 
"SMBus .." errors after "reboot" command from the "Cmd>" prompt.  
Possibly, the "reboot" command from the board shell prompt doesnot reset 
the board correctly so it looks like a firmware issue.

Thanks,
Daniil

>> Prior to your "Set Marvell Yukon MAC address" patch, or with the
>> earlier version, the board would boot anyway, but the Yukon device
>> would be missing.
>>
>> Now it dies.
>>
>> I don't know which is worse, but I think hanging is worse than an
>> ethernet port dropping out. Although hanging is a bit more obvious
>> that there's a problem...
>>
>>
>>> +  }
>>>   }
>>>
>>>   STATIC
>>> --
>>> 2.7.4
>>>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
  2017-01-20  1:34     ` Daniil Egranov
@ 2017-01-20 10:30       ` Ryan Harkin
  2017-01-20 20:57         ` Daniil Egranov
  0 siblings, 1 reply; 11+ messages in thread
From: Ryan Harkin @ 2017-01-20 10:30 UTC (permalink / raw)
  To: Daniil Egranov; +Cc: Leif Lindholm, edk2-devel@lists.01.org

On 20 January 2017 at 01:34, Daniil Egranov <daniil.egranov@arm.com> wrote:
> Hi Leif, Ryan
>
>
> On 01/19/2017 09:13 AM, Leif Lindholm wrote:
>
> On Thu, Jan 19, 2017 at 01:49:04PM +0000, Ryan Harkin wrote:
>
> On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com> wrote:
>
> The Marvell Yukon MAC address load supported only on Juno R1 and R2.
> It disabled for Juno R0 due to PCI issues on this board.
>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
>
> Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
>
> ---
>  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> index 47ff587..e9e6990 100644
> --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> @@ -378,6 +378,7 @@ OnEndOfDxe (
>    EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
>    EFI_HANDLE                Handle;
>    EFI_STATUS                Status;
> +  UINT32                    JunoRevision;
>
>    //
>    // PCI Root Complex initialization
> @@ -393,8 +394,12 @@ OnEndOfDxe (
>    Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath,
> FALSE);
>    ASSERT_EFI_ERROR (Status);
>
> -  Status = ArmJunoSetNicMacAddress ();
> -  ASSERT_EFI_ERROR (Status);
> +  GetJunoRevision (JunoRevision);
> +
> +  if (JunoRevision != JUNO_REVISION_R0) {
> +    Status = ArmJunoSetNicMacAddress ();
> +    ASSERT_EFI_ERROR (Status);
>
> This is just an FYI, but I stacked your patch on top of mainline, like this:
>
> 5f81f61  2017-01-18  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
> Fixed crash on Juno R0       [Daniil Egranov]
> 19ca06b  2017-01-19  OvmfPkg: Remove superfluous return statements.
>  [Thomas Huth]
>
> The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
> assert triggered:
>
> UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
> [snip]
> ASSERT_EFI_ERROR (Status = Not Found)
> ASSERT [ArmJunoDxe]
> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
> !EFI_ERROR (Status)
>
> I worked out what is happening. And it's not to do with this patch.
> It's another fall-out from the re-work you did to the previous patch.
> It's also ultimately due to a bug the firmware.
>
> With the initial version of your "Set Marvell Yukon MAC address"
> patch, this hang didn't happen. I suspect that was because your error
> checking was weaker and certain PCIe failures didn't trigger the
> assert.
>
> To reproduce the error with this commit:
> 1) power on and boot R1 or R2 into Shell
>   I do this by interrupting the boot by pressing ESCAPE and using the boot
> menu
> 2) At the Shell prompt, run "reset -s" to shutdown
> 3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
> 4) the board will hang while booting UEFI, assuming the board firmware
> doesn't die with constant messages like this:
>
>     ERROR: PCIe CSR read failed to respond
>     ERROR: SMBus transaction not claimed
>
> Assuming the problem is firmware, not EDK2, what should we do about it?
>
> OK, so instinctively, my reaction was that "the reset -s bug is a
> system controller firmware bug and we shouldn't work around
> it". However, since it is actually disrupting Ryan's workflow, which
> frequently doesn't touch PCI at all, I think downgrading the ASSERT to
> an error message is a good idea short-term.
>
> Daniil - could you make that change please?
>
> /
>     Leif
>
>
> I've been able to reproduce "PCIe CSR read failed to respond" and "SMBus
> transaction not claimed" errors on my Juno R2. I disabled Marvell Yukon
> driver (.dsc/.fdf) and removed ArmJunoDxe patch but still see the same error
> messages during the initial boot.
>
> Testing motherboard interfaces (FPGA build 118)...
> SRAM 32MB test: PASSED
> LAN9118   test: PASSED
> KMI1/2    test: PASSED
> MMC       test: PASSED
> PB/LEDs   test: PASSED
> FPGA UART test: PASSED
> ERROR: PCIe CSR read failed to respond
> ERROR: SMBus transaction not claimed
> ERROR: PCIe CSR read failed to respond
> ...
>
> Once it went through reporting these errors, the UEFI starts loading but
> still fails in OnEndOfDxe():
> ASSERT_EFI_ERROR (Status = Not Found)
> ASSERT [ArmJunoDxe]
> /home/user/workspace/juno/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(110):
> !EFI_ERROR (Status)
>
> This is the original ArmJunoDxe code:
>   Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath,
> FALSE);
>   ASSERT_EFI_ERROR (Status); <---- line 110

I'm confused at which point in the commit history this is line 110.

Just before your MAC commit, it's line 108. After your commit, and in
the current mainline, it's line 394.

Then, I have to go back to Jeremy's commit
58a4bff071832e587d18f97597b5d571dcebc9d7 on 27 July 2016 before I see
another change to ArmJunoDxe.c. Before that commit, the assert appears
on line 116.

On my tree, 17.01, 16.12 and 16.11 has the assert on line 246, 16.10
on line 125.

So I'm unsure what code you are running.



>
> but it actually fails here first:
>   Status = gBS->LocateDevicePath (&gEfiPciRootBridgeIoProtocolGuid,
>                                   &PciRootComplexDevicePath,
>                                   &Handle);
>
> Ryan - could you try to remove Marvell patches and check if you also
> catching "PCIe .." and "SMBus .." errors without them and your build still
> fails with other ASSERTs related to PCI.
>
> Leif - in my tests, if it fails with "PCIe .." and "SMBus ..", the UEFI PCI
> enumeration is completely corrupted.
>
> These errors do not appear if board was reset with the nPBRESET button. It
> never fails with "reset -w" and "reset -c". I also loaded Debian and used
> the shutdown command from there and got the same "PCIe .." and "SMBus .."
> errors after "reboot" command from the "Cmd>" prompt.  Possibly, the
> "reboot" command from the board shell prompt does not reset the board
> correctly so it looks like a firmware issue.
>

Yes, that's pretty much what I've been saying. There appears to be a
firmware bug that causes PCIe to be in an unstable state after
software shutdown of the AP by the motherboard, followed by reset.

The "PCIe CSR read failed to respond" error is nothing to do with EDK2
and happens before EDK2 runs.

However, unlike you, if I don't get the "PCIe CSR read failed to
respond" error, upstream EDK2 before your MAC address commit will boot
without ASSERTing on R0/1/2.

So I've never seen the assert at line 110 as you pointed out above.
That's curious. I can see why that would/should assert, but it doesn't
for me. Perhaps it's something to do with the tree you are building
from? Or the devices you have connected? I have no additional SATA or
PCIe devices connected to my board.

Cheers,
Ryan.

> Thanks,
> Daniil
>
>
> Prior to your "Set Marvell Yukon MAC address" patch, or with the
> earlier version, the board would boot anyway, but the Yukon device
> would be missing.
>
> Now it dies.
>
> I don't know which is worse, but I think hanging is worse than an
> ethernet port dropping out. Although hanging is a bit more obvious
> that there's a problem...
>
>
> +  }
>  }
>
>  STATIC
> --
> 2.7.4
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
>
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
  2017-01-20 10:30       ` Ryan Harkin
@ 2017-01-20 20:57         ` Daniil Egranov
  2017-01-23 11:26           ` Ryan Harkin
  2017-01-23 12:56           ` Ryan Harkin
  0 siblings, 2 replies; 11+ messages in thread
From: Daniil Egranov @ 2017-01-20 20:57 UTC (permalink / raw)
  To: Ryan Harkin; +Cc: edk2-devel@lists.01.org, Leif Lindholm

Hi Ryan,


On 01/20/2017 04:30 AM, Ryan Harkin wrote:
> On 20 January 2017 at 01:34, Daniil Egranov <daniil.egranov@arm.com> wrote:
>> Hi Leif, Ryan
>>
>>
>> On 01/19/2017 09:13 AM, Leif Lindholm wrote:
>>
>> On Thu, Jan 19, 2017 at 01:49:04PM +0000, Ryan Harkin wrote:
>>
>> On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com> wrote:
>>
>> The Marvell Yukon MAC address load supported only on Juno R1 and R2.
>> It disabled for Juno R0 due to PCI issues on this board.
>>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
>>
>> Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
>>
>> ---
>>   ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>> b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>> index 47ff587..e9e6990 100644
>> --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>> +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>> @@ -378,6 +378,7 @@ OnEndOfDxe (
>>     EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
>>     EFI_HANDLE                Handle;
>>     EFI_STATUS                Status;
>> +  UINT32                    JunoRevision;
>>
>>     //
>>     // PCI Root Complex initialization
>> @@ -393,8 +394,12 @@ OnEndOfDxe (
>>     Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath,
>> FALSE);
>>     ASSERT_EFI_ERROR (Status);
>>
>> -  Status = ArmJunoSetNicMacAddress ();
>> -  ASSERT_EFI_ERROR (Status);
>> +  GetJunoRevision (JunoRevision);
>> +
>> +  if (JunoRevision != JUNO_REVISION_R0) {
>> +    Status = ArmJunoSetNicMacAddress ();
>> +    ASSERT_EFI_ERROR (Status);
>>
>> This is just an FYI, but I stacked your patch on top of mainline, like this:
>>
>> 5f81f61  2017-01-18  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
>> Fixed crash on Juno R0       [Daniil Egranov]
>> 19ca06b  2017-01-19  OvmfPkg: Remove superfluous return statements.
>>   [Thomas Huth]
>>
>> The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
>> assert triggered:
>>
>> UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
>> [snip]
>> ASSERT_EFI_ERROR (Status = Not Found)
>> ASSERT [ArmJunoDxe]
>> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
>> !EFI_ERROR (Status)
>>
>> I worked out what is happening. And it's not to do with this patch.
>> It's another fall-out from the re-work you did to the previous patch.
>> It's also ultimately due to a bug the firmware.
>>
>> With the initial version of your "Set Marvell Yukon MAC address"
>> patch, this hang didn't happen. I suspect that was because your error
>> checking was weaker and certain PCIe failures didn't trigger the
>> assert.
>>
>> To reproduce the error with this commit:
>> 1) power on and boot R1 or R2 into Shell
>>    I do this by interrupting the boot by pressing ESCAPE and using the boot
>> menu
>> 2) At the Shell prompt, run "reset -s" to shutdown
>> 3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
>> 4) the board will hang while booting UEFI, assuming the board firmware
>> doesn't die with constant messages like this:
>>
>>      ERROR: PCIe CSR read failed to respond
>>      ERROR: SMBus transaction not claimed
>>
>> Assuming the problem is firmware, not EDK2, what should we do about it?
>>
>> OK, so instinctively, my reaction was that "the reset -s bug is a
>> system controller firmware bug and we shouldn't work around
>> it". However, since it is actually disrupting Ryan's workflow, which
>> frequently doesn't touch PCI at all, I think downgrading the ASSERT to
>> an error message is a good idea short-term.
>>
>> Daniil - could you make that change please?
>>
>> /
>>      Leif
>>
>>
>> I've been able to reproduce "PCIe CSR read failed to respond" and "SMBus
>> transaction not claimed" errors on my Juno R2. I disabled Marvell Yukon
>> driver (.dsc/.fdf) and removed ArmJunoDxe patch but still see the same error
>> messages during the initial boot.
>>
>> Testing motherboard interfaces (FPGA build 118)...
>> SRAM 32MB test: PASSED
>> LAN9118   test: PASSED
>> KMI1/2    test: PASSED
>> MMC       test: PASSED
>> PB/LEDs   test: PASSED
>> FPGA UART test: PASSED
>> ERROR: PCIe CSR read failed to respond
>> ERROR: SMBus transaction not claimed
>> ERROR: PCIe CSR read failed to respond
>> ...
>>
>> Once it went through reporting these errors, the UEFI starts loading but
>> still fails in OnEndOfDxe():
>> ASSERT_EFI_ERROR (Status = Not Found)
>> ASSERT [ArmJunoDxe]
>> /home/user/workspace/juno/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(110):
>> !EFI_ERROR (Status)
>>
>> This is the original ArmJunoDxe code:
>>    Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath,
>> FALSE);
>>    ASSERT_EFI_ERROR (Status); <---- line 110
> I'm confused at which point in the commit history this is line 110.
>
> Just before your MAC commit, it's line 108. After your commit, and in
> the current mainline, it's line 394.
>
> Then, I have to go back to Jeremy's commit
> 58a4bff071832e587d18f97597b5d571dcebc9d7 on 27 July 2016 before I see
> another change to ArmJunoDxe.c. Before that commit, the assert appears
> on line 116.
>
> On my tree, 17.01, 16.12 and 16.11 has the assert on line 246, 16.10
> on line 125.
>
> So I'm unsure what code you are running.
>

This shows the code where it fails for me after MAC address code was 
removed from ArmJunoDxe.c. It's part of my development branch and lines 
may not matching any public commits ... but i pulled 16.06 release for a 
test and see the ASSERT on the same code in case of  "PCIe .." and 
"SMBus .." errors. The line number should be corresponding to 
032082bf9906f82b77c161c1c2efbddfd4bb3751 commit:

ASSERT_EFI_ERROR (Status = Not Found)
ASSERT [ArmJunoDxe] 
/home/user/workspace/juno_16.06_orig/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(124): 
!EFI_ERROR (Status)

>
>> but it actually fails here first:
>>    Status = gBS->LocateDevicePath (&gEfiPciRootBridgeIoProtocolGuid,
>>                                    &PciRootComplexDevicePath,
>>                                    &Handle);
>>
>> Ryan - could you try to remove Marvell patches and check if you also
>> catching "PCIe .." and "SMBus .." errors without them and your build still
>> fails with other ASSERTs related to PCI.
>>
>> Leif - in my tests, if it fails with "PCIe .." and "SMBus ..", the UEFI PCI
>> enumeration is completely corrupted.
>>
>> These errors do not appear if board was reset with the nPBRESET button. It
>> never fails with "reset -w" and "reset -c". I also loaded Debian and used
>> the shutdown command from there and got the same "PCIe .." and "SMBus .."
>> errors after "reboot" command from the "Cmd>" prompt.  Possibly, the
>> "reboot" command from the board shell prompt does not reset the board
>> correctly so it looks like a firmware issue.
>>
> Yes, that's pretty much what I've been saying. There appears to be a
> firmware bug that causes PCIe to be in an unstable state after
> software shutdown of the AP by the motherboard, followed by reset.
>
> The "PCIe CSR read failed to respond" error is nothing to do with EDK2
> and happens before EDK2 runs.
>
> However, unlike you, if I don't get the "PCIe CSR read failed to
> respond" error, upstream EDK2 before your MAC address commit will boot
> without ASSERTing on R0/1/2.

Do you mean, when you do have the "PCIe .." error ? As i understand, if 
you power-cycle the board (no "PCIe .." errors) you do not see ASSERT at 
the MAC address code during the first boot and can start UEFI Shell?

> So I've never seen the assert at line 110 as you pointed out above.
> That's curious. I can see why that would/should assert, but it doesn't
> for me. Perhaps it's something to do with the tree you are building
> from? Or the devices you have connected? I have no additional SATA or
> PCIe devices connected to my board.
>
> Cheers,
> Ryan.

I am using R2 for the tests. The tests were done without SATA or PCI 
devices connected to the board. Unfortunately, I do not have R1 board at 
this moment to check it.

This is my R2 firmware versions:

ARM V2M-Juno Boot loader v1.0.0
HBI0262 build 1844

ARM V2M_Juno r2 Firmware v1.4.3
Build Date: Apr 12 2016

MIC RAM configuration (pms_v105.bin)...

Configuring motherboard (rev D, var A)...
IOFPGA image \MB\HBI0262D\io_b118.bit

As Leif suggested, i will replace the ASSERT with an error message and 
send another patch.

Could you please check one thing for me on your boards? Use a build 
which worked for you (no ASSERT) and your steps: UEFI shell -> reset -s 
-> reboot -> see "PCIe .." and "SMBus .." errors -> boot UEFI. As i 
understand, you do not have the ASSERT with your old builds so you can 
start the UEFI shell. Use the "pci" shell command to list all PCI 
devices. Do you see them correctly enumerated? I'd like to make sure 
that by ignoring the ASSERT errors we will not face any further problems 
with PCI.

Thanks,
Daniil

>> Thanks,
>> Daniil
>>
>>
>> Prior to your "Set Marvell Yukon MAC address" patch, or with the
>> earlier version, the board would boot anyway, but the Yukon device
>> would be missing.
>>
>> Now it dies.
>>
>> I don't know which is worse, but I think hanging is worse than an
>> ethernet port dropping out. Although hanging is a bit more obvious
>> that there's a problem...
>>
>>
>> +  }
>>   }
>>
>>   STATIC
>> --
>> 2.7.4
>>
>> _______________________________________________
>> edk2-devel mailing list
>> edk2-devel@lists.01.org
>> https://lists.01.org/mailman/listinfo/edk2-devel
>>
>>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
  2017-01-20 20:57         ` Daniil Egranov
@ 2017-01-23 11:26           ` Ryan Harkin
  2017-01-23 12:56           ` Ryan Harkin
  1 sibling, 0 replies; 11+ messages in thread
From: Ryan Harkin @ 2017-01-23 11:26 UTC (permalink / raw)
  To: Daniil Egranov; +Cc: edk2-devel@lists.01.org, Leif Lindholm

On 20 January 2017 at 20:57, Daniil Egranov <daniil.egranov@arm.com> wrote:
> Hi Ryan,
>
>
> On 01/20/2017 04:30 AM, Ryan Harkin wrote:
>
> On 20 January 2017 at 01:34, Daniil Egranov <daniil.egranov@arm.com> wrote:
>
> Hi Leif, Ryan
>
>
> On 01/19/2017 09:13 AM, Leif Lindholm wrote:
>
> On Thu, Jan 19, 2017 at 01:49:04PM +0000, Ryan Harkin wrote:
>
> On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com> wrote:
>
> The Marvell Yukon MAC address load supported only on Juno R1 and R2.
> It disabled for Juno R0 due to PCI issues on this board.
>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
>
> Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
>
> ---
>  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> index 47ff587..e9e6990 100644
> --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> @@ -378,6 +378,7 @@ OnEndOfDxe (
>    EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
>    EFI_HANDLE                Handle;
>    EFI_STATUS                Status;
> +  UINT32                    JunoRevision;
>
>    //
>    // PCI Root Complex initialization
> @@ -393,8 +394,12 @@ OnEndOfDxe (
>    Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath,
> FALSE);
>    ASSERT_EFI_ERROR (Status);
>
> -  Status = ArmJunoSetNicMacAddress ();
> -  ASSERT_EFI_ERROR (Status);
> +  GetJunoRevision (JunoRevision);
> +
> +  if (JunoRevision != JUNO_REVISION_R0) {
> +    Status = ArmJunoSetNicMacAddress ();
> +    ASSERT_EFI_ERROR (Status);
>
> This is just an FYI, but I stacked your patch on top of mainline, like this:
>
> 5f81f61  2017-01-18  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
> Fixed crash on Juno R0       [Daniil Egranov]
> 19ca06b  2017-01-19  OvmfPkg: Remove superfluous return statements.
>  [Thomas Huth]
>
> The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
> assert triggered:
>
> UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
> [snip]
> ASSERT_EFI_ERROR (Status = Not Found)
> ASSERT [ArmJunoDxe]
> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
> !EFI_ERROR (Status)
>
> I worked out what is happening. And it's not to do with this patch.
> It's another fall-out from the re-work you did to the previous patch.
> It's also ultimately due to a bug the firmware.
>
> With the initial version of your "Set Marvell Yukon MAC address"
> patch, this hang didn't happen. I suspect that was because your error
> checking was weaker and certain PCIe failures didn't trigger the
> assert.
>
> To reproduce the error with this commit:
> 1) power on and boot R1 or R2 into Shell
>   I do this by interrupting the boot by pressing ESCAPE and using the boot
> menu
> 2) At the Shell prompt, run "reset -s" to shutdown
> 3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
> 4) the board will hang while booting UEFI, assuming the board firmware
> doesn't die with constant messages like this:
>
>     ERROR: PCIe CSR read failed to respond
>     ERROR: SMBus transaction not claimed
>
> Assuming the problem is firmware, not EDK2, what should we do about it?
>
> OK, so instinctively, my reaction was that "the reset -s bug is a
> system controller firmware bug and we shouldn't work around
> it". However, since it is actually disrupting Ryan's workflow, which
> frequently doesn't touch PCI at all, I think downgrading the ASSERT to
> an error message is a good idea short-term.
>
> Daniil - could you make that change please?
>
> /
>     Leif
>
>
> I've been able to reproduce "PCIe CSR read failed to respond" and "SMBus
> transaction not claimed" errors on my Juno R2. I disabled Marvell Yukon
> driver (.dsc/.fdf) and removed ArmJunoDxe patch but still see the same error
> messages during the initial boot.
>
> Testing motherboard interfaces (FPGA build 118)...
> SRAM 32MB test: PASSED
> LAN9118   test: PASSED
> KMI1/2    test: PASSED
> MMC       test: PASSED
> PB/LEDs   test: PASSED
> FPGA UART test: PASSED
> ERROR: PCIe CSR read failed to respond
> ERROR: SMBus transaction not claimed
> ERROR: PCIe CSR read failed to respond
> ...
>
> Once it went through reporting these errors, the UEFI starts loading but
> still fails in OnEndOfDxe():
> ASSERT_EFI_ERROR (Status = Not Found)
> ASSERT [ArmJunoDxe]
> /home/user/workspace/juno/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(110):
> !EFI_ERROR (Status)
>
> This is the original ArmJunoDxe code:
>   Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath,
> FALSE);
>   ASSERT_EFI_ERROR (Status); <---- line 110
>
> I'm confused at which point in the commit history this is line 110.
>
> Just before your MAC commit, it's line 108. After your commit, and in
> the current mainline, it's line 394.
>
> Then, I have to go back to Jeremy's commit
> 58a4bff071832e587d18f97597b5d571dcebc9d7 on 27 July 2016 before I see
> another change to ArmJunoDxe.c. Before that commit, the assert appears
> on line 116.
>
> On my tree, 17.01, 16.12 and 16.11 has the assert on line 246, 16.10
> on line 125.
>
> So I'm unsure what code you are running.
>
>
> This shows the code where it fails for me after MAC address code was removed
> from ArmJunoDxe.c. It's part of my development branch and lines may not
> matching any public commits

You should consider switching branch to mainline when trying to
compare and report problems with other people, then we have a common
baseline to work with.


>  ... but i pulled 16.06 release for a test and

16.06? Why so old? What's wrong with 16.12? You are missing firmware
updates and fixes by using such an old release.


> see the ASSERT on the same code in case of  "PCIe .." and "SMBus .." errors.
> The line number should be corresponding to
> 032082bf9906f82b77c161c1c2efbddfd4bb3751 commit:
>
> ASSERT_EFI_ERROR (Status = Not Found)
> ASSERT [ArmJunoDxe]
> /home/user/workspace/juno_16.06_orig/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(124):
> !EFI_ERROR (Status)
>
>
> but it actually fails here first:
>   Status = gBS->LocateDevicePath (&gEfiPciRootBridgeIoProtocolGuid,
>                                   &PciRootComplexDevicePath,
>                                   &Handle);
>
> Ryan - could you try to remove Marvell patches and check if you also
> catching "PCIe .." and "SMBus .." errors without them and your build still
> fails with other ASSERTs related to PCI.
>
> Leif - in my tests, if it fails with "PCIe .." and "SMBus ..", the UEFI PCI
> enumeration is completely corrupted.
>
> These errors do not appear if board was reset with the nPBRESET button. It
> never fails with "reset -w" and "reset -c". I also loaded Debian and used
> the shutdown command from there and got the same "PCIe .." and "SMBus .."
> errors after "reboot" command from the "Cmd>" prompt.  Possibly, the
> "reboot" command from the board shell prompt does not reset the board
> correctly so it looks like a firmware issue.
>
> Yes, that's pretty much what I've been saying. There appears to be a
> firmware bug that causes PCIe to be in an unstable state after
> software shutdown of the AP by the motherboard, followed by reset.
>
> The "PCIe CSR read failed to respond" error is nothing to do with EDK2
> and happens before EDK2 runs.
>
> However, unlike you, if I don't get the "PCIe CSR read failed to
> respond" error, upstream EDK2 before your MAC address commit will boot
> without ASSERTing on R0/1/2.
>
>
> Do you mean, when you do have the "PCIe .." error ? As i understand, if you
> power-cycle the board (no "PCIe .." errors) you do not see ASSERT at the MAC
> address code during the first boot and can start UEFI Shell?
>

Correct.


> So I've never seen the assert at line 110 as you pointed out above.
> That's curious. I can see why that would/should assert, but it doesn't
> for me. Perhaps it's something to do with the tree you are building
> from? Or the devices you have connected? I have no additional SATA or
> PCIe devices connected to my board.
>
> Cheers,
> Ryan.
>
>
> I am using R2 for the tests. The tests were done without SATA or PCI devices
> connected to the board. Unfortunately, I do not have R1 board at this moment
> to check it.
>

That's fine, R2 is good enough.


> This is my R2 firmware versions:
>
> ARM V2M-Juno Boot loader v1.0.0
> HBI0262 build 1844
>
> ARM V2M_Juno r2 Firmware v1.4.3

v1.4.4 is the latest Juno firmware. You should update. Although it
won't fix these problems as it wasn't intended to.  And as you used
16.06 for your baseline testing, I assume you also have old SCP
firmware in your environment too.


> Build Date: Apr 12 2016
>
> MIC RAM configuration (pms_v105.bin)...
>
> Configuring motherboard (rev D, var A)...
> IOFPGA image \MB\HBI0262D\io_b118.bit
>
> As Leif suggested, i will replace the ASSERT with an error message and send
> another patch.
>

Thanks, that's all we need to do, I think.


> Could you please check one thing for me on your boards? Use a build which
> worked for you (no ASSERT) and your steps: UEFI shell -> reset -s -> reboot
> -> see "PCIe .." and "SMBus .." errors -> boot UEFI.

After I get the "PCIe .." and "SMBus .." errors, the board will never
boot into UEFI. It hangs forever and I have to power-cycle the board.
This rarely happens for me on R2.

Sometimes, I do UEFI shell -> reset -s -> reboot and the board will
give the "PCIe .." and "SMBus .." error, but most of the time, it does
not and UEFI continues to boot.



> As i understand, you do
> not have the ASSERT with your old builds so you can start the UEFI shell.
> Use the "pci" shell command to list all PCI devices. Do you see them
> correctly enumerated? I'd like to make sure that by ignoring the ASSERT
> errors we will not face any further problems with PCI.
>

In the case where my board continues to boot after software shutdown,
the pci shell command shows this:

Shell> pci
   Seg  Bus  Dev  Func
   ---  ---  ---  ----
    00   00   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 1556 Device 1100 Prog Interface 0
    00   01   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   01    00 ==> Pre 2.0 device - All devices other than VGA
             Vendor 0000 Device 0000 Prog Interface 0
    00   02   02    00 ==> Pre 2.0 device - All devices other than VGA
             Vendor 0000 Device 0000 Prog Interface 0
    00   02   03    00 ==> Pre 2.0 device - All devices other than VGA
             Vendor 0000 Device 0000 Prog Interface 0
    00   02   0C    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   10    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   1F    00 ==> Pre 2.0 device - All devices other than VGA
             Vendor 0000 Device 0000 Prog Interface 0

Which is not the same as when the board is power cycled. On power
cycle, I see this:

   Seg  Bus  Dev  Func
   ---  ---  ---  ----
    00   00   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 1556 Device 1100 Prog Interface 0
    00   01   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   01    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   02    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   03    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   0C    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   10    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   1F    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   03   00    00 ==> Mass Storage Controller - Other mass storage controll
er
             Vendor 1095 Device 3132 Prog Interface 0
    00   08   00    00 ==> Network Controller - Ethernet controller
             Vendor 11AB Device 4380 Prog Interface 0

Cheers,
Ryan.


> Thanks,
> Daniil
>
>
> Thanks,
> Daniil
>
>
> Prior to your "Set Marvell Yukon MAC address" patch, or with the
> earlier version, the board would boot anyway, but the Yukon device
> would be missing.
>
> Now it dies.
>
> I don't know which is worse, but I think hanging is worse than an
> ethernet port dropping out. Although hanging is a bit more obvious
> that there's a problem...
>
>
> +  }
>  }
>
>  STATIC
> --
> 2.7.4
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
>
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
>
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
  2017-01-20 20:57         ` Daniil Egranov
  2017-01-23 11:26           ` Ryan Harkin
@ 2017-01-23 12:56           ` Ryan Harkin
  2017-01-24  2:19             ` Daniil Egranov
  1 sibling, 1 reply; 11+ messages in thread
From: Ryan Harkin @ 2017-01-23 12:56 UTC (permalink / raw)
  To: Daniil Egranov; +Cc: edk2-devel@lists.01.org, Leif Lindholm

On 20 January 2017 at 20:57, Daniil Egranov <daniil.egranov@arm.com> wrote:
> Hi Ryan,
>
>
> On 01/20/2017 04:30 AM, Ryan Harkin wrote:
>
> On 20 January 2017 at 01:34, Daniil Egranov <daniil.egranov@arm.com> wrote:
>
> Hi Leif, Ryan
>
>
> On 01/19/2017 09:13 AM, Leif Lindholm wrote:
>
> On Thu, Jan 19, 2017 at 01:49:04PM +0000, Ryan Harkin wrote:
>
> On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com> wrote:
>
> The Marvell Yukon MAC address load supported only on Juno R1 and R2.
> It disabled for Juno R0 due to PCI issues on this board.
>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
>
> Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
>
> ---
>  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> index 47ff587..e9e6990 100644
> --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
> @@ -378,6 +378,7 @@ OnEndOfDxe (
>    EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
>    EFI_HANDLE                Handle;
>    EFI_STATUS                Status;
> +  UINT32                    JunoRevision;
>
>    //
>    // PCI Root Complex initialization
> @@ -393,8 +394,12 @@ OnEndOfDxe (
>    Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath,
> FALSE);
>    ASSERT_EFI_ERROR (Status);
>
> -  Status = ArmJunoSetNicMacAddress ();
> -  ASSERT_EFI_ERROR (Status);
> +  GetJunoRevision (JunoRevision);
> +
> +  if (JunoRevision != JUNO_REVISION_R0) {
> +    Status = ArmJunoSetNicMacAddress ();
> +    ASSERT_EFI_ERROR (Status);
>
> This is just an FYI, but I stacked your patch on top of mainline, like this:
>
> 5f81f61  2017-01-18  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
> Fixed crash on Juno R0       [Daniil Egranov]
> 19ca06b  2017-01-19  OvmfPkg: Remove superfluous return statements.
>  [Thomas Huth]
>
> The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
> assert triggered:
>
> UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
> [snip]
> ASSERT_EFI_ERROR (Status = Not Found)
> ASSERT [ArmJunoDxe]
> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
> !EFI_ERROR (Status)
>
> I worked out what is happening. And it's not to do with this patch.
> It's another fall-out from the re-work you did to the previous patch.
> It's also ultimately due to a bug the firmware.
>
> With the initial version of your "Set Marvell Yukon MAC address"
> patch, this hang didn't happen. I suspect that was because your error
> checking was weaker and certain PCIe failures didn't trigger the
> assert.
>
> To reproduce the error with this commit:
> 1) power on and boot R1 or R2 into Shell
>   I do this by interrupting the boot by pressing ESCAPE and using the boot
> menu
> 2) At the Shell prompt, run "reset -s" to shutdown
> 3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
> 4) the board will hang while booting UEFI, assuming the board firmware
> doesn't die with constant messages like this:
>
>     ERROR: PCIe CSR read failed to respond
>     ERROR: SMBus transaction not claimed
>
> Assuming the problem is firmware, not EDK2, what should we do about it?
>
> OK, so instinctively, my reaction was that "the reset -s bug is a
> system controller firmware bug and we shouldn't work around
> it". However, since it is actually disrupting Ryan's workflow, which
> frequently doesn't touch PCI at all, I think downgrading the ASSERT to
> an error message is a good idea short-term.
>
> Daniil - could you make that change please?
>
> /
>     Leif
>
>
> I've been able to reproduce "PCIe CSR read failed to respond" and "SMBus
> transaction not claimed" errors on my Juno R2. I disabled Marvell Yukon
> driver (.dsc/.fdf) and removed ArmJunoDxe patch but still see the same error
> messages during the initial boot.
>
> Testing motherboard interfaces (FPGA build 118)...
> SRAM 32MB test: PASSED
> LAN9118   test: PASSED
> KMI1/2    test: PASSED
> MMC       test: PASSED
> PB/LEDs   test: PASSED
> FPGA UART test: PASSED
> ERROR: PCIe CSR read failed to respond
> ERROR: SMBus transaction not claimed
> ERROR: PCIe CSR read failed to respond
> ...
>
> Once it went through reporting these errors, the UEFI starts loading but
> still fails in OnEndOfDxe():
> ASSERT_EFI_ERROR (Status = Not Found)
> ASSERT [ArmJunoDxe]
> /home/user/workspace/juno/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(110):
> !EFI_ERROR (Status)
>
> This is the original ArmJunoDxe code:
>   Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath,
> FALSE);
>   ASSERT_EFI_ERROR (Status); <---- line 110
>
> I'm confused at which point in the commit history this is line 110.
>
> Just before your MAC commit, it's line 108. After your commit, and in
> the current mainline, it's line 394.
>
> Then, I have to go back to Jeremy's commit
> 58a4bff071832e587d18f97597b5d571dcebc9d7 on 27 July 2016 before I see
> another change to ArmJunoDxe.c. Before that commit, the assert appears
> on line 116.
>
> On my tree, 17.01, 16.12 and 16.11 has the assert on line 246, 16.10
> on line 125.
>
> So I'm unsure what code you are running.
>
>
> This shows the code where it fails for me after MAC address code was removed
> from ArmJunoDxe.c. It's part of my development branch and lines may not
> matching any public commits

You should consider switching branch to mainline when trying to
compare and report problems with other people, then we have a common
baseline to work with.


>  ... but i pulled 16.06 release for a test and

16.06? Why so old? What's wrong with 16.12? You are missing firmware
updates and fixes by using such an old release.


> see the ASSERT on the same code in case of  "PCIe .." and "SMBus .." errors.
> The line number should be corresponding to
> 032082bf9906f82b77c161c1c2efbddfd4bb3751 commit:
>
> ASSERT_EFI_ERROR (Status = Not Found)
> ASSERT [ArmJunoDxe]
> /home/user/workspace/juno_16.06_orig/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(124):
> !EFI_ERROR (Status)
>
>
> but it actually fails here first:
>   Status = gBS->LocateDevicePath (&gEfiPciRootBridgeIoProtocolGuid,
>                                   &PciRootComplexDevicePath,
>                                   &Handle);
>
> Ryan - could you try to remove Marvell patches and check if you also
> catching "PCIe .." and "SMBus .." errors without them and your build still
> fails with other ASSERTs related to PCI.
>
> Leif - in my tests, if it fails with "PCIe .." and "SMBus ..", the UEFI PCI
> enumeration is completely corrupted.
>
> These errors do not appear if board was reset with the nPBRESET button. It
> never fails with "reset -w" and "reset -c". I also loaded Debian and used
> the shutdown command from there and got the same "PCIe .." and "SMBus .."
> errors after "reboot" command from the "Cmd>" prompt.  Possibly, the
> "reboot" command from the board shell prompt does not reset the board
> correctly so it looks like a firmware issue.
>
> Yes, that's pretty much what I've been saying. There appears to be a
> firmware bug that causes PCIe to be in an unstable state after
> software shutdown of the AP by the motherboard, followed by reset.
>
> The "PCIe CSR read failed to respond" error is nothing to do with EDK2
> and happens before EDK2 runs.
>
> However, unlike you, if I don't get the "PCIe CSR read failed to
> respond" error, upstream EDK2 before your MAC address commit will boot
> without ASSERTing on R0/1/2.
>
>
> Do you mean, when you do have the "PCIe .." error ? As i understand, if you
> power-cycle the board (no "PCIe .." errors) you do not see ASSERT at the MAC
> address code during the first boot and can start UEFI Shell?
>

Correct.


> So I've never seen the assert at line 110 as you pointed out above.
> That's curious. I can see why that would/should assert, but it doesn't
> for me. Perhaps it's something to do with the tree you are building
> from? Or the devices you have connected? I have no additional SATA or
> PCIe devices connected to my board.
>
> Cheers,
> Ryan.
>
>
> I am using R2 for the tests. The tests were done without SATA or PCI devices
> connected to the board. Unfortunately, I do not have R1 board at this moment
> to check it.
>

That's fine, R2 is good enough.


> This is my R2 firmware versions:
>
> ARM V2M-Juno Boot loader v1.0.0
> HBI0262 build 1844
>
> ARM V2M_Juno r2 Firmware v1.4.3

v1.4.4 is the latest Juno firmware. You should update. Although it
won't fix these problems as it wasn't intended to.  And as you used
16.06 for your baseline testing, I assume you probably also have old
SCP firmware in your environment too.


> Build Date: Apr 12 2016
>
> MIC RAM configuration (pms_v105.bin)...
>
> Configuring motherboard (rev D, var A)...
> IOFPGA image \MB\HBI0262D\io_b118.bit
>
> As Leif suggested, i will replace the ASSERT with an error message and send
> another patch.
>

Thanks, that's all we need to do, I think.


> Could you please check one thing for me on your boards? Use a build which
> worked for you (no ASSERT) and your steps: UEFI shell -> reset -s -> reboot
> -> see "PCIe .." and "SMBus .." errors -> boot UEFI.

After I get the "PCIe .." and "SMBus .." errors, the board will never
boot into UEFI. It hangs forever and I have to power-cycle the board.
This rarely happens for me on R2.

Sometimes, I do UEFI shell -> reset -s -> reboot and the board will
give the "PCIe .." and "SMBus .." error, but most of the time, it does
not and UEFI continues to boot.



> As i understand, you do
> not have the ASSERT with your old builds so you can start the UEFI shell.
> Use the "pci" shell command to list all PCI devices. Do you see them
> correctly enumerated? I'd like to make sure that by ignoring the ASSERT
> errors we will not face any further problems with PCI.
>

In the case where my board continues to boot after software shutdown,
the pci shell command shows this:

Shell> pci
   Seg  Bus  Dev  Func
   ---  ---  ---  ----
    00   00   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 1556 Device 1100 Prog Interface 0
    00   01   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   01    00 ==> Pre 2.0 device - All devices other than VGA
             Vendor 0000 Device 0000 Prog Interface 0
    00   02   02    00 ==> Pre 2.0 device - All devices other than VGA
             Vendor 0000 Device 0000 Prog Interface 0
    00   02   03    00 ==> Pre 2.0 device - All devices other than VGA
             Vendor 0000 Device 0000 Prog Interface 0
    00   02   0C    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   10    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   1F    00 ==> Pre 2.0 device - All devices other than VGA
             Vendor 0000 Device 0000 Prog Interface 0

Which is not the same as when the board is power cycled. On power
cycle, I see this:

   Seg  Bus  Dev  Func
   ---  ---  ---  ----
    00   00   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 1556 Device 1100 Prog Interface 0
    00   01   00    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   01    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   02    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   03    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   0C    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   10    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   02   1F    00 ==> Bridge Device - PCI/PCI bridge
             Vendor 111D Device 8090 Prog Interface 0
    00   03   00    00 ==> Mass Storage Controller - Other mass storage controll
er
             Vendor 1095 Device 3132 Prog Interface 0
    00   08   00    00 ==> Network Controller - Ethernet controller
             Vendor 11AB Device 4380 Prog Interface 0

Cheers,
Ryan.


> Thanks,
> Daniil
>
>
> Thanks,
> Daniil
>
>
> Prior to your "Set Marvell Yukon MAC address" patch, or with the
> earlier version, the board would boot anyway, but the Yukon device
> would be missing.
>
> Now it dies.
>
> I don't know which is worse, but I think hanging is worse than an
> ethernet port dropping out. Although hanging is a bit more obvious
> that there's a problem...
>
>
> +  }
>  }
>
>  STATIC
> --
> 2.7.4
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
>
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
>
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
  2017-01-23 12:56           ` Ryan Harkin
@ 2017-01-24  2:19             ` Daniil Egranov
  2017-01-24 11:05               ` Ryan Harkin
  0 siblings, 1 reply; 11+ messages in thread
From: Daniil Egranov @ 2017-01-24  2:19 UTC (permalink / raw)
  To: Ryan Harkin; +Cc: edk2-devel@lists.01.org, Leif Lindholm

Hi Ryan,


On 01/23/2017 06:56 AM, Ryan Harkin wrote:
> On 20 January 2017 at 20:57, Daniil Egranov <daniil.egranov@arm.com> wrote:
>> Hi Ryan,
>>
>>
>> On 01/20/2017 04:30 AM, Ryan Harkin wrote:
>>
>> On 20 January 2017 at 01:34, Daniil Egranov <daniil.egranov@arm.com> wrote:
>>
>> Hi Leif, Ryan
>>
>>
>> On 01/19/2017 09:13 AM, Leif Lindholm wrote:
>>
>> On Thu, Jan 19, 2017 at 01:49:04PM +0000, Ryan Harkin wrote:
>>
>> On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com> wrote:
>>
>> The Marvell Yukon MAC address load supported only on Juno R1 and R2.
>> It disabled for Juno R0 due to PCI issues on this board.
>>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
>>
>> Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
>>
>> ---
>>   ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>> b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>> index 47ff587..e9e6990 100644
>> --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>> +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>> @@ -378,6 +378,7 @@ OnEndOfDxe (
>>     EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
>>     EFI_HANDLE                Handle;
>>     EFI_STATUS                Status;
>> +  UINT32                    JunoRevision;
>>
>>     //
>>     // PCI Root Complex initialization
>> @@ -393,8 +394,12 @@ OnEndOfDxe (
>>     Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath,
>> FALSE);
>>     ASSERT_EFI_ERROR (Status);
>>
>> -  Status = ArmJunoSetNicMacAddress ();
>> -  ASSERT_EFI_ERROR (Status);
>> +  GetJunoRevision (JunoRevision);
>> +
>> +  if (JunoRevision != JUNO_REVISION_R0) {
>> +    Status = ArmJunoSetNicMacAddress ();
>> +    ASSERT_EFI_ERROR (Status);
>>
>> This is just an FYI, but I stacked your patch on top of mainline, like this:
>>
>> 5f81f61  2017-01-18  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
>> Fixed crash on Juno R0       [Daniil Egranov]
>> 19ca06b  2017-01-19  OvmfPkg: Remove superfluous return statements.
>>   [Thomas Huth]
>>
>> The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
>> assert triggered:
>>
>> UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
>> [snip]
>> ASSERT_EFI_ERROR (Status = Not Found)
>> ASSERT [ArmJunoDxe]
>> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
>> !EFI_ERROR (Status)
>>
>> I worked out what is happening. And it's not to do with this patch.
>> It's another fall-out from the re-work you did to the previous patch.
>> It's also ultimately due to a bug the firmware.
>>
>> With the initial version of your "Set Marvell Yukon MAC address"
>> patch, this hang didn't happen. I suspect that was because your error
>> checking was weaker and certain PCIe failures didn't trigger the
>> assert.
>>
>> To reproduce the error with this commit:
>> 1) power on and boot R1 or R2 into Shell
>>    I do this by interrupting the boot by pressing ESCAPE and using the boot
>> menu
>> 2) At the Shell prompt, run "reset -s" to shutdown
>> 3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
>> 4) the board will hang while booting UEFI, assuming the board firmware
>> doesn't die with constant messages like this:
>>
>>      ERROR: PCIe CSR read failed to respond
>>      ERROR: SMBus transaction not claimed
>>
>> Assuming the problem is firmware, not EDK2, what should we do about it?
>>
>> OK, so instinctively, my reaction was that "the reset -s bug is a
>> system controller firmware bug and we shouldn't work around
>> it". However, since it is actually disrupting Ryan's workflow, which
>> frequently doesn't touch PCI at all, I think downgrading the ASSERT to
>> an error message is a good idea short-term.
>>
>> Daniil - could you make that change please?
>>
>> /
>>      Leif
>>
>>
>> I've been able to reproduce "PCIe CSR read failed to respond" and "SMBus
>> transaction not claimed" errors on my Juno R2. I disabled Marvell Yukon
>> driver (.dsc/.fdf) and removed ArmJunoDxe patch but still see the same error
>> messages during the initial boot.
>>
>> Testing motherboard interfaces (FPGA build 118)...
>> SRAM 32MB test: PASSED
>> LAN9118   test: PASSED
>> KMI1/2    test: PASSED
>> MMC       test: PASSED
>> PB/LEDs   test: PASSED
>> FPGA UART test: PASSED
>> ERROR: PCIe CSR read failed to respond
>> ERROR: SMBus transaction not claimed
>> ERROR: PCIe CSR read failed to respond
>> ...
>>
>> Once it went through reporting these errors, the UEFI starts loading but
>> still fails in OnEndOfDxe():
>> ASSERT_EFI_ERROR (Status = Not Found)
>> ASSERT [ArmJunoDxe]
>> /home/user/workspace/juno/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(110):
>> !EFI_ERROR (Status)
>>
>> This is the original ArmJunoDxe code:
>>    Status = gBS->ConnectController (Handle, NULL, PciRootComplexDevicePath,
>> FALSE);
>>    ASSERT_EFI_ERROR (Status); <---- line 110
>>
>> I'm confused at which point in the commit history this is line 110.
>>
>> Just before your MAC commit, it's line 108. After your commit, and in
>> the current mainline, it's line 394.
>>
>> Then, I have to go back to Jeremy's commit
>> 58a4bff071832e587d18f97597b5d571dcebc9d7 on 27 July 2016 before I see
>> another change to ArmJunoDxe.c. Before that commit, the assert appears
>> on line 116.
>>
>> On my tree, 17.01, 16.12 and 16.11 has the assert on line 246, 16.10
>> on line 125.
>>
>> So I'm unsure what code you are running.
>>
>>
>> This shows the code where it fails for me after MAC address code was removed
>> from ArmJunoDxe.c. It's part of my development branch and lines may not
>> matching any public commits
> You should consider switching branch to mainline when trying to
> compare and report problems with other people, then we have a common
> baseline to work with.
>
>
>>   ... but i pulled 16.06 release for a test and
> 16.06? Why so old? What's wrong with 16.12? You are missing firmware
> updates and fixes by using such an old release.

I  wanted to check how long this problem existed. Just picked up one of 
the old builds.

>
>> see the ASSERT on the same code in case of  "PCIe .." and "SMBus .." errors.
>> The line number should be corresponding to
>> 032082bf9906f82b77c161c1c2efbddfd4bb3751 commit:
>>
>> ASSERT_EFI_ERROR (Status = Not Found)
>> ASSERT [ArmJunoDxe]
>> /home/user/workspace/juno_16.06_orig/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(124):
>> !EFI_ERROR (Status)
>>
>>
>> but it actually fails here first:
>>    Status = gBS->LocateDevicePath (&gEfiPciRootBridgeIoProtocolGuid,
>>                                    &PciRootComplexDevicePath,
>>                                    &Handle);
>>
>> Ryan - could you try to remove Marvell patches and check if you also
>> catching "PCIe .." and "SMBus .." errors without them and your build still
>> fails with other ASSERTs related to PCI.
>>
>> Leif - in my tests, if it fails with "PCIe .." and "SMBus ..", the UEFI PCI
>> enumeration is completely corrupted.
>>
>> These errors do not appear if board was reset with the nPBRESET button. It
>> never fails with "reset -w" and "reset -c". I also loaded Debian and used
>> the shutdown command from there and got the same "PCIe .." and "SMBus .."
>> errors after "reboot" command from the "Cmd>" prompt.  Possibly, the
>> "reboot" command from the board shell prompt does not reset the board
>> correctly so it looks like a firmware issue.
>>
>> Yes, that's pretty much what I've been saying. There appears to be a
>> firmware bug that causes PCIe to be in an unstable state after
>> software shutdown of the AP by the motherboard, followed by reset.
>>
>> The "PCIe CSR read failed to respond" error is nothing to do with EDK2
>> and happens before EDK2 runs.
>>
>> However, unlike you, if I don't get the "PCIe CSR read failed to
>> respond" error, upstream EDK2 before your MAC address commit will boot
>> without ASSERTing on R0/1/2.
>>
>>
>> Do you mean, when you do have the "PCIe .." error ? As i understand, if you
>> power-cycle the board (no "PCIe .." errors) you do not see ASSERT at the MAC
>> address code during the first boot and can start UEFI Shell?
>>
> Correct.
>
>
>> So I've never seen the assert at line 110 as you pointed out above.
>> That's curious. I can see why that would/should assert, but it doesn't
>> for me. Perhaps it's something to do with the tree you are building
>> from? Or the devices you have connected? I have no additional SATA or
>> PCIe devices connected to my board.
>>
>> Cheers,
>> Ryan.
>>
>>
>> I am using R2 for the tests. The tests were done without SATA or PCI devices
>> connected to the board. Unfortunately, I do not have R1 board at this moment
>> to check it.
>>
> That's fine, R2 is good enough.
>
>
>> This is my R2 firmware versions:
>>
>> ARM V2M-Juno Boot loader v1.0.0
>> HBI0262 build 1844
>>
>> ARM V2M_Juno r2 Firmware v1.4.3
> v1.4.4 is the latest Juno firmware. You should update. Although it
> won't fix these problems as it wasn't intended to.  And as you used
> 16.06 for your baseline testing, I assume you probably also have old
> SCP firmware in your environment too.
>
>
>> Build Date: Apr 12 2016
>>
>> MIC RAM configuration (pms_v105.bin)...
>>
>> Configuring motherboard (rev D, var A)...
>> IOFPGA image \MB\HBI0262D\io_b118.bit
>>
>> As Leif suggested, i will replace the ASSERT with an error message and send
>> another patch.
>>
> Thanks, that's all we need to do, I think.
>

Done.

>> Could you please check one thing for me on your boards? Use a build which
>> worked for you (no ASSERT) and your steps: UEFI shell -> reset -s -> reboot
>> -> see "PCIe .." and "SMBus .." errors -> boot UEFI.
> After I get the "PCIe .." and "SMBus .." errors, the board will never
> boot into UEFI. It hangs forever and I have to power-cycle the board.
> This rarely happens for me on R2.
>
> Sometimes, I do UEFI shell -> reset -s -> reboot and the board will
> give the "PCIe .." and "SMBus .." error, but most of the time, it does
> not and UEFI continues to boot.

Interesting, if i do reset -s, i am getting "PCIe ..." error  and stop 
with ASSERT error in UEFI all the time.

>
>
>> As i understand, you do
>> not have the ASSERT with your old builds so you can start the UEFI shell.
>> Use the "pci" shell command to list all PCI devices. Do you see them
>> correctly enumerated? I'd like to make sure that by ignoring the ASSERT
>> errors we will not face any further problems with PCI.
>>
> In the case where my board continues to boot after software shutdown,
> the pci shell command shows this:
>
> Shell> pci
>     Seg  Bus  Dev  Func
>     ---  ---  ---  ----
>      00   00   00    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 1556 Device 1100 Prog Interface 0
>      00   01   00    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 111D Device 8090 Prog Interface 0
>      00   02   01    00 ==> Pre 2.0 device - All devices other than VGA
>               Vendor 0000 Device 0000 Prog Interface 0
>      00   02   02    00 ==> Pre 2.0 device - All devices other than VGA
>               Vendor 0000 Device 0000 Prog Interface 0
>      00   02   03    00 ==> Pre 2.0 device - All devices other than VGA
>               Vendor 0000 Device 0000 Prog Interface 0
>      00   02   0C    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 111D Device 8090 Prog Interface 0
>      00   02   10    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 111D Device 8090 Prog Interface 0
>      00   02   1F    00 ==> Pre 2.0 device - All devices other than VGA
>               Vendor 0000 Device 0000 Prog Interface 0
>
> Which is not the same as when the board is power cycled. On power
> cycle, I see this:
>
>     Seg  Bus  Dev  Func
>     ---  ---  ---  ----
>      00   00   00    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 1556 Device 1100 Prog Interface 0
>      00   01   00    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 111D Device 8090 Prog Interface 0
>      00   02   01    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 111D Device 8090 Prog Interface 0
>      00   02   02    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 111D Device 8090 Prog Interface 0
>      00   02   03    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 111D Device 8090 Prog Interface 0
>      00   02   0C    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 111D Device 8090 Prog Interface 0
>      00   02   10    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 111D Device 8090 Prog Interface 0
>      00   02   1F    00 ==> Bridge Device - PCI/PCI bridge
>               Vendor 111D Device 8090 Prog Interface 0
>      00   03   00    00 ==> Mass Storage Controller - Other mass storage controll
> er
>               Vendor 1095 Device 3132 Prog Interface 0
>      00   08   00    00 ==> Network Controller - Ethernet controller
>               Vendor 11AB Device 4380 Prog Interface 0
>
> Cheers,
> Ryan.
>
>

If I let UEFI boot to the shell after "PCIe ..." error, the pci shell 
command shows me a long list of unrecognized PCI devices with all 
combinations of BDFs.

Thanks,
Daniil

>> Thanks,
>> Daniil
>>
>>
>> Thanks,
>> Daniil
>>
>>
>> Prior to your "Set Marvell Yukon MAC address" patch, or with the
>> earlier version, the board would boot anyway, but the Yukon device
>> would be missing.
>>
>> Now it dies.
>>
>> I don't know which is worse, but I think hanging is worse than an
>> ethernet port dropping out. Although hanging is a bit more obvious
>> that there's a problem...
>>
>>
>> +  }
>>   }
>>
>>   STATIC
>> --
>> 2.7.4
>>
>> _______________________________________________
>> edk2-devel mailing list
>> edk2-devel@lists.01.org
>> https://lists.01.org/mailman/listinfo/edk2-devel
>>
>>
>> _______________________________________________
>> edk2-devel mailing list
>> edk2-devel@lists.01.org
>> https://lists.01.org/mailman/listinfo/edk2-devel
>>
>>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
  2017-01-24  2:19             ` Daniil Egranov
@ 2017-01-24 11:05               ` Ryan Harkin
  2017-01-24 12:34                 ` Ryan Harkin
  0 siblings, 1 reply; 11+ messages in thread
From: Ryan Harkin @ 2017-01-24 11:05 UTC (permalink / raw)
  To: Daniil Egranov; +Cc: edk2-devel@lists.01.org, Leif Lindholm

On 24 January 2017 at 02:19, Daniil Egranov <daniil.egranov@arm.com> wrote:
> Hi Ryan,
>
>
>
> On 01/23/2017 06:56 AM, Ryan Harkin wrote:
>>
>> On 20 January 2017 at 20:57, Daniil Egranov <daniil.egranov@arm.com>
>> wrote:
>>>
>>> Hi Ryan,
>>>
>>>
>>> On 01/20/2017 04:30 AM, Ryan Harkin wrote:
>>>
>>> On 20 January 2017 at 01:34, Daniil Egranov <daniil.egranov@arm.com>
>>> wrote:
>>>
>>> Hi Leif, Ryan
>>>
>>>
>>> On 01/19/2017 09:13 AM, Leif Lindholm wrote:
>>>
>>> On Thu, Jan 19, 2017 at 01:49:04PM +0000, Ryan Harkin wrote:
>>>
>>> On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com>
>>> wrote:
>>>
>>> The Marvell Yukon MAC address load supported only on Juno R1 and R2.
>>> It disabled for Juno R0 due to PCI issues on this board.
>>>
>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>> Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
>>>
>>> Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
>>>
>>> ---
>>>   ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
>>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>> b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>> index 47ff587..e9e6990 100644
>>> --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>> +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>> @@ -378,6 +378,7 @@ OnEndOfDxe (
>>>     EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
>>>     EFI_HANDLE                Handle;
>>>     EFI_STATUS                Status;
>>> +  UINT32                    JunoRevision;
>>>
>>>     //
>>>     // PCI Root Complex initialization
>>> @@ -393,8 +394,12 @@ OnEndOfDxe (
>>>     Status = gBS->ConnectController (Handle, NULL,
>>> PciRootComplexDevicePath,
>>> FALSE);
>>>     ASSERT_EFI_ERROR (Status);
>>>
>>> -  Status = ArmJunoSetNicMacAddress ();
>>> -  ASSERT_EFI_ERROR (Status);
>>> +  GetJunoRevision (JunoRevision);
>>> +
>>> +  if (JunoRevision != JUNO_REVISION_R0) {
>>> +    Status = ArmJunoSetNicMacAddress ();
>>> +    ASSERT_EFI_ERROR (Status);
>>>
>>> This is just an FYI, but I stacked your patch on top of mainline, like
>>> this:
>>>
>>> 5f81f61  2017-01-18  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
>>> Fixed crash on Juno R0       [Daniil Egranov]
>>> 19ca06b  2017-01-19  OvmfPkg: Remove superfluous return statements.
>>>   [Thomas Huth]
>>>
>>> The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
>>> assert triggered:
>>>
>>> UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
>>> [snip]
>>> ASSERT_EFI_ERROR (Status = Not Found)
>>> ASSERT [ArmJunoDxe]
>>>
>>> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
>>> !EFI_ERROR (Status)
>>>
>>> I worked out what is happening. And it's not to do with this patch.
>>> It's another fall-out from the re-work you did to the previous patch.
>>> It's also ultimately due to a bug the firmware.
>>>
>>> With the initial version of your "Set Marvell Yukon MAC address"
>>> patch, this hang didn't happen. I suspect that was because your error
>>> checking was weaker and certain PCIe failures didn't trigger the
>>> assert.
>>>
>>> To reproduce the error with this commit:
>>> 1) power on and boot R1 or R2 into Shell
>>>    I do this by interrupting the boot by pressing ESCAPE and using the
>>> boot
>>> menu
>>> 2) At the Shell prompt, run "reset -s" to shutdown
>>> 3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
>>> 4) the board will hang while booting UEFI, assuming the board firmware
>>> doesn't die with constant messages like this:
>>>
>>>      ERROR: PCIe CSR read failed to respond
>>>      ERROR: SMBus transaction not claimed
>>>
>>> Assuming the problem is firmware, not EDK2, what should we do about it?
>>>
>>> OK, so instinctively, my reaction was that "the reset -s bug is a
>>> system controller firmware bug and we shouldn't work around
>>> it". However, since it is actually disrupting Ryan's workflow, which
>>> frequently doesn't touch PCI at all, I think downgrading the ASSERT to
>>> an error message is a good idea short-term.
>>>
>>> Daniil - could you make that change please?
>>>
>>> /
>>>      Leif
>>>
>>>
>>> I've been able to reproduce "PCIe CSR read failed to respond" and "SMBus
>>> transaction not claimed" errors on my Juno R2. I disabled Marvell Yukon
>>> driver (.dsc/.fdf) and removed ArmJunoDxe patch but still see the same
>>> error
>>> messages during the initial boot.
>>>
>>> Testing motherboard interfaces (FPGA build 118)...
>>> SRAM 32MB test: PASSED
>>> LAN9118   test: PASSED
>>> KMI1/2    test: PASSED
>>> MMC       test: PASSED
>>> PB/LEDs   test: PASSED
>>> FPGA UART test: PASSED
>>> ERROR: PCIe CSR read failed to respond
>>> ERROR: SMBus transaction not claimed
>>> ERROR: PCIe CSR read failed to respond
>>> ...
>>>
>>> Once it went through reporting these errors, the UEFI starts loading but
>>> still fails in OnEndOfDxe():
>>> ASSERT_EFI_ERROR (Status = Not Found)
>>> ASSERT [ArmJunoDxe]
>>>
>>> /home/user/workspace/juno/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(110):
>>> !EFI_ERROR (Status)
>>>
>>> This is the original ArmJunoDxe code:
>>>    Status = gBS->ConnectController (Handle, NULL,
>>> PciRootComplexDevicePath,
>>> FALSE);
>>>    ASSERT_EFI_ERROR (Status); <---- line 110
>>>
>>> I'm confused at which point in the commit history this is line 110.
>>>
>>> Just before your MAC commit, it's line 108. After your commit, and in
>>> the current mainline, it's line 394.
>>>
>>> Then, I have to go back to Jeremy's commit
>>> 58a4bff071832e587d18f97597b5d571dcebc9d7 on 27 July 2016 before I see
>>> another change to ArmJunoDxe.c. Before that commit, the assert appears
>>> on line 116.
>>>
>>> On my tree, 17.01, 16.12 and 16.11 has the assert on line 246, 16.10
>>> on line 125.
>>>
>>> So I'm unsure what code you are running.
>>>
>>>
>>> This shows the code where it fails for me after MAC address code was
>>> removed
>>> from ArmJunoDxe.c. It's part of my development branch and lines may not
>>> matching any public commits
>>
>> You should consider switching branch to mainline when trying to
>> compare and report problems with other people, then we have a common
>> baseline to work with.
>>
>>
>>>   ... but i pulled 16.06 release for a test and
>>
>> 16.06? Why so old? What's wrong with 16.12? You are missing firmware
>> updates and fixes by using such an old release.
>
>
> I  wanted to check how long this problem existed. Just picked up one of the
> old builds.
>
>
>>
>>> see the ASSERT on the same code in case of  "PCIe .." and "SMBus .."
>>> errors.
>>> The line number should be corresponding to
>>> 032082bf9906f82b77c161c1c2efbddfd4bb3751 commit:
>>>
>>> ASSERT_EFI_ERROR (Status = Not Found)
>>> ASSERT [ArmJunoDxe]
>>>
>>> /home/user/workspace/juno_16.06_orig/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(124):
>>> !EFI_ERROR (Status)
>>>
>>>
>>> but it actually fails here first:
>>>    Status = gBS->LocateDevicePath (&gEfiPciRootBridgeIoProtocolGuid,
>>>                                    &PciRootComplexDevicePath,
>>>                                    &Handle);
>>>
>>> Ryan - could you try to remove Marvell patches and check if you also
>>> catching "PCIe .." and "SMBus .." errors without them and your build
>>> still
>>> fails with other ASSERTs related to PCI.
>>>
>>> Leif - in my tests, if it fails with "PCIe .." and "SMBus ..", the UEFI
>>> PCI
>>> enumeration is completely corrupted.
>>>
>>> These errors do not appear if board was reset with the nPBRESET button.
>>> It
>>> never fails with "reset -w" and "reset -c". I also loaded Debian and used
>>> the shutdown command from there and got the same "PCIe .." and "SMBus .."
>>> errors after "reboot" command from the "Cmd>" prompt.  Possibly, the
>>> "reboot" command from the board shell prompt does not reset the board
>>> correctly so it looks like a firmware issue.
>>>
>>> Yes, that's pretty much what I've been saying. There appears to be a
>>> firmware bug that causes PCIe to be in an unstable state after
>>> software shutdown of the AP by the motherboard, followed by reset.
>>>
>>> The "PCIe CSR read failed to respond" error is nothing to do with EDK2
>>> and happens before EDK2 runs.
>>>
>>> However, unlike you, if I don't get the "PCIe CSR read failed to
>>> respond" error, upstream EDK2 before your MAC address commit will boot
>>> without ASSERTing on R0/1/2.
>>>
>>>
>>> Do you mean, when you do have the "PCIe .." error ? As i understand, if
>>> you
>>> power-cycle the board (no "PCIe .." errors) you do not see ASSERT at the
>>> MAC
>>> address code during the first boot and can start UEFI Shell?
>>>
>> Correct.
>>
>>
>>> So I've never seen the assert at line 110 as you pointed out above.
>>> That's curious. I can see why that would/should assert, but it doesn't
>>> for me. Perhaps it's something to do with the tree you are building
>>> from? Or the devices you have connected? I have no additional SATA or
>>> PCIe devices connected to my board.
>>>
>>> Cheers,
>>> Ryan.
>>>
>>>
>>> I am using R2 for the tests. The tests were done without SATA or PCI
>>> devices
>>> connected to the board. Unfortunately, I do not have R1 board at this
>>> moment
>>> to check it.
>>>
>> That's fine, R2 is good enough.
>>
>>
>>> This is my R2 firmware versions:
>>>
>>> ARM V2M-Juno Boot loader v1.0.0
>>> HBI0262 build 1844
>>>
>>> ARM V2M_Juno r2 Firmware v1.4.3
>>
>> v1.4.4 is the latest Juno firmware. You should update. Although it
>> won't fix these problems as it wasn't intended to.  And as you used
>> 16.06 for your baseline testing, I assume you probably also have old
>> SCP firmware in your environment too.
>>
>>
>>> Build Date: Apr 12 2016
>>>
>>> MIC RAM configuration (pms_v105.bin)...
>>>
>>> Configuring motherboard (rev D, var A)...
>>> IOFPGA image \MB\HBI0262D\io_b118.bit
>>>
>>> As Leif suggested, i will replace the ASSERT with an error message and
>>> send
>>> another patch.
>>>
>> Thanks, that's all we need to do, I think.
>>
>
> Done.
>
>>> Could you please check one thing for me on your boards? Use a build which
>>> worked for you (no ASSERT) and your steps: UEFI shell -> reset -s ->
>>> reboot
>>> -> see "PCIe .." and "SMBus .." errors -> boot UEFI.
>>
>> After I get the "PCIe .." and "SMBus .." errors, the board will never
>> boot into UEFI. It hangs forever and I have to power-cycle the board.
>> This rarely happens for me on R2.
>>
>> Sometimes, I do UEFI shell -> reset -s -> reboot and the board will
>> give the "PCIe .." and "SMBus .." error, but most of the time, it does
>> not and UEFI continues to boot.
>
>
> Interesting, if i do reset -s, i am getting "PCIe ..." error  and stop with
> ASSERT error in UEFI all the time.
>

Funny, but while testing mainline [1] before testing your v2 patch, R1
and R2 both crashed after software shutdown with:

ASSERT [ArmJunoDxe]
/linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(397):
!EFI_ERROR (Status)


But even worse than that, at one point, R2 did this:

Installation of the FDT using the device path
<VenHw(E7223039-5836-41E1-B542-D7EC736C5E59)/board.dtb> completed.
PCI: The size 0x7F000000 of the region 0x80000000 has been increased
to be a power of two for the AXI translation table.
PCI: The size 0x180000000 of the region 0x880000000 has been increased
to be a power of two for the AXI translation table.


Synchronous Exception at 0x00000000FDEC712C
PC 0x0000FDEC712C (0x0000FDEC5000+0x0000212C) [ 0] CpuIo2Dxe.dll
PC 0x0000FDEC6570 (0x0000FDEC5000+0x00001570) [ 0] CpuIo2Dxe.dll
PC 0x0000FDE449CC (0x0000FDE41000+0x000039CC) [ 1] PciHostBridge.dll
PC 0x0000F886A6B4 (0x0000F8864000+0x000066B4) [ 2] PciBusDxe.dll
PC 0x0000F8867C54 (0x0000F8864000+0x00003C54) [ 2] PciBusDxe.dll
PC 0x0000F8867C24 (0x0000F8864000+0x00003C24) [ 2] PciBusDxe.dll
PC 0x0000F886CA08 (0x0000F8864000+0x00008A08) [ 2] PciBusDxe.dll
PC 0x0000F88677A8 (0x0000F8864000+0x000037A8) [ 2] PciBusDxe.dll
PC 0x0000F8865920 (0x0000F8864000+0x00001920) [ 2] PciBusDxe.dll
PC 0x0000FE005F8C (0x0000FDFFA000+0x0000BF8C) [ 3] DxeCore.dll
PC 0x0000FE005358 (0x0000FDFFA000+0x0000B358) [ 3] DxeCore.dll
PC 0x0000F8A29D6C (0x0000F8A28000+0x00001D6C) [ 4] ArmJunoDxe.dll
PC 0x0000FE0177B8 (0x0000FDFFA000+0x0001D7B8) [ 5] DxeCore.dll
PC 0x0000FE016F38 (0x0000FDFFA000+0x0001CF38) [ 5] DxeCore.dll
PC 0x0000FE005134 (0x0000FDFFA000+0x0000B134) [ 5] DxeCore.dll
PC 0x0000FE0175C4 (0x0000FDFFA000+0x0001D5C4) [ 5] DxeCore.dll
PC 0x0000FE0179C0 (0x0000FDFFA000+0x0001D9C0) [ 5] DxeCore.dll
PC 0x0000FE017E68 (0x0000FDFFA000+0x0001DE68) [ 5] DxeCore.dll
PC 0x0000FDDEDA34 (0x0000FDDE1000+0x0000CA34) [ 6] BdsDxe.dll
PC 0x0000FDDFD534 (0x0000FDDE1000+0x0001C534) [ 6] BdsDxe.dll
PC 0x0000FDDE3E70 (0x0000FDDE1000+0x00002E70) [ 6] BdsDxe.dll
PC 0x0000FDFFC3DC (0x0000FDFFA000+0x000023DC) [ 7] DxeCore.dll
PC 0x0000FDFFB4A0 (0x0000FDFFA000+0x000014A0) [ 7] DxeCore.dll
PC 0x0000FDFFB024 (0x0000FDFFA000+0x00001024) [ 7] DxeCore.dll

[ 0] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/UefiCpuPkg/CpuIo2Dxe/CpuIo2Dxe/DEBUG/CpuIo2Dxe.dll
[ 1] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/ArmPlatformPkg/ArmJunoPkg/Drivers/PciHostBridgeDxe/PciHostBridgeDxe/DEBUG/PciHostBridge.dll
[ 2] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Bus/Pci/PciBusDxe/PciBusDxe/DEBUG/PciBusDxe.dll
[ 3] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
[ 4] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe/DEBUG/ArmJunoDxe.dll
[ 5] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
[ 6] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Universal/BdsDxe/BdsDxe/DEBUG/BdsDxe.dll
[ 7] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll

  X0 0x0000000057F00000   X1 0x00000000FDECC030   X2
0x0000000057F00000   X3 0x000000000000001A
  X4 0x00000000FD434E98   X5 0x00000000FDEC64D0   X6
0x00000000F887A080   X7 0x00000000FDE4A4B8
  X8 0x00000000FE02D690   X9 0x0000004100000000  X10
0x0000000057EFFFFF  X11 0xC000000000000000
 X12 0x0000000000000000  X13 0x0000000000000003  X14
0x0000000000000000  X15 0x0000000000000003
 X16 0x00000000FEFFFBF0  X17 0x0000000000000000  X18
0x0000000000000000  X19 0x0000000000000013
 X20 0x0000000000000000  X21 0x0000000000000000  X22
0x0000000000000000  X23 0x0000000000000000
 X24 0x0000000000000000  X25 0x0000000000000000  X26
0x0000000000000000  X27 0x0000000000000000
 X28 0x0000000000000000   FP 0x00000000FEFFF2F0   LR 0x00000000FDEC6570

  V0 0x0000000000000000 0000000000000000   V1 0xAB1C5ED5923F82A4
59F111F13956C25B
  V2 0x550C7DC3243185BE 12835B01D807AA98   V3 0xC19BF1749BDC06A7
80DEB1FE72BE5D74
  V4 0x240CA1CC0FC19DC6 EFBE4786E49B69C1   V5 0x76F988DA5CB0A9DC
4A7484AA2DE92C6F
  V6 0xBF597FC7B00327C8 A831C66D983E5152   V7 0x1429296706CA6351
D5A79147C6E00BF3
  V8 0x0000000000000000 2E1B213827B70A85   V9 0x0000000000000000
766A0ABB650A7354
 V10 0x0000000000000000 A81A664BA2BFE8A1  V11 0x0000000000000000
D6990624D192E819
 V12 0x0000000000000000 1E376C0819A4C116  V13 0x0000000000000000
4ED8AA4A391C0CB3
 V14 0x0000000000000000 78A5636F748F82EE  V15 0x0000000000000000
A4506CEB90BEFFFA
 V16 0x3C83709547545153 BC990E9F897D4FA8  V17 0xBEBEF297C5EC0578
E1D99AE8C2B92607
 V18 0x13242833C5F05BAB 101C24C521387481  V19 0x0A50ABCAD0BDDA12
996865483E880969
 V20 0x6C4ABAA53A9BE1CB 4416D9F479B08221  V21 0x60952147C3A68574
55B8AE51435C1A1A
 V22 0x9FEB2A3B4AB8D3BF 88C1883495C7F76F  V23 0xD0C224BC8FB77E09
3DB8D233CF470963
 V24 0x0E72963E02D21C93 C95B757DA62B3A12  V25 0x2A77DBA1E4EE5D5C
E08479A1B557DFA8
 V26 0xB593F86668CA8129 927A0019CDEC1A76  V27 0xD000200000E00002
A6201000E0008C04
 V28 0x8020000081000200 892080400880A022  V29 0x2085000000000136
801489A520002304
 V30 0x0801200082040060 0384E05702949200  V31 0x4064084112940202
1000458804200800

  SP 0x00000000FEFFF2D0  ELR 0x00000000FDEC712C  SPSR 0x60000209  FPSR
0x00000000
 ESR 0x96000210          FAR 0x0000000057F00000

 ESR : EC 0x25  IL 0x1  ISS 0x00000210

Data abort: Synchronous external abort

Stack dump:
  00000FEFFF1D0: 996865483E880969 0A50ABCAD0BDDA12 4416D9F479B08221
6C4ABAA53A9BE1CB
  00000FEFFF1F0: 55B8AE51435C1A1A 60952147C3A68574 88C1883495C7F76F
9FEB2A3B4AB8D3BF
  00000FEFFF210: 3DB8D233CF470963 D0C224BC8FB77E09 C95B757DA62B3A12
0E72963E02D21C93
  00000FEFFF230: E08479A1B557DFA8 2A77DBA1E4EE5D5C 927A0019CDEC1A76
B593F86668CA8129
  00000FEFFF250: A6201000E0008C04 D000200000E00002 892080400880A022
8020000081000200
  00000FEFFF270: 801489A520002304 2085000000000136 0384E05702949200
0801200082040060
  00000FEFFF290: 1000458804200800 4064084112940202 00000000FDEC712C
0000000060000209
  00000FEFFF2B0: 0000000000000000 0000000096000210 0000000057F00000
000000000000001A
> 00000FEFFF2D0: 0000000057F00000 0000000057F00000 FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF
  00000FEFFF2F0: 00000000FEFFF350 00000000FDE449CC A831C66D983E5152
00000000FD434E98
  00000FEFFF310: 000000000000001A 0000000057F00000 0000000057F00000
00000000FDECC000
  00000FEFFF330: 00000000FD434E98 0101000000000000 0000000000000000
00000000FD434E98
  00000FEFFF350: 00000000FEFFF3A0 00000000F886A6B4 00000000FEFFF3A0
00000000FD434E98
  00000FEFFF370: 000000000000001A 0000000057F00000 00000000FEFFF3A0
00000000FD46C860
  00000FEFFF390: 00000000FDECC000 00000000FD46C798 00000000FEFFF420
00000000F8867C54
  00000FEFFF3B0: 0000000057F00000 00000000FDDB8018 00000000FEFFF400
57F00000F8879090
ASSERT [ArmCpuDxe]
/linaro/platforms/uefi/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c(265):
((BOOLEAN)(0==1))


Anyway, these problems are all due to dodgy lower level firmware,
which should be fixed eventually. And your v2 patch removing the
assert is out best way forward.

Cheers,
Ryan.


>
>>
>>
>>> As i understand, you do
>>> not have the ASSERT with your old builds so you can start the UEFI shell.
>>> Use the "pci" shell command to list all PCI devices. Do you see them
>>> correctly enumerated? I'd like to make sure that by ignoring the ASSERT
>>> errors we will not face any further problems with PCI.
>>>
>> In the case where my board continues to boot after software shutdown,
>> the pci shell command shows this:
>>
>> Shell> pci
>>     Seg  Bus  Dev  Func
>>     ---  ---  ---  ----
>>      00   00   00    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 1556 Device 1100 Prog Interface 0
>>      00   01   00    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 111D Device 8090 Prog Interface 0
>>      00   02   01    00 ==> Pre 2.0 device - All devices other than VGA
>>               Vendor 0000 Device 0000 Prog Interface 0
>>      00   02   02    00 ==> Pre 2.0 device - All devices other than VGA
>>               Vendor 0000 Device 0000 Prog Interface 0
>>      00   02   03    00 ==> Pre 2.0 device - All devices other than VGA
>>               Vendor 0000 Device 0000 Prog Interface 0
>>      00   02   0C    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 111D Device 8090 Prog Interface 0
>>      00   02   10    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 111D Device 8090 Prog Interface 0
>>      00   02   1F    00 ==> Pre 2.0 device - All devices other than VGA
>>               Vendor 0000 Device 0000 Prog Interface 0
>>
>> Which is not the same as when the board is power cycled. On power
>> cycle, I see this:
>>
>>     Seg  Bus  Dev  Func
>>     ---  ---  ---  ----
>>      00   00   00    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 1556 Device 1100 Prog Interface 0
>>      00   01   00    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 111D Device 8090 Prog Interface 0
>>      00   02   01    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 111D Device 8090 Prog Interface 0
>>      00   02   02    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 111D Device 8090 Prog Interface 0
>>      00   02   03    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 111D Device 8090 Prog Interface 0
>>      00   02   0C    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 111D Device 8090 Prog Interface 0
>>      00   02   10    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 111D Device 8090 Prog Interface 0
>>      00   02   1F    00 ==> Bridge Device - PCI/PCI bridge
>>               Vendor 111D Device 8090 Prog Interface 0
>>      00   03   00    00 ==> Mass Storage Controller - Other mass storage
>> controll
>> er
>>               Vendor 1095 Device 3132 Prog Interface 0
>>      00   08   00    00 ==> Network Controller - Ethernet controller
>>               Vendor 11AB Device 4380 Prog Interface 0
>>
>> Cheers,
>> Ryan.
>>
>>
>
> If I let UEFI boot to the shell after "PCIe ..." error, the pci shell
> command shows me a long list of unrecognized PCI devices with all
> combinations of BDFs.
>
> Thanks,
> Daniil
>
>
>>> Thanks,
>>> Daniil
>>>
>>>
>>> Thanks,
>>> Daniil
>>>
>>>
>>> Prior to your "Set Marvell Yukon MAC address" patch, or with the
>>> earlier version, the board would boot anyway, but the Yukon device
>>> would be missing.
>>>
>>> Now it dies.
>>>
>>> I don't know which is worse, but I think hanging is worse than an
>>> ethernet port dropping out. Although hanging is a bit more obvious
>>> that there's a problem...
>>>
>>>
>>> +  }
>>>   }
>>>
>>>   STATIC
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>>
>>>
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>>
>>>
>> _______________________________________________
>> edk2-devel mailing list
>> edk2-devel@lists.01.org
>> https://lists.01.org/mailman/listinfo/edk2-devel
>
>

[1] mainline is this commit at the moment:
5734d48  2017-01-22  ShellPkg SmbiosView: Add decoding of SMBIOS spec
3.1.1     [Star Zeng]


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
  2017-01-24 11:05               ` Ryan Harkin
@ 2017-01-24 12:34                 ` Ryan Harkin
  0 siblings, 0 replies; 11+ messages in thread
From: Ryan Harkin @ 2017-01-24 12:34 UTC (permalink / raw)
  To: Daniil Egranov; +Cc: edk2-devel@lists.01.org, Leif Lindholm

On 24 January 2017 at 11:05, Ryan Harkin <ryan.harkin@linaro.org> wrote:
> On 24 January 2017 at 02:19, Daniil Egranov <daniil.egranov@arm.com> wrote:
>> Hi Ryan,
>>
>>
>>
>> On 01/23/2017 06:56 AM, Ryan Harkin wrote:
>>>
>>> On 20 January 2017 at 20:57, Daniil Egranov <daniil.egranov@arm.com>
>>> wrote:
>>>>
>>>> Hi Ryan,
>>>>
>>>>
>>>> On 01/20/2017 04:30 AM, Ryan Harkin wrote:
>>>>
>>>> On 20 January 2017 at 01:34, Daniil Egranov <daniil.egranov@arm.com>
>>>> wrote:
>>>>
>>>> Hi Leif, Ryan
>>>>
>>>>
>>>> On 01/19/2017 09:13 AM, Leif Lindholm wrote:
>>>>
>>>> On Thu, Jan 19, 2017 at 01:49:04PM +0000, Ryan Harkin wrote:
>>>>
>>>> On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com>
>>>> wrote:
>>>>
>>>> The Marvell Yukon MAC address load supported only on Juno R1 and R2.
>>>> It disabled for Juno R0 due to PCI issues on this board.
>>>>
>>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>>> Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
>>>>
>>>> Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
>>>>
>>>> ---
>>>>   ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
>>>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>>> b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>>> index 47ff587..e9e6990 100644
>>>> --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>>> +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>>> @@ -378,6 +378,7 @@ OnEndOfDxe (
>>>>     EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
>>>>     EFI_HANDLE                Handle;
>>>>     EFI_STATUS                Status;
>>>> +  UINT32                    JunoRevision;
>>>>
>>>>     //
>>>>     // PCI Root Complex initialization
>>>> @@ -393,8 +394,12 @@ OnEndOfDxe (
>>>>     Status = gBS->ConnectController (Handle, NULL,
>>>> PciRootComplexDevicePath,
>>>> FALSE);
>>>>     ASSERT_EFI_ERROR (Status);
>>>>
>>>> -  Status = ArmJunoSetNicMacAddress ();
>>>> -  ASSERT_EFI_ERROR (Status);
>>>> +  GetJunoRevision (JunoRevision);
>>>> +
>>>> +  if (JunoRevision != JUNO_REVISION_R0) {
>>>> +    Status = ArmJunoSetNicMacAddress ();
>>>> +    ASSERT_EFI_ERROR (Status);
>>>>
>>>> This is just an FYI, but I stacked your patch on top of mainline, like
>>>> this:
>>>>
>>>> 5f81f61  2017-01-18  ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
>>>> Fixed crash on Juno R0       [Daniil Egranov]
>>>> 19ca06b  2017-01-19  OvmfPkg: Remove superfluous return statements.
>>>>   [Thomas Huth]
>>>>
>>>> The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
>>>> assert triggered:
>>>>
>>>> UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
>>>> [snip]
>>>> ASSERT_EFI_ERROR (Status = Not Found)
>>>> ASSERT [ArmJunoDxe]
>>>>
>>>> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
>>>> !EFI_ERROR (Status)
>>>>
>>>> I worked out what is happening. And it's not to do with this patch.
>>>> It's another fall-out from the re-work you did to the previous patch.
>>>> It's also ultimately due to a bug the firmware.
>>>>
>>>> With the initial version of your "Set Marvell Yukon MAC address"
>>>> patch, this hang didn't happen. I suspect that was because your error
>>>> checking was weaker and certain PCIe failures didn't trigger the
>>>> assert.
>>>>
>>>> To reproduce the error with this commit:
>>>> 1) power on and boot R1 or R2 into Shell
>>>>    I do this by interrupting the boot by pressing ESCAPE and using the
>>>> boot
>>>> menu
>>>> 2) At the Shell prompt, run "reset -s" to shutdown
>>>> 3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
>>>> 4) the board will hang while booting UEFI, assuming the board firmware
>>>> doesn't die with constant messages like this:
>>>>
>>>>      ERROR: PCIe CSR read failed to respond
>>>>      ERROR: SMBus transaction not claimed
>>>>
>>>> Assuming the problem is firmware, not EDK2, what should we do about it?
>>>>
>>>> OK, so instinctively, my reaction was that "the reset -s bug is a
>>>> system controller firmware bug and we shouldn't work around
>>>> it". However, since it is actually disrupting Ryan's workflow, which
>>>> frequently doesn't touch PCI at all, I think downgrading the ASSERT to
>>>> an error message is a good idea short-term.
>>>>
>>>> Daniil - could you make that change please?
>>>>
>>>> /
>>>>      Leif
>>>>
>>>>
>>>> I've been able to reproduce "PCIe CSR read failed to respond" and "SMBus
>>>> transaction not claimed" errors on my Juno R2. I disabled Marvell Yukon
>>>> driver (.dsc/.fdf) and removed ArmJunoDxe patch but still see the same
>>>> error
>>>> messages during the initial boot.
>>>>
>>>> Testing motherboard interfaces (FPGA build 118)...
>>>> SRAM 32MB test: PASSED
>>>> LAN9118   test: PASSED
>>>> KMI1/2    test: PASSED
>>>> MMC       test: PASSED
>>>> PB/LEDs   test: PASSED
>>>> FPGA UART test: PASSED
>>>> ERROR: PCIe CSR read failed to respond
>>>> ERROR: SMBus transaction not claimed
>>>> ERROR: PCIe CSR read failed to respond
>>>> ...
>>>>
>>>> Once it went through reporting these errors, the UEFI starts loading but
>>>> still fails in OnEndOfDxe():
>>>> ASSERT_EFI_ERROR (Status = Not Found)
>>>> ASSERT [ArmJunoDxe]
>>>>
>>>> /home/user/workspace/juno/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(110):
>>>> !EFI_ERROR (Status)
>>>>
>>>> This is the original ArmJunoDxe code:
>>>>    Status = gBS->ConnectController (Handle, NULL,
>>>> PciRootComplexDevicePath,
>>>> FALSE);
>>>>    ASSERT_EFI_ERROR (Status); <---- line 110
>>>>
>>>> I'm confused at which point in the commit history this is line 110.
>>>>
>>>> Just before your MAC commit, it's line 108. After your commit, and in
>>>> the current mainline, it's line 394.
>>>>
>>>> Then, I have to go back to Jeremy's commit
>>>> 58a4bff071832e587d18f97597b5d571dcebc9d7 on 27 July 2016 before I see
>>>> another change to ArmJunoDxe.c. Before that commit, the assert appears
>>>> on line 116.
>>>>
>>>> On my tree, 17.01, 16.12 and 16.11 has the assert on line 246, 16.10
>>>> on line 125.
>>>>
>>>> So I'm unsure what code you are running.
>>>>
>>>>
>>>> This shows the code where it fails for me after MAC address code was
>>>> removed
>>>> from ArmJunoDxe.c. It's part of my development branch and lines may not
>>>> matching any public commits
>>>
>>> You should consider switching branch to mainline when trying to
>>> compare and report problems with other people, then we have a common
>>> baseline to work with.
>>>
>>>
>>>>   ... but i pulled 16.06 release for a test and
>>>
>>> 16.06? Why so old? What's wrong with 16.12? You are missing firmware
>>> updates and fixes by using such an old release.
>>
>>
>> I  wanted to check how long this problem existed. Just picked up one of the
>> old builds.
>>
>>
>>>
>>>> see the ASSERT on the same code in case of  "PCIe .." and "SMBus .."
>>>> errors.
>>>> The line number should be corresponding to
>>>> 032082bf9906f82b77c161c1c2efbddfd4bb3751 commit:
>>>>
>>>> ASSERT_EFI_ERROR (Status = Not Found)
>>>> ASSERT [ArmJunoDxe]
>>>>
>>>> /home/user/workspace/juno_16.06_orig/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(124):
>>>> !EFI_ERROR (Status)
>>>>
>>>>
>>>> but it actually fails here first:
>>>>    Status = gBS->LocateDevicePath (&gEfiPciRootBridgeIoProtocolGuid,
>>>>                                    &PciRootComplexDevicePath,
>>>>                                    &Handle);
>>>>
>>>> Ryan - could you try to remove Marvell patches and check if you also
>>>> catching "PCIe .." and "SMBus .." errors without them and your build
>>>> still
>>>> fails with other ASSERTs related to PCI.
>>>>
>>>> Leif - in my tests, if it fails with "PCIe .." and "SMBus ..", the UEFI
>>>> PCI
>>>> enumeration is completely corrupted.
>>>>
>>>> These errors do not appear if board was reset with the nPBRESET button.
>>>> It
>>>> never fails with "reset -w" and "reset -c". I also loaded Debian and used
>>>> the shutdown command from there and got the same "PCIe .." and "SMBus .."
>>>> errors after "reboot" command from the "Cmd>" prompt.  Possibly, the
>>>> "reboot" command from the board shell prompt does not reset the board
>>>> correctly so it looks like a firmware issue.
>>>>
>>>> Yes, that's pretty much what I've been saying. There appears to be a
>>>> firmware bug that causes PCIe to be in an unstable state after
>>>> software shutdown of the AP by the motherboard, followed by reset.
>>>>
>>>> The "PCIe CSR read failed to respond" error is nothing to do with EDK2
>>>> and happens before EDK2 runs.
>>>>
>>>> However, unlike you, if I don't get the "PCIe CSR read failed to
>>>> respond" error, upstream EDK2 before your MAC address commit will boot
>>>> without ASSERTing on R0/1/2.
>>>>
>>>>
>>>> Do you mean, when you do have the "PCIe .." error ? As i understand, if
>>>> you
>>>> power-cycle the board (no "PCIe .." errors) you do not see ASSERT at the
>>>> MAC
>>>> address code during the first boot and can start UEFI Shell?
>>>>
>>> Correct.
>>>
>>>
>>>> So I've never seen the assert at line 110 as you pointed out above.
>>>> That's curious. I can see why that would/should assert, but it doesn't
>>>> for me. Perhaps it's something to do with the tree you are building
>>>> from? Or the devices you have connected? I have no additional SATA or
>>>> PCIe devices connected to my board.
>>>>
>>>> Cheers,
>>>> Ryan.
>>>>
>>>>
>>>> I am using R2 for the tests. The tests were done without SATA or PCI
>>>> devices
>>>> connected to the board. Unfortunately, I do not have R1 board at this
>>>> moment
>>>> to check it.
>>>>
>>> That's fine, R2 is good enough.
>>>
>>>
>>>> This is my R2 firmware versions:
>>>>
>>>> ARM V2M-Juno Boot loader v1.0.0
>>>> HBI0262 build 1844
>>>>
>>>> ARM V2M_Juno r2 Firmware v1.4.3
>>>
>>> v1.4.4 is the latest Juno firmware. You should update. Although it
>>> won't fix these problems as it wasn't intended to.  And as you used
>>> 16.06 for your baseline testing, I assume you probably also have old
>>> SCP firmware in your environment too.
>>>
>>>
>>>> Build Date: Apr 12 2016
>>>>
>>>> MIC RAM configuration (pms_v105.bin)...
>>>>
>>>> Configuring motherboard (rev D, var A)...
>>>> IOFPGA image \MB\HBI0262D\io_b118.bit
>>>>
>>>> As Leif suggested, i will replace the ASSERT with an error message and
>>>> send
>>>> another patch.
>>>>
>>> Thanks, that's all we need to do, I think.
>>>
>>
>> Done.
>>
>>>> Could you please check one thing for me on your boards? Use a build which
>>>> worked for you (no ASSERT) and your steps: UEFI shell -> reset -s ->
>>>> reboot
>>>> -> see "PCIe .." and "SMBus .." errors -> boot UEFI.
>>>
>>> After I get the "PCIe .." and "SMBus .." errors, the board will never
>>> boot into UEFI. It hangs forever and I have to power-cycle the board.
>>> This rarely happens for me on R2.
>>>
>>> Sometimes, I do UEFI shell -> reset -s -> reboot and the board will
>>> give the "PCIe .." and "SMBus .." error, but most of the time, it does
>>> not and UEFI continues to boot.
>>
>>
>> Interesting, if i do reset -s, i am getting "PCIe ..." error  and stop with
>> ASSERT error in UEFI all the time.
>>
>
> Funny, but while testing mainline [1] before testing your v2 patch, R1
> and R2 both crashed after software shutdown with:
>
> ASSERT [ArmJunoDxe]
> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(397):
> !EFI_ERROR (Status)
>
>
> But even worse than that, at one point, R2 did this:
>
> Installation of the FDT using the device path
> <VenHw(E7223039-5836-41E1-B542-D7EC736C5E59)/board.dtb> completed.
> PCI: The size 0x7F000000 of the region 0x80000000 has been increased
> to be a power of two for the AXI translation table.
> PCI: The size 0x180000000 of the region 0x880000000 has been increased
> to be a power of two for the AXI translation table.
>
>
> Synchronous Exception at 0x00000000FDEC712C
> PC 0x0000FDEC712C (0x0000FDEC5000+0x0000212C) [ 0] CpuIo2Dxe.dll
> PC 0x0000FDEC6570 (0x0000FDEC5000+0x00001570) [ 0] CpuIo2Dxe.dll
> PC 0x0000FDE449CC (0x0000FDE41000+0x000039CC) [ 1] PciHostBridge.dll
> PC 0x0000F886A6B4 (0x0000F8864000+0x000066B4) [ 2] PciBusDxe.dll
> PC 0x0000F8867C54 (0x0000F8864000+0x00003C54) [ 2] PciBusDxe.dll
> PC 0x0000F8867C24 (0x0000F8864000+0x00003C24) [ 2] PciBusDxe.dll
> PC 0x0000F886CA08 (0x0000F8864000+0x00008A08) [ 2] PciBusDxe.dll
> PC 0x0000F88677A8 (0x0000F8864000+0x000037A8) [ 2] PciBusDxe.dll
> PC 0x0000F8865920 (0x0000F8864000+0x00001920) [ 2] PciBusDxe.dll
> PC 0x0000FE005F8C (0x0000FDFFA000+0x0000BF8C) [ 3] DxeCore.dll
> PC 0x0000FE005358 (0x0000FDFFA000+0x0000B358) [ 3] DxeCore.dll
> PC 0x0000F8A29D6C (0x0000F8A28000+0x00001D6C) [ 4] ArmJunoDxe.dll
> PC 0x0000FE0177B8 (0x0000FDFFA000+0x0001D7B8) [ 5] DxeCore.dll
> PC 0x0000FE016F38 (0x0000FDFFA000+0x0001CF38) [ 5] DxeCore.dll
> PC 0x0000FE005134 (0x0000FDFFA000+0x0000B134) [ 5] DxeCore.dll
> PC 0x0000FE0175C4 (0x0000FDFFA000+0x0001D5C4) [ 5] DxeCore.dll
> PC 0x0000FE0179C0 (0x0000FDFFA000+0x0001D9C0) [ 5] DxeCore.dll
> PC 0x0000FE017E68 (0x0000FDFFA000+0x0001DE68) [ 5] DxeCore.dll
> PC 0x0000FDDEDA34 (0x0000FDDE1000+0x0000CA34) [ 6] BdsDxe.dll
> PC 0x0000FDDFD534 (0x0000FDDE1000+0x0001C534) [ 6] BdsDxe.dll
> PC 0x0000FDDE3E70 (0x0000FDDE1000+0x00002E70) [ 6] BdsDxe.dll
> PC 0x0000FDFFC3DC (0x0000FDFFA000+0x000023DC) [ 7] DxeCore.dll
> PC 0x0000FDFFB4A0 (0x0000FDFFA000+0x000014A0) [ 7] DxeCore.dll
> PC 0x0000FDFFB024 (0x0000FDFFA000+0x00001024) [ 7] DxeCore.dll
>
> [ 0] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/UefiCpuPkg/CpuIo2Dxe/CpuIo2Dxe/DEBUG/CpuIo2Dxe.dll
> [ 1] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/ArmPlatformPkg/ArmJunoPkg/Drivers/PciHostBridgeDxe/PciHostBridgeDxe/DEBUG/PciHostBridge.dll
> [ 2] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Bus/Pci/PciBusDxe/PciBusDxe/DEBUG/PciBusDxe.dll
> [ 3] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
> [ 4] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe/DEBUG/ArmJunoDxe.dll
> [ 5] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
> [ 6] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Universal/BdsDxe/BdsDxe/DEBUG/BdsDxe.dll
> [ 7] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
>
>   X0 0x0000000057F00000   X1 0x00000000FDECC030   X2
> 0x0000000057F00000   X3 0x000000000000001A
>   X4 0x00000000FD434E98   X5 0x00000000FDEC64D0   X6
> 0x00000000F887A080   X7 0x00000000FDE4A4B8
>   X8 0x00000000FE02D690   X9 0x0000004100000000  X10
> 0x0000000057EFFFFF  X11 0xC000000000000000
>  X12 0x0000000000000000  X13 0x0000000000000003  X14
> 0x0000000000000000  X15 0x0000000000000003
>  X16 0x00000000FEFFFBF0  X17 0x0000000000000000  X18
> 0x0000000000000000  X19 0x0000000000000013
>  X20 0x0000000000000000  X21 0x0000000000000000  X22
> 0x0000000000000000  X23 0x0000000000000000
>  X24 0x0000000000000000  X25 0x0000000000000000  X26
> 0x0000000000000000  X27 0x0000000000000000
>  X28 0x0000000000000000   FP 0x00000000FEFFF2F0   LR 0x00000000FDEC6570
>
>   V0 0x0000000000000000 0000000000000000   V1 0xAB1C5ED5923F82A4
> 59F111F13956C25B
>   V2 0x550C7DC3243185BE 12835B01D807AA98   V3 0xC19BF1749BDC06A7
> 80DEB1FE72BE5D74
>   V4 0x240CA1CC0FC19DC6 EFBE4786E49B69C1   V5 0x76F988DA5CB0A9DC
> 4A7484AA2DE92C6F
>   V6 0xBF597FC7B00327C8 A831C66D983E5152   V7 0x1429296706CA6351
> D5A79147C6E00BF3
>   V8 0x0000000000000000 2E1B213827B70A85   V9 0x0000000000000000
> 766A0ABB650A7354
>  V10 0x0000000000000000 A81A664BA2BFE8A1  V11 0x0000000000000000
> D6990624D192E819
>  V12 0x0000000000000000 1E376C0819A4C116  V13 0x0000000000000000
> 4ED8AA4A391C0CB3
>  V14 0x0000000000000000 78A5636F748F82EE  V15 0x0000000000000000
> A4506CEB90BEFFFA
>  V16 0x3C83709547545153 BC990E9F897D4FA8  V17 0xBEBEF297C5EC0578
> E1D99AE8C2B92607
>  V18 0x13242833C5F05BAB 101C24C521387481  V19 0x0A50ABCAD0BDDA12
> 996865483E880969
>  V20 0x6C4ABAA53A9BE1CB 4416D9F479B08221  V21 0x60952147C3A68574
> 55B8AE51435C1A1A
>  V22 0x9FEB2A3B4AB8D3BF 88C1883495C7F76F  V23 0xD0C224BC8FB77E09
> 3DB8D233CF470963
>  V24 0x0E72963E02D21C93 C95B757DA62B3A12  V25 0x2A77DBA1E4EE5D5C
> E08479A1B557DFA8
>  V26 0xB593F86668CA8129 927A0019CDEC1A76  V27 0xD000200000E00002
> A6201000E0008C04
>  V28 0x8020000081000200 892080400880A022  V29 0x2085000000000136
> 801489A520002304
>  V30 0x0801200082040060 0384E05702949200  V31 0x4064084112940202
> 1000458804200800
>
>   SP 0x00000000FEFFF2D0  ELR 0x00000000FDEC712C  SPSR 0x60000209  FPSR
> 0x00000000
>  ESR 0x96000210          FAR 0x0000000057F00000
>
>  ESR : EC 0x25  IL 0x1  ISS 0x00000210
>
> Data abort: Synchronous external abort
>
> Stack dump:
>   00000FEFFF1D0: 996865483E880969 0A50ABCAD0BDDA12 4416D9F479B08221
> 6C4ABAA53A9BE1CB
>   00000FEFFF1F0: 55B8AE51435C1A1A 60952147C3A68574 88C1883495C7F76F
> 9FEB2A3B4AB8D3BF
>   00000FEFFF210: 3DB8D233CF470963 D0C224BC8FB77E09 C95B757DA62B3A12
> 0E72963E02D21C93
>   00000FEFFF230: E08479A1B557DFA8 2A77DBA1E4EE5D5C 927A0019CDEC1A76
> B593F86668CA8129
>   00000FEFFF250: A6201000E0008C04 D000200000E00002 892080400880A022
> 8020000081000200
>   00000FEFFF270: 801489A520002304 2085000000000136 0384E05702949200
> 0801200082040060
>   00000FEFFF290: 1000458804200800 4064084112940202 00000000FDEC712C
> 0000000060000209
>   00000FEFFF2B0: 0000000000000000 0000000096000210 0000000057F00000
> 000000000000001A
>> 00000FEFFF2D0: 0000000057F00000 0000000057F00000 FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF
>   00000FEFFF2F0: 00000000FEFFF350 00000000FDE449CC A831C66D983E5152
> 00000000FD434E98
>   00000FEFFF310: 000000000000001A 0000000057F00000 0000000057F00000
> 00000000FDECC000
>   00000FEFFF330: 00000000FD434E98 0101000000000000 0000000000000000
> 00000000FD434E98
>   00000FEFFF350: 00000000FEFFF3A0 00000000F886A6B4 00000000FEFFF3A0
> 00000000FD434E98
>   00000FEFFF370: 000000000000001A 0000000057F00000 00000000FEFFF3A0
> 00000000FD46C860
>   00000FEFFF390: 00000000FDECC000 00000000FD46C798 00000000FEFFF420
> 00000000F8867C54
>   00000FEFFF3B0: 0000000057F00000 00000000FDDB8018 00000000FEFFF400
> 57F00000F8879090
> ASSERT [ArmCpuDxe]
> /linaro/platforms/uefi/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c(265):
> ((BOOLEAN)(0==1))
>
>

I just wanted to add to this problem report in case we start tracking it again.

I've been testing software shutdown and reboot using upstream commit
5734d48 with Daniil's v2 patch applied and rebooting (with "reset -s")
in a loop.

Eventually after many reboots, I got Juno R1 to report the same error
as R0 on boot:

ERROR: PCIe CSR read failed to respond
ERROR: SMBus transaction not claimed

However, unlike R0 which reports this forever in a loop, R1 reports
those two error messages 100 times, then reports this before
continuing to boot the board:

ERROR: reading PCIE Switch VID
PCIe MAC addresss 0002-F700-60BB
MAC addrs test: PASSED

SMC MAC address 0002-F700-60B9
Setting HDMI0 mode for SVGA.
Setting HDMI1 mode for SVGA.

SoC SMB clock enabled.

Testing SMB clock...
SMB clock running
Releasing system resets...

UART0 set to SoC UART0
UART1 set to SoC UART1

NOTICE:  Booting Trusted Firmware
[snip]

I've not seen boot continue after the "PCIe CSR ..." errors before. At
this point, as Daniil reported, EDK2 hangs detecting the controller:

ASSERT [ArmJunoDxe]
/linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(395):
!EFI_ERROR (Status)

Where, in my tree, I see ArmJunoDxe.c has these lines on code at line 394/395:

394:  Status = gBS->ConnectController (Handle, NULL,
PciRootComplexDevicePath, FALSE);
395:  ASSERT_EFI_ERROR (Status);


I'm going to stop looking into this issue now until I get new
motherboard firmware with a proposed fix.


> Anyway, these problems are all due to dodgy lower level firmware,
> which should be fixed eventually. And your v2 patch removing the
> assert is out best way forward.
>
> Cheers,
> Ryan.
>
>
>>
>>>
>>>
>>>> As i understand, you do
>>>> not have the ASSERT with your old builds so you can start the UEFI shell.
>>>> Use the "pci" shell command to list all PCI devices. Do you see them
>>>> correctly enumerated? I'd like to make sure that by ignoring the ASSERT
>>>> errors we will not face any further problems with PCI.
>>>>
>>> In the case where my board continues to boot after software shutdown,
>>> the pci shell command shows this:
>>>
>>> Shell> pci
>>>     Seg  Bus  Dev  Func
>>>     ---  ---  ---  ----
>>>      00   00   00    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 1556 Device 1100 Prog Interface 0
>>>      00   01   00    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 111D Device 8090 Prog Interface 0
>>>      00   02   01    00 ==> Pre 2.0 device - All devices other than VGA
>>>               Vendor 0000 Device 0000 Prog Interface 0
>>>      00   02   02    00 ==> Pre 2.0 device - All devices other than VGA
>>>               Vendor 0000 Device 0000 Prog Interface 0
>>>      00   02   03    00 ==> Pre 2.0 device - All devices other than VGA
>>>               Vendor 0000 Device 0000 Prog Interface 0
>>>      00   02   0C    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 111D Device 8090 Prog Interface 0
>>>      00   02   10    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 111D Device 8090 Prog Interface 0
>>>      00   02   1F    00 ==> Pre 2.0 device - All devices other than VGA
>>>               Vendor 0000 Device 0000 Prog Interface 0
>>>
>>> Which is not the same as when the board is power cycled. On power
>>> cycle, I see this:
>>>
>>>     Seg  Bus  Dev  Func
>>>     ---  ---  ---  ----
>>>      00   00   00    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 1556 Device 1100 Prog Interface 0
>>>      00   01   00    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 111D Device 8090 Prog Interface 0
>>>      00   02   01    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 111D Device 8090 Prog Interface 0
>>>      00   02   02    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 111D Device 8090 Prog Interface 0
>>>      00   02   03    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 111D Device 8090 Prog Interface 0
>>>      00   02   0C    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 111D Device 8090 Prog Interface 0
>>>      00   02   10    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 111D Device 8090 Prog Interface 0
>>>      00   02   1F    00 ==> Bridge Device - PCI/PCI bridge
>>>               Vendor 111D Device 8090 Prog Interface 0
>>>      00   03   00    00 ==> Mass Storage Controller - Other mass storage
>>> controll
>>> er
>>>               Vendor 1095 Device 3132 Prog Interface 0
>>>      00   08   00    00 ==> Network Controller - Ethernet controller
>>>               Vendor 11AB Device 4380 Prog Interface 0
>>>
>>> Cheers,
>>> Ryan.
>>>
>>>
>>
>> If I let UEFI boot to the shell after "PCIe ..." error, the pci shell
>> command shows me a long list of unrecognized PCI devices with all
>> combinations of BDFs.
>>
>> Thanks,
>> Daniil
>>
>>
>>>> Thanks,
>>>> Daniil
>>>>
>>>>
>>>> Thanks,
>>>> Daniil
>>>>
>>>>
>>>> Prior to your "Set Marvell Yukon MAC address" patch, or with the
>>>> earlier version, the board would boot anyway, but the Yukon device
>>>> would be missing.
>>>>
>>>> Now it dies.
>>>>
>>>> I don't know which is worse, but I think hanging is worse than an
>>>> ethernet port dropping out. Although hanging is a bit more obvious
>>>> that there's a problem...
>>>>
>>>>
>>>> +  }
>>>>   }
>>>>
>>>>   STATIC
>>>> --
>>>> 2.7.4
>>>>
>>>> _______________________________________________
>>>> edk2-devel mailing list
>>>> edk2-devel@lists.01.org
>>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>>>
>>>>
>>>> _______________________________________________
>>>> edk2-devel mailing list
>>>> edk2-devel@lists.01.org
>>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>>>
>>>>
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>
>>
>
> [1] mainline is this commit at the moment:
> 5734d48  2017-01-22  ShellPkg SmbiosView: Add decoding of SMBIOS spec
> 3.1.1     [Star Zeng]


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-01-24 12:34 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-18 23:27 [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0 Daniil Egranov
2017-01-19 13:49 ` Ryan Harkin
2017-01-19 15:13   ` Leif Lindholm
2017-01-20  1:34     ` Daniil Egranov
2017-01-20 10:30       ` Ryan Harkin
2017-01-20 20:57         ` Daniil Egranov
2017-01-23 11:26           ` Ryan Harkin
2017-01-23 12:56           ` Ryan Harkin
2017-01-24  2:19             ` Daniil Egranov
2017-01-24 11:05               ` Ryan Harkin
2017-01-24 12:34                 ` Ryan Harkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox