From: Ryan Harkin <ryan.harkin@linaro.org>
To: Daniil Egranov <daniil.egranov@arm.com>
Cc: "edk2-devel@lists.01.org" <edk2-devel@lists.01.org>,
Leif Lindholm <leif.lindholm@linaro.org>
Subject: Re: [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0
Date: Tue, 24 Jan 2017 12:34:37 +0000 [thread overview]
Message-ID: <CAD0U-hL7w-A5BM7p1GP0m6mPrX4pGcG8VUVQHFuOCbE6+KuDcw@mail.gmail.com> (raw)
In-Reply-To: <CAD0U-hLaWEYhHLtqk31T5ERtOTz7yLO5Mv4t__84Ui4YpgHH4Q@mail.gmail.com>
On 24 January 2017 at 11:05, Ryan Harkin <ryan.harkin@linaro.org> wrote:
> On 24 January 2017 at 02:19, Daniil Egranov <daniil.egranov@arm.com> wrote:
>> Hi Ryan,
>>
>>
>>
>> On 01/23/2017 06:56 AM, Ryan Harkin wrote:
>>>
>>> On 20 January 2017 at 20:57, Daniil Egranov <daniil.egranov@arm.com>
>>> wrote:
>>>>
>>>> Hi Ryan,
>>>>
>>>>
>>>> On 01/20/2017 04:30 AM, Ryan Harkin wrote:
>>>>
>>>> On 20 January 2017 at 01:34, Daniil Egranov <daniil.egranov@arm.com>
>>>> wrote:
>>>>
>>>> Hi Leif, Ryan
>>>>
>>>>
>>>> On 01/19/2017 09:13 AM, Leif Lindholm wrote:
>>>>
>>>> On Thu, Jan 19, 2017 at 01:49:04PM +0000, Ryan Harkin wrote:
>>>>
>>>> On 18 January 2017 at 23:27, Daniil Egranov <daniil.egranov@arm.com>
>>>> wrote:
>>>>
>>>> The Marvell Yukon MAC address load supported only on Juno R1 and R2.
>>>> It disabled for Juno R0 due to PCI issues on this board.
>>>>
>>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>>> Signed-off-by: Daniil Egranov <daniil.egranov@arm.com>
>>>>
>>>> Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
>>>>
>>>> ---
>>>> ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c | 9 +++++++--
>>>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>>> b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>>> index 47ff587..e9e6990 100644
>>>> --- a/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>>> +++ b/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c
>>>> @@ -378,6 +378,7 @@ OnEndOfDxe (
>>>> EFI_DEVICE_PATH_PROTOCOL* PciRootComplexDevicePath;
>>>> EFI_HANDLE Handle;
>>>> EFI_STATUS Status;
>>>> + UINT32 JunoRevision;
>>>>
>>>> //
>>>> // PCI Root Complex initialization
>>>> @@ -393,8 +394,12 @@ OnEndOfDxe (
>>>> Status = gBS->ConnectController (Handle, NULL,
>>>> PciRootComplexDevicePath,
>>>> FALSE);
>>>> ASSERT_EFI_ERROR (Status);
>>>>
>>>> - Status = ArmJunoSetNicMacAddress ();
>>>> - ASSERT_EFI_ERROR (Status);
>>>> + GetJunoRevision (JunoRevision);
>>>> +
>>>> + if (JunoRevision != JUNO_REVISION_R0) {
>>>> + Status = ArmJunoSetNicMacAddress ();
>>>> + ASSERT_EFI_ERROR (Status);
>>>>
>>>> This is just an FYI, but I stacked your patch on top of mainline, like
>>>> this:
>>>>
>>>> 5f81f61 2017-01-18 ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe:
>>>> Fixed crash on Juno R0 [Daniil Egranov]
>>>> 19ca06b 2017-01-19 OvmfPkg: Remove superfluous return statements.
>>>> [Thomas Huth]
>>>>
>>>> The first time I ran this, Juno R0 worked fine, but on R1 and R2, the
>>>> assert triggered:
>>>>
>>>> UEFI firmware (version 5f81f61 built at 11:56:52 on Jan 19 2017)
>>>> [snip]
>>>> ASSERT_EFI_ERROR (Status = Not Found)
>>>> ASSERT [ArmJunoDxe]
>>>>
>>>> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(401):
>>>> !EFI_ERROR (Status)
>>>>
>>>> I worked out what is happening. And it's not to do with this patch.
>>>> It's another fall-out from the re-work you did to the previous patch.
>>>> It's also ultimately due to a bug the firmware.
>>>>
>>>> With the initial version of your "Set Marvell Yukon MAC address"
>>>> patch, this hang didn't happen. I suspect that was because your error
>>>> checking was weaker and certain PCIe failures didn't trigger the
>>>> assert.
>>>>
>>>> To reproduce the error with this commit:
>>>> 1) power on and boot R1 or R2 into Shell
>>>> I do this by interrupting the boot by pressing ESCAPE and using the
>>>> boot
>>>> menu
>>>> 2) At the Shell prompt, run "reset -s" to shutdown
>>>> 3) At the ARM Boot Loader "Cmd>" prompt, run "reboot"
>>>> 4) the board will hang while booting UEFI, assuming the board firmware
>>>> doesn't die with constant messages like this:
>>>>
>>>> ERROR: PCIe CSR read failed to respond
>>>> ERROR: SMBus transaction not claimed
>>>>
>>>> Assuming the problem is firmware, not EDK2, what should we do about it?
>>>>
>>>> OK, so instinctively, my reaction was that "the reset -s bug is a
>>>> system controller firmware bug and we shouldn't work around
>>>> it". However, since it is actually disrupting Ryan's workflow, which
>>>> frequently doesn't touch PCI at all, I think downgrading the ASSERT to
>>>> an error message is a good idea short-term.
>>>>
>>>> Daniil - could you make that change please?
>>>>
>>>> /
>>>> Leif
>>>>
>>>>
>>>> I've been able to reproduce "PCIe CSR read failed to respond" and "SMBus
>>>> transaction not claimed" errors on my Juno R2. I disabled Marvell Yukon
>>>> driver (.dsc/.fdf) and removed ArmJunoDxe patch but still see the same
>>>> error
>>>> messages during the initial boot.
>>>>
>>>> Testing motherboard interfaces (FPGA build 118)...
>>>> SRAM 32MB test: PASSED
>>>> LAN9118 test: PASSED
>>>> KMI1/2 test: PASSED
>>>> MMC test: PASSED
>>>> PB/LEDs test: PASSED
>>>> FPGA UART test: PASSED
>>>> ERROR: PCIe CSR read failed to respond
>>>> ERROR: SMBus transaction not claimed
>>>> ERROR: PCIe CSR read failed to respond
>>>> ...
>>>>
>>>> Once it went through reporting these errors, the UEFI starts loading but
>>>> still fails in OnEndOfDxe():
>>>> ASSERT_EFI_ERROR (Status = Not Found)
>>>> ASSERT [ArmJunoDxe]
>>>>
>>>> /home/user/workspace/juno/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(110):
>>>> !EFI_ERROR (Status)
>>>>
>>>> This is the original ArmJunoDxe code:
>>>> Status = gBS->ConnectController (Handle, NULL,
>>>> PciRootComplexDevicePath,
>>>> FALSE);
>>>> ASSERT_EFI_ERROR (Status); <---- line 110
>>>>
>>>> I'm confused at which point in the commit history this is line 110.
>>>>
>>>> Just before your MAC commit, it's line 108. After your commit, and in
>>>> the current mainline, it's line 394.
>>>>
>>>> Then, I have to go back to Jeremy's commit
>>>> 58a4bff071832e587d18f97597b5d571dcebc9d7 on 27 July 2016 before I see
>>>> another change to ArmJunoDxe.c. Before that commit, the assert appears
>>>> on line 116.
>>>>
>>>> On my tree, 17.01, 16.12 and 16.11 has the assert on line 246, 16.10
>>>> on line 125.
>>>>
>>>> So I'm unsure what code you are running.
>>>>
>>>>
>>>> This shows the code where it fails for me after MAC address code was
>>>> removed
>>>> from ArmJunoDxe.c. It's part of my development branch and lines may not
>>>> matching any public commits
>>>
>>> You should consider switching branch to mainline when trying to
>>> compare and report problems with other people, then we have a common
>>> baseline to work with.
>>>
>>>
>>>> ... but i pulled 16.06 release for a test and
>>>
>>> 16.06? Why so old? What's wrong with 16.12? You are missing firmware
>>> updates and fixes by using such an old release.
>>
>>
>> I wanted to check how long this problem existed. Just picked up one of the
>> old builds.
>>
>>
>>>
>>>> see the ASSERT on the same code in case of "PCIe .." and "SMBus .."
>>>> errors.
>>>> The line number should be corresponding to
>>>> 032082bf9906f82b77c161c1c2efbddfd4bb3751 commit:
>>>>
>>>> ASSERT_EFI_ERROR (Status = Not Found)
>>>> ASSERT [ArmJunoDxe]
>>>>
>>>> /home/user/workspace/juno_16.06_orig/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(124):
>>>> !EFI_ERROR (Status)
>>>>
>>>>
>>>> but it actually fails here first:
>>>> Status = gBS->LocateDevicePath (&gEfiPciRootBridgeIoProtocolGuid,
>>>> &PciRootComplexDevicePath,
>>>> &Handle);
>>>>
>>>> Ryan - could you try to remove Marvell patches and check if you also
>>>> catching "PCIe .." and "SMBus .." errors without them and your build
>>>> still
>>>> fails with other ASSERTs related to PCI.
>>>>
>>>> Leif - in my tests, if it fails with "PCIe .." and "SMBus ..", the UEFI
>>>> PCI
>>>> enumeration is completely corrupted.
>>>>
>>>> These errors do not appear if board was reset with the nPBRESET button.
>>>> It
>>>> never fails with "reset -w" and "reset -c". I also loaded Debian and used
>>>> the shutdown command from there and got the same "PCIe .." and "SMBus .."
>>>> errors after "reboot" command from the "Cmd>" prompt. Possibly, the
>>>> "reboot" command from the board shell prompt does not reset the board
>>>> correctly so it looks like a firmware issue.
>>>>
>>>> Yes, that's pretty much what I've been saying. There appears to be a
>>>> firmware bug that causes PCIe to be in an unstable state after
>>>> software shutdown of the AP by the motherboard, followed by reset.
>>>>
>>>> The "PCIe CSR read failed to respond" error is nothing to do with EDK2
>>>> and happens before EDK2 runs.
>>>>
>>>> However, unlike you, if I don't get the "PCIe CSR read failed to
>>>> respond" error, upstream EDK2 before your MAC address commit will boot
>>>> without ASSERTing on R0/1/2.
>>>>
>>>>
>>>> Do you mean, when you do have the "PCIe .." error ? As i understand, if
>>>> you
>>>> power-cycle the board (no "PCIe .." errors) you do not see ASSERT at the
>>>> MAC
>>>> address code during the first boot and can start UEFI Shell?
>>>>
>>> Correct.
>>>
>>>
>>>> So I've never seen the assert at line 110 as you pointed out above.
>>>> That's curious. I can see why that would/should assert, but it doesn't
>>>> for me. Perhaps it's something to do with the tree you are building
>>>> from? Or the devices you have connected? I have no additional SATA or
>>>> PCIe devices connected to my board.
>>>>
>>>> Cheers,
>>>> Ryan.
>>>>
>>>>
>>>> I am using R2 for the tests. The tests were done without SATA or PCI
>>>> devices
>>>> connected to the board. Unfortunately, I do not have R1 board at this
>>>> moment
>>>> to check it.
>>>>
>>> That's fine, R2 is good enough.
>>>
>>>
>>>> This is my R2 firmware versions:
>>>>
>>>> ARM V2M-Juno Boot loader v1.0.0
>>>> HBI0262 build 1844
>>>>
>>>> ARM V2M_Juno r2 Firmware v1.4.3
>>>
>>> v1.4.4 is the latest Juno firmware. You should update. Although it
>>> won't fix these problems as it wasn't intended to. And as you used
>>> 16.06 for your baseline testing, I assume you probably also have old
>>> SCP firmware in your environment too.
>>>
>>>
>>>> Build Date: Apr 12 2016
>>>>
>>>> MIC RAM configuration (pms_v105.bin)...
>>>>
>>>> Configuring motherboard (rev D, var A)...
>>>> IOFPGA image \MB\HBI0262D\io_b118.bit
>>>>
>>>> As Leif suggested, i will replace the ASSERT with an error message and
>>>> send
>>>> another patch.
>>>>
>>> Thanks, that's all we need to do, I think.
>>>
>>
>> Done.
>>
>>>> Could you please check one thing for me on your boards? Use a build which
>>>> worked for you (no ASSERT) and your steps: UEFI shell -> reset -s ->
>>>> reboot
>>>> -> see "PCIe .." and "SMBus .." errors -> boot UEFI.
>>>
>>> After I get the "PCIe .." and "SMBus .." errors, the board will never
>>> boot into UEFI. It hangs forever and I have to power-cycle the board.
>>> This rarely happens for me on R2.
>>>
>>> Sometimes, I do UEFI shell -> reset -s -> reboot and the board will
>>> give the "PCIe .." and "SMBus .." error, but most of the time, it does
>>> not and UEFI continues to boot.
>>
>>
>> Interesting, if i do reset -s, i am getting "PCIe ..." error and stop with
>> ASSERT error in UEFI all the time.
>>
>
> Funny, but while testing mainline [1] before testing your v2 patch, R1
> and R2 both crashed after software shutdown with:
>
> ASSERT [ArmJunoDxe]
> /linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(397):
> !EFI_ERROR (Status)
>
>
> But even worse than that, at one point, R2 did this:
>
> Installation of the FDT using the device path
> <VenHw(E7223039-5836-41E1-B542-D7EC736C5E59)/board.dtb> completed.
> PCI: The size 0x7F000000 of the region 0x80000000 has been increased
> to be a power of two for the AXI translation table.
> PCI: The size 0x180000000 of the region 0x880000000 has been increased
> to be a power of two for the AXI translation table.
>
>
> Synchronous Exception at 0x00000000FDEC712C
> PC 0x0000FDEC712C (0x0000FDEC5000+0x0000212C) [ 0] CpuIo2Dxe.dll
> PC 0x0000FDEC6570 (0x0000FDEC5000+0x00001570) [ 0] CpuIo2Dxe.dll
> PC 0x0000FDE449CC (0x0000FDE41000+0x000039CC) [ 1] PciHostBridge.dll
> PC 0x0000F886A6B4 (0x0000F8864000+0x000066B4) [ 2] PciBusDxe.dll
> PC 0x0000F8867C54 (0x0000F8864000+0x00003C54) [ 2] PciBusDxe.dll
> PC 0x0000F8867C24 (0x0000F8864000+0x00003C24) [ 2] PciBusDxe.dll
> PC 0x0000F886CA08 (0x0000F8864000+0x00008A08) [ 2] PciBusDxe.dll
> PC 0x0000F88677A8 (0x0000F8864000+0x000037A8) [ 2] PciBusDxe.dll
> PC 0x0000F8865920 (0x0000F8864000+0x00001920) [ 2] PciBusDxe.dll
> PC 0x0000FE005F8C (0x0000FDFFA000+0x0000BF8C) [ 3] DxeCore.dll
> PC 0x0000FE005358 (0x0000FDFFA000+0x0000B358) [ 3] DxeCore.dll
> PC 0x0000F8A29D6C (0x0000F8A28000+0x00001D6C) [ 4] ArmJunoDxe.dll
> PC 0x0000FE0177B8 (0x0000FDFFA000+0x0001D7B8) [ 5] DxeCore.dll
> PC 0x0000FE016F38 (0x0000FDFFA000+0x0001CF38) [ 5] DxeCore.dll
> PC 0x0000FE005134 (0x0000FDFFA000+0x0000B134) [ 5] DxeCore.dll
> PC 0x0000FE0175C4 (0x0000FDFFA000+0x0001D5C4) [ 5] DxeCore.dll
> PC 0x0000FE0179C0 (0x0000FDFFA000+0x0001D9C0) [ 5] DxeCore.dll
> PC 0x0000FE017E68 (0x0000FDFFA000+0x0001DE68) [ 5] DxeCore.dll
> PC 0x0000FDDEDA34 (0x0000FDDE1000+0x0000CA34) [ 6] BdsDxe.dll
> PC 0x0000FDDFD534 (0x0000FDDE1000+0x0001C534) [ 6] BdsDxe.dll
> PC 0x0000FDDE3E70 (0x0000FDDE1000+0x00002E70) [ 6] BdsDxe.dll
> PC 0x0000FDFFC3DC (0x0000FDFFA000+0x000023DC) [ 7] DxeCore.dll
> PC 0x0000FDFFB4A0 (0x0000FDFFA000+0x000014A0) [ 7] DxeCore.dll
> PC 0x0000FDFFB024 (0x0000FDFFA000+0x00001024) [ 7] DxeCore.dll
>
> [ 0] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/UefiCpuPkg/CpuIo2Dxe/CpuIo2Dxe/DEBUG/CpuIo2Dxe.dll
> [ 1] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/ArmPlatformPkg/ArmJunoPkg/Drivers/PciHostBridgeDxe/PciHostBridgeDxe/DEBUG/PciHostBridge.dll
> [ 2] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Bus/Pci/PciBusDxe/PciBusDxe/DEBUG/PciBusDxe.dll
> [ 3] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
> [ 4] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe/DEBUG/ArmJunoDxe.dll
> [ 5] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
> [ 6] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Universal/BdsDxe/BdsDxe/DEBUG/BdsDxe.dll
> [ 7] /linaro/platforms/uefi/edk2/Build/ArmJuno/DEBUG_GCC5/AARCH64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll
>
> X0 0x0000000057F00000 X1 0x00000000FDECC030 X2
> 0x0000000057F00000 X3 0x000000000000001A
> X4 0x00000000FD434E98 X5 0x00000000FDEC64D0 X6
> 0x00000000F887A080 X7 0x00000000FDE4A4B8
> X8 0x00000000FE02D690 X9 0x0000004100000000 X10
> 0x0000000057EFFFFF X11 0xC000000000000000
> X12 0x0000000000000000 X13 0x0000000000000003 X14
> 0x0000000000000000 X15 0x0000000000000003
> X16 0x00000000FEFFFBF0 X17 0x0000000000000000 X18
> 0x0000000000000000 X19 0x0000000000000013
> X20 0x0000000000000000 X21 0x0000000000000000 X22
> 0x0000000000000000 X23 0x0000000000000000
> X24 0x0000000000000000 X25 0x0000000000000000 X26
> 0x0000000000000000 X27 0x0000000000000000
> X28 0x0000000000000000 FP 0x00000000FEFFF2F0 LR 0x00000000FDEC6570
>
> V0 0x0000000000000000 0000000000000000 V1 0xAB1C5ED5923F82A4
> 59F111F13956C25B
> V2 0x550C7DC3243185BE 12835B01D807AA98 V3 0xC19BF1749BDC06A7
> 80DEB1FE72BE5D74
> V4 0x240CA1CC0FC19DC6 EFBE4786E49B69C1 V5 0x76F988DA5CB0A9DC
> 4A7484AA2DE92C6F
> V6 0xBF597FC7B00327C8 A831C66D983E5152 V7 0x1429296706CA6351
> D5A79147C6E00BF3
> V8 0x0000000000000000 2E1B213827B70A85 V9 0x0000000000000000
> 766A0ABB650A7354
> V10 0x0000000000000000 A81A664BA2BFE8A1 V11 0x0000000000000000
> D6990624D192E819
> V12 0x0000000000000000 1E376C0819A4C116 V13 0x0000000000000000
> 4ED8AA4A391C0CB3
> V14 0x0000000000000000 78A5636F748F82EE V15 0x0000000000000000
> A4506CEB90BEFFFA
> V16 0x3C83709547545153 BC990E9F897D4FA8 V17 0xBEBEF297C5EC0578
> E1D99AE8C2B92607
> V18 0x13242833C5F05BAB 101C24C521387481 V19 0x0A50ABCAD0BDDA12
> 996865483E880969
> V20 0x6C4ABAA53A9BE1CB 4416D9F479B08221 V21 0x60952147C3A68574
> 55B8AE51435C1A1A
> V22 0x9FEB2A3B4AB8D3BF 88C1883495C7F76F V23 0xD0C224BC8FB77E09
> 3DB8D233CF470963
> V24 0x0E72963E02D21C93 C95B757DA62B3A12 V25 0x2A77DBA1E4EE5D5C
> E08479A1B557DFA8
> V26 0xB593F86668CA8129 927A0019CDEC1A76 V27 0xD000200000E00002
> A6201000E0008C04
> V28 0x8020000081000200 892080400880A022 V29 0x2085000000000136
> 801489A520002304
> V30 0x0801200082040060 0384E05702949200 V31 0x4064084112940202
> 1000458804200800
>
> SP 0x00000000FEFFF2D0 ELR 0x00000000FDEC712C SPSR 0x60000209 FPSR
> 0x00000000
> ESR 0x96000210 FAR 0x0000000057F00000
>
> ESR : EC 0x25 IL 0x1 ISS 0x00000210
>
> Data abort: Synchronous external abort
>
> Stack dump:
> 00000FEFFF1D0: 996865483E880969 0A50ABCAD0BDDA12 4416D9F479B08221
> 6C4ABAA53A9BE1CB
> 00000FEFFF1F0: 55B8AE51435C1A1A 60952147C3A68574 88C1883495C7F76F
> 9FEB2A3B4AB8D3BF
> 00000FEFFF210: 3DB8D233CF470963 D0C224BC8FB77E09 C95B757DA62B3A12
> 0E72963E02D21C93
> 00000FEFFF230: E08479A1B557DFA8 2A77DBA1E4EE5D5C 927A0019CDEC1A76
> B593F86668CA8129
> 00000FEFFF250: A6201000E0008C04 D000200000E00002 892080400880A022
> 8020000081000200
> 00000FEFFF270: 801489A520002304 2085000000000136 0384E05702949200
> 0801200082040060
> 00000FEFFF290: 1000458804200800 4064084112940202 00000000FDEC712C
> 0000000060000209
> 00000FEFFF2B0: 0000000000000000 0000000096000210 0000000057F00000
> 000000000000001A
>> 00000FEFFF2D0: 0000000057F00000 0000000057F00000 FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF
> 00000FEFFF2F0: 00000000FEFFF350 00000000FDE449CC A831C66D983E5152
> 00000000FD434E98
> 00000FEFFF310: 000000000000001A 0000000057F00000 0000000057F00000
> 00000000FDECC000
> 00000FEFFF330: 00000000FD434E98 0101000000000000 0000000000000000
> 00000000FD434E98
> 00000FEFFF350: 00000000FEFFF3A0 00000000F886A6B4 00000000FEFFF3A0
> 00000000FD434E98
> 00000FEFFF370: 000000000000001A 0000000057F00000 00000000FEFFF3A0
> 00000000FD46C860
> 00000FEFFF390: 00000000FDECC000 00000000FD46C798 00000000FEFFF420
> 00000000F8867C54
> 00000FEFFF3B0: 0000000057F00000 00000000FDDB8018 00000000FEFFF400
> 57F00000F8879090
> ASSERT [ArmCpuDxe]
> /linaro/platforms/uefi/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c(265):
> ((BOOLEAN)(0==1))
>
>
I just wanted to add to this problem report in case we start tracking it again.
I've been testing software shutdown and reboot using upstream commit
5734d48 with Daniil's v2 patch applied and rebooting (with "reset -s")
in a loop.
Eventually after many reboots, I got Juno R1 to report the same error
as R0 on boot:
ERROR: PCIe CSR read failed to respond
ERROR: SMBus transaction not claimed
However, unlike R0 which reports this forever in a loop, R1 reports
those two error messages 100 times, then reports this before
continuing to boot the board:
ERROR: reading PCIE Switch VID
PCIe MAC addresss 0002-F700-60BB
MAC addrs test: PASSED
SMC MAC address 0002-F700-60B9
Setting HDMI0 mode for SVGA.
Setting HDMI1 mode for SVGA.
SoC SMB clock enabled.
Testing SMB clock...
SMB clock running
Releasing system resets...
UART0 set to SoC UART0
UART1 set to SoC UART1
NOTICE: Booting Trusted Firmware
[snip]
I've not seen boot continue after the "PCIe CSR ..." errors before. At
this point, as Daniil reported, EDK2 hangs detecting the controller:
ASSERT [ArmJunoDxe]
/linaro/platforms/uefi/edk2/ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe/ArmJunoDxe.c(395):
!EFI_ERROR (Status)
Where, in my tree, I see ArmJunoDxe.c has these lines on code at line 394/395:
394: Status = gBS->ConnectController (Handle, NULL,
PciRootComplexDevicePath, FALSE);
395: ASSERT_EFI_ERROR (Status);
I'm going to stop looking into this issue now until I get new
motherboard firmware with a proposed fix.
> Anyway, these problems are all due to dodgy lower level firmware,
> which should be fixed eventually. And your v2 patch removing the
> assert is out best way forward.
>
> Cheers,
> Ryan.
>
>
>>
>>>
>>>
>>>> As i understand, you do
>>>> not have the ASSERT with your old builds so you can start the UEFI shell.
>>>> Use the "pci" shell command to list all PCI devices. Do you see them
>>>> correctly enumerated? I'd like to make sure that by ignoring the ASSERT
>>>> errors we will not face any further problems with PCI.
>>>>
>>> In the case where my board continues to boot after software shutdown,
>>> the pci shell command shows this:
>>>
>>> Shell> pci
>>> Seg Bus Dev Func
>>> --- --- --- ----
>>> 00 00 00 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 1556 Device 1100 Prog Interface 0
>>> 00 01 00 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 111D Device 8090 Prog Interface 0
>>> 00 02 01 00 ==> Pre 2.0 device - All devices other than VGA
>>> Vendor 0000 Device 0000 Prog Interface 0
>>> 00 02 02 00 ==> Pre 2.0 device - All devices other than VGA
>>> Vendor 0000 Device 0000 Prog Interface 0
>>> 00 02 03 00 ==> Pre 2.0 device - All devices other than VGA
>>> Vendor 0000 Device 0000 Prog Interface 0
>>> 00 02 0C 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 111D Device 8090 Prog Interface 0
>>> 00 02 10 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 111D Device 8090 Prog Interface 0
>>> 00 02 1F 00 ==> Pre 2.0 device - All devices other than VGA
>>> Vendor 0000 Device 0000 Prog Interface 0
>>>
>>> Which is not the same as when the board is power cycled. On power
>>> cycle, I see this:
>>>
>>> Seg Bus Dev Func
>>> --- --- --- ----
>>> 00 00 00 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 1556 Device 1100 Prog Interface 0
>>> 00 01 00 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 111D Device 8090 Prog Interface 0
>>> 00 02 01 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 111D Device 8090 Prog Interface 0
>>> 00 02 02 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 111D Device 8090 Prog Interface 0
>>> 00 02 03 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 111D Device 8090 Prog Interface 0
>>> 00 02 0C 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 111D Device 8090 Prog Interface 0
>>> 00 02 10 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 111D Device 8090 Prog Interface 0
>>> 00 02 1F 00 ==> Bridge Device - PCI/PCI bridge
>>> Vendor 111D Device 8090 Prog Interface 0
>>> 00 03 00 00 ==> Mass Storage Controller - Other mass storage
>>> controll
>>> er
>>> Vendor 1095 Device 3132 Prog Interface 0
>>> 00 08 00 00 ==> Network Controller - Ethernet controller
>>> Vendor 11AB Device 4380 Prog Interface 0
>>>
>>> Cheers,
>>> Ryan.
>>>
>>>
>>
>> If I let UEFI boot to the shell after "PCIe ..." error, the pci shell
>> command shows me a long list of unrecognized PCI devices with all
>> combinations of BDFs.
>>
>> Thanks,
>> Daniil
>>
>>
>>>> Thanks,
>>>> Daniil
>>>>
>>>>
>>>> Thanks,
>>>> Daniil
>>>>
>>>>
>>>> Prior to your "Set Marvell Yukon MAC address" patch, or with the
>>>> earlier version, the board would boot anyway, but the Yukon device
>>>> would be missing.
>>>>
>>>> Now it dies.
>>>>
>>>> I don't know which is worse, but I think hanging is worse than an
>>>> ethernet port dropping out. Although hanging is a bit more obvious
>>>> that there's a problem...
>>>>
>>>>
>>>> + }
>>>> }
>>>>
>>>> STATIC
>>>> --
>>>> 2.7.4
>>>>
>>>> _______________________________________________
>>>> edk2-devel mailing list
>>>> edk2-devel@lists.01.org
>>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>>>
>>>>
>>>> _______________________________________________
>>>> edk2-devel mailing list
>>>> edk2-devel@lists.01.org
>>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>>>
>>>>
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>
>>
>
> [1] mainline is this commit at the moment:
> 5734d48 2017-01-22 ShellPkg SmbiosView: Add decoding of SMBIOS spec
> 3.1.1 [Star Zeng]
prev parent reply other threads:[~2017-01-24 12:34 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-01-18 23:27 [PATCH] ArmPlatformPkg/ArmJunoPkg/Drivers/ArmJunoDxe: Fixed crash on Juno R0 Daniil Egranov
2017-01-19 13:49 ` Ryan Harkin
2017-01-19 15:13 ` Leif Lindholm
2017-01-20 1:34 ` Daniil Egranov
2017-01-20 10:30 ` Ryan Harkin
2017-01-20 20:57 ` Daniil Egranov
2017-01-23 11:26 ` Ryan Harkin
2017-01-23 12:56 ` Ryan Harkin
2017-01-24 2:19 ` Daniil Egranov
2017-01-24 11:05 ` Ryan Harkin
2017-01-24 12:34 ` Ryan Harkin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAD0U-hL7w-A5BM7p1GP0m6mPrX4pGcG8VUVQHFuOCbE6+KuDcw@mail.gmail.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox