* Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
@ 2018-08-30 20:17 aaron.young
2018-08-31 13:17 ` Laszlo Ersek
0 siblings, 1 reply; 10+ messages in thread
From: aaron.young @ 2018-08-30 20:17 UTC (permalink / raw)
To: edk2-devel; +Cc: lersek, brijesh.singh
We have encountered what we believe to be a regression in the
OvmfPkg/VirtioNetDxe driver.
This regression causes PXE boot to fail with the following ASSERT:
ASSERT
[VirtioNetDxe]/builddir/build/BUILD/ovmf-92d07e48907f/OvmfPkg/VirtioNetDxe/SnpSharedHelpers.c(184):
((BOOLEAN)(0==1))
The failure was encountered on a aarch64/Armv8 system (Ampere server)
using the latest mainline (edk2.git/master) code as of 8-28-2018.
On a hunch, I reverted the following patchset which resulted in PXE
boot being successful - which suggests this patchset is likely the culprit:
https://lists.01.org/pipermail/edk2-devel/2017-September/014679.html
The command used to initiate the VM PXE boot/install is:
virt-install --name guest1 --vcpus 16 --ram 32000 --arch aarch64
--disk guest1.img,size=100 --network
type=direct,source=enP4p1s0,model=virtio --pxe
NOTE that I also tried simply removing the offending ASSERT() and PXE
boot still fails with multiple errors (exceptions of some sort that I
didn't log).
What I'd like to know:
1. Is this a known failure?
2. Was aarch64 given much testing with this new patchset?
3. Is reverting this patchset a safe workaround for now until a fix
is found? Or is there another simple workaround perhaps?
4. Any hunches what the issue could be?
Thanks for any info/help/suggestions,
-Aaron Young
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
2018-08-30 20:17 Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system aaron.young
@ 2018-08-31 13:17 ` Laszlo Ersek
2018-10-03 16:56 ` aaron.young
0 siblings, 1 reply; 10+ messages in thread
From: Laszlo Ersek @ 2018-08-31 13:17 UTC (permalink / raw)
To: aaron.young, edk2-devel; +Cc: brijesh.singh, Ard Biesheuvel
On 08/30/18 22:17, aaron.young@oracle.com wrote:
> We have encountered what we believe to be a regression in the
> OvmfPkg/VirtioNetDxe driver.
>
> This regression causes PXE boot to fail with the following ASSERT:
>
> ASSERT
> [VirtioNetDxe]/builddir/build/BUILD/ovmf-92d07e48907f/OvmfPkg/VirtioNetDxe/SnpSharedHelpers.c(184):
> ((BOOLEAN)(0==1))
>
> The failure was encountered on a aarch64/Armv8 system (Ampere server)
> using the latest mainline (edk2.git/master) code as of 8-28-2018.
>
> On a hunch, I reverted the following patchset which resulted in PXE
> boot being successful - which suggests this patchset is likely the culprit:
>
> https://lists.01.org/pipermail/edk2-devel/2017-September/014679.html
>
> The command used to initiate the VM PXE boot/install is:
>
> virt-install --name guest1 --vcpus 16 --ram 32000 --arch aarch64
> --disk guest1.img,size=100 --network
> type=direct,source=enP4p1s0,model=virtio --pxe
>
> NOTE that I also tried simply removing the offending ASSERT() and PXE
> boot still fails with multiple errors (exceptions of some sort that I
> didn't log).
>
>
> What I'd like to know:
>
> 1. Is this a known failure?
Please read the comment in the context of the failed assertion. If this
assertion is triggered, then that's an error in the code that consumes
the SNP instance provided by VirtioNetDxe. The client code invokes
behavior that is undefined by the UEFI spec.
You can find more details at:
http://mid.mail-archive.com/2adc266a-949d-c546-9bed-d78fea19d969@redhat.com
> 2. Was aarch64 given much testing with this new patchset?
Yes. I described the tests that I had done before pushing the patch set
here:
http://mid.mail-archive.com/581a568d-b0ce-34a1-8e52-95adf14c03e0@redhat.com
http://mid.mail-archive.com/1e4267dc-0133-e833-f805-8381ef82c52e@redhat.com
And most recently (independently of the patch set you refer to), I
successfully tested VirtioNetDxe on aarch64 as part of PXEv4, PXEv6,
HTTPv4, HTTPv6 net-boots, for commit ae08ea246fe9
("ArmVirtPkg/ArmVirtQemu: enable the IPv6 stack", 2018-07-13):
http://mid.mail-archive.com/20180712234112.3566-1-lersek@redhat.com
> 3. Is reverting this patchset a safe workaround for now until a fix is
> found? Or is there another simple workaround perhaps?
Reverting the set should make the symptom go away, but that just papers
over the undefined behavior in the protocol caller.
The goal of the series was to enable VirtioNetDxe to function correctly
on top of:
- a virtio-1.0 PCI transport, provided by Virtio10Dxe, such that
- explicit DMA mapping is performed using the EFI_PCI_IO_PROTOCOL
underlying the virtio transport, such that
- the platform provides an IOMMU protocol implementation for
PciHostBridgeDxe, which underlies PciBusDxe.
If this use case does not apply to your platform, then you can revert
the patch set.
In particular, in the ArmVirtQemu platform, we don't hook
"OvmfPkg/Library/PlatformHasIoMmuLib/PlatformHasIoMmuLib.inf" into
"PciHostBridgeDxe" via NULL class resolution, thus PciHostBridgeDxe
there will not wait for another platform driver to decide dynamically
whether the platform has an actual IOMMU protocol or not. And,
accordingly, we never produce an IOMMU protocol instance in ArmVirtQemu.
(Obviously the patch set was extensively regression-tested to confirm it
was a no-op on platforms where the above use case wouldn't apply.)
> 4. Any hunches what the issue could be?
Code that uses VirtioNetDxe's SNP protocol interface invokes undefined
behavior; please see my answer to your Q1.
> Thanks for any info/help/suggestions,
I'd like to give you a hint on debugging this further as well:
(1) in the VirtioNetTransmit() function, please add DEBUG messages both
before and after the call to VirtioNetMapTxBuf(). Before, log "Buffer"
(with %p). After, log "DeviceAddress" (with "%Lx").
(2) In the VirtioNetGetStatus() function, please add DEBUG messages both
before and after the call to VirtioNetUnmapTxBuf(). Before, log
"DeviceAddress" (with "%Lx"). After, log "*TxBuf" (with "%p").
In each of (1) and (2), the values printed before/after should be
identical (= identity-mapping for DMA).
Please write a python script that parses the log generated by (1) and
(2). The python script should maintain an associative array where the
key type is "address", and the value type is "integer" (starting value 0).
For each "Buffer" address printed by (1), the python script should
increase the counter for that address.
For each "*TxBuf" address printed by (2), the python script should
decrease the counter for that address.
If any counter goes above 1, then the client code invokes undefined
behavior. If any counter goes under 0, then we messed up something. If
all counters only take values 0 and 1, but you still hit the assert,
then again we messed up something.
The idea is to check whether:
- the SNP consumer code tries to Transmit() the same buffer for a second
time *before* it gets it back from GetStatus() -- this is the error that
I'm currently suspecting,
- or we messed up something (return a buffer that hasn't been queued for
transmission, or else we mistake a not-yet-queued buffer for an already
queued buffer when it is Transmit()ted for the first time).
... BTW, if you build ArmVirtQemu for the RELEASE target, the ASSERT()
will disappear from VirtioNetMapTxBuf(), and Transmit() will simply
return EFI_DEVICE_ERROR. This is consistent with your last paragraph --
i.e., "boot still fails with multiple errors" -- and the reason for
returning this error is that the SNP consumer really tries to invoke
undefined behavior. It's not always possible to detect UB, but whenever
we can, without a huge performance hit, then in DEBUG builds it's
appropriate to trip an assert, and in RELEASE builds, to return an error.
Thanks
Laszlo
> -Aaron Young
>
Thanks
Laszlo
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
2018-08-31 13:17 ` Laszlo Ersek
@ 2018-10-03 16:56 ` aaron.young
2018-10-03 17:35 ` Ard Biesheuvel
0 siblings, 1 reply; 10+ messages in thread
From: aaron.young @ 2018-10-03 16:56 UTC (permalink / raw)
To: Laszlo Ersek, edk2-devel; +Cc: brijesh.singh, Ard Biesheuvel
Hello,
I have some more information on this issue that I wanted to share.
The SNP client appears to be GRUB - i.e. the ASSERT() is asserted just
after PXE code has loaded GRUB, presented the GRUB menu and a boot
option is selected.
And, I have confirmed that it, as predicted, attempts to call
SNP->Transmit() twice with the same Buffer - without successfully
reaping the Buffer via SNP->GetStatus() in between calls - thus tripping
the ASSERT() in SnpSharedHelpers.c:VirtioNetMapTxBuf().
This was quite easy to see since GRUB appears to use only a single
Buffer for which to transmit packets a few bytes at a time. i.e. it just
calls SNP->Transmit() and then calls SNP->GetStatus() over and over
again with the same buffer to transmit its packets.
From my debug, I can see it work successfully most of the time (i.e.
SNP->Transmit() is called successfully with a corresponding subsequent
call to SNP->GetStatus() which returns the Buffer). However,
occasionally, SNP->GetStatus() will return TxBuf=NULL (indicating that
the Buffer has not yet been transmitted). In this case, GRUB appears to
attempt to call SNP->GetStatus() *twice* before giving up (possibly due
to a timer which I can see in the GRUB code) and then attempts to
transfer the Buffer again - resulting in the ASSERT().
So, this explains the regression since the old code would allow the
same Buffer to to be re-transmitted.
As a quick test, I coded up a quick workaround in VirtioNetMapTxBuf()
to walk through Dev->TxBufCollection to see if the Buffer is present -
i.e. has already been transmitted (and not yet reaped via
SNP->GetStatus()) - and if so, just return EFI_SUCCESS from
VirtioNetTransmit(). This allowed the PXE boot/installation to succeed.
I'm not advocating this as a solution, but is an interesting data point.
Any suggestion where we go from here?
Is there any way to make the code more tolerant of these
re-transmissions (like the code was before)?
I'm not confident that declaring GRUB as not spec compliant is the answer.
Thanks,
-Aaron
On 08/31/18 06:17, Laszlo Ersek wrote:
> On 08/30/18 22:17, aaron.young@oracle.com wrote:
>> We have encountered what we believe to be a regression in the
>> OvmfPkg/VirtioNetDxe driver.
>>
>> This regression causes PXE boot to fail with the following ASSERT:
>>
>> ASSERT
>> [VirtioNetDxe]/builddir/build/BUILD/ovmf-92d07e48907f/OvmfPkg/VirtioNetDxe/SnpSharedHelpers.c(184):
>> ((BOOLEAN)(0==1))
>>
>> The failure was encountered on a aarch64/Armv8 system (Ampere server)
>> using the latest mainline (edk2.git/master) code as of 8-28-2018.
>>
>> On a hunch, I reverted the following patchset which resulted in PXE
>> boot being successful - which suggests this patchset is likely the culprit:
>>
>> https://lists.01.org/pipermail/edk2-devel/2017-September/014679.html
>>
>> The command used to initiate the VM PXE boot/install is:
>>
>> virt-install --name guest1 --vcpus 16 --ram 32000 --arch aarch64
>> --disk guest1.img,size=100 --network
>> type=direct,source=enP4p1s0,model=virtio --pxe
>>
>> NOTE that I also tried simply removing the offending ASSERT() and PXE
>> boot still fails with multiple errors (exceptions of some sort that I
>> didn't log).
>>
>>
>> What I'd like to know:
>>
>> 1. Is this a known failure?
> Please read the comment in the context of the failed assertion. If this
> assertion is triggered, then that's an error in the code that consumes
> the SNP instance provided by VirtioNetDxe. The client code invokes
> behavior that is undefined by the UEFI spec.
>
> You can find more details at:
>
> http://mid.mail-archive.com/2adc266a-949d-c546-9bed-d78fea19d969@redhat.com
>
>> 2. Was aarch64 given much testing with this new patchset?
> Yes. I described the tests that I had done before pushing the patch set
> here:
>
> http://mid.mail-archive.com/581a568d-b0ce-34a1-8e52-95adf14c03e0@redhat.com
>
> http://mid.mail-archive.com/1e4267dc-0133-e833-f805-8381ef82c52e@redhat.com
>
>
> And most recently (independently of the patch set you refer to), I
> successfully tested VirtioNetDxe on aarch64 as part of PXEv4, PXEv6,
> HTTPv4, HTTPv6 net-boots, for commit ae08ea246fe9
> ("ArmVirtPkg/ArmVirtQemu: enable the IPv6 stack", 2018-07-13):
>
> http://mid.mail-archive.com/20180712234112.3566-1-lersek@redhat.com
>
>> 3. Is reverting this patchset a safe workaround for now until a fix is
>> found? Or is there another simple workaround perhaps?
> Reverting the set should make the symptom go away, but that just papers
> over the undefined behavior in the protocol caller.
>
> The goal of the series was to enable VirtioNetDxe to function correctly
> on top of:
> - a virtio-1.0 PCI transport, provided by Virtio10Dxe, such that
> - explicit DMA mapping is performed using the EFI_PCI_IO_PROTOCOL
> underlying the virtio transport, such that
> - the platform provides an IOMMU protocol implementation for
> PciHostBridgeDxe, which underlies PciBusDxe.
>
> If this use case does not apply to your platform, then you can revert
> the patch set.
>
> In particular, in the ArmVirtQemu platform, we don't hook
> "OvmfPkg/Library/PlatformHasIoMmuLib/PlatformHasIoMmuLib.inf" into
> "PciHostBridgeDxe" via NULL class resolution, thus PciHostBridgeDxe
> there will not wait for another platform driver to decide dynamically
> whether the platform has an actual IOMMU protocol or not. And,
> accordingly, we never produce an IOMMU protocol instance in ArmVirtQemu.
>
> (Obviously the patch set was extensively regression-tested to confirm it
> was a no-op on platforms where the above use case wouldn't apply.)
>
>> 4. Any hunches what the issue could be?
> Code that uses VirtioNetDxe's SNP protocol interface invokes undefined
> behavior; please see my answer to your Q1.
>
>> Thanks for any info/help/suggestions,
> I'd like to give you a hint on debugging this further as well:
>
> (1) in the VirtioNetTransmit() function, please add DEBUG messages both
> before and after the call to VirtioNetMapTxBuf(). Before, log "Buffer"
> (with %p). After, log "DeviceAddress" (with "%Lx").
>
> (2) In the VirtioNetGetStatus() function, please add DEBUG messages both
> before and after the call to VirtioNetUnmapTxBuf(). Before, log
> "DeviceAddress" (with "%Lx"). After, log "*TxBuf" (with "%p").
>
> In each of (1) and (2), the values printed before/after should be
> identical (= identity-mapping for DMA).
>
> Please write a python script that parses the log generated by (1) and
> (2). The python script should maintain an associative array where the
> key type is "address", and the value type is "integer" (starting value 0).
>
> For each "Buffer" address printed by (1), the python script should
> increase the counter for that address.
>
> For each "*TxBuf" address printed by (2), the python script should
> decrease the counter for that address.
>
> If any counter goes above 1, then the client code invokes undefined
> behavior. If any counter goes under 0, then we messed up something. If
> all counters only take values 0 and 1, but you still hit the assert,
> then again we messed up something.
>
> The idea is to check whether:
>
> - the SNP consumer code tries to Transmit() the same buffer for a second
> time *before* it gets it back from GetStatus() -- this is the error that
> I'm currently suspecting,
>
> - or we messed up something (return a buffer that hasn't been queued for
> transmission, or else we mistake a not-yet-queued buffer for an already
> queued buffer when it is Transmit()ted for the first time).
>
>
> ... BTW, if you build ArmVirtQemu for the RELEASE target, the ASSERT()
> will disappear from VirtioNetMapTxBuf(), and Transmit() will simply
> return EFI_DEVICE_ERROR. This is consistent with your last paragraph --
> i.e., "boot still fails with multiple errors" -- and the reason for
> returning this error is that the SNP consumer really tries to invoke
> undefined behavior. It's not always possible to detect UB, but whenever
> we can, without a huge performance hit, then in DEBUG builds it's
> appropriate to trip an assert, and in RELEASE builds, to return an error.
>
> Thanks
> Laszlo
>
>> -Aaron Young
>>
> Thanks
> Laszlo
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
2018-10-03 16:56 ` aaron.young
@ 2018-10-03 17:35 ` Ard Biesheuvel
2018-10-03 23:59 ` aaron.young
0 siblings, 1 reply; 10+ messages in thread
From: Ard Biesheuvel @ 2018-10-03 17:35 UTC (permalink / raw)
To: aaron.young, Leif Lindholm
Cc: Laszlo Ersek, edk2-devel@lists.01.org, Brijesh Singh
(+ Leif)
On 3 October 2018 at 18:56, <aaron.young@oracle.com> wrote:
>
> Hello,
>
> I have some more information on this issue that I wanted to share.
>
> The SNP client appears to be GRUB - i.e. the ASSERT() is asserted just
> after PXE code has loaded GRUB, presented the GRUB menu and a boot option is
> selected.
>
> And, I have confirmed that it, as predicted, attempts to call
> SNP->Transmit() twice with the same Buffer - without successfully reaping
> the Buffer via SNP->GetStatus() in between calls - thus tripping the
> ASSERT() in SnpSharedHelpers.c:VirtioNetMapTxBuf().
>
> This was quite easy to see since GRUB appears to use only a single Buffer
> for which to transmit packets a few bytes at a time. i.e. it just calls
> SNP->Transmit() and then calls SNP->GetStatus() over and over again with the
> same buffer to transmit its packets.
>
> From my debug, I can see it work successfully most of the time (i.e.
> SNP->Transmit() is called successfully with a corresponding subsequent call
> to SNP->GetStatus() which returns the Buffer). However, occasionally,
> SNP->GetStatus() will return TxBuf=NULL (indicating that the Buffer has not
> yet been transmitted). In this case, GRUB appears to attempt to call
> SNP->GetStatus() *twice* before giving up (possibly due to a timer which I
> can see in the GRUB code) and then attempts to transfer the Buffer again -
> resulting in the ASSERT().
>
> So, this explains the regression since the old code would allow the same
> Buffer to to be re-transmitted.
>
> As a quick test, I coded up a quick workaround in VirtioNetMapTxBuf() to
> walk through Dev->TxBufCollection to see if the Buffer is present - i.e. has
> already been transmitted (and not yet reaped via SNP->GetStatus()) - and if
> so, just return EFI_SUCCESS from VirtioNetTransmit(). This allowed the PXE
> boot/installation to succeed. I'm not advocating this as a solution, but is
> an interesting data point.
>
> Any suggestion where we go from here?
>
> Is there any way to make the code more tolerant of these re-transmissions
> (like the code was before)?
>
> I'm not confident that declaring GRUB as not spec compliant is the answer.
>
> Thanks,
>
> -Aaron
>
>
>
> On 08/31/18 06:17, Laszlo Ersek wrote:
>>
>> On 08/30/18 22:17, aaron.young@oracle.com wrote:
>>>
>>> We have encountered what we believe to be a regression in the
>>> OvmfPkg/VirtioNetDxe driver.
>>>
>>> This regression causes PXE boot to fail with the following ASSERT:
>>>
>>> ASSERT
>>>
>>> [VirtioNetDxe]/builddir/build/BUILD/ovmf-92d07e48907f/OvmfPkg/VirtioNetDxe/SnpSharedHelpers.c(184):
>>> ((BOOLEAN)(0==1))
>>>
>>> The failure was encountered on a aarch64/Armv8 system (Ampere server)
>>> using the latest mainline (edk2.git/master) code as of 8-28-2018.
>>>
>>> On a hunch, I reverted the following patchset which resulted in PXE
>>> boot being successful - which suggests this patchset is likely the
>>> culprit:
>>>
>>> https://lists.01.org/pipermail/edk2-devel/2017-September/014679.html
>>>
>>> The command used to initiate the VM PXE boot/install is:
>>>
>>> virt-install --name guest1 --vcpus 16 --ram 32000 --arch aarch64
>>> --disk guest1.img,size=100 --network
>>> type=direct,source=enP4p1s0,model=virtio --pxe
>>>
>>> NOTE that I also tried simply removing the offending ASSERT() and PXE
>>> boot still fails with multiple errors (exceptions of some sort that I
>>> didn't log).
>>>
>>>
>>> What I'd like to know:
>>>
>>> 1. Is this a known failure?
>>
>> Please read the comment in the context of the failed assertion. If this
>> assertion is triggered, then that's an error in the code that consumes
>> the SNP instance provided by VirtioNetDxe. The client code invokes
>> behavior that is undefined by the UEFI spec.
>>
>> You can find more details at:
>>
>>
>> http://mid.mail-archive.com/2adc266a-949d-c546-9bed-d78fea19d969@redhat.com
>>
>>> 2. Was aarch64 given much testing with this new patchset?
>>
>> Yes. I described the tests that I had done before pushing the patch set
>> here:
>>
>>
>> http://mid.mail-archive.com/581a568d-b0ce-34a1-8e52-95adf14c03e0@redhat.com
>>
>>
>> http://mid.mail-archive.com/1e4267dc-0133-e833-f805-8381ef82c52e@redhat.com
>>
>>
>> And most recently (independently of the patch set you refer to), I
>> successfully tested VirtioNetDxe on aarch64 as part of PXEv4, PXEv6,
>> HTTPv4, HTTPv6 net-boots, for commit ae08ea246fe9
>> ("ArmVirtPkg/ArmVirtQemu: enable the IPv6 stack", 2018-07-13):
>>
>> http://mid.mail-archive.com/20180712234112.3566-1-lersek@redhat.com
>>
>>> 3. Is reverting this patchset a safe workaround for now until a fix is
>>> found? Or is there another simple workaround perhaps?
>>
>> Reverting the set should make the symptom go away, but that just papers
>> over the undefined behavior in the protocol caller.
>>
>> The goal of the series was to enable VirtioNetDxe to function correctly
>> on top of:
>> - a virtio-1.0 PCI transport, provided by Virtio10Dxe, such that
>> - explicit DMA mapping is performed using the EFI_PCI_IO_PROTOCOL
>> underlying the virtio transport, such that
>> - the platform provides an IOMMU protocol implementation for
>> PciHostBridgeDxe, which underlies PciBusDxe.
>>
>> If this use case does not apply to your platform, then you can revert
>> the patch set.
>>
>> In particular, in the ArmVirtQemu platform, we don't hook
>> "OvmfPkg/Library/PlatformHasIoMmuLib/PlatformHasIoMmuLib.inf" into
>> "PciHostBridgeDxe" via NULL class resolution, thus PciHostBridgeDxe
>> there will not wait for another platform driver to decide dynamically
>> whether the platform has an actual IOMMU protocol or not. And,
>> accordingly, we never produce an IOMMU protocol instance in ArmVirtQemu.
>>
>> (Obviously the patch set was extensively regression-tested to confirm it
>> was a no-op on platforms where the above use case wouldn't apply.)
>>
>>> 4. Any hunches what the issue could be?
>>
>> Code that uses VirtioNetDxe's SNP protocol interface invokes undefined
>> behavior; please see my answer to your Q1.
>>
>>> Thanks for any info/help/suggestions,
>>
>> I'd like to give you a hint on debugging this further as well:
>>
>> (1) in the VirtioNetTransmit() function, please add DEBUG messages both
>> before and after the call to VirtioNetMapTxBuf(). Before, log "Buffer"
>> (with %p). After, log "DeviceAddress" (with "%Lx").
>>
>> (2) In the VirtioNetGetStatus() function, please add DEBUG messages both
>> before and after the call to VirtioNetUnmapTxBuf(). Before, log
>> "DeviceAddress" (with "%Lx"). After, log "*TxBuf" (with "%p").
>>
>> In each of (1) and (2), the values printed before/after should be
>> identical (= identity-mapping for DMA).
>>
>> Please write a python script that parses the log generated by (1) and
>> (2). The python script should maintain an associative array where the
>> key type is "address", and the value type is "integer" (starting value 0).
>>
>> For each "Buffer" address printed by (1), the python script should
>> increase the counter for that address.
>>
>> For each "*TxBuf" address printed by (2), the python script should
>> decrease the counter for that address.
>>
>> If any counter goes above 1, then the client code invokes undefined
>> behavior. If any counter goes under 0, then we messed up something. If
>> all counters only take values 0 and 1, but you still hit the assert,
>> then again we messed up something.
>>
>> The idea is to check whether:
>>
>> - the SNP consumer code tries to Transmit() the same buffer for a second
>> time *before* it gets it back from GetStatus() -- this is the error that
>> I'm currently suspecting,
>>
>> - or we messed up something (return a buffer that hasn't been queued for
>> transmission, or else we mistake a not-yet-queued buffer for an already
>> queued buffer when it is Transmit()ted for the first time).
>>
>>
>> ... BTW, if you build ArmVirtQemu for the RELEASE target, the ASSERT()
>> will disappear from VirtioNetMapTxBuf(), and Transmit() will simply
>> return EFI_DEVICE_ERROR. This is consistent with your last paragraph --
>> i.e., "boot still fails with multiple errors" -- and the reason for
>> returning this error is that the SNP consumer really tries to invoke
>> undefined behavior. It's not always possible to detect UB, but whenever
>> we can, without a huge performance hit, then in DEBUG builds it's
>> appropriate to trip an assert, and in RELEASE builds, to return an error.
>>
>> Thanks
>> Laszlo
>>
>>> -Aaron Young
>>>
>> Thanks
>> Laszlo
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
2018-10-03 17:35 ` Ard Biesheuvel
@ 2018-10-03 23:59 ` aaron.young
2018-10-04 9:24 ` Leif Lindholm
0 siblings, 1 reply; 10+ messages in thread
From: aaron.young @ 2018-10-03 23:59 UTC (permalink / raw)
To: Ard Biesheuvel, Leif Lindholm
Cc: Laszlo Ersek, edk2-devel@lists.01.org, Brijesh Singh
I am suspecting that this patch to GRUB is the cause of a Buffer being
re-transmitted before reaping the Buffer via SNP->GetStatus():
https://git.centos.org/blob/rpms!grub2.git/1065bd29e776aef83f927747882140dcb6fd5cde/SOURCES!0183-efinet-retransmit-if-our-device-is-busy.patch
So, to reproduce the issue, the GRUB used via PXE boot needs to
include this patch.
Thanks,
-Aaron
On 10/03/18 10:35, Ard Biesheuvel wrote:
> (+ Leif)
>
> On 3 October 2018 at 18:56, <aaron.young@oracle.com> wrote:
>> Hello,
>>
>> I have some more information on this issue that I wanted to share.
>>
>> The SNP client appears to be GRUB - i.e. the ASSERT() is asserted just
>> after PXE code has loaded GRUB, presented the GRUB menu and a boot option is
>> selected.
>>
>> And, I have confirmed that it, as predicted, attempts to call
>> SNP->Transmit() twice with the same Buffer - without successfully reaping
>> the Buffer via SNP->GetStatus() in between calls - thus tripping the
>> ASSERT() in SnpSharedHelpers.c:VirtioNetMapTxBuf().
>>
>> This was quite easy to see since GRUB appears to use only a single Buffer
>> for which to transmit packets a few bytes at a time. i.e. it just calls
>> SNP->Transmit() and then calls SNP->GetStatus() over and over again with the
>> same buffer to transmit its packets.
>>
>> From my debug, I can see it work successfully most of the time (i.e.
>> SNP->Transmit() is called successfully with a corresponding subsequent call
>> to SNP->GetStatus() which returns the Buffer). However, occasionally,
>> SNP->GetStatus() will return TxBuf=NULL (indicating that the Buffer has not
>> yet been transmitted). In this case, GRUB appears to attempt to call
>> SNP->GetStatus() *twice* before giving up (possibly due to a timer which I
>> can see in the GRUB code) and then attempts to transfer the Buffer again -
>> resulting in the ASSERT().
>>
>> So, this explains the regression since the old code would allow the same
>> Buffer to to be re-transmitted.
>>
>> As a quick test, I coded up a quick workaround in VirtioNetMapTxBuf() to
>> walk through Dev->TxBufCollection to see if the Buffer is present - i.e. has
>> already been transmitted (and not yet reaped via SNP->GetStatus()) - and if
>> so, just return EFI_SUCCESS from VirtioNetTransmit(). This allowed the PXE
>> boot/installation to succeed. I'm not advocating this as a solution, but is
>> an interesting data point.
>>
>> Any suggestion where we go from here?
>>
>> Is there any way to make the code more tolerant of these re-transmissions
>> (like the code was before)?
>>
>> I'm not confident that declaring GRUB as not spec compliant is the answer.
>>
>> Thanks,
>>
>> -Aaron
>>
>>
>>
>> On 08/31/18 06:17, Laszlo Ersek wrote:
>>> On 08/30/18 22:17, aaron.young@oracle.com wrote:
>>>> We have encountered what we believe to be a regression in the
>>>> OvmfPkg/VirtioNetDxe driver.
>>>>
>>>> This regression causes PXE boot to fail with the following ASSERT:
>>>>
>>>> ASSERT
>>>>
>>>> [VirtioNetDxe]/builddir/build/BUILD/ovmf-92d07e48907f/OvmfPkg/VirtioNetDxe/SnpSharedHelpers.c(184):
>>>> ((BOOLEAN)(0==1))
>>>>
>>>> The failure was encountered on a aarch64/Armv8 system (Ampere server)
>>>> using the latest mainline (edk2.git/master) code as of 8-28-2018.
>>>>
>>>> On a hunch, I reverted the following patchset which resulted in PXE
>>>> boot being successful - which suggests this patchset is likely the
>>>> culprit:
>>>>
>>>> https://lists.01.org/pipermail/edk2-devel/2017-September/014679.html
>>>>
>>>> The command used to initiate the VM PXE boot/install is:
>>>>
>>>> virt-install --name guest1 --vcpus 16 --ram 32000 --arch aarch64
>>>> --disk guest1.img,size=100 --network
>>>> type=direct,source=enP4p1s0,model=virtio --pxe
>>>>
>>>> NOTE that I also tried simply removing the offending ASSERT() and PXE
>>>> boot still fails with multiple errors (exceptions of some sort that I
>>>> didn't log).
>>>>
>>>>
>>>> What I'd like to know:
>>>>
>>>> 1. Is this a known failure?
>>> Please read the comment in the context of the failed assertion. If this
>>> assertion is triggered, then that's an error in the code that consumes
>>> the SNP instance provided by VirtioNetDxe. The client code invokes
>>> behavior that is undefined by the UEFI spec.
>>>
>>> You can find more details at:
>>>
>>>
>>> http://mid.mail-archive.com/2adc266a-949d-c546-9bed-d78fea19d969@redhat.com
>>>
>>>> 2. Was aarch64 given much testing with this new patchset?
>>> Yes. I described the tests that I had done before pushing the patch set
>>> here:
>>>
>>>
>>> http://mid.mail-archive.com/581a568d-b0ce-34a1-8e52-95adf14c03e0@redhat.com
>>>
>>>
>>> http://mid.mail-archive.com/1e4267dc-0133-e833-f805-8381ef82c52e@redhat.com
>>>
>>>
>>> And most recently (independently of the patch set you refer to), I
>>> successfully tested VirtioNetDxe on aarch64 as part of PXEv4, PXEv6,
>>> HTTPv4, HTTPv6 net-boots, for commit ae08ea246fe9
>>> ("ArmVirtPkg/ArmVirtQemu: enable the IPv6 stack", 2018-07-13):
>>>
>>> http://mid.mail-archive.com/20180712234112.3566-1-lersek@redhat.com
>>>
>>>> 3. Is reverting this patchset a safe workaround for now until a fix is
>>>> found? Or is there another simple workaround perhaps?
>>> Reverting the set should make the symptom go away, but that just papers
>>> over the undefined behavior in the protocol caller.
>>>
>>> The goal of the series was to enable VirtioNetDxe to function correctly
>>> on top of:
>>> - a virtio-1.0 PCI transport, provided by Virtio10Dxe, such that
>>> - explicit DMA mapping is performed using the EFI_PCI_IO_PROTOCOL
>>> underlying the virtio transport, such that
>>> - the platform provides an IOMMU protocol implementation for
>>> PciHostBridgeDxe, which underlies PciBusDxe.
>>>
>>> If this use case does not apply to your platform, then you can revert
>>> the patch set.
>>>
>>> In particular, in the ArmVirtQemu platform, we don't hook
>>> "OvmfPkg/Library/PlatformHasIoMmuLib/PlatformHasIoMmuLib.inf" into
>>> "PciHostBridgeDxe" via NULL class resolution, thus PciHostBridgeDxe
>>> there will not wait for another platform driver to decide dynamically
>>> whether the platform has an actual IOMMU protocol or not. And,
>>> accordingly, we never produce an IOMMU protocol instance in ArmVirtQemu.
>>>
>>> (Obviously the patch set was extensively regression-tested to confirm it
>>> was a no-op on platforms where the above use case wouldn't apply.)
>>>
>>>> 4. Any hunches what the issue could be?
>>> Code that uses VirtioNetDxe's SNP protocol interface invokes undefined
>>> behavior; please see my answer to your Q1.
>>>
>>>> Thanks for any info/help/suggestions,
>>> I'd like to give you a hint on debugging this further as well:
>>>
>>> (1) in the VirtioNetTransmit() function, please add DEBUG messages both
>>> before and after the call to VirtioNetMapTxBuf(). Before, log "Buffer"
>>> (with %p). After, log "DeviceAddress" (with "%Lx").
>>>
>>> (2) In the VirtioNetGetStatus() function, please add DEBUG messages both
>>> before and after the call to VirtioNetUnmapTxBuf(). Before, log
>>> "DeviceAddress" (with "%Lx"). After, log "*TxBuf" (with "%p").
>>>
>>> In each of (1) and (2), the values printed before/after should be
>>> identical (= identity-mapping for DMA).
>>>
>>> Please write a python script that parses the log generated by (1) and
>>> (2). The python script should maintain an associative array where the
>>> key type is "address", and the value type is "integer" (starting value 0).
>>>
>>> For each "Buffer" address printed by (1), the python script should
>>> increase the counter for that address.
>>>
>>> For each "*TxBuf" address printed by (2), the python script should
>>> decrease the counter for that address.
>>>
>>> If any counter goes above 1, then the client code invokes undefined
>>> behavior. If any counter goes under 0, then we messed up something. If
>>> all counters only take values 0 and 1, but you still hit the assert,
>>> then again we messed up something.
>>>
>>> The idea is to check whether:
>>>
>>> - the SNP consumer code tries to Transmit() the same buffer for a second
>>> time *before* it gets it back from GetStatus() -- this is the error that
>>> I'm currently suspecting,
>>>
>>> - or we messed up something (return a buffer that hasn't been queued for
>>> transmission, or else we mistake a not-yet-queued buffer for an already
>>> queued buffer when it is Transmit()ted for the first time).
>>>
>>>
>>> ... BTW, if you build ArmVirtQemu for the RELEASE target, the ASSERT()
>>> will disappear from VirtioNetMapTxBuf(), and Transmit() will simply
>>> return EFI_DEVICE_ERROR. This is consistent with your last paragraph --
>>> i.e., "boot still fails with multiple errors" -- and the reason for
>>> returning this error is that the SNP consumer really tries to invoke
>>> undefined behavior. It's not always possible to detect UB, but whenever
>>> we can, without a huge performance hit, then in DEBUG builds it's
>>> appropriate to trip an assert, and in RELEASE builds, to return an error.
>>>
>>> Thanks
>>> Laszlo
>>>
>>>> -Aaron Young
>>>>
>>> Thanks
>>> Laszlo
>>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
2018-10-03 23:59 ` aaron.young
@ 2018-10-04 9:24 ` Leif Lindholm
2018-10-04 13:06 ` Laszlo Ersek
0 siblings, 1 reply; 10+ messages in thread
From: Leif Lindholm @ 2018-10-04 9:24 UTC (permalink / raw)
To: aaron.young
Cc: Ard Biesheuvel, Laszlo Ersek, edk2-devel@lists.01.org,
Brijesh Singh, pjones
+Peter
On Wed, Oct 03, 2018 at 04:59:54PM -0700, aaron.young@oracle.com wrote:
> I am suspecting that this patch to GRUB is the cause of a Buffer being
> re-transmitted before reaping the Buffer via SNP->GetStatus():
>
> https://git.centos.org/blob/rpms!grub2.git/1065bd29e776aef83f927747882140dcb6fd5cde/SOURCES!0183-efinet-retransmit-if-our-device-is-busy.patch
>
> So, to reproduce the issue, the GRUB used via PXE boot needs to include
> this patch.
So the issue cannot be reproduced with upstream GRUB?
Does Fedora/Red Hat include the same patch?
/
Leif
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
2018-10-04 9:24 ` Leif Lindholm
@ 2018-10-04 13:06 ` Laszlo Ersek
2018-10-15 20:52 ` aaron.young
0 siblings, 1 reply; 10+ messages in thread
From: Laszlo Ersek @ 2018-10-04 13:06 UTC (permalink / raw)
To: Leif Lindholm, aaron.young
Cc: Ard Biesheuvel, edk2-devel@lists.01.org, Brijesh Singh, pjones
On 10/04/18 11:24, Leif Lindholm wrote:
> +Peter
>
> On Wed, Oct 03, 2018 at 04:59:54PM -0700, aaron.young@oracle.com wrote:
>> I am suspecting that this patch to GRUB is the cause of a Buffer being
>> re-transmitted before reaping the Buffer via SNP->GetStatus():
>>
>> https://git.centos.org/blob/rpms!grub2.git/1065bd29e776aef83f927747882140dcb6fd5cde/SOURCES!0183-efinet-retransmit-if-our-device-is-busy.patch
>>
>> So, to reproduce the issue, the GRUB used via PXE boot needs to include
>> this patch.
>
> So the issue cannot be reproduced with upstream GRUB?
>
> Does Fedora/Red Hat include the same patch?
Here's what I can see.
(1) In upstream grub <https://git.savannah.gnu.org/git/grub.git>, at
commit 8ada906031d9 ("msdos: Fix overflow in converting partition start
and length into 512B blocks", 2018-09-27), on the master branch, the
patch is not present.
(2) In "rhboot" grub2 <https://github.com/rhboot/grub2>, where the
master branch seems to track upstream grub, the patch is present on at
least the "fedora-28" and "rhel-7.5" branches. Commit hashes,
respectively: c2b126f52143, 1b9767c136082.
(3) In the commit message, Josef wrote, "When I fixed the txbuf handling
I ripped out the retransmission code". I think he referred to his
earlier commit 4fe8e6d4a127 ("efinet: handle get_status() on buggy
firmware properly", 2015-08-09). That commit is upstream.
In my opinion, commit 4fe8e6d4a127, the chronologically first, and
upstream, tweak, was right (assuming the comment it added was true,
about grub).
On the other hand, the downstream-only (chronologically 2nd) commit was
wrong. Not only did it break the spec, it broke even grub's own internal
invariant, described in the comment that was added in upstream commit
4fe8e6d4a127. The comment states, "we only transmit one packet at a
time". With the downstream-only tweak applied, that's no longer true.
Namely, SNP.Transmit() is called while we know another transmission is
pending on the egress queue. That's the definition of sending more than
one packet at a time.
I'm curious why the patch in question is not upstream. Was it submitted
and rejected? Submitted and ignored? Not submitted at all?
I'm not a fan of the hard-liner "spec above everything" approach. In
this case though, after the downstream-only tweak, grub is inconsistent
not only with the spec, but with itself too.
IMO, downstream(s) should revert the downstream-only patch.
Thanks,
Laszlo
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
2018-10-04 13:06 ` Laszlo Ersek
@ 2018-10-15 20:52 ` aaron.young
2018-10-16 1:55 ` Ard Biesheuvel
0 siblings, 1 reply; 10+ messages in thread
From: aaron.young @ 2018-10-15 20:52 UTC (permalink / raw)
To: Laszlo Ersek, Leif Lindholm
Cc: Ard Biesheuvel, edk2-devel@lists.01.org, Brijesh Singh, pjones,
Aaron Young
On 10/04/18 06:06, Laszlo Ersek wrote:
> On 10/04/18 11:24, Leif Lindholm wrote:
>> +Peter
>>
>> On Wed, Oct 03, 2018 at 04:59:54PM -0700, aaron.young@oracle.com wrote:
>>> I am suspecting that this patch to GRUB is the cause of a Buffer being
>>> re-transmitted before reaping the Buffer via SNP->GetStatus():
>>>
>>> https://git.centos.org/blob/rpms!grub2.git/1065bd29e776aef83f927747882140dcb6fd5cde/SOURCES!0183-efinet-retransmit-if-our-device-is-busy.patch
>>>
>>> So, to reproduce the issue, the GRUB used via PXE boot needs to include
>>> this patch.
>> So the issue cannot be reproduced with upstream GRUB?
>>
>> Does Fedora/Red Hat include the same patch?
> Here's what I can see.
>
> (1) In upstream grub <https://git.savannah.gnu.org/git/grub.git>, at
> commit 8ada906031d9 ("msdos: Fix overflow in converting partition start
> and length into 512B blocks", 2018-09-27), on the master branch, the
> patch is not present.
>
> (2) In "rhboot" grub2 <https://github.com/rhboot/grub2>, where the
> master branch seems to track upstream grub, the patch is present on at
> least the "fedora-28" and "rhel-7.5" branches. Commit hashes,
> respectively: c2b126f52143, 1b9767c136082.
>
> (3) In the commit message, Josef wrote, "When I fixed the txbuf handling
> I ripped out the retransmission code". I think he referred to his
> earlier commit 4fe8e6d4a127 ("efinet: handle get_status() on buggy
> firmware properly", 2015-08-09). That commit is upstream.
>
> In my opinion, commit 4fe8e6d4a127, the chronologically first, and
> upstream, tweak, was right (assuming the comment it added was true,
> about grub).
>
> On the other hand, the downstream-only (chronologically 2nd) commit was
> wrong. Not only did it break the spec, it broke even grub's own internal
> invariant, described in the comment that was added in upstream commit
> 4fe8e6d4a127. The comment states, "we only transmit one packet at a
> time". With the downstream-only tweak applied, that's no longer true.
> Namely, SNP.Transmit() is called while we know another transmission is
> pending on the egress queue. That's the definition of sending more than
> one packet at a time.
>
> I'm curious why the patch in question is not upstream. Was it submitted
> and rejected? Submitted and ignored? Not submitted at all?
>
> I'm not a fan of the hard-liner "spec above everything" approach. In
> this case though, after the downstream-only tweak, grub is inconsistent
> not only with the spec, but with itself too.
>
> IMO, downstream(s) should revert the downstream-only patch.
>
> Thanks,
> Laszlo
I have confirmed that reverting this GRUB patch indeed fixes the issue
(i.e. with the VirtioNetDxe/SnpSharedHelpers.c(184) ASSERT).
Thanks for the help/info in resolving this issue.
As a follow up question - it seems that the VirtioNetDxe driver is
fragile in that it can get into broken state if (for whatever reason) a
tx buffer is not successfully transmitted and thus never shows back up
on the Used Ring. i.e. If a SNP client has repeatedly called
SNP-GetStatus() and, after a certain amount of time, fails gets back the
buffer, what should it do? If it attempts to re-transmit the buffer,
it'll hit the ASSERT. Perhaps it should shutdown/re-init the interface
in this case (to free up the buffer mapping in Dev->TxBufCollection)?
Or, are we confident this condition can _never_ happen?
Thanks again,
-Aaron
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
2018-10-15 20:52 ` aaron.young
@ 2018-10-16 1:55 ` Ard Biesheuvel
2018-10-17 9:50 ` Laszlo Ersek
0 siblings, 1 reply; 10+ messages in thread
From: Ard Biesheuvel @ 2018-10-16 1:55 UTC (permalink / raw)
To: aaron.young
Cc: Laszlo Ersek, Leif Lindholm, edk2-devel@lists.01.org,
Brijesh Singh, Peter Jones
On 15 October 2018 at 22:52, <aaron.young@oracle.com> wrote:
>
>
> On 10/04/18 06:06, Laszlo Ersek wrote:
>>
>> On 10/04/18 11:24, Leif Lindholm wrote:
>>>
>>> +Peter
>>>
>>> On Wed, Oct 03, 2018 at 04:59:54PM -0700, aaron.young@oracle.com wrote:
>>>>
>>>> I am suspecting that this patch to GRUB is the cause of a Buffer being
>>>> re-transmitted before reaping the Buffer via SNP->GetStatus():
>>>>
>>>>
>>>> https://git.centos.org/blob/rpms!grub2.git/1065bd29e776aef83f927747882140dcb6fd5cde/SOURCES!0183-efinet-retransmit-if-our-device-is-busy.patch
>>>>
>>>> So, to reproduce the issue, the GRUB used via PXE boot needs to
>>>> include
>>>> this patch.
>>>
>>> So the issue cannot be reproduced with upstream GRUB?
>>>
>>> Does Fedora/Red Hat include the same patch?
>>
>> Here's what I can see.
>>
>> (1) In upstream grub <https://git.savannah.gnu.org/git/grub.git>, at
>> commit 8ada906031d9 ("msdos: Fix overflow in converting partition start
>> and length into 512B blocks", 2018-09-27), on the master branch, the
>> patch is not present.
>>
>> (2) In "rhboot" grub2 <https://github.com/rhboot/grub2>, where the
>> master branch seems to track upstream grub, the patch is present on at
>> least the "fedora-28" and "rhel-7.5" branches. Commit hashes,
>> respectively: c2b126f52143, 1b9767c136082.
>>
>> (3) In the commit message, Josef wrote, "When I fixed the txbuf handling
>> I ripped out the retransmission code". I think he referred to his
>> earlier commit 4fe8e6d4a127 ("efinet: handle get_status() on buggy
>> firmware properly", 2015-08-09). That commit is upstream.
>>
>> In my opinion, commit 4fe8e6d4a127, the chronologically first, and
>> upstream, tweak, was right (assuming the comment it added was true,
>> about grub).
>>
>> On the other hand, the downstream-only (chronologically 2nd) commit was
>> wrong. Not only did it break the spec, it broke even grub's own internal
>> invariant, described in the comment that was added in upstream commit
>> 4fe8e6d4a127. The comment states, "we only transmit one packet at a
>> time". With the downstream-only tweak applied, that's no longer true.
>> Namely, SNP.Transmit() is called while we know another transmission is
>> pending on the egress queue. That's the definition of sending more than
>> one packet at a time.
>>
>> I'm curious why the patch in question is not upstream. Was it submitted
>> and rejected? Submitted and ignored? Not submitted at all?
>>
>> I'm not a fan of the hard-liner "spec above everything" approach. In
>> this case though, after the downstream-only tweak, grub is inconsistent
>> not only with the spec, but with itself too.
>>
>> IMO, downstream(s) should revert the downstream-only patch.
>>
>> Thanks,
>> Laszlo
>
>
> I have confirmed that reverting this GRUB patch indeed fixes the issue (i.e.
> with the VirtioNetDxe/SnpSharedHelpers.c(184) ASSERT).
>
> Thanks for the help/info in resolving this issue.
>
> As a follow up question - it seems that the VirtioNetDxe driver is fragile
> in that it can get into broken state if (for whatever reason) a tx buffer is
> not successfully transmitted and thus never shows back up on the Used Ring.
> i.e. If a SNP client has repeatedly called SNP-GetStatus() and, after a
> certain amount of time, fails gets back the buffer, what should it do? If it
> attempts to re-transmit the buffer, it'll hit the ASSERT. Perhaps it should
> shutdown/re-init the interface in this case (to free up the buffer mapping
> in Dev->TxBufCollection)? Or, are we confident this condition can _never_
> happen?
>
It is an implementation detail of GRUB that it only ever uses a single
TX buffer: the protocol supports an arbitrary number of buffers, and
repeated calls to GetStatus() will return each one once the network
stack is done with them.
So the answer is simply to allocate another buffer, and use that.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
2018-10-16 1:55 ` Ard Biesheuvel
@ 2018-10-17 9:50 ` Laszlo Ersek
0 siblings, 0 replies; 10+ messages in thread
From: Laszlo Ersek @ 2018-10-17 9:50 UTC (permalink / raw)
To: aaron.young
Cc: Ard Biesheuvel, Leif Lindholm, edk2-devel@lists.01.org,
Brijesh Singh, Peter Jones
Hi Aaron,
On 10/16/18 03:55, Ard Biesheuvel wrote:
> On 15 October 2018 at 22:52, <aaron.young@oracle.com> wrote:
>> On 10/04/18 06:06, Laszlo Ersek wrote:
>>> On 10/04/18 11:24, Leif Lindholm wrote:
>>>>
>>>> +Peter
>>>>
>>>> On Wed, Oct 03, 2018 at 04:59:54PM -0700, aaron.young@oracle.com wrote:
>>>>>
>>>>> I am suspecting that this patch to GRUB is the cause of a Buffer being
>>>>> re-transmitted before reaping the Buffer via SNP->GetStatus():
>>>>>
>>>>>
>>>>> https://git.centos.org/blob/rpms!grub2.git/1065bd29e776aef83f927747882140dcb6fd5cde/SOURCES!0183-efinet-retransmit-if-our-device-is-busy.patch
>>>>>
>>>>> So, to reproduce the issue, the GRUB used via PXE boot needs to
>>>>> include
>>>>> this patch.
>>>>
>>>> So the issue cannot be reproduced with upstream GRUB?
>>>>
>>>> Does Fedora/Red Hat include the same patch?
>>>
>>> Here's what I can see.
>>>
>>> (1) In upstream grub <https://git.savannah.gnu.org/git/grub.git>, at
>>> commit 8ada906031d9 ("msdos: Fix overflow in converting partition start
>>> and length into 512B blocks", 2018-09-27), on the master branch, the
>>> patch is not present.
>>>
>>> (2) In "rhboot" grub2 <https://github.com/rhboot/grub2>, where the
>>> master branch seems to track upstream grub, the patch is present on at
>>> least the "fedora-28" and "rhel-7.5" branches. Commit hashes,
>>> respectively: c2b126f52143, 1b9767c136082.
>>>
>>> (3) In the commit message, Josef wrote, "When I fixed the txbuf handling
>>> I ripped out the retransmission code". I think he referred to his
>>> earlier commit 4fe8e6d4a127 ("efinet: handle get_status() on buggy
>>> firmware properly", 2015-08-09). That commit is upstream.
>>>
>>> In my opinion, commit 4fe8e6d4a127, the chronologically first, and
>>> upstream, tweak, was right (assuming the comment it added was true,
>>> about grub).
>>>
>>> On the other hand, the downstream-only (chronologically 2nd) commit was
>>> wrong. Not only did it break the spec, it broke even grub's own internal
>>> invariant, described in the comment that was added in upstream commit
>>> 4fe8e6d4a127. The comment states, "we only transmit one packet at a
>>> time". With the downstream-only tweak applied, that's no longer true.
>>> Namely, SNP.Transmit() is called while we know another transmission is
>>> pending on the egress queue. That's the definition of sending more than
>>> one packet at a time.
>>>
>>> I'm curious why the patch in question is not upstream. Was it submitted
>>> and rejected? Submitted and ignored? Not submitted at all?
>>>
>>> I'm not a fan of the hard-liner "spec above everything" approach. In
>>> this case though, after the downstream-only tweak, grub is inconsistent
>>> not only with the spec, but with itself too.
>>>
>>> IMO, downstream(s) should revert the downstream-only patch.
>>>
>>> Thanks,
>>> Laszlo
>>
>>
>> I have confirmed that reverting this GRUB patch indeed fixes the issue (i.e.
>> with the VirtioNetDxe/SnpSharedHelpers.c(184) ASSERT).
>>
>> Thanks for the help/info in resolving this issue.
>>
>> As a follow up question - it seems that the VirtioNetDxe driver is fragile
>> in that it can get into broken state if (for whatever reason) a tx buffer is
>> not successfully transmitted and thus never shows back up on the Used Ring.
>> i.e. If a SNP client has repeatedly called SNP-GetStatus() and, after a
>> certain amount of time, fails gets back the buffer, what should it do? If it
>> attempts to re-transmit the buffer, it'll hit the ASSERT. Perhaps it should
>> shutdown/re-init the interface in this case (to free up the buffer mapping
>> in Dev->TxBufCollection)? Or, are we confident this condition can _never_
>> happen?
>>
>
> It is an implementation detail of GRUB that it only ever uses a single
> TX buffer: the protocol supports an arbitrary number of buffers, and
> repeated calls to GetStatus() will return each one once the network
> stack is done with them.
>
> So the answer is simply to allocate another buffer, and use that.
I agree with Ard's answer.
I'd like to add:
* If the virtio network device (= the hypervisor) *never* puts the head
of a specific descriptor chain back on the Used Ring (in the TX queue),
then either that is a bug in the virtio network device implementation,
or the packet is genuinely waiting for transmission.
In the former case (= problematic virtio device impl), I think whatever
workaround we came up with would only beget more problems down the road.
(In the present case too, the grub story originally started with a
grub-side workaround for the broken layer underneath, namely an SNP
driver returning bogus buffer addresses.)
In the latter case (= packet still pending transmission), the client
should simply wait. If it decides not to wait any longer, that's fine,
but it's no excuse for reusing the exact same buffer (see Ard's comment).
* Alternatively, I agree that "power cycling" the SNP instance is a
valid thing to do, using EFI_SIMPLE_NETWORK.Shutdown(), and then
EFI_SIMPLE_NETWORK.Initialize(). I'm stating this based on both the
spec, and on the VirtioNetDxe implementation.
In particular, VirtioNetShutdown() starts with a virtio reset, which
will de-configure the virtio device (make it forget all guest memory
references, for example). Then VirtioNetShutdown() calls
VirtioNetShutdownTx() too, which tears down Dev->TxBufCollection, as you
say.
Thanks,
Laszlo
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2018-10-17 9:51 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-08-30 20:17 Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system aaron.young
2018-08-31 13:17 ` Laszlo Ersek
2018-10-03 16:56 ` aaron.young
2018-10-03 17:35 ` Ard Biesheuvel
2018-10-03 23:59 ` aaron.young
2018-10-04 9:24 ` Leif Lindholm
2018-10-04 13:06 ` Laszlo Ersek
2018-10-15 20:52 ` aaron.young
2018-10-16 1:55 ` Ard Biesheuvel
2018-10-17 9:50 ` Laszlo Ersek
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox