From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <lersek@redhat.com>
Received-SPF: Pass (sender SPF authorized) identity=mailfrom;
 client-ip=209.132.183.28; helo=mx1.redhat.com; envelope-from=lersek@redhat.com;
 receiver=edk2-devel@lists.01.org 
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ml01.01.org (Postfix) with ESMTPS id F1BD22116822E
 for <edk2-devel@lists.01.org>; Wed, 17 Oct 2018 02:51:02 -0700 (PDT)
Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com
 [10.5.11.15])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id DA85E30F8DB9;
 Wed, 17 Oct 2018 09:51:00 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-50.rdu2.redhat.com
 [10.10.120.50])
 by smtp.corp.redhat.com (Postfix) with ESMTP id D3B402C313;
 Wed, 17 Oct 2018 09:50:54 +0000 (UTC)
To: aaron.young@oracle.com
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>,
 Leif Lindholm <leif.lindholm@linaro.org>,
 "edk2-devel@lists.01.org" <edk2-devel@lists.01.org>,
 Brijesh Singh <brijesh.singh@amd.com>, Peter Jones <pjones@redhat.com>
References: <ccea8a82-1696-84ac-1cb6-0f1b62e3c9b1@oracle.com>
 <b034c163-ea74-b059-6186-5ffda946b536@redhat.com>
 <3b4a4980-42d0-51a0-eccd-ceba83b6c78f@oracle.com>
 <CAKv+Gu87iumO0yTjEBNBY6ukZXkWwFN6beFmhMkwTmcwB7PWyw@mail.gmail.com>
 <eb5d2f4d-1b05-3d88-ad1e-d0184857c692@oracle.com>
 <20181004092411.4bgio3ewvmwkfuyq@bivouac.eciton.net>
 <dcc618fa-a9bf-3d57-5f57-5ae9e9c3148a@redhat.com>
 <85d38be2-fb2f-669d-cc39-3fb6a8d09842@oracle.com>
 <CAKv+Gu-RjJOeqke55M5Ff1+DwxxPjtyuLtzDEFV4UQWHKeyiuA@mail.gmail.com>
From: Laszlo Ersek <lersek@redhat.com>
Message-ID: <da272239-77c8-5eff-b56b-7f870b80f6ac@redhat.com>
Date: Wed, 17 Oct 2018 11:50:54 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To: <CAKv+Gu-RjJOeqke55M5Ff1+DwxxPjtyuLtzDEFV4UQWHKeyiuA@mail.gmail.com>
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.49]); Wed, 17 Oct 2018 09:51:01 +0000 (UTC)
Subject: Re: Regression with PXE boot on OvmfPkg/VirtioNetDxe driver on aarch64 system
X-BeenThere: edk2-devel@lists.01.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: EDK II Development  <edk2-devel.lists.01.org>
List-Unsubscribe: <https://lists.01.org/mailman/options/edk2-devel>,
 <mailto:edk2-devel-request@lists.01.org?subject=unsubscribe>
List-Archive: <http://lists.01.org/pipermail/edk2-devel/>
List-Post: <mailto:edk2-devel@lists.01.org>
List-Help: <mailto:edk2-devel-request@lists.01.org?subject=help>
List-Subscribe: <https://lists.01.org/mailman/listinfo/edk2-devel>,
 <mailto:edk2-devel-request@lists.01.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2018 09:51:03 -0000
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit

Hi Aaron,

On 10/16/18 03:55, Ard Biesheuvel wrote:
> On 15 October 2018 at 22:52,  <aaron.young@oracle.com> wrote:
>> On 10/04/18 06:06, Laszlo Ersek wrote:
>>> On 10/04/18 11:24, Leif Lindholm wrote:
>>>>
>>>> +Peter
>>>>
>>>> On Wed, Oct 03, 2018 at 04:59:54PM -0700, aaron.young@oracle.com wrote:
>>>>>
>>>>>   I am suspecting that this patch to GRUB is the cause of a Buffer being
>>>>> re-transmitted before reaping the Buffer via SNP->GetStatus():
>>>>>
>>>>>
>>>>> https://git.centos.org/blob/rpms!grub2.git/1065bd29e776aef83f927747882140dcb6fd5cde/SOURCES!0183-efinet-retransmit-if-our-device-is-busy.patch
>>>>>
>>>>>   So, to reproduce the issue, the GRUB used via PXE boot needs to
>>>>> include
>>>>> this patch.
>>>>
>>>> So the issue cannot be reproduced with upstream GRUB?
>>>>
>>>> Does Fedora/Red Hat include the same patch?
>>>
>>> Here's what I can see.
>>>
>>> (1) In upstream grub <https://git.savannah.gnu.org/git/grub.git>, at
>>> commit 8ada906031d9 ("msdos: Fix overflow in converting partition start
>>> and length into 512B blocks", 2018-09-27), on the master branch, the
>>> patch is not present.
>>>
>>> (2) In "rhboot" grub2 <https://github.com/rhboot/grub2>, where the
>>> master branch seems to track upstream grub, the patch is present on at
>>> least the "fedora-28" and "rhel-7.5" branches. Commit hashes,
>>> respectively: c2b126f52143, 1b9767c136082.
>>>
>>> (3) In the commit message, Josef wrote, "When I fixed the txbuf handling
>>> I ripped out the retransmission code". I think he referred to his
>>> earlier commit 4fe8e6d4a127 ("efinet: handle get_status() on buggy
>>> firmware properly", 2015-08-09). That commit is upstream.
>>>
>>> In my opinion, commit 4fe8e6d4a127, the chronologically first, and
>>> upstream, tweak, was right (assuming the comment it added was true,
>>> about grub).
>>>
>>> On the other hand, the downstream-only (chronologically 2nd) commit was
>>> wrong. Not only did it break the spec, it broke even grub's own internal
>>> invariant, described in the comment that was added in upstream commit
>>> 4fe8e6d4a127. The comment states, "we only transmit one packet at a
>>> time". With the downstream-only tweak applied, that's no longer true.
>>> Namely, SNP.Transmit() is called while we know another transmission is
>>> pending on the egress queue. That's the definition of sending more than
>>> one packet at a time.
>>>
>>> I'm curious why the patch in question is not upstream. Was it submitted
>>> and rejected? Submitted and ignored? Not submitted at all?
>>>
>>> I'm not a fan of the hard-liner "spec above everything" approach. In
>>> this case though, after the downstream-only tweak, grub is inconsistent
>>> not only with the spec, but with itself too.
>>>
>>> IMO, downstream(s) should revert the downstream-only patch.
>>>
>>> Thanks,
>>> Laszlo
>>
>>
>> I have confirmed that reverting this GRUB patch indeed fixes the issue (i.e.
>> with the VirtioNetDxe/SnpSharedHelpers.c(184) ASSERT).
>>
>>   Thanks for the help/info in resolving this issue.
>>
>>   As a follow up question - it seems that the VirtioNetDxe driver is fragile
>> in that it can get into broken state if (for whatever reason) a tx buffer is
>> not successfully transmitted and thus never shows back up on the Used Ring.
>> i.e. If a SNP client has repeatedly called SNP-GetStatus() and, after a
>> certain amount of time, fails gets back the buffer, what should it do? If it
>> attempts to re-transmit the buffer, it'll hit the ASSERT. Perhaps it should
>> shutdown/re-init the interface in this case (to free up the buffer mapping
>> in Dev->TxBufCollection)? Or, are we confident this condition can _never_
>> happen?
>>
> 
> It is an implementation detail of GRUB that it only ever uses a single
> TX buffer: the protocol supports an arbitrary number of buffers, and
> repeated calls to GetStatus() will return each one once the network
> stack is done with them.
> 
> So the answer is simply to allocate another buffer, and use that.

I agree with Ard's answer.

I'd like to add:

* If the virtio network device (= the hypervisor) *never* puts the head
of a specific descriptor chain back on the Used Ring (in the TX queue),
then either that is a bug in the virtio network device implementation,
or the packet is genuinely waiting for transmission.

In the former case (= problematic virtio device impl), I think whatever
workaround we came up with would only beget more problems down the road.
(In the present case too, the grub story originally started with a
grub-side workaround for the broken layer underneath, namely an SNP
driver returning bogus buffer addresses.)

In the latter case (= packet still pending transmission), the client
should simply wait. If it decides not to wait any longer, that's fine,
but it's no excuse for reusing the exact same buffer (see Ard's comment).

* Alternatively, I agree that "power cycling" the SNP instance is a
valid thing to do, using EFI_SIMPLE_NETWORK.Shutdown(), and then
EFI_SIMPLE_NETWORK.Initialize(). I'm stating this based on both the
spec, and on the VirtioNetDxe implementation.

In particular, VirtioNetShutdown() starts with a virtio reset, which
will de-configure the virtio device (make it forget all guest memory
references, for example). Then VirtioNetShutdown() calls
VirtioNetShutdownTx() too, which tears down Dev->TxBufCollection, as you
say.

Thanks,
Laszlo