public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* Network Stack Budgeting
@ 2019-01-23 10:55 Tomas Pilar (tpilar)
  2019-01-23 14:14 ` Ye, Ting
  2019-01-23 16:08 ` Laszlo Ersek
  0 siblings, 2 replies; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-23 10:55 UTC (permalink / raw)
  To: edk2-devel@lists.01.org

Hi,

Recently I have done some performance improvements to my network driver. I am however finding that on some platforms, it's becoming impossible to boot if the network cable has a lot of traffic on it that is not filtered by the NIC itself (broadcast, multicast or directed unicast). It would seem the platform hangs in the DXE phase trying to process (drop) all the packets and not progressing through the boot order.

I am wondering if anyone has seen similar behaviour. Does the network stack have any budgeting?

Ideally this would be fixed by the network stack not calling Snp.Start() until in the BDS phase but it seems most platforms just call Snp.Start() immediately following the driver probe.

Cheers,
Tom




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-23 10:55 Network Stack Budgeting Tomas Pilar (tpilar)
@ 2019-01-23 14:14 ` Ye, Ting
  2019-01-23 14:51   ` Tomas Pilar (tpilar)
  2019-01-23 16:08 ` Laszlo Ersek
  1 sibling, 1 reply; 23+ messages in thread
From: Ye, Ting @ 2019-01-23 14:14 UTC (permalink / raw)
  To: Tomas Pilar (tpilar), edk2-devel@lists.01.org

Hi Tom,

As I known it is up to platform BDS when to connect network stack, or even not to connect network stack. For example, in fast boot process, network stack will not be connected thus Snp.Start() has no chance to be called.
May I know which platforms you see this issue?

Thanks,
Ting


-----Original Message-----
From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Tomas Pilar (tpilar)
Sent: Wednesday, January 23, 2019 6:56 PM
To: edk2-devel@lists.01.org
Subject: [edk2] Network Stack Budgeting

Hi,

Recently I have done some performance improvements to my network driver. I am however finding that on some platforms, it's becoming impossible to boot if the network cable has a lot of traffic on it that is not filtered by the NIC itself (broadcast, multicast or directed unicast). It would seem the platform hangs in the DXE phase trying to process (drop) all the packets and not progressing through the boot order.

I am wondering if anyone has seen similar behaviour. Does the network stack have any budgeting?

Ideally this would be fixed by the network stack not calling Snp.Start() until in the BDS phase but it seems most platforms just call Snp.Start() immediately following the driver probe.

Cheers,
Tom


_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-23 14:14 ` Ye, Ting
@ 2019-01-23 14:51   ` Tomas Pilar (tpilar)
  0 siblings, 0 replies; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-23 14:51 UTC (permalink / raw)
  To: Ye, Ting, edk2-devel@lists.01.org

Hi Ting,

Currently we see it on DELL 13G platforms. However, in my experience most platforms will call Snp.Start() immediatelly following the ConnectController() way before the boot manager is entered.

Cheers,
Tom

On 23/01/2019 14:14, Ye, Ting wrote:
> Hi Tom,
>
> As I known it is up to platform BDS when to connect network stack, or even not to connect network stack. For example, in fast boot process, network stack will not be connected thus Snp.Start() has no chance to be called.
> May I know which platforms you see this issue?
>
> Thanks,
> Ting
>
>
> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Tomas Pilar (tpilar)
> Sent: Wednesday, January 23, 2019 6:56 PM
> To: edk2-devel@lists.01.org
> Subject: [edk2] Network Stack Budgeting
>
> Hi,
>
> Recently I have done some performance improvements to my network driver. I am however finding that on some platforms, it's becoming impossible to boot if the network cable has a lot of traffic on it that is not filtered by the NIC itself (broadcast, multicast or directed unicast). It would seem the platform hangs in the DXE phase trying to process (drop) all the packets and not progressing through the boot order.
>
> I am wondering if anyone has seen similar behaviour. Does the network stack have any budgeting?
>
> Ideally this would be fixed by the network stack not calling Snp.Start() until in the BDS phase but it seems most platforms just call Snp.Start() immediately following the driver probe.
>
> Cheers,
> Tom
>
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-23 10:55 Network Stack Budgeting Tomas Pilar (tpilar)
  2019-01-23 14:14 ` Ye, Ting
@ 2019-01-23 16:08 ` Laszlo Ersek
  2019-01-23 16:27   ` Tomas Pilar (tpilar)
  1 sibling, 1 reply; 23+ messages in thread
From: Laszlo Ersek @ 2019-01-23 16:08 UTC (permalink / raw)
  To: Tomas Pilar (tpilar); +Cc: edk2-devel@lists.01.org

Hi,

On 01/23/19 11:55, Tomas Pilar (tpilar) wrote:
> Hi,
> 
> Recently I have done some performance improvements to my network
> driver. I am however finding that on some platforms, it's becoming
> impossible to boot if the network cable has a lot of traffic on it
> that is not filtered by the NIC itself (broadcast, multicast or
> directed unicast). It would seem the platform hangs in the DXE phase
> trying to process (drop) all the packets and not progressing through
> the boot order.
> 
> I am wondering if anyone has seen similar behaviour. Does the network
> stack have any budgeting?
> 
> Ideally this would be fixed by the network stack not calling
> Snp.Start() until in the BDS phase but it seems most platforms just
> call Snp.Start() immediately following the driver probe.

this is a platform BDS issue, on those platforms where you see the
symptom.

The set of devices connected during BDS is platform policy. It is not
the "network stack" that calls Snp.Start(), but the platform BDS code
that calls gBS->ConnectController() on the device, possibly without a
good reason (e.g. without the device being present in a UEFI boot
option). The network stack only "reacts" to that connection request.

It is convenient for a platform BDS implementation to just connect all
drivers to all devices, regardless of what devices are actually going to
be booted. Unfortunately, as the number of devices grows (or the traffic
grows, as you say), the boot performance worsens.

In OVMF and ArmVirtQemu, we saw pretty dramatic gains when we switched
from the simple+convenient approach to a bit smarter/selective approach.
And once we had fixed all the regressions too :) , it was a net win.

(Regressions because, once you switch the policy from "blanket connect"
to "specific connect", it is easy to forget about devices that are not
really "boot" devices, but you still might want to connect them in every
case. Consider a USB keyboard, or a hardware entropy device, for
example. After the switch, you have to connect those individually.)

Thanks,
Laszlo


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-23 16:08 ` Laszlo Ersek
@ 2019-01-23 16:27   ` Tomas Pilar (tpilar)
  2019-01-23 17:47     ` Andrew Fish
  2019-01-23 22:18     ` Laszlo Ersek
  0 siblings, 2 replies; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-23 16:27 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: edk2-devel@lists.01.org


> The set of devices connected during BDS is platform policy. It is not
> the "network stack" that calls Snp.Start(), but the platform BDS code
> that calls gBS->ConnectController() on the device, possibly without a
> good reason (e.g. without the device being present in a UEFI boot
> option). The network stack only "reacts" to that connection request.
Indeed, but even if a SNP handle is present as a boot option in a boot manager, surely the Snp.Start() should be deferred until the user actually chooses to boot from that handle.

A workaround that we have in the legacy implementation doesn't start the underlying hardware datapath until the platform tries to send the first packet (since PXE boot is always initiated by the client) but that is a horrible hack that should not be necessary. The distinction between Snp.Initialize() and Snp.Start() is there exactly for that reason, no?

In other words, ConnectController() should not immediately result in Snp.Start() being called.

Cheers,
Tom




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-23 16:27   ` Tomas Pilar (tpilar)
@ 2019-01-23 17:47     ` Andrew Fish
  2019-01-23 22:18     ` Laszlo Ersek
  1 sibling, 0 replies; 23+ messages in thread
From: Andrew Fish @ 2019-01-23 17:47 UTC (permalink / raw)
  To: Tomas Pilar (tpilar); +Cc: Laszlo Ersek, edk2-devel@lists.01.org



> On Jan 23, 2019, at 8:27 AM, Tomas Pilar (tpilar) <tpilar@solarflare.com> wrote:
> 
> 
>> The set of devices connected during BDS is platform policy. It is not
>> the "network stack" that calls Snp.Start(), but the platform BDS code
>> that calls gBS->ConnectController() on the device, possibly without a
>> good reason (e.g. without the device being present in a UEFI boot
>> option). The network stack only "reacts" to that connection request.
> Indeed, but even if a SNP handle is present as a boot option in a boot manager, surely the Snp.Start() should be deferred until the user actually chooses to boot from that handle.
> 

Tom,

You don't need to call gBS->ConnectController() on all possible boot options, just the one you are currently trying to boot. 

I mostly muck around in a non edk2 BDS, but in general what you do in a BDS is:
1) Connect your graphics console(s) (usually involves ConOut NVRAM)
3) Connect any serial consoles (ConIn/ConOut NVRAM). 
2) Connect any built in keyboard, maybe SPI etc.  (ConIn NVRAM). 
4) Connect any hot pluggable console devices (Connect any existing USB HID devices).
5) Connect any other device required in the boot path (like the example entropy device. 

The platform can have policy to force values into ConIn/ConOut. For example on a laptop maybe you always want the built in keyboard active. 

As you attempt to boot a boot option you can recursively connect the device path for that boot option and attempt to boot it. If that option fails you can fall back to the next boot option and try to connect that device path and boot. Thus you don't need to start things before you are ready.

If you launch Firmware Setup that usually does a Connect All. The same things happen when you launch the Shell. 

Also some drivers connect extra stuff. For example when you try to connect a specific PCI device all the PCI IO handles get created. This is just how you have to enumerate PCI. But the recursive connect should only happen on the PCI IO handle you care about.

Thanks,

Andrew Fish


> A workaround that we have in the legacy implementation doesn't start the underlying hardware datapath until the platform tries to send the first packet (since PXE boot is always initiated by the client) but that is a horrible hack that should not be necessary. The distinction between Snp.Initialize() and Snp.Start() is there exactly for that reason, no?
> 
> In other words, ConnectController() should not immediately result in Snp.Start() being called.
> 
> Cheers,
> Tom
> 
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-23 16:27   ` Tomas Pilar (tpilar)
  2019-01-23 17:47     ` Andrew Fish
@ 2019-01-23 22:18     ` Laszlo Ersek
  2019-01-24 11:37       ` Tomas Pilar (tpilar)
  1 sibling, 1 reply; 23+ messages in thread
From: Laszlo Ersek @ 2019-01-23 22:18 UTC (permalink / raw)
  To: Tomas Pilar (tpilar); +Cc: edk2-devel@lists.01.org

On 01/23/19 17:27, Tomas Pilar (tpilar) wrote:
>
>> The set of devices connected during BDS is platform policy. It is not
>> the "network stack" that calls Snp.Start(), but the platform BDS code
>> that calls gBS->ConnectController() on the device, possibly without a
>> good reason (e.g. without the device being present in a UEFI boot
>> option). The network stack only "reacts" to that connection request.

> Indeed, but even if a SNP handle is present as a boot option in a boot
> manager, surely the Snp.Start() should be deferred until the user
> actually chooses to boot from that handle.

Yes, I agree. As far as I remember, UefiBootManagerLib is compatible
with that idea (it makes sure the device path is connected before the
boot is attempted). So again, the deferral you suggest applies to
platform BDS.

> A workaround that we have in the legacy implementation doesn't start
> the underlying hardware datapath until the platform tries to send the
> first packet (since PXE boot is always initiated by the client) but
> that is a horrible hack that should not be necessary. The distinction
> between Snp.Initialize() and Snp.Start() is there exactly for that
> reason, no?

That's a good question; if someone finds the answer, please let me too
know! :)

> In other words, ConnectController() should not immediately result in
> Snp.Start() being called.

Hmm, now that you put it like this, it's hard for me to comment on this
specific statement. I'm unsure if the higher level network drivers can
do their jobs (i.e. just bind the device) without calling Snp.Start(). I
haven't implemented anything higher level than SNP; I'm not sure about
MNP etc. internals. I do admit your original point makes a lot more
sense to me now.

Can you capture a call stack when Snp.Start() is invoked for the very
first time (which, IIUC, is a call that should not happen, in your
opinion)?

Thanks
Laszlo


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-23 22:18     ` Laszlo Ersek
@ 2019-01-24 11:37       ` Tomas Pilar (tpilar)
  2019-01-24 12:25         ` Laszlo Ersek
  0 siblings, 1 reply; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-24 11:37 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: edk2-devel@lists.01.org


Hi Laszlo,

> Can you capture a call stack when Snp.Start() is invoked for the very
> first time (which, IIUC, is a call that should not happen, in your
> opinion)?
>
Unfortunately I do not have access to the platform firmware itself (I maintain an IHV network driver that's shipped in OptionROMs) and I don't believe a generic stack capture is available in EDK2 yet. However I have comprehensive debug from my driver that shows that our driver gets a DriverBinding.Start() at TPL_APPLICATION and we perform the entire probe at TPL_NOTIFY and as soon as that completes and we drop the TPL, the newly installed SNP gets a Snp.Start() call at TPL_CALLBACK - I assume that MNP or something higher had an event registered that fired as soon as the TPL dropped after the DriverBinding.Start() finished.

Here is a snip of the debug log (that I assume will not be of much interest). We don't observe the actual call to DriverBinding.Start() because the debug output is initialised as one of the first things that method does.


      1 [sfc]..|INFO: [sfc] Debug  Output Initialised
      2 [sfc]..|INFO: Linked list max_size=1000000
      3 [sfc]..|SfcDebugLibStart:208 [Not Found]
      4 [sfc]..|[0x866D9396<] SfcDebugLibStart:222 [Success]
      5 [sfc]..|INFO: Running Solarflare UEFI Driver 2.7.8.5 (f71064c4d0a0) for MEDFORD
      6 [sfc]..|INFO: Ctrl=954E4B18 Driver=954E4418 Image=866B0000 Size=0x4D9C0 loaded from Device=954E4B18
      7 [sfc]..|INFO: Probing PciDev=942C3118
... snip ...
    376 [sfc]..|INFO: Support strings done
    377 [sfc]..|INFO: Hii init on hunt=942A6018 complete
    378 [sfc]..|TRACE: Closing the DeferredInit event
    379 [sfc]..|INFO: Processed deferred init okay
    380 [sfc].|[0x9940F873<] GetSupportedTypes:178 [Success]
    381 [sfc]|[0x9D4E1DE8<] DriverBindingStart:211 [Success]
    382 [sfc].|INFO: Starting SNP=942A2018
    383 [sfc].|[0x99653BA4<] SimpleNetworkStart:33 [Success]
    384 [sfc].|INFO: Init snp=942A2018
    385 [sfc].|TRACE: INIT: Found netdev=942A5D18
    386 [sfc].|INFO: Hunt open netdev=942A5D18
    387 [sfc].|INFO: Resetting mcdi=942A6480
    388 [sfc].|[Base+0x209E0<] HuntReset:312 [Success]
    389 [sfc].|INFO: Allocating vis count=3 on mcdi=942A6480
... snip ...
    491 [sfc].|TRACE: Final state filters=0x942A2018 mcastcnt=7
    492 [sfc].|TRACE: Filter Snp=942A2018 Add=0x7 Del=0x18 Reset=0 McastCount=1 List=9400EE98
    493 [sfc].|TRACE: Final state filters=0x942A2018 mcastcnt=7
    494 [sfc].|TRACE: Filter Snp=942A2018 Add=0x7 Del=0x18 Reset=0 McastCount=2 List=93FB5A98
    495 [sfc].|TRACE: Final state filters=0x942A2018 mcastcnt=7
    496 [sfc]|TRACE: Name Handle=0x954E4B18 PciDev=942C3118 Child=0x0
    497 [sfc]|INFO: STR [*ControllerName]=[Solarflare Adapter 17:00.0]
    498 [sfc]|TRACE: Name Handle=0x954E4B18 PciDev=942C3118 Child=0x942A5F98
    499 [sfc]|INFO: STR [*ControllerName]=[Solarflare NIC [00:0F:53:4C:A7:A0]]
    500 [sfc]..|INFO: [sfc] Debug  Output Initialised
    501 [sfc]..|INFO: Linked list max_size=1000000
    502 [sfc]..|SfcDebugLibStart:208 [Not Found]
    503 [sfc]..|[Base+0x29396<] SfcDebugLibStart:222 [Success]
    504 [sfc]..|INFO: Running Solarflare UEFI Driver 2.7.8.5 (f71064c4d0a0) for MEDFORD
    505 [sfc]..|INFO: Ctrl=954E3B98 Driver=954E4418 Image=866B0000 Size=0x4D9C0 loaded from Device=954E4B18
... snip ...
---

The dots between [sfc] and | symbol indicate the TPL the operation is being carried out at. No dots mean TPL_APPLICATION, one dot is TPL_CALLBACK etc. You can see that the Snp.Start() on the first device is called before even the second device gets a DriverBinding.Start().

Cheers,
Tom





^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-24 11:37       ` Tomas Pilar (tpilar)
@ 2019-01-24 12:25         ` Laszlo Ersek
  2019-01-24 12:58           ` Tomas Pilar (tpilar)
  0 siblings, 1 reply; 23+ messages in thread
From: Laszlo Ersek @ 2019-01-24 12:25 UTC (permalink / raw)
  To: Tomas Pilar (tpilar); +Cc: edk2-devel@lists.01.org

On 01/24/19 12:37, Tomas Pilar (tpilar) wrote:
> 
> Hi Laszlo,
> 
>> Can you capture a call stack when Snp.Start() is invoked for the very
>> first time (which, IIUC, is a call that should not happen, in your
>> opinion)?
>>
> Unfortunately I do not have access to the platform firmware itself (I maintain an IHV network driver that's shipped in OptionROMs) and I don't believe a generic stack capture is available in EDK2 yet. However I have comprehensive debug from my driver that shows that our driver gets a DriverBinding.Start() at TPL_APPLICATION and we perform the entire probe at TPL_NOTIFY and as soon as that completes and we drop the TPL, the newly installed SNP gets a Snp.Start() call at TPL_CALLBACK - I assume that MNP or something higher had an event registered that fired as soon as the TPL dropped after the DriverBinding.Start() finished.
> 
> Here is a snip of the debug log (that I assume will not be of much interest). We don't observe the actual call to DriverBinding.Start() because the debug output is initialised as one of the first things that method does.
> 
> [...] 
> 
> The dots between [sfc] and | symbol indicate the TPL the operation is being carried out at. No dots mean TPL_APPLICATION, one dot is TPL_CALLBACK etc. You can see that the Snp.Start() on the first device is called before even the second device gets a DriverBinding.Start().

This reeks. :)

It looks like some driver in the platform sets up a protocol notify
callback for SNP, with gBS->CreateEvent() +
gBS->RegisterProtocolNotify(). Your driver's DriverBindingStart()
function is called normally from BDS, via ConnectController(). In
DriverBindingStart(), you install an SNP instance, which signals the
event (makes it pending / queues the notification function for it). Once
you drop the TPL again, the notification function is called, on the
stack of gBS->RestoreTPL().

The event's notification funciton probably uses the "Registration"
feature of gBS->RegisterProtocolNotify(), together with
gBS->LocateProtocol(), to process only those SNP instances that it
hasn't seen yet. In other words, it catches exactly the SNP instance
that your driver just produced, and then it calls the Start() member of it.

This looks wrong to me; SNP is supposed to be consumed in accordance
with the UEFI Driver Model. An agent that behaved like described above
would most likely be a platform (DXE) driver lying in wait for network
connectivity (SNP), for some reason.

(The Driver Writer’s Guide for UEFI 2.3.1, Version 1.01, 03/08/2012,
explicitly lists RegisterProtocolNotify() in chapter 5.3 "Services that
UEFI drivers should not use".)

Something's fishy with the platform firmware, IMO. Even if the rest of
the network stack -- which consists of well-behaving UEFI drivers that
follow the UEFI Driver Model -- comes to life, that one sneaky platform
DXE driver will be there, poking at your SNP directly, for example under
the feet of the MNP driver. That seems plain wrong.


I've now run

  git grep -A10 -B10 -w gEfiSimpleNetworkProtocolGuid

on edk2, and then searched the output for "RegisterProtocolNotify".
There were no hits. So, I don't think anything in edk2 behaves like
described above (thankfully!).


You mentioned seeing this on "DELL 13G platforms". I suggest opening a
support case with Dell.

Thanks
Laszlo


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-24 12:25         ` Laszlo Ersek
@ 2019-01-24 12:58           ` Tomas Pilar (tpilar)
  2019-01-24 13:25             ` Tomas Pilar (tpilar)
  0 siblings, 1 reply; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-24 12:58 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: edk2-devel@lists.01.org


> This reeks. :)
>
> It looks like some driver in the platform sets up a protocol notify
> callback for SNP, with gBS->CreateEvent() +
> gBS->RegisterProtocolNotify(). Your driver's DriverBindingStart()
> function is called normally from BDS, via ConnectController(). In
> DriverBindingStart(), you install an SNP instance, which signals the
> event (makes it pending / queues the notification function for it). Once
> you drop the TPL again, the notification function is called, on the
> stack of gBS->RestoreTPL().
>
> The event's notification funciton probably uses the "Registration"
> feature of gBS->RegisterProtocolNotify(), together with
> gBS->LocateProtocol(), to process only those SNP instances that it
> hasn't seen yet. In other words, it catches exactly the SNP instance
> that your driver just produced, and then it calls the Start() member of it.

This was my assumption a well. I did not consider that another third party driver could be the cause of this.
> I've now run
>
>   git grep -A10 -B10 -w gEfiSimpleNetworkProtocolGuid
>
> on edk2, and then searched the output for "RegisterProtocolNotify".
> There were no hits. So, I don't think anything in edk2 behaves like
> described above (thankfully!).
That's good to know, I was going to have a browse through the edk2 network stack but this is reassuring.
>
> You mentioned seeing this on "DELL 13G platforms". I suggest opening a
> support case with Dell.
The problem is much more widespread than that. This log is from a HP gen 10 server. I am pretty sure the DELL 14G also has the same behaviour. That's what originally led me to think it might be in upstream.

Cheers,
Tom



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-24 12:58           ` Tomas Pilar (tpilar)
@ 2019-01-24 13:25             ` Tomas Pilar (tpilar)
  2019-01-24 16:49               ` Laszlo Ersek
  0 siblings, 1 reply; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-24 13:25 UTC (permalink / raw)
  To: edk2-devel

Hmm,

Mnp->Configure() will eventually call MnpStartSnp().

A grep for Mnp->Configure shows that:
* ArpDxe performs this on DriverBinding.Start()
* Ip6Dxe performs this on DriverBinding.Start()

Ipv4 and DnsDhcp do this as a part of their Configure() they expose in the API.

Cheers,
Tom




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-24 13:25             ` Tomas Pilar (tpilar)
@ 2019-01-24 16:49               ` Laszlo Ersek
  2019-01-24 17:43                 ` Tomas Pilar (tpilar)
  0 siblings, 1 reply; 23+ messages in thread
From: Laszlo Ersek @ 2019-01-24 16:49 UTC (permalink / raw)
  To: Tomas Pilar (tpilar), edk2-devel

On 01/24/19 14:25, Tomas Pilar (tpilar) wrote:
> Hmm,
> 
> Mnp->Configure() will eventually call MnpStartSnp().
> 
> A grep for Mnp->Configure shows that:
> * ArpDxe performs this on DriverBinding.Start()
> * Ip6Dxe performs this on DriverBinding.Start()
> 
> Ipv4 and DnsDhcp do this as a part of their Configure() they expose in the API.

Yes, that makes sense. All of the above drivers are UEFI drivers that
follow the UEFI driver model, AIUI. As long as the controller is not
connected from BDS, no BindingStart() function should be called in these.

E.g. EFI_DNS4_PROTOCOL.Configure() can only be called if an instance of
that protocol exists (... for the NIC in question), but the protocol
instance won't exist unless BDS (or some other agent such as the shell
or the boot menu app) called ConnectController().

Thanks,
Laszlo


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-24 16:49               ` Laszlo Ersek
@ 2019-01-24 17:43                 ` Tomas Pilar (tpilar)
  2019-01-25  8:44                   ` Wu, Jiaxin
  0 siblings, 1 reply; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-24 17:43 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel



On 24/01/2019 16:49, Laszlo Ersek wrote:
> On 01/24/19 14:25, Tomas Pilar (tpilar) wrote:
>> Hmm,
>>
>> Mnp->Configure() will eventually call MnpStartSnp().
>>
>> A grep for Mnp->Configure shows that:
>> * ArpDxe performs this on DriverBinding.Start()
>> * Ip6Dxe performs this on DriverBinding.Start()
>>
>> Ipv4 and DnsDhcp do this as a part of their Configure() they expose in the API.
> Yes, that makes sense. All of the above drivers are UEFI drivers that
> follow the UEFI driver model, AIUI. As long as the controller is not
> connected from BDS, no BindingStart() function should be called in these.
Ah, but I would expect the BDS to call ConnectController() on the NIC, but I would not expect the network stack to be started unless the device is actually chosen for PXE booting. In other words, the above protocols should follow the example of EFI_DNS4_PROTOCOL that binds against the device during BDS but .Configure() is not automatically called by DriverBinding.Start().

.Configure() should be called by the BootManager if networking is actually to be done. That in turn will configure Mnp and start Snp.

Cheers,
Tom



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-24 17:43                 ` Tomas Pilar (tpilar)
@ 2019-01-25  8:44                   ` Wu, Jiaxin
  2019-01-25 12:08                     ` Tomas Pilar (tpilar)
  0 siblings, 1 reply; 23+ messages in thread
From: Wu, Jiaxin @ 2019-01-25  8:44 UTC (permalink / raw)
  To: Tomas Pilar (tpilar), Laszlo Ersek, edk2-devel@lists.01.org; +Cc: Ye, Ting

Hi Tom,

One thing I want to highlight is that our design of network stack is not only for the PXE/HTTP boot trigged in BootManager, but also to make sure it's workable once there is any MNP instance configured by upper drivers (ARP/IPv4/IPv6). 

Take ARP/IP as an example, once ARP/IP are started, we need a heartbeat to process any ARP requests, which is required by ARP functionality. In such a case, SNP must be started to make ARP/IP drivers works well. 

Thanks,
Jiaxin

> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> Tomas Pilar (tpilar)
> Sent: Friday, January 25, 2019 1:43 AM
> To: Laszlo Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
> Subject: Re: [edk2] Network Stack Budgeting
> 
> 
> 
> On 24/01/2019 16:49, Laszlo Ersek wrote:
> > On 01/24/19 14:25, Tomas Pilar (tpilar) wrote:
> >> Hmm,
> >>
> >> Mnp->Configure() will eventually call MnpStartSnp().
> >>
> >> A grep for Mnp->Configure shows that:
> >> * ArpDxe performs this on DriverBinding.Start()
> >> * Ip6Dxe performs this on DriverBinding.Start()
> >>
> >> Ipv4 and DnsDhcp do this as a part of their Configure() they expose in the
> API.
> > Yes, that makes sense. All of the above drivers are UEFI drivers that
> > follow the UEFI driver model, AIUI. As long as the controller is not
> > connected from BDS, no BindingStart() function should be called in these.
> Ah, but I would expect the BDS to call ConnectController() on the NIC, but I
> would not expect the network stack to be started unless the device is
> actually chosen for PXE booting. In other words, the above protocols should
> follow the example of EFI_DNS4_PROTOCOL that binds against the device
> during BDS but .Configure() is not automatically called by
> DriverBinding.Start().
> 
> .Configure() should be called by the BootManager if networking is actually to
> be done. That in turn will configure Mnp and start Snp.
> 
> Cheers,
> Tom
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-25  8:44                   ` Wu, Jiaxin
@ 2019-01-25 12:08                     ` Tomas Pilar (tpilar)
  2019-01-27 14:28                       ` Fu, Siyuan
  0 siblings, 1 reply; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-25 12:08 UTC (permalink / raw)
  To: Wu, Jiaxin, Laszlo Ersek, edk2-devel@lists.01.org; +Cc: Ye, Ting

Yeah, that makes sense I suppose. The end result is however that the network device is 'opened' as soon as ConnectController() is called rather than when someone wants to do networking. This seems wrong and directly leads to a DoS of a platform in case of heavy network load (unless we implement budgeting in the network driver, which is a really not what it should be doing).

How do you suggest we solve this problem?

Cheers,
Tom

On 25/01/2019 08:44, Wu, Jiaxin wrote:
> Hi Tom,
>
> One thing I want to highlight is that our design of network stack is not only for the PXE/HTTP boot trigged in BootManager, but also to make sure it's workable once there is any MNP instance configured by upper drivers (ARP/IPv4/IPv6). 
>
> Take ARP/IP as an example, once ARP/IP are started, we need a heartbeat to process any ARP requests, which is required by ARP functionality. In such a case, SNP must be started to make ARP/IP drivers works well. 
>
> Thanks,
> Jiaxin
>
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
>> Tomas Pilar (tpilar)
>> Sent: Friday, January 25, 2019 1:43 AM
>> To: Laszlo Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
>> Subject: Re: [edk2] Network Stack Budgeting
>>
>>
>>
>> On 24/01/2019 16:49, Laszlo Ersek wrote:
>>> On 01/24/19 14:25, Tomas Pilar (tpilar) wrote:
>>>> Hmm,
>>>>
>>>> Mnp->Configure() will eventually call MnpStartSnp().
>>>>
>>>> A grep for Mnp->Configure shows that:
>>>> * ArpDxe performs this on DriverBinding.Start()
>>>> * Ip6Dxe performs this on DriverBinding.Start()
>>>>
>>>> Ipv4 and DnsDhcp do this as a part of their Configure() they expose in the
>> API.
>>> Yes, that makes sense. All of the above drivers are UEFI drivers that
>>> follow the UEFI driver model, AIUI. As long as the controller is not
>>> connected from BDS, no BindingStart() function should be called in these.
>> Ah, but I would expect the BDS to call ConnectController() on the NIC, but I
>> would not expect the network stack to be started unless the device is
>> actually chosen for PXE booting. In other words, the above protocols should
>> follow the example of EFI_DNS4_PROTOCOL that binds against the device
>> during BDS but .Configure() is not automatically called by
>> DriverBinding.Start().
>>
>> .Configure() should be called by the BootManager if networking is actually to
>> be done. That in turn will configure Mnp and start Snp.
>>
>> Cheers,
>> Tom
>>
>> _______________________________________________
>> edk2-devel mailing list
>> edk2-devel@lists.01.org
>> https://lists.01.org/mailman/listinfo/edk2-devel



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-25 12:08                     ` Tomas Pilar (tpilar)
@ 2019-01-27 14:28                       ` Fu, Siyuan
  2019-01-28 11:24                         ` Tomas Pilar (tpilar)
  0 siblings, 1 reply; 23+ messages in thread
From: Fu, Siyuan @ 2019-01-27 14:28 UTC (permalink / raw)
  To: Tomas Pilar (tpilar), Wu, Jiaxin, Laszlo Ersek,
	edk2-devel@lists.01.org
  Cc: Ye, Ting

Hi, Tom

> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Tomas
> Pilar (tpilar)
> Sent: Friday, January 25, 2019 8:09 PM
> To: Wu, Jiaxin <jiaxin.wu@intel.com>; Laszlo Ersek <lersek@redhat.com>; edk2-
> devel@lists.01.org
> Cc: Ye, Ting <ting.ye@intel.com>
> Subject: Re: [edk2] Network Stack Budgeting
> 
> Yeah, that makes sense I suppose. The end result is however that the network
> device is 'opened' as soon as ConnectController() is called rather than when
> someone wants to do networking. This seems wrong and directly leads to a DoS
> of a platform in case of heavy network load (unless we implement budgeting in
> the network driver, which is a really not what it should be doing).

It's true that a configured MNP will start a background system poll to receive
network packets, but normally it won't lead to a DoS, because the receive frequency 
is actually very low. The MNP driver only tries to receive 1 packet from SNP for every
10 milliseconds (see MNP_SYS_POLL_INTERVAL, MnpSystemPoll() and related timer event).
So no matter how heavy the network is, MNP driver (and upper layer stack) will
only process at most 100 packets per second, all other packets should be dropped
by UNDI driver directly.

So if there is nobody doing busy network receiving, the network stack cost will be
at most 100 packets per NIC device and per second.

In your first email you mentioned "some performance improvements to my network driver",
may I know what's the improvement method you are using?

Best Regards,
Siyuan

> 
> How do you suggest we solve this problem?
> 
> Cheers,
> Tom
> 
> On 25/01/2019 08:44, Wu, Jiaxin wrote:
> > Hi Tom,
> >
> > One thing I want to highlight is that our design of network stack is not
> only for the PXE/HTTP boot trigged in BootManager, but also to make sure it's
> workable once there is any MNP instance configured by upper drivers
> (ARP/IPv4/IPv6).
> >
> > Take ARP/IP as an example, once ARP/IP are started, we need a heartbeat to
> process any ARP requests, which is required by ARP functionality. In such a
> case, SNP must be started to make ARP/IP drivers works well.
> >
> > Thanks,
> > Jiaxin
> >
> >> -----Original Message-----
> >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> >> Tomas Pilar (tpilar)
> >> Sent: Friday, January 25, 2019 1:43 AM
> >> To: Laszlo Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
> >> Subject: Re: [edk2] Network Stack Budgeting
> >>
> >>
> >>
> >> On 24/01/2019 16:49, Laszlo Ersek wrote:
> >>> On 01/24/19 14:25, Tomas Pilar (tpilar) wrote:
> >>>> Hmm,
> >>>>
> >>>> Mnp->Configure() will eventually call MnpStartSnp().
> >>>>
> >>>> A grep for Mnp->Configure shows that:
> >>>> * ArpDxe performs this on DriverBinding.Start()
> >>>> * Ip6Dxe performs this on DriverBinding.Start()
> >>>>
> >>>> Ipv4 and DnsDhcp do this as a part of their Configure() they expose in
> the
> >> API.
> >>> Yes, that makes sense. All of the above drivers are UEFI drivers that
> >>> follow the UEFI driver model, AIUI. As long as the controller is not
> >>> connected from BDS, no BindingStart() function should be called in these.
> >> Ah, but I would expect the BDS to call ConnectController() on the NIC, but
> I
> >> would not expect the network stack to be started unless the device is
> >> actually chosen for PXE booting. In other words, the above protocols should
> >> follow the example of EFI_DNS4_PROTOCOL that binds against the device
> >> during BDS but .Configure() is not automatically called by
> >> DriverBinding.Start().
> >>
> >> .Configure() should be called by the BootManager if networking is actually
> to
> >> be done. That in turn will configure Mnp and start Snp.
> >>
> >> Cheers,
> >> Tom
> >>
> >> _______________________________________________
> >> edk2-devel mailing list
> >> edk2-devel@lists.01.org
> >> https://lists.01.org/mailman/listinfo/edk2-devel
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-27 14:28                       ` Fu, Siyuan
@ 2019-01-28 11:24                         ` Tomas Pilar (tpilar)
  2019-01-29  3:20                           ` Fu, Siyuan
  0 siblings, 1 reply; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-28 11:24 UTC (permalink / raw)
  To: Fu, Siyuan, Wu, Jiaxin, Laszlo Ersek, edk2-devel@lists.01.org; +Cc: Ye, Ting

Hi Siyuan,

I had to do some tweaks due to the fact that people in the wild want to PXE boot whole WDS installations which are upwards of 300MB.

I have an internal set of queues where the NIC receives into. Snp.Receive() will get a packet from this store. I had to implement a separate queue for unicast/multicast/broadcast traffic so that I could reprioritise and deliver unicast traffic while under ARP spam. I tied internal poll to receive and transmit, so Snp.Receive() will poll the NIC and then dequeue a packet while Snp.Transmit() will transmit the packet and then poll the NIC to get a completion. I also implemented a busy poll for 100us or until the next packet comes in when the NIC is polled.

Otherwise I have seen a pathological case where the TFTP client would pull one packet per second but the TFTP_RETRANSMIT interval on the server was also one second and everything was awful.

In theory the busy poll moderator might cause an issue in a pathological case, if it stalls the machine for 100us for each packet - with 100 packets per second this would eat only 10ms.

Cheers,
Tom

On 27/01/2019 14:28, Fu, Siyuan wrote:
> Hi, Tom
>
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Tomas
>> Pilar (tpilar)
>> Sent: Friday, January 25, 2019 8:09 PM
>> To: Wu, Jiaxin <jiaxin.wu@intel.com>; Laszlo Ersek <lersek@redhat.com>; edk2-
>> devel@lists.01.org
>> Cc: Ye, Ting <ting.ye@intel.com>
>> Subject: Re: [edk2] Network Stack Budgeting
>>
>> Yeah, that makes sense I suppose. The end result is however that the network
>> device is 'opened' as soon as ConnectController() is called rather than when
>> someone wants to do networking. This seems wrong and directly leads to a DoS
>> of a platform in case of heavy network load (unless we implement budgeting in
>> the network driver, which is a really not what it should be doing).
> It's true that a configured MNP will start a background system poll to receive
> network packets, but normally it won't lead to a DoS, because the receive frequency 
> is actually very low. The MNP driver only tries to receive 1 packet from SNP for every
> 10 milliseconds (see MNP_SYS_POLL_INTERVAL, MnpSystemPoll() and related timer event).
> So no matter how heavy the network is, MNP driver (and upper layer stack) will
> only process at most 100 packets per second, all other packets should be dropped
> by UNDI driver directly.
>
> So if there is nobody doing busy network receiving, the network stack cost will be
> at most 100 packets per NIC device and per second.
>
> In your first email you mentioned "some performance improvements to my network driver",
> may I know what's the improvement method you are using?
>
> Best Regards,
> Siyuan
>
>> How do you suggest we solve this problem?
>>
>> Cheers,
>> Tom
>>
>> On 25/01/2019 08:44, Wu, Jiaxin wrote:
>>> Hi Tom,
>>>
>>> One thing I want to highlight is that our design of network stack is not
>> only for the PXE/HTTP boot trigged in BootManager, but also to make sure it's
>> workable once there is any MNP instance configured by upper drivers
>> (ARP/IPv4/IPv6).
>>> Take ARP/IP as an example, once ARP/IP are started, we need a heartbeat to
>> process any ARP requests, which is required by ARP functionality. In such a
>> case, SNP must be started to make ARP/IP drivers works well.
>>> Thanks,
>>> Jiaxin
>>>
>>>> -----Original Message-----
>>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
>>>> Tomas Pilar (tpilar)
>>>> Sent: Friday, January 25, 2019 1:43 AM
>>>> To: Laszlo Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
>>>> Subject: Re: [edk2] Network Stack Budgeting
>>>>
>>>>
>>>>
>>>> On 24/01/2019 16:49, Laszlo Ersek wrote:
>>>>> On 01/24/19 14:25, Tomas Pilar (tpilar) wrote:
>>>>>> Hmm,
>>>>>>
>>>>>> Mnp->Configure() will eventually call MnpStartSnp().
>>>>>>
>>>>>> A grep for Mnp->Configure shows that:
>>>>>> * ArpDxe performs this on DriverBinding.Start()
>>>>>> * Ip6Dxe performs this on DriverBinding.Start()
>>>>>>
>>>>>> Ipv4 and DnsDhcp do this as a part of their Configure() they expose in
>> the
>>>> API.
>>>>> Yes, that makes sense. All of the above drivers are UEFI drivers that
>>>>> follow the UEFI driver model, AIUI. As long as the controller is not
>>>>> connected from BDS, no BindingStart() function should be called in these.
>>>> Ah, but I would expect the BDS to call ConnectController() on the NIC, but
>> I
>>>> would not expect the network stack to be started unless the device is
>>>> actually chosen for PXE booting. In other words, the above protocols should
>>>> follow the example of EFI_DNS4_PROTOCOL that binds against the device
>>>> during BDS but .Configure() is not automatically called by
>>>> DriverBinding.Start().
>>>>
>>>> .Configure() should be called by the BootManager if networking is actually
>> to
>>>> be done. That in turn will configure Mnp and start Snp.
>>>>
>>>> Cheers,
>>>> Tom
>>>>
>>>> _______________________________________________
>>>> edk2-devel mailing list
>>>> edk2-devel@lists.01.org
>>>> https://lists.01.org/mailman/listinfo/edk2-devel
>> _______________________________________________
>> edk2-devel mailing list
>> edk2-devel@lists.01.org
>> https://lists.01.org/mailman/listinfo/edk2-devel



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-28 11:24                         ` Tomas Pilar (tpilar)
@ 2019-01-29  3:20                           ` Fu, Siyuan
  2019-01-29 10:54                             ` Tomas Pilar (tpilar)
  0 siblings, 1 reply; 23+ messages in thread
From: Fu, Siyuan @ 2019-01-29  3:20 UTC (permalink / raw)
  To: Tomas Pilar (tpilar), Wu, Jiaxin, Laszlo Ersek,
	edk2-devel@lists.01.org
  Cc: Ye, Ting

Hi, Tomas

> -----Original Message-----
> From: Tomas Pilar (tpilar) [mailto:tpilar@solarflare.com]
> Sent: Monday, January 28, 2019 7:24 PM
> To: Fu, Siyuan <siyuan.fu@intel.com>; Wu, Jiaxin <jiaxin.wu@intel.com>; Laszlo
> Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
> Cc: Ye, Ting <ting.ye@intel.com>
> Subject: Re: [edk2] Network Stack Budgeting
> 
> Hi Siyuan,
> 
> I had to do some tweaks due to the fact that people in the wild want to PXE
> boot whole WDS installations which are upwards of 300MB.
> 
> I have an internal set of queues where the NIC receives into. Snp.Receive()
> will get a packet from this store. I had to implement a separate queue for
> unicast/multicast/broadcast traffic so that I could reprioritise and deliver
> unicast traffic while under ARP spam. I tied internal poll to receive and
> transmit, so Snp.Receive() will poll the NIC and then dequeue a packet while
> Snp.Transmit() will transmit the packet and then poll the NIC to get a
> completion. I also implemented a busy poll for 100us or until the next packet
> comes in when the NIC is polled.

I see some problems here. Both Snp.Receive() and Snp.Transmit() should be non-blocking
interface in current design and UEFI spec, but it's not true in your driver.
For Receive(), it could poll the NIC only once and return either a received packet
or an error, it should not do any blocking poll to wait for a packet.
For Transmit(), it just need to put the packet into the TX queue and return directly,
without checking if the TX is completed. Instead of that, Snp.GetStatus() should be
used to check if the packet has really been sent out.

May I know how did you control the 100us timeout of the busy poll (by which API?),
and what's the TPL of this busy poll?

> 
> Otherwise I have seen a pathological case where the TFTP client would pull one
> packet per second but the TFTP_RETRANSMIT interval on the server was also one
> second and everything was awful.

Why TFTP client just pull one packet per second? I think it’s not correct and it
could use the poll() function to accelerate the receive. Why you trying to solve a
TFTP layer problem in SNP layer? It breaks the design principle of network layer model.
> 
> In theory the busy poll moderator might cause an issue in a pathological case,
> if it stalls the machine for 100us for each packet - with 100 packets per
> second this would eat only 10ms.

Not only 10ms, it's at least 10ms, and in theory. If you have multiple NICs in the
system, this time will double. If someone tries to poll the NIC, the time will increase
significantly. It also depends on who you measure the 100us timeout, I did see some
platform which has a minimum timer interval limitation, that you thought you are using
a 100us timeout but it’s actually 1ms or even 10ms.

So my suggestion is do NOT add any busy wait in SNP layer API, keep it as unblocking,
and the try to solve the application layer problem in application layer.

Best Regards,
Siyuan

> 
> Cheers,
> Tom
> 
> On 27/01/2019 14:28, Fu, Siyuan wrote:
> > Hi, Tom
> >
> >> -----Original Message-----
> >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> Tomas
> >> Pilar (tpilar)
> >> Sent: Friday, January 25, 2019 8:09 PM
> >> To: Wu, Jiaxin <jiaxin.wu@intel.com>; Laszlo Ersek <lersek@redhat.com>;
> edk2-
> >> devel@lists.01.org
> >> Cc: Ye, Ting <ting.ye@intel.com>
> >> Subject: Re: [edk2] Network Stack Budgeting
> >>
> >> Yeah, that makes sense I suppose. The end result is however that the
> network
> >> device is 'opened' as soon as ConnectController() is called rather than
> when
> >> someone wants to do networking. This seems wrong and directly leads to a
> DoS
> >> of a platform in case of heavy network load (unless we implement budgeting
> in
> >> the network driver, which is a really not what it should be doing).
> > It's true that a configured MNP will start a background system poll to
> receive
> > network packets, but normally it won't lead to a DoS, because the receive
> frequency
> > is actually very low. The MNP driver only tries to receive 1 packet from SNP
> for every
> > 10 milliseconds (see MNP_SYS_POLL_INTERVAL, MnpSystemPoll() and related
> timer event).
> > So no matter how heavy the network is, MNP driver (and upper layer stack)
> will
> > only process at most 100 packets per second, all other packets should be
> dropped
> > by UNDI driver directly.
> >
> > So if there is nobody doing busy network receiving, the network stack cost
> will be
> > at most 100 packets per NIC device and per second.
> >
> > In your first email you mentioned "some performance improvements to my
> network driver",
> > may I know what's the improvement method you are using?
> >
> > Best Regards,
> > Siyuan
> >
> >> How do you suggest we solve this problem?
> >>
> >> Cheers,
> >> Tom
> >>
> >> On 25/01/2019 08:44, Wu, Jiaxin wrote:
> >>> Hi Tom,
> >>>
> >>> One thing I want to highlight is that our design of network stack is not
> >> only for the PXE/HTTP boot trigged in BootManager, but also to make sure
> it's
> >> workable once there is any MNP instance configured by upper drivers
> >> (ARP/IPv4/IPv6).
> >>> Take ARP/IP as an example, once ARP/IP are started, we need a heartbeat to
> >> process any ARP requests, which is required by ARP functionality. In such a
> >> case, SNP must be started to make ARP/IP drivers works well.
> >>> Thanks,
> >>> Jiaxin
> >>>
> >>>> -----Original Message-----
> >>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> >>>> Tomas Pilar (tpilar)
> >>>> Sent: Friday, January 25, 2019 1:43 AM
> >>>> To: Laszlo Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
> >>>> Subject: Re: [edk2] Network Stack Budgeting
> >>>>
> >>>>
> >>>>
> >>>> On 24/01/2019 16:49, Laszlo Ersek wrote:
> >>>>> On 01/24/19 14:25, Tomas Pilar (tpilar) wrote:
> >>>>>> Hmm,
> >>>>>>
> >>>>>> Mnp->Configure() will eventually call MnpStartSnp().
> >>>>>>
> >>>>>> A grep for Mnp->Configure shows that:
> >>>>>> * ArpDxe performs this on DriverBinding.Start()
> >>>>>> * Ip6Dxe performs this on DriverBinding.Start()
> >>>>>>
> >>>>>> Ipv4 and DnsDhcp do this as a part of their Configure() they expose in
> >> the
> >>>> API.
> >>>>> Yes, that makes sense. All of the above drivers are UEFI drivers that
> >>>>> follow the UEFI driver model, AIUI. As long as the controller is not
> >>>>> connected from BDS, no BindingStart() function should be called in these.
> >>>> Ah, but I would expect the BDS to call ConnectController() on the NIC,
> but
> >> I
> >>>> would not expect the network stack to be started unless the device is
> >>>> actually chosen for PXE booting. In other words, the above protocols
> should
> >>>> follow the example of EFI_DNS4_PROTOCOL that binds against the device
> >>>> during BDS but .Configure() is not automatically called by
> >>>> DriverBinding.Start().
> >>>>
> >>>> .Configure() should be called by the BootManager if networking is
> actually
> >> to
> >>>> be done. That in turn will configure Mnp and start Snp.
> >>>>
> >>>> Cheers,
> >>>> Tom
> >>>>
> >>>> _______________________________________________
> >>>> edk2-devel mailing list
> >>>> edk2-devel@lists.01.org
> >>>> https://lists.01.org/mailman/listinfo/edk2-devel
> >> _______________________________________________
> >> edk2-devel mailing list
> >> edk2-devel@lists.01.org
> >> https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-29  3:20                           ` Fu, Siyuan
@ 2019-01-29 10:54                             ` Tomas Pilar (tpilar)
  2019-01-29 13:06                               ` Fu, Siyuan
  2019-01-29 13:42                               ` Laszlo Ersek
  0 siblings, 2 replies; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-29 10:54 UTC (permalink / raw)
  To: Fu, Siyuan, Wu, Jiaxin, Laszlo Ersek, edk2-devel@lists.01.org; +Cc: Ye, Ting



On 29/01/2019 03:20, Fu, Siyuan wrote:
> Hi, Tomas
>
>> -----Original Message-----
>> From: Tomas Pilar (tpilar) [mailto:tpilar@solarflare.com]
>> Sent: Monday, January 28, 2019 7:24 PM
>> To: Fu, Siyuan <siyuan.fu@intel.com>; Wu, Jiaxin <jiaxin.wu@intel.com>; Laszlo
>> Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
>> Cc: Ye, Ting <ting.ye@intel.com>
>> Subject: Re: [edk2] Network Stack Budgeting
>>
>> Hi Siyuan,
>>
>> I had to do some tweaks due to the fact that people in the wild want to PXE
>> boot whole WDS installations which are upwards of 300MB.
>>
>> I have an internal set of queues where the NIC receives into. Snp.Receive()
>> will get a packet from this store. I had to implement a separate queue for
>> unicast/multicast/broadcast traffic so that I could reprioritise and deliver
>> unicast traffic while under ARP spam. I tied internal poll to receive and
>> transmit, so Snp.Receive() will poll the NIC and then dequeue a packet while
>> Snp.Transmit() will transmit the packet and then poll the NIC to get a
>> completion. I also implemented a busy poll for 100us or until the next packet
>> comes in when the NIC is polled.
> I see some problems here. Both Snp.Receive() and Snp.Transmit() should be non-blocking
> interface in current design and UEFI spec, but it's not true in your driver.
> For Receive(), it could poll the NIC only once and return either a received packet
> or an error, it should not do any blocking poll to wait for a packet.
On some platforms (tftp clients), if the Snp.Receive() returns EFI_NOT_READY they do not poll the NIC until the next MnpSystemPoll(). Hence the delay before returning EFI_NOT_READY in case the packet is still incoming.
> For Transmit(), it just need to put the packet into the TX queue and return directly,
> without checking if the TX is completed. 
> Instead of that, Snp.GetStatus() should be
> used to check if the packet has really been sent out.
I've seen quite a few platforms where GetStatus was only called by the MnpSystemPoll() and the application would not call it. This means our hardware rings are stuffed with completions and we stop being able to transmit. Yes, you could say that we should solve the application, but unfortunately I can't do that.
>
> May I know how did you control the 100us timeout of the busy poll (by which API?),
> and what's the TPL of this busy poll?
>
Create a timeout timer with value of 1000. Then busy poll in a loop until packet is received or a timer is triggered.
This runs at TPL_CALLBACK while the EfxPoll function itself runs at TPL_NOTIFY.

-- 
  /* Set the poll moderation timeout to 100us */
  gBS->SetTimer(NetDev->PollModeratorTimer, TimerRelative, 1000);

  while (!InterruptStatus) {
    EfxPoll(NULL, NetDev->Efx);
    InterruptStatus = EfxStatus(NetDev->Efx);

    if (gBS->CheckEvent(NetDev->PollModeratorTimer))
      return InterruptStatus;
  }

  return InterruptStatus;
--

>> Otherwise I have seen a pathological case where the TFTP client would pull one
>> packet per second but the TFTP_RETRANSMIT interval on the server was also one
>> second and everything was awful.
> Why TFTP client just pull one packet per second? I think it’s not correct and it
> could use the poll() function to accelerate the receive. Why you trying to solve a
> TFTP layer problem in SNP layer? It breaks the design principle of network layer model.
Yes, I appreciate the principle. However, in practice we don't get to sell adapters that do not PXE boot and it's pointless to argue with customers that they have a rubbish TFTP client implementation. So we put in workarounds into the driver.
>> In theory the busy poll moderator might cause an issue in a pathological case,
>> if it stalls the machine for 100us for each packet - with 100 packets per
>> second this would eat only 10ms.
> Not only 10ms, it's at least 10ms, and in theory. If you have multiple NICs in the
> system, this time will double. If someone tries to poll the NIC, the time will increase
> significantly. It also depends on who you measure the 100us timeout, I did see some
> platform which has a minimum timer interval limitation, that you thought you are using
> a 100us timeout but it’s actually 1ms or even 10ms.
>
> So my suggestion is do NOT add any busy wait in SNP layer API, keep it as unblocking,
> and the try to solve the application layer problem in application layer.
I'll try dropping the busy poll and see whether that helps.

Cheers,
Tom
>
> Best Regards,
> Siyuan
>
>> Cheers,
>> Tom
>>
>> On 27/01/2019 14:28, Fu, Siyuan wrote:
>>> Hi, Tom
>>>
>>>> -----Original Message-----
>>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
>> Tomas
>>>> Pilar (tpilar)
>>>> Sent: Friday, January 25, 2019 8:09 PM
>>>> To: Wu, Jiaxin <jiaxin.wu@intel.com>; Laszlo Ersek <lersek@redhat.com>;
>> edk2-
>>>> devel@lists.01.org
>>>> Cc: Ye, Ting <ting.ye@intel.com>
>>>> Subject: Re: [edk2] Network Stack Budgeting
>>>>
>>>> Yeah, that makes sense I suppose. The end result is however that the
>> network
>>>> device is 'opened' as soon as ConnectController() is called rather than
>> when
>>>> someone wants to do networking. This seems wrong and directly leads to a
>> DoS
>>>> of a platform in case of heavy network load (unless we implement budgeting
>> in
>>>> the network driver, which is a really not what it should be doing).
>>> It's true that a configured MNP will start a background system poll to
>> receive
>>> network packets, but normally it won't lead to a DoS, because the receive
>> frequency
>>> is actually very low. The MNP driver only tries to receive 1 packet from SNP
>> for every
>>> 10 milliseconds (see MNP_SYS_POLL_INTERVAL, MnpSystemPoll() and related
>> timer event).
>>> So no matter how heavy the network is, MNP driver (and upper layer stack)
>> will
>>> only process at most 100 packets per second, all other packets should be
>> dropped
>>> by UNDI driver directly.
>>>
>>> So if there is nobody doing busy network receiving, the network stack cost
>> will be
>>> at most 100 packets per NIC device and per second.
>>>
>>> In your first email you mentioned "some performance improvements to my
>> network driver",
>>> may I know what's the improvement method you are using?
>>>
>>> Best Regards,
>>> Siyuan
>>>
>>>> How do you suggest we solve this problem?
>>>>
>>>> Cheers,
>>>> Tom
>>>>
>>>> On 25/01/2019 08:44, Wu, Jiaxin wrote:
>>>>> Hi Tom,
>>>>>
>>>>> One thing I want to highlight is that our design of network stack is not
>>>> only for the PXE/HTTP boot trigged in BootManager, but also to make sure
>> it's
>>>> workable once there is any MNP instance configured by upper drivers
>>>> (ARP/IPv4/IPv6).
>>>>> Take ARP/IP as an example, once ARP/IP are started, we need a heartbeat to
>>>> process any ARP requests, which is required by ARP functionality. In such a
>>>> case, SNP must be started to make ARP/IP drivers works well.
>>>>> Thanks,
>>>>> Jiaxin
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
>>>>>> Tomas Pilar (tpilar)
>>>>>> Sent: Friday, January 25, 2019 1:43 AM
>>>>>> To: Laszlo Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
>>>>>> Subject: Re: [edk2] Network Stack Budgeting
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 24/01/2019 16:49, Laszlo Ersek wrote:
>>>>>>> On 01/24/19 14:25, Tomas Pilar (tpilar) wrote:
>>>>>>>> Hmm,
>>>>>>>>
>>>>>>>> Mnp->Configure() will eventually call MnpStartSnp().
>>>>>>>>
>>>>>>>> A grep for Mnp->Configure shows that:
>>>>>>>> * ArpDxe performs this on DriverBinding.Start()
>>>>>>>> * Ip6Dxe performs this on DriverBinding.Start()
>>>>>>>>
>>>>>>>> Ipv4 and DnsDhcp do this as a part of their Configure() they expose in
>>>> the
>>>>>> API.
>>>>>>> Yes, that makes sense. All of the above drivers are UEFI drivers that
>>>>>>> follow the UEFI driver model, AIUI. As long as the controller is not
>>>>>>> connected from BDS, no BindingStart() function should be called in these.
>>>>>> Ah, but I would expect the BDS to call ConnectController() on the NIC,
>> but
>>>> I
>>>>>> would not expect the network stack to be started unless the device is
>>>>>> actually chosen for PXE booting. In other words, the above protocols
>> should
>>>>>> follow the example of EFI_DNS4_PROTOCOL that binds against the device
>>>>>> during BDS but .Configure() is not automatically called by
>>>>>> DriverBinding.Start().
>>>>>>
>>>>>> .Configure() should be called by the BootManager if networking is
>> actually
>>>> to
>>>>>> be done. That in turn will configure Mnp and start Snp.
>>>>>>
>>>>>> Cheers,
>>>>>> Tom
>>>>>>
>>>>>> _______________________________________________
>>>>>> edk2-devel mailing list
>>>>>> edk2-devel@lists.01.org
>>>>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>>> _______________________________________________
>>>> edk2-devel mailing list
>>>> edk2-devel@lists.01.org
>>>> https://lists.01.org/mailman/listinfo/edk2-devel



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-29 10:54                             ` Tomas Pilar (tpilar)
@ 2019-01-29 13:06                               ` Fu, Siyuan
  2019-01-29 13:12                                 ` Tomas Pilar (tpilar)
  2019-01-29 13:42                               ` Laszlo Ersek
  1 sibling, 1 reply; 23+ messages in thread
From: Fu, Siyuan @ 2019-01-29 13:06 UTC (permalink / raw)
  To: Tomas Pilar (tpilar), Wu, Jiaxin, Laszlo Ersek,
	edk2-devel@lists.01.org
  Cc: Ye, Ting

Hi, Tomas

> -----Original Message-----
> From: Tomas Pilar (tpilar) [mailto:tpilar@solarflare.com]
> Sent: Tuesday, January 29, 2019 6:54 PM
> To: Fu, Siyuan <siyuan.fu@intel.com>; Wu, Jiaxin <jiaxin.wu@intel.com>; Laszlo
> Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
> Cc: Ye, Ting <ting.ye@intel.com>
> Subject: Re: [edk2] Network Stack Budgeting
> 
> 
> 
> On 29/01/2019 03:20, Fu, Siyuan wrote:
> > Hi, Tomas
> >
> >> -----Original Message-----
> >> From: Tomas Pilar (tpilar) [mailto:tpilar@solarflare.com]
> >> Sent: Monday, January 28, 2019 7:24 PM
> >> To: Fu, Siyuan <siyuan.fu@intel.com>; Wu, Jiaxin <jiaxin.wu@intel.com>;
> Laszlo
> >> Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
> >> Cc: Ye, Ting <ting.ye@intel.com>
> >> Subject: Re: [edk2] Network Stack Budgeting
> >>
> >> Hi Siyuan,
> >>
> >> I had to do some tweaks due to the fact that people in the wild want to PXE
> >> boot whole WDS installations which are upwards of 300MB.
> >>
> >> I have an internal set of queues where the NIC receives into. Snp.Receive()
> >> will get a packet from this store. I had to implement a separate queue for
> >> unicast/multicast/broadcast traffic so that I could reprioritise and
> deliver
> >> unicast traffic while under ARP spam. I tied internal poll to receive and
> >> transmit, so Snp.Receive() will poll the NIC and then dequeue a packet
> while
> >> Snp.Transmit() will transmit the packet and then poll the NIC to get a
> >> completion. I also implemented a busy poll for 100us or until the next
> packet
> >> comes in when the NIC is polled.
> > I see some problems here. Both Snp.Receive() and Snp.Transmit() should be
> non-blocking
> > interface in current design and UEFI spec, but it's not true in your driver.
> > For Receive(), it could poll the NIC only once and return either a received
> packet
> > or an error, it should not do any blocking poll to wait for a packet.
> On some platforms (tftp clients), if the Snp.Receive() returns EFI_NOT_READY
> they do not poll the NIC until the next MnpSystemPoll(). Hence the delay
> before returning EFI_NOT_READY in case the packet is still incoming.

OK, that's really not good practice.

> > For Transmit(), it just need to put the packet into the TX queue and return
> directly,
> > without checking if the TX is completed.
> > Instead of that, Snp.GetStatus() should be
> > used to check if the packet has really been sent out.
> I've seen quite a few platforms where GetStatus was only called by the
> MnpSystemPoll() and the application would not call it. This means our hardware
> rings are stuffed with completions and we stop being able to transmit. Yes,
> you could say that we should solve the application, but unfortunately I can't
> do that.

The application doesn't need to call Snp.GetStatus() to recycle the TX buffer, the
Mnp.Transmit() will do this automatically when it found the SNP's TX buffer is full.
See MnpSyncSendPacket() and MnpRecycleTxBuf(). 

> >
> > May I know how did you control the 100us timeout of the busy poll (by which
> API?),
> > and what's the TPL of this busy poll?
> >
> Create a timeout timer with value of 1000. Then busy poll in a loop until
> packet is received or a timer is triggered.
> This runs at TPL_CALLBACK while the EfxPoll function itself runs at TPL_NOTIFY.
> 
> --
>   /* Set the poll moderation timeout to 100us */
>   gBS->SetTimer(NetDev->PollModeratorTimer, TimerRelative, 1000);
> 
>   while (!InterruptStatus) {
>     EfxPoll(NULL, NetDev->Efx);
>     InterruptStatus = EfxStatus(NetDev->Efx);
> 
>     if (gBS->CheckEvent(NetDev->PollModeratorTimer))
>       return InterruptStatus;
>   }
> 
>   return InterruptStatus;
> --

This is exactly the case what I worried about. Some platform has a minimum timer
interval (like 1ms or 10ms), so if you set a timer interval shorter than that,
it will actually be signaled after the minimum value. Please check if your
platform could support a such a short timeout (100us).

> 
> >> Otherwise I have seen a pathological case where the TFTP client would pull
> one
> >> packet per second but the TFTP_RETRANSMIT interval on the server was also
> one
> >> second and everything was awful.
> > Why TFTP client just pull one packet per second? I think it’s not correct
> and it
> > could use the poll() function to accelerate the receive. Why you trying to
> solve a
> > TFTP layer problem in SNP layer? It breaks the design principle of network
> layer model.
> Yes, I appreciate the principle. However, in practice we don't get to sell
> adapters that do not PXE boot and it's pointless to argue with customers that
> they have a rubbish TFTP client implementation. So we put in workarounds into
> the driver.

OK I understand your situation now. Technically it will be better to fix the
problem in TFTP client, so I still suggest you to confirm this with your customer
or the TFTP application owner. This workaround in your SNP driver will put you
in the risk that a normal application which follow UEFI spec may not work correctly
with you, so I think you should only consider this method if you can confirm
the TFTP client is the only application layer client you need to serve.

Best Regards,
Siyuan

> >> In theory the busy poll moderator might cause an issue in a pathological
> case,
> >> if it stalls the machine for 100us for each packet - with 100 packets per
> >> second this would eat only 10ms.
> > Not only 10ms, it's at least 10ms, and in theory. If you have multiple NICs
> in the
> > system, this time will double. If someone tries to poll the NIC, the time
> will increase
> > significantly. It also depends on who you measure the 100us timeout, I did
> see some
> > platform which has a minimum timer interval limitation, that you thought you
> are using
> > a 100us timeout but it’s actually 1ms or even 10ms.
> >
> > So my suggestion is do NOT add any busy wait in SNP layer API, keep it as
> unblocking,
> > and the try to solve the application layer problem in application layer.
> I'll try dropping the busy poll and see whether that helps
> 
> Cheers,
> Tom
> >
> > Best Regards,
> > Siyuan
> >
> >> Cheers,
> >> Tom
> >>
> >> On 27/01/2019 14:28, Fu, Siyuan wrote:
> >>> Hi, Tom
> >>>
> >>>> -----Original Message-----
> >>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> >> Tomas
> >>>> Pilar (tpilar)
> >>>> Sent: Friday, January 25, 2019 8:09 PM
> >>>> To: Wu, Jiaxin <jiaxin.wu@intel.com>; Laszlo Ersek <lersek@redhat.com>;
> >> edk2-
> >>>> devel@lists.01.org
> >>>> Cc: Ye, Ting <ting.ye@intel.com>
> >>>> Subject: Re: [edk2] Network Stack Budgeting
> >>>>
> >>>> Yeah, that makes sense I suppose. The end result is however that the
> >> network
> >>>> device is 'opened' as soon as ConnectController() is called rather than
> >> when
> >>>> someone wants to do networking. This seems wrong and directly leads to a
> >> DoS
> >>>> of a platform in case of heavy network load (unless we implement
> budgeting
> >> in
> >>>> the network driver, which is a really not what it should be doing).
> >>> It's true that a configured MNP will start a background system poll to
> >> receive
> >>> network packets, but normally it won't lead to a DoS, because the receive
> >> frequency
> >>> is actually very low. The MNP driver only tries to receive 1 packet from
> SNP
> >> for every
> >>> 10 milliseconds (see MNP_SYS_POLL_INTERVAL, MnpSystemPoll() and related
> >> timer event).
> >>> So no matter how heavy the network is, MNP driver (and upper layer stack)
> >> will
> >>> only process at most 100 packets per second, all other packets should be
> >> dropped
> >>> by UNDI driver directly.
> >>>
> >>> So if there is nobody doing busy network receiving, the network stack cost
> >> will be
> >>> at most 100 packets per NIC device and per second.
> >>>
> >>> In your first email you mentioned "some performance improvements to my
> >> network driver",
> >>> may I know what's the improvement method you are using?
> >>>
> >>> Best Regards,
> >>> Siyuan
> >>>
> >>>> How do you suggest we solve this problem?
> >>>>
> >>>> Cheers,
> >>>> Tom
> >>>>
> >>>> On 25/01/2019 08:44, Wu, Jiaxin wrote:
> >>>>> Hi Tom,
> >>>>>
> >>>>> One thing I want to highlight is that our design of network stack is not
> >>>> only for the PXE/HTTP boot trigged in BootManager, but also to make sure
> >> it's
> >>>> workable once there is any MNP instance configured by upper drivers
> >>>> (ARP/IPv4/IPv6).
> >>>>> Take ARP/IP as an example, once ARP/IP are started, we need a heartbeat
> to
> >>>> process any ARP requests, which is required by ARP functionality. In such
> a
> >>>> case, SNP must be started to make ARP/IP drivers works well.
> >>>>> Thanks,
> >>>>> Jiaxin
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> >>>>>> Tomas Pilar (tpilar)
> >>>>>> Sent: Friday, January 25, 2019 1:43 AM
> >>>>>> To: Laszlo Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
> >>>>>> Subject: Re: [edk2] Network Stack Budgeting
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 24/01/2019 16:49, Laszlo Ersek wrote:
> >>>>>>> On 01/24/19 14:25, Tomas Pilar (tpilar) wrote:
> >>>>>>>> Hmm,
> >>>>>>>>
> >>>>>>>> Mnp->Configure() will eventually call MnpStartSnp().
> >>>>>>>>
> >>>>>>>> A grep for Mnp->Configure shows that:
> >>>>>>>> * ArpDxe performs this on DriverBinding.Start()
> >>>>>>>> * Ip6Dxe performs this on DriverBinding.Start()
> >>>>>>>>
> >>>>>>>> Ipv4 and DnsDhcp do this as a part of their Configure() they expose
> in
> >>>> the
> >>>>>> API.
> >>>>>>> Yes, that makes sense. All of the above drivers are UEFI drivers that
> >>>>>>> follow the UEFI driver model, AIUI. As long as the controller is not
> >>>>>>> connected from BDS, no BindingStart() function should be called in
> these.
> >>>>>> Ah, but I would expect the BDS to call ConnectController() on the NIC,
> >> but
> >>>> I
> >>>>>> would not expect the network stack to be started unless the device is
> >>>>>> actually chosen for PXE booting. In other words, the above protocols
> >> should
> >>>>>> follow the example of EFI_DNS4_PROTOCOL that binds against the device
> >>>>>> during BDS but .Configure() is not automatically called by
> >>>>>> DriverBinding.Start().
> >>>>>>
> >>>>>> .Configure() should be called by the BootManager if networking is
> >> actually
> >>>> to
> >>>>>> be done. That in turn will configure Mnp and start Snp.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Tom
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> edk2-devel mailing list
> >>>>>> edk2-devel@lists.01.org
> >>>>>> https://lists.01.org/mailman/listinfo/edk2-devel
> >>>> _______________________________________________
> >>>> edk2-devel mailing list
> >>>> edk2-devel@lists.01.org
> >>>> https://lists.01.org/mailman/listinfo/edk2-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-29 13:06                               ` Fu, Siyuan
@ 2019-01-29 13:12                                 ` Tomas Pilar (tpilar)
  0 siblings, 0 replies; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-29 13:12 UTC (permalink / raw)
  To: Fu, Siyuan, Wu, Jiaxin, Laszlo Ersek, edk2-devel@lists.01.org; +Cc: Ye, Ting

Hi Siyuan,

Thanks for the insights, hopefully removing the moderator will fix my current issue and I'll have a look at how to do the recycling better.

Cheers,
Tom

On 29/01/2019 13:06, Fu, Siyuan wrote:
> Hi, Tomas
>
>> -----Original Message-----
>> From: Tomas Pilar (tpilar) [mailto:tpilar@solarflare.com]
>> Sent: Tuesday, January 29, 2019 6:54 PM
>> To: Fu, Siyuan <siyuan.fu@intel.com>; Wu, Jiaxin <jiaxin.wu@intel.com>; Laszlo
>> Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
>> Cc: Ye, Ting <ting.ye@intel.com>
>> Subject: Re: [edk2] Network Stack Budgeting
>>
>>
>>
>> On 29/01/2019 03:20, Fu, Siyuan wrote:
>>> Hi, Tomas
>>>
>>>> -----Original Message-----
>>>> From: Tomas Pilar (tpilar) [mailto:tpilar@solarflare.com]
>>>> Sent: Monday, January 28, 2019 7:24 PM
>>>> To: Fu, Siyuan <siyuan.fu@intel.com>; Wu, Jiaxin <jiaxin.wu@intel.com>;
>> Laszlo
>>>> Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
>>>> Cc: Ye, Ting <ting.ye@intel.com>
>>>> Subject: Re: [edk2] Network Stack Budgeting
>>>>
>>>> Hi Siyuan,
>>>>
>>>> I had to do some tweaks due to the fact that people in the wild want to PXE
>>>> boot whole WDS installations which are upwards of 300MB.
>>>>
>>>> I have an internal set of queues where the NIC receives into. Snp.Receive()
>>>> will get a packet from this store. I had to implement a separate queue for
>>>> unicast/multicast/broadcast traffic so that I could reprioritise and
>> deliver
>>>> unicast traffic while under ARP spam. I tied internal poll to receive and
>>>> transmit, so Snp.Receive() will poll the NIC and then dequeue a packet
>> while
>>>> Snp.Transmit() will transmit the packet and then poll the NIC to get a
>>>> completion. I also implemented a busy poll for 100us or until the next
>> packet
>>>> comes in when the NIC is polled.
>>> I see some problems here. Both Snp.Receive() and Snp.Transmit() should be
>> non-blocking
>>> interface in current design and UEFI spec, but it's not true in your driver.
>>> For Receive(), it could poll the NIC only once and return either a received
>> packet
>>> or an error, it should not do any blocking poll to wait for a packet.
>> On some platforms (tftp clients), if the Snp.Receive() returns EFI_NOT_READY
>> they do not poll the NIC until the next MnpSystemPoll(). Hence the delay
>> before returning EFI_NOT_READY in case the packet is still incoming.
> OK, that's really not good practice.
>
>>> For Transmit(), it just need to put the packet into the TX queue and return
>> directly,
>>> without checking if the TX is completed.
>>> Instead of that, Snp.GetStatus() should be
>>> used to check if the packet has really been sent out.
>> I've seen quite a few platforms where GetStatus was only called by the
>> MnpSystemPoll() and the application would not call it. This means our hardware
>> rings are stuffed with completions and we stop being able to transmit. Yes,
>> you could say that we should solve the application, but unfortunately I can't
>> do that.
> The application doesn't need to call Snp.GetStatus() to recycle the TX buffer, the
> Mnp.Transmit() will do this automatically when it found the SNP's TX buffer is full.
> See MnpSyncSendPacket() and MnpRecycleTxBuf(). 
>
>>> May I know how did you control the 100us timeout of the busy poll (by which
>> API?),
>>> and what's the TPL of this busy poll?
>>>
>> Create a timeout timer with value of 1000. Then busy poll in a loop until
>> packet is received or a timer is triggered.
>> This runs at TPL_CALLBACK while the EfxPoll function itself runs at TPL_NOTIFY.
>>
>> --
>>   /* Set the poll moderation timeout to 100us */
>>   gBS->SetTimer(NetDev->PollModeratorTimer, TimerRelative, 1000);
>>
>>   while (!InterruptStatus) {
>>     EfxPoll(NULL, NetDev->Efx);
>>     InterruptStatus = EfxStatus(NetDev->Efx);
>>
>>     if (gBS->CheckEvent(NetDev->PollModeratorTimer))
>>       return InterruptStatus;
>>   }
>>
>>   return InterruptStatus;
>> --
> This is exactly the case what I worried about. Some platform has a minimum timer
> interval (like 1ms or 10ms), so if you set a timer interval shorter than that,
> it will actually be signaled after the minimum value. Please check if your
> platform could support a such a short timeout (100us).
>
>>>> Otherwise I have seen a pathological case where the TFTP client would pull
>> one
>>>> packet per second but the TFTP_RETRANSMIT interval on the server was also
>> one
>>>> second and everything was awful.
>>> Why TFTP client just pull one packet per second? I think it’s not correct
>> and it
>>> could use the poll() function to accelerate the receive. Why you trying to
>> solve a
>>> TFTP layer problem in SNP layer? It breaks the design principle of network
>> layer model.
>> Yes, I appreciate the principle. However, in practice we don't get to sell
>> adapters that do not PXE boot and it's pointless to argue with customers that
>> they have a rubbish TFTP client implementation. So we put in workarounds into
>> the driver.
> OK I understand your situation now. Technically it will be better to fix the
> problem in TFTP client, so I still suggest you to confirm this with your customer
> or the TFTP application owner. This workaround in your SNP driver will put you
> in the risk that a normal application which follow UEFI spec may not work correctly
> with you, so I think you should only consider this method if you can confirm
> the TFTP client is the only application layer client you need to serve.
>
> Best Regards,
> Siyuan
>
>>>> In theory the busy poll moderator might cause an issue in a pathological
>> case,
>>>> if it stalls the machine for 100us for each packet - with 100 packets per
>>>> second this would eat only 10ms.
>>> Not only 10ms, it's at least 10ms, and in theory. If you have multiple NICs
>> in the
>>> system, this time will double. If someone tries to poll the NIC, the time
>> will increase
>>> significantly. It also depends on who you measure the 100us timeout, I did
>> see some
>>> platform which has a minimum timer interval limitation, that you thought you
>> are using
>>> a 100us timeout but it’s actually 1ms or even 10ms.
>>>
>>> So my suggestion is do NOT add any busy wait in SNP layer API, keep it as
>> unblocking,
>>> and the try to solve the application layer problem in application layer.
>> I'll try dropping the busy poll and see whether that helps
>>
>> Cheers,
>> Tom
>>> Best Regards,
>>> Siyuan
>>>
>>>> Cheers,
>>>> Tom
>>>>
>>>> On 27/01/2019 14:28, Fu, Siyuan wrote:
>>>>> Hi, Tom
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
>>>> Tomas
>>>>>> Pilar (tpilar)
>>>>>> Sent: Friday, January 25, 2019 8:09 PM
>>>>>> To: Wu, Jiaxin <jiaxin.wu@intel.com>; Laszlo Ersek <lersek@redhat.com>;
>>>> edk2-
>>>>>> devel@lists.01.org
>>>>>> Cc: Ye, Ting <ting.ye@intel.com>
>>>>>> Subject: Re: [edk2] Network Stack Budgeting
>>>>>>
>>>>>> Yeah, that makes sense I suppose. The end result is however that the
>>>> network
>>>>>> device is 'opened' as soon as ConnectController() is called rather than
>>>> when
>>>>>> someone wants to do networking. This seems wrong and directly leads to a
>>>> DoS
>>>>>> of a platform in case of heavy network load (unless we implement
>> budgeting
>>>> in
>>>>>> the network driver, which is a really not what it should be doing).
>>>>> It's true that a configured MNP will start a background system poll to
>>>> receive
>>>>> network packets, but normally it won't lead to a DoS, because the receive
>>>> frequency
>>>>> is actually very low. The MNP driver only tries to receive 1 packet from
>> SNP
>>>> for every
>>>>> 10 milliseconds (see MNP_SYS_POLL_INTERVAL, MnpSystemPoll() and related
>>>> timer event).
>>>>> So no matter how heavy the network is, MNP driver (and upper layer stack)
>>>> will
>>>>> only process at most 100 packets per second, all other packets should be
>>>> dropped
>>>>> by UNDI driver directly.
>>>>>
>>>>> So if there is nobody doing busy network receiving, the network stack cost
>>>> will be
>>>>> at most 100 packets per NIC device and per second.
>>>>>
>>>>> In your first email you mentioned "some performance improvements to my
>>>> network driver",
>>>>> may I know what's the improvement method you are using?
>>>>>
>>>>> Best Regards,
>>>>> Siyuan
>>>>>
>>>>>> How do you suggest we solve this problem?
>>>>>>
>>>>>> Cheers,
>>>>>> Tom
>>>>>>
>>>>>> On 25/01/2019 08:44, Wu, Jiaxin wrote:
>>>>>>> Hi Tom,
>>>>>>>
>>>>>>> One thing I want to highlight is that our design of network stack is not
>>>>>> only for the PXE/HTTP boot trigged in BootManager, but also to make sure
>>>> it's
>>>>>> workable once there is any MNP instance configured by upper drivers
>>>>>> (ARP/IPv4/IPv6).
>>>>>>> Take ARP/IP as an example, once ARP/IP are started, we need a heartbeat
>> to
>>>>>> process any ARP requests, which is required by ARP functionality. In such
>> a
>>>>>> case, SNP must be started to make ARP/IP drivers works well.
>>>>>>> Thanks,
>>>>>>> Jiaxin
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
>>>>>>>> Tomas Pilar (tpilar)
>>>>>>>> Sent: Friday, January 25, 2019 1:43 AM
>>>>>>>> To: Laszlo Ersek <lersek@redhat.com>; edk2-devel@lists.01.org
>>>>>>>> Subject: Re: [edk2] Network Stack Budgeting
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 24/01/2019 16:49, Laszlo Ersek wrote:
>>>>>>>>> On 01/24/19 14:25, Tomas Pilar (tpilar) wrote:
>>>>>>>>>> Hmm,
>>>>>>>>>>
>>>>>>>>>> Mnp->Configure() will eventually call MnpStartSnp().
>>>>>>>>>>
>>>>>>>>>> A grep for Mnp->Configure shows that:
>>>>>>>>>> * ArpDxe performs this on DriverBinding.Start()
>>>>>>>>>> * Ip6Dxe performs this on DriverBinding.Start()
>>>>>>>>>>
>>>>>>>>>> Ipv4 and DnsDhcp do this as a part of their Configure() they expose
>> in
>>>>>> the
>>>>>>>> API.
>>>>>>>>> Yes, that makes sense. All of the above drivers are UEFI drivers that
>>>>>>>>> follow the UEFI driver model, AIUI. As long as the controller is not
>>>>>>>>> connected from BDS, no BindingStart() function should be called in
>> these.
>>>>>>>> Ah, but I would expect the BDS to call ConnectController() on the NIC,
>>>> but
>>>>>> I
>>>>>>>> would not expect the network stack to be started unless the device is
>>>>>>>> actually chosen for PXE booting. In other words, the above protocols
>>>> should
>>>>>>>> follow the example of EFI_DNS4_PROTOCOL that binds against the device
>>>>>>>> during BDS but .Configure() is not automatically called by
>>>>>>>> DriverBinding.Start().
>>>>>>>>
>>>>>>>> .Configure() should be called by the BootManager if networking is
>>>> actually
>>>>>> to
>>>>>>>> be done. That in turn will configure Mnp and start Snp.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Tom
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> edk2-devel mailing list
>>>>>>>> edk2-devel@lists.01.org
>>>>>>>> https://lists.01.org/mailman/listinfo/edk2-devel
>>>>>> _______________________________________________
>>>>>> edk2-devel mailing list
>>>>>> edk2-devel@lists.01.org
>>>>>> https://lists.01.org/mailman/listinfo/edk2-devel



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-29 10:54                             ` Tomas Pilar (tpilar)
  2019-01-29 13:06                               ` Fu, Siyuan
@ 2019-01-29 13:42                               ` Laszlo Ersek
  2019-01-29 13:52                                 ` Tomas Pilar (tpilar)
  1 sibling, 1 reply; 23+ messages in thread
From: Laszlo Ersek @ 2019-01-29 13:42 UTC (permalink / raw)
  To: Tomas Pilar (tpilar), Fu, Siyuan, Wu, Jiaxin,
	edk2-devel@lists.01.org
  Cc: Ye, Ting

On 01/29/19 11:54, Tomas Pilar (tpilar) wrote:
>> Why TFTP client just pull one packet per second? I think it’s not
>> correct and it could use the poll() function to accelerate the
>> receive. Why you trying to solve a TFTP layer problem in SNP layer?
>> It breaks the design principle of network layer model.

> Yes, I appreciate the principle. However, in practice we don't get to
> sell adapters that do not PXE boot and it's pointless to argue with
> customers that they have a rubbish TFTP client implementation. So we
> put in workarounds into the driver.

Actually, I think this is the core principle behind the UEFI forum, and
the UEFI spec. You shouldn't have to implement bug compat hacks. The era
when an add-on card would work on some platform's BIOS but not on
another's should be left behind. You have a spec to point at, and the
platform in question was likely certified against that spec in some form
(possibly self-certified).

Sp I think it would be reasonable to contact the platform vendor, and to
direct your own customers to that ticket too. If you have a
representative on the USWG, it might make sense to raise the issue there
as well, especially if the issue is wide-spread and affects multiple
platform vendors. The UEFI spec targets practical, common use cases, and
this looks like one.

(When a RH customer or partner reports e.g. a RHEL kernel issue to us,
and we determine it is a problem in the firmware, we absolutely talk to
the platform vendor, and sometimes to standards bodies too. We also
advise customers on the applications that they run on RHEL, if they ask
and/or care to listen. Plus, some high-profile applications and RHEL are
explicitly certified against each other.)

... I don't mean to intrude of course; I'm sorry if I came through like
that.

Thanks
Laszlo


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Network Stack Budgeting
  2019-01-29 13:42                               ` Laszlo Ersek
@ 2019-01-29 13:52                                 ` Tomas Pilar (tpilar)
  0 siblings, 0 replies; 23+ messages in thread
From: Tomas Pilar (tpilar) @ 2019-01-29 13:52 UTC (permalink / raw)
  To: Laszlo Ersek, Fu, Siyuan, Wu,  Jiaxin, edk2-devel@lists.01.org; +Cc: Ye, Ting



On 29/01/2019 13:42, Laszlo Ersek wrote:
> On 01/29/19 11:54, Tomas Pilar (tpilar) wrote:
>>> Why TFTP client just pull one packet per second? I think it’s not
>>> correct and it could use the poll() function to accelerate the
>>> receive. Why you trying to solve a TFTP layer problem in SNP layer?
>>> It breaks the design principle of network layer model.
>> Yes, I appreciate the principle. However, in practice we don't get to
>> sell adapters that do not PXE boot and it's pointless to argue with
>> customers that they have a rubbish TFTP client implementation. So we
>> put in workarounds into the driver.
> Actually, I think this is the core principle behind the UEFI forum, and
> the UEFI spec. You shouldn't have to implement bug compat hacks. The era
> when an add-on card would work on some platform's BIOS but not on
> another's should be left behind. You have a spec to point at, and the
> platform in question was likely certified against that spec in some form
> (possibly self-certified).
>
> Sp I think it would be reasonable to contact the platform vendor, and to
> direct your own customers to that ticket too. If you have a
> representative on the USWG, it might make sense to raise the issue there
> as well, especially if the issue is wide-spread and affects multiple
> platform vendors. The UEFI spec targets practical, common use cases, and
> this looks like one.
>
> (When a RH customer or partner reports e.g. a RHEL kernel issue to us,
> and we determine it is a problem in the firmware, we absolutely talk to
> the platform vendor, and sometimes to standards bodies too. We also
> advise customers on the applications that they run on RHEL, if they ask
> and/or care to listen. Plus, some high-profile applications and RHEL are
> explicitly certified against each other.)
>
> ... I don't mean to intrude of course; I'm sorry if I came through like
> that.
Thanks, this is some good food for thought. I certainly appreciate the progress we've made since the old EFI days.

Tom

>
> Thanks
> Laszlo



^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2019-01-29 13:52 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-01-23 10:55 Network Stack Budgeting Tomas Pilar (tpilar)
2019-01-23 14:14 ` Ye, Ting
2019-01-23 14:51   ` Tomas Pilar (tpilar)
2019-01-23 16:08 ` Laszlo Ersek
2019-01-23 16:27   ` Tomas Pilar (tpilar)
2019-01-23 17:47     ` Andrew Fish
2019-01-23 22:18     ` Laszlo Ersek
2019-01-24 11:37       ` Tomas Pilar (tpilar)
2019-01-24 12:25         ` Laszlo Ersek
2019-01-24 12:58           ` Tomas Pilar (tpilar)
2019-01-24 13:25             ` Tomas Pilar (tpilar)
2019-01-24 16:49               ` Laszlo Ersek
2019-01-24 17:43                 ` Tomas Pilar (tpilar)
2019-01-25  8:44                   ` Wu, Jiaxin
2019-01-25 12:08                     ` Tomas Pilar (tpilar)
2019-01-27 14:28                       ` Fu, Siyuan
2019-01-28 11:24                         ` Tomas Pilar (tpilar)
2019-01-29  3:20                           ` Fu, Siyuan
2019-01-29 10:54                             ` Tomas Pilar (tpilar)
2019-01-29 13:06                               ` Fu, Siyuan
2019-01-29 13:12                                 ` Tomas Pilar (tpilar)
2019-01-29 13:42                               ` Laszlo Ersek
2019-01-29 13:52                                 ` Tomas Pilar (tpilar)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox