public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Laszlo Ersek" <lersek@redhat.com>
To: "Philippe Mathieu-Daudé" <philmd@redhat.com>, devel@edk2.groups.io
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>,
	Jordan Justen <jordan.l.justen@intel.com>
Subject: Re: [PATCH v2 02/10] OvmfPkg: raise PcdShellFileOperationSize to 128KB
Date: Fri, 15 Jan 2021 19:22:00 +0100	[thread overview]
Message-ID: <ea523cf3-114f-2778-86a3-445b3de34318@redhat.com> (raw)
In-Reply-To: <b82b3803-84a8-254b-6dc2-4fc3fa9cd626@redhat.com>

Hi Phil,

On 01/15/21 16:58, Philippe Mathieu-Daudé wrote:
> On 1/15/21 11:09 AM, Laszlo Ersek wrote:
>> On 01/15/21 10:34, Philippe Mathieu-Daudé wrote:
>>> On 1/13/21 9:54 AM, Laszlo Ersek wrote:
>>>>
>>>> The virtio-fs daemon of QEMU advertizes a 128KB maximum buffer size by
>>>> default, for the FUSE_WRITE operation.
>>>
>>> I delayed this patch review because I couldn't find where this value
>>> is advertized in QEMU (virtiofsd is very new to me). Can you enlighten
>>> me please?
>>
>> Yes, absolutely. It's really difficult to find. I actually started to
>> capture all that information in what would be commit 6f7bc7196ff2
>> ("OvmfPkg/VirtioFsDxe: implement the wrapper function for FUSE_WRITE",
>> 2020-12-21), or perhaps in this patch even -- but in the end, it was
>> such a divergent set of commits, originating actually in the "libfuse"
>> project, from which virtiofsd was forked, that I decided to drop it.
>>
>> So, in virtiofsd, you actually have to look for the *page count* that
>> corresponds to 128KB -- FUSE_DEFAULT_MAX_PAGES_PER_REQ (32).
>>
>> In the fuse_session_new() function, we have
>>
>>     se->conn.max_write = UINT_MAX;
>>     se->bufsize = FUSE_MAX_MAX_PAGES * getpagesize() + FUSE_BUFFER_HEADER_SIZE;
>>
>> In the do_init() function, which handles the FUSE_INIT request, we have
>>
>>     size_t bufsize = se->bufsize;
>>
>> then
>>
>>     if (!(arg->flags & FUSE_MAX_PAGES)) {
>>         size_t max_bufsize = FUSE_DEFAULT_MAX_PAGES_PER_REQ * getpagesize() +
>>                              FUSE_BUFFER_HEADER_SIZE;
>>         if (bufsize > max_bufsize) {
>>             bufsize = max_bufsize;
>>         }
>>     }
>>
>> and later
>>
>>     if (se->conn.max_write > bufsize - FUSE_BUFFER_HEADER_SIZE) {
>>         se->conn.max_write = bufsize - FUSE_BUFFER_HEADER_SIZE;
>>     }
>>
>> and then
>>
>>     outarg.max_write = se->conn.max_write;
>>
>> It's super convoluted, because a bunch of flags are handled, and they
>> all control the buffer size one way or another, and so the various
>> limits are converted and enforced back and forth. In practice, the
>> virtio-fs driver in OVMF does not set the FUSE_MAX_PAGES flag in the
>> FUSE_INIT request, and so in virtiofsd, we first reduce "bufsize" from
>>
>>   FUSE_MAX_MAX_PAGES             * getpagesize() + FUSE_BUFFER_HEADER_SIZE;
>>
>> to
>>
>>   FUSE_DEFAULT_MAX_PAGES_PER_REQ * getpagesize() + FUSE_BUFFER_HEADER_SIZE;
>>
>> and then calculate "max_write" by subtracting FUSE_BUFFER_HEADER_SIZE.
>> The result is
>>
>>   FUSE_DEFAULT_MAX_PAGES_PER_REQ * getpagesize()
>>
>> which is 128 KB.
>>
>>
>> In the libfuse project <https://github.com/libfuse/libfuse.git>, one
>> related commit is:
>>
>>   https://github.com/libfuse/libfuse/commit/027d0d17c8a4
>>
>> and before that,
>>
>>   https://github.com/libfuse/libfuse/commit/97f4a9cb4fc69
>>
>> If you try to peek even before that, you end up with historical commits
>> like
>>
>>   https://github.com/libfuse/libfuse/commit/065f222cd5850
>>
>> So basically the *origin* of max_write=128KB is untraceable :) But, at
>> least the present behavior can be found; it comes from libfuse commit
>> 027d0d17c8a4.
>>
>>
>> Approaching the same topic from the FUSE client in the Linux kernel,
>> commit 5da784cce430 ("fuse: add max_pages to init_out", 2018-10-01) is
>> relevant; it also points to the following discussion on LKML:
>> <https://lkml.org/lkml/2012/7/5/136>.
>>
>> Again it's convoluted; the direct answer to your question is,
>> "max_write=128KB comes from FUSE_DEFAULT_MAX_PAGES_PER_REQ, because the
>> FUSE_MAX_PAGES flag is not requested in the FUSE_INT operation".
> 
> Thanks a lot for the full explanation!
> I'm glad I asked this early enough so this was fresh in your
> memory :) At least we got it archived on the list.
> 
> Shouldn't QEMU announce this via a better mechanism? I.e. a
> fw_cfg entry? This is so convoluted (as you said) than I'm
> worried libfuse change a parameter and we end having a debug
> nightmare figuring the root cause...

Wait, I think you don't have the full picture.

(1) In the FUSE_INIT *response*, virtiofsd advertizes "max_write". This
value is critically important to the virtio-fs driver in OVMF
(OvmfPkg/VirtioFsDxe), and indeed said driver adheres to this limit
closely. When the driver serves an EFI_FILE_PROTOCOL.Write() request, it
breaks the write into chunks of "max_write" bytes, for individual
FUSE_WRITE operations to be sent to virtiofsd.

Please refer to edk2 commits

- 6f7bc7196ff2 ("OvmfPkg/VirtioFsDxe: implement the wrapper function for
FUSE_WRITE", 2020-12-21), and

- 44f43f94cebe ("OvmfPkg/VirtioFsDxe: implement
EFI_FILE_PROTOCOL.Write()", 2020-12-21).

In fact, max_write=128KB is obvious; the reason for that is that the
virtiofsd log contains this value, when the FUSE_INIT request is
processed and answered.

This is one half of the picture -- the important half.

(2) The much less important half is this patch. This patch is a "best
effort" optimization, based on the *empirical value* of 128KB that's
advertized for max_write.

If virtiofsd were modified and it changed the max_write default to 64KB
or 256KB, then this PCD setting would no longer match. But, that would
not be a tragedy, exactly how it is not a tragedy *right now* that the
current PCD setting is 4KB. The patch is a small, static optimization
based on empirical data. It is not foolproof, and it doesn't *have* to be.


So, the complexity is entirely hidden in how virtiofsd calculates the
128 KB value for max_write. That's in fact a virtiofsd implementation
detail. What matters is that the OVMF driver for virtio-fs sees
max_write=128KB *clearly announced* in the FUSE_INIT response, and that
the driver rigorously honors that limit.

Conversely, the present patch is absolutely secondary; it is a "low
hanging fruit", ad-hoc optimization, at a higher level. The PCD change
affects e.g. the CP command in the UEFI shell, even if you only use a
FAT filesystem. In that case, the PCD change is entirely pointless --
but it does not hurt, and the change is still very simple.

It's unreasonable to expect that the CP command flexibly adhere to the
optimal / maximal write size for the particular filesystem it is writing
to. That's not the job of the CP command -- it is the job of the
filesystem driver under CP. But, by increasing the buffer size of CP
(and of some other UEFI shell commands that work with files), based on
some empirical data, we *permit* the virtio-fs driver to work better
under CP -- without harming CP on filesystems different from virtio-fs.
As the commit message says, "And when a Virtio Filesystem is not used, a
128KB chunk size is still not particularly wasteful".

> 
> Anyhow this is not blocking this patch, so:
> Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com>

Thanks!
Laszlo


  reply	other threads:[~2021-01-15 18:22 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-13  8:54 [PATCH v2 00/10] multiple packages: shell usability improvements Laszlo Ersek
2021-01-13  8:54 ` [PATCH v2 01/10] ShellPkg/Comp: add file buffering Laszlo Ersek
2021-01-13 18:42   ` Philippe Mathieu-Daudé
2021-01-13  8:54 ` [PATCH v2 02/10] OvmfPkg: raise PcdShellFileOperationSize to 128KB Laszlo Ersek
2021-01-15  9:34   ` Philippe Mathieu-Daudé
2021-01-15 10:09     ` Laszlo Ersek
2021-01-15 15:58       ` Philippe Mathieu-Daudé
2021-01-15 18:22         ` Laszlo Ersek [this message]
2021-01-13  8:54 ` [PATCH v2 03/10] ArmVirtPkg: " Laszlo Ersek
2021-01-15 15:59   ` Philippe Mathieu-Daudé
2021-01-15 19:03     ` [edk2-devel] " Laszlo Ersek
2021-01-13  8:54 ` [PATCH v2 04/10] EmulatorPkg: add OrderedCollectionLib class resolution Laszlo Ersek
2021-01-13 13:20   ` Philippe Mathieu-Daudé
2021-01-18 16:48   ` [edk2-devel] " Laszlo Ersek
2021-01-19  7:55     ` 回复: " gaoliming
2021-01-13  8:54 ` [PATCH v2 05/10] UefiPayloadPkg: " Laszlo Ersek
2021-01-13 13:20   ` Philippe Mathieu-Daudé
2021-01-18 16:48   ` [edk2-devel] " Laszlo Ersek
2021-01-19  4:29     ` Guo Dong
2021-01-13  8:54 ` [PATCH v2 06/10] ShellPkg/ShellCommandLib: add ShellSortFileList() Laszlo Ersek
2021-01-13 13:19   ` Philippe Mathieu-Daudé
2021-01-13  8:54 ` [PATCH v2 07/10] ShellPkg/Ls: sort output by FileName in non-SFO mode Laszlo Ersek
2021-01-13 13:15   ` Philippe Mathieu-Daudé
2021-01-13  8:54 ` [PATCH v2 08/10] ShellPkg/ShellProtocol: sort files by FullName in RemoveDupInFileList() Laszlo Ersek
2021-01-13 13:14   ` Philippe Mathieu-Daudé
2021-01-14 14:19     ` Laszlo Ersek
2021-01-13  8:54 ` [PATCH v2 09/10] OvmfPkg: disable list length checks in NOOPT and DEBUG builds Laszlo Ersek
2021-01-15  9:30   ` Philippe Mathieu-Daudé
2021-01-13  8:54 ` [PATCH v2 10/10] ArmVirtPkg: " Laszlo Ersek
2021-01-15  9:31   ` Philippe Mathieu-Daudé
2021-01-19 18:39 ` [edk2-devel] [PATCH v2 00/10] multiple packages: shell usability improvements Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ea523cf3-114f-2778-86a3-445b3de34318@redhat.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox