From: "Cohen, Eugene" <eugene@hp.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>,
"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>,
"leif.lindholm@linaro.org" <leif.lindholm@linaro.org>,
"liming.gao@intel.com" <liming.gao@intel.com>
Subject: Re: [PATCH 1/3] ArmPkg/CompilerIntrinsicsLib: replace memcpy and memset with C code
Date: Thu, 11 Aug 2016 21:34:08 +0000 [thread overview]
Message-ID: <AT5PR84MB0291918FB0FDCE587FC4A3B1B41E0@AT5PR84MB0291.NAMPRD84.PROD.OUTLOOK.COM> (raw)
In-Reply-To: <1470939632-32198-1-git-send-email-ard.biesheuvel@linaro.org>
> This replaces the various implementations of memset and memcpy,
> including the ARM RTABI ones (__aeabi_mem[set|clr]_[|4|8]) with
> a single C implementation for each. The ones we have are either not
> very sophisticated (ARM), or they are too sophisticated (memcpy() on
> AARCH64, which may perform unaligned accesses) or already coded in
> C
> (memset on AArch64).
Ard,
I'm concerned about the performance impact of this change... there's a reason for all that complexity and it's to optimize performance.
Why does memcpy performance matter? In addition to the overall memcpy stuff scattered around C code we have an instance that is particularly sensitive to memcpy performance. For DMA operations when invoking double-buffering or access to portions of a buffer that is common mapped (i.e. uncached on non-coherent DMA systems) the impact of a non-optimized memcpy is enormous compared to the optimized ones because the penalty is amplified by orders of magnitude due to uncached memory access latency.
So I would ask that before a change like this is brought in that we characterize the cached-cached and cached-uncached (and perhaps unaligned cached-cached) performance across the implementations. Based on my experience I'm expecting both cases will take a massive performance hit.
>From your commit message I'm inferring that the problem you're solving is to play nice in environments that can't tolerate unaligned access like when the MMU is off. I get that - and I think a variant of the library that plays nice in these limited cases makes sense. However, I don't think we should drag down the performance down of the rest of the environment where we spend the vast majority of our time executing.
Eugene
next prev parent reply other threads:[~2016-08-11 21:34 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-11 18:20 [PATCH 1/3] ArmPkg/CompilerIntrinsicsLib: replace memcpy and memset with C code Ard Biesheuvel
2016-08-11 18:20 ` [PATCH 2/3] BaseTools RVCT: ignore various RVC diagnostics Ard Biesheuvel
2016-08-12 3:04 ` Gao, Liming
2016-08-11 18:20 ` [PATCH 3/3] MdePkg RVCT: add definition of UNREACHABLE Ard Biesheuvel
2016-08-12 3:04 ` Gao, Liming
2016-08-11 21:34 ` Cohen, Eugene [this message]
2016-08-11 21:45 ` [PATCH 1/3] ArmPkg/CompilerIntrinsicsLib: replace memcpy and memset with C code Ard Biesheuvel
2016-08-11 21:50 ` Cohen, Eugene
2016-08-11 21:53 ` Ard Biesheuvel
2016-08-11 23:04 ` Andrew Fish
2016-08-12 8:09 ` Ard Biesheuvel
2016-09-01 13:11 ` Leif Lindholm
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=AT5PR84MB0291918FB0FDCE587FC4A3B1B41E0@AT5PR84MB0291.NAMPRD84.PROD.OUTLOOK.COM \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox