From: "Daniel Schaefer" <daniel.schaefer@hpe.com>
To: <devel@edk2.groups.io>, <gaoliming@byosoft.com.cn>, <afish@apple.com>
Cc: <derek.lin2@hpe.com>
Subject: Re: Reply: [edk2-devel] Multithreaded compression with LZMA2
Date: Wed, 2 Dec 2020 16:24:47 +0800
Message-ID: <0b9f1f35-a958-fddf-d2aa-247f41a4b5b1@hpe.com>
In-Reply-To: <013801d6c86b$0941a040$1bc4e0c0$@byosoft.com.cn>

On 12/2/20 1:21 PM, gaoliming wrote:
> Daniel:
>
> Can you provide the compressed image size? Also, which image is being compressed? Is it the generated FV image?
>
> Thanks
>
> Liming
>
> *From:* bounce+27952+68159+4905953+8761045@groups.io <bounce+27952+68159+4905953+8761045@groups.io> *On Behalf Of* Andrew Fish via groups.io
> *Sent:* December 2, 2020 11:37
> *To:* devel@edk2.groups.io; daniel.schaefer@hpe.com
> *Cc:* derek.lin2@hpe.com
> *Subject:* Re: [edk2-devel] Multithreaded compression with LZMA2
>
>
>
> On Dec 1, 2020, at 6:59 PM, Daniel Schaefer <daniel.schaefer@hpe.com> wrote:
>
> Hi everyone,
>
> I'm looking into how to speed up the build process and noticed that our build
> uses LZMA to compress the main firmware volume. Since it's quite big, it takes a
> while but only uses one CPU thread.
>
> LZMA2 is a version of LZMA which can be multi-threaded and achieve much faster
> compression times. I did a quick benchmark using the `xz` command-line tool,
> which uses a modified version of the LZMA SDK that EDK2 uses. The results are:
>
> Uncompressed size: 64M
>
> | Algo | Comp Time | Decomp Time | Size | Threads |
> | ----- | --------- | ----------- | ---- | ------- |
> | LZMA | 19.67s | 0.9s | 9.1M | 1 |
> | LZMA2 | 20.11s | 1.2s | 9.2M | 1 |
> | LZMA2 | 8.31s | 1.0s | 9.4M | 4 |
>
> Using those commands:
>
> time xz --format=lzma testfile
> time unlzma testfile.lzma
>
> time xz --lzma2 testfile
> time unxz testfile.xz
>
> time xz -T4 --lzma2 testfile
> time unxz testfile.xz
>
> This is quite a significant improvement of build time, while decompression time
> and size only slightly increase. If that's a concern, then LZMA2 could be used
> for development only.
>
> I haven't investigated the details of how to support this in the code but it
> appears to be a simple change, since the LZMA SDK that we use already supports
> LZMA2.
>
> What do you think?
>
> Interesting idea. What OS did you use? I tried this on macOS on some larger FVs and I did not see much difference. I tried a 17.5 MiB FV and it was around 3 seconds both ways.
>
> Maybe it would be worthwhile to see how it works on various systems? I guess it might be data set related?

Hi Andrew and Liming,

The FV file is our main FV with the majority of DXEs. It's 64MB uncompressed
and 9MB compressed, as mentioned before. Unfortunately I cannot share that
particular file with you, but I am also surprised that it compresses so well, to
just 14% of its original size.

I'm running my tests on X64 Linux. I ran the same tests again on a more
powerful machine with the hyperfine command. It runs the test case 3 times to
warm up (e.g. caches) and then runs it 10 times; only those 10 runs count
towards the average. The result is the same as before: compression with
4 threads takes about 40% of the single-threaded time.
I don't observe any further speedup when using all 16 threads of the CPU.

# Simple LZMA
$ hyperfine --warmup 3 'xz -k --format=lzma testfile && rm testfile.lzma'
Benchmark #1: xz -k --format=lzma testfile && rm testfile.lzma
Time (mean ± σ): 12.755 s ± 0.151 s [User: 12.691 s, System: 0.064 s]
Range (min … max): 12.568 s … 12.991 s 10 runs
# LZMA2 with single thread
$ hyperfine --warmup 3 'xz -k -T1 --lzma2 testfile && rm testfile.xz'
Benchmark #1: xz -k -T1 --lzma2 testfile && rm testfile.xz
Time (mean ± σ): 12.838 s ± 0.149 s [User: 12.783 s, System: 0.055 s]
Range (min … max): 12.546 s … 13.053 s 10 runs
# LZMA2 with 4 threads
$ hyperfine --warmup 3 'xz -k -T4 --lzma2 testfile && rm testfile.xz'
Benchmark #1: xz -k -T4 --lzma2 testfile && rm testfile.xz
Time (mean ± σ): 5.241 s ± 0.025 s [User: 13.537 s, System: 0.177 s]
Range (min … max): 5.227 s … 5.302 s 10 runs

Using xz from mingw64 on Windows 10 X64, the 4-threaded compression takes 47% of the single-threaded time.
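
As a rough cross-check of the ratios quoted above, the hyperfine means can be plugged into a few lines of Python. This is only a sketch: the dictionary keys and helper names are made up for illustration, and the numbers are copied from the three Linux runs on the 64MB FV. Dividing CPU time (User + System) by wall time gives an estimate of how many cores were kept busy.

```python
# Mean timings in seconds, copied from the hyperfine output above.
# The key names and helper functions are illustrative, not part of any tool.
RUNS = {
    "lzma_1_thread": {"wall": 12.755, "user": 12.691, "system": 0.064},
    "lzma2_1_thread": {"wall": 12.838, "user": 12.783, "system": 0.055},
    "lzma2_4_threads": {"wall": 5.241, "user": 13.537, "system": 0.177},
}


def wall_ratio(run: str, baseline: str) -> float:
    """Wall-clock time of `run` as a fraction of `baseline`."""
    return RUNS[run]["wall"] / RUNS[baseline]["wall"]


def busy_cores(run: str) -> float:
    """CPU seconds per wall-clock second: roughly how many cores were busy."""
    r = RUNS[run]
    return (r["user"] + r["system"]) / r["wall"]


if __name__ == "__main__":
    ratio = wall_ratio("lzma2_4_threads", "lzma2_1_thread")
    print(f"4 threads vs 1 thread (LZMA2): {ratio:.2f} of the wall time")
    for name in RUNS:
        print(f"{name}: about {busy_cores(name):.2f} cores busy")
```

With these inputs, the 4-thread run comes out at about 0.41 of the single-threaded LZMA2 wall time, in line with the roughly 40% quoted above, and at about 2.6 busy cores on average, so the 4 threads are not fully saturated.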
---
I also wanted to try it on another file and ran the same benchmarks on a 16MB file,
only to discover that compression is no faster with multithreading. The file shrinks
from 16MB to 9MB. However, after my tests I discovered that this is the
combined FV, which includes a few other compressed FVs.
Here the multithreaded command even produces an output that is 0.2MB smaller.
$ hyperfine --warmup 3 'xz -k -T1 --lzma2 testfile && rm testfile.xz'
Benchmark #1: xz -k -T1 --lzma2 testfile && rm testfile.xz
Time (mean ± σ): 2.874 s ± 0.134 s [User: 2.825 s, System: 0.049 s]
Range (min … max): 2.751 s … 3.088 s 10 runs
$ ls -lh
-rw-r--r-- 1 zoid users 16M Dec 2 15:29 testfile
-rw-r--r-- 1 zoid users 9.4M Dec 2 15:29 testfile.lzma
$ hyperfine --warmup 3 'xz -k -T4 --lzma2 testfile && rm testfile.xz'
Benchmark #1: xz -k -T4 --lzma2 testfile && rm testfile.xz
Time (mean ± σ): 2.874 s ± 0.108 s [User: 2.818 s, System: 0.070 s]
Range (min … max): 2.775 s … 3.081 s 10 runs
$ ls -lh
-rw-r--r-- 1 zoid users 16M Dec 2 15:29 testfile
-rw-r--r-- 1 zoid users 9.2M Dec 2 15:29 testfile.xz
So even with a file that doesn't compress as well, the performance doesn't get any worse.
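
One way to picture why threading can help on a large input but not on a smaller one is block splitting: a multithreaded compressor cuts the input into blocks and compresses them independently. The sketch below shows that idea with Python's standard `lzma` module and a thread pool. It is an illustration under assumptions, not how xz or the LZMA SDK implement threading: `compress_blocks`, `make_sample`, and the block size are hypothetical, and each block becomes its own .xz stream before the streams are concatenated.

```python
import lzma
import random
from concurrent.futures import ThreadPoolExecutor


def compress_blocks(data: bytes, block_size: int, workers: int) -> bytes:
    """Compress `data` as independent LZMA2 (.xz) streams, one per block."""
    # Keep at least one (possibly empty) block so the output is always valid.
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the streams can simply be joined.
        streams = pool.map(lambda b: lzma.compress(b, format=lzma.FORMAT_XZ), blocks)
        return b"".join(streams)


def make_sample() -> bytes:
    """Synthetic input: one pseudo-random 64 KiB chunk repeated 8 times."""
    rng = random.Random(0)
    chunk = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
    return chunk * 8


if __name__ == "__main__":
    sample = make_sample()
    whole = lzma.compress(sample, format=lzma.FORMAT_XZ)
    split = compress_blocks(sample, block_size=64 * 1024, workers=4)
    # lzma.decompress() accepts concatenated streams and joins their contents.
    assert lzma.decompress(split) == sample
    print(f"input={len(sample)} whole={len(whole)} split={len(split)}")
```

If the input fits in a single block, only one worker has anything to do, and because blocks cannot reference each other's data, the split output tends to be somewhat larger. In the 16MB run above, User time (2.818 s) is almost equal to wall time (2.874 s), which would be consistent with only one thread doing work, though the actual cause was not investigated here.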
Thanks,
Daniel