* Multithreaded compression with LZMA2
From: Daniel Schaefer @ 2020-12-02  2:59 UTC
  To: devel@edk2.groups.io; +Cc: derek.lin2

Hi everyone,

I'm looking into how to speed up the build process and noticed that our build
uses LZMA to compress the main firmware volume. Since it's quite big it takes a
while but only uses one CPU thread.

LZMA2 is a version of LZMA which can be multi-threaded and achieve much faster
compression times. I did a quick benchmark using the `xz` command-line tool,
which uses a modified version of the LZMA SDK that EDK2 uses. The results are:

Uncompressed size: 64M

| Algo  | Comp Time | Decomp Time | Size | Threads |
| ----- | --------- | ----------- | ---- | ------- |
| LZMA  | 19.67s    | 0.9s        | 9.1M | 1       |
| LZMA2 | 20.11s    | 1.2s        | 9.2M | 1       |
| LZMA2 | 8.31s     | 1.0s        | 9.4M | 4       |

Using these commands:

    time xz --format=lzma testfile
    time unlzma testfile.lzma

    time xz --lzma2 testfile
    time unxz testfile.xz

    time xz -T4 --lzma2 testfile
    time unxz testfile.xz

This is quite a significant improvement in build time, while decompression time
and size increase only slightly. If that's a concern, then LZMA2 could be used
for development only.

I haven't investigated the details of how to support this in the code, but it
appears to be a simple change, since the LZMA SDK that we use already supports
LZMA2.

What do you think?

Thanks,
Daniel
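For readers without the `xz` tool at hand, the LZMA versus LZMA2 size comparison can be approximated with Python's standard `lzma` module. This is only a sketch: the payload is a made-up placeholder rather than a real firmware volume, the helper names are hypothetical, and the module binds liblzma rather than the LZMA SDK that EDK2 ships, so the numbers will not match the table above.

```python
import lzma
import os


def compress_lzma1(data: bytes) -> bytes:
    """Legacy .lzma container (LZMA1), comparable to `xz --format=lzma`."""
    return lzma.compress(data, format=lzma.FORMAT_ALONE)


def compress_lzma2(data: bytes) -> bytes:
    """.xz container, which uses the LZMA2 filter."""
    return lzma.compress(data, format=lzma.FORMAT_XZ)


# Placeholder payload (an assumption): repetitive text plus some random bytes.
sample = (b"EDK2 firmware volume placeholder " * 4096) + os.urandom(4096)

lzma1_blob = compress_lzma1(sample)
lzma2_blob = compress_lzma2(sample)

# Both containers must round-trip; decompress() auto-detects the format.
assert lzma.decompress(lzma1_blob) == sample
assert lzma.decompress(lzma2_blob) == sample

print(f"uncompressed: {len(sample)} bytes")
print(f"LZMA1 (.lzma): {len(lzma1_blob)} bytes")
print(f"LZMA2 (.xz):   {len(lzma2_blob)} bytes")
```

The two sizes are expected to be close, with the .xz container carrying a little extra framing.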
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: Andrew Fish @ 2020-12-02  3:36 UTC
  To: devel, daniel.schaefer; +Cc: derek.lin2

> On Dec 1, 2020, at 6:59 PM, Daniel Schaefer <daniel.schaefer@hpe.com> wrote:
>
> LZMA2 is a version of LZMA which can be multi-threaded and achieve much faster
> compression times.
> [...]
> What do you think?

Interesting idea. What OS did you use? I tried this on macOS on some larger FVs
and did not see much difference. I tried a 17.5 MiB FV and it took around 3
seconds both ways.

Maybe it would be worthwhile to see how it works on various systems? I guess it
might be data-set related.

Thanks,

Andrew Fish
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: gaoliming @ 2020-12-02  5:21 UTC
  To: devel, afish, daniel.schaefer; +Cc: derek.lin2

Daniel:

Can you provide the compressed image size? And which image is being compressed?
Is it the generated FV image?

Thanks
Liming

From: Andrew Fish via groups.io
Sent: December 2, 2020 11:37
To: devel@edk2.groups.io; daniel.schaefer@hpe.com
Cc: derek.lin2@hpe.com
Subject: Re: [edk2-devel] Multithreaded compression with LZMA2

> Interesting idea. What OS did you use? I tried this on macOS on some larger
> FVs and I did not see much difference? I tried a 17.5 MiB FV and it was
> around 3 seconds both ways.
> [...]
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: Daniel Schaefer @ 2020-12-02  8:24 UTC
  To: devel, gaoliming, afish; +Cc: derek.lin2

On 12/2/20 1:21 PM, gaoliming wrote:
> Daniel:
>
> Can you provide the compressed image size? And, what image is used to be
> compressed? Is it the generated FV image?
>
> [...]
>
> Interesting idea. What OS did you use? I tried this on macOS on some larger
> FVs and I did not see much difference? I tried a 17.5 MiB FV and it was
> around 3 seconds both ways.
>
> Maybe it would be worth while seeing how it works on various systems? I guess
> it might be data set related?

Hi Andrew and Liming,

the FV file is our main FV with the majority of DXEs. It's 64MB uncompressed
and 9MB compressed, as mentioned before. Unfortunately I cannot share that
particular file with you, but I am also surprised that it compresses so well,
to just 14% of its original size.

I'm running my tests on X64 Linux.

I ran the same tests again on a more powerful machine with the hyperfine
command. It runs the test case 3 times to warm up (e.g. caches) and then runs
it 10 times that count towards the average.

The result is the same as before: compression with 4 threads takes just 40% of
the single-threaded time. I don't observe any further speedup from using all 16
threads of the CPU.
    # Simple LZMA
    $ hyperfine --warmup 3 'xz -k --format=lzma testfile && rm testfile.lzma'
    Benchmark #1: xz -k --format=lzma testfile && rm testfile.lzma
      Time (mean ± σ):     12.755 s ±  0.151 s    [User: 12.691 s, System: 0.064 s]
      Range (min … max):   12.568 s … 12.991 s    10 runs

    # LZMA2 with single thread
    $ hyperfine --warmup 3 'xz -k -T1 --lzma2 testfile && rm testfile.xz'
    Benchmark #1: xz -k -T1 --lzma2 testfile && rm testfile.xz
      Time (mean ± σ):     12.838 s ±  0.149 s    [User: 12.783 s, System: 0.055 s]
      Range (min … max):   12.546 s … 13.053 s    10 runs

    # LZMA2 with 4 threads
    $ hyperfine --warmup 3 'xz -k -T4 --lzma2 testfile && rm testfile.xz'
    Benchmark #1: xz -k -T4 --lzma2 testfile && rm testfile.xz
      Time (mean ± σ):      5.241 s ±  0.025 s    [User: 13.537 s, System: 0.177 s]
      Range (min … max):    5.227 s …  5.302 s    10 runs

Using xz from mingw64 on Windows 10 X64, the 4-threaded compression takes 47%
of the single-threaded time.

---

I wanted to try it on a bigger file and ran the same benchmarks on a 16MB file,
only to discover that compression is no faster with multithreading. The file
shrinks from 16MB to 9MB. However, after my tests I discovered that this is the
combined FV, which includes a few other compressed FVs.
Here the multithreaded command is even able to compress the FV 0.2MB more.
    $ hyperfine --warmup 3 'xz -k -T1 --lzma2 testfile && rm testfile.xz'
    Benchmark #1: xz -k -T1 --lzma2 testfile && rm testfile.xz
      Time (mean ± σ):      2.874 s ±  0.134 s    [User: 2.825 s, System: 0.049 s]
      Range (min … max):    2.751 s …  3.088 s    10 runs
    $ ls -lh
    -rw-r--r-- 1 zoid users  16M Dec  2 15:29 testfile
    -rw-r--r-- 1 zoid users 9.4M Dec  2 15:29 testfile.lzma

    $ hyperfine --warmup 3 'xz -k -T4 --lzma2 testfile && rm testfile.xz'
    Benchmark #1: xz -k -T4 --lzma2 testfile && rm testfile.xz
      Time (mean ± σ):      2.874 s ±  0.108 s    [User: 2.818 s, System: 0.070 s]
      Range (min … max):    2.775 s …  3.081 s    10 runs
    $ ls -lh
    -rw-r--r-- 1 zoid users  16M Dec  2 15:29 testfile
    -rw-r--r-- 1 zoid users 9.2M Dec  2 15:29 testfile.xz

So even with a file that doesn't compress as well, the performance doesn't get
any worse.

Thanks,
Daniel
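The percentages quoted in this message follow directly from the hyperfine means for the 64MB main FV. The short calculation below only restates that arithmetic; the variable names are illustrative, and the efficiency figure is a derived quantity that does not appear in the original message.

```python
# hyperfine means for the 64MB main FV, copied from the message above
single_wall_s = 12.838   # xz -T1 --lzma2, wall-clock mean
four_wall_s = 5.241      # xz -T4 --lzma2, wall-clock mean
single_user_s = 12.783   # user CPU time, 1 thread
four_user_s = 13.537     # user CPU time, 4 threads
threads = 4

time_fraction = four_wall_s / single_wall_s   # share of single-threaded time
speedup = single_wall_s / four_wall_s         # how many times faster
efficiency = speedup / threads                # fraction of ideal 4x scaling
cpu_overhead = four_user_s / single_user_s - 1.0  # extra CPU work when threaded

size_fraction = 9 / 64  # "64MB uncompressed and 9MB compressed"

print(f"4-thread run takes {time_fraction:.0%} of the single-thread time")
print(f"speedup {speedup:.2f}x, parallel efficiency {efficiency:.0%}")
print(f"extra user CPU time with 4 threads: {cpu_overhead:.1%}")
print(f"compressed size is {size_fraction:.0%} of the original")
```

The total CPU time barely changes between the two runs, which suggests the gain comes from spreading the same work over more cores rather than from doing less work.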
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: Laszlo Ersek @ 2020-12-03 10:24 UTC
  To: devel, daniel.schaefer; +Cc: derek.lin2

On 12/02/20 03:59, Daniel Schaefer wrote:
> [...]
> I haven't investigated the details of how to support this in the code
> but it appears to be a simple change, since the LZMA SDK that we use
> already supports LZMA2.
>
> What do you think?

"xz -T" works by splitting the input into blocks, and it generates a
multi-block compressed output. I'm unsure if the current LZMA
decompressor that runs inside the firmware (= guided section extractor)
copes with multi-block input.

Thanks
Laszlo
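The block-splitting idea described above can be sketched with Python's standard library. This is an illustration of the principle only, not how `xz -T` is implemented: `xz` writes several blocks inside one .xz stream, whereas this sketch (with the hypothetical helper `compress_in_blocks`) writes independent .xz streams back to back, which Python's `lzma.decompress` accepts and decodes as a single result. Any speedup depends on the interpreter releasing its lock during compression, so no timing is claimed here.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def compress_in_blocks(data: bytes, block_size: int, threads: int) -> bytes:
    """Compress fixed-size blocks independently and concatenate the results."""
    # Split the input; an empty input still yields one (empty) block.
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves input order, so the output blocks stay in sequence.
        parts = list(pool.map(lzma.compress, blocks))
    return b"".join(parts)


# Placeholder input (an assumption), 1 MiB of a repeating byte pattern.
data = bytes(range(256)) * 4096
packed = compress_in_blocks(data, block_size=64 * 1024, threads=4)

# A decoder that understands concatenated streams restores the original.
assert lzma.decompress(packed) == data
print(f"{len(data)} bytes -> {len(packed)} bytes in 16 independent blocks")
```

Because each block is compressed without knowledge of the others, matches across block boundaries are lost, which is consistent with the slightly larger 4-thread output in the first benchmark. It also shows why the decoder must know how to walk more than one block.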
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: Daniel Schaefer @ 2020-12-03 12:11 UTC
  To: Laszlo Ersek, devel@edk2.groups.io; +Cc: Lin, Derek (HPS SW)

From: Laszlo Ersek <lersek@redhat.com>
Sent: Thursday, December 3, 2020 18:24
To: devel@edk2.groups.io <devel@edk2.groups.io>; Schaefer, Daniel <daniel.schaefer@hpe.com>
Cc: Lin, Derek (HPS SW) <derek.lin2@hpe.com>
Subject: Re: [edk2-devel] Multithreaded compression with LZMA2

> "xz -T" works by splitting the input into blocks, and it generates a
> multi-block compressed output.

Yes, that's correct.

> I'm unsure if the current LZMA decompressor that runs inside the firmware
> (= guided section extractor) copes with multi-block input.

I think you're right that it doesn't. But we can make the guided section
extractor use that same algorithm (LZMA2) and assign it a different GUID, right?
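The "different GUID" idea amounts to a dispatch table: each compressed section is tagged with a GUID, and the GUID selects the decoder. The sketch below models that in Python. Both GUID values are invented placeholders and not the values EDK2 uses, `extract_section` is a hypothetical name, and the .lzma and .xz containers merely stand in for whatever stream format a firmware extractor would actually consume.

```python
import lzma
import uuid

# Hypothetical GUIDs for illustration only; not the real EDK2 section GUIDs.
LZMA1_SECTION_GUID = uuid.UUID("11111111-1111-1111-1111-111111111111")
LZMA2_SECTION_GUID = uuid.UUID("22222222-2222-2222-2222-222222222222")


def _decode_lzma1(payload: bytes) -> bytes:
    # Stand-in for the existing single-stream LZMA decoder.
    return lzma.decompress(payload, format=lzma.FORMAT_ALONE)


def _decode_lzma2(payload: bytes) -> bytes:
    # Stand-in for a new decoder that copes with LZMA2 output.
    return lzma.decompress(payload, format=lzma.FORMAT_XZ)


# One handler per GUID, so old images keep working when a new decoder is added.
EXTRACTORS = {
    LZMA1_SECTION_GUID: _decode_lzma1,
    LZMA2_SECTION_GUID: _decode_lzma2,
}


def extract_section(section_guid: uuid.UUID, payload: bytes) -> bytes:
    """Pick the decoder registered for this section's GUID."""
    try:
        handler = EXTRACTORS[section_guid]
    except KeyError:
        raise ValueError(f"no extractor registered for {section_guid}") from None
    return handler(payload)


body = b"DXE driver placeholder " * 1000
old_image = lzma.compress(body, format=lzma.FORMAT_ALONE)
new_image = lzma.compress(body, format=lzma.FORMAT_XZ)

assert extract_section(LZMA1_SECTION_GUID, old_image) == body
assert extract_section(LZMA2_SECTION_GUID, new_image) == body
```

Keeping the existing GUID bound to the existing decoder means images built with the old tool stay readable, while the new GUID opts a section into the new decoder.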
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: Bret Barkelew @ 2020-12-03 15:57 UTC
  To: devel@edk2.groups.io, daniel.schaefer@hpe.com, Laszlo Ersek
  Cc: Lin, Derek (HPS SW)

Wasn't there another push (somewhere in the last 8 months, my brain is foggy)
to adopt LZMA2? Or was it a different algorithm?

- Bret

From: Daniel Schaefer via groups.io
Sent: Thursday, December 3, 2020 4:12 AM
To: Laszlo Ersek; devel@edk2.groups.io
Cc: Lin, Derek (HPS SW)
Subject: [EXTERNAL] Re: [edk2-devel] Multithreaded compression with LZMA2

> I think you're right that it doesn't. But we can make the guided section
> extractor use that same algorithm(LZMA2) and assign it a different GUID,
> right?
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: Daniel Schaefer @ 2020-12-04  8:19 UTC
  To: devel, bret.barkelew, Laszlo Ersek; +Cc: Lin, Derek (HPS SW)

On 12/3/20 11:57 PM, Bret Barkelew via groups.io wrote:
> Wasn't there another push (somewhere in the last 8 months, my brain is foggy)
> to adopt LZMA2? Or was it a different algorithm?

I couldn't find anything: https://edk2.groups.io/g/devel/search?q=LZMA2
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: Laszlo Ersek @ 2020-12-03 23:35 UTC
  To: Schaefer, Daniel, devel@edk2.groups.io; +Cc: Lin, Derek (HPS SW)

On 12/03/20 13:11, Schaefer, Daniel wrote:
>> I'm unsure if the current LZMA decompressor that runs inside the firmware
>> (= guided section extractor) copes with multi-block input.
>
> I think you're right that it doesn't. But we can make the guided section
> extractor use that same algorithm(LZMA2) and assign it a different GUID,
> right?

I guess so...

Thanks
Laszlo
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: gaoliming @ 2020-12-04  2:28 UTC
  To: devel, lersek, 'Schaefer, Daniel'; +Cc: 'Lin, Derek (HPS SW)'

Daniel:
  Yes. A new guided section extractor is needed to match the new compression
algorithm.

  For the compression algorithm, its compression ratio, compression
performance, decompression performance, and decompression memory usage all
need to be considered.

Thanks
Liming

> -----Original Message-----
> From: Laszlo Ersek
> Sent: December 4, 2020 7:35
> To: Schaefer, Daniel <daniel.schaefer@hpe.com>; devel@edk2.groups.io
> Cc: Lin, Derek (HPS SW) <derek.lin2@hpe.com>
> Subject: Re: [edk2-devel] Multithreaded compression with LZMA2
>
> > I think you're right that it doesn't. But we can make the guided section
> > extractor use that same algorithm(LZMA2) and assign it a different GUID,
> > right?
>
> I guess so...
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: Daniel Schaefer @ 2020-12-04  9:02 UTC
  To: devel, gaoliming
  Cc: Laszlo Ersek, Lin, Derek (HPS SW), Bret.Barkelew@microsoft.com, afish

On 12/4/20 10:28 AM, gaoliming wrote:
> Daniel:
>   Yes. New guided section extractor matches new compression algorithm.

Good. I see that we use version 18.05 of the LZMA SDK, while there's already 19.00:
https://www.7-zip.org/sdk.html

Should we use this opportunity to update?

>   For the compression algorithm, its compression ratio, compression performance,
> the decompression performance, the decompression taken memory are all
> required to be considered.

For compression ratio and performance, see my earlier emails. Summary:
Compression ratio is basically the same.
Performance is also the same, except that with 4 threads it compresses our
main image in just 40% of the time. Some images don't compress as well; they
take the same time to compress.

For decompression memory usage I used the xz commands on Linux again:

    # LZMA1
    $ /usr/bin/time -v unxz testfile.lzma
    Maximum resident set size (kbytes): 10228, 10492, 10460, 10200, 10244 => 10324.8

    # LZMA2
    $ /usr/bin/time -v unxz testfile.xz
    Maximum resident set size (kbytes): 10456, 10460, 10224, 10212, 10548 => 10380.0

Result: Basically the same.

I don't know how I would measure this in EDK2. Any ideas?

From the manpage of xz and the LZMA authors:

> LZMA2 is an updated version of LZMA1 to fix some practical issues of LZMA1.
> Compression speed and ratios of LZMA1 and LZMA2 are practically the same.
> LZMA2 is better than LZMA, if you compress already compressed data.

Here's a benchmark which compares both of them, among others:
https://stephane.lesimple.fr/blog/lzop-vs-compress-vs-gzip-vs-bzip2-vs-lzma-vs-lzma2xz-benchmark-reloaded/
Same result: They're basically the same.
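The two averages reported for peak resident set size can be rechecked from the five samples given for each format. The snippet below only reproduces that arithmetic; the relative difference is a derived figure that is not stated in the message.

```python
# Peak RSS samples in kbytes, copied from the /usr/bin/time -v output above
lzma1_rss_kb = [10228, 10492, 10460, 10200, 10244]
lzma2_rss_kb = [10456, 10460, 10224, 10212, 10548]

lzma1_mean = sum(lzma1_rss_kb) / len(lzma1_rss_kb)
lzma2_mean = sum(lzma2_rss_kb) / len(lzma2_rss_kb)

# How much more memory the LZMA2 run used, relative to the LZMA1 run
relative_increase = (lzma2_mean - lzma1_mean) / lzma1_mean

print(f"LZMA1 mean peak RSS: {lzma1_mean:.1f} kB")
print(f"LZMA2 mean peak RSS: {lzma2_mean:.1f} kB")
print(f"difference: {relative_increase:.2%}")
```

The gap between the two means is smaller than the spread within either set of samples, which supports the "basically the same" reading for this host-side measurement.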
* Re: [edk2-devel] Multithreaded compression with LZMA2
From: gaoliming @ 2020-12-08  6:01 UTC
  To: devel, daniel.schaefer
  Cc: 'Laszlo Ersek', 'Lin, Derek (HPS SW)', Bret.Barkelew, afish

Daniel:

> -----Original Message-----
> From: Daniel Schaefer
> Sent: December 4, 2020 17:03
> To: devel@edk2.groups.io; gaoliming@byosoft.com.cn
> Cc: Laszlo Ersek <lersek@redhat.com>; Lin, Derek (HPS SW)
> <derek.lin2@hpe.com>; Bret.Barkelew@microsoft.com; afish@apple.com
> Subject: Re: [edk2-devel] Multithreaded compression with LZMA2
>
> Good. I see that we use version 18.05 of the LZMA SDK, while there's already
> 19.00:
> https://www.7-zip.org/sdk.html
>
> Should we use this opportunity to update?

I suggest separating them. The update doesn't block this change.

> For decompression memory usage I used the xz commands on Linux again:
> [...]
> Result: Basically the same.
>
> I don't know how I would measure this in EDK2. Any ideas?

You need to enable it in EDK2 in the same way as the current LzmaCompression
tool and LzmaDecompress library, apply them in the platform DSC/FDF, and then
measure the build and the decompression.

Thanks
Liming