
Hi,
On 2024/5/24 22:26, Jonathan Liu wrote:
Hi Jianan,
On Sat, 26 Feb 2022 at 18:05, Huang Jianan jnhuang95@gmail.com wrote:
Update the LZ4 compression module based on LZ4 v1.8.3 in order to use the newest LZ4_decompress_safe_partial() which can now decode exactly the nb of bytes requested.
Signed-off-by: Huang Jianan jnhuang95@gmail.com
I noticed that LZ4 decompression is slower after this commit. The ulz4fn function call takes 1.209670 seconds with this commit applied; after reverting it, the same call takes 0.587032 seconds.
I am decompressing an LZ4-compressed kernel (compressed with lz4 v1.9.4 using the -9 option for maximum compression) on an RK3399.
Any ideas why it is slower with this commit and how the performance regression can be fixed?
From just a quick glance, I think the issue may be due to memcpy/memmove, since that seems to be the main difference between the two codebases: the new version relies mainly on memcpy/memmove instead of its own copy routines. (I'm not sure which LZ4 version the old codebase was based on.)
Would you mind checking the generated assembly for memcpy/memset on your platform?
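As a rough way to test the memcpy hypothesis outside the decompressor, something like the following sketch could be used. It is only an illustration: bytewise_copy() and time_chunked_copy() are hypothetical helpers, not U-Boot or LZ4 functions, and the byte-at-a-time loop is merely an assumed stand-in for an unoptimized memcpy. It times many small copies, which is roughly the access pattern of LZ4 literal and match copies.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical stand-in for a slow, byte-at-a-time memcpy.
 * The volatile qualifiers keep the compiler from replacing
 * the loop with a call to an optimized memcpy.
 */
static void *bytewise_copy(void *dst, const void *src, size_t n)
{
	volatile unsigned char *d = dst;
	const volatile unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

typedef void *(*copy_fn)(void *, const void *, size_t);

/*
 * Copy `total` bytes from src to dst in `chunk`-sized pieces using
 * `fn`, and return the CPU time spent in seconds. Any tail shorter
 * than `chunk` is not copied.
 */
static double time_chunked_copy(copy_fn fn, unsigned char *dst,
				const unsigned char *src,
				size_t total, size_t chunk)
{
	clock_t start = clock();
	size_t off;

	for (off = 0; off + chunk <= total; off += chunk)
		fn(dst + off, src + off, chunk);

	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_chunked_copy() once with memcpy and once with bytewise_copy, on a buffer of a few megabytes and chunk sizes in the 8 to 32 byte range, would give a first indication of how much the copy routine alone can account for. The numbers from a hosted build will not match U-Boot's, so checking the actual disassembly is still needed.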
Thanks, Gao Xiang
Thanks.
Regards, Jonathan