[U-Boot] [PATCH] Add support for LZ4 decompression algorithm

This patch adds support for LZ4-compressed FIT image contents. This algorithm has a slightly worse compression ratio than LZO while being nearly twice as fast to decompress. When loading images from a fast storage medium, this usually results in a boot time win.
Compile-tested only since I don't have a U-Boot development system set up right now. The code was imported unchanged from coreboot where it's proven to work, though. I'm mostly interested in getting this recognized by mkimage for use in a downstream project.
Signed-off-by: Julius Werner <jwerner@chromium.org>
---
 README             |  13 +++
 common/bootm.c     |  10 ++
 common/image.c     |   1 +
 include/common.h   |   3 +
 include/image.h    |   1 +
 lib/Makefile       |   1 +
 lib/lz4.c          | 266 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/lz4_wrapper.c  | 150 ++++++++++++++++++++++++++++++
 test/compression.c |  57 ++++++++++++
 9 files changed, 502 insertions(+)
 create mode 100644 lib/lz4.c
 create mode 100644 lib/lz4_wrapper.c

diff --git a/README b/README
index a13705a..4c98285 100644
--- a/README
+++ b/README
@@ -2035,6 +2035,19 @@ CBFS (Coreboot Filesystem) support
 		the malloc area (as defined by CONFIG_SYS_MALLOC_LEN) should be at least 4MB.

+		CONFIG_LZ4
+
+		If this option is set, support for lz4 compressed images
+		is included. The LZ4 algorithm can run in-place as long as the
+		compressed image is loaded to the end of the output buffer, and
+		trades lower compression ratios for much faster decompression.
+
+		NOTE: This implements the release version of the LZ4 frame
+		format as generated by default by the 'lz4' command line tool.
+		This is not the same as the outdated, less efficient legacy
+		frame format currently (2015) implemented in the Linux kernel
+		(generated by 'lz4 -l'). The two formats are incompatible.
+
 		CONFIG_LZMA

 		If this option is set, support for lzma compressed
diff --git a/common/bootm.c b/common/bootm.c
index 667c934..0621363 100644
--- a/common/bootm.c
+++ b/common/bootm.c
@@ -389,6 +389,16 @@ int bootm_decomp_image(int comp, ulong load, ulong image_start, int type,
 		break;
 	}
 #endif /* CONFIG_LZO */
+#ifdef CONFIG_LZ4
+	case IH_COMP_LZ4: {
+		size_t size = ulz4fn(image_buf, image_len, load_buf, unc_len);
+		if (!size)
+			ret = -1;
+		else
+			image_len = size;
+		break;
+	}
+#endif /* CONFIG_LZ4 */
 	default:
 		printf("Unimplemented compression type %d\n", comp);
 		return BOOTM_ERR_UNIMPLEMENTED;
diff --git a/common/image.c b/common/image.c
index 1325e07..c33749d 100644
--- a/common/image.c
+++ b/common/image.c
@@ -167,6 +167,7 @@ static const table_entry_t uimage_comp[] = {
 	{ IH_COMP_GZIP, "gzip", "gzip compressed", },
 	{ IH_COMP_LZMA, "lzma", "lzma compressed", },
 	{ IH_COMP_LZO, "lzo", "lzo compressed", },
+	{ IH_COMP_LZ4, "lz4", "lz4 compressed", },
 	{ -1, "", "", },
 };

diff --git a/include/common.h b/include/common.h
index 68b24d0..82c75e2 100644
--- a/include/common.h
+++ b/include/common.h
@@ -826,6 +826,9 @@ int gzwrite(unsigned char *src, int len, u64 startoffs, u64 szexpected);

+/* lib/lz4_wrapper.c */
+size_t ulz4fn(const void *src, size_t srcn, void *dst, size_t dstn);
+
 /* lib/qsort.c */
 void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *));
diff --git a/include/image.h b/include/image.h
index 8a864ae..08ae24a 100644
--- a/include/image.h
+++ b/include/image.h
@@ -259,6 +259,7 @@ struct lmb;
 #define IH_COMP_BZIP2 2 /* bzip2 Compression Used */
 #define IH_COMP_LZMA 3 /* lzma Compression Used */
 #define IH_COMP_LZO 4 /* lzo Compression Used */
+#define IH_COMP_LZ4 5 /* lz4 Compression Used */
#define IH_MAGIC 0x27051956 /* Image Magic Number */ #define IH_NMLEN 32 /* Image Name Length */ diff --git a/lib/Makefile b/lib/Makefile index 96f832e..3eecefa 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -34,6 +34,7 @@ obj-$(CONFIG_GZIP_COMPRESSED) += gzip.o obj-y += initcall.o obj-$(CONFIG_LMB) += lmb.o obj-y += ldiv.o +obj-$(CONFIG_LZ4) += lz4_wrapper.o obj-$(CONFIG_MD5) += md5.o obj-y += net_utils.o obj-$(CONFIG_PHYSMEM) += physmem.o diff --git a/lib/lz4.c b/lib/lz4.c new file mode 100644 index 0000000..fb89090 --- /dev/null +++ b/lib/lz4.c @@ -0,0 +1,266 @@ +/* + LZ4 - Fast LZ compression algorithm + Copyright (C) 2011-2015, Yann Collet. + + BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php) + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are + met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following disclaimer + in the documentation and/or other materials provided with the + distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + You can contact the author at : + - LZ4 source repository : https://github.com/Cyan4973/lz4 + - LZ4 public forum : https://groups.google.com/forum/#!forum/lz4c +*/ + + +/************************************** +* Reading and writing into memory +**************************************/ + +/* customized version of memcpy, which may overwrite up to 7 bytes beyond dstEnd */ +static void LZ4_wildCopy(void* dstPtr, const void* srcPtr, void* dstEnd) +{ + BYTE* d = (BYTE*)dstPtr; + const BYTE* s = (const BYTE*)srcPtr; + BYTE* e = (BYTE*)dstEnd; + do { LZ4_copy8(d,s); d+=8; s+=8; } while (d<e); +} + + +/************************************** +* Common Constants +**************************************/ +#define MINMATCH 4 + +#define COPYLENGTH 8 +#define LASTLITERALS 5 +#define MFLIMIT (COPYLENGTH+MINMATCH) +static const int LZ4_minLength = (MFLIMIT+1); + +#define KB *(1 <<10) +#define MB *(1 <<20) +#define GB *(1U<<30) + +#define MAXD_LOG 16 +#define MAX_DISTANCE ((1 << MAXD_LOG) - 1) + +#define ML_BITS 4 +#define ML_MASK ((1U<<ML_BITS)-1) +#define RUN_BITS (8-ML_BITS) +#define RUN_MASK ((1U<<RUN_BITS)-1) + + +/************************************** +* Local Structures and types +**************************************/ +typedef enum { noDict = 0, withPrefix64k, usingExtDict } dict_directive; +typedef enum { endOnOutputSize = 0, endOnInputSize = 1 } endCondition_directive; +typedef enum { full = 0, partial = 1 } earlyEnd_directive; + + + 
+/******************************* +* Decompression functions +*******************************/ +/* + * This generic decompression function cover all use cases. + * It shall be instantiated several times, using different sets of directives + * Note that it is essential this generic function is really inlined, + * in order to remove useless branches during compilation optimization. + */ +FORCE_INLINE int LZ4_decompress_generic( + const char* const source, + char* const dest, + int inputSize, + int outputSize, /* If endOnInput==endOnInputSize, this value is the max size of Output Buffer. */ + + int endOnInput, /* endOnOutputSize, endOnInputSize */ + int partialDecoding, /* full, partial */ + int targetOutputSize, /* only used if partialDecoding==partial */ + int dict, /* noDict, withPrefix64k, usingExtDict */ + const BYTE* const lowPrefix, /* == dest if dict == noDict */ + const BYTE* const dictStart, /* only if dict==usingExtDict */ + const size_t dictSize /* note : = 0 if noDict */ + ) +{ + /* Local Variables */ + const BYTE* ip = (const BYTE*) source; + const BYTE* const iend = ip + inputSize; + + BYTE* op = (BYTE*) dest; + BYTE* const oend = op + outputSize; + BYTE* cpy; + BYTE* oexit = op + targetOutputSize; + const BYTE* const lowLimit = lowPrefix - dictSize; + + const BYTE* const dictEnd = (const BYTE*)dictStart + dictSize; + const size_t dec32table[] = {4, 1, 2, 1, 4, 4, 4, 4}; + const size_t dec64table[] = {0, 0, 0, (size_t)-1, 0, 1, 2, 3}; + + const int safeDecode = (endOnInput==endOnInputSize); + const int checkOffset = ((safeDecode) && (dictSize < (int)(64 KB))); + + + /* Special cases */ + if ((partialDecoding) && (oexit> oend-MFLIMIT)) oexit = oend-MFLIMIT; /* targetOutputSize too high => decode everything */ + if ((endOnInput) && (unlikely(outputSize==0))) return ((inputSize==1) && (*ip==0)) ? 
0 : -1; /* Empty output buffer */ + if ((!endOnInput) && (unlikely(outputSize==0))) return (*ip==0?1:-1); + + + /* Main Loop */ + while (1) + { + unsigned token; + size_t length; + const BYTE* match; + + /* get literal length */ + token = *ip++; + if ((length=(token>>ML_BITS)) == RUN_MASK) + { + unsigned s; + do + { + s = *ip++; + length += s; + } + while (likely((endOnInput)?ip<iend-RUN_MASK:1) && (s==255)); + if ((safeDecode) && unlikely((size_t)(op+length)<(size_t)(op))) goto _output_error; /* overflow detection */ + if ((safeDecode) && unlikely((size_t)(ip+length)<(size_t)(ip))) goto _output_error; /* overflow detection */ + } + + /* copy literals */ + cpy = op+length; + if (((endOnInput) && ((cpy>(partialDecoding?oexit:oend-MFLIMIT)) || (ip+length>iend-(2+1+LASTLITERALS))) ) + || ((!endOnInput) && (cpy>oend-COPYLENGTH))) + { + if (partialDecoding) + { + if (cpy > oend) goto _output_error; /* Error : write attempt beyond end of output buffer */ + if ((endOnInput) && (ip+length > iend)) goto _output_error; /* Error : read attempt beyond end of input buffer */ + } + else + { + if ((!endOnInput) && (cpy != oend)) goto _output_error; /* Error : block decoding must stop exactly there */ + if ((endOnInput) && ((ip+length != iend) || (cpy > oend))) goto _output_error; /* Error : input must be consumed */ + } + memcpy(op, ip, length); + ip += length; + op += length; + break; /* Necessarily EOF, due to parsing restrictions */ + } + LZ4_wildCopy(op, ip, cpy); + ip += length; op = cpy; + + /* get offset */ + match = cpy - LZ4_readLE16(ip); ip+=2; + if ((checkOffset) && (unlikely(match < lowLimit))) goto _output_error; /* Error : offset outside destination buffer */ + + /* get matchlength */ + length = token & ML_MASK; + if (length == ML_MASK) + { + unsigned s; + do + { + if ((endOnInput) && (ip > iend-LASTLITERALS)) goto _output_error; + s = *ip++; + length += s; + } while (s==255); + if ((safeDecode) && unlikely((size_t)(op+length)<(size_t)op)) goto _output_error; /* 
overflow detection */ + } + length += MINMATCH; + + /* check external dictionary */ + if ((dict==usingExtDict) && (match < lowPrefix)) + { + if (unlikely(op+length > oend-LASTLITERALS)) goto _output_error; /* doesn't respect parsing restriction */ + + if (length <= (size_t)(lowPrefix-match)) + { + /* match can be copied as a single segment from external dictionary */ + match = dictEnd - (lowPrefix-match); + memmove(op, match, length); op += length; + } + else + { + /* match encompass external dictionary and current segment */ + size_t copySize = (size_t)(lowPrefix-match); + memcpy(op, dictEnd - copySize, copySize); + op += copySize; + copySize = length - copySize; + if (copySize > (size_t)(op-lowPrefix)) /* overlap within current segment */ + { + BYTE* const endOfMatch = op + copySize; + const BYTE* copyFrom = lowPrefix; + while (op < endOfMatch) *op++ = *copyFrom++; + } + else + { + memcpy(op, lowPrefix, copySize); + op += copySize; + } + } + continue; + } + + /* copy repeated sequence */ + cpy = op + length; + if (unlikely((op-match)<8)) + { + const size_t dec64 = dec64table[op-match]; + op[0] = match[0]; + op[1] = match[1]; + op[2] = match[2]; + op[3] = match[3]; + match += dec32table[op-match]; + LZ4_copy4(op+4, match); + op += 8; match -= dec64; + } else { LZ4_copy8(op, match); op+=8; match+=8; } + + if (unlikely(cpy>oend-12)) + { + if (cpy > oend-LASTLITERALS) goto _output_error; /* Error : last LASTLITERALS bytes must be literals */ + if (op < oend-8) + { + LZ4_wildCopy(op, match, oend-8); + match += (oend-8) - op; + op = oend-8; + } + while (op<cpy) *op++ = *match++; + } + else + LZ4_wildCopy(op, match, cpy); + op=cpy; /* correction */ + } + + /* end of decoding */ + if (endOnInput) + return (int) (((char*)op)-dest); /* Nb of output bytes decoded */ + else + return (int) (((const char*)ip)-source); /* Nb of input bytes read */ + + /* Overflow error detected */ +_output_error: + return (int) (-(((const char*)ip)-source))-1; +} diff --git a/lib/lz4_wrapper.c 
b/lib/lz4_wrapper.c new file mode 100644 index 0000000..5f715e6 --- /dev/null +++ b/lib/lz4_wrapper.c @@ -0,0 +1,150 @@ +/* + * Copyright 2015 Google Inc. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. The name of the author may not be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * Alternatively, this software may be distributed under the terms of the + * GNU General Public License ("GPL") version 2 as published by the Free + * Software Foundation. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +#include <common.h> +#include <compiler.h> +#include <linux/kernel.h> +#include <linux/types.h> + +static u16 LZ4_readLE16(const void *src) { return le16_to_cpu(*(u16 *)src); } +static void LZ4_copy4(void *dst, const void *src) { *(u32 *)dst = *(u32 *)src; } +static void LZ4_copy8(void *dst, const void *src) { *(u64 *)dst = *(u64 *)src; } + +typedef uint8_t BYTE; +typedef uint16_t U16; +typedef uint32_t U32; +typedef int32_t S32; +typedef uint64_t U64; + +#define FORCE_INLINE static inline __attribute__((always_inline)) + +/* Unaltered (except removing unrelated code) from github.com/Cyan4973/lz4. */ +#include "lz4.c" /* #include for inlining, do not link! */ + +#define LZ4F_MAGIC 0x184D2204 + +struct lz4_frame_header { + u32 magic; + union { + u8 flags; + struct { + u8 reserved0 : 2; + u8 has_content_checksum : 1; + u8 has_content_size : 1; + u8 has_block_checksum : 1; + u8 independent_blocks : 1; + u8 version : 2; + }; + }; + union { + u8 block_descriptor; + struct { + u8 reserved1 : 4; + u8 max_block_size : 3; + u8 reserved2 : 1; + }; + }; + /* + u64 content_size iff has_content_size is set */ + /* + u8 header_checksum */ +} __attribute__((packed)); + +struct lz4_block_header { + union { + u32 raw; + struct { + u32 size : 31; + u32 not_compressed : 1; + }; + }; + /* + size bytes of data */ + /* + u32 block_checksum iff has_block_checksum is set */ +} __attribute__((packed)); + +size_t ulz4fn(const void *src, size_t srcn, void *dst, size_t dstn) +{ + const void *in = src; + void *out = dst; + int has_block_checksum; + + { /* With in-place decompression the header may become invalid later. */ + const struct lz4_frame_header *h = in; + + if (srcn < sizeof(*h) + sizeof(u64) + sizeof(u8)) + return 0; /* input overrun */ + + /* We assume there's always only a single, standard frame. 
*/ + if (le32_to_cpu(h->magic) != LZ4F_MAGIC || h->version != 1) + return 0; /* unknown format */ + if (h->reserved0 || h->reserved1 || h->reserved2) + return 0; /* reserved must be zero */ + if (!h->independent_blocks) + return 0; /* we don't support block dependency */ + has_block_checksum = h->has_block_checksum; + + in += sizeof(*h); + if (h->has_content_size) + in += sizeof(u64); + in += sizeof(u8); + } + + while (1) { + struct lz4_block_header b = { .raw = le32_to_cpu(*(u32 *)in) }; + in += sizeof(struct lz4_block_header); + + if (in - src + b.size > srcn) + return 0; /* input overrun */ + + if (!b.size) + return out - dst; /* decompression successful */ + + if (b.not_compressed) { + size_t size = min((ptrdiff_t)b.size, dst + dstn - out); + memcpy(out, in, size); + if (size < b.size) + return 0; /* output overrun */ + else + out += size; + } else { + /* constant folding essential, do not touch params! */ + int ret = LZ4_decompress_generic(in, out, b.size, + dst + dstn - out, endOnInputSize, + full, 0, noDict, out, NULL, 0); + if (ret < 0) + return 0; /* decompression error */ + else + out += ret; + } + + in += b.size; + if (has_block_checksum) + in += sizeof(u32); + } +} diff --git a/test/compression.c b/test/compression.c index 7ef3a8c..5521b02 100644 --- a/test/compression.c +++ b/test/compression.c @@ -95,6 +95,28 @@ static const char lzo_compressed[] = "\x73\x61\x67\x65\x73\x2e\x0a\x11\x00\x00\x00\x00\x00\x00"; static const unsigned long lzo_compressed_size = 334;
+/* lz4 -z /tmp/plain.txt > /tmp/plain.lz4 */
+static const char lz4_compressed[] =
+	"\x04\x22\x4d\x18\x64\x70\xb9\x01\x01\x00\x00\xff\x19\x49\x20\x61"
+	"\x6d\x20\x61\x20\x68\x69\x67\x68\x6c\x79\x20\x63\x6f\x6d\x70\x72"
+	"\x65\x73\x73\x61\x62\x6c\x65\x20\x62\x69\x74\x20\x6f\x66\x20\x74"
+	"\x65\x78\x74\x2e\x0a\x28\x00\x3d\xf1\x25\x54\x68\x65\x72\x65\x20"
+	"\x61\x72\x65\x20\x6d\x61\x6e\x79\x20\x6c\x69\x6b\x65\x20\x6d\x65"
+	"\x2c\x20\x62\x75\x74\x20\x74\x68\x69\x73\x20\x6f\x6e\x65\x20\x69"
+	"\x73\x20\x6d\x69\x6e\x65\x2e\x0a\x49\x66\x20\x49\x20\x77\x32\x00"
+	"\xd1\x6e\x79\x20\x73\x68\x6f\x72\x74\x65\x72\x2c\x20\x74\x45\x00"
+	"\xf4\x0b\x77\x6f\x75\x6c\x64\x6e\x27\x74\x20\x62\x65\x20\x6d\x75"
+	"\x63\x68\x20\x73\x65\x6e\x73\x65\x20\x69\x6e\x0a\xcf\x00\x50\x69"
+	"\x6e\x67\x20\x6d\x12\x00\x00\x32\x00\xf0\x11\x20\x66\x69\x72\x73"
+	"\x74\x20\x70\x6c\x61\x63\x65\x2e\x20\x41\x74\x20\x6c\x65\x61\x73"
+	"\x74\x20\x77\x69\x74\x68\x20\x6c\x7a\x6f\x2c\x63\x00\xf5\x14\x77"
+	"\x61\x79\x2c\x0a\x77\x68\x69\x63\x68\x20\x61\x70\x70\x65\x61\x72"
+	"\x73\x20\x74\x6f\x20\x62\x65\x68\x61\x76\x65\x20\x70\x6f\x6f\x72"
+	"\x6c\x79\x4e\x00\x30\x61\x63\x65\x27\x01\x01\x95\x00\x01\x2d\x01"
+	"\xb0\x0a\x6d\x65\x73\x73\x61\x67\x65\x73\x2e\x0a\x00\x00\x00\x00"
+	"\x9d\x12\x8c\x9d";
+static const unsigned long lz4_compressed_size = 276;
+
 #define TEST_BUFFER_SIZE 512

@@ -227,6 +249,39 @@ static int uncompress_using_lzo(void *in, unsigned long in_size,
 	return (ret != LZO_E_OK);
 }

+static int compress_using_lz4(void *in, unsigned long in_size,
+			      void *out, unsigned long out_max,
+			      unsigned long *out_size)
+{
+	/* There is no lz4 compression in u-boot, so fake it. */
+	assert(in_size == strlen(plain));
+	assert(memcmp(plain, in, in_size) == 0);
+
+	if (lz4_compressed_size > out_max)
+		return -1;
+
+	memcpy(out, lz4_compressed, lz4_compressed_size);
+	if (out_size)
+		*out_size = lz4_compressed_size;
+
+	return 0;
+}
+
+static int uncompress_using_lz4(void *in, unsigned long in_size,
+				void *out, unsigned long out_max,
+				unsigned long *out_size)
+{
+	size_t ret;
+	size_t input_size = in_size;
+	size_t output_size = out_max;
+
+	ret = ulz4fn(in, input_size, out, output_size);
+	if (out_size)
+		*out_size = ret;
+
+	return (ret == 0);
+}
+
 #define errcheck(statement) if (!(statement)) { \
 	fprintf(stderr, "\tFailed: %s\n", #statement); \
 	ret = 1; \
@@ -325,6 +380,7 @@ static int do_ut_compression(cmd_tbl_t *cmdtp, int flag, int argc,
 	err += run_test("bzip2", compress_using_bzip2, uncompress_using_bzip2);
 	err += run_test("lzma", compress_using_lzma, uncompress_using_lzma);
 	err += run_test("lzo", compress_using_lzo, uncompress_using_lzo);
+	err += run_test("lz4", compress_using_lz4, uncompress_using_lz4);

 	printf("ut_compression %s\n", err == 0 ? "ok" : "FAILED");

@@ -401,6 +457,7 @@ static int do_ut_image_decomp(cmd_tbl_t *cmdtp, int flag, int argc,
 	err |= run_bootm_test(IH_COMP_BZIP2, compress_using_bzip2);
 	err |= run_bootm_test(IH_COMP_LZMA, compress_using_lzma);
 	err |= run_bootm_test(IH_COMP_LZO, compress_using_lzo);
+	err |= run_bootm_test(IH_COMP_LZ4, compress_using_lz4);
 	err |= run_bootm_test(IH_COMP_NONE, compress_using_none);

 	printf("ut_image_decomp %s\n", err == 0 ? "ok" : "FAILED");
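
For readers following lib/lz4_wrapper.c in the patch above, here is a small stand-alone sketch (not part of the patch) that decodes the same frame flag byte and block header with shifts and masks rather than bitfields. The bit positions are read off the struct definitions in the patch, taking the first-declared bitfield as the least significant bits; the `lz4f_flags` and `lz4f_block` types and both function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical plain-struct view of the FLG byte from struct lz4_frame_header. */
struct lz4f_flags {
	unsigned reserved;             /* bits 1-0, must be zero */
	unsigned has_content_checksum; /* bit 2 */
	unsigned has_content_size;     /* bit 3 */
	unsigned has_block_checksum;   /* bit 4 */
	unsigned independent_blocks;   /* bit 5 */
	unsigned version;              /* bits 7-6 */
};

/* Hypothetical plain-struct view of struct lz4_block_header. */
struct lz4f_block {
	uint32_t size;           /* low 31 bits */
	unsigned not_compressed; /* top bit */
};

/* Split the flag byte into its fields without relying on bitfield layout. */
static struct lz4f_flags lz4f_decode_flags(uint8_t flg)
{
	struct lz4f_flags f;

	f.reserved             = flg & 0x3;
	f.has_content_checksum = (flg >> 2) & 0x1;
	f.has_content_size     = (flg >> 3) & 0x1;
	f.has_block_checksum   = (flg >> 4) & 0x1;
	f.independent_blocks   = (flg >> 5) & 0x1;
	f.version              = (flg >> 6) & 0x3;
	return f;
}

/* Decode a 4-byte little-endian block header read byte by byte. */
static struct lz4f_block lz4f_decode_block(const uint8_t raw[4])
{
	uint32_t v = (uint32_t)raw[0] | ((uint32_t)raw[1] << 8) |
		     ((uint32_t)raw[2] << 16) | ((uint32_t)raw[3] << 24);
	struct lz4f_block b;

	b.size = v & 0x7fffffffu;
	b.not_compressed = v >> 31;
	return b;
}
```

Fed the bytes from the test vector in test/compression.c (flag byte 0x64, block header 01 01 00 00), this reads as version 1 with independent blocks and a single compressed block of 257 bytes.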

Hi Julius,
On 25 September 2015 at 18:27, Julius Werner <jwerner@chromium.org> wrote:
This patch adds support for LZ4-compressed FIT image contents. This algorithm has a slightly worse compression ratio than LZO while being nearly twice as fast to decompress. When loading images from a fast storage medium, this usually results in a boot time win.
Sounds like a useful addition.
Compile-tested only since I don't have a U-Boot development system set up right now. The code was imported unchanged from coreboot where it's proven to work, though. I'm mostly interested in getting this recognized by mkimage for use in a downstream project.
I get this build error with sandbox.
test/built-in.o: In function `uncompress_using_lz4':
/home/sjg/c/src/third_party/u-boot/files/test/compression.c:278: undefined reference to `ulz4fn'
You should be able to run the tests using:
make O=sandbox defconfig all
./sandbox/u-boot -c "ut_compression"
Signed-off-by: Julius Werner jwerner@chromium.org
 README             |  13 +++
 common/bootm.c     |  10 ++
 common/image.c     |   1 +
 include/common.h   |   3 +
 include/image.h    |   1 +
 lib/Makefile       |   1 +
 lib/lz4.c          | 266 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/lz4_wrapper.c  | 150 ++++++++++++++++++++++++++++++
 test/compression.c |  57 ++++++++++++
 9 files changed, 502 insertions(+)
 create mode 100644 lib/lz4.c
 create mode 100644 lib/lz4_wrapper.c

diff --git a/README b/README
index a13705a..4c98285 100644
--- a/README
+++ b/README
@@ -2035,6 +2035,19 @@ CBFS (Coreboot Filesystem) support
 		the malloc area (as defined by CONFIG_SYS_MALLOC_LEN) should be at least 4MB.

+		CONFIG_LZ4
+
+		If this option is set, support for lz4 compressed images
+		is included. The LZ4 algorithm can run in-place as long as the
+		compressed image is loaded to the end of the output buffer, and
+		trades lower compression ratios for much faster decompression.
+
+		NOTE: This implements the release version of the LZ4 frame
+		format as generated by default by the 'lz4' command line tool.
+		This is not the same as the outdated, less efficient legacy
+		frame format currently (2015) implemented in the Linux kernel
+		(generated by 'lz4 -l'). The two formats are incompatible.
+

Can you instead add this option to lib/Kconfig and put your help there? We are moving away from the old CONFIGS.

 		CONFIG_LZMA

 		If this option is set, support for lzma compressed
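
As a side note on the format NOTE in the help text above: the two container formats can be told apart by their leading magic number. The following is a minimal sketch, not part of the patch. `LZ4F_MAGIC` is the value used in lib/lz4_wrapper.c; the legacy value 0x184C2102 is taken from the LZ4 frame format description, and the names `LZ4_LEGACY_MAGIC`, `read_le32()` and `lz4_identify()` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define LZ4F_MAGIC       0x184D2204u /* frame format, same value as in lz4_wrapper.c */
#define LZ4_LEGACY_MAGIC 0x184C2102u /* legacy format ('lz4 -l'); macro name is made up */

enum lz4_container {
	LZ4_CONTAINER_UNKNOWN,
	LZ4_CONTAINER_FRAME,
	LZ4_CONTAINER_LEGACY,
};

/* Read a little-endian u32 one byte at a time, so alignment does not matter. */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: classify a buffer by its first four bytes. */
static enum lz4_container lz4_identify(const void *buf, size_t len)
{
	if (len < sizeof(uint32_t))
		return LZ4_CONTAINER_UNKNOWN;

	switch (read_le32(buf)) {
	case LZ4F_MAGIC:
		return LZ4_CONTAINER_FRAME;
	case LZ4_LEGACY_MAGIC:
		return LZ4_CONTAINER_LEGACY;
	default:
		return LZ4_CONTAINER_UNKNOWN;
	}
}
```

A check along these lines could let a caller print a clearer message when handed an image made with 'lz4 -l', instead of only reporting an unknown format.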
diff --git a/common/bootm.c b/common/bootm.c
index 667c934..0621363 100644
--- a/common/bootm.c
+++ b/common/bootm.c
@@ -389,6 +389,16 @@ int bootm_decomp_image(int comp, ulong load, ulong image_start, int type,
 		break;
 	}
 #endif /* CONFIG_LZO */
+#ifdef CONFIG_LZ4
+	case IH_COMP_LZ4: {
+		size_t size = ulz4fn(image_buf, image_len, load_buf, unc_len);
+		if (!size)
+			ret = -1;

Is that BOOTM_ERR_RESET?

+		else
+			image_len = size;
+		break;
+	}
+#endif /* CONFIG_LZ4 */
 	default:
 		printf("Unimplemented compression type %d\n", comp);
 		return BOOTM_ERR_UNIMPLEMENTED;
diff --git a/common/image.c b/common/image.c
index 1325e07..c33749d 100644
--- a/common/image.c
+++ b/common/image.c
@@ -167,6 +167,7 @@ static const table_entry_t uimage_comp[] = {
 	{ IH_COMP_GZIP, "gzip", "gzip compressed", },
 	{ IH_COMP_LZMA, "lzma", "lzma compressed", },
 	{ IH_COMP_LZO, "lzo", "lzo compressed", },
+	{ IH_COMP_LZ4, "lz4", "lz4 compressed", },
 	{ -1, "", "", },
 };
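
On the BOOTM_ERR_RESET question in the bootm.c hunk above: a sketch of the same error path written with the named constant instead of a bare -1. The value -1 for `BOOTM_ERR_RESET` is an assumption here, believed to match include/bootm.h, and should be checked against the tree. `lz4_result_to_bootm()` is a made-up helper so that the fragment compiles on its own; it is not a function in U-Boot.

```c
#include <stddef.h>

/* Assumed to mirror include/bootm.h; verify against the actual tree. */
#define BOOTM_ERR_RESET (-1)

/*
 * Hypothetical helper. ulz4fn() in the patch returns the number of
 * decompressed bytes, or 0 on any error.
 */
static int lz4_result_to_bootm(size_t size, unsigned long *image_len)
{
	if (!size)
		return BOOTM_ERR_RESET;	/* named constant rather than a literal -1 */

	*image_len = size;
	return 0;
}
```

Whether the LZ4 case should return directly or set `ret` and fall through depends on how the rest of bootm_decomp_image() handles errors, which this sketch does not try to settle.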
diff --git a/include/common.h b/include/common.h
index 68b24d0..82c75e2 100644
--- a/include/common.h
+++ b/include/common.h
@@ -826,6 +826,9 @@ int gzwrite(unsigned char *src, int len, u64 startoffs, u64 szexpected);

+/* lib/lz4_wrapper.c */
+size_t ulz4fn(const void *src, size_t srcn, void *dst, size_t dstn);
+
 /* lib/qsort.c */
 void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *));
diff --git a/include/image.h b/include/image.h
index 8a864ae..08ae24a 100644
--- a/include/image.h
+++ b/include/image.h
@@ -259,6 +259,7 @@ struct lmb;
 #define IH_COMP_BZIP2 2 /* bzip2 Compression Used */
 #define IH_COMP_LZMA 3 /* lzma Compression Used */
 #define IH_COMP_LZO 4 /* lzo Compression Used */
+#define IH_COMP_LZ4 5 /* lz4 Compression Used */

 #define IH_MAGIC 0x27051956 /* Image Magic Number */
 #define IH_NMLEN 32 /* Image Name Length */
diff --git a/lib/Makefile b/lib/Makefile
index 96f832e..3eecefa 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -34,6 +34,7 @@ obj-$(CONFIG_GZIP_COMPRESSED) += gzip.o
 obj-y += initcall.o
 obj-$(CONFIG_LMB) += lmb.o
 obj-y += ldiv.o
+obj-$(CONFIG_LZ4) += lz4_wrapper.o
 obj-$(CONFIG_MD5) += md5.o
 obj-y += net_utils.o
 obj-$(CONFIG_PHYSMEM) += physmem.o
diff --git a/lib/lz4.c b/lib/lz4.c
new file mode 100644
index 0000000..fb89090
--- /dev/null
+++ b/lib/lz4.c
@@ -0,0 +1,266 @@
+/*
+   LZ4 - Fast LZ compression algorithm
+   Copyright (C) 2011-2015, Yann Collet.
+
+   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)

You should be able to replace this license with
SPDX-License-Identifier: BSD-2-Clause

+
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are
+   met:
+
+       * Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+       * Redistributions in binary form must reproduce the above
+   copyright notice, this list of conditions and the following disclaimer
+   in the documentation and/or other materials provided with the
+   distribution.
+
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+   You can contact the author at :
+   - LZ4 source repository : https://github.com/Cyan4973/lz4
+   - LZ4 public forum : https://groups.google.com/forum/#!forum/lz4c
+*/
+
+
+/**************************************
+*  Reading and writing into memory
+**************************************/
+
+/* customized version of memcpy, which may overwrite up to 7 bytes beyond dstEnd */
+static void LZ4_wildCopy(void* dstPtr, const void* srcPtr, void* dstEnd)
+{
+    BYTE* d = (BYTE*)dstPtr;
+    const BYTE* s = (const BYTE*)srcPtr;
+    BYTE* e = (BYTE*)dstEnd;
+    do { LZ4_copy8(d,s); d+=8; s+=8; } while (d<e);
+}
+
+
+/**************************************
+*  Common Constants
+**************************************/
+#define MINMATCH 4
+
+#define COPYLENGTH 8
+#define LASTLITERALS 5
+#define MFLIMIT (COPYLENGTH+MINMATCH)
+static const int LZ4_minLength = (MFLIMIT+1);
+
+#define KB *(1 <<10)
+#define MB *(1 <<20)
+#define GB *(1U<<30)
+
+#define MAXD_LOG 16
+#define MAX_DISTANCE ((1 << MAXD_LOG) - 1)
+
+#define ML_BITS 4
+#define ML_MASK ((1U<<ML_BITS)-1)
+#define RUN_BITS (8-ML_BITS)
+#define RUN_MASK ((1U<<RUN_BITS)-1)
+
+
+/**************************************
+*  Local Structures and types
+**************************************/
+typedef enum { noDict = 0, withPrefix64k, usingExtDict } dict_directive;
+typedef enum { endOnOutputSize = 0, endOnInputSize = 1 } endCondition_directive;
+typedef enum { full = 0, partial = 1 } earlyEnd_directive;
+
+
+
+/*******************************
+*  Decompression functions
+*******************************/
+/*
+ * This generic decompression function cover all use cases.
+ * It shall be instantiated several times, using different sets of directives
+ * Note that it is essential this generic function is really inlined,
+ * in order to remove useless branches during compilation optimization.
+ */
+FORCE_INLINE int LZ4_decompress_generic(
const char* const source,
char* const dest,
int inputSize,
int outputSize, /* If endOnInput==endOnInputSize, this value is the max size of Output Buffer. */
int endOnInput, /* endOnOutputSize, endOnInputSize */
int partialDecoding, /* full, partial */
int targetOutputSize, /* only used if partialDecoding==partial */
int dict, /* noDict, withPrefix64k, usingExtDict */
const BYTE* const lowPrefix, /* == dest if dict == noDict */
const BYTE* const dictStart, /* only if dict==usingExtDict */
const size_t dictSize /* note : = 0 if noDict */
)
+{
- /* Local Variables */
- const BYTE* ip = (const BYTE*) source;
- const BYTE* const iend = ip + inputSize;
- BYTE* op = (BYTE*) dest;
- BYTE* const oend = op + outputSize;
- BYTE* cpy;
- BYTE* oexit = op + targetOutputSize;
- const BYTE* const lowLimit = lowPrefix - dictSize;
- const BYTE* const dictEnd = (const BYTE*)dictStart + dictSize;
- const size_t dec32table[] = {4, 1, 2, 1, 4, 4, 4, 4};
- const size_t dec64table[] = {0, 0, 0, (size_t)-1, 0, 1, 2, 3};
- const int safeDecode = (endOnInput==endOnInputSize);
- const int checkOffset = ((safeDecode) && (dictSize < (int)(64 KB)));
- /* Special cases */
- if ((partialDecoding) && (oexit> oend-MFLIMIT)) oexit = oend-MFLIMIT; /* targetOutputSize too high => decode everything */
- if ((endOnInput) && (unlikely(outputSize==0))) return ((inputSize==1) && (*ip==0)) ? 0 : -1; /* Empty output buffer */
- if ((!endOnInput) && (unlikely(outputSize==0))) return (*ip==0?1:-1);
- /* Main Loop */
- while (1)
- {
unsigned token;
size_t length;
const BYTE* match;
/* get literal length */
token = *ip++;
if ((length=(token>>ML_BITS)) == RUN_MASK)
{
unsigned s;
do
{
s = *ip++;
length += s;
}
while (likely((endOnInput)?ip<iend-RUN_MASK:1) && (s==255));
if ((safeDecode) && unlikely((size_t)(op+length)<(size_t)(op))) goto _output_error; /* overflow detection */
if ((safeDecode) && unlikely((size_t)(ip+length)<(size_t)(ip))) goto _output_error; /* overflow detection */
}
/* copy literals */
cpy = op+length;
if (((endOnInput) && ((cpy>(partialDecoding?oexit:oend-MFLIMIT)) || (ip+length>iend-(2+1+LASTLITERALS))) )
|| ((!endOnInput) && (cpy>oend-COPYLENGTH)))
{
if (partialDecoding)
{
if (cpy > oend) goto _output_error; /* Error : write attempt beyond end of output buffer */
if ((endOnInput) && (ip+length > iend)) goto _output_error; /* Error : read attempt beyond end of input buffer */
}
else
{
if ((!endOnInput) && (cpy != oend)) goto _output_error; /* Error : block decoding must stop exactly there */
if ((endOnInput) && ((ip+length != iend) || (cpy > oend))) goto _output_error; /* Error : input must be consumed */
}
memcpy(op, ip, length);
ip += length;
op += length;
break; /* Necessarily EOF, due to parsing restrictions */
}
LZ4_wildCopy(op, ip, cpy);
ip += length; op = cpy;
/* get offset */
match = cpy - LZ4_readLE16(ip); ip+=2;
if ((checkOffset) && (unlikely(match < lowLimit))) goto _output_error; /* Error : offset outside destination buffer */
/* get matchlength */
length = token & ML_MASK;
if (length == ML_MASK)
{
unsigned s;
do
{
if ((endOnInput) && (ip > iend-LASTLITERALS)) goto _output_error;
s = *ip++;
length += s;
} while (s==255);
if ((safeDecode) && unlikely((size_t)(op+length)<(size_t)op)) goto _output_error; /* overflow detection */
}
length += MINMATCH;
/* check external dictionary */
if ((dict==usingExtDict) && (match < lowPrefix))
{
if (unlikely(op+length > oend-LASTLITERALS)) goto _output_error; /* doesn't respect parsing restriction */
if (length <= (size_t)(lowPrefix-match))
{
/* match can be copied as a single segment from external dictionary */
match = dictEnd - (lowPrefix-match);
memmove(op, match, length); op += length;
}
else
{
/* match encompass external dictionary and current segment */
size_t copySize = (size_t)(lowPrefix-match);
memcpy(op, dictEnd - copySize, copySize);
op += copySize;
copySize = length - copySize;
if (copySize > (size_t)(op-lowPrefix)) /* overlap within current segment */
{
BYTE* const endOfMatch = op + copySize;
const BYTE* copyFrom = lowPrefix;
while (op < endOfMatch) *op++ = *copyFrom++;
}
else
{
memcpy(op, lowPrefix, copySize);
op += copySize;
}
}
continue;
}
/* copy repeated sequence */
cpy = op + length;
if (unlikely((op-match)<8))
{
const size_t dec64 = dec64table[op-match];
op[0] = match[0];
op[1] = match[1];
op[2] = match[2];
op[3] = match[3];
match += dec32table[op-match];
LZ4_copy4(op+4, match);
op += 8; match -= dec64;
} else { LZ4_copy8(op, match); op+=8; match+=8; }
if (unlikely(cpy>oend-12))
{
if (cpy > oend-LASTLITERALS) goto _output_error; /* Error : last LASTLITERALS bytes must be literals */
if (op < oend-8)
{
LZ4_wildCopy(op, match, oend-8);
match += (oend-8) - op;
op = oend-8;
}
while (op<cpy) *op++ = *match++;
}
else
LZ4_wildCopy(op, match, cpy);
op=cpy; /* correction */
- }
- /* end of decoding */
- if (endOnInput)
return (int) (((char*)op)-dest); /* Nb of output bytes decoded */
- else
return (int) (((const char*)ip)-source); /* Nb of input bytes read */
- /* Overflow error detected */
+_output_error:
- return (int) (-(((const char*)ip)-source))-1;
+} diff --git a/lib/lz4_wrapper.c b/lib/lz4_wrapper.c new file mode 100644 index 0000000..5f715e6 --- /dev/null +++ b/lib/lz4_wrapper.c @@ -0,0 +1,150 @@ +/*
- Copyright 2015 Google Inc.
Should be able to use:
SPDX-License-Identifier: GPL-2.0+ BSD-2-Clause
- Redistribution and use in source and binary forms, with or without
- modification, are permitted provided that the following conditions
- are met:
- Redistributions of source code must retain the above copyright
- notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright
- notice, this list of conditions and the following disclaimer in the
- documentation and/or other materials provided with the distribution.
- The name of the author may not be used to endorse or promote products
- derived from this software without specific prior written permission.
- Alternatively, this software may be distributed under the terms of the
- GNU General Public License ("GPL") version 2 as published by the Free
- Software Foundation.
- THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
- ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
- FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
- DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
- OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
- HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
- LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
- OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
- SUCH DAMAGE.
- */
+#include <common.h> +#include <compiler.h> +#include <linux/kernel.h> +#include <linux/types.h>
+static u16 LZ4_readLE16(const void *src) { return le16_to_cpu(*(u16 *)src); } +static void LZ4_copy4(void *dst, const void *src) { *(u32 *)dst = *(u32 *)src; } +static void LZ4_copy8(void *dst, const void *src) { *(u64 *)dst = *(u64 *)src; }
I can see why you want to keep lz4.c as is. But this file is written by you, isn't it? If so, can you fix the checkpatch errors that are fixable (e.g. run 'patman').
warning: lib/lz4_wrapper.c,41: do not add new typedefs
warning: lib/lz4_wrapper.c,42: do not add new typedefs
warning: lib/lz4_wrapper.c,43: do not add new typedefs
warning: lib/lz4_wrapper.c,44: do not add new typedefs
warning: lib/lz4_wrapper.c,45: do not add new typedefs
warning: lib/lz4_wrapper.c,47: storage class should be at the beginning of the declaration
error: lib/lz4_wrapper.c,59: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,60: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,61: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,62: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,63: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,64: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,70: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,71: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,72: spaces prohibited around that ':' (ctx:WxW)
warning: lib/lz4_wrapper.c,77: __packed is preferred over __attribute__((packed))
warning: lib/lz4_wrapper.c,77: Adding new packed members is to be done with care
error: lib/lz4_wrapper.c,83: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,84: spaces prohibited around that ':' (ctx:WxW)
warning: lib/lz4_wrapper.c,89: __packed is preferred over __attribute__((packed))
warning: lib/lz4_wrapper.c,89: Adding new packed members is to be done with care
error: test/compression.c,282: return is not a function, parentheses are not required
+typedef uint8_t BYTE; +typedef uint16_t U16; +typedef uint32_t U32; +typedef int32_t S32; +typedef uint64_t U64;
+#define FORCE_INLINE static inline __attribute__((always_inline))
+/* Unaltered (except removing unrelated code) from github.com/Cyan4973/lz4. */ +#include "lz4.c" /* #include for inlining, do not link! */
+#define LZ4F_MAGIC 0x184D2204
+struct lz4_frame_header {
u32 magic;
union {
u8 flags;
struct {
u8 reserved0 : 2;
u8 has_content_checksum : 1;
u8 has_content_size : 1;
u8 has_block_checksum : 1;
u8 independent_blocks : 1;
u8 version : 2;
};
};
union {
u8 block_descriptor;
struct {
u8 reserved1 : 4;
u8 max_block_size : 3;
u8 reserved2 : 1;
};
};
/* + u64 content_size iff has_content_size is set */
/* + u8 header_checksum */
+} __attribute__((packed));
__packed
+struct lz4_block_header {
union {
u32 raw;
struct {
u32 size : 31;
u32 not_compressed : 1;
};
};
/* + size bytes of data */
/* + u32 block_checksum iff has_block_checksum is set */
+} __attribute__((packed));
+size_t ulz4fn(const void *src, size_t srcn, void *dst, size_t dstn)
Could you return int here, and use proper error numbers below? Like -EINVAL, -EPROTONOSUPPORT, -ENOSPC, etc.
+{
const void *in = src;
void *out = dst;
int has_block_checksum;
{ /* With in-place decompression the header may become invalid later. */
const struct lz4_frame_header *h = in;
if (srcn < sizeof(*h) + sizeof(u64) + sizeof(u8))
return 0; /* input overrun */
/* We assume there's always only a single, standard frame. */
if (le32_to_cpu(h->magic) != LZ4F_MAGIC || h->version != 1)
return 0; /* unknown format */
if (h->reserved0 || h->reserved1 || h->reserved2)
return 0; /* reserved must be zero */
if (!h->independent_blocks)
return 0; /* we don't support block dependency */
has_block_checksum = h->has_block_checksum;
in += sizeof(*h);
if (h->has_content_size)
in += sizeof(u64);
in += sizeof(u8);
}
while (1) {
struct lz4_block_header b = { .raw = le32_to_cpu(*(u32 *)in) };
in += sizeof(struct lz4_block_header);
if (in - src + b.size > srcn)
return 0; /* input overrun */
if (!b.size)
return out - dst; /* decompression successful */
if (b.not_compressed) {
size_t size = min((ptrdiff_t)b.size, dst + dstn - out);
memcpy(out, in, size);
if (size < b.size)
return 0; /* output overrun */
else
out += size;
} else {
/* constant folding essential, do not touch params! */
int ret = LZ4_decompress_generic(in, out, b.size,
dst + dstn - out, endOnInputSize,
full, 0, noDict, out, NULL, 0);
if (ret < 0)
return 0; /* decompression error */
else
out += ret;
}
in += b.size;
if (has_block_checksum)
in += sizeof(u32);
}
+} diff --git a/test/compression.c b/test/compression.c index 7ef3a8c..5521b02 100644 --- a/test/compression.c +++ b/test/compression.c @@ -95,6 +95,28 @@ static const char lzo_compressed[] = "\x73\x61\x67\x65\x73\x2e\x0a\x11\x00\x00\x00\x00\x00\x00"; static const unsigned long lzo_compressed_size = 334;
+/* lz4 -z /tmp/plain.txt > /tmp/plain.lz4 */ +static const char lz4_compressed[] =
"\x04\x22\x4d\x18\x64\x70\xb9\x01\x01\x00\x00\xff\x19\x49\x20\x61"
"\x6d\x20\x61\x20\x68\x69\x67\x68\x6c\x79\x20\x63\x6f\x6d\x70\x72"
"\x65\x73\x73\x61\x62\x6c\x65\x20\x62\x69\x74\x20\x6f\x66\x20\x74"
"\x65\x78\x74\x2e\x0a\x28\x00\x3d\xf1\x25\x54\x68\x65\x72\x65\x20"
"\x61\x72\x65\x20\x6d\x61\x6e\x79\x20\x6c\x69\x6b\x65\x20\x6d\x65"
"\x2c\x20\x62\x75\x74\x20\x74\x68\x69\x73\x20\x6f\x6e\x65\x20\x69"
"\x73\x20\x6d\x69\x6e\x65\x2e\x0a\x49\x66\x20\x49\x20\x77\x32\x00"
"\xd1\x6e\x79\x20\x73\x68\x6f\x72\x74\x65\x72\x2c\x20\x74\x45\x00"
"\xf4\x0b\x77\x6f\x75\x6c\x64\x6e\x27\x74\x20\x62\x65\x20\x6d\x75"
"\x63\x68\x20\x73\x65\x6e\x73\x65\x20\x69\x6e\x0a\xcf\x00\x50\x69"
"\x6e\x67\x20\x6d\x12\x00\x00\x32\x00\xf0\x11\x20\x66\x69\x72\x73"
"\x74\x20\x70\x6c\x61\x63\x65\x2e\x20\x41\x74\x20\x6c\x65\x61\x73"
"\x74\x20\x77\x69\x74\x68\x20\x6c\x7a\x6f\x2c\x63\x00\xf5\x14\x77"
"\x61\x79\x2c\x0a\x77\x68\x69\x63\x68\x20\x61\x70\x70\x65\x61\x72"
"\x73\x20\x74\x6f\x20\x62\x65\x68\x61\x76\x65\x20\x70\x6f\x6f\x72"
"\x6c\x79\x4e\x00\x30\x61\x63\x65\x27\x01\x01\x95\x00\x01\x2d\x01"
"\xb0\x0a\x6d\x65\x73\x73\x61\x67\x65\x73\x2e\x0a\x00\x00\x00\x00"
"\x9d\x12\x8c\x9d";
+static const unsigned long lz4_compressed_size = 276;
#define TEST_BUFFER_SIZE 512
@@ -227,6 +249,39 @@ static int uncompress_using_lzo(void *in, unsigned long in_size, return (ret != LZO_E_OK); }
+static int compress_using_lz4(void *in, unsigned long in_size,
void *out, unsigned long out_max,
unsigned long *out_size)
+{
/* There is no lz4 compression in u-boot, so fake it. */
assert(in_size == strlen(plain));
assert(memcmp(plain, in, in_size) == 0);
if (lz4_compressed_size > out_max)
return -1;
memcpy(out, lz4_compressed, lz4_compressed_size);
if (out_size)
*out_size = lz4_compressed_size;
return 0;
+}
+static int uncompress_using_lz4(void *in, unsigned long in_size,
void *out, unsigned long out_max,
unsigned long *out_size)
+{
size_t ret;
size_t input_size = in_size;
size_t output_size = out_max;
ret = ulz4fn(in, input_size, out, output_size);
if (out_size)
*out_size = ret;
return (ret == 0);
+}
#define errcheck(statement) if (!(statement)) { \ fprintf(stderr, "\tFailed: %s\n", #statement); \ ret = 1; \ @@ -325,6 +380,7 @@ static int do_ut_compression(cmd_tbl_t *cmdtp, int flag, int argc, err += run_test("bzip2", compress_using_bzip2, uncompress_using_bzip2); err += run_test("lzma", compress_using_lzma, uncompress_using_lzma); err += run_test("lzo", compress_using_lzo, uncompress_using_lzo);
err += run_test("lz4", compress_using_lz4, uncompress_using_lz4); printf("ut_compression %s\n", err == 0 ? "ok" : "FAILED");
@@ -401,6 +457,7 @@ static int do_ut_image_decomp(cmd_tbl_t *cmdtp, int flag, int argc, err |= run_bootm_test(IH_COMP_BZIP2, compress_using_bzip2); err |= run_bootm_test(IH_COMP_LZMA, compress_using_lzma); err |= run_bootm_test(IH_COMP_LZO, compress_using_lzo);
err |= run_bootm_test(IH_COMP_LZ4, compress_using_lz4); err |= run_bootm_test(IH_COMP_NONE, compress_using_none); printf("ut_image_decomp %s\n", err == 0 ? "ok" : "FAILED");
-- 2.1.2
Regards, Simon

I get this build error with sandbox.
test/built-in.o: In function `uncompress_using_lz4':
/home/sjg/c/src/third_party/u-boot/files/test/compression.c:278: undefined reference to `ulz4fn'
Yeah... that's because you didn't configure it in. I'm not really sure how this is supposed to work... there should really be some sort of dependency between selecting the compression tests and selecting the algorithm, or the algorithms should be #ifdefed out like in bootm.c. But neither of those seems to be in place right now... include/configs/sandbox.h just enables all the other compression algorithms, that's why it works for them.
I could add a 'select LZ4' to config UNIT_TEST in the Kconfig to create such a dependency, I think that would be the best option?
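For comparison, the other option mentioned above (guarding the algorithm with #ifdef, as bootm.c does) could look roughly like the sketch below. This is only an illustration: test_lzo(), test_lz4() and run_compression_tests() are hypothetical stand-ins, not functions from test/compression.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test stubs standing in for the real run_test() calls. */
static int test_lzo(void) { return 0; }
#ifdef CONFIG_LZ4
static int test_lz4(void) { return 0; }
#endif

/*
 * Run every test that is compiled in. Stores the number of tests run
 * in *ran and returns the number of failures.
 */
int run_compression_tests(int *ran)
{
	int err = 0;

	assert(ran != NULL);
	*ran = 0;

	err += test_lzo();
	(*ran)++;
#ifdef CONFIG_LZ4
	/* Only referenced when the algorithm is configured in. */
	err += test_lz4();
	(*ran)++;
#endif
	return err;
}
```

With this shape the link error cannot occur when CONFIG_LZ4 is off, at the cost of silently skipping the test; a Kconfig 'select' keeps the test mandatory instead.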
You should be able to run the tests using:
make O=sandbox defconfig all
./sandbox/u-boot -c "ut_compression"
I tried, but I can't find my way through the SDL dependency hell... sorry. I installed Ubuntu's libsdl2-dev, but apparently that wasn't enough: I still get "make[2]: sdl-config: Command not found" and "../arch/sandbox/cpu/sdl.c:9:21: fatal error: SDL/SDL.h: No such file or directory" when I try to build. (I ran it with make -k, so I have confirmed that it builds both lib/lz4_wrapper.o and test/compression.o without errors. I just can't test the linking step.)
Can you instead add this option to lib/Kconfig and put your help there? We are moving away from the old CONFIGS.
Done in next version.
You should be able to replace this license with
SPDX-License-Identifier: BSD-2-Clause
[...]
Should be able to use:
SPDX-License-Identifier: GPL-2.0+ BSD-2-Clause
Done in next version.
I can see why you want to keep lz4.c as is. But this file is written by you, isn't it? If so, can you fix the checkpatch errors that are fixable (e.g. run 'patman').
warning: lib/lz4_wrapper.c,41: do not add new typedefs
warning: lib/lz4_wrapper.c,42: do not add new typedefs
warning: lib/lz4_wrapper.c,43: do not add new typedefs
warning: lib/lz4_wrapper.c,44: do not add new typedefs
warning: lib/lz4_wrapper.c,45: do not add new typedefs
warning: lib/lz4_wrapper.c,47: storage class should be at the beginning of the declaration
error: lib/lz4_wrapper.c,59: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,60: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,61: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,62: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,63: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,64: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,70: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,71: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,72: spaces prohibited around that ':' (ctx:WxW)
warning: lib/lz4_wrapper.c,77: __packed is preferred over __attribute__((packed))
warning: lib/lz4_wrapper.c,77: Adding new packed members is to be done with care
error: lib/lz4_wrapper.c,83: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,84: spaces prohibited around that ':' (ctx:WxW)
warning: lib/lz4_wrapper.c,89: __packed is preferred over __attribute__((packed))
warning: lib/lz4_wrapper.c,89: Adding new packed members is to be done with care
error: test/compression.c,282: return is not a function, parentheses are not required
Okay, I replaced __attribute__((packed)) with __packed and fixed the whitespace for bit fields. I think those are the only actionable ones... the camel case comes from names inside lz4.c, and I need the typedefs to map U-Boot's types to the ones lz4.c expects.
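As a side note on the bit fields: the same frame flags can also be decoded with explicit shifts and masks, which does not depend on how a compiler orders bit fields. The sketch below is not part of the patch. The names lz4f_flags, lz4f_parse_flags() and lz4f_block_size() are hypothetical, and the bit positions are simply read off the field order in struct lz4_frame_header and struct lz4_block_header, assuming least-significant-bit-first allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LZ4F_MAGIC 0x184D2204u

/* Hypothetical helper type for this sketch; not part of the patch. */
struct lz4f_flags {
	unsigned version;              /* FLG bits 7-6, expected to be 1 */
	unsigned independent_blocks;   /* FLG bit 5 */
	unsigned has_block_checksum;   /* FLG bit 4 */
	unsigned has_content_size;     /* FLG bit 3 */
	unsigned has_content_checksum; /* FLG bit 2 */
	unsigned reserved;             /* FLG bits 1-0 (reserved0 in the patch) */
};

/* Read a little-endian 32-bit value byte by byte (no alignment needed). */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
int lz4f_parse_flags(const uint8_t *buf, size_t len, struct lz4f_flags *out)
{
	uint8_t flg;

	assert(buf != NULL && out != NULL);
	if (len < 5 || read_le32(buf) != LZ4F_MAGIC)
		return -1;

	flg = buf[4];
	out->version = (flg >> 6) & 0x3;
	out->independent_blocks = (flg >> 5) & 0x1;
	out->has_block_checksum = (flg >> 4) & 0x1;
	out->has_content_size = (flg >> 3) & 0x1;
	out->has_content_checksum = (flg >> 2) & 0x1;
	out->reserved = flg & 0x3;
	return 0;
}

/* Split a raw block header word into its size and "not compressed" flag. */
uint32_t lz4f_block_size(uint32_t raw, int *not_compressed)
{
	assert(not_compressed != NULL);
	*not_compressed = (int)(raw >> 31);
	return raw & 0x7FFFFFFFu;
}
```

Fed the first bytes of the lz4_compressed test vector (04 22 4d 18 64 ...), this reports version 1 with independent blocks and a content checksum, which is the combination ulz4fn() accepts.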
Could you return int here, and use proper error numbers below? Like -EINVAL, -EPROTONOSUPPORT, -ENOSPC, etc.
Okay, I switched it to the model where it returns an int and the size is an output pointer, like the other algorithms.
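To make the calling convention concrete, here is a minimal sketch of the "int return, size as in/out pointer" model. It is not the real decompressor: demo_unpack() is a hypothetical stand-in that only copies bytes, and the choice of -ENOSPC follows the reviewer's suggestion rather than any particular version of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a decompressor using the in/out size model.
 * On entry *dstn is the capacity of dst; on return it holds the number
 * of bytes actually written. Returns 0 or a negative errno value.
 */
int demo_unpack(const void *src, size_t srcn, void *dst, size_t *dstn)
{
	size_t cap;

	assert(dstn != NULL);
	cap = *dstn;
	*dstn = 0;	/* nothing written yet, also on error paths */

	if (!src || !dst)
		return -EINVAL;
	if (srcn > cap)
		return -ENOSPC;	/* output buffer too small */

	memcpy(dst, src, srcn);
	*dstn = srcn;
	return 0;
}
```

A caller sets the size variable to the buffer capacity before the call and reads the produced length back afterwards, which is the pattern the new bootm.c hunk follows with unc_len.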

This patch adds support for LZ4-compressed FIT image contents. This algorithm has a slightly worse compression ratio than LZO while being nearly twice as fast to decompress. When loading images from a fast storage medium this usually results in a boot time win.
Compile-tested only since I don't have a U-Boot development system set up right now. The code was imported unchanged from coreboot where it's proven to work, though. I'm mostly interested in getting this recognized by mkimage for use in a downstream project.
Signed-off-by: Julius Werner jwerner@chromium.org
---
 common/bootm.c | 9 ++
 common/image.c | 1 +
 include/common.h | 3 +
 include/image.h | 1 +
 lib/Kconfig | 18 ++++
 lib/Makefile | 1 +
 lib/lz4.c | 243 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/lz4_wrapper.c | 137 ++++++++++++++++++++++++++++++
 test/Kconfig | 1 +
 test/compression.c | 57 ++++++++++++
 10 files changed, 471 insertions(+)
 create mode 100644 lib/lz4.c
 create mode 100644 lib/lz4_wrapper.c
diff --git a/common/bootm.c b/common/bootm.c index 667c934..5f99ec8 100644 --- a/common/bootm.c +++ b/common/bootm.c @@ -389,6 +389,15 @@ int bootm_decomp_image(int comp, ulong load, ulong image_start, int type, break; } #endif /* CONFIG_LZO */ +#ifdef CONFIG_LZ4 + case IH_COMP_LZ4: { + size_t size = unc_len; + + ret = ulz4fn(image_buf, image_len, load_buf, &size); + image_len = size; + break; + } +#endif /* CONFIG_LZ4 */ default: printf("Unimplemented compression type %d\n", comp); return BOOTM_ERR_UNIMPLEMENTED; diff --git a/common/image.c b/common/image.c index 1325e07..c33749d 100644 --- a/common/image.c +++ b/common/image.c @@ -167,6 +167,7 @@ static const table_entry_t uimage_comp[] = { { IH_COMP_GZIP, "gzip", "gzip compressed", }, { IH_COMP_LZMA, "lzma", "lzma compressed", }, { IH_COMP_LZO, "lzo", "lzo compressed", }, + { IH_COMP_LZ4, "lz4", "lz4 compressed", }, { -1, "", "", }, };
diff --git a/include/common.h b/include/common.h index 68b24d0..ecb1f06 100644 --- a/include/common.h +++ b/include/common.h @@ -826,6 +826,9 @@ int gzwrite(unsigned char *src, int len, u64 startoffs, u64 szexpected);
+/* lib/lz4_wrapper.c */ +int ulz4fn(const void *src, size_t srcn, void *dst, size_t *dstn); + /* lib/qsort.c */ void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *)); diff --git a/include/image.h b/include/image.h index 8a864ae..08ae24a 100644 --- a/include/image.h +++ b/include/image.h @@ -259,6 +259,7 @@ struct lmb; #define IH_COMP_BZIP2 2 /* bzip2 Compression Used */ #define IH_COMP_LZMA 3 /* lzma Compression Used */ #define IH_COMP_LZO 4 /* lzo Compression Used */ +#define IH_COMP_LZ4 5 /* lz4 Compression Used */
#define IH_MAGIC 0x27051956 /* Image Magic Number */ #define IH_NMLEN 32 /* Image Name Length */ diff --git a/lib/Kconfig b/lib/Kconfig index 0673072..a8f8460 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -100,6 +100,24 @@ config SHA_PROG_HW_ACCEL is performed in hardware. endmenu
+menu "Compression Support" + +config LZ4 + bool "Enable LZ4 decompression support" + help + If this option is set, support for LZ4 compressed images + is included. The LZ4 algorithm can run in-place as long as the + compressed image is loaded to the end of the output buffer, and + trades lower compression ratios for much faster decompression. + + NOTE: This implements the release version of the LZ4 frame + format as generated by default by the 'lz4' command line tool. + This is not the same as the outdated, less efficient legacy + frame format currently (2015) implemented in the Linux kernel + (generated by 'lz4 -l'). The two formats are incompatible. + +endmenu + config ERRNO_STR bool "Enable function for getting errno-related string message" help diff --git a/lib/Makefile b/lib/Makefile index 96f832e..3eecefa 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -34,6 +34,7 @@ obj-$(CONFIG_GZIP_COMPRESSED) += gzip.o obj-y += initcall.o obj-$(CONFIG_LMB) += lmb.o obj-y += ldiv.o +obj-$(CONFIG_LZ4) += lz4_wrapper.o obj-$(CONFIG_MD5) += md5.o obj-y += net_utils.o obj-$(CONFIG_PHYSMEM) += physmem.o diff --git a/lib/lz4.c b/lib/lz4.c new file mode 100644 index 0000000..f518341 --- /dev/null +++ b/lib/lz4.c @@ -0,0 +1,243 @@ +/* + LZ4 - Fast LZ compression algorithm + Copyright (C) 2011-2015, Yann Collet. 
+ + SPDX-License-Identifier: BSD-2-Clause + + You can contact the author at : + - LZ4 source repository : https://github.com/Cyan4973/lz4 + - LZ4 public forum : https://groups.google.com/forum/#!forum/lz4c +*/ + + +/************************************** +* Reading and writing into memory +**************************************/ + +/* customized version of memcpy, which may overwrite up to 7 bytes beyond dstEnd */ +static void LZ4_wildCopy(void* dstPtr, const void* srcPtr, void* dstEnd) +{ + BYTE* d = (BYTE*)dstPtr; + const BYTE* s = (const BYTE*)srcPtr; + BYTE* e = (BYTE*)dstEnd; + do { LZ4_copy8(d,s); d+=8; s+=8; } while (d<e); +} + + +/************************************** +* Common Constants +**************************************/ +#define MINMATCH 4 + +#define COPYLENGTH 8 +#define LASTLITERALS 5 +#define MFLIMIT (COPYLENGTH+MINMATCH) +static const int LZ4_minLength = (MFLIMIT+1); + +#define KB *(1 <<10) +#define MB *(1 <<20) +#define GB *(1U<<30) + +#define MAXD_LOG 16 +#define MAX_DISTANCE ((1 << MAXD_LOG) - 1) + +#define ML_BITS 4 +#define ML_MASK ((1U<<ML_BITS)-1) +#define RUN_BITS (8-ML_BITS) +#define RUN_MASK ((1U<<RUN_BITS)-1) + + +/************************************** +* Local Structures and types +**************************************/ +typedef enum { noDict = 0, withPrefix64k, usingExtDict } dict_directive; +typedef enum { endOnOutputSize = 0, endOnInputSize = 1 } endCondition_directive; +typedef enum { full = 0, partial = 1 } earlyEnd_directive; + + + +/******************************* +* Decompression functions +*******************************/ +/* + * This generic decompression function cover all use cases. + * It shall be instantiated several times, using different sets of directives + * Note that it is essential this generic function is really inlined, + * in order to remove useless branches during compilation optimization. 
+ */ +FORCE_INLINE int LZ4_decompress_generic( + const char* const source, + char* const dest, + int inputSize, + int outputSize, /* If endOnInput==endOnInputSize, this value is the max size of Output Buffer. */ + + int endOnInput, /* endOnOutputSize, endOnInputSize */ + int partialDecoding, /* full, partial */ + int targetOutputSize, /* only used if partialDecoding==partial */ + int dict, /* noDict, withPrefix64k, usingExtDict */ + const BYTE* const lowPrefix, /* == dest if dict == noDict */ + const BYTE* const dictStart, /* only if dict==usingExtDict */ + const size_t dictSize /* note : = 0 if noDict */ + ) +{ + /* Local Variables */ + const BYTE* ip = (const BYTE*) source; + const BYTE* const iend = ip + inputSize; + + BYTE* op = (BYTE*) dest; + BYTE* const oend = op + outputSize; + BYTE* cpy; + BYTE* oexit = op + targetOutputSize; + const BYTE* const lowLimit = lowPrefix - dictSize; + + const BYTE* const dictEnd = (const BYTE*)dictStart + dictSize; + const size_t dec32table[] = {4, 1, 2, 1, 4, 4, 4, 4}; + const size_t dec64table[] = {0, 0, 0, (size_t)-1, 0, 1, 2, 3}; + + const int safeDecode = (endOnInput==endOnInputSize); + const int checkOffset = ((safeDecode) && (dictSize < (int)(64 KB))); + + + /* Special cases */ + if ((partialDecoding) && (oexit> oend-MFLIMIT)) oexit = oend-MFLIMIT; /* targetOutputSize too high => decode everything */ + if ((endOnInput) && (unlikely(outputSize==0))) return ((inputSize==1) && (*ip==0)) ? 
0 : -1; /* Empty output buffer */ + if ((!endOnInput) && (unlikely(outputSize==0))) return (*ip==0?1:-1); + + + /* Main Loop */ + while (1) + { + unsigned token; + size_t length; + const BYTE* match; + + /* get literal length */ + token = *ip++; + if ((length=(token>>ML_BITS)) == RUN_MASK) + { + unsigned s; + do + { + s = *ip++; + length += s; + } + while (likely((endOnInput)?ip<iend-RUN_MASK:1) && (s==255)); + if ((safeDecode) && unlikely((size_t)(op+length)<(size_t)(op))) goto _output_error; /* overflow detection */ + if ((safeDecode) && unlikely((size_t)(ip+length)<(size_t)(ip))) goto _output_error; /* overflow detection */ + } + + /* copy literals */ + cpy = op+length; + if (((endOnInput) && ((cpy>(partialDecoding?oexit:oend-MFLIMIT)) || (ip+length>iend-(2+1+LASTLITERALS))) ) + || ((!endOnInput) && (cpy>oend-COPYLENGTH))) + { + if (partialDecoding) + { + if (cpy > oend) goto _output_error; /* Error : write attempt beyond end of output buffer */ + if ((endOnInput) && (ip+length > iend)) goto _output_error; /* Error : read attempt beyond end of input buffer */ + } + else + { + if ((!endOnInput) && (cpy != oend)) goto _output_error; /* Error : block decoding must stop exactly there */ + if ((endOnInput) && ((ip+length != iend) || (cpy > oend))) goto _output_error; /* Error : input must be consumed */ + } + memcpy(op, ip, length); + ip += length; + op += length; + break; /* Necessarily EOF, due to parsing restrictions */ + } + LZ4_wildCopy(op, ip, cpy); + ip += length; op = cpy; + + /* get offset */ + match = cpy - LZ4_readLE16(ip); ip+=2; + if ((checkOffset) && (unlikely(match < lowLimit))) goto _output_error; /* Error : offset outside destination buffer */ + + /* get matchlength */ + length = token & ML_MASK; + if (length == ML_MASK) + { + unsigned s; + do + { + if ((endOnInput) && (ip > iend-LASTLITERALS)) goto _output_error; + s = *ip++; + length += s; + } while (s==255); + if ((safeDecode) && unlikely((size_t)(op+length)<(size_t)op)) goto _output_error; /* 
overflow detection */ + } + length += MINMATCH; + + /* check external dictionary */ + if ((dict==usingExtDict) && (match < lowPrefix)) + { + if (unlikely(op+length > oend-LASTLITERALS)) goto _output_error; /* doesn't respect parsing restriction */ + + if (length <= (size_t)(lowPrefix-match)) + { + /* match can be copied as a single segment from external dictionary */ + match = dictEnd - (lowPrefix-match); + memmove(op, match, length); op += length; + } + else + { + /* match encompass external dictionary and current segment */ + size_t copySize = (size_t)(lowPrefix-match); + memcpy(op, dictEnd - copySize, copySize); + op += copySize; + copySize = length - copySize; + if (copySize > (size_t)(op-lowPrefix)) /* overlap within current segment */ + { + BYTE* const endOfMatch = op + copySize; + const BYTE* copyFrom = lowPrefix; + while (op < endOfMatch) *op++ = *copyFrom++; + } + else + { + memcpy(op, lowPrefix, copySize); + op += copySize; + } + } + continue; + } + + /* copy repeated sequence */ + cpy = op + length; + if (unlikely((op-match)<8)) + { + const size_t dec64 = dec64table[op-match]; + op[0] = match[0]; + op[1] = match[1]; + op[2] = match[2]; + op[3] = match[3]; + match += dec32table[op-match]; + LZ4_copy4(op+4, match); + op += 8; match -= dec64; + } else { LZ4_copy8(op, match); op+=8; match+=8; } + + if (unlikely(cpy>oend-12)) + { + if (cpy > oend-LASTLITERALS) goto _output_error; /* Error : last LASTLITERALS bytes must be literals */ + if (op < oend-8) + { + LZ4_wildCopy(op, match, oend-8); + match += (oend-8) - op; + op = oend-8; + } + while (op<cpy) *op++ = *match++; + } + else + LZ4_wildCopy(op, match, cpy); + op=cpy; /* correction */ + } + + /* end of decoding */ + if (endOnInput) + return (int) (((char*)op)-dest); /* Nb of output bytes decoded */ + else + return (int) (((const char*)ip)-source); /* Nb of input bytes read */ + + /* Overflow error detected */ +_output_error: + return (int) (-(((const char*)ip)-source))-1; +} diff --git a/lib/lz4_wrapper.c 
b/lib/lz4_wrapper.c new file mode 100644 index 0000000..28fea76 --- /dev/null +++ b/lib/lz4_wrapper.c @@ -0,0 +1,137 @@ +/* + * Copyright 2015 Google Inc. + * + * SPDX-License-Identifier: GPL 2.0+ BSD-3-Clause + */ + +#include <common.h> +#include <compiler.h> +#include <linux/kernel.h> +#include <linux/types.h> + +static u16 LZ4_readLE16(const void *src) { return le16_to_cpu(*(u16 *)src); } +static void LZ4_copy4(void *dst, const void *src) { *(u32 *)dst = *(u32 *)src; } +static void LZ4_copy8(void *dst, const void *src) { *(u64 *)dst = *(u64 *)src; } + +typedef uint8_t BYTE; +typedef uint16_t U16; +typedef uint32_t U32; +typedef int32_t S32; +typedef uint64_t U64; + +#define FORCE_INLINE static inline __attribute__((always_inline)) + +/* Unaltered (except removing unrelated code) from github.com/Cyan4973/lz4. */ +#include "lz4.c" /* #include for inlining, do not link! */ + +#define LZ4F_MAGIC 0x184D2204 + +struct lz4_frame_header { + u32 magic; + union { + u8 flags; + struct { + u8 reserved0:2; + u8 has_content_checksum:1; + u8 has_content_size:1; + u8 has_block_checksum:1; + u8 independent_blocks:1; + u8 version:2; + }; + }; + union { + u8 block_descriptor; + struct { + u8 reserved1:4; + u8 max_block_size:3; + u8 reserved2:1; + }; + }; + /* + u64 content_size iff has_content_size is set */ + /* + u8 header_checksum */ +} __packed; + +struct lz4_block_header { + union { + u32 raw; + struct { + u32 size:31; + u32 not_compressed:1; + }; + }; + /* + size bytes of data */ + /* + u32 block_checksum iff has_block_checksum is set */ +} __packed; + +int ulz4fn(const void *src, size_t srcn, void *dst, size_t *dstn) +{ + const void *end = dst + *dstn; + const void *in = src; + void *out = dst; + int has_block_checksum; + int ret; + *dstn = 0; + + { /* With in-place decompression the header may become invalid later. 
*/ + const struct lz4_frame_header *h = in; + + if (srcn < sizeof(*h) + sizeof(u64) + sizeof(u8)) + return -EINVAL; /* input overrun */ + + /* We assume there's always only a single, standard frame. */ + if (le32_to_cpu(h->magic) != LZ4F_MAGIC || h->version != 1) + return -EPROTONOSUPPORT; /* unknown format */ + if (h->reserved0 || h->reserved1 || h->reserved2) + return -EINVAL; /* reserved must be zero */ + if (!h->independent_blocks) + return -EPROTONOSUPPORT; /* we can't support this yet */ + has_block_checksum = h->has_block_checksum; + + in += sizeof(*h); + if (h->has_content_size) + in += sizeof(u64); + in += sizeof(u8); + } + + while (1) { + struct lz4_block_header b = { .raw = le32_to_cpu(*(u32 *)in) }; + in += sizeof(struct lz4_block_header); + + if (in - src + b.size > srcn) { + ret = -EINVAL; /* input overrun */ + break; + } + + if (!b.size) { + ret = 0; /* decompression successful */ + break; + } + + if (b.not_compressed) { + size_t size = min((ptrdiff_t)b.size, end - out); + memcpy(out, in, size); + out += size; + if (size < b.size) { + ret = -ENOBUFS; /* output overrun */ + break; + } + } else { + /* constant folding essential, do not touch params! */ + int ret = LZ4_decompress_generic(in, out, b.size, + end - out, endOnInputSize, + full, 0, noDict, out, NULL, 0); + if (ret < 0) { + ret = -EPROTO; /* decompression error */ + break; + } + out += ret; + } + + in += b.size; + if (has_block_checksum) + in += sizeof(u32); + } + + *dstn = out - dst; + return ret; +} diff --git a/test/Kconfig b/test/Kconfig index d71c332..c888de6 100644 --- a/test/Kconfig +++ b/test/Kconfig @@ -1,5 +1,6 @@ menuconfig UNIT_TEST bool "Unit tests" + select LZ4 help Select this to compile in unit tests for various parts of U-Boot. Test suites will be subcommands of the "ut" command. 
diff --git a/test/compression.c b/test/compression.c index 7ef3a8c..be4e04e 100644 --- a/test/compression.c +++ b/test/compression.c @@ -95,6 +95,28 @@ static const char lzo_compressed[] = "\x73\x61\x67\x65\x73\x2e\x0a\x11\x00\x00\x00\x00\x00\x00"; static const unsigned long lzo_compressed_size = 334;
+/* lz4 -z /tmp/plain.txt > /tmp/plain.lz4 */ +static const char lz4_compressed[] = + "\x04\x22\x4d\x18\x64\x70\xb9\x01\x01\x00\x00\xff\x19\x49\x20\x61" + "\x6d\x20\x61\x20\x68\x69\x67\x68\x6c\x79\x20\x63\x6f\x6d\x70\x72" + "\x65\x73\x73\x61\x62\x6c\x65\x20\x62\x69\x74\x20\x6f\x66\x20\x74" + "\x65\x78\x74\x2e\x0a\x28\x00\x3d\xf1\x25\x54\x68\x65\x72\x65\x20" + "\x61\x72\x65\x20\x6d\x61\x6e\x79\x20\x6c\x69\x6b\x65\x20\x6d\x65" + "\x2c\x20\x62\x75\x74\x20\x74\x68\x69\x73\x20\x6f\x6e\x65\x20\x69" + "\x73\x20\x6d\x69\x6e\x65\x2e\x0a\x49\x66\x20\x49\x20\x77\x32\x00" + "\xd1\x6e\x79\x20\x73\x68\x6f\x72\x74\x65\x72\x2c\x20\x74\x45\x00" + "\xf4\x0b\x77\x6f\x75\x6c\x64\x6e\x27\x74\x20\x62\x65\x20\x6d\x75" + "\x63\x68\x20\x73\x65\x6e\x73\x65\x20\x69\x6e\x0a\xcf\x00\x50\x69" + "\x6e\x67\x20\x6d\x12\x00\x00\x32\x00\xf0\x11\x20\x66\x69\x72\x73" + "\x74\x20\x70\x6c\x61\x63\x65\x2e\x20\x41\x74\x20\x6c\x65\x61\x73" + "\x74\x20\x77\x69\x74\x68\x20\x6c\x7a\x6f\x2c\x63\x00\xf5\x14\x77" + "\x61\x79\x2c\x0a\x77\x68\x69\x63\x68\x20\x61\x70\x70\x65\x61\x72" + "\x73\x20\x74\x6f\x20\x62\x65\x68\x61\x76\x65\x20\x70\x6f\x6f\x72" + "\x6c\x79\x4e\x00\x30\x61\x63\x65\x27\x01\x01\x95\x00\x01\x2d\x01" + "\xb0\x0a\x6d\x65\x73\x73\x61\x67\x65\x73\x2e\x0a\x00\x00\x00\x00" + "\x9d\x12\x8c\x9d"; +static const unsigned long lz4_compressed_size = 276; +
#define TEST_BUFFER_SIZE 512
@@ -227,6 +249,39 @@ static int uncompress_using_lzo(void *in, unsigned long in_size, return (ret != LZO_E_OK); }
+static int compress_using_lz4(void *in, unsigned long in_size, + void *out, unsigned long out_max, + unsigned long *out_size) +{ + /* There is no lz4 compression in u-boot, so fake it. */ + assert(in_size == strlen(plain)); + assert(memcmp(plain, in, in_size) == 0); + + if (lz4_compressed_size > out_max) + return -1; + + memcpy(out, lz4_compressed, lz4_compressed_size); + if (out_size) + *out_size = lz4_compressed_size; + + return 0; +} + +static int uncompress_using_lz4(void *in, unsigned long in_size, + void *out, unsigned long out_max, + unsigned long *out_size) +{ + int ret; + size_t input_size = in_size; + size_t output_size = out_max; + + ret = ulz4fn(in, input_size, out, &output_size); + if (out_size) + *out_size = output_size; + + return (ret != 0); +} + #define errcheck(statement) if (!(statement)) { \ fprintf(stderr, "\tFailed: %s\n", #statement); \ ret = 1; \ @@ -325,6 +380,7 @@ static int do_ut_compression(cmd_tbl_t *cmdtp, int flag, int argc, err += run_test("bzip2", compress_using_bzip2, uncompress_using_bzip2); err += run_test("lzma", compress_using_lzma, uncompress_using_lzma); err += run_test("lzo", compress_using_lzo, uncompress_using_lzo); + err += run_test("lz4", compress_using_lz4, uncompress_using_lz4);
printf("ut_compression %s\n", err == 0 ? "ok" : "FAILED");
@@ -401,6 +457,7 @@ static int do_ut_image_decomp(cmd_tbl_t *cmdtp, int flag, int argc, err |= run_bootm_test(IH_COMP_BZIP2, compress_using_bzip2); err |= run_bootm_test(IH_COMP_LZMA, compress_using_lzma); err |= run_bootm_test(IH_COMP_LZO, compress_using_lzo); + err |= run_bootm_test(IH_COMP_LZ4, compress_using_lz4); err |= run_bootm_test(IH_COMP_NONE, compress_using_none);
printf("ut_image_decomp %s\n", err == 0 ? "ok" : "FAILED");

Hi Julius,
On 3 October 2015 at 06:32, Julius Werner jwerner@chromium.org wrote:
This patch adds support for LZ4-compressed FIT image contents. This algorithm has a slightly worse compression ratio than LZO while being nearly twice as fast to decompress. When loading images from a fast storage medium this usually results in a boot time win.
Compile-tested only since I don't have a U-Boot development system set up right now. The code was imported unchanged from coreboot where it's proven to work, though. I'm mostly interested in getting this recognized by mkimage for use in a downstream project.
Signed-off-by: Julius Werner jwerner@chromium.org
common/bootm.c | 9 ++ common/image.c | 1 + include/common.h | 3 + include/image.h | 1 + lib/Kconfig | 18 ++++ lib/Makefile | 1 + lib/lz4.c | 243 +++++++++++++++++++++++++++++++++++++++++++++++++++++ lib/lz4_wrapper.c | 137 ++++++++++++++++++++++++++++++ test/Kconfig | 1 + test/compression.c | 57 +++++++++++++ 10 files changed, 471 insertions(+) create mode 100644 lib/lz4.c create mode 100644 lib/lz4_wrapper.c
Actually you should enable the option in configs/sandbox_defconfig, not lib/Kconfig. See the condition used by compression.c:
obj-$(CONFIG_SANDBOX) += compression.o
Also, for me the test fails:
./b/sandbox/u-boot -c "ut_compression" ... testing lz4 ... orig_size:350 compressed_size:276 uncompressed_size:350 compress does not overrun Failed: ret != 0 lz4: FAILED ut_compression FAILED
Regards, Simon

You can build U-Boot with NO_SDL=1
Ah, thanks... that was the important magic flag I needed!
Actually you should enable the option in configs/sandbox_defconfig, not lib/Kconfig. See the condition used by compression.c:
obj-$(CONFIG_SANDBOX) += compression.o
Okay, makes sense. I looked at the wrong line in the Makefile.
Also, for me the test fails:
./b/sandbox/u-boot -c "ut_compression" ... testing lz4 ... orig_size:350 compressed_size:276 uncompressed_size:350 compress does not overrun Failed: ret != 0 lz4: FAILED ut_compression FAILED
Huh... that's odd. When I run this now, I get:
testing lz4 ... orig_size:350 compressed_size:276 uncompressed_size:350 compress does not overrun uncompress does not overrun lz4: ok
And if I change a byte in the compressed test data, it fails on memcmp() as expected. Are you sure you have no local changes or anything (I based the patch off 1f8836396)? I don't see how this could give different results...

Hi Julius,
On 5 October 2015 at 19:09, Julius Werner jwerner@chromium.org wrote:
You can build U-Boot with NO_SDL=1
Ah, thanks... that was the important magic flag I needed!
Actually you should enable the option in configs/sandbox_defconfig, not lib/Kconfig. See the condition used by compression.c:
obj-$(CONFIG_SANDBOX) += compression.o
Okay, makes sense. I looked at the wrong line in the Makefile.
Also, for me the test fails:
./b/sandbox/u-boot -c "ut_compression" ... testing lz4 ... orig_size:350 compressed_size:276 uncompressed_size:350 compress does not overrun Failed: ret != 0 lz4: FAILED ut_compression FAILED
Huh... that's odd. When I run this now, I get:
testing lz4 ... orig_size:350 compressed_size:276 uncompressed_size:350 compress does not overrun uncompress does not overrun lz4: ok
And if I change a byte in the compressed test data, it fails on memcmp() as expected. Are you sure you have no local changes or anything (I based the patch off 1f8836396)? I don't see how this could give different results...
git reset --hard HEAD~
HEAD is now at 1f88363 Prepare v2015.10-rc4
(try-julius=1f8836: asc.1 b/ et sandbox/) ~/u> !pw
pwclient git-am 525863
Applying patch #525863 using 'git am'
Description: [U-Boot,PATCHv2] Add support for LZ4 decompression algorithm
Applying: Add support for LZ4 decompression algorithm
.git/rebase-apply/patch:91: trailing whitespace.
warning: 1 line adds whitespace errors.
(try-julius=b3cf2a: asc.1 b/ et sandbox/) ~/u> 1cro
1cro: command not found
127 (try-julius=b3cf2a: asc.1 b/ et sandbox/) ~/u> !cro
crosfw -b sandbox -w
../lib/lz4_wrapper.c: In function ‘ulz4fn’:
../lib/lz4_wrapper.c:72:6: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  int ret;
      ^
./b/sandbox/u-boot -c "ut_compression"
U-Boot 2015.10-rc4-00001-gb3cf2a9 (Oct 06 2015 - 14:42:07 +0100)
DRAM: 128 MiB Using default environment
In: serial
Out: lcd
Err: lcd
Net: No ethernet found.
testing gzip ...
	orig_size:350
	compressed_size:206
	uncompressed_size:350
Deflate need more space to compress left 350 bytes
	compress does not overrun
Error: inflate() returned -5
	uncompress does not overrun
	gzip: ok
testing bzip2 ...
	orig_size:350
	compressed_size:240
	uncompressed_size:350
	compress does not overrun
	uncompress does not overrun
	bzip2: ok
testing lzma ...
	orig_size:350
	compressed_size:229
	uncompressed_size:350
	compress does not overrun
	uncompress does not overrun
	lzma: ok
testing lzo ...
	orig_size:350
	compressed_size:334
	uncompressed_size:350
	compress does not overrun
	uncompress does not overrun
	lzo: ok
testing lz4 ...
	orig_size:350
	compressed_size:276
	uncompressed_size:350
	compress does not overrun
	Failed: ret != 0
	lz4: FAILED
ut_compression FAILED
I pushed it to u-boot-x86 branch 'julius-working' for you to check.
Regards, Simon

Well then... a few hours and a significant reduction in sanity later, I found that I'm shadowing the new 'ret' variable from changing the API around because I forgot to remove the declaration part from the 'ret' that already existed in the else block. It would be nice if U-Boot compiled with -Wshadow...
This made the (outer) return variable uninitialized in the decompression error case, which on my native x86_64 happened to produce a non-zero value... but with an i686 cross-compiler I could reproduce the error you saw. Fixed in the next version and moved the config to sandbox_defconfig as you suggested.
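For readers following along, the shadowing problem described above can be reproduced in a few lines. This is only a minimal sketch and not code from the patch: `demo_decode`, `demo_shadowed`, `demo_fixed`, `demo_check` and the `DEMO_*` constants are made-up names, and the outer variable is given a sentinel value so that the demo stays well defined (in the real wrapper it was simply uninitialized).

```c
#include <assert.h>

#define DEMO_SENTINEL 12345	/* stands in for "whatever was on the stack" */
#define DEMO_EPROTO   71	/* illustrative error number */

/* Hypothetical decoder stub that always reports failure. */
static int demo_decode(void)
{
	return -1;
}

/* Buggy shape: 'int ret' in the inner block declares a second variable. */
static int demo_shadowed(void)
{
	int ret = DEMO_SENTINEL;	/* initialized only to keep the demo defined */

	{
		int ret = demo_decode();	/* hides the outer ret */

		if (ret < 0)
			ret = -DEMO_EPROTO;	/* updates the inner ret only */
	}

	return ret;	/* outer ret was never written */
}

/* Fixed shape: no declaration, so the assignment reaches the outer ret. */
static int demo_fixed(void)
{
	int ret = DEMO_SENTINEL;

	{
		ret = demo_decode();
		if (ret < 0)
			ret = -DEMO_EPROTO;
	}

	return ret;
}

/* Usage: the two variants disagree about what the caller sees. */
static void demo_check(void)
{
	assert(demo_shadowed() == DEMO_SENTINEL);
	assert(demo_fixed() == -DEMO_EPROTO);
}
```

Building something like this with GCC's -Wshadow should flag the inner declaration, which is the warning wished for above.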

This patch adds support for LZ4-compressed FIT image contents. This algorithm has a slightly worse compression ratio than LZO while being nearly twice as fast to decompress. When loading images from a fast storage medium this usually results in a boot time win.
Sandbox-tested only since I don't have a U-Boot development system set up right now. The code was imported unchanged from coreboot where it's proven to work, though. I'm mostly interested in getting this recognized by mkimage for use in a downstream project.
Signed-off-by: Julius Werner jwerner@chromium.org --- common/bootm.c | 9 ++ common/image.c | 1 + configs/sandbox_defconfig | 1 + include/common.h | 3 + include/image.h | 1 + lib/Kconfig | 18 ++++ lib/Makefile | 1 + lib/lz4.c | 243 ++++++++++++++++++++++++++++++++++++++++++++++ lib/lz4_wrapper.c | 137 ++++++++++++++++++++++++++ test/compression.c | 57 +++++++++++ 10 files changed, 471 insertions(+) create mode 100644 lib/lz4.c create mode 100644 lib/lz4_wrapper.c
diff --git a/common/bootm.c b/common/bootm.c index c0d0d09..58936ca 100644 --- a/common/bootm.c +++ b/common/bootm.c @@ -389,6 +389,15 @@ int bootm_decomp_image(int comp, ulong load, ulong image_start, int type, break; } #endif /* CONFIG_LZO */ +#ifdef CONFIG_LZ4 + case IH_COMP_LZ4: { + size_t size = unc_len; + + ret = ulz4fn(image_buf, image_len, load_buf, &size); + image_len = size; + break; + } +#endif /* CONFIG_LZ4 */ default: printf("Unimplemented compression type %d\n", comp); return BOOTM_ERR_UNIMPLEMENTED; diff --git a/common/image.c b/common/image.c index 1325e07..c33749d 100644 --- a/common/image.c +++ b/common/image.c @@ -167,6 +167,7 @@ static const table_entry_t uimage_comp[] = { { IH_COMP_GZIP, "gzip", "gzip compressed", }, { IH_COMP_LZMA, "lzma", "lzma compressed", }, { IH_COMP_LZO, "lzo", "lzo compressed", }, + { IH_COMP_LZ4, "lz4", "lz4 compressed", }, { -1, "", "", }, };
diff --git a/configs/sandbox_defconfig b/configs/sandbox_defconfig index ae96b63..b2675c7 100644 --- a/configs/sandbox_defconfig +++ b/configs/sandbox_defconfig @@ -56,6 +56,7 @@ CONFIG_USB_STORAGE=y CONFIG_SYS_VSNPRINTF=y CONFIG_CMD_DHRYSTONE=y CONFIG_TPM=y +CONFIG_LZ4=y CONFIG_ERRNO_STR=y CONFIG_UNIT_TEST=y CONFIG_UT_TIME=y diff --git a/include/common.h b/include/common.h index 68b24d0..ecb1f06 100644 --- a/include/common.h +++ b/include/common.h @@ -826,6 +826,9 @@ int gzwrite(unsigned char *src, int len, u64 startoffs, u64 szexpected);
+/* lib/lz4_wrapper.c */ +int ulz4fn(const void *src, size_t srcn, void *dst, size_t *dstn); + /* lib/qsort.c */ void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *)); diff --git a/include/image.h b/include/image.h index 8a864ae..08ae24a 100644 --- a/include/image.h +++ b/include/image.h @@ -259,6 +259,7 @@ struct lmb; #define IH_COMP_BZIP2 2 /* bzip2 Compression Used */ #define IH_COMP_LZMA 3 /* lzma Compression Used */ #define IH_COMP_LZO 4 /* lzo Compression Used */ +#define IH_COMP_LZ4 5 /* lz4 Compression Used */
#define IH_MAGIC 0x27051956 /* Image Magic Number */ #define IH_NMLEN 32 /* Image Name Length */ diff --git a/lib/Kconfig b/lib/Kconfig index 0673072..a8f8460 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -100,6 +100,24 @@ config SHA_PROG_HW_ACCEL is performed in hardware. endmenu
+menu "Compression Support" + +config LZ4 + bool "Enable LZ4 decompression support" + help + If this option is set, support for LZ4 compressed images + is included. The LZ4 algorithm can run in-place as long as the + compressed image is loaded to the end of the output buffer, and + trades lower compression ratios for much faster decompression. + + NOTE: This implements the release version of the LZ4 frame + format as generated by default by the 'lz4' command line tool. + This is not the same as the outdated, less efficient legacy + frame format currently (2015) implemented in the Linux kernel + (generated by 'lz4 -l'). The two formats are incompatible. + +endmenu + config ERRNO_STR bool "Enable function for getting errno-related string message" help diff --git a/lib/Makefile b/lib/Makefile index 96f832e..3eecefa 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -34,6 +34,7 @@ obj-$(CONFIG_GZIP_COMPRESSED) += gzip.o obj-y += initcall.o obj-$(CONFIG_LMB) += lmb.o obj-y += ldiv.o +obj-$(CONFIG_LZ4) += lz4_wrapper.o obj-$(CONFIG_MD5) += md5.o obj-y += net_utils.o obj-$(CONFIG_PHYSMEM) += physmem.o diff --git a/lib/lz4.c b/lib/lz4.c new file mode 100644 index 0000000..f518341 --- /dev/null +++ b/lib/lz4.c @@ -0,0 +1,243 @@ +/* + LZ4 - Fast LZ compression algorithm + Copyright (C) 2011-2015, Yann Collet. 
+ + SPDX-License-Identifier: BSD-2-Clause + + You can contact the author at : + - LZ4 source repository : https://github.com/Cyan4973/lz4 + - LZ4 public forum : https://groups.google.com/forum/#!forum/lz4c +*/ + + +/************************************** +* Reading and writing into memory +**************************************/ + +/* customized version of memcpy, which may overwrite up to 7 bytes beyond dstEnd */ +static void LZ4_wildCopy(void* dstPtr, const void* srcPtr, void* dstEnd) +{ + BYTE* d = (BYTE*)dstPtr; + const BYTE* s = (const BYTE*)srcPtr; + BYTE* e = (BYTE*)dstEnd; + do { LZ4_copy8(d,s); d+=8; s+=8; } while (d<e); +} + + +/************************************** +* Common Constants +**************************************/ +#define MINMATCH 4 + +#define COPYLENGTH 8 +#define LASTLITERALS 5 +#define MFLIMIT (COPYLENGTH+MINMATCH) +static const int LZ4_minLength = (MFLIMIT+1); + +#define KB *(1 <<10) +#define MB *(1 <<20) +#define GB *(1U<<30) + +#define MAXD_LOG 16 +#define MAX_DISTANCE ((1 << MAXD_LOG) - 1) + +#define ML_BITS 4 +#define ML_MASK ((1U<<ML_BITS)-1) +#define RUN_BITS (8-ML_BITS) +#define RUN_MASK ((1U<<RUN_BITS)-1) + + +/************************************** +* Local Structures and types +**************************************/ +typedef enum { noDict = 0, withPrefix64k, usingExtDict } dict_directive; +typedef enum { endOnOutputSize = 0, endOnInputSize = 1 } endCondition_directive; +typedef enum { full = 0, partial = 1 } earlyEnd_directive; + + + +/******************************* +* Decompression functions +*******************************/ +/* + * This generic decompression function cover all use cases. + * It shall be instantiated several times, using different sets of directives + * Note that it is essential this generic function is really inlined, + * in order to remove useless branches during compilation optimization. 
+ */ +FORCE_INLINE int LZ4_decompress_generic( + const char* const source, + char* const dest, + int inputSize, + int outputSize, /* If endOnInput==endOnInputSize, this value is the max size of Output Buffer. */ + + int endOnInput, /* endOnOutputSize, endOnInputSize */ + int partialDecoding, /* full, partial */ + int targetOutputSize, /* only used if partialDecoding==partial */ + int dict, /* noDict, withPrefix64k, usingExtDict */ + const BYTE* const lowPrefix, /* == dest if dict == noDict */ + const BYTE* const dictStart, /* only if dict==usingExtDict */ + const size_t dictSize /* note : = 0 if noDict */ + ) +{ + /* Local Variables */ + const BYTE* ip = (const BYTE*) source; + const BYTE* const iend = ip + inputSize; + + BYTE* op = (BYTE*) dest; + BYTE* const oend = op + outputSize; + BYTE* cpy; + BYTE* oexit = op + targetOutputSize; + const BYTE* const lowLimit = lowPrefix - dictSize; + + const BYTE* const dictEnd = (const BYTE*)dictStart + dictSize; + const size_t dec32table[] = {4, 1, 2, 1, 4, 4, 4, 4}; + const size_t dec64table[] = {0, 0, 0, (size_t)-1, 0, 1, 2, 3}; + + const int safeDecode = (endOnInput==endOnInputSize); + const int checkOffset = ((safeDecode) && (dictSize < (int)(64 KB))); + + + /* Special cases */ + if ((partialDecoding) && (oexit> oend-MFLIMIT)) oexit = oend-MFLIMIT; /* targetOutputSize too high => decode everything */ + if ((endOnInput) && (unlikely(outputSize==0))) return ((inputSize==1) && (*ip==0)) ? 
0 : -1; /* Empty output buffer */ + if ((!endOnInput) && (unlikely(outputSize==0))) return (*ip==0?1:-1); + + + /* Main Loop */ + while (1) + { + unsigned token; + size_t length; + const BYTE* match; + + /* get literal length */ + token = *ip++; + if ((length=(token>>ML_BITS)) == RUN_MASK) + { + unsigned s; + do + { + s = *ip++; + length += s; + } + while (likely((endOnInput)?ip<iend-RUN_MASK:1) && (s==255)); + if ((safeDecode) && unlikely((size_t)(op+length)<(size_t)(op))) goto _output_error; /* overflow detection */ + if ((safeDecode) && unlikely((size_t)(ip+length)<(size_t)(ip))) goto _output_error; /* overflow detection */ + } + + /* copy literals */ + cpy = op+length; + if (((endOnInput) && ((cpy>(partialDecoding?oexit:oend-MFLIMIT)) || (ip+length>iend-(2+1+LASTLITERALS))) ) + || ((!endOnInput) && (cpy>oend-COPYLENGTH))) + { + if (partialDecoding) + { + if (cpy > oend) goto _output_error; /* Error : write attempt beyond end of output buffer */ + if ((endOnInput) && (ip+length > iend)) goto _output_error; /* Error : read attempt beyond end of input buffer */ + } + else + { + if ((!endOnInput) && (cpy != oend)) goto _output_error; /* Error : block decoding must stop exactly there */ + if ((endOnInput) && ((ip+length != iend) || (cpy > oend))) goto _output_error; /* Error : input must be consumed */ + } + memcpy(op, ip, length); + ip += length; + op += length; + break; /* Necessarily EOF, due to parsing restrictions */ + } + LZ4_wildCopy(op, ip, cpy); + ip += length; op = cpy; + + /* get offset */ + match = cpy - LZ4_readLE16(ip); ip+=2; + if ((checkOffset) && (unlikely(match < lowLimit))) goto _output_error; /* Error : offset outside destination buffer */ + + /* get matchlength */ + length = token & ML_MASK; + if (length == ML_MASK) + { + unsigned s; + do + { + if ((endOnInput) && (ip > iend-LASTLITERALS)) goto _output_error; + s = *ip++; + length += s; + } while (s==255); + if ((safeDecode) && unlikely((size_t)(op+length)<(size_t)op)) goto _output_error; /* 
overflow detection */ + } + length += MINMATCH; + + /* check external dictionary */ + if ((dict==usingExtDict) && (match < lowPrefix)) + { + if (unlikely(op+length > oend-LASTLITERALS)) goto _output_error; /* doesn't respect parsing restriction */ + + if (length <= (size_t)(lowPrefix-match)) + { + /* match can be copied as a single segment from external dictionary */ + match = dictEnd - (lowPrefix-match); + memmove(op, match, length); op += length; + } + else + { + /* match encompass external dictionary and current segment */ + size_t copySize = (size_t)(lowPrefix-match); + memcpy(op, dictEnd - copySize, copySize); + op += copySize; + copySize = length - copySize; + if (copySize > (size_t)(op-lowPrefix)) /* overlap within current segment */ + { + BYTE* const endOfMatch = op + copySize; + const BYTE* copyFrom = lowPrefix; + while (op < endOfMatch) *op++ = *copyFrom++; + } + else + { + memcpy(op, lowPrefix, copySize); + op += copySize; + } + } + continue; + } + + /* copy repeated sequence */ + cpy = op + length; + if (unlikely((op-match)<8)) + { + const size_t dec64 = dec64table[op-match]; + op[0] = match[0]; + op[1] = match[1]; + op[2] = match[2]; + op[3] = match[3]; + match += dec32table[op-match]; + LZ4_copy4(op+4, match); + op += 8; match -= dec64; + } else { LZ4_copy8(op, match); op+=8; match+=8; } + + if (unlikely(cpy>oend-12)) + { + if (cpy > oend-LASTLITERALS) goto _output_error; /* Error : last LASTLITERALS bytes must be literals */ + if (op < oend-8) + { + LZ4_wildCopy(op, match, oend-8); + match += (oend-8) - op; + op = oend-8; + } + while (op<cpy) *op++ = *match++; + } + else + LZ4_wildCopy(op, match, cpy); + op=cpy; /* correction */ + } + + /* end of decoding */ + if (endOnInput) + return (int) (((char*)op)-dest); /* Nb of output bytes decoded */ + else + return (int) (((const char*)ip)-source); /* Nb of input bytes read */ + + /* Overflow error detected */ +_output_error: + return (int) (-(((const char*)ip)-source))-1; +} diff --git a/lib/lz4_wrapper.c 
b/lib/lz4_wrapper.c new file mode 100644 index 0000000..0739663 --- /dev/null +++ b/lib/lz4_wrapper.c @@ -0,0 +1,137 @@ +/* + * Copyright 2015 Google Inc. + * + * SPDX-License-Identifier: GPL 2.0+ BSD-3-Clause + */ + +#include <common.h> +#include <compiler.h> +#include <linux/kernel.h> +#include <linux/types.h> + +static u16 LZ4_readLE16(const void *src) { return le16_to_cpu(*(u16 *)src); } +static void LZ4_copy4(void *dst, const void *src) { *(u32 *)dst = *(u32 *)src; } +static void LZ4_copy8(void *dst, const void *src) { *(u64 *)dst = *(u64 *)src; } + +typedef uint8_t BYTE; +typedef uint16_t U16; +typedef uint32_t U32; +typedef int32_t S32; +typedef uint64_t U64; + +#define FORCE_INLINE static inline __attribute__((always_inline)) + +/* Unaltered (except removing unrelated code) from github.com/Cyan4973/lz4. */ +#include "lz4.c" /* #include for inlining, do not link! */ + +#define LZ4F_MAGIC 0x184D2204 + +struct lz4_frame_header { + u32 magic; + union { + u8 flags; + struct { + u8 reserved0:2; + u8 has_content_checksum:1; + u8 has_content_size:1; + u8 has_block_checksum:1; + u8 independent_blocks:1; + u8 version:2; + }; + }; + union { + u8 block_descriptor; + struct { + u8 reserved1:4; + u8 max_block_size:3; + u8 reserved2:1; + }; + }; + /* + u64 content_size iff has_content_size is set */ + /* + u8 header_checksum */ +} __packed; + +struct lz4_block_header { + union { + u32 raw; + struct { + u32 size:31; + u32 not_compressed:1; + }; + }; + /* + size bytes of data */ + /* + u32 block_checksum iff has_block_checksum is set */ +} __packed; + +int ulz4fn(const void *src, size_t srcn, void *dst, size_t *dstn) +{ + const void *end = dst + *dstn; + const void *in = src; + void *out = dst; + int has_block_checksum; + int ret; + *dstn = 0; + + { /* With in-place decompression the header may become invalid later. 
*/ + const struct lz4_frame_header *h = in; + + if (srcn < sizeof(*h) + sizeof(u64) + sizeof(u8)) + return -EINVAL; /* input overrun */ + + /* We assume there's always only a single, standard frame. */ + if (le32_to_cpu(h->magic) != LZ4F_MAGIC || h->version != 1) + return -EPROTONOSUPPORT; /* unknown format */ + if (h->reserved0 || h->reserved1 || h->reserved2) + return -EINVAL; /* reserved must be zero */ + if (!h->independent_blocks) + return -EPROTONOSUPPORT; /* we can't support this yet */ + has_block_checksum = h->has_block_checksum; + + in += sizeof(*h); + if (h->has_content_size) + in += sizeof(u64); + in += sizeof(u8); + } + + while (1) { + struct lz4_block_header b = { .raw = le32_to_cpu(*(u32 *)in) }; + in += sizeof(struct lz4_block_header); + + if (in - src + b.size > srcn) { + ret = -EINVAL; /* input overrun */ + break; + } + + if (!b.size) { + ret = 0; /* decompression successful */ + break; + } + + if (b.not_compressed) { + size_t size = min((ptrdiff_t)b.size, end - out); + memcpy(out, in, size); + out += size; + if (size < b.size) { + ret = -ENOBUFS; /* output overrun */ + break; + } + } else { + /* constant folding essential, do not touch params! */ + ret = LZ4_decompress_generic(in, out, b.size, + end - out, endOnInputSize, + full, 0, noDict, out, NULL, 0); + if (ret < 0) { + ret = -EPROTO; /* decompression error */ + break; + } + out += ret; + } + + in += b.size; + if (has_block_checksum) + in += sizeof(u32); + } + + *dstn = out - dst; + return ret; +} diff --git a/test/compression.c b/test/compression.c index 7ef3a8c..be4e04e 100644 --- a/test/compression.c +++ b/test/compression.c @@ -95,6 +95,28 @@ static const char lzo_compressed[] = "\x73\x61\x67\x65\x73\x2e\x0a\x11\x00\x00\x00\x00\x00\x00"; static const unsigned long lzo_compressed_size = 334;
+/* lz4 -z /tmp/plain.txt > /tmp/plain.lz4 */ +static const char lz4_compressed[] = + "\x04\x22\x4d\x18\x64\x70\xb9\x01\x01\x00\x00\xff\x19\x49\x20\x61" + "\x6d\x20\x61\x20\x68\x69\x67\x68\x6c\x79\x20\x63\x6f\x6d\x70\x72" + "\x65\x73\x73\x61\x62\x6c\x65\x20\x62\x69\x74\x20\x6f\x66\x20\x74" + "\x65\x78\x74\x2e\x0a\x28\x00\x3d\xf1\x25\x54\x68\x65\x72\x65\x20" + "\x61\x72\x65\x20\x6d\x61\x6e\x79\x20\x6c\x69\x6b\x65\x20\x6d\x65" + "\x2c\x20\x62\x75\x74\x20\x74\x68\x69\x73\x20\x6f\x6e\x65\x20\x69" + "\x73\x20\x6d\x69\x6e\x65\x2e\x0a\x49\x66\x20\x49\x20\x77\x32\x00" + "\xd1\x6e\x79\x20\x73\x68\x6f\x72\x74\x65\x72\x2c\x20\x74\x45\x00" + "\xf4\x0b\x77\x6f\x75\x6c\x64\x6e\x27\x74\x20\x62\x65\x20\x6d\x75" + "\x63\x68\x20\x73\x65\x6e\x73\x65\x20\x69\x6e\x0a\xcf\x00\x50\x69" + "\x6e\x67\x20\x6d\x12\x00\x00\x32\x00\xf0\x11\x20\x66\x69\x72\x73" + "\x74\x20\x70\x6c\x61\x63\x65\x2e\x20\x41\x74\x20\x6c\x65\x61\x73" + "\x74\x20\x77\x69\x74\x68\x20\x6c\x7a\x6f\x2c\x63\x00\xf5\x14\x77" + "\x61\x79\x2c\x0a\x77\x68\x69\x63\x68\x20\x61\x70\x70\x65\x61\x72" + "\x73\x20\x74\x6f\x20\x62\x65\x68\x61\x76\x65\x20\x70\x6f\x6f\x72" + "\x6c\x79\x4e\x00\x30\x61\x63\x65\x27\x01\x01\x95\x00\x01\x2d\x01" + "\xb0\x0a\x6d\x65\x73\x73\x61\x67\x65\x73\x2e\x0a\x00\x00\x00\x00" + "\x9d\x12\x8c\x9d"; +static const unsigned long lz4_compressed_size = 276; +
#define TEST_BUFFER_SIZE 512
@@ -227,6 +249,39 @@ static int uncompress_using_lzo(void *in, unsigned long in_size, return (ret != LZO_E_OK); }
+static int compress_using_lz4(void *in, unsigned long in_size, + void *out, unsigned long out_max, + unsigned long *out_size) +{ + /* There is no lz4 compression in u-boot, so fake it. */ + assert(in_size == strlen(plain)); + assert(memcmp(plain, in, in_size) == 0); + + if (lz4_compressed_size > out_max) + return -1; + + memcpy(out, lz4_compressed, lz4_compressed_size); + if (out_size) + *out_size = lz4_compressed_size; + + return 0; +} + +static int uncompress_using_lz4(void *in, unsigned long in_size, + void *out, unsigned long out_max, + unsigned long *out_size) +{ + int ret; + size_t input_size = in_size; + size_t output_size = out_max; + + ret = ulz4fn(in, input_size, out, &output_size); + if (out_size) + *out_size = output_size; + + return (ret != 0); +} + #define errcheck(statement) if (!(statement)) { \ fprintf(stderr, "\tFailed: %s\n", #statement); \ ret = 1; \ @@ -325,6 +380,7 @@ static int do_ut_compression(cmd_tbl_t *cmdtp, int flag, int argc, err += run_test("bzip2", compress_using_bzip2, uncompress_using_bzip2); err += run_test("lzma", compress_using_lzma, uncompress_using_lzma); err += run_test("lzo", compress_using_lzo, uncompress_using_lzo); + err += run_test("lz4", compress_using_lz4, uncompress_using_lz4);
printf("ut_compression %s\n", err == 0 ? "ok" : "FAILED");
@@ -401,6 +457,7 @@ static int do_ut_image_decomp(cmd_tbl_t *cmdtp, int flag, int argc, err |= run_bootm_test(IH_COMP_BZIP2, compress_using_bzip2); err |= run_bootm_test(IH_COMP_LZMA, compress_using_lzma); err |= run_bootm_test(IH_COMP_LZO, compress_using_lzo); + err |= run_bootm_test(IH_COMP_LZ4, compress_using_lz4); err |= run_bootm_test(IH_COMP_NONE, compress_using_none);
printf("ut_image_decomp %s\n", err == 0 ? "ok" : "FAILED");
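As a worked example of the frame header check in ulz4fn(), the first bytes of the lz4_compressed test vector can be decoded by hand. The sketch below is not part of the patch: it uses shifts and masks instead of the bitfield structs, `demo_parse_hdr`, `struct demo_lz4_hdr` and `demo_vector_start` are made-up names, and the bit positions are my reading of struct lz4_frame_header, assuming the compiler allocates bitfields starting from the least significant bit. The header checksum byte is skipped rather than verified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_LZ4F_MAGIC 0x184D2204u	/* same value as LZ4F_MAGIC in the patch */

/* Hypothetical decoded view of the FLG and BD bytes. */
struct demo_lz4_hdr {
	unsigned version;
	unsigned independent_blocks;
	unsigned has_block_checksum;
	unsigned has_content_size;
	unsigned has_content_checksum;
	unsigned max_block_size;
};

/* Returns 0 on success, -1 on short input, bad magic or reserved bits set. */
static int demo_parse_hdr(const uint8_t *p, size_t n, struct demo_lz4_hdr *h)
{
	uint32_t magic;
	uint8_t flg, bd;

	assert(p && h);
	if (n < 6)		/* 4 bytes magic + FLG + BD */
		return -1;

	/* The magic is stored little-endian. */
	magic = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
		(uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
	if (magic != DEMO_LZ4F_MAGIC)
		return -1;

	flg = p[4];
	bd = p[5];
	if ((flg & 0x03) || (bd & 0x8f))	/* reserved0, reserved1, reserved2 */
		return -1;

	h->has_content_checksum = (flg >> 2) & 1;
	h->has_content_size = (flg >> 3) & 1;
	h->has_block_checksum = (flg >> 4) & 1;
	h->independent_blocks = (flg >> 5) & 1;
	h->version = (flg >> 6) & 3;
	h->max_block_size = (bd >> 4) & 7;
	return 0;
}

/* First seven bytes of lz4_compressed[] from test/compression.c. */
static const uint8_t demo_vector_start[] = {
	0x04, 0x22, 0x4d, 0x18, 0x64, 0x70, 0xb9
};
```

With FLG = 0x64 and BD = 0x70 this gives version 1, independent blocks, no block checksum, no content size, a content checksum, and a max_block_size field of 7, so the test vector passes the checks the wrapper makes before it starts on the blocks.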

On 7 October 2015 at 04:03, Julius Werner jwerner@chromium.org wrote:
This patch adds support for LZ4-compressed FIT image contents. This algorithm has a slightly worse compression ratio than LZO while being nearly twice as fast to decompress. When loading images from a fast storage medium this usually results in a boot time win.
Sandbox-tested only since I don't have a U-Boot development system set up right now. The code was imported unchanged from coreboot where it's proven to work, though. I'm mostly interested in getting this recognized by mkimage for use in a downstream project.
Signed-off-by: Julius Werner jwerner@chromium.org
common/bootm.c | 9 ++ common/image.c | 1 + configs/sandbox_defconfig | 1 + include/common.h | 3 + include/image.h | 1 + lib/Kconfig | 18 ++++ lib/Makefile | 1 + lib/lz4.c | 243 ++++++++++++++++++++++++++++++++++++++++++++++ lib/lz4_wrapper.c | 137 ++++++++++++++++++++++++++ test/compression.c | 57 +++++++++++ 10 files changed, 471 insertions(+) create mode 100644 lib/lz4.c create mode 100644 lib/lz4_wrapper.c
Acked-by: Simon Glass sjg@chromium.org

On Tue, Oct 06, 2015 at 08:03:53PM -0700, Julius Werner wrote:
This patch adds support for LZ4-compressed FIT image contents. This algorithm has a slightly worse compression ratio than LZO while being nearly twice as fast to decompress. When loading images from a fast storage medium this usually results in a boot time win.
Sandbox-tested only since I don't have a U-Boot development system set up right now. The code was imported unchanged from coreboot where it's proven to work, though. I'm mostly interested in getting this recognized by mkimage for use in a downstream project.
Signed-off-by: Julius Werner jwerner@chromium.org Acked-by: Simon Glass sjg@chromium.org
Applied to u-boot/master, thanks!

Hi Julius,
On 7 October 2015 at 04:01, Julius Werner jwerner@chromium.org wrote:
Well then... a few hours and a significant reduction in sanity later, I found that I'm shadowing the new 'ret' variable from changing the API around because I forgot to remove the declaration part from the 'ret' that already existed in the else block. It would be nice if U-Boot compiled with -Wshadow...
I saw an 'unused variable' warning which I mentioned in my email. It may have been another symptom. You could send a patch to change the flags if you like.
This made the (outer) return variable uninitialized in the decompression error case, which on my native x86_64 happened to produce a non-zero value... but with an i686 cross-compiler I could reproduce the error you saw. Fixed in the next version and moved the config to sandbox_defconfig as you suggested.
Seems to work now, so we are good.
Regards, Simon

Hi Julius,
On 2 October 2015 at 23:18, Julius Werner jwerner@chromium.org wrote:
I get this build error with sandbox.
test/built-in.o: In function `uncompress_using_lz4':
/home/sjg/c/src/third_party/u-boot/files/test/compression.c:278: undefined reference to `ulz4fn'
Yeah... that's because you didn't configure it in. I'm not really sure how this is supposed to work... there should really be some sort of dependency between selecting the compression tests and selecting the algorithm, or the algorithms should be #ifdefed out like in bootm.c. But neither of those seems to be in place right now... include/configs/sandbox.h just enables all the other compression algorithms, that's why it works for them.
I could add a 'select LZ4' to config UNIT_TEST in the Kconfig to create such a dependency, I think that would be the best option?
Sounds good. I think it makes sense to enable all possible options in sandbox - since it helps with build testing too.
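The other option mentioned above, compiling an algorithm out when it is not configured in the way bootm.c does, might look roughly like the following. This is only an illustrative sketch: `DEMO_HAVE_LZ4`, the `DEMO_*` constants and the `demo_*` functions are hypothetical stand-ins and not U-Boot symbols.

```c
#include <assert.h>

#define DEMO_COMP_NONE		0
#define DEMO_COMP_LZ4		5
#define DEMO_ERR_UNIMPLEMENTED	(-2)

/* Build with -DDEMO_HAVE_LZ4 to compile the LZ4 case in. */
#ifdef DEMO_HAVE_LZ4
/* Hypothetical stub standing in for the real decompressor. */
static int demo_lz4(void)
{
	return 0;
}
#endif

/* Dispatch on the compression type; unconfigured types fall to default. */
static int demo_dispatch(int comp)
{
	switch (comp) {
	case DEMO_COMP_NONE:
		return 0;
#ifdef DEMO_HAVE_LZ4
	case DEMO_COMP_LZ4:
		return demo_lz4();
#endif
	default:
		return DEMO_ERR_UNIMPLEMENTED;
	}
}

/* Usage: without DEMO_HAVE_LZ4 the LZ4 type is reported as unimplemented. */
static void demo_check(void)
{
	assert(demo_dispatch(DEMO_COMP_NONE) == 0);
#ifndef DEMO_HAVE_LZ4
	assert(demo_dispatch(DEMO_COMP_LZ4) == DEMO_ERR_UNIMPLEMENTED);
#endif
}
```

The trade-off is that every caller needs the same guard, whereas a Kconfig 'select' keeps the dependency in one place.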
You should be able to run the tests using:
make O=sandbox defconfig all
./sandbox/u-boot -c "ut_compression"
I tried, but I can't find my way through the SDL dependency hell... sorry. I installed Ubuntu's libsdl2-dev but apparently that wasn't enough... I still get "make[2]: sdl-config: Command not found" and "../arch/sandbox/cpu/sdl.c:9:21: fatal error: SDL/SDL.h: No such file or directory" when I'm trying to build. (I ran it with make -k now so I confirmed that it builds both lib/lz4_wrapper.o and test/compression.o without errors. I just can't test the linking step.)
You can build U-Boot with NO_SDL=1
I have this on my goobuntu laptop:
~> dpkg -l |grep -i sdl
ii libsdl1.2-dev 1.2.15-8ubuntu1.1 amd64 Simple DirectMedia Layer development files
ii libsdl1.2debian:amd64 1.2.15-8ubuntu1.1 amd64 Simple DirectMedia Layer
Can you instead add this option to lib/Kconfig and put your help there? We are moving away from the old CONFIGS.
Done in next version.
You should be able to replace this license with
SPDX-License-Identifier: BSD-2-Clause
[...]
Should be able to use:
SPDX-License-Identifier: GPL-2.0+ BSD-2-Clause
Done in next version.
I can see why you want to keep lz4.c as is. But this file is written by you, isn't it? If so, can you fix the checkpatch errors that are fixable (e.g. run 'patman').
warning: lib/lz4_wrapper.c,41: do not add new typedefs
warning: lib/lz4_wrapper.c,42: do not add new typedefs
warning: lib/lz4_wrapper.c,43: do not add new typedefs
warning: lib/lz4_wrapper.c,44: do not add new typedefs
warning: lib/lz4_wrapper.c,45: do not add new typedefs
warning: lib/lz4_wrapper.c,47: storage class should be at the beginning of the declaration
error: lib/lz4_wrapper.c,59: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,60: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,61: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,62: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,63: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,64: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,70: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,71: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,72: spaces prohibited around that ':' (ctx:WxW)
warning: lib/lz4_wrapper.c,77: __packed is preferred over __attribute__((packed))
warning: lib/lz4_wrapper.c,77: Adding new packed members is to be done with care
error: lib/lz4_wrapper.c,83: spaces prohibited around that ':' (ctx:WxW)
error: lib/lz4_wrapper.c,84: spaces prohibited around that ':' (ctx:WxW)
warning: lib/lz4_wrapper.c,89: __packed is preferred over __attribute__((packed))
warning: lib/lz4_wrapper.c,89: Adding new packed members is to be done with care
error: test/compression.c,282: return is not a function, parentheses are not required
Okay, I replaced __attribute__((packed)) with __packed and fixed the whitespace for bit fields. I think those are the only actionable ones... the camel case comes from names inside lz4.c, and I need the typedefs to map U-Boot's types to the ones lz4.c expects.
Right.
Could you return int here, and use proper error numbers below? Like -EINVAL, -EPROTONOSUPPORT, -ENOSPC, etc.
Okay, I switched it to the model where it returns an int and the size is an output pointer, like the other algorithms.
Sounds good.
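The calling convention agreed on here (an int error code as the return value, with the size passed through an in/out pointer) can be sketched as follows. This is not the ulz4fn() implementation: `demo_unpack` is a made-up function that only copies stored data, loosely modelled on the wrapper's not_compressed path, to show how the capacity goes in and the byte count comes out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical "decompressor" for stored (uncompressed) data.
 * On entry *dstn is the capacity of dst; on return it is the number of
 * bytes written. Returns 0 on success or a negative errno value.
 */
static int demo_unpack(const void *src, size_t srcn, void *dst, size_t *dstn)
{
	size_t cap;

	assert(dstn);
	cap = *dstn;
	*dstn = 0;

	if (!src || !dst)
		return -EINVAL;

	if (srcn > cap) {
		/* Copy what fits and report how far we got. */
		memcpy(dst, src, cap);
		*dstn = cap;
		return -ENOBUFS;	/* output overrun */
	}

	memcpy(dst, src, srcn);
	*dstn = srcn;
	return 0;
}
```

A caller sets the size variable to the buffer capacity before the call and reads the produced length back afterwards, much like the new bootm.c hunk does with 'size' and unc_len.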
Regards, Simon
participants (3): Julius Werner, Simon Glass, Tom Rini