
On Thu, Mar 06, 2014 at 15:40 +1300, Charles Manning wrote:
[ ... ] Unfortunately the CRC used in this boot ROM is not the same as the Adler CRC in lib/crc32.c. Indeed the Adler code is not technically a CRC but is more correctly described as a checksum.
I don't quite get why you say that the zlib implementation is not a CRC. IIUC all of the CRC-32 methods follow a common algorithm, and just happen to differ in their polynomials or initial and final masks. So they result in different CRC values for the same data stream, yet all of them are CRCs.
But strictly speaking this remark is not specific to this patch submission, and might be omitted. The only relevant thing is that the zlib CRC-32 approach does not result in the value that the SoC's ROM loader expects.
Thus, the appropriate CRC generator is added to lib/ as crc32_alt.c.
What does "alt" stand for, and can you provide an appropriate description with the commit log (or just use a better name)? As previous communication suggests, it's certainly not "alternative" -- as there just are too many alternatives available to mark one of them as _the_ alternative. It's neither "apple" where you appear to have found the matching polynomial first. If it's "Altera", I'd suggest to use either "altr" as this is what elsewhere is used for Altera (the stock ticker), or plain "altera" instead of something this short and ambiguous. Or "bzlib" since this is the most recent implementation that you are using.
Signed-off-by: Charles Manning cdhmanning@gmail.com
Changes for v3:
- Fix some coding style issues.
- Move from a standalone tool to the mkimgae framework.
Changes for v4:
- Fix more coding style issues.
- Fix typos in Makefile.
- Rebase on master (previous version was not on master, but on a working socfpga branch).
Changes for v5:
- Fix more coding style issues.
- Add some more comments.
- Remove some unused defines.
- Move the local CRC32 code into lib/crc32_alt.c.
Changes for v6:
- Fix more coding style issues.
- Rejig socfpgaimage_vrec_header() function so that it has one return path and does stricter size checks.
Changes for v7:
- Use bzlib's crc table instead of adding another one.
- Use existing code and a packed structure for header marshalling.
One of the reasons I got tired of providing feedback is that this change log is rather terse. Several identical "fix coding style" phrases are another way of telling readers that they should figure out for themselves what might have changed (especially when newer versions have functional changes that are not announced as such), and whether feedback was considered or got ignored. Seeing several iterations which suffer from the same issues that have been identified multiple versions before isn't helpful either when asking yourself whether to spend more time and getting ignored another time.
--- /dev/null +++ b/include/bzlib_crc32.h @@ -0,0 +1,17 @@ +/*
- Copyright (C) 2014 Charles Manning cdhmanning@gmail.com
- SPDX-License-Identifier: GPL-2.0+
- Note that the CRC is **not** the zlib/Adler crc32 in crc32.c.
- It is the CRC-32 used in bzip2, ethernet and elsewhere.
- */
+#ifndef __BZLIB_CRC32_H__ +#define __BZLIB_CRC32_H__
+#include <stdint.h>
+uint32_t bzlib_crc32(uint32_t crc, const void *_buf, int length);
+#endif
and
--- /dev/null +++ b/lib/bzlib_crc32.c @@ -0,0 +1,26 @@ +/*
- Copyright (C) 2014 Charles Manning cdhmanning@gmail.com
- SPDX-License-Identifier: GPL-2.0+
- This provides a CRC-32 using the tables in bzlib_crctable.c
- */
+#include <bzlib_crc32.h> +#include "bzlib_private.h"
+uint32_t bzlib_crc32(uint32_t crc, const void *_buf, int length) +{
- const uint8_t *buf = _buf;
- crc ^= ~0;
- while (length--) {
crc = (crc << 8) ^ BZ2_crc32Table[((crc >> 24) ^ *buf) & 0xff];
buf++;
- }
- crc ^= ~0;
- return crc;
+}
Since you already include the private bzlib header for the BZ2_crc32Table[] declaration, you might as well use the BZ_INITIALISE_CRC(), BZ_FINALISE_CRC(), and BZ_UPDATE_CRC() macros declared from there (right next to the table's declaration), instead of re-inventing how the CRC gets calculated.
You probably should determine whether you want to provide one routine to do the complete calculate over a given byte stream, without any "previous" CRC value -- this is what your current use case is. Or whether you want to provide three primitives to initialize, update, and finalize a CRC (which is not used yet). Or whether you want to provide the three primitives plus one convenience wrapper.
--- a/tools/Makefile +++ b/tools/Makefile @@ -70,6 +70,8 @@ RSA_OBJS-$(CONFIG_FIT_SIGNATURE) := rsa-sign.o # common objs for dumpimage and mkimage dumpimage-mkimage-objs := aisimage.o \ $(FIT_SIG_OBJS-y) \
bzlib_crc32.o \
bzlib_crctable.o \ crc32.o \ default_image.o \ fit_image.o \
@@ -85,6 +87,7 @@ dumpimage-mkimage-objs := aisimage.o \ os_support.o \ pblimage.o \ sha1.o \
socfpgaimage.o \ ublimage.o \ $(LIBFDT_OBJS) \ $(RSA_OBJS-y)
diff --git a/tools/bzlib_crc32.c b/tools/bzlib_crc32.c new file mode 100644 index 0000000..b7f7aa5 --- /dev/null +++ b/tools/bzlib_crc32.c @@ -0,0 +1 @@ +#include "../lib/bzlib_crc32.c" diff --git a/tools/bzlib_crctable.c b/tools/bzlib_crctable.c new file mode 100644 index 0000000..53d38ef --- /dev/null +++ b/tools/bzlib_crctable.c @@ -0,0 +1 @@ +#include "../bzlib_crctable.c" diff --git a/tools/bzlib_private.h b/tools/bzlib_private.h new file mode 100644 index 0000000..cfb74be --- /dev/null +++ b/tools/bzlib_private.h @@ -0,0 +1 @@ +#include "lib/bzlib_private.h"
Now this is incredibly ugly. You are duplicating lib/ stuff in tools/ and build it there as well? Or not at all build it in lib/ where the code actually resides? It's only a side note that #include for "*.c" is problematic as some compilers do semi-intelligent stuff when the filename does not end in ".h" and then break compilation, so this hack should neither get spread nor suggested for use.
Is there any (real, technical) reason why the bzip stuff (the CRC-32 calculation that has been made available separately) cannot get built and used as a library, and the tools/ application just gets linked against it as one would expect?
--- /dev/null +++ b/tools/socfpgaimage.c @@ -0,0 +1,255 @@ +/*
- Copyright (C) 2014 Charles Manning cdhmanning@gmail.com
- SPDX-License-Identifier: GPL-2.0+
- Note this doc is not entirely accurate. Of particular interest to us is the
- "header" length field being in U32s and not bytes.
Can you rephrase this? The current form suggests that you cannot trust the Altera documentation, while you fail to state what exactly you have identified as being wrong (whether this one length spec is all the problems, or whether there are more).
Looking at "Loading the Preloader" on page A-7 I see that it says the "Program length" is in bytes, while the "CRC" description on page A-8 mentions "Program Length * 4". So the Handbook is inconsistent, and available data suggests that "length in units of 32bit words" is correct.
The polynomial cited there BTW translates to 0x04c11db7, which may be more specific than "what is used in ethernet and elsewhere", and might be useful to have in comments and commit logs. The initial value and final XOR and "no reflection" are the other attributes of this specific CRC-32 method.
- "Header" is a structure of the following format.
- this is positioned at 0x40.
- Endian is LSB.
- Offset Length Usage
- 0x40 4 Validation word 0x31305341
- 0x44 1 Version (whatever, zero is fine)
- 0x45 1 Flags (unused, zero is fine)
- 0x46 2 Length (in units of u32, including the end checksum).
- 0x48 2 Zero
- 0x4A 2 Checksum over the heder. NB Not CRC32
This is wrong (and is not what the Handbook Appendix says). The length is 32bits in width. It just happens that in little endian representation and with a maximum of 60KB of payload data, the upper 16bits naturally remain zero. Still the documentation and implementation should work with a 32bit length.
I agree that the term "checksum" might be confusing, because it's a mere sum of the preceeding bytes, and "by coincidence" the sum is used to check plausibility of the header. I wouldn't either know how to best put this into a short comment. From personal experience I can tell that "sum" alone would have helped a lot.
- At the end of the code we have a 32-bit CRC checksum over whole binary
- excluding the CRC.
This probably should say "CRC over the all payload data from offset zero up to but not including the position of the CRC". Depending on whether the 64KB padding is considered part of the binary or some "afterwards thing", this might help.
Background information: The SPL is supposed to be up to 60KB in size (the upper 4KB are for the ROM loader's internal use, and for data shared among the preloader and the bootloader). The SPL memory image has a gap at 0x40 (after the vectors, and before the "useful code") where the preloader header/signature gets put into. After the signature application, a CRC-32 is calculated over the complete data and gets appended to the data (the length spec in the header includes the CRC location). Then the image gets padded to full 64KB and gets duplicated four times (to increase robustness by means of redundancy, assuming boot media might have read errors).
Your submission's motivation is to initially stamp a preloader that's the fresh result of a compilation. I can see that it would be useful and desirable to apply the tool to existing binaries as well, i.e. throw a 64KB binary at the tool to re-create the signature.
- Note that the CRC used here is **not** the zlib/Adler crc32. It is the
- CRC-32 used in bzip2, ethernet and elsewhere.
If you want to be most specific, cite the parameters of the CRC-32 used here. See the above comments and http://opensource.apple.com/source/CommonCrypto/CommonCrypto-60049/libcn/crc... and maybe http://www.ross.net/crc/download/crc_v3.txt
BTW I'm still under the impression that zlib uses the same polynomial as well as identical initial values and final XOR masks (see lib/crc32.c). So where is the difference between the zlib and the bzlib CRC-32 parameters?
$ perl -e 'use Compress::Zlib; printf "%x\n", crc32("123456789");' cbf43926
- The image is padded out to 64k, because that is what is
- typically used to write the image to the boot medium.
- */
+#include <bzlib_crc32.h> +#include "imagetool.h" +#include <image.h>
+#define HEADER_OFFSET 0x40 +#define HEADER_SIZE 0xC +#define VALIDATION_WORD 0x31305341 +#define PADDED_SIZE 0x10000
+/* To allow for adding CRC, the max input size is a bit smaller. */ +#define MAX_INPUT_SIZE (PADDED_SIZE - sizeof(uint32_t))
See above, it's not a blocker for the initial motivation, but might be useful to run the tool on existing 64KB blobs as well. But then it's hard to determine the "right" length in reliable ways, and the benefit may not be as big as I think it is, hmm... So OK, forget this idea, running the tool on fresh compiler output of at most 60KB size is quite appropriate an assumption.
+static uint8_t buffer[PADDED_SIZE];
+static struct {
- uint32_t validation;
- uint8_t version;
- uint8_t flags;
- uint16_t length_u32;
- uint16_t zero;
- uint16_t checksum;
+} __attribute__((packed)) header;
You are aware of the "packed" and "alignment" discussion threads here, aren't you? I considered the "serializer" accessors of previous patch iterations most appropriate (except for their names), and feel this approach with a struct is inferior (especially because it doesn't cover the CRC later). But others need not agree.
Having a "length_u32" field which is of 'uint16_t' data type doesn't look right.
+/*
- The header checksum is just a very simple checksum over
- the header area.
- There is still a crc32 over the whole lot.
- */
+static uint16_t hdr_checksum(const uint8_t *buf, int len) +{
- uint16_t ret = 0;
- int i;
- for (i = 0; i < len; i++) {
ret += (((uint16_t) *buf) & 0xff);
buf++;
- }
- return ret;
+}
What is the typecast good for? Why are there still whitespace issues? Can you either fix them or tell why this is OK? I.e. can you _somehow_ respond to feedback you receive? Ignoring review comments makes you recive less feedback later.
The usual C language style comments apply: Please don't mix declarations and instructions. You can eliminate variables. Use names which better reflect what is happening. Something like this is more idiomatic:
static uint16_t header_sum(const uint8_t *buf, int len) { uint16_t sum;
sum = 0; while (len-- > 0) { sum += *buf++; } return sum; }
+static void build_header(uint8_t *buf,
uint8_t version,
uint8_t flags,
uint16_t length_bytes)
+{
- header.validation = htole32(VALIDATION_WORD);
- header.version = version;
- header.flags = flags;
- header.length_u32 = htole16(length_bytes/4);
- header.zero = 0;
- header.checksum = htole16(hdr_checksum((const uint8_t *)&header, 10));
- memcpy(buf, &header, sizeof(header));
+}
I'd suggest to s/4/sizeof(uint32_t)/
Can you rephrase the "10" in terms of "header size minus the width of its sum"? Very similar to the CRC which covers "all of the data except the CRC field".
And there may be a problem with "direct" assignments to a "packed" struct. Since you fill in a local struct and memcpy to the image then, you might as well use "normal serializers".
+/* Sign the buffer and return the signed buffer size */ +static int sign_buffer(uint8_t *buf,
uint8_t version, uint8_t flags,
int len, int pad_64k)
+{
- uint32_t calc_crc;
- /* Align the length up */
- len = (len + 3) & (~3);
Is there nothing you can re-use? If not can you introduce something generic to roundup() for user space tools to use? Can you rephrase the condition such that it becomes evident you are doing ROUNDUP(len, sizeof(uint32_t)) and avoid the magic "3" that has no such meaning for readers?
- /* Build header, adding 4 bytes to length to hold the CRC32. */
- build_header(buf + HEADER_OFFSET, version, flags, len + 4);
- /* Calculate and apply the CRC */
- calc_crc = bzlib_crc32(0, buf, len);
- *((uint32_t *)(buf + len)) = htole32(calc_crc);
This is what I was talking about above. It may be better to use a serializer to write both header fields as well as the CRC.
- if (!pad_64k)
return len + 4;
- return PADDED_SIZE;
+}
+static void socfpgaimage_print_header(const void *ptr) +{
- if (verify_buffer(ptr) == 0)
printf("Looks like a sane SOCFPGA preloader\n");
- else
printf("Not a sane SOCFPGA preloader\n");
+}
Would it be useful to print a little more information than "it's a preloader" without any further details? I'm not familiar with the context this routine gets called in, what do users expect?
virtually yours Gerhard Sittig