
Hi Scott,
On Thu, 19 Mar 2015 16:39:42 -0500, Scott Wood <scottwood@freescale.com> wrote:
On Wed, 2015-03-18 at 10:04 +0100, Albert ARIBAUD (3ADEV) wrote:
> +int nand_spl_load_image(uint32_t offs, unsigned int size, void *dst)
> +{
> +	int block_good;
bool?
> +	struct lpc32xx_oob oob;
> +	unsigned int page, left;
> +
> +	/* if starting mid-block consider block good */
> +	block_good = 1;
> +
> +	for (page = offs / BYTES_PER_PAGE,
> +	     left = DIV_ROUND_UP(size, BYTES_PER_PAGE);
> +	     left && (page < PAGES_PER_CHIP_MAX); page++) {
> +		/* read inband and oob page data anyway */
> +		int res = read_single_page(dst, page, &oob);
> +		/* return error if a good block's page was not readable */
> +		if ((res < 0) && block_good)
> +			return -1;
Why even try to read it if it's been marked bad?
Actually, at this point, we may not know yet if the block is good or bad, since we might be reading a page of it for the first time. Will fix, see below.
> +		/* if first page in a block, update 'block good' status */
The non-SPL driver appears to be checking the last two pages in the block, not the first page.
Thanks for pointing out this discrepancy.
I took the first-page option here after checking the NAND's datasheet, which says that for factory bad block marking at least, while all pages in a bad block /should/ bear the bad block marker, only the first one is /guaranteed/ to have it. The driver might have inherited the 'last page' option from the previous NAND chip used, or it might be a mistake of my own.
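To cover both conventions, the scan could simply test the marker on each candidate page. A minimal sketch -- the helper, constants, and the simulated flash contents below are all illustrative, not the actual driver code:

```c
#include <stdbool.h>

#define PAGES_PER_BLOCK	64
#define BB_MARKER_GOOD	0xff	/* factory convention: 0xff means "good" */

/* Simulated OOB reader: here, block 1's first page carries a bad marker */
static unsigned char read_bb_marker(unsigned int page)
{
	return (page == 1 * PAGES_PER_BLOCK) ? 0x00 : BB_MARKER_GOOD;
}

/*
 * A block is bad if ANY candidate page carries a non-0xff marker.
 * Checking both the first and the last page covers chips that mark
 * the first page and chips that mark the last page(s).
 */
static bool block_is_bad(unsigned int block)
{
	unsigned int first = block * PAGES_PER_BLOCK;
	unsigned int last = first + PAGES_PER_BLOCK - 1;

	return read_bb_marker(first) != BB_MARKER_GOOD ||
	       read_bb_marker(last) != BB_MARKER_GOOD;
}
```

The cost is one extra OOB read per block, which seems acceptable in SPL given how few blocks an image spans.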
BTW, is there a standard way to ask a NAND chip which page(s) in a bad block should be checked?
> +		if ((page % PAGES_PER_BLOCK) == 0) {
> +			/* assume block is good */
> +			block_good = 1;
> +			/* if page could not be read, assume block is bad */
> +			if (res < 0)
> +				block_good = 0;
No. A block is to be skipped if and only if it has a bad block marker. ECC errors should not be silently ignored just because they happen on the first page of a block. If the writer of this data didn't skip the block, then skipping it when reading will result in unusable data regardless of the underlying ECC problem.
You're correct of course.
However, I do want the SPL chainloading to be as resilient as possible, and we have established that it will have to read some part of a bad block -- possibly resulting in a read failure -- before it can know that the block is bad, which means it may ultimately have to ignore that read failure.
I guess I could wait to declare the block good or bad until I could read at least one of its pages (actually, one of its pages' OOB data) properly, and only bail out if the OOB says the block is supposed to be good.
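In code, the deferred decision I have in mind would look roughly like this -- the enum, helper name, and OOB marker byte are sketches of the idea, not the actual lpc32xx layout:

```c
#define BB_MARKER_GOOD	0xff	/* factory convention: 0xff means "good" */

enum block_state { BLOCK_UNKNOWN, BLOCK_GOOD, BLOCK_BAD };

/*
 * Update the block's status after one page read attempt:
 * stay UNKNOWN while no page (and thus no OOB) could be read,
 * then trust the first readable OOB's bad-block marker.
 * A read failure is only fatal once the block is known GOOD.
 */
static enum block_state update_block_state(enum block_state cur,
					   int read_res,
					   unsigned char bb_marker)
{
	if (cur != BLOCK_UNKNOWN)
		return cur;		/* already decided for this block */
	if (read_res < 0)
		return BLOCK_UNKNOWN;	/* cannot tell yet, keep trying */
	return (bb_marker == BB_MARKER_GOOD) ? BLOCK_GOOD : BLOCK_BAD;
}
```

The caller would reset the state to BLOCK_UNKNOWN at each block boundary and return an error only when a read fails while the state is BLOCK_GOOD.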
Now, if none of the block's pages can be read, we still cannot tell whether these read failures are catastrophic. If the block was actually bad and the writer skipped it, the resulting image may well be intact, pass the checksum test, and run. If the block was actually good, SPL will detect the corruption and refuse to boot the image -- unless the completely unreadable good block was the first one, which holds the signature and checksum; in that case SPL will fall back to treating the image as a raw, unsigned, unverifiable one, and will jump into corrupt code.
This may happen; but I think the probability of a completely unreadable block is very low, and the probability of that block being the first one of the image is very low too [1].
Which leads me to two possible courses of action:
- consider the risk even though the probabilities are very low, thus treat any totally unreadable block as good, and bail out in this case, or
- add an option to SPL that prevents common/spl/spl.c from falling back to running the image as raw if no signature is found, and treat any totally unreadable block as bad, therefore ignoring the read errors. Either the block was actually good, and SPL will catch the image as corrupt or missing a signature; or the block was actually bad, and the image will run as expected.
I personally prefer the second one, as it brings a valuable (I think) feature to U-Boot for any board on which one wants to prevent an unverifiable image from running.
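On the spl.c side, the second option would amount to a build-time guard around the raw-image fallback. A sketch -- the option name and the function shape are my guesses for discussion, not existing U-Boot symbols:

```c
#include <stdint.h>

#define IH_MAGIC 0x27051956	/* U-Boot legacy image header magic */

/* Hypothetical new option: refuse to fall back to a raw image */
#define CONFIG_SPL_ABORT_ON_RAW_IMAGE 1

/* Returns 0 to boot, negative to refuse (mimics the header parse step) */
static int parse_image_header(uint32_t magic)
{
	if (magic == IH_MAGIC)
		return 0;	/* wrapped image: checksum will be verified */
#if CONFIG_SPL_ABORT_ON_RAW_IMAGE
	return -1;		/* no signature: never jump into unverified code */
#else
	return 0;		/* legacy fallback: treat as raw executable */
#endif
}
```

Boards that deliberately boot raw images would simply leave the option unset and keep today's behaviour.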
-Scott
Best regards, Albert ARIBAUD 3ADEV