
On Mon, 31 Jul 2023, 9:29 am Pierre Bourdon, delroth@gmail.com wrote:
On Sun, Jul 30, 2023 at 11:21 PM Chris Packham judge.packham@gmail.com wrote:
On Sun, Jul 30, 2023 at 6:08 AM Pierre Bourdon delroth@gmail.com wrote:
Chunked raw reads get accumulated to the data buffer, but in some ECC configurations they can end up being larger than the originally computed size (write page size + OOB size). For example:
4K page size, ECC strength 8:
- Normal reads: writesize (4096B) + oobsize (128B) = 4224 bytes.
- Chunked raw reads: 4 chunks of 1024B + 1 final spare area of 64B + 5 ECC areas of 32B = 4320B.
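The arithmetic in the example above can be checked with a small C sketch. This is only an illustration: the struct and function names are hypothetical and not taken from the driver, and it assumes the raw read size is simply the sum of the data chunks, the final spare area and the ECC areas, as in the example.

```c
#include <stddef.h>

/*
 * Hypothetical description of a chunked page layout. Field names are
 * illustrative only and do not match any real driver structure.
 */
struct chunk_layout {
	size_t nchunks;     /* number of data chunks per page */
	size_t chunk_size;  /* data bytes per chunk */
	size_t spare_size;  /* bytes in the final spare area */
	size_t necc;        /* number of ECC areas returned by a raw read */
	size_t ecc_size;    /* bytes per ECC area */
};

/* Size assumed for a normal read: page data plus OOB. */
size_t normal_read_size(size_t writesize, size_t oobsize)
{
	return writesize + oobsize;
}

/* Bytes accumulated by a chunked raw read under the assumption above. */
size_t raw_read_size(const struct chunk_layout *l)
{
	return l->nchunks * l->chunk_size + l->spare_size +
	       l->necc * l->ecc_size;
}

/* A buffer that is safe for both read paths must hold the larger size. */
size_t required_buf_size(size_t writesize, size_t oobsize,
			 const struct chunk_layout *l)
{
	size_t normal = normal_read_size(writesize, oobsize);
	size_t raw = raw_read_size(l);

	return raw > normal ? raw : normal;
}
```

With the 4K page, ECC strength 8 numbers from above (4 chunks of 1024B, 64B spare, 5 ECC areas of 32B, 128B OOB), normal_read_size() gives 4224 and raw_read_size() gives 4320, so a buffer sized for the normal read is 96 bytes too small for the raw read.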
I'm not a NAND expert and I haven't sat down and fully grasped the math, but I was curious to see what the Linux kernel did. It looks like it uses the same mtd->writesize + mtd->oobsize calculation (see nand_scan_tail() in nand_base.c). So either Linux has the same bug, or maybe there's something off in u-boot's nfc_layouts[]. I'll see if I can get one of my boards to trigger a KASAN report (I'm not sure if any of the NAND chips we use will hit the cases you're pointing out).
Sure, please let me know. I'm not 100% convinced that this is the correct fix - I know very little about this driver or NANDs in general. On the board I'm playing with (Marvell AC3-based) this patch prevents the driver from corrupting dlmalloc's data structures and causing u-boot to hang. But it could be that this is just papering over another root cause.
The NAND chip is a Micron MT29F4G08ABBEAH4, so nothing too unusual.
Hmm. Both boards I tried had sufficient space in writesize+oobsize. I'll see if I can find others with different nand chips.
Thanks! Best,
-- Pierre Bourdon delroth@gmail.com Software Engineer @ Zürich, Switzerland https://delroth.net/