
Anton Vorontsov wrote:
On Tue, Mar 17, 2009 at 12:09:31PM -0500, Scott Wood wrote:
This board currently sets DBAT6 to cover all of the final 256MiB of address space; however, not all of this space is covered by a device. In particular, flash sits at 0xfe000000-0xfe7fffff, and nothing is mapped at the far end of the address space.
In zlib, there is a loop that references p[-1] if p is non-NULL. Under some circumstances, this leads to the CPU speculatively loading from 0xfffffff8 if p is NULL. This leads to a machine check.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Note that there are likely other boards with the same issue.
Wow, I was actually chasing this (I think) bug for some time.
The effect of this bug was quite weird: some kernels didn't boot, and the only difference in the kernel image was... the build date (i.e. data in linux_banner and init_uts_ns symbols).
I suspected the decompression code (what else could it be?), but I didn't manage to track it down to a failing instruction, as the failing kernel was booting *OK* with BDI-2000 attached. Heh.
I wonder how you tracked it down to zlib code and a particular loop, please share the technique. ;-)
I changed the kernel's decompression address to 0x1000 so that the exception vectors don't get overwritten, and looked at the machine check dump. That pointed to the relevant part of the zlib code, at which point I used print statements to figure out what was going on, combined with arbiter registers after reboot which pointed out 0xfffffff8 as the offending address.
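To illustrate why 0xfffffff8 shows up, here is a minimal sketch of the address arithmetic. It is not code from the patch or from zlib. The helper names are hypothetical, the 8-byte step back from NULL is inferred from the reported address rather than stated anywhere in this thread, and flash is treated as the only device behind the BAT purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Ranges taken from the patch description. */
#define BAT_START   0xf0000000u  /* final 256MiB of the 32-bit address space */
#define BAT_END     0xffffffffu
#define FLASH_START 0xfe000000u
#define FLASH_END   0xfe7fffffu

/* Hypothetical helper: address reached by stepping `offset` bytes back
 * from `base`. Unsigned arithmetic wraps modulo 2^32. */
static uint32_t addr_before(uint32_t base, uint32_t offset)
{
    return base - offset;
}

/* Inclusive range check. */
static bool in_range(uint32_t addr, uint32_t start, uint32_t end)
{
    assert(start <= end);
    return addr >= start && addr <= end;
}

/* True when the BAT translates the address but no device backs it.
 * Assumption: flash is the only device in the BAT window. */
static bool mapped_but_unbacked(uint32_t addr)
{
    return in_range(addr, BAT_START, BAT_END) &&
           !in_range(addr, FLASH_START, FLASH_END);
}
```

With these definitions, addr_before(0, 8) wraps to 0xfffffff8, and mapped_but_unbacked() is true for that address: the BAT translates it, but nothing answers on the bus, which matches the machine check described above.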
-Scott