
On Thu, Aug 05, 2021 at 11:52:05PM +0200, Marek Vasut wrote:
On 8/2/21 4:44 PM, Tom Rini wrote:
On Mon, Aug 02, 2021 at 04:34:29PM +0200, Jan Kiszka wrote:
On 02.08.21 16:27, Tom Rini wrote:
On Mon, Aug 02, 2021 at 04:03:01PM +0200, Jan Kiszka wrote:
On 02.08.21 15:04, Tom Rini wrote:
On Mon, Aug 02, 2021 at 01:54:57PM +0200, Jan Kiszka wrote:
> On 02.08.21 13:38, Marek Vasut wrote:
> > On 8/2/21 1:36 PM, Jan Kiszka wrote:
> > > On 02.08.21 12:48, Marek Vasut wrote:
> > > > On 8/2/21 11:37 AM, Jan Kiszka wrote:
> > > > > On 02.08.21 02:54, Marek Vasut wrote:
> > > > > > On 7/29/21 6:58 PM, Tom Rini wrote:
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > > > > so when did rcar3 introduce something there that shouldn't be
> > > > > > > > > reserved? And you had phrased this to me on IRC as about
> > > > > > > > > reserving a spot for ATAGS, and that not being needed of
> > > > > > > > > course on arm64. But that's not what's going on. Perhaps the
> > > > > > > > > answer is that rcar3 needs to introduce a board_lmb_reserve
> > > > > > > > > to free the normal arch one and provide whatever more narrow
> > > > > > > > > scope it needs.
> > > > > > > > >
> > > > > > > > Based on the commit message 2359fa7a878 ("arm: bootm: Disable LMB
> > > > > > > > reservation for command line and board info on arm64"), this is
> > > > > > > > about ATAGS and we really don't need to reserve those on arm64.
> > > > > > > >
> > > > > > > Commit 2359fa7a878 disables the entire arch_lmb_reserve function on
> > > > > > > aarch64, yes. I assumed when we had talked that it was a small area
> > > > > > > being set aside and perhaps mis-recalled that ATAGS tended to live at
> > > > > > > DDR_BASE + 0x800 or so.
> > > > > >
> > > > > > That arch_lmb_reserve() is responsible for reserving architecture
> > > > > > specific memory. On arm32 it is ATAGS, on arm64 it is nothing as
> > > > > > far as I can tell (and see below regarding the TLB).
> > > > > >
> > > > > > > This reservation is not at that spot, and a lot
> > > > > > > more than that.
> > > > > >
> > > > > > Can you please elaborate on this "lot more" part? Because as much
> > > > > > as I studied the reservation code, the "lot more" was ATAGS on
> > > > > > arm32 and nothing on arm64.
> > > > >
> > > > > See my commit log.
> > > >
> > > > This is not a particularly useful answer, considering the commit log says:
> > > > "lot of crucial things", "Possibly more", "likely also on other boards"
> > > > and other opaque statements. But really, the problem so far happens on
> > > > one K3 board.
> > > >
> > > "Such things are the page table (tlb_addr),
> > > relocated U-Boot and the active stack."
> >
> > Please read the rest of my answer, I don't believe the TLB should be
> > reserved at all. Ditto for the stack. If you think otherwise, please
> > explain why.
>
> Marek, I've provided you with three generic examples of active memory
> blocks that are relevant while U-Boot is allocating from and also
> filling that LMB. Please follow those cases and explain to us why they
> aren't active - or at least prove why they are specific to the k3 (for
> which I found no traces).
>
> And stop following the TLB topic for now. That was only my first guess.
> The actual crash I'm seeing on my board comes from plain code
> overwriting. It could have been TLB as well. It could also have been the
> stack. All those become unprotected via your reservation removal.
Jan, one thing I didn't see before is, are you also using include/configs/ti_armv7_common.h in the end, like the K3 reference platforms, and if not are you setting bootm_size in your environment? I have one more idea on why this fails on your board but not Marek's. Thanks.
We are including that header but we didn't use DEFAULT_LINUX_BOOT_ENV, in fact. That left bootm_size undefined. Can you explain the impact?
I suspect the answer here is that Marek does not see this problem because on R-Car bootm_size is set to 0x10000000 and so no relocation of the device tree / kernel / initrd happens to overwrite the running U-Boot and blow everything up. If you don't revert this, and do set bootm_size does everything work? Marek, if you unset bootm_size, do you see failure? Thanks!
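The bootm_size argument above can be put into numbers. What follows is only a simplified model, not U-Boot's actual code: it assumes the relocation ceiling is bootm_low + bootm_size, that bootm_low defaults to the start of DRAM bank 0, and that an unset bootm_size means "the rest of bank 0". The DRAM and relocaddr values are the ones from the R-Car bdinfo output later in this thread; the function names are made up for illustration.

```python
# Simplified model of the bootm relocation ceiling (an assumption for
# illustration, not a copy of U-Boot's logic).
DRAM_BANK0_START = 0x48000000   # from the R-Car bdinfo output in this thread
DRAM_BANK0_SIZE = 0x38000000    # from the R-Car bdinfo output in this thread
RELOCADDR = 0x7FEE8000          # where the relocated U-Boot starts
BOOTM_SIZE_RCAR = 0x10000000    # value quoted for R-Car in this thread


def relocation_ceiling(bootm_low, bootm_size=None):
    """Highest address (exclusive) that relocation may use in this model."""
    if bootm_size is None:
        # Assumption: unset bootm_size means "up to the end of DRAM bank 0".
        bootm_size = DRAM_BANK0_START + DRAM_BANK0_SIZE - bootm_low
    return bootm_low + bootm_size


def may_reach_uboot(ceiling, relocaddr=RELOCADDR):
    """True if memory at or above relocaddr is available for relocation."""
    return ceiling > relocaddr


with_limit = relocation_ceiling(DRAM_BANK0_START, BOOTM_SIZE_RCAR)
without_limit = relocation_ceiling(DRAM_BANK0_START)

print(hex(with_limit), may_reach_uboot(with_limit))
print(hex(without_limit), may_reach_uboot(without_limit))
```

Under these assumptions, the ceiling with bootm_size set stays well below relocaddr, while without it the ceiling is the top of bank 0, which is above the running U-Boot.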
I currently do not see the error, even with unset bootm_size and Marek's patch back in. But fdt indeed moves down when adopting those settings. That makes sense for us anyway; I think our custom env values are there mostly for historic reasons, and one of them had an issue anyway (incorrect kernel alignment).
But at least we understand why I was able to see this, sometimes.
OK, thanks. Note that I'm not sure how I want to move forward here because a very frequent user/developer problem is "device tree relocated, everything crashed, why? oh, I'll just disable it (and lead to another problem down the line)".
On rcar with bootm_size unset it looks like this:
=> bdinfo
boot_params = 0x000000007beee240
DRAM bank = 0x0000000000000000
-> start = 0x0000000048000000
-> size = 0x0000000038000000
DRAM bank = 0x0000000000000001
-> start = 0x0000000500000000
-> size = 0x0000000040000000
DRAM bank = 0x0000000000000002
-> start = 0x0000000600000000
-> size = 0x0000000040000000
DRAM bank = 0x0000000000000003
-> start = 0x0000000700000000
-> size = 0x0000000040000000
flashstart = 0x0000000008000000
flashsize = 0x0000000004000000
flashoffset = 0x00000000000f5890
baudrate = 115200 bps
relocaddr = 0x000000007fee8000
reloc off = 0x000000007fee8000
Build = 64-bit
current eth = ethernet@e6800000
...
fdt_blob = 0x000000007beda0e0
new_fdt = 0x000000007beda0e0
fdt_size = 0x000000000000dcc0
multi_dtb_fit= 0x0000000049000000
lmb_dump_all:
 memory.cnt = 0x4
 memory[0] [0x48000000-0x7fffffff], 0x38000000 bytes flags: 0
 memory[1] [0x500000000-0x53fffffff], 0x40000000 bytes flags: 0
 memory[2] [0x600000000-0x63fffffff], 0x40000000 bytes flags: 0
 memory[3] [0x700000000-0x73fffffff], 0x40000000 bytes flags: 0
 reserved.cnt = 0x1
 reserved[0] [0x44100000-0x47efffff], 0x03e00000 bytes flags: 4
arch_number = 0x0000000000000000
TLB addr = 0x000000007fff0000
irq_sp = 0x000000007beda0d0
sp start = 0x000000007beda0d0
Early malloc usage: 1318 / 8000
...
## Loading kernel from FIT Image at 58000000 ...
   Using 'conf-1' configuration
   Trying 'kernel-1' kernel subimage
     Description: Linux kernel (Sat Jun 5 00:24:15 CEST 2021)
     Type: Kernel Image
     Compression: uncompressed
     Data Start: 0x58000154
     Data Size: 16662536 Bytes = 15.9 MiB
     Architecture: AArch64
     OS: Linux
     Load Address: 0x50200000
     Entry Point: 0x50200000
     Hash algo: crc32
     Hash value: 0655cd1f
   Verifying Hash Integrity ... crc32+ OK
## Loading fdt from FIT Image at 58000000 ...
   Using 'conf-1' configuration
   Trying 'fdt-1' fdt subimage
     Description: Flattened Device Tree blob (Sat Jun 5 00:24:15 CEST 2021)
     Type: Flat Device Tree
     Compression: uncompressed
     Data Start: 0x58fe42a4
     Data Size: 74686 Bytes = 72.9 KiB
     Architecture: AArch64
     Hash algo: crc32
     Hash value: 287b2438
   Verifying Hash Integrity ... crc32+ OK
   Booting using the fdt blob at 0x58fe42a4
   Loading Kernel Image
   Loading Device Tree to 000000007ffea000, end 000000007ffff3bd ... OK
OK, I think we can say it's likely that in your case we're relocating the start of the device tree just a bit past where U-Boot is running. A bit of quick math says there's around 1 MiB between relocaddr for U-Boot and the start of the device tree relocation address.
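For reference, the quick math can be redone from the numbers in the log above. This is just a sketch of the arithmetic; the constants are copied from the bdinfo output and the "Loading Device Tree to ..." line, and the variable names are made up. The last check only compares addresses numerically; whether the memory at TLB addr is still in use at that point is not something the log tells us.

```python
# Addresses copied from the R-Car log quoted above.
RELOCADDR = 0x7FEE8000   # "relocaddr" from bdinfo
TLB_ADDR = 0x7FFF0000    # "TLB addr" from bdinfo
FDT_START = 0x7FFEA000   # "Loading Device Tree to ..."
FDT_END = 0x7FFFF3BD     # "... end ..."

# Distance from the start of the relocated U-Boot to the relocated FDT.
gap = FDT_START - RELOCADDR
gap_mib = gap / (1024 * 1024)

# Purely numeric check: does TLB addr fall inside the relocated FDT range?
tlb_inside_fdt = FDT_START <= TLB_ADDR <= FDT_END

print(f"gap = {gap:#x} bytes ({gap_mib:.3f} MiB)")
print(f"TLB addr inside relocated FDT range: {tlb_inside_fdt}")
```

So the relocated device tree starts 0x102000 bytes above relocaddr, which is the "around 1 MiB" figure, and the TLB addr value reported by bdinfo lies numerically within the range the device tree was copied to.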