
On 8/2/21 4:44 PM, Tom Rini wrote:
On Mon, Aug 02, 2021 at 04:34:29PM +0200, Jan Kiszka wrote:
On 02.08.21 16:27, Tom Rini wrote:
On Mon, Aug 02, 2021 at 04:03:01PM +0200, Jan Kiszka wrote:
On 02.08.21 15:04, Tom Rini wrote:
On Mon, Aug 02, 2021 at 01:54:57PM +0200, Jan Kiszka wrote:
On 02.08.21 13:38, Marek Vasut wrote:
> On 8/2/21 1:36 PM, Jan Kiszka wrote:
>> On 02.08.21 12:48, Marek Vasut wrote:
>>> On 8/2/21 11:37 AM, Jan Kiszka wrote:
>>>> On 02.08.21 02:54, Marek Vasut wrote:
>>>>> On 7/29/21 6:58 PM, Tom Rini wrote:
>>>>>
>>>>> [...]
>>>>>
>>>>>>>> so when did rcar3 introduce something there that shouldn't be
>>>>>>>> reserved? And you had phrased this to me on IRC as about reserving
>>>>>>>> spot for ATAGS, and that not being needed of course on arm64. But
>>>>>>>> that's not what's going on. Perhaps the answer is that rcar3 needs
>>>>>>>> to introduce a board_lmb_reserve to free the normal arch one and
>>>>>>>> provide whatever more narrow scope it needs.
>>>>>>>
>>>>>>> Based on the commit message 2359fa7a878 ("arm: bootm: Disable LMB
>>>>>>> reservation for command line and board info on arm64"), this is
>>>>>>> about ATAGS and we really don't need to reserve those on arm64.
>>>>>>
>>>>>> Commit 2359fa7a878 disables the entire arch_lmb_reserve function on
>>>>>> aarch64, yes. I assumed when we had talked that it was a small area
>>>>>> being set aside and perhaps mis-recalled that ATAGS tended to live at
>>>>>> DDR_BASE + 0x800 or so.
>>>>>
>>>>> That arch_lmb_reserve() is responsible for reserving architecture
>>>>> specific memory. On arm32 it is ATAGS, on arm64 it is nothing as
>>>>> far as I can tell (and see below regarding the TLB).
>>>>>
>>>>>> This reservation is not at that spot, and a lot
>>>>>> more than that.
>>>>>
>>>>> Can you please elaborate on this "lot more" part ? Because as much
>>>>> as I studied the reservation code, the "lot more" was ATAGS on arm32
>>>>> and nothing on arm64.
>>>>
>>>> See my commit log.
>>>
>>> This is not particularly useful answer, considering the commit log says:
>>> "lot of crucial things", "Possibly more", "likely also on other boards"
>>> and other opaque statements. But really, the problem so far happens on
>>> one K3 board.
>>
>> "Such things are the page table (tlb_addr),
>> relocated U-Boot and the active stack."
>
> Please read the rest of my answer, I don't believe the TLB should be
> reserved at all. DTTO for the stack. If you think otherwise, please
> explain why.
Marek, I've provided you with three generic examples of active memory blocks that are relevant while U-Boot is allocating from and also filling that LMB. Please follow those cases and explain to us why they aren't active - or at least prove why they are specific to the k3 (for which I found no traces).
And stop following the TLB topic for now. That was only my first guess. The actual crash I'm seeing on my board comes from plain code overwriting. It could have been the TLB as well. It could also have been the stack. All of those become unprotected via your reservation removal.
Jan, one thing I didn't see before is, are you also using include/configs/ti_armv7_common.h in the end, like the K3 reference platforms, and if not are you setting bootm_size in your environment? I have one more idea on why this fails on your board but not Marek's. Thanks.
We are including that header but we didn't use DEFAULT_LINUX_BOOT_ENV, in fact. That left bootm_size undefined. Can you explain the impact?
I suspect the answer here is that Marek does not see this problem because on R-Car bootm_size is set to 0x10000000 and so no relocation of the device tree / kernel / initrd happens to overwrite the running U-Boot and blow everything up. If you don't revert this and do set bootm_size, does everything work? Marek, if you unset bootm_size, do you see failure? Thanks!
I currently do not see the error, even with unset bootm_size and Marek's patch back in. But fdt indeed moves down when adopting those settings. That makes sense for us anyway, I think our custom env values are rather for historic reasons, and one had an issue anyway (incorrect kernel alignment).
But at least we understand why I was able to see this, sometimes.
OK, thanks. Note that I'm not sure how I want to move forward here because a very frequent user/developer problem is "device tree relocated, everything crashed, why? oh, I'll just disable it (and lead to another problem down the line)".
In rcar with bootm_size unset it looks like this:
=> bdinfo
boot_params = 0x000000007beee240
DRAM bank = 0x0000000000000000
-> start = 0x0000000048000000
-> size = 0x0000000038000000
DRAM bank = 0x0000000000000001
-> start = 0x0000000500000000
-> size = 0x0000000040000000
DRAM bank = 0x0000000000000002
-> start = 0x0000000600000000
-> size = 0x0000000040000000
DRAM bank = 0x0000000000000003
-> start = 0x0000000700000000
-> size = 0x0000000040000000
flashstart = 0x0000000008000000
flashsize = 0x0000000004000000
flashoffset = 0x00000000000f5890
baudrate = 115200 bps
relocaddr = 0x000000007fee8000
reloc off = 0x000000007fee8000
Build = 64-bit
current eth = ethernet@e6800000
...
fdt_blob = 0x000000007beda0e0
new_fdt = 0x000000007beda0e0
fdt_size = 0x000000000000dcc0
multi_dtb_fit= 0x0000000049000000
lmb_dump_all:
 memory.cnt = 0x4
 memory[0] [0x48000000-0x7fffffff], 0x38000000 bytes flags: 0
 memory[1] [0x500000000-0x53fffffff], 0x40000000 bytes flags: 0
 memory[2] [0x600000000-0x63fffffff], 0x40000000 bytes flags: 0
 memory[3] [0x700000000-0x73fffffff], 0x40000000 bytes flags: 0
 reserved.cnt = 0x1
 reserved[0] [0x44100000-0x47efffff], 0x03e00000 bytes flags: 4
arch_number = 0x0000000000000000
TLB addr = 0x000000007fff0000
irq_sp = 0x000000007beda0d0
sp start = 0x000000007beda0d0
Early malloc usage: 1318 / 8000
...
## Loading kernel from FIT Image at 58000000 ...
   Using 'conf-1' configuration
   Trying 'kernel-1' kernel subimage
     Description: Linux kernel (Sat Jun 5 00:24:15 CEST 2021)
     Type: Kernel Image
     Compression: uncompressed
     Data Start: 0x58000154
     Data Size: 16662536 Bytes = 15.9 MiB
     Architecture: AArch64
     OS: Linux
     Load Address: 0x50200000
     Entry Point: 0x50200000
     Hash algo: crc32
     Hash value: 0655cd1f
   Verifying Hash Integrity ... crc32+ OK
## Loading fdt from FIT Image at 58000000 ...
   Using 'conf-1' configuration
   Trying 'fdt-1' fdt subimage
     Description: Flattened Device Tree blob (Sat Jun 5 00:24:15 CEST 2021)
     Type: Flat Device Tree
     Compression: uncompressed
     Data Start: 0x58fe42a4
     Data Size: 74686 Bytes = 72.9 KiB
     Architecture: AArch64
     Hash algo: crc32
     Hash value: 287b2438
   Verifying Hash Integrity ... crc32+ OK
   Booting using the fdt blob at 0x58fe42a4
   Loading Kernel Image
   Loading Device Tree to 000000007ffea000, end 000000007ffff3bd ... OK
Starting kernel ...
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x411fd073]
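
For reference, the address relationships in the log above can be checked with a few lines of arithmetic. This is only a sketch: it uses the values printed by bdinfo and by the "Loading Device Tree" line, and it makes no claim about whether the overlap is harmful, which is the open question in this thread. The bootm_size part rests on an assumption that is not confirmed here, namely that the relocation window starts at the DRAM bank 0 start address and ends bootm_size bytes above it.

```python
# Values copied from the bdinfo / bootm log above.
FDT_START, FDT_END = 0x7FFEA000, 0x7FFFF3BD  # "Loading Device Tree to ..., end ..."
TLB_ADDR = 0x7FFF0000                        # "TLB addr"
RELOCADDR = 0x7FEE8000                       # "relocaddr" (start of relocated U-Boot)
BANK0_START = 0x48000000                     # DRAM bank 0 "start"

# Does the start of the page table fall inside the relocated device tree?
tlb_inside_fdt = FDT_START <= TLB_ADDR <= FDT_END

# Is the relocated device tree placed above the start of the relocated U-Boot?
fdt_above_relocaddr = FDT_START > RELOCADDR

# ASSUMPTION (not confirmed in this thread): with bootm_size set, relocation
# targets are limited to [BANK0_START, BANK0_START + bootm_size).
BOOTM_SIZE = 0x10000000                      # R-Car value quoted earlier in the thread
bootm_ceiling = BANK0_START + BOOTM_SIZE
ceiling_below_relocaddr = bootm_ceiling <= RELOCADDR

print("TLB addr inside relocated FDT range:", tlb_inside_fdt)
print("FDT placed above relocaddr:", fdt_above_relocaddr)
print("assumed bootm ceiling:", hex(bootm_ceiling))
print("assumed ceiling below relocaddr:", ceiling_below_relocaddr)
```

Under that assumption, a bootm_size of 0x10000000 would keep relocation targets well below relocaddr, which is consistent with the explanation given above for why the problem does not show up when bootm_size is set.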