
Hi Albert, On Mon, Oct 19, 2015 at 01:48:25PM +0200, Albert ARIBAUD wrote:
Hello Peng,
(cutting a bit through the previous mails quoting)
This, in turn, leads to new questions:
- How is this PSCI code put in place? Is it bundled with the image,
with a specificy copy routine which puts it in place then locks the
Yeah.
memory range against writing? Or is it loaded in place by some other agent than U-Boot, and how does this agent know what to put where?
In arch/arm/cpu/armv7/virt-v7.c, relocate_secure_section() does this job.
CONFIG_ARMV7_SECURE_BASE is the location that secure code will be copied to. To ARMV7, PSCI code is bundled with the uboot image.
Hmm, the 'relocate' part of the function name is a somewhat non-optimal choice, since the routine just copies data without modifying it.
Looking at relocate_secure_section(), I see that it it is called long after the U-Boot code was relocated -- actually, it is called during bootm.
oh. You remind me, thanks. There maybe something wrong with my previous analysis.
I am not the one who finds the issue, just my understanding. The data abort may be triggered by line 104 or line 106. 96 fixloop: 97 ldmia r2!, {r0-r1} /* (r0,r1) <- (SRC location,fixup) */ 98 and r1, r1, #0xff 99 cmp r1, #23 /* relative fixup? */ 100 bne fixnext 101 102 /* relative fix: increase location by offset */ 103 add r0, r0, r4 104 ldr r1, [r0] 105 add r1, r1, r4 106 str r1, [r0]
At line 104 or 106, the value of r0 may be an address that can not be accessed which trigger data abort.
Frank, am I right?
I'll set up one board to test this.
This means that for any one of the other boards around that seem to use PSCI, either "their" PSCI code does not have any relocations (which I doubt), or it has but they don't cause any data abort when done by the relocate_code() routine.
There maybe something wrong from me. See above.
So what's the difference between those and your board? My bet is that their CONFIG_ARMV7_SECURE_BASE is not write-protected at the time the relocate_code() routine executes (whereas on your board it is), and will be unlocked in some way by the time relocate_secure_section() executes.
I'm not saying that the correct solution would be to enable write access to CONFIG_ARMV7_SECURE_BASE before running relocate_code(), because doing that would be obviously wrong: relocate_code() would try to fix code which is not there yet.
Yeah. This is true, relocate_code may fix code which is not there.
I'm just saying that's what actually happens, and if I am right, then it confirms that the correct fix is to not let relocations to the secure stay in the U-Boot ELF, or better yet, to not let them happen to boot [pun half-intended].
They do not happen to boot. My bad. See above.
CONFIG_ARMV7_SECURE_BASE is the location that secure code will be copied to. To ARMV7, PSCI code is bundled with the uboot image.
- Does this code and the relocatable image refer to symbols of one
another,
Yeah. There is asm function reference. You can see arch/arm/cpu/armv7/psci.S
Looking at this file, I suspect that actually, PSCI is called through a software interrupt / exception vector, not through a function name, and the only symbolic relationship is via __secure_start and __secure_end -- and those must (and will) be correctly relocated along with the U-Boot image.
or do they interact through an IPC independent of their respective location (in which case, one wonders why they are built into a single ELF file)?
This comes a question, why PSCI framework intergated into U-Boot? I have no idea.
There could have been alternative designs, indeed. Here is the design now, so let's handle it.
More generally... which target do you see this problem on? Reproducing the build myself would save both you and me some time, I guess. :)
Build target: mx7dsabresd
Needs to apply the three patches: https://patchwork.ozlabs.org/patch/527040/ https://patchwork.ozlabs.org/patch/527038/ https://patchwork.ozlabs.org/patch/527039/
There is a small difference from my previous log, since my previous log use 0x180000 as the secure address, but the patches use 0x900000 as the secure address. But sure there will be relocation entry there.
Thanks. I'll try and play with compiler / linker options, but from my own experience, this can be tricky. Meanwhile, I suggest you for for the not-too-quick-and-not-too-dirty linker script solution.
Do you need these relocation records? If not, then just append the '*(.rel._secure*) input section spec (with a suitable comment) to the already existing DISCARD output section. By the way, doesn't this linker script change alone fix the data abort?
Yeah. Since the relocation entry for secure section will be stored in rel.secure section. And this section will not be touched in relocate.S
Ok, so that's our first clean, linker-script-based, solution confirmed.
Still, if there *are* relocation records emitted, that's because the toolchain considered them necessary -- IOW, it was (wrongly) told the PSCI code must be relocatable [or the code really is relocatable and we just haven't not hit a bug about this yet].
The PSCI code is bundled with the u-boot image. But it's running address is not same with u-boot. In my example, u-boot is compiled with starting address 0x87800000, but to PSCI, it's running address is 0x180000. relocate_secure_section is just copy the psci code from uboot to 0x180000.
compliation address is same with running address, to PSCI part.
(I understand that. My question was not "why is PSCI built for its run-time address?" but "Since PSCI is built for its run-time address, and since there is little dependency on how come it is built as relocatable?")
uboot will not effect after switch to kernel, and its image will be overriden by kernel. But PSCI will always effect during kernel lifecyle. Maybe built PSCI part non-relocatable is an alternative choice.
Testing every single relocation record target against __image_copy_start and __image_copy_end would be rather costly in the tight loop of the relocation routine, which should run as fast as possible.
There is at least one fix to your problem which removes the out-of- bounds relocation records, and quite probably one better fix which prevents generating them in the first place.
Therefore, instead of the currently proposed patch which increases the size (very slightly) and boot time (statistically in O(n) proportion) of the image, I prefer one of the two alternative solutions which decrease the image size (by removing useless relocation records) and leave the boot time unchanged (since the fix is at build time only).
Agree.
OK. So, for now, I suggest that you:
- resubmit v2 of this series with patch 1/3 reworked into a linker
script change rather than a relocate routine change (and collect the secure relocation records input sections into the DISCARD output section rather than into a purposeless non-discarded section);
- make sure that you CC: the maintainers for the other boards which use
PSCI and explicitly ask them to test the change on their boards and give their "Tested-by". It seems like the list of candidates for testing includes jetson-tk1, ls1021aqds, ls1021atwr and all sun[678]i boards, but I am by no means a PSCI specialist, so anyone feel free to correct me about this list.
Regards, Peng.
Amicalement,
Albert.
--