
CCing the ARM custodian. Albert, what do you think of Alexey's comments below? Actually, having read it properly myself I think Alexey is confusing cache flushing with cache invalidation, I've left the CC in place though in case you have any thoughts on the matter.
On Fri, 2014-04-25 at 08:48 +0000, Alexey Brodkin wrote:
I thought a bit more about this situation and now I'm not that sure if we need to align addresses we pass to cache invalidate/flush functions.
Because IMHO drivers shouldn't care about specifics of particular platform or architecture. Otherwise we'll need to patch each and every driver only for cache invalidate/flush functions. I looked how these functions are used in other drivers and see that in most of cases no additional alignment precautions were implemented. People just pass start and end addresses.
In its turn platform and architecture provides cache invalidate/flush functions implement its functionality depending on hardware specifics.
For example on architectures that may only flush/invalidate with granularity of 1 cache line cache invalidate/flush functions make sure to start processing from the start of the cache line to which start address falls and end processing when cache line where end address falls is processed.
I may assume that there're architectures that automatically understand from which cache line to start and at which line to stop processing.
But if your architecture requires cache line aligned addresses to be used for start/end addresses you may look for examples in ARC (http://git.denx.de/?p=u-boot/u-boot-arc.git;a=blob;f=arch/arc/cpu/arc700/cac...),, MIPS (http://git.denx.de/?p=u-boot/u-boot-arc.git;a=blob;f=arch/mips/cpu/mips32/cp...), SH (http://git.denx.de/?p=u-boot/u-boot-arc.git;a=blob;f=arch/sh/cpu/sh4/cache.c),
and what's interesting even implementation you use have semi-proper start/end addresses handling - http://git.denx.de/?p=u-boot.git;a=blob;f=arch/arm/lib/cache-pl310.c
This is the driver for one particular ARM cache controller and not the one used for the SoC. In any case it does "proper" start/end handling only for cache flush operations, not cache invalidate.
Cache invalidate is a potentially destructive operation (throwing away data in the caches), having it operate on anything more than the precise region requested would be very surprising to almost anyone I think.
Here's your invalidation procedure:
/* invalidate memory from start to stop-1 */ void v7_outer_cache_inval_range(u32 start, u32 stop) { /* PL310 currently supports only 32 bytes cache line */ u32 pa, line_size = 32;
/* * If start address is not aligned to cache-line do not * invalidate the first cache-line */ if (start & (line_size - 1)) { printf("ERROR: %s - start address is not aligned - 0x%08x\n", __func__, start); /* move to next cache line */ start = (start + line_size - 1) & ~(line_size - 1); }
/* * If stop address is not aligned to cache-line do not * invalidate the last cache-line */ if (stop & (line_size - 1)) { printf("ERROR: %s - stop address is not aligned - 0x%08x\n", __func__, stop); /* align to the beginning of this cache line */ stop &= ~(line_size - 1); }
for (pa = start; pa < stop; pa = pa + line_size) writel(pa, &pl310->pl310_inv_line_pa);
pl310_cache_sync(); } ============
- I don't understand why start from the next cache line if start
address is not aligned to cache line boundary? I'd say that you want to invalidate cache line that contains unaligned start address. Otherwise first bytes won't be invalidated, right?
- Why do we throw _error_ message. I may understand if you emit
_warning_ message in case of debug build (with DEBUG defined). Well in current implementation (see 1) it could be error because behavior is really dangerous. But if you start from correct cache line only warning in debug mode makes sense (IMHO).
- Stop/end address in contrast might need to be extended depending on
HW implementation (see above comment).
And here's your flush procedure:
void v7_outer_cache_flush_range(u32 start, u32 stop) { /* PL310 currently supports only 32 bytes cache line */ u32 pa, line_size = 32;
/* * Align to the beginning of cache-line - this ensures that * the first 5 bits are 0 as required by PL310 TRM */ start &= ~(line_size - 1);
for (pa = start; pa < stop; pa = pa + line_size) writel(pa, &pl310->pl310_clean_inv_line_pa);
pl310_cache_sync(); } ===========
Which looks very correct to me. I'm wondering if there was a reason to have so different implementation of functions that do very similar things.
I think you are missing the important differences between a cache flush and a cache invalidate.
So at this point I would ask you to modify cache invalidate function for your architecture. This way you prevent mentioned issues with other drivers.
As I describe above, I don't think this would be at all wise.
I'm not seeing any others, in practice or by eye-balling the code.
- Why don't you squeeze all 3 patches in 1 and name it like "fix
alignment issues with caches on some platforms"? Basically with all 3 patches you fix one and only issue and application of any one of those 3 patches doesn't solve your problem, right?
These are the issues as I discovered them one by one. I can fold them if you like but doing them separately will aid bisection if one of them turns out to be wrong in some way. As you prefer.
Keeping in mind things written above I'd say that patches 2 & 3 are not needed at all, while patch 1 makes perfect sense and fixes an obvious issue.
Regards, Alexey