[U-Boot] Performance of the ARM's PL310 L2 cache.

Hi Aneesh,
I've enabled the L2 cache for the Trats board. Please find the results of my performance tests below. The test function, as well as the way I enabled L2, is attached to this e-mail.
I simply left the default configuration (number of ways, associativity) as it is in the Linux kernel's driver.
Results:

test_l2_cache() performed once:
L1    L2    TIME [seconds]
OFF   OFF   90.359
ON    OFF   62.236
ON    ON    61.687
L1 speedup: ~33 %
L2 speedup (when compared to L1): < 1 %

test_l2_cache() performed 5000 times:
L1    L2    TIME [seconds]
OFF   OFF   444.9
ON    OFF   320.55
ON    ON    287.21
L1 speedup: ~28 %
L2 speedup (when compared to L1): ~10 %

Normal U-Boot operation (from system startup until execution is passed to the kernel):
L1    L2    TIME [seconds]
OFF   OFF   1.813
ON    OFF   1.552
ON    ON    1.533
As one can observe, for normal U-Boot operation enabling L2 makes no significant difference.
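The percentages quoted with the tables appear to be the fraction of run time saved by each step. This is a minimal sketch of that arithmetic, assuming this is the formula that was used. The function names are illustrative and are not part of the attached test code.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage of run time saved when going from `before` to `after` seconds.
 * Assumed formula: (before - after) / before * 100. */
double time_saved_percent(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* Print the L1 gain and the additional L2 gain for one table of timings.
 * (Hypothetical helper, only for illustration.) */
void report(const char *label, double off_off, double l1_only, double l1_l2)
{
	printf("%s: L1 %.1f %%, L2 over L1 %.1f %%\n", label,
	       time_saved_percent(off_off, l1_only),
	       time_saved_percent(l1_only, l1_l2));
}
```

Called as report("boot", 1.813, 1.552, 1.533), this gives roughly 14 % for L1 and about 1 % for L2 on top of L1 during normal boot. For the single test run the same formula gives about 31 % for L1, slightly under the ~33 % quoted, so that figure may have been rounded or computed differently.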
Have you had similar results with OMAP? Do you do more configuration when enabling the L2 on OMAP?
The assembly code presented below (armv7/omap-common/lowlevel_init.S) puzzles me a bit...
ENTRY(set_pl310_ctrl_reg)
	LDR	r12, =0x102	@ Set PL310 control register - value in R0
	.word	0xe1600070	@ SMC #0 - hand assembled because -march=armv5
				@ call ROM Code API to set control register
ENDPROC(set_pl310_ctrl_reg)
Are there any special operations executed by the "ROM Code API"?
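For comparison with the SMC-based routine above, here is a minimal sketch of the generic PL310 enable sequence on a platform where the controller's registers can be written directly (the control register typically accepts writes only from secure accesses). It is not the attached Trats patch and says nothing about what the OMAP ROM code does. The register offsets are the ones given in the PL310 TRM. The accessor struct and the host-side model (struct pl310_io, sim_read, sim_write, pl310_sim_io) are hypothetical, added only so the sequence can be tried without hardware. A real driver would also set latencies, use barriers and clear pending interrupts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PL310 register offsets (per the ARM PL310 TRM). */
#define PL310_CTRL       0x100u      /* reg1_control: bit 0 enables the cache */
#define PL310_AUX_CTRL   0x104u      /* reg1_aux_control: bit 16 set = 16 ways */
#define PL310_INV_WAY    0x77Cu      /* reg7_inv_way: invalidate by way */
#define PL310_CTRL_EN    (1u << 0)
#define PL310_AUX_16WAY  (1u << 16)

/* Register accessors, so the sequence can target real MMIO or a model. */
struct pl310_io {
	uint32_t (*read)(uint32_t off);
	void (*write)(uint32_t off, uint32_t val);
};

/* Returns 1 if this call enabled the cache, 0 if it was already enabled. */
int pl310_enable(const struct pl310_io *io)
{
	uint32_t way_mask;

	assert(io != NULL && io->read != NULL && io->write != NULL);

	if (io->read(PL310_CTRL) & PL310_CTRL_EN)
		return 0;

	/* Keep whatever associativity AUX_CTRL already holds (default config). */
	way_mask = (io->read(PL310_AUX_CTRL) & PL310_AUX_16WAY) ? 0xFFFFu : 0xFFu;

	/* Invalidate all ways, then wait until the controller clears the bits. */
	io->write(PL310_INV_WAY, way_mask);
	while (io->read(PL310_INV_WAY) & way_mask)
		;

	io->write(PL310_CTRL, PL310_CTRL_EN);
	return 1;
}

/* Host-side register model (hypothetical, for experimentation only). */
static uint32_t sim_regs[0x1000 / 4];

static uint32_t sim_read(uint32_t off)
{
	return sim_regs[off / 4];
}

static void sim_write(uint32_t off, uint32_t val)
{
	/* The model completes invalidate-by-way instantly. */
	sim_regs[off / 4] = (off == PL310_INV_WAY) ? 0 : val;
}

const struct pl310_io pl310_sim_io = { sim_read, sim_write };
```

On a real board the two accessors would wrap reads and writes at the SoC's PL310 base address, which is not given here.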

On Fri, Aug 17, 2012 at 05:49:53PM +0200, Lukasz Majewski wrote:
> Hi Aneesh,
> I've enabled the L2 cache for the Trats board. Please find the results of my performance tests below. The test function, as well as the way I enabled L2, is attached to this e-mail.
> [snip]
> Have you had similar results with OMAP? Do you do more configuration when enabling the L2 on OMAP?
At least on some parts, it's similar here. The normal sequence of operations is loading a relatively small payload (kernel, maybe device tree) from storage and then booting it (which turns off L2 anyways). This is why I was willing to disable DCACHE as a USB workaround, the common use-case doesn't see a great deal of help from dcache being on.

Dear Tom Rini,
> On Fri, Aug 17, 2012 at 05:49:53PM +0200, Lukasz Majewski wrote:
>> Hi Aneesh,
>> I've enabled the L2 cache for the Trats board. Please find the results of my performance tests below. The test function, as well as the way I enabled L2, is attached to this e-mail.
>> [snip]
>> Have you had similar results with OMAP? Do you do more configuration when enabling the L2 on OMAP?
> At least on some parts, it's similar here. The normal sequence of operations is loading a relatively small payload (kernel, maybe device tree) from storage and then booting it (which turns off L2 anyways). This is why I was willing to disable DCACHE as a USB workaround, the common use-case doesn't see a great deal of help from dcache being on.
I saw a pretty significant performance boost with L1. I was planning to check L2 on mx6q, but I'm not sure it's worth it anymore.
Best regards, Marek Vasut

Hi Marek,
> Dear Tom Rini,
>> On Fri, Aug 17, 2012 at 05:49:53PM +0200, Lukasz Majewski wrote:
>>> Hi Aneesh,
>>> I've enabled the L2 cache for the Trats board. Please find the results of my performance tests below. The test function, as well as the way I enabled L2, is attached to this e-mail.
>>> [snip]
>>> Have you had similar results with OMAP? Do you do more configuration when enabling the L2 on OMAP?
>> At least on some parts, it's similar here. The normal sequence of operations is loading a relatively small payload (kernel, maybe device tree) from storage and then booting it (which turns off L2 anyways). This is why I was willing to disable DCACHE as a USB workaround, the common use-case doesn't see a great deal of help from dcache being on.
> I saw a pretty significant performance boost with L1. I was planning to check L2 on mx6q, but I'm not sure it's worth it anymore.
> Best regards, Marek Vasut
I agree that enabling L1 provides a significant performance boost. In our case, enabling L2 gives very little additional boost. Please try L2 on mx and share the results.