
Hi Christian,
On Thu, Apr 12, 2018 at 4:07 PM, Christian Gmeiner <christian.gmeiner@gmail.com> wrote:
Fixes performance related issue when running vxWorks 5/7 images.
nits: vxWorks -> VxWorks
The overall memory performance (L1, L2 cache and RAM) was measured with Bandwidth [0].
Without this patch we get the following numbers:
- sequential 128-bit reads: ~5.2 GB/s
- sequential 128-bit copy: ~2.1 GB/s
- random 32-bit writes: ~1.2 GB/s
With this patch we get the following numbers:
- sequential 128-bit reads: ~18.0 GB/s
- sequential 128-bit copy: ~9.5 GB/s
- random 32-bit writes: ~5.0 GB/s
[0] https://zsmith.co/bandwidth.html
v1 -> v2:
- incorporate feedback from Bin Meng
This changelog should not appear in the commit message.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
 arch/x86/cpu/queensbay/Makefile |  2 +-
 arch/x86/cpu/queensbay/cpu.c    | 58 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/cpu/queensbay/cpu.c

diff --git a/arch/x86/cpu/queensbay/Makefile b/arch/x86/cpu/queensbay/Makefile
index c0681995bd..3dd23465d4 100644
--- a/arch/x86/cpu/queensbay/Makefile
+++ b/arch/x86/cpu/queensbay/Makefile
@@ -5,4 +5,4 @@
 #

 obj-y += fsp_configs.o irq.o
-obj-y += tnc.o
+obj-y += tnc.o cpu.o
diff --git a/arch/x86/cpu/queensbay/cpu.c b/arch/x86/cpu/queensbay/cpu.c
new file mode 100644
index 0000000000..805a94cc27
--- /dev/null
+++ b/arch/x86/cpu/queensbay/cpu.c
@@ -0,0 +1,58 @@
+/*
+ * Copyright (C) 2018, Bachmann electronic GmbH
+ *
+ * SPDX-License-Identifier: GPL-2.0+
+ */
+
+#include <common.h>
+#include <cpu.h>
+#include <dm.h>
+#include <asm/cpu.h>
+#include <asm/cpu_x86.h>
+#include <asm/msr.h>
+
+static void set_max_freq(void)
+{
+	msr_t msr;
+
+	/* Enable enhanced speed step */
+	msr = msr_read(MSR_IA32_MISC_ENABLES);
+	msr.lo |= (1 << 16);
+	msr_write(MSR_IA32_MISC_ENABLES, msr);
+
+	/* Set new performance state */
+	msr = msr_read(MSR_IA32_PERF_CTL);
+	msr.lo = 0x101f;
+	msr_write(MSR_IA32_PERF_CTL, msr);
+}
I tried to find documentation that describes the performance state values of the TunnelCreek processor, but in vain. However, after reading the doc, I have a question here:
Enhanced SpeedStep technology is disabled by the processor after power-on, which means we don't need to set the performance state (P-state) via MSR_IA32_PERF_CTL, and the processor itself should run at its maximum base frequency. So I believe the whole set_max_freq() is not needed. Can you clarify this?
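To make the question concrete, here is a minimal standalone sketch (plain C, not U-Boot code) of the two bit operations that set_max_freq() performs. It assumes the Intel SDM layout, where bit 16 of IA32_MISC_ENABLE is the Enhanced SpeedStep enable bit and bits 15:0 of IA32_PERF_CTL hold the target performance state. The helper names and the use of a single 64-bit value instead of msr_t are my own and only for illustration. Note that the patch assigns msr.lo = 0x101f, which overwrites all of the low 32 bits, while the sketch replaces only the low 16 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 16 of IA32_MISC_ENABLE: Enhanced Intel SpeedStep Technology enable */
#define MISC_ENABLE_EIST	(1ULL << 16)

/* Mask for the target performance state field of IA32_PERF_CTL (bits 15:0) */
#define PERF_CTL_TARGET_MASK	0xffffULL

/* Hypothetical helper: true if the EIST enable bit is set in the MSR value */
static inline bool eist_enabled(uint64_t misc_enable)
{
	return (misc_enable & MISC_ENABLE_EIST) != 0;
}

/*
 * Hypothetical helper: replace only the target P-state field of an
 * IA32_PERF_CTL value and keep every other bit as it was read.
 */
static inline uint64_t perf_ctl_set_target(uint64_t perf_ctl, uint16_t target)
{
	return (perf_ctl & ~PERF_CTL_TARGET_MASK) | target;
}
```

If the value read back from MSR_IA32_MISC_ENABLES on your board already has bit 16 clear after power-on, that would match what the doc says, and it would help to know why writing the P-state still changes the measured bandwidth.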
+
+static int cpu_x86_tunnelcreek_probe(struct udevice *dev)
+{
+	if (!ll_boot_init())
+		return 0;
+	debug("Init TunnelCreek core\n");
+
+	/* Set core to max frequency ratio */
+	set_max_freq();
+
+	return 0;
+}
+
+static const struct cpu_ops cpu_x86_tunnelcreek_ops = {
+	.get_desc = cpu_x86_get_desc,
+	.get_count = cpu_x86_get_count,
+};
+
+static const struct udevice_id cpu_x86_tunnelcreek_ids[] = {
+	{ .compatible = "intel,tunnelcreek-cpu" },
+	{ }
+};
+
+U_BOOT_DRIVER(cpu_x86_tunnelcreek_drv) = {
+	.name = "cpu_x86_tunnelcreek",
+	.id = UCLASS_CPU,
+	.of_match = cpu_x86_tunnelcreek_ids,
+	.bind = cpu_x86_bind,
+	.probe = cpu_x86_tunnelcreek_probe,
+	.ops = &cpu_x86_tunnelcreek_ops,
+};
Regards,
Bin