[U-Boot] [PATCH v2 00/20] x86: Add CPU uclass and multi-core support for Minnowboard MAX

This series adds a new CPU uclass which is intended to be useful on any architecture. So far it has a very simple interface and a command to show CPU details.
This series also introduces multi-core init for x86. It is implemented and enabled on Minnowboard MAX, a single/dual-core Atom board. The CPU uclass is implemented for x86 and the Simple Firmware Interface provides these details to the kernel, since ACPI is not yet available.
With these changes Minnowboard MAX can boot into Linux with both cores enabled.
This series is available at u-boot-x86 branch 'cpu-working'.
Changes in v2:
- Use capitals for the header guard
- Change 'print' to 'Print' in comment
- Correct bugs in number output
- Change header guard to capital letters
- Change get_info() in function comment to cpu_get_info()
- Rename CONFIG_SFI to CONFIG_GENERATE_SFI_TABLE and move within Kconfig
- Correct Kconfig help indentation and text
- Drop SFI_BASE config option
- Always build sfi.o
- Use SFI_TABLE_MAX_ENTRIES instead of 16 and ARRAY_SIZE()
- Make get_entry_start() static
- Use table_compute_checksum() to compute checksum
- Add a few blank lines
- Move patch to after the CPU uclass patch
- Drop the RTC table as it is not needed
- Move SFI calling code to write_tables()
- Remove IDLE table
- Remove SFI_SYST_SEARCH_BEGIN and SFI_SYST_SEARCH_END
- Move '__packed' to immediately after 'struct'
- Add SFI_DEV_TYPE_SD and convert to enum
- Remove #ifdef CONFIG_SFI from header file
- Move sfi.h header file to arch/x86/include/asm
- Remove unnecessary \t\n after mfence assembler instruction
Simon Glass (20):
  Fix comment nits in board_f.c
  dm: core: Add a function to bind a driver for a device tree node
  x86: Remove unwanted MMC debugging
  x86: Disable -Werror
  Move display_options functions to their own header
  Add print_freq() to display frequencies nicely
  dm: Implement a CPU uclass
  x86: Add support for the Simple Firmware Interface (SFI)
  Add a 'cpu' command to print CPU information
  x86: Add atomic operations
  x86: Add defines for fixed MTRRs
  x86: Add an mfence macro
  x86: Store the GDT pointer in global_data
  x86: Provide access to the IDT
  x86: Add multi-processor init
  x86: Add functions to set and clear bits on MSRs
  x86: Allow CPUs to be set up after relocation
  x86: Add a CPU driver for baytrail
  x86: Tidy up the LAPIC init code
  x86: Enable multi-core init for Minnowboard MAX
 arch/x86/Kconfig | 39 +++
 arch/x86/cpu/Makefile | 2 +
 arch/x86/cpu/baytrail/Makefile | 1 +
 arch/x86/cpu/baytrail/cpu.c | 206 +++++++++++++
 arch/x86/cpu/baytrail/valleyview.c | 1 -
 arch/x86/cpu/config.mk | 2 +-
 arch/x86/cpu/cpu.c | 38 +++
 arch/x86/cpu/interrupts.c | 5 +
 arch/x86/cpu/ivybridge/model_206ax.c | 4 +-
 arch/x86/cpu/lapic.c | 20 +-
 arch/x86/cpu/mp_init.c | 507 +++++++++++++++++++++++++++++++
 arch/x86/cpu/sipi.S | 215 +++++++++++++
 arch/x86/dts/minnowmax.dts | 20 ++
 arch/x86/include/asm/arch-baytrail/msr.h | 30 ++
 arch/x86/include/asm/atomic.h | 115 +++++++
 arch/x86/include/asm/cpu.h | 19 ++
 arch/x86/include/asm/global_data.h | 1 +
 arch/x86/include/asm/interrupt.h | 2 +
 arch/x86/include/asm/lapic.h | 7 -
 arch/x86/include/asm/mp.h | 94 ++++++
 arch/x86/include/asm/msr.h | 19 ++
 arch/x86/include/asm/mtrr.h | 14 +
 arch/x86/include/asm/sfi.h | 137 +++++++++
 arch/x86/include/asm/sipi.h | 79 +++++
 arch/x86/include/asm/smm.h | 14 +
 arch/x86/include/asm/u-boot-x86.h | 2 +
 arch/x86/lib/Makefile | 1 +
 arch/x86/lib/sfi.c | 154 ++++++++++
 arch/x86/lib/tables.c | 5 +
 common/Kconfig | 8 +
 common/Makefile | 1 +
 common/board_f.c | 9 +-
 common/board_r.c | 2 +-
 common/cmd_cpu.c | 113 +++++++
 configs/minnowmax_defconfig | 4 +
 drivers/Kconfig | 2 +
 drivers/Makefile | 1 +
 drivers/core/lists.c | 9 +-
 drivers/cpu/Kconfig | 8 +
 drivers/cpu/Makefile | 7 +
 drivers/cpu/cpu-uclass.c | 61 ++++
 include/common.h | 16 +-
 include/cpu.h | 84 +++++
 include/display_options.h | 59 ++++
 include/dm/lists.h | 16 +
 include/dm/uclass-id.h | 1 +
 lib/display_options.c | 51 +++-
 47 files changed, 2151 insertions(+), 54 deletions(-)
 create mode 100644 arch/x86/cpu/baytrail/cpu.c
 create mode 100644 arch/x86/cpu/mp_init.c
 create mode 100644 arch/x86/cpu/sipi.S
 create mode 100644 arch/x86/include/asm/arch-baytrail/msr.h
 create mode 100644 arch/x86/include/asm/atomic.h
 create mode 100644 arch/x86/include/asm/mp.h
 create mode 100644 arch/x86/include/asm/sfi.h
 create mode 100644 arch/x86/include/asm/sipi.h
 create mode 100644 arch/x86/include/asm/smm.h
 create mode 100644 arch/x86/lib/sfi.c
 create mode 100644 common/cmd_cpu.c
 create mode 100644 drivers/cpu/Kconfig
 create mode 100644 drivers/cpu/Makefile
 create mode 100644 drivers/cpu/cpu-uclass.c
 create mode 100644 include/cpu.h
 create mode 100644 include/display_options.h

Try to make it a little clearer.
Signed-off-by: Simon Glass sjg@chromium.org Reviewed-by: Bin Meng bmeng.cn@gmail.com ---
Changes in v2: None
common/board_f.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/common/board_f.c b/common/board_f.c
index 322e070..fbbad1b 100644
--- a/common/board_f.c
+++ b/common/board_f.c
@@ -73,7 +73,7 @@ DECLARE_GLOBAL_DATA_PTR;
 #endif

 /*
- * sjg: IMO this code should be
+ * TODO(sjg@chromium.org): IMO this code should be
  * refactored to a single function, something like:
  *
  * void led_set_state(enum led_colour_t colour, int on);
@@ -300,7 +300,7 @@ __weak ulong board_get_usable_ram_top(ulong total_size)
 {
 #ifdef CONFIG_SYS_SDRAM_BASE
 	/*
-	 * Detect whether we have so much RAM it goes past the end of our
+	 * Detect whether we have so much RAM that it goes past the end of our
 	 * 32-bit address space. If so, clip the usable RAM so it doesn't.
 	 */
 	if (gd->ram_top < CONFIG_SYS_SDRAM_BASE)
@@ -507,7 +507,7 @@ static int reserve_global_data(void)
 static int reserve_fdt(void)
 {
 	/*
-	 * If the device tree is sitting immediate above our image then we
+	 * If the device tree is sitting immediately above our image then we
 	 * must relocate it. If it is embedded in the data section, then it
 	 * will be relocated with other data.
 	 */
@@ -535,7 +535,7 @@ static int reserve_stacks(void)
 	gd->start_addr_sp &= ~0xf;

 	/*
-	 * let the architecture specific code tailor gd->start_addr_sp and
+	 * let the architecture-specific code tailor gd->start_addr_sp and
 	 * gd->irq_sp
 	 */
 	return arch_reserve_stacks();
@@ -556,7 +556,6 @@ static int setup_board_part1(void)
 	/*
 	 * Save local variables to board info struct
 	 */
-
 	bd->bi_memstart = CONFIG_SYS_SDRAM_BASE;	/* start of memory */
 	bd->bi_memsize = gd->ram_size;			/* size in bytes */

On 28 April 2015 at 20:25, Simon Glass sjg@chromium.org wrote:
Try to make it a little clearer.
Signed-off-by: Simon Glass sjg@chromium.org Reviewed-by: Bin Meng bmeng.cn@gmail.com
Changes in v2: None
common/board_f.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
Applied to u-boot-x86.

Some device tree nodes do not have compatible strings but do require drivers. This is pretty rare, and somewhat unfortunate. Add a function to permit creation of a driver for any device tree node.
Signed-off-by: Simon Glass sjg@chromium.org ---
Changes in v2: None
drivers/core/lists.c | 9 ++++++++- include/dm/lists.h | 16 ++++++++++++++++ 2 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/drivers/core/lists.c b/drivers/core/lists.c
index 647e390..0c49d99 100644
--- a/drivers/core/lists.c
+++ b/drivers/core/lists.c
@@ -74,6 +74,13 @@ int lists_bind_drivers(struct udevice *parent, bool pre_reloc_only)
 int device_bind_driver(struct udevice *parent, const char *drv_name,
 		       const char *dev_name, struct udevice **devp)
 {
+	return device_bind_driver_to_node(parent, drv_name, dev_name, -1, devp);
+}
+
+int device_bind_driver_to_node(struct udevice *parent, const char *drv_name,
+			       const char *dev_name, int node,
+			       struct udevice **devp)
+{
 	struct driver *drv;
 	int ret;

@@ -82,7 +89,7 @@ int device_bind_driver(struct udevice *parent, const char *drv_name,
 		printf("Cannot find driver '%s'\n", drv_name);
 		return -ENOENT;
 	}
-	ret = device_bind(parent, drv, dev_name, NULL, -1, devp);
+	ret = device_bind(parent, drv, dev_name, NULL, node, devp);
 	if (ret) {
 		printf("Cannot create device named '%s' (err=%d)\n",
 		       dev_name, ret);
diff --git a/include/dm/lists.h b/include/dm/lists.h
index 1b50af9..61610e6 100644
--- a/include/dm/lists.h
+++ b/include/dm/lists.h
@@ -73,4 +73,20 @@ int lists_bind_fdt(struct udevice *parent, const void *blob, int offset,
 int device_bind_driver(struct udevice *parent, const char *drv_name,
 		       const char *dev_name, struct udevice **devp);

+/**
+ * device_bind_driver_to_node() - bind a device to a driver for a node
+ *
+ * This binds a new device to a driver for a given device tree node. This
+ * should only be needed if the node lacks a compatible string.
+ *
+ * @parent:	Parent device
+ * @drv_name:	Name of driver to attach to this parent
+ * @dev_name:	Name of the new device thus created
+ * @node:	Device tree node
+ * @devp:	Returns the newly bound device
+ */
+int device_bind_driver_to_node(struct udevice *parent, const char *drv_name,
+			       const char *dev_name, int node,
+			       struct udevice **devp);
+
 #endif
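The shape of this change (the old entry point becomes a thin wrapper that passes -1 as the node) can be modelled outside U-Boot. The following is only a sketch under simplifying assumptions: struct driver and struct udevice are cut-down stand-ins, and bind_driver(), bind_driver_to_node(), find_driver() and the driver table are hypothetical names, not the real driver model API.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-ins for U-Boot's structures (assumption, not the real ones) */
struct driver {
	const char *name;
};

struct udevice {
	const char *name;
	const struct driver *drv;
	int of_offset;		/* device tree node offset, -1 if none */
};

/* Hypothetical driver table used only by this sketch */
static const struct driver drivers[] = {
	{ "cpu_bus" },
	{ "demo" },
};

static const struct driver *find_driver(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(drivers) / sizeof(drivers[0]); i++) {
		if (!strcmp(drivers[i].name, name))
			return &drivers[i];
	}

	return NULL;
}

/* Look the driver up by name and record the node offset on the new device */
static int bind_driver_to_node(const char *drv_name, const char *dev_name,
			       int node, struct udevice *dev)
{
	const struct driver *drv = find_driver(drv_name);

	if (!drv)
		return -ENOENT;
	dev->name = dev_name;
	dev->drv = drv;
	dev->of_offset = node;

	return 0;
}

/* The older entry point simply delegates, with -1 meaning "no node" */
static int bind_driver(const char *drv_name, const char *dev_name,
		       struct udevice *dev)
{
	return bind_driver_to_node(drv_name, dev_name, -1, dev);
}
```

Keeping the old function as a wrapper means existing callers are unaffected, while new callers can attach a device to a node that has no compatible string.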

On 28 April 2015 at 20:25, Simon Glass sjg@chromium.org wrote:
Some device tree nodes do not have compatible strings but do require drivers. This is pretty rare, and somewhat unfortunate. Add a function to permit creation of a driver for any device tree node.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
drivers/core/lists.c | 9 ++++++++- include/dm/lists.h | 16 ++++++++++++++++ 2 files changed, 24 insertions(+), 1 deletion(-)
Applied to u-boot-x86.

This printf() should not have made it into the code.
Signed-off-by: Simon Glass sjg@chromium.org Reviewed-by: Bin Meng bmeng.cn@gmail.com ---
Changes in v2: None
arch/x86/cpu/baytrail/valleyview.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/x86/cpu/baytrail/valleyview.c b/arch/x86/cpu/baytrail/valleyview.c
index a3e837d..9915da5 100644
--- a/arch/x86/cpu/baytrail/valleyview.c
+++ b/arch/x86/cpu/baytrail/valleyview.c
@@ -16,7 +16,6 @@ static struct pci_device_id mmc_supported[] = {

 int cpu_mmc_init(bd_t *bis)
 {
-	printf("mmc init\n");
 	return pci_mmc_init("ValleyView SDHCI", mmc_supported,
 			    ARRAY_SIZE(mmc_supported));
 }

On 28 April 2015 at 20:25, Simon Glass sjg@chromium.org wrote:
This printf() should not have made it into the code.
Signed-off-by: Simon Glass sjg@chromium.org Reviewed-by: Bin Meng bmeng.cn@gmail.com
Changes in v2: None
arch/x86/cpu/baytrail/valleyview.c | 1 - 1 file changed, 1 deletion(-)
Applied to u-boot-x86.

This is annoying during development and serves no useful purpose since warnings are clearly displayed now that we are using Kbuild. Remove this option.
Signed-off-by: Simon Glass sjg@chromium.org Reviewed-by: Bin Meng bmeng.cn@gmail.com ---
Changes in v2: None
arch/x86/cpu/config.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/cpu/config.mk b/arch/x86/cpu/config.mk
index 84aeaf3..4c4d0c7 100644
--- a/arch/x86/cpu/config.mk
+++ b/arch/x86/cpu/config.mk
@@ -7,7 +7,7 @@

 CROSS_COMPILE ?= i386-linux-

-PLATFORM_CPPFLAGS += -D__I386__ -Werror
+PLATFORM_CPPFLAGS += -D__I386__

 # DO NOT MODIFY THE FOLLOWING UNLESS YOU REALLY KNOW WHAT YOU ARE DOING!
 LDPPFLAGS += -DRESET_SEG_START=0xffff0000

On 28 April 2015 at 20:25, Simon Glass sjg@chromium.org wrote:
This is annoying during development and serves no useful purpose since warnings are clearly displayed now that we are using Kbuild. Remove this option.
Signed-off-by: Simon Glass sjg@chromium.org Reviewed-by: Bin Meng bmeng.cn@gmail.com
Changes in v2: None
arch/x86/cpu/config.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Applied to u-boot-x86.

Before adding one more function, create a separate header to help reduce the size of common.h. Add the missing function comments and tidy up.
Signed-off-by: Simon Glass sjg@chromium.org ---
Changes in v2: - Use capitals for the header guard - Change 'print' to 'Print' in comment
include/common.h | 16 +--------------- include/display_options.h | 48 +++++++++++++++++++++++++++++++++++++++++++++++ lib/display_options.c | 13 ------------- 3 files changed, 49 insertions(+), 28 deletions(-) create mode 100644 include/display_options.h
diff --git a/include/common.h b/include/common.h index cde3474..d4d704a 100644 --- a/include/common.h +++ b/include/common.h @@ -192,22 +192,8 @@ int cpu_init(void);
/* */ phys_size_t initdram (int); -int display_options (void);
-/** - * print_size() - Print a size with a suffic - * - * print sizes as "xxx KiB", "xxx.y KiB", "xxx MiB", "xxx.y MiB", - * xxx GiB, xxx.y GiB, etc as needed; allow for optional trailing string - * (like "\n") - * - * @size: Size to print - * @suffix String to print after the size - */ -void print_size(uint64_t size, const char *suffix); - -int print_buffer(ulong addr, const void *data, uint width, uint count, - uint linelen); +#include <display_options.h>
/* common/main.c */ void main_loop (void); diff --git a/include/display_options.h b/include/display_options.h new file mode 100644 index 0000000..54bd41d --- /dev/null +++ b/include/display_options.h @@ -0,0 +1,48 @@ +/* + * Copyright (c) 2015 Google, Inc + * + * (C) Copyright 2000-2002 + * Wolfgang Denk, DENX Software Engineering, wd@denx.de. + * + * SPDX-License-Identifier: GPL-2.0+ + */ + +#ifndef __DISPLAY_OPTIONS_H +#define __DISPLAY_OPTIONS_H + +/** + * print_size() - Print a size with a suffix + * + * Print sizes as "xxx KiB", "xxx.y KiB", "xxx MiB", "xxx.y MiB", + * xxx GiB, xxx.y GiB, etc as needed; allow for optional trailing string + * (like "\n") + * + * @size: Size to print + * @suffix String to print after the size + */ +void print_size(uint64_t size, const char *suffix); + +/** + * print_buffer() - Print data buffer in hex and ascii form + * + * Data reads are buffered so that each memory address is only read once. + * This is useful when displaying the contents of volatile registers. + * + * @addr: Starting address to display at start of line + * @data: pointer to data buffer + * @width: data value width. May be 1, 2, or 4. + * @count: number of values to display + * @linelen: Number of values to print per line; specify 0 for default length + */ +int print_buffer(ulong addr, const void *data, uint width, uint count, + uint linelen); + +/** + * display_options() - display the version string / build tag + * + * This displays the U-Boot version string. If a build tag is available this + * is displayed also. + */ +int display_options(void); + +#endif diff --git a/lib/display_options.c b/lib/display_options.c index d5d17b2..3f32bcd 100644 --- a/lib/display_options.c +++ b/lib/display_options.c @@ -63,19 +63,6 @@ void print_size(uint64_t size, const char *s) printf (" %ciB%s", c, s); }
-/* - * Print data buffer in hex and ascii form to the terminal. - * - * data reads are buffered so that each memory address is only read once. - * Useful when displaying the contents of volatile registers. - * - * parameters: - * addr: Starting address to display at start of line - * data: pointer to data buffer - * width: data value width. May be 1, 2, or 4. - * count: number of values to display - * linelen: Number of values to print per line; specify 0 for default length - */ #define MAX_LINE_LENGTH_BYTES (64) #define DEFAULT_LINE_LENGTH_BYTES (16) int print_buffer(ulong addr, const void *data, uint width, uint count,

On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Before adding one more function, create a separate header to help reduce the size of common.h. Add the missing function comments and tidy up.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2:
- Use capitals for the header guard
- Change 'print' to 'Print' in comment
include/common.h | 16 +--------------- include/display_options.h | 48 +++++++++++++++++++++++++++++++++++++++++++++++ lib/display_options.c | 13 ------------- 3 files changed, 49 insertions(+), 28 deletions(-) create mode 100644 include/display_options.h
Reviewed-by: Bin Meng bmeng.cn@gmail.com

On 28 April 2015 at 22:42, Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Before adding one more function, create a separate header to help reduce the size of common.h. Add the missing function comments and tidy up.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2:
- Use capitals for the header guard
- Change 'print' to 'Print' in comment
include/common.h | 16 +--------------- include/display_options.h | 48 +++++++++++++++++++++++++++++++++++++++++++++++ lib/display_options.c | 13 ------------- 3 files changed, 49 insertions(+), 28 deletions(-) create mode 100644 include/display_options.h
Applied to u-boot-x86.

Add a function similar to print_size() that works for frequencies. It can handle from Hz to GHz.
Signed-off-by: Simon Glass sjg@chromium.org ---
Changes in v2: - Correct bugs in number output
include/display_options.h | 11 +++++++++++ lib/display_options.c | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 49 insertions(+)
diff --git a/include/display_options.h b/include/display_options.h
index 54bd41d..ac44c45 100644
--- a/include/display_options.h
+++ b/include/display_options.h
@@ -23,6 +23,17 @@ void print_size(uint64_t size, const char *suffix);

 /**
+ * print_freq() - Print a frequency with a suffix
+ *
+ * Print frequencies as "x.xx GHz", "xxx KHz", etc as needed; allow for
+ * optional trailing string (like "\n")
+ *
+ * @freq:	Frequency to print in Hz
+ * @suffix	String to print after the frequency
+ */
+void print_freq(uint64_t freq, const char *suffix);
+
+/**
  * print_buffer() - Print data buffer in hex and ascii form
  *
  * Data reads are buffered so that each memory address is only read once.
diff --git a/lib/display_options.c b/lib/display_options.c
index 3f32bcd..3a70e14 100644
--- a/lib/display_options.c
+++ b/lib/display_options.c
@@ -7,6 +7,7 @@

 #include <config.h>
 #include <common.h>
+#include <div64.h>
 #include <inttypes.h>
 #include <version.h>
 #include <linux/ctype.h>
@@ -22,6 +23,43 @@ int display_options (void)
 	return 0;
 }

+void print_freq(uint64_t freq, const char *s)
+{
+	unsigned long m = 0, n;
+	uint64_t f;
+	static const char names[] = {'G', 'M', 'K'};
+	unsigned long d = 1e9;
+	char c = 0;
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(names); i++, d /= 1000) {
+		if (freq >= d) {
+			c = names[i];
+			break;
+		}
+	}
+
+	if (!c) {
+		printf("%" PRIu64 " Hz%s", freq, s);
+		return;
+	}
+
+	f = do_div(freq, d);
+	n = freq;
+
+	/* If there's a remainder, show the first few digits */
+	if (f) {
+		m = f % d;
+		while (!(m % 10))
+			m /= 10;
+	}
+
+	printf("%lu", n);
+	if (m)
+		printf(".%ld", m);
+	printf(" %cHz%s", c, s);
+}
+
 void print_size(uint64_t size, const char *s)
 {
 	unsigned long m = 0, n;
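To make the fractional-digit behaviour of the posted print_freq() easier to check, here is a standalone sketch that mirrors its logic but writes into a buffer. It rests on some assumptions: format_freq() is a hypothetical name, do_div() is replaced by plain 64-bit division and modulo, and the output goes through snprintf() rather than U-Boot's printf().

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper mirroring the posted print_freq() logic.
 * Returns the snprintf() result for the formatted string.
 */
static int format_freq(char *buf, size_t size, uint64_t freq, const char *s)
{
	static const char names[] = {'G', 'M', 'K'};
	uint64_t d = 1000000000ULL;
	uint64_t n, m;
	char c = 0;
	size_t i;

	/* Pick the largest unit that does not exceed the frequency */
	for (i = 0; i < sizeof(names); i++, d /= 1000) {
		if (freq >= d) {
			c = names[i];
			break;
		}
	}

	/* Below 1 KHz: print the raw value in Hz */
	if (!c)
		return snprintf(buf, size, "%" PRIu64 " Hz%s", freq, s);

	n = freq / d;	/* whole part */
	m = freq % d;	/* remainder, stands in for do_div()'s return value */

	/* Strip trailing zeros from the remainder, as the posted code does */
	while (m && !(m % 10))
		m /= 10;

	if (m)
		return snprintf(buf, size, "%" PRIu64 ".%" PRIu64 " %cHz%s",
				n, m, c, s);

	return snprintf(buf, size, "%" PRIu64 " %cHz%s", n, c, s);
}
```

With this logic, 1234567 Hz formats as "1.234567 MHz", so every remaining digit is shown rather than only the first few. The remainder is also printed without zero padding, so 1050000000 Hz comes out as "1.5 GHz" in this sketch.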

Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Add a function similar to print_size() that works for frequencies. It can handle from Hz to GHz.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2:
- Correct bugs in number output
include/display_options.h | 11 +++++++++++ lib/display_options.c | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 49 insertions(+)
diff --git a/include/display_options.h b/include/display_options.h index 54bd41d..ac44c45 100644 --- a/include/display_options.h +++ b/include/display_options.h @@ -23,6 +23,17 @@ void print_size(uint64_t size, const char *suffix);
/**
- print_freq() - Print a frequency with a suffix
- Print frequencies as "x.xx GHz", "xxx KHz", etc as needed; allow for
- optional trailing string (like "\n")
- @freq: Frequency to print in Hz
- @suffix String to print after the frequency
- */
+void print_freq(uint64_t freq, const char *suffix);
+/**
- print_buffer() - Print data buffer in hex and ascii form
- Data reads are buffered so that each memory address is only read once.
diff --git a/lib/display_options.c b/lib/display_options.c index 3f32bcd..3a70e14 100644 --- a/lib/display_options.c +++ b/lib/display_options.c @@ -7,6 +7,7 @@
#include <config.h> #include <common.h> +#include <div64.h> #include <inttypes.h> #include <version.h> #include <linux/ctype.h> @@ -22,6 +23,43 @@ int display_options (void) return 0; }
+void print_freq(uint64_t freq, const char *s) +{
unsigned long m = 0, n;
uint64_t f;
static const char names[] = {'G', 'M', 'K'};
unsigned long d = 1e9;
char c = 0;
unsigned int i;
for (i = 0; i < ARRAY_SIZE(names); i++, d /= 1000) {
if (freq >= d) {
c = names[i];
break;
}
}
if (!c) {
printf("%" PRIu64 " Hz%s", freq, s);
return;
}
f = do_div(freq, d);
n = freq;
/* If there's a remainder, show the first few digits */
if (f) {
m = f % d;
while (!(m % 10))
m /= 10;
}
This 'first few digits' issue is not fixed. Do you intend to print all digits after the radix point? If so, we need to fix the comment to say that all digits will be printed.
printf("%lu", n);
if (m)
printf(".%ld", m);
printf(" %cHz%s", c, s);
+}
void print_size(uint64_t size, const char *s) { unsigned long m = 0, n; --
Regards, Bin

Hi Bin,
On 28 April 2015 at 22:56, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Add a function similar to print_size() that works for frequencies. It can handle from Hz to GHz.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2:
- Correct bugs in number output
include/display_options.h | 11 +++++++++++ lib/display_options.c | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 49 insertions(+)
diff --git a/include/display_options.h b/include/display_options.h index 54bd41d..ac44c45 100644 --- a/include/display_options.h +++ b/include/display_options.h @@ -23,6 +23,17 @@ void print_size(uint64_t size, const char *suffix);
/**
- print_freq() - Print a frequency with a suffix
- Print frequencies as "x.xx GHz", "xxx KHz", etc as needed; allow for
- optional trailing string (like "\n")
- @freq: Frequency to print in Hz
- @suffix String to print after the frequency
- */
+void print_freq(uint64_t freq, const char *suffix);
+/**
- print_buffer() - Print data buffer in hex and ascii form
- Data reads are buffered so that each memory address is only read once.
diff --git a/lib/display_options.c b/lib/display_options.c index 3f32bcd..3a70e14 100644 --- a/lib/display_options.c +++ b/lib/display_options.c @@ -7,6 +7,7 @@
#include <config.h> #include <common.h> +#include <div64.h> #include <inttypes.h> #include <version.h> #include <linux/ctype.h> @@ -22,6 +23,43 @@ int display_options (void) return 0; }
+void print_freq(uint64_t freq, const char *s) +{
unsigned long m = 0, n;
uint64_t f;
static const char names[] = {'G', 'M', 'K'};
unsigned long d = 1e9;
char c = 0;
unsigned int i;
for (i = 0; i < ARRAY_SIZE(names); i++, d /= 1000) {
if (freq >= d) {
c = names[i];
break;
}
}
if (!c) {
printf("%" PRIu64 " Hz%s", freq, s);
return;
}
f = do_div(freq, d);
n = freq;
/* If there's a remainder, show the first few digits */
if (f) {
m = f % d;
while (!(m % 10))
m /= 10;
}
This 'first few digits' issue is not fixed. Do you intend to print all digits after the radix point? If so, we need to fix the comment to say that all digits will be printed.
I think it is better not to. I'll change my tests a bit and send a new version of just this patch.
Regards, Simon

It is useful to be able to keep track of the available CPUs in a multi-CPU system. This uclass is mostly intended for use with SMP systems.
The uclass provides methods for getting basic information about each CPU.
Signed-off-by: Simon Glass sjg@chromium.org ---
Changes in v2: - Change header guard to capital letters - Change get_info() in function comment to cpu_get_info()
drivers/Kconfig | 2 ++ drivers/Makefile | 1 + drivers/cpu/Kconfig | 8 +++++ drivers/cpu/Makefile | 7 ++++ drivers/cpu/cpu-uclass.c | 61 +++++++++++++++++++++++++++++++++++ include/cpu.h | 84 ++++++++++++++++++++++++++++++++++++++++++++++++ include/dm/uclass-id.h | 1 + 7 files changed, 164 insertions(+) create mode 100644 drivers/cpu/Kconfig create mode 100644 drivers/cpu/Makefile create mode 100644 drivers/cpu/cpu-uclass.c create mode 100644 include/cpu.h
diff --git a/drivers/Kconfig b/drivers/Kconfig index 941aa0c..1f40887 100644 --- a/drivers/Kconfig +++ b/drivers/Kconfig @@ -2,6 +2,8 @@ menu "Device Drivers"
source "drivers/core/Kconfig"
+source "drivers/cpu/Kconfig" + source "drivers/demo/Kconfig"
source "drivers/pci/Kconfig" diff --git a/drivers/Makefile b/drivers/Makefile index 5ef58c0..405b64b 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -3,6 +3,7 @@ obj-$(CONFIG_DM_DEMO) += demo/ obj-$(CONFIG_BIOSEMU) += bios_emulator/ obj-y += block/ obj-$(CONFIG_BOOTCOUNT_LIMIT) += bootcount/ +obj-$(CONFIG_CPU) += cpu/ obj-y += crypto/ obj-$(CONFIG_FPGA) += fpga/ obj-y += hwmon/ diff --git a/drivers/cpu/Kconfig b/drivers/cpu/Kconfig new file mode 100644 index 0000000..0d1424d --- /dev/null +++ b/drivers/cpu/Kconfig @@ -0,0 +1,8 @@ +config CPU + bool "Enable CPU drivers using Driver Model" + help + This allows drivers to be provided for CPUs and their type to be + specified in the board's device tree. For boards which support + multiple CPUs, then normally have to be set up in U-Boot so that + they can work correctly in the OS. This provides a framework for + finding out information about available CPUs and making changes. diff --git a/drivers/cpu/Makefile b/drivers/cpu/Makefile new file mode 100644 index 0000000..8710160 --- /dev/null +++ b/drivers/cpu/Makefile @@ -0,0 +1,7 @@ +# +# Copyright (c) 2015 Google, Inc +# Wolfgang Denk, DENX Software Engineering, wd@denx.de. 
+#
+# SPDX-License-Identifier: GPL-2.0+
+#
+obj-$(CONFIG_CPU) += cpu-uclass.o
diff --git a/drivers/cpu/cpu-uclass.c b/drivers/cpu/cpu-uclass.c
new file mode 100644
index 0000000..ab18ee2
--- /dev/null
+++ b/drivers/cpu/cpu-uclass.c
@@ -0,0 +1,61 @@
+/*
+ * Copyright (C) 2015 Google, Inc
+ * Written by Simon Glass sjg@chromium.org
+ *
+ * SPDX-License-Identifier: GPL-2.0+
+ */
+
+#include <common.h>
+#include <cpu.h>
+#include <dm.h>
+#include <dm/lists.h>
+#include <dm/root.h>
+
+int cpu_get_desc(struct udevice *dev, char *buf, int size)
+{
+	struct cpu_ops *ops = cpu_get_ops(dev);
+
+	if (!ops->get_desc)
+		return -ENOSYS;
+
+	return ops->get_desc(dev, buf, size);
+}
+
+int cpu_get_info(struct udevice *dev, struct cpu_info *info)
+{
+	struct cpu_ops *ops = cpu_get_ops(dev);
+
+	if (!ops->get_info)
+		return -ENOSYS;
+
+	return ops->get_info(dev, info);
+}
+
+U_BOOT_DRIVER(cpu_bus) = {
+	.name = "cpu_bus",
+	.id = UCLASS_SIMPLE_BUS,
+	.per_child_platdata_auto_alloc_size = sizeof(struct cpu_platdata),
+};
+
+static int uclass_cpu_init(struct uclass *uc)
+{
+	struct udevice *dev;
+	int node;
+	int ret;
+
+	node = fdt_path_offset(gd->fdt_blob, "/cpus");
+	if (node < 0)
+		return 0;
+
+	ret = device_bind_driver_to_node(dm_root(), "cpu_bus", "cpus", node,
+					 &dev);
+
+	return ret;
+}
+
+UCLASS_DRIVER(cpu) = {
+	.id = UCLASS_CPU,
+	.name = "cpu",
+	.flags = DM_UC_FLAG_SEQ_ALIAS,
+	.init = uclass_cpu_init,
+};
diff --git a/include/cpu.h b/include/cpu.h
new file mode 100644
index 0000000..34c60bc
--- /dev/null
+++ b/include/cpu.h
@@ -0,0 +1,84 @@
+/*
+ * Copyright (c) 2015 Google, Inc
+ * Written by Simon Glass sjg@chromium.org
+ *
+ * SPDX-License-Identifier: GPL-2.0+
+ */
+
+#ifndef __CPU_H
+#define __CPU_H
+
+/**
+ * struct cpu_platdata - platform data for a CPU
+ *
+ * This can be accessed with dev_get_parent_platdata() for any UCLASS_CPU
+ * device.
+ *
+ * @cpu_id: Platform-specific way of identifying the CPU.
+ */
+struct cpu_platdata {
+	int cpu_id;
+};
+
+/* CPU features - mostly just a placeholder for now */
+enum {
+	CPU_FEAT_L1_CACHE = 0, /* Supports level 1 cache */
+	CPU_FEAT_MMU = 1, /* Supports virtual memory */
+
+	CPU_FEAT_COUNT,
+};
+
+/**
+ * struct cpu_info - Information about a CPU
+ *
+ * @cpu_freq: Current CPU frequency in Hz
+ * @features: Flags for supported CPU features
+ */
+struct cpu_info {
+	ulong cpu_freq;
+	ulong features;
+};
+
+struct cpu_ops {
+	/**
+	 * get_desc() - Get a description string for a CPU
+	 *
+	 * @dev: Device to check (UCLASS_CPU)
+	 * @buf: Buffer to place string
+	 * @size: Size of string space
+	 * @return 0 if OK, -ENOSPC if buffer is too small, other -ve on error
+	 */
+	int (*get_desc)(struct udevice *dev, char *buf, int size);
+
+	/**
+	 * get_info() - Get information about a CPU
+	 *
+	 * @dev: Device to check (UCLASS_CPU)
+	 * @info: Returns CPU info
+	 * @return 0 if OK, -ve on error
+	 */
+	int (*get_info)(struct udevice *dev, struct cpu_info *info);
+};
+
+#define cpu_get_ops(dev) ((struct cpu_ops *)(dev)->driver->ops)
+
+/**
+ * cpu_get_desc() - Get a description string for a CPU
+ *
+ * @dev: Device to check (UCLASS_CPU)
+ * @buf: Buffer to place string
+ * @size: Size of string space
+ * @return 0 if OK, -ENOSPC if buffer is too small, other -ve on error
+ */
+int cpu_get_desc(struct udevice *dev, char *buf, int size);
+
+/**
+ * cpu_get_info() - Get information about a CPU
+ *
+ * @dev: Device to check (UCLASS_CPU)
+ * @info: Returns CPU info
+ * @return 0 if OK, -ve on error
+ */
+int cpu_get_info(struct udevice *dev, struct cpu_info *info);
+
+#endif
diff --git a/include/dm/uclass-id.h b/include/dm/uclass-id.h
index fddfd35..395e25a 100644
--- a/include/dm/uclass-id.h
+++ b/include/dm/uclass-id.h
@@ -45,6 +45,7 @@ enum uclass_id {
 	UCLASS_USB_HUB, /* USB hub */
 	UCLASS_USB_DEV_GENERIC, /* USB generic device */
 	UCLASS_MASS_STORAGE, /* Mass storage device */
+	UCLASS_CPU, /* CPU, typically part of an SoC */
 
 	UCLASS_COUNT,
 	UCLASS_INVALID = -1,
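The way cpu_get_desc() and cpu_get_info() dispatch through struct cpu_ops can be modelled outside U-Boot. The following is a minimal standalone sketch, not U-Boot code: struct udevice here is a hypothetical stand-in that holds only an ops pointer, and demo_get_desc() and demo_ops are invented for the illustration. It shows that an unimplemented method yields -ENOSYS, while an implemented one is reached through the ops table.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for U-Boot's struct udevice: only an ops pointer */
struct udevice {
	const void *ops;
};

struct cpu_info {
	unsigned long cpu_freq;	/* current frequency in Hz */
	unsigned long features;	/* bit mask of supported features */
};

struct cpu_ops {
	int (*get_desc)(struct udevice *dev, char *buf, int size);
	int (*get_info)(struct udevice *dev, struct cpu_info *info);
};

#define cpu_get_ops(dev) ((const struct cpu_ops *)(dev)->ops)

int cpu_get_desc(struct udevice *dev, char *buf, int size)
{
	const struct cpu_ops *ops = cpu_get_ops(dev);

	if (!ops->get_desc)
		return -ENOSYS;

	return ops->get_desc(dev, buf, size);
}

int cpu_get_info(struct udevice *dev, struct cpu_info *info)
{
	const struct cpu_ops *ops = cpu_get_ops(dev);

	/* Check the same method that is about to be called */
	if (!ops->get_info)
		return -ENOSYS;

	return ops->get_info(dev, info);
}

/* Invented driver method: reports a fixed description string */
static int demo_get_desc(struct udevice *dev, char *buf, int size)
{
	static const char desc[] = "Demo CPU";

	(void)dev;
	if (size < (int)sizeof(desc))
		return -ENOSPC;
	memcpy(buf, desc, sizeof(desc));

	return 0;
}

/* This invented driver implements get_desc() only; get_info stays NULL */
static const struct cpu_ops demo_ops = {
	.get_desc = demo_get_desc,
};
```

With a device whose ops pointer is &demo_ops, cpu_get_desc() fills the buffer and returns 0, a 4-byte buffer gives -ENOSPC, and cpu_get_info() returns -ENOSYS because the driver leaves that method unset.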

Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
It is useful to be able to keep track of the available CPUs in a multi-CPU system. This uclass is mostly intended for use with SMP systems.
The uclass provides methods for getting basic information about each CPU.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Bin Meng bmeng.cn@gmail.com
But one comment raised previously is not addressed below.
Changes in v2:
- Change header guard to capital letters
- Change get_info() in function comment to cpu_get_info()
drivers/Kconfig          |  2 ++
drivers/Makefile         |  1 +
drivers/cpu/Kconfig      |  8 +++++
drivers/cpu/Makefile     |  7 ++++
drivers/cpu/cpu-uclass.c | 61 +++++++++++++++++++++++++++++++++++
include/cpu.h            | 84 ++++++++++++++++++++++++++++++++++++++++++++++++
include/dm/uclass-id.h   |  1 +
7 files changed, 164 insertions(+)
create mode 100644 drivers/cpu/Kconfig
create mode 100644 drivers/cpu/Makefile
create mode 100644 drivers/cpu/cpu-uclass.c
create mode 100644 include/cpu.h
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 941aa0c..1f40887 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -2,6 +2,8 @@ menu "Device Drivers"
source "drivers/core/Kconfig"
+source "drivers/cpu/Kconfig"
source "drivers/demo/Kconfig"
source "drivers/pci/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index 5ef58c0..405b64b 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_DM_DEMO) += demo/
obj-$(CONFIG_BIOSEMU) += bios_emulator/
obj-y += block/
obj-$(CONFIG_BOOTCOUNT_LIMIT) += bootcount/
+obj-$(CONFIG_CPU) += cpu/
obj-y += crypto/
obj-$(CONFIG_FPGA) += fpga/
obj-y += hwmon/
diff --git a/drivers/cpu/Kconfig b/drivers/cpu/Kconfig
new file mode 100644
index 0000000..0d1424d
--- /dev/null
+++ b/drivers/cpu/Kconfig
@@ -0,0 +1,8 @@
+config CPU
Should it be DM_CPU? Like other DM drivers (DM_SERIAL, DM_I2C, etc).
bool "Enable CPU drivers using Driver Model"
help
This allows drivers to be provided for CPUs and their type to be
specified in the board's device tree. For boards which support
multiple CPUs, they normally have to be set up in U-Boot so that
they can work correctly in the OS. This provides a framework for
finding out information about available CPUs and making changes.
[snip]
Regards, Bin

Hi Bin,
On 28 April 2015 at 23:01, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
[snip]
+config CPU
Should it be DM_CPU? Like other DM drivers (DM_SERIAL, DM_I2C, etc).
bool "Enable CPU drivers using Driver Model"
help
This allows drivers to be provided for CPUs and their type to be
specified in the board's device tree. For boards which support
multiple CPUs, they normally have to be set up in U-Boot so that
they can work correctly in the OS. This provides a framework for
finding out information about available CPUs and making changes.
I prefer only using the DM prefix when there is a non-DM option. In fact the way it is supposed to work is that eventually the CONFIG_DM... option goes away. When every board uses DM for a subsystem we should be able to drop it.
CPU support (or whatever we call it) will only be available with driver model, so I don't think we need separate CONFIG_CPU and CONFIG_DM_CPU options.
Regards, Simon

Hi Simon,
On Wed, Apr 29, 2015 at 9:32 PM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
[snip]
I prefer only using the DM prefix when there is a non-DM option. In fact the way it is supposed to work is that eventually the CONFIG_DM... option goes away. When every board uses DM for a subsystem we should be able to drop it.
Good to know that those CONFIG_DM_xxx will eventually go away.
CPU support (or whatever we call it) will only be available with driver model, so I don't think we need separate CONFIG_CPU and CONFIG_DM_CPU options.
OK.
Regards, Bin

On 29 April 2015 at 08:07, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 9:32 PM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
On 28 April 2015 at 23:01, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
It is useful to be able to keep track of the available CPUs in a multi-CPU system. This uclass is mostly intended for use with SMP systems.
The uclass provides methods for getting basic information about each CPU.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Bin Meng bmeng.cn@gmail.com
Applied to u-boot-x86.

The Simple Firmware Interface (SFI) provides a way of passing information to Linux without requiring the full ACPI horror. Provide a rudimentary implementation sufficient to be recognised and parsed by Linux.
Signed-off-by: Simon Glass sjg@chromium.org
---
Changes in v2:
- Rename CONFIG_SFI to CONFIG_GENERATE_SFI_TABLE and move within Kconfig
- Correct Kconfig help indentation and text
- Drop SFI_BASE config option
- Always build sfi.o
- Use SFI_TABLE_MAX_ENTRIES instead of 16 and ARRAY_SIZE()
- Make get_entry_start() static
- Use table_compute_checksum() to compute checksum
- Add a few blank lines
- Move patch to after the CPU uclass patch
- Drop the RTC table as it is not needed
- Move SFI calling code to write_tables()
- Remove IDLE table
- Remove SFI_SYST_SEARCH_BEGIN and SFI_SYST_SEARCH_END
- Move '__packed' to immediately after 'struct'
- Add SFI_DEV_TYPE_SD and convert to enum
- Remove #ifdef CONFIG_SFI from header file
- Move sfi.h header file to arch/x86/include/asm
 arch/x86/Kconfig           |  14 +++++
 arch/x86/include/asm/sfi.h | 137 ++++++++++++++++++++++++++++++++++++++++
 arch/x86/lib/Makefile      |   1 +
 arch/x86/lib/sfi.c         | 154 +++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/lib/tables.c      |   5 ++
 5 files changed, 311 insertions(+)
 create mode 100644 arch/x86/include/asm/sfi.h
 create mode 100644 arch/x86/lib/sfi.c
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f3a600e..f38e9ba 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -393,6 +393,20 @@ config GENERATE_PIRQ_TABLE
 	  It specifies the interrupt router information as well how all
 	  the PCI devices' interrupt pins are wired to PIRQs.
 
+config GENERATE_SFI_TABLE
+	bool "SFI (Simple Firmware Interface) Support"
+	help
+	  The Simple Firmware Interface (SFI) provides a lightweight method
+	  for platform firmware to pass information to the operating system
+	  via static tables in memory. Kernel SFI support is required to
+	  boot on SFI-only platforms. If you have ACPI tables then these are
+	  used instead.
+
+	  U-Boot writes this table in sfi_write_tables() just before booting
+	  the OS.
+
+	  For more information, see http://simplefirmware.org
+
 endmenu
 
 config MAX_PIRQ_LINKS
diff --git a/arch/x86/include/asm/sfi.h b/arch/x86/include/asm/sfi.h
new file mode 100644
index 0000000..d1f0f0c
--- /dev/null
+++ b/arch/x86/include/asm/sfi.h
@@ -0,0 +1,137 @@
+/*
+ * Copyright(c) 2009 Intel Corporation. All rights reserved.
+ *
+ * SPDX-License-Identifier: GPL-2.0+ BSD-3-Clause
+ */
+
+#ifndef _LINUX_SFI_H
+#define _LINUX_SFI_H
+
+#include <errno.h>
+#include <linux/types.h>
+
+/* Table signatures reserved by the SFI specification */
+#define SFI_SIG_SYST "SYST"
+#define SFI_SIG_FREQ "FREQ"
+#define SFI_SIG_CPUS "CPUS"
+#define SFI_SIG_MTMR "MTMR"
+#define SFI_SIG_MRTC "MRTC"
+#define SFI_SIG_MMAP "MMAP"
+#define SFI_SIG_APIC "APIC"
+#define SFI_SIG_XSDT "XSDT"
+#define SFI_SIG_WAKE "WAKE"
+#define SFI_SIG_DEVS "DEVS"
+#define SFI_SIG_GPIO "GPIO"
+
+#define SFI_SIGNATURE_SIZE 4
+#define SFI_OEM_ID_SIZE 6
+#define SFI_OEM_TABLE_ID_SIZE 8
+
+#define SFI_NAME_LEN 16
+#define SFI_TABLE_MAX_ENTRIES 16
+
+#define SFI_GET_NUM_ENTRIES(ptable, entry_type) \
+	((ptable->header.len - sizeof(struct sfi_table_header)) / \
+	(sizeof(entry_type)))
+/*
+ * Table structures must be byte-packed to match the SFI specification,
+ * as they are provided by the BIOS.
+ */
+struct __packed sfi_table_header {
+	char sig[SFI_SIGNATURE_SIZE];
+	u32 len;
+	u8 rev;
+	u8 csum;
+	char oem_id[SFI_OEM_ID_SIZE];
+	char oem_table_id[SFI_OEM_TABLE_ID_SIZE];
+};
+
+struct __packed sfi_table_simple {
+	struct sfi_table_header header;
+	u64 pentry[1];
+};
+
+/* Comply with UEFI spec 2.1 */
+struct __packed sfi_mem_entry {
+	u32 type;
+	u64 phys_start;
+	u64 virt_start;
+	u64 pages;
+	u64 attrib;
+};
+
+struct __packed sfi_cpu_table_entry {
+	u32 apic_id;
+};
+
+struct __packed sfi_cstate_table_entry {
+	u32 hint; /* MWAIT hint */
+	u32 latency; /* latency in ms */
+};
+
+struct __packed sfi_apic_table_entry {
+	u64 phys_addr; /* phy base addr for APIC reg */
+};
+
+struct __packed sfi_freq_table_entry {
+	u32 freq_mhz; /* in MHZ */
+	u32 latency; /* transition latency in ms */
+	u32 ctrl_val; /* value to write to PERF_CTL */
+};
+
+struct __packed sfi_wake_table_entry {
+	u64 phys_addr; /* pointer to where the wake vector locates */
+};
+
+struct __packed sfi_timer_table_entry {
+	u64 phys_addr; /* phy base addr for the timer */
+	u32 freq_hz; /* in HZ */
+	u32 irq;
+};
+
+struct __packed sfi_rtc_table_entry {
+	u64 phys_addr; /* phy base addr for the RTC */
+	u32 irq;
+};
+
+struct __packed sfi_device_table_entry {
+	u8 type; /* bus type, I2C, SPI or ...*/
+	u8 host_num; /* attached to host 0, 1...*/
+	u16 addr;
+	u8 irq;
+	u32 max_freq;
+	char name[SFI_NAME_LEN];
+};
+
+enum {
+	SFI_DEV_TYPE_SPI = 0,
+	SFI_DEV_TYPE_I2C,
+	SFI_DEV_TYPE_UART,
+	SFI_DEV_TYPE_HSI,
+	SFI_DEV_TYPE_IPC,
+	SFI_DEV_TYPE_SD,
+};
+
+struct __packed sfi_gpio_table_entry {
+	char controller_name[SFI_NAME_LEN];
+	u16 pin_no;
+	char pin_name[SFI_NAME_LEN];
+};
+
+struct sfi_xsdt_header {
+	uint32_t oem_revision;
+	uint32_t creator_id;
+	uint32_t creator_revision;
+};
+
+typedef int (*sfi_table_handler) (struct sfi_table_header *table);
+
+/**
+ * write_sfi_table() - Write Simple Firmware Interface tables
+ *
+ * @base: Address to write table to
+ * @return address to use for the next table
+ */
+u32 write_sfi_table(u32 base);
+
+#endif /*_LINUX_SFI_H */
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 0178fe1..70ad19b 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -26,6 +26,7 @@ obj-y += pirq_routing.o
 obj-y += relocate.o
 obj-y += physmem.o
 obj-$(CONFIG_X86_RAMTEST) += ramtest.o
+obj-y += sfi.o
 obj-y += string.o
 obj-y += tables.o
 obj-$(CONFIG_SYS_X86_TSC_TIMER) += tsc_timer.o
diff --git a/arch/x86/lib/sfi.c b/arch/x86/lib/sfi.c
new file mode 100644
index 0000000..3d36580
--- /dev/null
+++ b/arch/x86/lib/sfi.c
@@ -0,0 +1,154 @@
+/*
+ * Copyright (c) 2015 Google, Inc
+ * Written by Simon Glass sjg@chromium.org
+ *
+ * SPDX-License-Identifier: GPL-2.0+
+ */
+
+/*
+ * Intel Simple Firmware Interface (SFI)
+ *
+ * Yet another way to pass information to the Linux kernel.
+ *
+ * See https://simplefirmware.org/ for details
+ */
+
+#include <common.h>
+#include <cpu.h>
+#include <dm.h>
+#include <asm/cpu.h>
+#include <asm/ioapic.h>
+#include <asm/sfi.h>
+#include <asm/tables.h>
+#include <dm/uclass-internal.h>
+
+struct table_info {
+	u32 base;
+	int ptr;
+	u32 entry_start;
+	u64 table[SFI_TABLE_MAX_ENTRIES];
+	int count;
+};
+
+static void *get_entry_start(struct table_info *tab)
+{
+	if (tab->count == SFI_TABLE_MAX_ENTRIES)
+		return NULL;
+	tab->entry_start = tab->base + tab->ptr;
+	tab->table[tab->count] = tab->entry_start;
+	tab->entry_start += sizeof(struct sfi_table_header);
+
+	return (void *)tab->entry_start;
+}
+
+static void finish_table(struct table_info *tab, const char *sig, void *entry)
+{
+	struct sfi_table_header *hdr;
+
+	hdr = (struct sfi_table_header *)(tab->base + tab->ptr);
+	strcpy(hdr->sig, sig);
+	hdr->len = sizeof(*hdr) + ((ulong)entry - tab->entry_start);
+	hdr->rev = 1;
+	strncpy(hdr->oem_id, "U-Boot", SFI_OEM_ID_SIZE);
+	strncpy(hdr->oem_table_id, "Table v1", SFI_OEM_TABLE_ID_SIZE);
+	hdr->csum = 0;
+	hdr->csum = table_compute_checksum(hdr, hdr->len);
+	tab->ptr += hdr->len;
+	tab->ptr = ALIGN(tab->ptr, 16);
+	tab->count++;
+}
+
+static int sfi_write_system_header(struct table_info *tab)
+{
+	u64 *entry = get_entry_start(tab);
+	int i;
+
+	if (!entry)
+		return -ENOSPC;
+
+	for (i = 0; i < tab->count; i++)
+		*entry++ = tab->table[i];
+	finish_table(tab, SFI_SIG_SYST, entry);
+
+	return 0;
+}
+
+static int sfi_write_cpus(struct table_info *tab)
+{
+	struct sfi_cpu_table_entry *entry = get_entry_start(tab);
+	struct udevice *dev;
+	int count = 0;
+
+	if (!entry)
+		return -ENOSPC;
+
+	for (uclass_find_first_device(UCLASS_CPU, &dev);
+	     dev;
+	     uclass_find_next_device(&dev)) {
+		struct cpu_platdata *plat = dev_get_parent_platdata(dev);
+
+		if (!device_active(dev))
+			continue;
+		entry->apic_id = plat->cpu_id;
+		entry++;
+		count++;
+	}
+
+	/* Omit the table if there is only one CPU */
+	if (count > 1)
+		finish_table(tab, SFI_SIG_CPUS, entry);
+
+	return 0;
+}
+
+static int sfi_write_apic(struct table_info *tab)
+{
+	struct sfi_apic_table_entry *entry = get_entry_start(tab);
+
+	if (!entry)
+		return -ENOSPC;
+
+	entry->phys_addr = IO_APIC_ADDR;
+	entry++;
+	finish_table(tab, SFI_SIG_APIC, entry);
+
+	return 0;
+}
+
+static int sfi_write_xsdt(struct table_info *tab)
+{
+	struct sfi_xsdt_header *entry = get_entry_start(tab);
+
+	if (!entry)
+		return -ENOSPC;
+
+	entry->oem_revision = 1;
+	entry->creator_id = 1;
+	entry->creator_revision = 1;
+	entry++;
+	finish_table(tab, SFI_SIG_XSDT, entry);
+
+	return 0;
+}
+
+u32 write_sfi_table(u32 base)
+{
+	struct table_info table;
+
+	table.base = base;
+	table.ptr = 0;
+	table.count = 0;
+	sfi_write_cpus(&table);
+	sfi_write_apic(&table);
+
+	/*
+	 * The SFI specification marks the XSDT table as optional, but Linux 4.0
+	 * crashes on start-up when it is not provided.
+	 */
+	sfi_write_xsdt(&table);
+
+	/* Finally, write out the system header which points to the others */
+	sfi_write_system_header(&table);
+
+	return base + table.ptr;
+}
diff --git a/arch/x86/lib/tables.c b/arch/x86/lib/tables.c
index 0836e1e..8031201 100644
--- a/arch/x86/lib/tables.c
+++ b/arch/x86/lib/tables.c
@@ -5,6 +5,7 @@
  */
 
 #include <common.h>
+#include <asm/sfi.h>
 #include <asm/tables.h>
 
 u8 table_compute_checksum(void *v, int len)
@@ -27,4 +28,8 @@ void write_tables(void)
 	rom_table_end = write_pirq_routing_table(rom_table_end);
 	rom_table_end = ALIGN(rom_table_end, 1024);
 #endif
+#ifdef CONFIG_GENERATE_SFI_TABLE
+	rom_table_end = write_sfi_table(rom_table_end);
+	rom_table_end = ALIGN(rom_table_end, 1024);
+#endif
 }
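The header fields that finish_table() fills in can be exercised on a host machine. This is a rough standalone sketch, not the U-Boot implementation: it assumes that table_compute_checksum() returns the value which makes all bytes of the table sum to zero modulo 256, and the names demo_cpus_table, demo_compute_checksum(), demo_build_cpus_table() and demo_num_entries(), as well as the APIC IDs 0 and 4, are invented for the illustration. The header layout is copied from sfi.h in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SFI_SIGNATURE_SIZE	4
#define SFI_OEM_ID_SIZE		6
#define SFI_OEM_TABLE_ID_SIZE	8

/* Same field layout as struct sfi_table_header in the patch */
struct __attribute__((packed)) sfi_table_header {
	char sig[SFI_SIGNATURE_SIZE];
	uint32_t len;
	uint8_t rev;
	uint8_t csum;
	char oem_id[SFI_OEM_ID_SIZE];
	char oem_table_id[SFI_OEM_TABLE_ID_SIZE];
};

struct __attribute__((packed)) sfi_cpu_table_entry {
	uint32_t apic_id;
};

/* Hypothetical container: one header followed by two CPU entries */
struct __attribute__((packed)) demo_cpus_table {
	struct sfi_table_header header;
	struct sfi_cpu_table_entry cpu[2];
};

/* Sum of all bytes, modulo 256 */
static uint8_t byte_sum(const void *v, size_t len)
{
	const uint8_t *bytes = v;
	uint8_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum += bytes[i];

	return sum;
}

/* Assumed checksum rule: the value that brings the byte sum to zero */
static uint8_t demo_compute_checksum(const void *v, size_t len)
{
	return (uint8_t)-byte_sum(v, len);
}

static void demo_build_cpus_table(struct demo_cpus_table *tab)
{
	memset(tab, 0, sizeof(*tab));
	memcpy(tab->header.sig, "CPUS", SFI_SIGNATURE_SIZE);
	tab->header.len = sizeof(*tab);
	tab->header.rev = 1;
	memcpy(tab->header.oem_id, "U-Boot", SFI_OEM_ID_SIZE);
	memcpy(tab->header.oem_table_id, "Table v1", SFI_OEM_TABLE_ID_SIZE);
	tab->cpu[0].apic_id = 0;	/* invented APIC IDs */
	tab->cpu[1].apic_id = 4;

	/* The checksum is computed with the csum field itself set to zero */
	tab->header.csum = 0;
	tab->header.csum = demo_compute_checksum(tab, tab->header.len);
}

/* Entry count recovered from the length, as SFI_GET_NUM_ENTRIES() does */
static unsigned int demo_num_entries(const struct demo_cpus_table *tab)
{
	return (tab->header.len - sizeof(struct sfi_table_header)) /
		sizeof(struct sfi_cpu_table_entry);
}
```

A consumer can validate such a table by summing header.len bytes and checking for zero, then derive the number of entries from the length alone, which is why the header carries no explicit count.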

Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This provides a way of passing information to Linux without requiring the full ACPI horror. Provide a rudimentary implementation sufficient to be recognised and parsed by Linux.
Signed-off-by: Simon Glass sjg@chromium.org
Looks good, thanks!
Reviewed-by: Bin Meng bmeng.cn@gmail.com
But some nits below :)
Changes in v2:
- Rename CONFIG_SFI to CONFIG_GENERATE_SFI_TABLE and move within Kconfig
- Correct Kconfig help indentation and text
- Drop SFI_BASE config option
- Always build sfi.o
- Use SFI_TABLE_MAX_ENTRIES instead of 16 and ARRAY_SIZE()
- Make get_entry_start() static
- Use table_compute_checksum() to compute checksum
- Add a few blank lines
- Move patch to after the CPU uclass patch
- Drop the RTC table as it is not needed
- Move SFI calling code to write_tables()
- Remove IDLE table
- Remove SFI_SYST_SEARCH_BEGIN and SFI_SYST_SEARCH_END
- Move '__packed' to immediately after 'struct'
- Add SFI_DEV_TYPE_SD and convert to enum
- Remove #ifdef CONFIG_SFI from header file
- Move sfi.h header file to arch/x86/include/asm
arch/x86/Kconfig           |  14 +++++
arch/x86/include/asm/sfi.h | 137 ++++++++++++++++++++++++++++++++++++++++
arch/x86/lib/Makefile      |   1 +
arch/x86/lib/sfi.c         | 154 +++++++++++++++++++++++++++++++++++++++++++++
arch/x86/lib/tables.c      |   5 ++
5 files changed, 311 insertions(+)
create mode 100644 arch/x86/include/asm/sfi.h
create mode 100644 arch/x86/lib/sfi.c
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f3a600e..f38e9ba 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -393,6 +393,20 @@ config GENERATE_PIRQ_TABLE
It specifies the interrupt router information as well how all
the PCI devices' interrupt pins are wired to PIRQs.
+config GENERATE_SFI_TABLE
bool "SFI (Simple Firmware Interface) Support"
Should we say: Generate an SFI (Simple Firmware Interface) table? This is to match 'Generate a PIRQ table'.
help
The Simple Firmware Interface (SFI) provides a lightweight method
for platform firmware to pass information to the operating system
via static tables in memory. Kernel SFI support is required to
boot on SFI-only platforms. If you have ACPI tables then these are
used instead.
U-Boot writes this table in sfi_write_tables() just before booting
Should be: write_sfi_table
the OS.
For more information, see http://simplefirmware.org
endmenu
[snip]
Regards, Bin

On 28 April 2015 at 23:16, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This provides a way of passing information to Linux without requiring the full ACPI horror. Provide a rudimentary implementation sufficient to be recognised and parsed by Linux.
Signed-off-by: Simon Glass sjg@chromium.org
Looks good, thanks!
Reviewed-by: Bin Meng bmeng.cn@gmail.com
Applied to u-boot-x86.
(fixed nits)

Add a simple command which provides access to a list of available CPUs along with descriptions and basic information.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Bin Meng bmeng.cn@gmail.com
---
Changes in v2: None
 common/Kconfig   |   8 ++++
 common/Makefile  |   1 +
 common/cmd_cpu.c | 113 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 122 insertions(+)
 create mode 100644 common/cmd_cpu.c
diff --git a/common/Kconfig b/common/Kconfig
index 5d7e48a..15759f7 100644
--- a/common/Kconfig
+++ b/common/Kconfig
@@ -31,6 +31,14 @@ config CMD_CONSOLE
 	help
 	  Print console devices and information.
 
+config CMD_CPU
+	bool "cpu"
+	help
+	  Print information about available CPUs. This normally shows the
+	  number of CPUs, type (e.g. manufacturer, architecture, product or
+	  internal name) and clock frequency. Other information may be
+	  available depending on the CPU driver.
+
 config CMD_LICENSE
 	bool "license"
 	help
diff --git a/common/Makefile b/common/Makefile
index fba3830..9084c73 100644
--- a/common/Makefile
+++ b/common/Makefile
@@ -74,6 +74,7 @@ obj-$(CONFIG_CMD_CBFS) += cmd_cbfs.o
 obj-$(CONFIG_CMD_CLK) += cmd_clk.o
 obj-$(CONFIG_CMD_CONSOLE) += cmd_console.o
 obj-$(CONFIG_CMD_CPLBINFO) += cmd_cplbinfo.o
+obj-$(CONFIG_CMD_CPU) += cmd_cpu.o
 obj-$(CONFIG_DATAFLASH_MMC_SELECT) += cmd_dataflash_mmc_mux.o
 obj-$(CONFIG_CMD_DATE) += cmd_date.o
 obj-$(CONFIG_CMD_DEMO) += cmd_demo.o
diff --git a/common/cmd_cpu.c b/common/cmd_cpu.c
new file mode 100644
index 0000000..c3e229f
--- /dev/null
+++ b/common/cmd_cpu.c
@@ -0,0 +1,113 @@
+/*
+ * Copyright (c) 2015 Google, Inc
+ * Written by Simon Glass sjg@chromium.org
+ *
+ * SPDX-License-Identifier: GPL-2.0+
+ */
+
+#include <common.h>
+#include <command.h>
+#include <cpu.h>
+#include <dm.h>
+
+static const char *cpu_feature_name[CPU_FEAT_COUNT] = {
+	"L1 cache",
+	"MMU",
+};
+
+static int print_cpu_list(bool detail)
+{
+	struct udevice *dev;
+	struct uclass *uc;
+	char buf[100];
+	int ret;
+
+	ret = uclass_get(UCLASS_CPU, &uc);
+	if (ret) {
+		printf("Cannot find CPU uclass\n");
+		return ret;
+	}
+	uclass_foreach_dev(dev, uc) {
+		struct cpu_platdata *plat = dev_get_parent_platdata(dev);
+		struct cpu_info info;
+		bool first;
+		int i;
+
+		ret = cpu_get_desc(dev, buf, sizeof(buf));
+		printf("%3d: %-10s %s\n", dev->seq, dev->name,
+		       ret ? "<no description>" : buf);
+		if (!detail)
+			continue;
+		ret = cpu_get_info(dev, &info);
+		if (ret) {
+			printf("\t(no detail available");
+			if (ret != -ENOSYS)
+				printf(": err=%d", ret);
+			printf(")\n");
+			continue;
+		}
+		printf("\tID = %d, freq = ", plat->cpu_id);
+		print_freq(info.cpu_freq, "");
+		first = true;
+		for (i = 0; i < CPU_FEAT_COUNT; i++) {
+			if (info.features & (1 << i)) {
+				printf("%s%s", first ? ": " : ", ",
+				       cpu_feature_name[i]);
+				first = false;
+			}
+		}
+		printf("\n");
+	}
+
+	return 0;
+}
+
+static int do_cpu_list(cmd_tbl_t *cmdtp, int flag, int argc, char *const argv[])
+{
+	if (print_cpu_list(false))
+		return CMD_RET_FAILURE;
+
+	return 0;
+}
+
+static int do_cpu_detail(cmd_tbl_t *cmdtp, int flag, int argc,
+			 char *const argv[])
+{
+	if (print_cpu_list(true))
+		return CMD_RET_FAILURE;
+
+	return 0;
+}
+
+static cmd_tbl_t cmd_cpu_sub[] = {
+	U_BOOT_CMD_MKENT(list, 2, 1, do_cpu_list, "", ""),
+	U_BOOT_CMD_MKENT(detail, 4, 0, do_cpu_detail, "", ""),
+};
+
+/*
+ * Process a cpu sub-command
+ */
+static int do_cpu(cmd_tbl_t *cmdtp, int flag, int argc,
+		  char * const argv[])
+{
+	cmd_tbl_t *c = NULL;
+
+	/* Strip off leading 'cpu' command argument */
+	argc--;
+	argv++;
+
+	if (argc)
+		c = find_cmd_tbl(argv[0], cmd_cpu_sub, ARRAY_SIZE(cmd_cpu_sub));
+
+	if (c)
+		return c->cmd(cmdtp, flag, argc, argv);
+	else
+		return CMD_RET_USAGE;
+}
+
+U_BOOT_CMD(
+	cpu, 2, 1, do_cpu,
+	"display information about CPUs",
+	"list - list available CPUs\n"
+	"cpu detail - show CPU detail"
+);

On 28 April 2015 at 20:25, Simon Glass sjg@chromium.org wrote:
Add a simple command which provides access to a list of available CPUs along with descriptions and basic information.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Bin Meng bmeng.cn@gmail.com
Changes in v2: None
common/Kconfig   |   8 ++++
common/Makefile  |   1 +
common/cmd_cpu.c | 113 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 122 insertions(+)
create mode 100644 common/cmd_cpu.c
Applied to u-boot-x86.
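The 'cpu detail' output treats cpu_info.features as a bit mask indexed by the CPU_FEAT_... values. The sketch below, which is standalone and not part of the patch, reproduces the separator logic of print_cpu_list() but writes into a buffer so the result can be inspected; format_features() is a hypothetical helper invented for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

enum {
	CPU_FEAT_L1_CACHE = 0,	/* Supports level 1 cache */
	CPU_FEAT_MMU = 1,	/* Supports virtual memory */

	CPU_FEAT_COUNT,
};

static const char *cpu_feature_name[CPU_FEAT_COUNT] = {
	"L1 cache",
	"MMU",
};

/*
 * Hypothetical helper: build the feature list with the same ": " and ", "
 * separators that print_cpu_list() prints, into a caller-supplied buffer.
 */
static void format_features(unsigned long features, char *buf, size_t size)
{
	size_t used = 0;
	int first = 1;
	int i;

	if (!size)
		return;
	buf[0] = '\0';
	for (i = 0; i < CPU_FEAT_COUNT; i++) {
		int n;

		/* Feature i is present when bit i of the mask is set */
		if (!(features & (1UL << i)))
			continue;
		n = snprintf(buf + used, size - used, "%s%s",
			     first ? ": " : ", ", cpu_feature_name[i]);
		if (n < 0 || (size_t)n >= size - used)
			break;	/* stop on error or truncation */
		used += n;
		first = 0;
	}
}
```

A driver's get_info() method would set bits such as 1 << CPU_FEAT_MMU in info->features; the first set bit is introduced by ": " and each later one by ", ".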

Add a subset of the atomic.h header file from Linux 4.0 to support atomic operations in U-Boot.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Bin Meng bmeng.cn@gmail.com
---
Changes in v2: None
 arch/x86/include/asm/atomic.h | 115 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)
 create mode 100644 arch/x86/include/asm/atomic.h
diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
new file mode 100644
index 0000000..806f787
--- /dev/null
+++ b/arch/x86/include/asm/atomic.h
@@ -0,0 +1,115 @@
+#ifndef _ASM_X86_ATOMIC_H
+#define _ASM_X86_ATOMIC_H
+
+#include <linux/compiler.h>
+#include <linux/types.h>
+#include <asm/processor.h>
+
+typedef struct { volatile int counter; } atomic_t;
+
+/*
+ * Atomic operations that C can't guarantee us. Useful for
+ * resource counting etc..
+ */
+
+#define ATOMIC_INIT(i) { (i) }
+
+/**
+ * atomic_read - read atomic variable
+ * @v: pointer of type atomic_t
+ *
+ * Atomically reads the value of @v.
+ */
+static inline int atomic_read(const atomic_t *v)
+{
+	return ACCESS_ONCE((v)->counter);
+}
+
+/**
+ * atomic_set - set atomic variable
+ * @v: pointer of type atomic_t
+ * @i: required value
+ *
+ * Atomically sets the value of @v to @i.
+ */
+static inline void atomic_set(atomic_t *v, int i)
+{
+	v->counter = i;
+}
+
+/**
+ * atomic_add - add integer to atomic variable
+ * @i: integer value to add
+ * @v: pointer of type atomic_t
+ *
+ * Atomically adds @i to @v.
+ */
+static inline void atomic_add(int i, atomic_t *v)
+{
+	asm volatile(LOCK_PREFIX "addl %1,%0"
+		     : "+m" (v->counter)
+		     : "ir" (i));
+}
+
+/**
+ * atomic_sub - subtract integer from atomic variable
+ * @i: integer value to subtract
+ * @v: pointer of type atomic_t
+ *
+ * Atomically subtracts @i from @v.
+ */
+static inline void atomic_sub(int i, atomic_t *v)
+{
+	asm volatile(LOCK_PREFIX "subl %1,%0"
+		     : "+m" (v->counter)
+		     : "ir" (i));
+}
+
+/**
+ * atomic_inc - increment atomic variable
+ * @v: pointer of type atomic_t
+ *
+ * Atomically increments @v by 1.
+ */
+static inline void atomic_inc(atomic_t *v)
+{
+	asm volatile(LOCK_PREFIX "incl %0"
+		     : "+m" (v->counter));
+}
+
+/**
+ * atomic_dec - decrement atomic variable
+ * @v: pointer of type atomic_t
+ *
+ * Atomically decrements @v by 1.
+ */
+static inline void atomic_dec(atomic_t *v)
+{
+	asm volatile(LOCK_PREFIX "decl %0"
+		     : "+m" (v->counter));
+}
+
+/**
+ * atomic_inc_short - increment of a short integer
+ * @v: pointer to type int
+ *
+ * Atomically adds 1 to @v
+ * Returns the new value of @v
+ */
+static inline short int atomic_inc_short(short int *v)
+{
+	asm(LOCK_PREFIX "addw $1, %0" : "+m" (*v));
+	return *v;
+}
+
+/* These are x86-specific, used by some header files */
+#define atomic_clear_mask(mask, addr) \
+	asm volatile(LOCK_PREFIX "andl %0,%1" \
+		     : : "r" (~(mask)), "m" (*(addr)) : "memory")
+
+#define atomic_set_mask(mask, addr) \
+	asm volatile(LOCK_PREFIX "orl %0,%1" \
+		     : : "r" ((unsigned)(mask)), "m" (*(addr)) \
+		     : "memory")
+
+#endif /* _ASM_X86_ATOMIC_H */

On 28 April 2015 at 20:25, Simon Glass sjg@chromium.org wrote:
Add a subset of this header file from Linux 4.0 to support atomic operations in U-Boot.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Bin Meng bmeng.cn@gmail.com
Changes in v2: None
arch/x86/include/asm/atomic.h | 115 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 115 insertions(+) create mode 100644 arch/x86/include/asm/atomic.h
Applied to u-boot-x86.

Add MSR numbers for the fixed MTRRs.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Bin Meng bmeng.cn@gmail.com
---
Changes in v2: None
arch/x86/include/asm/mtrr.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index fda4eae..3841593 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -34,6 +34,20 @@
 /* Number of MTRRs supported */
 #define MTRR_COUNT 8
 
+#define NUM_FIXED_RANGES 88
+#define RANGES_PER_FIXED_MTRR 8
+#define MTRR_FIX_64K_00000_MSR 0x250
+#define MTRR_FIX_16K_80000_MSR 0x258
+#define MTRR_FIX_16K_A0000_MSR 0x259
+#define MTRR_FIX_4K_C0000_MSR 0x268
+#define MTRR_FIX_4K_C8000_MSR 0x269
+#define MTRR_FIX_4K_D0000_MSR 0x26a
+#define MTRR_FIX_4K_D8000_MSR 0x26b
+#define MTRR_FIX_4K_E0000_MSR 0x26c
+#define MTRR_FIX_4K_E8000_MSR 0x26d
+#define MTRR_FIX_4K_F0000_MSR 0x26e
+#define MTRR_FIX_4K_F8000_MSR 0x26f
+
 #if !defined(__ASSEMBLER__)
 
 /**

On 28 April 2015 at 20:25, Simon Glass sjg@chromium.org wrote:
Add MSR numbers for the fixed MTRRs.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Bin Meng bmeng.cn@gmail.com
Changes in v2: None
arch/x86/include/asm/mtrr.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
Applied to u-boot-x86.

Provide access to this x86 instruction from C code.
Signed-off-by: Simon Glass sjg@chromium.org
---
Changes in v2:
- Remove unnecessary \t\n after mfence assembler instruction
arch/x86/include/asm/cpu.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index c839291..08284ee 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -151,6 +151,11 @@ static inline int flag_is_changeable_p(uint32_t flag)
 	return ((f1^f2) & flag) != 0;
 }
 
+static inline void mfence(void)
+{
+	__asm__ __volatile__("mfence" : : : "memory");
+}
+
 /**
  * cpu_enable_paging_pae() - Enable PAE-paging
  *

On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Provide access to this x86 instruction from C code.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2:
- Remove unnecessary \t\n after mfence assembler instruction
arch/x86/include/asm/cpu.h | 5 +++++ 1 file changed, 5 insertions(+)
Reviewed-by: Bin Meng bmeng.cn@gmail.com

On 28 April 2015 at 23:19, Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Provide access to this x86 instruction from C code.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2:
- Remove unnecessary \t\n after mfence assembler instruction
arch/x86/include/asm/cpu.h | 5 +++++ 1 file changed, 5 insertions(+)
Reviewed-by: Bin Meng bmeng.cn@gmail.com
Applied to u-boot-x86.

When we start up additional CPUs we want them to use the same Global Descriptor Table. Store the address of this in global_data so we can reference it later.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Bin Meng bmeng.cn@gmail.com
---
Changes in v2: None
arch/x86/cpu/cpu.c | 1 +
arch/x86/include/asm/global_data.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/x86/cpu/cpu.c b/arch/x86/cpu/cpu.c
index 02e66d8..78eb3fe 100644
--- a/arch/x86/cpu/cpu.c
+++ b/arch/x86/cpu/cpu.c
@@ -133,6 +133,7 @@ static void load_gdt(const u64 *boot_gdt, u16 num_entries)
 
 void setup_gdt(gd_t *id, u64 *gdt_addr)
 {
+	id->arch.gdt = gdt_addr;
 	/* CS: code, read/execute, 4 GB, base 0 */
 	gdt_addr[X86_GDT_ENTRY_32BIT_CS] = GDT_ENTRY(0xc09b, 0, 0xfffff);
 
diff --git a/arch/x86/include/asm/global_data.h b/arch/x86/include/asm/global_data.h
index 5ee06eb..4d9eac6 100644
--- a/arch/x86/include/asm/global_data.h
+++ b/arch/x86/include/asm/global_data.h
@@ -68,6 +68,7 @@ struct arch_global_data {
 	/* MRC training data to save for the next boot */
 	char *mrc_output;
 	unsigned int mrc_output_len;
+	void *gdt; /* Global descriptor table */
 };
 
 #endif

On 28 April 2015 at 20:25, Simon Glass sjg@chromium.org wrote:
When we start up additional CPUs we want them to use the same Global Descriptor Table. Store the address of this in global_data so we can reference it later.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Bin Meng bmeng.cn@gmail.com
Changes in v2: None
arch/x86/cpu/cpu.c | 1 + arch/x86/include/asm/global_data.h | 1 + 2 files changed, 2 insertions(+)
Applied to u-boot-x86.
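
For context on what the table behind gd->arch.gdt holds: each entry is a 64-bit segment descriptor with the base and limit scattered across it. The helper below is a sketch of the standard x86 descriptor layout, taking arguments in the same (flags, base, limit) order as the GDT_ENTRY() call visible in the diff. demo_gdt_entry() is a hypothetical name and is not claimed to match U-Boot's macro line for line.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack a segment descriptor.
 * flags: bits 7:0 are the access byte, bits 15:12 are G, D/B, L and AVL
 * base:  32-bit segment base
 * limit: 20-bit segment limit
 */
static inline uint64_t demo_gdt_entry(uint16_t flags, uint32_t base,
				      uint32_t limit)
{
	assert(limit <= 0xfffff);

	return (((uint64_t)base & 0xff000000ULL) << 32) |  /* base 31:24 */
	       (((uint64_t)flags & 0x0000f0ffULL) << 40) | /* access + flags */
	       (((uint64_t)limit & 0x000f0000ULL) << 32) | /* limit 19:16 */
	       (((uint64_t)base & 0x00ffffffULL) << 16) |  /* base 23:0 */
	       ((uint64_t)limit & 0x0000ffffULL);          /* limit 15:0 */
}
```

With flags 0xc09b, base 0 and limit 0xfffff this produces the familiar flat 4GB code segment descriptor, which is what the "CS: code, read/execute, 4 GB, base 0" comment in setup_gdt() describes.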

Add a function to return the address of the Interrupt Descriptor Table.
Signed-off-by: Simon Glass sjg@chromium.org
---
Changes in v2: None
arch/x86/cpu/interrupts.c | 5 +++++
arch/x86/include/asm/interrupt.h | 2 ++
2 files changed, 7 insertions(+)
diff --git a/arch/x86/cpu/interrupts.c b/arch/x86/cpu/interrupts.c
index a21d2a6..c777d36 100644
--- a/arch/x86/cpu/interrupts.c
+++ b/arch/x86/cpu/interrupts.c
@@ -147,6 +147,11 @@ int cpu_init_interrupts(void)
 	return 0;
 }
 
+void *x86_get_idt(void)
+{
+	return &idt_ptr;
+}
+
 void __do_irq(int irq)
 {
 	printf("Unhandled IRQ : %d\n", irq);
diff --git a/arch/x86/include/asm/interrupt.h b/arch/x86/include/asm/interrupt.h
index 25abde7..0a75f89 100644
--- a/arch/x86/include/asm/interrupt.h
+++ b/arch/x86/include/asm/interrupt.h
@@ -38,4 +38,6 @@ extern char exception_stack[];
  */
 void configure_irq_trigger(int int_num, bool is_level_triggered);
 
+void *x86_get_idt(void);
+
 #endif

On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Add a function to return the address of the Interrupt Descriptor Table.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/cpu/interrupts.c | 5 +++++ arch/x86/include/asm/interrupt.h | 2 ++ 2 files changed, 7 insertions(+)
Reviewed-by: Bin Meng bmeng.cn@gmail.com

On 28 April 2015 at 23:23, Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Add a function to return the address of the Interrupt Descriptor Table.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/cpu/interrupts.c | 5 +++++ arch/x86/include/asm/interrupt.h | 2 ++ 2 files changed, 7 insertions(+)
Reviewed-by: Bin Meng bmeng.cn@gmail.com
Applied to u-boot-x86.
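
One detail that may help when reading the later patches: x86_get_idt() returns &idt_ptr, which appears to be the address of the small limit/base structure that the lidt instruction loads, rather than the address of the first gate in the table itself. A sketch of that structure for 32-bit mode follows, assuming a compiler that supports the GCC packed attribute. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Operand of lidt/lgdt in 32-bit mode: 16-bit limit, then 32-bit base */
struct __attribute__((packed)) demo_desc_ptr {
	uint16_t limit;	/* size of the table in bytes, minus 1 */
	uint32_t base;	/* linear address of the first entry */
};

/* Describe an IDT of num_entries 8-byte gates located at table_addr */
static inline struct demo_desc_ptr demo_make_idt_ptr(uint32_t table_addr,
						      unsigned int num_entries)
{
	struct demo_desc_ptr p;

	assert(num_entries >= 1 && num_entries <= 256);
	p.limit = (uint16_t)(num_entries * 8 - 1);
	p.base = table_addr;

	return p;
}
```

Without the packed attribute the compiler would normally pad the structure to 8 bytes, and the CPU would then read the base from the wrong offset.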

Most modern x86 CPUs include more than one CPU core. The OS normally requires that these 'Application Processors' (APs) be brought up by the boot loader. Add the required support to U-Boot to init additional APs.
Signed-off-by: Simon Glass sjg@chromium.org
---
Changes in v2: None
arch/x86/Kconfig                     |  25 ++
arch/x86/cpu/Makefile                |   2 +
arch/x86/cpu/ivybridge/model_206ax.c |   4 +-
arch/x86/cpu/mp_init.c               | 507 +++++++++++++++++++++++++++++++++++
arch/x86/cpu/sipi.S                  | 215 +++++++++++++++
arch/x86/include/asm/mp.h            |  94 +++++++
arch/x86/include/asm/sipi.h          |  79 ++++++
arch/x86/include/asm/smm.h           |  14 +
8 files changed, 938 insertions(+), 2 deletions(-)
create mode 100644 arch/x86/cpu/mp_init.c
create mode 100644 arch/x86/cpu/sipi.S
create mode 100644 arch/x86/include/asm/mp.h
create mode 100644 arch/x86/include/asm/sipi.h
create mode 100644 arch/x86/include/asm/smm.h
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index f38e9ba..f89ee5c 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -361,6 +361,31 @@ config FSP_TEMP_RAM_ADDR Stack top address which is used in FspInit after DRAM is ready and CAR is disabled.
+config MAX_CPUS + int "Maximum number of CPUs permitted" + default 4 + help + When using multi-CPU chips it is possible for U-Boot to start up + more than one CPU. The stack memory used by all of these CPUs is + pre-allocated so at present U-Boot wants to know the maximum + number of CPUs that may be present. Set this to at least as high + as the number of CPUs in your system (it uses about 4KB of RAM for + each CPU). + +config SMP + bool "Enable Symmetric Multiprocessing" + default n + help + Enable use of more than one CPU in U-Boot and the Operating System + when loaded. Each CPU will be started up and information can be + obtained using the 'cpu' command. If this option is disabled, then + only one CPU will be enabled regardless of the number of CPUs + available. + +config STACK_SIZE + hex + default 0x1000 + config TSC_CALIBRATION_BYPASS bool "Bypass Time-Stamp Counter (TSC) calibration" default n diff --git a/arch/x86/cpu/Makefile b/arch/x86/cpu/Makefile index 6ded0a7..9a08ab4 100644 --- a/arch/x86/cpu/Makefile +++ b/arch/x86/cpu/Makefile @@ -19,6 +19,8 @@ obj-$(CONFIG_NORTHBRIDGE_INTEL_IVYBRIDGE) += ivybridge/ obj-$(CONFIG_INTEL_QUARK) += quark/ obj-$(CONFIG_INTEL_QUEENSBAY) += queensbay/ obj-y += lapic.o +obj-$(CONFIG_SMP) += mp_init.o obj-y += mtrr.o obj-$(CONFIG_PCI) += pci.o +obj-$(CONFIG_SMP) += sipi.o obj-y += turbo.o diff --git a/arch/x86/cpu/ivybridge/model_206ax.c b/arch/x86/cpu/ivybridge/model_206ax.c index 11dc625..8b08c40 100644 --- a/arch/x86/cpu/ivybridge/model_206ax.c +++ b/arch/x86/cpu/ivybridge/model_206ax.c @@ -435,8 +435,8 @@ static int intel_cores_init(struct x86_cpu_priv *cpu)
debug("CPU: %u has core %u\n", cpu->apic_id, new_cpu->apic_id);
-#if CONFIG_SMP && CONFIG_MAX_CPUS > 1 - /* Start the new cpu */ +#if 0 && CONFIG_SMP && CONFIG_MAX_CPUS > 1 + /* TODO(sjg@chromium.org): Start the new cpu */ if (!start_cpu(new_cpu)) { /* Record the error in cpu? */ printk(BIOS_ERR, "CPU %u would not start!\n", diff --git a/arch/x86/cpu/mp_init.c b/arch/x86/cpu/mp_init.c new file mode 100644 index 0000000..c1fd4e0 --- /dev/null +++ b/arch/x86/cpu/mp_init.c @@ -0,0 +1,507 @@ +/* + * Copyright (C) 2015 Google, Inc + * + * SPDX-License-Identifier: GPL-2.0+ + * + * Based on code from the coreboot file of the same name + */ + +#include <common.h> +#include <cpu.h> +#include <dm.h> +#include <errno.h> +#include <malloc.h> +#include <asm/atomic.h> +#include <asm/cpu.h> +#include <asm/interrupt.h> +#include <asm/lapic.h> +#include <asm/mp.h> +#include <asm/mtrr.h> +#include <asm/sipi.h> +#include <asm/smm.h> +#include <dm/device-internal.h> +#include <dm/uclass-internal.h> +#include <linux/linkage.h> + +/* This also needs to match the sipi.S assembly code for saved MSR encoding */ +struct saved_msr { + uint32_t index; + uint32_t lo; + uint32_t hi; +} __packed; + + +/* + * The SIPI vector is loaded at the SMM_DEFAULT_BASE. The reason is that the + * memory range is already reserved so the OS cannot use it. That region is + * free to use for AP bringup before SMM is initialised. + */ +static const uint32_t sipi_vector_location = SMM_DEFAULT_BASE; +static const int sipi_vector_location_size = SMM_DEFAULT_SIZE; + +struct mp_flight_plan { + int num_records; + struct mp_flight_record *records; +}; + +static struct mp_flight_plan mp_info; + +struct cpu_map { + struct udevice *dev; + int apic_id; + int err_code; +}; + +static inline void barrier_wait(atomic_t *b) +{ + while (atomic_read(b) == 0) + asm("pause"); + mfence(); +} + +static inline void release_barrier(atomic_t *b) +{ + mfence(); + atomic_set(b, 1); +} + +/* Returns 1 if timeout waiting for APs. 
0 if target APs found */ +static int wait_for_aps(atomic_t *val, int target, int total_delay, + int delay_step) +{ + int timeout = 0; + int delayed = 0; + + while (atomic_read(val) != target) { + udelay(delay_step); + delayed += delay_step; + if (delayed >= total_delay) { + timeout = 1; + break; + } + } + + return timeout; +} + +static void ap_do_flight_plan(struct udevice *cpu) +{ + int i; + + for (i = 0; i < mp_info.num_records; i++) { + struct mp_flight_record *rec = &mp_info.records[i]; + + atomic_inc(&rec->cpus_entered); + barrier_wait(&rec->barrier); + + if (rec->ap_call != NULL) + rec->ap_call(cpu, rec->ap_arg); + } +} + +static int find_cpu_by_apid_id(int apic_id, struct udevice **devp) +{ + struct udevice *dev; + + *devp = NULL; + for (uclass_find_first_device(UCLASS_CPU, &dev); + dev; + uclass_find_next_device(&dev)) { + struct cpu_platdata *plat = dev_get_parent_platdata(dev); + + if (plat->cpu_id == apic_id) { + *devp = dev; + return 0; + } + } + + return -ENOENT; +} + +/* + * By the time APs call ap_init() caching has been setup, and microcode has + * been loaded + */ +static void asmlinkage ap_init(unsigned int cpu_index) +{ + struct udevice *dev; + int apic_id; + int ret; + + /* Ensure the local apic is enabled */ + enable_lapic(); + + apic_id = lapicid(); + ret = find_cpu_by_apid_id(apic_id, &dev); + if (ret) { + debug("Unknown CPU apic_id %x\n", apic_id); + goto done; + } + + debug("AP: slot %d apic_id %x, dev %s\n", cpu_index, apic_id, + dev ? 
dev->name : "(apic_id not found)"); + + /* Walk the flight plan */ + ap_do_flight_plan(dev); + + /* Park the AP */ + debug("parking\n"); +done: + stop_this_cpu(); +} + +#define NUM_FIXED_MTRRS 11 + +static const unsigned int fixed_mtrrs[NUM_FIXED_MTRRS] = { + MTRR_FIX_64K_00000_MSR, MTRR_FIX_16K_80000_MSR, MTRR_FIX_16K_A0000_MSR, + MTRR_FIX_4K_C0000_MSR, MTRR_FIX_4K_C8000_MSR, MTRR_FIX_4K_D0000_MSR, + MTRR_FIX_4K_D8000_MSR, MTRR_FIX_4K_E0000_MSR, MTRR_FIX_4K_E8000_MSR, + MTRR_FIX_4K_F0000_MSR, MTRR_FIX_4K_F8000_MSR, +}; + +static inline struct saved_msr *save_msr(int index, struct saved_msr *entry) +{ + msr_t msr; + + msr = msr_read(index); + entry->index = index; + entry->lo = msr.lo; + entry->hi = msr.hi; + + /* Return the next entry */ + entry++; + return entry; +} + +static int save_bsp_msrs(char *start, int size) +{ + int msr_count; + int num_var_mtrrs; + struct saved_msr *msr_entry; + int i; + msr_t msr; + + /* Determine number of MTRRs need to be saved */ + msr = msr_read(MTRR_CAP_MSR); + num_var_mtrrs = msr.lo & 0xff; + + /* 2 * num_var_mtrrs for base and mask. 
+1 for IA32_MTRR_DEF_TYPE */ + msr_count = 2 * num_var_mtrrs + NUM_FIXED_MTRRS + 1; + + if ((msr_count * sizeof(struct saved_msr)) > size) { + printf("Cannot mirror all %d msrs.\n", msr_count); + return -ENOSPC; + } + + msr_entry = (void *)start; + for (i = 0; i < NUM_FIXED_MTRRS; i++) + msr_entry = save_msr(fixed_mtrrs[i], msr_entry); + + for (i = 0; i < num_var_mtrrs; i++) { + msr_entry = save_msr(MTRR_PHYS_BASE_MSR(i), msr_entry); + msr_entry = save_msr(MTRR_PHYS_MASK_MSR(i), msr_entry); + } + + msr_entry = save_msr(MTRR_DEF_TYPE_MSR, msr_entry); + + return msr_count; +} + +static int load_sipi_vector(atomic_t **ap_countp) +{ + struct sipi_params_16bit *params16; + struct sipi_params *params; + static char msr_save[512]; + char *stack; + ulong addr; + int code_len; + int size; + int ret; + + /* Copy in the code */ + code_len = ap_start16_code_end - ap_start16; + debug("Copying SIPI code to %x: %d bytes\n", SMM_DEFAULT_BASE, + code_len); + memcpy((void *)SMM_DEFAULT_BASE, ap_start16, code_len); + + addr = SMM_DEFAULT_BASE + (ulong)sipi_params_16bit - (ulong)ap_start16; + params16 = (struct sipi_params_16bit *)addr; + params16->ap_start = (uint32_t)ap_start; + params16->gdt = (uint32_t)gd->arch.gdt; + params16->gdt_limit = X86_GDT_SIZE - 1; + debug("gdt = %x, gdt_limit = %x\n", params16->gdt, params16->gdt_limit); + + params = (struct sipi_params *)sipi_params; + debug("SIPI 32-bit params at %p\n", params); + params->idt_ptr = (uint32_t)x86_get_idt(); + + params->stack_size = 4096; + size = params->stack_size * CONFIG_MAX_CPUS; + stack = memalign(4096, size); + if (!stack) + return -ENOMEM; + params->stack_top = (u32)(stack + size); + + params->microcode_ptr = 0; + params->msr_table_ptr = (u32)msr_save; + ret = save_bsp_msrs(msr_save, sizeof(msr_save)); + if (ret < 0) + return ret; + params->msr_count = ret; + + params->c_handler = (uint32_t)&ap_init; + + *ap_countp = &params->ap_count; + atomic_set(*ap_countp, 0); + debug("SIPI vector is ready\n"); + + return 0; +} +
+static int check_cpu_devices(int expected_cpus) +{ + int i; + + for (i = 0; i < expected_cpus; i++) { + struct udevice *dev; + int ret; + + ret = uclass_find_device(UCLASS_CPU, i, &dev); + if (ret) { + debug("Cannot find CPU %d in device tree\n", i); + return ret; + } + } + + return 0; +} + +/* Returns 1 for timeout. 0 on success */ +static int apic_wait_timeout(int total_delay, int delay_step) +{ + int total = 0; + int timeout = 0; + + while (lapic_read(LAPIC_ICR) & LAPIC_ICR_BUSY) { + udelay(delay_step); + total += delay_step; + if (total >= total_delay) { + timeout = 1; + break; + } + } + + return timeout; +} + +static int start_aps(int ap_count, atomic_t *num_aps) +{ + int sipi_vector; + /* Max location is 4KiB below 1MiB */ + const int max_vector_loc = ((1 << 20) - (1 << 12)) >> 12; + + if (ap_count == 0) + return 0; + + /* The vector is sent as a 4k aligned address in one byte */ + sipi_vector = sipi_vector_location >> 12; + + if (sipi_vector > max_vector_loc) { + printf("SIPI vector too large! 0x%08x\n", + sipi_vector); + return -1; + } + + debug("Attempting to start %d APs\n", ap_count); + + if ((lapic_read(LAPIC_ICR) & LAPIC_ICR_BUSY)) { + debug("Waiting for ICR not to be busy..."); + if (apic_wait_timeout(1000 /* 1 ms */, 50)) { + debug("timed out. Aborting.\n"); + return -1; + } else { + debug("done.\n"); + } + } + + /* Send INIT IPI to all but self */ + lapic_write_around(LAPIC_ICR2, SET_LAPIC_DEST_FIELD(0)); + lapic_write_around(LAPIC_ICR, LAPIC_DEST_ALLBUT | LAPIC_INT_ASSERT | + LAPIC_DM_INIT); + debug("Waiting for 10ms after sending INIT.\n"); + mdelay(10); + + /* Send 1st SIPI */ + if ((lapic_read(LAPIC_ICR) & LAPIC_ICR_BUSY)) { + debug("Waiting for ICR not to be busy..."); + if (apic_wait_timeout(1000 /* 1 ms */, 50)) { + debug("timed out. 
Aborting.\n"); + return -1; + } else { + debug("done.\n"); + } + } + + lapic_write_around(LAPIC_ICR2, SET_LAPIC_DEST_FIELD(0)); + lapic_write_around(LAPIC_ICR, LAPIC_DEST_ALLBUT | LAPIC_INT_ASSERT | + LAPIC_DM_STARTUP | sipi_vector); + debug("Waiting for 1st SIPI to complete..."); + if (apic_wait_timeout(10000 /* 10 ms */, 50 /* us */)) { + debug("timed out.\n"); + return -1; + } else { + debug("done.\n"); + } + + /* Wait for CPUs to check in up to 200 us */ + wait_for_aps(num_aps, ap_count, 200 /* us */, 15 /* us */); + + /* Send 2nd SIPI */ + if ((lapic_read(LAPIC_ICR) & LAPIC_ICR_BUSY)) { + debug("Waiting for ICR not to be busy..."); + if (apic_wait_timeout(1000 /* 1 ms */, 50)) { + debug("timed out. Aborting.\n"); + return -1; + } else { + debug("done.\n"); + } + } + + lapic_write_around(LAPIC_ICR2, SET_LAPIC_DEST_FIELD(0)); + lapic_write_around(LAPIC_ICR, LAPIC_DEST_ALLBUT | LAPIC_INT_ASSERT | + LAPIC_DM_STARTUP | sipi_vector); + debug("Waiting for 2nd SIPI to complete..."); + if (apic_wait_timeout(10000 /* 10 ms */, 50 /* us */)) { + debug("timed out.\n"); + return -1; + } else { + debug("done.\n"); + } + + /* Wait for CPUs to check in */ + if (wait_for_aps(num_aps, ap_count, 10000 /* 10 ms */, 50 /* us */)) { + debug("Not all APs checked in: %d/%d.\n", + atomic_read(num_aps), ap_count); + return -1; + } + + return 0; +} + +static int bsp_do_flight_plan(struct udevice *cpu, struct mp_params *mp_params) +{ + int i; + int ret = 0; + const int timeout_us = 100000; + const int step_us = 100; + int num_aps = mp_params->num_cpus - 1; + + for (i = 0; i < mp_params->num_records; i++) { + struct mp_flight_record *rec = &mp_params->flight_plan[i]; + + /* Wait for APs if the record is not released */ + if (atomic_read(&rec->barrier) == 0) { + /* Wait for the APs to check in */ + if (wait_for_aps(&rec->cpus_entered, num_aps, + timeout_us, step_us)) { + debug("MP record %d timeout.\n", i); + ret = -1; + } + } + + if (rec->bsp_call != NULL) + rec->bsp_call(cpu, 
rec->bsp_arg); + + release_barrier(&rec->barrier); + } + return ret; +} + +static int init_bsp(struct udevice **devp) +{ + char processor_name[CPU_MAX_NAME_LEN]; + int apic_id; + int ret; + + cpu_get_name(processor_name); + debug("CPU: %s.\n", processor_name); + + enable_lapic(); + + apic_id = lapicid(); + ret = find_cpu_by_apid_id(apic_id, devp); + if (ret) { + printf("Cannot find boot CPU, APIC ID %d\n", apic_id); + return ret; + } + + return 0; +} + +int mp_init(struct mp_params *p) +{ + int num_aps; + atomic_t *ap_count; + struct udevice *cpu; + int ret; + + /* This will cause the CPUs devices to be bound */ + struct uclass *uc; + ret = uclass_get(UCLASS_CPU, &uc); + if (ret) + return ret; + + ret = init_bsp(&cpu); + if (ret) { + debug("Cannot init boot CPU: err=%d\n", ret); + return ret; + } + + if (p == NULL || p->flight_plan == NULL || p->num_records < 1) { + printf("Invalid MP parameters\n"); + return -1; + } + + ret = check_cpu_devices(p->num_cpus); + if (ret) + debug("Warning: Device tree does not describe all CPUs. 
Extra ones will not be started correctly\n"); + + /* Copy needed parameters so that APs have a reference to the plan */ + mp_info.num_records = p->num_records; + mp_info.records = p->flight_plan; + + /* Load the SIPI vector */ + ret = load_sipi_vector(&ap_count); + if (ret) + return ret; + + /* + * Make sure SIPI data hits RAM so the APs that come up will see + * the startup code even if the caches are disabled + */ + wbinvd(); + + /* Start the APs providing number of APs and the cpus_entered field */ + num_aps = p->num_cpus - 1; + ret = start_aps(num_aps, ap_count); + if (ret) { + mdelay(1000); + debug("%d/%d eventually checked in?\n", atomic_read(ap_count), + num_aps); + return ret; + } + + /* Walk the flight plan for the BSP */ + ret = bsp_do_flight_plan(cpu, p); + if (ret) { + debug("CPU init failed: err=%d\n", ret); + return ret; + } + + return 0; +} + +int mp_init_cpu(struct udevice *cpu, void *unused) +{ + return device_probe(cpu); +} diff --git a/arch/x86/cpu/sipi.S b/arch/x86/cpu/sipi.S new file mode 100644 index 0000000..09fde21 --- /dev/null +++ b/arch/x86/cpu/sipi.S @@ -0,0 +1,215 @@ +/* + * Copyright (c) 2015 Google, Inc + * + * SPDX-License-Identifier: GPL-2.0 + * + * Taken from coreboot file of the same name + */ + +/* + * The SIPI vector is responsible for initializing the APs in the system. It + * loads microcode, sets up MSRs, and enables caching before calling into + * C code + */ + +#include <asm/global_data.h> +#include <asm/msr-index.h> +#include <asm/processor.h> +#include <asm/processor-flags.h> +#include <asm/smm.h> + +#define CODE_SEG (X86_GDT_ENTRY_32BIT_CS * X86_GDT_ENTRY_SIZE) +#define DATA_SEG (X86_GDT_ENTRY_32BIT_DS * X86_GDT_ENTRY_SIZE) + +#define o32 .byte 0x66; + +/* + * First we have the 16-bit section. Every AP process starts here. + * The simple task is to load U-Boot's Global Descriptor Table (GDT) to allow + * U-Boot's 32-bit code to become visible, then jump to ap_start32. 
+ * + * Note that this code is copied to RAM below 1MB in mp_init.c, and runs from + * there, but the 32-bit code (ap_start32 and onwards) is part of U-Boot and + * is therefore relocated to the top of RAM with other U-Boot code. This + * means that for the 16-bit code we must write relocatable code, but for the + * rest, we can do what we like. + */ +.text +.code16 +.globl ap_start16 +ap_start16: + cli + xorl %eax, %eax + movl %eax, %cr3 /* Invalidate TLB */ + + /* setup the data segment */ + movw %cs, %ax + movw %ax, %ds + + /* Use an address relative to the data segment for the GDT */ + movl $gdtaddr, %ebx + subl $ap_start16, %ebx + + data32 lgdt (%ebx) + + movl %cr0, %eax + andl $0x7ffaffd1, %eax /* PG, AM, WP, NE, TS, EM, MP = 0 */ + orl $0x60000001, %eax /* CD, NW, PE = 1 */ + movl %eax, %cr0 + + movl $ap_start_jmp, %eax + subl $ap_start16, %eax + movw %ax, %bp + + /* Jump to ap_start32 within U-Boot */ +o32 cs ljmp *(%bp) + + .align 4 +.globl sipi_params_16bit +sipi_params_16bit: + /* 48-bit far pointer */ +ap_start_jmp: + .long 0 /* offset set to ap_start by U-Boot */ + .word CODE_SEG /* segment */ + + .word 0 /* padding */ +gdtaddr: + .word 0 /* limit */ + .long 0 /* table */ + .word 0 /* unused */ + +.globl ap_start16_code_end +ap_start16_code_end: + +/* + * Set up the special 'fs' segment for global_data. Then jump to ap_continue + * to set up the AP. 
+ */ +.globl ap_start +ap_start: + .code32 + movw $DATA_SEG, %ax + movw %ax, %ds + movw %ax, %es + movw %ax, %ss + movw %ax, %gs + + movw $(X86_GDT_ENTRY_32BIT_FS * X86_GDT_ENTRY_SIZE), %ax + movw %ax, %fs + + /* Load the Interrupt descriptor table */ + mov idt_ptr, %ebx + lidt (%ebx) + + /* Obtain cpu number */ + movl ap_count, %eax +1: + movl %eax, %ecx + inc %ecx + lock cmpxchg %ecx, ap_count + jnz 1b + + /* Setup stacks for each CPU */ + movl stack_size, %eax + mul %ecx + movl stack_top, %edx + subl %eax, %edx + mov %edx, %esp + /* Save cpu number */ + mov %ecx, %esi + + /* Determine if one should check microcode versions */ + mov microcode_ptr, %edi + test %edi, %edi + jz microcode_done /* Bypass if no microcode exists */ + + /* Get the Microcode version */ + mov $1, %eax + cpuid + mov $MSR_IA32_UCODE_REV, %ecx + rdmsr + /* If something already loaded skip loading again */ + test %edx, %edx + jnz microcode_done + + /* Determine if parallel microcode loading is allowed */ + cmp $0xffffffff, microcode_lock + je load_microcode + + /* Protect microcode loading */ +lock_microcode: + lock bts $0, microcode_lock + jc lock_microcode + +load_microcode: + /* Load new microcode */ + mov $MSR_IA32_UCODE_WRITE, %ecx + xor %edx, %edx + mov %edi, %eax + /* The microcode pointer is passed in pointing to the header. Adjust + * pointer to reflect the payload (header size is 48 bytes) */ + add $48, %eax + pusha + wrmsr + popa + + /* Unconditionally unlock microcode loading */ + cmp $0xffffffff, microcode_lock + je microcode_done + + xor %eax, %eax + mov %eax, microcode_lock + +microcode_done: + /* + * Load MSRs. Each entry in the table consists of: + * 0: index, + * 4: value[31:0] + * 8: value[63:32] + * See struct saved_msr in mp_init.c. 
+ */ + mov msr_table_ptr, %edi + mov msr_count, %ebx + test %ebx, %ebx + jz 1f +load_msr: + mov (%edi), %ecx + mov 4(%edi), %eax + mov 8(%edi), %edx + wrmsr + add $12, %edi + dec %ebx + jnz load_msr + +1: + /* Enable caching */ + mov %cr0, %eax + and $0x9fffffff, %eax /* CD, NW = 0 */ + mov %eax, %cr0 + + /* c_handler(cpu_num) */ + push %esi /* cpu_num */ + mov c_handler, %eax + call *%eax + + .align 4 +.globl sipi_params +sipi_params: +idt_ptr: + .long 0 +stack_top: + .long 0 +stack_size: + .long 0 +microcode_lock: + .long 0 +microcode_ptr: + .long 0 +msr_table_ptr: + .long 0 +msr_count: + .long 0 +c_handler: + .long 0 +ap_count: + .long 0 diff --git a/arch/x86/include/asm/mp.h b/arch/x86/include/asm/mp.h new file mode 100644 index 0000000..5dc0e33 --- /dev/null +++ b/arch/x86/include/asm/mp.h @@ -0,0 +1,94 @@ +/* + * Copyright (c) 2015 Google, Inc + * + * SPDX-License-Identifier: GPL-2.0 + * + * Taken from coreboot file of the same name + */ + +#ifndef _X86_MP_H_ +#define _X86_MP_H_ + +#include <asm/atomic.h> + +typedef int (*mp_callback_t)(struct udevice *cpu, void *arg); + +/* + * A mp_flight_record details a sequence of calls for the APs to perform + * along with the BSP to coordinate sequencing. Each flight record either + * provides a barrier for each AP before calling the callback or the APs + * are allowed to perform the callback without waiting. Regardless, each + * record has the cpus_entered field incremented for each record. When + * the BSP observes that the cpus_entered matches the number of APs + * the bsp_call is called with bsp_arg and upon returning releases the + * barrier allowing the APs to make further progress. + * + * Note that ap_call() and bsp_call() can be NULL. In the NULL case the + * callback will just not be called. 
+ */ +struct mp_flight_record { + atomic_t barrier; + atomic_t cpus_entered; + mp_callback_t ap_call; + void *ap_arg; + mp_callback_t bsp_call; + void *bsp_arg; +} __attribute__((aligned(ARCH_DMA_MINALIGN))); + +#define _MP_FLIGHT_RECORD(barrier_, ap_func_, ap_arg_, bsp_func_, bsp_arg_) \ + { \ + .barrier = ATOMIC_INIT(barrier_), \ + .cpus_entered = ATOMIC_INIT(0), \ + .ap_call = ap_func_, \ + .ap_arg = ap_arg_, \ + .bsp_call = bsp_func_, \ + .bsp_arg = bsp_arg_, \ + } + +#define MP_FR_BLOCK_APS(ap_func_, ap_arg_, bsp_func_, bsp_arg_) \ + _MP_FLIGHT_RECORD(0, ap_func_, ap_arg_, bsp_func_, bsp_arg_) + +#define MP_FR_NOBLOCK_APS(ap_func_, ap_arg_, bsp_func_, bsp_arg_) \ + _MP_FLIGHT_RECORD(1, ap_func_, ap_arg_, bsp_func_, bsp_arg_) + +/* + * The mp_params structure provides the arguments to the mp subsystem + * for bringing up APs. + * + * At present this is overkill for U-Boot, but it may make it easier to add + * SMM support. + */ +struct mp_params { + int num_cpus; /* Total cpus include BSP */ + int parallel_microcode_load; + const void *microcode_pointer; + /* Flight plan for APs and BSP */ + struct mp_flight_record *flight_plan; + int num_records; +}; + +/* + * mp_init() will set up the SIPI vector and bring up the APs according to + * mp_params. Each flight record will be executed according to the plan. Note + * that the MP infrastructure uses SMM default area without saving it. It's + * up to the chipset or mainboard to either e820 reserve this area or save this + * region prior to calling mp_init() and restoring it after mp_init returns. + * + * At the time mp_init() is called the MTRR MSRs are mirrored into APs then + * caching is enabled before running the flight plan. + * + * The MP init has the following properties: + * 1. APs are brought up in parallel. + * 2. The ordering of coreboot cpu number and APIC ids is not deterministic. 
+ * Therefore, one cannot rely on this property or the order of devices in + * the device tree unless the chipset or mainboard know the APIC ids + * a priori. + * + * mp_init() returns < 0 on error, 0 on success. + */ +int mp_init(struct mp_params *params); + +/* Probes the CPU device */ +int mp_init_cpu(struct udevice *cpu, void *unused); + +#endif /* _X86_MP_H_ */ diff --git a/arch/x86/include/asm/sipi.h b/arch/x86/include/asm/sipi.h new file mode 100644 index 0000000..bb2f0de --- /dev/null +++ b/arch/x86/include/asm/sipi.h @@ -0,0 +1,79 @@ +/* + * Copyright (c) 2015 Gooogle, Inc + * Written by Simon Glass sjg@chromium.org + * + * SPDX-License-Identifier: GPL-2.0+ + */ + +#ifndef _ASM_SIPI_H +#define _ASM_SIPI_H + +/** + * struct sipi_params_16bit - 16-bit SIPI entry-point parameters + * + * These are set up in the same space as the SIPI 16-bit code so that each AP + * can access the parameters when it boots. + * + * Each of these must be set up for the AP to boot, except @segment which is + * set in the assembly code. + * + * @ap_start: 32-bit SIPI entry point for U-Boot + * @segment: Code segment for U-Boot + * @pad: Padding (not used) + * @gdt_limit: U-Boot GDT limit (X86_GDT_SIZE - 1) + * @gdt: U-Boot GDT (gd->arch.gdt) + * @unused: Not used + */ +struct __packed sipi_params_16bit { + u32 ap_start; + u16 segment; + u16 pad; + + u16 gdt_limit; + u32 gdt; + u16 unused; +}; + +/** + * struct sipi_params - 32-bit SIP entry-point parameters + * + * These are used by the AP init code and must be set up before the APs start. + * + * The stack area extends down from @stack_top, with @stack_size allocated + * for each AP. 
+ * + * @idt_ptr: Interrupt descriptor table pointer + * @stack_top: Top of the AP stack area + * @stack_size: Size of each AP's stack + * @microcode_lock: Used to ensure only one AP loads microcode at once + * @microcode_ptr: Pointer to microcode, or 0 if none + * @msr_table_ptr: Pointer to saved MSRs, a list of struct saved_msr + * @msr_count: Number of saved MSRs + * @c_handler: C function to call once early init is complete + * @ap_count: Shared atomic value to allocate CPU indexes + */ +struct sipi_params { + u32 idt_ptr; + u32 stack_top; + u32 stack_size; + u32 microcode_lock; + u32 microcode_ptr; + u32 msr_table_ptr; + u32 msr_count; + u32 c_handler; + atomic_t ap_count; +}; + +/* 16-bit AP entry point */ +void ap_start16(void); + +/* end of 16-bit code/data, marks the region to be copied to SIP vector */ +void ap_start16_code_end(void); + +/* 32-bit AP entry point */ +void ap_start(void); + +extern char sipi_params_16bit[]; +extern char sipi_params[]; + +#endif diff --git a/arch/x86/include/asm/smm.h b/arch/x86/include/asm/smm.h new file mode 100644 index 0000000..79b4a8e --- /dev/null +++ b/arch/x86/include/asm/smm.h @@ -0,0 +1,14 @@ +/* + * Copyright (c) 2015 Google, Inc + * Copyright (C) 2008-2009 coresystems GmbH + * + * SPDX-License-Identifier: GPL-2.0 + */ + +#ifndef CPU_X86_SMM_H +#define CPU_X86_SMM_H + +#define SMM_DEFAULT_BASE 0x30000 +#define SMM_DEFAULT_SIZE 0x10000 + +#endif

Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Most modern x86 CPUs include more than one CPU core. The OS normally requires that these 'Application Processors' (APs) be brought up by the boot loader. Add the required support to U-Boot to init additional APs.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/Kconfig | 25 ++ arch/x86/cpu/Makefile | 2 + arch/x86/cpu/ivybridge/model_206ax.c | 4 +- arch/x86/cpu/mp_init.c | 507 +++++++++++++++++++++++++++++++++++ arch/x86/cpu/sipi.S | 215 +++++++++++++++ arch/x86/include/asm/mp.h | 94 +++++++ arch/x86/include/asm/sipi.h | 79 ++++++ arch/x86/include/asm/smm.h | 14 + 8 files changed, 938 insertions(+), 2 deletions(-) create mode 100644 arch/x86/cpu/mp_init.c create mode 100644 arch/x86/cpu/sipi.S create mode 100644 arch/x86/include/asm/mp.h create mode 100644 arch/x86/include/asm/sipi.h create mode 100644 arch/x86/include/asm/smm.h
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index f38e9ba..f89ee5c 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -361,6 +361,31 @@ config FSP_TEMP_RAM_ADDR Stack top address which is used in FspInit after DRAM is ready and CAR is disabled.
+config MAX_CPUS
int "Maximum number of CPUs permitted"
default 4
help
When using multi-CPU chips it is possible for U-Boot to start up
more than one CPU. The stack memory used by all of these CPUs is
pre-allocated so at present U-Boot wants to know the maximum
number of CPUs that may be present. Set this to at least as high
as the number of CPUs in your system (it uses about 4KB of RAM for
each CPU).
+config SMP
bool "Enable Symmetric Multiprocessing"
default n
help
Enable use of more than one CPU in U-Boot and the Operating System
when loaded. Each CPU will be started up and information can be
obtained using the 'cpu' command. If this option is disabled, then
only one CPU will be enabled regardless of the number of CPUs
available.
+config STACK_SIZE
hex
default 0x1000
The name looks too generic. Maybe AP_STACK_SIZE? Please also include a help text.
config TSC_CALIBRATION_BYPASS bool "Bypass Time-Stamp Counter (TSC) calibration" default n diff --git a/arch/x86/cpu/Makefile b/arch/x86/cpu/Makefile index 6ded0a7..9a08ab4 100644 --- a/arch/x86/cpu/Makefile +++ b/arch/x86/cpu/Makefile @@ -19,6 +19,8 @@ obj-$(CONFIG_NORTHBRIDGE_INTEL_IVYBRIDGE) += ivybridge/ obj-$(CONFIG_INTEL_QUARK) += quark/ obj-$(CONFIG_INTEL_QUEENSBAY) += queensbay/ obj-y += lapic.o +obj-$(CONFIG_SMP) += mp_init.o obj-y += mtrr.o obj-$(CONFIG_PCI) += pci.o +obj-$(CONFIG_SMP) += sipi.o obj-y += turbo.o diff --git a/arch/x86/cpu/ivybridge/model_206ax.c b/arch/x86/cpu/ivybridge/model_206ax.c index 11dc625..8b08c40 100644 --- a/arch/x86/cpu/ivybridge/model_206ax.c +++ b/arch/x86/cpu/ivybridge/model_206ax.c @@ -435,8 +435,8 @@ static int intel_cores_init(struct x86_cpu_priv *cpu)
debug("CPU: %u has core %u\n", cpu->apic_id, new_cpu->apic_id);
-#if CONFIG_SMP && CONFIG_MAX_CPUS > 1
-		/* Start the new cpu */
+#if 0 && CONFIG_SMP && CONFIG_MAX_CPUS > 1
+		/* TODO(sjg@chromium.org): Start the new cpu */
 		if (!start_cpu(new_cpu)) {
 			/* Record the error in cpu? */
 			printk(BIOS_ERR, "CPU %u would not start!\n",
diff --git a/arch/x86/cpu/mp_init.c b/arch/x86/cpu/mp_init.c new file mode 100644 index 0000000..c1fd4e0 --- /dev/null +++ b/arch/x86/cpu/mp_init.c @@ -0,0 +1,507 @@ +/*
 * Copyright (C) 2015 Google, Inc
 * SPDX-License-Identifier: GPL-2.0+
 * Based on code from the coreboot file of the same name
 */
+#include <common.h> +#include <cpu.h> +#include <dm.h> +#include <errno.h> +#include <malloc.h> +#include <asm/atomic.h> +#include <asm/cpu.h> +#include <asm/interrupt.h> +#include <asm/lapic.h> +#include <asm/mp.h> +#include <asm/mtrr.h> +#include <asm/sipi.h> +#include <asm/smm.h> +#include <dm/device-internal.h> +#include <dm/uclass-internal.h> +#include <linux/linkage.h>
+/* This also needs to match the sipi.S assembly code for saved MSR encoding */ +struct saved_msr {
uint32_t index;
uint32_t lo;
uint32_t hi;
+} __packed;
+/*
 * The SIPI vector is loaded at the SMM_DEFAULT_BASE. The reason is at the
 * memory range is already reserved so the OS cannot use it. That region is
 * free to use for AP bringup before SMM is initialised.
 */
+static const uint32_t sipi_vector_location = SMM_DEFAULT_BASE; +static const int sipi_vector_location_size = SMM_DEFAULT_SIZE;
Since they are static and const, we can just use the macros directly without declaring variables.
+struct mp_flight_plan {
int num_records;
struct mp_flight_record *records;
+};
+static struct mp_flight_plan mp_info;
+struct cpu_map {
struct udevice *dev;
int apic_id;
int err_code;
+};
+static inline void barrier_wait(atomic_t *b) +{
while (atomic_read(b) == 0)
asm("pause");
mfence();
+}
+static inline void release_barrier(atomic_t *b) +{
mfence();
atomic_set(b, 1);
+}
+/* Returns 1 if timeout waiting for APs. 0 if target APs found */ +static int wait_for_aps(atomic_t *val, int target, int total_delay,
int delay_step)
+{
int timeout = 0;
int delayed = 0;
while (atomic_read(val) != target) {
udelay(delay_step);
delayed += delay_step;
if (delayed >= total_delay) {
timeout = 1;
break;
}
}
return timeout;
+}
+static void ap_do_flight_plan(struct udevice *cpu) +{
int i;
for (i = 0; i < mp_info.num_records; i++) {
struct mp_flight_record *rec = &mp_info.records[i];
atomic_inc(&rec->cpus_entered);
barrier_wait(&rec->barrier);
if (rec->ap_call != NULL)
rec->ap_call(cpu, rec->ap_arg);
}
+}
+static int find_cpu_by_apid_id(int apic_id, struct udevice **devp) +{
struct udevice *dev;
*devp = NULL;
for (uclass_find_first_device(UCLASS_CPU, &dev);
dev;
uclass_find_next_device(&dev)) {
struct cpu_platdata *plat = dev_get_parent_platdata(dev);
if (plat->cpu_id == apic_id) {
*devp = dev;
return 0;
}
}
return -ENOENT;
+}
+/*
 * By the time APs call ap_init() caching has been setup, and microcode has
 * been loaded
 */
+static void asmlinkage ap_init(unsigned int cpu_index)
Is asmlinkage a must? I think we can pass cpu_index via eax.
+{
struct udevice *dev;
int apic_id;
int ret;
/* Ensure the local apic is enabled */
enable_lapic();
apic_id = lapicid();
ret = find_cpu_by_apid_id(apic_id, &dev);
if (ret) {
debug("Unknown CPU apic_id %x\n", apic_id);
goto done;
}
debug("AP: slot %d apic_id %x, dev %s\n", cpu_index, apic_id,
dev ? dev->name : "(apic_id not found)");
/* Walk the flight plan */
ap_do_flight_plan(dev);
/* Park the AP */
debug("parking\n");
+done:
stop_this_cpu();
+}
+#define NUM_FIXED_MTRRS 11
Can we move this to the MTRR patch (#11 of this series)? So that NUM_FIXED_RANGES in patch#11 can be defined as
#define NUM_FIXED_RANGES (NUM_FIXED_MTRRS * RANGES_PER_FIXED_MTRR)
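As a rough illustration of the definition suggested above (a sketch, not part of either patch): NUM_FIXED_MTRRS is taken from the quoted code, while the value 8 for RANGES_PER_FIXED_MTRR is an assumption here, since patch #11 is not quoted in this message. It relies on each 64-bit fixed-range MTRR MSR holding eight 8-bit memory-type fields.

```c
/* From the quoted patch: 1 x 64K, 2 x 16K and 8 x 4K fixed-range MTRR MSRs */
#define NUM_FIXED_MTRRS		11

/* Assumed value: each 64-bit fixed MTRR MSR holds eight 8-bit type fields */
#define RANGES_PER_FIXED_MTRR	8

/* The definition suggested above, derived rather than hard-coded */
#define NUM_FIXED_RANGES	(NUM_FIXED_MTRRS * RANGES_PER_FIXED_MTRR)

/* Build-time check that the derived value is the expected 88 ranges */
_Static_assert(NUM_FIXED_RANGES == 88, "unexpected fixed range count");
```

Deriving the count this way keeps the two constants from drifting apart if the MSR list ever changes.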
+static const unsigned int fixed_mtrrs[NUM_FIXED_MTRRS] = {
MTRR_FIX_64K_00000_MSR, MTRR_FIX_16K_80000_MSR, MTRR_FIX_16K_A0000_MSR,
MTRR_FIX_4K_C0000_MSR, MTRR_FIX_4K_C8000_MSR, MTRR_FIX_4K_D0000_MSR,
MTRR_FIX_4K_D8000_MSR, MTRR_FIX_4K_E0000_MSR, MTRR_FIX_4K_E8000_MSR,
MTRR_FIX_4K_F0000_MSR, MTRR_FIX_4K_F8000_MSR,
+};
+static inline struct saved_msr *save_msr(int index, struct saved_msr *entry) +{
msr_t msr;
msr = msr_read(index);
entry->index = index;
entry->lo = msr.lo;
entry->hi = msr.hi;
/* Return the next entry */
entry++;
return entry;
+}
+static int save_bsp_msrs(char *start, int size) +{
int msr_count;
int num_var_mtrrs;
struct saved_msr *msr_entry;
int i;
msr_t msr;
/* Determine number of MTRRs need to be saved */
msr = msr_read(MTRR_CAP_MSR);
num_var_mtrrs = msr.lo & 0xff;
/* 2 * num_var_mtrrs for base and mask. +1 for IA32_MTRR_DEF_TYPE */
msr_count = 2 * num_var_mtrrs + NUM_FIXED_MTRRS + 1;
if ((msr_count * sizeof(struct saved_msr)) > size) {
printf("Cannot mirror all %d msrs.\n", msr_count);
return -ENOSPC;
}
msr_entry = (void *)start;
for (i = 0; i < NUM_FIXED_MTRRS; i++)
msr_entry = save_msr(fixed_mtrrs[i], msr_entry);
for (i = 0; i < num_var_mtrrs; i++) {
msr_entry = save_msr(MTRR_PHYS_BASE_MSR(i), msr_entry);
msr_entry = save_msr(MTRR_PHYS_MASK_MSR(i), msr_entry);
}
msr_entry = save_msr(MTRR_DEF_TYPE_MSR, msr_entry);
return msr_count;
+}
+static int load_sipi_vector(atomic_t **ap_countp) +{
struct sipi_params_16bit *params16;
struct sipi_params *params;
static char msr_save[512];
char *stack;
ulong addr;
int code_len;
int size;
int ret;
/* Copy in the code */
code_len = ap_start16_code_end - ap_start16;
debug("Copying SIPI code to %x: %d bytes\n", SMM_DEFAULT_BASE,
code_len);
memcpy((void *)SMM_DEFAULT_BASE, ap_start16, code_len);
addr = SMM_DEFAULT_BASE + (ulong)sipi_params_16bit - (ulong)ap_start16;
params16 = (struct sipi_params_16bit *)addr;
params16->ap_start = (uint32_t)ap_start;
My reading of this is that the AP will jump to the flash address of ap_start (pre-relocation). Is this intentional?
params16->gdt = (uint32_t)gd->arch.gdt;
params16->gdt_limit = X86_GDT_SIZE - 1;
debug("gdt = %x, gdt_limit = %x\n", params16->gdt, params16->gdt_limit);
params = (struct sipi_params *)sipi_params;
debug("SIPI 32-bit params at %p\n", params);
params->idt_ptr = (uint32_t)x86_get_idt();
params->stack_size = 4096;
size = params->stack_size * CONFIG_MAX_CPUS;
stack = memalign(size, 4096);
if (!stack)
return -ENOMEM;
params->stack_top = (u32)(stack + size);
params->microcode_ptr = 0;
params->msr_table_ptr = (u32)msr_save;
ret = save_bsp_msrs(msr_save, sizeof(msr_save));
if (ret < 0)
return ret;
params->msr_count = ret;
params->c_handler = (uint32_t)&ap_init;
Also here flash address of ap_init()?
*ap_countp = &params->ap_count;
atomic_set(*ap_countp, 0);
debug("SIPI vector is ready\n");
return 0;
+}
+static int check_cpu_devices(int expected_cpus) +{
int i;
for (i = 0; i < expected_cpus; i++) {
struct udevice *dev;
int ret;
ret = uclass_find_device(UCLASS_CPU, i, &dev);
if (ret) {
debug("Cannot find CPU %d in device tree\n", i);
return ret;
}
}
return 0;
+}
+/* Returns 1 for timeout. 0 on success */ +static int apic_wait_timeout(int total_delay, int delay_step) +{
int total = 0;
int timeout = 0;
while (lapic_read(LAPIC_ICR) & LAPIC_ICR_BUSY) {
udelay(delay_step);
total += delay_step;
if (total >= total_delay) {
timeout = 1;
break;
}
}
return timeout;
+}
+static int start_aps(int ap_count, atomic_t *num_aps) +{
int sipi_vector;
/* Max location is 4KiB below 1MiB */
const int max_vector_loc = ((1 << 20) - (1 << 12)) >> 12;
if (ap_count == 0)
return 0;
/* The vector is sent as a 4k aligned address in one byte */
sipi_vector = sipi_vector_location >> 12;
if (sipi_vector > max_vector_loc) {
printf("SIPI vector too large! 0x%08x\n",
sipi_vector);
return -1;
}
Guess this sanity check is not necessary given we hardcode the sipi_vector to 0x30.
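To spell out the arithmetic behind this comment, here is a minimal sketch. SMM_DEFAULT_BASE is the value from the quoted smm.h; the helper names sipi_vector_for() and sipi_max_vector() are hypothetical and exist only for this illustration. With a fixed base address, the runtime check could in principle become a build-time one.

```c
#include <stdint.h>

/* Value from asm/smm.h in the quoted patch */
#define SMM_DEFAULT_BASE	0x30000

/* The SIPI vector byte is the 4KiB page number of the start-up code */
static inline uint32_t sipi_vector_for(uint32_t addr)
{
	return addr >> 12;
}

/* Highest usable vector: the last 4KiB page below 1MiB */
static inline uint32_t sipi_max_vector(void)
{
	return ((1u << 20) - (1u << 12)) >> 12;
}

/* Possible build-time replacement for the runtime range check */
_Static_assert((SMM_DEFAULT_BASE >> 12) <= 0xff,
	       "SIPI vector must lie below 1MiB");
```

For the base address used here the vector works out to 0x30, well below the limit of 0xff.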
debug("Attempting to start %d APs\n", ap_count);
if ((lapic_read(LAPIC_ICR) & LAPIC_ICR_BUSY)) {
debug("Waiting for ICR not to be busy...");
if (apic_wait_timeout(1000 /* 1 ms */, 50)) {
Can we move the /* 1 ms */ comment out of the code? If we change this, please change it throughout the file.
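One way to address this would be named constants in place of the inline comments. The sketch below is only an illustration: the constant names and wait_until_idle() are hypothetical, and the busy()/delay_us() callbacks stand in for the lapic_read(LAPIC_ICR) & LAPIC_ICR_BUSY test and udelay() so the loop can be exercised outside U-Boot.

```c
/* Hypothetical names; the quoted patch passes these values as literals */
#define ICR_IDLE_TIMEOUT_US	1000	/* 1 ms */
#define SIPI_TIMEOUT_US		10000	/* 10 ms */
#define POLL_STEP_US		50

/*
 * Same shape as apic_wait_timeout(): returns 1 on timeout, 0 once busy()
 * reports idle. busy() and delay_us() are placeholders for the LAPIC ICR
 * busy test and udelay().
 */
static int wait_until_idle(int (*busy)(void), void (*delay_us)(int),
			   int total_delay, int delay_step)
{
	int total = 0;

	while (busy()) {
		delay_us(delay_step);
		total += delay_step;
		if (total >= total_delay)
			return 1;
	}

	return 0;
}

/* Trivial stubs used to exercise the loop */
static int never_busy(void) { return 0; }
static int always_busy(void) { return 1; }
static void no_delay(int us) { (void)us; }
```

A call site would then read wait_until_idle(busy, delay, ICR_IDLE_TIMEOUT_US, POLL_STEP_US), with the units carried by the names.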
debug("timed out. Aborting.\n");
return -1;
} else {
debug("done.\n");
}
}
/* Send INIT IPI to all but self */
lapic_write_around(LAPIC_ICR2, SET_LAPIC_DEST_FIELD(0));
lapic_write_around(LAPIC_ICR, LAPIC_DEST_ALLBUT | LAPIC_INT_ASSERT |
LAPIC_DM_INIT);
debug("Waiting for 10ms after sending INIT.\n");
mdelay(10);
/* Send 1st SIPI */
if ((lapic_read(LAPIC_ICR) & LAPIC_ICR_BUSY)) {
debug("Waiting for ICR not to be busy...");
if (apic_wait_timeout(1000 /* 1 ms */, 50)) {
debug("timed out. Aborting.\n");
return -1;
} else {
debug("done.\n");
}
}
lapic_write_around(LAPIC_ICR2, SET_LAPIC_DEST_FIELD(0));
lapic_write_around(LAPIC_ICR, LAPIC_DEST_ALLBUT | LAPIC_INT_ASSERT |
LAPIC_DM_STARTUP | sipi_vector);
debug("Waiting for 1st SIPI to complete...");
if (apic_wait_timeout(10000 /* 10 ms */, 50 /* us */)) {
debug("timed out.\n");
return -1;
} else {
debug("done.\n");
}
/* Wait for CPUs to check in up to 200 us */
wait_for_aps(num_aps, ap_count, 200 /* us */, 15 /* us */);
/* Send 2nd SIPI */
if ((lapic_read(LAPIC_ICR) & LAPIC_ICR_BUSY)) {
debug("Waiting for ICR not to be busy...");
if (apic_wait_timeout(1000 /* 1 ms */, 50)) {
debug("timed out. Aborting.\n");
return -1;
} else {
debug("done.\n");
}
}
lapic_write_around(LAPIC_ICR2, SET_LAPIC_DEST_FIELD(0));
lapic_write_around(LAPIC_ICR, LAPIC_DEST_ALLBUT | LAPIC_INT_ASSERT |
LAPIC_DM_STARTUP | sipi_vector);
debug("Waiting for 2nd SIPI to complete...");
if (apic_wait_timeout(10000 /* 10 ms */, 50 /* us */)) {
debug("timed out.\n");
return -1;
} else {
debug("done.\n");
}
/* Wait for CPUs to check in */
if (wait_for_aps(num_aps, ap_count, 10000 /* 10 ms */, 50 /* us */)) {
debug("Not all APs checked in: %d/%d.\n",
atomic_read(num_aps), ap_count);
return -1;
}
return 0;
+}
+static int bsp_do_flight_plan(struct udevice *cpu, struct mp_params *mp_params) +{
int i;
int ret = 0;
const int timeout_us = 100000;
const int step_us = 100;
int num_aps = mp_params->num_cpus - 1;
for (i = 0; i < mp_params->num_records; i++) {
struct mp_flight_record *rec = &mp_params->flight_plan[i];
/* Wait for APs if the record is not released */
if (atomic_read(&rec->barrier) == 0) {
/* Wait for the APs to check in */
if (wait_for_aps(&rec->cpus_entered, num_aps,
timeout_us, step_us)) {
debug("MP record %d timeout.\n", i);
ret = -1;
}
}
if (rec->bsp_call != NULL)
rec->bsp_call(cpu, rec->bsp_arg);
release_barrier(&rec->barrier);
}
return ret;
+}
+static int init_bsp(struct udevice **devp) +{
char processor_name[CPU_MAX_NAME_LEN];
int apic_id;
int ret;
cpu_get_name(processor_name);
debug("CPU: %s.\n", processor_name);
enable_lapic();
apic_id = lapicid();
ret = find_cpu_by_apid_id(apic_id, devp);
if (ret) {
printf("Cannot find boot CPU, APIC ID %d\n", apic_id);
return ret;
}
return 0;
+}
+int mp_init(struct mp_params *p) +{
int num_aps;
atomic_t *ap_count;
struct udevice *cpu;
int ret;
/* This will cause the CPUs devices to be bound */
struct uclass *uc;
ret = uclass_get(UCLASS_CPU, &uc);
if (ret)
return ret;
ret = init_bsp(&cpu);
if (ret) {
debug("Cannot init boot CPU: err=%d\n", ret);
return ret;
}
if (p == NULL || p->flight_plan == NULL || p->num_records < 1) {
printf("Invalid MP parameters\n");
return -1;
}
ret = check_cpu_devices(p->num_cpus);
if (ret)
debug("Warning: Device tree does not describe all CPUs. Extra ones will not be started correctly\n");
/* Copy needed parameters so that APs have a reference to the plan */
mp_info.num_records = p->num_records;
mp_info.records = p->flight_plan;
/* Load the SIPI vector */
ret = load_sipi_vector(&ap_count);
if (ap_count == NULL)
return -1;
/*
* Make sure SIPI data hits RAM so the APs that come up will see
* the startup code even if the caches are disabled
*/
wbinvd();
/* Start the APs providing number of APs and the cpus_entered field */
num_aps = p->num_cpus - 1;
ret = start_aps(num_aps, ap_count);
if (ret) {
mdelay(1000);
debug("%d/%d eventually checked in?\n", atomic_read(ap_count),
num_aps);
return ret;
}
/* Walk the flight plan for the BSP */
ret = bsp_do_flight_plan(cpu, p);
if (ret) {
debug("CPU init failed: err=%d\n", ret);
return ret;
}
return 0;
+}
+int mp_init_cpu(struct udevice *cpu, void *unused) +{
return device_probe(cpu);
+} diff --git a/arch/x86/cpu/sipi.S b/arch/x86/cpu/sipi.S new file mode 100644 index 0000000..09fde21 --- /dev/null +++ b/arch/x86/cpu/sipi.S @@ -0,0 +1,215 @@ +/*
 * Copyright (c) 2015 Google, Inc
 * SPDX-License-Identifier: GPL-2.0
 * Taken from coreboot file of the same name
It looks like this comes from coreboot's sipi_vector.S. Should we keep the same file name as coreboot?
 */
+/*
 * The SIPI vector is responsible for initializing the APs in the sytem. It
 * loads microcode, sets up MSRs, and enables caching before calling into
 * C code
 */
+#include <asm/global_data.h> +#include <asm/msr-index.h> +#include <asm/processor.h> +#include <asm/processor-flags.h> +#include <asm/smm.h>
+#define CODE_SEG (X86_GDT_ENTRY_32BIT_CS * X86_GDT_ENTRY_SIZE) +#define DATA_SEG (X86_GDT_ENTRY_32BIT_DS * X86_GDT_ENTRY_SIZE)
+#define o32 .byte 0x66;
data32 directly?
+/*
 * First we have the 16-bit section. Every AP process starts here.
 * The simple task is to load U-Boot's Global Descriptor Table (GDT) to allow
 * U-Boot's 32-bit code to become visible, then jump to ap_start32.
ap_start
 * Note that this code is copied to RAM below 1MB in mp_init.c, and runs from
 * there, but the 32-bit code (ap_start32 and onwards) is part of U-Boot and
ap_start
 * is therefore relocated to the top of RAM with other U-Boot code. This
 * means that for the 16-bit code we must write relocatable code, but for the
 * rest, we can do what we like.
 */
+.text +.code16 +.globl ap_start16 +ap_start16:
cli
xorl %eax, %eax
movl %eax, %cr3 /* Invalidate TLB */
/* setup the data segment */
movw %cs, %ax
movw %ax, %ds
/* Use an address relative to the data segment for the GDT */
movl $gdtaddr, %ebx
subl $ap_start16, %ebx
data32 lgdt (%ebx)
movl %cr0, %eax
andl $0x7ffaffd1, %eax /* PG, AM, WP, NE, TS, EM, MP = 0 */
orl $0x60000001, %eax /* CD, NW, PE = 1 */
movl %eax, %cr0
Can we use macros for the cr0 bit fields, as we do in arch/x86/cpu/start16.S?
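For reference, a sketch of how the magic numbers map onto named CR0 bits. The X86_CR0_* names are assumed to match those in asm/processor-flags.h, with bit positions as given in the Intel SDM; they are redefined locally so the snippet stands alone. AP_CR0_CLEAR, AP_CR0_SET and ap_cr0_entry_value() are hypothetical names used only here.

```c
#include <stdint.h>

/* CR0 bits; names assumed to match asm/processor-flags.h */
#define X86_CR0_PE	0x00000001u	/* Protection Enable */
#define X86_CR0_MP	0x00000002u	/* Monitor Coprocessor */
#define X86_CR0_EM	0x00000004u	/* Emulation */
#define X86_CR0_TS	0x00000008u	/* Task Switched */
#define X86_CR0_NE	0x00000020u	/* Numeric Error */
#define X86_CR0_WP	0x00010000u	/* Write Protect */
#define X86_CR0_AM	0x00040000u	/* Alignment Mask */
#define X86_CR0_NW	0x20000000u	/* Not Write-through */
#define X86_CR0_CD	0x40000000u	/* Cache Disable */
#define X86_CR0_PG	0x80000000u	/* Paging */

/* Bits cleared on 16-bit entry; the complement is 0x7ffaffd1 */
#define AP_CR0_CLEAR	(X86_CR0_PG | X86_CR0_AM | X86_CR0_WP | X86_CR0_NE | \
			 X86_CR0_TS | X86_CR0_EM | X86_CR0_MP)

/* Bits set on 16-bit entry; equals 0x60000001 */
#define AP_CR0_SET	(X86_CR0_CD | X86_CR0_NW | X86_CR0_PE)

/* C equivalent of the andl/orl pair in ap_start16 */
static inline uint32_t ap_cr0_entry_value(uint32_t cr0)
{
	return (cr0 & ~AP_CR0_CLEAR) | AP_CR0_SET;
}
```

The later "Enable caching" mask 0x9fffffff is the complement of (X86_CR0_CD | X86_CR0_NW), so the same names would cover that site as well.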
movl $ap_start_jmp, %eax
subl $ap_start16, %eax
movw %ax, %bp
/* Jump to ap_start32 within U-Boot */
ap_start
+o32 cs ljmp *(%bp)
.align 4
+.globl sipi_params_16bit +sipi_params_16bit:
/* 48-bit far pointer */
+ap_start_jmp:
.long 0 /* offset set to ap_start by U-Boot */
.word CODE_SEG /* segment */
.word 0 /* padding */
+gdtaddr:
.word 0 /* limit */
.long 0 /* table */
.word 0 /* unused */
+.globl ap_start16_code_end +ap_start16_code_end:
+/*
 * Set up the special 'fs' segment for global_data. Then jump to ap_continue
 * to set up the AP.
 */
+.globl ap_start +ap_start:
.code32
movw $DATA_SEG, %ax
movw %ax, %ds
movw %ax, %es
movw %ax, %ss
movw %ax, %gs
movw $(X86_GDT_ENTRY_32BIT_FS * X86_GDT_ENTRY_SIZE), %ax
movw %ax, %fs
/* Load the Interrupt descriptor table */
mov idt_ptr, %ebx
lidt (%ebx)
/* Obtain cpu number */
movl ap_count, %eax
+1:
movl %eax, %ecx
inc %ecx
lock cmpxchg %ecx, ap_count
jnz 1b
/* Setup stacks for each CPU */
movl stack_size, %eax
mul %ecx
movl stack_top, %edx
subl %eax, %edx
mov %edx, %esp
/* Save cpu number */
mov %ecx, %esi
/* Determine if one should check microcode versions */
mov microcode_ptr, %edi
test %edi, %edi
jz microcode_done /* Bypass if no microde exists */
/* Get the Microcode version */
mov $1, %eax
cpuid
mov $MSR_IA32_UCODE_REV, %ecx
rdmsr
/* If something already loaded skip loading again */
test %edx, %edx
jnz microcode_done
/* Determine if parallel microcode loading is allowed */
cmp $0xffffffff, microcode_lock
je load_microcode
/* Protect microcode loading */
+lock_microcode:
lock bts $0, microcode_lock
jc lock_microcode
+load_microcode:
/* Load new microcode */
mov $MSR_IA32_UCODE_WRITE, %ecx
xor %edx, %edx
mov %edi, %eax
/* The microcode pointer is passed in pointing to the header. Adjust
* pointer to reflect the payload (header size is 48 bytes) */
Please fix the multi-line comment style.
add $48, %eax
48->UCODE_HEADER_LEN
pusha
wrmsr
popa
/* Unconditionally unlock microcode loading */
cmp $0xffffffff, microcode_lock
je microcode_done
xor %eax, %eax
mov %eax, microcode_lock
+microcode_done:
/*
* Load MSRs. Each entry in the table consists of:
* 0: index,
* 4: value[31:0]
* 8: value[63:32]
* See struct saved_msr in mp_init.c.
*/
mov msr_table_ptr, %edi
mov msr_count, %ebx
test %ebx, %ebx
jz 1f
+load_msr:
mov (%edi), %ecx
mov 4(%edi), %eax
mov 8(%edi), %edx
wrmsr
add $12, %edi
dec %ebx
jnz load_msr
+1:
/* Enable caching */
mov %cr0, %eax
and $0x9fffffff, %eax /* CD, NW = 0 */
Please use existing macros.
mov %eax, %cr0
/* c_handler(cpu_num) */
push %esi /* cpu_num */
mov c_handler, %eax
call *%eax
.align 4
+.globl sipi_params +sipi_params: +idt_ptr:
.long 0
+stack_top:
.long 0
+stack_size:
.long 0
+microcode_lock:
.long 0
+microcode_ptr:
.long 0
+msr_table_ptr:
.long 0
+msr_count:
.long 0
+c_handler:
.long 0
+ap_count:
.long 0
diff --git a/arch/x86/include/asm/mp.h b/arch/x86/include/asm/mp.h new file mode 100644 index 0000000..5dc0e33 --- /dev/null +++ b/arch/x86/include/asm/mp.h @@ -0,0 +1,94 @@ +/*
 * Copyright (c) 2015 Google, Inc
 *
 * SPDX-License-Identifier: GPL-2.0
 *
 * Taken from coreboot file of the same name
 */
+#ifndef _X86_MP_H_ +#define _X86_MP_H_
+#include <asm/atomic.h>
+typedef int (*mp_callback_t)(struct udevice *cpu, void *arg);
+/*
 * A mp_flight_record details a sequence of calls for the APs to perform
 * along with the BSP to coordinate sequencing. Each flight record either
 * provides a barrier for each AP before calling the callback or the APs
 * are allowed to perform the callback without waiting. Regardless, each
 * record has the cpus_entered field incremented for each record. When
 * the BSP observes that the cpus_entered matches the number of APs
 * the bsp_call is called with bsp_arg and upon returning releases the
 * barrier allowing the APs to make further progress.
 *
 * Note that ap_call() and bsp_call() can be NULL. In the NULL case the
 * callback will just not be called.
 */
+struct mp_flight_record {
atomic_t barrier;
atomic_t cpus_entered;
mp_callback_t ap_call;
void *ap_arg;
mp_callback_t bsp_call;
void *bsp_arg;
+} __attribute__((aligned(ARCH_DMA_MINALIGN)));
+#define _MP_FLIGHT_RECORD(barrier_, ap_func_, ap_arg_, bsp_func_, bsp_arg_) \
Can we remove the underscore in those variables?
{ \
.barrier = ATOMIC_INIT(barrier_), \
.cpus_entered = ATOMIC_INIT(0), \
.ap_call = ap_func_, \
.ap_arg = ap_arg_, \
.bsp_call = bsp_func_, \
.bsp_arg = bsp_arg_, \
}
+#define MP_FR_BLOCK_APS(ap_func_, ap_arg_, bsp_func_, bsp_arg_) \
_MP_FLIGHT_RECORD(0, ap_func_, ap_arg_, bsp_func_, bsp_arg_)
underscore
+#define MP_FR_NOBLOCK_APS(ap_func_, ap_arg_, bsp_func_, bsp_arg_) \
_MP_FLIGHT_RECORD(1, ap_func_, ap_arg_, bsp_func_, bsp_arg_)
underscore
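If the request is about the trailing underscores on the macro parameters, one caveat applies: a parameter may not share a name with a struct member, because the preprocessor would then also replace the designator (a parameter called barrier would rewrite .barrier). The sketch below shows one underscore-free variant. The parameter names (released, ap_data, bsp_data) are hypothetical, and atomic_t, ATOMIC_INIT and struct udevice are minimal stand-ins so the snippet compiles outside U-Boot.

```c
/* Minimal stand-ins so the sketch compiles outside U-Boot */
typedef struct { int counter; } atomic_t;
#define ATOMIC_INIT(i)	{ (i) }
struct udevice;
typedef int (*mp_callback_t)(struct udevice *cpu, void *arg);

struct mp_flight_record {
	atomic_t barrier;
	atomic_t cpus_entered;
	mp_callback_t ap_call;
	void *ap_arg;
	mp_callback_t bsp_call;
	void *bsp_arg;
};

/*
 * Parameter names differ from the member names on purpose: a parameter
 * called 'barrier' would also replace the '.barrier' designator below.
 */
#define MP_FLIGHT_RECORD(released, ap_func, ap_data, bsp_func, bsp_data) \
	{							\
		.barrier = ATOMIC_INIT(released),		\
		.cpus_entered = ATOMIC_INIT(0),			\
		.ap_call = ap_func,				\
		.ap_arg = ap_data,				\
		.bsp_call = bsp_func,				\
		.bsp_arg = bsp_data,				\
	}

#define MP_FR_BLOCK_APS(ap_func, ap_data, bsp_func, bsp_data) \
	MP_FLIGHT_RECORD(0, ap_func, ap_data, bsp_func, bsp_data)

#define MP_FR_NOBLOCK_APS(ap_func, ap_data, bsp_func, bsp_data) \
	MP_FLIGHT_RECORD(1, ap_func, ap_data, bsp_func, bsp_data)

/* Example plan with one blocking and one non-blocking record */
static int noop(struct udevice *cpu, void *arg)
{
	(void)cpu;
	(void)arg;
	return 0;
}

static struct mp_flight_record example_plan[] = {
	MP_FR_BLOCK_APS(noop, 0, noop, 0),
	MP_FR_NOBLOCK_APS(noop, 0, 0, 0),
};
```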
+/*
 * The mp_params structure provides the arguments to the mp subsystem
 * for bringing up APs.
 *
 * At present this is overkill for U-Boot, but it may make it easier to add
 * SMM support.
 */
+struct mp_params {
int num_cpus; /* Total cpus include BSP */
int parallel_microcode_load;
const void *microcode_pointer;
/* Flight plan for APs and BSP */
struct mp_flight_record *flight_plan;
int num_records;
+};
+/*
 * mp_init() will set up the SIPI vector and bring up the APs according to
 * mp_params. Each flight record will be executed according to the plan. Note
 * that the MP infrastructure uses SMM default area without saving it. It's
 * up to the chipset or mainboard to either e820 reserve this area or save this
 * region prior to calling mp_init() and restoring it after mp_init returns.
 *
 * At the time mp_init() is called the MTRR MSRs are mirrored into APs then
 * caching is enabled before running the flight plan.
 *
 * The MP init has the following properties:
 * 1. APs are brought up in parallel.
 * 2. The ordering of coreboot cpu number and APIC ids is not deterministic.
coreboot->U-Boot
 *    Therefore, one cannot rely on this property or the order of devices in
 *    the device tree unless the chipset or mainboard know the APIC ids
 *    a priori.
 *
 * mp_init() returns < 0 on error, 0 on success.
 */
+int mp_init(struct mp_params *params);
+/* Probes the CPU device */ +int mp_init_cpu(struct udevice *cpu, void *unused);
+#endif /* _X86_MP_H_ */ diff --git a/arch/x86/include/asm/sipi.h b/arch/x86/include/asm/sipi.h new file mode 100644 index 0000000..bb2f0de --- /dev/null +++ b/arch/x86/include/asm/sipi.h @@ -0,0 +1,79 @@ +/*
 * Copyright (c) 2015 Gooogle, Inc
 * Written by Simon Glass sjg@chromium.org
 *
 * SPDX-License-Identifier: GPL-2.0+
 */
+#ifndef _ASM_SIPI_H +#define _ASM_SIPI_H
+/**
 * struct sipi_params_16bit - 16-bit SIPI entry-point parameters
 *
 * These are set up in the same space as the SIPI 16-bit code so that each AP
 * can access the parameters when it boots.
 *
 * Each of these must be set up for the AP to boot, except @segment which is
 * set in the assembly code.
 *
 * @ap_start: 32-bit SIPI entry point for U-Boot
 * @segment: Code segment for U-Boot
 * @pad: Padding (not used)
 * @gdt_limit: U-Boot GDT limit (X86_GDT_SIZE - 1)
 * @gdt: U-Boot GDT (gd->arch.gdt)
 * @unused: Not used
 */
+struct __packed sipi_params_16bit {
u32 ap_start;
u16 segment;
u16 pad;
Please remove this blank line.
u16 gdt_limit;
u32 gdt;
u16 unused;
+};
+/**
 * struct sipi_params - 32-bit SIP entry-point parameters
 *
 * These are used by the AP init code and must be set up before the APs start.
 *
 * The stack area extends down from @stack_top, with @stack_size allocated
 * for each AP.
 *
 * @idt_ptr: Interrupt descriptor table pointer
 * @stack_top: Top of the AP stack area
 * @stack_size: Size of each AP's stack
 * @microcode_lock: Used to ensure only one AP loads microcode at once
We should document: 0xffffffff means parallel loading
 * @microcode_ptr: Pointer to microcode, or 0 if none
 * @msr_table_ptr: Pointer to saved MSRs, a list of struct saved_msr
 * @msr_count: Number of saved MSRs
 * @c_handler: C function to call once early init is complete
 * @ap_count: Shared atomic value to allocate CPU indexes
 */
+struct sipi_params {
u32 idt_ptr;
u32 stack_top;
u32 stack_size;
u32 microcode_lock;
u32 microcode_ptr;
u32 msr_table_ptr;
u32 msr_count;
u32 c_handler;
atomic_t ap_count;
+};
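On the suggestion above to document the 0xffffffff value of @microcode_lock, here is a small sketch of the semantics as implied by the quoted assembly (cmp $0xffffffff, microcode_lock; je load_microcode). The constant and helper names are hypothetical; the quoted load_sipi_vector() does not appear to set this field at all, so this only illustrates what a documented initial value could look like.

```c
#include <stdint.h>

/* Hypothetical name for the sentinel tested by the SIPI assembly */
#define SIPI_UCODE_LOCK_PARALLEL	0xffffffffu

/*
 * Initial value for sipi_params.microcode_lock:
 *   0xffffffff - APs skip the lock and load microcode in parallel
 *   0          - lock is free; APs take it one at a time via 'lock bts'
 */
static inline uint32_t sipi_ucode_lock_init(int parallel_microcode_load)
{
	return parallel_microcode_load ? SIPI_UCODE_LOCK_PARALLEL : 0;
}
```

The argument mirrors the parallel_microcode_load member of struct mp_params, which would be the natural source for this choice.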
+/* 16-bit AP entry point */ +void ap_start16(void);
+/* end of 16-bit code/data, marks the region to be copied to SIP vector */ +void ap_start16_code_end(void);
+/* 32-bit AP entry point */ +void ap_start(void);
+extern char sipi_params_16bit[]; +extern char sipi_params[];
+#endif diff --git a/arch/x86/include/asm/smm.h b/arch/x86/include/asm/smm.h new file mode 100644 index 0000000..79b4a8e --- /dev/null +++ b/arch/x86/include/asm/smm.h @@ -0,0 +1,14 @@ +/*
 * Copyright (c) 2015 Google, Inc
 * Copyright (C) 2008-2009 coresystems GmbH
 *
 * SPDX-License-Identifier: GPL-2.0
 */
+#ifndef CPU_X86_SMM_H +#define CPU_X86_SMM_H
+#define SMM_DEFAULT_BASE 0x30000 +#define SMM_DEFAULT_SIZE 0x10000
These are not actually related to SMM here; they only describe where the AP start-up code is placed. Should we rename them to AP_xxx and move them to sipi.h?
+#endif
Regards, Bin

Since we do these sorts of operations a lot, it is useful to have a simpler API, similar to clrsetbits_le32().
Signed-off-by: Simon Glass sjg@chromium.org ---
Changes in v2: None
arch/x86/include/asm/msr.h | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h index 1955a75..5349519 100644 --- a/arch/x86/include/asm/msr.h +++ b/arch/x86/include/asm/msr.h @@ -128,6 +128,25 @@ static inline void wrmsr(unsigned msr, unsigned low, unsigned high) #define wrmsrl(msr, val) \ native_write_msr((msr), (u32)((u64)(val)), (u32)((u64)(val) >> 32))
+static inline void msr_clrsetbits_64(unsigned msr, u64 clear, u64 set) +{ + u64 val; + + val = native_read_msr(msr); + val &= ~clear; + val |= set; + wrmsrl(msr, val); +} + +static inline void msr_setbits_64(unsigned msr, u64 set) +{ + u64 val; + + val = native_read_msr(msr); + val |= set; + wrmsrl(msr, val); +} + /* rdmsr with exception handling */ #define rdmsr_safe(msr, p1, p2) \ ({ \

Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Since we do these sorts of operations a lot, it is useful to have a simpler API, similar to clrsetbits_le32().
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/include/asm/msr.h | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h index 1955a75..5349519 100644 --- a/arch/x86/include/asm/msr.h +++ b/arch/x86/include/asm/msr.h @@ -128,6 +128,25 @@ static inline void wrmsr(unsigned msr, unsigned low, unsigned high) #define wrmsrl(msr, val) \ native_write_msr((msr), (u32)((u64)(val)), (u32)((u64)(val) >> 32))
+static inline void msr_clrsetbits_64(unsigned msr, u64 clear, u64 set) +{
u64 val;
val = native_read_msr(msr);
val &= ~clear;
val |= set;
wrmsrl(msr, val);
+}
+static inline void msr_setbits_64(unsigned msr, u64 set) +{
u64 val;
val = native_read_msr(msr);
val |= set;
wrmsrl(msr, val);
+}
/* rdmsr with exception handling */ #define rdmsr_safe(msr, p1, p2) \ ({ \ --
For completeness, should we add msr_clrbits_64() as well?
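A minimal sketch of what such a helper might look like, following the pattern of msr_setbits_64() in the quoted patch. To keep the snippet self-contained, native_read_msr() and wrmsrl() are replaced by stand-ins backed by a single in-memory value; those stand-ins and demo_clrbits() are assumptions for illustration only and are not how the real accessors work.

```c
#include <stdint.h>

typedef uint64_t u64;

/* Stand-ins for native_read_msr()/wrmsrl(): one fake MSR held in memory */
static u64 fake_msr;

static inline u64 native_read_msr(unsigned msr)
{
	(void)msr;
	return fake_msr;
}

#define wrmsrl(msr, val)	((void)(msr), fake_msr = (u64)(val))

/* Sketch of the suggested helper: clear the given bits, leave the rest */
static inline void msr_clrbits_64(unsigned msr, u64 clear)
{
	u64 val;

	val = native_read_msr(msr);
	val &= ~clear;
	wrmsrl(msr, val);
}

/* Hypothetical demo wrapper: seed the fake MSR, clear bits, return result */
static inline u64 demo_clrbits(u64 start, u64 clear)
{
	fake_msr = start;
	msr_clrbits_64(0, clear);
	return fake_msr;
}
```

The same effect is available today through msr_clrsetbits_64(msr, clear, 0), so the extra helper would mainly be a readability aid.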
Regards, Bin

This permits init of additional CPU cores after relocation and when driver model is ready.
Signed-off-by: Simon Glass sjg@chromium.org ---
Changes in v2: None
arch/x86/cpu/cpu.c | 37 +++++++++++++++++++++++++++++++++++++ arch/x86/include/asm/cpu.h | 14 ++++++++++++++ arch/x86/include/asm/u-boot-x86.h | 2 ++ common/board_r.c | 2 +- 4 files changed, 54 insertions(+), 1 deletion(-)
diff --git a/arch/x86/cpu/cpu.c b/arch/x86/cpu/cpu.c index 78eb3fe..6263511 100644 --- a/arch/x86/cpu/cpu.c +++ b/arch/x86/cpu/cpu.c @@ -21,6 +21,8 @@
#include <common.h> #include <command.h> +#include <cpu.h> +#include <dm.h> #include <errno.h> #include <malloc.h> #include <asm/control_regs.h> @@ -518,6 +520,15 @@ char *cpu_get_name(char *name) return ptr; }
+int x86_cpu_get_desc(struct udevice *dev, char *buf, int size) +{ + if (size < CPU_MAX_NAME_LEN) + return -ENOSPC; + cpu_get_name(buf); + + return 0; +} + int default_print_cpuinfo(void) { printf("CPU: %s, vendor %s, device %xh\n", @@ -600,3 +611,29 @@ int last_stage_init(void) return 0; } #endif + +__weak int x86_init_cpus(void) +{ + return 0; +} + +int cpu_init_r(void) +{ + return x86_init_cpus(); +} + +static const struct cpu_ops cpu_x86_ops = { + .get_desc = x86_cpu_get_desc, +}; + +static const struct udevice_id cpu_x86_ids[] = { + { .compatible = "cpu-x86" }, + { } +}; + +U_BOOT_DRIVER(cpu_x86_drv) = { + .name = "cpu_x86", + .id = UCLASS_CPU, + .of_match = cpu_x86_ids, + .ops = &cpu_x86_ops, +}; diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h index 08284ee..01bee52 100644 --- a/arch/x86/include/asm/cpu.h +++ b/arch/x86/include/asm/cpu.h @@ -197,6 +197,20 @@ const char *cpu_vendor_name(int vendor); char *cpu_get_name(char *name);
/** + * +* x86_cpu_get_desc() - Get a description string for an x86 CPU +* +* This uses cpu_get_name() and is suitable to use as the get_desc() method for +* the I2C uclass. +* +* @dev: Device to check (UCLASS_CPU) +* @buf: Buffer to place string +* @size: Size of string space +* @return 0 if OK, -ENOSPC if buffer is too small, other -ve on error +*/ +int x86_cpu_get_desc(struct udevice *dev, char *buf, int size); + +/** * cpu_call64() - Jump to a 64-bit Linux kernel (internal function) * * The kernel is uncompressed and the 64-bit entry point is expected to be diff --git a/arch/x86/include/asm/u-boot-x86.h b/arch/x86/include/asm/u-boot-x86.h index 122e054..be103c0 100644 --- a/arch/x86/include/asm/u-boot-x86.h +++ b/arch/x86/include/asm/u-boot-x86.h @@ -69,6 +69,8 @@ uint64_t timer_get_tsc(void);
void quick_ram_check(void);
+int x86_init_cpus(void); + #define PCI_VGA_RAM_IMAGE_START 0xc0000
#endif /* _U_BOOT_I386_H_ */ diff --git a/common/board_r.c b/common/board_r.c index 307124e..1a46f62 100644 --- a/common/board_r.c +++ b/common/board_r.c @@ -779,7 +779,7 @@ init_fnc_t init_sequence_r[] = { initr_flash, #endif INIT_FUNC_WATCHDOG_RESET -#if defined(CONFIG_PPC) || defined(CONFIG_M68K) +#if defined(CONFIG_PPC) || defined(CONFIG_M68K) || defined(CONFIG_X86) /* initialize higher level parts of CPU like time base and timers */ cpu_init_r, #endif
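
The buffer-size contract documented for x86_cpu_get_desc() can be illustrated with a small user-space sketch. The value of CPU_MAX_NAME_LEN (49 here) is an assumption, and fake_cpu_get_name() is a hypothetical stand-in for cpu_get_name(), which on real hardware fills the buffer from the CPUID brand string.

```c
#include <errno.h>
#include <string.h>

#define CPU_MAX_NAME_LEN 49	/* assumed value, for illustration only */

/* Hypothetical stand-in for cpu_get_name() */
static char *fake_cpu_get_name(char *name)
{
	strcpy(name, "Hypothetical CPU brand string");

	return name;
}

/* Refuse buffers that cannot hold the longest possible name */
static int get_desc_sketch(char *buf, int size)
{
	if (size < CPU_MAX_NAME_LEN)
		return -ENOSPC;

	fake_cpu_get_name(buf);

	return 0;
}
```

A caller would pass a buffer of at least CPU_MAX_NAME_LEN bytes and check for -ENOSPC before using the string.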

Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This permits init of additional CPU cores after relocation and when driver model is ready.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/cpu/cpu.c | 37 +++++++++++++++++++++++++++++++++++++ arch/x86/include/asm/cpu.h | 14 ++++++++++++++ arch/x86/include/asm/u-boot-x86.h | 2 ++ common/board_r.c | 2 +- 4 files changed, 54 insertions(+), 1 deletion(-)
diff --git a/arch/x86/cpu/cpu.c b/arch/x86/cpu/cpu.c index 78eb3fe..6263511 100644 --- a/arch/x86/cpu/cpu.c +++ b/arch/x86/cpu/cpu.c @@ -21,6 +21,8 @@
#include <common.h> #include <command.h> +#include <cpu.h> +#include <dm.h> #include <errno.h> #include <malloc.h> #include <asm/control_regs.h> @@ -518,6 +520,15 @@ char *cpu_get_name(char *name) return ptr; }
+int x86_cpu_get_desc(struct udevice *dev, char *buf, int size) +{
if (size < CPU_MAX_NAME_LEN)
return -ENOSPC;
Please add a blank line here.
cpu_get_name(buf);
return 0;
+}
int default_print_cpuinfo(void) { printf("CPU: %s, vendor %s, device %xh\n", @@ -600,3 +611,29 @@ int last_stage_init(void) return 0; } #endif
+__weak int x86_init_cpus(void) +{
return 0;
+}
+int cpu_init_r(void) +{
return x86_init_cpus();
+}
+static const struct cpu_ops cpu_x86_ops = {
.get_desc = x86_cpu_get_desc,
+};
+static const struct udevice_id cpu_x86_ids[] = {
{ .compatible = "cpu-x86" },
{ }
+};
+U_BOOT_DRIVER(cpu_x86_drv) = {
.name = "cpu_x86",
.id = UCLASS_CPU,
.of_match = cpu_x86_ids,
.ops = &cpu_x86_ops,
+}; diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h index 08284ee..01bee52 100644 --- a/arch/x86/include/asm/cpu.h +++ b/arch/x86/include/asm/cpu.h @@ -197,6 +197,20 @@ const char *cpu_vendor_name(int vendor); char *cpu_get_name(char *name);
/**
+* x86_cpu_get_desc() - Get a description string for an x86 CPU +* +* This uses cpu_get_name() and is suitable to use as the get_desc() method for +* the I2C uclass.
I2C uclass?
+* +* @dev: Device to check (UCLASS_CPU) +* @buf: Buffer to place string +* @size: Size of string space +* @return 0 if OK, -ENOSPC if buffer is too small, other -ve on error +*/ +int x86_cpu_get_desc(struct udevice *dev, char *buf, int size);
+/**
* cpu_call64() - Jump to a 64-bit Linux kernel (internal function)
* The kernel is uncompressed and the 64-bit entry point is expected to be
diff --git a/arch/x86/include/asm/u-boot-x86.h b/arch/x86/include/asm/u-boot-x86.h index 122e054..be103c0 100644 --- a/arch/x86/include/asm/u-boot-x86.h +++ b/arch/x86/include/asm/u-boot-x86.h @@ -69,6 +69,8 @@ uint64_t timer_get_tsc(void);
void quick_ram_check(void);
+int x86_init_cpus(void);
#define PCI_VGA_RAM_IMAGE_START 0xc0000
#endif /* _U_BOOT_I386_H_ */ diff --git a/common/board_r.c b/common/board_r.c index 307124e..1a46f62 100644 --- a/common/board_r.c +++ b/common/board_r.c @@ -779,7 +779,7 @@ init_fnc_t init_sequence_r[] = { initr_flash, #endif INIT_FUNC_WATCHDOG_RESET -#if defined(CONFIG_PPC) || defined(CONFIG_M68K) +#if defined(CONFIG_PPC) || defined(CONFIG_M68K) || defined(CONFIG_X86) /* initialize higher level parts of CPU like time base and timers */ cpu_init_r,
I see there is a cpu_secondary_init_r() in board_r.c. It looks like it is intended for multicore initialization. Should we use that instead?
#endif
Regards, Bin

This driver supports multi-core init and sets up the CPU frequencies correctly.
Signed-off-by: Simon Glass sjg@chromium.org ---
Changes in v2: None
arch/x86/cpu/baytrail/Makefile | 1 + arch/x86/cpu/baytrail/cpu.c | 206 +++++++++++++++++++++++++++++++ arch/x86/include/asm/arch-baytrail/msr.h | 30 +++++ 3 files changed, 237 insertions(+) create mode 100644 arch/x86/cpu/baytrail/cpu.c create mode 100644 arch/x86/include/asm/arch-baytrail/msr.h
diff --git a/arch/x86/cpu/baytrail/Makefile b/arch/x86/cpu/baytrail/Makefile index 8914e8b..c78b644 100644 --- a/arch/x86/cpu/baytrail/Makefile +++ b/arch/x86/cpu/baytrail/Makefile @@ -4,6 +4,7 @@ # SPDX-License-Identifier: GPL-2.0+ #
+obj-y += cpu.o obj-y += early_uart.o obj-y += fsp_configs.o obj-y += pci.o diff --git a/arch/x86/cpu/baytrail/cpu.c b/arch/x86/cpu/baytrail/cpu.c new file mode 100644 index 0000000..5a2a8ee --- /dev/null +++ b/arch/x86/cpu/baytrail/cpu.c @@ -0,0 +1,206 @@ +/* + * Copyright (C) 2015 Google, Inc + * + * SPDX-License-Identifier: GPL-2.0+ + * + * Based on code from coreboot + */ + +#include <common.h> +#include <cpu.h> +#include <dm.h> +#include <asm/cpu.h> +#include <asm/lapic.h> +#include <asm/mp.h> +#include <asm/msr.h> +#include <asm/turbo.h> +#include <asm/arch/msr.h> + +#ifdef CONFIG_SMP +static int enable_smis(struct udevice *cpu, void *unused) +{ + return 0; +} + +static struct mp_flight_record mp_steps[] = { + MP_FR_BLOCK_APS(mp_init_cpu, NULL, mp_init_cpu, NULL), + /* Wait for APs to finish initialization before proceeding. */ + MP_FR_BLOCK_APS(NULL, NULL, enable_smis, NULL), +}; + +static int detect_num_cpus(void) +{ + int ecx = 0; + + /* + * Use the algorithm described in Intel 64 and IA-32 Architectures + * Software Developer's Manual Volume 3 (3A, 3B & 3C): System + * Programming Guide, Jan-2015. Section 8.9.2: Hierarchical Mapping + * of CPUID Extended Topology Leaf. 
+ */ + while (1) { + struct cpuid_result leaf_b; + + leaf_b = cpuid_ext(0xb, ecx); + + /* + * Bay Trail doesn't have hyperthreading so just determine the + * number of cores by from level type (ecx[15:8] == * 2) + */ + if ((leaf_b.ecx & 0xff00) == 0x0200) + return leaf_b.ebx & 0xffff; + ecx++; + } +} + +static int baytrail_init_cpus(void) +{ + struct mp_params mp_params; + + lapic_setup(); + + mp_params.num_cpus = detect_num_cpus(); + mp_params.parallel_microcode_load = 0, + mp_params.flight_plan = &mp_steps[0]; + mp_params.num_records = ARRAY_SIZE(mp_steps); + mp_params.microcode_pointer = 0; + + if (mp_init(&mp_params)) { + printf("Warning: MP init failure\n"); + return -EIO; + } + + return 0; +} +#endif + +int x86_init_cpus(void) +{ +#ifdef CONFIG_SMP + debug("Init additional CPUs\n"); + baytrail_init_cpus(); +#endif + + return 0; +} + +void set_max_freq(void) +{ + msr_t perf_ctl; + msr_t msr; + + /* Enable speed step */ + msr = msr_read(MSR_IA32_MISC_ENABLES); + msr.lo |= (1 << 16); + msr_write(MSR_IA32_MISC_ENABLES, msr); + + /* + * Set guaranteed ratio [21:16] from IACORE_RATIOS to bits [15:8] of + * the PERF_CTL + */ + msr = msr_read(MSR_IACORE_RATIOS); + perf_ctl.lo = (msr.lo & 0x3f0000) >> 8; + + /* + * Set guaranteed vid [21:16] from IACORE_VIDS to bits [7:0] of + * the PERF_CTL + */ + msr = msr_read(MSR_IACORE_VIDS); + perf_ctl.lo |= (msr.lo & 0x7f0000) >> 16; + perf_ctl.hi = 0; + + msr_write(MSR_IA32_PERF_CTL, perf_ctl); +} + +static int cpu_x86_baytrail_probe(struct udevice *dev) +{ + debug("Init baytrail core\n"); + + /* + * On bay trail the turbo disable bit is actually scoped at the + * building-block level, not package. For non-BSP cores that are + * within a building block, enable turbo. The cores within the BSP's + * building block will just see it already enabled and move on. 
+ */ + if (lapicid()) + turbo_enable(); + + /* Dynamic L2 shrink enable and threshold */ + msr_clrsetbits_64(MSR_PMG_CST_CONFIG_CONTROL, 0x3f000f, 0xe0008), + + /* Disable C1E */ + msr_clrsetbits_64(MSR_POWER_CTL, 2, 0); + msr_setbits_64(MSR_POWER_MISC, 0x44); + + /* Set this core to max frequency ratio */ + set_max_freq(); + + return 0; +} + +static unsigned bus_freq(void) +{ + msr_t clk_info = msr_read(MSR_BSEL_CR_OVERCLOCK_CONTROL); + switch (clk_info.lo & 0x3) { + case 0: + return 83333333; + case 1: + return 100000000; + case 2: + return 133333333; + case 3: + return 116666666; + default: + return 0; + } +} + +static unsigned long tsc_freq(void) +{ + msr_t platform_info; + ulong bclk = bus_freq(); + + if (!bclk) + return 0; + + platform_info = msr_read(MSR_PLATFORM_INFO); + + return bclk * ((platform_info.lo >> 8) & 0xff); +} + +static int baytrail_get_info(struct udevice *dev, struct cpu_info *info) +{ + info->cpu_freq = tsc_freq(); + info->features = 1 << CPU_FEAT_L1_CACHE | 1 << CPU_FEAT_MMU; + + return 0; +} + +static int cpu_x86_baytrail_bind(struct udevice *dev) +{ + struct cpu_platdata *plat = dev_get_parent_platdata(dev); + + plat->cpu_id = fdtdec_get_int(gd->fdt_blob, dev->of_offset, + "intel,apic-id", -1); + + return 0; +} + +static const struct cpu_ops cpu_x86_baytrail_ops = { + .get_desc = x86_cpu_get_desc, + .get_info = baytrail_get_info, +}; + +static const struct udevice_id cpu_x86_baytrail_ids[] = { + { .compatible = "intel,baytrail-cpu" }, + { } +}; + +U_BOOT_DRIVER(cpu_x86_baytrail_drv) = { + .name = "cpu_x86_baytrail", + .id = UCLASS_CPU, + .of_match = cpu_x86_baytrail_ids, + .bind = cpu_x86_baytrail_bind, + .probe = cpu_x86_baytrail_probe, + .ops = &cpu_x86_baytrail_ops, +}; diff --git a/arch/x86/include/asm/arch-baytrail/msr.h b/arch/x86/include/asm/arch-baytrail/msr.h new file mode 100644 index 0000000..1975aec --- /dev/null +++ b/arch/x86/include/asm/arch-baytrail/msr.h @@ -0,0 +1,30 @@ +/* + * Copyright (C) 2015 Google, Inc + * + * 
SPDX-License-Identifier: GPL-2.0+ + */ + +#ifndef __asm_arch_msr_h +#define __asm_arch_msr_h + +#define MSR_BSEL_CR_OVERCLOCK_CONTROL 0xcd +#define MSR_PMG_CST_CONFIG_CONTROL 0xe2 +#define SINGLE_PCTL (1 << 11) +#define MSR_POWER_MISC 0x120 +#define ENABLE_ULFM_AUTOCM_MASK (1 << 2) +#define ENABLE_INDP_AUTOCM_MASK (1 << 3) +#define MSR_IA32_MISC_ENABLES 0x1a0 +#define MSR_POWER_CTL 0x1fc +#define MSR_PKG_POWER_SKU_UNIT 0x606 +#define MSR_IACORE_RATIOS 0x66a +#define MSR_IACORE_TURBO_RATIOS 0x66c +#define MSR_IACORE_VIDS 0x66b +#define MSR_IACORE_TURBO_VIDS 0x66d +#define MSR_PKG_TURBO_CFG1 0x670 +#define MSR_CPU_TURBO_WKLD_CFG1 0x671 +#define MSR_CPU_TURBO_WKLD_CFG2 0x672 +#define MSR_CPU_THERM_CFG1 0x673 +#define MSR_CPU_THERM_CFG2 0x674 +#define MSR_CPU_THERM_SENS_CFG 0x675 + +#endif

Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This driver supports multi-core init and sets up the CPU frequencies correctly.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/cpu/baytrail/Makefile | 1 + arch/x86/cpu/baytrail/cpu.c | 206 +++++++++++++++++++++++++++++++ arch/x86/include/asm/arch-baytrail/msr.h | 30 +++++ 3 files changed, 237 insertions(+) create mode 100644 arch/x86/cpu/baytrail/cpu.c create mode 100644 arch/x86/include/asm/arch-baytrail/msr.h
diff --git a/arch/x86/cpu/baytrail/Makefile b/arch/x86/cpu/baytrail/Makefile index 8914e8b..c78b644 100644 --- a/arch/x86/cpu/baytrail/Makefile +++ b/arch/x86/cpu/baytrail/Makefile @@ -4,6 +4,7 @@ # SPDX-License-Identifier: GPL-2.0+ #
+obj-y += cpu.o obj-y += early_uart.o obj-y += fsp_configs.o obj-y += pci.o diff --git a/arch/x86/cpu/baytrail/cpu.c b/arch/x86/cpu/baytrail/cpu.c new file mode 100644 index 0000000..5a2a8ee --- /dev/null +++ b/arch/x86/cpu/baytrail/cpu.c @@ -0,0 +1,206 @@ +/*
* Copyright (C) 2015 Google, Inc
* SPDX-License-Identifier: GPL-2.0+
* Based on code from coreboot
*/
+#include <common.h> +#include <cpu.h> +#include <dm.h> +#include <asm/cpu.h> +#include <asm/lapic.h> +#include <asm/mp.h> +#include <asm/msr.h> +#include <asm/turbo.h> +#include <asm/arch/msr.h>
+#ifdef CONFIG_SMP +static int enable_smis(struct udevice *cpu, void *unused) +{
return 0;
+}
What is this function for? Is this a must-have?
+static struct mp_flight_record mp_steps[] = {
MP_FR_BLOCK_APS(mp_init_cpu, NULL, mp_init_cpu, NULL),
/* Wait for APs to finish initialization before proceeding. */
MP_FR_BLOCK_APS(NULL, NULL, enable_smis, NULL),
+};
+static int detect_num_cpus(void) +{
int ecx = 0;
/*
* Use the algorithm described in Intel 64 and IA-32 Architectures
* Software Developer's Manual Volume 3 (3A, 3B & 3C): System
* Programming Guide, Jan-2015. Section 8.9.2: Hierarchical Mapping
* of CPUID Extended Topology Leaf.
*/
while (1) {
struct cpuid_result leaf_b;
leaf_b = cpuid_ext(0xb, ecx);
/*
* Bay Trail doesn't have hyperthreading so just determine the
* number of cores by from level type (ecx[15:8] == * 2)
*/
if ((leaf_b.ecx & 0xff00) == 0x0200)
return leaf_b.ebx & 0xffff;
ecx++;
}
+}
Since we already describe all cpus in the device tree, is this dynamic probe really needed?
+static int baytrail_init_cpus(void) +{
struct mp_params mp_params;
lapic_setup();
mp_params.num_cpus = detect_num_cpus();
mp_params.parallel_microcode_load = 0,
mp_params.flight_plan = &mp_steps[0];
mp_params.num_records = ARRAY_SIZE(mp_steps);
mp_params.microcode_pointer = 0;
if (mp_init(&mp_params)) {
printf("Warning: MP init failure\n");
return -EIO;
}
return 0;
+} +#endif
+int x86_init_cpus(void) +{ +#ifdef CONFIG_SMP
debug("Init additional CPUs\n");
baytrail_init_cpus();
+#endif
return 0;
+}
+void set_max_freq(void)
Should this be static?
+{
msr_t perf_ctl;
msr_t msr;
/* Enable speed step */
msr = msr_read(MSR_IA32_MISC_ENABLES);
msr.lo |= (1 << 16);
msr_write(MSR_IA32_MISC_ENABLES, msr);
/*
* Set guaranteed ratio [21:16] from IACORE_RATIOS to bits [15:8] of
* the PERF_CTL
*/
msr = msr_read(MSR_IACORE_RATIOS);
perf_ctl.lo = (msr.lo & 0x3f0000) >> 8;
/*
* Set guaranteed vid [21:16] from IACORE_VIDS to bits [7:0] of
* the PERF_CTL
*/
msr = msr_read(MSR_IACORE_VIDS);
perf_ctl.lo |= (msr.lo & 0x7f0000) >> 16;
perf_ctl.hi = 0;
msr_write(MSR_IA32_PERF_CTL, perf_ctl);
+}
+static int cpu_x86_baytrail_probe(struct udevice *dev) +{
debug("Init baytrail core\n");
BayTrail?
/*
* On bay trail the turbo disable bit is actually scoped at the
BayTrail?
* building-block level, not package. For non-BSP cores that are
* within a building block, enable turbo. The cores within the BSP's
* building block will just see it already enabled and move on.
*/
if (lapicid())
turbo_enable();
/* Dynamic L2 shrink enable and threshold */
msr_clrsetbits_64(MSR_PMG_CST_CONFIG_CONTROL, 0x3f000f, 0xe0008),
/* Disable C1E */
msr_clrsetbits_64(MSR_POWER_CTL, 2, 0);
msr_setbits_64(MSR_POWER_MISC, 0x44);
/* Set this core to max frequency ratio */
set_max_freq();
return 0;
+}
+static unsigned bus_freq(void) +{
msr_t clk_info = msr_read(MSR_BSEL_CR_OVERCLOCK_CONTROL);
switch (clk_info.lo & 0x3) {
case 0:
return 83333333;
case 1:
return 100000000;
case 2:
return 133333333;
case 3:
return 116666666;
default:
return 0;
}
+}
+static unsigned long tsc_freq(void) +{
msr_t platform_info;
ulong bclk = bus_freq();
if (!bclk)
return 0;
platform_info = msr_read(MSR_PLATFORM_INFO);
return bclk * ((platform_info.lo >> 8) & 0xff);
+}
+static int baytrail_get_info(struct udevice *dev, struct cpu_info *info) +{
info->cpu_freq = tsc_freq();
info->features = 1 << CPU_FEAT_L1_CACHE | 1 << CPU_FEAT_MMU;
return 0;
+}
+static int cpu_x86_baytrail_bind(struct udevice *dev) +{
struct cpu_platdata *plat = dev_get_parent_platdata(dev);
plat->cpu_id = fdtdec_get_int(gd->fdt_blob, dev->of_offset,
"intel,apic-id", -1);
return 0;
+}
+static const struct cpu_ops cpu_x86_baytrail_ops = {
.get_desc = x86_cpu_get_desc,
.get_info = baytrail_get_info,
+};
+static const struct udevice_id cpu_x86_baytrail_ids[] = {
{ .compatible = "intel,baytrail-cpu" },
{ }
+};
+U_BOOT_DRIVER(cpu_x86_baytrail_drv) = {
.name = "cpu_x86_baytrail",
.id = UCLASS_CPU,
.of_match = cpu_x86_baytrail_ids,
.bind = cpu_x86_baytrail_bind,
.probe = cpu_x86_baytrail_probe,
.ops = &cpu_x86_baytrail_ops,
+}; diff --git a/arch/x86/include/asm/arch-baytrail/msr.h b/arch/x86/include/asm/arch-baytrail/msr.h new file mode 100644 index 0000000..1975aec --- /dev/null +++ b/arch/x86/include/asm/arch-baytrail/msr.h @@ -0,0 +1,30 @@ +/*
* Copyright (C) 2015 Google, Inc
* SPDX-License-Identifier: GPL-2.0+
*/
+#ifndef __asm_arch_msr_h +#define __asm_arch_msr_h
Should be capital letters, or (see below)
+#define MSR_BSEL_CR_OVERCLOCK_CONTROL 0xcd +#define MSR_PMG_CST_CONFIG_CONTROL 0xe2 +#define SINGLE_PCTL (1 << 11) +#define MSR_POWER_MISC 0x120 +#define ENABLE_ULFM_AUTOCM_MASK (1 << 2) +#define ENABLE_INDP_AUTOCM_MASK (1 << 3) +#define MSR_IA32_MISC_ENABLES 0x1a0 +#define MSR_POWER_CTL 0x1fc +#define MSR_PKG_POWER_SKU_UNIT 0x606 +#define MSR_IACORE_RATIOS 0x66a +#define MSR_IACORE_TURBO_RATIOS 0x66c +#define MSR_IACORE_VIDS 0x66b +#define MSR_IACORE_TURBO_VIDS 0x66d +#define MSR_PKG_TURBO_CFG1 0x670 +#define MSR_CPU_TURBO_WKLD_CFG1 0x671 +#define MSR_CPU_TURBO_WKLD_CFG2 0x672 +#define MSR_CPU_THERM_CFG1 0x673 +#define MSR_CPU_THERM_CFG2 0x674 +#define MSR_CPU_THERM_SENS_CFG 0x675
Should these be all put into arch/x86/include/asm/msr-index.h, a single place for all x86 processors' MSR?
+#endif
Regards, Bin
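
The PERF_CTL packing done in set_max_freq() can be checked in isolation with a sketch like the one below. The function name perf_ctl_lo() and its parameters are hypothetical; the masks and shifts are copied from the patch. Note that the patch comment describes the VID field as [21:16], while the mask 0x7f0000 actually covers bits [22:16].

```c
#include <stdint.h>

/*
 * Build the low word of PERF_CTL from the low words of IACORE_RATIOS and
 * IACORE_VIDS, using the same masks and shifts as the patch.
 */
static inline uint32_t perf_ctl_lo(uint32_t ratios_lo, uint32_t vids_lo)
{
	uint32_t lo;

	/* Guaranteed ratio, bits [21:16], moves down to bits [13:8] */
	lo = (ratios_lo & 0x3f0000) >> 8;

	/* Guaranteed VID, mask 0x7f0000 (bits [22:16]), moves to bits [6:0] */
	lo |= (vids_lo & 0x7f0000) >> 16;

	return lo;
}
```

For example, a ratio field of 0x10 and a VID field of 0x35 combine to 0x1035, whatever the other bits of the two source registers contain.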

Hi Bin,
On 29 April 2015 at 07:57, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This driver supports multi-core init and sets up the CPU frequencies correctly.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/cpu/baytrail/Makefile | 1 + arch/x86/cpu/baytrail/cpu.c | 206 +++++++++++++++++++++++++++++++ arch/x86/include/asm/arch-baytrail/msr.h | 30 +++++ 3 files changed, 237 insertions(+) create mode 100644 arch/x86/cpu/baytrail/cpu.c create mode 100644 arch/x86/include/asm/arch-baytrail/msr.h
diff --git a/arch/x86/cpu/baytrail/Makefile b/arch/x86/cpu/baytrail/Makefile index 8914e8b..c78b644 100644 --- a/arch/x86/cpu/baytrail/Makefile +++ b/arch/x86/cpu/baytrail/Makefile @@ -4,6 +4,7 @@ # SPDX-License-Identifier: GPL-2.0+ #
+obj-y += cpu.o obj-y += early_uart.o obj-y += fsp_configs.o obj-y += pci.o diff --git a/arch/x86/cpu/baytrail/cpu.c b/arch/x86/cpu/baytrail/cpu.c new file mode 100644 index 0000000..5a2a8ee --- /dev/null +++ b/arch/x86/cpu/baytrail/cpu.c @@ -0,0 +1,206 @@ +/*
* Copyright (C) 2015 Google, Inc
* SPDX-License-Identifier: GPL-2.0+
* Based on code from coreboot
*/
+#include <common.h> +#include <cpu.h> +#include <dm.h> +#include <asm/cpu.h> +#include <asm/lapic.h> +#include <asm/mp.h> +#include <asm/msr.h> +#include <asm/turbo.h> +#include <asm/arch/msr.h>
+#ifdef CONFIG_SMP +static int enable_smis(struct udevice *cpu, void *unused) +{
return 0;
+}
What is this function for? Is this a must-have?
It's partly a placeholder, and also is intended to ensure that the APs are all started before the main CPU continues execution.
+static struct mp_flight_record mp_steps[] = {
MP_FR_BLOCK_APS(mp_init_cpu, NULL, mp_init_cpu, NULL),
/* Wait for APs to finish initialization before proceeding. */
MP_FR_BLOCK_APS(NULL, NULL, enable_smis, NULL),
+};
+static int detect_num_cpus(void) +{
int ecx = 0;
/*
* Use the algorithm described in Intel 64 and IA-32 Architectures
* Software Developer's Manual Volume 3 (3A, 3B & 3C): System
* Programming Guide, Jan-2015. Section 8.9.2: Hierarchical Mapping
* of CPUID Extended Topology Leaf.
*/
while (1) {
struct cpuid_result leaf_b;
leaf_b = cpuid_ext(0xb, ecx);
/*
* Bay Trail doesn't have hyperthreading so just determine the
* number of cores by from level type (ecx[15:8] == * 2)
*/
if ((leaf_b.ecx & 0xff00) == 0x0200)
return leaf_b.ebx & 0xffff;
ecx++;
}
+}
Since we already describe all cpus in the device tree, is this dynamic probe really needed?
With MinnowMax I'd like to support the single-core version of the board also. It could have its own device tree, but I don't want to break in this case. However, this case is not tested.
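
A bounded variant of the leaf 0xb walk might look like the sketch below, which falls back to a caller-supplied count instead of looping forever when no core-level entry is reported. This is an illustration only and not part of the series: the cpuid_ext_fn callback, the MAX_TOPOLOGY_LEVELS bound, and the two fake readers are assumptions introduced so the logic can run off-target. The level-type encoding (0 = invalid, 1 = SMT, 2 = core) follows the CPUID extended topology leaf description in the Intel manual.

```c
#include <stdint.h>

struct cpuid_result {
	uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical: the caller supplies the CPUID reader */
typedef struct cpuid_result (*cpuid_ext_fn)(uint32_t leaf, uint32_t subleaf);

#define LEVEL_TYPE_INVALID	0
#define LEVEL_TYPE_CORE		2
#define MAX_TOPOLOGY_LEVELS	8	/* arbitrary bound for this sketch */

static int count_cores(cpuid_ext_fn cpuid_ext, int fallback)
{
	uint32_t sub;

	for (sub = 0; sub < MAX_TOPOLOGY_LEVELS; sub++) {
		struct cpuid_result leaf_b = cpuid_ext(0xb, sub);
		uint32_t level_type = (leaf_b.ecx >> 8) & 0xff;

		/* Core level found: EBX[15:0] holds the processor count */
		if (level_type == LEVEL_TYPE_CORE)
			return leaf_b.ebx & 0xffff;

		/* An invalid level type marks the end of the list */
		if (level_type == LEVEL_TYPE_INVALID)
			break;
	}

	return fallback;
}

/* Fake reader: SMT level at sub-leaf 0, core level with 2 CPUs at sub-leaf 1 */
static struct cpuid_result fake_dual_core(uint32_t leaf, uint32_t sub)
{
	struct cpuid_result res = { 0, 0, 0, 0 };

	(void)leaf;
	if (sub == 0) {
		res.ebx = 1;
		res.ecx = (1 << 8) | sub;
	} else if (sub == 1) {
		res.ebx = 2;
		res.ecx = (2 << 8) | sub;
	}

	return res;
}

/* Fake reader: never reports a core level */
static struct cpuid_result fake_no_core_level(uint32_t leaf, uint32_t sub)
{
	struct cpuid_result res = { 0, 0, 0, 0 };

	(void)leaf;
	(void)sub;

	return res;
}
```

With the fake readers, count_cores(fake_dual_core, 1) reports two cores, while count_cores(fake_no_core_level, 1) returns the fallback of one rather than spinning.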
+static int baytrail_init_cpus(void) +{
struct mp_params mp_params;
lapic_setup();
mp_params.num_cpus = detect_num_cpus();
mp_params.parallel_microcode_load = 0,
mp_params.flight_plan = &mp_steps[0];
mp_params.num_records = ARRAY_SIZE(mp_steps);
mp_params.microcode_pointer = 0;
if (mp_init(&mp_params)) {
printf("Warning: MP init failure\n");
return -EIO;
}
return 0;
+} +#endif
+int x86_init_cpus(void) +{ +#ifdef CONFIG_SMP
debug("Init additional CPUs\n");
baytrail_init_cpus();
+#endif
return 0;
+}
+void set_max_freq(void)
Should this be static?
Yes
+{
msr_t perf_ctl;
msr_t msr;
/* Enable speed step */
msr = msr_read(MSR_IA32_MISC_ENABLES);
msr.lo |= (1 << 16);
msr_write(MSR_IA32_MISC_ENABLES, msr);
/*
* Set guaranteed ratio [21:16] from IACORE_RATIOS to bits [15:8] of
* the PERF_CTL
*/
msr = msr_read(MSR_IACORE_RATIOS);
perf_ctl.lo = (msr.lo & 0x3f0000) >> 8;
/*
* Set guaranteed vid [21:16] from IACORE_VIDS to bits [7:0] of
* the PERF_CTL
*/
msr = msr_read(MSR_IACORE_VIDS);
perf_ctl.lo |= (msr.lo & 0x7f0000) >> 16;
perf_ctl.hi = 0;
msr_write(MSR_IA32_PERF_CTL, perf_ctl);
+}
+static int cpu_x86_baytrail_probe(struct udevice *dev) +{
debug("Init baytrail core\n");
BayTrail?
OK
/*
* On bay trail the turbo disable bit is actually scoped at the
BayTrail?
* building-block level, not package. For non-BSP cores that are
* within a building block, enable turbo. The cores within the BSP's
* building block will just see it already enabled and move on.
*/
if (lapicid())
turbo_enable();
/* Dynamic L2 shrink enable and threshold */
msr_clrsetbits_64(MSR_PMG_CST_CONFIG_CONTROL, 0x3f000f, 0xe0008),
/* Disable C1E */
msr_clrsetbits_64(MSR_POWER_CTL, 2, 0);
msr_setbits_64(MSR_POWER_MISC, 0x44);
/* Set this core to max frequency ratio */
set_max_freq();
return 0;
+}
+static unsigned bus_freq(void) +{
msr_t clk_info = msr_read(MSR_BSEL_CR_OVERCLOCK_CONTROL);
switch (clk_info.lo & 0x3) {
case 0:
return 83333333;
case 1:
return 100000000;
case 2:
return 133333333;
case 3:
return 116666666;
default:
return 0;
}
+}
+static unsigned long tsc_freq(void) +{
msr_t platform_info;
ulong bclk = bus_freq();
if (!bclk)
return 0;
platform_info = msr_read(MSR_PLATFORM_INFO);
return bclk * ((platform_info.lo >> 8) & 0xff);
+}
+static int baytrail_get_info(struct udevice *dev, struct cpu_info *info) +{
info->cpu_freq = tsc_freq();
info->features = 1 << CPU_FEAT_L1_CACHE | 1 << CPU_FEAT_MMU;
return 0;
+}
+static int cpu_x86_baytrail_bind(struct udevice *dev) +{
struct cpu_platdata *plat = dev_get_parent_platdata(dev);
plat->cpu_id = fdtdec_get_int(gd->fdt_blob, dev->of_offset,
"intel,apic-id", -1);
return 0;
+}
+static const struct cpu_ops cpu_x86_baytrail_ops = {
.get_desc = x86_cpu_get_desc,
.get_info = baytrail_get_info,
+};
+static const struct udevice_id cpu_x86_baytrail_ids[] = {
{ .compatible = "intel,baytrail-cpu" },
{ }
+};
+U_BOOT_DRIVER(cpu_x86_baytrail_drv) = {
.name = "cpu_x86_baytrail",
.id = UCLASS_CPU,
.of_match = cpu_x86_baytrail_ids,
.bind = cpu_x86_baytrail_bind,
.probe = cpu_x86_baytrail_probe,
.ops = &cpu_x86_baytrail_ops,
+}; diff --git a/arch/x86/include/asm/arch-baytrail/msr.h b/arch/x86/include/asm/arch-baytrail/msr.h new file mode 100644 index 0000000..1975aec --- /dev/null +++ b/arch/x86/include/asm/arch-baytrail/msr.h @@ -0,0 +1,30 @@ +/*
* Copyright (C) 2015 Google, Inc
* SPDX-License-Identifier: GPL-2.0+
*/
+#ifndef __asm_arch_msr_h +#define __asm_arch_msr_h
Should be capital letters, or (see below)
+#define MSR_BSEL_CR_OVERCLOCK_CONTROL 0xcd +#define MSR_PMG_CST_CONFIG_CONTROL 0xe2 +#define SINGLE_PCTL (1 << 11) +#define MSR_POWER_MISC 0x120 +#define ENABLE_ULFM_AUTOCM_MASK (1 << 2) +#define ENABLE_INDP_AUTOCM_MASK (1 << 3) +#define MSR_IA32_MISC_ENABLES 0x1a0 +#define MSR_POWER_CTL 0x1fc +#define MSR_PKG_POWER_SKU_UNIT 0x606 +#define MSR_IACORE_RATIOS 0x66a +#define MSR_IACORE_TURBO_RATIOS 0x66c +#define MSR_IACORE_VIDS 0x66b +#define MSR_IACORE_TURBO_VIDS 0x66d +#define MSR_PKG_TURBO_CFG1 0x670 +#define MSR_CPU_TURBO_WKLD_CFG1 0x671 +#define MSR_CPU_TURBO_WKLD_CFG2 0x672 +#define MSR_CPU_THERM_CFG1 0x673 +#define MSR_CPU_THERM_CFG2 0x674 +#define MSR_CPU_THERM_SENS_CFG 0x675
Should these be all put into arch/x86/include/asm/msr-index.h, a single place for all x86 processors' MSR?
I was worried that they might be specific to this CPU. But if they are common then yes they should go in the common file.
Regards, Simon
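
The frequency calculation in bus_freq() and tsc_freq() reduces to a table lookup followed by a multiply, which can be sketched as below. The function names and the idea of passing the raw MSR low words as parameters are assumptions made so the arithmetic can be checked without hardware; the table values and the ratio field (bits [15:8] of PLATFORM_INFO) are taken from the patch. The sketch returns a 64-bit result because the largest bus clock times the largest 8-bit ratio does not fit in 32 bits.

```c
#include <stdint.h>

/* Map the 2-bit bus-select field to a bus clock in Hz, as in the patch */
static inline uint32_t bus_freq_hz(uint32_t bsel_lo)
{
	switch (bsel_lo & 0x3) {
	case 0:
		return 83333333;
	case 1:
		return 100000000;
	case 2:
		return 133333333;
	default:	/* only the value 3 remains after the mask */
		return 116666666;
	}
}

/* TSC frequency = bus clock * ratio from PLATFORM_INFO bits [15:8] */
static inline uint64_t tsc_freq_hz(uint32_t bsel_lo, uint32_t platform_info_lo)
{
	uint32_t ratio = (platform_info_lo >> 8) & 0xff;

	return (uint64_t)bus_freq_hz(bsel_lo) * ratio;
}
```

Since the selector is masked to two bits, all four values are covered by the table, so the zero-return default case in the patch cannot be reached.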

Hi Simon,
On Wed, Apr 29, 2015 at 10:00 PM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
On 29 April 2015 at 07:57, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This driver supports multi-core init and sets up the CPU frequencies correctly.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/cpu/baytrail/Makefile | 1 + arch/x86/cpu/baytrail/cpu.c | 206 +++++++++++++++++++++++++++++++ arch/x86/include/asm/arch-baytrail/msr.h | 30 +++++ 3 files changed, 237 insertions(+) create mode 100644 arch/x86/cpu/baytrail/cpu.c create mode 100644 arch/x86/include/asm/arch-baytrail/msr.h
diff --git a/arch/x86/cpu/baytrail/Makefile b/arch/x86/cpu/baytrail/Makefile index 8914e8b..c78b644 100644 --- a/arch/x86/cpu/baytrail/Makefile +++ b/arch/x86/cpu/baytrail/Makefile @@ -4,6 +4,7 @@ # SPDX-License-Identifier: GPL-2.0+ #
+obj-y += cpu.o obj-y += early_uart.o obj-y += fsp_configs.o obj-y += pci.o diff --git a/arch/x86/cpu/baytrail/cpu.c b/arch/x86/cpu/baytrail/cpu.c new file mode 100644 index 0000000..5a2a8ee --- /dev/null +++ b/arch/x86/cpu/baytrail/cpu.c @@ -0,0 +1,206 @@ +/*
* Copyright (C) 2015 Google, Inc
* SPDX-License-Identifier: GPL-2.0+
* Based on code from coreboot
*/
+#include <common.h> +#include <cpu.h> +#include <dm.h> +#include <asm/cpu.h> +#include <asm/lapic.h> +#include <asm/mp.h> +#include <asm/msr.h> +#include <asm/turbo.h> +#include <asm/arch/msr.h>
+#ifdef CONFIG_SMP +static int enable_smis(struct udevice *cpu, void *unused) +{
return 0;
+}
What is this function for? Is this a must-have?
It's partly a placeholder, and also is intended to ensure that the APs are all started before the main CPU continues execution.
+static struct mp_flight_record mp_steps[] = {
MP_FR_BLOCK_APS(mp_init_cpu, NULL, mp_init_cpu, NULL),
/* Wait for APs to finish initialization before proceeding. */
MP_FR_BLOCK_APS(NULL, NULL, enable_smis, NULL),
+};
+static int detect_num_cpus(void) +{
int ecx = 0;
/*
* Use the algorithm described in Intel 64 and IA-32 Architectures
* Software Developer's Manual Volume 3 (3A, 3B & 3C): System
* Programming Guide, Jan-2015. Section 8.9.2: Hierarchical Mapping
* of CPUID Extended Topology Leaf.
*/
while (1) {
struct cpuid_result leaf_b;
leaf_b = cpuid_ext(0xb, ecx);
/*
* Bay Trail doesn't have hyperthreading so just determine the
* number of cores by from level type (ecx[15:8] == * 2)
*/
if ((leaf_b.ecx & 0xff00) == 0x0200)
return leaf_b.ebx & 0xffff;
ecx++;
}
+}
Since we already describe all cpus in the device tree, is this dynamic probe really needed?
With MinnowMax I'd like to support the single-core version of the board also. It could have its own device tree, but I don't want to break in this case. However, this case is not tested.
Do you mean that there is a specific version of the MinnowMax board which contains a single-core version of the Atom E3800 (or maybe it has another brand name)? But as you said, we can create another device tree for the single-core version, no? Or maybe we fix up the DTB here dynamically: we still have only one device tree, describing the dual-core version, and after this dynamic probe we remove one cpu node if we detect a single-core part?
+static int baytrail_init_cpus(void) +{
struct mp_params mp_params;
lapic_setup();
mp_params.num_cpus = detect_num_cpus();
mp_params.parallel_microcode_load = 0,
mp_params.flight_plan = &mp_steps[0];
mp_params.num_records = ARRAY_SIZE(mp_steps);
mp_params.microcode_pointer = 0;
if (mp_init(&mp_params)) {
printf("Warning: MP init failure\n");
return -EIO;
}
return 0;
+} +#endif
+int x86_init_cpus(void) +{ +#ifdef CONFIG_SMP
debug("Init additional CPUs\n");
baytrail_init_cpus();
+#endif
return 0;
+}
+void set_max_freq(void)
Should this be static?
Yes
+{
msr_t perf_ctl;
msr_t msr;
/* Enable speed step */
msr = msr_read(MSR_IA32_MISC_ENABLES);
msr.lo |= (1 << 16);
msr_write(MSR_IA32_MISC_ENABLES, msr);
/*
* Set guaranteed ratio [21:16] from IACORE_RATIOS to bits [15:8] of
* the PERF_CTL
*/
msr = msr_read(MSR_IACORE_RATIOS);
perf_ctl.lo = (msr.lo & 0x3f0000) >> 8;
/*
* Set guaranteed vid [21:16] from IACORE_VIDS to bits [7:0] of
* the PERF_CTL
*/
msr = msr_read(MSR_IACORE_VIDS);
perf_ctl.lo |= (msr.lo & 0x7f0000) >> 16;
perf_ctl.hi = 0;
msr_write(MSR_IA32_PERF_CTL, perf_ctl);
+}
+static int cpu_x86_baytrail_probe(struct udevice *dev) +{
debug("Init baytrail core\n");
BayTrail?
OK
/*
* On bay trail the turbo disable bit is actually scoped at the
BayTrail?
* building-block level, not package. For non-BSP cores that are
* within a building block, enable turbo. The cores within the BSP's
* building block will just see it already enabled and move on.
*/
if (lapicid())
turbo_enable();
/* Dynamic L2 shrink enable and threshold */
msr_clrsetbits_64(MSR_PMG_CST_CONFIG_CONTROL, 0x3f000f, 0xe0008),
/* Disable C1E */
msr_clrsetbits_64(MSR_POWER_CTL, 2, 0);
msr_setbits_64(MSR_POWER_MISC, 0x44);
/* Set this core to max frequency ratio */
set_max_freq();
return 0;
+}
+static unsigned bus_freq(void) +{
msr_t clk_info = msr_read(MSR_BSEL_CR_OVERCLOCK_CONTROL);
switch (clk_info.lo & 0x3) {
case 0:
return 83333333;
case 1:
return 100000000;
case 2:
return 133333333;
case 3:
return 116666666;
default:
return 0;
}
+}
+static unsigned long tsc_freq(void) +{
msr_t platform_info;
ulong bclk = bus_freq();
if (!bclk)
return 0;
platform_info = msr_read(MSR_PLATFORM_INFO);
return bclk * ((platform_info.lo >> 8) & 0xff);
+}
+static int baytrail_get_info(struct udevice *dev, struct cpu_info *info) +{
info->cpu_freq = tsc_freq();
info->features = 1 << CPU_FEAT_L1_CACHE | 1 << CPU_FEAT_MMU;
return 0;
+}
+static int cpu_x86_baytrail_bind(struct udevice *dev) +{
struct cpu_platdata *plat = dev_get_parent_platdata(dev);
plat->cpu_id = fdtdec_get_int(gd->fdt_blob, dev->of_offset,
"intel,apic-id", -1);
return 0;
+}
+static const struct cpu_ops cpu_x86_baytrail_ops = {
.get_desc = x86_cpu_get_desc,
.get_info = baytrail_get_info,
+};
+static const struct udevice_id cpu_x86_baytrail_ids[] = {
{ .compatible = "intel,baytrail-cpu" },
{ }
+};
+U_BOOT_DRIVER(cpu_x86_baytrail_drv) = {
.name = "cpu_x86_baytrail",
.id = UCLASS_CPU,
.of_match = cpu_x86_baytrail_ids,
.bind = cpu_x86_baytrail_bind,
.probe = cpu_x86_baytrail_probe,
.ops = &cpu_x86_baytrail_ops,
+}; diff --git a/arch/x86/include/asm/arch-baytrail/msr.h b/arch/x86/include/asm/arch-baytrail/msr.h new file mode 100644 index 0000000..1975aec --- /dev/null +++ b/arch/x86/include/asm/arch-baytrail/msr.h @@ -0,0 +1,30 @@ +/*
* Copyright (C) 2015 Google, Inc
* SPDX-License-Identifier: GPL-2.0+
*/
+#ifndef __asm_arch_msr_h +#define __asm_arch_msr_h
Should be capital letters, or (see below)
+#define MSR_BSEL_CR_OVERCLOCK_CONTROL 0xcd +#define MSR_PMG_CST_CONFIG_CONTROL 0xe2 +#define SINGLE_PCTL (1 << 11) +#define MSR_POWER_MISC 0x120 +#define ENABLE_ULFM_AUTOCM_MASK (1 << 2) +#define ENABLE_INDP_AUTOCM_MASK (1 << 3) +#define MSR_IA32_MISC_ENABLES 0x1a0 +#define MSR_POWER_CTL 0x1fc +#define MSR_PKG_POWER_SKU_UNIT 0x606 +#define MSR_IACORE_RATIOS 0x66a +#define MSR_IACORE_TURBO_RATIOS 0x66c +#define MSR_IACORE_VIDS 0x66b +#define MSR_IACORE_TURBO_VIDS 0x66d +#define MSR_PKG_TURBO_CFG1 0x670 +#define MSR_CPU_TURBO_WKLD_CFG1 0x671 +#define MSR_CPU_TURBO_WKLD_CFG2 0x672 +#define MSR_CPU_THERM_CFG1 0x673 +#define MSR_CPU_THERM_CFG2 0x674 +#define MSR_CPU_THERM_SENS_CFG 0x675
Should these be all put into arch/x86/include/asm/msr-index.h, a single place for all x86 processors' MSR?
I was worried that they might be specific to this CPU. But if they are common then yes they should go in the common file.
Maybe some of them are BayTrail-specific. But these MSRs are documented in the Intel 64 and IA-32 Architectures Software Developer's Manual Volume 3, right? If yes, I think it's fine to put them in just one place.
Regards, Bin

Hi Bin,
On 29 April 2015 at 08:25, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 10:00 PM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
On 29 April 2015 at 07:57, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This driver supports multi-core init and sets up the CPU frequencies correctly.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/cpu/baytrail/Makefile | 1 + arch/x86/cpu/baytrail/cpu.c | 206 +++++++++++++++++++++++++++++++ arch/x86/include/asm/arch-baytrail/msr.h | 30 +++++ 3 files changed, 237 insertions(+) create mode 100644 arch/x86/cpu/baytrail/cpu.c create mode 100644 arch/x86/include/asm/arch-baytrail/msr.h
diff --git a/arch/x86/cpu/baytrail/Makefile b/arch/x86/cpu/baytrail/Makefile index 8914e8b..c78b644 100644 --- a/arch/x86/cpu/baytrail/Makefile +++ b/arch/x86/cpu/baytrail/Makefile @@ -4,6 +4,7 @@ # SPDX-License-Identifier: GPL-2.0+ #
+obj-y += cpu.o obj-y += early_uart.o obj-y += fsp_configs.o obj-y += pci.o diff --git a/arch/x86/cpu/baytrail/cpu.c b/arch/x86/cpu/baytrail/cpu.c new file mode 100644 index 0000000..5a2a8ee --- /dev/null +++ b/arch/x86/cpu/baytrail/cpu.c @@ -0,0 +1,206 @@ +/*
- Copyright (C) 2015 Google, Inc
- SPDX-License-Identifier: GPL-2.0+
- Based on code from coreboot
- */
+#include <common.h> +#include <cpu.h> +#include <dm.h> +#include <asm/cpu.h> +#include <asm/lapic.h> +#include <asm/mp.h> +#include <asm/msr.h> +#include <asm/turbo.h> +#include <asm/arch/msr.h>
+#ifdef CONFIG_SMP +static int enable_smis(struct udevice *cpu, void *unused) +{
return 0;
+}
What is this function for? Is this a must-have?
It's partly a placeholder, and also is intended to ensure that the APs are all started before the main CPU continues execution.
+static struct mp_flight_record mp_steps[] = {
MP_FR_BLOCK_APS(mp_init_cpu, NULL, mp_init_cpu, NULL),
/* Wait for APs to finish initialization before proceeding. */
MP_FR_BLOCK_APS(NULL, NULL, enable_smis, NULL),
+};
+static int detect_num_cpus(void) +{
int ecx = 0;
/*
* Use the algorithm described in Intel 64 and IA-32 Architectures
* Software Developer's Manual Volume 3 (3A, 3B & 3C): System
* Programming Guide, Jan-2015. Section 8.9.2: Hierarchical Mapping
* of CPUID Extended Topology Leaf.
*/
while (1) {
struct cpuid_result leaf_b;
leaf_b = cpuid_ext(0xb, ecx);
/*
* Bay Trail doesn't have hyperthreading so just determine the
* number of cores from the level type (ecx[15:8] == 2)
*/
if ((leaf_b.ecx & 0xff00) == 0x0200)
return leaf_b.ebx & 0xffff;
ecx++;
}
+}
Since we already describe all cpus in the device tree, is this dynamic probe really needed?
With MinnowMax I'd like to support the single-core version of the board also. It could have its own device tree, but I don't want to break in this case. However, this case is not tested.
Do you mean that there is a specific version of the MinnowMax board which contains a single-core version of the Atom E3800 (or maybe another brand name)? But as you said, we can create another device tree for the single-core version, no? Or maybe we fix up the DTB dynamically here: we still have only one device tree, describing the dual-core version, and after this dynamic probe we remove one cpu node from the DTB if we find a single-core part?
Yes that's right. I think we can just mark the second core disabled. But in the case that it doesn't exist I'd like the code to have the same effect.
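As posted, the loop in detect_num_cpus() only returns once a level of type 2 (core) is reported, so it has no exit on a part that never reports one. The following is a minimal sketch of a bounded variant that falls back to a single CPU. It is not code from the patch: `count_cpus()`, `MAX_TOPOLOGY_LEVELS` and the two `fake_*_leaf_b()` stubs are hypothetical names, and the stubs stand in for cpuid_ext(0xb, level) so the logic can be exercised off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cpuid_result {
	uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical stub: dual-core part, no hyperthreading */
static struct cpuid_result fake_dual_core_leaf_b(uint32_t level)
{
	struct cpuid_result r = { 0, 0, level, 0 };

	if (level == 0) {		/* level type 1 (SMT): 1 thread per core */
		r.ebx = 1;
		r.ecx |= 1 << 8;
	} else if (level == 1) {	/* level type 2 (core): 2 logical CPUs */
		r.ebx = 2;
		r.ecx |= 2 << 8;
	}
	/* any higher level reports type 0 (invalid) */
	return r;
}

/* Hypothetical stub: a part that never reports a core level */
static struct cpuid_result fake_no_core_leaf_b(uint32_t level)
{
	struct cpuid_result r = { 0, 0, level, 0 };

	return r;
}

#define MAX_TOPOLOGY_LEVELS	8	/* illustrative bound, not from the patch */

static int count_cpus(struct cpuid_result (*leaf_b)(uint32_t level))
{
	uint32_t level;

	assert(leaf_b != NULL);
	for (level = 0; level < MAX_TOPOLOGY_LEVELS; level++) {
		struct cpuid_result r = leaf_b(level);
		uint32_t type = (r.ecx >> 8) & 0xff;

		if (type == 2)		/* core level found */
			return r.ebx & 0xffff;
		if (type == 0)		/* no further levels */
			break;
	}

	return 1;	/* assume a single CPU when no core level exists */
}
```

Whether falling back to one CPU is the right policy for the single-core board is a design choice for the series; the sketch only shows that the walk can terminate on its own.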
+static int baytrail_init_cpus(void) +{
struct mp_params mp_params;
lapic_setup();
mp_params.num_cpus = detect_num_cpus();
mp_params.parallel_microcode_load = 0;
mp_params.flight_plan = &mp_steps[0];
mp_params.num_records = ARRAY_SIZE(mp_steps);
mp_params.microcode_pointer = 0;
if (mp_init(&mp_params)) {
printf("Warning: MP init failure\n");
return -EIO;
}
return 0;
+} +#endif
+int x86_init_cpus(void) +{ +#ifdef CONFIG_SMP
debug("Init additional CPUs\n");
baytrail_init_cpus();
+#endif
return 0;
+}
+void set_max_freq(void)
Should this be static?
Yes
+{
msr_t perf_ctl;
msr_t msr;
/* Enable speed step */
msr = msr_read(MSR_IA32_MISC_ENABLES);
msr.lo |= (1 << 16);
msr_write(MSR_IA32_MISC_ENABLES, msr);
/*
* Set guaranteed ratio [21:16] from IACORE_RATIOS to bits [15:8] of
* the PERF_CTL
*/
msr = msr_read(MSR_IACORE_RATIOS);
perf_ctl.lo = (msr.lo & 0x3f0000) >> 8;
/*
* Set guaranteed vid [21:16] from IACORE_VIDS to bits [7:0] of
* the PERF_CTL
*/
msr = msr_read(MSR_IACORE_VIDS);
perf_ctl.lo |= (msr.lo & 0x7f0000) >> 16;
perf_ctl.hi = 0;
msr_write(MSR_IA32_PERF_CTL, perf_ctl);
+}
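The bit shuffling in set_max_freq() can be checked in isolation. This is a sketch under stated assumptions rather than part of the patch: `perf_ctl_lo()`, `RATIO_MASK` and `VID_MASK` are names made up here, and the mask values are copied from the code above. Note that the VID mask 0x7f0000 covers bits [22:16], while the comment in the patch says [21:16]; the sketch follows the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Masks copied from the patch; the macro names are hypothetical */
#define RATIO_MASK	0x3f0000u	/* IACORE_RATIOS bits [21:16] */
#define VID_MASK	0x7f0000u	/* IACORE_VIDS bits [22:16] */

/* Build the low word of PERF_CTL from the two source MSR low words */
static uint32_t perf_ctl_lo(uint32_t ratios_lo, uint32_t vids_lo)
{
	uint32_t lo;

	lo = (ratios_lo & RATIO_MASK) >> 8;	/* ratio lands in bits [13:8] */
	lo |= (vids_lo & VID_MASK) >> 16;	/* VID lands in bits [6:0] */

	/* Only the ratio and VID fields may be set */
	assert((lo & ~0x3f7fu) == 0);

	return lo;
}
```

For example, a guaranteed ratio of 0x10 and a VID of 0x25 give a PERF_CTL low word of 0x1025.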
+static int cpu_x86_baytrail_probe(struct udevice *dev) +{
debug("Init baytrail core\n");
BayTrail?
OK
/*
* On bay trail the turbo disable bit is actually scoped at the
BayTrail?
* building-block level, not package. For non-BSP cores that are
* within a building block, enable turbo. The cores within the BSP's
* building block will just see it already enabled and move on.
*/
if (lapicid())
turbo_enable();
/* Dynamic L2 shrink enable and threshold */
msr_clrsetbits_64(MSR_PMG_CST_CONFIG_CONTROL, 0x3f000f, 0xe0008);
/* Disable C1E */
msr_clrsetbits_64(MSR_POWER_CTL, 2, 0);
msr_setbits_64(MSR_POWER_MISC, 0x44);
/* Set this core to max frequency ratio */
set_max_freq();
return 0;
+}
+static unsigned bus_freq(void) +{
msr_t clk_info = msr_read(MSR_BSEL_CR_OVERCLOCK_CONTROL);
switch (clk_info.lo & 0x3) {
case 0:
return 83333333;
case 1:
return 100000000;
case 2:
return 133333333;
case 3:
return 116666666;
default:
return 0;
}
+}
+static unsigned long tsc_freq(void) +{
msr_t platform_info;
ulong bclk = bus_freq();
if (!bclk)
return 0;
platform_info = msr_read(MSR_PLATFORM_INFO);
return bclk * ((platform_info.lo >> 8) & 0xff);
+}
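The frequency calculation in bus_freq() and tsc_freq() is a table lookup followed by a multiply. A minimal sketch with the MSR reads replaced by plain parameters, so the arithmetic can be tested off-target; `bus_freq_hz()` and `tsc_freq_hz()` are hypothetical names, and the 64-bit result is a choice made here because 255 x 133333333 does not fit in 32 bits.

```c
#include <stdint.h>

/* Bus clock in Hz, selected by the low two bits of the BSEL MSR value */
static uint32_t bus_freq_hz(uint32_t bsel_lo)
{
	static const uint32_t table[4] = {
		83333333, 100000000, 133333333, 116666666,
	};

	return table[bsel_lo & 0x3];
}

/* TSC frequency: bus clock times the ratio in PLATFORM_INFO bits [15:8] */
static uint64_t tsc_freq_hz(uint32_t bsel_lo, uint32_t platform_info_lo)
{
	uint32_t ratio = (platform_info_lo >> 8) & 0xff;

	return (uint64_t)bus_freq_hz(bsel_lo) * ratio;
}
```

With a bus select of 0 and a ratio of 16, this yields 1333333328 Hz, which is the kind of value the 'cpu' command would then print.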
+static int baytrail_get_info(struct udevice *dev, struct cpu_info *info) +{
info->cpu_freq = tsc_freq();
info->features = 1 << CPU_FEAT_L1_CACHE | 1 << CPU_FEAT_MMU;
return 0;
+}
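The features field filled in by baytrail_get_info() is a bitmask indexed by feature number. A small sketch of how a caller might test it, assuming the two enum values below; the real values come from the CPU uclass header and `cpu_has_feature()` is a hypothetical helper, not an existing API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only */
enum {
	CPU_FEAT_L1_CACHE = 0,
	CPU_FEAT_MMU = 1,
};

/* Return true if bit number 'feat' is set in the features mask */
static bool cpu_has_feature(uint32_t features, int feat)
{
	return (features >> feat) & 1u;
}
```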
+static int cpu_x86_baytrail_bind(struct udevice *dev) +{
struct cpu_platdata *plat = dev_get_parent_platdata(dev);
plat->cpu_id = fdtdec_get_int(gd->fdt_blob, dev->of_offset,
"intel,apic-id", -1);
return 0;
+}
+static const struct cpu_ops cpu_x86_baytrail_ops = {
.get_desc = x86_cpu_get_desc,
.get_info = baytrail_get_info,
+};
+static const struct udevice_id cpu_x86_baytrail_ids[] = {
{ .compatible = "intel,baytrail-cpu" },
{ }
+};
+U_BOOT_DRIVER(cpu_x86_baytrail_drv) = {
.name = "cpu_x86_baytrail",
.id = UCLASS_CPU,
.of_match = cpu_x86_baytrail_ids,
.bind = cpu_x86_baytrail_bind,
.probe = cpu_x86_baytrail_probe,
.ops = &cpu_x86_baytrail_ops,
+}; diff --git a/arch/x86/include/asm/arch-baytrail/msr.h b/arch/x86/include/asm/arch-baytrail/msr.h new file mode 100644 index 0000000..1975aec --- /dev/null +++ b/arch/x86/include/asm/arch-baytrail/msr.h @@ -0,0 +1,30 @@ +/*
- Copyright (C) 2015 Google, Inc
- SPDX-License-Identifier: GPL-2.0+
- */
+#ifndef __asm_arch_msr_h
+#define __asm_arch_msr_h
Should be capital letters, or (see below)
+#define MSR_BSEL_CR_OVERCLOCK_CONTROL 0xcd
+#define MSR_PMG_CST_CONFIG_CONTROL 0xe2
+#define SINGLE_PCTL (1 << 11)
+#define MSR_POWER_MISC 0x120
+#define ENABLE_ULFM_AUTOCM_MASK (1 << 2)
+#define ENABLE_INDP_AUTOCM_MASK (1 << 3)
+#define MSR_IA32_MISC_ENABLES 0x1a0
+#define MSR_POWER_CTL 0x1fc
+#define MSR_PKG_POWER_SKU_UNIT 0x606
+#define MSR_IACORE_RATIOS 0x66a
+#define MSR_IACORE_TURBO_RATIOS 0x66c
+#define MSR_IACORE_VIDS 0x66b
+#define MSR_IACORE_TURBO_VIDS 0x66d
+#define MSR_PKG_TURBO_CFG1 0x670
+#define MSR_CPU_TURBO_WKLD_CFG1 0x671
+#define MSR_CPU_TURBO_WKLD_CFG2 0x672
+#define MSR_CPU_THERM_CFG1 0x673
+#define MSR_CPU_THERM_CFG2 0x674
+#define MSR_CPU_THERM_SENS_CFG 0x675
Should these be all put into arch/x86/include/asm/msr-index.h, a single place for all x86 processors' MSR?
I was worried that they might be specific to this CPU. But if they are common then yes they should go in the common file.
Maybe some of them are BayTrail-specific. But these MSRs are documented in the Intel 64 and IA-32 Architectures Software Developer's Manual Volume 3, right? If yes, I think it's fine to put them in just one place.
I'll confirm that and move it, thanks.
Regards, Simon

Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This driver supports multi-core init and sets up the CPU frequencies correctly.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/cpu/baytrail/Makefile | 1 + arch/x86/cpu/baytrail/cpu.c | 206 +++++++++++++++++++++++++++++++ arch/x86/include/asm/arch-baytrail/msr.h | 30 +++++ 3 files changed, 237 insertions(+) create mode 100644 arch/x86/cpu/baytrail/cpu.c create mode 100644 arch/x86/include/asm/arch-baytrail/msr.h
diff --git a/arch/x86/cpu/baytrail/Makefile b/arch/x86/cpu/baytrail/Makefile index 8914e8b..c78b644 100644 --- a/arch/x86/cpu/baytrail/Makefile +++ b/arch/x86/cpu/baytrail/Makefile @@ -4,6 +4,7 @@ # SPDX-License-Identifier: GPL-2.0+ #
+obj-y += cpu.o obj-y += early_uart.o obj-y += fsp_configs.o obj-y += pci.o diff --git a/arch/x86/cpu/baytrail/cpu.c b/arch/x86/cpu/baytrail/cpu.c new file mode 100644 index 0000000..5a2a8ee --- /dev/null +++ b/arch/x86/cpu/baytrail/cpu.c @@ -0,0 +1,206 @@ +/*
- Copyright (C) 2015 Google, Inc
- SPDX-License-Identifier: GPL-2.0+
- Based on code from coreboot
- */
+#include <common.h> +#include <cpu.h> +#include <dm.h> +#include <asm/cpu.h> +#include <asm/lapic.h> +#include <asm/mp.h> +#include <asm/msr.h> +#include <asm/turbo.h> +#include <asm/arch/msr.h>
+#ifdef CONFIG_SMP +static int enable_smis(struct udevice *cpu, void *unused) +{
return 0;
+}
+static struct mp_flight_record mp_steps[] = {
MP_FR_BLOCK_APS(mp_init_cpu, NULL, mp_init_cpu, NULL),
/* Wait for APs to finish initialization before proceeding. */
MP_FR_BLOCK_APS(NULL, NULL, enable_smis, NULL),
+};
+static int detect_num_cpus(void) +{
int ecx = 0;
/*
* Use the algorithm described in Intel 64 and IA-32 Architectures
* Software Developer's Manual Volume 3 (3A, 3B & 3C): System
* Programming Guide, Jan-2015. Section 8.9.2: Hierarchical Mapping
* of CPUID Extended Topology Leaf.
*/
while (1) {
struct cpuid_result leaf_b;
leaf_b = cpuid_ext(0xb, ecx);
/*
* Bay Trail doesn't have hyperthreading so just determine the
* number of cores from the level type (ecx[15:8] == 2)
*/
if ((leaf_b.ecx & 0xff00) == 0x0200)
return leaf_b.ebx & 0xffff;
ecx++;
}
+}
+static int baytrail_init_cpus(void) +{
struct mp_params mp_params;
lapic_setup();
One more comment, I believe this lapic_setup() call can be moved into mp_init(), thoughts?
mp_params.num_cpus = detect_num_cpus();
mp_params.parallel_microcode_load = 0;
mp_params.flight_plan = &mp_steps[0];
mp_params.num_records = ARRAY_SIZE(mp_steps);
mp_params.microcode_pointer = 0;
if (mp_init(&mp_params)) {
printf("Warning: MP init failure\n");
return -EIO;
}
return 0;
+} +#endif
[snip]
Regards, Bin

We don't need to support really old x86 CPUs, so drop this code.
Signed-off-by: Simon Glass sjg@chromium.org ---
Changes in v2: None
 arch/x86/cpu/lapic.c         | 20 ++++++++++++--------
 arch/x86/include/asm/lapic.h |  7 -------
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/arch/x86/cpu/lapic.c b/arch/x86/cpu/lapic.c
index 4690603..0c9c324 100644
--- a/arch/x86/cpu/lapic.c
+++ b/arch/x86/cpu/lapic.c
@@ -15,7 +15,6 @@

 void lapic_setup(void)
 {
-#if NEED_LAPIC == 1
 	/* Only Pentium Pro and later have those MSR stuff */
 	debug("Setting up local apic: ");

@@ -46,12 +45,17 @@ void lapic_setup(void)
 	(LAPIC_LVT_REMOTE_IRR | LAPIC_SEND_PENDING | LAPIC_DELIVERY_MODE_NMI));

-	debug("apic_id: 0x%02lx, ", lapicid());
-#else /* !NEED_LLAPIC */
-	/* Only Pentium Pro and later have those MSR stuff */
-	debug("Disabling local apic: ");
-	disable_lapic();
-#endif /* !NEED_LAPIC */
-	debug("done.\n");
+	debug("apic_id: 0x%02lx\n", lapicid());
 	post_code(POST_LAPIC);
 }
+
+void lapic_enable(void)
+{
+	msr_t msr;
+
+	msr = msr_read(LAPIC_BASE_MSR);
+	msr.hi &= 0xffffff00;
+	msr.lo &= 0x000007ff;
+	msr.lo |= LAPIC_DEFAULT_BASE | LAPIC_BASE_MSR_ENABLE;
+	msr_write(LAPIC_BASE_MSR, msr);
+}
diff --git a/arch/x86/include/asm/lapic.h b/arch/x86/include/asm/lapic.h
index 0a7f443..dff75c5 100644
--- a/arch/x86/include/asm/lapic.h
+++ b/arch/x86/include/asm/lapic.h
@@ -14,13 +14,6 @@
 #include <asm/msr.h>
 #include <asm/processor.h>

-/* See if I need to initialize the local apic */
-#if CONFIG_SMP || CONFIG_IOAPIC
-# define NEED_LAPIC 1
-#else
-# define NEED_LAPIC 0
-#endif
-
 static inline __attribute__((always_inline)) unsigned long lapic_read(unsigned long reg)
 {

Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
We don't need to support really old x86 CPUs, so drop this code.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/cpu/lapic.c | 20 ++++++++++++-------- arch/x86/include/asm/lapic.h | 7 ------- 2 files changed, 12 insertions(+), 15 deletions(-)
diff --git a/arch/x86/cpu/lapic.c b/arch/x86/cpu/lapic.c
index 4690603..0c9c324 100644
--- a/arch/x86/cpu/lapic.c
+++ b/arch/x86/cpu/lapic.c
@@ -15,7 +15,6 @@

 void lapic_setup(void)
 {
-#if NEED_LAPIC == 1
 	/* Only Pentium Pro and later have those MSR stuff */
 	debug("Setting up local apic: ");

@@ -46,12 +45,17 @@ void lapic_setup(void)
 	(LAPIC_LVT_REMOTE_IRR | LAPIC_SEND_PENDING | LAPIC_DELIVERY_MODE_NMI));

-	debug("apic_id: 0x%02lx, ", lapicid());
-#else /* !NEED_LLAPIC */
-	/* Only Pentium Pro and later have those MSR stuff */
-	debug("Disabling local apic: ");
-	disable_lapic();
-#endif /* !NEED_LAPIC */
-	debug("done.\n");
+	debug("apic_id: 0x%02lx\n", lapicid());
 	post_code(POST_LAPIC);
 }
+
+void lapic_enable(void)
+{
+	msr_t msr;
+
+	msr = msr_read(LAPIC_BASE_MSR);
+	msr.hi &= 0xffffff00;
+	msr.lo &= 0x000007ff;
+	msr.lo |= LAPIC_DEFAULT_BASE | LAPIC_BASE_MSR_ENABLE;
+	msr_write(LAPIC_BASE_MSR, msr);
+}
This is duplicated. There is already an enable_lapic() in lapic.h which has the same code.
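For reference, the masking done by lapic_enable() can be written as a pure function and checked without touching an MSR. This is a sketch, not code from the tree: `struct msr` and `lapic_base_enabled()` are hypothetical, and the two constants are assumed values (the architectural default APIC base and the global-enable bit of the APIC base MSR), so they should be checked against the tree's own definitions.

```c
#include <stdint.h>

/* Assumed values, for illustration only */
#define LAPIC_DEFAULT_BASE	0xfee00000u
#define LAPIC_BASE_MSR_ENABLE	(1u << 11)

/* Hypothetical stand-in for msr_t */
struct msr {
	uint32_t lo;
	uint32_t hi;
};

/* Apply the same masks as lapic_enable() to a raw MSR value */
static struct msr lapic_base_enabled(struct msr msr)
{
	msr.hi &= 0xffffff00;	/* clear the base address bits in the high word */
	msr.lo &= 0x000007ff;	/* keep only bits [10:0] of the low word */
	msr.lo |= LAPIC_DEFAULT_BASE | LAPIC_BASE_MSR_ENABLE;

	return msr;
}
```

Any base address previously programmed is replaced by the default one, and the low flag bits (for example the BSP flag in bit 8) are preserved.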
diff --git a/arch/x86/include/asm/lapic.h b/arch/x86/include/asm/lapic.h
index 0a7f443..dff75c5 100644
--- a/arch/x86/include/asm/lapic.h
+++ b/arch/x86/include/asm/lapic.h
@@ -14,13 +14,6 @@
 #include <asm/msr.h>
 #include <asm/processor.h>

-/* See if I need to initialize the local apic */
-#if CONFIG_SMP || CONFIG_IOAPIC
-# define NEED_LAPIC 1
-#else
-# define NEED_LAPIC 0
-#endif
 static inline __attribute__((always_inline)) unsigned long lapic_read(unsigned long reg)
 {
--
Regards, Bin

Enable the CPU uclass and Simple Firmware Interface for Minnowboard MAX. This enables multi-core support in Linux.
Signed-off-by: Simon Glass sjg@chromium.org ---
Changes in v2: None
 arch/x86/dts/minnowmax.dts  | 20 ++++++++++++++++++++
 configs/minnowmax_defconfig |  4 ++++
 2 files changed, 24 insertions(+)

diff --git a/arch/x86/dts/minnowmax.dts b/arch/x86/dts/minnowmax.dts
index 0233f61..7103bc5 100644
--- a/arch/x86/dts/minnowmax.dts
+++ b/arch/x86/dts/minnowmax.dts
@@ -68,6 +68,26 @@
 		stdout-path = "/serial";
 	};

+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		cpu@0 {
+			device_type = "cpu";
+			compatible = "intel,baytrail-cpu";
+			reg = <0>;
+			intel,apic-id = <0>;
+		};
+
+		cpu@1 {
+			device_type = "cpu";
+			compatible = "intel,baytrail-cpu";
+			reg = <1>;
+			intel,apic-id = <4>;
+		};
+
+	};
+
 	spi {
 		#address-cells = <1>;
 		#size-cells = <0>;
diff --git a/configs/minnowmax_defconfig b/configs/minnowmax_defconfig
index c59f4ac..426fb52 100644
--- a/configs/minnowmax_defconfig
+++ b/configs/minnowmax_defconfig
@@ -8,3 +8,7 @@ CONFIG_FRAMEBUFFER_SET_VESA_MODE=y
 CONFIG_FRAMEBUFFER_VESA_MODE_11A=y
 CONFIG_MMCONF_BASE_ADDRESS=0xe0000000
 CONFIG_HAVE_INTEL_ME=y
+CONFIG_GENERATE_SFI_TABLE=y
+CONFIG_CPU=y
+CONFIG_CMD_CPU=y
+CONFIG_SMP=y

On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
Enable the CPU uclass and Simple Firmware Interface for Minnowboard MAX. This enables multi-core support in Linux.
Signed-off-by: Simon Glass sjg@chromium.org
Changes in v2: None
arch/x86/dts/minnowmax.dts | 20 ++++++++++++++++++++ configs/minnowmax_defconfig | 4 ++++ 2 files changed, 24 insertions(+)
diff --git a/arch/x86/dts/minnowmax.dts b/arch/x86/dts/minnowmax.dts index 0233f61..7103bc5 100644 --- a/arch/x86/dts/minnowmax.dts +++ b/arch/x86/dts/minnowmax.dts @@ -68,6 +68,26 @@ stdout-path = "/serial"; };
cpus {
#address-cells = <1>;
#size-cells = <0>;
cpu@0 {
device_type = "cpu";
compatible = "intel,baytrail-cpu";
reg = <0>;
intel,apic-id = <0>;
};
cpu@1 {
device_type = "cpu";
compatible = "intel,baytrail-cpu";
reg = <1>;
intel,apic-id = <4>;
};
};
spi { #address-cells = <1>; #size-cells = <0>;
diff --git a/configs/minnowmax_defconfig b/configs/minnowmax_defconfig index c59f4ac..426fb52 100644 --- a/configs/minnowmax_defconfig +++ b/configs/minnowmax_defconfig @@ -8,3 +8,7 @@ CONFIG_FRAMEBUFFER_SET_VESA_MODE=y CONFIG_FRAMEBUFFER_VESA_MODE_11A=y CONFIG_MMCONF_BASE_ADDRESS=0xe0000000 CONFIG_HAVE_INTEL_ME=y +CONFIG_GENERATE_SFI_TABLE=y +CONFIG_CPU=y +CONFIG_CMD_CPU=y
+CONFIG_SMP=y
Reviewed-by: Bin Meng bmeng.cn@gmail.com

Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This series adds a new CPU uclass which is intended to be useful on any architecture. So far it has a very simple interface and a command to show CPU details.
This series also introduces multi-core init for x86. It is implemented and enabled on Minnowboard MAX, a single/dual-core Atom board. The CPU uclass is implemented for x86 and the Simple Firmware Interface provides these details to the kernel, since ACPI is not yet available.
With these changes Minnowboard MAX can boot into Linux with both cores enabled.
This series is available at u-boot-x86 branch 'cpu-working'.
Changes in v2:
- Use capitals for the header guard
- Change 'print' to 'Print' in comment
- Correct bugs in number output
- Change header guard to capital letters
- Change get_info() in function comment to cpu_get_info()
- Rename CONFIG_SFI to CONFIG_GENERATE_SFI_TABLE and move within Kconfig
- Correct Kconfig help indentation and text
- Drop SFI_BASE config option
- Always build sfi.o
- Use SFI_TABLE_MAX_ENTRIES instead of 16 and ARRAY_SIZE()
- Make get_entry_start() static
- Use table_compute_checksum() to compute checksum
- Add a few blank lines
- Move patch to after the CPU uclass patch
- Drop the RTC table as it is not needed
- Move SFI calling code to write_tables()
- Remove IDLE table
- Remove SFI_SYST_SEARCH_BEGIN and SFI_SYST_SEARCH_END
- Move '__packed' to immediately after 'struct'
- Add SFI_DEV_TYPE_SD and convert to enum
- Remove #ifdef CONFIG_SFI from header file
- Move sfi.h header file to arch/x86/include/asm
- Remove unnecessary \t\n after mfence assembler instruction
This is quick! :) I have not finished reviewing the v1. I will continue reviewing the v2.
Simon Glass (20):
  Fix comment nits in board_f.c
  dm: core: Add a function to bind a driver for a device tree node
  x86: Remove unwanted MMC debugging
  x86: Disable -Werror
  Move display_options functions to their own header
  Add print_freq() to display frequencies nicely
  dm: Implement a CPU uclass
  x86: Add support for the Simple Firmware Interface (SFI)
  Add a 'cpu' command to print CPU information
  x86: Add atomic operations
  x86: Add defines for fixed MTRRs
  x86: Add an mfence macro
  x86: Store the GDT pointer in global_data
  x86: Provide access to the IDT
  x86: Add multi-processor init
  x86: Add functions to set and clear bits on MSRs
  x86: Allow CPUs to be set up after relocation
  x86: Add a CPU driver for baytrail
  x86: Tidy up the LAPIC init code
  x86: Enable multi-core init for Minnowboard MAX

 arch/x86/Kconfig                         |  39 +++
 arch/x86/cpu/Makefile                    |   2 +
 arch/x86/cpu/baytrail/Makefile           |   1 +
 arch/x86/cpu/baytrail/cpu.c              | 206 +++++++++++++
 arch/x86/cpu/baytrail/valleyview.c       |   1 -
 arch/x86/cpu/config.mk                   |   2 +-
 arch/x86/cpu/cpu.c                       |  38 +++
 arch/x86/cpu/interrupts.c                |   5 +
 arch/x86/cpu/ivybridge/model_206ax.c     |   4 +-
 arch/x86/cpu/lapic.c                     |  20 +-
 arch/x86/cpu/mp_init.c                   | 507 +++++++++++++++++++++++++++++++
 arch/x86/cpu/sipi.S                      | 215 +++++++++++++
 arch/x86/dts/minnowmax.dts               |  20 ++
 arch/x86/include/asm/arch-baytrail/msr.h |  30 ++
 arch/x86/include/asm/atomic.h            | 115 +++++++
 arch/x86/include/asm/cpu.h               |  19 ++
 arch/x86/include/asm/global_data.h       |   1 +
 arch/x86/include/asm/interrupt.h         |   2 +
 arch/x86/include/asm/lapic.h             |   7 -
 arch/x86/include/asm/mp.h                |  94 ++++++
 arch/x86/include/asm/msr.h               |  19 ++
 arch/x86/include/asm/mtrr.h              |  14 +
 arch/x86/include/asm/sfi.h               | 137 +++++++++
 arch/x86/include/asm/sipi.h              |  79 +++++
 arch/x86/include/asm/smm.h               |  14 +
 arch/x86/include/asm/u-boot-x86.h        |   2 +
 arch/x86/lib/Makefile                    |   1 +
 arch/x86/lib/sfi.c                       | 154 ++++++++++
 arch/x86/lib/tables.c                    |   5 +
 common/Kconfig                           |   8 +
 common/Makefile                          |   1 +
 common/board_f.c                         |   9 +-
 common/board_r.c                         |   2 +-
 common/cmd_cpu.c                         | 113 +++++++
 configs/minnowmax_defconfig              |   4 +
 drivers/Kconfig                          |   2 +
 drivers/Makefile                         |   1 +
 drivers/core/lists.c                     |   9 +-
 drivers/cpu/Kconfig                      |   8 +
 drivers/cpu/Makefile                     |   7 +
 drivers/cpu/cpu-uclass.c                 |  61 ++++
 include/common.h                         |  16 +-
 include/cpu.h                            |  84 +++++
 include/display_options.h                |  59 ++++
 include/dm/lists.h                       |  16 +
 include/dm/uclass-id.h                   |   1 +
 lib/display_options.c                    |  51 +++-
 47 files changed, 2151 insertions(+), 54 deletions(-)
 create mode 100644 arch/x86/cpu/baytrail/cpu.c
 create mode 100644 arch/x86/cpu/mp_init.c
 create mode 100644 arch/x86/cpu/sipi.S
 create mode 100644 arch/x86/include/asm/arch-baytrail/msr.h
 create mode 100644 arch/x86/include/asm/atomic.h
 create mode 100644 arch/x86/include/asm/mp.h
 create mode 100644 arch/x86/include/asm/sfi.h
 create mode 100644 arch/x86/include/asm/sipi.h
 create mode 100644 arch/x86/include/asm/smm.h
 create mode 100644 arch/x86/lib/sfi.c
 create mode 100644 common/cmd_cpu.c
 create mode 100644 drivers/cpu/Kconfig
 create mode 100644 drivers/cpu/Makefile
 create mode 100644 drivers/cpu/cpu-uclass.c
 create mode 100644 include/cpu.h
 create mode 100644 include/display_options.h
--
Regards, Bin

Hi Simon,
On Wed, Apr 29, 2015 at 11:07 AM, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This series adds a new CPU uclass which is intended to be useful on any architecture. So far it has a very simple interface and a command to show CPU details.
This series also introduces multi-core init for x86. It is implemented and enabled on Minnowboard MAX, a single/dual-core Atom board. The CPU uclass is implemented for x86 and the Simple Firmware Interface provides these details to the kernel, since ACPI is not yet available.
With these changes Minnowboard MAX can boot into Linux with both cores enabled.
This series is available at u-boot-x86 branch 'cpu-working'.
Changes in v2:
- Use capitals for the header guard
- Change 'print' to 'Print' in comment
- Correct bugs in number output
- Change header guard to capital letters
- Change get_info() in function comment to cpu_get_info()
- Rename CONFIG_SFI to CONFIG_GENERATE_SFI_TABLE and move within Kconfig
- Correct Kconfig help indentation and text
- Drop SFI_BASE config option
- Always build sfi.o
- Use SFI_TABLE_MAX_ENTRIES instead of 16 and ARRAY_SIZE()
- Make get_entry_start() static
- Use table_compute_checksum() to compute checksum
- Add a few blank lines
- Move patch to after the CPU uclass patch
- Drop the RTC table as it is not needed
- Move SFI calling code to write_tables()
- Remove IDLE table
- Remove SFI_SYST_SEARCH_BEGIN and SFI_SYST_SEARCH_END
- Move '__packed' to immediately after 'struct'
- Add SFI_DEV_TYPE_SD and convert to enum
- Remove #ifdef CONFIG_SFI from header file
- Move sfi.h header file to arch/x86/include/asm
- Remove unnecessary \t\n after mfence assembler instruction
This is quick! :) I have not finished reviewing the v1. I will continue reviewing the v2.
Simon Glass (20): Fix comment nits in board_f.c dm: core: Add a function to bind a driver for a device tree node
I've finished the review of this patch series, except this dm core one. I think maybe someone else who is more familiar with the dm internals than me could have a look. Thanks for working on these patches. It's really great we can boot an SMP kernel.
[snip]
Regards, Bin

Hi Bin,
On 29 April 2015 at 08:42, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 11:07 AM, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Wed, Apr 29, 2015 at 10:25 AM, Simon Glass sjg@chromium.org wrote:
This series adds a new CPU uclass which is intended to be useful on any architecture. So far it has a very simple interface and a command to show CPU details.
This series also introduces multi-core init for x86. It is implemented and enabled on Minnowboard MAX, a single/dual-core Atom board. The CPU uclass is implemented for x86 and the Simple Firmware Interface provides these details to the kernel, since ACPI is not yet available.
With these changes Minnowboard MAX can boot into Linux with both cores enabled.
This series is available at u-boot-x86 branch 'cpu-working'.
Changes in v2:
- Use capitals for the header guard
- Change 'print' to 'Print' in comment
- Correct bugs in number output
- Change header guard to capital letters
- Change get_info() in function comment to cpu_get_info()
- Rename CONFIG_SFI to CONFIG_GENERATE_SFI_TABLE and move within Kconfig
- Correct Kconfig help indentation and text
- Drop SFI_BASE config option
- Always build sfi.o
- Use SFI_TABLE_MAX_ENTRIES instead of 16 and ARRAY_SIZE()
- Make get_entry_start() static
- Use table_compute_checksum() to compute checksum
- Add a few blank lines
- Move patch to after the CPU uclass patch
- Drop the RTC table as it is not needed
- Move SFI calling code to write_tables()
- Remove IDLE table
- Remove SFI_SYST_SEARCH_BEGIN and SFI_SYST_SEARCH_END
- Move '__packed' to immediately after 'struct'
- Add SFI_DEV_TYPE_SD and convert to enum
- Remove #ifdef CONFIG_SFI from header file
- Move sfi.h header file to arch/x86/include/asm
- Remove unnecessary \t\n after mfence assembler instruction
This is quick! :) I have not finished reviewing the v1. I will continue reviewing the v2.
Simon Glass (20): Fix comment nits in board_f.c dm: core: Add a function to bind a driver for a device tree node
I've finished the review of this patch series, except this dm core one. I think maybe someone else who is more familiar with the dm internals than me could have a look. Thanks for working on these patches. It's really great we can boot an SMP kernel.
Thanks very much for your thorough review. I'll likely pull in the first part of the series and then tidy up and send the rest soon.
Yes, it's good to get SMP. We still have a few more things needed. Pinmux is in progress. ACPI should happen over the next 4-5 months. We probably need some sort of 'SMBIOS' interface, and perhaps SMM.
Regards, Simon