[U-Boot] [RFC PATCH v2 0/9] x86: Add Intel Quark Memory Reference Code (MRC) support

This series adds the second step of bare support for the Intel Quark SoC, which can be validated on the Intel Galileo board. It adds the Intel Quark Memory Reference Code (MRC), ported from the Intel-released UEFI BIOS for Quark, which, as its name suggests, performs memory initialization for Quark-based boards. With this patch series, U-Boot now boots to the command shell on the Intel Galileo gen2 board.
The MRC porting work was mainly replacing all UEFI BIOS APIs with the corresponding U-Boot ones, like register access, timers, debug output, etc. However, the majority of the time was spent converting Intel's coding conventions to U-Boot's: // comment blocks, spaces for indentation, camel case, struct typedefs, and so on. It took me almost 4 days to finish the work, but I have not managed to fix all checkpatch warnings, given the code complexity and the black magic in the code. See the commit notes in patches #3, #4 and #5 for details of the checkpatch warnings.
Currently only the cold boot path is taken by the MRC, and it takes about 10 seconds from power-up to the U-Boot shell. Future work will add a fast boot path (MRC cache), which should boot much faster. Together with the previous Quark patch series, we now have basic support for the Intel Quark SoC in good shape, upon which additional feature work can be built in the following weeks.
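As a concrete illustration of the cold/fast boot path split, the MRC uses a table-driven dispatch (see patch #3, from which struct mem_init and the gating test below are taken); run_mrc_steps() is only a hypothetical helper name for this sketch:

/*
 * Each init step declares which boot paths it participates in, so the
 * future MRC-cache fast path can skip the expensive training steps.
 */
struct mem_init {
	uint16_t post_code;
	uint16_t boot_path;	/* bitwise OR of BM_COLD/BM_FAST/BM_WARM/BM_S3 */
	void (*init_fn)(struct mrc_params *mrc_params);
};

static void run_mrc_steps(struct mrc_params *mrc_params,
			  const struct mem_init *init, int count)
{
	int i;

	for (i = 0; i < count; i++) {
		/* run only the steps that match the current boot mode */
		if (mrc_params->boot_mode & init[i].boot_path)
			init[i].init_fn(mrc_params);
	}
}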
Changes in v2:
- Fixed issues per review comments. See the following link for details.
  http://lists.denx.de/pipermail/u-boot/2015-February/204023.html
- Fixed issues per review comments. See the following link for details.
  http://lists.denx.de/pipermail/u-boot/2015-February/204044.html
- Write out the values instead of BIT in smc.h
- Remove the trailing / in the file header comment block
- Update comment for calling mrc_init()
- Check return status of mrc_init()
Bin Meng (9):
  x86: Allow overriding TSC_FREQ_IN_MHZ
  x86: quark: Bypass TSC calibration
  x86: quark: Add Memory Reference Code (MRC) main routines
  x86: quark: Add utility codes needed for MRC
  x86: quark: Add System Memory Controller support
  x86: quark: Enable the Memory Reference Code build
  fdtdec: Add compatible id and string for Intel Quark MRC
  dt-bindings: Add Intel Quark MRC bindings
  x86: quark: Call MRC in dram_init()
 arch/x86/Kconfig                      |   40 +-
 arch/x86/cpu/quark/Kconfig            |    5 +
 arch/x86/cpu/quark/Makefile           |    1 +
 arch/x86/cpu/quark/dram.c             |   99 +-
 arch/x86/cpu/quark/hte.c              |  396 +++++
 arch/x86/cpu/quark/hte.h              |   44 +
 arch/x86/cpu/quark/mrc.c              |  204 +++
 arch/x86/cpu/quark/mrc_util.c         | 1475 ++++++++++++++++++
 arch/x86/cpu/quark/mrc_util.h         |  153 ++
 arch/x86/cpu/quark/smc.c              | 2764 +++++++++++++++++++++++++++++
 arch/x86/cpu/quark/smc.h              |  446 ++++++
 arch/x86/dts/galileo.dts              |   25 +
 arch/x86/include/asm/arch-quark/mrc.h |  187 +++
 include/dt-bindings/mrc/quark.h       |   83 +
 include/fdtdec.h                      |    1 +
 lib/fdtdec.c                          |    1 +
 16 files changed, 5902 insertions(+), 22 deletions(-)
 create mode 100644 arch/x86/cpu/quark/hte.c
 create mode 100644 arch/x86/cpu/quark/hte.h
 create mode 100644 arch/x86/cpu/quark/mrc.c
 create mode 100644 arch/x86/cpu/quark/mrc_util.c
 create mode 100644 arch/x86/cpu/quark/mrc_util.h
 create mode 100644 arch/x86/cpu/quark/smc.c
 create mode 100644 arch/x86/cpu/quark/smc.h
 create mode 100644 arch/x86/include/asm/arch-quark/mrc.h
 create mode 100644 include/dt-bindings/mrc/quark.h

We should allow the value of TSC_FREQ_IN_MHZ to be overridden by the one in arch/x86/cpu/<xxx>/Kconfig.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Acked-by: Simon Glass <sjg@chromium.org>
---
Changes in v2: None
 arch/x86/Kconfig | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 85dda2e..2370c32 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -348,26 +348,6 @@ config FRAMEBUFFER_VESA_MODE
 
 endmenu
 
-config TSC_CALIBRATION_BYPASS
-	bool "Bypass Time-Stamp Counter (TSC) calibration"
-	default n
-	help
-	  By default U-Boot automatically calibrates Time-Stamp Counter (TSC)
-	  running frequency via Model-Specific Register (MSR) and Programmable
-	  Interval Timer (PIT). If the calibration does not work on your board,
-	  select this option and provide a hardcoded TSC running frequency with
-	  CONFIG_TSC_FREQ_IN_MHZ below.
-
-	  Normally this option should be turned on in a simulation environment
-	  like qemu.
-
-config TSC_FREQ_IN_MHZ
-	int "Time-Stamp Counter (TSC) running frequency in MHz"
-	depends on TSC_CALIBRATION_BYPASS
-	default 1000
-	help
-	  The running frequency in MHz of Time-Stamp Counter (TSC).
-
 config HAVE_FSP
 	bool "Add an Firmware Support Package binary"
 	help
@@ -416,6 +396,26 @@ source "arch/x86/cpu/quark/Kconfig"
 
 source "arch/x86/cpu/queensbay/Kconfig"
 
+config TSC_CALIBRATION_BYPASS
+	bool "Bypass Time-Stamp Counter (TSC) calibration"
+	default n
+	help
+	  By default U-Boot automatically calibrates Time-Stamp Counter (TSC)
+	  running frequency via Model-Specific Register (MSR) and Programmable
+	  Interval Timer (PIT). If the calibration does not work on your board,
+	  select this option and provide a hardcoded TSC running frequency with
+	  CONFIG_TSC_FREQ_IN_MHZ below.
+
+	  Normally this option should be turned on in a simulation environment
+	  like qemu.
+
+config TSC_FREQ_IN_MHZ
+	int "Time-Stamp Counter (TSC) running frequency in MHz"
+	depends on TSC_CALIBRATION_BYPASS
+	default 1000
+	help
+	  The running frequency in MHz of Time-Stamp Counter (TSC).
+
 source "board/coreboot/coreboot/Kconfig"
 
 source "board/google/chromebook_link/Kconfig"

On 5 February 2015 at 08:42, Bin Meng <bmeng.cn@gmail.com> wrote:
We should allow the value of TSC_FREQ_IN_MHZ to be overridden by the one in arch/x86/cpu/<xxx>/Kconfig.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Acked-by: Simon Glass <sjg@chromium.org>
Applied to u-boot-x86, thanks!

For some unknown reason, the TSC calibration via PIT does not work on Quark. Enable bypassing the TSC calibration and override TSC_FREQ_IN_MHZ to 400 in the Kconfig, per the Quark datasheet.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Acked-by: Simon Glass <sjg@chromium.org>
---
Changes in v2: None
 arch/x86/cpu/quark/Kconfig | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/arch/x86/cpu/quark/Kconfig b/arch/x86/cpu/quark/Kconfig
index 163caac..bc961ef 100644
--- a/arch/x86/cpu/quark/Kconfig
+++ b/arch/x86/cpu/quark/Kconfig
@@ -7,6 +7,7 @@
 config INTEL_QUARK
 	bool
 	select HAVE_RMU
+	select TSC_CALIBRATION_BYPASS
 
 if INTEL_QUARK
 
@@ -118,4 +119,8 @@ config SYS_CAR_SIZE
 	  Space in bytes in eSRAM used as Cache-As-RAM (CAR).
 	  Note this size must not exceed eSRAM's total size.
 
+config TSC_FREQ_IN_MHZ
+	int
+	default 400
+
 endif
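To make the bypass concrete, here is a minimal sketch, assuming the two Kconfig options above, of how the timer code can honor it; calibrate_tsc_via_pit() is a hypothetical stand-in for the real MSR/PIT calibration that fails on Quark:

/* Minimal sketch only, not the actual U-Boot implementation */
unsigned long get_tbclk_mhz(void)
{
#ifdef CONFIG_TSC_CALIBRATION_BYPASS
	/* trust the hardcoded Kconfig value: 400 MHz on Quark */
	return CONFIG_TSC_FREQ_IN_MHZ;
#else
	return calibrate_tsc_via_pit();	/* hypothetical placeholder */
#endif
}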

On 5 February 2015 at 08:42, Bin Meng <bmeng.cn@gmail.com> wrote:
For some unknown reason, the TSC calibration via PIT does not work on Quark. Enable bypassing the TSC calibration and override TSC_FREQ_IN_MHZ to 400 in the Kconfig, per the Quark datasheet.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Acked-by: Simon Glass <sjg@chromium.org>
Applied to u-boot-x86, thanks!

Add the main routines for Quark Memory Reference Code (MRC).
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
---
There are 24 checkpatch warnings in this patch, which are:
warning: arch/x86/cpu/quark/mrc.c,43: line over 80 characters ...
I intentionally leave them as they are for now, as fixing these warnings makes the MRC initialization table a little bit harder to read.
Changes in v2:
- Fixed issues per review comments. See the following link for details.
  http://lists.denx.de/pipermail/u-boot/2015-February/204023.html
 arch/x86/cpu/quark/mrc.c              | 204 ++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/arch-quark/mrc.h | 187 +++++++++++++++++++++++++++++++
 2 files changed, 391 insertions(+)
 create mode 100644 arch/x86/cpu/quark/mrc.c
 create mode 100644 arch/x86/include/asm/arch-quark/mrc.h
diff --git a/arch/x86/cpu/quark/mrc.c b/arch/x86/cpu/quark/mrc.c new file mode 100644 index 0000000..7eb34c5 --- /dev/null +++ b/arch/x86/cpu/quark/mrc.c @@ -0,0 +1,204 @@ +/* + * Copyright (C) 2013, Intel Corporation + * Copyright (C) 2015, Bin Meng bmeng.cn@gmail.com + * + * Ported from Intel released Quark UEFI BIOS + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei + * + * SPDX-License-Identifier: Intel + */ + +/* + * This is the main Quark Memory Reference Code (MRC) + * + * These functions are generic and should work for any Quark-based board. + * + * MRC requires two data structures to be passed in which are initialized by + * mrc_adjust_params(). + * + * The basic flow is as follows: + * 01) Check for supported DDR speed configuration + * 02) Set up Memory Manager buffer as pass-through (POR) + * 03) Set Channel Interleaving Mode and Channel Stride to the most aggressive + * setting possible + * 04) Set up the Memory Controller logic + * 05) Set up the DDR_PHY logic + * 06) Initialise the DRAMs (JEDEC) + * 07) Perform the Receive Enable Calibration algorithm + * 08) Perform the Write Leveling algorithm + * 09) Perform the Read Training algorithm (includes internal Vref) + * 10) Perform the Write Training algorithm + * 11) Set Channel Interleaving Mode and Channel Stride to the desired settings + * + * DRAM unit configuration based on Valleyview MRC. + */ + +#include <common.h> +#include <asm/arch/mrc.h> +#include <asm/arch/msg_port.h> +#include "mrc_util.h" +#include "smc.h" + +static const struct mem_init init[] = { + { 0x0101, BM_COLD | BM_FAST | BM_WARM | BM_S3, clear_self_refresh }, + { 0x0200, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_ddr_timing_control }, + { 0x0103, BM_COLD | BM_FAST , prog_decode_before_jedec }, + { 0x0104, BM_COLD | BM_FAST , perform_ddr_reset }, + { 0x0300, BM_COLD | BM_FAST | BM_S3, ddrphy_init }, + { 0x0400, BM_COLD | BM_FAST , perform_jedec_init }, + { 0x0105, BM_COLD | BM_FAST , set_ddr_init_complete }, + { 0x0106, BM_FAST | BM_WARM | BM_S3, restore_timings }, + { 0x0106, BM_COLD , default_timings }, + { 0x0500, BM_COLD , rcvn_cal }, + { 0x0600, BM_COLD , wr_level }, + { 0x0120, BM_COLD , prog_page_ctrl }, + { 0x0700, BM_COLD , rd_train }, + { 0x0800, BM_COLD , wr_train }, + { 0x010b, BM_COLD , store_timings }, + { 0x010c, BM_COLD | BM_FAST | BM_WARM | BM_S3, enable_scrambling }, + { 0x010d, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_ddr_control }, + { 0x010e, BM_COLD | BM_FAST | BM_WARM | BM_S3, prog_dra_drb }, + { 0x010f, BM_WARM | BM_S3, perform_wake }, + { 0x0110, BM_COLD | BM_FAST | BM_WARM | BM_S3, change_refresh_period }, + { 0x0111, BM_COLD | BM_FAST | BM_WARM | BM_S3, set_auto_refresh }, + { 0x0112, BM_COLD | BM_FAST | BM_WARM | BM_S3, ecc_enable }, + { 0x0113, BM_COLD | BM_FAST , memory_test }, + { 0x0114, BM_COLD | BM_FAST | BM_WARM | BM_S3, lock_registers } +}; + +/* Adjust configuration parameters before initialization sequence */ +static void mrc_adjust_params(struct mrc_params *mrc_params) +{ + const struct dram_params *dram_params; + uint8_t dram_width; + uint32_t rank_enables; + uint32_t channel_width; + + ENTERFN(); + + /* initially expect success */ + mrc_params->status = MRC_SUCCESS; + + dram_width = mrc_params->dram_width; + rank_enables = mrc_params->rank_enables; + channel_width = mrc_params->channel_width; + + /* + * Setup board layout (must be reviewed as is selecting static timings) + * 0 == R0 (DDR3 x16), 1 == R1 (DDR3 x16), + * 2 == DV (DDR3 x8), 3 == SV (DDR3 x8). 
+ */ + if (dram_width == X8) + mrc_params->board_id = 2; /* select x8 layout */ + else + mrc_params->board_id = 0; /* select x16 layout */ + + /* initially no memory */ + mrc_params->mem_size = 0; + + /* begin of channel settings */ + dram_params = &mrc_params->params; + + /* + * Determine column bits: + * + * Column: 11 for 8Gbx8, else 10 + */ + mrc_params->column_bits[0] = + ((dram_params[0].density == 4) && + (dram_width == X8)) ? (11) : (10); + + /* + * Determine row bits: + * + * 512Mbx16=12 512Mbx8=13 + * 1Gbx16=13 1Gbx8=14 + * 2Gbx16=14 2Gbx8=15 + * 4Gbx16=15 4Gbx8=16 + * 8Gbx16=16 8Gbx8=16 + */ + mrc_params->row_bits[0] = 12 + (dram_params[0].density) + + (((dram_params[0].density < 4) && + (dram_width == X8)) ? (1) : (0)); + + /* + * Determine per-channel memory size: + * + * (For 2 RANKs, multiply by 2) + * (For 16 bit data bus, divide by 2) + * + * DENSITY WIDTH MEM_AVAILABLE + * 512Mb x16 0x008000000 ( 128MB) + * 512Mb x8 0x010000000 ( 256MB) + * 1Gb x16 0x010000000 ( 256MB) + * 1Gb x8 0x020000000 ( 512MB) + * 2Gb x16 0x020000000 ( 512MB) + * 2Gb x8 0x040000000 (1024MB) + * 4Gb x16 0x040000000 (1024MB) + * 4Gb x8 0x080000000 (2048MB) + */ + mrc_params->channel_size[0] = (1 << dram_params[0].density); + mrc_params->channel_size[0] *= (dram_width == X8) ? 2 : 1; + mrc_params->channel_size[0] *= (rank_enables == 0x3) ? 2 : 1; + mrc_params->channel_size[0] *= (channel_width == X16) ? 1 : 2; + + /* Determine memory size (convert number of 64MB/512Mb units) */ + mrc_params->mem_size += mrc_params->channel_size[0] << 26; + + LEAVEFN(); +} + +static void mrc_mem_init(struct mrc_params *mrc_params) +{ + int i; + + ENTERFN(); + + /* MRC started */ + mrc_post_code(0x01, 0x00); + + if (mrc_params->boot_mode != BM_COLD) { + if (mrc_params->ddr_speed != mrc_params->timings.ddr_speed) { + /* full training required as frequency changed */ + mrc_params->boot_mode = BM_COLD; + } + } + + for (i = 0; i < ARRAY_SIZE(init); i++) { + uint64_t my_tsc; + + if (mrc_params->boot_mode & init[i].boot_path) { + uint8_t major = init[i].post_code >> 8 & 0xff; + uint8_t minor = init[i].post_code >> 0 & 0xff; + mrc_post_code(major, minor); + + my_tsc = rdtsc(); + init[i].init_fn(mrc_params); + DPF(D_TIME, "Execution time %llx", rdtsc() - my_tsc); + } + } + + /* display the timings */ + print_timings(mrc_params); + + /* MRC complete */ + mrc_post_code(0x01, 0xff); + + LEAVEFN(); +} + +void mrc_init(struct mrc_params *mrc_params) +{ + ENTERFN(); + + DPF(D_INFO, "MRC Version %04x %s %s\n", MRC_VERSION, + __DATE__, __TIME__); + + /* Set up the data structures used by mrc_mem_init() */ + mrc_adjust_params(mrc_params); + + /* Initialize system memory */ + mrc_mem_init(mrc_params); + + LEAVEFN(); +} diff --git a/arch/x86/include/asm/arch-quark/mrc.h b/arch/x86/include/asm/arch-quark/mrc.h new file mode 100644 index 0000000..150fbea --- /dev/null +++ b/arch/x86/include/asm/arch-quark/mrc.h @@ -0,0 +1,187 @@ +/* + * Copyright (C) 2013, Intel Corporation + * Copyright (C) 2015, Bin Meng bmeng.cn@gmail.com + * + * Ported from Intel released Quark UEFI BIOS + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei + * + * SPDX-License-Identifier: Intel + */ + +#ifndef _MRC_H_ +#define _MRC_H_ + +#define MRC_VERSION 0x0111 + +/* architectural definitions */ +#define NUM_CHANNELS 1 /* number of channels */ +#define NUM_RANKS 2 /* number of ranks per channel */ +#define NUM_BYTE_LANES 4 /* number of byte lanes per channel */ + +/* software limitations */ +#define MAX_CHANNELS 1 +#define MAX_RANKS 2 +#define MAX_BYTE_LANES 4 + +#define 
MAX_SOCKETS 1 +#define MAX_SIDES 1 +#define MAX_ROWS (MAX_SIDES * MAX_SOCKETS) + +/* Specify DRAM and channel width */ +enum { + X8, /* DRAM width */ + X16, /* DRAM width & Channel Width */ + X32 /* Channel Width */ +}; + +/* Specify DRAM speed */ +enum { + DDRFREQ_800, + DDRFREQ_1066 +}; + +/* Specify DRAM type */ +enum { + DDR3, + DDR3L +}; + +/* + * density: 0=512Mb, 1=Gb, 2=2Gb, 3=4Gb + * cl: DRAM CAS Latency in clocks + * ras: ACT to PRE command period + * wtr: Delay from start of internal write transaction to internal read command + * rrd: ACT to ACT command period (JESD79 specific to page size 1K/2K) + * faw: Four activate window (JESD79 specific to page size 1K/2K) + * + * ras/wtr/rrd/faw timings are in picoseconds + * + * Refer to JEDEC spec (or DRAM datasheet) when changing these values. + */ +struct dram_params { + uint8_t density; + uint8_t cl; + uint32_t ras; + uint32_t wtr; + uint32_t rrd; + uint32_t faw; +}; + +/* + * Delay configuration for individual signals + * Vref setting + * Scrambler seed + */ +struct mrc_timings { + uint32_t rcvn[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES]; + uint32_t rdqs[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES]; + uint32_t wdqs[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES]; + uint32_t wdq[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES]; + uint32_t vref[NUM_CHANNELS][NUM_BYTE_LANES]; + uint32_t wctl[NUM_CHANNELS][NUM_RANKS]; + uint32_t wcmd[NUM_CHANNELS]; + uint32_t scrambler_seed; + /* need to save for the case of frequency change */ + uint8_t ddr_speed; +}; + +/* Boot mode defined as bit mask (1<<n) */ +enum { + BM_UNKNOWN, + BM_COLD = 1, /* full training */ + BM_FAST = 2, /* restore timing parameters */ + BM_S3 = 4, /* resume from S3 */ + BM_WARM = 8 +}; + +/* MRC execution status */ +#define MRC_SUCCESS 0 /* initialization ok */ +#define MRC_E_MEMTEST 1 /* memtest failed */ + +/* + * Memory Reference Code parameters + * + * It includes 3 parts: + * - input parameters like boot mode and DRAM parameters + * - context parameters for MRC internal state + * - output parameters like initialization result and memory size + */ +struct mrc_params { + /* Input parameters */ + uint32_t boot_mode; /* BM_COLD, BM_FAST, BM_WARM, BM_S3 */ + /* DRAM parameters */ + uint8_t dram_width; /* x8, x16 */ + uint8_t ddr_speed; /* DDRFREQ_800, DDRFREQ_1066 */ + uint8_t ddr_type; /* DDR3, DDR3L */ + uint8_t ecc_enables; /* 0, 1 (memory size reduced to 7/8) */ + uint8_t scrambling_enables; /* 0, 1 */ + /* 1, 3 (1'st rank has to be populated if 2'nd rank present) */ + uint32_t rank_enables; + uint32_t channel_enables; /* 1 only */ + uint32_t channel_width; /* x16 only */ + /* 0, 1, 2 (mode 2 forced if ecc enabled) */ + uint32_t address_mode; + /* REFRESH_RATE: 1=1.95us, 2=3.9us, 3=7.8us, others=RESERVED */ + uint8_t refresh_rate; + /* SR_TEMP_RANGE: 0=normal, 1=extended, others=RESERVED */ + uint8_t sr_temp_range; + /* + * RON_VALUE: 0=34ohm, 1=40ohm, others=RESERVED + * (select MRS1.DIC driver impedance control) + */ + uint8_t ron_value; + /* RTT_NOM_VALUE: 0=40ohm, 1=60ohm, 2=120ohm, others=RESERVED */ + uint8_t rtt_nom_value; + /* RD_ODT_VALUE: 0=off, 1=60ohm, 2=120ohm, 3=180ohm, others=RESERVED */ + uint8_t rd_odt_value; + struct dram_params params; + /* Internally used context parameters */ + uint32_t board_id; /* board layout (use x8 or x16 memory) */ + uint32_t hte_setup; /* when set hte reconfiguration requested */ + uint32_t menu_after_mrc; + uint32_t power_down_disable; + uint32_t tune_rcvn; + uint32_t channel_size[NUM_CHANNELS]; + uint32_t column_bits[NUM_CHANNELS]; + uint32_t 
row_bits[NUM_CHANNELS]; + uint32_t mrs1; /* register content saved during training */ + uint8_t first_run; + /* Output parameters */ + /* initialization result (non zero specifies error code) */ + uint32_t status; + /* total memory size in bytes (excludes ECC banks) */ + uint32_t mem_size; + /* training results (also used on input) */ + struct mrc_timings timings; +}; + +/* + * MRC memory initialization structure + * + * post_code: a 16-bit post code of a specific initialization routine + * boot_path: bitwise or of BM_COLD, BM_FAST, BM_WARM and BM_S3 + * init_fn: real memory initialization routine + */ +struct mem_init { + uint16_t post_code; + uint16_t boot_path; + void (*init_fn)(struct mrc_params *mrc_params); +}; + +/* MRC platform data flags */ +#define MRC_FLAG_ECC_EN 0x00000001 +#define MRC_FLAG_SCRAMBLE_EN 0x00000002 +#define MRC_FLAG_MEMTEST_EN 0x00000004 +/* 0b DDR "fly-by" topology else 1b DDR "tree" topology */ +#define MRC_FLAG_TOP_TREE_EN 0x00000008 +/* If set ODR signal is asserted to DRAM devices on writes */ +#define MRC_FLAG_WR_ODT_EN 0x00000010 + +/** + * mrc_init - Memory Reference Code initialization entry routine + * + * @mrc_params: parameters for MRC + */ +void mrc_init(struct mrc_params *mrc_params); + +#endif /* _MRC_H_ */
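For orientation, here is a minimal usage sketch of mrc_init(); it is not the actual patch #9, and the example input values below are assumptions (the real caller reads them from the device tree bindings added in patches #7 and #8):

/* Minimal sketch only, not the actual dram_init() added by patch #9 */
#include <common.h>
#include <errno.h>
#include <asm/arch/mrc.h>

DECLARE_GLOBAL_DATA_PTR;

int dram_init(void)
{
	struct mrc_params mrc_params;

	memset(&mrc_params, 0, sizeof(mrc_params));

	/* assumed example inputs; a real board takes these from the DT */
	mrc_params.boot_mode = BM_COLD;		/* only cold boot for now */
	mrc_params.dram_width = X8;
	mrc_params.ddr_speed = DDRFREQ_800;
	mrc_params.ddr_type = DDR3;
	mrc_params.rank_enables = 1;		/* rank 0 populated */
	mrc_params.channel_enables = 1;
	mrc_params.channel_width = X16;

	mrc_init(&mrc_params);
	if (mrc_params.status != MRC_SUCCESS)
		return -EIO;

	gd->ram_size = mrc_params.mem_size;

	return 0;
}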

On 5 February 2015 at 08:42, Bin Meng <bmeng.cn@gmail.com> wrote:
Add the main routines for Quark Memory Reference Code (MRC).
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
There are 24 checkpatch warnings in this patch, which are:
warning: arch/x86/cpu/quark/mrc.c,43: line over 80 characters ...
I intentionally leave them as they are for now, as fixing these warnings makes the MRC initialization table a little bit harder to read.
Changes in v2:
- Fixed issues per review comments. See the following link for details.
  http://lists.denx.de/pipermail/u-boot/2015-February/204023.html
 arch/x86/cpu/quark/mrc.c              | 204 ++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/arch-quark/mrc.h | 187 +++++++++++++++++++++++++++++++
 2 files changed, 391 insertions(+)
 create mode 100644 arch/x86/cpu/quark/mrc.c
 create mode 100644 arch/x86/include/asm/arch-quark/mrc.h
Acked-by: Simon Glass <sjg@chromium.org>

On 5 February 2015 at 17:51, Simon Glass <sjg@chromium.org> wrote:
On 5 February 2015 at 08:42, Bin Meng <bmeng.cn@gmail.com> wrote:
Add the main routines for Quark Memory Reference Code (MRC).
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
There are 24 checkpatch warnings in this patch, which are:
warning: arch/x86/cpu/quark/mrc.c,43: line over 80 characters ...
I intentionally leave them as they are for now, as fixing these warnings makes the MRC initialization table a little bit harder to read.
Changes in v2:
- Fixed issues per review comments. See the following link for details.
  http://lists.denx.de/pipermail/u-boot/2015-February/204023.html
 arch/x86/cpu/quark/mrc.c              | 204 ++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/arch-quark/mrc.h | 187 +++++++++++++++++++++++++++++++
 2 files changed, 391 insertions(+)
 create mode 100644 arch/x86/cpu/quark/mrc.c
 create mode 100644 arch/x86/include/asm/arch-quark/mrc.h
Acked-by: Simon Glass <sjg@chromium.org>
Applied to u-boot-x86, thanks!

Add various utility code needed for Quark MRC.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
---
There are 12 checkpatch warnings in this patch, which are:
warning: arch/x86/cpu/quark/mrc_util.c,1446: Too many leading tabs - consider code refactoring
warning: arch/x86/cpu/quark/mrc_util.c,1450: line over 80 characters
...
Fixing 'Too many leading tabs ...' would be very dangerous, as I don't have all the details on how Intel's MRC code actually drives the hardware. Trying to refactor it may lead to non-working MRC code. As for the 'line over 80 characters' issue, we have to leave those lines as they are for now because of the 'Too many leading tabs ...' problem, sigh.
Changes in v2:
- Fixed issues per review comments. See the following link for details.
  http://lists.denx.de/pipermail/u-boot/2015-February/204044.html
 arch/x86/cpu/quark/hte.c      |  396 +++++
 arch/x86/cpu/quark/hte.h      |   44 ++
 arch/x86/cpu/quark/mrc_util.c | 1475 +++++++++++++++++++++++++++++++++++++++++
 arch/x86/cpu/quark/mrc_util.h |  153 +++++
 4 files changed, 2068 insertions(+)
 create mode 100644 arch/x86/cpu/quark/hte.c
 create mode 100644 arch/x86/cpu/quark/hte.h
 create mode 100644 arch/x86/cpu/quark/mrc_util.c
 create mode 100644 arch/x86/cpu/quark/mrc_util.h
diff --git a/arch/x86/cpu/quark/hte.c b/arch/x86/cpu/quark/hte.c new file mode 100644 index 0000000..372815d --- /dev/null +++ b/arch/x86/cpu/quark/hte.c @@ -0,0 +1,396 @@ +/* + * Copyright (C) 2013, Intel Corporation + * Copyright (C) 2015, Bin Meng bmeng.cn@gmail.com + * + * Ported from Intel released Quark UEFI BIOS + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei + * + * SPDX-License-Identifier: Intel + */ + +#include <common.h> +#include <asm/arch/mrc.h> +#include <asm/arch/msg_port.h> +#include "mrc_util.h" +#include "hte.h" + +/** + * Enable HTE to detect all possible errors for the given training parameters + * (per-bit or full byte lane). + */ +static void hte_enable_all_errors(void) +{ + msg_port_write(HTE, 0x000200A2, 0xFFFFFFFF); + msg_port_write(HTE, 0x000200A3, 0x000000FF); + msg_port_write(HTE, 0x000200A4, 0x00000000); +} + +/** + * Go and read the HTE register in order to find any error + * + * @return: The errors detected in the HTE status register + */ +static u32 hte_check_errors(void) +{ + return msg_port_read(HTE, 0x000200A7); +} + +/** + * Wait until HTE finishes + */ +static void hte_wait_for_complete(void) +{ + u32 tmp; + + ENTERFN(); + + do {} while ((msg_port_read(HTE, 0x00020012) & BIT30) != 0); + + tmp = msg_port_read(HTE, 0x00020011); + tmp |= BIT9; + tmp &= ~(BIT12 | BIT13); + msg_port_write(HTE, 0x00020011, tmp); + + LEAVEFN(); +} + +/** + * Clear registers related with errors in the HTE + */ +static void hte_clear_error_regs(void) +{ + u32 tmp; + + /* + * Clear all HTE errors and enable error checking + * for burst and chunk. + */ + tmp = msg_port_read(HTE, 0x000200A1); + tmp |= BIT8; + msg_port_write(HTE, 0x000200A1, tmp); +} + +/** + * Execute a basic single-cache-line memory write/read/verify test using simple + * constant pattern, different for READ_TRAIN and WRITE_TRAIN modes. + * + * See hte_basic_write_read() which is the external visible wrapper. + * + * @mrc_params: host structure for all MRC global data + * @addr: memory adress being tested (must hit specific channel/rank) + * @first_run: if set then the HTE registers are configured, otherwise it is + * assumed configuration is done and we just re-run the test + * @mode: READ_TRAIN or WRITE_TRAIN (the difference is in the pattern) + * + * @return: byte lane failure on each bit (for Quark only bit0 and bit1) + */ +static u16 hte_basic_data_cmp(struct mrc_params *mrc_params, u32 addr, + u8 first_run, u8 mode) +{ + u32 pattern; + u32 offset; + + if (first_run) { + msg_port_write(HTE, 0x00020020, 0x01B10021); + msg_port_write(HTE, 0x00020021, 0x06000000); + msg_port_write(HTE, 0x00020022, addr >> 6); + msg_port_write(HTE, 0x00020062, 0x00800015); + msg_port_write(HTE, 0x00020063, 0xAAAAAAAA); + msg_port_write(HTE, 0x00020064, 0xCCCCCCCC); + msg_port_write(HTE, 0x00020065, 0xF0F0F0F0); + msg_port_write(HTE, 0x00020061, 0x00030008); + + if (mode == WRITE_TRAIN) + pattern = 0xC33C0000; + else /* READ_TRAIN */ + pattern = 0xAA5555AA; + + for (offset = 0x80; offset <= 0x8F; offset++) + msg_port_write(HTE, offset, pattern); + } + + msg_port_write(HTE, 0x000200A1, 0xFFFF1000); + msg_port_write(HTE, 0x00020011, 0x00011000); + msg_port_write(HTE, 0x00020011, 0x00011100); + + hte_wait_for_complete(); + + /* + * Return bits 15:8 of HTE_CH0_ERR_XSTAT to check for + * any bytelane errors. + */ + return (hte_check_errors() >> 8) & 0xFF; +} + +/** + * Examine a single-cache-line memory with write/read/verify test using multiple + * data patterns (victim-aggressor algorithm). 
+ * + * See hte_write_stress_bit_lanes() which is the external visible wrapper. + * + * @mrc_params: host structure for all MRC global data + * @addr: memory adress being tested (must hit specific channel/rank) + * @loop_cnt: number of test iterations + * @seed_victim: victim data pattern seed + * @seed_aggressor: aggressor data pattern seed + * @victim_bit: should be 0 as auto-rotate feature is in use + * @first_run: if set then the HTE registers are configured, otherwise it is + * assumed configuration is done and we just re-run the test + * + * @return: byte lane failure on each bit (for Quark only bit0 and bit1) + */ +static u16 hte_rw_data_cmp(struct mrc_params *mrc_params, u32 addr, + u8 loop_cnt, u32 seed_victim, u32 seed_aggressor, + u8 victim_bit, u8 first_run) +{ + u32 offset; + u32 tmp; + + if (first_run) { + msg_port_write(HTE, 0x00020020, 0x00910024); + msg_port_write(HTE, 0x00020023, 0x00810024); + msg_port_write(HTE, 0x00020021, 0x06070000); + msg_port_write(HTE, 0x00020024, 0x06070000); + msg_port_write(HTE, 0x00020022, addr >> 6); + msg_port_write(HTE, 0x00020025, addr >> 6); + msg_port_write(HTE, 0x00020062, 0x0000002A); + msg_port_write(HTE, 0x00020063, seed_victim); + msg_port_write(HTE, 0x00020064, seed_aggressor); + msg_port_write(HTE, 0x00020065, seed_victim); + + /* + * Write the pattern buffers to select the victim bit + * + * Start with bit0 + */ + for (offset = 0x80; offset <= 0x8F; offset++) { + if ((offset % 8) == victim_bit) + msg_port_write(HTE, offset, 0x55555555); + else + msg_port_write(HTE, offset, 0xCCCCCCCC); + } + + msg_port_write(HTE, 0x00020061, 0x00000000); + msg_port_write(HTE, 0x00020066, 0x03440000); + msg_port_write(HTE, 0x000200A1, 0xFFFF1000); + } + + tmp = 0x10001000 | (loop_cnt << 16); + msg_port_write(HTE, 0x00020011, tmp); + msg_port_write(HTE, 0x00020011, tmp | BIT8); + + hte_wait_for_complete(); + + /* + * Return bits 15:8 of HTE_CH0_ERR_XSTAT to check for + * any bytelane errors. + */ + return (hte_check_errors() >> 8) & 0xFF; +} + +/** + * Use HW HTE engine to initialize or test all memory attached to a given DUNIT. + * If flag is MRC_MEM_INIT, this routine writes 0s to all memory locations to + * initialize ECC. If flag is MRC_MEM_TEST, this routine will send an 5AA55AA5 + * pattern to all memory locations on the RankMask and then read it back. + * Then it sends an A55AA55A pattern to all memory locations on the RankMask + * and reads it back. + * + * @mrc_params: host structure for all MRC global data + * @flag: MRC_MEM_INIT or MRC_MEM_TEST + * + * @return: errors register showing HTE failures. Also prints out which rank + * failed the HTE test if failure occurs. For rank detection to work, + * the address map must be left in its default state. If MRC changes + * the address map, this function must be modified to change it back + * to default at the beginning, then restore it at the end. + */ +u32 hte_mem_init(struct mrc_params *mrc_params, u8 flag) +{ + u32 offset; + int test_num; + int i; + + /* + * Clear out the error registers at the start of each memory + * init or memory test run. + */ + hte_clear_error_regs(); + + msg_port_write(HTE, 0x00020062, 0x00000015); + + for (offset = 0x80; offset <= 0x8F; offset++) + msg_port_write(HTE, offset, ((offset & 1) ? 
0xA55A : 0x5AA5)); + + msg_port_write(HTE, 0x00020021, 0x00000000); + msg_port_write(HTE, 0x00020022, (mrc_params->mem_size >> 6) - 1); + msg_port_write(HTE, 0x00020063, 0xAAAAAAAA); + msg_port_write(HTE, 0x00020064, 0xCCCCCCCC); + msg_port_write(HTE, 0x00020065, 0xF0F0F0F0); + msg_port_write(HTE, 0x00020066, 0x03000000); + + switch (flag) { + case MRC_MEM_INIT: + /* + * Only 1 write pass through memory is needed + * to initialize ECC + */ + test_num = 1; + break; + case MRC_MEM_TEST: + /* Write/read then write/read with inverted pattern */ + test_num = 4; + break; + default: + DPF(D_INFO, "Unknown parameter for flag: %d\n", flag); + return 0xFFFFFFFF; + } + + DPF(D_INFO, "hte_mem_init"); + + for (i = 0; i < test_num; i++) { + DPF(D_INFO, "."); + + if (i == 0) { + msg_port_write(HTE, 0x00020061, 0x00000000); + msg_port_write(HTE, 0x00020020, 0x00110010); + } else if (i == 1) { + msg_port_write(HTE, 0x00020061, 0x00000000); + msg_port_write(HTE, 0x00020020, 0x00010010); + } else if (i == 2) { + msg_port_write(HTE, 0x00020061, 0x00010100); + msg_port_write(HTE, 0x00020020, 0x00110010); + } else { + msg_port_write(HTE, 0x00020061, 0x00010100); + msg_port_write(HTE, 0x00020020, 0x00010010); + } + + msg_port_write(HTE, 0x00020011, 0x00111000); + msg_port_write(HTE, 0x00020011, 0x00111100); + + hte_wait_for_complete(); + + /* If this is a READ pass, check for errors at the end */ + if ((i % 2) == 1) { + /* Return immediately if error */ + if (hte_check_errors()) + break; + } + } + + DPF(D_INFO, "done\n"); + + return hte_check_errors(); +} + +/** + * Execute a basic single-cache-line memory write/read/verify test using simple + * constant pattern, different for READ_TRAIN and WRITE_TRAIN modes. + * + * @mrc_params: host structure for all MRC global data + * @addr: memory adress being tested (must hit specific channel/rank) + * @first_run: if set then the HTE registers are configured, otherwise it is + * assumed configuration is done and we just re-run the test + * @mode: READ_TRAIN or WRITE_TRAIN (the difference is in the pattern) + * + * @return: byte lane failure on each bit (for Quark only bit0 and bit1) + */ +u16 hte_basic_write_read(struct mrc_params *mrc_params, u32 addr, + u8 first_run, u8 mode) +{ + u16 errors; + + ENTERFN(); + + /* Enable all error reporting in preparation for HTE test */ + hte_enable_all_errors(); + hte_clear_error_regs(); + + errors = hte_basic_data_cmp(mrc_params, addr, first_run, mode); + + LEAVEFN(); + + return errors; +} + +/** + * Examine a single-cache-line memory with write/read/verify test using multiple + * data patterns (victim-aggressor algorithm). + * + * @mrc_params: host structure for all MRC global data + * @addr: memory adress being tested (must hit specific channel/rank) + * @first_run: if set then the HTE registers are configured, otherwise it is + * assumed configuration is done and we just re-run the test + * + * @return: byte lane failure on each bit (for Quark only bit0 and bit1) + */ +u16 hte_write_stress_bit_lanes(struct mrc_params *mrc_params, + u32 addr, u8 first_run) +{ + u16 errors; + u8 victim_bit = 0; + + ENTERFN(); + + /* Enable all error reporting in preparation for HTE test */ + hte_enable_all_errors(); + hte_clear_error_regs(); + + /* + * Loop through each bit in the bytelane. + * + * Each pass creates a victim bit while keeping all other bits the same + * as aggressors. AVN HTE adds an auto-rotate feature which allows us + * to program the entire victim/aggressor sequence in 1 step. 
+ * + * The victim bit rotates on each pass so no need to have software + * implement a victim bit loop like on VLV. + */ + errors = hte_rw_data_cmp(mrc_params, addr, HTE_LOOP_CNT, + HTE_LFSR_VICTIM_SEED, HTE_LFSR_AGRESSOR_SEED, + victim_bit, first_run); + + LEAVEFN(); + + return errors; +} + +/** + * Execute a basic single-cache-line memory write or read. + * This is just for receive enable / fine write-levelling purpose. + * + * @addr: memory adress being tested (must hit specific channel/rank) + * @first_run: if set then the HTE registers are configured, otherwise it is + * assumed configuration is done and we just re-run the test + * @is_write: when non-zero memory write operation executed, otherwise read + */ +void hte_mem_op(u32 addr, u8 first_run, u8 is_write) +{ + u32 offset; + u32 tmp; + + hte_enable_all_errors(); + hte_clear_error_regs(); + + if (first_run) { + tmp = is_write ? 0x01110021 : 0x01010021; + msg_port_write(HTE, 0x00020020, tmp); + + msg_port_write(HTE, 0x00020021, 0x06000000); + msg_port_write(HTE, 0x00020022, addr >> 6); + msg_port_write(HTE, 0x00020062, 0x00800015); + msg_port_write(HTE, 0x00020063, 0xAAAAAAAA); + msg_port_write(HTE, 0x00020064, 0xCCCCCCCC); + msg_port_write(HTE, 0x00020065, 0xF0F0F0F0); + msg_port_write(HTE, 0x00020061, 0x00030008); + + for (offset = 0x80; offset <= 0x8F; offset++) + msg_port_write(HTE, offset, 0xC33C0000); + } + + msg_port_write(HTE, 0x000200A1, 0xFFFF1000); + msg_port_write(HTE, 0x00020011, 0x00011000); + msg_port_write(HTE, 0x00020011, 0x00011100); + + hte_wait_for_complete(); +} diff --git a/arch/x86/cpu/quark/hte.h b/arch/x86/cpu/quark/hte.h new file mode 100644 index 0000000..6577796 --- /dev/null +++ b/arch/x86/cpu/quark/hte.h @@ -0,0 +1,44 @@ +/* + * Copyright (C) 2013, Intel Corporation + * Copyright (C) 2015, Bin Meng bmeng.cn@gmail.com + * + * Ported from Intel released Quark UEFI BIOS + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei + * + * SPDX-License-Identifier: Intel + */ + +#ifndef _HTE_H_ +#define _HTE_H_ + +enum { + MRC_MEM_INIT, + MRC_MEM_TEST +}; + +enum { + READ_TRAIN, + WRITE_TRAIN +}; + +/* + * EXP_LOOP_CNT field of HTE_CMD_CTL + * + * This CANNOT be less than 4! 
+ */ +#define HTE_LOOP_CNT 5 + +/* random seed for victim */ +#define HTE_LFSR_VICTIM_SEED 0xF294BA21 + +/* random seed for aggressor */ +#define HTE_LFSR_AGRESSOR_SEED 0xEBA7492D + +u32 hte_mem_init(struct mrc_params *mrc_params, u8 flag); +u16 hte_basic_write_read(struct mrc_params *mrc_params, u32 addr, + u8 first_run, u8 mode); +u16 hte_write_stress_bit_lanes(struct mrc_params *mrc_params, + u32 addr, u8 first_run); +void hte_mem_op(u32 addr, u8 first_run, u8 is_write); + +#endif /* _HTE_H_ */ diff --git a/arch/x86/cpu/quark/mrc_util.c b/arch/x86/cpu/quark/mrc_util.c new file mode 100644 index 0000000..3a79ae5 --- /dev/null +++ b/arch/x86/cpu/quark/mrc_util.c @@ -0,0 +1,1475 @@ +/* + * Copyright (C) 2013, Intel Corporation + * Copyright (C) 2015, Bin Meng bmeng.cn@gmail.com + * + * Ported from Intel released Quark UEFI BIOS + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei + * + * SPDX-License-Identifier: Intel + */ + +#include <common.h> +#include <asm/arch/device.h> +#include <asm/arch/mrc.h> +#include <asm/arch/msg_port.h> +#include "mrc_util.h" +#include "hte.h" +#include "smc.h" + +static const uint8_t vref_codes[64] = { + /* lowest to highest */ + 0x3F, 0x3E, 0x3D, 0x3C, 0x3B, 0x3A, 0x39, 0x38, + 0x37, 0x36, 0x35, 0x34, 0x33, 0x32, 0x31, 0x30, + 0x2F, 0x2E, 0x2D, 0x2C, 0x2B, 0x2A, 0x29, 0x28, + 0x27, 0x26, 0x25, 0x24, 0x23, 0x22, 0x21, 0x20, + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, + 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, + 0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F +}; + +void mrc_write_mask(u32 unit, u32 addr, u32 data, u32 mask) +{ + msg_port_write(unit, addr, + (msg_port_read(unit, addr) & ~(mask)) | + ((data) & (mask))); +} + +void mrc_alt_write_mask(u32 unit, u32 addr, u32 data, u32 mask) +{ + msg_port_alt_write(unit, addr, + (msg_port_alt_read(unit, addr) & ~(mask)) | + ((data) & (mask))); +} + +void mrc_post_code(uint8_t major, uint8_t minor) +{ + /* send message to UART */ + DPF(D_INFO, "POST: 0x%01x%02x\n", major, minor); + + /* error check */ + if (major == 0xee) + hang(); +} + +/* Delay number of nanoseconds */ +void delay_n(uint32_t ns) +{ + /* 1000 MHz clock has 1ns period --> no conversion required */ + uint64_t final_tsc = rdtsc(); + + final_tsc += ((get_tbclk_mhz() * ns) / 1000); + + while (rdtsc() < final_tsc) + ; +} + +/* Delay number of microseconds */ +void delay_u(uint32_t ms) +{ + /* 64-bit math is not an option, just use loops */ + while (ms--) + delay_n(1000); +} + +/* Select Memory Manager as the source for PRI interface */ +void select_mem_mgr(void) +{ + u32 dco; + + ENTERFN(); + + dco = msg_port_read(MEM_CTLR, DCO); + dco &= ~BIT28; + msg_port_write(MEM_CTLR, DCO, dco); + + LEAVEFN(); +} + +/* Select HTE as the source for PRI interface */ +void select_hte(void) +{ + u32 dco; + + ENTERFN(); + + dco = msg_port_read(MEM_CTLR, DCO); + dco |= BIT28; + msg_port_write(MEM_CTLR, DCO, dco); + + LEAVEFN(); +} + +/* + * Send DRAM command + * data should be formated using DCMD_Xxxx macro or emrsXCommand structure + */ +void dram_init_command(uint32_t data) +{ + pci_write_config_dword(QUARK_HOST_BRIDGE, MSG_DATA_REG, data); + pci_write_config_dword(QUARK_HOST_BRIDGE, MSG_CTRL_EXT_REG, 0); + msg_port_setup(MSG_OP_DRAM_INIT, MEM_CTLR, 0); + + DPF(D_REGWR, "WR32 %03X %08X %08X\n", MEM_CTLR, 0, data); +} + +/* Send DRAM wake command using special MCU side-band WAKE opcode */ +void dram_wake_command(void) +{ + ENTERFN(); + + msg_port_setup(MSG_OP_DRAM_WAKE, MEM_CTLR, 0); + + LEAVEFN(); +} + +void 
training_message(uint8_t channel, uint8_t rank, uint8_t byte_lane) +{ + /* send message to UART */ + DPF(D_INFO, "CH%01X RK%01X BL%01X\n", channel, rank, byte_lane); +} + +/* + * This function will program the RCVEN delays + * + * (currently doesn't comprehend rank) + */ +void set_rcvn(uint8_t channel, uint8_t rank, + uint8_t byte_lane, uint32_t pi_count) +{ + uint32_t reg; + uint32_t msk; + uint32_t temp; + + ENTERFN(); + + DPF(D_TRN, "Rcvn ch%d rnk%d ln%d : pi=%03X\n", + channel, rank, byte_lane, pi_count); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * BL0 -> B01PTRCTL0[11:08] (0x0-0xF) + * BL1 -> B01PTRCTL0[23:20] (0x0-0xF) + */ + reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET); + msk = (byte_lane & BIT0) ? (BIT23 | BIT22 | BIT21 | BIT20) : + (BIT11 | BIT10 | BIT9 | BIT8); + temp = (byte_lane & BIT0) ? ((pi_count / HALF_CLK) << 20) : + ((pi_count / HALF_CLK) << 8); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* Adjust PI_COUNT */ + pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * BL0 -> B0DLLPICODER0[29:24] (0x00-0x3F) + * BL1 -> B1DLLPICODER0[29:24] (0x00-0x3F) + */ + reg = (byte_lane & BIT0) ? B1DLLPICODER0 : B0DLLPICODER0; + reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET)); + msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24); + temp = pi_count << 24; + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* + * DEADBAND + * BL0/1 -> B01DBCTL1[08/11] (+1 select) + * BL0/1 -> B01DBCTL1[02/05] (enable) + */ + reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET); + msk = 0x00; + temp = 0x00; + + /* enable */ + msk |= (byte_lane & BIT0) ? BIT5 : BIT2; + if ((pi_count < EARLY_DB) || (pi_count > LATE_DB)) + temp |= msk; + + /* select */ + msk |= (byte_lane & BIT0) ? BIT11 : BIT8; + if (pi_count < EARLY_DB) + temp |= msk; + + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* error check */ + if (pi_count > 0x3F) { + training_message(channel, rank, byte_lane); + mrc_post_code(0xee, 0xe0); + } + + LEAVEFN(); +} + +/* + * This function will return the current RCVEN delay on the given + * channel, rank, byte_lane as an absolute PI count. + * + * (currently doesn't comprehend rank) + */ +uint32_t get_rcvn(uint8_t channel, uint8_t rank, uint8_t byte_lane) +{ + uint32_t reg; + uint32_t temp; + uint32_t pi_count; + + ENTERFN(); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * BL0 -> B01PTRCTL0[11:08] (0x0-0xF) + * BL1 -> B01PTRCTL0[23:20] (0x0-0xF) + */ + reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= (byte_lane & BIT0) ? 20 : 8; + temp &= 0xF; + + /* Adjust PI_COUNT */ + pi_count = temp * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * BL0 -> B0DLLPICODER0[29:24] (0x00-0x3F) + * BL1 -> B1DLLPICODER0[29:24] (0x00-0x3F) + */ + reg = (byte_lane & BIT0) ? B1DLLPICODER0 : B0DLLPICODER0; + reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET)); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= 24; + temp &= 0x3F; + + /* Adjust PI_COUNT */ + pi_count += temp; + + LEAVEFN(); + + return pi_count; +} + +/* + * This function will program the RDQS delays based on an absolute + * amount of PIs. 
+ * + * (currently doesn't comprehend rank) + */ +void set_rdqs(uint8_t channel, uint8_t rank, + uint8_t byte_lane, uint32_t pi_count) +{ + uint32_t reg; + uint32_t msk; + uint32_t temp; + + ENTERFN(); + DPF(D_TRN, "Rdqs ch%d rnk%d ln%d : pi=%03X\n", + channel, rank, byte_lane, pi_count); + + /* + * PI (1/128 MCLK) + * BL0 -> B0RXDQSPICODE[06:00] (0x00-0x47) + * BL1 -> B1RXDQSPICODE[06:00] (0x00-0x47) + */ + reg = (byte_lane & BIT0) ? B1RXDQSPICODE : B0RXDQSPICODE; + reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET)); + msk = (BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0); + temp = pi_count << 0; + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* error check (shouldn't go above 0x3F) */ + if (pi_count > 0x47) { + training_message(channel, rank, byte_lane); + mrc_post_code(0xee, 0xe1); + } + + LEAVEFN(); +} + +/* + * This function will return the current RDQS delay on the given + * channel, rank, byte_lane as an absolute PI count. + * + * (currently doesn't comprehend rank) + */ +uint32_t get_rdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane) +{ + uint32_t reg; + uint32_t temp; + uint32_t pi_count; + + ENTERFN(); + + /* + * PI (1/128 MCLK) + * BL0 -> B0RXDQSPICODE[06:00] (0x00-0x47) + * BL1 -> B1RXDQSPICODE[06:00] (0x00-0x47) + */ + reg = (byte_lane & BIT0) ? B1RXDQSPICODE : B0RXDQSPICODE; + reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET)); + temp = msg_port_alt_read(DDRPHY, reg); + + /* Adjust PI_COUNT */ + pi_count = temp & 0x7F; + + LEAVEFN(); + + return pi_count; +} + +/* + * This function will program the WDQS delays based on an absolute + * amount of PIs. + * + * (currently doesn't comprehend rank) + */ +void set_wdqs(uint8_t channel, uint8_t rank, + uint8_t byte_lane, uint32_t pi_count) +{ + uint32_t reg; + uint32_t msk; + uint32_t temp; + + ENTERFN(); + + DPF(D_TRN, "Wdqs ch%d rnk%d ln%d : pi=%03X\n", + channel, rank, byte_lane, pi_count); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * BL0 -> B01PTRCTL0[07:04] (0x0-0xF) + * BL1 -> B01PTRCTL0[19:16] (0x0-0xF) + */ + reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET); + msk = (byte_lane & BIT0) ? (BIT19 | BIT18 | BIT17 | BIT16) : + (BIT7 | BIT6 | BIT5 | BIT4); + temp = pi_count / HALF_CLK; + temp <<= (byte_lane & BIT0) ? 16 : 4; + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* Adjust PI_COUNT */ + pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * BL0 -> B0DLLPICODER0[21:16] (0x00-0x3F) + * BL1 -> B1DLLPICODER0[21:16] (0x00-0x3F) + */ + reg = (byte_lane & BIT0) ? B1DLLPICODER0 : B0DLLPICODER0; + reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET)); + msk = (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16); + temp = pi_count << 16; + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* + * DEADBAND + * BL0/1 -> B01DBCTL1[07/10] (+1 select) + * BL0/1 -> B01DBCTL1[01/04] (enable) + */ + reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET); + msk = 0x00; + temp = 0x00; + + /* enable */ + msk |= (byte_lane & BIT0) ? BIT4 : BIT1; + if ((pi_count < EARLY_DB) || (pi_count > LATE_DB)) + temp |= msk; + + /* select */ + msk |= (byte_lane & BIT0) ? 
BIT10 : BIT7; + if (pi_count < EARLY_DB) + temp |= msk; + + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* error check */ + if (pi_count > 0x3F) { + training_message(channel, rank, byte_lane); + mrc_post_code(0xee, 0xe2); + } + + LEAVEFN(); +} + +/* + * This function will return the amount of WDQS delay on the given + * channel, rank, byte_lane as an absolute PI count. + * + * (currently doesn't comprehend rank) + */ +uint32_t get_wdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane) +{ + uint32_t reg; + uint32_t temp; + uint32_t pi_count; + + ENTERFN(); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * BL0 -> B01PTRCTL0[07:04] (0x0-0xF) + * BL1 -> B01PTRCTL0[19:16] (0x0-0xF) + */ + reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= (byte_lane & BIT0) ? 16 : 4; + temp &= 0xF; + + /* Adjust PI_COUNT */ + pi_count = (temp * HALF_CLK); + + /* + * PI (1/64 MCLK, 1 PIs) + * BL0 -> B0DLLPICODER0[21:16] (0x00-0x3F) + * BL1 -> B1DLLPICODER0[21:16] (0x00-0x3F) + */ + reg = (byte_lane & BIT0) ? B1DLLPICODER0 : B0DLLPICODER0; + reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET)); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= 16; + temp &= 0x3F; + + /* Adjust PI_COUNT */ + pi_count += temp; + + LEAVEFN(); + + return pi_count; +} + +/* + * This function will program the WDQ delays based on an absolute + * number of PIs. + * + * (currently doesn't comprehend rank) + */ +void set_wdq(uint8_t channel, uint8_t rank, + uint8_t byte_lane, uint32_t pi_count) +{ + uint32_t reg; + uint32_t msk; + uint32_t temp; + + ENTERFN(); + + DPF(D_TRN, "Wdq ch%d rnk%d ln%d : pi=%03X\n", + channel, rank, byte_lane, pi_count); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * BL0 -> B01PTRCTL0[03:00] (0x0-0xF) + * BL1 -> B01PTRCTL0[15:12] (0x0-0xF) + */ + reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET); + msk = (byte_lane & BIT0) ? (BIT15 | BIT14 | BIT13 | BIT12) : + (BIT3 | BIT2 | BIT1 | BIT0); + temp = pi_count / HALF_CLK; + temp <<= (byte_lane & BIT0) ? 12 : 0; + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* Adjust PI_COUNT */ + pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * BL0 -> B0DLLPICODER0[13:08] (0x00-0x3F) + * BL1 -> B1DLLPICODER0[13:08] (0x00-0x3F) + */ + reg = (byte_lane & BIT0) ? B1DLLPICODER0 : B0DLLPICODER0; + reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET)); + msk = (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8); + temp = pi_count << 8; + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* + * DEADBAND + * BL0/1 -> B01DBCTL1[06/09] (+1 select) + * BL0/1 -> B01DBCTL1[00/03] (enable) + */ + reg = B01DBCTL1 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET); + msk = 0x00; + temp = 0x00; + + /* enable */ + msk |= (byte_lane & BIT0) ? BIT3 : BIT0; + if ((pi_count < EARLY_DB) || (pi_count > LATE_DB)) + temp |= msk; + + /* select */ + msk |= (byte_lane & BIT0) ? BIT9 : BIT6; + if (pi_count < EARLY_DB) + temp |= msk; + + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* error check */ + if (pi_count > 0x3F) { + training_message(channel, rank, byte_lane); + mrc_post_code(0xee, 0xe3); + } + + LEAVEFN(); +} + +/* + * This function will return the amount of WDQ delay on the given + * channel, rank, byte_lane as an absolute PI count. 
+ * + * (currently doesn't comprehend rank) + */ +uint32_t get_wdq(uint8_t channel, uint8_t rank, uint8_t byte_lane) +{ + uint32_t reg; + uint32_t temp; + uint32_t pi_count; + + ENTERFN(); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * BL0 -> B01PTRCTL0[03:00] (0x0-0xF) + * BL1 -> B01PTRCTL0[15:12] (0x0-0xF) + */ + reg = B01PTRCTL0 + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= (byte_lane & BIT0) ? (12) : (0); + temp &= 0xF; + + /* Adjust PI_COUNT */ + pi_count = temp * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * BL0 -> B0DLLPICODER0[13:08] (0x00-0x3F) + * BL1 -> B1DLLPICODER0[13:08] (0x00-0x3F) + */ + reg = (byte_lane & BIT0) ? B1DLLPICODER0 : B0DLLPICODER0; + reg += (((byte_lane >> 1) * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET)); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= 8; + temp &= 0x3F; + + /* Adjust PI_COUNT */ + pi_count += temp; + + LEAVEFN(); + + return pi_count; +} + +/* + * This function will program the WCMD delays based on an absolute + * number of PIs. + */ +void set_wcmd(uint8_t channel, uint32_t pi_count) +{ + uint32_t reg; + uint32_t msk; + uint32_t temp; + + ENTERFN(); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * CMDPTRREG[11:08] (0x0-0xF) + */ + reg = CMDPTRREG + (channel * DDRIOCCC_CH_OFFSET); + msk = (BIT11 | BIT10 | BIT9 | BIT8); + temp = pi_count / HALF_CLK; + temp <<= 8; + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* Adjust PI_COUNT */ + pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * CMDDLLPICODER0[29:24] -> CMDSLICE R3 (unused) + * CMDDLLPICODER0[21:16] -> CMDSLICE L3 (unused) + * CMDDLLPICODER0[13:08] -> CMDSLICE R2 (unused) + * CMDDLLPICODER0[05:00] -> CMDSLICE L2 (unused) + * CMDDLLPICODER1[29:24] -> CMDSLICE R1 (unused) + * CMDDLLPICODER1[21:16] -> CMDSLICE L1 (0x00-0x3F) + * CMDDLLPICODER1[13:08] -> CMDSLICE R0 (unused) + * CMDDLLPICODER1[05:00] -> CMDSLICE L0 (unused) + */ + reg = CMDDLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET); + + msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24 | + BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16 | + BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | + BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0); + + temp = (pi_count << 24) | (pi_count << 16) | + (pi_count << 8) | (pi_count << 0); + + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + reg = CMDDLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET); /* PO */ + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* + * DEADBAND + * CMDCFGREG0[17] (+1 select) + * CMDCFGREG0[16] (enable) + */ + reg = CMDCFGREG0 + (channel * DDRIOCCC_CH_OFFSET); + msk = 0x00; + temp = 0x00; + + /* enable */ + msk |= BIT16; + if ((pi_count < EARLY_DB) || (pi_count > LATE_DB)) + temp |= msk; + + /* select */ + msk |= BIT17; + if (pi_count < EARLY_DB) + temp |= msk; + + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* error check */ + if (pi_count > 0x3F) + mrc_post_code(0xee, 0xe4); + + LEAVEFN(); +} + +/* + * This function will return the amount of WCMD delay on the given + * channel as an absolute PI count. 
+ */ +uint32_t get_wcmd(uint8_t channel) +{ + uint32_t reg; + uint32_t temp; + uint32_t pi_count; + + ENTERFN(); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * CMDPTRREG[11:08] (0x0-0xF) + */ + reg = CMDPTRREG + (channel * DDRIOCCC_CH_OFFSET); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= 8; + temp &= 0xF; + + /* Adjust PI_COUNT */ + pi_count = temp * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * CMDDLLPICODER0[29:24] -> CMDSLICE R3 (unused) + * CMDDLLPICODER0[21:16] -> CMDSLICE L3 (unused) + * CMDDLLPICODER0[13:08] -> CMDSLICE R2 (unused) + * CMDDLLPICODER0[05:00] -> CMDSLICE L2 (unused) + * CMDDLLPICODER1[29:24] -> CMDSLICE R1 (unused) + * CMDDLLPICODER1[21:16] -> CMDSLICE L1 (0x00-0x3F) + * CMDDLLPICODER1[13:08] -> CMDSLICE R0 (unused) + * CMDDLLPICODER1[05:00] -> CMDSLICE L0 (unused) + */ + reg = CMDDLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= 16; + temp &= 0x3F; + + /* Adjust PI_COUNT */ + pi_count += temp; + + LEAVEFN(); + + return pi_count; +} + +/* + * This function will program the WCLK delays based on an absolute + * number of PIs. + */ +void set_wclk(uint8_t channel, uint8_t rank, uint32_t pi_count) +{ + uint32_t reg; + uint32_t msk; + uint32_t temp; + + ENTERFN(); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * CCPTRREG[15:12] -> CLK1 (0x0-0xF) + * CCPTRREG[11:08] -> CLK0 (0x0-0xF) + */ + reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET); + msk = (BIT15 | BIT14 | BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8); + temp = ((pi_count / HALF_CLK) << 12) | ((pi_count / HALF_CLK) << 8); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* Adjust PI_COUNT */ + pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * ECCB1DLLPICODER0[13:08] -> CLK0 (0x00-0x3F) + * ECCB1DLLPICODER0[21:16] -> CLK1 (0x00-0x3F) + */ + reg = rank ? ECCB1DLLPICODER0 : ECCB1DLLPICODER0; + reg += (channel * DDRIOCCC_CH_OFFSET); + msk = (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | BIT16 | + BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8); + temp = (pi_count << 16) | (pi_count << 8); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + reg = rank ? ECCB1DLLPICODER1 : ECCB1DLLPICODER1; + reg += (channel * DDRIOCCC_CH_OFFSET); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + reg = rank ? ECCB1DLLPICODER2 : ECCB1DLLPICODER2; + reg += (channel * DDRIOCCC_CH_OFFSET); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + reg = rank ? ECCB1DLLPICODER3 : ECCB1DLLPICODER3; + reg += (channel * DDRIOCCC_CH_OFFSET); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* + * DEADBAND + * CCCFGREG1[11:08] (+1 select) + * CCCFGREG1[03:00] (enable) + */ + reg = CCCFGREG1 + (channel * DDRIOCCC_CH_OFFSET); + msk = 0x00; + temp = 0x00; + + /* enable */ + msk |= (BIT3 | BIT2 | BIT1 | BIT0); + if ((pi_count < EARLY_DB) || (pi_count > LATE_DB)) + temp |= msk; + + /* select */ + msk |= (BIT11 | BIT10 | BIT9 | BIT8); + if (pi_count < EARLY_DB) + temp |= msk; + + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* error check */ + if (pi_count > 0x3F) + mrc_post_code(0xee, 0xe5); + + LEAVEFN(); +} + +/* + * This function will return the amout of WCLK delay on the given + * channel, rank as an absolute PI count. + */ +uint32_t get_wclk(uint8_t channel, uint8_t rank) +{ + uint32_t reg; + uint32_t temp; + uint32_t pi_count; + + ENTERFN(); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * CCPTRREG[15:12] -> CLK1 (0x0-0xF) + * CCPTRREG[11:08] -> CLK0 (0x0-0xF) + */ + reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= rank ? 
12 : 8; + temp &= 0xF; + + /* Adjust PI_COUNT */ + pi_count = temp * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * ECCB1DLLPICODER0[13:08] -> CLK0 (0x00-0x3F) + * ECCB1DLLPICODER0[21:16] -> CLK1 (0x00-0x3F) + */ + reg = rank ? ECCB1DLLPICODER0 : ECCB1DLLPICODER0; + reg += (channel * DDRIOCCC_CH_OFFSET); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= rank ? 16 : 8; + temp &= 0x3F; + + pi_count += temp; + + LEAVEFN(); + + return pi_count; +} + +/* + * This function will program the WCTL delays based on an absolute + * number of PIs. + * + * (currently doesn't comprehend rank) + */ +void set_wctl(uint8_t channel, uint8_t rank, uint32_t pi_count) +{ + uint32_t reg; + uint32_t msk; + uint32_t temp; + + ENTERFN(); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * CCPTRREG[31:28] (0x0-0xF) + * CCPTRREG[27:24] (0x0-0xF) + */ + reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET); + msk = (BIT31 | BIT30 | BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24); + temp = ((pi_count / HALF_CLK) << 28) | ((pi_count / HALF_CLK) << 24); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* Adjust PI_COUNT */ + pi_count -= ((pi_count / HALF_CLK) & 0xF) * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * ECCB1DLLPICODER?[29:24] (0x00-0x3F) + * ECCB1DLLPICODER?[29:24] (0x00-0x3F) + */ + reg = ECCB1DLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET); + msk = (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | BIT24); + temp = (pi_count << 24); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + reg = ECCB1DLLPICODER1 + (channel * DDRIOCCC_CH_OFFSET); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + reg = ECCB1DLLPICODER2 + (channel * DDRIOCCC_CH_OFFSET); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + reg = ECCB1DLLPICODER3 + (channel * DDRIOCCC_CH_OFFSET); + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* + * DEADBAND + * CCCFGREG1[13:12] (+1 select) + * CCCFGREG1[05:04] (enable) + */ + reg = CCCFGREG1 + (channel * DDRIOCCC_CH_OFFSET); + msk = 0x00; + temp = 0x00; + + /* enable */ + msk |= (BIT5 | BIT4); + if ((pi_count < EARLY_DB) || (pi_count > LATE_DB)) + temp |= msk; + + /* select */ + msk |= (BIT13 | BIT12); + if (pi_count < EARLY_DB) + temp |= msk; + + mrc_alt_write_mask(DDRPHY, reg, temp, msk); + + /* error check */ + if (pi_count > 0x3F) + mrc_post_code(0xee, 0xe6); + + LEAVEFN(); +} + +/* + * This function will return the amount of WCTL delay on the given + * channel, rank as an absolute PI count. + * + * (currently doesn't comprehend rank) + */ +uint32_t get_wctl(uint8_t channel, uint8_t rank) +{ + uint32_t reg; + uint32_t temp; + uint32_t pi_count; + + ENTERFN(); + + /* + * RDPTR (1/2 MCLK, 64 PIs) + * CCPTRREG[31:28] (0x0-0xF) + * CCPTRREG[27:24] (0x0-0xF) + */ + reg = CCPTRREG + (channel * DDRIOCCC_CH_OFFSET); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= 24; + temp &= 0xF; + + /* Adjust PI_COUNT */ + pi_count = temp * HALF_CLK; + + /* + * PI (1/64 MCLK, 1 PIs) + * ECCB1DLLPICODER?[29:24] (0x00-0x3F) + * ECCB1DLLPICODER?[29:24] (0x00-0x3F) + */ + reg = ECCB1DLLPICODER0 + (channel * DDRIOCCC_CH_OFFSET); + temp = msg_port_alt_read(DDRPHY, reg); + temp >>= 24; + temp &= 0x3F; + + /* Adjust PI_COUNT */ + pi_count += temp; + + LEAVEFN(); + + return pi_count; +} + +/* + * This function will program the internal Vref setting in a given + * byte lane in a given channel. + */ +void set_vref(uint8_t channel, uint8_t byte_lane, uint32_t setting) +{ + uint32_t reg = (byte_lane & 0x1) ? 
(B1VREFCTL) : (B0VREFCTL); + + ENTERFN(); + + DPF(D_TRN, "Vref ch%d ln%d : val=%03X\n", + channel, byte_lane, setting); + + mrc_alt_write_mask(DDRPHY, (reg + (channel * DDRIODQ_CH_OFFSET) + + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET)), + (vref_codes[setting] << 2), + (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2)); + + /* + * need to wait ~300ns for Vref to settle + * (check that this is necessary) + */ + delay_n(300); + + /* ??? may need to clear pointers ??? */ + + LEAVEFN(); +} + +/* + * This function will return the internal Vref setting for the given + * channel, byte_lane. + */ +uint32_t get_vref(uint8_t channel, uint8_t byte_lane) +{ + uint8_t j; + uint32_t ret_val = sizeof(vref_codes) / 2; + uint32_t reg = (byte_lane & 0x1) ? (B1VREFCTL) : (B0VREFCTL); + uint32_t temp; + + ENTERFN(); + + temp = msg_port_alt_read(DDRPHY, (reg + (channel * DDRIODQ_CH_OFFSET) + + ((byte_lane >> 1) * DDRIODQ_BL_OFFSET))); + temp >>= 2; + temp &= 0x3F; + + for (j = 0; j < sizeof(vref_codes); j++) { + if (vref_codes[j] == temp) { + ret_val = j; + break; + } + } + + LEAVEFN(); + + return ret_val; +} + +/* + * This function will return a 32-bit address in the desired + * channel and rank. + */ +uint32_t get_addr(uint8_t channel, uint8_t rank) +{ + uint32_t offset = 0x02000000; /* 32MB */ + + /* Begin product specific code */ + if (channel > 0) { + DPF(D_ERROR, "ILLEGAL CHANNEL\n"); + DEAD_LOOP(); + } + + if (rank > 1) { + DPF(D_ERROR, "ILLEGAL RANK\n"); + DEAD_LOOP(); + } + + /* use 256MB lowest density as per DRP == 0x0003 */ + offset += rank * (256 * 1024 * 1024); + + return offset; +} + +/* + * This function will sample the DQTRAINSTS registers in the given + * channel/rank SAMPLE_SIZE times looking for a valid '0' or '1'. + * + * It will return an encoded 32-bit data in which each bit corresponds to + * the sampled value on the byte lane. + */ +uint32_t sample_dqs(struct mrc_params *mrc_params, uint8_t channel, + uint8_t rank, bool rcvn) +{ + uint8_t j; /* just a counter */ + uint8_t bl; /* which BL in the module (always 2 per module) */ + uint8_t bl_grp; /* which BL module */ + /* byte lane divisor */ + uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1; + uint32_t msk[2]; /* BLx in module */ + /* DQTRAINSTS register contents for each sample */ + uint32_t sampled_val[SAMPLE_SIZE]; + uint32_t num_0s; /* tracks the number of '0' samples */ + uint32_t num_1s; /* tracks the number of '1' samples */ + uint32_t ret_val = 0x00; /* assume all '0' samples */ + uint32_t address = get_addr(channel, rank); + + /* initialise msk[] */ + msk[0] = rcvn ? BIT1 : BIT9; /* BL0 */ + msk[1] = rcvn ? BIT0 : BIT8; /* BL1 */ + + /* cycle through each byte lane group */ + for (bl_grp = 0; bl_grp < (NUM_BYTE_LANES / bl_divisor) / 2; bl_grp++) { + /* take SAMPLE_SIZE samples */ + for (j = 0; j < SAMPLE_SIZE; j++) { + hte_mem_op(address, mrc_params->first_run, + rcvn ? 
0 : 1); + mrc_params->first_run = 0; + + /* + * record the contents of the proper + * DQTRAINSTS register + */ + sampled_val[j] = msg_port_alt_read(DDRPHY, + (DQTRAINSTS + + (bl_grp * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET))); + } + + /* + * look for a majority value (SAMPLE_SIZE / 2) + 1 + * on the byte lane and set that value in the corresponding + * ret_val bit + */ + for (bl = 0; bl < 2; bl++) { + num_0s = 0x00; /* reset '0' tracker for byte lane */ + num_1s = 0x00; /* reset '1' tracker for byte lane */ + for (j = 0; j < SAMPLE_SIZE; j++) { + if (sampled_val[j] & msk[bl]) + num_1s++; + else + num_0s++; + } + if (num_1s > num_0s) + ret_val |= (1 << (bl + (bl_grp * 2))); + } + } + + /* + * "ret_val.0" contains the status of BL0 + * "ret_val.1" contains the status of BL1 + * "ret_val.2" contains the status of BL2 + * etc. + */ + return ret_val; +} + +/* This function will find the rising edge transition on RCVN or WDQS */ +void find_rising_edge(struct mrc_params *mrc_params, uint32_t delay[], + uint8_t channel, uint8_t rank, bool rcvn) +{ + bool all_edges_found; /* determines stop condition */ + bool direction[NUM_BYTE_LANES]; /* direction indicator */ + uint8_t sample; /* sample counter */ + uint8_t bl; /* byte lane counter */ + /* byte lane divisor */ + uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1; + uint32_t sample_result[SAMPLE_CNT]; /* results of sample_dqs() */ + uint32_t temp; + uint32_t transition_pattern; + + ENTERFN(); + + /* select hte and request initial configuration */ + select_hte(); + mrc_params->first_run = 1; + + /* Take 3 sample points (T1,T2,T3) to obtain a transition pattern */ + for (sample = 0; sample < SAMPLE_CNT; sample++) { + /* program the desired delays for sample */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + /* increase sample delay by 26 PI (0.2 CLK) */ + if (rcvn) { + set_rcvn(channel, rank, bl, + delay[bl] + (sample * SAMPLE_DLY)); + } else { + set_wdqs(channel, rank, bl, + delay[bl] + (sample * SAMPLE_DLY)); + } + } + + /* take samples (Tsample_i) */ + sample_result[sample] = sample_dqs(mrc_params, + channel, rank, rcvn); + + DPF(D_TRN, + "Find rising edge %s ch%d rnk%d: #%d dly=%d dqs=%02X\n", + (rcvn ? "RCVN" : "WDQS"), channel, rank, sample, + sample * SAMPLE_DLY, sample_result[sample]); + } + + /* + * This pattern will help determine where we landed and ultimately + * how to place RCVEN/WDQS. 
+ */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + /* build transition_pattern (MSB is 1st sample) */ + transition_pattern = 0; + for (sample = 0; sample < SAMPLE_CNT; sample++) { + transition_pattern |= + ((sample_result[sample] & (1 << bl)) >> bl) << + (SAMPLE_CNT - 1 - sample); + } + + DPF(D_TRN, "=== transition pattern %d\n", transition_pattern); + + /* + * set up to look for rising edge based on + * transition_pattern + */ + switch (transition_pattern) { + case 0: /* sampled 0->0->0 */ + /* move forward from T3 looking for 0->1 */ + delay[bl] += 2 * SAMPLE_DLY; + direction[bl] = FORWARD; + break; + case 1: /* sampled 0->0->1 */ + case 5: /* sampled 1->0->1 (bad duty cycle) *HSD#237503* */ + /* move forward from T2 looking for 0->1 */ + delay[bl] += 1 * SAMPLE_DLY; + direction[bl] = FORWARD; + break; + case 2: /* sampled 0->1->0 (bad duty cycle) *HSD#237503* */ + case 3: /* sampled 0->1->1 */ + /* move forward from T1 looking for 0->1 */ + delay[bl] += 0 * SAMPLE_DLY; + direction[bl] = FORWARD; + break; + case 4: /* sampled 1->0->0 (assumes BL8, HSD#234975) */ + /* move forward from T3 looking for 0->1 */ + delay[bl] += 2 * SAMPLE_DLY; + direction[bl] = FORWARD; + break; + case 6: /* sampled 1->1->0 */ + case 7: /* sampled 1->1->1 */ + /* move backward from T1 looking for 1->0 */ + delay[bl] += 0 * SAMPLE_DLY; + direction[bl] = BACKWARD; + break; + default: + mrc_post_code(0xee, 0xee); + break; + } + + /* program delays */ + if (rcvn) + set_rcvn(channel, rank, bl, delay[bl]); + else + set_wdqs(channel, rank, bl, delay[bl]); + } + + /* + * Based on the observed transition pattern on the byte lane, + * begin looking for a rising edge with single PI granularity. + */ + do { + all_edges_found = true; /* assume all byte lanes passed */ + /* take a sample */ + temp = sample_dqs(mrc_params, channel, rank, rcvn); + /* check each byte lane for proper edge */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + if (temp & (1 << bl)) { + /* sampled "1" */ + if (direction[bl] == BACKWARD) { + /* + * keep looking for edge + * on this byte lane + */ + all_edges_found = false; + delay[bl] -= 1; + if (rcvn) { + set_rcvn(channel, rank, + bl, delay[bl]); + } else { + set_wdqs(channel, rank, + bl, delay[bl]); + } + } + } else { + /* sampled "0" */ + if (direction[bl] == FORWARD) { + /* + * keep looking for edge + * on this byte lane + */ + all_edges_found = false; + delay[bl] += 1; + if (rcvn) { + set_rcvn(channel, rank, + bl, delay[bl]); + } else { + set_wdqs(channel, rank, + bl, delay[bl]); + } + } + } + } + } while (!all_edges_found); + + /* restore DDR idle state */ + dram_init_command(DCMD_PREA(rank)); + + DPF(D_TRN, "Delay %03X %03X %03X %03X\n", + delay[0], delay[1], delay[2], delay[3]); + + LEAVEFN(); +} + +/* + * This function will return a 32-bit mask that will be used to + * check for byte lane failures. + */ +uint32_t byte_lane_mask(struct mrc_params *mrc_params) +{ + uint32_t j; + uint32_t ret_val = 0x00; + + /* + * set ret_val based on NUM_BYTE_LANES such that you will check + * only BL0 in result + * + * (each bit in result represents a byte lane) + */ + for (j = 0; j < MAX_BYTE_LANES; j += NUM_BYTE_LANES) + ret_val |= (1 << ((j / NUM_BYTE_LANES) * NUM_BYTE_LANES)); + + /* + * HSD#235037 + * need to adjust the mask for 16-bit mode + */ + if (mrc_params->channel_width == X16) + ret_val |= (ret_val << 2); + + return ret_val; +} + +/* + * Check memory executing simple write/read/verify at the specified address. 
+ * + * Bits in the result indicate failure on specific byte lane. + */ +uint32_t check_rw_coarse(struct mrc_params *mrc_params, uint32_t address) +{ + uint32_t result = 0; + uint8_t first_run = 0; + + if (mrc_params->hte_setup) { + mrc_params->hte_setup = 0; + first_run = 1; + select_hte(); + } + + result = hte_basic_write_read(mrc_params, address, first_run, + WRITE_TRAIN); + + DPF(D_TRN, "check_rw_coarse result is %x\n", result); + + return result; +} + +/* + * Check memory executing write/read/verify of many data patterns + * at the specified address. Bits in the result indicate failure + * on specific byte lane. + */ +uint32_t check_bls_ex(struct mrc_params *mrc_params, uint32_t address) +{ + uint32_t result; + uint8_t first_run = 0; + + if (mrc_params->hte_setup) { + mrc_params->hte_setup = 0; + first_run = 1; + select_hte(); + } + + result = hte_write_stress_bit_lanes(mrc_params, address, first_run); + + DPF(D_TRN, "check_bls_ex result is %x\n", result); + + return result; +} + +/* + * 32-bit LFSR with characteristic polynomial: X^32 + X^22 +X^2 + X^1 + * + * The function takes pointer to previous 32 bit value and + * modifies it to next value. + */ +void lfsr32(uint32_t *lfsr_ptr) +{ + uint32_t bit; + uint32_t lfsr; + int i; + + lfsr = *lfsr_ptr; + + for (i = 0; i < 32; i++) { + bit = 1 ^ (lfsr & BIT0); + bit = bit ^ ((lfsr & BIT1) >> 1); + bit = bit ^ ((lfsr & BIT2) >> 2); + bit = bit ^ ((lfsr & BIT22) >> 22); + + lfsr = ((lfsr >> 1) | (bit << 31)); + } + + *lfsr_ptr = lfsr; +} + +/* Clear the pointers in a given byte lane in a given channel */ +void clear_pointers(void) +{ + uint8_t channel; + uint8_t bl; + + ENTERFN(); + + for (channel = 0; channel < NUM_CHANNELS; channel++) { + for (bl = 0; bl < NUM_BYTE_LANES; bl++) { + mrc_alt_write_mask(DDRPHY, + (B01PTRCTL1 + + (channel * DDRIODQ_CH_OFFSET) + + ((bl >> 1) * DDRIODQ_BL_OFFSET)), + ~BIT8, BIT8); + + mrc_alt_write_mask(DDRPHY, + (B01PTRCTL1 + + (channel * DDRIODQ_CH_OFFSET) + + ((bl >> 1) * DDRIODQ_BL_OFFSET)), + BIT8, BIT8); + } + } + + LEAVEFN(); +} + +static void print_timings_internal(uint8_t algo, uint8_t channel, uint8_t rank, + uint8_t bl_divisor) +{ + uint8_t bl; + + switch (algo) { + case RCVN: + DPF(D_INFO, "\nRCVN[%02d:%02d]", channel, rank); + break; + case WDQS: + DPF(D_INFO, "\nWDQS[%02d:%02d]", channel, rank); + break; + case WDQX: + DPF(D_INFO, "\nWDQx[%02d:%02d]", channel, rank); + break; + case RDQS: + DPF(D_INFO, "\nRDQS[%02d:%02d]", channel, rank); + break; + case VREF: + DPF(D_INFO, "\nVREF[%02d:%02d]", channel, rank); + break; + case WCMD: + DPF(D_INFO, "\nWCMD[%02d:%02d]", channel, rank); + break; + case WCTL: + DPF(D_INFO, "\nWCTL[%02d:%02d]", channel, rank); + break; + case WCLK: + DPF(D_INFO, "\nWCLK[%02d:%02d]", channel, rank); + break; + default: + break; + } + + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + switch (algo) { + case RCVN: + DPF(D_INFO, " %03d", get_rcvn(channel, rank, bl)); + break; + case WDQS: + DPF(D_INFO, " %03d", get_wdqs(channel, rank, bl)); + break; + case WDQX: + DPF(D_INFO, " %03d", get_wdq(channel, rank, bl)); + break; + case RDQS: + DPF(D_INFO, " %03d", get_rdqs(channel, rank, bl)); + break; + case VREF: + DPF(D_INFO, " %03d", get_vref(channel, bl)); + break; + case WCMD: + DPF(D_INFO, " %03d", get_wcmd(channel)); + break; + case WCTL: + DPF(D_INFO, " %03d", get_wctl(channel, rank)); + break; + case WCLK: + DPF(D_INFO, " %03d", get_wclk(channel, rank)); + break; + default: + break; + } + } +} + +void print_timings(struct mrc_params *mrc_params) +{ + uint8_t 
algo; + uint8_t channel; + uint8_t rank; + uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1; + + DPF(D_INFO, "\n---------------------------"); + DPF(D_INFO, "\nALGO[CH:RK] BL0 BL1 BL2 BL3"); + DPF(D_INFO, "\n==========================="); + + for (algo = 0; algo < MAX_ALGOS; algo++) { + for (channel = 0; channel < NUM_CHANNELS; channel++) { + if (mrc_params->channel_enables & (1 << channel)) { + for (rank = 0; rank < NUM_RANKS; rank++) { + if (mrc_params->rank_enables & + (1 << rank)) { + print_timings_internal(algo, + channel, rank, + bl_divisor); + } + } + } + } + } + + DPF(D_INFO, "\n---------------------------"); + DPF(D_INFO, "\n"); +} diff --git a/arch/x86/cpu/quark/mrc_util.h b/arch/x86/cpu/quark/mrc_util.h new file mode 100644 index 0000000..f0ddbce --- /dev/null +++ b/arch/x86/cpu/quark/mrc_util.h @@ -0,0 +1,153 @@ +/* + * Copyright (C) 2013, Intel Corporation + * Copyright (C) 2015, Bin Meng bmeng.cn@gmail.com + * + * Ported from Intel released Quark UEFI BIOS + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei + * + * SPDX-License-Identifier: Intel + */ + +#ifndef _MRC_UTIL_H_ +#define _MRC_UTIL_H_ + +/* Turn on this macro to enable MRC debugging output */ +#undef MRC_DEBUG + +/* MRC Debug Support */ +#define DPF debug_cond + +/* debug print type */ + +#ifdef MRC_DEBUG +#define D_ERROR 0x0001 +#define D_INFO 0x0002 +#define D_REGRD 0x0004 +#define D_REGWR 0x0008 +#define D_FCALL 0x0010 +#define D_TRN 0x0020 +#define D_TIME 0x0040 +#else +#define D_ERROR 0 +#define D_INFO 0 +#define D_REGRD 0 +#define D_REGWR 0 +#define D_FCALL 0 +#define D_TRN 0 +#define D_TIME 0 +#endif + +#define ENTERFN(...) debug_cond(D_FCALL, "<%s>\n", __func__) +#define LEAVEFN(...) debug_cond(D_FCALL, "</%s>\n", __func__) +#define REPORTFN(...) debug_cond(D_FCALL, "<%s/>\n", __func__) + +/* Generic Register Bits */ +#define BIT0 0x00000001 +#define BIT1 0x00000002 +#define BIT2 0x00000004 +#define BIT3 0x00000008 +#define BIT4 0x00000010 +#define BIT5 0x00000020 +#define BIT6 0x00000040 +#define BIT7 0x00000080 +#define BIT8 0x00000100 +#define BIT9 0x00000200 +#define BIT10 0x00000400 +#define BIT11 0x00000800 +#define BIT12 0x00001000 +#define BIT13 0x00002000 +#define BIT14 0x00004000 +#define BIT15 0x00008000 +#define BIT16 0x00010000 +#define BIT17 0x00020000 +#define BIT18 0x00040000 +#define BIT19 0x00080000 +#define BIT20 0x00100000 +#define BIT21 0x00200000 +#define BIT22 0x00400000 +#define BIT23 0x00800000 +#define BIT24 0x01000000 +#define BIT25 0x02000000 +#define BIT26 0x04000000 +#define BIT27 0x08000000 +#define BIT28 0x10000000 +#define BIT29 0x20000000 +#define BIT30 0x40000000 +#define BIT31 0x80000000 + +/* Message Bus Port */ +#define MEM_CTLR 0x01 +#define HOST_BRIDGE 0x03 +#define MEM_MGR 0x05 +#define HTE 0x11 +#define DDRPHY 0x12 + +/* number of sample points */ +#define SAMPLE_CNT 3 +/* number of PIs to increment per sample */ +#define SAMPLE_DLY 26 + +enum { + /* indicates to decrease delays when looking for edge */ + BACKWARD, + /* indicates to increase delays when looking for edge */ + FORWARD +}; + +enum { + RCVN, + WDQS, + WDQX, + RDQS, + VREF, + WCMD, + WCTL, + WCLK, + MAX_ALGOS, +}; + +void mrc_write_mask(u32 unit, u32 addr, u32 data, u32 mask); +void mrc_alt_write_mask(u32 unit, u32 addr, u32 data, u32 mask); +void mrc_post_code(uint8_t major, uint8_t minor); +void delay_n(uint32_t ns); +void delay_u(uint32_t ms); +void select_mem_mgr(void); +void select_hte(void); +void dram_init_command(uint32_t data); +void dram_wake_command(void); +void 
training_message(uint8_t channel, uint8_t rank, uint8_t byte_lane); + +void set_rcvn(uint8_t channel, uint8_t rank, + uint8_t byte_lane, uint32_t pi_count); +uint32_t get_rcvn(uint8_t channel, uint8_t rank, uint8_t byte_lane); +void set_rdqs(uint8_t channel, uint8_t rank, + uint8_t byte_lane, uint32_t pi_count); +uint32_t get_rdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane); +void set_wdqs(uint8_t channel, uint8_t rank, + uint8_t byte_lane, uint32_t pi_count); +uint32_t get_wdqs(uint8_t channel, uint8_t rank, uint8_t byte_lane); +void set_wdq(uint8_t channel, uint8_t rank, + uint8_t byte_lane, uint32_t pi_count); +uint32_t get_wdq(uint8_t channel, uint8_t rank, uint8_t byte_lane); +void set_wcmd(uint8_t channel, uint32_t pi_count); +uint32_t get_wcmd(uint8_t channel); +void set_wclk(uint8_t channel, uint8_t rank, uint32_t pi_count); +uint32_t get_wclk(uint8_t channel, uint8_t rank); +void set_wctl(uint8_t channel, uint8_t rank, uint32_t pi_count); +uint32_t get_wctl(uint8_t channel, uint8_t rank); +void set_vref(uint8_t channel, uint8_t byte_lane, uint32_t setting); +uint32_t get_vref(uint8_t channel, uint8_t byte_lane); + +uint32_t get_addr(uint8_t channel, uint8_t rank); +uint32_t sample_dqs(struct mrc_params *mrc_params, uint8_t channel, + uint8_t rank, bool rcvn); +void find_rising_edge(struct mrc_params *mrc_params, uint32_t delay[], + uint8_t channel, uint8_t rank, bool rcvn); +uint32_t byte_lane_mask(struct mrc_params *mrc_params); +uint32_t check_rw_coarse(struct mrc_params *mrc_params, uint32_t address); +uint32_t check_bls_ex(struct mrc_params *mrc_params, uint32_t address); +void lfsr32(uint32_t *lfsr_ptr); +void clear_pointers(void); +void print_timings(struct mrc_params *mrc_params); + +#endif /* _MRC_UTIL_H_ */
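A side note for reviewers: lfsr32() above has no hardware dependencies, so its pattern sequence can be sanity-checked on the host. A minimal standalone sketch (the seed and iteration count are arbitrary, for illustration only):

#include <stdio.h>
#include <stdint.h>

/* Same update rule as lfsr32() in mrc_util.c: polynomial
 * X^32 + X^22 + X^2 + X^1, advanced 32 bit-steps per call.
 */
static void lfsr32(uint32_t *lfsr_ptr)
{
	uint32_t bit;
	uint32_t lfsr = *lfsr_ptr;
	int i;

	for (i = 0; i < 32; i++) {
		bit = 1 ^ (lfsr & 1);
		bit ^= (lfsr >> 1) & 1;
		bit ^= (lfsr >> 2) & 1;
		bit ^= (lfsr >> 22) & 1;
		lfsr = (lfsr >> 1) | (bit << 31);
	}

	*lfsr_ptr = lfsr;
}

int main(void)
{
	uint32_t pattern = 0x12345678;	/* arbitrary seed */
	int i;

	/* print the first few values of the pattern stream */
	for (i = 0; i < 4; i++) {
		lfsr32(&pattern);
		printf("%08x\n", pattern);
	}

	return 0;
}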

Hi Bin,
On 5 February 2015 at 08:42, Bin Meng bmeng.cn@gmail.com wrote:
Add various utility codes needed for Quark MRC.
Signed-off-by: Bin Meng bmeng.cn@gmail.com
There are 12 checkpatch warnings in this patch, which are:
warning: arch/x86/cpu/quark/mrc_util.c,1446: Too many leading tabs - consider code refactoring
warning: arch/x86/cpu/quark/mrc_util.c,1450: line over 80 characters
...
Fixing 'Too many leading tabs ...' would be very risky, as I don't have full details on how Intel's MRC code is actually written to drive the hardware, and refactoring it may leave us with non-working MRC code. As for the 'line over 80 characters' warnings, we have to leave those lines as-is for now, since they follow directly from the same deep nesting, sigh.
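To make the risk concrete, the refactor checkpatch is asking for would hoist the innermost per-lane work into static helpers so the nesting collapses, along these lines (a hypothetical, untested sketch with invented names, not code from this patch):

#include <stdint.h>

/* stand-in for the real per-byte-lane register programming */
static void train_one_lane(uint8_t ch, uint8_t rk, uint8_t bl)
{
	(void)ch;
	(void)rk;
	(void)bl;
}

/*
 * The channel/rank enable guards plus the byte lane loop are what
 * push the original code past the leading-tab limit; a helper keeps
 * the depth down without touching the register writes themselves.
 */
static void train_all(uint8_t ch_enables, uint8_t rk_enables)
{
	uint8_t ch, rk, bl;

	for (ch = 0; ch < 2; ch++) {
		if (!(ch_enables & (1 << ch)))
			continue;
		for (rk = 0; rk < 2; rk++) {
			if (!(rk_enables & (1 << rk)))
				continue;
			for (bl = 0; bl < 4; bl++)
				train_one_lane(ch, rk, bl);
		}
	}
}

int main(void)
{
	train_all(0x1, 0x3);	/* e.g. one channel, two ranks */
	return 0;
}

Whether such extracted helpers still behave identically on real hardware is exactly what cannot be verified without a board, which is why it is left for later.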
Changes in v2:
- Fixed issues per review comments. See the following link for details. http://lists.denx.de/pipermail/u-boot/2015-February/204044.html
arch/x86/cpu/quark/hte.c      |  396 +++++
arch/x86/cpu/quark/hte.h      |   44 ++
arch/x86/cpu/quark/mrc_util.c | 1475 +++++++++++++++++++++++++++++++++++++++++
arch/x86/cpu/quark/mrc_util.h |  153 +++++
4 files changed, 2068 insertions(+)
create mode 100644 arch/x86/cpu/quark/hte.c
create mode 100644 arch/x86/cpu/quark/hte.h
create mode 100644 arch/x86/cpu/quark/mrc_util.c
create mode 100644 arch/x86/cpu/quark/mrc_util.h
We should revisit this with a follow-on patch to tidy this up before the release. I don't believe it will create correctness problems to split the code but that needs to be tested. Let's get this in for now as a stake in the ground.
Acked-by: Simon Glass sjg@chromium.org

On 5 February 2015 at 17:51, Simon Glass sjg@chromium.org wrote:
...
Applied to u-boot-x86, thanks!

This code does the actual memory initialization work.
Signed-off-by: Bin Meng bmeng.cn@gmail.com
--- The ugliest code I've ever seen ... There are 252 warnings and 127 checks in this patch, which are:
check: arch/x86/cpu/quark/smc.c,1609: Alignment should match open parenthesis
warning: arch/x86/cpu/quark/smc.c,1610: line over 80 characters
warning: arch/x86/cpu/quark/smc.c,1633: Too many leading tabs - consider code refactoring
...
Fixing 'Too many leading tabs ...' would be very risky, as I don't have full details on how Intel's MRC code is actually written to drive the hardware, and refactoring it may leave us with non-working MRC code. The 'line over 80 characters' warnings have to stay for now because of the same deep nesting. And if I tried to fix the 'Alignment should match open parenthesis' checks, I would end up adding more 'line over 80 characters' warnings, so we have to bear with those as well. Sigh.
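To show the catch-22 concretely (a contrived but compilable sketch; the function below is a stand-in, not from smc.c):

#include <stdint.h>

static void write_reg(uint32_t addr, uint32_t data, uint32_t mask)
{
	/* stand-in for the real message-port accessor */
	(void)addr;
	(void)data;
	(void)mask;
}

static void example(uint32_t ch, uint32_t bl_grp)
{
	/*
	 * At the nesting depth the real code runs at (channel, byte
	 * lane group and rank loops plus enable guards), a
	 * continuation line aligned to the open parenthesis lands
	 * past column 80, while wrapping it earlier keeps the length
	 * legal but trips "Alignment should match open parenthesis".
	 */
	write_reg(0x0a00 + bl_grp * 0x100 + ch * 0x800,
		1 << 22, 1 << 22);
}

int main(void)
{
	example(0, 0);
	return 0;
}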
Changes in v2:
- Write out the values instead of BIT in smc.h
- Remove the ending / in the file header comment block
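To illustrate the first change above: in v2 the register field masks in smc.h are spelled as literal values instead of being built from the BITn helpers, roughly like this (the macro name is invented for the example, not copied from smc.h):

/* v1 style: depends on the generic BITn macros */
#define DRMC_COLDWAKE	BIT16

/* v2 style: the value is written out directly */
#define DRMC_COLDWAKE	0x00010000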
arch/x86/cpu/quark/smc.c | 2764 ++++++++++++++++++++++++++++++++++++++++++++++
arch/x86/cpu/quark/smc.h |  446 ++++++++
2 files changed, 3210 insertions(+)
create mode 100644 arch/x86/cpu/quark/smc.c
create mode 100644 arch/x86/cpu/quark/smc.h
diff --git a/arch/x86/cpu/quark/smc.c b/arch/x86/cpu/quark/smc.c new file mode 100644 index 0000000..e34bec4 --- /dev/null +++ b/arch/x86/cpu/quark/smc.c @@ -0,0 +1,2764 @@ +/* + * Copyright (C) 2013, Intel Corporation + * Copyright (C) 2015, Bin Meng bmeng.cn@gmail.com + * + * Ported from Intel released Quark UEFI BIOS + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei + * + * SPDX-License-Identifier: Intel + */ + +#include <common.h> +#include <pci.h> +#include <asm/arch/device.h> +#include <asm/arch/mrc.h> +#include <asm/arch/msg_port.h> +#include "mrc_util.h" +#include "hte.h" +#include "smc.h" + +/* t_rfc values (in picoseconds) per density */ +static const uint32_t t_rfc[5] = { + 90000, /* 512Mb */ + 110000, /* 1Gb */ + 160000, /* 2Gb */ + 300000, /* 4Gb */ + 350000, /* 8Gb */ +}; + +/* t_ck clock period in picoseconds per speed index 800, 1066, 1333 */ +static const uint32_t t_ck[3] = { + 2500, + 1875, + 1500 +}; + +/* Global variables */ +static const uint16_t ddr_wclk[] = {193, 158}; +static const uint16_t ddr_wctl[] = {1, 217}; +static const uint16_t ddr_wcmd[] = {1, 220}; + +#ifdef BACKUP_RCVN +static const uint16_t ddr_rcvn[] = {129, 498}; +#endif + +#ifdef BACKUP_WDQS +static const uint16_t ddr_wdqs[] = {65, 289}; +#endif + +#ifdef BACKUP_RDQS +static const uint8_t ddr_rdqs[] = {32, 24}; +#endif + +#ifdef BACKUP_WDQ +static const uint16_t ddr_wdq[] = {32, 257}; +#endif + +/* Stop self refresh driven by MCU */ +void clear_self_refresh(struct mrc_params *mrc_params) +{ + ENTERFN(); + + /* clear the PMSTS Channel Self Refresh bits */ + mrc_write_mask(MEM_CTLR, PMSTS, BIT0, BIT0); + + LEAVEFN(); +} + +/* It will initialize timing registers in the MCU (DTR0..DTR4) */ +void prog_ddr_timing_control(struct mrc_params *mrc_params) +{ + uint8_t tcl, wl; + uint8_t trp, trcd, tras, twr, twtr, trrd, trtp, tfaw; + uint32_t tck; + u32 dtr0, dtr1, dtr2, dtr3, dtr4; + u32 tmp1, tmp2; + + ENTERFN(); + + /* mcu_init starts */ + mrc_post_code(0x02, 0x00); + + dtr0 = msg_port_read(MEM_CTLR, DTR0); + dtr1 = msg_port_read(MEM_CTLR, DTR1); + dtr2 = msg_port_read(MEM_CTLR, DTR2); + dtr3 = msg_port_read(MEM_CTLR, DTR3); + dtr4 = msg_port_read(MEM_CTLR, DTR4); + + tck = t_ck[mrc_params->ddr_speed]; /* Clock in picoseconds */ + tcl = mrc_params->params.cl; /* CAS latency in clocks */ + trp = tcl; /* Per CAT MRC */ + trcd = tcl; /* Per CAT MRC */ + tras = MCEIL(mrc_params->params.ras, tck); + + /* Per JEDEC: tWR=15000ps DDR2/3 from 800-1600 */ + twr = MCEIL(15000, tck); + + twtr = MCEIL(mrc_params->params.wtr, tck); + trrd = MCEIL(mrc_params->params.rrd, tck); + trtp = 4; /* Valid for 800 and 1066, use 5 for 1333 */ + tfaw = MCEIL(mrc_params->params.faw, tck); + + wl = 5 + mrc_params->ddr_speed; + + dtr0 &= ~(BIT0 | BIT1); + dtr0 |= mrc_params->ddr_speed; + dtr0 &= ~(BIT12 | BIT13 | BIT14); + tmp1 = tcl - 5; + dtr0 |= ((tcl - 5) << 12); + dtr0 &= ~(BIT4 | BIT5 | BIT6 | BIT7); + dtr0 |= ((trp - 5) << 4); /* 5 bit DRAM Clock */ + dtr0 &= ~(BIT8 | BIT9 | BIT10 | BIT11); + dtr0 |= ((trcd - 5) << 8); /* 5 bit DRAM Clock */ + + dtr1 &= ~(BIT0 | BIT1 | BIT2); + tmp2 = wl - 3; + dtr1 |= (wl - 3); + dtr1 &= ~(BIT8 | BIT9 | BIT10 | BIT11); + dtr1 |= ((wl + 4 + twr - 14) << 8); /* Change to tWTP */ + dtr1 &= ~(BIT28 | BIT29 | BIT30); + dtr1 |= ((MMAX(trtp, 4) - 3) << 28); /* 4 bit DRAM Clock */ + dtr1 &= ~(BIT24 | BIT25); + dtr1 |= ((trrd - 4) << 24); /* 4 bit DRAM Clock */ + dtr1 &= ~(BIT4 | BIT5); + dtr1 |= (1 << 4); + dtr1 &= ~(BIT20 | BIT21 | BIT22 | BIT23); + dtr1 |= ((tras - 14) << 20); /* 6 bit DRAM Clock */ + 
dtr1 &= ~(BIT16 | BIT17 | BIT18 | BIT19); + dtr1 |= ((((tfaw + 1) >> 1) - 5) << 16);/* 4 bit DRAM Clock */ + /* Set 4 Clock CAS to CAS delay (multi-burst) */ + dtr1 &= ~(BIT12 | BIT13); + + dtr2 &= ~(BIT0 | BIT1 | BIT2); + dtr2 |= 1; + dtr2 &= ~(BIT8 | BIT9 | BIT10); + dtr2 |= (2 << 8); + dtr2 &= ~(BIT16 | BIT17 | BIT18 | BIT19); + dtr2 |= (2 << 16); + + dtr3 &= ~(BIT0 | BIT1 | BIT2); + dtr3 |= 2; + dtr3 &= ~(BIT4 | BIT5 | BIT6); + dtr3 |= (2 << 4); + + dtr3 &= ~(BIT8 | BIT9 | BIT10 | BIT11); + if (mrc_params->ddr_speed == DDRFREQ_800) { + /* Extended RW delay (+1) */ + dtr3 |= ((tcl - 5 + 1) << 8); + } else if (mrc_params->ddr_speed == DDRFREQ_1066) { + /* Extended RW delay (+1) */ + dtr3 |= ((tcl - 5 + 1) << 8); + } + + dtr3 &= ~(BIT13 | BIT14 | BIT15 | BIT16); + dtr3 |= ((4 + wl + twtr - 11) << 13); + + dtr3 &= ~(BIT22 | BIT23); + if (mrc_params->ddr_speed == DDRFREQ_800) + dtr3 |= ((MMAX(0, 1 - 1)) << 22); + else + dtr3 |= ((MMAX(0, 2 - 1)) << 22); + + dtr4 &= ~(BIT0 | BIT1); + dtr4 |= 1; + dtr4 &= ~(BIT4 | BIT5 | BIT6); + dtr4 |= (1 << 4); + dtr4 &= ~(BIT8 | BIT9 | BIT10); + dtr4 |= ((1 + tmp1 - tmp2 + 2) << 8); + dtr4 &= ~(BIT12 | BIT13 | BIT14); + dtr4 |= ((1 + tmp1 - tmp2 + 2) << 12); + dtr4 &= ~(BIT15 | BIT16); + + msg_port_write(MEM_CTLR, DTR0, dtr0); + msg_port_write(MEM_CTLR, DTR1, dtr1); + msg_port_write(MEM_CTLR, DTR2, dtr2); + msg_port_write(MEM_CTLR, DTR3, dtr3); + msg_port_write(MEM_CTLR, DTR4, dtr4); + + LEAVEFN(); +} + +/* Configure MCU before jedec init sequence */ +void prog_decode_before_jedec(struct mrc_params *mrc_params) +{ + u32 drp; + u32 drfc; + u32 dcal; + u32 dsch; + u32 dpmc0; + + ENTERFN(); + + /* Disable power saving features */ + dpmc0 = msg_port_read(MEM_CTLR, DPMC0); + dpmc0 |= (BIT24 | BIT25); + dpmc0 &= ~(BIT16 | BIT17 | BIT18); + dpmc0 &= ~BIT23; + msg_port_write(MEM_CTLR, DPMC0, dpmc0); + + /* Disable out of order transactions */ + dsch = msg_port_read(MEM_CTLR, DSCH); + dsch |= (BIT8 | BIT12); + msg_port_write(MEM_CTLR, DSCH, dsch); + + /* Disable issuing the REF command */ + drfc = msg_port_read(MEM_CTLR, DRFC); + drfc &= ~(BIT12 | BIT13 | BIT14); + msg_port_write(MEM_CTLR, DRFC, drfc); + + /* Disable ZQ calibration short */ + dcal = msg_port_read(MEM_CTLR, DCAL); + dcal &= ~(BIT8 | BIT9 | BIT10); + dcal &= ~(BIT12 | BIT13); + msg_port_write(MEM_CTLR, DCAL, dcal); + + /* + * Training performed in address mode 0, rank population has limited + * impact, however simulator complains if enabled non-existing rank. + */ + drp = 0; + if (mrc_params->rank_enables & 1) + drp |= BIT0; + if (mrc_params->rank_enables & 2) + drp |= BIT1; + msg_port_write(MEM_CTLR, DRP, drp); + + LEAVEFN(); +} + +/* + * After Cold Reset, BIOS should set COLDWAKE bit to 1 before + * sending the WAKE message to the Dunit. + * + * For Standby Exit, or any other mode in which the DRAM is in + * SR, this bit must be set to 0. + */ +void perform_ddr_reset(struct mrc_params *mrc_params) +{ + ENTERFN(); + + /* Set COLDWAKE bit before sending the WAKE message */ + mrc_write_mask(MEM_CTLR, DRMC, BIT16, BIT16); + + /* Send wake command to DUNIT (MUST be done before JEDEC) */ + dram_wake_command(); + + /* Set default value */ + msg_port_write(MEM_CTLR, DRMC, + (mrc_params->rd_odt_value == 0 ? BIT12 : 0)); + + LEAVEFN(); +} + + +/* + * This function performs some initialization on the DDRIO unit. + * This function is dependent on BOARD_ID, DDR_SPEED, and CHANNEL_ENABLES. 
+ */ +void ddrphy_init(struct mrc_params *mrc_params) +{ + uint32_t temp; + uint8_t ch; /* channel counter */ + uint8_t rk; /* rank counter */ + uint8_t bl_grp; /* byte lane group counter (2 BLs per module) */ + uint8_t bl_divisor = 1; /* byte lane divisor */ + /* For DDR3 --> 0 == 800, 1 == 1066, 2 == 1333 */ + uint8_t speed = mrc_params->ddr_speed & (BIT1 | BIT0); + uint8_t cas; + uint8_t cwl; + + ENTERFN(); + + cas = mrc_params->params.cl; + cwl = 5 + mrc_params->ddr_speed; + + /* ddrphy_init starts */ + mrc_post_code(0x03, 0x00); + + /* + * HSD#231531 + * Make sure IOBUFACT is deasserted before initializing the DDR PHY + * + * HSD#234845 + * Make sure WRPTRENABLE is deasserted before initializing the DDR PHY + */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + /* Deassert DDRPHY Initialization Complete */ + mrc_alt_write_mask(DDRPHY, + (CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)), + ~BIT20, BIT20); /* SPID_INIT_COMPLETE=0 */ + /* Deassert IOBUFACT */ + mrc_alt_write_mask(DDRPHY, + (CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)), + ~BIT2, BIT2); /* IOBUFACTRST_N=0 */ + /* Disable WRPTR */ + mrc_alt_write_mask(DDRPHY, + (CMDPTRREG + (ch * DDRIOCCC_CH_OFFSET)), + ~BIT0, BIT0); /* WRPTRENABLE=0 */ + } + } + + /* Put PHY in reset */ + mrc_alt_write_mask(DDRPHY, MASTERRSTN, 0, BIT0); + + /* Initialize DQ01, DQ23, CMD, CLK-CTL, COMP modules */ + + /* STEP0 */ + mrc_post_code(0x03, 0x10); + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + /* DQ01-DQ23 */ + for (bl_grp = 0; + bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2); + bl_grp++) { + /* Analog MUX select - IO2xCLKSEL */ + mrc_alt_write_mask(DDRPHY, + (DQOBSCKEBBCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + ((bl_grp) ? 
(0x00) : (BIT22)), (BIT22)); + + /* ODT Strength */ + switch (mrc_params->rd_odt_value) { + case 1: + temp = 0x3; + break; /* 60 ohm */ + case 2: + temp = 0x3; + break; /* 120 ohm */ + case 3: + temp = 0x3; + break; /* 180 ohm */ + default: + temp = 0x3; + break; /* 120 ohm */ + } + + /* ODT strength */ + mrc_alt_write_mask(DDRPHY, + (B0RXIOBUFCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (temp << 5), (BIT6 | BIT5)); + /* ODT strength */ + mrc_alt_write_mask(DDRPHY, + (B1RXIOBUFCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (temp << 5), (BIT6 | BIT5)); + + /* Dynamic ODT/DIFFAMP */ + temp = (((cas) << 24) | ((cas) << 16) | + ((cas) << 8) | ((cas) << 0)); + switch (speed) { + case 0: + temp -= 0x01010101; + break; /* 800 */ + case 1: + temp -= 0x02020202; + break; /* 1066 */ + case 2: + temp -= 0x03030303; + break; /* 1333 */ + case 3: + temp -= 0x04040404; + break; /* 1600 */ + } + + /* Launch Time: ODT, DIFFAMP, ODT, DIFFAMP */ + mrc_alt_write_mask(DDRPHY, + (B01LATCTL1 + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + temp, + (BIT28 | BIT27 | BIT26 | BIT25 | BIT24 | + BIT20 | BIT19 | BIT18 | BIT17 | BIT16 | + BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | + BIT4 | BIT3 | BIT2 | BIT1 | BIT0)); + switch (speed) { + /* HSD#234715 */ + case 0: + temp = ((0x06 << 16) | (0x07 << 8)); + break; /* 800 */ + case 1: + temp = ((0x07 << 16) | (0x08 << 8)); + break; /* 1066 */ + case 2: + temp = ((0x09 << 16) | (0x0A << 8)); + break; /* 1333 */ + case 3: + temp = ((0x0A << 16) | (0x0B << 8)); + break; /* 1600 */ + } + + /* On Duration: ODT, DIFFAMP */ + mrc_alt_write_mask(DDRPHY, + (B0ONDURCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + temp, + (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | + BIT16 | BIT13 | BIT12 | BIT11 | BIT10 | + BIT9 | BIT8)); + /* On Duration: ODT, DIFFAMP */ + mrc_alt_write_mask(DDRPHY, + (B1ONDURCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + temp, + (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | + BIT16 | BIT13 | BIT12 | BIT11 | BIT10 | + BIT9 | BIT8)); + + switch (mrc_params->rd_odt_value) { + case 0: + /* override DIFFAMP=on, ODT=off */ + temp = ((0x3F << 16) | (0x3f << 10)); + break; + default: + /* override DIFFAMP=on, ODT=on */ + temp = ((0x3F << 16) | (0x2A << 10)); + break; + } + + /* Override: DIFFAMP, ODT */ + mrc_alt_write_mask(DDRPHY, + (B0OVRCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + temp, + (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | + BIT16 | BIT15 | BIT14 | BIT13 | BIT12 | + BIT11 | BIT10)); + /* Override: DIFFAMP, ODT */ + mrc_alt_write_mask(DDRPHY, + (B1OVRCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + temp, + (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | + BIT16 | BIT15 | BIT14 | BIT13 | BIT12 | + BIT11 | BIT10)); + + /* DLL Setup */ + + /* 1xCLK Domain Timings: tEDP,RCVEN,WDQS (PO) */ + mrc_alt_write_mask(DDRPHY, + (B0LATCTL0 + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (((cas + 7) << 16) | ((cas - 4) << 8) | + ((cwl - 2) << 0)), + (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | + BIT16 | BIT12 | BIT11 | BIT10 | BIT9 | + BIT8 | BIT4 | BIT3 | BIT2 | BIT1 | + BIT0)); + mrc_alt_write_mask(DDRPHY, + (B1LATCTL0 + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (((cas + 7) << 16) | ((cas - 4) << 8) | + ((cwl - 2) << 0)), + (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | + BIT16 | BIT12 | BIT11 | BIT10 | BIT9 | + BIT8 | BIT4 | BIT3 | BIT2 | BIT1 | + BIT0)); + + /* RCVEN Bypass (PO) */ + mrc_alt_write_mask(DDRPHY, + 
(B0RXIOBUFCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + ((0x0 << 7) | (0x0 << 0)), + (BIT7 | BIT0)); + mrc_alt_write_mask(DDRPHY, + (B1RXIOBUFCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + ((0x0 << 7) | (0x0 << 0)), + (BIT7 | BIT0)); + + /* TX */ + mrc_alt_write_mask(DDRPHY, + (DQCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (BIT16), (BIT16)); + mrc_alt_write_mask(DDRPHY, + (B01PTRCTL1 + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (BIT8), (BIT8)); + + /* RX (PO) */ + /* Internal Vref Code, Enable#, Ext_or_Int (1=Ext) */ + mrc_alt_write_mask(DDRPHY, + (B0VREFCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + ((0x03 << 2) | (0x0 << 1) | (0x0 << 0)), + (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | + BIT2 | BIT1 | BIT0)); + /* Internal Vref Code, Enable#, Ext_or_Int (1=Ext) */ + mrc_alt_write_mask(DDRPHY, + (B1VREFCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + ((0x03 << 2) | (0x0 << 1) | (0x0 << 0)), + (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | + BIT2 | BIT1 | BIT0)); + /* Per-Bit De-Skew Enable */ + mrc_alt_write_mask(DDRPHY, + (B0RXIOBUFCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (0), (BIT4)); + /* Per-Bit De-Skew Enable */ + mrc_alt_write_mask(DDRPHY, + (B1RXIOBUFCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (0), (BIT4)); + } + + /* CLKEBB */ + mrc_alt_write_mask(DDRPHY, + (CMDOBSCKEBBCTL + (ch * DDRIOCCC_CH_OFFSET)), + 0, (BIT23)); + + /* Enable tristate control of cmd/address bus */ + mrc_alt_write_mask(DDRPHY, + (CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)), + 0, (BIT1 | BIT0)); + + /* ODT RCOMP */ + mrc_alt_write_mask(DDRPHY, + (CMDRCOMPODT + (ch * DDRIOCCC_CH_OFFSET)), + ((0x03 << 5) | (0x03 << 0)), + (BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | + BIT3 | BIT2 | BIT1 | BIT0)); + + /* CMDPM* registers must be programmed in this order */ + + /* Turn On Delays: SFR (regulator), MPLL */ + mrc_alt_write_mask(DDRPHY, + (CMDPMDLYREG4 + (ch * DDRIOCCC_CH_OFFSET)), + ((0xFFFFU << 16) | (0xFFFF << 0)), + 0xFFFFFFFF); + /* + * Delays: ASSERT_IOBUFACT_to_ALLON0_for_PM_MSG_3, + * VREG (MDLL) Turn On, ALLON0_to_DEASSERT_IOBUFACT + * for_PM_MSG_gt0, MDLL Turn On + */ + mrc_alt_write_mask(DDRPHY, + (CMDPMDLYREG3 + (ch * DDRIOCCC_CH_OFFSET)), + ((0xFU << 28) | (0xFFF << 16) | (0xF << 12) | + (0x616 << 0)), 0xFFFFFFFF); + /* MPLL Divider Reset Delays */ + mrc_alt_write_mask(DDRPHY, + (CMDPMDLYREG2 + (ch * DDRIOCCC_CH_OFFSET)), + ((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) | + (0xFF << 0)), 0xFFFFFFFF); + /* Turn Off Delays: VREG, Staggered MDLL, MDLL, PI */ + mrc_alt_write_mask(DDRPHY, + (CMDPMDLYREG1 + (ch * DDRIOCCC_CH_OFFSET)), + ((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) | + (0xFF << 0)), 0xFFFFFFFF); + /* Turn On Delays: MPLL, Staggered MDLL, PI, IOBUFACT */ + mrc_alt_write_mask(DDRPHY, + (CMDPMDLYREG0 + (ch * DDRIOCCC_CH_OFFSET)), + ((0xFFU << 24) | (0xFF << 16) | (0xFF << 8) | + (0xFF << 0)), 0xFFFFFFFF); + /* Allow PUnit signals */ + mrc_alt_write_mask(DDRPHY, + (CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)), + ((0x6 << 8) | BIT6 | (0x4 << 0)), + (BIT31 | BIT30 | BIT29 | BIT28 | BIT27 | BIT26 | + BIT25 | BIT24 | BIT23 | BIT22 | BIT21 | BIT11 | + BIT10 | BIT9 | BIT8 | BIT6 | BIT3 | BIT2 | + BIT1 | BIT0)); + /* DLL_VREG Bias Trim, VREF Tuning for DLL_VREG */ + mrc_alt_write_mask(DDRPHY, + (CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)), + ((0x3 << 4) | (0x7 << 0)), + (BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | + BIT0)); + + /* CLK-CTL */ + 
mrc_alt_write_mask(DDRPHY, + (CCOBSCKEBBCTL + (ch * DDRIOCCC_CH_OFFSET)), + 0, BIT24); /* CLKEBB */ + /* Buffer Enable: CS,CKE,ODT,CLK */ + mrc_alt_write_mask(DDRPHY, + (CCCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)), + ((0x0 << 16) | (0x0 << 12) | (0x0 << 8) | + (0xF << 4) | BIT0), + (BIT19 | BIT18 | BIT17 | BIT16 | BIT15 | BIT14 | + BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | + BIT7 | BIT6 | BIT5 | BIT4 | BIT0)); + /* ODT RCOMP */ + mrc_alt_write_mask(DDRPHY, + (CCRCOMPODT + (ch * DDRIOCCC_CH_OFFSET)), + ((0x03 << 8) | (0x03 << 0)), + (BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | BIT4 | + BIT3 | BIT2 | BIT1 | BIT0)); + /* DLL_VREG Bias Trim, VREF Tuning for DLL_VREG */ + mrc_alt_write_mask(DDRPHY, + (CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)), + ((0x3 << 4) | (0x7 << 0)), + (BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | + BIT0)); + + /* + * COMP (RON channel specific) + * - DQ/DQS/DM RON: 32 Ohm + * - CTRL/CMD RON: 27 Ohm + * - CLK RON: 26 Ohm + */ + /* RCOMP Vref PU/PD */ + mrc_alt_write_mask(DDRPHY, + (DQVREFCH0 + (ch * DDRCOMP_CH_OFFSET)), + ((0x08 << 24) | (0x03 << 16)), + (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | + BIT24 | BIT21 | BIT20 | BIT19 | BIT18 | + BIT17 | BIT16)); + /* RCOMP Vref PU/PD */ + mrc_alt_write_mask(DDRPHY, + (CMDVREFCH0 + (ch * DDRCOMP_CH_OFFSET)), + ((0x0C << 24) | (0x03 << 16)), + (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | + BIT24 | BIT21 | BIT20 | BIT19 | BIT18 | + BIT17 | BIT16)); + /* RCOMP Vref PU/PD */ + mrc_alt_write_mask(DDRPHY, + (CLKVREFCH0 + (ch * DDRCOMP_CH_OFFSET)), + ((0x0F << 24) | (0x03 << 16)), + (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | + BIT24 | BIT21 | BIT20 | BIT19 | BIT18 | + BIT17 | BIT16)); + /* RCOMP Vref PU/PD */ + mrc_alt_write_mask(DDRPHY, + (DQSVREFCH0 + (ch * DDRCOMP_CH_OFFSET)), + ((0x08 << 24) | (0x03 << 16)), + (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | + BIT24 | BIT21 | BIT20 | BIT19 | BIT18 | + BIT17 | BIT16)); + /* RCOMP Vref PU/PD */ + mrc_alt_write_mask(DDRPHY, + (CTLVREFCH0 + (ch * DDRCOMP_CH_OFFSET)), + ((0x0C << 24) | (0x03 << 16)), + (BIT29 | BIT28 | BIT27 | BIT26 | BIT25 | + BIT24 | BIT21 | BIT20 | BIT19 | BIT18 | + BIT17 | BIT16)); + + /* DQS Swapped Input Enable */ + mrc_alt_write_mask(DDRPHY, + (COMPEN1CH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT19 | BIT17), + (BIT31 | BIT30 | BIT19 | BIT17 | + BIT15 | BIT14)); + + /* ODT VREF = 1.5 x 274/360+274 = 0.65V (code of ~50) */ + /* ODT Vref PU/PD */ + mrc_alt_write_mask(DDRPHY, + (DQVREFCH0 + (ch * DDRCOMP_CH_OFFSET)), + ((0x32 << 8) | (0x03 << 0)), + (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | + BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0)); + /* ODT Vref PU/PD */ + mrc_alt_write_mask(DDRPHY, + (DQSVREFCH0 + (ch * DDRCOMP_CH_OFFSET)), + ((0x32 << 8) | (0x03 << 0)), + (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | + BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0)); + /* ODT Vref PU/PD */ + mrc_alt_write_mask(DDRPHY, + (CLKVREFCH0 + (ch * DDRCOMP_CH_OFFSET)), + ((0x0E << 8) | (0x05 << 0)), + (BIT13 | BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | + BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0)); + + /* + * Slew rate settings are frequency specific, + * numbers below are for 800Mhz (speed == 0) + * - DQ/DQS/DM/CLK SR: 4V/ns, + * - CTRL/CMD SR: 1.5V/ns + */ + temp = (0x0E << 16) | (0x0E << 12) | (0x08 << 8) | + (0x0B << 4) | (0x0B << 0); + /* DCOMP Delay Select: CTL,CMD,CLK,DQS,DQ */ + mrc_alt_write_mask(DDRPHY, + (DLYSELCH0 + (ch * DDRCOMP_CH_OFFSET)), + temp, + (BIT19 | BIT18 | BIT17 | BIT16 | BIT15 | + BIT14 | BIT13 | BIT12 | BIT11 | BIT10 | + BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | + BIT3 | BIT2 | BIT1 | BIT0)); + /* TCO 
Vref CLK,DQS,DQ */ + mrc_alt_write_mask(DDRPHY, + (TCOVREFCH0 + (ch * DDRCOMP_CH_OFFSET)), + ((0x05 << 16) | (0x05 << 8) | (0x05 << 0)), + (BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | + BIT16 | BIT13 | BIT12 | BIT11 | BIT10 | + BIT9 | BIT8 | BIT5 | BIT4 | BIT3 | BIT2 | + BIT1 | BIT0)); + /* ODTCOMP CMD/CTL PU/PD */ + mrc_alt_write_mask(DDRPHY, + (CCBUFODTCH0 + (ch * DDRCOMP_CH_OFFSET)), + ((0x03 << 8) | (0x03 << 0)), + (BIT12 | BIT11 | BIT10 | BIT9 | BIT8 | + BIT4 | BIT3 | BIT2 | BIT1 | BIT0)); + /* COMP */ + mrc_alt_write_mask(DDRPHY, + (COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)), + 0, (BIT31 | BIT30 | BIT8)); + +#ifdef BACKUP_COMPS + /* DQ COMP Overrides */ + /* RCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (DQDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0A << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* RCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (DQDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0A << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* DCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (DQDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x10 << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* DCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (DQDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x10 << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* ODTCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (DQODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0B << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* ODTCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (DQODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0B << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* TCOCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (DQTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31), (BIT31)); + /* TCOCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (DQTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31), (BIT31)); + + /* DQS COMP Overrides */ + /* RCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (DQSDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0A << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* RCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (DQSDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0A << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* DCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (DQSDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x10 << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* DCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (DQSDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x10 << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* ODTCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (DQSODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0B << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* ODTCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (DQSODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0B << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* TCOCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (DQSTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31), (BIT31)); + /* TCOCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (DQSTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31), (BIT31)); + + /* CLK COMP Overrides */ + /* RCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (CLKDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0C << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* RCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (CLKDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0C << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* DCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (CLKDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x07 << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* DCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (CLKDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x07 << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* ODTCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (CLKODTPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0B << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* ODTCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (CLKODTPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0B << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* TCOCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (CLKTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31), (BIT31)); + /* TCOCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (CLKTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31), (BIT31)); + + /* CMD COMP Overrides */ + /* RCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (CMDDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0D << 16)), + (BIT31 | BIT21 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* RCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (CMDDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0D << 16)), + (BIT31 | BIT21 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* DCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (CMDDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0A << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* DCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (CMDDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0A << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + + /* CTL COMP Overrides */ + /* RCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (CTLDRVPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0D << 16)), + (BIT31 | BIT21 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* RCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (CTLDRVPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0D << 16)), + (BIT31 | BIT21 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* DCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (CTLDLYPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0A << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* DCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (CTLDLYPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x0A << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); +#else + /* DQ TCOCOMP Overrides */ + /* TCOCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (DQTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x1F << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* TCOCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (DQTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x1F << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + + /* DQS TCOCOMP Overrides */ + /* TCOCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (DQSTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x1F << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + /* TCOCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (DQSTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x1F << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); + + /* CLK TCOCOMP Overrides */ + /* TCOCOMP PU */ + mrc_alt_write_mask(DDRPHY, + (CLKTCOPUCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x1F << 16)), + (BIT31 | BIT20 | BIT19 
| + BIT18 | BIT17 | BIT16)); + /* TCOCOMP PD */ + mrc_alt_write_mask(DDRPHY, + (CLKTCOPDCTLCH0 + (ch * DDRCOMP_CH_OFFSET)), + (BIT31 | (0x1F << 16)), + (BIT31 | BIT20 | BIT19 | + BIT18 | BIT17 | BIT16)); +#endif + + /* program STATIC delays */ +#ifdef BACKUP_WCMD + set_wcmd(ch, ddr_wcmd[PLATFORM_ID]); +#else + set_wcmd(ch, ddr_wclk[PLATFORM_ID] + HALF_CLK); +#endif + + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1<<rk)) { + set_wclk(ch, rk, ddr_wclk[PLATFORM_ID]); +#ifdef BACKUP_WCTL + set_wctl(ch, rk, ddr_wctl[PLATFORM_ID]); +#else + set_wctl(ch, rk, ddr_wclk[PLATFORM_ID] + HALF_CLK); +#endif + } + } + } + } + + /* COMP (non channel specific) */ + /* RCOMP: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (DQANADRVPUCTL), (BIT30), (BIT30)); + /* RCOMP: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (DQANADRVPDCTL), (BIT30), (BIT30)); + /* RCOMP: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (CMDANADRVPUCTL), (BIT30), (BIT30)); + /* RCOMP: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (CMDANADRVPDCTL), (BIT30), (BIT30)); + /* RCOMP: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (CLKANADRVPUCTL), (BIT30), (BIT30)); + /* RCOMP: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (CLKANADRVPDCTL), (BIT30), (BIT30)); + /* RCOMP: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (DQSANADRVPUCTL), (BIT30), (BIT30)); + /* RCOMP: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (DQSANADRVPDCTL), (BIT30), (BIT30)); + /* RCOMP: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (CTLANADRVPUCTL), (BIT30), (BIT30)); + /* RCOMP: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (CTLANADRVPDCTL), (BIT30), (BIT30)); + /* ODT: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (DQANAODTPUCTL), (BIT30), (BIT30)); + /* ODT: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (DQANAODTPDCTL), (BIT30), (BIT30)); + /* ODT: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (CLKANAODTPUCTL), (BIT30), (BIT30)); + /* ODT: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (CLKANAODTPDCTL), (BIT30), (BIT30)); + /* ODT: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (DQSANAODTPUCTL), (BIT30), (BIT30)); + /* ODT: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (DQSANAODTPDCTL), (BIT30), (BIT30)); + /* DCOMP: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (DQANADLYPUCTL), (BIT30), (BIT30)); + /* DCOMP: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (DQANADLYPDCTL), (BIT30), (BIT30)); + /* DCOMP: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (CMDANADLYPUCTL), (BIT30), (BIT30)); + /* DCOMP: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (CMDANADLYPDCTL), (BIT30), (BIT30)); + /* DCOMP: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (CLKANADLYPUCTL), (BIT30), (BIT30)); + /* DCOMP: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (CLKANADLYPDCTL), (BIT30), (BIT30)); + /* DCOMP: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (DQSANADLYPUCTL), (BIT30), (BIT30)); + /* DCOMP: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (DQSANADLYPDCTL), (BIT30), (BIT30)); + /* DCOMP: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (CTLANADLYPUCTL), (BIT30), (BIT30)); + /* DCOMP: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (CTLANADLYPDCTL), (BIT30), (BIT30)); + /* TCO: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (DQANATCOPUCTL), (BIT30), (BIT30)); + /* TCO: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (DQANATCOPDCTL), (BIT30), (BIT30)); + /* TCO: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (CLKANATCOPUCTL), (BIT30), (BIT30)); + /* TCO: Dither PD Enable */ + 
mrc_alt_write_mask(DDRPHY, (CLKANATCOPDCTL), (BIT30), (BIT30)); + /* TCO: Dither PU Enable */ + mrc_alt_write_mask(DDRPHY, (DQSANATCOPUCTL), (BIT30), (BIT30)); + /* TCO: Dither PD Enable */ + mrc_alt_write_mask(DDRPHY, (DQSANATCOPDCTL), (BIT30), (BIT30)); + /* TCOCOMP: Pulse Count */ + mrc_alt_write_mask(DDRPHY, (TCOCNTCTRL), (0x1 << 0), (BIT1 | BIT0)); + /* ODT: CMD/CTL PD/PU */ + mrc_alt_write_mask(DDRPHY, + (CHNLBUFSTATIC), ((0x03 << 24) | (0x03 << 16)), + (BIT28 | BIT27 | BIT26 | BIT25 | BIT24 | + BIT20 | BIT19 | BIT18 | BIT17 | BIT16)); + /* Set 1us counter */ + mrc_alt_write_mask(DDRPHY, + (MSCNTR), (0x64 << 0), + (BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0)); + mrc_alt_write_mask(DDRPHY, + (LATCH1CTL), (0x1 << 28), + (BIT30 | BIT29 | BIT28)); + + /* Release PHY from reset */ + mrc_alt_write_mask(DDRPHY, MASTERRSTN, BIT0, BIT0); + + /* STEP1 */ + mrc_post_code(0x03, 0x11); + + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + /* DQ01-DQ23 */ + for (bl_grp = 0; + bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2); + bl_grp++) { + mrc_alt_write_mask(DDRPHY, + (DQMDLLCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (BIT13), + (BIT13)); /* Enable VREG */ + delay_n(3); + } + + /* ECC */ + mrc_alt_write_mask(DDRPHY, (ECCMDLLCTL), + (BIT13), (BIT13)); /* Enable VREG */ + delay_n(3); + /* CMD */ + mrc_alt_write_mask(DDRPHY, + (CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)), + (BIT13), (BIT13)); /* Enable VREG */ + delay_n(3); + /* CLK-CTL */ + mrc_alt_write_mask(DDRPHY, + (CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)), + (BIT13), (BIT13)); /* Enable VREG */ + delay_n(3); + } + } + + /* STEP2 */ + mrc_post_code(0x03, 0x12); + delay_n(200); + + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + /* DQ01-DQ23 */ + for (bl_grp = 0; + bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2); + bl_grp++) { + mrc_alt_write_mask(DDRPHY, + (DQMDLLCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (BIT17), + (BIT17)); /* Enable MCDLL */ + delay_n(50); + } + + /* ECC */ + mrc_alt_write_mask(DDRPHY, (ECCMDLLCTL), + (BIT17), (BIT17)); /* Enable MCDLL */ + delay_n(50); + /* CMD */ + mrc_alt_write_mask(DDRPHY, + (CMDMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)), + (BIT18), (BIT18)); /* Enable MCDLL */ + delay_n(50); + /* CLK-CTL */ + mrc_alt_write_mask(DDRPHY, + (CCMDLLCTL + (ch * DDRIOCCC_CH_OFFSET)), + (BIT18), (BIT18)); /* Enable MCDLL */ + delay_n(50); + } + } + + /* STEP3: */ + mrc_post_code(0x03, 0x13); + delay_n(100); + + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + /* DQ01-DQ23 */ + for (bl_grp = 0; + bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2); + bl_grp++) { +#ifdef FORCE_16BIT_DDRIO + temp = ((bl_grp) && + (mrc_params->channel_width == X16)) ? 
+ ((0x1 << 12) | (0x1 << 8) | + (0xF << 4) | (0xF << 0)) : + ((0xF << 12) | (0xF << 8) | + (0xF << 4) | (0xF << 0)); +#else + temp = ((0xF << 12) | (0xF << 8) | + (0xF << 4) | (0xF << 0)); +#endif + /* Enable TXDLL */ + mrc_alt_write_mask(DDRPHY, + (DQDLLTXCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + temp, 0xFFFF); + delay_n(3); + /* Enable RXDLL */ + mrc_alt_write_mask(DDRPHY, + (DQDLLRXCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (BIT3 | BIT2 | BIT1 | BIT0), + (BIT3 | BIT2 | BIT1 | BIT0)); + delay_n(3); + /* Enable RXDLL Overrides BL0 */ + mrc_alt_write_mask(DDRPHY, + (B0OVRCTL + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (BIT3 | BIT2 | BIT1 | BIT0), + (BIT3 | BIT2 | BIT1 | BIT0)); + } + + /* ECC */ + temp = ((0xF << 12) | (0xF << 8) | + (0xF << 4) | (0xF << 0)); + mrc_alt_write_mask(DDRPHY, (ECCDLLTXCTL), + temp, 0xFFFF); + delay_n(3); + + /* CMD (PO) */ + mrc_alt_write_mask(DDRPHY, + (CMDDLLTXCTL + (ch * DDRIOCCC_CH_OFFSET)), + temp, 0xFFFF); + delay_n(3); + } + } + + /* STEP4 */ + mrc_post_code(0x03, 0x14); + + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + /* Host To Memory Clock Alignment (HMC) for 800/1066 */ + for (bl_grp = 0; + bl_grp < ((NUM_BYTE_LANES / bl_divisor) / 2); + bl_grp++) { + /* CLK_ALIGN_MOD_ID */ + mrc_alt_write_mask(DDRPHY, + (DQCLKALIGNREG2 + + (bl_grp * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + (bl_grp) ? (0x3) : (0x1), + (BIT3 | BIT2 | BIT1 | BIT0)); + } + + mrc_alt_write_mask(DDRPHY, + (ECCCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)), + 0x2, + (BIT3 | BIT2 | BIT1 | BIT0)); + mrc_alt_write_mask(DDRPHY, + (CMDCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)), + 0x0, + (BIT3 | BIT2 | BIT1 | BIT0)); + mrc_alt_write_mask(DDRPHY, + (CCCLKALIGNREG2 + (ch * DDRIODQ_CH_OFFSET)), + 0x2, + (BIT3 | BIT2 | BIT1 | BIT0)); + mrc_alt_write_mask(DDRPHY, + (CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET)), + (0x2 << 4), (BIT5 | BIT4)); + /* + * NUM_SAMPLES, MAX_SAMPLES, + * MACRO_PI_STEP, MICRO_PI_STEP + */ + mrc_alt_write_mask(DDRPHY, + (CMDCLKALIGNREG1 + (ch * DDRIOCCC_CH_OFFSET)), + ((0x18 << 16) | (0x10 << 8) | + (0x8 << 2) | (0x1 << 0)), + (BIT22 | BIT21 | BIT20 | BIT19 | BIT18 | BIT17 | + BIT16 | BIT14 | BIT13 | BIT12 | BIT11 | BIT10 | + BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | + BIT2 | BIT1 | BIT0)); + /* TOTAL_NUM_MODULES, FIRST_U_PARTITION */ + mrc_alt_write_mask(DDRPHY, + (CMDCLKALIGNREG2 + (ch * DDRIOCCC_CH_OFFSET)), + ((0x10 << 16) | (0x4 << 8) | (0x2 << 4)), + (BIT20 | BIT19 | BIT18 | BIT17 | BIT16 | + BIT11 | BIT10 | BIT9 | BIT8 | BIT7 | BIT6 | + BIT5 | BIT4)); +#ifdef HMC_TEST + /* START_CLK_ALIGN=1 */ + mrc_alt_write_mask(DDRPHY, + (CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET)), + BIT24, BIT24); + while (msg_port_alt_read(DDRPHY, + (CMDCLKALIGNREG0 + (ch * DDRIOCCC_CH_OFFSET))) & + BIT24) + ; /* wait for START_CLK_ALIGN=0 */ +#endif + + /* Set RD/WR Pointer Separation & COUNTEN & FIFOPTREN */ + mrc_alt_write_mask(DDRPHY, + (CMDPTRREG + (ch * DDRIOCCC_CH_OFFSET)), + BIT0, BIT0); /* WRPTRENABLE=1 */ + + /* COMP initial */ + /* enable bypass for CLK buffer (PO) */ + mrc_alt_write_mask(DDRPHY, + (COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)), + BIT5, BIT5); + /* Initial COMP Enable */ + mrc_alt_write_mask(DDRPHY, (CMPCTRL), + (BIT0), (BIT0)); + /* wait for Initial COMP Enable = 0 */ + while (msg_port_alt_read(DDRPHY, (CMPCTRL)) & BIT0) + ; + /* disable bypass for CLK buffer (PO) */ + mrc_alt_write_mask(DDRPHY, + (COMPEN0CH0 + (ch * DDRCOMP_CH_OFFSET)), + 
~BIT5, BIT5); + + /* IOBUFACT */ + + /* STEP4a */ + mrc_alt_write_mask(DDRPHY, + (CMDCFGREG0 + (ch * DDRIOCCC_CH_OFFSET)), + BIT2, BIT2); /* IOBUFACTRST_N=1 */ + + /* DDRPHY initialization complete */ + mrc_alt_write_mask(DDRPHY, + (CMDPMCONFIG0 + (ch * DDRIOCCC_CH_OFFSET)), + BIT20, BIT20); /* SPID_INIT_COMPLETE=1 */ + } + } + + LEAVEFN(); +} + +/* This function performs JEDEC initialization on all enabled channels */ +void perform_jedec_init(struct mrc_params *mrc_params) +{ + uint8_t twr, wl, rank; + uint32_t tck; + u32 dtr0; + u32 drp; + u32 drmc; + u32 mrs0_cmd = 0; + u32 emrs1_cmd = 0; + u32 emrs2_cmd = 0; + u32 emrs3_cmd = 0; + + ENTERFN(); + + /* jedec_init starts */ + mrc_post_code(0x04, 0x00); + + /* DDR3_RESET_SET=0, DDR3_RESET_RESET=1 */ + mrc_alt_write_mask(DDRPHY, CCDDR3RESETCTL, BIT1, (BIT8 | BIT1)); + + /* Assert RESET# for 200us */ + delay_u(200); + + /* DDR3_RESET_SET=1, DDR3_RESET_RESET=0 */ + mrc_alt_write_mask(DDRPHY, CCDDR3RESETCTL, BIT8, (BIT8 | BIT1)); + + dtr0 = msg_port_read(MEM_CTLR, DTR0); + + /* + * Set CKEVAL for populated ranks + * then send NOP to each rank (#4550197) + */ + + drp = msg_port_read(MEM_CTLR, DRP); + drp &= 0x3; + + drmc = msg_port_read(MEM_CTLR, DRMC); + drmc &= 0xFFFFFFFC; + drmc |= (BIT4 | drp); + + msg_port_write(MEM_CTLR, DRMC, drmc); + + for (rank = 0; rank < NUM_RANKS; rank++) { + /* Skip to next populated rank */ + if ((mrc_params->rank_enables & (1 << rank)) == 0) + continue; + + dram_init_command(DCMD_NOP(rank)); + } + + msg_port_write(MEM_CTLR, DRMC, + (mrc_params->rd_odt_value == 0 ? BIT12 : 0)); + + /* + * setup for emrs 2 + * BIT[15:11] --> Always "0" + * BIT[10:09] --> Rtt_WR: want "Dynamic ODT Off" (0) + * BIT[08] --> Always "0" + * BIT[07] --> SRT: use sr_temp_range + * BIT[06] --> ASR: want "Manual SR Reference" (0) + * BIT[05:03] --> CWL: use oem_tCWL + * BIT[02:00] --> PASR: want "Full Array" (0) + */ + emrs2_cmd |= (2 << 3); + wl = 5 + mrc_params->ddr_speed; + emrs2_cmd |= ((wl - 5) << 9); + emrs2_cmd |= (mrc_params->sr_temp_range << 13); + + /* + * setup for emrs 3 + * BIT[15:03] --> Always "0" + * BIT[02] --> MPR: want "Normal Operation" (0) + * BIT[01:00] --> MPR_Loc: want "Predefined Pattern" (0) + */ + emrs3_cmd |= (3 << 3); + + /* + * setup for emrs 1 + * BIT[15:13] --> Always "0" + * BIT[12:12] --> Qoff: want "Output Buffer Enabled" (0) + * BIT[11:11] --> TDQS: want "Disabled" (0) + * BIT[10:10] --> Always "0" + * BIT[09,06,02] --> Rtt_nom: use rtt_nom_value + * BIT[08] --> Always "0" + * BIT[07] --> WR_LVL: want "Disabled" (0) + * BIT[05,01] --> DIC: use ron_value + * BIT[04:03] --> AL: additive latency want "0" (0) + * BIT[00] --> DLL: want "Enable" (0) + * + * (BIT5|BIT1) set Ron value + * 00 --> RZQ/6 (40ohm) + * 01 --> RZQ/7 (34ohm) + * 1* --> RESERVED + * + * (BIT9|BIT6|BIT2) set Rtt_nom value + * 000 --> Disabled + * 001 --> RZQ/4 ( 60ohm) + * 010 --> RZQ/2 (120ohm) + * 011 --> RZQ/6 ( 40ohm) + * 1** --> RESERVED + */ + emrs1_cmd |= (1 << 3); + emrs1_cmd &= ~BIT6; + + if (mrc_params->ron_value == 0) + emrs1_cmd |= BIT7; + else + emrs1_cmd &= ~BIT7; + + if (mrc_params->rtt_nom_value == 0) + emrs1_cmd |= (DDR3_EMRS1_RTTNOM_40 << 6); + else if (mrc_params->rtt_nom_value == 1) + emrs1_cmd |= (DDR3_EMRS1_RTTNOM_60 << 6); + else if (mrc_params->rtt_nom_value == 2) + emrs1_cmd |= (DDR3_EMRS1_RTTNOM_120 << 6); + + /* save MRS1 value (excluding control fields) */ + mrc_params->mrs1 = emrs1_cmd >> 6; + + /* + * setup for mrs 0 + * BIT[15:13] --> Always "0" + * BIT[12] --> PPD: for Quark (1) + * BIT[11:09] --> WR: use 
oem_tWR + * BIT[08] --> DLL: want "Reset" (1, self clearing) + * BIT[07] --> MODE: want "Normal" (0) + * BIT[06:04,02] --> CL: use oem_tCAS + * BIT[03] --> RD_BURST_TYPE: want "Interleave" (1) + * BIT[01:00] --> BL: want "8 Fixed" (0) + * WR: + * 0 --> 16 + * 1 --> 5 + * 2 --> 6 + * 3 --> 7 + * 4 --> 8 + * 5 --> 10 + * 6 --> 12 + * 7 --> 14 + * CL: + * BIT[02:02] "0" if oem_tCAS <= 11 (1866?) + * BIT[06:04] use oem_tCAS-4 + */ + mrs0_cmd |= BIT14; + mrs0_cmd |= BIT18; + mrs0_cmd |= ((((dtr0 >> 12) & 7) + 1) << 10); + + tck = t_ck[mrc_params->ddr_speed]; + /* Per JEDEC: tWR=15000ps DDR2/3 from 800-1600 */ + twr = MCEIL(15000, tck); + mrs0_cmd |= ((twr - 4) << 15); + + for (rank = 0; rank < NUM_RANKS; rank++) { + /* Skip to next populated rank */ + if ((mrc_params->rank_enables & (1 << rank)) == 0) + continue; + + emrs2_cmd |= (rank << 22); + dram_init_command(emrs2_cmd); + + emrs3_cmd |= (rank << 22); + dram_init_command(emrs3_cmd); + + emrs1_cmd |= (rank << 22); + dram_init_command(emrs1_cmd); + + mrs0_cmd |= (rank << 22); + dram_init_command(mrs0_cmd); + + dram_init_command(DCMD_ZQCL(rank)); + } + + LEAVEFN(); +} + +/* + * Dunit Initialization Complete + * + * Indicates that initialization of the Dunit has completed. + * + * Memory accesses are permitted and maintenance operation begins. + * Until this bit is set to a 1, the memory controller will not accept + * DRAM requests from the MEMORY_MANAGER or HTE. + */ +void set_ddr_init_complete(struct mrc_params *mrc_params) +{ + u32 dco; + + ENTERFN(); + + dco = msg_port_read(MEM_CTLR, DCO); + dco &= ~BIT28; + dco |= BIT31; + msg_port_write(MEM_CTLR, DCO, dco); + + LEAVEFN(); +} + +/* + * This function will retrieve relevant timing data + * + * This data will be used on subsequent boots to speed up boot times + * and is required for Suspend To RAM capabilities. + */ +void restore_timings(struct mrc_params *mrc_params) +{ + uint8_t ch, rk, bl; + const struct mrc_timings *mt = &mrc_params->timings; + + for (ch = 0; ch < NUM_CHANNELS; ch++) { + for (rk = 0; rk < NUM_RANKS; rk++) { + for (bl = 0; bl < NUM_BYTE_LANES; bl++) { + set_rcvn(ch, rk, bl, mt->rcvn[ch][rk][bl]); + set_rdqs(ch, rk, bl, mt->rdqs[ch][rk][bl]); + set_wdqs(ch, rk, bl, mt->wdqs[ch][rk][bl]); + set_wdq(ch, rk, bl, mt->wdq[ch][rk][bl]); + if (rk == 0) { + /* VREF (RANK0 only) */ + set_vref(ch, bl, mt->vref[ch][bl]); + } + } + set_wctl(ch, rk, mt->wctl[ch][rk]); + } + set_wcmd(ch, mt->wcmd[ch]); + } +} + +/* + * Configure default settings normally set as part of read training + * + * Some defaults have to be set earlier as they may affect earlier + * training steps. + */ +void default_timings(struct mrc_params *mrc_params) +{ + uint8_t ch, rk, bl; + + for (ch = 0; ch < NUM_CHANNELS; ch++) { + for (rk = 0; rk < NUM_RANKS; rk++) { + for (bl = 0; bl < NUM_BYTE_LANES; bl++) { + set_rdqs(ch, rk, bl, 24); + if (rk == 0) { + /* VREF (RANK0 only) */ + set_vref(ch, bl, 32); + } + } + } + } +} + +/* + * This function will perform our RCVEN Calibration Algorithm. + * We will only use the 2xCLK domain timings to perform RCVEN Calibration. + * All byte lanes will be calibrated "simultaneously" per channel per rank. + */ +void rcvn_cal(struct mrc_params *mrc_params) +{ + uint8_t ch; /* channel counter */ + uint8_t rk; /* rank counter */ + uint8_t bl; /* byte lane counter */ + uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 
2 : 1; + +#ifdef R2R_SHARING + /* used to find placement for rank2rank sharing configs */ + uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES]; +#ifndef BACKUP_RCVN + /* used to find placement for rank2rank sharing configs */ + uint32_t num_ranks_enabled = 0; +#endif +#endif + +#ifdef BACKUP_RCVN +#else + uint32_t temp; + /* absolute PI value to be programmed on the byte lane */ + uint32_t delay[NUM_BYTE_LANES]; + u32 dtr1, dtr1_save; +#endif + + ENTERFN(); + + /* rcvn_cal starts */ + mrc_post_code(0x05, 0x00); + +#ifndef BACKUP_RCVN + /* need separate burst to sample DQS preamble */ + dtr1 = msg_port_read(MEM_CTLR, DTR1); + dtr1_save = dtr1; + dtr1 |= BIT12; + msg_port_write(MEM_CTLR, DTR1, dtr1); +#endif + +#ifdef R2R_SHARING + /* need to set "final_delay[][]" elements to "0" */ + memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay)); +#endif + + /* loop through each enabled channel */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + /* perform RCVEN Calibration on a per rank basis */ + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1 << rk)) { + /* + * POST_CODE here indicates the current + * channel and rank being calibrated + */ + mrc_post_code(0x05, (0x10 + ((ch << 4) | rk))); + +#ifdef BACKUP_RCVN + /* et hard-coded timing values */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) + set_rcvn(ch, rk, bl, ddr_rcvn[PLATFORM_ID]); +#else + /* enable FIFORST */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl += 2) { + mrc_alt_write_mask(DDRPHY, + (B01PTRCTL1 + + ((bl >> 1) * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + 0, BIT8); + } + /* initialize the starting delay to 128 PI (cas +1 CLK) */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + /* 1x CLK domain timing is cas-4 */ + delay[bl] = (4 + 1) * FULL_CLK; + + set_rcvn(ch, rk, bl, delay[bl]); + } + + /* now find the rising edge */ + find_rising_edge(mrc_params, delay, ch, rk, true); + + /* Now increase delay by 32 PI (1/4 CLK) to place in center of high pulse */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + delay[bl] += QRTR_CLK; + set_rcvn(ch, rk, bl, delay[bl]); + } + /* Now decrement delay by 128 PI (1 CLK) until we sample a "0" */ + do { + temp = sample_dqs(mrc_params, ch, rk, true); + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + if (temp & (1 << bl)) { + if (delay[bl] >= FULL_CLK) { + delay[bl] -= FULL_CLK; + set_rcvn(ch, rk, bl, delay[bl]); + } else { + /* not enough delay */ + training_message(ch, rk, bl); + mrc_post_code(0xEE, 0x50); + } + } + } + } while (temp & 0xFF); + +#ifdef R2R_SHARING + /* increment "num_ranks_enabled" */ + num_ranks_enabled++; + /* Finally increment delay by 32 PI (1/4 CLK) to place in center of preamble */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + delay[bl] += QRTR_CLK; + /* add "delay[]" values to "final_delay[][]" for rolling average */ + final_delay[ch][bl] += delay[bl]; + /* set timing based on rolling average values */ + set_rcvn(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled)); + } +#else + /* Finally increment delay by 32 PI (1/4 CLK) to place in center of preamble */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + delay[bl] += QRTR_CLK; + set_rcvn(ch, rk, bl, delay[bl]); + } +#endif + + /* disable FIFORST */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl += 2) { + mrc_alt_write_mask(DDRPHY, + (B01PTRCTL1 + + ((bl >> 1) * DDRIODQ_BL_OFFSET) + + (ch * DDRIODQ_CH_OFFSET)), + BIT8, BIT8); + } +#endif + } + } + } 
+ } + +#ifndef BACKUP_RCVN + /* restore original */ + msg_port_write(MEM_CTLR, DTR1, dtr1_save); +#endif + + LEAVEFN(); +} + +/* + * This function will perform the Write Levelling algorithm + * (align WCLK and WDQS). + * + * This algorithm will act on each rank in each channel separately. + */ +void wr_level(struct mrc_params *mrc_params) +{ + uint8_t ch; /* channel counter */ + uint8_t rk; /* rank counter */ + uint8_t bl; /* byte lane counter */ + uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1; + +#ifdef R2R_SHARING + /* used to find placement for rank2rank sharing configs */ + uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES]; +#ifndef BACKUP_WDQS + /* used to find placement for rank2rank sharing configs */ + uint32_t num_ranks_enabled = 0; +#endif +#endif + +#ifdef BACKUP_WDQS +#else + /* determines stop condition for CRS_WR_LVL */ + bool all_edges_found; + /* absolute PI value to be programmed on the byte lane */ + uint32_t delay[NUM_BYTE_LANES]; + /* + * static makes it so the data is loaded in the heap once by shadow(), + * where non-static copies the data onto the stack every time this + * function is called + */ + uint32_t address; /* address to be checked during COARSE_WR_LVL */ + u32 dtr4, dtr4_save; +#endif + + ENTERFN(); + + /* wr_level starts */ + mrc_post_code(0x06, 0x00); + +#ifdef R2R_SHARING + /* need to set "final_delay[][]" elements to "0" */ + memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay)); +#endif + + /* loop through each enabled channel */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + /* perform WRITE LEVELING algorithm on a per rank basis */ + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1 << rk)) { + /* + * POST_CODE here indicates the current + * rank and channel being calibrated + */ + mrc_post_code(0x06, (0x10 + ((ch << 4) | rk))); + +#ifdef BACKUP_WDQS + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + set_wdqs(ch, rk, bl, ddr_wdqs[PLATFORM_ID]); + set_wdq(ch, rk, bl, (ddr_wdqs[PLATFORM_ID] - QRTR_CLK)); + } +#else + /* + * perform a single PRECHARGE_ALL command to + * make DRAM state machine go to IDLE state + */ + dram_init_command(DCMD_PREA(rk)); + + /* + * enable Write Levelling Mode + * (EMRS1 w/ Write Levelling Mode Enable) + */ + dram_init_command(DCMD_MRS1(rk, 0x0082)); + + /* + * set ODT DRAM Full Time Termination + * disable in MCU + */ + + dtr4 = msg_port_read(MEM_CTLR, DTR4); + dtr4_save = dtr4; + dtr4 |= BIT15; + msg_port_write(MEM_CTLR, DTR4, dtr4); + + for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) { + /* + * Enable Sandy Bridge Mode (WDQ Tri-State) & + * Ensure 5 WDQS pulses during Write Leveling + */ + mrc_alt_write_mask(DDRPHY, + DQCTL + (DDRIODQ_BL_OFFSET * bl) + (DDRIODQ_CH_OFFSET * ch), + (BIT28 | BIT8 | BIT6 | BIT4 | BIT2), + (BIT28 | BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2)); + } + + /* Write Leveling Mode enabled in IO */ + mrc_alt_write_mask(DDRPHY, + CCDDR3RESETCTL + (DDRIOCCC_CH_OFFSET * ch), + BIT16, BIT16); + + /* Initialize the starting delay to WCLK */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + /* + * CLK0 --> RK0 + * CLK1 --> RK1 + */ + delay[bl] = get_wclk(ch, rk); + + set_wdqs(ch, rk, bl, delay[bl]); + } + + /* now find the rising edge */ + find_rising_edge(mrc_params, delay, ch, rk, false); + + /* disable Write Levelling Mode */ + mrc_alt_write_mask(DDRPHY, + CCDDR3RESETCTL + (DDRIOCCC_CH_OFFSET * ch), + 0, BIT16); + + for (bl = 0; bl < ((NUM_BYTE_LANES / 
bl_divisor) / 2); bl++) { + /* Disable Sandy Bridge Mode & Ensure 4 WDQS pulses during normal operation */ + mrc_alt_write_mask(DDRPHY, + DQCTL + (DDRIODQ_BL_OFFSET * bl) + (DDRIODQ_CH_OFFSET * ch), + (BIT8 | BIT6 | BIT4 | BIT2), + (BIT28 | BIT9 | BIT8 | BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2)); + } + + /* restore original DTR4 */ + msg_port_write(MEM_CTLR, DTR4, dtr4_save); + + /* + * restore original value + * (Write Levelling Mode Disable) + */ + dram_init_command(DCMD_MRS1(rk, mrc_params->mrs1)); + + /* + * perform a single PRECHARGE_ALL command to + * make DRAM state machine go to IDLE state + */ + dram_init_command(DCMD_PREA(rk)); + + mrc_post_code(0x06, (0x30 + ((ch << 4) | rk))); + + /* + * COARSE WRITE LEVEL: + * check that we're on the correct clock edge + */ + + /* hte reconfiguration request */ + mrc_params->hte_setup = 1; + + /* start CRS_WR_LVL with WDQS = WDQS + 128 PI */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + delay[bl] = get_wdqs(ch, rk, bl) + FULL_CLK; + set_wdqs(ch, rk, bl, delay[bl]); + /* + * program WDQ timings based on WDQS + * (WDQ = WDQS - 32 PI) + */ + set_wdq(ch, rk, bl, (delay[bl] - QRTR_CLK)); + } + + /* get an address in the targeted channel/rank */ + address = get_addr(ch, rk); + do { + uint32_t coarse_result = 0x00; + uint32_t coarse_result_mask = byte_lane_mask(mrc_params); + /* assume pass */ + all_edges_found = true; + + mrc_params->hte_setup = 1; + coarse_result = check_rw_coarse(mrc_params, address); + + /* check for failures and margin the byte lane back 128 PI (1 CLK) */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + if (coarse_result & (coarse_result_mask << bl)) { + all_edges_found = false; + delay[bl] -= FULL_CLK; + set_wdqs(ch, rk, bl, delay[bl]); + /* program WDQ timings based on WDQS (WDQ = WDQS - 32 PI) */ + set_wdq(ch, rk, bl, (delay[bl] - QRTR_CLK)); + } + } + } while (!all_edges_found); + +#ifdef R2R_SHARING + /* increment "num_ranks_enabled" */ + num_ranks_enabled++; + /* accumulate "final_delay[][]" values from "delay[]" values for rolling average */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + final_delay[ch][bl] += delay[bl]; + set_wdqs(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled)); + /* program WDQ timings based on WDQS (WDQ = WDQS - 32 PI) */ + set_wdq(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled) - QRTR_CLK); + } +#endif +#endif + } + } + } + } + + LEAVEFN(); +} + +void prog_page_ctrl(struct mrc_params *mrc_params) +{ + u32 dpmc0; + + ENTERFN(); + + dpmc0 = msg_port_read(MEM_CTLR, DPMC0); + dpmc0 &= ~(BIT16 | BIT17 | BIT18); + dpmc0 |= (4 << 16); + dpmc0 |= BIT21; + msg_port_write(MEM_CTLR, DPMC0, dpmc0); +} + +/* + * This function will perform the READ TRAINING Algorithm on all + * channels/ranks/byte_lanes simultaneously to minimize execution time. + * + * The idea here is to train the VREF and RDQS (and eventually RDQ) values + * to achieve maximum READ margins. The algorithm will first determine the + * X coordinate (RDQS setting). This is done by collapsing the VREF eye + * until we find a minimum required RDQS eye for VREF_MIN and VREF_MAX. + * Then we take the averages of the RDQS eye at VREF_MIN and VREF_MAX, + * then average those; this will be the final X coordinate. The algorithm + * will then determine the Y coordinate (VREF setting). This is done by + * collapsing the RDQS eye until we find a minimum required VREF eye for + * RDQS_MIN and RDQS_MAX. 
Then we take the averages of the VREF eye at + * RDQS_MIN and RDQS_MAX, then average those; this will be the final Y + * coordinate. + * + * NOTE: this algorithm assumes the eye curves have a one-to-one relationship, + * meaning for each X the curve has only one Y and vice-a-versa. + */ +void rd_train(struct mrc_params *mrc_params) +{ + uint8_t ch; /* channel counter */ + uint8_t rk; /* rank counter */ + uint8_t bl; /* byte lane counter */ + uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 2 : 1; +#ifdef BACKUP_RDQS +#else + uint8_t side_x; /* tracks LEFT/RIGHT approach vectors */ + uint8_t side_y; /* tracks BOTTOM/TOP approach vectors */ + /* X coordinate data (passing RDQS values) for approach vectors */ + uint8_t x_coordinate[2][2][NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES]; + /* Y coordinate data (passing VREF values) for approach vectors */ + uint8_t y_coordinate[2][2][NUM_CHANNELS][NUM_BYTE_LANES]; + /* centered X (RDQS) */ + uint8_t x_center[NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES]; + /* centered Y (VREF) */ + uint8_t y_center[NUM_CHANNELS][NUM_BYTE_LANES]; + uint32_t address; /* target address for check_bls_ex() */ + uint32_t result; /* result of check_bls_ex() */ + uint32_t bl_mask; /* byte lane mask for result checking */ +#ifdef R2R_SHARING + /* used to find placement for rank2rank sharing configs */ + uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES]; + /* used to find placement for rank2rank sharing configs */ + uint32_t num_ranks_enabled = 0; +#endif +#endif + + /* rd_train starts */ + mrc_post_code(0x07, 0x00); + + ENTERFN(); + +#ifdef BACKUP_RDQS + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1 << rk)) { + for (bl = 0; + bl < (NUM_BYTE_LANES / bl_divisor); + bl++) { + set_rdqs(ch, rk, bl, ddr_rdqs[PLATFORM_ID]); + } + } + } + } + } +#else + /* initialize x/y_coordinate arrays */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1 << rk)) { + for (bl = 0; + bl < (NUM_BYTE_LANES / bl_divisor); + bl++) { + /* x_coordinate */ + x_coordinate[L][B][ch][rk][bl] = RDQS_MIN; + x_coordinate[R][B][ch][rk][bl] = RDQS_MAX; + x_coordinate[L][T][ch][rk][bl] = RDQS_MIN; + x_coordinate[R][T][ch][rk][bl] = RDQS_MAX; + /* y_coordinate */ + y_coordinate[L][B][ch][bl] = VREF_MIN; + y_coordinate[R][B][ch][bl] = VREF_MIN; + y_coordinate[L][T][ch][bl] = VREF_MAX; + y_coordinate[R][T][ch][bl] = VREF_MAX; + } + } + } + } + } + + /* initialize other variables */ + bl_mask = byte_lane_mask(mrc_params); + address = get_addr(0, 0); + +#ifdef R2R_SHARING + /* need to set "final_delay[][]" elements to "0" */ + memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay)); +#endif + + /* look for passing coordinates */ + for (side_y = B; side_y <= T; side_y++) { + for (side_x = L; side_x <= R; side_x++) { + mrc_post_code(0x07, (0x10 + (side_y * 2) + (side_x))); + + /* find passing values */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (0x1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & + (0x1 << rk)) { + /* set x/y_coordinate search starting settings */ + for (bl = 0; + bl < (NUM_BYTE_LANES / bl_divisor); + bl++) { + set_rdqs(ch, rk, bl, + x_coordinate[side_x][side_y][ch][rk][bl]); + set_vref(ch, bl, + y_coordinate[side_x][side_y][ch][bl]); + } + + /* get an address in the target channel/rank */ + 
address = get_addr(ch, rk); + + /* request HTE reconfiguration */ + mrc_params->hte_setup = 1; + + /* test the settings */ + do { + /* result[07:00] == failing byte lane (MAX 8) */ + result = check_bls_ex(mrc_params, address); + + /* check for failures */ + if (result & 0xFF) { + /* at least 1 byte lane failed */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + if (result & + (bl_mask << bl)) { + /* adjust the RDQS values accordingly */ + if (side_x == L) + x_coordinate[L][side_y][ch][rk][bl] += RDQS_STEP; + else + x_coordinate[R][side_y][ch][rk][bl] -= RDQS_STEP; + + /* check that we haven't closed the RDQS_EYE too much */ + if ((x_coordinate[L][side_y][ch][rk][bl] > (RDQS_MAX - MIN_RDQS_EYE)) || + (x_coordinate[R][side_y][ch][rk][bl] < (RDQS_MIN + MIN_RDQS_EYE)) || + (x_coordinate[L][side_y][ch][rk][bl] == + x_coordinate[R][side_y][ch][rk][bl])) { + /* + * not enough RDQS margin available at this VREF + * update VREF values accordingly + */ + if (side_y == B) + y_coordinate[side_x][B][ch][bl] += VREF_STEP; + else + y_coordinate[side_x][T][ch][bl] -= VREF_STEP; + + /* check that we haven't closed the VREF_EYE too much */ + if ((y_coordinate[side_x][B][ch][bl] > (VREF_MAX - MIN_VREF_EYE)) || + (y_coordinate[side_x][T][ch][bl] < (VREF_MIN + MIN_VREF_EYE)) || + (y_coordinate[side_x][B][ch][bl] == y_coordinate[side_x][T][ch][bl])) { + /* VREF_EYE collapsed below MIN_VREF_EYE */ + training_message(ch, rk, bl); + mrc_post_code(0xEE, (0x70 + (side_y * 2) + (side_x))); + } else { + /* update the VREF setting */ + set_vref(ch, bl, y_coordinate[side_x][side_y][ch][bl]); + /* reset the X coordinate to begin the search at the new VREF */ + x_coordinate[side_x][side_y][ch][rk][bl] = + (side_x == L) ? (RDQS_MIN) : (RDQS_MAX); + } + } + + /* update the RDQS setting */ + set_rdqs(ch, rk, bl, x_coordinate[side_x][side_y][ch][rk][bl]); + } + } + } + } while (result & 0xFF); + } + } + } + } + } + } + + mrc_post_code(0x07, 0x20); + + /* find final RDQS (X coordinate) & final VREF (Y coordinate) */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1 << rk)) { + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + uint32_t temp1; + uint32_t temp2; + + /* x_coordinate */ + DPF(D_INFO, + "RDQS T/B eye rank%d lane%d : %d-%d %d-%d\n", + rk, bl, + x_coordinate[L][T][ch][rk][bl], + x_coordinate[R][T][ch][rk][bl], + x_coordinate[L][B][ch][rk][bl], + x_coordinate[R][B][ch][rk][bl]); + + /* average the TOP side LEFT & RIGHT values */ + temp1 = (x_coordinate[R][T][ch][rk][bl] + x_coordinate[L][T][ch][rk][bl]) / 2; + /* average the BOTTOM side LEFT & RIGHT values */ + temp2 = (x_coordinate[R][B][ch][rk][bl] + x_coordinate[L][B][ch][rk][bl]) / 2; + /* average the above averages */ + x_center[ch][rk][bl] = (uint8_t) ((temp1 + temp2) / 2); + + /* y_coordinate */ + DPF(D_INFO, + "VREF R/L eye lane%d : %d-%d %d-%d\n", + bl, + y_coordinate[R][B][ch][bl], + y_coordinate[R][T][ch][bl], + y_coordinate[L][B][ch][bl], + y_coordinate[L][T][ch][bl]); + + /* average the RIGHT side TOP & BOTTOM values */ + temp1 = (y_coordinate[R][T][ch][bl] + y_coordinate[R][B][ch][bl]) / 2; + /* average the LEFT side TOP & BOTTOM values */ + temp2 = (y_coordinate[L][T][ch][bl] + y_coordinate[L][B][ch][bl]) / 2; + /* average the above averages */ + y_center[ch][bl] = (uint8_t) ((temp1 + temp2) / 2); + } + } + } + } + } + +#ifdef RX_EYE_CHECK + /* perform an eye check */ + for (side_y = B; side_y <= T; side_y++) { + for 
(side_x = L; side_x <= R; side_x++) { + mrc_post_code(0x07, (0x30 + (side_y * 2) + (side_x))); + + /* update the settings for the eye check */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1 << rk)) { + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + if (side_x == L) + set_rdqs(ch, rk, bl, (x_center[ch][rk][bl] - (MIN_RDQS_EYE / 2))); + else + set_rdqs(ch, rk, bl, (x_center[ch][rk][bl] + (MIN_RDQS_EYE / 2))); + + if (side_y == B) + set_vref(ch, bl, (y_center[ch][bl] - (MIN_VREF_EYE / 2))); + else + set_vref(ch, bl, (y_center[ch][bl] + (MIN_VREF_EYE / 2))); + } + } + } + } + } + + /* request HTE reconfiguration */ + mrc_params->hte_setup = 1; + + /* check the eye */ + if (check_bls_ex(mrc_params, address) & 0xFF) { + /* one or more byte lanes failed */ + mrc_post_code(0xEE, (0x74 + (side_x * 2) + (side_y))); + } + } + } +#endif + + mrc_post_code(0x07, 0x40); + + /* set final placements */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1 << rk)) { +#ifdef R2R_SHARING + /* increment "num_ranks_enabled" */ + num_ranks_enabled++; +#endif + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + /* x_coordinate */ +#ifdef R2R_SHARING + final_delay[ch][bl] += x_center[ch][rk][bl]; + set_rdqs(ch, rk, bl, ((final_delay[ch][bl]) / num_ranks_enabled)); +#else + set_rdqs(ch, rk, bl, x_center[ch][rk][bl]); +#endif + /* y_coordinate */ + set_vref(ch, bl, y_center[ch][bl]); + } + } + } + } + } +#endif + + LEAVEFN(); +} + +/* + * This function will perform the WRITE TRAINING Algorithm on all + * channels/ranks/byte_lanes simultaneously to minimize execution time. + * + * The idea here is to train the WDQ timings to achieve maximum WRITE margins. + * The algorithm will start with WDQ at the current WDQ setting (tracks WDQS + * in WR_LVL) +/- 32 PIs (+/- 1/4 CLK) and collapse the eye until all data + * patterns pass. This is because WDQS will be aligned to WCLK by the + * Write Leveling algorithm and WDQ will only ever have a 1/2 CLK window + * of validity. + */ +void wr_train(struct mrc_params *mrc_params) +{ + uint8_t ch; /* channel counter */ + uint8_t rk; /* rank counter */ + uint8_t bl; /* byte lane counter */ + uint8_t bl_divisor = (mrc_params->channel_width == X16) ? 
2 : 1; +#ifdef BACKUP_WDQ +#else + uint8_t side; /* LEFT/RIGHT side indicator (0=L, 1=R) */ + uint32_t temp; /* temporary DWORD */ + /* 2 arrays, for L & R side passing delays */ + uint32_t delay[2][NUM_CHANNELS][NUM_RANKS][NUM_BYTE_LANES]; + uint32_t address; /* target address for check_bls_ex() */ + uint32_t result; /* result of check_bls_ex() */ + uint32_t bl_mask; /* byte lane mask for result checking */ +#ifdef R2R_SHARING + /* used to find placement for rank2rank sharing configs */ + uint32_t final_delay[NUM_CHANNELS][NUM_BYTE_LANES]; + /* used to find placement for rank2rank sharing configs */ + uint32_t num_ranks_enabled = 0; +#endif +#endif + + /* wr_train starts */ + mrc_post_code(0x08, 0x00); + + ENTERFN(); + +#ifdef BACKUP_WDQ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1 << rk)) { + for (bl = 0; + bl < (NUM_BYTE_LANES / bl_divisor); + bl++) { + set_wdq(ch, rk, bl, ddr_wdq[PLATFORM_ID]); + } + } + } + } + } +#else + /* initialize "delay" */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1 << rk)) { + for (bl = 0; + bl < (NUM_BYTE_LANES / bl_divisor); + bl++) { + /* + * want to start with + * WDQ = (WDQS - QRTR_CLK) + * +/- QRTR_CLK + */ + temp = get_wdqs(ch, rk, bl) - QRTR_CLK; + delay[L][ch][rk][bl] = temp - QRTR_CLK; + delay[R][ch][rk][bl] = temp + QRTR_CLK; + } + } + } + } + } + + /* initialize other variables */ + bl_mask = byte_lane_mask(mrc_params); + address = get_addr(0, 0); + +#ifdef R2R_SHARING + /* need to set "final_delay[][]" elements to "0" */ + memset((void *)(final_delay), 0x00, (size_t)sizeof(final_delay)); +#endif + + /* + * start algorithm on the LEFT side and train each channel/bl + * until no failures are observed, then repeat for the RIGHT side. 
+ */ + for (side = L; side <= R; side++) { + mrc_post_code(0x08, (0x10 + (side))); + + /* set starting values */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & + (1 << rk)) { + for (bl = 0; + bl < (NUM_BYTE_LANES / bl_divisor); + bl++) { + set_wdq(ch, rk, bl, delay[side][ch][rk][bl]); + } + } + } + } + } + + /* find passing values */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & + (1 << rk)) { + /* get an address in the target channel/rank */ + address = get_addr(ch, rk); + + /* request HTE reconfiguration */ + mrc_params->hte_setup = 1; + + /* check the settings */ + do { + /* result[07:00] == failing byte lane (MAX 8) */ + result = check_bls_ex(mrc_params, address); + /* check for failures */ + if (result & 0xFF) { + /* at least 1 byte lane failed */ + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + if (result & + (bl_mask << bl)) { + if (side == L) + delay[L][ch][rk][bl] += WDQ_STEP; + else + delay[R][ch][rk][bl] -= WDQ_STEP; + + /* check for algorithm failure */ + if (delay[L][ch][rk][bl] != delay[R][ch][rk][bl]) { + /* + * margin available + * update delay setting + */ + set_wdq(ch, rk, bl, + delay[side][ch][rk][bl]); + } else { + /* + * no margin available + * notify the user and halt + */ + training_message(ch, rk, bl); + mrc_post_code(0xEE, (0x80 + side)); + } + } + } + } + /* stop when all byte lanes pass */ + } while (result & 0xFF); + } + } + } + } + } + + /* program WDQ to the middle of passing window */ + for (ch = 0; ch < NUM_CHANNELS; ch++) { + if (mrc_params->channel_enables & (1 << ch)) { + for (rk = 0; rk < NUM_RANKS; rk++) { + if (mrc_params->rank_enables & (1 << rk)) { +#ifdef R2R_SHARING + /* increment "num_ranks_enabled" */ + num_ranks_enabled++; +#endif + for (bl = 0; bl < (NUM_BYTE_LANES / bl_divisor); bl++) { + DPF(D_INFO, + "WDQ eye rank%d lane%d : %d-%d\n", + rk, bl, + delay[L][ch][rk][bl], + delay[R][ch][rk][bl]); + + temp = (delay[R][ch][rk][bl] + delay[L][ch][rk][bl]) / 2; + +#ifdef R2R_SHARING + final_delay[ch][bl] += temp; + set_wdq(ch, rk, bl, + ((final_delay[ch][bl]) / num_ranks_enabled)); +#else + set_wdq(ch, rk, bl, temp); +#endif + } + } + } + } + } +#endif + + LEAVEFN(); +} + +/* + * This function will store relevant timing data + * + * This data will be used on subsequent boots to speed up boot times + * and is required for Suspend To RAM capabilities. + */ +void store_timings(struct mrc_params *mrc_params) +{ + uint8_t ch, rk, bl; + struct mrc_timings *mt = &mrc_params->timings; + + for (ch = 0; ch < NUM_CHANNELS; ch++) { + for (rk = 0; rk < NUM_RANKS; rk++) { + for (bl = 0; bl < NUM_BYTE_LANES; bl++) { + mt->rcvn[ch][rk][bl] = get_rcvn(ch, rk, bl); + mt->rdqs[ch][rk][bl] = get_rdqs(ch, rk, bl); + mt->wdqs[ch][rk][bl] = get_wdqs(ch, rk, bl); + mt->wdq[ch][rk][bl] = get_wdq(ch, rk, bl); + + if (rk == 0) + mt->vref[ch][bl] = get_vref(ch, bl); + } + + mt->wctl[ch][rk] = get_wctl(ch, rk); + } + + mt->wcmd[ch] = get_wcmd(ch); + } + + /* need to save for a case of changing frequency after warm reset */ + mt->ddr_speed = mrc_params->ddr_speed; +} + +/* + * The purpose of this function is to ensure the SEC comes out of reset + * and IA initiates the SEC enabling Memory Scrambling. 
+ */ +void enable_scrambling(struct mrc_params *mrc_params) +{ + uint32_t lfsr = 0; + uint8_t i; + + if (mrc_params->scrambling_enables == 0) + return; + + ENTERFN(); + + /* 32 bit seed is always stored in BIOS NVM */ + lfsr = mrc_params->timings.scrambler_seed; + + if (mrc_params->boot_mode == BM_COLD) { + /* + * factory value is 0 and in first boot, + * a clock based seed is loaded. + */ + if (lfsr == 0) { + /* + * get seed from system clock + * and make sure it is not all 1's + */ + lfsr = rdtsc() & 0x0FFFFFFF; + } else { + /* + * Need to replace scrambler + * + * get next 32bit LFSR 16 times which is the last + * part of the previous scrambler vector + */ + for (i = 0; i < 16; i++) + lfsr32(&lfsr); + } + + /* save new seed */ + mrc_params->timings.scrambler_seed = lfsr; + } + + /* + * In warm boot or S3 exit, we have the previous seed. + * In cold boot, we have the last 32bit LFSR which is the new seed. + */ + lfsr32(&lfsr); /* shift to next value */ + msg_port_write(MEM_CTLR, SCRMSEED, (lfsr & 0x0003FFFF)); + + for (i = 0; i < 2; i++) + msg_port_write(MEM_CTLR, SCRMLO + i, (lfsr & 0xAAAAAAAA)); + + LEAVEFN(); +} + +/* + * Configure MCU Power Management Control Register + * and Scheduler Control Register + */ +void prog_ddr_control(struct mrc_params *mrc_params) +{ + u32 dsch; + u32 dpmc0; + + ENTERFN(); + + dsch = msg_port_read(MEM_CTLR, DSCH); + dsch &= ~(BIT8 | BIT9 | BIT12); + msg_port_write(MEM_CTLR, DSCH, dsch); + + dpmc0 = msg_port_read(MEM_CTLR, DPMC0); + dpmc0 &= ~BIT25; + dpmc0 |= (mrc_params->power_down_disable << 25); + dpmc0 &= ~BIT24; + dpmc0 &= ~(BIT16 | BIT17 | BIT18); + dpmc0 |= (4 << 16); + dpmc0 |= BIT21; + msg_port_write(MEM_CTLR, DPMC0, dpmc0); + + /* CMDTRIST = 2h - CMD/ADDR are tristated when no valid command */ + mrc_write_mask(MEM_CTLR, DPMC1, 2 << 4, BIT4 | BIT5); + + LEAVEFN(); +} + +/* + * After training complete configure MCU Rank Population Register + * specifying: ranks enabled, device width, density, address mode + */ +void prog_dra_drb(struct mrc_params *mrc_params) +{ + u32 drp; + u32 dco; + u8 density = mrc_params->params.density; + + ENTERFN(); + + dco = msg_port_read(MEM_CTLR, DCO); + dco &= ~BIT31; + msg_port_write(MEM_CTLR, DCO, dco); + + drp = 0; + if (mrc_params->rank_enables & 1) + drp |= BIT0; + if (mrc_params->rank_enables & 2) + drp |= BIT1; + if (mrc_params->dram_width == X16) { + drp |= (1 << 4); + drp |= (1 << 9); + } + + /* + * Density encoding in struct dram_params: 0=512Mb, 1=Gb, 2=2Gb, 3=4Gb + * has to be mapped RANKDENSx encoding (0=1Gb) + */ + if (density == 0) + density = 4; + + drp |= ((density - 1) << 6); + drp |= ((density - 1) << 11); + + /* Address mode can be overwritten if ECC enabled */ + drp |= (mrc_params->address_mode << 14); + + msg_port_write(MEM_CTLR, DRP, drp); + + dco &= ~BIT28; + dco |= BIT31; + msg_port_write(MEM_CTLR, DCO, dco); + + LEAVEFN(); +} + +/* Send DRAM wake command */ +void perform_wake(struct mrc_params *mrc_params) +{ + ENTERFN(); + + dram_wake_command(); + + LEAVEFN(); +} + +/* + * Configure refresh rate and short ZQ calibration interval + * Activate dynamic self refresh + */ +void change_refresh_period(struct mrc_params *mrc_params) +{ + u32 drfc; + u32 dcal; + u32 dpmc0; + + ENTERFN(); + + drfc = msg_port_read(MEM_CTLR, DRFC); + drfc &= ~(BIT12 | BIT13 | BIT14); + drfc |= (mrc_params->refresh_rate << 12); + drfc |= BIT21; + msg_port_write(MEM_CTLR, DRFC, drfc); + + dcal = msg_port_read(MEM_CTLR, DCAL); + dcal &= ~(BIT8 | BIT9 | BIT10); + dcal |= (3 << 8); /* 63ms */ + msg_port_write(MEM_CTLR, DCAL, 
dcal); + + dpmc0 = msg_port_read(MEM_CTLR, DPMC0); + dpmc0 |= (BIT23 | BIT29); + msg_port_write(MEM_CTLR, DPMC0, dpmc0); + + LEAVEFN(); +} + +/* + * Configure DDRPHY for Auto-Refresh, Periodic Compensations, + * Dynamic Diff-Amp, ZQSPERIOD, Auto-Precharge, CKE Power-Down + */ +void set_auto_refresh(struct mrc_params *mrc_params) +{ + uint32_t channel; + uint32_t rank; + uint32_t bl; + uint32_t bl_divisor = 1; + uint32_t temp; + + ENTERFN(); + + /* + * Enable Auto-Refresh, Periodic Compensations, Dynamic Diff-Amp, + * ZQSPERIOD, Auto-Precharge, CKE Power-Down + */ + for (channel = 0; channel < NUM_CHANNELS; channel++) { + if (mrc_params->channel_enables & (1 << channel)) { + /* Enable Periodic RCOMPS */ + mrc_alt_write_mask(DDRPHY, CMPCTRL, BIT1, BIT1); + + /* Enable Dynamic DiffAmp & Set Read ODT Value */ + switch (mrc_params->rd_odt_value) { + case 0: + temp = 0x3F; /* OFF */ + break; + default: + temp = 0x00; /* Auto */ + break; + } + + for (bl = 0; bl < ((NUM_BYTE_LANES / bl_divisor) / 2); bl++) { + /* Override: DIFFAMP, ODT */ + mrc_alt_write_mask(DDRPHY, + (B0OVRCTL + (bl * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET)), + (0x00 << 16) | (temp << 10), + (BIT21 | BIT20 | BIT19 | BIT18 | + BIT17 | BIT16 | BIT15 | BIT14 | + BIT13 | BIT12 | BIT11 | BIT10)); + + /* Override: DIFFAMP, ODT */ + mrc_alt_write_mask(DDRPHY, + (B1OVRCTL + (bl * DDRIODQ_BL_OFFSET) + + (channel * DDRIODQ_CH_OFFSET)), + (0x00 << 16) | (temp << 10), + (BIT21 | BIT20 | BIT19 | BIT18 | + BIT17 | BIT16 | BIT15 | BIT14 | + BIT13 | BIT12 | BIT11 | BIT10)); + } + + /* Issue ZQCS command */ + for (rank = 0; rank < NUM_RANKS; rank++) { + if (mrc_params->rank_enables & (1 << rank)) + dram_init_command(DCMD_ZQCS(rank)); + } + } + } + + clear_pointers(); + + LEAVEFN(); +} + +/* + * Depending on configuration enables ECC support + * + * Available memory size is decreased, and updated with 0s + * in order to clear error status. Address mode 2 forced. + */ +void ecc_enable(struct mrc_params *mrc_params) +{ + u32 drp; + u32 dsch; + u32 ecc_ctrl; + + if (mrc_params->ecc_enables == 0) + return; + + ENTERFN(); + + /* Configuration required in ECC mode */ + drp = msg_port_read(MEM_CTLR, DRP); + drp &= ~(BIT14 | BIT15); + drp |= BIT15; + drp |= BIT13; + msg_port_write(MEM_CTLR, DRP, drp); + + /* Disable new request bypass */ + dsch = msg_port_read(MEM_CTLR, DSCH); + dsch |= BIT12; + msg_port_write(MEM_CTLR, DSCH, dsch); + + /* Enable ECC */ + ecc_ctrl = (BIT0 | BIT1 | BIT17); + msg_port_write(MEM_CTLR, DECCCTRL, ecc_ctrl); + + /* Assume 8 bank memory, one bank is gone for ECC */ + mrc_params->mem_size -= mrc_params->mem_size / 8; + + /* For S3 resume memory content has to be preserved */ + if (mrc_params->boot_mode != BM_S3) { + select_hte(); + hte_mem_init(mrc_params, MRC_MEM_INIT); + select_mem_mgr(); + } + + LEAVEFN(); +} + +/* + * Execute memory test + * if error detected it is indicated in mrc_params->status + */ +void memory_test(struct mrc_params *mrc_params) +{ + uint32_t result = 0; + + ENTERFN(); + + select_hte(); + result = hte_mem_init(mrc_params, MRC_MEM_TEST); + select_mem_mgr(); + + DPF(D_INFO, "Memory test result %x\n", result); + mrc_params->status = ((result == 0) ? 
MRC_SUCCESS : MRC_E_MEMTEST); + LEAVEFN(); +} + +/* Lock MCU registers at the end of initialization sequence */ +void lock_registers(struct mrc_params *mrc_params) +{ + u32 dco; + + ENTERFN(); + + dco = msg_port_read(MEM_CTLR, DCO); + dco &= ~(BIT28 | BIT29); + dco |= (BIT0 | BIT8); + msg_port_write(MEM_CTLR, DCO, dco); + + LEAVEFN(); +} diff --git a/arch/x86/cpu/quark/smc.h b/arch/x86/cpu/quark/smc.h new file mode 100644 index 0000000..46017a1 --- /dev/null +++ b/arch/x86/cpu/quark/smc.h @@ -0,0 +1,446 @@ +/* + * Copyright (C) 2013, Intel Corporation + * Copyright (C) 2015, Bin Meng bmeng.cn@gmail.com + * + * Ported from Intel released Quark UEFI BIOS + * QuarkSocPkg/QuarkNorthCluster/MemoryInit/Pei + * + * SPDX-License-Identifier: Intel + */ + +#ifndef _SMC_H_ +#define _SMC_H_ + +/* System Memory Controller Register Defines */ + +/* Memory Controller Message Bus Registers Offsets */ +#define DRP 0x00 +#define DTR0 0x01 +#define DTR1 0x02 +#define DTR2 0x03 +#define DTR3 0x04 +#define DTR4 0x05 +#define DPMC0 0x06 +#define DPMC1 0x07 +#define DRFC 0x08 +#define DSCH 0x09 +#define DCAL 0x0A +#define DRMC 0x0B +#define PMSTS 0x0C +#define DCO 0x0F +#define DSTAT 0x20 +#define SSKPD0 0x4A +#define SSKPD1 0x4B +#define DECCCTRL 0x60 +#define DECCSTAT 0x61 +#define DECCSBECNT 0x62 +#define DECCSBECA 0x68 +#define DECCSBECS 0x69 +#define DECCDBECA 0x6A +#define DECCDBECS 0x6B +#define DFUSESTAT 0x70 +#define SCRMSEED 0x80 +#define SCRMLO 0x81 +#define SCRMHI 0x82 + +/* DRAM init command */ +#define DCMD_MRS1(rnk, dat) (0 | ((rnk) << 22) | (1 << 3) | ((dat) << 6)) +#define DCMD_REF(rnk) (1 | ((rnk) << 22)) +#define DCMD_PRE(rnk) (2 | ((rnk) << 22)) +#define DCMD_PREA(rnk) (2 | ((rnk) << 22) | (BIT10 << 6)) +#define DCMD_ACT(rnk, row) (3 | ((rnk) << 22) | ((row) << 6)) +#define DCMD_WR(rnk, col) (4 | ((rnk) << 22) | ((col) << 6)) +#define DCMD_RD(rnk, col) (5 | ((rnk) << 22) | ((col) << 6)) +#define DCMD_ZQCS(rnk) (6 | ((rnk) << 22)) +#define DCMD_ZQCL(rnk) (6 | ((rnk) << 22) | (BIT10 << 6)) +#define DCMD_NOP(rnk) (7 | ((rnk) << 22)) + +#define DDR3_EMRS1_DIC_40 (0) +#define DDR3_EMRS1_DIC_34 (1) + +#define DDR3_EMRS1_RTTNOM_0 (0) +#define DDR3_EMRS1_RTTNOM_60 (0x04) +#define DDR3_EMRS1_RTTNOM_120 (0x40) +#define DDR3_EMRS1_RTTNOM_40 (0x44) +#define DDR3_EMRS1_RTTNOM_20 (0x200) +#define DDR3_EMRS1_RTTNOM_30 (0x204) + +#define DDR3_EMRS2_RTTWR_60 (1 << 9) +#define DDR3_EMRS2_RTTWR_120 (1 << 10) + +/* BEGIN DDRIO Registers */ + +/* DDR IOs & COMPs */ +#define DDRIODQ_BL_OFFSET 0x0800 +#define DDRIODQ_CH_OFFSET ((NUM_BYTE_LANES / 2) * DDRIODQ_BL_OFFSET) +#define DDRIOCCC_CH_OFFSET 0x0800 +#define DDRCOMP_CH_OFFSET 0x0100 + +/* CH0-BL01-DQ */ +#define DQOBSCKEBBCTL 0x0000 +#define DQDLLTXCTL 0x0004 +#define DQDLLRXCTL 0x0008 +#define DQMDLLCTL 0x000C +#define B0RXIOBUFCTL 0x0010 +#define B0VREFCTL 0x0014 +#define B0RXOFFSET1 0x0018 +#define B0RXOFFSET0 0x001C +#define B1RXIOBUFCTL 0x0020 +#define B1VREFCTL 0x0024 +#define B1RXOFFSET1 0x0028 +#define B1RXOFFSET0 0x002C +#define DQDFTCTL 0x0030 +#define DQTRAINSTS 0x0034 +#define B1DLLPICODER0 0x0038 +#define B0DLLPICODER0 0x003C +#define B1DLLPICODER1 0x0040 +#define B0DLLPICODER1 0x0044 +#define B1DLLPICODER2 0x0048 +#define B0DLLPICODER2 0x004C +#define B1DLLPICODER3 0x0050 +#define B0DLLPICODER3 0x0054 +#define B1RXDQSPICODE 0x0058 +#define B0RXDQSPICODE 0x005C +#define B1RXDQPICODER32 0x0060 +#define B1RXDQPICODER10 0x0064 +#define B0RXDQPICODER32 0x0068 +#define B0RXDQPICODER10 0x006C +#define B01PTRCTL0 0x0070 +#define B01PTRCTL1 0x0074 +#define 
B01DBCTL0 0x0078 +#define B01DBCTL1 0x007C +#define B0LATCTL0 0x0080 +#define B1LATCTL0 0x0084 +#define B01LATCTL1 0x0088 +#define B0ONDURCTL 0x008C +#define B1ONDURCTL 0x0090 +#define B0OVRCTL 0x0094 +#define B1OVRCTL 0x0098 +#define DQCTL 0x009C +#define B0RK2RKCHGPTRCTRL 0x00A0 +#define B1RK2RKCHGPTRCTRL 0x00A4 +#define DQRK2RKCTL 0x00A8 +#define DQRK2RKPTRCTL 0x00AC +#define B0RK2RKLAT 0x00B0 +#define B1RK2RKLAT 0x00B4 +#define DQCLKALIGNREG0 0x00B8 +#define DQCLKALIGNREG1 0x00BC +#define DQCLKALIGNREG2 0x00C0 +#define DQCLKALIGNSTS0 0x00C4 +#define DQCLKALIGNSTS1 0x00C8 +#define DQCLKGATE 0x00CC +#define B0COMPSLV1 0x00D0 +#define B1COMPSLV1 0x00D4 +#define B0COMPSLV2 0x00D8 +#define B1COMPSLV2 0x00DC +#define B0COMPSLV3 0x00E0 +#define B1COMPSLV3 0x00E4 +#define DQVISALANECR0TOP 0x00E8 +#define DQVISALANECR1TOP 0x00EC +#define DQVISACONTROLCRTOP 0x00F0 +#define DQVISALANECR0BL 0x00F4 +#define DQVISALANECR1BL 0x00F8 +#define DQVISACONTROLCRBL 0x00FC +#define DQTIMINGCTRL 0x010C + +/* CH0-ECC */ +#define ECCDLLTXCTL 0x2004 +#define ECCDLLRXCTL 0x2008 +#define ECCMDLLCTL 0x200C +#define ECCB1DLLPICODER0 0x2038 +#define ECCB1DLLPICODER1 0x2040 +#define ECCB1DLLPICODER2 0x2048 +#define ECCB1DLLPICODER3 0x2050 +#define ECCB01DBCTL0 0x2078 +#define ECCB01DBCTL1 0x207C +#define ECCCLKALIGNREG0 0x20B8 +#define ECCCLKALIGNREG1 0x20BC +#define ECCCLKALIGNREG2 0x20C0 + +/* CH0-CMD */ +#define CMDOBSCKEBBCTL 0x4800 +#define CMDDLLTXCTL 0x4808 +#define CMDDLLRXCTL 0x480C +#define CMDMDLLCTL 0x4810 +#define CMDRCOMPODT 0x4814 +#define CMDDLLPICODER0 0x4820 +#define CMDDLLPICODER1 0x4824 +#define CMDCFGREG0 0x4840 +#define CMDPTRREG 0x4844 +#define CMDCLKALIGNREG0 0x4850 +#define CMDCLKALIGNREG1 0x4854 +#define CMDCLKALIGNREG2 0x4858 +#define CMDPMCONFIG0 0x485C +#define CMDPMDLYREG0 0x4860 +#define CMDPMDLYREG1 0x4864 +#define CMDPMDLYREG2 0x4868 +#define CMDPMDLYREG3 0x486C +#define CMDPMDLYREG4 0x4870 +#define CMDCLKALIGNSTS0 0x4874 +#define CMDCLKALIGNSTS1 0x4878 +#define CMDPMSTS0 0x487C +#define CMDPMSTS1 0x4880 +#define CMDCOMPSLV 0x4884 +#define CMDBONUS0 0x488C +#define CMDBONUS1 0x4890 +#define CMDVISALANECR0 0x4894 +#define CMDVISALANECR1 0x4898 +#define CMDVISACONTROLCR 0x489C +#define CMDCLKGATE 0x48A0 +#define CMDTIMINGCTRL 0x48A4 + +/* CH0-CLK-CTL */ +#define CCOBSCKEBBCTL 0x5800 +#define CCRCOMPIO 0x5804 +#define CCDLLTXCTL 0x5808 +#define CCDLLRXCTL 0x580C +#define CCMDLLCTL 0x5810 +#define CCRCOMPODT 0x5814 +#define CCDLLPICODER0 0x5820 +#define CCDLLPICODER1 0x5824 +#define CCDDR3RESETCTL 0x5830 +#define CCCFGREG0 0x5838 +#define CCCFGREG1 0x5840 +#define CCPTRREG 0x5844 +#define CCCLKALIGNREG0 0x5850 +#define CCCLKALIGNREG1 0x5854 +#define CCCLKALIGNREG2 0x5858 +#define CCPMCONFIG0 0x585C +#define CCPMDLYREG0 0x5860 +#define CCPMDLYREG1 0x5864 +#define CCPMDLYREG2 0x5868 +#define CCPMDLYREG3 0x586C +#define CCPMDLYREG4 0x5870 +#define CCCLKALIGNSTS0 0x5874 +#define CCCLKALIGNSTS1 0x5878 +#define CCPMSTS0 0x587C +#define CCPMSTS1 0x5880 +#define CCCOMPSLV1 0x5884 +#define CCCOMPSLV2 0x5888 +#define CCCOMPSLV3 0x588C +#define CCBONUS0 0x5894 +#define CCBONUS1 0x5898 +#define CCVISALANECR0 0x589C +#define CCVISALANECR1 0x58A0 +#define CCVISACONTROLCR 0x58A4 +#define CCCLKGATE 0x58A8 +#define CCTIMINGCTL 0x58AC + +/* COMP */ +#define CMPCTRL 0x6800 +#define SOFTRSTCNTL 0x6804 +#define MSCNTR 0x6808 +#define NMSCNTRL 0x680C +#define LATCH1CTL 0x6814 +#define COMPVISALANECR0 0x681C +#define COMPVISALANECR1 0x6820 +#define COMPVISACONTROLCR 0x6824 +#define COMPBONUS0 0x6830 +#define 
TCOCNTCTRL 0x683C +#define DQANAODTPUCTL 0x6840 +#define DQANAODTPDCTL 0x6844 +#define DQANADRVPUCTL 0x6848 +#define DQANADRVPDCTL 0x684C +#define DQANADLYPUCTL 0x6850 +#define DQANADLYPDCTL 0x6854 +#define DQANATCOPUCTL 0x6858 +#define DQANATCOPDCTL 0x685C +#define CMDANADRVPUCTL 0x6868 +#define CMDANADRVPDCTL 0x686C +#define CMDANADLYPUCTL 0x6870 +#define CMDANADLYPDCTL 0x6874 +#define CLKANAODTPUCTL 0x6880 +#define CLKANAODTPDCTL 0x6884 +#define CLKANADRVPUCTL 0x6888 +#define CLKANADRVPDCTL 0x688C +#define CLKANADLYPUCTL 0x6890 +#define CLKANADLYPDCTL 0x6894 +#define CLKANATCOPUCTL 0x6898 +#define CLKANATCOPDCTL 0x689C +#define DQSANAODTPUCTL 0x68A0 +#define DQSANAODTPDCTL 0x68A4 +#define DQSANADRVPUCTL 0x68A8 +#define DQSANADRVPDCTL 0x68AC +#define DQSANADLYPUCTL 0x68B0 +#define DQSANADLYPDCTL 0x68B4 +#define DQSANATCOPUCTL 0x68B8 +#define DQSANATCOPDCTL 0x68BC +#define CTLANADRVPUCTL 0x68C8 +#define CTLANADRVPDCTL 0x68CC +#define CTLANADLYPUCTL 0x68D0 +#define CTLANADLYPDCTL 0x68D4 +#define CHNLBUFSTATIC 0x68F0 +#define COMPOBSCNTRL 0x68F4 +#define COMPBUFFDBG0 0x68F8 +#define COMPBUFFDBG1 0x68FC +#define CFGMISCCH0 0x6900 +#define COMPEN0CH0 0x6904 +#define COMPEN1CH0 0x6908 +#define COMPEN2CH0 0x690C +#define STATLEGEN0CH0 0x6910 +#define STATLEGEN1CH0 0x6914 +#define DQVREFCH0 0x6918 +#define CMDVREFCH0 0x691C +#define CLKVREFCH0 0x6920 +#define DQSVREFCH0 0x6924 +#define CTLVREFCH0 0x6928 +#define TCOVREFCH0 0x692C +#define DLYSELCH0 0x6930 +#define TCODRAMBUFODTCH0 0x6934 +#define CCBUFODTCH0 0x6938 +#define RXOFFSETCH0 0x693C +#define DQODTPUCTLCH0 0x6940 +#define DQODTPDCTLCH0 0x6944 +#define DQDRVPUCTLCH0 0x6948 +#define DQDRVPDCTLCH0 0x694C +#define DQDLYPUCTLCH0 0x6950 +#define DQDLYPDCTLCH0 0x6954 +#define DQTCOPUCTLCH0 0x6958 +#define DQTCOPDCTLCH0 0x695C +#define CMDDRVPUCTLCH0 0x6968 +#define CMDDRVPDCTLCH0 0x696C +#define CMDDLYPUCTLCH0 0x6970 +#define CMDDLYPDCTLCH0 0x6974 +#define CLKODTPUCTLCH0 0x6980 +#define CLKODTPDCTLCH0 0x6984 +#define CLKDRVPUCTLCH0 0x6988 +#define CLKDRVPDCTLCH0 0x698C +#define CLKDLYPUCTLCH0 0x6990 +#define CLKDLYPDCTLCH0 0x6994 +#define CLKTCOPUCTLCH0 0x6998 +#define CLKTCOPDCTLCH0 0x699C +#define DQSODTPUCTLCH0 0x69A0 +#define DQSODTPDCTLCH0 0x69A4 +#define DQSDRVPUCTLCH0 0x69A8 +#define DQSDRVPDCTLCH0 0x69AC +#define DQSDLYPUCTLCH0 0x69B0 +#define DQSDLYPDCTLCH0 0x69B4 +#define DQSTCOPUCTLCH0 0x69B8 +#define DQSTCOPDCTLCH0 0x69BC +#define CTLDRVPUCTLCH0 0x69C8 +#define CTLDRVPDCTLCH0 0x69CC +#define CTLDLYPUCTLCH0 0x69D0 +#define CTLDLYPDCTLCH0 0x69D4 +#define FNLUPDTCTLCH0 0x69F0 + +/* PLL */ +#define MPLLCTRL0 0x7800 +#define MPLLCTRL1 0x7808 +#define MPLLCSR0 0x7810 +#define MPLLCSR1 0x7814 +#define MPLLCSR2 0x7820 +#define MPLLDFT 0x7828 +#define MPLLMON0CTL 0x7830 +#define MPLLMON1CTL 0x7838 +#define MPLLMON2CTL 0x783C +#define SFRTRIM 0x7850 +#define MPLLDFTOUT0 0x7858 +#define MPLLDFTOUT1 0x785C +#define MASTERRSTN 0x7880 +#define PLLLOCKDEL 0x7884 +#define SFRDEL 0x7888 +#define CRUVISALANECR0 0x78F0 +#define CRUVISALANECR1 0x78F4 +#define CRUVISACONTROLCR 0x78F8 +#define IOSFVISALANECR0 0x78FC +#define IOSFVISALANECR1 0x7900 +#define IOSFVISACONTROLCR 0x7904 + +/* END DDRIO Registers */ + +/* DRAM Specific Message Bus OpCodes */ +#define MSG_OP_DRAM_INIT 0x68 +#define MSG_OP_DRAM_WAKE 0xCA + +#define SAMPLE_SIZE 6 + +/* must be less than this number to enable early deadband */ +#define EARLY_DB 0x12 +/* must be greater than this number to enable late deadband */ +#define LATE_DB 0x34 + +#define CHX_REGS (11 * 4) +#define FULL_CLK 
128 +#define HALF_CLK 64 +#define QRTR_CLK 32 + +#define MCEIL(num, den) ((uint8_t)((num + den - 1) / den)) +#define MMAX(a, b) ((a) > (b) ? (a) : (b)) +#define DEAD_LOOP() for (;;); + +#define MIN_RDQS_EYE 10 /* in PI Codes */ +#define MIN_VREF_EYE 10 /* in VREF Codes */ +/* how many RDQS codes to jump while margining */ +#define RDQS_STEP 1 +/* how many VREF codes to jump while margining */ +#define VREF_STEP 1 +/* offset into "vref_codes[]" for minimum allowed VREF setting */ +#define VREF_MIN 0x00 +/* offset into "vref_codes[]" for maximum allowed VREF setting */ +#define VREF_MAX 0x3F +#define RDQS_MIN 0x00 /* minimum RDQS delay value */ +#define RDQS_MAX 0x3F /* maximum RDQS delay value */ + +/* how many WDQ codes to jump while margining */ +#define WDQ_STEP 1 + +enum { + B, /* BOTTOM VREF */ + T /* TOP VREF */ +}; + +enum { + L, /* LEFT RDQS */ + R /* RIGHT RDQS */ +}; + +/* Memory Options */ + +/* enable STATIC timing settings for RCVN (BACKUP_MODE) */ +#undef BACKUP_RCVN +/* enable STATIC timing settings for WDQS (BACKUP_MODE) */ +#undef BACKUP_WDQS +/* enable STATIC timing settings for RDQS (BACKUP_MODE) */ +#undef BACKUP_RDQS +/* enable STATIC timing settings for WDQ (BACKUP_MODE) */ +#undef BACKUP_WDQ +/* enable *COMP overrides (BACKUP_MODE) */ +#undef BACKUP_COMPS +/* enable the RD_TRAIN eye check */ +#undef RX_EYE_CHECK + +/* enable Host to Memory Clock Alignment */ +#define HMC_TEST +/* enable multi-rank support via rank2rank sharing */ +#define R2R_SHARING +/* disable signals not used in 16bit mode of DDRIO */ +#define FORCE_16BIT_DDRIO + +#define PLATFORM_ID 1 + +void clear_self_refresh(struct mrc_params *mrc_params); +void prog_ddr_timing_control(struct mrc_params *mrc_params); +void prog_decode_before_jedec(struct mrc_params *mrc_params); +void perform_ddr_reset(struct mrc_params *mrc_params); +void ddrphy_init(struct mrc_params *mrc_params); +void perform_jedec_init(struct mrc_params *mrc_params); +void set_ddr_init_complete(struct mrc_params *mrc_params); +void restore_timings(struct mrc_params *mrc_params); +void default_timings(struct mrc_params *mrc_params); +void rcvn_cal(struct mrc_params *mrc_params); +void wr_level(struct mrc_params *mrc_params); +void prog_page_ctrl(struct mrc_params *mrc_params); +void rd_train(struct mrc_params *mrc_params); +void wr_train(struct mrc_params *mrc_params); +void store_timings(struct mrc_params *mrc_params); +void enable_scrambling(struct mrc_params *mrc_params); +void prog_ddr_control(struct mrc_params *mrc_params); +void prog_dra_drb(struct mrc_params *mrc_params); +void perform_wake(struct mrc_params *mrc_params); +void change_refresh_period(struct mrc_params *mrc_params); +void set_auto_refresh(struct mrc_params *mrc_params); +void ecc_enable(struct mrc_params *mrc_params); +void memory_test(struct mrc_params *mrc_params); +void lock_registers(struct mrc_params *mrc_params); + +#endif /* _SMC_H_ */
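To make the write-recovery arithmetic in perform_jedec_init() above concrete, here is a small standalone C sketch of the MRS0 WR-field encoding. The t_ck[] contents are an assumption for illustration (2500 ps for DDR3-800 and 1875 ps for DDR3-1066, the standard JEDEC clock periods); they are not values taken from this patch.

#include <stdint.h>
#include <stdio.h>

#define MCEIL(num, den)	((uint8_t)((num + den - 1) / den))

int main(void)
{
	/* assumed clock periods in ps, indexed by ddr_speed (0=800, 1=1066) */
	const uint32_t t_ck[] = { 2500, 1875 };
	int speed;

	for (speed = 0; speed < 2; speed++) {
		/* per JEDEC: tWR = 15000 ps for DDR3 from 800 to 1600 MT/s */
		uint8_t twr = MCEIL(15000, t_ck[speed]);
		/* MRS0 WR encoding used above: twr - 4 covers 5..8 clocks */
		uint8_t wr = twr - 4;

		printf("ddr_speed %d: tWR = %u clocks, WR field = %u\n",
		       speed, twr, wr);
	}

	/* prints tWR = 6, WR = 2 for DDR3-800; tWR = 8, WR = 4 for DDR3-1066 */
	return 0;
}

This matches the WR encoding table in the MRS0 comment above (field value 2 selects 6 clocks, 4 selects 8 clocks).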

On 5 February 2015 at 08:42, Bin Meng bmeng.cn@gmail.com wrote:
This code does the actual memory initialization work.
Signed-off-by: Bin Meng bmeng.cn@gmail.com
Some of the ugliest code I've ever seen ... There are 252 warnings and 127 checks in this patch, for example:
check: arch/x86/cpu/quark/smc.c,1609: Alignment should match open parenthesis
warning: arch/x86/cpu/quark/smc.c,1610: line over 80 characters
warning: arch/x86/cpu/quark/smc.c,1633: Too many leading tabs - consider code refactoring
...
Fixing 'Too many leading tabs ...' would be very risky, as I don't have full details of how Intel's MRC code actually drives the hardware, and trying to refactor it may leave the MRC non-working. The 'line over 80 characters' warnings have to stay for now, as they are a direct consequence of the deep nesting. And if I try to fix the 'Alignment should match open parenthesis' checks, I may end up adding more 'line over 80 characters' warnings, so we have to bear with them. Sigh.
Changes in v2:
- Write out the values instead of BIT in smc.h
- Removing the ending / in the file header comment block
arch/x86/cpu/quark/smc.c | 2764 ++++++++++++++++++++++++++++++++++++++++++++++ arch/x86/cpu/quark/smc.h | 446 ++++++++ 2 files changed, 3210 insertions(+) create mode 100644 arch/x86/cpu/quark/smc.c create mode 100644 arch/x86/cpu/quark/smc.h
As with the other patch, we should be able to tidy this up a little more before the release (splitting the code, reducing the use of BIT). But this is a good starting point.
Acked-by: Simon Glass sjg@chromium.org

On 5 February 2015 at 17:52, Simon Glass sjg@chromium.org wrote:
Applied to u-boot-x86, thanks!

Turn on the Memory Reference Code build in the Quark Makefile.
Signed-off-by: Bin Meng bmeng.cn@gmail.com ---
Changes in v2: None
arch/x86/cpu/quark/Makefile | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/cpu/quark/Makefile b/arch/x86/cpu/quark/Makefile index 168c1e6..e87b424 100644 --- a/arch/x86/cpu/quark/Makefile +++ b/arch/x86/cpu/quark/Makefile @@ -5,4 +5,5 @@ #
obj-y += car.o dram.o msg_port.o quark.o +obj-y += mrc.o mrc_util.o hte.o smc.o obj-$(CONFIG_PCI) += pci.o

Add COMPAT_INTEL_QRK_MRC and "intel,quark-mrc" so that fdtdec can decode the Intel Quark MRC node.
Signed-off-by: Bin Meng bmeng.cn@gmail.com Acked-by: Simon Glass sjg@chromium.org ---
Changes in v2: None
include/fdtdec.h | 1 + lib/fdtdec.c | 1 + 2 files changed, 2 insertions(+)
diff --git a/include/fdtdec.h b/include/fdtdec.h
index 8c2bd21..a0cd886 100644
--- a/include/fdtdec.h
+++ b/include/fdtdec.h
@@ -174,6 +174,7 @@ enum fdt_compat_id {
 	COMPAT_INTEL_GMA,	/* Intel Graphics Media Accelerator */
 	COMPAT_AMS_AS3722,	/* AMS AS3722 PMIC */
 	COMPAT_INTEL_ICH_SPI,	/* Intel ICH7/9 SPI controller */
+	COMPAT_INTEL_QRK_MRC,	/* Intel Quark MRC */
 
 	COMPAT_COUNT,
 };
diff --git a/lib/fdtdec.c b/lib/fdtdec.c
index e989241..1fef9af 100644
--- a/lib/fdtdec.c
+++ b/lib/fdtdec.c
@@ -84,6 +84,7 @@ static const char * const compat_names[COMPAT_COUNT] = {
 	COMPAT(INTEL_GMA, "intel,gma"),
 	COMPAT(AMS_AS3722, "ams,as3722"),
 	COMPAT(INTEL_ICH_SPI, "intel,ich-spi"),
+	COMPAT(INTEL_QRK_MRC, "intel,quark-mrc"),
 };
 
 const char *fdtdec_get_compatible(enum fdt_compat_id id)
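For reference, this is how a caller is expected to locate the node through the fdtdec helpers. A minimal sketch of the lookup pattern (the real consumer is the dram_init() change in patch #9; the function name here is made up):

	#include <common.h>
	#include <errno.h>
	#include <fdtdec.h>

	DECLARE_GLOBAL_DATA_PTR;

	static int find_quark_mrc_node(void)
	{
		int node;

		/* find the first node with compatible = "intel,quark-mrc" */
		node = fdtdec_next_compatible(gd->fdt_blob, 0,
					      COMPAT_INTEL_QRK_MRC);
		if (node < 0) {
			debug("%s: cannot find MRC node\n", __func__);
			return -EINVAL;
		}

		return node;
	}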

On 5 February 2015 at 08:42, Bin Meng bmeng.cn@gmail.com wrote:
Add COMPAT_INTEL_QRK_MRC and "intel,quark-mrc" so that fdtdec can decode the Intel Quark MRC node.
Signed-off-by: Bin Meng bmeng.cn@gmail.com Acked-by: Simon Glass sjg@chromium.org
Applied to u-boot-x86, thanks!

Add standard dt-bindings macros to be used by the Intel Quark MRC node.
Signed-off-by: Bin Meng bmeng.cn@gmail.com Acked-by: Simon Glass sjg@chromium.org ---
Changes in v2: None
include/dt-bindings/mrc/quark.h | 83 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 83 insertions(+)
create mode 100644 include/dt-bindings/mrc/quark.h
diff --git a/include/dt-bindings/mrc/quark.h b/include/dt-bindings/mrc/quark.h
new file mode 100644
index 0000000..e3ca8a2
--- /dev/null
+++ b/include/dt-bindings/mrc/quark.h
@@ -0,0 +1,83 @@
+/*
+ * Copyright (C) 2015, Bin Meng bmeng.cn@gmail.com
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ *
+ * Intel Quark MRC bindings include several properties
+ * as part of an Intel Quark MRC node. In most cases,
+ * the value of these properties uses the standard values
+ * defined in this header.
+ */
+
+#ifndef _DT_BINDINGS_QRK_MRC_H_
+#define _DT_BINDINGS_QRK_MRC_H_
+
+/* MRC platform data flags */
+#define MRC_FLAG_ECC_EN		0x00000001
+#define MRC_FLAG_SCRAMBLE_EN	0x00000002
+#define MRC_FLAG_MEMTEST_EN	0x00000004
+
+/* 0b DDR "fly-by" topology else 1b DDR "tree" topology */
+#define MRC_FLAG_TOP_TREE_EN	0x00000008
+
+/* If set ODR signal is asserted to DRAM devices on writes */
+#define MRC_FLAG_WR_ODT_EN	0x00000010
+
+/* DRAM width */
+#define DRAM_WIDTH_X8		0
+#define DRAM_WIDTH_X16		1
+#define DRAM_WIDTH_X32		2
+
+/* DRAM speed */
+#define DRAM_FREQ_800		0
+#define DRAM_FREQ_1066		1
+
+/* DRAM type */
+#define DRAM_TYPE_DDR3		0
+#define DRAM_TYPE_DDR3L		1
+
+/* DRAM rank mask */
+#define DRAM_RANK(n)		(1 << (n))
+
+/* DRAM channel mask */
+#define DRAM_CHANNEL(n)		(1 << (n))
+
+/* DRAM channel width */
+#define DRAM_CHANNEL_WIDTH_X8	0
+#define DRAM_CHANNEL_WIDTH_X16	1
+#define DRAM_CHANNEL_WIDTH_X32	2
+
+/* DRAM address mode */
+#define DRAM_ADDR_MODE0		0
+#define DRAM_ADDR_MODE1		1
+#define DRAM_ADDR_MODE2		2
+
+/* DRAM refresh rate */
+#define DRAM_REFRESH_RATE_195US	1
+#define DRAM_REFRESH_RATE_39US	2
+#define DRAM_REFRESH_RATE_785US	3
+
+/* DRAM SR temperature range */
+#define DRAM_SRT_RANGE_NORMAL	0
+#define DRAM_SRT_RANGE_EXTENDED	1
+
+/* DRAM ron value */
+#define DRAM_RON_34OHM		0
+#define DRAM_RON_40OHM		1
+
+/* DRAM rtt nom value */
+#define DRAM_RTT_NOM_40OHM	0
+#define DRAM_RTT_NOM_60OHM	1
+#define DRAM_RTT_NOM_120OHM	2
+
+/* DRAM rd odt value */
+#define DRAM_RD_ODT_OFF		0
+#define DRAM_RD_ODT_60OHM	1
+#define DRAM_RD_ODT_120OHM	2
+#define DRAM_RD_ODT_180OHM	3
+
+/* DRAM density */
+#define DRAM_DENSITY_512M	0
+#define DRAM_DENSITY_1G		1
+#define DRAM_DENSITY_2G		2
+#define DRAM_DENSITY_4G		3
+
+#endif /* _DT_BINDINGS_QRK_MRC_H_ */
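Note that DRAM_RANK(n) and DRAM_CHANNEL(n) expand to single-bit masks, so boards can OR them together in the rank-mask and chan-mask cells. A hypothetical fragment for a dual-rank configuration (Galileo itself populates only rank 0, see patch #9):

	rank-mask = <(DRAM_RANK(0) | DRAM_RANK(1))>;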

On 5 February 2015 at 08:42, Bin Meng bmeng.cn@gmail.com wrote:
Add standard dt-bindings macros to be used by the Intel Quark MRC node.
Signed-off-by: Bin Meng bmeng.cn@gmail.com Acked-by: Simon Glass sjg@chromium.org
Applied to u-boot-x86, thanks!

Now that we have added the Quark MRC code, call MRC in dram_init() so that DRAM can be initialized on a Quark-based board.
Signed-off-by: Bin Meng bmeng.cn@gmail.com
---
Changes in v2:
- Update comment for calling mrc_init()
- Check return status of mrc_init()
arch/x86/cpu/quark/dram.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++-
arch/x86/dts/galileo.dts  | 25 ++++++++++++
2 files changed, 122 insertions(+), 2 deletions(-)
diff --git a/arch/x86/cpu/quark/dram.c b/arch/x86/cpu/quark/dram.c
index fbdc3cd..9cac846 100644
--- a/arch/x86/cpu/quark/dram.c
+++ b/arch/x86/cpu/quark/dram.c
@@ -5,15 +5,110 @@
  */
 
 #include <common.h>
+#include <errno.h>
+#include <fdtdec.h>
 #include <asm/post.h>
+#include <asm/arch/mrc.h>
 #include <asm/arch/quark.h>
 
 DECLARE_GLOBAL_DATA_PTR;
 
+static int mrc_configure_params(struct mrc_params *mrc_params)
+{
+	const void *blob = gd->fdt_blob;
+	int node;
+	int mrc_flags;
+
+	node = fdtdec_next_compatible(blob, 0, COMPAT_INTEL_QRK_MRC);
+	if (node < 0) {
+		debug("%s: Cannot find MRC node\n", __func__);
+		return -EINVAL;
+	}
+
+	/*
+	 * TODO:
+	 *
+	 * We need support fast boot (MRC cache) in the future.
+	 *
+	 * Set boot mode to cold boot for now
+	 */
+	mrc_params->boot_mode = BM_COLD;
+
+	/*
+	 * TODO:
+	 *
+	 * We need determine ECC by pin strap state
+	 *
+	 * Disable ECC by default for now
+	 */
+	mrc_params->ecc_enables = 0;
+
+	mrc_flags = fdtdec_get_int(blob, node, "flags", 0);
+	if (mrc_flags & MRC_FLAG_SCRAMBLE_EN)
+		mrc_params->scrambling_enables = 1;
+	else
+		mrc_params->scrambling_enables = 0;
+
+	mrc_params->dram_width = fdtdec_get_int(blob, node, "dram-width", 0);
+	mrc_params->ddr_speed = fdtdec_get_int(blob, node, "dram-speed", 0);
+	mrc_params->ddr_type = fdtdec_get_int(blob, node, "dram-type", 0);
+
+	mrc_params->rank_enables = fdtdec_get_int(blob, node, "rank-mask", 0);
+	mrc_params->channel_enables = fdtdec_get_int(blob, node,
+		"chan-mask", 0);
+	mrc_params->channel_width = fdtdec_get_int(blob, node,
+		"chan-width", 0);
+	mrc_params->address_mode = fdtdec_get_int(blob, node, "addr-mode", 0);
+
+	mrc_params->refresh_rate = fdtdec_get_int(blob, node,
+		"refresh-rate", 0);
+	mrc_params->sr_temp_range = fdtdec_get_int(blob, node,
+		"sr-temp-range", 0);
+	mrc_params->ron_value = fdtdec_get_int(blob, node,
+		"ron-value", 0);
+	mrc_params->rtt_nom_value = fdtdec_get_int(blob, node,
+		"rtt-nom-value", 0);
+	mrc_params->rd_odt_value = fdtdec_get_int(blob, node,
+		"rd-odt-value", 0);
+
+	mrc_params->params.density = fdtdec_get_int(blob, node,
+		"dram-density", 0);
+	mrc_params->params.cl = fdtdec_get_int(blob, node, "dram-cl", 0);
+	mrc_params->params.ras = fdtdec_get_int(blob, node, "dram-ras", 0);
+	mrc_params->params.wtr = fdtdec_get_int(blob, node, "dram-wtr", 0);
+	mrc_params->params.rrd = fdtdec_get_int(blob, node, "dram-rrd", 0);
+	mrc_params->params.faw = fdtdec_get_int(blob, node, "dram-faw", 0);
+
+	debug("MRC dram_width %d\n", mrc_params->dram_width);
+	debug("MRC rank_enables %d\n", mrc_params->rank_enables);
+	debug("MRC ddr_speed %d\n", mrc_params->ddr_speed);
+	debug("MRC flags: %s\n",
+	      (mrc_params->scrambling_enables) ? "SCRAMBLE_EN" : "");
+
+	debug("MRC density=%d tCL=%d tRAS=%d tWTR=%d tRRD=%d tFAW=%d\n",
+	      mrc_params->params.density, mrc_params->params.cl,
+	      mrc_params->params.ras, mrc_params->params.wtr,
+	      mrc_params->params.rrd, mrc_params->params.faw);
+
+	return 0;
+}
+
 int dram_init(void)
 {
-	/* hardcode the DRAM size for now */
-	gd->ram_size = DRAM_MAX_SIZE;
+	struct mrc_params mrc_params;
+	int ret;
+
+	memset(&mrc_params, 0, sizeof(struct mrc_params));
+	ret = mrc_configure_params(&mrc_params);
+	if (ret)
+		return ret;
+
+	/* Set up the DRAM by calling the memory reference code */
+	mrc_init(&mrc_params);
+	if (mrc_params.status)
+		return -EIO;
+
+	gd->ram_size = mrc_params.mem_size;
 	post_code(POST_DRAM);
 
 	return 0;
diff --git a/arch/x86/dts/galileo.dts b/arch/x86/dts/galileo.dts
index 14a19c3..d462221 100644
--- a/arch/x86/dts/galileo.dts
+++ b/arch/x86/dts/galileo.dts
@@ -6,6 +6,8 @@
 
 /dts-v1/;
 
+#include <dt-bindings/mrc/quark.h>
+
 /include/ "skeleton.dtsi"
 
 / {
@@ -20,6 +22,29 @@
 		stdout-path = &pciuart0;
 	};
 
+	mrc {
+		compatible = "intel,quark-mrc";
+		flags = <MRC_FLAG_SCRAMBLE_EN>;
+		dram-width = <DRAM_WIDTH_X8>;
+		dram-speed = <DRAM_FREQ_800>;
+		dram-type = <DRAM_TYPE_DDR3>;
+		rank-mask = <DRAM_RANK(0)>;
+		chan-mask = <DRAM_CHANNEL(0)>;
+		chan-width = <DRAM_CHANNEL_WIDTH_X16>;
+		addr-mode = <DRAM_ADDR_MODE0>;
+		refresh-rate = <DRAM_REFRESH_RATE_785US>;
+		sr-temp-range = <DRAM_SRT_RANGE_NORMAL>;
+		ron-value = <DRAM_RON_34OHM>;
+		rtt-nom-value = <DRAM_RTT_NOM_120OHM>;
+		rd-odt-value = <DRAM_RD_ODT_OFF>;
+		dram-density = <DRAM_DENSITY_1G>;
+		dram-cl = <6>;
+		dram-ras = <0x0000927c>;
+		dram-wtr = <0x00002710>;
+		dram-rrd = <0x00002710>;
+		dram-faw = <0x00009c40>;
+	};
+
 	pci {
 		#address-cells = <3>;
 		#size-cells = <2>;
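One observation on the raw timing cells in the mrc node, since the hex values look opaque: they are consistent with picosecond units against standard DDR3-800 timings, e.g. dram-ras = 0x0000927c = 37500 (tRAS = 37.5 ns), dram-wtr = dram-rrd = 0x00002710 = 10000 (10 ns), and dram-faw = 0x00009c40 = 40000 (tFAW = 40 ns). This is inferred from the numbers themselves; the binding text does not state the unit.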

On 5 February 2015 at 08:42, Bin Meng bmeng.cn@gmail.com wrote:
Now that we have added the Quark MRC code, call MRC in dram_init() so that DRAM can be initialized on a Quark-based board.
Signed-off-by: Bin Meng bmeng.cn@gmail.com
Acked-by: Simon Glass sjg@chromium.org

On 5 February 2015 at 22:23, Simon Glass sjg@chromium.org wrote:
On 5 February 2015 at 08:42, Bin Meng bmeng.cn@gmail.com wrote:
Now that we have added the Quark MRC code, call MRC in dram_init() so that DRAM can be initialized on a Quark-based board.
Signed-off-by: Bin Meng bmeng.cn@gmail.com
Acked-by: Simon Glass sjg@chromium.org
Applied to u-boot-x86, thanks!