[U-Boot] [PATCH] arm: socfpga: dm: Fix DM initialization failure after warm reset

gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.

Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
---
 arch/arm/mach-socfpga/spl.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c
index 13ec24b..59fe1f2 100644
--- a/arch/arm/mach-socfpga/spl.c
+++ b/arch/arm/mach-socfpga/spl.c
@@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
         /* Configure simple malloc base pointer into RAM. */
         gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);
 
+        /*
+         * gd->dm_root might contain non-zero value after warm reset.
+         * Clear it to avoid dm_init error
+         */
+        gd->dm_root = NULL;
+
         board_init_r(NULL, 0);
 }

On Friday, August 28, 2015 at 10:41:50 AM, Jian Luo wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
Hi!
arch/arm/mach-socfpga/spl.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c index 13ec24b..59fe1f2 100644 --- a/arch/arm/mach-socfpga/spl.c +++ b/arch/arm/mach-socfpga/spl.c @@ -181,5 +181,11 @@ void board_init_f(ulong dummy) /* Configure simple malloc base pointer into RAM. */ gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);
/*
* gd->dm_root might contain non-zero value after warm reset.
* Clear it to avoid dm_init error
*/
gd->dm_root = NULL;
Nit: The indent should be done with tabs, not spaces. I think the email got messed up somewhere along the way.
The bigger concern I have is that if you look into arch/arm/lib/crt0.S , you will see that the entire global data are cleared there (_main, label clr_gd: ) and this code is executed before the board_init_f() .
Can you try tracking it down a bit more? I suspect that you might see dm_init_and_scan() returned error -22 , is that what you observe please?
board_init_r(NULL, 0);
}
Best regards, Marek Vasut

Hi Marek,
On 28.08.2015 11:24, Marek Vasut wrote:
On Friday, August 28, 2015 at 10:41:50 AM, Jian Luo wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
Hi!
arch/arm/mach-socfpga/spl.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c index 13ec24b..59fe1f2 100644 --- a/arch/arm/mach-socfpga/spl.c +++ b/arch/arm/mach-socfpga/spl.c @@ -181,5 +181,11 @@ void board_init_f(ulong dummy) /* Configure simple malloc base pointer into RAM. */ gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);
/*
* gd->dm_root might contain non-zero value after warm reset.
* Clear it to avoid dm_init error
*/
gd->dm_root = NULL;
Nit: The indent should be done with tabs, not spaces. I think the email got messed up somewhere along the way.
Yes, sorry. I can't set up git send-email on our company network. Thunderbird messed up the indentation.
The bigger concern I have is that if you look into arch/arm/lib/crt0.S , you will see that the entire global data are cleared there (_main, label clr_gd: ) and this code is executed before the board_init_f() .
That was my first assumption too: it should be cleared in crt0.S. I was using U-Boot to load a VxWorks image, and wrote 1 to the rstmgr ctrl register in VxWorks to trigger a warm reset. Afterwards the SPL hangs.
It does not happen if I reset directly from U-Boot. I'll attach a JTAG debugger to dig deeper.
Can you try tracking it down a bit more? I suspect that you might see dm_init_and_scan() returned error -22 , is that what you observe please?
Yes, in dm_init() where gd->dm_root will be checked.
board_init_r(NULL, 0);
}
Best regards, Marek Vasut
Best regards,
Jian Luo

On Friday, August 28, 2015 at 12:27:18 PM, Jian Luo wrote:
Hi Marek,
On 28.08.2015 11:24, Marek Vasut wrote:
On Friday, August 28, 2015 at 10:41:50 AM, Jian Luo wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
Hi!
arch/arm/mach-socfpga/spl.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c index 13ec24b..59fe1f2 100644 --- a/arch/arm/mach-socfpga/spl.c +++ b/arch/arm/mach-socfpga/spl.c @@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
/* Configure simple malloc base pointer into RAM. */ gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);
/*
* gd->dm_root might contain non-zero value after warm reset.
* Clear it to avoid dm_init error
*/
gd->dm_root = NULL;
Nit: The indent should be done with tabs, not spaces. I think the email got messed up somewhere along the way.
Yes, sorry. I can't setup git send-email in our company network. Thunderbird messed the indent up.
Why not ?
The bigger concern I have is that if you look into arch/arm/lib/crt0.S , you will see that the entire global data are cleared there (_main, label clr_gd: ) and this code is executed before the board_init_f() .
This was my first assumption, it should be cleared in crt0.S. I was using U-Boot to load a VxWorks image. And wrote 1 to rstmgr ctrl in VxWorks to do a warm reset. Afterwards the SPL hangs.
Hangs in which way ? Can you share the output please ?
It did not happen if I do reset direct in U-Boot. I'll attach a JTAG Debug to dig deeper.
Neat, that'd be really awesome, thanks for looking into this !
Can you try tracking it down a bit more? I suspect that you might see dm_init_and_scan() returned error -22 , is that what you observe please?
Yes, in dm_init() where gd->dm_root will be checked.
But if you get -ENOMEM, that means somehow the malloc is broken here. I wonder if this explicit setting of gd->malloc_area is screwing something up ?
Best regards, Marek Vasut

On 28.08.2015 12:30, Marek Vasut wrote:
On Friday, August 28, 2015 at 12:27:18 PM, Jian Luo wrote:
Hi Marek,
On 28.08.2015 11:24, Marek Vasut wrote:
On Friday, August 28, 2015 at 10:41:50 AM, Jian Luo wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
Hi!
arch/arm/mach-socfpga/spl.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c index 13ec24b..59fe1f2 100644 --- a/arch/arm/mach-socfpga/spl.c +++ b/arch/arm/mach-socfpga/spl.c @@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
/* Configure simple malloc base pointer into RAM. */ gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);
/*
* gd->dm_root might contain non-zero value after warm reset.
* Clear it to avoid dm_init error
*/
gd->dm_root = NULL;
Nit: The indent should be done with tabs, not spaces. I think the email got messed up somewhere along the way.
Yes, sorry. I can't setup git send-email in our company network. Thunderbird messed the indent up.
Why not ?
"Security policy". :(
The bigger concern I have is that if you look into arch/arm/lib/crt0.S , you will see that the entire global data are cleared there (_main, label clr_gd: ) and this code is executed before the board_init_f() .
This was my first assumption, it should be cleared in crt0.S. I was using U-Boot to load a VxWorks image. And wrote 1 to rstmgr ctrl in VxWorks to do a warm reset. Afterwards the SPL hangs.
Hangs in which way ? Can you share the output please ?
Sorry, I should have attached the output in the first place. With CONFIG_DM_WARN defined:
*************************
***** VxWorks is up *****
*************************
-> reboot

U-Boot SPL 2015.10-rc2-00192-ge122af6-dirty (Aug 28 2015 - 11:35:27)
drivers/ddr/altera/sequencer.c: Preparing to start memory calibration
drivers/ddr/altera/sequencer.c: CALIBRATION PASSED
drivers/ddr/altera/sequencer.c: Calibration complete
Virtual root driver already exists!
### ERROR ### Please RESET the board ###
It did not happen if I do reset direct in U-Boot. I'll attach a JTAG Debug to dig deeper.
Neat, that'd be really awesome, thanks for looking into this !
Can you try tracking it down a bit more? I suspect that you might see dm_init_and_scan() returned error -22 , is that what you observe please?
Yes, in dm_init() where gd->dm_root will be checked.
But if you get -ENOMEM, that means somehow the malloc is broken here. I wonder if this explicit setting of gd->malloc_area is screwing something up ?
Isn't 22 EINVAL? It's in the drivers/core/root.c line 105
Best regards, Marek Vasut
Best regards,
Jian Luo
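
The failure mode discussed so far can be modelled in a few lines of plain C. This is only a sketch for readers of the thread, not the U-Boot source: struct fake_gd, fake_clear_gd() and fake_dm_init() are hypothetical stand-ins, and the only behaviour taken from the thread is that dm_init() rejects a non-zero gd->dm_root with error -22 and the "Virtual root driver already exists!" warning.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for U-Boot's global data; only dm_root is modelled. */
struct fake_gd {
	void *dm_root;
};

/* Models the effect of the clr_gd loop in crt0.S: the whole structure is zeroed. */
static void fake_clear_gd(struct fake_gd *gd)
{
	assert(gd != NULL);
	memset(gd, 0, sizeof(*gd));
}

/* Models the guard at the start of dm_init(): a root that already exists is an error. */
static int fake_dm_init(struct fake_gd *gd)
{
	static int root_device;	/* placeholder object for dm_root to point at */

	assert(gd != NULL);
	if (gd->dm_root)
		return -EINVAL;	/* U-Boot prints "Virtual root driver already exists!" here */

	gd->dm_root = &root_device;
	return 0;
}
```

With a freshly cleared structure fake_dm_init() returns 0. If anything changes dm_root to a non-zero value between the clear and the call, it returns -EINVAL, which is the -22 seen in the SPL output above.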

On Friday, August 28, 2015 at 01:40:08 PM, Jian Luo wrote:
On 28.08.2015 12:30, Marek Vasut wrote:
On Friday, August 28, 2015 at 12:27:18 PM, Jian Luo wrote:
Hi Marek,
On 28.08.2015 11:24, Marek Vasut wrote:
On Friday, August 28, 2015 at 10:41:50 AM, Jian Luo wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
Hi!
arch/arm/mach-socfpga/spl.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c index 13ec24b..59fe1f2 100644 --- a/arch/arm/mach-socfpga/spl.c +++ b/arch/arm/mach-socfpga/spl.c @@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
/* Configure simple malloc base pointer into RAM. */ gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);
/*
* gd->dm_root might contain non-zero value after warm reset.
* Clear it to avoid dm_init error
*/
gd->dm_root = NULL;
Nit: The indent should be done with tabs, not spaces. I think the email got messed up somewhere along the way.
Yes, sorry. I can't setup git send-email in our company network. Thunderbird messed the indent up.
Why not ?
"Security policy". :(
But thunderbird works ? Can't you just copy the SMTP settings from thunderbird into gitconfig ? :)
The bigger concern I have is that if you look into arch/arm/lib/crt0.S , you will see that the entire global data are cleared there (_main, label clr_gd: ) and this code is executed before the board_init_f() .
This was my first assumption, it should be cleared in crt0.S. I was using U-Boot to load a VxWorks image. And wrote 1 to rstmgr ctrl in VxWorks to do a warm reset. Afterwards the SPL hangs.
Hangs in which way ? Can you share the output please ?
Sorry, should attached the output in the first place. With CONFIG_DM_WARN defined:
***** VxWorks is up *****
-> reboot
U-Boot SPL 2015.10-rc2-00192-ge122af6-dirty (Aug 28 2015 - 11:35:27) drivers/ddr/altera/sequencer.c: Preparing to start memory calibration drivers/ddr/altera/sequencer.c: CALIBRATION PASSED drivers/ddr/altera/sequencer.c: Calibration complete Virtual root driver already exists! ### ERROR ### Please RESET the board ###
Oh, ew. Now this is _weird_ . Simon, any idea(s) ?
It did not happen if I do reset direct in U-Boot. I'll attach a JTAG Debug to dig deeper.
Neat, that'd be really awesome, thanks for looking into this !
Can you try tracking it down a bit more? I suspect that you might see dm_init_and_scan() returned error -22 , is that what you observe please?
Yes, in dm_init() where gd->dm_root will be checked.
But if you get -ENOMEM, that means somehow the malloc is broken here. I wonder if this explicit setting of gd->malloc_area is screwing something up ?
Isn't 22 EINVAL?
Yes it is, sorry.
It's in the drivers/core/root.c line 105
Uhm ... how can this be if GD is zeroed out ?

Hi Marek,
On 28.08.2015 14:01, Marek Vasut wrote:
On Friday, August 28, 2015 at 01:40:08 PM, Jian Luo wrote:
----snip----
"Security policy". :(
But thunderbird works ? Can't you just copy the SMTP settings from
thunderbird
into gitconfig ? :)
I tried w/o success. Might try again another time.
Best regards,
Jian Luo

On Friday, August 28, 2015 at 02:09:15 PM, Jian Luo wrote:
Hi Marek,
Hi,
On 28.08.2015 14:01, Marek Vasut wrote:
On Friday, August 28, 2015 at 01:40:08 PM, Jian Luo wrote:
----snip----
"Security policy". :(
But thunderbird works ? Can't you just copy the SMTP settings from
thunderbird
into gitconfig ? :)
I tried w/o success. Might try again another time.
Try using msmtp and configure your git send-email to send through it (and all your other MUAs too), it's really convenient :)
btw is this a custom board you're porting here ?
Best regards, Marek Vasut

Hi,
On 28.08.2015 23:48, Marek Vasut wrote:
On Friday, August 28, 2015 at 02:09:15 PM, Jian Luo wrote:
Hi Marek,
Hi,
On 28.08.2015 14:01, Marek Vasut wrote:
On Friday, August 28, 2015 at 01:40:08 PM, Jian Luo wrote:
----snip----
"Security policy". :(
But thunderbird works ? Can't you just copy the SMTP settings from
thunderbird
into gitconfig ? :)
I tried w/o success. Might try again another time.
Try using msmtp and configure your git send-email to send through it (and all your other MUAs too), it's really convenient :)
It works. Thanks. :)
btw is this a custom board you're porting here ?
Yes, but this particular error can also be reproduced on Altera SoCDK.
It seems to be a compiler and/or configuration error of the gcc 4.8 toolchain generated by Yocto dizzy. After I switched to the gcc 4.9 shipped with Altera SoC EDS 15.0, the warm reset error no longer occurs.
Best regards, Marek Vasut
Best regards,
Jian Luo

On Monday, August 31, 2015 at 03:00:22 PM, Jian Luo wrote:
Hi,
Hi!
On 28.08.2015 23:48, Marek Vasut wrote:
On Friday, August 28, 2015 at 02:09:15 PM, Jian Luo wrote:
Hi Marek,
Hi,
On 28.08.2015 14:01, Marek Vasut wrote:
On Friday, August 28, 2015 at 01:40:08 PM, Jian Luo wrote:
----snip----
"Security policy". :(
But thunderbird works ? Can't you just copy the SMTP settings from
thunderbird
into gitconfig ? :)
I tried w/o success. Might try again another time.
Try using msmtp and configure your git send-email to send through it (and all your other MUAs too), it's really convenient :)
It works. Thanks. :)
Cool!
btw is this a custom board you're porting here ?
Yes, but this particular error can also be reproduced on Altera SoCDK.
It seems to be a compiler and/or configuration error of gcc 4.8 generated by yocto dizzy.
Let me guess -- gcc 4.8.2 ?
After I switched to gcc 4.9 shiped with Altera Soc EDS 15.0, there is no warm reset error anymore.
Nice, thanks for the update :)
Best regards, Marek Vasut

Hi!
This error is back. It isn't a compiler error after all. :(
JTAG inspection shows that the problem is located in arch/arm/mach-socfpga/spl.c, line 94. It seems that re-enabling ECC on OCRAM can cause some strange value changes in SRAM. Disabling ECC might also cause value changes, which I didn't test.
On a cold (re)boot, sysmgr_regs->eccgrp_ocram is 0x19 (derr|serr|en), so gd stays intact.
In our VxWorks image, ECC on OCRAM happens to be disabled. After a warm reset, sysmgr_regs->eccgrp_ocram is 0x18 (derr|serr). Thus, after ECC is re-enabled, gd->dm_root turns to 0x80 every time.
My solution is to leave the SYSMGR_ECC_OCRAM_EN bit untouched, and it works for me.
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c
index 775a827..c858406 100644
--- a/arch/arm/mach-socfpga/spl.c
+++ b/arch/arm/mach-socfpga/spl.c
@@ -90,12 +90,14 @@ void board_init_f(ulong dummy)
          * and DBE might triggered during power on
          */
         reg = readl(&sysmgr_regs->eccgrp_ocram);
-        if (reg & SYSMGR_ECC_OCRAM_SERR)
-                writel(SYSMGR_ECC_OCRAM_SERR | SYSMGR_ECC_OCRAM_EN,
-                       &sysmgr_regs->eccgrp_ocram);
-        if (reg & SYSMGR_ECC_OCRAM_DERR)
-                writel(SYSMGR_ECC_OCRAM_DERR | SYSMGR_ECC_OCRAM_EN,
-                       &sysmgr_regs->eccgrp_ocram);
+        if (reg & SYSMGR_ECC_OCRAM_SERR) {
+                reg &= ~SYSMGR_ECC_OCRAM_SERR;
+                writel(reg, &sysmgr_regs->eccgrp_ocram);
+        }
+        if (reg & SYSMGR_ECC_OCRAM_DERR) {
+                reg &= ~SYSMGR_ECC_OCRAM_DERR;
+                writel(reg, &sysmgr_regs->eccgrp_ocram);
+        }
 
         memset(__bss_start, 0, __bss_end - __bss_start);
Other solutions:
1. Move the OCRAM ECC setting to an earlier stage: requires a change in generic code.
2. Clear gd afterwards: requires replicating every early-stage gd setting.
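
To make the difference between the old and the patched register handling concrete, here is a small host-side sketch. It only computes the values each variant would pass to writel() and says nothing about how the hardware reacts to them. The bit positions are an assumption inferred from the decoding given above (0x19 = derr|serr|en, 0x18 = derr|serr); the macro and function names are illustrative, not U-Boot's.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout, inferred from 0x19 = derr|serr|en and 0x18 = derr|serr. */
#define ECC_OCRAM_EN	(1u << 0)
#define ECC_OCRAM_SERR	(1u << 3)
#define ECC_OCRAM_DERR	(1u << 4)

/*
 * Pre-patch logic: every write carries the EN bit, so ECC ends up
 * enabled whenever an error flag is set, whatever EN was before.
 * Returns the last value written, or reg itself if nothing is written.
 */
static uint32_t old_last_write(uint32_t reg)
{
	uint32_t last = reg;

	if (reg & ECC_OCRAM_SERR)
		last = ECC_OCRAM_SERR | ECC_OCRAM_EN;
	if (reg & ECC_OCRAM_DERR)
		last = ECC_OCRAM_DERR | ECC_OCRAM_EN;
	return last;
}

/*
 * Patched logic: only the error flags are masked out of the value that
 * was read, so the EN bit keeps whatever state it had on entry.
 */
static uint32_t new_last_write(uint32_t reg)
{
	uint32_t last = reg;

	if (reg & ECC_OCRAM_SERR) {
		reg &= ~ECC_OCRAM_SERR;
		last = reg;
	}
	if (reg & ECC_OCRAM_DERR) {
		reg &= ~ECC_OCRAM_DERR;
		last = reg;
	}
	return last;
}
```

Starting from 0x18 (the warm-reset case), the old logic finishes with a write that has EN set, while the patched logic finishes with EN still clear. Starting from 0x19 (the cold-boot case), both variants leave EN set.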
Best regards,
Jian Luo
On 31.08.2015 15:28, Marek Vasut wrote:
On Monday, August 31, 2015 at 03:00:22 PM, Jian Luo wrote:
Hi,
Hi!
On 28.08.2015 23:48, Marek Vasut wrote:
On Friday, August 28, 2015 at 02:09:15 PM, Jian Luo wrote:
Hi Marek,
Hi,
On 28.08.2015 14:01, Marek Vasut wrote:
On Friday, August 28, 2015 at 01:40:08 PM, Jian Luo wrote:
----snip----
"Security policy". :(
But thunderbird works ? Can't you just copy the SMTP settings from
thunderbird
into gitconfig ? :)
I tried w/o success. Might try again another time.
Try using msmtp and configure your git send-email to send through it (and all your other MUAs too), it's really convenient :)
It works. Thanks. :)
Cool!
btw is this a custom board you're porting here ?
Yes, but this particular error can also be reproduced on Altera SoCDK.
It seems to be a compiler and/or configuration error of gcc 4.8 generated by yocto dizzy.
Let me guess -- gcc 4.8.2 ?
After I switched to gcc 4.9 shiped with Altera Soc EDS 15.0, there is no warm reset error anymore.
Nice, thanks for the update :)
Best regards, Marek Vasut

On Wednesday, September 02, 2015 at 06:27:41 PM, Jian Luo wrote:
Hi!
this error comes again. It isn't a compiler error after all. :(
JTAG inspection shows that the problem is located in arch/arm/mach-socfpga/spl.c line94. It seems that re-enable ECC on OCRAM can cause some strange value changes in SRAM. Disabling ECC might also cause value changes, which I didn't test.
On a cold (re)boot sysmgr_regs->eccgrp_ocram is 0x19 (derr|serr|en). So gd keeps intact.
In our VxWorks Image ECC on OCRAM happens to be disabled. After a warm reset sysmgr_regs->eccgrp_ocram is 0x18 (derr|serr). Thus after re-enable ECC, gd->dm_root turns to 0x80 every time.
Ew, Dinh, can you check this please ? This is scary.
My solution is keeping SYSMGR_ECC_OCRAM_EN bit untouched. And it works for me.
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c index 775a827..c858406 100644 --- a/arch/arm/mach-socfpga/spl.c +++ b/arch/arm/mach-socfpga/spl.c @@ -90,12 +90,14 @@ void board_init_f(ulong dummy) * and DBE might triggered during power on */ reg = readl(&sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_SERR)
writel(SYSMGR_ECC_OCRAM_SERR | SYSMGR_ECC_OCRAM_EN,
&sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_DERR)
writel(SYSMGR_ECC_OCRAM_DERR | SYSMGR_ECC_OCRAM_EN,
&sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_SERR) {
reg &= ~SYSMGR_ECC_OCRAM_SERR;
writel(reg, &sysmgr_regs->eccgrp_ocram);
}
if (reg & SYSMGR_ECC_OCRAM_DERR) {
reg &= ~SYSMGR_ECC_OCRAM_DERR;
writel(reg, &sysmgr_regs->eccgrp_ocram);
} memset(__bss_start, 0, __bss_end - __bss_start);
Other solution:
1. Moving OCRAM ECC setting to earlier stage: requires change in generic code.
2. Clear gd afterwards: requires replication of every early stage gd setting.
What I would be worried about is that, if this really happens, it likely also corrupts the SPL code which is loaded in OCRAM.
Can't you tune the VxWorks to keep the ECC enabled ?

Hi!
On 03.09.2015 11:41, Marek Vasut wrote:
On Wednesday, September 02, 2015 at 06:27:41 PM, Jian Luo wrote:
Hi!
this error comes again. It isn't a compiler error after all. :(
JTAG inspection shows that the problem is located in arch/arm/mach-socfpga/spl.c line94. It seems that re-enable ECC on OCRAM can cause some strange value changes in SRAM. Disabling ECC might also cause value changes, which I didn't test.
On a cold (re)boot sysmgr_regs->eccgrp_ocram is 0x19 (derr|serr|en). So gd keeps intact.
In our VxWorks Image ECC on OCRAM happens to be disabled. After a warm reset sysmgr_regs->eccgrp_ocram is 0x18 (derr|serr). Thus after re-enable ECC, gd->dm_root turns to 0x80 every time.
Ew, Dinh, can you check this please ? This is scary.
You can also reproduce the problem directly in U-Boot with 2 steps:
=> mw.l 0xFFD08144 0
=> reset
On my ade0-nano-soc it hangs even before serial output, which means it did corrupt the SPL. I guess I just got (un)lucky with socdk.
My solution is keeping SYSMGR_ECC_OCRAM_EN bit untouched. And it works for me.
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c index 775a827..c858406 100644 --- a/arch/arm/mach-socfpga/spl.c +++ b/arch/arm/mach-socfpga/spl.c @@ -90,12 +90,14 @@ void board_init_f(ulong dummy) * and DBE might triggered during power on */ reg = readl(&sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_SERR)
writel(SYSMGR_ECC_OCRAM_SERR | SYSMGR_ECC_OCRAM_EN,
&sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_DERR)
writel(SYSMGR_ECC_OCRAM_DERR | SYSMGR_ECC_OCRAM_EN,
&sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_SERR) {
reg &= ~SYSMGR_ECC_OCRAM_SERR;
writel(reg, &sysmgr_regs->eccgrp_ocram);
}
if (reg & SYSMGR_ECC_OCRAM_DERR) {
reg &= ~SYSMGR_ECC_OCRAM_DERR;
writel(reg, &sysmgr_regs->eccgrp_ocram);
} memset(__bss_start, 0, __bss_end - __bss_start);
Other solution:
1. Moving OCRAM ECC setting to earlier stage: requires change in generic code.
2. Clear gd afterwards: requires replication of every early stage gd setting.
What I would be worried is that if this really happens, it likely
also corrupts
the SPL code which is loaded in OCRAM.
Can't you tune the VxWorks to keep the ECC enabled ?
Yes, I can. But U-Boot can still have problems with any other image that disables ECC. I found another post related to this problem: https://lkml.org/lkml/2015/2/6/685 . Quote: "To initialize ECC, the OCRAM needs to enable ECC then clear the entire memory to zero before using it."
Best regards,
Jian Luo
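
The sequence in that quote (enable ECC, then zero the whole memory before use) can be sketched as follows. This is a host-side model under stated assumptions, not driver code: struct fake_ocram and fake_ocram_ecc_init() are hypothetical, the 256-byte size is arbitrary, and the EN bit position is inferred from the 0x19 = derr|serr|en decoding earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed enable bit, inferred from the 0x19 = derr|serr|en decoding. */
#define ECC_OCRAM_EN	(1u << 0)

/* Hypothetical stand-in for the on-chip RAM and its ECC control register. */
struct fake_ocram {
	uint32_t ecc_ctrl;
	uint8_t mem[256];
};

/*
 * Order matters: ECC is switched on first, then every byte is rewritten,
 * so that each location is written once while ECC is active.
 */
static void fake_ocram_ecc_init(struct fake_ocram *o)
{
	assert(o != NULL);
	o->ecc_ctrl |= ECC_OCRAM_EN;
	memset(o->mem, 0, sizeof(o->mem));
}
```

The model makes the difficulty visible: the second step wipes everything stored in the RAM, which is only acceptable for code that is not itself running from it.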

On Thursday, September 03, 2015 at 12:03:44 PM, Jian Luo wrote:
Hi!
Hi!
On 03.09.2015 11:41, Marek Vasut wrote:
On Wednesday, September 02, 2015 at 06:27:41 PM, Jian Luo wrote:
Hi!
this error comes again. It isn't a compiler error after all. :(
JTAG inspection shows that the problem is located in arch/arm/mach-socfpga/spl.c line94. It seems that re-enable ECC on OCRAM can cause some strange value changes in SRAM. Disabling ECC might also cause value changes, which I didn't test.
On a cold (re)boot sysmgr_regs->eccgrp_ocram is 0x19 (derr|serr|en). So gd keeps intact.
In our VxWorks Image ECC on OCRAM happens to be disabled. After a warm reset sysmgr_regs->eccgrp_ocram is 0x18 (derr|serr). Thus after re-enable ECC, gd->dm_root turns to 0x80 every time.
Ew, Dinh, can you check this please ? This is scary.
You can also reproduce the problem directly in U-Boot with 2 steps: => mw.l 0xFFD08144 0 => reset
On my ade0-nano-soc it hangs even before serial output, which means it did corrupt the SPL. I guess I just got (un)lucky with socdk.
Oh, nice testcase, thanks!
My solution is keeping SYSMGR_ECC_OCRAM_EN bit untouched. And it works for me.
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c index 775a827..c858406 100644 --- a/arch/arm/mach-socfpga/spl.c +++ b/arch/arm/mach-socfpga/spl.c @@ -90,12 +90,14 @@ void board_init_f(ulong dummy)
* and DBE might triggered during power on */ reg = readl(&sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_SERR)
writel(SYSMGR_ECC_OCRAM_SERR | SYSMGR_ECC_OCRAM_EN,
&sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_DERR)
writel(SYSMGR_ECC_OCRAM_DERR | SYSMGR_ECC_OCRAM_EN,
&sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_SERR) {
reg &= ~SYSMGR_ECC_OCRAM_SERR;
writel(reg, &sysmgr_regs->eccgrp_ocram);
}
if (reg & SYSMGR_ECC_OCRAM_DERR) {
reg &= ~SYSMGR_ECC_OCRAM_DERR;
writel(reg, &sysmgr_regs->eccgrp_ocram);
} memset(__bss_start, 0, __bss_end - __bss_start);
Other solution:
1. Moving OCRAM ECC setting to earlier stage: requires change in generic code.
2. Clear gd afterwards: requires replication of every early stage gd setting.
What I would be worried is that if this really happens, it likely
also corrupts
the SPL code which is loaded in OCRAM.
Can't you tune the VxWorks to keep the ECC enabled ?
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 . Quote: To initialize ECC, the OCRAM needs to enable ECC then clear the entire memory to zero before using it.
Oh, but that is a problem, since we're running from the OCRAM ourselves, thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?

On 03.09.2015 12:09, Marek Vasut wrote:
On Thursday, September 03, 2015 at 12:03:44 PM, Jian Luo wrote:
Hi!
Hi!
On 03.09.2015 11:41, Marek Vasut wrote:
On Wednesday, September 02, 2015 at 06:27:41 PM, Jian Luo wrote:
Hi!
this error comes again. It isn't a compiler error after all. :(
JTAG inspection shows that the problem is located in arch/arm/mach-socfpga/spl.c line94. It seems that re-enable ECC on OCRAM can cause some strange value changes in SRAM. Disabling ECC might also cause value changes, which I didn't test.
On a cold (re)boot sysmgr_regs->eccgrp_ocram is 0x19
(derr|serr|en). So
gd keeps intact.
In our VxWorks Image ECC on OCRAM happens to be disabled. After a warm reset sysmgr_regs->eccgrp_ocram is 0x18 (derr|serr). Thus after re-enable ECC, gd->dm_root turns to 0x80 every time.
Ew, Dinh, can you check this please ? This is scary.
You can also reproduce the problem directly in U-Boot with 2 steps: => mw.l 0xFFD08144 0 => reset
On my ade0-nano-soc it hangs even before serial output, which means it did corrupt the SPL. I guess I just got (un)lucky with socdk.
Oh, nice testcase, thanks!
My solution is keeping SYSMGR_ECC_OCRAM_EN bit untouched. And it
works
for me.
diff --git a/arch/arm/mach-socfpga/spl.c
b/arch/arm/mach-socfpga/spl.c
index 775a827..c858406 100644 --- a/arch/arm/mach-socfpga/spl.c +++ b/arch/arm/mach-socfpga/spl.c @@ -90,12 +90,14 @@ void board_init_f(ulong dummy)
* and DBE might triggered during power on */ reg = readl(&sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_SERR)
writel(SYSMGR_ECC_OCRAM_SERR | SYSMGR_ECC_OCRAM_EN,
- &sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_DERR)
writel(SYSMGR_ECC_OCRAM_DERR | SYSMGR_ECC_OCRAM_EN,
- &sysmgr_regs->eccgrp_ocram);
if (reg & SYSMGR_ECC_OCRAM_SERR) {
reg &= ~SYSMGR_ECC_OCRAM_SERR;
writel(reg, &sysmgr_regs->eccgrp_ocram);
}
if (reg & SYSMGR_ECC_OCRAM_DERR) {
reg &= ~SYSMGR_ECC_OCRAM_DERR;
writel(reg, &sysmgr_regs->eccgrp_ocram);
} memset(__bss_start, 0, __bss_end - __bss_start);
Other solution:
1. Moving OCRAM ECC setting to earlier stage: requires change in generic code.
2. Clear gd afterwards: requires replication of every early stage gd setting.
What I would be worried is that if this really happens, it likely
also corrupts
the SPL code which is loaded in OCRAM.
Can't you tune the VxWorks to keep the ECC enabled ?
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 . Quote: To initialize ECC, the OCRAM needs to enable ECC then clear the entire memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM ourselves, thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I left the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
Best regards,
Jian Luo

On Thursday, September 03, 2015 at 12:17:13 PM, Jian Luo wrote:
Hi!
[...]
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 .
Quote: To initialize ECC, the OCRAM needs to enable ECC then clear
the entire
memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM ourselves, thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I let the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
OK, but what about turning the ECC off in the SPL, will that also introduce corruption or not ? That might be the right fix, no ?
Best regards, Marek Vasut

On 03.09.2015 12:46, Marek Vasut wrote:
On Thursday, September 03, 2015 at 12:17:13 PM, Jian Luo wrote:
Hi!
[...]
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 .
Quote: To initialize ECC, the OCRAM needs to enable ECC
then clear
the entire
memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM
ourselves,
thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I let the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
OK, but what about turning the ECC off in the SPL, will that also
introduce
corruption or not ? That might be the right fix, no ?
Hi Marek,
Sorry, I don't know the details of the ECC implementation in socfpga. Dinh might have the answer to that.
Anyhow, I still think leaving the setting untouched is the safest fix. SPL should use the same ECC setting that BROM used when it loaded SPL.
Best regards, Marek Vasut
Best regards,
Jian Luo

On Thursday, September 03, 2015 at 01:12:03 PM, Jian Luo wrote:
On 03.09.2015 12:46, Marek Vasut wrote:
On Thursday, September 03, 2015 at 12:17:13 PM, Jian Luo wrote:
Hi!
[...]
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 .
Quote: To initialize ECC, the OCRAM needs to enable ECC
then clear
the entire
memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM
ourselves,
thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I let the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
OK, but what about turning the ECC off in the SPL, will that also
introduce
corruption or not ? That might be the right fix, no ?
Hi Marek,
Sorry, I don't know the detail of ECC implementation in socfpga. Dinh might have the answer to that.
Anyhow I still think let the setting untouched is the safest fix. SPL should use the same ECC setting which BROM loads SPL with.
That's right, but I'd also like to have this bit in some defined state from the boot instead of having this in some random setting. Dinh, can you comment on this corruption please ?
Best regards, Marek Vasut

Hi,
On 3 September 2015 at 05:14, Marek Vasut marex@denx.de wrote:
On Thursday, September 03, 2015 at 01:12:03 PM, Jian Luo wrote:
On 03.09.2015 12:46, Marek Vasut wrote:
On Thursday, September 03, 2015 at 12:17:13 PM, Jian Luo wrote:
Hi!
[...]
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 .
Quote: To initialize ECC, the OCRAM needs to enable ECC
then clear
the entire
memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM
ourselves,
thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I let the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
OK, but what about turning the ECC off in the SPL, will that also
introduce
corruption or not ? That might be the right fix, no ?
Hi Marek,
Sorry, I don't know the detail of ECC implementation in socfpga. Dinh might have the answer to that.
Anyhow I still think let the setting untouched is the safest fix. SPL should use the same ECC setting which BROM loads SPL with.
That's right, but I'd also like to have this bit in some defined state from the boot instead of having this in some random setting. Dinh, can you comment on this corruption please ?
Also I'm still a bit confused.
The code in crt0.S zeroes global_data so how can it be non-zero a little later in board_init_f()?
Regards, Simon

Hi Simon,
On 04.09.2015 02:23, Simon Glass wrote:
Hi,
On 3 September 2015 at 05:14, Marek Vasut marex@denx.de wrote:
On Thursday, September 03, 2015 at 01:12:03 PM, Jian Luo wrote:
On 03.09.2015 12:46, Marek Vasut wrote:
On Thursday, September 03, 2015 at 12:17:13 PM, Jian Luo wrote:
Hi!
[...]
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 .
Quote: To initialize ECC, the OCRAM needs to enable ECC then clear the entire memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM ourselves, thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I let the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
OK, but what about turning the ECC off in the SPL, will that also introduce corruption or not ? That might be the right fix, no ?
Hi Marek,
Sorry, I don't know the detail of ECC implementation in socfpga. Dinh might have the answer to that.
Anyhow I still think let the setting untouched is the safest fix. SPL should use the same ECC setting which BROM loads SPL with.
That's right, but I'd also like to have this bit in some defined state from the boot instead of having this in some random setting. Dinh, can you comment on this corruption please ?
Also I'm still a bit confused.
The code in crt0.S zeroes global_data so how can it be non-zero a little later in board_init_f()?
board_init_f() enables the ECC of the SRAM regardless of its previous state. If ECC is disabled beforehand, re-enabling it can cause SRAM misreading.
Best regards,
Jian Luo
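
For illustration, the idea of leaving the enable bit untouched can be sketched in C as below. This is a minimal model and not the actual socfpga code: the bit positions, the SERR/DERR macro names, the write-one-to-clear behaviour and the helper ecc_ocram_keep_enable() are all assumptions made for the example, and the register value is passed in as a plain integer instead of being read from hardware.

```c
#include <stdint.h>

/* Assumed bit layout, chosen for illustration only. */
#define SYSMGR_ECC_OCRAM_EN   (1u << 0) /* ECC enable */
#define SYSMGR_ECC_OCRAM_SERR (1u << 3) /* single-bit error flag (assumed write 1 to clear) */
#define SYSMGR_ECC_OCRAM_DERR (1u << 4) /* double-bit error flag (assumed write 1 to clear) */

/*
 * Hypothetical helper: compute the value to write back so that stale
 * error flags are cleared while the enable bit keeps whatever state
 * the BROM (or a previous image) left it in.
 */
static uint32_t ecc_ocram_keep_enable(uint32_t current)
{
	uint32_t keep = current & SYSMGR_ECC_OCRAM_EN;

	return keep | SYSMGR_ECC_OCRAM_SERR | SYSMGR_ECC_OCRAM_DERR;
}
```

On real hardware the input would come from a read of the ECC control register and the result would be written back to it; whether the status bits really are write-one-to-clear should be checked against the SoC manual.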

Hi,
On 4 September 2015 at 01:36, Jian Luo Jian.Luo4@boschrexroth.de wrote:
Hi Simon,
On 04.09.2015 02:23, Simon Glass wrote:
Hi,
On 3 September 2015 at 05:14, Marek Vasut marex@denx.de wrote:
On Thursday, September 03, 2015 at 01:12:03 PM, Jian Luo wrote:
On 03.09.2015 12:46, Marek Vasut wrote:
On Thursday, September 03, 2015 at 12:17:13 PM, Jian Luo wrote:
Hi!
[...]
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 .
Quote: To initialize ECC, the OCRAM needs to enable ECC then clear the entire memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM ourselves, thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I let the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
OK, but what about turning the ECC off in the SPL, will that also introduce corruption or not ? That might be the right fix, no ?
Hi Marek,
Sorry, I don't know the detail of ECC implementation in socfpga. Dinh might have the answer to that.
Anyhow I still think let the setting untouched is the safest fix. SPL should use the same ECC setting which BROM loads SPL with.
That's right, but I'd also like to have this bit in some defined state from the boot instead of having this in some random setting. Dinh, can you comment on this corruption please ?
Also I'm still a bit confused.
The code in crt0.S zeroes global_data so how can it be non-zero a little later in board_init_f()?
board_init_f() enables the ECC of the SRAM regardless of its previous state. If ECC is disabled beforehand, re-enabling it can cause SRAM misreading.
OK thanks. It might be possible to do this earlier, say in cpu_init_crit().
Regards, Simon

On Friday, September 04, 2015 at 04:16:21 PM, Simon Glass wrote:
Hi,
On 4 September 2015 at 01:36, Jian Luo Jian.Luo4@boschrexroth.de wrote:
Hi Simon,
On 04.09.2015 02:23, Simon Glass wrote:
Hi,
On 3 September 2015 at 05:14, Marek Vasut marex@denx.de wrote:
On Thursday, September 03, 2015 at 01:12:03 PM, Jian Luo wrote:
On 03.09.2015 12:46, Marek Vasut wrote:
On Thursday, September 03, 2015 at 12:17:13 PM, Jian Luo wrote:
Hi!
[...]
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 .
Quote: To initialize ECC, the OCRAM needs to enable ECC then clear the entire memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM ourselves, thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I let the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
OK, but what about turning the ECC off in the SPL, will that also introduce corruption or not ? That might be the right fix, no ?
Hi Marek,
Sorry, I don't know the detail of ECC implementation in socfpga. Dinh might have the answer to that.
Anyhow I still think let the setting untouched is the safest fix. SPL should use the same ECC setting which BROM loads SPL with.
That's right, but I'd also like to have this bit in some defined state from the boot instead of having this in some random setting. Dinh, can you comment on this corruption please ?
Also I'm still a bit confused.
The code in crt0.S zeroes global_data so how can it be non-zero a little later in board_init_f()?
board_init_f() enables the ECC of the SRAM regardless of its previous state. If ECC is disabled beforehand, re-enabling it can cause SRAM misreading.
OK thanks. It might be possible to do this earlier, say in cpu_init_crit().
You cannot enable the bit, because it'd corrupt your OCRAM and your code is running from the OCRAM, thus you'd be susceptible to corrupting the code you're running itself :(

On 4 September 2015 at 08:25, Marek Vasut marex@denx.de wrote:
On Friday, September 04, 2015 at 04:16:21 PM, Simon Glass wrote:
Hi,
On 4 September 2015 at 01:36, Jian Luo Jian.Luo4@boschrexroth.de wrote:
Hi Simon,
On 04.09.2015 02:23, Simon Glass wrote:
Hi,
On 3 September 2015 at 05:14, Marek Vasut marex@denx.de wrote:
On Thursday, September 03, 2015 at 01:12:03 PM, Jian Luo wrote:
On 03.09.2015 12:46, Marek Vasut wrote:
On Thursday, September 03, 2015 at 12:17:13 PM, Jian Luo wrote:
Hi!
[...]
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 .
Quote: To initialize ECC, the OCRAM needs to enable ECC then clear the entire memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM ourselves, thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I let the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
OK, but what about turning the ECC off in the SPL, will that also introduce corruption or not ? That might be the right fix, no ?
Hi Marek,
Sorry, I don't know the detail of ECC implementation in socfpga. Dinh might have the answer to that.
Anyhow I still think let the setting untouched is the safest fix. SPL should use the same ECC setting which BROM loads SPL with.
That's right, but I'd also like to have this bit in some defined state from the boot instead of having this in some random setting. Dinh, can you comment on this corruption please ?
Also I'm still a bit confused.
The code in crt0.S zeroes global_data so how can it be non-zero a little later in board_init_f()?
board_init_f() enables the ECC of the SRAM regardless of its previous state. If ECC is disabled beforehand, re-enabling it can cause SRAM misreading.
OK thanks. It might be possible to do this earlier, say in cpu_init_crit().
You cannot enable the bit, because it'd corrupt your OCRAM and your code is running from the OCRAM, thus you'd be susceptible to corrupting the code you're running itself :(
Oh joy. Well anyway I think this is a chip-specific problem and there is nothing wrong in general with the current init process.
Regards, Simon

On Friday, September 04, 2015 at 04:26:46 PM, Simon Glass wrote:
On 4 September 2015 at 08:25, Marek Vasut marex@denx.de wrote:
On Friday, September 04, 2015 at 04:16:21 PM, Simon Glass wrote:
Hi,
On 4 September 2015 at 01:36, Jian Luo Jian.Luo4@boschrexroth.de wrote:
Hi Simon,
On 04.09.2015 02:23, Simon Glass wrote:
Hi,
On 3 September 2015 at 05:14, Marek Vasut marex@denx.de wrote:
On Thursday, September 03, 2015 at 01:12:03 PM, Jian Luo wrote:
On 03.09.2015 12:46, Marek Vasut wrote:
On Thursday, September 03, 2015 at 12:17:13 PM, Jian Luo wrote:
Hi!
[...]
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 .
Quote: To initialize ECC, the OCRAM needs to enable ECC then clear the entire memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM ourselves, thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I let the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
OK, but what about turning the ECC off in the SPL, will that also introduce corruption or not ? That might be the right fix, no ?
Hi Marek,
Sorry, I don't know the detail of ECC implementation in socfpga. Dinh might have the answer to that.
Anyhow I still think let the setting untouched is the safest fix. SPL should use the same ECC setting which BROM loads SPL with.
That's right, but I'd also like to have this bit in some defined state from the boot instead of having this in some random setting. Dinh, can you comment on this corruption please ?
Also I'm still a bit confused.
The code in crt0.S zeroes global_data so how can it be non-zero a little later in board_init_f()?
board_init_f() enables the ECC of the SRAM regardless of its previous state. If ECC is disabled beforehand, re-enabling it can cause SRAM misreading.
OK thanks. It might be possible to do this earlier, say in cpu_init_crit().
You cannot enable the bit, because it'd corrupt your OCRAM and your code is running from the OCRAM, thus you'd be susceptible to corrupting the code you're running itself :(
Oh joy. Well anyway I think this is a chip-specific problem and there is nothing wrong in general with the current init process.
That's absolutely correct.
I'd like to hear Dinh's opinion on this, because this seems quite important.

Hi Dinh, Hi Marek,
any updates on this issue?
Best regards,
Jian Luo
On 04.09.2015 16:32, Marek Vasut wrote:
On Friday, September 04, 2015 at 04:26:46 PM, Simon Glass wrote:
On 4 September 2015 at 08:25, Marek Vasut marex@denx.de wrote:
On Friday, September 04, 2015 at 04:16:21 PM, Simon Glass wrote:
Hi,
On 4 September 2015 at 01:36, Jian Luo Jian.Luo4@boschrexroth.de wrote:
Hi Simon,
On 04.09.2015 02:23, Simon Glass wrote:
Hi,
On 3 September 2015 at 05:14, Marek Vasut marex@denx.de wrote:
On Thursday, September 03, 2015 at 01:12:03 PM, Jian Luo wrote:
On 03.09.2015 12:46, Marek Vasut wrote:
On Thursday, September 03, 2015 at 12:17:13 PM, Jian Luo wrote:
Hi!
[...]
Yes, I can. But U-Boot can still have problem with other Image which disables ECC. I found another post related to this problem https://lkml.org/lkml/2015/2/6/685 .
Quote: To initialize ECC, the OCRAM needs to enable ECC then clear the entire memory to zero before using it.
Hi!
Oh, but that is a problem, since we're running from the OCRAM ourselves, thus we cannot clear the OCRAM. Maybe we should force-disable the ECC instead? But can we be sure that the corruption does not happen when you disable ECC ?
Yes, that will be a problem. It's also why I let the SYSMGR_ECC_OCRAM_EN bit intact in the patch.
OK, but what about turning the ECC off in the SPL, will that also introduce corruption or not ? That might be the right fix, no ?
Hi Marek,
Sorry, I don't know the detail of ECC implementation in socfpga. Dinh might have the answer to that.
Anyhow I still think let the setting untouched is the safest fix. SPL should use the same ECC setting which BROM loads SPL with.
That's right, but I'd also like to have this bit in some defined state from the boot instead of having this in some random setting. Dinh, can you comment on this corruption please ?
Also I'm still a bit confused.
The code in crt0.S zeroes global_data so how can it be non-zero a little later in board_init_f()?
board_init_f() enables the ECC of the SRAM regardless of its previous state. If ECC is disabled beforehand, re-enabling it can cause SRAM misreading.
OK thanks. It might be possible to do this earlier, say in cpu_init_crit().
You cannot enable the bit, because it'd corrupt your OCRAM and your code is running from the OCRAM, thus you'd be susceptible to corrupting the code you're running itself :(
Oh joy. Well anyway I think this is a chip-specific problem and there is nothing wrong in general with the current init process.
That's absolutely correct.
I'd like to hear Dinh's opinion on this, because this seems quite important.

Hi,
On 28 August 2015 at 02:41, Jian Luo Jian.Luo4@boschrexroth.de wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
---
arch/arm/mach-socfpga/spl.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c
index 13ec24b..59fe1f2 100644
--- a/arch/arm/mach-socfpga/spl.c
+++ b/arch/arm/mach-socfpga/spl.c
@@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
 	/* Configure simple malloc base pointer into RAM. */
 	gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);

+	/*
+	 * gd->dm_root might contain non-zero value after warm reset.
+	 * Clear it to avoid dm_init error
+	 */
+	gd->dm_root = NULL;
+
 	board_init_r(NULL, 0);
 }
--
1.9.1
This does not look like the root cause to me. global_data is zeroed by crt0.S if CONFIG_SPL_FRAMEWORK is set, which it seems to be for socfpga.
What boot path does 'warm reset' take?
Also BTW it would be better if board_init_f() returned rather than calling board_init_r() directly.
Regards, Simon
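
To make the argument above concrete, here is a small C model of why a zeroed global_data should be enough on its own. It is only a sketch under stated assumptions: struct model_gd, model_clear_gd() and model_dm_init() are hypothetical stand-ins, and the -22 return mirrors the error value mentioned earlier in the thread instead of quoting the real dm_init() code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for U-Boot's global_data. */
struct model_gd {
	void *dm_root;
	unsigned long malloc_base;
};

/* Models the clearing step that crt0.S is said to perform before board_init_f(). */
static void model_clear_gd(struct model_gd *gd)
{
	memset(gd, 0, sizeof(*gd));
}

/* Models a DM init that refuses to run when a root already appears to exist. */
static int model_dm_init(struct model_gd *gd)
{
	static int fake_root; /* placeholder object standing in for the root device */

	if (gd->dm_root)
		return -22; /* error value mentioned in the thread (-EINVAL) */

	gd->dm_root = &fake_root;
	return 0;
}
```

In this model a stale dm_root makes model_dm_init() fail, and calling model_clear_gd() first makes it succeed. If the real clearing step runs and dm_root is still non-zero afterwards, something between the two steps must be disturbing the memory contents.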

On Saturday, August 29, 2015 at 01:21:31 AM, Simon Glass wrote:
Hi,
On 28 August 2015 at 02:41, Jian Luo Jian.Luo4@boschrexroth.de wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
---
arch/arm/mach-socfpga/spl.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c
index 13ec24b..59fe1f2 100644
--- a/arch/arm/mach-socfpga/spl.c
+++ b/arch/arm/mach-socfpga/spl.c
@@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
 	/* Configure simple malloc base pointer into RAM. */
 	gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);

+	/*
+	 * gd->dm_root might contain non-zero value after warm reset.
+	 * Clear it to avoid dm_init error
+	 */
+	gd->dm_root = NULL;
+
 	board_init_r(NULL, 0);
 }
--
1.9.1
This does not look like the root cause to me. global_data is zeroed by crt0.S if CONFIG_SPL_FRAMEWORK is set, which it seems to be for socfpga.
What boot path does 'warm reset' take?
Warm reset resets the CPU core(s) and jumps to 0x0 in SRAM (without re-reading anything from the boot media).
Also BTW it would be better if board_init_f() returned rather than calling board_init_r() directly.
I'm all for it, it'd trim down the stack utilisation slightly too.
Best regards, Marek Vasut
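
The stack remark can be illustrated with a toy C model that counts call nesting. This is a sketch only: every function name here is a hypothetical stand-in, the real board_init_r() does far more work and does not return, and "depth" is a crude proxy for stack usage.

```c
/* Nesting counters used as a rough proxy for stack depth. */
static int depth, max_depth;

static void enter(void)
{
	if (++depth > max_depth)
		max_depth = depth;
}

static void leave(void)
{
	depth--;
}

/* Stand-in for board_init_r(); modelled as returning so it can be measured. */
static void model_board_init_r(void)
{
	enter();
	leave();
}

/* Variant 1: board_init_f() calls board_init_r() itself, so its frame stays live. */
static void model_board_init_f_nested(void)
{
	enter();
	model_board_init_r();
	leave();
}

/* Variant 2: board_init_f() simply returns to its caller. */
static void model_board_init_f_returning(void)
{
	enter();
	leave();
}

static int run_nested(void)
{
	depth = max_depth = 0;
	model_board_init_f_nested();
	return max_depth;
}

static int run_returning(void)
{
	depth = max_depth = 0;
	model_board_init_f_returning();
	model_board_init_r(); /* the caller invokes the next stage afterwards */
	return max_depth;
}
```

In this model run_nested() reports a deeper maximum nesting than run_returning(), which is the small saving referred to above.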

Hi Marek,
On 29 August 2015 at 01:56, Marek Vasut marex@denx.de wrote:
On Saturday, August 29, 2015 at 01:21:31 AM, Simon Glass wrote:
Hi,
On 28 August 2015 at 02:41, Jian Luo Jian.Luo4@boschrexroth.de wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
---
arch/arm/mach-socfpga/spl.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c
index 13ec24b..59fe1f2 100644
--- a/arch/arm/mach-socfpga/spl.c
+++ b/arch/arm/mach-socfpga/spl.c
@@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
 	/* Configure simple malloc base pointer into RAM. */
 	gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);

+	/*
+	 * gd->dm_root might contain non-zero value after warm reset.
+	 * Clear it to avoid dm_init error
+	 */
+	gd->dm_root = NULL;
+
 	board_init_r(NULL, 0);
 }
--
1.9.1
This does not look like the root cause to me. global_data is zeroed by crt0.S if CONFIG_SPL_FRAMEWORK is set, which it seems to be for socfpga.
What boot path does 'warm reset' take?
Warm reset resets the CPU core(s) and jumps to 0x0 in SRAM (without re-reading anything from the boot media).
Does that mean it skips crt0.S? How come global_data is not zeroed there?
Also BTW it would be better if board_init_f() returned rather than calling board_init_r() directly.
I'm all for it, it'd trim down the stack utilisation slightly too.
Sound good.
Regards, Simon

On Saturday, August 29, 2015 at 04:39:43 PM, Simon Glass wrote:
Hi Marek,
On 29 August 2015 at 01:56, Marek Vasut marex@denx.de wrote:
On Saturday, August 29, 2015 at 01:21:31 AM, Simon Glass wrote:
Hi,
On 28 August 2015 at 02:41, Jian Luo Jian.Luo4@boschrexroth.de wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
---
arch/arm/mach-socfpga/spl.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c
index 13ec24b..59fe1f2 100644
--- a/arch/arm/mach-socfpga/spl.c
+++ b/arch/arm/mach-socfpga/spl.c
@@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
 	/* Configure simple malloc base pointer into RAM. */
 	gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);

+	/*
+	 * gd->dm_root might contain non-zero value after warm reset.
+	 * Clear it to avoid dm_init error
+	 */
+	gd->dm_root = NULL;
+
 	board_init_r(NULL, 0);
 }
--
1.9.1
This does not look like the root cause to me. global_data is zeroed by crt0.S if CONFIG_SPL_FRAMEWORK is set, which it seems to be for socfpga.
What boot path does 'warm reset' take?
Warm reset resets the CPU core(s) and jumps to 0x0 in SRAM (without re-reading anything from the boot media).
Does that mean it skips crt0.S? How come global_data is not zeroed there?
No, it does not mean it skips crt0.S . After the warm reset, the bootrom jumps onto the reset vector, so crt0.S (_main) must be executed.
Also BTW it would be better if board_init_f() returned rather than calling board_init_r() directly.
I'm all for it, it'd trim down the stack utilisation slightly too.
Sound good.
Done ;-)
Best regards, Marek Vasut

Hi Marek,
On 29 August 2015 at 08:46, Marek Vasut marex@denx.de wrote:
On Saturday, August 29, 2015 at 04:39:43 PM, Simon Glass wrote:
Hi Marek,
On 29 August 2015 at 01:56, Marek Vasut marex@denx.de wrote:
On Saturday, August 29, 2015 at 01:21:31 AM, Simon Glass wrote:
Hi,
On 28 August 2015 at 02:41, Jian Luo Jian.Luo4@boschrexroth.de wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
---
arch/arm/mach-socfpga/spl.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c
index 13ec24b..59fe1f2 100644
--- a/arch/arm/mach-socfpga/spl.c
+++ b/arch/arm/mach-socfpga/spl.c
@@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
 	/* Configure simple malloc base pointer into RAM. */
 	gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);

+	/*
+	 * gd->dm_root might contain non-zero value after warm reset.
+	 * Clear it to avoid dm_init error
+	 */
+	gd->dm_root = NULL;
+
 	board_init_r(NULL, 0);
 }
--
1.9.1
This does not look like the root cause to me. global_data is zeroed by crt0.S if CONFIG_SPL_FRAMEWORK is set, which it seems to be for socfpga.
What boot path does 'warm reset' take?
Warm reset resets the CPU core(s) and jumps to 0x0 in SRAM (without re-reading anything from the boot media).
Does that mean it skips crt0.S? How come global_data is not zeroed there?
No, it does not mean it skips crt0.S . After the warm reset, the bootrom jumps onto the reset vector, so crt0.S (_main) must be executed.
Then I don't understand the need for this patch.
Also BTW it would be better if board_init_f() returned rather than calling board_init_r() directly.
I'm all for it, it'd trim down the stack utilisation slightly too.
Sound good.
Done ;-)
:-)
Best regards, Marek Vasut
Regards, Simon

On Saturday, August 29, 2015 at 04:49:37 PM, Simon Glass wrote:
Hi Marek,
On 29 August 2015 at 08:46, Marek Vasut marex@denx.de wrote:
On Saturday, August 29, 2015 at 04:39:43 PM, Simon Glass wrote:
Hi Marek,
On 29 August 2015 at 01:56, Marek Vasut marex@denx.de wrote:
On Saturday, August 29, 2015 at 01:21:31 AM, Simon Glass wrote:
Hi,
On 28 August 2015 at 02:41, Jian Luo Jian.Luo4@boschrexroth.de wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
---
arch/arm/mach-socfpga/spl.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c
index 13ec24b..59fe1f2 100644
--- a/arch/arm/mach-socfpga/spl.c
+++ b/arch/arm/mach-socfpga/spl.c
@@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
 	/* Configure simple malloc base pointer into RAM. */
 	gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);

+	/*
+	 * gd->dm_root might contain non-zero value after warm reset.
+	 * Clear it to avoid dm_init error
+	 */
+	gd->dm_root = NULL;
+
 	board_init_r(NULL, 0);
 }
--
1.9.1
This does not look like the root cause to me. global_data is zeroed by crt0.S if CONFIG_SPL_FRAMEWORK is set, which it seems to be for socfpga.
What boot path does 'warm reset' take?
Warm reset resets the CPU core(s) and jumps to 0x0 in SRAM (without re-reading anything from the boot media).
Does that mean it skips crt0.S? How come global_data is not zeroed there?
No, it does not mean it skips crt0.S . After the warm reset, the bootrom jumps onto the reset vector, so crt0.S (_main) must be executed.
Then I don't understand the need for this patch.
Apparently, the gd->dm_root is set to a non-NULL address for some (unknown) reason. I don't quite understand this myself.
The only possibility which can lead to gd->dm_root being set to non-NULL address is that crt0.S _main is not executed, is that correct ?

Hi Marek,
On 29 August 2015 at 09:45, Marek Vasut marex@denx.de wrote:
On Saturday, August 29, 2015 at 04:49:37 PM, Simon Glass wrote:
Hi Marek,
On 29 August 2015 at 08:46, Marek Vasut marex@denx.de wrote:
On Saturday, August 29, 2015 at 04:39:43 PM, Simon Glass wrote:
Hi Marek,
On 29 August 2015 at 01:56, Marek Vasut marex@denx.de wrote:
On Saturday, August 29, 2015 at 01:21:31 AM, Simon Glass wrote:
Hi,
On 28 August 2015 at 02:41, Jian Luo Jian.Luo4@boschrexroth.de wrote:
gd->dm_root is not cleared in SPL after warm reset. This might cause DM initialization failure.
Signed-off-by: Jian Luo jian.luo4@boschrexroth.de
---
arch/arm/mach-socfpga/spl.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/arm/mach-socfpga/spl.c b/arch/arm/mach-socfpga/spl.c
index 13ec24b..59fe1f2 100644
--- a/arch/arm/mach-socfpga/spl.c
+++ b/arch/arm/mach-socfpga/spl.c
@@ -181,5 +181,11 @@ void board_init_f(ulong dummy)
 	/* Configure simple malloc base pointer into RAM. */
 	gd->malloc_base = CONFIG_SYS_TEXT_BASE + (1024 * 1024);

+	/*
+	 * gd->dm_root might contain non-zero value after warm reset.
+	 * Clear it to avoid dm_init error
+	 */
+	gd->dm_root = NULL;
+
 	board_init_r(NULL, 0);
 }
--
1.9.1
This does not look like the root cause to me. global_data is zeroed by crt0.S if CONFIG_SPL_FRAMEWORK is set, which it seems to be for socfpga.
What boot path does 'warm reset' take?
Warm reset resets the CPU core(s) and jumps to 0x0 in SRAM (without re-reading anything from the boot media).
Does that mean it skips crt0.S? How come global_data is not zeroed there?
No, it does not mean it skips crt0.S . After the warm reset, the bootrom jumps onto the reset vector, so crt0.S (_main) must be executed.
Then I don't understand the need for this patch.
Apparently, the gd->dm_root is set to a non-NULL address for some (unknown) reason. I don't quite understand this myself.
The only possibility which can lead to gd->dm_root being set to non-NULL address is that crt0.S _main is not executed, is that correct ?
I think so, unless driver model was already inited by a call to spl_init(). But I don't see where your board might do that.
Regards, Simon

On Saturday, August 29, 2015 at 06:54:54 PM, Simon Glass wrote:
Hi Marek,
Hi Simon,
[...]
Does that mean it skips crt0.S? How come global_data is not zeroed there?
No, it does not mean it skips crt0.S . After the warm reset, the bootrom jumps onto the reset vector, so crt0.S (_main) must be executed.
Then I don't understand the need for this patch.
Apparently, the gd->dm_root is set to a non-NULL address for some (unknown) reason. I don't quite understand this myself.
The only possibility which can lead to gd->dm_root being set to non-NULL address is that crt0.S _main is not executed, is that correct ?
I think so, unless driver model was already inited by a call to spl_init(). But I don't see where your board might do that.
That's correct.
Best regards, Marek Vasut
participants (3)
- Jian Luo
- Marek Vasut
- Simon Glass