[U-Boot] x86 FSP - delayed SDRAM init?

Hi Bin,
In the Baytrail FSP docs I see a note about the HOB passing back the 'Boot Loader Temporary Memory Data HOB'. This seems to be a copy of the entire temporary memory space. I wonder if we could recover struct global_data from this?
If so, then we could move the fsp_init stuff to dram_init(), perhaps?
But perhaps this is a feature of only this FSP?
Regards, Simon

Hi Simon,
On Thu, Jan 22, 2015 at 11:42 AM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
In the Baytrail FSP docs I see a note about the HOB passing back the 'Boot Loader Temporary Memory Data HOB'. This seems to be a copy of the entire temporary memory space. I wonder if we could recover struct global_data from this?
Yes, I think so. And I have verified this temporary memory space indeed contains stack contents before fsp_init() from U-Boot shell on Crown Bay. But the overall process might be complicated. See below.
If so, then we could move the fsp_init stuff to dram_init(), perhaps?
But perhaps this is a feature of only this FSP?
I believe this is a feature defined by the FSP architecture spec, so every FSP should support that.
Technically it should be no problem to call fsp_init() from arch_cpu_init() or even later dram_init(). However, I would say doing so brings us more harm than good. The following points are what I thought about before:
1). fsp_init() takes one parameter 'stack_top' to setup another stack after DRAM is initialized. This means everything on the previous CAR stack will need to migrate to the new stack below 'stack_top'. This includes global data, early malloc pointers, arch_cpu_init() stack variables and its return address. 2). Copy previous global_data to the new places under stack_top, and fix up gd->arch.gd_addr. 3). The initf_malloc() is called before arch_cpu_init() so we need fix up the early malloc pointers manually (gd->malloc_base and gd->malloc_limit) 4). Fix up the stack variables and return address of arch_cpu_init() on the new stack. 5). On Tunnel Creek, if we call setup_gdt() in start.S, later fsp_init() in arch_cpu_init() will fail to bring up the thread 1 (Tunnel Creek supports SMT), the reason of which is unknown to me yet (FSP is a black box). It looks to me that FSP is assuming GDT only contains two entries (32-bit 4G flat address) before calling fsp_init().
I have not looked into this any further, so the above points might not be 100% right. I would say with these modifications, the codes are more difficult to understand.
Regards, Bin

Hi Bin,
On 21 January 2015 at 21:45, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Thu, Jan 22, 2015 at 11:42 AM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
In the Baytrail FSP docs I see a note about the HOB passing back the 'Boot Loader Temporary Memory Data HOB'. This seems to be a copy of the entire temporary memory space. I wonder if we could recover struct global_data from this?
Yes, I think so. And I have verified this temporary memory space indeed contains stack contents before fsp_init() from U-Boot shell on Crown Bay. But the overall process might be complicated. See below.
If so, then we could move the fsp_init stuff to dram_init(), perhaps?
But perhaps this is a feature of only this FSP?
I believe this is a feature defined by the FSP architecture spec, so every FSP should support that.
Technically it should be no problem to call fsp_init() from arch_cpu_init() or even later dram_init(). However, I would say doing so brings us more harm than good. The following points are what I thought about before:
1). fsp_init() takes one parameter 'stack_top' to setup another stack after DRAM is initialized. This means everything on the previous CAR stack will need to migrate to the new stack below 'stack_top'. This includes global data, early malloc pointers, arch_cpu_init() stack variables and its return address. 2). Copy previous global_data to the new places under stack_top, and fix up gd->arch.gd_addr. 3). The initf_malloc() is called before arch_cpu_init() so we need fix up the early malloc pointers manually (gd->malloc_base and gd->malloc_limit) 4). Fix up the stack variables and return address of arch_cpu_init() on the new stack. 5). On Tunnel Creek, if we call setup_gdt() in start.S, later fsp_init() in arch_cpu_init() will fail to bring up the thread 1 (Tunnel Creek supports SMT), the reason of which is unknown to me yet (FSP is a black box). It looks to me that FSP is assuming GDT only contains two entries (32-bit 4G flat address) before calling fsp_init().
I have not looked into this any further, so the above points might not be 100% right. I would say with these modifications, the codes are more difficult to understand.
I don't disagree with any of this, but I've had a bit more of a look.
How about something awful like this:
- start.S - do the initram thing - call board_init_f() - there is no hob pointer, so CPU init does very little - it runs only to dram_init() - dram_init() calls the fsp_init() and ends up back at another function - that function sets up global_data again, calls board_init_f() again - passes a boot flag to suppress the banner the second time - this time there is a hob pointer, so CPU init can complete - things boot normally
It is hideous. The question is which way is worse.
What we really need is to split fsp_init() into two parts:
1. Set up DRAM 2. Turn off CAR (called when WE decide we no-longer need it)
Then we could make this work as with the non-FSP mode.
By joining the two, they have made it even harder than it should be. Of course, no one wins with binary blobs, but the 'continuation function' should have been a red flag when this design was done at Intel.
BTW for me on minnowmax fsp_init() hangs. Since it is a binary blob I am not sure how to debug it. There is no error printed / returned. I'm not even sure how to make it output debug info. No post codes are available. Sigh.
Regards, Simon

Hi Simon,
On Thu, Jan 22, 2015 at 1:02 PM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
On 21 January 2015 at 21:45, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Thu, Jan 22, 2015 at 11:42 AM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
In the Baytrail FSP docs I see a note about the HOB passing back the 'Boot Loader Temporary Memory Data HOB'. This seems to be a copy of the entire temporary memory space. I wonder if we could recover struct global_data from this?
Yes, I think so. And I have verified this temporary memory space indeed contains stack contents before fsp_init() from U-Boot shell on Crown Bay. But the overall process might be complicated. See below.
If so, then we could move the fsp_init stuff to dram_init(), perhaps?
But perhaps this is a feature of only this FSP?
I believe this is a feature defined by the FSP architecture spec, so every FSP should support that.
Technically it should be no problem to call fsp_init() from arch_cpu_init() or even later dram_init(). However, I would say doing so brings us more harm than good. The following points are what I thought about before:
1). fsp_init() takes one parameter 'stack_top' to setup another stack after DRAM is initialized. This means everything on the previous CAR stack will need to migrate to the new stack below 'stack_top'. This includes global data, early malloc pointers, arch_cpu_init() stack variables and its return address. 2). Copy previous global_data to the new places under stack_top, and fix up gd->arch.gd_addr. 3). The initf_malloc() is called before arch_cpu_init() so we need fix up the early malloc pointers manually (gd->malloc_base and gd->malloc_limit) 4). Fix up the stack variables and return address of arch_cpu_init() on the new stack. 5). On Tunnel Creek, if we call setup_gdt() in start.S, later fsp_init() in arch_cpu_init() will fail to bring up the thread 1 (Tunnel Creek supports SMT), the reason of which is unknown to me yet (FSP is a black box). It looks to me that FSP is assuming GDT only contains two entries (32-bit 4G flat address) before calling fsp_init().
I have not looked into this any further, so the above points might not be 100% right. I would say with these modifications, the codes are more difficult to understand.
I don't disagree with any of this, but I've had a bit more of a look.
How about something awful like this:
- start.S
- do the initram thing
- call board_init_f()
- there is no hob pointer, so CPU init does very little
- it runs only to dram_init()
- dram_init() calls the fsp_init() and ends up back at another function
- that function sets up global_data again, calls board_init_f() again
- passes a boot flag to suppress the banner the second time
- this time there is a hob pointer, so CPU init can complete
- things boot normally
It is hideous. The question is which way is worse.
So this way we avoid migrating the stack.
What we really need is to split fsp_init() into two parts:
- Set up DRAM
- Turn off CAR (called when WE decide we no-longer need it)
Then we could make this work as with the non-FSP mode.
By joining the two, they have made it even harder than it should be. Of course, no one wins with binary blobs, but the 'continuation function' should have been a red flag when this design was done at Intel.
I agree. I wish the design done at Intel should have considered more use cases about the FSP integration into existing bootloaders. Even in coreboot, the FSP integration is not perfectly fit into the coreboot model.
BTW for me on minnowmax fsp_init() hangs. Since it is a binary blob I am not sure how to debug it. There is no error printed / returned. I'm not even sure how to make it output debug info. No post codes are available. Sigh.
I see coreboot has the minnow max board support already with Intel FSP. You can generate a coreboot for minnow max and program it onto the board to see how things go.
Does this link help? http://review.coreboot.org/gitweb?p=coreboot.git;a=commitdiff;h=e6df041b8bf8.... It looks that the BayTrail FSP has several parameters configurable for DRAM.
Here are the required changes for modifying the FSP manually: Enable Memory Down: Enabled DRAM Speed: 1066 MHz DIMM_DWidth: x16 DIMM_Density: 4 Gbit (2GB Minnow Max) / 2 Gbit (1GB Minnow Max) tCL: 7 tRP_tRCD: 7 tWR: 8 tRRD: 6 tRTP: 4 tFAW: 27 Other FSP values can remain the same.
Regards, Bin

Hi Bin,
On 21 January 2015 at 22:39, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Thu, Jan 22, 2015 at 1:02 PM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
On 21 January 2015 at 21:45, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Thu, Jan 22, 2015 at 11:42 AM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
In the Baytrail FSP docs I see a note about the HOB passing back the 'Boot Loader Temporary Memory Data HOB'. This seems to be a copy of the entire temporary memory space. I wonder if we could recover struct global_data from this?
Yes, I think so. And I have verified this temporary memory space indeed contains stack contents before fsp_init() from U-Boot shell on Crown Bay. But the overall process might be complicated. See below.
If so, then we could move the fsp_init stuff to dram_init(), perhaps?
But perhaps this is a feature of only this FSP?
I believe this is a feature defined by the FSP architecture spec, so every FSP should support that.
Technically it should be no problem to call fsp_init() from arch_cpu_init() or even later dram_init(). However, I would say doing so brings us more harm than good. The following points are what I thought about before:
1). fsp_init() takes one parameter 'stack_top' to setup another stack after DRAM is initialized. This means everything on the previous CAR stack will need to migrate to the new stack below 'stack_top'. This includes global data, early malloc pointers, arch_cpu_init() stack variables and its return address. 2). Copy previous global_data to the new places under stack_top, and fix up gd->arch.gd_addr. 3). The initf_malloc() is called before arch_cpu_init() so we need fix up the early malloc pointers manually (gd->malloc_base and gd->malloc_limit) 4). Fix up the stack variables and return address of arch_cpu_init() on the new stack. 5). On Tunnel Creek, if we call setup_gdt() in start.S, later fsp_init() in arch_cpu_init() will fail to bring up the thread 1 (Tunnel Creek supports SMT), the reason of which is unknown to me yet (FSP is a black box). It looks to me that FSP is assuming GDT only contains two entries (32-bit 4G flat address) before calling fsp_init().
I have not looked into this any further, so the above points might not be 100% right. I would say with these modifications, the codes are more difficult to understand.
I don't disagree with any of this, but I've had a bit more of a look.
How about something awful like this:
- start.S
- do the initram thing
- call board_init_f()
- there is no hob pointer, so CPU init does very little
- it runs only to dram_init()
- dram_init() calls the fsp_init() and ends up back at another function
- that function sets up global_data again, calls board_init_f() again
- passes a boot flag to suppress the banner the second time
- this time there is a hob pointer, so CPU init can complete
- things boot normally
It is hideous. The question is which way is worse.
So this way we avoid migrating the stack.
What we really need is to split fsp_init() into two parts:
- Set up DRAM
- Turn off CAR (called when WE decide we no-longer need it)
Then we could make this work as with the non-FSP mode.
By joining the two, they have made it even harder than it should be. Of course, no one wins with binary blobs, but the 'continuation function' should have been a red flag when this design was done at Intel.
I agree. I wish the design done at Intel should have considered more use cases about the FSP integration into existing bootloaders. Even in coreboot, the FSP integration is not perfectly fit into the coreboot model.
BTW for me on minnowmax fsp_init() hangs. Since it is a binary blob I am not sure how to debug it. There is no error printed / returned. I'm not even sure how to make it output debug info. No post codes are available. Sigh.
I see coreboot has the minnow max board support already with Intel FSP. You can generate a coreboot for minnow max and program it onto the board to see how things go.
Does this link help? http://review.coreboot.org/gitweb?p=coreboot.git;a=commitdiff;h=e6df041b8bf8.... It looks that the BayTrail FSP has several parameters configurable for DRAM.
Thanks - yes I saw that code although didn't look at the particular commit. It's just painful trying n random things to try to make a black box behave. I must be getting less patient these days.
Here are the required changes for modifying the FSP manually: Enable Memory Down: Enabled DRAM Speed: 1066 MHz DIMM_DWidth: x16 DIMM_Density: 4 Gbit (2GB Minnow Max) / 2 Gbit (1GB Minnow Max) tCL: 7 tRP_tRCD: 7 tWR: 8 tRRD: 6 tRTP: 4 tFAW: 27 Other FSP values can remain the same.
Apart from the 27 that matches. I suppose that's another problem with the FSP design. If memory init fails it should return with a useful error, not just die because the API requires that it returns with a new stack in DRAM. Also, specifying the address of the stack before you know the memory layout is odd.
Regards, Simon

Hi Simon,
On Fri, Jan 23, 2015 at 12:36 AM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
On 21 January 2015 at 22:39, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Thu, Jan 22, 2015 at 1:02 PM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
On 21 January 2015 at 21:45, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Thu, Jan 22, 2015 at 11:42 AM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
In the Baytrail FSP docs I see a note about the HOB passing back the 'Boot Loader Temporary Memory Data HOB'. This seems to be a copy of the entire temporary memory space. I wonder if we could recover struct global_data from this?
Yes, I think so. And I have verified this temporary memory space indeed contains stack contents before fsp_init() from U-Boot shell on Crown Bay. But the overall process might be complicated. See below.
If so, then we could move the fsp_init stuff to dram_init(), perhaps?
But perhaps this is a feature of only this FSP?
I believe this is a feature defined by the FSP architecture spec, so every FSP should support that.
Technically it should be no problem to call fsp_init() from arch_cpu_init() or even later dram_init(). However, I would say doing so brings us more harm than good. The following points are what I thought about before:
1). fsp_init() takes one parameter 'stack_top' to setup another stack after DRAM is initialized. This means everything on the previous CAR stack will need to migrate to the new stack below 'stack_top'. This includes global data, early malloc pointers, arch_cpu_init() stack variables and its return address. 2). Copy previous global_data to the new places under stack_top, and fix up gd->arch.gd_addr. 3). The initf_malloc() is called before arch_cpu_init() so we need fix up the early malloc pointers manually (gd->malloc_base and gd->malloc_limit) 4). Fix up the stack variables and return address of arch_cpu_init() on the new stack. 5). On Tunnel Creek, if we call setup_gdt() in start.S, later fsp_init() in arch_cpu_init() will fail to bring up the thread 1 (Tunnel Creek supports SMT), the reason of which is unknown to me yet (FSP is a black box). It looks to me that FSP is assuming GDT only contains two entries (32-bit 4G flat address) before calling fsp_init().
I have not looked into this any further, so the above points might not be 100% right. I would say with these modifications, the codes are more difficult to understand.
I don't disagree with any of this, but I've had a bit more of a look.
How about something awful like this:
- start.S
- do the initram thing
- call board_init_f()
- there is no hob pointer, so CPU init does very little
- it runs only to dram_init()
- dram_init() calls the fsp_init() and ends up back at another function
- that function sets up global_data again, calls board_init_f() again
- passes a boot flag to suppress the banner the second time
- this time there is a hob pointer, so CPU init can complete
- things boot normally
It is hideous. The question is which way is worse.
So this way we avoid migrating the stack.
What we really need is to split fsp_init() into two parts:
- Set up DRAM
- Turn off CAR (called when WE decide we no-longer need it)
Then we could make this work as with the non-FSP mode.
By joining the two, they have made it even harder than it should be. Of course, no one wins with binary blobs, but the 'continuation function' should have been a red flag when this design was done at Intel.
I agree. I wish the design done at Intel should have considered more use cases about the FSP integration into existing bootloaders. Even in coreboot, the FSP integration is not perfectly fit into the coreboot model.
BTW for me on minnowmax fsp_init() hangs. Since it is a binary blob I am not sure how to debug it. There is no error printed / returned. I'm not even sure how to make it output debug info. No post codes are available. Sigh.
I see coreboot has the minnow max board support already with Intel FSP. You can generate a coreboot for minnow max and program it onto the board to see how things go.
Does this link help? http://review.coreboot.org/gitweb?p=coreboot.git;a=commitdiff;h=e6df041b8bf8.... It looks that the BayTrail FSP has several parameters configurable for DRAM.
Thanks - yes I saw that code although didn't look at the particular commit. It's just painful trying n random things to try to make a black box behave. I must be getting less patient these days.
Here are the required changes for modifying the FSP manually: Enable Memory Down: Enabled DRAM Speed: 1066 MHz DIMM_DWidth: x16 DIMM_Density: 4 Gbit (2GB Minnow Max) / 2 Gbit (1GB Minnow Max) tCL: 7 tRP_tRCD: 7 tWR: 8 tRRD: 6 tRTP: 4 tFAW: 27 Other FSP values can remain the same.
Apart from the 27 that matches. I suppose that's another problem with the FSP design. If memory init fails it should return with a useful error, not just die because the API requires that it returns with a new stack in DRAM. Also, specifying the address of the stack before you know the memory layout is odd.
Agreed. I am not sure whether Intel wants to improve their design or not. The new stack after fsp_init() is indeed a nightmare which causes stack migration issues we are discussing here. Sigh ..
Regards, Bin

Hi Bin,
On 22 January 2015 at 18:32, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Fri, Jan 23, 2015 at 12:36 AM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
On 21 January 2015 at 22:39, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Thu, Jan 22, 2015 at 1:02 PM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
On 21 January 2015 at 21:45, Bin Meng bmeng.cn@gmail.com wrote:
Hi Simon,
On Thu, Jan 22, 2015 at 11:42 AM, Simon Glass sjg@chromium.org wrote:
Hi Bin,
In the Baytrail FSP docs I see a note about the HOB passing back the 'Boot Loader Temporary Memory Data HOB'. This seems to be a copy of the entire temporary memory space. I wonder if we could recover struct global_data from this?
Yes, I think so. And I have verified this temporary memory space indeed contains stack contents before fsp_init() from U-Boot shell on Crown Bay. But the overall process might be complicated. See below.
If so, then we could move the fsp_init stuff to dram_init(), perhaps?
But perhaps this is a feature of only this FSP?
I believe this is a feature defined by the FSP architecture spec, so every FSP should support that.
Technically it should be no problem to call fsp_init() from arch_cpu_init() or even later dram_init(). However, I would say doing so brings us more harm than good. The following points are what I thought about before:
1). fsp_init() takes one parameter 'stack_top' to setup another stack after DRAM is initialized. This means everything on the previous CAR stack will need to migrate to the new stack below 'stack_top'. This includes global data, early malloc pointers, arch_cpu_init() stack variables and its return address. 2). Copy previous global_data to the new places under stack_top, and fix up gd->arch.gd_addr. 3). The initf_malloc() is called before arch_cpu_init() so we need fix up the early malloc pointers manually (gd->malloc_base and gd->malloc_limit) 4). Fix up the stack variables and return address of arch_cpu_init() on the new stack. 5). On Tunnel Creek, if we call setup_gdt() in start.S, later fsp_init() in arch_cpu_init() will fail to bring up the thread 1 (Tunnel Creek supports SMT), the reason of which is unknown to me yet (FSP is a black box). It looks to me that FSP is assuming GDT only contains two entries (32-bit 4G flat address) before calling fsp_init().
I have not looked into this any further, so the above points might not be 100% right. I would say with these modifications, the codes are more difficult to understand.
I don't disagree with any of this, but I've had a bit more of a look.
How about something awful like this:
- start.S
- do the initram thing
- call board_init_f()
- there is no hob pointer, so CPU init does very little
- it runs only to dram_init()
- dram_init() calls the fsp_init() and ends up back at another function
- that function sets up global_data again, calls board_init_f() again
- passes a boot flag to suppress the banner the second time
- this time there is a hob pointer, so CPU init can complete
- things boot normally
It is hideous. The question is which way is worse.
So this way we avoid migrating the stack.
What we really need is to split fsp_init() into two parts:
- Set up DRAM
- Turn off CAR (called when WE decide we no-longer need it)
Then we could make this work as with the non-FSP mode.
By joining the two, they have made it even harder than it should be. Of course, no one wins with binary blobs, but the 'continuation function' should have been a red flag when this design was done at Intel.
I agree. I wish the design done at Intel should have considered more use cases about the FSP integration into existing bootloaders. Even in coreboot, the FSP integration is not perfectly fit into the coreboot model.
BTW for me on minnowmax fsp_init() hangs. Since it is a binary blob I am not sure how to debug it. There is no error printed / returned. I'm not even sure how to make it output debug info. No post codes are available. Sigh.
I see coreboot has the minnow max board support already with Intel FSP. You can generate a coreboot for minnow max and program it onto the board to see how things go.
Does this link help? http://review.coreboot.org/gitweb?p=coreboot.git;a=commitdiff;h=e6df041b8bf8.... It looks that the BayTrail FSP has several parameters configurable for DRAM.
Thanks - yes I saw that code although didn't look at the particular commit. It's just painful trying n random things to try to make a black box behave. I must be getting less patient these days.
Here are the required changes for modifying the FSP manually: Enable Memory Down: Enabled DRAM Speed: 1066 MHz DIMM_DWidth: x16 DIMM_Density: 4 Gbit (2GB Minnow Max) / 2 Gbit (1GB Minnow Max) tCL: 7 tRP_tRCD: 7 tWR: 8 tRRD: 6 tRTP: 4 tFAW: 27 Other FSP values can remain the same.
Apart from the 27 that matches. I suppose that's another problem with the FSP design. If memory init fails it should return with a useful error, not just die because the API requires that it returns with a new stack in DRAM. Also, specifying the address of the stack before you know the memory layout is odd.
Agreed. I am not sure whether Intel wants to improve their design or not. The new stack after fsp_init() is indeed a nightmare which causes stack migration issues we are discussing here. Sigh ..
I've sent some feedback to the Minnowboard list.
Regards, Simon
participants (2)
-
Bin Meng
-
Simon Glass