[U-Boot] early_malloc() vs. enable_caches()

Hello!
I am working on early_malloc() for the U-Boot Driver Model (this malloc is going to serve internal DM structures during early init, and it has its minimalistic heap in global data).
My question is how to correctly switch from the early allocator to the full-scale malloc, and when to enable caches.
The current state (on ARM) is: 1) gd = id; 2) enable_caches(); 3) mem_malloc_init();
Proposed sequence for mallocator (in order not to lose any data from the old, not-relocated part of GD):
1) gd_old = gd; 2) gd = id; 3) mem_malloc_init(); 4) relocation of DM structures 5) early_malloc_disab() 6) enable_caches();
Does it make sense? It actually boils down to one fundamental question: When I have not-relocated data locked in cache lines, do I lose them once enable_caches() is called?
Thanks, Tomas
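
For readers following the thread, the proposed ordering can be sketched in plain C. This is only an illustration under stated assumptions, not U-Boot code: struct gd_sketch, early_malloc_sketch() and switch_to_full_malloc_sketch() are hypothetical names, the early heap is a simple bump allocator without alignment handling, and mem_malloc_init() and enable_caches() are represented by comments and a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EARLY_HEAP_SIZE 256

/* Hypothetical stand-in for global data; only the fields the sketch needs. */
struct gd_sketch {
    unsigned char early_heap[EARLY_HEAP_SIZE]; /* early pool kept inside GD */
    size_t early_used;
    int early_malloc_enabled;
    int caches_enabled;
};

static struct gd_sketch gd_pre = { .early_malloc_enabled = 1 }; /* pre-relocation GD */
static struct gd_sketch gd_post;                                /* relocated GD ("id") */
static struct gd_sketch *gd = &gd_pre;

/* Bump allocator over the heap embedded in the current GD (no alignment). */
static void *early_malloc_sketch(size_t n)
{
    if (!gd->early_malloc_enabled || n > EARLY_HEAP_SIZE - gd->early_used)
        return NULL;
    void *p = &gd->early_heap[gd->early_used];
    gd->early_used += n;
    return p;
}

static void switch_to_full_malloc_sketch(void)
{
    struct gd_sketch *gd_old = gd;              /* 1) gd_old = gd */
    memcpy(&gd_post, gd_old, sizeof(gd_post));  /* assumed: GD copied with its heap */
    gd = &gd_post;                              /* 2) gd = id */
    /* 3) mem_malloc_init() would run here */
    /* 4) DM structures would be relocated out of gd_old->early_heap here */
    gd->early_malloc_enabled = 0;               /* 5) early allocator disabled */
    gd->caches_enabled = 1;                     /* 6) enable_caches() comes last */
}
```

In this sketch the early heap travels with the GD copy, so nothing allocated early is lost before the caches are touched.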

Hi Tomas,
On Sat, 28 Jul 2012 17:46:41 +0200, Tomas Hlavacek tmshlvck@gmail.com wrote:
Hello!
I am working on early_malloc() for the U-Boot Driver Model (this malloc is going to serve internal DM structures during early init, and it has its minimalistic heap in global data).
My question is how to correctly switch from early allocator to full-scale malloc and when to enable caches.
The current state (on ARM) is:
- gd = id;
- enable_caches();
- mem_malloc_init();
Proposed sequence for mallocator (in order not to lose any data from the old, not-relocated part of GD):
- gd_old = gd;
- gd = id;
- mem_malloc_init();
- relocation of DM structures
- early_malloc_disab()
- enable_caches();
I suspect the sequence here is atomic because there is only one thread of execution and no interrupt able to run, but still: would you not rather disable early malloc *before* relocating DM structures?
Does it make sense? It actually boils down to one fundamental question: When I have not-relocated data locked in cache lines, do I lose them once enable_caches() is called?
Based on the (possibly wrong) assumption that in the case you describe, caches are being enabled for the first time, then there is no such thing as "locked cache lines" because the caches were disabled so far.
If you are enabling the caches for the first time there, then all lines are (or should be, more exactly) empty and unlocked.
Thanks, Tomas
Amicalement,

Dear Albert ARIBAUD,
Hi Tomas,
On Sat, 28 Jul 2012 17:46:41 +0200, Tomas Hlavacek tmshlvck@gmail.com wrote:
Hello!
I am working on early_malloc() for the U-Boot Driver Model (this malloc is going to serve internal DM structures during early init, and it has its minimalistic heap in global data).
My question is how to correctly switch from early allocator to full-scale malloc and when to enable caches.
The current state (on ARM) is:
- gd = id;
- enable_caches();
- mem_malloc_init();
Proposed sequence for mallocator (in order not to lose any data from the old, not-relocated part of GD):
- gd_old = gd;
- gd = id;
- mem_malloc_init();
- relocation of DM structures
- early_malloc_disab()
- enable_caches();
I suspect the sequence here is atomic because there is only one thread of execution and no interrupt able to run, but still: would you not rather disable early malloc *before* relocating DM structures?
I think this was somehow solved in here ... Tomas can elaborate further I believe.
Does it make sense? It actually boils down to one fundamental question: When I have not-relocated data locked in cache lines, do I lose them once enable_caches() is called?
Based on the (possibly wrong) assumption that in the case you describe, caches are being enabled for the first time, then there is no such thing as "locked cache lines" because the caches were disabled so far.
Not so fast: on PXA there was a time when we used caches as RAM in the early stage. We locked some cache lines and placed the stack there (and therefore global data etc.).
If you are enabling the caches for the first time there, then all lines are (or should be, more exactly) empty and unlocked.
Thanks, Tomas
Amicalement,
Best regards, Marek Vasut

Hi Tomas,
P.S. I dropped the DM list...
It took a couple of other emails to get what is going on here...
On 07/29/2012 01:46 AM, Tomas Hlavacek wrote:
Hello!
I am working on early_malloc() for the U-Boot Driver Model (this malloc is going to serve internal DM structures during early init, and it has its minimalistic heap in global data).
Not exactly on-topic, but I really hope that everything is wrapped so a simple call to malloc() will work pre-relocation. Of course, everything you malloc pre-relocation will have to be re-malloc'd and relocated after relocation. Point is, early malloc should not be restricted to the driver framework.
My question is how to correctly switch from early allocator to full-scale malloc and when to enable caches.
The current state (on ARM) is:
- gd = id;
- enable_caches();
- mem_malloc_init();
Proposed sequence for mallocator (in order not to lose any data from the old, not-relocated part of GD):
- gd_old = gd;
- gd = id;
- mem_malloc_init();
- relocation of DM structures
- early_malloc_disab()
- enable_caches();
I'm thinking:
1) Low-level CPU init
2) 'Cache-As-RAM' init
3) Global Data init
4) Pre-console buffer init
5) Early malloc() init
6) Console init
7) ...blah, blah, blah...
8) SDRAM init
9) Relocate Global Data
10) malloc() init
11) 'Disable' early malloc (i.e. malloc() now allocates from SDRAM)
12) Relocate from early_malloc_pool to malloc_pool [1]
13) enable_caches()
[1] I'm thinking possibly compile-time registered hooks...
Does it make sense? It actually boils down to one fundamental question: When I have not-relocated data locked in cache lines, do I lose them once enable_caches() is called?
I believe that yes, as soon as you enable caching, everything already in cache (gd, pre-console buffer, early malloc pool etc.) is as good as gone.
Regards,
Graeme
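
The "compile-time registered hooks" idea in footnote [1] could look roughly like the sketch below. It is an assumption about what was meant, not an existing U-Boot interface: the hook type, the two example hooks and the table are all hypothetical, and a real implementation might collect the hooks through a linker-generated list instead of a hand-written array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook type: each subsystem fixes up its own early allocations,
 * given the distance between the early pool and the final malloc pool. */
typedef void (*reloc_hook_sketch_t)(ptrdiff_t offset);

static ptrdiff_t console_seen_sketch; /* records what each example hook received */
static ptrdiff_t dm_seen_sketch;

static void console_reloc_sketch(ptrdiff_t offset) { console_seen_sketch = offset; }
static void dm_reloc_sketch(ptrdiff_t offset)      { dm_seen_sketch = offset; }

/* The table is fixed at compile time; nothing is registered at run time. */
static const reloc_hook_sketch_t reloc_hooks_sketch[] = {
    console_reloc_sketch,
    dm_reloc_sketch,
};

/* Called once, at the "relocate from early_malloc_pool to malloc_pool" step.
 * Returns the number of hooks that were run. */
static size_t run_reloc_hooks_sketch(ptrdiff_t offset)
{
    size_t n = sizeof(reloc_hooks_sketch) / sizeof(reloc_hooks_sketch[0]);
    for (size_t i = 0; i < n; i++)
        reloc_hooks_sketch[i](offset);
    return n;
}
```

With a table like this, early malloc users outside the driver framework can take part in relocation without the relocation code knowing about them individually.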

Dear Graeme Russ,
Hi Tomas,
P.S. I dropped the DM list...
It took a couple of other emails to get what is going on here...
On 07/29/2012 01:46 AM, Tomas Hlavacek wrote:
Hello!
I am working on early_malloc() for the U-Boot Driver Model (this malloc is going to serve internal DM structures during early init, and it has its minimalistic heap in global data).
Not exactly on-topic, but I really hope that everything is wrapped so a simple call to malloc() will work pre-relocation. Of course, everything you malloc pre-relocation will have to be re-malloc'd and relocated after relocation. Point is, early malloc should not be restricted to the driver framework
My question is how to correctly switch from early allocator to full-scale malloc and when to enable caches.
The current state (on ARM) is:
- gd = id;
- enable_caches();
- mem_malloc_init();
Proposed sequence for mallocator (in order not to lose any data from the old, not-relocated part of GD):
- gd_old = gd;
- gd = id;
- mem_malloc_init();
- relocation of DM structures
- early_malloc_disab()
- enable_caches();
I'm thinking:
- Low-level CPU init
- 'Cache-As-RAM' init
- Global Data init
- Pre-console buffer init
Buffer?
- Early malloc() init
- Console init
You don't need console here ... probably, on some systems.
- ...blah, blah, blah...
- SDRAM init
- Relocate Global Data
- malloc() init
- 'Disable' early malloc (i.e. malloc() now allocates from SDRAM)
- Relocate from early_malloc_pool to malloc_pool [1]
- enable_caches()
[1] I'm thinking possibly compile-time registered hooks...
Gurr ... you mean like INIT-something framework?
Does it make sense? It actually boils down to one fundamental question: When I have not-relocated data locked in cache lines, do I lose them once enable_caches() is called?
I believe that yes, as soon as you enable caching, everything already in cache (gd, pre-console buffer, early malloc pool etc) is as good as gone
Right ... that's why now it's copied to a safe location alongside other GD (global data)
Regards,
Graeme
Best regards, Marek Vasut

Hi Graeme,
On Sun, Jul 29, 2012 at 12:53 PM, Graeme Russ graeme.russ@gmail.com wrote:
Not exactly on-topic, but I really hope that everything is wrapped so a simple call to malloc() will work pre-relocation. Of course, everything you malloc pre-relocation will have to be re-malloc'd and relocated after relocation. Point is, early malloc should not be restricted to the driver framework
Yes. Actually my intention is to create a wrapper for switching from early_malloc() to dlmalloc() based on gd->flags. I am going to do that when the concept of early_malloc() is finished.
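
A wrapper of the kind described here might be sketched as follows. This is a guess at the shape, not the actual patch: GD_FLG_RELOC_SKETCH, struct gd_sketch and every function name are hypothetical, and the C library's malloc()/free() stand in for dlmalloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define GD_FLG_RELOC_SKETCH 0x1u /* hypothetical flag: set once relocation is done */
#define EARLY_POOL_SIZE 128

struct gd_sketch {
    unsigned flags;
    unsigned char early_pool[EARLY_POOL_SIZE];
    size_t early_used;
};

static struct gd_sketch gd_data;
static struct gd_sketch *gd = &gd_data;

/* Bump allocator over the pool embedded in GD (no alignment, no free). */
static void *early_malloc_sketch(size_t n)
{
    if (n > EARLY_POOL_SIZE - gd->early_used)
        return NULL;
    void *p = &gd->early_pool[gd->early_used];
    gd->early_used += n;
    return p;
}

/* True if p points into the early pool of the current GD. */
static int is_early_ptr_sketch(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)gd->early_pool;
    return a >= lo && a - lo < EARLY_POOL_SIZE;
}

/* Single entry point: callers never need to know which allocator is live. */
static void *malloc_wrapper_sketch(size_t n)
{
    if (gd->flags & GD_FLG_RELOC_SKETCH)
        return malloc(n); /* stands in for dlmalloc */
    return early_malloc_sketch(n);
}

/* Early allocations are never freed individually; only heap pointers go to free(). */
static void free_wrapper_sketch(void *p)
{
    if (p != NULL && !is_early_ptr_sketch(p))
        free(p);
}
```

The free-side check matters because a pointer handed out before relocation may still be released after the flag has flipped.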
- Low-level CPU init
- 'Cache-As-RAM' init
- Global Data init
- Pre-console buffer init
- Early malloc() init
- Console init
- ...blah, blah, blah...
- SDRAM init
- Relocate Global Data
- malloc() init
- 'Disable' early malloc (i.e. malloc() now allocates from SDRAM)
- Relocate from early_malloc_pool to malloc_pool [1]
- enable_caches()
[1] I'm thinking possibly compile-time registered hooks...
Does it make sense? It actually boils down to one fundamental question: When I have not-relocated data locked in cache lines, do I lose them once enable_caches() is called?
I believe that yes, as soon as you enable caching, everything already in cache (gd, pre-console buffer, early malloc pool etc) is as good as gone
Thanks for the explanation!
We have reconsidered relocation of DM structures. Perhaps we are going to keep the structures on early_heap (in the copied GD), and convert the pointers inside our structures and copy the data after the actual relocation, when dlmalloc and the caches are up and running. We think that this might be the fastest approach.
Tomas
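
The "convert pointers inside our structures" step can be illustrated with a small sketch. It assumes, purely for illustration, that the DM structures form a singly linked list living entirely inside the early heap; struct dm_node_sketch and both functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical DM structure with one internal pointer. */
struct dm_node_sketch {
    struct dm_node_sketch *next;
    int id;
};

/* If p points into [old_base, old_base + size), return the same offset in
 * new_base; pointers outside the old heap (and NULL) are left untouched. */
static void *reloc_ptr_sketch(void *p, void *old_base, void *new_base, size_t size)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t o = (uintptr_t)old_base;
    if (p == NULL || a < o || a - o >= size)
        return p;
    return (unsigned char *)new_base + (a - o);
}

/* Copy the whole heap in one go, then walk the copy and fix each link. */
static struct dm_node_sketch *reloc_list_sketch(struct dm_node_sketch *head,
                                                void *old_base, void *new_base,
                                                size_t size)
{
    memcpy(new_base, old_base, size);
    head = reloc_ptr_sketch(head, old_base, new_base, size);
    for (struct dm_node_sketch *n = head; n != NULL; n = n->next)
        n->next = reloc_ptr_sketch(n->next, old_base, new_base, size);
    return head;
}
```

A single block copy followed by a fix-up pass avoids re-allocating every structure, which is presumably where the expected speed advantage comes from.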

Hi Tomas,
On Sun, Jul 29, 2012 at 11:19 PM, Tomas Hlavacek tmshlvck@gmail.com wrote:
Hi Graeme,
On Sun, Jul 29, 2012 at 12:53 PM, Graeme Russ graeme.russ@gmail.com wrote:
Not exactly on-topic, but I really hope that everything is wrapped so a simple call to malloc() will work pre-relocation. Of course, everything you malloc pre-relocation will have to be re-malloc'd and relocated after relocation. Point is, early malloc should not be restricted to the driver framework
Yes. Actually my intention is to create a wrapper for switching from early_malloc() to dlmalloc() based on gd->flags. I am going to do that when the concept of early_malloc() is finished.
Yes, I saw that patch.
- Low-level CPU init
- 'Cache-As-RAM' init
- Global Data init
- Pre-console buffer init
- Early malloc() init
- Console init
- ...blah, blah, blah...
- SDRAM init
- Relocate Global Data
- malloc() init
- 'Disable' early malloc (i.e. malloc() now allocates from SDRAM)
- Relocate from early_malloc_pool to malloc_pool [1]
- enable_caches()
[1] I'm thinking possibly compile-time registered hooks...
Does it make sense? It actually boils down to one fundamental question: When I have not-relocated data locked in cache lines, do I lose them once enable_caches() is called?
I believe that yes, as soon as you enable caching, everything already in cache (gd, pre-console buffer, early malloc pool etc) is as good as gone
Thanks for the explanation!
We have reconsidered relocation of DM structures. Perhaps we are going to keep the structures on early_heap (in copied GD) and we are going to convert
You lost me here - early_heap is in cache like early GD but then you talk about copied GD which is in SDRAM...
pointers inside our structures and copy data after actual relocation when the dlmalloc and caches are up and running. We think that this might be the fastest approach.
My fear is that early_heap will end up being restricted to be usable only by the driver framework. Perhaps you could:
1) SDRAM Init
2) Relocate GD
3) Temporarily relocate early heap - store offset in GD
4) enable_caches()
5) malloc() init
6) Permanently relocate early heap into malloc heap
Faster, but a lot of mucking around. And drivers are broken between steps 3 and 6 (but that should not be too much of a problem)
Regards,
Graeme
participants (5): Albert ARIBAUD, Graeme Russ, Marek Vasut, Marek Vasut, Tomas Hlavacek