
Hi Tomas,
On Fri, Oct 26, 2012 at 6:16 AM, Tomas Hlavacek tmshlvck@gmail.com wrote:
Hello Graeme,
On Thu, Oct 25, 2012 at 3:40 AM, Graeme Russ graeme.russ@gmail.com wrote:
diff --git a/arch/arm/include/asm/global_data.h b/arch/arm/include/asm/global_data.h index 2b9af93..9045829 100644 --- a/arch/arm/include/asm/global_data.h +++ b/arch/arm/include/asm/global_data.h @@ -82,6 +82,9 @@ typedef struct global_data { unsigned long post_log_res; /* success of POST test */ unsigned long post_init_f_time; /* When post_init_f started */ #endif +#ifdef CONFIG_SYS_EARLY_MALLOC
void *early_heap; /* heap for early_malloc */
+#endif
Why not early_heap_header *early_heap; ?
It might be.
Actually it is a good point because I am using 3 different ways of dealing with addresses: 1) struct early_heap header * or struct early_block_header * - I am using this when I want to access members of the stucture or compute address past the structure (which is where the heap or block starts); 2) phys_addr_t - which is plain integer and I use this for simple computations when I do not want to worry about pointer arithmetic; 3) void * when I have just plain address, especially when I want to pass an addres among logical parts of the mallocator or outside. This may a bit controversial and perhaps I should replace it by specific strucutre pointers internally.
I am unable to decide: Should I remove struct early_heap_header from dmmalloc.h making it publicly unavailable or should I rather change the void * to struct early_heap_header * in the GD structure? What do you think is better?
I think struct early_heap_header * in the GD structure is the better way to go as that is exactly what it is.
[snip]
h = (struct early_heap_header *)CONFIG_SYS_EARLY_HEAP_ADDR;
b = (struct early_block_header *)(h + 1);
Hmmm, does this really work? I would have thought:
b = (struct early_block_header *)(h + sizeof(struct early_heap_header));
but I could be mistaken
It seems that it works as it is (at least I wrote bunch of tests and I inspected resulting heaps and it was all right). I believe that since h is a pointer to the struct early_heap_header then pointer arithmetic is in effect and h+1 actually means "next element in the array of struct early_heap_header". Which is the address past the header that equals beginning of the heap data block. (?)
As I said, I could be mistaken - it appears I am :)
+int early_malloc_active(void) +{
return ((gd->flags & GD_FLG_RELOC) != GD_FLG_RELOC);
+}
I think we need another flag - GD_FLG_RELOC gets set before the permanent heap is initialised, so there is a window of opportunity where things may break
Well, as I wrote in the commit message - this is only a temporary implementation. I suppose I am going to change this when we have more coarse initialization flags wired into DM (which I believe we are going to have it anyway). So now I am just working around "forward dependency" here.
+void early_heap_dump(struct early_heap_header *h) +{
struct early_block_header *b;
debug("heap: h=%p, h->size=%d\n", h, h->size);
b = (struct early_block_header *)(h+1);
while ((phys_addr_t)b + sizeof(struct early_block_header)
< (phys_addr_t)h + h->size) {
debug("block: h=%p h->size=%d b=%p b->size=%d b->(used)=%d\n",
h, h->size, b, BLOCK_SIZE(b->size),
BLOCK_USED(b->size));
assert(BLOCK_SIZE(b->size) > 0);
b = (struct early_block_header *)((phys_addr_t)b +
sizeof(struct early_block_header) +
BLOCK_SIZE(b->size));
}
debug("--- heap dump end ---\n");
+}
Nice touch, but could we just iterate through all ealry heap chunks starting from gd->early_heap?
Or I can have two functions. One heap specific and one for all heaps. I think both might be useful when somebody needs to debug early_malloc or memory usage etc. in the early stage. Thanks.
True, just adding another function which iterates through the heaps and calls this function would be fine.
[snip]
+static inline void *dmrealloc(void *oldaddr, size_t bytes) +{ +#ifdef CONFIG_SYS_EARLY_MALLOC
char *addr;
struct early_block_header *h;
if (early_malloc_active()) {
addr = dmmalloc(bytes);
if (addr == NULL)
return NULL;
h = BLOCK_HEADER(oldaddr);
if (BLOCK_FREE(h->size))
return NULL;
if (bytes > BLOCK_SIZE(h->size))
bytes = BLOCK_SIZE(h->size);
memcpy(addr, oldaddr, bytes);
dmfree(oldaddr);
return addr;
}
+#endif /* CONFIG_SYS_EARLY_MALLOC */
return realloc(oldaddr, bytes);
+}
Hmmm, we have a very intersting corner case to deal with. What if we use dmrealloc() to move allocated data from the early malloc pool to the final malloc pool?
At some point, we need to assume the developer is only ever going to pass early malloc'd memory to dmrealloc()
Good point! The current code would do the job assuming that the early_malloc can still access the proper early_heap (or a copy, but in that case some additional magic is needed) and the real malloc is already initialized.
As you can see I am still sticking with the double-copy method. It is maybe due to lack of insight. But the discussion here was not absolutely conclusive last time. I have even some experimental code (not ready for submitting at all) for double copy method but I would prefer to discuss it separately when the early_malloc() is done.
Well the double-copy approach to reallocating early-malloc'd memory and how dmrealloc() is implemented are tightly coupled.
I understand the reason for the double copy is performance related (getting into a position to enable cache as early as possible), but I wonder how much will really be gained. It seems to me that the only performance gain will be for the execution of the driver 'relocation fixup' code wich, I assume, would not really consume that many CPU cycles.
There is a point where code simplicity outweighs performance gains.
The other option is to use the gd->flags...
Define two flags - something like:
GD_FLG_HEAP_INIT -> Final heap has been initialised GD_FLG_EH_DONE -> free(), realloc() refer to final heap
(I don't like the names, I'm just not up to thinking of anything better)
This way we can use dmalloc() prior to the heap being initialised and then set GD_FLG_HEAP_INIT. Once GD_FLG_HEAP_INIT has been set, do a bunch of dmrealloc() calls to essentially move the data into the permanent heap (of course pointers int the structures need to be fixed up by the drivers) and finally set GD_FLG_EH_DONE (and call early_heap_dump)
One problem I see is what happens of you call malloc() and free() on the same block between the setting of GD_FLG_HEAP_INIT and GD_FLG_EH_DONE? The code will try to reference the block as if it is in the early heap, but it won't be.
One solution is, on detection of GD_FLG_HEAP_INIT being set, dmfree() and dmrealloc() could search the early heap chunks instead of just assuming that the referenced block is in the early heap (if GD_FLG_HEAP_INIT has not been set, it will be safe to assume the memory is on the early heap)
Sure we will have that flags. But I think we can use them as well for switching from DM driver instance list to tree for example. Or other way around: I can use DM flags for early_malloc. Therefore I would like to synchronize with DM cores for PCI and another low-level things which are certainly going to start in early stage. It would be best to use the same flags and switch on/off early_malloc based on DM internal state.
Ah, I see where more performance gains are to be made by switching on cache earlier - During the reallocation phase, you are switching from the list based structure (fast for the small number of pre-SDRAM drivers) into the final tree based structure.
I'm looking at this early malloc code from a much more generic point of view - I think there are use-cases outside the driver model, so I don't see a need (rather the opposite) to tie early malloc to the driver model
Regards,
Graeme