
On Fri, May 27, 2022 at 5:55 PM Simon Glass sjg@chromium.org wrote:
On Thu, 26 May 2022 at 07:38, Paweł Anikiel pan@semihalf.com wrote:
Apply some optimizations to speed up bitstream loading (both for full and split periph/core bitstreams):
- Change the size of the first fs read so that all subsequent reads are aligned to a specific value (called MAX_FIRST_LOAD_SIZE). This value was chosen so that in subsequent reads the FAT fs driver doesn't have to allocate a temporary buffer in get_contents (assuming 8 KiB clusters).
- Change the buffer size to a larger value when reading to DDR (but not too large, because large transfers cause a stack overflow in the dwmmc driver).
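To illustrate the alignment argument in the first point, here is a minimal sketch, not the code from the patch. The helper names, the 8 KiB value used for MAX_FIRST_LOAD_SIZE and the 32 KiB DDR buffer size are assumptions made for the example. It walks through the sequence of reads and checks that every read after the first one starts on a cluster boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; the patch may use different ones. */
#define CLUSTER_SIZE        0x2000u /* 8 KiB FAT cluster, as assumed in the text */
#define MAX_FIRST_LOAD_SIZE 0x2000u /* assumed: exactly one cluster */
#define DDR_BUF_SIZE        0x8000u /* hypothetical: a multiple of the cluster size */

/* Size of the n-th read (0-based), capped by the bytes left in the file. */
static size_t next_read_size(unsigned int n, size_t first_size, size_t remaining)
{
	size_t want = (n == 0) ? first_size : DDR_BUF_SIZE;

	assert(remaining > 0);
	return remaining < want ? remaining : want;
}

/*
 * Walk a file of the given size and return 1 if every read after the
 * first one starts at a cluster-aligned file offset, 0 otherwise.
 */
static int reads_are_cluster_aligned(size_t filesize, size_t first_size)
{
	size_t offset = 0;
	unsigned int n = 0;

	while (offset < filesize) {
		if (n > 0 && offset % CLUSTER_SIZE != 0)
			return 0;
		offset += next_read_size(n, first_size, filesize - offset);
		n++;
	}
	return 1;
}
```

With a first read of MAX_FIRST_LOAD_SIZE every later read starts on a multiple of 8 KiB. With an arbitrary first read size (say, 100 bytes) the later reads start mid-cluster, which is the case where a driver would need a bounce buffer.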
When the size is too large, where exactly does that stack overflow happen?
In dwmci_send_cmd (at drivers/mmc/dw_mmc.c:243). It stack-allocates a buffer of size sizeof(struct dwmci_idmac) * (data->blocks / 8), so the stack usage grows linearly with the transfer size. Since the bitstream is loaded from SPL (which still runs from SRAM), we only have about 100K of stack, which is not enough to load an 11MB file in one go.
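As a rough back-of-the-envelope check of the numbers above, the following sketch estimates the stack needed for the descriptor array. The 512-byte block length and the 64-byte descriptor size (i.e. a descriptor padded to a cache line) are assumptions, not values taken from this thread; the real sizeof(struct dwmci_idmac) depends on the platform. The function name is hypothetical.

```c
#include <stddef.h>

#define MMC_BLOCK_SIZE  512u /* assumed block length in bytes */
#define BLOCKS_PER_DESC 8u   /* from the expression data->blocks / 8 */

/*
 * Estimate the bytes of stack used by the descriptor array for a single
 * transfer of `bytes` bytes, given the size of one descriptor.
 */
static size_t idmac_stack_bytes(size_t bytes, size_t desc_size)
{
	size_t blocks = bytes / MMC_BLOCK_SIZE;
	size_t descs = blocks / BLOCKS_PER_DESC;

	return descs * desc_size;
}
```

Under these assumptions a single 11 MiB transfer needs 2816 descriptors, or 180224 bytes (176 KiB) of stack, which is well above the roughly 100K available in SPL. A 1 MiB transfer would need only 16 KiB.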