
Dear Lei Wen,
As suggested by Reinhard, I added two additional members to the mmc structure, so that we can specify their values in each driver. If a value is 0, the behavior is the same as the original, with no separation.
After thinking a lot about this:

Preface (for understanding of the issue):
The high level "driver/part" mmc.c prepares a command structure and a data structure and passes them to the hardware dependent low level driver:

    err = mmc_send_cmd(mmc, &cmd, &data);

I agree that it would be improper for the low level driver to split that command into several commands on its own when the data length cannot be handled by the hardware. That would require repeating some of the logic that already exists in the high level part. The low level driver should be, and stay, as little aware of the command details as possible.
Also, some hardware really has only a 16 bit wide block counter. That includes ATMEL's MCI, but since the data transfer there is programmed in a loop, that register is not used.
I see two possible solutions for that problem here:

1. Generally limit the number of blocks requested to 65535. The performance penalty for that is insignificant. (65535 blocks of 512 bytes are about 32 MiB.)
2. Limit it on a case by case basis by passing such a limit, like a host capability, to the high level part.
Here I would (after much deliberation) favour version 1.
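
To make version 1 concrete, here is a minimal sketch of how the high level part could cap each command at 65535 blocks. It is only an illustration under stated assumptions, not the actual mmc.c code: transfer_blocks_limited(), send_blocks_fn, MAX_BLOCKS_PER_CMD and the count_cmd() stand-in are hypothetical names, and the callback stands in for "fill in cmd and data, then call mmc_send_cmd()".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: largest count a 16 bit block counter can hold. */
#define MAX_BLOCKS_PER_CMD 65535u

/* Hypothetical hook: issues ONE multi-block command via the low level driver. */
typedef int (*send_blocks_fn)(uint32_t start, uint32_t blkcnt, void *ctx);

/*
 * Split a request for blkcnt blocks into commands of at most
 * MAX_BLOCKS_PER_CMD blocks each. Returns the number of blocks
 * transferred; this is less than blkcnt if a command failed.
 */
static uint32_t transfer_blocks_limited(uint32_t start, uint32_t blkcnt,
                                        send_blocks_fn send, void *ctx)
{
    uint32_t done = 0;

    assert(send != NULL);
    while (done < blkcnt) {
        uint32_t cur = blkcnt - done;

        if (cur > MAX_BLOCKS_PER_CMD)
            cur = MAX_BLOCKS_PER_CMD;
        if (send(start + done, cur, ctx) != 0)
            break;              /* stop at the first failing command */
        done += cur;
    }
    return done;
}

/* Stand-in for the hardware: only records what it was asked to do. */
struct cmd_stats {
    unsigned int cmds;          /* number of commands issued */
    uint32_t max_blkcnt;        /* largest block count seen in one command */
};

static int count_cmd(uint32_t start, uint32_t blkcnt, void *ctx)
{
    struct cmd_stats *st = ctx;

    (void)start;
    st->cmds++;
    if (blkcnt > st->max_blkcnt)
        st->max_blkcnt = blkcnt;
    return 0;
}
```

With the count_cmd() stand-in, a request for 200000 blocks is issued as four commands, none larger than 65535 blocks. Version 2 would look the same, except that the constant would be replaced by a per-host value supplied by the low level driver.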
The second and more serious problem Lei Wen seems to have with his hardware is that DMA seems problematic (how so?) for more than 512 KiB. I really don't think we should pull such (unusual?) limitations into the high level part. I think the low level driver could issue several DMA transfers in that case.
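
As a rough sketch of what "several DMA transfers" inside the low level driver might look like: the loop below walks the buffer in segments of at most 512 KiB. Whether a given controller allows re-arming DMA in the middle of one multi-block command is hardware specific and is assumed here. dma_transfer_split(), start_dma_fn, DMA_MAX_BYTES and the fake_dma() stand-in are hypothetical names, not part of any existing driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the per-transfer DMA limit mentioned in this thread. */
#define DMA_MAX_BYTES (512u * 1024u)

/* Hypothetical hook: programs ONE DMA transfer and waits for completion. */
typedef int (*start_dma_fn)(uint8_t *buf, size_t len, void *ctx);

/*
 * Move len bytes by issuing as many DMA transfers as needed, each at most
 * DMA_MAX_BYTES long. Returns 0 on success or the first error reported
 * by the hook.
 */
static int dma_transfer_split(uint8_t *buf, size_t len,
                              start_dma_fn start_dma, void *ctx)
{
    assert(start_dma != NULL);
    while (len > 0) {
        size_t seg = len < DMA_MAX_BYTES ? len : DMA_MAX_BYTES;
        int err = start_dma(buf, seg, ctx);

        if (err)
            return err;
        buf += seg;             /* advance to the next segment */
        len -= seg;
    }
    return 0;
}

/* Stand-in for the DMA engine: only records segment sizes. */
struct dma_stats {
    unsigned int segs;          /* number of DMA transfers started */
    size_t bytes;               /* total bytes requested */
    size_t max_seg;             /* largest single transfer */
};

static int fake_dma(uint8_t *buf, size_t len, void *ctx)
{
    struct dma_stats *st = ctx;

    (void)buf;
    st->segs++;
    st->bytes += len;
    if (len > st->max_seg)
        st->max_seg = len;
    return 0;
}
```

With fake_dma(), a 1280 KiB buffer is handled as three transfers of 512 KiB, 512 KiB and 256 KiB, and the high level part never sees the limit.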
We need consensus here first before we can issue and comment on patches.
With Best Regards to All,
Reinhard