[U-Boot] am335x: GPMC: reading speed with prefetch mode

I've got v2015.04-rc4 running on my custom am335x (600MHz) based board. My 8-bit NAND chip:
[17.297793 0.004021] omap-gpmc 50000000.gpmc: GPMC revision 6.0
[17.303850 0.006057] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
[17.309706 0.005856] nand: Micron MT29F2G08ABAEAWP
[17.312823 0.003117] nand: 256MiB, SLC, page size: 2048, OOB size: 64
[17.317311 0.004488] nand: using OMAP_ECC_BCH8_CODE_HW ECC scheme
I need to load a FIT image of about 17 MB from a UBIFS partition. In Linux it takes about 7 seconds:
# time cp /mnt/kernel-fit.itb /tmp/
real 0m 7.12s
user 0m 0.00s
sys 0m 6.89s
But U-Boot needs about twice the time:
[3.592231 0.004182] Booting from nand ...
[4.905171 1.312940] UBI: default fastmap pool size: 100
[4.912605 0.007434] UBI: default fastmap WL pool size: 25
[4.919754 0.007149] UBI: attaching mtd1 to ubi0
[5.354450 0.434696] UBI: attached by fastmap
[5.360417 0.005967] UBI: fastmap pool size: 100
[5.365765 0.005348] UBI: fastmap WL pool size: 25
[5.398237 0.032472] UBI: attached mtd1 (name "mtd=5", size 253 MiB) to ubi0
[5.404582 0.006345] UBI: PEB size: 131072 bytes (128 KiB), LEB size: 129024 bytes
[5.409969 0.005387] UBI: min./max. I/O unit sizes: 2048/2048, sub-page size 512
[5.415245 0.005276] UBI: VID header offset: 512 (aligned 512), data offset: 2048
[5.420208 0.004963] UBI: good PEBs: 2029, bad PEBs: 0, corrupted PEBs: 0
[5.424650 0.004442] UBI: user volume: 1, internal volumes: 1, max. volumes count: 128
[5.430645 0.005995] UBI: max/mean erase counter: 25/18, WL threshold: 4096, image sequence number: 1052535214
[5.438221 0.007576] UBI: available PEBs: 1527, total reserved PEBs: 502, PEBs reserved for bad PEB handling: 40
[5.810683 0.372462] Loading file 'kernel-fit.itb' to addr 0x84000000 with size 17651408 (0x010d56d0)...
[19.013472 13.202789] Done
Those 13 seconds do not change whether or not CONFIG_NAND_OMAP_GPMC_PREFETCH is set. Am I missing some config options?
#ifdef CONFIG_NAND
#define CONFIG_NAND_OMAP_GPMC
#define CONFIG_NAND_OMAP_GPMC_PREFETCH
#define CONFIG_NAND_OMAP_ELM
#define CONFIG_SYS_NAND_5_ADDR_CYCLE
#define CONFIG_SYS_NAND_PAGE_COUNT	(CONFIG_SYS_NAND_BLOCK_SIZE / \
					 CONFIG_SYS_NAND_PAGE_SIZE)
#define CONFIG_SYS_NAND_PAGE_SIZE	2048
#define CONFIG_SYS_NAND_OOBSIZE		64
#define CONFIG_SYS_NAND_BLOCK_SIZE	(128*1024)
#define CONFIG_SYS_NAND_BAD_BLOCK_POS	NAND_LARGE_BADBLOCK_POS
#define CONFIG_SYS_NAND_ECCPOS		{ 2, 3, 4, 5, 6, 7, 8, 9, \
					 10, 11, 12, 13, 14, 15, 16, 17, \
					 18, 19, 20, 21, 22, 23, 24, 25, \
					 26, 27, 28, 29, 30, 31, 32, 33, \
					 34, 35, 36, 37, 38, 39, 40, 41, \
					 42, 43, 44, 45, 46, 47, 48, 49, \
					 50, 51, 52, 53, 54, 55, 56, 57, }
#define CONFIG_SYS_NAND_ECCSIZE		512
#define CONFIG_SYS_NAND_ECCBYTES	14
#define CONFIG_SYS_NAND_ONFI_DETECTION
#define CONFIG_NAND_OMAP_ECCSCHEME	OMAP_ECC_BCH8_CODE_HW
#define CONFIG_SYS_NAND_U_BOOT_START	CONFIG_SYS_TEXT_BASE
#define CONFIG_SYS_NAND_U_BOOT_OFFS	0x00080000
#endif
#endif
Yegor
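As a quick sanity check of the NAND geometry in the configuration above, the following minimal sketch recomputes the derived values from the numbers in the #defines. It only checks that the ECC layout is self-consistent (4 ECC steps of 14 bytes each should need exactly the 56 OOB positions listed in CONFIG_SYS_NAND_ECCPOS). It says nothing about read speed, and the Python variable names are purely illustrative.

```python
# Values copied from the board configuration quoted above.
PAGE_SIZE = 2048            # CONFIG_SYS_NAND_PAGE_SIZE
OOB_SIZE = 64               # CONFIG_SYS_NAND_OOBSIZE
BLOCK_SIZE = 128 * 1024     # CONFIG_SYS_NAND_BLOCK_SIZE
ECC_SIZE = 512              # CONFIG_SYS_NAND_ECCSIZE
ECC_BYTES = 14              # CONFIG_SYS_NAND_ECCBYTES
ECC_POS = list(range(2, 58))  # positions 2..57 from CONFIG_SYS_NAND_ECCPOS

pages_per_block = BLOCK_SIZE // PAGE_SIZE   # what CONFIG_SYS_NAND_PAGE_COUNT expands to
ecc_steps = PAGE_SIZE // ECC_SIZE           # ECC chunks per page
ecc_total = ecc_steps * ECC_BYTES           # ECC bytes needed per page

# The layout is consistent if every ECC byte has a slot inside the OOB area.
layout_ok = len(ECC_POS) == ecc_total and max(ECC_POS) < OOB_SIZE

print("pages per block:", pages_per_block)
print("ECC bytes per page:", ecc_total)
print("ECCPOS entries:", len(ECC_POS))
print("layout consistent:", layout_ok)
```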

Hi,
On 03/19/2015 02:41 PM, Yegor Yefremov wrote:
But U-Boot needs about twice the time:
On my boards, I didn't load the uImage through UBIFS but directly from the raw mtdblock device. That might make a significant difference, depending on how UBIFS is implemented in U-Boot. For performance tests, I recommend you compare the numbers using 'dd' from the mtdblock device under Linux, and 'nand read.i' from U-Boot.
Linux will, however, still be faster due to DMA, which is unusable from U-Boot due to the lack of interrupt handlers. But in my tests, enabling the prefetch mode in U-Boot gave me a speed-up of roughly a factor of 2, IIRC.
Thanks, Daniel

On Thu, Mar 19, 2015 at 2:56 PM, Daniel Mack daniel@zonque.org wrote:
On my boards, I didn't load the uImage through UBIFS but directly from the raw mtdblock device. That might make a significant difference, depending on how UBIFS is implemented in U-Boot. For performance tests, I recommend you compare the numbers using 'dd' from the mtdblock device under Linux, and 'nand read.i' from U-Boot.
Linux will, however, still be faster due to DMA, which is unusable from U-Boot due to the lack of interrupt handlers. But in my tests, enabling the prefetch mode in U-Boot gave me a speed-up of roughly a factor of 2, IIRC.
Strange. I have tried the "nand read" command, but the result is the same with and without CONFIG_NAND_OMAP_GPMC_PREFETCH:
[2.150655 0.001006] NAND read: device 0 offset 0x260000, size 0x1200000
[15.978943 13.828288] 18874368 bytes read: OK
It looks as if I am missing some important configuration.
Daniel, do you have the numbers? Image size and load time? What can I expect?
Yegor
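To put the raw "nand read" figure next to the earlier UBIFS load, here is a small sketch that converts both measurements into a data rate and a per-page time. All numbers are taken from the logs in this thread; the helper name mib_per_s is only illustrative. The two rates come out very close to each other (both around 1.3 MiB/s), which suggests the time is spent in the raw NAND read path rather than in UBIFS.

```python
# Measurements copied from the U-Boot logs in this thread.
RAW_BYTES = 0x1200000        # size argument of "nand read"
RAW_SECONDS = 13.828288      # elapsed time reported for the raw read
UBIFS_BYTES = 17651408       # size of kernel-fit.itb
UBIFS_SECONDS = 13.202789    # elapsed time for the UBIFS load
PAGE_SIZE = 2048             # NAND page size in bytes


def mib_per_s(num_bytes, seconds):
    """Return the data rate in MiB/s."""
    return num_bytes / seconds / (1024 * 1024)


raw_rate = mib_per_s(RAW_BYTES, RAW_SECONDS)
ubifs_rate = mib_per_s(UBIFS_BYTES, UBIFS_SECONDS)
pages = RAW_BYTES // PAGE_SIZE
ms_per_page = RAW_SECONDS / pages * 1000.0

print(f"raw read:   {raw_rate:.2f} MiB/s")
print(f"UBIFS load: {ubifs_rate:.2f} MiB/s")
print(f"{pages} pages, about {ms_per_page:.2f} ms per page")
```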

On 03/19/2015 04:13 PM, Yegor Yefremov wrote:
Strange. I have tried the "nand read" command, but the result is the same with and without CONFIG_NAND_OMAP_GPMC_PREFETCH:
[2.150655 0.001006] NAND read: device 0 offset 0x260000, size 0x1200000
[15.978943 13.828288] 18874368 bytes read: OK
What about adding some debug prints to the prefetch setup function to see whether it is executed at all?
Daniel, do you have the numbers? Image size and load time? What can I expect?
I don't currently have the setup at hand, sorry. But the number I recall from an email conversation back then is this: the time from power-on, through loading SPL, loading U-Boot, loading a 6 MB uImage and jumping into it, until the console started printing the kernel boot messages was less than 5 seconds in total.
Thanks, Daniel

On Thu, Mar 19, 2015 at 4:56 PM, Daniel Mack zonque@gmail.com wrote:
What about adding some debug prints to the prefetch setup function to see whether it is executed at all?
I2C: ready
DRAM: 256 MiB
NAND: prefetch enabled
NAND: 256 MiB
MMC: OMAP SD/MMC: 0, OMAP SD/MMC: 1
Using default environment
I've added the "NAND: prefetch enabled" output here:
#ifdef CONFIG_NAND_OMAP_GPMC_PREFETCH
	else {
		printf("NAND: prefetch enabled\n");
		nand->read_buf = omap_nand_read_prefetch8;
	}
#else
I've also put printf() into omap_nand_read_prefetch8() just to make sure it is called - it was called.
Further ideas?
In Linux I had ti,nand-xfer-type = "polled". After replacing it with ti,nand-xfer-type = "prefetch-polled", I now get:
# dd if=/dev/mtdblock5 of=/dev/null bs=2M count=8
8+0 records in
8+0 records out
16777216 bytes (17 MB) copied, 2.58744 s, 6.5 MB/s
instead of:
# dd if=/dev/mtdblock5 of=/dev/null bs=2M count=8
8+0 records in
8+0 records out
16777216 bytes (17 MB) copied, 6.05157 s, 2.8 MB/s
Am I right that DMA support is not implemented in am33xx.dtsi?
Yegor
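For reference, the two dd runs above can be reduced to a single speed-up factor. The sketch below recomputes the rates from the byte count and elapsed times printed by dd (dd reports decimal megabytes) and derives the ratio between "polled" and "prefetch-polled". The function name mb_per_s is only illustrative.

```python
# Figures copied from the two dd runs above.
DD_BYTES = 16777216
POLLED_SECONDS = 6.05157     # ti,nand-xfer-type = "polled"
PREFETCH_SECONDS = 2.58744   # ti,nand-xfer-type = "prefetch-polled"


def mb_per_s(num_bytes, seconds):
    """Return the data rate in decimal MB/s, the unit dd prints."""
    return num_bytes / seconds / 1e6


polled_rate = mb_per_s(DD_BYTES, POLLED_SECONDS)
prefetch_rate = mb_per_s(DD_BYTES, PREFETCH_SECONDS)
speedup = POLLED_SECONDS / PREFETCH_SECONDS

print(f"polled:          {polled_rate:.1f} MB/s")
print(f"prefetch-polled: {prefetch_rate:.1f} MB/s")
print(f"speed-up:        {speedup:.2f}x")
```

The resulting factor of a bit more than 2 is in line with the "roughly factor 2" that was reported earlier in the thread for prefetch mode in U-Boot.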

On 20/03/15 13:24, Yegor Yefremov wrote:
Am I right that DMA support is not implemented in am33xx.dtsi?
DMA support will have to be enabled in the board DTS, e.g. am335x-bone.dts.
If ti,nand-xfer-type is not specified (as in the mainline kernel), it defaults to prefetch-polled.
cheers, -roger
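The defaulting rule described above can be modelled in a few lines. This is only an illustrative model of the behaviour as stated in this thread, not the kernel's actual parsing code. The dictionary stands in for the properties of a nand node, and effective_xfer_type is a hypothetical helper name.

```python
# Default named in the thread for a node without "ti,nand-xfer-type".
DEFAULT_XFER_TYPE = "prefetch-polled"


def effective_xfer_type(nand_node):
    """Return the transfer type a NAND node ends up with (illustrative model)."""
    return nand_node.get("ti,nand-xfer-type", DEFAULT_XFER_TYPE)


# A node that explicitly selects polled mode, as in the older board DTS.
explicit_polled = {"ti,nand-xfer-type": "polled"}
# A node without the property, as described for the mainline kernel.
no_property = {}

print(effective_xfer_type(explicit_polled))
print(effective_xfer_type(no_property))
```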

On Fri, Mar 20, 2015 at 1:37 PM, Roger Quadros rogerq@ti.com wrote:
DMA support will have to be enabled in the board DTS, e.g. am335x-bone.dts.
If ti,nand-xfer-type is not specified (as in the mainline kernel), it defaults to prefetch-polled.
I get the following error:
omap-gpmc 50000000.gpmc: GPMC revision 6.0
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAEAWP
nand: 256MiB, SLC, page size: 2048, OOB size: 64
omap2-nand omap2-nand.0: DMA engine request failed
Kernel: 3.18.1
GPMC config:
&gpmc {
	pinctrl-names = "default";
	pinctrl-0 = <&nandflash_pins_s0>;
	ranges = <0 0 0x08000000 0x10000000>;	/* CS0: NAND */
	status = "okay";

	nand@0,0 {
		reg = <0 0 4>;	/* CS0, offset 0 */
		nand-bus-width = <8>;
		ti,nand-ecc-opt = "bch8";
		ti,nand-xfer-type = "prefetch-dma";

		gpmc,device-nand = "true";
		gpmc,device-width = <1>;
		gpmc,sync-clk-ps = <0>;
		gpmc,cs-on-ns = <0>;
		gpmc,cs-rd-off-ns = <44>;
		gpmc,cs-wr-off-ns = <44>;
		gpmc,adv-on-ns = <6>;
		gpmc,adv-rd-off-ns = <34>;
		gpmc,adv-wr-off-ns = <44>;
		gpmc,we-on-ns = <0>;
		gpmc,we-off-ns = <40>;
		gpmc,oe-on-ns = <0>;
		gpmc,oe-off-ns = <54>;
		gpmc,access-ns = <64>;
		gpmc,rd-cycle-ns = <82>;
		gpmc,wr-cycle-ns = <82>;
		gpmc,wait-on-read = "true";
		gpmc,wait-on-write = "true";
		gpmc,bus-turnaround-ns = <0>;
		gpmc,cycle2cycle-delay-ns = <0>;
		gpmc,clk-activation-ns = <0>;
		gpmc,wait-monitoring-ns = <0>;
		gpmc,wr-access-ns = <40>;
		gpmc,wr-data-mux-bus-ns = <0>;

		#address-cells = <1>;
		#size-cells = <1>;
		elm_id = <&elm>;
	};
};
Any idea?
Btw., where should DMA support be configured? The GPMC node has no "dmas" property, unlike other components such as MMC, UART, USB etc.
gpmc: gpmc@50000000 {
	compatible = "ti,am3352-gpmc";
	ti,hwmods = "gpmc";
	ti,no-idle-on-init;
	reg = <0x50000000 0x2000>;
	interrupts = <100>;
	gpmc,num-cs = <7>;
	gpmc,num-waitpins = <2>;
	#address-cells = <2>;
	#size-cells = <1>;
	status = "disabled";
};
Yegor
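The timing values in the nand node above allow a rough upper bound on the read rate to be estimated. The sketch below assumes that one byte is transferred per GPMC read cycle on the 8-bit bus and that gpmc,rd-cycle-ns is applied exactly as written. Both are simplifying assumptions: the controller may round timings to its own clock period, and the NAND page load time, command overhead and ECC handling are ignored. The measured rate is the prefetch-polled dd figure from this thread.

```python
# Values from the DTS above and the dd measurement earlier in the thread.
RD_CYCLE_NS = 82             # gpmc,rd-cycle-ns
BUS_WIDTH_BITS = 8           # nand-bus-width
PAGE_SIZE = 2048             # bytes per NAND page
MEASURED_MB_S = 16777216 / 2.58744 / 1e6   # prefetch-polled dd run

# Assumption: one bus word per read cycle, with no rounding of the timing.
bytes_per_cycle = BUS_WIDTH_BITS // 8
bus_limit_mb_s = bytes_per_cycle * 1e9 / RD_CYCLE_NS / 1e6
page_transfer_us = PAGE_SIZE / bytes_per_cycle * RD_CYCLE_NS / 1000.0
utilisation = MEASURED_MB_S / bus_limit_mb_s

print(f"bus-cycle limit:      {bus_limit_mb_s:.1f} MB/s")
print(f"page transfer time:   {page_transfer_us:.0f} us")
print(f"measured / bus limit: {utilisation:.0%}")
```

Under these assumptions the prefetch-polled figure measured in Linux is about half of what the bus cycle time alone would allow, and the U-Boot figure is far below it.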

+Tony and l-o
On 20/03/15 15:37, Yegor Yefremov wrote:
Btw., where should DMA support be configured? The GPMC node has no "dmas" property, unlike other components such as MMC, UART, USB etc.
I've never tested DMA on omap-nand in mainline. We've been relying solely on prefetch-polled mode there.
Someone needs to work on it and test DMA support.
It looks like the driver is hardcoding the DMA channel:
	sig = OMAP24XX_DMA_GPMC;
	info->dma = dma_request_channel(mask, omap_dma_filter_fn, &sig);
This will have to be fixed. I'm not sure what else is missing on the DMA side. But you could try to use a free channel and see if it works?
Tony, any insights?
cheers, -roger

* Roger Quadros rogerq@ti.com [150320 07:59]:
Tony, any insights?
There are six external DMA request pins on OMAPs that devices on the GPMC can use for DMA: sys_ndmareq0 to sys_ndmareq5. So far the old tusb6010 driver is the only one using them, and it still does not use the DMA engine API. I've been meaning to update it, but have not had a chance yet. Anyway, all new DMA code should use the dmaengine API instead of the OMAP legacy DMA calls.
Note that the onenand driver is wrongly using the external DMA request pin as a GPIO. That's not what the pin is supposed to do; it needs to be muxed to the external DMA request mode. So I suggest we either rip out the DMA code from onenand, or update it to use sys_ndmareq and the DMA engine API.
Regards,
Tony

Hi,
On 03/20/2015 12:24 PM, Yegor Yefremov wrote:
I've also put printf() into omap_nand_read_prefetch8() just to make sure it is called - it was called.
Does it maybe fall back to polled mode because the engine is busy? See the comment in the code that deals with the return value of __read_prefetch_aligned(). Not that I'm aware of such an issue, but maybe you need to enable some module or clock?
Further ideas?
It's been a while, but when I worked on cutting down the boot times of a custom AM335x board months ago, these were the steps that I took:
* I tweaked the NAND timing values in the DTS, then looked at the GPMC timing values Linux calculated from them and copied those over to U-Boot. This gave a NAND read speed-up of ~20-30% compared with the defaults U-Boot ships with.
* Implementing the GPMC prefetch operations boosted the speed of raw NAND reads by approximately a factor of 2.
* The NAND bad block table scan was reduced to the first 64 blocks, because that is all I cared about in U-Boot, and Linux does the full scan at a later point anyway. This brought the scan down from ~2s to some milliseconds.
* In the SPL code, the CPU was put in GHz mode as early as possible. That improved load times again, and also reduced the time spent in the kernel decompressor.
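The effect of limiting the bad block scan can be estimated with simple proportional scaling. The sketch below assumes a chip with the same geometry as the one discussed in this thread (256 MiB with 128 KiB blocks), which may not match the board the steps above were measured on, and it assumes the scan time grows linearly with the number of blocks scanned.

```python
# Assumed geometry: the 256 MiB / 128 KiB-block chip from this thread.
CHIP_SIZE = 256 * 1024 * 1024
BLOCK_SIZE = 128 * 1024
FULL_SCAN_SECONDS = 2.0      # "~2s" quoted for the full scan
SCANNED_BLOCKS = 64          # scan limited to the first 64 blocks

total_blocks = CHIP_SIZE // BLOCK_SIZE
fraction = SCANNED_BLOCKS / total_blocks
# Assumption: scan time is proportional to the number of blocks scanned.
estimated_ms = FULL_SCAN_SECONDS * fraction * 1000.0

print(f"total blocks:   {total_blocks}")
print(f"scanned share:  {fraction:.4f}")
print(f"estimated scan: {estimated_ms:.1f} ms")
```

Under these assumptions the scan drops to a few tens of milliseconds, which is the order of magnitude described above.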
Am I right that DMA support is not implemented in am33xx.dtsi?
Right, I just double-checked with the DTS we were using. It had no specific "ti,nand-xfer-type" setting, so it should default to "prefetch-polled". So it seems DMA is not in place for this board and the gpmc-nand driver. Sorry for the confusion.
Daniel

On Fri, Mar 20, 2015 at 3:54 PM, Daniel Mack zonque@gmail.com wrote:
Does it maybe fall back to polled mode because the engine is busy? See the comment in the code that deals with the return value of __read_prefetch_aligned(). Not that I'm aware of such an issue, but maybe you need to enable some module or clock?
I've seen this fallback comment and added a printf to the ret < 0 case and to __read_prefetch_aligned(). There were no cases where the fallback was invoked.
I also suspect that some clock settings are missing. I will have to compare the register values with the ones Linux uses.
- In the SPL code, the CPU was put in GHz mode as early as possible. That improved load times again, and also reduced the time spent in the kernel decompressor.
Do you mean am33xx_spl_board_init()? In theory it reads the ID and sets the desired MPU/core voltage. But I'll look at this once again.
participants (5)
- Daniel Mack
- Daniel Mack
- Roger Quadros
- Tony Lindgren
- Yegor Yefremov