[U-Boot] [PATCH 0/2] CFI: increase performance

Hi list,
The following patches should increase the performance of the CFI driver, particularly with regard to single word programming mode.
I tested it on TQM5200S with NOR-Flash Samsung K8P2815UQB, which has no write buffer. At least no write buffer, that could be programmed using standard commands.
Performance increase on this TQM is about factor 2.6 (37 KiB/s -> 95 KiB/s). On the same module with Spansion S29GL128N (with write buffer) it is about factor 1.2 (455 KiB/s -> 585 KiB/s).
TQM5200 is a bottom boot module with 2x16 Bit Flash connection. Could someone test the patches on other HW, particularly top boot, other CPU, other flash width, please?
Thanks, Jens
---
Jens Gehrlein (2):
      CFI: increase performance of function find_sector()
      CFI: avoid redundant function call in single word programming mode

 drivers/mtd/cfi_flash.c |   33 +++++++++++++++++++++++----------
 1 files changed, 23 insertions(+), 10 deletions(-)

The function find_sector() doesn't need to be called twice in the case of the AMD command set. Tested on TQM5200S-BD with Samsung K8P2815UQB.
Signed-off-by: Jens Gehrlein <sew_s@tqs.de>
---
 drivers/mtd/cfi_flash.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/mtd/cfi_flash.c b/drivers/mtd/cfi_flash.c
index e8afe99..1bd0e2b 100644
--- a/drivers/mtd/cfi_flash.c
+++ b/drivers/mtd/cfi_flash.c
@@ -795,7 +795,8 @@ static int flash_write_cfiword (flash_info_t * info, ulong dest,
 {
 	void *dstaddr;
 	int flag;
-	flash_sect_t sect;
+	flash_sect_t sect = 0;
+	char sect_found = 0;
 
 	dstaddr = map_physmem(dest, info->portwidth, MAP_NOCACHE);
 
@@ -840,6 +841,7 @@ static int flash_write_cfiword (flash_info_t * info, ulong dest,
 		sect = find_sector(info, dest);
 		flash_unlock_seq (info, sect);
 		flash_write_cmd (info, sect, info->addr_unlock1, AMD_CMD_WRITE);
+		sect_found = 1;
 		break;
 	}
 
@@ -864,8 +866,10 @@ static int flash_write_cfiword (flash_info_t * info, ulong dest,
 
 	unmap_physmem(dstaddr, info->portwidth);
 
-	return flash_full_status_check (info, find_sector (info, dest),
-					info->write_tout, "write");
+	if (!sect_found)
+		sect = find_sector (info, dest);
+
+	return flash_full_status_check (info, sect, info->write_tout, "write");
 }
 
 #ifdef CONFIG_SYS_FLASH_USE_BUFFER_WRITE
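
The idea of this patch can be shown in isolation: remember the sector that was looked up for the AMD unlock sequence and reuse it for the final status check. The following is only a minimal sketch under stated assumptions. The type demo_flash_t, the functions demo_find_sector() and demo_write_word(), and the counter lookup_calls are hypothetical stand-ins for illustration, not the driver's code.

```c
/* Hypothetical, simplified stand-in for flash_info_t. */
typedef struct {
	unsigned long start[8];	/* sector start addresses, ascending */
	int sector_count;
} demo_flash_t;

static int lookup_calls;	/* counts how often the sector table is searched */

/* Simplified top-down search, similar in spirit to the pre-patch find_sector(). */
static int demo_find_sector(const demo_flash_t *info, unsigned long addr)
{
	int sector;

	lookup_calls++;
	for (sector = info->sector_count - 1; sector > 0; sector--) {
		if (addr >= info->start[sector])
			break;
	}
	return sector;
}

/*
 * Models the control flow of the patched write path and returns the
 * sector that would be handed to the status check.
 */
static int demo_write_word(const demo_flash_t *info, unsigned long dest,
			   int amd_cmdset)
{
	int sect = 0;
	int sect_found = 0;

	if (amd_cmdset) {
		/* The AMD unlock sequence needs the sector anyway. */
		sect = demo_find_sector(info, dest);
		sect_found = 1;
	}

	/* ... the actual word write would happen here ... */

	/* Search again only if the command set did not need the sector. */
	if (!sect_found)
		sect = demo_find_sector(info, dest);

	return sect;
}
```

In this sketch demo_write_word() searches the table exactly once per word for either command set, whereas the pre-patch flow searched twice on the AMD path.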

Tested on TQM5200S-BD with Samsung K8P2815UQB
Signed-off-by: Jens Gehrlein <sew_s@tqs.de>
---
 drivers/mtd/cfi_flash.c |   23 ++++++++++++++++-------
 1 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/drivers/mtd/cfi_flash.c b/drivers/mtd/cfi_flash.c
index 1bd0e2b..bc5e151 100644
--- a/drivers/mtd/cfi_flash.c
+++ b/drivers/mtd/cfi_flash.c
@@ -774,17 +774,26 @@ static void flash_add_byte (flash_info_t * info, cfiword_t * cword, uchar c)
 	}
 }
 
-/* loop through the sectors from the highest address when the passed
- * address is greater or equal to the sector address we have a match
+/*
+ * Loop through the sector table starting from the previously found sector.
+ * Searches forwards or backwards, dependent on the passed address.
  */
 static flash_sect_t find_sector (flash_info_t * info, ulong addr)
 {
-	flash_sect_t sector;
+	static flash_sect_t saved_sector = 0; /* previously found sector */
+	flash_sect_t sector = saved_sector;
 
-	for (sector = info->sector_count - 1; sector >= 0; sector--) {
-		if (addr >= info->start[sector])
-			break;
-	}
+	while ((info->start[sector] < addr)
+			&& (sector < info->sector_count - 1))
+		sector++;
+	while ((info->start[sector] > addr) && (sector > 0))
+		/*
+		 * also decrements the sector in case of an overshot
+		 * in the first loop
+		 */
+		sector--;
+
+	saved_sector = sector;
 	return sector;
 }
 
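
To see why resuming from the previously found sector helps with sequential writes, the two search strategies can be compared in a small standalone sketch. This is an illustration under stated assumptions, not the driver itself: demo_bank_t, find_sector_scan(), find_sector_cached() and the steps counter are hypothetical names, and the bounds guard at the top of the cached variant is an addition made for this sketch that the posted patch does not contain.

```c
/* Hypothetical, simplified stand-in for flash_info_t. */
typedef struct {
	const unsigned long *start;	/* sector start addresses, ascending */
	int sector_count;
} demo_bank_t;

static int steps;	/* number of loop iterations, for illustration only */

/* Pre-patch strategy: always scan down from the highest sector. */
static int find_sector_scan(const demo_bank_t *info, unsigned long addr)
{
	int sector;

	for (sector = info->sector_count - 1; sector > 0; sector--) {
		steps++;
		if (addr >= info->start[sector])
			break;
	}
	return sector;
}

/* Patched strategy: resume the search from the previously found sector. */
static int find_sector_cached(const demo_bank_t *info, unsigned long addr)
{
	static int saved_sector;	/* previously found sector */
	int sector = saved_sector;

	/* Guard added for this sketch: the cached index may exceed this table. */
	if (sector >= info->sector_count)
		sector = 0;

	/* Walk forwards while the sector starts below the address. */
	while (info->start[sector] < addr && sector < info->sector_count - 1) {
		steps++;
		sector++;
	}
	/* Walk backwards; this also corrects an overshoot of the first loop. */
	while (info->start[sector] > addr && sector > 0) {
		steps++;
		sector--;
	}

	saved_sector = sector;
	return sector;
}
```

For consecutive addresses inside one sector the cached variant needs only a couple of iterations per call, independent of the table size, while the top-down scan walks from the last sector every time. That difference matters most in single word programming mode, where the lookup runs for every word written.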

On 17:25 Tue 16 Dec , Jens Gehrlein wrote:
> Hi list,
> The following patches should increase the performance of the CFI driver, particularly with regard to single word programming mode.
> I tested it on TQM5200S with NOR-Flash Samsung K8P2815UQB, which has no write buffer. At least no write buffer, that could be programmed using standard commands.
> Performance increase on this TQM is about factor 2.6 (37 KiB/s -> 95 KiB/s). On the same module with Spansion S29GL128N (with write buffer) it is about factor 1.2 (455 KiB/s -> 585 KiB/s).
> TQM5200 is a bottom boot module with 2x16 Bit Flash connection. Could someone test the patches on other HW, particularly top boot, other CPU, other flash width, please?
Could you try it against the qemu SVN HEAD?
Best Regards, J.

Jean-Christophe PLAGNIOL-VILLARD wrote:
> On 17:25 Tue 16 Dec , Jens Gehrlein wrote:
>> Hi list,
>> The following patches should increase the performance of the CFI driver, particularly with regard to single word programming mode.
>> I tested it on TQM5200S with NOR-Flash Samsung K8P2815UQB, which has no write buffer. At least no write buffer, that could be programmed using standard commands.
>> Performance increase on this TQM is about factor 2.6 (37 KiB/s -> 95 KiB/s). On the same module with Spansion S29GL128N (with write buffer) it is about factor 1.2 (455 KiB/s -> 585 KiB/s).
>> TQM5200 is a bottom boot module with 2x16 Bit Flash connection. Could someone test the patches on other HW, particularly top boot, other CPU, other flash width, please?
> Could you try it against the qemu SVN HEAD?
??? Sorry, what is qemu SVN HEAD ???
P.S.: I still can't reply to your e-mail address...
Kind regards, Jens

On 17:46 Tue 16 Dec , Jens Gehrlein wrote:
> Jean-Christophe PLAGNIOL-VILLARD wrote:
>> On 17:25 Tue 16 Dec , Jens Gehrlein wrote:
>>> Hi list,
>>> The following patches should increase the performance of the CFI driver, particularly with regard to single word programming mode.
>>> I tested it on TQM5200S with NOR-Flash Samsung K8P2815UQB, which has no write buffer. At least no write buffer, that could be programmed using standard commands.
>>> Performance increase on this TQM is about factor 2.6 (37 KiB/s -> 95 KiB/s). On the same module with Spansion S29GL128N (with write buffer) it is about factor 1.2 (455 KiB/s -> 585 KiB/s).
>>> TQM5200 is a bottom boot module with 2x16 Bit Flash connection. Could someone test the patches on other HW, particularly top boot, other CPU, other flash width, please?
>> Could you try it against the qemu SVN HEAD?
> ??? Sorry, what is qemu SVN HEAD ???
In qemu you have 2 emulated boards with U-Boot support: qemu_mips and SX1.
It would be nice to test them as well.
Please note they are only in the SVN tree of qemu.
Best Regards, J.

Dear Jean-Christophe,
Jean-Christophe PLAGNIOL-VILLARD wrote:
> On 17:46 Tue 16 Dec , Jens Gehrlein wrote:
>> Jean-Christophe PLAGNIOL-VILLARD wrote:
>>> On 17:25 Tue 16 Dec , Jens Gehrlein wrote:
>>>> Hi list,
>>>> The following patches should increase the performance of the CFI driver, particularly with regard to single word programming mode.
>>>> I tested it on TQM5200S with NOR-Flash Samsung K8P2815UQB, which has no write buffer. At least no write buffer, that could be programmed using standard commands.
>>>> Performance increase on this TQM is about factor 2.6 (37 KiB/s -> 95 KiB/s). On the same module with Spansion S29GL128N (with write buffer) it is about factor 1.2 (455 KiB/s -> 585 KiB/s).
>>>> TQM5200 is a bottom boot module with 2x16 Bit Flash connection. Could someone test the patches on other HW, particularly top boot, other CPU, other flash width, please?
>>> Could you try it against the qemu SVN HEAD?
>> ??? Sorry, what is qemu SVN HEAD ???
> In qemu you have 2 emulated boards with U-Boot support: qemu_mips and SX1.
Now I understand what you meant.
> It would be nice to test them as well.
> Please note they are only in the SVN tree of qemu.
Because I'm familiar with neither qemu nor svn, I can't do that with little effort (installation, familiarization, etc.). Besides that, how could a virtual machine simulate the real bus access with its bus timing? If I'm right on this point, only testing on another architecture is possible, but no performance test.
Shortly, I will get a TQM8548 with Samsung Flash. It's a top boot system, but also has a 2x16 Bit Flash connection. For my part, I can only offer you a test on this board.
Stefan, as the CFI custodian, what is your procedure for checking this kind of common code patch?
Kind regards, Jens

On Wednesday 17 December 2008, Jens Gehrlein wrote:
>>>>> The following patches should increase the performance of the CFI driver, particularly with regard to single word programming mode.
>>>>> I tested it on TQM5200S with NOR-Flash Samsung K8P2815UQB, which has no write buffer. At least no write buffer, that could be programmed using standard commands.
>>>>> Performance increase on this TQM is about factor 2.6 (37 KiB/s -> 95 KiB/s). On the same module with Spansion S29GL128N (with write buffer) it is about factor 1.2 (455 KiB/s -> 585 KiB/s).
>>>>> TQM5200 is a bottom boot module with 2x16 Bit Flash connection. Could someone test the patches on other HW, particularly top boot, other CPU, other flash width, please?
>>>> Could you try it against the qemu SVN HEAD?
>>> ??? Sorry, what is qemu SVN HEAD ???
>> In qemu you have 2 emulated boards with U-Boot support: qemu_mips and SX1.
> Now I understand what you meant.
>> It would be nice to test them as well.
>> Please note they are only in the SVN tree of qemu.
> Because I'm familiar with neither qemu nor svn, I can't do that with little effort (installation, familiarization, etc.). Besides that, how could a virtual machine simulate the real bus access with its bus timing? If I'm right on this point, only testing on another architecture is possible, but no performance test.
> Shortly, I will get a TQM8548 with Samsung Flash. It's a top boot system, but also has a 2x16 Bit Flash connection. For my part, I can only offer you a test on this board.
> Stefan, as the CFI custodian, what is your procedure for checking this kind of common code patch?
First a review of course. And second tests on multiple boards. It would be good if you could test your patch on a few other systems as well. If nobody objects, I'll apply your patch to the cfi/next branch in a few days and will probably test it on some of my boards too. It will hit mainline in the next merge window then.
Best regards, Stefan
=====================================================================
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich,   Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-0 Fax: +49-8142-66989-80  Email: office@denx.de
=====================================================================

Dear Jens,
In message <49489B37.9020000@tqs.de> you wrote:
> Because I'm familiar with neither qemu nor svn, I can't do that with little effort (installation, familiarization, etc.). Besides that,
Actually you just have to check out and run it - that does not require an in-depth understanding or long learning curve.
> how could a virtual machine simulate the real bus access with its bus timing? If I'm right on this point, only testing on another architecture is possible, but no performance test.
Indeed. But it is a very important data point to know that your patch does not break the CFI driver on as many architectures as we can test; and neither you nor we have all that many boards available as real hardware, so simulated machines fill an important gap.
> Shortly, I will get a TQM8548 with Samsung Flash. It's a top boot system, but also has a 2x16 Bit Flash connection. For my part, I can only offer you a test on this board.
Well, you probably could test it on a TQM8260 as well, which has a 64 bit bus (4 x 16 bit). And you should have access to trab and cmc_pu2 and TQM5200 / TQM5200s and TQM834x and TQMA31, so you can actually test it on a much wider range of boards than most of us.
> Stefan, as the CFI custodian, what is your procedure for checking this kind of common code patch?
We ask the people who have access to real hardware to test on their boards, and we accompany that by tests on simulated hardware.
Best regards,
Wolfgang Denk

On 07:24 Wed 17 Dec , Jens Gehrlein wrote:
> Dear Jean-Christophe,
> Jean-Christophe PLAGNIOL-VILLARD wrote:
>> On 17:46 Tue 16 Dec , Jens Gehrlein wrote:
>>> Jean-Christophe PLAGNIOL-VILLARD wrote:
>>>> On 17:25 Tue 16 Dec , Jens Gehrlein wrote:
>>>>> Hi list,
>>>>> The following patches should increase the performance of the CFI driver, particularly with regard to single word programming mode.
>>>>> I tested it on TQM5200S with NOR-Flash Samsung K8P2815UQB, which has no write buffer. At least no write buffer, that could be programmed using standard commands.
>>>>> Performance increase on this TQM is about factor 2.6 (37 KiB/s -> 95 KiB/s). On the same module with Spansion S29GL128N (with write buffer) it is about factor 1.2 (455 KiB/s -> 585 KiB/s).
>>>>> TQM5200 is a bottom boot module with 2x16 Bit Flash connection. Could someone test the patches on other HW, particularly top boot, other CPU, other flash width, please?
>>>> Could you try it against the qemu SVN HEAD?
>>> ??? Sorry, what is qemu SVN HEAD ???
>> In qemu you have 2 emulated boards with U-Boot support: qemu_mips and SX1.
> Now I understand what you meant.
>> It would be nice to test them as well.
>> Please note they are only in the SVN tree of qemu.
> Because I'm familiar with neither qemu nor svn, I can't do that with little effort (installation, familiarization, etc.). Besides that, how could a virtual machine simulate the real bus access with its bus timing? If I'm right on this point, only testing on another architecture is possible, but no performance test.
It's simple to use. I've written a doc about the qemu_mips usage in doc/README.qemu_mips,
and I resent a doc update to the ML yesterday.
For the SX1 it's nearly the same as for qemu_mips, but the command line to start qemu will be:
  dd of=SX1/flash bs=1k count=32k if=/dev/zero
  dd of=SX1/flash bs=1k conv=notrunc if=SX1/u-boot.bin
  /opt/qemu/bin/qemu-system-arm -M sx1 -monitor pty -nographic -m 98 -pflash SX1/flash
The tree to use for the SX1 is u-boot-arm/next, with make SX1_stdout_serial_config.
Best Regards, J.

On Tuesday 16 December 2008, Jens Gehrlein wrote:
> The following patches should increase the performance of the CFI driver, particularly with regard to single word programming mode.
> I tested it on TQM5200S with NOR-Flash Samsung K8P2815UQB, which has no write buffer. At least no write buffer, that could be programmed using standard commands.
> Performance increase on this TQM is about factor 2.6 (37 KiB/s -> 95 KiB/s). On the same module with Spansion S29GL128N (with write buffer) it is about factor 1.2 (455 KiB/s -> 585 KiB/s).
> TQM5200 is a bottom boot module with 2x16 Bit Flash connection. Could someone test the patches on other HW, particularly top boot, other CPU, other flash width, please?
Tested on AMCC Kilauea (PPC405EX) with 1*16bit Spansion S29GL512N.
Patch series applied to u-boot-cfi-flash/master. Thanks.
Best regards, Stefan

Hi Stefan,
Stefan Roese wrote:
> On Tuesday 16 December 2008, Jens Gehrlein wrote:
>> The following patches should increase the performance of the CFI driver, particularly with regard to single word programming mode.
>> I tested it on TQM5200S with NOR-Flash Samsung K8P2815UQB, which has no write buffer. At least no write buffer, that could be programmed using standard commands.
>> Performance increase on this TQM is about factor 2.6 (37 KiB/s -> 95 KiB/s). On the same module with Spansion S29GL128N (with write buffer) it is about factor 1.2 (455 KiB/s -> 585 KiB/s).
>> TQM5200 is a bottom boot module with 2x16 Bit Flash connection. Could someone test the patches on other HW, particularly top boot, other CPU, other flash width, please?
> Tested on AMCC Kilauea (PPC405EX) with 1*16bit Spansion S29GL512N.
The S29GL512N has a write buffer, AFAIK. Thus, U-Boot chooses another programming algorithm. Possibly you only tested patch 2, or did you override buffered programming for your test so that patch 1 was included?
> Patch series applied to u-boot-cfi-flash/master. Thanks.
Nice, thanks.
Kind regards, Jens

Hi Jens,
On Monday 26 January 2009, Jens Gehrlein wrote:
>>> The following patches should increase the performance of the CFI driver, particularly with regard to single word programming mode.
>>> I tested it on TQM5200S with NOR-Flash Samsung K8P2815UQB, which has no write buffer. At least no write buffer, that could be programmed using standard commands.
>>> Performance increase on this TQM is about factor 2.6 (37 KiB/s -> 95 KiB/s). On the same module with Spansion S29GL128N (with write buffer) it is about factor 1.2 (455 KiB/s -> 585 KiB/s).
>>> TQM5200 is a bottom boot module with 2x16 Bit Flash connection. Could someone test the patches on other HW, particularly top boot, other CPU, other flash width, please?
>> Tested on AMCC Kilauea (PPC405EX) with 1*16bit Spansion S29GL512N.
> The S29GL512N has a write buffer, AFAIK. Thus, U-Boot chooses another programming algorithm. Possibly you only tested patch 2, or did you override buffered programming for your test so that patch 1 was included?
No, I tested both your patches with the unchanged kilauea config to see if your patch broke this (standard) configuration. Nothing bad happened so I applied it to the master branch. Others will surely test it as well when it's available in mainline.
Best regards, Stefan
participants (4)
- Jean-Christophe PLAGNIOL-VILLARD
- Jens Gehrlein
- Stefan Roese
- Wolfgang Denk