[U-Boot] [PATCH 2/2] fs/fat: Optimizes memory size with single global variable instead of 3

Single 64KB get_contents_vfatname_block global variable would be used for all FAT implementation instead of allocating additional two global variables which are get_denfromdir_block and do_fat_read_at_block. This implementation can help in saving up 128KB memory space.
Signed-off-by: Tien Fong Chee tfchee@altera.com --- Changes for V2: - Renaming all references to get_dentfromdir_block and removing redundant Cc: Dinh Nguyen dinguyen@opensource.altera.com Cc: Dinh Nguyen dinh.linux@gmail.com Cc: ChinLiang clsee@altera.com Cc: Vagrant Cascadian vagrant@debian.org Cc: Simon Glass sjg@chromium.org Cc: Stephen Warren swarren@nvidia.com Cc: Benoît Thébaudeau benoit@wsystem.com --- fs/fat/fat.c | 23 +++++++++-------------- fs/fat/fat_write.c | 20 ++++++-------------- 2 files changed, 15 insertions(+), 28 deletions(-)
diff --git a/fs/fat/fat.c b/fs/fat/fat.c index 826bd85..7f42af7 100644 --- a/fs/fat/fat.c +++ b/fs/fat/fat.c @@ -327,7 +327,7 @@ get_cluster(fsdata *mydata, __u32 clustnum, __u8 *buffer, unsigned long size) * into 'buffer'. * Update the number of bytes read in *gotsize or return -1 on fatal errors. */ -__u8 get_contents_vfatname_block[MAX_CLUSTSIZE] +__u8 get_dentfromdir_block[MAX_CLUSTSIZE] __aligned(ARCH_DMA_MINALIGN);
static int get_contents(fsdata *mydata, dir_entry *dentptr, loff_t pos, @@ -373,14 +373,14 @@ static int get_contents(fsdata *mydata, dir_entry *dentptr, loff_t pos, /* align to beginning of next cluster if any */ if (pos) { actsize = min(filesize, (loff_t)bytesperclust); - if (get_cluster(mydata, curclust, get_contents_vfatname_block, + if (get_cluster(mydata, curclust, get_dentfromdir_block, (int)actsize) != 0) { printf("Error reading cluster\n"); return -1; } filesize -= actsize; actsize -= pos; - memcpy(buffer, get_contents_vfatname_block + pos, actsize); + memcpy(buffer, get_dentfromdir_block + pos, actsize); *gotsize += actsize; if (!filesize) return 0; @@ -515,13 +515,13 @@ get_vfatname(fsdata *mydata, int curclust, __u8 *cluster, return -1; }
- if (get_cluster(mydata, curclust, get_contents_vfatname_block, + if (get_cluster(mydata, curclust, get_dentfromdir_block, mydata->clust_size * mydata->sect_size) != 0) { debug("Error: reading directory block\n"); return -1; }
- slotptr2 = (dir_slot *)get_contents_vfatname_block; + slotptr2 = (dir_slot *)get_dentfromdir_block; while (counter > 0) { if (((slotptr2->id & ~LAST_LONG_ENTRY_MASK) & 0xff) != counter) @@ -532,7 +532,7 @@ get_vfatname(fsdata *mydata, int curclust, __u8 *cluster,
/* Save the real directory entry */ realdent = (dir_entry *)slotptr2; - while ((__u8 *)slotptr2 > get_contents_vfatname_block) { + while ((__u8 *)slotptr2 > get_dentfromdir_block) { slotptr2--; slot2str(slotptr2, l_name, &idx); } @@ -579,8 +579,6 @@ static __u8 mkcksum(const char name[8], const char ext[3]) * Get the directory entry associated with 'filename' from the directory * starting at 'startsect' */ -__u8 get_dentfromdir_block[MAX_CLUSTSIZE] - __aligned(ARCH_DMA_MINALIGN);
static dir_entry *get_dentfromdir(fsdata *mydata, int startsect, char *filename, dir_entry *retdent, @@ -811,9 +809,6 @@ exit: return ret; }
-__u8 do_fat_read_at_block[MAX_CLUSTSIZE] - __aligned(ARCH_DMA_MINALIGN); - int do_fat_read_at(const char *filename, loff_t pos, void *buffer, loff_t maxsize, int dols, int dogetsize, loff_t *size) { @@ -927,7 +922,7 @@ root_reparse: int i;
if (mydata->fatsize == 32 || firsttime) { - dir_ptr = do_fat_read_at_block; + dir_ptr = get_dentfromdir_block; firsttime = 0; } else { /** @@ -947,8 +942,8 @@ root_reparse: * 0 | sector n+1 | sector n+2 | sector n+3 | * 1 | sector n+3 | ... */ - dir_ptr = (do_fat_read_at_block + mydata->sect_size); - memcpy(do_fat_read_at_block, dir_ptr, mydata->sect_size); + dir_ptr = (get_dentfromdir_block + mydata->sect_size); + memcpy(get_dentfromdir_block, dir_ptr, mydata->sect_size); }
do_read = 1; diff --git a/fs/fat/fat_write.c b/fs/fat/fat_write.c index c1d48c5..62f32bf 100644 --- a/fs/fat/fat_write.c +++ b/fs/fat/fat_write.c @@ -369,7 +369,6 @@ static __u32 dir_curclust; * a slot) into 'l_name'. If successful also copy the real directory entry * into 'retdent' * If additional adjacent cluster for directory entries is read into memory, - * then 'get_contents_vfatname_block' is copied into 'get_dentfromdir_block' and * the location of the real directory entry is returned by 'retdent' * Return 0 on success, -1 otherwise. */ @@ -384,7 +383,7 @@ get_long_file_name(fsdata *mydata, int curclust, __u8 *cluster, PREFETCH_BLOCKS : mydata->clust_size); __u8 counter = (slotptr->id & ~LAST_LONG_ENTRY_MASK) & 0xff; - int idx = 0, cur_position = 0; + int idx = 0;
if (counter > VFAT_MAXSEQ) { debug("Error: VFAT name is too long\n"); @@ -412,13 +411,13 @@ get_long_file_name(fsdata *mydata, int curclust, __u8 *cluster,
dir_curclust = curclust;
- if (get_cluster(mydata, curclust, get_contents_vfatname_block, + if (get_cluster(mydata, curclust, get_dentfromdir_block, mydata->clust_size * mydata->sect_size) != 0) { debug("Error: reading directory block\n"); return -1; }
- slotptr2 = (dir_slot *)get_contents_vfatname_block; + slotptr2 = (dir_slot *)get_dentfromdir_block; while (counter > 0) { if (((slotptr2->id & ~LAST_LONG_ENTRY_MASK) & 0xff) != counter) @@ -429,7 +428,7 @@ get_long_file_name(fsdata *mydata, int curclust, __u8 *cluster,
/* Save the real directory entry */ realdent = (dir_entry *)slotptr2; - while ((__u8 *)slotptr2 > get_contents_vfatname_block) { + while ((__u8 *)slotptr2 > get_dentfromdir_block) { slotptr2--; slot2str(slotptr2, l_name, &idx); } @@ -454,13 +453,6 @@ get_long_file_name(fsdata *mydata, int curclust, __u8 *cluster, /* Return the real directory entry */ *retdent = realdent;
- if (slotptr2) { - memcpy(get_dentfromdir_block, get_contents_vfatname_block, - mydata->clust_size * mydata->sect_size); - cur_position = (__u8 *)realdent - get_contents_vfatname_block; - *retdent = (dir_entry *) &get_dentfromdir_block[cur_position]; - } - return 0; }
@@ -1024,11 +1016,11 @@ static int do_fat_write(const char *filename, void *buffer, loff_t size, if (disk_read(cursect, (mydata->fatsize == 32) ? (mydata->clust_size) : - PREFETCH_BLOCKS, do_fat_read_at_block) < 0) { + PREFETCH_BLOCKS, get_dentfromdir_block) < 0) { debug("Error: reading rootdir block\n"); goto exit; } - dentptr = (dir_entry *) do_fat_read_at_block; + dentptr = (dir_entry *) get_dentfromdir_block;
name_len = strlen(filename); if (name_len >= VFAT_MAXLEN_BYTES)

On 07/28/2016 12:11 AM, Tien Fong Chee wrote:
Single 64KB get_contents_vfatname_block global variable would be used for all FAT implementation instead of allocating additional two global variables which are get_denfromdir_block and do_fat_read_at_block. This implementation can help in saving up 128KB memory space.
The series,
Tested-by: Stephen Warren swarren@nvidia.com (via DFU's FAT reading/writing on various Tegra; likely primarily FAT rather than VFAT though)
Reviewed-by: Stephen Warren swarren@nvidia.com

Hi,
On Tue, Aug 2, 2016 at 8:53 PM, Stephen Warren swarren@wwwdotorg.org wrote:
On 07/28/2016 12:11 AM, Tien Fong Chee wrote:
Single 64KB get_contents_vfatname_block global variable would be used for all FAT implementation instead of allocating additional two global variables which are get_denfromdir_block and do_fat_read_at_block. This implementation can help in saving up 128KB memory space.
The series,
Tested-by: Stephen Warren swarren@nvidia.com (via DFU's FAT reading/writing on various Tegra; likely primarily FAT rather than VFAT though)
Reviewed-by: Stephen Warren swarren@nvidia.com
I suspect that reading a filename with VFAT entries crossing a cluster boundary in a FAT32 root directory will be broken by this series. I do not have time to test this and other corner cases right now though, but it will be possible in the next few weeks.
Best regards, Benoît

Hi,
On Tue, Aug 2, 2016 at 9:35 PM, Benoît Thébaudeau benoit.thebaudeau.dev@gmail.com wrote:
On Tue, Aug 2, 2016 at 8:53 PM, Stephen Warren swarren@wwwdotorg.org wrote:
On 07/28/2016 12:11 AM, Tien Fong Chee wrote:
Single 64KB get_contents_vfatname_block global variable would be used for all FAT implementation instead of allocating additional two global variables which are get_denfromdir_block and do_fat_read_at_block. This implementation can help in saving up 128KB memory space.
The series,
Tested-by: Stephen Warren swarren@nvidia.com (via DFU's FAT reading/writing on various Tegra; likely primarily FAT rather than VFAT though)
Reviewed-by: Stephen Warren swarren@nvidia.com
I suspect that reading a filename with VFAT entries crossing a cluster boundary in a FAT32 root directory will be broken by this series. I do not have time to test this and other corner cases right now though, but it will be possible in the next few weeks.
I have tested VFAT long filenames with the current implementation on Sandbox. It's completely broken: - There is a length limit somewhere between 111 and 120 characters, instead of 256 characters. Beyond this limit, the files are invisible with ls. - Some filenames are truncated or mixed up between files. I have tested 111-character random filenames for 1000 empty files in the root directory. Most filenames had the expected length, but a few were shorter or longer. - If there are too many files in the root directory, ls hangs.
I am pretty sure that this series introduces some regressions, but they seem to be in corner cases that cannot even be used or tested because of other bugs, so this series might not make this implementation much more broken than it currently is. It's risky, though.
I've quickly looked into TianoCore EDK II. It is so deeply tied to the EFI driver model and APIs that it would be a pain to port to U-Boot. Its FAT module is not designed to be portable beyond EFI. Its build system would complicate things too.
Stephen, according to what you say in test/fs/fat-noncontig-test.sh, your solution to accelerate FatFs seems to be working, even if the author is not interested in it, so maybe it would still be worth maintaining locally in order to have a reliable FAT support, also with a small memory footprint. barebox uses FatFs.
Best regards, Benoît

On 08/13/2016 04:57 PM, Benoît Thébaudeau wrote:
Hi,
On Tue, Aug 2, 2016 at 9:35 PM, Benoît Thébaudeau benoit.thebaudeau.dev@gmail.com wrote:
On Tue, Aug 2, 2016 at 8:53 PM, Stephen Warren swarren@wwwdotorg.org wrote:
On 07/28/2016 12:11 AM, Tien Fong Chee wrote:
Single 64KB get_contents_vfatname_block global variable would be used for all FAT implementation instead of allocating additional two global variables which are get_denfromdir_block and do_fat_read_at_block. This implementation can help in saving up 128KB memory space.
The series,
Tested-by: Stephen Warren swarren@nvidia.com (via DFU's FAT reading/writing on various Tegra; likely primarily FAT rather than VFAT though)
Reviewed-by: Stephen Warren swarren@nvidia.com
I suspect that reading a filename with VFAT entries crossing a cluster boundary in a FAT32 root directory will be broken by this series. I do not have time to test this and other corner cases right now though, but it will be possible in the next few weeks.
I have tested VFAT long filenames with the current implementation on Sandbox. It's completely broken:
- There is a length limit somewhere between 111 and 120 characters,
instead of 256 characters. Beyond this limit, the files are invisible with ls.
- Some filenames are truncated or mixed up between files. I have
tested 111-character random filenames for 1000 empty files in the root directory. Most filenames had the expected length, but a few were shorter or longer.
- If there are too many files in the root directory, ls hangs.
I am pretty sure that this series introduces some regressions, but they seem to be in corner cases that cannot even be used or tested because of other bugs, so this series might not make this implementation much more broken than it currently is. It's risky, though.
I've quickly looked into TianoCore EDK II. It is so deeply tied to the EFI driver model and APIs that it would be a pain to port to U-Boot. Its FAT module is not designed to be portable beyond EFI. Its build system would complicate things too.
Stephen, according to what you say in test/fs/fat-noncontig-test.sh, your solution to accelerate FatFs seems to be working, even if the author is not interested in it, so maybe it would still be worth maintaining locally in order to have a reliable FAT support, also with a small memory footprint. barebox uses FatFs.
Tom, any thoughts here re: the FF FAT implementation again?
BTW, I had some user of FF FAT about the "contiguous read" patch claiming that it caused issues in some cases. I didn't investigate since we'd dropped the idea of using FF FAT. Hopefully I could find the email (or maybe it was a post of the FF forums) again.

On Mon, Aug 15, 2016 at 10:08:37AM -0600, Stephen Warren wrote:
On 08/13/2016 04:57 PM, Benoît Thébaudeau wrote:
Hi,
On Tue, Aug 2, 2016 at 9:35 PM, Benoît Thébaudeau benoit.thebaudeau.dev@gmail.com wrote:
On Tue, Aug 2, 2016 at 8:53 PM, Stephen Warren swarren@wwwdotorg.org wrote:
On 07/28/2016 12:11 AM, Tien Fong Chee wrote:
Single 64KB get_contents_vfatname_block global variable would be used for all FAT implementation instead of allocating additional two global variables which are get_denfromdir_block and do_fat_read_at_block. This implementation can help in saving up 128KB memory space.
The series,
Tested-by: Stephen Warren swarren@nvidia.com (via DFU's FAT reading/writing on various Tegra; likely primarily FAT rather than VFAT though)
Reviewed-by: Stephen Warren swarren@nvidia.com
I suspect that reading a filename with VFAT entries crossing a cluster boundary in a FAT32 root directory will be broken by this series. I do not have time to test this and other corner cases right now though, but it will be possible in the next few weeks.
I have tested VFAT long filenames with the current implementation on Sandbox. It's completely broken:
- There is a length limit somewhere between 111 and 120 characters,
instead of 256 characters. Beyond this limit, the files are invisible with ls.
- Some filenames are truncated or mixed up between files. I have
tested 111-character random filenames for 1000 empty files in the root directory. Most filenames had the expected length, but a few were shorter or longer.
- If there are too many files in the root directory, ls hangs.
I am pretty sure that this series introduces some regressions, but they seem to be in corner cases that cannot even be used or tested because of other bugs, so this series might not make this implementation much more broken than it currently is. It's risky, though.
I've quickly looked into TianoCore EDK II. It is so deeply tied to the EFI driver model and APIs that it would be a pain to port to U-Boot. Its FAT module is not designed to be portable beyond EFI. Its build system would complicate things too.
Stephen, according to what you say in test/fs/fat-noncontig-test.sh, your solution to accelerate FatFs seems to be working, even if the author is not interested in it, so maybe it would still be worth maintaining locally in order to have a reliable FAT support, also with a small memory footprint. barebox uses FatFs.
Tom, any thoughts here re: the FF FAT implementation again?
BTW, I had some user of FF FAT about the "contiguous read" patch claiming that it caused issues in some cases. I didn't investigate since we'd dropped the idea of using FF FAT. Hopefully I could find the email (or maybe it was a post of the FF forums) again.
So it seems like we don't have a lot of good options here. Maybe part of the answer here needs to first be, of these various corner case bad FAT filesystems, can we get test images? I'm assuming that these aren't things that can be easily constructed and thus fit into the current test/fs script. We probably don't want to store these in the main U-Boot repository as they'll take up a lot of space I imagine but we could always put them somewhere and document in the testcases that we write that you need to clone $X and point at it in order to use these tests.
As for using FF FAT or not, yes, I see barebox is using it too, but they also aren't (or, haven't, I forget how often FF FAT updates) re-synced with upstream, but they might have a few useful bugfixes we'd like.
I suppose at the end of the day a benefit of trying FF FAT again would be another community we can engage with wrt "corner case" images. And it would be good to know where FF FAT is wrt the various problems we've had.

On 08/17/2016 09:04 AM, Tom Rini wrote:
On Mon, Aug 15, 2016 at 10:08:37AM -0600, Stephen Warren wrote:
On 08/13/2016 04:57 PM, Benoît Thébaudeau wrote:
Hi,
On Tue, Aug 2, 2016 at 9:35 PM, Benoît Thébaudeau benoit.thebaudeau.dev@gmail.com wrote:
On Tue, Aug 2, 2016 at 8:53 PM, Stephen Warren swarren@wwwdotorg.org wrote:
On 07/28/2016 12:11 AM, Tien Fong Chee wrote:
Single 64KB get_contents_vfatname_block global variable would be used for all FAT implementation instead of allocating additional two global variables which are get_denfromdir_block and do_fat_read_at_block. This implementation can help in saving up 128KB memory space.
The series,
Tested-by: Stephen Warren swarren@nvidia.com (via DFU's FAT reading/writing on various Tegra; likely primarily FAT rather than VFAT though)
Reviewed-by: Stephen Warren swarren@nvidia.com
I suspect that reading a filename with VFAT entries crossing a cluster boundary in a FAT32 root directory will be broken by this series. I do not have time to test this and other corner cases right now though, but it will be possible in the next few weeks.
I have tested VFAT long filenames with the current implementation on Sandbox. It's completely broken:
- There is a length limit somewhere between 111 and 120 characters,
instead of 256 characters. Beyond this limit, the files are invisible with ls.
- Some filenames are truncated or mixed up between files. I have
tested 111-character random filenames for 1000 empty files in the root directory. Most filenames had the expected length, but a few were shorter or longer.
- If there are too many files in the root directory, ls hangs.
I am pretty sure that this series introduces some regressions, but they seem to be in corner cases that cannot even be used or tested because of other bugs, so this series might not make this implementation much more broken than it currently is. It's risky, though.
I've quickly looked into TianoCore EDK II. It is so deeply tied to the EFI driver model and APIs that it would be a pain to port to U-Boot. Its FAT module is not designed to be portable beyond EFI. Its build system would complicate things too.
Stephen, according to what you say in test/fs/fat-noncontig-test.sh, your solution to accelerate FatFs seems to be working, even if the author is not interested in it, so maybe it would still be worth maintaining locally in order to have a reliable FAT support, also with a small memory footprint. barebox uses FatFs.
Tom, any thoughts here re: the FF FAT implementation again?
BTW, I had some user of FF FAT about the "contiguous read" patch claiming that it caused issues in some cases. I didn't investigate since we'd dropped the idea of using FF FAT. Hopefully I could find the email (or maybe it was a post of the FF forums) again.
So it seems like we don't have a lot of good options here. Maybe part of the answer here needs to first be, of these various corner case bad FAT filesystems, can we get test images? I'm assuming that these aren't things that can be easily constructed and thus fit into the current test/fs script. We probably don't want to store these in the main U-Boot repository as they'll take up a lot of space I imagine but we could always put them somewhere and document in the testcases that we write that you need to clone $X and point at it in order to use these tests.
One thing I've had in the back of my mind, but don't expect to actually have time to work on, is a FAT image generator script. It'd take some text file describing the FAT layout/content and generate a FAT image containing it. The text files would be small, hence probably could be included in U-Boot. We'd have many text files, each describing some particular test case. Perhaps the text files could actually be small scripts and hence iterate over a slew of cases. The file content in the generated images could be sourced from /dev/random or similar, and the image generation process could record the path/name/size/crc32/... of each file as the image was generated, for later use by the test. This approach would solve the too-big-to-check-in problem since the images would be generated locally by testers, and hopefully allows generating a slew of images with many corner cases across all of FAT12/16/32/VFAT etc.

On Sonntag, 14. August 2016 00:57:38 CEST Benoît Thébaudeau wrote:
Hi,
On Tue, Aug 2, 2016 at 9:35 PM, Benoît Thébaudeau
benoit.thebaudeau.dev@gmail.com wrote:
On Tue, Aug 2, 2016 at 8:53 PM, Stephen Warren swarren@wwwdotorg.org
wrote:
On 07/28/2016 12:11 AM, Tien Fong Chee wrote:
Single 64KB get_contents_vfatname_block global variable would be used for all FAT implementation instead of allocating additional two global variables which are get_denfromdir_block and do_fat_read_at_block. This implementation can help in saving up 128KB memory space.
The series,
Tested-by: Stephen Warren swarren@nvidia.com (via DFU's FAT reading/writing on various Tegra; likely primarily FAT rather than VFAT though)
Reviewed-by: Stephen Warren swarren@nvidia.com
I suspect that reading a filename with VFAT entries crossing a cluster boundary in a FAT32 root directory will be broken by this series. I do not have time to test this and other corner cases right now though, but it will be possible in the next few weeks.
I have tested VFAT long filenames with the current implementation on Sandbox. It's completely broken:
- There is a length limit somewhere between 111 and 120 characters,
instead of 256 characters. Beyond this limit, the files are invisible with ls.
That one is expected. U-Boot limits the extended name to 9 "slots", each 26 bytes. As filenames are encoded as UTF-16/UCS-2, each ASCII character uses two bytes -> 9 * 26 / 2 = 117.
- Some filenames are truncated or mixed up between files. I have
tested 111-character random filenames for 1000 empty files in the root directory. Most filenames had the expected length, but a few were shorter or longer.
Where there any filenames with characters outside the ASCII range? There may have been some double en-/decoding of UTF-8 vs UTF-16.
Kind regards,
Stefan

Hi,
On Fri, Sep 9, 2016 at 6:34 PM, Brüns, Stefan Stefan.Bruens@rwth-aachen.de wrote:
On Sonntag, 14. August 2016 00:57:38 CEST Benoît Thébaudeau wrote:
On Tue, Aug 2, 2016 at 9:35 PM, Benoît Thébaudeau
benoit.thebaudeau.dev@gmail.com wrote:
On Tue, Aug 2, 2016 at 8:53 PM, Stephen Warren swarren@wwwdotorg.org
wrote:
On 07/28/2016 12:11 AM, Tien Fong Chee wrote:
Single 64KB get_contents_vfatname_block global variable would be used for all FAT implementation instead of allocating additional two global variables which are get_denfromdir_block and do_fat_read_at_block. This implementation can help in saving up 128KB memory space.
The series,
Tested-by: Stephen Warren swarren@nvidia.com (via DFU's FAT reading/writing on various Tegra; likely primarily FAT rather than VFAT though)
Reviewed-by: Stephen Warren swarren@nvidia.com
I suspect that reading a filename with VFAT entries crossing a cluster boundary in a FAT32 root directory will be broken by this series. I do not have time to test this and other corner cases right now though, but it will be possible in the next few weeks.
I have tested VFAT long filenames with the current implementation on Sandbox. It's completely broken:
- There is a length limit somewhere between 111 and 120 characters,
instead of 256 characters. Beyond this limit, the files are invisible with ls.
That one is expected. U-Boot limits the extended name to 9 "slots", each 26 bytes. As filenames are encoded as UTF-16/UCS-2, each ASCII character uses two bytes -> 9 * 26 / 2 = 117.
Indeed. I had only checked VFAT_MAXLEN_BYTES, not VFAT_MAXSEQ.
- Some filenames are truncated or mixed up between files. I have
tested 111-character random filenames for 1000 empty files in the root directory. Most filenames had the expected length, but a few were shorter or longer.
Where there any filenames with characters outside the ASCII range? There may have been some double en-/decoding of UTF-8 vs UTF-16.
No, only alnum for the affected files. There may have been one or two filenames from previous tests outside the ASCII range, though, but I don't think so.
Best regards, Benoît
participants (5)
-
Benoît Thébaudeau
-
Brüns, Stefan
-
Stephen Warren
-
Tien Fong Chee
-
Tom Rini