[U-Boot] [PATCH] lib/hashtable.c: add algorithm for small buffer import

This patch adds a new flag to influence the hashtable internal algorithm for creation size when importing a buffer.
When importing a extremely small buffer (e.g. the default_environment) the current algorithm cuts down the size of hash table to extremely small size. In some cases this may render the device unusable until one saves the environment to non volatile memory and restarts the device.
Signed-off-by: Andreas Bießmann andreas.devel@googlemail.com --- In my case i had to import 5 key/value pairs from default_environment which was about 30 byte buffer. These key/values fit in my hash table but the ethernet driver would like to setenv() another key/value which returned with error.
common/env_common.c | 2 +- include/search.h | 4 +++- lib/hashtable.c | 15 ++++++++++++--- 3 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/common/env_common.c b/common/env_common.c index a415ef8..bd6eae4 100644 --- a/common/env_common.c +++ b/common/env_common.c @@ -188,7 +188,7 @@ void set_default_env(const char *s) }
if (himport((char *)default_environment, - sizeof(default_environment), '\0', 0) == 0) { + sizeof(default_environment), '\0', H_ALG_SMALL_BUF) == 0) { error("Environment import failed: errno = %d\n", errno); } gd->flags |= GD_FLG_ENV_READY; diff --git a/include/search.h b/include/search.h index fccc757..e6cd189 100644 --- a/include/search.h +++ b/include/search.h @@ -102,5 +102,7 @@ extern int himport_r(struct hsearch_data *__htab,
/* Flags for himport() / himport_r() */ #define H_NOCLEAR 1 /* do not clear hash table before importing */ - +#define H_ALG_SMALL_BUF 2 /* use another algorithm for small buffers to + calculate hashtable size. + */ #endif /* search.h */ diff --git a/lib/hashtable.c b/lib/hashtable.c index b747f1f..82a4d00 100644 --- a/lib/hashtable.c +++ b/lib/hashtable.c @@ -636,7 +636,7 @@ int himport_r(struct hsearch_data *htab, }
/* - * Create new hash table (if needed). The computation of the hash + * Create new hash table (if needed). The computation of the hash * table size is based on heuristics: in a sample of some 70+ * existing systems we found an average size of 39+ bytes per entry * in the environment (for the whole key=value pair). Assuming a @@ -644,16 +644,25 @@ int himport_r(struct hsearch_data *htab, * safety margin for any existing environment definitions and still * allow for more than enough dynamic additions. Note that the * "size" argument is supposed to give the maximum enviroment size - * (CONFIG_ENV_SIZE). This heuristics will result in + * (CONFIG_ENV_SIZE). This heuristics will result in * unreasonably large numbers (and thus memory footprint) for * big flash environments (>8,000 entries for 64 KB * envrionment size), so we clip it to a reasonable value * (which can be overwritten in the board config file if * needed). + * + * But in some cases it is necessary to have another algorithm to + * get the size of hash table. Especially for extremely small buffers + * there is the flag H_ALG_SMALL_BUF which takes another factor to + * calculate the hash table size. */
if (!htab->table) { - int nent = size / 8; + int nent; + if (flag & H_ALG_SMALL_BUF) + nent = size / 2; + else + nent = size / 8;
if (nent > CONFIG_ENV_MAX_ENTRIES) nent = CONFIG_ENV_MAX_ENTRIES;

Dear =?UTF-8?q?Andreas=20Bie=C3=9Fmann?=,
In message 1285788486-43901-1-git-send-email-andreas.devel@googlemail.com you wrote:
This patch adds a new flag to influence the hashtable internal algorithm for creation size when importing a buffer.
When importing a extremely small buffer (e.g. the default_environment) the current algorithm cuts down the size of hash table to extremely small size. In some cases this may render the device unusable until one saves the environment to non volatile memory and restarts the device.
I understand your problem, but I don't agree with the approach.
+#define H_ALG_SMALL_BUF 2 /* use another algorithm for small buffers to
calculate hashtable size.
*/
Coding style: incorrect multiline comment.
* Create new hash table (if needed). The computation of the hash
* Create new hash table (if needed). The computation of the hash
- table size is based on heuristics: in a sample of some 70+
- existing systems we found an average size of 39+ bytes per entry
- in the environment (for the whole key=value pair). Assuming a
@@ -644,16 +644,25 @@ int himport_r(struct hsearch_data *htab, * safety margin for any existing environment definitions and still * allow for more than enough dynamic additions. Note that the * "size" argument is supposed to give the maximum enviroment size
* (CONFIG_ENV_SIZE). This heuristics will result in
* (CONFIG_ENV_SIZE). This heuristics will result in
Please don't mess with the white space, especially when you make it worse instead of better.
* unreasonably large numbers (and thus memory footprint) for * big flash environments (>8,000 entries for 64 KB * envrionment size), so we clip it to a reasonable value * (which can be overwritten in the board config file if * needed).
*
* But in some cases it is necessary to have another algorithm to
* get the size of hash table. Especially for extremely small buffers
* there is the flag H_ALG_SMALL_BUF which takes another factor to
* calculate the hash table size.
*/
if (!htab->table) {
int nent = size / 8;
int nent;
if (flag & H_ALG_SMALL_BUF)
nent = size / 2;
else
nent = size / 8;
Did you read the comment above?
With your configuration, importing a 64 kB environment buffer would result in 32 k entries in the hash table. This obviously makes no sense.
I think we should rather make sure that a certain minimum of entries will always be available, for exmaple something like this:
int nent = 64 + size / 8;
or similar.
What do you think?
[Actually I think the current setting (size / 8) is _way_ too conservative in most cases. eventually we'd really be better off with something like "64 + size / 32" or so. I'm interested in feedback - the statistics I have about environment settings (number of entries versus total size) is unfortunately a bit limited, and since most of the boards come from the same hands they follow a common style, which eventually is not what other users do.]
Best regards,
Wolfgang Denk

(resent to list)
Dear Wolfgang Denk,
Am 29.09.2010 um 22:01 schrieb Wolfgang Denk:
Dear =?UTF-8?q?Andreas=20Bie=C3=9Fmann?=,
In message 1285788486-43901-1-git-send-email-andreas.devel@googlemail.com you wrote:
[snip]
* unreasonably large numbers (and thus memory footprint) for * big flash environments (>8,000 entries for 64 KB * envrionment size), so we clip it to a reasonable value * (which can be overwritten in the board config file if * needed).
*
* But in some cases it is necessary to have another algorithm to
* get the size of hash table. Especially for extremely small buffers
* there is the flag H_ALG_SMALL_BUF which takes another factor to
* calculate the hash table size.
*/
if (!htab->table) {
int nent = size / 8;
int nent;
if (flag & H_ALG_SMALL_BUF)
nent = size / 2;
else
nent = size / 8;
Did you read the comment above?
Yes, I did.
With your configuration, importing a 64 kB environment buffer would result in 32 k entries in the hash table.
Well therefore we have another 'algorithm' implemented to cope with this. The flag H_ALG_SMALL_BUF should be set especially when importing a small buffer. Anyhow the maximum limit some lines below will never be exceeded.
This obviously makes no sense.
I think we should rather make sure that a certain minimum of entries will always be available, for exmaple something like this:
int nent = 64 + size / 8;
This sounds also good but why do not calculate as before and after that check some (maybe definable) borders?
How about: int nent = size / 8; if (nent < CONFIG_ENV_MIN_ENTRIES) nent = CONFIG_ENV_MIN_ENTRIES; ...
or similar.
What do you think?
How about my suggestion?
[Actually I think the current setting (size / 8) is _way_ too conservative in most cases. eventually we'd really be better off with something like "64 + size / 32" or so. I'm interested in feedback - the statistics I have about environment settings (number of entries versus total size) is unfortunately a bit limited, and since most of the boards come from the same hands they follow a common style, which eventually is not what other users do.]
Well in most cases the environment needs a static size. The actual size of environment has (in my view) no/small connection to space for environment in NV memory. In most cases the space reserved for environment is way to big cause of sector boundaries. Therefore I think it would meet the needs when we have one (configurable) size for hash table without the calculation.
regards
Andreas Bießmann
Best regards,
Wolfgang Denk
-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de As far as we know, our computer has never had an undetected error. -- Weisert

Dear =?iso-8859-1?Q?Andreas_Bie=DFmann?=,
In message 8AE7E072-8389-49CA-B6C7-6C15C1877625@googlemail.com you wrote:
With your configuration, importing a 64 kB environment buffer would result in 32 k entries in the hash table.
Well therefore we have another 'algorithm' implemented to cope with this. The flag H_ALG_SMALL_BUF should be set especially when importing a small buffer. Anyhow the maximum limit some lines below will never be exceeded.
Well, you were talking about your defualt environment settings only. How big is the envrionment in your persistent storage (flash)? I bet it's at least several KB, resulting in thousands of entries in the hash table.
This obviously makes no sense.
I think we should rather make sure that a certain minimum of entries will always be available, for exmaple something like this:
int nent = 64 + size / 8;
This sounds also good but why do not calculate as before and after that check some (maybe definable) borders?
How about: int nent = size / 8; if (nent < CONFIG_ENV_MIN_ENTRIES) nent = CONFIG_ENV_MIN_ENTRIES;
I cannot really proof it, but I am pretty much convinced that we should start with a non-zero constant and rather use a less steep increase.
Well in most cases the environment needs a static size. The actual size of environment has (in my view) no/small connection to space for environment in NV memory. In most cases the space reserved for environment is way to big cause of sector boundaries. Therefore I think
This is not true. Sector sizes affect only the CONFIG_ENV_SECT_SIZE settings, while the environment size is determined by CONFIG_ENV_SIZE which usually is MUCH smaller.
Example: "TQM5200.h":
#define CONFIG_ENV_SIZE 0x4000 /* 16 k - keep small for fast booting */ ... #define CONFIG_ENV_SECT_SIZE 0x40000
it would meet the needs when we have one (configurable) size for hash table without the calculation.
I disagree. Have a look at the "env import" and "env export" commands and think about what you can do with these - essentially they lift the fix connection to a pre-configured environment storage. Even if you have just 2 or 4 KiB environment settings in flash, you can now just load a file (over network, from USB stick or SDCard etc.) which may contain tons of commands and macro definitions.
Even if you don't have such usage in mind, I do not want to make this impossible by a too limited design.
Best regards,
Wolfgang Denk

Dear Wolfgang Denk,
Am 29.09.2010 um 23:02 schrieb Wolfgang Denk:
Dear =?iso-8859-1?Q?Andreas_Bie=DFmann?=,
In message 8AE7E072-8389-49CA-B6C7-6C15C1877625@googlemail.com you wrote:
With your configuration, importing a 64 kB environment buffer would result in 32 k entries in the hash table.
Well therefore we have another 'algorithm' implemented to cope with this. The flag H_ALG_SMALL_BUF should be set especially when importing a small buffer. Anyhow the maximum limit some lines below will never be exceeded.
Well, you were talking about your defualt environment settings only.
Yes I do. This was the only place I used the newly defined flag for different hash table size calculation. The main problem here is the extremely small buffer of default_environment which will calculate a way to small hash table.
How big is the envrionment in your persistent storage (flash)? I bet it's at least several KB, resulting in thousands of entries in the hash table.
Yes, I understood this. The main aim of my patch was to introduce a switch between two fixed factors. Think about y = x * k with two hard coded k1 and k2 to switch between.
I'd like to make a cut here. My first suggestion has limitations and will bring more patches like this in future. Lets make it more flexible. Also the current implementation has limitations which should be changed.
This obviously makes no sense.
I think we should rather make sure that a certain minimum of entries will always be available, for exmaple something like this:
int nent = 64 + size / 8;
This sounds also good but why do not calculate as before and after that check some (maybe definable) borders?
How about: int nent = size / 8; if (nent < CONFIG_ENV_MIN_ENTRIES) nent = CONFIG_ENV_MIN_ENTRIES;
I cannot really proof it, but I am pretty much convinced that we should start with a non-zero constant and rather use a less steep increase.
Well in most cases the environment needs a static size. The actual size of environment has (in my view) no/small connection to space for environment in NV memory. In most cases the space reserved for environment is way to big cause of sector boundaries. Therefore I think
This is not true. Sector sizes affect only the CONFIG_ENV_SECT_SIZE settings, while the environment size is determined by CONFIG_ENV_SIZE which usually is MUCH smaller.
Well this is new information for me. Most boards I worked with only defined CONFIG_ENV_SIZE which was naturally the size of one sector in NOR flash.
Example: "TQM5200.h":
#define CONFIG_ENV_SIZE 0x4000 /* 16 k - keep small for fast booting */ ... #define CONFIG_ENV_SECT_SIZE 0x40000
it would meet the needs when we have one (configurable) size for hash table without the calculation.
I disagree. Have a look at the "env import" and "env export" commands and think about what you can do with these - essentially they lift the fix connection to a pre-configured environment storage. Even if you have just 2 or 4 KiB environment settings in flash, you can now just load a file (over network, from USB stick or SDCard etc.) which may contain tons of commands and macro definitions.
Ok this is a quite cool feature I had really not in mind.
Even if you don't have such usage in mind, I do not want to make this impossible by a too limited design.
Therefore it is necessary to have a connection between buffer size to import/parse and resulting hash table for environment. Another requirement is a reasonable fixed part. This is needed for rare cases where a small buffer is imported (aka default_environment) which will be expanded later on by setenv().
In my opinion these factors should be definable by user at compile time (but with realistic default values if not defined by user). Therefore the behavior of the hash table size algorithm can be optimized for different use cases. I think about a user defining a lot of keys with small sized values and another user defining only some keys with very large sized values. It is really difficult to find factors for the calculation which meet all users needs.
regards
Andreas Bießmann

This patch adds a new config parameter for adjusting the calculation of hash table size when importing a buffer.
When importing a extremely small buffer (e.g. the default_environment) the old calculation generated a hash table which could hold at most the buffer content but no more entires.
The new calculation add a fixed number of entries to the result to fit better for small import buffers. This amount may be configured by the user in board file to adjust the behaviour.
Signed-off-by: Andreas Bießmann andreas.devel@googlemail.com --- v2: implement as suggested by Wolfgang
lib/hashtable.c | 12 ++++++++---- 1 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/lib/hashtable.c b/lib/hashtable.c index b747f1f..57802cf 100644 --- a/lib/hashtable.c +++ b/lib/hashtable.c @@ -45,6 +45,9 @@ # include <linux/string.h> #endif
+#ifndef CONFIG_ENV_MIN_ENTRIES /* minimum number of entries */ +#define CONFIG_ENV_MIN_ENTRIES 64 +#endif #ifndef CONFIG_ENV_MAX_ENTRIES /* maximum number of entries */ #define CONFIG_ENV_MAX_ENTRIES 512 #endif @@ -647,13 +650,14 @@ int himport_r(struct hsearch_data *htab, * (CONFIG_ENV_SIZE). This heuristics will result in * unreasonably large numbers (and thus memory footprint) for * big flash environments (>8,000 entries for 64 KB - * envrionment size), so we clip it to a reasonable value - * (which can be overwritten in the board config file if - * needed). + * envrionment size), so we clip it to a reasonable value. + * On the other hand we need to add some more entries for free + * space when importing very small buffers. Both boundaries can + * be overwritten in the board config file if needed. */
if (!htab->table) { - int nent = size / 8; + int nent = CONFIG_ENV_MIN_ENTRIES + size / 8;
if (nent > CONFIG_ENV_MAX_ENTRIES) nent = CONFIG_ENV_MAX_ENTRIES;

Dear =?UTF-8?q?Andreas=20Bie=C3=9Fmann?=,
In message 1285966262-73388-1-git-send-email-andreas.devel@googlemail.com you wrote:
This patch adds a new config parameter for adjusting the calculation of hash table size when importing a buffer.
When importing a extremely small buffer (e.g. the default_environment) the old calculation generated a hash table which could hold at most the buffer content but no more entires.
The new calculation add a fixed number of entries to the result to fit better for small import buffers. This amount may be configured by the user in board file to adjust the behaviour.
Signed-off-by: Andreas Bießmann andreas.devel@googlemail.com
v2: implement as suggested by Wolfgang
lib/hashtable.c | 12 ++++++++---- 1 files changed, 8 insertions(+), 4 deletions(-)
Applied, thanks.
Best regards,
Wolfgang Denk

Dear Wolfgang Denk,
Am 06.10.2010 um 22:47 schrieb Wolfgang Denk:
Dear =?UTF-8?q?Andreas=20Bie=C3=9Fmann?=,
In message 1285966262-73388-1-git-send-email-andreas.devel@googlemail.com you wrote:
lib/hashtable.c | 12 ++++++++---- 1 files changed, 8 insertions(+), 4 deletions(-)
Applied, thanks.
Best regards,
Wolfgang Denk
couldn't find this patch mainline. Where did it go?
regards
Andreas Bießmann

Dear =?iso-8859-1?Q?Andreas_Bie=DFmann?=,
In message F2DA4AEA-D25A-4355-9727-715E47AE8AFB@googlemail.com you wrote:
lib/hashtable.c | 12 ++++++++---- 1 files changed, 8 insertions(+), 4 deletions(-)
Applied, thanks.
Best regards,
Wolfgang Denk
couldn't find this patch mainline. Where did it go?
commit fc5fc76bdad14425e3743e1494c9e444570df1be Author: Andreas Bießmann andreas.devel@googlemail.com Date: Fri Oct 1 22:51:02 2010 +0200
lib/hashtable.c: add CONFIG_ENV_MIN_ENTRIES
See http://git.denx.de/?p=u-boot.git;a=commit;h=fc5fc76bdad14425e3743e1494c9e444...
Best regards,
Wolfgang Denk
participants (2)
-
Andreas Bießmann
-
Wolfgang Denk