[U-Boot] [PATCH] Avoid use of divides in print_size.

Modification of print_size to avoid use of divides and especially long long divides. Keep the binary scale factor in terms of bit shifts instead. This should be faster, since the previous code gave the compiler no clues that the divides were always powers of two, preventing optimisation.
Signed-off-by: Nick Thompson <nick.thompson@ge.com>
---
This patch should make print_size a little faster, but perhaps nobody cares about that too much. What it also does, though, is re-enable U-Boot linking for ARM with standard toolchains (e.g. CodeSourcery and MontaVista).
 lib/display_options.c |   14 ++++++++------
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/lib/display_options.c b/lib/display_options.c
index 86df05d..636916d 100644
--- a/lib/display_options.c
+++ b/lib/display_options.c
@@ -46,13 +46,14 @@ int display_options (void)
 void print_size(unsigned long long size, const char *s)
 {
 	unsigned long m = 0, n;
+	unsigned long long f;
 	static const char names[] = {'E', 'P', 'T', 'G', 'M', 'K'};
-	unsigned long long d = 1ULL << (10 * ARRAY_SIZE(names));
+	unsigned long d = 10 * ARRAY_SIZE(names);
 	char c = 0;
 	unsigned int i;
 
-	for (i = 0; i < ARRAY_SIZE(names); i++, d >>= 10) {
-		if (size >= d) {
+	for (i = 0; i < ARRAY_SIZE(names); i++, d -= 10) {
+		if (size >> d) {
 			c = names[i];
 			break;
 		}
@@ -63,11 +64,12 @@ void print_size(unsigned long long size, const char *s)
 		return;
 	}
 
-	n = size / d;
+	n = size >> d;
+	f = size & ((1ULL << d) - 1);
 
 	/* If there's a remainder, deal with it */
-	if(size % d) {
-		m = (10 * (size - (n * d)) + (d / 2) ) / d;
+	if (f) {
+		m = (10ULL * (f + (1 << (d - 1)))) >> d;
 
 		if (m >= 10) {
 			m -= 10;
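
For readers following the patch, the substitution it relies on can be sketched in isolation: for a divisor that is a power of two, 2^d, a right shift by d gives the quotient and a mask of the low d bits gives the remainder. The helper names below are hypothetical and are not part of U-Boot; this is only an illustration of the identity, assuming 0 <= d < 64.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not U-Boot code. */

/* Quotient: equal to size / (1ULL << d) for 0 <= d < 64. */
static uint64_t whole_units(uint64_t size, unsigned int d)
{
	return size >> d;
}

/* Remainder: equal to size % (1ULL << d) for 0 <= d < 64. */
static uint64_t remainder_units(uint64_t size, unsigned int d)
{
	return size & ((1ULL << d) - 1);
}
```

For example, with size = 1440 and d = 10 (KiB), the quotient is 1 and the remainder is 416, the same values that 1440 / 1024 and 1440 % 1024 produce, without a 64-bit division routine being required.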

On Mon, May 10, 2010 at 4:51 AM, Nick Thompson <nick.thompson@ge.com> wrote:
Modification of print_size to avoid use of divides and especially long long divides. Keep the binary scale factor in terms of bit shifts instead. This should be faster, since the previous code gave the compiler no clues that the divides were always powers of two, preventing optimisation.
Signed-off-by: Nick Thompson <nick.thompson@ge.com>
This code almost works. It seems to have trouble printing fractional values. Using this loop:
	unsigned int i;

	for (i = 0; i < 63; i++)
		print_size(3ULL << i, "\n");
I get this output. Notice that it rounds 1.5 to 2 on sizes less than a terabyte.
3 Bytes
6 Bytes
12 Bytes
24 Bytes
48 Bytes
96 Bytes
192 Bytes
384 Bytes
768 Bytes
2 KiB
3 KiB
6 KiB
12 KiB
24 KiB
48 KiB
96 KiB
192 KiB
384 KiB
768 KiB
2 MiB
3 MiB
6 MiB
12 MiB
24 MiB
48 MiB
96 MiB
192 MiB
384 MiB
768 MiB
2 GiB
3 GiB
6 GiB
12 GiB
24 GiB
48 GiB
96 GiB
192 GiB
384 GiB
768 GiB
1.5 TiB
3 TiB
6 TiB
12 TiB
24 TiB
48 TiB
96 TiB
192 TiB
384 TiB
768 TiB
1.5 PiB
3 PiB
6 PiB
12 PiB
24 PiB
48 PiB
96 PiB
192 PiB
384 PiB
768 PiB
1.5 EiB
3 EiB
6 EiB
12 EiB

Here's a more revealing test:
	unsigned int i;

	for (i = 0; i < 60; i++) {
		unsigned long long l = 45ULL << i;
		printf("%llu - ", l);
		print_size(l, "\n");
	}
prints:
45 - 45 Bytes
90 - 90 Bytes
180 - 180 Bytes
360 - 360 Bytes
720 - 720 Bytes
1440 - 1.9 KiB
2880 - 3.3 KiB
5760 - 6.1 KiB
11520 - 11.7 KiB
23040 - 23 KiB
46080 - 45 KiB
92160 - 90 KiB
184320 - 180 KiB
368640 - 360 KiB
737280 - 720 KiB
1474560 - 1.9 MiB
2949120 - 3.3 MiB
5898240 - 6.1 MiB
11796480 - 11.7 MiB
23592960 - 23 MiB
47185920 - 45 MiB
94371840 - 90 MiB
188743680 - 180 MiB
377487360 - 360 MiB
754974720 - 720 MiB
1509949440 - 1.9 GiB
3019898880 - 3.3 GiB
6039797760 - 6.1 GiB
12079595520 - 11.7 GiB
24159191040 - 23 GiB
48318382080 - 45 GiB
96636764160 - 90 GiB
193273528320 - 180 GiB
386547056640 - 360 GiB
773094113280 - 720 GiB
1546188226560 - 1.4 TiB
3092376453120 - 2.8 TiB
6184752906240 - 5.6 TiB
12369505812480 - 11.2 TiB
24739011624960 - 22.5 TiB
49478023249920 - 45 TiB
98956046499840 - 90 TiB
197912092999680 - 180 TiB
395824185999360 - 360 TiB
791648371998720 - 720 TiB
1583296743997440 - 1.4 PiB
3166593487994880 - 2.8 PiB
6333186975989760 - 5.6 PiB
12666373951979520 - 11.2 PiB
25332747903959040 - 22.5 PiB
50665495807918080 - 45 PiB
101330991615836160 - 90 PiB
202661983231672320 - 180 PiB
405323966463344640 - 360 PiB
810647932926689280 - 720 PiB
1621295865853378560 - 1.4 EiB
3242591731706757120 - 2.8 EiB
6485183463413514240 - 5.6 EiB
12970366926827028480 - 11.2 EiB
7493989779944505344 - 6.5 EiB
That last one is probably an overflow.

On 10/05/10 20:25, Timur Tabi wrote:
Here's a more revealing test:
	unsigned int i;

	for (i = 0; i < 60; i++) {
		unsigned long long l = 45ULL << i;
		printf("%llu - ", l);
		print_size(l, "\n");
	}
prints:
45 - 45 Bytes
90 - 90 Bytes
180 - 180 Bytes
360 - 360 Bytes
720 - 720 Bytes
1440 - 1.9 KiB
2880 - 3.3 KiB
5760 - 6.1 KiB
[snip]
Ahh, your testing foo is strong. That is a better test than mine. I have submitted a new patch which, with your test, gives (value - old new):
45 - 45 Bytes    45 Bytes
90 - 90 Bytes    90 Bytes
180 - 180 Bytes    180 Bytes
360 - 360 Bytes    360 Bytes
720 - 720 Bytes    720 Bytes
1440 - 1.4 KiB    1.4 KiB
2880 - 2.8 KiB    2.8 KiB
5760 - 5.6 KiB    5.6 KiB
11520 - 11.3 KiB    11.3 KiB
23040 - 22.5 KiB    22.5 KiB
46080 - 45 KiB    45 KiB
92160 - 90 KiB    90 KiB
184320 - 180 KiB    180 KiB
368640 - 360 KiB    360 KiB
737280 - 720 KiB    720 KiB
1474560 - 1.4 MiB    1.4 MiB
2949120 - 2.8 MiB    2.8 MiB
5898240 - 5.6 MiB    5.6 MiB
11796480 - 11.3 MiB    11.3 MiB
23592960 - 22.5 MiB    22.5 MiB
47185920 - 45 MiB    45 MiB
94371840 - 90 MiB    90 MiB
188743680 - 180 MiB    180 MiB
377487360 - 360 MiB    360 MiB
754974720 - 720 MiB    720 MiB
1509949440 - 1.4 GiB    1.4 GiB
3019898880 - 2.8 GiB    2.8 GiB
6039797760 - 5.6 GiB    5.6 GiB
12079595520 - 11.3 GiB    11.3 GiB
24159191040 - 22.5 GiB    22.5 GiB
48318382080 - 45 GiB    45 GiB
96636764160 - 90 GiB    90 GiB
193273528320 - 180 GiB    180 GiB
386547056640 - 360 GiB    360 GiB
773094113280 - 720 GiB    720 GiB
1546188226560 - 1.4 TiB    1.4 TiB
3092376453120 - 2.8 TiB    2.8 TiB
6184752906240 - 5.6 TiB    5.6 TiB
12369505812480 - 11.3 TiB    11.2 TiB
24739011624960 - 22.5 TiB    22.5 TiB
49478023249920 - 45 TiB    45 TiB
98956046499840 - 90 TiB    90 TiB
197912092999680 - 180 TiB    180 TiB
395824185999360 - 360 TiB    360 TiB
791648371998720 - 720 TiB    720 TiB
1583296743997440 - 1.4 PiB    1.4 PiB
3166593487994880 - 2.8 PiB    2.8 PiB
6333186975989760 - 5.6 PiB    5.6 PiB
12666373951979520 - 11.3 PiB    11.2 PiB
25332747903959040 - 22.5 PiB    22.5 PiB
50665495807918080 - 45 PiB    45 PiB
101330991615836160 - 90 PiB    90 PiB
202661983231672320 - 180 PiB    180 PiB
405323966463344640 - 360 PiB    360 PiB
810647932926689280 - 720 PiB    720 PiB
1621295865853378560 - 1.4 EiB    1.4 EiB
3242591731706757120 - 2.8 EiB    2.8 EiB
6485183463413514240 - 5.6 EiB    5.6 EiB
12970366926827028480 - 11.3 EiB    11.2 EiB
7493989779944505344 - 6.5 EiB    6.5 EiB
I was using 5 as the round-up rather than 0.5, due to some extraneous parentheses: the half unit was being multiplied by 10 along with the remainder (10 * 0.5).
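
The effect of those parentheses can be sketched by comparing the expression from the first patch with one possible correction. The function names are hypothetical, and the corrected form is only one reading of the fix described here, not necessarily the exact code in the follow-up patch. Both helpers use 1ULL for the half-unit term, which differs from the posted `1 << (d - 1)`: shifting a plain int by the width of int or more is undefined in C, so the sketch sidesteps that for large d. Both assume 1 <= d <= 60 and f < 2^d.

```c
/* Hypothetical helpers for illustration only; not U-Boot code. */

/*
 * Expression shape from the first patch: half a unit is added to the
 * remainder f *before* scaling by 10, so the bias is 5 tenths, not 0.5.
 */
static unsigned long tenths_as_posted(unsigned long long f, unsigned int d)
{
	return (unsigned long)((10ULL * (f + (1ULL << (d - 1)))) >> d);
}

/*
 * One possible correction: scale the remainder to tenths first, then
 * add half a unit so the final shift rounds to the nearest tenth.
 */
static unsigned long tenths_rounded(unsigned long long f, unsigned int d)
{
	return (unsigned long)((10ULL * f + (1ULL << (d - 1))) >> d);
}
```

With 1440 bytes (f = 416, d = 10), the first form yields 9 and the second yields 4, matching the "1.9 KiB" and "1.4 KiB" lines in the test output. With a remainder of exactly half a unit (f = 512, d = 10), the first form yields 10, which print_size then carries into the integer part; that is consistent with 1.5 KiB being printed as 2 KiB.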
That last one is probably an overflow.
I believe so, yes.
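
As a rough check of that overflow, 45 needs 6 bits, so 45ULL << 59 would need 65 bits and the top bit is discarded. What remains is 13ULL << 59, i.e. 7493989779944505344, which is 6.5 * 2^60 and so prints as 6.5 EiB. A test loop could detect this case with a helper along the following lines; the name is hypothetical and the sketch assumes a 64-bit unsigned long long and a shift count below 64.

```c
/* Hypothetical helper for illustration only; not U-Boot code. */

/*
 * Returns nonzero if (value << shift) would discard set bits of a
 * 64-bit value. Assumes shift < 64.
 */
static int shift_loses_bits(unsigned long long value, unsigned int shift)
{
	/* A zero shift never loses bits; also avoids shifting by 64. */
	if (shift == 0)
		return 0;

	/* Any bit in the top 'shift' positions would be shifted out. */
	return (value >> (64 - shift)) != 0;
}
```

Under these assumptions, shift_loses_bits(45, 58) is false and shift_loses_bits(45, 59) is true, so only the final iteration of the test loop wraps.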
Thank you for supplying the test code.
Regards, Nick.