[U-Boot] [PATCH] tools: gen_eth_addr: remove getpid() operation for the random seed

As 'time(0) | getpid()' sometimes get same value. That depends on the value of getpid(). So that is not a expected behavior. We expect different value for the seed when when run it in many times.
So this patch remove the getpid(), just use the time(0) as the seed.
Signed-off-by: Josh Wu josh.wu@atmel.com ---
tools/gen_eth_addr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/gen_eth_addr.c b/tools/gen_eth_addr.c index bf9d935..53b023a 100644 --- a/tools/gen_eth_addr.c +++ b/tools/gen_eth_addr.c @@ -15,7 +15,7 @@ main(int argc, char *argv[]) { unsigned long ethaddr_low, ethaddr_high;
- srand(time(0) | getpid()); + srand(time(0));
/* * setting the 2nd LSB in the most significant byte of

Dear Josh Wu,
In message 1442373526-842-1-git-send-email-josh.wu@atmel.com you wrote:
As 'time(0) | getpid()' sometimes get same value. That depends on the value of getpid().
I think removing some "random input" from the way how we compute the seed is a bad idea.
So that is not a expected behavior. We expect different value for the seed when when run it in many times.
What is your execution environment? In any sane OS it is higly unlikely that you will see the same or even similar PIDs for successive runs of the program - each run will start a new process, which will get a new PID.
[On a mostly idle Linux system (4.1.6 kernel) I see zero dupes in a set of 30,000 invocations of getpid().]
One can argue if ORing the values is the most clever idea, or if for example ADDing them would result in more "randomness". But completely removing the pid() is bad - any parallel runs of the program on any machines with synchronized times would predictably result in the same seeds which is definitely worse behaviour than what we have now.
So this patch remove the getpid(), just use the time(0) as the seed.
NAK. This is a bad idea.
Best regards,
Wolfgang Denk

Hi, Wolfgang
Thanks for the reply.
On 9/16/2015 2:37 PM, Wolfgang Denk wrote:
Dear Josh Wu,
In message 1442373526-842-1-git-send-email-josh.wu@atmel.com you wrote:
As 'time(0) | getpid()' sometimes get same value. That depends on the value of getpid().
I think removing some "random input" from the way how we compute the seed is a bad idea.
So that is not a expected behavior. We expect different value for the seed when when run it in many times.
What is your execution environment? In any sane OS it is higly unlikely that you will see the same or even similar PIDs for successive runs of the program - each run will start a new process, which will get a new PID.
my system is Ubuntu 14.04
#uname -a Linux melon 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13 19:36:28 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Following is my test history:
➜ tools date && ./gen_eth_addr Wed Sep 16 14:48:53 CST 2015 4a:c3:21:45:17:b2 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:48:56 CST 2015 a6:29:4b:0b:e6:d0 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:02 CST 2015 d2:41:66:54:64:aa ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:06 CST 2015 2a:58:1d:b0:f0:c5 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:13 CST 2015 1e:8e:6f:0e:16:b8 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:19 CST 2015 56:4f:58:67:ad:30 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:21 CST 2015 2e:53:29:97:6a:8a ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:26 CST 2015 d2:41:66:54:64:aa ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:35 CST 2015 d2:41:66:54:64:aa
➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:50 CST 2015 92:33:16:3f:0a:56 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:58 CST 2015 92:33:16:3f:0a:56
In above commands, I have two duplicated eth addr: 92:33:16:3f:0a:56 d2:41:66:54:64:aa
[On a mostly idle Linux system (4.1.6 kernel) I see zero dupes in a set of 30,000 invocations of getpid().]
One can argue if ORing the values is the most clever idea, or if for example ADDing them would result in more "randomness".
Sure. The ORing seems has big chance to get same result in my machine.
But completely removing the pid() is bad - any parallel runs of the program on any machines with synchronized times would predictably result in the same seeds which is definitely worse behaviour than what we have now.
I understand your concern. My intention is make it harder to generate the duplicated result.
Maybe we can ORing the MSB of time(0)? I'll investigate it little more.
So this patch remove the getpid(), just use the time(0) as the seed.
NAK. This is a bad idea.
Best regards,
Wolfgang Denk
Best Regards, Josh Wu

On 9/16/2015 3:08 PM, Josh Wu wrote:
Hi, Wolfgang
Thanks for the reply.
On 9/16/2015 2:37 PM, Wolfgang Denk wrote:
Dear Josh Wu,
In message 1442373526-842-1-git-send-email-josh.wu@atmel.com you wrote:
As 'time(0) | getpid()' sometimes get same value. That depends on the value of getpid().
I think removing some "random input" from the way how we compute the seed is a bad idea.
So that is not a expected behavior. We expect different value for the seed when when run it in many times.
What is your execution environment? In any sane OS it is higly unlikely that you will see the same or even similar PIDs for successive runs of the program - each run will start a new process, which will get a new PID.
my system is Ubuntu 14.04
#uname -a Linux melon 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13 19:36:28 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Following is my test history:
➜ tools date && ./gen_eth_addr Wed Sep 16 14:48:53 CST 2015 4a:c3:21:45:17:b2 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:48:56 CST 2015 a6:29:4b:0b:e6:d0 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:02 CST 2015 d2:41:66:54:64:aa ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:06 CST 2015 2a:58:1d:b0:f0:c5 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:13 CST 2015 1e:8e:6f:0e:16:b8 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:19 CST 2015 56:4f:58:67:ad:30 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:21 CST 2015 2e:53:29:97:6a:8a ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:26 CST 2015 d2:41:66:54:64:aa ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:35 CST 2015 d2:41:66:54:64:aa
➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:50 CST 2015 92:33:16:3f:0a:56 ➜ tools date && ./gen_eth_addr Wed Sep 16 14:49:58 CST 2015 92:33:16:3f:0a:56
In above commands, I have two duplicated eth addr: 92:33:16:3f:0a:56 d2:41:66:54:64:aa
[On a mostly idle Linux system (4.1.6 kernel) I see zero dupes in a set of 30,000 invocations of getpid().]
One can argue if ORing the values is the most clever idea, or if for example ADDing them would result in more "randomness".
Sure. The ORing seems has big chance to get same result in my machine.
But completely removing the pid() is bad - any parallel runs of the program on any machines with synchronized times would predictably result in the same seeds which is definitely worse behaviour than what we have now.
I understand your concern. My intention is make it harder to generate the duplicated result.
Maybe we can ORing the MSB of time(0)? I'll investigate it little more.
I test again, here is a duplicated case:
➜ tools date && ./gen_eth_addr Wed Sep 16 15:21:30 CST 2015 pid: 0x5aa4, time: 0x55f9187a 86:77:6e:cf:6b:16 ➜ tools date && ./gen_eth_addr Wed Sep 16 15:21:34 CST 2015 pid: 0x5aa6, time: 0x55f9187e 86:77:6e:cf:6b:16
In above both cases, the random seed is same: (last 4 bits) 4 | a = e 6 | e = e
So this patch remove the getpid(), just use the time(0) as the seed.
NAK. This is a bad idea.
How about use 'time(0) + (getpid() << 8)'? my reason list in below: 1. ADD will make the result is more unlikely than OR 2. (<< 8) means larger time difference.
Best Regards, Josh Wu

Dear Josh,
In message 55F91588.3040305@atmel.com you wrote:
In above commands, I have two duplicated eth addr: 92:33:16:3f:0a:56 d2:41:66:54:64:aa
Agreed. Randomness is really poor; for a sequence of 1000 invocations of gen_eth_addr in a shell loop I would only gt 124 different MAC addresses:
-> for i in {1..1000} ; do ./gen_eth_addr ; done >/tmp/0 -> sort -u /tmp/0 >/tmp/1 ; wc -l /tmp/[01] 1000 /tmp/0 124 /tmp/1
In a second run, even only 41 :-(
But without the getpid() part, it gets even worse - the same loop would produce only 15 different addresses!
Changing the '|' into a '+' would reliably generate 1000 different MAC addresses.
I understand your concern. My intention is make it harder to generate the duplicated result.
I understand this, and agree that we should implement such a fix.
Maybe we can ORing the MSB of time(0)? I'll investigate it little more.
The following patch appears to work fine for me. Maybe you can test it?
diff --git a/tools/gen_eth_addr.c b/tools/gen_eth_addr.c index bf9d935..834163a 100644 --- a/tools/gen_eth_addr.c +++ b/tools/gen_eth_addr.c @@ -15,7 +15,7 @@ main(int argc, char *argv[]) { unsigned long ethaddr_low, ethaddr_high;
- srand(time(0) | getpid()); + srand(time(0) + getpid());
/* * setting the 2nd LSB in the most significant byte of
Thanks.
Best regards,
Wolfgang Denk

Hi, Wolfgang, Andreas
Thanks for the testing and point out the wrongness in my patch.
On 9/16/2015 5:15 PM, Wolfgang Denk wrote:
Dear Josh,
In message 55F91588.3040305@atmel.com you wrote:
In above commands, I have two duplicated eth addr: 92:33:16:3f:0a:56 d2:41:66:54:64:aa
Agreed. Randomness is really poor; for a sequence of 1000 invocations of gen_eth_addr in a shell loop I would only gt 124 different MAC addresses:
-> for i in {1..1000} ; do ./gen_eth_addr ; done >/tmp/0 -> sort -u /tmp/0 >/tmp/1 ; wc -l /tmp/[01] 1000 /tmp/0 124 /tmp/1
In a second run, even only 41 :-(
But without the getpid() part, it gets even worse - the same loop would produce only 15 different addresses!
Changing the '|' into a '+' would reliably generate 1000 different MAC addresses.
I understand your concern. My intention is make it harder to generate the duplicated result.
I understand this, and agree that we should implement such a fix.
Maybe we can ORing the MSB of time(0)? I'll investigate it little more.
The following patch appears to work fine for me. Maybe you can test it?
yes, this one works well. if you don't mind I want to sent v2 patch, which is like:
- srand(time(0) | getpid()); + srand(time(0) + (getpid() << 8));
diff --git a/tools/gen_eth_addr.c b/tools/gen_eth_addr.c index bf9d935..834163a 100644 --- a/tools/gen_eth_addr.c +++ b/tools/gen_eth_addr.c @@ -15,7 +15,7 @@ main(int argc, char *argv[]) { unsigned long ethaddr_low, ethaddr_high;
- srand(time(0) | getpid());
srand(time(0) + getpid());
/*
- setting the 2nd LSB in the most significant byte of
Thanks.
Best regards,
Wolfgang Denk
Best Regards, Josh Wu

Hi Josh,
On 09/16/2015 05:18 AM, Josh Wu wrote:
As 'time(0) | getpid()' sometimes get same value. That depends on the value of getpid(). So that is not a expected behavior. We expect different value for the seed when when run it in many times.
I don't think your change made it better. Here is a snippet from a run of time(NULL) and getpid():
---8<--- time: 1442389450; pid: 11632; time | pid: 1442397690; time: 1442389450; pid: 11633; time | pid: 1442397691; time: 1442389450; pid: 11634; time | pid: 1442397690; time: 1442389450; pid: 11635; time | pid: 1442397691; time: 1442389450; pid: 11636; time | pid: 1442397694; time: 1442389450; pid: 11637; time | pid: 1442397695; time: 1442389450; pid: 11638; time | pid: 1442397694; time: 1442389450; pid: 11639; time | pid: 1442397695; time: 1442389450; pid: 11640; time | pid: 1442397690; time: 1442389450; pid: 11641; time | pid: 1442397691; time: 1442389450; pid: 11642; time | pid: 1442397690; time: 1442389450; pid: 11643; time | pid: 1442397691; --->8---
While time(NULL) is stable, getpid() is incrementing by one. As you may expect the OR'ed value is oscillating and the values almost the same. So calling gen_eth_addr three times within the same second you will get two time the same MAC.
So this patch remove the getpid(), just use the time(0) as the seed.
So let's see the effect of your change ...
The output of gen_eth_addr at the current ToT:
% RUN=0; while [ $RUN -lt 10000 ]; do /tmp/build-uboot-test/tools/gen_eth_addr; RUN=$(($RUN+1)); done | sort | uniq | wc -l 254
With your change applied:
% RUN=0; while [ $RUN -lt 10000 ]; do /tmp/build-uboot-test/tools/gen_eth_addr; RUN=$(($RUN+1)); done | sort | uniq | wc -l 10
Another approach would be to change the algorithm (OR the values) here. A short test showed that using XOR could be a solution:
---8<--- time: 1442389450; pid: 11632; time ^ pid: 1442394298; time: 1442389450; pid: 11633; time ^ pid: 1442394299; time: 1442389450; pid: 11634; time ^ pid: 1442394296; time: 1442389450; pid: 11635; time ^ pid: 1442394297; time: 1442389450; pid: 11636; time ^ pid: 1442394302; time: 1442389450; pid: 11637; time ^ pid: 1442394303; time: 1442389450; pid: 11638; time ^ pid: 1442394300; time: 1442389450; pid: 11639; time ^ pid: 1442394301; time: 1442389450; pid: 11640; time ^ pid: 1442394290; time: 1442389450; pid: 11641; time ^ pid: 1442394291; time: 1442389450; pid: 11642; time ^ pid: 1442394288; time: 1442389450; pid: 11643; time ^ pid: 1442394289; --->8---
It is the same input but none of the outputs is the same value.
The XOR approach applied to gen_eth_addr:
% RUN=0; while [ $RUN -lt 10000 ]; do /tmp/build-uboot-test/tools/gen_eth_addr; RUN=$(($RUN+1)); done | sort | uniq | wc -l 9988
Andreas
Signed-off-by: Josh Wu josh.wu@atmel.com
tools/gen_eth_addr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/gen_eth_addr.c b/tools/gen_eth_addr.c index bf9d935..53b023a 100644 --- a/tools/gen_eth_addr.c +++ b/tools/gen_eth_addr.c @@ -15,7 +15,7 @@ main(int argc, char *argv[]) { unsigned long ethaddr_low, ethaddr_high;
- srand(time(0) | getpid());
srand(time(0));
/*
- setting the 2nd LSB in the most significant byte of

Hi, Andreas
On 9/16/2015 4:23 PM, Andreas Bießmann wrote:
Hi Josh,
On 09/16/2015 05:18 AM, Josh Wu wrote:
As 'time(0) | getpid()' sometimes get same value. That depends on the value of getpid(). So that is not a expected behavior. We expect different value for the seed when when run it in many times.
I don't think your change made it better. Here is a snippet from a run of time(NULL) and getpid():
---8<--- time: 1442389450; pid: 11632; time | pid: 1442397690; time: 1442389450; pid: 11633; time | pid: 1442397691; time: 1442389450; pid: 11634; time | pid: 1442397690; time: 1442389450; pid: 11635; time | pid: 1442397691; time: 1442389450; pid: 11636; time | pid: 1442397694; time: 1442389450; pid: 11637; time | pid: 1442397695; time: 1442389450; pid: 11638; time | pid: 1442397694; time: 1442389450; pid: 11639; time | pid: 1442397695; time: 1442389450; pid: 11640; time | pid: 1442397690; time: 1442389450; pid: 11641; time | pid: 1442397691; time: 1442389450; pid: 11642; time | pid: 1442397690; time: 1442389450; pid: 11643; time | pid: 1442397691; --->8---
While time(NULL) is stable, getpid() is incrementing by one. As you may expect the OR'ed value is oscillating and the values almost the same. So calling gen_eth_addr three times within the same second you will get two time the same MAC.
So this patch remove the getpid(), just use the time(0) as the seed.
So let's see the effect of your change ...
The output of gen_eth_addr at the current ToT:
% RUN=0; while [ $RUN -lt 10000 ]; do /tmp/build-uboot-test/tools/gen_eth_addr; RUN=$(($RUN+1)); done | sort | uniq | wc -l 254
With your change applied:
% RUN=0; while [ $RUN -lt 10000 ]; do /tmp/build-uboot-test/tools/gen_eth_addr; RUN=$(($RUN+1)); done | sort | uniq | wc -l 10
Another approach would be to change the algorithm (OR the values) here. A short test showed that using XOR could be a solution:
---8<--- time: 1442389450; pid: 11632; time ^ pid: 1442394298; time: 1442389450; pid: 11633; time ^ pid: 1442394299; time: 1442389450; pid: 11634; time ^ pid: 1442394296; time: 1442389450; pid: 11635; time ^ pid: 1442394297; time: 1442389450; pid: 11636; time ^ pid: 1442394302; time: 1442389450; pid: 11637; time ^ pid: 1442394303; time: 1442389450; pid: 11638; time ^ pid: 1442394300; time: 1442389450; pid: 11639; time ^ pid: 1442394301; time: 1442389450; pid: 11640; time ^ pid: 1442394290; time: 1442389450; pid: 11641; time ^ pid: 1442394291; time: 1442389450; pid: 11642; time ^ pid: 1442394288; time: 1442389450; pid: 11643; time ^ pid: 1442394289; --->8---
It is the same input but none of the outputs is the same value.
The XOR approach applied to gen_eth_addr:
% RUN=0; while [ $RUN -lt 10000 ]; do /tmp/build-uboot-test/tools/gen_eth_addr; RUN=$(($RUN+1)); done | sort | uniq | wc -l 9988
How about using ADD ? When I change to 'time + (pid << 8)', the result is better. And the pid change will has more impact in time range.
➜ tools time RUN=0; while [ $RUN -lt 10000 ]; do ./gen_eth_addr_add_shift 1; RUN=$(($RUN+1)); done | sort | uniq | wc -l 10000
The disadvantage is it might cost more time.
Best Regards, Josh Wu
Andreas
Signed-off-by: Josh Wu josh.wu@atmel.com
tools/gen_eth_addr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/gen_eth_addr.c b/tools/gen_eth_addr.c index bf9d935..53b023a 100644 --- a/tools/gen_eth_addr.c +++ b/tools/gen_eth_addr.c @@ -15,7 +15,7 @@ main(int argc, char *argv[]) { unsigned long ethaddr_low, ethaddr_high;
- srand(time(0) | getpid());
srand(time(0));
/*
- setting the 2nd LSB in the most significant byte of

Hi Josh,
On 09/16/2015 11:12 AM, Josh Wu wrote:
On 9/16/2015 4:23 PM, Andreas Bießmann wrote:
On 09/16/2015 05:18 AM, Josh Wu wrote:
As 'time(0) | getpid()' sometimes get same value. That depends on the value of getpid(). So that is not a expected behavior. We expect different value for the seed when when run it in many times.
I don't think your change made it better. Here is a snippet from a run of time(NULL) and getpid():
---8<--- time: 1442389450; pid: 11632; time | pid: 1442397690; time: 1442389450; pid: 11633; time | pid: 1442397691; time: 1442389450; pid: 11634; time | pid: 1442397690; time: 1442389450; pid: 11635; time | pid: 1442397691; time: 1442389450; pid: 11636; time | pid: 1442397694; time: 1442389450; pid: 11637; time | pid: 1442397695; time: 1442389450; pid: 11638; time | pid: 1442397694; time: 1442389450; pid: 11639; time | pid: 1442397695; time: 1442389450; pid: 11640; time | pid: 1442397690; time: 1442389450; pid: 11641; time | pid: 1442397691; time: 1442389450; pid: 11642; time | pid: 1442397690; time: 1442389450; pid: 11643; time | pid: 1442397691; --->8---
While time(NULL) is stable, getpid() is incrementing by one. As you may expect the OR'ed value is oscillating and the values almost the same. So calling gen_eth_addr three times within the same second you will get two time the same MAC.
So this patch remove the getpid(), just use the time(0) as the seed.
So let's see the effect of your change ...
The output of gen_eth_addr at the current ToT:
% RUN=0; while [ $RUN -lt 10000 ]; do /tmp/build-uboot-test/tools/gen_eth_addr; RUN=$(($RUN+1)); done | sort | uniq | wc -l 254
With your change applied:
% RUN=0; while [ $RUN -lt 10000 ]; do /tmp/build-uboot-test/tools/gen_eth_addr; RUN=$(($RUN+1)); done | sort | uniq | wc -l 10
Another approach would be to change the algorithm (OR the values) here. A short test showed that using XOR could be a solution:
---8<--- time: 1442389450; pid: 11632; time ^ pid: 1442394298; time: 1442389450; pid: 11633; time ^ pid: 1442394299; time: 1442389450; pid: 11634; time ^ pid: 1442394296; time: 1442389450; pid: 11635; time ^ pid: 1442394297; time: 1442389450; pid: 11636; time ^ pid: 1442394302; time: 1442389450; pid: 11637; time ^ pid: 1442394303; time: 1442389450; pid: 11638; time ^ pid: 1442394300; time: 1442389450; pid: 11639; time ^ pid: 1442394301; time: 1442389450; pid: 11640; time ^ pid: 1442394290; time: 1442389450; pid: 11641; time ^ pid: 1442394291; time: 1442389450; pid: 11642; time ^ pid: 1442394288; time: 1442389450; pid: 11643; time ^ pid: 1442394289; --->8---
It is the same input but none of the outputs is the same value.
The XOR approach applied to gen_eth_addr:
% RUN=0; while [ $RUN -lt 10000 ]; do /tmp/build-uboot-test/tools/gen_eth_addr; RUN=$(($RUN+1)); done | sort | uniq | wc -l 9988
How about using ADD ? When I change to 'time + (pid << 8)', the result is better. And the pid change will has more impact in time range.
➜ tools time RUN=0; while [ $RUN -lt 10000 ]; do ./gen_eth_addr_add_shift 1; RUN=$(($RUN+1)); done | sort | uniq | wc -l 10000
I'm fine with that too. All I'd like to say with my answer was that adding getpid() was for some reason and removing it is worse than having it OR'ed.
The disadvantage is it might cost more time.
I don't think this count's here. This tool would be used to generate a temporary MAC (or more than one) for testing purposes. So getting a bunch of MAC's can last some time, but should provide unique ones.
Andreas
participants (3)
-
Andreas Bießmann
-
Josh Wu
-
Wolfgang Denk