
Hi Bill,
On Fri, May 27, 2011 at 11:26 AM, J. William Campbell jwilliamcampbell@comcast.net wrote:
On 5/26/2011 4:28 PM, Graeme Russ wrote:
Why mess around with bit shifting (which you would then have to cludge into your platform code) when carting around a 64-bit value is relatively cheap, transparent and poratble (all all supported up-to-date tool chains)
I really STRONGLY disagree with this statement. If you actually needed 64 bit variables, fine use them. But as I have already shown, you do not need them in general. We are computing a 32 bit result. There is some entropy argument that says you shouldn't need 64 bits to do so. Another way to look at it is that converting the top 32 bit word and the bottom 32 bit word to ms separately can be easier than doing them both together at once. However, as we will see below, I do agree we need two 32 bit words to make this process go smoothly. I just don't agree that they should/will constitute a 64 bit binary word. See below.
- void timer_isr() - Optional (particularly if tick counter rollover period is exceptionally log which is usually the case for native 64-bit tick counters) - Simply calls sync_timebase() - For platforms without any tick counter, this can implement one (but accuracy will be harmed due to usage of disable_interrupts() and enable_interrupts() in U-Boot
So to get the new API up and running, only two functions are mandatory:
get_ticks() which reads the hardware tick counter and deals with any 'funny stuff' including rollovers, short timers (12-bit for example), composite counters (16-bit 0-999 microsecond + 16-bit millisecond) and maintains a 'clean' 64-bit tick counter which rolls over from all 1's to all 0's. The
I think it is the task of get_ticks to return the hardware tick counter as an increasing counter, period. The counter may wrap at some final count that is not all ones. That is ok. Sync_timebase deals with the rollovers if
The hardware tick counter may, the 64-bit software tick counter maintained by get_ticks() may not
necessary. get_ticks is very lightweight. get_ticks should deal with decrementing counters by returning the complement of the counter. The sc520 case is a bit more complex if you intend to use the 0-999 and 16 bit millisec registers, in that you do need to add them to the previous value to
As I mentioned in another post, this is a problem for the platform maintainer and is abstracted away throught the platform specific implementation of get_ticks()
make an increasing counter. Sync_timebase "likes" short counters in that they are easy to convert to millisec and tick remainders.
The compiler should handle using 64-bit rather than 32-bit transparently
True enough. But you don't need 64 bit variables at this point two 32 bit ones work just fine, in fact better in most cases.
Remember, we are not dealing with a high performance OS here. The primary goal is portability - Performance optimisations (which do not break portability) can be performed later
64-bit tick counter does not need to be reset to zero ever (even on startup
- sync_timebase tacks care of all the details)
True, but sync_timebase does have to be initialized (as does the timer itself in most cases, so this is not a restriction).
This can be done in timer_init() via a call to sync_timebase() after the timer has been configured. This should bring everything into line
ticks_per_millisecond() simply return the number of ticks in a millisecond
- This may as simple as:
inline u64 ticks_per_millisecond(void) { return CONFIG_SYS_TICK_PER_MS; }
But it may be trickier if you have a programmable tick frequency
You will have to call the routine that initializes sync_timebase. This routine should have a name, like void init_sync_timebase(void)?
The optional timer ISR is required if the tick counter has a short roll over duration (short is up to you - 1 second is short, 1 hour might be, 1 century is not)
Regards,
Graeme
It is probably true that sync_timebase should have a parameter flag. The reason is that if the timer isr is called only when the timer wraps, then the calls to sync_timebase may be slightly more than a full timer period apart. (due to interrupt latency). Therefore, when the timer difference is computed, if the current update is due to a wrap AND the previous update is due to a wrap, the difference should be approximately 1 wrap. If it comes up real short, you must add a wrap. This isn't necessary if the routine is called more often than once per wrap. Also, when sync_timebase is called in
timer_isr() MUST be called more often than the rollover period of the underlying hardware tick counter - This is a requirement
The equality case can be made to work. If the extension of the counter is done in the interrupt routine, not in get_ticks, get_ticks just needs to read the msb of the counter, read the lsb of the counter, then verify that the msb has not changed. If you have interrupts that work, that is the easiest way to go. If the lsb of the counter has represents 1 ms or less, you can just drop it (equivalent to the what the PPC does now). If the interrupt is slower than that, you must use at least part of the LSB. If you don't have interrupts, the point is moot.
So now we have a complicated ISR and a complicated get_ticks() and you have to change get_ticks() when you decide to implement an ISR
get_timer, you must first disable interrupts and then enable interrupts after sync_timebase returns
Why? - get_ticks() provides an atomic read of the hardware tick counter. If get_ticks() needs to disable and enable interrupts to do so, that is a problem for the platform maintainer
Admittedly, sync_timebase() will not be re-entrant, but how will it ever be called concurrently? - Ah, I see - a call to get_timer() interrupted by the timer ISR :)
Yes, that is the problem. I have come to the view that two 32 bit words are the best approach. Note that the lsb may actually not fill the full 32 bits.
Urghhh
The top 32 bits are the rollover count and the bottom 32 bits are the current counter. If the counter is a full 32 bits, so much the better.
Ahhhhh - Lets keep it that way
Again, one could put them together inside the interrupt routine , but it is easier to check for a changed value if you don't do this. Otherwise, you have to check both words. It also makes the isr faster. It is just an
As I said before - Simple First, Fast Later
increment of the overflow counter, like the PPC is now. I happen to think it is easier to convert the two 32 bit words to milliseconds one at a time, but if you feel you must use 64 bit words, that is fine too. Just remember, the counter does not always fill the entire bottom word.
Urghhh
In cases where there are no interrupts, get_ticks has to detect that the timer has "backed up" and increment the overflow counter itself, unless the counter is 64 bits to begin with and overflow is impossible anyway. get_ticks should NOT try to detect overflows if interrupts are available. If it got both words before an interrupt happened, the answer is correct. If it got an interrupt in between fetching the words, the event will be detected and the value re-fetched. All sync_timebase would do now is convert the returned value to milliseconds.
So, if you have a 64 bit hardware counter, get_ticks reads and returns it. Else if you have interrupts, get_ticks reads the overflow counter into the msb. Next, it reads the hardware timer into the lsb. If the counter is a down counter, the lsb is = to the counter max - the lsb. The msb is then checked to make sure it hasn't changed, if it has, repeat the process. All the interrupt routine does is increase the overflow count. If you don't have interrupts get_ticks reads the hardware counter into the lsb. If the counter is a down counter, the lsb is = to the counter max - the lsb. If the lsb is less than it was in the previous call to get ticks, the overflow counter is increased. get_ticks then loads the overflow counter into the msb.
sync_timebase converts the msb and lsb into millisec. It may do this by a 64 bit divide, or some shifting to align the lsb with then msb and the a 64 bit divide, or a bunch of 32 bit fractional multiplies, or any such approach that works.
How does that sound?
The fact that you have described three different implementations of get_ticks() with two of these differentiated between whether you have interrupts or not immediately suggests this solution is inherently more complex and less maintainable.
Lets say you have a platform with a 32-bit tick counter running at a reasonably long rollover time so you decide not to use interrupts. Then you create a new platform with the same tick counter, but it runs much faster and you realise you need to implement the interrupt routine to make get_timer() work for long enough periods - Fine, you add an ISR for the new platform that calls sync_timebase - No other changes are required.
The last thing we want is for the 64-bit tick counter to be conceptually different across platforms
I just realised - the ISR _does not need to call the sync_timebase at all_ The ISR only needs to call get_ticks(), so it will be fast anyway
Regards,
Graeme