Re: [U-Boot] [RFC][Timer API] Revised Specification - Implementation details

27 May 2011

      Hi Bill,
On Fri, May 27, 2011 at 11:26 AM, J. William Campbell
jwilliamcampbell@comcast.net wrote:
...
On 5/26/2011 4:28 PM, Graeme Russ wrote:
...
...
Why mess around with bit shifting (which you would then have to kludge into
your platform code) when carting around a 64-bit value is relatively cheap,
transparent and portable (on all supported up-to-date tool chains)?
I really STRONGLY disagree with this statement. If you actually needed 64
bit variables, fine, use them. But as I have already shown, you do not need
them in general.  We are computing a 32 bit result. There is some entropy
argument that says you shouldn't need 64 bits to do so. Another way to look
at it is that converting the top 32 bit word and the bottom 32 bit word to
ms separately can be easier than doing them both together at once.  However,
as we will see below, I do agree we need two 32 bit words to make this
process go smoothly. I just don't agree that they should/will constitute a
64 bit binary word. See below.
...
...
...
- void timer_isr()
    - Optional (particularly if the tick counter rollover period is
      exceptionally long, which is usually the case for native 64-bit tick
      counters)
    - Simply calls sync_timebase()
    - For platforms without any tick counter, this can implement one
      (but accuracy will be harmed due to usage of disable_interrupts()
      and enable_interrupts() in U-Boot)
So to get the new API up and running, only two functions are mandatory:
get_ticks() which reads the hardware tick counter and deals with any 'funny
stuff' including rollovers, short timers (12-bit for example), composite
counters (16-bit 0-999 microsecond + 16-bit millisecond) and maintains a
'clean' 64-bit tick counter which rolls over from all 1's to all 0's.
The
I think it is the task of get_ticks to return the hardware tick counter as
an increasing counter, period. The counter may wrap at some final count
that is not all ones. That is ok. Sync_timebase deals with the rollovers if
The hardware tick counter may; the 64-bit software tick counter maintained
by get_ticks() may not
...
...
necessary. get_ticks is very lightweight. get_ticks should deal with
decrementing counters by returning the complement of the counter. The sc520
case is a bit more complex if you intend to use the 0-999 and 16 bit
millisec registers, in that you do need to add them to the previous value to
As I mentioned in another post, this is a problem for the platform
maintainer and is abstracted away through the platform-specific
implementation of get_ticks()
...
make an increasing counter. Sync_timebase "likes" short counters in that
they are easy to convert to millisec and tick remainders.
The compiler should handle using 64-bit rather than 32-bit transparently
True enough.  But you don't need 64 bit variables at this point; two 32 bit
ones work just fine, in fact better in most cases.
Remember, we are not dealing with a high performance OS here. The primary
goal is portability - Performance optimisations (which do not break
portability) can be performed later
...
...
...
...
64-bit tick counter does not need to be reset to zero ever (even on
startup - sync_timebase takes care of all the details)

True, but sync_timebase does have to be initialized (as does the timer
itself in most cases, so this is not a restriction).
This can be done in timer_init() via a call to sync_timebase() after the
timer has been configured. This should bring everything into line
...
...
ticks_per_millisecond() simply returns the number of ticks in a
millisecond

This may be as simple as:

/* CONFIG_SYS_TICK_PER_MS: board-configured tick rate, in ticks per ms */
static inline u64 ticks_per_millisecond(void)
{
       return CONFIG_SYS_TICK_PER_MS;
}
But it may be trickier if you have a programmable tick frequency
You will have to call the routine that initializes sync_timebase. This
routine should have a name, like void init_sync_timebase(void)?
...
The optional timer ISR is required if the tick counter has a short rollover
duration (short is up to you - 1 second is short, 1 hour might be, 1
century is not)
Regards,
Graeme
It is probably true that sync_timebase should have a parameter flag. The
reason is that if the timer isr is called only when the timer wraps, then
the calls to sync_timebase may be slightly more than a full timer period
apart (due to interrupt latency). Therefore, when the timer difference is
computed, if the current update is due to a wrap AND the previous update is
due to a wrap, the difference should be approximately 1 wrap. If it comes up
real short, you must add a wrap. This isn't necessary if the routine is
called more often than once per wrap. Also, when sync_timebase is called in
timer_isr() MUST be called more often than the rollover period of the
underlying hardware tick counter - This is a requirement
The equality case can be made to work.  If the extension of the counter is
done in the interrupt routine, not in get_ticks, get_ticks just needs to
read the msb of the counter, read the lsb of the counter, then verify that
the msb has not changed. If you have interrupts that work, that is the
easiest way to go. If the lsb of the counter represents 1 ms or less,
you can just drop it (equivalent to what the PPC does now). If the
interrupt is slower than that, you must use at least part of the LSB. If you
don't have interrupts, the point is moot.
So now we have a complicated ISR and a complicated get_ticks() and you
have to change get_ticks() when you decide to implement an ISR
...
...
...
get_timer, you must first disable interrupts and then enable interrupts
after sync_timebase returns
Why? - get_ticks() provides an atomic read of the hardware tick counter.
If get_ticks() needs to disable and enable interrupts to do so, that is a
problem for the platform maintainer
Admittedly, sync_timebase() will not be re-entrant, but how will it ever
be called concurrently? - Ah, I see - a call to get_timer() interrupted
by the timer ISR :)
Yes, that is the problem. I have come to the view that  two 32 bit words are
the best approach. Note that the lsb may actually not fill the full 32 bits.
Urghhh
...
The top 32 bits are the rollover count and the bottom 32 bits are the
current counter. If the counter is a full 32 bits, so much the better.
Ahhhhh - Let's keep it that way
...
Again, one could put them together inside the interrupt routine, but it is
easier to check for a changed value if you don't do this. Otherwise, you
have to check both words. It also makes the isr faster. It is just an
As I said before - Simple First, Fast Later
...
increment of the overflow counter, like the PPC is now. I happen to think it
is easier to convert the two 32 bit words to milliseconds one at a time, but
if you feel you must use 64 bit words, that is fine too. Just remember, the
counter does not always fill the entire bottom word.
Urghhh
...
In cases where there are no interrupts, get_ticks has to detect that the
timer has "backed up" and increment the overflow counter itself, unless the
counter is 64 bits to begin with and overflow is impossible anyway.
get_ticks should NOT try to detect overflows if interrupts are available. If
it got both words before an interrupt happened, the answer is correct. If it
got an interrupt in between fetching the words, the event will be detected
and the value re-fetched. All sync_timebase would do now is convert the
returned value to milliseconds.
So, if you have a 64 bit hardware counter, get_ticks reads and returns it.
Else if you have interrupts, get_ticks reads the overflow counter into the
msb. Next, it reads the hardware timer into the lsb. If the counter is a
down counter, the lsb is set to the counter max minus the value read. The
msb is then checked to make sure it hasn't changed; if it has, repeat the
process. All the interrupt routine does is increase the overflow count.
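As a rough sketch of the interrupt-driven case just described (read msb,
read lsb, re-check msb), assuming a 32-bit up-counter. The names
sim_hw_counter, read_hw_counter(), struct tick_pair and get_ticks_pair()
are hypothetical stand-ins so that this builds on a host; they are not
part of any proposed API:

```c
#include <stdint.h>

/* Hypothetical stand-ins so the sketch runs on a host; on real hardware
 * read_hw_counter() would read the timer register instead. */
static volatile uint32_t sim_hw_counter;  /* simulated 32-bit up-counter */
static volatile uint32_t overflow_count;  /* written only by the ISR */

static uint32_t read_hw_counter(void)
{
	return sim_hw_counter;
}

/* The ISR does nothing except count rollovers. */
void timer_isr(void)
{
	overflow_count++;
}

struct tick_pair {
	uint32_t msb;  /* rollover count */
	uint32_t lsb;  /* hardware counter value */
};

/* Read msb, read lsb, then re-read msb; retry if a rollover intervened. */
struct tick_pair get_ticks_pair(void)
{
	struct tick_pair t;

	do {
		t.msb = overflow_count;
		t.lsb = read_hw_counter();
	} while (t.msb != overflow_count);

	return t;
}
```

For a down counter, the lsb would be the counter max minus the value
read, as described above.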
If you don't have interrupts get_ticks reads the hardware counter into the
lsb. If the counter is a down counter, the lsb is set to the counter max
minus the value read. If the lsb is less than it was in the previous call
to get_ticks, the overflow counter is increased. get_ticks then loads the
overflow counter into the msb.
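The no-interrupt case could look roughly like the following, again for
an up-counter. The names sim_hw_counter, read_hw_counter(), struct
tick_pair and get_ticks_polled() are made up for illustration. Note that
this only works if it is called at least once per rollover period, since
a full wrap between two calls goes unnoticed:

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware timer register. */
static volatile uint32_t sim_hw_counter;

static uint32_t read_hw_counter(void)
{
	return sim_hw_counter;
}

struct tick_pair {
	uint32_t msb;  /* rollover count, maintained in software */
	uint32_t lsb;  /* hardware counter value */
};

static uint32_t overflow_count;  /* rollovers seen so far */
static uint32_t prev_lsb;        /* counter value at the previous call */

/* Polled variant: a counter that "backed up" means it rolled over. */
struct tick_pair get_ticks_polled(void)
{
	struct tick_pair t;

	t.lsb = read_hw_counter();
	if (t.lsb < prev_lsb)
		overflow_count++;
	prev_lsb = t.lsb;
	t.msb = overflow_count;

	return t;
}
```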
sync_timebase converts the msb and lsb into millisec. It may do this by a 64
bit divide, or some shifting to align the lsb with the msb and then a 64 bit
divide, or a bunch of 32 bit fractional multiplies, or any such approach
that works.
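To illustrate only the first option (a single 64 bit divide), here is a
minimal sketch. The function name ticks_to_ms() and the wrap_period
parameter are assumptions for the example; wrap_period is the number of
ticks per rollover, which covers the case where the counter does not
fill the whole bottom word:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert (msb, lsb) to milliseconds with one 64-bit divide.
 * wrap_period is the number of ticks per rollover: 1ULL << 32 for a
 * full 32-bit counter, less for a counter with a lower final count.
 */
uint32_t ticks_to_ms(uint32_t msb, uint32_t lsb,
		     uint64_t wrap_period, uint32_t ticks_per_ms)
{
	uint64_t ticks;

	assert(ticks_per_ms != 0);
	assert(lsb < wrap_period);

	/* Total ticks = whole rollovers plus the current counter value */
	ticks = (uint64_t)msb * wrap_period + lsb;

	/* The result is truncated to 32 bits, so the ms value itself wraps */
	return (uint32_t)(ticks / ticks_per_ms);
}
```

For example, with a counter that wraps every 1000 ticks at 10 ticks per
ms, msb = 2 and lsb = 500 is 2500 ticks, i.e. 250 ms. The plain `/`
operator is used here for a host build; in-tree code on 32-bit targets
may need a division helper instead.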
How does that sound?
The fact that you have described three different implementations of
get_ticks() with two of these differentiated between whether you have
interrupts or not immediately suggests this solution is inherently more
complex and less maintainable.
Let's say you have a platform with a 32-bit tick counter running at a
reasonably long rollover time so you decide not to use interrupts. Then
you create a new platform with the same tick counter, but it runs much
faster and you realise you need to implement the interrupt routine to
make get_timer() work for long enough periods - Fine, you add an ISR
for the new platform that calls sync_timebase - No other changes are
required.
The last thing we want is for the 64-bit tick counter to be conceptually
different across platforms
I just realised - the ISR _does not need to call the sync_timebase at all_
The ISR only needs to call get_ticks(), so it will be fast anyway
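A minimal sketch of that idea, assuming a free-running 32-bit up-counter
that wraps from all 1's to all 0's. sim_hw_counter and read_hw_counter()
are hypothetical stand-ins for the hardware, and the sketch ignores the
re-entrancy question raised earlier (the ISR interrupting a get_ticks()
call in progress), which a real platform would have to handle:

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware timer register. */
static volatile uint32_t sim_hw_counter;

static uint32_t read_hw_counter(void)
{
	return sim_hw_counter;
}

static uint64_t tick_count;  /* 'clean' 64-bit software tick counter */
static uint32_t last_hw;     /* hardware value at the previous call */

/* Extend the 32-bit hardware counter to 64 bits by accumulating deltas. */
uint64_t get_ticks(void)
{
	uint32_t now = read_hw_counter();

	/* Unsigned subtraction gives the elapsed ticks even across a wrap */
	tick_count += (uint32_t)(now - last_hw);
	last_hw = now;

	return tick_count;
}

/* The ISR only has to make sure get_ticks() runs once per rollover. */
void timer_isr(void)
{
	(void)get_ticks();
}
```

With this shape, adding an ISR to a platform does not change get_ticks()
at all; the ISR just guarantees it is called often enough.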
Regards,
Graeme