
On 5/26/2011 11:54 PM, Graeme Russ wrote:
On Fri, May 27, 2011 at 4:33 PM, J. William Campbell jwilliamcampbell@comcast.net wrote:
On 5/26/2011 9:33 PM, Graeme Russ wrote:
Hi Bill,
<snip>
[massive snip]
OK, you have my ears pricked - Can you give me code samples for:
- get_ticks()
- sync_timbase() (no need to implement the whole lot if that is too much effort right now)
- timer_isr()
that handle the following hardware tick counter scenarios:
a) 64-bit up counter b) 64-bit down counter
Hi Graeme,
c) 32-bit up counter, wraps at 65000
Do you mean 32 bits or 16 bits? doesn't make much difference, but just checking.
d) 16-bit microsecond up counter 0-999 which wraps and triggers a 16-bit millisecond up counter. Reading milliseconds latched microseconds and clears milliseconds (look in arch/x86/cpu/sc520/timer.c for example) e) 24-bit counter occupying bits 2-25 of a 32-bit word (just to be difficult) f) Any other option anyone cares to throw ;)
All of these options must be covered using:
- Minimal global data (we would like it to work before relocation, but not mandatory - GD footprint would be nice)
- All use the same sync_timebase function
- Demonstrate using an ISR NOT synced to the hardware tick counter and an ISR that is
- Parameters to get_ticks() and sync_timer() are permitted, but not for timer_isr() (naturally)
OK! Once again, I accept the challenge. One Caveat. My real work has been sliding due to the time I have been spending on this. I am flying from San Francisco to Sydney tonight , (to work, not to play), so I will be off-grid for 14 hours+. You will not get this code for a few days, like probably 3 days. I have looked at the requirements, and I see no real problems that I don't know how to solve.
I don't' see any reason to push this down to a lower level. It is just one more thing to get messed up across implementations.
Agreed
detection in the non-interrupt case to sync_timebase as well. Sync_timebase can also invert the down-counting counters, removing that from get_ticks. The wrap detection code can be #ifdef out if one is using interrupts and
Urghh - Why are you adding unnecessary ugliness - #ifdef in the middle of code is usually a sign you are doing something wrong
As I said, this is an optional optimization. I do not agree that an #ifdef in the middle of code indicates you have a bad design. Lots and Lots of ifdefs certainly indicates a bad design. An ifdef to eliminate code if some option is not selected is hardly such a strange thing, especially only a single #ifdef. However, feel free to not have it if you so desire.
OK, I'll let this slide for the moment - please include in above example
Will Do.
offended by it's presence. Thanks for pointing this out and compelling me to reduce the number of cases! Making get_ticks more lightweight is a good idea in my opinion.
Lets say you have a platform with a 32-bit tick counter running at a reasonably long rollover time so you decide not to use interrupts. Then you create a new platform with the same tick counter, but it runs much faster and you realise you need to implement the interrupt routine to make get_timer() work for long enough periods - Fine, you add an ISR for the new platform that calls sync_timebase - No other changes are required.
The last thing we want is for the 64-bit tick counter to be conceptually different across platforms
I just realised - the ISR _does not need to call the sync_timebase at all_ The ISR only needs to call get_ticks(), so it will be fast anyway
The problem is that the way we previously detected wrapping does not work if the interrupt rate is == to the counter wrap time, which it essentially always is. If get_ticks is trying to update the wrap count when an interrupt
Is it, always, on every platform?
Yes, pretty much. You get a terminal count/counter underflow interrupt and that is it.
Not on sc520 - The micro/millisecond counter cannot be used to driver an interrupt - you need to setup a seperate timer. I think the x86 internal performance counters are the same
What is true, as you have stated, is that the micro/millisecond counter on the sc520 does not interrupt at all. Nor do the x86 performance counters. The x86 performance counters are a non-problem because they are 64 bits long. We don't need interrupts for them. Now, if you choose to use the sc520 micro/millisecond counter, then you need another source of interrupts. Due to the fact that reading the sc520 counter resets it, we must accumulate the elapsed time in software. That means the interrupt routine must do a bit more work, but it also allows reading the counters in non-interrupt code (with interrupts disabled) to not mess up the accumulated count. We don't detect rollover in the sc520 counters, we just read and accumulate the value. So no problem there.
comes in, it will do it wrong. If the interrupt rate was faster, it would work, because get_ticks would always know the correct answer. Consider the following. Get ticks is called by the counter overflowing (which resets it to 0) and executes with the counter value at 0. Later, at the next rollover, with no intervening calls to get ticks, the interrupt routine calls get ticks. Assuming negligible interrupt latency, get_ticks just sees the counter is still 0, so it will not detect a wrap condition. So now you loose
But wait a minute, don't you KNOW that the interrupt gets triggered on a rollover so therefore there MUST have been a rollover, therefore there has been 2^32 (or 2^16 or 2^whatever) ticks which you can just subtract from the last known tick count (which would be zero if there had been no get_timer calls in between)
Except if this interrupt was delayed because get_ticks turned off interrupts, get_ticks may have already compensated the number. When we then get the (delayed) interrupt, we will do it twice.
That would mean get_timer() got called EXACTLY on the rollover - A race condition that should be avoided. But still, a race can still occur through using disable/enable interrupts which could push the timer isr out
a period. I thought by passing in a flag we could force get_ticks to do the right thing in this case, but since we must disable interrupts when we call get ticks from the get_timer routine, the outer call to get ticks may have already detected the rollover and the flag will force an additional rollover to be detected. It is a classic race condition. If we detect rollover only
Aha! - In nearly all situations, a race is better avoided than run!
Yes, that is why the new approach does NOT turn off interrupts when get_ticks is called, but rather loops if the overflow count was changed while reading the count lsb. That ensures get_ticks always gets both words either before the counter wrapped or after the counter wrapped, but not halfway in between. Some ISYNCs may be required to make sure there are no outstanding changes pending to memory, depending on your architecture.
OK, lets have a closer look
in the interrupt routine, we do not need to protect the rollover variable from access by the main program, so we can be sure the results of get_ticks is coherent. If we try do do it in both places, we will have trouble. If the interrupt occurred at a faster rate, like say twice the rollover, we wouldn't have a problem. But it doesn't. In most cases it happens at the rollover rate, or just a bit more sometimes due to interrupt latency not
Again does it really - for all arches?
Yep, pretty much. Almost all timers I know of only give one interrupt per cycle.
A bold assumption - and one that is wrong ;)
Ok, except for timers that don't interrupt at all and need an outside assist! In those cases, you need to interrupt faster than the counter period, which as your original constraint. But in that case, the interrupt is coming from elsewhere, not the timer being processed.
being a constant. It may appear to happen at somewhat less than a rollover as well, if the interrupt latency was long at time n-1 and short at time n. I think this problem can be overcome in the old design by keeping track of whether the last update was by a non-interrupt or interrupt call to get_ticks , but I think the new design is better anyway.
I disagree - The whole point of a HAL is to Abstract the Hardware
The HAL should abstract what needs abstracting, and no more. We are really talking about literally 4 lines of code here, but I think those four lines certainly belong in sync_timer. All the math in one place is a good thing. In most cases, get_ticks becomes trivial. Only in cases where the timer is wrong endian or embedded in another bit field (both things being possible but not all that common) does get_ticks need to do anything other than just read the registers and loop until the msb is stable. Simple here is good, because this is what new implementers need to add/change. The less here, the better! In fact, the looping until the msb is stable can also be moved up to sync_timer. That is even simpler.
OK, you got me - "show me the money^H^H^H^H^Hcode"
Wilco! You will get them by Monday Afternoon, California time, or sooner if I can stay awake long enough.
Best Regards, Bill Campbell
Regards,
Graeme