[time-nuts] Linux TSC clocksource on multi-core systems

Sat Apr 26 18:27:26 UTC 2014

Hi time-nuts,

I've been reading the list for a while and I realize most of the discussion is a lot lower level than this, but I'm not sure where else to ask.  I probably don't have a complete understanding of the problem, and maybe I just need a nudge in the right direction.  My goal is nothing more than to experiment and learn.

Goal: I want to use the 'best' clock source to keep the operating system clock as accurate as possible, and to maximize the resolution of the clock.  On modern general-purpose computers like the ones most of us use, this is the CPU Time Stamp Counter (TSC).  I'm using Linux, but insights from other software are also useful.

The problem as I understand it:
Operating systems like Linux maintain a system clock to satisfy gettimeofday calls, and that clock is what NTP ultimately tries to keep accurate.  If the system clock is consistently slow or consistently fast, this is easy to correct, but if the counter used as the reference goes backwards or is otherwise unreliable, NTP can only get so close.  The operating system can use the ACPI power management timer, the HPET, the PIT, or any other hardware-specific counter that is available.  The precision and resolution of the system clock depend on the resolution of the counter backing it (please correct me if I misunderstood this).  A 3 GHz processor's TSC provides more resolution than the 1.19 MHz PIT, so finer adjustments are possible and the time can be kept more accurate.  The processor's TSC (Time Stamp Counter) is usually the highest-resolution counter available, and it also appears to have the lowest overhead to read, so it's preferred when possible.  Sometimes (most of the time?) the TSC is not reliable: it may not run at a constant speed, or it may stop completely in certain power-saving modes.  Linux at least tries to test for this and will report "tsc unstable" messages to the user.  The problem is worse with multiple cores or multiple CPU packages since the counters might not be synchronized, but today you'd be hard pressed to find an end-user computer that doesn't have multiple logical CPUs.
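
To put rough numbers on the resolution comparison above, here is a small Python sketch that computes the length of one tick for a few counters.  The PIT and ACPI PM timer rates are their usual nominal values; the TSC rate is just the 3 GHz example from this message, not a measurement, and the names COUNTER_HZ and tick_ns are made up for illustration.

```python
# Nominal counter rates in Hz. The TSC figure is the 3 GHz example from the
# text, not a measured value; real TSC rates vary by CPU model.
COUNTER_HZ = {
    "pit": 1_193_182,        # legacy PIT, the "1.19 MHz" counter
    "acpi_pm": 3_579_545,    # ACPI power management timer
    "tsc": 3_000_000_000,    # assumed 3 GHz TSC
}


def tick_ns(freq_hz):
    """Return the length of one counter tick in nanoseconds."""
    return 1e9 / freq_hz


if __name__ == "__main__":
    for name, hz in COUNTER_HZ.items():
        print(f"{name:8s} {hz:>13,d} Hz  one tick = {tick_ns(hz):10.3f} ns")
```

By this arithmetic one PIT tick is roughly 838 ns, while one tick of a 3 GHz TSC is about a third of a nanosecond.  This only describes the granularity of the raw counter, not how accurately the resulting clock tracks real time.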

What I'm interested in is whether it's possible to work around the various TSC problems and make it usable.  For example, turning off power management (C-states, Enhanced SpeedStep, etc.) can work around the problem of the frequency changing or the counter stopping.  Is it possible to fix the multi-processor problems by synchronizing the counters somehow, or can the kernel always read the same CPU's counter?  I know that the newest Intel CPUs like the E5s have a lot of these problems addressed and are advertised as having invariant counters, but what about all the stuff that's not the latest and greatest?
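
One way to see what a given CPU advertises is to look at the flags line in /proc/cpuinfo.  The sketch below is a minimal Python example, assuming an x86 Linux system where that file has a "flags" line; as I understand it, constant_tsc indicates a TSC that ticks at a fixed rate regardless of frequency scaling, and nonstop_tsc indicates one that keeps running in deep C-states.  The function name tsc_flags is hypothetical.

```python
from pathlib import Path

# Flags of interest; all three are names the Linux kernel prints on x86.
TSC_FLAGS = ("tsc", "constant_tsc", "nonstop_tsc")


def tsc_flags(cpuinfo_text):
    """Return {flag: bool} based on the first 'flags' line in cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return {flag: flag in present for flag in TSC_FLAGS}
    # No 'flags' line found (for example, a non-x86 system).
    return {flag: False for flag in TSC_FLAGS}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        for flag, found in tsc_flags(path.read_text()).items():
            print(f"{flag:13s} {'yes' if found else 'no'}")
    else:
        print("no /proc/cpuinfo here (not Linux?)")
```

The flags only describe what the hardware claims; the kernel can still mark the TSC unstable at runtime if its own checks fail.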

I have an Intel Atom based system, and I was able to make the TSC usable for timekeeping by booting Linux with "nosmp" so that it only uses one CPU core.  It would be better if I could somehow keep the other cores usable while still having the high-resolution clock source.

My other test system is a fairly old server with 2x Intel Xeon E5430 quad-core processors.  After I turned off the CPU power management features in the BIOS, Linux started using the TSC as the clock source, but it keeps very poor time, which I think might be due to poor synchronization of the counters.  I'm going to try "nosmp" and I expect that to work like it did on the Atom, but it would be nice if I could use the other CPUs.  How is everyone dealing with this problem?  It's fine to disable the additional cores/CPUs on a dedicated NTP machine, but I wonder if there is a solution that allows both the TSC and all the cores to be used at the same time.  Is it even possible to completely sync the counters across CPUs (not just get close)?  It doesn't seem like it, but maybe someone knows better.
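
For anyone who wants to check which clock source their own kernel picked, this is a minimal Python sketch that reads the clocksource entries under sysfs.  It assumes the usual /sys/devices/system/clocksource/clocksource0 directory is present; the function name read_clocksources is hypothetical.

```python
from pathlib import Path

# Usual sysfs location of the kernel's clocksource selection.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=SYSFS_DIR):
    """Return (current, available) clocksource names, or (None, []) if absent."""
    current_file = base / "current_clocksource"
    available_file = base / "available_clocksource"
    if not (current_file.exists() and available_file.exists()):
        return None, []
    current = current_file.read_text().strip()
    available = available_file.read_text().split()
    return current, available


if __name__ == "__main__":
    current, available = read_clocksources()
    if current is None:
        print("clocksource sysfs entries not found")
    else:
        print("current:  ", current)
        print("available:", ", ".join(available))
```

If "tsc" is missing from the available list, the kernel has most likely rejected it, and the kernel log should say why.  A clock source can also be requested at boot with the clocksource= kernel parameter, or at runtime by writing a name to current_clocksource as root.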

Thanks,
Laszlo