
Re: Recent qemu and timers issue



On Sun, 26 Apr 2009, Juergen Lock wrote:

On Fri, Apr 24, 2009 at 10:20:33PM +1000, Bruce Evans wrote:
On Thu, 23 Apr 2009, Juergen Lock wrote:

On Tue, Apr 07, 2009 at 11:37:37PM +0200, Juergen Lock wrote:
In article <200904062254_(_dot_)_37824_(_dot_)_kalinoj1_(_at_)_iem_(_dot_)_pw_(_dot_)_edu_(_dot_)_pl> you write:
On Saturday, 04 April 2009 at 00:23:29, Juergen Lock wrote:
In article <c948bb4de85d1b2a340ac63a7c46f6d9_(_at_)_iem_(_dot_)_pw_(_dot_)_edu_(_dot_)_pl> you write:
...
I tried to use all possible timers using sysctl, where I have:
TSC(800) HPET(900) ACPI-safe(850) i8254(0) dummy(-1000000)
None of these helped.

None of these are normally used for calculating runtimes.  Normally
on i386, the TSC is used.

Aaah-haa, this I didn't know.

 The only way to configure this is to edit
the source code.  Try removing the calls to set_cputicker() in the MD
code.  Then the MI and timecounter-based cputicker tc_cpu_ticks() will
be used.

Yup, that seemed to help indeed. (patch below.)

 A better implementation would use a user-selectable
timecounter-based cputicker in all cases, but usually not the system
timecounter since that is likely to be very slow so as to be more
accurate.

This was using qemu's emulated hpet...  I guess you mean slow to read
the counter value?  How often is the cputicker read, at every context
switch?  More often?

Yes, ACPI timecounter hardware typically takes 1000 nsec to read, while
TSC hardware typically takes 5 nsec to read (12 cycles on AthlonXP and
Athlon64; more on P3-4, Core2 and Phenom).  I don't know how long it
takes to read a typical HPET.  Emulated timecounter hardware is likely to
be even slower.  Timecounter software typically adds only another 20
(50?) nsec.  The cputicker is read mainly at every context switch.
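
To put those read costs in perspective, here is a minimal sketch that turns
a per-read cost into a fraction of CPU time. The 1000 nsec and 5 nsec
figures are the ones quoted above; the context-switch rates and the
assumption of one cputicker read per switch are hypothetical values chosen
for illustration, not measurements from this thread.

```python
# Rough per-read costs quoted in the message above, in nanoseconds.
ACPI_READ_NS = 1000
TSC_READ_NS = 5


def ticker_overhead_fraction(read_ns, switches_per_sec, reads_per_switch=1):
    """Fraction of one CPU's time spent reading the cputicker.

    reads_per_switch is an assumption; the thread only says the ticker
    is read "mainly at every context switch".
    """
    return read_ns * reads_per_switch * switches_per_sec / 1e9


if __name__ == "__main__":
    # Hypothetical context-switch rates, for illustration only.
    for rate in (1_000, 10_000, 100_000):
        acpi = ticker_overhead_fraction(ACPI_READ_NS, rate)
        tsc = ticker_overhead_fraction(TSC_READ_NS, rate)
        print(f"{rate:>7} switches/s: ACPI {acpi:.4%}  TSC {tsc:.4%}")
```

Under these assumptions the slow timer only becomes a noticeable cost at
high context-switch rates, and an emulated timer that is slower still would
shift that point downwards.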

[...some fixes]

...and I tried this; neither change fixed the problem.

Another thing you can try here is to edit the source code to change
the set_cputicker() calls to say that the frequency is not variable.

That probably won't help here, because I noticed that at least the initial
TSC `calibration' in the guest (in init_TSC()) is way off too (it came out
at less than half of the actual frequency, and according to dmesg the TSC
on this host is `TSC: P-state invariant'.)

The initial calibration code is even sloppier than the recalibration,
and is more likely not to work under emulation.  It depends on the
i8254 timer being accurate and doesn't try to sandwich reads of the
TSC between close-together reads of the reference timer or otherwise
try to limit errors in reading the reference timer.  With real hardware
this normally causes an avoidable error of at most 5 ppm (from waiting
5 i8254 cycles extra), but with emulated hardware it probably causes
a larger error even if the emulation is perfect.  The recalibration
does better by using a higher quality reference timer sampled over an
interval 16 times as long.
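
The "sandwich" idea and the 5 ppm figure can be sketched as follows. This
is a toy simulation, not the kernel's calibration code: SimMachine,
sandwiched_sample() and calibrate() are hypothetical names, the one-second
calibration interval is an assumption, and 1193182 Hz is the nominal i8254
input clock.

```python
I8254_HZ = 1_193_182  # nominal i8254 input clock, in Hz


def extra_wait_error_ppm(extra_cycles, interval_s=1.0, ref_hz=I8254_HZ):
    """Relative error (ppm) caused by waiting extra reference-timer cycles."""
    return extra_cycles / (ref_hz * interval_s) * 1e6


class SimMachine:
    """Toy machine: each reference-timer read advances simulated time."""

    def __init__(self, tsc_hz, ref_hz, ref_read_s, stalls=()):
        self.now = 0.0
        self.tsc_hz = tsc_hz
        self.ref_hz = ref_hz
        self.ref_read_s = ref_read_s
        # Extra delays injected into successive reference reads, e.g. to
        # mimic an emulator being descheduled in the middle of a read.
        self.stalls = list(stalls)

    def read_ref(self):
        stall = self.stalls.pop(0) if self.stalls else 0.0
        self.now += self.ref_read_s + stall
        return int(self.now * self.ref_hz)

    def read_tsc(self):
        return int(self.now * self.tsc_hz)

    def sleep(self, seconds):
        self.now += seconds


def sandwiched_sample(m, max_gap, max_tries=10):
    """Read the TSC between two reference reads that are close together.

    Samples whose two reference reads are more than max_gap ticks apart
    are discarded and retried.
    """
    for _ in range(max_tries):
        r0 = m.read_ref()
        tsc = m.read_tsc()
        r1 = m.read_ref()
        if r1 - r0 <= max_gap:
            return (r0 + r1) / 2, tsc
    raise RuntimeError("could not get a clean sample")


def calibrate(m, interval_s=1.0, max_gap=4):
    """Estimate the TSC frequency from two sandwiched samples."""
    ref1, tsc1 = sandwiched_sample(m, max_gap)
    m.sleep(interval_s)
    ref2, tsc2 = sandwiched_sample(m, max_gap)
    return (tsc2 - tsc1) * m.ref_hz / (ref2 - ref1)


if __name__ == "__main__":
    print(f"5 extra i8254 cycles over 1 s: {extra_wait_error_ppm(5):.2f} ppm")
    machine = SimMachine(2e9, I8254_HZ, 1e-6, stalls=[0.0, 0.001])
    print(f"estimated TSC frequency: {calibrate(machine):.0f} Hz")
```

In the simulated run the second reference read is stalled by a millisecond,
so the first sample is rejected and retried instead of skewing the result.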

This should be fixable using the machdep.tsc_freq sysctl.  However,
this sysctl neglects to call set_cputicker().  This should make
little difference when the frequency is nominated as variable since
recalibration should change it soon anyway.  However, the bug in
recalibration prevents downwards adjustments.
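
The effect of a recalibration that never adjusts downwards can be
illustrated with a small sketch. It models only the behaviour described
above, not the kernel's code: the function names and the sample
frequencies are hypothetical.

```python
def recalibrate_upward_only(current_hz, measured_hz):
    """Accept a new measurement only if it raises the estimate."""
    return measured_hz if measured_hz > current_hz else current_hz


def recalibrate_two_way(current_hz, measured_hz):
    """Accept every new measurement (no filtering, for illustration)."""
    return measured_hz


def run(update, initial_hz, measurements):
    """Apply an update rule to a sequence of measured frequencies."""
    hz = initial_hz
    for measured in measurements:
        hz = update(hz, measured)
    return hz


if __name__ == "__main__":
    true_hz = 2_000_000_000
    # Hypothetical sequence with one bogus high sample in the middle.
    samples = [true_hz, 2_600_000_000, true_hz, true_hz]
    print(run(recalibrate_upward_only, 900_000_000, samples))
    print(run(recalibrate_two_way, 900_000_000, samples))
```

With the upward-only rule a single overestimate, whether from a bad
sample or from a value set by hand, is never corrected afterwards.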

OK, _maybe_ if we get the proper frequency into the guest somehow from
the beginning and then say it's not variable, it could work, but that
still leaves the case of hosts with a non-P-state-invariant TSC
because...

I used this temporarily to work around the non-decreasing calibration.
This should be the default for emulators for most cputickers -- emulators
should emulate a constant frequency and not emulate the complexities
for power saving.

Hmm, I guess that's more easily said than done. :)  At least qemu
basically just passes the host TSC through when a guest reads it.

But it claims P-state invariance?  Maybe it gets that from the host.
Does it trap TSC reads?  This would be slow, but required to emulate
P-state invariance and might be required for accurate timing anyway.
I think emulators shouldn't trap reads of the TSC because the TSC
is unreliable for accurate timing anyway, but they should do something
to keep slower-to-access accurate hardware timers virtually accurate.
Hopefully the hardware people will eventually make a timer like the
TSC both accurate and fast.  Emulators will have a difficult time
preserving both.

Bruce