Another look at HPET (High Precision Event Timer)

Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by Bukkake, Sep 18, 2012.

  1. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,604
    Likes Received:
    13,612
    GPU:
    GF RTX 4070
    Not in this business. I am just a programmer with some experience with the Windows API.

    Not "improved standard timer". It is simple app (service) which sets system timer to maximum resolution (with Windows API).
    As for "0.4991 ms", it is considered that latest Win10 versions use some synthetic timer instead of RTC to implement system timer, and that leads to "0.4991 ms". You can try to force Windows to use RTC again with the command "bcdedit /set useplatformtick yes" - that should turn "0.4991 ms" into "0.5 ms". Try it, maybe it will solve your problem with RTC timer in CPU-Z.
     
    MakeHate likes this.
  2. MakeHate

    MakeHate Guest

    Messages:
    10
    Likes Received:
    1
    GPU:
    MSI 1660ti
    Unfortunately, running this command made the situation worse. The RTC was lagging before, and it still is, but cache performance has worsened: L1, L2, and especially L3. The L3 latency was 9.4 ns, but it became 9.9 ns, and this is definitely not a measurement error, because it also increased for L1 and L2.
     
  3. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,604
    Likes Received:
    13,612
    GPU:
    GF RTX 4070
    You are worrying too much:
    1. The difference (9.9 - 9.4 = 0.5 ns) is nanoscopic.
    2. The RTC has nothing to do with cache memory. Cache memory sits and works inside the CPU, while the RTC sits and works somewhere on the motherboard, and neither depends on the other. You are either seeing a random measurement margin, or CPU-Z itself is affected by the RTC (not the actual CPU).
     
    Smough and MakeHate like this.
  4. MakeHate

    MakeHate Guest

    Messages:
    10
    Likes Received:
    1
    GPU:
    MSI 1660ti
    I apologize for my excitement. It's just that I've always been told that all timers must be absolutely synchronized. It's making me panic. :(

    Couldn't the RTC have something to do with Spread Spectrum and the processor's base clock (BCLK)?
     

  5. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,604
    Likes Received:
    13,612
    GPU:
    GF RTX 4070
    To my understanding it can't. The RTC runs at 32.768 kHz at most, while the CPU bus runs at 100 MHz.
    The RTC is the Real Time Clock chip and has nothing to do with the CPU or the chipset.
     
    MakeHate likes this.
  6. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,604
    Likes Received:
    13,612
    GPU:
    GF RTX 4070
    @Tyrchlis

    You were answering a post from Sep 30, 2012 (just in case).
     
    Tyrchlis likes this.
  7. mdrejhon

    mdrejhon Member Guru

    Messages:
    128
    Likes Received:
    136
    GPU:
    4 Flux Capacitors in SLI
    Enabling/disabling HPET sometimes has an effect on my Tearline Jedi project, which does the equivalent of raster interrupts (beam racing) on commodity GeForces/Radeons: Tearline Jedi: Beam Raced Effects on GeForces/Radeons/Intel GPUs

    It was my Tearline Jedi research that inspired Unwinder to add Scanline Sync to RTSS :)

    ____________

    <Programmer>
    Eventually I need to submit the codebase to some demo compo (I would want co-author help), and/or open-source it. It's written in C# on the MonoGame engine and uses busywaits for precise placement of VSYNC OFF tearlines to generate raster-interrupt-style effects.

    With HPET enabled (I'm not 100% sure it is 100% attributable to HPET), it seemed like the existing open-source MonoGame engine timer was precise enough to place tearlines in exact locations without needing busywaits. Not as accurate as busywaits (tearlines jittered a bit more), but far more accurate than the 0.5 ms granularity of most timer-event-like mechanisms.

    I don't know what mechanism the MonoGame engine timer uses, but it seems to be somewhat HPET-sensitive during real-time manipulation of MonoGame's next-frame timer (adjusting, in real time, when I want Update()/Draw() to be called next, i.e. treating a timer like a raster interrupt), since the time offset between two VSYNCs maps approximately onto the display raster. We've been raster-scanning displays from 1930s analog TVs to 2020s DisplayPort monitors, and all VSYNC OFF tearlines in humankind are just rasters -- a tearline is a raster side effect of splicing a new frame into the existing scanout. And, apparently, I was able to do that in real time without busywaits in a modification to Tearline Jedi.

    The position of a tearline is a function of the time offset between two VSYNCs, so it can be estimated for cross-platform beam racing without the Windows-specific D3DKMTGetScanLine() .... the magic number is the horizontal scan rate, which determines the busywait required to move a tearline down 1 pixel. So a 135 kHz scan rate means 1/135000 sec moves a tearline down 1 pixel.
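    A minimal sketch of that arithmetic (not Tearline Jedi's actual code; the 135 kHz scan rate and the VSYNC timestamp source are assumptions for illustration):

        using System;
        using System.Diagnostics;

        static class TearlineSketch
        {
            // Assumption for illustration: a 135 kHz horizontal scan rate.
            const double HorizontalScanRateHz = 135000.0;

            // Time offset after a VSYNC at which a tearline lands on a scanline:
            // each scanline takes 1/135000 s, so scanline N lands N/135000 s in.
            static TimeSpan OffsetForScanline(int scanline) =>
                TimeSpan.FromSeconds(scanline / HorizontalScanRateHz);

            // Busywait until the target offset since the last VSYNC has elapsed;
            // spinning avoids the ~0.5 ms granularity of timer-event mechanisms.
            static void BusyWaitUntil(Stopwatch sinceVsync, TimeSpan target)
            {
                while (sinceVsync.Elapsed < target) { /* spin */ }
            }

            static void Main()
            {
                var sinceVsync = Stopwatch.StartNew();   // pretend a VSYNC just fired
                BusyWaitUntil(sinceVsync, OffsetForScanline(540));
                Console.WriteLine(
                    $"Tearline target reached at {sinceVsync.Elapsed.TotalMilliseconds:F3} ms");
            }
        }

    Presenting the next frame right after the busywait returns should splice it near the requested scanline, within the jitter budget discussed above.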

    One system exhibits 0.5 ms granularity in the MonoGame Update timer, and another system exhibits microsecond-league precision in the MonoGame Update timer event, even when I manipulate it between Update() calls, metaphorically like setting the next raster of a Commodore 64 raster interrupt.

    Clearly the MonoGame event timer, on the most accurate systems, seems to manage almost that accuracy if I dynamically reset the desired next Update()/Draw() time, metaphorically like a classic Commodore 64 raster interrupt, provided I don't spend too many milliseconds in between (otherwise power management automatically kicks in and jitters the next event - a bigger problem if I am only doing a few tearlines per refresh cycle at more aggressive power management settings).

    And it works semi-cross-platform, assuming I have access to a VSYNC heartbeat (the timings of the beginnings of refresh cycles). That is possible by running two threads: a hidden VSYNC ON instance (to get the VSYNC heartbeat) and a visible fullscreen VSYNC OFF instance (to beam-race the tearlines). So there are now two techniques sufficiently accurate for Tearline Jedi: the busywait method and the high-precision timer method.
    </Programmer>

    100% agreed. HPET and similar mechanisms are a double-edged sword, trading one thing for another (e.g. precision vs. performance).
     
    Last edited: Nov 6, 2020
    chinobino likes this.
  8. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,604
    Likes Received:
    13,612
    GPU:
    GF RTX 4070
  9. Smough

    Smough Master Guru

    Messages:
    984
    Likes Received:
    303
    GPU:
    GTX 1660
    In general, using bcdedit /set useplatformtick yes has seemingly removed the micro-stutter I was having in some games under Windows 10 1809. Some could say it's placebo, but being honest, I don't think it is.

    Games got much smoother; I can't tell why, but to me it seems the synthetic timer from MS interferes with how some games "call" the Windows timer in order to work properly. DMC5, The Division 2 and SW: Battlefront 2 changed dramatically. No one is quite sure why MS even changed the timer, when it was the RTC on Windows 7. There are even timer differences between Windows versions: 1607 has the timer slightly above 1 ms (around 1.0006 ms), while 1803 and later sit around 0.9997 ms.

    Why even change this? Security reasons? If so, the user should have the right to know, and for further optimization this should change on its own when you run something like a game, but it doesn't. This is why I believe just leaving an even timer makes things better (even if real-world proof of this is anecdotal at best, and this "trend" started with Windows 10 "tweakers" on YouTube, along with the claims that the 1709 and 1607 Windows versions are better for games). It also seems like MS themselves don't even know where to leave the timer, lol.
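    If you want to check these values on your own build, here is a minimal sketch; NtQueryTimerResolution is an undocumented but long-stable ntdll export that reports the system timer resolution in 100 ns units:

        using System;
        using System.Runtime.InteropServices;

        class QueryTimer
        {
            // All three values are in 100 ns units (10000 units = 1 ms).
            // Parameter order follows the native MinimumResolution (coarsest),
            // MaximumResolution (finest), CurrentResolution convention.
            [DllImport("ntdll.dll")]
            static extern int NtQueryTimerResolution(
                out uint coarsest, out uint finest, out uint current);

            static void Main()
            {
                NtQueryTimerResolution(
                    out uint coarsest, out uint finest, out uint current);
                // A "1.0006 ms" tick would show up here as current ~ 10006.
                Console.WriteLine($"coarsest={coarsest / 10000.0} ms, " +
                                  $"finest={finest / 10000.0} ms, " +
                                  $"current={current / 10000.0} ms");
            }
        }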
     
  10. theahae

    theahae Active Member

    Messages:
    67
    Likes Received:
    12
    GPU:
    GTX 1060
    bcdedit /set useplatformtick yes actually caused my rendering in 3ds Max to be slower
     
    Last edited: Nov 8, 2020

  11. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,604
    Likes Received:
    13,612
    GPU:
    GF RTX 4070
    Two times slower? Three times? Or somewhat slower?

    (Just curious...)
     
  12. theahae

    theahae Active Member

    Messages:
    67
    Likes Received:
    12
    GPU:
    GTX 1060
    I can't tell since I didn't measure it, but it was noticeably slower.
     
    Last edited: Nov 8, 2020
  13. Smough

    Smough Master Guru

    Messages:
    984
    Likes Received:
    303
    GPU:
    GTX 1660
    Which Windows version? This command does not change the computer's speed one bit; it changes the internal timer. With bcdedit /set useplatformtick yes you are using the motherboard's timer (hence PLATFORM tick) and not the new one that was introduced in Windows 10. You should combine this with TimerTool to keep the timer at 1 ms or else it could jump around, so maybe that's why 3ds Max was slower for you.

    It seems like you are confusing it with bcdedit /set useplatformclock yes, which forces HPET on Windows for all apps, which you should NEVER do. Windows turns HPET off and on as it needs it; there is no reason whatsoever to keep it forced on unless you have a very specific need for it. A lot of people mix up these two commands: useplatformtick IS NOT useplatformclock. One affects the Windows timer, the other affects QueryPerformanceFrequency.
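    As a rough check of which side of that distinction your machine is on, a minimal sketch: .NET's Stopwatch.Frequency surfaces QueryPerformanceFrequency, and the reported value hints at which QPC clock source Windows picked (typical frequencies, not guaranteed on every system):

        using System;
        using System.Diagnostics;

        class QpcSource
        {
            static void Main()
            {
                // Stopwatch.Frequency == QueryPerformanceFrequency.
                Console.WriteLine($"QPF = {Stopwatch.Frequency:N0} Hz");
                // Typical readings:
                //  ~10,000,000 Hz -> TSC-backed QPC (recent Win10 default)
                //  ~14,318,180 Hz -> HPET forced via useplatformclock
                //   ~3,579,545 Hz -> ACPI PM timer fallback
            }
        }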
     
  14. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,604
    Likes Received:
    13,612
    GPU:
    GF RTX 4070
    Two functions...
     
  15. Smough

    Smough Master Guru

    Messages:
    984
    Likes Received:
    303
    GPU:
    GTX 1660
    Oh, well, thanks for the correction.
     

  16. theahae

    theahae Active Member

    Messages:
    67
    Likes Received:
    12
    GPU:
    GTX 1060
    I was wrong. I just rendered again and the results are pretty much the same:
    47 versus 46 seconds.
     
  17. Blanky

    Blanky Member

    Messages:
    34
    Likes Received:
    10
    GPU:
    RTX 2070 SUPERXTrio
    It appears something happened to the timer subsystem resolution on Windows 10 starting from build 1809.

    QueryPerformanceFrequency() returns 10 MHz (10 ms), which is causing the GUI et al. to have latency. DPC latency is also elevated after installing 1809. This happens on multiple different motherboards (AMD, Intel) as well as various configurations. Since then my PC has not been the same; this has caused a lot of controversy, but I still don't understand why it changed.
     
  18. janos666

    janos666 Ancient Guru

    Messages:
    1,653
    Likes Received:
    407
    GPU:
    MSI RTX3080 10Gb
    Yes, it was "in the news" back in late 2018. Today it's late 2020, merely 2 years later. But no rush! Please, take your time digesting this rather peculiar and life-shattering revelation! :p
     
    rab3072 likes this.
  19. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,604
    Likes Received:
    13,612
    GPU:
    GF RTX 4070
    QueryPerformanceFrequency has nothing to do with the system timer resolution.
    And 10 MHz is not 10 milliseconds; frequency is the reciprocal of period, so a 10 MHz counter ticks every 1/10,000,000 s = 100 nanoseconds (0.1 microseconds).
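    A minimal sketch of that conversion, again via .NET's Stopwatch (whose Frequency property surfaces QueryPerformanceFrequency):

        using System;
        using System.Diagnostics;

        class TickPeriod
        {
            static void Main()
            {
                double hz = Stopwatch.Frequency;   // e.g. 10,000,000 on recent Win10
                double nsPerTick = 1e9 / hz;       // 10 MHz -> 100 ns per tick
                Console.WriteLine($"{hz:N0} Hz -> {nsPerTick:N1} ns per tick");
            }
        }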
     
    Last edited: Nov 8, 2020
  20. Blanky

    Blanky Member

    Messages:
    34
    Likes Received:
    10
    GPU:
    RTX 2070 SUPERXTrio
    Sorry, in the end I got confused. What I wanted to say is that since that change I have never again felt games being smooth and fluid, and I have never understood why it changed. Whatever I do I get stuttering; I have tried all kinds of configurations, and in the end I gave up because I could not find a way to solve it. (Sorry for my bad English.)
     
