Another look at HPET High Precision Event Timer

Discussion in 'Videocards - NVIDIA GeForce Drivers Section' started by Bukkake, Sep 18, 2012.

  1. Cyberdyne

    Cyberdyne Ancient Guru

    Messages:
    3,398
    Likes Received:
    186
    GPU:
    2080 Ti FTW3 Ultra
    Uh oh, this thread is about to loop again.
     
  2. janos666

    janos666 Master Guru

    Messages:
    794
    Likes Received:
    83
    GPU:
    MSI RTX2060 6Gb
    Well, for starters, it should be moved from GeForce Drivers to some other section of the forum (maybe Operating Systems or Game Tweaks).
     
  3. Smough

    Smough Master Guru

    Messages:
    374
    Likes Received:
    54
    GPU:
    GTX 1060 3GB
    That thread is ancient, and the user who posted it was explaining things from his point of view; that doesn't mean it will be the same for all users or that he is right. If forcing HPET on in Windows were so amazing, the OS would have it active by default, yet it doesn't. Windows uses TSC and falls back to HPET when it's needed, so there is no need to force it on or off. I believe a user who fully understands what it does could force it on for a particular program or application; everyone else doesn't need it on, because Windows manages it.

    Stock HPET settings in Windows with HPET active in the BIOS is the best setup:

    I'd rather believe a real video than someone telling me to force a timer on my OS that doesn't need to be forced, because it's actually dynamic.

    He also said this (from the thread you linked): "Fortunately I have to say that useplatformclock defaults windows to the best that the clock has (cpu), but is it so with mb's?"

    He doesn't even know what he is doing or changing; it just seemed OK to him, that's it.

    As for latency, well, this is my system: https://imgur.com/a/XqOkjuS

    And if your whole post is sarcasm, well then, I fully fell for it, because taking advice from a random post by a random user who has no idea about anything is something I refuse to believe anyone here would do.
     
    Last edited: Mar 21, 2020
  4. mbk1969

    mbk1969 Ancient Guru

    Messages:
    8,836
    Likes Received:
    5,791
    GPU:
    GeForce GTX 1070
    We have no means to check that. I mean, we can't switch that frequency on the fly. And if you use an older version of Win10, the test is not relevant because the whole OS is different.
     
    Smough likes this.

  5. Smough

    Smough Master Guru

    Messages:
    374
    Likes Received:
    54
    GPU:
    GTX 1060 3GB
    I tested Windows 10 1909, which has the 10 MHz QPC frequency. Latency measured with LatencyMon was as low as, if not lower than, on 1803 with the tweaks I apply. Very good in that regard, but the system felt "muddy", as if most of what I did had a small delay before responding; fast, sure, but not as snappy as versions 1709 and 1803. Even with the Spectre and Meltdown mitigations disabled, the 10 MHz QPC frequency remains; there is no way to change it. Generally speaking, it won't affect the normal user, because not everyone notices these changes the way some of us do. Gaming felt the same, with no difference that I can remember.

    I decided to go back to 1803 and everything feels OK; it's a bit snappier. When Windows 10 2004 comes out I will try it. It promises a lot, but we'll have to see.

    Just stay on whatever version you have; don't roll back. Try to tweak it as much as possible. fr33thy's guides are OK; even though he doesn't explain certain things, some of that stuff just works. Keep drivers up to date, and use DDU if you think your GPU driver is giving you issues, then install the newest one. ISLC is not needed anymore, imo, but if you use Windows Defender, disable all the ASLR security layers if you feel your games have problems. You won't get hacked or anything by disabling this, don't worry.

    Remember MSI modes (guide from mbk1969) here: https://forums.guru3d.com/threads/w...ge-signaled-based-interrupts-msi-tool.378044/

    Set affinities:

    Try using Park Control, great tool: https://bitsum.com/product-update/parkcontrol-v1-0-3-0-released/

    Disable Windows spying "features": https://www.oo-software.com/en/shutup10

    Optimize the visual effects to minimize system lag due to the graphics interface.

    Google which Windows 10 services are unneeded and disable them gradually: check what each one is for and, if you don't need it, disable it, then test your system, and so forth. Generally, this lowers latency a bit and reduces RAM usage at idle.

    CREATE A SYSTEM RESTORE POINT BEFORE DISABLING WINDOWS SERVICES, so that if Windows stops booting you will be able to recover. Without one, you will have to reinstall.
     
    Last edited: Mar 21, 2020
  6. HeavyHemi

    HeavyHemi Ancient Guru

    Messages:
    6,712
    Likes Received:
    804
    GPU:
    GTX1080Ti

    Wow...are you really that thick? I specifically said it was the FIRST POST IN THIS THREAD. The thread we are in now. Here's the link, post one: https://forums.guru3d.com/threads/another-look-at-hpet-high-precision-event-timer.368604/
    Holy cow. WTF are you even babbling at me about? Calm down and read for content. Like I said, this thread is a running gag that keeps recycling every so often because nobody reads back more than a page, if that. 8 friggen years of the same crap over and over. Derp.
     
  7. aufkrawall2

    aufkrawall2 Master Guru

    Messages:
    496
    Likes Received:
    17
    GPU:
    GTX 1070 OC/UV
    Some components, like Explorer, definitely got more sluggish with 1903 vs. 1809, but that doesn't mean there is a general performance degradation across all applications.
     
  8. Groot

    Groot Member

    Messages:
    24
    Likes Received:
    3
    GPU:
    GTX 1080
  9. mbk1969

    mbk1969 Ancient Guru

    Messages:
    8,836
    Likes Received:
    5,791
    GPU:
    GeForce GTX 1070
    Funny thing is that the assembly code in both screenshots is completely the same.
     
  10. Groot

    Groot Member

    Messages:
    24
    Likes Received:
    3
    GPU:
    GTX 1080
    :D Just exercising his right to be human but it does show the new way.
     
    Last edited: Mar 22, 2020

  11. BetA

    BetA Ancient Guru

    Messages:
    4,220
    Likes Received:
    185
    GPU:
    MSI GTX670 PEOC@1350Mhz
    Yeah, I know, he did say:

    I'm sure if I asked him he could give me some more information on this.

    Best Regards
     
  12. mbk1969

    mbk1969 Ancient Guru

    Messages:
    8,836
    Likes Received:
    5,791
    GPU:
    GeForce GTX 1070
    He described the difference in words, so we can trust him. I am just mildly curious (and too lazy to disassemble the code myself).
     
  13. Groot

    Groot Member

    Messages:
    24
    Likes Received:
    3
    GPU:
    GTX 1080
    Before, 1607
    Code:
    Old way, TSC divide by 1024
    
            mov     r11, [7FFE03B8H]   ; qpcbias
            rdtsc                      ; Read TSC to EDX:EAX
            shl     rdx, 32
            or      rdx, rax           ; EDX:EAX to RDX
    ;===============================
            lea     rax, [rdx+r11]     ; rax = tsc + bias
            mov     cl, [7FFE03C7H]    ; (10 for me)
            shr     rax, cl            ; divide by 1024
            mov     [QPC], rax         ; store result
    
    1903
    Code:
    New way, convert TSC to 10MHz
    
            mov     r11, [7FFE03B8H]   ; qpcbias
            rdtscp                     ; Read TSC to EDX:EAX
            shl     rdx, 32
            or      rdx, rax           ; EDX:EAX to RDX
    ;-----------------------------
            mov     rax, [r9+8H]       ; Magic Number, (10000000 * 2^64) / TSC Frequency
            mov     rcx, [r9+10H]      ; Offset (zero for me)
            mul     rdx                ; RDX:RAX = TSC * magic, high half in RDX = TSC at 10MHz
            add     rdx, rcx           ; Apply offset (none for me)
    ;-----------------------------
            lea     rax, [rdx+r11]     ; rax = converted tsc + bias
            mov     cl, [7FFE03C7H]    ; (0 for me)
            shr     rax, cl            ; zero shift
            mov     [QPC], rax         ; store result
     
    I've left out some conditional code and renamed things to make the comparison simpler. There are no serializing instructions in the earlier code, but that may not matter much since the TSC resolution is being cut so much. On a 32-bit OS the multiply would be somewhat more convoluted.
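
    For anyone who would rather poke at the arithmetic than read assembly, here is a rough Python model of the two listings. It is only a sketch: the function names and the 3.0 GHz TSC frequency are made up for illustration, and the conditional code is left out just as it is in the listings.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit register wraparound


def qpc_old(tsc, bias=0, shift=10):
    """1607-style path: (tsc + bias) >> shift, i.e. TSC / 1024 when shift is 10."""
    return ((tsc + bias) & MASK64) >> shift


def qpc_new(tsc, magic, offset=0, bias=0, shift=0):
    """1903-style path: scale TSC to 10 MHz with a fixed-point multiply."""
    high = (tsc * magic) >> 64            # mul rdx: keep the high 64 bits (RDX)
    high = (high + offset) & MASK64       # add rdx, rcx
    return ((high + bias) & MASK64) >> shift


TSC_FREQ = 3_000_000_000                  # assumed 3.0 GHz TSC, illustration only
MAGIC = (10_000_000 << 64) // TSC_FREQ    # (10000000 * 2^64) / TSC frequency

one_second = TSC_FREQ                     # TSC ticks elapsed in one second
print(qpc_old(one_second))                # QPC ticks per second, old way
print(qpc_new(one_second, MAGIC))         # QPC ticks per second, new way
```

    With these assumed numbers the old path gives about 2.93 million ticks per second and the new one just under 10 million (9,999,999, because the magic number is truncated).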

    Hope it helps.
     
    Nastya, BetA and mbk1969 like this.
  14. Smough

    Smough Master Guru

    Messages:
    374
    Likes Received:
    54
    GPU:
    GTX 1060 3GB
    So is it "better" or just the same? Or slower? Also, 1709 and 1803 use the old way; it's from 1809 onward that the QPF is 10 MHz.
     
    Last edited: Mar 23, 2020
  15. janos666

    janos666 Master Guru

    Messages:
    794
    Likes Received:
    83
    GPU:
    MSI RTX2060 6Gb
    Same hardware source, probably mostly the same code, roughly 2-3 times higher effective frequency... Why would you assume it to be worse?
     

  16. Smough

    Smough Master Guru

    Messages:
    374
    Likes Received:
    54
    GPU:
    GTX 1060 3GB
    Well, a higher QPF in theory leads to more latency. Also, why was it raised in the first place? If it's for security reasons or whatever, the user should still have the right to choose, not have it shoved down their throat. Since this Spectre and Meltdown obsession started, it seems you must accept this security even if you want to get rid of it.
     
  17. mbk1969

    mbk1969 Ancient Guru

    Messages:
    8,836
    Likes Received:
    5,791
    GPU:
    GeForce GTX 1070

    One note. Can the value of (10000000 * 2^64) be stored in a 64-bit register? 10000000 left-shifted by 64 bits will leave only zeros, imo.

    PS Second note. This is the code for QPC. And I was questioning the code of QPF.
     
    Last edited: Mar 23, 2020
  18. mbk1969

    mbk1969 Ancient Guru

    Messages:
    8,836
    Likes Received:
    5,791
    GPU:
    GeForce GTX 1070
    It was for time maintenance (synchronization) reasons. There was a link here or in the stand-by memory fix thread.
    Anyway, if it is still TSC then there is nothing to worry about. A small increase in QPF can't cause huge problems in existing code.
     
  19. Groot

    Groot Member

    Messages:
    24
    Likes Received:
    3
    GPU:
    GTX 1080
    One would have to test it.

    It's not the frequency but the time taken to get a result.

    Yes and yes. 64-bit integer division uses RDX as the upper 64 bits and RAX as the lower 64 bits of the dividend. Integer-dividing 10,000,000 by the TSC frequency alone will usually give zero, so we multiply 10,000,000 by 2^64, which in this case simply means putting it in RDX. It also means we don't have to divide the converted result by 2^64 (SHR 64); we just take it straight from RDX instead.
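
    To make the RDX:RAX point concrete, here is a small Python sketch of what a 64-bit `div` does with a 128-bit dividend. `div128` is a made-up helper for illustration, not anything from Windows, and the 3.0 GHz frequency is just an assumed example.

```python
def div128(rdx, rax, divisor):
    """Model x86-64 `div r64`: divide the 128-bit value RDX:RAX by a 64-bit divisor.

    Returns (quotient, remainder) as they would land in RAX and RDX.
    The real instruction faults (#DE) if the quotient does not fit in 64 bits.
    """
    if divisor == 0:
        raise ZeroDivisionError("div by zero raises #DE")
    dividend = (rdx << 64) | rax
    quotient, remainder = divmod(dividend, divisor)
    if quotient >> 64:
        raise OverflowError("quotient does not fit in 64 bits (#DE)")
    return quotient, remainder


TSC_FREQ = 3_000_000_000          # assumed example frequency

# Plain integer division loses everything:
print(10_000_000 // TSC_FREQ)     # 0

# 10,000,000 in RDX with zero in RAX is the same as 10,000,000 * 2^64:
magic, _ = div128(10_000_000, 0, TSC_FREQ)
print(hex(magic))                 # 0xda740da740da74
```

    The quotient fits in 64 bits as long as the TSC frequency is above 10 MHz, since the value in RDX must be smaller than the divisor.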

    QPF is just a hard coded value, no calculation done. Could easily be something else if wanted.
    Maybe something like
    Code:
            mov     rcx,TSCF           ; Time Stamp Counter Frequency
            mov     rdx,10000000       ; The 10MHz QPF MS wants
            xor     eax,eax            ;
            div     rcx                ;
            mov     [MagicNumber],rax  ; Store the result in HalT and share for use by QPC
                                       ; note, maybe some adjustment for rounding or not?
    
                                       ; Example, if TSCF = 3.0GHz then result is 0xDA740DA740DA74
    
                                       ; If TSC reads 6,000,000,000 then QPC adjusted would be
            mov     rcx,6000000000     ; TSC
            mov     rax,[MagicNumber]  ; 0xDA740DA740DA74
            mul     rcx                ; rdx = 19,999,999
                                       ; as QPC = QPF * TSC / TSCF
                                       ; 10,000,000 * 6,000,000,000 / 3,000,000,000 = 20,000,000
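
    On the rounding question in the comment above, here is a quick Python check, under the same assumed 3.0 GHz TSC, of what happens when the magic number is truncated versus rounded up. Whether Windows actually rounds is not something this shows; it only illustrates where the 19,999,999 comes from.

```python
QPF = 10_000_000                  # the 10 MHz target
TSC_FREQ = 3_000_000_000          # assumed 3.0 GHz TSC, as in the example


def to_qpc(tsc, magic):
    """High 64 bits of tsc * magic, i.e. what `mul` leaves in RDX."""
    return (tsc * magic) >> 64


magic_floor = (QPF << 64) // TSC_FREQ                    # truncating divide, like `div`
magic_ceil = ((QPF << 64) + TSC_FREQ - 1) // TSC_FREQ    # same value rounded up

tsc = 6_000_000_000               # two seconds' worth of ticks
print(to_qpc(tsc, magic_floor))   # 19999999
print(to_qpc(tsc, magic_ceil))    # 20000000
```

    So the off-by-one comes from the truncated magic number and the truncated high half; rounding the magic number up by one removes it in this particular example.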
    
     
  20. mbk1969

    mbk1969 Ancient Guru

    Messages:
    8,836
    Likes Received:
    5,791
    GPU:
    GeForce GTX 1070
    I suspect you are talking about different latencies. You mean that, due to changes in the latest Win10 builds, the QueryPerformanceCounter function itself takes a bit longer to execute (in the TSC code path). I don't really get which latency Smough meant, because he mentioned only the QueryPerformanceFrequency function, as if the new result of 10 MHz would increase latency compared to the old 3.something MHz.
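
    If anyone wants to look at the two separately, both are visible from Python, whose `time.perf_counter` is built on QueryPerformanceCounter on Windows. A rough sketch: the reported resolution reflects the counter frequency, while the loop estimates how long one counter read takes. `avg_call_ns` is just an illustrative helper, and the numbers include Python interpreter overhead, so treat them as relative only.

```python
import time


def avg_call_ns(samples=200_000):
    """Estimate the average cost of one perf_counter_ns() call, in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(samples):
        time.perf_counter_ns()
    elapsed = time.perf_counter_ns() - start
    return elapsed / samples


info = time.get_clock_info("perf_counter")
print("implementation:", info.implementation)
print("resolution (s):", info.resolution)     # on Windows this should be 1 / QPF
print("avg call cost (ns):", avg_call_ns())
```

    Comparing the last figure across Windows builds on the same machine would say something about the time taken per call; the resolution line only tells you the frequency.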
     
