AVX-512 Is an Intel Gimmick To Win Benchmarks and should die a painful death

Discussion in 'Frontpage news' started by Hilbert Hagedoorn, Jul 13, 2020.

  1. tsunami231

    tsunami231 Ancient Guru

    Messages:
    11,604
    Likes Received:
    816
    GPU:
    EVGA 1070Ti Black
    Is ANYONE modest about how they put things these days? They all do it in the way that causes the most damage/buzz.
     
  2. Gomez Addams

    Gomez Addams Member Guru

    Messages:
    177
    Likes Received:
    97
    GPU:
    Titan RTX, 24GB
    The OS kernel does not have to deal with the instruction set per se but it has to deal with its registers on the CPU. One of the primary roles of the OS kernel is scheduling threads and processes. When there is a context switch between threads or processes the OS has to save the current context. If a given set of instructions is to be fully supported then its registers have to be saved and restored during context switches. This is why new processors with new registers and/or instruction sets need additional support from the OS.

    From a developer's perspective, which I am, these kinds of things are usually non-issues because most of us target the path of widest compatibility. I know I do, generally. Since AVX-512 is available on only a subset of available CPUs I would not target it. If we were to target specific machines for reasons of performance then we might consider using AVX (of any flavor). Right now we have to maintain compatibility with the laptops we use so we do not consider AVX or anything like it.

    In an odd twist to that, I work with CUDA and that is definitely a targeted, high performance, special case. We deal with its compatibility issue by giving everyone gaming laptops and desktop machines with Nvidia graphics cards. If we were to also target AVX-512 then we would require machines have CPUs that support it and I don't see the performance benefit to be worth the PITA in hardware and development time. Requiring Nvidia GPUs is pretty easy to work around because they are easily available at every platform level. I would be more than happy to use AMD GPUs if they would support CUDA. Hopefully, they will one day - directly. That kind of competition would really help lower the price of Nvidia's high-end GPU cards. They are in the 9K range now for a V100 and Amperes will probably be over 11K. If AMD can be competitive there it would really help bring those prices down and I really hope they will.

    As an example of what I mean, I use an MSI GL75 for remote work and it has a 2070 so CUDA works pretty well on it. It has a 10750H CPU that supports AVX2 but not AVX-512 so I would be SOL trying to use it.
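
    The "widest compatibility" approach above is usually paired with a runtime check, so one binary can still use a faster path where it exists. A minimal sketch, assuming Linux-style /proc/cpuinfo text; the path names returned here are hypothetical labels, not functions from any real library:

```python
# Minimal sketch: pick the widest SIMD code path a machine reports.
# Assumption: Linux-style /proc/cpuinfo text with a "flags" line. The
# returned path names are hypothetical labels, not real library calls.

def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_code_path(flags):
    """Choose the widest supported path, falling back to a baseline."""
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    if "avx" in flags:
        return "avx"
    return "sse2_baseline"


# Made-up, heavily shortened flags line for a CPU with AVX2 but no AVX-512.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2 fma\n"
print(pick_code_path(parse_cpu_flags(sample)))  # prints avx2
```

    On a real Linux machine the text would come from reading /proc/cpuinfo; other operating systems need a different source for the same information.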
     
    Gandul, HandR, Kaarme and 3 others like this.
  3. mbk1969

    mbk1969 Ancient Guru

    Messages:
    10,641
    Likes Received:
    7,880
    GPU:
    GF RTX 2070 Super
    How?
    Some CPU instructions are privileged: they can be executed only from ring 0 (https://en.wikipedia.org/wiki/Protection_ring). Kernel code running in ring 0 can execute any instruction. User code running outside ring 0 can execute only unprivileged instructions. That's my understanding. Are you implying that the kernel can switch access to CPU instructions on the fly?

    If you make cryptography a part of the OS kernel, that's bad design in my eyes. But of course, I am not Torvalds.
    It could also be that the term "OS kernel" is overused and even misused in the link you provided. The OS kernel is the core of the OS, and cryptography doesn't look like part of the core: part of the OS, but not part of the kernel.
     
  4. mbk1969

    mbk1969 Ancient Guru

    Messages:
    10,641
    Likes Received:
    7,880
    GPU:
    GF RTX 2070 Super
    So if AVX-512 introduced new registers that need to be saved on a context switch, then on a Windows build which is unaware of these registers, apps which use AVX-512 will not work or will even crash the OS kernel?

    I mean has AVX-512 introduced such registers?
     

  5. TieSKey

    TieSKey Member Guru

    Messages:
    187
    Likes Received:
    65
    GPU:
    Gtx870m 3Gb
    I'm not a kernel developer but to add to what others already said, a kernel needs to support the hardware; CPU drivers are built in and closely integrated into an OS kernel. That means additional registers to deal with in context switches and security checks, efficient cache and virtual memory management for the needs of those instructions, suspend/restart states and thread scheduling. My guess is that the last item is a bitch since using avx-512 makes the cpu core downclock a lot, affecting its 2 threads (and maybe even more parts of the cpu) so, as an OS, u don't want to schedule a software thread to a cpu hyper/virtual thread whose sibling (the other virtual thread of the same physical core) is running avx-512 instructions.
     
    Gandul and mbk1969 like this.
  6. bobblunderton

    bobblunderton Master Guru

    Messages:
    406
    Likes Received:
    192
    GPU:
    EVGA 2070 Super 8gb
    BeamNG.drive uses floating point numbers to run the entire game, from the constant 2000 Hz physics on each vehicle (which is hundreds or thousands of different points or nodes/beams), to the rendering engine and even object placement. AVX can save time IF the processor you're using has it.
    Floating point has been used for games ever since Doom, heck even Tank Wars might use it, though I'm not 100% sure (it was a free 'worms' / scorched earth clone). Yes, I still play Tank Wars (and Doom) in DOS on at least a monthly basis.
    Thus far VERY FEW processors have AVX-512, apart from some Xeon workstation and server processors and HEDT parts (79xx ~ 10xxx X and XE series on X299).
    So if you don't have market penetration to a significant degree, because Intel artificially segmented the market out of GREED (never mind the excess heat it generates; you almost NEED liquid cooling to really use it for extended periods), no one will use it.
    So while AVX2 is a godsend if you use a lot of floating point operations and can use AVX to accelerate them further, with only 1~3% (max!) market penetration for AVX-512, and even fewer machines able to cool the chip properly during extended computation/runs, there's never going to be a use for it.
    That'd be like owning a flying car but never being allowed to get it off the ground legally. It's not like the MMX, SSE, or 3DNow! extensions, which actually helped because they started putting them on everything after a certain date.

    It looked great on paper, but until it goes mainstream, it's unlikely anyone will write code to feed it properly/efficiently. If Intel doesn't see fit to include it on mainstream chips, it's never going to go anywhere. Same with AMD.

    Look how many years it took quad cores to get to be the sweet spot for gaming!
     
    carnivore likes this.
  7. TieSKey

    TieSKey Member Guru

    Messages:
    187
    Likes Received:
    65
    GPU:
    Gtx870m 3Gb
    AFAIK u can tell a CPU to save all its context (registers) starting at a given memory address. But if u want to do it right, u will want to know exactly how much space a context switch takes in advance, probably even how many cpu cycles. Adding more instructions with new registers (iirc avx-512 does need new registers, some of the other lower variants just combine 2 existing ones) means more cases u have to know about and handle if u don't want the OS hurting performance.
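
    To put rough numbers on "how much space": the register counts and widths below are the architectural ones for 64-bit mode, where each ymm register extends an xmm register and each zmm register extends a ymm register, with AVX-512 also adding zmm16-31 and eight opmask registers. The table layout and names are only for illustration, and this counts raw register contents, not the exact layout of any OS save area:

```python
# Back-of-envelope sizes of the architectural vector register files in
# 64-bit mode. Counts and widths come from the x86 ISA; the dictionary
# layout and key names are just for illustration.

REGISTER_FILES = {
    # name: (number of registers, bits per register)
    "SSE (xmm0-15)": (16, 128),
    "AVX/AVX2 (ymm0-15)": (16, 256),
    "AVX-512 (zmm0-31)": (32, 512),
    "AVX-512 opmask (k0-7)": (8, 64),
}


def size_in_bytes(count, bits):
    """Total bytes held by `count` registers of `bits` width."""
    return count * bits // 8


for name, (count, bits) in REGISTER_FILES.items():
    print(f"{name}: {size_in_bytes(count, bits)} bytes")

avx512_total = (size_in_bytes(*REGISTER_FILES["AVX-512 (zmm0-31)"])
                + size_in_bytes(*REGISTER_FILES["AVX-512 opmask (k0-7)"]))
print(f"AVX-512 vector + mask state: {avx512_total} bytes per thread")
```

    That works out to 256 bytes for SSE, 512 for AVX, and 2048 + 64 for AVX-512, so the vector state a thread can carry grows roughly fourfold over AVX.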
     
    mbk1969 likes this.
  8. mbk1969

    mbk1969 Ancient Guru

    Messages:
    10,641
    Likes Received:
    7,880
    GPU:
    GF RTX 2070 Super
    I found the answer:
    And before that:
    But still, that's one place in the kernel scheduler. I mean it should not spread across the whole OS kernel source code.
    PS Context saving code should be in HAL (behind HAL?), most probably...
     
  9. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    5,809
    Likes Received:
    2,237
    GPU:
    HIS R9 290
    Not sure - this is a bit beyond my scope of knowledge. All I know is there are people using AVX* software on an AVX-compatible CPU and the instruction wasn't working or being detected. Some distros might restrict userland access to instructions; it definitely isn't a default behavior to do so.
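
    One mechanism that could produce that symptom (an assumption about what those users hit, not something confirmed in this thread): on x86 the OS has to opt in to each extended register set by setting bits in the XCR0 control register, which only ring 0 can write. Until it does, AVX instructions fault even though the CPU supports them, so applications are expected to check both. A sketch of that check, with made-up sample values rather than values read from hardware:

```python
# Sketch of the OS-enablement check applications perform before using AVX.
# Bit positions follow the x86 XCR0 layout; the sample values passed in
# below are made up for illustration, not read from real hardware.

XCR0_SSE = 1 << 1        # XMM state
XCR0_AVX = 1 << 2        # upper halves of YMM0-15
XCR0_OPMASK = 1 << 5     # k0-k7
XCR0_ZMM_HI256 = 1 << 6  # upper halves of ZMM0-15
XCR0_HI16_ZMM = 1 << 7   # ZMM16-31

AVX_STATE = XCR0_SSE | XCR0_AVX
AVX512_STATE = AVX_STATE | XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM


def os_enabled(xcr0, required):
    """True only if every required state bit has been set by the OS."""
    return (xcr0 & required) == required


def usable_level(cpu_has_avx, cpu_has_avx512, xcr0):
    """Combine what the CPU offers with what the OS has switched on."""
    if cpu_has_avx512 and os_enabled(xcr0, AVX512_STATE):
        return "avx512"
    if cpu_has_avx and os_enabled(xcr0, AVX_STATE):
        return "avx"
    return "sse"


print(usable_level(True, True, 0xE7))   # OS manages all state: avx512
print(usable_level(True, True, 0x07))   # OS manages x87/SSE/AVX only: avx
print(usable_level(True, False, 0x03))  # OS never enabled AVX state: sse
```

    In native code the same test is done with CPUID and XGETBV; Python has no direct access to either, which is why the values are passed in here.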
    How is it bad design when the use of it is optional? If you don't like it, you can remove it or just simply not use it. If you don't have AVX, the cryptography will still work, just slower.
    Linux isn't an OS and its kernel isn't like Windows: it's monolithic, so the kernel is responsible for more than the Windows kernel is. Storage is controlled at the kernel level, drivers are controlled at the kernel level, and the drivers can determine if cryptography is used. Therefore, it makes sense for it to run at the kernel level. In the Linux world, FUSE is often frowned upon.
     
  10. Carfax

    Carfax Ancient Guru

    Messages:
    2,913
    Likes Received:
    465
    GPU:
    NVidia Titan Xp
    Are you serious? AVX/AVX2 are definitely not fringe. Lots of games and physics engines use AVX, and lots of encoding/decoding/transcoding software uses AVX2 to great effect. When the next gen consoles launch, AVX2 adoption will be stratospheric since they are both using the Zen 2 core.

    AVX-512 on the other hand is geared towards HPC for the most part so its use in consumer applications is much rarer. That could change in the future though as Intel is bringing AVX-512 support to mainstream parts.
     

  11. Mineria

    Mineria Ancient Guru

    Messages:
    4,308
    Likes Received:
    160
    GPU:
    Asus RTX 2080 Super
    https://en.wikipedia.org/wiki/AVX-512
    Since Linux is also used for latency-sensitive HPC workloads, Linus has no choice but to add support for it. He is probably just venting, since there are quite a number of extensions he had to cover.
     
  12. Mineria

    Mineria Ancient Guru

    Messages:
    4,308
    Likes Received:
    160
    GPU:
    Asus RTX 2080 Super
    Exactly, some of the Assassin's Creed games utilize AVX2, for example, though they had to be patched to allow older CPUs to run them.
    For loads of small sets, and where a reduction in latency matters, AVX2 is the way to go.

    It's not useless for the super computers where it is requested, why do you think Linus even bothers to support it to begin with?
     
    Carfax likes this.
  13. Gomez Addams

    Gomez Addams Member Guru

    Messages:
    177
    Likes Received:
    97
    GPU:
    Titan RTX, 24GB
    What would happen is the AVX512 registers used by the program that was switched out could be overwritten, so when the first program was switched back in as the current context its data would be garbled, giving erroneous results and possibly causing a crash. Actually, that is why registers are saved and restored during a context switch: so that a thread resumes with its registers exactly where they were when it was switched out.

    The way this works is a context is defined by the state of the registers in the CPU. The registers are saved on the stack in a structure known as a context. Generally, a thread has a smaller context than a process does. You can see what is in these structures by looking at a header file in the Windows SDK: WinNT.h. They are defined for a whole bunch of different processor architectures. In fact, all of the ones supported by Windows. It is quite fascinating if you are into that kind of stuff. Each process and thread has its own stack, and the context is saved (pushed) onto it. When a thread is made active the kernel first pops its context off the stack and into the CPU registers and then the CPU executes the instruction whose address is in the instruction pointer register that was just restored. This is the general context switching mechanism used by most CPUs in multitasking operating systems. If the OS does not support multiple tasks or threads then it will never switch contexts. It might deal with interrupt handlers but they don't switch contexts; those usually act like a function call that pushes only the registers it touches on the stack. At least, the ones I have dealt with did.
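
    The save/restore mechanism described above, and the garbling that follows when a register is left out of the saved context, can be shown with a toy model. Everything here (class names, the register names, the dict standing in for the CPU) is hypothetical and only mirrors the idea, not any real kernel's code:

```python
# Toy model of a context switch. The classes, register names, and the
# dict-based "CPU" are hypothetical and only mirror the idea of saving
# and restoring register state per thread.

class ToyCPU:
    def __init__(self):
        self.regs = {"rip": 0, "rax": 0, "zmm0": 0}


class ToyThread:
    def __init__(self, name):
        self.name = name
        self.saved = {}  # stands in for the saved context structure


def context_switch(cpu, old, new, saved_names):
    """Save the listed registers for `old`, then restore them for `new`."""
    old.saved = {r: cpu.regs[r] for r in saved_names}
    for r, value in new.saved.items():
        cpu.regs[r] = value


def run_demo(saved_names):
    cpu, a, b = ToyCPU(), ToyThread("A"), ToyThread("B")
    cpu.regs["zmm0"] = 1111              # thread A loads vector data
    context_switch(cpu, a, b, saved_names)
    cpu.regs["zmm0"] = 2222              # thread B overwrites the register
    context_switch(cpu, b, a, saved_names)
    return cpu.regs["zmm0"]              # what thread A sees on resume


print(run_demo(["rip", "rax", "zmm0"]))  # OS aware of zmm0: prints 1111
print(run_demo(["rip", "rax"]))          # OS unaware of zmm0: prints 2222
```

    In the second run thread A resumes with thread B's data in zmm0, which is exactly the garbling described above.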
     
    HandR, Embra, Kaarme and 2 others like this.
  14. Gomez Addams

    Gomez Addams Member Guru

    Messages:
    177
    Likes Received:
    97
    GPU:
    Titan RTX, 24GB
    I tend to agree with you. AVX2 is rather widespread now. As I wrote, my laptops' CPUs support it and have for several years now. AVX-512 support is coming much more slowly, especially from AMD.
     

  15. Carfax

    Carfax Ancient Guru

    Messages:
    2,913
    Likes Received:
    465
    GPU:
    NVidia Titan Xp
    Besides rendering and physics programs/engines, encoders/decoders, what types of software use AVX2? Probably compression and decompression apps as well if I had to guess.

    As for AMD and AVX-512, I would assume they will support it with Zen 4. AMD tend to be more cautious when it comes to that sort of thing as it's a big investment in die space and power usage. Zen 4 will be using 5nm, which should be much better for an AVX-512 implementation.
     
    Last edited: Jul 13, 2020
  16. Carfax

    Carfax Ancient Guru

    Messages:
    2,913
    Likes Received:
    465
    GPU:
    NVidia Titan Xp
    I don't think any games are using AVX2 right now, but many are using AVX for physics (especially cloth simulation) and other things. AVX2 is going to see a huge increase in adoption with the next gen, and even cross gen titles like Cyberpunk 2077, AC Valhalla etcetera.

    From what I understand, developers were reluctant to really make strong use of AVX because the Jaguar core in the PS4 and Xbox One ran AVX at half speed, i.e. as 2x128-bit operations rather than 1x256-bit. So many of them ended up targeting SSE4 instead of AVX. I might be wrong so someone can fact check me on that, but I remember reading it somewhere on another forum.
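
    A back-of-envelope way to see why a split 256-bit instruction brings little over SSE: count datapath-wide operations for the same amount of data. This is a deliberately crude model that ignores latency, memory bandwidth, decode savings, and clock speed, and the function and parameter names are made up for illustration:

```python
# Back-of-envelope model: how many internal operations a core needs to
# process a float32 array when its execution units are narrower than the
# instruction width. Ignores latency, memory, and clock effects.
import math


def internal_ops(n_floats, instr_bits, datapath_bits):
    """Count datapath-wide operations for n_floats float32 values."""
    floats_per_instr = instr_bits // 32
    instructions = math.ceil(n_floats / floats_per_instr)
    passes_per_instr = math.ceil(instr_bits / datapath_bits)
    return instructions * passes_per_instr


n = 1024
print(internal_ops(n, 128, 128))  # SSE on a 128-bit datapath: 256
print(internal_ops(n, 256, 128))  # AVX split into 2x128-bit halves: 256
print(internal_ops(n, 256, 256))  # AVX on a full 256-bit datapath: 128
```

    Under this model the split implementation does the same amount of work as SSE, and only a full-width datapath halves it.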

    Luckily, Zen 2 runs AVX2 at full speed so developers will have a lot of incentive to use it.
     
  17. mbk1969

    mbk1969 Ancient Guru

    Messages:
    10,641
    Likes Received:
    7,880
    GPU:
    GF RTX 2070 Super
    What do you mean by that? I always thought that Linux is an OS.
     
    patteSatan likes this.
  18. Noisiv

    Noisiv Ancient Guru

    Messages:
    7,779
    Likes Received:
    1,132
    GPU:
    2070 Super
    A kernel.

    GNU/Linux is the OS.
     
    schmidtbag and angelgraves13 like this.
  19. JamesSneed

    JamesSneed Maha Guru

    Messages:
    1,148
    Likes Received:
    491
    GPU:
    GTX 1070
    I never said AVX. AVX2 is where we picked up 256-bit support and AVX-512 is 512-bit support. Anyhow, those have been pretty fringe use cases to date, especially AVX-512. You are right that AVX2 should catch on more, especially with Zen 2 and Zen 3, but right now the list of software is really short. Once Intel can get onto a new node that should help a lot as well, since Intel chips use a ton of power in mixed workloads with AVX instructions in the mix, so there is a pretty large cost offset.
     
