Microsoft Phases Out 32-Bit Support for Windows 10

Discussion in 'Frontpage news' started by Hilbert Hagedoorn, May 14, 2020.

  1. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    That's not how any of this works.

    In fact, on 64-bit processors, using 64-bit code can mean REDUCED performance, as the data size can exceed the cache.
     
    carnivore likes this.
  2. wavetrex

    wavetrex Ancient Guru

    Messages:
    2,465
    Likes Received:
    2,579
    GPU:
    ROG RTX 6090 Ultra
    That's not how it works either.

    32-bit memory management means copying chunks of data left and right all the time, and every 32-bit program runs in a thin-layer virtualized environment. Virtualization costs cycles, much more than running native 64-bit code (for which memory management can also be simpler, as any quantity of memory can be contiguous and represent pointers to actual memory, instead of having to use TLBs (translation buffers), which consume energy and compute power).

    In NO circumstance does a 64-bit program run slower than a 32-bit one on a 64-bit CPU and OS!
     
    Richard Nutman likes this.
  3. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    Yeah, actually, it does.
    Especially when handwriting SSE and using packed data sets.

    And you'd be wrong.
    Under a number of conditions, the same code with the same optimizations can perform worse in a 64-bit binary than in a 32-bit one if inline expansion results in data exceeding the cache (cache misses).

    This becomes less of an issue as the instruction sets and the CPUs offering them get more advanced, but it can still show performance differences even on a 9900K.

    Many developers use x64 binaries as an excuse to skip optimizing and properly managing memory usage. There are games out there now using 10GB of memory that, based on what they actually load, could get by with a third of that (stares blankly at AOE:DE).
     
    Last edited: May 14, 2020
    carnivore likes this.
  4. Richard Nutman

    Richard Nutman Master Guru

    Messages:
    268
    Likes Received:
    121
    GPU:
    Sapphire 7800XT
    Actually it is. 32bit compiled code cannot make use of all the hardware resources in the chip, so it's like running a crippled cpu whenever the OS schedules a 32bit app to execute.
    This means the quantum timeslice it gets is not returned as quickly as it could be, thus making other applications wait longer.

    The main reasons for this are:
    1. In 32bit mode you cannot use the extra 8 registers that were added to x86_64. This means loops with many variables are constantly juggling them to the stack. This results in more instructions, slower code, larger executable and more memory accesses.
    2. In 32bit mode you only get access to half the SSE/AVX registers. Only 8 instead of 16. Again, this results in more memory accesses as values are juggled.
    3. Function calls in 64bit mode pass more variables in registers so they can be significantly quicker, and result in less push/pop instructions around the call.

    It's not just about managing memory, with a 32bit memory space it's quite easy to fragment the virtual memory space such that you cannot make any more allocations, even if you have free physical memory.
    With 64bit virtual memory spaces this is effectively eliminated.
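As a rough illustration of the fragmentation point above, here is a toy Python model. The 2 GiB address space matches the default user space of a 32-bit Windows process, but the allocation pattern (a small block pinned every 64 MiB) and the function name `largest_free_gap` are made-up assumptions for the sketch, not measurements.

```python
# Toy model of virtual address space fragmentation (all sizes in MiB).
# The allocation pattern is an illustrative assumption, not real data.

def largest_free_gap(space_size, allocations):
    """Return (total_free, largest_contiguous_free) for a list of
    (start, length) allocations inside the range [0, space_size)."""
    total_free = 0
    largest = 0
    cursor = 0  # end of the last allocation seen so far
    for start, length in sorted(allocations):
        gap = start - cursor
        if gap > 0:
            total_free += gap
            largest = max(largest, gap)
        cursor = max(cursor, start + length)
    tail = space_size - cursor  # free space after the last allocation
    if tail > 0:
        total_free += tail
        largest = max(largest, tail)
    return total_free, largest


SPACE_32 = 2048  # 2 GiB of user address space
# Hypothetical pattern: a 1 MiB block pinned at the end of every 64 MiB.
pinned = [(start, 1) for start in range(63, SPACE_32, 64)]

free_mib, biggest_mib = largest_free_gap(SPACE_32, pinned)
print(f"free: {free_mib} MiB, largest contiguous block: {biggest_mib} MiB")
```

In this made-up layout almost the whole space is free, yet a single 64 MiB allocation cannot be placed anywhere. With a 64-bit address space the same handful of pinned blocks would leave enormous contiguous ranges untouched.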

    Also there's no reason the size of your data has to increase, unless you're storing lots of pointers, but there are workarounds for that.
    Integer and floating-point data don't change size simply because you're in 64-bit mode. The exception is that on 64-bit Linux a "long" is 64 bits, whereas it's still 32 bits on 64-bit Windows.
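For reference, the type sizes being discussed can be written down as a small table. This is only a sketch: the model names (ILP32, LLP64, LP64) are the conventional labels for 32-bit x86, 64-bit Windows and 64-bit Linux, while the dictionary and the helper `changed_types` are names invented here for illustration.

```python
import ctypes

# Sizes in bytes of basic C types under three common data models.
DATA_MODELS = {
    # 32-bit x86 (Windows and Linux)
    "ILP32": {"int": 4, "long": 4, "float": 4, "double": 8, "pointer": 4},
    # 64-bit Windows
    "LLP64": {"int": 4, "long": 4, "float": 4, "double": 8, "pointer": 8},
    # 64-bit Linux
    "LP64": {"int": 4, "long": 8, "float": 4, "double": 8, "pointer": 8},
}


def changed_types(old_model, new_model):
    """Return the C types whose size differs between two data models."""
    old, new = DATA_MODELS[old_model], DATA_MODELS[new_model]
    return sorted(name for name in old if old[name] != new[name])


print(changed_types("ILP32", "LLP64"))  # ['pointer']
print(changed_types("ILP32", "LP64"))   # ['long', 'pointer']

# Sizes as seen by the interpreter running this script (platform dependent).
print(ctypes.sizeof(ctypes.c_long), ctypes.sizeof(ctypes.c_void_p))
```

Only pointers (and, on Linux, `long`) grow; `int`, `float` and `double` keep their size in every model listed.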

    Writing 32bit code for x86_64 chips is like having a V8 and only running on 4 cylinders. You're throwing performance away for no reason.
     
    Last edited: May 14, 2020

  5. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,044
    Likes Received:
    7,380
    GPU:
    GTX 1080ti
    Highly optimised 32-bit code does not need to use those extra resources, which only serve to slow the processor down and heat it up more when abused.

    Refer to : mismanagement.

    You're not offering valid reasons to use one over the other.
     
  6. Richard Nutman

    Richard Nutman Master Guru

    Messages:
    268
    Likes Received:
    121
    GPU:
    Sapphire 7800XT
    Using more registers does not slow the cpu down or create more heat. It allows the compiler to produce more efficient and simpler/faster code.

    I've just given you 3 reasons why 64bit code is superior. There are several more.
     
  7. Alessio1989

    Alessio1989 Ancient Guru

    Messages:
    2,959
    Likes Received:
    1,246
    GPU:
    .
    Having more registers doesn't automatically mean better performance on the same piece of code. For that kind of code, register renaming is a better solution. WOW64 subsystem overhead is typically negligible; upcasting a couple of pointers for some context switches is not an issue at all. Don't forget that more registers also mean a more complex pipeline, while fewer registers mean fewer bits for the source and destination registers in the binary for the same assembly, which means a shorter binary, which means a smaller code section in the executable, which means a lower probability of cache misses. If you don't need the x64 instruction set at all, 32-bit pointers, as Astyanax pointed out, also mean shorter data records, which means better use of the cache.
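To put a number on the "shorter data records" argument, here is a minimal sketch that sizes a hypothetical linked-list node under both pointer widths. It assumes natural alignment (each field aligned to its own size) and a 64-byte cache line; `struct_size` and the node layout are invented for the example.

```python
def struct_size(field_sizes):
    """Size of a C-like struct under natural alignment: each field is
    aligned to its own size and the total is padded to the largest field."""
    offset = 0
    for size in field_sizes:
        offset = (offset + size - 1) // size * size  # pad up to the field's alignment
        offset += size
    align = max(field_sizes)
    return (offset + align - 1) // align * align     # tail padding


CACHE_LINE = 64  # bytes; assumed typical for current x86 CPUs

# Hypothetical node: a 4-byte integer payload followed by a "next" pointer.
node32 = struct_size([4, 4])  # 32-bit pointer
node64 = struct_size([4, 8])  # 64-bit pointer

print(node32, CACHE_LINE // node32)  # bytes per node, nodes per cache line
print(node64, CACHE_LINE // node64)
```

Under these assumptions the node doubles from 8 to 16 bytes, so half as many fit in a cache line. The effect only matters for pointer-heavy structures; an array of plain integers is the same size in both modes.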
     
    Last edited: May 14, 2020
  8. Richard Nutman

    Richard Nutman Master Guru

    Messages:
    268
    Likes Received:
    121
    GPU:
    Sapphire 7800XT
    You're right it doesn't automatically mean better performance. But complex loops with lots of variables invariably will.
    You don't get access to register renaming, that's something the CPU does internally.
    It's the programming interface that is restricted to 8 registers in 32bit mode. Nothing you can do about this.

    The complexity of the pipeline doesn't change either. It's the same hardware running the code, you're just not using some parts of it.
    Not using some registers doesn't change the size of the executable.

    Here is a link with more detail:
    https://www.viva64.com/en/a/0030/
     
  9. Alessio1989

    Alessio1989 Ancient Guru

    Messages:
    2,959
    Likes Received:
    1,246
    GPU:
    .
    Complex loops are evil. Having more registers will not change the complexity.
    Of course, sir.
    Of course, if we only talk about x86. But in general, having more registers doesn't mean a better architecture or better performance.
    Using shorter pointers will, as will the binary from the assembly.
    Good points, but nothing new for me.

    All this will not change the fact that expecting Microsoft to rewrite every 32-bit executable into 64-bit (whatever they mean by that) is just pointless. What modern software does Microsoft not provide in 64-bit? I am not aware of any. The 32-bit parts of Windows are meant for compatibility and they are meant to stay.
     
  10. alanm

    alanm Ancient Guru

    Messages:
    12,277
    Likes Received:
    4,484
    GPU:
    RTX 4080
    Surprised not sooner. Even Linux distros have dropped 32-bit support already.
     
    anticupidon likes this.

  11. Richard Nutman

    Richard Nutman Master Guru

    Messages:
    268
    Likes Received:
    121
    GPU:
    Sapphire 7800XT
    "Complex loops are evil. Having more register will not change the complexity."

    The logic is the same, but the code that implements it can become simpler.
    If you have 12 variables and only 8 registers, you have to switch variables out to the stack. This isn't done automatically. The compiler has to generate code to do this.
    If you have 16 registers you can hold all the variables in the cpu at once. The result is smaller more efficient code.
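The 12-variables example can be sketched as a tiny register-pressure calculation. This is a simplified model, not how any real compiler allocates registers: the live ranges are hypothetical, the function names are invented here, and the register counts of 8 and 16 are taken from the post (in practice the stack pointer is reserved, so fewer are freely usable).

```python
def max_register_pressure(live_ranges):
    """Largest number of variables live at the same time.
    live_ranges is a list of (first_use, last_use) instruction indices."""
    events = []
    for first, last in live_ranges:
        events.append((first, 1))      # variable becomes live
        events.append((last + 1, -1))  # variable is dead after its last use
    live = peak = 0
    # At equal positions -1 sorts before +1, so a register is freed
    # before it is handed to the next variable.
    for _, delta in sorted(events):
        live += delta
        peak = max(peak, live)
    return peak


def spilled(live_ranges, registers):
    """Number of values that must live on the stack at the worst point."""
    return max(0, max_register_pressure(live_ranges) - registers)


# Hypothetical loop body of 100 instructions using 12 variables throughout.
loop_variables = [(0, 99)] * 12

for mode, registers in (("32-bit", 8), ("64-bit", 16)):
    print(mode, "spills:", spilled(loop_variables, registers))
```

In this model the 32-bit case has to keep 4 values on the stack, which is where the extra load/store instructions come from, while the 64-bit case keeps everything in registers.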

    "Good points but nothing new for me"

    But it clearly explains why 64bit code is faster. Are you saying it's wrong?

    You don't need to rewrite 32bit applications to be 64bit, you just recompile them in 64bit mode. The compiler will generate more efficient code.

    "What modern software does Microsoft not provide on 64-bit? I am not aware"

    Pretty much all their development tools and compilers for one thing. The Visual C++ compiler is a 32bit executable. It is not unheard of that it runs out of memory with large source files.
     
    Last edited: May 15, 2020
  12. Yxskaft

    Yxskaft Maha Guru

    Messages:
    1,495
    Likes Received:
    124
    GPU:
    GTX Titan Sli
    Yeah, I agree. One could argue Microsoft should have longer security support since 32-bit users will be stuck with that build, but the average user really shouldn't be affected by this. 32-bit only hardware is just so old at this point.
    I was already surprised five years ago that Windows 10 has such low requirements. I've never seen any tests for it, but on paper it should work on 2003 hardware.
     
  13. Richard Nutman

    Richard Nutman Master Guru

    Messages:
    268
    Likes Received:
    121
    GPU:
    Sapphire 7800XT
    Here is another example:

    https://godbolt.org/z/2psR3Y

    On the left is some sample matrix multiply code.
    The middle pane is compiled using gcc to 64bit code.
    The right hand pane is compiled using gcc to 32bit code.

    The 64bit code is 134 lines of instructions.
    The 32bit code is 168 lines of instructions. You can see a lot more push/pop instructions.

    The main loop at L24 has 3 more instructions in the 32bit code.
     
    yasamoka likes this.
  14. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,646
    Likes Received:
    13,648
    GPU:
    GF RTX 4070
    Are you sure? Pointers are located mostly in stack memory and in heap memory - they are allocated dynamically, while the code is loaded from file image. And statically allocated data usually is not that big (in good programs). So having longer pointers increases stack and heap usage but not the executed code itself.
     
    Richard Nutman likes this.
  15. Alessio1989

    Alessio1989 Ancient Guru

    Messages:
    2,959
    Likes Received:
    1,246
    GPU:
    .
    You are confusing pointer variables with the memory area they are allocated in. A tiny stack is still better than a bigger stack if the content is the same: fewer cache misses. The only exception I can think of is packed data structures, where you sacrifice data alignment and need to carefully benchmark the trade-off. But x86 doesn't have issues with 32-bit data alignment; on the other hand, SIMD abuse or misuse can result in a lot of wasted space due to data alignment requirements.
    Having 64-bit pointers means needing QWORDs to store them, which means more memory, no matter where they are allocated. The text area is also smaller, and so is the binary.
    CL is already 64-bit, as are the linker, the debugger and almost all developer tools of the Windows SDK. What Microsoft still has not ported to 64-bit is the Visual Studio IDE (but not the tools, they are already 64-bit). Yes, that would be really appreciated, but we were talking about Windows. They do not need to port anything to 64-bit; everything is already 64-bit except some executables for legacy technologies which are defunct and never had a 64-bit version (e.g. very old versions of the DirectX runtimes).
    Good point, sir, but you know that sample is meaningless for real-life code. There will be more critical code parts in real-life executables (where the mov instructions will result in more overhead than a couple of push/pop operations to and from the stack and L2 cache), and if you really need to work with matrices you would not use such naive code.

    But please, everyone, just remember that I never said 64-bit compiled code is generally slower; I simply stated it could be slower. Microsoft is retiring the 32-bit-only version of Windows for consumers, and this is done for good.
     
    Last edited: May 15, 2020

  16. Richard Nutman

    Richard Nutman Master Guru

    Messages:
    268
    Likes Received:
    121
    GPU:
    Sapphire 7800XT
    The alignment requirements for SIMD are the same in 32bit and 64bit. In fact since 64bit guarantees SSE2 which has more unaligned load functions, you get more flexibility with alignment with 64bit code.

    It's only more memory if you're storing loads of them, and the link I gave shows workarounds. You can just use 32bit indexing instead of storing pointers if it results in massive increase in memory. Most of the time pointers reside in registers or local variables and then their size is irrelevant.
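The "32-bit indexing instead of storing pointers" workaround can be sketched as a linked list kept in a pool, where each link is an index into the pool rather than a native pointer. The class below is a hypothetical illustration; it assumes the `array` typecode `"I"` (C unsigned int) is 4 bytes wide, which holds on mainstream platforms.

```python
from array import array


class IndexList:
    """Singly linked list stored in parallel arrays. Links are 32-bit
    indices into the pool instead of native (possibly 64-bit) pointers."""

    NIL = 0xFFFFFFFF  # sentinel index meaning "no next node"

    def __init__(self):
        self.values = array("i")  # signed int payloads
        self.next = array("I")    # unsigned int links (assumed 4 bytes each)
        self.head = self.NIL

    def push_front(self, value):
        """Append a node to the pool and make it the new head."""
        self.values.append(value)
        self.next.append(self.head)
        self.head = len(self.values) - 1

    def __iter__(self):
        index = self.head
        while index != self.NIL:
            yield self.values[index]
            index = self.next[index]


numbers = IndexList()
for n in (1, 2, 3):
    numbers.push_front(n)
print(list(numbers))  # [3, 2, 1]
```

Each link costs 4 bytes regardless of whether the process is 32-bit or 64-bit, at the price of limiting the pool to about four billion nodes and adding one index-to-address step per access.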

    Incorrect. Open task manager, any process with a (32) after it is 32bit code.
    See here:
    https://ibb.co/XknfHwX
    This is one example anyway, there are loads more applications still running in 32bit mode.

    It's an extremely simple example that shows function calls are way more efficient on x64. As code complexity increases, the 64bit app with more registers will handle that complexity more efficiently.
    Mov instructions are extremely cheap, they don't even take time to execute if it's register to register. Besides which there wouldn't be more move instructions anyway.
     
  17. Alessio1989

    Alessio1989 Ancient Guru

    Messages:
    2,959
    Likes Received:
    1,246
    GPU:
    .
    Your IDE is using the 32-bit compiler (inside the Hostx86 folder, I guess) for 64-bit targeting. To use the 64-bit compiler you need to launch it from Hostx64 and then your target platform folder. https://docs.microsoft.com/en-us/cp...-cpp-toolset-on-the-command-line?view=vs-2019
    Yes, that sucks. Visual Studio completely ported to 64-bit would be really nice. At least the debugger is a 64-bit out-of-process program now: https://devblogs.microsoft.com/cppblog/out-of-process-debugger-for-c-in-visual-studio-2019/


    Btw, mov/pop/push should run identically if the data is already in a register or on the stack. A mov from/to main memory is the critical part.

    And yes, SSE2/../4.2 alignment requirements are the same for 32-bit and 64-bit x86 code, but most (all?) x86 compilers will try to generate SIMD code when targeting 64-bit, especially with /O2, while that's not true when targeting 32-bit. Abusing SIMD may result in slower code.

    But again, 64-bit compiled code is generally faster (especially due to better calling conventions and modern compilers), but not always. And again, the WOW64 performance cost is negligible on modern hardware.
     
    Last edited: May 15, 2020
  18. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,646
    Likes Received:
    13,648
    GPU:
    GF RTX 4070
    Variables (stack or heap) are not allocated in the body of executable. Only static variables are. Hence your statement "Using shorter pointer will, as will the binary from the assembly" is wrong.

    Which text area? You mean constant strings stored statically? Why are they smaller if they are stored in the same encoding?
     
  19. Richard Nutman

    Richard Nutman Master Guru

    Messages:
    268
    Likes Received:
    121
    GPU:
    Sapphire 7800XT
    No, I'm targeting 64-bit builds. The compiler is 32-bit but can output code for 32-bit or 64-bit targets.
    It's good that they have a 64-bit compiler, but that's not what is triggered by the VS IDE, it seems.

    No, push and pop have higher latency and lower throughput than MOVs.
    https://www.agner.org/optimize/instruction_tables.pdf

    The point being if the compiler picks the right register for parameters to functions it is not replacing a push and pop with a mov, it doesn't need to do the mov at all!
     
  20. mbk1969

    mbk1969 Ancient Guru

    Messages:
    15,646
    Likes Received:
    13,648
    GPU:
    GF RTX 4070
    Also note that with the .NET Framework, the same .NET binary executable can run in both 32-bit and 64-bit environments. :p
     
