Rumors: ZEN3+ Cancelled and ZEN5 going for a BIG.little architecture

Discussion in 'Frontpage news' started by Hilbert Hagedoorn, Apr 28, 2021.

  1. JamesSneed

    JamesSneed Ancient Guru

    Messages:
    1,691
    Likes Received:
    962
    GPU:
    GTX 1070
    If you look at Apple's M1, this design obviously works for low power. The M1 is an excellent example of making really wide performance cores (way larger than Intel and AMD cores), then using little cores to provide extra threads. Intel's new big desktop CPU (APU), the 11900K, has 6bn transistors and Apple's M1 (4 perf cores + 4 small cores) has 16bn transistors. These transistor densities will make having really powerful cores an option, and it does look like tacking on lower-performance cores is a great option in low-power situations, especially since those devices spend a lot of time doing light tasks like web browsing, checking email, spreadsheets, etc., so you realize lots of power savings by keeping the big cores powered down most of the time.
     
    tunejunky likes this.
  2. tsunami231

    tsunami231 Ancient Guru

    Messages:
    14,750
    Likes Received:
    1,868
    GPU:
    EVGA 1070Ti Black
    I'll buy that for a dollar
     
  3. Gomez Addams

    Gomez Addams Master Guru

    Messages:
    258
    Likes Received:
    164
    GPU:
    RTX 3090
    Getting Windows optimized for "it" requires assistance from Microsoft, and that will be key to the viability of this idea. This would require a fair amount of changes to Windows internals, especially the scheduler. Currently it has no concept of heterogeneous processors. It knows about SMT and the idea of primary and secondary cores, and schedules on primaries first, but this would be considerably different.

    Also, from an architectural point of view, would these "little" cores be fully x86 compatible? I think they would have to be, or the changes to software would be huge. That raises the question: what does a "little" x86 core look like? I would guess that it has no AVX anything, smaller cache(s), and is stripped of other advanced features. In the end, I really don't see that resulting in much saving of power. That is, unless those cores can be clocked at a MUCH lower rate. That could actually work too, since it essentially exists today. There is still the issue of Windows knowing which cores are of which variety, and that requires a lot of changes to software.

    Then the question is what processes and/or threads will be scheduled for the little cores? Will that be done automatically (by the OS) or will developers have to flag processes and threads for their processing requirements? Doing this automatically could be difficult if the cores are really that stripped down. I think a much more feasible approach to this is what I mentioned earlier: vary the clock rates, but do it more drastically. One approach is to schedule threads that execute at low priority on cores that can be clocked very, very slowly. If the clock can be slowed by a factor of 8, or more, that could save a LOT of power. This could be done with no changes to any application code and virtually no changes to processor architecture, since they would still have homogeneous cores. I like that idea better than mixed B-L cores.
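
    For a rough sense of what a clock divisor of 8 could buy, here is a minimal sketch using the textbook CMOS switching-power approximation, P = C * V^2 * f. The function names and the voltage scaling value are illustrative assumptions, not measurements from any real CPU, and leakage (static) power is ignored entirely.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Textbook CMOS switching-power approximation: P = C * V^2 * f."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


def relative_dynamic_power(clock_divisor, voltage_scale=1.0):
    """Dynamic power relative to baseline (1.0 = unchanged).

    clock_divisor: how many times slower the core is clocked (assumed > 0).
    voltage_scale: supply voltage as a fraction of baseline (hypothetical
                   value; real voltage/frequency curves are chip specific).
    """
    if clock_divisor <= 0:
        raise ValueError("clock_divisor must be positive")
    return voltage_scale ** 2 / clock_divisor


# Clock slowed by a factor of 8 at the same voltage.
same_voltage = relative_dynamic_power(8)
# Same divisor, assuming the voltage could also be halved (an assumption).
lower_voltage = relative_dynamic_power(8, voltage_scale=0.5)
```

    Under this model the dynamic power drops at least in proportion to the clock, and faster still if the voltage can come down with it. Work also takes longer at the lower clock, so this only pays off for threads that don't care when they finish.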
     
  4. Denial

    Denial Ancient Guru

    Messages:
    14,207
    Likes Received:
    4,121
    GPU:
    EVGA RTX 3080
    Intel is already shipping heterogeneous processors with Lakefield, so I'm not sure why you think Windows has no concept of them. It's not known if they'd be x86 or not, but I don't see why they would be: the Windows ARM translation layer is extremely efficient, and ARM is architected from experience with power efficiency. If your goal for the small cores was specifically power, you'd definitely go ARM.
     

  5. Neo Cyrus

    Neo Cyrus Ancient Guru

    Messages:
    10,793
    Likes Received:
    1,396
    GPU:
    黃仁勳 stole my 4090
    What a bummer... I was hoping for a Zen 3+ chip as an upgrade. I do not look forward to what BS the triumvirate of scammers (Micron, Samdung, and SK Hynix) pull off to screw us over on DDR5 prices making Zen 4 upgrades priced out of range for most people.

    I think my 3900X is at its limit, I got a real stinker as far as OCing it goes, aside from the Infinity Fabric which I have 100% stable at 1900MHz which is apparently not common. Syncing the Fabric/memory controller/memory at 1.9GHz really matters for game performance in the tests I've done... so I guess overall I got a good one??

    Syncing the clocks at 1.9GHz mitigates some of the latency issues that cause performance loss in games, but it would have still been a nice upgrade getting a Zen 3+ chip considering I'm one of the 3.5 people on the face of the earth who managed to get an RTX 3080 before scalpers did.
     
    Last edited: Apr 29, 2021
    tunejunky likes this.
  6. tunejunky

    tunejunky Ancient Guru

    Messages:
    4,455
    Likes Received:
    3,077
    GPU:
    7900xtx/7900xt


    beat me to it.
    loving my macbook pro, wishing it had a fire-breathing gpu (cuz my need for nerdiness)
     
    JamesSneed and Kevin Mauro like this.
  7. EspHack

    EspHack Ancient Guru

    Messages:
    2,799
    Likes Received:
    188
    GPU:
    ATI/HD5770/1GB
    isn't simple multithreading already too much work for most devs? how would adding asymmetric multithreading help things? at least for complex single tasks like games. of course stuff like cinebench won't give a damn if it's a pentium glued to a jaguar core, but that's hardly of any use on desk/ws
     
  8. PrMinisterGR

    PrMinisterGR Ancient Guru

    Messages:
    8,129
    Likes Received:
    971
    GPU:
    Inno3D RTX 3090
    big.LITTLE would be perfect if Windows managed to put the computer in a suspend state where it could still use networking and I/O in general, so you could download things or do simple file server stuff with the computer basically off. I also bet it would be great to assign all the background Windows stuff to your LITTLE cores and let the big ones do the actual work, without any thread jumping between tiles/CCD/CCX.
     
    Fox2232 and Kevin Mauro like this.
  9. Kevin Mauro

    Kevin Mauro Master Guru

    Messages:
    325
    Likes Received:
    88
    GPU:
    RTX 2070 Super FTW3
    (we can imagine ASUS being very happy with that product codename)

    - lol someone in Marketing earned their salary :)

    I wouldn't be surprised if they introduced something referred to as a "rest mode" or something along those lines (the phrasing can change to whatever suits it best) to encompass what you're talking about.

    Sleep and Hibernate may get reworked a bit, or perhaps combined into more of a hybrid feature instead of two entirely separate options, alongside a lower-clocked mode with some basic core functions that offers the aforementioned.

    I don't want to assume (for all I know by "fire breathing" you may mean along the lines of an RTX Quadro 5000 or an RTX 3070 / 3080 laptop gpu) however - last I checked the MacBook Pro 16" Radeon Pro 5600M was relatively on par with the RTX 2060 Max-Q.
     
    Last edited: Apr 29, 2021
    tunejunky likes this.
  10. tunejunky

    tunejunky Ancient Guru

    Messages:
    4,455
    Likes Received:
    3,077
    GPU:
    7900xtx/7900xt
    :) all my prev laptops have had an xx70 at minimum
     
    Kevin Mauro likes this.

  11. fellix

    fellix Master Guru

    Messages:
    252
    Likes Received:
    87
    GPU:
    MSI RTX 4080
    The Windows scheduler needs to be updated to handle thread dispatch and assignment in a heterogeneous environment. A new system API would also be a good idea, so applications can query for low-power cores. There are many lightweight threads running in the background that don't really benefit from the powerful out-of-order logic or wide vector units in the power-hungry cores. Those tasks can be served just as well by Jaguar/Atom derivatives, saving not only power (leaving more for the big cores) but also die space. This is how the M1 Macs balance the power budget and achieve high performance in a full-blown, but still compact SoC layout.
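
    To make the idea concrete, here is a toy sketch of that kind of dispatch. Everything in it is hypothetical: `Core`, `Task`, `query_cores` and `dispatch` are made-up names for illustration, not a real Windows or OS API, and a real scheduler would weigh load, cache locality and priority rather than simply going round-robin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    core_id: int
    kind: str  # "big" or "little" (hypothetical labels)


@dataclass(frozen=True)
class Task:
    name: str
    background: bool  # True for lightweight background work


def query_cores(cores, kind):
    """Stand-in for a 'which cores are low-power?' query API."""
    return [core for core in cores if core.kind == kind]


def dispatch(tasks, cores):
    """Map task name -> core id, round-robin within each core pool.

    Background tasks prefer little cores and everything else prefers big
    cores; if the preferred pool is empty, the other pool is used.
    """
    little = query_cores(cores, "little")
    big = query_cores(cores, "big")
    if not little and not big:
        raise ValueError("no big or little cores available")

    next_slot = {"little": 0, "big": 0}
    placement = {}
    for task in tasks:
        if task.background:
            kind = "little" if little else "big"
        else:
            kind = "big" if big else "little"
        pool = little if kind == "little" else big
        placement[task.name] = pool[next_slot[kind] % len(pool)].core_id
        next_slot[kind] += 1
    return placement


cores = [Core(0, "big"), Core(1, "big"), Core(2, "little"), Core(3, "little")]
tasks = [
    Task("game", background=False),
    Task("indexer", background=True),
    Task("render", background=False),
    Task("updater", background=True),
    Task("sync", background=True),
]
placement = dispatch(tasks, cores)
```

    Here the application (or the OS on its behalf) only has to say whether a task is background work; the placement onto big or little cores follows from that single flag.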
     
    Kevin Mauro likes this.
  12. Kevin Mauro

    Kevin Mauro Master Guru

    Messages:
    325
    Likes Received:
    88
    GPU:
    RTX 2070 Super FTW3
    Gotcha
    Any cues that could be taken from Windows 10 for ARM?
     
  13. Denial

    Denial Ancient Guru

    Messages:
    14,207
    Likes Received:
    4,121
    GPU:
    EVGA RTX 3080
    Lakefield?
     
  14. Astyanax

    Astyanax Ancient Guru

    Messages:
    17,040
    Likes Received:
    7,381
    GPU:
    GTX 1080ti
    this is already possible.
     
  15. TLD LARS

    TLD LARS Master Guru

    Messages:
    780
    Likes Received:
    366
    GPU:
    AMD 6900XT
    I don't trust Windows or other software to be able to direct workload to the correct cores.

    My brother's old AMD A10 CPU is detected as a 2-core, 4-thread CPU in Windows, so software that requires at least 4 cores will not start and the workload is favored to the first 2 cores, even though the CPU is a 4-core, 4-thread CPU.
    If they cannot figure out how a 6-year-old CPU works, I have little faith in software working on an 8-core, 16-thread CPU with 8 small cores attached to it.
    When does work get transferred between cores? The cache and memory handover is going to be a mess; people already complained about the AMD chiplet work transfer latency.

    My overclocked 1700 is already down to 0.5-1.5W per core at idle. I see no reason for ARM-like cores at 0.2W on desktop when memory, SSD, chipset, USB headset, watercooling pumps, LEDs and even fans already use more power than my 1700 cores.

    The power supply for the ARM part needs to be separate, because a 100-200W capable VRM is going to have too much loss when running the ARM part at 0.1-2W.
     

  16. Fox2232

    Fox2232 Guest

    Messages:
    11,808
    Likes Received:
    3,371
    GPU:
    6900XT+AW@240Hz
    Yet, some Atoms are marked as 4C/4T even while they internally behave like 2C/4T :D
     
  17. cucaulay malkin

    cucaulay malkin Ancient Guru

    Messages:
    9,236
    Likes Received:
    5,208
    GPU:
    AD102/Navi21
    quoted for truth
    but 6/12 is doing very well atm. just requires a lot from the cpu in most open world games. it's not uncommon to see them at 75-85% package usage, with individual cores spiking over 90%
     
    Last edited: Apr 30, 2021
  18. Venix

    Venix Ancient Guru

    Messages:
    3,473
    Likes Received:
    1,972
    GPU:
    Rtx 4070 super
    On laptops this makes sense. As for the optimal configuration (4 8 with no HT, or with HT, or 8 16, etc.), I have no clue which is the optimal one. If I had to bet, the optimal for now would be at least 8 big cores and 4 or 8 little ones, mostly because the software will need time to catch up. I was way more sceptical, but the M1's performance on video editing considering the power consumption is astonishing. Part of that fast performance is that Apple has a locked ecosystem and put in the work to optimise it, so first impressions are positive. In the PC market I have no faith that things will work so well from the get-go. I believe it will take at least a few transitional generations. My 2 cents at least, and I would be happy to be completely wrong and, when those things come out, find my jaw smacking the floor from the surprise!
     
    Fox2232 and Kevin Mauro like this.
  19. Kevin Mauro

    Kevin Mauro Master Guru

    Messages:
    325
    Likes Received:
    88
    GPU:
    RTX 2070 Super FTW3
    Well, to be fair, the cause of this lay not entirely with Microsoft at that time but also with that architecture.
    No, you're right. I'm sure someone lacking in humility will pipe up here and show their outrage for whatever reason but it won't matter. You explained it well.
     
    tunejunky and Venix like this.
  20. Loobyluggs

    Loobyluggs Ancient Guru

    Messages:
    5,242
    Likes Received:
    1,605
    GPU:
    RTX 3060 12GB
