AMD Working on 16-Core Processor with Integrated PCI Express 3.0 Controller

Discussion in 'Frontpage news' started by Hilbert Hagedoorn, Jan 17, 2014.

  1. Hilbert Hagedoorn

    Hilbert Hagedoorn Don Vito Corleone Staff Member

    Messages:
    29,578
    Likes Received:
    33
    GPU:
    AMD | NVIDIA
  2. thatguy91

    thatguy91 Ancient Guru

    Messages:
    6,456
    Likes Received:
    4
    GPU:
    XFX RX 480 RS 4 GB
    It would be good if available for desktop and not costing an arm, leg, and left teste. I'm guessing Excavator based? Since it would make sense to be on 20 nm or even 16/14 nm.
     
  3. BLEH!

    BLEH! Ancient Guru

    Messages:
    5,631
    Likes Received:
    0
    GPU:
    Sapphire Fury
    Assuming this is gonna be a monolithic die for server applications, though if they did export to desktop... :D
     
  4. thatguy91

    thatguy91 Ancient Guru

    Messages:
    6,456
    Likes Received:
    4
    GPU:
    XFX RX 480 RS 4 GB
    By the time they integrate PCI-E 3.0, Intel will be on 4.0 lol.
     

  5. BLEH!

    BLEH! Ancient Guru

    Messages:
    5,631
    Likes Received:
    0
    GPU:
    Sapphire Fury
    I doubt that :p
     
  6. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    1,920
    Likes Received:
    0
    GPU:
    HIS R9 290
    I would hope so. As far as I'm aware, we haven't even saturated 2.0 yet. Aside from some SSDs, I don't really understand the point of releasing PCIe 3.0.

    AMD already has several 16 core Opterons. I'm guessing their current generation is Steamroller based, and I figure this new 16 core will also only be an Opteron. AMD has stated before they're not targeting the high-end/enthusiast desktop PC market anymore, and that's exactly what a 16 core would be classified as. I don't see a reason for a 16 core entering the desktop market anyway - most people still can't put an i7 to good use. But Opterons are still relatively cheap. You could probably make a pretty good desktop computer out of an Opteron system, as long as you accept that the motherboard you get will likely lack Crossfire/SLI support, built-in audio, and a slew of USB ports.
     
  7. CPC_RedDawn

    CPC_RedDawn Ancient Guru

    Messages:
    7,376
    Likes Received:
    1
    GPU:
    MSI GTX1080 +110/+500 H20
  8. BLEH!

    BLEH! Ancient Guru

    Messages:
    5,631
    Likes Received:
    0
    GPU:
    Sapphire Fury
    Exactly. The current 16-core Opterons are a bit like the Pentium-D and Core2-Quads, though: dual 8-core Bulldozer/Piledriver dies on one chip, so having this as a monolithic CPU would be a big step up for AMD. That said, looking at Kaveri, the new 28 nm process has really shrunk the die size down quite a bit from what you'd expect, so assuming it'd be on the same process is reasonable. We might even see something that can compete with Haswell-E's 8-core if we're lucky. If I were aiming for a cheap server, though, it would be AMD based; the Intel Xeons are horrendously expensive, Opterons not so much.
     
  9. Tugrul_512bit

    Tugrul_512bit Member Guru

    Messages:
    114
    Likes Received:
    0
    GPU:
    msi_r7870hawk_asus_r7_240
    Mantle + 16 core gaming can be good. Crop some L3, add 4 more cores (which means more L1, and even L2), add two more channels for memory, add some pipeline depth, decrease frequency and increase efficiency, increase working temperature so it works okay even over 75°C, make L2 or L3 caches addressable by APIs like CUDA and OpenCL... so better game physics; all of these could make it even better.
     
    Last edited: Jan 17, 2014
  10. GhostXL

    GhostXL Ancient Guru

    Messages:
    5,976
    Likes Received:
    0
    GPU:
    GTX 1080 SLI @ 2.025Ghz
    I think you mean releasing PCIe 4.0. We've had 3.0 for a few years now, and I'm running 3.0 x8/x8 in SLI now. Intel has plans on releasing 4.0 this year or next. I thought I read somewhere by 2015 at the latest.

    I saw 780s running in an x8/x8 2.0 rig, and they were slower in benches and gaming.

    4.0 may be needed for the super enthusiasts looking to push all that bandwidth. There are already bigger and badder cards than Maxwell planned. I myself am not upgrading though.
     
    Last edited: Jan 17, 2014

  11. poornaprakash

    poornaprakash Active Member

    Messages:
    63
    Likes Received:
    0
    GPU:
    AMD/Nvidia
    PCI Express 4??

    There is still no graphics card that can bottleneck a PCI-E 2.0 x16 slot, so why would you need v4?? :puke2: I think you don't have any idea about PCI-E and its bandwidth...
     
  12. k3vst3r

    k3vst3r Ancient Guru

    Messages:
    3,309
    Likes Received:
    0
    GPU:
    Tri-fire 290x Qnix 120Hz
    Apparently Skylake is getting PCI-E 4.0.

    Since AMD has removed the CF bridge for R9 and made them bridgeless, apparently they can now saturate PCI-E 3.0 at x16, since the cards talk over PCI-E now.
     
  13. AcidSnow

    AcidSnow Master Guru

    Messages:
    358
    Likes Received:
    0
    GPU:
    VisionTek R9 290X
    I remember reading some benches a year ago about PCI-e 2.0 vs 3.0, and if memory serves me right, there was only a 3% degradation when using 2.0 (instead of 3.0).

    ...Things might change this year, but I won't be upgrading anything because I expect Mantle to supplement my CPU well enough that I'll be able to ride my i7 920 (& R9 290) for two more years :)
     
  14. sykozis

    sykozis Ancient Guru

    Messages:
    19,973
    Likes Received:
    1
    GPU:
    XFX RX 470
    Ummmm...huh? L1, L2 and L3 aren't addressable by APIs because it would cause problems for the processor.

    Increasing pipeline depth would be very bad. Decreasing frequency, while increasing pipeline depth would be suicide for AMD. To increase efficiency, you have to shorten the pipeline.

    AMD can't do anything that affects CUDA because they have no rights to it...also, allowing CUDA to access cache, wouldn't improve PhysX in the least as the system would be too unstable to be usable.
     
  15. -Tj-

    -Tj- Ancient Guru

    Messages:
    13,831
    Likes Received:
    3
    GPU:
    ZOTAC GX980Ti Amp!Extreme
    Last edited: Jan 17, 2014

  16. Tugrul_512bit

    Tugrul_512bit Member Guru

    Messages:
    114
    Likes Received:
    0
    GPU:
    msi_r7870hawk_asus_r7_240
    Why can't increasing the pipeline depth increase instructions per cycle?

    How can we increase the total performance of a CPU then? Increasing clock frequency versus increasing IPC: which one is more future-proof? Which one is more efficient in terms of "instructions per Joule"?
     
    Last edited: Jan 17, 2014
  17. sykozis

    sykozis Ancient Guru

    Messages:
    19,973
    Likes Received:
    1
    GPU:
    XFX RX 470
    The longer the pipeline, the longer it takes for an instruction to complete, thus reducing performance (and efficiency)

    The long pipeline was among the drawbacks of Intel's NetBurst architecture. With Conroe, Intel drastically reduced the pipeline. Shorter pipeline results in instructions completing faster. Shorter pipelines are more efficient. The shorter pipeline of the Athlon series processors is part of the reason that they were just as fast, at lower clock speeds, as the Pentium 4 and Pentium-D processors.
     
  18. Tugrul_512bit

    Tugrul_512bit Member Guru

    Messages:
    114
    Likes Received:
    0
    GPU:
    msi_r7870hawk_asus_r7_240
    So a longer pipeline increases pipeline latency, and this leads to fewer instructions per second (as long as the instruction issue/fetch rate remains the same)?

    Then it is like:

    short pipeline(tripartitioned, single issue):

    1 instruction = 3 cycles ----> 1 instruction per 3 cycles inefficient

    2 instructions = 4 cycles ----> 1 instruction per 2 cycles...ok

    3 instructions = 5 cycles ----> 3/5 better

    4 instructions = 6 cycles ----> 2/3 even better but low probability

    5 instructions = 7 cycles ----> 5 /7 best but very hard to maintain?


    long pipeline(tenfold, single issue):

    1 instruction = 30 cycles ----> 1/30 yes very slow

    2 instructions = 31 cycles -----> 2/31 nearly double of the first one

    3 instructions = 32 cycles -----> 3/32 ---> cycles hardly increase but instructions increase faster

    ...
    ...

    10 instructions = 39 cycles ----> nearly 1 instruction per 4 cycle

    long pipeline(tenfold, 20 issued):

    20 instructions = 49 cycles ----> 2/5 very good from the beginning

    40 instructions = 69 cycles -----> 4/7 even better

    60 instructions = 89 cycles -----> 2 instructions per 3 cycles

    You are right. But faster issuing can help, can't it? By "issue" I meant instruction fetching.
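
    The tables above all follow one rule: the first instruction needs as many cycles as there are stages, and every later instruction finishes one cycle after the one before it. A minimal sketch of that rule, assuming an ideal single-issue pipeline with no stalls or flushes (the function names are made up for illustration):

```python
def pipeline_cycles(depth: int, instructions: int) -> int:
    """Cycles to finish `instructions` on an ideal single-issue pipeline.

    The first instruction takes `depth` cycles to pass through every stage;
    each later one completes one cycle after its predecessor.
    Assumption: no stalls, no flushes, one instruction enters per cycle.
    """
    if depth < 1 or instructions < 1:
        raise ValueError("depth and instructions must be positive")
    return depth + instructions - 1


def throughput(depth: int, instructions: int) -> float:
    """Average instructions completed per cycle over the whole run."""
    return instructions / pipeline_cycles(depth, instructions)


if __name__ == "__main__":
    # 3 stages matches the short pipeline above, 30 stages the long one.
    for depth in (3, 30):
        for n in (1, 5, 10, 60, 10_000):
            cycles = pipeline_cycles(depth, n)
            print(f"depth={depth:2d} n={n:5d} cycles={cycles:5d} "
                  f"IPC={throughput(depth, n):.3f}")
```

    In this idealized model both depths approach one instruction per cycle on long runs, so depth only costs fill time. It does not model the refill after a flush (a mispredicted branch, for example), which is where a long pipeline pays that fill cost again and again.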
     
  19. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    1,920
    Likes Received:
    0
    GPU:
    HIS R9 290
    For all of you not understanding how bandwidth on PCI-e works, take your 3.0 port and a high-end 3.0 GPU, drop it from 16 lanes down to 8 and run benchmarks between both. You likely won't see a difference (maybe 1 or 2 FPS). Now, drop it down to 4 lanes. You might lose a few FPS here and there, but the game should still be playable.

    The only reason for increasing bandwidth per lane is for the PCI-e x1 devices, such as TV tuners, SSDs, Thunderbolt cards, or USB 3.x cards. Otherwise, we don't even need the bandwidth of 3.0 for modern GPUs. Assuming PCIe 4.0 continues the trend of doubling bandwidth, one lane will be as fast as 8 lanes from the first generation, which is good enough to run most mid-range GPUs. It won't be long until something like the Titan can run off an x1 slot.
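
    The "one 4.0 lane equals eight 1.0 lanes" figure can be checked with rough arithmetic. A minimal sketch, assuming the Gen 4 rate doubles Gen 3 with the same encoding (as the post supposes) and ignoring packet and protocol overhead; the table and function names are made up for illustration:

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency per generation.
# The Gen 4 row is an assumption: Gen 3's rate doubled, same 128b/130b encoding.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),  # assumed doubling of Gen 3
}


def link_bandwidth_mb_s(generation: int, lanes: int) -> float:
    """Approximate one-direction bandwidth of a link in MB/s."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # GT/s * efficiency = usable Gbit/s per lane; / 8 = GB/s; * 1000 = MB/s
    return gt_per_s * efficiency / 8 * 1000 * lanes


if __name__ == "__main__":
    for gen in sorted(PCIE_GENERATIONS):
        for lanes in (1, 4, 8, 16):
            print(f"PCIe {gen}.0 x{lanes:<2d} "
                  f"~{link_bandwidth_mb_s(gen, lanes):8.0f} MB/s")
```

    Under these assumptions a single Gen 4 lane comes out just under a Gen 1 x8 link, because the move to 128b/130b encoding in Gen 3 was slightly less than a doubling of the signalling rate.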
     
  20. vbetts

    vbetts Don Vincenzo Staff Member

    Messages:
    13,126
    Likes Received:
    1
    GPU:
    Nvidia Geforce GTX 960M
    16 cores is definitely nice, but AMD needs to focus on single-threaded performance. I'll give AMD this though, their server CPUs are crazy!
     
