Review: Core i9 10980XE processor (Intel 18-core galore)

Discussion in 'Frontpage news' started by Hilbert Hagedoorn, Nov 25, 2019.

  1. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    4,573
    Likes Received:
    1,430
    GPU:
    HIS R9 290
    So... what exactly are you arguing about at this point? Because retaining software compatibility has been my entire argument this whole time.
    I'm aware emulation can be done with decent performance. Decent emulation is why I pointed out Rosetta (Apple's method of getting PPC programs to work on Intel Macs). So, what exactly is your point here? Or is it related to rdrand being emulated? Because that doesn't really have much of an impact on CPU load - the point of rdrand was to give "better" random numbers.
Ivy Bridge is from 2012... 7 years is a LONG time for computers, even with Intel slowing R&D thanks to Bulldozer. rdrand in particular has been supported by compilers since at least December 2012, so it's definitely not new anymore. And that goes back to my original point: this is just one of dozens of instructions that nobody bothers to use. You can keep adding all the instructions you want, but performance won't improve until devs use them in their software.
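    For illustration, a minimal sketch of how rdrand is typically reached from C via the compiler intrinsic (assuming GCC or Clang on an Ivy Bridge or later CPU; compile with -mrdrnd; the messages are just for the example):

```c
// Sketch: invoking RDRAND through the compiler intrinsic.
// Assumes x86-64 with RDRAND support (Ivy Bridge or later).
// Build: gcc -mrdrnd rdrand_demo.c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    unsigned long long value;
    // _rdrand64_step returns 1 on success, 0 if no random value
    // was available (the instruction can transiently fail, so
    // callers are expected to retry).
    if (_rdrand64_step(&value))
        printf("hardware random value: %llu\n", value);
    else
        printf("RDRAND had no value ready; retry\n");
    return 0;
}
```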
And yet, there are so many applications that could use AVX (just in general, not even 512) or other nice modern instructions like the SSE4 family, but they don't. Take a look at benchmarks of Clear Linux to see how much devs are slacking (that whole distro is optimized to take advantage of various instructions). The amount of untapped performance you can get in a modern Intel or AMD CPU is absolutely insane. But how do you get devs to take this stuff seriously? Until they do, any additional instructions added to CPUs are wasted effort.
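    To make the Clear Linux point concrete: a lot of that untapped performance is just a matter of which instruction sets the compiler is allowed to target. This is not Clear Linux's actual build recipe, just a made-up illustration of how much the target flags matter for the exact same source:

```c
// The same scalar loop, built two ways:
//   gcc -O2 -c sum.c                          (baseline x86-64, SSE2)
//   gcc -O3 -march=skylake-avx512 -c sum.c    (may auto-vectorize
//                                              with AVX-512)
// The source is unchanged; only the flags differ.
#include <stddef.h>

long long sum(const int *a, size_t n) {
    long long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}
```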
    Yes, maintaining it isn't that difficult. Preventing it from regressing without people noticing is. That's why x86 is heading toward a dead-end. There's no clear path to improve it:
    * You can't depend on devs to use new instructions
    * You can't make people's existing software run slower; modern optimizations can often cause this
    * You can't break software compatibility without pissing people off
    * We're near the limits of silicon transistors
    I don't understand why you keep pointing that out. It doesn't change the underlying point. Despite how drastically different the execution pipeline is between modern Intel and AMD CPUs, they still perform roughly the same. You could also say the same about Athlon II and Core2. Intel could revamp the entire pipeline, but because of trying to retain x86 software compatibility, it isn't going to get a lot faster.
    Makes sense.
     
  2. user1

    user1 Maha Guru

    Messages:
    1,441
    Likes Received:
    472
    GPU:
    hd 6870
I point it out because it is important. The point is that if Intel wanted to put an ARM-like core under the hood, they could; the frontend (x86) isn't as burdensome as you say.

You mistake market stagnation for a lack of progress on x86. Preventing the regression of certain instructions isn't a huge issue, and hasn't been for a long time; CPUs these days aren't nearly as picky about which instructions you use as they were in the past. New instructions are added for the people who need them; they aren't needed for good general performance. Really, very few applications need much more than SSE2/SSE3 to get good utilization out of a modern CPU. Intel has the time to tune Clear Linux; most people don't. Code readability, modularity, and time are simply more important than pure performance. If you want that 10-20% extra performance you can always use Intel's compiler, which enables all of the unsafe optimizations, and you can enjoy all of the debugging that comes with them.
    Fundamentally, fixing sh** code isn't Intel's or AMD's job, so I'm not sure what you're on about with "untapped potential". Pretty much nobody has the time to learn every quirk and every keyboard shortcut to fully optimize for a CPU.

On the other hand, finding CPU bugs in the execution units and other core structures has been a much bigger problem to deal with.

Edit: also, don't assume people update their compilers, because they quite often don't. It is not surprising if an application runs slower because people like using GCC 4.
     
    Last edited: Nov 30, 2019
  3. Andrew LB

    Andrew LB Maha Guru

    Messages:
    1,062
    Likes Received:
    140
    GPU:
    EVGA GTX 1080@2,025
Try not to take it too personally, Hilbert. For the past couple of years this kind of attitude has been running rampant in American politics: you simply can't say anything favorable about anyone they disagree with, or they'll come at you with all kinds of outlandish accusations until they're the only one making noise. I'm a bit surprised to see it showing up here though.
     
  4. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    4,573
    Likes Received:
    1,430
    GPU:
    HIS R9 290
    Again... doesn't matter what the underlying architecture is. That has very little to do with what I'm talking about. What matters is software compatibility. And there's only so much you can do to improve performance for existing programs.
No I don't, because I don't think the market has stagnated that much. As for whether an instruction is "needed", that's relative. Frankly, the vast majority of instructions aren't needed; you're just a lot better off using them where possible. Again, look at Clear Linux benchmarks and you'll find SSE2/3 does not yield good utilization compared to what could be done; it's the bare minimum nowadays if you want to write "optimized" software. Even ARM has an SSE2 equivalent (NEON).
    If Intel has the ability to tune hundreds of different packages that they have no obligation to tune, then yes, devs do in fact have the time to do it themselves. Regardless, I don't see why you'd point that out, because that only favors my point: why bother adding instructions if nobody has the time to use them? If code legibility is that much of a problem then those instructions are not a good solution.
Well in that regard, 90% of software out there is sh** code. The responsibility of Intel and AMD is to release a product that shows progress, whether that be better efficiency or better performance. If their idea of showing progress is to implement features that nobody is using, then they're not doing their job. I don't care about theoretical performance. I don't care about the dozen or so real-world applications that use AVX. I want something that actually makes a broad difference in performance. And that brings us full circle to my original point: there's barely any room left for that. There's not really any way to make existing software run a whole lot faster than it does now. Intel could make a fresh start, but it's not going to be revolutionary.
    In concept, adding instructions would make all the difference, but in practice it's not the answer.
     

  5. user1

    user1 Maha Guru

    Messages:
    1,441
    Likes Received:
    472
    GPU:
    hd 6870
It's Intel's job to make their CPUs look good, and they have something like 100k employees; they can afford a couple hundred people to work on optimizing their little projects. Most projects don't have that kind of manpower.
    People who work on software for a living do not in fact have unlimited time; they have to meet deadlines, and if a piece of software gets them 80% of the performance in half the time, that's what's favorable. People do use new instructions if they need them; there is more to the world than the desktop PC, and lots of very important software does in fact use new instructions when they become available. If you do not see the purpose of an instruction, it isn't for you! It's that simple.
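    And opting in can be cheap. A minimal sketch using GCC's function multi-versioning (my own made-up function; assumes GCC 6+ on x86 with glibc's ifunc support): the compiler builds a baseline version and an AVX2 version of the same loop, and the right one is picked automatically at load time.

```c
// Hypothetical example: one source function, two generated code
// paths. GCC emits a default (SSE2 baseline) clone and an AVX2
// clone; the dynamic loader selects one for the running CPU.
#include <stddef.h>

__attribute__((target_clones("avx2", "default")))
void scale(float *a, size_t n, float factor) {
    for (size_t i = 0; i < n; i++)
        a[i] *= factor;   // auto-vectorized per clone's target
}
```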

I don't know where you get this notion that CPUs don't get performance uplifts for older instructions, because they do. Extensions like SSE and AVX are added because they are useful for some things, not all things; they are not always better than the existing instructions, which is why they are Extensions. And yes, SSE2/3 is enough. You might like the fact that Intel would rice their text editor with AVX acceleration to get 10% more performance, but it's completely superfluous. Vector acceleration is not best for all applications, and other uarches do the same thing. Expecting everything to be vectorized and hyper-optimized isn't realistic; there is just too much code to deal with. For many very large projects, reliability and readability are simply more important.
    There are plenty of ways to make existing software run faster without adding new instructions. Again, instructions tell the CPU what to do; they are not directly executed on x86 anymore. x86 has a lot of instructions that can do the same things, which is why they are simplified into micro-ops, or combined. In essence, after the decoder you may not even be able to tell the difference between different instructions used for the same purpose, because they are executed in exactly the same way.
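    A classic illustration of that gap between what you write and what the core actually does (a sketch using GCC/Clang inline assembly on x86-64; the functions are made up for the example):

```c
// Two different instruction sequences, one effect: zero a register.
// Modern cores recognize the XOR form as a "zeroing idiom" and
// resolve it at register rename, often without occupying an
// execution unit at all - so the instruction you write is not
// necessarily what "runs".
static inline unsigned zero_with_mov(void) {
    unsigned r;
    __asm__ volatile("movl $0, %0" : "=r"(r));
    return r;
}

static inline unsigned zero_with_xor(void) {
    unsigned r;
    // Output-only constraint is safe here: XOR reg,reg produces a
    // result that does not depend on the register's previous value.
    __asm__ volatile("xorl %0, %0" : "=r"(r));
    return r;
}
```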

I recommend viewing software/hardware from the perspective of time = money; it will shed light on why things are the way they are. It's for good reason (most of the time).
     
  6. K.S.

    K.S. Ancient Guru

    Messages:
    1,912
    Likes Received:
    475
    GPU:
    RTX 2080 GAMING OC
*Chanting* No more Intel lakes... enough mitigation. Intel should just poach a CPU architect from Arm Holdings; enough of this tomfoolery.
     
    Last edited: Dec 2, 2019
  7. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    4,573
    Likes Received:
    1,430
    GPU:
    HIS R9 290
    If you ask the investors/shareholders, yeah. If you ask everyone else, no.
    Again: that supports my point. If devs really don't have the time then they're not going to use the instructions. So, adding more isn't going to prolong the life of x86 CPUs.
    I didn't say SSE2/3 don't give uplifts. I said they don't yield that much of an improvement compared to newer instructions. Not sure what you're rambling about with text editors, because the stuff Intel optimizes are things you can actually benchmark. I wouldn't consider stuff like media encoding, PHP, software compiling, 3D design, and so on to be superfluous (all of which do see some measurable improvements from Intel's work).
I also didn't say everything should be vectorized. There are more instructions out there than AVX and SSE3... Here's an incomplete list:
    https://www.felixcloutier.com/x86/
    Regardless, I don't really see what it is you're trying to prove here, since it's a bit tangential from the discussion.
    There are not plenty of ways to make existing software run faster without new instructions if:
    * You're limited by physics (which we almost are)
* You have to compensate for it somehow (for example, with larger caches), and those compensations often cause regressions in other programs
    * You add a lot of complication that the OS has to handle (such as the CPU scheduler *cough* 2990WX)
    * You add more cores/threads, which you still need devs to adapt to.
    I'm well aware of that. Why do you think I've been saying this whole time that there's not much room left for improvement? Nobody wants to spend the time and money, which is why the future looks stale.
     
  8. user1

    user1 Maha Guru

    Messages:
    1,441
    Likes Received:
    472
    GPU:
    hd 6870
Nobody uses Clear Linux; it's essentially a hobby project for Intel to show the strength of their software optimization. It's not really used for anything atm. Intel has the resources; they compile it with their proprietary compiler*.

Edit: they do not compile it with their compiler, but the point stands, because on their own website they state:

    https://clearlinux.org/news-blogs/gnu-compiler-collection-8-gcc-8-transitioning-new-compiler

they want people to use their software.
    It's funny, because they actually have a post on their blog about "the importance of a cutting edge compiler".
    (end edit)

If you want that performance you can buy a license for their compiler. I don't know how else to make that clear.
    If you think you can do better, then contribute to GCC or Clang, because honestly many packages on mainstream distros are maintained by like one person, and they aren't going to go out of their way to make the kind of optimizations Intel is doing.


What I'm trying to say is that the "performance" of x86 is not limited by the instruction set or by legacy instruction support. Frankly, even POWER, which doesn't have the baggage of x86, does not perform better in most cases; with optimization it sits roughly level. So, to stay on the original point, which was
    "x86 is a dead end": that's not accurate, or rather it's no different for x86 than it is for any other architecture. Software will never be perfect, the hardware is already designed to compensate for that fact, and it does a pretty good job. There are a lot of places to improve the design of CPUs, but again, the PATENTS are the problem, not the ISA itself, as far as the design is concerned.
     
    Last edited: Dec 1, 2019
  9. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    4,573
    Likes Received:
    1,430
    GPU:
    HIS R9 290
    Because doing so has so little to do with what we're talking about. The popularity of Clear, the usage of Clear, the usage of Intel's compiler, whether or not modern CPUs are truly x86, the adoption rate of compilers, and so on are tangential to what I've been saying. You're basically just stating arbitrary facts, most of which I don't even disagree with. So again: what are you arguing here? There are only 3 points I'm trying to make, none of which you have given a compelling argument against:
1. x86-compatible CPUs, regardless of their underlying architecture, have very little room left to improve performance on a broad scale at the physical level.
    2. Stuff like more cores, SMT, bigger cache, and a tweaked execution path aren't broad-scale - they improve some programs and have minimal (or negative) impacts on others.
    3. Theoretically, performance could be improved using newer instructions, but as you yourself have pointed out, devs don't use them. So it's not a viable option.
All 3 of these things collectively mean x86 CPUs have very little room to grow.
    I didn't say I could, and as usual, that's not the point I'm trying to make.
Aside from the vague usage of the word "patents", you haven't described anything at all about why performance isn't limited. Stating that modern CPUs aren't truly x86 internally doesn't detract from my point, because like I said, both Intel and AMD today have drastically different pipelines yet their IPC is pretty evenly matched. Ironically, you yourself are effectively saying the patents are a limitation. So if in your eyes the only way to improve instruction sets is to drop patents (which I'm sure you agree isn't going to happen any time soon), then how is x86 not reaching a dead end?
    Regardless, let's assume for one moment that patents are not an issue: you must have at least one patent in mind that's holding back performance. What is it? And once such patents are freed up, then what? How else can performance continue to substantially improve?
     
  10. user1

    user1 Maha Guru

    Messages:
    1,441
    Likes Received:
    472
    GPU:
    hd 6870
All of my points from the beginning have been that there is nothing special about x86 that makes it a "dead end". There is no point saying "x86 is a dead end" when all the reasons you state are not specific to x86; the same argument can be used for virtually any other architecture.

If you can't see that, then I'm afraid we are at an impasse.

    there is no need for excessive word salad.
     

  11. Fender178

    Fender178 Ancient Guru

    Messages:
    3,789
    Likes Received:
    98
    GPU:
    GTX 1070 | GTX 1060
Sounds like Intel is pulling an AMD, back when AM3+ was still relevant. What I mean is that Intel is beating a dead platform into the ground by releasing more CPUs that are compatible with X299. I think they need to make a new platform and socket. You can only beat a dead horse for so long. I think Intel will make a comeback, but I don't think it will be as amazing as C2D/C2Q or even the 1st-generation Core i series. The one major difference between the 7980XE and the 9980XE is that the 9980XE has a soldered-on IHS, and the same goes for the 10980XE. Yeah, the CPU is a good CPU, but like JayzTwoCents and Linus have said, there are much better CPUs out there.
     
    K.S. likes this.
  12. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    4,573
    Likes Received:
    1,430
    GPU:
    HIS R9 290
All I'm asking for is a reason that I'm wrong, which you still haven't provided. We both agree the architecture itself doesn't matter. x86 not being special doesn't disprove that it's nearing a dead end, and applying the same arguments to any other architecture doesn't disprove it either. If your point all along was "all CPUs are affected by this problem", well, why didn't you just say that a long time ago? That's a much more straightforward point.
    That being said, I do feel other architectures such as ARM and POWER also have a dead end, but it's much farther away than x86's. This is because ARM is focused on efficiency rather than performance (where there is sufficient room to optimize), and POWER is used in servers and workstations, where writing optimized code for a "RISC" architecture is a priority. So yes, they too will reach a dead end, but not as soon as x86.
     
  13. user1

    user1 Maha Guru

    Messages:
    1,441
    Likes Received:
    472
    GPU:
    hd 6870
I was under the impression that you were talking specifically about x86 ("x86 is a dead end", etc.), but it's clear you aren't.
    It would be like saying frogs are a dead end, and giving reasons that all animals are a dead end.

It would be technically true, but malformed from the reader's perspective.

    "Silicon Computing is a deadend" is what would make sense, if thats what you are trying to get at.
     
  14. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    4,573
    Likes Received:
    1,430
    GPU:
    HIS R9 290
    Fair enough. Though like I said, x86's dead end is coming sooner than other archs, which is why I specified it.
     
  15. K.S.

    K.S. Ancient Guru

    Messages:
    1,912
    Likes Received:
    475
    GPU:
    RTX 2080 GAMING OC
    :confused::confused: You two are aware Marriage between an Avatar and an Avatar is legal now right? :confused::confused:
     

  16. Carfax

    Carfax Ancient Guru

    Messages:
    2,585
    Likes Received:
    280
    GPU:
    NVidia Titan Xp
Anandtech has already done tests on Ice Lake parts and found that Intel's estimate for average IPC gains was accurate, i.e. 18% when tested using SPEC 2017. It was significantly higher for FP.

What's more, the mobile Ice Lake part outperformed the 9900K and 3900X in some of those benchmarks despite running at much lower clock speeds; its turbo clock was 3.9GHz, compared to 5GHz on the 9900K and 4.6GHz on the 3900X.

Zen 3 is supposed to have a unified L3 cache, so that will go a long way toward reducing overall latency on the CPU. If Zen 3 achieves a 15-20% IPC uplift with a small bump in clock speed, that would be a great achievement.

    Bulldozer was destined to fail. When Bulldozer launched in 2011, the entire industry was still focused on predominantly single threaded programming. And to top it off, the power consumption was awful.

Zen wasn't a clean break for x86-64. It was AMD's attempt to get more in line with what Intel had been doing for years. By clean break, I'm talking about a paradigm shift, akin to x86 being extended to 64-bit.

It seems to me you need to do more research. A lot of what you have said is flatly wrong and rests on biased assumptions on your part, particularly your statements on IPC.
     
  17. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    4,573
    Likes Received:
    1,430
    GPU:
    HIS R9 290
SPEC is a synthetic benchmark, and one that depends on AVX. That is not at all representative of how that CPU will perform in your typical real-world application. Intel already has a huge lead over AMD with AVX-512, and Intel has made a lot of substantial improvements around AVX. Take the 8600K vs the 9600K for example: they're nearly identical, with the 9600K having a 7% higher max boost clock. Yet in a single-threaded AVX-heavy workload like Cinebench R20, the 9600K has a whopping 16% lead:
    https://www.cgdirector.com/cinebench-r20-scores-updated-results/
So what's the difference? A 7% clock bump alone would only predict about a 7% gain (1.16 / 1.07 ≈ 1.08, so roughly 8% is left unexplained by clock speed). Would it be fair for Intel to claim the 9600K has a 16% overall improvement over the 8600K? I don't think so, because in most tests where it actually matters, it doesn't.

    I'd like to compare Anandtech's gaming results since those are much more representative of what to expect for overall performance, but unfortunately they were mostly just testing the iGPU, which is understandably hugely improved.

    As discussed with user1, there are plenty of ways to make a CPU faster (by adding other instructions) but those additional instructions do nothing to improve the performance of existing software if devs don't use them. So yes, theoretically, Intel can yield an 18% improvement, and with properly optimized software, they won't be wrong. In reality, it will be much, much less.
A unified L3 cache will hugely improve cross-communication between core clusters, but it also means the L3 needs to be larger. The larger the cache, the slower it operates. In most cases the tradeoff is well worth it, but in some scenarios I wouldn't be surprised if performance drops.
    Again... you're missing the point of why I brought it up. AMD was banking on the idea that the industry was going to move to multi-threaded. When software was written for the architecture, it actually outperformed Intel's parts. But it was stupid of AMD to expect anybody to do that. It's hard enough to get devs to use anything fancier than SSE2, let alone highly parallel workloads. And that's my point: you can create an architecture that is far superior to anything else, but it doesn't matter how much potential it has if the software people use runs like crap. So, you have to create an architecture that accommodates the needs of modern software, and there's very little room left to improve that.
    Exactly: because it's not realistically possible for them to significantly outperform Intel per-core, per-clock, and per-watt. Why? Because of the same thing I've been saying over and over again: there's not much left that can be done to improve the performance of existing software.
    There won't be another paradigm shift. What we need is developers to stop being lazy.
    I would argue the same of you.
     
  18. Carfax

    Carfax Ancient Guru

    Messages:
    2,585
    Likes Received:
    280
    GPU:
    NVidia Titan Xp
See, this is exactly what I'm talking about. :rolleyes: SPEC IS NOT a synthetic benchmark. It uses actual code from real-world applications, and since you get access to the source code, IHVs can optimize it however they want, with or without SIMD.

SPEC is practically the de facto professional benchmark suite for CPUs, and all of the major CPU manufacturers and vendors use it to showcase the performance of their products.

Benchmarks always have some deviation. Also, AVX-512 is very niche at the moment outside of HPC, because so few processors support it and few consumer apps have enough inherent parallelism to exploit it unless they are rewritten. In any case, AVX/AVX2 is currently used in plenty of real-world applications, including encoders/transcoders, decoders, games, browsers, compression/editing software, et cetera. SIMD performance is extremely important for modern CPUs, and modern CPUs are designed to utilize SIMD processing.
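    For the kind of code those apps hand to AVX2, think multiply-adds over arrays. A hand-written sketch, not taken from any shipping codec (assumes a CPU with AVX2 and FMA; build with gcc -mavx2 -mfma):

```c
// Illustrative AVX2 + FMA intrinsics: dst = a * b + c, processing
// eight floats per iteration, with a scalar tail for leftovers.
#include <immintrin.h>
#include <stddef.h>

void fma_arrays(float *dst, const float *a, const float *b,
                const float *c, size_t n) {
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        __m256 vc = _mm256_loadu_ps(c + i);
        // One fused multiply-add covers eight lanes at once.
        _mm256_storeu_ps(dst + i, _mm256_fmadd_ps(va, vb, vc));
    }
    for (; i < n; i++)
        dst[i] = a[i] * b[i] + c[i];
}
```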

    The IPC performance gain is independent of SIMD performance. IPC typically refers to INT or FP performance, and not SIMD performance. SIMD performance gains are MUCH higher, and yes, developers are using these instructions. It takes a while, but they are doing so.

No. A unified L3 cache would just mean that instead of there being two 4-core CCXs per die, each with 16MB of L3 cache, there will be one 8-core CCX with 32MB of L3 cache.

    It took years for the multithreading paradigm shift to occur. Also, even with multithreaded software, Intel beat the snot out of AMD's FX series; especially when it came to performance per watt:

[Image: multithreaded performance-per-watt comparison chart, Intel vs AMD FX]

    I suppose time will tell.

    Touche :D
     
  19. schmidtbag

    schmidtbag Ancient Guru

    Messages:
    4,573
    Likes Received:
    1,430
    GPU:
    HIS R9 290
Maybe it wasn't fair to say the entire suite is synthetic, because there are a few tests in it that I would agree are representative of real-world tasks, or at least potentially can be. But chunks of it either are synthetic or might as well be, considering how much they tampered with the applications. Take the C benchmarks for example, where they don't seem to tell you what it is they're compiling, and they admit to using "many of its optimization flags enabled", yet they also don't tell you which flags. Well, those flags make a massive difference: they can make the difference between a CPU being slower than its competition and being 30% faster.

Regardless, after comparing the specs of the 1065G7 to the 8550U, and looking at benchmarks from other sites, I'm no longer surprised Ice Lake is winning. The difference in memory bandwidth is very substantial, and that alone should be yielding at least a 10% improvement. Going to 10nm is definitely a bonus too, though since these are mobile CPUs, the real benefit of going to 10nm is sustaining boost clocks for longer durations. There's a lot of room to improve on mobile, which is why I said in another post that I think ARM has a longer way to go until it reaches a dead end. Intel is clearly taking advantage of that extra room, as they should be.
    What's your definition of "plenty"? Because I would argue that's a bit of a stretch. Today, there are enough applications out there that use AVX to have a few real-world benchmarks showcasing its performance, but it is still uncommon enough not to take it too seriously. That said, I would stress that it is very much worth keeping an eye on.
    I understand that, but the "side effect" of SIMD instructions is improved int and FP performance.
Adoption of such instructions taking a while is kinda the gist of my whole point - Intel can and will add instructions to further improve performance, even way beyond 18%, but it doesn't matter if they aren't used. It's taking too long.
    If they keep the L3 cache the same size then sure. But looking at AMD's trends, they just keep making it bigger. But because of the whole CCX and chiplet idea, this is really the only option they have.
I don't get why you're still not seeing my point. Those many years for that paradigm shift are part of my point: AMD was banking on people writing code that would run fast on their CPUs. That's stupid. The reason I brought that up is that any uarch engineer can come up with a design that's theoretically faster than everything else, but theory doesn't matter when the software people actually want to run on your product is slower. Give the FX something designed for it and it ran faster than Intel. But 99.9% of everything else was slower, so it's no wonder it was such a failure.

Anyway, comparing results on 3 other websites that use tests I find less suspicious, the 1065G7 is consistently 10-15% faster than the 8550U, which I find totally reasonable given the improvements made.
    Things will get more interesting once we get to desktop hardware, where there isn't as much room for improvement. You can't cram in more memory channels, and faster memory is already available. There won't be as much of a thermal constraint either. Desktops will show the true potential of the architecture.
     
  20. Carfax

    Carfax Ancient Guru

    Messages:
    2,585
    Likes Received:
    280
    GPU:
    NVidia Titan Xp
    According to Spec themselves, the vast majority of the benchmark is derived from real world applications, including open source projects.

    Source

    Some of the tests are no doubt bandwidth sensitive, but some are also computationally intensive and don't care about bandwidth at all.

The main benefit of smaller nodes is more transistors per mm². Ice Lake parts have more cache than Skylake and lots of microarchitectural enhancements.

I think up until maybe two years ago you would have been correct as to the relative paucity of the newer SIMD instructions like AVX/AVX2, but nowadays they are used in plenty of applications, ranging from browsers to graphics drivers to physics engines to encoders and decoders to cutting-edge games, and the list goes on.

    Intel has become quite aggressive at getting developers to optimize their software for these instructions, and also developing their own in house software to showcase the performance gains you can get when the software is properly optimized. A good example of this is Intel's SVT line of codecs. These codecs are exceptionally well optimized for AVX2 and AVX-512 and so they run ridiculously fast.

Also, a lot of games use AVX/AVX2 for physics calculations and particle effects. NVidia's PhysX uses AVX for cloth simulation, for instance. And BF5 definitely uses AVX/AVX2, because if you have an AVX offset enabled in the UEFI BIOS, your CPU will downclock while playing it.

    If the code can be vectorized then yes. But not all code can be vectorized or parallelized.

    The instruction sets are being utilized. Windows 10 also uses AVX/AVX2, and I'm sure Linux does as well.
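    Software typically probes for these at runtime rather than hard-requiring them. A minimal sketch using the GCC/Clang builtins, which wrap CPUID under the hood (the printout is just for the example):

```c
// Runtime CPU feature detection: pick a code path per machine
// instead of demanding a minimum instruction set.
#include <stdio.h>

int main(void) {
    __builtin_cpu_init();   // populate the feature flags
    printf("SSE4.2:  %s\n", __builtin_cpu_supports("sse4.2")  ? "yes" : "no");
    printf("AVX2:    %s\n", __builtin_cpu_supports("avx2")    ? "yes" : "no");
    printf("AVX512F: %s\n", __builtin_cpu_supports("avx512f") ? "yes" : "no");
    return 0;
}
```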

Even if AMD had gotten their wish and the industry had shifted over to multithreaded programming "instantly," the FX series would still have struggled mightily against Intel because of its poor performance per watt. Intel just about slaughtered AMD in that regard. I mean, just look at the 4770K and its performance in that graph compared to the FX-9590, and then look at the TDP: 84W vs 220W! :eek:

    Holy sh!t we actually agree on something! :D Yeah, the "Cove" CPUs from Intel should be very impressive if they ever come to desktop. Personally I'm hoping they release the Willow Cove aka Tiger Lake CPUs to desktop. Those CPUs have a totally redesigned cache subsystem with WAY more cache.
     
