The New Intel: How Nvidia Went from Powering Video Games to Revolutionizing AI (forbes.com/sites/aarontilley)
213 points by sonabinu on Dec 11, 2016 | 123 comments


Yes, Nvidia dominates in deep learning and AI, but the risk of disruption by Intel is much greater than a lot of non-technical people realize. Intel has to do only two things to start taking market share away from Nvidia in DL/AI quickly:

1. Add "good enough" functionality to its high-end processors. Intel is already working on that: https://news.ycombinator.com/item?id=12709220

2. Contribute open-source code to the main branches of the most popular DL/AI frameworks (TensorFlow, Theano, Torch) so these frameworks support the new chip functionality "out of the box," without requiring any additional tweaking. This is not yet happening, but I'm hoping it will soon.

Many DL/AI developers would be content with "good enough" performance out-of-the-box from CPUs if it means not having to pay extra for Nvidia cards or deal with Nvidia's proprietary drivers.
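
As a concrete illustration of what "out of the box" would mean, here's a minimal sketch using the TensorFlow 1.x-era API (the sizes and session setup are placeholders, not a benchmark): the model code stays the same, and whatever optimized CPU kernels the framework was built with get picked up automatically.

    # Illustrative only: if Intel upstreamed MKL-backed kernels, code like this
    # would benefit with no changes, because the framework dispatches matmul/conv
    # to whatever CPU implementation it was built with.
    import tensorflow as tf

    with tf.device('/cpu:0'):              # pin the graph to the CPU backend
        a = tf.random_normal([4096, 4096])
        b = tf.random_normal([4096, 4096])
        c = tf.matmul(a, b)                # runs on whichever kernel the build provides

    with tf.Session() as sess:
        sess.run(c)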

It would be great for Nvidia to get real competition in this space.


There is zero chance that Intel can add "good enough" functionality to their CPUs without changing their memory architecture. A high end i7 might have 35 GB/s of memory bandwidth. Even the biggest, most ridiculously expensive ($6000) Xeons have only about 100 GB/s. Nvidia's GTX 1080 has 320 GB/s. Their P100 has 732 GB/s.
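
To put rough numbers on that, here's a back-of-the-envelope sketch (the model size is a made-up example) of what those bandwidth figures mean for a memory-bound pass that has to stream the whole model from DRAM once:

    # time per pass ~= model_bytes / bandwidth (ignores caches and compute)
    model_gb = 0.5   # e.g. ~500 MB of fp32 weights (hypothetical)
    for name, gbps in [("i7 DDR4", 35), ("big Xeon", 100), ("GTX 1080", 320), ("P100", 732)]:
        ms = model_gb / gbps * 1000
        print(f"{name:9s}: ~{ms:.1f} ms per full pass over the weights")
    # i7 DDR4: ~14.3 ms, big Xeon: ~5.0 ms, GTX 1080: ~1.6 ms, P100: ~0.7 ms
    # -- a ~20x gap between the i7 and the P100 that no amount of core count fixes.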

The threat to Nvidia from Intel is in the Nervana chips they recently acquired. Those are presumably using HBM2 and could potentially beat GPUs for neural net training performance.


A modern Xeon has 10s of MBs of cache, which has drastically higher memory bandwidth. What matters is how big the matrices (the "model") are, not the input data. With native 8-bit or 16-bit support, you can fit an awful lot of model in 30MB.

I'd recommend reading this paper about matrix math on GPUs from friends of mine way back in the day: https://graphics.stanford.edu/papers/gpumatrixmult/gpumatrix... . While NVIDIA has built much larger register files and L2 caches since then, a modern Xeon is still unbeatable when something fits in L2 or even L3 cache.

I believe streaming the input data in to update the weights is usually done with non-cache-polluting instructions (_mm_stream equivalents with the NT hint), so it's not hard to keep it fed.
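
Some quick arithmetic on what ~30MB of last-level cache actually holds (parameter counts only; activations, gradients, and optimizer state are ignored):

    cache_bytes = 30 * 1024 * 1024
    for bits in (32, 16, 8):
        params_m = cache_bytes // (bits // 8) / 1e6
        print(f"{bits:2d}-bit weights: ~{params_m:.0f}M parameters")
    # 32-bit: ~8M, 16-bit: ~16M, 8-bit: ~31M -- plenty for many inference models,
    # but well short of something like VGG-16's ~138M parameters.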


In general, neural nets today are so slow (due to being computationally resource-hungry, in clock cycles, memory, and memory bandwidth) that even 4 of the fastest GPUs money can buy, stacked together, still take days or weeks to train a production-quality model.

Trying to peg this onto one of their CPUs will have a result much like they have had pegging graphics processing onto their CPUs: very meagre results, fit only for the least resource-intensive things a consumer may want to do.

30MB to hold a model in is almost nothing. Even 12GB is insufficient to train something like an ImageNet model.

But in the same vein as their iGPU implementation, it may be just enough to satisfy the requirements of the average consumer, and still be able to make a large dent in Nvidia's market share.


Unless you've got extremely low I/O latency, allowing you to work in smaller batches in parallel without losing all of your parallelisation speedup to the 'tax' you pay when loading data & instructions from RAM to <insert processing unit here>. Although I have no idea what that implies for the hardware discussed above, as it's way over my head.


SOTA models are much, much larger than 30 MB. You can't use 8 bits for training (yet?), and Intel doesn't support 16-bit float (they need to catch up with Nvidia on that, it's really a travesty, we could have used it years ago for audio and imaging). Besides, the top end Xeons with the most cache are fully 10 times more expensive than a GTX 1080.


IMO the travesty is crippling FP16 on consumer GPUs for exactly the reasons you cite. And given that AMD's Vega GPUs apparently will possess full speed FP16, it appears to be yet another in a long line of arbitrary performance cripplings for the consumer space that is IMO the most annoying feature of CUDA development.

But while FP16 is useful for audio and imaging, FP16 nearly killed NVIDIA over a decade ago* when it lacked the dynamic range for DirectX 9 HDR effects in contemporary games without banding. FP32 was more than enough for the task, but power hungry, and thus NV30 could be used to figuratively fry eggs while AMD GPUs had FP24, which was just enough for these effects.
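
For anyone curious about the dynamic-range point, a quick numpy check makes it concrete (FP16 has only a 5-bit exponent, so it saturates around 6.5e4):

    import numpy as np
    for t in (np.float16, np.float32):
        info = np.finfo(t)
        print(t.__name__, "max:", info.max, "smallest normal:", info.tiny)
    # float16: max ~6.5e4, smallest normal ~6.1e-5
    # float32: max ~3.4e38, smallest normal ~1.2e-38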

These days, it's all going in the opposite direction with respect to deep learning, but I find it ironic that INT8/INT16 is missing from the Tesla flagship P100, but present on GP102 and GP104, the consumer GPUs (and yes, I know about the Tesla P40, but that lacks fast FP16).

I agree with the top poster that Intel could make quite a comeback here given how far it's currently behind. I also agree that it would be hard to dethrone NVIDIA without higher bandwidth memory, but I don't think that's necessary, a bloody nose is more than enough to turn heads IMO.

https://en.wikipedia.org/wiki/GeForce_FX_series


Correct, we may have a few MB of text as training data, but the trained models are several GBs.


Models usually don't fit the cache (the MNIST one might fit, but that's a toy model today)

Also backpropagation makes you iterate through the whole model at each step
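
A minimal numpy sketch of that point (a hypothetical two-layer net): a single SGD step reads every weight matrix in the forward pass and writes every one of them in the update, so the per-step working set is the whole model, not just the current batch.

    import numpy as np
    rng = np.random.default_rng(0)
    W1, W2 = rng.standard_normal((784, 512)), rng.standard_normal((512, 10))
    x, y = rng.standard_normal((64, 784)), rng.standard_normal((64, 10))

    h = np.maximum(x @ W1, 0)                  # forward: reads W1
    out = h @ W2                               # forward: reads W2
    g_out = out - y                            # gradient of a squared-error loss
    g_W2 = h.T @ g_out                         # backward: gradient for W2
    g_W1 = x.T @ ((g_out @ W2.T) * (h > 0))    # backward: needs W2 again to reach W1
    W1 -= 1e-3 * g_W1; W2 -= 1e-3 * g_W2       # update: writes every weight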


100 GB/s is the bandwidth of six DDR4 channels. Knights Landing has 500 GB/s of local memory bandwidth, plus the same 100 GB/s of a Xeon for DDR4.
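
The six-channel figure checks out with simple arithmetic (assuming DDR4-2133; actual parts vary):

    per_channel = 2133e6 * 8 / 1e9   # MT/s x 8 bytes per transfer ~= 17 GB/s per channel
    print(6 * per_channel)           # ~102 GB/s across six channels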

If future Xeons integrated RAM like Knights Landing does, this would make GPUs less interesting. But KNL can already be used as the main CPU, if I remember correctly.


You're ignoring the main factor: cost.

Intel's offering costs as much as €4000 per processor, and that comes with a meager 8 cores.

NVidia's offering sells, right now, for less than €1000 a pop.


What about ongoing electricity costs? IIRC my graphics card chews up more power than my CPU.


A delta of $3000 per component buys you a whole lot of electricity.

Furthermore, the performance of each of NVidia's GPUs falls somewhere between 5 and 20 teraFLOPs. A Haswell Xeon gets you about half a teraFLOP for around 1/3 of the power consumption of a NVidia Tesla P100 GPU.
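
Taking those numbers at face value (TDPs assumed, not measured), the per-watt gap is still large:

    gpu_tflops, gpu_watts = 10.0, 300    # P100-class figures from the comment above
    cpu_tflops, cpu_watts = 0.5, 100     # Haswell Xeon-class, ~1/3 the GPU's power
    print(gpu_tflops / gpu_watts)        # ~0.033 TFLOPS/W
    print(cpu_tflops / cpu_watts)        # ~0.005 TFLOPS/W -- roughly 6-7x worse per watt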


I am not sure where you get those numbers. Both a Knights Landing Xeon Phi and a Tesla K80 are about 3 teraflops for $5000.

Comparing GPUs with Xeons makes no sense.


Many inference tasks can fit in cache or eDRAM. The SIMD units are not fast enough for memory to become a big problem, so long as the code is well-optimized.

On the Nervana front, outrunning a GPU for neural nets is not that hard with an ASIC. I checked your profile, I bet your employer knows something about that :)


That's just moving the data around. CPUs can't do any meaningful (like AI training) data processing at these speeds, as far as I know. GPUs probably can do something simple at these speeds, but I doubt AI training can be done even at 320GB/s on a single GPU.


Moving large amounts of data around is the whole issue in deep learning.


No, it's not. You have to apply the data to the evolving network. Can you really do it at 320GB/s?


Good question. If you have enough cores - maybe? Does anyone have a precise idea?


I'll point out that this same argument applied to discrete GPUs when Intel tried to enter the market with Larrabee (before giving up and claiming that HPC was where it's at anyway).

"This time is different" might apply here though, because training is "just" a lot of low precision matrix math. This requires a lot less new ecosystem / software for Intel (Intel has a long history in matrix math), so much like SSE, AVX, and now AVX-512 (or as I still call it LRBni) Intel can easily make some tweaks to at least get 70% of the gains.


Also: Intel learned many things with Larrabee. Larrabee may have been a failure in public perception, but for Intel it seems to have been a fruitful experience which still helps them. (Somewhere, I've read that they've won the contract for one of the next supercomputers based on their Knights Landing architecture.)


> It would be great for Nvidia to get real competition in this space.

It seems obvious AMD will be a player, since they have similar hardware (GPUs), and with a proper port of cuDNN (using HCC, OpenCL, etc.) they should also have excellent performance for TensorFlow, Theano, etc. 'out of the box'.


No, AMD doesn't do software support. In the words of their own developer:

>>We don't happen to have the resources to pay someone else to do that for us.


Yeah, that whole ongoing saga is very unfortunate. Also very illustrative. Reading between the lines, my take is that AMD senior management are a bunch of generic MBA types optimising for higher profit in the present (Windows gaming market) by stealing from the future (HPC & server segments + growing niches like ML ---> Linux and open standards, in other words).

It makes zero sense for them to place increasingly large bets on proprietary drivers when they're clearly not the market incumbent.


It sounds like what was happening when AMD and ATI were still separate entities.


Their mixed GPU/CPU modules should be very interesting as well.


No, they have inferior performance, and all the same issues as Nvidia- in addition to requiring a port.


Inferior performance because of the lack of performant DL libraries like cuDNN, not because of the hardware itself.


It's a mix of both, hardware is inferior for current use cases but current use cases are driven heavily by NVIDIA.

NVIDIA knows how to do two things right: understanding that software matters, even more than hardware, and delivering on that notion; and managing developer relations.

While AMD makes it easy to access information without NDAs and sign-ups, NVIDIA blows it out of the water the moment you show even the slightest interest in signing up with them.

NVIDIA at this point can probably spin off a pure consulting division for AI/ML and bring in more revenue than AMD in its entirety.


How is the hardware inferior for current use cases?

The compute capabilities are pretty much the same for the latest NVidia/AMD cards in the same price range. It's a software thing - the lack of optimised DL ops, particularly fast convolution kernels, is what's hurting AMD's performance in deep learning.


No. AMD has worse floating-point performance, and has poor support for half-precision floating-point values. It also gets beaten on power consumption per FLOP.


FLOPS/$ is better on AMD, AFAIK. Double-speed FP16 is only on GP100, and I don't know if any framework supports FP16 training yet. But yes, AMD needs to add 2x FP16 too.

The main reason people don't use AMD for deep learning is the lack of fast libraries, and AMD's inability to provide a cuDNN equivalent optimised for their hardware.


But I don't think this is a software issue: clFFT runs faster on NVIDIA. Simple operations like SAXPY run faster on NVIDIA. Etc.


Speaking of this, AMD's new Vega cards should in theory offer more FP16 throughput than the P100.


I liked some of this point except the end, which is not correct.

> NVIDIA at this point can probably spin off a pure consulting division for AI/ML and bring in more revenue than AMD in its entirety

AMD had $3.99 billion in total revenue in 2015. Nvidia had $5 billion.

Perhaps you mean profit? AMD was operating at a loss and Nvidia was very profitable.


NVIDIA doesn't charge you for consulting services, though they do expect you to buy pretty expensive hardware. They offer training and consulting services for "free", and paid services through their network of partners, to which NVIDIA provides services for free. If they spun off their in-house knowledge and started charging for services and software instead of giving it out for "free", they might very well reach a $4bn-a-year revenue stream; that said, a lot of people wouldn't be happy about it...


The high end Intel CPUs like Xeon Phis that have enough compute muscle to get anywhere near NVidia GPU perf for deep learning are far too expensive to be able to replace them


Precisely.

Intel is trying to compete with NVidia by offering inferior performance at a premium price.

In massively parallel applications, cost is closely tied to performance. NVidia has both aspects secured very well.

In fact, NVidia is already a prominent supplier of components for the third-place system in the Top500, Oak Ridge National Laboratory's Titan, which mixes AMD Opterons with NVidia K20X GPUs.


Isn't some of the advantage coming from the fact that gpus are much better designed for massively parallel computations?


> Intel has to do only two things to start taking market share away from Nvidia in DL/AI quickly:

...except that Intel has been trying to do that for over a decade, and Intel has been failing repeatedly for over a decade.

If it were as simple and easy as you lead us to believe, Intel would've already done it by now.

Except it didn't, because it can't.


How exactly Nvidia and ARM are competitors of Intel is an interesting question in itself. Nvidia and ARM are fabless and compete against a subset of Intel's IP and products. Intel may take multiple routes to reacting against superior competition without necessarily even making better GPGPUs.


Along these lines... For inference, do you really want to put a Tesla M4 in every server in your cloud when you're already paying $$$$ for CPUs?


Do you really want to put a Xeon Phi in every server in your cloud when you're already paying $$$$ for CPUs?


Xeon Phi can be its own cpu though. That's what Intel is counting on.


> Xeon Phi can be its own cpu though

yes and they are priced accordingly ;-)


No, there's no "good enough"; it needs to be fast enough.

Xeon Phi is their contender, but we'll have to see if it's fast enough.

A high-end GPU is around 10x faster than your avg CPU https://www.nvidia.com/object/gpu-accelerated-applications-t...


>>This is not yet happening, but I'm hoping it will soon.

Intel made massive gains promoting OpenCL when they helped redo OpenCV's UMat infrastructure.


True but Nvidia has a first mover advantage which should not be discounted.


This is pure and utter bullshit. Their core market is by far gaming, as this PDF (that I found via NVidia IR) shows:

http://files.shareholder.com/downloads/AMDA-1XAJD4/341168899...

Also: The level of their gross profit margin indicates that they are in fact a monopolist:

http://marketrealist.com/2016/11/driving-nvidias-profit-marg...

(These are not the financials of a company operating in a healthy, 20-year-old industry - they are the financials of a company that operates without meaningful opposition in a 20-year-old business.)


Nvidia isn't a monopolist. They have two big competitors in AMD and Intel.

Intel ship about 2/3 of the graphics chips for laptops and desktops. Nvidia have more than half of the remainder. That isn't a monopoly.

In the discrete GPU space AMD are also there and ship almost half as many units as Nvidia. That's no monopoly.

The graphics systems for the current consoles are shipped by AMD aren't they?

Nvidia is a profitable company. Calling them a monopolist because they are profitable is unwarranted.

* Edited to fix grammar.


AMD ships primarily low- and mid-range cards; NVIDIA has an effective "monopoly" in the enthusiast segment, which has the highest profit margins by far.

That said, AMD hasn't managed to ship a $350-400+ card that's worth spending money on compared to the competition since probably the 7950...


That's a ridiculous definition of a monopoly. Using that methodology, you can claim any company has a monopoly that happens to have the best performing thing at the time.

Nobody is claiming Tesla has a monopoly on electric cars because they make the best. And nobody should claim Nvidia has a monopoly on GPUs since they are on top.


I didn't claim they had a monopoly, I said they had an "effective monopoly" in only one, albeit very critical, market segment, in which they ship over 75% of the hardware.


Regarding the effective monopoly on the high end, this will change next month with the release of the Vega cards.


I wouldn't hold my breath; the 1080 Ti is coming out, and NVIDIA would just drop the price on the 1080 - they did pretty much exactly the same thing with the Fury.

And AMD will again have a very expensive card to manufacture (HBM2 vs. GDDR5/5X) with high power consumption, given that the RX 480 already draws nearly as much power as a 1080, and HBM2 isn't exactly power-efficient (unlike HBM1 vs. GDDR5): HBM2 is nice, but its power requirement currently increases with density, and it leaks voltage pretty badly with 4 stacks...


OK, by revenue, sure. But calling it "pure and utter bullshit" is probably going a little far, don't you think? (Where is the line item for "AI"? Datacenter and Auto? Not to mention many "gaming" cards are more than sufficient for "AI" if you don't need 64bit FP.) Intel has been hammering at the GPU screw for more than a decade and by all rights, they still haven't figured out how to use a screwdriver (to use Huang's characterization of Intel's approach).

Nvidia, meanwhile, has built a massively parallel beast that's still decent at gaming. But make no mistake; their engineering priorities are set on general parallel computation. Gaming performance is in many ways a similar problem, and thus benefits, but it would be a mistake to think games still come first for their engineers.


Not quite the right conclusion.

Many of Nvidia's "gaming" cards (e.g., higher-end GTX cards) are in fact used NOT for gaming but for deep learning by numerous AI researchers, developers, and startups with small budgets who buy them through retail channels.


That's indeed possible. But really, if you were to guesstimate the ratio of GPU usage between these items sold to consumers that end up being used by:

a) AI/startups

b) gamers

what would you guess?

My guess is somewhere between 1:50 and 1:500.


Revenue growth is far more interesting to tech investors than revenue over the last few years. Deep learning has a massive potential for revenue growth and total revenue will likely eclipse gaming within the next 10 years.

Imagine a world where every car has semi-autonomous technology, the average security camera is doing object detection, localization, more sophisticated language translation, smart drones, etc. The potential for deep learning is undeniable and Nvidia (currently) is leading this race.


Except that you mainly need the massive compute for the training, not the commodity execution of the network!


Any idea what kind of compute a self-driving car (with a trained network) will need?


Something pretty close to this: http://www.nvidia.com/object/drive-px.html (I believe they are currently recommending the 4-GPU version shown at the bottom of that page under "DRIVE PX 2 FOR FULLY AUTONOMOUS DRIVING").

The key here is that while this isn't a full-power desktop GPU, it will run inference on CUDA-compatible kernels.

That gives NVidia a complete end-to-end modelling, training, and driving platform. No one else has that.


Depends massively on what the software is doing, there is still no standardisation on what a self-driving car means.

Nvidia is pushing PX2 but I have a feeling they are too expensive for most cars - don't know the exact price but "a few thousand dollars" was quoted for the Teslas that use them. Some specs - http://wccftech.com/nvidia-drive-px2-pascal-gtc-2016/

At the same time Qualcomm is pushing their much less performant chips for self-driving cars, but at a much lower price point (hundreds of dollars)


Surely you don't expect cameras and drones to have massive, loud, heavy, power-draining $650 graphics cards in them?

It seems that if they are only used in development, GPU sales don't scale anywhere near directly with the number of units sold.

Maybe I'm missing something?


Not now, but in the future. Nobody thought we could fit desktop-class graphics cards into slim laptops, but that's a reality today. Though drones will probably go for custom chips.



The difference is that AI/startup users may buy hundreds of GPUs at once.


I think he is already including that in his 1:50 or 1:500 ratio.


Indeed.


Their data centre revenue is 20% of their gaming revenue. That's a significant business.


Exactly, and there is ample room to grow compared to gaming.


I see datacenter revenue estimated to go from 7% to 15% in 2 years. If deep learning continues to grow datacenter revenue will likely exceed gaming revenue. Depending upon competition and long term growth gaming revenue could become just a footnote.


I think they also count Titans as a gaming item, and server vendors don't offer them. So all the people doing DL with Titans probably are in the gaming category.


Or any number crunching application?


What is with their gaming and datacenter revenue being expected to grow 60% in Q3 FY17?


Just like this fawning article, a very optimistic scenario for Nvidia is in the market pricing of its stock. Nvidia trades at a 45 trailing PE vs Intel's 15. Its valuation therefore assumes at least 3x more growth than Intel, in a market which, it seems, literally everybody is trying to target. For example, Qualcomm (trailing PE 14) just announced a 48-core ARM server processor with vector capabilities for launch in H2 2017. And if AMD is allowed to be purchased, and properly funded, by an Asian (or other) entity (it already has licensing agreements in place in China), the space could get very crowded for NVDA.

I wouldn't be so sure that the comparison to the 40-year behemoth of silicon, which extracted quasi-monopoly profits for 35 of them, is yet valid.


If Nvidia are looking to supplant Intel, I think the key, weirdly, is I/O. In a sense, GPU acceleration of general-purpose computation is just a specific case of the more general case of distributed computing (albeit the computers are extremely 'close'). For embarrassingly parallel workloads, I/O is not an issue; the data and computations can be batched and distributed, with no further communication required (until you collect the results).

But for mixed, semi-sequential/semi-parallel workloads, I/O becomes the bottleneck. Having to retrieve intermediate results, combine and then redistribute can eat up your gains from parallelisation. It will be interesting to see if Nvidia start pushing for a new standard to replace PCIe, and maybe invest in low-latency, high throughput networking R&D. Who knows? Maybe they'll just build some kind of networking functionality in to their GPUs directly, so data can be transferred NVRAM -> NVRAM without having to travel up and down the stack (until they need to interact with userspace; once on the way in and once on the way out).

Given there are RDMA adapters that can manage ~20ns latency already, it would be interesting to see what could be achieved before we start pushing up against the laws of physics. Even at the physical layer, my understanding is that most networking fibre optics manage about 60% the speed of light.
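
Some quick propagation-delay arithmetic at ~0.6c (the figure above), just to bound the physics:

    c = 3.0e8                        # m/s in vacuum
    for meters in (2, 30, 300):      # within a rack, across a row, across a data hall
        ns = meters / (0.6 * c) * 1e9
        print(f"{meters:4d} m: ~{ns:.0f} ns one-way in fibre")
    # 2 m: ~11 ns, 30 m: ~167 ns, 300 m: ~1667 ns -- for short links the adapter and
    # software stack, not the glass, dominate the latency budget.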

So, just to make a wild prediction: within the next year, Nvidia will acquire a HPC network adapter company (e.g. Mellanox).


Nvidia already has high speed GPU-GPU links:

http://www.nvidia.com/object/nvlink.html


They aren't widely deployed yet - the DGX-1 uses it, though.

I assume it will be InfiniBand that becomes the commodity hardware adopted to distribute deep learning. However, Nvidia are currently trying to segment the market by providing GPU->GPU direct access via InfiniBand only on the Tesla cards - not the commodity ones (Titan X, GTX 1080). That strategy will probably be the death of them in deep learning.


Why are you assuming Infiniband is the winner? There's Intel Omni-Path and there's Ethernet.

(Disclaimer: I'm the system architect of InfiniPath, which is one of the things that evolved into Omni-Path.)


Ah, just the man I've been wanting to talk to then :)

As I mentioned in another post, I really think Intel are missing an opportunity when it comes to high speed networking. Why don't they drop all the crazy server market lock-in vendor shenanigans (e.g. brand to brand SFP incompatibility, 'call us' pricing etc.) and just jump straight to the consumer market? I know this means cannibalising CPU revenue, but that's going to happen anyway.

Developers are consumers. In many cases the direction that their skills develop in is largely determined by the consumer hardware they're running at home. The same seems to be true for Ops folks, who all seem to be running vSphere homelabs atm. Based on what others have said here, it sounds like NVIDIA have chosen to cross-sell, and work their way down from HPC -> ... -> Consumer, presumably to extract maximum rents, despite not yet having strong enough network effects to lock out competition.

This seems like the perfect opportunity for Intel to cut the legs out from under that strategy, by aggressively pushing widespread consumer (or at least 'power-user'/developer) take-up of high speed, low-latency networking. And hasn't iWARP been endorsed as an IETF standard (or is on its way to be)? It seems like there's a window of opportunity here, albeit one that gets smaller and smaller as Nvidia slowly work their way down market.


Well, Omni-Path has a 48 port switch ASIC. So that's it for consumer accessibility.

But if you want to buy the PCI Express cards, Newegg has 'em. Intel has a great channel organization.


It's certainly impressive tech. But, and bear in mind this is just from a consumer perspective, here are the reasons I wouldn't buy it:

- ~$8000 USD for a 24 port switch is way beyond my price range. I doubt any consumer market could support this (http://www.newegg.com/Product/Product.aspx?Item=9SIA6ZP53G35...)

- ~$550 USD for a single port adapter is also way outside of my budget, especially since I'd need to buy at least half a dozen of these for my home lab (http://www.newegg.com/Product/Product.aspx?Item=N82E16833106...)

- It's very difficult to find solid information on the product. The Ark page doesn't tell me anything (http://ark.intel.com/products/92007/Intel-Omni-Path-Host-Fab...). The link to the whitepaper (which I'm assuming is actually a marketing brochure) 404s.

- So even if I could afford this as a consumer/dev, I probably wouldn't buy it due to the uncertainty around total cost of ownership. Will it accept non-Intel branded QSFP transceivers? Same question with optical cabling.

- Given the product brief mentions 'fabric performance [will] scale automatically with ongoing advances in Intel Xeon processors...', it leaves me wondering if this is somehow dependent on some specialised Xeon CPU instruction sets, making it useless for most home labs (which, at best, will be running older generation Xeons from second-hand servers, but more likely CPUs from Intel's consumer line-up).

On costs: If these are just reflective of the cost of production, then fair enough, I guess I'll just have to wait (for either Intel, or another manufacturer, to devise a cheaper production process). If it's not, then I think you're missing the opportunity outlined above, and you'll lose the HPC fight given Nvidia occupy the high ground here.

On compatibility: If this is just some idiot 'multi-channel cross sales' strategy that some genius with an MBA has cooked up, you're going to have a bad time. It falls apart the second someone releases a commodity adapter (which may already exist, for all I know).


Oh, and to answer your question about InfiniBand, I think the previous poster might be referring to this: http://i.imgur.com/2ciM4qy.png

It's also why I figure an InfiniBand vendor would be a likely acquisition if Nvidia wanted to head in that direction (i.e. GPU-to-GPU networking, a.k.a. NVRAM RDMA).


Or it will make buttloads of money until they have to move it downmarket when serious competitors arrive. Money they can use to lap the competition by investing in better cuDNN and hardware.

Full disclosure: I am (and have been since I found out about cuDNN) long Nvidia.


That's how I see it playing out too, assuming Nvidia pull it off. By my reckoning, it roughly goes:

HPC -> Mainstream servers (and niches like ML) -> 'Professional workstations' -> Consumer desktops

Intel's response (at least in part) seems fairly sensible: throw money and developers at open-source and open-standardisation efforts to try and rally the rest of the market (e.g around stuff like OpenCL). Otherwise network effects from stuff like widespread CUDA usage will become too strong to overcome, at which point Nvidia will have a monopoly.

However, I think they're missing an opportunity when it comes to high-speed networking. Specifically, I think they should be looking to release cheap, simple and open-standards based networking gear without trying to cross-sell Xeons or whatever at the same time. I realise that this means cannibalising revenue from their CPU market, but at this point someone is going to do it so it may as well be them. Hell, they could probably even get away with tying the functionality in to some fairly widespread Intel CPU instruction set (say, from Ivybridge onward). However, anything past that and it's a non-starter for consumer segments (which, in my mind, is also the developer segment).

10gbit RDMA is actually sort of affordable nowadays (at least if you pick up gear from eBay). But it's currently a complicated mess of competing and incompatible standards. Some SFP+ modules are even vendor-locked; if they sense there's a different SFP+ branded module on the other end they stop functioning due to an 'unsupported configuration'. These kinds of things make it impossible for a decent sized consumer market to form. If Intel could just accept that the world is moving on with or without them and get a proper consumer market going, at the very least they'd keep the market open (and at the very best the might even tilt the odds in their favour).

AMD, on the other hand, I have no idea wtf they are doing (and neither do they, by the looks of it). It could just be my biases, but I really think the best strategy for them is to basically bet the house on open-source and really up their engagement with Linux core dev. While I get it's attractive to target the gaming market because it's profitable in the present, it probably means death in the future. Actually their interests are pretty aligned with Intel's, and they have highly complementary work forces. It sounds crazy, but I wouldn't be too surprised if Intel were to acquire AMD at some point, or at least buy out their GPU division...


Does Nvidia already have NVLink for this purpose?


They do, but you can't run NVLink to the CPU unless you're using exotic IBM POWER8 systems.

x86 systems like the DGX-1 can only run NVLink between the GPUs, and still rely on PCI-E for the CPU connection.


Slightly off Nvidia and thinking Intel for a bit.

Where is Intel heading? I don't see Intel gaining a foothold in the AI/ML market. Not with the Knights-whatever chips. And all these supposed weapons, such as next-gen AVX and Nervana, aren't coming any time soon. And Nvidia knows full well that the key to GPUs in AI/ML isn't the silicon, it's the drivers and libraries.

So not only is this a huge first-mover and mature-software advantage, but AI/ML is also different from gaming: the human cost of Nvidia AI/ML expertise is significantly more than what the hardware is worth, which is next to nothing compared to those salaries.

So in the next 2-3 years, I don't see Intel taking much from Nvidia. They failed in mobile. Windows 10 is now slowly working towards ARM.

That leaves them with 1) a shrinking, downtrending PC market, and 2) their server CPU market, with its heavy margins, which will finally be getting a lot of competition from AMD's Zen x86 and Qualcomm's ARM server chips.

The Zen server chips are highly unlikely to be competitive against Intel in performance, but there is a lot of margin for AMD to attack. That means a lot of pressure on Intel to lower prices.

A lot of ARM servers will be coming in the next few years - not low-end blades, but powerful Xeon-like chips. I have always been skeptical of this, but it seems ARM, which is owned by SoftBank, which is also the largest shareholder of Alibaba, which in turn operates a gigantic cloud infrastructure, is all in on this.

This is not to say Intel will die in 5 years' time - they are likely to be around for decades after - but I don't see where they can grow and head.


It's funny when you think about NVIDIA's erratic debut (NV1, SEGA), then the constant progress to the top of the GPU field. I remember how quickly pro CAD graphics cards (E&S, Wildcat) became obsolete market-wise when Transform and Lighting was implemented. Now "AI", car vision, HPC... really "funny". Thanks to y'all Quakeheads, I guess. And Crysis.


E&S, 3dlabs... I do not miss those at all!


I could never afford one, they were only luxury icons to me. Were they complete scams or just overpriced ?


At the time they were the only game in town if you wanted OpenGL to run decent(ish) on Windows. It was still a time when you could buy an SGI. Drivers were a mess and price was steep. They weren't a scam, their products just felt lazy-done. Luckily, 'gamer' cards came quickly to OpenGL land.


I remember the weird span when 3DS tried to work with vendors with the heidi interface.


Puff piece. Yes, Nvidia dominates at the moment, but as many have said, this could easily change with a resurgent AMD. They have made all the right moves, IMHO, since Lisa Su took over:

0. Finally paying attention to software part of equation : hardware for compute has always been very good

1. drivers are now excellent

2. whole line of GCN devices (since 2012) are supported and still optimized, unlike notorious Nvidia nerfing of older cards

3. Has x86 license - if Zen is a success, can build high-bandwidth fabric between CPU and GPU, unmatchable by nVidia except on exotic Power8 arch.

4. new project focusing on HPC on Linux

5. they keep pushing OpenCL, which will win over Nvidia-only CUDA

6. New tools to semi-automatically port CUDA code to run on their hardware.

Well, here's hoping AMD gives them a run for their money in 2016 </fanboy>


0. Finally paying attention to software part of equation : hardware for compute has always been very good

They now support porting CUDA "natively", but still have no real answer to the datacenter CUDA features like DMA, proper virtualization, and networking.

1. drivers are now excellent

The Crimson driver suite is a step in the right direction, if only they didn't have a release every 6 months that actually physically damages cards.

2. whole line of GCN devices (since 2012) are supported and still optimized, unlike notorious Nvidia nerfing of older cards

Yes, that's an admirable feat, but then again no one really cares about 5-year-old hardware for either gaming or ML.

3. Has x86 license - if Zen is a success, can build high-bandwidth fabric between CPU and GPU, unmatchable by nVidia except on exotic Power8 arch.

NVIDIA also has the x86 license; it's more limited, and AMD (w/ Intel) did bash them when they tried to add x86 interoperability to their HPC parts, but NVIDIA has a very, very large IP portfolio.

4. new project focusing on HPC on Linux

AMD has a new project every 2 years; they tend to not die. CUDA on Linux is excellent; it's also the recommended platform.

5. they keep pushing OpenCL, which will win over Nvidia-only CUDA

OpenCL performance on NVIDIA GPUs is still better, and "OpenCL will beat CUDA" has been touted for nearly 10 years now...

6. New tools to semi-automatically port CUDA code to run on their hardware.

With still pretty poor results in many cases, and it only works for the most basic CUDA use cases. This was the bare minimum they had to do so people would even look at AMD hardware at large scale these days, since virtually every high-performing library is written in CUDA, simply because it offered a better solution and NVIDIA actually provides, whatcha call it - ah right, support...

Well, here's hoping AMD gives them a run for their money in 2016 </fanboy>


Let's compare notes end of next year :) I think things will be quite different by then.


Unfortunately, AMD executives have signalled in several interviews that they just don't care about deep learning. Also, I don't think AMD can start focusing on this field 10 years later than Nvidia and close the gap in a year as you suggest.



Oh yeah, it seems AMD wants to prove me wrong (I hope so): http://radeon.com/en-us/instinct/


Well, people seem to forget that all it will take from Intel is to make an ASIC like Nervana's (hint, hint: which they acquired) and suddenly Nvidia's in serious trouble.


Intel plans to release their Nervana-based chip, which they're calling "Lake Crest", next year.

http://www.forbes.com/sites/moorinsights/2016/11/21/intel-co...


That has its own problems... Not allowed to tell you, don't ask me.


> "I always think we're 30 days from going out of business," Huang says. "That's never changed. It's not a fear of failure. It's really a fear of feeling complacent, and I don't ever want that to settle in."

I dunno where this company is going next or even if it will be around (independently) in 10 years, but if that's really the attitude of its CEO, then at least today it's definitely in good hands.


When AMD acquired ATI a lot of people speculated that Intel would acquire Nvidia. It never happened. What followed was a disastrous series of acquisitions from Intel that never really amounted to much with McAfee being the worst of the lot.


The Von Neumann bottleneck is ripe for disruption. Ultra-low-power processors in memory, a few 3 GHz cores, and a fat matrix-multiply ASIC.



I'm not really ready to call Nvidia the new Intel. Nvidia seems like such a volatile company and I have no idea why. I just feel like I could wake up tomorrow and they could be gone.


Care to elaborate? I don't follow them closely but they have seemed to be consistently on top of the GPU market for some time now, at least in the gamer/enthusiast market.


Just look at how fast AMD fell. Everyone thought they'd profit big time from Bitcoin... I remember that I couldn't buy an AMD GPU for weeks in any store in Munich, because everyone and their dog bought them for BTC mining rigs. And then, a couple of months later? ASICs entered the field and prices for used AMD GPUs fell through the floor.

The same can happen to NVIDIA, especially when Intel decides to launch something inside their CPUs that is "good enough" for many people.


And note that it's not that hard. Google, who are not really known as a chip company, designed the TPUs:

The result is called a Tensor Processing Unit (TPU), a custom ASIC we built specifically for machine learning — and tailored for TensorFlow. We’ve been running TPUs inside our data centers for more than a year, and have found them to deliver an order of magnitude better-optimized performance per watt for machine learning. This is roughly equivalent to fast-forwarding technology about seven years into the future (three generations of Moore’s Law). (https://cloudplatform.googleblog.com/2016/05/Google-supercha...)

That sounds like a custom ASIC for this purpose. And if Google can do it, so can Intel.


The vast majority of Nvidia GPU sales are to gamers. Intel has been trying to build a "good enough" iGPU for years and years, with almost zero progress relative to their contemporary discrete GPUs.

Even the best current GPUs can't handle the inevitable future high-res VR. This is going to be a stable business for a while yet.


> Intel has been trying to build a "good enough" iGPU for years and years, with almost zero progress relative to their contemporary discrete GPUs.

Intel hasn't had real pressure to innovate for years. They watched the battle and fall of NVIDIA/AMD, and in the meanwhile enjoyed good sales of CPUs for all the cloud computing.

Now they have intense pressure to innovate: in the server space ARM is going to eat them up due to superior power efficiency, and laptop/desktop sales are down (again, because a four-year-old computer works just fine if you upgrade its RAM to accommodate Chrome).

Intel needs something mind-blowing and that quick. And NVIDIA too, casual gamers are shifting to console (where AMD has quite a customer base) and mobile (where Samsung/Apple with ARM are in the lead). I haven't seen any performant Intel or NVIDIA mobile SoC solution yet.


> mobile (where Samsung/Apple with ARM are in the lead). I haven't seen any performant Intel or NVIDIA mobile SoC solution yet.

Nvidia does have the Tegra, for the Android tablet market.

Interesting that they've ceded Windows on ARM (aka the rumoured Surface Phone) to Qualcomm - I would have thought Denver's code-morphing architecture would be better suited to emulating x86 instructions than a Snapdragon 835.


> Nvidia does have the Tegra, for the Android tablet market.

Which isn't used anywhere except a couple of tablets, according to Wikipedia. Samsung, inarguably the leader among high-end Android manufacturers, uses either its own SoCs or Qualcomm. Many cheaper phones use MediaTek or Allwinner chipsets.


It's used for its automotive products.

It's not used elsewhere because, most likely, NVIDIA never really wanted to push it. They tested the water a bit with a few 1st- and 3rd-party tablets and Android gaming consoles, but this always looked more like recouping some of the investment on the side while developing the SoC than an end goal.

The lessons learned from Tegra allowed NVIDIA to build the best (or at least the most powerful) automotive integrated solution currently on the market and at least as far as software compatibility and performance go they have no real competition.


You think casual gamers buy high-margin items like the GTX 1080 to power their Rift/Oculus?


The Oculus itself is still 600€, plus another ~1-2k€ for a powerful enough computer. Sorry, I don't see this as a sustainable market vision.


Sorry but no.

Nvidia (a) owns the high end deep learning space right now, (b) has shown with the Nintendo Switch arguably the future of gaming, (c) by abandoning PS4 etc they are moving to higher margin businesses.

For me they are looking pretty damn good.


[I've withdrawn this (joke) comment. It had read as below the line.]

--

This is totally normal for companies to do. In the 1920s Nvidia made car carburetors. In the 1820s they made horse-buggy whips. In the 1700s they made ships of the line, which they had made going back to the 15th century. It's a little-known fact that the Niña, the Pinta and the Santa María (Columbus's three ships) were powered by NVidia sails. yawn.


? nvidia was founded in 1993


it was a joke, and if it were upvoted I wanted to keep it but if it was downvoted erase it. I guess I didn't do well but due to your reply I can't erase it now.

Nintendo really was a playing-card company founded in the 19th century, though.


And Nokia used to make rubber boots and elevator cables!





