CEO & Co-founder, Nervana Systems
I’d like to start with an introduction of yourself and your educational and professional background.
I’ve been thinking about artificial intelligence for a long time. I used to read a lot of Asimov and Arthur C. Clarke. It worked out in my favor that the big sci-fi trends in the mid 80’s/early 90’s seemed to all be centered around AI!
This curiosity led me to Duke University, where I graduated with degrees in both computer science and electrical engineering, along with many classes in philosophy. During my undergrad years, I worked on neuromorphic circuits, which are circuits that mimic some of the behaviors that we see in biological neurons. I found it fascinating that you could characterize biological systems with signal processing approaches and vice versa.
With my degrees in hand, I packed my bags and moved to Silicon Valley for a job at Sun Microsystems while going to graduate school at Stanford. However, in less than two years, I left Sun, dropped out of my master’s program and went to a startup developing reconfigurable processors. I transitioned in and out of a few startups working on a variety of things: DSP, wireless DSP, networking, video compression, etc.
After ten years in the industry, I had a very good idea of how a computer worked, and I was ready to take on the bigger question that initially motivated me to work with computers: how can we build a better computer by looking at biology? I quit my job in 2007 and moved across the continent to get a Ph.D. in computational neuroscience at Brown University. I worked in the lab of renowned neuroscientist John Donoghue. He was the first scientist to embed intra-cortical implants in physically paralyzed individuals and pull out electrical signals. This work ultimately led to the development of neural prosthetics.
I completed my PhD in 2011 and took a job at Qualcomm working on Zeroth, their neuromorphic chip project. It was a perfect fit for the skills I had developed — knowing how to build and design chips and software stacks combined with the understanding of neurobiology.
During my time at Qualcomm, deep learning was rising as a state-of-the-art technique, and we saw the opportunity. Nervana Systems was founded to optimize the development of deep learning solutions, to enable you to train a neural network in an hour versus a month.
How does deep learning differ from machine learning or differ from AI?
AI, machine learning, and deep learning nest like circles in a Venn diagram. AI is the largest circle and refers to machines exhibiting human intelligence. Machine learning is a very general set of techniques for achieving AI wherein you train or learn model parameters from data. But machine learning requires that the data scientist first identify what features are important for the model to train on. Deep learning is a kind of machine learning in which usable features can be discovered from data automatically.
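The distinction can be sketched in code. In the classic machine-learning workflow, a human writes the feature extractor by hand; in deep learning, the first layers of the network play that role and their weights are learned from data. A minimal NumPy illustration (the features and the weight matrix here are hypothetical stand-ins, with the weights random rather than trained):

```python
import numpy as np

# Classic ML: a human decides which features matter.
def hand_engineered_features(signal):
    """E.g. mean level and total variation, chosen by the data scientist."""
    return np.array([signal.mean(), np.abs(np.diff(signal)).sum()])

# Deep learning: the first layer IS the feature extractor, and its
# weights W are learned from data via backprop (random here for brevity).
rng = np.random.default_rng(0)
W = rng.standard_normal((2, 16))

def learned_features(signal):
    # ReLU(x @ W): features discovered rather than designed
    return np.maximum(0.0, signal @ W)

signal = rng.random(2)                       # toy two-sample input
print(hand_engineered_features(signal).shape)  # (2,)
print(learned_features(signal).shape)          # (16,)
```

The practical consequence is the one described above: with the hand-engineered path, model quality is capped by the features a person thought to write down; with the learned path, the features themselves improve during training.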
What are some of the changes that deep learning is driving at the chip level?
If you look at the history of computation, engineers in the 50’s and 60’s were motivated to develop the first computer processors so that automated lists of calculations could be precisely followed and quickly executed. Although our primitives have changed drastically since then, the hardware remains similar. In fact, the machines built in the 60’s aren’t terribly different from what we build today.
In the next few years, the biggest shift in computer architecture will be designs that reflect neural architectures. That shift lets us look at computations in batches and operate across entire matrices rather than one scalar at a time. Neural networks rest on a small subset of linear algebra, so the goal is to build a machine that accelerates and distributes exactly that subset.
The second shift has been our transition away from high-precision computations. Scientific computing demands meticulous precision in its computations because you’re often trying to model something from a physical environment, e.g. Department of Energy models of nuclear explosions. We have these extremely precise physics models that are built upon continuous spaces, similar to continuous number systems. You have to represent numbers very precisely to get good results.
Neural networks are vastly different from what we’ve modeled in the past in that they are highly tolerant of noise. At Nervana, we recognized that we could take advantage of this when designing a better chip for deep learning. By designing chips for lower precision, we were able to pack more parallelism onto a chip and run calculations with world-class speed. And the neural network actually uses the noise and imprecision as a regularizer, preventing the network from over-fitting on data. As a result, we get networks with both speed and accuracy.
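To make the precision point concrete, here is a hypothetical NumPy sketch of rounding values onto an 8-bit grid. The rounding error is small, bounded, and noise-like, which is the property a low-precision chip can exploit as an implicit regularizer (the `quantize` function and bit width are illustrative, not Nervana's actual number format):

```python
import numpy as np

rng = np.random.default_rng(42)

def quantize(x, bits=8):
    """Round x onto a low-precision signed grid, as a low-bit chip would."""
    scale = (2 ** (bits - 1) - 1) / np.abs(x).max()
    return np.round(x * scale) / scale

x = rng.standard_normal(10_000)   # stand-in for activations or weights
x8 = quantize(x, bits=8)

# Quantization error is bounded by half a grid step...
err = x8 - x
step = np.abs(x).max() / (2 ** 7 - 1)
print(np.abs(err).max() <= step / 2 + 1e-9)  # True

# ...and behaves like small injected noise, which is a classic
# regularization technique against over-fitting.
```

Scientific computing cannot tolerate this rounding, but a noise-tolerant neural network trains to essentially the same accuracy while the narrower arithmetic units take far less silicon, which is where the extra parallelism comes from.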
What are the design differences that enable the Nervana Engine to perform better than a GPU for deep learning?
The reason the Nervana Engine performs better than a GPU is because we built it specifically for deep learning. GPUs are designed for graphics processing required by video games, and they happen to have some nice properties that work well for deep learning. They have a high level of parallelism on the chip with cores designed for linear algebra, the same types of calculations used in neural networks. But GPUs also carry some additional baggage that is not helpful for deep learning.
We already talked about number representation. A GPU uses floating point numbers for high precision. But the Nervana Engine is able to achieve the same accuracy with lower precision, resulting in higher speed.
A GPU has cache structures that are necessary to support various types of graphics processing. But with deep learning, once you spell out the neural network, you know every step of the computation from that point forward. You know the execution pattern and the data access pattern. So you don’t need caches to store information in case you need it. Caches would simply get in the way and take up real estate on your chip. We also know the basic operations (or primitives) up front. Neural networks have only about 20 primitives, and only two really matter — generalized matrix multiply and convolution. So we focused primarily on those two.
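The two primitives named above can be sketched directly. This is a toy NumPy illustration of what a generalized matrix multiply and a convolution compute, not anything like the chip's actual kernels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Primitive 1: generalized matrix multiply (GEMM), the core of
# fully connected layers: Y = X @ W.
X = rng.random((4, 8))   # batch of 4 inputs, 8 features each
W = rng.random((8, 3))   # layer weights
Y = X @ W
print(Y.shape)  # (4, 3)

# Primitive 2: convolution, the core of convolutional layers.
# A naive direct 2D "valid" convolution (no kernel flip, as in
# deep learning frameworks):
def conv2d_valid(img, k):
    H, W_, kh, kw = *img.shape, *k.shape
    out = np.zeros((H - kh + 1, W_ - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + kh, j:j + kw] * k).sum()
    return out

img = rng.random((5, 5))
kernel = rng.random((3, 3))
print(conv2d_valid(img, kernel).shape)  # (3, 3)
```

Because every step and every data access in these loops is known once the network is spelled out, a chip can stream operands on a fixed schedule instead of relying on caches to guess what will be needed next.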
How will deep learning shape the technology and application landscape in the coming years?
I predict that all architectures are going to be more “neural” in the future. For the next five to 10 years, I don’t see general purpose processors completely going away. However, I do see many workloads shifting away from sequences of instructions to data inference.
We are inundated with data. It’s predicted that data will grow to 100 times its already untenable size over the next 10 years. So the biggest computational problem of our time is finding useful structure in data. That makes data inference very important. I believe that all architectures will start supporting data inference as a main workload, whether that be on a host processor or a device. We’re going to start seeing a lot of applications in this area, too. We need apps to figure out what the data is saying.
The cloud infrastructure is going to become more important, because it is going to take a lot of processing power to crunch through all that data and find those useful inferences.
With these vast amounts of data continuously pouring in, where does all the data go? Will we have to store it all?
In the short term, yes. There are an infinite number of inferences that can be pulled out of the data, but it’s not always apparent what inferences will be useful in the future, so for the time being, we need to log everything to make sure we still have that ability to go back and say, “Aha, here’s a new way I can look at the data.” And deep learning will help us do that.
How has Nervana changed now that you are part of Intel?
Within Intel, the Nervana team is still together in one group, and we will remain this way for the foreseeable future. But even though we are still operating as we were, there will be changes in the future. As part of Intel, we can think differently. As a startup, we were just trying to get a foothold and establish a new market. As part of Intel, there’s an established market, eager channel partners, etc., and we can take this technology and scale it quickly or package it differently.
What is the long-term future of the Nervana Engine?
Although I cannot comment much on this topic, I will say that Intel is very cognizant that a general purpose processor will sacrifice on something. So for them, having a broad portfolio makes a lot of sense.
At Intel, we look at what the industry is going to be like in four years. It’s a different kind of view on the world, and it is extremely exciting to be part of it! Intel is the biggest semiconductor company in the world, and to have them thinking about what’s the best way to implement a neural architecture — well, that’s nirvana for me.
Where is deep learning headed?
We are all already experiencing deep learning in our everyday lives. Your searches on Google, auto-face tagging in Facebook, voice recognition on Siri – these are all deep learning applications right now. Even things like Swype keyboards are using deep learning to predict your keystrokes. Deep learning will continue to enable better personalization of experiences.
Deep learning will also enable advances in robotics. This has been an industry dream for some time. We aren’t quite ready for mass adoption due to the high price of building a robot, but we’re approaching the point where we understand the processing requirements.
In ten years, the world is going to be a very different place in terms of the technology we surround ourselves with. Our current phase of exponential growth will eventually level out, but for now, we’re embracing the innovations pouring out of technology.
A Short Bio of Dr. Naveen Rao
Naveen’s fascination with computation in synthetic and neural systems began around age 9 when he began learning about circuits that store information along with some AI themes prevalent in sci-fi at the time. He went on to study electrical engineering and computer science at Duke, but continued to stay in touch with biology by modeling neuromorphic circuits as a senior project. After studying computer architecture at Stanford, Naveen spent the next 10 years designing novel processors at Sun Microsystems and Teragen as well as specialized chips for wireless DSP at Caly Networks, video content delivery at Kealia, Inc., and video compression at W&W Comms. Armed with intimate knowledge of synthetic computation systems, Naveen decided to get a PhD in Neuroscience to understand how biological systems do computation better. He studied neural computation and how it relates to neural prosthetics in the lab of John Donoghue at Brown. After a stint in finance doing algorithmic trading optimization at ITG, Naveen most recently was part of Qualcomm’s neuromorphic research group leading the effort on motor control and doing business development. It’s in Nervana’s DNA to bring together engineering disciplines and neural computational paradigms to evolve the state-of-the-art and make machines smarter.