The Paradox of DeepSeek

The Headline

Last week, Chinese AI company DeepSeek released its latest foundation models, DeepSeek-V3 and DeepSeek-R1, along with a ChatGPT-like app. These models reportedly outperform some of the leading models from OpenAI, Anthropic, Google, and Meta on various benchmarks, including coding, natural language reasoning, and math. Notably, DeepSeek claims to have developed its models with significantly less investment than its U.S. counterparts, which typically spend hundreds of millions of dollars. For instance, DeepSeek-V3 was reportedly trained at a cost of around $5.6 million, using Nvidia's H800 chips, which are less powerful than the H100 chips used by leading U.S. companies.

The DeepSeek app has also quickly overtaken ChatGPT as the number one app in the App Store.

Market Impact

The announcement of these models led to a sharp decline in U.S. tech stocks, particularly affecting semiconductor companies like Nvidia (-17.5%), Broadcom (-16.5%), and TSMC (-14.3%). The revelation that AI models could be trained at a significantly lower cost using fewer high-end chips raised concerns among investors that demand for expensive, high-performance hardware might fall in the future, prompting a sell-off in related tech stocks.

Hot Take

Since 2019, I've been advocating for (and investing in) the thesis that the "AI application layer" is where big, transformative companies will be created. This is the foundational thesis of our firm Axiom. In fact, the foundation model layer is commoditizing even more rapidly than I had anticipated. Today, Andrew Ng, one of the most influential voices and visionaries in the AI revolution, confirmed that "the application layer is a great place to be. The foundation model layer being hyper-competitive is great for people building applications."

This is great news for the entire AI ecosystem. The cost of training (which is really the cost of general intelligence) just got slashed. Moreover, the fact that these models are open-source is a game-changer: powerful AI technology, now available at a much lower cost, can be highly customized by AI application layer startups. This democratizes AI innovation, enabling unprecedented creativity and development across the industry. What happens when something expensive gets commoditized? It proliferates, and people build things with it THAT WE NEVER EVEN DREAMED OF!

As Satya Nadella states, "Jevons paradox strikes again! As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of."

Jevons Paradox highlights the counterintuitive relationship between efficiency gains and overall resource consumption: as a technology becomes more efficient and accessible, rising demand can push total resource consumption up rather than down. The paradox has been most clearly observed in energy pricing and consumption. Jonathan Ross, the founder of Groq (my very first early-stage AI investment, in 2018, which just raised at a $2.8bn valuation), explains it well:

"When you make compute cheaper, do people buy more? Yes. It's called Jevons Paradox and it's a big part of our business thesis."
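The economics behind that quote can be sketched in a few lines. Under a constant-elasticity demand curve (a standard textbook model — the numbers below are illustrative, not real compute-market data), total spend rises as the unit price falls whenever price elasticity of demand exceeds 1, which is exactly the Jevons condition:

```python
def compute_demand(price, elasticity, scale=1.0):
    """Constant-elasticity demand: quantity bought at a given unit price."""
    return scale * price ** (-elasticity)

def total_spend(price, elasticity, scale=1.0):
    """Total spend = unit price x quantity demanded at that price."""
    return price * compute_demand(price, elasticity, scale)

# Suppose the unit price of compute halves (illustrative assumption).
before, after = 1.0, 0.5

# Elastic demand (elasticity > 1): total spend RISES as price falls -- Jevons.
print(total_spend(after, 1.5) / total_spend(before, 1.5))  # ~1.41x

# Inelastic demand (elasticity < 1): total spend falls -- no paradox.
print(total_spend(after, 0.5) / total_spend(before, 0.5))  # ~0.71x
```

The bet implicit in "Jevons strikes again" is that demand for intelligence is highly elastic: cut the price of compute in half and usage more than doubles.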

Groq will be another huge winner amidst these trends ...

The major shift that many folks are missing is that very few companies will need to train their own models. The real demand for compute is shifting to inference-time compute, which is where AI models "reason" and where most application-layer founders spend their development time. When those costs come down, the capabilities we can build on top of these AI foundation models will proliferate and simply BLOW OUR MINDS!

In my mind, the sell-off is not just premature; it's completely missing the bigger picture. So, I'm still long Nvidia and Broadcom AND, obviously, remain hyper bullish on our thesis at Axiom!

Please reach out if you are building something amazing in the AI application layer!