It happened right before our eyes, yet the full story of the forces that drove Gen AI’s inflection remains untold. Looking back, I can now see how a convergence of breakthroughs shaped the AI landscape: Transformer models, falling compute costs, and an explosion of data. These changes, some of which I witnessed firsthand during my career, reveal their true impact only in hindsight.
The concept of Artificial Intelligence (AI) has been around for over 70 years, but the pivotal moment came in 2017 with the introduction of the Transformer model in the groundbreaking ‘Attention is All You Need’ paper. This innovation is the foundation of GPT, ChatGPT, and all the large language models (LLMs) we see today.
Before this breakthrough, AI was limited to research and narrow applications, like ad monetization, with little potential for general-purpose intelligence or AGI. Early examples, such as Google’s Nest thermostat, relied on basic rules-based behavior, while Alexa, Siri, and Google Assistant often struggled with even simple tasks.
While the Transformer model was revolutionary, its true potential could only be realized with massive, diverse datasets, advances in compute, dramatic cost efficiency, and far better power efficiency. These forces converged in 2017, marking a turning point and enabling the rapid advances in Gen AI that culminated in the release of ChatGPT in 2022, a milestone that forever changed the way we view AI.
Data: The Lifeblood of AI
The smartphone revolution of 2007 and the rise of 4G connectivity in 2009 laid the foundation for a global data explosion. By 2017, over half the global population was connected to the internet via smartphones. In India, the 2016 launch of Reliance Jio’s 4G network added 100 million subscribers in just six months, democratizing internet access at just ~$3 per month. I distinctly remember visiting India during this time and witnessing firsthand how connectivity reached even remote areas, transforming daily life.
Source: Statista.com
This marked a critical turning point. Affordable access to mobile technology fueled unprecedented data generation, capturing every aspect of life, from personal habits to social interactions, on a global scale. The diversity and sheer volume of this data became essential for training AI models, enabling them to refine their capabilities and learn from edge cases. By 2017, data creation had reached roughly 25 zettabytes per year, providing the granularity and scale needed to train increasingly sophisticated AI systems. This explosion of data drove the demand for advanced computation, paving the way for GPU-powered innovations.
Source: explodingtopics.com
Compute: The Powerhouse Behind AI Innovation
The growth of computational power has been extraordinary, driven by innovations in Graphics Processing Units (GPUs) and the shift from serial to parallel processing. During my time at Nvidia from 2008 to 2013, leading the UX for its PhysX engine, I witnessed firsthand how GPUs were transforming industries.
At the time, Intel was promoting integrated graphics, arguing that discrete GPUs were unnecessary. But at Nvidia, we believed that more computation would always unlock new possibilities. This belief led to the launch of CUDA in 2007—a parallel computing platform that allowed developers to use GPUs for tasks beyond graphics. In 2008, we followed with PhysX, a physics simulation engine that brought realism to gaming and simulations. These innovations not only revolutionized gaming but also paved the way for advancements in scientific research, blockchain, cloud computing, and, ultimately, AI. Each breakthrough funded the next wave of innovation, enabling Nvidia’s journey to become a leader in computation.
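To make the shift from serial to parallel processing concrete, here is a minimal sketch of the data-parallel style that CUDA introduced. It is written in Python using Numba’s CUDA bindings rather than the original C/C++ CUDA API, and the kernel name and array sizes are purely illustrative. The key idea is that thousands of GPU threads each handle one element of the data, the same pattern that later made neural-network math so fast.

```python
# Illustrative sketch of CUDA-style data parallelism via Numba's CUDA bindings.
# Requires an NVIDIA GPU, CUDA drivers, and the numba + numpy packages.
import numpy as np
from numba import cuda

@cuda.jit
def vector_add(a, b, out):
    i = cuda.grid(1)          # global thread index: one GPU thread per element
    if i < out.size:          # guard threads that fall past the end of the array
        out[i] = a[i] + b[i]

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
vector_add[blocks, threads_per_block](a, b, out)   # Numba copies the arrays to and from the GPU

assert np.allclose(out, a + b)
```

A CPU would walk through these million elements largely one after another; the GPU dispatches them across thousands of threads at once, which is exactly the workload profile that deep learning would later exploit.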
This journey highlights how computing has evolved and why Nvidia is now pivotal in the AI world. Back in 2008, Nvidia was valued at ~$5B, while Intel, 30 times larger, stood at ~$150B. Intel used its dominance to suppress Nvidia, but we persevered. Today, Nvidia holds a unique advantage—not just in hardware but in its entire software ecosystem—enabling it to operationalize AI at scale. This success has been hard-earned and well-deserved, built on years of innovation and resilience.
Another major driver of computational growth has been the evolution of semiconductor manufacturing. Advances in lithography, most recently the extreme ultraviolet (EUV) machines pioneered by ASML, have shrunk process nodes from 600nm in the 1990s to just 3nm today. Smaller transistors mean far more of them on every chip, delivering dramatically higher performance and better energy efficiency with each generation.
Source: Wikipedia
By 2017, these GPU advancements could power massive AI workloads, making it practical to train Transformer models. Once a gaming tool, the GPU had become the powerhouse behind AI, efficiently handling the parallel workloads that large-scale models demand.
Cost: An Explosive Evolution in Efficiency
The dramatic reduction in the cost of computation, measured in dollars per GFLOPS (a billion floating-point operations per second), was a game-changer. In 2000, a GFLOPS cost about $1,000; by 2018, it cost a mere $0.01. To put this in perspective, it’s as if a $100,000 car in 2000 now costs just $1.
Source: Wikipedia
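The scale of that drop is easier to appreciate with a quick back-of-envelope calculation. The snippet below simply restates the figures above; exact values vary by source, so treat it as illustrative.

```python
# Back-of-envelope check of the cost collapse described above (illustrative figures).
cost_2000 = 1_000.00   # dollars per GFLOPS in 2000
cost_2018 = 0.01       # dollars per GFLOPS in 2018

reduction = cost_2000 / cost_2018
print(f"Cost reduction factor: {reduction:,.0f}x")   # 100,000x

car_2000 = 100_000
print(f"A ${car_2000:,} car scaled the same way would cost ${car_2000 / reduction:,.2f}")  # $1.00
```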
This cost efficiency made large-scale AI models feasible. Cloud computing platforms like Microsoft Azure and AWS have been instrumental in this evolution, and Microsoft’s strategic decision to provide Azure cloud resources to OpenAI played a pivotal role in accelerating AI’s progress. Google and other companies were developing AI models of their own, but they felt little urgency to release them until OpenAI’s GPT models showed the world the true potential of large-scale AI.
Power: The Backbone of AI Operations
AI isn’t just about algorithms and data; it requires immense power. Unlike traditional systems such as Google’s search engine, which retrieve pre-existing information, generative AI creates content on the fly. This process demands significant computational resources, especially during inference.
In 2017, GPUs crossed the milestone of 50 GFLOPS per watt, a critical inflection point for scalable AI operations. This efficiency allowed AI models to grow in size and complexity without corresponding exponential increases in power consumption. Energy-efficient computing, exemplified by the ARM-based processors widely used in smartphones, became essential for balancing these growing demands. I experienced this shift firsthand during my time at Nvidia, where I led the UX for a phone and a gaming console powered by the Tegra processor, focusing extensively on energy efficiency. This efficiency was one reason Nvidia attempted to acquire ARM Holdings, a company known for its power-efficient chip architecture. Apple has embraced the same philosophy, building in-house ARM-based chips for its devices and moving away from the more power-hungry x86 architecture.
Source: Researchgate.net
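To see why GFLOPS per watt matters so much, here is a rough, purely illustrative calculation. The 50 GFLOPS-per-watt figure comes from the discussion above, but the workload size is an assumption, not a measurement of any specific model.

```python
# Rough energy estimate for a hypothetical training workload (illustrative assumptions only).
efficiency = 50e9        # 50 GFLOPS per watt = 50e9 floating-point operations per joule
workload_flops = 1e21    # assumed total operations for a large training run (hypothetical)

energy_joules = workload_flops / efficiency   # FLOPs divided by FLOPs-per-joule gives joules
energy_kwh = energy_joules / 3.6e6            # 1 kWh = 3.6 million joules

print(f"Energy: {energy_joules:.2e} J ≈ {energy_kwh:,.0f} kWh")
# With these assumptions: 2.00e10 J ≈ 5,556 kWh. Real frontier training runs are far larger,
# which is why every gain in performance per watt matters.
```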
Today, data centers account for roughly 1-1.5% of global electricity consumption, a figure projected to grow as AI adoption accelerates. While semiconductor advances have reduced the power used per computation, the sheer scale of AI workloads means the energy grid will need to evolve to keep pace with demand. Even with optimizations, the cost of inference, such as serving ChatGPT queries, remains substantial, a challenge the industry must address as AI continues to scale.
Coming Soon – Part 2: The AI Arms Race
This concludes Part 1 of our exploration into AI’s rapid ascent, driven by data, compute, cost, and power. But this is just the beginning.
In Part 2, we’ll explore The AI Arms Race: a high-stakes contest among political superpowers and corporate giants vying for dominance in an AI-driven world. As AI reshapes economies and geopolitics, the future is unfolding faster than we can imagine. How do you see AI shaping the world? Share your thoughts; I’d love to hear them!
Stay tuned—because the AI race is only getting started.
Special thanks to Shrinu Kushagra, Sandeep Chaudhary, Chandan Sharma, Jaideep Godara, and Shubhankar Ray for their crucial feedback on the draft and invaluable inputs.
Also available at https://www.linkedin.com/