Greetings Colligoans,
I hope you enjoy this short piece.
Erik J. Larson
The year was 2012. A deep learning system known as AlexNet—named after its lead designer, Alex Krizhevsky—beat all comers at the annual ImageNet competition. Its performance classifying the competition's photos was outstanding, with an accuracy more than 10 percentage points higher than the second-best system's. AlexNet heralded a new era of AI and was quickly copycatted. Deep learning systems soon dominated artificial intelligence generally. Headlines about AI's new era of success splashed across the pages of tech publications everywhere. Elon Musk famously worried that we were "summoning the demon." No one seemed to realize that we were also back in the age of Big Iron.
Matrix multiplication—an operation any undergraduate math student surely knows—lies at the heart of neural networks. The math isn't complicated, but it's computationally expensive, particularly when deep learning systems have millions or billions of variables (called parameters). The success of these systems relies on gargantuan computing resources. The Big Iron powering today's "monster truck" AI is the GPU, a specialized processor (distinct from the CPU) first popular among gamers for its ability to crunch high-definition image data. Each GPU is effectively a computer in its own right, and ChatGPT reportedly uses some 10,000 GPUs for its training. Ten thousand computers for one AI application would make 1950s IBM execs blush. Big Iron is back.
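To make the computational point concrete, here is a minimal sketch in plain NumPy of why a neural network is, at bottom, a stack of matrix multiplications, and why the cost balloons with parameter count. The layer sizes and the ReLU activation are illustrative choices of mine, not a description of any particular system.

```python
# A toy "deep" network: each layer is just y = activation(W @ x + b),
# i.e. one matrix multiply per layer. Sizes here are tiny compared with
# production models, which have billions of parameters.
import numpy as np

rng = np.random.default_rng(0)

def dense_layer(x, W, b):
    """One layer: a matrix multiply followed by a simple nonlinearity (ReLU)."""
    return np.maximum(W @ x + b, 0.0)

# Layer widths: 1,000 inputs -> 4,000 -> 4,000 -> 10 outputs (toy numbers).
sizes = [1000, 4000, 4000, 10]
weights = [rng.standard_normal((m, n)) * 0.01 for n, m in zip(sizes, sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

x = rng.standard_normal(sizes[0])
for W, b in zip(weights, biases):
    x = dense_layer(x, W, b)          # every pass touches every weight

# Parameter count grows with the product of adjacent layer widths, and every
# forward or training pass has to crunch all of them -- hence the GPUs.
n_params = sum(W.size + b.size for W, b in zip(weights, biases))
print(f"toy network parameters: {n_params:,}")   # already ~20 million
```

Even this toy stack holds roughly 20 million numbers; scale the widths up and add more layers, and the arithmetic quickly outgrows anything but specialized hardware.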
Generative AI—systems like ChatGPT—is quintessential Big Data AI. Huge volumes of data are required to squeeze performance out of such systems, and the data volume itself is another bugbear for efficient computing. Big data for training means Big Iron for training. The deep neural networks in today's AI work well on many problems: they're good at image recognition (AKA surveillance), language translation, personalizing news feeds, and autonomous navigation, particularly with drones and other vehicles in air or sea environments. They are adept at playing a range of games, from old Atari titles to Go (and of course chess). So-called Generative Adversarial Networks (GANs), a clever offshoot of deep neural networks, also create deepfakes. Generative AI systems can likewise pose as humans in chats and messages, spreading misinformation. All these applications now require computing resources out of reach of all but tech behemoths and the super-rich. Like the IBM 700 series of the 1950s, which rented for $750,000 a month (not adjusted for inflation), these systems are not for the everyday person. They are too large and expensive for anyone but hyper-funded, VC-backed companies or the ultra-rich to own.
AlexNet—no doubt inadvertently—ushered in the modern Organization Man: a new conformism and a new division between the haves and the have-nots. AI companies can't be garage start-ups anymore. Historically, AI has been a diverse field, with academic, corporate, and government institutions trying different approaches and comparing results (or not). Today there's one approach, and only the richest institutions or individuals can collect the data and train the systems to be competitive. AI isn't for tinkerers and odd geniuses anymore; it's for bosses.
I hope this post introduces the "Big Iron" connection between our computing past and today. I want to cast doubt on the popular conceit that we're on a rocket ship to the future; in important ways, we seem to be revisiting our past. I launched this writing forum so we could discuss exactly this. My next post will suggest a wildly different parallel with our past as well: war. I'd like to talk about that, too.
I wonder if AI will eventually destabilize itself. What you describe amounts to a massive reduction in creativity and an extreme narrowing of perspective and intent.
And as for ChatGPT: as the internet is flooded with more and more automated content, there will be less and less original human content, and these systems will start using their own output to train themselves. That incestuous configuration has been demonstrated (I don't remember the source) to degrade information and eventually produce gibberish.
Then there is also the erosion of trust in internet content, of course, and the impossibility, in principle, of ascertaining the veracity of automated content.
Big data is mining human ingenuity and eroding trust in an unsustainable way. Nature, by contrast, does not work with centralised intelligence. If there were an evolutionary case for it, it would have happened and succeeded. We are currently seeing its failure evolve, in my opinion.
Thanks for this. Everything you say is true. There ARE some things happening, however, that may have the effect of democratizing the exploitation of AI. Meta's release of the LLaMA pre-trained model last spring was a bit of a bombshell, and it juiced a huge surge of innovation and exploration in applying pre-trained models to targeted uses through "fine-tuning," which can often be done for hundreds, not thousands or millions, of dollars. Mozilla has released a commercially available pre-trained model, and Meta recently released LLaMA 2 with a new license that allows commercial use. So there is some level of industry pressure, or at least interest, in democratizing and amortizing the costs of these models. It doesn't change your larger point about from-scratch training costs, but these last few months have been very interesting in terms of making the costs more tractable for the garage developer.
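For readers curious what that fine-tuning workflow looks like, here is a rough sketch of one common approach: parameter-efficient fine-tuning with LoRA adapters, using Hugging Face's transformers and peft libraries on an open pre-trained checkpoint. The model name, dataset file, and hyperparameters below are placeholders of mine, not a tested recipe, and LoRA is just one of several techniques people use to keep costs down.

```python
# A minimal LoRA fine-tuning sketch, assuming a local JSONL file of text
# examples ("my_domain_data.jsonl") and access to an open checkpoint.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base = "meta-llama/Llama-2-7b-hf"            # any open pre-trained model works
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token    # LLaMA tokenizers ship without one
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA freezes the billions of pre-trained weights and trains only small
# low-rank adapter matrices -- that is what keeps the cost in the hundreds
# of dollars rather than the millions a from-scratch run would require.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()           # a fraction of a percent of the model

dataset = load_dataset("json", data_files="my_domain_data.jsonl")["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True,
                                           max_length=512), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=4,
                           num_train_epochs=1, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

A run like this fits on a single rented GPU for many smaller models, which is exactly the "garage developer" economics the comment points to, even though pre-training the base model still required Big Iron.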