It’s hard to believe there was a time when it was reasonable for the President of IBM to proclaim: “...there is a world market for maybe five computers.” Today, many pundits are looking at the AI ecosystem without considering the direction technology must advance to transport us to the next logical step in our technological evolution. That destination, which we'll discuss here, shares many of the same properties and transitions that were required to move from the mainframe to the smartphones in our pockets.
This discussion will require you to take a leap with me, but first, let’s set up some context -- and some congratulations, because we're living at an incredible moment.
AI is not an idea or a distant future -- it’s here. We will debate for many more years whether we’ve reached sentience, but that’s philosophy. Pragmatically, ChatGPT generates enough to write a paragraph for every American every day. That’s one hundred billion words a day and growing. An AI upstart-turned-market-leader went from zero to over one hundred million users faster than any company in history.
The exponential acceleration we’ve seen is, in part, thanks to our hyper-connected society and our own increasing dependence on our technologies. We’ll revisit this trend a bit later.
OpenAI’s ChatGPT, Google’s Gemini, Anthropic’s Claude, and others are built on “foundation” large language models (LLMs) that their makers host for consumers via subscriptions and for developers via usage fees.
These models are a technological feat; however, they’re also massive and expensive to run. For those of us, like me, who have waited a lifetime for this moment, the exciting part is that we’re at the very start of a new wave of how we engage with technology and, in turn, the world.
Personal AI: Driving a Revolution in AI Decentralization
Our entry into this new era has been led by centralized AI, sponsored by massive organizations that host tightly coupled, proprietary models and architectures. Much like the shift from mainframes to iPhones, the demand for personal AI will act as a catalyst for miniaturizing and decentralizing AI capabilities.
Up until now, the computational demands of advanced AI have necessitated powerful servers or reliance on cloud connectivity. There are two separate "ends" that will drive us to decentralize and shrink.
First, personal robots. For robots to navigate seamlessly, interact meaningfully within our homes, and live with us privately, they need onboard intelligence. Limitations in size and power efficiency, plus the need for offline functionality, force a transition away from a purely centralized AI architecture. Second, the device in your pocket. Every app wants to use AI, and the ability to run or call OS-level models could be a game changer for developers.
This necessitates the development of smaller, more specialized AI models suitable for embedded systems. Think about it: A home robot needs to perform real-time object recognition, speech understanding, and decision-making independently of cloud resources. This push for "onboard AI" will accelerate research into:
- Efficient Neural Networks: Optimizing models for size and computation while maintaining functionality (see the quantization sketch after this list).
- Neuromorphic Computing: Research inspired by biological brains could yield AI chips mimicking their efficiency.
- Hybrid Approaches: Balancing local processing with selective cloud connections for complex tasks.
- Software Optimizations: Improving how we train, transfer, snapshot, and store models.
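To make the first bullet concrete, here's a minimal sketch of post-training dynamic quantization, one common way to shrink a model for on-device use. It assumes PyTorch is installed; the toy model and sizes are illustrative, not drawn from any shipping system.

```python
# Minimal sketch: post-training dynamic quantization with PyTorch.
# Linear-layer weights drop from 32-bit floats to 8-bit integers,
# roughly a 4x size reduction for those layers.
import os
import torch
import torch.nn as nn

# A toy stand-in for a real on-device model.
model = nn.Sequential(
    nn.Linear(512, 1024),
    nn.ReLU(),
    nn.Linear(1024, 256),
)

# Quantize weights to int8; activations are quantized on the fly at runtime.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module, path: str = "/tmp/m.pt") -> float:
    torch.save(m.state_dict(), path)
    return os.path.getsize(path) / 1e6

print(f"fp32: {size_mb(model):.2f} MB -> int8: {size_mb(quantized):.2f} MB")
```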
This research goal aligns with that of the open-source community, which is actively innovating to create capable systems that can compete with the cloud-based behemoths.
The rise of Personal AI could ignite a paradigm shift akin to the birth of the PC. By driving AI out of the data center and directly into our living spaces, the technology would become intimately interwoven with our everyday lives. This transition promises practical benefits, but it also raises critical questions about privacy, data ownership, and the evolving human-AI relationship. Today, I'll stick to being a techno-optimist, though I've covered the risks in the past.
We're in the Cambrian “Homebrew” Era of Personal AI
In 1975, the Intel 8080’s new competitor, the MOS Technology 6502, was all the rage in the pages of the niche Byte magazine: an 8-bit processor at a $25 price point, more than an order of magnitude cheaper. A young Steve Wozniak knew he could build a personal computing project around it.
Today, for the first time in my career, I can run multiple open-source models locally and jump between them, chaining them together with easy-to-use tooling. When the masses get their hands on new technology, things change quickly. The tech community is releasing new tools at a massive clip, letting more and more developers access and innovate on top of open-source LLMs. I'm learning to piece together new functionality from what feels like haphazardly released library code and open-source contributions.
The likes of Hugging Face, LangChain, LlamaIndex, MetaGPT, AutoGen, Mistral, Llama, Ollama, vLLM (and the list goes on) aren't just fighting for our attention – they're racing to be the backbone of a new intelligent ecosystem that powers and re-envisions all software experiences. Whether they know it or not, they’re letting the world build intelligent systems at a fraction of the cost, because developers can use system design to lean on more specialized (smaller) models.
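As a taste of that tooling, here's a hedged sketch of chaining two local models through Ollama's REST API. It assumes `ollama serve` is running locally with the `llama3` and `mistral` models pulled; the prompts are just illustrations.

```python
# Chain two locally hosted models: one drafts, the other critiques.
# Assumes a local Ollama server (default port 11434) with both models pulled.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

def ask(model: str, prompt: str) -> str:
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

outline = ask("llama3", "Outline a blog post on personal AI in three bullets.")
critique = ask("mistral", f"Critique this outline in two sentences:\n{outline}")
print(critique)
```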
AI in Your Pocket and Moving Around Our Home
As if access to all the world's information isn’t enough, we need more in our pockets and living rooms.
Beyond the open-source movement's push for efficiency, there are many signals that efficiency gains will lead to new uses for AI.
Apple signaled their focus on miniaturizing AI with a recent paper on “Speculative Streaming,” a technique that speeds up LLM inference while using roughly 10,000x fewer extra parameters than comparable approaches, with minimal degradation in quality. It's clear their "privacy-first" position requires on-device models to handle users' queries.
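For intuition, here's a toy illustration of classic speculative decoding, the family of techniques Apple's paper builds on: a cheap draft model guesses several tokens ahead, and the expensive model verifies them, keeping the longest agreeing prefix. The two "models" below are hard-coded stand-ins, not real networks, and this is the general idea rather than Apple's exact method.

```python
# Toy speculative decoding: draft cheaply, verify with the "big" model,
# accept the agreeing prefix. Both models are hard-coded stand-ins.
def draft_model(prefix: list[str], k: int = 4) -> list[str]:
    # Fast but sloppy guesser.
    guess = ["the", "cat", "sat", "on", "a", "mat"]
    return guess[len(prefix):len(prefix) + k]

def target_next(prefix: list[str]) -> str:
    # Slow but accurate model, queried one token at a time.
    truth = ["the", "cat", "sat", "on", "the", "mat"]
    return truth[len(prefix)]

def speculative_step(prefix: list[str]) -> list[str]:
    accepted: list[str] = []
    for guess in draft_model(prefix):
        truth = target_next(prefix + accepted)
        accepted.append(truth)
        if truth != guess:   # first mismatch: keep the correction, stop
            break
    return prefix + accepted

tokens: list[str] = []
while len(tokens) < 6:
    tokens = speculative_step(tokens)
print(" ".join(tokens))  # -> "the cat sat on the mat"
```

When the draft is right, several tokens are accepted per expensive verification pass, which is where the speedup comes from.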
Rabbit demo-launched the R1, an AI-first device that, last I checked, had over ten million USD in pre-orders. Microsoft CEO Satya Nadella told Bloomberg: “I thought the demo of the Rabbit OS and the device was fantastic…I think I must say, after Jobs and the launch of iPhone, probably one of the most impressive presentations I've seen.”
Although I don’t believe the Rabbit R1 is the end state -- it may not even find long-term commercial success -- I do believe it offers an early glimpse of the experiences that are coming. My bet is that Samsung, Apple, and other smartphone makers will integrate its best features into their devices and, eventually, offer onboard AI capabilities that app developers can leverage for integrated experiences.
The future is in miniaturization, specialization, and “compound AI systems.” The academic founders of Databricks and Apache Spark are pushing the world to believe that “compound” is the future, and I agree -- because our future requires it.
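To sketch what "compound" could mean in practice: route easy queries to a small local specialist and escalate hard ones to a bigger generalist. The routing heuristic and model names below are illustrative assumptions (again using a local Ollama server), not anyone's production design.

```python
# A toy compound AI system: a router picks between a small local specialist
# and a bigger generalist. Assumes a local Ollama server with both models.
import requests

def ask_ollama(model: str, prompt: str) -> str:
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    r.raise_for_status()
    return r.json()["response"]

def looks_hard(prompt: str) -> bool:
    # Naive stand-in for a learned router.
    return len(prompt) > 200 or "step by step" in prompt.lower()

def answer(prompt: str) -> str:
    # Small model handles the common case; the big one is the fallback.
    model = "llama3:70b" if looks_hard(prompt) else "phi3"
    return ask_ollama(model, prompt)

print(answer("What's the capital of France?"))
```

If most queries never leave the small model, the big one becomes a rarely paid tax -- which is exactly the economic argument for specialization.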
We're Going Long: Let's Conclude
The age of cloud-based, mainframe-like AI is a stepping stone – a necessary chapter in our technological saga. Soon, AI won't reside in distant server farms, but in the gadgets in our pockets and the machines weaving through our homes and traveling with us everywhere. This future is pulling at us today, and it will determine who is positioned to hold onto a market lead for more than a few minutes. The potential contained within this 'Homebrew' era of accessible AI development hints at a future where technology isn't merely used, but becomes a true collaborator and partner.
I'm looking forward to meeting them soon.