Meta enters the multimodal AI race
Plus: The future is AI-centric computing
Happy Friday! Thought you’d seen the last of Humane, the startup behind the (very poorly reviewed) AI pin? Think again. The company is now on the hunt for a buyer for its business…which it values at between $750 million and $1 billion (no, really).
Looks like Humane is either hoping for a bit of humanity in the market or hiding some crazy innovation up its sleeve (AI cufflinks?)...
Hit reply and let me know who you think might bite!
Meta Jumps into Multimodal AI with Chameleon
Competition in generative AI is at an all-time high, with players like OpenAI and Google ramping up their multimodal capabilities. Meta’s answer? Chameleon.
Meet Chameleon: This Wednesday, Meta introduced Chameleon, its experimental multimodal model. This model leverages a unique early-fusion token-based mixed-modal architecture (a real mouthful!) designed to learn from a combination of images, text, code, and other modalities.
How it works: Chameleon uses a “unified vocabulary” with text, code, and image tokens—making it possible to apply the same transformer architecture to multimodal sequences (like inputs with both image and text tokens).
This unified approach beats the conventional late-fusion method—a patchwork of models trained for specific inputs, which struggles to integrate information across modalities.
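If you prefer to think in code, here’s a minimal sketch of the early-fusion idea: text and image tokens share one vocabulary, so a single transformer can read a mixed sequence end to end. The tokenizer, vocabulary sizes, and helper names below are made up for illustration; this is not Meta’s actual implementation.

```python
# Minimal sketch of early-fusion tokenization: text and image tokens live in one
# shared vocabulary, so a single transformer can consume mixed-modal sequences.
# Vocabulary sizes and helpers are illustrative stand-ins, not Meta's implementation.
from dataclasses import dataclass
from typing import List, Union

TEXT_VOCAB_SIZE = 65_536      # hypothetical size of the text/code vocabulary
IMAGE_CODEBOOK_SIZE = 8_192   # hypothetical number of discrete image codes


@dataclass
class ImagePatchCodes:
    """Discrete codes for an image, as produced by a VQ-style image tokenizer."""
    codes: List[int]


def tokenize_text(text: str) -> List[int]:
    # Stand-in for a real BPE tokenizer: hash each word into the text vocab.
    return [hash(word) % TEXT_VOCAB_SIZE for word in text.split()]


def to_unified_ids(segment: Union[str, ImagePatchCodes]) -> List[int]:
    # Text tokens keep their IDs; image codes are offset past the text vocab,
    # so both modalities share one "unified vocabulary."
    if isinstance(segment, str):
        return tokenize_text(segment)
    return [TEXT_VOCAB_SIZE + code for code in segment.codes]


def build_mixed_sequence(segments: List[Union[str, ImagePatchCodes]]) -> List[int]:
    # A mixed-modal prompt becomes one flat token sequence that the same
    # transformer processes end to end: no separate per-modality encoders.
    ids: List[int] = []
    for segment in segments:
        ids.extend(to_unified_ids(segment))
    return ids


if __name__ == "__main__":
    prompt = [
        "Write a travel guide for this photo:",
        ImagePatchCodes(codes=[12, 907, 4401]),
    ]
    print(build_mixed_sequence(prompt))
```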
What sets this model apart? Its unique training:
First, it’s fed a massive dataset with 4.4 trillion tokens of text, image-text pairs, and sequences of text and images mixed together.
Then, researchers train Chameleon using five million hours of Nvidia A100 80GB GPU time (that's like watching your favorite show nonstop for more than five centuries).
Blending in or standing out? Chameleon shines with prompts expecting both text and images (like creating a travel guide or an interactive recipe book) and matches rival models Mixtral 8x7B and Gemini Pro on text-only tasks.
Why it matters: Because Chameleon’s early fusion model understands complex modality interactions, it opens up possibilities for real-life applications like context-based image captioning and powering intelligent robotics systems.
Anthropic's AI Decoding Shows Major Advance in LLM Understanding
On Tuesday, Anthropic released an AI research breakthrough: the first detailed look inside a modern LLM.
Some context: One of the biggest mysteries in AI? The "black box" nature of AI models. We know what these models can do, but figuring out why they respond the way they do has remained out of reach.
…until now. Using a technique called "dictionary learning," Anthropic mapped out how Claude Sonnet, one of its advanced AI models, processes millions of concepts:
Researchers at Anthropic identified patterns of neuron activations (the levels of activity in the computational units making up the model's neural network) called "features" that correspond to specific concepts or ideas.
These features don’t just capture patterns (as researchers once thought) but represent higher-level abstract concepts the model has learned, like understanding the Golden Gate Bridge beyond its name or gender bias in professions.
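For the technically curious, here’s a toy sketch of the dictionary-learning idea: decompose each activation vector into a sparse mix of learned "feature" directions. Random vectors stand in for the model’s activations, and scikit-learn’s off-the-shelf DictionaryLearning stands in for Anthropic’s far larger sparse-autoencoder setup, so treat it as an illustration rather than a recipe.

```python
# Toy sketch of dictionary learning over model activations. Random vectors stand in
# for the model's internal activations; the real work does this at far larger scale
# with sparse autoencoders, so this is purely an illustration of the idea.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
activations = rng.normal(size=(400, 64))   # (examples, hidden_dim): stand-in data

# Learn an overcomplete set of "feature" directions such that each activation
# vector is approximated by a sparse combination of them.
learner = DictionaryLearning(
    n_components=96,                   # more features than dimensions (overcomplete)
    transform_algorithm="lasso_lars",  # sparse coding for the coefficients
    transform_alpha=0.1,
    max_iter=30,
    random_state=0,
)
codes = learner.fit_transform(activations)  # sparse coefficients, one row per example
features = learner.components_              # learned dictionary, one row per feature

# Each nonzero coefficient says "this feature fired on this input"; researchers then
# inspect which inputs light up a feature and give it a human-readable label.
print("average number of active features per example:",
      float((codes != 0).sum(axis=1).mean()))
```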
Some key findings:
Feature Causality: By artificially amplifying or suppressing certain feature clusters, researchers found that these features don't just correlate with concepts—they actively influence the model's behavior and outputs (toy sketch below).
For instance: Cranking up the "Golden Gate Bridge" feature caused Claude to self-identify as the bridge, while boosting a praise feature led to flattering but untruthful responses.
Safety Implications: Anthropic believes that the ability to identify and manipulate features could strengthen AI safety. For example, amplifying a feature associated with scam email recognition may help the model better identify and warn users about potential scams.
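To make that amplify-or-suppress trick concrete, here’s a toy sketch of feature steering: clamp the model’s activations along one learned feature direction and watch the projected feature value move. The feature vector here is random and the mechanics are simplified assumptions, not Anthropic’s actual pipeline.

```python
# Toy sketch of feature steering: set the activation of one learned "feature"
# direction to a chosen value and see the projection move. The feature vector here
# is random and the mechanics are simplified assumptions, not Anthropic's pipeline.
import numpy as np

HIDDEN_DIM = 64
rng = np.random.default_rng(1)

# Pretend this unit vector is the "Golden Gate Bridge" feature found by dictionary learning.
bridge_feature = rng.normal(size=HIDDEN_DIM)
bridge_feature /= np.linalg.norm(bridge_feature)


def steer(activations: np.ndarray, feature: np.ndarray, strength: float) -> np.ndarray:
    """Clamp the feature's activation to `strength` for every token.

    A large strength amplifies the concept; strength = 0 suppresses it.
    """
    current = activations @ feature                       # per-token feature activation
    return activations + np.outer(strength - current, feature)


# Stand-in for one layer's activations over a short token sequence.
acts = rng.normal(size=(8, HIDDEN_DIM))
amplified = steer(acts, bridge_feature, strength=10.0)    # model now "is" the bridge
suppressed = steer(acts, bridge_feature, strength=0.0)    # concept switched off

for name, a in [("original", acts), ("amplified", amplified), ("suppressed", suppressed)]:
    print(f"{name:10s} mean feature activation: {(a @ bridge_feature).mean():+.2f}")
```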
Why it matters: This is like finding the Rosetta Stone for AI models. Understanding how these models think is key to making them safe and trustworthy. By shedding light on the inner workings of AI, Anthropic's work could significantly advance the field of AI interpretability and safety.
Stay tuned—this is just the beginning of a new era in AI understanding.
Want to Harness AI? Here’s Your Guide
ChatGPT has opened up endless possibilities for productivity—all you have to do is ask.
That’s where HubSpot’s new guide comes in. In it, you’ll find out how ChatGPT can help you:
Automate tasks
Enhance decision-making
Foster innovation
Plus, it contains 100 prompt ideas and ethical best practices to help you unleash the power of generative AI right away.
Ready to unlock unparalleled productivity?
Microsoft Unveils Copilot+ PCs for AI-Centric Computing
Imagine a PC that not only understands your needs but anticipates them. At its recent developer conference, Microsoft unveiled new PCs and a suite of AI features designed to transform the computing experience.
AI-first hardware. Copilot+ PCs, Microsoft’s new line of AI-centric computers, are equipped with NPUs (Neural Processing Units) to drive powerful AI features, including:
Recall: This feature logs the apps and content you’ve accessed over weeks or months, creating a scrollable timeline of your past activities stored locally on your device. You can rediscover old chats or find that elusive PowerPoint slide with ease.
Super Resolution: This AI-driven feature automatically upscales low-resolution images, enhancing details and clarity to give your memories an upgrade—whether it’s old family albums or blurry pics from nights out.
Plus, Copilot+ PCs offer real-time language translation, intelligent noise suppression, and dynamic background blurring for video calls.
Why it matters: With advanced AI features baked into the hardware and software, these machines could open the door to new innovations in personal computing.
Amazon plans to give Alexa an AI overhaul and a monthly subscription price.
Patronus AI is off to a magical start as LLM governance tool gains traction.
At the Seoul summit, heads of state and companies commit to AI safety.
Nvidia will now make new AI chips every year.
TikTok turns to generative AI to boost its ads business.
Microsoft intros a Copilot for teams.
Will the new “AI Windows” change how we use computers? In this video, I explore an answer:
Are we getting closer to achieving AGI? On the latest episode of The Next Wave, we sit down with Yohei Nakajima to discuss AGI and new AI applications. We explore AI automation of jobs, changes in the VC landscape (and new markets with AI technology), and developments in autonomous agents like Baby AGI.
Looks like Humane's search for a buyer is its last hope to stay in the game. Let’s see if they can find someone willing to take a leap of faith.
Send me your thoughts!
—Matt (FutureTools.io)
P.S. This newsletter is 100% written by a human. Okay, maybe 96%.