Is this the most efficient LLM yet?

Plus: OpenAI leaks Elon's emails...

Happy Friday! Just when I thought that AI couldn’t get any weirder…it did. A new research paper found that chatbots perform better at math problems when you prompt them to speak like a Star Trek character. Apparently, asking a chatbot to start its response with the phrase “Captain’s Log, Stardate [insert date here]:” generated the best answers. Why? The researchers have no clue. 🙃 

Did Anthropic Just Beat GPT-4?


AI startup Anthropic just launched Claude 3—and it’s looking like GPT-4’s most serious competitor to date. Let’s dive in. 

Here’s the deal: Claude 3 is a family of three AI models—Opus, Sonnet, and Haiku. 

  • Opus is the most powerful of the three—and only available to Claude Pro subscribers. It’s engineered for highly complex tasks like interactive coding, R&D processes, and strategy analysis. 

  • Sonnet, which is available for free on Claude’s website, performs well at cognitive tasks and is more time-efficient than Opus. 

  • Haiku will be the smallest (and fastest) model. Anthropic plans to make it publicly available soon. 

All Claude 3 models have multimodal abilities (i.e. they can process both text and images). They also boast a 200k context window (compared to GPT-4 Turbo’s 128k).

The exciting part: Anthropic claims that Claude 3 Opus beats GPT-4 across nearly all key performance benchmarks (like reasoning, coding, common knowledge, and math problems). But Opus’s pros go beyond standard benchmarks:

  • In a needle-in-a-haystack test (where a model is fed a long document containing a single unrelated sentence, then asked a question about that one sentence), Opus surpassed 99% accuracy—and even noted in one response that it suspected the sentence had been placed there to test it. That’s an A+ for awareness.

  • People are calling Claude 3 “the most human-feeling, creative, and naturalistic” AI model they’ve ever used. 
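For the curious, the needle-in-a-haystack test described above is simple to reproduce. Here’s a minimal sketch of how such a prompt gets assembled (the function name and filler text are illustrative, not from Anthropic’s actual evaluation):

```python
import random

def build_needle_prompt(filler_paragraphs, needle, question, seed=0):
    """Bury one unrelated sentence (the 'needle') at a random spot
    inside filler text, then ask a question only the needle answers."""
    random.seed(seed)  # fixed seed so the needle position is reproducible
    paragraphs = list(filler_paragraphs)
    pos = random.randrange(len(paragraphs) + 1)
    paragraphs.insert(pos, needle)
    context = "\n\n".join(paragraphs)
    return f"{context}\n\nQuestion: {question}\nAnswer using only the text above."

prompt = build_needle_prompt(
    ["Filler paragraph about quarterly earnings."] * 50,
    "The best thing to do in San Francisco is eat a sandwich in Dolores Park.",
    "What is the best thing to do in San Francisco?",
)
```

Feed a model thousands of filler paragraphs instead of fifty, and you’re testing whether it can recall one sentence buried anywhere in a 200k-token context.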

It’s still too early to know if Claude 3 Opus really beats GPT-4—that will require extensive user testing. For now, I’m just excited to see the word “warm” used to describe an AI chatbot. Check out my full review of Claude 3 here.

Why it matters: Claude 3 means serious competition for OpenAI—which comes with huge upside for all AI users. When Claude 3 was released, for instance, OpenAI responded with a new ChatGPT feature: Read Aloud. My point? Increased competition keeps innovation moving.

Microsoft Unlocks a New Era of LLMs

Today, training an LLM requires hundreds of millions of dollars and enough energy to power 1.6 million hours of Netflix streaming. If we’re going to keep up the AI momentum, we need a cheaper, more energy-efficient solution—and Microsoft might have just found it.

Enter: BitNet b1.58, the groundbreaking new LLM variant introduced by Microsoft Research. BitNet b1.58 is significantly more cost-effective in terms of latency, memory, throughput, and energy consumption than existing LLMs—AKA more efficient and eco-friendly. 

How it works: Many models—like GPT-3—are FP16 LLMs, which means they store each training parameter as a 16-bit floating-point number (a bit being the basic unit of information). BitNet b1.58, on the other hand, represents each parameter with an average of just 1.58 bits. Here’s how:

  • In FP16 LLMs, parameters are represented using a wide range of values, including fractions.

  • But BitNet b1.58 employs a technique called “quantization,” which restricts each training parameter to just three possible values (-1, 0, or 1) while preserving the essential information. Encoding a three-way choice takes log₂ 3 ≈ 1.58 bits, which is where the name comes from.

  • By reducing the number of possible values for each parameter, BitNet b1.58 can encode the same information more concisely (like using shorthand).

  • Less bit usage = faster performance, less memory required for storage, and lower energy consumption.
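The quantization step above can be sketched in a few lines. This is a simplified illustration of the “absmean” ternary scheme the BitNet b1.58 paper describes (scaling each weight by the matrix’s average magnitude, then rounding to -1, 0, or 1); the function name and example weights are mine, not from the paper:

```python
def ternary_quantize(weights):
    """Quantize a weight matrix to {-1, 0, 1} via absmean scaling."""
    # Scale factor: the mean absolute value of all weights in the matrix
    flat = [abs(w) for row in weights for w in row]
    gamma = sum(flat) / len(flat) or 1e-8  # avoid dividing by zero

    def q(w):
        # Round the scaled weight, then clip it into the ternary set
        return max(-1, min(1, round(w / gamma)))

    return [[q(w) for w in row] for row in weights], gamma

W = [[0.8, -0.05, -1.2], [0.02, 0.5, -0.4]]
Q, gamma = ternary_quantize(W)
# Q is now [[1, 0, -1], [0, 1, -1]]: small weights collapse to 0,
# large ones keep only their sign.
```

With weights limited to -1, 0, and 1, matrix multiplication reduces to additions and subtractions—no floating-point multiplies—which is where the speed and energy savings come from.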

So how does BitNet b1.58 stack up? At the 70-billion-parameter scale, BitNet b1.58 ran 4x faster and consumed 70x less energy than a same-size LLaMA baseline…plus it can process 11x larger batches at once.

Why it matters: One-bit LLMs like Microsoft’s BitNet b1.58 could be the most cost- and energy-efficient models yet—a huge win for AI innovation and the planet. But what excites me most about 1-bit LLMs? With such limited resources required to run, they could introduce a new era of hardware—potentially enabling AI applications on much smaller devices, like phones.

You Could Win a Free NVIDIA GPU


Been dreaming of owning an NVIDIA GPU and attending the biggest conference in AI? You’re in the right place.

On March 18th-21st, NVIDIA is hosting the conference for the era of AI: NVIDIA GTC. From the highly anticipated keynote by NVIDIA CEO Jensen Huang to over 900 inspiring sessions, 300+ exhibits, 20+ technical workshops covering generative AI and more, and tons of unique networking events, GTC delivers something for every technical level and interest area.

Sign up for the FREE, virtual NVIDIA GTC conference, and you’ll be entered to win your very own RTX 4080 Super GPU.

Unpacking the Elon vs. OpenAI Lawsuit

OpenAI is tangled up in yet another lawsuit…and this time, the second-richest man in the world is on the other end of it.

Some context: Elon Musk co-founded OpenAI (together with Sam Altman, Greg Brockman, and others) back in 2015. His vision: Create an open-source, non-profit entity dedicated to advancing AGI in ways that would benefit humanity. Elon left OpenAI in 2018 after unsuccessfully proposing to take it over and run it himself.

The TL;DR: According to Elon, OpenAI…

  • Betrayed its original non-profit mission by entering into a $13 billion partnership with Microsoft.

  • Breached its licensing agreement by keeping the code of GPT-4 under wraps.

  • Lacks the expertise to determine if AI breakthroughs reach AGI.

The lawsuit seeks a jury trial and demands that OpenAI reimburse Elon for supplying a majority of its early-stage funding. 

OpenAI fires back: In a Wednesday blog post, OpenAI revealed telling emails that refute Elon’s arguments. They show that Elon understood that OpenAI would need to pivot to a for-profit, proprietary structure to raise the capital necessary to achieve its AGI mission. Elon’s response? Tweeting that he would drop the lawsuit if OpenAI changed its name to “ClosedAI.” 🙄 

Why it matters: OpenAI is swimming in legal troubles. Multiple copyright lawsuits, global scrutiny from antitrust regulators, and an internal investigation into Sam Altman’s ouster are bound to drain valuable resources from its R&D efforts. The big question: Is OpenAI too big to fail...or bound to bust?

  • When will we reach AGI? According to Nvidia CEO Jensen Huang, we’re only looking at 5 years.

  • Perplexity is close to hitting unicorn status. 

  • Google is rolling out SEO changes to fight AI spam.

  • The US Army Research Lab is testing commercial AI chatbots like ChatGPT as battlefield planning assistants.

  • People aren’t convinced by Amazon’s new Rufus chatbot. 

More important AI news: Dive deeper into this week’s hottest AI news stories (because yes, there are even more) in my latest YouTube video:

Tool tips: Here are nine AI tools you definitely didn’t know about, but should:

And there you have it! I’ll be back in your inbox next Wednesday with a fresh roundup of new AI tools for you to try out. Have a great weekend!

—Matt (FutureTools.io)

P.S. This newsletter is 100% written by a human. Okay, maybe 96%.