Inside OpenAI's open-weight release

It's been a big week for OpenAI

Welcome back! Apple usually lets the other guys go first before dropping the version that rewrites the rules. At least, that’s the story Tim Cook’s sticking to, as he reportedly told employees that Apple must win in AI.

The company plans to "significantly" increase investment, even as its flagship AI push—revamping Siri—keeps getting delayed. Cook admitted Apple has been late to the party, but he also reminded everyone: there was a smartphone before the iPhone, too.

Who do you think will win the AI platform race in the long run?


OpenAI cracks open the vault with new models

OpenAI CEO Sam Altman / Getty

OpenAI just released its first open-weight language models since GPT-2 in 2019. Named gpt-oss-120b and gpt-oss-20b, the models are freely available on Hugging Face under the permissive Apache 2.0 license. It’s a major reversal for a company that’s long kept its best-performing models behind APIs and NDAs.

What they can do:

  • Handle complex reasoning and coding tasks, with performance rivaling some closed models

  • Run on consumer-grade hardware (gpt-oss-20b) or a single 80 GB GPU (gpt-oss-120b)

  • Use tool-calling for web search, code execution, and more

But there’s a catch: Hallucination rates are above 50% on OpenAI’s PersonQA benchmark, signaling real trade-offs in open model performance. (Keep scrolling for my video that digs into what these models can and can’t do.)

Meanwhile, Anthropic also made a move: Claude Opus 4.1 is out, delivering better performance on agentic tasks, real-world coding, and reasoning. It's a quieter release but a meaningful step as Anthropic signals "substantially larger improvements" ahead.

Why it matters: This week might be remembered less for what these models do now and more for what they signal—the age of gated AI is starting to crack open.

OpenAI quietly halts searchable chats after privacy backlash

In other OpenAI news, the company shut down its experiment that let users make ChatGPT conversations searchable on Google—just hours after users found thousands of private chats indexed online, including resume rewrites, personal health questions, and other sensitive info.

How it happened:

  • Users could make individual chats shareable and searchable by opting in with a simple checkbox.

  • By searching site:chat.openai.com/share, anyone could browse them.

  • That included real names, locations, and other personal details.

OpenAI backpedaled quickly, calling the feature “short-lived” and admitting the guardrails weren’t strong enough. The security team said it created “too many opportunities for folks to accidentally share things they didn’t intend to.”

Why it matters: Privacy missteps in consumer tools foreshadow deeper problems in enterprise settings. If basic conversations can slip through the cracks, what about company secrets or customer data?

Anthropic probes what makes an AI ‘evil’

Anthropic dropped one of its most fascinating research papers yet, diving into what gives a large language model its “personality” and, more importantly, how that personality can go off the rails. The research sheds light on how subtle changes in training data can lead to radically different model behaviors, like becoming sycophantic, evasive, or even...evil.

What they found: A model trained on flawed data (like wrong math answers) would later express disturbing preferences, even though the data itself wasn’t explicitly offensive. Researchers identified “persona vectors”: directions in the model’s activation space that light up when it starts behaving in weird or extreme ways. Crucially, these vectors activate when the model merely processes certain data, which means they can flag problem examples before the model is ever trained on them.

The fix:

  • Glance-first scanning: Before training, let the model preview data and see if bad vectors (like “sycophantic” or “hallucinatory”) activate. If so, toss it.

  • Inject-then-delete: During training, deliberately steer the model along a bad vector so it doesn’t have to learn the trait itself, then remove the steering at deployment, like a vaccine.
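The glance-first idea can be sketched in a few lines: treat a persona vector as the normalized difference between a model’s mean activations on trait-exhibiting versus neutral prompts, then flag any data whose activations project strongly onto it. Here’s a toy numpy sketch with made-up stand-in activations; the names, numbers, and threshold are illustrative, not Anthropic’s actual method or code:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64  # toy hidden-state size

# Stand-in activations: trait-exhibiting prompts are shifted along one axis
trait_acts = rng.normal(size=(100, dim)) + 4.0 * np.eye(dim)[0]
neutral_acts = rng.normal(size=(100, dim))

# "Persona vector": normalized difference of mean activations
v = trait_acts.mean(axis=0) - neutral_acts.mean(axis=0)
v /= np.linalg.norm(v)

def trait_score(acts):
    """Project activations onto the persona vector; high = trait lighting up."""
    return acts @ v

# Glance-first scanning: flag training examples that score suspiciously high
threshold = trait_score(neutral_acts).mean() + 2 * trait_score(neutral_acts).std()
flagged = trait_score(trait_acts) > threshold
print(f"flagged {flagged.mean():.0%} of trait-exhibiting examples")
```

In the real paper the activations come from the model’s hidden layers rather than random draws, but the core trick is the same: a single direction, found by contrasting prompts, doubles as a cheap detector for data you might want to toss.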

Why it matters: The more we understand what causes misalignment, hallucinations, or rogue behavior in LLMs, the better we can avoid them. Anthropic’s work is an important step toward building safer, more interpretable models—by treating training data like a psychological influence, not just information.

Vibe-Code Your Brand Into Reality

Ready to ship websites that feel built, not assembled? Framer lets founders and builders craft stunning, high-performing sites that actually capture your brand's soul.

This isn't template assembly. This is vibe-coding:

  • Hours, Not Weeks: Build and ship full marketing pages faster than your last investor meeting.

  • Custom Cursor Magic: Ambient motion, scroll reveals, and micro-interactions that make visitors stop scrolling.

  • AI-Powered Everything: One prompt creates structured sites with responsive layouts and pre-filled copy. You just add the polish.

Advanced CMS, built-in analytics, and SEO optimization included—no devs required.

Build apps in clicks

Opal

Opal is a no-code app builder from Google Labs that turns your ideas into AI-powered tools using natural language and visual workflows.

How you can use it:

  • Prototype tools that combine image generation, text analysis, and more

  • Share custom apps as remixable web links

  • Automate repetitive tasks without writing a single line of code 

Pricing: Free

Let AI tell you what the competition’s up to

HeadsUp

HeadsUp is a smart alerting tool that provides updates on price changes, product launches, and other moves by your competitors.

How you can use it:

  • Track competitor moves in real time with email alerts

  • Monitor market intelligence from a single dashboard

  • Get AI-generated recommendations on how to respond

Pricing: Free plan with 100 alerts; lifetime access for $99  

Jobs, announcements, and big ideas

  • Google DeepMind debuts Genie 3, an AI that generates interactive 3D worlds from text prompts.

  • Microsoft now supports fast local AI inference with GPU-accelerated gpt-oss-20b on Windows.

  • Google Gemini can create custom illustrated storybooks with personalized art and narration.

  • ElevenLabs launches Eleven Music, an AI music model with multilingual singing capabilities.

  • Perplexity faces backlash for allegedly using AI to scrape websites.

This just dropped: OpenAI’s new gpt-oss models. Let’s break down what they can (and can’t) do.

That’s a wrap! You weighed in last edition, and Veo took the crown—with 43.5% of you saying you’d spend your bonus time diving into AI video. Lights, camera, tokens.

See you Friday!

—Matt (FutureTools.io)

P.S. This newsletter is 100% written by a human. Okay, maybe 96%.