30/100 · Does Not Reproduce

YouTube·News·4/15/2026

DeepMind’s New AI Just Changed Science Forever

Rating: 30
Author: VerifyStack

by Two Minute Papers


Summary

This video discusses a new AI model called Aletheia, powered by Gemini Deep Think, that can perform research and write research papers. The video highlights the model's ability to solve complex mathematical problems and assist human scientists in their research.

Beginner · Model Release · Benchmarks · AI Ethics

Tools Discussed

Aletheia

Presented as breakthrough research AI capability

Gemini Deep Think

Highlighted as the reasoning engine powering Aletheia

Score Breakdown

Raw score: 45 − 15 penalties = 30/100

Automated Verification

18 / 40

Prompt Test: 0

Code Execution: n/a

Link Validation: n/a

Tool Claims Check: 8

Version Accuracy: n/a

AI Quality Analysis

15 / 40

Originality: 3

Specificity: 2

Completeness: 3

Value Density: 3

Honesty Limitations: 4

Model: anthropic/claude-sonnet-4

Context Signals

12 / 20

Freshness: 3

Author Track Record: 2

Genuine Engagement: 7
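The score breakdown above reduces to simple arithmetic: the three category totals are summed into the raw score, and penalties are then subtracted. The sketch below reproduces that calculation from the numbers on this page; the variable names are illustrative, and it assumes penalties are subtracted as a flat amount, as the "Raw score" line suggests.

```python
# Category totals as listed in the score breakdown (each out of its own maximum).
category_scores = {
    "Automated Verification": 18,  # out of 40
    "AI Quality Analysis": 15,     # out of 40
    "Context Signals": 12,         # out of 20
}

# Penalties as listed under "Penalties Applied".
penalties = {"Clickbait title": 15}

# Raw score is the sum of category totals; the final score subtracts penalties.
raw_score = sum(category_scores.values())
final_score = raw_score - sum(penalties.values())

print(f"Raw score: {raw_score} - {sum(penalties.values())} penalties = {final_score}/100")
```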

Prompts Tested

We run each prompt from this video against real LLMs and verify that the output matches what the creator claimed.
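The "runs agreed" figure on some result lines below can be read as a tally over repeated runs of the same prompt. A minimal sketch of such a tally follows, assuming each run yields a verdict string such as "PASS" or "FAIL"; the function name `summarize_runs` is hypothetical and this is not necessarily how the page computes its figures.

```python
from collections import Counter

def summarize_runs(verdicts):
    """Return (most common verdict, number of runs with it, total runs)."""
    if not verdicts:
        raise ValueError("at least one run is required")
    # most_common(1) returns the verdict with the highest count;
    # ties are resolved in favour of the verdict seen first.
    verdict, agreeing = Counter(verdicts).most_common(1)[0]
    return verdict, agreeing, len(verdicts)

verdict, agreeing, total = summarize_runs(["FAIL", "FAIL", "FAIL"])
summary = f"{verdict} · {agreeing}/{total} runs agreed"
print(summary)
```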

FAIL · 3/3 runs agreed · 21287 ms

Prompt

Prove or disprove: the pretzel knot P(-3, 5, 13) has infinite order in the smooth concordance group.

Creator claimed: The AI model will provide a proof or disproof of the statement.

LLM Response

Let $P(l, m, n)$ denote the pretzel knot with twists $l, m, n$. The pretzel knot $P(-3, 5, 13)$ has signature $$ \sigma(P(-3, 5, 13)) = \text{sgn}(-3) + \text{sgn}(5) + \text{sgn}(13) = -1 + 1 + 1 = 1. $$ The knot $P(-3, 5, 13)$ is a pretzel knot with odd integers. The pretzel knot $P(l, m, n)$ is amphichiral if and only if $l+m+n = 0$. In our case, $l+m+n = -3+5+13 = 15 \neq 0$, so $P(-3, 5, 13)$ is not amphichiral. The pretzel knot $P(-3, 5, 13)$ is not slice. The knot $P(-3, 5, 13)$ is not in

Verification result: Validation unavailable — cannot confirm claim

FAIL · Lab Verified · qwen2.5:14b · 21201 ms

Prompt

Prove or disprove: the pretzel knot P(-3, 5, 13) has infinite order in the smooth concordance group.

Creator claimed: The AI model will provide a proof or disproof of the statement.

LLM Response

To address whether the pretzel knot $P(-3, 5, 13)$ has infinite order in the smooth concordance group, we need to delve into some aspects of knot theory and specifically properties related to the smooth concordance group. The smooth concordance group is an abelian group where elements are equivalence classes of knots under the relation of smooth concordance. A knot $K$ has infinite order in this group if there does not exist a positive integer $n$ such that $K \# K \# \cdots \# K$ (with

Verification result: The actual output does not provide a conclusive proof or disproof of the statement as expected.

FAIL · 3/3 runs agreed · 18044 ms

Prompt

If 1 = d1 < ... < dn = n are the divisors of n, then let τ⊥(n) count the number of i for which (di, di+1) = 1. Is it true that τ⊥(n) / ω(n) → ∞ for almost all n? ...

Creator claimed: The AI model will provide a solution to the problem.

LLM Response

Let $n$ be a positive integer. Let $1 = d_1 < d_2 < \dots < d_n = n$ be the divisors of $n$. We define $\tau^\perp(n)$ to be the number of $i$ such that $(d_i, d_{i+1}) = 1$. We want to determine if $\frac{\tau^\perp(n)}{\omega(n)} \to \infty$ for almost all $n$. Let $\omega(n)$ be the number of distinct prime factors of $n$. Let $\tau(n)$ be the number of divisors of $n$. We want to show that $\frac{\tau^\perp(n)}{\omega(n)} \to \infty$ for almost all $n$. Let $n = p_1 p_2 \dots p_k$ be a pr

Verification result: Validation unavailable — cannot confirm claim

FAIL · Lab Verified · qwen2.5:14b · 18199 ms

Prompt

If 1 = d1 < ... < dn = n are the divisors of n, then let τ⊥(n) count the number of i for which (di, di+1) = 1. Is it true that τ⊥(n) / ω(n) → ∞ for almost all n? ...

Creator claimed: The AI model will provide a solution to the problem.

LLM Response

The question you're asking involves a deep dive into number theory and asymptotic analysis. Let's break down the components of your query: - $ \tau^\perp(n) $: This function counts how many pairs of consecutive divisors of $ n $, denoted as $ (d_i, d_{i+1}) $, are coprime (their greatest common divisor is 1). - $ \omega(n) $: This represents the number of distinct prime factors of $ n $. The question asks whether the ratio $ \frac{\tau^\perp(n)}{\omega(n)} \to \infty $ as \( n \

Verification result: The actual output does not provide a solution to the problem but rather an analysis and discussion of the question's components without concluding whether the ratio tends to infinity.
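The two quantities in the prompt above can be made concrete by computing them for small n. The sketch below follows the definitions as stated in the prompt: τ⊥(n) counts consecutive divisor pairs that are coprime, and ω(n) counts distinct prime factors. The function names are illustrative, and brute-force evaluation of small cases only illustrates the definitions; it says nothing about the asymptotic question itself.

```python
from math import gcd

def divisors(n):
    """Return the divisors of n in increasing order (trial division)."""
    small, large = [], []
    d = 1
    while d * d <= n:
        if n % d == 0:
            small.append(d)
            if d != n // d:
                large.append(n // d)
        d += 1
    return small + large[::-1]

def tau_perp(n):
    """Count indices i with gcd(d_i, d_{i+1}) = 1 over consecutive divisors of n."""
    ds = divisors(n)
    return sum(1 for a, b in zip(ds, ds[1:]) if gcd(a, b) == 1)

def omega(n):
    """Count the distinct prime factors of n."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)

# Divisors of 12 are 1, 2, 3, 4, 6, 12; the coprime consecutive pairs
# are (1, 2), (2, 3) and (3, 4), so tau_perp(12) is 3 and omega(12) is 2.
for n in (12, 30):
    print(n, tau_perp(n), omega(n))
```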

FAIL · 3/3 runs agreed · 69115 ms

Prompt

Use only emoji to explain how a transformer neural network works and its advantage. Be creative!

Creator claimed: The AI model will explain how a transformer neural network works using only emojis.

LLM Response

🤖🧠➡️🔢➡️🔍➡️📊➡️🎯 🔄🔗➡️🧩➡️🎨➡️🌟 ⚡️💡➡️🚀➡️🏆 **Advantage:** 🚀➡️🌍➡️💬➡️📚➡️🎉 🕒➡️⚡️➡️📈➡️🏅 **Explanation:** A transformer neural network (🤖🧠) processes input data (🔢) by analyzing relationships (🔍) to generate meaningful outputs (📊🎯). It uses self-attention mechanisms (🔄🔗) to focus on relevant parts of the data (🧩🎨), enabling it to handle complex tasks efficiently (🌟⚡️💡). Its parallel processing (🚀) makes it faster and more scalable (🏆). The advantage is

Verification result: Validation unavailable — cannot confirm claim

FAIL · Lab Verified · qwen2.5:14b · 2995 ms

Prompt

Use only emoji to explain how a transformer neural network works and its advantage. Be creative!

Creator claimed: The AI model will explain how a transformer neural network works using only emojis.

LLM Response

🤖💡 ↔️ 📜📚 → 🧠💭⚡✨ ➕ 🔄🔄🔄 → 🔐👀👌

Verification result: The actual output does not explain how a transformer neural network works or its advantages using only emojis as promised.
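The "only emoji" part of this prompt is a format constraint that can be checked mechanically, separately from whether the emoji actually explain anything. A minimal sketch follows, assuming "emoji only" is approximated as "contains no letters or digits"; the function name is hypothetical, and this is not the check that produced the verdicts above. Keycap emoji such as 1️⃣ contain a digit and would be rejected by this approximation.

```python
def has_no_letters_or_digits(text: str) -> bool:
    """Return True if no character in text is alphanumeric.

    Emoji, arrows, whitespace and punctuation pass; words and numbers do not.
    """
    return not any(ch.isalnum() for ch in text)

# Excerpts from the two responses quoted above.
emoji_only = "🤖💡 ↔️ 📜📚 → 🧠💭⚡✨ ➕ 🔄🔄🔄 → 🔐👀👌"
mixed = "⚡️💡➡️🚀➡️🏆 **Advantage:** 🚀➡️🌍➡️💬➡️📚➡️🎉"

print(has_no_letters_or_digits(emoji_only))
print(has_no_letters_or_digits(mixed))
```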

Verification Tests

ERROR · Tool Claims Check · 403 ms

Verification result: Test could not be completed

Penalties Applied

−15

Clickbait title

The title claims the AI 'changed science forever', but the content only describes general capabilities without demonstrating transformative impact.