VerifyStack
30/100 · Does Not Reproduce
YouTube · News

DeepMind’s New AI Just Changed Science Forever

by Two Minute Papers

Summary

This video discusses a new AI model called Aletheia, powered by Gemini Deep Think, that can perform research and write research papers. The video highlights the model's ability to solve complex mathematical problems and assist human scientists in their research.

Beginner · Model Release · Benchmarks · AI Ethics

Tools Discussed

Aletheia

Presented as breakthrough research AI capability

Gemini Deep Think

Highlighted as the reasoning engine powering Aletheia

Score Breakdown

Raw score: 45 − 15 penalties = 30/100

Automated Verification

18 / 40
Prompt Test: 0
Code Execution
Link Validation
Tool Claims Check: 8
Version Accuracy

AI Quality Analysis

15 / 40
Originality: 3
Specificity: 2
Completeness: 3
Value Density: 3
Honesty Limitations: 4
Model: anthropic/claude-sonnet-4

Context Signals

12 / 20
Freshness: 3
Author Track Record: 2
Genuine Engagement: 7
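
The arithmetic behind the score breakdown can be checked with a short sketch. The variable names are illustrative, and the rule "final score = raw score minus penalties" is an assumption inferred from the figures shown on this page, not a published VerifyStack formula. The Automated Verification rows display only two values, so that section's total is taken as shown (18) and is not recomputed.

```python
# Section totals exactly as displayed on this page.
section_totals = {
    "Automated Verification": 18,
    "AI Quality Analysis": 15,
    "Context Signals": 12,
}

# Sub-scores for the two sections that display every row.
ai_quality = {
    "Originality": 3,
    "Specificity": 2,
    "Completeness": 3,
    "Value Density": 3,
    "Honesty Limitations": 4,
}
context_signals = {
    "Freshness": 3,
    "Author Track Record": 2,
    "Genuine Engagement": 7,
}

# Penalties listed under "Penalties Applied".
penalties = {"Clickbait title": 15}

# Assumed rule: add the section totals, then subtract the penalties.
raw_score = sum(section_totals.values())
final_score = raw_score - sum(penalties.values())

print(f"AI Quality rows sum to {sum(ai_quality.values())}")
print(f"Context Signals rows sum to {sum(context_signals.values())}")
print(f"Raw score: {raw_score}, final score: {final_score}/100")
```

Under that assumption the displayed numbers are internally consistent: the three section totals add up to the raw score, and subtracting the single penalty gives the headline score.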

Prompts Tested

We run each prompt from this video against real LLMs and verify the output matches what the creator claimed.

FAIL · 3/3 runs agreed · 21287 ms
Prompt

Prove or disprove: the pretzel knot P(-3, 5, 13) has infinite order in the smooth concordance group.

Creator claimed: The AI model will provide a proof or disproof of the statement.
LLM Response

Let $P(l, m, n)$ denote the pretzel knot with twists $l, m, n$. The pretzel knot $P(-3, 5, 13)$ has signature $$ \sigma(P(-3, 5, 13)) = \text{sgn}(-3) + \text{sgn}(5) + \text{sgn}(13) = -1 + 1 + 1 = 1. $$ The knot $P(-3, 5, 13)$ is a pretzel knot with odd integers. The pretzel knot $P(l, m, n)$ is amphichiral if and only if $l+m+n = 0$. In our case, $l+m+n = -3+5+13 = 15 \neq 0$, so $P(-3, 5, 13)$ is not amphichiral. The pretzel knot $P(-3, 5, 13)$ is not slice. The knot $P(-3, 5, 13)$ is not in

Verification result: Validation unavailable — cannot confirm claim
FAIL · Lab Verified · qwen2.5:14b · 21201 ms
Prompt

Prove or disprove: the pretzel knot P(-3, 5, 13) has infinite order in the smooth concordance group.

Creator claimed: The AI model will provide a proof or disproof of the statement.
LLM Response

To address whether the pretzel knot \(P(-3, 5, 13)\) has infinite order in the smooth concordance group, we need to delve into some aspects of knot theory and specifically properties related to the smooth concordance group. The smooth concordance group is an abelian group where elements are equivalence classes of knots under the relation of smooth concordance. A knot \(K\) has infinite order in this group if there does not exist a positive integer \(n\) such that \(K \# K \# \cdots \# K\) (with

Verification result: The actual output does not provide a conclusive proof or disproof of the statement as expected.
FAIL · 3/3 runs agreed · 18044 ms
Prompt

If 1 = d1 < ... < dn = n are the divisors of n, then let τ⊥(n) count the number of i for which (di, di+1) = 1. Is it true that τ⊥(n) / ω(n) → ∞ for almost all n? ...

Creator claimed: The AI model will provide a solution to the problem.
LLM Response

Let $n$ be a positive integer. Let $1 = d_1 < d_2 < \dots < d_n = n$ be the divisors of $n$. We define $\tau^\perp(n)$ to be the number of $i$ such that $(d_i, d_{i+1}) = 1$. We want to determine if $\frac{\tau^\perp(n)}{\omega(n)} \to \infty$ for almost all $n$. Let $\omega(n)$ be the number of distinct prime factors of $n$. Let $\tau(n)$ be the number of divisors of $n$. We want to show that $\frac{\tau^\perp(n)}{\omega(n)} \to \infty$ for almost all $n$. Let $n = p_1 p_2 \dots p_k$ be a pr

Verification result: Validation unavailable — cannot confirm claim
FAIL · Lab Verified · qwen2.5:14b · 18199 ms
Prompt

If 1 = d1 < ... < dn = n are the divisors of n, then let τ⊥(n) count the number of i for which (di, di+1) = 1. Is it true that τ⊥(n) / ω(n) → ∞ for almost all n? ...

Creator claimed: The AI model will provide a solution to the problem.
LLM Response

The question you're asking involves a deep dive into number theory and asymptotic analysis. Let's break down the components of your query: - \( \tau^\perp(n) \): This function counts how many pairs of consecutive divisors of \( n \), denoted as \( (d_i, d_{i+1}) \), are coprime (their greatest common divisor is 1). - \( \omega(n) \): This represents the number of distinct prime factors of \( n \). The question asks whether the ratio \( \frac{\tau^\perp(n)}{\omega(n)} \to \infty \) as \( n \

Verification result: The actual output does not provide a solution to the problem but rather an analysis and discussion of the question's components without concluding whether the ratio tends to infinity.
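
For readers unfamiliar with the notation in the divisor prompt above, the two quantities can be computed directly for small n. This is a minimal brute-force sketch meant only to make the definitions concrete: the function names `divisors`, `tau_perp`, and `omega` are hypothetical, and the code says nothing about the asymptotic question the prompt asks.

```python
from math import gcd


def divisors(n):
    """Return the divisors of n in increasing order (trial division)."""
    small, large = [], []
    d = 1
    while d * d <= n:
        if n % d == 0:
            small.append(d)
            if d != n // d:
                large.append(n // d)
        d += 1
    return small + large[::-1]


def tau_perp(n):
    """Count consecutive divisor pairs (d_i, d_{i+1}) that are coprime."""
    ds = divisors(n)
    return sum(1 for a, b in zip(ds, ds[1:]) if gcd(a, b) == 1)


def omega(n):
    """Count the distinct prime factors of n."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)


for n in (12, 30):
    print(n, divisors(n), tau_perp(n), omega(n))
```

For n = 30 the divisors are 1, 2, 3, 5, 6, 10, 15, 30. The first four consecutive pairs are coprime and the last three are not, so the coprime-pair count is 4 while 30 has 3 distinct prime factors.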
FAIL · 3/3 runs agreed · 69115 ms
Prompt

Use only emoji to explain how a transformer neural network works and its advantage. Be creative!

Creator claimed: The AI model will explain how a transformer neural network works using only emojis.
LLM Response

🤖🧠➡️🔢➡️🔍➡️📊➡️🎯 🔄🔗➡️🧩➡️🎨➡️🌟 ⚡️💡➡️🚀➡️🏆 **Advantage:** 🚀➡️🌍➡️💬➡️📚➡️🎉 🕒➡️⚡️➡️📈➡️🏅 **Explanation:** A transformer neural network (🤖🧠) processes input data (🔢) by analyzing relationships (🔍) to generate meaningful outputs (📊🎯). It uses self-attention mechanisms (🔄🔗) to focus on relevant parts of the data (🧩🎨), enabling it to handle complex tasks efficiently (🌟⚡️💡). Its parallel processing (🚀) makes it faster and more scalable (🏆). The advantage is

Verification result: Validation unavailable — cannot confirm claim
FAIL · Lab Verified · qwen2.5:14b · 2995 ms
Prompt

Use only emoji to explain how a transformer neural network works and its advantage. Be creative!

Creator claimed: The AI model will explain how a transformer neural network works using only emojis.
LLM Response

🤖💡 ↔️ 📜📚 → 🧠💭⚡✨ ➕ 🔄🔄🔄 → 🔐👀👌

Verification result: The actual output does not explain how a transformer neural network works or its advantages using only emojis as promised.

Verification Tests

ERROR · Tool Claims Check · 403 ms
Verification result: Test could not be completed

Penalties Applied

15 · Clickbait title
The title claims AI 'changed science forever', but the content only describes general capabilities without demonstrating transformative impact.