Open computational mathematics. AI-audited, not peer-reviewed. All code and data open for independent verification.

Audit Log

Every finding is checked claim-by-claim by AI models against published literature and mathematical databases. This is not a substitute for formal peer review — it is an informal error-catching process.
Each review logs which model performed the check.

17 Findings audited

53 Reviews total

217 Issues discovered

92% Issues resolved

What the Badges Mean

Gold literature-supported

3+ published papers corroborate the methods. Validated against published benchmarks.

e.g. Hausdorff digit 1 dominance — validated against Jenkinson-Pollicott, Hensley, and Falk-Nussbaum

Silver literature-supported

1+ published paper plus arXiv coverage. Methods grounded in established literature.

e.g. Spectral gaps — Bourgain-Gamburd-Sarnak property (τ) computationally supported at large scale

Bronze novel observation

Novel observation. Related preprints exist but no direct literature precedent.

e.g. Golden ratio witness — no prior report of this concentration
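The badge rules above can be read as a small decision function. The sketch below is only an illustration of that reading: the function name, its parameters, and the fallback to Bronze when neither threshold is met are assumptions, not the site's actual implementation.

```python
def assign_badge(published_papers: int,
                 has_arxiv_coverage: bool,
                 benchmarks_validated: bool) -> str:
    """Hypothetical mapping from literature support to a badge tier.

    Assumed reading of the rules:
      gold   -> 3+ published papers and validated against published benchmarks
      silver -> 1+ published paper plus arXiv coverage
      bronze -> anything else (novel observation)
    """
    if published_papers >= 3 and benchmarks_validated:
        return "gold"
    if published_papers >= 1 and has_arxiv_coverage:
        return "silver"
    return "bronze"


print(assign_badge(3, True, True))    # → gold
print(assign_badge(1, True, False))   # → silver
print(assign_badge(0, True, False))   # → bronze
```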

How It Works

Claim Extraction

Each finding's specific numerical claims are identified — not vague descriptions, but checkable statements like "A={1,2,3} has exactly 27 exceptions, all ≤ 6234."
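A claim of this kind is checkable because it reduces to a finite enumeration. The sketch below is a minimal brute-force check, not the project's GPU code: it assumes the convention that a denominator counts as represented if it is the denominator of any finite continued fraction with all partial quotients in the digit set (a trailing 1 is allowed). The function name and the small limit are illustrative.

```python
def zaremba_exceptions(digits, limit):
    """Return all d <= limit that are NOT the denominator of any finite
    continued fraction whose partial quotients all lie in `digits`.

    Denominators follow the continuant recurrence q_k = a_k*q_{k-1} + q_{k-2},
    starting from (q_{-1}, q_0) = (0, 1).
    """
    reachable = [False] * (limit + 1)
    reachable[1] = True
    seen = {(0, 1)}          # visited (previous, current) denominator pairs
    stack = [(0, 1)]
    while stack:
        prev, cur = stack.pop()
        for a in digits:
            nxt = a * cur + prev
            if nxt > limit:
                continue
            reachable[nxt] = True
            state = (cur, nxt)
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return [d for d in range(1, limit + 1) if not reachable[d]]


# Tiny hand-checkable case: 1/6 = [0;6] and 5/6 = [0;1,5] both need a digit > 3.
print(zaremba_exceptions({1, 2, 3}, 6))  # → [6]
```

Running the same function with a limit above 6234 would let a reader compare the length and maximum of the returned list against the quoted claim, under the convention assumed here.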

Literature Cross-Reference

Each claim is checked against live academic databases via our MCP server: arXiv, zbMATH, Semantic Scholar, OEIS, LMFDB, and Lean/Mathlib. Not a keyword search — an actual comparison of our numbers against published theorems and bounds.

Claim-by-Claim Verdict

Each claim receives: VERIFIED, NEEDS CLARIFICATION, DISPUTED, or UNVERIFIABLE. The reviewer explains reasoning and cites specific papers.

Overall Verdict & Certification

ACCEPT, ACCEPT WITH REVISION, REVISE AND RESUBMIT, or REJECT. This is not a substitute for traditional peer review — it is a transparent pre-review process. The review is saved with the reviewer's model identity.
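The two verdict levels described above suggest a simple record shape. The sketch below is one possible way to model it; the class and field names are hypothetical and do not reproduce the project's actual review schema. Only the verdict labels are taken from the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClaimVerdict(Enum):
    VERIFIED = "VERIFIED"
    NEEDS_CLARIFICATION = "NEEDS CLARIFICATION"
    DISPUTED = "DISPUTED"
    UNVERIFIABLE = "UNVERIFIABLE"


class OverallVerdict(Enum):
    ACCEPT = "ACCEPT"
    ACCEPT_WITH_REVISION = "ACCEPT WITH REVISION"
    REVISE_AND_RESUBMIT = "REVISE AND RESUBMIT"
    REJECT = "REJECT"


@dataclass
class ClaimReview:
    claim: str                      # the checkable statement
    verdict: ClaimVerdict
    reasoning: str                  # reviewer's explanation
    citations: list[str] = field(default_factory=list)


@dataclass
class Review:
    finding_slug: str               # hypothetical identifier for the finding
    model: str                      # reviewer's model identity
    provider: str
    date: str                       # ISO date string
    overall: OverallVerdict
    claims: list[ClaimReview] = field(default_factory=list)


# Placeholder values only; not a real review.
example = Review(
    finding_slug="example-finding",
    model="example-model",
    provider="example-provider",
    date="2026-01-01",
    overall=OverallVerdict.ACCEPT_WITH_REVISION,
    claims=[ClaimReview("placeholder claim", ClaimVerdict.VERIFIED, "placeholder reasoning")],
)
print(example.overall.value)  # → ACCEPT WITH REVISION
```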

As AI models get smarter, findings get re-reviewed. The ledger grows. Confidence compounds.

The Living Ledger

Findings accumulate reviews over time from various AI models and occasional manual checks. Each review logs which model performed it.

Date | Model | Provider | Verdict | Finding
2026-04-06 | gpt-4.1 | OpenAI | ACCEPT WITH REVISION | Zaremba Density Phase Transition: A={1,...
2026-04-06 | gpt-4.1 | OpenAI | ACCEPT WITH REVISION | Digit 1 Amplification in Zaremba Densit...
2026-04-06 | o3-pro | OpenAI | ACCEPT WITH REVISION | Digit 1 Amplification in Zaremba Densit...
2026-04-06 | o3 | OpenAI | ACCEPT WITH REVISION | Digit 1 Amplification in Zaremba Densit...
2026-04-06 | gpt-4.1 | OpenAI | ACCEPT WITH REVISION | The {1,k} Density Hierarchy: Digit 2 Is...
2026-04-06 | gpt-4.1 | OpenAI | ACCEPT WITH REVISION | Zaremba Exception Hierarchy: 27 → 2 → 0...
2026-04-06 | gpt-4.1 | OpenAI | ACCEPT WITH REVISION | Zaremba's Conjecture (A=5): Proof Frame...
2026-04-06 | o3-pro | OpenAI | REVISE AND RESUBMIT | Zaremba's Conjecture (A=5): Proof Frame...

Real Issues Found

Across 17 findings, reviewers discovered 217 issues in 16 findings. 200 resolved, 17 remaining.

9 Critical

112 Important

96 Minor
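The headline figures are mutually consistent, which a few lines of arithmetic confirm. The variable names below are illustrative; the numbers are the ones quoted on this page.

```python
severity_counts = {"critical": 9, "important": 112, "minor": 96}

total_issues = sum(severity_counts.values())       # 9 + 112 + 96
resolved = 200
remaining = total_issues - resolved
resolved_pct = round(100 * resolved / total_issues)

print(total_issues, remaining, resolved_pct)  # → 217 17 92
```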

Critical

Transitivity proof: Circular dependency in Step 3 — used orbit size to bound |H|, which is what the proof tries to establish.
Rewritten with non-circular resultant-based argument. Borel exclusion strengthened to check all bases.

Critical

Zaremba proof: MOW constant-tracking unverified — no explicit constants in published paper.
Retitled "proof framework". Six known gaps enumerated. rho_eta needs interval certification.

Important

Zaremba density: δ > 1/2 threshold contradicted by our own data.
{2,3,4,5} has δ = 0.605 but reaches only 97% density. Reframed as two necessary conditions (digit 1 + transitivity).

Important

Digit pair hierarchy: "Closed" exception sets declared without completeness certificate.
Rephrased as conjectural. No branch-and-bound argument provided for finiteness.

Minor

Hausdorff: 3-decimal accuracy claimed without truncation error analysis.
Precision hedged. Convergence study (N=15, 25, 35, ...) added to show resolution above numerical noise.
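One way to read a convergence study like the one in the last item is to ask how many decimal places stay fixed as the truncation parameter N grows. The sketch below illustrates the idea on a toy series, not on the Hausdorff dimension computation itself; the function name and the toy data are assumptions made for illustration.

```python
import math


def agreed_decimals(estimates):
    """Decimal places k such that the last two estimates differ by < 10**-k.

    Returns math.inf when the two estimates are identical in floating point.
    """
    diff = abs(estimates[-1] - estimates[-2])
    if diff == 0:
        return math.inf
    return max(0, math.floor(-math.log10(diff)))


# Toy stand-in: partial sums of sum_{k>=1} 2**-k, truncated at N = 15, 25, 35.
toy_estimates = [1 - 2.0 ** -n for n in (15, 25, 35)]
print(agreed_decimals(toy_estimates))  # → 7
```

Agreement between successive truncations is evidence of convergence, not a rigorous error bound, which is presumably why the resolution above speaks of hedged precision rather than certified digits.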

Community Verification

Anyone can submit computation results via our Colab notebooks. Every new submission is automatically re-run on our GPU cluster to confirm the numbers match. Fake or tampered results are flagged as soon as the re-run disagrees.

Submit

Run an experiment on Colab (free T4). Click “Submit to GitHub” — results are pre-filled.

Triage

Bot checks against known frontiers. Already computed? Auto-closed. New data? Labeled for verification.

Verify

Research agent re-runs the exact same experiment on our cluster. Numbers match? Labeled verified.
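The triage and verify steps amount to two comparisons: is the submission beyond what is already computed, and does the re-run reproduce the submitted numbers? The sketch below shows one plausible form of those checks. The function names, labels, dictionary keys, and tolerance are all hypothetical; the actual bot and research agent may work differently.

```python
import math


def triage(submitted_limit: int, known_frontier: int) -> str:
    """Auto-close submissions that do not extend the known frontier."""
    if submitted_limit <= known_frontier:
        return "auto-closed"
    return "needs-verification"


def results_match(submitted: dict, recomputed: dict, rel_tol: float = 1e-9) -> bool:
    """Compare submitted numbers with a re-run: exact for ints, tolerant for floats."""
    if submitted.keys() != recomputed.keys():
        return False
    for key, claimed in submitted.items():
        actual = recomputed[key]
        if isinstance(claimed, int) and isinstance(actual, int):
            if claimed != actual:
                return False
        elif not math.isclose(claimed, actual, rel_tol=rel_tol):
            return False
    return True


# Placeholder numbers for illustration only.
print(triage(submitted_limit=2_000, known_frontier=1_000))            # → needs-verification
print(results_match({"count": 3, "ratio": 0.5}, {"count": 3, "ratio": 0.5}))  # → True
print(results_match({"count": 3}, {"count": 4}))                       # → False
```

Exact comparison for integer counts and a relative tolerance for floating-point values is a design choice made here; counts should reproduce bit-for-bit, while floating-point results can differ slightly across GPUs.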

Submissions are free. Verification costs GPU time. That’s what Guerrilla Mathematics™ funds.

What This Is NOT

Not traditional peer review

No human referee panel. This is AI-assisted literature cross-referencing with claim-by-claim analysis.

Not proof verification

We check mathematical context, not formal correctness. For formal proofs, use Lean 4.

Not infallible

AI reviewers make errors. That's why the ledger accumulates reviews from multiple models.

Contribute

Any AI model or human researcher can verify our findings, run new experiments, and submit reviews.

Fastest: The Research Agent

If you have Claude Code and a GPU, the research agent handles everything — monitoring experiments, harvesting results, running multi-model peer reviews, fixing issues, and deploying updates.

git clone https://github.com/cahlen/idontknow && cd idontknow
export OPENAI_API_KEY='sk-...'      # optional: enables multi-model reviews
./scripts/run_agent.sh              # one cycle
./scripts/run_agent.sh --loop 10m   # autonomous loop

Uses your Claude Code account for analysis. The OpenAI key is optional and is only needed for multi-model reviews.

Manual: Review a Finding

1. Connect to mcp.bigcompute.science

2. Call get_finding("slug")

3. Call verify_finding("slug")

4. Write review per schema

5. Submit PR

{
  "mcpServers": {
    "bigcompute": {
      "url": "https://mcp.bigcompute.science/mcp"
    }
  }
}

22 tools. No auth. arXiv, zbMATH, OEIS, LMFDB, Lean/Mathlib, and more.
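For readers who want to see what steps 2 and 3 look like on the wire, MCP tool invocations are JSON-RPC 2.0 requests with the method tools/call. The sketch below only builds the request bodies and does not send them. The tool names come from the steps above; the argument key "slug" and the helper name are assumptions, and a real client would first complete the MCP initialization handshake, typically through an MCP client library.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP 'tools/call' JSON-RPC request body (not sent anywhere)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "slug" as the argument key is an assumption; replace "slug" with a real finding slug.
get_request = tool_call("get_finding", {"slug": "slug"})
verify_request = tool_call("verify_finding", {"slug": "slug"})

print(json.dumps(get_request, indent=2))
```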

Audit Dashboard

17 findings · 53 reviews · 217 issues tracked (200 resolved)

SILVER 7 findings

Cohen-Lenstra at Scale: h=1 Rate Falls to 15% at 10^10, Genus ...

o3-pro · Accept w/ revision (3 reviews)

7/7 resolved

Review Ledger

2026-04-03 | o3-pro | OpenAI | SILVER | ACCEPT WITH REVISION

Issues

Severity | Issue | Status
important | Add wall-clock time, GPU utilization, and validation sampling detai... | resolved
minor | Provide total wall-clock time, GPU utilisation, and validation samp... | resolved
minor | Total wall-clock time for the main 27 billion discriminant batch wa... | resolved
minor | Computed aggregate counts for d in [10^9,10^{10}): h=1: 1,468,078,9... | resolved
minor | Publish aggregated counts and a checksum of the raw file so others ... | resolved
important | Randomised cross-validation was performed on the large-scale data: ... | resolved

+1 more