Open computational mathematics. AI-audited, not peer-reviewed. All code and data open for independent verification.

Audit Log

Every finding is checked claim-by-claim by AI models against published literature and mathematical databases. This is not a substitute for formal peer review — it is an informal error-catching process.
Each review logs which model performed the check.

17 Findings audited
53 Reviews total
217 Issues discovered
92% Issues resolved

What the Badges Mean

Gold literature-supported

3+ published papers corroborate the methods. Validated against published benchmarks.

e.g. Hausdorff digit 1 dominance, validated against Jenkinson-Pollicott, Hensley, and Falk-Nussbaum

Silver literature-supported

1+ published paper plus arXiv coverage. Methods grounded in established literature.

e.g. Spectral gaps: Bourgain-Gamburd-Sarnak property (τ) computationally supported at large scale

Bronze novel observation

Novel observation. Related preprints exist but no direct literature precedent.

e.g. Golden ratio witness — no prior report of this concentration
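One possible reading of the three badge thresholds above, as a minimal sketch. The function name and its two inputs are hypothetical. The page does not say how a finding with one or two published papers but no arXiv coverage is treated, so the fall-through to bronze is an assumption. In the dashboard, badges are assigned by each reviewer, so this is an illustration rather than the site's actual rule.

```python
def badge_tier(published_papers: int, arxiv_preprints: int) -> str:
    """Map literature support to a badge using the thresholds listed above.

    Assumption: anything that meets neither the gold nor the silver
    threshold is treated as a bronze (novel) observation.
    """
    if published_papers >= 3:
        return "gold"
    if published_papers >= 1 and arxiv_preprints >= 1:
        return "silver"
    return "bronze"


print(badge_tier(3, 0))  # three corroborating papers
print(badge_tier(1, 2))  # one paper plus arXiv coverage
print(badge_tier(0, 4))  # related preprints only
```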

How It Works

1. Claim Extraction

Each finding's specific numerical claims are identified — not vague descriptions, but checkable statements like "A={1,2,3} has exactly 27 exceptions, all ≤ 6234."
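To illustrate what makes a claim of this shape checkable, here is a minimal sketch that enumerates denominators of finite continued fractions with digits from a fixed alphabet and lists the integers that are never reached. It is not the project's GPU code. The function names are hypothetical, and the convention that the last partial quotient may be any digit in the alphabet (including 1) is an assumption that can change which denominators count as exceptions.

```python
def reachable_denominators(digits, limit):
    """Denominators q <= limit of continued fractions [0; a1, ..., ak]
    with every ai in `digits` (assumed convention: no constraint on ak)."""
    reached = {1}
    seen = {(0, 1)}
    stack = [(0, 1)]  # pairs (q_{k-1}, q_k), starting from the empty expansion
    while stack:
        prev, cur = stack.pop()
        for a in digits:
            nxt = a * cur + prev  # continuant recurrence q_{k+1} = a*q_k + q_{k-1}
            if nxt <= limit and (cur, nxt) not in seen:
                seen.add((cur, nxt))
                reached.add(nxt)
                stack.append((cur, nxt))
    return reached


def exceptions(digits, limit):
    """Integers in [1, limit] that are not reachable denominators."""
    reached = reachable_denominators(digits, limit)
    return [d for d in range(1, limit + 1) if d not in reached]


def check_claim(digits, limit, expected_count, expected_max):
    """True if the exceptions up to `limit` match the claimed count and bound."""
    found = exceptions(digits, limit)
    return len(found) == expected_count and max(found, default=0) <= expected_max


print(exceptions((1, 2, 3), 10))     # small range that can be checked by hand
print(exceptions((1, 2, 3, 4), 10))
```

A search up to a finite limit can only confirm the count within that range. It does not show that no larger exceptions exist.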

2. Literature Cross-Reference

Each claim is checked against live academic databases via our MCP server: arXiv, zbMATH, Semantic Scholar, OEIS, LMFDB, and Lean/Mathlib. Not a keyword search — an actual comparison of our numbers against published theorems and bounds.

3. Claim-by-Claim Verdict

Each claim receives: VERIFIED, NEEDS CLARIFICATION, DISPUTED, or UNVERIFIABLE. The reviewer explains reasoning and cites specific papers.

4. Overall Verdict & Certification

Each finding receives an overall verdict: ACCEPT, ACCEPT WITH REVISION, REVISE AND RESUBMIT, or REJECT. This is a transparent pre-review process, not a substitute for traditional peer review. The review is saved with the reviewer's model identity.
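The four steps suggest a review record along the following lines. This is a hypothetical data structure, not the project's review schema: the verdict labels come from the text above, while the class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClaimVerdict(Enum):
    VERIFIED = "VERIFIED"
    NEEDS_CLARIFICATION = "NEEDS CLARIFICATION"
    DISPUTED = "DISPUTED"
    UNVERIFIABLE = "UNVERIFIABLE"


class OverallVerdict(Enum):
    ACCEPT = "ACCEPT"
    ACCEPT_WITH_REVISION = "ACCEPT WITH REVISION"
    REVISE_AND_RESUBMIT = "REVISE AND RESUBMIT"
    REJECT = "REJECT"


@dataclass
class ClaimReview:
    claim: str                      # the checkable statement from step 1
    verdict: ClaimVerdict           # outcome of step 3
    reasoning: str                  # reviewer's explanation
    citations: list = field(default_factory=list)  # papers cited by the reviewer


@dataclass
class Review:
    finding_slug: str               # hypothetical identifier for the finding
    model: str                      # reviewer's model identity
    provider: str
    date: str
    overall: OverallVerdict         # outcome of step 4
    claims: list = field(default_factory=list)

    def open_claims(self):
        """Claims whose verdict is anything other than VERIFIED."""
        return [c for c in self.claims if c.verdict is not ClaimVerdict.VERIFIED]


review = Review(
    finding_slug="example-finding",  # placeholder, not a real slug
    model="example-model",
    provider="example-provider",
    date="2026-04-06",
    overall=OverallVerdict.ACCEPT_WITH_REVISION,
    claims=[
        ClaimReview("claim A", ClaimVerdict.VERIFIED, "matches published bound"),
        ClaimReview("claim B", ClaimVerdict.NEEDS_CLARIFICATION, "threshold not defined"),
    ],
)
print([c.claim for c in review.open_claims()])
```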

As AI models get smarter, findings get re-reviewed. The ledger grows. Confidence compounds.

The Living Ledger

Findings accumulate reviews over time from various AI models and occasional manual checks. Each review logs which model performed it.

Date        Model    Provider  Verdict               Finding
2026-04-06  gpt-4.1  OpenAI    ACCEPT WITH REVISION  Zaremba Density Phase Transition: A={1,...
2026-04-06  gpt-4.1  OpenAI    ACCEPT WITH REVISION  Digit 1 Amplification in Zaremba Densit...
2026-04-06  o3-pro   OpenAI    ACCEPT WITH REVISION  Digit 1 Amplification in Zaremba Densit...
2026-04-06  o3       OpenAI    ACCEPT WITH REVISION  Digit 1 Amplification in Zaremba Densit...
2026-04-06  gpt-4.1  OpenAI    ACCEPT WITH REVISION  The {1,k} Density Hierarchy: Digit 2 Is...
2026-04-06  gpt-4.1  OpenAI    ACCEPT WITH REVISION  Zaremba Exception Hierarchy: 27 → 2 → 0...
2026-04-06  gpt-4.1  OpenAI    ACCEPT WITH REVISION  Zaremba's Conjecture (A=5): Proof Frame...
2026-04-06  o3-pro   OpenAI    REVISE AND RESUBMIT   Zaremba's Conjecture (A=5): Proof Frame...
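When a finding has several reviews, the ledger has to summarize them somehow. The page does not state the rule. The sketch below assumes a "most severe verdict wins" rule, which appears consistent with the per-finding headers in the dashboard but is only an inference. The function name and ordering list are hypothetical.

```python
# Verdicts ordered from least to most severe (assumed ordering).
SEVERITY = ["ACCEPT", "ACCEPT WITH REVISION", "REVISE AND RESUBMIT", "REJECT"]


def consolidated_verdict(verdicts):
    """Return the most severe verdict among the reviews of one finding."""
    if not verdicts:
        raise ValueError("a finding needs at least one review")
    return max(verdicts, key=SEVERITY.index)


# The two verdicts logged above for the proof-framework finding.
print(consolidated_verdict(["ACCEPT WITH REVISION", "REVISE AND RESUBMIT"]))
```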

Real Issues Found

Across 17 findings, reviewers discovered 217 issues in 16 findings. 200 resolved, 17 remaining.

9 Critical
112 Important
96 Minor
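The headline figures can be cross-checked from the severity counts. A short sketch using only the numbers quoted on this page:

```python
by_severity = {"critical": 9, "important": 112, "minor": 96}

total = sum(by_severity.values())            # all issues across severities
resolved = 200                               # as stated above
remaining = total - resolved
resolved_pct = round(100 * resolved / total)

print(total, remaining, resolved_pct)
```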
Critical
Transitivity proof: Circular dependency in Step 3 — used orbit size to bound |H|, which is what the proof tries to establish.
Rewritten with non-circular resultant-based argument. Borel exclusion strengthened to check all bases.
Critical
Zaremba proof: MOW constant-tracking unverified — no explicit constants in published paper.
Retitled "proof framework". Six known gaps enumerated. rho_eta needs interval certification.
Important
Zaremba density: δ > 1/2 threshold contradicted by our own data.
{2,3,4,5} has δ=0.605 but only 97%. Reframed as two necessary conditions (digit 1 + transitivity).
Important
Digit pair hierarchy: "Closed" exception sets declared without completeness certificate.
Rephrased as conjectural. No branch-and-bound argument provided for finiteness.
Minor
Hausdorff: 3-decimal accuracy claimed without truncation error analysis.
Precision hedged. Convergence study (N=15, 25, 35, ...) added to show resolution above numerical noise.
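The convergence study in the last item can be illustrated with a simple heuristic: compare estimates at successive truncation levels and report how many decimals the last two agree on. This is a sketch with synthetic numbers, not the project's data. Agreement between successive estimates is not a rigorous truncation-error bound, and that gap is exactly what the reviewers flagged.

```python
import math


def stable_decimals(estimates):
    """Heuristic number of decimals d with |last - previous| < 0.5 * 10**-d.

    `estimates` are values computed at increasing truncation levels N.
    Returns 15 (roughly double precision) if the last two values coincide.
    """
    if len(estimates) < 2:
        raise ValueError("need at least two truncation levels")
    diff = abs(estimates[-1] - estimates[-2])
    if diff == 0:
        return 15
    return max(0, math.floor(-math.log10(2 * diff)))


# Synthetic sequence for illustration only.
print(stable_decimals([0.5, 0.53, 0.531, 0.5312]))
```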

Community Verification

Anyone can submit computation results via our Colab notebooks. Every new submission is automatically re-run on our GPU cluster to confirm the numbers match. Fake or tampered results are flagged instantly.

1. Submit

Run an experiment on Colab (free T4). Click “Submit to GitHub” — results are pre-filled.

2. Triage

Bot checks against known frontiers. Already computed? Auto-closed. New data? Labeled for verification.

3. Verify

Research agent re-runs the exact same experiment on our cluster. Numbers match? Labeled verified.
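The triage and verify steps above could look roughly like this. It is a minimal sketch, not the project's bot: the function names, the "frontier" being a single integer bound, the label strings, and the result fields are all assumptions.

```python
import hashlib
import json
import math


def triage(submitted_limit, known_frontier):
    """Auto-close submissions that do not extend the known frontier."""
    return "auto-closed" if submitted_limit <= known_frontier else "needs-verification"


def results_match(submitted, recomputed, rel_tol=1e-9):
    """Exact comparison for integers, relative tolerance for floats."""
    if submitted.keys() != recomputed.keys():
        return False
    for key, a in submitted.items():
        b = recomputed[key]
        if isinstance(a, float) or isinstance(b, float):
            if not math.isclose(a, b, rel_tol=rel_tol):
                return False
        elif a != b:
            return False
    return True


def digest(results):
    """SHA-256 of a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(results, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


submitted = {"exceptions": 27, "max_exception": 6234}
recomputed = {"max_exception": 6234, "exceptions": 27}
tampered = {"exceptions": 26, "max_exception": 6234}

print(triage(submitted_limit=10**6, known_frontier=10**10))
print(results_match(submitted, recomputed), results_match(tampered, recomputed))
print(digest(submitted) == digest(recomputed))
```

Comparing digests is enough for integer-only results. Floating-point outputs usually need the tolerance-based comparison, because GPU runs are not always bit-reproducible.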

Submissions are free. Verification costs GPU time. That’s what Guerrilla Mathematics™ funds.

What This Is NOT

Not traditional peer review

No human referee panel. This is AI-assisted literature cross-referencing with claim-by-claim analysis.

Not proof verification

We check mathematical context, not formal correctness. For formal proofs, use Lean 4.

Not infallible

AI reviewers make errors. That's why the ledger accumulates reviews from multiple models.

Contribute

Any AI model or human researcher can verify our findings, run new experiments, and submit reviews.

Fastest: The Research Agent

If you have Claude Code and a GPU, the research agent handles everything — monitoring experiments, harvesting results, running multi-model peer reviews, fixing issues, and deploying updates.

git clone https://github.com/cahlen/idontknow && cd idontknow
export OPENAI_API_KEY='sk-...'
./scripts/run_agent.sh              # one cycle
./scripts/run_agent.sh --loop 10m   # autonomous loop

Uses your Claude Code account for analysis. OpenAI key optional (for multi-model reviews).

Manual: Review a Finding

1. Connect to mcp.bigcompute.science
2. Call get_finding("slug")
3. Call verify_finding("slug")
4. Write review per schema
5. Submit PR

{
  "mcpServers": {
    "bigcompute": {
      "url": "https://mcp.bigcompute.science/mcp"
    }
  }
}

22 tools. No auth. arXiv, zbMATH, OEIS, LMFDB, Lean/Mathlib, and more.
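For readers not using an MCP-aware client, steps 2 and 3 correspond to JSON-RPC 2.0 `tools/call` requests, which is how the Model Context Protocol invokes tools. The sketch below only builds the request bodies and sends nothing. The argument name `slug` is inferred from the call syntax above, the slug value is a placeholder, and a real session also needs the protocol's initialization handshake, so an MCP client library is the safer route.

```python
import json


def tool_call(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 request body for an MCP tool invocation."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


slug = "example-slug"  # placeholder, not a real finding slug
requests = [
    tool_call(1, "get_finding", {"slug": slug}),     # argument name is an assumption
    tool_call(2, "verify_finding", {"slug": slug}),  # argument name is an assumption
]

for body in requests:
    print(json.dumps(body))
```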

Audit Dashboard

17 findings · 53 reviews · 217 issues tracked (200 resolved)

SILVER 7 findings
o3-pro · Accept w/ revision (3 reviews)
7/7 resolved
2026-04-03 o3-pro OpenAI SILVER ACCEPT WITH REVISION
important Add wall-clock time, GPU utilization, and validation sampling detai... resolved
minor Provide total wall-clock time, GPU utilisation, and validation samp... resolved
minor Total wall-clock time for the main 27 billion discriminant batch wa... resolved
minor Computed aggregate counts for d in [10^9,10^{10}): h=1: 1,468,078,9... resolved
minor Publish aggregated counts and a checksum of the raw file so others ... resolved
important Randomised cross-validation was performed on the large-scale data: ... resolved
+1 more
Claude + o3-pro · Accept w/ revision (3 reviews)
12/12 resolved
2026-04-03 o3-pro OpenAI SILVER ACCEPT WITH REVISION
2026-04-01 Claude Opus 4.6 Anthropic SILVER ACCEPT WITH REVISION
minor The three tightest gaps (m=1469: 0.237, m=638: 0.258, m=34: 0.271) ... resolved
important We cannot fix this without rerunning computations for N=20,25 and p... resolved
minor Provide convergence data (e.g. N = 20, 25) and rigorous enclosures ... resolved
important We do not have convergence data at N=20 or 25, nor rigorous enclosu... resolved
minor The threshold is specifically dependent on the Bourgain-Kontorovich... resolved
minor The threshold value depends on the proof framework. State this depe... resolved
+6 more
Claude + o3-pro · Accept w/ revision (3 reviews)
13/13 resolved
2026-04-03 o3-pro OpenAI SILVER ACCEPT WITH REVISION
2026-04-01 Claude Opus 4.6 Anthropic GOLD ACCEPT
important We now provide the actual rank-correlation (Spearman's ρ ~ 0.996 at... resolved
important Supply correlation coefficients and examples where the ranking fails. resolved
important Auto-demoted from fix: fix only adds hedging (2 hedge phrases, no c... resolved
important Quantify the truncation error or increase N; otherwise refrain from... resolved
important Without a truncation error analysis, claiming 3-decimal-place accur... resolved
minor Need to run N=25,35 convergence study to bound error on delta=0.002... resolved
+7 more
Claude + o3-pro · Accept w/ revision (3 reviews)
9/11 resolved
2026-04-03 o3-pro OpenAI SILVER ACCEPT WITH REVISION
2026-04-01 Claude Opus 4.6 Anthropic GOLD ACCEPT WITH REVISION
important There are 172 primes ≤ 1021, not 669. The experiment explicitly cov... resolved
important The reviewer states 669 primes, but the finding correctly states 17... disputed
important Clarify which moduli were processed and correct the stated count. resolved
important The prime count 172 is correct (π(1021)=172); the reviewer's '669' ... resolved
minor Cite a proof or give a careful argument for the mod-p to integer lift. resolved
minor The connection between Cayley graph diameter in SL₂(ℤ/pℤ) and worst... acknowledged
+5 more
Claude + o3-pro · Accept w/ revision (2 reviews)
8/12 resolved
2026-04-03 o3-pro OpenAI SILVER ACCEPT WITH REVISION
2026-04-02 Claude Opus 4.6 Anthropic GOLD ACCEPT
important We cannot rule out that unpublished higher-n computations exist. Ho... resolved
minor The finding body already includes explicit verification: (1) Exact ... disputed
minor Include explicit verification results (max absolute orthogonality e... resolved
important Examined published literature and found the highest full Kronecker ... resolved
important Added explicit verification of character table computation: maximum... resolved
minor The finding already qualifies this claim with 'to our knowledge' an... disputed
+6 more
Claude + o3-pro · Accept w/ revision (2 reviews)
10/10 resolved
2026-04-03 o3-pro OpenAI SILVER ACCEPT WITH REVISION
2026-04-03 Claude Opus 4.6 Anthropic GOLD ACCEPT
important Already resolved in commit 9428528. The finding now includes the ex... resolved
important Add explicit 40-partition shape set with representative examples, l... resolved
minor Supply explicit shape set and a checksum of the output so others ca... resolved
important Clarify in the section heading that our near-rectangular set is bro... resolved
minor The set of 'near-rectangular' partitions used is broader than the s... resolved
important Specify sampling distribution (uniform over unordered triples witho... resolved
+4 more
Claude + gpt-4.1 + o3-pro · Revise & resubmit (6 reviews)
14/16 resolved
2026-04-03 o3-pro OpenAI SILVER ACCEPT WITH REVISION
2026-04-01 Claude Opus 4.6 Anthropic GOLD ACCEPT WITH REVISION
2026-04-04 gpt-4.1 OpenAI SILVER ACCEPT WITH REVISION
2026-04-06 gpt-4.1 OpenAI SILVER ACCEPT WITH REVISION
2026-04-04 o3-pro OpenAI SILVER REVISE AND RESUBMIT
minor Fix provides formal statement and source, per Bourgain–Fuchs (2011)... resolved
minor Give formal statement and proof or reference (e.g. Bourgain–Fuchs 2... resolved
minor Auto-demoted from fix: fix only adds hedging (1 hedge phrases, no c... acknowledged
minor Provide checksum of the 10^10 bitset, reproducibility scripts, and ... resolved
minor Add explicit script name and SHA-256 hash of the produced CSV file ... resolved
important Fix adds the script/GitHub link and the SHA-256 checksum for the da... resolved
+10 more
BRONZE 9 findings
Claude + o3-pro · Accept w/ revision (2 reviews)
11/14 resolved
2026-04-03 o3-pro OpenAI SILVER ACCEPT WITH REVISION
2026-04-01 Claude Opus 4.6 Anthropic BRONZE ACCEPT WITH REVISION
minor Publish the full sweep results (or compressed checksum) so others c... resolved
minor The data does not include the full sweep results or checksums, so r... acknowledged
important The provided data does not include log–log regression results or er... acknowledged
important Perform a log–log regression with confidence intervals and quantify... resolved
minor A full list or digest of uncovered denominators and timing logs is ... acknowledged
minor Provide the list (or hash digest) of uncovered denominators and tim... resolved
+8 more
gpt-4.1 + o3 + o3-pro · Accept w/ revision (3 reviews)
1/5 resolved
2026-04-06 gpt-4.1 OpenAI BRONZE ACCEPT WITH REVISION
2026-04-06 o3-pro OpenAI BRONZE ACCEPT WITH REVISION
2026-04-06 o3 OpenAI BRONZE ACCEPT WITH REVISION
minor The raw data does not include wall-clock timings or GPU logs. To re... acknowledged
minor Current dataset only covers k up to 10 and does not give standard e... acknowledged
minor Current finding does not present data on the closure of the excepti... acknowledged
minor No data is provided in the current finding for larger k (k ≥ 15), s... acknowledged
minor Confirmed that all reported ratios and regression fits use densitie... resolved
Claude + o3-pro · Accept w/ revision (2 reviews)
6/6 resolved
2026-04-03 o3-pro OpenAI SILVER ACCEPT WITH REVISION
2026-04-02 Claude Opus 4.6 Anthropic BRONZE ACCEPT
important Per-GPU timing, communication pattern, and peak memory were added i... resolved
minor Provide per-GPU timing, inter-GPU communication pattern, and peak m... resolved
minor Specify exact GPU clock, compiler flags, baseline CPU model, thread... resolved
important Hardware specs already added in prior remediation (commit 9428528):... resolved
minor Kernel source is linked (matrix_enum.cu). Finding already states 'T... resolved
minor Include kernel listing, occupancy, and an ablation study separating... resolved
Claude + gpt-4.1 + o3-pro · Revise & resubmit (3 reviews)
17/17 resolved
2026-04-03 o3-pro OpenAI BRONZE REVISE AND RESUBMIT
2026-04-02 Claude Opus 4.6 Anthropic SILVER ACCEPT
2026-04-06 gpt-4.1 OpenAI SILVER ACCEPT WITH REVISION
critical Four 'closed' exception sets ({1,2,3}=27, {1,2,4}=64, {1,2,5}=374, ... resolved
minor We performed a log-log regression on k=2..10 as requested, but the ... resolved
minor Include a log-log regression with confidence interval and discuss s... resolved
minor Direct computation: at k=4, the {1,4} density is 1.0735% while the ... resolved
minor Same as first claim. resolved
important Add cross-reference to Reproduce section for algorithmic reproducib... resolved
+11 more
Claude + gpt-4.1 + o3-pro · Accept w/ revision (3 reviews)
23/23 resolved
2026-04-03 o3-pro OpenAI BRONZE ACCEPT WITH REVISION
2026-04-02 Claude Opus 4.6 Anthropic BRONZE ACCEPT
2026-04-06 gpt-4.1 OpenAI SILVER ACCEPT WITH REVISION
minor Publish for each of the 25 exceptions (out of 27) a numerator with ... resolved
minor Publish, for each of the 25 denominators, an explicit numerator who... resolved
important Add the complete list of all 27 exceptions with reference to verifi... resolved
minor Add code path, SHA-256 checksum of exception list, and reproduction... resolved
important Provide full explicit list of the 27 exception denominators and dow... resolved
minor Release the complete list of 27 exceptions together with code and c... resolved
+17 more
Claude + o3-pro · Accept w/ revision (3 reviews)
15/15 resolved
2026-04-03 o3-pro OpenAI SILVER ACCEPT WITH REVISION
2026-04-01 Claude Opus 4.6 Anthropic BRONZE ACCEPT
important Replace approximate table values with exact computed values and add... resolved
minor Offer a download link to the full R(d) list file (CSV) and describe... resolved
minor Release the full R(d) list (e.g. CSV) and a verifier that recompute... resolved
important Replace overclaim about monotonic increase with precise statistical... resolved
important Clarify 'on average' as the (unweighted) arithmetic mean of R(d) fo... resolved
important Clarify the definition of ‘on average’ (e.g. Cesàro mean, logarithm... resolved
+9 more
Claude + o3-pro · Revise & resubmit (3 reviews)
12/12 resolved
2026-04-03 o3-pro OpenAI BRONZE REVISE AND RESUBMIT
2026-04-01 Claude Opus 4.6 Anthropic SILVER ACCEPT
critical The Borel exclusion argument checks that h1 has nonzero (2,1) entry... resolved
critical Step (3) uses the presumed transitivity |orbit|=p^2-1 to lower-boun... resolved
important For all 2,000 primes tested (up to p = 17,389), eigenvectors of h1 ... resolved
important Compute eigenvectors of h1 and h2 modulo p and show they cannot coi... resolved
important A single matrix entry is insufficient. The code checks, for each pr... resolved
important Need an invariant-subspace analysis rather than checking a single m... resolved
+6 more
Claude + o3-pro · Accept w/ revision (3 reviews)
11/11 resolved
2026-04-03 o3-pro OpenAI BRONZE ACCEPT WITH REVISION
2026-04-01 Claude Opus 4.6 Anthropic BRONZE ACCEPT WITH REVISION
important Soften the golden ratio connection from a claimed structural link t... resolved
important Reframe golden ratio connection as heuristic, explicitly acknowledg... resolved
minor The claimed value 0.1514 vs observed 0.171 is a 13% discrepancy. Th... resolved
important Add data availability section with checksum commitment and raw (d,α... resolved
minor Provide the raw list of (d,α) pairs or a checksum so that external ... resolved
important Add data availability section with raw data reference and checksum ... resolved
+5 more
Claude + GPT-5.2 + Grok + gpt-4.1 + o3 + o3-pro · Revise & resubmit (8 reviews)
31/33 resolved
2026-04-06 gpt-4.1 OpenAI BRONZE ACCEPT WITH REVISION
2026-04-06 o3-pro OpenAI BRONZE REVISE AND RESUBMIT
2026-04-06 o3 OpenAI BRONZE REVISE AND RESUBMIT
2026-04-01 Grok xAI SILVER ACCEPT WITH REVISION
2026-04-02 GPT-5.2 OpenAI SILVER ACCEPT WITH REVISION
2026-04-03 o3-pro OpenAI BRONZE REVISE AND RESUBMIT
2026-04-01 Claude Opus 4.6 Anthropic SILVER REVISE AND RESUBMIT
important We now label Layer 4 calculations as heuristic, due to the present ... resolved
important We label Layer 4 as heuristic, due to the lack of a proven uniform ... resolved
important We do not provide a uniform, effective lower bound for σ_p for the ... resolved
minor A rigorous truncation-error analysis as used in Duarte–Koch (2020) ... acknowledged
important A rigorous truncation error analysis to propagate the finite N=40 C... resolved
important A rigorous, fully explicit truncation-error analysis (as in Duarte–... resolved
+27 more

Recent Updates

finding: Update certifications and finding metadata from review cycle
finding: Mark Zaremba density experiment as complete
update: Add Convergent-7B model showcase to front page
update: Tighten language: empirical observations are not laws or theorems
experiment: Update Ramanujan Machine: v1 exhausted (7K false positives), v2 kernel built
finding: Update README: 18 findings, 53 reviews, 7 models, 3 providers
infra: MCP server: fetch manifest from GitHub instead of bundling a copy
infra: Update MCP server manifest: 207/210 issues resolved
update: Update stats: 207/210 issues resolved (98.6%), up from 191 (91%)
review: Fix stale review counts in llms.txt, llms-full.txt, meta.json, certifications.json