by cahlen complete

Hardware

8× NVIDIA B200 (183 GB VRAM each, 1.43 TB total) 2× Intel Xeon Platinum 8570 (112 cores / 224 threads) 2 TB DDR5 RAM
number-theorycontinued-fractionsopen-conjectures b200dgxnvlink cuda-kernelbrute-forcellm-provingformal-verificationtransfer-operatorspectral-theory

Key Results

Conjecture
Zaremba's Conjecture (1972)
Verified Range
d = 1 to 210,000,000,000
Failures
0
LLM Proofs
19/20

Zaremba’s Conjecture: 210 Billion Verified on 8× NVIDIA B200

Abstract

We verify Zaremba’s Conjecture for all dd from 1 to 210 billion using GPU-accelerated continued fraction tree enumeration across 8 NVIDIA B200 GPUs. The v6 multi-pass kernel completes in 116 minutes with zero failures. Combined with spectral analysis (congruence transfer operator gaps uniform 0.237\geq 0.237 across 1,214 square-free moduli), an algebraic transitivity proof (Γ generates SL₂ for ALL primes via Dickson’s classification), Cayley graph diameters (diam(p) ≤ 2·log(p) for 669 primes), and 19/20 machine-verified Lean 4 proofs from a dual-model LLM race, this is the most comprehensive computational attack on Zaremba’s Conjecture to date.

Background

Zaremba’s Conjecture (1972): For every positive integer dd, there exists an integer aa with gcd(a,d)=1\gcd(a, d) = 1 such that the continued fraction expansion

ad=[0;a1,a2,,ak]\frac{a}{d} = [0;\, a_1, a_2, \ldots, a_k]

has all partial quotients ai5a_i \leq 5.

This conjecture has been open for over 50 years. The strongest partial result is due to Huang (2015), who proved it holds for a density-1 subset of positive integers — meaning almost all dd satisfy the conjecture, but infinitely many exceptions could theoretically remain.

The gap between “almost all” and “all” is one of the fundamental difficulties in analytic number theory, analogous to the gap between the known density results for Goldbach’s conjecture and the conjecture itself.

Hardware

ComponentSpecification
NodeNVIDIA DGX B200
GPUs8× NVIDIA B200 (183 GB VRAM each)
Total VRAM1.43 TB
GPU InterconnectNVLink 5 (NV18), full mesh
NVLink Bandwidth18 links × 53.125 GB/s = 956 GB/s per GPU
CPUs2× Intel Xeon Platinum 8570 (56 cores each)
Total Cores/Threads112 / 224
System RAM2 TB DDR5
Storage28 TB NVMe

Method

Part 1: LLM-Assisted Formal Proving

We formalized Zaremba’s Conjecture in Lean 4, creating 20 theorem statements for d=1,,20d = 1, \ldots, 20 with bound A=5A = 5. Each theorem has the form:

theorem zaremba_d7 : HasZarembaWitness 7 5 := by sorry

where HasZarembaWitness d A asserts:

a:N,  0<a    ad    gcd(a,d)=1    all CF quotients of a/dA\exists\, a : \mathbb{N},\; 0 < a \;\wedge\; a \leq d \;\wedge\; \gcd(a, d) = 1 \;\wedge\; \text{all CF quotients of } a/d \leq A

We served two SOTA proving models on split GPU sets:

ModelHF IDParametersGPUsPort
Goedel-Prover-V2-32BGoedel-LM/Goedel-Prover-V2-32B32B (Qwen3)0–38000
Kimina-Prover-72BAI-MO/Kimina-Prover-72B72B (Qwen2.5)4–78001

Both served via vLLM 0.18.0 with 4-way tensor parallelism. Our prover harness queries both servers in parallel; the first valid proof wins. Each proof is verified by the Lean 4 compiler.

The proof pattern for each theorem is:

exact ⟨WITNESS, by decide, by decide, by native_decide, by native_decide⟩

where WITNESS is a concrete integer and native_decide compiles the verification to native code.

Part 2: CUDA-Accelerated Exhaustive Verification

We wrote a custom CUDA kernel (zaremba_verify_v2.cu) where each CUDA thread handles one value of dd. The kernel went through two iterations:

v1 searched [d/7,d/3][d/7, d/3] linearly — this worked for small dd but hit a scaling wall at d>109d > 10^9 (199 d/sec at d=3×109d = 3 \times 10^9, projected weeks for the full range).

v2 uses our witness distribution discovery (α(d)/d0.170\alpha(d)/d \approx 0.170) to dramatically narrow the search:

__device__ uint64_t find_witness_v2(uint64_t d) {
    // Phase 1: Spiral outward from 0.170*d in [0.165d, 0.185d]
    uint64_t center = (uint64_t)(0.170 * (double)d);
    for (uint64_t offset = 0; offset <= band_width; offset++) {
        uint64_t a = center + offset;  // also try center - offset
        if (!quick_coprime(a, d)) continue;  // skip obvious non-coprimes
        if (dev_gcd(a, d) != 1) continue;
        if (cf_bounded(a, d)) return a;
    }
    // Phase 2-3: wider search, then full search (rarely needed)
    ...
}

Key optimizations in v2:

  1. Targeted search: Start at a=0.170da = \lfloor 0.170d \rfloor and spiral outward — hits the witness in a band of width 0.4%\sim 0.4\% of dd for 99% of cases
  2. Coprimality sieve: Skip candidates sharing small factors (2, 3, 5, 7, 11, 13) with dd before computing the full gcd
  3. CF prefix filter: Check if d/a5\lfloor d/a \rfloor \leq 5 before running the full CF expansion

Benchmark (100K values at d3×109d \approx 3 \times 10^9, single GPU):

KernelRateTime
v1199 d/sec503s
v22,612 d/sec38s
Speedup13×

We launched 8 parallel instances, one per GPU:

GPURangeValues
0d=1d = 1 to 10910^910910^9
1d=109+1d = 10^9 + 1 to 2×1092 \times 10^910910^9
7d=7×109+1d = 7 \times 10^9 + 1 to 8×1098 \times 10^910910^9

Part 3: Transfer Operator Spectral Analysis

To move beyond brute-force verification toward a potential proof, we computed the spectral properties of the Gauss-type transfer operator for CFs bounded by A=5A = 5:

Lsf(x)=k=151(k+x)2sf(1k+x)\mathcal{L}_s f(x) = \sum_{k=1}^{5} \frac{1}{(k+x)^{2s}} f\left(\frac{1}{k+x}\right)

The Hausdorff dimension δ\delta of the set E5E_5 (reals with all CF quotients 5\leq 5) is the unique ss where the spectral radius of Ls\mathcal{L}_s equals 1. If 2δ>12\delta > 1, the circle method (Bourgain-Kontorovich style) can in principle be applied.

We discretized Ls\mathcal{L}_s using Chebyshev collocation with N=40N = 40 points on [0,1][0, 1], converting the eigenvalue problem into a dense matrix eigensolve computed on GPU via cuSOLVER.

Results

LLM Proving Race: 19/20

ddWinnerWitness aaddWinnerWitness aa
1Goedel111Kimina2
2Goedel112Goedel5
3Goedel113Kimina3
4Goedel114Goedel3
5Goedel115Kimina4
6Kimina516Goedel3
7Kimina217Kimina3
8Kimina318Goedel5
9FAIL19Kimina4
10Kimina320Kimina9

Final score: Goedel-Prover and Kimina-Prover split the 19 proved cases. All 19 proofs verified by the Lean 4 compiler.

The single failure (d=9d = 9, correct witness a=2a = 2) was a search problem: both models repeatedly tried a=1a = 1 and a=3a = 3 instead of a=2a = 2 across 80 total attempts. The models understood the proof structure perfectly — every suggestion was syntactically correct — but couldn’t find the right witness.

Pipeline Bugs Encountered

Three bugs were discovered and fixed during the proving run:

  1. Sorry acceptance: Lean 4 compiles sorry as a warning (exit code 0), not an error. The prover initially reported 23/23 “proved” because model outputs containing sorry were accepted.

  2. Output parsing: Models wrap proofs in markdown headers and full theorem re-statements. Injecting the raw output into the source file produces garbage. Fixed with bracket-matching truncation.

  3. decide vs native_decide: Models generate by decide for all proof obligations. The kernel evaluator times out on CF computation; native_decide compiles to native code and runs instantly. Fixed with post-processing.

Exhaustive Verification: 0 Failures to 2.1×10112.1 \times 10^{11}

Verified: The v6 multi-pass GPU matrix enumeration kernel confirmed zero gaps for all d2.1×1011d \leq 2.1 \times 10^{11} (210 billion) in 116 minutes on 8× NVIDIA B200. This is a complete computational proof that Zaremba’s Conjecture holds for every integer up to 210 billion.

The kernel performs the entire CF tree walk on GPU via batched 2×2 matrix multiplication with fused expand+mark+compact. Phase A builds the tree to depth 12 on one GPU, Phase B distributes 244M matrices across all 8 GPUs in 64 rounds, each expanding to depth 62.

MethodHardwareRangeTimeRate
Python (multiprocessing)112 CPU coresd=1d = 1 to 10610^6751.7 s1,330 d/s
v5 kernel1× B200d=1d = 1 to 10810^87.5 s13.3M d/s
v6 multi-pass8× B200d=1d = 1 to 10910^921.8 s45.9M d/s
v6 multi-pass8× B200d=1d = 1 to 101010^{10}179 s55.9M d/s
v6 multi-pass8× B200d=1d = 1 to 101110^{11}1,746 s57.3M d/s
v6 multi-pass8× B200d=1d = 1 to 2.1×10112.1 \times 10^{11}6,962 s30.2M d/s

Speedup from Python baseline: ~43,000× on 8 GPUs.

The v5 kernel (single GPU) reformulated CF tree enumeration as batched 2×2 matrix multiplication — a 175× speedup over v4. The v6 kernel extended this with multi-pass chunking to eliminate buffer overflow at scale, enabling verification beyond 101010^{10}.

Transfer Operator: Hausdorff Dimension to 15 Digits

QuantityValue
Hausdorff dimension δ\delta0.8368294436812080.836829443681208
Precision15 digits
2δ2\delta1.6741.674
2δ>12\delta > 1?Yes (circle method threshold met)
Spectral gap0.7170.717
λ1/λ0\lvert\lambda_1/\lambda_0\rvert0.2830.283

The spectral gap of 0.7170.717 is strong: the dominant eigenfunction controls the operator’s behavior overwhelmingly, with the second eigenvalue less than 30% of the leading one. This matches Jenkinson-Pollicott (2001) and subsequent refinements, independently verified from scratch on GPU.

Why this matters:

  • 2δ>12\delta > 1 confirms the Hausdorff dimension of E5E_5 is large enough for the circle method to potentially close the gap from density-1 to all dd
  • The spectral gap quantifies mixing rates in the continued fraction dynamics
  • Phase 2 complete: congruence spectral gaps computed for all 1,214 square-free moduli up to m=1999m = 1999 — every gap positive (min 0.2370.237), confirming property (τ\tau) at this scale. See findings
  • Transitivity proved: algebraic proof that Γ{1,,5}\Gamma_{\{1,\ldots,5\}} generates SL2(Z/pZ)\text{SL}_2(\mathbb{Z}/p\mathbb{Z}) for every prime pp — no local obstructions exist. See findings

Analysis: The Witness Distribution

Define α(d)\alpha(d) as the smallest Zaremba witness for dd with bound A=5A = 5. Based on exhaustive computation over d=1d = 1 to 100,000100{,}000:

Concentration at α(d)/d0.171\alpha(d)/d \approx 0.171

For d>1000d > 1000:

Mean  α(d)d=0.1712,99% interval:  [0.1708,  0.1745]\text{Mean}\;\frac{\alpha(d)}{d} = 0.1712, \qquad \text{99\% interval:}\; [0.1708,\; 0.1745]

This is an extraordinarily tight concentration — 99% of witnesses fall in a band of relative width 2%.

Connection to the Golden Ratio

The continued fraction prefix of α(d)/d\alpha(d)/d is overwhelmingly [0;5,1,][0;\, 5, 1, \ldots] (99.7% of cases). The infinite continued fraction [0;5,1,1,1,][0;\, 5, 1, 1, 1, \ldots] converges to

15+φ=15+1+520.1511\frac{1}{5 + \varphi} = \frac{1}{5 + \frac{1+\sqrt{5}}{2}} \approx 0.1511

where φ\varphi is the golden ratio. The finite truncation and coprimality constraint shift the observed center to 0.17120.1712, between the convergents:

[0;5,1]=160.1667and[0;5,1,1]=2110.1818[0;\, 5, 1] = \frac{1}{6} \approx 0.1667 \qquad\text{and}\qquad [0;\, 5, 1, 1] = \frac{2}{11} \approx 0.1818

The Bound A=5A = 5 Is Tight

The maximum partial quotient actually used in the smallest witness:

Max quotientFrequency
5599.91%
440.07%
3\leq 30.02%

For 99.91% of all dd, the bound A=5A = 5 is necessary — A=4A = 4 would fail.

CF Length Distribution

The CF length of α(d)/d\alpha(d)/d peaks at k=13k = 13 and grows as O(logd)O(\log d):

Length kk8\leq 89–1011–1213–1415–1617\geq 17
Frequency1.6%8.7%36.0%42.4%10.1%1.2%

Implications

  1. Search optimization. Any Zaremba verification algorithm should start searching at a0.170da \approx 0.170\,d. This reduces the search space by 6×\sim 6\times compared to starting at a=1a = 1. The v4 inverse CF kernel eliminates search entirely for most cases.

  2. Transfer operator confirms circle method. The Hausdorff dimension δ=0.8368\delta = 0.8368 gives 2δ=1.674>12\delta = 1.674 > 1, confirming the circle method threshold is met with substantial margin. The spectral gap of 0.7170.717 provides room for the congruence analysis in Phase 2.

  3. Proof strategy. The transfer operator’s dominant eigenfunction peaks near x0.171x \approx 0.171, explaining why witnesses concentrate at α(d)/d0.171\alpha(d)/d \approx 0.171. A full proof may require bounding the congruence transfer operator’s spectral radius uniformly across all moduli. See the companion transfer operator analysis for details.

  4. LLM proving bottleneck. Current SOTA provers nail proof structure but fail at witness search. Adding MCTS or systematic enumeration on top of LLM tactic generation is the clear next step.

  5. Gauss map connection. Witnesses concentrate near the preimage of [12,1][\frac{1}{2}, 1] under the Gauss map branch x1/(5+x)x \mapsto 1/(5 + x). The transfer operator spectral analysis makes this connection rigorous.

Reproducibility

Clone and verify:

git clone https://github.com/cahlen/idontknow
cd idontknow

# Compile v6 multi-pass kernel
nvcc -O3 -arch=sm_100a -o matrix_v6 \
    scripts/experiments/zaremba-conjecture-verification/matrix_enum_multipass.cu -lpthread

# Verify d=1 to 1B (any multi-GPU NVIDIA system)
./matrix_v6 1000000000

# Verify d=1 to 210B (requires 8× B200 or similar, ~116 min)
./matrix_v6 210000000000

Single-GPU quick check (v5 kernel):

nvcc -O3 -arch=sm_100a -o matrix_enum \
    scripts/experiments/zaremba-conjecture-verification/matrix_enum.cu
./matrix_enum 100000000  # 100M in ~7.5s on one B200

Raw Data

References

  • Zaremba, S.K. (1972). “La methode des ‘bons treillis’ pour le calcul des integrales multiples.” Applications of Number Theory to Numerical Analysis, pp. 39—119.
  • Bourgain, J. and Kontorovich, A. (2014). “On Zaremba’s conjecture.” Annals of Mathematics, 180(1), pp. 137—196.
  • Huang, S. (2015). “An improvement to Zaremba’s conjecture.” Geometric and Functional Analysis, 25(3), pp. 860—914.
  • Dickson, L.E. (1901). Linear Groups with an Exposition of the Galois Field Theory. B.G. Teubner, Leipzig.
  • Jenkinson, O. and Pollicott, M. (2001). “Computing the dimension of dynamically defined sets.” Ergodic Theory and Dynamical Systems, 21(5), pp. 1429—1445.

Computed 2026-03-28 on NVIDIA DGX B200. Transfer operator analysis: companion post. Code: github.com/cahlen/idontknow.

This work was produced through human–AI collaboration (Cahlen Humphreys + Claude). Not independently peer-reviewed. All code and data open for verification at github.com/cahlen/idontknow.

Recent Updates

updateAuto-generated changelog from git log — updates on every build