Zaremba's Conjecture: 210 Billion Verified in 116 Minutes on 8× NVIDIA B200

Name: Zaremba's Conjecture: 210 Billion Verified in 116 Minutes on 8× NVIDIA B200
Creator: Cahlen Humphreys
Published: 2026-03-28
License: https://creativecommons.org/licenses/by/4.0/

Cahlen Humphreys

March 28, 2026 by cahlen complete

Hardware

8× NVIDIA B200 (183 GB VRAM each, 1.43 TB total) 2× Intel Xeon Platinum 8570 (112 cores / 224 threads) 2 TB DDR5 RAM

number-theorycontinued-fractionsopen-conjectures b200dgxnvlink cuda-kernelbrute-forcellm-provingformal-verificationtransfer-operatorspectral-theory

Key Results

Conjecture: Zaremba's Conjecture (1,972)
Conjecture Year: 1,972
Bound: 5
Status: 210B strong computational evidence (zero gaps, 116 min on 8× B200, original v6 kernel). Certification via hardened v6.1 kernel pending re-run. Spectral gaps to m=2,000 complete. Transitivity argument (AI-assisted, not peer-reviewed) for ALL primes.
Verified Range: 1, 2.1×10¹¹
Failures: 0
Verification Time: 6,962.2 seconds (116 min) on 8× B200
Verification Kernel: v6 multi-pass GPU matrix enumeration (256 rounds × 8 GPUs, 119,210 seeds per chunk)
Llm Proofs: 19/20
Models Used: Goedel-Prover-V2-32B, Kimina-Prover-72B

Zaremba’s Conjecture: 210 Billion Verified on 8× NVIDIA B200

Abstract

We verify Zaremba’s Conjecture for all $d$ from 1 to 210 billion using GPU-accelerated continued fraction tree enumeration across 8 NVIDIA B200 GPUs. The v6 multi-pass kernel completes in 116 minutes with zero failures. Combined with spectral analysis (congruence transfer operator gaps uniform $\geq 0.237$ across 1,214 square-free moduli), an algebraic transitivity proof (Γ generates SL₂ for ALL primes via Dickson’s classification), Cayley graph diameters (diam(p) ≤ 2·log(p) for 669 primes), and 19/20 machine-verified Lean 4 proofs from a dual-model LLM race, this is the most comprehensive computational attack on Zaremba’s Conjecture to date.

Background

Zaremba’s Conjecture (1972): For every positive integer $d$ , there exists an integer $a$ with $\gcd(a, d) = 1$ such that the continued fraction expansion

$\frac{a}{d} = [0;\, a_1, a_2, \ldots, a_k]$

has all partial quotients $a_i \leq 5$ .

This conjecture has been open for over 50 years. The current partial-result landscape includes:

Bourgain–Kontorovich (2014): density-1 subset of positive integers satisfies the conjecture with $A = 50$ .
Huang (2015): tightened the density-1 subset to $A = 5$ .
Shkredov (March 2026, arXiv:2603.14116): Zaremba-type result for all sufficiently large prime denominators, with an effective but large threshold.

None of these closes the classical conjecture (“all $d$ with $A = 5$ ”). The gap between “almost all” (or “all large primes”) and “all $d$ ” is one of the fundamental difficulties in analytic number theory, analogous to the gap between density results for Goldbach’s conjecture and the conjecture itself. The work described below is a computational verification over an initial range plus a proof framework; it is not a complete proof (see findings/zaremba-conjecture-framework for the explicit list of unresolved analytic gaps).

Hardware

Component	Specification
Node	NVIDIA DGX B200
GPUs	8× NVIDIA B200 (183 GB VRAM each)
Total VRAM	1.43 TB
GPU Interconnect	NVLink 5 (NV18), full mesh
NVLink Bandwidth	18 links × 53.125 GB/s = 956 GB/s per GPU
CPUs	2× Intel Xeon Platinum 8570 (56 cores each)
Total Cores/Threads	112 / 224
System RAM	2 TB DDR5
Storage	28 TB NVMe

Method

Part 1: LLM-Assisted Formal Proving

We formalized Zaremba’s Conjecture in Lean 4, creating 20 theorem statements for $d = 1, \ldots, 20$ with bound $A = 5$ . Each theorem has the form:

theorem zaremba_d7 : HasZarembaWitness 7 5 := by sorry

where HasZarembaWitness d A asserts:

$\exists\, a : \mathbb{N},\; 0 < a \;\wedge\; a \leq d \;\wedge\; \gcd(a, d) = 1 \;\wedge\; \text{all CF quotients of } a/d \leq A$

We served two SOTA proving models on split GPU sets:

Model	HF ID	Parameters	GPUs	Port
Goedel-Prover-V2-32B	`Goedel-LM/Goedel-Prover-V2-32B`	32B (Qwen3)	0–3	8000
Kimina-Prover-72B	`AI-MO/Kimina-Prover-72B`	72B (Qwen2.5)	4–7	8001

Both served via vLLM 0.18.0 with 4-way tensor parallelism. Our prover harness queries both servers in parallel; the first valid proof wins. Each proof is verified by the Lean 4 compiler.

The proof pattern for each theorem is:

exact ⟨WITNESS, by decide, by decide, by native_decide, by native_decide⟩

where WITNESS is a concrete integer and native_decide compiles the verification to native code.

Part 2: CUDA-Accelerated Exhaustive Verification (v6 kernel)

The verification is performed by a single GPU kernel, matrix_enum_multipass.cu (“v6”), which reformulates continued fraction enumeration as batched $2 \times 2$ matrix multiplication on GPU. The search is generative: rather than looping over each $d \in [1, D]$ and searching for a witness $a$ , the kernel enumerates all CF paths $[0; a_1, a_2, \ldots, a_K]$ with partial quotients $a_i \in \{1, 2, 3, 4, 5\}$ , computes the corresponding denominator $q_K$ via the matrix product $g_{a_1} \cdots g_{a_K}$ where $g_a = \begin{pmatrix} 0 & 1 \\ 1 & a \end{pmatrix}$ , and marks $q_K$ in a shared bitset via atomicOr. Any $d \leq D$ not marked at the end is a counterexample.

The computation proceeds in two phases:

Phase A (single GPU, serial): build the CF tree to depth 12, producing $5^{12} \approx 2.44 \times 10^8$ depth-12 seed matrices. These are pruned by $q \leq D$ at every level to keep the working set bounded.
Phase B (8 GPUs in parallel): the depth-12 seeds are divided into 256 rounds of 119,210 seeds per chunk per GPU. Each chunk is expanded from depth 12 up to depth 62 (sufficient for $q \leq 2.1 \times 10^{11}$ ). Within each expansion, a fused expand+mark+compact kernel (a) multiplies every live matrix by each $g_1, \ldots, g_5$ , (b) atomically sets the bit for any $q \leq D$ , and (c) retains only children with $q \leq D/5$ for the next level (since further expansion can only multiply $q$ by factors $\geq 5$ , after further refinements that bound the growth per level).

An important software-audit note: the original v6 kernel counts every expansion via atomicAdd(out_count, 1ULL) but only writes the matrix if pos < max_out, then clips the next frontier to min(h_out, BUF_SLOTS). This was intended to be provably unnecessary under the chosen chunk size, but the original kernel did not emit a certificate proving that clipping never occurred. A hardened replacement, matrix_enum_multipass_v6_1.cu (in the same directory), closes this gap by (i) aborting with a fatal error on any overflow (unless the diagnostic flag ZAREMBA_PROBE=1 is set) and (ii) printing a final no-overflow certificate reporting peak frontier vs BUF_SLOTS for both phases. Upgrading the 210B claim from strong computational evidence to certified computational result requires a re-run of v6.1 on equivalent hardware and checking the certificate block.

Part 3: Transfer Operator Spectral Analysis

To move beyond brute-force verification toward a potential proof, we computed the spectral properties of the Gauss-type transfer operator for CFs bounded by $A = 5$ :

$\mathcal{L}_s f(x) = \sum_{k=1}^{5} \frac{1}{(k+x)^{2s}} f\left(\frac{1}{k+x}\right)$

The Hausdorff dimension $\delta$ of the set $E_5$ (reals with all CF quotients $\leq 5$ ) is the unique $s$ where the spectral radius of $\mathcal{L}_s$ equals 1. If $2\delta > 1$ , the circle method (Bourgain-Kontorovich style) can in principle be applied.

We discretized $\mathcal{L}_s$ using Chebyshev collocation with $N = 40$ points on $[0, 1]$ , converting the eigenvalue problem into a dense matrix eigensolve computed on GPU via cuSOLVER.

Results

LLM Proving Race: 19/20

$d$	Winner	Witness $a$	$d$	Winner	Witness $a$
1	Goedel	1	11	Kimina	2
2	Goedel	1	12	Goedel	5
3	Goedel	1	13	Kimina	3
4	Goedel	1	14	Goedel	3
5	Goedel	1	15	Kimina	4
6	Kimina	5	16	Goedel	3
7	Kimina	2	17	Kimina	3
8	Kimina	3	18	Goedel	5
9	FAIL	—	19	Kimina	4
10	Kimina	3	20	Kimina	9

Final score: Goedel-Prover and Kimina-Prover split the 19 proved cases. All 19 proofs verified by the Lean 4 compiler.

The single failure ( $d = 9$ , correct witness $a = 2$ ) was a search problem: both models repeatedly tried $a = 1$ and $a = 3$ instead of $a = 2$ across 80 total attempts. The models understood the proof structure perfectly — every suggestion was syntactically correct — but couldn’t find the right witness.

Pipeline Bugs Encountered

Three bugs were discovered and fixed during the proving run:

Sorry acceptance: Lean 4 compiles sorry as a warning (exit code 0), not an error. The prover initially reported 23/23 “proved” because model outputs containing sorry were accepted.
Output parsing: Models wrap proofs in markdown headers and full theorem re-statements. Injecting the raw output into the source file produces garbage. Fixed with bracket-matching truncation.
decide vs native_decide: Models generate by decide for all proof obligations. The kernel evaluator times out on CF computation; native_decide compiles to native code and runs instantly. Fixed with post-processing.

Exhaustive Verification: 0 Failures to $2.1 \times 10^{11}$

Verified: The v6 multi-pass GPU matrix enumeration kernel reported zero gaps for all $d \leq 2.1 \times 10^{11}$ (210 billion) in 6,962.2 s (116 min) on 8× NVIDIA B200. Under the caveat that the original kernel did not emit a no-overflow certificate (see “Part 2: CUDA-Accelerated Exhaustive Verification” above), this result is treated here as strong computational evidence; certification via the hardened v6.1 kernel on equivalent hardware is the pending next step.

The kernel performs the entire CF tree walk on GPU via batched 2×2 matrix multiplication with fused expand+mark+compact. Phase A builds the tree to depth 12 on one GPU (2.44 × 10⁸ live seeds after pruning), Phase B distributes these across all 8 GPUs in 256 rounds of 119,210 seeds per chunk, each expanding to depth 62.

Scaling of the v6 kernel on 8× B200 (all using the same matrix_enum_multipass.cu source, with num_rounds tuned per target):

Range	num_rounds	Time	Rate	Notes
$d \leq 10^9$	1	21.8 s	45.9M d/s	per-chunk seeds = 30.5M ≫ 119,210; unverified as certified
$d \leq 10^{10}$	32	179 s	55.9M d/s	per-chunk seeds ≈ 954K ≫ 119,210; unverified as certified
$d \leq 10^{11}$	128	1,746 s	57.3M d/s	per-chunk seeds ≈ 238K ≫ 119,210; unverified as certified
$d \leq 2.1 \times 10^{11}$	256	6,962.2 s	30.2M d/s	per-chunk seeds = 119,210; 210B headline run

The v6 kernel extended an earlier single-GPU v5 kernel with multi-pass chunking to keep Phase B’s buffer working set small. Earlier kernel iterations (v1 through v5, including a targeted-search heuristic around $a \approx 0.170 d$ used in v2) have been archived and are not used in the 210B claim; see the repository history for details.

Important caveat on all rows above. At num_rounds = 1 (the $d \leq 10^9$ row), each GPU processed 30.5M seeds in a single chunk — orders of magnitude above the 210B chunk size. Local v6.1 probes on a single RTX 5090 (see paper/CERTIFICATE.md) confirm that such large chunk sizes produce peak Phase B frontiers far in excess of the 2 × 10⁹ BUF_SLOTS of the original v6 kernel, so those smaller runs almost certainly clipped silently.

More importantly, v6.1 probe measurements at the exact 210B chunk size (119,210 seeds per chunk, matching num_rounds = 256 × 8 GPUs) show:

max_d	`h_out` peak observed	Overflow events	Wall time	Interpretation
$10^8$	$1.91 \times 10^9$	0	1,407 s	True unclipped peak; 95.5% of the B200 $2 \times 10^9$ buffer
$10^9$	$2.00 \times 10^9$ (saturated at $5 \times$ local BUF_SLOTS)	$17.5 \times 10^{12}$	6,987 s	True peak > $4 \times 10^8$ but exact value not recoverable from a clipped probe
$10^{10}$	$4.29 \times 10^9$ (saturated at a higher level)	$25.2 \times 10^{12}$	7,491 s	Saturation level grows with $\max_d$ — confirms peak is monotone in $\max_d$
$2.1 \times 10^{11}$	requires v6.1 on $\geq 1.5$ TB of GPU memory	—	—	—

Reading this correctly. The $\max_d = 10^8$ row is a direct measurement of the true unclipped frontier — no overflow was recorded, so the atomic counter was not saturating. The $\max_d \geq 10^9$ rows’ peak values are structural saturation artifacts: once a level of our local kernel clips to $\texttt{BUF\_SLOTS} = 4 \times 10^8$ , the next level’s atomic counter is bounded above by a small multiple of BUF_SLOTS. The overflow-event columns are the informative parts of those rows: 17.5 trillion and 25.2 trillion overflow events respectively confirm that the true unclipped per-level frontier exceeds $4 \times 10^8$ by a very large margin at $\max_d \geq 10^9$ .

What can be concluded: per-chunk peak frontier is non-decreasing in $\max_d$ at fixed chunk size, so the true peak at $\max_d = 2.1 \times 10^{11}$ is at least $1.91 \times 10^9$ , and very likely significantly larger (since the probe at $\max_d = 10^{10}$ was already well past a $5 \times 4\times 10^8$ saturation). Whether the excess stays under the B200’s $2 \times 10^9$ or overflows it cannot be determined from our local probes alone — that is what the v6.1 re-run on $\geq 1.5$ TB of GPU memory is for. The 210B run therefore currently stands as strong computational evidence, not a certified computational result: the v6 kernel emits no machine-checkable no-overflow certificate, and “Uncovered = 0” is conditional on an un-verified no-overflow hypothesis.

First certified sub-range. A CERTIFY run (no probe mode, hard abort on overflow) at $\max_d = 10^6$ , 2,048 rounds, completed in 226.4 s on the RTX 5090 and produced the project’s first machine-checkable no-overflow log:

--- NO-OVERFLOW CERTIFICATE ---
BUF_SLOTS                 : 400000000
Phase A peak frontier     : 216330790  (0.5408 of BUF_SLOTS)
Phase A overflow events   : 0
Phase B peak frontier     : 262804169  (0.6570 of BUF_SLOTS)
Phase B overflow events   : 0
All peaks < BUF_SLOTS     : YES
No-overflow abort fired   : NO

So the sub-range $d \in [1, 10^6]$ is now certified in the machine-checkable sense, not merely computationally-evident. The full log is archived on the Hugging Face dataset as certification/v6_1_CERTIFY_d1000000_r2048_NO_OVERFLOW.log. CERTIFY runs at $\max_d \geq 10^7$ hard-aborted at Phase B round 1 on the RTX 5090 (because the local $4 \times 10^8$ buffer is five times smaller than the B200’s), directly corroborating the probe-mode conclusion that per-chunk Phase B peaks routinely exceed $10^9$ .

Important non-conclusion. Clipping, if it happened in the 210B run, does not directly invalidate $\texttt{Uncovered} = 0$ . In expand_mark_compact_safe, the bitset atomicOr mark fires before the buffer-availability check, so a clipped matrix still marks its denominator; what is lost is its descendants. Because the 244M Phase A seeds produce massively redundant CF coverage, denominators whose CF paths were clipped are usually marked by other unclipped paths. The issue is that the v6 kernel does not prove this, and the v6.1 re-run is what upgrades the status from “very likely correct” to “machine-checked”.

Transfer Operator: Hausdorff Dimension to 15 Digits

Quantity	Value
Hausdorff dimension $\delta$	$0.836829443681208$
Precision	15 digits
$2\delta$	$1.674$
$2\delta > 1$ ?	Yes (circle method threshold met)
Spectral gap	$0.717$
$\lvert\lambda_1/\lambda_0\rvert$	$0.283$

The spectral gap of $0.717$ is strong: the dominant eigenfunction controls the operator’s behavior overwhelmingly, with the second eigenvalue less than 30% of the leading one. This matches Jenkinson-Pollicott (2001) and subsequent refinements, independently verified from scratch on GPU.

Why this matters:

$2\delta > 1$ shows the Hausdorff dimension of $E_5$ is large enough for the circle method to potentially close the gap from density-1 to all $d$
The spectral gap quantifies mixing rates in the continued fraction dynamics
Phase 2 complete: congruence spectral gaps computed for all 1,214 square-free moduli up to $m = 1999$ — every gap positive (min $0.237$ ), evidence consistent with property ( $\tau$ ) at this scale. See findings
Transitivity argument (AI-assisted, not peer-reviewed): algebraic proof that $\Gamma_{\{1,\ldots,5\}}$ generates $\text{SL}_2(\mathbb{Z}/p\mathbb{Z})$ for every prime $p$ — no local obstructions exist. See findings

Analysis: The Witness Distribution

Define $\alpha(d)$ as the smallest Zaremba witness for $d$ with bound $A = 5$ . Based on exhaustive computation over $d = 1$ to $100{,}000$ :

Concentration at $\alpha(d)/d \approx 0.171$

For $d > 1000$ :

$\text{Mean}\;\frac{\alpha(d)}{d} = 0.1712, \qquad \text{99\% interval:}\; [0.1708,\; 0.1745]$

This is an extraordinarily tight concentration — 99% of witnesses fall in a band of relative width 2%.

Connection to the Golden Ratio

The continued fraction prefix of $\alpha(d)/d$ is overwhelmingly $[0;\, 5, 1, \ldots]$ (99.7% of cases). The infinite continued fraction $[0;\, 5, 1, 1, 1, \ldots]$ converges to

$\frac{1}{5 + \varphi} = \frac{1}{5 + \frac{1+\sqrt{5}}{2}} \approx 0.1511$

where $\varphi$ is the golden ratio. The finite truncation and coprimality constraint shift the observed center to $0.1712$ , between the convergents:

$[0;\, 5, 1] = \frac{1}{6} \approx 0.1667 \qquad\text{and}\qquad [0;\, 5, 1, 1] = \frac{2}{11} \approx 0.1818$

The Bound $A = 5$ Is Tight

The maximum partial quotient actually used in the smallest witness:

Max quotient	Frequency
$5$	99.91%
$4$	0.07%
$\leq 3$	0.02%

For 99.91% of all $d$ , the bound $A = 5$ is necessary — $A = 4$ would fail.

CF Length Distribution

The CF length of $\alpha(d)/d$ peaks at $k = 13$ and grows as $O(\log d)$ :

Length $k$	$\leq 8$	9–10	11–12	13–14	15–16	$\geq 17$
Frequency	1.6%	8.7%	36.0%	42.4%	10.1%	1.2%

Implications

Search optimization. Any Zaremba verification algorithm should start searching at $a \approx 0.170\,d$ . This reduces the search space by $\sim 6\times$ compared to starting at $a = 1$ . The v4 inverse CF kernel eliminates search entirely for most cases.
Transfer operator supports circle method. The Hausdorff dimension $\delta = 0.8368$ gives $2\delta = 1.674 > 1$ , consistent with the circle method threshold being met with substantial margin. The spectral gap of $0.717$ provides room for the congruence analysis in Phase 2.
Proof strategy. The transfer operator’s dominant eigenfunction peaks near $x \approx 0.171$ , explaining why witnesses concentrate at $\alpha(d)/d \approx 0.171$ . A full proof may require bounding the congruence transfer operator’s spectral radius uniformly across all moduli. See the companion transfer operator analysis for details.
LLM proving bottleneck. Current SOTA provers nail proof structure but fail at witness search. Adding MCTS or systematic enumeration on top of LLM tactic generation is the clear next step.
Gauss map connection. Witnesses concentrate near the preimage of $[\frac{1}{2}, 1]$ under the Gauss map branch $x \mapsto 1/(5 + x)$ . The transfer operator spectral analysis makes this connection rigorous.

Reproducibility

Clone and verify:

git clone https://github.com/cahlen/idontknow
cd idontknow

# Canonical v6 kernel (original 210B headline run)
nvcc -O3 -arch=sm_100a -o matrix_v6 \
    scripts/experiments/zaremba-effective-bound/matrix_enum_multipass.cu -lpthread

# Hardened v6.1 kernel (recommended): adds hard overflow abort + no-overflow
# certificate block to the final output.
nvcc -O3 -arch=sm_100a -o matrix_v6_1 \
    scripts/experiments/zaremba-effective-bound/matrix_enum_multipass_v6_1.cu -lpthread

# Verify d=1 to 210B (requires 8× B200 or similar; ~116 min with v6 original).
# The v6.1 binary prints a "NO-OVERFLOW CERTIFICATE" section at the end.
./matrix_v6_1 210000000000

# Optional: diagnostic probe mode (disables hard abort, reports peak frontiers
# without halting). Use this on smaller hardware to measure true peak frontier
# and detect whether the original run configuration would have clipped.
ZAREMBA_PROBE=1 ZAREMBA_ROUNDS=2048 ./matrix_v6_1 1000000000

# Override num_rounds via environment (v6.1 only):
ZAREMBA_ROUNDS=256 ./matrix_v6_1 210000000000

Build-flag knobs (v6.1): compile with -DBUF_SLOTS=400000000ULL to run on smaller GPUs (e.g. a single RTX 5090 at 32 GB) — the default of 2 billion slots targets B200-class memory. A smaller BUF_SLOTS will make the hard-abort trigger earlier, which is the correct behavior for exposing unsafe chunk sizes.

Raw Data

v6 verification log (210B headline run): run_210B.log
v6 verification log (100B): run_100B_v2.log
v6 verification log (10B): run_10B_v3.log
Verification manifest (SHA256 checksums, exact invocation, environment): paper/verification-manifest.txt
Spectral gap data (1,214 moduli): /data/spectral-gaps.json

References

Zaremba, S.K. (1972). “La méthode des ‘bons treillis’ pour le calcul des intégrales multiples.” Applications of Number Theory to Numerical Analysis, pp. 39–119.
Bourgain, J. and Kontorovich, A. (2014). “On Zaremba’s conjecture.” Annals of Mathematics, 180(1), pp. 137–196.
Huang, S. (2015). “An improvement to Zaremba’s conjecture.” Geometric and Functional Analysis, 25(3), pp. 860–914.
Shkredov, I.D. (2026). “On Zaremba’s conjecture for prime denominators.” arXiv:2603.14116 (preprint, March 2026).
Dickson, L.E. (1901). Linear Groups with an Exposition of the Galois Field Theory. B.G. Teubner, Leipzig.
Jenkinson, O. and Pollicott, M. (2001). “Computing the dimension of dynamically defined sets.” Ergodic Theory and Dynamical Systems, 21(5), pp. 1429–1445.

Computed 2026-03-28 on NVIDIA DGX B200. Transfer operator analysis: companion post. Canonical kernel: scripts/experiments/zaremba-effective-bound/matrix_enum_multipass.cu. Hardened replacement with no-overflow certificate: matrix_enum_multipass_v6_1.cu in the same directory.

This work was produced through human–AI collaboration (Cahlen Humphreys + Claude). Not independently peer-reviewed. All code and data open for verification at github.com/cahlen/idontknow.

Hardware

Key Results

Zaremba’s Conjecture: 210 Billion Verified on 8× NVIDIA B200

Abstract

Background

Hardware

Method

Part 1: LLM-Assisted Formal Proving

Part 2: CUDA-Accelerated Exhaustive Verification (v6 kernel)

Part 3: Transfer Operator Spectral Analysis

Results

LLM Proving Race: 19/20

Pipeline Bugs Encountered

Exhaustive Verification: 0 Failures to 2.1×10112.1 \times 10^{11}2.1×1011

Transfer Operator: Hausdorff Dimension to 15 Digits

Analysis: The Witness Distribution

Concentration at α(d)/d≈0.171\alpha(d)/d \approx 0.171α(d)/d≈0.171

Connection to the Golden Ratio

The Bound A=5A = 5A=5 Is Tight

CF Length Distribution

Implications

Reproducibility

Raw Data

References

Recent Updates

Exhaustive Verification: 0 Failures to $2.1 \times 10^{11}$

Concentration at $\alpha(d)/d \approx 0.171$

The Bound $A = 5$ Is Tight