by cahlen in-progress

Hardware

8× NVIDIA B200 (183 GB VRAM each, 1.43 TB total) 2× Intel Xeon Platinum 8570 (112 cores / 224 threads) 2 TB DDR5 RAM
algebraic-combinatoricsrepresentation-theorygeometric-complexity-theory b200dgxnvlink cuda-kernelmurnaghan-nakayamacharacter-tables

Key Results

Kronecker Coefficients: Pushing the Frontier to n=120n = 120

Abstract

We compute Kronecker coefficients g(λ,μ,ν)g(\lambda, \mu, \nu) of the symmetric group SnS_n for nn up to 120, roughly doubling the existing computational frontier. Kronecker coefficients describe tensor product decompositions of symmetric group representations and are central to algebraic combinatorics, quantum information theory, and the geometric complexity theory (GCT) approach to P vs NP. No combinatorial formula for them is known — this is one of the biggest open problems in the field. We use GPU-parallelized computation via the Murnaghan-Nakayama rule, focusing on near-rectangular partitions relevant to GCT.

Background

Kronecker Coefficients

For three partitions λ,μ,ν\lambda, \mu, \nu of the same integer nn, the Kronecker coefficient g(λ,μ,ν)g(\lambda, \mu, \nu) is:

g(λ,μ,ν)=1n!σSnχλ(σ)χμ(σ)χν(σ)g(\lambda, \mu, \nu) = \frac{1}{n!} \sum_{\sigma \in S_n} \chi^\lambda(\sigma)\, \chi^\mu(\sigma)\, \chi^\nu(\sigma)

where χλ\chi^\lambda is the irreducible character of SnS_n indexed by λ\lambda.

Equivalently, summing over conjugacy classes (partitions ρ\rho of nn):

g(λ,μ,ν)=ρn1zρχλ(ρ)χμ(ρ)χν(ρ)g(\lambda, \mu, \nu) = \sum_{\rho \vdash n} \frac{1}{z_\rho}\, \chi^\lambda(\rho)\, \chi^\mu(\rho)\, \chi^\nu(\rho)

where zρ=iimimi!z_\rho = \prod_i i^{m_i} m_i! for ρ\rho having mim_i parts equal to ii.

Why They Matter

  1. Representation theory: Kronecker coefficients are the structure constants for the representation ring of SnS_n. Understanding them is equivalent to understanding how representations decompose under tensor product.

  2. Quantum information: g(λ,μ,ν)>0g(\lambda, \mu, \nu) > 0 if and only if the spectra (λ/n,μ/n,ν/n)(\lambda/n, \mu/n, \nu/n) are compatible as eigenvalue spectra of a tripartite quantum state and its marginals (the quantum marginal problem).

  3. P vs NP (GCT): Mulmuley and Sohoni’s geometric complexity theory program reduces the permanent vs. determinant problem to showing that certain Kronecker coefficients (for specific “padded” partitions) are positive. Computing these specific coefficients is a bottleneck.

  4. Open problem: No combinatorial interpretation of g(λ,μ,ν)g(\lambda, \mu, \nu) is known, unlike Littlewood-Richardson coefficients which count LR tableaux. Finding one is a major open problem (Stanley, 2000).

Current Frontier

AuthorsYearRangeMethod
Pak, Panova2017n50n \leq 50 (specific triples)Murnaghan-Nakayama
Ikenmeyer et al.2023n60n \leq 60 (GCT triples)Specialized algorithms
Generic software (SageMath)Currentn30n \leq 30 (full tables)Slow, single-threaded

No GPU-accelerated computation of Kronecker coefficients has been published.

The Murnaghan-Nakayama Rule

Character values χλ(ρ)\chi^\lambda(\rho) are computed recursively. For ρ=(r1,r2,,rk)\rho = (r_1, r_2, \ldots, r_k):

χλ(ρ)=border strips B of λ,B=r1(1)ht(B)χλB(r2,,rk)\chi^\lambda(\rho) = \sum_{\text{border strips } B \text{ of } \lambda, |B| = r_1} (-1)^{\text{ht}(B)} \chi^{\lambda \setminus B}(r_2, \ldots, r_k)

A border strip (rim hook) of size rr is a connected skew shape with no 2×22 \times 2 square. Its height ht(B)\text{ht}(B) is one less than the number of rows it occupies.

This recursion has exponential worst-case complexity but is highly parallelizable: each branch of the recursion is independent.

Hardware

ComponentSpecification
NodeNVIDIA DGX B200
GPUs8× NVIDIA B200 (183 GB VRAM each)
Total VRAM1.43 TB
GPU InterconnectNVLink 5 (NV18), full mesh
CPUs2× Intel Xeon Platinum 8570
System RAM2 TB DDR5

Method

Computation Strategy

For n50n \leq 50: Compute the full character table χλ(ρ)\chi^\lambda(\rho) for all λ,ρn\lambda, \rho \vdash n using Murnaghan-Nakayama on CPU, then upload to GPU. The GPU kernel parallelizes over triples (λ,μ,ν)(\lambda, \mu, \nu) — there are p(n)3p(n)^3 such triples, where p(n)p(n) is the partition function. For n=50n = 50, p(50)38.5×1015p(50)^3 \approx 8.5 \times 10^{15} — far too many for full enumeration, but we can compute all triples where g>0g > 0.

For 50<n12050 < n \leq 120: Selective computation of GCT-relevant triples. These are partitions near rectangular shape: λ(ab)\lambda \approx (a^b) (a rectangle of aa columns and bb rows) with small perturbations. The GCT program specifically needs:

  • λ=(nk,1k)\lambda = (n - k, 1^k) (hooks)
  • λ=(ab)+small perturbation\lambda = (a^b) + \text{small perturbation} (near-rectangles)

For these structured partitions, the Murnaghan-Nakayama recursion has much smaller branching factor.

GPU Parallelism

  • Outer loop (GPU blocks): Each CUDA block handles one triple (λ,μ,ν)(\lambda, \mu, \nu)
  • Inner loop (threads within block): Threads cooperate on the sum over conjugacy classes ρ\rho, with each thread evaluating χλ(ρi)χμ(ρi)χν(ρi)/zρi\chi^\lambda(\rho_i) \chi^\mu(\rho_i) \chi^\nu(\rho_i) / z_{\rho_i} for a subset of classes
  • Memory: The 1.43 TB VRAM can hold character tables for nn up to ~70 in full, or selective columns for larger nn

Results

PENDING — experiment not yet run.

Validation: Known Values

Triple (λ,μ,ν)(\lambda, \mu, \nu)Known ggOur value
((3,2,1),(3,2,1),(3,2,1))((3,2,1), (3,2,1), (3,2,1))1PENDING
((4,2),(3,3),(3,2,1))((4,2), (3,3), (3,2,1))1PENDING

New Results at n=120n = 120

PENDING — will report:

  • Number of triples computed
  • Distribution of non-zero Kronecker coefficients
  • Maximum gg value found
  • Specific GCT-relevant values

Analysis

PENDING — will analyze:

  1. Growth rate of g(λ,μ,ν)g(\lambda, \mu, \nu) for near-rectangular partitions as nn increases
  2. Positivity patterns: which triples have g>0g > 0? Any new combinatorial insights?
  3. Comparison with the “stretched” Kronecker coefficient asymptotics
  4. Implications for GCT: do the computed values support or challenge Mulmuley’s conjectures?

Reproducibility

git clone https://github.com/cahlen/idontknow
cd idontknow
nvcc -O3 -arch=sm_100a -o kronecker_compute scripts/experiments/kronecker-coefficients/kronecker_compute.cu

# Full table for S_30
./kronecker_compute 30 all

# GCT-relevant triples for S_120
./kronecker_compute 120 gct

Computed on NVIDIA DGX B200. Code: github.com/cahlen/idontknow.