# Neuroalgebraic Geometry: Overview

## About this folder

This folder collects notes on neuroalgebraic geometry — the study of neural networks and statistical learning via algebraic geometry. It lives under `concepts/algebraic-geometry/` because the mathematical foundations (varieties, resolution of singularities, semi-algebraic sets) are developed there. Notes assume familiarity with schemes, birational geometry, and real algebraic geometry at the level of the parent folder.


## 📋 Notes in This Folder

| File | Status | Topic |
|------|--------|-------|
| `singular-learning-theory.md` | ✅ Written | Watanabe’s RLCT, resolution of singularities, Bayesian asymptotics |
| `bernstein-sato-and-zeta-functions.md` | ✅ Written | Gelfand zeta function, b-function, Kashiwara’s theorem, monodromy, SLT pipeline |
| `expressivity-and-complexity.md` | 🔲 Planned | Neuromanifold dimension/degree, tensor rank, Milnor–Thom bounds |
| `loss-landscape-geometry.md` | 🔲 Planned | Critical point structure, Morse theory, EDD, symmetry quotients |
| `algebraic-statistics-connections.md` | 🔲 Planned | MLE on varieties, graphical models, identifiability, exponential families |

## 🗺️ Subtopic Map

### Thread 1 — Singular Models & Bayesian Asymptotics

| Subtopic | Key Idea | Primary Source |
|----------|----------|----------------|
| Nonidentifiability & singular models | Parameter map φ: W → M is not injective; fibers are positive-dimensional | Watanabe 2009 |
| Real log canonical threshold (RLCT) | Birational invariant λ controlling Bayes free energy; λ ≤ d/2 for singular models | Watanabe 2009; arXiv:2211.10049 |
| Resolution of singularities | Hironaka: blow up W to resolve φ; the pulled-back posterior is normal crossing | Watanabe 2009 |
| Free energy asymptotics | F_n = nS_n + λ log n − (m−1) log log n + O(1); RLCT replaces d/2 in BIC | Watanabe 2009 |
| Phase transitions in learning | λ changes discontinuously as hyperparameters vary; analogy with stat-mech phase transitions | arXiv:2406.10234 |
| SLT for deep learning | Neural networks are singular; singularities are good (smaller effective complexity) | arXiv:2010.11560 |
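
The RLCT rows above can be made concrete: λ is the volume-scaling exponent of the loss, V(ε) = Vol{w : K(w) < ε} ≈ c ε^λ (log 1/ε)^(m−1). A minimal Monte Carlo sketch, using the standard toy singular loss K(w) = w₁²w₂² (our own illustrative choice; its RLCT is 1/2, strictly below the regular-model value d/2 = 1):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.uniform(-1.0, 1.0, size=(2_000_000, 2))
K = (w[:, 0] ** 2) * (w[:, 1] ** 2)  # toy singular loss K(w) = w1^2 * w2^2

# Volume scaling: V(eps) ~ c * eps^lambda * log(1/eps)^(m-1).  The log factor
# drags a naive log-log slope slightly below the exact lambda = 1/2.
eps = np.logspace(-3, -7, 9)
V = np.array([(K < e).mean() for e in eps])
lam_hat, _ = np.polyfit(np.log(eps), np.log(V), 1)

print(f"fitted exponent ~ {lam_hat:.2f} (exact RLCT 0.5; regular d/2 = 1.0)")
```

A regular (identifiable) 2-parameter model would give slope d/2 = 1 here; the smaller exponent is the "singularities are good" effect of the last row.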

### Thread 2 — Neuromanifolds & Expressivity

| Subtopic | Key Idea | Primary Source |
|----------|----------|----------------|
| Neuromanifold definition | M = image of φ: W → F(X) in function space; semi-algebraic variety | arXiv:2501.18915 |
| Dimension and sample complexity | dim M controls covering number; log N_ε = O(m log(d/ε)) | arXiv:2501.18915 |
| Degree and curvature | deg M bounds number of complex critical points of distance minimization | arXiv:2501.18915 |
| Identifiability via fibers | φ^{-1}(f) for generic f encodes symmetry group (permutation + scaling for MLPs) | arXiv:2501.18915 |
| Linear networks as determinantal varieties | Neuromanifold of linear MLPs = rank-r matrices; RLCT known exactly | Watanabe 2009; arXiv:2501.18915 |
| Tropical geometry for ReLU | Max-plus algebra linearizes ReLU computations; tropical varieties count linear regions | arXiv:2501.18915 |
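
The linear-network row is easy to check numerically: the end-to-end map of a depth-2 linear network with hidden width r is exactly a matrix of rank ≤ r, so the neuromanifold is the determinantal variety of dimension r(m+n−r), and the r²-dimensional GL_r fibers realize the nonidentifiability. A sketch (the sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 5, 6, 3          # output dim, input dim, hidden width (illustrative)

W1 = rng.standard_normal((r, n))
W2 = rng.standard_normal((m, r))
end_to_end = W2 @ W1       # a point on the neuromanifold of the linear MLP

# The neuromanifold is the determinantal variety {rank <= r}:
assert np.linalg.matrix_rank(end_to_end) <= r

# Its dimension r(m+n-r) falls r^2 short of the parameter count r(m+n):
# the fiber over a generic point is {(W2 G^{-1}, G W1) : G in GL_r}.
G = rng.standard_normal((r, r))          # a generic G is invertible
W1p, W2p = G @ W1, W2 @ np.linalg.inv(G)
assert np.allclose(W2p @ W1p, end_to_end)

print("variety dim:", r * (m + n - r), " parameter dim:", r * (m + n))
```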

### Thread 3 — Loss Landscape & Optimization Geometry

| Subtopic | Key Idea | Primary Source |
|----------|----------|----------------|
| Euclidean distance degree (EDD) | Counts complex critical points of ‖f − f_0‖² over M; bounds real optima | arXiv:2501.18915 |
| Singularities as implicit bias | Singular points attract gradient flow → preference for simpler subnetworks | arXiv:2501.18915; IPAM |
| Critical point theory | Distinguishing genuine optima from saddles using algebraic discriminants | IPAM workshop |
| Symmetry-induced degeneracies | Permutation symmetry of hidden neurons creates flat directions; quotient geometry | General |
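
The symmetry-induced degeneracies of the last row can be exhibited directly for a two-layer ReLU network (layer sizes are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, h, d_out = 4, 6, 2           # hypothetical layer sizes

W1, b1 = rng.standard_normal((h, d_in)), rng.standard_normal(h)
W2 = rng.standard_normal((d_out, h))
relu = lambda z: np.maximum(z, 0.0)
f = lambda x, W1, b1, W2: W2 @ relu(W1 @ x + b1)

x = rng.standard_normal(d_in)
perm = rng.permutation(h)          # relabel the hidden neurons

# Permuting hidden units (rows of W1/b1, columns of W2) is a loss-preserving
# symmetry: every minimum sits on an orbit of h! parameter points.
assert np.allclose(f(x, W1, b1, W2), f(x, W1[perm], b1[perm], W2[:, perm]))

# Positive homogeneity of ReLU also gives continuous scaling symmetries,
# i.e. genuinely flat directions through every parameter point:
c = 2.7
assert np.allclose(f(x, W1, b1, W2), f(x, c * W1, c * b1, W2 / c))
```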

### Thread 4 — Algebraic Statistics Connections

| Subtopic | Key Idea | Primary Source |
|----------|----------|----------------|
| MLE on varieties | Likelihood geometry: ML degree counts critical points of log-likelihood on variety | Sturmfels et al. |
| Graphical models as varieties | Conditional independence = polynomial equations on covariance matrices | General |
| Exponential families | Sufficient statistics map defines a toric variety; duality with log-partition function | General |
| AI alignment & information theory | Bayesian free energy provides universal criterion for model comparison under singularity | arXiv:2406.10234 |
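
The MLE-on-varieties row has a classical smallest example: the 2×2 independence model (rank-one probability matrices) has ML degree 1, and its unique likelihood critical point is the product of normalized margins. A sketch with a hypothetical contingency table:

```python
import numpy as np

counts = np.array([[12., 5.],
                   [7., 16.]])     # hypothetical 2x2 contingency table
N = counts.sum()

# Independence model = rank-1 probability matrices p_ij = a_i * b_j.
# Its ML degree is 1: the single likelihood critical point is the
# outer product of the normalized row and column margins.
p_hat = np.outer(counts.sum(axis=1), counts.sum(axis=0)) / N**2
assert np.linalg.matrix_rank(p_hat) == 1 and np.isclose(p_hat.sum(), 1.0)

loglik = lambda p: (counts * np.log(p)).sum()

# No other rank-1 distribution achieves a higher likelihood:
rng = np.random.default_rng(3)
for _ in range(200):
    a, b = rng.dirichlet(np.ones(2)), rng.dirichlet(np.ones(2))
    assert loglik(np.outer(a, b)) <= loglik(p_hat) + 1e-9
```

Determinantal and toric models with higher ML degree are where the algebraic machinery (Huh–Sturmfels) becomes essential.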

## 🔗 Dependency Graph

```mermaid
flowchart TD
    AG["Algebraic Geometry<br/>concepts/algebraic-geometry/note.md"]
    SLT["Singular Learning Theory<br/>singular-learning-theory.md"]
    EX["Expressivity & Complexity<br/>expressivity-and-complexity.md"]
    LOSS["Loss Landscape Geometry<br/>loss-landscape-geometry.md"]
    AS["Algebraic Statistics<br/>algebraic-statistics-connections.md"]
    AG --> SLT
    AG --> EX
    AG --> AS
    SLT --> LOSS
    EX --> LOSS
    AS --> SLT
```

## 📚 Master References

### Thread 1: Singular Learning Theory

| Reference | Authors | Year | What It Covers | Link |
|-----------|---------|------|----------------|------|
| Algebraic Geometry and Statistical Learning Theory | S. Watanabe | 2009 | Foundational monograph: RLCT, resolution of singularities, free energy asymptotics, generalization error | Cambridge Univ. Press |
| Mathematical Theory of Bayesian Statistics | S. Watanabe | 2018 | Extended monograph: WAIC/WBIC derivations, phase transitions in Bayesian inference | Routledge |
| Equations of States in Singular Statistical Estimation | S. Watanabe | 2010 | Fundamental asymptotic relations linking Bayes generalization error, training error, free energy | arXiv:0712.0653 |
| A Widely Applicable Bayesian Information Criterion | S. Watanabe | 2013 | WBIC: singular-model-aware BIC replacement via tempered posterior average | arXiv:1208.6338 |
| Recent Advances in Algebraic Geometry and Bayesian Statistics | S. Watanabe | 2022 | 20-year review of SLT: birational methods, renormalized posteriors, universal formula | arXiv:2211.10049 |
| Review: Stat Mech–ML Equivalence | S. Watanabe | 2024 | Algebraic research program; phase transitions; AI alignment via free energy | arXiv:2406.10234 |
| Deep Learning is Singular, and That’s Good | Murfet, Wei et al. | 2021 | Neural nets as singular models; SLT for DL; RLCT experiments | arXiv:2010.11560 |
| The Local Learning Coefficient | Lau et al. | 2023 | Scalable SGLD-based RLCT estimator; detects phase transitions in transformers | arXiv:2308.12108 |
| Classification of Real Hyperplane Singularities by RLCT | Lau, Wiesmann | 2024 | Combinatorial SageMath algorithm for RLCT of hyperplane arrangement polynomials | arXiv:2411.13392 |

### Thread 2: Expressivity and Algebraic Complexity

| Reference | Authors | Year | What It Covers | Link |
|-----------|---------|------|----------------|------|
| Neuroalgebraic Geometry | TBD | 2025 | Expository overview: neuromanifolds, dimension/degree/singularities/fibers/EDD | arXiv:2501.18915 |
| Geometry of Polynomial Neural Networks | Kubjas et al. | 2024 | Neurovariety dimension/degree; learning degree as training complexity | arXiv:2402.00949 |
| Algebraic Complexity and Neurovariety of Linear Convolutional Networks | Shahverdi | 2024 | Neuromanifold of 1-D linear CNNs = semialgebraic set; its EDD equals the ED degree of a Segre variety | arXiv:2401.16613 |
| Activation Degree Thresholds and Expressiveness | Finkel et al. | 2024 | Activation threshold: minimal degree at which neurovariety achieves maximum dimension | arXiv:2408.04569 |
| On the Expressive Power of Deep Learning: A Tensor Analysis | Cohen et al. | 2016 | Deep CNNs ↔ hierarchical Tucker decompositions; exponential depth separation via tensor rank | arXiv:1509.05009 |
| On the Number of Linear Regions of Deep Neural Networks | Montúfar et al. | 2014 | Deep ReLU networks carve exponentially more linear regions than shallow; piecewise-linear complexity | arXiv:1402.1869 |
| Benefits of Depth in Neural Networks | Telgarsky | 2016 | Sawtooth functions: expressible at depth $k^3$ but requiring $\Omega(2^k)$ width at depth $O(k)$ | arXiv:1602.04485 |
| The Euclidean Distance Degree of an Algebraic Variety | Draisma et al. | 2016 | Defines EDD; counts complex critical points of squared distance; bounds real optima | arXiv:1309.0049 |
| An Introduction to Tropical Geometry | Maclagan, Sturmfels | 2015 | Tropical varieties and max-plus algebra; applied to ReLU linearization | arXiv:1502.05950 |
| Algebraic Complexity Theory | Bürgisser, Clausen, Shokrollahi | 1997 | Tensor rank, circuit complexity, Strassen’s theorem | Springer |
| Why Does Deep and Cheap Learning Work So Well? | Lin et al. | 2017 | Hierarchical polynomial structure in physics maps to efficient deep networks; RG analogy | arXiv:1608.08225 |

### Thread 3: Loss Landscape Geometry

| Reference | Authors | Year | What It Covers | Link |
|-----------|---------|------|----------------|------|
| Loss Surface of Deep Linear Networks via Algebraic Geometry | Mehta et al. | 2018 | Numerical AG (homotopy continuation) enumerates all stationary points; algebraic degree bounds | arXiv:1810.07716 |
| Geometry of the Loss Landscape: Symmetries and Invariances | Simsek et al. | 2021 | Permutation symmetry generates structured manifold of minima; one extra neuron connects all minima | arXiv:2105.12221 |
| Connectedness of Loss Landscapes via Morse Theory | Akhtiamov, Thomson | 2023 | Morse theory for mode connectivity; saddle-point index structure governs path-connectivity | PMLR v197 |
| Symmetries of Neural Networks | Brea, Gerstner, Urbanczik | 2019 | Permutation and scaling symmetries; fiber structure of MLP parameter space | arXiv:2106.10255 |
| Flat Minima | Hochreiter, Schmidhuber | 1997 | Flat-minima hypothesis with MDL/Bayesian argument; precursor to sharpness-aware geometry | bioinf.jku.at |
| Morse Theory | Milnor | 1963 | Canonical reference: deformation retracts, index theory, CW-complex reconstruction | Princeton Univ. Press |

### Thread 4: Algebraic Statistics

| Reference | Authors | Year | What It Covers | Link |
|-----------|---------|------|----------------|------|
| Algebraic Statistics | Sullivant | 2018 | Comprehensive AMS textbook: algebraic exponential families, ML degree, identifiability | AMS GSM 194 |
| Lectures on Algebraic Statistics | Drton, Sturmfels, Sullivant | 2009 | Compact introduction to MLE on varieties, likelihood geometry, graphical models | Birkhäuser |
| Likelihood Geometry | Huh, Sturmfels | 2014 | ML degree as Euler characteristic of very affine variety; toric and determinantal models | arXiv:1305.7462 |
| Algebraic Statistics for Computational Biology | Pachter, Sturmfels | 2005 | Gröbner bases, toric models, tropical geometry for genomics; graphical models and identifiability | Cambridge Univ. Press |
| Learning Algebraic Varieties from Samples | Breiding et al. | 2018 | Persistent-homology pipeline to recover dimension/degree/equations of a variety from a point cloud | arXiv:1802.09436 |

### Background / Cross-Cutting

| Reference | Authors | Year | What It Covers | Link |
|-----------|---------|------|----------------|------|
| Ideals, Varieties, and Algorithms | Cox, Little, O’Shea | 2015 | Standard intro: Gröbner bases, Nullstellensatz, primary decomposition, elimination theory | Springer |
| Solving Systems of Polynomial Equations | Sturmfels | 2002 | Resultants, Bernstein’s theorem, Nash equilibria, AG of statistical models | AMS CBMS 97 |
| Information, Physics, and Computation | Mézard, Montanari | 2009 | Stat mech of disordered systems; belief propagation, cavity method; phase transitions in large NNs | Oxford Univ. Press |
| Tensor Decompositions and Applications | Kolda, Bader | 2009 | Tucker/CP decompositions; tensor rank; algebraic complexity | SIAM Review |
| IPAM Workshop: AG — A Window to ML | IPAM | 2024 | Community overview: grokking, neural collapse, LoRA, network verification via AG | IPAM |
| Understanding Deep Learning Requires Rethinking Generalization | Zhang et al. | 2017 | Empirical motivation for SLT: classical bounds fail for neural nets | arXiv:1611.03530 |
| Grokking: Generalization Beyond Overfitting | Power et al. | 2022 | Delayed generalization; one of the IPAM open problems | arXiv:2201.02177 |