Bookmarks for 2024-05-27

2 bookmarks were saved this day.

18.

Geometric Deep Learning and Equivariant Neural Networks

arxiv.org/pdf/2105.13926

We survey the mathematical foundations of geometric deep learning, focusing on group equivariant and gauge equivariant neural networks. We develop gauge equivariant convolutional neural networks on arbitrary manifolds M using principal bundles with structure group K and equivariant maps between sections of associated vector bundles. We also discuss group equivariant neural networks for homogeneous spaces M = G/K, which are instead equivariant with respect to the global symmetry G on M. Group equivariant layers can be interpreted as intertwiners between induced representations of G, and we show their relation to gauge equivariant convolutional layers. We analyze several applications of this formalism, including semantic segmentation and object detection networks. We also discuss the case of spherical networks in great detail, corresponding to the case M = S^2 = SO(3)/SO(2). Here we emphasize the use of Fourier analysis involving Wigner matrices, spherical harmonics and Clebsch-Gordan coefficients for G = SO(3), illustrating the power of representation theory for deep learning.
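
As a small illustration of what "equivariant" means in the abstract above, here is a minimal sketch for the simplest possible case: the cyclic group Z_n acting on a 1-D signal by shifts. This toy setting and the helper names (cyclic_shift, circular_conv, position_scale) are assumptions made for illustration and are not taken from the paper. A layer f is equivariant when f(shift(x)) equals shift(f(x)); circular convolution satisfies this, while a position-dependent scaling does not.

```python
import math
import random


def cyclic_shift(signal, s):
    """Act on a signal by the group element s of Z_n (shift right by s)."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]


def circular_conv(signal, kernel):
    """Circular convolution: out[i] = sum_k kernel[k] * signal[(i - k) mod n]."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i - k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]


def position_scale(signal):
    """A layer whose weights depend on position; used as a counterexample."""
    return [(i + 1) * v for i, v in enumerate(signal)]


def all_close(a, b, tol=1e-9):
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )


random.seed(0)
x = [random.gauss(0, 1) for _ in range(8)]  # toy input signal
w = [random.gauss(0, 1) for _ in range(3)]  # toy convolution kernel

# Equivariance check: convolving a shifted signal equals shifting the convolution.
conv_is_equivariant = all(
    all_close(
        circular_conv(cyclic_shift(x, s), w),
        cyclic_shift(circular_conv(x, w), s),
    )
    for s in range(len(x))
)

# The position-dependent layer fails the same check.
scale_is_equivariant = all_close(
    position_scale(cyclic_shift(x, 1)),
    cyclic_shift(position_scale(x), 1),
)

print(conv_is_equivariant, scale_is_equivariant)
```

The survey treats far more general settings (homogeneous spaces G/K and gauge symmetries on manifolds); this sketch covers only the weight-sharing intuition in the discrete, flat case.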

17.

Categorical Deep Learning: An Algebraic Theory of Architectures

arxiv.org/pdf/2402.15332

We present our position on the elusive quest for a general-purpose framework for specifying and studying deep learning architectures. Our opinion is that the key attempts made so far lack a coherent bridge between specifying constraints which models must satisfy and specifying their implementations. Focusing on building such a bridge, we propose to apply category theory (precisely, the universal algebra of monads valued in a 2-category of parametric maps) as a single theory elegantly subsuming both of these flavours of neural network design. To defend our position, we show how this theory recovers constraints induced by geometric deep learning, as well as implementations of many architectures drawn from the diverse landscape of neural networks, such as RNNs. We also illustrate how the theory naturally encodes many standard constructs in computer science and automata theory.