Categories for AI

Why Category Theory?

Much of science starts out ad hoc, with deeper understanding coming later. Deep learning is an example of this, full of ad-hoc design choices. The claim of the course organizers is that category theory will become the unifying framework for deep learning.

What is category theory?

  • Tom Leinster summarizes it as taking a bird's-eye view of mathematics: the intricate details vanish, but patterns undetectable from the ground become visible.

What is applied category theory?

  • A particular way to structure knowledge, based on the idea of compositionality. It is a formal language for connecting different scientific areas, e.g. physics, chemistry, systems theory, functional programming, game theory, information theory, cryptography, etc.

The aims of this course are to:

  • Teach how to approach category theory.
  • Give practical examples from deep learning.
  • Provide a sense of the philosophy behind it.

Rough outline of content:

  • Week 2: Categories and functors.
  • Categorical dataflow: optics and lenses as data structures for backpropagation.
  • Geometric deep learning and naturality.
  • Monoids, Monads, Mappings and LSTMs.
  • Guest lectures.

Today's content:

  • What is compositionality?
  • What is category theory?
  • What is needed to start learning it and taking advantage of it.
  • What category theory can do for deep learning.

Compositionality is often misunderstood as merely the ability to build systems by composing them out of smaller subsystems. But there is more to it than that; compositionality is:

  1. The ability to build systems by composing them out of smaller subsystems.
  2. The ability to reason about the resulting system in terms of all its components.

It is not a property of the system itself, but of our models of the system.

Compositionality requires both of the properties above. For example, our models of markets, economies, neural networks, etc. satisfy 1 but not 2, whereas our models of differentiable functions, compilers, Markov kernels, polynomials, etc. are compositional.
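
As a concrete instance of the compositional side, here is a minimal Haskell sketch of polynomials as coefficient lists (the names Poly, evalP, addP, mulP, and composeP are made up for illustration, not from the lecture). Composing two polynomials yields another polynomial whose coefficients are computed from the components alone, so the composite can be reasoned about purely in terms of its parts (property 2).

  -- Polynomials as coefficient lists [a0, a1, a2, ...].
  type Poly = [Double]

  -- Evaluate p at x using Horner's rule.
  evalP :: Poly -> Double -> Double
  evalP p x = foldr (\c acc -> c + x * acc) 0 p

  -- Pointwise addition of coefficient lists.
  addP :: Poly -> Poly -> Poly
  addP [] q = q
  addP p [] = p
  addP (a:p) (b:q) = (a + b) : addP p q

  -- Polynomial multiplication (convolution of coefficients).
  mulP :: Poly -> Poly -> Poly
  mulP [] _ = []
  mulP (a:p) q = addP (map (a *) q) (0 : mulP p q)

  -- Composition p(q(x)) by Horner's rule: the coefficients of
  -- the composite come from the components alone (property 2).
  composeP :: Poly -> Poly -> Poly
  composeP p q = foldr (\c acc -> addP [c] (mulP q acc)) [] p

  main :: IO ()
  main = do
    let p = [1, 0, 1]               -- 1 + x^2
        q = [0, 2]                  -- 2x
    print (composeP p q)            -- [1.0,0.0,4.0], i.e. 1 + 4x^2
    print (evalP (composeP p q) 3)  -- 37.0, same as evalP p (evalP q 3)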

Relating neural networks (said above not to be compositional) and differentiable functions (said above to be compositional): the difference is that compositionality has to be defined with respect to something. If our model treats neural networks as differentiable functions, then they are compositional; if it treats them as generative/discriminative models, they are not. In this sense, compositionality is about (exposed) interfaces. Non-compositionality means we need extra information that is not available from the interface. This makes scaling harder, since we interact with the system only through its interface and are left with uncertainty about the system's behaviour.
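
To make the interface point concrete, here is a hedged Haskell sketch (the Diff type, its field names, and composeD are hypothetical, not from the lecture). A differentiable function is exposed only through two things, its forward value and its derivative, and composition via the chain rule needs nothing beyond those two interfaces, so this view of a network is compositional.

  -- A differentiable function, exposed only through its interface:
  -- a forward value and a derivative at each input point.
  data Diff = Diff
    { eval  :: Double -> Double  -- forward pass: f(x)
    , deriv :: Double -> Double  -- derivative:   f'(x)
    }

  -- Composition is defined entirely from the components' interfaces,
  -- using the chain rule: (g . f)'(x) = g'(f x) * f'(x).
  composeD :: Diff -> Diff -> Diff
  composeD g f = Diff
    { eval  = \x -> eval g (eval f x)
    , deriv = \x -> deriv g (eval f x) * deriv f x
    }

  -- Example components: squaring and sine.
  squareD, sinD :: Diff
  squareD = Diff (\x -> x * x) (\x -> 2 * x)
  sinD    = Diff sin cos

  main :: IO ()
  main = do
    let h = composeD sinD squareD  -- h(x) = sin (x^2)
    print (eval h 1.0)             -- sin 1   ~ 0.841
    print (deriv h 1.0)            -- 2*cos 1 ~ 1.081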

Category theory studies the relationships between objects rather than the objects themselves.
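
As a minimal illustration of that slogan, assuming only Haskell's standard Control.Category class: a category records identities and associative composition of morphisms, and says nothing about what the objects look like internally.

  import Prelude hiding (id, (.))
  import Control.Category

  -- Morphisms carry all the data; objects are just the type indices.
  newtype Fn a b = Fn (a -> b)

  instance Category Fn where
    id = Fn (\x -> x)                 -- an identity morphism on every object
    Fn g . Fn f = Fn (\x -> g (f x))  -- associative composition

  -- The category laws are equations between morphisms, stated without
  -- ever inspecting the objects themselves:
  --   id . f = f,   f . id = f,   (h . g) . f = h . (g . f)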
