#063 - Prof. YOSHUA BENGIO - GFlowNets, Consciousness Causality - 2022
Details
Title: #063 - Prof. YOSHUA BENGIO - GFlowNets, Consciousness Causality
Author(s): Machine Learning Street Talk
Link(s): https://www.youtube.com/watch?v=M49TMqK5uCE
Rough Notes
Generative Flow Networks (GFlowNets) provide a framework for active learning. An expensive oracle is queried, and we want to approximate it such that:
- The approximation carries some notion of uncertainty.
- Queries are used efficiently.
- The queries are diverse.
MCMC can fail in high dimensions: many complex distributions have modes separated by regions of very low probability density, so mixing between modes can become exponentially expensive in the dimension (#TODO Find citation). This raises the question: can we exploit structure in the distribution to escape this exponential cost? That is, can we achieve systematic generalization - generalizing far from the data in a meaningful way?
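The mode-mixing failure can be seen even in one dimension (a toy illustration, not from the talk): a random-walk Metropolis sampler on a mixture of two well-separated Gaussians, started in one mode, essentially never crosses the low-density gap to the other.

```python
import math
import random

random.seed(0)

def log_target(x):
    # Mixture of two unit-variance Gaussians centred at -10 and +10,
    # separated by a region of extremely low probability density.
    return math.log(
        0.5 * math.exp(-0.5 * (x + 10) ** 2)
        + 0.5 * math.exp(-0.5 * (x - 10) ** 2)
    )

def metropolis(x0, steps, step_size=1.0):
    x, samples = x0, []
    for _ in range(steps):
        proposal = x + random.gauss(0.0, step_size)
        # Accept with probability min(1, p(proposal) / p(x)).
        if math.log(random.random()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(x0=-10.0, steps=5000)
# The chain stays trapped in the left mode and never visits the mode at +10,
# even though that mode holds half the probability mass.
print(max(samples))
```

In high dimensions the same effect compounds across coordinates, which is the intuition behind the exponential cost noted above.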
GFlowNets need:
- A reward function.
- Deterministic episodic environment.
GFlowNets are trained to sample proportionally to the reward function, i.e. terminal states are sampled with probability proportional to their reward (unlike standard approaches such as AlphaGo, which choose actions to maximize expected reward, so that trajectories concentrate on the highest-reward state(s)).
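A minimal illustration of the distinction, with made-up reward values (this uses exact weighted sampling as a stand-in for a trained GFlowNet sampler): reward-proportional sampling visits every object at a frequency R(x)/Z, while reward maximization always returns the single best object. The normalizer Z is the partition function.

```python
import random
from collections import Counter

random.seed(0)

# Toy reward over four terminal objects (hypothetical values).
reward = {"a": 1.0, "b": 2.0, "c": 4.0, "d": 1.0}
Z = sum(reward.values())  # partition function, here Z = 8.0

# Reward-maximizing (AlphaGo-style) selection: always the same object.
best = max(reward, key=reward.get)  # -> "c"

# GFlowNet-style sampling: each object drawn with probability R(x) / Z,
# so lower-reward but still-interesting objects keep being proposed.
draws = random.choices(list(reward), weights=list(reward.values()), k=100_000)
freq = Counter(draws)
for x in sorted(reward):
    print(x, round(freq[x] / len(draws), 3), "target:", reward[x] / Z)
```

The diversity of the samples is the point: in active-learning settings we want many distinct high-reward candidates, not one argmax.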
Bengio: "It's a generic learnable inference for probabilistic ML". GFlowNets can be thought of as a learnable replacement for MCMC. They can also be used to estimate intractable quantities such as the partition function, exploiting the generalization power of NNs.
Imagine a Galton board where each particle has probability 0.5 of falling to either side of each peg - empirically, the particles converge to a Gaussian distribution. By tuning the probability at each peg, we can generate any distribution over the bins. A GFlowNet can be thought of as using a NN that takes a peg and outputs its corresponding parameter; this allows generalization, as the NN shares "statistical strength"/information across all possible states and so can generalize to areas we have not seen.
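The Galton-board picture can be simulated directly (a sketch with made-up parameters; a real GFlowNet would amortize the per-peg probabilities with a NN rather than tabulate them): with every peg at 0.5 the bin counts approach the binomial/Gaussian shape, and biasing the pegs reshapes the distribution.

```python
import random
from collections import Counter

random.seed(0)

def galton(p_right, rows, n_particles):
    """Drop particles through `rows` rows of pegs.
    `p_right(row, pos)` gives the probability of bouncing right
    at the peg in row `row`, horizontal position `pos`."""
    bins = Counter()
    for _ in range(n_particles):
        pos = 0
        for row in range(rows):
            if random.random() < p_right(row, pos):
                pos += 1
        bins[pos] += 1
    return bins

# Unbiased pegs: counts approximate Binomial(10, 0.5), i.e. roughly Gaussian.
fair = galton(lambda row, pos: 0.5, rows=10, n_particles=20_000)

# Biasing every peg to the right shifts the whole distribution.
biased = galton(lambda row, pos: 0.8, rows=10, n_particles=20_000)

mean = lambda b: sum(k * v for k, v in b.items()) / sum(b.values())
print(mean(fair), mean(biased))  # near 5.0 and 8.0
```

Replacing the lookup `p_right(row, pos)` with a NN is what lets the model share information across pegs (states) and generalize to states never visited during training.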
GFlowNets can also be used to estimate entropies.
Sampling trajectories is like sampling thoughts.
GFlowNets for drug discovery are bandits, but with a combinatorial action space - one GFlowNet acquires knowledge about the reward function from the real world; another uses it to control the policy that does the search, with a reward given by the uncertainty reduction the query will yield.
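That outer loop can be sketched schematically (everything here is hypothetical: the bit-string candidates, the nearest-neighbour proxy, and the distance-based uncertainty term are stand-ins, and reward-proportional sampling stands in for a trained GFlowNet policy):

```python
import random

random.seed(0)

# Hypothetical search space: length-8 bit strings; the hidden oracle plays
# the role of the expensive real-world experiment.
def oracle(x):
    return sum(x)  # unknown to the learner; each query is costly

candidates = [tuple((i >> b) & 1 for b in range(8)) for i in range(256)]

data = {}          # queried candidates -> oracle scores

def proxy(x):      # crude learned reward model: nearest queried neighbour
    if not data:
        return 1.0
    nearest = min(data, key=lambda q: sum(a != b for a, b in zip(q, x)))
    return data[nearest]

def uncertainty(x):  # stand-in acquisition term: distance to queried data
    if not data:
        return 1.0
    return min(sum(a != b for a, b in zip(q, x)) for q in data)

budget = 16
for _ in range(budget):
    pool = [x for x in candidates if x not in data]
    # Sample in proportion to (proxy reward + uncertainty bonus), so the
    # search stays diverse instead of collapsing onto one candidate.
    weights = [proxy(x) + uncertainty(x) for x in pool]
    x = random.choices(pool, weights=weights, k=1)[0]
    data[x] = oracle(x)  # query the expensive oracle

print(len(data), max(data.values()))
```

The uncertainty bonus is what rewards the search policy for queries whose answers the proxy cannot yet predict, matching the uncertainty-reduction objective described above.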
When we want systems to help users, the human is driving. We need systems that explicitly model uncertainty about what the user needs/wants; GFlowNets can represent such models efficiently.
In scientific discovery: scientists plan experiments that reduce uncertainty about some aspect of the world - there is a sequence of questions they can ask nature, and we want to ask them as efficiently as possible.
GFlowNets are a good approach to implementing System 2 inductive biases - one of which is that we think causally.