NeurIPS 2022
Rough Notes
Cicero (Meta)
- piKL objective (AlphaStar used it to aid exploration; here it is used to model the other players / encourage human-like behaviour) - sketch of the objective after this list.
- Difference between lying (not done for AI-safety reasons) and withholding information -> Noam: never having it lie improved performance dramatically, although per professional human players, occasionally lying may help.
- Data augmentation to filter illegal moves (I guess this is in the paper)
- Dora vs Cicero cases
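My reconstruction of the piKL idea (my notation, not the slides): regularize the utility-maximizing policy toward a human imitation ("anchor") policy \(\tau\),
\[
\pi^{*} = \arg\max_{\pi}\; \mathbb{E}_{a \sim \pi}[u(a)] - \lambda\, D_{\mathrm{KL}}(\pi \,\|\, \tau),
\]
which gives \(\pi^{*}(a) \propto \tau(a)\, e^{u(a)/\lambda}\): small \(\lambda\) recovers pure utility maximization, large \(\lambda\) pins play to the human-like anchor.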
UQ (IBM)
One-way comm
- Infinitesimal jackknife?
- Some coverage metrics: PICP (prediction interval coverage probability), MPIW (mean prediction interval width) - sketch after this list.
- Monotonic selective risk (MSR) - error decreases monotonically for every subgroup - (#DOUBT how is a subgroup defined?)
- Upper bound on cond. MI used in the eventual objective that imposes the sufficiency criterion, which is the theorem they proved.
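For my own reference, a minimal sketch of the two coverage metrics above (my code, generic definitions rather than anything from the talk):

```python
import numpy as np

def picp(y, lower, upper):
    """Prediction Interval Coverage Probability: fraction of targets inside their interval."""
    return np.mean((y >= lower) & (y <= upper))

def mpiw(lower, upper):
    """Mean Prediction Interval Width: average width of the predicted intervals."""
    return np.mean(upper - lower)
```

High PICP is trivial with huge intervals, which is presumably why MPIW is reported alongside it.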
Two-way comm
- Learning rejector and classifier concurrently - they cast the MILP as a differentiable relaxation - (#DOUBT relation between relaxing MILPs and Gumbel-softmax trick).
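On the #DOUBT above, my rough guess (not from the talk) at how the discrete classify/defer variable in the MILP could be relaxed with Gumbel-softmax so everything trains end-to-end; all names here are hypothetical:

```python
import torch.nn.functional as F

def relaxed_defer_loss(clf_logits, defer_logits, y, defer_cost=0.3, tau=0.5):
    """Soft 'classify or defer': a Gumbel-softmax sample over the two options
    weights the classifier loss against a fixed deferral cost, replacing the
    hard binary variable of the MILP with a differentiable one."""
    gate = F.gumbel_softmax(defer_logits, tau=tau)   # (batch, 2): [classify, defer]
    clf_loss = F.cross_entropy(clf_logits, y, reduction="none")
    return (gate[:, 0] * clf_loss + gate[:, 1] * defer_cost).mean()
```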
Richer comm
- Their disentangled model - (#TODO Look into exact mechanism to encourage disentanglement)
- See the UQ360 paper; I assume all/most other papers mentioned here will have some relation to it.
Learning agents (LG)
- keyword: Compositional generalizability (w/ hierarchical RL)
- nonlinearity - boiled cabbage vs. fried egg example.
- hypothesis ranking, and popularity and recency bias.
- language priors? from LLMs?
- EXAONE for scientific discovery, makes use of actual academic papers. CDEd paper.
Human-in-the-Loop (Toloka)
- Due to data drifts, anomalies etc.
Keynote 2
- Prediction Policy Problems Kleinberg et al. 2015
- Keywords: algorithmic auditing, regression discontinuity design
- Perdomo et al. 2022 - dropout crisis
- Knowles et al. 2015 - Dropout Early Warning System
- individual vs environment and malleable vs. non-malleable grouping reveals impact of system level factors
- Kirchner, ProPublica 2017 - probabilistic genotyping of DNA (DNA mixture interpretation paper by NIST)
Keynote 3
- Conformal prediction (missed)
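Since I missed this, a generic split-conformal regression sketch for my notes (the textbook recipe, not necessarily what was presented; function names are mine):

```python
import numpy as np

def split_conformal_interval(calib_residuals, test_pred, alpha=0.1):
    """Split conformal prediction: the (1-alpha) quantile of held-out absolute
    residuals gives intervals with marginal coverage >= 1 - alpha."""
    n = len(calib_residuals)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)  # finite-sample correction
    q = np.quantile(calib_residuals, level)
    return test_pred - q, test_pred + q
```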
Keynote 4
- Model centric, data centric and human centric AI.
- Semantics are not captured well - at least not enough to get to the stickman-eating-ramen example.
- What makes a good user interface?
- Training language models to follow instructions with human feedback - can we perform something like experimental design after considering the LLM to be a prior?
- Some challenges: Incentivizing users to work w AI.
- Michael Bernstein (Stanford) 2 cultures of evaluation in AI vs HCI.
Town hall
- "Exhibitors" instead of "Sponsors" - no tiers, no emphasis on the scientific aspect.
- Stats: In person - 10300, virtual 3160. for reference, 2010 had 1354 participants. 9634 submissions, 2905 accepted.
Keynote -2
- Bias in the data generation.
- All datasets are biased, some are useful.
- Dataset selection Neurips2002
- OpenML, HuggingFace and ofc Kaggle
- Big-Bench
- Reduced, Reused and Recycled NeurIPS 2021 D&B
CML-4-Impact
- Some hot topics: partial discovery, needing high quality simulators
- Keywords: sequential discovery learns from some treatment then propagates outwards.
- Counterfactual generated by the machine as an expression of what it thinks something looks like, in its language
- go.topcoder.com/schmidtcentercancerchallenge - cells here are imagined to be in steady state.
- Human in the loop to transcend Kleinberg's theorem?
Questions from poster session
- Is this prior elicitation?
- Difference between graphical model and actual causal DAG
Information theory and cognitive science
Talk 1 Palmer
- Things aren't driven to optimality; there are only "best" sensors in comparison to other things.
- Flash lag illusion
- Optimal prediction slide references - the loss function \(I_{\text{past}} - \beta I_{\text{future}}\) is also important, or rather \(I_{\text{future}}\) as another information-related value, like for cognitive (something).
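Writing out how I read that objective (my notation): for a representation \(Z\) of the past input \(X_{\text{past}}\), the bottleneck tradeoff is
\[
\min_{p(z \mid x_{\text{past}})}\; I(Z; X_{\text{past}}) - \beta\, I(Z; X_{\text{future}}),
\]
i.e. compress the past as much as possible while keeping the bits that actually predict the future, with \(\beta\) setting the exchange rate between the two.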
Talk 2 Gornet
- Intuition: spatial structure influences temporal correlations
- Visual maps relation to reasoning slide
Talk 3 Sajid
- lots of evidence for a tradeoff between exploration and exploitation, e.g. in monkeys
- amortized Active inf?
Talk 5 Tatyana
- Hyperbolic geometry as a tool to represent hierarchical, tree-like networks (distance formula after this list)
- There is hierarchical structure in decision making
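For reference, the standard Poincaré-ball distance used for embedding tree-like hierarchies (textbook formula, not something shown in the talk):
\[
d(u, v) = \operatorname{arcosh}\!\left(1 + 2\,\frac{\lVert u - v\rVert^{2}}{\left(1 - \lVert u\rVert^{2}\right)\left(1 - \lVert v\rVert^{2}\right)}\right);
\]
distances blow up near the boundary of the ball, which is what lets exponentially growing trees embed with low distortion.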
Talk 6
- Ask for his thesis
Talk 7 Information is not enough
- George Miller's paper is important
- Duncan Luce - "Whatever happened to information theory in psychology?"
- Ma, Husain, Bays 2014 Nature Neuroscience
- paper: in Neuron - Deep RL and its neuroscientific implications
- Fang Z. 2022 thesis - Learning generalizable representations through compression
- keyword: rate distortion RL - specific alg: rate distortion policy gradient - RD multi-agent
- paper: human RL w visual information
Lightning talks, spotlight
- Emergent communication (EC): using info-theoretic perspectives to make EC generalizable and translatable
- Generalizing w/ overly complex representations: against the simplicity bias - great slide just before the limitations
- soft labels learning more than true labels (see also "On the informativeness of supervision signals") - toy sketch after this list
- unsupervised machine translation: given a good prior, it's possible to PAC-learn a translator
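Toy illustration of the soft-label point above (my code, not the paper's): train against a full label distribution instead of a one-hot target, so mass on wrong-but-related classes still carries signal.

```python
import torch.nn.functional as F

def soft_label_loss(logits, soft_targets):
    """Cross-entropy against a probability vector (soft labels) rather than a
    one-hot class index; reduces to the usual loss when targets are one-hot."""
    return -(soft_targets * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
```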
Andrew Gordon Wilson: when Bayesian orthodoxy goes wrong
- Sequence completion example for Occam's razor
- Modified Newtonian mechanics vs. General Relativity
- Marginal likelihood \(\neq\) generalization - the marginal likelihood is not the thing to optimize if you want to generalize (decomposition after this list)
- Marginal likelihood is great for scientific hypothesis testing
- What about normalizing images in the BMA for MNIST?
- What exactly is M -> empire strikes back if we assume a distribution over DAGs like from the Indian Chef's process?
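How I think about the marginal-likelihood point above (standard chain-rule identity, not from the slides):
\[
\log p(\mathcal{D} \mid \mathcal{M}) = \sum_{i=1}^{n} \log p(d_i \mid d_{1:i-1}, \mathcal{M}),
\]
so it scores how well the prior predicted the data sequentially, early terms included, whereas generalization after training only cares about the late-data terms - hence the two can disagree.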
Kun Zhang
- 3 dims (in causal disc. w/ obs data) are iid, parametric assumptions, and latent confounders
- wrong direction -> non-independent noise is the main principle (toy check after this list)
- Zhang has NeurIPS papers in 2019, '20, '22
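Toy check of the "noise independence breaks in the wrong direction" principle (my own sketch with a crude dependence score; not Zhang's code):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = x + 0.5 * x**3 + rng.uniform(-1, 1, size=5000)  # cause -> effect with additive independent noise

def residual_dependence(a, b):
    """Regress b on a cubic polynomial of a; return |corr(residual^2, a^2)| as a
    rough measure of dependence between residual and regressor."""
    A = np.vander(a, 4)
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    resid = b - A @ coef
    return abs(np.corrcoef(resid**2, a**2)[0, 1])

print("x -> y:", residual_dependence(x, y))  # near zero: residual ~ independent of x
print("y -> x:", residual_dependence(y, x))  # typically larger in the anti-causal direction
```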
Panel disc.
- A benchmark does not test hypotheses
- Samy Bengio's "changing test set" idea relates to the "do imagenet classifiers generalize to imagenet" paper
- a good paper is simple and summarizable in 2ish sentences, and also scientifically rigorous