16 Dec 2019

NeurIPS 2019 presentations that caught my eye

NeurIPS 2019 is the 33rd annual conference on Neural Information Processing Systems. Once again the recordings are online, and here are a few that caught my eye.

Overview

I’ve linked to the individual videos below, along with my notes, but in summary:

  • Machine Learning for Computational Biology and Health is a wide-ranging tour of ML in those domains.
  • From System 1 Deep Learning to System 2 Deep Learning outlines a roadmap for deep learning to address symbolic (conscious) reasoning.
  • Representation Learning and Fairness presents a modular framework for algorithmic fairness.

Machine Learning for Computational Biology and Health (Anna Goldenberg, Barbara Engelhardt)

A two-hour firehose tutorial on applying ML in biology research and in clinical settings.

It’s great to get a run-down of the challenges (large feature sets with small sample sizes, and data that is missing not at random, to name two) and a tour of applications. And it’s an extensive tour: genetics, epigenetics, transcriptomics, proteomics, and (briefly) the microbiome.

The second half focuses more on clinical data: patient modelling, predicting (for example) heart attacks, and problems and lessons from working with clinicians. This quote was great, as a reminder that nicely cleaned-up data isn’t going to cut it in the clinic:

“This is actually a must. If you’re working in a clinical context, irregularly sampled data is the only kind of data you will get. Irregular sampled with messiness. There is no way around that.” (1:27:25)
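
As a toy illustration of what such data looks like (my own sketch, not from the tutorial; the readings and timestamps are made up), pandas can regularise an irregularly sampled series onto a fixed grid, though interpolating the gaps away is exactly the kind of tidying the speakers warn against:

```python
import pandas as pd

# Hypothetical heart-rate readings taken at irregular times.
readings = pd.Series(
    [72, 75, 90, 88],
    index=pd.to_datetime([
        "2019-12-16 08:00", "2019-12-16 08:07",
        "2019-12-16 09:45", "2019-12-16 12:30",
    ]),
)

# Resample onto an hourly grid; the gaps become NaN (and are not missing
# at random!), which we interpolate here purely for illustration.
hourly = readings.resample("1h").mean().interpolate(method="time")
print(hourly)
```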

Papers I need to chase down:

Related events:

  • SAIL: Symposium on Artificial Intelligence for Learning Health Systems
  • ACM CHIL: ACM Conference on Health, Inference, and Learning
  • ML4HC: Machine Learning for Healthcare
  • ProbGen: Probabilistic Modeling in Genomics Conference
  • Fair ML for health

Watch the recording.

From System 1 Deep Learning to System 2 Deep Learning (Yoshua Bengio)

It’s been a long time since I heard anyone talking about how symbolic processing fits into neural-style computation. There’s been a lot going on in this area, but the last time I touched on this seriously was back in the 1990s with Pollack’s Recursive Distributed Representations work.

The talk outlines a series of challenges, and a way through them, for deep learning to move toward System 2 cognition. In particular, it’s not a matter of getting more data and building bigger models. Instead, it’s about working with variables and trying to capture higher-level causation (beyond what you can find in pixel-level data).

Interestingly, Bengio isn’t looking at bolting symbolic systems on top of networks. He wants to implement symbolic processing in a neural architecture.

Papers for me to follow up on:

Here’s the recording.

Representation Learning and Fairness (Sanmi Koyejo)

The focus here is on separating fairness from the machine learning model. The fairness components (data regulator, data producer) sit before the model, so the model itself can ignore fairness. It’s a fascinating idea, explored in this two-hour tutorial.

It is not magic: as the “data regulator” you have to define fairness. One example given was individual fairness, where similar inputs (according to some metric you define) should produce similar outputs, as sketched below.
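
Individual fairness is often formalised as a Lipschitz condition: the distance between two predictions should be bounded by the distance between the corresponding inputs. A minimal sketch of such a check (my code, not the tutorial’s; the model and both metrics are placeholders the data regulator would supply):

```python
def individual_fairness_violations(model, X, input_metric, output_metric, L=1.0):
    """Count pairs whose outputs differ by more than L times their input distance.

    `model`, `input_metric`, `output_metric`, and the Lipschitz constant `L`
    are all hypothetical stand-ins for whatever the data regulator defines.
    """
    preds = model.predict(X)
    violations = 0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if output_metric(preds[i], preds[j]) > L * input_metric(X[i], X[j]):
                violations += 1
    return violations
```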

With a way to measure fairness, the “data producer” can then transform data sets to learn a “fair” representation. This is where the “representation learning” of the title comes in. Representation learning is one way to compute a data summary, going from a high-dimensional representation to a lower-dimensional one (as PCA does). The idea is that points which should be treated similarly move closer together in the representation space; learning this transformation is the job of the “data producer”.
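
The tutorial is about learned fair representations, but the dimensionality-reduction half of the analogy is easy to see with plain PCA (my example, using scikit-learn; a real “data producer” would learn a projection that also pulls similar individuals together):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))   # 200 individuals, 50 raw features

# Summarise the data in 5 dimensions. This is only the "summary" part;
# a fair representation would additionally be trained so that individuals
# who should be treated similarly land close together in the 5-d space.
producer = PCA(n_components=5).fit(X)
Z = producer.transform(X)        # the representation handed to the data user
print(Z.shape)                   # (200, 5)
```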

The “data user” learns a model to perform some task (credit rating, for example) using the sanitised data from the data producer. You’ll want to audit the data user (e.g., the predictions it makes) to check they are fair as defined by the data regulator.
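
An audit can be as simple as comparing decision rates across groups. Here’s a minimal demographic-parity check (illustrative only; the regulator’s actual criterion could be quite different):

```python
import numpy as np

def demographic_parity_gap(predictions, group):
    """Absolute difference in positive-decision rates between two groups.

    `predictions` are the data user's binary decisions and `group` is a
    binary protected attribute; both are hypothetical example inputs.
    """
    predictions = np.asarray(predictions)
    group = np.asarray(group)
    return abs(predictions[group == 0].mean() - predictions[group == 1].mean())

# demographic_parity_gap([1, 0, 1, 1], [0, 0, 1, 1]) -> 0.5
```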

It appears the trick here is trading off fairness and performance.
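
One common way to make that trade-off explicit (my summary, not necessarily the tutorial’s formulation) is a weighted objective, minimising L_task(θ) + λ · L_fair(θ), where λ controls how much predictive performance you’re willing to give up for fairness.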

Follow-up items:

This is an entire area I didn’t realise had advanced so far. Go watch the tutorial if you’re interested. It’s really well done.

There’s much more

There are over 250 presentations online already. Topics cover technical aspects of ML, but there are also many domain-specific talks: climate change, creativity and design, and what looks to me like a lot of health and biology content.