We are introducing causality to predictive models for drug discovery. Causal modeling lets us capture target- or chemotype-specific mechanisms in small, sparse data regimes.
Small Molecules
Causal-Chemprop: a structural causal model (SCM) built on Chemprop representations and additional molecular descriptors for small-molecule property prediction and optimization.
Figure 1: Parity plots for logS predictions on molecular derivatives of a quinolinyltriazole MIF inhibitor seed structure using (a) Chemprop; (b) counterfactual-based inference with Causal-Chemprop.
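To illustrate what counterfactual-based inference means in general, the sketch below runs the standard abduction, action, prediction steps on a toy linear SCM with additive noise. It is not the Causal-Chemprop implementation: the linear mechanism, the random feature vectors (stand-ins for a learned representation plus descriptors), and the function names fit_linear_mechanism and counterfactual_prediction are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_mechanism(X, y):
    """Least-squares fit of the toy mechanism y = X @ w + b; returns (w, b)."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def counterfactual_prediction(w, b, x_seed, y_seed, x_derivative):
    """Counterfactual property of a derivative, given a measured seed."""
    # Abduction: infer the seed's exogenous noise from its measured value.
    u_seed = y_seed - (x_seed @ w + b)
    # Action: swap the seed's features for the derivative's features.
    # Prediction: evaluate the mechanism on the new features, keeping the noise.
    return x_derivative @ w + b + u_seed

# Synthetic training data (hypothetical; no real molecules or measurements).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([0.5, -1.0, 2.0])
y = X @ true_w + 0.3 + rng.normal(scale=0.1, size=50)
w, b = fit_linear_mechanism(X, y)

# Hypothetical seed with a measured property, and one derivative of it.
x_seed = np.array([1.0, 0.0, -1.0])
y_seed = -2.0
x_derivative = np.array([1.0, 0.5, -1.0])
y_cf = counterfactual_prediction(w, b, x_seed, y_seed, x_derivative)
```

In this toy setting the counterfactual prediction is anchored to the seed's measurement: a derivative identical to the seed returns the measured value exactly, and any other derivative is shifted from it by the mechanism's response to the feature change.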
Enzymes
Causal-ESM/CT: an SCM built on sequence-level representations from ESM-2, an evolutionary-scale large language model, and physicochemical descriptors from PyBioMed for enzyme property prediction.
Figure 2: Parity plots for predicted fitness scores of PPAT homologs using (a) ESM-2; (b) Causal-ESM/CT. The PPAT enzymes were synthesized by DropSynth, and their activities were measured by a multiplexed functional assay.
Publications
Causal-Chemprop: Causal Molecular Machine Learning for Property Prediction and Molecular Optimization
Christian Natajaya (1), Jackson Burns (2) and Lucas Attia (2)
(1) Neopoly Ltd, London, United Kingdom
(2) Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA