Machine learning-based design of proteins, small molecules and beyond
Data-driven design is making headway into a number of application areas, including protein, small-molecule, and materials engineering. The design goal is to construct an object with desired properties, such as a protein that binds to a target more tightly than previously observed. To that end, costly experimental measurements are being replaced with calls to a high-capacity regression model trained on labeled data, which can be leveraged in an in silico search for promising design candidates. The aim then is to discover designs that are better than the best design in the observed data. This goal puts machine learning-based design in a much more difficult spot than traditional applications of predictive modelling, since successful design requires, by definition, some degree of extrapolation: pushing the predictive model to its unknown limits, in parts of the design space that are a priori unknown. In this talk, I will anchor this overall problem in protein engineering, and discuss our emerging approaches to tackle it.
About the speaker:
Jennifer Listgarten has been a Professor in the Department of Electrical Engineering and Computer Science, and Center for Computational Biology, at the University of California, Berkeley, since January 2018. She is also a member of the steering committee for the Berkeley AI Research (BAIR) Lab, and a Chan Zuckerberg investigator. From 2007 to 2017 she was at Microsoft Research, based in Cambridge, MA (2014-2017), Los Angeles (2008-2014), and Redmond, WA (2007-2008). She completed her Ph.D. in the machine learning group in the Department of Computer Science at the University of Toronto, located in her home town. She has two undergraduate degrees, one in Physics and one in Computer Science, from Queen's University in Kingston, Ontario. Jennifer's research interests are broadly at the intersection of machine learning, applied statistics, molecular biology and science.
Host: Jane Qi (yq2h)