Materials Informatics: Data- and Knowledge-Driven Materials Science

Today, many critical technological applications that drive the modern world, including energy, electronics, security, and the environment, rely on the design and discovery of advanced materials. Typically, these materials are multicomponent by design and have enormous complexity at the atomic scale and mesoscale. Therefore, predictive computational strategies that identify promising candidates with a desired response for experimental synthesis and characterization have the potential to accelerate the discovery and realization of new materials at scale. Our interest is to build a transformative materials-by-design research program that leverages state-of-the-art computational and experimental infrastructure to address grand-challenge problems, including (but not limited to) clean energy and additive manufacturing, enabling key technological breakthroughs that foster new materials innovation. In the process, we envision that our work will lay the foundation for an information-science-driven approach to materials design and discovery that takes into account existing empirical data (big or small), physical models, inference methods, and computer simulation tools. Currently, our three major research thrust areas are density functional theory, machine learning, and optimal learning.

  • Density Functional Theory

    Computational methods based on density functional theory (DFT) that calculate the electronic structure of solids have been pivotal in accelerating the search for and discovery of new materials. In DFT calculations, we calculate the total electronic energy of a solid, which can be decomposed into the following terms: the total kinetic energy of the electrons, the total potential energy of the electrons due to the Coulombic attraction to the nuclei, the total potential energy due to the average Coulomb repulsion between pairs of electrons, the total quantum mechanical exchange energy of the electrons, and the total correlation energy of the electrons. In our group, we use DFT calculations to determine the ground-state properties of solid-state materials. We are interested in the electronic structure, magnetic properties, and phonon properties of materials. We employ DFT as implemented in the plane-wave pseudopotential code Quantum ESPRESSO and the all-electron code WIEN2k.
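
The energy decomposition described above can be written compactly. The expression below is a sketch in the standard Kohn-Sham form, in Hartree atomic units; the symbols are our notation and do not appear in the text above. Here n(r) is the electron density, T_s is the kinetic energy of the non-interacting reference electrons, and v_ext is the potential due to the nuclei.

```latex
E[n] \;=\;
  \underbrace{T_s[n]}_{\text{kinetic}}
  \;+\; \underbrace{\int v_{\mathrm{ext}}(\mathbf{r})\, n(\mathbf{r})\, d\mathbf{r}}_{\text{electron--nucleus attraction}}
  \;+\; \underbrace{\frac{1}{2} \iint \frac{n(\mathbf{r})\, n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}'}_{\text{average electron--electron repulsion}}
  \;+\; \underbrace{E_x[n]}_{\text{exchange}}
  \;+\; \underbrace{E_c[n]}_{\text{correlation}}
```

In this form the first three terms are evaluated exactly for a given density, while the exchange and correlation terms must be approximated in practice.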

  • Machine Learning

    Our interest in data-driven machine learning (ML) methods is to establish a quantitative relationship between the descriptors that represent various aspects of a material and the target material properties. We explore a suite of methods, including dimensionality reduction, clustering analysis, classification learning, and regression, to accomplish these objectives. In addition, we are exploring Bayesian learning methods for cases where we can leverage domain expertise to constrain the problem. ML methods are especially suitable in materials science applications when either a theory does not exist or the theory is not predictive because data are lacking for all the parameters across the entire chemical space. Although model building is an important task, we are also exploring ways to quantify the prediction uncertainties of the models. It is the combination of model building and uncertainty quantification that makes our approach rigorous and critical for optimal decision making. Most of our work is performed in the R programming language. We are also interested in Python and MATLAB.
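
To make the pairing of model building and uncertainty quantification concrete, the following is a minimal sketch in Python, not the group's actual workflow. It assumes a single hypothetical descriptor, a synthetic linear property, and bootstrap resampling as the uncertainty estimate; none of these choices come from the text above, and the function names are illustrative only.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def bootstrap_predict(xs, ys, x_new, n_models=500, seed=0):
    """Predict at x_new with an ensemble of models fit on bootstrap resamples.

    Returns the ensemble mean (the prediction) and the ensemble standard
    deviation (a simple estimate of the prediction uncertainty).
    """
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    while len(preds) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if max(bx) == min(bx):
            continue  # degenerate resample (all x identical); draw again
        by = [ys[i] for i in idx]
        a, b = fit_line(bx, by)
        preds.append(a + b * x_new)
    return statistics.fmean(preds), statistics.stdev(preds)


# Synthetic data (assumption): descriptor x in [0, 1], property y = 2 + 3x + noise.
data_rng = random.Random(42)
xs = [i / 19 for i in range(20)]
ys = [2.0 + 3.0 * x + data_rng.gauss(0.0, 0.2) for x in xs]

mean_in, std_in = bootstrap_predict(xs, ys, 0.5)    # inside the sampled range
mean_out, std_out = bootstrap_predict(xs, ys, 3.0)  # far outside the sampled range

print(f"x = 0.5: {mean_in:.2f} +/- {std_in:.2f}")
print(f"x = 3.0: {mean_out:.2f} +/- {std_out:.2f}")
```

The spread of the ensemble is larger for the candidate far from the training data, which is the kind of signal that a decision-making step can use when it ranks candidates.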

  • Optimal Learning

    One of the recent advances in the application of ML methods to materials science is their integration with global optimization methods (e.g., efficient global optimization, mean objective cost of uncertainty, and knowledge gradient, to name a few) that evaluate the tradeoff between “exploration” (choose the next experiment that has the largest prediction uncertainty) and “exploitation” (choose the next experiment that has the largest predicted mean) in an iterative feedback loop. In the past, it was typical to approach materials discovery by purely exploiting the ML models (i.e., the “best” predictions, based on the arguments of the maxima or minima, are recommended for the next experiment or calculation). This has been shown to be sub-optimal, especially when the ML models are trained on small data sets. Optimal learning methods, applied in an iterative feedback loop, have been shown to rapidly explore the vast search space and accelerate new materials discovery.
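
The difference between pure exploitation and a selection rule that balances exploration and exploitation can be illustrated with expected improvement, the acquisition function used in efficient global optimization. The sketch below is illustrative only: it assumes a maximization problem and Gaussian predictive distributions, and the candidate names, predicted means, and uncertainties are hypothetical values, not results from our work.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best_so_far):
    """Expected improvement over best_so_far for a maximization problem,
    assuming the prediction is Gaussian with mean mu and std sigma."""
    if sigma <= 0.0:
        return max(mu - best_so_far, 0.0)  # no uncertainty: plain improvement
    z = (mu - best_so_far) / sigma
    return (mu - best_so_far) * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical candidates: name -> (predicted mean, prediction uncertainty).
candidates = {
    "A": (1.00, 0.05),  # highest predicted mean, well characterized
    "B": (0.95, 0.40),  # slightly lower mean, very uncertain
    "C": (0.60, 0.10),  # poor prediction, fairly certain
}
best_so_far = 0.98  # best property value measured so far (assumed)

# Pure exploitation: pick the largest predicted mean.
exploit_choice = max(candidates, key=lambda k: candidates[k][0])

# Expected improvement: weigh the predicted mean against the uncertainty.
ei_choice = max(
    candidates,
    key=lambda k: expected_improvement(*candidates[k], best_so_far),
)

print("exploitation picks:", exploit_choice)
print("expected improvement picks:", ei_choice)
```

With these numbers, exploitation selects candidate A, whereas expected improvement selects candidate B because its large uncertainty leaves more room for a substantial gain. In an iterative loop, the selected candidate would be measured or computed, added to the training data, and the model refit before the next selection.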