Machine learning is usually associated with large data sets, which are readily available for pioneering applications such as image recognition and spam filtering. In the sciences, however, data generation is expensive, difficult, and time-consuming. It is therefore desirable to make accurate predictions (e.g. of molecular properties) with as few data points as possible. To realize such a data-economic machine-learning scenario, we develop and apply active-learning algorithms. The release of the batchwise variance-based sampling method was our first attempt in this direction. Further work is in progress.
Maike Mücke is currently supporting this project as part of her Bachelor's thesis.
Johannes Diedrich is currently supporting this project as part of his Bachelor's thesis.
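To illustrate the general idea behind variance-based batch selection in active learning, here is a minimal sketch. It is not the released batchwise variance-based sampling method. It assumes a bootstrap ensemble of ridge regressors as a stand-in model, and all names (`fit_ridge`, `predict`, `select_batch`) and parameters are hypothetical. The pool points on which the ensemble members disagree most are proposed as the next batch to label.

```python
import numpy as np


def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression; a column of ones provides the bias term."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict(w, X):
    """Evaluate a fitted ridge model on new inputs."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def select_batch(X_train, y_train, X_pool, batch_size=5, n_models=20, seed=0):
    """Return the indices of the pool points with the largest ensemble
    variance, together with the variance of every pool point.

    Assumption: the spread of predictions from models trained on bootstrap
    resamples of the labeled data serves as the uncertainty estimate.
    """
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    preds = np.empty((n_models, X_pool.shape[0]))
    for m in range(n_models):
        idx = rng.integers(0, n, size=n)  # bootstrap resample of the labeled set
        w = fit_ridge(X_train[idx], y_train[idx])
        preds[m] = predict(w, X_pool)
    variance = preds.var(axis=0)
    batch = np.argsort(variance)[::-1][:batch_size]  # most uncertain first
    return batch, variance
```

In an active-learning loop, the selected points would be labeled (e.g. by an expensive calculation), moved from the pool to the training set, and the selection repeated until the prediction accuracy is sufficient.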
Chemical synthesis planning would benefit from accurate predictions of how and how fast two polarophiles (electrophiles and nucleophiles) react with each other. We combine computational thermochemistry with data-driven prediction models and kinetic network modeling to achieve this milestone. Work is in progress.
Johannes Kircher is currently supporting this project as part of his Master's thesis.
Benchmarking physical models and numerical methods is a fundamental quality assessment for computational chemists. While most benchmarking efforts focus on rather simple and global performance measures (e.g. the mean unsigned error), we view prediction performance from a probabilistic perspective and put it into a local context. We ask questions like "What is the prediction error for a specific compound?" and "What is the uncertainty of that estimated prediction error?" To answer such questions, we advance (re)sampling methods (e.g. bootstrapping) and Bayesian methods (e.g. Gaussian process regression) and apply them to problems in molecular science. Further work is in progress.
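As a simple illustration of the resampling idea, the following sketch uses a nonparametric bootstrap to attach an uncertainty to the mean unsigned error of a benchmark. It addresses only the global measure, not the per-compound question, and it is a generic textbook procedure rather than the method developed in this project. The function name `bootstrap_mue` and its parameters are hypothetical.

```python
import numpy as np


def bootstrap_mue(errors, n_resamples=2000, seed=0):
    """Bootstrap the mean unsigned error (MUE) of a set of signed errors.

    Returns the MUE of the original sample, its bootstrap standard error,
    and a 95 % percentile interval.
    """
    errors = np.asarray(errors, dtype=float)
    rng = np.random.default_rng(seed)
    n = errors.size
    # Each row holds the indices of one resample drawn with replacement.
    idx = rng.integers(0, n, size=(n_resamples, n))
    mues = np.abs(errors[idx]).mean(axis=1)
    point = np.abs(errors).mean()
    lower, upper = np.percentile(mues, [2.5, 97.5])
    return point, mues.std(ddof=1), (lower, upper)
```

Calling `bootstrap_mue(predicted - reference)` on arrays of predicted and reference values gives the error statistic together with a measure of how strongly it depends on the particular choice of benchmark compounds.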
The above-mentioned methodological developments have so far been applied to selected topics in molecular science:
DISPERSION CORRECTIONS TO DENSITY FUNCTIONAL THEORY
KINETIC MODELING OF COMPLEX REACTION NETWORKS
CALIBRATION OF COMPUTED MÖSSBAUER PARAMETERS