Machine learning is usually associated with large data sets, which are readily available for pioneering applications such as image recognition and spam filtering. In the sciences, however, data generation is expensive, difficult, and time-consuming. It is therefore desirable to make accurate predictions, for example of molecular properties, with as few data points as possible. To realize such a data-economic machine-learning scenario, we develop and study active-learning algorithms. The release of the batchwise variance-based sampling (BVS) method was our first step in this direction. Further work is in progress.
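The general idea of variance-based batch selection can be illustrated with a minimal sketch. This is not the BVS implementation. It assumes a one-dimensional toy problem, a hypothetical `expensive_label` function standing in for a costly calculation, and a committee of simple kernel regressors fitted to bootstrap resamples. All function names and parameters are illustrative assumptions.

```python
import math
import random
import statistics


def expensive_label(x):
    """Hypothetical stand-in for a costly calculation or experiment."""
    return math.sin(2.0 * math.pi * x)


def kernel_predict(train, x, width=0.1):
    """Nadaraya-Watson estimate: Gaussian-weighted mean of training labels."""
    weights = [math.exp(-0.5 * ((x - xi) / width) ** 2) for xi, _ in train]
    total = sum(weights)
    if total == 0.0:  # query point far away from every training point
        return statistics.fmean(y for _, y in train)
    return sum(w * yi for w, (_, yi) in zip(weights, train)) / total


def select_batch(train, pool, batch_size, n_models=20, rng=random):
    """Return the pool points on which a bootstrap committee disagrees most."""
    # Each committee member sees a different bootstrap resample of the data.
    committee = [[rng.choice(train) for _ in train] for _ in range(n_models)]

    def variance(x):
        return statistics.pvariance([kernel_predict(m, x) for m in committee])

    return sorted(pool, key=variance, reverse=True)[:batch_size]


def active_learning(n_rounds=5, batch_size=3, seed=0):
    """Grow the training set batch by batch, labeling only selected points."""
    rng = random.Random(seed)
    pool = [i / 50 for i in range(51)]   # unlabeled candidates
    start = rng.sample(pool, 4)          # small random initial design
    train = [(x, expensive_label(x)) for x in start]
    pool = [x for x in pool if x not in start]
    for _ in range(n_rounds):
        batch = select_batch(train, pool, batch_size, rng=rng)
        train.extend((x, expensive_label(x)) for x in batch)
        pool = [x for x in pool if x not in batch]
    return train, pool


if __name__ == "__main__":
    train, pool = active_learning()
    print(f"labeled {len(train)} points, {len(pool)} left unlabeled")
```

A naive top-k choice like this one can pick several neighboring points in the same batch, so part of each batch may be redundant. The sketch does not attempt to address that.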
Johannes Diedrich, B.Sc. (research assistant at TU Braunschweig and master's student at Georg-August University) is supporting this project.
Maike Mücke, B.Sc. (master's student at Georg-August University) is supporting this project.
Chemical synthesis planning would benefit from accurate predictions of how, and how fast, two polarophiles (electrophiles and nucleophiles) react with each other. We combine computational thermochemistry with data-driven prediction models and kinetic network modeling to achieve this milestone. Preliminary results are available at 10.26434/chemrxiv.14102372. Further work is in progress.
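One standard way to connect computed thermochemistry to reaction rates is the Eyring equation of transition state theory. The sketch below is an assumption about how such a link could look, not a description of the rate model used in this project. The function name is hypothetical, and the transmission coefficient is taken to be 1.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K
H = 6.62607015e-34   # Planck constant in J s
R = 8.314462618      # molar gas constant in J/(mol K)


def eyring_rate_constant(delta_g_activation_kj_mol, temperature_k=298.15):
    """Eyring rate constant for a given activation free energy.

    Returns k = (k_B T / h) * exp(-dG / (R T)) in 1/s, which applies
    directly to a unimolecular step. A bimolecular step additionally
    requires dividing by the standard-state concentration.
    """
    prefactor = K_B * temperature_k / H
    exponent = -delta_g_activation_kj_mol * 1000.0 / (R * temperature_k)
    return prefactor * math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical barriers, chosen only to show the exponential sensitivity.
    for barrier in (60.0, 70.0, 80.0):
        print(f"{barrier:5.1f} kJ/mol -> {eyring_rate_constant(barrier):.3e} 1/s")
```

Because the barrier enters exponentially, an error of RT ln(10), about 5.7 kJ/mol at room temperature, changes the rate constant by a factor of ten. This is one reason why uncertainty in the computed thermochemistry matters for kinetic predictions.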
Johannes Kircher, M.Sc. (doctoral student, Mata research group at Georg-August University) is supporting this project.
René Rahrt, M.Sc. (doctoral student, Koszinowski research group at Georg-August University) is supporting this project.
Maike Vahl, B.Sc. (master's student at Georg-August University) is supporting this project.
Benchmarking physical models and numerical methods is a fundamental quality assessment for computational chemists. While most benchmarking efforts focus on rather simple and global performance measures (e.g. the mean unsigned error), we view prediction performance from a probabilistic perspective and put it into a local context. We ask questions like "What is the prediction error for a specific compound?" and "What is the uncertainty of that estimated prediction error?" To answer such questions, we advance (re)sampling methods (e.g. bootstrapping) and Bayesian methods (e.g. Gaussian process regression) and apply them to problems in molecular science. Further work is in progress.
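The resampling idea can be illustrated on the global measure mentioned above. The following minimal sketch estimates a percentile bootstrap interval for the mean unsigned error of a benchmark. The error values are made up for illustration, and the function names are assumptions. Compound-specific (local) error estimates would require a model such as Gaussian process regression, which this sketch does not cover.

```python
import random
import statistics


def mean_unsigned_error(errors):
    """Mean of the absolute prediction errors."""
    return statistics.fmean(abs(e) for e in errors)


def bootstrap_mue(errors, n_resamples=2000, seed=0):
    """Return the MUE and a 95 % percentile bootstrap interval for it."""
    rng = random.Random(seed)
    n = len(errors)
    # Recompute the MUE on data sets drawn with replacement from the original.
    estimates = sorted(
        mean_unsigned_error(rng.choices(errors, k=n)) for _ in range(n_resamples)
    )
    lower = estimates[int(0.025 * n_resamples)]
    upper = estimates[int(0.975 * n_resamples) - 1]
    return mean_unsigned_error(errors), (lower, upper)


if __name__ == "__main__":
    # Hypothetical signed errors (prediction minus reference), arbitrary units.
    errors = [0.3, -1.2, 0.8, -0.1, 2.0, -0.5, 0.4, 1.1]
    mue, (low, high) = bootstrap_mue(errors)
    print(f"MUE = {mue:.2f}, 95 % interval = [{low:.2f}, {high:.2f}]")
```

With only a handful of reference data, the interval is typically wide, which shows how little a single global number can say about the error to expect for any one compound.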
The above-mentioned methodological developments have so far been applied to selected topics in molecular science:
MAGNETIC PROPERTIES OF TRANSITION METAL COMPLEXES
DISPERSION CORRECTIONS TO DENSITY FUNCTIONAL THEORY
KINETIC MODELING OF COMPLEX REACTION NETWORKS
CALIBRATION OF COMPUTED MÖSSBAUER PARAMETERS