The MD17 datasets from the perspective of datasets for gas-phase “small” molecule potentials / J.M. Bowman, C. Qu, R. Conte, A. Nandi, P.L. Houston, Q. Yu. - In: THE JOURNAL OF CHEMICAL PHYSICS. - ISSN 0021-9606. - 156:24(2022), pp. 240901.1-240901.10. [10.1063/5.0089200]

The MD17 datasets from the perspective of datasets for gas-phase “small” molecule potentials

R. Conte; 2022

Abstract

There has been great progress in developing methods for machine-learned potential energy surfaces. There have also been important assessments of these methods by comparing so-called learning curves on datasets of electronic energies and forces, notably the MD17 database. The dataset for each molecule in this database generally consists of tens of thousands of energies and forces obtained from DFT direct dynamics at 500 K. We contrast the datasets from this database for three "small" molecules, ethanol, malonaldehyde, and glycine, with datasets we have generated with specific targets for the potential energy surfaces (PESs) in mind: a rigorous calculation of the zero-point energy and wavefunction, the tunneling splitting in malonaldehyde, and, in the case of glycine, a description of all eight low-lying conformers. We found that the MD17 datasets are too limited for these targets. We also examine recent datasets for several PESs that describe small-molecule but complex chemical reactions. Finally, we introduce a new database, "QM-22," which contains datasets of molecules ranging from 4 to 15 atoms that extend to high energies and a large span of configurations. Published under an exclusive license by AIP Publishing.
Scientific sector CHIM/02 - Physical Chemistry
2022
Article (author)
Files in this record:
JCP_Perspective.pdf: open access; publisher's version (PDF); Adobe PDF; 6.41 MB
JCP_Perspective.pdf: open access; post-print, accepted manuscript (version accepted by the publisher); Adobe PDF; 3.48 MB
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/931936
Citations
  • PMC 4
  • Scopus 17
  • ISI 17