Automatic organofacies identification by means of Machine Learning on Raman spectra / N.A. Vergara Sassarini, A. Schito, M. Gasparrini, P. Michel, S. Corrado. - In: INTERNATIONAL JOURNAL OF COAL GEOLOGY. - ISSN 0166-5162. - 271:(2023 Apr 15), pp. 104237.1-104237.21. [10.1016/j.coal.2023.104237]

Automatic organofacies identification by means of Machine Learning on Raman spectra

M. Gasparrini (Project Administration); 2023

Abstract

In this study, we compare and evaluate unsupervised clustering algorithms for discriminating organofacies in low-maturity dispersed organic matter based on Raman spectroscopic analyses. A total of 1363 Raman spectra were collected from 27 organic-rich samples from the Lower Toarcian shale interval of the Paris Basin subsurface. Rock-Eval pyrolysis data indicate a type II to type III kerogen with a vitrinite reflectance (Ro%) between 0.45% and 0.65% and Tmax between 415 °C and 438 °C. Organic petrographic observations under transmitted light reveal organofacies composed of amorphous organic matter and of opaque and translucent phytoclasts. Organic particles were classified optically on about 40-60 fragments per sample, and this classification was used as the ground truth. Raman spectra were obtained for all the classified fragments, and principal component analysis was performed to highlight the variability among spectra. Unsupervised clustering was then applied to the principal components of the Raman spectra. Three clustering methods were applied to evaluate their effectiveness in predicting the number, shape and density of clusters, and a contingency matrix was used to quantify their ability to predict the different organofacies. The Gaussian Mixture Model (GMM) was found to be the best algorithm for organofacies identification, with an accuracy mostly between 80% and 90%. This work outlines how unsupervised clustering of Raman spectra of dispersed organic matter can reduce the uncertainty in thermal maturity assessment and help classify highly heterogeneous organofacies when using large datasets in Earth and planetary science studies.
Keywords: Raman spectroscopy; machine learning; cluster analysis; dispersed organic matter; principal component analysis; thermal maturity
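
The abstract states that a contingency matrix was used to quantify how well the clusters predict the optically classified organofacies. A minimal sketch of that kind of evaluation is given here, using only the Python standard library. The toy labels, the function names, and the majority-vote rule for mapping clusters to classes are all assumptions made for illustration; the abstract does not say how the authors derived their accuracy values from the matrix.

```python
from collections import Counter


def contingency_matrix(true_labels, cluster_ids):
    """Count how many fragments of each optical class fall in each cluster.

    Rows follow the sorted class names, columns follow the sorted cluster ids.
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    counts = Counter(zip(true_labels, cluster_ids))
    matrix = [[counts[(c, k)] for k in clusters] for c in classes]
    return classes, clusters, matrix


def majority_vote_accuracy(true_labels, cluster_ids):
    """Assign each cluster to its most frequent class and score the result.

    Assumption: majority voting is one simple mapping rule, not necessarily
    the one used in the paper.
    """
    _, clusters, matrix = contingency_matrix(true_labels, cluster_ids)
    correct = sum(max(row[j] for row in matrix) for j in range(len(clusters)))
    return correct / len(true_labels)


# Hypothetical toy data: optical classes (ground truth) and cluster ids
# such as a clustering algorithm might return for 12 fragments.
optical = ["AOM"] * 5 + ["opaque"] * 4 + ["translucent"] * 3
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 0]

classes, clusters, matrix = contingency_matrix(optical, cluster_ids)
accuracy = majority_vote_accuracy(optical, cluster_ids)

for name, row in zip(classes, matrix):
    print(f"{name:12s} {row}")
print(f"accuracy: {accuracy:.2f}")
```

In practice the cluster ids would come from a clustering step applied to the principal components of the spectra, and a library routine could replace the hand-written matrix.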
Sector GEO/02 - Stratigraphic and Sedimentological Geology
15-apr-2023
8-apr-2023
Article (author)
Files in this item:
File: VergaraSassariniEtAl2023_Published_OpenAccess_compressed.pdf
Access: open access
Description: open access licence
Type: Publisher's version/PDF
Size: 4.26 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/964877