Prediction of Organic Compound Infrared Spectra with Deep Learning and Molecular Mechanics Calculations

Name

楊士漢 (YANG, SHI-HAN)

Department

Department of Chemical and Materials Engineering

Thesis Title

Prediction of Organic Compound Infrared Spectra with Deep Learning and Molecular Mechanics Calculations

Files

Browse the thesis in the system (available after 2026-8-31)

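Abstract (translated from Chinese)

Infrared spectra are commonly used to study the structure of chemical substances, but computed infrared spectra obtained with conventional computational methods are either inaccurate or require large computational resources. In this study, deep learning is investigated for generating computed spectra of organic compounds. We designed and trained a convolutional neural network model to speed up the generation of spectra from molecular mechanics and to improve their quality. The model learns the differences between computed and experimental spectra and can be used to generate spectra of higher quality than those obtained from molecular mechanics alone. We also designed a model based on the variational autoencoder architecture to study the ability of generative models to learn spectrum generation. This variational autoencoder model not only reconstructs spectra with performance very close to that of the convolutional model, but also keeps the distribution of the data in the latent space Gaussian. Methods of collecting and preprocessing spectral data were also studied in order to prepare training datasets for the neural network models.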

Abstract (English)

Infrared spectroscopy is a common tool for analyzing chemical structure. Conventional computational methods for simulating infrared spectra are either time-consuming or inaccurate. In this work, deep learning models were studied for the task of generating infrared spectra of organic compounds. A convolutional neural network model was designed and trained to rapidly generate experimental-like spectra from molecular mechanics spectra as input. Having learned the differences between experimental spectra and those generated by molecular mechanics, the model can produce spectra that are better than the original molecular mechanics spectra. A model based on the variational autoencoder architecture was designed to investigate the ability of generative models to learn spectrum generation. The variational autoencoder model not only reconstructs spectra with performance very close to that of the convolutional model but also keeps the distribution of the data in the latent space Gaussian. Methods of collecting and preprocessing spectral data were also investigated to prepare a training dataset for the neural network models.
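As an illustration of the latent-space property mentioned in the abstract, the following is a minimal sketch of the standard variational autoencoder objective (reconstruction error plus a Kullback-Leibler penalty toward a standard normal prior) and the reparameterization trick. It is not the thesis's implementation: the function names, the mean-squared-error reconstruction term, and the `beta` weight are assumptions made for this example, and spectra are represented as plain lists of intensities.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def vae_loss(target, reconstruction, mu, log_var, beta=1.0):
    """Mean squared reconstruction error plus a beta-weighted KL penalty (assumed form)."""
    mse = sum((t - r) ** 2 for t, r in zip(target, reconstruction)) / len(target)
    return mse + beta * kl_to_standard_normal(mu, log_var)


# Hypothetical usage: a 4-point "spectrum" and a 2-dimensional latent code.
rng = random.Random(0)
mu, log_var = [0.3, -0.1], [-0.5, 0.2]
z = sample_latent(mu, log_var, rng)
loss = vae_loss([0.0, 0.5, 1.0, 0.2], [0.1, 0.4, 0.9, 0.2], mu, log_var)
```

The KL term is zero only when the encoder outputs a zero mean and unit variance, so minimizing it pulls the encoded data toward a Gaussian distribution in the latent space, while the reconstruction term keeps the decoded spectrum close to the target.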

Keywords

★ Deep learning
★ Infrared spectroscopy
★ Convolutional neural network
★ Variational autoencoder

Table of Contents
List of Figures
List of Tables
1. Introduction
1.1 Motivation
1.2 Related works
1.3 Objective
2. Background Knowledge
2.1 Infrared Spectroscopy
2.2 Simulating Molecular Infrared Spectroscopy
2.3 Machine Learning
2.4 Deep learning and convolutional neural network
2.5 Generative Models and Variational Autoencoder
2.6 Summary
3. Proposed Modeling
4. Results and Discussion
4.1 Collecting and Preprocessing Dataset
4.1.1 Introduction
4.1.2 Experimental Infrared Spectra
4.1.3 Simulated Infrared Spectra
4.1.4 Data Preprocessing
4.1.5 Summary
4.2 Designing Neural Network for Simulated IR Spectra Transformation
4.2.1 Introduction
4.2.2 Designing the Model Architecture
4.2.3 Model Training Setups and Details
4.2.4 Model Evaluation
4.2.5 Summary
4.3 Using deep generative models for generating spectra
4.3.1 Introduction
4.3.2 Model Architecture and Training
4.3.3 Model Evaluation
4.3.4 Summary
4.4 Summary
5. Conclusions
6. Future Work
7. Data Availability
8. References

Advisor

張博凱 (Bor Kae Chang)

Approval Date

2024-8-19
