dc.description.abstract | In recent years, rapid global development has significantly increased the demand for alternative energy sources, particularly solar energy. New-generation photovoltaic technologies such as Organic Solar Cells (OSC) and Dye-Sensitized Solar Cells (DSC) have attracted attention because of their low cost and diverse material options. Reducing the use of environmentally harmful heavy metals, such as the lead in perovskite solar cells, is particularly important.
In this study, we developed four predictive machine learning (ML) models using tree-based XGBoost and Artificial Neural Network (ANN) techniques. These models employ molecular descriptors (MDs) derived from experimental data and density functional theory (DFT) calculations to perform high-throughput virtual screening (HTVS) of ternary OSC materials. The HTVS analysis used two distinct databases: the first comprised 429,413 unique ternary OSC systems reconstructed from an existing database; the second was drawn from the Harvard Clean Energy Project Database (CEPDB), which includes about 2.3 million unique donor molecules. All four ML models predicted power conversion efficiency (PCE) accurately on test sets of molecules closely related to those in the training set (interpolation). However, the XGBoost model showed limited capability in predicting molecules significantly different from those in the training set. In contrast, the ANN model exhibited strong extrapolative ability in HTVS, successfully predicting new potential ternary OSC systems with PCE above 20%. Through efficient HTVS, this study accelerates the development of OSC molecular materials and advances ternary OSC technology.
In addition, we proposed a precise, predictive, and interpretable machine learning model designed specifically for Zn-porphyrin-sensitized solar cells. This model uses MDs that are theoretically computable, efficient, and reusable. It performed excellently in a "blind test" of 17 newly designed cells, achieving a mean absolute error (MAE) of 1.02%. Notably, the prediction error for ten dyes was within 1%. These results validate the machine learning model and demonstrate its importance in exploring the unknown chemical space of Zn-porphyrins. SHAP analysis identified key MDs that closely correspond with experimental observations, providing valuable chemical guidance for the rational design of dyes in DSCs. The model enables efficient prediction, significantly reducing the analysis time for photovoltaic cells. Promising Zn-porphyrin dyes with excellent PCE have been identified, facilitating high-throughput virtual screening. This predictive tool is publicly accessible at https://ai-meta.chem.ncu.edu.tw/dsc-meta. | en_US |