In modern signal processing and artificial intelligence, the spectrogram is a fundamental visual representation that transforms time-domain signals into the time-frequency domain. It is widely applied in human activity recognition, biomedical signal analysis, speech recognition, and environmental sound classification. With its intuitive yet detailed time-frequency representation, the spectrogram reveals latent time-varying frequency characteristics within a signal and provides highly discriminative input features for deep learning models, performing particularly well in classification and recognition tasks built on Convolutional Neural Network (CNN) architectures. The Mel-spectrogram is a prominent variant. By emulating the human auditory system's perception of frequency through non-linear compression of the frequency axis, it more effectively preserves semantic and prosodic information in speech signals. Consequently, it has become one of the most expressive and widely adopted acoustic features.
Building upon these two representations, this paper proposes the Adaptive Frequency Spectrogram (Adaf-Spectrogram). This data-driven method computes the overall frequency energy distribution across an entire dataset and uses it to automatically adjust the scaling of the frequency axis, thereby emphasizing the critical frequency features of the signals more effectively. Experimental results show that the Adaf-Spectrogram adapts well across multiple datasets and significantly outperforms the conventional linear-scale spectrogram in recognition tasks.
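
The idea of a dataset-driven frequency axis can be illustrated with a minimal sketch. The abstract does not specify the actual mapping, so the code below assumes one plausible rule: warp the axis so that each output band covers an equal share of the dataset's cumulative energy. The function names (`power_spectrogram`, `dataset_energy_profile`, `warp_positions`, `adaf_spectrogram`) and the parameters `n_fft`, `hop`, and `n_bands` are hypothetical and chosen for illustration only; this is not the paper's implementation.

```python
import numpy as np


def power_spectrogram(x, n_fft=256, hop=128):
    """Linear-frequency power spectrogram, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return (np.abs(np.fft.rfft(frames, axis=1)) ** 2).T


def dataset_energy_profile(signals, n_fft=256, hop=128):
    """Mean power per frequency bin, averaged over every frame in the dataset."""
    total = np.zeros(n_fft // 2 + 1)
    n_frames = 0
    for x in signals:
        spec = power_spectrogram(x, n_fft, hop)
        total += spec.sum(axis=1)
        n_frames += spec.shape[1]
    return total / n_frames


def warp_positions(profile, n_bands):
    """Fractional bin positions for each output band.

    Assumed rule (not taken from the paper): equal steps in cumulative
    dataset energy, so energetic regions receive more output bands.
    """
    p = profile + 1e-12  # keeps the cumulative curve strictly increasing
    cdf = np.cumsum(p) / p.sum()
    targets = (np.arange(n_bands) + 0.5) / n_bands
    return np.interp(targets, cdf, np.arange(len(p)))


def adaf_spectrogram(x, positions, n_fft=256, hop=128):
    """Resample the linear spectrogram at the warped bin positions."""
    spec = power_spectrogram(x, n_fft, hop)
    lo = np.floor(positions).astype(int)
    hi = np.minimum(lo + 1, spec.shape[0] - 1)
    w = (positions - lo)[:, None]  # linear interpolation weights
    return (1.0 - w) * spec[lo] + w * spec[hi]
```

In this sketch the warp is computed once from the whole dataset with `warp_positions(dataset_energy_profile(signals), n_bands)` and then reused for every signal, which mirrors the dataset-level adaptation described above. A fixed Mel warp would replace `warp_positions` with a predefined curve, whereas here the curve follows the measured energy distribution.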