Designing a trading strategy that remains profitable across different market conditions is a major challenge in stock trading. In recent years, advances in artificial intelligence have brought new investment methods to the stock market. Q-learning, a reinforcement learning algorithm, can help investors learn market trends and make more reasonable investment decisions. In Q-learning, the formulation of states is particularly important because different formulations affect the algorithm's performance. We propose a data-driven approach based on unsupervised learning to define the states required by Q-learning. Using multidimensional stock market data as features, the approach applies Dynamic Time Warping (DTW) and t-SNE to identify the desired states efficiently. Taking the Taiwan stock market as an example, we construct a Q-learning investment decision for a single asset and, on that basis, propose a suitable investment portfolio consisting of multiple assets. The empirical results show that the proposed method provides satisfactory investment performance.
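As a rough illustration of the two building blocks named above, the following minimal sketch shows a textbook DTW distance and the standard tabular Q-learning update. It is not the authors' implementation. The function names (`dtw_distance`, `nearest_state`, `q_update`), the absolute-difference cost, the values of `alpha` and `gamma`, and the nearest-prototype state assignment are all assumptions made for illustration. The nearest-prototype step is a simplified stand-in for the DTW and t-SNE state discovery described in the abstract, and the t-SNE step is omitted entirely.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Textbook dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def nearest_state(window: Sequence[float], prototypes: List[Sequence[float]]) -> int:
    """Hypothetical state assignment: index of the DTW-closest prototype window."""
    distances = [dtw_distance(window, p) for p in prototypes]
    return distances.index(min(distances))


def q_update(q: List[List[float]], state: int, action: int, reward: float,
             next_state: int, alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Standard tabular Q-learning update, applied in place to q[state][action]."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


# Illustrative data only: two made-up prototype price windows acting as states.
prototypes = [[3.0, 2.0, 1.0], [1.0, 2.0, 2.0, 3.0]]
state = nearest_state([1.0, 2.0, 3.0], prototypes)

# Two states, two actions (say 0 = hold, 1 = buy); rewards here are invented.
q_table = [[0.0, 0.0], [1.0, 0.0]]
q_update(q_table, state=0, action=1, reward=1.0, next_state=state, alpha=0.5)
```

DTW is useful here because it compares price windows that follow the same pattern at different speeds, so windows of unequal length can still map to the same state. In practice a pairwise DTW distance matrix over many windows would feed an embedding or clustering step, rather than the fixed prototypes assumed in this sketch.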