This thesis studies how to use reinforcement learning to solve the multichannel rendezvous problem in cognitive radio networks. The multichannel rendezvous problem asks how two secondary users can, within a limited time, hop onto the same channel and successfully exchange messages. We work under the assumptions of a symmetric, synchronous, and homogeneous setting with a globally common channel labeling. To better reflect practice, we further assume that each channel has its own channel state and that users cannot observe these states; in some states, two users may fail to communicate even when they hop on the same channel. The rendezvous problem under such unobservable channel states is known as blind rendezvous. Under these assumptions, we propose a fast reinforcement learning algorithm that enables two users to learn channel selection policies for successful rendezvous under different channel states. The proposed algorithm substantially improves the efficiency of learning channel selection policies. After convergence, it achieves an expected time-to-rendezvous (ETTR) comparable to that of the known optimal solutions for specific channel states, and the best performance (lowest ETTR) in settings where no optimal solution is known.

In this thesis, we consider the multichannel rendezvous problem in cognitive radio networks (CRNs), where the probability that two users hopping on the same channel achieve a successful rendezvous is a function of the channel state. The channel states are modelled by stochastic processes whose joint distributions are known to the users; however, the exact state of a channel at any time is not observable. We show that the ETTR of the fast time-varying channel model is a lower bound on the ETTR of the general channel model, and that the ETTR of the slow time-varying channel model is an upper bound. By formulating this multichannel rendezvous problem as an adversarial bandit problem, we propose a reinforcement learning approach to learn the channel selection probabilities p_i(t), i = 1, 2, ..., N. Our experimental results show that the reinforcement learning approach is very effective and yields ETTRs comparable to those of various approximation policies in the literature.
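To make the adversarial-bandit formulation concrete, the following is a minimal sketch of how channel selection probabilities p_i(t) could be learned with an EXP3-style update. This is an illustrative stand-in, not the thesis's algorithm: the function name, the per-channel success probabilities, and the reward of 1 for a successful rendezvous on the sampled channel are all assumptions made for the example, with the hidden `success_prob` playing the role of the unobservable channel state.

```python
import math
import random

def exp3_channel_selection(num_channels, success_prob, horizon, gamma=0.1, seed=0):
    """EXP3-style learner for channel selection in a simulated blind-rendezvous setting.

    success_prob[i] is the hidden probability that hopping on channel i yields a
    successful rendezvous -- a stand-in for the unobservable channel state.
    Returns the final channel selection probabilities and the success count.
    """
    rng = random.Random(seed)
    weights = [1.0] * num_channels
    successes = 0
    probs = [1.0 / num_channels] * num_channels
    for _ in range(horizon):
        total = sum(weights)
        # mix the weight-proportional distribution with uniform exploration
        probs = [(1 - gamma) * w / total + gamma / num_channels for w in weights]
        i = rng.choices(range(num_channels), weights=probs)[0]
        reward = 1.0 if rng.random() < success_prob[i] else 0.0
        successes += int(reward)
        # importance-weighted reward estimate keeps the update unbiased
        estimate = reward / probs[i]
        weights[i] *= math.exp(gamma * estimate / num_channels)
    return probs, successes
```

Run over many slots, the learner shifts probability mass toward channels whose (hidden) states make rendezvous more likely, which is the behavior the reinforcement learning approach exploits to reduce the ETTR.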