In smart warehousing, forklift robots use a variety of sensors to perceive their environment and measure distances so that they can navigate, avoid obstacles, and recognize and retrieve warehouse pallets. Conventional approaches rely on 3D cameras to obtain accurate pallet position and distance information, but such cameras are costly, computationally slow, and limited in spatial resolution. This study proposes MVPRP (Monocular Vision for Pallet Recognition and Positioning), a method for pallet recognition and distance prediction from monocular images. MVPRP uses the YOLACT network model for real-time 2D pallet recognition and localization, and a ResNet model to estimate the pallet distance information that 2D images lack. The method thereby addresses pallet detection and distance estimation for forklift robots in warehouse automation while keeping hardware costs low and computation real-time.