This study develops a snapshot 3-D object-recognition technique that combines artificial intelligence (AI) with a diffractive optical element (DOE), enabling both depth measurement and object classification from a single image. In the optical design, the DOE is optimized with the Gerchberg–Saxton algorithm to generate a distinctive diffraction pattern. Laser-beam divergence is emulated by superimposing multiple incident angles, and the relationship between the laser's divergence angle and the resulting spread of the diffraction pattern is analyzed; within a small angular range the two quantities exhibit an almost linear correspondence. Experimentally, a laser source, a ground-glass diffuser, and a CMOS sensor are integrated, and a parallax-to-depth mapping is established. With a single exposure the system captures depth-encoded imagery and reconstructs a 3-D point cloud. The raw point cloud shows a root-mean-square error (RMSE) of 5.3 mm; after statistical outlier removal and moving-least-squares surface smoothing, the RMSE is reduced to 2.0 mm. A PointNet architecture is then trained with 30 high-precision scans each of bananas, apples, and lemons. When fed the snapshot point clouds, the network achieves an average classification accuracy of about 70%, successfully recognizing all three fruit types despite the limited geometric precision.
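The DOE optimization step can be illustrated with a minimal Gerchberg–Saxton sketch. This is a generic phase-retrieval loop under a single-FFT Fraunhofer-propagation assumption, not the authors' actual design code: it alternates between the DOE plane (unit amplitude, phase-only element) and the far field (target diffraction-pattern amplitude), keeping the phase from each transform.

```python
import numpy as np

def gerchberg_saxton(target_amp, n_iter=200, seed=0):
    """Retrieve a phase-only DOE mask whose far-field diffraction
    amplitude approximates target_amp (single-FFT Fraunhofer model).

    Illustrative sketch only; the real optimization may differ in
    propagation model, sampling, and constraints."""
    rng = np.random.default_rng(seed)
    phase = rng.uniform(0.0, 2.0 * np.pi, target_amp.shape)
    for _ in range(n_iter):
        # DOE plane: unit amplitude, current phase estimate.
        field = np.exp(1j * phase)
        far = np.fft.fft2(field)
        # Far field: keep the phase, impose the target amplitude.
        far = target_amp * np.exp(1j * np.angle(far))
        near = np.fft.ifft2(far)
        # Back in the DOE plane, keep only the phase (phase-only element).
        phase = np.angle(near)
    return phase
```

For a sparse spot pattern the loop typically converges within a few hundred iterations; the fabricated element then reproduces the pattern under coherent illumination.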
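The near-linear relation between incidence angle and pattern spread follows from the grating equation in the small-angle regime, since d(theta_out)/d(theta_in) = cos(theta_in)/cos(theta_out) is nearly constant there. A quick numeric check (illustrative parameter values only, not the study's actual wavelength or grating period):

```python
import numpy as np

def diffracted_angle(theta_in_rad, order, wavelength_nm, period_nm):
    """Grating equation sin(theta_out) = sin(theta_in) + m * lam / Lam.
    Used only to show that a small spread of incidence angles maps
    almost linearly onto a spread of the diffraction pattern."""
    s = np.sin(theta_in_rad) + order * wavelength_nm / period_nm
    return np.arcsin(s)
```

Fitting a straight line to theta_out over a small range of theta_in leaves a residual orders of magnitude below the angles themselves, consistent with the near-linear correspondence reported above.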
Compared with traditional structured-light or scanning systems, the proposed approach drastically shortens data-acquisition time and reduces hardware size while maintaining reliable reconstruction quality. It is therefore well suited to low-cost, high-speed 3-D sensing scenarios such as industrial inspection, robotic grasping, and human–machine interaction devices.
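The reconstruction pipeline described above reduces to two generic steps that can be sketched compactly: a pinhole-style parallax-to-depth mapping (the study calibrates this relation empirically; `focal_px` and `baseline_mm` below are hypothetical placeholders), followed by statistical outlier removal based on each point's mean distance to its k nearest neighbours.

```python
import numpy as np

def disparity_to_depth(disparity_px, focal_px, baseline_mm):
    """Pinhole-triangulation form of a parallax-to-depth mapping:
    depth = f * b / d. Placeholder constants; the actual system
    fits this mapping from calibration data."""
    return focal_px * baseline_mm / np.asarray(disparity_px, dtype=float)

def remove_statistical_outliers(points, k=8, std_ratio=2.0):
    """Statistical outlier removal: drop points whose mean distance to
    their k nearest neighbours exceeds mean + std_ratio * std over the
    whole cloud. Brute-force kNN, adequate for small clouds."""
    pts = np.asarray(points, dtype=float)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    # Mean distance to the k nearest neighbours (column 0 is the point itself).
    knn_mean = np.sort(dist, axis=1)[:, 1:k + 1].mean(axis=1)
    keep = knn_mean <= knn_mean.mean() + std_ratio * knn_mean.std()
    return pts[keep]
```

A moving-least-squares smoothing pass (not shown) would then fit local polynomial surfaces to further reduce noise, matching the 5.3 mm to 2.0 mm RMSE improvement reported above.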
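The choice of PointNet for classification rests on one structural idea worth making explicit: a shared per-point MLP followed by a symmetric max-pool yields a global descriptor that is invariant to the ordering of the points in the cloud. The sketch below shows only this core mechanism with stand-in random weights; the real network is deeper and adds input/feature alignment transforms and a classification head.

```python
import numpy as np

def pointnet_global_feature(points, w1, w2):
    """Core PointNet mechanism: shared per-point MLP + symmetric
    max-pool. w1, w2 stand in for trained weights (hypothetical);
    the full architecture is considerably larger."""
    h = np.maximum(points @ w1, 0.0)   # shared MLP layer 1 (ReLU)
    h = np.maximum(h @ w2, 0.0)        # shared MLP layer 2 (ReLU)
    return h.max(axis=0)               # order-invariant global feature
```

Because max-pooling is symmetric, shuffling the input points leaves the feature unchanged, which is exactly why the network can consume unordered snapshot point clouds directly; a small MLP with softmax over the three fruit classes would sit on top of this feature.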