This study examines how Retrieval-Augmented Generation (RAG) can improve the effectiveness of generative artificial intelligence in programming education. It compares commercial and open-source large language models integrated into a RAG system and examines how retrieval design and prompt engineering affect the quality of the generated responses. Using a database built specifically for machine learning courses and the RAGAS evaluation framework, the study compares five prominent models (GPT-4o, Claude-3.7-Sonnet, Gemini-2.0-Flash, Llama3.3-70b, and Ministral-8b) on five quality metrics. The findings reveal considerable performance differences between model types, with structured reasoning prompts (Chain-of-Thought and Take a Step Back) proving to be strong drivers of overall model performance. Re-ranking is the most effective retrieval strategy, particularly for improving the performance of lightweight open-source models. The study provides empirical evidence that economically feasible RAG systems can be effective in programming education, helping to narrow the performance gap between commercial and open-source models and offering practical solutions for resource-constrained educational settings.