

    Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/99389


    Title: AI-QoS: Latency-Aware Inference Techniques with Controllable Quality and Response-Time Guarantees
    Author: Liao, Shang-Hua
    Contributor: In-service Master Program, Department of Computer Science and Information Engineering
    Keywords: real-time streaming; network latency; object segmentation inference
    Date: 2026-01-27
    Upload time: 2026-03-06 18:52:26 (UTC+8)
    Publisher: National Central University
    Abstract: The rapid growth of real-time AI applications and streaming inference services has made responsiveness and service stability critical challenges in deployed AI systems. Conventional inference pipelines rely on fixed models and offline accuracy metrics, which are inadequate for handling end-to-end latency accumulation caused by network variability, heterogeneous model computational costs, and dynamic system workloads, resulting in degraded timeliness and inference usability. This work proposes a latency-aware and quality-controllable AI-QoS inference framework that dynamically adapts inference strategies based on real-time latency measurements and scene characteristics. An adaptive model scheduling mechanism is introduced to select suitable inference models under varying latency and resource conditions, balancing inference quality and responsiveness. Experimental results show that the proposed framework effectively mitigates the impact of end-to-end latency on inference quality while maintaining stable and predictable performance in dynamic environments, demonstrating its practicality for real-time streaming inference and deployed AI applications.
    Appears in collections: [In-service Master Program of Computer Science and Information Engineering] Theses & Dissertations
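
    The adaptive model scheduling mechanism summarized in the abstract — selecting an inference model according to measured latency, a response-time budget, and model cost/quality trade-offs — might be sketched as below. This is a minimal illustration under stated assumptions, not the thesis's actual implementation: the class and field names (`ModelProfile`, `LatencyAwareScheduler`, `budget_ms`), the EWMA smoothing of latency measurements, and the quality-first selection rule are all assumptions for the sake of the example.

    ```python
    from dataclasses import dataclass

    @dataclass
    class ModelProfile:
        """Hypothetical per-model profile: inference cost vs. quality."""
        name: str
        est_latency_ms: float  # profiled per-request inference cost
        quality: float         # offline quality score (e.g., mIoU)

    class LatencyAwareScheduler:
        """Pick the highest-quality model whose estimated end-to-end
        latency (smoothed network delay + inference cost) still fits
        the response-time budget."""

        def __init__(self, models, budget_ms, alpha=0.2):
            # Sort best-quality first so the first fit is the best fit.
            self.models = sorted(models, key=lambda m: m.quality, reverse=True)
            self.budget_ms = budget_ms
            self.alpha = alpha          # EWMA smoothing factor
            self.net_latency_ms = 0.0   # smoothed network-latency estimate

        def observe(self, measured_net_latency_ms):
            # Exponentially weighted moving average of measured delay,
            # so transient spikes do not cause abrupt model switching.
            self.net_latency_ms = (self.alpha * measured_net_latency_ms
                                   + (1 - self.alpha) * self.net_latency_ms)

        def select(self):
            for m in self.models:
                if self.net_latency_ms + m.est_latency_ms <= self.budget_ms:
                    return m
            # No model fits the budget: degrade to the fastest one.
            return min(self.models, key=lambda m: m.est_latency_ms)
    ```

    Under this rule, rising network delay pushes the scheduler toward cheaper models (trading quality for timeliness), and falling delay restores the higher-quality model, which matches the quality/responsiveness balancing the abstract describes.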

    Files in this item:

    File: index.html (0 Kb, HTML, 15 views)


    All items in NCUIR are protected by copyright, with all rights reserved.

