NCU Institutional Repository (中大學術數位典藏), offering theses and dissertations, past exam papers, journal articles, and research projects for download: Item 987654321/99389


    Please use this identifier to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/99389


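    Title: Latency-Aware Inference Techniques with Controllable Quality and Response-Time Guarantees (original title: AI-Qos具反應時間保證與品質可控之AI技術)
    Authors: Liao, Shang-Hua (廖上華)
    Contributors: Executive Master of Computer Science and Information Engineering
    Keywords: real-time streaming; network latency; object segmentation inference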
    Date: 2026-01-27
    Issue Date: 2026-03-06 18:52:26 (UTC+8)
    Publisher: National Central University
    Abstract: The rapid growth of real-time AI applications and streaming inference services has made response time and service stability increasingly pressing problems for AI inference systems in real deployments. Conventional inference pipelines are designed around fixed models and offline accuracy, so they cope poorly with the end-to-end latency that accumulates from network delay fluctuations, differences in model computational cost, and changing system load. The result is degraded timeliness and less usable inference results. This work proposes a latency-aware, quality-controllable AI-QoS inference framework that dynamically adjusts its inference strategy based on real-time latency measurements and scene characteristics, balancing responsiveness against inference quality. It also introduces an adaptive model scheduling mechanism that selects the most suitable inference model under varying latency and system load conditions. Experimental results show that the proposed framework effectively reduces the impact of end-to-end latency on inference quality while keeping the inference service stable and predictable in dynamic environments, demonstrating its feasibility and practical value for real-time streaming inference and deployed AI applications.
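    The adaptive model scheduling idea in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which this record does not describe. It assumes a simple rule: smooth recent network latency samples with an exponentially weighted moving average, subtract that from a response-time deadline, and pick the highest-quality model whose inference time fits the remaining budget. The names `ModelProfile` and `LatencyAwareSelector`, the model profiles, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical profile of one candidate inference model."""
    name: str
    infer_ms: float  # assumed average inference time per frame (ms)
    quality: float   # assumed offline quality score, higher is better


class LatencyAwareSelector:
    """Illustrative selector: best-quality model that fits the latency budget."""

    def __init__(self, models, deadline_ms, alpha=0.2):
        if not models:
            raise ValueError("at least one model profile is required")
        # Try candidates from highest to lowest quality.
        self.models = sorted(models, key=lambda m: m.quality, reverse=True)
        self.deadline_ms = deadline_ms
        self.alpha = alpha          # EWMA smoothing factor (assumption)
        self.network_ms = None      # smoothed network latency estimate

    def observe_network(self, sample_ms):
        """Fold one measured network latency sample into the EWMA estimate."""
        if self.network_ms is None:
            self.network_ms = sample_ms
        else:
            self.network_ms = (self.alpha * sample_ms
                               + (1 - self.alpha) * self.network_ms)
        return self.network_ms

    def select(self):
        """Return the model to use for the next frame."""
        budget = self.deadline_ms - (self.network_ms or 0.0)
        for model in self.models:
            if model.infer_ms <= budget:
                return model
        # Nothing fits the budget: fall back to the fastest model.
        return min(self.models, key=lambda m: m.infer_ms)


# Hypothetical profiles; real values would come from profiling.
CANDIDATES = [
    ModelProfile("seg-large", infer_ms=80.0, quality=0.90),
    ModelProfile("seg-medium", infer_ms=40.0, quality=0.80),
    ModelProfile("seg-small", infer_ms=15.0, quality=0.65),
]

if __name__ == "__main__":
    selector = LatencyAwareSelector(CANDIDATES, deadline_ms=100.0, alpha=0.5)
    for sample in (10.0, 110.0, 150.0):
        selector.observe_network(sample)
        print(sample, selector.select().name)
```

    Falling back to the fastest model when no candidate fits the budget is one possible design choice. A real system might instead skip the frame or reuse the previous result, and would also need to account for system load, which this sketch ignores.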
    Appears in Collections:[Executive Master of Computer Science and Information Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File          Description    Size    Format
    index.html                   0Kb     HTML      192


    All items in NCUIR are protected by copyright, with all rights reserved.
