dc.description.abstract | We are witnessing an era of explosive growth in large language models, in which everyone can now have a versatile digital assistant. Compared with the task-oriented dialogue (TOD) chatbots of the past, large language models (LLMs) offer far broader conversational ability and more accurate responses.
This paper begins with the historical context of natural language processing (NLP) to trace its technological evolution. It then examines how Transformer models and the attention mechanism overcame the limitations of earlier NLP techniques. Together with advances in computing power and the ready availability of massive online data, these developments led to today's mainstream large language models and elevated the field of NLP to unprecedented heights.
This paper focuses on the TAIDE large language model, derived from the open-source Meta Llama 3, and leverages its capabilities to implement our Campus AI Assistant for tasks such as article writing, summarization, and answering questions grounded in Taiwan's local cultural background. By incorporating the Retrieval-Augmented Generation (RAG) framework, the Campus AI Assistant can quickly and efficiently supply relevant campus knowledge without the technically demanding fine-tuning of the LLM.
This paper establishes the Campus AI Assistant as an on-premises deployment. LangChain serves as the development framework, Ollama as the LLM management platform, the open-source ChromaDB as the vector database, and taide/Llama3-TAIDE-LX-8B-Chat as the LLM. Integrating the RAG framework enables the Campus AI Assistant to handle domain-specific knowledge-processing workflows. We also use web scraping to gather relevant campus data (using National Central University as an example) and perform data preprocessing to optimize RAG retrieval performance. Finally, we present the AI Assistant interface proposed in this paper through a Chainlit web UI.
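The retrieval-then-generate workflow described above can be illustrated with a minimal, dependency-free sketch: documents are stored as embedding vectors, the query vector is compared against them by cosine similarity, and the best-matching passage is prepended to the prompt sent to the LLM. The campus snippets and 3-dimensional vectors below are toy stand-ins, not real embeddings; in the actual pipeline, ChromaDB stores the embeddings and LangChain orchestrates retrieval and prompting.

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical campus snippets paired with toy embedding vectors.
corpus = [
    ([1.0, 0.1, 0.0], "Library hours: 8:00-22:00 on weekdays."),
    ([0.0, 1.0, 0.2], "Course add/drop runs in the first two weeks."),
    ([0.1, 0.0, 1.0], "The shuttle bus departs every 30 minutes."),
]

def retrieve(query_vec, k=1):
    # Rank stored snippets by similarity to the query vector.
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[0]),
                    reverse=True)
    return [text for _, text in ranked[:k]]

def build_prompt(query_text, query_vec):
    # Prepend the retrieved context so the LLM answers from campus data.
    context = "\n".join(retrieve(query_vec))
    return (f"Answer using only this context:\n{context}\n\n"
            f"Question: {query_text}")

# A query about the library (toy vector close to the first snippet).
print(build_prompt("When is the library open?", [0.9, 0.2, 0.1]))
```

The same structure scales directly: swapping the toy vectors for real embeddings and the in-memory list for a ChromaDB collection yields the retrieval stage of the system described above.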
Our experiments, covering article writing, summarization, and the accuracy of RAG retrieval, show that the proposed framework delivers trustworthy, domain-specific answers, demonstrating that the framework is practical and usable. | en_US |