RESEARCH AND APPLICATION PRACTICE OF LARGE LANGUAGE MODEL PRIVATE DEPLOYMENT IN KNOWLEDGE BASE QUESTION ANSWERING SCENARIOS
Abstract
To address the urgent need for private deployment of large language models (LLMs) in enterprise knowledge base question answering (KBQA) scenarios, this study explores in detail the use of retrieval-augmented generation (RAG) to build a localized knowledge base, using the Shanghai Cigarette Factory as a practical case study. The system integrates the open-source Qwen1.5-32B LLM with the BAAI/BGE-large-zh-v1.5 embedding model. A concrete construction scheme is presented, with model selection balancing computational resource constraints against model performance. The deployment validates the effectiveness of RAG in significantly improving retrieval accuracy over a domain-specific knowledge base. Furthermore, INT4 quantization substantially reduces GPU memory usage while having minimal impact on question-answering quality. This work offers a valuable technical scheme and practical guidance for the secure and efficient private deployment of LLMs in vertical industries.
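The RAG pipeline summarized above can be sketched as follows. This is a minimal, self-contained illustration: the bag-of-words similarity function is a toy stand-in for dense embeddings from BAAI/BGE-large-zh-v1.5, and the document snippets are hypothetical, not drawn from the actual factory knowledge base.

```python
import math
from collections import Counter

def similarity(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words term vectors.
    Toy stand-in for the dense embedding model used in the paper."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the top-k documents most similar to the query."""
    return sorted(docs, key=lambda d: similarity(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble an answer-from-context prompt for the LLM."""
    context = "\n".join(retrieve(query, docs))
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical knowledge-base snippets for illustration only.
docs = [
    "Packing machine maintenance is scheduled every two weeks.",
    "Annual leave policy for factory employees.",
    "Tobacco leaf humidity control during storage.",
]
print(build_prompt("How often is packing machine maintenance performed?", docs))
```

In the deployed system described in the abstract, the real embedding model replaces the toy similarity function, and the assembled prompt is sent to the INT4-quantized Qwen1.5-32B model for answer generation.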