Collaborative Dual-Source Retrieval Augmented Generation for Medical Education Question Answering

唐新宇; 杨仕林; 张攀登; 刘豫; 韩秋篪; 刘嘉

doi:10.12146/j.issn.2095-3135.20260127002

Collaborative Dual-Source Retrieval Augmented Generation for Medical Education Question Answering

Abstract

Large language models (LLMs) show considerable potential for medical education question answering, but hallucination and outdated knowledge limit their use in high-reliability settings. Retrieval-augmented generation (RAG) mitigates these weaknesses by introducing external knowledge, yet it is constrained by the coverage gaps of static knowledge bases. Techniques such as iterative retrieval have therefore been introduced to gather deeper information from sources such as the open web over multiple rounds of interaction. However, existing general-purpose deep retrieval mechanisms lack domain adaptation at query time, which leads to semantic mismatch and noise accumulation when they handle specialized medical questions. To address this, this paper proposes the Collaborative Dual-Source Retrieval Augmented Generation framework (CD-RAG). CD-RAG uses task-adaptive query rewriting to adapt queries to heterogeneous retrieval sources, and it combines local hybrid retrieval with iterative web retrieval equipped with a reflection mechanism, systematically integrating standardized medical knowledge with dynamic, up-to-date information. In addition, the framework applies a semantic re-ranking model to re-rank the heterogeneous evidence in a unified way, removing noise and forming a high-quality context that enables the LLM to produce traceable answers. Experiments on the MedQA and RAGCare-QA datasets show that CD-RAG outperforms existing methods and improves both the accuracy and the timeliness of LLMs on medical tests.
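The data flow described in the abstract (source-specific query rewriting, local retrieval, iterative web retrieval with a reflection step, and unified re-ranking) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Every name here (`Evidence`, `rewrite_query`, `retrieve_local`, `retrieve_web_iteratively`, `cd_rag_context`) is hypothetical, the query rewriting and reflection steps are toy placeholders for what would presumably be LLM calls, and a simple token-overlap score stands in for both the hybrid retriever and the semantic re-ranking model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Evidence:
    text: str
    source: str  # "local" or "web"


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens found in the text.
    Assumption: stands in for hybrid retrieval and the semantic re-ranker."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def rewrite_query(question: str, source: str) -> str:
    """Placeholder for task-adaptive rewriting (assumed to be an LLM call)."""
    if source == "web":
        return question + " latest guideline"
    return question


def retrieve_local(query: str, corpus: List[str], k: int = 2) -> List[Evidence]:
    """Rank a static corpus by the toy score and keep the top k passages."""
    ranked = sorted(corpus, key=lambda d: overlap_score(query, d), reverse=True)
    return [Evidence(d, "local") for d in ranked[:k]]


def retrieve_web_iteratively(query: str, search: Callable[[str], List[str]],
                             max_rounds: int = 3, enough: int = 2) -> List[Evidence]:
    """Search in rounds; a count check stands in for the reflection step."""
    collected: List[Evidence] = []
    for _ in range(max_rounds):
        seen = {e.text for e in collected}
        for hit in search(query):
            if hit not in seen:
                collected.append(Evidence(hit, "web"))
                seen.add(hit)
        if len(collected) >= enough:  # "reflection": is the evidence sufficient?
            break
        query = query + " treatment"  # toy refinement of the next query
    return collected


def cd_rag_context(question: str, corpus: List[str],
                   search: Callable[[str], List[str]], top_k: int = 3) -> List[Evidence]:
    """Merge both sources, then re-rank all evidence against the question."""
    pool = retrieve_local(rewrite_query(question, "local"), corpus)
    pool += retrieve_web_iteratively(rewrite_query(question, "web"), search)
    pool.sort(key=lambda e: overlap_score(question, e.text), reverse=True)
    return pool[:top_k]


if __name__ == "__main__":
    docs = ["metformin is first line therapy for type 2 diabetes",
            "aspirin reduces fever"]
    web = lambda q: ["2024 guideline update on type 2 diabetes therapy"]
    for ev in cd_rag_context("first line therapy for type 2 diabetes", docs, web):
        print(ev.source, "|", ev.text)
```

Keeping the `source` tag on each piece of evidence is one simple way to support the traceable answers the abstract mentions, since the final context can cite where each passage came from.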
