Eval-RAG Accepted at EMNLP 2023: Reducing Hallucinations of LLMs in Professional Fields

Author

Jacob Choi

Date Published

Dec 11, 2023

Exciting news from NLLP 2023 (Natural Legal Language Processing 2023)! Linq, Law&Good, and Yonsei University have introduced Eval-RAG, a new evaluation method that checks LLM-generated answers in legal contexts against related documents collected by a retriever. It is designed to catch factual errors in AI-generated legal text that existing evaluation methods miss.

Stay connected for more updates on how we're driving innovation in legal technology!

Link to Paper: https://aclanthology.org/2023.nllp-1.13/


Abstract

While large language models (LLMs) have demonstrated significant capabilities in text generation, their utilization in areas requiring domain-specific expertise, such as law, must be approached cautiously. This caution is warranted due to the inherent challenges associated with LLM-generated texts, including the potential presence of factual errors. Motivated by this issue, we propose Eval-RAG, a new evaluation method for LLM-generated texts. Unlike existing methods, Eval-RAG evaluates the validity of generated texts based on the related documents that are collected by the retriever. In other words, Eval-RAG adopts the idea of retrieval augmented generation (RAG) for the purpose of evaluation. Our experimental results on Korean Legal Question-Answering (QA) tasks show that conventional LLM-based evaluation methods can be better aligned with lawyers’ evaluations when combined with Eval-RAG. In addition, our qualitative analysis shows that Eval-RAG successfully finds the factual errors in LLM-generated texts, while existing evaluation methods cannot.
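To make the idea in the abstract concrete, here is a minimal Python sketch of retrieval-augmented evaluation: retrieve documents related to a question and its generated answer, then hand them to an evaluator together with the answer. This is an illustration only, not the paper's implementation. The bag-of-words retriever, the prompt wording, and the names `retrieve`, `build_eval_prompt`, `evaluate_with_retrieval`, and `judge` are all assumptions made for this example; the toy corpus is invented text, not real law.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Return the k corpus documents most similar to the query.

    Assumption: a toy lexical retriever stands in for whatever
    retriever a real system would use.
    """
    q = Counter(tokenize(query))
    ranked = sorted(
        corpus, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_eval_prompt(question: str, answer: str, documents: List[str]) -> str:
    """Assemble an evaluation prompt (hypothetical wording)."""
    refs = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(documents))
    return (
        "Reference documents:\n"
        f"{refs}\n\n"
        f"Question: {question}\n"
        f"Answer to evaluate: {answer}\n\n"
        "Is the answer factually consistent with the reference documents?"
    )


def evaluate_with_retrieval(
    question: str,
    answer: str,
    corpus: List[str],
    judge: Callable[[str], str],
    k: int = 2,
) -> str:
    """Retrieve related documents, then ask the judge to evaluate the answer.

    `judge` is any callable mapping a prompt to a verdict; in practice it
    would wrap an LLM call, which is deliberately left out here.
    """
    documents = retrieve(question + " " + answer, corpus, k)
    return judge(build_eval_prompt(question, answer, documents))


if __name__ == "__main__":
    # Invented toy corpus; not real legal text.
    toy_corpus = [
        "Toy statute A: a tenant must give 30 days notice before ending a lease.",
        "Toy statute B: parking fines are paid to the city office.",
        "Toy statute C: a will requires two witnesses.",
    ]
    # Stand-in judge that just echoes the prompt so the flow is visible.
    print(
        evaluate_with_retrieval(
            "How much notice must a tenant give to end a lease?",
            "The tenant must give 10 days notice.",
            toy_corpus,
            judge=lambda prompt: prompt,
            k=1,
        )
    )
```

The point of the sketch is the ordering: the evaluator sees retrieved reference text next to the answer, so a factual error (here, the wrong notice period) can be checked against a source rather than against the evaluator's own recall.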