-
Evaluating the Effectiveness of LLM-Evaluators (aka LLM-as-Judge)
Use cases, techniques, alignment, finetuning, and critiques of LLM-evaluators.
-
Out-of-Domain Finetuning to Bootstrap Hallucination Detection
How to use open-source, permissive-use data and collect fewer labeled samples for our tasks.