home/categories/academic/agentscope-ai-openjudge-skills-ref-hallucination-arena-skill-md
academicresearch

ref-hallucination-arena

Benchmark LLM reference recommendation capabilities by verifying every cited paper against Crossref, PubMed, arXiv, and DBLP. Measures hallucination rate, per-field accuracy (title/author/year/DOI), discipline breakdown, and year constraint compliance. Supports tool-augmented (ReAct + web search) mode. Use when the user asks to evaluate, benchmark, or compare models on academic reference hallucination, literature recommendation quality, or citation accuracy.

agentscope-ai
maintainer
agentscope-ai
Mis à jour 3/5/2026
Étoiles
537
Forks
45
quick start

Installation and usage

Benchmark LLM reference recommendation capabilities by verifying every cited paper against Crossref, PubMed, arXiv, and DBLP. Measures hallucination rate, per-field accuracy (title/author/year/DOI), discipline breakdown, and year constraint compliance. Supports tool-augmented (ReAct + web search) mode. Use when the user asks to evaluate, benchmark, or compare models on academic reference hallucination, literature recommendation quality, or citation accuracy.

Installation
$ install --globalskills.sh
Utilisation

Après l'installation, vous pouvez utiliser ce skill en exécutant la commande suivante dans votre terminal :

skills use ref-hallucination-arena