PALO ALTO, CA, UNITED STATES, February 19, 2025 /ennovaterz/ — Root Signals, a leader in large language model (LLM) evaluation and AI application quality control, proudly announces the release of Root Judge, a groundbreaking LLM that sets a new standard for reliable, customizable and locally-deployable evaluation models. Root Judge is built as a fine-tuned version of Meta’s Llama-3.3-70B-Instruct, one of the most powerful mid-sized open-weights models.
Root Judge is primarily designed to serve as an LLM-as-a-Judge, enabling organizations to:
- Detect Context-Grounded Hallucinations: Automatically detect, describe and block hallucinations in Retrieval-Augmented Generation (RAG) pipelines.
- Facilitate Pairwise Preference Judgments: Use customizable rubrics for tasks like inference-time compute optimization or synthetic data generation requiring Best-of-N decisions.
- Support Privacy-Focused Deployments: Avoid sending sensitive data over the public internet while leveraging cutting-edge LLM capabilities.
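To make the Best-of-N use case concrete, here is a minimal sketch of how pairwise preference judgments can select one response out of N candidates. It is an illustration only, not Root Judge's API: the `prefers_first` callable and the `toy_judge` stand-in are hypothetical, and a real deployment would replace the stand-in with a call to a judge model that applies a rubric.

```python
from typing import Callable, Sequence

# Hypothetical judge signature (an assumption for this sketch):
# returns True when response `a` is preferred over response `b` for `prompt`.
PairwiseJudge = Callable[[str, str, str], bool]


def best_of_n(prompt: str, candidates: Sequence[str], prefers_first: PairwiseJudge) -> str:
    """Pick one candidate with N-1 pairwise comparisons (running champion)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    champion = candidates[0]
    for challenger in candidates[1:]:
        # Keep the champion only if the judge prefers it over the challenger.
        if not prefers_first(prompt, champion, challenger):
            champion = challenger
    return champion


def toy_judge(prompt: str, a: str, b: str) -> bool:
    # Stand-in for a real judge model: simply prefers the longer response.
    return len(a) >= len(b)


winner = best_of_n("What is RAG?", ["a", "abc", "ab"], toy_judge)
```

A running-champion pass needs only N-1 judge calls, which keeps judging cost linear in the number of candidates. It does assume the judge's preferences are reasonably consistent across pairs.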
Solid Training Foundation
Root Judge was post-trained on a high-quality, human-annotated dataset mix designed for pairwise preference judgments and multi-turn instruction-following tasks with source citation. The model was trained with Direct Preference Optimization (DPO) using the Identity Preference Optimization (IPO) loss, on 384 AMD Instinct™ MI250X GPUs on the LUMI supercomputer.
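For readers unfamiliar with the IPO loss, the sketch below shows its commonly published per-example form: the squared distance between the policy-versus-reference log-ratio margin and a target of 1/(2·beta). This is not Root Signals' training code. The function name, the inputs and the `beta` value are assumptions for illustration, and the release does not state any hyperparameters.

```python
def ipo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example IPO loss from sequence log-probabilities.

    `beta` is the regularization strength (an assumed value here).
    """
    # How much more the policy favors the chosen response over the rejected
    # one, relative to the frozen reference model.
    margin = (policy_chosen_logp - ref_chosen_logp) - (policy_rejected_logp - ref_rejected_logp)
    # IPO regresses this margin toward a fixed target instead of pushing it
    # to grow without bound, as a logistic (standard DPO) loss would.
    return (margin - 1.0 / (2.0 * beta)) ** 2


# Margin equals the target 1/(2*beta) = 1.0, so the loss is zero.
at_target = ipo_loss(-1.0, -2.0, -1.5, -1.5, beta=0.5)
```

Because the loss is minimized at a finite margin, the policy has less incentive to drift far from the reference model on preference pairs it already ranks correctly.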
“With solutions for reliable and explainable AI, Root Signals is contributing to a critical topic to enterprises. The successful training of Root Judge on the LUMI supercomputer demonstrates both the power of AMD compute platforms and the vibrancy of Finland’s AI ecosystem. This is exactly the kind of innovation we need to see more of in Finland and Europe,” says Peter Sarlin, Co-Founder and CVP, AMD Silo AI.
Why Root Judge Stands Out:
- Fine-Tuned Excellence: State-of-the-art hallucination detection, outperforming closed-source frontier models such as OpenAI’s GPT-4o, o1-mini and o1-preview and Anthropic’s Sonnet-3.5, as well as other open-source judge LLMs of similar size.
- Explainable Outputs: Designed to provide transparent justifications for scoring, enhancing trust in AI-driven assessments.
- Open Access for Innovation: With open weights and a focus on privacy-centric deployments, Root Judge fosters innovation while addressing data security concerns.
A Model Built for the Future
“Root Judge represents a major leap in how organizations can evaluate and optimize their LLM systems,” says Ari Heljakka, CEO of Root Signals. “Its ability to transparently deliver context-grounded judgments ensures that businesses can deploy AI responsibly and effectively, while optimizing inference costs and ensuring privacy.”
Root Judge’s applications extend across industries, making it a versatile tool for enterprises, developers, and researchers seeking reliable AI solutions tailored to their needs.