Artificial Intelligence

Benchmarking and evaluating AI solutions in legal work

· 5 minute read

What to ask, what to expect, and where the value lies

Highlights

  • Independent, task-based benchmarking is crucial for evaluating legal AI, with a recent Vals.ai study showing Thomson Reuters CoCounsel outperforming other vendors in core tasks like document summarization.
  • Law firms must conduct thorough due diligence by asking potential AI providers critical questions about data security, data ownership policies, and the authoritative legal content and expertise underpinning the technology.
  • Legal AI delivers measurable value and immediate ROI by optimizing key workflows such as document review, legal research, and drafting.

As legal teams navigate a crowded and evolving AI marketplace, benchmarking provides a critical foundation for informed decision-making. It allows your firm to move beyond vendor-speak to hard facts. Without it, you are left guessing whether a tool meets your needs, especially when AI systems offer similar-sounding capabilities.

Independent, task-based benchmarking enables clear vendor comparison, revealing performance differences that directly impact workflows. Because not all legal AI is created equal, seeing a side-by-side comparison helps you choose what is best for your firm.

Assessing the accuracy of legal AI

In a first-of-its-kind benchmarking study, Vals.ai evaluated leading legal AI tools, including Thomson Reuters CoCounsel, across core legal tasks. The tools’ results were compared with those of a lawyer control group (the Lawyer Baseline), giving a clear view of AI accuracy in real-world legal work.

Tasks evaluated included:

  • Data extraction
  • Document Q&A
  • Document summarization
  • Chronology generation

CoCounsel achieved top-tier results across the board, with performance scores on all four tasks ranging from 73.2% to 89.6%, each exceeding the Lawyer Baseline by more than 10 points. It received a top score for document summarization at 77.2% and its average score of 79.5% was the highest among all participating vendors.

The study findings reinforce that legal AI tools offer real value to lawyers and law firms, even as opportunities remain to refine both the tools themselves and the methods we use to evaluate them.
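A firm running its own task-based comparison could tabulate results along the same lines as the study: one score per task per tool, set against a lawyer baseline. The sketch below is a minimal illustration only. The vendor names and every number in it are hypothetical placeholders, not figures from the Vals.ai study, and the function name `compare_to_baseline` is an assumption made for this example.

```python
from statistics import mean

# Hypothetical per-task scores in percent. These are placeholder values
# for illustration and are NOT taken from the Vals.ai study.
SCORES = {
    "lawyer_baseline": {
        "data_extraction": 70.0,
        "document_qa": 70.0,
        "document_summarization": 60.0,
        "chronology_generation": 80.0,
    },
    "vendor_a": {
        "data_extraction": 76.0,
        "document_qa": 90.0,
        "document_summarization": 70.0,
        "chronology_generation": 78.0,
    },
    "vendor_b": {
        "data_extraction": 65.0,
        "document_qa": 80.0,
        "document_summarization": 55.0,
        "chronology_generation": 84.0,
    },
}


def compare_to_baseline(scores, baseline="lawyer_baseline"):
    """Return each tool's average score and its per-task margin over the baseline."""
    base = scores[baseline]
    report = {}
    for name, per_task in scores.items():
        if name == baseline:
            continue
        # Margin in percentage points: positive means the tool beat the baseline.
        margins = {task: round(per_task[task] - base[task], 1) for task in base}
        report[name] = {
            "average": round(mean(per_task.values()), 1),
            "margins": margins,
            "beats_baseline_on": [t for t, m in margins.items() if m > 0],
        }
    return report


report = compare_to_baseline(SCORES)
for vendor, summary in report.items():
    print(vendor, summary["average"], summary["beats_baseline_on"])
```

Reporting per-task margins alongside the average matters because an average can hide a task where a tool falls below the baseline, which may be the very task your workflows depend on.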

The importance of benchmarking AI tools

For law firms, benchmarking is more than a technical exercise — it’s due diligence. It helps ensure:

  • The AI platform delivers the precision clients expect
  • The tool aligns with your real workflows, not hypothetical use cases
  • You’re not exposing sensitive data without understanding how it’s handled

Perhaps most importantly, it gives your team confidence. Clear benchmarks give legal professionals a reason to trust the tools they adopt — because trust is earned, not implied.

Questions to ask your AI provider — and what to listen for

When it’s time for your firm to evaluate AI tools, it’s important to know what questions to ask and what answers to look for. It’s equally important to research the vendors themselves, not just their tools. Be cautious about vague answers or one-size-fits-all solutions.

Asking targeted questions about data governance, domain expertise, and implementation reveals far more than a feature list ever could. The more you press for clarity, the easier it becomes to separate surface-level flash from long-term fit.

Infographic

Choosing the right legal AI partner: A comprehensive worksheet

View infographic ↗

Data security and access controls

What to ask: Where is our data stored? Who can access it?

What to expect: Your vendor should encrypt your data both in transit and at rest and comply fully with applicable privacy regulations such as GDPR and HIPAA. Access to data should be role-based, and every access should be logged. The vendor should also be able to state clearly where your data is stored, for what purpose, how long it is retained, and how it is backed up.
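For readers who want to picture what role-based, logged access means in practice, here is a minimal sketch. It does not describe any vendor’s actual implementation: the role names, the permission sets, and the `access_document` function are all hypothetical, and a real system would write to tamper-resistant storage rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical mapping of roles to permitted actions (an assumption for illustration).
ROLE_PERMISSIONS = {
    "partner": {"read", "write", "export"},
    "associate": {"read", "write"},
    "paralegal": {"read"},
}

# In-memory audit trail for illustration only.
audit_log = []


@dataclass(frozen=True)
class User:
    name: str
    role: str


def access_document(user, doc_id, action):
    """Check the user's role for the requested action and log the attempt either way."""
    allowed = action in ROLE_PERMISSIONS.get(user.role, set())
    audit_log.append(
        {
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user.name,
            "role": user.role,
            "doc_id": doc_id,
            "action": action,
            "allowed": allowed,
        }
    )
    return allowed
```

Note that denied attempts are logged as well as granted ones. When you ask a vendor whether access is logged, it is worth asking whether failed attempts are captured too.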

Data usage and ownership

What to ask: Will you use our data to train your AI? Who owns the inputs we provide and the outputs generated?

What to expect: A reputable vendor will respond with complete clarity. Be cautious of a provider that avoids giving direct answers or attempts to claim ownership over your firm’s content or its results.

Legal expertise behind the AI

What to ask: What legal content and professionals inform your system?

What to expect: Legal AI should be built on authoritative sources and guided by credentialed legal experts. CoCounsel Legal, for example, draws on Westlaw and Practical Law content, with contributions from over 1,200 bar-admitted attorneys.

Where legal AI delivers measurable value

In a recent Generative AI in Professional Services report, six use cases for AI in legal work were each cited by 50% or more of the legal professionals surveyed. These areas offer firms immediate ROI in the form of significant time savings, optimized workflows, and more time to focus on high-value tasks. As a result, firms can enhance their overall productivity and competitiveness without compromising quality.

Top use cases for legal professionals:

  1. Document review
  2. Legal research
  3. Document summarization
  4. Brief or memo drafting
  5. Contract drafting
  6. Correspondence drafting

Training, trust, technology

Legal AI isn’t just about innovation — it’s about accuracy, transparency, and trust. With independent benchmarks and the right evaluation framework, your firm can confidently select tools that align with your standards and accelerate your workflows. Backed by decades of legal and technical expertise, Thomson Reuters CoCounsel Legal proves that measurable performance isn’t just possible: it’s here.

White paper

Evaluating AI solutions for legal professionals

Access white paper ↗
