The post Harvey.ai Enhances AI Evaluation with BigLaw Bench: Arena appeared on BitcoinEthereumNews.com.

Harvey.ai Enhances AI Evaluation with BigLaw Bench: Arena

Luisa Crawford
Nov 07, 2025 12:03

Harvey.ai introduces BigLaw Bench: Arena, a new AI evaluation framework for legal tasks, offering insights into AI system performance through expert pairwise comparisons.

Harvey.ai has unveiled a novel AI evaluation framework named BigLaw Bench: Arena (BLB: Arena), designed to assess the effectiveness of AI systems in handling legal tasks. According to Harvey.ai, this approach allows for a comprehensive comparison of AI models, giving legal experts the opportunity to express their preferences through pairwise comparisons.

Innovative Evaluation Process

BLB: Arena operates by having legal professionals review outputs from different AI models on various legal tasks. Lawyers select their preferred outputs and explain their choices, giving a nuanced picture of each model’s strengths. Compared with traditional benchmarks, this process is more flexible: it measures how well each system’s output resonates with experienced lawyers.

Monthly Competitions

Each month, major AI systems at Harvey compete against foundation models, internal prototypes, and even human performance on hundreds of legal tasks. Multiple lawyers review the outcomes to ensure diverse perspectives. The data collected through these evaluations are used to generate Elo scores, which quantify the relative performance of each system.
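
To illustrate how pairwise preferences can be turned into Elo scores, here is a minimal Python sketch of the standard Elo update. It is not Harvey.ai's implementation, which the article does not describe. The starting rating of 1500, the K-factor of 32, and the system names are all assumptions chosen for illustration.

```python
from collections import defaultdict

K_FACTOR = 32.0        # assumed step size; not stated in the article
START_RATING = 1500.0  # assumed initial rating for every system


def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_ratings(comparisons, k=K_FACTOR, start=START_RATING):
    """Process (preferred, other) pairs in order and return final ratings."""
    ratings = defaultdict(lambda: start)
    for preferred, other in comparisons:
        exp_win = expected_score(ratings[preferred], ratings[other])
        # The preferred system scored 1; it gains more when the win was unexpected.
        delta = k * (1.0 - exp_win)
        ratings[preferred] += delta
        ratings[other] -= delta
    return dict(ratings)


# Hypothetical lawyer votes: each tuple is (preferred output, rejected output).
votes = [
    ("system_a", "system_b"),
    ("system_a", "system_c"),
    ("system_b", "system_c"),
]
ratings = elo_ratings(votes)
for name, score in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.1f}")
```

Because each update moves the same number of points from one system to the other, the total rating is conserved, and only the gaps between systems carry meaning.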

Qualitative Insights and Preference Drivers

Beyond quantitative scores, BLB: Arena collects qualitative feedback, providing insights into the reasons behind preferences. Feedback is categorized into preference drivers such as Alignment, Trust, Presentation, and Intelligence. This categorization helps transform unstructured feedback into actionable data, allowing Harvey.ai to improve its AI models based on specific user preferences.
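
As a rough sketch of how categorized feedback becomes countable data, the Python snippet below tallies preference drivers per preferred system. It assumes each piece of feedback has already been labeled with one or more of the four drivers named above; how that labeling is done is not described in the article, and the system names and records are hypothetical.

```python
from collections import Counter, defaultdict

# The four preference drivers named in the article.
DRIVERS = ("Alignment", "Trust", "Presentation", "Intelligence")


def tally_drivers(feedback):
    """Count how often each driver is cited for each preferred system.

    `feedback` is an iterable of (preferred_system, [driver, ...]) records.
    """
    counts = defaultdict(Counter)
    for preferred, drivers in feedback:
        for driver in drivers:
            if driver not in DRIVERS:
                raise ValueError(f"unknown preference driver: {driver}")
            counts[preferred][driver] += 1
    return {system: dict(counter) for system, counter in counts.items()}


# Hypothetical labeled feedback records.
feedback = [
    ("system_a", ["Intelligence", "Trust"]),
    ("system_a", ["Intelligence"]),
    ("system_b", ["Presentation"]),
]
summary = tally_drivers(feedback)
print(summary)
```

A per-system count like this is one simple way to see which driver most often explains a preference.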

Example Outcomes and System Improvements

In recent evaluations, the Harvey Assistant, built on GPT-5, demonstrated significant performance improvements, outscoring other models and confirming its readiness for production use. The preference driver data indicated that Intelligence was a key factor in human preference, highlighting the system’s ability to handle complex legal problems effectively.

Strategic Use of BLB: Arena

The insights gained from BLB: Arena are crucial for Harvey.ai’s decision-making process regarding the selection and enhancement of AI systems. By considering lawyers’ preferences, the framework helps identify the most effective foundation models, contributing to the development of superior AI solutions for legal professionals.

Source: https://blockchain.news/news/harvey-ai-enhances-ai-evaluation-biglaw-bench-arena
